Semantic Deep Face Models

What?

Existing multi-linear morphable models offer semantic control over facial identity and expression, but their linear structure limits quality and expressivity
This paper proposes a nonlinear 3D face modeling method that combines the advantages of multi-linear face models and nonlinear deep face networks
- As a result, it disentangles identity and expression while providing intuitive semantic control
The paper proposes neural architectures for modeling and synthesizing 3D human faces

The authors build their own 3D facial database
They select 224 subjects of varying ethnicity, gender, age group, and BMI, and capture each one performing 24 facial expressions, including the neutral expression
They also capture a dynamic speech sequence and a facial workout sequence
- (This addresses a limitation of earlier 3DMMs, which lacked dynamic expressions)
- Starting from the registered static meshes, a subject-specific anatomical local face model is built
- This model is used to track the subject's dynamic performances
In total, they collect 5376 meshes and textures in full correspondence with one another (224 subjects x 24 expressions)
Next, a blendweight vector is associated with each registered mesh
- For a captured expression, the blendweight vector is the one-hot encoded vector matching that expression
- Otherwise (e.g. for frames of the dynamic sequences), the optimal blendweight vector is obtained by least squares
Because a linear blendshape model cannot accurately approximate the real shape, the optimized blendweights are kept but the linear shape estimate is discarded. The captured shape is then used as ground truth to train the decoder
In this way, both static and dynamic data can be used for training
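
The blendweight assignment above can be sketched in NumPy. This is only an illustrative sketch, not the authors' implementation: it assumes a delta-blendshape formulation (offsets from the subject's neutral mesh) and an unconstrained least squares fit, and the function names `one_hot_blendweights` and `fit_blendweights` are hypothetical. The paper's exact parameterization may differ.

```python
import numpy as np


def one_hot_blendweights(expression_index: int, num_expressions: int) -> np.ndarray:
    """Blendweight vector for a statically captured expression (hypothetical helper)."""
    weights = np.zeros(num_expressions)
    weights[expression_index] = 1.0
    return weights


def fit_blendweights(
    neutral: np.ndarray, blendshapes: np.ndarray, target: np.ndarray
) -> np.ndarray:
    """Least squares blendweights for a target shape (hypothetical helper).

    neutral:     (V, 3) subject's neutral mesh
    blendshapes: (K, V, 3) subject's expression meshes
    target:      (V, 3) shape to explain, e.g. one tracked frame

    Assumption: delta formulation, target ~ neutral + sum_k w_k (B_k - neutral),
    with no bounds or sparsity constraints on the weights.
    """
    # Each column of `basis` is one expression's offset from neutral, flattened.
    basis = (blendshapes - neutral).reshape(len(blendshapes), -1).T  # (3V, K)
    offset = (target - neutral).reshape(-1)  # (3V,)
    weights, *_ = np.linalg.lstsq(basis, offset, rcond=None)
    # Only the weights are returned; the linear reconstruction is discarded,
    # since the captured shape itself serves as the training target.
    return weights
```

Returning only the weights mirrors the note above: the linear estimate is used just to label each shape, while the decoder is supervised with the captured geometry.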


From the database built as described above, the neutral expressions of all subjects are averaged to produce the reference mesh $R$
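
As a minimal sketch of this averaging step, assuming the neutral meshes are stored as a single array of shape (subjects, vertices, 3) in full vertex correspondence (the array layout and the function name `reference_mesh` are assumptions made for illustration):

```python
import numpy as np


def reference_mesh(neutral_meshes: np.ndarray) -> np.ndarray:
    """Average per-vertex positions over subjects.

    neutral_meshes: (S, V, 3), one neutral mesh per subject. The meshes must
    share vertex ordering, otherwise a per-vertex mean is meaningless.
    Returns the (V, 3) reference mesh R.
    """
    if neutral_meshes.ndim != 3 or neutral_meshes.shape[-1] != 3:
        raise ValueError("expected an array of shape (subjects, vertices, 3)")
    return neutral_meshes.mean(axis=0)
```

The per-vertex mean only makes sense because the database meshes are in full correspondence, so vertex `i` refers to the same facial location in every subject.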