[ICLR 2023 GeneFace] [Lip Sync.] [General Animation]
GeneFace: Generalized and High-Fidelity Audio-Driven 3D Talking Face Synthesis
Figure: GeneFace (ICLR 2023)
Task: Select 68 facial keypoints and predict per-keypoint offsets that drive the NeRF rendering.
Motivation: Generalization to unseen audio is limited by the small scale of the training data.
Motion: Keypoint Offsets.
Dataset: LRS3-TED
Views:
- It attempts to reduce NeRF artifacts by translating speech features into facial landmarks, but this often results in inaccurate lip movements.
- It also struggles to reproduce actions such as blinking and eyebrow raising.
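
The keypoint-offset motion representation noted for GeneFace can be illustrated with a small sketch. This is not GeneFace's implementation: the function name `apply_keypoint_offsets`, the neutral template, and the nested-list layout (frames × 68 keypoints × 3 coordinates) are assumptions made only for illustration.

```python
from typing import List, Sequence, Tuple

Point3D = Tuple[float, float, float]

NUM_KEYPOINTS = 68  # keypoint count taken from the Task line of this entry


def apply_keypoint_offsets(
    neutral: Sequence[Point3D],
    offsets_per_frame: Sequence[Sequence[Point3D]],
) -> List[List[Point3D]]:
    """Add per-frame offsets to a neutral keypoint template.

    neutral: 68 rest-pose keypoints (hypothetical template).
    offsets_per_frame: one list of 68 offsets per frame (hypothetical
        output of an audio-to-motion predictor).
    Returns one list of 68 displaced keypoints per frame.
    """
    if len(neutral) != NUM_KEYPOINTS:
        raise ValueError(f"expected {NUM_KEYPOINTS} neutral keypoints")

    frames: List[List[Point3D]] = []
    for offsets in offsets_per_frame:
        if len(offsets) != NUM_KEYPOINTS:
            raise ValueError(f"expected {NUM_KEYPOINTS} offsets per frame")
        # Displace each keypoint by its offset, coordinate by coordinate.
        frame = [
            (kp[0] + off[0], kp[1] + off[1], kp[2] + off[2])
            for kp, off in zip(neutral, offsets)
        ]
        frames.append(frame)
    return frames
```

In a pipeline of this kind, the displaced keypoints would then condition the renderer. A frame whose offsets are all zero reproduces the neutral template.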
[arXiv 2025 KDTalker] [Lip Sync.] [Context Expression] [Head Pose]
Unlock Pose Diversity: Accurate and Efficient Implicit Keypoint-based Spatiotemporal Diffusion for Audio-driven Talking Portrait
Figure: KDTalker (arXiv 2025)
Task: Integrate unsupervised 3D keypoints (K) with diffusion models.
Motivation: 3DMM keypoints are fixed and lack flexibility.
Motion: Keypoint Positions.
Dataset: