Audio-driven Facial Animation Synthesis
Problem
Generate realistic facial animation directly from raw audio while preserving lip-sync accuracy and expressive motion over time.
Method
A deep learning pipeline maps audio features extracted from raw speech to facial motion parameters, with a temporal model on top to keep motion coherent across frames rather than predicting each frame independently.
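The pipeline can be sketched in a minimal form as: frame the raw audio, extract per-frame features, map them to motion parameters, then smooth over time. This is an illustrative stand-in, not the project's actual model; the framing constants, the log-energy feature, the linear map `W`, `b`, and the moving-average smoother are all assumptions replacing the learned deep network and temporal model.

```python
import numpy as np

def frame_audio(audio, frame_len=640, hop=160):
    """Slice raw audio into overlapping frames (assumes 16 kHz, 10 ms hop)."""
    n_frames = 1 + (len(audio) - frame_len) // hop
    return np.stack([audio[i * hop : i * hop + frame_len] for i in range(n_frames)])

def audio_to_motion(audio, W, b, kernel=5):
    """Hypothetical sketch of the pipeline: per-frame log-energy feature ->
    linear map to facial motion parameters -> moving-average smoothing as a
    stand-in for the temporal model. W and b play the role of learned weights."""
    frames = frame_audio(audio)
    feats = np.log1p(np.abs(frames)).mean(axis=1, keepdims=True)  # crude 1-D feature
    motion = feats @ W + b                                        # (T, n_params)
    pad = kernel // 2
    padded = np.pad(motion, ((pad, pad), (0, 0)), mode="edge")    # edge-pad in time
    # Sliding mean over `kernel` frames suppresses frame-to-frame jitter.
    return np.stack([padded[t : t + kernel].mean(axis=0) for t in range(len(motion))])
```

In the real system the linear map would be a deep network and the smoother a learned temporal module, but the data flow (audio frames in, per-frame motion parameters out, temporal coupling last) is the same.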
My Role
I designed the model architecture and training strategy, focusing on lip-sync quality, temporal consistency, and controllable expression behavior.
Focus
- Temporal coherence
- Cross-modal alignment between speech and motion
- Natural and controllable facial dynamics