Talking-Face Research Papers

Automatically Updated on 2026.02.18

Current Search Keywords: Talking Face, Talking Head, Visual Dubbing, Face Generation, Lip Sync, Talker, Portrait, Talking Video, Head Synthesis, Face Reenactment, Wav2Lip, Talking Avatar, Lip Generation, Lip-Synchronization, Portrait Animation, Facial Animation, Lip Expert

If you have any other keywords, please feel free to let us know :)

Web Page (Scrape Code)

Talking Face

Publish Date	Title	Authors	PDF	Code
2026-02-14	EchoTorrent: Towards Swift, Sustained, and Streaming Multi-Modal Video Generation	Rang Meng et.al.	2602.13669	null
2026-02-13	VineetVC: Adaptive Video Conferencing Under Severe Bandwidth Constraints Using Audio-Driven Talking-Head Reconstruction	Vineet Kumar Rakesh et.al.	2602.12758	null
2026-02-12	3DXTalker: Unifying Identity, Lip Sync, Emotion, and Spatial Dynamics in Expressive 3D Talking Avatars	Zhongju Wang et.al.	2602.10516	null
2026-02-11	SoulX-FlashHead: Oracle-guided Generation of Infinite Real-time Streaming Talking Heads	Tan Yu et.al.	2602.07449	null
2026-02-10	Toward Fine-Grained Facial Control in 3D Talking Head Generation	Shaoyang Xie et.al.	2602.09736	null
2026-02-10	AUHead: Realistic Emotional Talking Head Generation via Action Units Control	Jiayi Lyu et.al.	2602.09534	null
2026-02-10	MOVA: Towards Scalable and Synchronized Video-Audio Generation	SII-OpenMOSS Team et.al.	2602.08794	null
2026-02-09	VedicTHG: Symbolic Vedic Computation for Low-Resource Talking-Head Generation in Educational Avatars	Vineet Kumar Rakesh et.al.	2602.08775	null
2026-02-06	Condition Matters in Full-head 3D GANs	Heyuan Li et.al.	2602.07198	null
2026-02-06	Ex-Omni: Enabling 3D Facial Animation Generation for Omni-modal Large Language Models	Haoyu Zhang et.al.	2602.07106	null
2026-02-05	From Blurry to Believable: Enhancing Low-quality Talking Heads with 3D Generative Priors	Ding-Jiun Huang et.al.	2602.06122	null
2026-02-04	A$^2$-LLM: An End-to-end Conversational Audio Avatar Large Language Model	Xiaolin Hu et.al.	2602.04913	null
2026-02-03	Asymmetric Hierarchical Anchoring for Audio-Visual Joint Representation: Resolving Information Allocation Ambiguity for Robust Cross-Modal Generalization	Bixing Wu et.al.	2602.03570	null
2026-02-02	Making Avatars Interact: Towards Text-Driven Human-Object Interaction for Controllable Talking Avatars	Youliang Zhang et.al.	2602.01538	null
2026-01-31	JoyAvatar: Unlocking Highly Expressive Avatars via Harmonized Text-Audio Conditioning	Ruikui Wang et.al.	2602.00702	null
2026-01-30	LPIPS-AttnWav2Lip: Generic Audio-Driven lip synchronization for Talking Head Generation in the Wild	Zhipeng Chen et.al.	2602.00189	null
2026-01-30	MIRRORTALK: Forging Personalized Avatars Via Disentangled Style and Hierarchical Motion Control	Renjie Lu et.al.	2601.22501	null
2026-01-29	JUST-DUB-IT: Video Dubbing via Joint Audio-Visual Diffusion	Anthony Chen et.al.	2601.22143	null
2026-01-29	EditYourself: Audio-Driven Generation and Manipulation of Talking Head Videos with Diffusion Transformers	John Flynn et.al.	2601.22127	null
2026-01-29	Lightweight High-Fidelity Low-Bitrate Talking Face Compression for 3D Video Conference	Jianglong Li et.al.	2601.21269	null
2026-01-29	SkyReels-V3 Technique Report	Debang Li et.al.	2601.17323	null
2026-01-28	SFQA: A Comprehensive Perceptual Quality Assessment Dataset for Singing Face Generation	Zhilin Gao et.al.	2601.20385	null
2026-01-27	Uncertainty-Aware 3D Emotional Talking Face Synthesis with Emotion Prior Distillation	Nanhan Shen et.al.	2601.19112	null
2026-01-26	Audio-Driven Talking Face Generation with Blink Embedding and Hash Grid Landmarks Encoding	Yuhui Zhang et.al.	2601.18849	null
2026-01-26	Splat-Portrait: Generalizing Talking Heads with Gaussian Splatting	Tong Shi et.al.	2601.18633	null
2026-01-21	FunCineForge: A Unified Dataset Toolkit and Model for Zero-Shot Movie Dubbing in Diverse Cinematic Scenes	Jiaxuan Liu et.al.	2601.14777	null
2026-01-20	HoverAI: An Embodied Aerial Agent for Natural Human-Drone Interaction	Yuhua Jin et.al.	2601.13801	null
2026-01-19	Exploring Talking Head Models With Adjacent Frame Prior for Speech-Preserving Facial Expression Manipulation	Zhenxuan Lu et.al.	2601.12876	null
2026-01-19	Generalizable and Animatable 3D Full-Head Gaussian Avatar from a Single Image	Shuling Zhao et.al.	2601.12770	null
2026-01-15	RSATalker: Realistic Socially-Aware Talking Head Generation for Multi-Turn Conversation	Peng Chen et.al.	2601.10606	null
2026-01-15	EditEmoTalk: Controllable Speech-Driven 3D Facial Animation with Continuous Expression Editing	Diqiong Jiang et.al.	2601.10000	null
2026-01-14	Now You See Me, Now You Don’t: A Unified Framework for Expression Consistent Anonymization in Talking Head Videos	Anil Egin et.al.	2601.11635	null
2026-01-14	MoCha: End-to-End Video Character Replacement without Structural Guidance	Zhengbo Xu et.al.	2601.08587	null
2026-01-13	Deep Learning Based Facial Retargeting Using Local Patches	Yeonsoo Choi et.al.	2601.08429	null
2026-01-08	MM-Sonate: Multimodal Controllable Audio-Video Generation with Zero-Shot Voice Cloning	Chunyu Qiang et.al.	2601.01568	null
2026-01-07	REFA: Real-time Egocentric Facial Animations for Virtual Reality	Qiang Zhang et.al.	2601.03507	null
2026-01-05	HeadLighter: Disentangling Illumination in Generative 3D Gaussian Heads via Lightstage Captures	Yating Wang et.al.	2601.02103	null
2026-01-05	ESGaussianFace: Emotional and Stylized Audio-Driven Facial Animation via 3D Gaussian Splatting	Chuhang Ma et.al.	2601.01847	null
2026-01-05	MANGO: Natural Multi-speaker 3D Talking Head Generation via 2D-Lifted Enhancement	Lei Zhu et.al.	2601.01749	null
2026-01-02	Avatar Forcing: Real-Time Interactive Head Avatar Generation for Natural Conversation	Taekyung Ki et.al.	2601.00664	null
2025-12-31	From Inpainting to Editing: A Self-Bootstrapping Framework for Context-Rich Visual Dubbing	Xu He et.al.	2512.25066	null
2025-12-30	DyStream: Streaming Dyadic Talking Heads Generation via Flow Matching-based Autoregressive Model	Bohong Chen et.al.	2512.24408	null
2025-12-30	SyncAnyone: Implicit Disentanglement via Progressive Self-Correction for Lip-Syncing in the wild	Xindi Zhang et.al.	2512.21736	null
2025-12-29	Knot Forcing: Taming Autoregressive Video Diffusion Models for Real-time Infinite Interactive Portrait Animation	Steven Xiao et.al.	2512.21734	null
2025-12-29	Efficient and Robust Video Defense Framework against 3D-field Personalized Talking Face	Rui-qing Sun et.al.	2512.21019	null
2025-12-27	PTalker: Personalized Speech-Driven 3D Talking Head Animation via Style Disentanglement and Modality Alignment	Bin Wang et.al.	2512.22602	null
2025-12-24	ALIVE: An Avatar-Lecture Interactive Video Engine with Content-Aware Retrieval for Real-Time Interaction	Md Zabirul Islam et.al.	2512.20858	null
2025-12-23	TAVID: Text-Driven Audio-Visual Interactive Dialogue Generation	Ji-Hoon Kim et.al.	2512.20296	null
2025-12-23	FlashLips: 100-FPS Mask-Free Latent Lip-Sync using Reconstruction Instead of Diffusion or GANs	Andreas Zinonos et.al.	2512.20033	null
2025-12-22	ActAvatar: Temporally-Aware Precise Action Control for Talking Avatars	Ziqiao Peng et.al.	2512.19546	null
2025-12-21	In-Context Audio Control of Video Diffusion Transformers	Wenze Liu et.al.	2512.18772	null
2025-12-20	Asynchronous Pipeline Parallelism for Real-Time Multilingual Lip Synchronization in Video Communication Systems	Eren Caglar et.al.	2512.18318	null
2025-12-20	MACE-Dance: Motion-Appearance Cascaded Experts for Music-Driven Dance Video Generation	Kaixing Yang et.al.	2512.18181	null
2025-12-19	SynergyWarpNet: Attention-Guided Cooperative Warping for Neural Portrait Animation	Shihang Li et.al.	2512.17331	null
2025-12-19	InstructDubber: Instruction-based Alignment for Zero-shot Movie Dubbing	Zhedong Zhang et.al.	2512.17154	null
2025-12-18	FlashPortrait: 6x Faster Infinite Portrait Animation with Adaptive Latent Prediction	Shuyuan Tu et.al.	2512.16900	null
2025-12-18	Instant Expressive Gaussian Head Avatar via 3D-Aware Expression Distillation	Kaiwen Jiang et.al.	2512.16893	null
2025-12-17	FlexAvatar: Learning Complete 3D Head Avatars with Partial Supervision	Tobias Kirschstein et.al.	2512.15599	null
2025-12-17	DeX-Portrait: Disentangled and Expressive Portrait Animation via Explicit and Latent Motion Representations	Yuxiang Shi et.al.	2512.15524	null
2025-12-16	TalkVerse: Democratizing Minute-Long Audio-Driven Video Generation	Zhenzhi Wang et.al.	2512.14938	null
2025-12-16	VASA-3D: Lifelike Audio-Driven Gaussian Head Avatars from a Single Image	Sicheng Xu et.al.	2512.14677	null
2025-12-16	FacEDiT: Unified Talking Face Editing and Generation via Facial Motion Infilling	Kim Sung-Bin et.al.	2512.14056	null
2025-12-16	Seedance 1.5 pro: A Native Audio-Visual Joint Generation Foundation Model	Heyi Chen et.al.	2512.13507	null
2025-12-15	JoVA: Unified Multimodal Learning for Joint Video-Audio Generation	Xiaohu Huang et.al.	2512.13677	null
2025-12-15	Soul: Breathe Life into Digital Human for High-fidelity Long-term Multimodal Animation	Jiangning Zhang et.al.	2512.13495	null
2025-12-15	KlingAvatar 2.0 Technical Report	Kling Team et.al.	2512.13313	null
2025-12-15	STARCaster: Spatio-Temporal AutoRegressive Video Diffusion for Identity- and View-Aware Talking Portraits	Foivos Paraperas Papantoniou et.al.	2512.13247	null
2025-12-12	FactorPortrait: Controllable Portrait Animation via Disentangled Expression, Pose, and Viewpoint	Jiapeng Tang et.al.	2512.11645	null
2025-12-12	JoyAvatar: Real-time and Infinite Audio-Driven Avatar Generation with Autoregressive Diffusion	Chaochao Li et.al.	2512.11423	null
2025-12-12	KeyframeFace: From Text to Expressive Facial Keyframes	Jingchao Wu et.al.	2512.11321	null
2025-12-12	PersonaLive! Expressive Portrait Image Animation for Live Streaming	Zhiyuan Li et.al.	2512.11253	null
2025-12-12	REST: Diffusion-based Real-time End-to-end Streaming Talking Head Generation via ID-Context Caching and Asynchronous Streaming Distillation	Haotian Wang et.al.	2512.11229	null
2025-12-11	GaussianHeadTalk: Wobble-Free 3D Talking Heads with Audio Driven Gaussian Splatting	Madhav Agarwal et.al.	2512.10939	null
2025-12-10	EmoDiffTalk: Emotion-aware Diffusion for Editable 3D Gaussian Talking Head	Chang Liu et.al.	2512.05991	null
2025-12-04	LiteVGGT: Boosting Vanilla VGGT via Geometry-aware Cached Token Merging	Zhijian Shu et.al.	2512.04939	null
2025-12-04	Measuring the Unspoken: A Disentanglement Model and Benchmark for Psychological Analysis in the Wild	Yigui Feng et.al.	2512.04728	null
2025-12-02	DF-Mamba: Deformable State Space Modeling for 3D Hand Pose Estimation in Interactions	Yifan Zhou et.al.	2512.02727	null
2025-12-01	ELVIS: Enhance Low-Light for Video Instance Segmentation in the Dark	Joanne Lin et.al.	2512.01495	null
2025-12-01	EvalTalker: Learning to Evaluate Real-Portrait-Driven Multi-Subject Talking Humans	Yingjie Zhou et.al.	2512.01340	null
2025-11-30	TalkingPose: Efficient Face and Gesture Animation with Feedback-guided Diffusion Model	Alireza Javanmardi et.al.	2512.00909	null
2025-11-29	MVAD: A Comprehensive Multimodal Video-Audio Dataset for AIGC Detection	Mengxue Hu et.al.	2512.00336	null
2025-11-28	AnyTalker: Scaling Multi-Person Talking Video Generation with Interactivity Refinement	Zhizhou Zhong et.al.	2511.23475	null
2025-11-28	DAONet-YOLOv8: An Occlusion-Aware Dual-Attention Network for Tea Leaf Pest and Disease Detection	Yefeng Wu et.al.	2511.23222	null
2025-11-28	CoordSpeaker: Exploiting Gesture Captioning for Coordinated Caption-Empowered Co-Speech Gesture Generation	Fengyi Fang et.al.	2511.22863	null
2025-11-27	AI killed the video star. Audio-driven diffusion model for expressive talking head generation	Baptiste Chopin et.al.	2511.22488	null
2025-11-27	VSpeechLM: A Visual Speech Language Model for Visual Text-to-Speech Task	Yuyue Wang et.al.	2511.22229	null
2025-11-27	IMTalker: Efficient Audio-driven Talking Face Generation with Implicit Motion Transfer	Bo Chen et.al.	2511.22167	null
2025-11-27	Lips-Jaw and Tongue-Jaw Articulatory Tradeoff in DYNARTmo	Bernd J. Kröger et.al.	2511.22155	null
2025-11-27	DiP: Taming Diffusion Models in Pixel Space	Zhennan Chen et.al.	2511.18822	null
2025-11-26	Passive Dementia Screening via Facial Temporal Micro-Dynamics Analysis of In-the-Wild Talking-Head Video	Filippo Cenacchi et.al.	2511.13802	null
2025-11-25	Image Diffusion Models Exhibit Emergent Temporal Propagation in Videos	Youngseo Kim et.al.	2511.19936	null
2025-11-24	Blinking Beyond EAR: A Stable Eyelid Angle Metric for Driver Drowsiness Detection and Data Augmentation	Mathis Wolter et.al.	2511.19519	null
2025-11-24	Assessing the alignment between infants’ visual and linguistic experience using multimodal language models	Alvin Wei Ming Tan et.al.	2511.18824	null
2025-11-23	The Locally Deployable Virtual Doctor: LLM Based Human Interface for Automated Anamnesis and Database Conversion	Jan Benedikt Ruhland et.al.	2511.18632	null
2025-11-23	RigAnyFace: Scaling Neural Facial Mesh Auto-Rigging with Unlabeled Data	Wenchao Ma et.al.	2511.18601	null
2025-11-22	A superpersuasive autonomous policy debating system	Allen Roush et.al.	2511.17854	null
2025-11-21	Investigating self-supervised representations for audio-visual deepfake detection	Dragos-Alexandru Boldisor et.al.	2511.17181	null
2025-11-21	One Small Step in Latent, One Giant Leap for Pixels: Fast Latent Upscale Adapter for Your Diffusion Models	Aleksandr Razin et.al.	2511.10629	null
2025-11-20	Motion Transfer-Enhanced StyleGAN for Generating Diverse Macaque Facial Expressions	Takuya Igaue et.al.	2511.16711	null
2025-11-19	StreamingTalker: Audio-driven 3D Facial Animation with Autoregressive Diffusion Model	Yifan Yang et.al.	2511.14223	null
2025-11-18	Blur-Robust Detection via Feature Restoration: An End-to-End Framework for Prior-Guided Infrared UAV Target Detection	Xiaolin Wang et.al.	2511.14371	null
2025-11-18	Towards Authentic Movie Dubbing with Retrieve-Augmented Director-Actor Interaction Learning	Rui Liu et.al.	2511.14249	null
2025-11-17	B2F: End-to-End Body-to-Face Motion Generation with Style Reference	Bokyung Jang et.al.	2511.13988	null
2025-11-17	Uni-Hand: Universal Hand Motion Forecasting in Egocentric Views	Junyi Ma et.al.	2511.12878	null
2025-11-14	3D Gaussian and Diffusion-Based Gaze Redirection	Abiram Panchalingam et.al.	2511.11231	null
2025-11-12	GRACE: Designing Generative Face Video Codec via Agile Hardware-Centric Workflow	Rui Wan et.al.	2511.09272	null
2025-11-11	StableMorph: High-Quality Face Morph Generation with Stable Diffusion	Wassim Kabbani et.al.	2511.08090	null
2025-11-11	Is It Truly Necessary to Process and Fit Minutes-Long Reference Videos for Personalized Talking Face Generation?	Rui-Qing Sun et.al.	2511.07940	null
2025-11-10	LiveNeRF: Efficient Face Replacement Through Neural Radiance Fields Integration	Tung Vu et.al.	2511.07552	null
2025-11-10	The Inner Kernel of the Classical Kuiper Belt	Amir Siraj et.al.	2511.07512	null
2025-11-10	ConsistTalk: Intensity Controllable Temporally Consistent Talking Head Generation with Diffusion Noise Search	Zhenjie Liu et.al.	2511.06833	null
2025-11-08	DiLO: Disentangled Latent Optimization for Learning Shape and Deformation in Grouped Deforming 3D Objects	Mostofa Rafid Uddin et.al.	2511.06115	null
2025-11-08	Reperio-rPPG: Relational Temporal Graph Neural Networks for Periodicity Learning in Remote Physiological Measurement	Ba-Thinh Nguyen et.al.	2511.05946	null
2025-11-07	Shared Latent Representation for Joint Text-to-Audio-Visual Synthesis	Dogucan Yaman et.al.	2511.05432	null
2025-11-07	THEval. Evaluation Framework for Talking Head Video Generation	Nabyl Quignon et.al.	2511.04520	null
2025-11-05	UniAVGen: Unified Audio and Video Generation with Asymmetric Cross-Modal Interactions	Guozhen Zhang et.al.	2511.03334	null
2025-11-04	Densemarks: Learning Canonical Embeddings for Human Heads Images via Point Tracks	Dmitrii Pozdeev et.al.	2511.02830	null
2025-10-29	Learning Disentangled Speech- and Expression-Driven Blendshapes for 3D Talking Face Animation	Yuxiang Mao et.al.	2510.25234	null
2025-10-28	See the Speaker: Crafting High-Resolution Talking Faces from Speech with Prior Guidance and Region Refinement	Jinting Wang et.al.	2510.26819	null
2025-10-27	Lookahead Anchoring: Preserving Character Identity in Audio-Driven Human Animation	Junyoung Seo et.al.	2510.23581	null
2025-10-27	Revising Second Order Terms in Deep Animation Video Coding	Konstantin Schmidt et.al.	2510.23561	null
2025-10-26	MAGIC-Talk: Motion-aware Audio-Driven Talking Face Generation with Customizable Identity Control	Fatemeh Nazarieh et.al.	2510.22810	null
2025-10-26	DeepfakeBench-MM: A Comprehensive Benchmark for Multimodal Deepfake Detection	Kangran Zhao et.al.	2510.22622	null
2025-10-24	Unmasking Puppeteers: Leveraging Biometric Leakage to Disarm Impersonation in AI-based Videoconferencing	Danial Samadi Vahdati et.al.	2510.03548	null
2025-10-23	LSF-Animation: Label-Free Speech-Driven Facial Animation via Implicit Feature Representation	Xin Lu et.al.	2510.21864	null
2025-10-16	PIA: Deepfake Detection Using Phoneme-Temporal and Identity-Dynamic Analysis	Soumyya Kanti Datta et.al.	2510.14241	null
2025-10-14	Playmate2: Training-Free Multi-Character Audio-Driven Animation via Diffusion Transformer with Reward Feedback	Xingpei Ma et.al.	2510.12089	null
2025-10-12	DEMO: Disentangled Motion Latent Flow Matching for Fine-Grained Controllable Talking Portrait Synthesis	Peiyin Chen et.al.	2510.10650	null
2025-10-11	VividAnimator: An End-to-End Audio and Pose-driven Half-Body Human Animation Framework	Donglin Huang et.al.	2510.10269	null
2025-10-11	SyncLipMAE: Contrastive Masked Pretraining for Audio-Visual Talking-Face Representation	Zeyu Ling et.al.	2510.10069	null
2025-10-09	Paper2Video: Automatic Video Generation from Scientific Papers	Zeyu Zhu et.al.	2510.05096	null
2025-10-08	A Bridge from Audio to Video: Phoneme-Viseme Alignment Allows Every Face to Speak Multiple Languages	Zibo Su et.al.	2510.06612	null
2025-10-03	EGSTalker: Real-Time Audio-Driven Talking Head Generation with Efficient Gaussian Deformation	Tianheng Zhu et.al.	2510.08587	null
2025-10-02	Input-Aware Sparse Attention for Real-Time Co-Speech Video Generation	Beijia Lu et.al.	2510.02617	null
2025-10-01	Audio Driven Real-Time Facial Animation for Social Telepresence	Jiye Lee et.al.	2510.01176	null
2025-09-30	3DiFACE: Synthesizing and Editing Holistic 3D Facial Animation	Balamurugan Thambiraja et.al.	2509.26233	null
2025-09-26	StableDub: Taming Diffusion Prior for Generalized and Efficient Visual Dubbing	Liyang Chen et.al.	2509.21887	null
2025-09-25	Unlocking Financial Insights: An advanced Multimodal Summarization with Multimodal Output Framework for Financial Advisory Videos	Sarmistha Das et.al.	2509.20961	null
2025-09-24	KSDiff: Keyframe-Augmented Speech-Aware Dual-Path Diffusion for Facial Animation	Tianle Lyu et.al.	2509.20128	null
2025-09-24	Comparative Study of Subjective Video Quality Assessment Test Methods in Crowdsourcing for Varied Use Cases	Babak Naderi et.al.	2509.20118	null
2025-09-24	SynchroRaMa: Lip-Synchronized and Emotion-Aware Talking Face Generation via Multi-Modal Emotion Embedding	Phyo Thet Yee et.al.	2509.19965	null
2025-09-24	Talking Head Generation via AU-Guided Landmark Prediction	Shao-Yu Chang et.al.	2509.19749	null
2025-09-23	Audio-Driven Universal Gaussian Head Avatars	Kartik Teotia et.al.	2509.18924	null
2025-09-22	“I don’t like my avatar”: Investigating Human Digital Doubles	Siyi Liu et.al.	2509.17748	null
2025-09-22	Stable Video-Driven Portraits	Mallikarjun B. R. et.al.	2509.17476	null
2025-09-21	Beat on Gaze: Learning Stylized Generation of Gaze and Head Dynamics	Chengwei Shi et.al.	2509.17168	null
2025-09-21	PGSTalker: Real-Time Audio-Driven Talking Head Generation via 3D Gaussian Splatting with Pixel-Aware Density Control	Tianheng Zhu et.al.	2509.16922	null
2025-09-20	Follow-Your-Emoji-Faster: Towards Efficient, Fine-Controllable, and Expressive Freestyle Portrait Animation	Yue Ma et.al.	2509.16630	null
2025-09-17	Kling-Avatar: Grounding Multimodal Instructions for Cascaded Long-Duration Avatar Animation Synthesis	Yikang Ding et.al.	2509.09595	null
2025-09-16	A Lightweight Pipeline for Noisy Speech Voice Cloning and Accurate Lip Sync Synthesis	Javeria Amir et.al.	2509.12831	null
2025-09-15	AvatarSync: Rethinking Talking-Head Animation through Autoregressive Perspective	Yuchen Deng et.al.	2509.12052	null
2025-09-10	Bitrate-Controlled Diffusion for Disentangling Motion and Content in Video	Xiao Li et.al.	2509.08376	null
2025-09-09	PanoLAM: Large Avatar Model for Gaussian Full-Head Synthesis from One-shot Unposed Image	Peng Li et.al.	2509.07552	null
2025-09-04	Durian: Dual Reference-guided Portrait Animation with Attribute Transfer	Hyunsoo Cha et.al.	2509.04434	null
2025-08-28	EmoCAST: Emotional Talking Portrait via Emotive Text Description	Yiguo Jiang et.al.	2508.20615	null
2025-08-27	InfinityHuman: Towards Long-Term Audio-Driven Human	Xiaodi Li et.al.	2508.20210	null
2025-08-27	Improving Generalization in Deepfake Detection with Face Foundation Models and Metric Learning	Stelios Mylonas et.al.	2508.19730	null
2025-08-26	OmniHuman-1.5: Instilling an Active Mind in Avatars via Cognitive Simulation	Jianwen Jiang et.al.	2508.19209	null
2025-08-26	PanoHair: Detailed Hair Strand Synthesis on Volumetric Heads	Shashikant Verma et.al.	2508.18944	null
2025-08-26	Wan-S2V: Audio-Driven Cinematic Video Generation	Xin Gao et.al.	2508.18621	null
2025-08-26	Supervising 3D Talking Head Avatars with Analysis-by-Audio-Synthesis	Radek Daněček et.al.	2504.13386	null
2025-08-25	Lightning Fast Caching-based Parallel Denoising Prediction for Accelerating Talking Head Generation	Jianzhi Long et.al.	2509.00052	null
2025-08-25	EAI-Avatar: Emotion-Aware Interactive Talking Head Generation	Haijie Yang et.al.	2508.18337	null
2025-08-22	Audio2Face-3D: Audio-driven Realistic Facial Animation For Digital Avatars	NVIDIA et.al.	2508.16401	null
2025-08-20	D^3-Talker: Dual-Branch Decoupled Deformation Fields for Few-Shot 3D Talking Head Synthesis	Yuhang Guo et.al.	2508.14449	null
2025-08-20	Taming Transformer for Emotion-Controllable Talking Face Generation	Ziqi Zhang et.al.	2508.14359	null
2025-08-19	TalkVid: A Large-Scale Diversified Dataset for Audio-Driven Talking Head Synthesis	Shunian Chen et.al.	2508.13618	null
2025-08-19	EDTalk++: Full Disentanglement for Controllable Talking Head Synthesis	Shuai Tan et.al.	2508.13442	null
2025-08-18	Human Feedback Driven Dynamic Speech Emotion Recognition	Ilya Fedorov et.al.	2508.14920	null
2025-08-17	CEM-Net: Cross-Emotion Memory Network for Emotional Talking Face Generation	Kangyi Wu et.al.	2508.12368	null
2025-08-16	RealTalk: Realistic Emotion-Aware Lifelike Talking-Head Synthesis	Wenqing Wang et.al.	2508.12163	null
2025-08-16	SimInterview: Transforming Business Education through Large Language Model-Based Simulated Multilingual Interview Training System	Truong Thanh Hung Nguyen et.al.	2508.11873	null
2025-08-15	FantasyTalking2: Timestep-Layer Adaptive Preference Optimization for Audio-Driven Portrait Animation	MengChao Wang et.al.	2508.11255	null
2025-08-14	HM-Talker: Hybrid Motion Modeling for High-Fidelity Talking Head Synthesis	Shiyu Liu et.al.	2508.10566	null
2025-08-14	M2DAO-Talker: Harmonizing Multi-granular Motion Decoupling and Alternating Optimization for Talking-head Generation	Kui Jiang et.al.	2507.08307	null
2025-08-14	MEDTalk: Multimodal Controlled 3D Facial Animation with Dynamic Emotions by Disentangled Embedding	Chang Liu et.al.	2507.06071	null
2025-08-13	LIA-X: Interpretable Latent Portrait Animator	Yaohui Wang et.al.	2508.09959	null
2025-08-12	Preview WB-DH: Towards Whole Body Digital Human Bench for the Generation of Whole-body Talking Avatar Videos	Chaoyi Wang et.al.	2508.08891	null
2025-08-11	Learning Phonetic Context-Dependent Viseme for Enhancing Speech-Driven 3D Facial Animation	Hyung Kyu Kim et.al.	2507.20568	null
2025-08-10	KLASSify to Verify: Audio-Visual Deepfake Detection Using SSL-based Audio and Handcrafted Visual Features	Ivan Kukanov et.al.	2508.07337	null
2025-08-08	MotionSwap	Om Patil et.al.	2508.06430	null
2025-08-08	MoDA: Multi-modal Diffusion Architecture for Talking Head Generation	Xinyang Li et.al.	2507.03256	null
2025-08-07	Evaluation of a Sign Language Avatar on Comprehensibility, User Experience & Acceptability	Fenya Wasserroth et.al.	2508.05358	null
2025-08-07	RAP: Real-time Audio-driven Portrait Animation with Video Diffusion Transformer	Fangyu Du et.al.	2508.05115	null
2025-08-07	UniTalker: Conversational Speech-Visual Synthesis	Yifan Hu et.al.	2508.04585	null
2025-08-07	AudioGen-Omni: A Unified Multimodal Diffusion Transformer for Video-Synchronized Audio, Speech, and Song Generation	Le Wang et.al.	2508.00733	null
2025-08-06	MienCap: Realtime Performance-Based Facial Animation with Live Mood Dynamics	Ye Pan et.al.	2508.04687	null
2025-08-06	READ: Real-time and Efficient Asynchronous Diffusion for Audio-driven Talking Head Generation	Haotian Wang et.al.	2508.03457	null
2025-08-06	Disentangle Identity, Cooperate Emotion: Correlation-Aware Emotional Talking Portrait Generation	Weipeng Tan et.al.	2504.18087	null
2025-08-05	Multi-human Interactive Talking Dataset	Zeyu Zhu et.al.	2508.03050	null
2025-08-04	X-Actor: Emotional and Expressive Long-Range Portrait Acting from Audio	Chenxu Zhang et.al.	2508.02944	null
2025-08-04	Text2Lip: Progressive Lip-Synced Talking Face Generation from Text via Viseme-Guided Rendering	Xu Wang et.al.	2508.02362	null
2025-08-04	Is It Really You? Exploring Biometric Verification Scenarios in Photorealistic Talking-Head Avatar Videos	Laura Pedrouzo-Rodriguez et.al.	2508.00748	null
2025-07-31	Who is a Better Talker: Subjective and Objective Quality Assessment for AI-Generated Talking Heads	Yingjie Zhou et.al.	2507.23343	null
2025-07-30	X-NeMo: Expressive Neural Motion Reenactment via Disentangled Latent Attention	Xiaochen Zhao et.al.	2507.23143	null
2025-07-30	Robust Deepfake Detection for Electronic Know Your Customer Systems Using Registered Images	Takuma Amada et.al.	2507.22601	null
2025-07-29	DiTalker: A Unified DiT-based Framework for High-Quality and Speaking Styles Controllable Portrait Animation	He Feng et.al.	2508.06511	null
2025-07-29	JWB-DH-V1: Benchmark for Joint Whole-Body Talking Avatar and Speech Generation Version 1	Xinhan Di et.al.	2507.20987	null
2025-07-28	Mask-Free Audio-driven Talking Face Generation for Enhanced Visual Quality and Identity Preservation	Dogucan Yaman et.al.	2507.20953	null
2025-07-28	MemoryTalker: Personalized Speech-Driven 3D Facial Animation via Audio-Guided Stylization	Hyung Kyu Kim et.al.	2507.20562	null
2025-07-28	JOLT3D: Joint Learning of Talking Heads and 3DMM Parameters with Application to Lip-Sync	Sungjoon Park et.al.	2507.20452	null
2025-07-25	Face2VoiceSync: Lightweight Face-Voice Consistency for Text-Driven Talking Face Generation	Fang Kang et.al.	2507.19225	null
2025-07-24	Tiny is not small enough: High-quality, low-resource facial animation models through hybrid knowledge distillation	Zhen Han et.al.	2507.18352	null
2025-07-24	Celeb-DF++: A Large-scale Challenging Video DeepFake Benchmark for Generalizable Forensics	Yuezun Li et.al.	2507.18015	null
2025-07-22	Livatar-1: Real-Time Talking Heads Generation with Tailored Flow Matching	Haiyang Liu et.al.	2507.18649	null
2025-07-22	Navigating Large-Pose Challenge for High-Fidelity Face Reenactment with Video Diffusion Model	Mingtao Guo et.al.	2507.16341	null
2025-07-21	VisualSpeaker: Visually-Guided 3D Avatar Lip Synthesis	Alexandre Symeonidis-Herzig et.al.	2507.06060	null
2025-07-17	FantasyPortrait: Enhancing Multi-Character Portrait Animation with Expression-Augmented Diffusion Transformers	Qiang Wang et.al.	2507.12956	null
2025-07-17	ATL-Diff: Audio-Driven Talking Head Generation with Early Landmarks-Guide Noise Diffusion	Hoang-Son Vo et.al.	2507.12804	null
2025-07-17	Think-Before-Draw: Decomposing Emotion Semantics & Fine-Grained Controllable Expressive Talking Head Generation	Hanlei Shi et.al.	2507.12761	null
2025-07-17	Cross-Modal Watermarking for Authentic Audio Recovery and Tamper Localization in Synthesized Audiovisual Forgeries	Minyoung Kim et.al.	2507.12723	null
2025-07-16	AU-Blendshape for Fine-grained Stylized 3D Facial Expression Manipulation	Hao Li et.al.	2507.12001	null
2025-07-15	Model See Model Do: Speech-Driven Facial Animation with Style Control	Yifang Pan et.al.	2505.01319	null
2025-07-11	Detecting Deepfake Talking Heads from Facial Biometric Anomalies	Justin D. Norman et.al.	2507.08917	null
2025-07-10	GGTalker: Talking Head Systhesis with Generalizable Gaussian Priors and Identity-Specific Adaptation	Wentao Hu et.al.	2506.21513	null
2025-07-07	MoDiT: Learning Highly Consistent 3D Motion Coefficients with Diffusion Transformer for Talking Head Generation	Yucheng Wang et.al.	2507.05092	null
2025-07-05	EchoMimicV3: 1.3B Parameters are All You Need for Unified Multi-Modal and Multi-Task Human Animation	Rang Meng et.al.	2507.03905	null
2025-07-03	CanonSwap: High-Fidelity and Consistent Video Face Swapping via Canonical Space Modulation	Xiangyang Luo et.al.	2507.02691	null
2025-07-02	FixTalk: Taming Identity Leakage for High-Quality Talking Head Generation in Extreme Cases	Shuai Tan et.al.	2507.01390	null
2025-07-01	ICME 2025 Grand Challenge on Video Super-Resolution for Video Conferencing	Babak Naderi et.al.	2506.12269	null
2025-06-30	JAM-Flow: Joint Audio-Motion Synthesis with Flow Matching	Mingi Kwon et.al.	2506.23552	null
2025-06-27	MirrorMe: Towards Realtime and High Fidelity Audio-Driven Halfbody Animation	Dechao Meng et.al.	2506.22065	null
2025-06-27	Few-Shot Identity Adaptation for 3D Talking Heads via Global Gaussian Field	Hong Nie et.al.	2506.22044	null
2025-06-27	RiverEcho: Real-Time Interactive Digital System for Ancient Yellow River Culture	Haofeng Wang et.al.	2506.21865	null
2025-06-24	Bind-Your-Avatar: Multi-Talking-Character Video Generation with Dynamic 3D-mask-based Embedding Router	Yubo Huang et.al.	2506.19833	null
2025-06-23	Advancing Talking Head Generation: A Comprehensive Survey of Multi-Modal Methodologies, Datasets, Evaluation Metrics, and Loss Functions	Vineet Kumar Rakesh et.al.	2507.02900	null
2025-06-23	OmniAvatar: Efficient Audio-Driven Avatar Video Generation with Adaptive Body Animation	Qijun Gan et.al.	2506.18866	null
2025-06-23	CGS-GAN: 3D Consistent Gaussian Splatting GANs for High Resolution Human Head Synthesis	Florian Barthel et.al.	2505.17590	null
2025-06-17	SyncTalk++: High-Fidelity and Efficient Synchronized Talking Heads Synthesis Using Gaussian Splatting	Ziqiao Peng et.al.	2506.14742	null
2025-06-17	Compressed Video Super-Resolution based on Hierarchical Encoding	Yuxuan Jiang et.al.	2506.14381	null
2025-06-16	Audio-Visual Driven Compression for Low-Bitrate Talking Head Videos	Riku Takahashi et.al.	2506.13419	null
2025-06-15	iDiT-HOI: Inpainting-based Hand Object Interaction Reenactment via Video Diffusion Transformer	Zhelun Shen et.al.	2506.12847	null
2025-06-10	HunyuanVideo-HOMA: Generic Human-Object Interaction in Multimodal Driven Human Animation	Ziyao Huang et.al.	2506.08797	null
2025-06-03	NTIRE 2025 XGC Quality Assessment Challenge: Methods and Results	Xiaohong Liu et.al.	2506.02875	null
2025-06-03	OmniTalker: One-shot Real-time Text-Driven Talking Audio-Video Generation With Multimodal Style Mimicking	Zhongjian Wang et.al.	2504.02433	null
2025-06-02	Cocktail-Party Audio-Visual Speech Recognition	Thai-Binh Nguyen et.al.	2506.02178	null
2025-06-02	Low-Rank Head Avatar Personalization with Registers	Sai Tanmay Reddy Chakkera et.al.	2506.01935	null
2025-06-02	Silence is Golden: Leveraging Adversarial Examples to Nullify Audio Control in LDM-based Talking-Head Generation	Yuan Gan et.al.	2506.01591	null
2025-06-01	SkyReels-Audio: Omni Audio-Conditioned Talking Portraits in Video Diffusion Transformers	Zhengcong Fei et.al.	2506.00830	null
2025-05-30	TalkingHeadBench: A Multi-Modal Benchmark & Analysis of Talking-Head DeepFake Detection	Xinqi Xiong et.al.	2505.24866	null
2025-05-29	Hallo4: High-Fidelity Dynamic Portrait Animation via Direct Preference Optimization and Temporal Motion Modulation	Jiahao Cui et.al.	2505.23525	null
2025-05-29	Video Editing for Audio-Visual Dubbing	Binyamin Manela et.al.	2505.23406	null
2025-05-29	Wav2Sem: Plug-and-Play Audio Semantic Decoupling for 3D Speech-Driven Facial Animation	Hao Li et.al.	2505.23290	null
2025-05-29	MMGT: Motion Mask Guided Two-Stage Network for Co-Speech Gesture Video Generation	Siyuan Wang et.al.	2505.23120	null
2025-05-28	Let Them Talk: Audio-Driven Multi-Person Conversational Video Generation	Zhe Kong et.al.	2505.22647	null
2025-05-28	Tell me Habibi, is it Real or Fake?	Kartik Kuckreja et.al.	2505.22581	null
2025-05-28	Neural Face Skinning for Mesh-agnostic Facial Expression Cloning	Sihun Cha et.al.	2505.22416	null
2025-05-28	FaceEditTalker: Interactive Talking Head Generation with Facial Attribute Editing	Guanwen Feng et.al.	2505.22141	null
2025-05-28	RESOUND: Speech Reconstruction from Silent Videos via Acoustic-Semantic Decomposed Modeling	Long-Khanh Pham et.al.	2505.22024	null
2025-05-27	OmniSync: Towards Universal Lip Synchronization via Diffusion Transformers	Ziqiao Peng et.al.	2505.21448	null
2025-05-26	Total-Editing: Head Avatar with Editable Appearance, Motion, and Lighting	Yizhou Zhao et.al.	2505.20582	null
2025-05-26	DualTalk: Dual-Speaker Interaction for 3D Talking Head Conversations	Ziqiao Peng et.al.	2505.18096	null
2025-05-14	Test-Time Augmentation for Pose-invariant Face Recognition	Jaemin Jung et.al.	2505.09256	null
2025-05-10	VTutor: An Animated Pedagogical Agent SDK that Provide Real Time Multi-Model Feedback	Eason Chen et.al.	2505.06676	null
2025-05-10	OT-Talk: Animating 3D Talking Head with Optimal Transportation	Xinmu Wang et.al.	2505.01932	null
2025-05-10	MagicPortrait: Temporally Consistent Face Reenactment with 3D Geometric Guidance	Mengting Wei et.al.	2504.21497	null
2025-05-08	OXSeg: Multidimensional attention UNet-based lip segmentation using semi-supervised lip contours	Hanie Moghaddasi et.al.	2505.05531	null
2025-05-03	GenSync: A Generalized Talking Head Framework for Audio-driven Multi-Subject Lip-Sync using 3D Gaussian Splatting	Anushka Agarwal et.al.	2505.01928	null
2025-05-02	FlowDubber: Movie Dubbing with LLM-based Semantic-aware Learning and Flow Matching based Voice Enhancing	Gaoxiang Cong et.al.	2505.01263	null
2025-05-01	KeySync: A Robust Approach for Leakage-free Lip Synchronization in High Resolution	Antoni Bigata et.al.	2505.00497	null
2025-04-29	IM-Portrait: Learning 3D-aware Video Diffusion for Photorealistic Talking Heads from Monocular Videos	Yuan Li et.al.	2504.19165	null
2025-04-27	Generative AI for Character Animation: A Comprehensive Survey of Techniques, Applications, and Future Directions	Mohammad Mahdi Abootorabi et.al.	2504.19056	null
2025-04-26	Audio-Driven Talking Face Video Generation with Joint Uncertainty Learning	Yifan Xie et.al.	2504.18810	null
2025-04-14	Controllable Expressive 3D Facial Animation via Diffusion in a Unified Multimodal Space	Kangwei Liu et.al.	2506.10007	null
2025-04-14	SpinMeRound: Consistent Multi-View Identity Generation Using Diffusion Models	Stathis Galanakis et.al.	2504.10716	null
2025-04-10	ChildlikeSHAPES: Semantic Hierarchical Region Parsing for Animating Figure Drawings	Astitva Srivastava et.al.	2504.08022	null
2025-04-08	VideoSPatS: Video SPatiotemporal Splines for Disentangled Occlusion, Appearance and Motion Modeling and Editing	Juan Luis Gonzalez Bello et.al.	2504.07146	null
2025-04-08	SE4Lip: Speech-Lip Encoder for Talking Head Synthesis to Solve Phoneme-Viseme Alignment Ambiguity	Yihuan Huang et.al.	2504.05803	null
2025-04-08	Exploiting Temporal Audio-Visual Correlation Embedding for Audio-Driven One-Shot Talking Head Animation	Zhihua Xu et.al.	2504.05746	null
2025-04-08	Contrastive Decoupled Representation Learning and Regularization for Speech-Preserving Facial Expression Manipulation	Tianshui Chen et.al.	2504.05672	null
2025-04-07	Audio-visual Controlled Video Diffusion with Masked Selective State Spaces Modeling for Natural Talking Head Generation	Fa-Ting Hong et.al.	2504.02542	null
2025-04-06	FluentLip: A Phonemes-Based Two-stage Approach for Audio-Driven Lip Synthesis with Optical Flow Consistency	Shiyan Liu et.al.	2504.04427	null
2025-04-04	A Human Digital Twin Architecture for Knowledge-based Interactions and Context-Aware Conversations	Abdul Mannan Mohammed et.al.	2504.03147	null
2025-04-03	VoiceCraft-Dub: Automated Video Dubbing with Neural Codec Language Models	Kim Sung-Bin et.al.	2504.02386	null
2025-04-02	Detecting Lip-Syncing Deepfakes: Vision Temporal Transformer for Analyzing Mouth Inconsistencies	Soumyya Kanti Datta et.al.	2504.01470	link
2025-04-02	EmoHead: Emotional Talking Head via Manipulating Semantic Expression Parameters	Xuli Shen et.al.	2503.19416	null
2025-04-01	Monocular and Generalizable Gaussian Talking Head Animation	Shengjie Gong et.al.	2504.00665	null
2025-03-31	Perceptually Accurate 3D Talking Head Generation: New Definitions, Speech-Mesh Representation, and Evaluation Metrics	Lee Chae-Yeon et.al.	2503.20308	null
2025-03-30	MoCha: Towards Movie-Grade Talking Character Synthesis	Cong Wei et.al.	2503.23307	null
2025-03-29	STSA: Spatial-Temporal Semantic Alignment for Visual Dubbing	Zijun Ding et.al.	2503.23039	link
2025-03-28	Audio-Plane: Audio Factorization Plane Gaussian Splatting for Real-Time Talking Head Synthesis	Shuai Shen et.al.	2503.22605	null
2025-03-28	Follow Your Motion: A Generic Temporal Consistency Portrait Editing Framework with Trajectory Guidance	Haijie Yang et.al.	2503.22225	null
2025-03-27	ChatAnyone: Stylized Real-time Portrait Video Generation with Hierarchical Motion Diffusion Model	Jinwei Qi et.al.	2503.21144	null
2025-03-27	DAWN: Dynamic Frame Avatar with Non-autoregressive Diffusion Framework for Talking Head Video Generation	Hanbo Cheng et.al.	2410.13726	null
2025-03-26	Dual Audio-Centric Modality Coupling for Talking Head Generation	Ao Fu et.al.	2503.22728	null
2025-03-25	AudCast: Audio-Driven Human Video Generation by Cascaded Diffusion Transformers	Jiazhi Guan et.al.	2503.19824	null
2025-03-25	MVPortrait: Text-Guided Motion and Emotion Control for Multi-view Vivid Portrait Animation	Yukang Lin et.al.	2503.19383	null
2025-03-25	HunyuanPortrait: Implicit Condition Control for Enhanced Portrait Animation	Zunnan Xu et.al.	2503.18860	null
2025-03-25	Re-HOLD: Video Hand Object Interaction Reenactment via adaptive Layout-instructed Diffusion Model	Yingying Fan et.al.	2503.16942	null
2025-03-24	DisentTalk: Cross-lingual Talking Face Generation via Semantic Disentangled Diffusion Model	Kangwei Liu et.al.	2503.19001	null
2025-03-24	Teller: Real-Time Streaming Audio-Driven Portrait Animation with Autoregressive Motion Generation	Dingcheng Zhen et.al.	2503.18429	null
2025-03-23	DiffusionTalker: Efficient and Compact Speech-Driven 3D Talking Head via Personalizer-Guided Distillation	Peng Chen et.al.	2503.18159	link
2025-03-21	TaoAvatar: Real-Time Lifelike Full-Body Talking Avatars for Augmented Reality via 3D Gaussian Splatting	Jianchuan Chen et.al.	2503.17032	null
2025-03-21	From Faces to Voices: Learning Hierarchical Representations for High-quality Video-to-Speech	Ji-Hoon Kim et.al.	2503.16956	null
2025-03-20	UniSync: A Unified Framework for Audio-Visual Synchronization	Tao Feng et.al.	2503.16357	null
2025-03-20	PC-Talk: Precise Facial Animation Control for Audio-Driven Talking Face Generation	Baiqin Wang et.al.	2503.14295	null
2025-03-19	DiffPortrait360: Consistent Portrait Diffusion for 360 View Synthesis	Yuming Gu et.al.	2503.15667	link
2025-03-17	SyncDiff: Diffusion-based Talking Head Synthesis with Bottlenecked Temporal Visual Prior for Improved Synchronization	Xulin Fan et.al.	2503.13371	null
2025-03-17	Unlock Pose Diversity: Accurate and Efficient Implicit Keypoint-based Spatiotemporal Diffusion for Audio-driven Talking Portrait	Chaolong Yang et.al.	2503.12963	link
2025-03-16	Versatile Multimodal Controls for Whole-Body Talking Human Animation	Zheng Qin et.al.	2503.08714	null
2025-03-14	Cafe-Talk: Generating 3D Talking Face Animation with Multimodal Coarse- and Fine-grained Control	Hejia Chen et.al.	2503.14517	null
2025-03-14	EmoDiffusion: Enhancing Emotional 3D Facial Animation with Latent Diffusion Models	Yixuan Zhang et.al.	2503.11028	null
2025-03-12	StyleSpeaker: Audio-Enhanced Fine-Grained Style Modeling for Speech-Driven 3D Facial Animation	An Yang et.al.	2503.09852	null
2025-03-12	Bidirectional Learned Facial Animation Codec for Low Bitrate Talking Head Videos	Riku Takahashi et.al.	2503.09787	null
2025-03-09	Removing Averaging: Personalized Lip-Sync Driven Characters Based on Identity Adapter	Yanyu Zhu et.al.	2503.06397	null
2025-03-07	MagicInfinite: Generating Infinite Talking Videos with Your Words and Voice	Hongwei Yi et.al.	2503.05978	null
2025-03-06	FREAK: Frequency-modulated High-fidelity and Real-time Audio-driven Talking Portrait Synthesis	Ziqi Ni et.al.	2503.04067	null
2025-03-03	KeyFace: Expressive Audio-Driven Facial Animation for Long Sequences via KeyFrame Interpolation	Antoni Bigata et.al.	2503.01715	null
2025-03-02	FaceShot: Bring Any Character into Life	Junyao Gao et.al.	2503.00740	null
2025-03-01	Towards High-fidelity 3D Talking Avatar with Personalized Dynamic Texture	Xuanchen Li et.al.	2503.00495	null
2025-02-28	Two-Stream Spatial-Temporal Transformer Framework for Person Identification via Natural Conversational Keypoints	Masoumeh Chapariniya et.al.	2502.20803	null
2025-02-28	ARTalk: Speech-Driven 3D Head Animation via Autoregressive Model	Xuangeng Chu et.al.	2502.20323	null
2025-02-27	InsTaG: Learning Personalized 3D Talking Head from Few-Second Video	Jiahe Li et.al.	2502.20387	link
2025-02-27	High-Fidelity Relightable Monocular Portrait Animation with Lighting-Controllable Video Diffusion Model	Mingtao Guo et.al.	2502.19894	link
2025-02-26	FLAP: Fully-controllable Audio-driven Portrait Video Generation through 3D head conditioned diffusion mode	Lingzhou Mu et.al.	2502.19455	null
2025-02-24	Dimitra: Audio-driven Diffusion model for Expressive Talking Head Generation	Baptiste Chopin et.al.	2502.17198	null
2025-02-20	NeRF-3DTalker: Neural Radiance Field with 3D Prior Aided Audio Disentanglement for Talking Head Synthesis	Xiaoxing Liu et.al.	2502.14178	null
2025-02-18	AV-Flow: Transforming Text to Audio-Visual Human-like Interactions	Aggelina Chatziagapi et.al.	2502.13133	null
2025-02-17	SayAnything: Audio-Driven Lip Synchronization with Conditional Video Diffusion	Junxian Ma et.al.	2502.11515	null
2025-02-15	SkyReels-A1: Expressive Portrait Animation in Video Diffusion Transformers	Di Qiu et.al.	2502.10841	link
2025-02-13	Long-Term TalkingFace Generation via Motion-Prior Conditional Diffusion Model	Fei Shen et.al.	2502.09533	null
2025-02-13	VTutor: An Open-Source SDK for Generative AI-Powered Animated Pedagogical Agents with Multi-Media Output	Eason Chen et.al.	2502.04103	null
2025-02-11	Playmate: Flexible Control of Portrait Animation via 3D-Implicit Space Guided Diffusion	Xingpei Ma et.al.	2502.07203	null
2025-02-07	Towards Multimodal Empathetic Response Generation: A Rich Text-Speech-Vision Avatar-based Benchmark	Han Zhang et.al.	2502.04976	null
2025-02-02	EmoTalkingGaussian: Continuous Emotion-conditioned Talking Head Synthesis	Junuk Cha et.al.	2502.00654	null
2025-01-24	SyncAnimation: A Real-Time End-to-End Framework for Audio-Driven Human Pose and Talking Head Animation	Yujian Liu et.al.	2501.14646	null
2025-01-21	A Lightweight and Interpretable Deepfakes Detection Framework	Muhammad Umar Farooq et.al.	2501.11927	null
2025-01-18	EMO2: End-Effector Guided Audio-Driven Avatar Video Generation	Linrui Tian et.al.	2501.10687	null
2025-01-17	TalkingEyes: Pluralistic Speech-Driven 3D Eye Gaze Animation	Yixiang Zhuang et.al.	2501.09921	null
2025-01-15	Joint Learning of Depth and Appearance for Portrait Image Animation	Xinya Ji et.al.	2501.08649	null
2025-01-15	Make-A-Character 2: Animatable 3D Character Generation From a Single Image	Lin Liu et.al.	2501.07870	null
2025-01-09	Towards Dynamic Neural Communication and Speech Neuroprosthesis Based on Viseme Decoding	Ji-Ha Park et.al.	2501.14790	null
2025-01-09	Identity-Preserving Video Dubbing Using Motion Warping	Runzhen Liu et.al.	2501.04586	null
2025-01-09	MoEE: Mixture of Emotion Experts for Audio-Driven Portrait Animation	Huaize Liu et.al.	2501.01808	null
2025-01-07	Generating and Detecting Various Types of Fake Image and Audio Content: A Review of Modern Deep Learning Technologies and Tools	Arash Dehghani et.al.	2501.06227	null
2025-01-07	VideoAnydoor: High-fidelity Video Object Insertion with Precise Motion Control	Yuanpeng Tu et.al.	2501.01427	null
2025-01-06	RDD4D: 4D Attention-Guided Road Damage Detection And Classification	Asma Alkalbani et.al.	2501.02822	link
2025-01-06	Takeaways from Applying LLM Capabilities to Multiple Conversational Avatars in a VR Pilot Study	Mykola Maslych et.al.	2501.00168	null
2025-01-03	JoyGen: Audio-Driven 3D Depth-Aware Talking-Face Video Editing	Qili Wang et.al.	2501.01798	link
2024-12-28	DEGSTalk: Decomposed Per-Embedding Gaussian Fields for Hair-Preserving Talking Face Synthesis	Kaijun Deng et.al.	2412.20148	link
2024-12-26	UniAvatar: Taming Lifelike Audio-Driven Talking Head Generation with Comprehensive Motion and Lighting Control	Wenzhang Sun et.al.	2412.19860	null
2024-12-26	Generating Editable Head Avatars with 3D Gaussian GANs	Guohao Li et.al.	2412.19149	link
2024-12-23	FaceLift: Single Image to 3D Head with View Generation and GS-LRM	Weijie Lyu et.al.	2412.17812	null
2024-12-22	FADA: Fast Diffusion Avatar Synthesis with Mixed-Supervised Multi-CFG Distillation	Tianyun Zhong et.al.	2412.16915	null
2024-12-18	Joint Co-Speech Gesture and Expressive Talking Face Generation using Diffusion with Adapters	Steven Hogue et.al.	2412.14333	link
2024-12-18	GLCF: A Global-Local Multimodal Coherence Analysis Framework for Talking Face Generation Detection	Xiaocan Chen et.al.	2412.13656	null
2024-12-18	Learning to Control an Android Robot Head for Facial Animation	Marcel Heisler et.al.	2412.13641	null
2024-12-18	Real-time One-Step Diffusion-based Expressive Portrait Videos Generation	Hanzhong Guo et.al.	2412.13479	link
2024-12-18	VQTalker: Towards Multilingual Talking Avatars through Facial Motion Tokenization	Tao Liu et.al.	2412.09892	null
2024-12-16	Towards a Universal Synthetic Video Detector: From Face or Background Manipulations to Fully AI-Generated Content	Rohit Kundu et.al.	2412.12278	null
2024-12-13	GoHD: Gaze-oriented and Highly Disentangled Portrait Animation with Rhythmic Poses and Realistic Expression	Ziqi Zhou et.al.	2412.09296	link
2024-12-12	LatentSync: Audio Conditioned Latent Diffusion Models for Lip Sync	Chunyu Li et.al.	2412.09262	link
2024-12-12	EmoDubber: Towards High Quality and Emotion Controllable Movie Dubbing	Gaoxiang Cong et.al.	2412.08988	null
2024-12-12	PointTalk: Audio-Driven Dynamic Lip Point Cloud for 3D Gaussian-based Talking Head Synthesis	Yifan Xie et.al.	2412.08504	null
2024-12-10	PortraitTalk: Towards Customizable One-Shot Audio-to-Talking Face Generation	Fatemeh Nazarieh et.al.	2412.07754	null
2024-12-10	IF-MDM: Implicit Face Motion Diffusion Model for High-Fidelity Realtime Talking Head Generation	Sejong Yang et.al.	2412.04000	null
2024-12-05	MEMO: Memory-Guided Diffusion for Expressive Talking Video Generation	Longtao Zheng et.al.	2412.04448	null
2024-12-05	Hallo3: Highly Dynamic and Realistic Portrait Image Animation with Diffusion Transformer Networks	Jiahao Cui et.al.	2412.00733	link
2024-12-04	SINGER: Vivid Audio-driven Singing Video Generation with Multi-scale Spectral Diffusion Model	Yan Li et.al.	2412.03430	null
2024-12-02	One Shot, One Talk: Whole-body Talking Avatar from a Single Image	Jun Xiang et.al.	2412.01106	null
2024-12-01	Synergizing Motion and Appearance: Multi-Scale Compensatory Codebooks for Talking Head Video Generation	Shuling Zhao et.al.	2412.00719	null
2024-11-29	LokiTalk: Learning Fine-Grained and Generalizable Correspondences to Enhance NeRF-based Talking Head Synthesis	Tianqi Li et.al.	2411.19525	null
2024-11-29	Ditto: Motion-Space Diffusion for Controllable Realtime Talking Head Synthesis	Tianqi Li et.al.	2411.19509	null
2024-11-29	V2SFlow: Video-to-Speech Generation with Speech Decomposition and Rectified Flow	Jeongsoo Choi et.al.	2411.19486	null
2024-11-26	Passive Deepfake Detection Across Multi-modalities: A Comprehensive Survey	Hong-Hanh Nguyen-Le et.al.	2411.17911	null
2024-11-25	Sonic: Shifting Focus to Global Audio Perception in Portrait Animation	Xiaozhong Ji et.al.	2411.16331	null
2024-11-25	ESARM: 3D Emotional Speech-to-Animation via Reward Model from Automatically-Ranked Demonstrations	Xulong Zhang et.al.	2411.13089	null
2024-11-24	LetsTalk: Latent Diffusion Transformer for Talking Video Synthesis	Haojie Zhang et.al.	2411.16748	null
2024-11-23	EmotiveTalk: Expressive Talking Head Generation through Audio Information Decoupling and Emotional Video Diffusion	Haotian Wang et.al.	2411.16726	null
2024-11-23	ConsistentAvatar: Learning to Diffuse Fully Consistent Talking Head Avatar with Temporal Guidance	Haijie Yang et.al.	2411.15436	null
2024-11-20	Comparative Analysis of Audio Feature Extraction for Real-Time Talking Portrait Synthesis	Pegah Salehi et.al.	2411.13209	link
2024-11-20	JoyVASA: Portrait and Animal Image Animation with Diffusion-Based Audio-Driven Facial Dynamics and Head Motion Generation	Xuyang Cao et.al.	2411.09209	link
2024-11-14	LES-Talker: Fine-Grained Emotion Editing for Talking Head Generation in Linear Emotion Space	Guanwen Feng et.al.	2411.09268	null
2024-11-06	Large Generative Model-assisted Talking-face Semantic Communication System	Feibo Jiang et.al.	2411.03876	null
2024-10-31	Stereo-Talker: Audio-driven 3D Human Synthesis with Prior-Guided Mixture-of-Experts	Xiang Deng et.al.	2410.23836	null
2024-10-29	Multimodal Semantic Communication for Generative Audio-Driven Video Conferencing	Haonan Tong et.al.	2410.22112	null
2024-10-24	Real-time 3D-aware Portrait Video Relighting	Ziqi Cai et.al.	2410.18355	link
2024-10-21	Joker: Conditional 3D Head Synthesis with Extreme Facial Expressions	Malte Prinzler et.al.	2410.16395	null
2024-10-18	Takin-ADA: Emotion Controllable Audio-Driven Animation with Canonical and Landmark Loss Optimization	Bin Lin et.al.	2410.14283	null
2024-10-16	MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting	Yue Zhang et.al.	2410.10122	link
2024-10-15	Titanic Calling: Low Bandwidth Video Conference from the Titanic Wreck	Fevziye Irem Eyiokur et.al.	2410.11434	null
2024-10-15	MimicTalk: Mimicking a personalized and expressive 3D talking face in minutes	Zhenhui Ye et.al.	2410.06734	null
2024-10-14	Character-aware audio-visual subtitling in context	Jaesung Huh et.al.	2410.11068	null
2024-10-14	Beyond Fixed Topologies: Unregistered Training and Comprehensive Evaluation Metrics for 3D Talking Heads	Federico Nocentini et.al.	2410.11041	null
2024-10-14	TALK-Act: Enhance Textural-Awareness for 2D Speaking Avatar Reenactment with Diffusion Model	Jiazhi Guan et.al.	2410.10696	null
2024-10-14	Generative Human Video Compression with Multi-granularity Temporal Trajectory Factorization	Shanzhi Yin et.al.	2410.10171	null
2024-10-10	MMHead: Towards Fine-grained Multi-modal 3D Facial Animation	Sijing Wu et.al.	2410.07757	null
2024-10-09	FreeAvatar: Robust 3D Facial Animation Transfer by Learning an Expression Foundation Model	Feng Qiu et.al.	2409.13180	null
2024-10-01	LaDTalk: Latent Denoising for Synthesizing Talking Head Videos with High Frequency Details	Jian Yang et.al.	2410.00990	null
2024-09-29	Learning Frame-Wise Emotion Intensity for Audio-Driven Talking-Head Generation	Jingyi Xu et.al.	2409.19501	null
2024-09-27	Diverse Code Query Learning for Speech-Driven Facial Animation	Chunzhi Gu et.al.	2409.19143	null
2024-09-26	Stable Video Portraits	Mirela Ostrek et.al.	2409.18083	null
2024-09-25	ProbTalk3D: Non-Deterministic Emotion Controllable Speech-Driven 3D Facial Animation Synthesis Using VQ-VAE	Sichun Wu et.al.	2409.07966	link
2024-09-24	FastTalker: Jointly Generating Speech and Conversational Gestures from Text	Zixin Guo et.al.	2409.16404	null
2024-09-23	FaceVid-1K: A Large-Scale High-Quality Multiracial Human Face Video Dataset	Donglin Di et.al.	2410.07151	null
2024-09-23	MIMAFace: Face Animation via Motion-Identity Modulated Appearance Feature Learning	Yue Han et.al.	2409.15179	null
2024-09-18	JEAN: Joint Expression and Audio-guided NeRF-based Talking Face Generation	Sai Tanmay Reddy Chakkera et.al.	2409.12156	null
2024-09-18	GaussianHeads: End-to-End Learning of Drivable Gaussian Head Avatars from Coarse-to-fine Representations	Kartik Teotia et.al.	2409.11951	null
2024-09-17	3DFacePolicy: Speech-Driven 3D Facial Animation with Diffusion Policy	Xuanmeng Sha et.al.	2409.10848	null
2024-09-16	DreamHead: Learning Spatial-Temporal Correspondence via Hierarchical Diffusion for Audio-driven Talking Head Synthesis	Fa-Ting Hong et.al.	2409.10281	null
2024-09-14	StyleTalk++: A Unified Framework for Controlling the Speaking Styles of Talking Heads	Suzhen Wang et.al.	2409.09292	null
2024-09-11	DiffTED: One-shot Audio-driven TED Talk Video Generation with Diffusion-based Co-speech Gestures	Steven Hogue et.al.	2409.07649	null
2024-09-11	EMOdiffhead: Continuously Emotional Control in Talking Head Generation via Diffusion	Jian Zhang et.al.	2409.07255	null
2024-09-09	PersonaTalk: Bring Attention to Your Persona in Visual Dubbing	Longhao Zhang et.al.	2409.05379	null
2024-09-09	KAN-Based Fusion of Dual-Domain for Audio-Driven Facial Landmarks Generation	Hoang-Son Vo-Thanh et.al.	2409.05330	link
2024-09-05	SegTalker: Segmentation-based Talking Face Generation with Mask-guided Local Editing	Lingyu Xiong et.al.	2409.03605	null
2024-09-05	SVP: Style-Enhanced Vivid Portrait Talking Head Diffusion Model	Weipeng Tan et.al.	2409.03270	null
2024-09-04	PoseTalk: Text-and-Audio-based Pose Control and Motion Refinement for One-Shot Talking Head Generation	Jun Ling et.al.	2409.02657	null
2024-09-02	KMTalk: Speech-Driven 3D Facial Animation with Key Motion Embedding	Zhihao Xu et.al.	2409.01113	link
2024-08-28	Micro and macro facial expressions by driven animations in realistic Virtual Humans	Rubens Halbig Montanha et.al.	2408.16110	null
2024-08-27	MegActor-$Σ$: Unlocking Flexible Mixed-Modal Control in Portrait Animation with Diffusion Transformer	Shurong Yang et.al.	2408.14975	null
2024-08-25	TalkLoRA: Low-Rank Adaptation for Speech-Driven Animation	Jack Saunders et.al.	2408.13714	null
2024-08-23	G3FA: Geometry-guided GAN for Face Animation	Alireza Javanmardi et.al.	2408.13049	null
2024-08-21	AutoDirector: Online Auto-scheduling Agents for Multi-sensory Composition	Minheng Ni et.al.	2408.11564	null
2024-08-21	EmoFace: Emotion-Content Disentangled Speech-Driven 3D Talking Face with Mesh Attention	Yihong Lin et.al.	2408.11518	null
2024-08-20	DEGAS: Detailed Expressions on Full-Body Gaussian Avatars	Zhijing Shao et.al.	2408.10588	null
2024-08-18	FD2Talk: Towards Generalized Talking Head Generation with Facial Decoupled Diffusion Model	Ziyu Yao et.al.	2408.09384	null
2024-08-18	Meta-Learning Empowered Meta-Face: Personalized Speaking Style Adaptation for Audio-Driven 3D Talking Face Animation	Xukun Zhou et.al.	2408.09357	null
2024-08-18	S^3D-NeRF: Single-Shot Speech-Driven Neural Radiance Field for High Fidelity Talking Head Synthesis	Dongze Li et.al.	2408.09347	null
2024-08-16	GLDiTalker: Speech-Driven 3D Facial Animation with Graph Latent Diffusion Transformer	Yihong Lin et.al.	2408.01826	null
2024-08-14	Content and Style Aware Audio-Driven Facial Animation	Qingju Liu et.al.	2408.07005	null
2024-08-12	DEEPTalk: Dynamic Emotion Embedding for Probabilistic Speech-Driven 3D Face Animation	Jisoo Kim et.al.	2408.06010	null
2024-08-10	High-fidelity and Lip-synced Talking Face Synthesis via Landmark-based Diffusion Model	Weizhi Zhong et.al.	2408.05416	null
2024-08-10	Style-Preserving Lip Sync via Audio-Aware Style Reference	Weizhi Zhong et.al.	2408.05412	null
2024-08-09	DeepSpeak Dataset v1.0	Sarah Barrington et.al.	2408.05366	null
2024-08-06	ReSyncer: Rewiring Style-based Generator for Unified Audio-Visually Synced Facial Performer	Jiazhi Guan et.al.	2408.03284	null
2024-08-03	Landmark-guided Diffusion Model for High-fidelity and Temporally Coherent Talking Head Generation	Jintao Tan et.al.	2408.01732	null
2024-08-03	JambaTalk: Speech-Driven 3D Talking Head Generation Based on Hybrid Transformer-Mamba Model	Farzaneh Jafari et.al.	2408.01627	null
2024-08-01	UniTalker: Scaling up Audio-Driven 3D Facial Animation through A Unified Model	Xiangyu Fan et.al.	2408.00762	null
2024-08-01	Reenact Anything: Semantic Video Motion Transfer Using Motion-Textual Inversion	Manuel Kansy et.al.	2408.00458	null
2024-08-01	EmoTalk3D: High-Fidelity Free-View Synthesis of Emotional 3D Talking Head	Qianyun He et.al.	2408.00297	null
2024-07-31	Deformable 3D Shape Diffusion Model	Dengsheng Chen et.al.	2407.21428	null
2024-07-26	LinguaLinker: Audio-Driven Portraits Animation with Implicit Facial Control Enhancement	Rui Zhang et.al.	2407.18595	null
2024-07-24	A Comprehensive Review and Taxonomy of Audio-Visual Synchronization Techniques for Realistic Speech Animation	Jose Geraldo Fernandes et.al.	2407.17430	null
2024-07-24	The impact of differences in facial features between real speakers and 3D face models on synthesized lip motions	Rabab Algadhy et.al.	2407.17253	null
2024-07-22	PAV: Personalized Head Avatar from Unstructured Video Collection	Akin Caliskan et.al.	2407.21047	null
2024-07-21	Anchored Diffusion for Video Face Reenactment	Idan Kligvasser et.al.	2407.15153	null
2024-07-20	Text-based Talking Video Editing with Cascaded Conditional Diffusion	Bo Han et.al.	2407.14841	null
2024-07-17	Universal Facial Encoding of Codec Avatars from VR Headsets	Shaojie Bai et.al.	2407.13038	null
2024-07-17	EmoFace: Audio-driven Emotional 3D Face Animation	Chang Liu et.al.	2407.12501	link
2024-07-13	Learning Online Scale Transformation for Talking Head Video Generation	Fa-Ting Hong et.al.	2407.09965	null
2024-07-12	Real Face Video Animation Platform	Xiaokai Chen et.al.	2407.18955	null
2024-07-12	One-Shot Pose-Driving Face Animation Platform	He Feng et.al.	2407.08949	null
2024-07-12	EchoMimic: Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditions	Zhiyuan Chen et.al.	2407.08136	null
2024-07-08	MobilePortrait: Real-Time One-Shot Neural Head Avatars on Mobile Devices	Jianwen Jiang et.al.	2407.05712	null
2024-07-08	Audio-driven High-resolution Seamless Talking Head Video Editing via StyleGAN	Jiacheng Su et.al.	2407.05577	null
2024-07-04	Compressed Skinning for Facial Blendshapes	Ladislav Kavan et.al.	2406.11597	null
2024-07-03	LivePortrait: Efficient Portrait Animation with Stitching and Retargeting Control	Jianzhu Guo et.al.	2407.03168	link
2024-07-01	Enhancing Speech-Driven 3D Facial Animation with Audio-Visual Guidance from Lip Reading Expert	Han EunGi et.al.	2407.01034	null
2024-06-26	RealTalk: Real-time and Realistic Audio-driven Face Generation with 3D Facial Prior-guided Identity Alignment Network	Xiaozhong Ji et.al.	2406.18284	null
2024-06-24	The Effects of Embodiment and Personality Expression on Learning in LLM-based Educational Agents	Sinan Sonlu et.al.	2407.10993	null
2024-06-21	EmpathyEar: An Open-source Avatar Multimodal Empathetic Chatbot	Hao Fei et.al.	2406.15177	link
2024-06-20	MultiTalk: Enhancing 3D Talking Head Generation Across Languages with Multilingual Video Dataset	Kim Sung-Bin et.al.	2406.14272	null
2024-06-19	DF40: Toward Next-Generation Deepfake Detection	Zhiyuan Yan et.al.	2406.13495	null
2024-06-19	AniFaceDiff: High-Fidelity Face Reenactment via Facial Parametric Conditioned Diffusion Models	Ken Chen et.al.	2406.13272	null
2024-06-18	RITA: A Real-time Interactive Talking Avatars Framework	Wuxinlin Cheng et.al.	2406.13093	null
2024-06-18	A Comprehensive Taxonomy and Analysis of Talking Head Synthesis: Techniques for Portrait Generation, Driving Mechanisms, and Editing	Ming Meng et.al.	2406.10553	null
2024-06-17	NLDF: Neural Light Dynamic Fields for Efficient 3D Talking Head Generation	Niu Guanchen et.al.	2406.11259	null
2024-06-17	Make Your Actor Talk: Generalizable and High-Fidelity Lip Sync with Motion and Appearance Disentanglement	Runyi Yu et.al.	2406.08096	null
2024-06-16	Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation	Mingwang Xu et.al.	2406.08801	null
2024-06-14	DNPM: A Neural Parametric Model for the Synthesis of Facial Geometric Details	Haitao Cao et.al.	2405.19688	null
2024-06-13	Talking Heads: Understanding Inter-layer Communication in Transformer Language Models	Jack Merullo et.al.	2406.09519	null
2024-06-13	DubWise: Video-Guided Speech Duration Control in Multimodal LLM-based Text-to-Speech for Dubbing	Neha Sahipjohn et.al.	2406.08802	null
2024-06-12	Emotional Conversation: Empowering Talking Faces with Cohesive Expression, Gaze and Pose Generation	Jiadong Liang et.al.	2406.07895	null
2024-06-07	Follow-Your-Emoji: Fine-Controllable and Expressive Freestyle Portrait Animation	Yue Ma et.al.	2406.01900	null
2024-06-05	Controllable Talking Face Generation by Implicit Facial Keypoints Editing	Dong Zhao et.al.	2406.02880	null
2024-05-31	MunchSonic: Tracking Fine-grained Dietary Actions through Active Acoustic Sensing on Eyeglasses	Saif Mahmud et.al.	2405.21004	null
2024-05-31	MegActor: Harness the Power of Raw Video for Vivid Portrait Animation	Shurong Yang et.al.	2405.20851	link
2024-05-30	Audio2Rig: Artist-oriented deep learning tool for facial animation	Bastien Arcelin et.al.	2405.20412	null
2024-05-28	OpFlowTalker: Realistic and Natural Talking Face Generation via Optical Flow Guidance	Shuheng Ge et.al.	2405.14709	null
2024-05-24	InstructAvatar: Text-Guided Emotion and Motion Control for Avatar Generation	Yuchi Wang et.al.	2405.15758	link
2024-05-22	Metabook: An Automatically Generated Augmented Reality Storybook Interaction System to Improve Children’s Engagement in Storytelling	Yibo Wang et.al.	2405.13701	null
2024-05-21	Face Adapter for Pre-Trained Diffusion Models with Fine-Grained ID and Attribute Control	Yue Han et.al.	2405.12970	null
2024-05-16	Faces that Speak: Jointly Synthesising Talking Face and Speech from Text	Youngjoon Jang et.al.	2405.10272	null
2024-05-14	PolyGlotFake: A Novel Multilingual and Multimodal DeepFake Dataset	Yang Hou et.al.	2405.08838	link
2024-05-12	Listen, Disentangle, and Control: Controllable Speech-Driven Talking Head Generation	Changpeng Cai et.al.	2405.07257	null
2024-05-10	NeRFFaceSpeech: One-shot Audio-driven 3D Talking Head Synthesis via Generative Prior	Gihoon Kim et.al.	2405.05749	null
2024-05-09	SwapTalk: Audio-Driven Talking Face Generation with One-Shot Customization in Latent Space	Zeren Zhang et.al.	2405.05636	null
2024-05-08	Audio-Visual Target Speaker Extraction with Reverse Selective Auditory Attention	Ruijie Tao et.al.	2404.18501	null
2024-05-07	Audio-Visual Speech Representation Expert for Enhanced Talking Face Video Generation and Evaluation	Dogucan Yaman et.al.	2405.04327	null
2024-05-06	AniTalker: Animate Vivid and Diverse Talking Faces through Identity-Decoupled Facial Motion Encoding	Tao Liu et.al.	2405.03121	link
2024-04-29	EMOPortraits: Emotion-enhanced Multimodal One-shot Head Avatars	Nikita Drobyshev et.al.	2404.19110	null
2024-04-29	GSTalker: Real-time Audio-Driven Talking Face Generation via Deformable Gaussian Splatting	Bo Chen et.al.	2404.19040	null
2024-04-29	Embedded Representation Learning Network for Animating Styled Video Portrait	Tianyong Wang et.al.	2404.19038	null
2024-04-29	CSTalk: Correlation Supervised Speech-driven 3D Emotional Facial Animation Generation	Xiangyu Liang et.al.	2404.18604	null
2024-04-28	GaussianTalker: Speaker-specific Talking Head Synthesis via 3D Gaussian Splatting	Hongyun Yu et.al.	2404.14037	null
2024-04-25	GaussianTalker: Real-Time High-Fidelity Talking Head Synthesis with Audio-Driven 3D Gaussian Splatting	Kyusun Cho et.al.	2404.16012	link
2024-04-23	TalkingGaussian: Structure-Persistent 3D Talking Head Synthesis via Gaussian Splatting	Jiahe Li et.al.	2404.15264	null
2024-04-22	Learn2Talk: 3D Talking Face Learns from 2D Talking Face	Yixiang Zhuang et.al.	2404.12888	null
2024-04-16	VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time	Sicheng Xu et.al.	2404.10667	null
2024-04-15	FSRT: Facial Scene Representation Transformer for Face Reenactment from Factorized Appearance, Head-pose, and Facial Expression Features	Andre Rochow et.al.	2404.09736	null
2024-04-13	THQA: A Perceptual Quality Assessment Database for Talking Heads	Yingjie Zhou et.al.	2404.09003	link
2024-04-11	EFHQ: Multi-purpose ExtremePose-Face-HQ dataset	Trung Tuan Dao et.al.	2312.17205	null
2024-04-09	Deepfake Generation and Detection: A Benchmark and Survey	Gan Pei et.al.	2403.17881	link
2024-04-08	SphereHead: Stable 3D Full-head Synthesis with Spherical Tri-plane Representation	Heyuan Li et.al.	2404.05680	null
2024-04-07	GvT: A Graph-based Vision Transformer with Talking-Heads Utilizing Sparsity, Trained from Scratch on Small Datasets	Dongjing Shan et.al.	2404.04924	null
2024-04-07	Towards a Simultaneous and Granular Identity-Expression Control in Personalized Face Generation	Renshuai Liu et.al.	2401.01207	null
2024-04-03	MI-NeRF: Learning a Single Face NeRF from Multiple Identities	Aggelina Chatziagapi et.al.	2403.19920	null
2024-04-02	EDTalk: Efficient Disentanglement for Emotional Talking Head Synthesis	Shuai Tan et.al.	2404.01647	null
2024-04-02	Learning to Generate Conditional Tri-plane for 3D-aware Expression Controllable Portrait Animation	Taekyung Ki et.al.	2404.00636	null
2024-04-01	FaceChain-ImagineID: Freely Crafting High-Fidelity Diverse Talking Faces from Disentangled Audio	Chao Xu et.al.	2403.01901	link
2024-04-01	Exploring Phonetic Context-Aware Lip-Sync For Talking Face Generation	Se Jin Park et.al.	2305.19556	null
2024-03-29	Talk3D: High-Fidelity Talking Portrait Synthesis via Personalized 3D Generative Prior	Jaehoon Ko et.al.	2403.20153	link
2024-03-28	MoDiTalker: Motion-Disentangled Diffusion Model for High-Fidelity Talking Head Generation	Seyeon Kim et.al.	2403.19144	link
2024-03-28	GOTCHA: Real-Time Video Deepfake Detection via Challenge-Response	Govind Mittal et.al.	2210.06186	link
2024-03-27	X-Portrait: Expressive Portrait Animation with Hierarchical Motion Attention	You Xie et.al.	2403.15931	null
2024-03-26	Superior and Pragmatic Talking Face Generation with Teacher-Student Framework	Chao Liang et.al.	2403.17883	null
2024-03-26	AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation	Huawei Wei et.al.	2403.17694	link
2024-03-26	Real3D-Portrait: One-shot Realistic 3D Talking Portrait Synthesis	Zhenhui Ye et.al.	2401.08503	null
2024-03-25	DiffusionAct: Controllable Diffusion Autoencoder for One-shot Face Reenactment	Stella Bounareli et.al.	2403.17217	null
2024-03-25	AnimateMe: 4D Facial Expressions via Diffusion Models	Dimitrios Gerogiannis et.al.	2403.17213	null
2024-03-25	Make-Your-Anchor: A Diffusion-based 2D Avatar Generation Framework	Ziyao Huang et.al.	2403.16510	link
2024-03-23	Adaptive Super Resolution For One-Shot Talking-Head Generation	Luchuan Song et.al.	2403.15944	link
2024-03-22	LeGO: Leveraging a Surface Deformation Network for Animatable Stylized Face Generation with One Example	Soyeon Yoon et.al.	2403.15227	link
2024-03-22	Virbo: Multimodal Multilingual Avatar Video Generation in Digital Marketing	Juan Zhang et.al.	2403.11700	null
2024-03-19	EmoVOCA: Speech-Driven Emotional 3D Talking Heads	Federico Nocentini et.al.	2403.12886	null
2024-03-19	ScanTalk: 3D Talking Heads from Unregistered Scans	Federico Nocentini et.al.	2403.10942	null
2024-03-15	StyleTalker: One-shot Style-based Audio-driven Talking Head Video Generation	Dongchan Min et.al.	2208.10922	null
2024-03-14	GAIA: Zero-shot Talking Avatar Generation	Tianyu He et.al.	2311.15230	null
2024-03-13	Say Anything with Any Style	Shuai Tan et.al.	2403.06363	null
2024-03-12	FlowVQTalker: High-Quality Emotional Talking Face Generation through Normalizing Flow and Quantization	Shuai Tan et.al.	2403.06375	null
2024-03-12	Style2Talker: High-Resolution Talking Head Generation with Emotion Style and Art Style	Shuai Tan et.al.	2403.06365	null
2024-03-11	A Comparative Study of Perceptual Quality Metrics for Audio-driven Talking Head Videos	Weixia Zhang et.al.	2403.06421	link
2024-03-05	Memories are One-to-Many Mapping Alleviators in Talking Face Generation	Anni Tang et.al.	2212.05005	null
2024-03-02	G4G: A Generic Framework for High Fidelity Talking Face Generation with Fine-grained Intra-modal Alignment	Juan Zhang et.al.	2402.18122	null
2024-03-01	DAE-Talker: High Fidelity Speech-Driven Talking Face Generation with Diffusion Autoencoder	Chenpeng Du et.al.	2303.17550	null
2024-02-29	Learning a Generalized Physical Face Model From Data	Lingchen Yang et.al.	2402.19477	null
2024-02-28	Context-aware Talking Face Video Generation	Meidai Xuanyuan et.al.	2402.18092	null
2024-02-27	EMO: Emote Portrait Alive – Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions	Linrui Tian et.al.	2402.17485	null
2024-02-27	Learning Dynamic Tetrahedra for High-Quality Talking Head Synthesis	Zicheng Zhang et.al.	2402.17364	link
2024-02-26	Resolution-Agnostic Neural Compression for High-Fidelity Portrait Video Conferencing via Implicit Radiance Fields	Yifei Li et.al.	2402.16599	null
2024-02-25	AVI-Talking: Learning Audio-Visual Instructions for Expressive 3D Talking Face Generation	Yasheng Sun et.al.	2402.16124	null
2024-02-21	Bring Your Own Character: A Holistic Solution for Automatic Facial Animation Generation of Customized Characters	Zechen Bai et.al.	2402.13724	link
2024-02-21	StyleDubber: Towards Multi-Scale Style Learning for Movie Dubbing	Gaoxiang Cong et.al.	2402.12636	null
2024-02-12	StyleLipSync: Style-based Personalized Lip-sync Video Generation	Taekyung Ki et.al.	2305.00521	null
2024-02-08	DiffSpeaker: Speech-Driven 3D Facial Animation with Diffusion Transformer	Zhiyuan Ma et.al.	2402.05712	link
2024-02-05	One-shot Neural Face Reenactment via Finding Directions in GAN’s Latent Space	Stella Bounareli et.al.	2402.03553	null
2024-02-02	EmoSpeaker: One-shot Fine-grained Emotion-Controlled Talking Face Generation	Guanwen Feng et.al.	2402.01422	null
2024-01-31	MM-TTS: Multi-modal Prompt based Style Transfer for Expressive Text-to-Speech Synthesis	Wenhao Guan et.al.	2312.10687	null
2024-01-30	Media2Face: Co-speech Facial Animation Generation With Multi-Modality Guidance	Qingcheng Zhao et.al.	2401.15687	null
2024-01-28	Lips Are Lying: Spotting the Temporal Inconsistency between Audio and Visual in Lip-Syncing DeepFakes	Weifeng Liu et.al.	2401.15668	link
2024-01-27	An Implicit Physical Face Model Driven by Expression and Style	Lingchen Yang et.al.	2401.15414	null
2024-01-26	Implicit Neural Representation for Physics-driven Actuated Soft Bodies	Lingchen Yang et.al.	2401.14861	null
2024-01-25	SAiD: Speech-driven Blendshape Facial Animation with Diffusion	Inkyu Park et.al.	2401.08655	link
2024-01-23	NeRF-AD: Neural Radiance Field with Attention-based Disentanglement for Talking Face Synthesis	Chongke Bi et.al.	2401.12568	null
2024-01-19	Fast Registration of Photorealistic Avatars for VR Facial Animation	Chaitanya Patel et.al.	2401.11002	null
2024-01-18	Exposing Lip-syncing Deepfakes from Mouth Inconsistencies	Soumyya Kanti Datta et.al.	2401.10113	null
2024-01-18	Text-driven Talking Face Synthesis by Reprogramming Audio-driven Models	Jeongsoo Choi et.al.	2306.16003	null
2024-01-16	EmoTalker: Emotionally Editable Talking Face Generation via Diffusion Model	Bingyuan Zhang et.al.	2401.08049	null
2024-01-12	DiffDub: Person-generic Visual Dubbing Using Inpainting Renderer with Diffusion Auto-encoder	Tao Liu et.al.	2311.01811	null
2024-01-11	Dubbing for Everyone: Data-Efficient Visual Dubbing using Neural Rendering Priors	Jack Saunders et.al.	2401.06126	null
2024-01-11	Jump Cut Smoothing for Talking Heads	Xiaojuan Wang et.al.	2401.04718	null
2024-01-08	AdaMesh: Personalized Facial Expressions and Head Poses for Adaptive Speech-Driven 3D Facial Animation	Liyang Chen et.al.	2310.07236	null
2024-01-07	Freetalker: Controllable Speech and Text-Driven Gesture Generation Based on Diffusion Models for Enhanced Speaker Naturalness	Sicheng Yang et.al.	2401.03476	null
2024-01-04	Expressive Speech-driven Facial Animation with controllable emotions	Yutong Chen et.al.	2301.02008	link
2023-12-23	TransFace: Unit-Based Audio-Visual Speech Synthesizer for Talking Head Translation	Xize Cheng et.al.	2312.15197	null
2023-12-22	DREAM-Talk: Diffusion-based Realistic Emotional Audio-driven Method for Single Image Talking Face Generation	Chenxu Zhang et.al.	2312.13578	null
2023-12-20	FAAC: Facial Animation Generation with Anchor Frame and Conditional Control for Superior Fidelity and Editability	Linze Li et.al.	2312.03775	null
2023-12-19	Learning Dense Correspondence for NeRF-Based Face Reenactment	Songlin Yang et.al.	2312.10422	null
2023-12-19	Gaussian3Diff: 3D Gaussian Diffusion for 3D Full Head Synthesis and Editing	Yushi Lan et.al.	2312.03763	null
2023-12-18	VectorTalker: SVG Talking Face Generation with Progressive Vectorisation	Hao Hu et.al.	2312.11568	null
2023-12-18	AE-NeRF: Audio Enhanced Neural Radiance Field for Few Shot Talking Head Synthesis	Dongze Li et.al.	2312.10921	null
2023-12-18	Mimic: Speaking Style Disentanglement for Speech-Driven 3D Facial Animation	Hui Fu et.al.	2312.10877	null
2023-12-15	DreamTalk: When Expressive Talking Head Generation Meets Diffusion Probabilistic Models	Yifeng Ma et.al.	2312.09767	null
2023-12-15	Attention-Based VR Facial Animation with Visual Mouth Camera Guidance for Immersive Telepresence Avatars	Andre Rochow et.al.	2312.09750	null
2023-12-13	uTalk: Bridging the Gap Between Humans and AI	Hussam Azzuni et.al.	2310.02739	null
2023-12-13	MMFace4D: A Large-Scale Multi-Modal 4D Face Dataset for Audio-Driven 3D Face Animation	Haozhe Wu et.al.	2303.09797	null
2023-12-12	GMTalker: Gaussian Mixture based Emotional talking video Portraits	Yibo Xia et.al.	2312.07669	null
2023-12-12	GSmoothFace: Generalized Smooth Talking Face Generation via Fine Grained 3D Face Guidance	Haiming Zhang et.al.	2312.07385	null
2023-12-11	Neural Text to Articulate Talk: Deep Text to Audiovisual Speech Synthesis achieving both Auditory and Photo-realism	Georgios Milis et.al.	2312.06613	link
2023-12-11	Study of Non-Verbal Behavior in Conversational Agents	Camila Vicari Maccari et.al.	2312.06530	null
2023-12-11	DiT-Head: High-Resolution Talking Head Synthesis using Diffusion Transformers	Aaron Mir et.al.	2312.06400	null
2023-12-11	Audio-driven Talking Face Generation by Overcoming Unintended Information Flow	Dogucan Yaman et.al.	2307.09368	null
2023-12-10	DaGAN++: Depth-Aware Generative Adversarial Network for Talking Head Video Generation	Fa-Ting Hong et.al.	2305.06225	link
2023-12-09	R2-Talker: Realistic Real-Time Talking Head Synthesis with Hash Grid Landmarks Encoding and Progressive Multilayer Conditioning	Zhiling Ye et.al.	2312.05572	null
2023-12-09	FT2TF: First-Person Statement Text-To-Talking Face Generation	Xingjian Diao et.al.	2312.05430	null
2023-12-08	SingingHead: A Large-scale 4D Dataset for Singing Head Animation	Sijing Wu et.al.	2312.04369	null
2023-12-07	VividTalk: One-Shot Audio-Driven Talking Head Generation Based on 3D Hybrid Prior	Xusen Sun et.al.	2312.01841	null
2023-12-05	PMMTalk: Speech-Driven 3D Facial Animation from Complementary Pseudo Multi-modal Features	Tianshun Han et.al.	2312.02781	null
2023-12-05	MyPortrait: Morphable Prior-Guided Personalized Portrait Generation	Bo Ding et.al.	2312.02703	null
2023-12-02	DiffusionTalker: Personalization and Acceleration for Speech-Driven 3D Face Diffuser	Peng Chen et.al.	2311.16565	null
2023-12-01	3DiFACE: Diffusion-based Speech-driven 3D Facial Animation and Editing	Balamurugan Thambiraja et.al.	2312.00870	null
2023-11-30	Learning One-Shot 4D Head Avatar Synthesis using Synthetic Data	Yu Deng et.al.	2311.18729	null
2023-11-30	Talking Head(?) Anime from a Single Image 4: Improved Model and Its Distillation	Pramook Khungurn et.al.	2311.17409	null
2023-11-29	SyncTalk: The Devil is in the Synchronization for Talking Head Synthesis	Ziqiao Peng et.al.	2311.17590	link
2023-11-28	THInImg: Cross-modal Steganography for Presenting Talking Heads in Images	Lin Zhao et.al.	2311.17177	null
2023-11-28	BakedAvatar: Baking Neural Fields for Real-Time Head Avatar Synthesis	Hao-Bin Duan et.al.	2311.05521	link
2023-11-28	Continuously Controllable Facial Expression Editing in Talking Face Videos	Zhiyao Sun et.al.	2209.08289	null
2023-11-20	MemoryCompanion: A Smart Healthcare Solution to Empower Efficient Alzheimer’s Care Via Unleashing Generative AI	Lifei Zheng et.al.	2311.14730	null
2023-11-15	CP-EB: Talking Face Generation with Controllable Pose and Eye Blinking Embedding	Jianzong Wang et.al.	2311.08673	null
2023-11-13	DualTalker: A Cross-Modal Dual Learning Approach for Speech-Driven 3D Facial Animation	Guinan Su et.al.	2311.04766	null
2023-11-12	ChatAnything: Facetime Chat with LLM-Enhanced Personas	Yilin Zhao et.al.	2311.06772	null
2023-11-08	Synthetic Speaking Children – Why We Need Them and How to Make Them	Muhammad Ali Farooq et.al.	2311.06307	null
2023-11-06	RADIO: Reference-Agnostic Dubbing Video Synthesis	Dongyeun Lee et.al.	2309.01950	null
2023-11-05	3D-Aware Talking-Head Video Motion Transfer	Haomiao Ni et.al.	2311.02549	null
2023-11-03	Learning Separable Hidden Unit Contributions for Speaker-Adaptive Lip-Reading	Songtao Luo et.al.	2310.05058	link
2023-11-02	LaughTalk: Expressive 3D Talking Head Generation with Laughter	Kim Sung-Bin et.al.	2311.00994	null
2023-11-02	High-Fidelity and Freely Controllable Talking Head Video Generation	Yue Gao et.al.	2304.10168	null
2023-10-31	Breathing Life into Faces: Speech-driven 3D Facial Animation with Natural Head Pose and Detailed Shape	Wei Zhao et.al.	2310.20240	null
2023-10-29	On the Vulnerability of DeepFake Detectors to Attacks Generated by Denoising Diffusion Models	Marija Ivanovska et.al.	2307.05397	null
2023-10-25	Personalized Speech-driven Expressive 3D Facial Animation Synthesis with Style Control	Elif Bozkurt et.al.	2310.17011	null
2023-10-23	The Self 2.0: How AI-Enhanced Self-Clones Transform Self-Perception and Improve Presentation Skills	Qingxiao Zheng et.al.	2310.15112	null
2023-10-19	Gemino: Practical and Robust Neural Compression for Video Conferencing	Vibhaalakshmi Sivaraman et.al.	2209.10507	null
2023-10-17	CorrTalk: Correlation Between Hierarchical Speech and Facial Activity Variances for 3D Animation	Zhaojie Chu et.al.	2310.11295	null
2023-10-15	HyperLips: Hyper Control Lips with High Resolution Decoder for Talking Face Generation	Yaosen Chen et.al.	2310.05720	link
2023-10-12	CleftGAN: Adapting A Style-Based Generative Adversarial Network To Create Images Depicting Cleft Lip Deformity	Abdullah Hayajneh et.al.	2310.07969	link
2023-10-12	Efficient Emotional Adaptation for Audio-Driven Talking-Head Generation	Yuan Gan et.al.	2309.04946	link
2023-10-08	GestSync: Determining who is speaking without a talking head	Sindhu B Hegde et.al.	2310.05304	link
2023-09-30	DiffPoseTalk: Speech-Driven Stylistic 3D Facial Animation and Head Pose Generation via Diffusion Models	Zhiyao Sun et.al.	2310.00434	null
2023-09-28	OSM-Net: One-to-Many One-shot Talking Head Generation with Spontaneous Head Motions	Jin Liu et.al.	2309.16148	null
2023-09-26	Emotional Speech-Driven Animation with Content-Emotion Disentanglement	Radek Daněček et.al.	2306.08990	null
2023-09-20	FaceDiffuser: Speech-Driven 3D Facial Animation Synthesis Using Diffusion	Stefan Stan et.al.	2309.11306	link
2023-09-20	Context-Aware Talking-Head Video Editing	Songlin Yang et.al.	2308.00462	null
2023-09-18	That’s What I Said: Fully-Controllable Talking Face Generation	Youngjoon Jang et.al.	2304.03275	null
2023-09-15	Audio-Visual Active Speaker Extraction for Sparsely Overlapped Multi-talker Speech	Junjie Li et.al.	2309.08408	link
2023-09-14	DT-NeRF: Decomposed Triplane-Hash Neural Radiance Fields for High-Fidelity Talking Portrait Synthesis	Yaoyu Su et.al.	2309.07752	null
2023-09-14	DiffTalker: Co-driven audio-image diffusion for talking faces via intermediate landmarks	Zipeng Qi et.al.	2309.07509	null
2023-09-14	HDTR-Net: A Real-Time High-Definition Teeth Restoration Network for Arbitrary Talking Face Generation Methods	Yongyuan Li et.al.	2309.07495	link
2023-09-13	PIAVE: A Pose-Invariant Audio-Visual Speaker Extraction Network	Qinghua Liu et.al.	2309.06723	null
2023-09-12	DF-TransFusion: Multimodal Deepfake Detection via Lip-Audio Cross-Attention and Facial Self-Attention	Aaditya Kharel et.al.	2309.06511	null
2023-09-12	Avatar Fingerprinting for Authorized Use of Synthetic Talking-Head Videos	Ekta Prashnani et.al.	2305.03713	null
2023-09-11	ExpCLIP: Bridging Text and Facial Expressions via Semantic Alignment	Yicheng Zhong et.al.	2308.14448	null
2023-09-10	MaskRenderer: 3D-Infused Multi-Mask Realistic Face Reenactment	Tina Behrouzi et.al.	2309.05095	null
2023-09-09	Speech2Lip: High-fidelity Speech to Lip Generation by Learning from a Short Video	Xiuzhe Wu et.al.	2309.04814	link
2023-09-01	Unsupervised Learning of Style-Aware Facial Animation from Real Acting Performances	Wolfgang Paier et.al.	2306.10006	null
2023-08-30	From Pixels to Portraits: A Comprehensive Survey of Talking Head Generation Techniques and Applications	Shreyank N Gowda et.al.	2308.16041	null
2023-08-30	SelfTalk: A Self-Supervised Commutative Training Diagram to Comprehend 3D Talking Faces	Ziqiao Peng et.al.	2306.10799	link
2023-08-30	Laughing Matters: Introducing Laughing-Face Generation using Diffusion Models	Antoni Bigata Casademunt et.al.	2305.08854	link
2023-08-29	Papeos: Augmenting Research Papers with Talk Videos	Tae Soo Kim et.al.	2308.15224	null
2023-08-25	EmoTalk: Speech-Driven Emotional Disentanglement for 3D Face Animation	Ziqiao Peng et.al.	2303.11089	link
2023-08-24	ToonTalker: Cross-Domain Face Reenactment	Yuan Gong et.al.	2308.12866	null
2023-08-24	Efficient Region-Aware Neural Radiance Fields for High-Fidelity Talking Portrait Synthesis	Jiahe Li et.al.	2307.09323	link
2023-08-23	DF-3DFace: One-to-Many Speech Synchronized 3D Face Animation with Diffusion	Se Jin Park et.al.	2310.05934	null
2023-08-21	Deep Person Generation: A Survey from the Perspective of Face, Pose and Cloth Synthesis	Tong Sha et.al.	2109.02081	null
2023-08-18	Diff2Lip: Audio Conditioned Diffusion Models for Lip-Synchronization	Soumik Mukhopadhyay et.al.	2308.09716	link
2023-08-18	Implicit Identity Representation Conditioned Memory Compensation Network for Talking Head video Generation	Fa-Ting Hong et.al.	2307.09906	link
2023-08-17	A Survey on Deep Multi-modal Learning for Body Language Recognition and Generation	Li Liu et.al.	2308.08849	link
2023-08-16	Instruct-NeuralTalker: Editing Audio-Driven Talking Radiance Fields with Instructions	Yuqi Sun et.al.	2306.10813	null
2023-08-12	Text-to-Video: a Two-stage Framework for Zero-shot Identity-agnostic Talking-head Generation	Zhichao Wang et.al.	2308.06457	link
2023-08-12	DialogueNeRF: Towards Realistic Avatar Face-to-Face Conversation Video Generation	Yichao Yan et.al.	2203.07931	null
2023-08-11	Versatile Face Animator: Driving Arbitrary 3D Facial Avatar in RGBD Space	Haoyu Wang et.al.	2308.06076	link
2023-08-11	VAST: Vivify Your Talking Avatar via Zero-Shot Expressive Facial Style Transfer	Liyang Chen et.al.	2308.04830	null
2023-08-10	Near-realtime Facial Animation by Deep 3D Simulation Super-Resolution	Hyojoon Park et.al.	2305.03216	null
2023-08-02	Ada-TTA: Towards Adaptive High-Quality Text-to-Talking Avatar Synthesis	Zhenhui Ye et.al.	2306.03504	null
2023-07-29	Diffused Heads: Diffusion Models Beat GANs on Talking-Face Generation	Michał Stypułkowski et.al.	2301.03396	null
2023-07-26	Learning Landmarks Motion from Speech for Speaker-Agnostic 3D Talking Heads Generation	Federico Nocentini et.al.	2306.01415	link
2023-07-20	HyperReenact: One-Shot Reenactment via Jointly Learning to Refine and Retarget Faces	Stella Bounareli et.al.	2307.10797	link
2023-07-19	MODA: Mapping-Once Audio-driven Portrait Animation with Dual Attentions	Yunfei Liu et.al.	2307.10008	null
2023-07-19	Hierarchical Semantic Perceptual Listener Head Video Generation: A High-performance Pipeline	Zhigang Chang et.al.	2307.09821	null
2023-07-19	OPHAvatars: One-shot Photo-realistic Head Avatars	Shaoxu Li et.al.	2307.09153	link
2023-07-18	FACTS: Facial Animation Creation using the Transfer of Styles	Jack Saunders et.al.	2307.09480	null
2023-07-09	Predictive Coding For Animation-Based Video Compression	Goluck Konuko et.al.	2307.04187	null
2023-07-08	FTFDNet: Learning to Detect Talking Face Video Manipulation with Tri-Modality Interaction	Ganglai Wang et.al.	2307.03990	null
2023-07-05	Interactive Conversational Head Generation	Mohan Zhou et.al.	2307.02090	null
2023-07-04	A Comprehensive Multi-scale Approach for Speech and Dynamics Synchrony in Talking Head Generation	Louis Airale et.al.	2307.03270	link
2023-07-04	Generating Animatable 3D Cartoon Faces from Single Portraits	Chuanyu Pan et.al.	2307.01468	null
2023-07-03	RobustL2S: Speaker-Specific Lip-to-Speech Synthesis exploiting Self-Supervised Representations	Neha Sahipjohn et.al.	2307.01233	null
2023-06-20	Audio-Driven 3D Facial Animation from In-the-Wild Videos	Liying Lu et.al.	2306.11541	null
2023-06-13	Parametric Implicit Face Representation for Audio-Driven Facial Reenactment	Ricong Huang et.al.	2306.07579	null
2023-06-13	AniFaceDrawing: Anime Portrait Exploration during Your Sketching	Zhengyu Huang et.al.	2306.07476	null
2023-06-12	NPVForensics: Jointing Non-critical Phonemes and Visemes for Deepfake Detection	Yu Chen et.al.	2306.06885	null
2023-06-10	StyleTalk: One-shot Talking Head Generation with Controllable Speaking Styles	Yifeng Ma et.al.	2301.01081	link
2023-06-08	ReliableSwap: Boosting General Face Swapping Via Reliable Supervision	Ge Yuan et.al.	2306.05356	link
2023-06-06	Emotional Talking Head Generation based on Memory-Sharing and Attention-Augmented Networks	Jianrong Wang et.al.	2306.03594	null
2023-06-05	Instruct-Video2Avatar: Video-to-Avatar Generation with Instructions	Shaoxu Li et.al.	2306.02903	link
2023-05-31	High-fidelity Generalized Emotional Talking Face Generation with Multi-modal Emotion Space Learning	Chao Xu et.al.	2305.02572	null
2023-05-23	CPNet: Exploiting CLIP-based Attention Condenser and Probability Map Guidance for High-fidelity Talking Face Generation	Jingning Xu et.al.	2305.13962	null
2023-05-22	RenderMe-360: A Large Digital Asset Library and Benchmarks Towards High-fidelity Head Avatars	Dongwei Pan et.al.	2305.13353	link
2023-05-19	UniFLG: Unified Facial Landmark Generator from Text or Speech	Kentaro Mitsui et.al.	2302.14337	null
2023-05-18	An Android Robot Head as Embodied Conversational Agent	Marcel Heisler et.al.	2305.10945	null
2023-05-18	Audio-Visual Person-of-Interest DeepFake Detection	Davide Cozzolino et.al.	2204.03083	link
2023-05-17	INCLG: Inpainting for Non-Cleft Lip Generation with a Multi-Task Image Processing Network	Shuang Chen et.al.	2305.10589	null
2023-05-17	LPMM: Intuitive Pose Control for Neural Talking-Head Model via Landmark-Parameter Morphable Model	Kwangho Lee et.al.	2305.10456	null
2023-05-15	Identity-Preserving Talking Face Generation with Landmark and Appearance Priors	Weizhi Zhong et.al.	2305.08293	link
2023-05-09	Zero-shot personalized lip-to-speech synthesis with face image based voice control	Zheng-Yan Sheng et.al.	2305.14359	null
2023-05-09	StyleSync: High-Fidelity Generalized and Personalized Lip Sync in Style-based Generator	Jiazhi Guan et.al.	2305.05445	null
2023-05-09	Multimodal-driven Talking Face Generation via a Unified Diffusion-based Generator	Chao Xu et.al.	2305.02594	null
2023-05-01	StyleAvatar: Real-time Photo-realistic Portrait Avatar from a Single Video	Lizhen Wang et.al.	2305.00942	link
2023-05-01	GeneFace++: Generalized and Stable Real-Time Audio-Driven 3D Talking Face Generation	Zhenhui Ye et.al.	2305.00787	null
2023-04-28	A Unified Compression Framework for Efficient Speech-Driven Talking-Face Generation	Bo-Kyeong Kim et.al.	2304.00471	null
2023-04-27	Controllable One-Shot Face Video Synthesis With Semantic Aware Prior	Kangning Liu et.al.	2304.14471	null
2023-04-25	AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head	Rongjie Huang et.al.	2304.12995	link
2023-04-24	VR Facial Animation for Immersive Telepresence Avatars	Andre Rochow et.al.	2304.12051	null
2023-04-21	Implicit Neural Head Synthesis via Controllable Local Deformation Fields	Chuhan Chen et.al.	2304.11113	null
2023-04-20	DiffTalk: Crafting Diffusion Models for Generalized Audio-Driven Portraits Animation	Shuai Shen et.al.	2301.03786	link
2023-04-18	Audio-Driven Talking Face Generation with Diverse yet Realistic Facial Animations	Rongliang Wu et.al.	2304.08945	null
2023-04-17	Autoregressive GAN for Semantic Unconditional Head Motion Generation	Louis Airale et.al.	2211.00987	link
2023-04-11	One-Shot High-Fidelity Talking-Head Synthesis with Deformable Neural Radiance Field	Weichuang Li et.al.	2304.05097	null
2023-04-06	Face Animation with an Attribute-Guided Diffusion Model	Bohan Zeng et.al.	2304.03199	link
2023-04-06	4D Agnostic Real-Time Facial Animation Pipeline for Desktop Scenarios	Wei Chen et.al.	2304.02814	null
2023-04-03	CodeTalker: Speech-Driven 3D Facial Animation with Discrete Motion Prior	Jinbo Xing et.al.	2301.02379	link
2023-04-01	DreamFace: Progressive Generation of Animatable 3D Faces under Text Guidance	Longwen Zhang et.al.	2304.03117	null
2023-04-01	TalkCLIP: Talking Head Generation with Text-Guided Expressive Speaking Styles	Yifeng Ma et.al.	2304.00334	null
2023-03-31	FONT: Flow-guided One-shot Talking Head Generation with Natural Head Motions	Jin Liu et.al.	2303.17789	null
2023-03-29	Seeing What You Said: Talking Face Generation Guided by a Lip Reading Expert	Jiadong Wang et.al.	2303.17480	link
2023-03-27	OmniAvatar: Geometry-Guided Controllable 3D Head Synthesis	Hongyi Xu et.al.	2303.15539	null
2023-03-27	Accurate and Interpretable Solution of the Inverse Rig for Realistic Blendshape Models with Quadratic Corrective Terms	Stevo Racković et.al.	2302.04843	null
2023-03-27	MetaPortrait: Identity-Preserving Talking Head Generation with Fast Personalized Adaptation	Bowen Zhang et.al.	2212.08062	link
2023-03-27	A Majorization-Minimization Based Method for Nonconvex Inverse Rig Problems in Facial Animation: Algorithm Derivation	Stevo Racković et.al.	2205.04289	null
2023-03-26	OTAvatar: One-shot Talking Face Avatar with Controllable Tri-plane Rendering	Zhiyuan Ma et.al.	2303.14662	link
2023-03-26	Emotionally Enhanced Talking Face Generation	Sahil Goyal et.al.	2303.11548	link
2023-03-26	Distributed Solution of the Inverse Rig Problem in Blendshape Facial Animation	Stevo Racković et.al.	2303.06370	null
2023-03-24	Synthesizing Photorealistic Virtual Humans Through Cross-modal Disentanglement	Siddarth Ravichandran et.al.	2209.01320	null
2023-03-23	PanoHead: Geometry-Aware 3D Full-Head Synthesis in 360$^{\circ}$	Sizhe An et.al.	2303.13071	null
2023-03-22	Style Transfer for 2D Talking Head Animation	Trong-Thang Pham et.al.	2303.09799	link
2023-03-22	MARLIN: Masked Autoencoder for facial video Representation LearnINg	Zhixi Cai et.al.	2211.06627	link
2023-03-14	DisCoHead: Audio-and-Video-Driven Talking Head Generation by Disentangled Control of Head Pose and Facial Expressions	Geumbyeol Hwang et.al.	2303.07697	link
2023-03-13	SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation	Wenxuan Zhang et.al.	2211.12194	link
2023-03-09	FaceXHuBERT: Text-less Speech-driven E(X)pressive 3D Facial Animation Synthesis Using Self-Supervised Speech Representation Learning	Kazi Injamamul Haque et.al.	2303.05416	link
2023-03-09	Improving Few-Shot Learning for Talking Face System with TTS Data Augmentation	Qi Chen et.al.	2303.05322	link
2023-03-07	DINet: Deformation Inpainting Network for Realistic Face Visually Dubbing on High Resolution Video	Zhimeng Zhang et.al.	2303.03988	link
2023-03-05	Cyber Vaccine for Deepfake Immunity	Ching-Chun Chang et.al.	2303.02659	null
2023-03-04	High-fidelity Facial Avatar Reconstruction from Monocular Video with Generative Priors	Yunpeng Bai et.al.	2211.15064	null
2023-03-01	DPE: Disentanglement of Pose and Expression for General Video Portrait Editing	Youxin Pang et.al.	2301.06281	link
2023-02-27	Deep Visual Forced Alignment: Learning to Align Transcription with Talking Face Video	Minsu Kim et.al.	2303.08670	null
2023-02-27	Memory-augmented Contrastive Learning for Talking Head Generation	Jianrong Wang et.al.	2302.13469	link
2023-02-24	Pose-Controllable 3D Facial Animation Synthesis using Hierarchical Audio-Vertex Attention	Bin Liu et.al.	2302.12532	null
2023-02-16	OPT: One-shot Pose-Controllable Talking Head Generation	Jin Liu et.al.	2302.08197	null
2023-02-14	Expressive Talking Head Video Encoding in StyleGAN2 Latent-Space	Trevine Oorloff et.al.	2203.14512	link
2023-01-31	GeneFace: Generalized and High-Fidelity Audio-Driven 3D Talking Face Synthesis	Zhenhui Ye et.al.	2301.13430	null
2023-01-23	Data standardization for robust lip sync	Chun Wang et.al.	2202.06198	null
2023-01-20	Neural Volumetric Blendshapes: Computationally Efficient Physics-Based Facial Blendshapes	Nicolas Wagner et.al.	2212.14784	null
2023-01-15	Learning Audio-Driven Viseme Dynamics for 3D Face Animation	Linchao Bao et.al.	2301.06059	null
2022-12-30	Imitator: Personalized Speech-driven 3D Facial Animation	Balamurugan Thambiraja et.al.	2301.00023	null
2022-12-28	All’s well that FID’s well? Result quality and metric scores in GAN models for lip-sychronization tasks	Carina Geldhauser et.al.	2212.13810	null
2022-12-23	Dubbing in Practice: A Large Scale Study of Human Localization With Insights for Automatic Dubbing	William Brannon et.al.	2212.12137	null
2022-12-09	Masked Lip-Sync Prediction by Audio-Visual Contextual Exploitation in Transformers	Yasheng Sun et.al.	2212.04970	null
2022-12-07	Talking Head Generation with Probabilistic Audio-to-Visual Diffusion Priors	Zhentao Yu et.al.	2212.04248	null
2022-12-07	SPACE: Speech-driven Portrait Animation with Controllable Expression	Siddharth Gururani et.al.	2211.09809	null
2022-11-30	Extracting Semantic Knowledge from GANs with Unsupervised Learning	Jianjin Xu et.al.	2211.16710	null
2022-11-29	VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild	Kun Cheng et.al.	2211.14758	null
2022-11-26	Progressive Disentangled Representation Learning for Fine-Grained Controllable Talking Head Synthesis	Duomin Wang et.al.	2211.14506	link
2022-11-22	Real-time Neural Radiance Talking Portrait Synthesis via Audio-spatial Decomposition	Jiaxiang Tang et.al.	2211.12368	null
2022-11-10	On the role of Lip Articulation in Visual Speech Perception	Zakaria Aldeneh et.al.	2203.10117	null
2022-11-03	SyncTalkFace: Talking Face Generation with Precise Lip-Syncing via Audio-Lip Memory	Se Jin Park et.al.	2211.00924	null
2022-10-21	Leveraging Real Talking Faces via Self-Supervision for Robust Forgery Detection	Alexandros Haliassos et.al.	2201.07131	link
2022-10-14	Pre-Avatar: An Automatic Presentation Generation Framework Leveraging Talking Avatar	Aolan Sun et.al.	2210.06877	null
2022-10-13	Sparse in Space and Time: Audio-visual Synchronisation with Trainable Selectors	Vladimir Iashin et.al.	2210.07055	link
2022-10-07	Compressing Video Calls using Synthetic Talking Heads	Madhav Agarwal et.al.	2210.03692	null
2022-10-07	A Keypoint Based Enhancement Method for Audio Driven Free View Talking Head Synthesis	Yichen Han et.al.	2210.03335	null
2022-10-06	Audio-Visual Face Reenactment	Madhav Agarwal et.al.	2210.02755	link
2022-10-06	Finding Directions in GAN’s Latent Space for Neural Face Reenactment	Stella Bounareli et.al.	2202.00046	link
2022-10-04	Towards MOOCs for Lipreading: Using Synthetic Talking Heads to Train Humans in Lipreading at Scale	Aditya Agarwal et.al.	2208.09796	null
2022-09-29	Facial Landmark Predictions with Applications to Metaverse	Qiao Han et.al.	2209.14698	link
2022-09-27	StyleMask: Disentangling the Style Space of StyleGAN2 for Neural Face Reenactment	Stella Bounareli et.al.	2209.13375	link
2022-09-23	EAMM: One-Shot Emotional Talking Face via Audio-Based Emotion-Aware Motion Model	Xinya Ji et.al.	2205.15278	null
2022-09-21	FNeVR: Neural Volume Rendering for Face Animation	Bohan Zeng et.al.	2209.10340	link
2022-09-19	AutoLV: Automatic Lecture Video Generator	Wenbin Wang et.al.	2209.08795	null
2022-09-09	Talking Head from Speech Audio using a Pre-trained Image Generator	Mohammed M. Alghamdi et.al.	2209.04252	null
2022-09-07	Restructurable Activation Networks	Kartikeya Bhardwaj et.al.	2208.08562	link
2022-08-29	StableFace: Analyzing and Improving Motion Stability for Talking Face Generation	Jun Ling et.al.	2208.13717	null
2022-08-17	Extreme-scale Talking-Face Video Upsampling with Audio-Visual Priors	Sindhu B Hegde et.al.	2208.08118	link
2022-08-03	Free-HeadGAN: Neural Talking Head Synthesis with Explicit Gaze Control	Michail Christos Doukas et.al.	2208.02210	null
2022-08-02	Perceptual Conversational Head Generation with Regularized Driver and Enhanced Renderer	Ailin Huang et.al.	2206.12837	link
2022-08-01	A Feasibility Study on Image Inpainting for Non-cleft Lip Generation from Patients with Cleft Lip	Shuang Chen et.al.	2208.01149	link
2022-07-27	A Hybrid Deep Animation Codec for Low-bitrate Video Conferencing	Goluck Konuko et.al.	2207.13530	null
2022-07-24	Learning Dynamic Facial Radiance Fields for Few-Shot Talking Head Synthesis	Shuai Shen et.al.	2207.11770	link
2022-07-22	Visual Speech-Aware Perceptual 3D Facial Expression Reconstruction from Videos	Panagiotis P. Filntisis et.al.	2207.11094	link
2022-07-20	NARRATE: A Normal Assisted Free-View Portrait Stylizer	Youjia Wang et.al.	2207.00974	null
2022-07-20	VisageSynTalk: Unseen Speaker Video-to-Speech Synthesis via Speech-Visage Feature Selection	Joanna Hong et.al.	2206.07458	null
2022-07-20	Responsive Listening Head Generation: A Benchmark Dataset and Baseline	Mohan Zhou et.al.	2112.13548	null
2022-07-13	FastLTS: Non-Autoregressive End-to-End Unconstrained Lip-to-Speech Synthesis	Yongqi Wang et.al.	2207.03800	null
2022-06-29	Cut Inner Layers: A Structured Pruning Strategy for Efficient U-Net GANs	Bo-Kyeong Kim et.al.	2206.14658	null
2022-06-09	Face-Dubbing++: Lip-Synchronous, Voice Preserving Translation of Videos	Alexander Waibel et.al.	2206.04523	null
2022-05-31	Text/Speech-Driven Full-Body Animation	Wenlin Zhuang et.al.	2205.15573	null
2022-05-27	Unsupervised Voice-Face Representation Learning by Cross-Modal Prototype Contrast	Boqing Zhu et.al.	2204.14057	link
2022-05-26	One-Shot Face Reenactment on Megapixels	Wonjun Kang et.al.	2205.13368	null
2022-05-24	Merkel Podcast Corpus: A Multimodal Dataset Compiled from 16 Years of Angela Merkel’s Weekly Video Podcasts	Debjoy Saha et.al.	2205.12194	link
2022-05-20	MeshTalk: 3D Face Animation from Speech using Cross-Modality Disentanglement	Alexander Richard et.al.	2104.08223	link
2022-05-13	Talking Face Generation with Multilingual TTS	Hyoung-Kyu Song et.al.	2205.06421	null
2022-05-02	Emotion-Controllable Generalized Talking Face Generation	Sanjana Sinha et.al.	2205.01155	null
2022-05-02	A Novel Speech-Driven Lip-Sync Model with CNN and LSTM	Xiaohong Li et.al.	2205.00916	null
2022-04-27	Talking Head Generation Driven by Speech-Related Facial Action Units and Audio- Based on Multimodal Representation Fusion	Sen Chen et.al.	2204.12756	null
2022-04-25	Fast Facial Landmark Detection and Applications: A Survey	Kostiantyn Khabarlak et.al.	2101.10808	null
2022-04-13	Dynamic Neural Textures: Generating Talking-Face Videos with Continuously Controllable Expressions	Zipeng Ye et.al.	2204.06180	null
2022-04-06	Transformer-S2A: Robust and Efficient Speech-to-Animation	Liyang Chen et.al.	2111.09771	null
2022-04-03	Txt2Vid: Ultra-Low Bitrate Compression of Talking-Head Videos via Text	Pulkit Tandon et.al.	2106.14014	link
2022-03-30	End to End Lip Synchronization with a Temporal AutoEncoder	Yoav Shalev et.al.	2203.16224	link
2022-03-29	Thin-Plate Spline Motion Model for Image Animation	Jian Zhao et.al.	2203.14367	link
2022-03-17	StyleHEAT: One-Shot High-Resolution Editable Talking Face Generation via Pre-trained StyleGAN	Fei Yin et.al.	2203.04036	link
2022-03-17	FaceFormer: Speech-Driven 3D Facial Animation with Transformers	Yingruo Fan et.al.	2112.05329	link
2022-03-16	Efficient conditioned face animation using frontally-viewed embedding	Maxime Oquab et.al.	2203.08765	null
2022-03-15	Depth-Aware Generative Adversarial Network for Talking Head Video Generation	Fa-Ting Hong et.al.	2203.06605	link
2022-03-10	An Audio-Visual Attention Based Multimodal Network for Fake Talking Face Videos Detection	Ganglai Wang et.al.	2203.05178	null
2022-03-08	Attention-Based Lip Audio-Visual Synthesis for Talking Face Generation in the Wild	Ganglai Wang et.al.	2203.03984	null
2022-03-04	Multi-modality Deep Restoration of Extremely Compressed Face Videos	Xi Zhang et.al.	2107.05548	null
2022-03-01	FakeAVCeleb: A Novel Audio-Video Multimodal Deepfake Dataset	Hasam Khalid et.al.	2108.05080	link
2022-02-25	FSGANv2: Improved Subject Agnostic Face Swapping and Reenactment	Yuval Nirkin et.al.	2202.12972	null
2022-02-22	Thinking the Fusion Strategy of Multi-reference Face Reenactment	Takuya Yashima et.al.	2202.10758	null
2022-01-24	Selective Listening by Synchronizing Speech with Lips	Zexu Pan et.al.	2106.07150	link
2022-01-22	Text2Video: Text-driven Talking-head Video Synthesis with Personalized Phoneme-Pose Dictionary	Sibo Zhang et.al.	2104.14631	null
2022-01-21	Stitch it in Time: GAN-Based Facial Editing of Real Videos	Rotem Tzaban et.al.	2201.08361	link
2022-01-17	Towards Realistic Visual Dubbing with Heterogeneous Sources	Tianyi Xie et.al.	2201.06260	null
2022-01-16	Audio-Driven Talking Face Video Generation with Dynamic Convolution Kernels	Zipeng Ye et.al.	2201.05986	null
2022-01-03	DFA-NeRF: Personalized Talking Head Generation via Disentangled Face Attributes Neural Rendering	Shunyu Yao et.al.	2201.00791	null
2021-12-20	Parallel and High-Fidelity Text-to-Lip Generation	Jinglin Liu et.al.	2107.06831	link
2021-12-19	Initiative Defense against Facial Manipulation	Qidong Huang et.al.	2112.10098	link
2021-12-07	Joint Audio-Text Model for Expressive Speech-Driven 3D Facial Animation	Yingruo Fan et.al.	2112.02214	null
2021-12-06	One-shot Talking Face Generation from Single-speaker Audio-Visual Correlation Learning	Suzhen Wang et.al.	2112.02749	null
2021-11-29	Speech Drives Templates: Co-Speech Gesture Synthesis with Learned Templates	Shenhan Qian et.al.	2108.08020	link
2021-11-04	FEAFA+: An Extended Well-Annotated Dataset for Facial Expression Analysis and 3D Facial Animation	Wei Gan et.al.	2111.02751	null
2021-11-02	BiosecurID: a multimodal biometric database	Julian Fierrez et.al.	2111.03472	null
2021-10-30	Imitating Arbitrary Talking Style for Realistic Audio-Driven Talking Face Synthesis	Haozhe Wu et.al.	2111.00203	link
2021-10-26	Emotion recognition in talking-face videos using persistent entropy and neural networks	Eduardo Paluzo-Hidalgo et.al.	2110.13571	link
2021-10-26	ViDA-MAN: Visual Dialog with Digital Humans	Tong Shen et.al.	2110.13384	null
2021-10-22	Invertible Frowns: Video-to-Video Facial Emotion Translation	Ian Magnusson et.al.	2109.08061	null
2021-10-19	Talking Head Generation with Audio and Speech Related Facial Action Units	Sen Chen et.al.	2110.09951	null
2021-10-16	Intelligent Video Editing: Incorporating Modern Talking Face Generation Algorithms in a Video Editor	Anchit Gupta et.al.	2110.08580	null
2021-10-12	Fine-grained Identity Preserving Landmark Synthesis for Face Reenactment	Haichao Zhang et.al.	2110.04708	null
2021-10-07	Streaming Transformer Transducer Based Speech Recognition Using Non-Causal Convolution	Yangyang Shi et.al.	2110.05241	null
2021-09-24	Live Speech Portraits: Real-Time Photorealistic Talking-Head Animation	Yuanxun Lu et.al.	2109.10595	null
2021-09-20	Accurate, Interpretable, and Fast Animation: An Iterative, Sparse, and Nonconvex Approach	Stevo Rackovic et.al.	2109.08356	null
2021-09-17	Detection of GAN-synthesized street videos	Omran Alamayreh et.al.	2109.04991	null
2021-08-30	Audiovisual Speech Synthesis using Tacotron2	Ahmed Hussen Abdelaziz et.al.	2008.00620	null
2021-08-23	KoDF: A Large-scale Korean DeepFake Detection Dataset	Patrick Kwon et.al.	2103.10094	null
2021-08-23	HeadGAN: One-shot Neural Head Synthesis and Editing	Michail Christos Doukas et.al.	2012.08261	null
2021-08-19	AD-NeRF: Audio Driven Neural Radiance Fields for Talking Head Synthesis	Yudong Guo et.al.	2103.11078	link
2021-08-18	DeepFake MNIST+: A DeepFake Facial Animation Dataset	Jiajun Huang et.al.	2108.07949	link
2021-08-18	FACIAL: Synthesizing Dynamic Talking Face with Implicit Attribute Learning	Chenxu Zhang et.al.	2108.07938	link
2021-08-12	UniFaceGAN: A Unified Framework for Temporally Consistent Facial Video Editing	Meng Cao et.al.	2108.05650	null
2021-08-11	AnyoneNet: Synchronized Speech and Talking Head Generation for Arbitrary Person	Xinsheng Wang et.al.	2108.04325	null
2021-08-06	SofGAN: A Portrait Image Generator with Dynamic Styling	Anpei Chen et.al.	2007.03780	link
2021-07-27	Beyond Voice Identity Conversion: Manipulating Voice Attributes by Adversarial Learning of Structured Disentangled Representations	Laurent Benaroya et.al.	2107.12346	null
2021-07-21	Speech Driven Talking Face Generation from a Single Image and an Emotion Condition	Sefik Emre Eskimez et.al.	2008.03592	link
2021-07-20	Audio2Head: Audio-driven One-shot Talking-head Generation with Natural Head Motion	Suzhen Wang et.al.	2107.09293	link
2021-07-10	Speech2Video: Cross-Modal Distillation for Speech to Video Generation	Shijing Si et.al.	2107.04806	null
2021-07-07	Egocentric Videoconferencing	Mohamed Elgharib et.al.	2107.03109	null
2021-06-08	LipSync3D: Data-Efficient Learning of Personalized 3D Talking Faces from Video using Pose and Lighting Normalization	Avisek Lahiri et.al.	2106.04185	null
2021-05-20	Audio-Driven Emotional Video Portraits	Xinya Ji et.al.	2104.07452	null
2021-05-07	Write-a-speaker: Text-based Emotional and Rhythmic Talking-head Generation	Lincheng Li et.al.	2104.07995	link
2021-05-05	A Neural Lip-Sync Framework for Synthesizing Photorealistic Virtual News Anchors	Ruobing Zheng et.al.	2002.08700	null
2021-04-29	Learned Spatial Representations for Few-shot Talking-Head Synthesis	Moustafa Meshry et.al.	2104.14557	null
2021-04-26	One-shot Face Reenactment Using Appearance Adaptive Normalization	Guangming Yao et.al.	2102.03984	null
2021-04-25	3D-TalkEmo: Learning to Synthesize 3D Emotional Talking Head	Qianyun Wang et.al.	2104.12051	null
2021-04-23	Pose-Controllable Talking Face Generation by Implicitly Modularized Audio-Visual Representation	Hang Zhou et.al.	2104.11116	null
2021-04-07	Single Source One Shot Reenactment using Weighted motion From Paired Feature Points	Soumya Tripathy et.al.	2104.03117	null
2021-04-07	Everything’s Talkin’: Pareidolia Face Reenactment	Linsen Song et.al.	2104.03061	link
2021-04-07	LI-Net: Large-Pose Identity-Preserving Face Reenactment Network	Jin Liu et.al.	2104.02850	null
2021-04-02	One-Shot Free-View Neural Talking-Head Synthesis for Video Conferencing	Ting-Chun Wang et.al.	2011.15126	null
2021-03-20	Not made for each other- Audio-Visual Dissonance-based Deepfake Detection and Localization	Komal Chugh et.al.	2005.14405	link
2021-03-19	End-to-End Lip Synchronisation Based on Pattern Classification	You Jin Kim et.al.	2005.08606	null
2021-03-05	Real-time RGBD-based Extended Body Pose Estimation	Renat Bashirov et.al.	2103.03663	link
2021-03-03	Estimating Uniqueness of I-Vector Representation of Human Voice	Erkam Sinan Tandogan et.al.	2008.11985	null
2021-02-25	MakeItTalk: Speaker-Aware Talking-Head Animation	Yang Zhou et.al.	2004.12992	null
2021-02-19	One Shot Audio to Animated Video Generation	Neeraj Kumar et.al.	2102.09737	null
2021-02-18	AudioVisual Speech Synthesis: A brief literature review	Efthymios Georgiou et.al.	2103.03927	null
2020-12-14	Robust One Shot Audio to Video Generation	Neeraj Kumar et.al.	2012.07842	null
2020-12-14	Multi Modal Adaptive Normalization for Audio to Video Generation	Neeraj Kumar et.al.	2012.07304	null
2020-11-30	Adaptive Compact Attention For Few-shot Video-to-video Translation	Risheng Huang et.al.	2011.14695	null
2020-11-21	Stochastic Talking Face Generation Using Latent Distribution Matching	Ravindra Yadav et.al.	2011.10727	link
2020-11-21	Iterative Text-based Editing of Talking-heads Using Neural Retargeting	Xinwei Yao et.al.	2011.10688	null
2020-11-09	FACEGAN: Facial Attribute Controllable rEenactment GAN	Soumya Tripathy et.al.	2011.04439	null
2020-11-06	Large-scale multilingual audio visual dubbing	Yi Yang et.al.	2011.03530	null
2020-11-02	Facial Keypoint Sequence Generation from Audio	Prateek Manocha et.al.	2011.01114	null
2020-10-25	APB2FaceV2: Real-Time Audio-Guided Multi-Face Reenactment	Jiangning Zhang et.al.	2010.13017	link
2020-10-12	Intuitive Facial Animation Editing Based On A Generative RNN Framework	Eloïse Berson et.al.	2010.05655	null
2020-10-05	SMILE: Semantically-guided Multi-attribute Image and Layout Editing	Andrés Romero et.al.	2010.02315	link
2020-10-05	Dynamic Facial Asset and Rig Generation from a Single Scan	Jiaman Li et.al.	2010.00560	null
2020-09-20	An Improved Approach of Intention Discovery with Machine Learning for POMDP-based Dialogue Management	Ruturaj Raval et.al.	2009.09354	null
2020-09-18	Mesh Guided One-shot Face Reenactment using Graph Convolutional Networks	Guangming Yao et.al.	2008.07783	null
2020-09-12	DualLip: A System for Joint Lip Reading and Generation	Weicong Chen et.al.	2009.05784	null
2020-09-02	Seeing wake words: Audio-visual Keyword Spotting	Liliane Momeni et.al.	2009.01225	null
2020-08-29	“It took me almost 30 minutes to practice this”. Performance and Production Practices in Dance Challenge Videos on TikTok	Daniel Klug et.al.	2008.13040	null
2020-08-25	A Lip Sync Expert Is All You Need for Speech to Lip Generation In The Wild	K R Prajwal et.al.	2008.10010	null
2020-08-11	Audio- and Gaze-driven Facial Animation of Codec Avatars	Alexander Richard et.al.	2008.05023	null
2020-08-04	Speaker dependent acoustic-to-articulatory inversion using real-time MRI of the vocal tract	Tamás Gábor Csapó et.al.	2008.02098	link
2020-08-04	Real-Time Cleaning and Refinement of Facial Animation Signals	Eloïse Berson et.al.	2008.01332	null
2020-08-02	Deep Multi-modality Soft-decoding of Very Low Bit-rate Face Videos	Yanhui Guo et.al.	2008.01652	null
2020-07-29	Neural Voice Puppetry: Audio-driven Facial Reenactment	Justus Thies et.al.	1912.05566	link
2020-07-20	Deformable Style Transfer	Sunnie S. Y. Kim et.al.	2003.11038	link
2020-07-18	A Robust Interactive Facial Animation Editing System	Eloïse Berson et.al.	2007.09367	null
2020-07-16	Talking-head Generation with Rhythmic Head Motion	Lele Chen et.al.	2007.08547	link
2020-07-08	Learning Speech Representations from Raw Audio by Joint Audiovisual Self-Supervision	Abhinav Shukla et.al.	2007.04134	null
2020-06-20	Speaker Independent and Multilingual/Mixlingual Speech-Driven Talking Head Generation Using Phonetic Posteriorgrams	Huirong Huang et.al.	2006.11610	null
2020-05-27	Modality Dropout for Improved Performance-driven Talking Faces	Ahmed Hussen Abdelaziz et.al.	2005.13616	null
2020-05-25	Identity-Preserving Realistic Talking Face Generation	Sanjana Sinha et.al.	2005.12318	null
2020-05-22	Head2Head: Video-based Neural Head Synthesis	Mohammad Rami Koujan et.al.	2005.10954	null
2020-05-16	FReeNet: Multi-Identity Face Reenactment	Jiangning Zhang et.al.	1905.11805	null
2020-05-13	FaR-GAN for One-Shot Face Reenactment	Hanxiang Hao et.al.	2005.06402	null
2020-05-13	Arbitrary Talking Face Generation via Attentional Audio-Visual Coherence Learning	Hao Zhu et.al.	1812.06589	null
2020-05-11	Dancing to the Partisan Beat: A First Analysis of Political Communication on TikTok	Juan Carlos Medina Serrano et.al.	2004.05478	link
2020-05-07	What comprises a good talking-head video generation?: A Survey and Benchmark	Lele Chen et.al.	2005.03201	link
2020-05-04	Disentangled Speech Embeddings using Cross-modal Self-supervision	Arsha Nagrani et.al.	2002.08742	null
2020-04-30	APB2Face: Audio-guided face reenactment with auxiliary pose and blink signals	Jiangning Zhang et.al.	2004.14569	null
2020-03-30	ActGAN: Flexible and Efficient One-shot Face Reenactment	Ivan Kosarevych et.al.	2003.13840	null
2020-03-29	Realistic Face Reenactment via Self-Supervised Disentangling of Identity and Pose	Xianfang Zeng et.al.	2003.12957	null
2020-03-26	High-Accuracy Facial Depth Models derived from 3D Synthetic Data	Faisal Khan et.al.	2003.06211	null
2020-03-05	Talking-Heads Attention	Noam Shazeer et.al.	2003.02436	link
2020-03-05	Audio-driven Talking Face Video Generation with Learning-based Personalized Head Pose	Ran Yi et.al.	2002.10137	link
2020-03-01	Towards Automatic Face-to-Face Translation	Prajwal K R et.al.	2003.00418	link
2020-02-19	Speech-driven facial animation using polynomial fusion of features	Triantafyllos Kefalas et.al.	1912.05833	null
2020-01-17	ICface: Interpretable and Controllable Face Reenactment Using GANs	Soumya Tripathy et.al.	1904.01909	null
2019-12-20	Disentangling Style and Content in Anime Illustrations	Sitao Xiang et.al.	1905.10742	null
2019-11-21	FLNet: Landmark Driven Fetching and Learning Network for Faithful Talking Facial Animation Synthesis	Kuangxiao Gu et.al.	1911.09224	null
2019-11-19	MarioNETte: Few-shot Face Reenactment Preserving Identity of Unseen Targets	Sungjoo Ha et.al.	1911.08139	null
2019-10-28	Few-shot Video-to-Video Synthesis	Ting-Chun Wang et.al.	1910.12713	null
2019-10-19	Real-Time Lip Sync for Live 2D Animation	Deepali Aneja et.al.	1910.08685	link
2019-10-16	Designing Style Matching Conversational Agents	Deepali Aneja et.al.	1910.07514	null
2019-10-15	A High-Fidelity Open Embodied Avatar with Lip Syncing and Expression Capabilities	Deepali Aneja et.al.	1909.08766	link
2019-10-09	EmoCo: Visual Analysis of Emotion Coherence in Presentation Videos	Haipeng Zeng et.al.	1907.12918	null
2019-10-02	Animating Face using Disentangled Audio Representations	Gaurav Mittal et.al.	1910.00726	null
2019-09-25	Few-Shot Adversarial Learning of Realistic Neural Talking Head Models	Egor Zakharov et.al.	1905.08233	null
2019-09-06	Neural Style-Preserving Visual Dubbing	Hyeongwoo Kim et.al.	1909.02518	null
2019-08-29	3D Face Pose and Animation Tracking via Eigen-Decomposition based Bayesian Approach	Ngoc-Trung Tran et.al.	1908.11039	null
2019-08-20	Prosodic Phrase Alignment for Machine Dubbing	Alp Öktem et.al.	1908.07226	link
2019-08-16	FSGAN: Subject Agnostic Face Swapping and Reenactment	Yuval Nirkin et.al.	1908.05932	link
2019-08-11	Emotion Dependent Facial Animation from Affective Speech	Rizwan Sadiq et.al.	1908.03904	null
2019-08-05	One-shot Face Reenactment	Yunxuan Zhang et.al.	1908.03251	link
2019-07-25	Talking Face Generation by Conditional Recurrent Adversarial Network	Yang Song et.al.	1804.04786	link
2019-07-24	Data-Driven Physical Face Inversion	Yeara Kozlov et.al.	1907.10402	null
2019-07-23	A system for efficient 3D printed stop-motion face animation	Rinat Abdrashitov et.al.	1907.10163	null
2019-06-14	Realistic Speech-Driven Facial Animation with GANs	Konstantinos Vougioukas et.al.	1906.06337	null
2019-06-04	Text-based Editing of Talking-head Video	Ohad Fried et.al.	1906.01524	null
2019-05-27	Audio2Face: Generating Speech/Face Animation from Single Audio with Attention-Based Bidirectional LSTM Networks	Guanzhong Tian et.al.	1905.11142	null
2019-05-09	Hierarchical Cross-Modal Talking Face Generation with Dynamic Pixel-Wise Loss	Lele Chen et.al.	1905.03820	link
2019-05-08	Capture, Learning, and Synthesis of 3D Speaking Styles	Daniel Cudeiro et.al.	1905.03079	link
2019-04-23	Talking Face Generation by Adversarially Disentangled Audio-Visual Representation	Hang Zhou et.al.	1807.07860	null
2019-04-02	FEAFA: A Well-Annotated Dataset for Facial Expression Analysis and 3D Facial Animation	Yanfu Yan et.al.	1904.01509	null
2019-03-13	Animating an Autonomous 3D Talking Avatar	Dominik Borer et.al.	1903.05448	null
2018-12-22	Deep Audio-Visual Speech Recognition	Triantafyllos Afouras et.al.	1809.02108	null
2018-12-20	DeepFakes: a New Threat to Face Recognition? Assessment and Detection	Pavel Korshunov et.al.	1812.08685	null
2018-11-22	Towards Highly Accurate and Stable Face Alignment for High-Resolution Videos	Ying Tai et.al.	1811.00342	link
2018-11-16	Influence of visual cues on head and eye movements during listening tasks in multi-talker audiovisual environments with animated characters	Maartje M. E. Hendrikse et.al.	1812.02088	null
2018-08-28	GANimation: Anatomically-aware Facial Animation from a Single Image	Albert Pumarola et.al.	1807.09251	link
2018-08-19	Dynamic Temporal Alignment of Speech to Lips	Tavi Halperin et.al.	1808.06250	link
2018-07-29	ReenactGAN: Learning to Reenact Faces via Boundary Transfer	Wayne Wu et.al.	1807.11079	link
2018-07-26	Learnable PINs: Cross-Modal Embeddings for Person Identity	Arsha Nagrani et.al.	1805.00833	null
2018-07-19	End-to-End Speech-Driven Facial Animation with Temporal GANs	Konstantinos Vougioukas et.al.	1805.09313	null
2018-05-29	Deep Video Portraits	Hyeongwoo Kim et.al.	1805.11714	null
2018-05-24	VisemeNet: Audio-Driven Animator-Centric Speech Animation	Yang Zhou et.al.	1805.09488	null
2018-05-21	Anime Style Space Exploration Using Metric Learning and Generative Adversarial Networks	Sitao Xiang et.al.	1805.07997	null
2018-04-23	Generating Talking Face Landmarks from Speech	Sefik Emre Eskimez et.al.	1803.09803	null
2018-03-28	Generative Adversarial Talking Head: Bringing Portraits to Life with a Weakly Supervised Neural Network	Hai X. Pham et.al.	1803.07716	null
2018-03-20	Speech-Driven Facial Reenactment Using Conditional Generative Adversarial Networks	Seyed Ali Jalalifar et.al.	1803.07461	null
2017-12-07	End-to-end Learning for 3D Facial Animation from Raw Waveforms of Speech	Hai X. Pham et.al.	1710.00920	null
2017-12-06	ObamaNet: Photo-realistic lip-sync from text	Rithesh Kumar et.al.	1801.01442	null
2017-07-30	Kernel Projection of Latent Structures Regression for Facial Animation Retargeting	Christos Ouzounis et.al.	1707.09629	null
2017-07-26	Fast Deep Matting for Portrait Animation on Mobile Phone	Bingke Zhu et.al.	1707.08289	null
2017-07-21	Multichannel Attention Network for Analyzing Visual Behavior in Public Speaking	Rahul Sharma et.al.	1707.06830	null
2017-07-18	You said that?	Joon Son Chung et.al.	1705.02966	null
2017-01-30	Lip Reading Sentences in the Wild	Joon Son Chung et.al.	1611.05358	link
2016-10-28	Galaxy gas as obscurer: II. Separating the galaxy-scale and nuclear obscurers of Active Galactic Nuclei	Johannes Buchner et.al.	1610.09380	link
2016-07-11	Large-Scale MIMO is Capable of Eliminating Power-Thirsty Channel Coding for Wireless Transmission of HEVC/H.265 Video	Shaoshi Yang et.al.	1601.06684	null
2016-05-22	Improving Facial Analysis and Performance Driven Animation through Disentangling Identity and Expression	David Rim et.al.	1512.08212	null
2016-02-08	Automatic Face Reenactment	Pablo Garrido et.al.	1602.02651	null
2015-11-20	ExpressionBot: An Emotive Lifelike Robotic Face for Face-to-Face Communication	Ali Mollahosseini et.al.	1511.06502	null
2014-09-03	Visual Speech Recognition	Ahmad B. A. Hassanat et.al.	1409.1411	null
2012-09-22	Using multimodal speech production data to evaluate articulatory animation for audiovisual speech synthesis	Ingmar Steiner et.al.	1209.4982	null
2012-03-30	Face Expression Recognition and Analysis: The State of the Art	Vinay Bettadapura et.al.	1203.6722	null
2012-01-19	Progress in animation of an EMA-controlled tongue model for acoustic-visual speech synthesis	Ingmar Steiner et.al.	1201.4080	null
2010-03-01	Re-verification of a Lip Synchronization Protocol using Robust Reachability	Piotr Kordy et.al.	1003.0431	null

Image Animation

Publish Date	Title	Authors	PDF	Code
2026-02-11	MotionWeaver: Holistic 4D-Anchored Framework for Multi-Humanoid Image Animation	Xirui Hu et.al.	2602.13326	null
2026-01-29	DreamActor-M2: Universal Character Image Animation via Spatiotemporal In-Context Learning	Mingshuang Luo et.al.	2601.21716	null
2026-01-16	CoDance: An Unbind-Rebind Paradigm for Robust Multi-Subject Animation	Shuai Tan et.al.	2601.11096	null
2026-01-06	DreamLoop: Controllable Cinemagraph Generation from a Single Photograph	Aniruddha Mahapatra et.al.	2601.02646	null
2025-12-31	Few-Shot-Based Modular Image-to-Video Adapter for Diffusion Models	Zhenhao Li et.al.	2512.20000	null
2025-12-30	APOLLO Blender: A Robotics Library for Visualization and Animation in Blender	Peter Messina et.al.	2512.23103	null
2025-12-29	MACE-Dance: Motion-Appearance Cascaded Experts for Music-Driven Dance Video Generation	Kaixing Yang et.al.	2512.18181	null
2025-12-26	High-Fidelity and Long-Duration Human Image Animation with Diffusion Transformer	Shen Zheng et.al.	2512.21905	null
2025-12-12	PersonaLive! Expressive Portrait Image Animation for Live Streaming	Zhiyuan Li et.al.	2512.11253	null
2025-12-05	SCAIL: Towards Studio-Grade Character Animation via In-Context Learning of 3D-Consistent Pose Representations	Wenhao Yan et.al.	2512.05905	null
2025-12-05	Learning High-Fidelity Cloth Animation via Skinning-Free Image Transfer	Rong Wang et.al.	2512.05593	null
2025-12-04	ShadowDraw: From Any Object to Shadow-Drawing Compositional Art	Rundong Luo et.al.	2512.05110	null
2025-12-04	Efficient Spatially-Variant Convolution via Differentiable Sparse Kernel Complex	Zhizhen Wu et.al.	2512.04556	null
2025-12-03	Artificial Microsaccade Compensation: Stable Vision for an Ornithopter	Levi Burner et.al.	2512.03995	null
2025-12-02	PPTArena: A Benchmark for Agentic PowerPoint Editing	Michael Ofengenden et.al.	2512.03042	null
2025-12-01	Know Thyself by Knowing Others: Learning Neuron Identity from Population Context	Vinam Arora et.al.	2512.01199	null
2025-12-01	One-to-All Animation: Alignment-Free Character Animation and Image Pose Transfer	Shijun Shi et.al.	2511.22940	null
2025-11-30	TalkingPose: Efficient Face and Gesture Animation with Feedback-guided Diffusion Model	Alireza Javanmardi et.al.	2512.00909	null
2025-11-29	Astro-Animation – How Artists and Scientists Envision the Universe	Laurence Arcadias et.al.	2512.00535	null
2025-11-28	MultiBanana: A Challenging Benchmark for Multi-Reference Text-to-Image Generation	Yuta Oshima et.al.	2511.22989	null
2025-11-27	A Progressive Evaluation Framework for Multicultural Analysis of Story Visualization	Janak Kapuriya et.al.	2511.22576	null
2025-11-27	INSIGHT: An Interpretable Neural Vision-Language Framework for Reasoning of Generative Artifacts	Anshul Bagaria et.al.	2511.22351	null
2025-11-25	MotionV2V: Editing Motion in a Video	Ryan Burgert et.al.	2511.20640	null
2025-11-25	New York Smells: A Large Multimodal Dataset for Olfaction	Ege Ozguroglu et.al.	2511.20544	null
2025-11-24	SteadyDancer: Harmonized and Coherent Human Image Animation with First-Frame Preservation	Jiaming Zhang et.al.	2511.19320	null
2025-11-22	AnimAgents: Coordinating Multi-Stage Animation Pre-Production with Human-Multi-Agent Collaboration	Wen-Fan Wang et.al.	2511.17906	null
2025-11-20	Motion Transfer-Enhanced StyleGAN for Generating Diverse Macaque Facial Expressions	Takuya Igaue et.al.	2511.16711	null
2025-11-20	Integrating Deep Learning and Spatial Statistics in Marine Ecosystem Monitoring	Gian Mario Sangiovanni et.al.	2511.16447	null
2025-11-20	How Robot Dogs See the Unseeable	Oliver Bimber et.al.	2511.16262	null
2025-11-18	PFAvatar: Pose-Fusion 3D Personalized Avatar Reconstruction from Real-World Outfit-of-the-Day Photos	Dianbing Xi et.al.	2511.12935	null
2025-11-16	Sketch2PoseNet: Efficient and Generalized Sketch to 3D Human Pose Prediction	Li Wang et.al.	2510.26196	null
2025-11-14	EmoVid: A Multimodal Emotion Video Dataset for Emotion-Centric Video Understanding and Generation	Zongyang Qiu et.al.	2511.11002	null
2025-11-11	OmniAID: Decoupling Semantic and Artifacts for Universal AI-Generated Image Detection in the Wild	Yuncheng Guo et.al.	2511.08423	null
2025-11-11	oboro: Text-to-Image Synthesis on Limited Data using Flow-based Diffusion Transformer with MMH Attention	Ryusuke Mizutani et.al.	2511.08168	null
2025-11-11	Beyond the Pixels: VLM-based Evaluation of Identity Preservation in Reference-Guided Synthesis	Aditi Singhania et.al.	2511.08087	null
2025-11-09	Time-to-Move: Training-Free Motion Controlled Video Generation via Dual-Clock Denoising	Assaf Singer et.al.	2511.08633	null
2025-11-04	Video Text Preservation with Synthetic Text-Rich Videos	Ziyang Liu et.al.	2511.05573	null
2025-11-03	FreeArt3D: Training-Free Articulated Object Generation using 3D Diffusion	Chuhao Chen et.al.	2510.25765	null
2025-11-02	A Hybrid YOLOv5-SSD IoT-Based Animal Detection System for Durian Plantation Protection	Anis Suttan Shahrir et.al.	2511.00777	null
2025-10-31	DANCER: Dance ANimation via Condition Enhancement and Rendering with diffusion model	Yucheng Xing et.al.	2510.27169	null
2025-10-29	4-Doodle: Text to 3D Sketches that Move!	Hao Chen et.al.	2510.25319	null
2025-09-19	TT-DF: A Large-Scale Diffusion-Based Dataset and Benchmark for Human Body Forgery Detection	Wenkui Yang et.al.	2505.08437	null
2025-09-09	LINR Bridge: Vector Graphic Animation via Neural Implicits and Video Diffusion Priors	Wenshuo Gao et.al.	2509.07484	null
2025-08-23	AnimateAnywhere: Rouse the Background in Human Image Animation	Xiaoyu Liu et.al.	2504.19834	null
2025-08-13	Animate-X++: Universal Character Image Animation with Dynamic Backgrounds	Shuai Tan et.al.	2508.09454	null
2025-08-10	Consistent and Controllable Image Animation with Motion Linear Diffusion Transformers	Xin Ma et.al.	2508.07246	null
2025-08-01	FLOAT: Generative Motion Latent Flow Matching for Audio-driven Talking Portrait	Taekyung Ki et.al.	2412.01064	null
2025-07-20	StableAnimator++: Overcoming Pose Misalignment and Face Distortion for Human Image Animation	Shuyuan Tu et.al.	2507.15064	null
2025-07-11	X-Dancer: Expressive Music to Human Dance Video Generation	Zeyuan Chen et.al.	2502.17414	null
2025-07-01	DAM-VSR: Disentanglement of Appearance and Motion for Video Super-Resolution	Zhe Kong et.al.	2507.01012	null
2025-06-09	Efficient Long-duration Talking Video Synthesis with Linear Diffusion Transformer under Multimodal Guidance	Haojie Zhang et.al.	2411.16748	null
2025-05-30	MTVCrafter: 4D Motion Tokenization for Open-World Human Image Animation	Yanbo Ding et.al.	2505.10238	null
2025-05-29	HyperMotion: DiT-Based Pose-Guided Human Image Animation of Complex Motions	Shuolin Xu et.al.	2505.22977	null
2025-05-24	EvAnimate: Event-conditioned Image-to-Video Generation for Human Animation	Qiang Qu et.al.	2503.18552	null
2025-05-18	DynamiCtrl: Rethinking the Basic Structure and the Role of Text for High-quality Human Image Animation	Haoyu Zhao et.al.	2503.21246	null
2025-04-20	DreamActor-M1: Holistic, Expressive and Robust Human Image Animation with Hybrid Guidance	Yuxuan Luo et.al.	2504.01724	null
2025-04-15	UniAnimate-DiT: Human Image Animation with Large-Scale Video Diffusion Transformer	Xiang Wang et.al.	2504.11289	null
2025-04-15	Taming Consistency Distillation for Accelerated Human Image Animation	Xiang Wang et.al.	2504.11143	null
2025-04-14	TPC: Test-time Procrustes Calibration for Diffusion-based Human Image Animation	Sunjae Yoon et.al.	2410.24037	null
2025-04-05	Multi-identity Human Image Animation with Structural Video Diffusion	Zhenzhi Wang et.al.	2504.04126	null
2025-04-04	Optimizing 4D Gaussians for Dynamic Scene Video from Single Landscape Images	In-Hwan Jin et.al.	2504.05458	null
2025-04-01	VFX Creator: Animated Visual Effect Generation with Controllable Diffusion Transformer	Xinyu Liu et.al.	2502.05979	null
2025-03-23	MotiF: Making Text Count in Image Animation with Motion Focal Loss	Shijie Wang et.al.	2412.16153	null
2025-03-13	Hallo3: Highly Dynamic and Realistic Portrait Image Animation with Video Diffusion Transformer	Jiahao Cui et.al.	2412.00733	null
2025-03-10	Perception-as-Control: Fine-grained Controllable Image Animation with 3D-aware Motion Representation	Yingjie Chen et.al.	2501.05020	null
2025-03-10	VidStyleODE: Disentangled Video Editing via StyleGAN and NeuralODEs	Moayed Haji Ali et.al.	2304.06020	null
2025-03-01	Towards Multiple Character Image Animation Through Enhancing Implicit Decoupling	Jingyun Xue et.al.	2406.03035	null
2025-02-25	DisPose: Disentangling Pose Guidance for Controllable Human Image Animation	Hongxiang Li et.al.	2412.09349	null
2025-02-24	Dormant: Defending against Pose-driven Human Image Animation	Jiachen Zhou et.al.	2409.14424	null
2025-02-15	SkyReels-A1: Expressive Portrait Animation in Video Diffusion Transformers	Di Qiu et.al.	2502.10841	null
2025-02-10	Animate Anyone 2: High-Fidelity Character Image Animation with Environment Affordance	Li Hu et.al.	2502.06145	null
2025-02-06	MotionCanvas: Cinematic Shot Design with Controllable Image-to-Video Generation	Jinbo Xing et.al.	2502.04299	null
2025-01-30	Every Image Listens, Every Image Dances: Music-Driven Image Animation	Zhikang Dong et.al.	2501.18801	null
2025-01-20	X-Dyna: Expressive Dynamic Human Image Animation	Di Chang et.al.	2501.10021	null
2025-01-15	Joint Learning of Depth and Appearance for Portrait Image Animation	Xinya Ji et.al.	2501.08649	null
2024-12-12	Animate-X: Universal Character Image Animation with Enhanced Motion Representation	Shuai Tan et.al.	2410.10306	null
2024-11-30	DreamDance: Animating Human Images by Enriching 3D Geometry Cues from 2D Poses	Yatian Pang et.al.	2412.00397	null
2024-11-28	JoyVASA: Portrait and Animal Image Animation with Diffusion-Based Audio-Driven Facial Dynamics and Head Motion Generation	Xuyang Cao et.al.	2411.09209	null
2024-11-27	StableAnimator: High-Quality Identity-Preserving Human Image Animation	Shuyuan Tu et.al.	2411.17697	null
2024-11-22	HumanVid: Demystifying Training Data for Camera-controllable Human Image Animation	Zhenzhi Wang et.al.	2407.17438	null
2024-11-12	LEO: Generative Latent Image Animator for Human Video Synthesis	Yaohui Wang et.al.	2305.03989	null
2024-10-20	FrameBridge: Improving Image-to-Video Generation with Bridge Models	Yuji Wang et.al.	2410.15371	null
2024-10-14	Hallo2: Long-Duration and High-Resolution Audio-Driven Portrait Image Animation	Jiahao Cui et.al.	2410.07718	null
2024-09-30	Illustrious: an Open Advanced Illustration Model	Sang Hyun Park et.al.	2409.19946	null
2024-09-29	High Quality Human Image Animation using Regional Supervision and Motion Blur Condition	Zhongcong Xu et.al.	2409.19580	null
2024-07-23	Cinemo: Consistent and Controllable Image Animation with Motion Diffusion Models	Xin Ma et.al.	2407.15642	null
2024-07-17	Audio-Synchronized Visual Animation	Lin Zhang et.al.	2403.05659	null
2024-07-12	TCAN: Animating Human Images with Temporally Consistent Pose Guidance using Diffusion Models	Jeongho Kim et.al.	2407.09012	null
2024-07-12	EchoMimic: Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditions	Zhiyuan Chen et.al.	2407.08136	null
2024-07-11	MOFA-Video: Controllable Image Animation via Generative Motion Field Adaptions in Frozen Image-to-Video Diffusion Model	Muyao Niu et.al.	2405.20222	null
2024-06-16	Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation	Mingwang Xu et.al.	2406.08801	null
2024-06-03	UniAnimate: Taming Unified Video Diffusion Models for Consistent Human Image Animation	Xiang Wang et.al.	2406.01188	null
2024-06-01	Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance	Shenhao Zhu et.al.	2403.14781	null
2024-05-29	Evaluating the effectiveness of sonification in science education using Edukoi	Lucrezia Guiotto Nai Fovino et.al.	2405.18908	null
2024-05-28	VividPose: Advancing Stable Video Diffusion for Realistic Human Image Animation	Qilin Wang et.al.	2405.18156	null
2024-05-28	Controllable Longer Image Animation with Diffusion Models	Qiang Wang et.al.	2405.17306	null
2024-05-20	Dynamic modeling of a sliding ring on an elastic rod with incremental potential formulation	Weicheng Huang et.al.	2208.01238	null
2024-03-25	PIA: Your Personalized Image Animator via Plug-and-Play Modules in Text-to-Image Models	Yiming Zhang et.al.	2312.13964	null
2024-03-13	Follow-Your-Click: Open-domain Regional Image Animation via Short Prompts	Yue Ma et.al.	2403.08268	null
2024-03-05	Tuning-Free Noise Rectification for High Fidelity Image-to-Video Generation	Weijie Li et.al.	2403.02827	null
2024-01-17	Continuous Piecewise-Affine Based Motion Model for Image Animation	Hexiang Wang et.al.	2401.09146	null
2024-01-03	Moonshot: Towards Controllable Video Generation and Editing with Multimodal Conditions	David Junhao Zhang et.al.	2401.01827	null
2023-12-08	AnimateZero: Video Diffusion Models are Zero-Shot Image Animators	Jiwen Yu et.al.	2312.03793	null
2023-12-05	LivePhoto: Real Image Animation with Text-guided Motion Control	Xi Chen et.al.	2312.02928	null
2023-12-04	AnimateAnything: Fine-Grained Open Domain Image Animation with Motion Guidance	Zuozhuo Dai et.al.	2311.12886	null
2023-11-30	Motion-Conditioned Image Animation for Video Editing	Wilson Yan et.al.	2311.18827	null
2023-11-27	MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model	Zhongcong Xu et.al.	2311.16498	null
2023-11-27	DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors	Jinbo Xing et.al.	2310.12190	null
2023-11-19	Differential Motion Evolution for Fine-Grained Motion Deformation in Unsupervised Image Animation	Peirong Liu et.al.	2110.04658	null
2023-10-16	LAMP: Learn A Motion Pattern for Few-Shot-Based Video Generation	Ruiqi Wu et.al.	2310.10769	null
2023-09-26	Text-Guided Synthesis of Eulerian Cinemagraphs	Aniruddha Mahapatra et.al.	2307.03190	null
2023-09-25	Automatic Animation of Hair Blowing in Still Portrait Photos	Wenpeng Xiao et.al.	2309.14207	null
2023-07-10	AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning	Yuwei Guo et.al.	2307.04725	link
2023-07-09	Predictive Coding For Animation-Based Video Compression	Goluck Konuko et.al.	2307.04187	null
2023-03-10	3D Cinemagraphy from a Single Image	Xingyi Li et.al.	2303.05724	null
2023-02-02	Dreamix: Video Diffusion Models are General Video Editors	Eyal Molad et.al.	2302.01329	null
2023-01-27	Animating Still Images	Kushagr Batra et.al.	2209.10497	null
2023-01-14	Continuous odor profile monitoring to study olfactory navigation in small animals	Kevin S. Chen et.al.	2301.05905	null
2022-11-30	NeRFInvertor: High Fidelity NeRF-GAN Inversion for Single-shot Real Image Animation	Yu Yin et.al.	2211.17235	null
2022-10-05	Implicit Warping for Animation with Image Sets	Arun Mallya et.al.	2210.01794	null
2022-09-28	Motion Transformer for Unsupervised Image Animation	Jiale Tao et.al.	2209.14024	null
2022-07-19	Single Stage Virtual Try-on via Deformable Attention Flows	Shuai Bai et.al.	2207.09161	null
2022-07-08	Jointly Harnessing Prior Structures and Temporal Consistency for Sign Language Video Generation	Yucheng Suo et.al.	2207.03714	null
2022-06-11	Bayesian Statistics Guided Label Refurbishment Mechanism: Mitigating Label Noise in Medical Image Classification	Mengdi Gao et.al.	2106.12284	null
2022-04-05	Neural Fields in Visual Computing and Beyond	Yiheng Xie et.al.	2111.11426	null
2022-03-29	Thin-Plate Spline Motion Model for Image Animation	Jian Zhao et.al.	2203.14367	null
2022-03-29	Image Animation with Perturbed Masks	Yoav Shalev et.al.	2011.06922	null
2022-03-25	3D GAN Inversion for Controllable Portrait Image Animation	Connor Z. Lin et.al.	2203.13441	null
2022-03-18	Latent Image Animator: Learning to Animate Images via Latent Space Navigation	Yaohui Wang et.al.	2203.09043	null
2021-12-21	Image Animation with Keypoint Mask	Or Toledano et.al.	2112.10457	null
2021-12-19	Move As You Like: Image Animation in E-Commerce Scenario	Borun Xu et.al.	2112.13647	null
2021-12-17	AI-Empowered Persuasive Video Generation: A Survey	Chang Liu et.al.	2112.09401	null
2021-10-28	Application of Time Separation Technique to Enhance C-arm CT Dynamic Liver Perfusion Imaging	Hana Haseljić et.al.	2110.14318	null
2021-10-26	Incremental Learning for Animal Pose Estimation using RBF k-DPP	Gaurav Kumar Nayak et.al.	2110.13598	null
2021-09-06	Sparse to Dense Motion Transfer for Face Image Animation	Ruiqi Zhao et.al.	2109.00471	null
2021-08-18	DeepFake MNIST+: A DeepFake Facial Animation Dataset	Jiajun Huang et.al.	2108.07949	null
2021-06-23	Analisis Kualitas Layanan Website E-Commerce Bukalapak Terhadap Kepuasan Pengguna Mahasiswa Universitas Bina Darma Menggunakan Metode Webqual 4.0	Adellia et.al.	2106.15342	null
2021-04-07	Single Source One Shot Reenactment using Weighted motion From Paired Feature Points	Soumya Tripathy et.al.	2104.03117	null
2021-03-22	PriorityCut: Occlusion-guided Regularization for Warp-based Image Animation	Wai Ting Cheung et.al.	2103.11600	null
2020-12-01	Ultra-low bitrate video conferencing using deep image animation	Goluck Konuko et.al.	2012.00346	null
2020-10-01	First Order Motion Model for Image Animation	Aliaksandr Siarohin et.al.	2003.00196	null
2020-08-27	Deep Spatial Transformation for Pose-Guided Person Image Generation and Animation	Yurui Ren et.al.	2008.12606	null
2019-08-30	Animating Arbitrary Objects via Deep Motion Transfer	Aliaksandr Siarohin et.al.	1812.08861	null
2019-07-01	Style Generator Inversion for Image Enhancement and Animation	Aviv Gabbay et.al.	1906.11880	null
2018-10-09	3D model silhouette-based tracking in depth images for puppet suit dynamic video-mapping	Guillaume Caron et.al.	1810.03956	null
2018-06-24	A Design of FPGA Based Small Animal PET Real Time Digital Signal Processing and Correction Logic	Jiaming Lu et.al.	1806.09117	null
2018-01-31	RAPTOR I: Time-dependent radiative transfer in arbitrary spacetimes	Thomas Bronzwaer et.al.	1801.10452	null
2016-06-23	Gender and Interest Targeting for Sponsored Post Advertising at Tumblr	Mihajlo Grbovic et.al.	1606.07189	null
2015-03-16	Use of Effective Audio in E-learning Courseware	Kisor Ray et.al.	1503.04837	null
2015-02-04	Multimedia-Video for Learning	Kah Hean Chua et.al.	1502.01090	null
2013-01-25	Measurements of Martian Dust Devil Winds with HiRISE	David S. Choi et.al.	1301.6130	null
2010-01-04	Tutoring System for Dance Learning	Rajkumar Kannan et.al.	1001.0440	null

Notes:

We have modified the sorting rule of the above table to prioritize papers based on the time of their latest update rather than their initial publication date. If an article has been recently modified, it will appear earlier in the list.
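
As a minimal sketch of this sorting rule, the snippet below orders records newest-first by their latest update date instead of their initial publication date. The record layout and the field names `published` and `updated` are assumptions made for illustration; they are not taken from the project's scrape code.

```python
from datetime import date

# Hypothetical records: the field names are illustrative assumptions,
# not the schema used by the project's scrape code.
papers = [
    {"title": "Paper A", "published": date(2023, 1, 5), "updated": date(2025, 3, 1)},
    {"title": "Paper B", "published": date(2024, 6, 1), "updated": date(2024, 6, 1)},
    {"title": "Paper C", "published": date(2022, 2, 2), "updated": date(2022, 2, 2)},
]


def sort_by_latest_update(records):
    """Return the records ordered newest-first by their last update date."""
    return sorted(records, key=lambda r: r["updated"], reverse=True)


ordered = sort_by_latest_update(papers)
titles = [p["title"] for p in ordered]
print(titles)
```

Here "Paper A" was first published earliest of the two recent entries but revised most recently, so it moves to the top of the list. This matches rows in the table whose date is much later than their arXiv identifier suggests.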

Features added:

Support for a more reliable text parser. Link
Support for rich markdown format (better at parsing experimental tables). Link