Contributors Forks Stargazers Issues

Talking-Face Research Papers

Automatically Updated on 2025.10.01

Current Search Keywords: Talking Face, Talking Head, Visual Dubbing, Face Genertation, Lip Sync, Talker, Portrait, Talking Video, Head Synthesis, Face Reenactment, Wav2Lip, Talking Avatar, Lip Generation, Lip-Synchronization, Portrait Animation, Facial Animation, Lip Expert

If you have any other keywords, please feel free to let us know :)

Web Page (Scrape Code)

Talking Face

Publish Date Title Authors PDF Code
2025-09-30 3DiFACE: Synthesizing and Editing Holistic 3D Facial Animation Balamurugan Thambiraja et.al. 2509.26233 null
2025-09-26 StableDub: Taming Diffusion Prior for Generalized and Efficient Visual Dubbing Liyang Chen et.al. 2509.21887 null
2025-09-25 Unlocking Financial Insights: An advanced Multimodal Summarization with Multimodal Output Framework for Financial Advisory Videos Sarmistha Das et.al. 2509.20961 null
2025-09-24 KSDiff: Keyframe-Augmented Speech-Aware Dual-Path Diffusion for Facial Animation Tianle Lyu et.al. 2509.20128 null
2025-09-24 Comparative Study of Subjective Video Quality Assessment Test Methods in Crowdsourcing for Varied Use Cases Babak Naderi et.al. 2509.20118 null
2025-09-24 SynchroRaMa : Lip-Synchronized and Emotion-Aware Talking Face Generation via Multi-Modal Emotion Embedding Phyo Thet Yee et.al. 2509.19965 null
2025-09-24 Talking Head Generation via AU-Guided Landmark Prediction Shao-Yu Chang et.al. 2509.19749 null
2025-09-23 Audio-Driven Universal Gaussian Head Avatars Kartik Teotia et.al. 2509.18924 null
2025-09-22 “I don’t like my avatar”: Investigating Human Digital Doubles Siyi Liu et.al. 2509.17748 null
2025-09-22 Stable Video-Driven Portraits Mallikarjun B. R. et.al. 2509.17476 null
2025-09-21 Beat on Gaze: Learning Stylized Generation of Gaze and Head Dynamics Chengwei Shi et.al. 2509.17168 null
2025-09-21 PGSTalker: Real-Time Audio-Driven Talking Head Generation via 3D Gaussian Splatting with Pixel-Aware Density Control Tianheng Zhu et.al. 2509.16922 null
2025-09-20 Follow-Your-Emoji-Faster: Towards Efficient, Fine-Controllable, and Expressive Freestyle Portrait Animation Yue Ma et.al. 2509.16630 null
2025-09-17 Kling-Avatar: Grounding Multimodal Instructions for Cascaded Long-Duration Avatar Animation Synthesis Yikang Ding et.al. 2509.09595 null
2025-09-16 A Lightweight Pipeline for Noisy Speech Voice Cloning and Accurate Lip Sync Synthesis Javeria Amir et.al. 2509.12831 null
2025-09-15 AvatarSync: Rethinking Talking-Head Animation through Autoregressive Perspective Yuchen Deng et.al. 2509.12052 null
2025-09-10 Bitrate-Controlled Diffusion for Disentangling Motion and Content in Video Xiao Li et.al. 2509.08376 null
2025-09-09 PanoLAM: Large Avatar Model for Gaussian Full-Head Synthesis from One-shot Unposed Image Peng Li et.al. 2509.07552 null
2025-09-04 Durian: Dual Reference-guided Portrait Animation with Attribute Transfer Hyunsoo Cha et.al. 2509.04434 null
2025-08-28 EmoCAST: Emotional Talking Portrait via Emotive Text Description Yiguo Jiang et.al. 2508.20615 null
2025-08-27 InfinityHuman: Towards Long-Term Audio-Driven Human Xiaodi Li et.al. 2508.20210 null
2025-08-27 Improving Generalization in Deepfake Detection with Face Foundation Models and Metric Learning Stelios Mylonas et.al. 2508.19730 null
2025-08-26 OmniHuman-1.5: Instilling an Active Mind in Avatars via Cognitive Simulation Jianwen Jiang et.al. 2508.19209 null
2025-08-26 PanoHair: Detailed Hair Strand Synthesis on Volumetric Heads Shashikant Verma et.al. 2508.18944 null
2025-08-26 Wan-S2V: Audio-Driven Cinematic Video Generation Xin Gao et.al. 2508.18621 null
2025-08-25 Lightning Fast Caching-based Parallel Denoising Prediction for Accelerating Talking Head Generation Jianzhi Long et.al. 2509.00052 null
2025-08-25 EAI-Avatar: Emotion-Aware Interactive Talking Head Generation Haijie Yang et.al. 2508.18337 null
2025-08-22 Audio2Face-3D: Audio-driven Realistic Facial Animation For Digital Avatars NVIDIA et.al. 2508.16401 null
2025-08-20 D^3-Talker: Dual-Branch Decoupled Deformation Fields for Few-Shot 3D Talking Head Synthesis Yuhang Guo et.al. 2508.14449 null
2025-08-20 Taming Transformer for Emotion-Controllable Talking Face Generation Ziqi Zhang et.al. 2508.14359 null
2025-08-19 TalkVid: A Large-Scale Diversified Dataset for Audio-Driven Talking Head Synthesis Shunian Chen et.al. 2508.13618 null
2025-08-19 EDTalk++: Full Disentanglement for Controllable Talking Head Synthesis Shuai Tan et.al. 2508.13442 null
2025-08-18 Human Feedback Driven Dynamic Speech Emotion Recognition Ilya Fedorov et.al. 2508.14920 null
2025-08-17 CEM-Net: Cross-Emotion Memory Network for Emotional Talking Face Generation Kangyi Wu et.al. 2508.12368 null
2025-08-16 RealTalk: Realistic Emotion-Aware Lifelike Talking-Head Synthesis Wenqing Wang et.al. 2508.12163 null
2025-08-16 SimInterview: Transforming Business Education through Large Language Model-Based Simulated Multilingual Interview Training System Truong Thanh Hung Nguyen et.al. 2508.11873 null
2025-08-15 FantasyTalking2: Timestep-Layer Adaptive Preference Optimization for Audio-Driven Portrait Animation MengChao Wang et.al. 2508.11255 null
2025-08-14 HM-Talker: Hybrid Motion Modeling for High-Fidelity Talking Head Synthesis Shiyu Liu et.al. 2508.10566 null
2025-08-14 M2DAO-Talker: Harmonizing Multi-granular Motion Decoupling and Alternating Optimization for Talking-head Generation Kui Jiang et.al. 2507.08307 null
2025-08-14 MEDTalk: Multimodal Controlled 3D Facial Animation with Dynamic Emotions by Disentangled Embedding Chang Liu et.al. 2507.06071 null
2025-08-13 LIA-X: Interpretable Latent Portrait Animator Yaohui Wang et.al. 2508.09959 null
2025-08-12 Preview WB-DH: Towards Whole Body Digital Human Bench for the Generation of Whole-body Talking Avatar Videos Chaoyi Wang et.al. 2508.08891 null
2025-08-11 Learning Phonetic Context-Dependent Viseme for Enhancing Speech-Driven 3D Facial Animation Hyung Kyu Kim et.al. 2507.20568 null
2025-08-10 KLASSify to Verify: Audio-Visual Deepfake Detection Using SSL-based Audio and Handcrafted Visual Features Ivan Kukanov et.al. 2508.07337 null
2025-08-08 MotionSwap Om Patil et.al. 2508.06430 null
2025-08-08 MoDA: Multi-modal Diffusion Architecture for Talking Head Generation Xinyang Li et.al. 2507.03256 null
2025-08-07 Evaluation of a Sign Language Avatar on Comprehensibility, User Experience \& Acceptability Fenya Wasserroth et.al. 2508.05358 null
2025-08-07 RAP: Real-time Audio-driven Portrait Animation with Video Diffusion Transformer Fangyu Du et.al. 2508.05115 null
2025-08-07 UniTalker: Conversational Speech-Visual Synthesis Yifan Hu et.al. 2508.04585 null
2025-08-07 AudioGen-Omni: A Unified Multimodal Diffusion Transformer for Video-Synchronized Audio, Speech, and Song Generation Le Wang et.al. 2508.00733 null
2025-08-06 MienCap: Realtime Performance-Based Facial Animation with Live Mood Dynamics Ye Pan et.al. 2508.04687 null
2025-08-06 READ: Real-time and Efficient Asynchronous Diffusion for Audio-driven Talking Head Generation Haotian Wang et.al. 2508.03457 null
2025-08-06 Disentangle Identity, Cooperate Emotion: Correlation-Aware Emotional Talking Portrait Generation Weipeng Tan et.al. 2504.18087 null
2025-08-05 Multi-human Interactive Talking Dataset Zeyu Zhu et.al. 2508.03050 null
2025-08-04 X-Actor: Emotional and Expressive Long-Range Portrait Acting from Audio Chenxu Zhang et.al. 2508.02944 null
2025-08-04 Text2Lip: Progressive Lip-Synced Talking Face Generation from Text via Viseme-Guided Rendering Xu Wang et.al. 2508.02362 null
2025-08-04 Is It Really You? Exploring Biometric Verification Scenarios in Photorealistic Talking-Head Avatar Videos Laura Pedrouzo-Rodriguez et.al. 2508.00748 null
2025-07-31 Who is a Better Talker: Subjective and Objective Quality Assessment for AI-Generated Talking Heads Yingjie Zhou et.al. 2507.23343 null
2025-07-30 X-NeMo: Expressive Neural Motion Reenactment via Disentangled Latent Attention Xiaochen Zhao et.al. 2507.23143 null
2025-07-30 Robust Deepfake Detection for Electronic Know Your Customer Systems Using Registered Images Takuma Amada et.al. 2507.22601 null
2025-07-29 DiTalker: A Unified DiT-based Framework for High-Quality and Speaking Styles Controllable Portrait Animation He Feng et.al. 2508.06511 null
2025-07-29 JWB-DH-V1: Benchmark for Joint Whole-Body Talking Avatar and Speech Generation Version 1 Xinhan Di et.al. 2507.20987 null
2025-07-28 Mask-Free Audio-driven Talking Face Generation for Enhanced Visual Quality and Identity Preservation Dogucan Yaman et.al. 2507.20953 null
2025-07-28 MemoryTalker: Personalized Speech-Driven 3D Facial Animation via Audio-Guided Stylization Hyung Kyu Kim et.al. 2507.20562 null
2025-07-28 JOLT3D: Joint Learning of Talking Heads and 3DMM Parameters with Application to Lip-Sync Sungjoon Park et.al. 2507.20452 null
2025-07-25 Face2VoiceSync: Lightweight Face-Voice Consistency for Text-Driven Talking Face Generation Fang Kang et.al. 2507.19225 null
2025-07-24 Tiny is not small enough: High-quality, low-resource facial animation models through hybrid knowledge distillation Zhen Han et.al. 2507.18352 null
2025-07-24 Celeb-DF++: A Large-scale Challenging Video DeepFake Benchmark for Generalizable Forensics Yuezun Li et.al. 2507.18015 null
2025-07-22 Livatar-1: Real-Time Talking Heads Generation with Tailored Flow Matching Haiyang Liu et.al. 2507.18649 null
2025-07-22 Navigating Large-Pose Challenge for High-Fidelity Face Reenactment with Video Diffusion Model Mingtao Guo et.al. 2507.16341 null
2025-07-21 VisualSpeaker: Visually-Guided 3D Avatar Lip Synthesis Alexandre Symeonidis-Herzig et.al. 2507.06060 null
2025-07-17 FantasyPortrait: Enhancing Multi-Character Portrait Animation with Expression-Augmented Diffusion Transformers Qiang Wang et.al. 2507.12956 null
2025-07-17 ATL-Diff: Audio-Driven Talking Head Generation with Early Landmarks-Guide Noise Diffusion Hoang-Son Vo et.al. 2507.12804 null
2025-07-17 Think-Before-Draw: Decomposing Emotion Semantics & Fine-Grained Controllable Expressive Talking Head Generation Hanlei Shi et.al. 2507.12761 null
2025-07-17 Cross-Modal Watermarking for Authentic Audio Recovery and Tamper Localization in Synthesized Audiovisual Forgeries Minyoung Kim et.al. 2507.12723 null
2025-07-16 AU-Blendshape for Fine-grained Stylized 3D Facial Expression Manipulation Hao Li et.al. 2507.12001 null
2025-07-15 Model See Model Do: Speech-Driven Facial Animation with Style Control Yifang Pan et.al. 2505.01319 null
2025-07-11 Detecting Deepfake Talking Heads from Facial Biometric Anomalies Justin D. Norman et.al. 2507.08917 null
2025-07-10 GGTalker: Talking Head Systhesis with Generalizable Gaussian Priors and Identity-Specific Adaptation Wentao Hu et.al. 2506.21513 null
2025-07-07 MoDiT: Learning Highly Consistent 3D Motion Coefficients with Diffusion Transformer for Talking Head Generation Yucheng Wang et.al. 2507.05092 null
2025-07-05 EchoMimicV3: 1.3B Parameters are All You Need for Unified Multi-Modal and Multi-Task Human Animation Rang Meng et.al. 2507.03905 null
2025-07-03 CanonSwap: High-Fidelity and Consistent Video Face Swapping via Canonical Space Modulation Xiangyang Luo et.al. 2507.02691 null
2025-07-02 FixTalk: Taming Identity Leakage for High-Quality Talking Head Generation in Extreme Cases Shuai Tan et.al. 2507.01390 null
2025-07-01 ICME 2025 Grand Challenge on Video Super-Resolution for Video Conferencing Babak Naderi et.al. 2506.12269 null
2025-06-30 JAM-Flow: Joint Audio-Motion Synthesis with Flow Matching Mingi Kwon et.al. 2506.23552 null
2025-06-27 MirrorMe: Towards Realtime and High Fidelity Audio-Driven Halfbody Animation Dechao Meng et.al. 2506.22065 null
2025-06-27 Few-Shot Identity Adaptation for 3D Talking Heads via Global Gaussian Field Hong Nie et.al. 2506.22044 null
2025-06-27 RiverEcho: Real-Time Interactive Digital System for Ancient Yellow River Culture Haofeng Wang et.al. 2506.21865 null
2025-06-24 Bind-Your-Avatar: Multi-Talking-Character Video Generation with Dynamic 3D-mask-based Embedding Router Yubo Huang et.al. 2506.19833 null
2025-06-23 Advancing Talking Head Generation: A Comprehensive Survey of Multi-Modal Methodologies, Datasets, Evaluation Metrics, and Loss Functions Vineet Kumar Rakesh et.al. 2507.02900 null
2025-06-23 OmniAvatar: Efficient Audio-Driven Avatar Video Generation with Adaptive Body Animation Qijun Gan et.al. 2506.18866 null
2025-06-23 CGS-GAN: 3D Consistent Gaussian Splatting GANs for High Resolution Human Head Synthesis Florian Barthel et.al. 2505.17590 null
2025-06-17 SyncTalk++: High-Fidelity and Efficient Synchronized Talking Heads Synthesis Using Gaussian Splatting Ziqiao Peng et.al. 2506.14742 null
2025-06-17 Compressed Video Super-Resolution based on Hierarchical Encoding Yuxuan Jiang et.al. 2506.14381 null
2025-06-16 Audio-Visual Driven Compression for Low-Bitrate Talking Head Videos Riku Takahashi et.al. 2506.13419 null
2025-06-15 iDiT-HOI: Inpainting-based Hand Object Interaction Reenactment via Video Diffusion Transformer Zhelun Shen et.al. 2506.12847 null
2025-06-10 HunyuanVideo-HOMA: Generic Human-Object Interaction in Multimodal Driven Human Animation Ziyao Huang et.al. 2506.08797 null
2025-06-03 NTIRE 2025 XGC Quality Assessment Challenge: Methods and Results Xiaohong Liu et.al. 2506.02875 null
2025-06-03 OmniTalker: One-shot Real-time Text-Driven Talking Audio-Video Generation With Multimodal Style Mimicking Zhongjian Wang et.al. 2504.02433 null
2025-06-02 Cocktail-Party Audio-Visual Speech Recognition Thai-Binh Nguyen et.al. 2506.02178 null
2025-06-02 Low-Rank Head Avatar Personalization with Registers Sai Tanmay Reddy Chakkera et.al. 2506.01935 null
2025-06-02 Silence is Golden: Leveraging Adversarial Examples to Nullify Audio Control in LDM-based Talking-Head Generation Yuan Gan et.al. 2506.01591 null
2025-06-01 SkyReels-Audio: Omni Audio-Conditioned Talking Portraits in Video Diffusion Transformers Zhengcong Fei et.al. 2506.00830 null
2025-05-30 TalkingHeadBench: A Multi-Modal Benchmark & Analysis of Talking-Head DeepFake Detection Xinqi Xiong et.al. 2505.24866 null
2025-05-29 Hallo4: High-Fidelity Dynamic Portrait Animation via Direct Preference Optimization and Temporal Motion Modulation Jiahao Cui et.al. 2505.23525 null
2025-05-29 Video Editing for Audio-Visual Dubbing Binyamin Manela et.al. 2505.23406 null
2025-05-29 Wav2Sem: Plug-and-Play Audio Semantic Decoupling for 3D Speech-Driven Facial Animation Hao Li et.al. 2505.23290 null
2025-05-29 MMGT: Motion Mask Guided Two-Stage Network for Co-Speech Gesture Video Generation Siyuan Wang et.al. 2505.23120 null
2025-05-28 Let Them Talk: Audio-Driven Multi-Person Conversational Video Generation Zhe Kong et.al. 2505.22647 null
2025-05-28 Tell me Habibi, is it Real or Fake? Kartik Kuckreja et.al. 2505.22581 null
2025-05-28 Neural Face Skinning for Mesh-agnostic Facial Expression Cloning Sihun Cha et.al. 2505.22416 null
2025-05-28 FaceEditTalker: Interactive Talking Head Generation with Facial Attribute Editing Guanwen Feng et.al. 2505.22141 null
2025-05-28 RESOUND: Speech Reconstruction from Silent Videos via Acoustic-Semantic Decomposed Modeling Long-Khanh Pham et.al. 2505.22024 null
2025-05-27 OmniSync: Towards Universal Lip Synchronization via Diffusion Transformers Ziqiao Peng et.al. 2505.21448 null
2025-05-26 Total-Editing: Head Avatar with Editable Appearance, Motion, and Lighting Yizhou Zhao et.al. 2505.20582 null
2025-05-26 DualTalk: Dual-Speaker Interaction for 3D Talking Head Conversations Ziqiao Peng et.al. 2505.18096 null
2025-05-22 Supervising 3D Talking Head Avatars with Analysis-by-Audio-Synthesis Radek Daněček et.al. 2504.13386 null
2025-05-14 Test-Time Augmentation for Pose-invariant Face Recognition Jaemin Jung et.al. 2505.09256 null
2025-05-10 VTutor: An Animated Pedagogical Agent SDK that Provide Real Time Multi-Model Feedback Eason Chen et.al. 2505.06676 null
2025-05-10 OT-Talk: Animating 3D Talking Head with Optimal Transportation Xinmu Wang et.al. 2505.01932 null
2025-05-10 MagicPortrait: Temporally Consistent Face Reenactment with 3D Geometric Guidance Mengting Wei et.al. 2504.21497 null
2025-05-08 OXSeg: Multidimensional attention UNet-based lip segmentation using semi-supervised lip contours Hanie Moghaddasi et.al. 2505.05531 null
2025-05-03 GenSync: A Generalized Talking Head Framework for Audio-driven Multi-Subject Lip-Sync using 3D Gaussian Splatting Anushka Agarwal et.al. 2505.01928 null
2025-05-02 FlowDubber: Movie Dubbing with LLM-based Semantic-aware Learning and Flow Matching based Voice Enhancing Gaoxiang Cong et.al. 2505.01263 null
2025-05-01 KeySync: A Robust Approach for Leakage-free Lip Synchronization in High Resolution Antoni Bigata et.al. 2505.00497 null
2025-04-29 IM-Portrait: Learning 3D-aware Video Diffusion for Photorealistic Talking Heads from Monocular Videos Yuan Li et.al. 2504.19165 null
2025-04-27 Generative AI for Character Animation: A Comprehensive Survey of Techniques, Applications, and Future Directions Mohammad Mahdi Abootorabi et.al. 2504.19056 null
2025-04-26 Audio-Driven Talking Face Video Generation with Joint Uncertainty Learning Yifan Xie et.al. 2504.18810 null
2025-04-14 Controllable Expressive 3D Facial Animation via Diffusion in a Unified Multimodal Space Kangwei Liu et.al. 2506.10007 null
2025-04-14 SpinMeRound: Consistent Multi-View Identity Generation Using Diffusion Models Stathis Galanakis et.al. 2504.10716 null
2025-04-10 ChildlikeSHAPES: Semantic Hierarchical Region Parsing for Animating Figure Drawings Astitva Srivastava et.al. 2504.08022 null
2025-04-08 VideoSPatS: Video SPatiotemporal Splines for Disentangled Occlusion, Appearance and Motion Modeling and Editing Juan Luis Gonzalez Bello et.al. 2504.07146 null
2025-04-08 SE4Lip: Speech-Lip Encoder for Talking Head Synthesis to Solve Phoneme-Viseme Alignment Ambiguity Yihuan Huang et.al. 2504.05803 null
2025-04-08 Exploiting Temporal Audio-Visual Correlation Embedding for Audio-Driven One-Shot Talking Head Animation Zhihua Xu et.al. 2504.05746 null
2025-04-08 Contrastive Decoupled Representation Learning and Regularization for Speech-Preserving Facial Expression Manipulation Tianshui Chen et.al. 2504.05672 null
2025-04-07 Audio-visual Controlled Video Diffusion with Masked Selective State Spaces Modeling for Natural Talking Head Generation Fa-Ting Hong et.al. 2504.02542 null
2025-04-06 FluentLip: A Phonemes-Based Two-stage Approach for Audio-Driven Lip Synthesis with Optical Flow Consistency Shiyan Liu et.al. 2504.04427 null
2025-04-04 A Human Digital Twin Architecture for Knowledge-based Interactions and Context-Aware Conversations Abdul Mannan Mohammed et.al. 2504.03147 null
2025-04-03 VoiceCraft-Dub: Automated Video Dubbing with Neural Codec Language Models Kim Sung-Bin et.al. 2504.02386 null
2025-04-02 Detecting Lip-Syncing Deepfakes: Vision Temporal Transformer for Analyzing Mouth Inconsistencies Soumyya Kanti Datta et.al. 2504.01470 link
2025-04-02 EmoHead: Emotional Talking Head via Manipulating Semantic Expression Parameters Xuli Shen et.al. 2503.19416 null
2025-04-01 Monocular and Generalizable Gaussian Talking Head Animation Shengjie Gong et.al. 2504.00665 null
2025-03-31 Perceptually Accurate 3D Talking Head Generation: New Definitions, Speech-Mesh Representation, and Evaluation Metrics Lee Chae-Yeon et.al. 2503.20308 null
2025-03-30 MoCha: Towards Movie-Grade Talking Character Synthesis Cong Wei et.al. 2503.23307 null
2025-03-29 STSA: Spatial-Temporal Semantic Alignment for Visual Dubbing Zijun Ding et.al. 2503.23039 link
2025-03-28 Audio-Plane: Audio Factorization Plane Gaussian Splatting for Real-Time Talking Head Synthesis Shuai Shen et.al. 2503.22605 null
2025-03-28 Follow Your Motion: A Generic Temporal Consistency Portrait Editing Framework with Trajectory Guidance Haijie Yang et.al. 2503.22225 null
2025-03-27 ChatAnyone: Stylized Real-time Portrait Video Generation with Hierarchical Motion Diffusion Model Jinwei Qi et.al. 2503.21144 null
2025-03-26 Dual Audio-Centric Modality Coupling for Talking Head Generation Ao Fu et.al. 2503.22728 null
2025-03-25 AudCast: Audio-Driven Human Video Generation by Cascaded Diffusion Transformers Jiazhi Guan et.al. 2503.19824 null
2025-03-25 MVPortrait: Text-Guided Motion and Emotion Control for Multi-view Vivid Portrait Animation Yukang Lin et.al. 2503.19383 null
2025-03-25 HunyuanPortrait: Implicit Condition Control for Enhanced Portrait Animation Zunnan Xu et.al. 2503.18860 null
2025-03-25 Re-HOLD: Video Hand Object Interaction Reenactment via adaptive Layout-instructed Diffusion Model Yingying Fan et.al. 2503.16942 null
2025-03-24 DisentTalk: Cross-lingual Talking Face Generation via Semantic Disentangled Diffusion Model Kangwei Liu et.al. 2503.19001 null
2025-03-24 Teller: Real-Time Streaming Audio-Driven Portrait Animation with Autoregressive Motion Generation Dingcheng Zhen et.al. 2503.18429 null
2025-03-23 DiffusionTalker: Efficient and Compact Speech-Driven 3D Talking Head via Personalizer-Guided Distillation Peng Chen et.al. 2503.18159 link
2025-03-21 TaoAvatar: Real-Time Lifelike Full-Body Talking Avatars for Augmented Reality via 3D Gaussian Splatting Jianchuan Chen et.al. 2503.17032 null
2025-03-21 From Faces to Voices: Learning Hierarchical Representations for High-quality Video-to-Speech Ji-Hoon Kim et.al. 2503.16956 null
2025-03-20 UniSync: A Unified Framework for Audio-Visual Synchronization Tao Feng et.al. 2503.16357 null
2025-03-20 PC-Talk: Precise Facial Animation Control for Audio-Driven Talking Face Generation Baiqin Wang et.al. 2503.14295 null
2025-03-19 DiffPortrait360: Consistent Portrait Diffusion for 360 View Synthesis Yuming Gu et.al. 2503.15667 link
2025-03-17 SyncDiff: Diffusion-based Talking Head Synthesis with Bottlenecked Temporal Visual Prior for Improved Synchronization Xulin Fan et.al. 2503.13371 null
2025-03-17 Unlock Pose Diversity: Accurate and Efficient Implicit Keypoint-based Spatiotemporal Diffusion for Audio-driven Talking Portrait Chaolong Yang et.al. 2503.12963 link
2025-03-16 Versatile Multimodal Controls for Whole-Body Talking Human Animation Zheng Qin et.al. 2503.08714 null
2025-03-14 Cafe-Talk: Generating 3D Talking Face Animation with Multimodal Coarse- and Fine-grained Control Hejia Chen et.al. 2503.14517 null
2025-03-14 EmoDiffusion: Enhancing Emotional 3D Facial Animation with Latent Diffusion Models Yixuan Zhang et.al. 2503.11028 null
2025-03-12 StyleSpeaker: Audio-Enhanced Fine-Grained Style Modeling for Speech-Driven 3D Facial Animation An Yang et.al. 2503.09852 null
2025-03-12 Bidirectional Learned Facial Animation Codec for Low Bitrate Talking Head Videos Riku Takahashi et.al. 2503.09787 null
2025-03-09 Removing Averaging: Personalized Lip-Sync Driven Characters Based on Identity Adapter Yanyu Zhu et.al. 2503.06397 null
2025-03-07 MagicInfinite: Generating Infinite Talking Videos with Your Words and Voice Hongwei Yi et.al. 2503.05978 null
2025-03-06 FREAK: Frequency-modulated High-fidelity and Real-time Audio-driven Talking Portrait Synthesis Ziqi Ni et.al. 2503.04067 null
2025-03-03 KeyFace: Expressive Audio-Driven Facial Animation for Long Sequences via KeyFrame Interpolation Antoni Bigata et.al. 2503.01715 null
2025-03-02 FaceShot: Bring Any Character into Life Junyao Gao et.al. 2503.00740 null
2025-03-01 Towards High-fidelity 3D Talking Avatar with Personalized Dynamic Texture Xuanchen Li et.al. 2503.00495 null
2025-02-28 Two-Stream Spatial-Temporal Transformer Framework for Person Identification via Natural Conversational Keypoints Masoumeh Chapariniya et.al. 2502.20803 null
2025-02-28 ARTalk: Speech-Driven 3D Head Animation via Autoregressive Model Xuangeng Chu et.al. 2502.20323 null
2025-02-27 InsTaG: Learning Personalized 3D Talking Head from Few-Second Video Jiahe Li et.al. 2502.20387 link
2025-02-27 High-Fidelity Relightable Monocular Portrait Animation with Lighting-Controllable Video Diffusion Model Mingtao Guo et.al. 2502.19894 link
2025-02-26 FLAP: Fully-controllable Audio-driven Portrait Video Generation through 3D head conditioned diffusion mode Lingzhou Mu et.al. 2502.19455 null
2025-02-24 Dimitra: Audio-driven Diffusion model for Expressive Talking Head Generation Baptiste Chopin et.al. 2502.17198 null
2025-02-20 NeRF-3DTalker: Neural Radiance Field with 3D Prior Aided Audio Disentanglement for Talking Head Synthesis Xiaoxing Liu et.al. 2502.14178 null
2025-02-18 AV-Flow: Transforming Text to Audio-Visual Human-like Interactions Aggelina Chatziagapi et.al. 2502.13133 null
2025-02-17 SayAnything: Audio-Driven Lip Synchronization with Conditional Video Diffusion Junxian Ma et.al. 2502.11515 null
2025-02-15 SkyReels-A1: Expressive Portrait Animation in Video Diffusion Transformers Di Qiu et.al. 2502.10841 link
2025-02-13 Long-Term TalkingFace Generation via Motion-Prior Conditional Diffusion Model Fei Shen et.al. 2502.09533 null
2025-02-13 VTutor: An Open-Source SDK for Generative AI-Powered Animated Pedagogical Agents with Multi-Media Output Eason Chen et.al. 2502.04103 null
2025-02-11 Playmate: Flexible Control of Portrait Animation via 3D-Implicit Space Guided Diffusion Xingpei Ma et.al. 2502.07203 null
2025-02-07 Towards Multimodal Empathetic Response Generation: A Rich Text-Speech-Vision Avatar-based Benchmark Han Zhang et.al. 2502.04976 null
2025-02-02 EmoTalkingGaussian: Continuous Emotion-conditioned Talking Head Synthesis Junuk Cha et.al. 2502.00654 null
2025-01-24 SyncAnimation: A Real-Time End-to-End Framework for Audio-Driven Human Pose and Talking Head Animation Yujian Liu et.al. 2501.14646 null
2025-01-21 A Lightweight and Interpretable Deepfakes Detection Framework Muhammad Umar Farooq et.al. 2501.11927 null
2025-01-18 EMO2: End-Effector Guided Audio-Driven Avatar Video Generation Linrui Tian et.al. 2501.10687 null
2025-01-17 TalkingEyes: Pluralistic Speech-Driven 3D Eye Gaze Animation Yixiang Zhuang et.al. 2501.09921 null
2025-01-15 Joint Learning of Depth and Appearance for Portrait Image Animation Xinya Ji et.al. 2501.08649 null
2025-01-15 Make-A-Character 2: Animatable 3D Character Generation From a Single Image Lin Liu et.al. 2501.07870 null
2025-01-09 Towards Dynamic Neural Communication and Speech Neuroprosthesis Based on Viseme Decoding Ji-Ha Park et.al. 2501.14790 null
2025-01-09 Identity-Preserving Video Dubbing Using Motion Warping Runzhen Liu et.al. 2501.04586 null
2025-01-09 MoEE: Mixture of Emotion Experts for Audio-Driven Portrait Animation Huaize Liu et.al. 2501.01808 null
2025-01-07 Generating and Detecting Various Types of Fake Image and Audio Content: A Review of Modern Deep Learning Technologies and Tools Arash Dehghani et.al. 2501.06227 null
2025-01-07 VideoAnydoor: High-fidelity Video Object Insertion with Precise Motion Control Yuanpeng Tu et.al. 2501.01427 null
2025-01-06 RDD4D: 4D Attention-Guided Road Damage Detection And Classification Asma Alkalbani et.al. 2501.02822 link
2025-01-06 Takeaways from Applying LLM Capabilities to Multiple Conversational Avatars in a VR Pilot Study Mykola Maslych et.al. 2501.00168 null
2025-01-03 JoyGen: Audio-Driven 3D Depth-Aware Talking-Face Video Editing Qili Wang et.al. 2501.01798 link
2024-12-28 DEGSTalk: Decomposed Per-Embedding Gaussian Fields for Hair-Preserving Talking Face Synthesis Kaijun Deng et.al. 2412.20148 link
2024-12-26 UniAvatar: Taming Lifelike Audio-Driven Talking Head Generation with Comprehensive Motion and Lighting Control Wenzhang Sun et.al. 2412.19860 null
2024-12-26 Generating Editable Head Avatars with 3D Gaussian GANs Guohao Li et.al. 2412.19149 link
2024-12-23 FaceLift: Single Image to 3D Head with View Generation and GS-LRM Weijie Lyu et.al. 2412.17812 null
2024-12-22 FADA: Fast Diffusion Avatar Synthesis with Mixed-Supervised Multi-CFG Distillation Tianyun Zhong et.al. 2412.16915 null
2024-12-18 Joint Co-Speech Gesture and Expressive Talking Face Generation using Diffusion with Adapters Steven Hogue et.al. 2412.14333 link
2024-12-18 GLCF: A Global-Local Multimodal Coherence Analysis Framework for Talking Face Generation Detection Xiaocan Chen et.al. 2412.13656 null
2024-12-18 Learning to Control an Android Robot Head for Facial Animation Marcel Heisler et.al. 2412.13641 null
2024-12-18 Real-time One-Step Diffusion-based Expressive Portrait Videos Generation Hanzhong Guo et.al. 2412.13479 link
2024-12-18 VQTalker: Towards Multilingual Talking Avatars through Facial Motion Tokenization Tao Liu et.al. 2412.09892 null
2024-12-16 Towards a Universal Synthetic Video Detector: From Face or Background Manipulations to Fully AI-Generated Content Rohit Kundu et.al. 2412.12278 null
2024-12-13 GoHD: Gaze-oriented and Highly Disentangled Portrait Animation with Rhythmic Poses and Realistic Expression Ziqi Zhou et.al. 2412.09296 link
2024-12-12 LatentSync: Audio Conditioned Latent Diffusion Models for Lip Sync Chunyu Li et.al. 2412.09262 link
2024-12-12 EmoDubber: Towards High Quality and Emotion Controllable Movie Dubbing Gaoxiang Cong et.al. 2412.08988 null
2024-12-11 PointTalk: Audio-Driven Dynamic Lip Point Cloud for 3D Gaussian-based Talking Head Synthesis Yifan Xie et.al. 2412.08504 null
2024-12-10 PortraitTalk: Towards Customizable One-Shot Audio-to-Talking Face Generation Fatemeh Nazarieh et.al. 2412.07754 null
2024-12-10 IF-MDM: Implicit Face Motion Diffusion Model for High-Fidelity Realtime Talking Head Generation Sejong Yang et.al. 2412.04000 null
2024-12-05 MEMO: Memory-Guided Diffusion for Expressive Talking Video Generation Longtao Zheng et.al. 2412.04448 null
2024-12-05 Hallo3: Highly Dynamic and Realistic Portrait Image Animation with Diffusion Transformer Networks Jiahao Cui et.al. 2412.00733 link
2024-12-04 SINGER: Vivid Audio-driven Singing Video Generation with Multi-scale Spectral Diffusion Model Yan Li et.al. 2412.03430 null
2024-12-02 One Shot, One Talk: Whole-body Talking Avatar from a Single Image Jun Xiang et.al. 2412.01106 null
2024-12-01 Synergizing Motion and Appearance: Multi-Scale Compensatory Codebooks for Talking Head Video Generation Shuling Zhao et.al. 2412.00719 null
2024-11-29 LokiTalk: Learning Fine-Grained and Generalizable Correspondences to Enhance NeRF-based Talking Head Synthesis Tianqi Li et.al. 2411.19525 null
2024-11-29 Ditto: Motion-Space Diffusion for Controllable Realtime Talking Head Synthesis Tianqi Li et.al. 2411.19509 null
2024-11-29 V2SFlow: Video-to-Speech Generation with Speech Decomposition and Rectified Flow Jeongsoo Choi et.al. 2411.19486 null
2024-11-26 Passive Deepfake Detection Across Multi-modalities: A Comprehensive Survey Hong-Hanh Nguyen-Le et.al. 2411.17911 null
2024-11-25 Sonic: Shifting Focus to Global Audio Perception in Portrait Animation Xiaozhong Ji et.al. 2411.16331 null
2024-11-25 ESARM: 3D Emotional Speech-to-Animation via Reward Model from Automatically-Ranked Demonstrations Xulong Zhang et.al. 2411.13089 null
2024-11-24 LetsTalk: Latent Diffusion Transformer for Talking Video Synthesis Haojie Zhang et.al. 2411.16748 null
2024-11-23 EmotiveTalk: Expressive Talking Head Generation through Audio Information Decoupling and Emotional Video Diffusion Haotian Wang et.al. 2411.16726 null
2024-11-23 ConsistentAvatar: Learning to Diffuse Fully Consistent Talking Head Avatar with Temporal Guidance Haijie Yang et.al. 2411.15436 null
2024-11-20 Comparative Analysis of Audio Feature Extraction for Real-Time Talking Portrait Synthesis Pegah Salehi et.al. 2411.13209 link
2024-11-20 JoyVASA: Portrait and Animal Image Animation with Diffusion-Based Audio-Driven Facial Dynamics and Head Motion Generation Xuyang Cao et.al. 2411.09209 link
2024-11-14 LES-Talker: Fine-Grained Emotion Editing for Talking Head Generation in Linear Emotion Space Guanwen Feng et.al. 2411.09268 null
2024-11-06 Large Generative Model-assisted Talking-face Semantic Communication System Feibo Jiang et.al. 2411.03876 null
2024-10-31 Stereo-Talker: Audio-driven 3D Human Synthesis with Prior-Guided Mixture-of-Experts Xiang Deng et.al. 2410.23836 null
2024-10-29 Multimodal Semantic Communication for Generative Audio-Driven Video Conferencing Haonan Tong et.al. 2410.22112 null
2024-10-24 Real-time 3D-aware Portrait Video Relighting Ziqi Cai et.al. 2410.18355 link
2024-10-21 Joker: Conditional 3D Head Synthesis with Extreme Facial Expressions Malte Prinzler et.al. 2410.16395 null
2024-10-18 Takin-ADA: Emotion Controllable Audio-Driven Animation with Canonical and Landmark Loss Optimization Bin Lin et.al. 2410.14283 null
2024-10-18 DAWN: Dynamic Frame Avatar with Non-autoregressive Diffusion Framework for Talking Head Video Generation Hanbo Cheng et.al. 2410.13726 link
2024-10-16 MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting Yue Zhang et.al. 2410.10122 link
2024-10-15 Titanic Calling: Low Bandwidth Video Conference from the Titanic Wreck Fevziye Irem Eyiokur et.al. 2410.11434 null
2024-10-15 MimicTalk: Mimicking a personalized and expressive 3D talking face in minutes Zhenhui Ye et.al. 2410.06734 null
2024-10-14 Character-aware audio-visual subtitling in context Jaesung Huh et.al. 2410.11068 null
2024-10-14 Beyond Fixed Topologies: Unregistered Training and Comprehensive Evaluation Metrics for 3D Talking Heads Federico Nocentini et.al. 2410.11041 null
2024-10-14 TALK-Act: Enhance Textural-Awareness for 2D Speaking Avatar Reenactment with Diffusion Model Jiazhi Guan et.al. 2410.10696 null
2024-10-14 Generative Human Video Compression with Multi-granularity Temporal Trajectory Factorization Shanzhi Yin et.al. 2410.10171 null
2024-10-10 MMHead: Towards Fine-grained Multi-modal 3D Facial Animation Sijing Wu et.al. 2410.07757 null
2024-10-09 FreeAvatar: Robust 3D Facial Animation Transfer by Learning an Expression Foundation Model Feng Qiu et.al. 2409.13180 null
2024-10-01 LaDTalk: Latent Denoising for Synthesizing Talking Head Videos with High Frequency Details Jian Yang et.al. 2410.00990 null
2024-09-29 Learning Frame-Wise Emotion Intensity for Audio-Driven Talking-Head Generation Jingyi Xu et.al. 2409.19501 null
2024-09-27 Diverse Code Query Learning for Speech-Driven Facial Animation Chunzhi Gu et.al. 2409.19143 null
2024-09-26 Stable Video Portraits Mirela Ostrek et.al. 2409.18083 null
2024-09-25 ProbTalk3D: Non-Deterministic Emotion Controllable Speech-Driven 3D Facial Animation Synthesis Using VQ-VAE Sichun Wu et.al. 2409.07966 link
2024-09-24 FastTalker: Jointly Generating Speech and Conversational Gestures from Text Zixin Guo et.al. 2409.16404 null
2024-09-23 FaceVid-1K: A Large-Scale High-Quality Multiracial Human Face Video Dataset Donglin Di et.al. 2410.07151 null
2024-09-23 MIMAFace: Face Animation via Motion-Identity Modulated Appearance Feature Learning Yue Han et.al. 2409.15179 null
2024-09-18 JEAN: Joint Expression and Audio-guided NeRF-based Talking Face Generation Sai Tanmay Reddy Chakkera et.al. 2409.12156 null
2024-09-18 GaussianHeads: End-to-End Learning of Drivable Gaussian Head Avatars from Coarse-to-fine Representations Kartik Teotia et.al. 2409.11951 null
2024-09-17 3DFacePolicy: Speech-Driven 3D Facial Animation with Diffusion Policy Xuanmeng Sha et.al. 2409.10848 null
2024-09-16 DreamHead: Learning Spatial-Temporal Correspondence via Hierarchical Diffusion for Audio-driven Talking Head Synthesis Fa-Ting Hong et.al. 2409.10281 null
2024-09-14 StyleTalk++: A Unified Framework for Controlling the Speaking Styles of Talking Heads Suzhen Wang et.al. 2409.09292 null
2024-09-11 DiffTED: One-shot Audio-driven TED Talk Video Generation with Diffusion-based Co-speech Gestures Steven Hogue et.al. 2409.07649 null
2024-09-11 EMOdiffhead: Continuously Emotional Control in Talking Head Generation via Diffusion Jian Zhang et.al. 2409.07255 null
2024-09-09 PersonaTalk: Bring Attention to Your Persona in Visual Dubbing Longhao Zhang et.al. 2409.05379 null
2024-09-09 KAN-Based Fusion of Dual-Domain for Audio-Driven Facial Landmarks Generation Hoang-Son Vo-Thanh et.al. 2409.05330 link
2024-09-05 SegTalker: Segmentation-based Talking Face Generation with Mask-guided Local Editing Lingyu Xiong et.al. 2409.03605 null
2024-09-05 SVP: Style-Enhanced Vivid Portrait Talking Head Diffusion Model Weipeng Tan et.al. 2409.03270 null
2024-09-04 PoseTalk: Text-and-Audio-based Pose Control and Motion Refinement for One-Shot Talking Head Generation Jun Ling et.al. 2409.02657 null
2024-09-02 KMTalk: Speech-Driven 3D Facial Animation with Key Motion Embedding Zhihao Xu et.al. 2409.01113 link
2024-08-28 Micro and macro facial expressions by driven animations in realistic Virtual Humans Rubens Halbig Montanha et.al. 2408.16110 null
2024-08-27 MegActor- $Σ$ : Unlocking Flexible Mixed-Modal Control in Portrait Animation with Diffusion Transformer Shurong Yang et.al. 2408.14975 null
2024-08-25 TalkLoRA: Low-Rank Adaptation for Speech-Driven Animation Jack Saunders et.al. 2408.13714 null
2024-08-23 G3FA: Geometry-guided GAN for Face Animation Alireza Javanmardi et.al. 2408.13049 null
2024-08-21 AutoDirector: Online Auto-scheduling Agents for Multi-sensory Composition Minheng Ni et.al. 2408.11564 null
2024-08-21 EmoFace: Emotion-Content Disentangled Speech-Driven 3D Talking Face with Mesh Attention Yihong Lin et.al. 2408.11518 null
2024-08-20 DEGAS: Detailed Expressions on Full-Body Gaussian Avatars Zhijing Shao et.al. 2408.10588 null
2024-08-18 FD2Talk: Towards Generalized Talking Head Generation with Facial Decoupled Diffusion Model Ziyu Yao et.al. 2408.09384 null
2024-08-18 Meta-Learning Empowered Meta-Face: Personalized Speaking Style Adaptation for Audio-Driven 3D Talking Face Animation Xukun Zhou et.al. 2408.09357 null
2024-08-18 S^3D-NeRF: Single-Shot Speech-Driven Neural Radiance Field for High Fidelity Talking Head Synthesis Dongze Li et.al. 2408.09347 null
2024-08-16 GLDiTalker: Speech-Driven 3D Facial Animation with Graph Latent Diffusion Transformer Yihong Lin et.al. 2408.01826 null
2024-08-14 Content and Style Aware Audio-Driven Facial Animation Qingju Liu et.al. 2408.07005 null
2024-08-12 DEEPTalk: Dynamic Emotion Embedding for Probabilistic Speech-Driven 3D Face Animation Jisoo Kim et.al. 2408.06010 null
2024-08-10 High-fidelity and Lip-synced Talking Face Synthesis via Landmark-based Diffusion Model Weizhi Zhong et.al. 2408.05416 null
2024-08-10 Style-Preserving Lip Sync via Audio-Aware Style Reference Weizhi Zhong et.al. 2408.05412 null
2024-08-09 DeepSpeak Dataset v1.0 Sarah Barrington et.al. 2408.05366 null
2024-08-06 ReSyncer: Rewiring Style-based Generator for Unified Audio-Visually Synced Facial Performer Jiazhi Guan et.al. 2408.03284 null
2024-08-03 Landmark-guided Diffusion Model for High-fidelity and Temporally Coherent Talking Head Generation Jintao Tan et.al. 2408.01732 null
2024-08-03 JambaTalk: Speech-Driven 3D Talking Head Generation Based on Hybrid Transformer-Mamba Model Farzaneh Jafari et.al. 2408.01627 null
2024-08-01 UniTalker: Scaling up Audio-Driven 3D Facial Animation through A Unified Model Xiangyu Fan et.al. 2408.00762 null
2024-08-01 Reenact Anything: Semantic Video Motion Transfer Using Motion-Textual Inversion Manuel Kansy et.al. 2408.00458 null
2024-08-01 EmoTalk3D: High-Fidelity Free-View Synthesis of Emotional 3D Talking Head Qianyun He et.al. 2408.00297 null
2024-07-31 Deformable 3D Shape Diffusion Model Dengsheng Chen et.al. 2407.21428 null
2024-07-26 LinguaLinker: Audio-Driven Portraits Animation with Implicit Facial Control Enhancement Rui Zhang et.al. 2407.18595 null
2024-07-24 A Comprehensive Review and Taxonomy of Audio-Visual Synchronization Techniques for Realistic Speech Animation Jose Geraldo Fernandes et.al. 2407.17430 null
2024-07-24 The impact of differences in facial features between real speakers and 3D face models on synthesized lip motions Rabab Algadhy et.al. 2407.17253 null
2024-07-22 PAV: Personalized Head Avatar from Unstructured Video Collection Akin Caliskan et.al. 2407.21047 null
2024-07-21 Anchored Diffusion for Video Face Reenactment Idan Kligvasser et.al. 2407.15153 null
2024-07-20 Text-based Talking Video Editing with Cascaded Conditional Diffusion Bo Han et.al. 2407.14841 null
2024-07-17 Universal Facial Encoding of Codec Avatars from VR Headsets Shaojie Bai et.al. 2407.13038 null
2024-07-17 EmoFace: Audio-driven Emotional 3D Face Animation Chang Liu et.al. 2407.12501 link
2024-07-13 Learning Online Scale Transformation for Talking Head Video Generation Fa-Ting Hong et.al. 2407.09965 null
2024-07-12 Real Face Video Animation Platform Xiaokai Chen et.al. 2407.18955 null
2024-07-12 One-Shot Pose-Driving Face Animation Platform He Feng et.al. 2407.08949 null
2024-07-12 EchoMimic: Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditions Zhiyuan Chen et.al. 2407.08136 null
2024-07-08 MobilePortrait: Real-Time One-Shot Neural Head Avatars on Mobile Devices Jianwen Jiang et.al. 2407.05712 null
2024-07-08 Audio-driven High-resolution Seamless Talking Head Video Editing via StyleGAN Jiacheng Su et.al. 2407.05577 null
2024-07-04 Compressed Skinning for Facial Blendshapes Ladislav Kavan et.al. 2406.11597 null
2024-07-03 LivePortrait: Efficient Portrait Animation with Stitching and Retargeting Control Jianzhu Guo et.al. 2407.03168 link
2024-07-01 Enhancing Speech-Driven 3D Facial Animation with Audio-Visual Guidance from Lip Reading Expert Han EunGi et.al. 2407.01034 null
2024-06-26 RealTalk: Real-time and Realistic Audio-driven Face Generation with 3D Facial Prior-guided Identity Alignment Network Xiaozhong Ji et.al. 2406.18284 null
2024-06-24 The Effects of Embodiment and Personality Expression on Learning in LLM-based Educational Agents Sinan Sonlu et.al. 2407.10993 null
2024-06-21 EmpathyEar: An Open-source Avatar Multimodal Empathetic Chatbot Hao Fei et.al. 2406.15177 link
2024-06-20 MultiTalk: Enhancing 3D Talking Head Generation Across Languages with Multilingual Video Dataset Kim Sung-Bin et.al. 2406.14272 null
2024-06-19 DF40: Toward Next-Generation Deepfake Detection Zhiyuan Yan et.al. 2406.13495 null
2024-06-19 AniFaceDiff: High-Fidelity Face Reenactment via Facial Parametric Conditioned Diffusion Models Ken Chen et.al. 2406.13272 null
2024-06-18 RITA: A Real-time Interactive Talking Avatars Framework Wuxinlin Cheng et.al. 2406.13093 null
2024-06-18 A Comprehensive Taxonomy and Analysis of Talking Head Synthesis: Techniques for Portrait Generation, Driving Mechanisms, and Editing Ming Meng et.al. 2406.10553 null
2024-06-17 NLDF: Neural Light Dynamic Fields for Efficient 3D Talking Head Generation Niu Guanchen et.al. 2406.11259 null
2024-06-17 Make Your Actor Talk: Generalizable and High-Fidelity Lip Sync with Motion and Appearance Disentanglement Runyi Yu et.al. 2406.08096 null
2024-06-16 Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation Mingwang Xu et.al. 2406.08801 null
2024-06-14 DNPM: A Neural Parametric Model for the Synthesis of Facial Geometric Details Haitao Cao et.al. 2405.19688 null
2024-06-13 Talking Heads: Understanding Inter-layer Communication in Transformer Language Models Jack Merullo et.al. 2406.09519 null
2024-06-13 DubWise: Video-Guided Speech Duration Control in Multimodal LLM-based Text-to-Speech for Dubbing Neha Sahipjohn et.al. 2406.08802 null
2024-06-12 Emotional Conversation: Empowering Talking Faces with Cohesive Expression, Gaze and Pose Generation Jiadong Liang et.al. 2406.07895 null
2024-06-07 Follow-Your-Emoji: Fine-Controllable and Expressive Freestyle Portrait Animation Yue Ma et.al. 2406.01900 null
2024-06-05 Controllable Talking Face Generation by Implicit Facial Keypoints Editing Dong Zhao et.al. 2406.02880 null
2024-05-31 MunchSonic: Tracking Fine-grained Dietary Actions through Active Acoustic Sensing on Eyeglasses Saif Mahmud et.al. 2405.21004 null
2024-05-31 MegActor: Harness the Power of Raw Video for Vivid Portrait Animation Shurong Yang et.al. 2405.20851 link
2024-05-30 Audio2Rig: Artist-oriented deep learning tool for facial animation Bastien Arcelin et.al. 2405.20412 null
2024-05-28 OpFlowTalker: Realistic and Natural Talking Face Generation via Optical Flow Guidance Shuheng Ge et.al. 2405.14709 null
2024-05-24 InstructAvatar: Text-Guided Emotion and Motion Control for Avatar Generation Yuchi Wang et.al. 2405.15758 link
2024-05-22 Metabook: An Automatically Generated Augmented Reality Storybook Interaction System to Improve Children’s Engagement in Storytelling Yibo Wang et.al. 2405.13701 null
2024-05-21 Face Adapter for Pre-Trained Diffusion Models with Fine-Grained ID and Attribute Control Yue Han et.al. 2405.12970 null
2024-05-16 Faces that Speak: Jointly Synthesising Talking Face and Speech from Text Youngjoon Jang et.al. 2405.10272 null
2024-05-14 PolyGlotFake: A Novel Multilingual and Multimodal DeepFake Dataset Yang Hou et.al. 2405.08838 link
2024-05-12 Listen, Disentangle, and Control: Controllable Speech-Driven Talking Head Generation Changpeng Cai et.al. 2405.07257 null
2024-05-10 NeRFFaceSpeech: One-shot Audio-driven 3D Talking Head Synthesis via Generative Prior Gihoon Kim et.al. 2405.05749 null
2024-05-09 SwapTalk: Audio-Driven Talking Face Generation with One-Shot Customization in Latent Space Zeren Zhang et.al. 2405.05636 null
2024-05-08 Audio-Visual Target Speaker Extraction with Reverse Selective Auditory Attention Ruijie Tao et.al. 2404.18501 null
2024-05-07 Audio-Visual Speech Representation Expert for Enhanced Talking Face Video Generation and Evaluation Dogucan Yaman et.al. 2405.04327 null
2024-05-06 AniTalker: Animate Vivid and Diverse Talking Faces through Identity-Decoupled Facial Motion Encoding Tao Liu et.al. 2405.03121 link
2024-04-29 EMOPortraits: Emotion-enhanced Multimodal One-shot Head Avatars Nikita Drobyshev et.al. 2404.19110 null
2024-04-29 GSTalker: Real-time Audio-Driven Talking Face Generation via Deformable Gaussian Splatting Bo Chen et.al. 2404.19040 null
2024-04-29 Embedded Representation Learning Network for Animating Styled Video Portrait Tianyong Wang et.al. 2404.19038 null
2024-04-29 CSTalk: Correlation Supervised Speech-driven 3D Emotional Facial Animation Generation Xiangyu Liang et.al. 2404.18604 null
2024-04-28 GaussianTalker: Speaker-specific Talking Head Synthesis via 3D Gaussian Splatting Hongyun Yu et.al. 2404.14037 null
2024-04-25 GaussianTalker: Real-Time High-Fidelity Talking Head Synthesis with Audio-Driven 3D Gaussian Splatting Kyusun Cho et.al. 2404.16012 link
2024-04-23 TalkingGaussian: Structure-Persistent 3D Talking Head Synthesis via Gaussian Splatting Jiahe Li et.al. 2404.15264 null
2024-04-19 Learn2Talk: 3D Talking Face Learns from 2D Talking Face Yixiang Zhuang et.al. 2404.12888 null
2024-04-16 VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time Sicheng Xu et.al. 2404.10667 null
2024-04-15 FSRT: Facial Scene Representation Transformer for Face Reenactment from Factorized Appearance, Head-pose, and Facial Expression Features Andre Rochow et.al. 2404.09736 null
2024-04-13 THQA: A Perceptual Quality Assessment Database for Talking Heads Yingjie Zhou et.al. 2404.09003 link
2024-04-11 EFHQ: Multi-purpose ExtremePose-Face-HQ dataset Trung Tuan Dao et.al. 2312.17205 null
2024-04-09 Deepfake Generation and Detection: A Benchmark and Survey Gan Pei et.al. 2403.17881 link
2024-04-08 SphereHead: Stable 3D Full-head Synthesis with Spherical Tri-plane Representation Heyuan Li et.al. 2404.05680 null
2024-04-07 GvT: A Graph-based Vision Transformer with Talking-Heads Utilizing Sparsity, Trained from Scratch on Small Datasets Dongjing Shan et.al. 2404.04924 null
2024-04-07 Towards a Simultaneous and Granular Identity-Expression Control in Personalized Face Generation Renshuai Liu et.al. 2401.01207 null
2024-04-03 MI-NeRF: Learning a Single Face NeRF from Multiple Identities Aggelina Chatziagapi et.al. 2403.19920 null
2024-04-02 EDTalk: Efficient Disentanglement for Emotional Talking Head Synthesis Shuai Tan et.al. 2404.01647 null
2024-04-02 Learning to Generate Conditional Tri-plane for 3D-aware Expression Controllable Portrait Animation Taekyung Ki et.al. 2404.00636 null
2024-04-01 FaceChain-ImagineID: Freely Crafting High-Fidelity Diverse Talking Faces from Disentangled Audio Chao Xu et.al. 2403.01901 link
2024-04-01 Exploring Phonetic Context-Aware Lip-Sync For Talking Face Generation Se Jin Park et.al. 2305.19556 null
2024-03-29 Talk3D: High-Fidelity Talking Portrait Synthesis via Personalized 3D Generative Prior Jaehoon Ko et.al. 2403.20153 link
2024-03-28 MoDiTalker: Motion-Disentangled Diffusion Model for High-Fidelity Talking Head Generation Seyeon Kim et.al. 2403.19144 link
2024-03-28 GOTCHA: Real-Time Video Deepfake Detection via Challenge-Response Govind Mittal et.al. 2210.06186 link
2024-03-27 X-Portrait: Expressive Portrait Animation with Hierarchical Motion Attention You Xie et.al. 2403.15931 null
2024-03-26 Superior and Pragmatic Talking Face Generation with Teacher-Student Framework Chao Liang et.al. 2403.17883 null
2024-03-26 AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation Huawei Wei et.al. 2403.17694 link
2024-03-25 DiffusionAct: Controllable Diffusion Autoencoder for One-shot Face Reenactment Stella Bounareli et.al. 2403.17217 null
2024-03-25 AnimateMe: 4D Facial Expressions via Diffusion Models Dimitrios Gerogiannis et.al. 2403.17213 null
2024-03-25 Make-Your-Anchor: A Diffusion-based 2D Avatar Generation Framework Ziyao Huang et.al. 2403.16510 link
2024-03-23 Adaptive Super Resolution For One-Shot Talking-Head Generation Luchuan Song et.al. 2403.15944 link
2024-03-23 Real3D-Portrait: One-shot Realistic 3D Talking Portrait Synthesis Zhenhui Ye et.al. 2401.08503 link
2024-03-22 LeGO: Leveraging a Surface Deformation Network for Animatable Stylized Face Generation with One Example Soyeon Yoon et.al. 2403.15227 link
2024-03-22 Virbo: Multimodal Multilingual Avatar Video Generation in Digital Marketing Juan Zhang et.al. 2403.11700 null
2024-03-19 EmoVOCA: Speech-Driven Emotional 3D Talking Heads Federico Nocentini et.al. 2403.12886 null
2024-03-19 ScanTalk: 3D Talking Heads from Unregistered Scans Federico Nocentini et.al. 2403.10942 null
2024-03-15 StyleTalker: One-shot Style-based Audio-driven Talking Head Video Generation Dongchan Min et.al. 2208.10922 null
2024-03-14 GAIA: Zero-shot Talking Avatar Generation Tianyu He et.al. 2311.15230 null
2024-03-13 Say Anything with Any Style Shuai Tan et.al. 2403.06363 null
2024-03-12 FlowVQTalker: High-Quality Emotional Talking Face Generation through Normalizing Flow and Quantization Shuai Tan et.al. 2403.06375 null
2024-03-12 Style2Talker: High-Resolution Talking Head Generation with Emotion Style and Art Style Shuai Tan et.al. 2403.06365 null
2024-03-11 A Comparative Study of Perceptual Quality Metrics for Audio-driven Talking Head Videos Weixia Zhang et.al. 2403.06421 link
2024-03-05 Memories are One-to-Many Mapping Alleviators in Talking Face Generation Anni Tang et.al. 2212.05005 null
2024-03-02 G4G:A Generic Framework for High Fidelity Talking Face Generation with Fine-grained Intra-modal Alignment Juan Zhang et.al. 2402.18122 null
2024-03-01 DAE-Talker: High Fidelity Speech-Driven Talking Face Generation with Diffusion Autoencoder Chenpeng Du et.al. 2303.17550 null
2024-02-29 Learning a Generalized Physical Face Model From Data Lingchen Yang et.al. 2402.19477 null
2024-02-28 Context-aware Talking Face Video Generation Meidai Xuanyuan et.al. 2402.18092 null
2024-02-27 EMO: Emote Portrait Alive – Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions Linrui Tian et.al. 2402.17485 null
2024-02-27 Learning Dynamic Tetrahedra for High-Quality Talking Head Synthesis Zicheng Zhang et.al. 2402.17364 link
2024-02-26 Resolution-Agnostic Neural Compression for High-Fidelity Portrait Video Conferencing via Implicit Radiance Fields Yifei Li et.al. 2402.16599 null
2024-02-25 AVI-Talking: Learning Audio-Visual Instructions for Expressive 3D Talking Face Generation Yasheng Sun et.al. 2402.16124 null
2024-02-21 Bring Your Own Character: A Holistic Solution for Automatic Facial Animation Generation of Customized Characters Zechen Bai et.al. 2402.13724 link
2024-02-21 StyleDubber: Towards Multi-Scale Style Learning for Movie Dubbing Gaoxiang Cong et.al. 2402.12636 null
2024-02-12 StyleLipSync: Style-based Personalized Lip-sync Video Generation Taekyung Ki et.al. 2305.00521 null
2024-02-08 DiffSpeaker: Speech-Driven 3D Facial Animation with Diffusion Transformer Zhiyuan Ma et.al. 2402.05712 link
2024-02-05 One-shot Neural Face Reenactment via Finding Directions in GAN’s Latent Space Stella Bounareli et.al. 2402.03553 null
2024-02-02 EmoSpeaker: One-shot Fine-grained Emotion-Controlled Talking Face Generation Guanwen Feng et.al. 2402.01422 null
2024-01-31 MM-TTS: Multi-modal Prompt based Style Transfer for Expressive Text-to-Speech Synthesis Wenhao Guan et.al. 2312.10687 null
2024-01-30 Media2Face: Co-speech Facial Animation Generation With Multi-Modality Guidance Qingcheng Zhao et.al. 2401.15687 null
2024-01-28 Lips Are Lying: Spotting the Temporal Inconsistency between Audio and Visual in Lip-Syncing DeepFakes Weifeng Liu et.al. 2401.15668 link
2024-01-27 An Implicit Physical Face Model Driven by Expression and Style Lingchen Yang et.al. 2401.15414 null
2024-01-26 Implicit Neural Representation for Physics-driven Actuated Soft Bodies Lingchen Yang et.al. 2401.14861 null
2024-01-25 SAiD: Speech-driven Blendshape Facial Animation with Diffusion Inkyu Park et.al. 2401.08655 link
2024-01-23 NeRF-AD: Neural Radiance Field with Attention-based Disentanglement for Talking Face Synthesis Chongke Bi et.al. 2401.12568 null
2024-01-19 Fast Registration of Photorealistic Avatars for VR Facial Animation Chaitanya Patel et.al. 2401.11002 null
2024-01-18 Exposing Lip-syncing Deepfakes from Mouth Inconsistencies Soumyya Kanti Datta et.al. 2401.10113 null
2024-01-18 Text-driven Talking Face Synthesis by Reprogramming Audio-driven Models Jeongsoo Choi et.al. 2306.16003 null
2024-01-16 EmoTalker: Emotionally Editable Talking Face Generation via Diffusion Model Bingyuan Zhang et.al. 2401.08049 null
2024-01-12 DiffDub: Person-generic Visual Dubbing Using Inpainting Renderer with Diffusion Auto-encoder Tao Liu et.al. 2311.01811 null
2024-01-11 Dubbing for Everyone: Data-Efficient Visual Dubbing using Neural Rendering Priors Jack Saunders et.al. 2401.06126 null
2024-01-11 Jump Cut Smoothing for Talking Heads Xiaojuan Wang et.al. 2401.04718 null
2024-01-08 AdaMesh: Personalized Facial Expressions and Head Poses for Adaptive Speech-Driven 3D Facial Animation Liyang Chen et.al. 2310.07236 null
2024-01-07 Freetalker: Controllable Speech and Text-Driven Gesture Generation Based on Diffusion Models for Enhanced Speaker Naturalness Sicheng Yang et.al. 2401.03476 null
2024-01-04 Expressive Speech-driven Facial Animation with controllable emotions Yutong Chen et.al. 2301.02008 link
2023-12-23 TransFace: Unit-Based Audio-Visual Speech Synthesizer for Talking Head Translation Xize Cheng et.al. 2312.15197 null
2023-12-21 DREAM-Talk: Diffusion-based Realistic Emotional Audio-driven Method for Single Image Talking Face Generation Chenxu Zhang et.al. 2312.13578 null
2023-12-20 FAAC: Facial Animation Generation with Anchor Frame and Conditional Control for Superior Fidelity and Editability Linze Li et.al. 2312.03775 null
2023-12-19 Learning Dense Correspondence for NeRF-Based Face Reenactment Songlin Yang et.al. 2312.10422 null
2023-12-19 Gaussian3Diff: 3D Gaussian Diffusion for 3D Full Head Synthesis and Editing Yushi Lan et.al. 2312.03763 null
2023-12-18 VectorTalker: SVG Talking Face Generation with Progressive Vectorisation Hao Hu et.al. 2312.11568 null
2023-12-18 AE-NeRF: Audio Enhanced Neural Radiance Field for Few Shot Talking Head Synthesis Dongze Li et.al. 2312.10921 null
2023-12-18 Mimic: Speaking Style Disentanglement for Speech-Driven 3D Facial Animation Hui Fu et.al. 2312.10877 null
2023-12-15 DreamTalk: When Expressive Talking Head Generation Meets Diffusion Probabilistic Models Yifeng Ma et.al. 2312.09767 null
2023-12-15 Attention-Based VR Facial Animation with Visual Mouth Camera Guidance for Immersive Telepresence Avatars Andre Rochow et.al. 2312.09750 null
2023-12-13 uTalk: Bridging the Gap Between Humans and AI Hussam Azzuni et.al. 2310.02739 null
2023-12-13 MMFace4D: A Large-Scale Multi-Modal 4D Face Dataset for Audio-Driven 3D Face Animation Haozhe Wu et.al. 2303.09797 null
2023-12-12 GMTalker: Gaussian Mixture based Emotional talking video Portraits Yibo Xia et.al. 2312.07669 null
2023-12-12 GSmoothFace: Generalized Smooth Talking Face Generation via Fine Grained 3D Face Guidance Haiming Zhang et.al. 2312.07385 null
2023-12-11 Neural Text to Articulate Talk: Deep Text to Audiovisual Speech Synthesis achieving both Auditory and Photo-realism Georgios Milis et.al. 2312.06613 link
2023-12-11 Study of Non-Verbal Behavior in Conversational Agents Camila Vicari Maccari et.al. 2312.06530 null
2023-12-11 DiT-Head: High-Resolution Talking Head Synthesis using Diffusion Transformers Aaron Mir et.al. 2312.06400 null
2023-12-11 Audio-driven Talking Face Generation by Overcoming Unintended Information Flow Dogucan Yaman et.al. 2307.09368 null
2023-12-10 DaGAN++: Depth-Aware Generative Adversarial Network for Talking Head Video Generation Fa-Ting Hong et.al. 2305.06225 link
2023-12-09 R2-Talker: Realistic Real-Time Talking Head Synthesis with Hash Grid Landmarks Encoding and Progressive Multilayer Conditioning Zhiling Ye et.al. 2312.05572 null
2023-12-09 FT2TF: First-Person Statement Text-To-Talking Face Generation Xingjian Diao et.al. 2312.05430 null
2023-12-08 SingingHead: A Large-scale 4D Dataset for Singing Head Animation Sijing Wu et.al. 2312.04369 null
2023-12-07 VividTalk: One-Shot Audio-Driven Talking Head Generation Based on 3D Hybrid Prior Xusen Sun et.al. 2312.01841 null
2023-12-05 PMMTalk: Speech-Driven 3D Facial Animation from Complementary Pseudo Multi-modal Features Tianshun Han et.al. 2312.02781 null
2023-12-05 MyPortrait: Morphable Prior-Guided Personalized Portrait Generation Bo Ding et.al. 2312.02703 null
2023-12-02 DiffusionTalker: Personalization and Acceleration for Speech-Driven 3D Face Diffuser Peng Chen et.al. 2311.16565 null
2023-12-01 3DiFACE: Diffusion-based Speech-driven 3D Facial Animation and Editing Balamurugan Thambiraja et.al. 2312.00870 null
2023-11-30 Learning One-Shot 4D Head Avatar Synthesis using Synthetic Data Yu Deng et.al. 2311.18729 null
2023-11-30 Talking Head(?) Anime from a Single Image 4: Improved Model and Its Distillation Pramook Khungurn et.al. 2311.17409 null
2023-11-29 SyncTalk: The Devil is in the Synchronization for Talking Head Synthesis Ziqiao Peng et.al. 2311.17590 link
2023-11-28 THInImg: Cross-modal Steganography for Presenting Talking Heads in Images Lin Zhao et.al. 2311.17177 null
2023-11-28 BakedAvatar: Baking Neural Fields for Real-Time Head Avatar Synthesis Hao-Bin Duan et.al. 2311.05521 link
2023-11-28 Continuously Controllable Facial Expression Editing in Talking Face Videos Zhiyao Sun et.al. 2209.08289 null
2023-11-20 MemoryCompanion: A Smart Healthcare Solution to Empower Efficient Alzheimer’s Care Via Unleashing Generative AI Lifei Zheng et.al. 2311.14730 null
2023-11-15 CP-EB: Talking Face Generation with Controllable Pose and Eye Blinking Embedding Jianzong Wang et.al. 2311.08673 null
2023-11-13 DualTalker: A Cross-Modal Dual Learning Approach for Speech-Driven 3D Facial Animation Guinan Su et.al. 2311.04766 null
2023-11-12 ChatAnything: Facetime Chat with LLM-Enhanced Personas Yilin Zhao et.al. 2311.06772 null
2023-11-08 Synthetic Speaking Children – Why We Need Them and How to Make Them Muhammad Ali Farooq et.al. 2311.06307 null
2023-11-06 RADIO: Reference-Agnostic Dubbing Video Synthesis Dongyeun Lee et.al. 2309.01950 null
2023-11-05 3D-Aware Talking-Head Video Motion Transfer Haomiao Ni et.al. 2311.02549 null
2023-11-03 Learning Separable Hidden Unit Contributions for Speaker-Adaptive Lip-Reading Songtao Luo et.al. 2310.05058 link
2023-11-02 LaughTalk: Expressive 3D Talking Head Generation with Laughter Kim Sung-Bin et.al. 2311.00994 null
2023-11-02 High-Fidelity and Freely Controllable Talking Head Video Generation Yue Gao et.al. 2304.10168 null
2023-10-31 Breathing Life into Faces: Speech-driven 3D Facial Animation with Natural Head Pose and Detailed Shape Wei Zhao et.al. 2310.20240 null
2023-10-29 On the Vulnerability of DeepFake Detectors to Attacks Generated by Denoising Diffusion Models Marija Ivanovska et.al. 2307.05397 null
2023-10-25 Personalized Speech-driven Expressive 3D Facial Animation Synthesis with Style Control Elif Bozkurt et.al. 2310.17011 null
2023-10-23 The Self 2.0: How AI-Enhanced Self-Clones Transform Self-Perception and Improve Presentation Skills Qingxiao Zheng et.al. 2310.15112 null
2023-10-19 Gemino: Practical and Robust Neural Compression for Video Conferencing Vibhaalakshmi Sivaraman et.al. 2209.10507 null
2023-10-17 CorrTalk: Correlation Between Hierarchical Speech and Facial Activity Variances for 3D Animation Zhaojie Chu et.al. 2310.11295 null
2023-10-15 HyperLips: Hyper Control Lips with High Resolution Decoder for Talking Face Generation Yaosen Chen et.al. 2310.05720 link
2023-10-12 CleftGAN: Adapting A Style-Based Generative Adversarial Network To Create Images Depicting Cleft Lip Deformity Abdullah Hayajneh et.al. 2310.07969 link
2023-10-12 Efficient Emotional Adaptation for Audio-Driven Talking-Head Generation Yuan Gan et.al. 2309.04946 link
2023-10-08 GestSync: Determining who is speaking without a talking head Sindhu B Hegde et.al. 2310.05304 link
2023-09-30 DiffPoseTalk: Speech-Driven Stylistic 3D Facial Animation and Head Pose Generation via Diffusion Models Zhiyao Sun et.al. 2310.00434 null
2023-09-28 OSM-Net: One-to-Many One-shot Talking Head Generation with Spontaneous Head Motions Jin Liu et.al. 2309.16148 null
2023-09-26 Emotional Speech-Driven Animation with Content-Emotion Disentanglement Radek Daněček et.al. 2306.08990 null
2023-09-20 FaceDiffuser: Speech-Driven 3D Facial Animation Synthesis Using Diffusion Stefan Stan et.al. 2309.11306 link
2023-09-20 Context-Aware Talking-Head Video Editing Songlin Yang et.al. 2308.00462 null
2023-09-18 That’s What I Said: Fully-Controllable Talking Face Generation Youngjoon Jang et.al. 2304.03275 null
2023-09-15 Audio-Visual Active Speaker Extraction for Sparsely Overlapped Multi-talker Speech Junjie Li et.al. 2309.08408 link
2023-09-14 DT-NeRF: Decomposed Triplane-Hash Neural Radiance Fields for High-Fidelity Talking Portrait Synthesis Yaoyu Su et.al. 2309.07752 null
2023-09-14 DiffTalker: Co-driven audio-image diffusion for talking faces via intermediate landmarks Zipeng Qi et.al. 2309.07509 null
2023-09-14 HDTR-Net: A Real-Time High-Definition Teeth Restoration Network for Arbitrary Talking Face Generation Methods Yongyuan Li et.al. 2309.07495 link
2023-09-13 PIAVE: A Pose-Invariant Audio-Visual Speaker Extraction Network Qinghua Liu et.al. 2309.06723 null
2023-09-12 DF-TransFusion: Multimodal Deepfake Detection via Lip-Audio Cross-Attention and Facial Self-Attention Aaditya Kharel et.al. 2309.06511 null
2023-09-12 Avatar Fingerprinting for Authorized Use of Synthetic Talking-Head Videos Ekta Prashnani et.al. 2305.03713 null
2023-09-11 ExpCLIP: Bridging Text and Facial Expressions via Semantic Alignment Yicheng Zhong et.al. 2308.14448 null
2023-09-10 MaskRenderer: 3D-Infused Multi-Mask Realistic Face Reenactment Tina Behrouzi et.al. 2309.05095 null
2023-09-09 Speech2Lip: High-fidelity Speech to Lip Generation by Learning from a Short Video Xiuzhe Wu et.al. 2309.04814 link
2023-09-01 Unsupervised Learning of Style-Aware Facial Animation from Real Acting Performances Wolfgang Paier et.al. 2306.10006 null
2023-08-30 From Pixels to Portraits: A Comprehensive Survey of Talking Head Generation Techniques and Applications Shreyank N Gowda et.al. 2308.16041 null
2023-08-30 SelfTalk: A Self-Supervised Commutative Training Diagram to Comprehend 3D Talking Faces Ziqiao Peng et.al. 2306.10799 link
2023-08-30 Laughing Matters: Introducing Laughing-Face Generation using Diffusion Models Antoni Bigata Casademunt et.al. 2305.08854 link
2023-08-29 Papeos: Augmenting Research Papers with Talk Videos Tae Soo Kim et.al. 2308.15224 null
2023-08-25 EmoTalk: Speech-Driven Emotional Disentanglement for 3D Face Animation Ziqiao Peng et.al. 2303.11089 link
2023-08-24 ToonTalker: Cross-Domain Face Reenactment Yuan Gong et.al. 2308.12866 null
2023-08-24 Efficient Region-Aware Neural Radiance Fields for High-Fidelity Talking Portrait Synthesis Jiahe Li et.al. 2307.09323 link
2023-08-23 DF-3DFace: One-to-Many Speech Synchronized 3D Face Animation with Diffusion Se Jin Park et.al. 2310.05934 null
2023-08-21 Deep Person Generation: A Survey from the Perspective of Face, Pose and Cloth Synthesis Tong Sha et.al. 2109.02081 null
2023-08-18 Diff2Lip: Audio Conditioned Diffusion Models for Lip-Synchronization Soumik Mukhopadhyay et.al. 2308.09716 link
2023-08-18 Implicit Identity Representation Conditioned Memory Compensation Network for Talking Head video Generation Fa-Ting Hong et.al. 2307.09906 link
2023-08-17 A Survey on Deep Multi-modal Learning for Body Language Recognition and Generation Li Liu et.al. 2308.08849 link
2023-08-16 Instruct-NeuralTalker: Editing Audio-Driven Talking Radiance Fields with Instructions Yuqi Sun et.al. 2306.10813 null
2023-08-12 Text-to-Video: a Two-stage Framework for Zero-shot Identity-agnostic Talking-head Generation Zhichao Wang et.al. 2308.06457 link
2023-08-12 DialogueNeRF: Towards Realistic Avatar Face-to-Face Conversation Video Generation Yichao Yan et.al. 2203.07931 null
2023-08-11 Versatile Face Animator: Driving Arbitrary 3D Facial Avatar in RGBD Space Haoyu Wang et.al. 2308.06076 link
2023-08-11 VAST: Vivify Your Talking Avatar via Zero-Shot Expressive Facial Style Transfer Liyang Chen et.al. 2308.04830 null
2023-08-10 Near-realtime Facial Animation by Deep 3D Simulation Super-Resolution Hyojoon Park et.al. 2305.03216 null
2023-08-02 Ada-TTA: Towards Adaptive High-Quality Text-to-Talking Avatar Synthesis Zhenhui Ye et.al. 2306.03504 null
2023-07-29 Diffused Heads: Diffusion Models Beat GANs on Talking-Face Generation Michał Stypułkowski et.al. 2301.03396 null
2023-07-26 Learning Landmarks Motion from Speech for Speaker-Agnostic 3D Talking Heads Generation Federico Nocentini et.al. 2306.01415 link
2023-07-20 HyperReenact: One-Shot Reenactment via Jointly Learning to Refine and Retarget Faces Stella Bounareli et.al. 2307.10797 link
2023-07-19 MODA: Mapping-Once Audio-driven Portrait Animation with Dual Attentions Yunfei Liu et.al. 2307.10008 null
2023-07-19 Hierarchical Semantic Perceptual Listener Head Video Generation: A High-performance Pipeline Zhigang Chang et.al. 2307.09821 null
2023-07-19 OPHAvatars: One-shot Photo-realistic Head Avatars Shaoxu Li et.al. 2307.09153 link
2023-07-18 FACTS: Facial Animation Creation using the Transfer of Styles Jack Saunders et.al. 2307.09480 null
2023-07-09 Predictive Coding For Animation-Based Video Compression Goluck Konuko et.al. 2307.04187 null
2023-07-08 FTFDNet: Learning to Detect Talking Face Video Manipulation with Tri-Modality Interaction Ganglai Wang et.al. 2307.03990 null
2023-07-05 Interactive Conversational Head Generation Mohan Zhou et.al. 2307.02090 null
2023-07-04 A Comprehensive Multi-scale Approach for Speech and Dynamics Synchrony in Talking Head Generation Louis Airale et.al. 2307.03270 link
2023-07-04 Generating Animatable 3D Cartoon Faces from Single Portraits Chuanyu Pan et.al. 2307.01468 null
2023-07-03 RobustL2S: Speaker-Specific Lip-to-Speech Synthesis exploiting Self-Supervised Representations Neha Sahipjohn et.al. 2307.01233 null
2023-06-20 Audio-Driven 3D Facial Animation from In-the-Wild Videos Liying Lu et.al. 2306.11541 null
2023-06-13 Parametric Implicit Face Representation for Audio-Driven Facial Reenactment Ricong Huang et.al. 2306.07579 null
2023-06-13 AniFaceDrawing: Anime Portrait Exploration during Your Sketching Zhengyu Huang et.al. 2306.07476 null
2023-06-12 NPVForensics: Jointing Non-critical Phonemes and Visemes for Deepfake Detection Yu Chen et.al. 2306.06885 null
2023-06-10 StyleTalk: One-shot Talking Head Generation with Controllable Speaking Styles Yifeng Ma et.al. 2301.01081 link
2023-06-08 ReliableSwap: Boosting General Face Swapping Via Reliable Supervision Ge Yuan et.al. 2306.05356 link
2023-06-06 Emotional Talking Head Generation based on Memory-Sharing and Attention-Augmented Networks Jianrong Wang et.al. 2306.03594 null
2023-06-05 Instruct-Video2Avatar: Video-to-Avatar Generation with Instructions Shaoxu Li et.al. 2306.02903 link
2023-05-31 High-fidelity Generalized Emotional Talking Face Generation with Multi-modal Emotion Space Learning Chao Xu et.al. 2305.02572 null
2023-05-23 CPNet: Exploiting CLIP-based Attention Condenser and Probability Map Guidance for High-fidelity Talking Face Generation Jingning Xu et.al. 2305.13962 null
2023-05-22 RenderMe-360: A Large Digital Asset Library and Benchmarks Towards High-fidelity Head Avatars Dongwei Pan et.al. 2305.13353 link
2023-05-19 UniFLG: Unified Facial Landmark Generator from Text or Speech Kentaro Mitsui et.al. 2302.14337 null
2023-05-18 An Android Robot Head as Embodied Conversational Agent Marcel Heisler et.al. 2305.10945 null
2023-05-18 Audio-Visual Person-of-Interest DeepFake Detection Davide Cozzolino et.al. 2204.03083 link
2023-05-17 INCLG: Inpainting for Non-Cleft Lip Generation with a Multi-Task Image Processing Network Shuang Chen et.al. 2305.10589 null
2023-05-17 LPMM: Intuitive Pose Control for Neural Talking-Head Model via Landmark-Parameter Morphable Model Kwangho Lee et.al. 2305.10456 null
2023-05-15 Identity-Preserving Talking Face Generation with Landmark and Appearance Priors Weizhi Zhong et.al. 2305.08293 link
2023-05-09 Zero-shot personalized lip-to-speech synthesis with face image based voice control Zheng-Yan Sheng et.al. 2305.14359 null
2023-05-09 StyleSync: High-Fidelity Generalized and Personalized Lip Sync in Style-based Generator Jiazhi Guan et.al. 2305.05445 null
2023-05-09 Multimodal-driven Talking Face Generation via a Unified Diffusion-based Generator Chao Xu et.al. 2305.02594 null
2023-05-01 StyleAvatar: Real-time Photo-realistic Portrait Avatar from a Single Video Lizhen Wang et.al. 2305.00942 link
2023-05-01 GeneFace++: Generalized and Stable Real-Time Audio-Driven 3D Talking Face Generation Zhenhui Ye et.al. 2305.00787 null
2023-04-28 A Unified Compression Framework for Efficient Speech-Driven Talking-Face Generation Bo-Kyeong Kim et.al. 2304.00471 null
2023-04-27 Controllable One-Shot Face Video Synthesis With Semantic Aware Prior Kangning Liu et.al. 2304.14471 null
2023-04-25 AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head Rongjie Huang et.al. 2304.12995 link
2023-04-24 VR Facial Animation for Immersive Telepresence Avatars Andre Rochow et.al. 2304.12051 null
2023-04-21 Implicit Neural Head Synthesis via Controllable Local Deformation Fields Chuhan Chen et.al. 2304.11113 null
2023-04-20 DiffTalk: Crafting Diffusion Models for Generalized Audio-Driven Portraits Animation Shuai Shen et.al. 2301.03786 link
2023-04-18 Audio-Driven Talking Face Generation with Diverse yet Realistic Facial Animations Rongliang Wu et.al. 2304.08945 null
2023-04-17 Autoregressive GAN for Semantic Unconditional Head Motion Generation Louis Airale et.al. 2211.00987 link
2023-04-11 One-Shot High-Fidelity Talking-Head Synthesis with Deformable Neural Radiance Field Weichuang Li et.al. 2304.05097 null
2023-04-06 Face Animation with an Attribute-Guided Diffusion Model Bohan Zeng et.al. 2304.03199 link
2023-04-06 4D Agnostic Real-Time Facial Animation Pipeline for Desktop Scenarios Wei Chen et.al. 2304.02814 null
2023-04-03 CodeTalker: Speech-Driven 3D Facial Animation with Discrete Motion Prior Jinbo Xing et.al. 2301.02379 link
2023-04-01 DreamFace: Progressive Generation of Animatable 3D Faces under Text Guidance Longwen Zhang et.al. 2304.03117 null
2023-04-01 TalkCLIP: Talking Head Generation with Text-Guided Expressive Speaking Styles Yifeng Ma et.al. 2304.00334 null
2023-03-31 FONT: Flow-guided One-shot Talking Head Generation with Natural Head Motions Jin Liu et.al. 2303.17789 null
2023-03-29 Seeing What You Said: Talking Face Generation Guided by a Lip Reading Expert Jiadong Wang et.al. 2303.17480 link
2023-03-27 OmniAvatar: Geometry-Guided Controllable 3D Head Synthesis Hongyi Xu et.al. 2303.15539 null
2023-03-27 Accurate and Interpretable Solution of the Inverse Rig for Realistic Blendshape Models with Quadratic Corrective Terms Stevo Racković et.al. 2302.04843 null
2023-03-27 MetaPortrait: Identity-Preserving Talking Head Generation with Fast Personalized Adaptation Bowen Zhang et.al. 2212.08062 link
2023-03-27 A Majorization-Minimization Based Method for Nonconvex Inverse Rig Problems in Facial Animation: Algorithm Derivation Stevo Racković et.al. 2205.04289 null
2023-03-26 OTAvatar: One-shot Talking Face Avatar with Controllable Tri-plane Rendering Zhiyuan Ma et.al. 2303.14662 link
2023-03-26 Emotionally Enhanced Talking Face Generation Sahil Goyal et.al. 2303.11548 link
2023-03-26 Distributed Solution of the Inverse Rig Problem in Blendshape Facial Animation Stevo Racković et.al. 2303.06370 null
2023-03-24 Synthesizing Photorealistic Virtual Humans Through Cross-modal Disentanglement Siddarth Ravichandran et.al. 2209.01320 null
2023-03-23 PanoHead: Geometry-Aware 3D Full-Head Synthesis in 360 $^{\circ}$ Sizhe An et.al. 2303.13071 null
2023-03-22 Style Transfer for 2D Talking Head Animation Trong-Thang Pham et.al. 2303.09799 link
2023-03-22 MARLIN: Masked Autoencoder for facial video Representation LearnINg Zhixi Cai et.al. 2211.06627 link
2023-03-14 DisCoHead: Audio-and-Video-Driven Talking Head Generation by Disentangled Control of Head Pose and Facial Expressions Geumbyeol Hwang et.al. 2303.07697 link
2023-03-13 SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation Wenxuan Zhang et.al. 2211.12194 link
2023-03-09 FaceXHuBERT: Text-less Speech-driven E(X)pressive 3D Facial Animation Synthesis Using Self-Supervised Speech Representation Learning Kazi Injamamul Haque et.al. 2303.05416 link
2023-03-09 Improving Few-Shot Learning for Talking Face System with TTS Data Augmentation Qi Chen et.al. 2303.05322 link
2023-03-07 DINet: Deformation Inpainting Network for Realistic Face Visually Dubbing on High Resolution Video Zhimeng Zhang et.al. 2303.03988 link
2023-03-05 Cyber Vaccine for Deepfake Immunity Ching-Chun Chang et.al. 2303.02659 null
2023-03-04 High-fidelity Facial Avatar Reconstruction from Monocular Video with Generative Priors Yunpeng Bai et.al. 2211.15064 null
2023-03-01 DPE: Disentanglement of Pose and Expression for General Video Portrait Editing Youxin Pang et.al. 2301.06281 link
2023-02-27 Deep Visual Forced Alignment: Learning to Align Transcription with Talking Face Video Minsu Kim et.al. 2303.08670 null
2023-02-27 Memory-augmented Contrastive Learning for Talking Head Generation Jianrong Wang et.al. 2302.13469 link
2023-02-24 Pose-Controllable 3D Facial Animation Synthesis using Hierarchical Audio-Vertex Attention Bin Liu et.al. 2302.12532 null
2023-02-16 OPT: One-shot Pose-Controllable Talking Head Generation Jin Liu et.al. 2302.08197 null
2023-02-14 Expressive Talking Head Video Encoding in StyleGAN2 Latent-Space Trevine Oorloff et.al. 2203.14512 link
2023-01-31 GeneFace: Generalized and High-Fidelity Audio-Driven 3D Talking Face Synthesis Zhenhui Ye et.al. 2301.13430 null
2023-01-23 Data standardization for robust lip sync Chun Wang et.al. 2202.06198 null
2023-01-20 Neural Volumetric Blendshapes: Computationally Efficient Physics-Based Facial Blendshapes Nicolas Wagner et.al. 2212.14784 null
2023-01-15 Learning Audio-Driven Viseme Dynamics for 3D Face Animation Linchao Bao et.al. 2301.06059 null
2022-12-30 Imitator: Personalized Speech-driven 3D Facial Animation Balamurugan Thambiraja et.al. 2301.00023 null
2022-12-28 All’s well that FID’s well? Result quality and metric scores in GAN models for lip-sychronization tasks Carina Geldhauser et.al. 2212.13810 null
2022-12-23 Dubbing in Practice: A Large Scale Study of Human Localization With Insights for Automatic Dubbing William Brannon et.al. 2212.12137 null
2022-12-09 Masked Lip-Sync Prediction by Audio-Visual Contextual Exploitation in Transformers Yasheng Sun et.al. 2212.04970 null
2022-12-07 Talking Head Generation with Probabilistic Audio-to-Visual Diffusion Priors Zhentao Yu et.al. 2212.04248 null
2022-12-07 SPACE: Speech-driven Portrait Animation with Controllable Expression Siddharth Gururani et.al. 2211.09809 null
2022-11-30 Extracting Semantic Knowledge from GANs with Unsupervised Learning Jianjin Xu et.al. 2211.16710 null
2022-11-27 VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild Kun Cheng et.al. 2211.14758 null
2022-11-26 Progressive Disentangled Representation Learning for Fine-Grained Controllable Talking Head Synthesis Duomin Wang et.al. 2211.14506 link
2022-11-22 Real-time Neural Radiance Talking Portrait Synthesis via Audio-spatial Decomposition Jiaxiang Tang et.al. 2211.12368 null
2022-11-10 On the role of Lip Articulation in Visual Speech Perception Zakaria Aldeneh et.al. 2203.10117 null
2022-11-03 SyncTalkFace: Talking Face Generation with Precise Lip-Syncing via Audio-Lip Memory Se Jin Park et.al. 2211.00924 null
2022-10-21 Leveraging Real Talking Faces via Self-Supervision for Robust Forgery Detection Alexandros Haliassos et.al. 2201.07131 link
2022-10-13 Sparse in Space and Time: Audio-visual Synchronisation with Trainable Selectors Vladimir Iashin et.al. 2210.07055 link
2022-10-13 Pre-Avatar: An Automatic Presentation Generation Framework Leveraging Talking Avatar Aolan Sun et.al. 2210.06877 null
2022-10-07 Compressing Video Calls using Synthetic Talking Heads Madhav Agarwal et.al. 2210.03692 null
2022-10-07 A Keypoint Based Enhancement Method for Audio Driven Free View Talking Head Synthesis Yichen Han et.al. 2210.03335 null
2022-10-06 Audio-Visual Face Reenactment Madhav Agarwal et.al. 2210.02755 link
2022-10-06 Finding Directions in GAN’s Latent Space for Neural Face Reenactment Stella Bounareli et.al. 2202.00046 link
2022-10-04 Towards MOOCs for Lipreading: Using Synthetic Talking Heads to Train Humans in Lipreading at Scale Aditya Agarwal et.al. 2208.09796 null
2022-09-29 Facial Landmark Predictions with Applications to Metaverse Qiao Han et.al. 2209.14698 link
2022-09-27 StyleMask: Disentangling the Style Space of StyleGAN2 for Neural Face Reenactment Stella Bounareli et.al. 2209.13375 link
2022-09-23 EAMM: One-Shot Emotional Talking Face via Audio-Based Emotion-Aware Motion Model Xinya Ji et.al. 2205.15278 null
2022-09-21 FNeVR: Neural Volume Rendering for Face Animation Bohan Zeng et.al. 2209.10340 link
2022-09-19 AutoLV: Automatic Lecture Video Generator Wenbin Wang et.al. 2209.08795 null
2022-09-09 Talking Head from Speech Audio using a Pre-trained Image Generator Mohammed M. Alghamdi et.al. 2209.04252 null
2022-09-07 Restructurable Activation Networks Kartikeya Bhardwaj et.al. 2208.08562 link
2022-08-29 StableFace: Analyzing and Improving Motion Stability for Talking Face Generation Jun Ling et.al. 2208.13717 null
2022-08-17 Extreme-scale Talking-Face Video Upsampling with Audio-Visual Priors Sindhu B Hegde et.al. 2208.08118 link
2022-08-03 Free-HeadGAN: Neural Talking Head Synthesis with Explicit Gaze Control Michail Christos Doukas et.al. 2208.02210 null
2022-08-02 Perceptual Conversational Head Generation with Regularized Driver and Enhanced Renderer Ailin Huang et.al. 2206.12837 link
2022-08-01 A Feasibility Study on Image Inpainting for Non-cleft Lip Generation from Patients with Cleft Lip Shuang Chen et.al. 2208.01149 link
2022-07-27 A Hybrid Deep Animation Codec for Low-bitrate Video Conferencing Goluck Konuko et.al. 2207.13530 null
2022-07-24 Learning Dynamic Facial Radiance Fields for Few-Shot Talking Head Synthesis Shuai Shen et.al. 2207.11770 link
2022-07-22 Visual Speech-Aware Perceptual 3D Facial Expression Reconstruction from Videos Panagiotis P. Filntisis et.al. 2207.11094 link
2022-07-20 NARRATE: A Normal Assisted Free-View Portrait Stylizer Youjia Wang et.al. 2207.00974 null
2022-07-20 VisageSynTalk: Unseen Speaker Video-to-Speech Synthesis via Speech-Visage Feature Selection Joanna Hong et.al. 2206.07458 null
2022-07-20 Responsive Listening Head Generation: A Benchmark Dataset and Baseline Mohan Zhou et.al. 2112.13548 null
2022-07-13 FastLTS: Non-Autoregressive End-to-End Unconstrained Lip-to-Speech Synthesis Yongqi Wang et.al. 2207.03800 null
2022-06-29 Cut Inner Layers: A Structured Pruning Strategy for Efficient U-Net GANs Bo-Kyeong Kim et.al. 2206.14658 null
2022-06-09 Face-Dubbing++: Lip-Synchronous, Voice Preserving Translation of Videos Alexander Waibel et.al. 2206.04523 null
2022-05-31 Text/Speech-Driven Full-Body Animation Wenlin Zhuang et.al. 2205.15573 null
2022-05-27 Unsupervised Voice-Face Representation Learning by Cross-Modal Prototype Contrast Boqing Zhu et.al. 2204.14057 link
2022-05-26 One-Shot Face Reenactment on Megapixels Wonjun Kang et.al. 2205.13368 null
2022-05-24 Merkel Podcast Corpus: A Multimodal Dataset Compiled from 16 Years of Angela Merkel’s Weekly Video Podcasts Debjoy Saha et.al. 2205.12194 link
2022-05-20 MeshTalk: 3D Face Animation from Speech using Cross-Modality Disentanglement Alexander Richard et.al. 2104.08223 link
2022-05-13 Talking Face Generation with Multilingual TTS Hyoung-Kyu Song et.al. 2205.06421 null
2022-05-02 Emotion-Controllable Generalized Talking Face Generation Sanjana Sinha et.al. 2205.01155 null
2022-05-02 A Novel Speech-Driven Lip-Sync Model with CNN and LSTM Xiaohong Li et.al. 2205.00916 null
2022-04-27 Talking Head Generation Driven by Speech-Related Facial Action Units and Audio- Based on Multimodal Representation Fusion Sen Chen et.al. 2204.12756 null
2022-04-25 Fast Facial Landmark Detection and Applications: A Survey Kostiantyn Khabarlak et.al. 2101.10808 null
2022-04-13 Dynamic Neural Textures: Generating Talking-Face Videos with Continuously Controllable Expressions Zipeng Ye et.al. 2204.06180 null
2022-04-06 Transformer-S2A: Robust and Efficient Speech-to-Animation Liyang Chen et.al. 2111.09771 null
2022-04-03 Txt2Vid: Ultra-Low Bitrate Compression of Talking-Head Videos via Text Pulkit Tandon et.al. 2106.14014 link
2022-03-30 End to End Lip Synchronization with a Temporal AutoEncoder Yoav Shalev et.al. 2203.16224 link
2022-03-29 Thin-Plate Spline Motion Model for Image Animation Jian Zhao et.al. 2203.14367 link
2022-03-17 StyleHEAT: One-Shot High-Resolution Editable Talking Face Generation via Pre-trained StyleGAN Fei Yin et.al. 2203.04036 link
2022-03-17 FaceFormer: Speech-Driven 3D Facial Animation with Transformers Yingruo Fan et.al. 2112.05329 link
2022-03-16 Efficient conditioned face animation using frontally-viewed embedding Maxime Oquab et.al. 2203.08765 null
2022-03-15 Depth-Aware Generative Adversarial Network for Talking Head Video Generation Fa-Ting Hong et.al. 2203.06605 link
2022-03-10 An Audio-Visual Attention Based Multimodal Network for Fake Talking Face Videos Detection Ganglai Wang et.al. 2203.05178 null
2022-03-08 Attention-Based Lip Audio-Visual Synthesis for Talking Face Generation in the Wild Ganglai Wang et.al. 2203.03984 null
2022-03-04 Multi-modality Deep Restoration of Extremely Compressed Face Videos Xi Zhang et.al. 2107.05548 null
2022-03-01 FakeAVCeleb: A Novel Audio-Video Multimodal Deepfake Dataset Hasam Khalid et.al. 2108.05080 link
2022-02-25 FSGANv2: Improved Subject Agnostic Face Swapping and Reenactment Yuval Nirkin et.al. 2202.12972 null
2022-02-22 Thinking the Fusion Strategy of Multi-reference Face Reenactment Takuya Yashima et.al. 2202.10758 null
2022-01-24 Selective Listening by Synchronizing Speech with Lips Zexu Pan et.al. 2106.07150 link
2022-01-22 Text2Video: Text-driven Talking-head Video Synthesis with Personalized Phoneme-Pose Dictionary Sibo Zhang et.al. 2104.14631 null
2022-01-21 Stitch it in Time: GAN-Based Facial Editing of Real Videos Rotem Tzaban et.al. 2201.08361 link
2022-01-17 Towards Realistic Visual Dubbing with Heterogeneous Sources Tianyi Xie et.al. 2201.06260 null
2022-01-16 Audio-Driven Talking Face Video Generation with Dynamic Convolution Kernels Zipeng Ye et.al. 2201.05986 null
2022-01-03 DFA-NeRF: Personalized Talking Head Generation via Disentangled Face Attributes Neural Rendering Shunyu Yao et.al. 2201.00791 null
2021-12-20 Parallel and High-Fidelity Text-to-Lip Generation Jinglin Liu et.al. 2107.06831 link
2021-12-19 Initiative Defense against Facial Manipulation Qidong Huang et.al. 2112.10098 link
2021-12-07 Joint Audio-Text Model for Expressive Speech-Driven 3D Facial Animation Yingruo Fan et.al. 2112.02214 null
2021-12-06 One-shot Talking Face Generation from Single-speaker Audio-Visual Correlation Learning Suzhen Wang et.al. 2112.02749 null
2021-11-29 Speech Drives Templates: Co-Speech Gesture Synthesis with Learned Templates Shenhan Qian et.al. 2108.08020 link
2021-11-04 FEAFA+: An Extended Well-Annotated Dataset for Facial Expression Analysis and 3D Facial Animation Wei Gan et.al. 2111.02751 null
2021-11-02 BiosecurID: a multimodal biometric database Julian Fierrez et.al. 2111.03472 null
2021-10-30 Imitating Arbitrary Talking Style for Realistic Audio-DrivenTalking Face Synthesis Haozhe Wu et.al. 2111.00203 link
2021-10-26 Emotion recognition in talking-face videos using persistent entropy and neural networks Eduardo Paluzo-Hidalgo et.al. 2110.13571 link
2021-10-26 ViDA-MAN: Visual Dialog with Digital Humans Tong Shen et.al. 2110.13384 null
2021-10-22 Invertible Frowns: Video-to-Video Facial Emotion Translation Ian Magnusson et.al. 2109.08061 null
2021-10-19 Talking Head Generation with Audio and Speech Related Facial Action Units Sen Chen et.al. 2110.09951 null
2021-10-16 Intelligent Video Editing: Incorporating Modern Talking Face Generation Algorithms in a Video Editor Anchit Gupta et.al. 2110.08580 null
2021-10-12 Fine-grained Identity Preserving Landmark Synthesis for Face Reenactment Haichao Zhang et.al. 2110.04708 null
2021-10-07 Streaming Transformer Transducer Based Speech Recognition Using Non-Causal Convolution Yangyang Shi et.al. 2110.05241 null
2021-09-24 Live Speech Portraits: Real-Time Photorealistic Talking-Head Animation Yuanxun Lu et.al. 2109.10595 null
2021-09-20 Accurate, Interpretable, and Fast Animation: An Iterative, Sparse, and Nonconvex Approach Stevo Rackovic et.al. 2109.08356 null
2021-09-17 Detection of GAN-synthesized street videos Omran Alamayreh et.al. 2109.04991 null
2021-08-30 Audiovisual Speech Synthesis using Tacotron2 Ahmed Hussen Abdelaziz et.al. 2008.00620 null
2021-08-23 KoDF: A Large-scale Korean DeepFake Detection Dataset Patrick Kwon et.al. 2103.10094 null
2021-08-23 HeadGAN: One-shot Neural Head Synthesis and Editing Michail Christos Doukas et.al. 2012.08261 null
2021-08-19 AD-NeRF: Audio Driven Neural Radiance Fields for Talking Head Synthesis Yudong Guo et.al. 2103.11078 link
2021-08-18 DeepFake MNIST+: A DeepFake Facial Animation Dataset Jiajun Huang et.al. 2108.07949 link
2021-08-18 FACIAL: Synthesizing Dynamic Talking Face with Implicit Attribute Learning Chenxu Zhang et.al. 2108.07938 link
2021-08-12 UniFaceGAN: A Unified Framework for Temporally Consistent Facial Video Editing Meng Cao et.al. 2108.05650 null
2021-08-11 AnyoneNet: Synchronized Speech and Talking Head Generation for Arbitrary Person Xinsheng Wang et.al. 2108.04325 null
2021-08-06 SofGAN: A Portrait Image Generator with Dynamic Styling Anpei Chen et.al. 2007.03780 link
2021-07-27 Beyond Voice Identity Conversion: Manipulating Voice Attributes by Adversarial Learning of Structured Disentangled Representations Laurent Benaroya et.al. 2107.12346 null
2021-07-21 Speech Driven Talking Face Generation from a Single Image and an Emotion Condition Sefik Emre Eskimez et.al. 2008.03592 link
2021-07-20 Audio2Head: Audio-driven One-shot Talking-head Generation with Natural Head Motion Suzhen Wang et.al. 2107.09293 link
2021-07-10 Speech2Video: Cross-Modal Distillation for Speech to Video Generation Shijing Si et.al. 2107.04806 null
2021-07-07 Egocentric Videoconferencing Mohamed Elgharib et.al. 2107.03109 null
2021-06-08 LipSync3D: Data-Efficient Learning of Personalized 3D Talking Faces from Video using Pose and Lighting Normalization Avisek Lahiri et.al. 2106.04185 null
2021-05-20 Audio-Driven Emotional Video Portraits Xinya Ji et.al. 2104.07452 null
2021-05-07 Write-a-speaker: Text-based Emotional and Rhythmic Talking-head Generation Lincheng Li et.al. 2104.07995 link
2021-05-05 A Neural Lip-Sync Framework for Synthesizing Photorealistic Virtual News Anchors Ruobing Zheng et.al. 2002.08700 null
2021-04-29 Learned Spatial Representations for Few-shot Talking-Head Synthesis Moustafa Meshry et.al. 2104.14557 null
2021-04-26 One-shot Face Reenactment Using Appearance Adaptive Normalization Guangming Yao et.al. 2102.03984 null
2021-04-25 3D-TalkEmo: Learning to Synthesize 3D Emotional Talking Head Qianyun Wang et.al. 2104.12051 null
2021-04-22 Pose-Controllable Talking Face Generation by Implicitly Modularized Audio-Visual Representation Hang Zhou et.al. 2104.11116 link
2021-04-07 Single Source One Shot Reenactment using Weighted motion From Paired Feature Points Soumya Tripathy et.al. 2104.03117 null
2021-04-07 Everything’s Talkin’: Pareidolia Face Reenactment Linsen Song et.al. 2104.03061 link
2021-04-07 LI-Net: Large-Pose Identity-Preserving Face Reenactment Network Jin Liu et.al. 2104.02850 null
2021-04-02 One-Shot Free-View Neural Talking-Head Synthesis for Video Conferencing Ting-Chun Wang et.al. 2011.15126 null
2021-03-20 Not made for each other- Audio-Visual Dissonance-based Deepfake Detection and Localization Komal Chugh et.al. 2005.14405 link
2021-03-19 End-to-End Lip Synchronisation Based on Pattern Classification You Jin Kim et.al. 2005.08606 null
2021-03-05 Real-time RGBD-based Extended Body Pose Estimation Renat Bashirov et.al. 2103.03663 link
2021-03-03 Estimating Uniqueness of I-Vector Representation of Human Voice Erkam Sinan Tandogan et.al. 2008.11985 null
2021-02-25 MakeItTalk: Speaker-Aware Talking-Head Animation Yang Zhou et.al. 2004.12992 null
2021-02-19 One Shot Audio to Animated Video Generation Neeraj Kumar et.al. 2102.09737 null
2021-02-18 AudioVisual Speech Synthesis: A brief literature review Efthymios Georgiou et.al. 2103.03927 null
2020-12-14 Robust One Shot Audio to Video Generation Neeraj Kumar et.al. 2012.07842 null
2020-12-14 Multi Modal Adaptive Normalization for Audio to Video Generation Neeraj Kumar et.al. 2012.07304 null
2020-11-30 Adaptive Compact Attention For Few-shot Video-to-video Translation Risheng Huang et.al. 2011.14695 null
2020-11-21 Stochastic Talking Face Generation Using Latent Distribution Matching Ravindra Yadav et.al. 2011.10727 link
2020-11-21 Iterative Text-based Editing of Talking-heads Using Neural Retargeting Xinwei Yao et.al. 2011.10688 null
2020-11-09 FACEGAN: Facial Attribute Controllable rEenactment GAN Soumya Tripathy et.al. 2011.04439 null
2020-11-06 Large-scale multilingual audio visual dubbing Yi Yang et.al. 2011.03530 null
2020-11-02 Facial Keypoint Sequence Generation from Audio Prateek Manocha et.al. 2011.01114 null
2020-10-25 APB2FaceV2: Real-Time Audio-Guided Multi-Face Reenactment Jiangning Zhang et.al. 2010.13017 link
2020-10-12 Intuitive Facial Animation Editing Based On A Generative RNN Framework Eloïse Berson et.al. 2010.05655 null
2020-10-05 SMILE: Semantically-guided Multi-attribute Image and Layout Editing Andrés Romero et.al. 2010.02315 link
2020-10-05 Dynamic Facial Asset and Rig Generation from a Single Scan Jiaman Li et.al. 2010.00560 null
2020-09-20 An Improved Approach of Intention Discovery with Machine Learning for POMDP-based Dialogue Management Ruturaj Raval et.al. 2009.09354 null
2020-09-18 Mesh Guided One-shot Face Reenactment using Graph Convolutional Networks Guangming Yao et.al. 2008.07783 null
2020-09-12 DualLip: A System for Joint Lip Reading and Generation Weicong Chen et.al. 2009.05784 null
2020-09-02 Seeing wake words: Audio-visual Keyword Spotting Liliane Momeni et.al. 2009.01225 null
2020-08-29 “It took me almost 30 minutes to practice this”. Performance and Production Practices in Dance Challenge Videos on TikTok Daniel Klug et.al. 2008.13040 null
2020-08-23 A Lip Sync Expert Is All You Need for Speech to Lip Generation In The Wild K R Prajwal et.al. 2008.10010 link
2020-08-11 Audio- and Gaze-driven Facial Animation of Codec Avatars Alexander Richard et.al. 2008.05023 null
2020-08-04 Speaker dependent acoustic-to-articulatory inversion using real-time MRI of the vocal tract Tamás Gábor Csapó et.al. 2008.02098 link
2020-08-04 Real-Time Cleaning and Refinement of Facial Animation Signals Eloïse Berson et.al. 2008.01332 null
2020-08-02 Deep Multi-modality Soft-decoding of Very Low Bit-rate Face Videos Yanhui Guo et.al. 2008.01652 null
2020-07-29 Neural Voice Puppetry: Audio-driven Facial Reenactment Justus Thies et.al. 1912.05566 link
2020-07-20 Deformable Style Transfer Sunnie S. Y. Kim et.al. 2003.11038 link
2020-07-18 A Robust Interactive Facial Animation Editing System Eloïse Berson et.al. 2007.09367 null
2020-07-16 Talking-head Generation with Rhythmic Head Motion Lele Chen et.al. 2007.08547 link
2020-07-08 Learning Speech Representations from Raw Audio by Joint Audiovisual Self-Supervision Abhinav Shukla et.al. 2007.04134 null
2020-06-20 Speaker Independent and Multilingual/Mixlingual Speech-Driven Talking Head Generation Using Phonetic Posteriorgrams Huirong Huang et.al. 2006.11610 null
2020-05-27 Modality Dropout for Improved Performance-driven Talking Faces Ahmed Hussen Abdelaziz et.al. 2005.13616 null
2020-05-25 Identity-Preserving Realistic Talking Face Generation Sanjana Sinha et.al. 2005.12318 null
2020-05-22 Head2Head: Video-based Neural Head Synthesis Mohammad Rami Koujan et.al. 2005.10954 null
2020-05-16 FReeNet: Multi-Identity Face Reenactment Jiangning Zhang et.al. 1905.11805 null
2020-05-13 FaR-GAN for One-Shot Face Reenactment Hanxiang Hao et.al. 2005.06402 null
2020-05-13 Arbitrary Talking Face Generation via Attentional Audio-Visual Coherence Learning Hao Zhu et.al. 1812.06589 null
2020-05-11 Dancing to the Partisan Beat: A First Analysis of Political Communication on TikTok Juan Carlos Medina Serrano et.al. 2004.05478 link
2020-05-07 What comprises a good talking-head video generation?: A Survey and Benchmark Lele Chen et.al. 2005.03201 link
2020-05-04 Disentangled Speech Embeddings using Cross-modal Self-supervision Arsha Nagrani et.al. 2002.08742 null
2020-04-30 APB2Face: Audio-guided face reenactment with auxiliary pose and blink signals Jiangning Zhang et.al. 2004.14569 null
2020-03-30 ActGAN: Flexible and Efficient One-shot Face Reenactment Ivan Kosarevych et.al. 2003.13840 null
2020-03-29 Realistic Face Reenactment via Self-Supervised Disentangling of Identity and Pose Xianfang Zeng et.al. 2003.12957 null
2020-03-26 High-Accuracy Facial Depth Models derived from 3D Synthetic Data Faisal Khan et.al. 2003.06211 null
2020-03-05 Talking-Heads Attention Noam Shazeer et.al. 2003.02436 link
2020-03-05 Audio-driven Talking Face Video Generation with Learning-based Personalized Head Pose Ran Yi et.al. 2002.10137 link
2020-03-01 Towards Automatic Face-to-Face Translation Prajwal K R et.al. 2003.00418 link
2020-02-19 Speech-driven facial animation using polynomial fusion of features Triantafyllos Kefalas et.al. 1912.05833 null
2020-01-17 ICface: Interpretable and Controllable Face Reenactment Using GANs Soumya Tripathy et.al. 1904.01909 null
2019-12-20 Disentangling Style and Content in Anime Illustrations Sitao Xiang et.al. 1905.10742 null
2019-11-21 FLNet: Landmark Driven Fetching and Learning Network for Faithful Talking Facial Animation Synthesis Kuangxiao Gu et.al. 1911.09224 null
2019-11-19 MarioNETte: Few-shot Face Reenactment Preserving Identity of Unseen Targets Sungjoo Ha et.al. 1911.08139 null
2019-10-28 Few-shot Video-to-Video Synthesis Ting-Chun Wang et.al. 1910.12713 null
2019-10-19 Real-Time Lip Sync for Live 2D Animation Deepali Aneja et.al. 1910.08685 link
2019-10-16 Designing Style Matching Conversational Agents Deepali Aneja et.al. 1910.07514 null
2019-10-15 A High-Fidelity Open Embodied Avatar with Lip Syncing and Expression Capabilities Deepali Aneja et.al. 1909.08766 link
2019-10-09 EmoCo: Visual Analysis of Emotion Coherence in Presentation Videos Haipeng Zeng et.al. 1907.12918 null
2019-10-02 Animating Face using Disentangled Audio Representations Gaurav Mittal et.al. 1910.00726 null
2019-09-25 Few-Shot Adversarial Learning of Realistic Neural Talking Head Models Egor Zakharov et.al. 1905.08233 null
2019-09-06 Neural Style-Preserving Visual Dubbing Hyeongwoo Kim et.al. 1909.02518 null
2019-08-29 3D Face Pose and Animation Tracking via Eigen-Decomposition based Bayesian Approach Ngoc-Trung Tran et.al. 1908.11039 null
2019-08-20 Prosodic Phrase Alignment for Machine Dubbing Alp Öktem et.al. 1908.07226 link
2019-08-16 FSGAN: Subject Agnostic Face Swapping and Reenactment Yuval Nirkin et.al. 1908.05932 link
2019-08-11 Emotion Dependent Facial Animation from Affective Speech Rizwan Sadiq et.al. 1908.03904 null
2019-08-05 One-shot Face Reenactment Yunxuan Zhang et.al. 1908.03251 link
2019-07-25 Talking Face Generation by Conditional Recurrent Adversarial Network Yang Song et.al. 1804.04786 link
2019-07-24 Data-Driven Physical Face Inversion Yeara Kozlov et.al. 1907.10402 null
2019-07-23 A system for efficient 3D printed stop-motion face animation Rinat Abdrashitov et.al. 1907.10163 null
2019-06-14 Realistic Speech-Driven Facial Animation with GANs Konstantinos Vougioukas et.al. 1906.06337 null
2019-06-04 Text-based Editing of Talking-head Video Ohad Fried et.al. 1906.01524 null
2019-05-27 Audio2Face: Generating Speech/Face Animation from Single Audio with Attention-Based Bidirectional LSTM Networks Guanzhong Tian et.al. 1905.11142 null
2019-05-09 Hierarchical Cross-Modal Talking Face Generationwith Dynamic Pixel-Wise Loss Lele Chen et.al. 1905.03820 link
2019-05-08 Capture, Learning, and Synthesis of 3D Speaking Styles Daniel Cudeiro et.al. 1905.03079 link
2019-04-23 Talking Face Generation by Adversarially Disentangled Audio-Visual Representation Hang Zhou et.al. 1807.07860 null
2019-04-02 FEAFA: A Well-Annotated Dataset for Facial Expression Analysis and 3D Facial Animation Yanfu Yan et.al. 1904.01509 null
2019-03-13 Animating an Autonomous 3D Talking Avatar Dominik Borer et.al. 1903.05448 null
2018-12-22 Deep Audio-Visual Speech Recognition Triantafyllos Afouras et.al. 1809.02108 null
2018-12-20 DeepFakes: a New Threat to Face Recognition? Assessment and Detection Pavel Korshunov et.al. 1812.08685 null
2018-11-22 Towards Highly Accurate and Stable Face Alignment for High-Resolution Videos Ying Tai et.al. 1811.00342 link
2018-11-16 Influence of visual cues on head and eye movements during listening tasks in multi-talker audiovisual environments with animated characters Maartje M. E. Hendrikse et.al. 1812.02088 null
2018-08-28 GANimation: Anatomically-aware Facial Animation from a Single Image Albert Pumarola et.al. 1807.09251 link
2018-08-19 Dynamic Temporal Alignment of Speech to Lips Tavi Halperin et.al. 1808.06250 link
2018-07-29 ReenactGAN: Learning to Reenact Faces via Boundary Transfer Wayne Wu et.al. 1807.11079 link
2018-07-26 Learnable PINs: Cross-Modal Embeddings for Person Identity Arsha Nagrani et.al. 1805.00833 null
2018-07-19 End-to-End Speech-Driven Facial Animation with Temporal GANs Konstantinos Vougioukas et.al. 1805.09313 null
2018-05-29 Deep Video Portraits Hyeongwoo Kim et.al. 1805.11714 null
2018-05-24 VisemeNet: Audio-Driven Animator-Centric Speech Animation Yang Zhou et.al. 1805.09488 null
2018-05-21 Anime Style Space Exploration Using Metric Learning and Generative Adversarial Networks Sitao Xiang et.al. 1805.07997 null
2018-04-23 Generating Talking Face Landmarks from Speech Sefik Emre Eskimez et.al. 1803.09803 null
2018-03-28 Generative Adversarial Talking Head: Bringing Portraits to Life with a Weakly Supervised Neural Network Hai X. Pham et.al. 1803.07716 null
2018-03-20 Speech-Driven Facial Reenactment Using Conditional Generative Adversarial Networks Seyed Ali Jalalifar et.al. 1803.07461 null
2017-12-07 End-to-end Learning for 3D Facial Animation from Raw Waveforms of Speech Hai X. Pham et.al. 1710.00920 null
2017-12-06 ObamaNet: Photo-realistic lip-sync from text Rithesh Kumar et.al. 1801.01442 null
2017-07-30 Kernel Projection of Latent Structures Regression for Facial Animation Retargeting Christos Ouzounis et.al. 1707.09629 null
2017-07-26 Fast Deep Matting for Portrait Animation on Mobile Phone Bingke Zhu et.al. 1707.08289 null
2017-07-21 Multichannel Attention Network for Analyzing Visual Behavior in Public Speaking Rahul Sharma et.al. 1707.06830 null
2017-07-18 You said that? Joon Son Chung et.al. 1705.02966 null
2017-01-30 Lip Reading Sentences in the Wild Joon Son Chung et.al. 1611.05358 link
2016-10-28 Galaxy gas as obscurer: II. Separating the galaxy-scale and nuclear obscurers of Active Galactic Nuclei Johannes Buchner et.al. 1610.09380 link
2016-07-11 Large-Scale MIMO is Capable of Eliminating Power-Thirsty Channel Coding for Wireless Transmission of HEVC/H.265 Video Shaoshi Yang et.al. 1601.06684 null
2016-05-22 Improving Facial Analysis and Performance Driven Animation through Disentangling Identity and Expression David Rim et.al. 1512.08212 null
2016-02-08 Automatic Face Reenactment Pablo Garrido et.al. 1602.02651 null
2015-11-20 ExpressionBot: An Emotive Lifelike Robotic Face for Face-to-Face Communication Ali Mollahosseini et.al. 1511.06502 null
2014-09-03 Visual Speech Recognition Ahmad B. A. Hassanat et.al. 1409.1411 null
2012-09-22 Using multimodal speech production data to evaluate articulatory animation for audiovisual speech synthesis Ingmar Steiner et.al. 1209.4982 null
2012-03-30 Face Expression Recognition and Analysis: The State of the Art Vinay Bettadapura et.al. 1203.6722 null
2012-01-19 Progress in animation of an EMA-controlled tongue model for acoustic-visual speech synthesis Ingmar Steiner et.al. 1201.4080 null
2010-03-01 Re-verification of a Lip Synchronization Protocol using Robust Reachability Piotr Kordy et.al. 1003.0431 null

Image Animation

Publish Date Title Authors PDF Code
2025-09-19 TT-DF: A Large-Scale Diffusion-Based Dataset and Benchmark for Human Body Forgery Detection Wenkui Yang et.al. 2505.08437 null
2025-09-09 LINR Bridge: Vector Graphic Animation via Neural Implicits and Video Diffusion Priors Wenshuo Gao et.al. 2509.07484 null
2025-08-23 AnimateAnywhere: Rouse the Background in Human Image Animation Xiaoyu Liu et.al. 2504.19834 null
2025-08-13 Animate-X++: Universal Character Image Animation with Dynamic Backgrounds Shuai Tan et.al. 2508.09454 null
2025-08-10 Consistent and Controllable Image Animation with Motion Linear Diffusion Transformers Xin Ma et.al. 2508.07246 null
2025-08-01 FLOAT: Generative Motion Latent Flow Matching for Audio-driven Talking Portrait Taekyung Ki et.al. 2412.01064 null
2025-07-20 StableAnimator++: Overcoming Pose Misalignment and Face Distortion for Human Image Animation Shuyuan Tu et.al. 2507.15064 null
2025-07-11 X-Dancer: Expressive Music to Human Dance Video Generation Zeyuan Chen et.al. 2502.17414 null
2025-07-01 DAM-VSR: Disentanglement of Appearance and Motion for Video Super-Resolution Zhe Kong et.al. 2507.01012 null
2025-06-09 Efficient Long-duration Talking Video Synthesis with Linear Diffusion Transformer under Multimodal Guidance Haojie Zhang et.al. 2411.16748 null
2025-05-30 MTVCrafter: 4D Motion Tokenization for Open-World Human Image Animation Yanbo Ding et.al. 2505.10238 null
2025-05-29 HyperMotion: DiT-Based Pose-Guided Human Image Animation of Complex Motions Shuolin Xu et.al. 2505.22977 null
2025-05-24 EvAnimate: Event-conditioned Image-to-Video Generation for Human Animation Qiang Qu et.al. 2503.18552 null
2025-05-18 DynamiCtrl: Rethinking the Basic Structure and the Role of Text for High-quality Human Image Animation Haoyu Zhao et.al. 2503.21246 null
2025-04-20 DreamActor-M1: Holistic, Expressive and Robust Human Image Animation with Hybrid Guidance Yuxuan Luo et.al. 2504.01724 null
2025-04-15 UniAnimate-DiT: Human Image Animation with Large-Scale Video Diffusion Transformer Xiang Wang et.al. 2504.11289 null
2025-04-15 Taming Consistency Distillation for Accelerated Human Image Animation Xiang Wang et.al. 2504.11143 null
2025-04-14 TPC: Test-time Procrustes Calibration for Diffusion-based Human Image Animation Sunjae Yoon et.al. 2410.24037 null
2025-04-05 Multi-identity Human Image Animation with Structural Video Diffusion Zhenzhi Wang et.al. 2504.04126 null
2025-04-04 Optimizing 4D Gaussians for Dynamic Scene Video from Single Landscape Images In-Hwan Jin et.al. 2504.05458 null
2025-04-01 VFX Creator: Animated Visual Effect Generation with Controllable Diffusion Transformer Xinyu Liu et.al. 2502.05979 null
2025-03-23 MotiF: Making Text Count in Image Animation with Motion Focal Loss Shijie Wang et.al. 2412.16153 null
2025-03-13 Hallo3: Highly Dynamic and Realistic Portrait Image Animation with Video Diffusion Transformer Jiahao Cui et.al. 2412.00733 null
2025-03-10 Perception-as-Control: Fine-grained Controllable Image Animation with 3D-aware Motion Representation Yingjie Chen et.al. 2501.05020 null
2025-03-10 VidStyleODE: Disentangled Video Editing via StyleGAN and NeuralODEs Moayed Haji Ali et.al. 2304.06020 null
2025-03-01 Towards Multiple Character Image Animation Through Enhancing Implicit Decoupling Jingyun Xue et.al. 2406.03035 null
2025-02-25 DisPose: Disentangling Pose Guidance for Controllable Human Image Animation Hongxiang Li et.al. 2412.09349 null
2025-02-24 Dormant: Defending against Pose-driven Human Image Animation Jiachen Zhou et.al. 2409.14424 null
2025-02-15 SkyReels-A1: Expressive Portrait Animation in Video Diffusion Transformers Di Qiu et.al. 2502.10841 null
2025-02-10 Animate Anyone 2: High-Fidelity Character Image Animation with Environment Affordance Li Hu et.al. 2502.06145 null
2025-02-06 MotionCanvas: Cinematic Shot Design with Controllable Image-to-Video Generation Jinbo Xing et.al. 2502.04299 null
2025-01-30 Every Image Listens, Every Image Dances: Music-Driven Image Animation Zhikang Dong et.al. 2501.18801 null
2025-01-20 X-Dyna: Expressive Dynamic Human Image Animation Di Chang et.al. 2501.10021 null
2025-01-15 Joint Learning of Depth and Appearance for Portrait Image Animation Xinya Ji et.al. 2501.08649 null
2024-12-11 Animate-X: Universal Character Image Animation with Enhanced Motion Representation Shuai Tan et.al. 2410.10306 null
2024-11-30 DreamDance: Animating Human Images by Enriching 3D Geometry Cues from 2D Poses Yatian Pang et.al. 2412.00397 null
2024-11-28 JoyVASA: Portrait and Animal Image Animation with Diffusion-Based Audio-Driven Facial Dynamics and Head Motion Generation Xuyang Cao et.al. 2411.09209 null
2024-11-27 StableAnimator: High-Quality Identity-Preserving Human Image Animation Shuyuan Tu et.al. 2411.17697 null
2024-11-21 HumanVid: Demystifying Training Data for Camera-controllable Human Image Animation Zhenzhi Wang et.al. 2407.17438 null
2024-11-12 LEO: Generative Latent Image Animator for Human Video Synthesis Yaohui Wang et.al. 2305.03989 null
2024-10-20 FrameBridge: Improving Image-to-Video Generation with Bridge Models Yuji Wang et.al. 2410.15371 null
2024-10-14 Hallo2: Long-Duration and High-Resolution Audio-Driven Portrait Image Animation Jiahao Cui et.al. 2410.07718 null
2024-09-30 Illustrious: an Open Advanced Illustration Model Sang Hyun Park et.al. 2409.19946 null
2024-09-29 High Quality Human Image Animation using Regional Supervision and Motion Blur Condition Zhongcong Xu et.al. 2409.19580 null
2024-07-23 Cinemo: Consistent and Controllable Image Animation with Motion Diffusion Models Xin Ma et.al. 2407.15642 null
2024-07-17 Audio-Synchronized Visual Animation Lin Zhang et.al. 2403.05659 null
2024-07-12 TCAN: Animating Human Images with Temporally Consistent Pose Guidance using Diffusion Models Jeongho Kim et.al. 2407.09012 null
2024-07-12 EchoMimic: Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditions Zhiyuan Chen et.al. 2407.08136 null
2024-07-11 MOFA-Video: Controllable Image Animation via Generative Motion Field Adaptions in Frozen Image-to-Video Diffusion Model Muyao Niu et.al. 2405.20222 null
2024-06-16 Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation Mingwang Xu et.al. 2406.08801 null
2024-06-03 UniAnimate: Taming Unified Video Diffusion Models for Consistent Human Image Animation Xiang Wang et.al. 2406.01188 null
2024-06-01 Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance Shenhao Zhu et.al. 2403.14781 null
2024-05-29 Evaluating the efectiveness of sonifcation in science education using Edukoi Lucrezia Guiotto Nai Fovino et.al. 2405.18908 null
2024-05-28 VividPose: Advancing Stable Video Diffusion for Realistic Human Image Animation Qilin Wang et.al. 2405.18156 null
2024-05-28 Controllable Longer Image Animation with Diffusion Models Qiang Wang et.al. 2405.17306 null
2024-05-20 Dynamic modeling of a sliding ring on an elastic rod with incremental potential formulation Weicheng Huang et.al. 2208.01238 null
2024-03-25 PIA: Your Personalized Image Animator via Plug-and-Play Modules in Text-to-Image Models Yiming Zhang et.al. 2312.13964 null
2024-03-13 Follow-Your-Click: Open-domain Regional Image Animation via Short Prompts Yue Ma et.al. 2403.08268 null
2024-03-05 Tuning-Free Noise Rectification for High Fidelity Image-to-Video Generation Weijie Li et.al. 2403.02827 null
2024-01-17 Continuous Piecewise-Affine Based Motion Model for Image Animation Hexiang Wang et.al. 2401.09146 null
2024-01-03 Moonshot: Towards Controllable Video Generation and Editing with Multimodal Conditions David Junhao Zhang et.al. 2401.01827 null
2023-12-06 AnimateZero: Video Diffusion Models are Zero-Shot Image Animators Jiwen Yu et.al. 2312.03793 null
2023-12-05 LivePhoto: Real Image Animation with Text-guided Motion Control Xi Chen et.al. 2312.02928 null
2023-12-04 AnimateAnything: Fine-Grained Open Domain Image Animation with Motion Guidance Zuozhuo Dai et.al. 2311.12886 null
2023-11-30 Motion-Conditioned Image Animation for Video Editing Wilson Yan et.al. 2311.18827 null
2023-11-27 MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model Zhongcong Xu et.al. 2311.16498 null
2023-11-27 DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors Jinbo Xing et.al. 2310.12190 null
2023-11-19 Differential Motion Evolution for Fine-Grained Motion Deformation in Unsupervised Image Animation Peirong Liu et.al. 2110.04658 null
2023-10-16 LAMP: Learn A Motion Pattern for Few-Shot-Based Video Generation Ruiqi Wu et.al. 2310.10769 null
2023-09-26 Text-Guided Synthesis of Eulerian Cinemagraphs Aniruddha Mahapatra et.al. 2307.03190 null
2023-09-25 Automatic Animation of Hair Blowing in Still Portrait Photos Wenpeng Xiao et.al. 2309.14207 null
2023-07-10 AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning Yuwei Guo et.al. 2307.04725 link
2023-07-09 Predictive Coding For Animation-Based Video Compression Goluck Konuko et.al. 2307.04187 null
2023-03-10 3D Cinemagraphy from a Single Image Xingyi Li et.al. 2303.05724 null
2023-02-02 Dreamix: Video Diffusion Models are General Video Editors Eyal Molad et.al. 2302.01329 null
2023-01-14 Continuous odor profile monitoring to study olfactory navigation in small animals Kevin S. Chen et.al. 2301.05905 null
2022-11-30 NeRFInvertor: High Fidelity NeRF-GAN Inversion for Single-shot Real Image Animation Yu Yin et.al. 2211.17235 null
2022-10-04 Implicit Warping for Animation with Image Sets Arun Mallya et.al. 2210.01794 null
2022-09-28 Motion Transformer for Unsupervised Image Animation Jiale Tao et.al. 2209.14024 null
2022-07-19 Single Stage Virtual Try-on via Deformable Attention Flows Shuai Bai et.al. 2207.09161 null
2022-07-08 Jointly Harnessing Prior Structures and Temporal Consistency for Sign Language Video Generation Yucheng Suo et.al. 2207.03714 null
2022-06-11 Bayesian Statistics Guided Label Refurbishment Mechanism: Mitigating Label Noise in Medical Image Classification Mengdi Gao et.al. 2106.12284 null
2022-04-05 Neural Fields in Visual Computing and Beyond Yiheng Xie et.al. 2111.11426 null
2022-03-29 Thin-Plate Spline Motion Model for Image Animation Jian Zhao et.al. 2203.14367 null
2022-03-29 Image Animation with Perturbed Masks Yoav Shalev et.al. 2011.06922 null
2022-03-25 3D GAN Inversion for Controllable Portrait Image Animation Connor Z. Lin et.al. 2203.13441 null
2022-03-17 Latent Image Animator: Learning to Animate Images via Latent Space Navigation Yaohui Wang et.al. 2203.09043 null
2021-12-21 Image Animation with Keypoint Mask Or Toledano et.al. 2112.10457 null
2021-12-19 Move As You Like: Image Animation in E-Commerce Scenario Borun Xu et.al. 2112.13647 null
2021-12-17 AI-Empowered Persuasive Video Generation: A Survey Chang Liu et.al. 2112.09401 null
2021-10-26 Incremental Learning for Animal Pose Estimation using RBF k-DPP Gaurav Kumar Nayak et.al. 2110.13598 null
2021-09-03 Sparse to Dense Motion Transfer for Face Image Animation Ruiqi Zhao et.al. 2109.00471 null
2021-08-18 DeepFake MNIST+: A DeepFake Facial Animation Dataset Jiajun Huang et.al. 2108.07949 null
2021-06-23 Analisis Kualitas Layanan Website E-Commerce Bukalapak Terhadap Kepuasan Pengguna Mahasiswa Universitas Bina Darma Menggunakan Metode Webqual 4.0 Adellia et.al. 2106.15342 null
2021-04-07 Single Source One Shot Reenactment using Weighted motion From Paired Feature Points Soumya Tripathy et.al. 2104.03117 null
2021-03-22 PriorityCut: Occlusion-guided Regularization for Warp-based Image Animation Wai Ting Cheung et.al. 2103.11600 null
2020-12-01 Ultra-low bitrate video conferencing using deep image animation Goluck Konuko et.al. 2012.00346 null
2020-10-01 First Order Motion Model for Image Animation Aliaksandr Siarohin et.al. 2003.00196 null
2020-08-27 Deep Spatial Transformation for Pose-Guided Person Image Generation and Animation Yurui Ren et.al. 2008.12606 null
2019-08-30 Animating Arbitrary Objects via Deep Motion Transfer Aliaksandr Siarohin et.al. 1812.08861 null
2018-10-09 3D model silhouette-based tracking in depth images for puppet suit dynamic video-mapping Guillaume Caron et.al. 1810.03956 null
2018-06-24 A Design of FPGA Based Small Animal PET Real Time Digital Signal Processing and Correction Logic Jiaming Lu et.al. 1806.09117 null
2018-01-31 RAPTOR I: Time-dependent radiative transfer in arbitrary spacetimes Thomas Bronzwaer et.al. 1801.10452 null
2016-06-23 Gender and Interest Targeting for Sponsored Post Advertising at Tumblr Mihajlo Grbovic et.al. 1606.07189 null
2015-03-16 Use of Effective Audio in E-learning Courseware Kisor Ray et.al. 1503.04837 null
2015-02-04 Multimedia-Video for Learning Kah Hean Chua et.al. 1502.01090 null
2013-01-25 Measurements of Martian Dust Devil Winds with HiRISE David S. Choi et.al. 1301.6130 null
2010-01-04 Tutoring System for Dance Learning Rajkumar Kannan et.al. 1001.0440 null

Notes:

Function added: