This AI system could make lip-sync dubbing accurate
Toronto: Dodgy lip-sync dubbing could soon become a thing of the past, as researchers have developed an Artificial Intelligence (AI)-based system that can edit the facial expressions of actors to accurately match dubbed voices.
The system, called Deep Video Portraits, can also be used to correct gaze and head pose in video conferencing, and enables new possibilities for video post-production and visual effects, according to the research presented at the SIGGRAPH 2018 conference in Vancouver, Canada.
"This technique could also be used for post-production in the film industry where computer graphics editing of faces is already widely used in today's feature films," said study co-author Christian Richardt from the University of Bath in Britain.
The researchers believe that the new system could help the film industry save time and reduce post-production costs.
Unlike previous methods, which focus only on movements of the face interior, Deep Video Portraits can animate the whole face, including eyes, eyebrows, and head position, in videos, using controls known from computer graphics face animation.
It can even synthesise a plausible static video background if the head is moved around.
"It works by using model-based 3D face performance capture to record the detailed movements of the eyebrows, mouth, nose, and head position of the dubbing actor in a video," said one of the researchers, Hyeongwoo Kim, from the Max Planck Institute for Informatics in Germany.
"It then transposes these movements onto the 'target' actor in the film to accurately sync the lips and facial movements with the new audio," Hyeongwoo added.
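The transfer described above can be pictured as recombining per-frame face parameters: the target actor's identity and scene lighting are kept, while head pose, expression, and eye gaze are taken from the dubbing actor. The sketch below is a conceptual illustration only; the parameter names and dimensions are hypothetical and not taken from the researchers' code.

```python
import numpy as np

def transfer_parameters(source, target):
    """Recombine per-frame face parameters: keep the target actor's
    identity and scene lighting, but drive head pose, expression, and
    eye gaze from the dubbing actor (the source). Illustrative only."""
    return {
        "identity":   target["identity"],    # target's face shape is preserved
        "lighting":   target["lighting"],    # target's scene illumination is preserved
        "pose":       source["pose"],        # head rotation/translation from the source
        "expression": source["expression"],  # mouth, eyebrow movements from the source
        "gaze":       source["gaze"],        # eye direction from the source
    }

# Toy example with random parameter vectors (dimensions are arbitrary).
rng = np.random.default_rng(0)
keys = ("identity", "lighting", "pose", "expression", "gaze")
source = {k: rng.standard_normal(4) for k in keys}
target = {k: rng.standard_normal(4) for k in keys}

mixed = transfer_parameters(source, target)
```

A renderer conditioned on `mixed` would then synthesise the target actor's face with the source's lip and head motion, which is the effect described in the quotes above.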
The research is currently at the proof-of-concept stage and does not yet run in real time; however, the researchers anticipate that the approach could make a real difference to the visual entertainment industry.
"Despite extensive post-production manipulation, dubbing films into foreign languages always presents a mismatch between the actor on screen and the dubbed voice," Professor Christian Theobalt from the Max Planck Institute for Informatics said.
"Our new Deep Video Portrait approach enables us to modify the appearance of a target actor by transferring head pose, facial expressions, and eye motion with a high level of realism," Theobalt added.