Sync Labs
Sync Labs (legally Synchronicity Labs Inc.) is a San Francisco-based AI company that builds lip sync and visual dubbing tools for video. The company's API takes any video and any audio track and generates a new version where the speaker's lip movements match the audio, in any language, without per-speaker training.2 Sync Labs is backed by GV, Y Combinator, and Founders, Inc.1
Origins in academic research
The underlying technology came from research at CVIT, the Center for Visual Information Technology at IIIT Hyderabad, starting in 2018.3 Prajwal K R, then a Master's student, and Rudrabha Mukhopadhyay, a PhD student, collaborated on syncing lip movements to arbitrary speech in video. Their work produced Wav2Lip, a model that could match lips to any audio in any language without fine-tuning on each speaker. The paper, "A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild," was published at ACM Multimedia 2020 with co-authors Vinay Namboodiri and C.V. Jawahar.6
The open-source Wav2Lip repository accumulated over 12,900 stars on GitHub.6 Prajwal went on to pursue a doctorate at the University of Oxford's Visual Geometry Group, supervised by Andrew Zisserman, focusing on multimodal learning and video understanding.9 Rudrabha completed his PhD at IIIT Hyderabad in audio-visual deep learning.3
Founding
Prady Modukuru, a product leader who had incubated and scaled AI-powered cybersecurity products at Microsoft, discovered the Wav2Lip research and connected with Prajwal and Rudrabha online.2 In mid-2023, Prady and Pavan Reddy, a two-time venture-backed entrepreneur and IIT Madras alumnus, went full-time on the company.2 Rudrabha and Prajwal joined full-time after completing their respective doctorates.3
The founding team originally incubated the venture within the Center for Innovation and Entrepreneurship at IIIT Hyderabad before incorporating in the United States.3
Funding
Sync Labs was accepted into Y Combinator's Winter 2024 batch, receiving $500K in funding.3
In August 2024, the company closed a $5.5M seed round led by GV (Google Ventures), bringing total funding to approximately $7M. Other investors include Founders, Inc. and Sunset Ventures.54
Technology and products
The company's core approach uses phoneme-to-viseme mapping: it learns how speech sounds (phonemes) correspond to mouth shapes (visemes), then trains generative models to produce realistic video output.7
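To make the idea concrete, the sketch below collapses phonemes into a small set of viseme classes, since phonemes such as /p/, /b/, and /m/ are visually indistinguishable on the lips. The class names and groupings are illustrative assumptions, not Sync Labs' actual mapping.

```python
# Toy phoneme-to-viseme mapping (illustrative; not Sync Labs' classes).
# Visually similar phonemes share a viseme: /p/, /b/, /m/ all close the lips.
PHONEME_TO_VISEME = {
    "p": "BMP", "b": "BMP", "m": "BMP",  # bilabials: lips pressed together
    "f": "FV", "v": "FV",                # labiodentals: lip against teeth
    "ow": "ROUND", "uw": "ROUND",        # rounded vowels
    "aa": "OPEN", "ae": "OPEN",          # open vowels
}

def phonemes_to_visemes(phonemes):
    """Map a phoneme sequence to the mouth-shape sequence a face must show."""
    return [PHONEME_TO_VISEME.get(p, "NEUTRAL") for p in phonemes]

print(phonemes_to_visemes(["m", "aa", "p"]))  # ['BMP', 'OPEN', 'BMP']
```

A generative model trained on such pairs can then render each viseme as photorealistic mouth motion conditioned on the speaker's appearance.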
Sync Labs names its models sequentially, with each generation building on the previous one. lipsync-2 introduced a spatiotemporal transformer architecture with "style preservation," which learns how a specific person speaks by observing them in the input video. It works on live-action footage, animation, and AI-generated content without per-speaker fine-tuning.2 lipsync-2-pro added diffusion-based super-resolution, improving detail on beards, teeth, and skin texture at the cost of 1.5-2x slower processing.8
sync-3 processes video at up to 4K resolution and uses spatial reasoning to understand the full scene context rather than just the mouth region.1 react-1 is a separate product for editing facial expressions and emotions in existing video, with presets for six emotions: happy, sad, angry, disgusted, surprised, and neutral.1
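As a sketch of how those six presets might surface in client code, the enum below names the emotions listed above; the enum itself and the request fields are hypothetical, not the documented react-1 schema.

```python
from enum import Enum

class EmotionPreset(Enum):
    """The six react-1 presets; the names come from Sync Labs' list, but
    this enum and its string values are illustrative, not the official SDK."""
    HAPPY = "happy"
    SAD = "sad"
    ANGRY = "angry"
    DISGUSTED = "disgusted"
    SURPRISED = "surprised"
    NEUTRAL = "neutral"

# Hypothetical request body for a react-1 expression edit.
job = {
    "model": "react-1",
    "video_url": "https://example.com/clip.mp4",  # placeholder input
    "emotion": EmotionPreset.SURPRISED.value,
}
```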
The company offers a RESTful API at api.sync.so with official Python and TypeScript SDKs, an Adobe Premiere Pro plugin, a ComfyUI node, and a web-based Lipsync Studio. Pricing starts at $0.05 per second of generated video, with monthly plans ranging from $5 to custom enterprise tiers.8
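A minimal sketch of submitting a lipsync job over plain HTTP with the requests library rather than an official SDK; the endpoint paths, field names, and polling flow are assumptions for illustration, not the documented schema. The cost comment applies the published $0.05-per-second rate.

```python
import time
import requests

API_KEY = "sk-..."  # issued from the sync.so dashboard
BASE = "https://api.sync.so"  # paths below are assumed, not documented

# Submit a job: one video, one dubbed audio track, one model choice.
resp = requests.post(
    f"{BASE}/v2/generate",
    headers={"x-api-key": API_KEY},
    json={
        "model": "lipsync-2",  # "lipsync-2-pro" trades 1.5-2x speed for detail
        "input": [
            {"type": "video", "url": "https://example.com/talk.mp4"},
            {"type": "audio", "url": "https://example.com/dub_es.wav"},
        ],
    },
)
resp.raise_for_status()
job = resp.json()

# Generation is asynchronous; poll until the job finishes (assumed flow).
while job.get("status") not in ("COMPLETED", "FAILED"):
    time.sleep(5)
    job = requests.get(
        f"{BASE}/v2/generate/{job['id']}",
        headers={"x-api-key": API_KEY},
    ).json()

print(job.get("outputUrl"))

# At $0.05 per generated second, a 3-minute clip costs 180 * 0.05 = $9.00.
```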
The platform also includes proprietary watermarking and verification technology that can detect whether a video has been modified using Sync Labs' tools.1
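If that verification were exposed through the API, a client-side check might look like the sketch below; the endpoint and response field are hypothetical, since the source does not describe the interface.

```python
import requests

# Hypothetical verification call; the endpoint name and response shape are
# illustrative assumptions, not Sync Labs' documented API.
resp = requests.post(
    "https://api.sync.so/v2/verify",  # assumed endpoint
    headers={"x-api-key": "sk-..."},
    json={"video_url": "https://example.com/suspect.mp4"},
)
print(resp.json().get("modified_by_sync"))  # hypothetical boolean field
```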
Adoption
The platform has grown to hundreds of thousands of users through its freemium model.4 Use cases include video translation and dubbing, dialogue replacement in film production, advertising localization, e-learning content, and AI-generated media. As a demonstration, the team dubbed the full two-hour Tucker Carlson–Putin interview into English with synchronized lip movements.7