Business

Tavus Research Models Phoenix-3, Raven-0, and Hummingbird-0 Redefine Realism and Perception in AI Systems

Six Months After Launch, Tavus' Phoenix-3, Raven-0, and Hummingbird-0 Are Redefining the Future of Human-AI Interaction

San Francisco, CA – Earlier this year, Tavus quietly rolled out a suite of research models that would go on to reshape how the industry thinks about AI avatars and perception: Phoenix-3, a frontier rendering model; Raven-0, the first contextual perception system; and Hummingbird-0, a zero-shot lip-sync engine.

Now these models are powering a new generation of applications for Fortune 500 companies and startups alike, where AI doesn't just look real, it feels real. From conversational video interfaces to multilingual dubbing pipelines, Tavus' technology has kicked off a revolution in how humans interact with AI systems.

The Research Team

Behind these advances is a tightly coordinated team of researchers at Tavus: Damian Willary, Eloi du Bois, Karthik Ragunath Ananda Kumar, Minh Anh Nguyễn, Mustafa Isik, Jack Saunders, Roey Paz-Priel, Mert Gerdan, Chenglai Zhong, Haiyao Xiao, and Ari Korin.

This team’s mix of backgrounds—from rendering pipelines to perception systems to multimodal AI—was the catalyst that made the March release possible.

Phoenix-3: Solving the Uncanny Valley

For years, AI avatars struggled with the "uncanny valley": rendered faces that moved but didn't emote. Phoenix-3 changed that. Using a Gaussian diffusion backbone, it renders full-face animation in real time, capturing blinks, micro-expressions, and emotional nuance. The result is something that feels less like a simulation and more like a person on the other side of the screen.

Raven-0: From Vision to Perception

Most machine vision systems see the world as pixels and categories. Raven-0 treats it as context. It's the first AI perception system that interprets intent, emotion, and subtle cues in real time, an approach that's already proving valuable in healthcare, education, and customer engagement.

Hummingbird-0: Lip Sync That Just Works

Born out of Phoenix-3's development, Hummingbird-0 quickly took on a life of its own. The state-of-the-art model can align audio and video with zero training or fine-tuning, while preserving both identity and realism. For creators, studios, and enterprises, that means faster dubbing, seamless localization, and entirely new workflows for video production.

Six Months of Impact

Since launch, developers have built a wave of new applications on top of Tavus' APIs.

Benchmarks have confirmed what early adopters are seeing in practice: Hummingbird-0 sets a new bar in lip-sync accuracy, visual quality, and identity preservation, while Phoenix-3 has brought real-time rendering to a fidelity level once thought impossible.

A Turning Point for AI Video

"The release of Phoenix-3, Raven-0, and Hummingbird-0 wasn't just about making avatars look real. It was about making them feel present. It's a turning point in how AI connects with people, and it's only the start." – the Tavus Research Team