RIP silent models. LTX-2 natively produces audio and lip-synced dialogue, with leading open-source quality. It generates clips up to 20 seconds long at 4K resolution. Same prompt, different results.
LTX-2 is a complete open-source release for audio-video generation, featuring open weights, full training code, benchmarks, and customization tools.
LTX-2 is designed to run on-device, enabling private workflows, fast iteration, and full control.
Produce up to 20 seconds of high-fidelity video with complete control and consistent style.
Generate synchronized 4K video and audio in seconds with the fastest production-grade AI model available today.
Generate cinematic-grade video with synchronized audio at true 4K / 50 fps.