Tencent has unveiled Hunyuan Video, a free and open-source AI video generator, strategically released during OpenAI’s 12-day announcement campaign, during which OpenAI is expected to introduce Sora, its long-awaited video tool.
“We present Hunyuan Video, a new open-source video base model that delivers top-quality performance comparable to, or better than, leading closed models,” Tencent said in the official announcement, as reported by Decrypt.
The Shenzhen, China-based tech company claims its model “outperforms” Runway Gen-3, Luma 1.6, and “three top models from China” based on professional human evaluations. Hunyuan Video uses a decoder-only multimodal large language model as its text encoder instead of the usual CLIP and T5-XXL combination used by other AI video and image generators.
No additional training required
According to Tencent, this helps the model follow instructions better, understand image details more accurately, and learn new tasks on the fly without additional training. In addition, its causal-attention system gets a boost from a special token refiner, allowing it to understand prompts more thoroughly than traditional models.
The model also rewrites prompts to make them richer and improve generation quality. For example, a simple prompt such as “A man walking his dog” can be expanded with details about the scene, lighting conditions, quality, and more.
Freely available to all
Like Meta’s Llama 3, Hunyuan is free to use and monetize until you reach the 100-million-user mark – a threshold most developers are unlikely ever to hit.
The prerequisite? You need a powerful computer with at least 60GB of GPU memory to run the 13-billion-parameter model locally – think Nvidia H800 or H20 cards. That’s more video memory than most gaming PCs have.
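As a rough sanity check on that figure: the weights of a 13-billion-parameter model at 16-bit precision alone occupy about 26GB, before activations, attention caches, and video latents are counted. A back-of-the-envelope sketch (the 60GB minimum and parameter count come from the text; the 16-bit precision is our assumption):

```python
# Back-of-the-envelope VRAM estimate for a 13B-parameter model.
# Assumption: weights stored at 16-bit precision (2 bytes per parameter).
PARAMS = 13e9
BYTES_PER_PARAM = 2  # fp16 / bf16

weights_gb = PARAMS * BYTES_PER_PARAM / 1e9
print(f"Weights alone: ~{weights_gb:.0f} GB")

# Headroom left under Tencent's stated 60GB minimum goes to activations,
# attention caches, and video latent buffers during sampling.
headroom_gb = 60 - weights_gb
print(f"Headroom at 60 GB: ~{headroom_gb:.0f} GB")
```

This is why consumer cards with 24GB of VRAM fall well short even before any generation begins.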
For users without a supercomputer, cloud services already offer support. Platforms such as FAL.ai have integrated Hunyuan and charge $0.50 per video. Other providers, such as Replicate or GoEnhance, also offer access to the model. The official Hunyuan Video server offers 150 credits for $10, with a video costing a minimum of 15 credits.
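Those prices can be put on a common footing using only the numbers quoted above; a quick sketch (per-video costs vary with length and resolution, so treat these as floor prices):

```python
# Minimum per-video cost on the official Hunyuan Video server:
# 150 credits cost $10, and one video costs at least 15 credits.
credits_per_dollar = 150 / 10                  # 15 credits per dollar
min_cost_official = 15 / credits_per_dollar    # dollars per cheapest video
print(f"Official server: at least ${min_cost_official:.2f} per video")

# FAL.ai's quoted flat rate from the text, for comparison.
cost_fal = 0.50
print(f"FAL.ai: ${cost_fal:.2f} per video")
```

At the minimum credit cost, the official server works out to about twice FAL.ai’s quoted rate per video.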
Initial tests show that Hunyuan is comparable in quality to commercial heavyweights such as Luma Labs Dream Machine or Kling AI. Generating a video takes about 15 minutes and produces photorealistic scenes with natural-looking movements of people and animals.