
Meta has announced Make-A-Video, an AI-powered application that can render a video on demand from existing images or text descriptions.

Make-A-Video is an AI-powered video generator that lets users render videos from image or text prompts, much as image synthesis tools such as Stable Diffusion and DALL-E render still images. Make-A-Video can also make variations of existing videos. The tool isn’t accessible to the public yet.

The tool is also capable of animating static source images. The technology behind it builds on existing text-to-image synthesis work, the same kind used by image generators such as OpenAI’s DALL-E.

Instead of training on videos paired with text descriptions, Meta combines image synthesis data with unlabeled video training data to produce realistic animations. The model predicts how a scene should move and quickly generates a short clip of that motion.


On the official announcement page, Meta posted various examples of videos generated from text. The prompts include phrases like “a teddy bear painting a portrait” and “a young couple walking in heavy rain”.

“Using function-preserving transformations, we extend the spatial layers at the model initialization stage to include temporal information”, Meta said. “The extended spatial-temporal network includes new attention modules that learn temporal world dynamics from a collection of videos.”
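
As a rough illustration of what a function-preserving transformation can look like, the sketch below adds a temporal layer on top of a per-frame image layer and initializes it so that the output is unchanged at the start. This is a toy example under stated assumptions, not Meta's implementation: the names `spatial_layer`, `temporal_conv`, and `identity_temporal_kernel` are hypothetical, and the "image layer" is a stand-in arithmetic function rather than a real network.

```python
def identity_temporal_kernel(size=3):
    """Return a 1D kernel with 1.0 at the center and 0.0 elsewhere."""
    if size % 2 == 0:
        raise ValueError("kernel size must be odd")
    kernel = [0.0] * size
    kernel[size // 2] = 1.0
    return kernel


def temporal_conv(frames, kernel):
    """Mix features across neighboring frames (zero padding at the ends).

    frames: list of per-frame feature vectors (lists of floats, same length).
    """
    half = len(kernel) // 2
    num_frames = len(frames)
    width = len(frames[0])
    out = []
    for t in range(num_frames):
        mixed = [0.0] * width
        for k, weight in enumerate(kernel):
            src = t + k - half  # index of the frame this weight reads from
            if 0 <= src < num_frames:
                for i in range(width):
                    mixed[i] += weight * frames[src][i]
        out.append(mixed)
    return out


def spatial_layer(frame):
    """Hypothetical stand-in for a pretrained image-only layer."""
    return [2.0 * x + 1.0 for x in frame]


def spatiotemporal_layer(frames, kernel):
    """Apply the image layer to each frame, then mix across time."""
    per_frame = [spatial_layer(f) for f in frames]
    return temporal_conv(per_frame, kernel)


# Three tiny "frames" with two features each.
video = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]
image_only = [spatial_layer(f) for f in video]
extended = spatiotemporal_layer(video, identity_temporal_kernel())

# With the identity kernel, the extended layer reproduces the image-only
# result exactly, so nothing the image model learned is lost at initialization.
assert extended == image_only
```

Once the kernel weights are allowed to move away from the identity during training on video, each frame's features start to depend on neighboring frames, which is one simple way a model could pick up motion.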

The company hasn’t made any announcement about when Make-A-Video will be available or who will be able to access it. It did provide a sign-up form for those interested in trying the AI-powered video generator tool in the future.

Ultimately, the AI model will be capable of generating photorealistic videos on demand. Furthermore, Meta stated on its official page that videos will have a watermark to “help ensure viewers know the video was generated with AI and is not a captured video”.