Google has pushed its AI unit DeepMind hard and is now releasing new and improved AI tools for generating videos and images. These are aimed particularly at marketers and content creators. According to the company, the new applications break new ground and will ‘fundamentally change the way creatives work’.
The big attention-grabbers are Veo for video production and Imagen 3 for images, which generate short films or images from detailed text prompts. According to Google, the tools let creators build campaigns faster and measure results in real time. Google is so confident that its own marketing department has already used the tools for the Pixel 8 phone’s promotional campaign.
Veo generates videos at 1080p resolution, allowing for a variety of visual and cinematic styles. The tool understands instructions in natural language and even visual jargon such as ‘timelapse’ and ‘aerial shot’. That makes it possible for the creator, if we can still call it that, to use detailed, longer prompts.
Believable movements
According to Google, this allows Veo to stay close to the vision of the person who entered the prompt. The company promises in a blog post that Veo provides coherent and consistent output. Objects and characters move realistically or at least believably (in the case of cartoon characters). So far, believable movement has sometimes been a challenge in AI-generated videos.
Veo is the culmination of Google’s previous efforts in generative video, such as Generative Query Network (GQN), Lumiere, WALT and VideoPoet. Elements from these and earlier models have been integrated into Veo. The tool is not yet generally available, but there is a waiting list for creatives eager to get their hands on the private preview.
Next-generation image generator
Imagen 3 is a text-to-image model that generates photorealistic images based on detailed prompts. It can also generate images that look like sketches, cartoons, pixel art, or clay animation.
Google specifically mentions the possibility of using Imagen 3 to include text in images, such as inscriptions on buildings or prints on clothing. AI tools used to mess up in this department, as they often did not know what to do with such text.
‘AI has no taste’
Google seems aware of the fear among video producers, photographers, and other creators that they will soon be out of work when these kinds of tools become available. The company says it wants to give filmmakers, musicians, illustrators, and other creators a say in the design process of the AI suite. It also lets creatives develop campaigns, videos, and images with the tools.
Vidhya Srinivasan, Vice President and General Manager of Ads at Google, compares AI to smartphone cameras: “Today, the average photo or video is much better thanks to technology. However, that doesn’t mean we can all shoot magazine covers. No matter how much AI improves, it has no taste or ingenuity. Still, AI can provide new opportunities to expand your potential for creativity.”
Digital watermark
Google also actively seeks collaboration with creatives, for example through its Infinite Wonderland project, in which illustrators generated new images for Lewis Carroll’s story Alice in Wonderland. Furthermore, the company says it is working with musicians to develop its Music AI Sandbox, a set of tools for music production. There is also Lyria, a model for generating music from prompts.
Content generated this way will have the digital watermark SynthID, which Google already uses for other AI-generated images, sounds, lyrics and video. All videos created with Veo in VideoFX will also receive this watermark.