Intel has developed a new AI model, the Latent Diffusion Model for 3D (LDM3D). It can be used to create 360-degree 3D images.
Developed in collaboration with Blockade Labs, LDM3D uses generative AI to create realistic 3D visual content. Specifically, the chip giant’s model can create images that offer a full 360-degree view.
According to Intel, LDM3D is a major development, because existing generative AI models, such as latent stable diffusion models, generally produce only 2D images. LDM3D lets users generate both an image and a depth map from a single text prompt.
While using roughly the same number of parameters as latent stable diffusion, the model provides more accurate relative depth for each pixel in an image than standard post-processing methods for depth estimation.
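To make "relative depth for each pixel" concrete, here is a minimal sketch of how a pixel and its depth value can be placed in 3D space. It assumes an equirectangular 360-degree panorama and a depth map stored as a list of rows. The function names and the projection convention are illustrative assumptions and are not taken from Intel's code.

```python
import math

def pixel_to_point(u, v, depth, width, height):
    """Map pixel (u, v) of an equirectangular panorama to a 3D point.

    depth is the (relative) distance from the viewer along the pixel's ray.
    """
    # Longitude runs from -pi (left edge) to +pi (right edge).
    lon = (u / width) * 2.0 * math.pi - math.pi
    # Latitude runs from +pi/2 (top edge) to -pi/2 (bottom edge).
    lat = math.pi / 2.0 - (v / height) * math.pi
    # Unit direction on the viewing sphere, scaled by the depth value.
    x = depth * math.cos(lat) * math.sin(lon)
    y = depth * math.sin(lat)
    z = depth * math.cos(lat) * math.cos(lon)
    return (x, y, z)

def depth_map_to_points(depth_map):
    """Convert a depth map (list of rows) into a row-major list of 3D points."""
    height = len(depth_map)
    width = len(depth_map[0])
    return [
        pixel_to_point(col, row, depth_map[row][col], width, height)
        for row in range(height)
        for col in range(width)
    ]
```

Under these assumptions, each point lies at a distance from the viewer equal to its depth value, which is what allows a flat image to be viewed as a scene surrounding the viewer.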
Training the LDM3D model
LDM3D was trained on a dataset built from a subset of 10,000 samples from the research database LAION-400M, which contains more than 400 million image-caption pairs. To annotate this training data with depth, the researchers used the Dense Prediction Transformer (DPT) large depth estimation model, previously developed by Intel. This model provides highly accurate relative depth for each pixel in an image.
Furthermore, the model was trained on an Intel AI supercomputer powered by Intel Xeon processors and Intel Habana Gaudi AI accelerators.
Expectations
The researchers expect LDM3D to revolutionize how visual content is created and experienced. They consider the model especially suitable for developing so-called metaverse applications and other digital experiences. The entertainment industry stands to benefit, as do architects and other designers.
According to Intel, LDM3D and DepthFusion, an application the researchers built to turn 2D images and depth maps into interactive 360-degree views, should soon create even more opportunities for multi-view generative AI and computer vision.
LDM3D has been released as open source so that developers can create their own applications and build an ecosystem around the model.