Mark Zuckerberg's Meta Unveils New AI Model 'ImageBind'; Here's All You Need To Know

Meta said that ImageBind can outperform prior specialist models trained individually for one particular modality.


Mark Zuckerberg-owned Meta on Tuesday unveiled a new AI model, 'ImageBind', that combines different senses much as people do.

In a Facebook post, Zuckerberg shared a video explaining how ImageBind works. "It understands images, video, audio, depth, thermal and spatial movement," he said.

In a statement, Meta AI said that the model learns a single embedding, or shared representation space, not just for text, image/video, and audio, but also for depth (3D), thermal (infrared radiation), and inertial measurement unit (IMU) data, which captures motion and position.
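To make the idea of a single shared embedding space concrete, here is a minimal, hypothetical PyTorch-style sketch. The encoder names (`encode_text`, `encode_image`, `encode_audio`) and the placeholder outputs are illustrative assumptions, not Meta's actual API; the point is simply that every modality maps to vectors of the same dimensionality, so inputs from different senses can be compared directly.

```python
import torch
import torch.nn.functional as F

# Hypothetical encoders -- stand-ins for ImageBind's modality encoders.
# Each one maps its input into the SAME d-dimensional space.
EMBED_DIM = 1024

def encode_text(texts):            # e.g. ["a dog barking"]
    return torch.randn(len(texts), EMBED_DIM)   # placeholder output

def encode_image(images):          # batch of image tensors
    return torch.randn(len(images), EMBED_DIM)  # placeholder output

def encode_audio(clips):           # batch of audio waveforms
    return torch.randn(len(clips), EMBED_DIM)   # placeholder output

# Because all modalities share one space, cross-modal similarity is
# just cosine similarity between normalized embedding vectors.
text_emb  = F.normalize(encode_text(["a dog barking"]), dim=-1)
image_emb = F.normalize(encode_image([torch.zeros(3, 224, 224)]), dim=-1)
audio_emb = F.normalize(encode_audio([torch.zeros(16000)]), dim=-1)

print("text-image similarity:", (text_emb @ image_emb.T).item())
print("text-audio similarity:", (text_emb @ audio_emb.T).item())
```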

ImageBind equips machines with a holistic understanding that connects objects in a photo with how they will sound, their 3D shape, how warm or cold they are, and how they move.
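One practical consequence of that holistic representation is cross-modal retrieval: an embedding from one sense can be used to look up the nearest neighbors in another. The sketch below, again using placeholder embeddings rather than Meta's actual model, shows the nearest-neighbor step for finding the images that best match a sound.

```python
import torch
import torch.nn.functional as F

# Placeholder precomputed embeddings: a gallery of 1,000 images and one
# query audio clip, all assumed to live in the shared embedding space.
gallery_image_embs = F.normalize(torch.randn(1000, 1024), dim=-1)
query_audio_emb    = F.normalize(torch.randn(1, 1024), dim=-1)

# Cosine similarity against every image, then take the top matches.
scores = (query_audio_emb @ gallery_image_embs.T).squeeze(0)
top_scores, top_idx = scores.topk(k=5)

for rank, (i, s) in enumerate(zip(top_idx.tolist(), top_scores.tolist()), 1):
    print(f"rank {rank}: image #{i} (similarity {s:.3f})")
```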

The company said that ImageBind can outperform prior specialist models trained individually for a single modality, as described in its research paper. More importantly, it helps advance AI by enabling machines to better analyze many different forms of information together.

"For example, using ImageBind, Meta’s Make-A-Scene could create images from audio, such as creating an image based on the sounds of a rain forest or a bustling market," the company added.

"ImageBind is part of Meta’s efforts to create multimodal AI systems that learn from all possible types of data around them. As the number of modalities increases, ImageBind opens the floodgates for researchers to try to develop new, holistic systems, such as combining 3D and IMU sensors to design or experience immersive, virtual worlds," the artificial intelligence company said.