Meta introduces AI models for video generation, image editing

Social media giant Meta has introduced its latest artificial intelligence (AI) models for content editing and generation, according to a blog post on Nov. 16. 

The company is introducing two AI-powered generative models. The first, Emu Video, builds on Meta’s earlier Emu model and can generate video clips from text and image inputs. The second, Emu Edit, focuses on image manipulation and promises more precise image editing.

The models are still in the research stage, but Meta says its initial results show potential use cases for creators, artists and animators alike.

Meta displays its new generative model Emu Edit. Source: Meta

According to Meta’s blog post, Emu Video was trained with a “factorized” approach that divides the process into two steps so the model can respond to different inputs:

“We’ve split the process into two steps: first, generating images conditioned on a text prompt, and then generating video conditioned on both the text and the generated image. This ‘factorized’ or split approach to video generation lets us train video generation models efficiently.”

The same model can “animate” images based on a text prompt. According to Meta, instead of relying on a “deep cascade of models,” Emu Video uses only two diffusion models to generate four-second videos at 512×512 resolution and 16 frames per second.
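
As a rough illustration of the two-step flow and the clip dimensions Meta quotes, the Python sketch below uses placeholder functions in place of the two diffusion models. The names `generate_image`, `generate_video`, `text_to_video` and `animate_image` are hypothetical and do not correspond to any published Meta API; the stubs only return frame descriptors of the stated size so the structure of the pipeline is visible.

```python
from dataclasses import dataclass
from typing import List

# Figures reported in Meta's blog post.
WIDTH = 512
HEIGHT = 512
DURATION_SECONDS = 4
FRAMES_PER_SECOND = 16
NUM_FRAMES = DURATION_SECONDS * FRAMES_PER_SECOND  # 4 * 16 = 64 frames per clip


@dataclass(frozen=True)
class Frame:
    """Stand-in for real pixel data: records only size and the prompt used."""
    width: int
    height: int
    prompt: str


def generate_image(prompt: str) -> Frame:
    """Step 1 (placeholder): an image conditioned on the text prompt."""
    return Frame(WIDTH, HEIGHT, prompt)


def generate_video(prompt: str, image: Frame) -> List[Frame]:
    """Step 2 (placeholder): a clip conditioned on the text and the image."""
    return [Frame(image.width, image.height, prompt) for _ in range(NUM_FRAMES)]


def text_to_video(prompt: str) -> List[Frame]:
    """Text-to-video: run step 1, then feed its output into step 2."""
    image = generate_image(prompt)
    return generate_video(prompt, image)


def animate_image(prompt: str, image: Frame) -> List[Frame]:
    """Image animation: skip step 1 and start from a user-supplied image."""
    return generate_video(prompt, image)


clip = text_to_video("a dog running on a beach")
print(len(clip))  # 64
```

Under these assumptions, the same second step serves both text-to-video and image animation, which is one way to read Meta's statement that the model is responsive to different inputs.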

Emu Edit, focused on image manipulation, will allow users to remove or add backgrounds, perform color and geometry transformations, and make both local and global edits to images.

“We argue that the primary objective shouldn’t just be about producing a ‘believable’ image. Instead, the model should focus on precisely altering only the pixels relevant to the edit request,” Meta noted, claiming its model is able to precisely follow instructions:

“For instance, when adding the text ‘Aloha!’ to a baseball cap, the cap itself should remain unchanged.”
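
One simple way to picture that goal is to count how many pixels change outside the region an edit is supposed to touch. The Python sketch below is only an illustration: the function `pixels_changed_outside` and the toy 3×3 "images" are hypothetical, and nothing here reflects how Meta actually evaluates Emu Edit.

```python
from typing import List

Grid = List[List[int]]        # toy grayscale image: rows of pixel values
Mask = List[List[bool]]       # True where the edit is allowed to change pixels


def pixels_changed_outside(before: Grid, after: Grid, edit_mask: Mask) -> int:
    """Count pixels that differ between before and after where the mask is False."""
    changed = 0
    for row_before, row_after, row_mask in zip(before, after, edit_mask):
        for old, new, editable in zip(row_before, row_after, row_mask):
            if not editable and old != new:
                changed += 1
    return changed


# Toy example: only the center pixel is part of the requested edit.
before = [[0, 0, 0],
          [0, 0, 0],
          [0, 0, 0]]
mask = [[False, False, False],
        [False, True, False],
        [False, False, False]]

precise_edit = [[0, 0, 0],
                [0, 9, 0],
                [0, 0, 0]]   # changes only the masked pixel
sloppy_edit = [[5, 0, 0],
               [0, 9, 0],
               [0, 0, 0]]    # also alters a pixel outside the mask

print(pixels_changed_outside(before, precise_edit, mask))  # 0
print(pixels_changed_outside(before, sloppy_edit, mask))   # 1
```

In the baseball cap example, the mask would cover the area where the text is added, and a precise edit would leave the count for the rest of the cap at zero.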

Meta trained Emu Edit on computer vision tasks using a data set of 10 million synthesized samples, each consisting of an input image, a description of the task and the targeted output image. “We believe it’s the largest dataset of its kind to date,” the company said.

Meta’s newly released Emu model was trained using 1.1 billion pieces of data, including photos and captions shared by users on Facebook and Instagram, CEO Mark Zuckerberg revealed during the Meta Connect event in September.

Regulators are closely scrutinizing Meta’s AI-based tools, and the technology company has taken a cautious approach to deploying them. Recently, Meta disclosed it won’t allow political campaigns and advertisers to use its AI tools to create ads on Facebook and Instagram. The platform’s general advertising rules, however, do not address AI specifically.