OpenAI introduces AI model 'Sora' that turns text into video

Microsoft-backed OpenAI is developing software capable of generating minute-long videos based on text prompts, the company announced on Thursday.

The software, named “Sora” after the Japanese word for “sky,” is currently available to red teamers, who probe the AI system for flaws, and to visual artists, designers and filmmakers, who will provide feedback on the model, the company stated.

“Sora is able to generate complex scenes with multiple characters, specific types of motion, and accurate details of the subject and background,” the statement said, adding that it can create multiple shots within a single video.

In addition to generating videos from text prompts, Sora can animate a still image, the company said in a blog post.

The video generation software follows OpenAI’s ChatGPT chatbot, which was released in late 2022 and created a buzz around generative AI with its ability to compose emails and write code and poems.

Social media giant Meta Platforms beefed up its image generation model Emu last year to add two AI-based features that can edit and generate videos from text prompts. The Facebook-parent company is also looking to compete with Microsoft, Alphabet’s Google and Amazon in the rapidly transforming generative AI universe.

Sora is still a work in progress, with the company acknowledging that the model may sometimes struggle with spatial details in a prompt and have difficulty following a specific camera trajectory.

OpenAI also said it is developing tools to determine whether a video was generated by Sora.

The new tool is not yet publicly available, and OpenAI has disclosed limited information about its development process. The company, which has faced lawsuits from some authors and The New York Times over its use of copyrighted works to train ChatGPT, has not revealed the imagery and video sources used to train Sora.

OpenAI said in a blog post that it is consulting with artists, policymakers and other stakeholders before releasing the new tool to the public.

“We are working with red teamers – domain experts in areas like misinformation, hateful content, and bias – who will be adversarially testing the model,” the company said. “We’re also building tools to help detect misleading content such as a detection classifier that can tell when a video was generated by Sora.”