sora-2

Sora 2

OpenAI’s new flagship model for video generation with synced audio.
Intelligence
Speed
Price
$0.200
Input
Output

Sora 2 is a new powerful media generation model, generating videos with synced audio. It can create richly detailed, dynamic clips from natural language or images.

Limitation: A $10 top-up is required to upgrade to Tier 2 in order to access the Sora 2 series models.

Sora 2 uses the v1/chat/completions endpoint, with prompts written into the content field of the request.

Sora 2 now supports the following features: Guest Mode: You can reference publicly authorized character IDs listed on the Sora.com website and use them in prompts with the @id format. Example: @sama. Aspect Ratio Control: Specify “horizontal” or “vertical” in your prompt to switch between landscape and portrait video output. Output Quality: Generates 10-second 720p videos by default, watermark-free.

4096 context window
2048 max output tokens
knowledge cutoff

Modalities

Text

Input only

Image

Input only

Audio

Output only

Endpoints

Chat Completions

v1/chat/completions

Responses

v1/responses

Realtime

v1/realtime

Assistants

v1/assistants

Batch

v1/batch

Fine-tuning

v1/fine-tuning

Embeddings

v1/embeddings

Image generation

v1/images/generations

Image edit

v1/images/edits

Speech generation

v1/audio/speech

Transcription

v1/audio/transcriptions

Translation

v1/audio/translations

Moderation

v1/moderations

Completions (legacy)

v1/completions

Features

Streaming

Supported

Function calling

Not supported

Structured outputs

Not supported

Fine-tuning

Not supported

Distillation

Not supported