Real-time models

Real-time models are designed to process and respond to inputs with minimal latency, enabling near-instantaneous interactions. They support multimodal communication through both voice and text interfaces, which makes human-AI interactions feel seamless.

AI Agents support the following real-time models:

Provider        Models
OpenAI          gpt-realtime, gpt-4o-realtime, gpt-4o-mini-realtime
Azure OpenAI    gpt-realtime, gpt-4o-realtime, gpt-4o-mini-realtime
Amazon          nova-sonic
Google          gemini-live-2.5-flash, gemini-2.5-flash-native-audio, gemini-2.5-flash-native-audio-thinking

You may use pre-deployed models or bring your own API keys from OpenAI or Microsoft Azure.

Configuring real-time models

To configure an agent to use a real-time model:

  1. Navigate to the Agents screen, locate your Agent, and click Edit.

  2. In the General tab:

    1. From the 'Large language model' drop-down, choose the real-time model.

    2. Set the number of 'Max output tokens' to 1,000 or larger.

  3. In the Speech and Telephony tab, select the Enable voice streaming check box.

With this configuration, the real-time model interacts with the user via the voice modality. The input audio stream is sent directly to the model, and the model generates an audio stream in response. Consequently, Speech-to-Text and Text-to-Speech services are not used in this interaction mode.

If you clear the Enable voice streaming check box, the real-time model is used via the text modality. In this interaction mode, you need to choose Speech-to-Text and Text-to-Speech services, the same as for regular LLMs.

Chats always use the text modality, regardless of the state of the Enable voice streaming check box.
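The modality rules described above can be summarized as a small decision function. The following Python sketch is purely illustrative: the function names and parameters are hypothetical and are not part of the AI Agents platform.

```python
def select_modality(is_chat: bool, voice_streaming_enabled: bool) -> str:
    """Return the modality a real-time model would use.

    Hypothetical helper that mirrors the rules in this section;
    it is not a platform API.
    """
    if is_chat:
        # Chats always use text, whatever the check box state is.
        return "text"
    # Voice calls: the check box decides between voice and text.
    return "voice" if voice_streaming_enabled else "text"


def needs_speech_services(voice_streaming_enabled: bool) -> bool:
    """For voice calls only: Speech-to-Text and Text-to-Speech services
    are needed when the model is used via the text modality."""
    return not voice_streaming_enabled
```

For a voice call with the check box selected, select_modality(False, True) returns "voice" and needs_speech_services(True) returns False.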

Customizing the real-time model behavior

Use the following advanced configuration parameters to customize the real-time model behavior:

For example:

{
  "openai_realtime": {
    "voice": "coral"
  }
}
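If you generate this configuration from a script, a minimal Python sketch could look like the following. The helper name build_realtime_config is hypothetical; only the "openai_realtime" and "voice" keys and the "coral" value come from the example above.

```python
import json


def build_realtime_config(voice: str) -> str:
    """Build the advanced configuration JSON shown above.

    Hypothetical helper; only the key names are taken from the example.
    """
    config = {"openai_realtime": {"voice": voice}}
    return json.dumps(config, indent=2)


print(build_realtime_config("coral"))
```

The printed text can then be pasted into the agent's advanced configuration.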

Configuring input / output language

Real-time models lack explicit configuration for input / output language. Instead, include relevant instructions in your agent's prompt.

For example:

Always respond to user in German.
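If you maintain agent prompts in code, the language instruction can be appended programmatically. This is a minimal sketch; the function name and the sample base prompt are hypothetical, and the instruction wording follows the example above.

```python
def with_language_instruction(prompt: str, language: str) -> str:
    """Append a response-language instruction to an agent prompt.

    Hypothetical helper; the platform itself only requires that the
    instruction appears somewhere in the prompt text.
    """
    instruction = f"Always respond to user in {language}."
    return f"{prompt.rstrip()}\n\n{instruction}"


# Example with a made-up base prompt.
print(with_language_instruction("You are a helpful booking assistant.", "German"))
```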

Feature parity

Agents that use real-time models benefit from most of the AI Agent platform features, including but not limited to:

The following limitations apply to agents that use real-time models: