Advanced configuration

Use the Advanced tab in the Agent’s configuration screen to configure the Agent’s advanced configuration parameters. The configuration must be provided in JSON format and is validated at the time of entry. Use the editor’s auto-complete feature and tooltips to simplify the configuration process.

The following table lists all supported advanced configuration parameters together with references to the relevant documentation sections.

Parameter name – Documentation section

agent_flavor – Mock-LLM agent flavors
activity_params – Call settings parameters configuration
call_transfer_conditions – Triggering call transfers based on patterns in LLM response
call_recording – Enhanced call recording control
customize_tools – Customizing tools behavior
doc_content_len – Using documents
doc_tools – Document tool responses in message history
dynamic_prompt – Prompt conditions
establish_llm_connection – Pre-establishing the LLM connection
empty_llm_response – Empty LLM responses
explicit_tool_errors – Error response
gemini_audio – Gemini native audio models configuration
ignore_first_call_transfer – Ignoring first call transfer request
increment_counter_call_transfer – Triggering call transfers by increment_counter tool
increment_counter_conditions – Increment dynamic counters on specific messages
inherit_config – Advanced configuration parameters in multi-agent topologies
init_conditions – Init conditions
init_logs – Init conditions
init_tools – Running tools on conversation start
language_detected_ignore_phrases – Language-based question routing
language_detected_pass_question – Language-based question routing
llm_replace_words – LLM response modification
llm_add_period – LLM response modification
llm_stream_first_sentence – Immediate playback of the first sentence
logs – Information recorded in agent logs
max_turns_message – Error message
no_tool_error – No tool response
nova_sonic – Nova sonic model configuration
openai_realtime – OpenAI real-time models configuration
orchestration_mode – Different orchestration modes for specific sub-agents
post_call_analysis – Post call analysis
prerecorded_audio – Play prerecorded audio files instead of specific LLM responses
progress_message_conditions – Conditional progress messages when calling tools
rag_chunks – Controlling amount of consumed documents data
remove_symbols – STT error correction configuration
replace_words – STT error correction configuration
numbers_sequence – STT error correction configuration
session_params – Call settings parameters configuration
session_param_tools – Tools for modifying call settings
session_reminders – Session duration reminders
tool_certs – Custom tool certificates
tool_logs – Detailed tool call logs
transfer_call_sip_headers – Pre-defined tools
webhooks – Webhooks configuration
welcome_message_activity_params – Call settings during welcome message
welcome_message_barge_in – Call settings during welcome message
welcome_message_parts – Call settings during welcome message

Advanced configuration parameters in multi-agent topologies

In multi-agent topologies, some configuration parameters are inherited by default by sub-agents (i.e. parameters specified at the top-level agent are applied to its sub-agents too). All other advanced configuration parameters apply only to the agent where they are defined.

You may change this default behavior via the inherit_config advanced configuration parameter:

Parameter Type Description
inherit_config list[str] List of advanced configuration parameters that are inherited by sub-agents.

Specify the list of advanced configuration parameters that you want sub-agents to inherit from the agent. Note that the new list overrides the default configuration, so be sure to include in it ALL parameters that you want to be inherited.

Example
{
    "inherit_config": ["rag_chunks", "webhooks"]
}

STT error correction configuration

The replace_words and remove_symbols advanced configuration parameters provide automatic correction capabilities for Speech-to-Text (STT) errors in user input, helping to improve the accuracy of voice interactions.

Word replacement

Use the replace_words variable to automatically correct commonly misidentified words in user speech.

Parameter Type Description
replace_words dict[str, str]

JSON dictionary mapping incorrect words to their correct replacements.

The specified words are matched case-sensitively and at word boundaries.

Example
{
    "replace_words": {"halo": "hello", "hye": "hi"}
}

Symbol removal

Use the remove_symbols variable to automatically strip unwanted punctuation and symbols from user input.

Parameter Type Description
remove_symbols list[str] List of symbols to be removed from user utterance.

Example
{
    "remove_symbols": [".", ",", "!"]
}

Numbers sequence correction

Some STT engines may hallucinate on large numbers. For example, they may produce “Line 500 29” when the user says “Line 529”.

There are also scenarios where users spell large numbers digit by digit – so instead of “Line 529” they say “Line 5 2 9”.

The numbers_sequence variable allows you to resolve both of these problems.

Parameter Type Description
numbers_sequence NumbersSequence Correct number sequences in user utterances.

NumbersSequence

Parameter Type Description
mode str

Operation mode:

  • join – join number sequences, e.g. change “Line 5 2 9” to “Line 529”

  • sum – sum number sequences, e.g. change “Line 500 29” to “Line 529”

  • auto – automatically determine operation mode based on user utterance

prefixes List[str]

List of prefixes.

If specified, only number sequences that follow one of the specified prefixes will be corrected.

Special prefix value “^” matches the start of the line.

Example
{
    "numbers_sequence": {
        "mode": "auto",
        "prefixes": ["line"]
    }
}
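For example, to join digit-by-digit sequences only when they appear at the very start of the utterance, the special “^” prefix can be combined with join mode (an illustrative sketch):

{
    "numbers_sequence": {
        "mode": "join",
        "prefixes": ["^"]
    }
}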

LLM response modification

The llm_replace_words and llm_add_period advanced configuration parameters provide the ability to modify the LLM response. This may be used, for example, to overcome Text-to-Speech (TTS) engine pronunciation errors by replacing specific words with their phonemes.

Word replacement

Use the llm_replace_words variable to replace specific words in LLM response (for example with their phonemes representation).

Parameter Type Description
llm_replace_words dict[str, str]

JSON dictionary mapping specific words to their replacements.

The specified words are matched case-sensitively and at word boundaries.

Example
{
    "llm_replace_words": {"LiveHub": " laɪv hʌb"}
}

Fixing punctuation

Use the llm_add_period variable to ensure that every LLM response is treated as a complete sentence. If the model generates a response without end of sentence punctuation, a period will be automatically appended.

Parameter Type Description
llm_add_period bool Add period to LLM response if it lacks end-of-sentence punctuation.

Example
{
    "llm_add_period": true
}

Conditional progress messages when calling tools

The progress_message_conditions advanced configuration parameter enables the generation of progress messages when specific tools are called. This makes the conversation more fluent and mitigates slow responses from certain third-party APIs.

Parameter Type Description
progress_message_conditions list[ProgressCondition] Play progress message when specific tools are called.

ProgressCondition

Parameter Type Description
condition str

Name of the tool, for example, “pass_question” or “get_weather”. You can also specify “enter” to play a progress message when the conversation switches to a specific agent.

messages list[str]

Progress messages. When more than one message is provided, the system randomly chooses which message to play.

In orchestration workflows, when an agent calls pass_question or send_message, the receiving agent's "enter" condition executes first, followed by the calling agent's tool-specific conditions.

By default, the system plays only one progress message per user utterance, either time-based or conditional. This may be changed by configuring the following advanced configuration parameter:

Parameter Type Description
multiple_progress_messages bool Enables playing multiple progress messages for the same user utterance. Applies only to messages triggered by progress_message_conditions advanced configuration parameter.
Example
{
    "progress_message_conditions": [
        {
            "condition": "pass_question",
            "messages": ["one moment", "just a second"]
        }
    ]
}
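To allow several conditional progress messages within the same user utterance, multiple_progress_messages can be combined with the conditions (an illustrative sketch; the tool name is an example):

{
    "multiple_progress_messages": true,
    "progress_message_conditions": [
        {
            "condition": "get_weather",
            "messages": ["checking the weather"]
        }
    ]
}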

Customizing tools behavior

The customize_tools advanced configuration parameter allows you to customize tool behavior for a specific agent.

Parameter Type Description
customize_tools dict[str, ToolCustomize] Customize tool behavior for specific agent.

ToolCustomize

Parameter Type Description
response_len int Maximum length of tool response to be returned to LLM.
redact_response bool Redact tool response from message history.
redact_msg str Replacement message used in message history when the tool response is redacted. Default: "<redacted>".

Example
{
    "customize_tools": {
        "get_weather": {"response_len": 20000}
    }
}
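Redaction can be configured in the same way; the tool name and redact message below are hypothetical:

{
    "customize_tools": {
        "get_user_profile": {
            "redact_response": true,
            "redact_msg": "<profile redacted>"
        }
    }
}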

Triggering call transfers based on patterns in LLM response

The call_transfer_conditions advanced configuration parameter enables automatic call transfers based on patterns detected in the LLM's responses. This feature helps identify when the agent is struggling to provide assistance and automatically escalates to human support.

Parameter Type Description
call_transfer_conditions list[CallTransferCondition] List of call transfer conditions.

CallTransferCondition

Parameter Type Description
patterns list[str] List of phrases or patterns to monitor in LLM responses.
threshold int How many LLM responses matching the patterns are required to trigger a call transfer. Default: 1.
message str Message to be played before the call transfer.
phone str Phone number to transfer the call to (supports variable substitution).

The system continuously monitors each response generated by the LLM, scanning for any of the specified patterns. When a pattern is detected in the agent's response, the system increments an internal condition counter that is maintained throughout the entire conversation. When the condition counter reaches the specified threshold value, the system automatically triggers the call_transfer tool.

Example
{
    "call_transfer_conditions": [
        {
            "patterns": ["cannot answer", "can't answer", "don't know"],
            "threshold": 2,
            "phone": "18005550123",
            "message": "Let me connect you with a human agent who can better assist you."
        }
    ] 
} 

Triggering call transfers by increment_counter tool

The increment_counter_call_transfer advanced configuration parameter enables automatic call transfers based on variable thresholds reached through the increment_counter tool. This capability is particularly valuable for scenarios where repeated failures or issues need to escalate to human intervention, such as when multiple user requests cannot be properly categorized or answered.

Parameter Type Description
increment_counter_call_transfer list[IncrementCounterCallTransfer] List of increment counter call transfer configurations.

IncrementCounterCallTransfer

Parameter Type Description
name str Name of the counter to monitor (must match the counter name used in increment_counter tool).
threshold int Threshold value that triggers call transfer. Default: 1.
message str Message to be played before the call transfer.
phone str Phone number to transfer the call to (supports variable substitution).

Example
{
    "increment_counter_call_transfer": [
        {
            "name": "error_count",
            "threshold": 2,
            "phone": "18005550123",
            "message": "Please wait while I transfer you to a human attendant."
        }
    ]
}

Increment dynamic counters on specific messages

The increment_counter_conditions advanced configuration parameter increments dynamic counters, as used by the increment_counter pre-defined tool, on specific user utterances or LLM responses.

Parameter Type Description
increment_counter_conditions List[IncrementCounterConditions] Increment dynamic counters on specific user utterances or LLM responses.

IncrementCounterConditions

Parameter Type Description
sender str

Message sender:

  • user – user utterance

  • llm – LLM response

patterns List[str]

List of patterns to match.

If the message contains one of the specified patterns, the dynamic counter is incremented.

counter str Name of the dynamic counter to be incremented.

Example
{
    "increment_counter_conditions": [
        {
            "sender": "user",
            "patterns": ["angry", "mad", "frustrated"],
            "counter": "user_dissatisfied"
        }
    ]
}

Ignoring first call transfer request

The ignore_first_call_transfer advanced configuration parameter allows you to prevent the agent from transferring calls on the first attempt, encouraging users to interact with the agent before escalating to human assistance.

When configured, this variable intercepts the first call transfer request made by the agent and returns the specified text response instead of executing the call_transfer tool call.

Parameter Type Description
ignore_first_call_transfer str Message you want the agent to respond with when a call transfer is first attempted.

Example
{
    "ignore_first_call_transfer": "Let me try to help you first."
}

Inheritance

The ignore_first_call_transfer value is inherited by sub-agents by default. This ensures consistent behavior across the entire agent hierarchy.

Utterance Counting

Each agent and sub-agent maintains its own independent count of utterances.

Consider this interaction flow:

  1. Master agent handles multiple user utterances.

  2. Master agent delegates a question to a sub-agent.

  3. Sub-agent immediately attempts to call call_transfer.

In this case, the sub-agent's transfer attempt is still considered the "first utterance" from the sub-agent's perspective and will be intercepted by the ignore_first_call_transfer configuration.

Language-based question routing

The language_detected_pass_question advanced configuration parameter enables automatic passing of user question to another agent based on the language detected by the Speech-to-Text (STT) system. This may be used as a faster and more reliable alternative to LLM-based language detection.

Parameter Type Description
language_detected_pass_question dict[str, str] JSON dictionary mapping language codes to their corresponding agent names.
language_detected_ignore_phrases list[str] List of phrases to be answered by the LLM.

Refer to Multi-language setup for detailed instructions on how to configure multiple languages detection in Bot connection.

You should configure the language_detected_pass_question parameter in the top-level agent that starts the conversation. You may use the “other” language name as a fallback for any language name not explicitly specified.

Example
{
    "language_detected_pass_question": {
        "he-IL": "main-agent-he",
        "en-US": "main-agent-en"
    }
}
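To have short phrases answered in place rather than routed, language_detected_ignore_phrases can be added to the same configuration (the phrases below are illustrative):

{
    "language_detected_pass_question": {
        "he-IL": "main-agent-he",
        "en-US": "main-agent-en"
    },
    "language_detected_ignore_phrases": ["ok", "thank you"]
}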

Call settings parameters configuration

The session_params and activity_params advanced configuration parameters enable dynamic control over call settings during agent interactions. These parameters modify call behavior such as barge-in settings, audio configurations, and other Voice.AI Connect features.

For complete details on available settings, see Changing call settings in the Voice AI Connect guide.

All variables use JSON dictionary format, for example:

{
    "activity_params": {
        "bargeIn": true
    }
}

Parameter Type Description
activity_params dict[str, Any]

Included in every agent response.

Not applicable to real-time models (e.g. gpt-realtime).

session_params dict[str, Any] Included in the first agent's response after "context switch".
welcome_message_activity_params dict[str, Any] Included in welcome message (instead of activity_params). Should be configured in the top-level agent that generates the welcome message.


Call settings during welcome message

You can modify call settings during welcome message playback via the following parameters:

Parameter Type Description
welcome_message_activity_params dict[str, Any]

Included in welcome message (instead of activity_params). Should be configured in the top-level agent that generates the welcome message.

Not applicable to realtime models (e.g. gpt-realtime).

welcome_message_barge_in bool

Configures barge-in during welcome message playback.

  • true – user can interrupt the welcome message

  • false – user cannot interrupt the welcome message

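For example, to prevent users from interrupting the welcome message:

{
    "welcome_message_barge_in": false
}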
Sometimes you may need to play a welcome message that consists of multiple parts and configure barge-in differently for each part. This may be done by configuring the following parameter:

Parameter Type Description
welcome_message_parts list[PartModel]

Configures multi-part welcome message with “barge-in” control per part.

Make sure that your agent is configured to use a static welcome message. When configured, this parameter overrides the welcome message specified in the agent’s configuration screen.

PartModel
Parameter Type Description
message str Welcome message part to be played.
barge_in bool Enable barge-in during this part of the welcome message.

 

Example
{
  "welcome_message_parts": [
    {
      "message": "Welcome to Hogwarts!",
      "barge_in": false
    },
    {
      "message": "Our opening hours are from 9 AM to 4 PM.",
      "barge_in": true
    }
  ]
}

Tools for modifying call settings

The session_param_tools advanced configuration parameter allows you to define AI Agent tools that modify session-level call-setting parameters. This gives your AI Agent the ability to control and update session behavior based on its current needs.

Parameter Type Description
session_param_tools list[SessionParamTool] Defines tools that can be called by LLM to change the session parameters.

SessionParamTool

Parameter Type Description
name str

Name of the session parameter tool.

Must consist of English letters / digits / hyphens / underscores and be between 3 and 32 characters long.

description str

Description of the session parameters tool.

Provide clear description that will help LLM decide when to call the tool. You may also explicitly reference your tool (by name or description) in the prompt.

session_params dict[str, Any]

Session parameters:

  • key = parameter name

  • value = parameter value

Refer to VoiceAI documentation for description of supported session parameters.

Example
{
    "session_param_tools": [
        {
            "name": "enable_barge_in",
            "description": "Enable barge-in",
            "session_params": {"bargeIn": true}
        },
        {
            "name": "disable_barge_in",
            "description": "Disable barge-in",
            "session_params": {"bargeIn": false}
        }
    ]
}

Init conditions

The init_conditions advanced configuration parameter provides conditional agent initialization based on local configuration, offering similar functionality to the “init” webhook (see Webhooks configuration for details) but using predefined rules instead of external API calls. This enables dynamic agent behavior based on conversation data such as caller information, called number, or other variables.

If your init_conditions configuration doesn’t behave as expected, use the init_logs parameter to enable logs during agent initialization.

Parameter Type Description
init_conditions list[InitCondition] Conditional agent initialization.
init_logs bool Enable logs for agent initialization.

InitCondition

Parameter Type Description
match dict[str, str]

Match conditions:

  • key: variable name or conversation data element (e.g. "callee")

  • value: match value

Multiple match elements use AND logic, meaning all conditions must be satisfied for the rule to apply.

variables dict[str, str] Dictionary of variables that will be added / merged to the current agent's variables.
config dict[str, str] Dictionary of advanced configuration parameters that will be added / merged to the current agent.
agent str Name of the agent that will start the conversation.
documents list[str] Names of documents that the agent has access to. May be used to limit access to specific documents based, for example, on the callee number.

Example
{
    "init_conditions": [
        {
            "match": {"callee": "12024567041"},
            "variables": {"destination": "White House"}
        }
    ]
}
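If the conditions don’t fire as expected, initialization logging can be enabled alongside them:

{
    "init_logs": true
}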

Mock-LLM agent flavors

The agent_flavor advanced configuration parameter enables the creation of specialized AI agents that use “mock LLM” instead of real large language models. These "mock LLM" configurations are designed for specific use cases such as testing, monitoring, and data collection.

Parameter Type Description
agent_flavor str

Mock-LLM agent flavor:

  • echo

  • listen

  • say


When mock-LLM flavor is configured, LLM parameters in agent’s configuration screen (e.g. model name and temperature) are ignored.

Example
{
    "agent_flavor": "listen"
}

Pre-establishing the LLM connection

The establish_llm_connection advanced configuration parameter minimizes the response delays during conversations by initiating the LLM connection while the welcome message plays to users. This optimization applies exclusively to predefined welcome messages, not dynamically generated ones from the LLM.

Parameter Type Description
establish_llm_connection bool Establish LLM connection during agent initialization.

Example
{
    "establish_llm_connection": true
}

Enhanced call recording control

The call_recording advanced configuration parameter provides granular control of the recording behavior. In order to use it, set the Call recording feature to “controlled by bot” in the Features tab of the Bot connection attached to your AI Agent.

Parameter Type Description
call_recording str

Call recording configuration:

  • enable – enables call recording for the whole call duration (similar behavior can be achieved by configuring the Call recording feature to “record all calls” in Bot connection).

  • stop_on_transfer – begins recording at conversation start and ends it when transferring to a human agent.

  • start_on_transfer – initiates recording only when transferring to a human agent.

Example
{
    "call_recording": "start_on_transfer"
}

Immediate playback of the first sentence

The llm_stream_first_sentence advanced configuration parameter enables immediate playback of the initial short sentence from LLM responses. When enabled, the system sends the first complete sentence to the Text-to-Speech (TTS) engine as soon as it's generated, followed by the remaining content.

For example, if the LLM generates "Hi there! I'm Jonathan, your sales assistant, how can I help you?", the system will immediately process "Hi there!" through TTS while continuing to generate the rest of the response.

This feature may result in slightly reduced TTS output quality and therefore is disabled by default. However, it can significantly improve conversation fluency when using slower LLMs or generating longer responses.

Parameter

Type

Description

llm_stream_first_sentence

bool

Immediately play first sentence of LLM response.

Example
{
    "llm_stream_first_sentence": true
}

Session duration reminders

The session_reminders advanced configuration parameter may be used to inform the LLM about the current session duration and, for example, “nudge” it to end the call if it continues for too long.

Example
{
    "session_reminders": [
        {
            "duration": 120,
            "text": "REMINDER: session lasts for longer than 2 minutes"
        },
        {
            "duration": 180,
            "text": "REMINDER: session lasts for too long - please wrap up!"
        }
    ]
}

Reminders are sent to the LLM and are appended to the next user utterance that follows the specified duration. It is recommended to start them with some prefix, for example, REMINDER:, so that the LLM can distinguish between actual user utterance and reminder message.

If you don’t specify text, the following default reminder text is used:

REMINDER: your session lasts longer than {duration} seconds

Reminders are typically not sent at the exact duration specified in the configuration, but when the next user utterance is received. If multiple reminders "match" the current user utterance, the last one is used.

Parameter Type Description
session_reminders list[SessionReminder] Session duration reminders to LLM.

SessionReminder

Parameter Type Description
duration int Duration of the session (in seconds) after which reminder is sent.
text str

Reminder text.

It is recommended to start it with some prefix, for example, REMINDER:, to ensure that LLM can distinguish reminder text from user utterance.

You may use variables and conversation data in reminder – for example, {duration}.

OpenAI real-time models configuration

Use the openai_realtime advanced configuration parameter to customize the real-time model behavior:

Parameter Type Description
openai_realtime OpenAIRealtime Configuration for OpenAI real-time models – e.g. gpt-realtime.
OpenAIRealtime

Parameter Type Description
voice enum

Voice name.

The following voices are supported: alloy, ash, ballad, coral, echo, sage, shimmer, verse.

input_audio_transcription_language str

Language for input audio transcription model.

Use ISO 639-1 code. For example, "en" for English, "fr" for French.

Note that input audio transcription is performed by a separate transcription model and has no direct relation to what the real-time model hears / responds to. It is primarily used for logging purposes.

input_audio_transcription_prompt str Prompt for input audio transcription model.
input_audio_noise_reduction enum

Configuration of input audio noise reduction:

  • near_field – for close-talking microphones, such as headphones.

  • far_field – for far-field microphones such as laptop or conference room microphones.

turn_detection enum

Configuration of turn detection:

  • server_vad – the model will detect the start and end of speech based on audio volume and respond at the end of user speech.

  • semantic_vad – the model will use a turn detection model (in conjunction with VAD) to semantically estimate when the user has finished speaking.

eagerness enum

Eagerness of the model to respond for semantic VAD.

Supported values: low, medium, high, auto.

prefix_padding_ms int Amount of audio to include before the VAD detected speech (in milliseconds) for server VAD.
silence_duration_ms int Duration of silence to wait before considering the speech finished (in milliseconds) for server VAD.
threshold float Activation threshold for server VAD. Valid range: 0.0 to 1.0.

Example
{
  "openai_realtime": {
    "voice": "coral"
  }
}
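A fuller sketch combining several of the parameters above (the values are illustrative, not recommendations):

{
  "openai_realtime": {
    "voice": "coral",
    "turn_detection": "semantic_vad",
    "eagerness": "low",
    "input_audio_transcription_language": "en"
  }
}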

Gemini native audio models configuration

The gemini_audio advanced configuration parameter provides configuration for Gemini speech-to-speech models:

Parameter Type Description
gemini_audio GeminiAudio Configuration for Gemini audio models – e.g. gemini-2.5-flash-native-audio.

GeminiAudio

Parameter Type Description
voice enum

Voice name.

The following voices are supported for native audio models: Zephyr, Puck, Charon, Kore, Fenrir, Leda, Orus, Aoede, Callirrhoe, Autonoe, Enceladus, Iapetus, Umbriel, Algieba, Despina, Erinome, Algenib, Rasalgethi, Laomedeia, Achernar, Alnilam, Schedar, Gacrux, Pulcherrima, Achird, Zubenelgenubi, Vindemiatrix, Sadachbia, Sadaltager, Sulafat.

thinking_budget int

Thinking budget in tokens.

Use the following reference for mapping thinking_budget to reasoning level:

  • 0 – disable reasoning

  • 512 – low reasoning level

  • 1024 – medium reasoning level

  • 2048 – high reasoning level

top_p float Tokens are selected from the most to least probable until the sum of their probabilities equals this value.
top_k int For each token selection step, the top_k tokens with the highest probabilities are sampled.
enable_affective_dialog bool If enabled, the model will detect emotions and adapt its responses accordingly.
proactive_audio bool If enabled, the model can reject responding to the last prompt. Only for native-audio models.
vad_mode enum

Configures voice activity detection.

Supported values:

  • automatic (default) – uses native VAD implementation inside the model

  • manual – VAD inside the model is disabled; instead acoustic VAD is used to detect when the user is speaking

  • mixed – VAD inside the model is enabled; acoustic VAD is used to suppress sending of audio chunks during silence periods

For manual and mixed VAD modes you must also add the following advanced configuration parameter to your Bot connection:

{
    "enableSpeechDetectionEvent": true
}

Consider switching to mixed VAD mode if your agent experiences “freezes” in the middle of the conversation.

vad_start_of_speech_sensitivity enum

Determines how likely speech is detected.

Supported values: low, high. Applicable to automatic and mixed VAD modes.

vad_end_of_speech_sensitivity enum

Determines how likely the end of speech is detected.

Supported values: low, high. Applicable to automatic and mixed VAD modes.

vad_prefix_padding_ms int The required duration of detected speech before start-of-speech is committed. Applicable to automatic and mixed VAD modes.

vad_silence_duration_ms int The required duration of detected non-speech (e.g. silence) before end-of-speech is committed. Applicable to automatic and mixed VAD modes.

context_compression_trigger_tokens int The number of tokens required to trigger context window compression.

context_compression_target_tokens int The target number of tokens to keep for context window.

Example
{
  "gemini_audio": {
    "voice": "Puck"
  }
}
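Several parameters can be combined in one configuration (illustrative values):

{
  "gemini_audio": {
    "voice": "Puck",
    "thinking_budget": 512,
    "enable_affective_dialog": true
  }
}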

Nova sonic model configuration

The nova_sonic advanced configuration parameter provides configuration for Amazon speech-to-speech models:

Parameter Type Description
nova_sonic NovaSonic Configuration for Amazon nova sonic models – e.g. nova-sonic.

NovaSonic

Parameter Type Description
voice enum

Voice name.

The following voices are supported by the nova-sonic model: tiffany, matthew, amy, ambre, florian, beatrice, lorenzo, greta, lennart, lupe, carlos.

The following voices are supported by the nova-2-sonic model: tiffany, matthew, amy, olivia, kiara, arjun, ambre, florian, beatrice, lorenzo, greta, tina, lennart, lupe, carlos, carolina, leo.

top_p float The percentage of most-likely candidates that the model considers for the next token.
top_k int Only sample from the top K options for each subsequent token.
endpointing_sensitivity enum

Configures turn detection sensitivity.

Supported values: low, medium, high.

Example
{
  "nova_sonic": {
    "voice": "amy"
  }
}
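Turn detection sensitivity can be configured alongside the voice (illustrative values):

{
  "nova_sonic": {
    "voice": "amy",
    "endpointing_sensitivity": "high"
  }
}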

Information recorded in agent logs

The logs advanced configuration parameter provides granular control over the information being logged.

Parameter Type Description
logs enum

Configures information recorded in AI Agent logs.

  • enabled - enables logging of all events, including user utterance and LLM response (default).

  • disabled – disables logging of all events, except for start, summary, and task_switch.

  • masked – masks content in all log events - i.e. you will still see that there was user utterance or LLM response, but the content will be masked.

In multi-agent topologies logging configuration is automatically copied to all sub-agents. It is possible, however, to explicitly define logging configuration for specific sub-agent by including logs parameter in its advanced configuration. For example, you may disable or mask logs for specific sub-agent that handles sensitive conversation parts.

For Flows, the logging configuration can be overridden on a per-conversation-node basis.

When logging is disabled or masked in the AI Agent configuration, the corresponding user utterances and LLM responses are automatically marked as “containing sensitive information”, preventing them from being logged in the call transcript and LiveHub platform logs. For a detailed description, see VoiceAI Connect > Bot integration > Basic behavior > Logging and privacy > Masking sensitive information from bot.

Example
{
    "logs": "masked"
}

Running tools on conversation start

The init_tools advanced configuration parameter enables running tools on conversation start. This can be used to reduce agent latency by collecting needed information, for example, the user profile, during the welcome message stage.

You may pass conversation-specific parameters to each tool being executed. For example, you may use the {caller} conversation data variable to pass information about the specific user making the call.

{
  "init_tools": [
    {
      "tool": "get_user_profile",
      "params": {
        "user_phone": "{caller}"
      }
    }
  ]
}

Responses from the tools are added to the conversation context as soon as they become available.

You may optionally populate conversation variables by extracting information from the tool response. Use the extract parameter for this, with jq syntax for the extract expressions.

{
  "init_tools": [
    {
      "tool": "get_weather",
      "params": {
        "city": "London"
      },
      "extract": {
        "current_temperature": ".current_condition[0].temp_C"
      }
    }
  ]
}

Responses from tools that use the extract parameter are not added to the context by default. This can be changed by setting the response parameter to true.
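
For instance, building on the get_weather example above, the following sketch both extracts the variable and keeps the full tool response in the conversation context:

{
  "init_tools": [
    {
      "tool": "get_weather",
      "params": {
        "city": "London"
      },
      "extract": {
        "current_temperature": ".current_condition[0].temp_C"
      },
      "response": true
    }
  ]
}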

Parameter Type Description
init_tools list[InitTool] Execute tools on conversation start (during agent initialization).

 

InitTool
Parameter Type Description
tool str Name of the tool.
params dict[str, Any] Tool parameters; key = parameter name, value = parameter value
extract dict[str, Any] Variables to extract from tool response; key = variable name, value = jq extract expression
response bool Add response to conversation context.

 

Use the init_tools_cancel parameter to cancel init tools after a specified number of user utterances. For example, the following cancels the init tools if they don't complete during the "welcome phase", i.e. before the first user utterance is received:

{
  "init_tools_cancel": 1
}

Parameter Type Description
init_tools_cancel int Cancel init tools after the specified number of user utterances.
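
Since init_tools and init_tools_cancel are sibling keys of the same advanced configuration, they can be combined. The following sketch reuses the get_user_profile example from above and cancels the tool if it hasn't completed before the first user utterance:

{
  "init_tools": [
    {
      "tool": "get_user_profile",
      "params": {
        "user_phone": "{caller}"
      }
    }
  ],
  "init_tools_cancel": 1
}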

Play prerecorded audio files instead of specific LLM responses

The prerecorded_audio advanced configuration parameter may be used to play prerecorded audio files instead of specific LLM-generated phrases. This feature can be used to reduce TTS usage (and cost) and avoid mispronunciation for known, fixed phrases.

Preparing your audio files

Prepare separate audio recordings for each phrase you want to override.

Creating a document with audio files

Upload audio files to AI Agents > Documents:

  1. Navigate to AI Agents > Documents.

  2. Click Create new document.

  3. Specify a unique document name.

  4. Upload one or more of the audio files that you prepared (.wav or .pcm extension). To do this, you must change the file selector to “All files”.

  5. Recordings in the same document are played in random order. Therefore, if you have multiple recordings of the same LLM response (for example, different variants of the welcome message), upload them to the same document. Otherwise, create a separate document for each audio file.

Configuring AI Agent to use audio files

  1. Add the prerecorded_audio advanced configuration parameter to your AI Agent.

  2. Specify a dictionary of LLM phrases and the corresponding audio document names. For example:

    {
        "prerecorded_audio": {
            "phrases": {
                "Hello": "hello-doc",
                "Goodbye": "goodbye-doc"
            }
        }
    }
    
  • LLM phrases must exactly match the LLM response. Therefore, consider using “fixed phrases” in your prompt, for example:

    End conversation by saying "Goodbye"

  • Do not include the audio documents in the Documents tab. Only reference them by name in the prerecorded_audio advanced configuration parameter.

     

Parameter Type Description
prerecorded_audio PrerecordedAudio Play prerecorded audio files instead of specific LLM responses.

PrerecordedAudio

Parameter Type Description
phrases dict[str, str]

LLM response phrases that are replaced by prerecorded audio files.

  • key = LLM phrase

  • value = name or ID of the document containing corresponding audio files

format str

Audio format to be used.

Default = "wav/lpcm16".
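
The format parameter can be combined with phrases in the same configuration. The following sketch explicitly sets the default format:

{
    "prerecorded_audio": {
        "phrases": {
            "Goodbye": "goodbye-doc"
        },
        "format": "wav/lpcm16"
    }
}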

Custom tool certificates

The tool_certs advanced configuration parameter may be used to configure custom certificates for REST tools. You may provision a custom CA certificate for verifying the tool URL's server certificate. You may also provision a custom client key and certificate for mutual authentication during the tool call.

Preparing your certificate files

Prepare the certificate files relevant to your use case. The files must have the specified names and be in PEM format.

Creating a document with certificate files

Upload certificate files to AI Agents > Documents:

  1. Navigate to AI Agents > Documents.

  2. Click Create new document.

  3. Specify a unique document name.

  4. Upload all certificate files prepared in the previous step. To do this, you must change the file selector to “All files”.

Configuring AI Agent to use custom certificates

  1. Add the tool_certs advanced configuration parameter to your AI Agent.

  2. Specify a dictionary of tool names and the corresponding certificate document names. For example:

    {
        "tool_certs": {
            "get_weather": "my-certs-doc"
        }
    }
    

Do not include the certificate documents in the Documents tab. Only reference them by name in the tool_certs advanced configuration parameter.

Parameter Type Description
tool_certs dict[str, str]

Custom certificates for tool calls.

  • key = tool name

  • value = name or ID of the document containing the certificate files
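
Because tool_certs is a plain dictionary, several tools can reference the same certificate document. A sketch with illustrative tool and document names:

{
    "tool_certs": {
        "get_weather": "my-certs-doc",
        "get_user_profile": "my-certs-doc"
    }
}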