Active listening mode

Active listening mode refers to the AI Agent's ability to process input while the user is still speaking, instead of waiting for the final speech recognition result. It can be used to improve agent responsiveness when using standard text-based (non-realtime) Large Language Models.

When active listening mode is enabled, the AI Agent uses hypotheses from the STT engine to generate LLM responses while the user speaks. In most cases the last hypothesis is very close to the final recognition, which allows the AI Agent to use the pre-generated LLM response and shortens the end-to-end response latency.
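The reuse decision described above can be sketched roughly as follows. This is a minimal illustration, not the AI Agent's actual implementation: the `similarity` function (built on Python's `difflib.SequenceMatcher`), the `pick_response` helper, and the `generate` callback are hypothetical names, and the similarity metric the platform really uses is not documented here. The 0.9 threshold mirrors the default of the similarity_threshold parameter.

```python
from difflib import SequenceMatcher
from typing import Callable, Optional


def similarity(a: str, b: str) -> float:
    """Return a 0.0-1.0 similarity score (assumed metric, for illustration only)."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def pick_response(
    final_text: str,
    last_hypothesis: Optional[str],
    pregenerated: Optional[str],
    generate: Callable[[str], str],
    similarity_threshold: float = 0.9,
) -> str:
    """Reuse the pre-generated response if the last hypothesis is close enough
    to the final recognition; otherwise generate a fresh response."""
    if (
        last_hypothesis is not None
        and pregenerated is not None
        and similarity(final_text, last_hypothesis) >= similarity_threshold
    ):
        # Close match: the response is already available when speech ends.
        return pregenerated
    # No usable pre-generated response: fall back to a normal LLM request.
    return generate(final_text)
```

In this sketch, a final recognition that differs from the last hypothesis only by trailing punctuation would reuse the pre-generated response, while a substantially different final recognition would trigger a new LLM request.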

Note that active listening mode consumes additional LLM tokens, because responses are also generated for hypotheses that are later discarded. You are essentially trading tokens for latency.

Active listening mode can be activated by adding the following in your AI Agent’s advanced configuration screen:

{
    "active_listening": {
        "enabled": true
    }
}

In addition, you can customize active listening mode behavior by configuring the parameters described below.

Parameter Type Description
active_listening ActiveListening

Improve agent responsiveness by generating LLM responses based on STT hypotheses.

ActiveListening

Parameter Type Description
mode enum

Active listening mode.

Supported values:

  • disabled – active listening is off (default)

  • every_hypothesis – process every hypothesis received from STT; recommended for STTs that do not include punctuation in hypotheses (e.g. Azure STT)

  • end_of_sentence – process hypotheses ending with sentence punctuation (.!?) and "eager end of turn" events; recommended for STTs that include punctuation in hypotheses (e.g. Deepgram Nova 3)

  • eager_end_of_turn – process only "eager end of turn" events (currently supported by Deepgram Flux only)

max_parallel int

Maximum number of parallel response generations.

Default = 3

hypothesis_interval_ms int

Minimum time, in milliseconds, between hypotheses that trigger response generation.

Default = 100 ms

similarity_threshold float

Similarity threshold between the final recognition and the last hypothesis (0.0 to 1.0).

Default = 0.9
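
To illustrate how the mode and hypothesis_interval_ms parameters could interact, here is a rough sketch of a gate that decides whether an STT event should trigger response generation. The `HypothesisGate` class and its method names are hypothetical and do not reflect the AI Agent's internals. In particular, applying the interval to "eager end of turn" events as well is an assumption of this sketch.

```python
import re
from typing import Optional

# Hypotheses ending with sentence punctuation (.!?), as in end_of_sentence mode.
_SENTENCE_END = re.compile(r"[.!?]\s*$")


class HypothesisGate:
    """Hypothetical helper: decides whether an STT event triggers generation."""

    def __init__(self, mode: str = "disabled", hypothesis_interval_ms: int = 100):
        self.mode = mode
        self.interval_ms = hypothesis_interval_ms
        self._last_trigger_ms: Optional[float] = None

    def _mode_allows(self, text: str, eager_end_of_turn: bool) -> bool:
        if self.mode == "every_hypothesis":
            return True
        if self.mode == "end_of_sentence":
            return eager_end_of_turn or bool(_SENTENCE_END.search(text))
        if self.mode == "eager_end_of_turn":
            return eager_end_of_turn
        return False  # "disabled" or an unknown mode

    def should_trigger(
        self, text: str, now_ms: float, eager_end_of_turn: bool = False
    ) -> bool:
        if not self._mode_allows(text, eager_end_of_turn):
            return False
        # Enforce the minimum spacing between triggering events
        # (assumption: the interval also applies to eager end-of-turn events).
        if (
            self._last_trigger_ms is not None
            and now_ms - self._last_trigger_ms < self.interval_ms
        ):
            return False
        self._last_trigger_ms = now_ms
        return True
```

With mode set to end_of_sentence, a hypothesis such as "I need help" would be skipped in this sketch, while "I need help." would trigger generation, provided at least hypothesis_interval_ms has passed since the previous trigger.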