Active listening mode

Active listening mode refers to the AI Agent's ability to process input while the user is speaking, instead of waiting for the final speech-recognition result. This can improve agent responsiveness when using standard textual (non-realtime) Large Language Models.

When active listening mode is enabled, the AI Agent uses hypotheses from the STT engine to generate LLM responses while the user is still speaking. In most cases the last hypothesis is very close to the final recognition, allowing the AI Agent to use a pre-generated LLM response and shortening the end-to-end response latency.
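The mechanism above can be sketched as follows. This is a minimal illustration, not the product's implementation: the `ActiveListener` class, its method names, and the use of `difflib` for similarity are all assumptions made for the example.

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Similarity ratio in [0.0, 1.0]; 1.0 means identical texts."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

class ActiveListener:
    """Caches LLM responses pre-generated from STT hypotheses."""

    def __init__(self, generate, similarity_threshold: float = 0.9):
        self.generate = generate                  # callable: text -> LLM response
        self.similarity_threshold = similarity_threshold
        self._cache: list[tuple[str, str]] = []   # (hypothesis, response)

    def on_hypothesis(self, text: str) -> None:
        # Pre-generate a response while the user is still speaking.
        self._cache.append((text, self.generate(text)))

    def on_final(self, text: str) -> str:
        # Reuse a cached response if some hypothesis was close enough
        # to the final recognition; otherwise pay for a fresh LLM call.
        for hyp, resp in reversed(self._cache):
            if similarity(hyp, text) >= self.similarity_threshold:
                return resp
        return self.generate(text)
```

If the final recognition matches a recent hypothesis closely enough, the cached response is returned with no additional LLM round-trip; otherwise the agent falls back to a regular generation.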

Note that active listening mode consumes additional LLM tokens, so you are essentially trading tokens for latency.

Active listening mode can be activated by adding the following to your AI Agent's advanced configuration screen:

{
    "active_listening": {
        "enabled": true
    }
}

In addition, you can customize active listening behavior by configuring the additional parameters described below.

| Parameter | Type | Description |
|---|---|---|
| active_listening | ActiveListening | Improve agent responsiveness by generating LLM responses based on STT hypotheses. |

ActiveListening

| Parameter | Type | Description |
|---|---|---|
| enabled | bool | Enable active listening mode. |
| max_parallel | int | Maximum number of parallel response generations. Default: 3 |
| hypothesis_interval_ms | int | Minimum time, in milliseconds, between hypotheses that trigger response generation. Default: 100 |
| confidence_threshold | float | Confidence threshold for processing hypotheses (0.0 to 1.0). Default: 0.8 |
| similarity_threshold | float | Similarity threshold between the final recognition and the last hypothesis (0.0 to 1.0). Default: 0.9 |
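A full configuration combining the parameters above might look like this; the values shown are the documented defaults, written out explicitly for illustration:

{
    "active_listening": {
        "enabled": true,
        "max_parallel": 3,
        "hypothesis_interval_ms": 100,
        "confidence_threshold": 0.8,
        "similarity_threshold": 0.9
    }
}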

Configuring confidence threshold

Active listening mode takes into consideration the confidence level included in each STT hypothesis event.

Many STT engines, such as Azure STT, do not include a confidence level in hypothesis events. For these engines, the following confidence level is implicitly assumed:

Active listening mode uses the confidence_threshold parameter to determine whether to trigger an LLM response for each STT hypothesis event.
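The decision rule can be sketched as a simple filter. This is an illustrative assumption, not the product's code; in particular, the fallback value used when an engine omits confidence (`DEFAULT_CONFIDENCE` below) is a placeholder for the implicitly assumed level mentioned above.

```python
# Assumption: the fallback for engines that omit confidence; the actual
# implicitly assumed value is defined by the platform, not by this sketch.
DEFAULT_CONFIDENCE = 1.0

def should_trigger(hypothesis: dict, confidence_threshold: float = 0.8) -> bool:
    """Decide whether an STT hypothesis event should trigger LLM generation."""
    confidence = hypothesis.get("confidence", DEFAULT_CONFIDENCE)
    return confidence >= confidence_threshold
```

Hypotheses below the threshold are dropped without spending LLM tokens; hypotheses at or above it (including those with no confidence field, under the fallback) trigger a pre-generation.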

Use the following guidelines to configure the optimal confidence_threshold value for your setup: