Agent assist

The agent assist feature enables a bot to monitor (receive the transcript of) an ongoing conversation between a customer and a human agent.

The bot can use its internal logic to act upon the sentences being said in the conversation, by sending out-of-band messages to the human agent or to other recipients.

This functionality can be used for different goals.

The following illustration provides an example of an agent assist implementation with VoiceAI Connect:

This feature is supported only when an external speech-to-text service is used; it is not supported for bots with a voice channel.

How do I use it?

This section describes the agent assist feature for regular bots. For integration with Google Agent Assist, see Google Agent Assist.

There are several ways to set this up. In general, the call being recorded passes through an SBC (referred to here as the original SBC), and a separate SBC serves VoiceAI Connect. An AudioCodes original SBC can start SIPREC on demand, whereas non-AudioCodes SBCs should be configured to start SIPREC for every call when the call starts.

The following functional components are required for implementing agent assist:

AudioCodes offers professional services for assistance in implementing such setups.

After you have a setup with VoiceAI Connect serving as a SIPREC Session Recording Server (SRS), you should route the recording sessions towards your agent-assist bot.

Bot configuration

The following configuration parameters must be configured for the agent-assist bot:

The following additional parameters should be configured when using the Agent Assist API:

Agent-assist conversation start

When an agent-assist conversation starts, the bot receives the initial activity (see Initial activity page). The initial activity includes an agentConnected field, specifying whether the agent assist session is already in progress. It also includes a participants field, which the bot uses to identify the participants of the conversation being recorded (usually, the end-user and the human agent).

Example of the participants field:

  "participants": [
    {
      "participant": "caller",
      "uriUser": "A",
      "uriHost": "example.com",
      "displayName": "My name"
    },
    {
      "participant": "callee",
      "uriUser": "12345",
      "uriHost": "example.com"
    }
  ]

The following table lists the fields of each participant in the participants array:

Property

Type

Description

participant

String

Identifier of the participant.

The identifier is obtained from the <ac:role> element under the <participant> element in the SIPREC XML body. The identifier could be set using the SBC's Message Manipulation functionality on the SRC SBC or the SRS SBC.

The following example shows a <participant> element which has the identifier set to "caller":

<participant id="+123456789" session="0000-0000-0000-0000-b44497aaf9597f7f">
  <nameID aor="+123456789@example.com"></nameID>
  <ac:role>caller</ac:role>
</participant>

If no <ac:role> element is set, the identifiers will be set to "participant", "participant-2", etc.

The identifier is assured to be unique among the participants of the conversation.

uriUser

String

User-part of the URI of the participant.

The value is obtained from the user-part of the aor property of the <nameID> element in the SIPREC XML body.

uriHost

String

Host-part of the URI of the participant.

The value is obtained from the host-part of the aor property of the <nameID> element in the SIPREC XML body.

displayName

String

Display name of the participant. The value is obtained from the <name> sub-element of the <nameID> element in the SIPREC XML body.
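
As an illustration of how a bot might consume the participants field, the following Python sketch indexes the array by participant identifier and builds a readable label for each entry. It assumes the participants array sits at the top level of the initial activity; the function names index_participants and describe are hypothetical and not part of VoiceAI Connect.

```python
from typing import Any


def index_participants(initial_activity: dict[str, Any]) -> dict[str, dict[str, Any]]:
    """Map each participant identifier to its entry in the participants array."""
    # Assumption: the array is at the top level of the activity. Adjust the
    # lookup if your bot framework nests it elsewhere.
    participants = initial_activity.get("participants", [])
    return {entry["participant"]: entry for entry in participants}


def describe(entry: dict[str, Any]) -> str:
    """Build a readable label; displayName may be absent, as for the callee below."""
    uri = f'{entry["uriUser"]}@{entry["uriHost"]}'
    name = entry.get("displayName")
    return f"{name} <{uri}>" if name else uri


activity = {
    "participants": [
        {"participant": "caller", "uriUser": "A",
         "uriHost": "example.com", "displayName": "My name"},
        {"participant": "callee", "uriUser": "12345", "uriHost": "example.com"},
    ]
}
by_id = index_participants(activity)
for identifier, entry in by_id.items():
    print(identifier, "->", describe(entry))
```

Because the identifier is unique among the participants of the conversation, it is a safe dictionary key.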

Receiving agent session status

When the agent assist client uses the Agent Assist API, VoiceAI Connect sends an agentConnected event to notify the bot when the agent assist session starts or stops.

Event format:

AudioCodes Bot API
{
  "type": "event",
  "name": "agentConnected",
  "value": {
    "connected": true,
    "agentId": "agent5",
    "metadataToBot": {}
  }
}
Microsoft Bot Framework
{
  "type": "event",
  "name": "agentConnected",
  "value": {
    "connected": true,
    "agentId": "agent5",
    "metadataToBot": {}
  }
}
Dialogflow CX

Add a Custom Payload fulfillment with the following content:

{
  "queryInput": {
    "event": {
      "name": "agentConnected",
      "parameters": {
        "connected": true,
        "agentId": "agent5",
        "metadataToBot": {}
      }
    }
  }
}
Dialogflow ES

Add a Custom Payload response with the following content:

{
  "queryInput": {
    "event": {
      "name": "agentConnected",
      "parameters": {
        "connected": true,
        "agentId": "agent5",
        "metadataToBot": {}
      }
    }
  }
}
Microsoft Copilot Studio
{
  "type": "event",
  "name": "agentConnected",
  "value": {
    "connected": true,
    "agentId": "agent5",
    "metadataToBot": {}
  }
}
Microsoft Copilot Studio legacy
{
  "type": "event",
  "name": "agentConnected",
  "value": {
    "connected": true,
    "agentId": "agent5",
    "metadataToBot": {}
  }
}
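
The event formats above differ only in where the payload is nested. A minimal Python sketch of a helper that extracts the payload from either shape might look as follows. The function name parse_agent_connected is hypothetical, and the sketch assumes the bot code sees the event as a plain dictionary shaped like the examples above.

```python
from typing import Any, Optional


def parse_agent_connected(event: dict[str, Any]) -> Optional[dict[str, Any]]:
    """Return the agentConnected payload, or None if the event is something else."""
    # Shape used in the Bot API, Bot Framework and Copilot Studio examples:
    # the name and value fields sit at the top level.
    if event.get("type") == "event" and event.get("name") == "agentConnected":
        return event.get("value", {})
    # Shape used in the Dialogflow examples: nested under queryInput.event.
    df_event = event.get("queryInput", {}).get("event", {})
    if df_event.get("name") == "agentConnected":
        return df_event.get("parameters", {})
    return None


bot_api_event = {
    "type": "event",
    "name": "agentConnected",
    "value": {"connected": True, "agentId": "agent5", "metadataToBot": {}},
}
payload = parse_agent_connected(bot_api_event)
if payload is not None:
    print("agent", payload["agentId"], "connected:", payload["connected"])
```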

Starting recognition

To receive the transcript of a specific participant, your bot should send a startRecognition event indicating the identifier of that participant. To receive the transcript of both participants, send two startRecognition events, one per participant. To avoid the cost of unnecessary speech-to-text activation, the speech-to-text engine does not start recognition of an audio stream until a startRecognition event is received.

See the Sending activities page for instructions on how to send events using your bot framework.

Your bot must send at least one startRecognition event in order to receive textual messages of the conversation.

To stop receiving the transcript of a participant during the conversation, your bot can send a stopRecognition event.

Example of a startRecognition event:

AudioCodes Bot API
{
  "type": "event",
  "name": "startRecognition",
  "activityParams": {
    "targetParticipant": "caller"
  }
}
Microsoft Bot Framework
{
  "type": "event",
  "name": "startRecognition",
  "channelData": {
    "activityParams": {
      "targetParticipant": "caller"
    }
  }
}
Dialogflow CX

Add a Custom Payload fulfillment with the following content:

{
  "activities": [{
    "type": "event",
    "name": "startRecognition",
    "activityParams": {
      "targetParticipant": "caller"
    }
  }]
}
Dialogflow ES

Add a Custom Payload response with the following content:

{
  "activities": [{
    "type": "event",
    "name": "startRecognition",
    "activityParams": {
      "targetParticipant": "caller"
    }
  }]
}
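
The payloads above share the same event and differ only in their wrapping. The following Python sketch builds the event for each framework. The function name recognition_event and the framework labels are hypothetical. The sketch also assumes that stopRecognition uses the same wrapping as startRecognition, which this page shows only for the latter.

```python
from typing import Any


def recognition_event(name: str, target_participant: str, framework: str) -> dict[str, Any]:
    """Build a startRecognition or stopRecognition payload for one participant."""
    if name not in ("startRecognition", "stopRecognition"):
        raise ValueError(f"unsupported event name: {name}")
    params = {"targetParticipant": target_participant}
    event: dict[str, Any] = {"type": "event", "name": name}
    if framework == "botframework":
        # Bot Framework example: activityParams is nested under channelData.
        event["channelData"] = {"activityParams": params}
        return event
    event["activityParams"] = params
    if framework == "audiocodes":
        return event
    if framework == "dialogflow":
        # Dialogflow CX/ES example: the event is wrapped in an activities array.
        return {"activities": [event]}
    raise ValueError(f"unknown framework label: {framework}")


# Receiving both sides of the call takes one event per participant.
events = [recognition_event("startRecognition", p, "audiocodes")
          for p in ("caller", "callee")]
print(events)
```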

Event parameters

The following table lists the parameters associated with startRecognition and stopRecognition events:

Parameter

Type

Description

targetParticipant

String

Defines the participant identifier for which to start or stop speech recognition.

Optimizing speech-to-text activation

The bot receives an agentConnected field (for more information, see Call initiation) in the start message, and an agentConnected event (for more information, see Receiving agent session status) when an agent assist client starts/stops the session for this call. The bot can decide to start recognition only after there is an agent assist session, and stop recognition if the agent session stops.
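
One way to apply this policy is sketched below in Python: a small gate that turns changes of the agent session state into recognition events. The class name RecognitionGate is hypothetical, the events use the AudioCodes Bot API shape, and actually sending them to VoiceAI Connect is left to your bot framework.

```python
from typing import Any


class RecognitionGate:
    """Decides which recognition events to send as the agent session comes and goes."""

    def __init__(self, participants: list[str]) -> None:
        self.participants = list(participants)
        self.recognizing = False  # no startRecognition has been sent yet

    def on_agent_connected(self, connected: bool) -> list[dict[str, Any]]:
        """Return the events to send for the new agent session state."""
        if connected == self.recognizing:
            return []  # state unchanged, nothing to send
        self.recognizing = connected
        name = "startRecognition" if connected else "stopRecognition"
        return [
            {"type": "event", "name": name,
             "activityParams": {"targetParticipant": participant}}
            for participant in self.participants
        ]


gate = RecognitionGate(["caller", "callee"])
# Seed with the agentConnected field of the initial activity, then feed
# the connected flag of every later agentConnected event.
to_send = gate.on_agent_connected(True)
print([event["name"] for event in to_send])
```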

The Silence and speech detection feature can be used to activate the speech-to-text only when the participant is talking. This can considerably reduce costs for agent-assist calls.

We recommend setting the speechDetection parameter to enabled for agent-assist calls. See the feature documentation for details.

Receiving transcript

The bot receives the transcript as regular text messages (see Receiving user's speech). To allow the bot to distinguish between the messages of the different participants, each message has a participant field indicating the participant identifier and, optionally, a participantUriUser field indicating the user part of the participant's URI.

For example:

AudioCodes Bot API
{
  "type": "message",
  "text": "Hi, how are you doing?",
  "parameters": {
    "participant": "caller",
    "participantUriUser": "Alice"
  }
}
Microsoft Bot Framework
{
  "type": "message",
  "text": "Hi, how are you doing?",
  "channelData": {
    "participant": "caller",
    "participantUriUser": "Alice"
  }
}
Dialogflow CX

The text-message is sent as text input.

The participant and participantUriUser fields are sent as payload parameters.

Dialogflow ES

The text-message is sent as text input.

The participant and participantUriUser fields are sent as payload parameters.

The participant is also set as an input context with the name vaig-participant-<id>, for example: vaig-participant-caller. This can be used for filtering the matching intents. See Dialogflow documentation for more details.
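
For the AudioCodes Bot API and Microsoft Bot Framework message shapes, a bot could collect the transcript per participant along the lines of the Python sketch below. The names participant_of and Transcript are hypothetical, and the "unknown" fallback label is an assumption of this sketch rather than documented behavior.

```python
from typing import Any, Optional


def participant_of(message: dict[str, Any]) -> tuple[Optional[str], Optional[str]]:
    """Read participant and participantUriUser from parameters or channelData."""
    extra = message.get("parameters") or message.get("channelData") or {}
    return extra.get("participant"), extra.get("participantUriUser")


class Transcript:
    """Keeps the utterances of a conversation in arrival order."""

    def __init__(self) -> None:
        self.lines: list[tuple[str, str]] = []

    def add(self, message: dict[str, Any]) -> None:
        if message.get("type") != "message":
            return  # ignore events and other activity types
        participant, _uri_user = participant_of(message)
        self.lines.append((participant or "unknown", message.get("text", "")))

    def by_participant(self, participant: str) -> list[str]:
        return [text for who, text in self.lines if who == participant]


transcript = Transcript()
transcript.add({"type": "message", "text": "Hi, how are you doing?",
                "parameters": {"participant": "caller", "participantUriUser": "Alice"}})
transcript.add({"type": "message", "text": "Fine, thanks.",
                "channelData": {"participant": "callee"}})
print(transcript.by_participant("caller"))
```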

Sending messages by the bot

When the bot needs to send a notification to the human agent or to another recipient, it can do so in several ways:

API for agent application

The Agent Assist application API supports agents that use frameworks such as Genesys and Microsoft Teams.

This feature is applicable only to VoiceAI Connect Enterprise (Version 3.4 and later).

The agent establishes a socket.io WebSocket connection to VoiceAI Connect. After the WebSocket is set up, the agent sends an init message containing the agent’s phone number and other agent info.

During an active call, the agent can initiate an agent assist session by sending a getAgentAssist message to the service.

VoiceAI Connect initiates a SIPREC session on the SBC for this call, creating a new VoiceAI Connect session, mirroring both audio streams of the original call, performing speech-to-text, and sending the text from both participants to the bot.

The bot can send metadata events to VoiceAI Connect, which are forwarded as-is to the agent application via the agent assist API.

If the agent reloads, it sends a new init. The service recognizes the existing call and responds with the agent’s existing sessions so that the agent can continue from where it dropped off.
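
The reconnect behavior described above can be pictured with a toy in-memory model in Python. This is not the VoiceAI Connect API: the class, its method signatures, and the use of a phone number and a call identifier as keys are all assumptions made for illustration. Only the message names init and getAgentAssist come from this page.

```python
class AgentAssistServiceModel:
    """Toy stand-in for the service side of the flow (illustrative only)."""

    def __init__(self) -> None:
        # Hypothetical bookkeeping: agent phone number -> active call identifiers.
        self._sessions: dict[str, list[str]] = {}

    def init(self, agent_phone: str) -> list[str]:
        """Model of an init message: register the agent, return existing sessions."""
        return list(self._sessions.setdefault(agent_phone, []))

    def get_agent_assist(self, agent_phone: str, call_id: str) -> None:
        """Model of a getAgentAssist message: start a session for the call."""
        sessions = self._sessions.setdefault(agent_phone, [])
        if call_id not in sessions:
            sessions.append(call_id)


service = AgentAssistServiceModel()
first = service.init("12345")             # fresh agent, nothing to resume
service.get_agent_assist("12345", "call-1")
resumed = service.init("12345")           # agent reloaded and sent init again
print(first, resumed)
```

In the model, the second init returns the session started before the reload, which is what lets the agent application continue where it left off.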

For API details, see Agent-assist-rest-api.htm