Bot API
VoiceAI Connect Enterprise provides a generic bot API for connecting to any bot service that doesn't use one of the standard bot frameworks (such as Microsoft Azure, Amazon Lex, and Google Dialogflow). Such a customer-proprietary bot service could also employ middleware that proxies between it and VoiceAI Connect Enterprise. In such a scenario, it's preferable that VoiceAI Connect Enterprise connects directly to your framework or middleware.
AudioCodes bot API offers the following benefits:
- Easy to implement
- Simple authentication scheme
- Traverses firewalls and HTTP proxies
- Enables the bot to use VoiceAI Connect Enterprise's wide range of features
Bot API implementation
AudioCodes has developed a channel for the Rasa bot framework that implements the AudioCodes bot API. More information about this channel can be found here.
This implementation can be used as a reference for using the AudioCodes bot API.
Overview
The roles in the bot API are:
- Client: VoiceAI Connect Enterprise
- Server: Your bot service
You should implement the server-side of the API so that VoiceAI Connect Enterprise can connect to it.
The API uses HTTP. All requests by VoiceAI Connect Enterprise are sent to the bot service.
The API only conveys textual messages (not voice) because VoiceAI Connect Enterprise uses speech-to-text (STT) and text-to-speech (TTS) engines.
Conversation Flow
The conversation flow between VoiceAI Connect Enterprise and the bot service is as follows:
- VoiceAI Connect Enterprise creates a new conversation by using a pre-configured URL.
- The reply contains URLs for posting messages to the conversation.
- Throughout the conversation, VoiceAI Connect Enterprise posts the user’s messages to the given URL, while the responses contain the bot’s replies.
- VoiceAI Connect Enterprise ends the conversation.
The following shows an example of a conversation flow between VoiceAI Connect Enterprise and a proprietary bot service:
Before You Begin
Prior to using this API, please note the following:
- All VoiceAI Connect Enterprise requests use the HTTP POST method, except for the health check, which uses HTTP GET (see Health Check).
- All requests and responses contain a JSON body with the appropriate 'Content-Type: application/json' header.
- All JSON bodies must be encoded with UTF-8.
- Any non-200 response is considered a failure and disconnects the conversation. Failure responses can optionally contain a JSON body with a reason attribute.
- All requests have a timeout of 20 seconds. If the timeout expires and no response was received, the conversation is disconnected.
- VoiceAI Connect Enterprise uses connection reuse (HTTP Connection Keep-Alive). It's recommended that the bot service sets the HTTP Keep-Alive time to at least 30 seconds.
- If a connection error occurs, VoiceAI Connect Enterprise retries the request. Note that VoiceAI Connect Enterprise ignores duplicated activity IDs and therefore, retrying is not expected to cause double handling of activities.
- You can define an HTTP header for the unique message ID that is sent in each HTTP request, using the sendRequestIdHeader parameter.
Configuration
VoiceAI Connect Enterprise should be configured with a botURL parameter. VoiceAI Connect Enterprise uses the botURL to connect to your bot, for two purposes:
- To verify that the URL is valid and active (using HTTP GET, as described in Health Check)
- To create new conversations (using HTTP POST, as described in Creation of a Conversation)
For Rasa bots, a typical value for botURL has the form: http://<host>/webhooks/audiocodes/webhook
Other relevant configuration parameters include:
- Security: Depends on one of the following authentication methods (see Security):
  - token: Static access token used to authenticate communication with the bot.
  - oauthTokenUrl: OAuth access token provided by an authorization server for OAuth 2.0 authentication.
- providerBotName: The value that is sent on the creation of the conversation, which allows one botURL to be used for several bots.
API Endpoints
Creation of a Conversation
To start a conversation, VoiceAI Connect Enterprise sends a POST request to the botURL. VoiceAI Connect Enterprise sends the unique ID of the conversation in the conversation attribute. If multiple bots share the same URL, VoiceAI Connect Enterprise can be configured to add a bot attribute to the request body.
The body of the response from the bot service should contain a set of URLs for performing actions on the newly created conversation. Each URL should be unique to the conversation by containing a UUID as part of its path: either the ID from the conversation attribute or a UUID generated by the bot service.
If a URL is relative, VoiceAI Connect Enterprise resolves the URL using the botURL parameter as the base URL (according to Section 4 of RFC 1808).
After the conversation is created, VoiceAI Connect Enterprise sends an activity with the start event.
Request Body Attributes
Parameter | Type | Description |
---|---|---|
conversation | String | VoiceAI Connect Enterprise's conversation ID. |
bot | String | (Optional) The value of the providerBotName parameter. |
capabilities | Array of strings | An array of capability values. The example below shows the websocket value. |
Response Body Attributes
Parameter | Type | Description |
---|---|---|
activitiesURL | String | Relative or absolute URL. VoiceAI Connect Enterprise sends activities to this URL. Note: This parameter is mandatory. |
refreshURL | String | Relative or absolute URL. Note: This parameter is mandatory. |
disconnectURL | String | Relative or absolute URL. Note: This parameter is mandatory. |
websocketURL | String | (Optional) Relative or absolute URL. This URL indicates that the bot (server) supports sending activities using WebSocket. For more information, see Sending Activities over WebSocket. |
expiresSeconds | Number | The value can be from 60 to 3600 seconds. The recommended value is 120. For more information on conversation refreshes, see Conversation Refresh. Note: This parameter is mandatory. |
Example
The following shows an example of creating a conversation:
Request:
{ "conversation": "ad8f59d2-4a72-4f19-ad34-e7e9b1636111", "capabilities": [ "websocket" ] }
Response:
{ "activitiesURL": "conversation/ad8f59d2-4a72-4f19-ad34-e7e9b1636111/activities", "refreshURL": "conversation/ad8f59d2-4a72-4f19-ad34-e7e9b1636111/refresh", "disconnectURL": "conversation/ad8f59d2-4a72-4f19-ad34-e7e9b1636111/disconnect", "expiresSeconds": 60 }
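On the bot service side, building the creation response could look roughly like the sketch below. The function name `create_conversation`, the `offer_websocket` flag, and the `/websocket` path are hypothetical; treating the websocket capability in the request as a precondition for returning websocketURL is also an assumption, not something this page states.

```python
import uuid


def create_conversation(request_body: dict,
                        expires_seconds: int = 120,
                        offer_websocket: bool = False) -> dict:
    """Build the response body for a conversation-creation request."""
    if not 60 <= expires_seconds <= 3600:
        raise ValueError("expiresSeconds must be between 60 and 3600")

    # Reuse the ID sent by VoiceAI Connect Enterprise, or generate a UUID.
    conversation_id = request_body.get("conversation") or str(uuid.uuid4())
    base = f"conversation/{conversation_id}"

    response = {
        "activitiesURL": f"{base}/activities",
        "refreshURL": f"{base}/refresh",
        "disconnectURL": f"{base}/disconnect",
        "expiresSeconds": expires_seconds,
    }
    # Assumption: only offer a WebSocket URL if the client listed the capability.
    if offer_websocket and "websocket" in request_body.get("capabilities", []):
        response["websocketURL"] = f"{base}/websocket"  # hypothetical path
    return response


request = {
    "conversation": "ad8f59d2-4a72-4f19-ad34-e7e9b1636111",
    "capabilities": ["websocket"],
}
print(create_conversation(request, expires_seconds=60))
```

The returned URLs are relative, so they are resolved against the botURL as described above.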
Sending and Receiving Activities
The messages sent between the parties of the conversation are called activities. When VoiceAI Connect Enterprise has activities to send, it sends a POST request to the URL specified in activitiesURL. The body of the POST request includes an activities attribute containing an array of activities.
The body of the response should also include an activities attribute containing an array of activities. If no activities are needed, either the activities attribute is omitted or it's sent with an empty array.
If the conversation doesn’t exist, the bot service should respond with a 404 Not Found.
In addition, each activity must include the following attributes:
- id: The sender of an activity should generate a UUID (RFC 4122, v4) per activity and send it in the id attribute. The receiver of activities should retain a set of all the received activity IDs (in the current conversation) and ignore duplicate activities. This allows the resending of activities in case of failures, without the activities being handled twice.
- timestamp: The sender of an activity should add a timestamp attribute containing the current time. The format of the timestamp is according to RFC 3339, where the time is in UTC with 3 decimal digits for milliseconds. For example: "2019-04-23T18:25:43.511Z". The timestamp must contain the creation time of the activity and must not be modified if the activity is re-sent. The timestamp is mainly used for logging and debugging.
- language: When VoiceAI Connect Enterprise sends activities to the bot, it adds the language attribute to the activities. Note that VoiceAI Connect Enterprise ignores this attribute for activities sent by the bot.
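A minimal sketch of how a bot service might stamp its outgoing activities with the id and timestamp attributes is shown below. The helper name `make_activity` is hypothetical; only standard-library calls are used.

```python
import uuid
from datetime import datetime, timezone


def make_activity(activity_type: str, **fields) -> dict:
    """Create an activity with a v4 UUID and an RFC 3339 UTC timestamp."""
    now = datetime.now(timezone.utc)
    # isoformat() yields "...+00:00"; the examples on this page use the "Z" suffix.
    timestamp = now.isoformat(timespec="milliseconds").replace("+00:00", "Z")
    return {
        "id": str(uuid.uuid4()),
        "timestamp": timestamp,
        "type": activity_type,
        **fields,
    }


activity = make_activity("message", text="Hi there.")
print(activity)
```

If an activity has to be re-sent, the same dictionary should be sent again rather than calling the helper a second time, so that the id and timestamp stay unchanged.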
Request Body Attributes
Parameter | Type | Description |
---|---|---|
conversation | String | VoiceAI Connect Enterprise's conversation ID. |
activities | Array of Objects | Array of activities. |
Response Body Attributes
Parameter | Type | Description |
---|---|---|
activities | Array of Objects | Array of activities. |
Example
The following shows an example of the start activity that is sent by VoiceAI Connect Enterprise when a conversation starts (using the activities endpoint):
Request:
{ "conversation": "ad8f59d2-4a72-4f19-ad34-e7e9b1636111", "activities": [ { "id": "ecf2d78d-ef7b-4a5e-907c-53c97cef5f97", "timestamp": "2020-01-26T13:03:48.745Z", "language": "en-US", "type": "event", "name": "start", "parameters": { "callee": "1234", "calleeHost": "10.20.30.40", "caller": "+123456789", "callerHost": "10.20.30.40" } } ] }
Response:
{ "activities": [ { "id": "15b3d407-5161-41e7-8114-a273859c5f6d", "timestamp": "2020-01-26T13:03:48.748Z", "language": "en-US", "type": "message", "text": "Hi there." } ] }
The following shows an example of message activities that correspond to speech utterances:
Request (to activitiesURL):
{ "conversation": "55b77909-82d8-4355-87f1-68081f4dbb36", "activities": [ { "id": "bc44c054-846d-490d-85e9-d3aea96b4f0f", "timestamp": "2019-08-20T14:09:12.251Z", "language": "en-US", "type": "message", "text": "Hi.", "parameters": { "confidence": 0.6599681377410889, "recognitionOutput": { "RecognitionStatus": "Success", "Offset": 32300000, "Duration": 5800000, "NBest": [ { "Confidence": 0.6599681377410889, "Lexical": "hi", "ITN": "Hi", "MaskedITN": "Hi", "Display": "Hi." }, { "Confidence": 0.3150425851345062, "Lexical": "high", "ITN": "high", "MaskedITN": "high", "Display": "high" } ] } } } ] }
Response:
{ "activities": [ { "id": "dc4eb401-17f2-436f-80fa-b60156b8a804", "timestamp": "2020-01-26T13:04:00.885Z", "language": "en-US", "type": "message", "text": "How may I assist you?" } ] }
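The exchange above, including the duplicate handling required for retried requests, can be sketched on the bot side as follows. The names `text_reply` and `handle_activities` are hypothetical, and the fixed reply texts simply mirror the examples; a real bot would apply its own dialog logic.

```python
import uuid
from datetime import datetime, timezone


def text_reply(text: str) -> dict:
    """Build a message activity with a fresh UUID and a UTC timestamp."""
    now = datetime.now(timezone.utc).isoformat(timespec="milliseconds")
    return {
        "id": str(uuid.uuid4()),
        "timestamp": now.replace("+00:00", "Z"),
        "type": "message",
        "text": text,
    }


def handle_activities(seen_ids: set, request_body: dict) -> dict:
    """Process a POST to activitiesURL for one conversation.

    seen_ids holds the IDs of activities already handled in this
    conversation, so that a retried request is not handled twice.
    """
    replies = []
    for activity in request_body.get("activities", []):
        if activity["id"] in seen_ids:
            continue  # duplicate (for example, a retry): ignore it
        seen_ids.add(activity["id"])
        if activity.get("type") == "event" and activity.get("name") == "start":
            replies.append(text_reply("Hi there."))
        elif activity.get("type") == "message":
            replies.append(text_reply("How may I assist you?"))
    return {"activities": replies}


seen = set()
start = {"conversation": "c1",
         "activities": [{"id": "a1", "type": "event", "name": "start"}]}
print(handle_activities(seen, start))  # one "Hi there." message
print(handle_activities(seen, start))  # duplicate request: empty activities array
```

In this sketch a duplicate request yields an empty activities array. A service that wants a retried request to return the original replies would additionally need to cache them per incoming activity ID; that choice is left open here.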
Sending Activities over WebSocket
Typically, bots based on the AudioCodes Bot API operate by request-response communication. The user's input is conveyed to the bot in the request, and the bot's immediate response is conveyed in the response. The Asynchronous API allows bots to also send any messages (activities) asynchronously to users through VoiceAI Connect Enterprise (i.e., without requiring a request).
An example of a scenario where this asynchronous feature could be used is when the bot needs to perform a time-consuming action, such as fetching information from a database. In this scenario, the bot may first send a reply of "please wait" to the user, and then, once the information is retrieved, send a message to the user with the information.
To use the asynchronous API, the bot should specify a URL in the websocketURL property of the response it sends to the botURL request (as described in Creation of a Conversation). If this property is specified, VoiceAI Connect Enterprise opens a WebSocket connection dedicated to this conversation. The bot must be ready to accept this incoming WebSocket connection (it's recommended to use a library that implements a WebSocket server at the bot service). This WebSocket connection is used by VoiceAI Connect Enterprise to receive asynchronous activities from the bot; VoiceAI Connect Enterprise doesn't send any messages through this connection.
VoiceAI Connect Enterprise maintains the WebSocket connection for the entire duration of the conversation and closes it upon the end of the conversation. If the WebSocket connection establishment fails, or an unrecoverable error causes it to close, the conversation is terminated with an error.
It's recommended to use HTTPS for securing the WebSocket connection. In addition, a static token or a dynamic OAuth 2.0 token can be used as specified in Security (the token is sent in the Authorization header of the WebSocket establishment request).
Once the WebSocket connection is established, the bot can send any activities through this connection whenever it wants. Activities are sent as text messages (WebSocket payload) containing a JSON object with a single activities parameter (the structure of the JSON object is identical to the response body described in Sending and Receiving Activities). For example:
{ "activities": [ { "id": "15b3d407-5161-41e7-8114-a273859c5f6d", "timestamp": "2020-01-26T13:03:48.748Z", "language": "en-US", "type": "message", "text": "Hi there." } ] }
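The "please wait" scenario described above might be sketched as shown below. The `send` argument stands for the text-message send coroutine of whatever WebSocket server library the bot service uses, and `fetch_answer` stands for the slow lookup; both are hypothetical parameters, and no specific WebSocket library is assumed.

```python
import asyncio
import json
import uuid
from datetime import datetime, timezone


async def send_when_ready(send, fetch_answer) -> None:
    """Wait for a slow lookup, then push the result as an asynchronous activity."""
    text = await fetch_answer()
    now = datetime.now(timezone.utc).isoformat(timespec="milliseconds")
    activity = {
        "id": str(uuid.uuid4()),
        "timestamp": now.replace("+00:00", "Z"),
        "type": "message",
        "text": text,
    }
    # One WebSocket text message containing a JSON object with a single
    # "activities" attribute.
    await send(json.dumps({"activities": [activity]}))


async def demo() -> None:
    async def fake_send(message: str) -> None:  # stands in for a real connection
        print("sent:", message)

    async def fake_lookup() -> str:  # stands in for a database query
        await asyncio.sleep(0.1)
        return "Here is the information you asked for."

    await send_when_ready(fake_send, fake_lookup)


asyncio.run(demo())
```

The immediate "please wait" reply would still be returned in the regular HTTP response to the activities request; only the later message travels over the WebSocket connection.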
Conversation Refresh
VoiceAI Connect Enterprise refreshes the conversation by sending a refresh request at least 30 seconds before the expiresSeconds value expires. The expiresSeconds timer starts when the conversation starts and restarts on each refresh. The refresh is done by sending a POST request to the URL specified in refreshURL.
The response body can update the expiresSeconds value.
If the bot service doesn't receive a refresh request before the expiresSeconds value expires, it should consider the conversation as terminated (with an error condition).
The bot service should reply with a 200 OK response to the refresh request (see possible attributes below). If no reply is received for a refresh request, or an error reply is received, the conversation is terminated with an error.
If the conversation doesn’t exist, the bot service should respond with a 404 Not Found.
Request Body Attributes
Parameter | Type | Description |
---|---|---|
conversation | String | VoiceAI Connect Enterprise's conversation ID. |
Response Body Attributes
Parameter | Type | Description |
---|---|---|
expiresSeconds | Number | The value can be from 60 to 3600 seconds. The recommended value is 120. Note: The parameter is optional. If not specified, the previous value of expiresSeconds remains in effect. |
Example
Request:
{ "conversation": "ad8f59d2-4a72-4f19-ad34-e7e9b1636111" }
Response:
{ "expiresSeconds": 60 }
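On the bot service side, the expiry rule can be tracked with a small per-conversation timer, sketched below. The class name `ConversationTimer` and its injectable `clock` parameter are hypothetical and exist only to make the logic easy to test.

```python
import time


class ConversationTimer:
    """Tracks when a conversation expires unless it is refreshed."""

    def __init__(self, expires_seconds: int = 120, clock=time.monotonic):
        self._clock = clock
        self.expires_seconds = self._validate(expires_seconds)
        self._deadline = clock() + self.expires_seconds

    @staticmethod
    def _validate(value: int) -> int:
        if not 60 <= value <= 3600:
            raise ValueError("expiresSeconds must be between 60 and 3600")
        return value

    def refresh(self, new_expires_seconds=None) -> dict:
        """Handle a refresh request and return the response body."""
        if new_expires_seconds is not None:
            self.expires_seconds = self._validate(new_expires_seconds)
        # The timer restarts on every refresh.
        self._deadline = self._clock() + self.expires_seconds
        return {"expiresSeconds": self.expires_seconds}

    def is_expired(self) -> bool:
        """True once no refresh has arrived within expiresSeconds."""
        return self._clock() >= self._deadline


timer = ConversationTimer(expires_seconds=60)
print(timer.refresh())     # {'expiresSeconds': 60}
print(timer.is_expired())  # False
```

A periodic cleanup task could call is_expired() for each stored conversation and discard those that were not refreshed in time.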
Ending a Conversation
The conversation may end due to the following reasons:
- The VoIP call has ended (loss of connection with VoiceAI Connect Enterprise, or some failure on the SIP side).
- The bot has disconnected (using the hangup event, as described in Disconnecting the call).
- An error has occurred.
For any of the above reasons, VoiceAI Connect Enterprise sends a POST request to the URL specified in disconnectURL. The body of the POST request can contain a reason attribute. The body of the response should be an empty JSON object. If the conversation doesn’t exist, the bot service should respond with a 404 Not Found.
If the conversation expires on the bot service side (i.e., no refresh was done by VoiceAI Connect Enterprise), no explicit message is sent by VoiceAI Connect Enterprise.
Request Body Attributes
Parameter | Type | Description |
---|---|---|
conversation | String | VoiceAI Connect Enterprise's conversation ID. |
reason | String | (Optional) The reason for disconnecting the conversation (free text). |
Response Body Attributes
The response body is empty.
Example
Request:
{ "conversation": "ad8f59d2-4a72-4f19-ad34-e7e9b1636111", "reasonCode": "client-disconnected", "reason": "Client Side" }
Response:
{ }
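A disconnect handler on the bot side might look like the following sketch. The function name `handle_disconnect`, the in-memory `conversations` dictionary, and the reason text in the 404 body are hypothetical.

```python
def handle_disconnect(conversations: dict, request_body: dict):
    """Handle a POST to disconnectURL.

    Returns a (status_code, response_body) pair.
    """
    conversation_id = request_body.get("conversation")
    if conversation_id not in conversations:
        # Failure responses may carry an optional "reason" attribute.
        return 404, {"reason": "Conversation not found"}
    del conversations[conversation_id]  # release the conversation state
    return 200, {}  # empty JSON object on success


conversations = {"ad8f59d2-4a72-4f19-ad34-e7e9b1636111": {}}
request = {"conversation": "ad8f59d2-4a72-4f19-ad34-e7e9b1636111",
           "reason": "Client Side"}
print(handle_disconnect(conversations, request))  # (200, {})
print(handle_disconnect(conversations, request))  # (404, {'reason': 'Conversation not found'})
```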
Health Check
To allow the connection with the bot to be validated without creating a conversation, the bot service should handle GET requests to the botURL without creating a conversation (conversations are created only by POST requests). When VoiceAI Connect Enterprise is deployed as a Software as a Service (SaaS) cloud service, it uses this health-check endpoint to verify that the botURL and token that were provided are correct. Upon success, the bot replies with a 200 OK containing a JSON body with the attributes listed below.
Request Body Attributes
The request body is empty.
Response Body Attributes
Parameter | Type | Description |
---|---|---|
type | String | The value is always ac-bot-api. |
success | Boolean | The value is always true. |
Example
{ "type": "ac-bot-api", "success": true }
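A health-check handler can be sketched with Python's standard `http.server` module, as shown below. The names `health_check_body`, `BotHandler`, and `run`, as well as the port, are hypothetical. Token verification is omitted because it depends on the authentication method described in Security.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def health_check_body() -> bytes:
    """UTF-8 encoded JSON body returned by the health check."""
    return json.dumps({"type": "ac-bot-api", "success": True}).encode("utf-8")


class BotHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # A GET to botURL must not create a conversation; it only reports health.
        body = health_check_body()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def run(port: int = 8080) -> None:
    """Start the server; call this from your own entry point."""
    HTTPServer(("0.0.0.0", port), BotHandler).serve_forever()
```

The conversation-creation logic would be added to the same handler in a do_POST method, since both requests target the botURL.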