Text-to-speech service information

To connect VoiceAI Connect to a text-to-speech service provider, certain information is required from the provider. This information is then used in the VoiceAI Connect configuration for the bot.

Microsoft Azure Speech Services

Connectivity

To connect to Azure's Speech Service, you need to provide AudioCodes with your subscription key for the service. To obtain the key, see Azure's documentation.

The key is configured on VoiceAI Connect using the credentials > key parameter in the providers section.

Note: The key is valid only for a specific region. The region is configured using the region parameter.
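
For illustration, a provider entry might look like the following minimal sketch. The type value azure, the placement of region at the provider level, and all placeholder values are assumptions; adjust them to your deployment:

{
  "name": "my_azure",
  "type": "azure",
  "region": "westus",
  "credentials": {
    "key": "subscription key from Azure"
  }
}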

Language Definition

To define the language, you need to provide AudioCodes with the 'Locale' and 'Voice name' values from Azure's Text-to-Speech table.

These values are configured on VoiceAI Connect using the language and voiceName parameters, respectively. For example, for Italian, the language parameter should be configured to it-IT and the voiceName parameter to it-IT-ElsaNeural.
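
For illustration, the Italian example could be expressed as the following minimal sketch of the relevant settings (where these parameters are set depends on your configuration level):

{
  "language": "it-IT",
  "voiceName": "it-IT-ElsaNeural"
}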

Customized neural voice

If you have defined a customized synthetic voice with Azure's Custom Neural Voice feature, you need to configure VoiceAI Connect with the ttsDeploymentId parameter to identify the associated text-to-speech endpoint.

For VoiceAI Connect Enterprise, this feature is supported only from Version 2.6 and later.
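
For illustration, a configuration using a custom voice might look like this minimal sketch (the voice name and deployment ID values are hypothetical placeholders):

{
  "voiceName": "MyCustomNeuralVoice",
  "ttsDeploymentId": "deployment ID from Azure"
}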

Custom subdomain names for Azure Cognitive Services

Azure Cognitive Services provides a layered security model. This model enables you to restrict your Cognitive Services accounts to a specific subset of networks. For more details, refer to Azure's documentation.

This section describes how to use a custom subdomain and limit network access with VoiceAI Connect.

Creating Custom Domain

1. In Microsoft Azure’s Speech Services > [service name], select Networking (under Resource Management).

2. Click Generate Custom Domain Name.

3. Type in a custom domain name.

4. Click Save.

The custom domain name will also appear in Microsoft Azure’s Cognitive Services > Speech service.

Allowing only Selected Networks

1. In Microsoft Azure’s Cognitive Services > Speech service, select the service name you created.

2. Click Networking (under Resource Management).

3. If not already selected, select the Firewalls and virtual networks tab.

4. Select Selected Networks and Private Endpoints.

5. Add virtual networks or external IP addresses.

Setting up Private endpoint connections

For VoiceAI Connect Enterprise, this feature is supported only from Version 3.4 and later.

1. In Microsoft Azure’s Cognitive Services > Speech service, select the custom domain you created.

2. Click Networking (under Resource Management).

3. Select the Private endpoint connections tab and create a private endpoint. For more details, refer to Azure's documentation.

4. In the provider configuration, set azureCustomSubdomain to the domain name (only the name, not the FQDN) and set azureIsPrivateEndpoint to true.

The azureIsPrivateEndpoint parameter was deprecated in Version 3.10.1.
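
For illustration, assuming a custom domain named my-speech-service, the provider entry might include the following sketch (azureIsPrivateEndpoint applies only to versions where it is not yet deprecated):

{
  "azureCustomSubdomain": "my-speech-service",
  "azureIsPrivateEndpoint": true
}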

Google Cloud Text-to-Speech

Connectivity

To connect to the Google Cloud Text-to-Speech service, you need to provide AudioCodes with a service account key.

Configuration

The key values are configured on VoiceAI Connect using the privateKey and clientEmail parameters in the providers > credentials section. To create the service account key, refer to Google's documentation. From the JSON object representing the key, extract the private key (including the "-----BEGIN PRIVATE KEY-----" prefix) and the service account email, and configure them as the privateKey and clientEmail values, respectively.
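
For illustration, the provider entry could look like this minimal sketch (the type value google and the placeholder values are assumptions):

{
  "name": "my_google",
  "type": "google",
  "credentials": {
    "privateKey": "-----BEGIN PRIVATE KEY-----\n...",
    "clientEmail": "service account email from Google"
  }
}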

Language Definition

To define the language, you need to provide AudioCodes with the 'Language code' and 'Voice name' values from Google's Supported voices and languages table.

These values are configured on VoiceAI Connect using the language and voiceName parameters, respectively. For example, for English (US), the language parameter should be configured to en-US and the voiceName parameter to en-US-Wavenet-A.

AWS Amazon Polly

Connectivity

To connect to the Amazon Polly Text-to-Speech service, you need to provide AudioCodes with the required AWS connectivity information.

Language Definition

To define the language, you need to provide AudioCodes with the 'Language' and 'Name/ID' values from the Voices in Amazon Polly table.

These values are configured on VoiceAI Connect using the language and voiceName parameters, respectively. For example, for English (US), the language parameter should be configured to en-US and the voiceName parameter to Matthew.

Whether 'Neural Voice' or 'Standard Voice' is used is configured on VoiceAI Connect using the ttsEnhancedVoice parameter. Refer to the Voices in Amazon Polly table to check whether the specific language voice supports Neural Voice and/or Standard Voice.

For VoiceAI Connect Enterprise, the ttsEnhancedVoice parameter (neural voices) is supported from Version 3.0 and later.
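
For illustration, a configuration selecting the neural voice Matthew might look like this minimal sketch (treating ttsEnhancedVoice as a boolean that enables Neural Voice is an assumption; consult your version's parameter reference):

{
  "language": "en-US",
  "voiceName": "Matthew",
  "ttsEnhancedVoice": true
}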

Nuance

Connectivity

To connect to the Nuance Vocalizer for Cloud (NVC) speech service, VoiceAI Connect can use either the WebSocket API or the open-source gRPC API. To connect to Nuance Mix, it must use the gRPC API.

VoiceAI Connect is configured to connect to the specific Nuance API type by setting the type parameter in the providers section to nuance or nuance-grpc.

You need to provide AudioCodes with the URL of your Nuance text-to-speech endpoint instance. This URL (with port number) is configured on VoiceAI Connect using the ttsHost parameter.

Note: Nuance offers a cloud service (Nuance Mix) as well as an option to install an on-premises server. The on-premises server does not use authentication, while the cloud service uses OAuth 2.0 authentication (see below).

VoiceAI Connect supports the Nuance Mix Conversational AI services (gRPC) API interfaces. VoiceAI Connect authenticates itself with Nuance Mix (which is located in the public cloud) using OAuth 2.0. To configure OAuth 2.0, use the following providers parameters: oauthTokenUrl, credentials > oauthClientId, and credentials > oauthClientSecret.

Nuance Mix is supported only by VoiceAI Connect Enterprise from Version 2.6 and later.
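
For illustration, a Nuance Mix provider entry might look like this minimal sketch (all values are placeholders to be replaced with those from your Nuance Mix account):

{
  "name": "my_nuance_mix",
  "type": "nuance-grpc",
  "ttsHost": "TTS endpoint URL (with port) from Nuance",
  "oauthTokenUrl": "OAuth token URL from Nuance",
  "credentials": {
    "oauthClientId": "client ID from Nuance Mix",
    "oauthClientSecret": "client secret from Nuance Mix"
  }
}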

Language Definition

To define the language, you need to provide AudioCodes with the 'Language' and 'Voice' values from Nuance's Vocalizer Language Availability table.

These values are configured on VoiceAI Connect using the language and voiceName parameters, respectively. For example, for English (US), the language parameter should be configured to en-US and the voiceName parameter to Kate.

ReadSpeaker

Connectivity

To connect to ReadSpeaker Text-to-Speech service, enter the information provided by your ReadSpeaker account manager upon delivery of the service.

For authentication, enter the entire string of the authorization key as provided by ReadSpeaker. This value is a combination of your ReadSpeaker account ID and a private key (e.g., "1234.abcdefghijklmnopqrstuvxuz1234567"). It is configured on VoiceAI Connect using the credentials > key parameter under the providers section.

To request a new key, contact the ReadSpeaker support team or your ReadSpeaker account manager.

The endpoint value to use in your AudioCodes implementation is provided by the ReadSpeaker team.
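
Putting these together, the provider entry might look like this minimal sketch (the type value readspeaker is an assumption, as is mapping the endpoint to the ttsHost parameter; use the exact values supplied by ReadSpeaker):

{
  "name": "my_readspeaker",
  "type": "readspeaker",
  "ttsHost": "endpoint provided by ReadSpeaker",
  "credentials": {
    "key": "1234.abcdefghijklmnopqrstuvxuz1234567"
  }
}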

Language Definition

To define the language and voice, you need to provide AudioCodes with the 'Language' and 'Voice' values supplied by ReadSpeaker.

These values are configured on VoiceAI Connect using the language and voiceName parameters, respectively. For example, for English (US) using the voice Paul, the language parameter should be configured to en-US and the voiceName parameter to Paul.

Yandex

To connect to Yandex, contact AudioCodes for information.

ElevenLabs

For VoiceAI Connect Enterprise, this feature is supported only from Version 3.22 and later.

Connectivity

To connect VoiceAI Connect to the ElevenLabs text-to-speech service, you need to provide AudioCodes with your ElevenLabs API key.

This key must be configured on VoiceAI Connect using the credentials > key parameter under the providers section.

The provider type under the providers section must be configured to elevenlabs.

For example:

{
  "name": "my_elevenlabs",
  "type": "elevenlabs",
  "credentials": {
    "key": "api key from elevenlabs"
  }
}

Language Definition

To define the voice and model, enter the values provided by ElevenLabs.

The 'voice-id' value is configured on VoiceAI Connect using the voiceName parameter.

The 'model-id' is configured on VoiceAI Connect using the ttsModel parameter.
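
For illustration, the relevant settings could look like this minimal sketch (both values are placeholders for the identifiers provided by ElevenLabs):

{
  "voiceName": "voice-id from ElevenLabs",
  "ttsModel": "model-id from ElevenLabs"
}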

Advanced Parameters

ElevenLabs advanced parameters can be added in the provider section. For a list of the advanced parameters, refer to ElevenLabs documentation.

Example of advanced configuration:

{
  "ttsOverrideConfig": {
    "query": {
      "optimize_streaming_latency": 2
    },
    "body": {
      "voiceSettings": {
        "stability": 3
      }
    }
  }
}

Deepgram

Connectivity

To connect VoiceAI Connect with Deepgram's text-to-speech service, you need to provide AudioCodes with your Deepgram API key.

This key must be configured on VoiceAI Connect using the credentials > key parameter under the providers section.

The provider type under the providers section must be configured to deepgram.

For example:

{
  "name": "my_deepgram",
  "type": "deepgram",
  "credentials": {
    "key": "API key from Deepgram"
  }
}

The default URL to Deepgram's API is api.deepgram.com. However, you can override this URL using the ttsHost parameter.
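
For illustration, adding ttsHost to the provider entry shown above overrides the default (the host value is a placeholder):

{
  "name": "my_deepgram",
  "type": "deepgram",
  "ttsHost": "your-deepgram-host.example.com",
  "credentials": {
    "key": "API key from Deepgram"
  }
}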

Voice Definition

Deepgram offers various text-to-speech voices, listed in Deepgram's documentation under the Values column (for example, "aura-asteria-en" for a US English female voice). The voice is configured on VoiceAI Connect using the voiceName parameter.

Connecting Deepgram using AudioCodes Live Hub

If you want to connect to Deepgram's speech services using AudioCodes Live Hub:

  1. Sign in to the Live Hub portal.

  2. From the Navigation Menu pane, click Speech Services.

  3. Click the Add new speech service button, and then do the following:

    1. In the 'Speech service name' field, type a name for the speech service.

    2. Select only the Text to Speech check box.

    3. Select the Generic provider option.

    4. Click Next.

  4. In the 'Authentication Key' field, enter the token supplied by Deepgram.

  5. In the 'Text to Speech (TTS) URL' field, enter the URL supplied by Deepgram.

  6. Click Create.

Almagu

Connectivity

To connect to Almagu, contact AudioCodes for information.

Language Definition

To define the language, you need to provide AudioCodes with the 'Voice' value from the Almagu documentation.

This value is configured on VoiceAI Connect using the language parameter.