Text-to-speech service information
To connect VoiceAI Connect to a text-to-speech service provider, certain information is required from the provider, which is then used in the VoiceAI Connect configuration for the bot.
Microsoft Azure Speech Services
Connectivity
To connect to Azure's Speech Service, you need to provide AudioCodes with your subscription key for the service. To obtain the key, see Azure's documentation.
The key is configured on VoiceAI Connect using the credentials > key parameter in the providers section.
Note: The key is valid only for a specific region. The region is configured using the region parameter.
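As an illustration, a providers entry for Azure might look like the following minimal sketch. The name, the type value azure, the sample region value, and the placement of region at the provider level are assumptions not stated above; only the credentials > key and region parameters are described in this section.
{
  "name": "my_azure",
  "type": "azure",
  "credentials": {
    "key": "subscription key from Azure"
  },
  "region": "westeurope"
}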
Language Definition
To define the language, you need to provide AudioCodes with the following from Azure's Text-to-Speech table:
- 'Locale' (language)
- 'Voice name' (gender based)
The 'Locale' and 'Voice name' values are configured on VoiceAI Connect using the language and voiceName parameters, respectively. For example, for Italian, the language parameter should be configured to it-IT and the voiceName parameter to it-IT-ElsaNeural.
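A minimal sketch of these settings, using only the Italian values given above:
{
  "language": "it-IT",
  "voiceName": "it-IT-ElsaNeural"
}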
Customized neural voice
If you have defined a customized synthetic voice with Azure's Custom Neural Voice feature, you need to configure VoiceAI Connect using the ttsDeploymentId parameter to identify the associated text-to-speech endpoint.
For VoiceAI Connect Enterprise, this feature is supported only from Version 2.6 and later.
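A minimal sketch with a hypothetical deployment ID (the value shown is a placeholder, not a real ID):
{
  "ttsDeploymentId": "your-custom-voice-deployment-id"
}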
Custom subdomain names for Azure Cognitive Services
Azure Cognitive Services provides a layered security model. This model enables you to secure your Cognitive Services accounts to a specific subset of networks. For more details, click here.
This section describes how to use a custom subdomain and limit network access with VoiceAI Connect.
Creating Custom Domain
1. In Microsoft Azure’s Speech Services > [service name], select Networking (under Resource Management).
2. Click Generate Custom Domain Name.
3. Type in a custom domain name.
4. Click Save.
The custom domain name will also appear in Microsoft Azure’s Cognitive Services > Speech service.
Allowing only Selected Networks
1. In Microsoft Azure’s Cognitive Services > Speech service, select the service name you created.
2. Click Networking (under Resource Management).
3. If not already selected, select Firewalls and virtual networks tab.
4. Select Selected Networks and Private Endpoints.
5. Add virtual networks or external IP addresses.
Setting up Private endpoint connections
1. In Microsoft Azure’s Cognitive Services > Speech service, select the custom domain you created.
2. Click Networking (under Resource Management).
3. Select Private endpoint connections tab and create a private endpoint. For more details, click here.
4. In the providers section, configure azureCustomSubdomain to the domain name (only the name, not the FQDN) and set azureIsPrivateEndpoint to true.
The azureIsPrivateEndpoint parameter was deprecated in Version 3.10.1.
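For example, if the custom domain name created above were my-speech-resource (an illustrative name), the provider configuration sketch would be (azureIsPrivateEndpoint applies only to versions where it has not yet been deprecated):
{
  "azureCustomSubdomain": "my-speech-resource",
  "azureIsPrivateEndpoint": true
}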
Google Cloud Text-to-Speech
Connectivity
To connect to Google Cloud Text-to-Speech service, you need to provide AudioCodes with the following:
- Private key of the Google service account. To create the account key, refer to Google's documentation. From the JSON object representing the key, you need to extract the private key (including the "-----BEGIN PRIVATE KEY-----" prefix) and the service account email.
- Client email
Configuration
The keys are configured on VoiceAI Connect using the privateKey and clientEmail parameters in the providers > credentials section.
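As an illustration, a providers entry for Google might look like the following minimal sketch; the name, the type value google, and the sample credential values are assumptions, and only the privateKey and clientEmail parameters under providers > credentials are taken from this section.
{
  "name": "my_google",
  "type": "google",
  "credentials": {
    "privateKey": "-----BEGIN PRIVATE KEY-----\n...",
    "clientEmail": "my-service-account@my-project.iam.gserviceaccount.com"
  }
}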
Language Definition
To define the language, you need to provide AudioCodes with the following from Google's Supported voices and languages table:
- 'Language code' (language)
- 'Voice name' (gender based)
The 'Language code' and 'Voice name' values are configured on VoiceAI Connect using the language and voiceName parameters, respectively. For example, for English (US), the language parameter should be configured to en-US and the voiceName parameter to en-US-Wavenet-A.
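A minimal sketch of these settings, using only the English (US) values given above:
{
  "language": "en-US",
  "voiceName": "en-US-Wavenet-A"
}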
AWS Amazon Polly
Connectivity
To connect to the Amazon Polly Text-to-Speech service, see Text-to-speech service information for the required details.
Language Definition
To define the language, you need to provide AudioCodes with the following information from Voices in Amazon Polly table:
- 'Language'
- 'Name/ID' (gender based)
- Support (Yes or No) for 'Neural Voice' or 'Standard Voice'
The 'Language' and 'Name/ID' values are configured on VoiceAI Connect using the language and voiceName parameters, respectively. For example, for English (US), the language parameter should be configured to English, US (en-US) and the voiceName parameter to Matthew.
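A minimal sketch of these settings, using only the English (US) values given above:
{
  "language": "English, US (en-US)",
  "voiceName": "Matthew"
}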
The usage of 'Neural Voice' or 'Standard Voice' is configured on VoiceAI Connect using the ttsEnhancedVoice parameter. Refer to the Voices in Amazon Polly table to check whether the specific language voice supports Neural Voice and/or Standard Voice.
The ttsEnhancedVoice parameter (neural voices) is supported from Version 3.0 and later.
Nuance
Connectivity
VoiceAI Connect can connect to the Nuance Vocalizer for Cloud (NVC) speech service using either the WebSocket API or the open-source gRPC (Remote Procedure Calls) API. To connect to Nuance Mix, it must use the gRPC API.
VoiceAI Connect is configured to connect to the specific Nuance API type by setting the type parameter in the providers section to nuance or nuance-grpc.
You need to provide AudioCodes with the URL of your Nuance text-to-speech endpoint instance. This URL (with port number) is configured on VoiceAI Connect using the ttsHost parameter.
Note: Nuance offers a cloud service (Nuance Mix) as well as an option to install an on-premises server. The on-premises server does not use authentication, while the cloud service uses OAuth 2.0 authentication (see below).
VoiceAI Connect supports the Nuance Mix and Nuance Conversational AI services (gRPC) API interfaces. VoiceAI Connect authenticates itself with Nuance Mix (which is located in the public cloud) using OAuth 2.0. To configure OAuth 2.0, use the following providers parameters: oauthTokenUrl, credentials > oauthClientId, and credentials > oauthClientSecret.
Nuance Mix is supported only by VoiceAI Connect Enterprise from Version 2.6 and later.
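As an illustration, a providers entry for Nuance Mix using the gRPC API might look like the following minimal sketch; the name, host, token URL, and credential values are placeholders, not values confirmed in this section.
{
  "name": "my_nuance_mix",
  "type": "nuance-grpc",
  "ttsHost": "tts.example.nuance.com:443",
  "oauthTokenUrl": "https://auth.example.nuance.com/oauth2/token",
  "credentials": {
    "oauthClientId": "client id from Nuance Mix",
    "oauthClientSecret": "client secret from Nuance Mix"
  }
}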
Language Definition
To define the language, you need to provide AudioCodes with the following from Nuance's Vocalizer Language Availability table:
- 'Language Code' (language)
- 'Voice' (gender)
The 'Language Code' and 'Voice' values are configured on VoiceAI Connect using the language and voiceName parameters, respectively. For example, for English (US), the language parameter should be configured to en-US and the voiceName parameter to Kate.
- Nuance Vocalizer for Cloud
To define the language, you need to provide AudioCodes with the language code from Nuance. This value (ISO 639-1 format) is configured on VoiceAI Connect using the language parameter. For example, for English (USA), the parameter should be configured to en-US.
ReadSpeaker
Connectivity
To connect to ReadSpeaker Text-to-Speech service, enter the information provided by your ReadSpeaker account manager upon delivery of the service.
For authentication, enter the entire string of the authorization key as provided by ReadSpeaker. This value is a combination of both your ReadSpeaker account ID and a private key (e.g., "1234.abcdefghijklmnopqrstuvxuz1234567"). This value should be configured in VoiceAI Connect using the credentials > key parameter under the providers section.
To request a new key, contact the ReadSpeaker support team or your ReadSpeaker account manager.
The endpoint value to use in your AudioCodes implementation is provided by the ReadSpeaker team.
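A minimal sketch of a ReadSpeaker providers entry; the name and the type value readspeaker are assumptions (the type value is not stated in this section), and the key reuses the example format shown above.
{
  "name": "my_readspeaker",
  "type": "readspeaker",
  "credentials": {
    "key": "1234.abcdefghijklmnopqrstuvxuz1234567"
  }
}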
Language Definition
To define the language and voice, enter the language and voice values as provided by ReadSpeaker.
For language and voice, the following needs to be defined:
- 'Language code' (language)
- 'Voice'
The 'Language code' and 'Voice' values are configured on VoiceAI Connect using the language and voiceName parameters, respectively. For example, for English (US) using the voice Paul, the language parameter should be configured to English, US (en-US) and the voiceName parameter to Paul.
Yandex
To connect to Yandex, contact AudioCodes for information.
ElevenLabs
For VoiceAI Connect Enterprise, this feature is supported only from Version 3.22 and later.
Connectivity
To connect VoiceAI Connect to ElevenLabs text-to-speech service, you need to provide AudioCodes with the following from ElevenLabs:
- API key: The key can be obtained from the Profile Settings dialog box in the ElevenLabs management interface.
This key must be configured on VoiceAI Connect using the credentials > key parameter under the providers section.
The provider type under the providers section must be configured to elevenlabs.
For example:
{ "name": "my_elevenlabs", "type": "elevenlabs", "credentials": { "key": "api key from elevenlabs" } }
Language Definition
To define the language and voice, enter the language and voice values as provided by ElevenLabs.
- voice-id: Obtain this value from ElevenLabs (refer to their documentation, Fetching the voice-id). VoiceAI Connect's default voice-id is "21m00Tcm4TlvDq8ikWAM".
- model-id: (Optional) Obtain this value from the ElevenLabs models documentation.
The 'voice-id' value is configured on VoiceAI Connect using the voiceName parameter.
The 'model-id' value is configured on VoiceAI Connect using the ttsModel parameter.
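A minimal sketch of these settings, using the default voice-id mentioned above and a placeholder model-id:
{
  "voiceName": "21m00Tcm4TlvDq8ikWAM",
  "ttsModel": "model-id from ElevenLabs"
}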
Advanced Parameters
ElevenLabs advanced parameters can be added in the provider section. For a list of the advanced parameters, refer to the ElevenLabs documentation.
- Parameters listed under 'Query Parameters' in the ElevenLabs documentation should be placed under the query section.
- Parameters listed under 'Body' in the ElevenLabs documentation should be placed under the body section.
Example of advanced configuration:
{ "ttsOverrideConfig": { "query": { "optimize_streaming_latency": 2 }, "body": { "voiceSettings": { "stability": 3 } } } }
Deepgram
Connectivity
To connect VoiceAI Connect with Deepgram's text-to-speech service, you need to provide AudioCodes with the following:
- Deepgram API key: You can obtain an API key by either signing up for an account at Deepgram’s website, or by contacting Deepgram’s sales team.
This key must be configured on VoiceAI Connect using the credentials > key parameter under the providers section.
The provider type under the providers section must be configured to deepgram.
For example:
{ "name": "my_deepgram", "type": "deepgram", "credentials": { "key": "API key from Deepgram" } }
The default URL to Deepgram's API is api.deepgram.com. However, you can override this URL using the ttsHost parameter.
Voice Definition
Deepgram offers various text-to-speech voices, as listed here under the Values column (for example, "aura-asteria-en" for an English US female voice). The voice is configured on VoiceAI Connect using the voiceName parameter.
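For example, to use the voice mentioned above (a minimal sketch):
{
  "voiceName": "aura-asteria-en"
}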
Connecting Deepgram using AudioCodes Live Hub
If you want to connect to Deepgram's speech services using AudioCodes Live Hub:
1. Sign into the Live Hub portal.
2. From the Navigation Menu pane, click Speech Services.
3. Click the Add new speech service button, and then do the following:
- In the 'Speech service name' field, type a name for the speech service.
- Select only the Text to Speech check box.
- Select the Generic provider option.
- Click Next.
4. In the 'Authentication Key' field, enter the token supplied by Deepgram.
5. In the 'Text to Speech (TTS) URL' field, enter the URL supplied by Deepgram.
6. Click Create.
Almagu
Connectivity
To connect to Almagu, contact AudioCodes for information.
Language Definition
To define the language, you need to provide AudioCodes with the following from the Almagu documentation:
- 'Voice'
The 'Voice' value is configured on VoiceAI Connect using the language parameter.