Audiocodes: Advanced Configuration

After connecting Audiocodes to your phone number/SIP trunk and building a first simple Flow, you might want to take the next step and configure advanced settings to create even better user experiences. Some of these advanced settings are the topic of this article.

Overview

Audiocodes comes with a large number of configuration settings which can be controlled directly from within your Cognigy Flow. These settings can be applied individually to two scopes:

sessionParams - the settings apply to the whole session from the time of applying them
activityParams - the settings apply only to the current activity (e.g. sendMessage)

In order to set these settings, you can use the built-in Audiocodes Extension.

Setting Session Parameters

Session parameters can comfortably be set with the "Set Session Parameters" Node. When executed, the settings will apply for the remainder of the session.

Setting Activity Parameters

Activity parameters can be set per activity. For example, if they are set on "Send Message", they affect only the execution of that activity. A typical use is setting "Barge In" to true only for a long message, which allows the user to interrupt the voicebot during this message, but not afterwards.
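The relationship between the two scopes can be sketched in a few lines of Python. This is only an illustrative model, not the Audiocodes implementation: the function name effective_params is hypothetical, "bargeIn" is an assumed identifier for the "Barge In" setting, and the rule that activity parameters override session parameters for a single activity is an assumption based on the description above.

```python
from typing import Optional


def effective_params(session_params: dict, activity_params: Optional[dict] = None) -> dict:
    """Return the settings in force for one activity (hypothetical helper).

    Assumption: activity parameters override session parameters, but only
    for the activity they are attached to.
    """
    merged = dict(session_params)          # copy, so the session scope is untouched
    merged.update(activity_params or {})   # activity scope wins for this activity
    return merged


session = {"bargeIn": False}

long_message = effective_params(session, {"bargeIn": True})  # barge-in for this message only
next_message = effective_params(session)                     # back to the session setting

print(long_message["bargeIn"], next_message["bargeIn"])
```

The session dictionary is never modified, which mirrors the idea that an activity-level setting does not outlive its activity.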

Parameter Details

STT Settings

These settings apply to the Speech-to-Text engine (e.g. Azure Speech Services).

Parameter

Type

Description

Language Code

Text

Defines the language (e.g., "en-ZA" for South African English) of the voicebot conversation and is used for TTS and STT functionality. The value is obtained from the service provider.

■ STT:

✔ Azure: The parameter is configured with the value from the 'Locale' column in Azure's Speech-Text table (e.g., "en-GB").

✔ Google: The parameter is configured with the value from the 'languageCode' (BCP-47) column in Google's Cloud Speech-to-Text table (e.g., "nl-NL").

■ TTS:

✔ Azure: The parameter is configured with the value from the 'Locale' column in Azure's Text-to-Speech table (e.g., "it-IT").

✔ Google: The parameter is configured with the value from the 'Language code' column in Google's Cloud Text-to-Speech table (e.g., "en-US").

✔ AWS: The parameter is configured with the value from the 'Language' column in Amazon's Polly TTS table (e.g., "de-DE").

Disable STT Punctuation

Toggle

Prevents the STT response from Audiocodes from including punctuation marks.

■ on: Enabled. Punctuation is excluded.

■ off: (Default) Disabled. Punctuation is included.

Note: This requires support from the STT engine.

TTS Settings

These settings apply to the Text-to-Speech engine (e.g. Azure Speech Services).

Parameter

Type

Description

Voice Name

Text

Defines the voice name for the TTS service.

■ Azure: The parameter is configured with the value from the 'Short voice name' column in Azure's Text-to-Speech table (e.g., "it-IT-ElsaNeural").

■ Google: The parameter is configured with the value from the 'Voice name' column in Google's Cloud Text-to-Speech table (e.g., "en-US-Wavenet-A").

■ AWS: The parameter is configured with the value from the 'Name/ID' column in Amazon's Polly TTS table (e.g., "Hans").

■ Almagu: The parameter is configured with the value from the 'Voice' column in Almagu's TTS table (e.g., "Osnat").

Disable TTS Cache

Toggle

Controls the caching of TTS (audio) results from the Flow. With caching, if Audiocodes needs to request TTS from a TTS provider for a text that has been requested before, it retrieves the result from its cache instead of requesting it again from the TTS provider.

■ on: Enabled

■ off: (Default) Disabled

DTMF

These settings apply to the DTMF (dual-tone multi-frequency) features of Audiocodes.

Parameter

Type

Description

Send DTMF

Toggle

Enables the sending of DTMF events to the Flow.

■ on: Enabled

■ off: (Default) Disabled

Note: For configuring the DTMF collection and sending method, see the dtmfCollect parameter.

DTMF Collect

Toggle

Defines the DTMF digit collection and sending method.

■ on: Enabled. Audiocodes first collects all the DTMF digits entered by the user, and only then sends them all together to the Flow.

■ off: (Default) Disabled. As Audiocodes receives a DTMF digit entered by the user, it sends that single digit to the Flow. In other words, it sends each DTMF digit one at a time to the Flow.

Note:

■ When enabled, you can configure additional settings using the following parameters: dtmfCollectInterDigitTimeoutMS, dtmfCollectMaxDigits, and dtmfCollectSubmitDigit.

■ If the sendDTMF parameter is configured to off (default), incoming DTMF digits are ignored by Audiocodes even if the dtmfCollect parameter is configured to on.

DTMF Collect Timeout

Number

Defines the timeout (in milliseconds) that Audiocodes waits for the user to press another digit before it sends all the digits to the Flow. If the timeout expires after the last digit entered by the user, Audiocodes sends all the collected digits to the Flow (as a DTMF message), without waiting for the maximum number of expected digits or for the "submit" digit. The timeout is triggered after the user enters the first DTMF digit and is reset after each digit.

The valid value range is 0 to unlimited. The default is 2000.

DTMF Collect Max Digits

Number

Defines the maximum number of DTMF digits that Audiocodes expects to receive from the user. Once Audiocodes receives and collects this number of digits entered by the user, it immediately sends all the digits to the Flow (as a DTMF message), without waiting for the timeout to expire or for the "submit" digit.

The valid value range is 0 (disabled) to unlimited. The default is 5. If configured to 0, the DTMF collection and sending method is according to dtmfCollectInterDigitTimeoutMS or dtmfCollectSubmitDigit.

DTMF Collect Submit Digit

Text

Defines a special DTMF "submit" digit. When Audiocodes receives this digit from the user, it immediately sends all the collected digits to the Flow (as a DTMF message), without waiting for the timeout to expire or for the maximum number of expected digits.

The valid value is any symbol on a phone keypad. The default is # (pound key). If you want to disable this parameter, configure it to "" (empty string).
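The interplay of the three collection triggers (inter-digit timeout, maximum digits, and submit digit) can be sketched as a small simulation. This is a simplified model under stated assumptions, not the Audiocodes implementation: the class DtmfCollector and its method are hypothetical, key presses are given as (timestamp in ms, digit) pairs, and it is assumed that the submit digit itself is not part of the message sent to the Flow and that a timeout of 0 receives no special treatment.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class DtmfCollector:
    """Toy model of DTMF collection (hypothetical; defaults taken from the text above)."""
    inter_digit_timeout_ms: int = 2000  # dtmfCollectInterDigitTimeoutMS
    max_digits: int = 5                 # dtmfCollectMaxDigits (0 disables the limit)
    submit_digit: str = "#"             # dtmfCollectSubmitDigit ("" disables it)

    def collect(self, presses: List[Tuple[int, str]]) -> List[str]:
        """Return the DTMF messages that would be sent to the Flow."""
        messages: List[str] = []
        buffer = ""
        last_ts = None
        for ts, digit in presses:
            # Trigger 1: the inter-digit timeout expired before this key press.
            if buffer and ts - last_ts >= self.inter_digit_timeout_ms:
                messages.append(buffer)
                buffer = ""
            # Trigger 2: the "submit" digit (assumed not to be part of the message).
            if self.submit_digit and digit == self.submit_digit:
                if buffer:
                    messages.append(buffer)
                    buffer = ""
                continue
            buffer += digit
            last_ts = ts
            # Trigger 3: the maximum number of digits was reached.
            if self.max_digits and len(buffer) >= self.max_digits:
                messages.append(buffer)
                buffer = ""
        if buffer:
            # No further key press arrives, so the timeout eventually flushes the rest.
            messages.append(buffer)
        return messages


collector = DtmfCollector()
print(collector.collect([(0, "1"), (500, "2"), (900, "#")]))
print(collector.collect([(0, "1"), (3000, "2")]))
```

In the first call the digits are sent together when the submit digit arrives; in the second call the 3-second pause exceeds the default timeout, so each digit is sent as its own message.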

Barge In

Barge In stands for the ability of the user to interrupt the voicebot by speaking during a running prompt.

Parameter

Type

Description

Barge In

Toggle

Enables the Barge-In feature.

■ on: Enabled. When the voicebot is playing a response to the user (playback of Flow message), the user can "barge-in" (interrupt) and start speaking. This terminates the voicebot response, allowing the voicebot to listen to the new speech input from the user (i.e., Audiocodes sends the detected utterance to the Flow).

■ off: (Default) Disabled. Audiocodes doesn't expect speech input from the user until the voicebot has finished playing its response to the user. In other words, the user can't "barge-in" until the voicebot message response has finished playing.

Barge In on DTMF

Toggle

Enables the Barge-In on DTMF feature.

■ on: (Default) Enabled. When the voicebot is playing a response to the user (playback of Flow message), the user can "barge-in" (interrupt) with a DTMF digit. This terminates the voicebot response, allowing the voicebot to listen to and process the digits sent from the user.

■ off: Disabled. Audiocodes doesn't expect DTMF input from the user until the voicebot has finished playing its response to the user. In other words, the user can't "barge-in" until the voicebot message response has finished playing.

Note: If you enable this feature (i.e., bargeInOnDTMF configured to true), you also need to enable the sending of DTMF digits (see the sendDTMF parameter).

Barge In Minimum Words

Number

Defines the minimum number of words that the user must say for Audiocodes to consider it a barge-in. For example, if configured to 4 and the user only says 3 words during the bot's playback response, no barge-in occurs.

The valid range is 1 to 5. The default is 1.
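The minimum-words rule can be illustrated with a short function. This is a sketch under assumptions: the function and argument names are hypothetical, and counting words by splitting on whitespace is a simplification of whatever the STT engine actually reports.

```python
def is_barge_in(utterance: str, barge_in: bool = False, min_words: int = 1) -> bool:
    """Decide whether speech during playback counts as a barge-in (hypothetical helper)."""
    if not barge_in:
        # With "Barge In" off, speech during playback never interrupts the prompt.
        return False
    # Simplification: words are counted by splitting on whitespace.
    return len(utterance.split()) >= min_words


# Example from the text: a minimum of 4 words, but the user says only 3.
print(is_barge_in("stop right there", barge_in=True, min_words=4))
print(is_barge_in("please stop right there", barge_in=True, min_words=4))
```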

Continuous ASR

These settings relate to the Continuous ASR feature of the Voice Gateway.

Parameter

Type

Description

Enable Continuous ASR

Toggle

Enables the Continuous ASR feature. Continuous ASR enables Audiocodes to concatenate multiple STT recognitions of the user and then send them as a single textual message to the bot.

■ on: Enabled

■ off: (Default) Disabled

Continuous ASR Digits

Text

This parameter is applicable when the Continuous ASR feature is enabled.

Defines a special DTMF key, which if pressed, causes Audiocodes to immediately send the accumulated recognitions of the user to the Flow. For example, if configured to "#" and the user presses the pound key (#) on the phone's keypad, the device concatenates the accumulated recognitions and then sends them as one single textual message to the Flow.

The default is "#".

Note: Using this feature incurs an additional delay from the user’s perspective because the speech is not sent immediately to the Flow after it has been recognized. To overcome this delay, configure the parameter to a value that is appropriate to your environment.

Continuous ASR Timeout

Number

This parameter is applicable when the Continuous ASR feature is enabled.

Defines the automatic speech recognition (ASR) timeout (in milliseconds). When the device detects silence from the user for a duration configured by this parameter, it concatenates all the accumulated STT recognitions and sends them as one single textual message to the Flow.

The valid value is 2,500 (i.e., 2.5 seconds) to 60,000 (i.e., 1 minute). The default is 3,000.
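How Continuous ASR groups recognitions can be sketched as follows. This is a toy model, not the Audiocodes implementation: the function name and the event tuples are hypothetical, silence is approximated as the gap between event timestamps, and joining recognitions with a single space is an assumption.

```python
from typing import List, Tuple


def continuous_asr(events: List[Tuple[int, str, str]],
                   timeout_ms: int = 3000,
                   submit_key: str = "#") -> List[str]:
    """Group STT recognitions into the messages sent to the Flow.

    events: (timestamp_ms, kind, value) tuples, where kind is "speech" or "dtmf".
    """
    messages: List[str] = []
    parts: List[str] = []
    last_ts = 0
    for ts, kind, value in events:
        # Silence longer than the Continuous ASR timeout flushes what was accumulated.
        if parts and ts - last_ts >= timeout_ms:
            messages.append(" ".join(parts))
            parts = []
        if kind == "dtmf":
            # The configured key sends the accumulated recognitions immediately.
            if value == submit_key and parts:
                messages.append(" ".join(parts))
                parts = []
            continue
        parts.append(value)
        last_ts = ts
    if parts:
        # After the last recognition, the silence timeout eventually fires.
        messages.append(" ".join(parts))
    return messages


print(continuous_asr([
    (0, "speech", "my number is"),
    (1500, "speech", "one two three"),
    (2000, "dtmf", "#"),
    (2500, "speech", "thanks"),
    (9000, "speech", "bye"),
]))
```

The first two recognitions are sent as one message when the pound key is pressed, while "thanks" and "bye" are separated by a pause longer than the timeout and therefore arrive as two messages.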

User Timeouts

Parameter Type Description

No User Input Timeout (ms)

Number

Defines the maximum time (in milliseconds) that Audiocodes waits for input from the user.

If no input is received when this timeout expires, you can configure Audiocodes to play a textual (see the "No User Input Prompt" parameter) or audio (see the "No User Input URL" parameter) prompt to ask the user to say something. If there is still no input from the user, you can configure Audiocodes to prompt the user again. The number of times to prompt is configured by the "No User Input Retries" parameter.

If the "Send No User Input Event" parameter is configured to "on" and the timeout expires, Audiocodes sends an event to Cognigy.AI, indicating how many times the timer has expired.

The default is 0 (i.e., feature disabled).

Note:

■ DTMF (any input) is considered as user input (in addition to user speech) if the "Send DTMF" parameter is configured to "on".

■ If you have configured a prompt to play when the timeout expires, the timer is triggered only after playing the prompt to the user.

No User Input Retries

Number

Defines the maximum number of allowed timeouts (configured by the "No User Input Timeout" parameter) for no user input. If you have configured a prompt to play (see the "No User Input Prompt" or "No User Input URL" parameter), the prompt is played each time the timeout expires.

The default is 0 (i.e., only one timeout).

For more information on the no user input feature, see the "No User Input Timeout" parameter.

Note: If you have configured a prompt to play upon timeout expiry, the timer is triggered only after playing the prompt to the user.

Send No User Input Event

Toggle

Enables Audiocodes to send an event message to the Flow if there is no user input for the duration configured by the "No User Input Timeout" parameter. The 'value' field of the event indicates how many times the timer has expired:

{"type": "event", "name": "noUserInput", "value": 1}

■ on: Enabled.

■ off: (Default) Disabled.

No User Input Prompt

Text

Defines the textual prompt to play to the user if no input has been received from the user by the time the timeout expires (configured by "No User Input Timeout").

The prompt can be configured in plain text or in Speech Synthesis Markup Language (SSML) format. By default, the parameter is not configured.

■ Plain-text example:

{"name": "LondonTube", "provider": "my_azure", "displayName": "London Tube", "userNoInputTimeoutMS": 5000, "userNoInputSpeech": "Hi there. Please say something"}

■ SSML example:

{"name": "LondonTube", "provider": "my_azure", "displayName": "London Tube", "userNoInputTimeoutMS": 5000, "userNoInputSpeech": "<speak>This is <say-as interpret-as=\"characters\">SSML</say-as></speak>"}

For more information on the no user input feature, see the "No User Input Timeout".

Note:

■ If you have also configured to play an audio prompt (see the "No User Input URL" parameter), the "No User Input Prompt" takes precedence.

■ The supported SSML elements depend on the text-to-speech provider:

✔ Google: https://cloud.google.com/text-to-speech/docs/ssml

✔ Azure: https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/speech-synthesis-markup#supported-ssml-elements

✔ AWS: https://docs.aws.amazon.com/polly/latest/dg/supportedtags.html

No User Input URL

Text

Defines the URL from which the audio prompt is played to the user if no input has been received from the user by the time the timeout expires (configured by "No User Input Timeout").

By default, the parameter is not configured.

For more information on the no user input feature, see the "No User Input Timeout".

Note: If you have also configured to play a textual prompt (see the "No User Input Prompt" parameter), the "No User Input Prompt" takes precedence.
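Two rules from this section, the precedence of the textual prompt over the audio URL and the counting of no-input events, can be sketched in Python. The helper names are hypothetical, and the reading that N retries lead to at most N + 1 timeouts is an assumption derived from "0 (i.e., only one timeout)". What Audiocodes does after the last timeout is not covered here.

```python
from typing import List, Optional


def choose_prompt(prompt_text: Optional[str], prompt_url: Optional[str]) -> Optional[dict]:
    """Pick the prompt to play on timeout; the textual prompt takes precedence."""
    if prompt_text:
        return {"kind": "text", "value": prompt_text}
    if prompt_url:
        return {"kind": "audio", "value": prompt_url}
    return None  # nothing configured, so nothing is played


def no_input_events(retries: int) -> List[dict]:
    """Events sent if the user stays silent and "Send No User Input Event" is on.

    Assumption: N retries allow N + 1 timeouts in total.
    """
    return [
        {"type": "event", "name": "noUserInput", "value": count}
        for count in range(1, retries + 2)
    ]


# The URL below is a placeholder, not a real audio file.
print(choose_prompt("Hi there. Please say something", "https://example.com/prompt.wav"))
print(no_input_events(retries=2))
```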

Bot Timeouts

Parameter Type Description

No Flow Input Timeout (ms)

Number

Defines the maximum time (in milliseconds) that Audiocodes waits for input from the Flow.

If no input is received from the Flow when this timeout expires, you can configure Audiocodes to play a textual (see the "No Flow Input Prompt" parameter) or audio (see the "No Flow Input URL" parameter) prompt to the user.

The default is 0 (i.e., feature disabled).

No Flow Input Retries

Number

Defines the maximum number of allowed timeouts (configured by the "No Flow Input Timeout" parameter) for Flow input. If you have configured a textual prompt (see the "No Flow Input Prompt" parameter) or an audio prompt (see the "No Flow Input URL" parameter), the prompt is played to the user each time the timeout expires.

The default is 0 (i.e., only one timeout – no retries).

For more information on the no flow input feature, see the "No Flow Input Timeout" parameter.

Note: If you have configured a prompt to play upon timeout expiry, the timer is triggered only after playing the prompt to the user.

No Flow Input Prompt

Text

Defines the textual prompt to play to the user if no input has been received from the Flow by the time the timeout expires (configured by "No Flow Input Timeout").

The prompt can be configured in plain text or in Speech Synthesis Markup Language (SSML) format:

■ Plain-text example:

{"name": "LondonTube", "provider": "my_azure", "displayName": "London Tube", "botNoInputTimeoutMS": 5000, "botNoInputSpeech": "Please wait for bot input"}

■ SSML example:

{"name": "LondonTube", "provider": "my_azure", "displayName": "London Tube", "botNoInputTimeoutMS": 5000, "botNoInputSpeech": "<speak>This is <say-as interpret-as=\"characters\">SSML</say-as></speak>"}

By default, the parameter is not configured.

Note:

■ For more information on the no flow input feature, see the "No Flow Input Timeout" parameter.

■ If you have also configured an audio prompt (see the "No Flow Input URL" parameter), the "No Flow Input Prompt" takes precedence.

■ This feature requires a text-to-speech provider. It will not work when the speech is synthesized by the flow framework.

■ The supported SSML elements depend on the text-to-speech provider:

✔ Google: https://cloud.google.com/text-to-speech/docs/ssml

✔ Azure: https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/speech-synthesis-markup#supported-ssml-elements

✔ AWS: https://docs.aws.amazon.com/polly/latest/dg/supportedtags.html

No Flow Input URL

Text

Defines the URL from which the audio prompt is played to the user if no input has been received from the Flow by the time the timeout expires (configured by "No Flow Input Timeout").

By default, the parameter is not configured.

For more information on the no Flow input feature, see the "No Flow Input Timeout".

Note: If you have also configured to play a textual prompt (see the "No Flow Input Prompt" parameter), the "No Flow Input Prompt" takes precedence.
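Because SSML contains double quotes, an SSML prompt has to be escaped when it is placed inside a JSON string. A minimal sketch, assuming the payload is produced with Python's standard json module, which handles the escaping automatically. The field names are taken from the examples in this article; the variable names are illustrative only.

```python
import json

# SSML prompt containing double quotes that must be escaped inside JSON.
ssml = '<speak>This is <say-as interpret-as="characters">SSML</say-as></speak>'

config = {
    "name": "LondonTube",
    "provider": "my_azure",
    "displayName": "London Tube",
    "botNoInputTimeoutMS": 5000,
    "botNoInputSpeech": ssml,
}

# json.dumps escapes the inner quotes, so the result is valid JSON.
payload = json.dumps(config, indent=2)
print(payload)
```

Parsing the payload again with json.loads returns the original SSML string unchanged, which is a quick way to check that a hand-written payload is well-formed.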

Azure Configuration

Parameter

Type

Description

Azure STT Mode

Select

Defines the Azure STT recognition mode.

■ conversation (default)

■ dictation

■ interactive

Note: The parameter is applicable only to the Microsoft Azure STT service.

Azure STT Context ID

Text

■ Azure speech-to-text engine: This parameter controls Azure's Custom Speech model. The parameter can be set to the endpoint ID that is used when accessing the STT engine. For more information on how to obtain the endpoint ID, go to https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/how-to-custom-speech-deploy-model.

Note:

■ The parameter can be used by the Flow, as long as the STT engine is Azure or AudioCodes DNN.

■ For Azure STT, the Custom Speech model must be deployed on the same subscription used for the Azure STT engine.

■ When using other STT engines, the parameter has no effect.

Azure TTS Deployment ID

Text

Defines the customized synthetic voice model for Azure's text-to-speech Custom Neural Voice feature. Once you have deployed your custom text-to-speech endpoint on Azure, you can integrate it with Audiocodes using this parameter.

For more information, see Azure's documentation on the Custom Neural Voice feature.

By default, this parameter is undefined.

Note: This parameter is applicable only to Azure.

Enable Audio Logging

Toggle

Enables recording and logging of audio from the user (endpoint) that Audiocodes sends to the STT engine. The recording is done by the STT engine and stored on the STT engine.

■ on: Instructs the STT engine to enable audio logging.

■ off: Instructs the STT engine to disable audio logging.

When the parameter is not defined (default), audio logging follows the STT engine's own setting.

Note: The parameter and audio logging are applicable only when using Azure STT.

Google Configuration

Parameter Type Description

Google Interaction Types

Select

Defines the Google STT interaction type. For more information, see https://cloud.google.com/speech-to-text/docs/reference/rest/v1p1beta1/RecognitionConfig#InteractionType.

Google Cloud STT Context Phrases

Text Array

When using Google's Cloud STT engine, this parameter controls Speech Context phrases.

The parameter can list phrases or words that are passed to the STT engine as "hints" for improving the accuracy of speech recognition.

For more information on speech context (speech adaptation) as well as details regarding tokens (class tokens) that can be used in phrases, go to https://cloud.google.com/speech-to-text/docs/speech-adaptation.

For example, if speakers frequently say "weather", you want the STT engine to transcribe it as "weather" and not "whether". To do this, the parameter can be used to create a context for this word (and other similar phrases associated with weather):

"sttContextPhrases": ["weather"]

Note:

■ The parameter can be used when the STT engine is Google.

■ When using other STT engines, the parameter has no effect.

Google Cloud STT Context Boost

Number

Defines the boost number for context recognition of the speech context phrase configured by sttContextPhrases. Speech-adaptation boost allows you to increase the recognition model bias by assigning more weight to some phrases than others. For example, when users say "weather" or "whether", you may want the STT to recognize the word as weather.

For more information, see https://cloud.google.com/speech-to-text/docs/context-strength.

Note:

■ The parameter can be used when the STT engine is Google.

■ When using other STT engines, the parameter has no effect.

Advanced

Additional Session Parameters

JSON

Provides a JSON payload field to configure any of the additional or existing session parameters as per the Voice AI Connect session control documentation.
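As an illustration, a payload for this field can be assembled and sanity-checked before it is pasted into the node. This is a sketch under assumptions: the parameter names are the ones mentioned in this article, the helper check_dtmf_dependencies is hypothetical, and whether the node expects exactly this flat JSON object should be verified against the Voice AI Connect documentation.

```python
import json
from typing import List


def check_dtmf_dependencies(params: dict) -> List[str]:
    """Warn about DTMF settings that need sendDTMF to be on (hypothetical helper)."""
    warnings: List[str] = []
    send_dtmf = params.get("sendDTMF", False)  # sendDTMF is off by default
    for dependent in ("dtmfCollect", "bargeInOnDTMF"):
        # Only explicitly configured values are checked here, not defaults.
        if params.get(dependent) is True and not send_dtmf:
            warnings.append(f"{dependent} is on, but sendDTMF is off")
    return warnings


additional_session_params = {
    "sendDTMF": True,
    "dtmfCollect": True,
    "dtmfCollectInterDigitTimeoutMS": 3000,
    "dtmfCollectMaxDigits": 8,
    "dtmfCollectSubmitDigit": "#",
    "sttContextPhrases": ["weather"],
}

print(check_dtmf_dependencies(additional_session_params))
print(json.dumps(additional_session_params, indent=2))
```

The check encodes the two notes from the DTMF and Barge In sections: collected digits and DTMF barge-in both depend on the sending of DTMF being enabled.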
