AudioCodes Voice Gateway: Automatically Recognizing the Language Spoken

This article explains how to activate automatic language recognition with AudioCodes Voice Gateway and Cognigy.

 


AudioCodes Version Information

The functionality described on this page is only available in version 3.4 or later of Voice Gateway / VoiceAI Connect Enterprise (VAIC).

Description

Let's imagine you want to deploy Cognigy voice automation across multiple countries with a single virtual agent, or the region where you want to deploy your virtual voice agent has multiple official languages that all need to be supported. 

With Cognigy and AudioCodes Voice Gateway, it is possible to recognize the language the user is speaking and to switch the STT (Speech-to-Text) and TTS (Text-to-Speech) services automatically. 

Setup

To use this functionality, you need to be using Microsoft Azure as your TTS and STT provider. 

Set Session Parameters

Go into your flow and click the "+" where you want to add the language recognition functionality, then choose Extension - VG - Set Session Parameters to add the Set Session Parameters node: 

Cognigy-AudioCodes_Set_Session_Parameters.png

Image 1: Set Session Parameters

Open the node, scroll down to the Advanced section, and expand it to see the Additional Session Parameters JSON field. 

 

Cognigy-AudioCodes_Set_Session_Parameters_Advanced_Seetings.png

Image 2: Where to Find Advanced Settings

 

Cognigy-AudioCodes_Set_Session_Parameters_Advanced_Settings_open.png

Image 3: JSON field for Advanced Settings

Define Languages

You can use the Additional Session Parameters field to add additional AudioCodes settings not available in any of the other fields of the Set Session Parameters node. 

In this case, we will add language recognition for Speech-to-Text (Microsoft), as described in the AudioCodes speech customization documentation.

As an example, let's say our main language is English, but we also want to recognize German, French, or Japanese. We can then add the following JSON to the Additional Session Parameters field:

{
    "languageDetectionActivate": true, // Activate language detection
    "languageDetectionAutoSwitch": true, // Tell AudioCodes to switch language automatically
    "alternativeLanguages": [
        {
          "language": "de-DE", // Tell voice agent to listen for a specific language
          "voiceName": "de-DE-KatjaNeural" // Tell voice agent which voice to use if this language is recognized (languageDetectionAutoSwitch must be set to true)
        },
        {
          "language": "fr-FR",
          "voiceName": "fr-FR-DeniseNeural"
        },
        {
          "language": "ja-JP",
          "voiceName": "ja-JP-KeitaNeural"
        }
    ]
}

So in this example, if the user speaks German after the Set Session Parameters node, the agent switches to the de-DE-KatjaNeural voice.
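If you only want to detect the spoken language without switching voices automatically, a minimal variant of the JSON above sets languageDetectionAutoSwitch to false (assuming voiceName can then be omitted, since it only takes effect when auto-switching is enabled):

```json
{
    "languageDetectionActivate": true,
    "languageDetectionAutoSwitch": false,
    "alternativeLanguages": [
        { "language": "de-DE" },
        { "language": "fr-FR" },
        { "language": "ja-JP" }
    ]
}
```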

A list of the voices provided by Microsoft can be found in the Azure Text-to-Speech documentation. Keep in mind that some voices are only available in certain Azure regions; please talk to your AudioCodes engineer for more details. 

Reactivating Language Detection

After a language has been set, language recognition is deactivated to make sure the language does not accidentally change a second time. If for some reason you need to reactivate language recognition, add another Set Session Parameters node with the same settings.

mceclip0.png

Image 4: Additional Set Session Parameters node to reactivate language recognition
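The reactivating node's Additional Session Parameters field can simply repeat the same JSON shown earlier, for example (shortened here to a single alternative language):

```json
{
    "languageDetectionActivate": true,
    "languageDetectionAutoSwitch": true,
    "alternativeLanguages": [
        { "language": "de-DE", "voiceName": "de-DE-KatjaNeural" }
    ]
}
```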

Data Received from AudioCodes

Even if you do not want to switch the TTS and STT automatically, you can still use the language recognition settings. In both cases, we receive data from AudioCodes telling us which language was spoken. 

This can be found in the user input data object in the following field and will return the language in the Azure format (en-US, de-DE etc.):

input.data.request.parameters.recognitionOutput.PrimaryLanguage.Language

You can then use this data, for example, to switch the locale of the agent or set the language for automatic translation, if so desired. 
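As an illustration, the field above can be read defensively before branching on the result. The snippet below is a minimal standalone sketch: the input object is mocked to mirror the structure described above (in a real Cognigy Code Node, input is provided by the platform), and getDetectedLanguage is a hypothetical helper name, not a Cognigy API:

```javascript
// Mocked "input" object mirroring the AudioCodes recognition data
// described above; in a real Cognigy Code Node, "input" is provided
// by the platform.
const input = {
  data: {
    request: {
      parameters: {
        recognitionOutput: {
          PrimaryLanguage: { Language: "de-DE" }
        }
      }
    }
  }
};

// Hypothetical helper: read the detected language, returning null
// if no recognition data is present (e.g. before any user utterance).
function getDetectedLanguage(input) {
  return (
    input?.data?.request?.parameters?.recognitionOutput
      ?.PrimaryLanguage?.Language ?? null
  );
}

const detected = getDetectedLanguage(input);
console.log(detected); // "de-DE" for the mocked input above
```

You could then branch on the returned value to switch the agent's locale or set the language for automatic translation.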

Conversational Design Considerations

As a last note, the most important thing to consider here is the length of the user input: if the sentence the user says is too short, the language cannot be recognized reliably. Design the conversation so that the user answers with a longer sentence. A short greeting like "Hallo", for example, will not be recognized easily in an agent where both Dutch and German need to be recognized. 

Video Demo

