This article explains how to activate language recognition with Cognigy VoiceGateway.

Description

Imagine you want to use a single virtual agent to deploy Cognigy voice automation across multiple countries, or you want to deploy your virtual voice agent in a place with several official languages that all need to be supported.

With Cognigy, it is possible to recognize the language the user is speaking and to switch the STT (Speech-To-Text) and TTS (Text-To-Speech) language automatically.

This can be done with Cognigy's native VoiceGateway.

Setup Language Provider

To use this functionality, you need to use either Microsoft Azure or Google as your TTS and STT provider. Our example uses Microsoft, but the basic principle is the same.

To configure the speech service, you need to open your Voice Gateway UI/Self Service Portal and select Speech on the left-hand side.

Click Add Speech Service and select Microsoft Azure Speech Services as the vendor.
Choose the account you want to use it with and select both TTS and STT (although in this case only STT is relevant).
Continue by selecting Use Hosted Azure Service, choosing a region and entering the API key. A self-hosted/Docker Azure instance is also possible, but this example uses the public cloud version.
Save your changes by clicking Save.
Once finished, add this service to the application you wish to use it in (this can also be done directly in the Set Session Parameters Node).

Overview of the speech settings

Adding to the Flow

Set Session Config

Go into your flow and click on the "+" where you want to add the language recognition functionality.

Tip!

We suggest placing it after a part of the flow where the user has to say a longer phrase, as this helps with language recognition.

Navigate to Extension - Voice Gateway - Set Session Config to add the Set Session Config Node.

Next, open the node and click on the Recognizer (STT) drop-down menu.

Scroll down until you see the Recognize Language checkbox and click it to activate it. Additional fields will appear below the checkbox, where you can define which languages you wish to detect. In this example, it is set to look for German, Japanese or French as alternative languages to English.

The detected language is stored at:

ci.data.payload.speech.language_code

The format is usually the ISO 639 language code, followed by a hyphen ("-") and the ISO 3166 country code.

For example:

"de-DE" = German (Germany)

"fr-FR" = French (France)

"ja-JP" = Japanese (Japan)

"en-US" = English (United States)
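To make the format concrete, here is a minimal TypeScript sketch that splits such a code into its two parts. The names LanguageTag and parseLanguageTag are hypothetical and not part of any Cognigy API, and the sketch only handles the simple "language-COUNTRY" shape shown above.

```typescript
// Hypothetical helper, not a Cognigy API: splits a tag such as "de-DE".
interface LanguageTag {
  language: string; // ISO 639 language code, e.g. "de"
  country?: string; // ISO 3166 country code, e.g. "DE" (absent if the tag has none)
}

function parseLanguageTag(tag: string): LanguageTag | null {
  const parts = tag.trim().split("-");
  const language = parts[0];
  const country = parts[1];

  // Only "xx" or "xx-YY" style tags are accepted; longer tags return null.
  if (parts.length > 2 || language === undefined || !/^[A-Za-z]{2,3}$/.test(language)) {
    return null;
  }
  if (country === undefined) {
    return { language: language.toLowerCase() };
  }
  if (!/^[A-Za-z]{2}$/.test(country)) {
    return null;
  }
  return { language: language.toLowerCase(), country: country.toUpperCase() };
}
```

Under these assumptions, parseLanguageTag("de-DE") yields the language "de" and the country "DE", while a value that does not match the expected shape yields null.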

Flow Structure and Switching Languages

After setting up the Set Session Config Node, you can configure a simple Lookup Node to switch to the proper language when it is detected. To switch the language, you will have to set up another Set Session Config Node in which you define the detected language as the main language.

Warning!

Don't forget to deactivate the language detection in the new Set Session Config Node, otherwise language detection will continue throughout the call.

This can cause latency in the bot and may also lead to unexpected behavior.
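The lookup-and-switch logic described in this section can be sketched in TypeScript as follows. This is only an illustration of the decision the Lookup Node makes, not the actual node configuration: the RecognizerSettings shape and the function name are hypothetical, and the languages are the ones from the example above, with English as the fallback.

```typescript
// Hypothetical shape of what the second Set Session Config Node applies.
interface RecognizerSettings {
  language: string; // main STT language from now on
  recognizeLanguage: boolean; // whether language detection stays active
}

// Mirrors a Lookup Node: one case per detected language plus a default.
function settingsForDetectedLanguage(detected: string | undefined): RecognizerSettings {
  switch (detected) {
    case "de-DE":
    case "fr-FR":
    case "ja-JP":
      // Switch to the detected language and turn detection off again.
      return { language: detected, recognizeLanguage: false };
    default:
      // Nothing detected or an unsupported language: stay with English.
      return { language: "en-US", recognizeLanguage: false };
  }
}
```

Every branch sets recognizeLanguage to false, which reflects the warning above: once the language has been switched, detection should no longer be active.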

Redetecting Languages

If you wish to redetect the language, you will need to add another Set Session Config Node that reactivates language detection.

Tip!

In countries with more than one national language, the user might interact with the bot in one language while an individual answer is in another (for example, a native English speaker in Quebec trying to pronounce French street names). In that case, you can also set this up as an Activity Parameter for that one specific question.

Alternative Methods

Even if you do not want to switch the TTS and STT automatically, you can still use the language recognition settings to identify the language being spoken.

You can then use this data, for example, to switch the locale of the agent or to set the language for automatic translation.
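As a rough TypeScript sketch of this idea, the snippet below reads the detected code from an object shaped like the path shown earlier (data.payload.speech.language_code) and derives a language for translation from it. The SpeechInput interface and both function names are hypothetical simplifications, and the fallback language is an assumption you would choose yourself.

```typescript
// Hypothetical, simplified shape of the object that holds the detected language.
interface SpeechInput {
  data?: { payload?: { speech?: { language_code?: string } } };
}

// Returns the detected code, or undefined if any part of the path is missing.
function detectedLanguage(input: SpeechInput): string | undefined {
  return input.data?.payload?.speech?.language_code;
}

// Derives the bare language part (e.g. "ja" from "ja-JP") for translation.
function translationLanguage(input: SpeechInput, fallback: string): string {
  const base = detectedLanguage(input)?.split("-")[0];
  return base !== undefined && base !== "" ? base.toLowerCase() : fallback;
}
```

The fallback keeps the flow predictable when no language was detected, for example because the user input was too short.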

Conversational Design Considerations

As a last note, the most important thing to consider is the length of the user input. If the sentence the user says is too short, the language cannot be recognized reliably. To get this right, design the conversation so that the user says a longer sentence at the point where detection happens. A simple greeting like "Hallo", for example, will not be recognized easily in an agent that needs to distinguish Dutch from German.
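One possible way to act on this in a flow is to trust the detected language only when the utterance was long enough, and to ask a follow-up question otherwise. The TypeScript sketch below illustrates such a check. It is an assumption on our side rather than a built-in feature, and the threshold of four words is a hypothetical value that you would need to tune.

```typescript
// Hypothetical threshold: utterances with fewer words are treated as too short.
const MIN_WORDS_FOR_DETECTION = 4;

// Counts whitespace-separated words and compares them against the threshold.
function isLongEnoughForDetection(
  transcript: string,
  minWords: number = MIN_WORDS_FOR_DETECTION
): boolean {
  const words = transcript
    .trim()
    .split(/\s+/)
    .filter((word) => word.length > 0);
  return words.length >= minWords;
}
```

With this check, a one-word greeting such as "Hallo" would not be used to switch languages, while a full sentence would.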
