Tags: speech-recognition, speech-to-text, azure-cognitive-services

In Azure Cognitive Services Speech-to-Text, PhraseListGrammar truncates the utterance as soon as a match is found


The documentation on using phrase lists to improve speech-to-text in JavaScript uses "move to ward" as an example: adding it to a PhraseListGrammar teaches the service to recognise that phrase rather than "move toward".
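For context, here is a minimal sketch of how a phrase list is typically attached to a recognizer with the JavaScript SDK. The helper name `addPhrases` is hypothetical, and the SDK module is passed in as a parameter only so the helper can be tried without credentials. The commented usage assumes a valid key, region and WAV file, all of which are placeholders.

```javascript
// Sketch only: `addPhrases` is a hypothetical helper, not part of the SDK.
// `sdk` is the imported microsoft-cognitiveservices-speech-sdk module.
function addPhrases(sdk, recognizer, phrases) {
  // PhraseListGrammar.fromRecognizer returns the phrase list bound to this recognizer.
  const phraseList = sdk.PhraseListGrammar.fromRecognizer(recognizer);
  for (const phrase of phrases) {
    phraseList.addPhrase(phrase);
  }
  return phraseList;
}

// Typical use in Node.js (key, region and file name are placeholders):
// const fs = require("fs");
// const sdk = require("microsoft-cognitiveservices-speech-sdk");
// const speechConfig = sdk.SpeechConfig.fromSubscription("<key>", "<region>");
// const audioConfig = sdk.AudioConfig.fromWavFileInput(fs.readFileSync("utterance.wav"));
// const recognizer = new sdk.SpeechRecognizer(speechConfig, audioConfig);
// addPhrases(sdk, recognizer, ["move to ward"]);
// recognizer.recognizeOnceAsync(
//   (result) => console.log(result.text),
//   (error) => console.error(error)
// );
```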

This works well for the example on its own. However, there appear to be two problems:

  1. Recognition terminates after finding the phrase at the beginning of an utterance. For example, "Move to ward number ten" is recognised as just "Move to ward".
  2. The improved recognition does not appear to work when the phrase is not at the beginning of an utterance. For example, "I want to move to ward number ten" is recognised as "I want to move toward number ten".

I have found these issues in both the C# and JavaScript SDKs, v1.08 and v1.12.1.

Reproduction:

  • Using an empty PhraseListGrammar list:
    • move to ward is recognised as move toward
    • move to ward number ten is recognised as move toward number ten
    • i want to move to ward number ten is recognised as i want to move toward number ten
  • With move to ward in the PhraseListGrammar list:
    • move to ward is recognised as move to ward (correct)
    • move to ward number ten is recognised as move to ward (truncated)
    • i want to move to ward number ten is recognised as i want to move toward number ten (no effect)
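The three outcomes in the list above can be checked mechanically. The following is a rough sketch, not SDK code: `normalize` and `classifyOutcome` are hypothetical helpers that compare what was spoken with what was recognised and label the result as correct, truncated, or no effect.

```javascript
// Hypothetical helpers for labelling recognition results; not part of the Speech SDK.

// Lowercase, drop punctuation and collapse whitespace so that
// "Move to ward." and "move to ward" compare equal.
function normalize(text) {
  return text.toLowerCase().replace(/[^\w\s]/g, "").replace(/\s+/g, " ").trim();
}

function classifyOutcome(spoken, recognised, phrase) {
  const want = normalize(spoken);
  const got = normalize(recognised);
  if (got === want) return "correct";
  // The result is a strict prefix of the utterance: trailing words were dropped.
  if (got.length > 0 && want.startsWith(got + " ")) return "truncated";
  // The phrase from the phrase list does not appear in the result at all.
  if (!got.includes(normalize(phrase))) return "no effect";
  return "other";
}

const phrase = "move to ward";
const cases = [
  ["move to ward", "move to ward"],
  ["move to ward number ten", "move to ward"],
  ["i want to move to ward number ten", "i want to move toward number ten"],
];
for (const [spoken, recognised] of cases) {
  console.log(`${spoken} -> ${classifyOutcome(spoken, recognised, phrase)}`);
}
// Prints:
// move to ward -> correct
// move to ward number ten -> truncated
// i want to move to ward number ten -> no effect
```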

Is this by design or is it a bug?

This is the output from a program I wrote to illustrate the effects described above: [image]


Solution

  • Thanks for reaching out. This is the expected behavior. With the current version, words that do not match entries in the phrase list are ignored. Also, when the beginning of the utterance matches an entry in the phrase list, words at the end that don't match are ignored. The product team is aware of this limitation and is working on addressing it in a newer version that will be rolled out soon.