azure, speech-recognition, speech-to-text

Azure speech-to-text ignores numbers


I'm using Azure Speech to Text to find the timestamps of utterances in a WAV file.

The problem I'm encountering is that if the user has recorded numbers, for instance "I'm going to count to three. One, two, three, here I come", the numbers are omitted from the output. This happens both for English and for other languages. I can understand utterances like 'eh' and 'ah' being omitted, but numbers? Why is that the default?

I'm using:

  • speechConfig.OutputFormat = OutputFormat.Detailed;
  • the default language model.

Can I somehow configure the SpeechRecognizer differently so it also outputs numbers?


Solution

  • I found the reason my results did not include numbers: it was in my own code. In my postprocessing I was stripping punctuation marks from the result, and in doing so I was also accidentally stripping the numbers.
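
The original postprocessing code is not shown here, so the following is only a sketch of one way to strip punctuation without touching digits. It assumes the cleanup works on the recognized display text, where spoken numbers may appear as digits. `TranscriptCleaner` and `StripPunctuation` are hypothetical names chosen for illustration, not part of the Speech SDK. Matching the Unicode punctuation category (`\p{P}`) instead of "everything that is not a letter" leaves digits alone; the apostrophe is excluded so contractions like "I'm" survive.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TranscriptCleaner
{
    // Hypothetical helper, not part of the Azure Speech SDK.
    // Matches Unicode punctuation except the apostrophe, using .NET
    // character class subtraction. Digits are not in \p{P}, so they are kept.
    private static readonly Regex Punctuation =
        new Regex(@"[\p{P}-[']]", RegexOptions.Compiled);

    public static string StripPunctuation(string text)
    {
        return Punctuation.Replace(text, string.Empty);
    }
}

public static class Program
{
    public static void Main()
    {
        // Assumed sample of recognized text; real output may differ.
        string display = "I'm going to count to 3. 1, 2, 3, here I come.";

        // Prints: I'm going to count to 3 1 2 3 here I come
        Console.WriteLine(TranscriptCleaner.StripPunctuation(display));
    }
}
```

By contrast, a pattern such as `[^a-zA-Z ]` removes digits along with the punctuation. That is one way this kind of bug can creep in.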