I'm using Azure Speech to Text to find the timestamps of utterances in a WAV file.
The problem I'm encountering is that if the user has recorded numbers, for instance "I'm going to count to three. One, two, three, here I come", the numbers are omitted from the output. This happens for both English and other languages. I can understand utterances like 'eh' and 'ah' being omitted, but numbers? Why is that the default?
I'm using:
Can I somehow configure the SpeechRecognizer differently so it also outputs numbers?
I found the reason my results did not include numbers: it was in my own code. In my postprocessing I was trying to strip punctuation marks from the result, and in doing so I was accidentally stripping the numbers as well.
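For anyone hitting the same thing, here is a minimal sketch of how this kind of bug can happen. It is not my actual code: the function names and the sample string are made up for illustration, and it assumes the recognizer returned the numbers as digits. A pattern that keeps only letters and whitespace silently drops digits, while a pattern built on `\w` removes punctuation but keeps them.

```python
import re


def strip_punctuation_buggy(text: str) -> str:
    # Hypothetical buggy version: keeps only ASCII letters and whitespace,
    # so digits are removed along with the punctuation.
    return re.sub(r"[^A-Za-z\s]", "", text)


def strip_punctuation(text: str) -> str:
    # \w matches letters, digits and underscore (Unicode-aware for str
    # patterns), so only punctuation and other symbols are removed.
    return re.sub(r"[^\w\s]", "", text)


# Made-up recognizer output, used only for illustration.
sample = "I'm going to count to 3. 1, 2, 3, here I come!"

print(strip_punctuation_buggy(sample))  # digits are gone
print(strip_punctuation(sample))        # digits are kept
```

Note that `\w` also keeps underscores and non-English letters, which is usually what you want when the same postprocessing runs on several languages.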