Tags: c#, xamarin, speech-recognition, audio-recording, bing-speech

Perform real-time continuous speech recognition using Xamarin and the Microsoft Speech Service API


I saw in the Bing Speech API documentation that it is possible to stream live microphone input to the REST service (https://learn.microsoft.com/en-us/azure/cognitive-services/speech/home):

Real-time continuous recognition. The speech recognition API enables users to transcribe audio into text in real time, and supports to receive the intermediate results of the words that have been recognized so far.

However, I was not able to find a sample showing how this could be achieved in a cross-platform fashion using Xamarin Forms.

I have found the following tutorial: https://developer.xamarin.com/guides/xamarin-forms/cloud-services/cognitive-services/speech-recognition/

But in that tutorial, the audio sent to the API comes from an existing audio file. What I would like to achieve is to stream the microphone input of the device running the app (Android, iOS, UWP).

Any insight would be appreciated.


Solution

  • I am afraid that there are no Xamarin-compatible libraries that support real-time recognition with the Microsoft Speech API. The only compatible option is the Bing Speech REST API, which does not offer real-time transcription.

    Real-time transcription requires the Speech Service WebSocket protocol, which is fully documented. You could implement this protocol yourself, but doing so reliably may be quite a complex task.
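
    To give a feel for what implementing the protocol yourself involves, here is a minimal sketch of just the audio framing step. It assumes, as I recall from the protocol documentation, that each binary message starts with a 2-byte big-endian header length, followed by HTTP-style headers (`Path`, `X-RequestId`, `X-Timestamp`, `Content-Type`) and then the raw audio bytes. The class and method names are made up for illustration, and the exact header set should be checked against the documentation before relying on it.

    ```csharp
    using System;
    using System.Text;

    // Hypothetical helper, not part of any Microsoft SDK.
    public static class SpeechAudioFrame
    {
        // Builds one binary WebSocket message:
        // [2-byte big-endian header length][headers][audio bytes].
        public static byte[] Build(string requestId, DateTime utcNow, byte[] audioChunk)
        {
            // Header names assumed from the Speech Service WebSocket protocol docs.
            string headers =
                "Path: audio\r\n" +
                "X-RequestId: " + requestId + "\r\n" +
                "X-Timestamp: " + utcNow.ToString("o") + "\r\n" +
                "Content-Type: audio/x-wav\r\n";

            byte[] headerBytes = Encoding.ASCII.GetBytes(headers);
            if (headerBytes.Length > ushort.MaxValue)
                throw new ArgumentException("Header section is too large.");

            byte[] message = new byte[2 + headerBytes.Length + audioChunk.Length];

            // Length prefix, most significant byte first.
            message[0] = (byte)(headerBytes.Length >> 8);
            message[1] = (byte)(headerBytes.Length & 0xFF);

            Buffer.BlockCopy(headerBytes, 0, message, 2, headerBytes.Length);
            Buffer.BlockCopy(audioChunk, 0, message, 2 + headerBytes.Length, audioChunk.Length);
            return message;
        }
    }
    ```

    Each frame would then be sent with `ClientWebSocket.SendAsync(new ArraySegment<byte>(frame), WebSocketMessageType.Binary, true, cancellationToken)`. This covers only the framing; connection setup, authentication, the configuration message, turn handling and parsing of the service responses are where most of the complexity lies.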

    There are, however, native libraries for iOS and Android that do support real-time streaming. See the tutorial for iOS and the tutorial for Android.

    What you could do then is use Xamarin Binding Libraries to bind the native libraries into your Xamarin project. See the Java binding tutorial for the Android library and the Objective-C binding tutorial for the iOS library.

    Creating the Objective-C binding in particular can be a daunting task. It is usually easier to write a small Objective-C library that acts as a facade over the native library. Because you control the facade's interface, you can keep it small and create the binding for it much more easily. You may also consider asking the Xamarin team to create the binding for you, as they maintain a growing collection of third-party library bindings on GitHub.
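
The same facade idea carries over to the shared C# code: define one small interface there and let each platform project implement it on top of its bound native library. The sketch below is only an illustration under that assumption. `IContinuousSpeechRecognizer` and `FakeSpeechRecognizer` are hypothetical names, not types from any Microsoft or Xamarin library, and the fake exists only so that shared code can be exercised without a device.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical cross-platform contract for the shared project.
public interface IContinuousSpeechRecognizer
{
    // Raised with the words recognized so far (intermediate result).
    event EventHandler<string> PartialResultReceived;

    // Raised once a phrase has been finalized.
    event EventHandler<string> FinalResultReceived;

    Task StartAsync(string language);
    Task StopAsync();
}

// In-memory stand-in; a real implementation would wrap the bound
// native iOS or Android library inside the platform project.
public sealed class FakeSpeechRecognizer : IContinuousSpeechRecognizer
{
    public event EventHandler<string> PartialResultReceived;
    public event EventHandler<string> FinalResultReceived;

    public bool IsRunning { get; private set; }

    public Task StartAsync(string language)
    {
        IsRunning = true;
        return Task.CompletedTask;
    }

    public Task StopAsync()
    {
        IsRunning = false;
        return Task.CompletedTask;
    }

    // Test hooks that mimic callbacks coming from the native library.
    public void SimulatePartial(string text) => PartialResultReceived?.Invoke(this, text);
    public void SimulateFinal(string text) => FinalResultReceived?.Invoke(this, text);
}
```

In a Xamarin.Forms app, each platform implementation could be registered with the `[assembly: Dependency(typeof(...))]` attribute and resolved in shared code through `DependencyService.Get<IContinuousSpeechRecognizer>()`, so pages subscribe to the two events without knowing which native library sits underneath.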