
How to prepare audio files (WAV or MP3) for the Google speech recognition API in C#?


    String jsonRequest = "{\"config\": {\"languageCode\":\"en-US\"},\"audio\": {\"content\": \"" + base64Content + "\"}}";
    String str = "";
    var speech = SpeechClient.Create();
    var response = speech.Recognize(RecognizeRequest.Parser.ParseJson(jsonRequest));
    foreach (var result in response.Results)
    {
        foreach (var alternative in result.Alternatives)
        {
            Console.WriteLine(alternative.Transcript);
            str += alternative.Transcript;
        }
    }

This code works fine with mono .wav files, but it throws an exception for stereo files. The exception says:

Status(StatusCode=InvalidArgument, Detail="Must use single channel (mono) audio, but WAV header indicates 2 channels.")

So my question is: how can I add support for stereo files? How do I convert multi-channel audio to a single channel in C#? I have already tried this answer, so please don't refer me to it; it does not work.


Solution

  • You should take a look at SoX, which can convert nearly any audio format into another, including sample rate conversion and, relevant in your case, channel conversion. Its documentation contains many examples of how to use it.

    In your case I would advise against extracting just one channel from the audio, because the desired audio might be present only on the other channel, the one that was not selected.

    If you want full control over the audio, you could dive into bass.dll in conjunction with bass.net.dll, which allows you to mix channels together or select a specific channel.
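
For simple cases, the "mix the channels instead of picking one" advice can also be followed in plain C#, without an external tool. The following is only a rough sketch, not something taken from SoX or BASS: it assumes the input is an uncompressed 16-bit PCM WAV file and averages all channels of each frame into one sample. The names `WavDownmix` and `ToMono` are made up for illustration.

```csharp
using System;
using System.IO;
using System.Text;

public static class WavDownmix
{
    // Sketch only: handles uncompressed 16-bit PCM WAV and throws on anything else.
    public static byte[] ToMono(byte[] wav)
    {
        using var reader = new BinaryReader(new MemoryStream(wav));
        if (Encoding.ASCII.GetString(reader.ReadBytes(4)) != "RIFF")
            throw new InvalidDataException("Not a RIFF file.");
        reader.ReadInt32(); // overall RIFF size, recomputed for the output
        if (Encoding.ASCII.GetString(reader.ReadBytes(4)) != "WAVE")
            throw new InvalidDataException("Not a WAVE file.");

        short channels = 0;
        int sampleRate = 0;
        byte[] data = null;

        // Walk the chunk list; only "fmt " and "data" matter here.
        while (reader.BaseStream.Position + 8 <= reader.BaseStream.Length)
        {
            string id = Encoding.ASCII.GetString(reader.ReadBytes(4));
            int size = reader.ReadInt32();
            long next = reader.BaseStream.Position + size + (size & 1); // chunks are word-aligned
            if (id == "fmt ")
            {
                short format = reader.ReadInt16();
                channels = reader.ReadInt16();
                sampleRate = reader.ReadInt32();
                reader.ReadInt32(); // byte rate
                reader.ReadInt16(); // block align
                short bitsPerSample = reader.ReadInt16();
                if (format != 1 || bitsPerSample != 16)
                    throw new NotSupportedException("This sketch only handles 16-bit PCM.");
            }
            else if (id == "data")
            {
                data = reader.ReadBytes(size);
            }
            reader.BaseStream.Position = next;
        }
        if (channels < 1 || data == null)
            throw new InvalidDataException("Missing fmt or data chunk.");

        int frames = data.Length / (2 * channels);
        int dataSize = frames * 2; // one 16-bit sample per frame

        using var output = new MemoryStream();
        using var writer = new BinaryWriter(output);
        writer.Write(Encoding.ASCII.GetBytes("RIFF"));
        writer.Write(36 + dataSize);
        writer.Write(Encoding.ASCII.GetBytes("WAVE"));
        writer.Write(Encoding.ASCII.GetBytes("fmt "));
        writer.Write(16);             // fmt chunk size
        writer.Write((short)1);       // PCM
        writer.Write((short)1);       // one channel
        writer.Write(sampleRate);
        writer.Write(sampleRate * 2); // byte rate
        writer.Write((short)2);       // block align
        writer.Write((short)16);      // bits per sample
        writer.Write(Encoding.ASCII.GetBytes("data"));
        writer.Write(dataSize);

        for (int f = 0; f < frames; f++)
        {
            int sum = 0;
            for (int c = 0; c < channels; c++)
            {
                int i = (f * channels + c) * 2;
                sum += (short)(data[i] | (data[i + 1] << 8)); // little-endian sample
            }
            writer.Write((short)(sum / channels)); // average keeps the result in range
        }
        writer.Flush();
        return output.ToArray();
    }
}
```

With the code from the question, this could be used as `base64Content = Convert.ToBase64String(WavDownmix.ToMono(File.ReadAllBytes("stereo.wav")));`, where `stereo.wav` is a placeholder path. For MP3 input or other sample formats a real converter is still needed; with SoX, a command along the lines of `sox stereo.wav -c 1 mono.wav` should perform the same downmix.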