azure .net-core naudio azure-cognitive-services

Play audio obtained as byte[] from Azure Speech translation

I am following the samples for the Microsoft Cognitive Services Speech SDK, specifically the Speech Translation sample.

The sample for .NET Core uses the microphone as audio input and translates what you speak. The translated results are also available as synthesized speech. I would like to play this audio but could not find the appropriate code for that.

I tried using NAudio as suggested in this answer, but I get garbled audio. I guess there is more to the format of the audio.

Any pointers?

Solution

On .NET Core, many audio packages might not work. For example, with NAudio I can't play sound on my Mac.

I got it working using the NetCoreAudio package (NuGet), with the following implementation in the translation recognizer's Synthesizing event handler:

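// Requires: using System; using System.IO; using NetCoreAudio;
recognizer.Synthesizing += (s, e) =>
{
    // An empty array marks the end of the synthesis data.
    var audio = e.Result.GetAudio();
    Console.WriteLine(audio.Length != 0
        ? $"AudioSize: {audio.Length}"
        : $"AudioSize: {audio.Length} (end of synthesis data)");

    if (audio.Length > 0)
    {
        // Save the bytes as a timestamped .wav file in the working directory.
        var fileName = Path.Combine(Directory.GetCurrentDirectory(), $"{DateTime.Now:yyyy-MM-dd_HH-mm-ss}.wav");
        File.WriteAllBytes(fileName, audio);

        // Block until the task returned by Play completes.
        var player = new Player();
        player.Play(fileName).Wait();
    }
};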
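
Since writing the bytes straight to a .wav file plays correctly, the buffer presumably already starts with a RIFF/WAVE header. If you want to feed the audio to another library instead, it may help to check that header first, because treating header bytes as samples or assuming the wrong sample rate is a common cause of garbled playback. Below is a minimal sketch of such a check. WavInspector, WavFormat and TryReadFormat are hypothetical names of my own, not part of the Speech SDK or NetCoreAudio, and the code only assumes the standard RIFF chunk layout.

```csharp
using System;
using System.Text;

// Hypothetical helper (not part of the Speech SDK or NetCoreAudio).
public static class WavInspector
{
    public struct WavFormat
    {
        public int AudioFormat;    // 1 means uncompressed PCM
        public int Channels;
        public int SampleRate;     // samples per second
        public int BitsPerSample;
    }

    // Returns true and fills 'format' when 'audio' starts with a RIFF/WAVE
    // header containing a "fmt " chunk; returns false otherwise.
    public static bool TryReadFormat(byte[] audio, out WavFormat format)
    {
        format = default(WavFormat);
        if (audio == null || audio.Length < 12) return false;
        if (Encoding.ASCII.GetString(audio, 0, 4) != "RIFF") return false;
        if (Encoding.ASCII.GetString(audio, 8, 4) != "WAVE") return false;

        // Walk the chunk list: 4-byte id, 4-byte little-endian size, payload.
        int pos = 12;
        while (pos + 8 <= audio.Length)
        {
            string id = Encoding.ASCII.GetString(audio, pos, 4);
            uint size = ReadUInt32(audio, pos + 4);

            if (id == "fmt " && size >= 16 && pos + 24 <= audio.Length)
            {
                format.AudioFormat = ReadUInt16(audio, pos + 8);
                format.Channels = ReadUInt16(audio, pos + 10);
                format.SampleRate = (int)ReadUInt32(audio, pos + 12);
                format.BitsPerSample = ReadUInt16(audio, pos + 22);
                return true;
            }

            // Chunk payloads are padded to an even number of bytes.
            long next = (long)pos + 8 + size + (size & 1);
            if (next > audio.Length) break;
            pos = (int)next;
        }
        return false;
    }

    // Little-endian readers that do not depend on the machine's byte order.
    private static int ReadUInt16(byte[] b, int o)
    {
        return b[o] | (b[o + 1] << 8);
    }

    private static uint ReadUInt32(byte[] b, int o)
    {
        return (uint)(b[o] | (b[o + 1] << 8) | (b[o + 2] << 16) | (b[o + 3] << 24));
    }
}
```

Inside the Synthesizing handler you could call WavInspector.TryReadFormat(audio, out var fmt) and log fmt.SampleRate, fmt.Channels and fmt.BitsPerSample. If the call returns false, the buffer is probably headerless PCM, and the player would need to be told the format explicitly.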