c# azure speech-to-text

Azure speech to text model runs but returns wrong output


I'm trying to run a speech to text model using Azure and C# in Visual Studio Code following these instructions: https://learn.microsoft.com/en-us/learn/modules/transcribe-speech-input-text/5-exercise-convert-speech-from-audio-file?pivots=csharp

CognitiveServices is installed. The audio file path is correctly labeled. I have entered the correct subscription key and service region (edited here for privacy). The project file paths are correct in the associated json file. My code is:

    using System;
    using System.Threading.Tasks;
    using Microsoft.CognitiveServices.Speech;
    using Microsoft.CognitiveServices.Speech.Audio;

    namespace SpeechToTextCsharp
    {
        class Program
        {
            static async Task Main(string[] args)
            {
                await RecognizeSpeechAsync();
            }

            static async Task RecognizeSpeechAsync()
            {
                // Configure the subscription information for the service to access.
                // Use either key1 or key2 from the Speech Service resource you have created.
                var config = SpeechConfig.FromSubscription("YourSubscriptionKey", "YourServiceRegion");

                // Set up the audio configuration, in this case using a file in local storage.
                using (var audioInput = AudioConfig.FromWavFileInput("../narration.wav"))

                // Pass the required parameters to the Speech Service: the configuration information
                // and the audio file that you will use as input.
                using (var recognizer = new SpeechRecognizer(config, audioInput))
                {
                    Console.WriteLine("Recognizing first result...");
                    var result = await recognizer.RecognizeOnceAsync();

                    switch (result.Reason)
                    {
                        case ResultReason.RecognizedSpeech:
                            // The file contained speech that was recognized; output the transcription
                            // to the terminal window.
                            Console.WriteLine($"We recognized: {result.Text}");
                            break;
                        case ResultReason.NoMatch:
                            // No recognizable speech found in the audio file that was supplied.
                            // Output an informative message.
                            Console.WriteLine($"NOMATCH: Speech could not be recognized.");
                            break;
                        case ResultReason.Canceled:
                            // Operation was cancelled.
                            // Output the reason.
                            var cancellation = CancellationDetails.FromResult(result);
                            Console.WriteLine($"CANCELED: Reason={cancellation.Reason}");

                            if (cancellation.Reason == CancellationReason.Error)
                            {
                                Console.WriteLine($"CANCELED: ErrorCode={cancellation.ErrorCode}");
                                Console.WriteLine($"CANCELED: ErrorDetails={cancellation.ErrorDetails}");
                                Console.WriteLine($"CANCELED: Did you update the subscription info?");
                            }
                            break;
                    }
                }
            }
        }
    }

It runs with no errors, but the output returned is "Hello World!". The narration.wav file does not include the words hello world. Since there are no errors returned this is difficult to diagnose and internet searches haven't returned anything helpful.

How do I transcribe the audio file?


Solution

  • You may have misplaced the narration.wav file, or the program may be reading a different audio file that contains "Hello world".

    Reconfirm where narration.wav is stored. Your code uses ../narration.wav, and a relative path is resolved against the directory the program runs from, so the file must be in the parent of that directory.

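    If you put the wav file in the project folder, you can use ../../../narration.wav to test. This assumes the program runs from the build output folder (for example bin/Debug/<target framework>), which is three levels below the project folder.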
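
To see which file the relative path actually points to, a small diagnostic along these lines may help. It is only a sketch: the class name PathCheck is made up for illustration, and the path is the one from the question. It prints the current working directory, the absolute path that ../narration.wav resolves to, and whether a file exists at that location.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration; not part of the Speech SDK or the tutorial.
class PathCheck
{
    static void Main()
    {
        // Relative path taken from the question's code.
        const string relativePath = "../narration.wav";

        // Relative paths are resolved against the current working directory,
        // which is not necessarily the folder that contains Program.cs.
        string workingDirectory = Directory.GetCurrentDirectory();
        string fullPath = Path.GetFullPath(relativePath);

        Console.WriteLine($"Working directory: {workingDirectory}");
        Console.WriteLine($"Resolved path:     {fullPath}");
        Console.WriteLine($"File exists:       {File.Exists(fullPath)}");
    }
}
```

Run it the same way you run the speech project (same folder, same command). If the resolved path is not where narration.wav lives, either move the file or adjust the relative path passed to AudioConfig.FromWavFileInput.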