c# google-cloud-speech

Google Speech to Text files bigger than 10MB


I am trying to use Google Speech to Text for long files (~100MB).

But even when I use the code (adapted) from https://cloud.google.com/speech-to-text/docs/async-recognize, I get the following exception:

Status(StatusCode=InvalidArgument, Detail="Request payload size exceeds the limit: 10485760 bytes.")

This is my code so far:

string convertedFile = WavUtils.WavUtils.EncodeToWav(filename);
Dictionary<string, long> wavData = WavUtils.WavUtils.GetWAVData(convertedFile);

var speech = SpeechClient.Create();
var longOperation = speech.LongRunningRecognize(
  new RecognitionConfig()
  {
    Encoding = RecognitionConfig.Types.AudioEncoding.Linear16,
    SampleRateHertz = (int)wavData["sampleRateHz"],
    LanguageCode = LanguageCodes.English.UnitedStates
  },
  RecognitionAudio.FromFile(convertedFile));
longOperation = longOperation.PollUntilCompleted();

var response = longOperation.Result;
foreach (var result in response.Results)
{
  foreach (var alternative in result.Alternatives)
  {
    Console.WriteLine(alternative.Transcript);
  }
}

Is the maximum file size really 10MB even with LongRunningRecognize?

The original file is actually an MP3 from a recorded webcast. But from what I have read, Google Speech to Text does not support MP3 as input. That is why I am converting it to WAV.

Any help would be welcomed.


Solution

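  • You need to upload your audio file to Google Cloud Storage first, then pass its gs:// URI to LongRunningRecognize instead of the local file. The 10 MB limit applies to audio sent inline in the request, which is what RecognitionAudio.FromFile does; audio referenced from Cloud Storage is not subject to that payload limit. https://cloud.google.com/speech-to-text/docs/async-recognize#speech-async-recognize-gcs-csharp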
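
As a rough sketch of that approach, the question's code could be changed along these lines. It assumes the Google.Cloud.Storage.V1 package is installed alongside Google.Cloud.Speech.V1, that a bucket already exists, and that default credentials can write to it. The class name GcsTranscriber, the helper BuildGcsUri, and the bucket name in the usage note are made up for illustration; only the client calls come from the Google libraries.

```csharp
using System;
using System.IO;
using Google.Cloud.Speech.V1;
using Google.Cloud.Storage.V1;

// Hypothetical helper class, not part of any Google library.
public static class GcsTranscriber
{
    // Builds the gs:// URI the Speech API expects for an object in a bucket.
    public static string BuildGcsUri(string bucket, string objectName) =>
        $"gs://{bucket}/{objectName}";

    // Uploads a local LINEAR16 WAV file to Cloud Storage, then transcribes it
    // by reference so the audio bytes are never sent inline in the request.
    public static void Transcribe(string localWavPath, string bucket, int sampleRateHz)
    {
        string objectName = Path.GetFileName(localWavPath);

        // Step 1: upload the audio to the (assumed existing) bucket.
        var storage = StorageClient.Create();
        using (var stream = File.OpenRead(localWavPath))
        {
            storage.UploadObject(bucket, objectName, "audio/wav", stream);
        }

        // Step 2: start the long-running recognition against the gs:// URI.
        var speech = SpeechClient.Create();
        var longOperation = speech.LongRunningRecognize(
            new RecognitionConfig
            {
                Encoding = RecognitionConfig.Types.AudioEncoding.Linear16,
                SampleRateHertz = sampleRateHz,
                LanguageCode = "en-US"
            },
            RecognitionAudio.FromStorageUri(BuildGcsUri(bucket, objectName)));

        // Step 3: block until the operation finishes, then print transcripts.
        longOperation = longOperation.PollUntilCompleted();
        foreach (var result in longOperation.Result.Results)
        {
            foreach (var alternative in result.Alternatives)
            {
                Console.WriteLine(alternative.Transcript);
            }
        }
    }
}
```

With the variables from the question, a call might look like GcsTranscriber.Transcribe(convertedFile, "my-audio-bucket", (int)wavData["sampleRateHz"]), where "my-audio-bucket" is a placeholder for your own bucket.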