I am trying to use Google Speech-to-Text for long audio files (~100 MB).
But even when I use the code (adapted) from https://cloud.google.com/speech-to-text/docs/async-recognize,
I get the following exception:
Status(StatusCode=InvalidArgument, Detail="Request payload size exceeds the limit: 10485760 bytes.")
This is my code so far:
    string convertedFile = WavUtils.WavUtils.EncodeToWav(filename);
    Dictionary<string, long> wavData = WavUtils.WavUtils.GetWAVData(convertedFile);
    var speech = SpeechClient.Create();
    var longOperation = speech.LongRunningRecognize(
        new RecognitionConfig()
        {
            Encoding = RecognitionConfig.Types.AudioEncoding.Linear16,
            SampleRateHertz = (int)wavData["sampleRateHz"],
            LanguageCode = LanguageCodes.English.UnitedStates
        },
        RecognitionAudio.FromFile(convertedFile));
    longOperation = longOperation.PollUntilCompleted();
    var response = longOperation.Result;
    foreach (var result in response.Results)
    {
        foreach (var alternative in result.Alternatives)
        {
            Console.WriteLine(alternative.Transcript);
        }
    }
Is the maximum file size really 10 MB, even with LongRunningRecognize?
The original file is actually an MP3 from a recorded webcast. But from what I have read, Google Speech-to-Text does not support MP3 as input, which is why I am converting it to WAV.
Any help would be welcomed.
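You need to upload your audio file to Google Cloud Storage first and pass its gs:// URI to the request instead of the file contents. The 10485760-byte limit in the error applies to the request payload, so it is hit whenever the audio is sent inline, as RecognitionAudio.FromFile does. See https://cloud.google.com/speech-to-text/docs/async-recognize#speech-async-recognize-gcs-csharp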
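A minimal sketch of that approach, adapted from the code in the question. It assumes the Google.Cloud.Storage.V1 package is installed alongside Google.Cloud.Speech.V1, that default credentials have write access to the bucket, and that the bucket already exists. The class name `LongAudioTranscriber`, the `Transcribe` method, and the `bucketName` parameter are hypothetical names used only for illustration.

```csharp
using System;
using System.IO;
using Google.Cloud.Speech.V1;
using Google.Cloud.Storage.V1;

public static class LongAudioTranscriber
{
    // wavPath: local LINEAR16 WAV file (e.g. the output of EncodeToWav in the question).
    // bucketName: an existing Cloud Storage bucket (assumed, not from the question).
    public static void Transcribe(string wavPath, int sampleRateHz, string bucketName)
    {
        string objectName = Path.GetFileName(wavPath);

        // Upload the audio so the request carries only a URI, not the audio bytes.
        var storage = StorageClient.Create();
        using (var stream = File.OpenRead(wavPath))
        {
            storage.UploadObject(bucketName, objectName, "audio/wav", stream);
        }

        var speech = SpeechClient.Create();
        var operation = speech.LongRunningRecognize(
            new RecognitionConfig
            {
                Encoding = RecognitionConfig.Types.AudioEncoding.Linear16,
                SampleRateHertz = sampleRateHz,
                LanguageCode = LanguageCodes.English.UnitedStates
            },
            // Reference the uploaded object instead of RecognitionAudio.FromFile.
            RecognitionAudio.FromStorageUri($"gs://{bucketName}/{objectName}"));

        // Block until the long-running operation finishes, then print transcripts.
        operation = operation.PollUntilCompleted();
        foreach (var result in operation.Result.Results)
        {
            foreach (var alternative in result.Alternatives)
            {
                Console.WriteLine(alternative.Transcript);
            }
        }
    }
}
```

With the question's variables, the call would look something like `LongAudioTranscriber.Transcribe(convertedFile, (int)wavData["sampleRateHz"], "my-audio-bucket")`, where the bucket name is a placeholder. You may also want to delete the uploaded object once the transcript has been retrieved.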