I have implemented the Google Cloud Speech API in a C# console application. Now I want to do the same from an HTML page. Below are the steps I have followed:
// Client: record audio in the browser and POST it to the server.
mediaRecorder.ondataavailable = function (e) {
    chunks.push(e.data);
    var blob = new Blob(chunks, { 'type': 'audio/wav; codecs=0' });
    var fd = new FormData();
    fd.append('fname', 'test.wav');
    //fd.append('data', chunks[0]);
    fd.append('data', blob);
    $.ajax({
        type: 'POST',
        url: APIUrl,
        data: fd,
        processData: false,
        contentType: false
    }).done(function (data) {
        console.log(data);
    });
};

// Server: transcribe the uploaded audio with the Cloud Speech API.
string text = "";
var speech = SpeechClient.Create();
var response = speech.Recognize(new RecognitionConfig()
{
    Encoding = RecognitionConfig.Types.AudioEncoding.OggOpus,
    SampleRateHertz = 48000,
    LanguageCode = "en",
}, RecognitionAudio.FromStream(HttpContext.Current.Request.Files[0].InputStream));
foreach (var result in response.Results)
{
    foreach (var alternative in result.Alternatives)
    {
        text = alternative.Transcript;
    }
}
I have tried different combinations of Encoding and SampleRateHertz, but none of them works. I also tried saving the audio to the local drive as a WAV file first and running recognition on that file, but that does not work either.
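You are not recording in the format you think you are recording. Setting the Blob type to 'audio/wav' only labels the data; it does not convert it. Chrome's MediaRecorder records opus in a WebM container, while Firefox records opus in an Ogg container. This can be quickly validated by running the following snippet in the respective browser's JS console. Each call returns true or false depending on support.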
MediaRecorder.isTypeSupported('audio/webm;codecs=opus')
MediaRecorder.isTypeSupported('audio/ogg;codecs=opus')
The Google Cloud Speech API supports Opus, but only in an Ogg container. If you run the same code in Firefox, the recording should work with the Speech API as is.
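The support check can also be folded into a small helper that picks a container before recording starts. This is only a sketch: `pickRecorderMimeType` is a hypothetical name, and the predicate is passed in as a parameter so the function can be tried outside a browser. Where no `MediaRecorder` exists, it simply returns null.

```javascript
// Return the first MIME type the recorder supports, or null if none match.
// `isSupported` is an optional predicate; by default the browser's
// MediaRecorder.isTypeSupported is used when it is available.
function pickRecorderMimeType(candidates, isSupported) {
  const check = isSupported ||
    (typeof MediaRecorder !== 'undefined'
      ? (type) => MediaRecorder.isTypeSupported(type)
      : () => false);
  return candidates.find((type) => check(type)) || null;
}

// Prefer Ogg, which the Speech API accepts directly; fall back to WebM.
const preferred = ['audio/ogg;codecs=opus', 'audio/webm;codecs=opus'];
const mimeType = pickRecorderMimeType(preferred);
console.log(mimeType);
// In a browser, when mimeType is not null:
//   new MediaRecorder(stream, { mimeType: mimeType });
```

If the helper falls back to WebM, the recording still needs a container change on the server before it reaches the Speech API.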
For this to work with Chrome, you will need to re-mux the file into an Ogg container on the server side before sending it to the Cloud Speech API.
You can use ffmpeg to do so:
ffmpeg -i file_chrome.wav -acodec copy resources/file.oga
Note that this is a re-mux and not a re-encode. You are just copying the same audio data into a different container.
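If the server happens to run Node.js, the same command can be launched from code. That is an assumption: the server in the question is C#, where starting ffmpeg as a child process would play the same role. In this minimal sketch, `buildRemuxArgs` and `remuxToOgg` are hypothetical helper names, `ffmpeg` is assumed to be on the PATH, and `-y` is added so that ffmpeg overwrites an existing output file instead of prompting.

```javascript
const { execFile } = require('node:child_process');

// Arguments for a stream-copy re-mux: the Opus packets are copied untouched
// and only the container changes. '-y' (an addition to the command above)
// overwrites an existing output file instead of prompting.
function buildRemuxArgs(inputPath, outputPath) {
  return ['-y', '-i', inputPath, '-acodec', 'copy', outputPath];
}

// Run ffmpeg (assumed to be on the PATH) and report the result via callback.
function remuxToOgg(inputPath, outputPath, callback) {
  execFile('ffmpeg', buildRemuxArgs(inputPath, outputPath), (error) => {
    callback(error, error ? null : outputPath);
  });
}

// Show the command line that would be executed for the file names used above.
console.log('ffmpeg ' + buildRemuxArgs('file_chrome.wav', 'resources/file.oga').join(' '));
```

Passing the arguments as an array to `execFile` avoids shell quoting problems if the uploaded file name contains spaces.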
Bonus tip: if you are on Linux/Mac, you can use the file <file_name> command to check the output file type. The Chrome file would show up as WebM, and the Firefox output would show up as Ogg data, Opus audio.
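Where the file command is not available, a similar check can be done on the first four bytes of the recording. This is a rough sketch, not a full format detector: `sniffContainer` is a hypothetical name, and it only looks at the leading signature. Ogg streams begin with the ASCII bytes "OggS", and WebM files begin with the EBML header ID 0x1A45DFA3, which WebM shares with Matroska.

```javascript
// Identify the container from the leading bytes of a recording.
// Returns 'ogg', 'webm' (EBML signature, also used by Matroska) or 'unknown'.
function sniffContainer(bytes) {
  const startsWith = (signature) =>
    bytes.length >= signature.length &&
    signature.every((value, index) => bytes[index] === value);

  if (startsWith([0x4f, 0x67, 0x67, 0x53])) return 'ogg';   // "OggS"
  if (startsWith([0x1a, 0x45, 0xdf, 0xa3])) return 'webm';  // EBML header
  return 'unknown';
}

// Usage with a Blob from MediaRecorder (browser only):
//   const head = new Uint8Array(await blob.slice(0, 4).arrayBuffer());
//   console.log(sniffContainer(head));
console.log(sniffContainer(Uint8Array.from([0x4f, 0x67, 0x67, 0x53]))); // prints "ogg"
```

A recording that Chrome produced and that was labelled 'audio/wav' would report 'webm' here, since the label does not change the bytes.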