c# google-speech-api

Google Cloud Streaming Speech API


I need real-time speech recognition through the Google Cloud Speech API. However, it is still in beta, and there is not much helpful material available on the internet.

A few samples are available at https://cloud.google.com/speech/docs/samples, but I don't see a streaming example in C#. Does that mean I cannot use C# to stream my audio input to the Google Cloud Speech API?

Has anyone tried streaming audio input to the Cloud Speech API using .NET?

FYI, I cannot use the normal Web Speech API available from Google. I need to use only the Google Cloud Speech API.


Solution

  • Download the sample applications from here: https://cloud.google.com/speech/docs/samples

    There you will find two Speech samples: QuickStart and Recognize.

    The Recognize sample has many options, and one of them is Listen. This option streams audio and writes the results to the console continuously.

    The sample uses a protobuf byte stream for streaming. Here is the main part of the code:

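    // Requires: using Google.Apis.Auth.OAuth2; using Google.Cloud.Speech.V1; using Grpc.Auth;
    // Load the service-account key and scope it for the Speech API.
    var credential = GoogleCredential.FromFile( "privatekey.json" ).CreateScoped( SpeechClient.DefaultScopes );
    // Open an authenticated gRPC channel to the Speech endpoint.
    var channel = new Grpc.Core.Channel( SpeechClient.DefaultEndpoint.ToString(), credential.ToChannelCredentials() );
    var speech = SpeechClient.Create( channel );
    // Start the bidirectional streaming call.
    var streamingCall = speech.StreamingRecognize();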
    // Write the initial request with the config.
    await streamingCall.WriteAsync(
        new StreamingRecognizeRequest()
        {
            StreamingConfig = new StreamingRecognitionConfig()
            {
                Config = new RecognitionConfig()
                {
                    Encoding =
                    RecognitionConfig.Types.AudioEncoding.Linear16,
                    SampleRateHertz = 16000,
                    LanguageCode = "hu",
                },
                InterimResults = true,
            }
        } );
    

    Of course, LanguageCode must be changed to the language of your audio ("hu" is Hungarian).

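    Then the audio content must be streamed. Each time the recorder delivers a buffer (args is the event argument of the recording callback), send it as a further request on the same call: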

    streamingCall.WriteAsync(
        new StreamingRecognizeRequest()
        {
            AudioContent = Google.Protobuf.ByteString
                .CopyFrom( args.Buffer, 0, args.BytesRecorded )
        } ).Wait();
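
The two snippets above leave out where args comes from and how the results are read back. The following is a minimal sketch of those missing pieces, not the sample's exact code. It assumes the NAudio package for microphone capture (the answer does not name the recorder), and it assumes a version of Google.Cloud.Speech.V1 from the same period as the code above, where the call object exposes a ResponseStream property; newer releases of the library may expose the response stream differently. The names MicStreamingSketch, StreamMicrophoneAsync and seconds are made up for illustration.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Google.Cloud.Speech.V1;
using NAudio.Wave; // assumption: NAudio is used for recording

static class MicStreamingSketch // hypothetical name
{
    // streamingCall must already have been sent the initial config request.
    public static async Task StreamMicrophoneAsync(
        SpeechClient.StreamingRecognizeStream streamingCall, int seconds)
    {
        // Print transcripts (interim and final) as responses arrive.
        Task printResponses = Task.Run(async () =>
        {
            while (await streamingCall.ResponseStream.MoveNext(CancellationToken.None))
            {
                foreach (var result in streamingCall.ResponseStream.Current.Results)
                {
                    foreach (var alternative in result.Alternatives)
                    {
                        Console.WriteLine(alternative.Transcript);
                    }
                }
            }
        });

        object writeLock = new object();
        bool writeMore = true;

        // 16 kHz, mono, 16-bit PCM matches Linear16 with SampleRateHertz = 16000.
        using (var waveIn = new WaveInEvent
        {
            DeviceNumber = 0,
            WaveFormat = new WaveFormat(16000, 1)
        })
        {
            waveIn.DataAvailable += (sender, args) =>
            {
                lock (writeLock)
                {
                    if (!writeMore) return;
                    // Block until the chunk is written so chunks stay in order.
                    streamingCall.WriteAsync(
                        new StreamingRecognizeRequest
                        {
                            AudioContent = Google.Protobuf.ByteString
                                .CopyFrom(args.Buffer, 0, args.BytesRecorded)
                        }).Wait();
                }
            };

            waveIn.StartRecording();
            await Task.Delay(TimeSpan.FromSeconds(seconds));
            waveIn.StopRecording();
            lock (writeLock) writeMore = false;
        }

        // Tell the server no more audio is coming, then drain the remaining responses.
        await streamingCall.WriteCompleteAsync();
        await printResponses;
    }
}
```

Call it right after the initial config request has been written, for example `await MicStreamingSketch.StreamMicrophoneAsync(streamingCall, 60);`. The lock and the writeMore flag are there so that no audio chunk is written after WriteCompleteAsync has been called.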