c# asp.net speech-to-text azure-cognitive-services

How to include a continuous Speech to Text service (using MS Cognitive Services) in an ASP.NET web application that writes to text boxes in pages?


I am trying to include a continuous Speech to Text service in an ASP.NET application. The user speaks into a microphone on the client side and the speech is captured in a text box. The server side will use Microsoft's Cognitive Services on Azure. I found this article: https://codez.deedx.cz/posts/continuous-speech-to-text/ . I am not sure how the client side will talk to this API. Any help or sample code that covers both the client and server side would be appreciated.

Thanks!


Solution

  • If you want to integrate a speech-to-text (STT) service into your ASP.NET application, the simplest way may be to build the STT part as an HTML page and integrate that page as a view.

    I wrote a continuous speech-to-text HTML demo for you; try the code below:

    <!DOCTYPE html>
    <html>
    <head>
      <title>Microsoft Cognitive Services Speech SDK JavaScript Quickstart</title>
      <meta charset="utf-8" />
    </head>
    <body style="font-family:'Helvetica Neue',Helvetica,Arial,sans-serif; font-size:13px;">
      <!-- <uidiv> -->
    
      <div id="content" style="display:none">
        <table width="100%">
          <tr>
            <td></td>
            <td><h1 style="font-weight:500;">Continuous speech to text demo </h1></td>
          </tr>
          <tr>
            <td></td>
            <td><button id="startRecognizeAsyncButton">Start recognition</button>
            <button id="stopRecognizeAsyncButton">Stop recognition</button>
            </td>
    
          </tr>
          <tr>
            <td align="right" valign="top">Results</td>
            <td><textarea id="phraseDiv" style="display: inline-block;width:500px;height:200px"></textarea></td>
          </tr>
        </table>
      </div>
    
      <script src="microsoft.cognitiveservices.speech.sdk.bundle.js"></script>
    
      <script>
        // subscription settings, UI element references, and the recognizer
        var subscriptionKey = "<your subscription key>";
        var serviceRegion= "<your region>";
        var phraseDiv;
        var startRecognizeAsyncButton;
        var stopRecognizeAsyncButton;
        var SpeechSDK;
        var recognizer;
        document.addEventListener("DOMContentLoaded", function () {
          startRecognizeAsyncButton = document.getElementById("startRecognizeAsyncButton");
          stopRecognizeAsyncButton = document.getElementById("stopRecognizeAsyncButton");
          stopRecognizeAsyncButton.disabled = true;
          phraseDiv = document.getElementById("phraseDiv");

          // If the SDK bundle did not load, leave the UI hidden and stop here.
          if (!window.SpeechSDK) {
            return;
          }
          SpeechSDK = window.SpeechSDK;
          document.getElementById("content").style.display = "block";

          var speechConfig = SpeechSDK.SpeechConfig.fromSubscription(subscriptionKey, serviceRegion);
          speechConfig.speechRecognitionLanguage = "en-US";
          var audioConfig = SpeechSDK.AudioConfig.fromDefaultMicrophoneInput();
          recognizer = new SpeechSDK.SpeechRecognizer(speechConfig, audioConfig);

          // Register the handler before starting so no phrase is missed.
          // Each finalized phrase is appended to the textarea's value.
          recognizer.recognized = function (s, e) {
            if (e.result.text) {
              phraseDiv.value += e.result.text + " ";
            }
          };

          startRecognizeAsyncButton.addEventListener("click", function () {
            startRecognizeAsyncButton.disabled = true;
            stopRecognizeAsyncButton.disabled = false;
            phraseDiv.value = "";
            recognizer.startContinuousRecognitionAsync();
          });

          stopRecognizeAsyncButton.addEventListener("click", function () {
            startRecognizeAsyncButton.disabled = false;
            stopRecognizeAsyncButton.disabled = true;
            recognizer.stopContinuousRecognitionAsync();
          });

          // in case we have a function for getting an authorization token, call it.
          if (typeof RequestAuthorizationToken === "function") {
            RequestAuthorizationToken();
          }
        });
      </script>
      <!-- </quickstartcode> -->
    </body>
    </html>
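
The demo above writes into a single textarea. Since the question asks about filling text boxes on a page, one possible approach is to remember which field last had focus and append each recognized phrase there. This is a minimal sketch and not part of the original demo: `appendPhrase`, `trackFocusedField`, and `attachDictation` are hypothetical helper names, and `recognizer` is assumed to be the `SpeechRecognizer` created in the demo.

```javascript
// Hypothetical helper: join a newly recognized phrase onto existing text
// with a single space, ignoring empty results.
function appendPhrase(current, phrase) {
  var text = (phrase || "").trim();
  if (!text) {
    return current;
  }
  return current ? current.replace(/\s+$/, "") + " " + text : text;
}

// Hypothetical helper: remember the last text input or textarea that
// received focus. Focus moving to a button does not change the target.
function trackFocusedField(doc) {
  var state = { field: null };
  doc.addEventListener("focusin", function (event) {
    var el = event.target;
    if (el.tagName === "TEXTAREA" || (el.tagName === "INPUT" && el.type === "text")) {
      state.field = el;
    }
  });
  return state;
}

// Hypothetical wiring: route each finalized phrase to the tracked field.
function attachDictation(recognizer, state) {
  recognizer.recognized = function (sender, event) {
    if (state.field) {
      state.field.value = appendPhrase(state.field.value, event.result.text);
    }
  };
}
```

In the page, this could be wired up once after the recognizer is created with `attachDictation(recognizer, trackFocusedField(document));`. Clicking the start button does not change the target, because only text inputs and textareas are tracked.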
    

    How to run it:

    1. Save the code as a .html file.
    2. Replace subscriptionKey and serviceRegion with your own service values, which are listed on your Speech resource in the Azure portal.

    3. From the Speech SDK for JavaScript .zip package, extract the file microsoft.cognitiveservices.speech.sdk.bundle.js and place it in the folder that contains this sample.
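
Step 2 places the subscription key in the page source, where any visitor can read it. A common alternative is to keep the key on the server and have the page request a short-lived authorization token instead, which is also one way to split the client and server sides in an ASP.NET application. The sketch below shows only the browser side and rests on assumptions: `/api/speech/token` is a hypothetical endpoint in your own application that returns the token as plain text, and `fetchSpeechToken` and `createRecognizerFromToken` are illustrative names. `SpeechConfig.fromAuthorizationToken` is the Speech SDK call that accepts such a token.

```javascript
// Hypothetical helper: ask our own server for a short-lived token.
// tokenUrl is an assumed endpoint; fetchFn defaults to the browser's fetch.
async function fetchSpeechToken(tokenUrl, fetchFn) {
  var doFetch = fetchFn || fetch;
  var response = await doFetch(tokenUrl, { method: "POST" });
  if (!response.ok) {
    throw new Error("Token request failed with status " + response.status);
  }
  return (await response.text()).trim();
}

// Build a recognizer from a token instead of the raw subscription key.
// SpeechSDK is passed in so the function does not depend on a global.
function createRecognizerFromToken(SpeechSDK, token, serviceRegion) {
  var speechConfig = SpeechSDK.SpeechConfig.fromAuthorizationToken(token, serviceRegion);
  speechConfig.speechRecognitionLanguage = "en-US";
  var audioConfig = SpeechSDK.AudioConfig.fromDefaultMicrophoneInput();
  return new SpeechSDK.SpeechRecognizer(speechConfig, audioConfig);
}
```

In the demo, this would replace the `fromSubscription` call, for example `fetchSpeechToken("/api/speech/token").then(function (token) { recognizer = createRecognizerFromToken(window.SpeechSDK, token, serviceRegion); });`. Tokens expire after a short period, so a long dictation session would need to request a fresh one periodically.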

    Hope it helps!