
Managed Identity Authentication for Azure AI Speech - WebSocket upgrade failed: Authentication error (401)


I am trying to connect to Azure AI Speech with the Speech SDK using a managed identity; I don't want to use an API key. I followed this article: https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-configure-azure-ad-auth?tabs=portal&pivots=programming-language-csharp and created a custom domain, then assigned the required role to myself (for Visual Studio) and to the App Service. However, I get this error: WebSocket upgrade failed: Authentication error (401). Please check subscription information and region name.

Below is my code:

var tokenCredential = new DefaultAzureCredential(new DefaultAzureCredentialOptions{
    TenantId = $"{Environment.GetEnvironmentVariable("tenant")}"
});
string token = tokenCredential.GetTokenAsync(
            new TokenRequestContext(scopes: new string[] { "https://cognitiveservices.azure.com/.default" })).GetAwaiter().GetResult().Token;

string authorizationToken = $"aad#{cognitiveResourceId}#{token}";
SpeechConfig Config = SpeechConfig.FromAuthorizationToken(authorizationToken, speechRegion);

Is this supposed to work with a managed identity, or only when the token is obtained through an interactive browser login?


Solution

  • I created a sample console app that uses DefaultAzureCredential, so the same code authenticates with developer credentials locally and with the Managed Identity in production.

    Code:

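    using System;
    using Azure.Core;
    using Azure.Identity;
    using Microsoft.CognitiveServices.Speech;

    class Program
    {
        static void Main(string[] args)
        {
            try
            {
                // DefaultAzureCredential picks up developer credentials (e.g. Visual Studio) locally
                // and the Managed Identity when the app runs in Azure.
                var tokenCredential = new DefaultAzureCredential(new DefaultAzureCredentialOptions
                {
                    TenantId = "<tenantID>"
                });

                // Request a Microsoft Entra ID token for Cognitive Services.
                // The synchronous GetToken avoids blocking on the ValueTask returned by GetTokenAsync.
                string[] scopes = { "https://cognitiveservices.azure.com/.default" };
                string token = tokenCredential.GetToken(new TokenRequestContext(scopes), default).Token;

                // The Speech SDK expects the token in the form "aad#<resourceId>#<token>".
                string cognitiveResourceId = "<ResourceID>";
                string authorizationToken = $"aad#{cognitiveResourceId}#{token}";
                string speechRegion = "<speechRegion>";
                SpeechConfig speechConfig = SpeechConfig.FromAuthorizationToken(authorizationToken, speechRegion);
                SynthesizeTextToSpeech(speechConfig);
            }
            catch (Exception ex)
            {
                Console.WriteLine($"An error occurred: {ex.Message}");
            }
        }

        static void SynthesizeTextToSpeech(SpeechConfig speechConfig)
        {
            // Dispose the synthesizer and the result once synthesis is done.
            using var synthesizer = new SpeechSynthesizer(speechConfig);
            string textToSynthesize = "Hello Kamali, how are you?";
            using var result = synthesizer.SpeakTextAsync(textToSynthesize).GetAwaiter().GetResult();
            if (result.Reason == ResultReason.SynthesizingAudioCompleted)
            {
                Console.WriteLine("Text-to-Speech synthesis completed successfully.");
            }
            else
            {
                Console.WriteLine($"Text-to-Speech synthesis failed: {result.Reason}");
                if (result.Reason == ResultReason.Canceled)
                {
                    // The cancellation details carry the service error, e.g. the 401 message.
                    var details = SpeechSynthesisCancellationDetails.FromResult(result);
                    Console.WriteLine($"Error details: {details.ErrorDetails}");
                }
            }
        }
    }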
    

    I assigned the Cognitive Services Speech Contributor role to both the service principal (for local development with DefaultAzureCredential) and the Azure Web App (for production with Managed Identity).

    I successfully converted the text to speech and heard the audio of the converted speech.

    I have successfully deployed the console app to Azure WebJobs.

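
The `aad#<resourceId>#<token>` string built in the code above is easy to get wrong, so here is a minimal sketch of a helper that assembles it and rejects an obviously malformed resource ID before the SDK is ever called. `SpeechAadToken`, `Build`, and `Demo` are hypothetical names, not part of the Speech SDK. The resource ID pattern assumes the usual ARM form `/subscriptions/<id>/resourceGroups/<group>/providers/Microsoft.CognitiveServices/accounts/<name>`, and the subscription, group, and account values in `Main` are placeholders.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the Speech SDK.
public static class SpeechAadToken
{
    // Assumption: the Speech resource ID is a full ARM resource ID of this shape.
    private static readonly Regex ResourceIdPattern = new Regex(
        @"^/subscriptions/[^/]+/resourceGroups/[^/]+/providers/Microsoft\.CognitiveServices/accounts/[^/]+$",
        RegexOptions.IgnoreCase);

    // Returns "aad#<resourceId>#<aadToken>", the format used in the answer's code.
    public static string Build(string resourceId, string aadToken)
    {
        if (string.IsNullOrWhiteSpace(resourceId) || !ResourceIdPattern.IsMatch(resourceId))
        {
            throw new ArgumentException(
                "Expected the full ARM resource ID of the Speech resource, not its name or endpoint.",
                nameof(resourceId));
        }
        if (string.IsNullOrWhiteSpace(aadToken))
        {
            throw new ArgumentException("The access token must not be empty.", nameof(aadToken));
        }
        return $"aad#{resourceId}#{aadToken}";
    }
}

public static class Demo
{
    public static void Main()
    {
        // Placeholder values for illustration only.
        string resourceId =
            "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/my-rg" +
            "/providers/Microsoft.CognitiveServices/accounts/my-speech";

        Console.WriteLine(SpeechAadToken.Build(resourceId, "<token>"));

        try
        {
            // Passing only the resource name is rejected by the pattern check.
            SpeechAadToken.Build("my-speech", "<token>");
        }
        catch (ArgumentException ex)
        {
            Console.WriteLine($"Rejected: {ex.Message}");
        }
    }
}
```

The result of `SpeechAadToken.Build(...)` can be passed to `SpeechConfig.FromAuthorizationToken` in place of the inline string interpolation. The check only catches formatting mistakes; it cannot confirm that the resource exists or that the identity has the required role.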