c#, azure, azure-cognitive-services, azure-speech

Azure Neural Voice: Invalid deploymentId


I trained, created and deployed a Custom Voice model with Azure's Speech Studio.

On the Deploy Model page, I am given a Resource key, a Service region and an Endpoint ID. I used the Endpoint ID (I am sure it is correct) in my code below.

However, I'm getting this error:

    [Error code: 1007. Error details: Invalid deploymentId XXXXX USP state: TurnStarted. Received audio size: 0 bytes.]
    CANCELED: Did you set the speech resource key and region values?

Can anyone give a hint as to why it isn't working?


using System;
using System.IO;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;

class Program 
{
    // Speech resource key and service region from the Deploy Model page (placeholders).
    static string speechKey = "my-resource-key";
    static string speechRegion = "my-region";


    static void OutputSpeechSynthesisResult(SpeechSynthesisResult speechSynthesisResult, string text)
    {
        switch (speechSynthesisResult.Reason)
        {
            case ResultReason.SynthesizingAudioCompleted:
                Console.WriteLine($"Speech synthesized for text");
                break;
            case ResultReason.Canceled:
                var cancellation = SpeechSynthesisCancellationDetails.FromResult(speechSynthesisResult);
                Console.WriteLine($"CANCELED: Reason={cancellation.Reason}");

                if (cancellation.Reason == CancellationReason.Error)
                {
                    Console.WriteLine($"CANCELED: ErrorCode={cancellation.ErrorCode}");
                    Console.WriteLine($"CANCELED: ErrorDetails=[{cancellation.ErrorDetails}]");
                    Console.WriteLine($"CANCELED: Did you set the speech resource key and region values?");
                }
                break;
            default:
                break;
        }
    }

    async static Task Main(string[] args)
    {
        var speechConfig = SpeechConfig.FromSubscription(speechKey, speechRegion);

        // The custom voice name and the ID of the endpoint it is deployed to.
        speechConfig.SpeechSynthesisVoiceName = "MyNeuralVoice";
        speechConfig.EndpointId = "my-endpoint-id";

        using (var speechSynthesizer = new SpeechSynthesizer(speechConfig))
        {
            // Get text from the console and synthesize to the default speaker.
            Console.WriteLine("Enter some text that you want to speak >");
            string text = Console.ReadLine();

            var speechSynthesisResult = await speechSynthesizer.SpeakTextAsync(text);
            OutputSpeechSynthesisResult(speechSynthesisResult, text);
        }

        Console.WriteLine("Press any key to exit...");
        Console.ReadKey();
    }
}



Solution

  • I tried your code on my end and successfully converted text to speech.

    Note:

    1. Before making a request, make sure the deployment status is Succeeded. If it is Processing or Suspended, the request fails with an error.

    2. When I supplied an incorrect deploymentId, I received the same error as in the question.
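
Since a mistyped or badly pasted ID produces the same error, a quick local check before calling the service can rule out copy and paste problems. The following is only a minimal sketch: `EndpointIdCheck` and `TryNormalize` are hypothetical names (they are not part of the Speech SDK), and it assumes the Endpoint ID shown in Speech Studio is a GUID-formatted string. It cannot tell whether the ID actually belongs to your deployment.

```csharp
using System;

public static class EndpointIdCheck
{
    // Hypothetical helper, not part of the Speech SDK.
    // Trims whitespace and stray quotes from a pasted value and checks that it
    // parses as a GUID (an assumption about how Endpoint IDs are formatted).
    public static bool TryNormalize(string raw, out string normalized)
    {
        normalized = string.Empty;
        if (string.IsNullOrWhiteSpace(raw))
        {
            return false;
        }

        string trimmed = raw.Trim().Trim('"', '\'');
        if (!Guid.TryParse(trimmed, out Guid parsed))
        {
            return false;
        }

        // "D" gives the lowercase, hyphenated form without braces.
        normalized = parsed.ToString("D");
        return true;
    }
}
```

If the check fails, copy the Endpoint ID again from the Deploy model page before looking for other causes.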

    Follow the steps below to create a Text to Speech Custom Neural Voice:

    1. Go to Speech Studio (microsoft.com) and click Custom Voice.

    2. Click Create a project. I already have a project named kamtts.

    3. Go to Record and build. Record some samples (I recorded 32 samples) and click Train model.

    4. Go to Review model and check the details of your model.

    5. Go to Deploy model and click Deploy model to deploy your model.

    6. Open your deployed model and copy the Endpoint key, region, and deploymentId.

    Use the above Endpoint key, region, and deploymentId in the code below.
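
As an optional alternative to pasting the three values directly into source code, they could be read from environment variables and checked for missing entries up front. This is a sketch under stated assumptions: `SpeechSettings` is a hypothetical class, `SPEECH_KEY` and `SPEECH_REGION` follow the names mentioned in the question's code comment, and `SPEECH_ENDPOINT_ID` is a name made up for this example.

```csharp
using System;
using System.Collections.Generic;

public sealed class SpeechSettings
{
    public string Key { get; }
    public string Region { get; }
    public string EndpointId { get; }

    private SpeechSettings(string key, string region, string endpointId)
    {
        Key = key;
        Region = region;
        EndpointId = endpointId;
    }

    // The lookup is injectable so the logic can be tried without real environment
    // variables; by default it reads from the process environment.
    public static SpeechSettings Load(Func<string, string> lookup = null)
    {
        if (lookup == null)
        {
            lookup = name => Environment.GetEnvironmentVariable(name);
        }

        var missing = new List<string>();
        string Read(string name)
        {
            string value = lookup(name);
            if (string.IsNullOrWhiteSpace(value))
            {
                missing.Add(name);
                return string.Empty;
            }
            return value.Trim();
        }

        // SPEECH_ENDPOINT_ID is a hypothetical variable name chosen for this sketch.
        string key = Read("SPEECH_KEY");
        string region = Read("SPEECH_REGION");
        string endpointId = Read("SPEECH_ENDPOINT_ID");

        if (missing.Count > 0)
        {
            throw new InvalidOperationException(
                "Missing environment variables: " + string.Join(", ", missing));
        }
        return new SpeechSettings(key, region, endpointId);
    }
}
```

The loaded values would then be passed to SpeechConfig.FromSubscription(settings.Key, settings.Region) and assigned to speechConfig.EndpointId, exactly where the placeholders are used.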

    Code:

    using System;
    using System.Threading.Tasks;
    using Microsoft.CognitiveServices.Speech;
    
    class Program
    {
        static string speechKey = "<speech_key>";
        static string speechRegion = "<speech_region>";
    
        static void OutputSpeechSynthesisResult(SpeechSynthesisResult speechSynthesisResult, string text)
        {
            switch (speechSynthesisResult.Reason)
            {
                case ResultReason.SynthesizingAudioCompleted:
                    Console.WriteLine($"Speech synthesized for text");
                    break;
                case ResultReason.Canceled:
                    var cancellation = SpeechSynthesisCancellationDetails.FromResult(speechSynthesisResult);
                    Console.WriteLine($"CANCELED: Reason={cancellation.Reason}");
    
                    if (cancellation.Reason == CancellationReason.Error)
                    {
                        Console.WriteLine($"CANCELED: ErrorCode={cancellation.ErrorCode}");
                        Console.WriteLine($"CANCELED: ErrorDetails=[{cancellation.ErrorDetails}]");
                        Console.WriteLine($"CANCELED: Did you set the speech resource key and region values?");
                    }
                    break;
                default:
                    break;
            }
        }
    
        async static Task Main(string[] args)
        {
            var speechConfig = SpeechConfig.FromSubscription(speechKey, speechRegion);
            speechConfig.SpeechSynthesisVoiceName = "en-US"; // Replace with your own voice name; "en-US" is the value I used
            speechConfig.EndpointId = "<deploymentID>";
    
            using (var speechSynthesizer = new SpeechSynthesizer(speechConfig))
            {
                Console.WriteLine("Enter some text that you want to speak >");
                string text = Console.ReadLine();
                var speechSynthesisResult = await speechSynthesizer.SpeakTextAsync(text);
                OutputSpeechSynthesisResult(speechSynthesisResult, text);
            }
    
            Console.WriteLine("Press any key to exit...");
            Console.ReadKey();
        }
    }
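
The fixed hint "Did you set the speech resource key and region values?" is printed for every error, which can be misleading when the real problem is the deployment ID. A possible refinement, sketched here with a hypothetical helper named `CancellationHints` (not part of the Speech SDK), is to pick the hint from the error details text. The only pattern it recognizes is the "Invalid deploymentId" text quoted in the question; anything else falls back to the original message.

```csharp
using System;

public static class CancellationHints
{
    // Hypothetical helper: maps the ErrorDetails text to a more specific hint.
    public static string HintFor(string errorDetails)
    {
        if (string.IsNullOrEmpty(errorDetails))
        {
            return "No error details were returned.";
        }

        // Matches the message text quoted in the question.
        if (errorDetails.IndexOf("Invalid deploymentId", StringComparison.OrdinalIgnoreCase) >= 0)
        {
            return "Check that EndpointId matches the deployed model, that the key and "
                 + "region come from the same Deploy model page, and that the "
                 + "deployment status is Succeeded.";
        }

        // Fallback: the generic hint used in the original code.
        return "Did you set the speech resource key and region values?";
    }
}
```

In the Canceled branch, the last Console.WriteLine could then become Console.WriteLine($"CANCELED: {CancellationHints.HintFor(cancellation.ErrorDetails)}");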
    

    Output:

    The code ran successfully and prompted me to enter some text. I entered Hi Kamali. How are you? and heard the corresponding speech output, with the console showing:

    Enter some text that you want to speak >
    Hi Kamali. How are you?
    Speech synthesized for text
    Press any key to exit...
    

    Reference:

    See the Microsoft documentation (MSDOC) on converting text to speech.