I created, trained, and deployed a Custom Voice model in Azure Speech Studio.
On the Deploy Model page, I am given a resource key, a service region, and an Endpoint ID. I used that Endpoint ID (I am sure it is correct) in the code below.
However, I'm getting this error:
Error code: 1007. Error details: Invalid deploymentId XXXXX USP state: TurnStarted. Received audio size: 0 bytes.]
CANCELED: Did you set the speech resource key and region values?
Can anyone give a hint as to why it isn't working?
using System;
using System.IO;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;

class Program
{
    // Note: hard-coded here, although the sample was written to read
    // environment variables named "SPEECH_KEY" and "SPEECH_REGION".
    static string speechKey = "my-resource-key";
    static string speechRegion = "my-region";

    static void OutputSpeechSynthesisResult(SpeechSynthesisResult speechSynthesisResult, string text)
    {
        switch (speechSynthesisResult.Reason)
        {
            case ResultReason.SynthesizingAudioCompleted:
                Console.WriteLine($"Speech synthesized for text");
                break;
            case ResultReason.Canceled:
                var cancellation = SpeechSynthesisCancellationDetails.FromResult(speechSynthesisResult);
                Console.WriteLine($"CANCELED: Reason={cancellation.Reason}");
                if (cancellation.Reason == CancellationReason.Error)
                {
                    Console.WriteLine($"CANCELED: ErrorCode={cancellation.ErrorCode}");
                    Console.WriteLine($"CANCELED: ErrorDetails=[{cancellation.ErrorDetails}]");
                    Console.WriteLine($"CANCELED: Did you set the speech resource key and region values?");
                }
                break;
            default:
                break;
        }
    }

    async static Task Main(string[] args)
    {
        var speechConfig = SpeechConfig.FromSubscription(speechKey, speechRegion);
        // The name of the custom voice and the Endpoint ID of its deployment.
        speechConfig.SpeechSynthesisVoiceName = "MyNeuralVoice";
        speechConfig.EndpointId = "my-endpoint-id";
        using (var speechSynthesizer = new SpeechSynthesizer(speechConfig))
        {
            // Get text from the console and synthesize to the default speaker.
            Console.WriteLine("Enter some text that you want to speak >");
            string text = Console.ReadLine();
            var speechSynthesisResult = await speechSynthesizer.SpeakTextAsync(text);
            OutputSpeechSynthesisResult(speechSynthesisResult, text);
        }
        Console.WriteLine("Press any key to exit...");
        Console.ReadKey();
    }
}
I ran your code on my end and it converted the text to speech successfully.
Note:
Create and deploy a Text to Speech Custom Neural Voice in Speech Studio, then use the resource key, region, and deployment ID (Endpoint ID) shown for that deployment in the code below.
Code:
using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;

class Program
{
    static string speechKey = "<speech_key>";
    static string speechRegion = "<speech_region>";

    static void OutputSpeechSynthesisResult(SpeechSynthesisResult speechSynthesisResult, string text)
    {
        switch (speechSynthesisResult.Reason)
        {
            case ResultReason.SynthesizingAudioCompleted:
                Console.WriteLine($"Speech synthesized for text");
                break;
            case ResultReason.Canceled:
                var cancellation = SpeechSynthesisCancellationDetails.FromResult(speechSynthesisResult);
                Console.WriteLine($"CANCELED: Reason={cancellation.Reason}");
                if (cancellation.Reason == CancellationReason.Error)
                {
                    Console.WriteLine($"CANCELED: ErrorCode={cancellation.ErrorCode}");
                    Console.WriteLine($"CANCELED: ErrorDetails=[{cancellation.ErrorDetails}]");
                    Console.WriteLine($"CANCELED: Did you set the speech resource key and region values?");
                }
                break;
            default:
                break;
        }
    }

    async static Task Main(string[] args)
    {
        var speechConfig = SpeechConfig.FromSubscription(speechKey, speechRegion);
        // Replace with the name of your deployed voice; I used "en-US".
        speechConfig.SpeechSynthesisVoiceName = "en-US";
        // The deployment ID (Endpoint ID) of the deployed model.
        speechConfig.EndpointId = "<deploymentID>";
        using (var speechSynthesizer = new SpeechSynthesizer(speechConfig))
        {
            Console.WriteLine("Enter some text that you want to speak >");
            string text = Console.ReadLine();
            var speechSynthesisResult = await speechSynthesizer.SpeakTextAsync(text);
            OutputSpeechSynthesisResult(speechSynthesisResult, text);
        }
        Console.WriteLine("Press any key to exit...");
        Console.ReadKey();
    }
}
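When the service reports Invalid deploymentId, one thing worth ruling out is a mismatch or copy error among the three values, since the key, region, and Endpoint ID are all shown together for one deployment. The following is only a minimal sketch of a pre-flight check, not part of the code that was run above. It assumes the values are stored in environment variables: SPEECH_KEY and SPEECH_REGION follow the names in the question's code comment, while SPEECH_ENDPOINT_ID, the SpeechSettings class, and the idea that the Endpoint ID is GUID-shaped are assumptions made for illustration.

```csharp
using System;

static class SpeechSettings
{
    // Returns the trimmed value of a required environment variable,
    // or throws with a clear message if it is missing or blank.
    public static string Require(string name)
    {
        var value = Environment.GetEnvironmentVariable(name);
        if (string.IsNullOrWhiteSpace(value))
        {
            throw new InvalidOperationException($"Environment variable '{name}' is not set.");
        }
        return value.Trim();
    }

    // Assumption: the Endpoint ID from the Deploy Model page is GUID-shaped.
    public static bool LooksLikeEndpointId(string value)
    {
        return Guid.TryParse(value, out _);
    }
}

class SettingsCheck
{
    static void Main()
    {
        try
        {
            // Only checks that the key is present; the key itself is never printed.
            _ = SpeechSettings.Require("SPEECH_KEY");
            string region = SpeechSettings.Require("SPEECH_REGION");
            // SPEECH_ENDPOINT_ID is a hypothetical variable name used only in this sketch.
            string endpointId = SpeechSettings.Require("SPEECH_ENDPOINT_ID");

            if (!SpeechSettings.LooksLikeEndpointId(endpointId))
            {
                Console.WriteLine("Endpoint ID does not look like a GUID; re-copy it from the Deploy Model page.");
                return;
            }
            Console.WriteLine($"Settings look plausible for region '{region}'.");
        }
        catch (InvalidOperationException ex)
        {
            Console.WriteLine(ex.Message);
        }
    }
}
```

If the check passes, the same values can be passed to SpeechConfig.FromSubscription and assigned to speechConfig.EndpointId as in the code above. A passing check only catches blank or malformed values; it cannot confirm that the key and region belong to the resource the model was deployed to.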
Output:
The code ran successfully and prompted me to enter some text. I entered "Hi Kamali. How are you?", heard the synthesized speech for that text, and got the following console output:
Enter some text that you want to speak >
Hi Kamali. How are you?
Speech synthesized for text
Press any key to exit...
Reference:
See the Microsoft documentation (MSDOC) on converting text to speech.