I created a bot with Bot Framework and now I'm trying to use the Custom Speech service instead of the Bing Speech-to-Text service, which works fine. I have tried various ways to resolve the problem, but I keep getting error 400 and I don't know how to solve it.
This is the method where I want to get the text from a Stream of WAV PCM audio:
public static async Task<string> CustomSpeechToTextStream(Stream audioStream)
{
    audioStream.Seek(0, SeekOrigin.Begin);
    var customSpeechUrl = "https://westus.stt.speech.microsoft.com/speech/recognition/interactive/cognitiveservices/v1?cid=<MyEndPointId>";
    string token = GetToken();
    HttpWebRequest request = (HttpWebRequest)HttpWebRequest.Create(customSpeechUrl);
    request.SendChunked = true;
    //request.Accept = @"application/json;text/xml";
    request.Method = "POST";
    request.ProtocolVersion = HttpVersion.Version11;
    request.ContentType = "audio/wav; codec=\"audio/pcm\"; samplerate=16000";
    request.Headers["Authorization"] = "Bearer " + token;
    byte[] buffer = null;
    int bytesRead = 0;
    using (Stream requestStream = request.GetRequestStream())
    {
        // Copy the input audio to the request in chunks of up to 1024 bytes.
        buffer = new Byte[checked((uint)Math.Min(1024, (int)audioStream.Length))];
        while ((bytesRead = audioStream.Read(buffer, 0, buffer.Length)) != 0)
        {
            requestStream.Write(buffer, 0, bytesRead);
        }
        requestStream.Flush();
    }
    string responseString = string.Empty;
    // Get the response from the service.
    using (WebResponse response = request.GetResponse()) // Here I get the error
    {
        using (StreamReader sr = new StreamReader(response.GetResponseStream()))
        {
            responseString = sr.ReadToEnd();
        }
    }
    dynamic deserializedResponse = Newtonsoft.Json.JsonConvert.DeserializeObject(responseString);
    if (deserializedResponse.RecognitionStatus == "Success")
    {
        return deserializedResponse.DisplayText;
    }
    else
    {
        return null;
    }
}
At the line using (WebResponse response = request.GetResponse()) I get an exception (error 400).
Am I making the HttpWebRequest the right way?
I read on the internet that the problem may be the audio file... but then why doesn't the Bing Speech service return this error with the same Stream?
In my case the problem was that my WAV audio stream didn't have the file header that CRIS (Custom Speech Service) needs. The solution is to create a temporary WAV file, read that file back into a Stream, and send its bytes to CRIS:
byte[] buffer = null;
int bytesRead = 0;
using (Stream requestStream = request.GetRequestStream())
{
    buffer = new Byte[checked((uint)Math.Min(1024, (int)audioStream.Length))];
    while ((bytesRead = audioStream.Read(buffer, 0, buffer.Length)) != 0)
    {
        requestStream.Write(buffer, 0, bytesRead);
    }
    requestStream.Flush();
}
or copy it into a MemoryStream and send it as an array:
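using (Stream requestStream = request.GetRequestStream())
{
    // audioStream must be a MemoryStream here; call ToArray() once to avoid copying the data twice.
    byte[] audioBytes = audioStream.ToArray();
    requestStream.Write(audioBytes, 0, audioBytes.Length);
    requestStream.Flush();
}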
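If you would rather avoid the temporary file, the missing header can probably be prepended in memory instead. This is a different route from the one I used above, and only a minimal sketch: it assumes the raw audio is 16 kHz, 16-bit, mono PCM (matching the samplerate in the Content-Type of the question), and WavHeaderHelper / AddWavHeader are hypothetical names, not part of any SDK.

```csharp
using System;
using System.IO;
using System.Text;

public static class WavHeaderHelper
{
    // Hypothetical helper: wraps headerless PCM samples in a canonical
    // 44-byte RIFF/WAVE header. The defaults (16 kHz, mono, 16-bit) are
    // assumptions; they must match the real format of your audio.
    public static MemoryStream AddWavHeader(byte[] pcm, int sampleRate = 16000,
                                            short channels = 1, short bitsPerSample = 16)
    {
        short blockAlign = (short)(channels * bitsPerSample / 8);
        int byteRate = sampleRate * blockAlign;

        var output = new MemoryStream(44 + pcm.Length);
        // BinaryWriter writes integers in little-endian order, as WAV requires.
        using (var writer = new BinaryWriter(output, Encoding.ASCII, leaveOpen: true))
        {
            writer.Write(Encoding.ASCII.GetBytes("RIFF"));
            writer.Write(36 + pcm.Length);   // RIFF chunk size: total length minus 8
            writer.Write(Encoding.ASCII.GetBytes("WAVE"));
            writer.Write(Encoding.ASCII.GetBytes("fmt "));
            writer.Write(16);                // size of the fmt chunk for PCM
            writer.Write((short)1);          // audio format 1 = uncompressed PCM
            writer.Write(channels);
            writer.Write(sampleRate);
            writer.Write(byteRate);
            writer.Write(blockAlign);
            writer.Write(bitsPerSample);
            writer.Write(Encoding.ASCII.GetBytes("data"));
            writer.Write(pcm.Length);        // size of the sample data in bytes
            writer.Write(pcm);
        }
        output.Position = 0;                 // rewind so the caller can read from the start
        return output;
    }
}
```

The returned MemoryStream can then be sent with the ToArray() variant shown above, for example by calling WavHeaderHelper.AddWavHeader(rawPcmBytes) and writing the result to the request stream.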