I am using the Dialogflow C# library Google.Cloud.Dialogflow.V2 to communicate with my Dialogflow agent.
Everything works fine when sending text using DetectIntentAsync().
My issue is with sending audio, specifically in the .AAC format.
To send audio using DetectIntentAsync(), we need to create a DetectIntentRequest like below:
DetectIntentRequest detectIntentRequest = new DetectIntentRequest
{
    InputAudio = **HERE WHERE I HAVE AN ISSUE**,
    QueryInput = queryInput,
    Session = "projects/" + _sessionName.ProjectId + "/agent/sessions/" + _sessionName.SessionId
};
The QueryInput is configured with an AudioConfig like below:
QueryInput queryInput = new QueryInput
{
    AudioConfig = audioConfig,
};
The AudioConfig is configured like below:
var audioConfig = new InputAudioConfig
{
    AudioEncoding = **HAVING ISSUE HERE ON HOW TO CHOOSE THE CORRECT ENCODING**,
    LanguageCode = "en-US",
    ModelVariant = SpeechModelVariant.Unspecified,
    SampleRateHertz = **HAVING ISSUE HERE ON HOW TO CHOOSE THE CORRECT SAMPLE RATE HERTZ**,
};
PROBLEM
I get the byte[] by downloading the audio from a URL:
// The audio is a .AAC file
string audio = "https://cdn.fbsbx.com/v/t59.3654-21/72342591_3243833722299817_3308062589669343232_n.aac/audioclip-1575911942672-2279.aac?_nc_cat=102&_nc_ohc=heP60KND_DMAQl5-tE77rKNtUzHw_aILXdKfPPejdr7YVqzbYLQRv9BWA&_nc_ht=cdn.fbsbx.com&oh=1c4dbf0a64e0d1fb057b79354c17ca1c&oe=5DF17429";
byte[] audioBytes;
using (var webClient = new WebClient())
{
    audioBytes = webClient.DownloadData(audio);
}
Then I added that to the DetectIntentRequest like below:
DetectIntentRequest detectIntentRequest = new DetectIntentRequest
{
    InputAudio = Google.Protobuf.ByteString.CopyFrom(audioBytes),
    QueryInput = queryInput,
    Session = "projects/" + _sessionName.ProjectId + "/agent/sessions/" + _sessionName.SessionId
};
If I do not specify the SampleRateHertz, I get the following error:
Error: "{"Status(StatusCode=InvalidArgument, Detail=\"Invalid input audio or config. Unable to calculate audio duration. Possibly no audio data sent.\")"} "
The error went away once I specified a SampleRateHertz value, but this is the response I keep getting no matter which values I use for AudioEncoding and SampleRateHertz:
Response: {{ "languageCode": "en" }}
Everything else in the DetectIntentResponse is null.
Any guidance or help is appreciated.
Thank you.
For those who face the .AAC issue with Dialogflow, I managed to get it working by converting the AAC audio to WAV first (using NAudio) and sending the WAV bytes, like below:
DetectIntentResponse response = new DetectIntentResponse();
var queryAudio = new InputAudioConfig
{
    LanguageCode = LanguageCode,
    ModelVariant = SpeechModelVariant.Unspecified,
};
QueryInput queryInput = new QueryInput
{
    AudioConfig = queryAudio,
};
var filename = "fileName.wav";
// userAudioInput is the .AAC URL (a string)
// path is the directory where the converted file is saved
// MediaFoundationReader and WaveFileWriter come from the NAudio library
// Create and save the WAV version of the AAC audio
using (var reader = new MediaFoundationReader(userAudioInput))
{
    Directory.CreateDirectory(path);
    WaveFileWriter.CreateWaveFile(path + "/" + filename, reader);
}
// Read the previously saved WAV file
byte[] inputAudio = File.ReadAllBytes(path + "/" + filename);
DetectIntentRequest detectIntentRequest = new DetectIntentRequest
{
    //InputAudio = Google.Protobuf.ByteString.CopyFrom(ReadFully(outputStreamMono)),
    InputAudio = Google.Protobuf.ByteString.CopyFrom(inputAudio),
    QueryInput = queryInput,
    Session = "projects/" + _sessionName.ProjectId + "/agent/sessions/" + _sessionName.SessionId
};
// Make the request
response = await _sessionsClient.DetectIntentAsync(detectIntentRequest);
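The code above leaves AudioEncoding and SampleRateHertz unset. If you would rather set them explicitly, the values can be read from the header of the converted WAV file. Below is a minimal sketch of that idea, not part of the Dialogflow or NAudio APIs: the WavHeader class and its ReadFormat method are hypothetical names introduced here, and the code assumes a standard RIFF/WAVE layout on a little-endian machine.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper (not part of Dialogflow or NAudio): reads the "fmt "
// chunk of a RIFF/WAVE byte array so its values can be reused in the request.
public static class WavHeader
{
    public static (short FormatTag, short Channels, int SampleRate, short BitsPerSample)
        ReadFormat(byte[] wav)
    {
        if (wav == null || wav.Length < 12 ||
            Encoding.ASCII.GetString(wav, 0, 4) != "RIFF" ||
            Encoding.ASCII.GetString(wav, 8, 4) != "WAVE")
        {
            throw new InvalidDataException("Not a RIFF/WAVE file.");
        }

        int pos = 12; // first chunk starts right after "RIFF<size>WAVE"
        while (pos + 8 <= wav.Length)
        {
            string id = Encoding.ASCII.GetString(wav, pos, 4);
            int size = BitConverter.ToInt32(wav, pos + 4); // assumes little-endian host
            if (size < 0)
            {
                throw new InvalidDataException("Corrupt chunk size.");
            }

            if (id == "fmt ")
            {
                if (size < 16 || pos + 8 + 16 > wav.Length)
                {
                    throw new InvalidDataException("Truncated fmt chunk.");
                }
                short formatTag = BitConverter.ToInt16(wav, pos + 8);  // 1 = PCM
                short channels = BitConverter.ToInt16(wav, pos + 10);
                int sampleRate = BitConverter.ToInt32(wav, pos + 12);
                short bitsPerSample = BitConverter.ToInt16(wav, pos + 22);
                return (formatTag, channels, sampleRate, bitsPerSample);
            }

            // Skip to the next chunk; chunks are padded to an even length.
            long next = (long)pos + 8 + size + (size & 1);
            if (next > wav.Length)
            {
                break;
            }
            pos = (int)next;
        }

        throw new InvalidDataException("No fmt chunk found.");
    }
}
```

With inputAudio from the code above, var fmt = WavHeader.ReadFormat(inputAudio); gives the sample rate to put in SampleRateHertz. If fmt.FormatTag is 1 and fmt.BitsPerSample is 16, AudioEncoding.Linear16 should be the matching encoding. As far as I know the AudioEncoding enum has no AAC value, which is probably why the original .AAC bytes were never recognized regardless of the settings.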