How to send audio to Dialogflow using the C# library (DetectIntent)

I am using the Dialogflow C# Library Google.Cloud.Dialogflow.V2 to communicate with my Dialogflow Agent.

Everything works fine when sending text using DetectIntentAsync().

My issue is with sending audio, specifically in the .AAC format.

To send audio using DetectIntentAsync(), we need to create a DetectIntentRequest like below:


            DetectIntentRequest detectIntentRequest = new DetectIntentRequest
            {
                InputAudio = **HERE WHERE I HAVE AN ISSUE**,
                QueryInput = queryInput,
                Session = "projects/" + _sessionName.ProjectId + "/agent/sessions/" + _sessionName.SessionId
            };

The QueryInput is configured with an AudioConfig like below:

            QueryInput queryInput = new QueryInput
            {
                AudioConfig = audioConfig,
            };

The AudioConfig is configured like below:

            var audioConfig = new InputAudioConfig
            {
                AudioEncoding = **HAVING ISSUE HERE ON HOW TO CHOOSE THE CORRECT ENCODING**,
                LanguageCode = "en-US",
                ModelVariant = SpeechModelVariant.Unspecified,
                SampleRateHertz = **HAVING ISSUE HERE ON HOW TO CHOOSE THE CORRECT SAMPLE RATE HERTZ**,
            };

PROBLEM

How do I figure out which SampleRateHertz to choose?
How do I figure out which AudioEncoding to choose?
How do I provide the correct Protobuf.ByteString to InputAudio?
If I want to use formats other than .AAC, how can I provide the needed info automatically?

WHAT I TESTED

I got the byte[] from a URL:

            // The audio is a .AAC file
            string audio = "https://cdn.fbsbx.com/v/t59.3654-21/72342591_3243833722299817_3308062589669343232_n.aac/audioclip-1575911942672-2279.aac?_nc_cat=102&_nc_ohc=heP60KND_DMAQl5-tE77rKNtUzHw_aILXdKfPPejdr7YVqzbYLQRv9BWA&_nc_ht=cdn.fbsbx.com&oh=1c4dbf0a64e0d1fb057b79354c17ca1c&oe=5DF17429";
            byte[] audioBytes;
            using (var webClient = new WebClient())
            {
                audioBytes = webClient.DownloadData(audio);
            }

Then I added that to the DetectIntentRequest like below:

            DetectIntentRequest detectIntentRequest = new DetectIntentRequest
            {
                InputAudio = Google.Protobuf.ByteString.CopyFrom(audioBytes),
                QueryInput = queryInput,
                Session = "projects/" + _sessionName.ProjectId + "/agent/sessions/" + _sessionName.SessionId
            };

If I do not specify the SampleRateHertz, I get the following error:

Error: "{"Status(StatusCode=InvalidArgument, Detail=\"Invalid input audio or config. Unable to calculate audio duration. Possibly no audio data sent.\")"} "

The error went away once I specified a SampleRateHertz value, but this is the response I keep getting no matter which values I use for AudioEncoding and SampleRateHertz:

Response: {{ "languageCode": "en" }}

Everything else in the DetectIntentResponse is null.

Any guidance or help is appreciated.

Thank you

Solution

For anyone facing the same .AAC issue with Dialogflow, I got it working by converting the audio to WAV first, like below:

            // MediaFoundationReader and WaveFileWriter come from the NAudio package;
            // Directory, File and Path come from System.IO.
            var queryAudio = new InputAudioConfig
            {
                LanguageCode = LanguageCode,
                ModelVariant = SpeechModelVariant.Unspecified,
            };

            QueryInput queryInput = new QueryInput
            {
                AudioConfig = queryAudio,
            };

            var filename = "fileName.wav";
            var wavPath = Path.Combine(path, filename);

            // userAudioInput is the URL of the .AAC file.
            // Decode the AAC audio and save it as a WAV file.
            using (var reader = new MediaFoundationReader(userAudioInput))
            {
                Directory.CreateDirectory(path);
                WaveFileWriter.CreateWaveFile(wavPath, reader);
            }

            // Read back the WAV file that was just saved.
            byte[] inputAudio = File.ReadAllBytes(wavPath);

            DetectIntentRequest detectIntentRequest = new DetectIntentRequest
            {
                InputAudio = Google.Protobuf.ByteString.CopyFrom(inputAudio),
                QueryInput = queryInput,
                Session = "projects/" + _sessionName.ProjectId + "/agent/sessions/" + _sessionName.SessionId
            };

            // Make the request
            DetectIntentResponse response = await _sessionsClient.DetectIntentAsync(detectIntentRequest);
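
On the SampleRateHertz question: once the audio is a WAV file, the sample rate is stored in the file header, so it can be read from the bytes instead of guessed. Below is a minimal sketch of that idea. The WavHeader class and its Read method are hypothetical helpers written for this post (they are not part of Dialogflow or NAudio), and the sketch assumes a standard RIFF/WAVE file with a "fmt " chunk.

```csharp
using System;
using System.Text;

// Hypothetical helper (not part of any library): reads basic format
// information from the header of a RIFF/WAVE byte array.
public static class WavHeader
{
    public static (int SampleRate, int Channels, int BitsPerSample) Read(byte[] wav)
    {
        if (wav == null || wav.Length < 12 ||
            Encoding.ASCII.GetString(wav, 0, 4) != "RIFF" ||
            Encoding.ASCII.GetString(wav, 8, 4) != "WAVE")
        {
            throw new ArgumentException("Not a RIFF/WAVE file.");
        }

        // Walk the chunks that follow the 12-byte RIFF header.
        // Note: BitConverter uses the machine's byte order; WAV fields are
        // little-endian, so this assumes a little-endian platform.
        int pos = 12;
        while (pos + 8 <= wav.Length)
        {
            string id = Encoding.ASCII.GetString(wav, pos, 4);
            int size = BitConverter.ToInt32(wav, pos + 4);
            if (size < 0)
            {
                throw new ArgumentException("Corrupt chunk size.");
            }

            if (id == "fmt ")
            {
                if (size < 16 || pos + 24 > wav.Length)
                {
                    throw new ArgumentException("Truncated fmt chunk.");
                }
                int channels = BitConverter.ToInt16(wav, pos + 10);
                int sampleRate = BitConverter.ToInt32(wav, pos + 12);
                int bitsPerSample = BitConverter.ToInt16(wav, pos + 22);
                return (sampleRate, channels, bitsPerSample);
            }

            // Chunk data is padded to an even number of bytes.
            pos += 8 + size + (size & 1);
        }

        throw new ArgumentException("No fmt chunk found.");
    }
}
```

With this, WavHeader.Read(inputAudio).SampleRate could be assigned to queryAudio.SampleRateHertz before building the request. If BitsPerSample is 16 and the data is uncompressed PCM, AudioEncoding.Linear16 should be the matching encoding, but I have only confirmed the version above that leaves both values unset.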