c# audio ffmpeg ibm-watson m3u8

Send MP3 audio extracted from m3u8 stream to IBM Watson Speech To Text


I'm extracting audio in MP3 format from an M3U8 live URL, and the final goal is to send the live audio stream to IBM Watson Speech To Text. The m3u8 is obtained by calling an external script via a Process. Then I run FFmpeg to get the audio on stdout. It works if I save the audio to a file, but I don't want to save the extracted audio; I need to send the data directly to the STT service. So far I have proceeded like this:

SpeechToTextService speechToTextService = new SpeechToTextService(sttUsername, sttPassword);
string m3u8Url = "https://something.m3u8";
char[] buffer = new char[48000];
Process ffmpeg = new ProcessHelper(@"ffmpeg\ffmpeg.exe", $"-v 0 -i {m3u8Url} -acodec mp3 -ac 2 -ar 48000 -f mp3 -");

ffmpeg.Start();
int count;
while ((count = ffmpeg.StandardOutput.Read(buffer, 0, 48000)) > 0)
{
    ffmpeg.StandardOutput.Read(buffer, 0, 48000);
    var answer = speechToTextService.RecognizeSessionless(
        audio: buffer.Select(c => (byte)c).ToArray(),
        contentType: "audio/mpeg",
        smartFormatting: true,
        speakerLabels: false,
        model: "en-US_BroadbandModel"
    );
    // Get answer.ResponseJson, deserializing, clean buffer, etc...
}

When I request the transcription, I get this error:

An unhandled exception of type 'System.AggregateException' occurred in IBM.WatsonDeveloperCloud.SpeechToText.v1.dll: 'One or more errors occurred. (The API query failed with status code BadRequest: Bad Request | x-global-transaction-id: bd6cd203720a70d83b9a03451fe28973 | X-DP-Watson-Tran-ID: bd6cd203720a70d83b9a03451fe28973)'
 Inner exceptions found, see $exception in variables window for more details.
 Innermost exception     IBM.WatsonDeveloperCloud.Http.Exceptions.ServiceResponseException : The API query failed with status code BadRequest: Bad Request | x-global-transaction-id: bd6cd203720a70d83b9a03451fe28973 | X-DP-Watson-Tran-ID: bd6cd203720a70d83b9a03451fe28973
   at IBM.WatsonDeveloperCloud.Http.Filters.ErrorFilter.OnResponse(IResponse response, HttpResponseMessage responseMessage)
   at IBM.WatsonDeveloperCloud.Http.Request.<GetResponse>d__30.MoveNext()
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at IBM.WatsonDeveloperCloud.Http.Request.<AsMessage>d__23.MoveNext()
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at IBM.WatsonDeveloperCloud.Http.Request.<As>d__24`1.MoveNext()

ProcessHelper is just for convenience:

class ProcessHelper : Process
{
    private string command;
    private string arguments;
    public ProcessHelper(string command, string arguments, bool redirectStandardOutput = true)
    {
        this.command = command;
        this.arguments = arguments;
        StartInfo = new ProcessStartInfo()
        {
            FileName = this.command,
            Arguments = this.arguments,
            UseShellExecute = false,
            RedirectStandardOutput = redirectStandardOutput,
            CreateNoWindow = true
        };
    }
}

I'm pretty sure I'm doing it wrong, and I'd love for someone to shed some light on this. Thanks.


Solution

  • I still don't know why RecognizeSessionless rejects my buffer (the second ffmpeg.StandardOutput.Read(buffer, 0, 48000); call was a typo, by the way), but I managed to make it work with WebSockets, as explained here: https://gist.github.com/nfriedly/0240e862901474a9447a600e5795d500
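
One thing worth checking, although nothing in this thread confirms it as the cause of the Bad Request: Process.StandardOutput is a StreamReader, so it decodes ffmpeg's output as text, and casting those chars back to bytes does not reproduce the original MP3 data. The loop in the question also sends the whole 48000-element buffer rather than only the count elements that were read. Below is a minimal sketch of reading the raw bytes from StandardOutput.BaseStream and forwarding exactly what was read. AudioPump, PumpAsync and the send callback are hypothetical names for illustration; they are not part of the Watson SDK or of the linked gist.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

// Hypothetical helper (not from the Watson SDK): copies raw bytes from a
// stream to a caller-supplied sender, one chunk at a time.
static class AudioPump
{
    // Reads from 'source' until end of stream and passes each chunk to 'send'.
    // Returns the total number of bytes forwarded.
    public static async Task<long> PumpAsync(
        Stream source,
        Func<ArraySegment<byte>, Task> send,
        int bufferSize = 8192)
    {
        var buffer = new byte[bufferSize];
        long total = 0;
        int count;
        while ((count = await source.ReadAsync(buffer, 0, buffer.Length)) > 0)
        {
            // Forward only the bytes actually read, never the whole buffer.
            await send(new ArraySegment<byte>(buffer, 0, count));
            total += count;
        }
        return total;
    }
}

// Usage sketch, assuming 'ffmpeg' is the started process from the question and
// 'socket' is an already connected System.Net.WebSockets.ClientWebSocket:
//
//   await AudioPump.PumpAsync(
//       ffmpeg.StandardOutput.BaseStream,   // raw bytes, no text decoding
//       chunk => socket.SendAsync(chunk, WebSocketMessageType.Binary,
//                                 true, CancellationToken.None));
```

Taking the sender as a callback keeps the byte handling separate from the transport, so the same loop can feed a WebSocket, as in the approach that worked, or any other sink that accepts binary chunks.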