c# naudio wasapi

What is the structure of a NAudio capture?


I am trying to send PCM audio over UDP via WLAN. The audio is the data I get from NAudio's WasapiLoopbackCapture. When the DataAvailable event is raised, I send the buffer in one message:

var capture = new WasapiLoopbackCapture();
capture.DataAvailable += RecorderOnDataAvailable;

static void RecorderOnDataAvailable(object sender, WaveInEventArgs waveInEventArgs)
{
    // Buffer can be larger than the audio it currently holds;
    // only the first BytesRecorded bytes are valid in this callback.
    byte[] msg = waveInEventArgs.Buffer;
    int count = waveInEventArgs.BytesRecorded;
    udpClient.Send(msg, count);
    Console.WriteLine(count);
    Console.WriteLine(ByteArrayToString(waveInEventArgs.Buffer));
}

A Python script on the other end has to receive and store the bytes so that it can later convert them into signed integers (I need to visualize the audio). To do that, I need to know how the data is laid out when it is written to the UDP packets.

I tried looking at the hex of the bytes:

008059380080f1380080dc3800c007390040193900c03039004039390080633900404c390090873900e0583900a09b39
00206a390010b13900e07d390030c43900c0803900a0cf3900c06f3900b0d23900a05d3900a0d0390040573900c0c939
00e04f3900c0b53900e0443900608d390020443900a02f3900c04f3900408e3800e05a390000ccb700405939000007b9
00805139001082b9006052390010bcb900a05a390040eab900205f39001808ba0020593900f018ba00604d39001825ba
00e04039004829ba00202f3900c026ba00000e39007821ba0000b23800481cba0000e43700c817ba0000d8b7005814ba
00809bb8000811ba00c002b900700dba00203bb900a00bba008064b900000cba000070b900200eba00e063b9003012ba
006053b9008817ba00804bb900601dba00c03bb9006821ba004013b900b021ba0080b1b800701fba00009bb700a01cba
0080723800b016ba00801539004807ba00406e390030dab900f09c3900e09ab900f0b53900c02fb90010c3390000f9b7...

Still, I can't make sense of it, because this is basically my first time working with bytes at this level.

So, what is the structure of the captured audio, and does it depend on the system I am on? How do I go about converting the bytes to integers? Thanks in advance.


Solution

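  • WASAPI loopback capture delivers audio in the device's shared-mode mix format, which is normally 32-bit IEEE floating-point samples, little-endian, with the channels interleaved (for stereo: left, right, left, right, ...). The sample rate and channel count come from the audio device's settings, so they can differ between systems; check capture.WaveFormat on the sending side. For visualization, it would make sense to convert the floats to 16-bit integers.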
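
On the Python side, the conversion could look roughly like the sketch below. It assumes each packet holds interleaved little-endian float32 stereo samples, which matches the hex dump in the question but should be confirmed against capture.WaveFormat. The function names decode_float32_stereo and float_to_int16 are made up for this example and are not part of NAudio or any library.

```python
import struct

BYTES_PER_SAMPLE = 4   # 32-bit IEEE float
CHANNELS = 2           # assumption: stereo; confirm with capture.WaveFormat


def decode_float32_stereo(payload: bytes):
    """Split one packet of interleaved float32 samples into (left, right) lists."""
    frame_size = BYTES_PER_SAMPLE * CHANNELS
    # Ignore a trailing partial frame, if any.
    usable = len(payload) - len(payload) % frame_size
    count = usable // BYTES_PER_SAMPLE
    samples = struct.unpack("<%df" % count, payload[:usable])
    return list(samples[0::2]), list(samples[1::2])


def float_to_int16(samples):
    """Scale floats in [-1.0, 1.0] to signed 16-bit integers, clipping out-of-range values."""
    result = []
    for s in samples:
        s = max(-1.0, min(1.0, s))
        result.append(int(round(s * 32767)))
    return result


# First 16 bytes of the hex dump from the question: two stereo frames.
packet = bytes.fromhex("008059380080f1380080dc3800c00739")
left, right = decode_float32_stereo(packet)
print(left, right)
print(float_to_int16(left), float_to_int16(right))
```

The values in that dump are very small (around 0.00005), so after scaling to 16 bits they land at only a few counts. That is what near-silent audio looks like, not a decoding error. In the real receiver, the bytes returned by socket.recvfrom would be passed to decode_float32_stereo in place of the hard-coded packet.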