UWP AudioGraph: convert 32-bit float to 16-bit PCM


I need to pass an audio recording from the mic to a buffer, and then from the buffer to the speakers (I send the buffer over the network). My configuration: Mic->AudioFrameOutput->Network->AudioFrameInput->Speakers.

I need the recording to be in 16 bits/sample PCM (for the network). The AudioGraph documentation mentions that it only supports the 32-bit float format. How can I convert the 32-bit recording to 16-bit and then play the recording?

Thanks, Tony


Solution

  • Converting 32-bit floats to 16-bit integers is a very common need in streaming audio. Here we convert one element of your 32-bit float buffer (array) into an unsigned 16-bit integer, with the input float varying from -1 to +1. The conversion is lossy, since 32 bits do not fit into 16 bits.

    my_16_bit_unsigned_int = int((input_32_bit_floats[index] + 1.0) * 32768) - 1;
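
As a minimal sketch of the formula above applied to a whole buffer (Python is used for illustration; the function name `float_to_uint16` and the clamping step are assumptions added here, not part of the original formula):

```python
def float_to_uint16(samples):
    """Map floats in [-1.0, +1.0] to unsigned 16-bit integers (0..65535)."""
    out = []
    for x in samples:
        value = int((x + 1.0) * 32768) - 1
        # Clamp: inputs at or very near -1.0 would otherwise yield -1,
        # and inputs above +1.0 would not fit in 16 bits.
        value = max(0, min(65535, value))
        out.append(value)
    return out
```

The clamp keeps every result inside the 16-bit unsigned range even when the input strays slightly outside -1 to +1.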
    

    When working with audio data at this low level, you are exposed to several fundamental design decisions:

    • does the input wave of floats vary from -1 to +1, from -0.5 to +0.5, from 0 to +1, or over some other range?
    • should the output 16-bit PCM be signed or unsigned? (16-bit PCM is typically signed, as in WAV files; unsigned is the usual convention for 8-bit PCM)
    • is the byte ordering big endian or little endian? This matters when sending memory buffers over the wire (typically little endian), in particular when you need to collapse a 16-bit integer buffer into a byte stream
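
To make the last two decisions concrete, here is a sketch (Python, standard library `struct` only) that packs floats as signed 16-bit little-endian bytes for the wire and converts them back to floats on the receiving side. The function names and the scale factor 32767 are assumptions chosen for illustration; other scaling conventions exist.

```python
import struct

def floats_to_pcm16le(samples):
    """Pack floats in [-1.0, +1.0] as signed 16-bit little-endian bytes."""
    ints = []
    for x in samples:
        x = max(-1.0, min(1.0, x))      # clamp out-of-range input
        ints.append(int(round(x * 32767)))
    # '<' = little endian, 'h' = signed 16-bit integer
    return struct.pack("<%dh" % len(ints), *ints)

def pcm16le_to_floats(data):
    """Unpack signed 16-bit little-endian bytes back into floats."""
    count = len(data) // 2
    ints = struct.unpack("<%dh" % count, data[:count * 2])
    return [i / 32767 for i in ints]
```

On the receiving end, something like `pcm16le_to_floats` would run before the samples are handed back to a 32-bit float playback path.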

    With those questions answered for your data, note that the equation above assumes the input 32-bit float representation of the audio wave varies from -1.0 to +1.0 (typical).

    Where does the value 32768 come from? 16-bit integers have 2^16 distinct values, ranging from 0 to (2^16 - 1), that is 0 to 65535. If your input float varies from -1 to +1, we first add 1 so that it varies from 0 to +2, which makes the output unsigned (no negative numbers). We then multiply by 32768 (2^15) to stretch that range to 0 through 2^16, and subtract 1 so that the top of the range lands on 65535. Note that inputs within 1/32768 of -1 then come out as -1, so clamp the result to 0 if your data can reach that low.

    Let's break it down with concrete examples

    • again the input 32-bit floats vary from -1.0 to +1.0; strictly, the range is -1 < value < 1

    example A

    inputA = -0.999   #   close to minimum possible value
    
    outputA = int((input_32_bit_floats[index] + 1.0) * 32768) - 1;
    
    outputA = int(( -0.999 + 1.0) * 32768) - 1;
    outputA = int( 0.001 * 32768) - 1;
    outputA = int( 32.768) - 1;    
    outputA = 32 - 1;
    outputA = 31;     #    close to min possible value of 0
    

    example B

    inputB = 0.999   #   almost max possible value 
    
    outputB = int((input_32_bit_floats[index] + 1.0) * 32768) - 1;
    outputB = int((0.999  + 1.0) * 32768) - 1;
    outputB = int( 65503.232) - 1;
    outputB = 65503 - 1;
    outputB = 65502  #   close to our max possible value of 65535
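
To check the arithmetic in examples A and B, the same expression can be run directly. This is a small Python sketch; the helper name `to_uint16` is hypothetical. Keep in mind that `int()` truncates toward zero rather than rounding.

```python
def to_uint16(x):
    # Same expression as in examples A and B.
    return int((x + 1.0) * 32768) - 1

output_a = to_uint16(-0.999)   # input close to the minimum
output_b = to_uint16(0.999)    # input close to the maximum
print(output_a, output_b)
```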
    

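    A multiplication by a power of 2 can normally be replaced by a left bit shift, where the number of bit positions is the exponent of the power of 2 being replaced (32768 = 2^15, so shift left by 15). That trick only works on integers, though, so it does not apply here: the float sum would have to be truncated to an integer before the shift, which throws away the fractional part that carries the signal.

    outputA = int((input_32_bit_floats[index] + 1.0) * 32768) - 1;
    

    is therefore not equivalent to

    outputA = ( int(input_32_bit_floats[index] + 1.0)  << 15) - 1;   # only ever yields -1, 32767 or 65535

    Keep the floating point multiplication and truncate afterwards.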
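
A quick sketch to compare the two expressions (Python; the sample value 0.25 is arbitrary). Because a shift operates on integers, the float sum is truncated before shifting, so the two forms give different results for fractional inputs; the shift is an exact replacement only when the operand is already an integer.

```python
x = 0.25  # hypothetical sample value

by_multiply = int((x + 1.0) * 32768) - 1
by_shift = (int(x + 1.0) << 15) - 1   # int() truncates 1.25 to 1 before the shift

# For integer operands the shift really is the same as multiplying by 2**15.
n = 3
assert (n << 15) == n * 32768

print(by_multiply, by_shift)
```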