c, type-conversion, pcm

Represent two SInt16 samples as a single UInt32 sample


My question is related to this question but it is in reverse.

Split UInt32 (audio frame) into two SInt16s (left and right)?

In the link above, the OP wanted to split a 32-bit frame into two 16-bit samples (left and right).

I have two 16-bit samples and I would like to express them as a single 32-bit interleaved frame.

What I have attempted so far looks like this.

UInt32 *ar = malloc(totalFramesInFile * sizeof(*ar));

for (int b=0; b < totalFramesInFile; b++)
{
  UInt32 l = soundStructArray[audioFile].audioDataLeft[b];
  UInt32 r = soundStructArray[audioFile].audioDataRight[b];

  UInt32 t = (UInt32) l + r;

  ar[b] = t;
}
soundStructArray[audioFile].audioData = ar;

Is this legitimate and correct? I'm not sure, due to my inexperience. My audio output sounds a little peculiar, and I'm trying to work out what's going wrong by process of elimination.

It would be helpful if somebody could either confirm that this is the correct way to express two 16-bit samples as a 32-bit frame, or suggest a correct way.

I have a feeling that what I have done is wrong, because the first 16 bits of the 32-bit frame should be the left sample and the second 16 bits should be the right. The whole thing shouldn't be the sum of the two, I think.


Solution

  • You need to change:

    UInt32 t = (UInt32) l + r;
    

    to:

    UInt32 t = (l << 16) | (r & 0xffff);
    

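    This puts l in the most significant 16 bits of t and r in the least significant 16 bits. The & 0xffff mask clears any sign-extension bits that a negative r picks up when it is widened to 32 bits.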

    Detailed explanation

    If you have two 16-bit samples, l and r, which look like this in binary:

    l: LLLLLLLLLLLLLLLL
    r: RRRRRRRRRRRRRRRR
    

    let's first extend them to 32 bits:

    l: 0000000000000000LLLLLLLLLLLLLLLL
    r: 0000000000000000RRRRRRRRRRRRRRRR
    

    Now let's shift l left by 16 bits (l << 16)

    l: LLLLLLLLLLLLLLLL0000000000000000
    r: 0000000000000000RRRRRRRRRRRRRRRR
    

    Now let's OR (|) them together:

    t: LLLLLLLLLLLLLLLLRRRRRRRRRRRRRRRR
    

    Voilà! Now you have l and r combined into a 32-bit value from which they can easily be extracted again later.
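The shift-and-OR steps above, together with the reverse extraction, can be sketched as follows. This is only a minimal illustration, not the asker's actual code: it uses the standard `<stdint.h>` types in place of `UInt32` and `SInt16`, and the names `pack_frame`, `frame_left`, `frame_right` and `interleave` are hypothetical helpers made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pack one stereo frame: left in the high 16 bits, right in the low 16 bits.
   Casting through uint16_t first keeps a negative sample from being
   sign-extended into the other channel's half. */
uint32_t pack_frame(int16_t left, int16_t right)
{
    return ((uint32_t)(uint16_t)left << 16) | (uint32_t)(uint16_t)right;
}

/* Recover the channels. Converting a uint16_t above 32767 back to int16_t
   is implementation-defined before C23, but wraps to the original negative
   value on two's-complement targets. */
int16_t frame_left(uint32_t frame)  { return (int16_t)(uint16_t)(frame >> 16); }
int16_t frame_right(uint32_t frame) { return (int16_t)(uint16_t)(frame & 0xffffu); }

/* Hypothetical helper mirroring the loop in the question:
   combines n left/right sample pairs into n 32-bit frames. */
void interleave(const int16_t *left, const int16_t *right, uint32_t *out, size_t n)
{
    assert(left != NULL && right != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        out[i] = pack_frame(left[i], right[i]);
    }
}
```

With these definitions, `pack_frame(-2, 3)` gives `0xFFFE0003`, and `frame_left` and `frame_right` return -2 and 3 again. Which half of the 32-bit value comes first in memory depends on the byte order of the machine, so it is worth checking the result against the layout your audio API expects.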