c#, unsafe-pointers, bitconverter

Improve performance of BitConverter.ToInt16


I am collecting data from a USB device and this data has to go to an audio output component. At the moment I am not delivering the data fast enough to avoid clicks in the output signal. So every millisecond counts.

At the moment I am collecting the data, which is delivered in a byte array of 65536 bytes. The first two bytes represent 16 bits of data in little-endian format and must be placed in the first element of a double array. The next two bytes must be placed in the first element of a different double array. This is repeated for all the bytes in the 65536-byte buffer, so that you end up with two double[] arrays of size 16384.

I am currently using BitConverter.ToInt16 as shown in the code. It takes around 0.3 ms to run, but it has to be done 10 times to get a packet to send off to the audio output. So the overhead is 3 ms, which is just enough that some packets eventually are not delivered on time.

Code

byte[] buffer = new byte[65536];
double[] bufferA = new double[16384];
double[] bufferB = new double[16384];

// Each group of 4 bytes holds two little-endian 16-bit samples.
for (int i = 0; i < 65536; i += 4)
{
    bufferA[i / 4] = BitConverter.ToInt16(buffer, i);     // bytes i, i+1
    bufferB[i / 4] = BitConverter.ToInt16(buffer, i + 2); // bytes i+2, i+3
}

How can I improve this? Is it possible to copy the values with unsafe code? I have no experience in that. Thanks


Solution

  • This gets me about triple the speed in a release build, using pointers and unsafe code. There may be other micro-optimisations; however, I'll leave those details up to the masses.

    Updated

    My original algorithm had a bug and could be improved.

    Modified Code

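    // Requires compiling with unsafe code enabled (/unsafe or AllowUnsafeBlocks).
    // The scale parameter is not used in the body.
    public unsafe (double[], double[]) Test2(byte[] input, int scale)
    {
       // One output element per 4 input bytes. input.Length should be a multiple of 4,
       // otherwise the last iteration reads and writes out of bounds.
       var bufferA = new double[input.Length / 4];
       var bufferB = new double[input.Length / 4];
    
       // Pin all three arrays so the GC cannot move them while raw pointers are in use.
       fixed (byte* pSource = input)
       fixed (double* pBufferA = bufferA, pBufferB = bufferB)
       {
          var pLen = pSource + input.Length; // one past the last input byte
          double* pA = pBufferA, pB = pBufferB;
    
          for (var pS = pSource; pS < pLen; pS += 4, pA++, pB++)
          {
             // Reads use the CPU's native byte order (little endian on x86/x64).
             *pA = *(short*)pS;        // bytes 0-1 -> first array
             *pB = *(short*)(pS + 2);  // bytes 2-3 -> second array
          }
       }
    
       return (bufferA, bufferB);
    }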
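
    For readers who cannot enable unsafe code, the same de-interleaving can be sketched with plain array indexing and bit shifts. This is a swapped-in technique, not the method benchmarked in this answer: the names Deinterleave and Split are hypothetical, and whether it is faster than either version above has not been measured here. Unlike the pointer version, it assembles each sample explicitly as little endian, so it does not depend on the CPU's byte order.

    ```csharp
    using System;

    public static class Deinterleave
    {
        // Hypothetical helper (not from the answer): splits interleaved 16-bit
        // little-endian samples into two double[] channels using only safe code.
        public static (double[] A, double[] B) Split(byte[] input)
        {
            if (input == null) throw new ArgumentNullException(nameof(input));
            if (input.Length % 4 != 0)
                throw new ArgumentException("Length must be a multiple of 4.", nameof(input));

            int frames = input.Length / 4;
            var bufferA = new double[frames];
            var bufferB = new double[frames];

            for (int f = 0, i = 0; f < frames; f++, i += 4)
            {
                // Low byte first, high byte second (little endian). The cast to
                // short restores the sign before the implicit widening to double.
                bufferA[f] = unchecked((short)(input[i] | (input[i + 1] << 8)));
                bufferB[f] = unchecked((short)(input[i + 2] | (input[i + 3] << 8)));
            }

            return (bufferA, bufferB);
        }
    }
    ```

    For example, the input bytes 0x34 0x12 0xFF 0xFF decode to 4660 in the first array and -1 in the second.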
    

    Benchmarks

    Each test is run 1000 times, with a garbage collection before each run, and scaled to various array lengths. All results are checked against the OP's original version.
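
    The harness used for these numbers is not shown. As a rough sketch of the procedure described (repeated runs, a garbage collection before each one, average and fastest times recorded), something like the following could be used. MiniBench and Measure are hypothetical names, and the warm-up call is an assumption, not something the answer states.

    ```csharp
    using System;
    using System.Diagnostics;

    public static class MiniBench
    {
        // Hypothetical harness, not the one that produced the results in this answer.
        // Returns the average and fastest time per run, in microseconds.
        public static (double AverageUs, double FastestUs) Measure(Action action, int runs = 1000)
        {
            if (action == null) throw new ArgumentNullException(nameof(action));
            if (runs <= 0) throw new ArgumentOutOfRangeException(nameof(runs));

            action(); // warm-up (assumption) so JIT compilation is not timed

            double total = 0, fastest = double.MaxValue;
            var sw = new Stopwatch();

            for (int r = 0; r < runs; r++)
            {
                // Collect before each run so earlier allocations do not skew timing.
                GC.Collect();
                GC.WaitForPendingFinalizers();
                GC.Collect();

                sw.Restart();
                action();
                sw.Stop();

                double us = sw.Elapsed.TotalMilliseconds * 1000.0;
                total += us;
                if (us < fastest) fastest = us;
            }

            return (total / runs, fastest);
        }
    }
    ```

    A call such as MiniBench.Measure(() => Test2(data, 0)) would then time one of the methods, where data is a pre-filled byte array.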

    Test Environment

    ----------------------------------------------------------------------------
    Mode             : Release (64Bit)
    Test Framework   : .NET Framework 4.7.1 (CLR 4.0.30319.42000)
    ----------------------------------------------------------------------------
    Operating System : Microsoft Windows 10 Pro
    Version          : 10.0.17134
    ----------------------------------------------------------------------------
    CPU Name         : Intel(R) Core(TM) i7-3770K CPU @ 3.50GHz
    Description      : Intel64 Family 6 Model 58 Stepping 9
    Cores (Threads)  : 4 (8)      : Architecture  : x64
    Clock Speed      : 3901 MHz   : Bus Speed     : 100 MHz
    L2Cache          : 1 MB       : L3Cache       : 8 MB
    ----------------------------------------------------------------------------
    

    Results

    --- Random Set of byte ------------------------------------------------------
    | Value    |    Average |    Fastest |    Cycles | Garbage | Test |    Gain |
    --- Scale 16,384 -------------------------------------------- Time 13.727 ---
    | Unsafe   |  19.487 µs |  14.029 µs |  71.479 K | 0.000 B | Pass | 59.02 % |
    | Original |  47.556 µs |  34.781 µs | 169.580 K | 0.000 B | Base |  0.00 % |
    --- Scale 32,768 -------------------------------------------- Time 14.809 ---
    | Unsafe   |  40.398 µs |  31.274 µs | 145.024 K | 0.000 B | Pass | 56.62 % |
    | Original |  93.127 µs |  79.501 µs | 329.320 K | 0.000 B | Base |  0.00 % |
    --- Scale 65,536 -------------------------------------------- Time 18.984 ---
    | Unsafe   |  68.318 µs |  43.550 µs | 245.083 K | 0.000 B | Pass | 68.34 % |
    | Original | 215.758 µs | 160.171 µs | 758.955 K | 0.000 B | Base |  0.00 % |
    --- Scale 131,072 ------------------------------------------- Time 22.620 ---
    | Unsafe   | 120.764 µs |  79.208 µs | 428.626 K | 0.000 B | Pass | 71.24 % |
    | Original | 419.889 µs | 322.388 µs |   1.461 M | 0.000 B | Base |  0.00 % |
    -----------------------------------------------------------------------------
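
    The Gain column is not defined in the output. On the assumption that it is the relative reduction in average time against the Original baseline, the figures check out, for instance for the 16,384 row:

    ```latex
    \text{Gain} = \frac{t_{\text{Original}} - t_{\text{Unsafe}}}{t_{\text{Original}}}
                = \frac{47.556 - 19.487}{47.556} \approx 59.02\,\%
    ```

    Read as a speed-up ratio, the 65,536 row gives 215.758 / 68.318, or roughly 3.2 times faster, which matches the "about triple" estimate at the question's buffer size.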