
Efficient way to swap bytes in python


I have a bytearray of length 2*n:

a1 a2 b1 b2 c1 c2

I need to swap the byte order (endianness) within each 2-byte word, to get:

a2 a1 b2 b1 c2 c1

I currently use the following approach, but it is too slow for my task:

converted = bytearray([])
for i in range(int(len(chunk)/2)):
   converted += bytearray([ chunk[i*2+1], chunk[i*2] ])

Is it possible to swap the endianness of a bytearray by calling some system/libc function?


OK, thanks to all. I timed some of the suggestions:

import timeit

test = [
"""
converted = bytearray([])
for i in range(int(len(chunk)/2)):
   converted += bytearray([ chunk[i*2+1], chunk[i*2] ])
""",
"""
for i in range(0, len(chunk), 2):
    chunk[i], chunk[i+1] = chunk[i+1], chunk[i]
""",
"""
byteswapped = bytearray([0]) * len(chunk)
byteswapped[0::2] = chunk[1::2]
byteswapped[1::2] = chunk[0::2]
""",
"""
chunk[0::2], chunk[1::2] = chunk[1::2], chunk[0::2]
"""
]

for t in test:
    print(timeit.timeit(t, setup='chunk = bytearray([1]*10)'))

and the result is:

$ python ti.py
11.6219761372
2.61883187294
3.47194099426
1.66421198845

So in-place slice assignment with a step of 2 is the fastest so far. Thanks also to Mr. F for the detailed explanation; I have not tried that approach yet because it uses numpy.
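
The timing setup above uses bytearray([1]*10), where every byte is identical, so it cannot show whether a variant actually swaps anything. A minimal sanity check might look like the sketch below. It assumes an even-length input, and the function names swap_loop and swap_slices_inplace are made up for illustration:

```python
def swap_loop(chunk):
    """Original approach from the question: build a new bytearray pair by pair."""
    converted = bytearray()
    for i in range(len(chunk) // 2):
        converted += bytearray([chunk[i * 2 + 1], chunk[i * 2]])
    return converted


def swap_slices_inplace(chunk):
    """In-place slice assignment with a step of 2; mutates and returns chunk.

    Assumes len(chunk) is even; an odd length raises ValueError because the
    two extended slices then have different sizes.
    """
    chunk[0::2], chunk[1::2] = chunk[1::2], chunk[0::2]
    return chunk


# Distinct byte values standing in for "a1 a2 b1 b2 c1 c2" from the question.
sample = bytearray(b"\xa1\xa2\xb1\xb2\xc1\xc2")
expected = bytearray(b"\xa2\xa1\xb2\xb1\xc2\xc1")

loop_result = swap_loop(sample)
slice_result = swap_slices_inplace(bytearray(sample))  # work on a copy

assert loop_result == expected
assert slice_result == expected
```

Passing a copy to the in-place variant keeps sample unchanged, so both variants are checked against the same input.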


Solution

  • You could use slice assignment with a step of 2:

    byteswapped = bytearray(len(original))
    byteswapped[0::2] = original[1::2]
    byteswapped[1::2] = original[0::2]
    

    Or, if you want to do it in place:

    original[0::2], original[1::2] = original[1::2], original[0::2]
    

    Timing shows that slicing massively outperforms a Python-level loop for large arrays (roughly 40 times faster in the runs below):

    >>> timeit.timeit('''
    ... for i in range(0, len(chunk), 2):
    ...     chunk[i], chunk[i+1] = chunk[i+1], chunk[i]''',
    ... 'chunk=bytearray(1000)')
    81.70195105159564
    >>>
    >>> timeit.timeit('''
    ... byteswapped = bytearray(len(original))
    ... byteswapped[0::2] = original[1::2]
    ... byteswapped[1::2] = original[0::2]''',
    ... 'original=bytearray(1000)')
    2.1136113323948393
    >>>
    >>> timeit.timeit('chunk[0::2], chunk[1::2] = chunk[1::2], chunk[0::2]', 'chunk=bytearray(1000)')
    1.79349659994989
    

    For small arrays, slicing still beats the explicit loop, but the difference isn't as big:

    >>> timeit.timeit('''
    ... for i in range(0, len(chunk), 2):
    ...     chunk[i], chunk[i+1] = chunk[i+1], chunk[i]''',
    ... 'chunk=bytearray(10)')
    1.2503637694328518
    >>>
    >>> timeit.timeit('''
    ... byteswapped = bytearray(len(original))
    ... byteswapped[0::2] = original[1::2]
    ... byteswapped[1::2] = original[0::2]''',
    ... 'original=bytearray(10)')
    0.8973060929306484
    >>>
    >>> timeit.timeit('chunk[0::2], chunk[1::2] = chunk[1::2], chunk[0::2]', 'chunk=bytearray(10)')
    0.6282232971918802
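
The question also asks about a library-level call. One option not timed above is the standard library's array module, whose array.byteswap() method swaps the bytes of every item in C. The sketch below is only an illustration: the helper name swap16_with_array is hypothetical, it assumes Python 3 and that array('H') items are 2 bytes wide (checked at runtime), and no claim is made about how its speed compares with the slicing timings above.

```python
from array import array


def swap16_with_array(data):
    """Return a new bytearray with the two bytes of each 16-bit word swapped."""
    words = array("H")  # "H" = unsigned short
    if words.itemsize != 2:
        # Assumption check: unsigned short is 2 bytes on common platforms.
        raise RuntimeError("array('H') is not 2 bytes wide on this platform")
    words.frombytes(bytes(data))  # raises ValueError if len(data) is odd
    words.byteswap()              # swaps the bytes within every item
    return bytearray(words.tobytes())


original = bytearray(b"\xa1\xa2\xb1\xb2\xc1\xc2")
swapped = swap16_with_array(original)
assert swapped == bytearray(b"\xa2\xa1\xb2\xb1\xc2\xc1")
```

Unlike the in-place slice assignment, this version copies the data into and out of the array, so it is worth timing on your own chunk sizes before adopting it.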