Transposing `memoryview` in pure Python


Is there a pure Python way of transposing a memoryview?


Python memoryviews can represent more than just a 1-dimensional chunk of bytes. They can represent multidimensional layouts, noncontiguous memory, complex element types, and more. For example, in the following code:

In [1]: import numpy

In [2]: x = numpy.array([[1, 2], [3, 4]])

In [3]: y = x.T

In [4]: a = memoryview(x)

In [5]: b = memoryview(y)

a and b are 2-by-2 multidimensional memoryviews:

In [6]: a.shape
Out[6]: (2, 2)

In [7]: b.shape
Out[7]: (2, 2)

and b represents a transpose of a, so a[i, j] and b[j, i] alias the same memory (which is cell i, j of the original x array):

In [8]: a[0, 1] = 5

In [9]: b[1, 0]
Out[9]: 5

In [10]: x
Out[10]: 
array([[1, 5],
       [3, 4]])

NumPy arrays support easy transposes, but NumPy arrays aren't the only sources of multidimensional memoryviews. For example, you can cast a single-dimensional memoryview:

In [11]: bytearr = bytearray([1, 2, 3, 4])

In [12]: mem = memoryview(bytearr).cast('b', (2, 2))

In [13]: mem.shape
Out[13]: (2, 2)

In [14]: mem[1, 0] = 5

In [15]: bytearr
Out[15]: bytearray(b'\x01\x02\x05\x04')

The memoryview format is flexible enough to represent a transpose of mem, like what b was to a in our earlier example, but there doesn't seem to be an easy transpose method in the memoryview API. Is there a pure-Python way of transposing arbitrary multidimensional memoryviews?
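To make the layout concrete, here is a small standard-library-only sketch that inspects the metadata of a cast memoryview like `mem`. The "wanted" values at the end are only what a transposed view would have to report; nothing here constructs such a view.

```python
# Inspect the layout metadata of a 2-by-2 view over a bytearray.
bytearr = bytearray([1, 2, 3, 4])
mem = memoryview(bytearr).cast('b', (2, 2))

print(mem.shape)         # (2, 2)
print(mem.strides)       # (2, 1): stepping one row skips 2 bytes, one column 1 byte
print(mem.suboffsets)    # (): no PIL-style indirection
print(mem.c_contiguous)  # True: row-major (C) order
print(mem.f_contiguous)  # False

# A transposed view would keep the same buffer but reverse the per-dimension
# metadata, so that element [i, j] of the transpose is element [j, i] of mem.
wanted_shape = mem.shape[::-1]
wanted_strides = mem.strides[::-1]
print(wanted_shape, wanted_strides)  # (2, 2) (1, 2)
```

The memoryview API exposes `shape` and `strides` as read-only attributes, which is why reversing them by hand is not an option.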


Solution

  • There's no good way without taking dependencies. With NumPy, it's fairly straightforward, as long as the memoryview doesn't have suboffsets:

    transposed = memoryview(numpy.asarray(orig_memoryview).T)
    

    orig_memoryview can be backed by anything - there doesn't have to be a NumPy array behind it.
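As a rough sketch, the one-liner can be wrapped in a helper that rejects memoryviews with suboffsets up front. The name `transpose_view` is hypothetical and not part of NumPy or the standard library; the example assumes NumPy is installed.

```python
import numpy


def transpose_view(view):
    """Return a transposed memoryview sharing memory with `view`.

    Hypothetical helper name; not part of NumPy or the standard library.
    """
    # A non-negative suboffset marks a PIL-style (pointer-indirected) dimension,
    # which shape and strides alone cannot describe.
    if any(offset >= 0 for offset in view.suboffsets):
        raise ValueError("cannot transpose a memoryview with suboffsets without copying")
    return memoryview(numpy.asarray(view).T)


backing = bytearray([1, 2, 3, 4, 5, 6])
view = memoryview(backing).cast('b', (2, 3))
transposed = transpose_view(view)

print(transposed.shape)    # (3, 2)
print(transposed.strides)  # (1, 3)
transposed[2, 0] = 7       # same cell as view[0, 2]
print(backing)             # bytearray(b'\x01\x02\x07\x04\x05\x06')
```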

    The resulting memoryview is backed by the same memory as the original memoryview, so no data is copied. For example, with the following multidimensional memoryview:

    In [1]: import numpy
    
    In [2]: arr = numpy.array([[1, 2], [3, 4]])
    
    In [3]: mem = memoryview(arr)
    

    we can transpose it:

    In [4]: transposed = memoryview(numpy.asarray(mem).T)
    

    and writes to the transposed memoryview affect the original array:

    In [5]: transposed[0, 1] = 5
    
    In [6]: arr
    Out[6]: 
    array([[1, 2],
           [5, 4]])
    

    Here, writing to cell 0, 1 of the transpose corresponds to cell 1, 0 of the original array.

    This doesn't rely on the original memoryview being backed by a NumPy array. It works fine with memoryviews backed by other things, like bytearrays:

    In [7]: x = bytearray([1, 2, 3, 4])
    
    In [8]: y = memoryview(x).cast('b', (2, 2))
    
    In [9]: transposed = memoryview(numpy.asarray(y).T)
    
    In [10]: transposed[0, 1] = 5
    
    In [11]: y[1, 0]
    Out[11]: 5
    
    In [12]: x
    Out[12]: bytearray(b'\x01\x02\x05\x04')
    

    Without NumPy or a similar dependency, I don't see a good way. The closest thing to a good way would be to use ctypes, but you'd need to hardcode the Py_buffer struct layout for that, and the exact layout of a Py_buffer struct is undocumented. (The field order and types don't quite match the order in which the fields are documented, or the types they're documented with.) Also, for a PIL-style array with suboffsets, there's no way to transpose the memoryview without copying data.

    On the bright side, in most cases where you'd be dealing with multidimensional memoryviews, you'd already have the dependencies you need to transpose them.
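If a copy is acceptable, a standard-library-only fallback is possible. The sketch below does not produce a view of the original memory, so writes to the result do not reach the source buffer. The name `transposed_copy` is hypothetical, and the code assumes a 2-D memoryview whose `format` is a single native type code that `array.array` also accepts (for example 'b', 'i', or 'd').

```python
import array


def transposed_copy(view):
    """Return a new 2-D memoryview holding a transposed copy of `view`.

    Hypothetical helper: assumes `view` is 2-D and `view.format` is a single
    native type code that array.array accepts.
    """
    if view.ndim != 2:
        raise ValueError("this sketch only handles 2-D memoryviews")
    rows, cols = view.shape
    # tolist() yields nested lists; zip(*...) regroups the values by column.
    flat = [value for column in zip(*view.tolist()) for value in column]
    storage = array.array(view.format, flat)
    # cast() needs a byte format on one side, so go through 'B' before
    # attaching the transposed shape.
    return memoryview(storage).cast('B').cast(view.format, (cols, rows))


source = bytearray([1, 2, 3, 4, 5, 6])
view = memoryview(source).cast('b', (2, 3))
copy = transposed_copy(view)

print(copy.shape)     # (3, 2)
print(copy.tolist())  # [[1, 4], [2, 5], [3, 6]]
copy[0, 1] = 9        # modifies only the copy
print(view[1, 0])     # 4
```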