Fast conversion of a C char NumPy array to a list of Python strings


I'm making an interface between Python and Fortran code with Cython. One part of that is retrieving arrays of strings. In Fortran, the array is declared as:

character(len=3) :: str_array(:)

For the sake of this example, suppose str_array contains the following:

allocate(str_array(2))
str_array = ['abc','def']

My approach is to return this to Cython as a single C char array. I end up with a numpy array of byte strings:

import numpy as np
c_str_arr = np.array([b'a', b'b', b'c', b'd', b'e', b'f'], dtype='|S1')

I then convert this NumPy array to a list of Python strings with the following Python code:

str_len = 3
arr_len = 2
c_str_arr.shape = (arr_len,str_len)
str_arr = []
for i in range(arr_len):
    str_arr.append(b''.join(c_str_arr[i]).decode())

But this is pretty slow, because each element is joined and decoded in a Python-level loop.

My question: is there a faster way to convert c_str_arr to a list of Python strings?


Solution

  • Basically, avoid iterating over the NumPy array element by element. This is a bit of a shot in the dark, but try converting the whole array to a single bytes object and slicing that:

    bs = c_str_arr.tobytes()
    str_arr = [bs[i:i+str_len].decode() for i in range(0, str_len*arr_len, str_len)]
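
Another option that may be worth timing is to keep the regrouping inside NumPy: reinterpret the flat '|S1' array as fixed-width byte strings with view, then decode everything in one call. The following is only a sketch under stated assumptions: the helper name chars_to_strings is hypothetical, the data is assumed to be ASCII, and the array length is assumed to be an exact multiple of str_len. Whether it beats the tobytes approach depends on the array size, so measure before adopting it.

```python
import numpy as np

def chars_to_strings(c_str_arr, str_len):
    """Group an array of single bytes into fixed-width Python strings.

    Hypothetical helper for illustration; assumes ASCII data and a total
    length that is an exact multiple of str_len.
    """
    # Flatten and ensure contiguity so each string's bytes sit side by side.
    flat = np.ascontiguousarray(c_str_arr).reshape(-1)
    # Reinterpret every str_len consecutive 1-byte items as one S<str_len> item.
    fixed = flat.view(f"S{str_len}")
    # Decode all items in one vectorized call, then convert to a list of str.
    return np.char.decode(fixed, "ascii").tolist()

c_str_arr = np.array([b'a', b'b', b'c', b'd', b'e', b'f'], dtype='|S1')
str_arr = chars_to_strings(c_str_arr, 3)
print(str_arr)  # ['abc', 'def']
```

Note that NumPy's fixed-width byte-string dtype drops trailing NUL bytes when items are read back, but keeps trailing spaces. Fortran pads character variables with blanks, so shorter strings will likely need an extra rstrip() on each result.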