Replace removed function PyArray_GetCastFunc in numpy 2

I'm migrating a Python C extension to NumPy 2. The extension takes a list of 2D NumPy arrays and generates a new 2D array by combining them (average, median, etc.). The difficulty is that the input and output arrays are byteswapped. I cannot byteswap the input arrays to machine order up front (there are too many of them to fit in memory). So:

I read an element of each input array,
byteswap them to machine order,
cast them to a list of doubles,
perform my operation on the list to obtain a double,
cast the double to the dtype of the output array,
byteswap it again,
and write the result to the output array.
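
A minimal sketch of these steps in plain C, without any NumPy calls, assuming for illustration that the inputs are int16 and the output is float32, with the mean as the operation. The names `byteswap_inplace` and `combine_pixel_mean` are hypothetical helpers, not NumPy API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Reverse the bytes of one element in place (size = item size in bytes). */
static void byteswap_inplace(void *p, size_t size)
{
    unsigned char *b = (unsigned char *)p;
    for (size_t i = 0; i < size / 2; ++i) {
        unsigned char tmp = b[i];
        b[i] = b[size - 1 - i];
        b[size - 1 - i] = tmp;
    }
}

/* Hypothetical helper: combine one element from each of n byteswapped
   int16 frames into one byteswapped float32 output element (mean). */
static void combine_pixel_mean(const char *const *in_ptrs, size_t n, char *out_ptr)
{
    double sum = 0.0;
    assert(n > 0);
    for (size_t i = 0; i < n; ++i) {
        int16_t raw;
        memcpy(&raw, in_ptrs[i], sizeof raw);   /* read one element */
        byteswap_inplace(&raw, sizeof raw);     /* swap to machine order */
        sum += (double)raw;                     /* cast to double */
    }
    float result = (float)(sum / (double)n);    /* cast to the output type */
    byteswap_inplace(&result, sizeof result);   /* swap back */
    memcpy(out_ptr, &result, sizeof result);    /* write to the output */
}
```

The `memcpy` calls avoid unaligned reads and writes, since elements of a byteswapped buffer are not guaranteed to be aligned for their type.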

To achieve this (using numpy 1.x C-API) I was using something like:

PyArray_Descr* descr_in = PyArray_DESCR((PyArrayObject*)input_frame_1);
PyArray_CopySwapFunc* swap_in = descr_in->f->copyswap;
PyArray_VectorUnaryFunc* cast_in = PyArray_GetCastFunc(descr_in, NPY_DOUBLE);
bool need_to_swap_in = PyArray_ISBYTESWAPPED((PyArrayObject*)input_frame_1);

And something slightly different but similar for the output. I use swap_in to read a value from the input array, byteswap it and write it into a buffer, and then cast_in to cast the contents of the buffer to a double.

In numpy 2, the copyswap function is still accessible, through a different accessor:

PyArray_CopySwapFunc* swap_in = PyDataType_GetArrFuncs(descr_in)->copyswap;

But the cast function is not. Although the member is still in the struct, most of its values are NULL. So this doesn't work:

PyArray_VectorUnaryFunc* cast_in = PyDataType_GetArrFuncs(descr_in)->cast[NPY_DOUBLE];

The documentation says

PyArray_GetCastFunc is removed. Note that custom legacy user dtypes can still provide a castfunc as their implementation, but any access to them is now removed. The reason for this is that NumPy never used these internally for many years. If you use simple numeric types, please just use C casts directly. In case you require an alternative, please let us know so we can create new API such as PyArray_CastBuffer() which could use old or new cast functions depending on the NumPy version.
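
For reference, the "use C casts directly" suggestion would amount to something like the sketch below, which dispatches on the element type by hand. The `elem_kind` enum and both function names are hypothetical; in an extension the switch would presumably run on the NumPy type number (for example the value returned by PyArray_TYPE, compared against NPY_SHORT, NPY_INT, NPY_FLOAT, NPY_DOUBLE), and the buffer is assumed to already be in machine byte order.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical tag; stands in for the NumPy type number. */
enum elem_kind { ELEM_INT16, ELEM_INT32, ELEM_FLOAT32, ELEM_FLOAT64 };

/* Read one machine-order element from buf and convert it with a C cast. */
static double load_as_double(const char *buf, enum elem_kind kind)
{
    switch (kind) {
    case ELEM_INT16:   { int16_t v; memcpy(&v, buf, sizeof v); return (double)v; }
    case ELEM_INT32:   { int32_t v; memcpy(&v, buf, sizeof v); return (double)v; }
    case ELEM_FLOAT32: { float v;   memcpy(&v, buf, sizeof v); return (double)v; }
    case ELEM_FLOAT64: { double v;  memcpy(&v, buf, sizeof v); return v; }
    }
    assert(0 && "unsupported element kind");
    return 0.0;
}

/* Convert a double to the output type with a C cast and write it to buf. */
static void store_from_double(char *buf, enum elem_kind kind, double value)
{
    switch (kind) {
    case ELEM_INT16:   { int16_t v = (int16_t)value; memcpy(buf, &v, sizeof v); return; }
    case ELEM_INT32:   { int32_t v = (int32_t)value; memcpy(buf, &v, sizeof v); return; }
    case ELEM_FLOAT32: { float v = (float)value;     memcpy(buf, &v, sizeof v); return; }
    case ELEM_FLOAT64: { memcpy(buf, &value, sizeof value); return; }
    }
    assert(0 && "unsupported element kind");
}
```

This only covers the types listed in the switch, and every additional dtype needs its own case, which is exactly the work the cast functions used to do.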

So the function has been removed, but there isn't a clear path to substitute it with something else. What is the correct way to read and write values from/to byteswapped arrays?

More detailed sample code. It just iterates over the input and saves the value in a double.

double d_val = 0;
char buffer[NPY_BUFSIZE];
PyObject* input_frame_1;
// input_frame_1 is initialized over here

// Conversion
PyArray_Descr* descr_in = PyArray_DESCR((PyArrayObject*)input_frame_1);
PyArray_CopySwapFunc* swap_in = descr_in->f->copyswap;
PyArray_VectorUnaryFunc* cast_in = PyArray_GetCastFunc(descr_in, NPY_DOUBLE);
bool need_to_swap_in = PyArray_ISBYTESWAPPED((PyArrayObject*)input_frame_1);

// Iterator
PyArrayIterObject* iter = (PyArrayIterObject*)PyArray_IterNew(input_frame_1);

// Just reads the value and casts it into a double d_val
while (iter->index < iter->size) {
 d_val = 0;
 // Swap the value if needed and store it in the buffer
 swap_in(buffer, iter->dataptr, need_to_swap_in, NULL);
 cast_in(buffer, &d_val, 1, NULL, NULL);
 
 /* Advance the iterator to the next element */
 PyArray_ITER_NEXT(iter);
}

Solution

I found a solution to my problem using NpyIter iterators. This type of iterator can be asked to take care of the byteswapping, buffering and casting that I was previously doing manually.

So my example would be something like:


PyObject* input_frame_1;
// input_frame_1 is initialized over here

/* This var will contain the output */
PyObject* out_res = NULL;

/* required to create the iterator */
PyArray_Descr* dtype_res = NULL;
npy_uint32 op_flags[2];
PyArray_Descr* op_dtypes[2];
PyArrayObject* ops[2];
NpyIter *iter = NULL;
NpyIter_IterNextFunc *iternext;
char** dataptr;

/* I have an input array, the output array
   will be automatically allocated.
   The input array will be cast to double, and the
   output array will also be double.
*/

ops[0] = (PyArrayObject*)input_frame_1; /* input operand */
ops[1] = NULL; /* output operand will be allocated */
op_flags[0] = NPY_ITER_READONLY | NPY_ITER_NBO;
op_flags[1] = NPY_ITER_WRITEONLY | NPY_ITER_ALLOCATE | NPY_ITER_NBO | NPY_ITER_ALIGNED;
dtype_res = PyArray_DescrFromType(NPY_DOUBLE);
op_dtypes[0] = dtype_res; /* input is converted to double */
op_dtypes[1] = dtype_res; /* output is allocated as double */

iter = NpyIter_MultiNew(2, ops,
                  NPY_ITER_BUFFERED, /* this must be enabled to allow byteswapping and casting */
                  NPY_KEEPORDER, NPY_UNSAFE_CASTING,
                  op_flags, op_dtypes);

Py_DECREF(dtype_res);
dtype_res = NULL;

if (iter == NULL) {
    return NULL; /* you will get an error if the arrays are not compatible */
}


/* Specific methods to advance the loop and get the data */

iternext = NpyIter_GetIterNext(iter, NULL);
if (iternext == NULL) {
    NpyIter_Deallocate(iter);
    return NULL;
}
dataptr = NpyIter_GetDataPtrArray(iter);

/* The do/while assumes a non-empty array;
   check NpyIter_GetIterSize(iter) first otherwise */
do {
    double *dbl_ptr;
    double value;

    /* Now dataptr contains correctly formatted data */

    /* the input */
    dbl_ptr = (double*) dataptr[0];
    value = *dbl_ptr;
    /* let's say our operation is b = 2 * a + 1 */
    value = 2 * value + 1;

    /* and the output, stored in the other pointer */
    memcpy(dataptr[1], &value, sizeof(double));

} while(iternext(iter));

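/* The output array can be recovered with NpyIter_GetOperandArray.
   The iterator owns that reference, so take a new one
   before deallocating the iterator. */
out_res = (PyObject*)NpyIter_GetOperandArray(iter)[1];
Py_INCREF(out_res);

/* Deallocation also resolves any pending writebacks and can fail */
if (NpyIter_Deallocate(iter) != NPY_SUCCEED) {
    Py_DECREF(out_res);
    return NULL;
}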

There are many different flags, both per operand and for the whole iteration. For example, you can request a converted copy of an operand with the per-operand flag NPY_ITER_COPY instead of buffering with NPY_ITER_BUFFERED, choose different casting rules (or no casting), disallow broadcasting, get larger chunks of data with an external loop, etc.
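
As a rough sketch of the external-loop variant, the per-chunk work can be written as a separate inner function. It is written here with plain C types so that it compiles on its own: `ptrdiff_t` stands in for `npy_intp`, and the function name `inner_loop_2a_plus_1` is hypothetical. Both operands are assumed to already be machine-order doubles, as in the example above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Process one chunk: out = 2 * in + 1.
   ptrs[0] / strides[0] describe the input, ptrs[1] / strides[1] the output.
   Strides are in bytes; count is the number of elements in the chunk. */
static void inner_loop_2a_plus_1(char **ptrs, const ptrdiff_t *strides, ptrdiff_t count)
{
    char *in = ptrs[0];
    char *out = ptrs[1];
    assert(count >= 0);
    for (ptrdiff_t i = 0; i < count; ++i) {
        double value;
        memcpy(&value, in, sizeof value);    /* read the input element */
        value = 2.0 * value + 1.0;           /* the operation */
        memcpy(out, &value, sizeof value);   /* write the output element */
        in += strides[0];
        out += strides[1];
    }
}
```

In the extension, the iterator would be created with NPY_ITER_EXTERNAL_LOOP in addition to NPY_ITER_BUFFERED, and the do/while loop would call this function once per chunk, taking the pointers from NpyIter_GetDataPtrArray, the strides from NpyIter_GetInnerStrideArray and the count from the pointer returned by NpyIter_GetInnerLoopSizePtr. With buffering enabled the count can change between chunks, so it has to be re-read on every pass.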

Full documentation is here: https://numpy.org/doc/stable/reference/c-api/iterator.html