Tags: python, python-3.x, numpy, iterable-unpacking

How to iterate numpy array (of tuples) in list manner


I am getting the error "TypeError: Iterator operand or requested dtype holds references, but the REFS_OK flag was not enabled" when iterating over a numpy array of tuples, as below:

import numpy as np

tmp = np.empty((), dtype=object)
tmp[()] = (0, 0)
arr = np.full(10, tmp, dtype=object)

for a, b in np.nditer(arr):
    print(a, b)

How can I fix this?


Solution

    In [71]: tmp = np.empty((), dtype=object)
        ...: tmp[()] = (0, 0)
        ...: arr = np.full(10, tmp, dtype=object)
    

    You don't need nditer to iterate through this array:

    In [74]: for i in arr: print(i)
    (0, 0)
    (0, 0)
    (0, 0)
    ...
    (0, 0)
    

    nditer just makes life more complicated, and isn't any faster, especially for something like print. Who or what recommended nditer?
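
As a minimal sketch of both routes (not part of the original answer): plain iteration hands back the stored tuples, so they unpack directly in the for statement. If nditer really is wanted, it accepts an object dtype array only when given flags=["refs_ok"], and it yields 0-d arrays rather than tuples, so each item has to be pulled out with .item() before unpacking. The names pairs, via_nditer and raised are illustrative.

```python
import numpy as np

# Same construction as in the question.
tmp = np.empty((), dtype=object)
tmp[()] = (0, 0)
arr = np.full(10, tmp, dtype=object)

# Plain iteration yields the stored tuples, so they unpack directly.
pairs = [(a, b) for a, b in arr]

# Without refs_ok, nditer rejects an object dtype array with a TypeError.
raised = False
try:
    np.nditer(arr)
except TypeError:
    raised = True

# With refs_ok, nditer yields 0-d object arrays; .item() returns the tuple.
via_nditer = []
for x in np.nditer(arr, flags=["refs_ok"]):
    a, b = x.item()
    via_nditer.append((a, b))
```

Both loops collect the same ten (0, 0) pairs; the nditer version just takes more steps to get there.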

    For that matter, you can simply print the array:

    In [75]: arr
    Out[75]: 
    array([(0, 0), (0, 0), (0, 0), (0, 0), (0, 0), (0, 0), (0, 0), (0, 0),
           (0, 0), (0, 0)], dtype=object)
    

    But let's look at something else - the id of elements of this object dtype array:

    In [76]: [id(i) for i in arr]
    Out[76]: 
    [1562261311040,
     1562261311040,
     ...
     1562261311040,
     1562261311040]
    

    You made an array with 10 references to the same tuple. Is that what you intended? It's np.full that did that: it puts the same fill object into every slot.
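
A quick way to check the sharing without reading through a list of ids, sketched with the same construction as in the question: count the distinct objects in the array. The names shared, n_objects and all_same are illustrative.

```python
import numpy as np

tmp = np.empty((), dtype=object)
tmp[()] = (0, 0)
shared = np.full(10, tmp, dtype=object)

# Number of distinct Python objects referenced by the array.
n_objects = len({id(x) for x in shared})

# Equivalent identity check against the first element.
first = shared[0]
all_same = all(x is first for x in shared)
```

Since tuples are immutable, the sharing is mostly harmless here. It would matter if the fill object were mutable, such as a list.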

    To make a different tuple in each slot, I was going to suggest this list comprehension, but then realized it just produced a 2d array:

    In [83]: arr1 = np.array([(0,0) for _ in range(5)]); arr1
    Out[83]: 
    array([[0, 0],
           [0, 0],
           [0, 0],
           [0, 0],
           [0, 0]])
    

    To make an object dtype array that holds actual, distinct tuples, we have to do something like:

    In [84]: arr1 = np.empty(5, object); arr1
    Out[84]: array([None, None, None, None, None], dtype=object)    
    In [85]: arr1[:] = [(0,i) for i in range(5)]    
    In [86]: arr1
    Out[86]: array([(0, 0), (0, 1), (0, 2), (0, 3), (0, 4)], dtype=object)
    

    But that brings us back to the basic question: why make an array of tuples in the first place? What's the point? numpy is best with multidimensional numeric arrays. Object dtype arrays are, in many ways, just glorified (or debased) lists.
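
For comparison, a minimal sketch of the numeric alternative, assuming the pairs are plain integers: store them as the rows of a 2-D array. Rows unpack in a for loop just like tuples, and whole columns are available without any loop. The names arr2, sums, firsts and seconds are illustrative.

```python
import numpy as np

# Five (0, i) pairs stored as the rows of a 2-D integer array.
arr2 = np.array([(0, i) for i in range(5)])

# Each row has length 2, so it unpacks into a and b.
sums = [int(a + b) for a, b in arr2]

# Columns are available directly through slicing.
firsts = arr2[:, 0]
seconds = arr2[:, 1]
```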