python, numpy, arcpy

python dict to numpy structured array


I have a dictionary that I need to convert to a NumPy structured array. I'm using the arcpy function NumPyArrayToTable, so a NumPy structured array is the only data format that will work.

Based on two threads, "Writing to numpy array from dictionary" and "How to convert Python dictionary object to numpy array", I've tried this:

result = {0: 1.1181753789488595, 1: 0.5566080288678394, 2: 0.4718269778030734, 3: 0.48716683119447185, 4: 1.0, 5: 0.1395076201641266, 6: 0.20941558441558442}

names = ['id','data']
formats = ['f8','f8']
dtype = dict(names = names, formats=formats)
array=numpy.array([[key,val] for (key,val) in result.iteritems()],dtype)

But I keep getting the error "expected a readable buffer object".

The method below works, but is stupid and obviously won't work for real data. I know there is a more graceful approach; I just can't figure it out.

totable = numpy.array([[key,val] for (key,val) in result.iteritems()])
array=numpy.array([(totable[0,0],totable[0,1]),(totable[1,0],totable[1,1])],dtype)

Solution

  • You could use np.array(list(result.items()), dtype=dtype):

    import numpy as np
    result = {0: 1.1181753789488595, 1: 0.5566080288678394, 2: 0.4718269778030734, 3: 0.48716683119447185, 4: 1.0, 5: 0.1395076201641266, 6: 0.20941558441558442}
    
    names = ['id','data']
    formats = ['f8','f8']
    dtype = dict(names = names, formats=formats)
    array = np.array(list(result.items()), dtype=dtype)
    
    print(repr(array))
    

    yields

    array([(0.0, 1.1181753789488595), (1.0, 0.5566080288678394),
           (2.0, 0.4718269778030734), (3.0, 0.48716683119447185), (4.0, 1.0),
           (5.0, 0.1395076201641266), (6.0, 0.20941558441558442)], 
          dtype=[('id', '<f8'), ('data', '<f8')])
    

    If you don't want to create the intermediate list of tuples, list(result.items()), then you could instead use np.fromiter:

    In Python2:

    array = np.fromiter(result.iteritems(), dtype=dtype, count=len(result))
    

    In Python3:

    array = np.fromiter(result.items(), dtype=dtype, count=len(result))
    

    Why using the list [key,val] does not work:

    By the way, your attempt,

    numpy.array([[key,val] for (key,val) in result.iteritems()],dtype)
    

    was very close to working. If you had changed the list [key, val] to the tuple (key, val), it would have worked. Of course,

    numpy.array([(key,val) for (key,val) in result.iteritems()], dtype)
    

    is the same thing as

    numpy.array(result.items(), dtype)
    

    in Python2, or

    numpy.array(list(result.items()), dtype)
    

    in Python3.


    np.array treats lists differently than tuples. As Robert Kern explains:

    As a rule, tuples are considered "scalar" records and lists are recursed upon. This rule helps numpy.array() figure out which sequences are records and which are other sequences to be recursed upon; i.e. which sequences create another dimension and which are the atomic elements.

    Since (0.0, 1.1181753789488595) is considered one of those atomic elements, it should be a tuple, not a list.
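
To make the tuple-versus-list rule concrete, here is a minimal sketch, assuming Python 3 and a shortened copy of the dictionary from the question. The names records and nested are only illustrative. The tuple version builds one record per dictionary entry, with fields addressable by name. What the list version does seems to depend on the NumPy version (it may raise, or it may return an array with an extra dimension), so the sketch only reports the outcome and does not rely on it.

```python
import numpy as np

# Shortened copy of the dictionary from the question (illustrative subset).
result = {0: 1.1181753789488595, 1: 0.5566080288678394, 2: 0.4718269778030734}

# Same structured dtype as in the answer: two float64 fields.
dtype = np.dtype([('id', 'f8'), ('data', 'f8')])

# Tuples are treated as atomic records, so each (key, val) pair fills one row.
records = np.array([(key, val) for key, val in result.items()], dtype=dtype)
print(records.shape)    # one record per dictionary entry
print(records['id'])    # fields are addressable by name
print(records['data'])

# Inner lists are recursed into as another dimension instead of being read
# as records. The outcome varies by NumPy version, so it is only reported.
try:
    nested = np.array([[key, val] for key, val in result.items()], dtype=dtype)
    print("lists gave shape", nested.shape)
except Exception as exc:
    print("lists raised", type(exc).__name__)
```

If the list version does return an array, comparing its shape with records.shape is a quick way to see that the inner lists were not interpreted as (id, data) records.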