How to initialise a PyTables float array with NaNs?


All the values I want to store in a PyTables HDF5 table are real numbers, so a 2D array seems a more logical choice than a regular table. For many cells of the array, however, the value may be unavailable at first (and become available later), and the only reasonably simple way to indicate this seems to be assigning NaN to the cell (as far as I understand, you can't just put None there).

I have created the table the following way:

import tables

with tables.open_file(file_full_name, 'a') as table_file:
    table = table_file.create_array(where=table_file.root,
                                    name='main',
                                    atom=tables.FloatAtom(dflt=float('nan')),
                                    shape=(table_length, table_width))

But the resulting table gets filled with zeros (0.0), not with NaNs.

Assigning float('nan') to a cell of the newly created table works just fine, but setting float('nan') as the default value via atom=tables.FloatAtom(dflt=float('nan')) does not, so I have to initialise newly created tables manually.

How can this possibly be fixed? Or is there a better way perhaps?


Solution

  • It's a known bug: the dflt parameter does nothing here. It is explained in https://github.com/PyTables/PyTables/issues/423
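
Until that is fixed, one possible workaround is to skip atom/shape and pass a NaN-filled NumPy array as the obj argument of create_array, so the initial contents no longer depend on dflt. The following is only a minimal sketch: the helper name create_nan_array, the file name example.h5 and the 3 x 4 shape are made up for illustration, and it assumes the whole array fits in memory.

```python
import os
import tempfile

import numpy as np
import tables


def create_nan_array(h5_path, length, width, name="main"):
    """Create a 2D float64 array pre-filled with NaN and return its contents.

    Hypothetical helper for illustration; not part of PyTables.
    """
    # Build the NaN-filled data in memory and hand it over as `obj`,
    # so the initial values do not rely on the atom's `dflt`.
    initial = np.full((length, width), np.nan, dtype=np.float64)
    with tables.open_file(h5_path, "a") as table_file:
        array = table_file.create_array(where=table_file.root,
                                        name=name,
                                        obj=initial)
        # A value that becomes available later is assigned as usual.
        array[0, 0] = 1.5
        # Read everything back while the file is still open.
        return array.read()


# Example run against a throwaway file (path and shape are arbitrary).
tmp_dir = tempfile.mkdtemp()
data = create_nan_array(os.path.join(tmp_dir, "example.h5"), 3, 4)
```

Cells that have not been assigned yet can then be detected with np.isnan. For arrays too large to build in memory at once, the same idea would have to be applied in slices, which this sketch does not cover.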