How to save a huge structured array in Python?

I have an array in the following format (modified to fit here):

array([(0.358174, -0.508718, 2728, 0.103, 23.255, 22.633, 22.459, 21.911, 21.211, 0.487, 0.126, 0.145, 0.129, 0.264, 23.028, 22.621, 22.563, 22.039, 21.24 , 0.378, 0.164, 0.14 , 0.125, 0.248, 3, 1),
...,
(3.881584, -0.209449, 5052, 0.075, 22.778, 22.741, 22.187, 21.901, 21.29 , 0.308, 0.128, 0.124, 0.148, 0.345, 22.801, 22.859, 22.291, 22.047, 21.441, 0.285, 0.141, 0.119, 1.056, 0.323, 3, 0)],
dtype=[('ra', '<f8'), ('dec', '<f8'), ('run', '<i2'), ('rExtSFD', '<f8'), ('uRaw', '<f8'), ('gRaw', '<f8'), ('rRaw', '<f8'), ('iRaw', '<f8'), ('zRaw', '<f8'), ('uErr', '<f8'), ('gErr', '<f8'), ('rErr', '<f8'), ('iErr', '<f8'), ('zErr', '<f8'), ('uRawPSF', '<f8'), ('gRawPSF', '<f8'), ('rRawPSF', '<f8'), ('iRawPSF', '<f8'), ('zRawPSF', '<f8'), ('upsfErr', '<f8'), ('gpsfErr', '<f8'), ('rpsfErr', '<f8'), ('ipsfErr', '<f8'), ('zpsfErr', '<f8'), ('type', '<i2'), ('ISOLATED', '<i4')])

I want to save this array to a txt file in such a way that, after reloading the file, data.dtype.names[:5] (for example) returns ('ra', 'dec', 'run', 'rExtSFD', 'uRaw').

However, all the attempts I've made so far (using np.savetxt, for example, and setting fmt='%f...' for all dtypes) haven't worked. I don't want to resort to Pandas. Does anyone have a suggestion?

Solution

Here's a small example:

In [124]: dt = np.dtype([('x','f'),('y','i')])    
In [125]: dt
Out[125]: dtype([('x', '<f4'), ('y', '<i4')])    
In [126]: arr = np.array([(1,2),(3,4),(5,6)],dt)    
In [127]: arr
Out[127]: array([(1., 2), (3., 4), (5., 6)], dtype=[('x', '<f4'), ('y', '<i4')])

Save the array, and show the resulting file:

In [131]: np.savetxt('test.txt',arr, fmt='%f, %d',header='x, y',comments='')

In [132]: !more test.txt
x, y
1.000000, 2
3.000000, 4
5.000000, 6

This can be loaded with:

In [133]: data = np.genfromtxt('test.txt',delimiter=',',dtype=None, names=True)    
In [134]: data
Out[134]: array([(1., 2), (3., 4), (5., 6)], dtype=[('x', '<f8'), ('y', '<i4')])

savetxt just iterates through arr and writes fmt % tuple(row) for each row. The array dtype is not used for this, so fmt needs one format specifier per field.
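To make that concrete, here is a rough sketch of the per-row formatting, using the same small arr as above. It is only an illustration of the idea, not the actual savetxt source; the variable name lines is mine.

```python
import numpy as np

dt = np.dtype([('x', 'f'), ('y', 'i')])
arr = np.array([(1, 2), (3, 4), (5, 6)], dt)

fmt = '%f, %d'

# Each record converts to a plain tuple, which is then pushed through
# ordinary Python %-formatting; the dtype is never consulted.
lines = [fmt % tuple(row) for row in arr]
print('\n'.join(lines))
```

The printed lines match the data rows of test.txt shown above.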

The header could be derived from arr.dtype.names. I passed comments='' to turn off the default comment prefix ('# '), which savetxt would otherwise put in front of the header line.
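For an array with many fields, like the one in the question, typing the header and fmt by hand gets tedious. A minimal sketch of building both from the dtype follows. The choice of '%d' for integer fields and '%f' for everything else is my assumption, as is writing to a temporary directory; adjust both to taste (for example '%.6f' or '%.18e' if you need more digits).

```python
import os
import tempfile
import numpy as np

dt = np.dtype([('x', 'f'), ('y', 'i')])
arr = np.array([(1, 2), (3, 4), (5, 6)], dt)

names = arr.dtype.names

# One column label per field, in dtype order.
header = ', '.join(names)

# One format specifier per field: '%d' for signed/unsigned integer
# fields, '%f' for the rest (assumed here to be floats).
fmt = ', '.join('%d' if arr.dtype[n].kind in 'iu' else '%f' for n in names)

# Temporary location chosen only for this illustration.
path = os.path.join(tempfile.mkdtemp(), 'test.txt')
np.savetxt(path, arr, fmt=fmt, header=header, comments='')

# Reload, taking the field names from the header line.
data = np.genfromtxt(path, delimiter=',', dtype=None, names=True)
print(data.dtype.names)
```

The same lines should work unchanged on the 26-field array from the question, since nothing here depends on the number of fields.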

genfromtxt reads the text file; dtype=None tells it to deduce column dtypes. names=True tells it to take the dtype names from the header.

Without the header, or just ignoring it, we can use the dtype directly:

In [136]: np.genfromtxt('test.txt',delimiter=',',dtype=dt, skip_header=1)
Out[136]: array([(1., 2), (3., 4), (5., 6)], dtype=[('x', '<f4'), ('y', '<i4')])

Note that the csv file itself does not directly carry dtype information; it has to be inferred, or you have to know it beforehand.
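If you want the text file to carry the exact dtype, one possible workaround (my own suggestion, not something savetxt or genfromtxt does for you) is to write arr.dtype.descr as a commented header line and parse it back on load. This is a sketch that assumes a simple, unpadded dtype like the ones above; the file name and the variable names descr_line, saved_dt, and loaded are hypothetical.

```python
import ast
import os
import tempfile
import numpy as np

dt = np.dtype([('x', 'f'), ('y', 'i')])
arr = np.array([(1, 2), (3, 4), (5, 6)], dt)

path = os.path.join(tempfile.mkdtemp(), 'test_dtype.txt')

# Write the dtype description as a '# '-prefixed header line.
np.savetxt(path, arr, fmt='%f, %d', header=repr(arr.dtype.descr))

# Recover the dtype from the first line of the file.
with open(path) as f:
    descr_line = f.readline().lstrip('# ').strip()
saved_dt = np.dtype(ast.literal_eval(descr_line))

# genfromtxt skips lines starting with '#' by default, so the header
# line is ignored while the data rows are read with the saved dtype.
loaded = np.genfromtxt(path, delimiter=',', dtype=saved_dt)
print(loaded.dtype.names)
```

With this, both the field names and the original field types (here f4 and i4) come back on reload, at the cost of a header line that only your own loader understands.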