I am importing data using numpy.genfromtxt, and I would like to add a field of values derived from some of those already in the dataset. Since this is a structured array, the simplest and most efficient way of adding a new column seems to be numpy.lib.recfunctions.append_fields(). I found a good description of this library here.
Is there a way of doing this without copying the array, perhaps by forcing genfromtxt to create an empty column that I can then fill with the derived values?
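For reference, the append_fields() approach described above might look like the sketch below. The field names a, b, c and d, the inline sample rows, and the derived quantity (a * b) are assumptions made up for illustration. Note that append_fields() builds and returns a new array, so the original data is copied.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# Hypothetical sample rows standing in for a real data file.
rows = ["1,11,1.1", "2,22,2.2", "3,33,3.3"]

# Naming the columns makes genfromtxt return a structured array.
data = np.genfromtxt(rows, delimiter=",", names=["a", "b", "c"])

# Hypothetical derived quantity computed from two existing fields.
derived = data["a"] * data["b"]

# append_fields returns a NEW array containing the extra field "d";
# usemask=False asks for a plain ndarray rather than a masked array.
extended = rfn.append_fields(data, "d", derived, usemask=False)
print(extended.dtype.names)
print(extended["d"])
```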
Here's a simple example that uses a generator to add a field to the data while it is being read with genfromtxt.
Our example data file will be data.txt, with the contents:
1,11,1.1
2,22,2.2
3,33,3.3
Reading it directly gives:
In [19]: np.genfromtxt('data.txt',delimiter=',')
Out[19]:
array([[ 1. , 11. ,  1.1],
       [ 2. , 22. ,  2.2],
       [ 3. , 33. ,  3.3]])
If we make a generator such as:
def genfield():
    # Prepend a placeholder column ("0,") to every line of the file.
    with open('data.txt') as f:
        for line in f:
            yield '0,' + line
which prepends a comma-delimited 0 to each line of the file, then:
In [22]: np.genfromtxt(genfield(),delimiter=',')
Out[22]:
array([[ 0. ,  1. , 11. ,  1.1],
       [ 0. ,  2. , 22. ,  2.2],
       [ 0. ,  3. , 33. ,  3.3]])
You can do the same thing inline with a generator expression:
In [26]: np.genfromtxt(('0,'+line for line in open('data.txt')),delimiter=',')
Out[26]:
array([[ 0. ,  1. , 11. ,  1.1],
       [ 0. ,  2. , 22. ,  2.2],
       [ 0. ,  3. , 33. ,  3.3]])
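To tie this back to the structured-array case in the question, here is a minimal sketch that combines the placeholder column with named fields and then fills the placeholder in place. The field names (derived, a, b, c) and the derived quantity (a * b) are assumptions for illustration, and a list of strings stands in for data.txt so the snippet runs on its own.

```python
import numpy as np

# Stand-in for the lines of data.txt.
lines = ["1,11,1.1", "2,22,2.2", "3,33,3.3"]

# Prepend a placeholder "0," to each line so genfromtxt allocates the extra column.
padded = ("0," + line for line in lines)

# Hypothetical field names; "derived" is the placeholder column.
data = np.genfromtxt(padded, delimiter=",", names=["derived", "a", "b", "c"])

# Fill the placeholder in place; no second array with a new dtype is created.
data["derived"] = data["a"] * data["b"]
print(data["derived"])
```

Assigning to data["derived"] writes into the array that genfromtxt already allocated, which is what avoids the extra copy made by append_fields().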