
Forcing genfromtxt output to be a non-vector (2-D) array


Is there a way to force genfromtxt to output data with the shape (xx, 1) when only one column of data is loaded? The usual shape in that case is (xx, ). In my example, xx could be any integer.

Update: here is an example of the code:

import numpy as np
a = np.zeros([1000, 10])
nbcols = 1
for ind in range(0, 10, nbcols):
    a[:, ind : ind + nbcols] = np.genfromtxt('file_1000x10.csv', usecols = range(nbcols))

This piece of code works only for nbcols >= 2, assuming nbcols is an integer in [1, 10]. Is there a way to make it work for nbcols = 1 without adding an if statement?

I simplified the original code a little too much for this post, though that should not affect the answers to my problem. In the real code, the filename is built from a variable as follows:

filename = 'file_1000x10_' + '%02d' % int(ind) + '.csv'

So at each iteration of the for loop, np.genfromtxt loads data from a different file.


Solution

  • I think the trick is to call reshape(-1, nbcols) on what np.genfromtxt returns, so your assignment should look like:

    a[:, ind:ind + nbcols] = np.genfromtxt('file_1000x10.csv',
                                           usecols = range(nbcols)).reshape(-1, nbcols)
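
To make the reshape trick concrete, here is a minimal sketch that uses an in-memory string instead of a file on disk. The sample data and the helper name load_first_columns are made up for illustration; only np.genfromtxt, its usecols argument, and reshape come from the answer above.

```python
import io
import numpy as np

# Hypothetical stand-in for a whitespace-delimited file: 3 rows, 4 columns.
text = "0 1 2 3\n4 5 6 7\n8 9 10 11\n"

def load_first_columns(text, nbcols):
    """Load the first nbcols columns and always return a 2-D array."""
    raw = np.genfromtxt(io.StringIO(text), usecols=range(nbcols))
    # With a single column, raw is 1-D; reshape gives it shape (rows, 1).
    # With several columns, raw is already (rows, nbcols) and is unchanged.
    return raw.reshape(-1, nbcols)

one = load_first_columns(text, 1)
two = load_first_columns(text, 2)
print(one.shape)  # (3, 1)
print(two.shape)  # (3, 2)

# Both results can now be assigned to a column slice without an if statement.
a = np.zeros((3, 4))
a[:, 0:1] = one
a[:, 2:4] = two
```

Because the reshape is a no-op when the data is already two-dimensional, the same line serves every value of nbcols.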
    

    On a separate note, when every iteration reads the same file, looping over ind and re-reading the file each time is unnecessary. You can read it once and let broadcasting over an extra dimension fill the array, as follows:

    import numpy as np
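    from io import StringIO  # Python 3; the original used the Python 2 StringIO module
    
    def make_data(rows, cols):
        # Build a whitespace-delimited text "file" holding 0 .. rows*cols - 1, row by row.
        data = ((str(k + cols * j) for k in range(cols)) for j in range(rows))
        data = '\n'.join(' '.join(row) for row in data)
        return StringIO(data)
    
    def read_data(f, rows, cols, nbcols):
        # One block of nbcols columns per group; ceil(cols / nbcols) groups in total.
        a = np.zeros((rows, (cols + nbcols - 1) // nbcols, nbcols))
        # Read the first nbcols columns once and broadcast them across every group.
        a[...] = np.genfromtxt(f, usecols=range(nbcols)).reshape(-1, 1, nbcols)
        # Flatten the groups back into columns and trim any excess.
        return a.reshape(rows, -1)[:, :cols]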
    
    >>> read_data(make_data(3, 6), 3, 6, 2)
    array([[  0.,   1.,   0.,   1.,   0.,   1.],
           [  6.,   7.,   6.,   7.,   6.,   7.],
           [ 12.,  13.,  12.,  13.,  12.,  13.]])
    >>> read_data(make_data(3, 6), 3, 6, 1)
    array([[  0.,   0.,   0.,   0.,   0.,   0.],
           [  6.,   6.,   6.,   6.,   6.,   6.],
           [ 12.,  12.,  12.,  12.,  12.,  12.]])
    >>> read_data(make_data(3, 6), 3, 6, 4)
    array([[  0.,   1.,   2.,   3.,   0.,   1.],
           [  6.,   7.,   8.,   9.,   6.,   7.],
           [ 12.,  13.,  14.,  15.,  12.,  13.]])
    

    ORIGINAL ANSWER: You can add that extra dimension of size 1 to your_array using:

    your_array.reshape(your_array.shape + (1,))
    

    or, for a 1-D array, the equivalent

    your_array.reshape(-1, 1)
    

    The same can be achieved with

    your_array[..., np.newaxis]
    

    or the equivalent

    your_array[..., None]
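
As a quick check of the four spellings above, the following sketch applies each one to a small 1-D array and confirms they give the same (N, 1) result. The array names v and m are placeholders chosen for this example. The last two lines show where the forms stop being interchangeable: on input that is already 2-D, reshape(-1, 1) flattens everything into one column, whereas np.newaxis appends a new trailing axis.

```python
import numpy as np

v = np.arange(5.0)  # 1-D array, shape (5,)

candidates = [
    v.reshape(v.shape + (1,)),
    v.reshape(-1, 1),
    v[..., np.newaxis],
    v[..., None],  # np.newaxis is an alias for None
]

# All four forms turn a 1-D array into the same column of shape (5, 1).
same = all(c.shape == (5, 1) and np.array_equal(c, candidates[0])
           for c in candidates)
print(same)  # True

# The forms differ once the input has more than one dimension.
m = np.zeros((2, 3))
print(m.reshape(-1, 1).shape)    # (6, 1)
print(m[..., np.newaxis].shape)  # (2, 3, 1)
```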