python arrays numpy genfromtxt

Fastest way to read every n-th row with numpy's genfromtxt


I read my data with numpy's genfromtxt:

import numpy as np

# Read columns 3, 0 and 2, skipping 4 header lines and 2 footer lines
measurement = np.genfromtxt('measurementProfile2.txt', delimiter=None, dtype=None,
                            skip_header=4, skip_footer=2, usecols=(3, 0, 2))
rows, columns = np.shape(measurement)
# Append a constant column holding the value 394
x = np.zeros((rows, 1), dtype=measurement.dtype)
x[:] = 394
measurement = np.hstack((measurement, x))
np.savetxt('measurementProfileFormatted.txt', measurement)

This works fine, but I want only every 5th or 6th (in general, every n-th) row in the final output file. According to the numpy.genfromtxt documentation, there is no parameter that does this. I don't want to iterate over the array. Is there a recommended way to deal with this problem?


Solution

  • To avoid reading the whole array, you can combine np.genfromtxt with itertools.islice so that only every n-th line of the file is passed to the parser. This is marginally faster than reading the whole array and then slicing it (at least for the smaller arrays I tried).

    For instance, here's the contents of file.txt:

    12
    34
    22
    17
    41
    28
    62
    71
    

    Then for example:

    >>> import itertools
    >>> import numpy as np
    >>> with open('file.txt') as f_in:
    ...     x = np.genfromtxt(itertools.islice(f_in, 0, None, 3), dtype=int)
    

    This gives an array x containing the elements at indices 0, 3 and 6 of the file above:

    array([12, 17, 62])
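
As a rough, self-contained sketch of the comparison described above, the script below writes the sample values to a temporary file and reads every n-th row in two ways: once by slicing the lines with itertools.islice before parsing, and once by parsing everything and slicing the resulting array. The temporary file and the names `values`, `n`, `sliced_on_read` and `sliced_after` are assumptions made for illustration and do not come from the original post.

```python
import itertools
import os
import tempfile

import numpy as np

# Hypothetical stand-in for the contents of file.txt shown above.
values = [12, 34, 22, 17, 41, 28, 62, 71]
n = 3  # keep every n-th row

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "file.txt")
    with open(path, "w") as f_out:
        f_out.write("\n".join(str(v) for v in values) + "\n")

    # Approach 1: islice drops the unwanted lines before genfromtxt parses them.
    with open(path) as f_in:
        sliced_on_read = np.genfromtxt(itertools.islice(f_in, 0, None, n), dtype=int)

    # Approach 2: parse every row, then keep every n-th row of the array.
    sliced_after = np.genfromtxt(path, dtype=int)[::n]

print(sliced_on_read)
print(sliced_after)
```

Both approaches should select the same rows. For a file with header lines, as in the question, the start argument of islice (for example `itertools.islice(f_in, 4, None, n)`) could take over the role of skip_header. Note that skip_footer would then most likely count the sampled lines rather than the lines of the file, so slicing after a full read may be the simpler choice when a footer has to be skipped.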