Tags: python, multidimensional-array, indexing, genfromtxt

Indexing an ndarray - 1 item is stored differently to >1


I'm using genfromtxt to import data from a txt file.

The data is imported into an ndarray, which is what genfromtxt returns.

Normally, this text file has many lines of data to import, meaning that the ndarray is like this:

array([ ('2016-04-17T00:08:42.273000Z', '2016-04-17T00:08:50.595000Z', '2016-04-17T00:08:58.378391Z', '2016-04-17T00:08:58.840273Z', '2016-04-17T00:09:05.670000Z', '2016-04-17T00:09:06.115000Z', '2016-04-17T00:09:07.155000Z', '2016-04-17T00:09:06.804999Z', '2016-04-17T00:09:08.488391Z', '2016-04-17T00:09:14.890273Z', '2016-04-17T00:09:11.648393Z', 1.702756, 10, 3.959),
   ('2016-04-17T01:11:11.393000Z', '2016-04-17T01:11:19.715000Z', '2016-04-17T01:11:27.498391Z', '2016-04-17T01:11:27.960273Z', '2016-04-17T01:11:34.790000Z', '2016-04-17T01:11:35.235000Z', '2016-04-17T01:11:36.275000Z', '2016-04-17T01:11:35.924999Z', '2016-04-17T01:11:37.608391Z', '2016-04-17T01:11:44.010273Z', '2016-04-17T01:11:40.768393Z', 3.084912, 10, 3.423),
   ('2016-05-20T19:10:42.883000Z', '2016-05-20T19:10:51.205000Z', '2016-05-20T19:10:58.978393Z', '2016-05-20T19:10:59.441114Z', '2016-05-20T19:11:06.280000Z', '2016-05-20T19:11:06.705000Z', '2016-05-20T19:11:07.725000Z', '2016-05-20T19:11:07.405000Z', '2016-05-20T19:11:09.108393Z', '2016-05-20T19:11:15.481160Z', '2016-05-20T19:11:12.258393Z', 1.956513, 10, 3.078)], 
  dtype=[('origintime', 'S27'), ('JAMA', 'S27'), ('FLF1', 'S27'), ('MAG1', 'S27'), ('AV18', 'S27'), ('AV21', 'S27'), ('AMA1', 'S27'), ('BV15', 'S27'), ('PPLP', 'S27'), ('HPAL', 'S27'), ('ILLI', 'S27'), ('stackedcorr', '<f8'), ('totalstations', '<i8'), ('magestimate', '<f8')])

But when the text file only has one line, the ndarray is like this (note the lack of square brackets compared with the previous example):

array(('2016-05-08T03:13:02.841000Z', '2016-05-08T03:13:10.705000Z', '1900-01-01T00:00:00.000000Z', '2016-05-08T03:13:14.099997Z', '2016-05-08T03:13:14.938393Z', '2016-05-08T03:13:29.228391Z', '2016-05-08T03:13:31.868393Z', '2016-05-08T03:13:31.909995Z', '2016-05-08T03:13:36.920000Z', '2016-05-08T03:13:37.080000Z', '2016-05-08T03:13:37.635000Z', 9.0, 9, 3.41), 
  dtype=[('origintime', 'S27'), ('JAMA', 'S27'), ('CABP', 'S27'), ('MAG1', 'S27'), ('FLF1', 'S27'), ('PAC1', 'S27'), ('GGPT', 'S27'), ('PINO', 'S27'), ('SUCR', 'S27'), ('BNAS', 'S27'), ('SLOR', 'S27'), ('stackedcorr', '<f8'), ('totalstations', '<i8'), ('magestimate', '<f8')])

The difference is that the multi-line input is an array, while the one-line input is not.

This messes up indexing as I cannot loop over results['origintime'][i] because of the one-line input possibility.

How can I convert the ndarray from the one-line input (no square brackets) into a length-1 array, so that it has the same format as the multi-line ndarrays?

Thanks


Solution

  • NumPy does load the file as an array, but as a "0-dimensional" one. That is, results.ndim is 0. You can convert it to a 1-dimensional array with one element by calling results.reshape((1,)). Note that reshape returns a new array rather than modifying results in place, so its return value has to be assigned back.

    If you are reading in a file and you don't know whether it will have one or multiple lines beforehand, you can do:

    import numpy as np

    results = np.genfromtxt(filename)
    if results.ndim == 0:
        # reshape returns a new array, so assign it back
        results = results.reshape((1,))
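An alternative to the explicit ndim check is np.atleast_1d, which promotes a 0-dimensional array to shape (1,) and returns arrays that are already 1-dimensional unchanged. The following is only a minimal sketch: the helper name load_records, the comma delimiter, the three-column layout, and the use of io.StringIO in place of a real file are all assumptions made for illustration, not details from the question.

```python
import io

import numpy as np


def load_records(source):
    """Hypothetical helper: always return a 1-D structured array."""
    # Assumed layout: comma-delimited text with a header row of field names.
    # dtype=None asks genfromtxt to infer a type for each column.
    records = np.genfromtxt(
        source, dtype=None, names=True, delimiter=",", encoding="utf-8"
    )
    # A single data row comes back 0-dimensional; atleast_1d makes it shape (1,).
    return np.atleast_1d(records)


# Stand-ins for the real text files (made-up, shortened rows).
one_line = io.StringIO(
    "origintime,stackedcorr,totalstations\n"
    "2016-05-08T03:13:02.841000Z,9.0,9\n"
)
many_lines = io.StringIO(
    "origintime,stackedcorr,totalstations\n"
    "2016-04-17T00:08:42.273000Z,1.702756,10\n"
    "2016-04-17T01:11:11.393000Z,3.084912,10\n"
)

single = load_records(one_line)
multiple = load_records(many_lines)

# The same loop now works for both inputs.
for results in (single, multiple):
    for i in range(len(results)):
        print(results["origintime"][i], results["totalstations"][i])
```

With this in place, results['origintime'][i] can be indexed the same way whether the file held one line or many.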