Tags: python, arrays, numpy, io

Read structured array from ASCII file


I have an ASCII text file (in a given format that I cannot change) with the following content

previous data...
#
# some comment
2 a -0.9989532219119496
1 b 1.8002219998623799
1 c 0.2681232137509927
# 
some other things...

and I would like to read that file into an array with a custom dtype (a "structured array"). Everything works when the file is binary (drop the sep="\n" argument from both calls below), but reading fails when the file is ASCII:

import numpy as np
import string

# Create some fake data
N = 3
dtype = np.dtype([("a", "i4"), ("b", "S8"), ("c", "f8")])
a = np.zeros(N, dtype)
a["a"] = np.random.randint(0, 3, N)
a["b"] = np.array([x for x in string.ascii_lowercase[:N]])
a["c"] = np.random.normal(size=(N,))

print(a)

a.tofile("test.dat", sep="\n")
b = np.fromfile("test.dat", dtype=dtype, sep="\n")

print(b)

The np.fromfile call raises:

ValueError: Unable to read character files of that array type

Any hints here?

(The file contains other data as well, so in real life I'm using a file handle instead of a filename string, but I suppose this doesn't matter much here.)


Solution

  • In [286]: txt = """previous data... 
         ...: # 
         ...: # some comment 
         ...: 2 a -0.9989532219119496 
         ...: 1 b 1.8002219998623799 
         ...: 1 c 0.2681232137509927 
         ...: #  
         ...: some other things...""".splitlines()  
    

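    Using the parameters noted in my comment (skip_header skips the leading non-data line, max_rows stops before the trailing text, and lines starting with # are treated as comments by default):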

    In [289]: np.genfromtxt(txt, skip_header=1, max_rows=3, dtype=None, encoding=None)
    Out[289]:
    array([(2, 'a', -0.99895322), (1, 'b',  1.800222  ),
           (1, 'c',  0.26812321)],
          dtype=[('f0', '<i8'), ('f1', '<U1'), ('f2', '<f8')])

    In [290]: _.shape
    Out[290]: (3,)
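    The same call should also work with an open file handle and with an explicit structured dtype, so the fields keep the names from the question instead of f0, f1, f2. The following is only a sketch under a few assumptions: io.StringIO stands in for the real file handle, the string field is declared as "U8" rather than the question's "S8" (so that text read with encoding=None is stored as str), and the header and row counts are hard-coded for this sample file.

```python
import io

import numpy as np

# Sample content copied from the question; io.StringIO is a stand-in
# for the real open file handle (an assumption for this sketch).
text = """previous data...
#
# some comment
2 a -0.9989532219119496
1 b 1.8002219998623799
1 c 0.2681232137509927
#
some other things...
"""

# Same field names as in the question; "U8" replaces "S8" here
# (assumption) so the string column holds str values, not bytes.
dtype = np.dtype([("a", "i4"), ("b", "U8"), ("c", "f8")])

with io.StringIO(text) as fh:
    arr = np.genfromtxt(
        fh,
        dtype=dtype,
        skip_header=1,   # skip the "previous data..." line
        max_rows=3,      # stop after the three data rows
        encoding=None,   # read the input as text, not bytes
    )

print(arr)
print(arr.dtype.names)
```

    Lines starting with # are dropped by the default comments="#" setting, which is why only the first line has to be skipped explicitly. In a real file the number of header lines and data rows would have to be known or determined beforehand.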