I have a 2-column, mixed-type array that I need to read in and reshape into a data cube. I've got most of it working, but for some reason both np.loadtxt and np.genfromtxt drop everything after the 8th character from the string part of the tuple. I have 25 blocks of 8 parameter-value pairs corresponding to stars of varying masses and metallicities. For instance, Teff \t\t 5.2739E+3 (there are 2 tabs between the string and the float) converts to a key-value pair just fine, but MASS/MSUN \t\t 0.800 gets converted to 'MASS/MSU': 0.800 instead of 'MASS/MSUN': 0.800 like I expected. Similarly, LOG(L/LSUN) \t\t 0.0522 becomes 'LOG(L/LS': 0.0522 instead of 'LOG(L/LSUN)': 0.0522.
Why are the last characters in the strings falling off?
I've tried setting the delimiter to tabs only, to tabs and newlines (it didn't seem to like that), commenting out the lines between blocks, etc. No matter what I do, the character limit for each string seems to be stuck at 8. There must be a string subtype I need to declare. I've made a workaround, but it still bothers me.
This is my code (I'm using the Spyder GUI, BTW):
>>> import numpy as np
>>> from collections import defaultdict
>>>
>>> f = np.genfromtxt("zamsdata.txt", dtype=(str, float))
>>> # nested mapping: zcube[hydrogen fraction][mass][parameter] -> list of values
>>> zcube = defaultdict(lambda: defaultdict(lambda: defaultdict(list)))
>>> infotups = []
>>> for row in f:
>>>     if 'MASS' in row[0]:
>>>         mass = str(row[1])
>>>         continue  # rows are in repeating order of MASS, X, Y, Pc, Tc, R, L, Te, LOG(Te) & LOG(L/LSUN)
>>>     if 'X' in row[0]:
>>>         hydfrac = str(row[1])
>>>         continue
>>>     else:
>>>         infotups.append([hydfrac, mass, str(row[0]), row[1]])
>>>
>>> for l, m, a, o in infotups:
>>>     zcube[l][m][a].append(o)
When the data type of a field is specified to be str, it looks like the default size assigned to the field by genfromtxt is eight characters. If you know that the maximum number of characters is, say, 12, you could use dtype=['S12', float]. (Note that I've used a list, not a tuple.) You could also use dtype=None, which tells genfromtxt to figure out the data type of each field from what it finds in the file.
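
Here is a minimal sketch of both options. It rests on a few assumptions: the three-row sample string is made up to mirror the layout in the question, 'U12' (a unicode string field) stands in for 'S12' so that the labels come back as str rather than bytes on Python 3, and encoding='utf-8' is passed explicitly so the behaviour doesn't depend on the NumPy version's default. The 'U8' case is only there to reproduce the symptom with an explicitly narrow field; it is not a claim about what genfromtxt does internally with dtype=(str, float).

```python
import io
import numpy as np

# Hypothetical sample in the same layout as the question's file:
# a label, two tabs, then a float.
sample = (
    "Teff\t\t5.2739E+3\n"
    "MASS/MSUN\t\t0.800\n"
    "LOG(L/LSUN)\t\t0.0522\n"
)

def read(dtype):
    # A fresh StringIO each time, since genfromtxt consumes the stream.
    return np.genfromtxt(io.StringIO(sample), dtype=dtype, encoding="utf-8")

# An 8-character string field reproduces the truncation from the question.
narrow = read(["U8", float])

# A 12-character field is wide enough for the longest label here.
fixed = read(["U12", float])

# dtype=None asks genfromtxt to infer each column's type from the data.
inferred = read(None)

labels_narrow = [str(row[0]) for row in narrow]
labels_fixed = [str(row[0]) for row in fixed]
labels_inferred = [str(row[0]) for row in inferred]

print(labels_narrow)
print(labels_fixed)
print(labels_inferred)
```

With dtype=None you don't have to guess the maximum label length up front, which is handy if the parameter names in the file might change.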