Using NumPy's genfromtxt in Python, I want to use the column headers as keys for each row's data. I tried the following, but I can't get the column names attached to the corresponding values.
import numpy as np
column = np.genfromtxt(pathToFile, dtype=str, delimiter=',', usecols=(0))
columnData = np.genfromtxt(pathToFile, dtype=str, delimiter=',')
data = dict(zip(column, columnData.tolist()))
Below is the data file:
header0,header1,header2
mydate,3.4,2.0
nextdate,4,6
afterthat,7,8
Currently, the data comes out as:
{
    "mydate": [
        "mydate",
        "3.4",
        "2.0"
    ],
    "nextdate": [
        "nextdate",
        "4",
        "6"
    ],
    "afterthat": [
        "afterthat",
        "7",
        "8"
    ]
}
I want to get it into this format instead:
{
    "mydate": {
        "header1": "3.4",
        "header2": "2.0"
    },
    "nextdate": {
        "header1": "4",
        "header2": "6"
    },
    "afterthat": {
        "header1": "7",
        "header2": "8"
    }
}
Any suggestions?
With your sample file and genfromtxt calls, I get two arrays:
In [89]: column
Out[89]:
array(['header0', 'mydate', 'nextdate', 'afterthat'],
dtype='<U9')
In [90]: columnData
Out[90]:
array([['header0', 'header1', 'header2'],
['mydate', '3.4', '2.0'],
['nextdate', '4', '6'],
['afterthat', '7', '8']],
dtype='<U9')
Pull out the first row of columnData, which holds the header names:
In [91]: headers=columnData[0,:]
In [92]: headers
Out[92]:
array(['header0', 'header1', 'header2'],
dtype='<U9')
Now construct a dictionary of dictionaries (I don't need the separate column
array):
In [94]: {row[0]: {h:v for h,v in zip(headers, row)} for row in columnData[1:]}
Out[94]:
{'afterthat': {'header0': 'afterthat', 'header1': '7', 'header2': '8'},
'mydate': {'header0': 'mydate', 'header1': '3.4', 'header2': '2.0'},
'nextdate': {'header0': 'nextdate', 'header1': '4', 'header2': '6'}}
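Refine it a bit by skipping the first column in both the headers and each row, so the key is not repeated inside the inner dictionary: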
In [95]: {row[0]: {h:v for h,v in zip(headers[1:], row[1:])} for row in columnData[1:]}
Out[95]:
{'afterthat': {'header1': '7', 'header2': '8'},
'mydate': {'header1': '3.4', 'header2': '2.0'},
'nextdate': {'header1': '4', 'header2': '6'}}
I like dictionary comprehensions!
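For anyone who wants to run this outside an interactive session, here is a minimal self-contained sketch of the same steps. It assumes the sample data from the question, fed in through io.StringIO instead of pathToFile; the names sample, table and result are just illustrative.

```python
import io
import json

import numpy as np

# Sample data from the question, used here in place of pathToFile.
sample = io.StringIO(
    "header0,header1,header2\n"
    "mydate,3.4,2.0\n"
    "nextdate,4,6\n"
    "afterthat,7,8\n"
)

# Read everything as strings; the header line becomes row 0.
table = np.genfromtxt(sample, dtype=str, delimiter=',')

# Header names, without the first column (that column supplies the outer keys).
headers = table[0, 1:]

# Outer key: first field of each data row.
# Inner dict: remaining headers paired with the remaining fields.
result = {
    str(row[0]): {str(h): str(v) for h, v in zip(headers, row[1:])}
    for row in table[1:]
}

print(json.dumps(result, indent=4))
```

The str() calls only turn NumPy string scalars into plain Python strings, which keeps the result tidy if it is later serialized.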
Your dictionary of lists version:
In [100]: {row[0]:row[1:] for row in columnData[1:].tolist()}
Out[100]: {'afterthat': ['7', '8'], 'mydate': ['3.4', '2.0'], 'nextdate': ['4', '6']}
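As a possible alternative, genfromtxt can read the header line itself when called with names=True, which gives a structured array whose field names are the column headers. The sketch below is only one way to do it, and it assumes the same sample data (again via io.StringIO) and that numeric values are acceptable in place of strings, since dtype=None lets NumPy infer a type per column. The names arr, key_field and by_name are illustrative.

```python
import io

import numpy as np

# Sample data from the question, used here in place of pathToFile.
sample = io.StringIO(
    "header0,header1,header2\n"
    "mydate,3.4,2.0\n"
    "nextdate,4,6\n"
    "afterthat,7,8\n"
)

# names=True takes the field names from the first line;
# dtype=None infers a separate type for each column.
arr = np.genfromtxt(sample, dtype=None, delimiter=',', names=True,
                    encoding='utf-8')

# First field supplies the outer keys, the rest become inner keys.
key_field, *value_fields = arr.dtype.names

# .item() converts NumPy scalars to plain Python values.
by_name = {
    str(row[key_field]): {name: row[name].item() for name in value_fields}
    for row in arr
}

print(by_name)
```

Note that the inner values here are floats (for example 3.4 rather than '3.4'); if strings are required, the dtype=str approach above is the simpler route.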