I have created multiple data dictionaries with a year_quarter key. Then I used .describe() on one of the columns I was interested in studying, column A. Now I want to create a DataFrame with the statistics produced by .describe().
This is what I did:
H_cltn = {}  # original data dictionaries
stat_cltn = {}
QY = ['2013_1', '2013_2', '2013_3', '2013_4']
for item in QY:
    stat_cltn[item] = H_cltn[item]['A'].describe()
df = pd.DataFrame(['count','mean','std','min','25%','50%','75%','max'])
for item in QY:
    df[item] = pd.Series(stat_cltn[item])
But this gives me NaN values for the whole table.
You might be able to simplify along these lines:
QY = ['2013_1', '2013_2', '2013_3', '2013_4']
df = pd.DataFrame()
for item in QY:
    df = pd.concat([df, H_cltn[item]['A'].describe()], axis=1)  # possibly axis=0
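The NaN values in the question most likely come from index alignment: pd.DataFrame([...]) puts the statistic names into a column and leaves the default integer index 0-7, so assigning a Series indexed by 'count', 'mean', and so on finds no matching row labels. As a rough sketch of an alternative, the dict of describe() results can be passed straight to the DataFrame constructor. The contents of H_cltn below are made-up sample data, since the original dictionaries are not shown:

```python
import pandas as pd

QY = ['2013_1', '2013_2', '2013_3', '2013_4']

# Hypothetical stand-in for the original data dictionaries (not from the question)
H_cltn = {
    item: pd.DataFrame({'A': [1.0 * (i + 1), 2.0 * (i + 1), 3.0 * (i + 1)]})
    for i, item in enumerate(QY)
}

# One describe() Series per quarter, each indexed by the statistic names
stat_cltn = {item: H_cltn[item]['A'].describe() for item in QY}

# A dict of Series becomes one column per key; rows come from the shared index
df = pd.DataFrame(stat_cltn)
print(df)
```

Here the columns are labelled with the quarter keys and the rows with the statistic names, so no separate column of labels is needed.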