I want to see changes in cell values when comparing multiple dataframes. These dataframes are formed from JSON data, which forms many-column tables, and I cannot easily change this data source. Let's say there are 10 dataframes with 10 rows and 10 columns (equally labelled). I'd like to compare the information, by turning each dataframe into 100 row and 1 column.
for 3x3 example:
import pandas as pd
data = [{'a':1,'b':2,'c':3},{'a':10,'b':20,'c':30},{'a':100,'b':200,'c':300}]
df = pd.DataFrame(data)
df.index = ['x','y','z']
gives this table
a b c
x 1 2 3
y 10 20 30
z 100 200 300
but I would like to have:
col
xa 1
xb 2
xc 3
ya 10
yb 20
yc 30
za 100
zb 200
zc 300
so that I may then add many columns and compare values changes.
Can somebody advise me on how to do this using pandas? It is okay if a third colum is required, i.e.:
1 2 3
x a 1
x b 2
x c 3
y a 10
y b 20
y c 30
z a 100
z b 200
z c 300
Use DataFrame.stack
with Series.to_frame
and then flatten MultiIndex
to index
with map
:
df_us = df.stack().to_frame('col')
df_us.index = df_us.index.map(lambda x: f'{x[0]}{x[1]}')
print (df_us)
col
xa 1
xb 2
xc 3
ya 10
yb 20
yc 30
za 100
zb 200
zc 300
For 3 columns:
df_us = df.stack().reset_index()
df_us.columns = [0,1,2]
print (df_us)
0 1 2
0 x a 1
1 x b 2
2 x c 3
3 y a 10
4 y b 20
5 y c 30
6 z a 100
7 z b 200
8 z c 300