I have the following dataset:
df =
id test score date
1 A 1 2000-01-01
1 B 3 2000-01-01
1 A 1 2000-01-30
1 B 5 2000-01-30
2 A 2 2000-01-01
2 B 4 2000-01-01
3 A 6 2000-01-01
4 A 2 2000-01-01
I want to group_by id
and date
in order to get columns containing the type of the test for the same id
and date
:
df_outcome =
id testA testB date
1 1 3 2000-01-01
1 1 5 2000-01-30
2 2 4 2000-01-01
note that individuals 3 and 4 do not have test A and B in the same date and therefore are excluded from the analysis.
(df.pivot(index=["id", "date"], columns="test", values="score")
.dropna()
.rename_axis(columns=None)
.add_prefix("test")
.reset_index()
.convert_dtypes())
to get
id date testA testB
0 1 2000-01-01 1 3
1 1 2000-01-30 1 5
2 2 2000-01-01 2 4