How can I transform with Pandas & NumPy this DataFrame:
to DataFrame like:
Name Year Nb
------+--------+--------+-------
0 | A | 2021 | 5.0
1 | A | 2020 | 4.0
2 | A | 2019 | 10.0
3 | A | 2018 | 4.0
4 | A | 2017 | 4.0
...
k | A-Jay | 2021 | 5.0
k+1 | A-Jay | 2020 | 6.0
...
l+i | A.J. | 2019 | 3.0
m | Aaban | 2021 | 4.0
m+1 | Aaban | 2020 | 4.0
...
?
Here's a way (probably the most elegant) using melt()
:
out = ( df
.melt(id_vars='Name', var_name='Year', value_name='Nb')
.dropna()
.sort_values(['Name','Year'], ascending=[True,False])
.reset_index(drop=True) )
Here's another way, this one using stack()
:
out = ( df
.set_index('Name')
.stack()
.reset_index()
.rename(columns={'level_1':'Year',0:'Nb'})
.sort_values(['Name','Year'],ascending=[True,False])
.reset_index(drop=True) )
Sample input:
Name 2021 2022 2023 2024
0 a NaN 4.0 None 0.0
1 b 2.0 NaN None NaN
2 c 3.0 6.0 None 0.0
Output:
Name Year Nb
0 a 2024 0.0
1 a 2022 4.0
2 b 2021 2.0
3 c 2024 0.0
4 c 2022 6.0
5 c 2021 3.0