Current Format:
UNIQUE ID | NAME | AGE | DEP | RANK |
---|---|---|---|---|
001 | John | 10 | 4th | 1 |
002 | Priya | 11 | 4th | 2 |
003 | Jack | 15 | 5th | 2 |
004 | Jill | 14 | 5th | 1 |
Expected Format:
UNIQUE ID | NAME | COLUMN_NO |
---|---|---|
001 | John | 1 |
001 | 10 | 2 |
001 | 4th | 3 |
001 | 1 | 4 |
002 | Priya | 1 |
002 | 11 | 2 |
002 | 4th | 3 |
002 | 2 | 4 |
My starting point:
>>> df
UNIQUE ID NAME AGE DEP RANK
0 1 John 10 4th 1
1 2 Priya 11 4th 2
2 3 Jack 15 5th 2
3 4 Jill 14 5th 1
The basic transformation you need is provided by df.stack
, which results in:
0 UNIQUE ID 1
NAME John
AGE 10
DEP 4th
RANK 1
1 UNIQUE ID 2
NAME Priya
[...]
However, you want column UNIQUE ID
to be treated separately. This can be accomplished by making it the index:
>>> df.set_index('UNIQUE ID').stack()
UNIQUE ID
1 NAME John
AGE 10
DEP 4th
RANK 1
2 NAME Priya
AGE 11
DEP 4th
RANK 2
The last missing bit are the column names: you want them renamed to numbers. This could be accomplished two different ways: a) by re-assigning df.columns
(after having moved column UNIQUE ID to the index first):
df = df.set_index('UNIQUE_ID')
df.columns = range(1, 5)
or b) by df.renam
ing the columns:
df = df.set_index('UNIQUE_ID')
df = df.rename(columns={'NAME': 1, 'AGE': 2, 'DEP': 3, 'RANK': 4})
And finally you can convert the resulting Series back to a DataFrame. The most elegant way to get COLUMN NO
at the right place is using df.rename_axis
before stacking. All together as one expression (possibly better to split it up):
>>> (df.set_index('UNIQUE ID')
.rename(columns={'NAME': 1, 'AGE': 2, 'DEP': 3, 'RANK': 4})
.rename_axis('COLUMN NO', axis=1)
.stack()
.to_frame('NAME')
.reset_index())
UNIQUE ID COLUMN NO NAME
0 1 1 John
1 1 2 10
2 1 3 4th
3 1 4 1
4 2 1 Priya
5 2 2 11
6 2 3 4th
7 2 4 2
8 3 1 Jack
9 3 2 15
10 3 3 5th
11 3 4 2
12 4 1 Jill
13 4 2 14
14 4 3 5th
15 4 4 1
Things left out: reading the data; preserving the correct type: UNIQUE ID
only looks numeric, but has leading zeros that probably want to be preserved; so parsing them as a string would be better.