I have data containing a list of topics (topics 1-5; and 0 meaning no topic is assigned) and their value. I want to create a new column for each topic and fill the column with the value. Here's what the table looks like...
reviewId topic value
01 2 -4
02 2 9
03 0 -7
04 5 -1
05 1 38
What should I do to create a table looking like this?
reviewId | topic | value | t1 | t2 | t3 | t4 | t5 |
---|---|---|---|---|---|---|---|
01 | 2 | -4 | 0 | -4 | 0 | 0 | 0 |
02 | 2 | 9 | 0 | 9 | 0 | 0 | 0 |
03 | 0 | -7 | 0 | 0 | 0 | 0 | 0 |
04 | 5 | -1 | 0 | 0 | 0 | 0 | -1 |
05 | 1 | 38 | 38 | 0 | 0 | 0 | 0 |
Here's reproducible data to work on.
raw_df = pd.DataFrame({
'reviewId': ['01', '02', '03', '04', '05'],
'topic': [2, 2, 0, 5, 1],
'value': ['-4', '9', '-7', '-1', '38']})
Here's a link to a similar problem with this one.
df.join(pd.get_dummies(df["topic"])
.reindex(columns=range(1, 5+1), fill_value=0)
.mul(df["value"], axis="rows")
.add_prefix("t"))
and join it with the original frame to get
reviewId topic value t1 t2 t3 t4 t5
0 01 2 -4 0 -4 0 0 0
1 02 2 9 0 9 0 0 0
2 03 0 -7 0 0 0 0 0
3 04 5 -1 0 0 0 0 -1
4 05 1 38 38 0 0 0 0