(This is my first question on Stack Overflow, so please be indulgent.)
I am coding an ANN on a data set that contains, among others, the following columns:
[... , 'labels_column', 'Content %']
I would like labels_column to be encoded to numeric columns (as with a OneHotEncoder, which I am using now), but with each row's value from the 'Content %' column in place of the 1.
For example:
labels_column | Content % |
---|---|
label_1 | 37 |
label_2 | 24 |
label_3 | 12 |
label_2 | 60 |
After the transform, this should become:
label_1 | label_2 | label_3 |
---|---|---|
37 | 0 | 0 |
0 | 24 | 0 |
0 | 0 | 12 |
0 | 60 | 0 |
And not:
label_1 | label_2 | label_3 | Content % |
---|---|---|---|
1 | 0 | 0 | 37 |
0 | 1 | 0 | 24 |
0 | 0 | 1 | 12 |
0 | 1 | 0 | 60 |
I haven't managed to do it with masks or other tricks yet...
Thanks a lot for your help!
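You could use a broadcasting trick: one-hot encode the labels, then multiply by 'Content %' taken as an (n, 1) array, so that each row is scaled by its own value:
import pandas as pd
df = pd.DataFrame({'labels_column': ['label_1','label_2','label_3','label_2'],
'Content %': [37, 24, 12, 60]})
pd.get_dummies(df['labels_column']) * df[['Content %']].values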
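As a variation on the same idea, here is a minimal sketch that uses `DataFrame.mul` with `axis=0` instead of raw NumPy broadcasting. The `dtype=int` argument and the names `dummies` and `encoded` are my own choices for illustration; the data is the example from the question.

```python
import pandas as pd

df = pd.DataFrame({
    "labels_column": ["label_1", "label_2", "label_3", "label_2"],
    "Content %": [37, 24, 12, 60],
})

# One-hot encode the labels; dtype=int requests 0/1 integers
# (recent pandas versions return boolean dummies by default).
dummies = pd.get_dummies(df["labels_column"], dtype=int)

# Multiply every row of the dummies by that row's 'Content %' value.
# axis=0 matches the Series to the rows (by index), not to the columns.
encoded = dummies.mul(df["Content %"], axis=0)

print(encoded)
```

The result keeps one column per label and the original row index, so it can be joined back onto the other feature columns before feeding the network.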