I have created a dataframe called df as follows:
import pandas as pd
d = {'feature1': [1, 22, 45, 78, 78], 'feature2': [33, 2, 2, 65, 65], 'feature3': [100, 2, 359, 87, 2]}
df = pd.DataFrame(data=d)
print(df)
The dataframe looks like this:
I want to create two new columns called Freq_1 and Freq_2 that count, for each record, how many times the number 1 and the number 2 appear, respectively. I'd like the resulting dataframe to look like this:
Let's take a look at the column called Freq_1: 1 appears only once across the whole first record; in the other records, 1 never appears.
Now let's take a look at the column called Freq_2: in the first record, 2 doesn't appear; in the second record, 2 appears twice.
How do I create the columns Freq_1 and Freq_2 in pandas?
Try this, which counts how often each value from 0 to 9 appears in every row:
freq = {
i: df.eq(i).sum(axis=1) for i in range(10)
}
pd.concat([df, pd.DataFrame(freq).add_prefix("Freq_")], axis=1)
Result:
feature1  feature2  feature3  Freq_0  Freq_1  Freq_2  Freq_3  Freq_4  Freq_5  Freq_6  Freq_7  Freq_8  Freq_9
       1        33       100       0       1       0       0       0       0       0       0       0       0
      22         2         2       0       0       2       0       0       0       0       0       0       0
      45         2       359       0       0       1       0       0       0       0       0       0       0
      78        65        87       0       0       0       0       0       0       0       0       0       0
      78        65         2       0       0       1       0       0       0       0       0       0       0
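If you only need the two columns from the question, the same idea can be narrowed to the values 1 and 2. This is a minimal sketch, not the only way to do it. It assumes the counts should cover only the original feature columns, and `feature_cols` is a name introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "feature1": [1, 22, 45, 78, 78],
    "feature2": [33, 2, 2, 65, 65],
    "feature3": [100, 2, 359, 87, 2],
})

# Snapshot the original columns first (assumption: only these should be
# searched), so a newly added Freq_* column is never counted itself.
feature_cols = list(df.columns)

for value in (1, 2):
    # eq() gives a boolean frame; summing across axis=1 counts matches per row.
    df[f"Freq_{value}"] = df[feature_cols].eq(value).sum(axis=1)

print(df)
# Freq_1 is [1, 0, 0, 0, 0] and Freq_2 is [0, 2, 1, 0, 1]
```

Assigning the columns directly avoids building the intermediate dict and the `pd.concat` call, which is convenient when only a couple of values are of interest.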