I have this dataframe:
   Char1  Char2  Char3
0      2      2      3
1      2      3      3
2      2      3      3
3      2      2      2
I need to create three columns (_1, _2, _3), each of which counts the occurrences of the values 1, 2 and 3 respectively in each row.
So the output would look like this:
   Char1  Char2  Char3  _1  _2  _3
0      2      2      3   0   2   1
1      2      3      3   0   1   2
2      2      3      3   0   1   2
3      2      2      2   0   3   0
For example, _1 is always equal to 0 because there are no 1's in any row.
How do I code that in Python?
I have tried this code (for _1 only):
df['_1'] = df[df.Char1 == 1].sum() + df[df.Char2 == 1].sum() + df[df.Char3 == 1].sum()
but I get "NaN".
DF constructor:
df = pd.DataFrame({'Char1':[2,2,2,2], 'Char2':[2,3,3,2], 'Char3':[3,3,3,2]})
You can iterate over [1, 2, 3], use eq to flag the cells equal to each number, and sum across columns (axis=1) to get the per-row count:
for num in [1, 2, 3]:
    df[f"_{num}"] = df[['Char1', 'Char2', 'Char3']].eq(num, axis=1).sum(axis=1)
Output:
   Char1  Char2  Char3  _1  _2  _3
0      2      2      3   0   2   1
1      2      3      3   0   1   2
2      2      3      3   0   1   2
3      2      2      2   0   3   0
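For reference, here is the same approach as a self-contained sketch, using the DF constructor from the question. The `cols` variable is just a name I introduced to keep the source columns fixed, so the new count columns are never counted themselves:

```python
import pandas as pd

df = pd.DataFrame({'Char1': [2, 2, 2, 2],
                   'Char2': [2, 3, 3, 2],
                   'Char3': [3, 3, 3, 2]})

# Hypothetical helper name: the columns to inspect, fixed before adding new ones.
cols = ['Char1', 'Char2', 'Char3']

for num in [1, 2, 3]:
    # eq() returns a boolean frame; summing along axis=1 counts the True cells per row.
    df[f"_{num}"] = df[cols].eq(num).sum(axis=1)

print(df)
```

As far as I can tell, the attempt in the question gives NaN because df[df.Char1 == 1].sum() returns a Series indexed by column names (Char1, Char2, Char3), which does not line up with the row index 0..3 when it is assigned to a new column.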