I'm trying to create a new column that tallys up the number of times someone was paid for a job - regardless of if it was all of the money or just some of the money. So for each row, if it says "yes" or "partial" or "paid" in the job columns then I want a count of this in the new column.
My actual data has 15 different job columns that I want to "sum" across.
So before looks like:
Name | Job1 | Job2 |
---|---|---|
tom | Yes | No |
nick | Partial | Yes |
juli | No | No |
And I'd like afterwards to look like:
Name | Job1 | Job2 | Received_money |
---|---|---|---|
tom | Yes | No | 1 |
nick | Partial | Yes | 2 |
juli | No | No | 0 |
Current code
df['Received_money'] = df[['Job1', 'Job2']].apply(lambda row: len(row[row == 'Yes']), axis=1)
This is my current code and it partially does what I want. It adds up the number of times it says "Yes" in the columns listed. But:
"== 'partial'"
and "== 'paid'"
, and how to get it to give 1 point (so to speak) for each time these occur(Example data)
import pandas as pd
# initialize list of lists
data = [['tom', "Yes", "No"], ['nick', "Partial", "Yes"], ['juli', "No", "No"]]
# Create the pandas DataFrame
df = pd.DataFrame(data, columns=['Name', 'Job1', 'Job2'])
Thank you!
I added 2 more Job columns just for testing. This should satisfy your requirement.
data = [['tom', "Yes", "No","Partial","Paid"], ['nick', "Partial", "Yes"], ['juli', "No", "No","Partial","Paid"]]
df = pd.DataFrame(data, columns=['Name', 'Job1', 'Job2','Job3','Job4'])
job_cols =['Job1','Job2','Job3','Job4']
paid_values = ['Yes','Paid','Partial']
df['Received_money'] = df[job_cols].apply(lambda row : len([r for r in row if r in paid_values]),axis=1)
print(df)