Using Python, I want to count the number of cells in a row that has data in it, in a pandas data frame and record the count in the leftmost cell of the row.
To count the number of cells missing data in each row, you probably want to do something like this:
df.apply(lambda x: x.isnull().sum(), axis='columns')
Replace df
with the label of your data frame.
You can create a new column and write the count to it using something like:
df['MISSING'] = df.apply(lambda x: x.isnull().sum(), axis='columns')
The column will be created at the end (rightmost) of your data frame.
You can move your columns around like this:
df = df[['Count', 'M', 'A', 'B', 'C']]
I'm wondering if your missing cells are actually empty strings as opposed to NaN
values. Can you confirm? I copied your screenshot into an Excel workbook. My full code is below:
df = pd.read_excel('count.xlsx', na_values=['', ' '])
df.head() # You should see NaN for empty cells
df['M']=df.apply(lambda x: x.isnull().sum(), axis='columns')
df.head() # Column M should report the values: first row: 0, second row: 1, third row: 2
df = df[['Count', 'M', 'A', 'B', 'C']]
df.head() # Column order should be Count, M, A, B, C
Notice the na_values
parameter in the pd.read_excel
method.