python pandas dataframe count tablecolumn

Pandas calculate the number of columns of a given name there are that have a value in a row

I have this dataset where I have some columns (not important to the calculations) and then many columns with same starting name. I want to calculate the sum of those columns per one row which contains else than NaN-value. The set looks something like this:

id	something	number1	number2	number3	number4
1	105	200	NaN	NaN	50
2	300	2	1	1	33
3	20	1	NaN	NaN	NaN

So I want to create new column that contains the length of the number columns that have a value. So the final dataset would look like this:

id	something	number1	number2	number3	number4	sum_columns
1	105	200	NaN	NaN	50	2
2	300	2	1	1	33	4
3	20	1	NaN	NaN	NaN	1

I know I can calculate the length of columns that start by specific name something like this:

df[df.columns[pd.Series(df.columns).str.startswith('number')]]

but I cant figure out, how can I add condition that there has to be other than NaN value and also how to apply it to every row. I think it could be done with lambda? but haven't succeeded yet.

Solution

# filter column on 'number' and count
df['sum_columns']=df.filter(like='number').count(axis=1)
df

    id  something   number1     number2     number3     number4     sum_columns
0    1      105         200         NaN         NaN       50.0          2
1    2      300           2         1.0         1.0       33.0          4
2    3       20           1         NaN         NaN        NaN          1

PS: Your first DF and second DF, the NaN count don't match. I used the second DF in the solution