i'm using multiindex in pandas.
Example1 :
'info' '2002' '2003' '2004'
'PID' 'Age' 't1' 't2' 't1' 't2' 't1' 't2'
101 30 Nan 2 2 1 Nan Nan
102 28 2 5 1 Nan 10 100
I want to know the number of things that are worth either.
[num] and [num] is True
[Nan] and [num] is True
[Nan] and [Nan] is False
Example2 :
'info'
'PID' 'Age' 'count'
101 30 2
102 28 3
so i think using a any(), but i did not solve it.
I think need if MultiIndex
also in index first check notna
of all values, then check at least one per first level of MultiIndex in columns by DataFrameGroupBy.any
and last count True
s by sum
:
print (df.index)
MultiIndex(levels=[[101, 102], [28, 30]],
labels=[[0, 1], [1, 0]])
df = df.notna().groupby(axis=1, level=0).any().sum(axis=1)
#oldier pandas versions
#df = df.notnull().groupby(axis=1, level=0).any().sum(axis=1)
If there is no MultiIndex
in index add drop
for remove info
level:
print (df.index)
RangeIndex(start=0, stop=2, step=1)
s = df.drop('info', level=0, axis=1).notnull().groupby(axis=1, level=0).any().sum(axis=1)
print (s)
0 2
1 3
dtype: int64
And if need create column by position use insert
:
df.insert(2, ('info','count'), s)
print (df)
info '2002' '2003' '2004'
'PID' 'Age' count 't1' 't2' 't1' 't2' 't1' 't2'
0 101 30 2 NaN 2 2 1.0 NaN NaN
1 102 28 3 2.0 5 1 NaN 10.0 100.0