python, pandas, grouping

Counting groups of like values within the same column


I have a dataframe whose columns contain groups of 1's and -1's. Each column can contain any number of 1's or -1's grouped together, and each column may also begin with an unspecified number of NaN's. I would like to count the total number of values within each group, excluding the NaN's.

An example dataframe is below:

import pandas as pd

NaN = float('nan')
df = pd.DataFrame({'col1':[1,1,1,1,-1,-1,-1,1,1],
                   'col2':[NaN, NaN, NaN, 1,1,-1,-1,-1,-1],
                   'col3':[NaN,NaN,NaN,NaN,NaN,1,1,1,-1]})

I would like the output to be another dataframe as shown below:

df = pd.DataFrame({'col1':[4,4,4,4,3,3,3,2,2],
                   'col2':[NaN, NaN, NaN,2,2,4,4,4,4],
                   'col3':[NaN,NaN,NaN,NaN,NaN,3,3,3,1]})

Solution

  • Assuming a DataFrame/Series, you can group on runs of successive identical values and use groupby.transform('size'):

    import pandas as pd

    df = pd.DataFrame({'col': [1,1,1,1,1,-1,-1,-1,-1,1,1,1]})

    # ne(shift()) is True wherever the value changes; cumsum() turns that into one id per run
    df['out'] = (df.groupby(df['col'].ne(df['col'].shift()).cumsum())['col']
                   .transform('size')
                )
    

    Output:

        col  out
    0     1    5
    1     1    5
    2     1    5
    3     1    5
    4     1    5
    5    -1    4
    6    -1    4
    7    -1    4
    8    -1    4
    9     1    3
    10    1    3
    11    1    3
    

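    To see why this works, it helps to inspect the intermediate grouping key: ne(shift()) is True at the start of every new run of identical values, and cumsum() turns those flags into a distinct id per run. A quick check, continuing from the df above (group_id is just a throwaway name here):

    # run identifiers used as the groupby key
    group_id = df['col'].ne(df['col'].shift()).cumsum()
    print(group_id.tolist())
    # [1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3]
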
    If your input is a list, there's no need for pandas; go with itertools.groupby/itertools.chain:

    from itertools import groupby, chain

    lst = [1,1,1,1,1,-1,-1,-1,-1,1,1,1]
    # for each run, compute its length once (walrus operator) and repeat it that many times
    out = list(chain.from_iterable([l := len(list(g))] * l for _, g in groupby(lst)))
    

    Output:

    [5, 5, 5, 5, 5, 4, 4, 4, 4, 3, 3, 3]
    
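    If the walrus operator is not available (it requires Python 3.8+), the same idea works with a plain loop, reusing lst and groupby from above; this is just an equivalent sketch, not a different algorithm:

    out = []
    for _, g in groupby(lst):
        n = len(list(g))     # size of the current run
        out.extend([n] * n)  # repeat it once per element of the run
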
    Multiple columns

    If you have a DataFrame with multiple columns, you can transform each column independently with apply and mask the NaNs:

    import pandas as pd

    NaN = float('nan')
    df = pd.DataFrame({'col1':[1,1,1,1,-1,-1,-1,1,1],
                       'col2':[NaN, NaN, NaN, 1,1,-1,-1,-1,-1],
                       'col3':[NaN,NaN,NaN,NaN,NaN,1,1,1,-1]})

    # NaN != NaN, so every NaN starts its own group and never inflates a run's count;
    # mask(s.isna()) then puts NaN back in the output at those positions
    out = df.apply(lambda s: s.groupby(s.ne(s.shift()).cumsum())
                              .transform('size').mask(s.isna()))
    

    Output:

       col1  col2  col3
    0     4   NaN   NaN
    1     4   NaN   NaN
    2     4   NaN   NaN
    3     4   2.0   NaN
    4     3   2.0   NaN
    5     3   4.0   3.0
    6     3   4.0   3.0
    7     2   4.0   3.0
    8     2   4.0   1.0
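
    Note that the columns containing NaN are upcast to float (hence the 2.0/4.0). If you prefer integer counts while keeping the missing values, one option (assuming a pandas version with nullable integer dtypes, 0.24+) is to convert the result to Int64:

    # optional: nullable integer dtype keeps whole-number counts alongside missing values
    out = out.astype('Int64')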