What is the best way to filter groups by two lambda conditions and create a new column based on the conditions?

This is my DataFrame:

import pandas as pd

df = pd.DataFrame(
    {
        'a': ['x', 'x', 'x', 'x', 'y', 'y', 'y', 'y', 'z', 'z', 'z', 'p', 'p', 'p', 'p'],
        'b': [1, -1, 1, 1, -1, 1, 1, -1, -1, -1, -1, 1, 1, 1, 1]
    }
)

And this the expected output. I want to create column c:

    a  b    c
0   x  1    first
1   x -1    first
2   x  1    first
3   x  1    first
4   y -1    second
5   y  1    second
6   y  1    second
7   y -1    second
11  p  1    first
12  p  1    first
13  p  1    first
14  p  1    first

Groups are defined by column a. I want to filter df and choose groups that either their first b is 1 OR their second b is 1.

I did this by this code:

df1 = df.groupby('a').filter(lambda x: (x.b.iloc[0] == 1) | (x.b.iloc[1] == 1))

And for creating column c for df1, again groups should be defined by a and then if for each group first b is 1 then c is first and if the second b is 1 then c is second.

Note that for group p, both first and second b is 1, for these groups I want c to be first.

Maybe the way that I approach the issue is totally wrong.

Solution

A generic method that works with any number of positions for the first 1:

d = {0: 'first', 1: 'second'}

s = (df.groupby('a')['b']
       .transform(lambda g: g.reset_index()[g.values==1]
                  .first_valid_index())
       .replace(d)
     )

out = df.assign(c=s).dropna(subset=['c'])

Notes:

if you remove the replace step you will get an integer in c
if you use map in place of replace you can ignore the positions that are not defined as a dictionary key

Output:

    a  b       c
0   x  1   first
1   x -1   first
2   x  1   first
3   x  1   first
4   y -1  second
5   y  1  second
6   y  1  second
7   y -1  second
11  p  1   first
12  p  1   first
13  p  1   first
14  p  1   first

Example from comments:

df = pd.DataFrame({'a': ['x', 'x', 'x', 'x', 'y', 'y', 'y', 'y', 'z', 'z', 'z', 'p', 'p', 'p', 'p'],
                  'b': [1, -1, 1, 1, -1, 1, 1, -1, -1, -1, 1, 1, 1, 1, 1]})

d = {0: 'first', 1: 'second'}

s = (df.groupby('a')['b']
       .transform(lambda g: g.reset_index()[g.values==1]
                  .first_valid_index())
       .map(d)
     )

out = df.assign(c=s).dropna(subset=['c'])

    a  b       c
0   x  1   first
1   x -1   first
2   x  1   first
3   x  1   first
4   y -1  second
5   y  1  second
6   y  1  second
7   y -1  second
11  p  1   first
12  p  1   first
13  p  1   first
14  p  1   first

You can also only filter the rows with:

m1 = df.groupby('a').cumcount().le(1)
m2 = df['b'].eq(1)
out = df.loc[df['a'].isin(df.loc[m1&m2, 'a'])]