Search code examples
pandasdataframegroup-byiterationcounter

Python DataFrame : generate 2 counters in a groupby counter1 from 1 to 3 and counter2 counts everytime counter1 is reset to 1


In pandas dataframe, I'm trying to generate , 2 counters in a df.groupby on type_pgm :

=> counter1 looping from 1 to 3

=> counter2 counts everytime counter1 restart to 1

Here is a df_sample :

df_sample = pd.DataFrame([[1,"pgm_1","type_A"],
                          [1,"pgm_1","type_A"],
                          [1,"pgm_1","type_A"],
                          [2,"pgm_2","type_A"],
                          [2,"pgm_2","type_A"],
                          [2,"pgm_2","type_A"],
                          [2,"pgm_2","type_A"],
                          [2,"pgm_2","type_A"],
                          [3,"pgm_3","type_B"],
                          [3,"pgm_3","type_B"],
                          [3,"pgm_3","type_B"],
                          [3,"pgm_3","type_B"],
                          [4,"pgm_4","type_B"]
                          ])

df_sample.columns = ["pgm_id","pgm_name","pgm_type"]

print(df_sample)

    pgm_id pgm_name pgm_type
0        1    pgm_1   type_A
1        1    pgm_1   type_A
2        1    pgm_1   type_A
3        2    pgm_2   type_A
4        2    pgm_2   type_A
5        2    pgm_2   type_A
6        2    pgm_2   type_A
7        2    pgm_2   type_A
8        3    pgm_3   type_B
9        3    pgm_3   type_B
10       3    pgm_3   type_B
11       3    pgm_3   type_B
12       4    pgm_4   type_B

Here is the targeted df_target :

df_target = pd.DataFrame([[1,"pgm_1","type_A",1,1],
                          [1,"pgm_1","type_A",2,1],
                          [1,"pgm_1","type_A",3,1],
                          [2,"pgm_2","type_A",1,2],
                          [2,"pgm_2","type_A",2,2],
                          [2,"pgm_2","type_A",3,2],
                          [2,"pgm_2","type_A",1,3],
                          [2,"pgm_2","type_A",2,3],
                          [3,"pgm_3","type_B",1,1],
                          [3,"pgm_3","type_B",2,1],
                          [3,"pgm_3","type_B",3,1],
                          [3,"pgm_3","type_B",1,2],
                          [4,"pgm_4","type_B",2,2]
                          ])

df_target.columns = ["pgm_id","pgm_name","pgm_type","counter1","counter2"]

print(df_target)
    pgm_id pgm_name pgm_type  counter1  counter2
0        1    pgm_1   type_A         1         1
1        1    pgm_1   type_A         2         1
2        1    pgm_1   type_A         3         1
3        2    pgm_2   type_A         1         2
4        2    pgm_2   type_A         2         2
5        2    pgm_2   type_A         3         2
6        2    pgm_2   type_A         1         3
7        2    pgm_2   type_A         2         3
8        3    pgm_3   type_B         1         1
9        3    pgm_3   type_B         2         1
10       3    pgm_3   type_B         3         1
11       3    pgm_3   type_B         1         2
12       4    pgm_4   type_B         1         2

I tried many ways unsuccessfull...

Thanks a lot in advance for your help


Solution

  • I hope I've understood your question right:

    df_sample["counter1"] = (
        df_sample.groupby(["pgm_type", "pgm_name"])
        .apply(lambda x: pd.Series(i % 3 + 1 for i in range(len(x))))
        .values
    )
    
    df_sample["counter2"] = (
        df_sample.groupby("pgm_type")
        .apply(lambda x: pd.Series(i // 3 + 1 for i in range(len(x))))
        .values
    )
    
    print(df_sample)
    

    Prints:

        pgm_id pgm_name pgm_type  counter1  counter2
    0        1    pgm_1   type_A         1         1
    1        1    pgm_1   type_A         2         1
    2        1    pgm_1   type_A         3         1
    3        2    pgm_2   type_A         1         2
    4        2    pgm_2   type_A         2         2
    5        2    pgm_2   type_A         3         2
    6        2    pgm_2   type_A         1         3
    7        2    pgm_2   type_A         2         3
    8        3    pgm_3   type_B         1         1
    9        3    pgm_3   type_B         2         1
    10       3    pgm_3   type_B         3         1
    11       3    pgm_3   type_B         1         2
    12       4    pgm_4   type_B         1         2