Search code examples
pythonpython-polars

Count of specific value of column in group


I am trying to achieve the following:

aggregate_commitment_df = commitment_df.groupby("sprint_start_date", "squad").agg(
    stories_committed=pl.count(pl.col("issue_type") == "Story"),
    spikes_committed=pl.count(pl.col("issue_type") == "Spike"),
    bugs_committed=pl.count(pl.col("issue_type") == "Bug"),
    story_points_committed=pl.sum("story_points"),
)

where, for example, stories_committed will be the count of all rows with issue_type == "Story" for each sprint_start_date and squad.


Solution

  • You'll want to use filter in the aggregation context before you use count.

    import polars as pl
    from datetime import date
    
    commitment_df = pl.DataFrame({
        "sprint_start_date": [date(2023,8,1),date(2023,8,2),date(2023,8,3),] * 8,
        "squad": ["team1", "team2"] * 12,
        "issue_type": ["Story"] * 10 + ["Spike"] * 8 + ["Bug"] * 6, 
        "story_points": [*range(4)] * 6,
    })
    
    aggregate_commitment_df = commitment_df.groupby("sprint_start_date", "squad").agg(
        stories_committed=pl.col("issue_type").filter(pl.col("issue_type") == "Story").count(),
        spikes_committed=pl.col("issue_type").filter(pl.col("issue_type") == "Spike").count(),
        bugs_committed=pl.col("issue_type").filter(pl.col("issue_type") == "Bug").count(),
        story_points_committed=pl.sum("story_points"),
    )
    
    print(aggregate_commitment_df)
    

    Results:

    shape: (6, 6)
    ┌───────────────────┬───────┬───────────────────┬──────────────────┬────────────────┬────────────────────────┐
    │ sprint_start_date ┆ squad ┆ stories_committed ┆ spikes_committed ┆ bugs_committed ┆ story_points_committed │
    │ ---               ┆ ---   ┆ ---               ┆ ---              ┆ ---            ┆ ---                    │
    │ date              ┆ str   ┆ u32               ┆ u32              ┆ u32            ┆ i64                    │
    ╞═══════════════════╪═══════╪═══════════════════╪══════════════════╪════════════════╪════════════════════════╡
    │ 2023-08-03        ┆ team2 ┆ 1                 ┆ 2                ┆ 1              ┆ 8                      │
    │ 2023-08-02        ┆ team1 ┆ 1                 ┆ 2                ┆ 1              ┆ 4                      │
    │ 2023-08-01        ┆ team1 ┆ 2                 ┆ 1                ┆ 1              ┆ 4                      │
    │ 2023-08-03        ┆ team1 ┆ 2                 ┆ 1                ┆ 1              ┆ 4                      │
    │ 2023-08-02        ┆ team2 ┆ 2                 ┆ 1                ┆ 1              ┆ 8                      │
    │ 2023-08-01        ┆ team2 ┆ 2                 ┆ 1                ┆ 1              ┆ 8                      │
    └───────────────────┴───────┴───────────────────┴──────────────────┴────────────────┴────────────────────────┘