Tags: python, pandas, altair, grouped-bar-chart

How to create a grouped bar chart without manipulating the DataFrame


I noticed that one can create grouped bar charts in Altair by evaluating a column of a DataFrame. My issue is that my data doesn't have each group as a value of a specific column.

So, is it possible to use the column names as group names in Altair, without evaluating a column or modifying the DataFrame?

Below is the DataFrame I have and the grouped bar chart I need:

[Image: the DataFrame and the desired grouped bar chart]


Solution

  • This is an example of wide-form data (see Long-form vs. Wide-form Data). To transform it to long-form data without modifying the DataFrame, you can use the Fold Transform.

    Once you've done this, you can follow the Grouped Bar Chart Example to make your chart. It might look something like this:

    import pandas as pd
    import altair as alt
    
    df = pd.DataFrame({
        "Job Stat": ['INV', "WRK", "CMP", "JRB"],
        "Revenue": [100, 200, 300, 400],
        "Total Income": [150, 250, 350, 450]
    })
    
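    # transform_fold reshapes the data inside the chart spec, so df is untouched.
    (
        alt.Chart(df)
        .transform_fold(["Revenue", "Total Income"], as_=["key", "value"])
        .mark_bar()
        .encode(
            x="key:N",          # one bar per folded column name
            y="value:Q",        # bar height from the folded values
            color="key:N",
            column="Job Stat",  # one facet (group) per job status
        )
    )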
    

    [Image: the resulting grouped bar chart]

    To make it closer to your example chart, you can tweak some of the label & legend settings:

    (
      alt.Chart(df)
        .transform_fold(["Revenue", "Total Income"], as_=["key", "value"])
        .mark_bar()
        .encode(
            alt.X('key:N', axis=None),
            alt.Y("value:Q"),
            alt.Color("key:N", legend=alt.Legend(title=None, orient='bottom')),
            alt.Column("Job Stat",
              sort=['INV', "WRK", "CMP", "JRB"],
              header=alt.Header(labelOrient="bottom", title=None)
            )
        )
    )
    

    [Image: the adjusted chart, with group labels and legend at the bottom]
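To get a feel for what the Fold Transform does with the two column names, it may help to look at a roughly equivalent long-form table built in pandas. This is only an illustrative sketch: `long_df` is a name chosen here, and `melt` is used as a stand-in for the fold, not as part of the chart code. The actual fold runs inside the chart when it renders and also keeps the original columns on each row, so the match is approximate.

```python
import pandas as pd

df = pd.DataFrame({
    "Job Stat": ["INV", "WRK", "CMP", "JRB"],
    "Revenue": [100, 200, 300, 400],
    "Total Income": [150, 250, 350, 450],
})

# melt returns a new DataFrame; df itself is left unchanged.
# Each (Job Stat, column name) pair becomes one row, with the former
# column name in "key" and the cell value in "value".
long_df = df.melt(
    id_vars="Job Stat",
    value_vars=["Revenue", "Total Income"],
    var_name="key",
    value_name="value",
)

print(long_df)
```

Each job status now appears twice, once per folded column, which is the shape the `key:N` and `value:Q` encodings in the charts above rely on.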