
Interactive proportional value of a pandas column displayed using altair pie chart in python?


I have a sample dataframe created using the snippet below

import random

import pandas as pd

categories = [
    "Network Issue",
    "Hardware Failure",
    "Software Bug",
    "User Error",
    "Other",
]
locations = ["Location A", "Location B", "Location C", "Location D", "Location E"]
data = {
    "Root Cause": [random.choice(categories) for _ in range(20)],
    "LocationName": [random.choice(locations) for _ in range(20)],
}
df = pd.DataFrame(data)
df

I wanted to get the following -

  1. A pie (or donut) chart using Altair's alt.Chart(source).mark_arc() method, where theta is based on the proportion of each Root Cause value, i.e. (number of rows with that 'Root Cause' / total number of rows).
  2. Allow users to select an arc of the pie (or donut) chart and have a table display those rows based on selection.

I have tried creating the pie chart using the following

import altair as alt
# Create the pie chart
pie_chart = (
    alt.Chart(df)
    .mark_arc()
    .encode(
        theta=alt.Theta(field="Root Cause", type="nominal", aggregate="count"),
        color=alt.Color(field="Root Cause", type="nominal"),
        tooltip=[
            "Root Cause",
            alt.Tooltip(field="Root Cause", type="nominal"),
            alt.Tooltip(field="count()", title="Percentage"),
        ],
    )
)

pie_chart

This gives me a pie chart, but it doesn't show the percentage value, either on hover or as text. Also, what would be a way to link the selection to a table adjacent to the pie chart?

I have tried looking at the docs here and here but haven't had much luck. Please advise.


Solution

  • Thanks for the recommendations @joelostblom. I did the following to achieve what I needed -

    1. Aggregate the original dataframe to reduce it to a smaller number of rows and then draw the pie chart based on this aggregated dataframe
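
The aggregation itself is not shown in the answer. A minimal sketch of one way to do it with pandas, assuming the aggregated frame is the `forPieChart` used below and that `Percentage` holds a fraction between 0 and 1 (both are assumptions, not confirmed by the original post):

```python
import random

import pandas as pd

random.seed(0)  # fixed seed so the sketch is reproducible

categories = ["Network Issue", "Hardware Failure", "Software Bug", "User Error", "Other"]
df = pd.DataFrame({"Root Cause": [random.choice(categories) for _ in range(20)]})

# One row per root cause, with its share of all rows (values between 0 and 1).
# The column name "Percentage" is assumed to match the pie chart encoding.
forPieChart = (
    df["Root Cause"]
    .value_counts(normalize=True)
    .rename("Percentage")
    .rename_axis("Root Cause")
    .reset_index()
)
```

The result has at most one row per category, so the chart receives a handful of rows instead of the full dataset.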

    Pie chart using the reduced/aggregated dataframe

    selection = alt.selection_point(fields=['Root Cause'])

    pieChart = (
        alt.Chart(forPieChart)
        .mark_arc(innerRadius=80)
        .encode(
            theta=alt.Theta(field="Percentage", type="quantitative"),
            color=alt.Color(
                field="Root Cause", type="nominal", legend=alt.Legend(orient="left")
            ),
            tooltip=["Root Cause"],
        )
        .properties(title="Proportion of root cause", width=300, height=300)
    ).add_params(selection)
    
    2. Declare the base table object, following the code from the linked docs example

    Base chart for data tables

    ranked_text = alt.Chart(pieChartData).mark_text(align='right', stroke=None, strokeWidth=0).encode(
        y=alt.Y('row_number:O').axis(None)
    ).transform_filter(
        selection
    ).transform_window(
        row_number='row_number()'
    ).transform_filter(
        alt.datum.row_number < 20
    ).properties(view=alt.ViewConfig(strokeWidth=0))
    

    Note: the alt.ViewConfig(strokeWidth=0) property helped to hide the column borders. After this, create the 'columns' from the base table object like this:

    rootCauseField = drawTable(ranked_text, 'Root Cause:N', 'Root Cause')
    
    field1 = drawTable(ranked_text, 'Field1:Q', 'Field1')
    
    field2 = drawTable(ranked_text, 'Field2:Q', 'Field2')
    

    Combine all the 'columns' next to each other ...

    text = alt.hconcat(rootCauseField, field1, field2)
    

    Then concatenate the pie and table next to each other using

    pieChart | text
    

    Hope this is of use to others if they need something similar.