python altair

Sort a stacked bar chart with transform fold Altair


I was wondering if there is any way to sort the bars within a stacked bar chart when the chart contains a transform fold. My data is in "wide form", with one row per ID and one column per probability (see small_df in the code below).

I want each row of the dataframe to be represented by its own bar in the chart, so I first need to do a transform fold. This all works fine, and the segments within each bar appear to be sorted alphabetically by default. However, let's say I want the order of the segments within each stacked bar to be different from the default.
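For reference, the fold step can be pictured as a wide-to-long reshape. The sketch below uses pandas `melt` as a rough analogy and assumes the default output names `key` and `value` that `transform_fold` produces when `as_` is not given. It is an illustration of the reshaped data, not what Altair runs internally.

```python
import pandas as pd

# Same wide-form data as in the question.
small_df = pd.DataFrame({
    'Probability A': [0.2, 0.8, 0.4],
    'Probability B': [0.5, 0.1, 0.1],
    'Probability C': [0.3, 0.1, 0.5],
    'ID': ['John', 'Sally', 'Frank'],
})

# Rough pandas analogue of transform_fold: one row per (ID, key) pair.
# 'key' holds the former column name, 'value' holds the probability.
long_df = small_df.melt(id_vars='ID', var_name='key', value_name='value')

print(long_df)
```

After the fold, the field that identifies each segment is `key`, so any ordering of the segments has to be expressed in terms of `key` (or a field derived from it).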

I have tried:

  • changing the order of the columns within the transform_fold
  • adding a sort parameter within the X axis definition (both None so that it ignores the alphabetical default sort and the explicit ordered list as shown in the code below)
  • adding an order field
  • adding a sort parameter within the color definition

None of these approaches seem to make any difference from the default alphabetical sort! Is there a special way to do this when you have a transform fold?

Minimal code (with all attempts at ordering added) and output here:

import altair as alt
import pandas as pd

small_df = pd.DataFrame({'Probability A':[0.2,0.8,0.4],'Probability B':[0.5,0.1,0.1],'Probability C':[0.3,0.1,0.5],'ID':['John','Sally','Frank']})

alt.Chart(small_df).mark_bar().transform_fold(
        ['Probability A','Probability C','Probability B']
    ).encode(
        x=alt.X('value:Q',scale=alt.Scale(domain=[0, 1]),sort=['Probability A','Probability C','Probability B']),
        y='ID:N',
        order=['Probability A','Probability C','Probability B'],
        color=alt.Color('key:N',sort=['Probability A','Probability C','Probability B'],scale=alt.Scale(domain=['Probability A','Probability B','Probability C'],range=['red', 'orange', 'green']))
)

[Image: resulting stacked bar chart]


Solution

  • I think you got confused with all the different settings. Let me try to break this down.

    1. The order of the color legend is controlled by the domain of the scale argument. Each entry in range is paired with the key at the same position in domain.
    alt.Chart(small_df).mark_bar().transform_fold(
            ['Probability A','Probability B','Probability C']
        ).encode(
            x=alt.X('value:Q',scale=alt.Scale(domain=[0, 1])),
            y='ID:N',
            color=alt.Color('key:N',
                # arrange the key-color pairs in the order you want
                scale=alt.Scale(
                    domain=['Probability C','Probability A','Probability B'],
                    range=['red', 'orange', 'green']
                )
            ),
            #order=alt.Order('value:Q',sort='descending'),
            #order=alt.Order('key:N',sort='descending'),
    )
    


    2. To control the order of the segments inside each horizontal bar, use the order argument. By default, the segments within each bar are stacked in ascending order of key. In the code above, uncomment the line
    order=alt.Order('key:N',sort='descending'),
    

    This stacks the segments in descending order of key, i.e. the names of your folded columns.


    You can also order by other fields. For example, to stack the segments according to the value field, uncomment the line

    order=alt.Order('value:Q',sort='descending'),
    
