Appending sample number to X-Labels in altair


I would like to automatically append the sample size (in parentheses) to each x-axis label of an Altair figure. I am open to doing this outside of Altair, but I thought there might be a way to do it at the figure level with Altair/Vega-Lite. The code below is based on an example from the Altair website (it uses the cars dataset from vega_datasets), with a hacky manual workaround in which I rename one of the labels explicitly. In this case, I have added the sample size of 73 to Europe.

import altair as alt
from vega_datasets import data

df = data.cars()
df['Origin'] = df['Origin'].replace({'Europe':'Europe (n=73)'})

alt.Chart(df).transform_density(
    'Miles_per_Gallon',
    as_=['Miles_per_Gallon', 'density'],
    extent=[5, 50],
    groupby=['Origin']
).mark_area(orient='horizontal').encode(
    y='Miles_per_Gallon:Q',
    color='Origin:N',
    x=alt.X(
        'density:Q',
        stack='center',
        impute=None,
        title=None,
        axis=alt.Axis(labels=False, values=[0],grid=False, ticks=True),
    ),
    column=alt.Column(
        'Origin:N',
        header=alt.Header(
            titleOrient='bottom',
            labelOrient='bottom',
            labelPadding=0,
        ),
    )
).properties(
    width=100
).configure_facet(
    spacing=0
).configure_view(
    stroke=None
)


Solution

  • You could use pandas to compute the group sizes, build the replacement labels from them, and assign the result to a new dataframe column:

    import altair as alt
    from vega_datasets import data
    
    df = data.cars()
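    # Number of rows per origin, indexed by origin name
    group_sizes = df.groupby('Origin').size()
    # Series mapping each origin to its label, e.g. 'Europe' -> 'Europe (n=73)'
    origin_labels = group_sizes.index + ' (n=' + group_sizes.astype(str) + ')'
    # Look up the label for every row; the original 'Origin' column is kept
    df['Origin_with_count'] = df['Origin'].map(origin_labels)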
    
    alt.Chart(df).transform_density(
        'Miles_per_Gallon',
        as_=['Miles_per_Gallon', 'density'],
        extent=[5, 50],
        # Keep 'Origin' in the groupby so it is still available for the color encoding
        groupby=['Origin_with_count', 'Origin']
    ).mark_area(orient='horizontal').encode(
        y='Miles_per_Gallon:Q',
        color='Origin:N',
        x=alt.X(
            'density:Q',
            stack='center',
            impute=None,
            title=None,
            axis=alt.Axis(labels=False, values=[0],grid=False, ticks=True),
        ),
        column=alt.Column(
            'Origin_with_count:N',
            header=alt.Header(
                title=None,
                labelOrient='bottom',
                labelPadding=0,
            ),
        )
    ).properties(
        width=100
    ).configure_facet(
        spacing=0
    ).configure_view(
        stroke=None
    )
    

    There may be a more elegant way to do this with labelExpr, but I am not sure.