altair, vega-lite

Faceting concatenated images in altair-viz


This is a follow-up to "Altair: Sorting faceted “text” chart not reflecting expectation". I added one column to the dataframe. My goal is to group by the values in the "Marker" column so that I end up with a multi-faceted figure, each facet containing the bars and the text. The dataframe is below, followed by the code and image for a single chart before any faceting, and then the code for my best attempt. Is there a way to do this within Altair, or would you recommend grouping outside Altair with a for loop?

,Bug,Unknown,Level,LDA_Score,p_value,Marker
0,a,4.10808792666,Low,3.43193376894,0.0381678194757,GM
1,b,2.80231776318,High,2.86568860404,0.048078814719199996,GM
2,c,1.55012602444,High,3.0159901714,0.047006554908300004,GM
3,d,2.11298173821,High,2.94493334678,0.0120363750248,GM
4,e,2.08807237447,High,2.9096371889,0.0149437560986,GM
5,f,2.762619332479999,High,2.52323422148,0.040652301139,GM
6,g,4.390454714340001,Low,3.85075499081,0.029978515680400004,GM
7,h,3.32306083381,High,3.01988462626,0.0244409015043,GM
8,i,2.84614167157,High,2.97142565384,0.0438396924694,GM
9,j,4.51419624602,Low,3.84190054285,0.0460224914387,GM
10,k,4.027450677669999,High,3.52319882849,0.0113390729281,IFN
11,l,4.26967903787,Low,3.8458771734,0.00548234585386,IFN
12,m,1.7823168924,High,2.50020069082,0.0203578926278,IFN

This code and image show what it looks like without grouping by the final column:

y_sort = alt.EncodingSortField(field='LDA_Score', order='descending')

bars = alt.Chart(df).mark_bar().encode(
    alt.X('LDA_Score', title='LDA_Score (log10)', axis=alt.Axis(titleFontSize=14)),
    alt.Y("Bug:N", sort=y_sort, axis=alt.Axis(title=None, labelFontStyle='italic')),
    color=alt.Color('Level:N', legend=alt.Legend(title=None, labelFontSize=12, orient='right')),# scale=alt.Scale(domain=['>12weeks', '<12weeks'], range=['green', 'red'])),
    row=alt.Row('Level:N', header=alt.Header(title=None, labelFontSize=0), spacing=0),
).resolve_scale(
    y='independent'
)

text = alt.Chart(df).mark_text().encode(
    alt.Text('p_value:Q', format='.3e'),
    alt.Y("Bug:N", sort=y_sort, axis=None),
    row=alt.Row('Level:N', header=alt.Header(title=None, labelFontSize=0), spacing=0),
).resolve_scale(
    y='independent'
).properties(width=50, title="p_value"
)

FinalChart = alt.hconcat(bars, text, spacing=-10)\
    .configure_title(anchor='end', fontStyle='italic', fontSize=14)\
    .configure_axis(grid=True, gridOpacity=0.5).configure_view(opacity=0.5)
FinalChart.display()

[image: the chart without grouping by the final column]

I then tried the facet option in Altair/Vega-Lite. Here is the code, followed by what I get:

y_sort = alt.EncodingSortField(field='LDA_Score', order='descending')

bars = alt.Chart(df).mark_bar().encode(
    alt.X('LDA_Score', title='LDA_Score (log10)', axis=alt.Axis(titleFontSize=14)),
    alt.Y("Bug:N", sort=y_sort, axis=alt.Axis(title=None, labelFontStyle='italic')),
    color=alt.Color('Level:N', legend=alt.Legend(title=None, labelFontSize=12, orient='right')),# scale=alt.Scale(domain=['>12weeks', '<12weeks'], range=['green', 'red'])),
    row=alt.Row('Level:N', header=alt.Header(title=None, labelFontSize=0), spacing=0),
).resolve_scale(
    y='independent'
).facet(column='Marker')

text = alt.Chart(df).mark_text().encode(
    alt.Text('p_value:Q', format='.3e'),
    alt.Y("Bug:N", sort=y_sort, axis=None),
    row=alt.Row('Level:N', header=alt.Header(title=None, labelFontSize=0), spacing=0),
).resolve_scale(
    y='independent'
).properties(width=50, title="p_value"
).facet(column='Marker')

FinalChart = alt.hconcat(bars, text, spacing=-10)\
    .configure_title(anchor='end', fontStyle='italic', fontSize=14)\
    .configure_axis(grid=True, gridOpacity=0.5).configure_view(opacity=0.5)
FinalChart.display()

[image: the result of the facet attempt]

It is probably obvious, but what I want is something like this:

[image: the desired layout]


Solution

  • What you're after is to facet a concatenated chart; unfortunately, this is not supported by Altair or Vega-Lite. You can work around it by constructing the facet by hand: a facet operation is essentially a filter plus a concat, so you can build your desired chart like this:

    import altair as alt
    import pandas as pd
    from io import StringIO
    
    df = pd.read_csv(StringIO("""\
    ,Bug,Unknown,Level,LDA_Score,p_value,Marker
    0,a,4.10808792666,Low,3.43193376894,0.0381678194757,GM
    1,b,2.80231776318,High,2.86568860404,0.048078814719199996,GM
    2,c,1.55012602444,High,3.0159901714,0.047006554908300004,GM
    3,d,2.11298173821,High,2.94493334678,0.0120363750248,GM
    4,e,2.08807237447,High,2.9096371889,0.0149437560986,GM
    5,f,2.762619332479999,High,2.52323422148,0.040652301139,GM
    6,g,4.390454714340001,Low,3.85075499081,0.029978515680400004,GM
    7,h,3.32306083381,High,3.01988462626,0.0244409015043,GM
    8,i,2.84614167157,High,2.97142565384,0.0438396924694,GM
    9,j,4.51419624602,Low,3.84190054285,0.0460224914387,GM
    10,k,4.027450677669999,High,3.52319882849,0.0113390729281,IFN
    11,l,4.26967903787,Low,3.8458771734,0.00548234585386,IFN
    12,m,1.7823168924,High,2.50020069082,0.0203578926278,IFN
    """))
    
    y_sort = alt.EncodingSortField(field='LDA_Score', order='descending')
    
    bars = alt.Chart(df).mark_bar().encode(
        alt.X('LDA_Score', title='LDA_Score (log10)', axis=alt.Axis(titleFontSize=14)),
        alt.Y("Bug:N", sort=y_sort, axis=alt.Axis(title=None, labelFontStyle='italic')),
        color=alt.Color('Level:N', legend=alt.Legend(title=None, labelFontSize=12, orient='right')),# scale=alt.Scale(domain=['>12weeks', '<12weeks'], range=['green', 'red'])),
        row=alt.Row('Level:N', header=alt.Header(title=None, labelFontSize=0), spacing=0),
        # after filtering, each copy has a single column whose header shows the Marker value
        column=alt.Column('Marker:N', title=None),
    ).resolve_scale(
        y='independent'
    )
    
    text = alt.Chart(df).mark_text().encode(
        alt.Text('p_value:Q', format='.3e'),
        alt.Y("Bug:N", sort=y_sort, axis=None),
        row=alt.Row('Level:N', header=alt.Header(title=None, labelFontSize=0), spacing=0),
    ).resolve_scale(
        y='independent'
    ).properties(width=50, title="p_value"
    )
    
    # Build the facet by hand: one filtered (bars, text) pair per Marker value
    FinalChart = alt.hconcat(
        bars.transform_filter('datum.Marker == "GM"'),
        text.transform_filter('datum.Marker == "GM"'),
        bars.transform_filter('datum.Marker == "IFN"'),
        text.transform_filter('datum.Marker == "IFN"')
    ).configure_title(anchor='end', fontStyle='italic', fontSize=14)\
     .configure_axis(grid=True, gridOpacity=0.5).configure_view(opacity=0.5)
    FinalChart.display()
    

    [image: the resulting chart]