Tags: python, data-visualization, altair, vega-lite, vega

Stacked text in a stacked area chart using Altair


Is it possible to place text marks inside the corresponding areas of a stacked area chart?

I used a median aggregate to get a single X and Y value per series; otherwise the text is drawn all along the edge of each area. The aggregate is not foolproof, though: if the chart is a little convoluted, the median X position may not be the best place for the label.

This is as far as I have got:

import pandas as pd
import altair as alt

X=[1,2,3,4,5,6,7,8,9,1,2,3,4,5,6,7,8,9,1,2,3,4,5,6,7,8,9,1,2,3,4,5,6,7,8,9]
V=[1,1,1,2,4,8,6,4,2,1,2,3,4,5,6,7,6,5,1,1,1,1,4,8,4,2,1,1,1,3,4,5,6,6,5,4]
key=['a', 'b', 'c', 'd']
K = [y for x in key for y in (x)*9]

demo = pd.DataFrame({'X': X, 'V': V, 'K': K})
a = alt.Chart(demo).mark_area().encode(
    x='X:O',
    y='V:Q',
    color='K:N'
)
t = alt.Chart(demo).mark_text().encode(
    x='median(X):O',
    y='median(V):Q',
    text=alt.Text('K:N',)
)
a+t


Issue

  • The text is not in its proper region.
  • The order of the text is also wrong.

I do understand why this happens: the Y position of the text is aggregated per series and is not stacked the way the areas are. What I don't know is how to solve it, or whether it is even doable as of now.
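To make the cause concrete, here is a small pandas-only sketch of what the text layer appears to compute. It assumes that the `text='K:N'` field acts as the grouping key for the two `median(...)` aggregates, which is my understanding of how Vega-Lite treats non-aggregated fields. The names `medians`, `at5`, `band_bottom` and `band_top` are just for illustration.

```python
import pandas as pd

X = list(range(1, 10)) * 4
V = [1,1,1,2,4,8,6,4,2, 1,2,3,4,5,6,7,6,5, 1,1,1,1,4,8,4,2,1, 1,1,3,4,5,6,6,5,4]
K = [k for k in "abcd" for _ in range(9)]
demo = pd.DataFrame({"X": X, "V": V, "K": K})

# One median per series: this is where each label ends up (unstacked).
medians = demo.groupby("K")[["X", "V"]].median()
print(medians)

# Where the bands actually are at X=5, assuming they stack
# bottom-up in reverse key order ('d' at the bottom, 'a' on top).
at5 = demo[demo["X"] == 5].set_index("K")["V"].iloc[::-1]
band_top = at5.cumsum()
band_bottom = band_top - at5
print(pd.DataFrame({"bottom": band_bottom, "top": band_top}))
```

Every label gets X=5, and the Y values are the raw per-series medians, so under the stacking assumption above they all fall at or below the top of the lowest band instead of inside their own band.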


Solution

  • I would just build a separate dataframe for the text and use that as the source. It is much easier and more customizable than doing all sorts of transformations in Altair, if such a thing is even possible in this case.

    import pandas as pd
    import altair as alt
    
    X=[1,2,3,4,5,6,7,8,9,1,2,3,4,5,6,7,8,9,1,2,3,4,5,6,7,8,9,1,2,3,4,5,6,7,8,9]
    V=[1,1,1,2,4,8,6,4,2,1,2,3,4,5,6,7,6,5,1,1,1,1,4,8,4,2,1,1,1,3,4,5,6,6,5,4]
    key=['a', 'b', 'c', 'd']
    K = [y for x in key for y in (x)*9]
    
    demo = pd.DataFrame({'X': X, 'V': V, 'K': K})
    
    # find the X position where the total of V over all K's is largest (X=6 here)
    idxmax = demo.groupby("X")["V"].sum().idxmax()
    # find the cumulative sum of V's at position idxmax and
    # take away some offset (4) so the labels go down a bit
    # iloc[::-1] reverses the order because we want the cumulative sum to start from the bottom (from 'd')
    ypos = demo.groupby(["X", "K"])["V"].sum().loc[idxmax].iloc[::-1].cumsum() - 4
    # create a new dataframe for the text: X column=idxmax, Y column=cumulative ypos, K=key
    demotext = pd.DataFrame({"X": idxmax, "Y": ypos.to_numpy(), "K": ypos.index})
    
    
    a = (alt.Chart(demo).mark_area()
            .encode(
                    x='X:O',
                    y='V:Q',
                    color='K:N')
        )
    t = (alt.Chart(demotext).mark_text()
            .encode(
                    x='X:O',
                    y='Y:Q',
                    text='K:N'
    ))
    
    a+t
    

    Output

    Altair Area Chart With Text
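A possible variation on the answer above, in case the fixed offset of 4 or the single shared X position does not suit your data: compute the vertical midpoint of every band, then label each series at the X where its own band is thickest. This is only a sketch. It keeps the same assumption as the answer (bands stack bottom-up in reverse key order, with 'd' at the bottom), and the column names `top` and `Y` are made up for the example.

```python
import pandas as pd

X = list(range(1, 10)) * 4
V = [1,1,1,2,4,8,6,4,2, 1,2,3,4,5,6,7,6,5, 1,1,1,1,4,8,4,2,1, 1,1,3,4,5,6,6,5,4]
K = [k for k in "abcd" for _ in range(9)]
demo = pd.DataFrame({"X": X, "V": V, "K": K})

# Assumption: within each X, bands stack bottom-up as d, c, b, a.
stacked = demo.sort_values(["X", "K"], ascending=[True, False]).copy()
stacked["top"] = stacked.groupby("X")["V"].cumsum()   # upper edge of each band
stacked["Y"] = stacked["top"] - stacked["V"] / 2       # vertical midpoint of each band

# For each series, keep the row where its own band is thickest
# (idxmax returns the first such row, i.e. the smallest X on ties).
rows = stacked.groupby("K")["V"].idxmax()
demotext = (stacked.loc[rows, ["X", "Y", "K"]]
            .sort_values("K")
            .reset_index(drop=True))
print(demotext)
```

`demotext` has the same `X`, `Y`, `K` columns as in the answer, so it can be passed to the `mark_text()` layer unchanged. With this data, 'b' is labelled at X=7 and the other three series at X=6.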