How do I group colors by the values in a column?


I'm making a multi-feature scatterplot and I want to group the colorings based on the values of a column. Here's the code copied from that example link:

import altair as alt
from vega_datasets import data

source = data.iris()

alt.Chart(source).mark_circle().encode(
    alt.X('sepalLength').scale(zero=False),
    alt.Y('sepalWidth').scale(zero=False, padding=1),
    color='species',
    size='petalWidth'
)

My color column has several groupings of values that I'd like to show with the coloring. So, imagine instead of 3 values in the species column I had 6: [setosa-bright, setosa-dull, versicolor-bright, versicolor-dull, virginica-bright, virginica-dull]. I'd like to have both setosa colors be from the same family, like blue. Then both versicolors in greens, and both virginicas in reds, for example. The color families don't really matter, but grouping the colors by the column values does.

I've looked through https://altair-viz.github.io/user_guide/customization.html#color-domain-and-range and I haven't gotten that to do what I need yet. Am I barking up the wrong tree?


Solution

  • If you only have two values in each color group, you could use one of the paired Vega color schemes, such as tableau20 in the code below. If you have more than two values per group, you would have to create your own color scheme. Altair has a second grouping option via the detail encoding, but it doesn't automatically assign the same color to all values in the same group.

    import altair as alt
    from vega_datasets import data
    import random
    
    source = data.iris()
    # Simulate the six-value column by randomly tagging each row as light or dark
    source['group'] = source['species'].apply(lambda x: f"{x}-{random.choice(['light', 'dark'])}")
    
    alt.Chart(source).mark_circle().encode(
        alt.X('sepalLength').scale(zero=False),
        alt.Y('sepalWidth').scale(zero=False, padding=1),
        # tableau20 lists its colors as dark/light pairs of the same hue
        alt.Color('group').scale(scheme='tableau20'),
    )
    

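
For more than two values per group, one way to "create your own color scheme" is to pass an explicit domain and range to the color scale. The sketch below is only an illustration under some assumptions: the `FAMILY_SHADES` palette, the helper names `grouped_color_scale` and `grouped_scatter`, and the convention that the family name is the text before the first `-` are all made up for this example, not part of Altair.

```python
# Hypothetical palette: one list of hex colors per family, ordered dark to light.
FAMILY_SHADES = {
    "setosa": ["#08519c", "#3182bd", "#6baed6"],      # blues
    "versicolor": ["#006d2c", "#31a354", "#74c476"],  # greens
    "virginica": ["#a50f15", "#de2d26", "#fb6a4a"],   # reds
}


def grouped_color_scale(values, sep="-"):
    """Return (domain, range) lists that give each family its own shades.

    `values` are category labels such as "setosa-bright"; the text before
    the first `sep` is assumed to be the family name.
    """
    domain, colors = [], []
    used = {}  # family -> number of shades already handed out
    for value in sorted(set(values)):
        family = value.split(sep)[0]
        shades = FAMILY_SHADES[family]
        index = used.get(family, 0)
        if index >= len(shades):
            raise ValueError(f"not enough shades defined for {family!r}")
        domain.append(value)
        colors.append(shades[index])
        used[family] = index + 1
    return domain, colors


def grouped_scatter(source, column="group"):
    """Scatterplot of an iris-like DataFrame, colored by family shades."""
    # Imported here so the palette helper above works without Altair installed.
    import altair as alt

    domain, colors = grouped_color_scale(source[column])
    return alt.Chart(source).mark_circle().encode(
        alt.X('sepalLength').scale(zero=False),
        alt.Y('sepalWidth').scale(zero=False, padding=1),
        # Explicit domain/range pins each category to its chosen color
        alt.Color(column).scale(domain=domain, range=colors),
    )
```

With the `source` frame built above, `grouped_scatter(source)` should give every `setosa-*` value a blue, every `versicolor-*` value a green, and every `virginica-*` value a red. Adding a fourth value to a family only requires adding a fourth shade to its list.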