
Size legend for plotly express scatterplot in Python


Here is a Plotly Express scatterplot with marker color, size and symbol representing different fields in the data frame. There is a legend for symbol and a colorbar for color, but there is nothing to indicate what marker size represents.

Is it possible to display a "size" legend? In the legend I'm hoping to show some example marker sizes and their respective values.

A similar question was asked for R, and I'm hoping for a similar result in Python. I've tried adding markers using fig.add_trace(), and this would work, except I don't know how to make the sizes equal.

import pandas as pd
import plotly.express as px
import random

# create data frame
df = pd.DataFrame({
  'X':list(range(1,11,1)),
  'Y':list(range(1,11,1)),
  'Symbol':['Yes']*5+['No']*5,
  'Color':list(range(1,11,1)),
  'Size':random.sample(range(10,150), 10)
})

# create scatterplot
fig = px.scatter(df, y='Y', x='X',color='Color',symbol='Symbol',size='Size')

# move legend
fig.update_layout(legend=dict(y=1, x=0.1))

fig.show()

[Image: scatterplot]

Thank you


Solution

  • You cannot get this with a continuous numeric column like the range in your data. Plotly always treats numeric data as continuous, even when the values look discrete in the output. Because you are showing groups, the column has to be categorical, like a factor in R. One possible solution is to convert every value to a str with a list comprehension. I did it in two steps so you can follow:

    import pandas as pd
    import plotly.express as px
    import random
    
    
    check = sorted(random.sample(range(10,150), 10))
    check = [str(num) for num in check]
    
    # create data frame
    df = pd.DataFrame({
      'X':list(range(1,11,1)),
      'Y':list(range(1,11,1)),
      'Symbol':['Yes']*5+['No']*5,
      'Color':check,
      'Size':list(range(1,11,1))
    })
    
    # create scatterplot
    fig = px.scatter(df, y='Y', x='X',color='Color',symbol='Symbol',size='Size')
    
    # move legend
    fig.update_layout(legend=dict(y=1, x=0.1))
    
    fig.show()
    

    That gives: [image]

    Keep in mind that each legend entry also includes the symbol label, because the legend now combines two groupings (color and symbol). You may also want to sort the values before converting them to strings, as in this picture (the code above already does this):

    [image]

    UPDATE

    Hey there,

    yes, but as far as I know only in matplotlib, and it is a bit hacky because you simulate the scatter plot with separate plots. I can only show you a modified matplotlib example, but maybe it helps you work it out yourself:

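    import matplotlib.pyplot as plt
    from numpy.random import randn
    
    z = randn(10)
    
    # one plot per marker size; plt.plot returns a list holding one line handle
    red_dot, = plt.plot(z, "ro", markersize=5)
    red_dot_other, = plt.plot(z*2, "ro", markersize=20)
    
    # markerscale scales the legend markers relative to the plotted markers
    plt.legend([red_dot, red_dot_other], ["Yes", "No"], markerscale=0.5)
    plt.show()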
    

    That gives: [image: markersize]

    As you can see, you are working with two different plots, one for each size shown in the legend, and the legend merges them. The size of each legend marker is controlled by markerscale, which is applied to the markersize of each plot. Because the two plots have different markersize values, the legend shows two different marker sizes. markerscale is a relative factor: 0.5 draws the legend markers at half their plotted size, and 1.5 draws them at 150%.
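If the data is drawn with a real scatter plot rather than simulated with separate line plots, matplotlib can also build the size handles itself. The following is a minimal sketch, assuming matplotlib 3.1 or later (where `legend_elements` is available); the sizes are made-up example values, not the data from the question.

```python
import matplotlib.pyplot as plt

# Made-up example data; the sizes stand in for the 'Size' column.
x = list(range(1, 11))
sizes = [10, 25, 40, 55, 70, 85, 100, 115, 130, 145]

fig, ax = plt.subplots()
sc = ax.scatter(x, x, s=sizes)

# Ask the scatter for representative size handles and labels,
# then pass them to the legend (num is a target count, not a guarantee).
handles, labels = sc.legend_elements(prop="sizes", num=4)
ax.legend(handles, labels, title="Size")

# Call plt.show() to display the figure.
```

The legend markers here are generated from the scatter's own size array, so there is no need to keep a separate plot per size in sync by hand.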

    You can achieve this by working with the legend handlers in matplotlib; see the legend guide: https://matplotlib.org/stable/tutorials/intermediate/legend_guide.html