I have the following pandas dataframe with the population of two countries over the years:
   year          pop1          pop2
0     1  1.000000e+08  1.000000e+08
1     2  9.620000e+07  9.970000e+07
2     3  9.254440e+07  9.940090e+07
3     4  8.902771e+07  9.910270e+07
4     5  8.564466e+07  9.880539e+07
I want to create a line plot where the y values come from the pop columns:
fig = px.line(data, x="year", y="pop1", title='Population')
fig.add_scatter(x=data['year'], y=data['pop2'], mode='lines')
fig.show()
My problem here is that the legend shows only one line, and it seems like I can't control it (e.g. to change it from "trace" to pop1 and pop2). I have seen that there is an option to use "color", but that seems to be impossible when plotting columns.
My end goal here is to be able to control the legend - to have the column names (pop1 and pop2) as the legend items.
To keep the solution close to your original setup, you can do this:
fig = px.line(data, x="year", y="pop1", title='Population')
fig.data[0].name="pop1"
fig.update_traces(showlegend=True)
fig.add_scatter(x=data['year'], y=data['pop2'], mode='lines', name = "pop2")
The suggestion in the comment from @TeejayBruno will solve your problem. But the approach described there differs fundamentally from the steps you've described. And I suspect that there is a reason why you're first building a figure using
fig = px.line(data, x="year", y="pop1", title='Population')
And then adding new traces using:
fig.add_scatter(x=data['year'], y=data['pop2'], mode='lines')
So I thought I'd shed some light on why the legend is "missing" after the first step, and then how to make sure that "pop1" is included in the legend when you're adding more traces in step 2.
px.line(data, x="year", y="pop1", title='Population')
There's a perfectly good explanation for that. Take a look at the following plot. When px.line only picks up one trace, it decides that a legend is superfluous and that the information is more naturally displayed as the label of the y-axis. And I pretty much agree with the decision the plotly devs have made there:
But this does not make as much sense when users decide to build on that figure by adding traces through fig.add_scatter(). And this is the exact problem you've stumbled upon.
When you use fig = px.line(data, x="year", y=["pop1", "pop2"], title='Population') with multiple y categories, px.line understands that displaying all that information as the label of the y-axis doesn't make much sense anymore, and produces a legend like the one in the green circle in the figure below. At the same time, the y-axis label is renamed to "value", as shown in the red circle:
And what additionally happens under the hood is that the data properties of the fig object are named "pop1" and "pop2":
<bound method BaseFigure.show of Figure({
'data': [{'hovertemplate': 'variable=pop1<br>year=%{x}<br>value=%{y}<extra></extra>',
'legendgroup': 'pop1',
'line': {'color': '#636efa', 'dash': 'solid'},
'mode': 'lines',
'name': 'pop1',
'orientation': 'v',
'showlegend': True,
'type': 'scatter',
'x': array([1, 2, 3, 4, 5], dtype=int64),
'xaxis': 'x',
'y': array([1.000000e+08, 9.620000e+07, 9.254440e+07, 8.902771e+07, 8.564466e+07]),
'yaxis': 'y'},
{'hovertemplate': 'variable=pop2<br>year=%{x}<br>value=%{y}<extra></extra>',
'legendgroup': 'pop2',
'line': {'color': '#EF553B', 'dash': 'solid'},
'mode': 'lines',
'name': 'pop2',
'orientation': 'v',
And therein lies the solution to how you can adjust the legend properties to your needs:
1. Make sure that the first trace has 'name': 'pop1' by setting fig.data[0].name="pop1".
2. Set the figure to display trace names in the legend with fig.update_traces(showlegend=True) (figure 2.1).
3. Include names for all subsequent traces using fig.add_scatter(x=data['year'], y=data['pop2'], mode='lines', name = "pop2") (figure 2.2).
4. Rename the y-axis label to whatever you'd like using, for example, fig.update_yaxes(title=dict(text='People')).
import plotly.graph_objs as go
import plotly.express as px
import pandas as pd
data = pd.DataFrame({'year': {0: 1, 1: 2, 2: 3, 3: 4, 4: 5},
'pop1': {0: 100000000.0,
1: 96200000.0,
2: 92544400.0,
3: 89027710.0,
4: 85644660.0},
'pop2': {0: 100000000.0,
1: 99700000.0,
2: 99400900.0,
3: 99102700.0,
4: 98805390.0}})
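# Build the figure from a single column, as in the question
fig = px.line(data, x="year", y="pop1", title='Population')
# Alternative: let px.line create both traces and the legend in one go
#fig = px.line(data, x="year", y=["pop1", "pop2"], title='Population')
# Name the first trace and make it show up in the legend
fig.data[0].name="pop1"
fig.update_traces(showlegend=True)
# Add the second trace with an explicit legend name
fig.add_scatter(x=data['year'], y=data['pop2'], mode='lines', name = "pop2")
# Replace the default y-axis label ("pop1") with a more general one
fig.update_yaxes(title=dict(text='People'))
fig.show()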