I have a df in this form:
{'content': {175: nan,
176: nan,
177: 'Address not Found',
178: 'Delivery delayed-transport issues',
179: nan,
180: 'Parcel returned',
181: 'Parcel lost in mail',
182: 'Parcel received',
183: 'Return requested',
184: 'Repeat order placed'},
'sales': {175: 7.0,
176: 7.0,
177: 9.0,
178: 13.0,
179: 11.0,
180: 9.0,
181: 19.0,
182: 14.0,
183: 9.0,
184: 9.0},
'order_date': {175: Timestamp('2019-08-28 16:30:00'),
176: Timestamp('2019-08-30 11:55:53'),
177: Timestamp('2019-09-06 14:51:14'),
178: Timestamp('2019-09-06 15:03:22'),
179: Timestamp('2019-09-06 15:46:11'),
180: Timestamp('2019-09-06 16:08:03'),
181: Timestamp('2019-09-06 17:13:01'),
182: Timestamp('2019-09-16 21:38:29'),
183: Timestamp('2019-09-25 12:35:29'),
184: Timestamp('2019-09-25 22:22:51')}}
This is in reference to this question: here
I want to plot a line chart with color and symbol set to the content column. However, when I do this:
fig = px.line(df, x='order_date', y='sales',color='content',symbol='content', color_discrete_sequence=px.colors.qualitative.Pastel,
markers=True, line_shape='hvh')
I get a separate line for each content value, and some of them show up as isolated dots that are not connected to anything. I am not sure why this happens. I tried replacing the NaN values with None, but the problem remains.
Any help with this would be greatly appreciated.
When you use px.line in Plotly Express, specifying color or symbol causes Plotly to split your data by each unique color (or symbol) value and plot each group as a separate trace. Lines are only drawn between points within the same trace, that is, points that share the same color.
Imagine your data looks like {'time': [1,2,3,4,5], 'values': [10,20,30,40,50], 'color': ['a','b','c','a','b']}. When you define fig = px.line(df, x='time', y='values', color='color'), Plotly will draw a line between the two a points, another between the two b points, and render c as a single unconnected point. What you want instead is a single line connecting all of your points in order by time.
One possible workaround is to create fig1, where px.line draws only the line with no markers, and fig2, where px.scatter draws only the markers (with the colors and symbols). Then combine the traces from both figures into fig3.
Here is an example below:
import pandas as pd
import plotly.graph_objects as go
import plotly.express as px
Timestamp = pd.Timestamp
nan = float("nan")
data = {'content': {175: nan,
176: nan,
177: 'Address not Found',
178: 'Delivery delayed-transport issues',
179: nan,
180: 'Parcel returned',
181: 'Parcel lost in mail',
182: 'Parcel received',
183: 'Return requested',
184: 'Repeat order placed'},
'sales': {175: 7.0,
176: 7.0,
177: 9.0,
178: 13.0,
179: 11.0,
180: 9.0,
181: 19.0,
182: 14.0,
183: 9.0,
184: 9.0},
'order_date': {175: Timestamp('2019-08-28 16:30:00'),
176: Timestamp('2019-08-30 11:55:53'),
177: Timestamp('2019-09-06 14:51:14'),
178: Timestamp('2019-09-06 15:03:22'),
179: Timestamp('2019-09-06 15:46:11'),
180: Timestamp('2019-09-06 16:08:03'),
181: Timestamp('2019-09-06 17:13:01'),
182: Timestamp('2019-09-16 21:38:29'),
183: Timestamp('2019-09-25 12:35:29'),
184: Timestamp('2019-09-25 22:22:51')}}
df = pd.DataFrame(data=data)
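# sort_values returns a new DataFrame, so assign the result back
df = df.sort_values(by='order_date')
# fig1: a single grey line through every point, in date order
fig1 = px.line(df, x='order_date', y='sales').update_traces(line_color='lightgrey')
# fig2: markers only, colored and shaped by content
fig2 = px.scatter(df, x='order_date', y='sales',color='content',symbol='content', color_discrete_sequence=px.colors.qualitative.Pastel)
# fig3: combine the line trace with the marker traces
fig3 = go.Figure(data=fig1.data + fig2.data)
fig3.show()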