With Plotly, I plotted a histogram showing the monthly evolution of total sales, grouped by category.
Still with Plotly, I would like to add a line on top that traces the evolution of the number of sales, with a marker for each month showing that month's number of sales.
Here is the code I used for the histogram:
import plotly.express as px
import plotly.graph_objects as go

fig = px.histogram(
    dataset,
    x="Years and month",
    y="Price",
    color="Category",
    text_auto=".2f",
    height=600,
    width=980)
fig.update_layout(
    bargap=0.2,
    title_x=0.5)
fig.update_xaxes(
    dtick="M1",
    tickformat="%b\n%Y")
fig.show()
I tried adding the following trace, but I only got a flat line along the x-axis, at the bottom of my bars:
fig.add_trace(go.Scatter(x=dataset["Years and month"], y=dataset["Price"],
                         mode='lines',
                         name="Sales"))
# I don't know which argument to use to get the count of dataset["Price"]
Here is the dataset's info:
# dataset.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 679111 entries, 0 to 679331
Data columns (total 3 columns):
 #   Column           Non-Null Count   Dtype
---  ------           --------------   -----
 0   Price            679111 non-null  float64
 1   Category         679111 non-null  int64
 2   Years and month  679111 non-null  object
dtypes: float64(1), int64(1), object(1)
memory usage: 20.7+ MB
None
Here is a sample of my dataset:
Price  Category  Years and month
16.07  1         2021-12
 9.28  0         2021-07
 3.99  0         2021-03
27.46  1         2021-11
15.81  1         2022-03
17.99  0         2022-09
16.99  1         2022-01
 9.41  0         2021-12
 9.99  0         2022-05
 8.99  0         2021-04
One more small problem: my dataset has 679532 entries, which strains my Jupyter notebook when I am too greedy with what I ask it to draw (e.g. go.Scatter(mode="lines+markers") crashes my notebook).
Here is a picture of my histogram with the desired result (the black line was drawn in Paint):
I finally found the solution myself.
Edit: I renamed the column "Years and month" to "Year and month".
To add a trace built with plotly.express, you can use:
fig.add_traces(list(px.<function>(<arguments>).select_traces()))
where <function> is the kind of figure you want (e.g. line, histogram, scatter) and <arguments> are all the arguments needed to draw it.
To obtain the desired aggregation, do a groupby() on the month column
and then select the column to aggregate.
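As a minimal sketch of that aggregation (the small DataFrame below is only a stand-in built from a few rows of the sample in the question, not the real dataset, and the name monthly is mine), the monthly total and the monthly number of sales can both be computed in a single groupby:

```python
import pandas as pd

# Stand-in data: a few rows copied from the sample in the question.
dataset = pd.DataFrame({
    "Price": [16.07, 9.28, 3.99, 27.46, 15.81, 9.41],
    "Category": [1, 0, 0, 1, 1, 0],
    "Year and month": ["2021-12", "2021-07", "2021-03",
                       "2021-11", "2022-03", "2021-12"],
})

# Group by month, select the column to aggregate, then compute
# the total price ("sum") and the number of sales ("count").
monthly = dataset.groupby("Year and month")["Price"].agg(["sum", "count"])

print(monthly)
```

The result has one row per month, so anything plotted from it only has a handful of points instead of one point per sale.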
To show the number of products sold in the hover tooltip, pass the aggregated data
through hover_data=[], for example:
hover_data=[dataset.groupby(
    "Year and month")["Price"].count()]
To get a line with markers, add .update_traces(mode='lines+markers')
just before .select_traces().
Here is the full code for the solution:
import plotly.express as px

fig = px.histogram(dataset,
                   x="Year and month",
                   y="Price",
                   color="Category",
                   text_auto=".2f",
                   height=600,
                   width=980)
fig.update_layout(bargap=0.2)
fig.update_xaxes(dtick="M1", tickformat="%b\n%Y")
fig.add_traces(
    list(
        px.line(dataset.groupby("Year and month")["Price"].sum(),
                hover_data=[
                    dataset.groupby("Year and month")["Price"].count()
                ]).update_traces(mode='lines+markers').select_traces()))
fig.show()