python plotly plotly-python plotly-express

Plotly: How to prevent varying thickness of bars in a Gantt diagram?

I'm trying to make a plotly Gantt diagram using plotly.express as in the example, but plotly is somehow varying the bar thickness based on the given names (see picture, red > green > blue):

The code looks like this:

import pandas as pd
import plotly.express as px

df = pd.read_csv('stackoverflow.csv')
fig = px.timeline(df, x_start="Start", x_end="Finish", y="Task", color="Resource", text="Task", width=1600, height=800)
fig.update_yaxes(autorange="reversed")
fig.update_xaxes(
    dtick="1000",
    tickformat="%M:%S",
    ticklabelmode="instant")
fig.update_layout(xaxis_range=[df.Start.min(), df.Finish.max()])
fig.show()

With stackoverflow.csv being:

Task,Start,Finish,Workstation,Resource
1,1970-01-01 01:00:00.000,1970-01-01 01:00:05.400,1,ABL
2,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,2,ABS
3,1970-01-01 01:00:00.000,1970-01-01 01:00:01.000,3,ABU
4,1970-01-01 01:00:00.000,1970-01-01 01:00:02.000,4,ACC
5,1970-01-01 01:00:02.000,1970-01-01 01:00:03.300,4,ACC
6,1970-01-01 01:00:03.300,1970-01-01 01:00:05.300,4,ACC
7,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,5,ABS
8,1970-01-01 01:00:00.000,1970-01-01 01:00:01.000,6,ACT
9,1970-01-01 01:00:00.000,1970-01-01 01:00:02.000,7,ACC
10,1970-01-01 01:00:02.000,1970-01-01 01:00:03.300,7,ACC
11,1970-01-01 01:00:03.300,1970-01-01 01:00:05.300,7,ACC
12,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,8,ABS
13,1970-01-01 01:00:00.000,1970-01-01 01:00:01.000,9,ABU
14,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,10,ACC
15,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,11,ABS
16,1970-01-01 01:00:00.000,1970-01-01 01:00:01.000,12,ABU
17,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,13,ACC
18,1970-01-01 01:00:01.300,1970-01-01 01:00:03.300,13,ACC
19,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,14,ABS
20,1970-01-01 01:00:00.000,1970-01-01 01:00:01.000,15,ABP
21,1970-01-01 01:00:00.000,1970-01-01 01:00:01.500,16,ABZ
22,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,17,ACC
23,1970-01-01 01:00:01.300,1970-01-01 01:00:03.300,17,ACC
24,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,18,ABS
25,1970-01-01 01:00:00.000,1970-01-01 01:00:01.000,19,AAW
26,1970-01-01 01:00:00.000,1970-01-01 01:00:02.000,20,ACC
27,1970-01-01 01:00:02.000,1970-01-01 01:00:03.300,20,ACC
28,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,21,ABS
29,1970-01-01 01:00:00.000,1970-01-01 01:00:01.000,22,ABU
30,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,23,ACC

I want all the bars to have the same thickness. Surprisingly, this works when I change the Resource names to random 3-character values:

I think it has to do with the resources starting with AB* or AC*. Unfortunately, the resource names are real-world names, so I cannot change them arbitrarily. The varying bar thickness also occurs when the resource name is something like FooBar-Axx-FOO with xx = [CC, BS, CT...]. Does anyone know why this is happening or how to prevent it?

P.S.: The

fig.update_xaxes(
    dtick="1000",
    tickformat="%M:%S")

was necessary to show seconds in the Gantt chart instead of days. Is there a better way to achieve this?

Update: this is the conda environment YAML file I'm using to reproduce the problem:

name: stack
channels:
  - conda-forge
  - pytorch
  - plotly
dependencies:
    - python>=3.5,<3.8
    - pandas
    - pip
    - pip:
      - plotly

Slightly modified code that still yields the same problem as shown above:

import pandas as pd
import plotly.express as px
from io import StringIO


csv = """Task,Start,Finish,Workstation,Resource
1,1970-01-01 01:00:00.000,1970-01-01 01:00:05.400,1,ABL
2,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,2,ABS
3,1970-01-01 01:00:00.000,1970-01-01 01:00:01.000,3,ABU
4,1970-01-01 01:00:00.000,1970-01-01 01:00:02.000,4,ACC
5,1970-01-01 01:00:02.000,1970-01-01 01:00:03.300,4,ACC
6,1970-01-01 01:00:03.300,1970-01-01 01:00:05.300,4,ACC
7,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,5,ABS
8,1970-01-01 01:00:00.000,1970-01-01 01:00:01.000,6,ACT
9,1970-01-01 01:00:00.000,1970-01-01 01:00:02.000,7,ACC
10,1970-01-01 01:00:02.000,1970-01-01 01:00:03.300,7,ACC
11,1970-01-01 01:00:03.300,1970-01-01 01:00:05.300,7,ACC
12,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,8,ABS
13,1970-01-01 01:00:00.000,1970-01-01 01:00:01.000,9,ABU
14,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,10,ACC
15,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,11,ABS
16,1970-01-01 01:00:00.000,1970-01-01 01:00:01.000,12,ABU
17,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,13,ACC
18,1970-01-01 01:00:01.300,1970-01-01 01:00:03.300,13,ACC
19,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,14,ABS
20,1970-01-01 01:00:00.000,1970-01-01 01:00:01.000,15,ABP
21,1970-01-01 01:00:00.000,1970-01-01 01:00:01.500,16,ABZ
22,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,17,ACC
23,1970-01-01 01:00:01.300,1970-01-01 01:00:03.300,17,ACC
24,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,18,ABS
25,1970-01-01 01:00:00.000,1970-01-01 01:00:01.000,19,AAW
26,1970-01-01 01:00:00.000,1970-01-01 01:00:02.000,20,ACC
27,1970-01-01 01:00:02.000,1970-01-01 01:00:03.300,20,ACC
28,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,21,ABS
29,1970-01-01 01:00:00.000,1970-01-01 01:00:01.000,22,ABU
30,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,23,ACC"""

df = pd.read_csv(StringIO(csv))
# df = pd.read_csv("stackoverflow.csv")
fig = px.timeline(df, x_start="Start", x_end="Finish", y="Task", color="Resource", text="Task", width=1600, height=800)
fig.update_yaxes(autorange="reversed")
fig.update_xaxes(
    dtick="1000",
    tickformat="%M:%S",
    ticklabelmode="instant")
fig.update_layout(xaxis_range=[df.Start.min(), df.Finish.max()])
fig.show()

The output of conda list:

pip                       20.2.4                     py_0    conda-forge
plotly                    4.12.0                   pypi_0    pypi
python                    3.7.8           h6f2ec95_1_cpython    conda-forge

Solution

(this is an answer in progress and prone to changes)

A possible solution:

This is only a possible solution, since I'm still not able to reproduce your code snippet and the corresponding plot 100%; we'll take a closer look at that in the details. You'll need the latest Plotly version and also Kaleido. Those are pretty straightforward installations, though, and a huge step forward for Plotly, in my humble opinion at least.

Code 0:

# Uses the imports and the `csv` string defined in Code 1 below
df = pd.read_csv(StringIO(csv))
fig = px.timeline(df, x_start="Start", x_end="Finish", y="Task", color="Resource", text="Task", width=1600, height=800)
f2 = fig.full_figure_for_development(warn=False)
f2.layout.barmode = 'group'
f2.show()

Plot 0:

The details:

We're going to have to do this step by step. First, when I run the code you provided, I get this:

Plot 1:

Code 1:

import pandas as pd
import plotly.express as px
from io import StringIO


csv = """Task,Start,Finish,Workstation,Resource
1,1970-01-01 01:00:00.000,1970-01-01 01:00:05.400,1,ABL
2,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,2,ABS
3,1970-01-01 01:00:00.000,1970-01-01 01:00:01.000,3,ABU
4,1970-01-01 01:00:00.000,1970-01-01 01:00:02.000,4,ACC
5,1970-01-01 01:00:02.000,1970-01-01 01:00:03.300,4,ACC
6,1970-01-01 01:00:03.300,1970-01-01 01:00:05.300,4,ACC
7,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,5,ABS
8,1970-01-01 01:00:00.000,1970-01-01 01:00:01.000,6,ACT
9,1970-01-01 01:00:00.000,1970-01-01 01:00:02.000,7,ACC
10,1970-01-01 01:00:02.000,1970-01-01 01:00:03.300,7,ACC
11,1970-01-01 01:00:03.300,1970-01-01 01:00:05.300,7,ACC
12,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,8,ABS
13,1970-01-01 01:00:00.000,1970-01-01 01:00:01.000,9,ABU
14,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,10,ACC
15,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,11,ABS
16,1970-01-01 01:00:00.000,1970-01-01 01:00:01.000,12,ABU
17,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,13,ACC
18,1970-01-01 01:00:01.300,1970-01-01 01:00:03.300,13,ACC
19,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,14,ABS
20,1970-01-01 01:00:00.000,1970-01-01 01:00:01.000,15,ABP
21,1970-01-01 01:00:00.000,1970-01-01 01:00:01.500,16,ABZ
22,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,17,ACC
23,1970-01-01 01:00:01.300,1970-01-01 01:00:03.300,17,ACC
24,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,18,ABS
25,1970-01-01 01:00:00.000,1970-01-01 01:00:01.000,19,AAW
26,1970-01-01 01:00:00.000,1970-01-01 01:00:02.000,20,ACC
27,1970-01-01 01:00:02.000,1970-01-01 01:00:03.300,20,ACC
28,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,21,ABS
29,1970-01-01 01:00:00.000,1970-01-01 01:00:01.000,22,ABU
30,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,23,ACC"""

df = pd.read_csv(StringIO(csv))
# df = pd.read_csv("stackoverflow.csv")
fig = px.timeline(df, x_start="Start", x_end="Finish", y="Task", color="Resource", text="Task", width=1600, height=800)
fig.update_yaxes(autorange="reversed")
fig.update_xaxes(
    dtick="1000",
    tickformat="%M:%S",
    ticklabelmode="instant")
fig.update_layout(xaxis_range=[df.Start.min(), df.Finish.max()])
fig.show()

In order to come close to your provided screenshot, I have to comment out a few lines as shown below. But it's still not the same figure as yours.

Plot 2:

Code 2 (same dataset):

df = pd.read_csv(StringIO(csv))
# df = pd.read_csv("stackoverflow.csv")
fig = px.timeline(df, x_start="Start", x_end="Finish", y="Task", color="Resource", text="Task", width=1600, height=800)
# fig.update_yaxes(autorange="reversed")
# fig.update_xaxes(
#     dtick="1000",
#     tickformat="%M:%S",
#     ticklabelmode="instant")
# fig.update_layout(xaxis_range=[df.Start.min(), df.Finish.max()])
fig.show()

I find that a bit weird, and the possible solution is even weirder. If you take a look at the post "Plotly: How to inspect and make changes to a plotly figure?", you'll see how you can reveal most of the figure attributes using f2 = fig.full_figure_for_development(). Any change you make to f2 should also be possible to make to fig, but not in this case. In order to get a result that resembles your desired output, I had to do the following:

Code 3:

df = pd.read_csv(StringIO(csv))
fig = px.timeline(df, x_start="Start", x_end="Finish", y="Task", color="Resource", text="Task", width=1600, height=800)
f2 = fig.full_figure_for_development(warn=False)
f2.layout.barmode = 'group'
f2.show()

Plot 3:

Maybe we're getting somewhere here? But now you might think "why not fig.layout.barmode = 'group'?". Well, here's the result of that:

Plot 4:

Code 4:

df = pd.read_csv(StringIO(csv))
fig = px.timeline(df, x_start="Start", x_end="Finish", y="Task", color="Resource", text="Task", width=1600, height=800)
f2 = fig.full_figure_for_development(warn=False)
# f2.layout.barmode = 'group'
# f2.show()
fig.layout.barmode = 'group'
fig.show()
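One thing that might be worth checking here, as a hypothesis rather than something established in this thread: Task is numeric and each Resource becomes its own bar trace, so with overlaid bars the thickness may be derived per trace from the smallest gap between that trace's y positions. The stdlib-only sketch below just lists those gaps per resource for the question's data, so they can be compared with the bar thicknesses in the plots. The names `rows` and `min_spacing` are made up for this illustration.

```python
from collections import defaultdict

# (Task, Resource) pairs copied from the CSV in the question.
rows = [
    (1, "ABL"), (2, "ABS"), (3, "ABU"), (4, "ACC"), (5, "ACC"), (6, "ACC"),
    (7, "ABS"), (8, "ACT"), (9, "ACC"), (10, "ACC"), (11, "ACC"), (12, "ABS"),
    (13, "ABU"), (14, "ACC"), (15, "ABS"), (16, "ABU"), (17, "ACC"), (18, "ACC"),
    (19, "ABS"), (20, "ABP"), (21, "ABZ"), (22, "ACC"), (23, "ACC"), (24, "ABS"),
    (25, "AAW"), (26, "ACC"), (27, "ACC"), (28, "ABS"), (29, "ABU"), (30, "ACC"),
]

# Collect the y positions (task numbers) that end up in each color trace.
tasks_by_resource = defaultdict(list)
for task, resource in rows:
    tasks_by_resource[resource].append(task)


def min_spacing(positions):
    """Smallest gap between neighbouring positions, or None for a single bar."""
    ordered = sorted(positions)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    return min(gaps) if gaps else None


spacing = {res: min_spacing(tasks) for res, tasks in tasks_by_resource.items()}
for res, gap in spacing.items():
    print(res, tasks_by_resource[res], "min spacing:", gap)
```

If the thicker bars belong to the resources with the larger minimum spacing, that would support the hypothesis. If they don't, the cause is something else.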

I find this whole thing to be more than a bit strange. But please try it out and let me know how it all works out for you!