Plotly: How to prevent varying thickness of bars in a Gantt diagram?


I'm trying to make a plotly Gantt diagram using plotly.express as in the example, but plotly is somehow varying the bar thickness based on the given names (see picture, red > green > blue):

enter image description here

The code looks like this:

import pandas as pd
import plotly.express as px

df = pd.read_csv('stackoverflow.csv')
fig = px.timeline(df, x_start="Start", x_end="Finish", y="Task", color="Resource", text="Task", width=1600, height=800)
fig.update_yaxes(autorange="reversed")
fig.update_xaxes(
    dtick="1000",
    tickformat="%M:%S",
    ticklabelmode="instant")
fig.update_layout(xaxis_range=[df.Start.min(), df.Finish.max()])
fig.show()

With stackoverflow.csv being:

Task,Start,Finish,Workstation,Resource
1,1970-01-01 01:00:00.000,1970-01-01 01:00:05.400,1,ABL
2,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,2,ABS
3,1970-01-01 01:00:00.000,1970-01-01 01:00:01.000,3,ABU
4,1970-01-01 01:00:00.000,1970-01-01 01:00:02.000,4,ACC
5,1970-01-01 01:00:02.000,1970-01-01 01:00:03.300,4,ACC
6,1970-01-01 01:00:03.300,1970-01-01 01:00:05.300,4,ACC
7,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,5,ABS
8,1970-01-01 01:00:00.000,1970-01-01 01:00:01.000,6,ACT
9,1970-01-01 01:00:00.000,1970-01-01 01:00:02.000,7,ACC
10,1970-01-01 01:00:02.000,1970-01-01 01:00:03.300,7,ACC
11,1970-01-01 01:00:03.300,1970-01-01 01:00:05.300,7,ACC
12,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,8,ABS
13,1970-01-01 01:00:00.000,1970-01-01 01:00:01.000,9,ABU
14,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,10,ACC
15,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,11,ABS
16,1970-01-01 01:00:00.000,1970-01-01 01:00:01.000,12,ABU
17,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,13,ACC
18,1970-01-01 01:00:01.300,1970-01-01 01:00:03.300,13,ACC
19,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,14,ABS
20,1970-01-01 01:00:00.000,1970-01-01 01:00:01.000,15,ABP
21,1970-01-01 01:00:00.000,1970-01-01 01:00:01.500,16,ABZ
22,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,17,ACC
23,1970-01-01 01:00:01.300,1970-01-01 01:00:03.300,17,ACC
24,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,18,ABS
25,1970-01-01 01:00:00.000,1970-01-01 01:00:01.000,19,AAW
26,1970-01-01 01:00:00.000,1970-01-01 01:00:02.000,20,ACC
27,1970-01-01 01:00:02.000,1970-01-01 01:00:03.300,20,ACC
28,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,21,ABS
29,1970-01-01 01:00:00.000,1970-01-01 01:00:01.000,22,ABU
30,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,23,ACC

I want all the bars to have the same thickness. Surprisingly, this works when I change the Resource names to random 3-character values:

enter image description here

I think it has to do with the resources starting with AB* or AC*. Unfortunately, the names of the resources depend on real-world names, so I cannot arbitrarily change them. The varying bar thickness happens as well when the name of the resource is something like FooBar-Axx-FOO with xx = [CC, BS, CT...]. Does anyone know why this is happening or how to prevent it?

P.S.: The

fig.update_xaxes(
    dtick="1000",
    tickformat="%M:%S")

was necessary to show seconds in the Gantt instead of days... is there a better way to achieve this?


Update: the conda environment YAML file I'm using to reproduce the problem:

name: stack
channels:
  - conda-forge
  - pytorch
  - plotly
dependencies:
    - python>=3.5,<3.8
    - pandas
    - pip
    - pip:
      - plotly

Slightly modified code that still yields the same problem as shown above:

import pandas as pd
import plotly.express as px
from io import StringIO


csv = """Task,Start,Finish,Workstation,Resource
1,1970-01-01 01:00:00.000,1970-01-01 01:00:05.400,1,ABL
2,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,2,ABS
3,1970-01-01 01:00:00.000,1970-01-01 01:00:01.000,3,ABU
4,1970-01-01 01:00:00.000,1970-01-01 01:00:02.000,4,ACC
5,1970-01-01 01:00:02.000,1970-01-01 01:00:03.300,4,ACC
6,1970-01-01 01:00:03.300,1970-01-01 01:00:05.300,4,ACC
7,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,5,ABS
8,1970-01-01 01:00:00.000,1970-01-01 01:00:01.000,6,ACT
9,1970-01-01 01:00:00.000,1970-01-01 01:00:02.000,7,ACC
10,1970-01-01 01:00:02.000,1970-01-01 01:00:03.300,7,ACC
11,1970-01-01 01:00:03.300,1970-01-01 01:00:05.300,7,ACC
12,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,8,ABS
13,1970-01-01 01:00:00.000,1970-01-01 01:00:01.000,9,ABU
14,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,10,ACC
15,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,11,ABS
16,1970-01-01 01:00:00.000,1970-01-01 01:00:01.000,12,ABU
17,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,13,ACC
18,1970-01-01 01:00:01.300,1970-01-01 01:00:03.300,13,ACC
19,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,14,ABS
20,1970-01-01 01:00:00.000,1970-01-01 01:00:01.000,15,ABP
21,1970-01-01 01:00:00.000,1970-01-01 01:00:01.500,16,ABZ
22,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,17,ACC
23,1970-01-01 01:00:01.300,1970-01-01 01:00:03.300,17,ACC
24,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,18,ABS
25,1970-01-01 01:00:00.000,1970-01-01 01:00:01.000,19,AAW
26,1970-01-01 01:00:00.000,1970-01-01 01:00:02.000,20,ACC
27,1970-01-01 01:00:02.000,1970-01-01 01:00:03.300,20,ACC
28,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,21,ABS
29,1970-01-01 01:00:00.000,1970-01-01 01:00:01.000,22,ABU
30,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,23,ACC"""

df = pd.read_csv(StringIO(csv))
# df = pd.read_csv("stackoverflow.csv")
fig = px.timeline(df, x_start="Start", x_end="Finish", y="Task", color="Resource", text="Task", width=1600, height=800)
fig.update_yaxes(autorange="reversed")
fig.update_xaxes(
    dtick="1000",
    tickformat="%M:%S",
    ticklabelmode="instant")
fig.update_layout(xaxis_range=[df.Start.min(), df.Finish.max()])
fig.show()

The output of conda list:

pip                       20.2.4                     py_0    conda-forge
plotly                    4.12.0                   pypi_0    pypi
python                    3.7.8           h6f2ec95_1_cpython    conda-forge

Solution

  • (this is an answer in progress and prone to changes)


    A possible solution:

    This is only a possible solution, since I'm still not able to reproduce your code snippet and the corresponding plot 100%; we'll take a closer look at that in the details. You'll also need the latest Plotly version and Kaleido. Both are pretty straightforward installations and, in my humble opinion at least, a huge step forward for Plotly.

    Code 0:

    # Uses csv, pd, px and StringIO exactly as defined in Code 1 below
    df = pd.read_csv(StringIO(csv))
    fig = px.timeline(df, x_start="Start", x_end="Finish", y="Task", color="Resource", text="Task", width=1600, height=800)
    f2 = fig.full_figure_for_development(warn=False)
    f2.layout.barmode = 'group'
    f2.show()
    

    Plot 0:

    enter image description here


    The details:


    We're going to have to do this step by step. Firstly, when I run your provided code, I'm getting this:

    Plot 1:

    enter image description here

    Code 1:

    import pandas as pd
    import plotly.express as px
    from io import StringIO
    
    
    csv = """Task,Start,Finish,Workstation,Resource
    1,1970-01-01 01:00:00.000,1970-01-01 01:00:05.400,1,ABL
    2,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,2,ABS
    3,1970-01-01 01:00:00.000,1970-01-01 01:00:01.000,3,ABU
    4,1970-01-01 01:00:00.000,1970-01-01 01:00:02.000,4,ACC
    5,1970-01-01 01:00:02.000,1970-01-01 01:00:03.300,4,ACC
    6,1970-01-01 01:00:03.300,1970-01-01 01:00:05.300,4,ACC
    7,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,5,ABS
    8,1970-01-01 01:00:00.000,1970-01-01 01:00:01.000,6,ACT
    9,1970-01-01 01:00:00.000,1970-01-01 01:00:02.000,7,ACC
    10,1970-01-01 01:00:02.000,1970-01-01 01:00:03.300,7,ACC
    11,1970-01-01 01:00:03.300,1970-01-01 01:00:05.300,7,ACC
    12,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,8,ABS
    13,1970-01-01 01:00:00.000,1970-01-01 01:00:01.000,9,ABU
    14,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,10,ACC
    15,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,11,ABS
    16,1970-01-01 01:00:00.000,1970-01-01 01:00:01.000,12,ABU
    17,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,13,ACC
    18,1970-01-01 01:00:01.300,1970-01-01 01:00:03.300,13,ACC
    19,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,14,ABS
    20,1970-01-01 01:00:00.000,1970-01-01 01:00:01.000,15,ABP
    21,1970-01-01 01:00:00.000,1970-01-01 01:00:01.500,16,ABZ
    22,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,17,ACC
    23,1970-01-01 01:00:01.300,1970-01-01 01:00:03.300,17,ACC
    24,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,18,ABS
    25,1970-01-01 01:00:00.000,1970-01-01 01:00:01.000,19,AAW
    26,1970-01-01 01:00:00.000,1970-01-01 01:00:02.000,20,ACC
    27,1970-01-01 01:00:02.000,1970-01-01 01:00:03.300,20,ACC
    28,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,21,ABS
    29,1970-01-01 01:00:00.000,1970-01-01 01:00:01.000,22,ABU
    30,1970-01-01 01:00:00.000,1970-01-01 01:00:01.300,23,ACC"""
    
    df = pd.read_csv(StringIO(csv))
    # df = pd.read_csv("stackoverflow.csv")
    fig = px.timeline(df, x_start="Start", x_end="Finish", y="Task", color="Resource", text="Task", width=1600, height=800)
    fig.update_yaxes(autorange="reversed")
    fig.update_xaxes(
        dtick="1000",
        tickformat="%M:%S",
        ticklabelmode="instant")
    fig.update_layout(xaxis_range=[df.Start.min(), df.Finish.max()])
    fig.show()
    

    In order to come close to your provided screenshot, I have to comment out a few lines as shown below. But it's still not the same figure as yours.

    Plot 2:

    enter image description here

    Code 2 (same dataset):

    df = pd.read_csv(StringIO(csv))
    # df = pd.read_csv("stackoverflow.csv")
    fig = px.timeline(df, x_start="Start", x_end="Finish", y="Task", color="Resource", text="Task", width=1600, height=800)
    # fig.update_yaxes(autorange="reversed")
    # fig.update_xaxes(
    #     dtick="1000",
    #     tickformat="%M:%S",
    #     ticklabelmode="instant")
    # fig.update_layout(xaxis_range=[df.Start.min(), df.Finish.max()])
    fig.show()
    

    I find that a bit weird, and the possible solution is even weirder. If you take a look at the post "Plotly: How to inspect and make changes to a plotly figure?", you'll see how you can reveal most of the figure attributes using f2 = fig.full_figure_for_development(warn=False). Any change you make to f2 should also be possible to make to fig, but not in this case. In order to get a result that resembles your desired output, I had to do the following:

    Code 3:

    df = pd.read_csv(StringIO(csv))
    fig = px.timeline(df, x_start="Start", x_end="Finish", y="Task", color="Resource", text="Task", width=1600, height=800)
    f2 = fig.full_figure_for_development(warn=False)
    f2.layout.barmode = 'group'
    f2.show()
    

    Plot 3:

    enter image description here

    Maybe we're getting somewhere here? But now you might think "why not fig.layout.barmode = 'group'?". Well, here's the result of that:

    Plot 4:

    enter image description here

    Code 4:

    df = pd.read_csv(StringIO(csv))
    fig = px.timeline(df, x_start="Start", x_end="Finish", y="Task", color="Resource", text="Task", width=1600, height=800)
    f2 = fig.full_figure_for_development(warn=False)
    # f2.layout.barmode = 'group'
    # f2.show()
    fig.layout.barmode = 'group'
    fig.show()
    

    I find this whole thing to be more than a bit strange. But please try it out and let me know how it all works out for you!
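
One possible explanation, stated as an assumption that nothing in this post confirms: suppose the bars are sized separately for each trace (one trace per Resource), based on the smallest gap between that trace's numeric Task values. Resources whose tasks sit on adjacent rows would then get thinner bars than resources whose tasks are spread out. That would also fit two observations above: unique resource names (one bar per trace) give uniform bars, and so does barmode = 'group'. The sketch below uses only the standard library and the (Task, Resource) pairs from the CSV. The name min_spacing_per_resource is a hypothetical helper, and the sizing rule itself is the assumption being illustrated, not documented plotly behaviour.

```python
from collections import defaultdict

# (Task, Resource) pairs copied from the CSV in the question.
rows = [
    (1, "ABL"), (2, "ABS"), (3, "ABU"), (4, "ACC"), (5, "ACC"), (6, "ACC"),
    (7, "ABS"), (8, "ACT"), (9, "ACC"), (10, "ACC"), (11, "ACC"), (12, "ABS"),
    (13, "ABU"), (14, "ACC"), (15, "ABS"), (16, "ABU"), (17, "ACC"), (18, "ACC"),
    (19, "ABS"), (20, "ABP"), (21, "ABZ"), (22, "ACC"), (23, "ACC"), (24, "ABS"),
    (25, "AAW"), (26, "ACC"), (27, "ACC"), (28, "ABS"), (29, "ABU"), (30, "ACC"),
]


def min_spacing_per_resource(rows):
    """Return the smallest gap between distinct Task values for each Resource.

    Resources with a single task have no gap, so they map to None.
    """
    positions = defaultdict(list)
    for task, resource in rows:
        positions[resource].append(task)

    spacing = {}
    for resource, tasks in positions.items():
        distinct = sorted(set(tasks))
        gaps = [b - a for a, b in zip(distinct, distinct[1:])]
        spacing[resource] = min(gaps) if gaps else None
    return spacing


spacing = min_spacing_per_resource(rows)
for resource in sorted(spacing):
    print(resource, spacing[resource])
```

For this data the smallest gap is 1 for ACC and 3 for both ABS and ABU, while the single-task resources have no gap at all. If the assumption holds, that difference alone would be enough to produce bars of different thickness per resource.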