python pandas plotly plotly-python

Plot stacked bar chart in Plotly with fixed order based on second column


I have a dataframe that looks like this:

index,start,end,bar_len,name,color,gr
1,2300000.0,5300000.0,3000000.0,p36.32,#949494,g1
2,5300000.0,7100000.0,1800000.0,p36.31,#FFFFFF,g1
3,7100000.0,9100000.0,2000000.0,p36.23,#949494,g1
4,9100000.0,12500000.0,3400000.0,p36.22,#FFFFFF,g1

I want to create a horizontal stacked bar chart with the following output:

| - indx[1] [len=bar_len] | - indx[2] [len=bar_len] | - indx[3] [len=bar_len] | - indx[4] [len=bar_len]

I tried doing this the following way:

import plotly.express as px
import pandas as pd

input_path = r"example.csv"
df = pd.read_csv(input_path)
df.set_index('start')
fig = px.bar(
    df, x='bar_len', y='gr', color="color", orientation='h',
)

fig.update_layout(barmode='stack', xaxis={'categoryorder':'category ascending'})

The problem is that the values plotted on the bar chart are not sorted by the start column, which is what I am trying to achieve. So my question is: is there a way to plot a stacked bar chart that takes the length of each element from one column (bar_len) and orders the elements by another column (start)?

UPDATE: I have seen that the problem arises when I include the color argument. It re-sorts the bar chart by color instead of preserving the original order given by the index column. Is there any way to avoid this?


Solution

  • You can build this with plotly graph_objects; the code is below. Note: in the dataframe I changed the colors to the hex codes #FF0000 (red) and #0000FF (blue), and I use only the bar_len, color and gr columns. Adapted from this answer. The df looks like this:

        start   end bar_len name    color   gr
    0   2300000.0   5300000.0   3000000.0   p36.32  #FF0000 g1
    1   5300000.0   7100000.0   1800000.0   p36.31  #0000FF g1
    2   7100000.0   9100000.0   2000000.0   p36.23  #FF0000 g1
    3   9100000.0   12500000.0  3400000.0   p36.22  #0000FF g1
    

    The code is here:

    import pandas as pd
    import plotly.graph_objects as go

    input_path = r"example.csv"
    df = pd.read_csv(input_path)

    # Make the row order explicit: segments are stacked in the order they are added.
    df = df.sort_values('start')

    # One trace per row, so every segment keeps its own color and position in the stack.
    data = [
        go.Bar(
            x=[row.bar_len],
            y=[row.gr],
            marker=dict(color=row.color),
            orientation='h',
        )
        for row in df.itertuples()
    ]
    layout = dict(barmode='stack', yaxis={'title': 'gr'}, xaxis={'title': 'Length'})
    fig = go.Figure(data=data, layout=layout)
    fig.update_layout(showlegend=False, autosize=False, width=800, height=300)
    fig.show()
    

    [Figure: output graph]

    Note: If the x-axis can be expressed as a timeline and you can get the x values as datetimes, I suggest also looking at plotly.express.timeline, which produces Gantt-style charts. Sample here (check the first chart).