pandas, plotly, bar-chart, histogram, plotly-python

Why do px.bar colors get lighter as the size of the data grows?


I have a problem with a plotly bar chart built from 2 categorical columns. The graph works, but it is not clear, as you can see. I changed the colors, but the result is still the same.

Data sample:

Job          y
Housemaid    yes
Admin.       No
Services     yes
Services     no

The code is very simple, but I couldn't figure out the problem:

import plotly.express as px

fig = px.bar(bank_data, x="job", color="y")
fig.show()

Here is what I get:

[image: plotly bar result]

Edit: It depends on the size of the data. For example, this is what I get with 1000 rows:

[image]

But with 2000 rows the colors become lighter:

[image]

That is why the chart isn't clear at all when I use all of the data.

Data used: https://www.kaggle.com/datasets/volodymyrgavrysh/bank-marketing-campaigns-dataset

Edit: Solved by adding a "count" column and then grouping by "job" and "y":

bank_data["count"] = 1
bank_data = bank_data.groupby(["job", "y"], as_index=False)["count"].sum()
fig = px.bar(bank_data, x="job", y="count", color="y", barmode='group')
fig.show()

The result:

[image]
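The counting step can also be done without a helper column. This is a minimal sketch, assuming a tiny made-up sample in place of the real dataset; the `counts` name is only illustrative.

```python
import pandas as pd

# Made-up sample with the same two columns as the question's data.
bank_data = pd.DataFrame(
    {
        "job": ["housemaid", "admin.", "services", "services", "services"],
        "y": ["yes", "no", "yes", "no", "no"],
    }
)

# size() counts the rows in each (job, y) group, so no column of ones is needed
# and the other columns of the real dataset would be left untouched.
counts = bank_data.groupby(["job", "y"]).size().reset_index(name="count")
print(counts)
```

The resulting `counts` frame has one row per (job, y) pair and should work with the same `px.bar(..., x="job", y="count", color="y", barmode='group')` call.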


Solution

  • You want to count the number of yes and no values for each job, so you can use px.histogram with barmode='group' for this task instead of creating a new column in your dataframe, as follows:

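    import pandas as pd
    import plotly.express as px

    # The dataset's CSV is semicolon-separated
    df = pd.read_csv('bank-additional-full.csv', sep=';')

    # One bar per (job, y) pair; the histogram does the counting
    fig = px.histogram(df, x="job", color="y", barmode='group')
    fig.show()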
    


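    Your problem is mentioned here, where it is suggested to use a histogram rather than a bar chart. px.bar draws one rectangular mark per row of the dataframe, while px.histogram aggregates the rows into a single bar per category.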