I am trying to plot a bar chart from the result of a groupby, but the app crashes and displays the error below. The error appears when the user selects 3 items in the multiselect widget:
ValueError: All arguments should have the same length. The length of argument color is 3, whereas the length of previously-processed arguments ['gender', 'count'] is 95
import plotly.express as px
import streamlit as st

# df is assumed to be a DataFrame loaded earlier in the app
some_columns_df = df.loc[:, ['gender', 'country', 'city', 'hoby', 'company', 'status']]
some_collumns = some_columns_df.columns.tolist()
select_box_var = st.selectbox("Choose X Column", some_collumns)
multiselect_var = st.multiselect("Select Columns To GroupBy", some_collumns)
test_g3 = df.groupby([select_box_var] + multiselect_var).size().reset_index(name='count')
# color receives a list of column names here, which triggers the ValueError
fig = px.histogram(test_g3, x=select_box_var, y='count', color=multiselect_var, barmode='group', text_auto=True)
I know the error comes from the color parameter of px.histogram.
The reason is that color accepts only a single column, not a list of column names. A list is treated as an array of per-row values, so passing
color=['column_a','column_b']
would cause
ValueError: All arguments should have the same length. The length of argument color is 2, whereas the length of previously-processed arguments ['total_bill'] is 244
Here 2 is the length of the list ['column_a','column_b'], while 244 is the number of rows in the dataframe.
According to the documentation:
color (str or int or Series or array-like) – Either a name of a column in data_frame, or a pandas Series or array_like object. Values from this column or array_like are used to assign color to marks.
Therefore, we pass either a single column name or a Series with one value per row.
Here's my approach:
import plotly.express as px
df = px.data.tips() # a data set from plotly
df.head()
Output
   total_bill   tip     sex smoker  day    time  size
0       16.99  1.01  Female     No  Sun  Dinner     2
1       10.34  1.66    Male     No  Sun  Dinner     3
2       21.01  3.50    Male     No  Sun  Dinner     3
3       23.68  3.31    Male     No  Sun  Dinner     2
4       24.59  3.61  Female     No  Sun  Dinner     4
The column sex has the unique values Female and Male, and the column time has the unique values Dinner and Lunch.
I chose these two columns because it is easy to see that there are only 4 combinations.
We create a Series that concatenates the sex and time columns:
categories = df[['sex','time']].agg(', '.join, axis=1)
print(categories)
Output
0 Female, Dinner
1 Male, Dinner
2 Male, Dinner
3 Male, Dinner
4 Female, Dinner
...
239 Male, Dinner
240 Female, Dinner
241 Male, Dinner
242 Male, Dinner
243 Female, Dinner
Length: 244, dtype: object
Use this categories Series as the color argument:
fig = px.histogram(df, x="total_bill", color =categories)
fig.show()
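Applied to the grouped frame from the question, the same idea might look like the sketch below. The helper names add_color_column and make_figure, the new column name group, and the handling of an empty or single selection are assumptions for illustration, not part of the original code. plotly is imported inside make_figure only so that the helper can be used on its own.

```python
import pandas as pd


def add_color_column(grouped, group_cols, name="group"):
    """Return (frame, color), where color is a single column name or None.

    Hypothetical helper: merges several grouping columns into one label
    column so that px.histogram receives exactly one column for color.
    """
    group_cols = list(group_cols)
    if not group_cols:
        return grouped, None           # nothing selected: no color split
    if len(group_cols) == 1:
        return grouped, group_cols[0]  # one column can be passed by name
    out = grouped.copy()
    # Cast to str first because str.join only accepts strings.
    out[name] = out[group_cols].astype(str).agg(", ".join, axis=1)
    return out, name


def make_figure(df, x_col, group_cols):
    """Count rows per combination and draw grouped bars (sketch)."""
    import plotly.express as px  # lazy import, see the note above

    # Leave x_col out of the extra keys so it is not grouped twice.
    extra = [c for c in group_cols if c != x_col]
    counts = df.groupby([x_col] + extra).size().reset_index(name="count")
    counts, color = add_color_column(counts, extra)
    return px.histogram(counts, x=x_col, y="count", color=color,
                        barmode="group", text_auto=True)
```

In the Streamlit app this would be called as make_figure(df, select_box_var, multiselect_var), and the returned figure passed to st.plotly_chart.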
If the ', '.join version raises an error, i.e. this line:
categories = df[['sex','time']].agg(', '.join, axis=1)
then try concatenating the columns directly:
categories = df['sex'] + df['time']
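Both variants can be made a little more robust. Since str.join only accepts strings, casting with astype(str) first is a cautious step, and adding a separator to the + version keeps values from different columns from running together. A minimal sketch with a small made-up frame (the data and the names with_join and with_plus are hypothetical):

```python
import pandas as pd

# Hypothetical sample data; "size" is numeric on purpose.
df = pd.DataFrame({
    "sex": ["Female", "Male", "Male"],
    "time": ["Dinner", "Dinner", "Lunch"],
    "size": [2, 3, 4],
})

# Variant 1: cast every column to str, then join the values of each row.
with_join = df[["sex", "time", "size"]].astype(str).agg(", ".join, axis=1)

# Variant 2: plain concatenation with an explicit separator.
with_plus = df["sex"].astype(str) + ", " + df["time"].astype(str)

print(with_join.tolist())  # ['Female, Dinner, 2', 'Male, Dinner, 3', 'Male, Lunch, 4']
print(with_plus.tolist())  # ['Female, Dinner', 'Male, Dinner', 'Male, Lunch']
```

Either result has one label per row, so it can be passed as color in the same way as categories.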