I have two dataframes. One holds barcodes and the prices of the corresponding products across several supermarkets during the COVID pandemic. The other holds the actual product names and their barcodes, along with a few fields such as category_id, which are known (so I know which category_id to target in order to get a specific set of barcodes).
df_markets:
BARCODE AB BFRESH LIDL SUPERM
Date
2020-01-03 5201263086618 6.36 7.97 0 0 8.31
2020-01-03 5201263086625 7.58 9.53 0 0 9.91
2020-01-03 7322540574852 18.11 18.34 0 0 8.86
2020-01-03 7322540647136 18.8 18.95 0 0 18.9
2020-01-03 7322540587555 18.22 18.98 0 0 9.21
df_bar:
product_id category_id subcategory_name BARCODE name
0 1.0 37.0 Adult diapers 5.201263e+12 Sani Pants N2 Medium 14
1 2.0 37.0 Adult diapers 5.201263e+12 Sani Pants N3 Large 14
2 3.0 37.0 Adult diapers 7.322541e+12 Tena Pants Plus Large
3 4.0 37.0 Adult diapers 7.322541e+12 Tena Slip Super Large N4
4 5.0 37.0 Adult diapers 7.322541e+12 Tena Pants Plus Extra Large
So here's the code I've been using so far:
import plotly.express as px

# BARCODE shows up as a float in df_bar; strip the trailing ".0" to fix the formatting
df_bar['BARCODE'] = df_bar['BARCODE'].apply(lambda x: str(x)[:-2])

# Take the barcodes with category_id 61 and locate them inside df_markets
barcodes_61 = df_bar.loc[df_bar['category_id'] == 61.0, 'BARCODE']
df_markets_rm = df_markets.loc[df_markets['BARCODE'].isin(barcodes_61)]

# Bar plot to see the ups and downs in the prices of the products
fig = px.bar(df_markets_rm, x=df_markets_rm.index, y='AB',
             hover_data=['BARCODE', 'AB'], color='BARCODE',
             labels={'AB': 'AB price'}, height=800, width=1000,
             facet_row='BARCODE')
fig.show()
But the plot turns out like this:
The problem, I think, is quite obvious: the prices on the y axis are messed up and not sorted at all. This is the head of df_markets_rm in case it helps:
BARCODE AB BFRESH LIDL SUPERM
Date
2020-01-03 5201050900028 3.23 2.94 0 0
2020-01-03 5201080124067 2.78 2.8 0 0
2020-01-03 5201080124111 2.76 2.51 0 0
2020-02-03 5201050900028 3.23 2.94 0 0
2020-02-03 5201080124067 2.78 2.8 0 0 2.8
Do you have any idea how to fix this plotting problem?
The problem was that the data had dtype object rather than a numeric dtype, so I converted it using:
df_markets_rm['AB'] = pd.to_numeric(df_markets_rm['AB'])
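For anyone hitting the same thing, here is a minimal sketch of that conversion applied to several price columns at once. The small dataframe is made-up sample data shaped like df_markets_rm, and the `price_cols` list, the `.copy()` call and `errors="coerce"` are my own assumptions, not part of the original fix. A likely explanation for the unsorted axis is that string values get treated as categories rather than numbers, so checking `dtypes` first is a quick way to confirm.

```python
import pandas as pd

# Made-up sample shaped like df_markets_rm, with prices stored as strings
df_markets_rm = pd.DataFrame(
    {
        "BARCODE": ["5201050900028", "5201080124067", "5201050900028"],
        "AB": ["3.23", "2.78", "3.23"],
        "BFRESH": ["2.94", "2.8", "2.94"],
    },
    index=pd.to_datetime(["2020-01-03", "2020-01-03", "2020-02-03"]),
)
df_markets_rm.index.name = "Date"

# Inspect the column types before plotting; the price columns are not numeric yet
print(df_markets_rm.dtypes)

# Assumption: these are the price columns to convert
price_cols = ["AB", "BFRESH"]

# Work on an explicit copy so the assignment does not hit a slice of df_markets
df_markets_rm = df_markets_rm.copy()

# errors="coerce" turns unparseable values into NaN instead of raising
df_markets_rm[price_cols] = df_markets_rm[price_cols].apply(
    pd.to_numeric, errors="coerce"
)

print(df_markets_rm.dtypes)
```

With numeric columns, the same px.bar call from the question should draw the y axis on a continuous scale.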