What's the easiest way to plot SDMX data with pandas or Plotly?
I have the following code:
import pandasdmx as sdmx
import plotly.express as px
df = sdmx.Request('OECD').data(
resource_id='MEI_FIN',
key='IR3TIB.GBR+USA.M',
params={'startTime': '1900-06', 'dimensionAtObservation': 'TimeDimension'},
).write().reset_index()
df
I end up getting an error when I try to plot the data with:
fig = px.line(df, x="TIME_PERIOD", y='', title='Life expectancy in Country: Denmark')
fig.show()
The error is as follows:
ValueError: Value of 'y' is not the name of a column in 'data_frame'. Expected one of [('TIME_PERIOD', '', ''), ('IR3TIB', 'GBR', 'M'), ('IR3TIB', 'USA', 'M')] but received:
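I am pretty new to Python, so I would appreciate any comment that could help me with this.
I think your main problem is that the columns of your df are a MultiIndex: as the error message shows, each column name is a tuple such as ('IR3TIB', 'GBR', 'M') rather than a plain string, so y='' matches none of them. I'm not sure if this is what you want to achieve, but you can try the following code: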
import pandasdmx as sdmx
import plotly.express as px
df = sdmx.Request('OECD').data(
resource_id='MEI_FIN',
key='IR3TIB.GBR+USA.M',
params={'startTime': '1900-06', 'dimensionAtObservation': 'TimeDimension'},
).write().reset_index()
# Flatten the MultiIndex columns into plain strings by joining
# the non-empty levels with "_". A list comprehension is used
# here; a loop would work just as well.
df.columns = ["_".join([c for c in col if c != ''])
              for col in df.columns]
fig = px.line(
    df,
    x="TIME_PERIOD",
    y=['IR3TIB_GBR_M', 'IR3TIB_USA_M'],
    title='Life expectancy in GBR and USA',
).update_layout(title_x=0.5)
fig.show()
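If you want to see what the flattening step does without calling the OECD service, here is a minimal sketch on a hand-built frame. The values are made up for illustration, and the column tuples are copied from the error message above; `long_df`, `series` and `value` are names I chose, not anything pandasdmx produces.

```python
import pandas as pd

# Synthetic stand-in for the frame returned by .write().reset_index().
# The numbers are invented; only the column layout mirrors the question.
columns = pd.MultiIndex.from_tuples(
    [("TIME_PERIOD", "", ""), ("IR3TIB", "GBR", "M"), ("IR3TIB", "USA", "M")]
)
df = pd.DataFrame(
    [["2020-01", 0.7, 1.8], ["2020-02", 0.6, 1.6], ["2020-03", 0.5, 1.1]],
    columns=columns,
)

# Join the non-empty levels of each column tuple into a single string.
df.columns = ["_".join(level for level in col if level != "") for col in df.columns]
print(list(df.columns))
# ['TIME_PERIOD', 'IR3TIB_GBR_M', 'IR3TIB_USA_M']

# Optional: reshape to long form, one row per (period, series) pair.
long_df = df.melt(id_vars="TIME_PERIOD", var_name="series", value_name="value")
```

With the long form you could also call px.line(long_df, x="TIME_PERIOD", y="value", color="series"), which draws one line per series without listing the column names by hand.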