Search code examples
python-3.xpandaspython-pptx

How to handle missing x-axis values when plotting multiple line charts using python-pptx


I am trying to plots multiple line charts on same graph using python-pptx module. Here is my data:

month   desc value
201911   a  1164
201912   a   971
202001   a  1125
202005   b  1549
202005   a  1038
202006   b  1244
202006   a  1475
202007   a   960
    

The month column can range from 201910 to 202008.

I draw line charts for a and b desc colum values with month on X axis.

Here is the code:

chart_data = ChartData()
cat=list(data["month"].unique())
chart_data.categories = cat
data=data.sort_values(['month'], ascending=[True])
for i in list(data["desc"].unique()):
            
    
    chart_data.add_series(i,tuple(data[data['desc']==i]['value']))
    
    x, y, cx, cy = Inches(1), Inches(1), Inches(9), Inches(6)
    chart = slide.shapes.add_chart(XL_CHART_TYPE.LINE, x, y, cx, cy, chart_data).chart

As we can see for value of a in desc column and the values from b are different.

The plot gives wrong alignment of x axis and y-axis values.

enter image description here

The plot for b starts rom 201911... However it should start from 202005. How to correct this?


Solution

  • You need to use None as the value for any points you don't want to be plotted. I looks like your data retrieval simply omits missing data points. You need to get it so the categories sequence and the values sequence (for each series) are the same length and aligned by position.

    Like:

    categories = ("Jan", "Feb", "Mar")
    series_values = (1, 2, 3)
    

    If the value for Jan and Feb are missing, sending:

    categories = ("Jan", "Feb", "Mar")
    series_values = (3)
    

    will give you a chart like the one in your example. What it needs to be is:

    categories = ("Jan", "Feb", "Mar")
    series_values = (None, None, 3)