I am using OpenPyXL for processing, there are only 2 lines of code:
workbook = load_workbook(filename='input.xlsx', keep_vba=True, rich_text=True)
# I will do operator here, but there is an error, so I comment this line
workbook.save('output.xlsx')
Then I get the output with less size:
When I try to open output.xlsx, this notification appears:
The chart was removed by Microsoft Excel. Sometimes I cannot open it in Microsoft Excel, but LibreOffice can open it with no error.
Left: input.xlsx, right: output.xlsx. The chart was removed and Microsoft Excel reports an error:
The alert:
Often my input file has ~ 4 sheets, a chart, and 1 sheet has data. My task is to write data to one sheet, I do not care about the others. I want to get one of these solutions:
Using another library/language is the last option.
The problem is that your charts are combination charts. The chart plot area contains a bar chart and a line chart. Each of these charts has one series where the first series, which is the bar chart series, has idx
0 and order
1 while the second series, which is the line chart series, has idx
1 and order
0. The order is the series order in Select Data Source of the chart. There the line chart series is before the bar chart series.
Seems openpyxl
reader does not handle combination charts properly. It reads the first bar chart in plot area and re-indexes it's series. Other possible series of additional charts in plot area stay unchanged for idx
and order
. And because the reader only finds one chart series in first bar chart in plot area, this series get idx
0 and order
0 then, since it is the only one series known. The following not considered line chart series stays idx
1 and order
0. But then both series in plot area have the same order
0 and that leads to the error while opening in Excel.
Workaround:
Change the series order in Select Data Source of the charts. For left chart "Cost" before "CVs". For right chart "IMP" before "CPC". Then bar chart series has has idx
0 and order
0 and line chart series has has idx
1 and order
1. That works then.
If you don't want to change the source file, the following code should also work:
from openpyxl import load_workbook
wb = load_workbook("./input.xlsx", rich_text=True)
for sheet in wb.worksheets:
for chart in sheet._charts:
idx = 0
order = 0
for subChart in chart._charts:
print(subChart)
for ser in subChart.ser:
print(ser.idx)
print(ser.order)
ser.idx = idx
idx += 1
ser.order = order
order += 1
print(ser.idx)
print(ser.order)
wb.save("./output.xlsx")
It re-indexes and re-orders all series in all sub-charts for each main-chart in worksheet.