I wanted to read an excel sheet that has one merged column and I need to treat that column to determine the rows that are grouped with. So that I can consider that group as a set. Likewise each merged rows are considered as set1, set2, set3 and so on. I will then process each set w its rows separately.
UPDATE: As Requested by @fireandfuel, I am including the approche that I have arrived.
CTMergeCells mergeCells = workSheet.getMergeCells();
SheetData sheetData = workSheet.getSheetData();
List<ArrayList<Row>> rowGroups = new ArrayList<ArrayList<Row>>();
List<CTMergeCell> cTMergeCells = mergeCells.getMergeCell();
for(CTMergeCell mcells : cTMergeCells){
String range = mcells.getRef();
Integer rowStart = Integer.parseInt(range.substring(1, 2));
Integer rowEnd = Integer.parseInt(range.substring(4, 5));
ArrayList<Row> rowss = (ArrayList<Row>) sheetData.getRow().stream().filter(row -> {
return row.getR() >= rowStart.longValue() && row.getR() <= rowEnd.longValue();
}).collect(Collectors.toList());
rowGroups.add((ArrayList<Row>) rowss);
}
It's very easy to access the definition of merged cells in SpreadsheetML
using docx4j.
Office Open XML
files (docx, xlsx and pptx) are simply ZIP archives which are containing XML files. You can open it with the most archive programs and have a look inside its file structure and files.
Here is some source code for to access the definition of merged cells using docx4j:
// load yourFile.xlsx file
SpreadsheetMLPackage spreadsheetMLPackage = SpreadsheetMLPackage.load(new File("yourFile.xlsx"));
// get worksheet from /xl/worksheets/yourSheet.xml file from yourFile.xlsx
WorksheetPart worksheetPart = (WorksheetPart) spreadsheetMLPackage.getParts().get(new PartName("/xl/worksheets/yourSheet.xml"));
Worksheet worksheet = worksheetPart.getJaxbElement();
// get the merged cells
CTMergeCells ctMergedCells = worksheet.getMergeCells();
if(ctMergedCells != null){
List<CTMergeCell> mergedCellList = ctMergedCells.getMergeCell();
// do something with the merged cells
}
The class org.xlsx4j.sml.CTMergeCell
has a function getRef
which gives you the reference of cells which are merged as String
, encoded as begin:end
, like A1:B2
(merged cells from A1
to B2
: A1
, A2
, B1
and B2
).