java, asp.net, aspose, aspose.words

How to read data from multiple HTML files and populate a single DOCX/PDF using Aspose?


I need to read the data from each HTML file and add it to a DOCX document as its own section. The same should apply to multiple HTML files, with each file's content becoming a separate section in a single document.


Solution

  • You can use the Document.appendDocument method. Each appended document is added to the destination document as a separate Section node, provided the source document contains only one section. For example:

    import com.aspose.words.*;

    // List of input HTML documents.
    String[] files = new String[]{"C:\\temp\\in1.html", "C:\\temp\\in2.html", "C:\\temp\\in3.html"};

    // Destination document; the appended content follows this introductory text.
    Document doc = new Document();
    DocumentBuilder builder = new DocumentBuilder(doc);
    builder.write("This is the main document where HTML documents will be appended");

    // Load each HTML file as a Document and append it to the destination.
    for (String path : files)
    {
        Document subDoc = new Document(path);
        doc.appendDocument(subDoc, ImportFormatMode.USE_DESTINATION_STYLES);
    }

    // The output format is inferred from the file extension; use ".pdf" to save as PDF.
    doc.save("C:\\Temp\\out.docx");
    

    If you insert the HTML with the DocumentBuilder.insertHtml method instead, use DocumentBuilder.insertBreak to insert a section break between the inserted HTML parts:

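    import com.aspose.words.*;
    import java.nio.file.Files;
    import java.nio.file.Path;

    // List of input HTML documents.
    String[] files = new String[]{"C:\\temp\\in1.html", "C:\\temp\\in2.html", "C:\\temp\\in3.html"};

    // Destination document; the inserted HTML follows this introductory text.
    Document doc = new Document();
    DocumentBuilder builder = new DocumentBuilder(doc);
    builder.write("This is the main document where HTML documents will be appended");

    // Insert each HTML file into its own section.
    for (String path : files)
    {
        // Start a new section on a new page.
        builder.insertBreak(BreakType.SECTION_BREAK_NEW_PAGE);
        // Read the file as text and insert it as HTML (Files.readString and Path.of require Java 11+).
        builder.insertHtml(Files.readString(Path.of(path)));
    }

    // The output format is inferred from the file extension; use ".pdf" to save as PDF.
    doc.save("C:\\Temp\\out.docx");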
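
Both loops above hard-code the input paths and, in the second case, rely on Files.readString, which needs Java 11 or later. The following is a minimal sketch of how the inputs could be collected from a folder instead, using only the standard library. The HtmlInputs class and its two methods are hypothetical helpers made up for illustration, not part of Aspose.Words, and the sketch assumes the HTML files are UTF-8 encoded.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper (not an Aspose.Words class): gathers HTML inputs from a folder.
final class HtmlInputs {

    private HtmlInputs() {}

    // Returns the regular files directly inside dir whose names end in
    // .html or .htm, sorted by path so the section order is predictable.
    static List<Path> listHtmlFiles(Path dir) throws IOException {
        try (Stream<Path> entries = Files.list(dir)) {
            return entries
                    .filter(Files::isRegularFile)
                    .filter(p -> {
                        String name = p.getFileName().toString().toLowerCase(Locale.ROOT);
                        return name.endsWith(".html") || name.endsWith(".htm");
                    })
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    // Reads a whole file as UTF-8 text (assumed encoding). Works on Java 8+,
    // unlike Files.readString, which was added in Java 11.
    static String readUtf8(Path file) throws IOException {
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    }
}
```

With a helper like this, the loop could iterate over HtmlInputs.listHtmlFiles(...) and pass either path.toString() to new Document(...) or HtmlInputs.readUtf8(path) to builder.insertHtml(...). Sorting by path is only one possible ordering; if the sections must appear in a specific order, an explicit list like the one in the original examples is the safer choice.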