Search code examples
node.jsexceljs

How to get/parse HTML string for a cell value in excelJS?


I am trying to read an excel file where cells have formatting like Bold, Italic, Underline using exceljs library. I want to store the values in the cell as html but when I try to use r.html - it says html does not exist. Please check the attachment image. I would like to get output like <strong>Hello</strong> . How can I achieve this.

const workbook = new ExcelJS.Workbook();
    const worksheet = await workbook.xlsx.readFile(
      "../src/controllers/Formatted.xlsx"
    );



const rows = worksheet.getWorksheet("Sheet1").getRows(1, 3);
    rows?.forEach((row) =>
      console.log(row?.eachCell((r) => console.log(r.value, r.type)))
    );

Solution

  • After a lot of documentation mangling, I found ExcelJS provides the Font property for each cell, which returns an object which maintains booleans for bold, italics and other styles.

    For each cell in .xlsx file, we need to parse the HTML from the Font property and generate an HTML string for that cell content.

    Below is the snippet for converting formatted excel cell contents to HTML string.

    It supports styles like - Bold, Italic, Underline, Strikethrough.

    const convertFontToHTML = (
      text: string,
      font?: Partial<ExcelJS.Font>
    ): string[] => {
      return [
        font?.bold ? "<strong>" : "",
        font?.italic ? "<em>" : "",
        font?.underline ? "<u>" : "",
        font?.strike ? "<s>" : "",
        text,
        font?.strike ? "</s>" : "",
        font?.underline ? "</u>" : "",
        font?.italic ? "</em>" : "",
        font?.bold ? "</strong>" : "",
      ];
    };
    
    const combineTags = (str: string): string => {
      // simple tag combining where possible
      return str
        .replace(/<\/strong><strong>/g, "")
        .replace(/<\/em><em>/g, "")
        .replace(/<\/u><u>/g, "")
        .replace(/<\/s><s>/g, "");
    };
    

    Feel free to comment if any better solution exists :)