javascript, node.js, xlsx

XLSX returns strange characters from Buffer Object/Base 64


I'm trying to read an Excel file that has been converted into a base64 string. The file was originally a react-dropzone file before conversion. Unfortunately, when I parse the data with XLSX, the result contains unusual characters. Below is my code:

const XLSX = require("xlsx")

const file = "base64String"

const buffer = Buffer.from(file, 'base64')
const workbook = XLSX.read(buffer, { type: 'buffer' })

const sheetNamesList = workbook.SheetNames
// parse excel data to json
const excelData = XLSX.utils.sheet_to_json(workbook.Sheets[sheetNamesList[0]])

When I log excelData, the output looks something like:

'u«ZjeÆ­ÿ¾wh¥éñWè®f­³ê\u001f~\'\u001ev.éí²ÞiÛ!yëfÈ^zÖÚ±î¸PK\u0003\u0004\u0014\u0000\u0006\u0000\b\u0000
\u0000\u0000!\u0000bîËNÃ0\u0010E÷HüCä-Jܲ@\b5íÇ\u0012*Q>ÀÄƪc[û\u0010B¡\u0015j7±\u0012ÏÜ{2ñÍh²nm¶Æ»R\fÈÀU^\
u001b7/ÅÇì%¿\u0017\u0019rZYï

Solution

  • Your string isn't pure base64. In your repl.it, additional data precedes the base64 content, and it gets decoded as junk along with the rest.

    If you strip the leading data:application/vnd.openxmlformats-officedocument.spreadsheetml.sheet;base64, part, then XLSX should be able to parse the resulting buffer properly.
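
A minimal sketch of stripping that prefix, assuming the string is a data URL of the form data:<mime>;base64,<payload>. The helper name stripDataUrlPrefix and the short sample payload are made up for illustration and are not part of XLSX:

```javascript
// Hypothetical helper: removes an optional "data:<mime>;base64," prefix
// and returns only the base64 payload. Plain base64 input is returned as-is.
function stripDataUrlPrefix(input) {
  const marker = ';base64,';
  const markerIndex = input.indexOf(marker);
  if (input.startsWith('data:') && markerIndex !== -1) {
    return input.slice(markerIndex + marker.length);
  }
  return input;
}

// Illustrative data URL; the payload is only the 4-byte ZIP signature,
// not a real workbook.
const dataUrl =
  'data:application/vnd.openxmlformats-officedocument.spreadsheetml.sheet;base64,UEsDBA==';

const buffer = Buffer.from(stripDataUrlPrefix(dataUrl), 'base64');

// .xlsx files are ZIP archives, so the decoded bytes should start with "PK".
console.log(buffer.subarray(0, 2).toString('latin1')); // prints "PK"
```

With a real file, the resulting buffer can then be passed to XLSX.read(buffer, { type: 'buffer' }) as in the question. Checking for the leading PK bytes is a quick way to confirm the prefix is gone before parsing.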