Background: I'm uploading a CSV file and a mapping file to my server using FormData and parsing the CSV with Papa Parse.
For some reason, the object Papa Parse outputs (which renders correctly with console.log) cannot be indexed by normal strings. I've even tried using JSON.parse(JSON.stringify(...)) on both my string and the object to see if I could normalize it somehow.
import Papa from 'papaparse'
import formidable from 'formidable'
import fs from 'fs'
...
const { files, fields } = await parseRequestForm(req)
let parsedMapping: Record<string, string> = JSON.parse(fields.mapping as string)
const f = files.file as formidable.File
const output = await new Promise<{ loadedCount: number; totalCount: number }>(
  (resolve) => {
    const filecontent = fs.createReadStream(f.path)
    filecontent.setEncoding('utf8')
    let loadedCount = 0
    let totalCount = 0
    Papa.parse<Record<string, any>>(filecontent, {
      header: true,
      skipEmptyLines: true,
      dynamicTyping: true,
      chunkSize: 25,
      encoding: 'utf8',
      chunk: async (out) => {
        const data = out.data.map((r) => applyMapping(r, parsedMapping))
        totalCount += data.length
        try {
          await prisma.softLead.createMany({ data }).then((x) => {
            loadedCount += x.count
          })
        } catch (e) { }
      },
      complete: () => resolve({ loadedCount, totalCount }),
    })
  }
)
type ParsedForm = {
  error: Error | string
  fields: formidable.Fields
  files: formidable.Files
}
function parseRequestForm(req: NextApiRequest): Promise<ParsedForm> {
  const form = formidable({ encoding: 'utf8' })
  return new Promise((resolve, reject) => {
    form.parse(req, (err, fields, files) => {
      // Return after rejecting so we don't fall through to resolve
      if (err) return reject({ err })
      resolve({ error: err, fields, files })
    })
  })
}
function applyMapping(
  data: Record<string, any>,
  mapping: Record<keyof SoftLead, string>
): Partial<SoftLead> {
  return Object.fromEntries(
    Object.entries(mapping).map(([leadField, csvField]) => {
      // Struggling to access field here
      console.log('Field', `"${csvField}"`)
      console.log('Data', data)
      const parsed = JSON.parse(JSON.stringify(data))
      console.log(Buffer.from(Object.keys(parsed)[0]))
      console.log(Buffer.from(Buffer.from(csvField).toString('utf8')))
      console.log(parsed[csvField]) // undefined
      return [leadField, data[csvField]]
    })
  )
}
The Buffer lines are also indicating that the strings are not the same, even though they print the same to the console.
Papaparse's Index
Buffer.from(Object.keys(parsed)[0])
=> <Buffer ef bb bf 45 6d 61 69 6c 73>
Map object key
Buffer.from(Buffer.from(csvField).toString('utf8'))
=> <Buffer 45 6d 61 69 6c 73>
A normal string
Buffer.from(Buffer.from('Emails').toString('utf-8'))
=> <Buffer 45 6d 61 69 6c 73>
I also tried the utf16le encoding, but I think it's failing altogether to parse because FormData apparently exclusively does utf8.
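For anyone comparing the Buffer dumps above: the three extra leading bytes, ef bb bf, are the UTF-8 encoding of the byte order mark (U+FEFF). It is invisible when printed, which would explain why the two keys look identical in the console but are not equal. Here is a minimal sketch that reproduces the mismatch without Papa Parse. The `toHex` helper and the sample values are made up for illustration, and the assumption is that the uploaded CSV starts with a BOM.

```typescript
// Sketch: why a key that prints as "Emails" can differ from the string 'Emails'.
// toHex is a hypothetical helper for this example only.
function toHex(s: string): string {
  return Array.from(new TextEncoder().encode(s), (b) =>
    b.toString(16).padStart(2, '0')
  ).join(' ')
}

// Assumed: the first header as read from a file that begins with a BOM
const keyFromCsv = '\uFEFFEmails'
const keyFromMapping = 'Emails'

console.log(toHex(keyFromCsv)) // ef bb bf 45 6d 61 69 6c 73
console.log(toHex(keyFromMapping)) // 45 6d 61 69 6c 73
console.log(keyFromCsv === keyFromMapping) // false
console.log(keyFromCsv.charCodeAt(0) === 0xfeff) // true

// Indexing with the plain string misses the BOM-prefixed key
const row: Record<string, string> = { [keyFromCsv]: 'a@example.com' }
console.log(row[keyFromMapping]) // undefined
```

Only the first header of the file is affected, since the BOM sits at the very start of the stream.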
I was able to solve this problem by stripping the BOM as described here. Simply,
const parsed = Object.fromEntries(
  Object.entries(data).map(([k, v]) => [stripBom(k), v])
)
export default function stripBom(str: string) {
  if (str.charCodeAt(0) === 0xfeff) {
    return str.slice(1)
  }
  return str
}
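A possible variation, sketched under assumptions: instead of rebuilding every row, the same check could run once per header. Papa Parse has a `transformHeader` config option that is called for each header when `header: true` is set, so a function like the one below could be passed there. `normalizeHeader` and the sample row are hypothetical names for this example, and the Papa Parse call is shown only as a comment so the snippet runs on its own.

```typescript
// Hypothetical helper: drops a leading U+FEFF (BOM) from a header name.
function normalizeHeader(header: string): string {
  return header.charCodeAt(0) === 0xfeff ? header.slice(1) : header
}

// Intended use (not executed here, assumes the stream from the question):
//   Papa.parse(filecontent, { header: true, transformHeader: normalizeHeader, ... })

// Stand-alone check with a made-up row whose first key carries a BOM
const rawRow: Record<string, string> = { '\uFEFFEmails': 'a@example.com', Name: 'Ada' }
const cleanedRow = Object.fromEntries(
  Object.entries(rawRow).map(([k, v]) => [normalizeHeader(k), v])
)

console.log(cleanedRow['Emails']) // a@example.com
console.log(Object.keys(cleanedRow)) // [ 'Emails', 'Name' ]
```

Doing it at the header level means applyMapping never sees the BOM, but I haven't measured whether it makes any practical difference over stripping it per row.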