Search code examples
javascriptnode.jsnode-csv-parse

First property inaccsesible of objects parsed by csv-parse


I am parsing a csv file with following contents using csv-parse -

userID,sysID
20,50
30,71

However, on the objects returned it isn't possible to access the property created from the first column userID.

Here is my code --


async function main(){
    let systemIDs = await getSystemIds('./systems.csv');     
    console.log(`Scanning data for ${systemIDs.length} systems..`);

    console.log(systemIDs[0]);
    console.log(systemIDs[0].userID); // This prints undefined 
    console.log(systemIDs[0].sysID);  // This prints the correct value
}

async function getSystemIds(path){
    let ids= [];
    await new Promise ((resolve,reject)=>{        
        const csvParser = csvParse({columns:true, skip_empty_lines: true});
        FS.createReadStream(path)
        .pipe(csvParser)
        .on('readable', ()=>{
            let record ;            
            while(record = csvParser.read()) {
                ids.push(record);
            }            
        })
        .on('finish',()=>{
            resolve();
        });
    });
    return ids;
}

Output -

Scanning data for 2 systems..
{ 'userID': '20', sysID: '50' } 
undefined  // <== The Problem
50

I notice the first column key userID has single quotes around it in the console output whereas sysID doesn't. But don't know what is causing them.


Solution

  • Figured it out myself in the end...

    I needed the bom option. The documentation states it should be set to true for UTF-8 files. But it is set to false by default.

    CSV files generated by MS Excel have a hidden BOM character in them. This gets picked up as part of the header (and key name) by the parser. Setting the bom option to true makes csv-parse compatible with utf-8 files with or without bom character in them.

    Set the bom option when initializing csv-parse -

    const csvParser = csvParse({
      columns: true, 
      skip_empty_lines: true,
      bom: true
    });
    

    It isn't very clear to me why this bom character is added to the csv files by Excel. Quote from wikipedia BOM -

    The Unicode Standard neither requires nor recommends the use of the BOM for UTF-8, but warns that it may be encountered at the start of a file trans-coded from another encoding