Tags: node.js, csv, amazon-s3, csv-parser

Node.js: fetch and parse a CSV file using csv-parser


I am trying to fetch the contents of a CSV file from AWS S3 using axios, parse it with csv-parser, and then store the parsed data in my local database. Since the bucket/file is public, I'm pretty sure I don't need to include the bucket name, key, or access/security keys in the request. Right now I can fetch the contents of the CSV file, but they are not parsed. The response is plain text: the column names first, then the values of each record on their own line, something like this:

taskID,status,lastUpdate
1,ongoing,2023-01-01
2,completed,2023-01-02

I tried using csv-parser, but I'm not sure what I'm missing, and I can't get any error to display. Here's what I've done so far:

const fs = require("fs")
const csvParser = require("csv-parser")
const axios = require("axios")

async function parseCSVFile(filePath) {
  let parsedData = []

  axios.get(filePath).then(function(response) {
      let csvData = response.data
      console.log(csvData);

      csvParser(csvData, { headers: true })
      .on('data', function(data) {
        console.log('asdasd');
        parsedData.push(data)
      })
      .on('end', function() {
        console.log('CSV data parsed', parsedData);
      })
      .on('error', function() {
        console.log("Error parsing CSV data");
      })
  })
  
}

I tried using fs initially and of course it worked, but only because the file I was parsing was stored locally. Now csvData has a value, but it is plain text, and nothing is pushed to parsedData.


Solution

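  • csv-parser works with streams: a readable stream is piped into it. It doesn't take the CSV text as an argument, as in your code; its only argument is an options object. Nothing is ever written to the parser in your code, so it emits no rows, which is why it doesn't work.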

    So, request the axios response as a stream:

    axios.get(filePath, { responseType: 'stream'})
    

    and then pipe the response stream into csv-parser (with no headers option, it takes the column names from the first line):

    response.data.pipe(csvParser())
    

    Try this:

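     axios.get(filePath, { responseType: 'stream' }).then(function(response) {
          let csvData = response.data // this is a readable stream now
    
          // No headers option: csv-parser reads the column names from the first line
          csvData.pipe(csvParser())
          .on('data', function(data) {
            console.log('row', data);
            parsedData.push(data)
          })
          .on('end', function() {
            console.log('CSV data parsed', parsedData);
          })
          .on('error', function(err) {
            // log the error itself so parsing failures are visible
            console.log("Error parsing CSV data", err);
          })
      }).catch(function(err) {
          // network or HTTP errors from axios end up here
          console.log("Error fetching CSV file", err);
      })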
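
If the parsed rows need to be handed back to the caller (for example, to store them in the database once parsing has finished), the same pipe can be wrapped in a Promise. The following is a minimal sketch under a few assumptions, not the only way to do it: collectRows is a hypothetical helper name, parseCSVFile reuses the name from the question, and axios and csv-parser are assumed to be installed. The parser is created without a headers option, so csv-parser takes the column names from the first line.

```javascript
// Hypothetical helper: pipe a readable `source` into `parser` and
// resolve with every row the parser emits.
function collectRows(source, parser) {
  return new Promise((resolve, reject) => {
    const rows = []
    // pipe() does not forward errors from the source, so listen on it too
    source.on("error", reject)
    source
      .pipe(parser)
      .on("data", (row) => rows.push(row))
      .on("end", () => resolve(rows))
      .on("error", reject)
  })
}

// Same function name as in the question; resolves with an array of row objects.
async function parseCSVFile(fileUrl) {
  // Required here (an assumption of this sketch) so that collectRows
  // stays usable on its own, without these two packages.
  const axios = require("axios")
  const csvParser = require("csv-parser")

  // Ask axios for a stream instead of a fully buffered string
  const response = await axios.get(fileUrl, { responseType: "stream" })
  return collectRows(response.data, csvParser())
}
```

With this, `const rows = await parseCSVFile(url)` resolves only after the parser has emitted 'end', and both fetch and parse errors reject the promise, so a single try/catch around the call is enough. If the CSV text is already in memory as a string, `Readable.from([csvData])` from Node's stream module should work as the source passed to collectRows.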