Search code examples
node.jsexcelmongodbxlsx

NodeJS MongoDB Mongoose export nested subdocuments and arrays to XLSX columns


I have query results from MongoDB as an array of documents with nested subdocuments and arrays of subdocuments.

[
  {
    RecordID: 9000,
    RecordType: 'Item',
    Location: {
      _id: 5d0699326e310a6fde926a08,
      LocationName: 'Example Location A'
    }
    Items: [
      {
        Title: 'Example Title A',
        Format: {
          _id: 5d0699326e310a6fde926a01,
          FormatName: 'Example Format A'
        }
      },
      {
        Title: 'Example Title B',
        Format: {
          _id: 5d0699326e310a6fde926a01,
          FormatName: 'Example Format B'
        }
      }
    ],
  },
  {
    RecordID: 9001,
    RecordType: 'Item',
    Location: {
      _id: 5d0699326e310a6fde926a08,
      LocationName: 'Example Location C'
    },
    Items: [
      {
        Title: 'Example Title C',
        Format: {
          _id: 5d0699326e310a6fde926a01,
          FormatName: 'Example Format C'
        }
      }
    ],
  }
]

Problem

I need to export the results to XLSX in column order. The XLSX library is working to export the top-level properties (such as RecordID and RecordType) only. I also need to export the nested objects and arrays of objects. Given a list of property names e.g. RecordID, RecordType, Location.LocationName, Items.Title, Items.Format.FormatName the properties must be exported to XLSX columns in the specified order.

Desired result

Here is the desired 'flattened' structure (or something similar) that I think should be able to convert to XLSX columns.

[
  {
    'RecordID': 9000,
    'RecordType': 'Item',
    'Location.LocationName': 'Example Location A',
    'Items.Title': 'Example Title A, Example Title B',
    'Items.Format.FormatName': 'Example Format A, Example Format B',
  },
  {
    'RecordID': 9001,
    'RecordType': 'Item',
    'Location.LocationName': 'Example Location C',
    'Items.Title': 'Example Title C',
    'Items.Format.FormatName': 'Example Format C',
  }
]

I am using the XLSX library to convert the query results to XLSX which works for top-level properties only.

  const worksheet: XLSX.WorkSheet = XLSX.utils.json_to_sheet(results.data);
  const workbook: XLSX.WorkBook = { Sheets: { 'data': worksheet }, SheetNames: ['data'] };
  const excelBuffer: any = XLSX.write(workbook, { bookType: 'xlsx', type: 'array' });

  const data: Blob = new Blob([excelBuffer], { type: EXCEL_TYPE });
  FileSaver.saveAs(data, new Date().getTime());

POSSIBLE OPTIONS

I am guessing I need to 'flatten' the structure either using aggregation in the query or by performing post-processing when the query is returned.

Option 1: Build the logic in the MongoDB query to flatten the results.

$replaceRoot might work since it is able to "promote an existing embedded document to the top level". Although I am not sure if this will solve the problem exactly, I do not want to modify the documents in place, I just need to flatten the results for exporting.

Here is the MongoDB query I am using to produce the results:

records.find({ '$and': [ { RecordID: { '$gt': 9000 } } ]},
  { skip: 0, limit: 10, projection: { RecordID: 1, RecordType: 1, 'Items.Title': 1, 'Items.Location': 1 }});

Option 2: Iterate and flatten the results on the Node server

This is likely not the most performant option, but might be the easiest if I can't find a way to do so within the MongoDB query.

UPDATE:

I may be able to use MongoDB aggregate $project to 'flatten' the results. For example, this aggregate query effectively 'flattens' the results by 'renaming' the properties. I just need to figure out how to implement the query conditions within the aggregate operation.

db.records.aggregate({
  $project: {
    RecordID: 1,
    RecordType: 1,
    Title: '$Items.Title',
    Format: '$Items.Format'
  }
})

UPDATE 2:

I have abandoned the $project solution because I would need to change the entire API to support aggregation. Also, I would need to find a solution for populate because aggregate does not support it, rather, it uses $lookup which is possible but time consuming because I would need to write the queries dynamically. I am going back to look into how to flatten the object by creating a function to iterate the array of objects recursively.


Solution

  • I wrote a function to iterate all object in the results array and create new flattened objects recursively. The flattenObject function shown here is similar to the previous answer and I took additional inspiration from this related answer.

    The '_id' properties are specifically excluded from being added to the flattened object, since ObjectIds are still being returned as bson types even though I have the lean() option set.

    I still need to figure out how to sort the objects such that they are in the order given e.g. RecordID, RecordType, Items.Title. I believe that might be easiest to achieve by creating a separate function to iterate the flattened results, although not necessarily the most performant. Let me know if anyone has any suggestions on how to achieve the object sorting by a given order or has any improvements to the solution.

    const apiCtrl = {};
    
    /**
     * Async array iterator
     */
    apiCtrl.asyncForEach = async (array, callback) => {
      for (let index = 0; index < array.length; index++) {
        await callback(array[index], index, array)
      }
    }
    
    // Check if a value is an object
    const isObject = (val) => {
      return typeof val == 'object' && val instanceof Object && !(val instanceof Array);
    }
    
    // Check if a value is a date object
    const isDateObject = (val) => {
      return Object.prototype.toString.call(val) === '[object Date]';
    }
    
    /**
     * Iterate object properties recursively and flatten all values to top level properties
     * @param {object} obj Object to flatten
     * @param {string} prefix A string to hold the property name
     * @param {string} res A temp object to store the current iteration
     * Return a new object with all properties on the top level only
     *
     */
    const flattenObject = (obj, prefix = '', res = {}) =>
    
      Object.entries(obj).reduce((acc, [key, val]) => {
        const k = `${prefix}${key}`
    
        // Skip _ids since they are returned as bson values
        if (k.indexOf('_id') === -1) {
          // Check if value is an object
          if (isObject(val) && !isDateObject(val)) {
            flattenObject(val, `${k}.`, acc)
          // Check if value is an array
          } else if (Array.isArray(val)) {
            // Iterate each array value and call function recursively
            val.map(element => {
              flattenObject(element, `${k}.`, acc);
            });
          // If value is not an object or an array
          } else if (val !== null & val !== 'undefined') {
            // Check if property has a value already
            if (res[k]) {
              // Check for duplicate values
              if (typeof res[k] === 'string' && res[k].indexOf(val) === -1) {
                // Append value with a separator character at the beginning
                res[k] += '; ' + val;
              }
            } else {
              // Set value
              res[k] = val;
            }
          }
        }
    
        return acc;
    
      }, res);
    
    /**
     * Convert DB query results to an array of flattened objects
     * Required to build a format that is exportable to csv, xlsx, etc.
     * @param {array} results Results of DB query
     * Return a new array of objects with all properties on the top level only
     */
    apiCtrl.buildExportColumns = async (results) => {
    
      const data = results.data;
      let exportColumns = [];
    
      if (data && data.length > 0) {
        try {
          // Iterate all records in results data array
          await apiCtrl.asyncForEach(data, async (record) => {
    
            // Convert the multi-level object to a flattened object
            const flattenedObject = flattenObject(record);
            // Push flattened object to array
            exportColumns.push(flattenedObject);
    
          });
        } catch (e) {
          console.error(e);
        }
      }
    
      return exportColumns;
    
    }