Search code examples
javascriptdatasetlodash

Javascript: Full Outer Join on a Array of Objects, using one key


I have two arrays of objects which I want to "Full Outer Join", like in SQL:

Dataset A:

[ { id: 1, name: "apple", color: "red" }, {id: 2, name: "banana", color: "yellow"} ]

Dataset B:

[ { id: 1, name: "apple", color: "blue" }, {id: 3, name: "mango", color: "green"} ]

Intended result:

[ { id: 1, dataset_a: { id: 1, name: "apple", color: "red" }
         , dataset_b: { id: 1, name: "apple", color: "blue" }
  }
, { id: 2, dataset_a: { id: 2, name: "banana", color: "yellow"}
         , dataset_b: null
  }
, { id: 3, dataset_a: null
         , dataset_b: { id: 3, name: "mango", color: "green"}
  }
]
  • The id's are unique.
  • Lodash may be used.
  • I have no restriction on ES version.

Instead of null, an empty object would be OK too. The id's don't necessarily need to be repeated, as shown below. So, this would be just as good:

[ { id: 1, dataset_a: { name: "apple", color: "red" }
         , dataset_b: { name: "apple", color: "blue" }
  }
, { id: 2, dataset_a: { name: "banana", color: "yellow"}
         , dataset_b: {}
  }
, { id: 3, dataset_a: {}
         , dataset_b: { name: "mango", color: "green"}
  }
]

Nina Scholz solution, transformed into a a function:

fullOuterJoin(dataset_a_name, dataset_b_name, dataset_a, dataset_b, key) {
    const getNullProperties = keys => Object.fromEntries(keys.map(k => [k, null]));
    var data = { [dataset_a_name]:dataset_a, [dataset_b_name]:dataset_b },
        result = Object
            .entries(data)
            .reduce((r, [table, rows]) => {
                //forEach dynamic destructuring
                rows.forEach(({ [key]:id, ...row }) => {
                    if (!r[id]) r.items.push(r[id] = { [key]:id, ...getNullProperties(r.tables) });
                    r[id][table] = row;
                });
                r.tables.push(table);
                r.items.forEach(item => r.tables.forEach(t => item[t] = item[t] || null));
                return r;
            }, { tables: [], items: [] })
            .items;
        
    return result;
},

Solution

  • You could take a dynamic approach and store the wanted data sets in an object and iterate the entries form the object. Then group by id and get all items back.

    This approach uses an object as hash table with id as key and an array as storage for the result set. If an id is not known, a new object with id and previously used keys with null value are used. Then the actual data set is added to the object.

    Finally for missing tables null values are assigned as well.

    const
        getNullProperties = keys => Object.fromEntries(keys.map(k => [k, null]));
    
    var dataset_a = [{ id: 1, name: "apple", color: "red" }, { id: 2, name: "banana", color: "yellow" }],
        dataset_b = [{ id: 1, name: "apple", color: "blue" }, { id: 3, name: "mango", color: "green" }],
        data = { dataset_a, dataset_b },
        result = Object
            .entries(data)
            .reduce((r, [table, rows]) => {
                rows.forEach(({ id, ...row }) => {
                    if (!r[id]) r.items.push(r[id] = { id, ...getNullProperties(r.tables) });
                    r[id][table] = row;
                });
                r.tables.push(table);
                r.items.forEach(item => r.tables.forEach(t => item[t] = item[t] || null));
                return r;
            }, { tables: [], items: [] })
            .items;
    
    console.log(result);
    .as-console-wrapper { max-height: 100% !important; top: 0; }