Javascript "Set" for Objects and Arrays


Does Javascript have a built-in type for making a set out of data-objects and arrays?

let set = new Set();
set.add({"language": "ecmascript"});
set.add({"language": "ecmascript"});
set.add({"language": "ecmascript"});
set.add({"language": "ecmascript"});
set.add([1,2,3]);
set.add([1,2,3]);
set.add([1,2,3]);
set.add([1,2,3]);
console.log(set);

The Set I'm using above is only useful for primitives.


Solution

  • The Set I'm using above is only useful for primitives.

    That's incorrect; a Set works just fine for objects. The problem is that distinct objects with the same properties and property values are not equal to each other, so calling set.add({"language": "ecmascript"}); twice adds two non-equal objects to the set (both with the same property name and value). The same applies to arrays: each [1,2,3] literal creates a new, distinct array.

    If you add the same object more than once, it won't be added a second time:

    const set = new Set();
    const obj = {"language": "ecmascript"};
    set.add(obj);
    set.add(obj);
    console.log(set.size); // 1
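
    For contrast, here is a small sketch of the behaviour from the question, where every literal creates a new object or array. The variable names are illustrative only:

```javascript
const set = new Set();
set.add({ language: "ecmascript" });
set.add({ language: "ecmascript" }); // a different object with the same content
set.add([1, 2, 3]);
set.add([1, 2, 3]);                  // likewise a different array
const distinctSize = set.size;

// Primitives are compared by value, so duplicates collapse
const primitives = new Set(["ecmascript", "ecmascript", 1, 1]);
const primitiveSize = primitives.size;

console.log(distinctSize, primitiveSize); // 4 2
```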

    Does Javascript have a built-in type for...

    If you want objects with the same properties and values to be treated as equal, then no. You'd need to be able to specify a comparison operation, and there's no built-in Set in JavaScript that lets you define the comparison operation to use.

    Obviously, you can create one. As a starting point, I'd probably use a Map keyed by the names of the properties on the object, sorted and turned into a string via JSON.stringify. (Although that won't work if you want to have Symbol keys as part of the definition of equality.) For instance, if you're only considering own properties:

    const key = JSON.stringify(Object.getOwnPropertyNames(object).sort());
    

    The value for an entry could be either just an array of the objects with those keys that you do a linear search on, or a second Map keyed by some kind of hash of the property values, depending on how many objects you need to handle...
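
    A minimal sketch of that idea, using the array-with-linear-search variant. StructuralSet and its methods are hypothetical names, not a built-in API, and the comparison is shallow (property values are compared the way Set compares them, so nested objects are only equal if they are the same object):

```javascript
// Hypothetical helper class: a set that treats objects with the same own
// property names and (shallowly) equal values as duplicates.
class StructuralSet {
    constructor() {
        // Key: JSON of the sorted own property names; value: array of objects with those names
        this.buckets = new Map();
        this.size = 0;
    }

    static keyFor(object) {
        return JSON.stringify(Object.getOwnPropertyNames(object).sort());
    }

    // Only called for objects from the same bucket, so the property names already match
    static sameValues(a, b) {
        return Object.getOwnPropertyNames(a).every((name) => {
            const x = a[name];
            const y = b[name];
            return x === y || (x !== x && y !== y); // second clause treats NaN as equal to NaN
        });
    }

    has(object) {
        const bucket = this.buckets.get(StructuralSet.keyFor(object));
        return bucket !== undefined &&
            bucket.some((other) => StructuralSet.sameValues(object, other));
    }

    add(object) {
        if (!this.has(object)) {
            const key = StructuralSet.keyFor(object);
            if (!this.buckets.has(key)) {
                this.buckets.set(key, []);
            }
            this.buckets.get(key).push(object);
            this.size++;
        }
        return this;
    }
}

const structural = new StructuralSet();
structural.add({ language: "ecmascript" });
structural.add({ language: "ecmascript" });
structural.add([1, 2, 3]);
structural.add([1, 2, 3]);
console.log(structural.size); // 2
```

    Arrays work here because getOwnPropertyNames includes their indices and length. Symbol-keyed properties are ignored, as noted above.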


    In comments, I asked:

    Do you only need to handle objects with JSON-serializable values?

    and you answered:

    I have a bunch of objects that are already serialized, but there are duplicates that I'd like to eliminate and then re-serialize.

    Yeah, you can use a Set for that if you don't mind re-serializing, or a Map if you want to skip the re-serializing part:

    const unique = new Map();
    for (const source of serializedObjects) {
        const sourceObject = JSON.parse(source); // Or parse from whatever serialization it is
        // Build object adding properties in alpha order for stability
        const keyObj = {};
        for (const key of Object.keys(sourceObject).sort()) {
            keyObj[key] = sourceObject[key];
        }
        // Save it using JSON.stringify, which uses ES2015 property order
        unique.set(JSON.stringify(keyObj), source);
    }
    const uniqueSourceStrings = [...unique.values()];
    

    Or for the de-serialized objects themselves:

    const unique = new Map();
    for (const source of serializedObjects) {
        const sourceObject = JSON.parse(source); // Or parse from whatever serialization it is
        // Build object adding properties in alpha order for stability
        const keyObj = {};
        for (const key of Object.keys(sourceObject).sort()) {
            keyObj[key] = sourceObject[key];
        }
        // Save it using JSON.stringify, which uses ES2015 property order
        unique.set(JSON.stringify(keyObj), sourceObject); // <=================== changed
    }
    const uniqueSourceObjects = [...unique.values()];
    //    ^^================================================================== changed
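
    The Set variant mentioned earlier (the one where you don't mind re-serializing) might look roughly like this. The contents of serializedObjects are made-up sample data, and JSON is assumed as the serialization format:

```javascript
// Hypothetical input: JSON strings, two of which differ only in property order
const serializedObjects = [
    '{"language":"ecmascript","year":2015}',
    '{"year":2015,"language":"ecmascript"}',
    '{"language":"typescript","year":2012}',
];

const uniqueJson = new Set();
for (const source of serializedObjects) {
    const sourceObject = JSON.parse(source);
    // Rebuild the object with its top-level properties in alpha order
    const keyObj = {};
    for (const key of Object.keys(sourceObject).sort()) {
        keyObj[key] = sourceObject[key];
    }
    // Strings are primitives, so the Set compares them by value
    uniqueJson.add(JSON.stringify(keyObj));
}
const reserialized = [...uniqueJson];
console.log(reserialized.length); // 2
```

    Like the Map versions, this only normalizes the order of top-level properties; nested objects whose keys appear in a different order would still produce different strings.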