Tags: javascript, arrays, performance, filtering, lodash

How to filter an array of objects by multiple identical properties


How do I filter this array as described in the question

Description

Notice that entry1 and entry4 share the same values for property: 'subject' and property: 'field'.

Question

I'm looking for a performant and clean way to filter this array and get the entries that share both values for those properties.

UPDATE:

Returned value

I'm not trying to transform the data but to analyze it, so the returned value from the analysis should look like this:

[['entry1', 'entry4'],...]

With this analysis list I could easily transform my triples = [...] into a list of triples where one of the entries is removed (it doesn't matter which, it could be 'entry1' or 'entry4') and the other one is updated:

[
  { subject: "entry1", property: "subject", value: "sport" },
  { subject: "entry1", property: "field", value: "category" },
  { subject: "entry1", property: "content", value: "football" },
  { subject: "entry1", property: "content", value: "basketball" },
]

P.S

  1. I'm not looking for a solution like:

    array.filter(({property, value})=> property === 'sport' && value === 'category')

I don't know 'sport' or 'category' in advance. Those are dynamic values.

  2. My actual data is much larger and contains many more property types for each entry. It is also not ordered as nicely as shown here; I simplified it, so please keep performance in mind.

code snippet:

const triples = [
  { subject: "entry1", property: "subject", value: "sport" },
  { subject: "entry1", property: "field", value: "category" },
  { subject: "entry1", property: "content", value: "football" },
  
  { subject: "entry4", property: "subject", value: "sport" },
  { subject: "entry4", property: "field", value: "category" },
  { subject: "entry4", property: "content", value: "basketball" },
  
  { subject: "entry2", property: "subject", value: "music" },
  { subject: "entry2", property: "field", value: "category" },
  { subject: "entry2", property: "content", value: "notes" },
  
  { subject: "entry3", property: "subject", value: "painting" },
  { subject: "entry3", property: "field", value: "category" },
  { subject: "entry3", property: "content", value: "drawings" }
];

Solution

  • I must say the input data structure is not optimal, and the use of "subject" as both a real object property and as a value for property will make it all the more confusing. I will call the first notion (the real subject) "entries", since the sample values are "entry1", "entry2", ....

    Here is a way to extract ["entry1", "entry4"] for your sample data:

    1. Group the data by their entry into objects where "property" and "value" are translated into key/value pairs, so you would get something like this:

      {
          entry1: { subject: "sport", field: "category", content: "football" },
          entry4: { subject: "sport", field: "category", content: "basketball" },
          entry2: { subject: "music", field: "category", content: "notes" },
          entry3: { subject: "painting", field: "category", content: "drawings" }
      }
      

      This will be easier to work with. The code below in fact creates a Map instead of a plain object, but the principle is the same.

    2. Define a new group property for these objects, where the value is composed of subject and field, stringified as JSON. For example, the first object of the above result would be extended with:

      group: '["sport","category"]'
      
    3. Create a Map of entries, keyed by their group value. So that would give this result:

      {
          '["sport","category"]': ["entry1","entry4"],
          '["music","category"]': ["entry2"],
          '["painting","category"]': ["entry3"]
      }
      
    4. Now it is a simple step to list only the values (the subarrays), keeping only those that contain more than one entry.

    Here is the implementation:

    const triples = [{subject: "entry1", property: "subject", value: "sport"},{subject: "entry1", property: "field", value: "category"},{subject: "entry1", property: "content", value: "football"},{subject: "entry4", property: "subject", value: "sport"},{subject: "entry4", property: "field", value: "category"},{subject: "entry4", property: "content", value: "basketball"},{subject: "entry2", property: "subject", value: "music"},{subject: "entry2", property: "field", value: "category"},{subject: "entry2", property: "content", value: "notes"},{subject: "entry3", property: "subject", value: "painting"},{subject: "entry3", property: "field", value: "category"},{subject: "entry3", property: "content", value: "drawings"},];
    
    // 1. Group the data by subject into objects where "property" and "value" are translated into key/value pairs:
    const entries = new Map(triples.map(o => [o.subject, { entry: o.subject }]));
    triples.forEach(o => entries.get(o.subject)[o.property] = o.value);
    // 2. Define a group value for these objects (composed of subject and field)
    entries.forEach(o => o.group = JSON.stringify([o.subject, o.field]));
    // 3. Create Map of entries, keyed by their group value
    const groups = new Map(Array.from(entries.values(), o => [o.group, []]));
    entries.forEach(o => groups.get(o.group).push(o.entry));
    // 4. Keep only the subarrays that have more than one value
    const result = [...groups.values()].filter(group => group.length > 1);
    console.log(result);

    Be aware that the output is a nested array, because in theory there could be more combined entries, like [ ["entry1", "entry4"], ["entry123", "entry521", "entry951"] ]
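
    Since the question mentions that the real data has many more property types and asks for performance, the same idea could be wrapped in a function that takes the key properties as a parameter and skips all other triples. This is only a sketch under assumptions: findSharedEntries, keyProps and sampleTriples are hypothetical names, and each entry is assumed to have at most one value per key property (a later value overwrites an earlier one, as in the code above). An entry that lacks a key property gets null in that position.

```javascript
// Hypothetical helper (not part of the original answer): returns the groups of
// entries that share the same values for every property listed in keyProps.
function findSharedEntries(triples, keyProps) {
    // Pass 1: per entry, collect the value of each key property.
    const keyValues = new Map();
    for (const { subject, property, value } of triples) {
        const index = keyProps.indexOf(property);
        if (index === -1) continue; // not a key property, ignore this triple
        let values = keyValues.get(subject);
        if (!values) {
            values = new Array(keyProps.length).fill(null);
            keyValues.set(subject, values);
        }
        values[index] = value;
    }
    // Pass 2: bucket the entries by their composite key.
    const groups = new Map();
    for (const [entry, values] of keyValues) {
        const key = JSON.stringify(values);
        const group = groups.get(key);
        if (group) group.push(entry);
        else groups.set(key, [entry]);
    }
    // Keep only the buckets that hold more than one entry.
    return [...groups.values()].filter(group => group.length > 1);
}

const sampleTriples = [
    { subject: "entry1", property: "subject", value: "sport" },
    { subject: "entry1", property: "field", value: "category" },
    { subject: "entry1", property: "content", value: "football" },
    { subject: "entry4", property: "subject", value: "sport" },
    { subject: "entry4", property: "field", value: "category" },
    { subject: "entry4", property: "content", value: "basketball" },
    { subject: "entry2", property: "subject", value: "music" },
    { subject: "entry2", property: "field", value: "category" },
    { subject: "entry2", property: "content", value: "notes" },
];

console.log(findSharedEntries(sampleTriples, ["subject", "field"]));
// [ [ 'entry1', 'entry4' ] ]
```

    Each triple is visited once and each entry once, so the work grows linearly with the input. Building the key with JSON.stringify on the array of values, as the code above does, avoids the collisions that plain string concatenation could produce when a value happens to contain the separator.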

    The above can be modified/extended to get the final filtered result. In the third step you would still collect the objects (not just the entry value), and the filtered result is then mapped back to the original format:

    const triples = [{subject: "entry1", property: "subject", value: "sport"},{subject: "entry1", property: "field", value: "category"},{subject: "entry1", property: "content", value: "football"},{subject: "entry4", property: "subject", value: "sport"},{subject: "entry4", property: "field", value: "category"},{subject: "entry4", property: "content", value: "basketball"},{subject: "entry2", property: "subject", value: "music"},{subject: "entry2", property: "field", value: "category"},{subject: "entry2", property: "content", value: "notes"},{subject: "entry3", property: "subject", value: "painting"},{subject: "entry3", property: "field", value: "category"},{subject: "entry3", property: "content", value: "drawings"},];
    
    // 1. Group the data by subject into objects where "property" and "value" are translated into key/value pairs:
    const entries = new Map(triples.map(o => [o.subject, { entry: o.subject }]));
    triples.forEach(o => entries.get(o.subject)[o.property] = o.value);
    // 2. Define a group value for these objects (composed of subject and field)
    entries.forEach(o => o.group = JSON.stringify([o.subject, o.field]));
    // 3. Create Map of objects(*), keyed by their group value
    const groups = new Map(Array.from(entries.values(), o => [o.group, []]));
    entries.forEach(o => groups.get(o.group).push(o));
    // 4. Keep only the subarrays that have more than one value
    const result = [...groups.values()].filter(group => group.length > 1)
    // 5. ...and convert it back to the original format:
        .flatMap(group => [
            { subject: group[0].entry, property: "subject", value: group[0].subject },
            { subject: group[0].entry, property: "field", value: group[0].field },
            ...group.map(o => ({ subject: group[0].entry, property: "content", value: o.content }))
        ]);
    
    console.log(result);
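
    The snippet above hardcodes the "content" property and keeps one content value per entry. If the analysis list from the first snippet is already available, a possible alternative is to rewrite the subject of every redundant entry to the first entry of its group and drop triples that become identical. This is a sketch, not the method from the answer above: mergeEntries, duplicateGroups and sampleTriples are hypothetical names, and it assumes that entries which are not in any group should be kept unchanged.

```javascript
// Hypothetical helper (not part of the original answer): merges the entries of
// each duplicate group into the group's first entry, for any property type.
function mergeEntries(triples, duplicateGroups) {
    // Map every redundant entry to the entry that is kept.
    const canonical = new Map();
    for (const [keep, ...drop] of duplicateGroups) {
        for (const entry of drop) canonical.set(entry, keep);
    }
    const seen = new Set();
    const merged = [];
    for (const { subject, property, value } of triples) {
        const target = canonical.has(subject) ? canonical.get(subject) : subject;
        const key = JSON.stringify([target, property, value]);
        if (seen.has(key)) continue; // an identical triple was already emitted
        seen.add(key);
        merged.push({ subject: target, property, value });
    }
    return merged;
}

const sampleTriples = [
    { subject: "entry1", property: "subject", value: "sport" },
    { subject: "entry1", property: "field", value: "category" },
    { subject: "entry1", property: "content", value: "football" },
    { subject: "entry4", property: "subject", value: "sport" },
    { subject: "entry4", property: "field", value: "category" },
    { subject: "entry4", property: "content", value: "basketball" },
    { subject: "entry2", property: "subject", value: "music" },
    { subject: "entry2", property: "field", value: "category" },
    { subject: "entry2", property: "content", value: "notes" },
];

// The analysis list is passed in as a literal here to keep the sketch standalone.
const merged = mergeEntries(sampleTriples, [["entry1", "entry4"]]);
console.log(merged);
```

    For this sample, "entry1" ends up with both "football" and "basketball" as content, "entry4" disappears, and "entry2" is left as it was.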