javascript ecmascript-6 dynamic-values

Count the appearances of all values per category of a set of N objects


Given an array of objects with attributes, I would like to count how often each value appears per attribute type.

Here is an example with 3 arrays, which represent 3 different entities (in production there can be up to 20,000 entities):

const arr = 
  [ [ { attribute_type: 'Background', value: 'Orange'  } 
    , { attribute_type: 'Fur',        value: 'Black'   } 
    , { attribute_type: 'Outfit',     value: 'Casual'  } 
    , { attribute_type: 'Earring',    value: 'None'    } 
    , { attribute_type: 'Eyes',       value: 'Fiery'   } 
    , { attribute_type: 'Mouth',      value: 'Smiling' } 
    , { attribute_type: 'Shoes',      value: 'Sandals' } 
    ] 
  , [ { attribute_type: 'Background', value: 'Orange'  } 
    , { attribute_type: 'Fur',        value: 'Brown'   } 
    , { attribute_type: 'Outfit',     value: 'Casual'  } 
    , { attribute_type: 'Earring',    value: 'None'    } 
    , { attribute_type: 'Eyes',       value: 'Gold'    } 
    , { attribute_type: 'Mouth',      value: 'Smiling' } 
    ] 
  , [ { attribute_type: 'Background', value: 'Diamond' } 
    , { attribute_type: 'Fur',        value: 'Gold'    } 
    , { attribute_type: 'Outfit',     value: 'Dress'   } 
    , { attribute_type: 'Earring',    value: 'None'    } 
    , { attribute_type: 'Eyes',       value: 'Gold'    } 
    , { attribute_type: 'Mouth',      value: 'Smiling' } 
    ] 
  ]

The attribute types can vary and are not known beforehand, and not every attribute type is guaranteed to be present in every array.

I would like to end up with a list of the number of appearances of each value per category (sorted ascending by number of appearances, if possible):

const expected = 
  [ { Background: { Diamond: 1, Orange: 2          }} 
  , { Fur:        { Black:   1, Brown:  1, Gold: 1 }} 
  , { Outfit:     { Dress:   1, Casual: 2          }} 
  , { Earring:    { None:    3                     }} 
  , { Eyes:       { Fiery:   1, Gold:   2          }} 
  , { Mouth:      { Smiling: 3                     }} 
  , { Shoes:      { Sandals: 1                     }} 
  ] 

I've spent many hours trying to solve this. I've looked at Map data structures and at merging, but with no success so far. The final result does not have to match the format above; I'm just trying to apply best practices.


Solution

  • The idea is to sort the flattened list first (on attribute_type, then on value), so that the keys are inserted in sorted order and the output comes out sorted as well.

    const arr = 
      [ [ { attribute_type: 'Background', value: 'Orange'  } 
        , { attribute_type: 'Fur',        value: 'Black'   } 
        , { attribute_type: 'Outfit',     value: 'Casual'  } 
        , { attribute_type: 'Earring',    value: 'None'    } 
        , { attribute_type: 'Eyes',       value: 'Fiery'   } 
        , { attribute_type: 'Mouth',      value: 'Smiling' } 
        , { attribute_type: 'Shoes',      value: 'Sandals' } 
        ] 
      , [ { attribute_type: 'Background', value: 'Orange'  } 
        , { attribute_type: 'Fur',        value: 'Brown'   } 
        , { attribute_type: 'Outfit',     value: 'Casual'  } 
        , { attribute_type: 'Earring',    value: 'None'    } 
        , { attribute_type: 'Eyes',       value: 'Gold'    } 
        , { attribute_type: 'Mouth',      value: 'Smiling' } 
        ] 
      , [ { attribute_type: 'Background', value: 'Diamond' } 
        , { attribute_type: 'Fur',        value: 'Gold'    } 
        , { attribute_type: 'Outfit',     value: 'Dress'   } 
        , { attribute_type: 'Earring',    value: 'None'    } 
        , { attribute_type: 'Eyes',       value: 'Gold'    } 
        , { attribute_type: 'Mouth',      value: 'Smiling' } 
        ] 
      ] 
    
    const result = Object.entries(arr
      .flat()           // flatten the sub-arrays into a single array
      .sort((a,b)=>     // sort on attribute_type + value
        {
        let r = a.attribute_type.localeCompare(b.attribute_type)
        if (r === 0) r = a.value.localeCompare(b.value) 
        return r
        })
      .reduce((r,c)=>
        {
        r[c.attribute_type] = r[c.attribute_type] ?? {}
        r[c.attribute_type][c.value] = r[c.attribute_type][c.value] ?? 0
        r[c.attribute_type][c.value]++
        return r
        },{}))
      .map(([k,v])=>({[k]:v})) // turn the object into an array of single-key objects
    
    console.log( result )
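Since the question mentions trying a Map, here is a minimal sketch of the same result built the other way round: count first, then order the keys. The helper name `countByType` and the `sample` data are made up for illustration and are not part of the original answer.

```javascript
// Hypothetical helper (not from the original answer): count first, order afterwards.
function countByType(entities) {
  const counts = new Map()   // attribute_type -> Map(value -> count)
  for (const { attribute_type, value } of entities.flat()) {
    if (!counts.has(attribute_type)) counts.set(attribute_type, new Map())
    const byValue = counts.get(attribute_type)
    byValue.set(value, (byValue.get(value) ?? 0) + 1)
  }
  // Sort the type names, then the value names, and emit one object per type.
  return [...counts.keys()]
    .sort((a, b) => a.localeCompare(b))
    .map(type => ({
      [type]: Object.fromEntries(
        [...counts.get(type)].sort(([a], [b]) => a.localeCompare(b))
      )
    }))
}

// Small made-up sample in the same shape as `arr`.
const sample =
  [ [ { attribute_type: 'Fur',  value: 'Brown' }
    , { attribute_type: 'Eyes', value: 'Gold'  }
    ]
  , [ { attribute_type: 'Fur',  value: 'Black' }
    , { attribute_type: 'Eyes', value: 'Gold'  }
    ]
  ]

console.log(countByType(sample))
// logs: [ { Eyes: { Gold: 2 } }, { Fur: { Black: 1, Brown: 1 } } ]
```

This sorts only the distinct keys rather than every entry, which may matter with many entities. One caveat: plain objects list integer-like keys (such as '1' or '42') first in numeric order, so if the values can look like that, keeping the inner Map or an array of pairs is probably safer.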

    As requested by the OP in a comment, here is the same list, sorted alphabetically on attribute_type, then in ascending order of the count of each value.

    Sorting the values of each attribute_type in ascending order of count is a bit trickier; in case of a tie I also put them in alphabetical order.

    const arr = 
      [ [ { attribute_type: 'Background', value: 'Orange'  } 
        , { attribute_type: 'Fur',        value: 'Black'   } 
        , { attribute_type: 'Outfit',     value: 'Casual'  } 
        , { attribute_type: 'Earring',    value: 'None'    } 
        , { attribute_type: 'Eyes',       value: 'Fiery'   } 
        , { attribute_type: 'Mouth',      value: 'Smiling' } 
        , { attribute_type: 'Shoes',      value: 'Sandals' } 
        ] 
      , [ { attribute_type: 'Background', value: 'Orange'  } 
        , { attribute_type: 'Fur',        value: 'Brown'   } 
        , { attribute_type: 'Outfit',     value: 'Casual'  } 
        , { attribute_type: 'Earring',    value: 'None'    } 
        , { attribute_type: 'Eyes',       value: 'Gold'    } 
        , { attribute_type: 'Mouth',      value: 'Smiling' } 
        ] 
      , [ { attribute_type: 'Background', value: 'Diamond' } 
        , { attribute_type: 'Fur',        value: 'Gold'    } 
        , { attribute_type: 'Outfit',     value: 'Dress'   } 
        , { attribute_type: 'Earring',    value: 'None'    } 
        , { attribute_type: 'Eyes',       value: 'Gold'    } 
        , { attribute_type: 'Mouth',      value: 'Smiling' } 
        ] 
      ] 
    let result =
      arr
      .flat()
      .sort((a,b)=>a.attribute_type.localeCompare(b.attribute_type) )
      .reduce((r,{attribute_type, value},i,{[i+1]:nxt})=>
        {
        if (r.att != attribute_type)
          {
          r.att = attribute_type
          r.res.push( {[r.att]: []})
          r.idx++
          }
        let val = r.res[r.idx][r.att].find(x=>x[0]===value)
        if (!val) r.res[r.idx][r.att].push([value,1] )
        else      val[1]++
        if ( r.att != nxt?.attribute_type )
          {
          r.res[r.idx][r.att] =
            r.res[r.idx][r.att]
            .sort((a,b)=>
              {
              let z = a[1]-b[1]
              if (z===0) z = a[0].localeCompare(b[0])
              return z
              })
            .reduce((o,[ref,count])=>
              {
              o[ref] = count
              return o  
              },{})
          }
        return nxt ? r : r.res
        },{ att:'', idx:-1,res:[] }) 
        
    console.log( result )
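The same ordering (types alphabetical, values ascending by count, ties alphabetical) can also be reached in two separate phases, which some may find easier to follow than the single reduce above. This is only a sketch: `countSortedByFrequency` and `demo` are hypothetical names, not part of the original answer.

```javascript
// Hypothetical helper (not from the original answer).
function countSortedByFrequency(entities) {
  // Phase 1: count every value per attribute_type.
  const counts = new Map()   // attribute_type -> Map(value -> count)
  for (const { attribute_type, value } of entities.flat()) {
    if (!counts.has(attribute_type)) counts.set(attribute_type, new Map())
    const byValue = counts.get(attribute_type)
    byValue.set(value, (byValue.get(value) ?? 0) + 1)
  }
  // Phase 2: order types alphabetically, values by count, then by name on a tie.
  return [...counts]
    .sort(([a], [b]) => a.localeCompare(b))
    .map(([type, byValue]) => ({
      [type]: Object.fromEntries(
        [...byValue].sort(([v1, n1], [v2, n2]) => n1 - n2 || v1.localeCompare(v2))
      )
    }))
}

// Small made-up sample in the same shape as `arr`.
const demo =
  [ [ { attribute_type: 'Background', value: 'Orange'  }
    , { attribute_type: 'Eyes',       value: 'Gold'    }
    ]
  , [ { attribute_type: 'Background', value: 'Orange'  }
    , { attribute_type: 'Eyes',       value: 'Fiery'   }
    ]
  , [ { attribute_type: 'Background', value: 'Diamond' } ]
  ]

console.log(countSortedByFrequency(demo))
// logs: [ { Background: { Diamond: 1, Orange: 2 } }, { Eyes: { Fiery: 1, Gold: 1 } } ]
```

Unlike the reduce version, this sketch returns an empty array for an empty input, because the result is built after the counting loop rather than inside it.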

    Some links:
    array.flat(), array.sort(), array.reduce(), str1.localeCompare(str2), Nullish coalescing operator (??)

    (Yes, you can find this information in the MDN online documentation.)