Search code examples
haskellgroupingmultimap

Grouping by function value into Multimap


Assuming I have a list of values like this:

["abc","abd","aab","123"]

I want to group those values into a MultiMap (conceptually, not limited to a specific data structure) in Haskell by using a function that maps any element to a key.

For this example, we shall use take 2 as a mapper.

The result I intend to get is (conceptually, as JSON):

{"ab":["abc","abd"], "aa":["aab"], "12":["123"]}

In this example I will use [(String, [String])] as a Multimap data structure.

My basic idea (conceptually):

let datalist = ["abc","abd","aab","123"]
let mapfn = take 2
let keys = nub $ map mapfn datalist
let valuesForKey key = filter ((==key).mapfn) datalist
let resultMultimap = zip keys $ map valuesForKey keys

My question:

  1. Is there any better way (in base or external packages) to do this? I want to avoid custom code.
  2. If 1) is not applicable, is there any guarantee that GHC will optimize this so one pass over the data list is sufficient to generate the full multimap (as opposed to one filter run per key)?

Conceptually, this question is similar to the SQL GROUP BY statement.


Solution

  • Using fromListWith from Data.Map:

    > let xs = ["abc","abd","aab","123"]
    > let f = take 2
    > Data.Map.fromListWith (++) [(f x, [x]) | x <- xs]
    fromList [("12",["123"]),("aa",["aab"]),("ab",["abd","abc"])]