
How to use Guava's Multisets.toMultiset() when collecting a stream?


I have a list of strings, where each string consists of letters separated by commas. I want to go through the list, split each string on the comma, count how many times each letter occurs, and store the result in a Multiset. Blank strings should be ignored, and the split parts should be trimmed. The multiset should be sorted by key.

The code below works, i.e., it produces the desired Multiset. However, I couldn't figure out how to use the proper collector method (Multisets.toMultiset()), so I resorted to a two-step solution with a temporary list variable, which I would like to eliminate.

I would appreciate it if someone could show me how I should have constructed the call to Multisets.toMultiset() in the collect step. I got stuck on defining the element function and the supplier function, and I couldn't even write code that compiled.

@Test
public void testIt() {
    List<String> temp = Stream.of("b, c", "a", "  ", "a, c")
            .filter(StringUtils::isNotBlank)
            .map(val -> val.split(","))
            .flatMap(Arrays::stream)
            .map(String::trim)
            .collect(Collectors.toList());

    Multiset<String> multiset = ImmutableSortedMultiset.copyOf(temp);

    System.out.println("As list: " + temp);
    System.out.println("As multiset: " + multiset);
    // Output is:
    // As list: [b, c, a, a, c]
    // As multiset: [a x 2, b, c x 2]
}

I'm using Guava 28.1. The example above also uses the StringUtils class from commons-lang3, version 3.9.

This is a simplified example of the real scenario, but one that still captures the essence of my problem.


Solution

  • If you really want to omit the second copy stage, there are several ways to achieve this:

    1. Guava already provides a Collector for ImmutableSortedMultiset:

      .collect(ImmutableSortedMultiset.toImmutableSortedMultiset(Comparator.naturalOrder()));
      
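    2. Since you were asking how to do it with Multisets::toMultiset, where the first argument maps each stream element to the multiset element, the second gives the count to add for it, and the third supplies the mutable multiset: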

      .collect(Multisets.toMultiset(Function.identity(), i -> 1, TreeMultiset::create));
      
    3. Or you can just as well write your own Collector implementation using the Builder:

      .collect(Collector.of(
          ImmutableSortedMultiset::<String>naturalOrder,
          ImmutableSortedMultiset.Builder::add,
          (b1, b2) -> {b1.addAll(b2.build()); return b1;},
          ImmutableSortedMultiset.Builder::build)
      );
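
To make the roles of the element function, count function, and supplier more concrete, here is a minimal JDK-only sketch of the same pipeline. It is not Guava code: it swaps in Collectors.toMap with a TreeMap as an analogue of Multisets.toMultiset with a TreeMultiset, and it replaces StringUtils::isNotBlank with a plain trim-and-isEmpty check. The class name LetterCounts and the method countLetters are hypothetical names chosen for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical helper class; not part of Guava or of the question's code.
final class LetterCounts {

    // Counts the trimmed, comma-separated parts of every non-blank input line.
    static Map<String, Integer> countLetters(List<String> lines) {
        return lines.stream()
                .filter(line -> !line.trim().isEmpty()) // JDK stand-in for StringUtils::isNotBlank
                .map(line -> line.split(","))
                .flatMap(Arrays::stream)
                .map(String::trim)
                .collect(Collectors.toMap(
                        Function.identity(), // element function: the letter itself is the key
                        letter -> 1,         // count function: each occurrence contributes 1
                        Integer::sum,        // merge step: add up counts for equal keys
                        TreeMap::new));      // supplier: a mutable container sorted by key
    }

    public static void main(String[] args) {
        // prints {a=2, b=1, c=2}
        System.out.println(countLetters(Arrays.asList("b, c", "a", "  ", "a, c")));
    }
}
```

The arguments line up with option 2: Function.identity() and the constant count of 1 play the same roles, and TreeMap::new corresponds to TreeMultiset::create. The explicit Integer::sum merge step has no counterpart in the toMultiset call, which takes only those three arguments.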