Tags: java, processing, guava, multiset

Converting a Multiset Count into a List in Java


Is there any way to pull the count from a Multiset into a list?

String[] data = loadStrings("data/data.txt"); 

Multiset<String> myMultiset = ImmutableMultiset.copyOf(data);

for (String word : Multisets.copyHighestCountFirst(myMultiset).elementSet()) {
    System.out.println(word + ": " + myMultiset.count(word));
    // ...
}

As it stands, I can print the most commonly occurring words to the console in Processing. Is it possible to add the words and their corresponding counts to an array or a list? I have tried this:

for (String word : Multisets.copyHighestCountFirst(myMultiset).elementSet()) {
    float a[] = myMultiset.count(word);
}

but I only received an error stating that an int cannot be converted to a float[].

Is this even possible? Am I going about it all wrong? I've never used Multisets before, so any help would be really useful.

UPDATE: I have used this to get a copy ordered by highest count first, but I am unable to convert it into a list.

Multiset<String> sortedList = Multisets.copyHighestCountFirst(myMultiset);

Solution

  • Please see Multiset.entrySet() docs:

    Returns a view of the contents of this multiset, grouped into Multiset.Entry instances, each providing an element of the multiset and the count of that element.

    So, e.g., to get the top 5 most frequent words, I'd loop over the entrySet():

    ImmutableMultiset<String> top = Multisets.copyHighestCountFirst(myMultiset);
    
    Iterator<Multiset.Entry<String>> it = top.entrySet().iterator();
    
    for (int i = 0; (i < 5) && it.hasNext(); i++) {
        Multiset.Entry<String> entry = it.next();
    
        String word = entry.getElement();
        int count = entry.getCount();
    
        // do something fancy with word and count...
    }
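
    To actually store the words and counts rather than just read them, the same loop can append to two parallel lists. This is a minimal sketch, assuming Guava is on the classpath; the WordCounts class, the Result holder and the topEntries method are hypothetical names for illustration, not part of Guava:

    ```java
    import com.google.common.collect.ImmutableMultiset;
    import com.google.common.collect.Multiset;
    import com.google.common.collect.Multisets;

    import java.util.ArrayList;
    import java.util.List;

    public class WordCounts {
        // Hypothetical holder for two parallel lists; not part of Guava.
        static final class Result {
            final List<String> words = new ArrayList<>();
            final List<Integer> counts = new ArrayList<>();
        }

        // Collects up to `limit` entries, most frequent first.
        static Result topEntries(Multiset<String> multiset, int limit) {
            Result result = new Result();
            for (Multiset.Entry<String> entry
                    : Multisets.copyHighestCountFirst(multiset).entrySet()) {
                if (result.words.size() >= limit) {
                    break;
                }
                result.words.add(entry.getElement());
                result.counts.add(entry.getCount()); // getCount() returns an int
            }
            return result;
        }

        public static void main(String[] args) {
            // Sample data standing in for the contents of data.txt.
            String[] data = {"a", "b", "a", "c", "a", "b"};
            Multiset<String> sample = ImmutableMultiset.copyOf(data);
            Result top = topEntries(sample, 2);
            System.out.println(top.words + " " + top.counts); // prints "[a, b] [3, 2]"
        }
    }
    ```

    The compile error in the question comes from assigning a single int (the result of count(word)) to a float[] variable. Adding each count to a list, or writing it to one slot of a pre-sized array, avoids that mismatch.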
    

    I'm assuming you need to show the top 5 most frequent words and their frequencies. If you only need the words, call asList() on the element set. Calling asList() on the multiset itself would repeat each word once per occurrence:

    ImmutableMultiset<String> top = Multisets.copyHighestCountFirst(myMultiset);
    
    ImmutableList<String> list = top.elementSet().asList();
    

    and iterate over list to get the first 5 elements.
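
    Taking the first few elements of a list does not need an explicit loop: List.subList does it, as long as the end index is clamped to the list size. A minimal sketch using only the standard library; firstN is a hypothetical helper name, and it should work with any java.util.List, including Guava's ImmutableList:

    ```java
    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.List;

    public class FirstN {
        // Hypothetical helper: returns a copy of at most the first n elements.
        static <T> List<T> firstN(List<T> list, int n) {
            // Clamp so that short lists and negative n do not throw.
            int end = Math.min(Math.max(n, 0), list.size());
            return new ArrayList<>(list.subList(0, end));
        }

        public static void main(String[] args) {
            List<String> words = Arrays.asList("the", "of", "and");
            System.out.println(firstN(words, 5)); // prints "[the, of, and]"
            System.out.println(firstN(words, 2)); // prints "[the, of]"
        }
    }
    ```

    The copy into a new ArrayList is a design choice: subList returns a view backed by the original list, so copying keeps the result independent of it.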