java sorting group-by unique knime

Sort values within a list (coded as a string) with Java in KNIME


I have almost no experience with Java, but since I can't use the R snippet in KNIME Analytics Platform for some reason (the Java snippet works, though), I would like to know how to do in Java what the following R code does:

library(dplyr)
Object <- dataset %>% group_by(Dimension1) %>% summarise(Set = toString(unique(sort(Dimension2))))

My data is in long format, like this:

Nr. Value
1 Apple
1 Orange
1 Banana
1 Apple
2 Orange
2 Banana
2 Apple
3 Strawberry
3 Banana
4 Banana
4 Banana
4 Strawberry

With the KNIME "GroupBy" node I can aggregate the values by Nr. either as a sorted list or as a set of unique values (which, unfortunately, comes out in arbitrary order). What I want is a sorted list (e.g. alphabetical) of unique values, like:

Nr. Value
1 Apple Banana Orange
2 Apple Banana Orange
3 Banana Strawberry
4 Banana Strawberry

How can I do that with Java (or KNIME if possible)?

The GroupBy node outputs strings like:

1 Orange, Apple, Banana
2 Apple, Banana, Orange
3 Banana,Strawberry
4 Strawberry, Banana


Solution

  • You can post-process the result of the GroupBy node (Set aggregation) with the following Java snippet, where column1 is the Set column holding the unique values within each group:

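    // Copy the input array so that sorting does not modify the input column in place
    String[] res = c_column1.clone();
    // Natural String order: alphabetical, case-sensitive
    java.util.Arrays.sort(res);
    out_column1 = res;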
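
If the aggregated column arrives as a single comma-separated string (as in the question's example output) rather than as a Set, a rough sketch along the following lines might help. It splits the string, trims each entry, removes duplicates, and sorts alphabetically using a `TreeSet`. The class name `SortedUniqueList` and the method `sortUnique` are hypothetical names chosen for illustration, and the space-separated output format is an assumption based on the desired result shown in the question.

```java
import java.util.TreeSet;

// Hypothetical helper class; the names are illustrative and not part of KNIME.
class SortedUniqueList {

    // Turns e.g. "Orange, Apple, Banana, Apple" into "Apple Banana Orange".
    static String sortUnique(String joined) {
        if (joined == null) {
            return "";
        }
        // A TreeSet drops duplicates and keeps its elements in natural (alphabetical) order.
        TreeSet<String> unique = new TreeSet<>();
        for (String part : joined.split(",")) {
            String value = part.trim();   // tolerate "a,b" as well as "a, b"
            if (!value.isEmpty()) {
                unique.add(value);
            }
        }
        // Assumption: the values should be separated by a single space in the output.
        return String.join(" ", unique);
    }

    public static void main(String[] args) {
        System.out.println(sortUnique("Orange, Apple, Banana"));      // Apple Banana Orange
        System.out.println(sortUnique("Banana,Strawberry"));          // Banana Strawberry
        System.out.println(sortUnique("Banana, Banana, Strawberry")); // Banana Strawberry
    }
}
```

Inside a KNIME Java Snippet, the body of `sortUnique` would presumably go directly into the expression area, reading from a String input field such as `c_column1` and writing to `out_column1`. Note that the ordering is case-sensitive, so uppercase entries sort before lowercase ones.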
    
