Search code examples
javasetset-intersectionset-theory

How to find the list of all intersections of multiple sets in Java?


I have a list of sets:

setlist = [s1,s2,s3...sn]

I want an all way comparison of sets, which is 2^n number of sets:

setIntersection = [s1 ∩ s2, s1 ∩ s2 ∩ s3, ....., s2 ∩ s3 ∩ s4, ...., sn-1 ∩ sn]

What would be the best way to do this in Java?

For example if I was using just 5 sets. I would want to be able to populate all overlaps of a 5 circle venn diagram.

I am trying to do this with a list of sets:

List<Set<People>> lListSets = new ArrayList<Set<People>>();
for (DaysObject day : listOfDaysInJanuary) {
        lListSets.add(day.peopleOneInternet());
}
findPowerSetsAndCompare(lListSets, listOfDaysInJanuary);

I would like find some result like:

January 1 (Bob, Sally, Tommy)
January 1, January 2 (Sally, Tommy)
...
so on for all possible combination of days.

Essentially what I am asking is how to apply a powerset algorithm combined with set union.


Solution

  • What would be the best way to do this in Java?

    The first part you are describing is a powerset (as I edited your question to include last week). You would then get the intersection for each set of sets in the powerset.

    Because you are doing a powerset of sets, rather than a simple powerset of something like integers, the implementation is going to be a bit more involved.

    Extra Credit

    I wrote a basic implementation of your requirements, as an example of how you might do it. All of the methods and types in this example are members of the Example class.

    Example class with just its main method which demonstrates the working code. I'm sure you'll forgive my using deprecated Date constructors for the demonstration.

    import java.text.*;
    import java.util.*;
    
    public class Example
    {
        public static void main(String[] args) {
            // create simple test harness
            Set<PeopleByDays> peepsByDay = new HashSet<PeopleByDays>();
            peepsByDay.add(new PeopleByDays(new Date(2015 - 1900, Calendar.JANUARY, 1),
                Person.BOB, Person.FRANK, Person.JIM, Person.JUDY, Person.SALLY));
            peepsByDay.add(new PeopleByDays(new Date(2015 - 1900, Calendar.JANUARY, 2),
                Person.BOB, Person.FRANK, Person.JIM, Person.JUDY, Person.TOMMY));
            peepsByDay.add(new PeopleByDays(new Date(2015 - 1900, Calendar.JANUARY, 3),
                Person.BOB, Person.FRANK, Person.JIM, Person.SALLY, Person.TOMMY));
            peepsByDay.add(new PeopleByDays(new Date(2015 - 1900, Calendar.JANUARY, 4),
                Person.BOB, Person.FRANK, Person.JUDY, Person.SALLY, Person.TOMMY));
            peepsByDay.add(new PeopleByDays(new Date(2015 - 1900, Calendar.JANUARY, 5),
                Person.BOB, Person.JIM, Person.JUDY, Person.SALLY, Person.TOMMY));
    
            // make powerSet, then intersect, then sort
            Set<Set<PeopleByDays>> powerPeeps = powerSet(peepsByDay);
            List<PeopleByDays> powerPeepsIntersected = intersect(powerPeeps);
            sort(powerPeepsIntersected);
    
            // print out results
            for (PeopleByDays peeps: powerPeepsIntersected) {
                String daysFormatted = format(peeps.getDays());
                System.out.print(daysFormatted);
                System.out.println(peeps);
            }
        }
    
        // all other Example members as defined in this answer
    }
    

    Person is a simple enum type for the people's names. Benefit of using an enum here is it takes care of appropriate equals() and hashCode() implementations for desired HashSet behavior.

        static enum Person {
            BOB, FRANK, JIM, JUDY, SALLY, TOMMY;
        }
    

    PeopleByDays extends HashSet<Person> to collect an additional set of Date objects to represent the days. Overrides retainAll() (intersect) to combine days; overrides equals() and hashSet() for proper behavior in outer set.

        static class PeopleByDays extends HashSet<Person> {
            private final Set<Date> days = new HashSet<Date>();
    
            public PeopleByDays() {
                super();
            }
            public PeopleByDays(Date day, Person... people) {
                super(Arrays.asList(people));
                this.days.add(day);
            }
            public PeopleByDays(PeopleByDays other) {
                super(other);
                this.days.addAll(other.days);
            }
    
            public List<Date> getDays() {
                return new ArrayList<Date>(this.days);
            }
    
            @Override
            public boolean retainAll(Collection<?> c) {
                if (c instanceof PeopleByDays) {
                    this.days.addAll(((PeopleByDays)c).days);
                }
                return super.retainAll(c);
            }
    
            @Override
            public boolean equals(Object o) {
                return super.equals(o) && this.days.equals(((PeopleByDays) o).days);
            }
            @Override
            public int hashCode() {
                return super.hashCode() + this.days.hashCode();
            }
        }
    

    powerSet() method, taken verbatim from this answer.

        public static <T> Set<Set<T>> powerSet(Set<T> originalSet) {
            Set<Set<T>> sets = new HashSet<Set<T>>();
            if (originalSet.isEmpty()) {
                sets.add(new HashSet<T>());
                return sets;
            }
            List<T> list = new ArrayList<T>(originalSet);
            T head = list.get(0);
            Set<T> rest = new HashSet<T>(list.subList(1, list.size()));
            for (Set<T> set: powerSet(rest)) {
                Set<T> newSet = new HashSet<T>();
                newSet.add(head);
                newSet.addAll(set);
                sets.add(newSet);
                sets.add(set);
            }
            return sets;
        }
    

    intersect() method to create intersection for each set of sets in the powerset.

        static List<PeopleByDays> intersect(Set<Set<PeopleByDays>> powerSet) {
            List<PeopleByDays> intersected = new ArrayList<PeopleByDays>();
            for (Set<PeopleByDays> powerElement: powerSet) {
                PeopleByDays intersection = null;
                if (powerElement.isEmpty()) {
                    intersection = new PeopleByDays();
                } else for (PeopleByDays peeps: powerElement) {
                    if (intersection == null) {
                        intersection = new PeopleByDays(peeps);
                    } else {
                        intersection.retainAll(peeps);
                    }
                }
                intersected.add(intersection);
            }
            return intersected;
        }
    

    sort() method to sort the resulting intersected sets by date.

        static void sort(List<PeopleByDays> peeps) {
            Collections.sort(peeps, new Comparator<PeopleByDays>() {
                @Override
                public int compare(PeopleByDays p1, PeopleByDays p2) {
                    List<Date> days1 = p1.getDays();
                    List<Date> days2 = p2.getDays();
                    Collections.sort(days1);
                    Collections.sort(days2);
                    for (int i = 0; i < days1.size() && i < days2.size(); i++) {
                        int compare = days1.get(i).compareTo(days2.get(i));
                        if (compare != 0) {
                            return compare;
                        }
                    }
                    return days1.size() - days2.size();
                }
            });
        }
    

    format() method to format the list of days for each intersection.

        static String format(List<Date> days) {
            if (days.isEmpty()) {
                return "";
            }
            StringBuilder sb = new StringBuilder();
            DateFormat format = new SimpleDateFormat("MMM d");
            Collections.sort(days);
            String separator = "";
            for (Date day: days) {
                sb.append(separator);
                sb.append(format.format(day));
                separator = ", ";
            }
            sb.append(" ");
            return sb.toString();
        }
    

    And finally, the output.

    []
    Jan 1 [BOB, JUDY, FRANK, JIM, SALLY]
    Jan 1, Jan 2 [BOB, JUDY, FRANK, JIM]
    Jan 1, Jan 2, Jan 3 [BOB, FRANK, JIM]
    Jan 1, Jan 2, Jan 3, Jan 4 [BOB, FRANK]
    Jan 1, Jan 2, Jan 3, Jan 4, Jan 5 [BOB]
    Jan 1, Jan 2, Jan 3, Jan 5 [BOB, JIM]
    Jan 1, Jan 2, Jan 4 [BOB, JUDY, FRANK]
    Jan 1, Jan 2, Jan 4, Jan 5 [BOB, JUDY]
    Jan 1, Jan 2, Jan 5 [BOB, JUDY, JIM]
    Jan 1, Jan 3 [BOB, FRANK, JIM, SALLY]
    Jan 1, Jan 3, Jan 4 [BOB, FRANK, SALLY]
    Jan 1, Jan 3, Jan 4, Jan 5 [BOB, SALLY]
    Jan 1, Jan 3, Jan 5 [BOB, JIM, SALLY]
    Jan 1, Jan 4 [BOB, JUDY, FRANK, SALLY]
    Jan 1, Jan 4, Jan 5 [BOB, JUDY, SALLY]
    Jan 1, Jan 5 [BOB, JUDY, JIM, SALLY]
    Jan 2 [BOB, JUDY, TOMMY, FRANK, JIM]
    Jan 2, Jan 3 [BOB, TOMMY, FRANK, JIM]
    Jan 2, Jan 3, Jan 4 [BOB, TOMMY, FRANK]
    Jan 2, Jan 3, Jan 4, Jan 5 [BOB, TOMMY]
    Jan 2, Jan 3, Jan 5 [BOB, TOMMY, JIM]
    Jan 2, Jan 4 [BOB, JUDY, TOMMY, FRANK]
    Jan 2, Jan 4, Jan 5 [BOB, JUDY, TOMMY]
    Jan 2, Jan 5 [BOB, JUDY, TOMMY, JIM]
    Jan 3 [BOB, TOMMY, FRANK, JIM, SALLY]
    Jan 3, Jan 4 [BOB, TOMMY, FRANK, SALLY]
    Jan 3, Jan 4, Jan 5 [BOB, TOMMY, SALLY]
    Jan 3, Jan 5 [BOB, TOMMY, JIM, SALLY]
    Jan 4 [BOB, JUDY, TOMMY, FRANK, SALLY]
    Jan 4, Jan 5 [BOB, JUDY, TOMMY, SALLY]
    Jan 5 [BOB, JUDY, TOMMY, JIM, SALLY]
    

    Hope that helps. I tinkered with it a lot longer than I intended ;) Still didn't sort the Person names in the output though.