java iterator listproperty

Remove non-matching elements from two ListProperty lists


My goal is: end up with two "identical" lists by removing the non-matching objects from both of them in the least possible time.

What I've achieved: two identical lists with the non-matching objects removed, but it takes too long.

My problem is:

I have two big lists (800k records each). The lists are filled with objects (hashCode and equals are correctly implemented on those objects), and I need to delete, from both lists, the records that have no match in the other list. That may be only 3-100 records (nothing compared to 800k records).

The problem is mainly performance, because the operation takes 10+ minutes.

This is what I've tried:

  • retainAll: this works, but takes too long.

  • HashSet.retainAll: it takes seconds and works wonderfully, but I can't use sets for my lists because I need to keep duplicates.

  • Manually: going one by one through list 1 and looking each item up in list 2, saving the non-matching items in a third list, repeating the operation in the other direction into a fourth list, then calling removeAll on both lists.

  • Iterators: it looked like a good idea to copy the lists and remove the matches from both copies. That way there are fewer items on each loop and I only need to search once, because whatever remains is non-matching. Finally I use removeAll to remove the non-matching items from the original lists, but it still takes about 10 minutes.

I need to find a quicker way to do this, but I can't figure it out.

About the duplicates: it sounds weird, but in my program two objects are equal if they have the same "name", even though they may have different values in other attributes that I need.


Solution

  • I don't understand all the reasons why you have equality on the name but not on the values, or how you decide what to keep when list A has one "foo" and list B has two "foo" entries, if you want to keep all of them.

    Here is an idea: for each list, build a HashMap from "name" to the list of objects that share that name. Then you can call retainAll on each map's key set against the other map's key set, and quickly rebuild the original collection from the map values.
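
A minimal sketch of that idea, under some assumptions: the class name `NameGrouping`, the method names, and the `keyOf` function that extracts the "name" are all hypothetical, and plain `java.util.List` stands in for the question's ListProperty. Each list is grouped by name in one pass, the key sets are intersected with `retainAll` (hash lookups, so roughly linear overall), and both lists are then rebuilt from the surviving groups.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

final class NameGrouping {

    // Groups items by key; LinkedHashMap keeps keys in first-seen order.
    static <T, K> Map<K, List<T>> groupBy(List<T> items, Function<T, K> keyOf) {
        Map<K, List<T>> groups = new LinkedHashMap<>();
        for (T item : items) {
            groups.computeIfAbsent(keyOf.apply(item), k -> new ArrayList<>()).add(item);
        }
        return groups;
    }

    // Concatenates all groups back into a single list.
    static <T, K> List<T> flatten(Map<K, List<T>> groups) {
        List<T> out = new ArrayList<>();
        for (List<T> group : groups.values()) {
            out.addAll(group);
        }
        return out;
    }

    // Keeps, in both lists, only the items whose key occurs in the other list.
    // Duplicates (several items with the same key) are all kept.
    static <T, K> void retainCommon(List<T> a, List<T> b, Function<T, K> keyOf) {
        Map<K, List<T>> groupsA = groupBy(a, keyOf);
        Map<K, List<T>> groupsB = groupBy(b, keyOf);

        // keySet() is a live view: dropping a key drops its whole group.
        groupsA.keySet().retainAll(groupsB.keySet());
        groupsB.keySet().retainAll(groupsA.keySet());

        // Build the results first, then replace each list's contents in bulk.
        List<T> keptA = flatten(groupsA);
        List<T> keptB = flatten(groupsB);
        a.clear();
        a.addAll(keptA);
        b.clear();
        b.addAll(keptB);
    }
}
```

Note that rebuilding from the map values places items with the same name next to each other, so the original element order is not preserved. With a JavaFX ObservableList, replacing the `clear()` and `addAll()` pair with a single `setAll(...)` call may be preferable, since it reports one change to listeners instead of two.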