java, list, filter, unique

Filter two huge lists that share some ids and keep only the distinct items


I have two huge lists, each with millions of items. I am not allowed to share the production code, but the following simulates it. Some ids appear in both lists and the other ids appear in only one of them. I would like a single list in which each id appears only once. I have solved it with classic Java, without the stream API, and the performance is good enough when I have millions of items. How can I improve this code?

    import java.time.LocalDate;

    public class TestClass {
        private String id;
        private LocalDate creationTimestamp;

        public String getId() {
            return id;
        }

        public void setId(String id) {
            this.id = id;
        }

        public LocalDate getCreationTimestamp() {
            return creationTimestamp;
        }

        public void setCreationTimestamp(LocalDate creationTimestamp) {
            this.creationTimestamp = creationTimestamp;
        }
    }

    TestClass testClass = new TestClass();
    testClass.setId("1");
    testClass.setCreationTimestamp(LocalDate.now());
    TestClass testClass2 = new TestClass();
    testClass2.setId("2");
    testClass2.setCreationTimestamp(LocalDate.now());
    TestClass testClass3 = new TestClass();
    testClass3.setId("3");
    testClass3.setCreationTimestamp(LocalDate.now());
    TestClass testClass4 = new TestClass();
    testClass4.setId("4");
    testClass4.setCreationTimestamp(LocalDate.now());

    List<TestClass> testClassesList1 = new ArrayList<>();
    testClassesList1.add(testClass);
    testClassesList1.add(testClass2);
    testClassesList1.add(testClass3);
    testClassesList1.add(testClass4);

    TestClass testClass5 = new TestClass();
    testClass5.setId("1");
    testClass5.setCreationTimestamp(LocalDate.now());
    TestClass testClass6 = new TestClass();
    testClass6.setId("2");
    testClass6.setCreationTimestamp(LocalDate.now());
    TestClass testClass7 = new TestClass();
    testClass7.setId("5");
    testClass7.setCreationTimestamp(LocalDate.now());
    TestClass testClass8 = new TestClass();
    testClass8.setId("6");
    testClass8.setCreationTimestamp(LocalDate.now());

    List<TestClass> testClassesList2 = new ArrayList<>();
    testClassesList2.add(testClass5);
    testClassesList2.add(testClass6);
    testClassesList2.add(testClass7);
    testClassesList2.add(testClass8);


    List<TestClass> uniqueTestClasses = new ArrayList<>();

    if(testClassesList1.size() == testClassesList2.size()) {
        for (int i = 0; i < testClassesList1.size(); i++) {
            if(testClassesList1.get(i).getId().equalsIgnoreCase(
                    testClassesList2.get(i).getId())){
                uniqueTestClasses.add(testClassesList1.get(i));
            }else{
                uniqueTestClasses.add(testClassesList1.get(i));
                uniqueTestClasses.add(testClassesList2.get(i));
            }
        }
    }

This works fine when both lists have the same size, but what would the solution be if the sizes differ? I am also not happy with the performance. How can I achieve the same goal with the stream API?


Solution

  • As you asked for a code snippet that uses the HashSet approach, here is one.

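    import java.util.HashSet;
    import java.util.Iterator;
    import java.util.List;

    public List<TestClass> checker(List<TestClass> ls1, List<TestClass> ls2) {

        // HashSet of the ids found in the smaller list (getId() returns a String)
        HashSet<String> set = new HashSet<>();

        // Pick the smaller and the bigger list; on a tie, ls1 counts as the smaller one
        List<TestClass> smallerList = ls1.size() <= ls2.size() ? ls1 : ls2;
        List<TestClass> biggerList = ls1.size() > ls2.size() ? ls1 : ls2;

        // Add the ids of the smaller list to the HashSet
        for (TestClass tc : smallerList) {
            set.add(tc.getId());
        }

        // Walk the bigger list and remove every element whose id is already in the set.
        // This modifies the list that was passed in, so it must support Iterator.remove().
        Iterator<TestClass> iter = biggerList.iterator();
        while (iter.hasNext()) {
            TestClass t = iter.next();

            // Check whether this id also occurs in smallerList
            if (set.contains(t.getId())) {
                iter.remove();
            }
        }
        // biggerList now holds only the elements whose ids are absent from smallerList
        return biggerList;
    }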
    

    I haven't tested this code out. It might have some syntactical errors.
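
For the stream API part of the question, and for lists of different sizes, one option is to concatenate both lists and collect them into a map keyed by id. The following is only a minimal sketch under stated assumptions: the class name `DistinctById`, the helper `distinctById`, and the two-argument constructor on `TestClass` are hypothetical names added for illustration and do not come from the question. When an id occurs more than once, the first occurrence is kept. Ids are compared case-sensitively here, whereas the loop in the question uses `equalsIgnoreCase`.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class DistinctById {

    // Stand-in for the question's TestClass; the constructor is an assumption for brevity.
    static class TestClass {
        private final String id;
        private final LocalDate creationTimestamp;

        TestClass(String id, LocalDate creationTimestamp) {
            this.id = id;
            this.creationTimestamp = creationTimestamp;
        }

        String getId() {
            return id;
        }

        LocalDate getCreationTimestamp() {
            return creationTimestamp;
        }
    }

    // Hypothetical helper: returns one element per id, in order of first appearance.
    // Neither input list is modified, and the lists may have different sizes.
    static List<TestClass> distinctById(List<TestClass> first, List<TestClass> second) {
        Map<String, TestClass> byId = Stream.concat(first.stream(), second.stream())
                .collect(Collectors.toMap(
                        TestClass::getId,                    // key: the id
                        Function.identity(),                 // value: the element itself
                        (existing, duplicate) -> existing,   // on a duplicate id keep the first one
                        LinkedHashMap::new));                // preserve encounter order
        return new ArrayList<>(byId.values());
    }

    public static void main(String[] args) {
        LocalDate today = LocalDate.now();
        List<TestClass> list1 = List.of(
                new TestClass("1", today), new TestClass("2", today),
                new TestClass("3", today), new TestClass("4", today));
        List<TestClass> list2 = List.of(
                new TestClass("1", today), new TestClass("2", today),
                new TestClass("5", today), new TestClass("6", today));

        List<String> ids = distinctById(list1, list2).stream()
                .map(TestClass::getId)
                .collect(Collectors.toList());
        System.out.println(ids); // prints [1, 2, 3, 4, 5, 6]
    }
}
```

Like the HashSet version, this makes a single pass over each list, so the work grows linearly with the total number of items. The map holds one entry per distinct id, which is the main memory cost with millions of items.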