I know this kind of question has been asked on Stack Overflow many times before, but my issue is a little different and I could not find a similar scenario, so I am posting this question here.
Problem: I need to remove duplicate objects from an ArrayList. The structure of my ArrayList is as follows:
dataList.add(new ObjectClass("a","b"));
dataList.add(new ObjectClass("c","n"));
dataList.add(new ObjectClass("b","a")); // should be counted as duplicate
dataList.add(new ObjectClass("z","x"));
I need to remove objects from the above list so that the combinations "a,b" and "b,a" are treated as duplicates and one of them is removed.
My solution: Step 1) Override the equals method in the DataClass class
class DataClass {
    String source;
    String destination;
    DataClass(String src, String dest) {
        this.source = src;
        this.destination = dest;
    }
    // getter setter for source and destination variables
    @Override
    public boolean equals(Object obj) {
        System.out.println("inside equals");
        if (this == obj)
            return true;
        if (obj == null)
            return false;
        if (getClass() != obj.getClass())
            return false;
        ObjectClass other = (ObjectClass) obj;
        if(i.equals(other.getJ())
                && j.equals(other.getI())) {
            return true;
        } else return false;
    }
Step 2) A method to remove duplicates
public List<DataClass> removeDuplicates(List<DataClass> dataList) {
    List<DataClass> resultList = new ArrayList<DataClass>();
    // Convert array list to Linked list
    LinkedList<DataClass> linkedList = new LinkedList<DataClass>();
    for(DataClass obj: dataList) {
        linkedList.add(obj);
    }
    // Iterate through linked list and remove if values are duplicates
    for(int i = 0; i<linkedList.size();i++) {
        for(int j = i+1;j<linkedList.size();j++) {
            if(linkedList.get(j).equals(linkedList.get(i))) {
                linkedList.remove();
            }
        }
    }
    resultList.addAll(linkedList);
    return resultList;
}
I am still looking for a better, more optimized solution, if there is one. Thanks in advance.
Update with solution: my equals method needed its comparison logic corrected. Here is my updated class, ObjectClass instead of DataClass, including the corrected equals override:
public class ObjectClass {
    String i;
    String j;
    public ObjectClass(String i, String j) {
        this.i = i;
        this.j = j;
    }
    // getters setters
    // override hashcode
    @Override
    public boolean equals(Object obj) {
        if (this == obj)
            return true;
        if (obj == null)
            return false;
        if (getClass() != obj.getClass())
            return false;
        ObjectClass other = (ObjectClass) obj;
        if((i.equals(other.getJ()) || i.equals(other.getI()))
                && (j.equals(other.getI()) || j.equals(other.getJ()))) {
            return true;
        } else return false;
    }
}
After fixing the equals method, I tried the implementation below in the removeDuplicates method, as Janos suggested, and it works as expected:
for(ObjectClass obj: dataList) {
    if(!resultList.contains(obj))
        resultList.add(obj);
}
There are several problems here:
class DataClass {
    String source;
    String destination;
    // ...
    @Override
    public boolean equals(Object obj) {
        // ...
        ObjectClass other = (ObjectClass) obj;
        if(i.equals(other.getJ())
                && j.equals(other.getI())) {
            return true;
        } else return false;
    }
The equals method casts the other object to ObjectClass. It should cast to the same class where this method is defined: DataClass.
The equals method compares i and j variables, but they are not defined anywhere within the class. There's source and destination.
The equals method will return true when this.i is the same as other.j and this.j is the same as other.i, and return false otherwise. In other words, (a, b) will be equal to (b, a). But it will not be equal to itself. That's very strange, and probably not what you intended.
The removeDuplicates method is overcomplicated. For example, converting an array list to a linked list is unnecessary.
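Here's a much simpler algorithm: start with an empty result list, then go through the input and add each item only if the result does not already contain it. That's it.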
List<DataClass> result = new ArrayList<>();
for (DataClass item : dataList) {
    if (!result.contains(item)) {
        result.add(item);
    }
}
return result;
This assumes that the implementation of the equals method is fixed. Otherwise the result.contains step will not work correctly.
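As a minimal sketch of what a fixed equals might look like: the class name UnorderedPair below is hypothetical and stands in for DataClass, and the sketch assumes that neither field is null and that a pair should equal both itself and its swapped counterpart. It also overrides hashCode with an order-independent sum, so that equal objects always have equal hash codes.

```java
import java.util.Objects;

// Hypothetical stand-in for DataClass: a pair whose field order does not matter.
class UnorderedPair {
    private final String source;
    private final String destination;

    UnorderedPair(String source, String destination) {
        // Assumption: null fields are not allowed.
        this.source = Objects.requireNonNull(source);
        this.destination = Objects.requireNonNull(destination);
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (obj == null || getClass() != obj.getClass()) return false;
        UnorderedPair other = (UnorderedPair) obj;
        // Equal when the fields match directly...
        boolean sameOrder = source.equals(other.source)
                && destination.equals(other.destination);
        // ...or when they match with the two fields swapped.
        boolean swapped = source.equals(other.destination)
                && destination.equals(other.source);
        return sameOrder || swapped;
    }

    @Override
    public int hashCode() {
        // Addition is commutative, so (a, b) and (b, a) get the same hash code.
        return source.hashCode() + destination.hashCode();
    }

    @Override
    public String toString() {
        return "(" + source + ", " + destination + ")";
    }
}
```

With this, new UnorderedPair("a", "b") is equal to itself and to new UnorderedPair("b", "a"), but not to new UnorderedPair("a", "a"). Checking each field against either field of the other object would not be enough, because it would make (a, a) equal to (a, b).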
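Also note that result.contains performs a linear search: it checks every item until it finds a match, which makes the whole loop quadratic in the size of the list. You could improve performance by using a set such as a HashSet, which requires hashCode to be overridden consistently with equals.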
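A set-based version could look like the sketch below. It swaps in a slightly different technique: instead of relying on equals, it tracks a canonical key for each item (the two fields in sorted order) in a HashSet, so each lookup takes constant time on average. The names SetDedupSketch, Route, and key are hypothetical, and the fields are assumed to be non-null.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class SetDedupSketch {
    // Hypothetical minimal element type; stands in for DataClass.
    static final class Route {
        final String source;
        final String destination;

        Route(String source, String destination) {
            this.source = source;
            this.destination = destination;
        }
    }

    // Canonical key: both fields in sorted order, so (a, b) and (b, a) share a key.
    static List<String> key(Route r) {
        return r.source.compareTo(r.destination) <= 0
                ? Arrays.asList(r.source, r.destination)
                : Arrays.asList(r.destination, r.source);
    }

    // Keeps the first occurrence of each unordered pair, in input order.
    static List<Route> removeDuplicates(List<Route> dataList) {
        Set<List<String>> seen = new HashSet<>();
        List<Route> result = new ArrayList<>();
        for (Route item : dataList) {
            // Set.add returns false when the key is already present.
            if (seen.add(key(item))) {
                result.add(item);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Route> data = Arrays.asList(
                new Route("a", "b"), new Route("c", "n"),
                new Route("b", "a"), new Route("z", "x"));
        System.out.println(removeDuplicates(data).size()); // prints 3
    }
}
```

If equals and hashCode are already overridden consistently on the element class, passing the list to new LinkedHashSet<>(dataList) would have a similar effect while preserving insertion order.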