Similar to this question, I'm trying to iterate over only the distinct values of a sub-string of the given strings, for example:
List<string> myList = new List<string>()
{
    "foo_boo_1",
    "foo_boo_2",
    "foo_boo_3",
    "boo_boo_1"
};
The output should contain one key per distinct sub-string (arbitrarily, the first key that has it):
foo_boo_1 (the first one)
boo_boo_1
I've tried to implement this solution using an IEqualityComparer:
public class MyEqualityComparer : IEqualityComparer<string>
{
    public bool Equals(string x, string y)
    {
        int xIndex = x.LastIndexOf("_");
        int yIndex = y.LastIndexOf("_");
        if (xIndex > 0 && yIndex > 0)
            return x.Substring(0, xIndex) == y.Substring(0, yIndex);
        else
            return false;
    }
    public int GetHashCode(string obj)
    {
        return obj.GetHashCode();
    }
}
foreach (var key in myList.Distinct(new MyEqualityComparer()))
{
    Console.WriteLine(key);
}
But the resulting output is:
foo_boo_1
foo_boo_2
foo_boo_3
boo_boo_1
Using the IEqualityComparer, how do I remove the keys whose sub-string has already been seen (foo_boo_2 and foo_boo_3)?
*Please note that the "real" keys are a lot longer, something like "1_0_8-B153_GF_6_2", so I must use LastIndexOf.
Your current implementation has some flaws:
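1. Equals and GetHashCode must never throw an exception (you have to check for null).
2. If Equals returns true for x and y, then GetHashCode(x) == GetHashCode(y) must hold. A counter example is "abc_1" and "abc_2": your Equals treats them as equal, but their hash codes are computed from the whole strings and so generally differ.
The 2nd error can well cause Distinct to return incorrect results (Distinct computes the hash first and calls Equals only for items whose hashes match).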
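To make the second flaw visible, here is a minimal sketch that wraps the question's logic in a comparer that counts calls. CountingComparer, ContractDemo and the two counter fields are hypothetical names introduced only for this illustration, and null handling is left out on purpose so the code mirrors the original.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical comparer: same logic as the question, plus call counters.
public class CountingComparer : IEqualityComparer<string>
{
    public int EqualsCalls;
    public int HashCalls;

    // Compares only the part before the last '_', as in the question.
    public bool Equals(string x, string y)
    {
        EqualsCalls++;
        int xIndex = x.LastIndexOf('_');
        int yIndex = y.LastIndexOf('_');
        return xIndex > 0 && yIndex > 0
            && x.Substring(0, xIndex) == y.Substring(0, yIndex);
    }

    // Hashes the whole string, as in the question: this breaks the contract.
    public int GetHashCode(string obj)
    {
        HashCalls++;
        return obj.GetHashCode();
    }
}

public static class ContractDemo
{
    public static void Main()
    {
        var comparer = new CountingComparer();
        var result = new[] { "abc_1", "abc_2" }.Distinct(comparer).ToList();

        Console.WriteLine("items kept:   " + result.Count);
        Console.WriteLine("hash calls:   " + comparer.HashCalls);
        Console.WriteLine("equals calls: " + comparer.EqualsCalls);
    }
}
```

Barring a hash collision, both items are kept and Equals is never called: the two strings land under different hash codes, so Distinct has no reason to compare them.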
Correct code can be something like this:
using System.Collections.Generic;

public class MyEqualityComparer : IEqualityComparer<string> {
    public bool Equals(string x, string y) {
        // Same reference (including both null) is always equal
        if (ReferenceEquals(x, y))
            return true;
        else if ((null == x) || (null == y))
            return false;

        int xIndex = x.LastIndexOf('_');
        int yIndex = y.LastIndexOf('_');

        // Compare the parts before the last '_' when both strings have one
        if (xIndex >= 0)
            return (yIndex >= 0)
                ? x.Substring(0, xIndex) == y.Substring(0, yIndex)
                : false;
        else if (yIndex >= 0)
            return false;
        else
            return x == y;
    }

    public int GetHashCode(string obj) {
        if (null == obj)
            return 0;

        // Hash the same part that Equals compares
        int index = obj.LastIndexOf('_');

        return index < 0
            ? obj.GetHashCode()
            : obj.Substring(0, index).GetHashCode();
    }
}
Now you are ready to use it with Distinct:
foreach (var key in myList.Distinct(new MyEqualityComparer())) {
    Console.WriteLine(key);
}
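One way to keep Equals and GetHashCode in sync is to derive both from a single key function, sketched below as a complete program. PrefixComparer, its Prefix helper and PrefixDemo are hypothetical names, not part of the code above. This variant is slightly looser than the comparer above: a string without '_' (say "abc") is treated as equal to "abc_1", because both reduce to the same key.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical variant: equality and hash both come from one key function.
public class PrefixComparer : IEqualityComparer<string>
{
    // Everything before the last '_'; the whole string if there is no '_';
    // null stays null.
    private static string Prefix(string s)
    {
        if (s == null)
            return null;

        int index = s.LastIndexOf('_');
        return index < 0 ? s : s.Substring(0, index);
    }

    public bool Equals(string x, string y)
    {
        return string.Equals(Prefix(x), Prefix(y));
    }

    public int GetHashCode(string obj)
    {
        string key = Prefix(obj);
        return key == null ? 0 : key.GetHashCode();
    }
}

public static class PrefixDemo
{
    public static void Main()
    {
        var myList = new List<string>
        {
            "foo_boo_1",
            "foo_boo_2",
            "foo_boo_3",
            "boo_boo_1"
        };

        foreach (var key in myList.Distinct(new PrefixComparer()))
            Console.WriteLine(key);
    }
}
```

With the sample list this should print foo_boo_1 and boo_boo_1. The documentation describes the result of Distinct as unordered, so keeping the first occurrence in source order is what current implementations do, not a documented guarantee.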