algorithm, fuzzy, fuzzy-comparison

Fuzzy Matching on Date-Type values


This isn't so much a concrete question as a request for creative input on a problem.

I want to compare two (most likely unequal) Date values and calculate a ratio expressing their similarity. For example, comparing 08.01.2013 with 10.01.2013 should give a relatively high value, while comparing 08.01.2013 with 17.04.1998 should give a very low one.

But I'm not sure exactly how I should calculate the similarity. My first idea was to turn the Date values into Strings and then compute the edit distance between them (the number of single-character operations needed to transform one String into the other). This seems like a good idea for some cases and I'll definitely implement it, but I also need an appropriate calculation for pairs like 31.01.2013 and 02.02.2013, which are only two days apart yet differ in several characters.
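To make the string-based idea concrete, here is a minimal sketch in Python, assuming the dates are formatted as DD.MM.YYYY. The names `edit_distance` and `string_similarity` are made up for illustration, and dividing by the longer string's length is just one possible way to turn the distance into a ratio.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions, and substitutions that turn a into b."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def string_similarity(a: str, b: str) -> float:
    """Hypothetical normalization: 1.0 for identical strings, 0.0 when
    every character has to change."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


print(edit_distance("08.01.2013", "10.01.2013"))  # two days apart
print(edit_distance("31.01.2013", "02.02.2013"))  # also two days apart
print(edit_distance("08.01.2013", "08.01.2014"))  # a full year apart
```

The three pairs have distances 2, 3, and 1, so the pair that is a whole year apart scores as the most similar, while the pair that crosses a month boundary scores as the least similar. That is the weakness of a purely character-based comparison for dates.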


Solution

  • Why not use the difference in days between the two dates as a starting point? It is "low" for similar dates and "high" for dissimilar ones; you can then use arithmetic to turn it into a "similarity ratio" that matches your requirements.

    If you get stuck, consider a fixed reference date "early enough" in the past.
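One way this could look in Python, as a rough sketch rather than a definitive implementation: the function names, the formula `1 / (1 + days / scale)`, and the default `scale_days` of 30 are all assumptions chosen for illustration. The answer only says "use arithmetic", so any monotonically decreasing mapping would serve. `date.toordinal()` counts days from a fixed early reference date (0001-01-01), which is one reading of the reference-date hint.

```python
from datetime import date, datetime


def parse_date(text: str) -> date:
    """Parse a DD.MM.YYYY string such as '08.01.2013' (assumed input format)."""
    return datetime.strptime(text, "%d.%m.%Y").date()


def date_similarity(a: date, b: date, scale_days: float = 30.0) -> float:
    """Map the day difference to (0, 1]: 1.0 for equal dates, approaching 0
    as the dates move apart. scale_days is a hypothetical tuning knob: the
    gap (in days) at which the similarity drops to 0.5."""
    # toordinal() gives the number of days since a fixed reference date.
    days_apart = abs(a.toordinal() - b.toordinal())
    return 1.0 / (1.0 + days_apart / scale_days)


near = date_similarity(parse_date("08.01.2013"), parse_date("10.01.2013"))
month_edge = date_similarity(parse_date("31.01.2013"), parse_date("02.02.2013"))
far = date_similarity(parse_date("08.01.2013"), parse_date("17.04.1998"))
print(near, month_edge, far)
```

Both two-day gaps get the same high score regardless of the month boundary, and the 1998 date scores close to zero. If a hard cutoff is preferred, something like `max(0, 1 - days_apart / max_days)` would be an equally valid choice.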