I am supporting a common library at work that performs many checks of a given string to see if it is a valid date. The Java API, commons-lang library, and JodaTime all have methods which can parse a string and turn it in to a date to let you know if it is actually a valid date or not, but I was hoping that there would be a way of doing the validation without actually creating a date object (or DateTime as is the case with the JodaTime library). For example here is a simple piece of example code:
public boolean isValidDate(String dateString) {
SimpleDateFormat df = new SimpleDateFormat("yyyyMMdd");
try {
df.parse(dateString);
return true;
} catch (ParseException e) {
return false;
}
}
This just seems wasteful to me, we are throwing away the resulting object. From my benchmarks about 5% of our time in this common library is spent validating dates. I'm hoping I'm just missing an obvious API. Any suggestions would be great!
UPDATE
Assume that we can always use the same date format at all times (likely yyyyMMdd). I did think about using a regex as well, but then it would need to be aware of the number of days in each month, leap years, etc...
Results
Parsed a date 10 million times
Using Java's SimpleDateFormat: ~32 seconds
Using commons-lang DateUtils.parseDate: ~32 seconds
Using JodaTime's DateTimeFormatter: ~3.5 seconds
Using the pure code/math solution by Slanec: ~0.8 seconds
Using precomputed results by Slanec and dfb (minus filling cache): ~0.2 seconds
There were some very creative answers, I appreciate it! I guess now I just need to decide how much flexibility I need what I want the code to look like. I'm going to say that dfb's answer is correct because it was purely the fastest which was my original questions. Thanks!
If you're really concerned about performance and your date format is really that simple, just pre-compute all the valid strings and hash them in memory. The format you have above only has ~ 8 million valid combinations up to 2050
EDIT by Slanec - reference implementation
This implementation depends on your specific dateformat. It could be adapted to any specific dateformat out there (just like my first answer, but a bit better).
It makes a set of all dates
from 1900 to 2050 (stored as Strings - there are 54787 of them) and then compares the given dates with those stored.
Once the dates
set is created, it's fast as hell. A quick microbenchmark showed an improvement by a factor of 10 over my first solution.
private static Set<String> dates = new HashSet<String>();
static {
for (int year = 1900; year < 2050; year++) {
for (int month = 1; month <= 12; month++) {
for (int day = 1; day <= daysInMonth(year, month); day++) {
StringBuilder date = new StringBuilder();
date.append(String.format("%04d", year));
date.append(String.format("%02d", month));
date.append(String.format("%02d", day));
dates.add(date.toString());
}
}
}
}
public static boolean isValidDate2(String dateString) {
return dates.contains(dateString);
}
P.S. It can be modified to use Set<Integer>
or even Trove's TIntHashSet
which reduces memory usage a lot (and therefore allows to use a much larger timespan), the performance then drops to a level just below my original solution.