How to check if a string contains a date in Java?


How do I check if a string contains a date of this form:

Sunday, January 15, 2012 at 7:36pm EST

The data I'm working with contains a large number of strings, but the type of string I'm looking for consists of a two- or three-word name followed by a date. I'm checking for dates to identify these strings.

I've figured out the SimpleDateFormat pattern for this type of date.

String string1 = "Rahul Chowdhury Sunday, January 15, 2012 at 7:37pm EST";
String string2 = "Aritra Sinha Nirmal Friday, April 1, 2016 at 10:16pm EDT";    

SimpleDateFormat format = new SimpleDateFormat("EEEEE, MMM dd, yyyy 'at' hh:mmaa z");

But I have no idea how to proceed further.

I'm guessing regex might work, but I don't know how to implement that when the length of the month and day names varies, e.g. 'May' is much shorter than 'December'.

I'm wondering if there is a solution using regex or a simpler solution to this.

I know there are other threads asking similar questions, but they don't answer my question.


Solution

  • You could first check for the presence of your date with a regex:

    \w+,\s+\w+\s+\d+\,\s+\d+\s+at\s+\d+:\d+(pm|am)\s+\w{3,4}
    

    This regex matches the date portion of both of these strings:

    Rahul Chowdhury Sunday, January 15, 2012 at 7:37pm EST
    Aritra Sinha Nirmal Friday, April 1, 2016 at 10:16pm EDT
    

    https://regex101.com/r/V0dAf8/2/

    Once you have found a match in your text, you can use SimpleDateFormat to check that it is well formed.

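    // Requires java.util.regex.Matcher and java.util.regex.Pattern.
    String input = "Rahul Chowdhury Sunday, January 15, 2012 at 7:37pm EST";
    // Group 1 wraps the whole date so it can be extracted from the surrounding text.
    String regex = "(\\w+,\\s+\\w+\\s+\\d+\\,\\s+\\d+\\s+at\\s+\\d+:\\d+(pm|am)\\s+\\w{3,4})";
    Matcher matcher = Pattern.compile(regex).matcher(input);
    // find() searches anywhere in the input, so the leading name is skipped.
    if (matcher.find()) {
      System.out.println(matcher.group(1));
    }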
    

    This will print:

    Sunday, January 15, 2012 at 7:37pm EST
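
The snippet above covers only the regex step. As a rough sketch of how the second step (checking the match with SimpleDateFormat) might be combined with it, something like the following could work. The class name `DateFinder`, the method `findDate`, the pattern string `"EEEE, MMMM d, yyyy 'at' h:mma z"`, the use of `Locale.ENGLISH`, and the choice of non-lenient parsing are all assumptions made for illustration and are not part of the original answer.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DateFinder {
    // Same regex as in the answer, without the redundant comma escape.
    private static final Pattern DATE_PATTERN = Pattern.compile(
        "\\w+,\\s+\\w+\\s+\\d+,\\s+\\d+\\s+at\\s+\\d+:\\d+(?:am|pm)\\s+\\w{3,4}");

    // Assumed SimpleDateFormat pattern for dates such as
    // "Sunday, January 15, 2012 at 7:37pm EST".
    private static final String DATE_FORMAT = "EEEE, MMMM d, yyyy 'at' h:mma z";

    /**
     * Returns the date substring of input if the regex finds a candidate
     * and SimpleDateFormat can parse it; returns null otherwise.
     */
    static String findDate(String input) {
        Matcher matcher = DATE_PATTERN.matcher(input);
        if (!matcher.find()) {
            return null; // nothing that even looks like a date
        }
        String candidate = matcher.group();

        // SimpleDateFormat is not thread-safe, so a new instance is created per call.
        SimpleDateFormat format = new SimpleDateFormat(DATE_FORMAT, Locale.ENGLISH);
        // Assumption: strict parsing, so values such as "January 45" are rejected.
        format.setLenient(false);
        try {
            format.parse(candidate);
            return candidate;
        } catch (ParseException e) {
            return null; // looked like a date but is not a valid one
        }
    }

    public static void main(String[] args) {
        System.out.println(findDate("Rahul Chowdhury Sunday, January 15, 2012 at 7:37pm EST"));
        System.out.println(findDate("Aritra Sinha Nirmal Friday, April 1, 2016 at 10:16pm EDT"));
        System.out.println(findDate("A string with a name but no date"));
    }
}
```

The regex acts as a cheap filter and locates the date inside the longer string, while the parse step rejects candidates that have the right shape but are not real dates. If strict validation is not wanted, the `setLenient(false)` call can be dropped.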