RegEx for matching various dates


I am trying to put together a regex that matches each of the date formats below.

* Mar 7, 2017
Mar. 7, 2017
* March 7, 2017
3-7-2017
03-07-2017
3-7-17
03-07-17
* 03/7/2017
* 03/07/17
* 3/7/17
Mar-07-2017
Mar-7-2017
March-07-2017

The regex below matches only the date formats marked with an asterisk above. I have tried to extend it to cover the rest, without success.

([0-9]+)/([0-9]+)/([0-9]+)|([12]\d{3}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01]))|\w+\s\d{2},\s\d{4}|(?i)\b(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|Jun(?:e)?|Jul(?:y)?|Aug(?:ust)?|Sep(?:tember)?|Oct(?:ober)?|Nov(?:ember)?|Dec(?:ember)?)\b (?:0?[1-9]|[1-2][0-9]|3[01]),? \d{4}

Any help is always appreciated!

* Bonus question *

Sometimes the text contains several dates, and I need only the one that follows a certain word. In the past I have used the syntax below, placing the regex inside the parentheses after the period.

(?<=Word).(StatementHere)

Solution

  • Try this then ...

    ([0-9]+)/([0-9]+)/([0-9]+)|((0?[1-9]|1[0-2])-(0?[1-9]|[12]\d|3[01])-(\d{4}|\d{2}))|\w+\s\d{2},\s\d{4}|(?i)\b(Jan(?:uary|\.)?|Feb(?:ruary|\.)?|Mar(?:ch|\.)?|Apr(?:il|\.)?|May|Jun(?:e|\.)?|Jul(?:y|\.)?|Aug(?:ust|\.)?|Sep(?:tember|\.)?|Oct(?:ober|\.)?|Nov(?:ember|\.)?|Dec(?:ember|\.)?)([ ](?:0?[1-9]|[1-2][0-9]|3[01]),?[ ]|-(?:0?[1-9]|[1-2][0-9]|3[01])-)(\d{4})
    

    https://regex101.com/r/k1vaVN/1
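
To check the pattern outside regex101, here is a minimal Python sketch that runs it against the thirteen formats from the question. It assumes Python's `re` module rather than the PCRE flavor used above, so the inline `(?i)` is replaced by the `re.IGNORECASE` flag (recent Python versions reject a global inline flag in the middle of a pattern). The names `DATE_RE` and `SAMPLES` are made up for this illustration.

```python
import re

# The answer's pattern, split across lines for readability.
# The mid-pattern (?i) is dropped and re.IGNORECASE is passed instead.
DATE_RE = re.compile(
    r"([0-9]+)/([0-9]+)/([0-9]+)"
    r"|((0?[1-9]|1[0-2])-(0?[1-9]|[12]\d|3[01])-(\d{4}|\d{2}))"
    r"|\w+\s\d{2},\s\d{4}"
    r"|\b(Jan(?:uary|\.)?|Feb(?:ruary|\.)?|Mar(?:ch|\.)?|Apr(?:il|\.)?|May"
    r"|Jun(?:e|\.)?|Jul(?:y|\.)?|Aug(?:ust|\.)?|Sep(?:tember|\.)?"
    r"|Oct(?:ober|\.)?|Nov(?:ember|\.)?|Dec(?:ember|\.)?)"
    r"([ ](?:0?[1-9]|[1-2][0-9]|3[01]),?[ ]|-(?:0?[1-9]|[1-2][0-9]|3[01])-)"
    r"(\d{4})",
    re.IGNORECASE,
)

# The thirteen formats listed in the question.
SAMPLES = [
    "Mar 7, 2017", "Mar. 7, 2017", "March 7, 2017",
    "3-7-2017", "03-07-2017", "3-7-17", "03-07-17",
    "03/7/2017", "03/07/17", "3/7/17",
    "Mar-07-2017", "Mar-7-2017", "March-07-2017",
]

for sample in SAMPLES:
    # fullmatch requires the pattern to cover the entire sample string.
    status = "match" if DATE_RE.fullmatch(sample) else "NO MATCH"
    print(f"{sample:15} {status}")
```

Using `fullmatch` rather than `search` makes the check stricter: a sample only counts if the whole string is consumed, not just a substring of it.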

    Readable version

        ( [0-9]+ )                    # (1)
        /
        ( [0-9]+ )                    # (2)
        /
        ( [0-9]+ )                    # (3)
     |  
        (                             # (4 start)
             ( 0? [1-9] | 1 [0-2] )        # (5)
             -
             ( 0? [1-9] | [12] \d | 3 [01] )  # (6)
             -
             ( \d{4} | \d{2} )             # (7)
        )                             # (4 end)
     |  
        \w+ \s \d{2} , \s \d{4} 
     |  
        (?i)
        \b 
        (                             # (8 start)
             Jan
             (?: uary | \. )?
          |  Feb
             (?: ruary | \. )?
          |  Mar
             (?: ch | \. )?
          |  Apr
             (?: il | \. )?
          |  May
          |  Jun
             (?: e | \. )?
          |  Jul
             (?: y | \. )?
          |  Aug
             (?: ust | \. )?
          |  Sep
             (?: tember | \. )?
          |  Oct
             (?: ober | \. )?
          |  Nov
             (?: ember | \. )?
          |  Dec
             (?: ember | \. )?
        )                             # (8 end)
        (                             # (9 start)
             [ ] 
             (?: 0? [1-9] | [1-2] [0-9] | 3 [01] )
             ,? [ ] 
          |  -
             (?: 0? [1-9] | [1-2] [0-9] | 3 [01] )
             -
        )                             # (9 end)
        ( \d{4} )                     # (10)
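
The free-spacing layout above can be used almost as-is in engines that have an extended mode. As a rough sketch, assuming Python's `re.VERBOSE` flag, the month-name branch could be compiled like this. The named groups (`month`, `day_spaced`, `day_dashed`, `year`) and the helper `split_month_date` are hypothetical additions for illustration and are not part of the pattern above.

```python
import re

# Month-name branch only, written in free-spacing (verbose) form.
# Whitespace in the pattern is ignored; [ ] is a literal space.
MONTH_DATE_RE = re.compile(
    r"""
    \b
    (?P<month>
        Jan (?: uary | \. )?
      | Feb (?: ruary | \. )?
      | Mar (?: ch | \. )?
      | Apr (?: il | \. )?
      | May
      | Jun (?: e | \. )?
      | Jul (?: y | \. )?
      | Aug (?: ust | \. )?
      | Sep (?: tember | \. )?
      | Oct (?: ober | \. )?
      | Nov (?: ember | \. )?
      | Dec (?: ember | \. )?
    )
    (?:
        [ ] (?P<day_spaced> 0? [1-9] | [12] [0-9] | 3 [01] ) ,? [ ]  # space-separated day
      | -   (?P<day_dashed> 0? [1-9] | [12] [0-9] | 3 [01] ) -       # dash-separated day
    )
    (?P<year> \d{4} )
    """,
    re.VERBOSE | re.IGNORECASE,
)


def split_month_date(text):
    """Return (month, day, year) for the first month-name date in text, or None."""
    m = MONTH_DATE_RE.search(text)
    if m is None:
        return None
    # Exactly one of the two day groups participates in any match.
    day = m.group("day_spaced") or m.group("day_dashed")
    return m.group("month"), day, m.group("year")


print(split_month_date("Mar. 7, 2017"))
print(split_month_date("March-07-2017"))
```

Named groups avoid having to count parentheses when the alternation grows, which is the main reason for swapping them in here.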
    

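    Update
    To match a date only when it follows a certain word, wrap the whole date
    alternation in a (?: ) group and put the qualifier in front of it. The \K
    after the qualifier (supported by PCRE-style engines) drops the qualifier
    from the reported match, so only the date is returned.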

    word[ ]or[ ]phrase[ ]+\K(?:([0-9]+)/([0-9]+)/([0-9]+)|((0?[1-9]|1[0-2])-(0?[1-9]|[12]\d|3[01])-(\d{4}|\d{2}))|\w+\s\d{2},\s\d{4}|(?i)\b(Jan(?:uary|\.)?|Feb(?:ruary|\.)?|Mar(?:ch|\.)?|Apr(?:il|\.)?|May|Jun(?:e|\.)?|Jul(?:y|\.)?|Aug(?:ust|\.)?|Sep(?:tember|\.)?|Oct(?:ober|\.)?|Nov(?:ember|\.)?|Dec(?:ember|\.)?)([ ](?:0?[1-9]|[1-2][0-9]|3[01]),?[ ]|-(?:0?[1-9]|[1-2][0-9]|3[01])-)(\d{4}))
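
Not every engine supports `\K`. Python's built-in `re` module, for example, does not, so one possible workaround there is to put the date in a capture group and read that group instead of the whole match. The sketch below assumes that approach. To stay short it uses a simplified numeric-only stand-in (`DATE`) in place of the full alternation, and the helper `date_after` and the sample text are invented for this example.

```python
import re

# Simplified stand-in for the full date alternation (numeric dates only).
# In real use, substitute the complete pattern inside the capture group.
DATE = r"\d{1,2}[-/]\d{1,2}[-/](?:\d{4}|\d{2})"


def date_after(keyword, text):
    """Return the first date that directly follows `keyword`, or None."""
    # re.escape keeps any special characters in the keyword literal.
    pattern = re.escape(keyword) + r"[ ]+(" + DATE + r")"
    m = re.search(pattern, text)
    # Group 1 holds only the date, which mimics what \K achieves in PCRE.
    return m.group(1) if m else None


text = "Issued 01-02-2017, Due 03/07/17, Paid 4-1-2017"
print(date_after("Due", text))
print(date_after("Paid", text))
```

A capture group also sidesteps the fixed-width restriction that Python places on lookbehind, so the keyword may be followed by a variable number of spaces.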