regex clojure clojure-contrib

Regex for dates in Clojure


The dates I am looking to capture follow permutations of the pattern "word/DD/YYYY", where word is a month name, i.e.

(def months ["january" "January" "february" "February" "march" "March" "April" "april" "may" "May" "june" "June" "july" "July" "august" "August" "september" "September" "october" "October" "november" "November" "december" "December"])

So the other possible permutations of the above pattern would be "DD/word/YYYY", "YYYY/word/DD", and "YYYY/DD/word".

I've tried something along the lines of

(def months-match (clojure.string/join "|" months))
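;; optional leading whitespace, then one capturing group holding the month alternation
(def months-str (str "\\s*(" months-match ")"))
(def moster (re-pattern months-str))

;; re-seq returns [whole-match group] vectors here; first keeps the whole match
(defn foomonths [s]
  (map first (re-seq moster s)))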

with plans to add the regex for days and years

|[- /.](0[1-9]|[12][0-9]|3[01])[- /.](19|20)\d\d

Permuting the regex has not been an issue. Rather, the difficulty is combining the months, which are words, with the days and years, which are digits, in a single regex.


Solution

  • I see that your question is about regexes, so my apologies if this answer is off topic, but I would suggest a slightly different approach: clj-time includes a time formatter that can handle most of these cases out of the box:

    project.clj:

     (defproject hello "0.1.0-SNAPSHOT"
      :description "FIXME: write description"
      :url "http://example.com/FIXME"
      :license {:name "Eclipse Public License"
                :url "http://www.eclipse.org/legal/epl-v10.html"}
      :dependencies [[org.clojure/clojure "1.5.1"]
                  [clj-time "0.6.0"]]
      :source-paths ["dev"]) 
    
    user> (require '[clj-time.format :refer [formatter parse]])
    nil

    user> (def custom-formatter (formatter "dd/MMMMMMMMM/YYYY"))
    #'user/custom-formatter

    user> (parse custom-formatter "14/June/2014")
    #<DateTime 2014-06-14T00:00:00.000Z>

    user> (parse custom-formatter "14/september/2014")
    #<DateTime 2014-09-14T00:00:00.000Z>
    

    So you could write one time format string for each permutation and then try each in turn until one of them parses the input.
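
    A minimal sketch of that "try each format" idea, assuming the clj-time dependency from the project.clj above. The names `date-formatters` and `parse-date` are made up for illustration, and the pattern strings are written with `MMMM` and `yyyy`, which are my own choice of Joda-Time pattern letters rather than the exact string used in the REPL session. It relies on `parse` throwing an `IllegalArgumentException` when a string does not fit the formatter:

    ```clojure
    (require '[clj-time.format :as f])

    ;; One formatter per permutation of month word, day, and year (hypothetical list).
    (def date-formatters
      (map f/formatter ["MMMM/dd/yyyy"
                        "dd/MMMM/yyyy"
                        "yyyy/MMMM/dd"
                        "yyyy/dd/MMMM"]))

    (defn parse-date
      "Returns the DateTime from the first formatter that accepts s, or nil if none does."
      [s]
      (some (fn [fmt]
              (try
                (f/parse fmt s)
                ;; a failed parse is expected here, so move on to the next formatter
                (catch IllegalArgumentException _ nil)))
            date-formatters))
    ```

    Calling `(parse-date "14/June/2014")` should go through the second formatter, while a string that fits none of the four patterns gives `nil`. Swallowing the exception is deliberate in this sketch, since a failed attempt just means "try the next permutation".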