Found some strange behaviour today that I'm hoping someone can shed some light on.
I'm using strptime to validate dates in an import file. In this case, I want to throw an error if a row in the file contains a date that doesn't fit the format %Y/%m/%d (2017/01/25).
I call strptime as follows:
Date.strptime('25/01/2017', '%Y/%m/%d')
I expect this to fail, as 25 does not fit the criteria for the year. However, this succeeds, returning the date:
0025, 01, 20
If I swap the month and day around (01/25/2018), it fails, as it does detect that the month is invalid.
So what gives? It seems bizarre that it not only creates this mental looking year (0025), but even crazier that it disregards the '17' from the end of the string without issue.
Thanks in advance! :)
You have to think about what you actually asked for:
Date.strptime('25/01/2017', '%Y/%m/%d')
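You are saying that you want the year 0025, month 01 and day 20. %Y takes the leading digits 25, %m takes 01, and %d takes at most two digits, so it stops at 20 and the trailing 17 is ignored. In the end you get 0025-01-20.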
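To see what was consumed and what was dropped, you can ask for the parsed pieces instead of a finished Date. This is only a small sketch: it uses Date._strptime, which returns a hash, and the strict_ok variable is just an illustrative name. For the string from the question, the hash should report the unconsumed 17 under :leftover.

```ruby
require 'date'

# Date._strptime returns the parsed elements as a hash instead of a Date.
parts = Date._strptime('25/01/2017', '%Y/%m/%d')

# Any input the format did not consume is reported under :leftover.
puts parts.inspect

# A stricter check can reject every parse that leaves input behind.
strict_ok = !parts.nil? && !parts.key?(:leftover)
puts strict_ok
```

Rejecting any result that carries :leftover is one way to stop trailing characters from being silently discarded.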
You cannot rely on Date.strptime alone to do the validation for you.
The best approach is to check the string against a regexp first and then validate the date.
For your format, a simple regexp would be (anchored with \A and \z so that nothing may precede or follow the date):
'25/01/2017'.match(/\A\d{4}\/\d{2}\/\d{2}\z/)
In your case you will get nil, because the string does not match.
If the string does match (for example '2017/01/25'), you will get:
#<MatchData "2017/01/25">
The issue is that this only checks the shape of the string, not whether the date itself is valid (2017/02/30 would still match). You still need to check whether strptime can parse the result (as in the link provided by Tom Lord).
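Putting the two checks together could look roughly like this. It is a sketch, not the only way to do it: parse_strict_date, DATE_SHAPE and DATE_FORMAT are names made up for this example, and returning nil for bad input is an assumption about what your import code wants.

```ruby
require 'date'

DATE_FORMAT = '%Y/%m/%d'
# Shape check only: exactly four digits, two digits, two digits.
DATE_SHAPE = /\A\d{4}\/\d{2}\/\d{2}\z/

# Returns a Date for a well-formed, real calendar date, otherwise nil.
def parse_strict_date(str)
  return nil unless DATE_SHAPE.match?(str)

  # strptime rejects impossible dates such as 2017/02/30.
  Date.strptime(str, DATE_FORMAT)
rescue ArgumentError
  # Date::Error is a subclass of ArgumentError on recent Rubies.
  nil
end

puts parse_strict_date('2017/01/25').inspect
puts parse_strict_date('25/01/2017').inspect
puts parse_strict_date('2017/02/30').inspect
```

The regexp guards the layout of the string and strptime guards the calendar, so neither has to do the other's job.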
Alternatively, you can do the whole check with a regexp alone, although it gets rather complex. The following regex checks the yyyy/mm/dd format, including month lengths and leap years:
^(?:(?:(?:(?:(?:[1-9]\d)(?:0[48]|[2468][048]|[13579][26])|(?:(?:[2468][048]|[13579][26])00))(\/)(?:0?2\1(?:29)))|(?:(?:[1-9]\d{3})(\/)(?:(?:(?:0?[13578]|1[02])\2(?:31))|(?:(?:0?[13-9]|1[0-2])\2(?:29|30))|(?:(?:0?[1-9])|(?:1[0-2]))\2(?:0?[1-9]|1\d|2[0-8])))))$
Then you know right away whether the date is in the correct format, and you don't have to parse it with strptime.
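If the long pattern above is hard to maintain, another option is a round trip: parse the string, format the result back with the same template, and compare it with the input. This is a different technique from the regexp, offered as a sketch; round_trip_valid? is a name invented for this example.

```ruby
require 'date'

# True only if parsing and re-formatting reproduces the input exactly.
def round_trip_valid?(str, format = '%Y/%m/%d')
  Date.strptime(str, format).strftime(format) == str
rescue ArgumentError, TypeError
  # Unparseable strings, impossible dates and nil all count as invalid.
  false
end

puts round_trip_valid?('2017/01/25')
puts round_trip_valid?('25/01/2017')
puts round_trip_valid?('2017/02/30')
```

Because the comparison is against the original string, input that strptime accepts loosely (ignored trailing digits, a month written without its leading zero) does not survive the round trip.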
Edit:
When dealing with time, remember to always perform your own checks! Don't rely on the function. The problem with time is that there are many exceptions, and even though there is ISO 8601 (and maybe some other standards), many applications do not follow it.
Based on a comment, I want to dig deeper into strptime.
For now, I want to paste the comment from the source code (on the date_s_strptime function in date_core.c):
/*
* call-seq:
* Date.strptime([string='-4712-01-01'[, format='%F'[, start=Date::ITALY]]]) -> date
*
* Parses the given representation of date and time with the given
* template, and creates a date object. strptime does not support
* specification of flags and width unlike strftime.
*
* Date.strptime('2001-02-03', '%Y-%m-%d') #=> #<Date: 2001-02-03 ...>
* Date.strptime('03-02-2001', '%d-%m-%Y') #=> #<Date: 2001-02-03 ...>
* Date.strptime('2001-034', '%Y-%j') #=> #<Date: 2001-02-03 ...>
* Date.strptime('2001-W05-6', '%G-W%V-%u') #=> #<Date: 2001-02-03 ...>
* Date.strptime('2001 04 6', '%Y %U %w') #=> #<Date: 2001-02-03 ...>
* Date.strptime('2001 05 6', '%Y %W %u') #=> #<Date: 2001-02-03 ...>
* Date.strptime('sat3feb01', '%a%d%b%y') #=> #<Date: 2001-02-03 ...>
*
* See also strptime(3) and #strftime.
*/
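You can see strings like sat and feb being used too, so it is no surprise that the parser can deal with text. Note also the remark that strptime does not support flags and width, which would explain why %Y does not insist on four digits. TO BE CONTINUED - digging into the C code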