I am trying to check if a string is in valid date format of ‘YYYYMMDD’ .
I am using the below technique. But for an invalid date string I am getting a valid date result.
What am I doing wrong?
SELECT'20019999',CASE WHEN unix_timestamp('20019999','YYYYMMDD') > 0 THEN 'Good'ELSE 'Bad'END;
First of all, you are using the wrong format
select from_unixtime(unix_timestamp()) as default_format
,from_unixtime(unix_timestamp(),'YYYY-MM-DD') as wrong_format
,from_unixtime(unix_timestamp(),'yyyy-MM-dd') as right_format
;
+----------------------+---------------+---------------+
| default_format | wrong_format | right_format |
+----------------------+---------------+---------------+
| 2017-10-07 04:13:26 | 2017-10-280 | 2017-10-07 |
+----------------------+---------------+---------------+
Second, there is no validation on date parts range.
If you increase the day part by 1 it forwards you to the following day.
with t as (select stack(7,'27','28','29','30','31','32','33') as dy)
select t.dy
,from_unixtime(unix_timestamp(concat('2017-02-',t.dy),'yyyy-MM-dd'),'yyyy-MM-dd') as dt
from t
;
+-----+-------------+
| dy | dt |
+-----+-------------+
| 27 | 2017-02-27 |
| 28 | 2017-02-28 |
| 29 | 2017-03-01 |
| 30 | 2017-03-02 |
| 31 | 2017-03-03 |
| 32 | 2017-03-04 |
| 33 | 2017-03-05 |
+-----+-------------+
If you increase the month part by 1 it forwards you to the following month.
with t as (select stack(5,'10','11','12','13','14') as mn)
select t.mn
,from_unixtime(unix_timestamp(concat('2017-',t.mn,'-01'),'yyyy-MM-dd'),'yyyy-MM-dd') as dt
from t
;
+-----+-------------+
| mn | dt |
+-----+-------------+
| 10 | 2017-10-01 |
| 11 | 2017-11-01 |
| 12 | 2017-12-01 |
| 13 | 2018-01-01 |
| 14 | 2018-02-01 |
+-----+-------------+
Even with CAST the validation is done only on the parts ranges and not on the date itself.
select cast('2010-02-32' as date);
+-------+
| _c0 |
+-------+
| NULL |
+-------+
select cast('2010-02-29' as date);
+-------------+
| _c0 |
+-------------+
| 2010-03-01 |
+-------------+
Here is a way to achive you goal:
with t as (select '20019999' as dt)
select dt
,from_unixtime(unix_timestamp(dt,'yyyyMMdd'),'yyyyMMdd') as double_converted_dt
,case
when from_unixtime(unix_timestamp(dt,'yyyyMMdd'),'yyyyMMdd') = dt
then 'Good'
else 'Bad'
end as dt_status
from t
;
+-----------+----------------------+------------+
| dt | double_converted_dt | dt_status |
+-----------+----------------------+------------+
| 20019999 | 20090607 | Bad |
+-----------+----------------------+------------+