Search code examples
dateprestotrinopresto-jdbc

presto SQL - Filter records if date column format is not matching yyyy-MM-dd


I need to get all the records from a table that has a date not matching with the format yyyy-MM-dd.

Here, column ID is unique bigint column. start_date is of varchar datatype.

Sample input:

enter image description here

Expected output:

enter image description here

Thanks


Solution

  • Use regexp_like:

    select id, start_date
      from mytable
     where NOT regexp_like(start_date, '\d{4}-\d{2}-\d{2}')
    

    This will work for '11-12-200' and 'None'.

    If you want to include NULL values as well, add additional condition:

     where (NOT regexp_like(start_date, '\d{4}-\d{2}-\d{2}'))
        OR start_date is null
    

    More strict date regexp is '^\d{4}\-(0[1-9]|1[012])\-(0[1-9]|[12][0-9]|3[01])$'

    This will restrict month to 01 .. 12 and day to 01 .. 31 and will not allow other characters before and after date( ^ and $ anchors are used).

    One more simple and powerful method

    is to use try_cast(col as date) - it will return NULL if not possible to cast:

    where try_cast(start_date as date) is not null
    

    This will also restrict wrong dates like Feb 30 (2000-02-30)