I have column data that stores date in mediumtext datatype. How do I convert mediumtext to date so I can extract data for certain time range.
Thank you
Sample data:
We can use STR_TO_DATE function with appropriate format model. The function doesn't have a format model for handling the timezone abbreviation as part of the datetime string, so we have to do a bit of string manipulation to workaround it
e.g.
SELECT STR_TO_DATE( CONCAT(SUBSTRING_INDEX(t.foo,' ',4),' ',SUBSTRING_INDEX(t.foo,' ',-1)),'%a %b %d %T %Y') AS dt
, SUBSTRING_INDEX(SUBSTRING_INDEX(t.foo,' ',5),' ',-1) AS tz
FROM ( SELECT 'Fri Sep 11 15:30:55 JST 2015' AS foo
UNION ALL SELECT 'Fri Sep 11 15:40:09 JST 2015'
UNION ALL SELECT 'Fri Sep 11 14:55:09 SGT 2015'
UNION ALL SELECT 'Fri Sep 11 14:57:38 SGT 2015'
UNION ALL SELECT 'Fri Sep 11 03:59:02 EDT 2015'
UNION ALL SELECT 'Fri Sep 11 01:10:17 PDT 2015'
) t
To handle the timezone, we can use CONVERT_TZ function, but the mysql.time_zone_name (and related time_zone tables) need to be populated, to handle the values for timezone.
SELECT STR_TO_DATE( CONCAT(SUBSTRING_INDEX(t.foo,' ',4),' ',SUBSTRING_INDEX(t.foo,' ',-1)),'%a %b %d %T %Y') AS dt
, SUBSTRING_INDEX(SUBSTRING_INDEX(t.foo,' ',5),' ',-1) AS tz
, CONVERT_TZ( STR_TO_DATE( CONCAT(SUBSTRING_INDEX(t.foo,' ',4),' ',SUBSTRING_INDEX(t.foo,' ',-1)),'%a %b %d %T %Y')
, SUBSTRING_INDEX(SUBSTRING_INDEX(t.foo,' ',5),' ',-1)
, '+00:00'
) AS dt_utc
FROM ( SELECT 'Fri Sep 11 01:10:17 PDT 2015' AS foo
UNION ALL SELECT 'Fri Sep 11 01:10:17 PST8PDT 2015'
) t
returns
dt tz dt_utc
------------------- ------- -------------------
2015-09-11 01:10:17 PDT (NULL)
2015-09-11 01:10:17 PST8PDT 2015-09-11 08:10:17
N.B. If mysql.time_zone_name
table doesn't contain entries for PDT
, EDT
, SGT
or other tz values that we want to convert, the CONVERT_TZ
function will return NULL.
FOLLOWUP
Using these types of expressions in the WHERE clause is going to require the expression to be evaluated for every row in the table. Given a range of date values we are looking for, if the information was stored in a DATETIME column, with an appropriate index defined, the optimizer could make use of a range scan operation.
Which leads us to the conclusion that for performance, MEDIUMTEXT is not the best datatype for storing DATETIME values.