I have a hive table that contains a column called timestamp. The timestamp is a bigint field generated from java System.currenttimemillis()
. I suppose it should be in UTC. Right now I am trying to select records from 1 hour ago. I know in MySQL you can do something like:
SELECT * FROM table WHERE datetimefield >= DATE_SUB(NOW(), INTERVAL 1 HOUR)
In hive, it seems like NOW()
is missing. I did some searching and find unix_timestamp()
. I should be able to get the current UTC time in milliseconds by doing a unix_timestamp()*1000
.
So if i want to get records from 1 hour ago I am thinking about doing something like:
SELECT * FROM hivetable WHERE datetimefield >= (unix_timestamp()*1000-3600000);
Can someone suggest if it's the right way to approach this problem? Also what if I want to select like 1 day ago? Seems inconvenient to convert that to milliseconds. Any help or suggested readings will be highly appreciated. Thanks in advance for your help.
Yes unix_timestamp()
gets you the seconds elapsed since Unix epoch. You can subtract 60*60*1000 milliseconds and compare your field to get the desired records.
For Hive 1.2.0 and higher you can use current_timestamp
select *
from hivetable
where
datetimefield >= ((unix_timestamp()*1000) - 3600000);
For 1 day,convert the milliseconds to date format and use date_sub
select *
from hivetable
where
from_unixtime(unix_timestamp(datetimefield,'MM-dd-yyyy HH:mm:ss')) >=
date_sub(from_unixtime(unix_timestamp()),1);