Tags: date, hadoop, timestamp, hbase

Get all rows from the last N days with HBase


I'm trying to write a component that fetches rows written to HBase in the last 5 days (5 is arbitrary). The timestamp I want to use is the default timestamp HBase assigns on write (unless it's problematic for some reason).

I know I can use scan with a timestamp range, but I'm not quite sure how to get the current date in HBase (I'm currently testing in the HBase shell, but eventually I need code that does this). I've tried something like this:

scan 'urls', {COLUMNS => 'urls', TIMERANGE => [SimpleDateFormat.new("yy/MM/dd HH:mm:ss").parse("2016/03/02 00:00:00", ParsePosition.new(0)).getTime(), new Date().getTime()]}

but the shell reports a syntax error: unexpected tCONSTANT. I imported Date, SimpleDateFormat and ParsePosition successfully.

I also looked at other examples but could not find exactly what I needed.

Is there a more elegant way to accomplish this task?

Thanks in advance


Solution

  • In the HBase shell you can pass the TIMERANGE option to scan. Example from the shell's help for scan:

    hbase> scan 't1', {COLUMNS => 'c1', TIMERANGE => [1303668804, 1303668904]}
    

    For the Java client, you can set the time range on the Scan object:

    Scan s = new Scan();
    s.setTimeRange(1303668804L, 1303668904L);
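
Neither snippet above computes the bounds for "the last N days". A minimal sketch of that arithmetic in Java follows, assuming the rows carry the default server-assigned timestamps, which are epoch milliseconds. The class and method names (`LastNDaysRange`, `lastNDays`) are hypothetical and not part of the HBase API; the resulting pair would be passed to `Scan.setTimeRange(minStamp, maxStamp)`.

```java
import java.util.concurrent.TimeUnit;

public class LastNDaysRange {

    // Hypothetical helper (not an HBase API): returns {minStamp, maxStamp}
    // in epoch milliseconds for the `days` days ending at `nowMillis`.
    static long[] lastNDays(int days, long nowMillis) {
        if (days < 0) {
            throw new IllegalArgumentException("days must be non-negative");
        }
        long min = nowMillis - TimeUnit.DAYS.toMillis(days);
        // Clamp at zero so the lower bound is never a negative timestamp.
        return new long[] { Math.max(0L, min), nowMillis };
    }

    public static void main(String[] args) {
        long[] range = lastNDays(5, System.currentTimeMillis());
        System.out.println(range[0] + " .. " + range[1]);
        // With the HBase client on the classpath, the range would be applied as:
        //   Scan s = new Scan();
        //   s.setTimeRange(range[0], range[1]);
    }
}
```

The upper bound of `setTimeRange` is exclusive, so cells stamped exactly at `nowMillis` are not returned; add 1 to the upper bound if they matter.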