
Filtering dates within Blaze Table


I'm using Blaze (0.6.3) with Anaconda 2.1.0 (on Python 2.7.8). I'm trying to filter a Table's rows by date.

The mock TSV file is the following:

name    amount  date
foo 100 2001-05-11 08:54:48.063856
bar 1000    0001-01-01 00:00:00.0
baz 10000   1970-01-02 00:00:00.0

The Python code is:

from blaze import *
from datetime import datetime
data = Table(CSV('mock.tsv'))

data[data.name > 'bar']
data[data.amount > 1000]
data[data.date > datetime(1970,1,1)]

The first two filters are ok, but the third one throws a SyntaxError.

It all seems to boil down to the following:

lambda (name, amount, date): date > (1970-01-01 00:00:00)

which is syntactically invalid. Somehow, somewhere, datetime(1970,1,1) was translated to datetime(1970-01-01 00:00:00), and then the datetime(...) wrapper was dropped. Blaze itself recognizes the date column as having the ?datetime type, which is what I want, but then it fails in the comparison.
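
The broken expression is consistent with a datetime being pasted into generated source through str() rather than repr(). That is only a guess at the mechanism, not something confirmed from Blaze's code. A minimal sketch of the difference (written for Python 3, so the lambda takes a single argument instead of a tuple, and the helper name compiles is made up for illustration):

```python
import datetime

cutoff = datetime.datetime(1970, 1, 1)

# str() yields "1970-01-01 00:00:00", which is not valid Python source.
bad_source = "lambda date: date > (%s)" % str(cutoff)
# repr() yields "datetime.datetime(1970, 1, 1, 0, 0)", which is.
good_source = "lambda date: date > (%r)" % cutoff


def compiles(source):
    """Return True if source parses as a Python expression."""
    try:
        compile(source, "<generated>", "eval")
        return True
    except SyntaxError:
        return False


print(bad_source, compiles(bad_source))
print(good_source, compiles(good_source))

# The repr-based source can be evaluated once the datetime module is in scope.
is_after_cutoff = eval(good_source, {"datetime": datetime})
print(is_after_cutoff(datetime.datetime(1970, 1, 2)))
```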

Am I using it the wrong way?


Solution

  • This was an older bug that has since been fixed. Here it is working with the development version. I believe that the latest stable release on Anaconda (0.6.5) should work fine as well.

    In [1]: !cat tmp/myfile.csv
    name, amount, date
    foo, 100, 2001-05-11 08:54:48.063856
    bar, 1000, 0001-01-01 00:00:00.0
    baz, 10000, 1970-01-02 00:00:00.0
    
    In [2]: from blaze import *
    
    In [3]: data = Table('tmp/myfile.csv')
    
    In [4]: from datetime import datetime
    
    In [5]: data[data.date > datetime(1970,1,1)]
    Out[5]: 
      name  amount                       date
    0  foo     100 2001-05-11 08:54:48.063856
    1  baz   10000        1970-01-02 00:00:00
    

    Updating Blaze should solve your problem:

    conda update blaze
    

    Also, Blaze is happy to coerce your strings to the appropriate type, in case you would rather not create the datetime yourself:

    In [6]: data[data.date > '1970-01-01']
    Out[6]: 
      name  amount                       date
    0  foo     100 2001-05-11 08:54:48.063856
    1  baz   10000        1970-01-02 00:00:00
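
If upgrading is not an option, the same filter can be done with the standard library alone. This is a rough sketch rather than a Blaze feature: it assumes the file is tab-separated with the three columns from the question, and the names MOCK_TSV, DATE_FORMAT and rows_after are invented for the example. It targets Python 3.

```python
import csv
import io
from datetime import datetime

# Stand-in for mock.tsv, with tab-separated columns as in the question.
MOCK_TSV = (
    "name\tamount\tdate\n"
    "foo\t100\t2001-05-11 08:54:48.063856\n"
    "bar\t1000\t0001-01-01 00:00:00.0\n"
    "baz\t10000\t1970-01-02 00:00:00.0\n"
)

# Assumed timestamp layout: date, time, and fractional seconds.
DATE_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def rows_after(handle, cutoff):
    """Return the rows whose 'date' column is strictly later than cutoff."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [
        row for row in reader
        if datetime.strptime(row["date"], DATE_FORMAT) > cutoff
    ]


recent = rows_after(io.StringIO(MOCK_TSV), datetime(1970, 1, 1))
names = [row["name"] for row in recent]
print(names)
```

To read the real file, pass open('mock.tsv', newline='') instead of the io.StringIO object. Every value stays a string here, so amount would need int() before any numeric comparison.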