
UnpickleError when pushing RethinkDB document to RQ queue


A document that I am retrieving from a RethinkDB database has a timestamp in it (represented as a Python datetime.datetime object with a special tzinfo value of type rethinkdb.ast.RqlTzinfo).

When I push it to an RQ task queue, I get an UnpickleError on the worker side when the task attempts to unpickle the timestamp.

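import rethinkdb as r
from redis import Redis
from rq import Queue

from app.feedupdater.tasks import fetch_feed

# `config` and `db` (an open RethinkDB connection) are defined elsewhere in the app.
feeds = r.table('feeds') \
    .filter(lambda feed: feed['last_fetched'] \
        < (r.now() - config.FEED_REFRESH_INTERVAL)) \
    .pluck('id', 'feed_url', 'last_fetched').run(db)

q = Queue('feed_fetcher', connection=Redis())

for feed in feeds:
    print(feed['last_fetched']) # 2014-09-08 18:35:22.735000+00:00
    result = q.enqueue(fetch_feed, feed)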

The output from the RQ worker:

18:48:34 *** Listening on feed_fetcher...
19:05:01 feed_fetcher: app.feedupdater.tasks.fetch_feed({u'last_fetched': datetime.datetime(2014, 9, 8, 18, 35, 22, 735000, tzinfo=<rethinkdb.ast.RqlTzinfo object at 0x7fdf508ed410>), u'id': u'ccd57063-a61a-4255-af67-94c244ff6bbb', u'feed_url': u'http://feeds.feedburner.com/hacker-news-feed-200?format=xml'}) (0002de8d-978c-4aeb-944e-2559458d114c)
Traceback (most recent call last):
  File "/usr/local/bin/rqworker", line 11, in <module>
    sys.exit(main())
  File "/usr/local/lib/python2.7/dist-packages/rq/scripts/rqworker.py", line 100, in main
    w.work(burst=args.burst)
  File "/usr/local/lib/python2.7/dist-packages/rq/worker.py", line 358, in work
    self.execute_job(job)
  File "/usr/local/lib/python2.7/dist-packages/rq/worker.py", line 422, in execute_job
    self.main_work_horse(job)
  File "/usr/local/lib/python2.7/dist-packages/rq/worker.py", line 457, in main_work_horse
    success = self.perform_job(job)
  File "/usr/local/lib/python2.7/dist-packages/rq/worker.py", line 473, in perform_job
    job.func_name,
  File "/usr/local/lib/python2.7/dist-packages/rq/job.py", line 225, in func_name
    self._unpickle_data()
  File "/usr/local/lib/python2.7/dist-packages/rq/job.py", line 193, in _unpickle_data
    self._func_name, self._instance, self._args, self._kwargs = unpickle(self.data)
  File "/usr/local/lib/python2.7/dist-packages/rq/job.py", line 50, in unpickle
    raise UnpickleError('Could not unpickle.', pickled_string, e)
rq.exceptions.UnpickleError: (u'Could not unpickle.', TypeError('__init__() takes exactly 2 arguments (1 given)', <class 'rethinkdb.ast.RqlTzinfo'>, ()))

I've worked around the issue by replacing the timestamp with an epoch time before pushing it onto the queue. It's a stop-gap, though, and it complicates things: I have to deal with timezones when converting the epoch time back into a datetime object before writing it back to the DB.
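A minimal sketch of that stop-gap, assuming Python 3 and the standard-library timezone.utc in place of the driver's tzinfo. The helper names to_epoch and from_epoch are hypothetical. Restoring into UTC keeps the instant, but the original offset is lost unless it is stored separately.

```python
from datetime import datetime, timedelta, timezone


def to_epoch(stamp):
    """Convert a timezone-aware datetime to seconds since the Unix epoch."""
    if stamp.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return stamp.timestamp()


def from_epoch(seconds):
    """Rebuild a timezone-aware datetime (in UTC) from epoch seconds."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


# Stand-in for feed['last_fetched']; a plain float pickles without trouble.
last_fetched = datetime(2014, 9, 8, 18, 35, 22, 735000, tzinfo=timezone.utc)
payload = {"id": "example-id", "last_fetched": to_epoch(last_fetched)}

# Inside the task, turn the float back into an aware datetime.
restored = from_epoch(payload["last_fetched"])
print(restored.isoformat())
```

Because the payload then holds only built-in types, nothing library-specific has to survive pickling.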

Where does the problem lie (is it RethinkDB's RqlTzinfo object or RQ's unpickling implementation) and is this a legitimate bug or simply my poor implementation?


Solution

  • A fix has been committed and is present in version 1.15.1 onward. This is no longer an issue. See: https://github.com/rethinkdb/rethinkdb/issues/3024.
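For context on why the unpickling failed: the standard-library tzinfo base class asks pickle to rebuild an instance by calling the class with whatever __getinitargs__ returns, or with no arguments if that method is missing. The TypeError in the traceback (an __init__ that needs an argument, called with an empty tuple) is consistent with that. The sketch below reproduces the pattern in Python 3 with hypothetical classes BrokenTz and PicklableTz. It is not the RethinkDB driver's code, and the actual fix in the driver may differ.

```python
import pickle
from datetime import datetime, timedelta, tzinfo


class BrokenTz(tzinfo):
    """Hypothetical tzinfo whose __init__ requires an argument."""

    def __init__(self, offset_minutes):
        self.offset_minutes = offset_minutes

    def utcoffset(self, dt):
        return timedelta(minutes=self.offset_minutes)

    def dst(self, dt):
        return timedelta(0)

    def tzname(self, dt):
        return None


class PicklableTz(BrokenTz):
    """Same behavior, but tells pickle which arguments to pass to __init__."""

    def __getinitargs__(self):
        return (self.offset_minutes,)


def round_trip(tz):
    """Pickle and unpickle a datetime that carries the given tzinfo."""
    stamp = datetime(2014, 9, 8, 18, 35, 22, 735000, tzinfo=tz)
    return pickle.loads(pickle.dumps(stamp))


try:
    round_trip(BrokenTz(0))
except TypeError as exc:
    # Pickling succeeds; the failure only shows up when loading.
    print("unpickling failed:", exc)

restored = round_trip(PicklableTz(0))
print(restored.utcoffset())
```

This also explains why the error only appeared on the worker: the enqueue side pickles the datetime without complaint, and the missing constructor argument surfaces when the job data is loaded.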