Search code examples
pythonmongodbpymongobson

Pymongo BSON Binary save and retrieve?


I'm working in Python with MongoDB trying to save an array of floats tightly.

I can Create and store correctly *

but I CANNOT RETRIEVE THE DATA IN A USABLE FORMAT.

>>> import random, array, pymongo
>>> from bson.binary import Binary as BsonBinary
>>> con = pymongo.Connection('localhost', 27017)
>>> mm = con['testDatabase']
>>> vals = [random.random() *100 for x in range(1, 5)]
>>> vals
[2.9962593, 64.5582810776, 32.3781311717, 82.0606953423]
>>> varray = array.array('f', vals)
>>> varray
array('f', [2.9962593, 64.5582810776, 32.3781311717, 82.0606953423])
>>> vstring = varray.tostring()
>>> vstring
'\xb7\xc2?@\xd7\x1d\x81B5\x83\x01B\x13\x1f\xa4B'
>>> vbson = BsonBinary(vstring, 5)
>>> vbson
Binary('\xb7\xc2?@\xd7\x1d\x81B5\x83\x01B\x13\x1f\xa4B', 5)
>>> doc1 = { 'something': 1 , 'else' : vbson}
>>> doc1
{'something': 1, 'else': Binary('\xb7\xc2?@\xd7\x1d\x81B5\x83\x01B\x13\x1f\xa4B', 5)}
>>> mm.test1.insert(doc1)
ObjectID('530f7af1d809d80d3db1f635')
>>> gotdoc = mm.test1.find_one()
>>> gotdoc
{u'_id': ObjectId('530f7af1d809d80d3db1f635'), u'something': 3, u'else': Binary('\xb7\xc2?@\xd7\x1d\x81B5\x83\x01B\x13\x1f\xa4B', 5)}
>>> gotfield = gotdoc['else']
>>> gotfield
Binary('\xb7\xc2?@\xd7\x1d\x81B5\x83\x01B\x13\x1f\xa4B', 5)
>>> from bson import BSON
>>> BSON.decode(gotfield)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unbound method decode() must be called with BSON instance as first argument (got Binary instance instead)
>>> gotfield.decode()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xb7 in position 0: ordinal not in range(128)
>>>

Once I get my Python string back, I can get my array of random floats back. But how?


Solution

  • Let's go through the errors:

    1. The first error appears simply because you need an actual BSON object. Note, that you have never encoded any data - creating bson.binary.Binary object does not mean invoking BSON.encode().

    2. And that is where PyMongo cheats you a bit. The bson.binary.Binary is a runtime-patched str or bytes instance (see source). That is why you get the second error: what you call is actually str.decode(), not BSON.decode(). So, gotfield contains the random float data you've stored initially, but the object itself has some different methods (e.g. repr()) bound to it.