Aerospike: zlib/bz2 store and retrieve doesn't work


I am compressing a string with zlib and storing it in an Aerospike bin. When I retrieve and decompress it, I get "zlib.error: Error -5 while decompressing data: incomplete or truncated stream".

When I compared the original compressed data with the retrieved compressed data, something was missing at the end of the retrieved data.

I am using Aerospike 3.7.3 and Python client 2.0.1.

Please help

Thanks

Update: I tried bz2 as well. It throws "ValueError: couldn't find end of stream" when I retrieve and decompress. It looks like Aerospike is stripping off the last byte, or something else, from the blob.

Update: Posting the code

import aerospike
import bz2

config = {
    'hosts': [
        ( '127.0.0.1', 3000 )
    ],
    'policies': {
        'timeout': 1000 # milliseconds
    }
}

client = aerospike.client(config)
client.connect()

content = "An Aerospike Query"
content_bz2 = bz2.compress(content)

key = ('benchmark', 'myset', 55)
#client.put(key, {'bin0':content_bz2})
(key, meta, bins) =  client.get(key)
print bz2.decompress(bins['bin0'])

I get the following error:

Traceback (most recent call last):
  File "asread.py", line 22, in <module>
    print bz2.decompress(bins['bin0'])
ValueError: couldn't find end of stream

Solution

  • The bz2.compress method returns a string (a Python 2 str), so the client sees that type and tries to convert it to the server's as_str type. Compressed output is arbitrary binary data, and if the client runs into a \0 byte in it, it treats that byte as the end of the string and truncates the value, causing your error.

    Instead, make sure to cast binary data to a bytearray, which the client converts to the server's as_bytes type. On the read operation, bz2.decompress will work with the bytearray data and give you back the original string.

    from __future__ import print_function
    import aerospike
    import bz2
    
    config = {'hosts': [( '33.33.33.91', 3000 )]}
    
    client = aerospike.client(config)
    client.connect()
    
    content = "An Aerospike Query"
    content_bz2 = bytearray(bz2.compress(content))
    
    key = ('test', 'bytesss', 1)
    client.put(key, {'bin0':content_bz2})
    (key, meta, bins) =  client.get(key)
    print(type(bins['bin0']))
    bin0 = bz2.decompress(bins['bin0'])
    print(type(bin0))
    print(bin0)
    

    This prints:

    <type 'bytearray'>
    <type 'str'>
    An Aerospike Query
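
    The truncation described above can be reproduced without a server. The following is only a sketch of the effect, not the client's actual code path: truncate_at_nul is a hypothetical helper that mimics a C-style string copy by keeping the bytes before the first \0. It is written with bytes literals so that it also runs on Python 3.

```python
import bz2


def truncate_at_nul(data):
    """Hypothetical stand-in for a C-string copy: keep only the bytes before the first NUL."""
    return data.split(b"\x00", 1)[0]


content = b"An Aerospike Query"
packed = bz2.compress(content)

# What a string-typed bin would effectively hold if the value is cut at the first NUL.
damaged = truncate_at_nul(packed)
print("compressed length:", len(packed), "after truncation:", len(damaged))

# Decompressing the shortened blob fails, much like the error in the question.
try:
    bz2.decompress(damaged)
    truncated_ok = True
except ValueError as exc:
    truncated_ok = False
    print("truncated copy failed:", exc)

# A bytearray has an explicit length, so embedded NUL bytes survive the round trip.
restored = bz2.decompress(bytearray(packed))
print(restored)
```

    The same reasoning applies to zlib output whenever it happens to contain a \0 byte, which is why wrapping the compressed data in a bytearray before the put is the safer habit for any binary payload.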