I am writing a cron script in Python for a Redis cluster, using redis-py-cluster only to read data from a prod server. A separate Java application writes to the Redis cluster with Snappy compression and the Java string codec (UTF-8).
I am able to read the data but not able to decode it.
from rediscluster import RedisCluster
import snappy
host, port ="127.0.0.1", "30001"
startup_nodes = [{"host": host, "port": port}]
print("Trying connecting to redis cluster host=" + host + ", port=" + str(port))
rc = RedisCluster(startup_nodes=startup_nodes, max_connections=32, decode_responses=True)
print("Connected", rc)
print("Reading all keys, value ...\n\n")
for key in rc.scan_iter("uidx:*"):
    value = rc.get(key)
    #uncompress = snappy.uncompress(value, decoding="utf-8")
    print(key, value)
    print('\n')
print("Done. exit()")
exit()
With decode_responses=False and the snappy line left commented out, this works fine. However, changing to decode_responses=True throws the error below. My guess is that it is not able to get the correct decoder.
Traceback (most recent call last):
  File "splooks_cron.py", line 22, in <module>
    print(key, rc.get(key))
  File "/Library/Python/2.7/site-packages/redis/client.py", line 1207, in get
    return self.execute_command('GET', name)
  File "/Library/Python/2.7/site-packages/rediscluster/utils.py", line 101, in inner
    return func(*args, **kwargs)
  File "/Library/Python/2.7/site-packages/rediscluster/client.py", line 410, in execute_command
    return self.parse_response(r, command, **kwargs)
  File "/Library/Python/2.7/site-packages/redis/client.py", line 768, in parse_response
    response = connection.read_response()
  File "/Library/Python/2.7/site-packages/redis/connection.py", line 636, in read_response
    raise e
: 'utf8' codec can't decode byte 0x82 in position 0: invalid start byte
PS: Uncommenting the line uncompress = snappy.uncompress(value, decoding="utf-8") breaks with this error:
Traceback (most recent call last):
  File "splooks_cron.py", line 27, in <module>
    uncompress = snappy.uncompress(value, decoding="utf-8")
  File "/Library/Python/2.7/site-packages/snappy/snappy.py", line 91, in uncompress
    return _uncompress(data).decode(decoding)
snappy.UncompressError: Error while decompressing: invalid input
After hours of debugging, I was finally able to solve this.
I am using the xerial/snappy-java compressor in the Java code that writes to the Redis cluster. The interesting thing is that, during compression, xerial's SnappyOutputStream adds an offset (a header) at the beginning of the compressed data. In my case it looks something like this:
"\x82SNAPPY\x00\x00\x00\x00\x01\x00\x00\x00\x01\x00\x00\x01\xb6\x8b\x06\\******actual data here*****
Because of this, the decompressor could not recognize the input. I modified the code as below to remove the offset from the value (read with decode_responses=False, so the value arrives as raw bytes). It's working fine now.
for key in rc.scan_iter("uidx:*"):
    value = rc.get(key)
    # in my case the offset was 20; utf-8 is the default encoder/decoder for snappy
    # https://github.com/andrix/python-snappy/blob/master/snappy/snappy.py
    uncompress_value = snappy.decompress(value[20:])
    print(key, uncompress_value)
    print('\n')
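The hard-coded 20 above appears to be the 16-byte stream header (8-byte magic "\x82SNAPPY\x00", 4-byte version, 4-byte compatible version) plus a 4-byte big-endian length for the first compressed block. That matches the bytes shown earlier, but I have not checked it against every snappy-java version, so treat it as an assumption. If that layout holds, value[20:] only works while the payload fits in a single block. Below is a rough sketch that walks every block instead. The names split_xerial_chunks and decompress_xerial are my own helpers, not part of python-snappy, and the actual decompressor is passed in as an argument.

```python
import struct

# Assumed layout of a xerial SnappyOutputStream payload:
#   8-byte magic + 4-byte version + 4-byte compatible version,
#   then repeated [4-byte big-endian block length][raw Snappy block].
XERIAL_MAGIC = b"\x82SNAPPY\x00"
HEADER_LEN = len(XERIAL_MAGIC) + 4 + 4  # 16 bytes


def split_xerial_chunks(value):
    """Return the raw Snappy blocks contained in the payload."""
    if not value.startswith(XERIAL_MAGIC):
        # No stream header: treat the whole value as one raw Snappy block.
        return [value]
    chunks = []
    pos = HEADER_LEN
    while pos < len(value):
        # Read the compressed length of the next block.
        (size,) = struct.unpack_from(">i", value, pos)
        pos += 4
        if size < 0 or pos + size > len(value):
            raise ValueError("truncated or corrupt block at offset %d" % (pos - 4))
        chunks.append(value[pos:pos + size])
        pos += size
    return chunks


def decompress_xerial(value, decompress_block):
    """Decompress each block with decompress_block and join the results."""
    return b"".join(decompress_block(chunk) for chunk in split_xerial_chunks(value))
```

In the loop above this would be called as decompress_xerial(rc.get(key), snappy.decompress), again with decode_responses=False so that the value is raw bytes.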