python, cassandra, cassandra-driver

Remove newline characters from Cassandra query results


I am using the Python Cassandra driver to query a database. This is the code that I ran:

from cassandra.cluster import Cluster
from cassandra.query import dict_factory

ips = [...]
cluster = Cluster(ips)
session = cluster.connect()
session.row_factory = dict_factory
session.set_keyspace("mykeyspace")
response = session.execute("SELECT * FROM myDB WHERE mykey = 'xyz';")

In the output, I get stray 'n' characters in front of words, where the newline characters used to be.

Example:

"Over the wintry nforest, winds howl in rage nwith no leaves to blow."

Is there a way to solve this issue?


Solution

  • The issue is in your data ingestion, or in whatever is consuming the response, not in the driver's read path. What is stored in your database is a literal n, not the \n character. The Python driver does not escape or otherwise alter the raw bytes; it only decodes them to a string: https://github.com/datastax/python-driver/blob/9869c2a044e5dd76309026437bcda5d01acf99c7/cassandra/cqltypes.py#L693

    If you insert a \n you get one back out:

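    # Assumes `session` is connected with row_factory = dict_factory, as in the question.
    session.execute('CREATE TABLE mydb (mykey text PRIMARY KEY, mycolumn text);')
    # Python turns \n into a real newline before the statement is sent.
    session.execute("INSERT INTO mydb (mykey, mycolumn) VALUES ('xyz', 'a\nb');")
    response = session.execute("SELECT * FROM mydb WHERE mykey = 'xyz';")
    for line in response:
        print(line['mycolumn'])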
    

    correctly outputs:

    a
    b
    

    Is something taking the response and escaping it afterwards?
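
To narrow down where the newline gets lost, it can help to look at the repr() of a stored value instead of printing it. Below is a minimal sketch that needs no database. The helper describe_newlines and the backslash-stripping step are hypothetical; they only illustrate one way an ingestion step could turn an escaped \n into a bare n, and your pipeline may be doing something different.

```python
def describe_newlines(value):
    """Classify a string by the kind of newline marker it contains."""
    if "\n" in value:
        return "real newline"
    if "\\n" in value:
        return "escaped backslash-n"
    return "no newline"


# A value holding a real newline character, as in the INSERT example above.
good = "Over the wintry \nforest"

# A value holding the two characters backslash and n, e.g. from an export
# that escaped its newlines.
escaped = "Over the wintry \\nforest"

# Hypothetical ingestion bug: stripping backslashes leaves only the bare 'n'.
broken = escaped.replace("\\", "")

for label, text in [("good", good), ("escaped", escaped), ("broken", broken)]:
    # repr() makes invisible characters and escape sequences visible.
    print(label, repr(text), "->", describe_newlines(text))
```

Running repr() on line['mycolumn'] from your own query shows which of these three cases you are in. If the value has no backslash at all, as in the broken case, the newline was already gone before the row was written and no change on the reading side can restore it.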