
UnicodeDecodeError while creating nodes in neo4j using python (py2neo)


I am creating a set of nodes in Neo4j using the py2neo package in Python. A similar piece of code worked for another set of nodes, but it fails in this case.

from py2neo import *
rf = open('dataset.txt','r')
sf = rf.read().split('\n')
rf.close()

L = []

for i in range(len(sf)):
    X = sf[i].split('\t')
    L.append(X)

for i in range(len(L)):
    L[i][0] = int(L[i][0])
    L[i][1] = int(L[i][1])
    L[i][4] = int(L[i][4])
    L[i][5] = float(L[i][5])
    L[i][6] = float(L[i][6])
    L[i][8] = float(L[i][8])
    L[i][9] = float(L[i][9])
    L[i][10] = float(L[i][10])
    L[i][19] = float(L[i][19])

def conGraph():
    authenticate("localhost:7474","neo4j","neo")
    graph = Graph("http://localhost:7474/db/data/")
    return graph

def createProducts():
    graph = conGraph()
    L1, L2, L3, L4, L5, L6 = [], [], [], [], [], []
    for i in range(len(L)):
        if L[i][17] not in L1:
            L1.append(L[i][17])
            L2.append(L[i][15])
            L3.append(L[i][16])
            L4.append(L[i][18])
            L5.append(float(L[i][9]))
            L6.append(float(L[i][19]))
    for i in range(len(L1)):
        p = Node("Product", name = L1[i], category = L2[i], subcategory = L3[i], container = L4[i], unitprice = L5[i], basemargin = L6[i])
        graph.create(p)

createProducts()

Only the 1st node was created and then the following error occurred:

Traceback (most recent call last):
  File "C:\Documents and Settings\Administrator\Desktop\AstroMite\lambda\test.py", line 44, in <module>
    createProducts()
  File "C:\Documents and Settings\Administrator\Desktop\AstroMite\lambda\test.py", line 41, in createProducts
    p = Node("Product", name = L1[i], category = L2[i], subcategory = L3[i], container = L4[i], unitprice = L5[i], basemargin = L6[i])
  File "C:\Python27\lib\site-packages\py2neo\core.py", line 1458, in __init__
    PropertyContainer.__init__(self, **properties)
  File "C:\Python27\lib\site-packages\py2neo\core.py", line 1223, in __init__
    self.__properties = PropertySet(properties)
  File "C:\Python27\lib\site-packages\py2neo\core.py", line 1110, in __init__
    self.update(iterable, **kwargs)
  File "C:\Python27\lib\site-packages\py2neo\core.py", line 1168, in update
    self[key] = value
  File "C:\Python27\lib\site-packages\py2neo\core.py", line 1139, in __setitem__
    dict.__setitem__(self, key, cast_property(value))
  File "C:\Python27\lib\site-packages\py2neo\types.py", line 55, in cast_property
    value = ustr(value)
  File "C:\Python27\lib\site-packages\py2neo\util.py", line 210, in ustr
    return s.decode(encoding)
  File "C:\Python27\lib\encodings\utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xae in position 8: invalid start byte

I searched for relevant questions but did not find any specific answer.


Solution

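  • Your error is not caused by Neo4j itself. The problem is the encoding of your dataset.txt file.

    Your traceback shows Python 2.7, where rf.read() returns raw byte strings without decoding them. When you pass such a byte string as a node property, py2neo tries to decode it as UTF-8 (the s.decode(encoding) call in ustr in your traceback). It hits byte 0xae, which is not a valid UTF-8 start byte, so the decode fails.

    Your file is most likely not UTF-8 encoded. Byte 0xae is the "®" sign in ISO-8859-1 and Windows-1252, so one of those is a likely candidate. Figure out the character encoding and decode the file with it when you read it. In Python 2.7 the built-in open() does not accept an encoding argument, so use io.open instead (in Python 3 the built-in open() accepts it directly), e.g.:

    import io
    f = io.open('dataset.txt', encoding = "ISO-8859-1")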
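    As a minimal sketch of the point above, assuming the file turns out to be ISO-8859-1: the sample row and the temporary file below are made up for illustration and are not taken from your dataset. The sketch shows that a byte string containing 0xae cannot be decoded as UTF-8, and that reading the same bytes with io.open and an explicit encoding yields unicode strings. io.open is used here because it accepts an encoding argument on both Python 2.7 and Python 3.

```python
import io
import os
import tempfile

# Hypothetical sample row: "Product\xae" is "Product" followed by the
# registered-trademark sign, encoded as ISO-8859-1.
raw = b"1\tProduct\xae\t9.99\n"

# Decoding these bytes as UTF-8 fails with the same kind of error
# as in the traceback.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Write the bytes to a temporary file, then read it back as text
# with an explicit encoding.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "wb") as f:
    f.write(raw)

with io.open(path, encoding="ISO-8859-1") as f:
    rows = [line.split(u"\t") for line in f.read().splitlines()]
os.remove(path)

# A unicode string, which no longer needs to be decoded from bytes.
name = rows[0][1]
```

    Values read this way are already unicode strings, so a UTF-8 decode of raw bytes should no longer be attempted when they are used as node properties.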
    

    There are a lot of relevant questions for this:

    UnicodeDecodeError: 'utf-8' codec can't decode byte

    UnicodeDecodeError: 'utf-8' codec can't decode byte error

    UnicodeDecodeError: 'utf8' codec can't decode byte 0x9c
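    If you do not know the encoding in advance, it can help to first find out which lines contain bytes that are not valid UTF-8. The following is only a sketch: find_non_utf8_lines is a hypothetical helper name and the sample data is made up. For a real file you would read the bytes with open('dataset.txt', 'rb').

```python
def find_non_utf8_lines(data):
    """Return (line_number, byte_offset) for each line that is not valid UTF-8."""
    bad = []
    for number, line in enumerate(data.split(b"\n"), 1):
        try:
            line.decode("utf-8")
        except UnicodeDecodeError as exc:
            # exc.start is the offset of the first undecodable byte in the line.
            bad.append((number, exc.start))
    return bad


# Hypothetical sample data with one non-UTF-8 byte (0xae) on the second line.
sample = b"1\tPaper\n2\tProduct\xae\n3\tPens\n"
bad_lines = find_non_utf8_lines(sample)
```

    Looking at the reported lines in the source system or a hex viewer usually makes it clear which character was intended, and from that which encoding the file uses.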