How to verify if a Neo4j graph DB exists with Python?


I have this small snippet of code that loads a file into an embedded Neo4j database.

I have two problems with this code, and I can't find documentation that solves them.

I'm following the samples in the documentation to create an index, but: a) How can I detect whether the index already exists? The documentation explains that if the index already exists, it is returned, but in my case it raises an error.

b) When I get a node from the index, I get an error.

from neo4j import GraphDatabase, INCOMING, Evaluation

# Create a database
db = GraphDatabase("c:/temp/graph")

with db.transaction:
    # Create an index for "users" nodes
    # to look for them using some of the properties  

    # HERE I GET AN ERROR WHEN THE INDEX ALREADY EXISTS, BUT THE DOCUMENTATION SAYS THE OPPOSITE.
    users_idx = db.node.indexes.create('users')

    # Create the "users_reference" node to connect all "users" nodes to here
    users_reference = db.node()
    db.reference_node.USERS(users_reference, provider='lucene', type='fulltext')

    '''
    Content of the file
    1,Marc
    2,Didac
    3,Sergi
    4,David
    '''

    f = open("../files/users","r")
    for line in f:
        # Strip the trailing newline so it does not end up in the name
        v = line.strip().split(",")
        i = v[0]
        name = v[1]

        # All write operations happen in a transaction
        user = db.node(id=i, name=name)
        user.INSTANCE_OF(users_reference)
        users_idx['id'][i] = user

# I assume the transaction is closed at this point

# I want to get the node whose "id" property has the value "3"
# and print the "name" property of that node

# HERE I GET AN ERROR WHEN THERE ARE MULTIPLE NODES WITH THE SAME VALUE FOR THE PROPERTY "ID"

c = users_idx['id']['3'].single
print c['name']                

'''
If the file has a duplicated ID, the code above raises an error...
1,Marc
1,Marc_2
1,Marc_3
2,Didac
3,Sergi
4,David
'''    

# Always shut down your database when your application exits
db.shutdown()

Solution

  • In your first example, the documentation is wrong. There is currently only one way to determine whether an index exists: check for a ValueError when getting the index, like this:

    try:
        idx = db.node.indexes.get('my_index')
    except ValueError:
        idx = db.node.indexes.create('my_index')
    

    That should be changed to a more specific exception, since this pattern breaks if something else triggers a ValueError. I will open an issue for that.

    I've just pushed an update to the documentation, and I've added an "exists" method to check whether an index exists. It will be available on PyPI after the next neo4j milestone release.

    if db.node.indexes.exists('my_index'):
        idx = db.node.indexes.get('my_index')
    else:
        idx = db.node.indexes.create('my_index')
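
Either check can be wrapped in a small helper so the calling code does not care which release is installed. The following is only a sketch: `get_or_create_index` is a hypothetical name, and `FakeIndexes` is an in-memory stand-in (not part of neo4j) that exists purely so the example runs without a database. It assumes only what is described above: `get` raises ValueError for an unknown index, and `exists` may or may not be available.

```python
def get_or_create_index(indexes, name):
    """Return the index called `name`, creating it if it is missing.

    `indexes` is assumed to behave like `db.node.indexes`.
    """
    # Prefer the explicit check when the installed release provides it.
    if hasattr(indexes, 'exists'):
        if indexes.exists(name):
            return indexes.get(name)
        return indexes.create(name)
    # Otherwise fall back to catching the ValueError raised by get().
    try:
        return indexes.get(name)
    except ValueError:
        return indexes.create(name)


class FakeIndexes(object):
    """Hypothetical stand-in for db.node.indexes, for illustration only."""

    def __init__(self):
        self._store = {}

    def get(self, name):
        if name not in self._store:
            raise ValueError("No index named %r" % name)
        return self._store[name]

    def create(self, name):
        # Assumption: creating an existing index fails, as in the question.
        # The real exception type is not known here.
        if name in self._store:
            raise RuntimeError("Index %r already exists" % name)
        self._store[name] = {}
        return self._store[name]


indexes = FakeIndexes()
created = get_or_create_index(indexes, 'users')  # created on the first call
fetched = get_or_create_index(indexes, 'users')  # fetched on the second call
```

With the real library the call would presumably look like `users_idx = get_or_create_index(db.node.indexes, 'users')`, inside a transaction as in the question.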
    

    In your second example, I think this is correct behavior. The 'single' property ensures that there is a single result. If you expect a single result but get multiple, that is an error. If you want the first result, you should be able to do something like:

    hits = iter(users_idx['id']['3'])
    c = hits.next()
    hits.close()
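
To make sure the result set is closed even when there are no hits or iteration fails, the same idea can be put in a helper with `try`/`finally`. This is a sketch under assumptions: `first_hit` is a hypothetical name, it assumes the result object itself is iterable and exposes `close()`, and `FakeHits` is a stand-in used only so the example runs without Neo4j.

```python
def first_hit(hits):
    """Return the first item in `hits`, or None if there are no hits.

    The result set is closed in every case.
    """
    try:
        # next() with a default avoids StopIteration on an empty result.
        return next(iter(hits), None)
    finally:
        hits.close()


class FakeHits(object):
    """Hypothetical stand-in for an index query result."""

    def __init__(self, items):
        self._items = list(items)
        self.closed = False

    def __iter__(self):
        return iter(self._items)

    def close(self):
        self.closed = True


# Three nodes share id "1" in the question's second file.
duplicates = FakeHits(['Marc', 'Marc_2', 'Marc_3'])
first_user = first_hit(duplicates)

empty = FakeHits([])
no_user = first_hit(empty)
```

With the real index this would presumably be called as `c = first_hit(users_idx['id']['3'])`, followed by a check for `None` before reading `c['name']`.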