python-3.x neo4j py2neo

py2neo v3 AttributeError: object has no attribute '__db_exists__'


I'm trying to import data into a clean Neo4j graph database using py2neo version 3. I've defined several node types (listed below), and everything seemed to be going well, except that the nodes weren't showing up in my Neo4j browser.

Here's the relevant import code; I've verified that the records load properly into Python variables.

for row in data:    
    ds = DataSource()
    #   parse Source of Information column as a list, trimming whitespace
    ds.uri = list(map(str.strip, row['data_source'].split(',')))
    ds.description = row['data_source_description']
    graph.merge(ds)
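
The parsing step inside the loop can be checked on its own, without a database. A minimal sketch, assuming the data_source column holds comma-separated values; the helper name parse_sources and the sample row are hypothetical, not part of the original script:

```python
def parse_sources(cell):
    """Split a comma-separated cell into a list of whitespace-trimmed strings."""
    return list(map(str.strip, cell.split(',')))


# Hypothetical sample row, for illustration only.
row = {'data_source': ' http://example.org/a , http://example.org/b'}
sources = parse_sources(row['data_source'])
print(sources)
```

Note that an empty cell produces a list containing one empty string rather than an empty list, which may matter when the result is used as a key.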

But when I called graph.exists(ds), I got the following pair of chained tracebacks:

Traceback (most recent call last):
  File "mydir/venv/lib/python3.5/site-packages/py2neo/database/__init__.py", line 1139, in exists
    return subgraph.__db_exists__(self)
AttributeError: 'DataSource' object has no attribute '__db_exists__'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "mydir/venv/lib/python3.5/site-packages/py2neo/database/__init__.py", line 478, in exists
    return self.begin(autocommit=True).exists(subgraph)
  File "mydir/venv/lib/python3.5/site-packages/py2neo/database/__init__.py", line 1141, in exists
    raise TypeError("No method defined to determine the existence of object %r" % subgraph)
TypeError: No method defined to determine the existence of object <DataSource uri=['my_uri']>

Much to my surprise, I can't find another forum post discussing this problem. My guess is that something goes wrong when inheriting from GraphObject, but GraphObject doesn't seem to define a __db_exists__ method either. In fact, the only place I can find that name mentioned is in the exists function itself, at the point where it raises this error.
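
Judging from the traceback alone, exists() appears to call a __db_exists__ hook on whatever object it receives and to turn a missing hook into a TypeError. The following is a simplified, self-contained sketch of that pattern, not py2neo's actual source; FakeGraph, WithHook and WithoutHook are hypothetical stand-ins:

```python
class FakeGraph:
    """Stand-in for a graph whose exists() delegates to a hook on its argument."""

    def exists(self, subgraph):
        try:
            # Delegate the check to the object itself, as the traceback shows.
            return subgraph.__db_exists__(self)
        except AttributeError:
            raise TypeError(
                "No method defined to determine the existence of object %r" % subgraph
            )


class WithHook:
    """Hypothetical object that implements the hook."""

    def __db_exists__(self, graph):
        return True


class WithoutHook:
    """Hypothetical object that lacks the hook, like DataSource in the traceback."""


graph = FakeGraph()
print(graph.exists(WithHook()))

try:
    graph.exists(WithoutHook())
except TypeError as error:
    print("TypeError:", error)
```

Under this reading, the error says nothing about whether the node is in the database; it only says that the object passed in does not provide the hook.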

Can anyone see what I'm doing wrong here?

The node class definitions are as follows:

from py2neo.ogm import GraphObject, Property, Related, RelatedTo, RelatedFrom

class Content(GraphObject):             # group Person and Institution
    pass

class Person(Content):
    __primarykey__ = 'name'

    name = Property()
    in_scholar_names = Property()
    mentored = RelatedTo('Person')
    mentored_by = RelatedFrom('Person', 'MENTORED')
    worked_alongside = Related('Person', 'WORKED_ALONGSIDE')
    studied_at = RelatedTo('Institution')
    worked_at = RelatedTo('Institution')
    tagged = RelatedTo('Tag')
    member_of = RelatedTo('Institution')

    last_update = RelatedTo('UpdateLog')

    def __lt__(self, other):
        return self.name.split()[-1] < other.name.split()[-1]

class Institution(Content):
    __primarykey__ = 'name'
    name = Property()
    location = Property()
    type = Property()
    carnegie_class = Property()
    students = RelatedFrom('Person', 'STUDIED_AT')
    employees = RelatedFrom('Person', 'WORKED_AT')
    members = RelatedFrom('Person', 'MEMBER_OF')

    last_update = RelatedTo('UpdateLog')

    def __lt__(self, other):
        return self.name < other.name


class User(GraphObject):
    __primarykey__ = 'username'

    username = Property()
    joined = Property()
    last_access = Property()
    active = Property()

    contributed = RelatedTo('UpdateLog')


class Provenance(GraphObject):          # group UpdateLog and DataSource
    pass

class UpdateLog(Provenance):
    __primarykey__ = 'id'

    id = Property()
    timestamp = Property()
    query = Property()

    previous = RelatedTo('UpdateLog', 'LAST_UPDATE')
    next = RelatedFrom('UpdateLog', 'LAST_UPDATE')
    based_on = RelatedTo('Provenance', 'BASED_ON')

    affected_nodes = RelatedFrom('Content', 'LAST_UPDATE')
    contributed_by = RelatedFrom('User', 'CONTRIBUTED')

class DataSource(Provenance):
    __primarykey__ = 'uri'

    id = Property()
    description = Property()
    uri = Property()

    source_for = RelatedFrom('UpdateLog', 'BASED_ON')


class Tag(GraphObject):
    __primarykey__ = 'name'

    name = Property()
    description = Property()

    see_also = Related('Tag')
    tagged = RelatedFrom('Content')

Solution

  • Okay, I think I figured it out. I had been learning py2neo in the context of Flask, where all those class definitions are important and useful for generating views (web pages) of the relationships on a given node.

    But for the data import script I'm currently writing, i.e. the one that actually creates the nodes and relationships in the first place, I need to use the plain Node and Relationship classes and pass the node label (or relationship type) as an argument to the constructor. This updated version of the original code above produces no errors, and graph.exists(ds) returns True afterward:

    from py2neo import Node

    for row in data:
        ds = Node("DataSource")
        #   parse Source of Information column as a list, trimming whitespace
        ds['uri'] = list(map(str.strip, row['data_source'].split(',')))
        ds['description'] = row['data_source_description']
        graph.merge(ds)
    

    Two other discoveries of note:

    1. My class inheritance was off the mark to begin with: for this script I should have been inheriting from Node, not GraphObject (even though GraphObject was the right base class in the Flask context).
    2. With the Node class, properties have to be assigned dict-style, using square brackets with the key names as quoted strings. The dot notation was off base here, and I'm surprised it didn't throw errors sooner.
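
One plausible reason the dot notation failed silently: on an ordinary Python object, assigning to an unknown attribute just creates that attribute and raises nothing. A minimal sketch with a hypothetical dict-backed stand-in (PropertyBag is not py2neo's Node, and whether Node behaves the same way is an assumption):

```python
class PropertyBag:
    """Hypothetical stand-in that keeps its stored properties in an internal dict."""

    def __init__(self, label):
        self.label = label
        self._properties = {}

    def __setitem__(self, key, value):
        # Item assignment records the value as a stored property.
        self._properties[key] = value

    def __getitem__(self, key):
        return self._properties[key]

    def properties(self):
        # Return a copy so callers cannot mutate the internal dict.
        return dict(self._properties)


ds = PropertyBag("DataSource")
ds['uri'] = ['my_uri']            # item assignment: recorded as a property
ds.description = 'some text'      # attribute assignment: plain attribute, no error

print(ds.properties())
print('description' in ds.properties())
```

In this sketch only the value set with square brackets ends up among the stored properties, while the attribute set with dot notation lives on the Python object alone.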