I'm trying to import data into a clean Neo4j graph database using py2neo version 3. I've defined several node types (shown below), and everything seemed to be going well, except that the nodes weren't showing up in my Neo4j browser.
Here's the relevant import code; I've verified that the records load properly into Python variables.
for row in data:
    ds = DataSource()
    # parse Source of Information column as a list, trimming whitespace
    ds.uri = list(map(str.strip, row['data_source'].split(',')))
    ds.description = row['data_source_description']
    graph.merge(ds)
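For reference, here is a minimal standalone sketch of what the parsing line in that loop does. The sample row and the `parse_sources` helper are made up for illustration; my real column values aren't shown here.

```python
# Hypothetical sample row; the real spreadsheet values are not shown in this post.
row = {
    "data_source": " http://example.org/a ,http://example.org/b ",
    "data_source_description": "Two example sources",
}


def parse_sources(cell):
    """Split a comma-separated cell into a list of whitespace-trimmed strings."""
    return list(map(str.strip, cell.split(",")))


uris = parse_sources(row["data_source"])
print(uris)
```

Note that the result is always a list, even when the cell holds a single value.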
But when I tried to do graph.exists(ds), I got back the following set of errors / tracebacks:
Traceback (most recent call last):
  File "mydir/venv/lib/python3.5/site-packages/py2neo/database/__init__.py", line 1139, in exists
    return subgraph.__db_exists__(self)
AttributeError: 'DataSource' object has no attribute '__db_exists__'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "mydir/venv/lib/python3.5/site-packages/py2neo/database/__init__.py", line 478, in exists
    return self.begin(autocommit=True).exists(subgraph)
  File "mydir/venv/lib/python3.5/site-packages/py2neo/database/__init__.py", line 1141, in exists
    raise TypeError("No method defined to determine the existence of object %r" % subgraph)
TypeError: No method defined to determine the existence of object <DataSource uri=['my_uri']>
Much to my surprise, I can't find another forum post discussing this problem. I'm guessing that there's a problem inheriting from GraphObject, but there doesn't seem to be an explicit definition of a __db_exists__ property for GraphObject, either. In fact, the only place I can find that property mentioned is in the definition of the exists function, when it generates this error.
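As far as I can tell from the traceback alone, exists delegates to a __db_exists__ hook on the object it is given and converts a missing hook into a TypeError. The sketch below reproduces that pattern with stand-in classes; `FakeTransaction`, `WithHook`, and `WithoutHook` are hypothetical names of mine, not py2neo classes, and the real implementation may differ in its details.

```python
class FakeTransaction:
    """Stand-in for the py2neo transaction; NOT the real class."""

    def exists(self, subgraph):
        try:
            # Delegate the check to a hook method on the object itself.
            return subgraph.__db_exists__(self)
        except AttributeError:
            # No hook available: report it as a TypeError, as in the traceback.
            raise TypeError(
                "No method defined to determine the existence of object %r" % subgraph
            )


class WithHook:
    """Hypothetical object that provides the hook."""

    def __db_exists__(self, tx):
        return True


class WithoutHook:
    """Hypothetical object that lacks the hook, like my DataSource instance."""


tx = FakeTransaction()
print(tx.exists(WithHook()))

try:
    tx.exists(WithoutHook())
except TypeError as error:
    print("TypeError:", error)
```

This would explain why the two tracebacks are chained: the AttributeError is raised first, and the TypeError is raised while handling it.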
Can anyone see what I'm doing wrong here?
The node class definitions are as follows:
from py2neo.ogm import GraphObject, Property, Related, RelatedTo, RelatedFrom


class Content(GraphObject):  # group Person and Institution
    pass


class Person(Content):
    __primarykey__ = 'name'
    name = Property()
    in_scholar_names = Property()
    #
    mentored = RelatedTo('Person')
    mentored_by = RelatedFrom('Person', 'MENTORED')
    worked_alongside = Related('Person', 'WORKED_ALONGSIDE')
    studied_at = RelatedTo('Institution')
    worked_at = RelatedTo('Institution')
    tagged = RelatedTo('Tag')
    member_of = RelatedTo('Institution')
    last_update = RelatedTo('UpdateLog')

    def __lt__(self, other):
        # sort people by last name
        return self.name.split()[-1] < other.name.split()[-1]


class Institution(Content):
    __primarykey__ = 'name'
    #
    name = Property()
    location = Property()
    type = Property()
    carnegie_class = Property()
    #
    students = RelatedFrom('Person', 'STUDIED_AT')
    employees = RelatedFrom('Person', 'WORKED_AT')
    members = RelatedFrom('Person', 'MEMBER_OF')
    last_update = RelatedTo('UpdateLog')

    def __lt__(self, other):
        return self.name < other.name


class User(GraphObject):
    __primarykey__ = 'username'
    username = Property()
    joined = Property()
    last_access = Property()
    active = Property()
    contributed = RelatedTo('UpdateLog')


class Provenance(GraphObject):  # group UpdateLog and DataSource
    pass
#


class UpdateLog(Provenance):
    __primarykey__ = 'id'
    id = Property()
    timestamp = Property()
    query = Property()
    previous = RelatedTo('UpdateLog', 'LAST_UPDATE')
    next = RelatedFrom('UpdateLog', 'LAST_UPDATE')
    based_on = RelatedTo('Provenance', 'BASED_ON')
    affected_nodes = RelatedFrom('Content', 'LAST_UPDATE')
    contributed_by = RelatedFrom('User', 'CONTRIBUTED')


class DataSource(Provenance):
    __primarykey__ = 'uri'
    id = Property()
    description = Property()
    uri = Property()
    source_for = RelatedFrom('UpdateLog', 'BASED_ON')


class Tag(GraphObject):
    __primarykey__ = 'name'
    name = Property()
    description = Property()
    see_also = Related('Tag')
    tagged = RelatedFrom('Content')
Okay, I think I figured it out. I had been learning py2neo in the context of Flask, where all those class definitions are important and useful for generating views (web pages) of the relationships on a given node.
But for the data import script I'm currently writing, i.e. the one that actually creates the nodes and relationships in the first place, I need to use the plain Node and Relationship classes and pass the node type as an argument to the constructor. This updated version of the original code above produces no errors, and graph.exists(ds) returns True afterward:
for row in data:
    ds = Node("DataSource")
    # parse Source of Information column as a list, trimming whitespace
    ds['uri'] = list(map(str.strip, row['data_source'].split(',')))
    ds['description'] = row['data_source_description']
    graph.merge(ds)
Two other discoveries of note:
- The class to use here is Node, not GraphObject (even though GraphObject was the correct class to inherit from in the context of Flask).
- With the Node class, I have to use dict-style assignment of properties, with square brackets and the key names as quoted strings. The dot notation was off base here, and I'm surprised I didn't get more errors thrown, and sooner.
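My best guess as to why dot notation never raised an error on a plain node: Python lets you set arbitrary attributes on most objects, so the assignment succeeds but never reaches the stored properties. The sketch below shows the effect with a stand-in class; `PropertyBag` is a hypothetical name of mine, not py2neo's Node, and I haven't checked how Node handles this internally.

```python
class PropertyBag:
    """Hypothetical stand-in for a node-like object; NOT py2neo's Node."""

    def __init__(self, label):
        self.label = label
        self._properties = {}

    def __setitem__(self, key, value):
        # Dict-style assignment stores a real property.
        self._properties[key] = value

    def __getitem__(self, key):
        # Missing properties come back as None rather than raising.
        return self._properties.get(key)

    def properties(self):
        return dict(self._properties)


ds = PropertyBag("DataSource")
ds["uri"] = ["my_uri"]                    # stored as a property
ds.description = "set with dot notation"  # plain Python attribute, no error
print(ds.properties())
```

Only the key assigned with square brackets ends up among the properties; the dot-notation value sits on the Python object and goes no further.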