I'm trying to represent MongoDB data as a graph in Neo4j using the APOC connector, but I can't wrap my head around the correct syntax. My data in MongoDB looks like this:
{
"_id" : ObjectId("5e88985f788e2ab63ff926d7"),
"role": "member",
"name": "Emmett Brown",
"dob" : "1955-03-19",
"registration_date" : "1985-10-26",
"follows" : []
},
{
"_id" : ObjectId("5e88985f788e2ab63ff926d8"),
"role": "member",
"name": "Marty McFly",
"dob" : "1968-06-09",
"registration_date" : "2015-10-26",
"follows": [{
"id" : [ObjectId("5e88985f788e2ab63ff926d7")]
}]
},
{
"_id" : ObjectId("5e88985f788e2ab63ff926d9"),
"role": "member",
"name": "Biff Tannen",
"dob" : "1959-04-15",
"registration_date" : "2006-09-15",
"follows": [{
"id" : [ObjectId("5e88985f788e2ab63ff926d7"), ObjectId("5e88985f788e2ab63ff926d8")]
}]
}
What I'd like to do is create a graph in Neo4j that looks like this:
CREATE (Emmett:Person)
CREATE (Marty:Person)
CREATE (Biff:Person)
CREATE
(Marty)-[:FOLLOWS]->(Emmett),
(Biff)-[:FOLLOWS]->(Emmett),
(Biff)-[:FOLLOWS]->(Marty)
In other words, I'd like to use each ObjectId under the "follows" key as a destination node. However, since I only have the ids to go on, I have no idea how to create my relationships. Here's what I've come up with so far:
CALL apoc.mongodb.get('mongodb://localhost:27017', 'database_name', 'user_collection', {}) YIELD value AS person
MERGE (p:Person {name:person.name}) ON CREATE SET p.registration_date = person.registration_date
RETURN p
This returns all my nodes and displays them in Neo4j, but I've spent the past two days trying to get the values of my nodes and I just can't do it. Could anyone help with this? Thank you in advance!
I don't have a Mongo instance to play with, so I simulated this with a JSON file. Note that I've collapsed the ObjectId values into plain strings, which I think is how Neo4j handles them. You'd need to replace the first line of the query with your call to apoc.mongodb.get.
[{
"_id" : "5e88985f788e2ab63ff926d7",
"role": "member",
"name": "Emmett Brown",
"dob" : "1955-03-19",
"registration_date" : "1985-10-26",
"follows" : []
},
{
"_id" : "5e88985f788e2ab63ff926d8",
"role": "member",
"name": "Marty McFly",
"dob" : "1968-06-09",
"registration_date" : "2015-10-26",
"follows": [{
"id" : ["5e88985f788e2ab63ff926d7"]
}]
},
{
"_id" : "5e88985f788e2ab63ff926d9",
"role": "member",
"name": "Biff Tannen",
"dob" : "1959-04-15",
"registration_date" : "2006-09-15",
"follows": [{
"id" : ["5e88985f788e2ab63ff926d7", "5e88985f788e2ab63ff926d8"]
}]
}
]
The following creates Person nodes, then runs a second pass that connects them together:
CALL apoc.load.json("example.json") YIELD value AS person
WITH collect(person) AS people
// First pass: create one Person node per document, keyed on the Mongo _id
FOREACH (personDetails IN people |
  MERGE (p:Person { id: personDetails._id })
  ON CREATE SET p.registrationDate = personDetails.registration_date,
                p.name = personDetails.name
)
WITH people
// Second pass: every node now exists, so the ids in "follows" can be matched
UNWIND people AS personDetails
MATCH (follower:Person { id: personDetails._id })
UNWIND personDetails.follows AS followsRecords
MATCH (followed:Person) WHERE followed.id IN followsRecords.id
MERGE (follower)-[:FOLLOWS]->(followed)
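We probably also want a unique constraint on Person.id, ideally created before running the import. It speeds up the MERGE and MATCH lookups on large datasets and guards against weird data issues if we got our query wrong. The syntax below is the older form; on Neo4j 5 and later the same constraint is written as CREATE CONSTRAINT FOR (p:Person) REQUIRE p.id IS UNIQUE.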
CREATE CONSTRAINT ON (p:Person) ASSERT p.id IS UNIQUE
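To sanity-check which relationships the second pass of the query should produce, the same nested unwind can be sketched in plain Python against the sample documents. This is only an illustration, not part of the import: `follows_edges` is a hypothetical helper name, the documents are trimmed to the fields that matter, and it assumes the ObjectIds arrive as plain strings as in the JSON file above.

```python
# Sample documents mirroring the JSON file above (ObjectIds collapsed to strings).
people = [
    {"_id": "5e88985f788e2ab63ff926d7", "name": "Emmett Brown", "follows": []},
    {"_id": "5e88985f788e2ab63ff926d8", "name": "Marty McFly",
     "follows": [{"id": ["5e88985f788e2ab63ff926d7"]}]},
    {"_id": "5e88985f788e2ab63ff926d9", "name": "Biff Tannen",
     "follows": [{"id": ["5e88985f788e2ab63ff926d7",
                         "5e88985f788e2ab63ff926d8"]}]},
]


def follows_edges(docs):
    """Return sorted (follower_name, followed_name) pairs, one per FOLLOWS edge."""
    name_by_id = {doc["_id"]: doc["name"] for doc in docs}
    edges = set()  # a set, so duplicates collapse the way MERGE would collapse them
    for doc in docs:                                   # UNWIND people
        for record in doc.get("follows", []):          # UNWIND personDetails.follows
            for followed_id in record.get("id", []):   # followed.id IN followsRecords.id
                if followed_id in name_by_id:          # like MATCH, unknown ids yield nothing
                    edges.add((doc["name"], name_by_id[followed_id]))
    return sorted(edges)


for follower, followed in follows_edges(people):
    print(f"({follower})-[:FOLLOWS]->({followed})")
```

The three pairs it prints correspond to the three FOLLOWS relationships in the CREATE statement from the question, so if the graph ends up with more or fewer than that, the ids are probably not coming through in the shape the query expects.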