Tags: graph, neo4j, graph-databases, py2neo

Neo4j Match and Create takes too long in a 10000 node graph


I have a data model like this:

  • Person node
  • Email node
  • OWNS relationship
  • LISTS relationship
  • KNOWS relationship

Each Person can OWN one Email and LIST multiple Emails (like a contact list; 200 contacts per Person are assumed).

The query I am trying to perform finds all the Persons that OWN an Email that a given Person LISTS, and creates a KNOWS relationship between them.

MATCH (n:Person {uid:'123'}) -[r1:LISTS]-> (m:Email) <-[r2:OWNS]- (l:Person) CREATE UNIQUE (n)-[:KNOWS]->(l)
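
To make the intent of this pattern concrete, here is a minimal in-memory Python sketch of what it matches. It is not Neo4j code: the function name `knows_pairs`, the dictionary layout, and the toy email addresses are all hypothetical, chosen only for illustration. A set stands in for the "create only if missing" behaviour.

```python
from collections import defaultdict


def knows_pairs(lists, owns, uid):
    """Return the (uid, owner) pairs the pattern would connect with KNOWS.

    lists: dict mapping a person uid to the emails that person LISTS
    owns:  dict mapping a person uid to the emails that person OWNS
    """
    # Invert OWNS so each email maps to the people who own it.
    owners_of = defaultdict(set)
    for person, emails in owns.items():
        for email in emails:
            owners_of[email].add(person)

    # Follow LISTS from the start person, then OWNS backwards to each owner.
    pairs = set()  # a set, so a repeated pair is stored only once
    for email in lists.get(uid, ()):
        for owner in owners_of.get(email, ()):
            pairs.add((uid, owner))
    return pairs


# Hypothetical toy data: person 123 lists three emails, two of which have owners.
lists = {"123": ["a@x.org", "b@x.org", "c@x.org"]}
owns = {"456": ["a@x.org"], "789": ["b@x.org"], "123": ["z@x.org"]}

result = knows_pairs(lists, owns, "123")
print(sorted(result))
```

In this sketch the work is proportional to the number of emails the start person lists, not to the size of the whole graph.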

The counts in my current database are as follows:

  • Number of Person nodes: 10948
  • Number of Email nodes: 1951481
  • Number of OWNS rels: 21882
  • Number of LISTS rels: 4376340 (Each Person has 200 unique LISTS rels)

Now my problem is that running this query on the current database takes between 4.3 and 4.8 seconds, which is unacceptable for my needs. I would like to know whether this timing is normal for my data model, or whether I am doing something wrong with the query (or even the model).

Any help would be much appreciated. Also if this is normal for Neo4j please feel free to suggest other graph databases that can handle this kind of model better.

Thank you very much in advance

UPDATE:

My query is: profile match (n: {uid: '4692'}) -[:LISTS]-> (:Email) <-[:OWNS]- (l) create unique (n)-[r:KNOWS]->(l)

The PROFILE command on my query returns this:

Cypher version: CYPHER 2.2, planner: RULE. 3919222 total db hits in 2713 ms.
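
As a rough sanity check on these figures, the short Python calculation below relates the reported db hits to the database counts listed earlier. It is only arithmetic on numbers quoted in this question. The "anchored" element count is an assumption based on the stated 200 LISTS relationships per Person and one owner per Email, and is not a prediction of Neo4j's db-hit metric.

```python
# Figures quoted in the question (PROFILE output and database counts).
total_db_hits = 3_919_222
elapsed_ms = 2_713
email_nodes = 1_951_481

# Assumption taken from the question text: about 200 LISTS relationships
# per Person, and each Email owned by at most one Person.
lists_per_person = 200

# Rough count of graph elements a traversal anchored at one Person would
# touch: the start Person, its LISTS relationships, the listed Emails, and
# at most one OWNS relationship plus one owning Person per Email.
anchored_elements = 1 + 4 * lists_per_person

hits_per_email_node = total_db_hits / email_nodes
hits_per_ms = total_db_hits / elapsed_ms
excess_factor = total_db_hits / anchored_elements

print(f"db hits per Email node in the database: {hits_per_email_node:.2f}")
print(f"db hits per millisecond: {hits_per_ms:.0f}")
print(f"reported hits / anchored element count: {excess_factor:.0f}")
```

The db hits come to roughly two per Email node in the whole database. That is consistent with, though not proof of, a plan that visits every Email node instead of starting from the single Person; only the PROFILE plan itself can confirm this.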



Solution

  • Yes, 4.5 seconds is slow for matching one person from an index, along with its <=100 listed email addresses, and merging a relationship from that person to the single owner of each email.

    The first thing is to make sure you have an index on the uid property of nodes with the :Person label. Check your indices with the SCHEMA command and, if the index is missing, create it with CREATE INDEX ON :Person(uid).

    Secondly, CREATE UNIQUE may or may not do the job, but you will want to use MERGE instead. CREATE UNIQUE is deprecated, and although the two are sometimes equivalent, the operation you want is best expressed with MERGE.

    Thirdly, to find out why the query is slow you can profile it:

    PROFILE
    MATCH (n:Person {uid:'123'})-[:LISTS]->(m:Email)<-[:OWNS]-(l:Person)
    MERGE (n)-[:KNOWS]->(l)
    

    See 1, 2 for details. You may also want to profile your query while forcing the use of either the cost-based or the rule-based query planner, so that you can compare their plans.

    CYPHER planner=cost
    PROFILE
    MATCH (n:Person {uid:'123'})-[:LISTS]->(m:Email)<-[:OWNS]-(l:Person)
    MERGE (n)-[:KNOWS]->(l)
    

    With these you can hopefully find and correct the problem, or update your question with the information to help others help you find it.
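
The planner-specific variants described above can also be generated mechanically, which helps when comparing several queries. The following is a minimal Python sketch that only builds the query text. The helper name `profile_variants` is hypothetical, and the `planner=rule` prefix is assumed to mirror the `planner=cost` form shown in the answer. Running the resulting strings against a server (for example through py2neo) is left out, since that depends on your setup.

```python
def profile_variants(query, planners=("cost", "rule")):
    """Return {planner: query text} with a CYPHER planner=... PROFILE prefix."""
    return {
        planner: f"CYPHER planner={planner}\nPROFILE\n{query.strip()}"
        for planner in planners
    }


# The query from the answer, with the KNOWS target written as a node: (l).
QUERY = """
MATCH (n:Person {uid:'123'})-[:LISTS]->(m:Email)<-[:OWNS]-(l:Person)
MERGE (n)-[:KNOWS]->(l)
"""

variants = profile_variants(QUERY)
for planner, text in variants.items():
    print(f"-- {planner} planner --")
    print(text)
```

Each generated string can be pasted into the Neo4j shell or browser, and the resulting plans and db-hit totals compared side by side.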