
How to use contains within Neo4j cypher?


I was recently handed a Neo4j database. From what I have read in the documentation, it does not seem like a large dataset: it currently has 11 nodes and a few hundred thousand edges. I am unsure whether the size or the attributes of the database are slowing down processing.

Since the query is pretty big, I will post it in full at the end of the question.

If I use a WHERE clause with CONTAINS, the query returns its result in 7-8 seconds.

MATCH (contact:Contacts)    
where lower(contact.Name) contains lower('Rick')          
WITH contact         
ORDER BY contact.Source asc         
SKIP 0 LIMIT 20

The same query returns in a couple of milliseconds when written as follows, but it only returns exact matches, not every name that contains 'Rick'.

MATCH (contact:Contacts{Name:'Rick'})          
WITH contact         
ORDER BY contact.Source asc         
SKIP 0 LIMIT 20

Is there a way to use CONTAINS in the latter form, since it seems to be quicker?

Following is the entire query used:

MATCH (contact:Contacts{Name:'Rick'})          
WITH contact         
ORDER BY contact.Source asc         
SKIP 0 LIMIT 20          
OPTIONAL MATCH (contact)-[workingFor:WorkingFor]->(company:Company)         
with contact, workingFor, company 
OPTIONAL MATCH (contact)-[contactForEmployee:ContactForEmployee]->(employee:Employee)        
with contact,workingFor, company, contactForEmployee, employee 
OPTIONAL MATCH (contact)-[InfoFor:InfoFor]-(LeadInfo:LeadInfo)
with contact,workingFor, company, contactForEmployee, employee, InfoFor, LeadInfo 
optional MATCH (contact)-[connectedTo:ConnectionDetails]-(contactTo:Contacts)       
where date( connectedTo.LinckedInConnectedOn) <> date('1900-01-01')       
WITH contact,connectedTo,  contactTo,  workingFor, company, contactForEmployee, employee ,InfoFor, LeadInfo       
ORDER BY connectedTo.LinckedInConnectedOn DESC  
WITH contact, collect(connectedTo)[..5] AS liConnections, collect(contactTo)[..5] AS liContacts, workingFor, company,         contactForEmployee, employee, InfoFor, LeadInfo 
optional MATCH (contact)-[ocConnections:ConnectionDetails]-(ocContactTo:Contacts)       
where ocConnections.EmailConnectionStrengthStrong <> 0 or ocConnections.EmailConnectionStrengthMedium <> 0 or ocConnections.EmailConnectionStrengthLow <> 0       
WITH contact,ocConnections, ocContactTo, liConnections, liContacts, workingFor, company,contactForEmployee, employee, InfoFor, LeadInfo       
ORDER BY ocConnections.EmailConnectionStrengthStrong desc,      ocConnections.EmailConnectionStrengthMedium desc,
 ocConnections.EmailConnectionStrengthLow desc  
WITH contact, collect(ocConnections)[..5] AS ocConnections, collect(ocContactTo)[..5] AS ocContactTo,        
 liConnections, liContacts,  workingFor, company, contactForEmployee, employee,InfoFor, LeadInfo 
RETURN contact, workingFor, company, contactForEmployee, employee,InfoFor, LeadInfo,              
 collect(liConnections) AS liConnections, collect(liContacts) AS liConnectedTo,             
 collect(ocConnections) as  ocConnections,  collect(ocContactTo) as ocConnectedTo

Solution

  • CONTAINS can use existing indexes, but not when you apply lower() to the node property:

    where lower(contact.Name) contains lower('Rick')

    This prevents an index lookup on :Contacts(Name), because the planner now has to transform the Name property of every :Contacts node to lowercase before it can perform the check.

    To allow an index lookup for queries like this, assuming the Name property is case-sensitive, you may need to add an additional property just for holding the lowercase form of the name and index it; then you can run the query without needing to use the lower() function on the Name property.

    Alternatively, if you can upgrade to Neo4j 3.5.x, there are now fulltext schema indexes, which are designed for these kinds of searches and are case-insensitive for lookups.
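
As a rough sketch of the first suggestion, the lowercase copy could be set up like this. The property name NameLower is hypothetical (it is not in the original schema), and the index statement uses Neo4j 3.x syntax; newer versions use a different CREATE INDEX form, so check the manual for your version. Run each statement separately.

```cypher
// One-off backfill: store a lowercase copy of Name.
// NameLower is a hypothetical property name chosen for this example.
MATCH (c:Contacts)
WHERE c.Name IS NOT NULL
SET c.NameLower = toLower(c.Name);

// Index the new property (Neo4j 3.x syntax).
CREATE INDEX ON :Contacts(NameLower);

// Same search as in the question, but without a function call on the
// node property, so the planner is free to use the index for CONTAINS.
MATCH (contact:Contacts)
WHERE contact.NameLower CONTAINS toLower('Rick')
WITH contact
ORDER BY contact.Source ASC
SKIP 0 LIMIT 20
RETURN contact;
```

The application would have to keep NameLower in sync whenever Name changes. Prefixing the query with PROFILE should show whether an index operator is actually chosen.

For the fulltext route, a similar sketch using the procedures that ship with Neo4j 3.5 might look like the following. The index name contactNames is an assumption. Note that a fulltext index matches on analyzed words rather than arbitrary substrings, so 'rick*' finds names containing a word that starts with "rick" in any case, which is close to, but not identical to, CONTAINS.

```cypher
// Create a fulltext index over Contacts.Name (3.5 procedure syntax).
// 'contactNames' is a hypothetical index name.
CALL db.index.fulltext.createNodeIndex('contactNames', ['Contacts'], ['Name']);

// Case-insensitive lookup through the fulltext index.
CALL db.index.fulltext.queryNodes('contactNames', 'rick*')
YIELD node AS contact
WITH contact
ORDER BY contact.Source ASC
SKIP 0 LIMIT 20
RETURN contact;
```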