Neo4j php creating indexes


I'm looking into the everyman neo4j client (https://github.com/jadell/neo4jphp/wiki)

It looks very promising and comfortable to use. However I'm a bit confused about the indexes. I know in Neo4j you can add an index:

CREATE INDEX ON :Person(name)

If I recall correctly, this would automatically index all Person nodes by name.

In the everyman client library, the section on indexes shows that you can create indexes and add nodes to them like this:

$shipIndex = new Everyman\Neo4j\Index\NodeIndex($client, 'ships');

(PS: What exactly does the line above do?)

$heartOfGold = $client->makeNode()
    ->setProperty('propulsion', 'infinite improbability drive')
    ->save();

// Index the ship on one of its properties
$shipIndex->add($heartOfGold, 'propulsion', $heartOfGold->getProperty('propulsion'));

Now, my question. When should I manually add indexes in my PHP code like the example above, and when should I add the index to my Neo4j database and rely on automatic indexing? And in the latter case, can I also make use of index searching in code like this:

$match = $shipIndex->findOne('captain', 'Zaphod');

?


Solution

  • The code above adds the node and its propulsion property to a Lucene index. Note that this kind of index has been marked as legacy for some time now.

    Schema indexes now work as follows:

    You create an index on a label/property combination. For example, if you know you will need to find users by their login property, it is generally advisable to add an index for fast lookup:

    CREATE INDEX ON :User(login);
    

    Since Neo4j 3.0, this kind of index can also be used with the CONTAINS operator. For example, a query that retrieves all users whose login contains the letters neo:

    MATCH (n:User) WHERE n.login CONTAINS 'neo' RETURN n
    

    will use the index created above for fast retrieval. (NB: as of now, CONTAINS is case-sensitive.)
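
    Applied to the ship example from the question, the schema-index approach might look like the sketch below. The Ship label and the captain value are assumptions for illustration (the nodes created in the question carry no label), and the syntax is the Neo4j 3.x form used above.

    ```cypher
    // One-time setup: index Ship nodes on their captain property.
    CREATE INDEX ON :Ship(captain);

    // A node is only covered by the index if it carries the Ship label.
    CREATE (s:Ship {captain: 'Zaphod', propulsion: 'infinite improbability drive'});

    // Rough counterpart of $shipIndex->findOne('captain', 'Zaphod'):
    // an exact-match lookup that the planner can answer from the index.
    MATCH (s:Ship) WHERE s.captain = 'Zaphod' RETURN s LIMIT 1;
    ```

    Prefixing the MATCH with EXPLAIN shows the query plan without running the query, which is one way to check whether the index is actually picked up.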

    The differences between legacy and schema indexes are explained well here: Neo4j auto-index, legacy index and label schema: differences for a relative-to-a-node full-text search

    Unfortunately, the library you mentioned is no longer maintained, as you can see from the commit history: https://github.com/jadell/neo4jphp/commits/master

    Neo4j is evolving quickly. In particular, version 3.0 introduced a new binary protocol (Bolt), which improves performance and reduces latency compared to HTTP.

    I would advise you (disclaimer: I am the author of the following library) to use an up-to-date client like https://github.com/graphaware/neo4j-php-client. (Note that it is a pure driver: it does not offer OGM features, for example, so you will have to write your own Cypher queries.)