Tags: python, neo4j, cypher, neo4j-python-driver

Get elements along with their ids from neo4j using python driver


I am using the Python neo4j driver (5.5.0) to query a Neo4j Aura database from my Python application. However, the returned data is insufficient for my purpose (building a graph visualization), as it contains only the node properties and not the ids or labels. The same goes for edges: no id and no ids for the connected nodes (although it does give the connected nodes' properties). I would really like to have the ids to ensure the graphs are consistent.

I have the following Neo4j Cypher query:

f" MATCH (d:Document)",
f" WHERE d.doc_id IN {doc_ids}",
f" UNWIND d as doc",
f"  CALL {{",
f"      WITH doc",
f"      MATCH (doc:Document)-[de:CONTAINS_ENTITY]->(e:Entity_node)-[ec:ENTITY_CONCEPT_ASSOCIATION]->(c: Concept)",
f"      WITH doc, ",
f"      {{ sub: doc, rel_id: id(de), rel_type: type(de), obj: e }} as contains,",
f"      {{ sub: e, rel_id: id(ec), rel_type: type(ec), obj: c }} as represents LIMIT {num_of_ents}",
f"      MATCH (a:Author)-[ad:AUTHORED]->(doc)",
f"      RETURN", 
f"          {{ contains: contains, represents: represents }} as entities,",
f"          {{ sub: a, rel_id: id(ad), rel_type: type(ad), obj: doc }} as authors",
f"  }}",
f" RETURN doc, collect(entities) as entities, collect(authors) as authors"

that I am passing to the neo4j driver's session.run() function.

It is a bit complex and largely irrelevant, so let's say this is the query:

f" MATCH (d:Document)",
f" WHERE d.doc_id IN {doc_ids}",
f" RETURN d"

This would return a response like:

[
    {
        "d": {
            "title": "Coronavirus and paramyxovirus in bats from Northwest Italy",
            "doc_id": "a03517f26664be79239bcdf3dbb0966913206a86"
        }
    },
    ...
]

However, the same query returns a differently formatted response in Neo4j Browser:

[
  {
    "identity": 23016,
    "labels": [
      "Document"
    ],
    "properties": {
      "title": "Coronavirus and paramyxovirus in bats from Northwest Italy",
      "doc_id": "a03517f26664be79239bcdf3dbb0966913206a86"
    },
    "elementId": "23016"
  },
  ...
]

These responses contain the id of the node as well as the labels, both of which are necessary for my application. Moreover, the relationships contain start and end ids too.

How do I get these values using the neo4j Python driver? I have tried all the functions available on the Result and Record objects [data(), values(), items()], but none of them give the ids or labels. The graph() function gives nodes with ids but no relationships at all (empty list).

I am aware of the id() and labels() functions in Cypher, but given the size of my query, using them seems to increase the response time considerably.

The fact that the graph() function has ids tells me that the initial Result object has the ids somewhere within. How can I access them?


Solution

  • The "id" and "labels" of a node are managed by the Neo4j database rather than stored as node properties. To access them in a query, you can use the built-in Cypher functions described in the docs: https://neo4j.com/docs/cypher-manual/current/functions/.

    Based on your simplified example, getting the labels and the id would look like this:

    f" MATCH (d:Document)",
    f" WHERE d.doc_id IN {doc_ids}",
    f" WITH *, labels(d) as doc_labels",
    f" WITH *, id(d) as doc_id",
    f" RETURN doc_id, doc_labels"
    

    Or, assuming that "your" doc_id is the id of the node you are searching for and you just want to compare it in the WHERE clause with an external parameter "doc_ids":

    f" MATCH (d:Document)",
    f" WHERE id(d) IN {doc_ids}",
    f" WITH *, labels(d) as doc_labels",
    f" WITH *, id(d) as doc_id",
    f" RETURN doc_id, doc_labels"
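
As a hedged variation on the queries above, the sketch below passes the ids as a query parameter instead of interpolating them with an f-string, and uses elementId(), which Neo4j 5 offers as the replacement for the deprecated id(). The helper name build_document_query and the returned column names are hypothetical, chosen only for illustration; it assumes a Neo4j 5 server.

```python
def build_document_query(doc_ids):
    """Return (query, parameters) for fetching documents with ids and labels.

    Hypothetical helper: the name and the column aliases are illustrative.
    """
    query = "\n".join([
        "MATCH (d:Document)",
        # $doc_ids is filled in by the driver, not by string formatting
        "WHERE d.doc_id IN $doc_ids",
        # elementId() assumes Neo4j 5; on older servers id(d) would be used
        "RETURN elementId(d) AS element_id, labels(d) AS doc_labels, d AS doc",
    ])
    return query, {"doc_ids": list(doc_ids)}


query, params = build_document_query(
    ["a03517f26664be79239bcdf3dbb0966913206a86"]
)
print(query)
```

With an open session, the pair could then be handed to the driver as session.run(query, params).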
    

    The labels come back as a list of strings (a Cypher data type) containing all labels attached to the node. The id is an integer value.

    The result of the query should look like this:

    doc_id        doc_labels        
    0             ["Document", "second_label"]
    1             ["Document"]
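
As an alternative to adding Cypher functions, the ids and labels can also be read on the Python side. In the 5.x driver, Record.data() reduces nodes to plain property dicts, but the Node and Relationship objects inside each record still expose element_id, labels, type, start_node and end_node. The sketch below is a minimal illustration of that idea, not the driver's own serializer: the helper names and the FakeNode and FakeRelationship stand-ins are hypothetical and exist only so the example runs without a database.

```python
from dataclasses import dataclass, field


def node_to_dict(node):
    """Flatten a node-like object into a Browser-style dict.

    Assumes the object offers element_id, labels and items(),
    as neo4j.graph.Node does in driver 5.x.
    """
    return {
        "elementId": node.element_id,
        "labels": sorted(node.labels),
        "properties": dict(node.items()),
    }


def relationship_to_dict(rel):
    """Same idea for a relationship, including the ids of both end nodes."""
    return {
        "elementId": rel.element_id,
        "type": rel.type,
        "startNodeElementId": rel.start_node.element_id,
        "endNodeElementId": rel.end_node.element_id,
        "properties": dict(rel.items()),
    }


@dataclass
class FakeNode:
    """Hypothetical stand-in for neo4j.graph.Node (demo only)."""
    element_id: str
    labels: frozenset
    props: dict = field(default_factory=dict)

    def items(self):
        return self.props.items()


@dataclass
class FakeRelationship:
    """Hypothetical stand-in for neo4j.graph.Relationship (demo only)."""
    element_id: str
    type: str
    start_node: FakeNode
    end_node: FakeNode
    props: dict = field(default_factory=dict)

    def items(self):
        return self.props.items()


author = FakeNode("7", frozenset({"Author"}), {"name": "A. Writer"})
doc = FakeNode("23016", frozenset({"Document"}), {"doc_id": "a035"})
authored = FakeRelationship("99", "AUTHORED", author, doc)

print(node_to_dict(doc))
print(relationship_to_dict(authored))
```

Against a real database, the same helpers would be applied to the values of each record while iterating over the result (for example node_to_dict(record["d"])) instead of calling data() first.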
    

    Greetings, ottonormal