python neo4j rest-client

Indexing nodes in neo4j in python


I'm building a database with tag nodes and URL nodes, where each URL node is connected to tag nodes. If a URL that already exists is inserted again, it should be linked to the tag node instead of creating a duplicate URL node. I think indexing would solve this problem. How can I do indexing and traversal with neo4jrestclient? A link to a tutorial would be fine. I'm currently using versae's neo4jrestclient.

Thanks


Solution

  • The neo4jrestclient supports both indexing and traversing the graph, but I think indexing alone could be enough for your use case. I'm not sure I understood your problem properly, though. Anyway, something like this could work:

    >>> from neo4jrestclient.client import GraphDatabase
    
    >>> gdb = GraphDatabase("http://localhost:7474/db/data/")
    
    >>> idx = gdb.nodes.indexes.create("urltags")
    
    >>> url_node = gdb.nodes.create(url="http://foo.bar", type="URL")
    
    >>> tag_node = gdb.nodes.create(tag="foobar", type="TAG")
    

    We add the property count to the relationship to keep track of how many times the URL "http://foo.bar" has been tagged with the tag foobar.

    >>> url_node.relationships.create(tag_node["tag"], tag_node, count=1)
    

    And after that, we index the URL node according to the value of its URL.

    >>> idx["url"][url_node["url"]] = url_node
    

    Then, when we need to create a new URL node tagged with a TAG node, we first query the index to check whether that URL is already indexed. If it is, we increment the count on the existing relationship; otherwise, we create the node and index it.

    >>> new_url = "http://foo.bar2"
    
    >>> nodes = idx["url"][new_url]
    
    >>> if len(nodes):
    ...     rel = nodes[0].relationships.all(types=[tag_node["tag"]])[0]
    ...     rel["count"] += 1
    ... else:
    ...     new_url_node = gdb.nodes.create(url=new_url, type="URL")
    ...     new_url_node.relationships.create(tag_node["tag"], tag_node, count=1)
    ...     idx["url"][new_url_node["url"]] = new_url_node
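The lookup-then-create step above could also be wrapped in a small helper. This is only a sketch: `get_or_create_url` and `FakeIndex` are hypothetical names that are not part of neo4jrestclient. `FakeIndex` is an in-memory stand-in that imitates the `idx[key][value]` access used in the session above, so the pattern can be tried without a running Neo4j server.

```python
from collections import defaultdict


class _KeyView:
    """Hypothetical helper: the values stored under one index key."""

    def __init__(self, bucket):
        self._bucket = bucket

    def __getitem__(self, value):
        # Return a list of matches, empty when nothing is indexed.
        return list(self._bucket[value])

    def __setitem__(self, value, node):
        # Index the node under this value.
        self._bucket[value].append(node)


class FakeIndex:
    """Hypothetical in-memory stand-in imitating idx[key][value]."""

    def __init__(self):
        self._data = defaultdict(lambda: defaultdict(list))

    def __getitem__(self, key):
        return _KeyView(self._data[key])


def get_or_create_url(idx, create_node, url):
    """Return (node, created), reusing an indexed node when one exists.

    `create_node` is any callable accepting keyword properties; the
    assumption is that gdb.nodes.create can be passed here.
    """
    hits = idx["url"][url]
    if len(hits):
        return hits[0], False
    node = create_node(url=url, type="URL")
    idx["url"][url] = node
    return node, True


# Demo with plain dicts standing in for nodes.
demo_idx = FakeIndex()
first, created_first = get_or_create_url(demo_idx, dict, "http://foo.bar")
second, created_second = get_or_create_url(demo_idx, dict, "http://foo.bar")
```

In the demo, the second call finds the entry indexed by the first call and returns the same object instead of creating a duplicate. With a live server, the same helper could presumably be called as `get_or_create_url(idx, gdb.nodes.create, new_url)`, using the `idx` and `gdb` objects from the session above.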