Tags: gremlin, amazon-sagemaker, amazon-neptune, gremlinpython, graph-notebook

View Neptune Graph Schema using Jupyter notebook


Is there a way to view the schema of a graph in a Neptune cluster using Jupyter Notebook?

In SQL against an RDS database you might run "select * from tablename limit 10" to sample a table. Is there a similar way to get a sense of the graph data from a Jupyter notebook?


Solution

  • How well this performs depends on how large your graph is, but you can get a sense of the types of nodes and edges you have using queries like the ones below. From the tags you used, I assume you are using Gremlin:

    g.V().groupCount().by(label)
    g.E().groupCount().by(label)
    

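    If you have a very large graph, try putting something like limit(100000) before the groupCount step, for example g.V().limit(100000).groupCount().by(label), so that the counts come from a bounded sample rather than a full scan.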

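    If you are using a programming language like Python (with gremlinpython installed), you will need to add a next() terminal step to the queries, as shown here. In gremlinpython, label typically refers to T.label, imported with "from gremlin_python.process.traversal import T".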

    g.V().groupCount().by(label).next()
    g.E().groupCount().by(label).next()
    

    Once you know the labels and their distribution, you can pick one of them and explore its properties. Let's imagine there is a label called "person":

    g.V().hasLabel('person').limit(10).valueMap().toList()
    

    Remember that with Gremlin property graphs, vertices with the same label do not necessarily have the same properties, so it's good to look at more than one vertex to get a sense of that as well.
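
To make that last point concrete, here is a minimal sketch in plain Python that counts how often each property key appears in a sample of vertices. It assumes you already have a list of dictionaries shaped like the result of valueMap().toList(); the helper name property_coverage and the sample data are hypothetical and only there for illustration.

```python
from collections import Counter


def property_coverage(value_maps):
    """Return {property_key: (vertices_with_key, total_vertices)}.

    `value_maps` is assumed to be a list of dicts, one per vertex, in the
    shape returned by valueMap().toList().
    """
    coverage = Counter()
    for value_map in value_maps:
        # str() keeps the keys sortable even if some are not plain strings.
        coverage.update(str(key) for key in value_map)
    total = len(value_maps)
    return {key: (count, total) for key, count in sorted(coverage.items())}


# Hypothetical sample data standing in for a real query result.
sample = [
    {"name": ["Alice"], "age": [34]},
    {"name": ["Bob"]},
    {"name": ["Carol"], "age": [29], "city": ["Austin"]},
]

for key, (count, total) in property_coverage(sample).items():
    print(f"{key}: present on {count} of {total} sampled vertices")
```

Properties that appear on only some of the sampled vertices are the ones worth checking against a larger sample before treating them as part of the "schema" for that label.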