cassandra gremlin titan bulbs

Labels, vertices and edges in TitanDB


I have the following information in a Titan graph database. I am trying to make sense of it by running queries in the Gremlin shell. The graph database that I am investigating models a network. There are two types of vertices:

        - `Switch`
        - `Port`

I am trying to figure out the relationship between these two types of vertices.

g = TitanFactory.open("/tmp/cassandra.titan")

To see the list of vertices of each type

$ g.V('type', 'switch') 
==>v[228]
==>v[108]
==>v[124]
==>v[92]
==>v[156]
==>v[140]

$ g.V('type', 'port')

==>v[160]
==>v[120152]
==>v[164]
==>v[120156]
==>v[560104]
==>v[680020]
==>v[680040]
==>v[112]
==>v[120164]
==>v[560112]
==>v[680012]
==>v[680004]
==>v[144]
==>v[680032]
==>v[236]
==>v[100]
==>v[560128]
==>v[128]
==>v[680028]
==>v[232]
==>v[96]

To find the relation between the switch and port.

g.v(108).out         
==>v[560104]
==>v[680004]
==>v[112]

What is this "out"? As I understand it, there is an outward arrow pointing from the switch represented by vertex 108 to the ports represented by vertices 560104, 680004 and 112.

What are in and out? Are they something specific to graph databases? Also, what is a label in a graph database? Are in and out labels?


Solution

  • The use of in and out is descriptive of the direction of the edge going from one vertex to another. In your case, you have this:

    switch --> port
    

    When you write:

    g.v(108).out
    

    you are telling Gremlin to find the vertex at 108, then walk along edges that point out or away from it. You might also think of out as starting from the tail of the arrow and walking to the head. Given your schema, those lead to "ports".

    Similarly, in simply means to have Gremlin walk along edges that point in to the vertex. You might also think of in as starting from the head of the arrow and walking to the tail. Given your schema, switches have no incoming edges, so traversing in from a switch will always return no results. However, if you were to start from a "port" vertex and traverse in:

    g.v(560104).in
    

    you would at least get back vertex 108 as vertex "560104" has at least one edge with an arrow pointing to it (given what I know of your sample data).

    By now you've gathered that in and out are "directions" and not "labels". A label has a different purpose; it categorizes an edge. For example, you might have the following schema:

    switch --connectsTo--> port
    company --manufactures--> switch
    switch --locatedIn--> rack
    

    In other words, you might have three edge labels representing different ways that a "switch" relates to other parts of your schema. In this way, your queries can be more descriptive about what you want. Given your previous example and this revised schema, you would have to write the following to get the same result you originally showed:

    g.v(108).out("connectsTo")         
    ==>v[560104]
    ==>v[680004]
    ==>v[112]
    

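    Graph databases will typically take advantage of these labels to help improve query performance, for example by retrieving only the edges with the requested label instead of scanning every edge attached to a vertex.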
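
To make the difference between directions and labels concrete, here is a minimal sketch in plain Python. It is a toy in-memory model, not Titan's or Gremlin's API. The vertex ids 108, 560104, 680004 and 112 come from the question; the edge labels follow the revised schema above, and the "acme" and "rack-1" vertices are made up for illustration.

```python
from collections import namedtuple

# A directed, labeled edge: tail --label--> head
Edge = namedtuple("Edge", ["tail", "label", "head"])

# Hypothetical edges. Only the numeric vertex ids come from the question;
# "acme" and "rack-1" are invented placeholders for a company and a rack.
EDGES = [
    Edge(108, "connectsTo", 560104),
    Edge(108, "connectsTo", 680004),
    Edge(108, "connectsTo", 112),
    Edge(108, "locatedIn", "rack-1"),
    Edge("acme", "manufactures", 108),
]

def out(vertex, label=None):
    """Walk edges whose tail is `vertex`; optionally keep only one label."""
    return [e.head for e in EDGES
            if e.tail == vertex and (label is None or e.label == label)]

def in_(vertex, label=None):
    """Walk edges whose head is `vertex` (`in` is a Python keyword, hence `in_`)."""
    return [e.tail for e in EDGES
            if e.head == vertex and (label is None or e.label == label)]

print(out(108))                # every outgoing edge, regardless of label
print(out(108, "connectsTo"))  # only the ports
print(in_(560104))             # the switch that points at this port
```

The direction (out or in_) decides which end of the arrow you start from, while the label only narrows down which arrows you are willing to follow. Without a label, out(108) also returns the rack, which is why the revised schema needs out("connectsTo") to get just the ports.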