I have some sample data for a family graph that I want to query.
I'd like to use the find method on the GraphFrame object to query the motif A->B where the edge is of type "Mother".
Since GraphFrames uses a subset of Neo4j's Cypher language, I was wondering whether the following would be the correct query:
graph.find("(A)-[edge:Mother]->(B)").show
Or what would be the best way to implement this in GraphFrames?
GraphFrame(vertex, graph.edges.filter("attr=='Mother'")).vertices.show
This doesn't work because I cannot filter on the direction, and I only want to get the mothers :)
Any idea?
Suppose this is your test data:
import org.graphframes.GraphFrame
val edgesDf = spark.sqlContext.createDataFrame(Seq(
  ("a", "b", "Mother"),
  ("b", "c", "Father"),
  ("d", "c", "Father"),
  ("e", "b", "Mother")
)).toDF("src", "dst", "relationship")
val graph = GraphFrame.fromEdges(edgesDf)
graph.edges.show()
+---+---+------------+
|src|dst|relationship|
+---+---+------------+
|  a|  b|      Mother|
|  b|  c|      Father|
|  d|  c|      Father|
|  e|  b|      Mother|
+---+---+------------+
You can use a motif query and apply a filter to it:
graph.find("()-[e]->()").filter("e.relationship = 'Mother'").show()
+------------+
|           e|
+------------+
|[a,b,Mother]|
|[e,b,Mother]|
+------------+
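If you also want the people on either end of each edge, you can name the vertices in the motif instead of leaving them anonymous. As far as I know the motif grammar has no `[e:Mother]` type syntax, so the relationship still goes into a filter. The following is a minimal sketch, assuming `spark` is the SparkSession provided by spark-shell; the names `mother`, `child` and `mothers` are my own, and reading `src` as the mother is an assumption about the sample data:

```scala
import org.graphframes.GraphFrame

// Same sample edges as above; `spark` is assumed to be the spark-shell session.
val edgesDf = spark.createDataFrame(Seq(
  ("a", "b", "Mother"),
  ("b", "c", "Father"),
  ("d", "c", "Father"),
  ("e", "b", "Mother")
)).toDF("src", "dst", "relationship")
val graph = GraphFrame.fromEdges(edgesDf)

// Naming both vertices makes them available as struct columns next to the edge.
// Assumption: the source vertex of a "Mother" edge is the mother.
val mothers = graph.find("(mother)-[e]->(child)")
  .filter("e.relationship = 'Mother'")
  .selectExpr("mother.id AS mother", "child.id AS child")

mothers.show()
```

With the sample data this should list a and e as mothers of b.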
Or, since your case is relatively simple, you can apply a filter to the edges of the graph:
graph.edges.filter("relationship = 'Mother'").show()
+---+---+------------+
|src|dst|relationship|
+---+---+------------+
|  a|  b|      Mother|
|  e|  b|      Mother|
+---+---+------------+
Here's some alternative syntax (each gets the same result as immediately above; outside spark-shell both need import spark.implicits._):
graph.edges.filter($"relationship" === "Mother").show()
graph.edges.filter('relationship === "Mother").show()
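You mention filtering on direction, but the direction of each relationship is already encoded in the graph itself: every edge runs from its src vertex to its dst vertex, so filtering on relationship alone only returns edges in that direction.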
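To make the point about direction concrete, here is a rough sketch that pulls each end of the filtered edges out separately. It assumes `spark` is the spark-shell session and that `src` is the mother and `dst` the child in the sample data; `motherEdges`, `motherIds` and `childIds` are names I made up for illustration:

```scala
import org.graphframes.GraphFrame

// Same sample edges as above; `spark` is assumed to be the spark-shell session.
val edgesDf = spark.createDataFrame(Seq(
  ("a", "b", "Mother"),
  ("b", "c", "Father"),
  ("d", "c", "Father"),
  ("e", "b", "Mother")
)).toDF("src", "dst", "relationship")
val graph = GraphFrame.fromEdges(edgesDf)

// Keep only the "Mother" edges; each row still carries its direction (src -> dst).
val motherEdges = graph.edges.filter("relationship = 'Mother'")

// Assumption: src is the mother, dst is the child.
val motherIds = motherEdges.select("src").distinct()
val childIds = motherEdges.select("dst").distinct()

motherIds.show()
childIds.show()
```

If your data stores the relationship the other way round, swap src and dst in the two select calls.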