I'm trying to export edges from Grakn. I can do that with the Python client like so:
    from grakn.client import GraknClient
    from tqdm import tqdm

    # KEYSPACE is the name of the keyspace, defined elsewhere
    edge_query = "match $c2c($c1, $c2) isa c2c; $c1 has id $id1; $c2 has id $id2; get $id1, $id2;"
    with open("grakn.edgelist", "w") as outfile:
        with GraknClient(uri="localhost:48555") as client:
            with client.session(keyspace=KEYSPACE) as session:
                with session.transaction().read() as read_transaction:
                    answer_iterator = read_transaction.query(edge_query)
                    for answer in tqdm(answer_iterator):
                        # Each answer maps variable names to attribute concepts
                        id1 = answer.get("id1")
                        id2 = answer.get("id2")
                        outfile.write(f"{id1.value()} {id2.value()}\n")
Edit: For each Relation, I want to export entities pairwise. The output can be a pair of Grakn IDs. I can ignore the attributes of relations or entities.
Exporting to edges seems like a common task. Is there a better way (more elegant, faster, more efficient) to do it in Grakn?
This works as long as the relation type c2c always has two roleplayers. However, this will produce two edges for every $c1, $c2, which is probably not what you want.
Let's take a pair of Things, with ids V123 and V456. If they satisfy $c2c($c1, $c2) isa c2c; with $c1 = V123 and $c2 = V456, then they will also satisfy the same pattern with $c1 = V456 and $c2 = V123. Grakn will return all combinations of $c1, $c2 that satisfy your query, so you'll get two answers back for this one c2c relation.
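To make the doubling concrete, here is a small sketch in plain Python, with no Grakn involved. It assumes that a pattern without roles behaves like enumerating every ordered assignment of the relation's role players to the two variables, as described above. `answers_for_relation` is a hypothetical helper for illustration, not part of the Grakn client.

```python
from itertools import permutations


def answers_for_relation(role_player_ids):
    """Hypothetical model of a role-less match on one relation:
    every ordered pair of distinct role players is an answer."""
    return list(permutations(role_player_ids, 2))


# One c2c relation between V123 and V456 (the ids from the example above)
answers = answers_for_relation(["V123", "V456"])
print(answers)  # [('V123', 'V456'), ('V456', 'V123')]
```

A single binary relation shows up twice, once per ordering, which is where the duplicate edges come from.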
Assuming this isn't what you want, if $c1 and $c2 play different roles in the relation c2c (likely implying there is direction to the edge), then try changing the query, adding the roles, to:
    edge_query = "match $c2c(role1: $c1, role2: $c2) isa c2c; $c1 has id $id1; $c2 has id $id2; get $id1, $id2;"
If they both play the same role (implying undirected edges), then we need to do something different in our logic. Either store edges as a set of sets of ids to remove duplicates without much effort, or perhaps consider using the Python ConceptAPI, something like this:
    from grakn.client import GraknClient

    # KEYSPACE is the name of the keyspace, defined elsewhere
    relation_query = "match $rc2c isa c2c; get;"
    with open("grakn.edgelist", "w") as outfile:
        with GraknClient(uri="localhost:48555") as client:
            with client.session(keyspace=KEYSPACE) as session:
                with session.transaction().read() as read_transaction:
                    answer_iterator = read_transaction.query(relation_query)
                    for answer in answer_iterator:
                        relation_concept = answer.get("rc2c")
                        role_players_map = relation_concept.role_players_map()
                        role_player_ids = set()
                        for role, things in role_players_map.items():
                            # Here you can do any logic regarding what things play which roles
                            for thing in things:
                                # The concept id comes from the concept object itself,
                                # so you don't need to ask for it in the query
                                role_player_ids.add(thing.id)
                        outfile.write(", ".join(role_player_ids) + "\n")
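The other option mentioned above, storing edges as a set of sets of ids, can be sketched in plain Python as follows. This is only an illustration: `unique_undirected_edges` is a hypothetical helper, and the input pairs stand in for the (id1, id2) values read from the query answers.

```python
def unique_undirected_edges(id_pairs):
    """Collapse (a, b) and (b, a) into a single edge, keeping first-seen order."""
    seen = set()
    edges = []
    for id1, id2 in id_pairs:
        key = frozenset((id1, id2))  # order-insensitive key for the pair
        if key in seen:
            continue
        seen.add(key)
        edges.append((id1, id2))
    return edges


# Made-up ids: the first two pairs are the same undirected edge
pairs = [("V123", "V456"), ("V456", "V123"), ("V123", "V789")]
edges = unique_undirected_edges(pairs)
print(edges)  # [('V123', 'V456'), ('V123', 'V789')]
```

Note that this deduplicates by pair of role players, so two distinct c2c relations between the same two things would also collapse into one edge. If that matters, iterating over relations as in the code above keeps them separate.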
Of course, I have no idea what you're doing with the resulting edgelist. For completeness, though, the more Grakn-esque way would be to treat the Relation as a first-class citizen, since it represents a hyperedge in the Grakn knowledge model. In this case we would treat the Roles of the relation as edges, which means we aren't stuck when we have ternary or N-ary relations. We can do this by changing the query:
    match $c2c($c) isa c2c; get;
Then in the result we get the id of the $c2c and of the $c.
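As a rough sketch of what can be done with those results, the plain-Python example below takes rows of (relation id, role player id), which is the shape the query above would give once the ids are read from each answer. The rows can be written out directly as a bipartite edgelist, or grouped per relation and expanded pairwise, which is what the question asks for. The ids are made up and `pairwise_edges` is a hypothetical helper, not part of the Grakn client.

```python
from collections import defaultdict
from itertools import combinations


def pairwise_edges(relation_player_rows):
    """Group role players by relation, then emit every unordered pair per relation."""
    players = defaultdict(set)
    for relation_id, player_id in relation_player_rows:
        players[relation_id].add(player_id)
    edges = []
    for relation_id in sorted(players):
        # combinations() yields each unordered pair once, so no duplicates per relation
        edges.extend(combinations(sorted(players[relation_id]), 2))
    return edges


# Made-up ids: V900 is a binary relation, V901 is a ternary one
rows = [
    ("V900", "V123"), ("V900", "V456"),
    ("V901", "V123"), ("V901", "V456"), ("V901", "V789"),
]
edges = pairwise_edges(rows)
print(edges)
```

Here the ternary relation contributes three pairs, and the pair shared by both relations appears twice, once per relation.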