vaticle-typedb

Method to export Incidence Matrix from Grakn?


We often use GraphBLAS for graph processing so we need to use the incidence matrix. I haven't been able to find a way to export this from Grakn to a csv or any file. Is this possible?


Solution

  • There isn't a built-in way to dump data to CSV in Grakn right now. However, we highly encourage our community to contribute open source tooling for these kinds of tasks! Feel free to chat to us about it on our Discord.

    As to how it can be done, conceptually it's pretty easy:

    First, a query to stream all hyper-relations out:

    match $r isa relation;
    

    and then, for each relation, we can pipeline another query (possibly in a new transaction if you wish to keep memory usage lower):

    match $r iid <iid of $r from previous query>; $r ($x); get $x;
    

    which will get you every role player in this particular hyper-relation $r.

    If you also wish to extract attributes that are attached to the hyper-relation, you can use the following:

    match $r iid <iid of $r from first query>; $r has $a; get $a;
    

    In effect, we can use these steps to build up each column of the incidence matrix A: one column per hyper-relation, with a nonzero entry in the row of each of its role players.
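
    The assembly step can be sketched in plain Python. This is only an illustration under stated assumptions: it does not call the Grakn client at all, and it assumes the results of the queries above have already been collected into a dict mapping each relation iid to the iids of its role players. The function names and the sample iids are hypothetical.

```python
import csv
import io


def build_incidence_matrix(relation_players):
    """Build a vertex-by-edge incidence matrix.

    relation_players: dict mapping a relation iid (hyper-edge) to an
    iterable of role-player iids (vertices). Assumed to have been
    collected beforehand from the queries above.
    """
    edges = sorted(relation_players)
    vertices = sorted({v for players in relation_players.values() for v in players})
    row_of = {v: i for i, v in enumerate(vertices)}

    # One row per vertex, one column per hyper-edge.
    matrix = [[0] * len(edges) for _ in vertices]
    for col, edge in enumerate(edges):
        for v in relation_players[edge]:
            # 1 marks "v plays some role in edge"; multiplicity is ignored.
            matrix[row_of[v]][col] = 1
    return vertices, edges, matrix


def incidence_to_csv(vertices, edges, matrix):
    """Serialise the matrix as CSV text with a header row of edge iids."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["vertex"] + edges)
    for v, row in zip(vertices, matrix):
        writer.writerow([v] + row)
    return buf.getvalue()


# Hypothetical iids, standing in for real query results.
sample = {
    "0xr1": ["0xa", "0xb"],
    "0xr2": ["0xb", "0xc"],
}
vertices, edges, matrix = build_incidence_matrix(sample)
csv_text = incidence_to_csv(vertices, edges, matrix)
```

    A dense list of lists is used only to keep the sketch short. For a real graph you would more likely emit (row, column) index pairs, which is closer to the sparse formats GraphBLAS implementations load.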

    There are a couple of important caveats I should bring up:

    • What you end up with will exclude all type information: the types of the hyper-relations, the types of the role players, the actual role each role player plays, and the types of the attributes owned.

    ==> It would be interesting to hear/discuss how one could encode type information for use in GraphBLAS.

    • In Graql, it's entirely possible to have relations participating in relations. In the worst case, this means all hyper-edges E will also be present in the vertex set V. In practice, only a few relations will play a role in other relations, so only a subset of E may be in V.
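
    To see how much the second caveat affects a given export, one could check which relation iids also appear as role players. A minimal sketch, again assuming the query results are already held in a dict from relation iid to role-player iids (the function name and iids are hypothetical):

```python
def edges_that_are_vertices(relation_players):
    """Return the relation iids that also play a role in some relation."""
    vertices = {v for players in relation_players.values() for v in players}
    return sorted(set(relation_players) & vertices)


# Hypothetical data: relation 0xr1 is itself a role player in 0xr2.
nested = {
    "0xr1": ["0xa", "0xb"],
    "0xr2": ["0xr1", "0xc"],
}
overlap = edges_that_are_vertices(nested)
```

    Here `overlap` contains only `0xr1`, so that iid would need both a row and a column in the incidence matrix.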