
How to model conditional triples in a knowledge graph


I am trying to build a knowledge graph from textual documents (unstructured data). My current approach is to extract triples from the data and send them to a graph database, e.g. Neo4j, for further analysis. What I notice, however, is that many of the extracted triples are what I would call 'conditional triples'. An example:

text = "Donald Trump was president-elect for the republican party since July 2016"

This yields the following 'interesting' triples:

(Donald Trump, was, president-elect)
(Donald Trump, was president-elect for, republican party)
(Donald Trump, was president-elect for republican party since, July 2016)

We thus need 4 nodes:
1. Donald Trump
2. president-elect
3. republican party
4. July 2016

Those are the 4 nodes that might have interesting relations to other entities in the graph. My difficulty (or doubt), however, is with the relationships: they seem very specific and long.

I am not sure whether this is actually an issue, or whether it is considered best practice to include such long relationships, e.g. 'was president-elect for republican party since'.

I have considered looking into creating traversals like:

(Donald Trump)-[was]->(president-elect)-[for]->(republican party)-[since]->(July 2016)

This yields simpler relationships, but it has a drawback either way. If the traversal is unique to Donald Trump, other president-elects are not related to this particular president-elect node. If it is not unique, other president-elects share the same node, and the 'for' and 'since' relationships can no longer be traced back uniquely to Donald Trump.
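The ambiguity of the shared-node variant can be reproduced with a small in-memory sketch. The edge list, the second person ("Person B", "party B", "January 2009") and the `follow` helper are illustrative assumptions, not Neo4j functionality:

```python
from collections import defaultdict

# Edges of the chained model, with a single shared "president-elect" node.
edges = [
    ("Donald Trump", "was", "president-elect"),
    ("president-elect", "for", "republican party"),
    ("republican party", "since", "July 2016"),
    # A second, hypothetical person reusing the same shared nodes.
    ("Person B", "was", "president-elect"),
    ("president-elect", "for", "party B"),
    ("party B", "since", "January 2009"),
]

# Adjacency list: source node -> list of (relation, target) pairs.
graph = defaultdict(list)
for source, relation, target in edges:
    graph[source].append((relation, target))


def follow(start, relations):
    """Return every node reached from `start` by following `relations` in order."""
    frontier = {start}
    for relation in relations:
        frontier = {
            target
            for node in frontier
            for (rel, target) in graph[node]
            if rel == relation
        }
    return frontier


# Both parties come back: the graph no longer records which one belongs to whom.
trump_parties = follow("Donald Trump", ["was", "for"])
print(sorted(trump_parties))
```

Once the `president-elect` node is shared, nothing on the path ties `republican party` to Donald Trump specifically, which is exactly the loss of attribution described above.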

As a result I am now inclined to apply the longer relationships. My question therefore is: Is that a best-practice approach, or am I missing alternative solutions?


Solution

  • Here is a possible data model:

    (:Person {name:"Donald Trump"})-[:ACHIEVED {date:'2016-07-01'}]->(pos:Position)
    (pos)-[:HAS_TITLE]->(:Title {name:"President Elect"})
    (pos)-[:FOR_PARTY]->(:Party {name:"Republican"})
    

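    The Person, Title, and Party nodes are unique. The Position node is not shared: each one belongs to a single person, so the date, title, and party stay attributable to that person.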
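As a minimal sketch of how this model might be written and read back, the Python snippet below holds two parameterized Cypher statements. The statement names, the parameter names, and the `position_params` helper are assumptions made for illustration; only the labels, relationship types, and properties come from the model above.

```python
# MERGE keeps the Person, Title and Party nodes unique;
# CREATE adds a fresh Position node each time the statement runs.
CREATE_POSITION = """
MERGE (p:Person {name: $person})
MERGE (t:Title {name: $title})
MERGE (party:Party {name: $party})
CREATE (p)-[:ACHIEVED {date: $since}]->(pos:Position)
CREATE (pos)-[:HAS_TITLE]->(t)
CREATE (pos)-[:FOR_PARTY]->(party)
"""

# Reads back every position of one person with its title, party and date.
FIND_POSITIONS = """
MATCH (p:Person {name: $person})-[a:ACHIEVED]->(pos:Position),
      (pos)-[:HAS_TITLE]->(t:Title),
      (pos)-[:FOR_PARTY]->(party:Party)
RETURN t.name AS title, party.name AS party, a.date AS since
"""


def position_params(person, title, party, since):
    """Build the parameter map for CREATE_POSITION (hypothetical helper)."""
    return {"person": person, "title": title, "party": party, "since": since}


params = position_params(
    "Donald Trump", "President Elect", "Republican", "2016-07-01"
)
print(params)
```

With the official Neo4j Python driver, these could be passed to an open session, for example `session.run(CREATE_POSITION, params)`. Because the Position node is created rather than merged, running the statement twice with the same values would produce two Position nodes, so a real import would need its own deduplication rule.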