nlp information-extraction

How to determine relationship between two entities when there is more than one relation while creating distant supervision training data?


I understand the concept of distant supervision. As far as I can tell, the training data is created like this:

  • Extract named entities from sentences
  • Pick two entities, "e1" and "e2", from each sentence.
  • Look these two entities up in a knowledge base (Freebase, etc.) to find the relationship between them.

I'm confused by the last step. What if there is more than one relation between these two entities (e1 and e2)? If so, which relation should I select?
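To make the situation concrete, here is a minimal sketch of the steps above. It assumes the named entities have already been extracted, and it stands in for Freebase with a small hypothetical dictionary (`TOY_KB`); the relation names and the helper functions are invented for illustration only.

```python
from itertools import permutations

# Hypothetical toy knowledge base: (subject, object) -> set of relation names.
# A real system would query Freebase or a similar resource instead.
TOY_KB = {
    ("Barack Obama", "United States"): {"president_of", "born_in"},
    ("Google", "YouTube"): {"acquired"},
}


def lookup_relations(e1, e2, kb=TOY_KB):
    """Return every relation the KB holds for the ordered pair (e1, e2)."""
    return kb.get((e1, e2), set())


def label_sentence(sentence, entities, kb=TOY_KB):
    """Pair up the entities of one sentence and attach all matching KB relations."""
    examples = []
    for e1, e2 in permutations(entities, 2):
        relations = lookup_relations(e1, e2, kb)
        if relations:
            examples.append({
                "sentence": sentence,
                "e1": e1,
                "e2": e2,
                "relations": sorted(relations),
            })
    return examples


# The entity list is assumed to come from a separate NER step.
examples = label_sentence(
    "Barack Obama was born in Hawaii, United States.",
    ["Barack Obama", "United States"],
)
print(examples[0]["relations"])  # two relations for a single entity pair
```

The lookup returns both `born_in` and `president_of` for the same pair, which is exactly the ambiguity the question is about.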


Solution

  • It depends on the model you're training.

    Are you learning a model for one kind of relationship and doing bootstrapping? Then only pay attention to that one relationship and drop the others from your DB.

    Are you trying to learn a bunch of relationships? Then use the presence or absence of each as a feature in your model. This is how Universal Schema works.

    [Figure: feature matrix from the Universal Schema paper]
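The two options above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `examples` list, the relation names, and both helper functions are hypothetical, and the second helper builds only a 0/1 presence matrix in the spirit of the Universal Schema feature matrix, not the model that is trained on it.

```python
# Hypothetical distantly supervised examples; each entity pair may carry
# several knowledge-base relations at once.
examples = [
    {"e1": "Barack Obama", "e2": "United States",
     "relations": ["born_in", "president_of"]},
    {"e1": "Google", "e2": "YouTube",
     "relations": ["acquired"]},
]


def keep_single_relation(examples, target):
    """Option 1: binary labels for one target relation; all others are ignored."""
    return [(ex["e1"], ex["e2"], target in ex["relations"]) for ex in examples]


def presence_matrix(examples, relation_vocab):
    """Option 2: one row per entity pair, one 0/1 column per relation."""
    return [
        [1 if rel in ex["relations"] else 0 for rel in relation_vocab]
        for ex in examples
    ]


binary_labels = keep_single_relation(examples, "born_in")

relation_vocab = ["acquired", "born_in", "president_of"]
matrix = presence_matrix(examples, relation_vocab)

for ex, row in zip(examples, matrix):
    print(ex["e1"], ex["e2"], row)
```

With the second option nothing has to be selected: a pair with two relations simply gets two 1s in its row, so the model sees every relation the knowledge base lists for that pair.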