nlp, stanford-nlp, text-mining

Compare TypedDependencies from Stanford NLP dependency parser trees


I am trying to compute a semantic match between two sentences by comparing their dependencies.
I get two Stanford dependency trees from two different sentences, and I want to compare them and get a score for the semantic match between the sentences.

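double total = 0;
int pairs = 0;
for (TypedDependency td1 : dependencyList1) {
    for (TypedDependency td2 : dependencyList2) {
        // accumulate the pairwise compareTo() results
        total += td1.compareTo(td2);
        pairs++;
    }
}
// average over all pairs (0 if either list is empty)
double score = pairs == 0 ? 0 : total / pairs;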

dependencyList1 and dependencyList2 are the lists of all dependencies from sentence 1 and sentence 2, respectively. I am using the compareTo function, which returns scores of -1, 0, or 1.
I then average the scores to come up with a final score.
I don't know how these scores are calculated.
Is there a better way to compare and identify similar dependencies?
Any help would be appreciated.


Solution

  • compareTo() defines an ordering between dependencies, e.g., for sorting (see https://docs.oracle.com/javase/7/docs/api/java/lang/Comparable.html). It does not measure similarity, so averaging its return values does not give a meaningful score. To find similar dependencies, you first need to formalize exactly what you mean by "similar", and then write a custom scoring function.

    A natural metric beyond simple equality is to collapse related relations, such as *subj (nsubj, nsubjpass, csubj, csubjpass) and *obj (dobj, iobj), into a single class before comparing. If you care about the endpoints of the arcs, matching on lemmas rather than surface words is a good start. Similarity in vector space (e.g., with word2vec or GloVe) is also quite effective.

    The list of dependencies, for reference, can be found at: http://universaldependencies.github.io/docs/u/dep/index.html
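
As a rough illustration of the custom scoring idea, here is a minimal sketch, not a definitive implementation. It assumes each dependency has already been reduced to three strings (relation short name, governor lemma, dependent lemma); the Dep class, the DependencySimilarity class, the 1/3 weights, and the "best match per dependency" averaging are all hypothetical choices made for this example, not part of the Stanford API. With CoreNLP you would presumably fill these fields from something like td.reln().getShortName(), td.gov().lemma(), and td.dep().lemma(), after running the lemmatizer.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class DependencySimilarity {

    /** Hypothetical stand-in for a TypedDependency: relation name plus the two endpoint lemmas. */
    static final class Dep {
        final String relation;
        final String govLemma;
        final String depLemma;

        Dep(String relation, String govLemma, String depLemma) {
            this.relation = relation;
            this.govLemma = govLemma;
            this.depLemma = depLemma;
        }
    }

    private static final Set<String> SUBJECTS =
            new HashSet<>(Arrays.asList("nsubj", "nsubjpass", "csubj", "csubjpass"));
    private static final Set<String> OBJECTS =
            new HashSet<>(Arrays.asList("dobj", "iobj"));

    /** Collapses the *subj and *obj relations into one class each; other relations are unchanged. */
    static String collapse(String relation) {
        if (SUBJECTS.contains(relation)) return "subj";
        if (OBJECTS.contains(relation)) return "obj";
        return relation;
    }

    /**
     * Pair score in [0, 1]: 0 if the collapsed relations differ, otherwise
     * 1/3 for the relation match plus 1/3 for each matching endpoint lemma.
     * The weights are an arbitrary assumption for this sketch.
     */
    static double pairScore(Dep a, Dep b) {
        if (!collapse(a.relation).equals(collapse(b.relation))) return 0.0;
        int matches = 1;
        if (a.govLemma.equalsIgnoreCase(b.govLemma)) matches++;
        if (a.depLemma.equalsIgnoreCase(b.depLemma)) matches++;
        return matches / 3.0;
    }

    /** Averages, over the first list, each dependency's best pair score against the second list. */
    static double score(List<Dep> first, List<Dep> second) {
        if (first.isEmpty() || second.isEmpty()) return 0.0;
        double total = 0.0;
        for (Dep a : first) {
            double best = 0.0;
            for (Dep b : second) {
                best = Math.max(best, pairScore(a, b));
            }
            total += best;
        }
        return total / first.size();
    }
}
```

Note that score(first, second) is not symmetric; averaging it with score(second, first) is one simple way to get a symmetric value. The exact lemma comparison in pairScore is also the natural place to swap in a vector-space similarity (e.g., cosine similarity over word2vec or GloVe embeddings) if exact lemma matches are too strict.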