I am trying to build a tool that calculates the similarity between two words, and I found a formula from Manchester Metropolitan University, as follows:
So far I am still confused about how to get h, the depth of the subsumer in the hierarchical semantic net. As I understand it, h is the path length from the top word to a given word, and according to the authors the top word for nouns is 'entity'. But what about other kinds of words, such as adjectives, adverbs, and verbs? And once we have the top word, how can we list the path from it to the word we want to calculate?
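As far as I can tell from the paper, the word-similarity measure has the form below, where l is the shortest path length between the two words, h is the depth of their subsumer, and α and β are tuning constants (I believe the paper reports α = 0.2 and β = 0.45, but please check the paper itself):

$$
s(w_1, w_2) = e^{-\alpha l} \cdot \frac{e^{\beta h} - e^{-\beta h}}{e^{\beta h} + e^{-\beta h}}
$$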
The paper is at the following link: https://www.researchgate.net/profile/Keeley_Crockett/publication/232645326_Sentence_Similarity_Based_on_Semantic_Nets_and_Corpus_Statistics/links/0deec51b8db68f19fa000000.pdf
I would really appreciate any answer. Thanks.
I would like to add some details that I have just found. They were enough for my purposes, though they may not exactly answer the question above; I am sharing them in case somebody needs them in the future.
'Entity' is not only the root for nouns, but also the root for any word, even if it is a verb, adjective, or adverb.
Coming back to the example above, h (the depth of the subsumer of 'kiss' and 'kick') is 6, counted from the root node at the top of the tree down to the word 'act'.
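To show how the path from the root and the value of h can be listed, here is a minimal sketch in Python. It uses a small hand-made taxonomy, not real WordNet data, so the names `PARENT`, `path_to_root`, `lowest_common_subsumer`, `depth`, `path_length`, and `similarity` are all hypothetical. It also assumes that depth is counted in edges (add 1 to count nodes instead) and that the similarity measure has the form e^(-αl) · tanh(βh) with α = 0.2 and β = 0.45; please check these against the paper.

```python
import math

# Hypothetical toy taxonomy (child -> parent). This is NOT real WordNet data.
PARENT = {
    "object": "entity",
    "living_thing": "object",
    "animal": "living_thing",
    "dog": "animal",
    "cat": "animal",
}


def path_to_root(word):
    """Return the chain of words from `word` up to the root, inclusive."""
    path = [word]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def depth(word):
    """Depth counted in edges from the root (assumption; the root has depth 0)."""
    return len(path_to_root(word)) - 1


def lowest_common_subsumer(a, b):
    """Return the deepest word that is an ancestor of (or equal to) both a and b."""
    ancestors_of_a = set(path_to_root(a))
    # Walking up from b, the first shared word is the deepest shared one.
    for candidate in path_to_root(b):
        if candidate in ancestors_of_a:
            return candidate
    return None  # the two words do not share a root


def path_length(a, b):
    """Number of edges on the path between a and b through their subsumer."""
    subsumer = lowest_common_subsumer(a, b)
    return depth(a) + depth(b) - 2 * depth(subsumer)


def similarity(a, b, alpha=0.2, beta=0.45):
    """Assumed form of the measure: exp(-alpha * l) * tanh(beta * h)."""
    l = path_length(a, b)
    h = depth(lowest_common_subsumer(a, b))
    return math.exp(-alpha * l) * math.tanh(beta * h)


if __name__ == "__main__":
    print(path_to_root("dog"))                  # path from the word up to the root
    print(lowest_common_subsumer("dog", "cat"))  # the subsumer
    print(depth(lowest_common_subsumer("dog", "cat")))  # h
    print(path_length("dog", "cat"))            # l
    print(similarity("dog", "cat"))
```

With real WordNet data, I think NLTK's `Synset.hypernym_paths()` and `Synset.lowest_common_hypernyms()` give the same kind of information, but the resulting numbers depend on which sense of each word is picked and on whether nodes or edges are counted.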