java nlp wordnet

WordNet: finding word phrases in a synset


How can we find the word phrases in a synset? In particular, take this synset for the adjective "booked":

booked, engaged, set-aside -- (reserved in advance)

I use the RiTaWN Java package (WordNet version 2.1) and cannot seem to get the phrases. For the example above, when I run

RiWordnet wordnet = new RiWordnet(null);
String word = "booked";
// "a" selects the adjective part of speech
String[] syn = wordnet.getSynset(word, "a", true);
for (int i = 0; i < syn.length; i++)
    System.out.println(syn[i]);

It only outputs

booked engaged

"set-aside" is missing from the output.

I have tested many words, and none of the phrases are returned. Another example:

commodity, trade good, good -- (articles of commerce)

Here "trade good" is not returned by the getSynset() method. So how can we actually get the phrases?

(The RiTaWN package was obtained from http://rednoise.org/rita/wordnet/documentation/index.htm)


Solution

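  • RiTaWN seems to ignore compound words (hyphenated or multi-word entries such as "set-aside" and "trade good") by default. You can disable this by calling ignoreCompoundWords(false) before getSynset() to get the full list of phrases.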

    RiWordnet wordnet = new RiWordnet();
    wordnet.ignoreCompoundWords(false);  // keep compound entries such as "set-aside"
    String[] syn = wordnet.getSynset("booked", "a", true);
    System.out.println(Arrays.asList(syn));  // needs: import java.util.Arrays;
    

    Result:

    [INFO] RiTa.WordNet.version [033]
    [booked, engaged, set-aside]
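
To illustrate what the flag appears to do, here is a minimal, self-contained sketch that does not use RiTaWN at all. It assumes that a "compound word" is any entry containing a space, a hyphen, or an underscore; that rule, the class name CompoundFilterSketch, and the helpers isCompound and filterSynset are hypothetical and only mimic the behaviour seen in the question, not the library's actual implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the filtering the library seems to apply;
// not RiTaWN code.
class CompoundFilterSketch {

    // Assumption: an entry counts as "compound" if it contains
    // a space, a hyphen, or an underscore.
    static boolean isCompound(String word) {
        return word.indexOf(' ') >= 0
            || word.indexOf('-') >= 0
            || word.indexOf('_') >= 0;
    }

    // Returns the synset entries, dropping compounds when the flag is set.
    static List<String> filterSynset(String[] synset, boolean ignoreCompoundWords) {
        List<String> kept = new ArrayList<>();
        for (String word : synset) {
            if (ignoreCompoundWords && isCompound(word)) {
                continue; // skip phrases such as "set-aside" or "trade good"
            }
            kept.add(word);
        }
        return kept;
    }

    public static void main(String[] args) {
        String[] booked = { "booked", "engaged", "set-aside" };
        String[] commodity = { "commodity", "trade good", "good" };

        System.out.println(filterSynset(booked, true));     // [booked, engaged]
        System.out.println(filterSynset(booked, false));    // [booked, engaged, set-aside]
        System.out.println(filterSynset(commodity, true));  // [commodity, good]
    }
}
```

With the flag set to true the sketch reproduces the truncated output from the question, and with false it reproduces the full synset, which is consistent with the default described above.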