Calling wordnet from php (Wordnet class or API for PHP)


I am trying to write a program that measures the similarity between two documents. Since I am only working with English text, I decided to use WordNet, but I cannot find a way to connect WordNet to PHP, and I cannot find any WordNet API for PHP.

I saw on this forum that someone (Spudley) said he called WordNet from PHP using the shell_exec() function, in the question "Thesaurus class or API for PHP [edited]".

I would really like to know the method used, or to see some example code or a tutorial, so that I can start using WordNet with PHP.

Many thanks.


Solution

  • The PHP extension which is linked to from the WordNet site is very old and out of date -- it claims to work with PHP4, so I don't think it's been looked at in years.

    There aren't any other APIs available for WordNet->PHP, so I rolled my own solution.

    WordNet can be run from the command-line, so PHP's shell_exec() function can read the output.

    If you run WordNet from the command-line without any parameters (cd to WordNet's directory, then just type wn), it will show you a list of the search functions that WordNet supports.

    Still in the command-line, if you then try one or more of those functions, you'll see how WordNet outputs its results. For example, if you want synonyms for the word 'star', you could try the -synsn function:

    wn star -synsn
    

    This will produce output that looks a bit like this:

    Synonyms/Hypernyms (Ordered by Estimated Frequency) of noun star

    8 senses of star

    Sense 1 star => celestial body, heavenly body

    Sense 2 ace, adept, champion, sensation, maven, mavin, virtuoso, genius, hotshot, star, superstar, whiz, whizz, wizard, wiz => expert

    Sense 3 star => celestial body, heavenly body

    Sense 4 star => plane figure, two-dimensional figure

    Sense 5 star, principal, lead => actor, histrion, player, thespian, role player

    Sense 6 headliner, star => performer, performing artist

    Sense 7 asterisk, star => character, grapheme, graphic symbol

    Sense 8 star topology, star => topology, network topology

    In PHP, you can read this same output using the shell_exec() function.

    // escapeshellarg() quotes the word so user input cannot inject shell commands
    $result = shell_exec('/path/to/wn '.escapeshellarg($word).' -synsn');
    

    Now $result should contain the block of text quoted above.
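
    The call above can be wrapped in a small helper. This is only a minimal sketch: the function name wordnetQuery(), the placeholder default path and the simple check on the option string are all assumptions for illustration, not part of WordNet or PHP.

    ```php
    <?php
    // Hypothetical wrapper around the wn command-line tool.
    // The name, the $wnPath placeholder and the option check are assumptions.
    function wordnetQuery(string $word, string $option = '-synsn', string $wnPath = '/path/to/wn'): ?string
    {
        // Only accept options that look like simple wn flags such as -synsn.
        if (!preg_match('/^-[a-z]+$/', $option)) {
            throw new InvalidArgumentException("Unexpected option: $option");
        }

        // escapeshellarg() quotes each piece so input cannot inject shell commands.
        $command = escapeshellarg($wnPath) . ' ' . escapeshellarg($word) . ' ' . $option;

        $output = shell_exec($command);

        // Treat a failed call or empty output as "no result".
        return (is_string($output) && trim($output) !== '') ? $output : null;
    }
    ```

    Calling wordnetQuery('star') would then return the text block as a string, or null if nothing came back, so the calling code has one place to check for failure.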

    At this point, you have to do some proper coding. You'll need to take that block of text and parse it for the data you want.

    This is where it gets tricky. The data is presented in a format designed to be read by a human rather than by a program, so it is hard to parse accurately.
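
    As a rough illustration of that parsing step, here is a sketch for the -synsn output only. It assumes each sense has the shape "Sense N", then a comma-separated synonym list, then "=>" and a comma-separated hypernym list, as in the block quoted above. The function name parseSynsnOutput() and the array layout it returns are made up for this example, and senses that list more than one "=>" line would need extra handling.

    ```php
    <?php
    // Hypothetical parser for the text printed by `wn <word> -synsn`.
    // Returns [senseNumber => ['synonyms' => [...], 'hypernyms' => [...]]].
    function parseSynsnOutput(string $output): array
    {
        // Collapse all whitespace so multi-line and single-line layouts parse alike.
        $text = trim(preg_replace('/\s+/', ' ', $output));

        // "Sense N <synonyms> => <hypernyms>", up to the next "Sense N" or the end.
        $pattern = '/Sense (\d+) (.*?) => (.*?)(?= Sense \d+ |$)/';

        $senses = [];
        if (preg_match_all($pattern, $text, $matches, PREG_SET_ORDER)) {
            foreach ($matches as $m) {
                $senses[(int) $m[1]] = [
                    'synonyms'  => array_map('trim', explode(',', $m[2])),
                    'hypernyms' => array_map('trim', explode(',', $m[3])),
                ];
            }
        }
        return $senses;
    }
    ```

    Passing $result through parseSynsnOutput() gives one entry per sense, keyed by the sense number, which is much easier to work with than the raw text.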

    It is important to note that different search options present their output slightly differently. And, some of the results that are returned can be somewhat esoteric. I ended up writing a weighting system to score the results, but it was fairly specific to my needs, so you'll need to experiment with it to come up with your own system.
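
    To give an idea of what a weighting system could look like, here is one possible sketch. It is not the system described above, just an assumed example: because the -synsn output is ordered by estimated frequency, a synonym found in sense N earns 1/N points, and points are summed across senses. The function name scoreSynonyms() and the input layout (sense number => array with a 'synonyms' list) are hypothetical.

    ```php
    <?php
    // Hypothetical scoring: a synonym from sense N earns 1/N, summed over senses.
    // $senses is assumed to look like [1 => ['synonyms' => ['star']], ...].
    function scoreSynonyms(array $senses, string $queryWord): array
    {
        $scores = [];
        foreach ($senses as $senseNumber => $sense) {
            foreach ($sense['synonyms'] as $synonym) {
                // Skip the word that was searched for.
                if (strcasecmp($synonym, $queryWord) === 0) {
                    continue;
                }
                $scores[$synonym] = ($scores[$synonym] ?? 0.0) + 1.0 / $senseNumber;
            }
        }
        arsort($scores); // highest score first, keys preserved
        return $scores;
    }
    ```

    With this scheme, words from the more common senses float to the top, which may or may not suit your data, so treat the 1/N weight as a starting point to experiment with.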

    I hope that's enough help for you. :)