python, nltk, stanford-nlp

Query Analysis to determine the relationship between words using natural language processing


Sample sentences

a) Who is the CEO of IBM?

b) Where is IBM office located?

A series of operations (tokenization, POS tagging, and chunking) is applied to the sentences above to extract the relationship.

Who is the CEO of IBM -- (Extracted Tuple) --> [Who, CEO of IBM]

Where is IBM office located -- (Extracted Tuple) --> [Where, IBM office located]

From the extracted dependencies, how can I determine what the question is about? How do the WP and WHP words in the sentence indicate what kind of query should be made to extract data from the knowledge base?

For example, in a) "Who" points toward a name, place, or some other named entity,

and in b) "Where" likewise points toward a name, place, or some other named entity.

Any advice using natural language processing techniques or text mining is highly appreciated.


Solution

  • It depends on the variability of input sentences you are expecting. For the examples you give, you could use very simple pattern matching. Just set up a few patterns such as

    WHO IS ...? -> [who, ...]  
    WHERE IS ...? -> [where, ...]  
    WHERE CAN I FIND ...? -> [where, ...]
    

    And then use string matching to locate those patterns in your input data. You could even use regular expressions if necessary:

    s/who is \(.*\)/[who, \1]/
    

    (using sed-style search and replace here)

    This will of course only match those particular examples, but if most of your data looks like it, you might not need a full-blown NLP approach. You can always add more patterns like that, though at some point it might become unmanageable. However, this could get you quite far for your particular problem.

    You can of course do a full syntactic analysis, but it might be overkill and too brittle. The right approach depends entirely on your use case.
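
The pattern-matching idea above can be sketched in Python. This is only a minimal illustration, not a definitive implementation: the `PATTERNS` table, the `analyze` function, and the `ANSWER_TYPE` mapping (which guesses that "who" asks for a person and "where" for a location) are all hypothetical names and assumptions introduced here, not part of the original question or answer.

```python
import re

# Hypothetical pattern table: each regex recognises one question shape and
# captures the remainder of the question as the topic.
PATTERNS = [
    (re.compile(r"^who is (?P<topic>.+?)\??$", re.IGNORECASE), "who"),
    (re.compile(r"^where is (?P<topic>.+?)\??$", re.IGNORECASE), "where"),
    (re.compile(r"^where can i find (?P<topic>.+?)\??$", re.IGNORECASE), "where"),
]

# Assumed mapping from the wh-word to the kind of entity the answer should be.
ANSWER_TYPE = {"who": "PERSON", "where": "LOCATION"}


def analyze(question):
    """Return (wh_word, topic, answer_type), or None if no pattern matches."""
    text = question.strip()
    for pattern, wh_word in PATTERNS:
        match = pattern.match(text)
        if match:
            return wh_word, match.group("topic"), ANSWER_TYPE[wh_word]
    return None


if __name__ == "__main__":
    for q in ["Who is the Ceo of IBM?", "Where is IBM office located?"]:
        print(q, "->", analyze(q))
```

The answer type could then decide which field of the knowledge base to query. Questions that match no pattern return `None`, which is the point where adding more patterns, or falling back to a full parse, would come in.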