
Identify prepositions and individual POS


I am trying to find the correct part of speech for each word in a paragraph, using the Stanford POS Tagger. However, I am stuck at one point.

I want to identify prepositions from the paragraph.

The Penn Treebank tagset says:

IN  Preposition or subordinating conjunction

How can I be sure whether the current word is a preposition or a subordinating conjunction? How can I extract only the prepositions from a paragraph in this case?


Solution

  • I have made some progress on determining whether a word is actually a preposition or a subordinating conjunction.

    I parsed the following sentence:

    She left early because Mike arrived with his new girlfriend.

    (here because is a subordinating conjunction)

    After POS tagging:

    She_PRP left_VBD early_RB because_IN Mike_NNP arrived_VBD with_IN his_PRP$ new_JJ girlfriend_NN ._.

    Here, to determine whether because is a preposition or not, I parsed the sentence.

    [Image: parse tree for sentence 1]

    Here the IN node for because has SBAR (subordinate clause) as its direct parent, so it is a subordinating conjunction.

    with is also tagged IN, but its direct parent is PP, so it is a preposition.

    Example 2 :

    Keep your hand on the wound until the nurse asks you to take it off. (here until is a subordinating conjunction)

    POS tagging is :

    Keep_VB your_PRP$ hand_NN on_IN the_DT wound_NN until_IN the_DT nurse_NN asks_VBZ you_PRP to_TO take_VB it_PRP off_RP ._.

    So both until and on are tagged IN.

    However, the picture gets clearer when we actually parse the sentence.

    So finally I conclude that because is a subordinating conjunction and with is a preposition. By the same check, until is a subordinating conjunction and on is a preposition.

    I tried this on many variations of sentences, and it worked for almost all of them, except in some cases involving before and after.
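
The parent-label check described above can be sketched in a few lines of Python. This is only a minimal illustration, not the Stanford parser's API: the bracketed tree below is hand-written in Penn Treebank style for the first example sentence (an assumption, not actual parser output), and the helper names parse_tree and classify_in_words are hypothetical. It assumes a word tagged IN is a preposition when its direct parent is PP and a subordinating conjunction when its direct parent is SBAR.

```python
import re

# Hand-written Penn Treebank style tree for sentence 1 (assumed, not real parser output).
TREE = """
(ROOT (S (NP (PRP She))
         (VP (VBD left) (ADVP (RB early))
             (SBAR (IN because)
                   (S (NP (NNP Mike))
                      (VP (VBD arrived)
                          (PP (IN with) (NP (PRP$ his) (JJ new) (NN girlfriend)))))))
         (. .)))
"""

def parse_tree(text):
    """Parse a bracketed tree string into nested (label, children) tuples."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def read_node():
        nonlocal pos
        pos += 1                      # skip the opening "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read_node())
            else:
                children.append(tokens[pos])  # a leaf word
                pos += 1
        pos += 1                      # skip the closing ")"
        return (label, children)

    return read_node()

def classify_in_words(node, parent_label=None, out=None):
    """Collect (word, kind) for every IN-tagged word, based on its parent's label."""
    if out is None:
        out = []
    label, children = node
    if label == "IN" and children and isinstance(children[0], str):
        if parent_label == "PP":
            kind = "preposition"
        elif parent_label == "SBAR":
            kind = "subordinating conjunction"
        else:
            kind = "unknown"          # e.g. the harder before/after cases
        out.append((children[0], kind))
    for child in children:
        if isinstance(child, tuple):
            classify_in_words(child, label, out)
    return out

for word, kind in classify_in_words(parse_tree(TREE)):
    print(word, "->", kind)
```

With a real parser, the same walk would run over the tree the parser returns instead of the hand-written string. Words whose parent is neither PP nor SBAR are reported as unknown rather than guessed.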