
Summarization of simple Q&A


Is there a way to generate a one-sentence summary of a Q&A pair?

For example, given:

Q: What is the color of the car?
A: Red

I want to generate a summary such as

The color of the car is red

Or, given

Q: Are you a man?
A: Yes

to

Yes, I am a man.

which accounts for both question and answer.

What would be some of the most reasonable ways to do this?


Solution

  • I once had to work on the opposite problem, i.e. generating questions from sentences in Wikipedia articles.

    I used the Stanford Parser to generate parse trees out of all possible sentences in my training dataset.

    e.g.

    1. Go to http://nlp.stanford.edu:8080/parser/index.jsp
    2. Enter "The color of the car is red." and click "Parse".
    3. Then look at the Parse section of the response. The first layer of that sentence is NP VP (noun phrase followed by a verb phrase).
      The second layer is NP PP VBZ ADJP.
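
If you would rather read those layers programmatically than from the web page, here is a minimal sketch using only the Python standard library. The bracketed string is typed by hand to match the layers described above, so treat it as an assumption about what the parser prints; `parse_bracketed` and `next_layer` are hypothetical helper names, not part of the Stanford Parser.

```python
# Labels treated as punctuation and left out of the patterns (an assumption).
PUNCT_LABELS = {".", ",", ":", "``", "''"}

def parse_bracketed(text):
    """Turn a Penn Treebank style bracketed string into (label, children) tuples."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        pos += 1                      # consume "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read())       # nested constituent
            else:
                children.append(tokens[pos])  # leaf word
                pos += 1
        pos += 1                      # consume ")"
        return (label, children)

    return read()

def next_layer(nodes):
    """Collect the non-punctuation constituent children of every node in `nodes`."""
    out = []
    for _label, children in nodes:
        for child in children:
            if isinstance(child, tuple) and child[0] not in PUNCT_LABELS:
                out.append(child)
    return out

# Hand-typed parse of "The color of the car is red." (assumed parser output).
parse = ("(ROOT (S (NP (NP (DT The) (NN color)) (PP (IN of) (NP (DT the) (NN car))))"
         " (VP (VBZ is) (ADJP (JJ red))) (. .)))")

tree = parse_bracketed(parse)
sentence = tree[1][0]                 # the S node under ROOT
first = next_layer([sentence])
second = next_layer(first)

first_pattern = " ".join(label for label, _ in first)
second_pattern = " ".join(label for label, _ in second)
print(first_pattern)
print(second_pattern)
```

The two printed strings are the kind of per-sentence pattern you would collect for every sentence in your data.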

    I collected these patterns across thousands of sentences, sorted them by how common each pattern was, and then figured out how best to modify each parse tree to convert the sentence into a different Wh-question (What, Who, When, Where, Why, etc.).
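
The counting and sorting step could look roughly like this. The pattern list below is made up for illustration; in practice it would hold one pattern string per parsed sentence.

```python
from collections import Counter

# Hypothetical first-layer patterns, one per sentence (illustrative data only).
observed = [
    "NP VP",
    "NP VP",
    "PP NP VP",
    "NP VP",
    "SBAR NP VP",
    "PP NP VP",
]

pattern_counts = Counter(observed)

# most_common() returns (pattern, count) pairs, most frequent first.
ranked = pattern_counts.most_common()
for pattern, count in ranked:
    print(f"{count:>3}  {pattern}")
```

Starting with the most frequent patterns lets you write rewrite rules for the cases that cover most of your data first.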

    You could easily do something very similar. Study the parse trees of all of your training data, and figure out what patterns you could extract to get your work done. In many cases, just replacing the Wh-word in the question with the answer would give you a valid, albeit somewhat awkwardly phrased, sentence, e.g. "Red is the color of the car."
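
A rough sketch of that Wh-word substitution, working on plain strings rather than parse trees. The function name `substitute_answer` and the `WH_WORDS` list are assumptions for illustration, and it only handles questions that start with the Wh-word.

```python
# Question words this sketch recognises (an assumed, incomplete list).
WH_WORDS = {"what", "who", "when", "where", "why", "which", "how"}

def substitute_answer(question, answer):
    """Replace a leading Wh-word with the answer; return None if it does not apply."""
    words = question.strip().rstrip("?").split()
    if not words or words[0].lower() not in WH_WORDS:
        return None
    words[0] = answer
    sentence = " ".join(words)
    # Capitalise the first letter and close the sentence with a period.
    return sentence[0].upper() + sentence[1:] + "."

print(substitute_answer("What is the color of the car?", "Red"))
```

For the example question this prints "Red is the color of the car." Getting the more natural "The color of the car is red" would need a rule that works on the parse tree, moving the answer after the verb.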

    In the case of questions like "Are you a man?" (i.e. the primary verb is something like 'are', 'can', 'should', etc.), swapping the first two words and dropping the question mark usually does the trick: "You are a man."
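
The word swap can be sketched the same way. The `AUXILIARIES` set and the name `invert_yes_no` are assumptions for illustration. The swap is only correct when the subject is a single word, and it does not change "you" to "I" or handle a "No" answer.

```python
# Verbs that can open a yes/no question (an assumed, incomplete list).
AUXILIARIES = {
    "am", "is", "are", "was", "were", "can", "could", "should", "would",
    "will", "do", "does", "did", "has", "have", "had", "must", "may", "might",
}

def invert_yes_no(question):
    """Swap the leading auxiliary with the next word; return None if it does not apply."""
    words = question.strip().rstrip("?").split()
    if len(words) < 2 or words[0].lower() not in AUXILIARIES:
        return None
    # The auxiliary moves to second position, so lower-case it.
    words[0], words[1] = words[1], words[0].lower()
    sentence = " ".join(words)
    return sentence[0].upper() + sentence[1:] + "."

print(invert_yes_no("Are you a man?"))
```

A multi-word subject such as "the car" in "Is the car red?" breaks the two-word swap, which is where the parse tree helps: move the auxiliary to after the whole subject NP instead.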