Tags: python, nlp, allennlp

Does allennlp textual entailment model work when hypothesis and premise both involve multiple sentences?


On the AllenNLP textual entailment demo website, the hypothesis and premise in the examples always consist of a single sentence. Does the AllenNLP textual entailment model work when the hypothesis and premise both span multiple sentences? Is that theoretically sound? And could I train the model on my own labeled dataset to make it work on paragraph-length texts?

For example:

  • Premise: "Whenever Jack is asked whether he prefers mom or dad, he doesn't know how to respond. To be honest, he has no idea why he has to make a choice. "
  • Hypothesis: "Whom do you love more, mom or dad? Some adults like to use this question to tease kids. For Jack, he doesn't like this question."

I read the decomposable attention paper (Parikh et al., 2016). The paper doesn't discuss such a scenario. The idea behind the model is soft alignment between the two texts, so intuitively it seems like it should also work on paragraph-length inputs, but I'm not confident about that.
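To make that intuition concrete, here is a minimal numpy sketch of the "attend" step from the decomposable attention model. Nothing in it depends on how many sentences the premise or hypothesis contains, only on their token counts; the embeddings, dimensions, and identity attention function below are made-up simplifications for illustration (the actual model applies learned feed-forward networks before scoring):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy token embeddings for a multi-sentence premise (m tokens) and
# hypothesis (n tokens); sizes are illustrative only.
m, n, d = 40, 25, 8
premise = rng.normal(size=(m, d))
hypothesis = rng.normal(size=(n, d))

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

# Attend: unnormalized alignment scores e_ij between every premise
# token i and hypothesis token j (F is the identity here for brevity).
scores = premise @ hypothesis.T            # shape (m, n)

# Soft alignment: each premise token attends over all hypothesis
# tokens and vice versa, regardless of sentence boundaries.
beta = softmax(scores, axis=1) @ hypothesis   # (m, d)
alpha = softmax(scores.T, axis=1) @ premise   # (n, d)
```

Because the alignment is computed token-to-token over the whole input, splitting the premise or hypothesis into more sentences changes nothing structurally; what does change is that longer inputs were rarely seen during training.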

I'd sincerely appreciate it if anyone could help with this.


Solution

  • Currently, the datasets for textual entailment (e.g., SNLI) contain single sentences as the premise and hypothesis. However, the model should still "work" on paragraph texts, as long as the input stays within the model's maximum token limit.

    That said, models trained on these datasets, such as the ones on the AllenNLP demo, are likely to show somewhat degraded performance on such inputs, since they have not seen longer examples during training. In theory, you should definitely be able to train or fine-tune a model on your own labeled dataset of paragraph-level examples, and one would expect the resulting model to perform better on longer inputs.
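As a practical aside, a rough length pre-check before sending paragraph inputs to a predictor might look like the sketch below. The 512-token limit and whitespace tokenization are assumptions for illustration; the real limit and tokenizer depend on the specific model you load:

```python
def within_token_limit(premise: str, hypothesis: str, max_tokens: int = 512) -> bool:
    """Rough pre-check that a premise/hypothesis pair fits a model's
    input budget. Whitespace splitting only approximates the model's
    real tokenizer, and 512 is an assumed limit, not AllenNLP's."""
    return len(premise.split()) + len(hypothesis.split()) <= max_tokens

premise = ("Whenever Jack is asked whether he prefers mom or dad, he doesn't "
           "know how to respond. To be honest, he has no idea why he has to "
           "make a choice.")
hypothesis = ("Whom do you love more, mom or dad? Some adults like to use this "
              "question to tease kids. For Jack, he doesn't like this question.")

print(within_token_limit(premise, hypothesis))  # prints True
```

Multi-sentence inputs like the example pair above are well within any reasonable limit; the check only becomes relevant for much longer documents.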