python text summary seq2seq

Is there a way to control the output length of a sequence to sequence text summarization model?


Is there a way to control the number of words or characters that a seq2seq model for text summarization produces? Examples:

"My dog is the fastest dog in the world. He loves cuddling as well."

Output 1: My dog is fast and loves cuddling.

Output 2: My dog is the fastest dog and also loves cuddling.


Solution

  • You can control this through the number of decoder steps used to produce the output. In this repo there are multiple approaches to text summarization, and each of them exposes a parameter for it. In the case of this model, the parameter is named max_dec_steps (in cell 28); it sets the maximum number of decoder timesteps (the maximum number of summary tokens), which directly bounds the length of the output sentence. The repo's author also explains several other models in detail in this blog series.

    Hope this is helpful
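
To make the idea concrete, here is a minimal sketch of how a step cap bounds the output length in a greedy decoding loop. It is not the code from the repo mentioned above: greedy_decode, toy_next_token and the "[STOP]" token are hypothetical names used only for illustration, and the toy decoder simply replays a fixed sentence in place of a trained model. Only the parameter name max_dec_steps is taken from the answer.

```python
from typing import Callable, List


def greedy_decode(next_token: Callable[[List[str]], str],
                  max_dec_steps: int,
                  stop_token: str = "[STOP]") -> List[str]:
    """Generate tokens one at a time, for at most max_dec_steps steps."""
    summary: List[str] = []
    for _ in range(max_dec_steps):
        token = next_token(summary)
        if token == stop_token:
            # The decoder decided to finish before hitting the cap.
            break
        summary.append(token)
    return summary


# Toy stand-in for a trained decoder (an assumption for this sketch):
# it replays a fixed sentence token by token, then emits the stop token.
FULL = "My dog is the fastest dog and also loves cuddling .".split()


def toy_next_token(prefix: List[str]) -> str:
    return FULL[len(prefix)] if len(prefix) < len(FULL) else "[STOP]"


short = greedy_decode(toy_next_token, max_dec_steps=5)
long = greedy_decode(toy_next_token, max_dec_steps=50)
print(" ".join(short))
print(" ".join(long))
```

Note that the cap is only an upper bound: with a small max_dec_steps the output is cut off rather than compressed, and with a large one the decoder can still stop early when it emits its stop token.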