I want to ask something about torchtext.
I am working on abstractive text summarization, and I have built a seq2seq model with PyTorch.
I am wondering about the data field whose vocabulary is constructed with the build_vocab function in torchtext.
In machine translation, I understand that two data fields (input, output) are needed.
But in summarization, the input data and the output data are in the same language.
Should I make two data fields (full_sentence, abstract_sentence) here?
Or is it okay to use only one data field?
I'm afraid that the wrong choice will hurt the model's performance.
Please give me a hint.
You are right: in the case of summarization and other tasks where the input and output are in the same language, it makes sense to build and use a single shared vocabulary for both.
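As a minimal sketch, assuming the classic torchtext Field API (torchtext <= 0.8, or torchtext.legacy.data in later versions) and a hypothetical TSV dataset with an article column and a summary column, you can attach the same Field object to both columns; build_vocab then counts tokens from every column that uses that Field, so the encoder and decoder share one vocabulary:

```python
from torchtext.data import Field, TabularDataset, BucketIterator

# One shared Field: article and summary are the same language, so they can
# share tokenization, special tokens, and the vocabulary.
TEXT = Field(init_token="<sos>", eos_token="<eos>", lower=True)

# Hypothetical TSV files with two columns: full article text and summary.
train_data, valid_data = TabularDataset.splits(
    path="data",
    train="train.tsv",
    validation="valid.tsv",
    format="tsv",
    fields=[("article", TEXT), ("summary", TEXT)],  # same Field for both columns
)

# build_vocab gathers tokens from every column bound to TEXT, producing one
# vocabulary that both the encoder and the decoder embeddings can index.
TEXT.build_vocab(train_data, min_freq=2)

train_iter, valid_iter = BucketIterator.splits(
    (train_data, valid_data),
    batch_size=32,
    sort_key=lambda x: len(x.article),
    sort_within_batch=True,
)
```

A shared vocabulary also lets you tie or share the embedding weights between the encoder and the decoder if you want to, which is common in summarization models; with two separate vocabularies that would not be possible.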