I have a dataset of tens of thousands of dialogues between customers and customer support. These dialogues, which could be forum posts or long-winded email conversations, have been hand-annotated to highlight the sentence containing the customer's problem. For example:
Dear agent, I am writing to you because I have a very annoying problem with my washing machine. I bought it three weeks ago and was very happy with it. However, this morning the door does not lock properly. Please help
Dear customer.... etc
The highlighted sentence would be:
However, this morning the door does not lock properly.
I found a model on HuggingFace that has been pre-trained on customer dialogues, and I have read the research paper. I was considering fine-tuning it as a starting point, but when it comes to transformers I only have experience with text classification (multiclass/multilabel).
If you want to get a specific sentence (without any modification) from the original input text, that is often referred to as 'span classification' (also called span extraction, the same formulation used for extractive question answering), where the output is the index of the first and last token of the specific sentence. Attention-based models like BERT are the usual choice for this. You can check the BERT-style models on HuggingFace that are designed for this problem, such as RobertaForQuestionAnswering (PyTorch) and TFRobertaForQuestionAnswering (TensorFlow): https://huggingface.co/docs/transformers/model_doc/roberta#transformers.TFRobertaForQuestionAnswering
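To fine-tune such a model you first need to turn each highlighted sentence into start and end token indices. Here is a minimal sketch of that conversion, assuming the annotated sentence appears verbatim in the dialogue. The helper names (`char_span`, `whitespace_offsets`, `token_span`) are made up for illustration, and the whitespace tokenizer is only a stand-in: with a real model you would take the character offsets from a fast HuggingFace tokenizer called with `return_offsets_mapping=True`.

```python
import re

def char_span(dialogue, sentence):
    """Return (start, end) character offsets of the annotated sentence."""
    start = dialogue.find(sentence)
    if start == -1:
        raise ValueError("annotated sentence not found verbatim in dialogue")
    return start, start + len(sentence)

def whitespace_offsets(text):
    """Stand-in tokenizer: (start, end) character offsets of each word."""
    return [(m.start(), m.end()) for m in re.finditer(r"\S+", text)]

def token_span(offsets, char_start, char_end):
    """Map a character span to inclusive (first, last) token indices."""
    # First token that ends after the span starts.
    first = next(i for i, (s, e) in enumerate(offsets) if e > char_start)
    # Last token that starts before the span ends.
    last = max(i for i, (s, e) in enumerate(offsets) if s < char_end)
    return first, last

dialogue = (
    "Dear agent, I am writing to you because I have a very annoying "
    "problem with my washing machine. I bought it three weeks ago and "
    "was very happy with it. However, this morning the door does not "
    "lock properly. Please help"
)
sentence = "However, this morning the door does not lock properly."

char_start, char_end = char_span(dialogue, sentence)
offsets = whitespace_offsets(dialogue)
first_tok, last_tok = token_span(offsets, char_start, char_end)

# Rebuild the span from the token indices to check the labels.
tokens = [dialogue[s:e] for s, e in offsets]
recovered = " ".join(tokens[first_tok:last_tok + 1])
print(first_tok, last_tok, recovered)
```

The pair `(first_tok, last_tok)` is what a question-answering head is trained to predict as start and end positions. Dialogues longer than the model's maximum input length would need to be split into windows, which this sketch does not handle.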