nlp classification stanford-nlp text-classification azure-language-understanding

Text Classification - what can you do vs. what are your capabilities?


Text classification basically works from the training sentences it is given. Inputs with few or minor variations on those sentences are still classified correctly. But consider a scenario like

What can you do <<==>> What are your capabilities

This scenario, where two sentences mean the same thing but share almost no words, does not work well with regular classification or bot-building platforms.

Are there any classification approaches that would help me achieve this?


Solution

  • What you are trying to solve is called Semantic Textual Similarity, a known and well-studied field.

    There are many different ways to solve this, whether or not your data is labeled. For example, Google has published the Universal Sentence Encoder (code example), which maps sentences to embedding vectors so that you can tell whether two sentences are similar, as in your case.

    Another example would be any of the solutions published for the Quora Question Pairs Kaggle competition.

    There are also datasets for this problem: for example, you can look at SemEval STS (STS stands for Semantic Textual Similarity) or the PAWS dataset.
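
    One way to turn sentence similarity into an intent classifier is to embed every training sentence, embed the incoming query, and return the label of the closest training sentence by cosine similarity. The following is only a minimal sketch of that idea, not a definitive implementation. The names `classify_by_similarity`, `make_toy_embedder`, and the `threshold` parameter are hypothetical and chosen for illustration. The toy embedder just counts words, so it does NOT capture meaning; it is a stand-in so the sketch runs without downloads. To match "What can you do" with "What are your capabilities" you would pass a real sentence encoder (such as the Universal Sentence Encoder) as the `embed` argument instead.

```python
import math
from collections import Counter
from typing import Callable, List, Optional, Sequence, Tuple

Vector = Sequence[float]
Embedder = Callable[[str], Vector]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify_by_similarity(
    query: str,
    labeled_examples: List[Tuple[str, str]],
    embed: Embedder,
    threshold: float = 0.5,  # assumption: tune this on your own data
) -> Tuple[Optional[str], float]:
    """Return (label, score) of the most similar example, or (None, score)
    when even the best match scores below the threshold."""
    query_vec = embed(query)
    best_label: Optional[str] = None
    best_score = -1.0
    for sentence, label in labeled_examples:
        score = cosine_similarity(query_vec, embed(sentence))
        if score > best_score:
            best_label, best_score = label, score
    if best_score < threshold:
        return None, best_score
    return best_label, best_score


def make_toy_embedder(vocabulary: List[str]) -> Embedder:
    """Placeholder embedder (word counts over a fixed vocabulary).
    It is NOT semantic; swap in a real sentence encoder in practice."""
    def embed(sentence: str) -> List[float]:
        counts = Counter(sentence.lower().split())
        return [float(counts[word]) for word in vocabulary]
    return embed


# Hypothetical training data: (sentence, intent label) pairs.
examples = [
    ("what can you do", "capabilities"),
    ("book a flight to paris", "book_flight"),
]
vocabulary = sorted({w for sentence, _ in examples for w in sentence.split()})
toy_embed = make_toy_embedder(vocabulary)

label, score = classify_by_similarity("what can you do for me", examples, toy_embed)
print(label, score)
```

    Because the classifier only depends on the `embed` callable, moving from the word-count placeholder to a pretrained sentence encoder should not require changing the matching logic, only the function you pass in and possibly the threshold.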