data-mining, sentiment-analysis, text-analysis

How can I analyze unstructured text?


I use TF-IDF to assign weights to terms, which helps me construct my dictionary. But my model is not good enough, because my text is unstructured.

Any suggestions for algorithms similar to TF-IDF?


Solution

  • When you say your model is not good enough, do you mean that the generated dictionary is not good enough? Extracting key terms and constructing the dictionary from TF-IDF weights is actually a feature selection step.

    To extract or select features for your model, you can try other approaches such as principal component analysis or latent semantic analysis. Many other feature selection techniques from machine learning can be useful too!

    That said, I believe TF-IDF is a very good way to construct the dictionary for a sentiment classification task. I would suggest tuning your model's parameters during training rather than blaming the feature selection approach.

    Many deep learning techniques are also applicable to your target task.
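
To make the "TF-IDF as feature selection" idea concrete, here is a minimal sketch in plain Python. It makes several assumptions that are not in the question: a naive whitespace tokenizer, the basic tf * log(N / df) weighting (libraries often use smoothed variants), and ranking each term by its highest score in any document. The names `tokenize` and `tfidf_dictionary` are made up for illustration.

```python
import math
from collections import Counter

PUNCT = ".,!?;:\"'()"

def tokenize(text):
    # Assumption: lowercase + whitespace split is enough for a sketch.
    tokens = (tok.strip(PUNCT) for tok in text.lower().split())
    return [tok for tok in tokens if tok]

def tfidf_dictionary(docs, top_k=5):
    """Return the top_k (term, score) pairs ranked by TF-IDF weight."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term occur?
    df = Counter()
    for toks in tokenized:
        df.update(set(toks))

    # Keep, for every term, its best TF-IDF score over all documents.
    best = {}
    for toks in tokenized:
        if not toks:
            continue
        counts = Counter(toks)
        for term, count in counts.items():
            tf = count / len(toks)
            idf = math.log(n_docs / df[term])  # 0 for terms in every doc
            score = tf * idf
            if score > best.get(term, 0.0):
                best[term] = score

    # Highest score first; ties broken alphabetically for stable output.
    ranked = sorted(best.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_k]

if __name__ == "__main__":
    reviews = [
        "The movie was great",
        "The movie was terrible",
        "The plot was great fun",
    ]
    for term, score in tfidf_dictionary(reviews, top_k=3):
        print(f"{term}: {score:.3f}")
```

Terms that occur in every document (here "the" and "was") get a weight of zero and never enter the dictionary, which is the selection effect described above. How large to make `top_k` is one of the parameters worth tuning alongside the model itself.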