Feature extraction NLP

I'm working on a reviews dataset. The problem is to extract the important positive and negative features of a specific product from its reviews, where "important" means the number of times the same feature is mentioned across reviews.

Ex: some xyz car

Positive: Great mileage, good looking, spacious etc

Negative: Poor power, bad performance, software problems etc

The goal is to extract the best and worst things about the product!

So far I've used gensim's doc2vec to find the top positive and negative sentences. The results are not very good, because it retrieves sentences with a similar structure rather than sentences that describe similar features.

Solution

Some write-ups on the "Word Mover's Distance" (WMD) calculation, which is used to identify similar sentences/phrases, use reviews as their dataset and seem to extract common themes and representative phrases well.

See for example:

"Navigating themes in restaurant reviews with Word Mover’s Distance" http://tech.opentable.com/2015/08/11/navigating-themes-in-restaurant-reviews-with-word-movers-distance/

"Finding similar documents with Word2Vec and WMD" https://markroxor.github.io/gensim/static/notebooks/WMD_tutorial.html
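
To give a feel for the idea, here is a minimal sketch of the "relaxed" variant of Word Mover's Distance, which is a cheap lower bound on the full WMD and not the exact optimal-transport calculation. The word vectors in `TOY_VECTORS` and the function names are made up for illustration; in practice the vectors would come from a trained word2vec model, and gensim's `KeyedVectors.wmdistance` (used in the second link) computes the full distance.

```python
import math

# Toy 2-d word vectors, invented purely for illustration. Real vectors
# would come from a trained word2vec model.
TOY_VECTORS = {
    "great": (0.9, 0.1),
    "good": (0.8, 0.2),
    "mileage": (0.1, 0.9),
    "economy": (0.2, 0.8),
    "poor": (-0.9, 0.1),
    "power": (0.1, -0.8),
}


def _nearest_cost(word, others):
    """Euclidean distance from `word` to its closest word in `others`."""
    return min(math.dist(TOY_VECTORS[word], TOY_VECTORS[o]) for o in others)


def relaxed_wmd(doc_a, doc_b):
    """Relaxed WMD between two tokenized phrases (lower bound on full WMD).

    Each word sends all of its weight to the nearest word in the other
    phrase; the larger of the two directional costs is returned.
    """
    # Drop tokens that have no vector, as one would with out-of-vocabulary words.
    a = [w for w in doc_a if w in TOY_VECTORS]
    b = [w for w in doc_b if w in TOY_VECTORS]
    if not a or not b:
        return float("inf")
    # Averaging over tokens gives each word its normalized bag-of-words weight.
    cost_ab = sum(_nearest_cost(w, b) for w in a) / len(a)
    cost_ba = sum(_nearest_cost(w, a) for w in b) / len(b)
    return max(cost_ab, cost_ba)


if __name__ == "__main__":
    query = ["great", "mileage"]
    print(relaxed_wmd(query, ["good", "economy"]))  # same feature, small distance
    print(relaxed_wmd(query, ["poor", "power"]))    # different feature, larger distance
```

Because the distance is driven by the words themselves rather than sentence structure, phrases such as "great mileage" and "good economy" end up close together even though they share no tokens. Grouping review phrases by this kind of distance and counting the size of each group is one possible way to approximate how often a feature is mentioned.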