machine-learning, twitter, nlp, tweetstream

Trying to find the name of a specific location from tweets


I am trying to find mentions of a specific location in tweets and then run sentiment analysis on the hits I get from the search. The problem is that when the location is called, say, "Sammy's Tap and Grill", searching for the full name returns no hits; I have to search for something like "Sammys" or "Sammy's" to get any. On the other hand, for "Empire State Building" I cannot search for "Empire" alone, because that returns unrelated tweets, including ones about the Mayan and Chola empires. There I have to search for "Empire State Building" or "Empire State".

Is there an NLP technique for picking, from the full name of a location, the search term that returns the most relevant hits? So far my only solution is to check that the hits are nouns, because some places have names like "Excellent" and "Fantastic" and I didn't want adjectives to show up. Is there an NLP way to find a location name in a tweet?


Solution

  • Your problem is very similar to the named entity recognition (NER) problem. You can try a standard named entity extractor or train your own NER model.

    There are several libraries for NER, such as:

    1. Stanford NER
    2. spaCy NER tool
    3. NLTK NER module

    If you want to train your own named entity recognition model, check these links:

    1. CRF git repository
    2. Named Entity Recognition with Tensorflow

    Good luck!
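
As a rough illustration of the NER suggestion above, here is a minimal sketch in Python. It assumes spaCy and its small English model `en_core_web_sm` are installed. The helper names (`place_mentions`, `normalize`, `refers_to`) and the prefix-matching heuristic are made up for this example and are not part of spaCy. The idea is to let the NER model pick out entity mentions, then keep only those that look like a shortened form of the full location name, so "Sammys" or "Empire State" can match while "Mayan Empire" does not.

```python
import re

# Entity labels that commonly cover places in spaCy's English models
# (FAC = buildings and facilities, GPE = countries/cities, LOC = other locations).
# ORG is included as an assumption, since bars and restaurants are often tagged as ORG.
PLACE_LABELS = {"FAC", "GPE", "LOC", "ORG"}


def load_pipeline(model_name="en_core_web_sm"):
    """Load a spaCy pipeline; spaCy is imported lazily so the helpers work without it."""
    import spacy
    return spacy.load(model_name)


def place_mentions(doc, labels=PLACE_LABELS):
    """Return (text, label) pairs for entities whose label is in `labels`."""
    return [(ent.text, ent.label_) for ent in doc.ents if ent.label_ in labels]


def normalize(text):
    """Lowercase, drop apostrophes, and split into alphanumeric tokens."""
    text = text.lower().replace("'", "").replace("\u2019", "")
    return re.findall(r"[a-z0-9]+", text)


def refers_to(mention, full_name):
    """Heuristic (an assumption, not a standard method): a mention refers to the
    location if its tokens are a leading prefix of the full name's tokens."""
    tokens = normalize(mention)
    name_tokens = normalize(full_name)
    return bool(tokens) and name_tokens[:len(tokens)] == tokens


# Usage (needs `pip install spacy` and `python -m spacy download en_core_web_sm`):
#   nlp = load_pipeline()
#   doc = nlp("Great view from the Empire State Building tonight")
#   hits = [m for m, _ in place_mentions(doc)
#           if refers_to(m, "Empire State Building")]
```

The prefix rule is deliberately simple: a one-word mention such as "Empire" would still match if the model tags it on its own, so a minimum token count or a fuzzy string match may work better on real tweets. Pretrained models are also mostly trained on edited text, so accuracy on tweets may be lower than on news-style sentences.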