I’m building a chatbot with TextBlob. So far I extract named entities by combining noun phrase extraction with the POS tag NNP. For a test question such as ‘Will Smith’s latest single?’, I correctly retrieve ‘Will Smith’. However, I want to search not only for ‘will smith’ but also for ‘william smith’, ‘bill smith’, ‘willie smith’ and ‘billy smith’, that is, other commonly used English variations of the name. I am using the Spotipy API to retrieve Spotify artists. This is what I currently have (running in PyCharm):
    from textblob import TextBlob

    # spotifyObject is assumed to be an authenticated spotipy.Spotify client created earlier.
    while True:
        response = input()
        searchQuery = TextBlob(response)
        who = []
        for item, tag in searchQuery.tags:
            if tag == "NNP":
                # Keep each noun phrase that contains this proper noun.
                for nounPhrase in searchQuery.noun_phrases:
                    np = TextBlob(nounPhrase)
                    if item.lower() in np.words:
                        if nounPhrase not in who:
                            who.append(nounPhrase)
        print(who)
        if who:
            for name in who:
                # Search once and reuse the result instead of calling the API twice.
                searchResults = spotifyObject.search(name, 50, 0, 'artist', None)
                if searchResults:
                    artists = searchResults['artists']['items']
                    for a in artists:
                        print(a['name'])
Quick question:
Why would you want 'Bill Smith' to appear in the same search as 'Will Smith'? I believe they are two different artists.
Option 1
If I understand your question correctly, you may want to apply a regular expression to the artist's first name while keeping the surname fixed.
For example: name LIKE %(any first name)% + smith
I assume a result such as "Will Sutton" would be invalid in your case, because the surname differs.
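A minimal sketch of Option 1, assuming names have the form "first last" and assuming a small hand-written nickname table (NICKNAMES and the helper functions are hypothetical, not part of TextBlob or Spotipy). It expands the first name into its known variants, then keeps only the names that consist of one of those variants followed by the same surname:

```python
import re

# Hypothetical nickname table; extend it or load one from a names dataset.
NICKNAMES = {
    "william": {"william", "will", "bill", "willie", "billy"},
}


def name_variants(first_name):
    """Return every known variant of first_name, including itself."""
    first = first_name.lower()
    for variants in NICKNAMES.values():
        if first in variants:
            return sorted(variants)
    return [first]


def build_name_pattern(full_name):
    """Compile a regex matching any first-name variant plus the same surname."""
    first, _, last = full_name.strip().partition(" ")
    alternatives = "|".join(re.escape(v) for v in name_variants(first))
    return re.compile(rf"^(?:{alternatives})\s+{re.escape(last)}$", re.IGNORECASE)


def filter_artists(full_name, artist_names):
    """Keep only the artist names that match a variant of full_name."""
    pattern = build_name_pattern(full_name)
    return [name for name in artist_names if pattern.match(name)]


candidates = ["Will Smith", "Bill Smith", "Willie Smith", "Will Sutton", "Smith"]
matches = filter_artists("will smith", candidates)
print(matches)
```

In the question's loop, each variant could be sent to spotifyObject.search in turn, and the returned a['name'] values passed through filter_artists so that results like "Will Sutton" are dropped.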
Option 2
Do you want something similar to spaCy's sense2vec feature? It returns related words together with a similarity score, so you could set a threshold and only keep results above 70%, for example. https://explosion.ai/demos/sense2vec
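The thresholding idea can be sketched without sense2vec. The example below swaps in the standard library's difflib.SequenceMatcher, which measures character-level string similarity rather than the semantic similarity sense2vec provides, so treat it only as a stand-in. The function names are hypothetical, and the 0.7 cutoff mirrors the 70% figure above:

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Case-insensitive string similarity between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def similar_names(query, candidates, threshold=0.7):
    """Return (name, score) pairs scoring above threshold, best first."""
    scored = [(name, similarity(query, name)) for name in candidates]
    kept = [(name, score) for name, score in scored if score > threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


candidates = ["Adele", "Bill Smith", "Will Smith", "Willie Smith", "Will Sutton"]
results = similar_names("Will Smith", candidates)
for name, score in results:
    print(f"{name}: {score:.0%}")
```

Because this score only compares characters, it cannot tell a nickname from a different person with a similar-looking name, so it works best combined with a surname check like the one in Option 1.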
If this is not useful, please explain your question again in a bit more detail (for example, what makes a search result valid).
Thanks