Problem statement: The method receives a list of tuples. Each tuple consists of two items, an ID and a string. The instance variable search_criteria
is a dictionary. Each key is a group name and each value is a list of keywords. Every keyword should be looked for in the string of every tuple, and the tuple's ID collected whenever the keyword is found.
Example input:
results - (id, text-field)
search_criteria - (group name, keywords to search for)
results = [(1, "This is an example"), (2, "Another example"), (3, "Random String")]
search_criteria = {"HR" : ["example", "harrassment", "fired"], "Maintenance" : ["is", "Random", "Cleaning"]}
Example output:
{
"HR" : {"example": [1,2]},
"Maintenance" : { "is" : [1], "Random" : [3]}
}
If a word is found, map the group to the keyword and the keyword to the list of ids found.
def build_keywords_found_dict(self, results):
    group_dict = {}
    for group in self.search_criteria:
        for keyword in self.search_criteria[group]:
            keyword_dict = {}
            for data in results:
                if keyword in data[1]:
                    group_dict[group] = keyword_dict[keyword].append(data[0])
    return group_dict
Current output:
KeyError
You can create a reverse mapping dict that maps each keyword to its group, so that you can iterate through the words in each string and look up each word's group in linear time:
mapping = {i: k for k, l in search_criteria.items() for i in l}
output = {}
for id, words in results:
    for word in words.split():
        if word in mapping:
            output.setdefault(mapping[word], {}).setdefault(word, []).append(id)
output
becomes:
{'Maintenance': {'is': [1], 'Random': [3]}, 'HR': {'example': [1, 2]}}
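Note that the reverse-mapping approach matches whole whitespace-separated words, whereas the original method used `keyword in data[1]`, which is a substring test. If the substring behaviour is what you want, the original loops can probably be kept and only the dictionary handling changed. The KeyError comes from looking up `keyword_dict[keyword]` while `keyword_dict` is still empty. In addition, `list.append` returns `None`, so assigning its result to `group_dict[group]` would not store the list. Below is a minimal sketch, written as a standalone function for illustration. The `search_criteria` parameter stands in for `self.search_criteria`, and the names `item_id`, `text` and `found` are assumptions, not part of the original code:

```python
def build_keywords_found_dict(results, search_criteria):
    """Map each group to {keyword: [ids]} for keywords found in the text."""
    group_dict = {}
    for group, keywords in search_criteria.items():
        for keyword in keywords:
            for item_id, text in results:
                # Substring test, as in the original method.
                if keyword in text:
                    # setdefault creates the nested dict and list on first
                    # use, which avoids the KeyError.
                    group_dict.setdefault(group, {}).setdefault(keyword, []).append(item_id)
    return group_dict


results = [(1, "This is an example"), (2, "Another example"), (3, "Random String")]
search_criteria = {
    "HR": ["example", "harrassment", "fired"],
    "Maintenance": ["is", "Random", "Cleaning"],
}
found = build_keywords_found_dict(results, search_criteria)
print(found)
```

With the example input this should give the expected output from the question. Be aware that substring matching also finds "is" inside "This", and that the nested loops test every keyword against every string, so this version trades the linear-time lookup for the original matching semantics.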