I am struggling with lists, loops and dictionaries.
Basically, I have two 'levels' of dictionary. The first has as its key a word, and as the value the number of occurrences of that word within a given sentence. So like this:
wordcount={'sort': 3, 'count': 3, 'wrap': 3, 'coin': 11}
Next, I have a for loop that goes through a list of just these words, and for each one creates a dictionary with a number of additional attributes, namely that word's occurrence according to Google N-grams:
for word in wordlist:
    url = f"https://books.google.com/ngrams/json?content={word}&year_start=1965&year_end=1975&corpus=26&smoothing=3"
    sleep(1)
    resp = requests.get(url)
    if resp.ok:
        results = json.loads(resp.content)[0]
        results_clean = {key: val for key, val in results.items() if key == "ngram" or key == "timeseries"}
        timeseries = {key: results_clean[key] for key in results_clean.keys() & {'timeseries'}}
        timeseriesvalues = list(timeseries.values())
        timeseriesmean = np.mean(timeseriesvalues)
        ngramsonly = {key: results_clean[key] for key in results_clean.keys() & {'ngram'}}
        ngramsvalues = list(ngramsonly.values())
        results_nouns_final = {"word": ngramsvalues, "occurrence_mean": timeseriesmean}
Basically, I want to add to results_nouns_final that word's occurrence count from before. However, when I try to do so by adding the word's value from wordcount as a third item to this dictionary (as follows):
results_nouns_final={"word": ngramsvalues, "occurrence_mean": timeseriesmean, "count": wordcount.items()}
It is appending all words' counts, and giving me something like the following:
{'word': ['sort'], 'occurrence_mean': 5.319996372468642e-05, 'count': dict_items([('sort', 3), ('count', 3), ('wrap', 3), ('coin', 11)])}
{'word': ['count'], 'occurrence_mean': 4.5438979543294875e-05, 'count': dict_items([('sort', 3), ('count', 3), ('wrap', 3), ('coin', 11)])}
...etc.
Could anybody let me know where I am going wrong? My desired output would be something like the following:
{'word': ['sort'], 'occurrence_mean': 5.319996372468642e-05, 'count': 3}
{'word': ['count'], 'occurrence_mean': 4.5438979543294875e-05, 'count': 3}
When you use wordcount.items(), you are getting all the items in the wordcount dictionary. You only want to access the count of word.
Try replacing the last line in your loop with:
results_nouns_final={"word": ngramsvalues, "occurrence_mean": timeseriesmean, "count": wordcount.get(word)}
wordcount.get(word) gives you the count of word from your dictionary. Using .get() returns None if the word is not in wordcount.
If you're sure every word exists in your dictionary, you could just use wordcount[word].
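To see the difference side by side, here is a minimal sketch that runs without any network calls. The fake_means dictionary and the word 'banana' are made up for illustration; fake_means stands in for the per-word means you compute from the N-grams response.

```python
wordcount = {'sort': 3, 'count': 3, 'wrap': 3, 'coin': 11}

# .items() is a view over every (key, value) pair, not a single count.
all_pairs = list(wordcount.items())

# Per-word lookups: [] raises KeyError for a missing key,
# while .get() returns None (or a default you supply).
count_sort = wordcount['sort']
count_missing = wordcount.get('banana')
count_missing_zero = wordcount.get('banana', 0)

# Hypothetical stand-in for the means computed inside the request loop.
fake_means = {'sort': 5.3e-05, 'count': 4.5e-05}

# Build one record per word, looking up only that word's count.
records = [
    {"word": [word], "occurrence_mean": mean, "count": wordcount.get(word)}
    for word, mean in fake_means.items()
]

for record in records:
    print(record)
```

Passing a second argument to .get(), as in wordcount.get(word, 0), may be handy if you would rather store 0 than None for words that never appeared in your sentence.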