I have a dictionary that stores the queries made in each session. The keys are session ID numbers, and the values are the lists of queries searched, like this:
1000 , [ Malaria, Cholera ]
1001 , [ Disease, Malaria, Fever]
1002 , [ Fever, Cholera, AIDS, Cancer, Sickness]
1003 , [ Sickness, Disease, Fever, Constipation]
I would like to find the co-occurrences of a specific query across all the sessions (example: Disease, 2 occurrences: [(Fever, 2 times), (Malaria, 1 time), (Sickness, 1 time), (Constipation, 1 time)]). I tried the code below, using itertools, a library that I read could help me:
for x in occurrences.values():
    if len(x) > 2:
        for y in x:
            for pair in itertools.combinations(y, 2):
                coccurr[pair]+=1
for k in cooccurr.keys():
    print k, len(cooccurr[k])
The script runs without errors, but it doesn't print anything, not even an empty list. What is my mistake? Am I using itertools correctly?
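Two things stand out in the loop as posted. First, itertools.combinations is applied to y, a single query string, so it pairs up the characters of that string rather than the queries of the session. Second, the counting loop writes to coccurr while the printing loop reads cooccurr; if both names exist as separate dictionaries, the second one stays empty and nothing is printed. The len(x) > 2 check also skips sessions with exactly two queries, such as 1000. A minimal sketch of pair counting with itertools, written for Python 3 and assuming the sample data from the question (count_pairs and sessions are illustrative names, not from the original script):

```python
from collections import Counter
from itertools import combinations

def count_pairs(sessions):
    """Count how often each unordered pair of queries shares a session."""
    pair_counts = Counter()
    for queries in sessions.values():
        # Combine the queries of the session, not the characters of one query.
        # sorted(set(...)) drops duplicates and gives each pair a fixed order,
        # so ('Disease', 'Fever') and ('Fever', 'Disease') land on the same key.
        for pair in combinations(sorted(set(queries)), 2):
            pair_counts[pair] += 1
    return pair_counts

# Sample data copied from the question.
sessions = {
    1000: ['Malaria', 'Cholera'],
    1001: ['Disease', 'Malaria', 'Fever'],
    1002: ['Fever', 'Cholera', 'AIDS', 'Cancer', 'Sickness'],
    1003: ['Sickness', 'Disease', 'Fever', 'Constipation'],
}

pair_counts = count_pairs(sessions)
print(pair_counts[('Disease', 'Fever')])  # 2 (sessions 1001 and 1003)
```

This counts every pair in every session. If only one query is of interest, counting pairs is more work than needed, and a single pass with a Counter is enough: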
from collections import Counter

def findForQuery(queries, value):
    related = Counter()
    count = 0
    for query in queries.values():
        if value in query:
            count += 1
            related.update({item: 1 for item in query if item != value})
    return count, related

queries = {
    1000: [ 'Malaria', 'Cholera' ],
    1001: [ 'Disease', 'Malaria', 'Fever'],
    1002: [ 'Fever', 'Cholera', 'AIDS', 'Cancer', 'Sickness'],
    1003: [ 'Sickness', 'Disease', 'Fever', 'Constipation']
}
Used like this:
>>> findForQuery(queries, 'Disease')
(2, Counter({'Fever': 2, 'Malaria': 1, 'Constipation': 1, 'Sickness': 1}))
>>> findForQuery(queries, 'Sickness')
(2, Counter({'Fever': 2, 'AIDS': 1, 'Constipation': 1, 'Cancer': 1, 'Disease': 1, 'Cholera': 1}))
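If the result should be a list of (query, count) pairs sorted by frequency, as in the example from the question, Counter.most_common() can produce it. The following is a rough sketch for Python 3, not part of the function above; related_queries and sessions are names made up for the illustration:

```python
from collections import Counter

def related_queries(sessions, target):
    """Return (number of sessions containing target, ranked co-occurring queries)."""
    matching = [queries for queries in sessions.values() if target in queries]
    related = Counter()
    for queries in matching:
        # dict.fromkeys removes repeated queries within a session while
        # keeping their order, so each session counts a query at most once.
        related.update(q for q in dict.fromkeys(queries) if q != target)
    # most_common() returns (query, count) pairs, highest count first.
    return len(matching), related.most_common()

# Sample data copied from the question.
sessions = {
    1000: ['Malaria', 'Cholera'],
    1001: ['Disease', 'Malaria', 'Fever'],
    1002: ['Fever', 'Cholera', 'AIDS', 'Cancer', 'Sickness'],
    1003: ['Sickness', 'Disease', 'Fever', 'Constipation'],
}

count, ranked = related_queries(sessions, 'Disease')
print(count)      # 2
print(ranked[0])  # ('Fever', 2)
```

The order of entries that share the same count depends on the Python version, so only the top entry is shown here.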