
Counting pair occurrences in a list


I have a dictionary that stores the queries made in each session. The keys are session ID numbers, and the values are lists of the queries searched in that session, like this:

1000 , [ Malaria, Cholera ]
1001 , [ Disease, Malaria, Fever]
1002 , [ Fever, Cholera, AIDS, Cancer, Sickness]
1003 , [ Sickness, Disease, Fever, Constipation]

I would like to find the co-occurrences of a specific query across all sessions (for example, Disease, 2 occurrences: [(Fever, 2 times), (Malaria, 1 time), (Sickness, 1 time), (Constipation, 1 time)]). I tried the code below, using itertools, a library that I read could help:

for x in occurrences.values():
    if len(x) > 2:
        for y in x:
            for pair in itertools.combinations(y, 2):
                coccurr[pair]+=1


for k in cooccurr.keys():
    print k, len(cooccurr[k])      

The script runs without errors, but it doesn't print anything, not even an empty list. What is my mistake? Am I using itertools correctly?


Solution

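    from collections import Counter

    def findForQuery(queries, value):
        related = Counter()  # co-occurring query -> number of shared sessions
        count = 0            # number of sessions that contain `value`
        for query in queries.values():
            if value in query:
                count += 1
                # Count every other query once per session, even if repeated.
                related.update({item: 1 for item in query if item != value})
        return count, related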
    
    queries = {
        1000: [ 'Malaria', 'Cholera' ],
        1001: [ 'Disease', 'Malaria', 'Fever'],
        1002: [ 'Fever', 'Cholera', 'AIDS', 'Cancer', 'Sickness'],
        1003: [ 'Sickness', 'Disease', 'Fever', 'Constipation']
    }
    

    Used like this:

    >>> findForQuery(queries, 'Disease')
    (2, Counter({'Fever': 2, 'Malaria': 1, 'Constipation': 1, 'Sickness': 1}))
    >>> findForQuery(queries, 'Sickness')
    (2, Counter({'Fever': 2, 'AIDS': 1, 'Constipation': 1, 'Cancer': 1, 'Disease': 1, 'Cholera': 1}))
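
If you would rather keep the itertools approach from the question, here is a minimal sketch of how it could work. As posted, the loop has a few problems: `for y in x` iterates over the individual query strings, so `itertools.combinations(y, 2)` pairs up characters rather than queries; `len(x) > 2` skips sessions with exactly two queries; the counter is spelled `coccurr` in one place and `cooccurr` in another; and the counts are integers, so `len(cooccurr[k])` would fail. The helper names `count_pairs` and `related_to` below are made up for illustration, and sorting each session's queries is an assumption that the order inside a pair does not matter.

```python
from collections import Counter
from itertools import combinations

def count_pairs(sessions):
    """Count, for every unordered pair of queries, the sessions containing both."""
    pair_counts = Counter()
    for queries in sessions.values():
        # sorted(set(...)) removes repeats within a session and gives every
        # pair one canonical order, so ('Disease', 'Fever') is never also
        # stored as ('Fever', 'Disease').
        for pair in combinations(sorted(set(queries)), 2):
            pair_counts[pair] += 1
    return pair_counts

def related_to(pair_counts, value):
    """Collect the queries that co-occur with `value`, with their counts."""
    related = Counter()
    for (a, b), n in pair_counts.items():
        if a == value:
            related[b] += n
        elif b == value:
            related[a] += n
    return related

queries = {
    1000: ['Malaria', 'Cholera'],
    1001: ['Disease', 'Malaria', 'Fever'],
    1002: ['Fever', 'Cholera', 'AIDS', 'Cancer', 'Sickness'],
    1003: ['Sickness', 'Disease', 'Fever', 'Constipation'],
}

pair_counts = count_pairs(queries)
print(pair_counts[('Disease', 'Fever')])   # 2
print(related_to(pair_counts, 'Disease').most_common())
```

This builds the full pair table once, which is handy if you need to look up many different queries; for a single query, the `findForQuery` function above does less work.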