python comparison matching recommendation-engine

"Skill" matching of users using Python

The problem statement is: identify the right person (X, Y, Z, ...) for a project (ABC, DEF, ...) based on matching skills (S1, S2, S3, S4, S5, S6, ...).

Example: person X is skilled in S1, S2, S3. Person Y is skilled in S4, S5, S6. Person Z is skilled in S1, S3, S5, S6.

Then there is a project ABC, which uses one of these skills, let's say skill S1. We should be able to identify person X for project ABC because it uses skill S1.

Similarly, if another project DEF comes along that needs skills S5 and S6, we should assign person Y and person Z because their skills match.

Is there a good, idiomatic way to achieve this in Python?

I tried this:

import math
import re
from collections import Counter

WORD = re.compile(r'\w+')

def get_cosine(vec1, vec2):
    # Dot product over the words that appear in both vectors
    intersection = set(vec1.keys()) & set(vec2.keys())
    numerator = sum(vec1[x] * vec2[x] for x in intersection)

    # Product of the two vector magnitudes
    sum1 = sum(vec1[x] ** 2 for x in vec1.keys())
    sum2 = sum(vec2[x] ** 2 for x in vec2.keys())
    denominator = math.sqrt(sum1) * math.sqrt(sum2)

    if not denominator:
        return 0.0
    else:
        return float(numerator) / denominator

def text_to_vector(text):
    # Count how often each word occurs in the text
    words = WORD.findall(text)
    return Counter(words)

text1 = 'python, c, perl'
text2 = 'perl,c'

vector1 = text_to_vector(text1)
vector2 = text_to_vector(text2)

cosine = get_cosine(vector1, vector2)

print('Cosine:', cosine)

Solution

I'm not sure whether this is the best way, but you can go with this if the data set is not huge:

skill = {'x': [1, 2, 3], 'y': [4, 5, 6], 'z': [5, 6, 7, 1]}
all_employees = list(skill.keys())
needed_employees = []
required_skill = [5, 6]
for i in all_employees:
    # Count how many of the required skills this employee has
    c = 0
    for j in required_skill:
        if j in skill[i]:
            c += 1
    # Keep the employee only if every required skill was found
    if c == len(required_skill):
        needed_employees.append(i)
print(needed_employees)
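
The same check can probably be written more compactly with sets. The following is only a sketch, assuming each person's skills fit in memory as a set. The function names `find_candidates` and `rank_by_overlap` are made up for illustration. The second function is an extra that was not asked for: it ranks people by how many of the required skills they have, which may help when nobody covers all of them.

```python
def find_candidates(skills, required):
    """Return the people whose skills cover every required skill."""
    required = set(required)
    # `required <= has` is True when required is a subset of has
    return sorted(person for person, has in skills.items()
                  if required <= set(has))


def rank_by_overlap(skills, required):
    """Return (person, matched_count) pairs, best match first.

    People with no matching skill are left out.
    """
    required = set(required)
    scores = {person: len(required & set(has))
              for person, has in skills.items()}
    # Sort by descending match count, then by name for a stable order
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [(person, score) for person, score in ranked if score > 0]


skill = {'x': [1, 2, 3], 'y': [4, 5, 6], 'z': [5, 6, 7, 1]}
print(find_candidates(skill, [5, 6]))  # ['y', 'z']
print(rank_by_overlap(skill, [1, 5]))  # [('z', 2), ('x', 1), ('y', 1)]
```

The subset test replaces the manual counter, so each person is checked in one expression instead of a nested loop.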