The problem statement is: identify the right person (X, Y, Z, ...) for a project (ABC, DEF, ...) based on matching skills (S1, S2, S3, S4, S5, S6, ...).
Example: person X is skilled in S1, S2, S3; person Y is skilled in S4, S5, S6; and person Z is skilled in S1, S3, S5, S6.
Then there is a project ABC that uses one of these skills, let's say S1. We should be able to identify person X for project ABC because it uses skill S1 (person Z also has S1, so Z would match too).
Similarly, if another project DEF comes along that needs skills S5 and S6, we should assign persons Y and Z because of the skill match.
Is there a good way to do this in Python?
I tried this:
import math
import re
from collections import Counter

WORD = re.compile(r'\w+')

def get_cosine(vec1, vec2):
    # Dot product over the words the two vectors share
    intersection = set(vec1.keys()) & set(vec2.keys())
    numerator = sum(vec1[x] * vec2[x] for x in intersection)
    # Product of the two vector norms
    sum1 = sum(vec1[x] ** 2 for x in vec1)
    sum2 = sum(vec2[x] ** 2 for x in vec2)
    denominator = math.sqrt(sum1) * math.sqrt(sum2)
    if not denominator:
        return 0.0
    return float(numerator) / denominator

def text_to_vector(text):
    # Count each word (skill name) in the text
    words = WORD.findall(text)
    return Counter(words)

text1 = 'python, c, perl'
text2 = 'perl,c'
vector1 = text_to_vector(text1)
vector2 = text_to_vector(text2)
cosine = get_cosine(vector1, vector2)
print('Cosine:', cosine)
I don't know whether this is a good way or not, but you can go with this if the data is not huge:
skill = {'x': [1, 2, 3], 'y': [4, 5, 6], 'z': [5, 6, 7, 1]}
all_employees = list(skill.keys())
needed_employees = []
required_skill = [5, 6]
for i in all_employees:
    # Count how many of the required skills this employee has
    c = 0
    for j in required_skill:
        if j in skill[i]:
            c += 1
    # Keep the employee only if every required skill matched
    if c == len(required_skill):
        needed_employees.append(i)
print(needed_employees)
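
The same check can probably be written more compactly with Python sets. This is only a minimal sketch: the names `people_skills`, `match_all`, `match_any` and `rank_by_overlap` are made up for illustration, and the data just mirrors the example in the question. `match_all` covers the "needs S5 and S6" case, `match_any` covers the "uses one of these skills" case, and `rank_by_overlap` orders people by how many required skills they share.

```python
# Hypothetical data mirroring the example in the question.
people_skills = {
    "X": {"S1", "S2", "S3"},
    "Y": {"S4", "S5", "S6"},
    "Z": {"S1", "S3", "S5", "S6"},
}

def match_all(required, people):
    """Return people who have every required skill (subset test)."""
    required = set(required)
    return sorted(p for p, skills in people.items() if required <= skills)

def match_any(required, people):
    """Return people who have at least one required skill."""
    required = set(required)
    return sorted(p for p, skills in people.items() if required & skills)

def rank_by_overlap(required, people):
    """Return people with at least one match, best overlap first."""
    required = set(required)
    scored = [(len(required & skills), p) for p, skills in people.items()]
    # Sort by descending overlap, then by name to keep ties stable
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [p for count, p in scored if count > 0]

print(match_all({"S5", "S6"}, people_skills))   # ['Y', 'Z']
print(match_any({"S1"}, people_skills))         # ['X', 'Z']
```

Set membership tests are constant time on average, so this should scale better than scanning lists, but for really large data you would likely want an inverted index from skill to people instead.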