Find word from dictionary in string


I am working on a problem in my project. I have a DB with a column that stores types such as "15 mins break" or "30 min free time", and I want to add another column holding a category. My categories are stored in a dictionary:

{ "short":["10","5","15","10min","5min","15min","shorter"],
"middle":["20","25","30","35","20min","25min","30min","35min"],
"long":["40","45","50","55","60","40min","45min","50min","55min","60min"]}

Any idea how I can assign a category to each type using Python? I mean just the part that checks whether the type contains a word from the dictionary. My code so far:

...calling sql select
for i, index in rows():
    type = index[0]
    if (any of words from dictionary) is in type:
        category = (name of category, for example "short")
        update in sql
        ...

Thanks


Solution

  • You want to find out whether any of the category markers appear in the break description. Suppose s1, s2 and s3 are your sample descriptions, and d is your dictionary:

    import re

    s1 = "15 mins break"
    s2 = "30 min free time"
    s3 = "something5something"
    

    Then the following expression evaluates to a list of the matching categories (re.findall() is used to tokenize the text; replace the regular expression with whatever suits your project):

    [cat for cat in d if any(marker in re.findall(r'[a-z0-9]+',s1) for marker in d[cat])]
    #['short']
    
    [cat for cat in d if any(marker in re.findall(r'[a-z0-9]+',s2) for marker in d[cat])]
    #['middle']
    
    [cat for cat in d if any(marker in re.findall(r'[a-z0-9]+',s3) for marker in d[cat])]
    #[]
    

    This assumes that all strings are lowercase; if they are not, call .lower() on the description before tokenizing.
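The same idea can be wrapped in a small helper for use inside the row loop. This is only a sketch: the names `CATEGORIES` and `categorize` are made up for illustration, the token pattern is the one used above, and returning the first matching category (or `None` when nothing matches) is an assumption about what the project needs.

```python
import re

# Category markers copied from the question.
CATEGORIES = {
    "short": ["10", "5", "15", "10min", "5min", "15min", "shorter"],
    "middle": ["20", "25", "30", "35", "20min", "25min", "30min", "35min"],
    "long": ["40", "45", "50", "55", "60", "40min", "45min", "50min", "55min", "60min"],
}


def categorize(description, categories=CATEGORIES):
    """Return the first category with a marker among the tokens, else None.

    Hypothetical helper, not part of any library.
    """
    # Lowercase first so that "15 Mins Break" tokenizes like "15 mins break".
    tokens = set(re.findall(r"[a-z0-9]+", description.lower()))
    for name, markers in categories.items():
        # A category matches when at least one marker is a whole token.
        if tokens.intersection(markers):
            return name
    return None


for text in ("15 mins break", "30 Min free time", "something5something"):
    print(text, "->", categorize(text))
```

Because whole tokens are compared, "something5something" is not treated as containing "5". The returned name can then be passed as a parameter to the SQL UPDATE for that row.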