azure-language-understanding

Handling typos / misspellings on list entities


What is the best-practice approach to handling typos and misspellings in LUIS list entities?

I have LUIS intents that use a list entity (specifically Company Department: HR, Finance, etc.). Users commonly misspell the department name in their utterances. LUIS expects an exact match for list entities; it doesn't do a "smart" match, so it doesn't pick up the misspelled entity.

a) Using Bing Spell Check is not necessarily a good solution. For example, some departments are acronyms such as VRPA, and Bing won't correct a typo there.

b) When I used LUIS a year ago, I would pre-process the utterance and use a Levenshtein distance algorithm to fix typos in list entity values before feeding it to LUIS.

I would imagine that by now LUIS has a better out-of-the-box way of handling this very common use case.

I'd appreciate input on what the best practice approach is to handle this.


Solution

  • @acambitsis and I exchanged messages via his UserVoice ticket, but I'm going to post the answer here for others.

    A combination of Bing Spell Check and simple entities might be what you're looking for, since simple entities are machine-learned.

    I was able to get something close working and have attached images.

    In entities, I created a Simple entity with the role, VRPA. In intents, I created the Show Me intent and added sample utterances "Show me the VRPA" and "Show me the VPRA". I clicked on V**A and selected the Simple Entity:VRPA role. After training, I tried "show me the varp" and it correctly guessed "varp" was the "Simple:VRPA" entity.

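    You may also find RegEx entities useful. For acronyms, you could do something like /\b[vrpa]{4}\b/i (assuming the regex entity supports quantifiers and word boundaries), and then any four-letter combination of those letters, such as VRPA/VPRA/VARP/ARVP, would match. A bare character class like /[vrpa]/i matches only a single character, so the {4} quantifier is needed. Note that this pattern also accepts repeated letters such as VVVV.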

    I highly recommend reading through the Entity Types and Improve App Performance documentation pages to see if anything jumps out to solve your particular issues.

    This may not do exactly what you're looking for. If not, I'd recommend implementing a fuzzy-matching algo of your choice.

    [Image: entities]

    [Image: intents]
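
For the fuzzy-matching fallback mentioned in both the question and the answer, here is a minimal sketch of a Levenshtein-based pre-processing step in Python. It does not call LUIS; it only rewrites the utterance before you would send it. The function names (`levenshtein`, `correct_list_entities`) and the parameters `max_distance` and `min_length` are hypothetical choices for illustration, and the department values are the ones from the question.

```python
import re


def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep if equal)
            ))
        previous = current
    return previous[-1]


def correct_list_entities(utterance, values, max_distance=2, min_length=3):
    """Replace near-miss tokens with the closest list-entity value.

    Assumption: values shorter than min_length (e.g. "HR") are only ever
    matched exactly, because one edit would turn "he" or "or" into "HR".
    """
    candidates = [v for v in values if len(v) >= min_length]

    def fix(match):
        token = match.group(0)
        if not candidates:
            return token
        distance, best = min(
            (levenshtein(token.lower(), v.lower()), v) for v in candidates
        )
        # Rewrite only close matches, and never change more than half
        # of the characters of the canonical value.
        if 0 < distance <= max_distance and distance * 2 <= len(best):
            return best
        return token

    return re.sub(r"[A-Za-z]+", fix, utterance)


departments = ["HR", "Finance", "VRPA"]
print(correct_list_entities("show me the VPRA budget", departments))
print(correct_list_entities("who runs finanse?", departments))
```

Plain Levenshtein counts a transposition such as VPRA for VRPA as two edits, which is why the default `max_distance` is 2. The thresholds are guesses and would need tuning against real utterances, since loose settings can rewrite ordinary words into department names.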