python openai-whisper

How to correct misspelled product names?


Moisturizer Finty Skin Hydrovisor
Joa Beauty Dark Circle Concealer
Sephora brow pencil

I extracted these product names using audio transcription, but some brand names are misspelled: 'Fenty' came out as 'Finty' and 'Joah' as 'Joa'.

An English text normalizer doesn't work on this and GPT-3 also doesn't seem to do a good job.

Is there a better approach to normalize the names to the actual product names?


Solution

  • One approach to normalizing product names is to use a spell checker or spell-correction algorithm that can handle out-of-vocabulary words, for example one backed by a custom vocabulary of brand names. This can correct misspellings such as 'Finty' and 'Joa' to the correct spellings 'Fenty' and 'Joah'.

    Another approach is to use named entity recognition (NER) to locate the product names in the text and then map each one to the closest entry in a known list of product names. NER can help identify product names even when they are misspelled or written in different formats.
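
As a rough illustration of correcting against a known vocabulary, the sketch below fuzzy-matches each word of a transcribed name against a small list of brand names using Python's standard-library `difflib`. The `KNOWN_BRANDS` list, the `correct_brands` helper, and the `cutoff` value of 0.75 are all assumptions made for this example. In practice the vocabulary would come from your own product catalog, and the threshold would need tuning.

```python
import difflib

# Hypothetical vocabulary of correctly spelled brand names; in practice this
# would be loaded from your own product catalog.
KNOWN_BRANDS = ["Fenty", "Joah", "Sephora"]

# Lowercase lookup so that matching ignores case.
_BRANDS_BY_LOWER = {brand.lower(): brand for brand in KNOWN_BRANDS}


def correct_brands(name, cutoff=0.75):
    """Replace each word that closely resembles a known brand with that brand.

    `cutoff` is the minimum difflib similarity ratio (0 to 1) required for a
    replacement; 0.75 is an assumed starting point, not a tuned value.
    """
    corrected = []
    for word in name.split():
        matches = difflib.get_close_matches(
            word.lower(), list(_BRANDS_BY_LOWER), n=1, cutoff=cutoff
        )
        # Keep the original word when nothing in the vocabulary is close enough.
        corrected.append(_BRANDS_BY_LOWER[matches[0]] if matches else word)
    return " ".join(corrected)


if __name__ == "__main__":
    transcribed = [
        "Moisturizer Finty Skin Hydrovisor",
        "Joa Beauty Dark Circle Concealer",
        "Sephora brow pencil",
    ]
    for raw in transcribed:
        print(raw, "->", correct_brands(raw))
```

Words with no close match in the vocabulary, such as 'Hydrovisor' here, are left unchanged, so product-line words would have to be added to the vocabulary to be corrected as well. A lower cutoff catches more misspellings but raises the risk of rewriting ordinary words into brand names.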