python nlp spell-checking

Is there a way in Python to auto-correct spelling mistakes across multiple rows of a single column in an Excel file?


I am working on sentiment analysis for a college project. I have an Excel file with a column named "comments" that has 1000 rows. The sentences in these rows contain spelling mistakes, and I need them corrected for the analysis. I don't know how to process the file in Python so that I end up with a column of corrected sentences.

All the methods I found correct the spelling of a single word, not a whole sentence, and none of them work at the column level across hundreds of rows.


Solution

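  • You can use the SpellChecker class from the pyspellchecker package: split each sentence into words, correct each word, join the words back together, and apply that function to the whole column.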

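    import pandas as pd
    from spellchecker import SpellChecker  # pip install pyspellchecker
    
    spell = SpellChecker()
    
    # One-row sample standing in for the real column.
    df = pd.DataFrame(['hooww good mrning playing fotball studyiing hard'], columns=['text'])
    
    def spell_check(sentence):
        """Return the sentence with each whitespace-separated word spell-corrected."""
        corrected_words = []
        for word in sentence.split():
            # In recent pyspellchecker versions correction() can return None
            # when it finds no candidate; keep the original word in that case.
            corrected_words.append(spell.correction(word) or word)
        return ' '.join(corrected_words)
    
    # Apply the correction to every row of the column.
    df['spell_corrected_sentence'] = df['text'].apply(spell_check)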
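
For the Excel file in the question, the same idea could be wrapped up roughly as in the sketch below. This is not part of the original answer: the helper names `correct_sentence` and `correct_excel_column`, the `_corrected` column suffix, and the word pattern are all assumptions made for illustration. It keeps punctuation in place and caches each distinct word, since correcting every word of 1000 rows one at a time can be slow. The corrector is passed in as a function, so any word-level corrector can be used.

```python
import re

# Runs of letters, optionally joined by apostrophes (e.g. "don't").
# Everything else (spaces, punctuation, digits) is left untouched.
WORD_RE = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)*")


def correct_sentence(sentence, correct_word, cache):
    """Correct each word in `sentence`, keeping punctuation and spacing.

    `correct_word` is any function mapping a word to its correction (or None);
    `cache` is a dict reused across calls so each distinct word is looked up once.
    """
    def fix(match):
        word = match.group(0)
        if word not in cache:
            # Fall back to the original word when no correction is suggested.
            cache[word] = correct_word(word) or word
        return cache[word]

    return WORD_RE.sub(fix, sentence)


def correct_excel_column(in_path, out_path, column, correct_word):
    """Read an Excel file, add a '<column>_corrected' column, and save a copy."""
    # Imported here so correct_sentence() stays usable without pandas.
    import pandas as pd  # reading/writing .xlsx also needs openpyxl

    df = pd.read_excel(in_path)
    cache = {}
    # Empty cells come back as NaN, so turn them into empty strings first.
    sentences = df[column].fillna("").astype(str)
    df[column + "_corrected"] = sentences.apply(
        lambda s: correct_sentence(s, correct_word, cache)
    )
    df.to_excel(out_path, index=False)
    return df
```

With pyspellchecker installed, this could be called as `correct_excel_column("comments.xlsx", "comments_corrected.xlsx", "comments", SpellChecker().correction)`, where both file names are placeholders. Automatic correction can also change words that were already right (names, slang), so it is worth spot-checking a few rows before running the sentiment analysis.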
    
