
Autocorrect a column in a pandas dataframe using pyenchant


I tried to apply the code from the accepted answer of this question to one of my dataframe columns where each row is a sentence, but it didn't work.

My code looks like this:

from enchant.checker import SpellChecker
checker = SpellChecker("id_ID")

h = df['Jawaban'].astype(str).str.lower()
hayo = []


for text in h:
    checker.set_text(text)

    for s in checker:
        sug = s.suggest()[0]
        s.replace(sug)

    hayo.append(checker.get_text())

I got the following error:

IndexError: list index out of range

Any help is greatly appreciated.


Solution

  • I don't get the error using your code. The only thing I'm doing differently is creating the spell checker with an English dictionary instead of id_ID.

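    import pandas as pd
    from enchant.checker import SpellChecker
    # English dictionary instead of id_ID; SpellChecker's second positional
    # argument is the text to check, not a second language, so only one tag is passed
    checker = SpellChecker('en_US')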
    
    # sample data
    ds = pd.DataFrame({ 'text': ['here is a spllng mstke','the wrld is grwng']})
    p = ds['text'].str.lower()
    
    hayo = []
    
    for text in p:
        checker.set_text(text)
    
        for s in checker:
            sug = s.suggest()[0]
            s.replace(sug)
    
        print(checker.get_text())
        hayo.append(checker.get_text())
    
    print(hayo)
    
    here is a spelling mistake
    the world is growing
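
A likely cause of the IndexError in the question is that `s.suggest()` returned an empty list for some word, so `s.suggest()[0]` had nothing to index. Whether that happens depends on the installed dictionary and the words in the column. The sketch below guards against it by leaving a word unchanged when there is no suggestion. The helper names `autocorrect_text` and `autocorrect_column` are made up for illustration; they only rely on the `set_text`, `suggest`, `replace` and `get_text` calls already used above.

```python
def autocorrect_text(checker, text):
    """Replace each flagged word with its first suggestion, if one exists."""
    checker.set_text(text)
    for err in checker:
        suggestions = err.suggest()
        # An empty list means the dictionary offers no candidate;
        # indexing [0] here is what would raise IndexError.
        if suggestions:
            err.replace(suggestions[0])
    return checker.get_text()


def autocorrect_column(series, checker):
    """Apply autocorrect_text to every row of a pandas string column."""
    cleaned = series.astype(str).str.lower()
    return cleaned.map(lambda text: autocorrect_text(checker, text))
```

Assuming `checker = SpellChecker("id_ID")` as in the question, this could be called as `df['Jawaban_fixed'] = autocorrect_column(df['Jawaban'], checker)`, where the column name `Jawaban_fixed` is only an example. Taking the first suggestion blindly can still pick a wrong word, so it may be worth spot-checking the results.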