I am vetting someone else's state spellchecker. It worked fine on the test data they ran, but with a different data set it can't get past the first word, "North", in a state name.
I need the code to work with two-word state names.
This is the code:
import sys
!pip install pyspellchecker
from spellchecker import SpellChecker
#from google.colab import files
import pandas as pd
import io
#Implement spellcheck.
spell=SpellChecker()
for ind in newDF.index:
    stateWordList = newDF['State'][ind].split()
    if len(stateWordList) == 1:
        #print(True)
        if stateWordList[0] in spell:
            pass
        else:
            correctState = input("'{}' is not a valid state, please enter a correct spelling:".format(stateWordList[0]))
            newDF.at[ind, 'State'] = correctState
    else:
        misspelledState = False in (stateWord in spell for stateWord in stateWordList)
        if misspelledState == True:
            pass
        else:
            correctState = input("'{}' is not a valid state, please enter a correct spelling:".format(stateWordList[0]))
            newDF.at[ind, 'State'] = correctState
Instead, it isn't seeing North WhateverState as valid, and returns:
'North' is not a valid state, please enter a correct spelling:
Does it need a condition specifically for two word names?
In your else branch, you have a logic error:
else:
    misspelledState = False in (stateWord in spell for stateWord in stateWordList)
    if misspelledState == True:
        pass
    else:
        correctState = input("'{}' is not a valid state, please enter a correct spelling:".format(stateWordList[0]))
        newDF.at[ind, 'State'] = correctState
Look at the line
    misspelledState = False in (stateWord in spell for stateWord in stateWordList)
If every word in stateWordList is spelled correctly, this evaluates False in (True, True, ...), and the result is False.
Execution then falls through to the else branch of the if-else below, which shows the correction prompt even though nothing is misspelled:
if misspelledState == True:
    pass
else:
    correctState = input("'{}' is not a valid state, please enter a correct spelling:".format(stateWordList[0]))
    newDF.at[ind, 'State'] = correctState
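The inverted result can be checked in isolation. This is a minimal sketch, not the asker's actual setup: known_words is a hypothetical plain set standing in for the SpellChecker dictionary so the snippet runs without pyspellchecker, and original_check is an illustrative helper name.

```python
# Hypothetical stand-in for the SpellChecker dictionary (assumption for illustration).
known_words = {"north", "south", "carolina", "dakota"}

def original_check(state_name):
    """Reproduce the question's expression: True only if some word is NOT known."""
    words = state_name.lower().split()
    return False in (word in known_words for word in words)

well_spelled = original_check("North Carolina")  # every word is known
misspelled = original_check("North Carolna")     # one word is unknown

print(well_spelled)  # False, so the original code goes to else and prompts
print(misspelled)    # True, so the original code hits pass and skips the prompt
```

So the flag itself is computed sensibly (True means "something is misspelled"), but the if/else that consumes it treats True as "nothing to do".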
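You can use
    misspelledState = all([stateWord in spell for stateWord in stateWordList])
instead. Despite its name, misspelledState is then True only when every word is spelled correctly, so the if branch passes valid names and the else branch prompts only for misspelled ones.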
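To see the all() approach end to end, here is a minimal sketch under stated assumptions: known_words is a hypothetical set standing in for the SpellChecker dictionary, a plain list stands in for the newDF['State'] column, and is_valid_state and find_invalid_states are illustrative helper names, not part of pyspellchecker or pandas.

```python
# Hypothetical stand-in for the SpellChecker dictionary (assumption for illustration).
known_words = {"north", "south", "carolina", "dakota", "texas"}

def is_valid_state(state_name, dictionary):
    """Return True when every word of the name is found in the dictionary."""
    return all(word.lower() in dictionary for word in state_name.split())

def find_invalid_states(states, dictionary):
    """Return (index, name) pairs for the names that would need a correction prompt."""
    return [
        (index, name)
        for index, name in enumerate(states)
        if not is_valid_state(name, dictionary)
    ]

# A plain list stands in for the 'State' column of the DataFrame.
states = ["Texas", "North Carolina", "South Dakotta"]
invalid = find_invalid_states(states, known_words)
print(invalid)  # [(2, 'South Dakotta')]
```

Because all() works the same on a one-word list, the separate len(stateWordList) == 1 branch is probably not needed. The prompt in the question formats stateWordList[0], which is why only 'North' is shown; formatting the full state name would make the message clearer. Note that all() returns True for an empty sequence, so an empty state string would count as valid here.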