python-3.x pandas spelling

State name spellcheck code not working with two word state names


I am vetting someone else's state spellchecker. It worked fine on the test data they ran, but on a different data set it doesn't get past the first word, "North", in a two-word state name.

I need the code to be able to work with state names with two words.

This is the code:

import sys
!pip install pyspellchecker
from spellchecker import SpellChecker
#from google.colab import files
import pandas as pd
import io

#Implement spellcheck.
spell=SpellChecker()
for ind in newDF.index:
  stateWordList = newDF['State'][ind].split()
  if len(stateWordList) == 1:
    #print(True)
    if stateWordList[0] in spell:
      pass
    else:
      correctState = input("'{}' is not a valid state, please enter a correct spelling:".format(stateWordList[0]))
      newDF.at[ind, 'State'] = correctState
  else:
    misspelledState = False in (stateWord in spell for stateWord in stateWordList)
    if misspelledState == True:
      pass
    else:
      correctState = input("'{}' is not a valid state, please enter a correct spelling:".format(stateWordList[0]))
      newDF.at[ind, 'State'] = correctState

Instead, it doesn't treat "North WhateverState" as valid, and prompts:

'North' is not a valid state, please enter a correct spelling:

Does it need a condition specifically for two word names?


Solution

  • The else branch (the multi-word case) contains a logic error:

      else:
        misspelledState = False in (stateWord in spell for stateWord in stateWordList)
        if misspelledState == True:
          pass
        else:
          correctState = input("'{}' is not a valid state, please enter a correct spelling:".format(stateWordList[0]))
          newDF.at[ind, 'State'] = correctState
    

    Look at misspelledState = False in (stateWord in spell for stateWord in stateWordList). If every word in stateWordList is spelled correctly, the expression becomes False in (True, True, ...), which evaluates to False.

    The if-else that follows then takes the else branch, which prompts for a correction even though the name is valid. The prompt formats only stateWordList[0], which is why the message shows just 'North':

        if misspelledState == True:
          pass
        else:
          correctState = input("'{}' is not a valid state, please enter a correct spelling:".format(stateWordList[0]))
          newDF.at[ind, 'State'] = correctState
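To check the behaviour described above in isolation, here is a minimal sketch. It assumes a plain set, known_words, as a stand-in for the SpellChecker dictionary, and original_flag is a hypothetical helper that wraps the expression from the question; neither appears in the original code.

```python
# Stand-in for the spell-check dictionary (an assumption for illustration;
# the question uses a SpellChecker instance here instead).
known_words = {"north", "dakota", "texas"}

def original_flag(state_name):
    """Reproduce the expression from the question for one state name."""
    words = state_name.lower().split()
    # True only if at least one word is NOT in the dictionary.
    return False in (word in known_words for word in words)

print(original_flag("North Dakota"))   # False: every word is known
print(original_flag("North Dakotta"))  # True: "dakotta" is unknown
```

So the flag is False for a valid name, and the original `if misspelledState == True: pass ... else:` sends exactly those valid names to the correction prompt.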
    

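    Use all() instead. It is True only when every word is spelled correctly, so the existing if branch passes for valid names and the else branch prompts only when a word is unrecognised (the name misspelledState then means the opposite of what it says, so consider renaming it):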

    misspelledState = all([stateWord in spell for stateWord in stateWordList])
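Because all() also handles a one-word list, a separate branch for two-word names is arguably unnecessary. The following is a rough sketch of that idea, not the original author's code: is_valid_state is a hypothetical helper, known_words is again a plain set standing in for the SpellChecker dictionary, and a list of strings stands in for the 'State' column.

```python
def is_valid_state(state_name, known_words):
    """Return True only if every word in the state name is recognised."""
    return all(word.lower() in known_words for word in state_name.split())

# Stand-in dictionary and data (assumptions for illustration only).
known_words = {"north", "south", "dakota", "texas", "new", "york"}
states = ["Texas", "North Dakota", "North Dakotta", "Nw York"]

# Collect the full names that need correcting, instead of only the first word.
needs_correction = [name for name in states
                    if not is_valid_state(name, known_words)]
print(needs_correction)  # ['North Dakotta', 'Nw York']
```

In the original loop, the same check could replace both branches, and formatting the whole newDF['State'][ind] value into the prompt (rather than stateWordList[0]) would show the full state name to the user.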