When I run this code:
from nltk import NaiveBayesClassifier,classify
import USSSALoader
import random

class genderPredictor():

    def getFeatures(self):
        if self._loadNames() != None:
            maleNames,femaleNames=self._loadNames()
        else:
            print "There is no training file."
            return
        featureset = list()
        for nameTuple in maleNames:
            features = self._nameFeatures(nameTuple[0])
            featureset.append((features,'M'))
        for nameTuple in femaleNames:
            features = self._nameFeatures(nameTuple[0])
            featureset.append((features,'F'))
        return featureset

    def trainAndTest(self,trainingPercent=0.80):
        featureset = self.getFeatures()
        random.shuffle(featureset)
        name_count = len(featureset)
        cut_point=int(name_count*trainingPercent)
        train_set = featureset[:cut_point]
        test_set = featureset[cut_point:]
        self.train(train_set)
        return self.test(test_set)

    def classify(self,name):
        feats=self._nameFeatures(name)
        return self.classifier.classify(feats)

    def train(self,train_set):
        self.classifier = NaiveBayesClassifier.train(train_set)
        return self.classifier

    def test(self,test_set):
        return classify.accuracy(self.classifier,test_set)

    def getMostInformativeFeatures(self,n=5):
        return self.classifier.most_informative_features(n)

    def _loadNames(self):
        return USSSALoader.getNameList()

    def _nameFeatures(self,name):
        name=name.upper()
        return {
            'last_letter': name[-1],
            'last_two' : name[-2:],
            'last_is_vowel' : (name[-1] in 'AEIOUY')
        }

if __name__ == "__main__":
    gp = genderPredictor()
    accuracy=gp.trainAndTest()
and self._loadNames() returns None, I get this error (raised inside the imported random module):
shuffle C:\Python27\lib\random.py 285
TypeError: object of type 'NoneType' has no len()
This happened because, although I put a return statement in getFeatures(self), the flow continues into the next class method (trainAndTest(self,trainingPercent=0.80)), which calls the random module (random.shuffle(featureset)).
So, I'd like to know: how can I stop the flow not only in the getFeatures(self) method, but in the entire class that contains it?
By the way, thanks Stephen Holiday for sharing the code.
This happened because, although I put a return statement in getFeatures(self), the flow continues into the next class method (trainAndTest(self,trainingPercent=0.80)), which calls the random module (random.shuffle(featureset)).
An important thing to remember is that None is a perfectly valid value. The return statement in your getFeatures() is doing exactly what it is told: it hands None back to the caller, and the caller carries on. Only an exception, or a check that you write explicitly, will stop that flow.
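To make that concrete, here is a minimal sketch of the same situation without NLTK. The function names get_features and train_and_test are hypothetical stand-ins for the methods in the question, not the original code.

```python
def get_features(names):
    """Stand-in for getFeatures(): a bare return hands None to the caller."""
    if names is None:
        print("There is no training file.")
        return  # same as `return None`; only this function stops here
    return [(name, "M") for name in names]


def train_and_test(names):
    """Stand-in for trainAndTest(): it keeps running after the call returns."""
    featureset = get_features(names)
    # Execution always reaches this line, whatever get_features() returned.
    return featureset is None


print(train_and_test(None))      # the caller received None and kept going
print(train_and_test(["JOHN"]))  # the caller received a real list
```

The return ends get_features() only. Nothing tells train_and_test() to stop, so it has to look at the value it got back.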
Instead of asking how you can "return from the class", what you might want to look into is checking the return values of the functions you call and making sure they are what you expect before you proceed. There are two places you could do this:
def trainAndTest(self,trainingPercent=0.80):
    featureset = self.getFeatures()
    ...

def _loadNames(self):
    return USSSALoader.getNameList()
In the first spot, you could check if featureset is None, and react if it is.
In the second spot, instead of blindly returning, you could check it first and react there.
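As a rough sketch of the first option, the check could sit right after the call to getFeatures(). The class below is a cut-down, hypothetical stand-in for genderPredictor (it takes the featureset in its constructor and skips the NLTK training step), and returning None from trainAndTest() is only one possible reaction.

```python
import random


class GenderPredictorSketch(object):
    """Hypothetical stand-in for genderPredictor; no NLTK required."""

    def __init__(self, featureset):
        # May be None, which mimics a missing training file.
        self._featureset = featureset

    def getFeatures(self):
        return self._featureset

    def trainAndTest(self, trainingPercent=0.80):
        featureset = self.getFeatures()
        if featureset is None:
            # React here instead of letting None reach random.shuffle().
            return None
        random.shuffle(featureset)
        cut_point = int(len(featureset) * trainingPercent)
        # The real class would train and test here; the sketch only splits.
        return featureset[:cut_point], featureset[cut_point:]
```

With this guard in place, GenderPredictorSketch(None).trainAndTest() returns None instead of failing inside random.shuffle(). The code that calls trainAndTest() then has to check for None in turn, which is one reason raising an exception is often the cleaner choice.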
Second, you have the option of raising exceptions. An exception signals that the code has encountered an error and can't continue. It is then the responsibility of the calling function to either handle it or let it propagate up the call chain. If nothing handles the exception, your application will crash. As you can see, an exception is already being raised from the random module, because you are allowing None to make its way into the shuffle call.
names = USSSALoader.getNameList()
if names is None:
    # raise an exception?
    # do something else?
    # ask the user to do something?
    pass  # placeholder until one of the options above is chosen
The question at that point is: what do you want your program to do when it gets None instead of a valid list? Do you want an exception similar to the one being raised by random, but more helpful and specific to your application? Or maybe you just want to call some other method that gets a default list. Is not having the names list even a situation where your application can do anything other than exit? If not, that is an unrecoverable situation.
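If a fallback makes sense for your application, the "default list" option might look like the sketch below. Everything in it is an assumption for illustration: load_names, its get_name_list parameter (standing in for USSSALoader.getNameList) and the DEFAULT_NAMES data are made up.

```python
# Made-up placeholder data in the (maleNames, femaleNames) shape that the
# question's code unpacks; each entry is a tuple whose first item is the name.
DEFAULT_NAMES = ([("JOHN",)], [("MARY",)])


def load_names(get_name_list, default=DEFAULT_NAMES):
    """Return the loaded names, or `default` when the loader yields None.

    `get_name_list` is a hypothetical stand-in for USSSALoader.getNameList.
    """
    names = get_name_list()
    if names is None:
        return default
    return names
```

When there is no sensible default, the situation is unrecoverable and raising, as in the snippet that follows, is the more honest choice.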
names = USSSALoader.getNameList()
if names is None:
    raise ValueError("USSSALoader didn't return any "
                     "valid names! Can't continue!")
Update
From your comment, I wanted to add the specific handling you asked for. Python has a handful of built-in exception types to represent various circumstances. The one you would most likely want to raise is an IOError, indicating that the file could not be found. I assume "file" means whatever file USSSALoader.getNameList() needs to use and can't find.
names = USSSALoader.getNameList()
if names is None:
    raise IOError("No USSSALoader file found")
At this point, unless some function higher up the call chain handles it, your program will terminate with a traceback.
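If you would rather exit cleanly than show a traceback, a caller higher up can catch the exception. The following is a sketch under assumptions: load_names and main are hypothetical helpers, and get_name_list stands in for USSSALoader.getNameList so the example runs without that module.

```python
import sys


def load_names(get_name_list):
    """Raise IOError when the loader (a stand-in for getNameList) yields None."""
    names = get_name_list()
    if names is None:
        raise IOError("No USSSALoader file found")
    return names


def main(get_name_list):
    """Top-level caller: handles the error instead of letting it propagate."""
    try:
        male_names, female_names = load_names(get_name_list)
    except IOError as err:
        # On Python 3, IOError is an alias of OSError; this clause works on both.
        sys.stderr.write("Cannot train: %s\n" % err)
        return 1
    print("Loaded %d male and %d female names"
          % (len(male_names), len(female_names)))
    return 0
```

In a script, the `if __name__ == "__main__":` block could end with sys.exit(main(USSSALoader.getNameList)), so a missing file produces a short message and a non-zero exit status.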