Search code examples
pythonpython-2.7nltktextblob

Error when using python textblob library tagger


I had the textblob library working fine for a while, but decided to install (using easy_install) an additional library (page here) claiming faster and more accurate tagging.

I couldn't get it working so I uninstalled it, but it seems to have messed with the tagging function in TextBlob. I've uninstalled and reinstalled both nltk and TextBlob numerous times with both pip and easy_install, and made sure they're up to date.

Here is an example of a simple script which generates the error:

from textblob import TextBlob

blob = TextBlob("This is a sentence")
print repr(blob.tags)

and the error printed:

    Traceback (most recent call last):
  File "tesst.py", line 5, in <module>
    print repr(blob.tags)
  File "C:\Users\Emmet\Anaconda\lib\site-packages\textblob\decorators.py", line 24, in __get__
    value = obj.__dict__[self.func.__name__] = self.func(obj)
  File "C:\Users\Emmet\Anaconda\lib\site-packages\textblob\blob.py", line 445, in pos_tags
    for word, t in self.pos_tagger.tag(self.raw)
  File "C:\Users\Emmet\Anaconda\lib\site-packages\textblob\decorators.py", line 35, in decorated
    return func(*args, **kwargs)
  File "C:\Users\Emmet\Anaconda\lib\site-packages\textblob\en\taggers.py", line 34, in tag
    tagged = nltk.tag.pos_tag(text)
  File "C:\Users\Emmet\Anaconda\lib\site-packages\nltk\tag\__init__.py", line 110, in pos_tag
    tagger = PerceptronTagger()
  File "C:\Users\Emmet\Anaconda\lib\site-packages\nltk\tag\perceptron.py", line 141, in __init__
    self.load(AP_MODEL_LOC)
  File "C:\Users\Emmet\Anaconda\lib\site-packages\nltk\tag\perceptron.py", line 209, in load
    self.model.weights, self.tagdict, self.classes = load(loc)
  File "C:\Users\Emmet\Anaconda\lib\site-packages\nltk\data.py", line 801, in load
    opened_resource = _open(resource_url)
  File "C:\Users\Emmet\Anaconda\lib\site-packages\nltk\data.py", line 924, in _open
    return urlopen(resource_url)
  File "C:\Users\Emmet\Anaconda\lib\urllib2.py", line 154, in urlopen
    return opener.open(url, data, timeout)
  File "C:\Users\Emmet\Anaconda\lib\urllib2.py", line 431, in open
    response = self._open(req, data)
  File "C:\Users\Emmet\Anaconda\lib\urllib2.py", line 454, in _open
    'unknown_open', req)
  File "C:\Users\Emmet\Anaconda\lib\urllib2.py", line 409, in _call_chain
    result = func(*args)
  File "C:\Users\Emmet\Anaconda\lib\urllib2.py", line 1265, in unknown_open
    raise URLError('unknown url type: %s' % type)
urllib2.URLError: <urlopen error unknown url type: c>

You can see that the error actually mentions the perceptron tagger. Is there any way to more thoroughly remove any references there may be to the alternate tagger?

Also note that only the "tags" function has been affected.


Solution

  • I found out why I was having trouble with the ap tagger. My issue is solved here. More specifically, by the comment "Another option is to install nltk and then change "from textblob.packages import nltk" to "import nltk" [in the taggers.py] file."

    (Note that this doesn't correspond to the error message above: that error was coming up without aptagger installed. I was getting another error with it installed, and this is a solution for that.)