Tags: python, machine-learning, deep-learning, spacy, attributeerror

AttributeError: type object 'EnglishDefaults' has no attribute 'create_tokenizer'


import os
import torch
import numpy as np
import random
import spacy
from bpemb import BPEmb
nlp = spacy.load("en_core_web_sm")
tokenizer = nlp.Defaults.create_tokenizer(nlp)  # this line raises the AttributeError

This is my code, and whenever I try to run it, the following error shows up:

AttributeError: type object 'EnglishDefaults' has no attribute 'create_tokenizer'

Solution
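
  • The error itself is a version issue: Defaults.create_tokenizer existed in
    spaCy v2 but was removed in spaCy v3, so code written against the old API
    raises this AttributeError on a v3 install. A quick way to check which
    version is installed:

    import spacy
    print(spacy.__version__)  # 3.x no longer provides Defaults.create_tokenizer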

  • Have you considered using the built-in Tokenizer class, which, according to the documentation, can be used to create a new tokenizer?

    import spacy
    from spacy.tokenizer import Tokenizer
    nlp = spacy.load("en_core_web_sm")
    # create a blank tokenizer that shares the pipeline's vocab
    tokenizer = Tokenizer(nlp.vocab)
    print(tokenizer)
    

    result:

    $ python3 main.py
    <spacy.tokenizer.Tokenizer object at 0x13e7a52d0>
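
  • One caveat worth knowing: a Tokenizer built from the vocab alone has no
    prefix, suffix, or infix rules, so it splits on whitespace only. A minimal
    sketch of the difference, assuming the snippet above has already run:

    doc = tokenizer("Hello, world!")
    print([t.text for t in doc])     # ['Hello,', 'world!'] - punctuation stays attached

    # the pipeline's fully configured tokenizer is available directly
    doc = nlp.tokenizer("Hello, world!")
    print([t.text for t in doc])     # ['Hello', ',', 'world', '!']

    If the goal is simply to reproduce what nlp itself does, nlp.tokenizer is
    the closest spaCy v3 replacement for the old Defaults.create_tokenizer call.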