Tags: python, nlp, huggingface-transformers, huggingface-tokenizers, huggingface

ImportError when importing BloomTokenizer from transformers in Python


I am trying to import BloomTokenizer from transformers:

from transformers import BloomTokenizer

and I receive the following error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: cannot import name 'BloomTokenizer' from 'transformers' (/root/miniforge3/envs/pytorch/lib/python3.8/site-packages/transformers/__init__.py)

My version of transformers:

transformers                 4.20.1

What can I do to be able to import BloomTokenizer?


Solution

  • BLOOM has no slow tokenizer class; it only provides a fast tokenizer, so there is no BloomTokenizer name to import. The official documentation is wrong on this point. Use the following instead:

     from transformers import BloomTokenizerFast
     tokenizer = BloomTokenizerFast.from_pretrained("...")
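
If the same code has to run against installations that may or may not expose a given tokenizer class, one option is to look the name up at runtime instead of hard-coding the import. The following is only a minimal sketch: first_available, load_bloom_tokenizer_class and fake_transformers are hypothetical names made up for this illustration and are not part of transformers. The stand-in object mimics an installation that exposes only the fast class, so the lookup can be tried without the library installed.

```python
import importlib
from types import SimpleNamespace


def first_available(module, candidates):
    """Return (name, attribute) for the first candidate name the module exposes."""
    for name in candidates:
        attr = getattr(module, name, None)
        if attr is not None:
            return name, attr
    raise ImportError(f"none of {candidates} could be found")


def load_bloom_tokenizer_class():
    """Hypothetical helper: pick a BLOOM tokenizer class from the installed library."""
    transformers = importlib.import_module("transformers")
    # The slow class is tried first only to show that the lookup skips missing names.
    return first_available(transformers, ["BloomTokenizer", "BloomTokenizerFast"])


# Stand-in for an installation that only ships the fast tokenizer (assumption).
fake_transformers = SimpleNamespace(BloomTokenizerFast=object)
name, cls = first_available(fake_transformers, ["BloomTokenizer", "BloomTokenizerFast"])
print(name)  # BloomTokenizerFast
```

With transformers installed, load_bloom_tokenizer_class() would return the class to call from_pretrained("...") on, in the same way as in the snippet above.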