I am following this tutorial: https://huggingface.co/transformers/training.html. However, I am running into an error, and I think the tutorial is missing an import, but I do not know which one.
These are my current imports:
# Transformers installation
! pip install transformers
# To install from source instead of the last release, comment the command above and uncomment the following one.
# ! pip install git+https://github.com/huggingface/transformers.git
! pip install datasets transformers
from transformers import pipeline
Current code:
from datasets import load_dataset
raw_datasets = load_dataset("imdb")
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
inputs = tokenizer(sentences, padding="max_length", truncation=True)
The error:
NameError Traceback (most recent call last)
<ipython-input-9-5a234f114e2e> in <module>()
----> 1 inputs = tokenizer(sentences, padding="max_length", truncation=True)
NameError: name 'sentences' is not defined
The error states that there is no variable called sentences in scope, so this is not a missing import; a missing import would raise an ImportError, not a NameError. The tutorial presumes you already have a list of sentences and are tokenizing it.
Have a look at the documentation: the first argument can be a string, a list of strings, or a list of lists of strings.
__call__(text: Union[str, List[str], List[List[str]]],...)
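For example, you could build that list from the IMDB dataset you already loaded before calling the tokenizer. A minimal sketch (the train split, the "text" column, and the slice size are my assumptions, not something the tutorial defines):

from datasets import load_dataset
from transformers import AutoTokenizer

raw_datasets = load_dataset("imdb")
tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")

# Use the review texts from the training split as the list of sentences;
# the [:100] slice is only there to keep the example small.
sentences = raw_datasets["train"]["text"][:100]

inputs = tokenizer(sentences, padding="max_length", truncation=True)

(If I remember the tutorial correctly, it goes on to tokenize the whole dataset with raw_datasets.map(...) rather than a standalone sentences variable, so you could also continue from that step instead.)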