When I run the following code to load a dataset from the Hugging Face Hub in Google Colab, I get an error:
! pip install transformers datasets
from datasets import load_dataset
cv_13 = load_dataset("mozilla-foundation/common_voice_13_0", "en", split="train")
<ipython-input-9-4d772f75be89> in <cell line: 3>()
1 from datasets import load_dataset
2
----> 3 cv_13 = load_dataset("mozilla-foundation/common_voice_13_0", "en", split="train")
2 frames
/usr/local/lib/python3.10/dist-packages/datasets/load.py in dataset_module_factory(path, revision, download_config, download_mode, dynamic_modules_path, data_dir, data_files, **download_kwargs)
1505 raise e1 from None
1506 if isinstance(e1, FileNotFoundError):
-> 1507 raise FileNotFoundError(
1508 f"Couldn't find a dataset script at {relative_to_absolute_path(combined_path)} or any data file in the same directory. "
1509 f"Couldn't find '{path}' on the Hugging Face Hub either: {type(e1).__name__}: {e1}"
FileNotFoundError: Couldn't find a dataset script at /content/mozilla-foundation/common_voice_13_0/common_voice_13_0.py or any data file in the same directory. Couldn't find 'mozilla-foundation/common_voice_13_0' on the Hugging Face Hub either: FileNotFoundError: Dataset 'mozilla-foundation/common_voice_13_0' doesn't exist on the Hub. If the repo is private or gated, make sure to log in with `huggingface-cli login`.
The dataset exists on the Hugging Face Hub and loads successfully in my local JupyterLab. What should I do?
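The Common Voice dataset at https://huggingface.co/datasets/mozilla-foundation/common_voice_13_0 is a gated dataset, so you need to be logged in to access it. Your local JupyterLab most likely already has a saved login, which is why it works there, whereas a fresh Colab runtime does not. Log in from Colab as well, e.g. using (prefix the command with ! when running it in a notebook cell):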
huggingface-cli login
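If you would rather not type the token interactively, a minimal sketch of passing it from Python could look like the one below. It assumes you have created an access token in your Hugging Face account settings and stored it in an environment variable named HF_TOKEN; the helper names resolve_hf_token and load_common_voice are made up for illustration. It also assumes a recent release of datasets, where the parameter is called token; older releases used use_auth_token instead. For an interactive prompt inside the notebook, huggingface_hub also provides notebook_login().

```python
import os


def resolve_hf_token(env=None):
    """Return the access token stored in HF_TOKEN, or None if unset or blank."""
    # Assumption: the token lives in an environment variable named HF_TOKEN.
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    return token or None


def load_common_voice(token, config="en", split="train"):
    """Load the gated Common Voice 13 dataset with an explicit access token."""
    # Imported here so the token helper works even before `datasets` is installed.
    from datasets import load_dataset

    return load_dataset(
        "mozilla-foundation/common_voice_13_0",
        config,
        split=split,
        # Assumption: recent `datasets`; older releases take `use_auth_token`.
        token=token,
    )
```

In Colab you would then call cv_13 = load_common_voice(resolve_hf_token()) after setting HF_TOKEN. Passing the token explicitly avoids depending on whether a login was saved in that particular runtime.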