I have created a custom Hugging Face dataset containing images and ground-truth data read from a JSON Lines file. I want to save it to a local folder and then load it as-is in other notebooks, but I have not found out how to do this.
DatasetDict({
    train: Dataset({
        features: ['image', 'id', 'ground_truth'],
        num_rows: 7
    })
    test: Dataset({
        features: ['image', 'id', 'ground_truth'],
        num_rows: 4
    })
})
According to the Hugging Face documentation, you can use save_to_disk,
which "saves a dataset to a dataset directory, or in a filesystem".
Example:
ds.save_to_disk("path/to/dataset/dir")