python, huggingface, huggingface-datasets

How to save a custom dataset to a local folder


I have created a custom Hugging Face dataset containing images and ground-truth data from a JSON Lines file. I want to save it to a local folder so that I can load it as-is in other notebooks, but I could not find out how to do this.

DatasetDict({
    train: Dataset({
        features: ['image', 'id', 'ground_truth'],
        num_rows: 7
    })
    test: Dataset({
        features: ['image', 'id', 'ground_truth'],
        num_rows: 4
    })
})

Solution

  • According to the Hugging Face documentation, you can use save_to_disk, which "saves a dataset to a dataset directory, or in a filesystem".

    Example:

    ds.save_to_disk("path/to/dataset/dir")