huggingface-datasets

How do I convert a pandas DataFrame to a Hugging Face Dataset object?


I have the following df:

import pandas as pd
df = pd.DataFrame({"foo": ["bar", "baz"]})

How do I convert it to a Hugging Face Dataset?


Solution

  • The datasets library has a built-in way to convert a pandas DataFrame to a Hugging Face Dataset:

    from datasets import Dataset
    dataset = Dataset.from_pandas(df)
    
    # if you want to go back to a pandas dataframe
    df = dataset.to_pandas()
    

    more info here: https://huggingface.co/docs/datasets/main/en/loading#inmemory-data
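
    One thing to watch for: if the DataFrame has been filtered or re-indexed, the pandas index may come along as an extra column (typically named `__index_level_0__`). The sketch below assumes you do not need the index and passes `preserve_index=False` to drop it; the three-row DataFrame and the variable names are made up for illustration.

    ```python
    import pandas as pd
    from datasets import Dataset

    # Illustrative data, slightly larger than the df in the question.
    df = pd.DataFrame({"foo": ["bar", "baz", "qux"]})

    # Filtering leaves a non-default index (1, 2 instead of 0, 1).
    filtered = df[df["foo"] != "bar"]

    # preserve_index=False tells from_pandas to discard the pandas index,
    # so only the real data columns end up in the Dataset.
    without_index = Dataset.from_pandas(filtered, preserve_index=False)
    print(without_index.column_names)

    # Converting back gives a DataFrame with a fresh default index.
    round_trip = without_index.to_pandas()
    print(round_trip)
    ```

    Calling `filtered.reset_index(drop=True)` before `Dataset.from_pandas` should have a similar effect if you prefer to handle the index on the pandas side.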