How do I load an image dataset into a DataFrame?

I have a dataset that contains only images, and I need to load it into a pandas DataFrame.

I tried to do this with the load_files function from sklearn:

data = load_files(path)

I then tried to build a DataFrame from the output of that function with pd.DataFrame, but it didn't work.

The result should be a table of file paths and their labels.


  • You can collect the files with pathlib and build the DataFrame from a list of dicts:

    import pandas as pd
    from pathlib import Path

    df = pd.DataFrame(
        [{"Filepath": str(img),
          "Labels": img.parent.name,  # assumption: the containing folder's name is the label
          "View": f'<img src="{img}" width="200" height="200">'}  # View is optional
         for img in Path("images").rglob("*.png")]  # or `*.*` if you have mixed formats
    )  # appending `.style` here is optional

    Directory tree used:

    ┣━━ category1
    ┃   ┗━━ foo
    ┃       ┣━━ gis_exchange.png
    ┃       ┗━━ stackoverflow.png
    ┗━━ category2
        ┗━━ bar
            ┣━━ ask_ubuntu.png
            ┗━━ stack_apps.png
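
For a tree like the one above, the label could also be taken from the top-level folder (category1, category2) rather than the folder that directly holds the file. Here is a minimal sketch of that variant. It assumes the tree sits under a root folder called images and that every image lives inside some category folder; the names images_to_dataframe and IMAGE_SUFFIXES are made up for illustration, and the temporary files are empty placeholders rather than real images.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Assumed set of extensions; adjust to whatever the dataset actually contains.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def images_to_dataframe(root):
    """Return one row per image under `root`, labelled by its top-level folder."""
    root = Path(root)
    rows = []
    for img in sorted(root.rglob("*")):
        if img.is_file() and img.suffix.lower() in IMAGE_SUFFIXES:
            # First path component below `root`, e.g. "category1".
            # Assumes no image sits directly in `root` itself.
            label = img.relative_to(root).parts[0]
            rows.append({"Filepath": str(img), "Labels": label})
    return pd.DataFrame(rows, columns=["Filepath", "Labels"])


# Recreate the tree shown above in a temporary folder so the sketch is runnable.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "images"
    for rel in [
        "category1/foo/gis_exchange.png",
        "category1/foo/stackoverflow.png",
        "category2/bar/ask_ubuntu.png",
        "category2/bar/stack_apps.png",
    ]:
        path = root / rel
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch()  # empty placeholder file, not a real image
    df = images_to_dataframe(root)

print(df["Labels"].tolist())
```

On a real dataset you would call images_to_dataframe("images") directly. To label by the immediate parent folder instead (foo, bar), replace the label line with img.parent.name.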