python image python-imaging-library jpeg huggingface-datasets

Why can't I use Image.open on PIL files?


I have written some simple code to download an audio dataset from ccmusic on Hugging Face. The problem is that I can't open the PIL images from this dataset with Image.open(). Can somebody explain why that is, and how to fix it?

If I run my code:

import datasets
import PIL
from PIL import Image
from datasets import load_dataset

dataset = load_dataset("ccmusic-database/music_genre", split="test")
output = dataset
im = Image.open(output[0])
im.show()

I get the following error:

Traceback (most recent call last):
  File "/Users/abc/Desktop/Project Python Audio/.venv/lib/python3.11/site-packages/PIL/Image.py", line 3247, in open
    fp.seek(0)
    ^^^^^^^
AttributeError: 'dict' object has no attribute 'seek'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/abc/Desktop/Project Python Audio/loading_dataset.py", line 8, in <module>
    im = Image.open(output[0])
         ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/abc/Desktop/Project Python Audio/.venv/lib/python3.11/site-packages/PIL/Image.py", line 3249, in open
    fp = io.BytesIO(fp.read())
                    ^^^^^^^
AttributeError: 'dict' object has no attribute 'read'

However, if I print the file with:

import datasets
import PIL
from PIL import Image
from datasets import load_dataset

dataset = load_dataset("ccmusic-database/music_genre", split="test")
output = dataset
print(output[0])

I get:

{'image': <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=476x349 at 0x101BE6E10>, 'fst_level_label': 1, 'sec_level_label': 6, 'thr_level_label': 6, 'duration': 416.0533106575964}

So it seems the PIL JPEG image is in the right place in output[0], but Image.open is unable to open it. What is going on here, and how can I view the image?


Solution

  • If you look at the type of dataset[0] you will see it is a dict:

    print(type(dataset[0]))
    <class 'dict'>
    

    If you print it:

    print(dataset[0])
    {'image': <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=476x349 at 0x101BE6E10>,
    'fst_level_label': 1,
    'sec_level_label': 6,
    'thr_level_label': 6,
    'duration': 416.0533106575964}
    

    you can see it is a dict because it is in curly braces. Each "key" is to the left of a colon and the corresponding "value" is to the right.

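    That means if you access dataset[0]["image"], the object you get is already a PIL Image, so you can use it (or copy it) exactly as if you'd created it with Image.open(). Image.open() expects a filename or a file-like object, which is why passing the whole dict fails with the 'seek' and 'read' errors.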
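
To make this concrete, here is a minimal sketch that runs without downloading anything. It assumes a stand-in row: a plain dict holding a blank in-memory image in place of the real spectrogram, with made-up label values. The helper name image_from_row is hypothetical and not part of PIL or datasets.

```python
from PIL import Image

# Stand-in for dataset[0]: a dict whose "image" value is already a PIL image.
# The blank image and the label value are placeholders for illustration only.
row = {
    "image": Image.new("RGB", (476, 349), color="white"),
    "fst_level_label": 1,
}


def image_from_row(row, key="image"):
    """Return the PIL image stored under `key` in a row (hypothetical helper)."""
    value = row[key]
    if not isinstance(value, Image.Image):
        raise TypeError(f"row[{key!r}] is a {type(value).__name__}, not a PIL image")
    return value


# No Image.open() needed: the value is used directly.
im = image_from_row(row)
print(type(row).__name__)  # dict
print(im.size, im.mode)    # (476, 349) RGB

# im.show()                  # would open the image in the default viewer
# im.save("first_item.jpg")  # would write it to disk
```

With the real dataset from the question, the equivalent would be im = dataset[0]["image"] followed by im.show().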