Tags: python, numpy, jupyter-notebook, anaconda, scikit-image

KeyError: class 'numpy.object_' while downloading image dataset using imread


I am trying to download images from URLs using skimage's imread. After downloading about 700 images, I get KeyError: <class 'numpy.object_'>. I am not very familiar with NumPy or the Conda libraries. Any help would be appreciated.

import os
from skimage import io

for i in range(len(classes)):
    if not os.path.exists(saved_dirs[i]):
        os.mkdir(saved_dirs[i])  # os.mkdir() requires the directory path
    saved_dir = saved_dirs[i]
    for url in urls[i]:
        # print(url)
        img = io.imread(url)
        saved_path = os.path.join(saved_dir, url[-20:])
        io.imsave(saved_path, img)

Trigger URL: https://requestor-proxy.figure-eight.com/figure_eight_datasets/open-images/test/f54ecb040198098c.jpg

Printing the trigger image:

[array([[[ 70, 141, 143],
        [ 71, 142, 146],
        [ 67, 141, 144],
        ...,
        [242, 253, 255],
        [242, 253, 255],
        [242, 253, 255]],

       [[ 61, 135, 136],
        [ 64, 135, 137],
        [ 63, 134, 138],
        ...,
        [242, 253, 255],
        [242, 253, 255],
        [242, 253, 255]],

       [[ 57, 133, 133],
        [ 63, 137, 138],
        [ 65, 136, 138],
        ...,
        [242, 253, 255],
        [242, 253, 255],
        [242, 253, 255]],

       ...,

       [[246, 244, 255],
        [244, 242, 253],
        [244, 242, 253],
        ...,
        [ 21,  43, 100],
        [ 20,  40, 101],
        [ 22,  42, 103]],

       [[244, 243, 251],
        [243, 241, 252],
        [242, 240, 251],
        ...,
        [ 26,  49, 103],
        [ 25,  45, 104],
        [ 25,  45, 106]],

       [[244, 243, 251],
        [243, 242, 250],
        [243, 242, 250],
        ...,
        [ 27,  48, 103],
        [ 26,  45, 104],
        [ 26,  44, 106]]], dtype=uint8)
 array(<PIL.MpoImagePlugin.MpoImageFile image mode=RGB size=3872x2592 at 0x17D8F7434E0>,
      dtype=object)]

The output:

KeyError                                  Traceback (most recent call last)
<ipython-input-22-b76e704a99f8> in <module>
      8         img = io.imread(url)
      9         saved_path = os.path.join(saved_dir, url[-20:])
---> 10         io.imsave(saved_path, img)

~\Anaconda3\lib\site-packages\skimage\io\_io.py in imsave(fname, arr, plugin, **plugin_args)
    137         if fname.lower().endswith(('.tiff', '.tif')):
    138             plugin = 'tifffile'
--> 139     if is_low_contrast(arr):
    140         warn('%s is a low contrast image' % fname)
    141     if arr.dtype == bool:

~\Anaconda3\lib\site-packages\skimage\exposure\exposure.py in is_low_contrast(image, fraction_threshold, lower_percentile, upper_percentile, method)
    501         image = rgb2gray(image)
    502 
--> 503     dlimits = dtype_limits(image, clip_negative=False)
    504     limits = np.percentile(image, [lower_percentile, upper_percentile])
    505     ratio = (limits[1] - limits[0]) / (dlimits[1] - dlimits[0])

~\Anaconda3\lib\site-packages\skimage\util\dtype.py in dtype_limits(image, clip_negative)
     55         warn('The default of `clip_negative` in `skimage.util.dtype_limits` '
     56              'will change to `False` in version 0.15.')
---> 57     imin, imax = dtype_range[image.dtype.type]
     58     if clip_negative:
     59         imin = 0

KeyError: <class 'numpy.object_'>

Solution

  • Edit

    I reproduced your results with:

    from skimage import io
    url = "https://requestor-proxy.figure-eight.com/figure_eight_datasets/open-images/test/f54ecb040198098c.jpg"
    img = io.imread(url)
    
    print(img.shape)
    

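    It turns out that imread returns two frames for this file: img[0] is the image you want, and img[1] is a PIL.MpoImagePlugin.MpoImageFile object that Pillow (the library skimage uses to read images) did not convert to a pixel array. Because of that second element, the result has dtype=object, and that is the dtype imsave fails on.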
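
To make the failure concrete, the dictionary lookup at the bottom of the traceback can be mimicked in a few lines. This is only a sketch: `DTYPE_RANGE` is a simplified stand-in for skimage's internal table, `limits_for` is a hypothetical helper name, and a plain `object()` stands in for the Pillow image object.

```python
import numpy as np

# Simplified stand-in for skimage's internal dtype_range table (assumption:
# the real table has more entries, but none for object arrays).
DTYPE_RANGE = {np.uint8: (0, 255), np.uint16: (0, 65535)}


def limits_for(image):
    """Look up intensity limits by dtype, as dtype_limits does in the traceback."""
    return DTYPE_RANGE[image.dtype.type]


pixels = np.zeros((4, 5, 3), dtype=np.uint8)
print(limits_for(pixels))

# Two "frames" packed into one array: real pixels plus an unconverted object.
frames = np.empty(2, dtype=object)
frames[0] = pixels
frames[1] = object()  # placeholder for the PIL MpoImageFile object

try:
    limits_for(frames)
except KeyError as err:
    # The key is numpy.object_, the same class named in the question's error.
    print("KeyError:", err)
```

The lookup works for the uint8 array and fails for the object array, which matches where the question's traceback ends.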

    Check this issue out.

    For now, a quick workaround should be fine:

    if img.shape[0] == 2:
      img = img[0]
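
A slightly more defensive variant of the workaround is sketched below. It checks the dtype rather than `shape[0] == 2`, so a genuine image that happens to be 2 pixels tall is left alone. The helper name `first_frame` is hypothetical, and the sketch assumes the multi-frame result looks like the one printed in the question (a 1-D object array whose first element is the real pixel data).

```python
import numpy as np


def first_frame(img):
    """Return the first frame if img is a 1-D object array of frames.

    Ordinary pixel arrays (uint8, float, ...) are returned unchanged.
    """
    img = np.asarray(img)
    if img.dtype == object and img.ndim == 1 and len(img) > 0:
        return np.asarray(img[0])
    return img


# Simulated multi-frame result: real pixels plus an unconverted object.
pixels = np.zeros((4, 5, 3), dtype=np.uint8)
frames = np.empty(2, dtype=object)
frames[0] = pixels
frames[1] = object()  # placeholder for the PIL MpoImageFile object

fixed = first_frame(frames)
print(fixed.shape, fixed.dtype)
```

In the download loop this would be used as `img = first_frame(io.imread(url))` before calling `io.imsave`.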
    

    Original

    Could you share the URL that triggers this error? It could be that slicing the URL with url[-20:] produces a file name with an odd extension. I'd also recommend printing img, img.shape, and img.dtype to get a better idea of what's going on.
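
The suggested diagnostics can be wrapped in a small helper so the summary can be printed for each URL right after `io.imread`. This is a minimal sketch; `describe_image` is a hypothetical name, not part of skimage.

```python
import numpy as np


def describe_image(img):
    """Summarize the type, shape, and dtype of whatever imread returned."""
    arr = np.asarray(img)
    return f"type={type(img).__name__} shape={arr.shape} dtype={arr.dtype}"


# An ordinary RGB image versus a two-element object array.
normal = np.zeros((4, 5, 3), dtype=np.uint8)
odd = np.empty(2, dtype=object)

print(describe_image(normal))
print(describe_image(odd))
```

A result whose dtype is `object` is the sign that the file did not decode into a plain pixel array.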