
Getting black images while applying discrete cosine transform (DCT) on image



My code looks like this:

import cv2
import numpy as np
import matplotlib.pyplot as plt


def perform_dct(image_path):
    # Load the image in grayscale
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    if img is None:
        raise ValueError(f"Image at {image_path} could not be loaded.")

    # Resize the image to 224x224 (size expected by ViT model)
    img = cv2.resize(img, (224, 224))

    # Perform DCT (cv2.dct requires a floating-point input)
    dct_img = cv2.dct(np.float32(img))

    # Normalize the DCT image to the range [0, 1]
    dct_img_min = np.min(dct_img)
    dct_img_max = np.max(dct_img)
    dct_img_normalized = (dct_img - dct_img_min) / (dct_img_max - dct_img_min)

    # Scale to [0, 255] and convert to a 3-channel image
    dct_img_normalized = (dct_img_normalized * 255).astype(np.uint8)
    dct_img_rgb = cv2.merge([dct_img_normalized, dct_img_normalized, dct_img_normalized])

    plt.imshow(dct_img_rgb)
    plt.title("DCT of image")
    plt.show()

    return dct_img_rgb

I am trying to display the discrete cosine transform (DCT) of an image, but the result is completely black. I cannot find the error in my code. Any help would be appreciated.


Solution

  • I think this result is expected. The output of the DCT can contain a few very large coefficients. Check the quantiles of the dct_img array:

    np.quantile(dct_img,[0,0.01,0.99,1])
    

    The values of dct_img span this range:

    min_value = -2537.42236328
    1% quantile = -71.6647377
    99% quantile = 69.59934616 
    max_value = 36856.75390625
    

    With such a large maximum, the min/max normalisation squeezes everything between the 1% and 99% quantiles (about 98% of the values) into 15 or 16 out of 255 after the uint8 conversion. That is why the image looks almost entirely black.
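
To make the arithmetic concrete, here is a small sketch that pushes the four quantile values listed above through the same min/max normalisation and uint8 conversion used in the question. The array name `samples` is only illustrative, and the four numbers stand in for the full coefficient array.

```python
import numpy as np

# min, 1% quantile, 99% quantile and max of dct_img, as reported above
samples = np.array(
    [-2537.42236328, -71.6647377, 69.59934616, 36856.75390625],
    dtype=np.float32,
)

# Same min/max normalisation and uint8 conversion as in the question
normalized = (samples - samples.min()) / (samples.max() - samples.min())
as_uint8 = (normalized * 255).astype(np.uint8)

print(as_uint8)  # [  0  15  16 255]
```

The two quantiles end up one grey level apart, so nearly all pixels share the same very dark shade.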

    I'm not sure why you want to plot the dct_img array. Plotting its coefficients is not a required step for further analysis. But if you really want to visualize it, I would suggest first compressing its values with a signed log:

    dct_img = cv2.dct(np.float32(img))

    # Compress the dynamic range so the extreme `max` and `min` values no longer dominate
    scaled_dct = np.sign(dct_img) * np.log(1 + np.abs(dct_img))

    # Normalize the scaled data to [0, 255]. The `normalize` helper was not shown
    # in the original answer; min/max scaling with cv2.normalize is assumed here.
    scaled_dct_normalized = cv2.normalize(scaled_dct, None, 0, 255, cv2.NORM_MINMAX)

    # Plot the output
    plt.imshow(scaled_dct_normalized, cmap='gray', vmin=0, vmax=255)
    plt.title("Scaled DCT of image")
    plt.show()
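
For reference, the log scaling and normalisation can also be written as one NumPy-only helper. This is a minimal sketch: the function name `log_scale_to_uint8` is hypothetical, and min/max scaling to [0, 255] is an assumption about what the normalisation step is meant to do. It is applied here to the four quantile values from above rather than to a real image.

```python
import numpy as np


def log_scale_to_uint8(coeffs):
    """Signed log compression followed by min/max scaling to [0, 255]."""
    coeffs = np.asarray(coeffs, dtype=np.float64)
    # log1p(|x|) shrinks large magnitudes; np.sign restores the original sign
    compressed = np.sign(coeffs) * np.log1p(np.abs(coeffs))
    lo, hi = compressed.min(), compressed.max()
    if hi == lo:
        # Constant input: avoid dividing by zero
        return np.zeros(coeffs.shape, dtype=np.uint8)
    scaled = (compressed - lo) / (hi - lo) * 255.0
    return scaled.astype(np.uint8)


# min, 1% quantile, 99% quantile and max of dct_img, as reported above
samples = np.array([-2537.42236328, -71.6647377, 69.59934616, 36856.75390625])
out = log_scale_to_uint8(samples)
print(out)
```

After log compression the two quantiles land far apart in the grey range instead of one level apart, so the structure of the coefficients becomes visible. On a real image, the array returned by `cv2.dct` would be passed in place of `samples`.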
    
