Search code examples
pythonnumpypython-imaging-librarydicompydicom

Pydicom error while resizing dicom file - ValueError: The length of the pixel data in the dataset doesn't match the expected length


I am new to working with DICOM files.

I want to resize a non-square DICOM image, e.g., 512 x 768 pixels and rescale it to 512 x 512 without either dimension being squashed or stretched. So this entails resizing the largest dimension to 512 while preserving the aspect ratio and then padding with zeros along the shortest dimension.

I adopted the numpy and PIL resize method to do that after reading the dicom file using the pydicom python package. This code runs and stores the output file successfully.

But when I try to access the output dicom files pixel_array separately to view the CT image, it throws me the following error -

Traceback (most recent call last):
  File "/Users/abhimanyu/Maastricht/Internship/medical_imaging/read_dicom_file.py", line 32, in <module>
    arr = ds.pixel_array
          ^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/pydicom/dataset.py", line 1955, in pixel_array
    self.convert_pixel_data()
  File "/opt/homebrew/lib/python3.11/site-packages/pydicom/dataset.py", line 1512, in convert_pixel_data
    self._convert_pixel_data_without_handler()
  File "/opt/homebrew/lib/python3.11/site-packages/pydicom/dataset.py", line 1624, in _convert_pixel_data_without_handler
    raise last_exception  # type: ignore[misc]
    ^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/pydicom/dataset.py", line 1604, in _convert_pixel_data_without_handler
    self._do_pixel_data_conversion(handler)
  File "/opt/homebrew/lib/python3.11/site-packages/pydicom/dataset.py", line 1631, in _do_pixel_data_conversion
    arr = handler.get_pixeldata(self)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/pydicom/pixel_data_handlers/numpy_handler.py", line 221, in get_pixeldata
    raise ValueError(
ValueError: The length of the pixel data in the dataset (262144 bytes) doesn't match the expected length (524288 bytes). The dataset may be corrupted or there may be an issue with the pixel data handler.

My function to resize looks like this -

import pydicom
from pydicom.dataset import FileMetaDataset
from PIL import Image
import numpy as np

def resize_dicom(input_path, output_path, target_size):
    """
    Resize a DICOM image while preserving aspect ratio and padding with zeros.
    
    Args:
        input_path (str): Path to the input DICOM file.
        output_path (str): Path to the output DICOM file.
        target_size (tuple): Desired size as (width, height).
    """
    # Load the DICOM file
    dicom_data = pydicom.dcmread(input_path)
    
    # Extract pixel data and convert it to a NumPy array
    pixel_data = dicom_data.pixel_array
    
    # Calculate the desired aspect ratio
    aspect_ratio = dicom_data.Columns / dicom_data.Rows
    
    # Determine the scaling factor to maintain aspect ratio
    target_width, target_height = target_size
    if target_width / aspect_ratio <= target_height:
        new_width = int(target_width)
        new_height = int(target_width / aspect_ratio)
    else:
        new_width = int(target_height * aspect_ratio)
        new_height = int(target_height)

    # Resize the image using PIL
    img = Image.fromarray(pixel_data)
    img = img.resize((new_width, new_height))
    
    # Create a new NumPy array for the resized image
    resized_pixel_data = np.zeros([target_height, target_width], dtype=np.uint8)
    
    # Copy the resized image into the center of the new NumPy array
    x_offset = (target_width - new_width) // 2
    y_offset = (target_height - new_height) // 2
    resized_pixel_data[y_offset:y_offset + new_height, x_offset:x_offset + new_width] = np.array(img)

    # Update DICOM metadata for the resized image
    file_meta = FileMetaDataset()
    p_new = pydicom.Dataset()
    p_new.file_meta = file_meta
    exclude_tags = [(0x0028, 0x0010), (0x0028, 0x0011), (0x7FE0, 0x0010)]
    for elem in dicom_data:
        if elem.tag not in exclude_tags:
            p_new.add(elem)

    p_new.is_little_endian = True
    p_new.SOPInstanceUID = '{}.9999'.format(p_new.SOPInstanceUID)
    p_new.file_meta.TransferSyntaxUID = pydicom.uid.ImplicitVRLittleEndian
    p_new.is_implicit_VR = True
    p_new.Rows = target_width
    p_new.Columns = target_height
    p_new.PixelData = resized_pixel_data.tobytes()

    p_new.save_as(output_path)

I try to read the output dicom file like -

ds = dcmread(file, force=True)
print(ds.Rows, ds.Columns) # This works fine. It gives the correct result
arr = ds.pixel_array # <-- This is where I get the error
plt.imshow(arr, cmap="gray")
plt.show()

I am sure I am doing something wrong while saving the file because it has been corrupted. Although the Rows and Columns are being printed correctly.

In the error, it shows the length of the pixel data in the dataset (262144 bytes) which is = 512*512, but the expected length (524288 bytes) is = 512*512*2. I don't understand where this is coming from.

Can you help me with solving this error?


Solution

  • Since you're dealing with CT images, it is likely that each pixel is represented by more than 8 bits, probably 16 bits. In your resize function, try replacing

    resized_pixel_data = np.zeros([target_height, target_width], dtype=np.uint8)
    

    with

    resized_pixel_data = np.zeros([target_height, target_width], dtype=np.uint16)
    

    From reading the source code of Pydicom, this is how the expected length is calculated:

        rows = cast(int, ds.Rows)
        columns = cast(int, ds.Columns)
        samples_per_pixel = cast(int, ds.SamplesPerPixel)
        bits_allocated = cast(int, ds.BitsAllocated)
    
        length = rows * columns * samples_per_pixel
        length *= get_nr_frames(ds)
    
        if unit == "pixels":
            return length
    
        # Correct for the number of bytes per pixel
        if bits_allocated == 1:
            # Determine the nearest whole number of bytes needed to contain
            #   1-bit pixel data. e.g. 10 x 10 1-bit pixels is 100 bits, which
            #   are packed into 12.5 -> 13 bytes
            length = length // 8 + (length % 8 > 0)
        else:
            length *= bits_allocated // 8
    
        # DICOM Standard, Part 4, Annex C.7.6.3.1.2
        if ds.PhotometricInterpretation == "YBR_FULL_422":
            length = length // 3 * 2
    
        return length
    

    If my suggestion does not resolve your issue, try printing the values of

    rows = cast(int, ds.Rows)
    columns = cast(int, ds.Columns)
    samples_per_pixel = cast(int, ds.SamplesPerPixel)
    bits_allocated = cast(int, ds.BitsAllocated)
    

    and make sure they are as you expect.