I am trying to list the DICOM tags from .dcm files in an Excel sheet (using pydicom), but certain tags (Patient's Name, Patient ID, etc.) are showing errors during conversion.
Other tags (SOP Class UID, SOP Instance UID, etc.) are showing 'None' in the Excel file, although they contain data in the DICOM file. How can I resolve this?
import xlsxwriter
import sys
import pydicom
import os.path
from pydicom.valuerep import PersonName
keywords = ("Patient's Name",
            "Patient ID",
            "Patient's Birth Date",
            "Patient's Sex",
            "SOP Class UID",
            "SOP Instance UID",
            "Group Length",
            "Manufacturer",
            "Referring Physician's Name",
            "Study ID",
            "Patient Orientation",
            "Series Number",
            "Pixel Data",
            "Group Length",
            "Rows",
            "Columns",
            )
# ...
dcm_files = [r"C:\Users\akhil\Downloads\Sample_Dataset\Sample_Dataset\PRASANNA_KUMARI\21_12_2013_11_13_46_AM\IMG-0001-00001.dcm"]
for dcm_file in dcm_files:
    ds = pydicom.filereader.dcmread(dcm_file)
    workbook = xlsxwriter.Workbook(os.path.basename(dcm_file) + '.xlsx')
    worksheet = workbook.add_worksheet()
    row = 0
    col = 0
    for keyword in keywords:
        value = ds.get(keyword, "None")
        if isinstance(value, list):
            value = ", ".join([str(x) for x in value])
        elif isinstance(value, PersonName):
            value = str(value)
        worksheet.write(row, col, keyword)
        worksheet.write(row + 1, col, value)
        col += 1
    workbook.close()
Some tags from the DICOM file:
(0008, 0005) Specific Character Set CS: 'ISO_IR 100'
(0008, 0016) SOP Class UID UI: Secondary Capture Image Storage
(0008, 0018) SOP Instance UID UI: 1.2.300.0.7230010.3.1.4.3397350519.8248.1599586949.14
(0008, 0020) Study Date DA: '20200908'
(0008, 0021) Series Date DA: '20200908'
(0008, 0022) Acquisition Date DA: '20200908'
(0008, 0023) Content Date DA: '20200908'
(0008, 0030) Study Time TM: '155900'
(0008, 0031) Series Time TM: '155900'
(0008, 0032) Acquisition Time TM: '155900'
(0008, 0033) Content Time TM: '155900'
(0008, 0050) Accession Number SH: ''
(0008, 0060) Modality CS: 'OT'
(0008, 0064) Conversion Type CS: ''
(0008, 0070) Manufacturer LO: 'SANTESOFT'
(0008, 0090) Referring Physician's Name PN: ''
(0010, 0000) Group Length UL: 48
(0010, 0010) Patient's Name PN: 'NO^NAME'
(0010, 0020) Patient ID LO: '00000001'
(0010, 0030) Patient's Birth Date DA: ''
(0010, 0040) Patient's Sex CS: ''
(0018, 0000) Group Length UL: 14
(0018, 1063) Frame Time DS: "33.0"
You are not using the correct keywords here. First, the DICOM keywords do not have the 's part, e.g. it's called "Patient Name", not "Patient's Name" (this was changed in the DICOM standard about 15 years ago or so).
Second, the keywords do not contain spaces, so if you want to keep the names with spaces for readability, you have to remove the spaces for the lookup, for example:
keywords = ("Patient Name",
            "Patient ID",
            "Patient Birth Date",
            "Patient Sex",
            "SOP Class UID",
            "SOP Instance UID",
            "Group Length",
            "Manufacturer",
            "Referring Physician Name",
            "Study ID",
            "Patient Orientation",
            "Series Number",
            "Group Length",
            "Rows",
            "Columns",
            )
...
for dcm_file in dcm_files:
    ds = pydicom.filereader.dcmread(dcm_file)
    ...
    for keyword in keywords:
        dcm_keyword = keyword.replace(' ', '')  # remove the spaces for the lookup
        value = ds.get(dcm_keyword, "None")
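If you would rather keep the original display names, apostrophes included, the lookup key can also be derived from them. Here is a minimal sketch, assuming a hypothetical helper called `to_keyword` (it is not part of pydicom) that simply drops the `'s` and the spaces:

```python
def to_keyword(display_name: str) -> str:
    """Derive a keyword-style name: "Patient's Name" -> "PatientName"."""
    # Drop the possessive "'s" first, then remove the spaces.
    return display_name.replace("'s", "").replace(" ", "")


display_names = ("Patient's Name", "Referring Physician's Name", "SOP Class UID")
for name in display_names:
    # The readable name can stay in the header row; only the lookup uses the keyword.
    print(f"{name!r} -> {to_keyword(name)!r}")
```

The result could then be passed to `ds.get(...)` in place of `dcm_keyword`. Not every DICOM attribute name maps to its keyword this simply, so treat it as a convenience for names like the ones listed here.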
Note that I have removed all the apostrophes from the tag names, and I have also removed Pixel Data: converting binary data to a string would not work correctly, and you certainly don't want to display the pixel data in an Excel table.
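If binary or multi-valued elements may still end up in the list, a small conversion step before writing the cell can help. This is only a rough sketch, assuming a hypothetical helper `to_cell` (my own name, not a pydicom or xlsxwriter function), and it is shown with plain Python values rather than a real dataset:

```python
from collections.abc import Sequence


def to_cell(value):
    """Convert a looked-up value into something simple enough for a spreadsheet cell."""
    if value is None:
        return ""
    if isinstance(value, (bytes, bytearray)):
        # Binary data (e.g. pixel data): show only its size, never the content.
        return f"<{len(value)} bytes of binary data>"
    if isinstance(value, (int, float)):
        return value  # keep numbers numeric so Excel can sort and sum them
    if isinstance(value, str):
        return str(value)  # also flattens str subclasses to a plain str
    if isinstance(value, Sequence):
        # Multi-valued entries: join the items into one readable string.
        return ", ".join(str(item) for item in value)
    return str(value)  # fallback for other objects, e.g. person names


for sample in (512, "OT", ["1.5", "2.5"], b"\x00\x01\x02", None):
    print(repr(to_cell(sample)))
```

In the loop above, this would be called as `value = to_cell(ds.get(dcm_keyword, "None"))` right before `worksheet.write(...)`. How nested sequences of datasets should be displayed is left open here; the sketch would just join their string representations.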