Tags: python, dictionary, dictionary-comprehension

Simplifying the code to a dictionary comprehension


In a directory named images, the files are named like 1_foo.png, 2_foo.png, 14_foo.png, etc.

The images are OCR'd and the extracted text is stored in a dict by the code below:

import os
import pytesseract

data_dict = {}

for i in os.listdir(images):
    if str(i[1]) != '_':
        k = str(i[:2])  # Get first two characters of image name and use as 'key'
    else:
        k = str(i[:1])  # Get first character of image name and use as 'key'
    # Initiates a list for each key and allows storing multiple entries
    data_dict.setdefault(k, [])
    data_dict[k].append(pytesseract.image_to_string(i))

The code performs as expected.
The images can have varying numbers in their name ranging from 1 to 99.
Can this be reduced to a dictionary comprehension?


Solution

  • Yes. Here's one way, but I wouldn't recommend it:

    {k: d.setdefault(k, []).append(pytesseract.image_to_string(i)) or d[k]
     for d in [{}]
     for k, i in ((i.split('_')[0], i) for i in names)}
    

    That might be as clean as I can make it, and it's still bad. Better to use a normal loop, especially a clean one like Dennis's.

    Slight variation (if I do the abuse once, I might as well do it twice):

    {k: d.setdefault(k, []).append(pytesseract.image_to_string(i)) or d[k]
     for d in [{}]
     for i in names
     for k in i.split('_')[:1]}
    

    Edit: kaya3 now posted a good one using a dict comprehension. I'd recommend that over mine as well. Mine are really just the dirty results of me being like "Someone said it can't be done? Challenge accepted!".
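
For comparison, here is a rough sketch of what a plain loop and a side-effect-free dict comprehension could look like. This is not necessarily what Dennis or kaya3 posted. The names fake_ocr and key_of and the sample file list are made up for illustration, and fake_ocr only stands in for pytesseract.image_to_string so the snippet runs without Tesseract installed.

```python
from collections import defaultdict
from itertools import groupby


def fake_ocr(name):
    # Hypothetical stand-in for pytesseract.image_to_string(name).
    return f"text of {name}"


def key_of(name):
    # Everything before the first underscore, e.g. '14' for '14_foo.png'.
    return name.split('_')[0]


# Made-up sample of file names, as os.listdir might return them.
names = ['1_foo.png', '14_foo.png', '2_foo.png', '1_bar.png']

# Plain loop: defaultdict creates the empty list on first access to a key.
by_loop = defaultdict(list)
for name in names:
    by_loop[key_of(name)].append(fake_ocr(name))

# Dict comprehension: groupby only merges adjacent items,
# so the names are sorted by the same key first.
by_comp = {
    k: [fake_ocr(name) for name in group]
    for k, group in groupby(sorted(names, key=key_of), key=key_of)
}
```

Both variants build the same mapping. The only difference is key order: the loop keeps keys in order of first appearance, while the groupby version yields them in sorted string order ('1', '14', '2').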