Use variable name as key in dictionary comprehension


I am trying to populate a dictionary with the variable name as the key and the median of a numpy array as the value. For example:

import numpy as np

# Make up some data
array1 = np.random.randint(255, size=(4, 4))
array2 = np.random.randint(255, size=(4, 4))
array3 = np.random.randint(255, size=(4, 4))
array4 = np.random.randint(255, size=(4, 4))

# Generate a dict with variable name key and median value 
d = {f"{x}":np.median(x) for x in [array1, array2, array3, array4]}

This yields:

{'[[ 27 234  43 110]\n [105 199 173  28]\n [ 97 235  32  62]\n [ 42 193 234  29]]': 101.0,
 '[[ 68 150  75 207]\n [156 188  47 116]\n [161  90  18 173]\n [213  12  65 216]]': 133.0,
 '[[188  26 229 204]\n [ 65 214 197 165]\n [  3  86 124 221]\n [ 87  27 189 125]]': 145.0,
 '[[193 158 107 148]\n [187  30  38 104]\n [ 44 137  31 227]\n [243 212  25 110]]': 123.5}

However, I would like to add the variable name to the dictionary, not the numpy array as a string. Here is my desired output:

{'array1': 101.0,
 'array2': 133.0,
 'array3': 145.0,
 'array4': 123.5}

How can I add the variable name as a string to a dictionary comprehension?


Solution

  • Numbered variables like array1, array2, ... are usually a sign that the data belongs in a container from the start. Let's suppose the names are meaningful instead.

    Then we can start from the variable names as strings and look each one up in the namespace dictionaries returned by locals() and globals(). For instance:

    import numpy as np
    
    # Make up some data
    array1 = np.random.randint(255, size=(4, 4))
    array2 = np.random.randint(255, size=(4, 4))
    array3 = np.random.randint(255, size=(4, 4))
    array4 = np.random.randint(255, size=(4, 4))
    
    # Generate a dict with variable name key and median value
    d = {x: np.median(locals().get(x, globals().get(x))) for x in ["array1", "array2", "array3", "array4"]}
    
    
    >>> d
    {'array1': 156.5, 'array2': 76.0, 'array3': 100.0, 'array4': 85.0}
    

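    locals().get(x, globals().get(x)) is one way to fetch a variable named x, looking first in locals, then in globals. It overlooks nonlocals (those can be handled too, but the expression gets even more complex). One caveat: before Python 3.12, locals() called inside a comprehension returns the comprehension's own scope, so at module level the lookup falls through to globals (which is why the example above works), while inside a function it would not see the function's local variables.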
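
If the arrays live inside a function, one way around the scoping issue is to capture the namespaces once, before the comprehension runs. The following is a minimal sketch under that assumption; the function name medians_by_name and the small fixed arrays are hypothetical stand-ins, not part of the original question:

```python
import numpy as np


def medians_by_name(names):
    """Return {name: median} for arrays defined as locals of this function."""
    # Hypothetical local data, standing in for array1..array4 above.
    array1 = np.array([[1, 2], [3, 4]])
    array2 = np.array([[10, 20], [30, 40]])

    # Capture the namespaces once, outside the comprehension.
    # Locals are merged last, so they shadow globals of the same name.
    namespace = {**globals(), **locals()}
    return {name: float(np.median(namespace[name])) for name in names}


d = medians_by_name(["array1", "array2"])
print(d)  # {'array1': 2.5, 'array2': 25.0}
```

Unlike the .get() chain, this version raises a KeyError for an unknown name instead of passing None to np.median, which is arguably easier to debug.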

    If possible, I'd advise storing the arrays in a dictionary from the start; then getting the result is trivial:

    datadict = {}
    datadict["array1"] = np.random.randint(255, size=(4, 4))
    datadict["array2"] = np.random.randint(255, size=(4, 4))
    datadict["array3"] = np.random.randint(255, size=(4, 4))
    datadict["array4"] = np.random.randint(255, size=(4, 4))
    
    # Generate a dict with variable name key and median value
    d = {k: np.median(v) for k, v in datadict.items()}
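
When the names follow a pattern, the dictionary itself can be built with a comprehension as well. This is a sketch rather than a prescribed approach: the seeded generator from np.random.default_rng and the f"array{i}" naming are assumptions made here so the example is repeatable and matches the names in the question:

```python
import numpy as np

# Seed chosen arbitrarily, only to make the run repeatable.
rng = np.random.default_rng(seed=0)

# Build the named arrays directly in a dictionary; values are drawn from [0, 255).
datadict = {f"array{i}": rng.integers(255, size=(4, 4)) for i in range(1, 5)}

# Median per name, converted to plain Python floats for clean printing.
d = {name: float(np.median(arr)) for name, arr in datadict.items()}
print(list(d))  # ['array1', 'array2', 'array3', 'array4']
```

Because dictionaries preserve insertion order, the keys of d come out in the same order in which the arrays were created.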