
python 3 nested comprehension


Is there a smart way, using list/dictionary comprehensions, to get the intended output below, given the following:

import numpy as np
freq_mat = np.random.randint(2, size=(4, 5))
tokens = ['a', 'b', 'c', 'd', 'e']
labels = ['X', 'S', 'Y', 'S']

The intended output for freq_mat

array([[1, 0, 0, 1, 1],
       [0, 0, 0, 0, 1],
       [1, 0, 1, 1, 0],
       [0, 1, 0, 0, 0]])

should look like the following:

[({'a': True, 'b': False, 'c': False, 'd': True, 'e': True}, 'X'),
 ({'a': False, 'b': False, 'c': False, 'd': False, 'e': True}, 'S'),
 ({'a': True, 'b': False, 'c': True, 'd': True, 'e': False}, 'Y'),
 ({'a': False, 'b': True, 'c': False, 'd': False, 'e': False}, 'S')]

Solution

  • As you note in your updated post, your original code doesn't work quite right: it adds the same value for every key in a given row - all True or all False. The simplest correction to your original code would be this:

    featureset = []
    for row, label in zip(freq_mat, labels):
        d = dict()
        for key, val in zip(tokens, row): # The critical bit
            d[key] = val > 0
        featureset.append((d,label))
    

    A more streamlined version, but one that's still quite a bit more readable, I think, than the single-comprehension approach:

    featureset = []
    for row, label in zip(freq_mat, labels):
        d = {key: val > 0 for key, val in zip(tokens, row)}
        featureset.append((d, label))
    

    Or for the one-liner:

    featureset = [({key: val > 0 for key, val in zip(tokens, row)}, label)
                  for row, label in zip(freq_mat, labels)]
    

    Personally I'd probably go with the second approach, a compromise between concision and readability. But that's up to you, of course!
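
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the second approach. It uses the fixed matrix from the question instead of a random one so the result is reproducible. The `bool(...)` call is my own addition, not part of the snippets above: comparing a NumPy integer with `val > 0` gives a NumPy boolean scalar rather than a built-in `bool`, which compares equal to `True`/`False` but may print differently depending on the NumPy version.

```python
import numpy as np

# Fixed matrix from the question (not random), so the output is reproducible.
freq_mat = np.array([[1, 0, 0, 1, 1],
                     [0, 0, 0, 0, 1],
                     [1, 0, 1, 1, 0],
                     [0, 1, 0, 0, 0]])
tokens = ['a', 'b', 'c', 'd', 'e']
labels = ['X', 'S', 'Y', 'S']

featureset = []
for row, label in zip(freq_mat, labels):
    # Pair each token with the count in the same column of this row.
    # bool(...) is an added assumption: it converts NumPy's boolean
    # scalar into a plain Python bool.
    d = {key: bool(val > 0) for key, val in zip(tokens, row)}
    featureset.append((d, label))

print(featureset[0])
# ({'a': True, 'b': False, 'c': False, 'd': True, 'e': True}, 'X')
```

If plain Python booleans don't matter for your use case, you can drop the `bool(...)` wrapper and keep `val > 0` as in the versions above.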