Find unique values at a given position in a NumPy array of tuples


I have a numpy array that looks like this:

[
('{893EE51E-0CD1-4C06-B672-365EECA26C33}', 'image/jpeg', 'Photo1.jpg', []),
('{893EE51E-0CD1-4C06-B672-365EECA26C33}', 'image/jpeg', 'Photo2.jpg', []),
('{893EE51E-0CD1-4C06-B672-365EECA26C63}', 'image/jpeg', 'Photo1.jpg', []),
('{893EE51E-0CD1-4C06-B672-365EECA26C73}', 'image/jpeg', 'Photo1.jpg', [])
]

How can I find the unique values at "position 0" of each tuple? Ideally I'd like to output an array (or list) that looks like this:

[
'{893EE51E-0CD1-4C06-B672-365EECA26C33}',
'{893EE51E-0CD1-4C06-B672-365EECA26C63}',
'{893EE51E-0CD1-4C06-B672-365EECA26C73}'
]

Solution

  • Recreating a structured array from your display (with numpy imported as np):

    In [241]: arr = np.array([
         ...: ('{893EE51E-0CD1-4C06-B672-365EECA26C33}', 'image/jpeg', 'Photo1.jpg', []),
         ...: ('{893EE51E-0CD1-4C06-B672-365EECA26C33}', 'image/jpeg', 'Photo2.jpg', []),
         ...: ('{893EE51E-0CD1-4C06-B672-365EECA26C63}', 'image/jpeg', 'Photo1.jpg', []),
         ...: ('{893EE51E-0CD1-4C06-B672-365EECA26C73}', 'image/jpeg', 'Photo1.jpg', [])
         ...: ],dtype='U50,U20,U20,O')
         ...: arr
    Out[241]: 
    array([('{893EE51E-0CD1-4C06-B672-365EECA26C33}', 'image/jpeg', 'Photo1.jpg', list([])),
           ('{893EE51E-0CD1-4C06-B672-365EECA26C33}', 'image/jpeg', 'Photo2.jpg', list([])),
           ('{893EE51E-0CD1-4C06-B672-365EECA26C63}', 'image/jpeg', 'Photo1.jpg', list([])),
           ('{893EE51E-0CD1-4C06-B672-365EECA26C73}', 'image/jpeg', 'Photo1.jpg', list([]))],
          dtype=[('f0', '<U50'), ('f1', '<U20'), ('f2', '<U20'), ('f3', 'O')])

    Selecting the first field by name (f0 is the default name NumPy gives it, as the dtype above shows):

    In [242]: arr['f0']
    Out[242]: 
    array(['{893EE51E-0CD1-4C06-B672-365EECA26C33}',
           '{893EE51E-0CD1-4C06-B672-365EECA26C33}',
           '{893EE51E-0CD1-4C06-B672-365EECA26C63}',
           '{893EE51E-0CD1-4C06-B672-365EECA26C73}'], dtype='<U50')

    Applying np.unique to that field gives the sorted unique values:

    In [243]: np.unique(arr['f0'])
    Out[243]: 
    array(['{893EE51E-0CD1-4C06-B672-365EECA26C33}',
           '{893EE51E-0CD1-4C06-B672-365EECA26C63}',
           '{893EE51E-0CD1-4C06-B672-365EECA26C73}'], dtype='<U50')
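
For reference, the same steps can be put together as a standalone script. This is only a sketch under the same assumption as above, namely that your data can be loaded with the dtype 'U50,U20,U20,O'. The names records, arr, unique_ids and unique_ids_plain are made up for illustration. The last line shows a plain-Python alternative that may be enough if the data is really just a list of tuples rather than a structured array.

```python
import numpy as np

# Sample records copied from the question; the fourth item is an empty list.
records = [
    ('{893EE51E-0CD1-4C06-B672-365EECA26C33}', 'image/jpeg', 'Photo1.jpg', []),
    ('{893EE51E-0CD1-4C06-B672-365EECA26C33}', 'image/jpeg', 'Photo2.jpg', []),
    ('{893EE51E-0CD1-4C06-B672-365EECA26C63}', 'image/jpeg', 'Photo1.jpg', []),
    ('{893EE51E-0CD1-4C06-B672-365EECA26C73}', 'image/jpeg', 'Photo1.jpg', []),
]

# Assumed dtype: three fixed-width string fields plus one object field.
# NumPy names the fields f0, f1, f2, f3 by default.
arr = np.array(records, dtype='U50,U20,U20,O')

# np.unique returns the sorted unique values of the first field;
# tolist() converts the result to a plain Python list of str.
unique_ids = np.unique(arr['f0']).tolist()

# Alternative without NumPy: collect position 0 of each tuple in a set,
# then sort so the order is deterministic.
unique_ids_plain = sorted({rec[0] for rec in records})

print(unique_ids)
```

Both variants return the values in sorted order, not in order of first appearance.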