python arrays numpy unique

Python: find the unique values in a specific column of a 2D array


Good day.

If I have the following array:

[11, "apples", 22, 11], [12, "pear", 24, 11], [13, "banana", 18, 11], [14, "pear", 17, 11]

How can I reduce the array so that it only shows the rows for user pear? I want to collect all the values from the first column for pear, i.e. 12 and 14.

Alternatively, how can I find the unique values in the second column (apples, pear and banana) and then filter by pear to get only pear's rows: [12, "pear", 24, 11], [14, "pear", 17, 11]?

What I have tried, in various forms:

uniqueRows = np.unique(array, axis=:,1)

This is what I can use to filter once I have the unique values:

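import numpy as np

new_arr = np.array([[11, "apples", 22, 11], [12, "pear", 24, 11], [13, "banana", 18, 11], [14, "pear", 17, 11]])
new_val = np.array(["pear"])
# Boolean mask: True where the second column matches a value in new_val (np.isin supersedes np.in1d)
result = np.isin(new_arr[:, 1], new_val)
z = new_arr[result]  # rows belonging to "pear"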

Solution

  • Pandas Way

    import numpy as np
    import pandas as pd
    
    new_arr = np.array([[11, "apples", 22, 11], [12, "pear", 24, 11], [13, "banana", 18, 11], [14, "pear", 17, 11]])
    
    df = pd.DataFrame(new_arr, columns=['A', 'B', 'C', 'D'])
    
    result = df[df['B'] == 'pear']
    print(result)
    '''
        A     B   C   D
    1  12  pear  24  11
    3  14  pear  17  11
    '''
    # or, to list the unique values in column B:
    
    result_2 = df['B'].drop_duplicates()
    print(result_2)
    '''
    0    apples
    1      pear
    2    banana
    '''
    

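    You can also use unique() instead of drop_duplicates(): unique() returns the distinct values as an array, while drop_duplicates() keeps the result as a Series with its original index. If speed matters, time both on your own data.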
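
  • NumPy Way

For readers who would rather stay in NumPy, as in the question, the following is a minimal sketch rather than a definitive implementation. It assumes the same sample data as above; the names unique_names, mask, pear_rows and pear_ids are illustrative only. Because the rows mix integers and strings, NumPy stores every element as a string, so the first column has to be converted back to integers at the end.

```python
import numpy as np

# Same data as in the question; NumPy stores the mixed rows as strings.
new_arr = np.array([[11, "apples", 22, 11], [12, "pear", 24, 11],
                    [13, "banana", 18, 11], [14, "pear", 17, 11]])

# Distinct values in the second column (index 1); np.unique returns them sorted.
unique_names = np.unique(new_arr[:, 1])
print(unique_names)  # ['apples' 'banana' 'pear']

# Boolean mask selecting the rows whose second column equals "pear".
mask = new_arr[:, 1] == "pear"
pear_rows = new_arr[mask]
print(pear_rows)

# First-column values for those rows, converted back to integers.
pear_ids = pear_rows[:, 0].astype(int)
print(pear_ids)  # [12 14]
```

Slicing the column first (new_arr[:, 1]) and passing that to np.unique is what the axis=:,1 attempt in the question was reaching for: the axis argument only accepts an integer axis, not a slice.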