Good day.
If I have the following array:
[11, "apples", 22, 11], [12, "pear", 24, 11], [13, "bannana", 18, 11], [14, "pear", 17, 11]
How can I change the array to only show data from user pear? I want to collect all the values from column 1 of user pear (12, 14).
Or alternatively, how can I find the values that are unique in column 2 (e.g. apples, pear and bannana), and then filter by pear to get only the rows for pear: [12, "pear", 24, 11], [14, "pear", 17, 11]
What I have tried, in various forms:
uniqueRows = np.unique(array, axis=:,1)
This is what I can use to filter if I have the unique values.
new_arr = np.array([[11, "apples", 22, 11], [12, "pear", 24, 11], [13, "bannana", 18, 11], [14, "pear", 17, 11]])
new_val = np.array(["pear"])
result = np.in1d(new_arr[:, 1], new_val)
z = new_arr[result]
Pandas Way
import numpy as np
import pandas as pd
new_arr = np.array([[11, "apples", 22, 11], [12, "pear", 24, 11], [13, "banana", 18, 11], [14, "pear", 17, 11]])
df = pd.DataFrame(new_arr,columns=['A','B','C','D'])
result = df[df.B=='pear']
print(result)
'''
A B C D
1 12 pear 24 11
3 14 pear 17 11
'''
#or
result_2 = df['B'].drop_duplicates()
print(result_2)
'''
0 apples
1 pear
2 banana
'''
Instead of drop_duplicates() you can also use unique(), which returns a NumPy array rather than a Series, but this way is faster.
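If you would rather stay in NumPy, a boolean mask on the second column should cover both parts of the question. This is only a minimal sketch: the names arr, mask, pear_rows, pear_ids and users are made up for illustration. It assumes the array is built as in the question, so NumPy coerces the mixed ints and strings to a single string dtype.

```python
import numpy as np

# Mixed ints and strings are coerced to one string dtype,
# so every cell in this array is a string such as '12' or 'pear'.
arr = np.array([[11, "apples", 22, 11],
                [12, "pear", 24, 11],
                [13, "banana", 18, 11],
                [14, "pear", 17, 11]])

# Boolean mask: True for rows whose second column equals "pear".
mask = arr[:, 1] == "pear"

# All rows belonging to pear.
pear_rows = arr[mask]

# First-column values for pear, converted back to integers.
pear_ids = arr[mask, 0].astype(int)

# Sorted unique values of the second column.
users = np.unique(arr[:, 1])

print(pear_rows)
print(pear_ids)
print(users)
```

Note that np.unique returns the values sorted, whereas drop_duplicates() keeps them in order of first appearance. If you need the numeric columns to stay numeric, a DataFrame as shown above is probably the more convenient container.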