I have a data frame with 6 columns, where each row holds a sequence of numbers:
pd.DataFrame(FixByteOrderUnique)
Out[518]:
    0   1   2    3  4  5
0  58  68  58   59 -1 -1
1  59  69  59   58 -1 -1
2  93  94  93   33 -1 -1
3  58  59  58   68 -1 -1
4  92  94  92   33 -1 -1
5  59  58  59   69 -1 -1
6  57  48  57   79 -1 -1
7  15  26  15  101 -1 -1
For each row, I want to count the number of unique elements, excluding the numbers -1, 100, 101 and 102 from the count. Valid numbers are in the range [0, 99].
What I did was write a function that excludes -1 from the count:
def myfunc(row):
    if -1 in row.values:
        return row.nunique() - 1
    else:
        return row.nunique()
and then apply it to each row like this:
pd_sequences['unique'] = pd.DataFrame(FixByteOrderUnique).apply(myfunc, axis=1)
How can I make my function check that a number is in [0, 99] before it counts toward the number of unique values?
You can change myfunc to:
def myfunc(row):
    return row[(row < 100) & (row > -1)].nunique()
This uses boolean indexing on the row (a Series) to keep only the values in [0, 99] before counting the unique ones.
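To check the behaviour on the sample data from the question, here is a minimal, self-contained sketch. The names `data`, `df`, `count_valid_unique`, `per_row` and `vectorized` are made up for illustration and stand in for `FixByteOrderUnique` and `myfunc`. It also shows one possible vectorized alternative that masks out-of-range values and calls `nunique(axis=1)`, which skips the resulting NaN values by default.

```python
import pandas as pd

# Sample rows copied from the question (stand-in for FixByteOrderUnique).
data = [
    [58, 68, 58, 59, -1, -1],
    [59, 69, 59, 58, -1, -1],
    [93, 94, 93, 33, -1, -1],
    [58, 59, 58, 68, -1, -1],
    [92, 94, 92, 33, -1, -1],
    [59, 58, 59, 69, -1, -1],
    [57, 48, 57, 79, -1, -1],
    [15, 26, 15, 101, -1, -1],
]
df = pd.DataFrame(data)


def count_valid_unique(row):
    """Count unique values in the row that fall within [0, 99]."""
    # Same filter as (row < 100) & (row > -1), written with inclusive bounds.
    valid = row[(row >= 0) & (row <= 99)]
    return valid.nunique()


# Row-wise apply, as in the answer above.
per_row = df.apply(count_valid_unique, axis=1)

# Alternative without apply: replace out-of-range values with NaN,
# then count unique values per row (NaN is dropped by default).
vectorized = df.where((df >= 0) & (df <= 99)).nunique(axis=1)

print(per_row.tolist())     # [3, 3, 3, 3, 3, 3, 3, 2]
print(vectorized.tolist())  # [3, 3, 3, 3, 3, 3, 3, 2]
```

The last row yields 2 because 101 and -1 are both filtered out, leaving only 15 and 26. On large frames the masked version may be faster than a row-wise `apply`, though that is worth measuring on your own data.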