I am trying to find a more efficient way to return the index of unique values in a pandas df.
For the df below I want to return the index of the first time a unique value occurs.
import pandas as pd
import numpy as np
d = ({
'Day' : ['Mon','Mon','Tues','Mon','Tues','Wed'],
})
df = pd.DataFrame(data=d)
I can manually count the index of each unique value and return the rows as below:
first = df.iloc[0].Day
second = df.iloc[2].Day
third = df.iloc[5].Day
I was thinking of doing something like
first = (df['Day'] == 'Mon')
But I would still have to change this by hand to find the 2nd and 3rd unique values. Is there a more efficient method?
If you want only the values that occur exactly once (no duplicates at all), use drop_duplicates with keep=False:
print (df['Day'].drop_duplicates(keep=False))
5 Wed
Name: Day, dtype: object
print (df['Day'].drop_duplicates(keep=False).index)
Int64Index([5], dtype='int64')
Or:
print (df.index[~df['Day'].duplicated(keep=False)])
Int64Index([5], dtype='int64')
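To see why keep=False leaves only index 5, it may help to compare the boolean masks that duplicated produces for each keep option. This is a minimal sketch on the same data; the variable names (s, mask_first, mask_last, mask_none) are illustrative and not from the question.

```python
import pandas as pd

s = pd.Series(['Mon', 'Mon', 'Tues', 'Mon', 'Tues', 'Wed'], name='Day')

# keep='first' (the default): every occurrence after the first is flagged
mask_first = s.duplicated(keep='first').tolist()
print(mask_first)  # [False, True, False, True, True, False]

# keep='last': every occurrence before the last is flagged
mask_last = s.duplicated(keep='last').tolist()
print(mask_last)   # [True, True, True, False, False, False]

# keep=False: every value that appears more than once is flagged everywhere
mask_none = s.duplicated(keep=False).tolist()
print(mask_none)   # [True, True, True, True, True, False]
```

Negating the last mask with ~ is what selects only 'Wed', the one value that never repeats.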
If you want the first occurrence of each unique value, use drop_duplicates with its default keep='first':
print (df['Day'].drop_duplicates())
0 Mon
2 Tues
5 Wed
Name: Day, dtype: object
print (df['Day'].drop_duplicates().index)
Int64Index([0, 2, 5], dtype='int64')
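As a further sketch, the same first-occurrence index can be built from a duplicated mask, and turned into a lookup from each value to the index where it first appears. The names first_idx and first_by_value are hypothetical, chosen only for this example.

```python
import pandas as pd

df = pd.DataFrame({'Day': ['Mon', 'Mon', 'Tues', 'Mon', 'Tues', 'Wed']})

# duplicated() flags every repeat of a value, so ~mask keeps first occurrences
first_idx = df.index[~df['Day'].duplicated()]
print(first_idx.tolist())  # [0, 2, 5]

# Map each unique value to the index label of its first occurrence
first_by_value = {day: idx for idx, day in df['Day'].drop_duplicates().items()}
print(first_by_value)      # {'Mon': 0, 'Tues': 2, 'Wed': 5}
```

With the mapping in hand, first_by_value['Tues'] gives 2 directly, so there is no need to rewrite a comparison for each value.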