I'm using pandas in Python and calling duplicated(keep=False) to check for repeated values within a single DataFrame column. The result looks like this:
5063 True
5064 True
5065 True
5066 False
5067 False
...
5310 False
5311 False
5312 True
5313 False
5314 False
Now I'd like to get the index labels of the True values. I thought this would work:
duplicateRowsDF = df.duplicated(keep=False)
for j in range(0, len(duplicateRowsDF)):
    attempt = str(duplicateRowsDF.iloc[j])
    if attempt == "True":
        duplicated_index = duplicateRowsDF.index
but I'm getting
Index([5063, 5064, 5065, 5066, 5067, 5068, 5069, 5070, 5071, 5072,
...
5305, 5306, 5307, 5308, 5309, 5310, 5311, 5312, 5313, 5314],
dtype='int64', length=252)
whereas I want an "array" containing only 5063, 5064, 5065 and 5312.
I've also tried df[df.index.duplicated(keep=False)],
but it doesn't give me what I expect.
Simply perform boolean indexing on the index:
out = duplicateRowsDF.index[duplicateRowsDF]
Or, without the intermediate variable:
out = df.index[df.duplicated(keep=False)]
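For context, the loop in the question assigns the whole of duplicateRowsDF.index every time it finds a True value, so the result is always the full index. Also, df.index.duplicated(keep=False) checks for repeated index labels, not repeated row values, which is why it returns something different. Here is a minimal, self-contained sketch of the boolean-indexing approach; the column name "value" and the sample data are made up for illustration, with index labels chosen to resemble those in the question:

```python
import pandas as pd

# Hypothetical sample data (not from the question): four rows share the value "a".
df = pd.DataFrame(
    {"value": ["a", "a", "a", "b", "c", "a", "d"]},
    index=[5063, 5064, 5065, 5066, 5067, 5312, 5313],
)

# Boolean Series aligned with df.index: True for every row that has a duplicate.
mask = df.duplicated(keep=False)

# Boolean indexing on the index keeps only the labels where mask is True.
out = df.index[mask]

# Convert to a plain Python list if an "array" of labels is needed.
dup_labels = out.tolist()
print(dup_labels)  # [5063, 5064, 5065, 5312]

# By contrast, this checks the index labels themselves for repeats;
# they are all unique here, so every entry is False.
index_dups = df.index.duplicated(keep=False)
```

If you need a NumPy array rather than a list, out.to_numpy() should work as well.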