
drop_duplicates() stopped working in Python pandas


This code previously worked in Python 3 to remove duplicate values across an entire DataFrame while keeping the first occurrence of each value. After coming back to my script, it no longer removes the duplicates.

df = df.apply(lambda x: x.drop_duplicates(), axis=1)

So if I have

a   b  c
0   1  2
3   4  0
0   8  9
10  0  11

I want to get this as output

a   b  c
0   1  2
3   4
    8  9
10     11

I don't mind if the blanks come back as NaN.

I also tried the following

df.drop_duplicates(subset = None, keep='first')

and

df.drop_duplicates(subset = None, keep='first', inplace =True)

Any advice / alternatives would be welcome!


Solution

  • Now that you have attached the data, I think you can use duplicated on the stacked values:

    newdf = df[~df.stack().duplicated().unstack()]
    newdf
    Out[131]: 
          a    b     c
    0   0.0  1.0   2.0
    1   3.0  4.0   NaN
    2   NaN  8.0   9.0
    3  10.0  NaN  11.0
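
To make the one-liner above easier to follow, here is the same idea broken into steps as a minimal, self-contained sketch. It rebuilds the example frame from the question and assumes the input has no missing values to begin with, since stack() may drop NaN entries depending on the pandas version. The intermediate names (stacked, is_repeat, mask) are only illustrative. For context: DataFrame.drop_duplicates() removes whole rows that are duplicated, and apply(..., axis=1) with Series.drop_duplicates() only looks for repeats inside each row, so neither one compares values across the entire frame.

```python
import pandas as pd

# Example data from the question.
df = pd.DataFrame({"a": [0, 3, 0, 10], "b": [1, 4, 8, 0], "c": [2, 0, 9, 11]})

# Flatten to a Series in row-major order (row 0: a, b, c; then row 1; ...).
stacked = df.stack()

# True for every value that has already appeared earlier in that order.
is_repeat = stacked.duplicated(keep="first")

# Reshape the flags back to the frame's shape and replace repeats with NaN.
mask = is_repeat.unstack()
newdf = df.mask(mask)

print(newdf)
```

df.mask(mask) is equivalent to df[~mask] here; both keep the first occurrence and turn later repeats into NaN, which is why the integer columns come back as floats.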