python pandas replace conditional-statements cell

Pandas - change cell value based on conditions from cell and from column

I have a DataFrame with a lot of "bad" cells. Let's say they all have the value -99.99, and I want to remove them (set them to NaN).

This works fine:

df[df == -99.99] = None

But actually I want to delete these cells ONLY if another cell in the same row is marked as 1 (e.g. in the column "Error").

I want to delete all -99.99 cells, but only if df["Error"] == 1.

The most straightforward solution, I think, is something like

df[(df == -99.99) & (df["Error"] == 1)] = None

but it gives me the error:

ValueError: cannot reindex from a duplicate axis

I tried every solution I could find on the internet, but I can't get it to work! :(

Since my DataFrame is big, I don't want to iterate over it (which would of course work, but would take a lot of time).

Any hint?

Solution

Try building the row condition as a NumPy array, so that it broadcasts across the columns instead of being aligned by label:

import numpy as np
import pandas as pd

# sample data, special value is -99
df = pd.DataFrame([[-99, -99, 1], [2, -99, 2],
                   [1, 1, 1], [-99, 0, 1]],
                  columns=['a', 'b', 'Errors'])

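# note the double square brackets: df[['Errors']] is a one-column DataFrame,
# so .values has shape (n, 1) and broadcasts across all columns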
df[(df==-99) & (df[['Errors']]==1).values] = np.nan

Output:

     a     b  Errors
0  NaN   NaN       1
1  2.0 -99.0       2
2  1.0   1.0       1
3  NaN   0.0       1
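
If you would rather not modify df in place, here is a minimal sketch of the same idea using DataFrame.mask, which returns a new DataFrame. It assumes the same sample data as above; the names bad, flagged, mask and result are just illustrative. Because the whole mask is built in NumPy, no label alignment is involved, which may also help if the original ValueError came from duplicate labels in the index (that is a guess about your data, not something I can see from the question).

```python
import numpy as np
import pandas as pd

# same sample data as above; -99 is the special value
df = pd.DataFrame(
    [[-99, -99, 1], [2, -99, 2], [1, 1, 1], [-99, 0, 1]],
    columns=["a", "b", "Errors"],
)

bad = df.eq(-99).to_numpy()                       # shape (4, 3): cell holds the special value
flagged = df["Errors"].eq(1).to_numpy()[:, None]  # shape (4, 1): row is marked as an error
mask = bad & flagged                              # broadcasts to shape (4, 3)

# cells where mask is True become NaN; df itself is left unchanged
result = df.mask(mask)
print(result)
```

Here [:, None] plays the same role as the double square brackets above: it turns the per-row condition into a column vector so NumPy can broadcast it against every column.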