It was hard to come up with a good title, so here is what I want:
I have a dataframe with a column col1 containing the values val1, val2 and val3.
I want to select the rows where this column is val2 or val3 and replace those values with val4, but not in all of them, only in a "slice" of the matching rows, between positions x and y:
import pandas as pd
data = {'col1':["val1","val3","val3","val2","val1","val2","val3","val1"],'col2':["val3","val1","val2","val1","val2","val3","val2","val2"]}
df = pd.DataFrame(data)
df
col1 col2
0 val1 val3
1 val3 val1
2 val3 val2
3 val2 val1
4 val1 val2
5 val2 val3
6 val3 val2
7 val1 val2
Selecting the rows where col1 is val2 or val3:
(df['col1']=="val2") | (df['col1']=="val3")
0 False
1 True
2 True
3 True
4 False
5 True
6 True
7 False
Now I want to replace col1 in the first 4 True rows (index 1, 2, 3 and 5) with val4, to obtain:
col1 col2
0 val1 val3
1 val4 val1
2 val4 val2
3 val4 val1
4 val1 val2
5 val4 val3
6 val3 val2
7 val1 val2
I tried something like:
df[((df['col1']=="val2") | (df['col1']=="val3"))==True][0:4] = "val4"
but it doesn't work (no surprise...).
I think I need to use something like .loc.
Thanks for any clue.
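Your attempt fails because df[...][0:4] = ... is chained indexing: the assignment goes to a temporary copy of the filtered rows, not to df itself. Instead, first build a boolean mask for the condition: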
condition = (df['col1'] == "val2") | (df['col1'] == "val3")
Then take the index labels of the first 4 rows that match the condition:
indices = df[condition].index[:4]
Finally, use loc to set col1 to val4 for those rows only:
df.loc[indices, 'col1'] = 'val4'
Output
col1 col2
0 val1 val3
1 val4 val1
2 val4 val2
3 val4 val1
4 val1 val2
5 val4 val3
6 val3 val2
7 val1 val2
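Since the question asks for a "slice" between x and y in general, here is a minimal sketch that wraps the steps above in a helper. The function name replace_matches and its start/stop parameters are my own, not part of pandas. It uses isin as a shorter way to write the "val2 or val3" condition, and it assumes the index labels are unique (with duplicate labels, loc would also hit the other rows sharing a label).

```python
import pandas as pd

df = pd.DataFrame({
    "col1": ["val1", "val3", "val3", "val2", "val1", "val2", "val3", "val1"],
    "col2": ["val3", "val1", "val2", "val1", "val2", "val3", "val2", "val2"],
})


def replace_matches(df, column, targets, new_value, start=0, stop=None):
    """Hypothetical helper: among the rows where `column` is in `targets`,
    replace the value in the start..stop-th matches (positions counted
    among the matching rows only). Modifies df in place."""
    # Boolean mask: True where the column holds one of the target values
    mask = df[column].isin(targets)
    # Index labels of the matching rows, sliced by position among the matches
    labels = df[mask].index[start:stop]
    # Label-based assignment writes into df itself, not into a copy
    # (assumes unique index labels)
    df.loc[labels, column] = new_value
    return df


# First 4 matches, as in the question
replace_matches(df, "col1", ["val2", "val3"], "val4", start=0, stop=4)
print(df)
```

With start=0 and stop=4 this should reproduce the output above. Passing e.g. start=1, stop=3 would only touch the second and third matching rows.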