I want to randomly choose a given number of rows in my DataFrame and apply an operation to the 'eta' column of those rows: a multiplication by a number that is itself randomly chosen from a range. I'm stuck on achieving this.
At the moment I have a randomly chosen list of indices from my DataFrame, but I don't know how to apply the multiplication only to the rows with those indices and not to the rest.
Any help would be truly appreciated!
numExtreme = 50  # Number of rows to apply the random modification to.
mod_indices = np.random.choice(df.index, numExtreme, replace=False)  # numExtreme index labels chosen at random, without repetition.
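Use DataFrame.loc to select only the randomly chosen indices and only the eta column, then multiply in place by an array of random integers. Note that np.random.randint(1, 50, ...) draws from 1 to 49, because the upper bound is exclusive: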
import numpy as np
import pandas as pd

np.random.seed(123)  # fixed seed so the output below is reproducible
df = pd.DataFrame({'eta': np.random.randint(10, size=10),
                   'another': np.random.randint(10, size=10)})
print (df)
   eta  another
0    2        9
1    2        0
2    6        0
3    1        9
4    3        3
5    9        4
6    6        0
7    1        0
8    0        4
9    1        1
numExtreme = 5  # Number of rows to apply the random modification to.
mod_indices = np.random.choice(df.index, numExtreme, replace=False)
print (mod_indices)
[5 1 0 8 6]
arr = np.random.randint(1, 50, len(mod_indices))
print (arr)
[49 8 42 36 29]
df.loc[mod_indices, 'eta'] *= arr
print (df)
   eta  another
0   84        9
1   16        0
2    6        0
3    1        9
4    3        3
5  441        4
6  174        0
7    1        0
8    0        4
9    1        1
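If the multiplier should be a real number drawn from a continuous range rather than an integer, the same DataFrame.loc pattern still applies. The following is only a sketch under a few assumptions that are not in the question: the names `rng`, `low`, `high`, `factors`, `original` and `untouched` are made up for illustration, the range 1.0 to 50.0 is assumed, and the eta column is cast to float up front so that float results can be stored in it. It uses NumPy's `default_rng` generator instead of the legacy `np.random.seed` interface.

```python
import numpy as np
import pandas as pd

# Seeded only so the sketch is reproducible; drop the seed for real use.
rng = np.random.default_rng(123)

# Toy data (assumed). eta is stored as float so it can hold float products.
df = pd.DataFrame({
    "eta": rng.integers(1, 10, size=10).astype(float),
    "another": rng.integers(0, 10, size=10),
})
original = df.copy()  # kept only to check the result afterwards

num_extreme = 5        # number of rows to modify
low, high = 1.0, 50.0  # assumed range for the random factor

# Pick num_extreme distinct index labels.
mod_indices = rng.choice(df.index.to_numpy(), size=num_extreme, replace=False)

# One random float factor per selected row, drawn from [low, high).
factors = rng.uniform(low, high, size=num_extreme)

# Multiply only the selected rows of the eta column, in place.
df.loc[mod_indices, "eta"] *= factors

# Rows that were not selected keep their original values.
untouched = df.index.difference(mod_indices)
print(df)
```

The factors are paired with the rows in the order given by `mod_indices`, so the first factor goes to the first chosen label, and so on. The `another` column and the rows outside `mod_indices` are left as they were.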