I have a dataframe:
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20)
y <- c(2, 2, 2, 0, 0, 0, 0, 0, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2)
df <- data.frame(x, y)
Now I want to change values in x, but only for 10% of the rows where y equals 2. For example:
set.seed(999)
df[sample(which(df$y == 2), round(0.1 * length(which(df$y == 2)))), ]
    x y
11 11 2
14 14 2
For exactly these rows I want to add 1000 to x. The result should look like:
      x y
1     1 2
2     2 2
3     3 2
4     4 0
5     5 0
6     6 0
7     7 0
8     8 0
9     9 2
10   10 2
11 1011 2
12   12 2
13   13 2
14 1014 2
15   15 2
16   16 2
17   17 2
18   18 2
19   19 2
20   20 2
I am able to edit the sub-sample, but I don't know how to write the result back into the dataframe "df" in a neat way. I am grateful for any help!
One simple way using base R could be:
# Logical vector that is TRUE for the rows where y == 2
inds <- df$y == 2
# Uncomment to make the random draw reproducible
#set.seed(123)
# Randomly pick 10% of the matching row positions
inds_to_change <- sample(which(inds), round(0.1 * sum(inds)))
# Add 1000 to x in the picked rows only
df$x[inds_to_change] <- df$x[inds_to_change] + 1000
df
#      x y
#1     1 2
#2     2 2
#3     3 2
#4     4 0
#5     5 0
#6     6 0
#7     7 0
#8     8 0
#9     9 2
#10 1010 2
#11   11 2
#12   12 2
#13   13 2
#14   14 2
#15   15 2
#16   16 2
#17 1017 2
#18   18 2
#19   19 2
#20   20 2
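If you need this more than once, the same steps can be wrapped in a small function. This is only a sketch: the name bump_random_rows and its arguments frac and offset are made up for illustration and do not come from any package. It also samples positions with sample.int() instead of passing the row numbers to sample() directly, because sample(x, n) treats a single number x as 1:x, which could pick the wrong rows if only one row had y == 2.

```r
# Hypothetical helper (name and arguments are illustrative):
# add `offset` to x in a random fraction `frac` of the rows where y == 2.
bump_random_rows <- function(df, frac = 0.1, offset = 1000) {
  # Row positions that are eligible for the change
  candidates <- which(df$y == 2)
  # Number of rows to modify (10% of the eligible rows by default)
  n_change <- round(frac * length(candidates))
  # Sample positions within `candidates`, so a single eligible row
  # is never misread by sample() as the range 1:x
  picked <- candidates[sample.int(length(candidates), n_change)]
  df$x[picked] <- df$x[picked] + offset
  df
}

# Same data as in the question
df <- data.frame(
  x = 1:20,
  y = c(2, 2, 2, 0, 0, 0, 0, 0, rep(2, 12))
)

set.seed(999)  # only makes the draw reproducible; any seed works
result <- bump_random_rows(df)
result
```

The function returns a modified copy, so df itself stays unchanged unless you assign the result back to it with df <- bump_random_rows(df). Which rows get picked depends on the seed and on your R version's sampling method.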