I have a list of 1000 factor levels (1 to 1000), each of which appears 15 times. I want to assign either 0 or 1 to each level, so that every occurrence of the same level gets the same value. For instance, level 1, which appears 15 times, should always have the value 0. Any idea how to do this? Basically, I would like two columns: one with the factor and one with the value (0 or 1) assigned to it.
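You could use the parity of the factor's underlying integer codes, which is the same for every occurrence of a level: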
my_binary <- as.numeric(my_factor) %% 2
So, for example:
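# factor() is needed in R >= 4.0.0, where data.frame() no longer converts strings to factors
df <- data.frame(number = 1:20, factor = factor(rep(letters[1:5], 4)))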
df$binary <- as.numeric(df$factor) %% 2
Gives you
df
#>    number factor binary
#> 1       1      a      1
#> 2       2      b      0
#> 3       3      c      1
#> 4       4      d      0
#> 5       5      e      1
#> 6       6      a      1
#> 7       7      b      0
#> 8       8      c      1
#> 9       9      d      0
#> 10     10      e      1
#> 11     11      a      1
#> 12     12      b      0
#> 13     13      c      1
#> 14     14      d      0
#> 15     15      e      1
#> 16     16      a      1
#> 17     17      b      0
#> 18     18      c      1
#> 19     19      d      0
#> 20     20      e      1
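As a rough sketch of how this scales to the data described in the question, the same idea can be applied to 1000 levels with 15 occurrences each. The vector `my_factor` here is made-up stand-in data, not the asker's actual column, and the check at the end is just one way to confirm that no level ends up with mixed values. Note that with the parity approach roughly half of the levels get 1 and half get 0.

```r
# Hypothetical stand-in for the data in the question:
# 1000 levels, each appearing 15 times
my_factor <- factor(rep(1:1000, each = 15))

df <- data.frame(factor = my_factor)

# Parity of the underlying integer code of each level
df$binary <- as.integer(df$factor) %% 2

# Count the distinct binary values seen within each level
values_per_level <- tapply(df$binary, df$factor, function(x) length(unique(x)))

# TRUE when every level maps to exactly one value
all(values_per_level == 1)
```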
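And if you want each level to get a random value drawn with specified probabilities, you could draw one value per level and then index that vector by the factor's integer codes: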
numbers <- c(0, 1)
probs <- c(0.75, 0.25)
df <- data.frame(number = 1:20, factor = factor(rep(letters[1:5], 4)))
# Draw one value per level, then look it up by each row's level code
df$binary <- sample(numbers, nlevels(df$factor), replace = TRUE, prob = probs)[as.integer(df$factor)]
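If the random assignment needs to be reproducible or inspected later, one possible variant is to keep the per-level draw in a named lookup vector. This is only a sketch: the seed value and the name `lookup` are arbitrary choices for illustration, not part of the original answer.

```r
set.seed(42)  # arbitrary seed, used only so the draw can be repeated

numbers <- c(0, 1)
probs <- c(0.75, 0.25)
df <- data.frame(number = 1:20, factor = factor(rep(letters[1:5], 4)))

# One random value per level, named by level so the mapping can be inspected
lookup <- setNames(
  sample(numbers, nlevels(df$factor), replace = TRUE, prob = probs),
  levels(df$factor)
)

# Look up each row's value by its level name
df$binary <- unname(lookup[as.character(df$factor)])

# Sanity check: every level maps to exactly one value
stopifnot(all(tapply(df$binary, df$factor, function(x) length(unique(x))) == 1))
```

Printing `lookup` shows which value each level received, and rerunning with the same seed should give the same mapping.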