I have the following dataframe:
structure(list(Store = c("vpm", "vpm",
"vpm"), Date = structure(c(18042, 18042, 18042), class = "Date"),
UniqueImageId = c("vp3_523", "vp3_668", "vp3_523"), EntryTime = structure(c(1558835514,
1558834942, 1558835523), class = c("POSIXct", "POSIXt")),
ExitTime = structure(c(1558838793, 1558838793, 1558839824
), class = c("POSIXct", "POSIXt")), Duration = c(3279, 3851,
4301), Age = c(35L, 35L, 35L), EntryPoint = c("Entry2Side",
"Entry2Side", "Entry2Side"), ExitPoint = c("Exit2Side", "Exit2Side",
"Exit2Side"), AgeNew = c("15_20", "25_32", "15_20"), GenderNew = c("Female",
"Male", "Female")), row.names = 4:6, class = c("data.table",
"data.frame"))
I am trying to populate the column AgeNew with random numbers, using the sample function inside an ifelse condition. I tried the following:
d$AgeNew <- ifelse(d$AgeNew == "0_2", sample(0:2, 1,replace = TRUE),
ifelse(d$AgeNew == "15_20", sample(15:20,1,replace = TRUE),
ifelse(d$AgeNew == "25_32", sample(25:36,1,replace = TRUE),
ifelse(d$AgeNew == "38_43", sample(36:43,1,replace = TRUE),
ifelse(d$AgeNew == "4_6", sample(4:6, 1,replace = TRUE),
ifelse(d$AgeNew == "48_53", sample(48:53,1,replace = TRUE),
ifelse(d$AgeNew == "60_Inf",sample(60:65,1,replace = TRUE),
sample(8:13, 1,replace = TRUE))))))))
But I keep getting the same value repeated. For example, for the age group 0_2 only the value 2 is populated. I tried calling set.seed(123) before running the ifelse, but it still repeats the same value.
This has been discussed somewhere (I cannot find the source at the moment). It behaves like this because ifelse evaluates each of its yes and no arguments only once, not once per element. sample(..., 1) therefore returns a single value, which is recycled to the length of the test. Consider this example:
x <- c(1, 2, 1, 2, 1, 2)
ifelse(x == 1, sample(1:10, 1), sample(20:30, 1))
#[1] 1 26 1 26 1 26
ifelse(x == 1, sample(1:10, 1), sample(20:30, 1))
#[1] 10 28 10 28 10 28
ifelse(x == 1, sample(1:10, 1), sample(20:30, 1))
#[1] 9 24 9 24 9 24
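As we can see, each call draws a single number per branch, which is then recycled across every matching position. To avoid that, set the size argument of sample to the length of the test condition in ifelse. Since sample draws without replacement by default, add replace = TRUE when the vector is longer than the range being sampled: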
ifelse(x == 1, sample(1:10, length(x)), sample(20:30, length(x)))
#[1] 7 23 1 26 10 24
ifelse(x == 1, sample(1:10, length(x)), sample(20:30, length(x)))
#[1] 3 23 5 26 6 22
ifelse(x == 1, sample(1:10, length(x)), sample(20:30, length(x)))
#[1] 2 30 9 27 1 29
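Applied to the AgeNew column from the question, the same idea might look like the sketch below. The small data frame d, the new column name AgeNum, and the "8_13" label are stand-ins invented for illustration, and only a few of the age groups are shown. The ranges are copied from the question as written.

```r
# Hypothetical stand-in for the AgeNew column from the question
d <- data.frame(AgeNew = c("15_20", "25_32", "15_20", "0_2", "0_2", "8_13"),
                stringsAsFactors = FALSE)

set.seed(123)   # only makes the draws reproducible; not required for the fix
n <- nrow(d)    # one candidate value per row for every branch

# replace = TRUE lets size n exceed the number of values in a range
# (e.g. 6 rows but only 3 values in 0:2)
d$AgeNum <- ifelse(d$AgeNew == "0_2",   sample(0:2,   n, replace = TRUE),
            ifelse(d$AgeNew == "15_20", sample(15:20, n, replace = TRUE),
            ifelse(d$AgeNew == "25_32", sample(25:36, n, replace = TRUE),
                                        sample(8:13,  n, replace = TRUE))))
d
```

Each branch now supplies a full-length vector, so ifelse picks a separately drawn value for every row instead of recycling one number per age group. Two rows in the same group can still get the same number by chance.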