I have the following problem. I have a number of plots that cover a biological gradient. From these plots, I would like to select the 25 that cover the gradient best. To do this, I took the minimum and maximum values and calculated 25 evenly spaced target values between them. For each target value, I then chose the plot whose value is the closest match. This works fine, except that sometimes one plot is the closest match to two target values, so I end up with duplicates in my list, which I would like to avoid. I could increase length.out, but that does not seem like a good solution to me. I would like to end up with 25 selected, unique plots.
The following code exemplifies the problem: length.out is set to 25, but only 19 unique plots are selected.
data <- structure(list(Plot = c("3", "4", "5", "6", "8", "12", "14",
"15", "17", "18", "19", "20", "21", "22", "23", "25", "26", "28",
"29", "30", "32", "33", "34", "35", "36", "37", "38", "39", "40",
"41", "42", "43", "44", "45", "46", "47", "48", "49"), Value = c(2.19490722347427,
0.817884294633935, 0.834577676660982, 1.19923035999043, 0.293146158435238,
1.93237941781986, 1.74536845664897, 2.22904916731729, 0.789604037117133,
0.439716474953651, 0.834321473446987, 1.07386786707173, 0.977203815084214,
0.539717907433468, 0.950019385036826, 1.10794069639141, 1.41499437622422,
1.12933520841724, 1.99342508363262, 1.05715847816517, 2.27711128641038,
1.9766526350752, 2.16657914911448, 2.01955890337827, 1.1080527140292,
1.16614766657035, 1.04478527637105, 0.980792736677819, 0.818000882117776,
0.656157422806534, 1.07223822052094, 0.799912719334531, 0.4365715090508,
0.824331627537106, 1.19478221856558, 1.06047128780385, 1.54822823084764,
0.582397279167692)), class = "data.frame", row.names = c("3",
"4", "5", "6", "8", "12", "14", "15", "17", "18", "19", "20",
"21", "22", "23", "25", "26", "28", "29", "30", "32", "33", "34",
"35", "36", "37", "38", "39", "40", "41", "42", "43", "44", "45",
"46", "47", "48", "49"))
opt_seq <- seq(min(data$Value), max(data$Value), length.out = 25)  # evenly spaced target values
sel_plots <- sapply(opt_seq, function(i) which.min(abs(data$Value - i)))  # 25 row indices, some repeated
length(unique(sel_plots))  # fewer than 25 because of the repeats
Any help is highly appreciated!
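You can go through the target values one at a time and, for each one, consider only the plots that have not been selected yet: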
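sel_plots <- logical(nrow(data))  # TRUE marks plots that are already selected
for (i in opt_seq) {
  free <- which(!sel_plots)  # row indices of plots still available
  # among the available plots, select the one closest to target value i
  sel_plots[free[which.min(abs(data$Value[free] - i))]] <- TRUE
}
sel_plots <- which(sel_plots)  # row indices of the selected plots
length(unique(sel_plots))
#[1] 25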
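The loop above is a greedy assignment: each target value takes the nearest plot that is still free. If you need this more than once, it could be wrapped in a function along the following lines. This is only a sketch; the name `select_unique_nearest` and the small example vectors are made up for illustration and are not part of the question's data.

```r
# Hypothetical helper: for each target (in the order given), pick the nearest
# value that has not been picked yet. Returns row indices in target order.
select_unique_nearest <- function(values, targets) {
  stopifnot(length(targets) <= length(values))
  taken  <- logical(length(values))   # TRUE once a value has been picked
  picked <- integer(length(targets))  # result, one index per target
  for (k in seq_along(targets)) {
    free <- which(!taken)             # indices still available
    j <- free[which.min(abs(values[free] - targets[k]))]
    taken[j]  <- TRUE
    picked[k] <- j
  }
  picked
}

# Made-up example: 0.5 is the nearest value to both 0.4 and 0.6
values  <- c(0.1, 0.5, 0.9, 2.0)
targets <- c(0.4, 0.6, 2.0)
select_unique_nearest(values, targets)
#[1] 2 3 4
```

With the data from the question, `data$Plot[select_unique_nearest(data$Value, opt_seq)]` should give the plot names in gradient order. Note that a greedy pass depends on the order of the target values and is not guaranteed to minimise the total distance between targets and selected plots.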