I am looking for an easy way to find whether the value in a column lies within the range of values in other columns.
My input looks like this:
ID "Q1 Comm - 01 Scope Thesis" "Q1 Comm - 02 Scope Project" "Q1 Comm - 03 Learn Intern" "Q1 Comm - 04 Biography" "Q1 Comm - Overall Plan"
10 NA NA 4 NA 4
31 2 NA NA NA 2
225 0 NA NA NA 1
243 NA 2 NA 1 0
310 NA 2 NA 1 NA
For each unique ID
, I am interested in identifying when the column Q1 Comm - Overall Plan
is:
1 - Below
the min()
of all the other columns, or
2 - Above
the max()
of all the other columns, or
3 - Within
the range of all the other columns
The complete list of columns (along with the overall
column) is below:
"Q1 Comm - 01 Scope Thesis"
"Q1 Comm - 02 Scope Project"
"Q1 Comm - 03 Learn Intern"
"Q1 Comm - 04 Biography"
"Q1 Comm - 05 Exhibit"
"Q1 Comm - 06 Social Act"
"Q1 Comm - 07 Post Project"
"Q1 Comm - 08 Learn Plant"
"Q1 Comm - 09 Study Narrate"
"Q1 Comm - 10 Learn Participate"
"Q1 Comm - 11 Write 1"
"Q1 Comm - 12 Read 2"
"Q1 Comm - Overall Plan"
My required output is something like this:
ID "Q1 Comm - 01 Scope Thesis" "Q1 Comm - 02 Scope Project" "Q1 Comm - 03 Learn Intern" "Q1 Comm - 04 Biography" "Q1 Comm - Overall Plan" "Q1_check"
10 NA NA 4 NA 4 "within"
31 2 NA NA NA 2 "within"
225 0 NA NA NA 1 "above"
243 NA 2 NA 1 0 "below"
310 NA 2 NA 1 NA NA
The dput() for my data frame df
is below.
dput(df)
structure(list(ID = c(10L, 31L, 225L, 243L), Q1.Comm...01.Scope.Thesis = c(NA,
2L, 0L, NA), Q1.Comm...02.Scope.Project = c(NA, NA, NA, 2L),
Q1.Comm...03.Learn.Intern = c(4L, NA, NA, NA), Q1.Comm...04.Biography = c(NA,
NA, NA, 1L), Q1.Comm...Overall.Plan = c(4L, 1L, 2L,
NA), X = c(NA, NA, NA, NA), X.1 = c(NA, NA, NA, NA), X.2 = c(NA,
NA, NA, NA)), class = "data.frame", row.names = c(NA, -4L
))
I had asked this question here Finding if a value is within the range of other columns but the example was too simplified and none of the solutions worked for me.
The question was getting too lengthy, therefore, for the sake of clarity, I am posting this as a new question.
I thank you for your time and help on this post.
library(purrr)
library(data.table)
needed_cols <- setdiff(names(df), c("ID", "Q1.Comm...Overall.Plan"))
setDT(df)[, c("min", "max") := transpose(pmap(.SD, range, na.rm = TRUE)), .SDcols = needed_cols]
df[, Q1_check := fcase(
is.na(`Q1.Comm...Overall.Plan`), NA_character_,
`Q1.Comm...Overall.Plan` < min, "below",
`Q1.Comm...Overall.Plan` > max, "above",
default = "within"
)
]
df[, c("max", "min") := NULL]