r range match multiple-columns r-colnames

Return list of columns containing data outside a predetermined range in R

To filter a data.frame down to the columns of interest, I need to find the columns that contain data outside a specific range. Each column has its own range, given as a lower and an upper bound. Let the data.frame and the ranges be

df<-data.frame(x1=c(1,5,9),x2=c(10,20,30),x3=c(20,100,1000))
ranges<-data.frame(y1=c(3,8),y2=c(10,20), y3=c(15,1250))

As an output I'd like a list returning the colnames: "x1","x2"

I tried the following, but it only works if "ranges" enumerates every allowed number, as below, and it flags a column when a value is found in that set. That's unfortunately not what I need.

ranges<-c(15:300,10:20)
df.l<-colnames(df)[sapply(df,function(x) any(x %in% ranges))]

Any ideas? Thanks!

Solution

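If 'ranges' is a data.frame or list whose elements pair up by position with the columns of 'df' (lower bound first, upper bound second), one option is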

names(which(unlist(Map(function(x, y) any(!(x >= y[1] & x <= y[2])), df, ranges))))
#[1] "x1" "x2"

Or use the reverse logic

names(which(unlist(Map(function(x, y) any(x < y[1]| x > y[2]), df, ranges))))
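Since a column has a value outside its range exactly when its minimum is below the lower bound or its maximum is above the upper bound, the same check can also be sketched without Map. This is only a sketch: it assumes the columns contain no NA values and that row 1 of 'ranges' holds the lower bounds and row 2 the upper bounds, as in the question. The names lo, hi and outside are introduced here for illustration.

```r
df <- data.frame(x1 = c(1, 5, 9), x2 = c(10, 20, 30), x3 = c(20, 100, 1000))
ranges <- data.frame(y1 = c(3, 8), y2 = c(10, 20), y3 = c(15, 1250))

# Column-wise minimum and maximum of the data (assumes no NA values)
lo <- sapply(df, min)
hi <- sapply(df, max)

# Assumption: row 1 of 'ranges' is the lower bound, row 2 the upper bound
outside <- lo < unlist(ranges[1, ]) | hi > unlist(ranges[2, ])

names(df)[outside]
#[1] "x1" "x2"
```

Like the Map versions, this pairs columns of 'df' and 'ranges' by position, not by name.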

Or in tidyverse,

library(purrr)
library(dplyr)
library(tibble)
library(tidyr)  # provides unnest()
map2(df, ranges, ~ between(.x, .y[1], .y[2]) %>% `!` %>% any) %>% 
    enframe %>% 
    unnest(cols = value) %>% 
    filter(value) %>% 
    pull(name)
#[1] "x1" "x2"
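If only the column names are needed, a shorter variant may be enough. The sketch below uses purrr's map2_lgl, which returns a logical vector directly, so the enframe/unnest/filter steps can be skipped. The name out_cols is introduced here for illustration, and the same positional pairing of 'df' and 'ranges' is assumed.

```r
library(purrr)
library(dplyr)

df <- data.frame(x1 = c(1, 5, 9), x2 = c(10, 20, 30), x3 = c(20, 100, 1000))
ranges <- data.frame(y1 = c(3, 8), y2 = c(10, 20), y3 = c(15, 1250))

# One TRUE/FALSE per column: TRUE if any value falls outside [lower, upper]
out_cols <- names(df)[map2_lgl(df, ranges, ~ any(!between(.x, .y[1], .y[2])))]

out_cols
#[1] "x1" "x2"
```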

data

df <- data.frame(x1 = c(1, 5, 9), x2 = c(10, 20, 30), x3 = c(20, 100, 1000))
ranges <- data.frame(y1 = c(3, 8), y2 = c(10, 20), y3 = c(15, 1250))