I am trying to find a straightforward way to vectorize/generalize the subsetting of a data.frame. Let's assume I have a data.frame:
df <- data.frame(A = 1:5, B = 10 * 1:5, C = 100 * 1:5)
Every column has its own condition, and the goal is to subset df so that only those rows remain where the condition is met for at least one column. I want to find a vectorized subset mechanism that generalizes
df <- subset(df, df[,1]<2 | df[,2]< 30 | df[,3]<100)
so I could formulate it somewhat like this
crit <- c(2,30,100)
df <- subset(df, df$header < crit[1:3])
and, down the road, I want to get to:
df <- subset(df, df$header < crit[1:n])
I know a multi-step loop workaround, but there must be a better way. I would be grateful for any help.
Given:
x <- c(1:5)
y <- c(10,20,30,40,50)
z <- c(100,200,300,400,500)
# df is already the name of a function in the stats package (the F density), so use a different name
mydf <- data.frame(A = x, B = y, C = z)
crit <- c(2,30,100)
Then this will let you see which values in each column are less than the corresponding crit value:
> sweep(mydf, 2, crit, "<")
         A     B     C
[1,]  TRUE  TRUE FALSE
[2,] FALSE  TRUE FALSE
[3,] FALSE FALSE FALSE
[4,] FALSE FALSE FALSE
[5,] FALSE FALSE FALSE
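If it is not obvious what sweep is doing here, the same logical matrix can be built column by column. This is only a sketch for illustration: it pairs each column with its threshold using mapply, and the name below is my own, not anything from the question.

```r
mydf <- data.frame(A = 1:5, B = 10 * 1:5, C = 100 * 1:5)
crit <- c(2, 30, 100)

# Compare column i of mydf with crit[i]; mapply walks over the columns
# and the thresholds in parallel and binds the results into a logical matrix
below <- mapply(`<`, mydf, crit)
below
```

This relies on crit being in the same order as the columns of mydf, which is also what the sweep call assumes.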
And this will give you the rows that meet any of the criteria:
> subset(mydf, rowSums(sweep(mydf, 2, crit, "<")) > 0)
  A  B   C
1 1 10 100
2 2 20 200
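To get to the "crit[1:n]" version from the question, the two steps can be wrapped in a small function. This is a rough sketch rather than a tested utility: subset_any_below is a made-up name, it assumes every column is numeric and that limits has one entry per column, and treating NA comparisons as "not met" (na.rm = TRUE) is my own choice.

```r
# Hypothetical helper: keep the rows where at least one column
# is below its own threshold
subset_any_below <- function(data, limits) {
  stopifnot(length(limits) == ncol(data))
  below <- sweep(data, 2, limits, "<")   # logical matrix, one column per threshold
  # NA comparisons are counted as "condition not met" (an assumption)
  data[rowSums(below, na.rm = TRUE) > 0, , drop = FALSE]
}

mydf <- data.frame(A = 1:5, B = 10 * 1:5, C = 100 * 1:5)
res <- subset_any_below(mydf, c(2, 30, 100))
res
```

Plain bracket indexing is used instead of subset here because it behaves more predictably inside a function; the result for the example data should be the same two rows.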