r vector subset

Subsetting an unnamed vector in R


I'm getting a vector of numbers as output from one function and want to drop all values higher than 2900, then pipe the remainder directly into a second function. (The values will be sorted, if that helps.) Is there a clever way to do this seemingly simple thing without stopping to define an intermediate variable?


Solution

  • Here is a way to do it without creating a temporary vector.

    1. The functions f and g are simple test functions that output the sequence of integers from 1 to their argument n. Function g additionally sets a random half of the output vector to NA.
    2. Function h sums its input vector.
    3. In the middle of the pipe, an anonymous function subsets the output of f or g and passes the resulting vector on to function h. (The native pipe |> and the \(x) shorthand both require R 4.1.0 or later.)
    4. In the case of the pipe from g, extra code is needed to remove the NAs, if that's what the user wants: either pass na.rm = TRUE to h or wrap the condition in which().
    f <- function(n) seq.int(n)
    g <- function(n){
      y <- seq.int(n)
      is.na(y) <- sample(n, n/2)
      y
    }
    h <- function(x, na.rm = FALSE) sum(x, na.rm = na.rm)
    
    set.seed(2022)
    f(3000) |> (\(x) x[x <= 2900])() |> h()
    #> [1] 4206450
    
    set.seed(2022)
    g(3000) |> (\(x) x[x <= 2900])() |> h()
    #> [1] NA
    
    set.seed(2022)
    g(3000) |> (\(x) x[x <= 2900])() |> h(na.rm = TRUE)
    #> [1] 2080026
    
    set.seed(2022)
    g(3000) |> (\(x) x[which(x <= 2900)])() |> h()
    #> [1] 2080026
    

    Created on 2022-03-12 by the reprex package (v2.0.1)
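
    The difference between the last two g() pipelines comes down to how R indexes with NA. The following is a minimal sketch of that behaviour on a small made-up vector (the values and the names keep, by_logical and by_position are illustrative only and are not part of the answer's code):

```r
# Small illustrative vector with one missing value (values are made up).
x <- c(10, NA, 2950, 20)

# Comparing NA with a number gives NA, so the mask is TRUE NA FALSE TRUE.
keep <- x <= 2900

# A logical index containing NA produces NA at that position.
by_logical <- x[keep]

# which() returns only the positions that are TRUE, so the NA is dropped.
by_position <- x[which(keep)]

print(by_logical)   # 10 NA 20
print(by_position)  # 10 20
```

    This is why sum() returns NA for the plain logical subset unless na.rm = TRUE is passed, while the which() version needs no extra argument.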


    Edit

    Following Mikael Jagan's comment, the input itself can also be piped into the first function, as shown below.

    input <- 3000
    input |> f() |> (\(x) x[x <= 2900])() |> h()
    #> [1] 4206450
    

    Created on 2022-03-12 by the reprex package (v2.0.1)
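
    As a possible alternative to the anonymous subsetting function, base R's Filter() can sit in the pipe directly, and for sorted input findInterval() can locate the cut-off. The sketch below is not from the answer above: make_values and total are hypothetical stand-ins for f and h, and the second option assumes the vector is sorted in ascending order and contains no NA values.

```r
# Hypothetical stand-ins for the answer's f() and h().
make_values <- function(n) seq.int(n)
total <- function(x) sum(x)

# Option 1: Filter(). Naming the predicate `f` lets the piped vector
# fall into Filter()'s remaining argument, `x`.
res_filter <- make_values(3000) |>
  Filter(f = \(v) v <= 2900) |>
  total()

# Option 2: assumes ascending, NA-free input. findInterval() counts the
# elements <= 2900 and head() keeps exactly that many from the front.
res_sorted <- make_values(3000) |>
  (\(x) head(x, findInterval(2900, x)))() |>
  total()

print(c(res_filter, res_sorted))
```

    Filter() calls the predicate once per element, so it will likely be slower than vectorised subsetting on long vectors; it also silently drops elements for which the predicate returns NA, much like the which() variant.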