I'm trying to convert this statement using %>% into one that uses |> instead, because I've noticed that the native pipe is much faster. The goal is to get rid of the empty data frames that result from splitting the data by combinations of values that don't actually occur in it.
The issue is not with creating the split list of data frames; it is with filtering out the empty data frames that are now in the list "split_df" in a pipe-able way. I don't think this needs example data, as it is pretty easy to visualize. I just want Filter(anonymous function) to work with the base R pipe.
System Info:
platform x86_64-apple-darwin17.0
arch x86_64
os darwin17.0
system x86_64, darwin17.0
version.string R version 4.2.2 (2022-10-31)
I know that function nesting is not allowed with |>, and I have tried to rewrite it several ways with no success. The magrittr way of writing this works; I'm just curious about different options.
#code that works
#split the data based on two variables
split_df <- split(df,
f = list(df$variable1, df$variable2)) %>%
Filter(function(x) nrow(x) > 0, .) #Remove empty dataframes that result because of combinations that don't actually exist in the dataset.
#code that doesn't work that I have tried
split_df <- split(df,
f = list(df$variable1, df$variable2)) |>
Filter(\(x) {nrow(x) > 0}())
split_df <- split(df,
f = list(df$variable1, df$variable2)) |>
Filter() |>
(\(x) {nrow(x) > 0}) ()
Starting in R-4.2.0, it's possible to use _ as a placeholder after |>. It can only be used once (unlike magrittr's support of multiple . uses) and it has to be passed to a named argument, but you should be able to do
split_df <- split(df, f = df[,c("variable1", "variable2")]) |>
Filter(\(x) nrow(x) > 0, x = _)
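To try the placeholder version without your own data, here is a minimal sketch that assumes R 4.2.0 or later and uses the built-in mtcars in place of df (cyl and gear are stand-ins for variable1 and variable2, and pieces is just an illustrative name for the unfiltered list):

```r
# Requires R >= 4.2.0 for the `_` placeholder.
# mtcars stands in for df; cyl and gear stand in for variable1 and variable2.
pieces <- split(mtcars, f = mtcars[, c("cyl", "gear")])

split_df <- split(mtcars, f = mtcars[, c("cyl", "gear")]) |>
  Filter(\(x) nrow(x) > 0, x = _)

# Number of groups before and after dropping the empty ones
length(pieces)
length(split_df)

# Every remaining data frame has at least one row
all(vapply(split_df, nrow, integer(1)) > 0)
```

Comparing the two lengths shows how many empty combinations were dropped.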
As an alternative (and not requiring _, so this works in R between 4.1 and 4.2), you can name the function argument to Filter:
split_df <- split(df, f = df[,c("variable1", "variable2")]) |>
Filter(f = \(x) nrow(x) > 0)
in which case the data is supplied to the first argument not explicitly named (which will be x=, where you want the data to go).
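If you want to see that argument matching for yourself, one way (a sketch, assuming R 4.1.0 or later) is to quote the piped expression: the native pipe is rewritten by the parser, so the result is an ordinary call with the left-hand side in first position. The names lst and keep below are placeholders that are never evaluated.

```r
# Requires R >= 4.1.0 for |> and \(x). `lst` and `keep` are placeholder names;
# nothing is evaluated inside quote(), so they need not exist.
piped <- quote(lst |> Filter(f = keep))
print(piped)

# The piped form is the same call as writing the data first and f by name
identical(piped, quote(Filter(lst, f = keep)))

# Same idea on real data: f is matched by name, so the list falls into x
split_df <- split(mtcars, f = mtcars[, c("cyl", "gear")]) |>
  Filter(f = \(x) nrow(x) > 0)
all(vapply(split_df, nrow, integer(1)) > 0)
```

Because f is matched by name first, the piped list is the only unnamed argument left and is matched to x.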
(Both expressions tested using split(mtcars, mtcars[, c("cyl", "gear")]), confirmed on R-4.1.2 (second) and R-4.2.0 (both).)