I'm trying to write a function that filters rows of a disk.frame object using regular expressions. Unfortunately, I run into issues with how my search string is evaluated in the filtering call. My idea was to pass a regular expression as a string via a function argument (e.g. storm_name) and then use that argument in my filtering call. I used the %like% operator included in {data.table} for filtering rows.
My problem is that the storm_name object gets evaluated inside the disk.frame. However, since storm_name exists only in the function environment and not as a column of the disk.frame object, I get the following error:
Error in .checkTypos(e, names_x) :
Object 'storm_name' not found amongst name, year, month, day, hour and 8 more
I already tried to evaluate the storm_name object in the parent frame using eval(storm_name, envir = parent.frame()), but that didn't help either. Interestingly, this problem only occurs with {disk.frame} objects and not with {data.table} objects.
For now I have found a solution using {dplyr} instead. However, I would be grateful for any ideas on how this problem could be solved with {data.table}.
# Load packages
library(data.table)
library(disk.frame)
# Create data table and diskframe object of storm data
storms_df <- as.disk.frame(storms)
storms_dt <- as.data.table(storms)
# Create search function
grep_storm_name <- function(dfr, storm_name) {
  dfr[name %like% storm_name]
}
# Check function with data.table object
grep_storm_name(storms_dt, "^A")
# Check function with diskframe object
grep_storm_name(storms_df, "^A")
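A {dplyr}-based workaround of the kind mentioned above could look roughly like the sketch below. It is not necessarily the exact code that was used: the helper name grep_storm_name_dplyr is made up for illustration, grepl() stands in for %like%, and whether storm_name is resolved correctly inside the function may depend on the installed disk.frame version.

```r
# Load packages ({dplyr} is loaded explicitly so that filter(), collect(),
# %>% and the storms data set are available)
library(dplyr)
library(disk.frame)

# Create a disk.frame from the storm data that ships with {dplyr}
storms_df <- as.disk.frame(dplyr::storms)

# Hypothetical helper: filter with dplyr verbs instead of `[`
grep_storm_name_dplyr <- function(dfr, storm_name) {
  dfr %>%
    filter(grepl(storm_name, name)) %>%  # regular expression match on the name column
    collect()                            # bring the matching rows into memory
}

# Storms whose name starts with "A"
a_storms <- grep_storm_name_dplyr(storms_df, "^A")
head(a_storms)
```

The filter step is applied chunk by chunk, and collect() then returns the combined result as an ordinary in-memory table.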
R version 4.1.0 (2021-05-18)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19043)
Matrix products: default
locale:
[1] LC_COLLATE=English_Sweden.1252 LC_CTYPE=English_Sweden.1252 LC_MONETARY=English_Sweden.1252
[4] LC_NUMERIC=C LC_TIME=English_Sweden.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] disk.frame_0.5.0 purrr_0.3.4 dplyr_1.0.7 data.table_1.14.0
loaded via a namespace (and not attached):
[1] Rcpp_1.0.7 benchmarkmeData_1.0.4 pryr_0.1.4 pillar_1.6.4
[5] compiler_4.1.0 iterators_1.0.13 tools_4.1.0 digest_0.6.27
[9] bit_4.0.4 jsonlite_1.7.2 lifecycle_1.0.1 tibble_3.1.6
[13] lattice_0.20-44 pkgconfig_2.0.3 rlang_0.4.12 Matrix_1.3-3
[17] foreach_1.5.1 rstudioapi_0.13 DBI_1.1.1 parallel_4.1.0
[21] bigassertr_0.1.4 bigreadr_0.2.4 httr_1.4.2 stringr_1.4.0
[25] globals_0.14.0 generics_0.1.1 fs_1.5.0 vctrs_0.3.8
[29] bit64_4.0.5 grid_4.1.0 tidyselect_1.1.1 glue_1.6.0
[33] listenv_0.8.0 R6_2.5.1 future.apply_1.7.0 parallelly_1.25.0
[37] fansi_1.0.0 magrittr_2.0.1 codetools_0.2-18 ellipsis_0.3.2
[41] fst_0.9.4 assertthat_0.2.1 future_1.21.0 benchmarkme_1.0.7
[45] utf8_1.2.2 stringi_1.7.6 doParallel_1.0.16 crayon_1.4.2
This now works as of disk.frame v0.6.