I'm trying to use a new R package called waldo (see at the tidyverse blog too) that is designed to compare data objects to find differences. The waldo::compare()
function returns an object that is, according to the documentation:
a character vector with class "waldo_compare"
The main purpose of this function is to be used within the console, leveraging coloring features to highlight outstanding values that are not equal between data objects. However, while just examining in console is useful, I do want to take those values and act on them (filter them out from the data, etc.). Therefore, I want to programmatically extract the outstanding values. I don't know how.
set.seed(2020)
vec_a <- sample(0:20, size = 10)
## [1] 3 15 13 0 16 11 10 12 6 18
4
) into an 11th vector element.vec_b <- vec_a
vec_b[11] <- 4
vec_b <- as.integer(vec_b)
## [1] 3 15 13 0 16 11 10 12 6 18 4
waldo::compare()
to test the differences between the two vectorswaldo::compare(vec_a, vec_b)
## `old[8:10]`: 12 6 18
## `new[8:11]`: 12 6 18 4
The beauty is that it's highlighted in the console:
I can try to assign waldo::compare()
to an object:
waldo_diff <- waldo::compare(vec_a, vec_b)
and then what? when I try to do waldo_diff[[1]]
I get:
[1] "`old[8:10]`: \033[90m12\033[39m \033[90m6\033[39m \033[90m18\033[39m \n`new[8:11]`: \033[90m12\033[39m \033[90m6\033[39m \033[90m18\033[39m \033[34m4\033[39m"
and for waldo_diff[[2]]
it's even worse:
Error in waldo_diff[3] : subscript out of bounds
Any idea how I could programmatically extract the outstanding values that appear in the "new" vector but not in the "old"?
At least for the simple case of comparing two vectors, you’ll be better off
using diffobj::ses_dat()
(which is from the package that waldo uses
under the hood) directly:
waldo::compare(1:3, 2:4)
#> `old`: 1 2 3
#> `new`: 2 3 4
diffobj::ses_dat(1:3, 2:4)
#> op val id.a id.b
#> 1 Delete 1 1 NA
#> 2 Match 2 2 NA
#> 3 Match 3 3 NA
#> 4 Insert 4 NA 3
For completeness, to extract additions you could do e.g.:
extract_additions <- function(x, y) {
ses <- diffobj::ses_dat(x, y)
y[ses$id.b[ses$op == "Insert"]]
}
old <- 1:3
new <- 2:4
extract_additions(old, new)
#> [1] 4