Below, I first find whether variables `X` and `Y` have a value that is repeated fewer than 4 times. I find and list these values in `low`.
Using base R, how can I transform `low`, which is a list of `table`s, into my desired output shown below?
Note: The data below is a toy example; a functional answer is appreciated.
data <- data.frame(id = c(rep("AA",4), rep("BB",2), rep("CC",2)),
                   X = c(1,1,1,1,1,1,3,3),
                   Y = c(9,9,9,7,6,6,6,6),
                   Z = 1:8)
mods <- c("X","Y")
A <- setNames(lapply(seq_along(mods), function(i) table(data[[mods[i]]], dnn = NULL)), mods)
low <- setNames(lapply(seq_along(A), function(i) A[[i]][which(A[[i]] < 4)]), names(A))
Desired output:
data.frame(id = c("CC", "AA", "AA"), value = c(3, 7, 9), var.name = c("X", "Y", "Y"), occur = c(2, 1, 3))
#   id value var.name occur   # `value` comes from `names(low[[i]])`, i = 1, 2
# 1 CC     3        X     2   # `occur` comes from `as.numeric(low[[i]])`
# 2 AA     7        Y     1
# 3 AA     9        Y     3
We `split` the subset of columns of `data` by `id`, loop through the resulting list with `lapply`, and do an inner join with `merge` against the corresponding `stack`ed `low` list of `table`s. We then `Filter` out the elements that have 0 rows or length 0 to create `lst1`. From `lst1`, we create additional columns from the inner and outer names with `Map` and `rbind` the elements.
lst1 <- Filter(length, lapply(split(data[c('X', 'Y')], data$id),
    function(dat) Filter(nrow, Map(merge, lapply(dat,
        function(x) stack(table(x))), lapply(low, stack)))))
do.call(rbind, c(Map(cbind, id = names(lst1), lapply(lst1,
    function(x) do.call(rbind, c(Map(cbind, x, var.name = names(x)),
        make.row.names = FALSE)))), make.row.names = FALSE))
#   id values ind var.name
# 1 AA      1   7        Y
# 2 AA      3   9        Y
# 3 CC      2   3        X
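The result above carries the same information as the desired output, but under the default `stack` column names: `values` holds the count and `ind` holds the repeated value. Here is a minimal sketch of one way to rename and reorder it into the requested layout. The names `res` and `out` are hypothetical and used only for illustration, and the sketch assumes the values in `ind` are numeric, as in the toy data.

```r
# Toy data and `low`, exactly as defined in the question
data <- data.frame(id = c(rep("AA", 4), rep("BB", 2), rep("CC", 2)),
                   X = c(1, 1, 1, 1, 1, 1, 3, 3),
                   Y = c(9, 9, 9, 7, 6, 6, 6, 6),
                   Z = 1:8)
mods <- c("X", "Y")
A <- setNames(lapply(seq_along(mods), function(i) table(data[[mods[i]]], dnn = NULL)), mods)
low <- setNames(lapply(seq_along(A), function(i) A[[i]][which(A[[i]] < 4)]), names(A))

# Same steps as above, with the final result stored in `res` (hypothetical name)
lst1 <- Filter(length, lapply(split(data[c('X', 'Y')], data$id),
    function(dat) Filter(nrow, Map(merge, lapply(dat,
        function(x) stack(table(x))), lapply(low, stack)))))
res <- do.call(rbind, c(Map(cbind, id = names(lst1), lapply(lst1,
    function(x) do.call(rbind, c(Map(cbind, x, var.name = names(x)),
        make.row.names = FALSE)))), make.row.names = FALSE))

# Rename: `ind` holds the repeated value, `values` holds how often it occurs.
# as.character() guards against factor columns (e.g. R < 4.0 defaults).
out <- data.frame(id       = as.character(res$id),
                  value    = as.numeric(as.character(res$ind)),
                  var.name = as.character(res$var.name),
                  occur    = as.numeric(res$values),
                  stringsAsFactors = FALSE)

# Order by variable, then by value, and reset the row names
out <- out[order(out$var.name, out$value), ]
rownames(out) <- NULL
out
```

Note that the `merge` step joins on both `values` and `ind`, so a low-frequency value is only reported for an `id` when all of its occurrences fall within that single `id`, which is the case in the toy data.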