r arrays merge cbind

iterating table() results into matrix/data frame


This must be simple, but I've been banging my head against it for a while. Please help. I have a large data set from which I extract all kinds of information via table(). I then want to store these counts together with the names of the categories that were counted. For a reproducible example, consider

a <- c("a", "b", "c", "d", "a", "b")  # one count, occurring twice for a and 
                                      # b and once for c and d 
b <- c("a", "c")  # a completely different property from the dataset,
                  # occurring once for a and c
x <- table(a)
y <- table(b)  # so now x and y hold the information I seek

How can I merge/bind/whatever to get from x and y to this form:

   x  y
a  2  1
b  2  0
c  1  1
d  1  0

HOWEVER, the solution needs to work iteratively: in a loop that takes x and y, produces the form above, and then has more tables added, each ideally contributing one column. One of my many failed attempts, just to show my (probably flawed) logic, is:

member <- function (data = dfm, groupvar = 'group', analysis = kc15) {
  res<-matrix(NA,ncol=length(analysis$size)+1) #preparing an object for the results
  res[,1]<-table(docvars(data,groupvar)) #getting names and totals of groups
  for (i in 1:length(analysis$size)) { #getting a bunch of counts that I care about
    r<-table(docvars(data,groupvar)[analysis$cluster==i])
    res<-cbind(res,r) #here's the problem, trying to add each new count as a column.
  }
  res
}

To sum up: the reproducible example above is meant to replicate the first column of res and one r. I'm looking for (I think) a correct replacement for the cbind, one that allows adding columns of different lengths but with overlapping names, as in the example above. Please help; it's embarrassing how much time I'm wasting on this.


Solution

  • Transpose both tables, merge them, and transpose the result back.

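    # t(unclass(x)) is a 1-row matrix with one column per category;
    # merge(..., all=TRUE) keeps both rows and NA-fills absent categories
    res <- t(merge(t(unclass(x)), t(unclass(y)), all=TRUE))
    # sort rows by name; the columns come out as (y, x) here, hence 2:1
    res <- `colnames<-`(res[order(rownames(res)), 2:1], c("x", "y"))
    # a category missing from one table counts as 0
    res[is.na(res)] <- 0
    res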
    #   x y
    # a 2 1
    # b 2 0
    # c 1 1
    # d 1 0
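
For the loop case in the question, the merge-based version can be fragile: merge() joins on the category names the two tables share, so if both tables happen to hold identical counts for those categories the two rows collapse into one, and the 2:1 column swap depends on how the merged rows sort. As a hedged alternative, here is a minimal sketch that fills a zero matrix by row name instead. The helper name bind_counts is hypothetical (it is not part of the original answer), and it assumes every input is a one-dimensional table() result held in a named list.

```r
# Hypothetical helper (not from the original answer): combine a named list
# of 1-d table() results into one count matrix, one column per table.
bind_counts <- function(tabs) {
  # every category name that appears in any of the tables, sorted
  keys <- sort(unique(unlist(lapply(tabs, names))))
  # start from all zeros so categories absent from a table stay 0
  res <- matrix(0L, nrow = length(keys), ncol = length(tabs),
                dimnames = list(keys, names(tabs)))
  for (i in seq_along(tabs)) {
    tb <- tabs[[i]]
    # write each table's counts into the rows matching its category names
    res[names(tb), i] <- as.vector(tb)
  }
  res
}

a <- c("a", "b", "c", "d", "a", "b")
b <- c("a", "c")
res <- bind_counts(list(x = table(a), y = table(b)))
res
#   x y
# a 2 1
# b 2 0
# c 1 1
# d 1 0
```

Inside a loop, one option would be to collect each new table() result in the list and call the helper once at the end, which avoids growing a matrix with cbind on every iteration.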