My starting point is having several character vectors containing POS tags I extracted from texts. For example:
c("NNS", "VBP", "JJ", "CC", "DT")
c("NNS", "PRP", "JJ", "RB", "VB")
I use table() or ftable() to count the occurrences of each tag.
 CC  DT  JJ NNS VBP
  1   1   1   1   1
The ultimate goal is to have a data.frame looking like this:
  NNS VBP PRP JJ CC RB DT VB
1   1   1   0  1  1  0  1  0
2   1   0   1  1  0  1  0  1
Using plyr::rbind.fill seems reasonable to me here, but it needs data.frame objects as inputs. However, when using as.data.frame.matrix(table(POS_vector)) an error occurs.
Error in seq_len(ncols) :
argument must be coercible to non-negative integer
Using as.data.frame.matrix(ftable(POS_vector)) actually produces a data.frame, but without the tags as column names.
V1 V2 V3 V4 V5 ...
1 1 1 1 1
Any help is highly appreciated.
In base R, you can try:
table(rev(stack(setNames(dat, seq_along(dat)))))
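Since the question asks for a data.frame rather than a table, here is a minimal sketch of how the base R result could be converted. It assumes `dat` is the list of tag vectors from the question; the names `long`, `tab`, and `res` are only illustrative.

```r
# Example input, matching the two vectors in the question.
dat <- list(c("NNS", "VBP", "JJ", "CC", "DT"),
            c("NNS", "PRP", "JJ", "RB", "VB"))

# Name each vector by its position so stack() records which vector a tag came from.
long <- stack(setNames(dat, seq_along(dat)))

# rev() puts the vector index first, so it becomes the row dimension of the table.
tab <- table(rev(long))

# A two-dimensional table converts cleanly and keeps the tags as column names.
res <- as.data.frame.matrix(tab)
res
```

The conversion works here because the table has two dimensions (vector index by tag). Calling table() on a single vector gives a one-dimensional table, which is likely why as.data.frame.matrix() failed in the question.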
You can also use mtabulate from "qdapTools":
library(qdapTools)
mtabulate(dat)
#   CC DT JJ NNS PRP RB VB VBP
# 1  1  1  1   1   0  0  0   1
# 2  0  0  1   1   1  1  1   0
Here, dat is the same as defined in @Heroka's answer:
dat <- list(c("NNS", "VBP", "JJ", "CC", "DT"),
            c("NNS", "PRP", "JJ", "RB", "VB"))