I am using quanteda to build two document-feature matrices:
library(quanteda)
DFM1 <- dfm("this is a rock")
#        features
# docs    this is a rock
#   text1    1  1 1    1
DFM2 <- dfm("this is music")
#        features
# docs    this is music
#   text1    1  1     1
However, I want DFM2 to have a specific set of features, namely the ones from DFM1:
DFM2 <- dfm("this is music", *magicargument* = featnames(DFM1))
#        features
# docs    this is a rock
#   text1    1  1 0    0
Is there a magicargument that I am missing? Or is there another efficient way to achieve this for large bags of words?
The magic argument is pattern in dfm_select(), where you supply a dfm whose features will be matched (features that the supplied dfm has but the target dfm lacks are added as zeroes):
dfm_select(DFM2, pattern = DFM1)
# Document-feature matrix of: 1 document, 4 features (50% sparse).
# 1 x 4 sparse Matrix of class "dfmSparse"
#        features
# docs    this is a rock
#   text1    1  1 0    0
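If you are on a more recent quanteda release, the same result can probably be had with dfm_match(), which conforms a dfm to a given vector of feature names. The sketch below assumes a version that ships dfm_match() (I believe 1.4 or later) and that expects dfm() to be given tokens rather than raw text. The name DFM2_matched is only illustrative. As far as I know, newer versions of dfm_select() no longer pad missing features when pattern is a dfm, so treat this as an alternative to check against your installed version rather than a drop-in guarantee.

```r
library(quanteda)

# Tokenize first; newer quanteda versions expect dfm() to receive a tokens object
DFM1 <- dfm(tokens("this is a rock"))
DFM2 <- dfm(tokens("this is music"))

# Conform DFM2 to the feature set (and feature order) of DFM1:
# features missing from DFM2 are added as zeroes, extra features are dropped
DFM2_matched <- dfm_match(DFM2, features = featnames(DFM1))

featnames(DFM2_matched)
as.matrix(DFM2_matched)
```

Because only the character vector of feature names is passed around, this should also be cheap for large vocabularies: the matching is done on the sparse matrix columns and nothing is re-tokenized.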