I have a document-term matrix:
document_term_matrix <- as.matrix(DocumentTermMatrix(corpus, control = list(stemming = FALSE, stopwords=FALSE, minWordLength=3, removeNumbers=TRUE, removePunctuation=TRUE )))
For this document-term matrix, I've calculated the local term- and global term weighing as follows:
lw_tf <- lw_tf(document_term_matrix)
gw_idf <- gw_idf(document_term_matrix)
lw_tf
is a matrix with the same dimensionality as the document-term-matrix (nxm) and gw_idf
is a vector of size n. However, when I run:
tf_idf <- lw_tf * gw_idf
The dimensionality of tf_idf
is again nxm.
Originally, I would not expect this multiplication to work, as the dimensionalities are not conformable. However, given this output I now expect the dimensionality of gw_idf
to be mxm. Is this indeed the case? And if so: what happened to the gw_idf
vector of size n
?
Matrix multiplication is done in R by using %*%
, not *
(the latter is just element-wise multiplication). Your reasoning is partially correct, you were just using the wrong symbols.
About the matrix multiplication, a matrix multiplication is only possible if the second dimension of the first matrix is the same as the first dimensions of the second matrix. The resulting dimensions is the dim1 of first matrix by the dim2 of the second matrix.
In your case, you're telling us you have a 1 x n
matrix multiplied by a n x m
matrix, which should result in a 1 x m
matrix. You can check such case in this example:
a <- matrix(runif(100, 0 , 1), nrow = 1, ncol = 100)
b <- matrix(runif(100 * 200, 0, 1), nrow = 100, ncol = 200)
c <- a %*% b
dim(c)
[1] 1 200
Now, about your specific case, I don't really have this package that makes term-documents (would be nice of you to provide an easily reproducible example!), but if you're multiplying a nxm
matrix element-wise (you're using *
, like I said in the beginning) by a nx1
array, the result does not make sense. Either your variable gw_idf
is not an array at all (maybe it's just a scalar) or you're simply making a wrong conclusion.