Dot product of sparse matrices

I'm reading the implementation of Multinomial Naive Bayes, and I do not understand how the following dot product of two matrices works.

self.feature_count_ += safe_sparse_dot(Y.T, X)

Code can be found here

Here Y.T.shape = (3, 7000) and X.shape = (7000, 27860). How can this work when the number of rows in Y.T is not equal to the number of columns in X? The resulting matrix has shape (3, 27860). How does that work? What am I missing?

Solution

Check out the "Multiplying a Matrix by Another Matrix" section here: https://www.mathsisfun.com/algebra/matrix-multiplying.html

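If you go through the multiplication, you'll see that only the "inner" dimensions have to match: the number of columns of Y.T must equal the number of rows of X (7000 in your case). The result takes the "outer" dimensions, which is why you get (3, 27860).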
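To see the shape rule in action, here is a minimal sketch with plain NumPy arrays. The toy sizes (7 samples, 5 features, 3 classes) and the names `labels` and `feature_count` are made up for illustration; they stand in for the 7000 samples, 27860 features, and 3 classes from the question. It assumes Y is a one-hot encoding of the class labels, in which case each row of the product is the sum of the feature counts for one class.

```python
import numpy as np

# Hypothetical toy data: 7 samples, 5 features, 3 classes
# (stand-ins for the 7000 x 27860 matrix in the question).
X = np.array([
    [1, 0, 2, 0, 0],
    [0, 1, 0, 0, 3],
    [2, 0, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [1, 1, 0, 0, 0],
    [0, 2, 0, 1, 0],
    [0, 0, 0, 4, 1],
])
labels = np.array([0, 1, 2, 0, 1, 2, 0])

# One-hot encode the labels: Y has shape (n_samples, n_classes) = (7, 3).
Y = np.eye(3, dtype=int)[labels]

# (3, 7) @ (7, 5): the inner dimensions (7 and 7) match,
# so the product is defined and has the outer dimensions (3, 5).
feature_count = Y.T @ X

print(Y.T.shape, X.shape, feature_count.shape)  # (3, 7) (7, 5) (3, 5)

# Row c of the product is the sum of the rows of X that belong to class c.
print(feature_count[0])  # [1 0 3 4 2]
```

The same rule applies to sparse matrices; only the storage format differs, not the shape arithmetic.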