I have a sample matrix X that is sparse (~5% nonzero), and I am trying to scale each column by a factor (basically tf-idf normalization).
I thought this would be an easy task, but it turns out not to be well supported. Here is what I used:
fac = log(size(X,1)./max(1,sum(X ~= 0)));
X = bsxfun(@times,X,fac); % this line gives an out of memory error
X is around 20,000 x 1,000,000, but only ~5% of the entries are nonzero, so memory shouldn't be a problem: the machine has 48 GB of RAM and could easily hold a full matrix with the same number of allocated elements.
Actually I feel that there must be an easy way to do this, as it is a very common operation with sparse matrices holding data samples.
Thanks in advance
Yay for linear algebra! Scaling the columns is the same as right-multiplying by a diagonal matrix:
X = X*diag(sparse(fac));
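To see why this works: if D is diagonal with fac on its diagonal, column j of X*D is column j of X times fac(j). Here is a minimal sketch of the same idea on a small random matrix. The sizes, the variable names D and Xs, and the use of spdiags in place of diag(sparse(...)) are my own choices for illustration, not something from the question.

```matlab
% Small stand-in for the real data; the sizes are made up for illustration.
X = sprand(200, 1000, 0.05);

% Same factor as in the question. full() turns the (possibly sparse)
% row of column counts into an ordinary dense row vector.
fac = log(size(X,1) ./ max(1, full(sum(X ~= 0, 1))));
n = size(X, 2);

% Sparse n-by-n matrix with fac on the main diagonal.
D = spdiags(fac(:), 0, n, n);

% Right-multiplication scales column j of X by fac(j).
Xs = X * D;

% Spot-check one column against direct scaling.
j = 7;
err = norm(Xs(:,j) - X(:,j) * fac(j), 1);
```

Since D only stores its n diagonal entries, the product should stay sparse with the same nonzero pattern as X (except where a factor is exactly zero).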