I currently have a couple of algorithms in Matlab that I am looking to port to Java, using one of the following libraries: Colt, Apache Commons Math, or jblas. However, since my main goal is to improve the speed of these algorithms, I am looking for suggestions, and ideally existing implementations, for parallelizing them to increase performance.
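For reference, my baseline would just be a direct port that calls one of these libraries. A minimal sketch with jblas (I haven't settled on a library yet, so treat the names as illustrative) would look something like this:

```java
import org.jblas.DoubleMatrix;

public class JblasBaseline {
    public static void main(String[] args) {
        // two random 1000x1000 dense matrices
        DoubleMatrix a = DoubleMatrix.randn(1000, 1000);
        DoubleMatrix b = DoubleMatrix.randn(1000, 1000);

        long start = System.nanoTime();
        DoubleMatrix c = a.mmul(b);   // matrix multiply, delegates to native BLAS
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;

        System.out.println("1000x1000 multiply took " + elapsedMs + " ms, c(0,0) = " + c.get(0, 0));
    }
}
```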
From what I can tell, Hadoop is not a good option for distributing matrix operations. I have also looked at Mahout, but it is not clear to me whether it would help with this objective.
Many thanks for all your tips and suggestions.
Where are you getting the information that Hadoop "is not a good option for distributing matrix operations"? It is certainly a good option, but only when your data is huge - 50 GB+ at least. If your data fits in memory, Hadoop is not a good option; if you expect to run on multiple TB of data, then Hadoop is the right tool for the job. There are also a lot of other things to consider when optimizing matrix multiplication, such as the structure of your data (is it sparse? does it occur in clusters? etc.).
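If your matrices do fit in memory, the first step before any distributed framework is simply to use all the cores on one machine. As a rough sketch (plain Java arrays and parallel streams, not tied to any of the libraries you listed), a row-parallel dense multiply could look like this:

```java
import java.util.stream.IntStream;

public class ParallelMultiply {
    // Row-parallel dense multiply: each worker thread computes whole rows of C,
    // so no two threads ever write to the same row and no locking is needed.
    static double[][] multiply(double[][] a, double[][] b) {
        int n = a.length, k = b.length, m = b[0].length;
        double[][] c = new double[n][m];
        IntStream.range(0, n).parallel().forEach(i -> {
            for (int p = 0; p < k; p++) {
                double aip = a[i][p];
                for (int j = 0; j < m; j++) {
                    c[i][j] += aip * b[p][j];
                }
            }
        });
        return c;
    }
}
```

Note that jblas in particular delegates to native BLAS, which may already be multithreaded depending on how it was built, so benchmark the library itself before rolling your own parallel loops.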
There's plenty of information on Google about implementing matrix multiplication on MapReduce - Jeffrey Ullman's book (Mining of Massive Datasets) might be a good place to start if you choose this route.
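To give a flavor of the MapReduce approach, here is a rough sketch of the one-step matrix multiplication described there, assuming A and B are stored as text triples ("A,i,k,value" / "B,k,j,value") and the dimensions are passed through the job configuration. The class and property names are just placeholders, and the driver that wires up the job is omitted:

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

// One-step MapReduce multiply C = A * B.
// Input lines are triples: "A,i,k,value" or "B,k,j,value".
public class MatrixMultiply {

    public static class MultiplyMapper extends Mapper<LongWritable, Text, Text, Text> {
        @Override
        protected void map(LongWritable offset, Text line, Context context)
                throws IOException, InterruptedException {
            int aRows = context.getConfiguration().getInt("mm.arows", 0);
            int bCols = context.getConfiguration().getInt("mm.bcols", 0);
            String[] p = line.toString().split(",");
            if (p[0].equals("A")) {
                // A(i,k) contributes to every C(i,j), j = 0..bCols-1
                for (int j = 0; j < bCols; j++) {
                    context.write(new Text(p[1] + "," + j), new Text("A," + p[2] + "," + p[3]));
                }
            } else {
                // B(k,j) contributes to every C(i,j), i = 0..aRows-1
                for (int i = 0; i < aRows; i++) {
                    context.write(new Text(i + "," + p[2]), new Text("B," + p[1] + "," + p[3]));
                }
            }
        }
    }

    public static class MultiplyReducer extends Reducer<Text, Text, Text, Text> {
        @Override
        protected void reduce(Text cell, Iterable<Text> values, Context context)
                throws IOException, InterruptedException {
            // Join A's row and B's column on the shared index k, then sum the products.
            Map<Integer, Double> aVals = new HashMap<Integer, Double>();
            Map<Integer, Double> bVals = new HashMap<Integer, Double>();
            for (Text t : values) {
                String[] p = t.toString().split(",");
                int k = Integer.parseInt(p[1]);
                double v = Double.parseDouble(p[2]);
                if (p[0].equals("A")) aVals.put(k, v); else bVals.put(k, v);
            }
            double sum = 0.0;
            for (Map.Entry<Integer, Double> e : aVals.entrySet()) {
                Double b = bVals.get(e.getKey());
                if (b != null) sum += e.getValue() * b;
            }
            context.write(cell, new Text(Double.toString(sum)));
        }
    }
}
```

The point of the sketch is that every output cell of C becomes its own reduce call, which is where the distributed parallelism comes from - but note the mapper replicates each input element across a whole row or column of outputs, which is exactly the shuffle cost that only pays off at very large scale.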