I'm new to Mahout and would like to build my own algorithms with its tools. I'm quite puzzled by the usage of Mahout's SequentialAccessSparseVector and RandomAccessSparseVector. Could someone suggest when one should be preferred over the other?
Thanks
The random-access version is backed by a hash table, which gives it the fastest sets and gets, but its iteration order is undefined. The sequential-access version keeps its entries in order of dimension index. Iterating in that order makes some operations more efficient, such as computing a dot product, which only needs to look at the dimensions where both vectors are defined. In exchange, the sequential-access version will have slightly slower sets and gets and may use a little more memory. Both are sparse representations, though.
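To make the trade-off concrete, here is a minimal sketch in plain Java. It does not use Mahout and is not Mahout's implementation. The class SparseDotSketch and its two methods are hypothetical names invented for illustration: a HashMap stands in for the hash-backed layout, and a pair of index-sorted parallel arrays stands in for the ordered layout.

```java
import java.util.HashMap;
import java.util.Map;

// Illustration only: these are stand-ins for the two storage layouts,
// not Mahout classes and not Mahout's actual code.
final class SparseDotSketch {

    // Hash-backed layout: fast get/set, but no guaranteed iteration order.
    // The dot product iterates the smaller map and probes the other by key.
    static double dotHashed(Map<Integer, Double> a, Map<Integer, Double> b) {
        Map<Integer, Double> small = a.size() <= b.size() ? a : b;
        Map<Integer, Double> large = (small == a) ? b : a;
        double sum = 0.0;
        for (Map.Entry<Integer, Double> e : small.entrySet()) {
            Double other = large.get(e.getKey());
            if (other != null) {
                sum += e.getValue() * other;
            }
        }
        return sum;
    }

    // Ordered layout: indices sorted ascending, values stored alongside.
    // Both vectors are walked once in step, like the merge phase of merge sort,
    // and only dimensions present in both contribute to the sum.
    static double dotSorted(int[] idxA, double[] valA, int[] idxB, double[] valB) {
        int i = 0;
        int j = 0;
        double sum = 0.0;
        while (i < idxA.length && j < idxB.length) {
            if (idxA[i] == idxB[j]) {
                sum += valA[i] * valB[j];
                i++;
                j++;
            } else if (idxA[i] < idxB[j]) {
                i++; // dimension only in a, skip it
            } else {
                j++; // dimension only in b, skip it
            }
        }
        return sum;
    }
}
```

Both methods return the same value for the same data. The ordered walk needs no hash lookups, but inserting a new dimension into sorted arrays means shifting elements, which is why sets are slower in that layout. In Mahout itself you would simply call dot on the Vector rather than write this by hand.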