Search code examples
scaladictionarydot-productvectormath

Calculating the Dot Product of two Sparse Vectors (and generating them) in Scala using the standard library


I am trying to calculate the dot product (scalar product) of two sparse vectors in Scala. The code I have written is doing everything that I want it to, except when multiplying the similar elements of the two vectors, it is not accounting for the 0 values.

I expect to get 72 as my answer as 3 and 18 are the only keys that are both non-zero and they evaluate to: (3 -> 21) + (18 -> 51) = 72

I used withDefaultValue(0) hoping it would "fill in" the unmentioned key/value pairs but I do not think this is the case, and I believe this is where my problem is coming from, in the very beginning. I think my question could also be "How to generate a Sparse Vector in Scala using the Standard Library".

If I enter the corresponding 0's and the two Maps (vectors) have the same number of key/value pairs, my code works properly.

```
  val Sparse1 = Map(0 -> 4, 3 -> 7, 6 -> 11, 18 -> 17).withDefaultValue(0)
  val Sparse2 = Map(1 -> 3, 3 -> 3, 11 -> 2,18 -> 3, 20 -> 6).withDefaultValue(0)
  //println(Sparse2.toSeq)//to see what it is....0's missing
  val SparseSum = (Sparse1.toSeq ++ Sparse2.toSeq).groupBy(_._1).mapValues(_.map(_._2).sum)
  //println(SparseSum)
  val productOfValues = ((Sparse1.toSeq ++ Sparse2.toSeq).groupBy(_._1).mapValues(_.map(_._2).reduce(_*_)))
  //println(productOfValues)
  var dotProduct = 0
  for ((h,i) <- productOfValues) {
    dotProduct += i
  }
  //println(dotProduct)
  //If I specify some zero values, lets see what happens:
  val Sparse3 = Map(0 -> 4, 1 -> 0, 3 -> 7, 6 -> 11, 11 -> 0, 18 -> 17, 20 -> 0).withDefaultValue(0)
  val Sparse4 = Map(0 -> 0, 1 -> 3, 3 -> 3, 6 -> 0, 11 -> 2,18 -> 3, 20 -> 6).withDefaultValue(0)
  val productOfValues2 = ((Sparse3.toSeq ++ Sparse4.toSeq).groupBy(_._1).mapValues(_.map(_._2).reduce(_*_)))
  var dotProduct2 = 0
  for ((l, m) <- productOfValues2) {
    dotProduct2 += m
  }
  println(productOfValues2)
  println(dotProduct2)//I get 72

```

I can create a Sparse Vector this way, and then update the values

  import scala.collection.mutable.Map
  val Sparse1 = Map[Int, Int]()
  for (k <- 0 to 20) {
    Sparse1 getOrElseUpdate (k, 0)
  }
  val Sparse2 = Map[Int, Int]()
  for (k <- 0 to 20) {
    Sparse2 getOrElseUpdate (k, 0)
  }

But I'm wondering if there is a "better" way. More along the lines of what I tried and failed to do using "withDefaultValue(0)"


Solution

  • Since you are using sparse vectors, you can ignore all keys that are not on both vectors.
    Thus, I would compute the intersection between both keys sets and then perform a simple map-reduce to compute the dot product.

    type SparseVector[T] = Map[Int, T]
    
    /** Generic function for any type T that can be multiplied & summed. */
    def sparseDotProduct[T: Numeric](v1: SparseVector[T], v2: SparseVector[T]): T = {
      import Numeric.Implicits._
    
      val commonIndexes = v1.keySet & v2.keySet
    
      commonIndexes
        .map(i => v1(i) * v2(i))
        .foldLeft(implicitly[Numeric[T]].zero)(_ + _)
    }
    

    Then, you can use it like this:

    // The withDefault(0) is optional now.
    val sparse1 = Map(0 -> 4, 3 -> 7, 6 -> 11, 18 -> 17).withDefaultValue(0)
    val sparse2 = Map(1 -> 3, 3 -> 3, 11 -> 2, 18 -> 3, 20 -> 6).withDefaultValue(0)
    
    sparseDotProduct(sparse1, sparse2)
    // res: Int = 72
    

    Edit - the same method, but without context bounds & implicit syntax.

    type SparseVector[T] = Map[Int, T]
    
    /** Generic function for any type T that can be multiplied & summed. */
    def sparseDotProduct[T](v1: SparseVector[T], v2: SparseVector[T])(implicit N: Numeric[T]): T = {      
      val commonIndexes = v1.keySet & v2.keySet
    
      commonIndexes
        .map(i => N.times(v1(i), v2(i)))
        .foldLeft(N.zero)((acc, element) => N.plus(acc, element))
    }
    

    Bonus - General approach for non-spare vectors.

    One can modify the above method to work for any kind of vector, not just spare. In this case, we would need the union of the keys, and take into account cases where one key does not exist on the other.

    type MyVector[T] = Map[Int, T]
    
    /** Generic function for any type T that can be multiplied & summed. */
    def dotProduct[T: Numeric](v1: MyVector[T], v2: MyVector[T]): T = {
      import Numeric.Implicits._
      val zero = implicitly[Numeric[T]].zero
    
      val allIndexes = v1.keySet | v2.keySet
    
      allIndexes.map { i =>
         v1.getOrElse(
           key = i,
           default = zero
         ) * v2.getOrElse(
           key = i,
           default = zero
         )
       }.foldLeft(zero)(_ + _)
    }