Tags: scala, apache-spark, spark-graphx

"error: type mismatch" in Spark with same found and required datatypes


I am using spark-shell to run my code. In my code, I have defined a function, and I call that function with its parameters.

The problem is that I get the error below when I call the function.

error: type mismatch;

found   : org.apache.spark.graphx.Graph[VertexProperty(in class $iwC)(in class $iwC)(in class $iwC)(in class $iwC),String]

required: org.apache.spark.graphx.Graph[VertexProperty(in class $iwC)(in class $iwC)(in class $iwC)(in class $iwC),String]

What is the reason behind this error? Has it got anything to do with the Graph datatype in Spark?

Code: This is the part of my code that involves the definition and the call of the function "countpermissions".

class VertexProperty(val id: Long) extends Serializable
case class User(userId: Long, userCode: String, Name: String, Surname: String) extends VertexProperty(userId)
case class Entitlement(entitlementId: Long, name: String) extends VertexProperty(entitlementId)

def countpermissions(es: String, sg: Graph[VertexProperty, String]): Long = {
    0L
}

val triplets = graph.triplets
val temp = triplets.map(t => t.attr)
val distinct_edge_string = temp.distinct    
var bcast_graph = sc.broadcast(graph)        
val edge_string_subgraph = distinct_edge_string.map(es => es -> bcast_graph.value.subgraph(epred = t => t.attr == es))
val temp1 = edge_string_subgraph.map(t => t._1 -> countpermissions(t._1, t._2))

The code runs without errors until the last line, where it throws the above-mentioned error.


Solution

  • Here is the trick. Let's open the REPL and define a class:

    scala> case class Foo(i: Int)
    defined class Foo
    

    and a simple function which operates on this class:

    scala> def fooToInt(foo: Foo) = foo.i
    fooToInt: (foo: Foo)Int
    

    redefine the class:

    scala> case class Foo(i: Int)
    defined class Foo
    

    and create an instance:

    scala> val foo = Foo(1)
    foo: Foo = Foo(1)
    

    All that's left is to call fooToInt:

    scala> fooToInt(foo)
    <console>:34: error: type mismatch;
     found   : Foo(in class $iwC)(in class $iwC)(in class $iwC)(in class $iwC)
     required: Foo(in class $iwC)(in class $iwC)(in class $iwC)(in class $iwC)
              fooToInt(foo)
    

    Does it look familiar? Here is yet another trick to get a better idea of what is going on:

    scala> case class Foo(i: Int)
    defined class Foo
    
    scala> val foo = Foo(1)
    foo: Foo = Foo(1)
    
    scala> case class Foo(i: Int)
    defined class Foo
    
    scala> def fooToInt(foo: Foo) = foo.i
    <console>:31: error: reference to Foo is ambiguous;
    it is imported twice in the same scope by
    import INSTANCE.Foo
    and import INSTANCE.Foo
             def fooToInt(foo: Foo) = foo.i
    

    So, long story short, this is expected, although slightly confusing, behavior which arises from ambiguous definitions existing in the same scope.
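    The same situation can be reproduced in compiled Scala: the REPL wraps each line in nested synthetic objects (the $iwC classes in the error message), so two definitions of Foo live in different enclosing scopes and are distinct types even though they look identical. A minimal sketch, with hypothetical wrapper objects A and B standing in for the REPL's wrappers:

```scala
// Two structurally identical classes in different scopes are distinct types,
// just like the two REPL definitions of Foo.
object A { case class Foo(i: Int) }
object B { case class Foo(i: Int) }

object Demo {
  // This function is tied to the first definition, A.Foo.
  def fooToInt(foo: A.Foo): Int = foo.i

  def main(args: Array[String]): Unit = {
    // fooToInt(B.Foo(1))  // does not compile: found B.Foo, required A.Foo
    println(fooToInt(A.Foo(1)))                      // 1
    println(A.Foo(1).getClass == B.Foo(1).getClass)  // false: distinct classes
  }
}
```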

    Unless you want to periodically :reset the REPL state, you should keep track of the entities you create, and if type definitions change, make sure that no ambiguous definitions persist (overwrite things if needed) before you proceed.
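    Applied to the question: after VertexProperty (or any of its subclasses) is redefined in spark-shell, countpermissions and the graph must be rebuilt against the same definition. In the toy example, keeping the definitions consistent in a fresh REPL session makes the call work (a sketch):

    scala> case class Foo(i: Int)
    defined class Foo

    scala> val foo = Foo(1)
    foo: Foo = Foo(1)

    scala> def fooToInt(foo: Foo) = foo.i
    fooToInt: (foo: Foo)Int

    scala> fooToInt(foo)
    res0: Int = 1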