
HBase batch get with spark scala


I am trying to fetch data from HBase for a list of row keys. The API documentation lists a method get(List gets), but when I try to use it the compiler complains as below. Has anyone run into this?

overloaded method value get with alternatives: (x$1: java.util.List[org.apache.hadoop.hbase.client.Get])Array[org.apache.hadoop.hbase.client.Result] <and> (x$1: org.apache.hadoop.hbase.client.Get)org.apache.hadoop.hbase.client.Result cannot be applied to (List[org.apache.hadoop.hbase.client.Get])
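The error occurs because Scala's immutable List does not conform to java.util.List, so neither overload of get applies. A minimal, self-contained sketch (with hypothetical get overloads standing in for the HBase API) reproduces the mismatch and the fix:

```scala
import scala.collection.JavaConverters._

object OverloadDemo {
  // Stand-ins for the HBase overloads: one takes a java.util.List,
  // the other a single element.
  def get(gets: java.util.List[String]): Int = gets.size
  def get(one: String): Int = 1

  def main(args: Array[String]): Unit = {
    val scalaList = List("a", "b")
    // get(scalaList)  // does not compile: List[String] is not java.util.List[String]
    println(get(scalaList.asJava)) // asJava wraps it as a java.util.List; prints 2
  }
}
```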

The code I tried:

    val keys: List[String] = df.select("id").rdd.map(r => r.getString(0)).collect.toList
    val gets: List[Get] = keys.map(x => new Get(Bytes.toBytes(x)))
    val results = hTable.get(gets)

Solution

  • I ended up using JavaConverters to convert the Scala List to a java.util.List, and then it worked:

    import scala.collection.JavaConverters._

    val gets: List[Get] = keys.map(x => new Get(Bytes.toBytes(x)))
    val getJ = gets.asJava
    val results = hTable.get(getJ).toList
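Since the batch call itself needs a live HBase cluster, the conversion round trip can be sketched in isolation; here a hypothetical fakeBatchGet plays the role of HTable.get, with plain strings standing in for Get and Result:

```scala
import scala.collection.JavaConverters._

object BatchGetSketch {
  // Stand-in for HTable.get(java.util.List[Get]): Array[Result]
  def fakeBatchGet(gets: java.util.List[String]): Array[String] =
    gets.asScala.map(k => s"result-$k").toArray

  def main(args: Array[String]): Unit = {
    val keys = List("row1", "row2")
    val getsJ = keys.asJava                  // Scala List -> java.util.List for the Java API
    val results = fakeBatchGet(getsJ).toList // Array back to a Scala List
    println(results)                         // List(result-row1, result-row2)
  }
}
```

The same two conversions (asJava going in, toList coming out) are all the HBase version needs.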