scala, apache-spark, apache-spark-sql

How to pass values dynamically from a table to a Spark Scala DataFrame


I want to pass values (objectName, Blocklist) dynamically from the table reference.objectdata, which contains the data below. I am running this code in Databricks.

[screenshot of the reference.objectdata table contents]

import io.delta.tables._
import org.apache.spark.{SparkContext, SparkConf}
import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.row_number

val df = spark.read.table("reference.objectdata")

val objectname = df.select("objectName")
var i = 1

while (i <= df.count()) {
  val firstdf = DeltaTable.forName(s"$objectName")
  val timestampvalue = firstdf.history(Blocklist).select("timestamp", "operationMetrics")
  val w1 = timestampvalue.orderBy("timestamp")
  val w = w1.head

  i = i + 1
}

Println(w)

Solution

  • In my understanding, you want something along these lines:

    case class ObjectData(objectName: String, blocklist: Int)
    
    // Retrieve the list of object names and blocklists
    // You could also keep it as a DataFrame if you prefer
    val df: Dataset[ObjectData] = spark.read.table("reference.objectdata").as[ObjectData]
    
    val df2: Dataset[X] = df.map { case ObjectData(objectName, blocklist) =>
      // For each object, execute a query that return a value of type X
      val something: X = DeltaTable.forName(objectName)
        ...
        .select(...)
        .orderBy(...)
        .head
      something 
    }
    

    This is only a draft that you'll need to adapt to your exact needs, and some of the types may be incorrect since I didn't try to compile it (in fact, I'm sure it doesn't compile!).
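
    To make that idea a bit more concrete, here is a minimal sketch of the same approach. It assumes the reference table is small (one row per object) so it can safely be collected to the driver, and that blocklist is the number of history entries to fetch per object; both are assumptions on my part. DeltaTable.forName needs the driver's SparkSession, so the per-object lookups run in a driver-side loop rather than inside an executor-side map:

    import io.delta.tables.DeltaTable
    import org.apache.spark.sql.Row
    import org.apache.spark.sql.functions.col
    import spark.implicits._

    case class ObjectData(objectName: String, blocklist: Int)

    // Collect the (assumed small) reference table to the driver
    val objects: Array[ObjectData] =
      spark.read.table("reference.objectdata").as[ObjectData].collect()

    // For each object, fetch the most recent history entry (timestamp + operationMetrics).
    // history(n) limits the lookup to the last n commits; treating blocklist as that
    // limit is an assumption about what the column means.
    val latestOperations: Array[(String, Row)] = objects.map {
      case ObjectData(objectName, blocklist) =>
        val latest = DeltaTable
          .forName(objectName)
          .history(blocklist)
          .select("timestamp", "operationMetrics")
          .orderBy(col("timestamp").desc)
          .head()
        (objectName, latest)
    }

    latestOperations.foreach(println)

    Collecting to the driver is only reasonable because reference.objectdata is a small lookup table; the heavier work (reading each Delta table's history) still goes through Spark.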