I tried setting "spark.debug.maxToStringFields" as described in this warning message: WARN Utils: Truncated the string representation of a plan since it was too large. This behavior can be adjusted by setting 'spark.debug.maxToStringFields' in SparkEnv.conf.
Please find my code below:
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

val sparkConf = new SparkConf()
// first attempt: set the options on SparkConf before the session is created
sparkConf.set("spark.debug.maxToStringFields", "100000")
sparkConf.set("spark.sql.debug.maxToStringFields", "100000")

val spark = SparkSession.builder.config(sparkConf).getOrCreate()
// second attempt: set the options on the running session's RuntimeConfig
spark.conf.set("spark.debug.maxToStringFields", "100000")
spark.conf.set("spark.sql.debug.maxToStringFields", "100000")

val data = spark.read
  .option("header", "true")
  .option("delimiter", "|")
  .csv(path_to_csv_file)
  .repartition(col("country"))

println(data.rdd.toDebugString)
I still only get partial output from toDebugString, together with the warning message above. As you can see, I have tried both options. Why is it not printing the full RDD lineage?
Can you check here:
https://www.programcreek.com/scala/org.apache.spark.SparkEnv
I think you have to set the value on the SparkEnv configuration, for example:
val sparkEnv = SparkEnv.get
sparkEnv.conf.set("spark.oap.cache.strategy", "not_support_cache")
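Applied to the key from the warning message, a minimal sketch might look like the following. This assumes a Spark application is already running (SparkEnv.get is only valid after the SparkContext exists), and whether a change made this way takes effect for plan truncation depends on the Spark version; in Spark 3.x the SQL-side key spark.sql.debug.maxToStringFields is the one consulted.

```scala
import org.apache.spark.SparkEnv

// SparkEnv.get returns the environment of the running application;
// it is null before the SparkContext has been created.
val sparkEnv = SparkEnv.get

// Set the key named in the warning message on the SparkEnv conf.
// 100000 mirrors the value used in the question.
sparkEnv.conf.set("spark.debug.maxToStringFields", "100000")

// Read the value back to confirm it landed in the conf.
println(sparkEnv.conf.get("spark.debug.maxToStringFields"))
```

This only shows the mechanism of writing to the SparkEnv conf, as the linked page suggests; it is not a verified fix for the truncated toDebugString output.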