Tags: apache-spark, pyspark, hive, apache-spark-sql

PySpark error: "Undefined function: 'from_timestamp'"


I am trying to fetch some data from a Hive view in PySpark using spark.sql, but every time it throws the error below:

pyspark.sql.utils.AnalysisException: u"Undefined function: 'from_timestamp'. This function is neither a registered temporary function nor a permanent function registered in the database 'default'.;

My settings on SparkSession.builder are these:

spark = SparkSession.builder.appName("home_office") \
    .config("hive.exec.dynamic.partition", "true") \
    .config("hive.exec.dynamic.partition.mode", "nonstrict") \
    .config("hive.exec.compress.output", "false") \
    .config("spark.unsafe.sorter.spill.read.ahead.enabled", "false") \
    .config("spark.debug.maxToStringFields", 1000)\
    .enableHiveSupport() \
    .getOrCreate()

Solution

  • There is no function named from_timestamp in Spark SQL. If you're referring to Impala's from_timestamp, I believe the Spark SQL equivalent is date_format.

    Example usage:

    select date_format(current_timestamp(), 'dd/MM/yyyy hh:mm:ss a');
    
    07/01/2021 08:37:11 AM
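As a quick sanity check outside Spark, the same output can be reproduced with Python's `strftime`. Note the pattern letters differ: Spark SQL's `date_format` uses Java-style codes (`dd/MM/yyyy hh:mm:ss a`), while Python uses `%`-directives; the mapping below is illustrative, and the hard-coded timestamp is just a stand-in for `current_timestamp()`:

```python
from datetime import datetime

# Spark/Java pattern 'dd/MM/yyyy hh:mm:ss a' roughly maps to
# Python strftime '%d/%m/%Y %I:%M:%S %p':
#   dd -> %d, MM -> %m, yyyy -> %Y, hh -> %I (12-hour clock), a -> %p
ts = datetime(2021, 1, 7, 8, 37, 11)  # stand-in for current_timestamp()
formatted = ts.strftime("%d/%m/%Y %I:%M:%S %p")
print(formatted)  # 07/01/2021 08:37:11 AM
```

This also explains the original error: the query was written against Impala's function catalog, but once the view is read through Spark SQL, only Spark's built-in (or registered) functions are available.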