Tags: pyspark, azure-databricks

Rounding to 2 decimals is not happening in PySpark


I am doing the calculation below in Databricks, rounding the result off to 2 decimal places.

from pyspark.sql.functions import coalesce, col, round, when

result = (
    round(
        coalesce(
            when(col('col') != 0, col('col')),
            when(col('col') != 0, col('col')),
            when(col('col') != 0, col('col')),
            when(col('col') != 0, col('col'))
        ) * col('col4') +
        when((col('col') > 0) & (col('col') > 0), col('col') * col('col')).otherwise(col('col')),
        2
    )
    .alias('col')
)

My code works fine, but for one record it does not round off correctly.

For example, 216.495 should round off to 216.50, but the output shows 216.49.


Solution

  • Change the type of the column to DoubleType, or cast it to DecimalType with scale 3.

    Either gives the expected results.
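
    The root cause is binary floating point: the nearest 32-bit float to 216.495 sits slightly below it, so Spark's HALF_UP rounding goes down to 216.49, while the nearest 64-bit double sits slightly above it. A quick sanity check in plain Python (a sketch, assuming FloatType and DoubleType map to IEEE 754 binary32/binary64, which they do in Spark):

    import struct
    from decimal import Decimal

    # Exact value of the 64-bit double closest to 216.495:
    # slightly ABOVE .495, so HALF_UP rounds up to 216.50.
    print(Decimal(216.495))  # 216.4950000000000045...

    # Exact value of the 32-bit float closest to 216.495:
    # slightly BELOW .495, so HALF_UP rounds down to 216.49.
    print(struct.unpack('f', struct.pack('f', 216.495))[0])  # 216.4949951171875

    With DoubleType, round therefore behaves as expected: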

    from pyspark.sql.functions import col, coalesce, when, round
    from pyspark.sql.types import StructType, StructField, DoubleType, FloatType, DecimalType
    
    data = [
        (216.495,)
    ]
    
    schema = StructType([
        StructField("col", DoubleType(), True)
    ])
    
    df = spark.createDataFrame(data, schema=schema)
    df.select(round(col("col"), 2).alias("col")).display()
    
    

    Or, keep FloatType and cast to DecimalType with scale 3 before rounding; the cast itself rounds the stored float (216.4949951…) back to exactly 216.495:

    data = [
        (216.495,)
    ]
    
    schema = StructType([
        StructField("col", FloatType(), True)
    ])
    
    df = spark.createDataFrame(data, schema=schema)
    df.select(round(col("col").cast(DecimalType(scale=3)), 2).alias("col")).display()
    

    Output:

    col
    216.5
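
    If you would rather not change the source schema, the same cast can be applied inline to the computed expression before rounding. A minimal sketch (the two-column DataFrame and the DecimalType(10, 3) precision/scale are illustrative assumptions, not from the original code):

    from pyspark.sql.functions import col, round
    from pyspark.sql.types import DecimalType

    df = spark.createDataFrame([(216.33, 0.165)], ["a", "b"])

    # Cast the intermediate double to a 3-decimal decimal before rounding,
    # so a sum landing a hair away from x.xx5 is snapped to exactly x.xx5.
    df.select(
        round((col("a") + col("b")).cast(DecimalType(10, 3)), 2).alias("result")
    ).display()  # expected: 216.5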