Is there a SQL function that calculates the positive value rate (the fraction of values greater than zero) of a column in a Spark / Hive table?
P.S. I'm using PySpark 2.4
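There is no built-in SQL function in Spark or Hive that returns the positive value rate of a column directly. You can compute it with a conditional aggregate instead: count the rows where the value is positive and divide by the total row count. Note that COUNT(*) also counts rows where the column is NULL; use COUNT(column_name) as the denominator if NULLs should be excluded from the rate.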
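# CASE without ELSE yields NULL for non-positive rows, and COUNT skips NULLs,
# so the numerator counts only rows where column_name > 0.
result = spark.sql("""
    SELECT
        COUNT(CASE WHEN column_name > 0 THEN 1 END) / COUNT(*) AS positive_rate
    FROM table
""")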