I need to remove trailing zeros from a decimal value. I tried regex and strip(), and they do remove the trailing zeros, but both operate on the string datatype. I want Col_2
to stay decimal, without changing its precision and scale. Can someone please suggest an alternative?
Sample Table :
Col_1 : string
Col_2 : decimal(18,2)
Col_1 | Col_2 |
---|---|
ABC | 2.00 |
DEF | 2.50 |
XMN | 0.00 |
RST | 1.28 |
Actual Result using regex:
Col_1 : string
Col_2 : string
Col_1 | Col_2 |
---|---|
ABC | 2 |
DEF | 2.5 |
XMN | |
RST | 1.28 |
Expected Result :
Col_1 | Col_2 |
---|---|
ABC | 2 |
DEF | 2.5 |
XMN | 0 |
RST | 1.28 |
The decimal(18,2) type always stores two digits after the decimal point. Whether the trailing zeros to the right of the decimal point are shown is purely a matter of formatting.
So, if you want Col_2 to remain a decimal and keep its precision and scale, store it as decimal(18,2) and format it however you like when displaying the data.
You can use the format_number function for that. Note that it returns a string, so apply it only at display time and leave the stored column as decimal:
import pyspark.sql.functions as F
df.withColumn("Col_2", F.expr("format_number(Col_2, '0.##')")).show()
#+-----+-----+
#|Col_1|Col_2|
#+-----+-----+
#|  ABC|    2|
#|  DEF|  2.5|
#|  XMN|    0|
#|  RST| 1.28|
#+-----+-----+
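To see the difference between the stored value and its displayed form outside Spark, here is a minimal sketch in plain Python using the standard decimal module. It is only an analogy: Python's Decimal is not Spark's decimal(18,2), and the helper name display is made up for this example.

```python
from decimal import Decimal


def display(value: Decimal) -> str:
    """Return the value as text without trailing fractional zeros (display only)."""
    # normalize() drops trailing zeros; the 'f' format avoids exponent
    # notation such as '1E+2' for values like 100.
    return format(value.normalize(), "f")


# Values as they would be stored, with two fractional digits each.
stored = [Decimal("2.00"), Decimal("2.50"), Decimal("0.00"), Decimal("1.28")]

# Formatting produces new strings for display.
shown = [display(v) for v in stored]
print(shown)

# The stored values are untouched and still carry two fractional digits.
print([str(v) for v in stored])
```

The formatted values are strings, while the original list still holds decimals with their scale intact. That is the same trade-off as in the Spark answer above: keep the column as decimal and format only when presenting it.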