I have a DataFrame in which some strings have a double quote (") at the start and at the end.
For example:
+-------------------------------+
|data |
+-------------------------------+
|"john belushi" |
|"john mnunjnj" |
|"nmnj tyhng" |
|"John b-e_lushi" |
|"john belushi's book" |
+-------------------------------+
Expected output:
+-------------------------------+
|data |
+-------------------------------+
|john belushi |
|john mnunjnj |
|nmnj tyhng |
|John b-e_lushi |
|john belushi's book |
+-------------------------------+
I am trying to remove only the double quotes (") from the strings. Can someone tell me how to do this in Scala?
Python provides ltrim and rtrim. Is there anything equivalent in Scala?
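Use the expr, substring and length functions: take the substring that starts at position 2
(SQL substring positions are 1-based) and has length length(data) - 2, which drops the first and last characters.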
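// toDF on a local collection needs the implicits of the active SparkSession (assumed here to be named `spark`)
import spark.implicits._
val df_d = List("\"john belushi\"", "\"John b-e_lushi\"", "\"john belushi's book\"")
  .toDF("data")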
Input:
+---------------------+
|data |
+---------------------+
|"john belushi" |
|"John b-e_lushi" |
|"john belushi's book"|
+---------------------+
Using expr, substring and length functions:
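import org.apache.spark.sql.functions.expr
// Start at position 2 and keep length(data) - 2 characters, i.e. everything except the first and last character
df_d.withColumn("data", expr("substring(data, 2, length(data) - 2)"))
  .show(false)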
Output:
+-------------------+
|data |
+-------------------+
|john belushi |
|John b-e_lushi |
|john belushi's book|
+-------------------+
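
Note that the substring approach removes the first and last character unconditionally, so a value without surrounding quotes would lose real characters. As a rough sketch of the same index arithmetic in plain Scala, with a guard so that only quoted values are changed, something like the following could work. The object name QuoteStripping and the helper stripOuterQuotes are hypothetical names chosen for illustration and are not part of the answer above; the sketch works on ordinary strings, not on a DataFrame column.

```scala
// Hypothetical helper, for illustration only: removes one leading and one
// trailing double quote, but only when the string has both.
object QuoteStripping {
  def stripOuterQuotes(s: String): String =
    if (s.length >= 2 && s.startsWith("\"") && s.endsWith("\""))
      s.substring(1, s.length - 1) // 0-based here, unlike SQL substring
    else
      s // leave unquoted values untouched

  def main(args: Array[String]): Unit = {
    val samples = List("\"john belushi\"", "\"John b-e_lushi\"", "john belushi's book")
    // The last sample has no surrounding quotes and is printed unchanged
    samples.map(stripOuterQuotes).foreach(println)
  }
}
```

If the column may mix quoted and unquoted values, one option that stays inside Spark is regexp_replace from org.apache.spark.sql.functions with a pattern anchored to the start and end of the string; whether that fits depends on the data.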