I know this is a very basic question, but is there a way to replace only the first occurrence of a character in a PySpark dataframe?
I have the value below in a dataframe column:
Gourav#Joshi#Karnataka#US#English
I only want to replace the first occurrence of # with a space.
Expected Output:
Gourav Joshi#Karnataka#US#English
Just use regexp_replace and capture the substring before the first # as $1. The replacement '$1 ' then puts that substring back, followed by a space in place of the #:
spark.sql("""
select col, regexp_replace(col,'^([^#]*)#','$1 ') col_new
from values ('Gourav#Joshi#Karnataka#US#English') as (col)
""").show(1,0)
+---------------------------------+---------------------------------+
|col                              |col_new                          |
+---------------------------------+---------------------------------+
|Gourav#Joshi#Karnataka#US#English|Gourav Joshi#Karnataka#US#English|
+---------------------------------+---------------------------------+
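To see why the pattern only touches the first #, here is a minimal pure-Python sketch of the same regex, runnable without Spark. The helper name replace_first_hash is hypothetical and only for illustration. Spark uses Java regex syntax, while this uses Python's re module; the pattern itself is the same in both.

```python
import re

# Same pattern as in the query above:
#   ^        anchor at the start of the string
#   ([^#]*)  capture everything up to (not including) the first '#'
#   #        match that first '#'
FIRST_HASH = re.compile(r"^([^#]*)#")


def replace_first_hash(value: str, replacement: str = " ") -> str:
    """Replace only the first '#' in value (hypothetical helper)."""
    # A function replacement is used so `replacement` is inserted literally,
    # without any backslash or group-reference interpretation.
    return FIRST_HASH.sub(lambda m: m.group(1) + replacement, value, count=1)


print(replace_first_hash("Gourav#Joshi#Karnataka#US#English"))
# Gourav Joshi#Karnataka#US#English
```

Because the pattern is anchored with ^, it can match at most once per value, so later # characters are left alone. With the DataFrame API, the same pattern should work through pyspark.sql.functions.regexp_replace, for example df.withColumn("col_new", F.regexp_replace("col", r"^([^#]*)#", "$1 ")), assuming a dataframe df with a string column named col.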