I'm trying to clean up my data in a Hive table. I need to replace some characters in a column, but I can't figure out how to remove multiple different characters at once using regexp_replace() in Hive SQL.
The following is straightforward and works as expected:
select regexp_replace('abc-de-ghi', '-','');
and outputs:
abcdeghi
But I don't know how to clean up a string that contains several different characters to remove:
select regexp_replace('abc-de/ghi@jkl:mn#op', <i-dont-know-what-goes-here>,'');
Can someone please help me with this?
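Use the pattern '[-/@:#]', a regex character class: list every character you want to remove inside the square brackets, and any one of them will match. Put the hyphen first (or last) so it is treated as a literal character rather than as a range operator: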
select regexp_replace('abc-de/ghi@jkl:mn#op','[-/@:#]','');
Result:
OK
abcdeghijklmnop
Time taken: 4.656 seconds, Fetched: 1 row(s)
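If the set of unwanted characters is open-ended, a negated character class may be easier to maintain than listing each one. The sketch below assumes you want to keep only ASCII letters and digits; that keep-list is my assumption, not something stated in the question, so adjust it to your data. The second query only shows that the hyphen can also sit at the end of the class.

```sql
-- Negated class: remove everything that is NOT a letter or digit
-- (assumes only ASCII letters and digits should be kept).
select regexp_replace('abc-de/ghi@jkl:mn#op', '[^a-zA-Z0-9]', '');
-- abcdeghijklmnop

-- Same explicit set as above, with the hyphen placed last instead of first.
select regexp_replace('abc-de/ghi@jkl:mn#op', '[/@:#-]', '');
-- abcdeghijklmnop
```

Note that the negated form also strips spaces and any non-ASCII characters, so use the explicit list when only specific characters should go.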