
Removing whitespace after a specific symbol ";"


I have a question about removing whitespace inside the character strings of a data frame column. This is my data frame column:

head(data$HO)
[1] "Lidar; Wind field; Temperature; Aerosol; Fabry-Perot etalon"                             
[2] "Compressive ghost imaging; Guided filter; Single-pixel imaging"    

This question differs from the linked one because I want to remove only the spaces that follow the symbol ";", so the output should look like this:

head(data$HO)
[1] "Lidar;Wind field;Temperature;Aerosol;Fabry-Perot etalon"                             
[2] "Compressive ghost imaging;Guided filter;Single-pixel imaging"    

I have tried

data$HO <- gsub("\\;s", ";",data$HO)

but it doesn't work.

Any suggestion?


Solution

  • You may use the ;\s+ pattern and replace each match with ;:

    > x <- c("Lidar; Wind field; Temperature; Aerosol; Fabry-Perot etalon", "Compressive ghost imaging; Guided filter; Single-pixel imaging")
    > gsub(";\\s+", ";", x)
    [1] "Lidar;Wind field;Temperature;Aerosol;Fabry-Perot etalon"     
    [2] "Compressive ghost imaging;Guided filter;Single-pixel imaging"
    

    Pattern details:

    • ; - a semi-colon
    • \s+ - one or more whitespace chars.
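
    To make the difference from the attempt in the question concrete, here is a minimal sketch. The sample string x (with a double space and a tab added) is an assumption for illustration and is not taken from the original data. The attempted pattern "\\;s" looks for a semicolon followed by a literal letter s, because the backslash sits before the ; rather than before the s, so it matches nothing here.

    ```r
    # Hypothetical sample: one space, two spaces, and a tab after the semicolons
    x <- "Lidar; Wind field;  Temperature;\tAerosol"

    # Pattern from the question: escaped ";" followed by a literal "s".
    # No ";s" occurs in x, so the string comes back unchanged.
    attempt <- gsub("\\;s", ";", x)

    # Corrected pattern: ";" followed by one or more whitespace characters.
    fixed <- gsub(";\\s+", ";", x)

    print(attempt)
    print(fixed)
    ```

    Spaces that do not follow a semicolon, such as the one in "Wind field", are left alone because the match must start at a ;.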


    Some more variations of the solution:

    gsub("(*UCP);\\K\\s+", "", x, perl=TRUE)
    gsub(";[[:space:]]+", ";", x)
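
    As a rough sketch of how these variants relate, the snippet below applies all three to a small data frame and checks that they agree. The data frame contents are made up for illustration; only the column name HO comes from the question. In the perl = TRUE variant, \K discards the already matched ; from the match, so only the whitespace is replaced (hence the empty replacement), and (*UCP) asks PCRE to treat \s as Unicode-aware, which assumes a PCRE build with Unicode property support. [[:space:]] is the POSIX character class for whitespace.

    ```r
    # Hypothetical data frame standing in for the one in the question
    data <- data.frame(
      HO = c("Lidar; Wind field; Temperature",
             "Guided filter; Single-pixel imaging"),
      stringsAsFactors = FALSE
    )

    # Default regex engine: match ";" plus whitespace, put ";" back
    base_res <- gsub(";\\s+", ";", data$HO)

    # PCRE: \K drops ";" from the match, so only the whitespace is removed
    perl_res <- gsub("(*UCP);\\K\\s+", "", data$HO, perl = TRUE)

    # POSIX character class instead of \s
    posix_res <- gsub(";[[:space:]]+", ";", data$HO)

    # All three should give the same result on plain ASCII input like this
    stopifnot(identical(base_res, perl_res), identical(base_res, posix_res))

    # Write the cleaned strings back to the column
    data$HO <- base_res
    ```

    For plain ASCII spaces any of the three should do; the differences mainly matter when the text may contain non-ASCII whitespace.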