
R str_replace / remove excess zeros in a string after the decimal separator (zeros following dot[1-9]): gsub, regex, regular expression


I have a vector of strings containing numbers. Within these strings, I would like to remove all excess zeros behind the decimal separator. So I tried mutate_all(funs(str_replace(., ".00", ""))).

This works for the first number in my vector: v <- c("bla 500.00", "bla 1.20", "bla 1.10", "bla 2.34").

For the rest, I would rather not hard-code mutate_all(funs(str_replace(., ".10", ".1")))%>%mutate_all(funs(str_replace(., ".20", ".2")))%>% ..., but use some kind of smart regex that does the job automatically: remove every zero that comes after ".non-zero-integer" (a dot followed by a non-zero digit), while keeping the ".non-zero-integer" itself unchanged.
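
A side note on the attempt above: in a regular expression an unescaped dot matches any character, so ".00" is not limited to a literal decimal point. The short sketch below uses stringr's fixed() and an escaped dot to show the difference on the first element of the vector; the variable names are only illustrative.

```r
library(stringr)

x <- "bla 500.00"

# Unescaped dot: "." matches the "5", so the first match is "500"
unescaped <- str_replace(x, ".00", "")

# fixed() treats the pattern as a literal string, so only ".00" is removed
literal <- str_replace(x, fixed(".00"), "")

# Escaping the dot (and anchoring with \b) has the same effect here
escaped <- str_replace(x, "\\.00\\b", "")

print(c(unescaped, literal, escaped))
# [1] "bla .00" "bla 500" "bla 500"
```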


Solution

  • You could try to find:

    \b(\d+)(?:\.0+|(\.\d+?))0*\b
    

    And replace with \1\2. See an online demo


    • \b - Word-boundary;
    • (\d+) - 1st capture group: 1+ digits (the integer part before the dot);
    • (?: - Open a non-capture group;
      • \.0+ - A literal dot followed by 1+ zeros;
      • | - Or;
      • (\.\d+?) - A nested 2nd capture group to match a literal dot followed by 1+ (lazy) digits;
      • ) - Close non-capture group;
    • 0* - 0+ (greedy) zeros;
    • \b - Trailing word-boundary.
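
To make the breakdown above more concrete, here is a small sketch that uses stringr's str_match() to show what each capture group holds for a few inputs. The sample strings come from the question; the names pattern and m are just placeholders.

```r
library(stringr)

pattern <- "\\b(\\d+)(?:\\.0+|(\\.\\d+?))0*\\b"

# Column 1 is the full match, column 2 is group 1, column 3 is group 2
m <- str_match(c("bla 500.00", "bla 1.20", "bla 2.34"), pattern)
print(m)

# "500.00": the \.0+ branch matches, so group 2 does not participate (NA)
# "1.20":   group 2 holds ".2" and the trailing 0* consumes the final zero
# "2.34":   the lazy \d+? has to expand to ".34" before \b can match
```

When group 2 does not participate, as in the first row, the replacement \1\2 only puts back the integer part, which is how "500.00" ends up as "500".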

    library(stringr)
    v <- c("bla 500.00", "bla 1.20", "bla 1.10", "bla 2.34", "bla 2.340003", "bla 1.032", "bla 1.10 bla 2.00")
    v <- str_replace_all(v, "\\b(\\d+)(?:\\.0+|(\\.\\d+?))0*\\b", "\\1\\2")
    v
    

    Prints: "bla 500", "bla 1.2", "bla 1.1", "bla 2.34", "bla 2.340003", "bla 1.032", "bla 1.1 bla 2"
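
Since the question mentions gsub and applies the replacement to every column, the following is a minimal base R sketch of the same idea. It assumes perl = TRUE so the pattern is handled by the PCRE engine; the helper name strip_trailing_zeros and the example data frame df are hypothetical and not part of the original answer.

```r
# Hypothetical helper wrapping the pattern from the answer in base R's gsub()
strip_trailing_zeros <- function(x) {
  gsub("\\b(\\d+)(?:\\.0+|(\\.\\d+?))0*\\b", "\\1\\2", x, perl = TRUE)
}

# Illustrative data frame with character columns
df <- data.frame(
  a = c("bla 500.00", "bla 1.20"),
  b = c("bla 1.10", "bla 2.34"),
  stringsAsFactors = FALSE
)

# Apply the helper to every column while keeping the data frame structure
df[] <- lapply(df, strip_trailing_zeros)
print(df)
```

With dplyr loaded, mutate(df, across(everything(), strip_trailing_zeros)) should give the same result and avoids the mutate_all(funs(...)) form, which recent dplyr versions deprecate.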