I am cleaning up a dataset with a character variable like this:
df <- c("2015 000808", "2013 000041", "2015 000005", "2015 301585", "2015 311585", "2014 380096", "2013 100041")
I would like to achieve this result, where the leading zeros of the second number are removed and the two numbers are pasted together:
"2015808"
"201341"
"20155"
"2015301585"
"2015311585"
"2014380096"
"2013100041"
I am stuck trying to find the best way to remove the 0s that occur before the number in the second part of the string. I have looked at gsub and substring, but I am a bit confused about how to remove a pattern of zeros based on their position as well as on conditions. Something along the lines of "remove one or more zeros only if they precede number 1-9 and are in position 7-11".
While akrun's approach is the one that should be used, here is a stringr composition:
1. word(df, 1) takes the left part of each string.
2. word(df, -1) takes the right part.
2a. str_remove_all with the regex ^0+ removes the leading zeros from that right part.
3. str_c combines both parts:
library(stringr)
str_c(word(df,1), str_remove_all(word(df, -1), '^0+'))
[1] "2015808" "201341" "20155" "2015301585" "2015311585" "2014380096" "2013100041"
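Since the question mentions gsub, here is a minimal base R sketch of the same idea for comparison. It is not necessarily the approach referred to above, and it assumes every string has exactly one space separating the two parts; the name result is only illustrative. The pattern matches the space together with any zeros that directly follow it, so the position of the zeros never has to be specified.

```r
df <- c("2015 000808", "2013 000041", "2015 000005", "2015 301585",
        "2015 311585", "2014 380096", "2013 100041")

# " 0*" matches the separating space plus zero or more zeros right after it.
# sub() replaces only the first match in each string, so zeros that appear
# later in the second number (e.g. the "00" in "380096") are left alone.
result <- sub(" 0*", "", df)
print(result)
```

Because the zeros are anchored to the space, no condition such as "only if followed by 1-9" is needed: the match stops at the first non-zero digit.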