Create new values based on length of a string in R?


I have a problem where I need to, ideally, create new values and new rows based on the length of a string.

This is my source data:

NumericCode1=c("12345","1234")
NumericCode2=c("0123.45","123.4")
AlphaCode=c("","")
df=data.frame(NumericCode1,NumericCode2,AlphaCode)

What I'd like to do is process this data using this logic:

If the value in either NumericCode1 or NumericCode2 contains more than 5 digits (counting digits only, not the decimal point), then I'd like to populate AlphaCode with AA:BB:CC for that row. So the df would end up looking like this:

NumericCode1=c("12345","1234")
NumericCode2=c("0123.45","123.4")
AlphaCode=c("AA:BB:CC","")
df=data.frame(NumericCode1,NumericCode2,AlphaCode)

Then I could use this code to create a separate record for each code, which would give me my desired output.

library(dplyr)
library(tidyr)
df %>% 
  separate_rows(AlphaCode, sep=":")

  NumericCode1 NumericCode2 AlphaCode
1        12345      0123.45        AA
2        12345      0123.45        BB
3        12345      0123.45        CC
4         1234        123.4          

My problem is I'm stuck at the first step. I can count the characters in the strings using nchar or str_length, but I cannot figure out how to "count if > 5 then do this".

Any help is much appreciated. Thanks!


Solution

  • Using stringr::str_count with the regex \\d, we can count digits only and ignore the decimal point:

    library(dplyr)
    library(stringr)

    # Flag rows where either code contains more than 5 digits ('\\d' matches one digit)
    df %>%
      mutate(Cond = if_else(str_count(NumericCode1, '\\d') > 5 |
                              str_count(NumericCode2, '\\d') > 5,
                            'AA:BB:CC', ''))
    
      NumericCode1 NumericCode2     Cond
    1        12345      0123.45 AA:BB:CC
    2         1234        123.4
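
To tie this back to the question, here is a minimal sketch of how the condition above might be combined with the separate_rows step the asker already has. It assumes the flag is written straight into AlphaCode instead of a new Cond column. The name `result` and the explicit `stringsAsFactors = FALSE` are illustrative choices, not part of the original answer.

```r
library(dplyr)
library(stringr)
library(tidyr)

# Source data from the question; stringsAsFactors = FALSE keeps the
# columns as character on older R versions (an assumption for safety).
df <- data.frame(
  NumericCode1 = c("12345", "1234"),
  NumericCode2 = c("0123.45", "123.4"),
  AlphaCode    = c("", ""),
  stringsAsFactors = FALSE
)

result <- df %>%
  mutate(
    # Fill AlphaCode only where either code has more than 5 digits;
    # otherwise keep the existing (empty) value.
    AlphaCode = if_else(
      str_count(NumericCode1, "\\d") > 5 | str_count(NumericCode2, "\\d") > 5,
      "AA:BB:CC",
      AlphaCode
    )
  ) %>%
  # Split the colon-separated codes into one row per code.
  separate_rows(AlphaCode, sep = ":")

print(result)
```

With the sample data, only the first row qualifies ("0123.45" contains six digits), so it should expand into the AA, BB and CC rows shown in the question. If stringr is not available, `nchar(gsub("\\D", "", x))` is a base R way to count digits only.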