I have a problem where I need to, ideally, create new values and new rows based on the length of a string.
This is my source data:
NumericCode1=c("12345","1234")
NumericCode2=c("0123.45","123.4")
AlphaCode=c("","")
df=data.frame(NumericCode1,NumericCode2,AlphaCode)
What I'd like to do is process this data using this logic:
If the value in either NumericCode1 or NumericCode2 is longer than 5 digits (counting digits only, not the decimal point), then I'd like to populate AlphaCode with AA:BB:CC for that row. So the df would end up looking like this:
NumericCode1=c("12345","1234")
NumericCode2=c("0123.45","123.4")
AlphaCode=c("AA:BB:CC","")
df=data.frame(NumericCode1,NumericCode2,AlphaCode)
Then I could use this code to create a separate record for each value, which would give me my desired output:
df %>%
separate_rows(AlphaCode, sep=":")
NumericCode1 NumericCode2 AlphaCode
1 12345 0123.45 AA
2 12345 0123.45 BB
3 12345 0123.45 CC
4 1234 123.4
My problem is that I'm stuck at the first step. I can count the characters in the strings using nchar or str_length, but I cannot figure out how to "count if > 5 then do this".
Any help is much appreciated. Thanks!
Using stringr::str_count and the regex \\d, we can count digits only:
library(dplyr)
library(stringr)
df %>% mutate(Cond=if_else(str_count(NumericCode1,'\\d')>5|str_count(NumericCode2,'\\d')>5 ,
'AA:BB:CC',''))
NumericCode1 NumericCode2 Cond
1 12345 0123.45 AA:BB:CC
2 1234 123.4
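To get all the way to the output asked for in the question, the same condition can write into AlphaCode directly and then be piped into separate_rows. This is only a sketch under a few assumptions: the column is kept as character (stringsAsFactors = FALSE is set explicitly, since if_else is strict about types), rows that do not match keep their existing AlphaCode value, and the name result is just a placeholder for illustration.

```r
library(dplyr)
library(stringr)
library(tidyr)

# Source data from the question; keep the columns as character, not factor
df <- data.frame(
  NumericCode1 = c("12345", "1234"),
  NumericCode2 = c("0123.45", "123.4"),
  AlphaCode    = c("", ""),
  stringsAsFactors = FALSE
)

result <- df %>%
  mutate(
    # str_count(x, "\\d") counts digit characters only, so "0123.45" gives 6
    AlphaCode = if_else(
      str_count(NumericCode1, "\\d") > 5 | str_count(NumericCode2, "\\d") > 5,
      "AA:BB:CC",
      AlphaCode  # assumption: non-matching rows keep whatever was there
    )
  ) %>%
  # one row per ":"-separated value in AlphaCode
  separate_rows(AlphaCode, sep = ":")

print(result)
```

With the sample data, only the first row qualifies (NumericCode2 has six digits), so this should reproduce the four-row layout shown in the question. Note that nchar("0123.45") would return 7 because it also counts the decimal point, which is why the digit-only count is used here.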