
R split a column in this format


I need to split this column into two columns:

  • 5/5/2020Tom Tesla

The desired outcome is:

  • Col1 Col2
  • 5/5/2020 Tom Tesla

I have tried strAny but need help, because Col1 is not a fixed width: the length of the date field varies, since the day of the month can be 1 or 2 characters. Any suggestions on how to do this?


Solution

  • We can use separate from tidyr with regex lookarounds to split at the position between a digit and a letter (upper or lower case)

    library(tidyr)
    separate(df1, 'col1', into = c('date', 'other'), sep="(?<=[0-9])(?=[A-Za-z])")
    #     date             other
    #1  1/1/2000            yogurt
    #2  1/1/2000      toilet paper
    #3  2/1/2000              soda
    #4 11/1/2000            bagels
    #5 12/1/2000            fruits
    #6 13/1/2000 laundry detergent
    

    Or use base R with strsplit, which here returns a character matrix with one row per input string

    do.call(rbind, strsplit(as.character(df1$col1),
          "(?<=[0-9])(?=[A-Za-z])", perl = TRUE))
    

    data

    df1 <- structure(list(col1 = c("1/1/2000yogurt", "1/1/2000toilet paper", 
    "2/1/2000soda", "11/1/2000bagels", "12/1/2000fruits", "13/1/2000laundry detergent"
    )), class = "data.frame", row.names = c(NA, -6L))
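
As a rough sketch of how the base R approach above might be applied to the input from the question, the following builds a hypothetical data frame df2 with a column col1 (both names, and the second row, are assumptions made for illustration) and assembles the pieces into the Col1 and Col2 columns the question asks for. It assumes every value is a date that ends in a digit, immediately followed by text that starts with a letter.

```r
# Hypothetical input mirroring the question; the second row is invented
# to show that a two-digit day is handled the same way.
df2 <- data.frame(col1 = c("5/5/2020Tom Tesla", "12/5/2020Ann Lee"),
                  stringsAsFactors = FALSE)

# Split at the zero-width position preceded by a digit and followed by a letter.
parts <- strsplit(df2$col1, "(?<=[0-9])(?=[A-Za-z])", perl = TRUE)

# Stack the pieces into a character matrix, then convert to a data frame.
out <- as.data.frame(do.call(rbind, parts), stringsAsFactors = FALSE)
names(out) <- c("Col1", "Col2")

print(out)
```

Because the split point is defined by the digit-to-letter boundary rather than by a character position, the varying width of the date does not matter. Col1 is still a character column at this point; converting it with as.Date would need a format string that matches the actual day/month order of the data.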