
How to elegantly split an array of strings into list elements by substring?


How can I elegantly split an array of strings into subgroups based on their first character?

Sample data:

c("1_autoa", "1_autob", "1_autoc","2_bier", "3_hundx", "3_hundy")

Desired Output:

[[1]]
[1] "1_autoa" "1_autob" "1_autoc"

[[2]]
[1] "2_bier"

[[3]]
[1] "3_hundx" "3_hundy"

list(
  c("1_autoa", "1_autob", "1_autoc"), c("2_bier"), c("3_hundx", "3_hundy"))

What I tried (a working example, but it seems unnecessarily long):

library(dplyr)
library(purrr)
library(magrittr)
library(tidyr)  # spread() comes from tidyr, not dplyr
data <- data.frame(
  id = 1:6, 
  name = c("1_autoa", "1_autob", "1_autoc", "2_bier", "3_hundx", "3_hundy")
)
data$start <- substr(x = data$name, start = 1, stop = 1)
spread(data, start, name) %>% 
  apply(MARGIN = 2, list) %>% 
  lapply(FUN = function(x) x[[1]][!is.na(x[[1]])])

Solution

  • Simply

    split(x, gsub('\\D+', '', x))
    

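    where gsub('\\D+', '', x) strips every non-digit character, leaving "1", "2" and "3" as the grouping keys (the result is a list named by those keys), and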

    x <- c("1_autoa", "1_autob", "1_autoc","2_bier", "3_hundx", "3_hundy")
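    The gsub() version above assumes the only digits in each string are the leading ones. As a minimal sketch of a variant, assuming the underscore is always the delimiter, you could key on everything before the first "_" and drop the names to match the desired output exactly (the names key and groups are just illustrative):

```r
x <- c("1_autoa", "1_autob", "1_autoc", "2_bier", "3_hundx", "3_hundy")

# Grouping key: everything before the first underscore
# (assumes "_" always separates the prefix from the rest).
key <- sub("_.*$", "", x)

# split() returns a list named by the keys "1", "2", "3";
# unname() drops those names so the result matches the desired output.
groups <- unname(split(x, key))

print(groups)
```

    Note that split() orders the groups by the sorted key values as character strings, so with multi-digit prefixes "10" would come before "2".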