
How can I give sequential names to items in a list?


I have a character string that I have split into a list of smaller strings using strsplit. For example:

> full.seq <- "FZpcgK3VdAQzEFZpcAVdV8QM8ZpsEFZpacgGKi3VdVSQzEFZpcgGKAVdVRpEFKGIZpg13"
> full.seq
[1] "FZpcgK3VdAQzEFZpcAVdV8QM8ZpsEFZpacgGKi3VdVSQzEFZpcgGKAVdVRpEFKGIZpg13"
> sequences <- strsplit(full.seq, "cg")
> sequences
[[1]]
[1] "FZp"                          "K3VdAQzEFZpcAVdV8QM8ZpsEFZpa" "GKi3VdVSQzEFZp"              
[4] "GKAVdVRpEFKGIZpg13"  

I would like to give each of these new strings a unique, sequential name that I can still use to identify that they were from the same original string (for a later analysis I will do on these strings). For example, "ID.seq1", "ID.seq2", "ID.seq3" etc. I have tried doing this manually but receive this error:

> names(sequences) <- c("ID.seq1", "ID.seq2", "ID.seq3", "ID.seq4")
Error in names(sequences) <- c("ID.seq1", "ID.seq2", "ID.seq3", "ID.seq4") : 
  'names' attribute [4] must be the same length as the vector [1]

I would also like an automated way to do this, since I will need to label up to 30 new strings from a number of original strings. Any suggestions?


Solution

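  • strsplit returns a list with one element per input string, so here sequences is a list of length 1 whose single element is the character vector of four pieces. That is why assigning four names to it fails. If you want a character vector, subset the list with [[1]]; after that, you can assign names to the vector directly.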

    full.seq <- "FZpcgK3VdAQzEFZpcAVdV8QM8ZpsEFZpacgGKi3VdVSQzEFZpcgGKAVdVRpEFKGIZpg13"
    sequences <- strsplit(full.seq, "cg")[[1]]
    names(sequences) <- paste0("ID.seq", seq_along(sequences))
    sequences
    
             ID.seq1                        ID.seq2 
               "FZp" "K3VdAQzEFZpcAVdV8QM8ZpsEFZpa" 
             ID.seq3                        ID.seq4 
    "GKi3VdVSQzEFZp"           "GKAVdVRpEFKGIZpg13"