
Concatenating rows or columns of a dataframe


Working with R:

I took multiple strings of letters:

Orig1 - ABCDE

Orig2 - FGHIJ

Orig3 - KLMNO

I split those strings of letters using strsplit:

Orig1 - A B C D E

Orig2 - F G H I J

Orig3 - K L M N O

I then put the letters into a dataframe, with each string on its own row and each successive letter in its own column:

RowName   V1 V2 V3 V4 V5

Orig1     A  B  C  D  E

Orig2     F  G  H  I  J

Orig3     K  L  M  N  O
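
For anyone reproducing the setup, the split-and-tabulate steps above can be sketched roughly as follows. This is a minimal sketch, not necessarily how the original dataframe was built; the names orig, letters_list and df_orig are made up for illustration, and the row names are stored as actual rownames rather than in a RowName column.

```r
# Hypothetical input: a named character vector of the original strings
orig <- c(Orig1 = "ABCDE", Orig2 = "FGHIJ", Orig3 = "KLMNO")

# strsplit with an empty pattern splits each string into single characters
letters_list <- strsplit(orig, "")

# Stack the per-string character vectors as rows of a matrix,
# then convert to a dataframe (columns default to V1..V5)
df_orig <- as.data.frame(do.call(rbind, letters_list),
                         stringsAsFactors = FALSE)

print(df_orig)
```

This assumes all strings have the same length; with unequal lengths, rbind would recycle values and warn.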

I manipulated these strings of letters to produce multiple altered strings, based on various analyses of the originals:

RowName   V1 V2 V3 V4 V5

Altered1  A  G  H  N  E

Altered2  F  B  C  I  O

Altered3  K  L  M  D  J

I can't figure out how to collapse the altered strings back out of the dataframe. I need to convert the result into an exportable .fasta file, with the rownames as the sequence names.

paste didn't work when applied directly to the dataframe, so I tried a bit of code from another thread on a similar subject:

ldf = lapply(as.list(1:dim(df)[1]), function(x) df[x[1],])

This put each row into its own list element, which I could then use paste on, but I couldn't work out how to export the output.

Any help would be appreciated.


Solution

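  • I am not sure if this is what you want, but calling paste through do.call on every column except RowName collapses each row into a single string: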

    > do.call(paste,c(df[-1],sep = ""))
    [1] "AGHNE" "FBCIO" "KLMDJ"
    

    Data

    df <- structure(list(RowName = c("Altered1", "Altered2", "Altered3"
    ), V1 = c("A", "F", "K"), V2 = c("G", "B", "L"), V3 = c("H", 
    "C", "M"), V4 = c("N", "I", "D"), V5 = c("E", "O", "J")), class = "data.frame", row.names = c(NA, 
    -3L))
    
    > df
       RowName V1 V2 V3 V4 V5
    1 Altered1  A  G  H  N  E
    2 Altered2  F  B  C  I  O
    3 Altered3  K  L  M  D  J
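
To get from the collapsed strings to the .fasta file the question asks for, one possible approach is to interleave ">name" header lines with the sequences and write them out with writeLines. This is only a sketch under assumptions: the file name altered.fasta and the variables seqs, fasta_lines and out_file are made up for illustration, and each sequence is written on a single line with no line wrapping.

```r
# Same data as above, rebuilt here so the example is self-contained
df <- data.frame(
  RowName = c("Altered1", "Altered2", "Altered3"),
  V1 = c("A", "F", "K"), V2 = c("G", "B", "L"), V3 = c("H", "C", "M"),
  V4 = c("N", "I", "D"), V5 = c("E", "O", "J"),
  stringsAsFactors = FALSE
)

# Collapse every column except RowName into one string per row
seqs <- do.call(paste, c(df[-1], sep = ""))

# rbind gives a 2-row matrix (headers on top, sequences below);
# as.vector reads it column by column, so headers and sequences alternate
fasta_lines <- as.vector(rbind(paste0(">", df$RowName), seqs))

# Hypothetical output location; a temp directory is used to avoid clutter
out_file <- file.path(tempdir(), "altered.fasta")
writeLines(fasta_lines, out_file)
```

If the sequence names live in rownames(df) rather than in a RowName column, rownames(df) can be used in place of df$RowName and df in place of df[-1].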