Data transformation from columns to rows in R


I have a data frame that I build like this:

1954 <- c(a,b,c,d)#names of a person
X2 <- c(5,6,1,2)#their score
1955 <- c(e,f,g,h)
X3 <- c(2,4,6,9)
1956 <- c(j,k,l,m)
X4 <- c(1,3,6,8)

Girls <- data.frame(1954,X2,1955,X3,1956,X4)

The Girls data frame looks something like this:

1954 X2 1955 X3 1956 X4 ... n
a    5  e    2  j    1  ... n
b    6  f    4  k    3  ... n
c    1  g    6  l    6  ... n
d    2  h    9  m    8  ... n

I would like the data frame to look like this instead, with the year as a new column:

Name  score  year
a     5      1954
b     6      1954
c     1      1954
d     2      1954
e     2      1955
f     4      1955
g     6      1955
h     9      1955
j     1      1956
k     3      1956
l     6      1956
m     8      1956
.     .      .
.     .      .
n     n      n

This is for a school project and I am struggling to transform the data. Could someone help me out with this?


Solution

  • With no additional packages, you could do:

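    # Stack the name columns (four-digit names) into values + ind (the year),
    # keep only the stacked values of the score columns (X*), then rename.
    setNames(
      cbind(
        stack(Girls[, grep("\\d{4}", names(Girls))]),
        stack(Girls[, grep("^X", names(Girls))])[, 1, drop = FALSE]
      ),
      c("Name", "Year", "Score")
    )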
    

    Output:

       Name Year Score
    1     a 1954     5
    2     b 1954     6
    3     c 1954     1
    4     d 1954     2
    5     e 1955     2
    6     f 1955     4
    7     g 1955     6
    8     h 1955     9
    9     j 1956     1
    10    k 1956     3
    11    l 1956     6
    12    m 1956     8
    

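    Note that this required some changes to the code you used to create the example. A name that starts with a digit, such as 1954, is not a syntactically valid R name, so it has to be wrapped in backticks, and the letters have to be quoted so that they are character strings rather than references to objects. In the data.frame() call, check.names = FALSE keeps the numeric column names from being rewritten (to X1954 and so on), and stringsAsFactors = FALSE keeps the names as character vectors, which stack() needs.

    The corrected code would be: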

    `1954` <- c("a","b","c","d")
    X2 <- c(5,6,1,2)
    `1955` <- c("e","f","g","h")
    X3 <- c(2,4,6,9)
    `1956` <- c("j","k","l","m")
    X4 <- c(1,3,6,8)
    
    Girls <- data.frame(`1954`,X2,`1955`,X3,`1956`,X4, 
                        stringsAsFactors = FALSE, check.names = FALSE)
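
If you prefer to see the pairing spelled out, here is a minimal base R sketch of the same reshaping. It is an alternative, not the approach above, and it assumes the columns strictly alternate name column, score column, with each name column titled by its year. The helper names (name_cols, score_cols, pieces, long) are only illustrative.

```r
# Rebuild the corrected example data.
Girls <- data.frame(
  `1954` = c("a", "b", "c", "d"), X2 = c(5, 6, 1, 2),
  `1955` = c("e", "f", "g", "h"), X3 = c(2, 4, 6, 9),
  `1956` = c("j", "k", "l", "m"), X4 = c(1, 3, 6, 8),
  stringsAsFactors = FALSE, check.names = FALSE
)

# Assumption: columns alternate name, score, name, score, ...
name_cols  <- seq(1, ncol(Girls), by = 2)
score_cols <- name_cols + 1

# Build one small data frame per (name, score) pair; the year comes
# from the name column's title and is recycled down the rows.
pieces <- lapply(seq_along(name_cols), function(i) {
  data.frame(
    Name  = Girls[[name_cols[i]]],
    Score = Girls[[score_cols[i]]],
    Year  = as.integer(names(Girls)[name_cols[i]]),
    stringsAsFactors = FALSE
  )
})

# Stack the pieces on top of each other.
long <- do.call(rbind, pieces)
long
```

Compared with the stack() version, this keeps the columns in the order you asked for (Name, Score, Year) and stores Year as an integer, whereas the ind column that stack() returns is a factor.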