Search code examples
rstackreshapereshape2melt

Stacking data file in R


I am looking to stack a datafile that I have in R and am struggling to find a solution to my problem. I have 49 variables, named i1.1, i1.2, 1.3, i1.4, i1.5, i1.6, i1.7 ... i7.6, i7.7. And I would like to stack all of the i1s together with a new variable specifying the last part of the variable name that it was taken from e.g if the value was taken from i1.3 the new variable would contain the value 3. The other thing I have in my datset is that I wish to also keep an id variable in the dataset so that I am able to merge some additional data in at a later stage. DOes anyone have a code or an idea, how I could create something to do this? I am really stuck. Any suggestions welcome. Thank you in advance.

    > #Call datafile
> testfile <-read_xlsx("C:/Users/rawlingsD/Documents/R Pre Analysis/Test stacking file.xlsx")
> head(testfile)
# A tibble: 3 x 50
  Respondent_ID  i1.1  i1.2  i1.3  i1.4  i1.5  i1.6  i1.7  i2.1  i2.2  i2.3  i2.4  i2.5  i2.6  i2.7  i3.1
          <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1           12.    3.    1.    7.    1.    7.    1.    5.    1.    7.    1.    1.    1.    9.    1.    8.
2           13.    2.    2.    2.    2.    2.    2.    2.    2.    2.    2.    2.    2.    2.    2.    2.
3           14.    3.    3.    3.    3.    3.    3.    3.    3.    3.    3.    3.    3.    3.    3.    3.
# ... with 34 more variables: i3.2 <dbl>, i3.3 <dbl>, i3.4 <dbl>, i3.5 <dbl>, i3.6 <dbl>, i3.7 <dbl>,
#   i4.1 <dbl>, i4.2 <dbl>, i4.3 <dbl>, i4.4 <dbl>, i4.5 <dbl>, i4.6 <dbl>, i4.7 <dbl>, i5.1 <dbl>,
#   i5.2 <dbl>, i5.3 <dbl>, i5.4 <dbl>, i5.5 <dbl>, i5.6 <dbl>, i5.7 <dbl>, i6.1 <dbl>, i6.2 <dbl>,
#   i6.3 <dbl>, i6.4 <dbl>, i6.5 <dbl>, i6.6 <dbl>, i6.7 <dbl>, i7.1 <dbl>, i7.2 <dbl>, i7.3 <dbl>,
#   i7.4 <dbl>, i7.5 <dbl>, i7.6 <dbl>, i7.7 <dbl>

ANd this is the dataset I would like at the end with i2, i3 etc next to i1:

    > head(testfile_complete)
# A tibble: 6 x 3
  Respondent_ID image    i1
          <dbl> <dbl> <dbl>
1           12.    1.    3.
2           13.    1.    2.
3           14.    1.    3.
4           12.    2.    1.
5           13.    2.    2.
6           14.    2.    3.

Solution

  • If you need the columns, then you do: I will only do for i1 and i2

    reshape(dat1,varying = t(matrix(names(dat1[-1]),7)),idvar = "Respondent_ID",dir="long")
         Respondent_ID time i1.1 i2.1
    12.1            12    1    3    1
    13.1            13    1    2    2
    14.1            14    1    3    3
    12.2            12    2    1    7
    13.2            13    2    2    2
    14.2            14    2    3    3
    12.3            12    3    7    1
    13.3            13    3    2    2
    14.3            14    3    3    3
    12.4            12    4    1    1
    13.4            13    4    2    2
    14.4            14    4    3    3
    12.5            12    5    7    1
    13.5            13    5    2    2
    14.5            14    5    3    3
    12.6            12    6    1    9
    13.6            13    6    2    2
    14.6            14    6    3    3
    12.7            12    7    5    1
    13.7            13    7    2    2
    14.7            14    7    3    3