reshaping a large dataframe in R


I have a dataframe of 500 rows and 4004 columns that I would like to reshape to a dataframe of 500500 rows and 4 columns. That is, from this dataframe:

    V1   V2   V3   V4   ...  V4001 V4002 V4003 V4004
    1    2    3    4    ...  4001  4002  4003  4004
    1    2    3    4    ...  4001  4002  4003  4004
    1    2    3    4    ...  4001  4002  4003  4004
    ...  ...  ...  ...  ...  ...   ...   ...   ...
    1    2    3    4    ...  4001  4002  4003  4004

I would like:

    V1   V2   V3   V4
    1    2    3    4
    1    2    3    4
    1    2    3    4
    1    2    3    4
    ...  ...  ...  ...
    4001 4002 4003 4004
    4001 4002 4003 4004
    4001 4002 4003 4004
    ...  ...  ...  ...
    4001 4002 4003 4004

I already tried y=matrix(as.matrix(dataGaus[[1]]),500500,4) (where dataGaus is my dataframe), but it doesn't give the expected result. I also tried reshape, but I could not get it to reproduce the result (and I have been through a lot of posts on StackOverflow and elsewhere on the net). In Python, this can be done with a single command, numpy.array(dataGaus).reshape(-1,4). For various reasons I am doing my analysis in R, and I would like to know whether there is a function that does the same thing as numpy's reshape(-1,4).

Thanks in advance, best


Solution

  • If anyone sees this post and wonders what the answer is, here is the one I got from the R mailing list (thanks to David L Carson):

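    rows <- 500    # number of rows in the original dataframe
    cols <- 4004   # number of columns in the original dataframe
    # View the data as a rows x 4 x (cols/4) array: one slice per block of 4 columns
    dat2 <- array(as.matrix(dataGaus[[1]]), dim = c(rows, 4, cols / 4))
    # Swap dimensions 2 and 3, then flatten so the 4-column blocks are stacked vertically
    dat3 <- as.data.frame(matrix(aperm(dat2, c(1, 3, 2)), rows * cols / 4, 4))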
    

    where dataGaus[[1]] is the dataframe that I read from my data using read.csv. The trick here is the use of aperm with the permutation vector c(1,3,2), which reorders the dimensions of the array. I am still not sure exactly how it works, but for my purpose it works perfectly.
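
To see what the array/aperm combination is doing, it may help to run the same steps on a toy dataset. The sketch below is only an illustration: the 3 x 8 data and the names df, m, a, stacked, stacked_check and numpy_like are made up for this example and are not part of the original answer. It also cross-checks the result against a more explicit version that cuts the matrix into 4-column blocks and binds them with rbind.

```r
# Toy data (hypothetical): 3 rows, 8 columns (two blocks of 4),
# each row holding 1..8, mirroring the question at a smaller scale.
rows <- 3
cols <- 8
df <- as.data.frame(matrix(rep(1:cols, each = rows), nrow = rows))

m <- as.matrix(df)

# Step 1: reinterpret the matrix as a rows x 4 x (cols/4) array.
# R stores data column by column, so slice k holds columns 4k-3 .. 4k.
a <- array(m, dim = c(rows, 4, cols / 4))

# Step 2: aperm(a, c(1, 3, 2)) reorders the dimensions to rows x blocks x 4.
# Flattening that array fills each output column with all rows of block 1,
# then all rows of block 2, and so on, which stacks the blocks vertically.
stacked <- matrix(aperm(a, c(1, 3, 2)), nrow = rows * cols / 4, ncol = 4)

# Cross-check with an explicit (slower but more readable) version:
# cut the matrix into 4-column blocks and bind them on top of each other.
blocks <- lapply(seq(1, cols, by = 4), function(j) m[, j:(j + 3), drop = FALSE])
stacked_check <- do.call(rbind, blocks)

print(stacked)
print(all(stacked == stacked_check))

# For comparison: numpy's reshape(-1, 4) works in row-major order, so it
# splits each original row into consecutive groups of 4 instead.
numpy_like <- matrix(t(m), ncol = 4, byrow = TRUE)
print(numpy_like)
```

Note that the two results contain the same rows in a different order: stacked lists all rows of the first block before moving to the next block, as in the desired output of the question, whereas numpy_like walks through the original rows one at a time. Which of the two is wanted depends on whether the row order matters for the analysis.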