
R: preserving rownames with lapply


I have a function that creates a melted (reshape2 package) table from another one and returns it, using a transformation of the original table's name as its name. I need to apply it to a list of data frames. Since the variables passed in the "id.vars" parameter of the melt function are the rownames of the original table, I'm having trouble preserving them.

My original function is:

creaMelts<-function(tbl){
    library(reshape2)
    texto<-deparse(substitute(tbl))
    tbl2<-melt(tbl, id.vars=rownames(tbl))
    texto2<-substr(texto,4,nchar(texto))
    colnames(tbl2)<-c('userId','movieId', texto2)
    tblName<<-paste0('df', texto2)
    assign(tblName, tbl2)
    return(assign(tblName, tbl2, envir = .GlobalEnv ) )
}

A couple of data frames:

tbl1<-data.frame(a=seq(1,2,1), 'X1'=seq(2,3,1), 'X2'=seq(3,4,1))
rownames(tbl1)<-c('X1', 'X2')
tbl2<-data.frame(a=seq(1,2,1), 'X1'=seq(3,4,1), 'X2'=seq(4,5,1))
rownames(tbl2)<-c('X1', 'X2')

So if I run:

creaMelts(tbl1)

And then I ask for df1, I get the desired result:

  userId movieId 1 NA
      2       3 a  1
      3       4 a  2

Since I need to do this for a list:

lista=list(tbl1=tbl1,tbl2=tbl2)

I wrote:

lapply(seq_along(lista),function(tbl,n,rn,i){
    library(reshape2)
    texto<-n[[i]]
    print(texto)
    print(rn[[i]])
    tbl2<-melt(tbl, id.vars=rn[[i]])
    texto2<-substr(texto,4,nchar(texto))
    colnames(tbl2)<-c('userId','movieId', texto2)
    tblName<-paste0('df', texto2)
    assign(tblName, tbl2)
    return(assign(tblName, tbl2, envir = .GlobalEnv ) )
}, tbl=lista,n=names(lista), rn=rownames(lista))

My idea was that passing the rownames as an extra argument to lapply would preserve them, the way it does with the names, but it doesn't. The call does print each data frame's name, but for the rownames I get NULL.

If I ask now for df1, I get

   userId movieId    1
       a       1 tbl1
       a       2 tbl1
      X1       2 tbl1
      X1       3 tbl1
      X2       3 tbl1
      X2       4 tbl1
       a       1 tbl2
       a       2 tbl2
      X1       3 tbl2
      X1       4 tbl2
      X2       4 tbl2
      X2       5 tbl2

Which is not the desired result.

How can I also use the rownames of each element of the list? I would also like to avoid printing the table on the screen.


Solution

  • May I suggest an alternative approach? Use of assign and <<-, except in very rare cases, usually means you're using non-idiomatic R.

    From the R Inferno, Circle 6:

    If you think you need <<- , think again. If on reflection you still think you need <<- , think again.

    Set up data:

    tbl1 <- data.frame(a=seq(1,2,1), 'X1'=seq(2,3,1), 'X2'=seq(3,4,1))
    rownames(tbl1)<-c('X1', 'X2')
    tbl2 <- data.frame(a=seq(1,2,1), 'X1'=seq(3,4,1), 'X2'=seq(4,5,1))
    rownames(tbl2)<-c('X1', 'X2')
    

    Put your tables in a list (the first step to doing something sensible with a collection of objects in R):

    tblList <- list(tbl1=tbl1,tbl2=tbl2)
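
Once the tables are in a list, row names have to be requested per element: a plain list has no dimnames of its own, which is why rownames(lista) printed NULL in the question. Note also that passing tbl=lista hands the whole list to every call, so each element has to be picked out by index or iterated over directly. A minimal base-R sketch of the difference (rn_list, rn_each and first_rn are illustrative names, not part of the original answer):

```r
# Same two data frames and list as in the question.
tbl1 <- data.frame(a = seq(1, 2, 1), X1 = seq(2, 3, 1), X2 = seq(3, 4, 1))
rownames(tbl1) <- c("X1", "X2")
tbl2 <- data.frame(a = seq(1, 2, 1), X1 = seq(3, 4, 1), X2 = seq(4, 5, 1))
rownames(tbl2) <- c("X1", "X2")
lista <- list(tbl1 = tbl1, tbl2 = tbl2)

# A list has no dimnames, so this is NULL (what the question observed).
rn_list <- rownames(lista)

# Row names live on each data frame, so ask each element in turn.
rn_each <- lapply(lista, rownames)

# Inside an index-based loop, pick out one element explicitly.
first_rn <- rownames(lista[[1]])
```

Here rn_each is a list with one character vector of row names per table, named after the list elements.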
    

    Now write your function:

    meltfun <- function(x,lab=1) {
       z <- reshape2::melt(x,id.vars=rownames(x))
       colnames(z) <- c('userId','movieId', lab, "value")
       return(z)
    }
    

    Since we want to use labels associated with the names, we have to do a little more work:

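    # Strip the "tbl" prefix from each list name to get its label
    labs <- sapply(names(tblList),
               function(x) substr(x,4,nchar(x)))
    
    # Apply meltfun to each table together with its label
    res <- mapply(meltfun,tblList,labs,SIMPLIFY=FALSE)
    # Rename the results df1, df2, ...
    setNames(res,paste0("df",labs))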
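
On the second part of the question (avoiding the printed output): at the console, R auto-prints the value of a bare top-level call, so one option is simply to assign the result to a variable and pull out individual tables by name. The sketch below uses only base R; tagfun is a hypothetical stand-in for meltfun so that the example runs without reshape2, and the small data frames are made up for illustration:

```r
# Illustrative list of tables (not the data from the question).
tblList <- list(tbl1 = data.frame(a = 1:2), tbl2 = data.frame(a = 3:4))

# Strip the "tbl" prefix, as the answer does with substr().
labs <- substr(names(tblList), 4, nchar(names(tblList)))

# Hypothetical stand-in for meltfun(): tags each table with its label.
tagfun <- function(x, lab) cbind(x, label = lab)

# Assigning the result keeps it from being auto-printed at the console.
res <- setNames(Map(tagfun, tblList, labs), paste0("df", labs))

# Individual tables are then reached by name, with no global assign().
df1 <- res[["df1"]]
```

Keeping the results together in one named list also makes it easy to apply further steps to all of them with lapply.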