Tags: r, list, export-to-csv, nested-lists, export-to-text

In R, using results of rle (Run Length Encoding) including named row and column headers


I have a large matrix containing companies as row names, months as column names and data for each of the elements. Test data below:

    testmatrix <- matrix(c(1,0,0,0,10,5,5,5,5,5,2,2,0,0,0,0,0,1,1,1), nrow=4, ncol=5, byrow=TRUE)
    colnames(testmatrix) <- c("Jan","Feb","Mar","Apr","May")
    rownames(testmatrix) <- c("Company1","Company2","Company3","Company4")
    progression <- apply(testmatrix, 1, rle)
    progression

The progression object is the output of the rle function applied to each row of the matrix. The result is a list with one element per row (four here), each of class 'rle' and each holding two components, lengths and values. I would like to:

  1. Understand how to output (in R) a 4x3 (row by column) matrix of Company1 as follows:

[Image not available: the desired 4x3 matrix for Company1]

Hence I'm struggling to understand how to deal with the output provided by progression

  2. Export progression to Excel for further analysis, preferably in the format in (1) above, including column and row headers (in the list output they're referred to as attr(*,"names")).

Your assistance is much appreciated!


Solution

  • This is not particularly elegant, but it does the job:

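    format_rle <- function(r, rn){
      # r: a single 'rle' object; rn: label repeated as the row name
      l <- list(r$lengths,        # run lengths
        names(r$lengths),         # names attached to the lengths
        r$values,                 # value of each run
        names(r$values))          # month in which each run ends
      m <- do.call(rbind, l)      # stack the four vectors into a character matrix
      colnames(m) <- NULL
      rownames(m) <- rep(rn, nrow(m))
      m
    }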
    

    Try format_rle(progression[[1]], "foo") to get the idea:

        [,1]  [,2]  [,3] 
    foo "1"   "3"   "1"  
    foo "Feb" "May" ""   
    foo "1"   "0"   "10" 
    foo "Jan" "Apr" "May"
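To make sense of that output, it may help to look at one element of progression directly. The sketch below rebuilds the test data from the question and pulls out the pieces of the first 'rle' object; the variable name r1 is only for illustration.

```r
# Rebuild the test data from the question
testmatrix <- matrix(c(1,0,0,0,10, 5,5,5,5,5, 2,2,0,0,0, 0,0,1,1,1),
                     nrow = 4, ncol = 5, byrow = TRUE)
colnames(testmatrix) <- c("Jan", "Feb", "Mar", "Apr", "May")
rownames(testmatrix) <- paste0("Company", 1:4)
progression <- apply(testmatrix, 1, rle)

length(progression)    # one 'rle' object per row of the matrix
names(progression)     # taken from the row names of the matrix

r1 <- progression[["Company1"]]   # r1 is an illustrative name
unname(r1$lengths)     # length of each run
unname(r1$values)      # value of each run
names(r1$values)       # month in which each run ends
inverse.rle(r1)        # expands the runs back into the original row
```

Each element can be picked out by name or by position, and inverse.rle is a quick way to check that nothing was lost in the encoding.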
    

    Then we apply this function to every element of progression and save each result to its own csv file, named after the corresponding name in progression. This leaves one .csv file per company in your working directory (run getwd() to print it).

    for (i in seq_along(progression)) {
      # one file per company, e.g. Company1.csv
      write.csv(format_rle(progression[[i]], names(progression)[i]),
                file = paste0(names(progression)[i], ".csv"))
    }
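If a single file is easier to work with in Excel, one possible alternative (not the layout shown above) is a long table with one row per run. This is only a sketch: the helper rle_to_df, its column names, and the output file name are made up for illustration, and the file is written to a temporary directory rather than the working directory.

```r
# Rebuild the test data from the question
testmatrix <- matrix(c(1,0,0,0,10, 5,5,5,5,5, 2,2,0,0,0, 0,0,1,1,1),
                     nrow = 4, ncol = 5, byrow = TRUE)
colnames(testmatrix) <- c("Jan", "Feb", "Mar", "Apr", "May")
rownames(testmatrix) <- paste0("Company", 1:4)
progression <- apply(testmatrix, 1, rle)

# Hypothetical helper (not part of the original answer):
# turns one 'rle' object into a data frame with one row per run
rle_to_df <- function(r, company, months) {
  len   <- unname(r$lengths)
  end   <- cumsum(len)        # column index where each run ends
  start <- end - len + 1      # column index where each run starts
  data.frame(company = company,
             from    = months[start],
             to      = months[end],
             length  = len,
             value   = unname(r$values),
             stringsAsFactors = FALSE)
}

# Stack the per-company tables into one data frame
runs <- do.call(rbind, lapply(names(progression), function(nm)
  rle_to_df(progression[[nm]], nm, colnames(testmatrix))))
rownames(runs) <- NULL
runs

# Write everything to a single csv file (assumed name, temporary location)
out <- file.path(tempdir(), "progression_runs.csv")
write.csv(runs, out, row.names = FALSE)
```

The from and to columns give the first and last month of each run, which avoids having to interpret the names that rle leaves on lengths and values.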
    

    Is this what you want?