Tags: r, zeros, apply, rep

Adding zeros to the first x rows of a column in R


I am creating a classification model for forecasting purposes. I have several text files, which I converted into one large list containing several lists (called comb). I then broke the large list into a separate data frame with each list as its own column (called BI). Because each list may contain a different number of elements, the simpler call matrix(unlist(l), ncol=ncol) does not work. After reviewing alternatives, I adapted one of them into the following:

max_length <- max(sapply(comb, length))

BI <- sapply(comb, function(x){
    c(x, rep(0, max_length - length(x)))
})

This creates a data frame in which each list becomes a column and each missing element within that column is filled with a zero. Those zeros appear at the end of the column, but I would like them to be at the beginning. Here is an example of the current output:

cola colb colc
2    2    2   
1    1    0
4    0    0

I need your help in converting my original code to produce the following format:

cola colb colc
2    0    0   
1    2    0
4    1    2

Solution

  • It might be sufficient to swap the order of the arguments in the concatenation c(), so that the zeros come first:

    max_length <- max(sapply(comb, length))
    
    BI <- sapply(comb, function(x){
        c(rep(0, max_length - length(x)), x)
    })
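
As a quick check, here is a minimal sketch that runs the swapped concatenation on a small stand-in for comb. The list below is hypothetical, built only to mirror the cola/colb/colc example in the question. Note that sapply returns a matrix here, which as.data.frame(BI) converts if a data frame is needed.

```r
# Hypothetical stand-in for comb: vectors of unequal length
comb <- list(cola = c(2, 1, 4), colb = c(2, 1), colc = 2)

max_length <- max(sapply(comb, length))

# Zeros first, then the data, so each vector is left-padded to max_length
BI <- sapply(comb, function(x) {
  c(rep(0, max_length - length(x)), x)
})

print(BI)
```

With this input the short columns are padded at the top rather than at the bottom, which is the layout asked for in the question.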
    

    EDIT: Based on additional information in the comments below, here is an approach that modifies the code in another way. The idea is that, as long as your first approach gives you a proper data frame, we can circumvent the problem by using the order function to move the padding zeros to the front.

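    max_length <- max(sapply(comb, length))
    
    BI <- sapply(comb, function(x){
        # Zeros needed to pad x up to max_length
        .zeros <- rep(0, max_length - length(x))
        # Sort keys: the zeros (key 0) sort ahead of the positions 1..length(x);
        # seq_along() is used instead of 1:length(x) so an empty x is handled
        .rearrange <- order(c(seq_along(x), .zeros))
        c(x, .zeros)[.rearrange]
    })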
    })
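
To see why the order trick moves the padding to the front, it may help to trace a single short vector by hand. This is only an illustrative sketch: the vector x, the value of max_length, and the names keys, idx and padded are made up for the trace.

```r
x <- c(2, 1)       # hypothetical short column
max_length <- 3    # assumed length of the longest column

.zeros <- rep(0, max_length - length(x))  # one padding zero
keys   <- c(seq_along(x), .zeros)         # sort keys: 1 2 0
idx    <- order(keys)                     # key 0 sorts first: 3 1 2
padded <- c(x, .zeros)[idx]               # padding moved to the front: 0 2 1

print(padded)
```

The data elements keep their relative order because their keys are their original positions, and every padding zero has key 0, which is smaller than any position.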
    

    I have tested that this code works on a small test example I created, but I'm not certain that this example resembles your comb...

    If this revised approach doesn't work, then it's still possible to first create the data frame with your original code, and then reorder one column at a time.
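
A rough sketch of that column-by-column fallback could look like the following. It assumes comb is still available, so the amount of padding per column can be computed from the original lengths instead of guessed from the zeros (a genuine trailing zero in the data would otherwise be mistaken for padding). The example list and the name n_pad are hypothetical.

```r
# Hypothetical stand-in for comb
comb <- list(cola = c(2, 1, 4), colb = c(2, 1), colc = 2)

max_length <- max(sapply(comb, length))

# Original approach: zeros appended at the end of each column
BI <- sapply(comb, function(x) {
  c(x, rep(0, max_length - length(x)))
})

# Number of padding zeros that were appended to each column
n_pad <- max_length - sapply(comb, length)

# Reorder one column at a time: move the trailing padding to the front
for (j in seq_len(ncol(BI))) {
  k <- n_pad[[j]]
  if (k > 0) {
    col <- BI[, j]
    BI[, j] <- c(tail(col, k), head(col, max_length - k))
  }
}

print(BI)
```

The same loop should also work if BI has been converted to a data frame, since BI[, j] reads and assigns a single column in both cases.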