cbind warning: row names were found from a short variable and have been discarded


I have the lines of code below that use cbind, but I get a warning message every time I run them. The code still works as it should, but is there any way to resolve the warning?

dateset = subset(all_data[,c("VAR1","VAR2","VAR3","VAR4","VAR5","RATE1","RATE2","RATE3")])
dateset = cbind(dateset[c(1,2,3,4,5)],stack(dateset[,-c(1,2,3,4,5)]))

Warning:

Warning message:
   In data.frame(..., check.names = FALSE) :
        row names were found from a short variable and have been discarded

Thanks in advance!


Solution

  • I'm guessing your data.frame has custom row.names:

    A <- data.frame(a = c("A", "B", "C"), 
                    b = c(1, 2, 3), 
                    c = c(4, 5, 6), 
                    row.names=c("A", "B", "C"))
    
    cbind(A[1], stack(A[-1]))
    #   a values ind
    # 1 A      1   b
    # 2 B      2   b
    # 3 C      3   b
    # 4 A      4   c
    # 5 B      5   c
    # 6 C      6   c
    # Warning message:
    # In data.frame(..., check.names = FALSE) :
    #   row names were found from a short variable and have been discarded
    

    What's happening here is that A[1] has 3 rows while the stacked columns have 6, so R recycles the first column to match the longer one. A data.frame can't have duplicated row.names by default, and you never tell R how to extend the row.names during that recycling, so R discards them and warns you about it.
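
    To see the mismatch for yourself, you can inspect the two pieces before combining them. This is only a rough sketch using the same toy data as above; the names left, right and msg are made up for illustration:

    ```r
    # Same toy data.frame as above, with custom row.names.
    A <- data.frame(a = c("A", "B", "C"),
                    b = c(1, 2, 3),
                    c = c(4, 5, 6),
                    row.names = c("A", "B", "C"))

    left  <- A[1]          # the short piece: carries the custom row.names
    right <- stack(A[-1])  # the long piece: automatic row.names

    rownames(left)   # the custom names from A
    nrow(left)       # number of rows in the short piece
    nrow(right)      # number of rows in the stacked piece

    # Capture the warning text instead of printing it.
    msg <- tryCatch(cbind(left, right),
                    warning = function(w) conditionMessage(w))
    msg
    ```

    The short piece is the one that has the custom row.names, which is exactly the situation the warning describes.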

    Compare with a similar data.frame, but one without row.names:

    B <- data.frame(a = c("A", "B", "C"), 
                    b = c(1, 2, 3), 
                    c = c(4, 5, 6))
    
    cbind(B[1], stack(B[-1]))
    #   a values ind
    # 1 A      1   b
    # 2 B      2   b
    # 3 C      3   b
    # 4 A      4   c
    # 5 B      5   c
    # 6 C      6   c
    

    Alternatively, you can set row.names = NULL in your cbind statement:

    cbind(A[1], stack(A[-1]), row.names = NULL)
    #   a values ind
    # 1 A      1   b
    # 2 B      2   b
    # 3 C      3   b
    # 4 A      4   c
    # 5 B      5   c
    # 6 C      6   c
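
    If you never need the row.names at all, another option that should work is to reset them before combining, which puts you in the same situation as B above. A minimal sketch, assuming the same toy data; A2 and res are just names introduced here:

    ```r
    A <- data.frame(a = c("A", "B", "C"),
                    b = c(1, 2, 3),
                    c = c(4, 5, 6),
                    row.names = c("A", "B", "C"))

    A2 <- A
    rownames(A2) <- NULL  # back to automatic row.names

    # Nothing custom is left to discard, so cbind has nothing to warn about.
    res <- cbind(A2[1], stack(A2[-1]))
    rownames(res)
    ```

    This changes the data.frame itself rather than the cbind call, so it is mostly useful when the row.names are not needed anywhere else.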
    

    If your original row.names are important, you can also add them back in with:

    cbind(rn = rownames(A), A[1], stack(A[-1]), row.names = NULL)
    #   rn a values ind
    # 1  A A      1   b
    # 2  B B      2   b
    # 3  C C      3   b
    # 4  A A      4   c
    # 5  B B      5   c
    # 6  C C      6   c
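
    Applied to the columns from your question, it could look roughly like the sketch below. I don't know what all_data contains, so the values and row.names here are made up, and id_cols, rate_cols and long_data are names I introduced for illustration:

    ```r
    # Hypothetical stand-in for all_data; values and row.names are invented.
    all_data <- data.frame(
      VAR1 = c("x", "y"), VAR2 = c("p", "q"),
      VAR3 = 1:2, VAR4 = 3:4, VAR5 = 5:6,
      RATE1 = c(0.1, 0.2), RATE2 = c(0.3, 0.4), RATE3 = c(0.5, 0.6),
      row.names = c("r1", "r2")
    )

    id_cols   <- c("VAR1", "VAR2", "VAR3", "VAR4", "VAR5")
    rate_cols <- c("RATE1", "RATE2", "RATE3")

    # Recycle the id columns against the stacked rate columns,
    # explicitly dropping row.names so there is nothing to discard.
    long_data <- cbind(all_data[id_cols],
                       stack(all_data[rate_cols]),
                       row.names = NULL)
    dim(long_data)
    ```

    Selecting columns by name instead of by position also keeps the call readable if the column order in all_data ever changes.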