Tags: r, dataframe, difference, tibble

How to subtract the value at a specific position in a dataframe from every value in a column in R


I have a dataframe with 216 rows and 12 columns, and I am trying to add a new column in which each value equals the difference between the corresponding element of the 12th column and the element in the 216th row of the 12th column (df[216,12]). When I tried this with a reduced version of the dataframe (i.e. with just 5 rows instead of 216) it worked without problems, but now that I'm doing exactly the same thing on the full dataset, I get the error "Error in Ops.data.frame(df_final[, 12], df_final[216, 12]) : ‘-’ only defined for equally-sized data frames". I'm not sure why I'm getting that error or how to fix it.

For illustrative purposes, a simplified version of my dataset is as follows (the code works for this simplified dataset but not for my full dataset with 216 rows instead of just 5):

miRNA<-c("hsa-miR-10a-4373153", "hsa-miR-10b-4395329", "MammU6-4395470_1", "MammU6-4395470_2", "hsa-miR-15a-4373123")
C1<-c(28.005966, 30.806433, 17.341375, 17.40666, 30.039436)
T2<-c(30.973469, 29.236025, 30.41161, 20.914383, 20.904331)
C3<-c(26.322796, 25.542833, 22.460772, 19.972183, 30.409641)
T4<-c(26.441898, 25.837685, 23.158352, 20.379173, 33.81327)
C5<-c(39.750206, 19.901133, 28.180124, 22.668673, 25.748884)
T6<-c(23.004385, 28.472675, 23.81621, 26.433413, 28.851719)
T7<-c(22.239546, 28.741674, 23.754929, 26.015385, 28.16368)
T8<-c(29.590443, 30.041988, 21.323061, 24.272501, 18.099016)
C9<-c(15.856442, 22.64224, 29.629637, 25.374926, 22.356894)
C10<-c(38.137985, 24.753338, 26.986668, 24.578161, 19.223558)
data<-data.frame(miRNA, C1, T2, C3, T4, C5, T6, T7, T8, C9, C10)
View(data)
data$C12<-data[,11]-data[5,11]

Solution

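  • The issue is that the full dataset is a tbl_df (a tibble). Unlike with a data.frame, data[,11] on a tibble does not collapse to a vector: it is still a tbl_df with a single column, so the subtraction is dispatched to Ops.data.frame, which requires both operands to have the same dimensions. There are several ways around this, for example: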

    unlist(data[,11])- unlist(data[5,11])
    

    Using a reproducible example

    library(tibble)
    df1 <- tibble(col1 = 1:5, col2 = 6:10)
    df1[, 2] - df1[1, 2]  # both operands are tibbles (5 x 1 and 1 x 1)
    

    Error in Ops.data.frame(df1[, 2], df1[1, 2]) : ‘-’ only defined for equally-sized data frames

    unlist(df1[,2]) - unlist(df1[1,2])
    

    Or use drop = TRUE. For a tibble, drop defaults to FALSE, whereas for a data.frame it is TRUE when a single column is selected

    df1[[2]] - df1[1,2, drop = TRUE]
    

    Note here that we use [[ to extract the column as a vector
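
To make the difference between [ and [[ concrete, here is a minimal sketch that compares the two on a tibble and on a plain data.frame. The objects tb and dfr are illustrative names, not taken from the question, and the tibble package is assumed to be installed.

```r
library(tibble)

tb  <- tibble(col1 = 1:5, col2 = 6:10)
dfr <- data.frame(col1 = 1:5, col2 = 6:10)

# Single-column `[` keeps a tibble as a tibble but simplifies a data.frame
is.data.frame(tb[, 2])    # TRUE: still a one-column tibble
is.data.frame(dfr[, 2])   # FALSE: simplified to an integer vector

# `[[` returns the underlying vector for both classes
is.integer(tb[[2]])       # TRUE
is.integer(dfr[[2]])      # TRUE

# With plain vectors, R recycles the single value across the whole column
tb[[2]] - tb[[2]][1]      # 0 1 2 3 4
```

Because [[ behaves the same way for both classes, code written with it should not depend on whether the input happens to be a tibble or a data.frame.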

    Or another option is to make use of dplyr functions

    library(dplyr)
    df1 %>%
        mutate_at(2, ~ . - .[1])  # subtract the first value of column 2, as in the examples above
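
Applied to the original question, the same idea might look like the sketch below. It assumes the full dataset is a tibble, uses a small toy table in place of the real 216 x 12 data, and takes the reference value from the last row of the last column. The column names id, value and diff_from_ref are hypothetical.

```r
library(tibble)

# Toy stand-in for the asker's data; `value` plays the role of column 12
df_final <- tibble(
  id    = letters[1:5],
  value = c(28.0, 30.8, 17.3, 17.4, 30.0)
)

ref_row <- nrow(df_final)   # would be 216 in the full dataset
ref_col <- ncol(df_final)   # would be 12 in the full dataset

# `[[` yields a plain numeric vector, so the single reference value is recycled
col_values <- df_final[[ref_col]]
df_final$diff_from_ref <- col_values - col_values[ref_row]
```

The reference row itself ends up with a difference of 0, which is a quick sanity check that the right cell was used.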