Tags: r, dataframe, multiple-columns, mean, subtraction

How to compute the mean of the difference between columns of two data frames, "mean(df1$a-df2$b)", in R


My two data frames look like this:

> dput(head(df1,25))
structure(list(Date = structure(c(16644, 16645, 16646, 16647, 
16648, 16649, 16650, 16651, 16652, 16653, 16654, 16655, 16656, 
16657, 16658, 16659, 16660, 16661, 16662, 16663, 16664, 16665, 
16666, 16667, 16668), class = "Date"), AU = c(0.241392906920806, 
0.257591745069017, 0.263305712230276, NaN, 0.252892547032525, 
0.251771180928526, 0.249211746794207, 0.257289083109259, 0.205017582640463, 
0.20072274573488, 0.210154167590338, 0.207384553271337, 0.193725450540089, 
0.199282601988984, 0.216267134143314, 0.217052471451736, NaN, 
0.220703029531909, 0.2164619798534, 0.223442036108148, 0.22061326758891, 
NaN, 0.277777461504811, NaN, 0.200839628485262)), row.names = c(NA, 
-25L), class = c("tbl_df", "tbl", "data.frame"))

> dput(head(df2,25))
structure(list(UF1 = c(0.2559, 0.2565, 0.257, 0.2577, 0.2583, 
0.259, 0.2596, 0.2603, 0.2611, 0.2618, 0.2625, 0.2633, 0.2641, 
0.2649, 0.2657, 0.2665, 0.2674, 0.2682, 0.2691, 0.27, 0.2709, 
0.2718, 0.2727, 0.2736, 0.2745), UF2 = c(0.2597, 0.2602, 0.2608, 
0.2614, 0.2621, 0.2627, 0.2634, 0.2641, 0.2648, 0.2655, 0.2663, 
0.267, 0.2678, 0.2686, 0.2694, 0.2702, 0.2711, 0.2719, 0.2728, 
0.2737, 0.2745, 0.2754, 0.2763, 0.2773, 0.2782), UF3 = c(0.2912, 
0.2915, 0.2918, 0.2922, 0.2926, 0.293, 0.2934, 0.2938, 0.2943, 
0.2947, 0.2952, 0.2957, 0.2962, 0.2968, 0.2973, 0.2979, 0.2985, 
0.2991, 0.2997, 0.3003, 0.3009, 0.3016, 0.3022, 0.3029, 0.3035
), Date = structure(c(16644, 16645, 16646, 16647, 16648, 16649, 
16650, 16651, 16652, 16653, 16654, 16655, 16656, 16657, 16658, 
16659, 16660, 16661, 16662, 16663, 16664, 16665, 16666, 16667, 
16668), class = "Date")), row.names = c(NA, 25L), class = "data.frame")

I want to compute the mean of the difference between a column of one data frame and the columns of another, i.e. mean(df1$AU-df2$UF) for each UF column of df2. The closest I have come to a solution is the following:

data.frame(mean = colMeans(df1$AU, na.rm = TRUE) - colMeans(df2$UF))

but I got this error:

Error in colMeans(df1$AU, na.rm = TRUE) : 
  'x' must be an array of at least two dimensions

I managed to run the same code only for data frames with one column each. Since df2 has three or more columns that I want to compare against df1$AU, I need a more efficient approach.

Any help will be much appreciated. Thank you.


Solution

  • Assuming you want the mean of df1$AU minus the mean of each numeric column of df2, you can compute it like this:

    mean(df1$AU, na.rm = TRUE) - colMeans(df2[, 1:3], na.rm = TRUE)
    

    For the 25 rows shown in the question, this outputs:

           UF1        UF2        UF3 
    -0.0367389 -0.0404509 -0.0688949
    

    There is one value per column of df2.
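
    As for why the attempt in the question fails: `$` returns a plain vector, which has no dimensions, while colMeans() expects a two-dimensional object. The sketch below reproduces this on made-up toy data (the values and the two-column df2 are assumptions for illustration, not the asker's data).

    ```r
    # Toy data with the same layout as the question (values are made up).
    df1 <- data.frame(AU = c(0.24, NaN, 0.26, 0.25))
    df2 <- data.frame(UF1 = c(0.25, 0.26, 0.27, 0.28),
                      UF2 = c(0.26, 0.27, 0.28, 0.29))

    # `$` extracts a plain numeric vector, which has no dim attribute.
    is.null(dim(df1$AU))

    # colMeans() on that vector fails; capture the message instead of stopping.
    err <- tryCatch(colMeans(df1$AU, na.rm = TRUE),
                    error = function(e) conditionMessage(e))
    err

    # df2 has no column named exactly UF, and the partial match against
    # UF1 and UF2 is ambiguous, so df2$UF is NULL.
    is.null(df2$UF)

    # mean() handles the single vector; `[` keeps df2 two-dimensional.
    au_mean  <- mean(df1$AU, na.rm = TRUE)
    uf_means <- colMeans(df2[, c("UF1", "UF2")], na.rm = TRUE)

    au_mean - uf_means
    ```

    Selecting the df2 columns by name rather than by position (1:3) is a design choice here: it keeps the Date column out of colMeans() even if the column order changes.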

    I hope this is helpful.
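
If the intended quantity is instead the mean of the row-wise differences, as the expression mean(df1$AU-df2$UF) in the question suggests, the result can differ from the difference of means whenever df1$AU contains NaN, because rows with a missing AU are then dropped from both sides. The following is a minimal sketch under that assumption, using made-up toy data and assuming that rows should be paired on the Date column; `uf_cols` and `row_diff_means` are names introduced here for illustration.

```r
# Toy data with the same layout as the question (values are made up).
df1 <- data.frame(
  Date = as.Date("2015-07-28") + 0:3,
  AU   = c(0.24, NaN, 0.26, 0.25)
)
df2 <- data.frame(
  UF1  = c(0.25, 0.26, 0.27, 0.28),
  UF2  = c(0.26, 0.27, 0.28, 0.29),
  Date = as.Date("2015-07-28") + 0:3
)

# Pair each AU value with the UF values recorded on the same date.
merged  <- merge(df1, df2, by = "Date")
uf_cols <- c("UF1", "UF2")

# Mean of the row-wise differences, one value per UF column.
# Rows where AU is NaN give a NaN difference and are dropped by na.rm = TRUE.
row_diff_means <- sapply(merged[uf_cols],
                         function(uf) mean(merged$AU - uf, na.rm = TRUE))

# Difference of the column means, as in the answer above, for comparison.
diff_of_means <- mean(merged$AU, na.rm = TRUE) -
  colMeans(merged[uf_cols], na.rm = TRUE)

row_diff_means
diff_of_means
```

With complete data the two results coincide; they diverge here only because one AU value is missing, so which one is appropriate depends on how the missing days should be treated.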