Search code examples
rdataframefunctionvectortibble

How can I calculate the average of n rows in a tibble and assign a new column to the tibble with the resulting value on those rows in r?


I have a tibble with exchange quotations that give me 5 buying price values and 5 selling price values per day (or 5 rows with 2 columns per day)

I'm trying to make an iterative solution where I cycle through the values, calculating sum and average for every 5 rows and save the result in a new column, repeated for the specific rows, so, for example:

buyPrice <- c(1:10)
sellPrice <- c(11:20)

exchange <- tibble(buyPrice, sellPrice)


# A tibble: 10 × 2
   buyPrice sellPrice
      <int>     <int>
 1        1        11
 2        2        12
 3        3        13
 4        4        14
 5        5        15
 6        6        16
 7        7        17
 8        8        18
 9        9        19
10       10        20

In this case, I would expect, after the work that I would have an averageBuy and an averageSell vector that I could appent to the tibble as new columns with the average values for every five rows:

# A tibble: 10 × 4
   buyPrice sellPrice averageBuy averageSell
      <int>     <int>      <dbl>       <dbl>
 1        1        11          3          13
 2        2        12          3          13
 3        3        13          3          13
 4        4        14          3          13
 5        5        15          3          13
 6        6        16          8          18
 7        7        17          8          18
 8        8        18          8          18
 9        9        19          8          18
10       10        20          8          18

Comming from other language, I was trying something in the veins of:

getAverageByDay <- function(dataSet, columnNumber) {
  
  temporarySum <- 0
  average <- c()
  dataSetRows <- nrow(dataSet)
  averageList <- rep(0,dataSetRows)
  
  for (i in 1:dataSetRows) {
    if (i %% 5 == 0) {
      temporarySum <- temporarySum + dataSet[[i,columnNumber]]
      
      average <- temporarySum / 5 
     
      averageList[(i-4):i] <- rep(average,5)
      
      temporarySum <- 0
      average <- 0
      
    } else {
      temporarySum <- temporarySum + dataSet[[i,columnNumber]] 
    }
  }
  return(averageList)
}

averageBuy <- getAverageByDay(exchange, 1)
averageSell <- getAverageByDay(exchange, 2)

Where I would append the resulting vectors as the new columns in the tibble. Everything else works, but this assignment that never happens:

averageList[(i-4):i] <- rep(average,5)

Which in turn returns NULL from the function.

If there's something more r specific than a for loop, I'd very much appreciate to know. Also, not very expensive if possible. Thanks


Solution

  • This is how I would solve the problem.

    library(dplyr)
    
    buyPrice <- c(1:10)
    sellPrice <- c(11:20)
    
    exchange <- tibble(buyPrice, sellPrice)
    
    exchange |>
      mutate(group = cumsum(row_number() %% 5 == 1)) |>
      mutate(
        averageBuy = mean(buyPrice),
        averageSell = mean(sellPrice),
        .by = group
      )
    #> # A tibble: 10 × 5
    #>    buyPrice sellPrice group averageBuy averageSell
    #>       <int>     <int> <int>      <dbl>       <dbl>
    #>  1        1        11     1          3          13
    #>  2        2        12     1          3          13
    #>  3        3        13     1          3          13
    #>  4        4        14     1          3          13
    #>  5        5        15     1          3          13
    #>  6        6        16     2          8          18
    #>  7        7        17     2          8          18
    #>  8        8        18     2          8          18
    #>  9        9        19     2          8          18
    #> 10       10        20     2          8          18
    

    Created on 2024-05-12 with reprex v2.0.2