
R programming: for loop with if/else statement not returning the correct median


I have a large data frame, summary2, containing different keywords, with dates from 2016 to 2020 for each keyword. I created a loop that goes through each company; whenever it reaches the start of 2016 in the data frame, it should enter the if branch. The strange thing is that the median is not returned correctly once the loop gets to company 2 on line 269. The median I need covers the hits from 4 weeks before to 3 weeks after each date, so the first dates must go through an if statement that uses only the available dates.

I am using the code below, but the median is not calculated correctly in the else branch. However, summary2$test and summary2$test2 contain the correct numbers, so why does median(summary2$hits[i-4:i+3]) not return the correct number? If I plug the summary2$test and summary2$test2 values in manually, the median comes out right.

The code

for (i in 1:nrow(summary2)) {
  
  if (summary2$date[i] < as.Date('2016-01-31')) {
    summary2$median[i] = median(summary2$hits[i:i+3])
  }
  else {
    summary2$median[i] = median(summary2$hits[i-4:i+3])
    summary2$test[i] = i-4
    summary2$test2[i] = i+3
  }
  
}

The dataframe:

line  keyword  hits  date        company  median  test  test2
1     apple    32    2016-01-03  apple    30.0    NA    NA
2     apple    30    2016-01-10  apple    28.0    NA    NA
3     apple    29    2016-01-17  apple    29.0    NA    NA
4     apple    30    2016-01-24  apple    31.0    NA    NA
5     apple    28    2016-01-31  apple    29.5    1     8
6     apple    29    2016-02-07  apple    29.0    2     9
523   icloud   72    2016-01-03  apple    65      NA    NA
524   icloud   69    2016-01-10  apple    66      NA    NA
525   icloud   66    2016-01-17  apple    62      1     8
526   icloud   65    2016-01-24  apple    66      NA    NA
527   icloud   66    2016-01-31  apple    28      523   530
528   icloud   62    2016-02-07  apple    28      524   531
529   icloud   66    2016-02-14  apple    28      525   532
530   icloud   66    2016-02-21  apple    28      526   533

Looks like there is some bug with line 525 as well.


Solution

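  • I think you should use () around the bounds when you use :. In R, : binds more tightly than binary + and -, so i-4:i+3 is parsed as i - (4:i) + 3 and i:i+3 as (i:i) + 3, not as the ranges you intended, e.g.,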

    for (i in 1:nrow(summary2)) {
      
      if (summary2$date[i] < as.Date('2016-01-31')) {
        summary2$median[i] = median(summary2$hits[i:(i+3)])
      }
      else {
        summary2$median[i] = median(summary2$hits[(i-4):(i+3)])
        summary2$test[i] = i-4
        summary2$test2[i] = i+3
      }
      
    }
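
To see why the parentheses matter, here is a minimal sketch that evaluates both forms of the index expression on their own. It assumes a standalone vector named hits holding the eight icloud values from the question (lines 523 to 530), so i = 5 here corresponds to line 527. The variable names are only for illustration and are not part of the original code.

```r
# Eight icloud hits from the question (lines 523-530), used as a standalone vector.
hits <- c(72, 69, 66, 65, 66, 62, 66, 66)
i <- 5  # position of line 527 within this small vector

# Without parentheses, `:` is evaluated before `+` and `-`:
# i-4:i+3 is i - (4:i) + 3, i.e. 5 - c(4, 5) + 3.
buggy_idx <- i-4:i+3
print(buggy_idx)           # 4 3

# With parentheses, the bounds are computed first, then the range.
fixed_idx <- (i-4):(i+3)
print(fixed_idx)           # 1 2 3 4 5 6 7 8

buggy_median <- median(hits[buggy_idx])  # median of hits[4] and hits[3]
fixed_median <- median(hits[fixed_idx])  # median of the full 8-week window
print(buggy_median)        # 65.5
print(fixed_median)        # 66

# The same thing happens in the if branch: i:i+3 is (i:i) + 3,
# a single index, so the "median" is just hits[i + 3].
first_buggy <- 1:1+3
first_fixed <- 1:(1+3)
print(first_buggy)         # 4
print(first_fixed)         # 1 2 3 4
```

This also matches the output in the question: for line 523 the reported median is 65, which is simply the hits value three rows later (line 526), as the unparenthesised i:i+3 would select.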