I have a big data frame, summary2, containing different keywords, with dates from 2016 to 2020 for each keyword. I loop over the rows for each company, so whenever the loop reaches the first 2016 dates in the data frame it should take the if branch. The median for each row needs the hits from 4 weeks before to 3 weeks after that row, so the first dates must go through an if statement that calculates the median from the available dates only. The strange thing is that the median is not returned correctly when the loop gets to company 2 on line 269.
I am using the code below, but the median is not calculated correctly in the else branch. However, summary2$test and summary2$test2 return the correct numbers, so why does median(summary2$hits[i-4:i+3]) not return the correct number? If I plug the summary2$test and summary2$test2 numbers into a median manually, it returns the correct numbers.
The code:

    for (i in 1:nrow(summary2)) {
      if (summary2$date[i] < as.Date('2016-01-31')) {
        summary2$median[i] = median(summary2$hits[i:i+3])
      } else {
        summary2$median[i] = median(summary2$hits[i-4:i+3])
        summary2$test[i] = i-4
        summary2$test2[i] = i+3
      }
    }

The dataframe:
line | keyword | hits | date | company | median | test | test2 |
---|---|---|---|---|---|---|---|
1 | apple | 32 | 2016-01-03 | apple | 30.0 | NA | NA |
2 | apple | 30 | 2016-01-10 | apple | 28.0 | NA | NA |
3 | apple | 29 | 2016-01-17 | apple | 29.0 | NA | NA |
4 | apple | 30 | 2016-01-24 | apple | 31.0 | NA | NA |
5 | apple | 28 | 2016-01-31 | apple | 29.5 | 1 | 8 |
6 | apple | 29 | 2016-02-07 | apple | 29.0 | 2 | 9 |
523 | icloud | 72 | 2016-01-03 | apple | 65 | NA | NA |
524 | icloud | 69 | 2016-01-10 | apple | 66 | NA | NA |
525 | icloud | 66 | 2016-01-17 | apple | 62 | 1 | 8 |
526 | icloud | 65 | 2016-01-24 | apple | 66 | NA | NA |
527 | icloud | 66 | 2016-01-31 | apple | 28 | 523 | 530 |
528 | icloud | 62 | 2016-02-07 | apple | 28 | 524 | 531 |
529 | icloud | 66 | 2016-02-14 | apple | 28 | 525 | 532 |
530 | icloud | 66 | 2016-02-21 | apple | 28 | 526 | 533 |
Looks like there is some bug with line 525 as well.
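I think you should use parentheses () around the arithmetic when you use : because : binds more tightly than + and - in R, so i-4:i+3 is evaluated as i - (4:i) + 3 and i:i+3 as (i:i) + 3, e.g.,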

    for (i in 1:nrow(summary2)) {
      if (summary2$date[i] < as.Date('2016-01-31')) {
        # first dates: only the current row and the 3 following rows
        summary2$median[i] = median(summary2$hits[i:(i+3)])
      } else {
        # rows i-4 through i+3
        summary2$median[i] = median(summary2$hits[(i-4):(i+3)])
        summary2$test[i] = i-4
        summary2$test2[i] = i+3
      }
    }
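
To see why the original indexing goes wrong, here is a small sketch that can be run on its own. It assumes only the `hits` values of rows 1 to 6 from the table in the question; the variable names (`idx_if`, `idx_else`, `idx_fixed`) are made up for illustration.

```r
# hits for rows 1-6 of the table in the question
hits <- c(32, 30, 29, 30, 28, 29)

# if branch at row 1: `:` is evaluated before `+`
i <- 1
idx_if <- i:i+3              # parsed as (i:i) + 3, a single index
print(idx_if)                # 4
print(median(hits[idx_if]))  # 30, the median shown for row 1

# else branch at row 5: `:` is evaluated before `-` and `+`
i <- 5
idx_else <- i-4:i+3            # parsed as i - (4:i) + 3
print(idx_else)                # 4 3
print(median(hits[idx_else]))  # 29.5, the median shown for row 5

# with parentheses the arithmetic happens first
idx_fixed <- (i-4):(i+3)
print(idx_fixed)               # 1 2 3 4 5 6 7 8
```

The unparenthesised results match the `median` column in the question (30.0 for row 1, 29.5 for row 5), which suggests the precedence of `:` is what produces those values, while `test` and `test2` look right because they do not involve `:` at all.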