I have an assignment in which I need to detect anomalies in a dataset. I'm using the 'anomalize' package in R and was wondering how to interpret the following output values of the 'anomalize' function:
Remainder_L1 Remainder_L2
I've checked the documentation but I'm unable to find the calculation method for these values. Can someone explain this calculation?
The anomalize documentation gives a great example of how to apply anomalize() to a time series. It generates the remainder_l1 and remainder_l2 values for CRAN tidyverse downloads. That data comes with the anomalize package, so there is no need to import anything; just run the code below to see how it generates the columns:
# install.packages("anomalize")
library(tidyverse)
library(tibbletime)
library(anomalize)
tidyverse_cran_downloads %>%
time_decompose(count, merge = TRUE) %>%
anomalize(remainder)
# package date count observed season trend remainder remainder_l1 remainder_l2 anomaly
# <chr> <date> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr>
# 1 broom 2017-01-01 1053 1053. -1007. 1708. 352. -1725. 1704. No
# 2 broom 2017-01-02 1481 1481 340. 1731. -589. -1725. 1704. No
# 3 broom 2017-01-03 1851 1851 563. 1753. -465. -1725. 1704. No
# 4 broom 2017-01-04 1947 1947 526. 1775. -354. -1725. 1704. No
# 5 broom 2017-01-05 1927 1927 430. 1798. -301. -1725. 1704. No
What do these values mean? From the anomalize source code we see:
"remainder_l1" (lower limit for anomalies), "remainder_l2" (upper limit for anomalies)
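In the example above, anomalize() is applied to the remainder column, so the limits are compared against the remainder, not the raw count. For the first row, the remainder (352) would be flagged as an anomaly if it were less than -1725 or greater than 1704. It falls between the two limits, so the anomaly column says "No".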
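As for how the limits are calculated: the anomalize documentation describes the default method ("iqr") as expanding the 25th and 75th percentiles of the remainder by an IQR factor of 0.15 / alpha, which is 3 with the default alpha = 0.05. The following is a minimal base-R sketch of that rule, not the package's actual code. The helper name iqr_limits is made up for illustration, and I have not checked whether max_anoms alters the limits the package reports.

```r
# Hypothetical helper (not part of anomalize): a sketch of the IQR rule
# described in the anomalize documentation, using base R only.
iqr_limits <- function(x, alpha = 0.05) {
  # 25th and 75th percentiles of the values (here: the remainder)
  q <- stats::quantile(x, probs = c(0.25, 0.75), na.rm = TRUE, names = FALSE)
  # IQR factor: 0.15 / alpha, i.e. 3 when alpha = 0.05
  iqr_factor <- 0.15 / alpha
  spread <- (q[2] - q[1]) * iqr_factor
  # Lower limit sits below Q1, upper limit sits above Q3
  c(remainder_l1 = q[1] - spread, remainder_l2 = q[2] + spread)
}

# Toy data: quartiles are 3 and 7, so the IQR is 4 and the spread is about 12,
# giving limits of about -9 and 19.
x <- 1:9
limits <- iqr_limits(x)
print(limits)
```

Because the limits are built from the quartiles rather than the mean, they need not be symmetric around zero, which is consistent with the -1725 and 1704 in the output above. anomalize works per group, so to compare against the package you would apply this to the remainder column of one package (for example broom) at a time. A smaller alpha widens the limits and flags fewer points.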