Search code examples
rtime-seriesbayesian

Applying Bayesian Changepoint Detection algorithm using bcp to a grouped data frame in R


I have a grouped dataframe which I would like to apply the bcp function to calculate for each point the posterior probability of there being a change at each point.

My data looks as follows:

# INSTALL PACMAN
if (!require("pacman", character.only = TRUE)) {
  install.packages("pacman")
}

pacman::p_load(bcp,tidyverse)

df <- data.frame(
  date = c(seq(Sys.Date(), by = -1, length.out = 1000), seq(Sys.Date(), by = -1, length.out = 1000)),
  value = c(rnorm(200, mean = 20, sd = 1), rnorm(800, mean = 17, sd = 2), rnorm(400, mean = 200, sd = 3), rnorm(600, mean = 150, sd = 4)),
  product = c(rep("A", 1000), rep("B", 1000))
)

By filtering my df to a single variable and assigning it to a new variable and applying bcp() I see it returns a list of 12

x <- df %>% 
  filter(product == "A")

 y <- bcp(x$value)

I've tried using group_map which returns only two columns which is not ideal, I've no idea why only two columns are returned:

df %>% 
  group_by(product) %>% 
  group_map(~ bcp(.x$value))

I've also tried group_modify but I can't get the syntax correct to parse out the correct fields:

df %>% 
  group_by(product) %>% 
  group_modify(~ {
    bcp::bcp(.x$value) %>% 
      tibble::enframe(name = "name", value = "value")
    })

As well as:

df %>% 
  group_by(product) %>% 
  group_modify(~ bcp::bcp(.x$value) %>% 
      pluck("posterior.prob")) 

Any guidance on how I can append the 'posterior.prob' from the bcp function to my original df on a per group basis would be greatly appreciated.


Solution

  • I'm not familiar with the bcp package but does this give you what you want?

    posterior_prob <- map(df %>% 
      group_by(product) %>% 
      group_map(~ bcp(.x$value)), pluck("posterior.prob")) %>% 
      unlist()
    df$posterior_prob_var <- posterior_prob
    head(df)
    #         date    value product posterior_prob_var
    # 1 2023-05-10 21.90542       A              0.002
    # 2 2023-05-09 19.61293       A              0.000
    # 3 2023-05-08 20.46336       A              0.002
    # 4 2023-05-07 21.22534       A              0.000
    # 5 2023-05-06 19.37578       A              0.000
    # 6 2023-05-05 18.94408       A              0.002