How to change probability table in object of class bn.fit (bn.fit.dnet) from bnlearn library?

I try to modify object of class bn.fit (bn.fit.dnet) from R's bnlearn library. I need

to set equal probabilities for every row in bn.fit$node$prob table. For this I use next code:

library(bnlearn)
library(purrr)

data(insurance)

bn <- tabu(insurance, score = "bic")
bn_fit <- bn.fit(bn, insurance, method = 'bayes')

bn_fit[1:length(bn_fit)] <- modify(bn_fit[1:length(bn_fit)], function(node) {
  node$prob <- modify(node$prob, ~(1 / NROW(node$prob)))
  node
})

I suppose this approach bit ugly and almost sure there exist more elegant way to do this. I can not remove 1:length(bn_fit). Also I don't know why I can not use NROW(.x) instead of NROW(node$prob) in my code.

To set arbitrary distribution on EVERY column in bn.fit$node$prob table. I don't understand how to avoid for loops in this case.

Related question is here

Solution

Regarding (1), modify takes a list or an atomic vector. bn_fit is of class bn.fit, bn.fit.dnet, however, under the hood it is a list too, since calling typeof() yields list. My guess is that there is no S3 generic method for subsetting these classes so R figures out that it is essentially a list and accordingly strips the class arguments. So subsetting bn_fit turns it into class list, and therefore you can use modify on it. Subsetting can even be done with empty brackets [], it will just return the object, but this time as class list. An alternative that I use below is to "manually" set the class attribute to NULL via attr(bnfit, "class") <- NULL.

Regarding (2), I wrote a tidyverse based function which can be used to alter the prob table of each node into a bayesm::rdirichlet distribution (see code below). The user still needs to provide part of the alpha argument (the length argument is given by the length of each prob table). Under the hood the function is relying on purrr::modify. It takes care of the classes by stripping them first and adding them back once the modification is done. My approach is to turn the prob tables into data.frames then modify the Freq column and adjust it for existing other variables (groups) and then translate that data.frame back into a table using xtabs and formulation notation via reformulate.

I'm not so deep into bayesian networks, so I do not know to what extent this function can be generalized, or whether it only works on the dataset you provided. Further, please test if the modified object is accepted by functions expecting class bn.fit, bn.fit.dnet.

I tried to comment each step of my code, but please ask if something is not clear.

(3) Regarding your question why NROW(.x) does not work in your code and you have to use NROW(node$prob) instead: This has to do with the way modify loops over the prob tables. A good way to check over what elements modify is looping over is to use purrr::pluck.

library(bnlearn)
library(tidyverse)

data(insurance)

bn <- tabu(insurance, score = "bic")
bn_fit <- bn.fit(bn, insurance, method = 'bayes')

change_bn_prob_table <- function(bnfit, alpha) {
  
  # save class attribute of bnfit object
  old_class <- attr(bnfit, "class")
  
  # strip class so that `modify` can be used
  attr(bnfit, "class") <- NULL
  
  # loop over `prop` tables of each node
  new <- purrr::modify(bnfit, function(x) {
    
    # save attributes of x 
    old_x_attr <- attributes(x)
    
    # save attributes of x[["prob"]]
    old_xprob_attr <- attributes(x[["prob"]])
    
    # turn `table` into data.frame
    inp <- as.data.frame(x[["prob"]]) 
    # save names apart from `Freq`
    cnames <- inp %>% select(-Freq) %>% colnames
    
    out <- inp %>% 
      # overwrite column `Freq` with probabilities from bayesm::rdirichlet
      # alpha needs to be supplied (the length of alpha is given by `nrow`)
      mutate(Freq := bayesm::rdirichlet(c(rep(alpha, nrow(inp))))) %>% 
      # devide probilities by sum of Freq in all remaining groups
      group_by(!!! syms(cnames[-1])) %>% 
      mutate(Freq := Freq/sum(Freq)) %>% 
      # turn data.frame back into prob table using formula notation via reformulate
      xtabs(reformulate(paste(colnames(.)), "Freq"), .)
    
    # strip `call` attribute from newly generated prob table
    attr(out, "call") <- NULL  
    # add `class` `table` as attribute
    attr(out, "class") <- "table"
    
    # restore old attribues and write x out to x$prob
    attributes(out) <- old_xprob_attr
    x[["prob"]] <- out
    
    # restore old attribues and return x
    attributes(x) <- old_x_attr
    x
    
  })
  
  # add saved class attributes 
  attr(new, "class") <- old_class
  new
  
}
# here `2` is the first part of `alpha` of `bayesm::rdirichlet`
bn_fit2 <- change_bn_prob_table(bn_fit, 2)

# test that `logLik` can be used on new modified bnfit object 
logLik(bn_fit2, insurance)
#> [1] -717691.8

^{Created on 2020-06-21 by the reprex package (v0.3.0)}