
Naive Bayes: Heterogeneous CPDs for observation variables


I am using a naive Bayes model for binary classification with a combination of discrete and continuous variables. My question is: can I use different conditional probability distribution (CPD) functions for the continuous and discrete observation variables? For example, could I use a Gaussian CPD for the continuous variables and some deterministic CPD for the discrete variables?

Thank you


Solution

  • Yes, it is normal to mix continuous and discrete variables within the same model. Consider the following example.

    Suppose I have two random variables:

    • T - the temperature today
    • D - the day of the week

    Note T is continuous and D is discrete. Suppose I want to predict whether John will go to the beach, represented by the binary variable B. Then I could set up my inference as follows, assuming T and D are conditionally independent given B.

               p(T|B) • p(D|B) • p(B)
    p(B|T,D) = ━━━━━━━━━━━━━━━━━━━━━━ ∝ p(T|B) • p(D|B) • p(B)
                       p(T,D)
    

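    Here p(T|B) could be a Gaussian distribution, p(D|B) could be a discrete (categorical) distribution over the days of the week, and p(B) could be a discrete prior on how often John goes to the beach. The denominator does not depend on B, so it only normalizes the posterior and can be ignored when comparing the two values of B.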
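
The beach example can be sketched in a few lines of Python. This is only a minimal illustration, not a reference implementation: every number below (the prior, the Gaussian means and standard deviations, the per-day probabilities) is a made-up assumption, and the names `PRIOR`, `TEMP_PARAMS`, `DAY_PROBS`, `gaussian_pdf` and `posterior_beach` are hypothetical. In practice the parameters would be estimated from data.

```python
import math

# All parameters are hypothetical, chosen only to illustrate the idea.
PRIOR = {True: 0.3, False: 0.7}  # p(B): how often John goes to the beach

# p(T|B): one Gaussian per class, stored as (mean, std) in degrees Celsius.
TEMP_PARAMS = {True: (28.0, 3.0), False: (18.0, 5.0)}

# p(D|B): one categorical distribution per class over the days of the week.
DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
DAY_PROBS = {
    True: dict(zip(DAYS, [0.05, 0.05, 0.05, 0.05, 0.10, 0.35, 0.35])),
    False: dict(zip(DAYS, [0.16, 0.16, 0.16, 0.16, 0.16, 0.10, 0.10])),
}


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and std at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def posterior_beach(temp, day):
    """Return p(B=True | T=temp, D=day) under the naive Bayes assumption."""
    joint = {}
    for b in (True, False):
        mean, std = TEMP_PARAMS[b]
        # Unnormalized posterior: p(T|B) * p(D|B) * p(B)
        joint[b] = gaussian_pdf(temp, mean, std) * DAY_PROBS[b][day] * PRIOR[b]
    # Summing over both values of B gives the normalizer p(T, D).
    evidence = joint[True] + joint[False]
    return joint[True] / evidence


if __name__ == "__main__":
    print(posterior_beach(29.0, "Sat"))  # hot Saturday
    print(posterior_beach(15.0, "Tue"))  # cool Tuesday
```

With these assumed parameters, a hot Saturday should give a posterior well above 0.5 and a cool Tuesday one close to 0. With many observation variables it is usually safer to sum log-probabilities instead of multiplying raw densities, to avoid numerical underflow.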