As part of a machine learning class assignment, I am implementing a Naive Bayes classifier without using any external library.
My training data set X has 800 rows, each with 8 features and one binary label. For each class I have calculated a length-8 vector of feature means and a length-8 vector of feature sds, along with the priors for the two classes.
To assess the accuracy of the classifier on the training data set, I want to generate a matrix Y with the same dimensions as X (800 rows, 8 columns) in which each element y_ij is given as
y_ij = dnorm(x_ij, mean = mean_j, sd = sd_j, log = TRUE)
I have tried sweep, apply, and lapply without success. I am stuck, and unfortunately the problem is my unfamiliarity with R rather than my understanding of the algorithm. Help is greatly appreciated.
There's probably a better data setup for this, but if you already have X and two vectors of means and sds, xmean and xsd, you can use sapply. Here's a reproducible example:
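X <- matrix(rnorm(30), 10, 3)   # toy data: 10 rows, 3 features
xmean <- colMeans(X)            # per-column means
xsd <- apply(X, 2, sd)          # per-column standard deviations
# Column j of the result holds the log densities of X[, j] under N(xmean[j], xsd[j])
sapply(seq_len(ncol(X)), function(j) dnorm(X[, j], xmean[j], xsd[j], log = TRUE))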
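Since the question mentions sweep, here is a sketch of two loop-free ways to get the same matrix. It uses the same toy setup as above; set.seed and the names Y_sapply, Y_sweep, Y_rep and Z are only for illustration. The sweep version relies on the identity log N(x | m, s) = log phi((x - m) / s) - log(s), and the rep version relies on R storing matrices column by column.

```r
set.seed(1)                      # only for reproducibility of the toy data
X <- matrix(rnorm(30), 10, 3)
xmean <- colMeans(X)
xsd <- apply(X, 2, sd)

# Reference: column-by-column with sapply
Y_sapply <- sapply(seq_len(ncol(X)),
                   function(j) dnorm(X[, j], xmean[j], xsd[j], log = TRUE))

# sweep: standardize each column, take the standard normal log density,
# then subtract log(sd) column-wise
Z <- sweep(sweep(X, 2, xmean, "-"), 2, xsd, "/")
Y_sweep <- sweep(dnorm(Z, log = TRUE), 2, log(xsd), "-")

# rep: repeat each parameter once per row so it lines up with
# the column-major order of X
Y_rep <- matrix(dnorm(X,
                      mean = rep(xmean, each = nrow(X)),
                      sd   = rep(xsd,   each = nrow(X)),
                      log  = TRUE),
                nrow = nrow(X))

all.equal(Y_sapply, Y_sweep)
all.equal(Y_sapply, Y_rep)
```

For an 800 x 8 matrix any of the three should be fast enough, so the choice is mostly about which one reads most clearly to you.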
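To go from the log-density matrix to training accuracy, one possible sketch is below. It is not the asker's code: the toy data, set.seed, and the names label, classes, log_post, pred and accuracy are all assumptions made up for illustration. For each class it sums the per-feature log densities across columns, adds the log prior, and then picks the class with the larger score.

```r
set.seed(42)                     # only for reproducibility of the toy data
# Hypothetical toy data: 200 rows, 2 features, binary label
n <- 200
label <- rep(c(0, 1), each = n / 2)
X <- cbind(rnorm(n, mean = ifelse(label == 1, 3, 0)),
           rnorm(n, mean = ifelse(label == 1, -2, 0)))

classes <- sort(unique(label))

# One column per class: log prior + sum over features of the log densities
log_post <- sapply(classes, function(k) {
  Xk <- X[label == k, , drop = FALSE]
  m <- colMeans(Xk)              # per-feature means for class k
  s <- apply(Xk, 2, sd)          # per-feature sds for class k
  prior <- mean(label == k)      # class prior estimated from the labels
  ll <- sapply(seq_len(ncol(X)),
               function(j) dnorm(X[, j], m[j], s[j], log = TRUE))
  log(prior) + rowSums(ll)
})

# Predict the class with the highest score in each row
pred <- classes[max.col(log_post, ties.method = "first")]
accuracy <- mean(pred == label)
accuracy
```

Working with sums of log densities rather than products of densities avoids numerical underflow, which is presumably why the question asks for log = TRUE in the first place.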