Search code examples
rplotlylinear-discriminant

Making 3D plot of data with LDA boundary


I am playing around with linear discriminant analysis and trying to plot the data in plotly in R and then plot the discriminating plane. When I try this I can get the data and the plane, but the plane does not match with the the 3D plotted data. I get the data "floating" above the plane rather than the plane cutting through the data. Using the coefficients the lda function gives us with parameters SB.lda, OPS.lda, and tSho.lda I try to solve for z under the assumption that the equation is SBx + OPSy + tShoz = 0 because the lda function does not give us a constant.

When I reference this post (https://stats.stackexchange.com/questions/166942/why-are-discriminant-analysis-results-in-r-lda-and-spss-different-constant-t) and use the method to generate the constant I get a constant of -13.82201 which when plotted does not match with the data.

What am I missing?

library(plotly)
library(MASS)
train <- data.frame(read.table("https://raw.githubusercontent.com/gunsnfloyd/algorithmdata/main/train.tsv",header=TRUE))
head(train)
model <- lda(WS~ SB + OPS  + tSho ,data = train)

SB.lda <- 0.008056699
OPS.lda <- 15.174308879
tSho.lda <- 0.205465030

groupmean<-(model$prior%*%model$means)
constant<-(groupmean%*%model$scaling)
-constant


fig <- plot_ly(train, x = ~SB, y = ~OPS, z = ~tSho, color = ~WS, colors = c('#BF382A', '#0C4B8E'))
fig <- fig %>% add_markers()
fig <- fig %>% layout(scene = list(xaxis = list(title = 'SB'),
                                   yaxis = list(title = 'OPS'),
                                   zaxis = list(title = 'tSho')))

x.lda <- seq(0, 400, length.out = 100) #Range based on SB values
y.lda <- seq(0.5, 0.9, length.out = 100) #Range based on OPS values
z.lda <- outer(x.lda, y.lda, function(x, y) ( (-SB.lda * x - OPS.lda * y) / tSho.lda + 13.82201))

fig <- fig %>% add_surface(x = x.lda, y = y.lda, z = z.lda,colors = "#0C4B8E")      
fig

Solution

  • I get a different result from solving the equation:

    # general equation: A * x + B * y + C * z + c = 0
    
    # solve for z:
      SB * x + OPS * y + tSho * z - 13.82201 = 0
      SB * x + OPS * y - 13.82201            = - tSho * z
     -SB * x - OPS * y + 13.82201            = tSho * z
    (-SB * x - OPS * y + 13.82201) / tSho    = z
    
    # calculate values of plane:
    x.lda <- seq(0, 400, length.out = 100) #Range based on SB values
    y.lda <- seq(0.5, 0.9, length.out = 100) #Range based on OPS values
    z.lda <- outer(x.lda, y.lda, function(x, y) (-SB.lda * x - OPS.lda * y + 13.82201) / tSho.lda )
    

    The resulting plot:

    plot