I have conducted a linear SVM on a large dataset, however in order to reduce the number of dimensions I performed a PCA, than conducted the SVM on a subset of the component scores (the first 650 components which explained 99.5% of the variance). Now I want to plot the decision boundary in the original variable space using the beta weights and bias from the SVM created in PCA space. But I can't figure out how to project the bias term from the SVM into the original variable space. I've written a demo using the fisher iris data to illustrate:
clear; clc; close all
% load data
load fisheriris
inds = ~strcmp(species,'setosa');
X = meas(inds,3:4);
Y = species(inds);
mu = mean(X)
% perform the PCA
[eigenvectors, scores] = pca(X);
% train the svm
SVMModel = fitcsvm(scores,Y);
% plot the result
figure(1)
gscatter(scores(:,1),scores(:,2),Y,'rgb','osd')
title('PCA space')
% now plot the decision boundary
betas = SVMModel.Beta;
m = -betas(1)/betas(2); % my gradient
b = -SVMModel.Bias; % my y-intercept
f = @(x) m.*x + b; % my linear equation
hold on
fplot(f,'k')
hold off
axis equal
xlim([-1.5 2.5])
ylim([-2 2])
% inverse transform the PCA
Xhat = scores * eigenvectors';
Xhat = bsxfun(@plus, Xhat, mu);
% plot the result
figure(2)
hold on
gscatter(Xhat(:,1),Xhat(:,2),Y,'rgb','osd')
% and the decision boundary
betaHat = betas' * eigenvectors';
mHat = -betaHat(1)/betaHat(2);
bHat = b * eigenvectors';
bHat = bHat + mu; % I know I have to add mu somewhere...
bHat = bHat/betaHat(2);
bHat = sum(sum(bHat)); % sum to reduce the matrix to a single value
% the correct value of bHat should be 6.3962
f = @(x) mHat.*x + bHat;
fplot(f,'k')
hold off
axis equal
title('Recovered feature space')
xlim([3 7])
ylim([0 4])
Any guidance on how I'm calculating bHat incorrectly would be much appreciated.
Just in case anyone else comes across this problem, the solution is the bias term can be used to find the y-intercept, b = -SVMModel.Bias/betas(2)
. And the y-intercept is just another point in space [0 b]
which can be recovered/unrotated by inverse transforming it through the PCA. This new point can then be used to solve the linear equation y = mx + b (i.e., b = y - mx). So the code should be:
% and the decision boundary
betaHat = betas' * eigenvectors';
mHat = -betaHat(1)/betaHat(2);
yint = b/betas(2); % y-intercept in PCA space
yintHat = [0 b] * eigenvectors'; % recover in original space
yintHat = yintHat + mu;
bHat = yintHat(2) - mHat*yintHat(1); % solve the linear equation
% the correct value of bHat is now 6.3962