This is my first time asking a question.
I'm teaching myself neural networks and am currently trying to program a perceptron algorithm. I want to train it on the OR function, but it isn't working. I have no idea what I am doing wrong, and I can't find any solutions on the internet that don't use a toolbox.
input = [0 0; 0 1; 1 0; 1 1] % input vector
num_in = 4; % number of iterations
desired_out = [0;1;1;1] % desired output
bias = -1; % bias
w = zeros(2,1); % weight vector, initially zero
iterations = 100; % number of iterations to go through
for i = 1:iterations
    out = zeros(4,1);
    for j = 1:num_in % go per row of x
        y = bias+input(j,1)*w(1,1)+input(j,2)*w(2,1) % sum
        if(out(j,1)~=desired_out(j,1)) % modify weights and bias if mismatch exists
            bias = bias+desired_out(j,1);
            w(1,1) = w(1,1)+input(j,1)*desired_out(j,1);
            w(2,1) = w(2,1)+input(j,2)*desired_out(j,1);
        end
    end
end
out % print the output
I don't know which perceptron algorithm you are following, but I think the one on Wikipedia is what you are trying to implement.
It is better to fold the bias into the weight vector: `w` will be 3x1, and you have to append a column of ones to your input features. This lets you implement `wx+b` using matrix multiplication, i.e. in a vectorized fashion.
You never update `out`. You should have added the following line:
out(j,1) = y > 0;
Why do you have the condition `if(out(j,1)~=desired_out(j,1))`? It is not mentioned on Wikipedia. Anyway, if you want to update only on mistakes, then you have to update differently for mistakes made on positive samples and on negative samples.
The update term `input(j,1)*desired_out(j,1)` is wrong. According to Wikipedia, the factor should be `(desired_out(j,1)-out(j,1))`.
The corrected code is as follows:
input = [0 0 1; 0 1 1; 1 0 1; 1 1 1] % input vector, last column of ones is the bias input
num_in = 4; % number of samples
desired_out = [0;1;1;1] % desired output
w = zeros(3,1); % weight vector, initially zero
iterations = 100; % number of iterations to go through
for i = 1:iterations
    out = zeros(4,1);
    for j = 1:num_in % go per row of x
        y = input(j,1)*w(1,1)+input(j,2)*w(2,1)+w(3,1); % sum
        out(j,1) = y>0;
        w(1,1) = w(1,1)+input(j,1)*(desired_out(j,1)-out(j,1));
        w(2,1) = w(2,1)+input(j,2)*(desired_out(j,1)-out(j,1));
        w(3,1) = w(3,1)+input(j,3)*(desired_out(j,1)-out(j,1));
    end
end
out % print the output
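For reference, the update performed inside the inner loop can be written compactly as below. The symbols are my own shorthand and do not appear in the code: $x_j$ stands for row `input(j,:)`, $d_j$ for `desired_out(j,1)`, and $o_j$ for `out(j,1)`, with an implicit learning rate of 1.

```latex
\begin{aligned}
y_j &= w_1 x_{j1} + w_2 x_{j2} + w_3 x_{j3}, \qquad x_{j3} = 1, \\
o_j &= \begin{cases} 1 & \text{if } y_j > 0, \\ 0 & \text{otherwise,} \end{cases} \\
w &\leftarrow w + (d_j - o_j)\, x_j .
\end{aligned}
```

The factor $d_j - o_j$ is $0$ when the sample is classified correctly, $+1$ for a missed positive and $-1$ for a false positive, so the weights only change on mistakes and move in opposite directions for the two kinds of mistake. That is why no explicit `if` is needed around the update.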
The code could be vectorized further by using matrix multiplications instead of `for` loops, but I will leave that up to you.
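As one possible starting point, here is a minimal sketch that replaces the element-by-element sums with a row-times-column product. It keeps the same per-sample update as the corrected code, so it is only partly vectorized. The names `X`, `d`, `n_epochs` and `out_j` are my own choices for illustration, and the final `out` is computed once from the trained weights rather than recorded during the last pass.

```matlab
% Sketch only: same online perceptron update, with dot products
% instead of hand-written sums. Variable names are assumptions.
X = [0 0 1; 0 1 1; 1 0 1; 1 1 1];  % inputs, last column of ones acts as the bias input
d = [0; 1; 1; 1];                  % desired output for OR
w = zeros(3, 1);                   % weights, w(3) plays the role of the bias
n_epochs = 100;                    % number of passes over the data

for epoch = 1:n_epochs
    for j = 1:size(X, 1)
        x = X(j, :);                   % 1x3 row for sample j
        out_j = double(x * w > 0);     % thresholded prediction for sample j
        w = w + (d(j) - out_j) * x';   % no change when the prediction is correct
    end
end

out = double(X * w > 0)            % predictions for all samples in one product
```

For this OR data set the weights stop changing after a few passes, so most of the 100 passes do nothing; a check that no weight changed during a pass could be used to stop early.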