Here is the code
function [theta] = LR(D)
% D is the data having feature variables and class labels
% Now decompose D into X and C
%Note that dimensions of X = , C =
C = D(:,1);
C = C';
size(C)
X = D(:,2:size(D,2));
size(X)
alpha = .00001;
theta_old = zeros(1,34);
theta_new = .001.*ones(1,34);
count = 1;
for count = 1:100000
    theta_old = theta_new;
    theta_new = theta_new + alpha*(C-sigmoid(X*theta_new')')*X;
    llr = sum(LLR((X*theta_new').*(C')))
end
theta = theta_new % assign the declared output variable
end
function a = LLR( z )
a= 1.*log(1.0 + exp(-z));
end
function a = sigmoid(z)
a = 1.0 ./ (1.0 + exp(-z));
end
The problem I have is that the log-likelihood ratio first decreases and then starts increasing. Is this a problem with the gradient descent algorithm or with the code?
It looks like there could be a problem with your objective function.
If the labels (C) are in {0,1}, then you should be using the loss C.*LLR(X*theta')+(1-C).*(LLR(X*theta')+X*theta').
If your labels are in {-1,1}, then the loss should be LLR(C.*X*theta').
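For reference, here is a short derivation of why the two forms agree. The symbols z (one entry of X*theta'), c (a {0,1} label), y (the matching {-1,1} label) and \sigma (the sigmoid) are notation introduced here for illustration; LLR is the function from the question.

```latex
\begin{align*}
\mathrm{LLR}(z) &= \log\bigl(1 + e^{-z}\bigr), \qquad
\sigma(z) = \frac{1}{1 + e^{-z}} \\
-\log \sigma(z) &= \log\bigl(1 + e^{-z}\bigr) = \mathrm{LLR}(z) \\
-\log\bigl(1 - \sigma(z)\bigr) &= \log\bigl(1 + e^{z}\bigr)
  = z + \log\bigl(1 + e^{-z}\bigr) = \mathrm{LLR}(z) + z \\
-\bigl[c \log \sigma(z) + (1 - c)\log\bigl(1 - \sigma(z)\bigr)\bigr]
  &= c\,\mathrm{LLR}(z) + (1 - c)\bigl(\mathrm{LLR}(z) + z\bigr) \\
\text{with } y = 2c - 1:\quad
\mathrm{LLR}(y z) &=
  \begin{cases}
    \mathrm{LLR}(z), & c = 1 \\
    \mathrm{LLR}(-z) = \mathrm{LLR}(z) + z, & c = 0
  \end{cases}
\end{align*}
```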
You seem to be using only the first part of the first type of loss function.
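As a rough check, here is a minimal sketch that tracks the full {0,1} loss during the same update rule as in the question. The toy data, the step size, the iteration count, and the names x, C01, Cpm, loss01, losspm and history are all assumptions made for illustration; they are not from the original post. It is written as a script with anonymous functions so that it should run in both MATLAB and Octave.

```matlab
% Sketch only: the data and every name except LLR/sigmoid are made up.
LLR     = @(z) log(1.0 + exp(-z));       % same definition as in the question
sigmoid = @(z) 1.0 ./ (1.0 + exp(-z));

% Small, deterministic, non-separable toy problem (bias column + one feature).
x   = (-3:0.5:3)';
X   = [ones(numel(x), 1), x];
C01 = [0 0 0 1 0 0 1 0 1 1 0 1 1]';      % labels in {0,1}, stored as a column
Cpm = 2*C01 - 1;                         % the same labels mapped to {-1,1}

% The two loss forms from the answer, summed over all samples.
loss01 = @(theta) sum(C01.*LLR(X*theta') + (1 - C01).*(LLR(X*theta') + X*theta'));
losspm = @(theta) sum(LLR(Cpm.*(X*theta')));

alpha   = 0.05;                          % step size chosen for this toy data only
theta   = zeros(1, size(X, 2));
nIter   = 200;
history = zeros(nIter, 1);
for k = 1:nIter
    % Same update as in the question, with C kept as a column vector.
    theta = theta + alpha*(C01 - sigmoid(X*theta'))'*X;
    history(k) = loss01(theta);          % monitor the full loss, both terms
end

fprintf('final loss ({0,1} form):  %.6f\n', loss01(theta));
fprintf('final loss ({-1,1} form): %.6f\n', losspm(theta));
```

With a sufficiently small step size the values in history should not increase from one iteration to the next, and the two loss forms should agree up to rounding. If the monitored quantity drops the (1-C) term, it is no longer the function that the update is optimizing, so it can move in either direction.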