
Multivariate Linear Regression prediction in Matlab


I am trying to predict the energy output (y) based on two predictors (X). I have a total of 7034 samples (Xtot and ytot), corresponding to nearly 73 days of records.

I selected a one-week period within the data.

Then I used fitlm to create the MLR model.

Finally, I ran the prediction.

Is this right? Is this the correct way to obtain a 48-steps-ahead prediction?

Thank you!

Xtot = dadosPVPREV(2:3,:); % predictors
ytot = dadosPVPREV(1,:);   % variable to be predicted
Xtot = Xtot';
ytot = ytot';
X = Xtot(1:720,:); % period under consideration - predictors
y = ytot(1:720,:); % period under consideration - variable to be predicted
lmModel = fitlm(X, y, 'linear', 'RobustOpts', 'on'); % MLR fit
Xnew = Xtot(721:768,:); % predictors for the next 48 steps (no overlap with the training period)
ypred = predict(lmModel, Xnew); % predicted values of y
yreal = ytot(721:768); % real values of the variable to be predicted
RMSE = sqrt(mean((yreal-ypred).^2)); % error between the predicted and real values
figure; plot(ypred); hold on; plot(yreal)

Solution

  • I see that over the past few days you have been struggling to train a prediction model. The following is an example of training such a model using linear regression. In this example, the values of a few previous steps are used as predictors, and the model forecasts 3 steps ahead (set by to_y below). The Mackey-Glass series is used as the data set to train the model.

    close all; clc; clear variables;
    load mgdata.dat; % importing Mackey-Glass dataset
    T = mgdata(:, 1); % time steps
    X1 = mgdata(:, 2); % 1st predictor
    X2 = flipud(mgdata(:, 2)); % 2nd predictor
    Y = ((sin(X1).^2).*(cos(X2).^2)).^.5; % response
    
    to_x = [-21 -13 -8 -5 -3 -2 -1 0]; % time offsets in the past, used for predictors
    to_y = +3; % time offset in the future, used for response
    
    T_trn = ((max(-to_x)+1):700)'; % time slice used to train model
    i_x_trn = bsxfun(@plus, T_trn, to_x); % indices of steps used to construct train data
    X_trn = [X1(i_x_trn) X2(i_x_trn)]; % train data set
    Y_trn = Y(T_trn+to_y); % train responses
    
    T_tst = (701:(max(T)-to_y))'; % time slice used to test model
    i_x_tst = bsxfun(@plus, T_tst, to_x); % indices of steps used to construct test data
    X_tst = [X1(i_x_tst) X2(i_x_tst)]; % test data set
    Y_tst = Y(T_tst+to_y); % test responses
    
    mdl = fitlm(X_trn, Y_trn) % training model
    
    Y2_trn = feval(mdl, X_trn); % evaluating train responses
    Y2_tst = feval(mdl, X_tst); % evaluating test responses 
    e_trn = mse(Y_trn, Y2_trn) % train error
    e_tst = mse(Y_tst, Y2_tst) % test error
    

    Also, applying a data transformation to generate new features can reduce the prediction error of some models:

    featGen = @(x) [x x.^2 sin(x) exp(x) log(x)]; % feature generator
    mdl = fitlm(featGen(X_trn), Y_trn)
    
    Y2_trn = feval(mdl, featGen(X_trn)); % evaluating train responses
    Y2_tst = feval(mdl, featGen(X_tst)); % evaluating test responses 
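
    To apply the same idea to your own series and get a true 48-steps-ahead forecast, shift the response 48 steps into the future instead of 3. A minimal sketch, assuming dadosPVPREV is laid out as in your question (row 1 = energy output, rows 2-3 = predictors); the lag offsets in to_x are an arbitrary choice you should tune:

    X1 = dadosPVPREV(2,:)'; % 1st predictor
    X2 = dadosPVPREV(3,:)'; % 2nd predictor
    Y  = dadosPVPREV(1,:)'; % response (energy output)

    to_x = [-3 -2 -1 0]; % past time offsets used as predictors (tune these)
    to_y = +48;          % forecast horizon: 48 steps ahead

    T_trn = ((max(-to_x)+1):720)';      % training time slice
    i_trn = bsxfun(@plus, T_trn, to_x); % lagged predictor indices
    X_trn = [X1(i_trn) X2(i_trn)];      % training predictors
    Y_trn = Y(T_trn + to_y);            % training responses, 48 steps ahead

    T_tst = (721:(numel(Y)-to_y))';     % test time slice
    i_tst = bsxfun(@plus, T_tst, to_x); % lagged predictor indices
    X_tst = [X1(i_tst) X2(i_tst)];      % test predictors
    Y_tst = Y(T_tst + to_y);            % test responses

    mdl = fitlm(X_trn, Y_trn); % MLR fit on lagged features
    Ypred = predict(mdl, X_tst);
    RMSE = sqrt(mean((Y_tst - Ypred).^2)) % out-of-sample error

    Note the difference from your original snippet: here the model only ever uses predictor values at or before time t to forecast y at t+48, so no future predictor values are needed at prediction time.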
    
