My input values are 1, 2, 3, 4, ... and my output values are 1*1, 2*2, 3*3, 4*4, ... My code looks like this:
use Phpml\Regression\LeastSquares;

$reg = new LeastSquares();
$samples = array();
$targets = array();
// Training data: x = 1..99, y = x^2
for ($i = 1; $i < 100; $i++)
{
    $samples[] = [$i];
    $targets[] = $i*$i;
}
$reg->train($samples, $targets);
echo $reg->predict([5])."\n";
echo $reg->predict([10])."\n";
I expect it to output roughly 25 and 100. But I get:
-1183.3333333333
-683.33333333333
I also tried to use SVR instead of LeastSquares but the values are strange too:
2498.23
2498.23
I am new to ML. What am I doing wrong?
As others have pointed out in the comments, LeastSquares fits a linear model to your data (the training examples).
Your data set (target = sample^2) is inherently non-linear. If you picture the best possible line (in the least-squares sense) through a quadratic curve, the line has to be steep to follow the upper part of the curve, so it crosses the y-axis well below zero and ends up with a negative y-intercept.
You've trained your linear model on data up to x=99, y=9801, so the fitted line has a slope of about 100 and a large negative y-intercept of about -1683. Down at x=5 or x=10 the line is still far below zero, which is exactly the large negative output you found.
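To make those numbers concrete, here is a minimal sketch of the closed-form least-squares line fit in plain PHP. It does not use php-ml; `fitLine()` is a hypothetical helper written only for this illustration, and it assumes PHP 7 or later.

```php
<?php
// fitLine() is a hypothetical helper for illustration, not part of php-ml.
// It returns the ordinary least-squares slope and intercept for one feature.
function fitLine(array $xs, array $ys): array
{
    $n = count($xs);
    $meanX = array_sum($xs) / $n;
    $meanY = array_sum($ys) / $n;

    $sxy = 0.0; // sum of (x - meanX) * (y - meanY)
    $sxx = 0.0; // sum of (x - meanX)^2
    foreach ($xs as $k => $x) {
        $sxy += ($x - $meanX) * ($ys[$k] - $meanY);
        $sxx += ($x - $meanX) * ($x - $meanX);
    }

    $slope = $sxy / $sxx;
    $intercept = $meanY - $slope * $meanX;

    return ['slope' => $slope, 'intercept' => $intercept];
}

// Same training data as in the question: x = 1..99, y = x^2.
$xs = range(1, 99);
$ys = array_map(function ($x) { return $x * $x; }, $xs);

$fit = fitLine($xs, $ys);

printf("slope = %.2f, intercept = %.2f\n", $fit['slope'], $fit['intercept']);
printf("x = 5  -> %.2f\n", $fit['intercept'] + $fit['slope'] * 5);
printf("x = 10 -> %.2f\n", $fit['intercept'] + $fit['slope'] * 10);
```

The predictions at x=5 and x=10 come out at about -1183.33 and -683.33, the same values LeastSquares printed in the question, so the library is behaving correctly and the mismatch is in the choice of model.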
If you use support vector regression with a degree-2 polynomial kernel, it will do a good job of capturing the pattern of your data:
<?php
require_once __DIR__ . '/vendor/autoload.php';
use Phpml\Regression\SVR;
use Phpml\SupportVectorMachine\Kernel;
$samples = array();
$targets = array();
// Training data: x = 1..100, y = x^2
for ($i = 1; $i <= 100; $i++)
{
    $samples[] = [$i];
    $targets[] = $i*$i;
}
// Second constructor argument is the polynomial degree
$reg = new SVR(Kernel::POLYNOMIAL, 2);
$reg->train($samples, $targets);
echo $reg->predict([5])."\n";
echo $reg->predict([10])."\n";
?>
Returns:
25.0995
100.098
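One way to see why a degree-2 model suits this data: the target is exactly linear in x^2, so a plain least-squares line fitted against the transformed feature z = x^2 recovers it. The sketch below is plain PHP rather than php-ml, and `fitLine()` is a hypothetical helper written for this illustration. With php-ml the analogous move would presumably be to pass `[$i, $i*$i]` as each sample to LeastSquares, but treat that as an assumption to check against the docs.

```php
<?php
// fitLine() is a hypothetical helper for illustration, not part of php-ml.
// It returns the ordinary least-squares slope and intercept for one feature.
function fitLine(array $zs, array $ys): array
{
    $n = count($zs);
    $meanZ = array_sum($zs) / $n;
    $meanY = array_sum($ys) / $n;

    $szy = 0.0; // sum of (z - meanZ) * (y - meanY)
    $szz = 0.0; // sum of (z - meanZ)^2
    foreach ($zs as $k => $z) {
        $szy += ($z - $meanZ) * ($ys[$k] - $meanY);
        $szz += ($z - $meanZ) * ($z - $meanZ);
    }

    $slope = $szy / $szz;

    return ['slope' => $slope, 'intercept' => $meanY - $slope * $meanZ];
}

// Same training data as in the question: x = 1..99, y = x^2.
$xs = range(1, 99);
$ys = array_map(function ($x) { return $x * $x; }, $xs);

// Feature transform: fit against z = x^2 instead of x.
$zs = array_map(function ($x) { return $x * $x; }, $xs);
$fit = fitLine($zs, $ys);

// Predict by applying the same transform to the new input.
$predict = function ($x) use ($fit) {
    return $fit['intercept'] + $fit['slope'] * ($x * $x);
};

printf("x = 5  -> %.2f\n", $predict(5));
printf("x = 10 -> %.2f\n", $predict(10));
```

The fitted slope is 1 and the intercept is 0, so the predictions land on 25 and 100. The catch is that you had to know in advance that x^2 was the right feature, which is the guesswork a more flexible model is meant to avoid.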
From your response in the comments it's clear that you're looking to apply a neural network so that you don't have to worry about what degree of model to fit to your data. A neural network with a single hidden layer can approximate any continuous function on a bounded input range arbitrarily well, given enough hidden nodes and enough training data.
Unfortunately php-ml doesn't seem to have an MLP (multilayer perceptron - a common type of feed-forward neural network) for regression available out-of-the-box. I'm sure you could build one from appropriate layers, but if your goal is to get up and running with training regression models quickly, it might not be the best approach.