I need to do multiple linear regression efficiently. I am trying to use the Math.NET Numerics package but it seems slow - perhaps it is the way I have coded it? For this example I have only simple (1 x value) regression.
I have this snippet:
public class barData
{
public double[] Xs;
public double Mid;
public double Value;
}
public List<barData> B;
var xdata = B.Select(x=>x.Xs[0]).ToArray();
var ydata = B.Select(x => x.Mid).ToArray();
var X = DenseMatrix.CreateFromColumns(new[] { new DenseVector(xdata.Length, 1), new DenseVector(xdata) });
var y = new DenseVector(ydata);
var p = X.QR().Solve(y);
var b = p[0];
var a = p[1];
B[0].Value = (a * (B[0].Xs[0])) + b;
This runs about 20x SLOWER than this pure C#:
double xAvg = 0;
double yAvg = 0;
int n = -1;
for (int x = Length - 1; x >= 0; x--)
{
n++;
xAvg += B[x].Xs[0];
yAvg += B[x].Mid;
}
xAvg = xAvg / B.Count;
yAvg = yAvg / B.Count;
double v1 = 0;
double v2 = 0;
n = -1;
for (int x = Length - 1; x >= 0; x--)
{
n++;
v1 += (B[x].Xs[0] - xAvg) * (B[x].Mid - yAvg);
v2 += (B[x].Xs[0] - xAvg) * (B[x].Xs[0] - xAvg);
}
double a = v1 / v2;
double b = yAvg - a * xAvg;
B[0].Value = (a * B[Length - 1].Xs[0]) + b;
Also, if Math.NET is the issue and anyone knows a simple way to alter my pure C# code to handle multiple Xs, I would be grateful for some help.
Using a QR decomposition is a very generic approach that can deliver least-squares solutions for any function that is linear in its parameters, no matter how complicated it is. It is therefore not surprising that it cannot compete on computation time with a very specific, direct implementation, especially in the simple case of y: x -> a + b*x. Unfortunately, Math.NET Numerics does not yet provide direct regression routines that you could use instead.
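To make the comparison concrete, here is a sketch of the two problems being solved. The symbols are written in this answer's notation (a is the intercept, b the slope; note that the question's code uses the names the other way round), and X, beta, Q and R are introduced here only for illustration. The simple case has a closed-form solution that needs just two passes over the data, which is what the hand-written loop computes, while the generic route factorizes the whole design matrix first:

```latex
\begin{align}
\min_{a,b}\ &\sum_{i=1}^{n} \bigl(y_i - a - b\,x_i\bigr)^2 \\
b &= \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad a = \bar{y} - b\,\bar{x} \\
\min_{\beta}\ &\lVert X\beta - y \rVert_2^2,
\qquad X = QR \ \Rightarrow\ R\,\beta = Q^{\mathsf{T}} y
\end{align}
```

In the last line, X has one row per observation and one column per parameter (a column of ones for the intercept), and R is upper triangular, so beta follows by back substitution once the factorization is done.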
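However, there are still a couple of things you can try for better speed:
- Use a thin instead of a full QR decomposition, i.e. pass QRMethod.Thin to the QR method.
- Tweak threading, e.g. disable multi-threading completely (Control.ConfigureSingleThread()) or tweak its parameters.

If the data set is very large, there are also more efficient ways to build the matrix, but that is likely not very relevant compared to the cost of the QR itself (run a performance analysis to confirm).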
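As for extending the pure C# code to multiple Xs, one possible direction is to solve the normal equations directly instead of going through a QR decomposition. The following is only a minimal sketch under stated assumptions: the class and method names (NormalEquationsRegression, Fit) are hypothetical and not part of Math.NET Numerics, an intercept is always included, and the pivot tolerance is an arbitrary choice. The normal equations are known to be less numerically robust than QR when the regressors are nearly collinear, so treat this as a speed-oriented alternative to measure and validate against your data, not as a drop-in replacement.

```csharp
using System;

// Hypothetical helper, not part of Math.NET Numerics.
public static class NormalEquationsRegression
{
    // Fits y ~ c[0] + c[1]*x1 + ... + c[k]*xk by solving (A^T A) c = A^T y,
    // where each row of A is [1, x1, ..., xk].
    public static double[] Fit(double[][] xs, double[] y)
    {
        int n = y.Length;          // number of observations
        int m = xs[0].Length + 1;  // number of coefficients (intercept + k slopes)

        // Augmented system [A^T A | A^T y], accumulated in one pass over the data.
        var s = new double[m, m + 1];
        var row = new double[m];
        for (int i = 0; i < n; i++)
        {
            row[0] = 1.0;
            for (int j = 1; j < m; j++) row[j] = xs[i][j - 1];
            for (int r = 0; r < m; r++)
            {
                for (int c = 0; c < m; c++) s[r, c] += row[r] * row[c];
                s[r, m] += row[r] * y[i];
            }
        }

        // Gaussian elimination with partial pivoting.
        for (int col = 0; col < m; col++)
        {
            int pivot = col;
            for (int r = col + 1; r < m; r++)
                if (Math.Abs(s[r, col]) > Math.Abs(s[pivot, col])) pivot = r;

            // Crude absolute tolerance (an assumption); scale it to your data if needed.
            if (Math.Abs(s[pivot, col]) < 1e-12)
                throw new InvalidOperationException("Singular system: regressors are linearly dependent.");

            if (pivot != col)
            {
                for (int c = 0; c <= m; c++)
                {
                    double t = s[col, c];
                    s[col, c] = s[pivot, c];
                    s[pivot, c] = t;
                }
            }

            for (int r = col + 1; r < m; r++)
            {
                double f = s[r, col] / s[col, col];
                for (int c = col; c <= m; c++) s[r, c] -= f * s[col, c];
            }
        }

        // Back substitution on the upper-triangular system.
        var coef = new double[m];
        for (int r = m - 1; r >= 0; r--)
        {
            double sum = s[r, m];
            for (int c = r + 1; c < m; c++) sum -= s[r, c] * coef[c];
            coef[r] = sum / s[r, r];
        }
        return coef;
    }
}
```

With the barData list from the question, a call could look like NormalEquationsRegression.Fit(B.Select(x => x.Xs).ToArray(), B.Select(x => x.Mid).ToArray()), where element 0 of the result is the intercept and the remaining elements are the slopes in the order of Xs. The work is one pass over the data plus a small m-by-m solve, which is why it tends to be cheap when there are few regressors.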