I have about 1000 measurements taken with a device; let's call these measurements y. For each of them, I know what the actual value should be; let's call these z. How can I calibrate, adjust, or scale y for a better estimate? I was thinking of solving either of the following systems of equations (linear/nonlinear) for alpha, beta, and gamma:
or
Could someone give me some advice and let me know if I am doing this correctly?
First you need to know that a measurement device makes two kinds of errors: accidental (random) and systematic.
The accidental errors are due to a number of perturbing factors that interact in complex ways, and they result in non-repeatability (measuring the same value twice gives two different readings). To reduce the accidental errors, you can repeat the measurement and average the readings.
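To illustrate the effect of averaging, here is a minimal simulation sketch. The true value, the noise level, and the number of repeats are all made-up numbers, and the accidental error is assumed to be Gaussian and independent between readings, which a real device may not satisfy.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable

true_value = 10.0   # hypothetical quantity being measured
noise_sd = 0.5      # assumed standard deviation of the accidental error
n_repeats = 25      # readings averaged per measurement

# 10,000 single readings versus 10,000 averages of n_repeats readings each
single = true_value + rng.normal(0.0, noise_sd, size=10_000)
repeated = true_value + rng.normal(0.0, noise_sd, size=(10_000, n_repeats))
averaged = repeated.mean(axis=1)

spread_single = single.std()
spread_averaged = averaged.std()
print(spread_single, spread_averaged)
```

Under these assumptions the spread of the averaged readings shrinks roughly as 1/sqrt(n_repeats). Averaging does nothing for a systematic error, which shifts every reading in the same way.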
The systematic errors are permanent and stable. They are due to the relation z = y being wrong or approximate, and will repeat identically for the same measurement. The true relation can be of the form y = z + c with c != 0 (offset error), y = c.z with c != 1 (gain error), y = c1.z + c2 (both), or nonlinear, like y = c1.z² + c2.z + c3, y = (c1.z + c2) / (c3.z + c4), y = ln(exp(z)+1), or any other.
In some cases, you have reasons to know the functional form of the relation (for instance a metallic ruler gets a wrong "gain" when the temperature changes); in other cases you don't, and you can use an empirical model such as a polynomial (quite often, the relation is smooth and remains close to y = z).
Usually, looking at a plot of the (z, y) points will give you a hint of how large the accidental errors are and of the possible shape of the functional relation.
A simple approach is to try a least-squares fitting of a polynomial model (say second or third degree). Then when you have found the coefficients, you can look at the relative magnitudes of the polynomial terms (powers) over the working range. This will tell you if all terms are relevant. I advise you to discard the terms that do not significantly decrease the fitting error and keep a simple model.
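As a rough sketch of that procedure, the snippet below fits polynomials of degree 1, 2 and 3 with NumPy and compares the RMS fitting error and the size of each term over the working range. The data are synthetic stand-ins (an assumed offset of 0.3, gain of 1.05, and Gaussian noise), not the asker's measurements; with real data you would load your own z and y arrays instead.

```python
import numpy as np
from numpy.polynomial import polynomial as P

rng = np.random.default_rng(1)

# Synthetic stand-in for the ~1000 (z, y) pairs: offset + gain error plus noise.
z = np.linspace(0.0, 10.0, 1000)
y = 0.3 + 1.05 * z + rng.normal(0.0, 0.1, size=z.size)

rms = {}
for degree in (1, 2, 3):
    coeffs = P.polyfit(z, y, degree)        # coefficients, lowest power first
    residuals = y - P.polyval(z, coeffs)
    rms[degree] = np.sqrt(np.mean(residuals**2))
    # Largest magnitude each power can reach over the working range
    term_size = np.abs(coeffs) * np.max(np.abs(z)) ** np.arange(degree + 1)
    print(degree, rms[degree], term_size)
```

Here the data were generated from a straight line, so the quadratic and cubic terms barely lower the RMS error and their contributions stay small compared with the linear term. That is the sign that they can be dropped.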
Consider the case of the plot below, chosen randomly from the web.
At first sight the relation looks linear, with no offset error (as the relation includes the point (0, 0)), and a few irregularities, which we can attribute to accidental errors. For this device, the straight-line model y = c.z should be appropriate, and adding nonlinear terms would be useless or misleading.
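For a model of the form y = c.z, the least-squares value of c has a closed form, c = sum(z.y) / sum(z²), and the calibrated estimate of the true value is then y / c. A minimal sketch, again on synthetic data with an assumed 3% gain error and Gaussian noise:

```python
import numpy as np

rng = np.random.default_rng(2)

z = np.linspace(0.0, 100.0, 1000)                  # known reference values
y = 0.97 * z + rng.normal(0.0, 0.5, size=z.size)   # readings, assumed 3% gain error

# Least-squares slope of a line forced through the origin
c = np.sum(z * y) / np.sum(z * z)

# Invert the model to correct the readings
z_corrected = y / c

rms_before = np.sqrt(np.mean((y - z) ** 2))
rms_after = np.sqrt(np.mean((z_corrected - z) ** 2))
print(c, rms_before, rms_after)
```

After the correction, what remains is essentially the accidental error, which no calibration curve can remove.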