machine-learning regression linear-regression non-linear-regression

Identify weakest feature in classification

A basic machine learning exercise is to perform a regression on some data. For instance, estimate the length of a fish as a function of weight and age.

This is often done by taking a large training data set of (weight, age, length) triples and applying some regression analysis. It is then possible to estimate the length of a new fish from its weight and age.

However, assume I instead wanted to solve this problem: "I have a fish with a known weight W, age A and length L. Assuming I want the length to be M instead of L, how should I adjust W and A?"

This seems like a common problem, but I don't know what it's called. Can someone please point me in the right direction? How do you approach the problem if it's linear, and, more importantly, what if it's non-linear?

Solution

You are looking for a functional dependency:

f: IR -> IR^2,  f(Weight) = (Age, Length)^T

You can do this basically with the same methods as you are using right now. It's only that the targets are two-dimensional, and therefore you need to adjust your loss function.

Simple Euclidean distance in the 2-dimensional target space won't do here anymore, as the two target variables have different magnitudes and different units. So you've got to be creative here -- for example, you can normalize both targets to [0,1] and then feed the normalized values into the Euclidean or L1 distance loss function.
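As a rough illustration of the normalization idea, here is a minimal sketch assuming NumPy. The function name `normalized_loss` and the choice to scale each target column by its observed range (max minus min) are assumptions made for this example, not something the approach prescribes.

```python
import numpy as np

def normalized_loss(targets, predictions, order=2):
    """Loss on range-normalized residuals (hypothetical helper).

    targets, predictions: arrays of shape (n_samples, 2), e.g. columns (Age, Length).
    order=2 gives the summed squared (Euclidean) loss, order=1 the L1 loss.
    """
    targets = np.asarray(targets, dtype=float)
    predictions = np.asarray(predictions, dtype=float)

    # Scale each target column by its observed range so that both columns
    # contribute on a comparable, unit-free scale.
    span = targets.max(axis=0) - targets.min(axis=0)
    span = np.where(span == 0, 1.0, span)  # avoid division by zero for constant columns

    diff = (targets - predictions) / span
    if order == 1:
        return np.abs(diff).sum()
    return (diff ** 2).sum()

# Made-up numbers: the second column is ten times larger than the first,
# but after scaling both residuals count equally.
example_targets = np.array([[1.0, 10.0], [3.0, 30.0]])
example_predictions = np.array([[2.0, 20.0], [3.0, 30.0]])
l2_value = normalized_loss(example_targets, example_predictions, order=2)
l1_value = normalized_loss(example_targets, example_predictions, order=1)
```

Subtracting the minimum is not needed inside the loss, because the offset cancels in the difference between target and prediction; only the scale matters.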

Once you have a suitable loss function, proceed as usual: choose a machine learning method, fit the data, make your predictions.

With respect to choosing a method: this can range from simple and uncorrelated -- for example two unrelated linear regressions, or, more generally, stacking two one-dimensional-output methods -- to correlated and more sophisticated: for example an artificial neural network with two output nodes, where the ANN parameters are shared between the outputs.

Finally, here is an example for the case of linear regression. There you make the ansatz

(Age, Length)^T =  (a1 + b1* Weight, a2 + b2*Weight)^T

and find the parameters a1, b1, a2, b2 by minimizing your loss function L, which in the simplest case is just

L(a1,b1,a2,b2) = || Age - (a1 + b1 * Weight) ||^2 + || Length - (a2 + b2 * Weight) ||^2

This choice amounts to two separate one-dimensional linear regressions, because no parameter appears in both terms. Fine.
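A minimal sketch of this linear case, assuming NumPy and made-up, noise-free data. The helper name `fit_linear` is hypothetical. It minimizes the summed squared residuals of Age and Length in one least-squares call, and the comparison with `np.polyfit` shows that the result matches two independent one-dimensional fits.

```python
import numpy as np

def fit_linear(weight, targets):
    """Fit targets ~ a + b * weight for every target column at once (hypothetical helper).

    Returns an array of shape (2, n_targets):
    row 0 holds the intercepts (a1, a2), row 1 the slopes (b1, b2).
    """
    weight = np.asarray(weight, dtype=float)
    targets = np.asarray(targets, dtype=float)
    # Design matrix with a column of ones for the intercept.
    design = np.column_stack([np.ones_like(weight), weight])
    coeffs, *_ = np.linalg.lstsq(design, targets, rcond=None)
    return coeffs

# Made-up data generated from known parameters, purely for illustration.
weight = np.array([1.0, 2.0, 3.0, 4.0])
age = 0.5 + 2.0 * weight
length = 10.0 + 3.0 * weight

coeffs = fit_linear(weight, np.column_stack([age, length]))
(a1, a2), (b1, b2) = coeffs

# Two separate one-dimensional fits; np.polyfit returns (slope, intercept) for degree 1.
b1_sep, a1_sep = np.polyfit(weight, age, 1)
b2_sep, a2_sep = np.polyfit(weight, length, 1)
```

Because the loss is a sum of one term per target with no shared parameters, the joint minimum and the two separate minima coincide.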

Often, however, you also want consistency between the target parameters -- intuitively: you prefer two small deviations in (Age, Length) to one large and one zero deviation. This is where correlated methods and loss functions enter.