Search code examples
gnuplotmaximatexmacs

Plotting linear regression and residual in Maxima via gnuplot


Motivation

I'm using TeXmacs to write math assignments and Maxima as CAS program. I recently found out how to use plot2d in gnuplot via Maxima.

Questions

  1. I now need to plot a linear regression line using some coordinates / table data.
  2. And it would be nice to be able to get the linear regression function f(x) = mx+b or f(x) = a+bx from the same data. Nice to have - not required.
  3. I also need to draw a residual plot from the same data. Perhaps getting the r^2 value.

Researched

I have read https://stackoverflow.com/a/9077851/205696 which answers how to draw a regression line in gnuplot but I do not know how to port that to the plot2d function in Maxima.

I also read this Q/A that explains how to calculate the equation for linear regression with Maxima but I am frankly not good enough in math to understand what is actually going on.

I found out that I can plot linear regressions with GeoGebra, but I do not see a way to draw a residual plot or to get f(x) = mx+b on the data I have provided (I can calculate f(x) = ax+b from observation but it would be nice to get exact numbers).

Explanatory images

Data table and linear regression line in TI Nspire Data table and linear regression line in TI Nspire.

Data table, linear regression line and residual plot in TI Nspire Data table, linear regression line and residual plot in TI Nspire.

Statistical calculations -> Linear regression in TI Nspire Statistical calculations -> Linear regression in TI Nspire.


Solution

  • Here is a sketch of a solution. There is a function linear_regression in the stats package which does most of the work.

    Let x and y be two lists, which contain your data. linear_regression expects a matrix of two or more columns. To paste the lists together into a matrix, you can say: xy: addcol(matrix(), x, y); (There are other ways to do it.)

    To get an object which contains the coefficients and residuals and other items, you can say: results: linear_regression(xy);

    To get the residuals: myresiduals: take_inference('residuals, results);

    To get the coefficients: coeffs: take_inference('b_estimation, results);

    Now with the coefficients, you can construct a function to represent the regression line: myline: subst(['b = coeffs[1], 'm = coeffs[2]], lambda([x], m*x + b));

    (Be sure to check the coefficients -- I might have gotten them reversed. Maybe it's coeffs[2] and then coeffs[1]? You should check.)

    At this point you have x, y, residuals, and myline. Plot them all together:

    plot2d ([[discrete, x, y], [discrete, x, residuals], myline], ['x, lmin(x) - 1, lmax(x) + 1]);
    

    To get more info, see: ? linear_regression