Given a set of points $(x_0,y_0),\dots,(x_{n-1},y_{n-1})$, linear regression finds the line $y=mx+b$ that comes closest to passing through all of the points; i.e., that makes $\sqrt{(y_0 - (m x_0 + b))^2 + \dots + (y_{n-1} - (m x_{n-1} + b))^2}$ as small as possible. Given a set of points (a two-column matrix) or two lists of numbers (the x- and y-coordinates), the linear_regression command will find the values of m and b which determine the line. For example, if you enter
or
you will get
which means that the line y = 4x − 2 is the best fit line.
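To make the computation concrete, here is a minimal Python sketch of the closed-form least-squares solution for m and b. It is not the implementation behind linear_regression; the function name fit_line is hypothetical, and the sample data is an assumption, chosen only because it happens to give the line y = 4x − 2.

```python
def fit_line(xs, ys):
    """Return (m, b) for the least-squares line y = m*x + b."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 points")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sum of squared deviations of x, and of products of x- and y-deviations.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x-values are equal; the best fit line is vertical")

    m = sxy / sxx            # slope
    b = mean_y - m * mean_x  # intercept: the line passes through the mean point
    return m, b


# Illustrative data (an assumption, not taken from the text above).
xs = [0, 1, 2, 3, 4]
ys = [0, 1, 4, 9, 16]
m, b = fit_line(xs, ys)
print(m, b)  # prints: 4.0 -2.0
```

Minimizing the square root of the sum of squares gives the same m and b as minimizing the sum of squares itself, which is why the sketch works with the squared deviations directly.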
The best fit line can be drawn with the linear_regression_plot command; if you enter
you will get
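This will draw the line (in this case y = 4x − 2) and give you the equation at the top, as well as the R² value, which is
$$R^2 = 1 - \frac{\sum_{j=0}^{n-1} \bigl(y_j - (m x_j + b)\bigr)^2}{\sum_{j=0}^{n-1} (y_j - \bar{y})^2},$$
where $\bar{y}$ is the mean of the y-values.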
(The R² value will be between 0 and 1 and is one measure of how well the line fits the data; a value close to 1 indicates a good fit, a value close to 0 indicates a bad fit.)
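As a rough illustration of how such a value can be computed, the sketch below uses the standard definition of R² as one minus the ratio of the residual sum of squares to the total sum of squares. The function name r_squared and the sample data are assumptions for illustration, and the form of the formula may differ from what linear_regression_plot uses internally.

```python
def r_squared(xs, ys, m, b):
    """Return R^2 for the line y = m*x + b against the points (xs, ys)."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("need two non-empty lists of equal length")

    mean_y = sum(ys) / len(ys)

    # Residual sum of squares: squared vertical distances from the line.
    ss_res = sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))
    # Total sum of squares: squared distances of the y-values from their mean.
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    if ss_tot == 0:
        raise ValueError("all y-values are equal; R^2 is undefined")

    return 1 - ss_res / ss_tot


# Illustrative data (an assumption) together with the line y = 4x - 2.
xs = [0, 1, 2, 3, 4]
ys = [0, 1, 4, 9, 16]
print(r_squared(xs, ys, 4, -2))  # about 0.92
```

For this data the residual sum of squares is 14 and the total sum of squares is 174, so R² = 160/174, a value fairly close to 1.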