
Linear Regression in MATLAB

Linear regression is a form of regression analysis in which the relationship between one or more independent variables and a dependent variable is modeled by a least-squares function called the linear regression equation. This function is a linear combination of one or more model parameters, called regression coefficients.

Introduction

A linear regression equation with one independent variable represents a straight line, and the results are subject to statistical analysis. An example of linear regression with one independent variable is shown below.

[Figure: linear regression with one independent variable]

Linear Regression in MATLAB

 

Fitting a least-squares linear regression is easily accomplished in MATLAB using the backslash operator '\'. In linear algebra, matrices may be multiplied like this:

output = input * coefficients

The backslash in MATLAB allows the programmer to effectively "divide" the output by the input to get the linear coefficients. This process will be illustrated by the following examples:

Example 1: Simple Linear Regression

 

First, some data with a roughly linear relationship is needed:

  • X = [1 2 4 5 7 9 11 13 14 16]';
  • Y = [101 105 109 112 117 116 122 123 129 130]';


"Divide" using MATLAB's backslash operator to regress without an intercept:

  • B = X \ Y


B =

10.8900

Append a column of ones before dividing to include an intercept:

  • B = [ones(length(X),1) X] \ Y


B =

101.3021
1.8412

In this case, the first number is the intercept and the second is the coefficient (slope).
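
To see how well the line fits, the coefficients can be used to compute predicted values and plot them against the data. A minimal sketch, reusing the X, Y, and two-element B from above:

  • Yfit = [ones(length(X),1) X]*B;    %Predicted Y: intercept + slope*X
  • plot(X, Y, 'o', X, Yfit, '-');     %Data as circles, fitted line through them
  • xlabel('X'); ylabel('Y');

If the linear model is appropriate, the circles should fall close to the line.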

 

 

Linear Curve Fitting with MATLAB’s Built-in Functions

 

 

  • POLYFIT: Fit a polynomial to data. POLYFIT(X,Y,N) finds the coefficients of a polynomial P(X) of degree N that fits the data, P(X(I)) ~= Y(I), in a least-squares sense.
  • \ (backslash): Matrix left division. If A is a square matrix, A\B is roughly the same as inv(A)*B, except it is computed in a different way.
  • POLYVAL: Evaluate a polynomial at given points.
  • CORRCOEF: Compute the correlation coefficients of two vectors. Used together with POLYFIT and POLYVAL, it gives the correlation between the actual data and the output of a fitted curve; squaring that value gives R-squared.

Here is an example that uses the CORRCOEF function to compute the correlation between the actual data and a fitted curve:

 

  • load census                          %Built-in data set with year (cdate) and U.S. population (pop)

 

  • [p, s] = polyfit(cdate, pop, 2);      %Fit a second-order polynomial to the census data
  • Output = polyval(p, cdate);           %Evaluate the fit at the measurement dates
  • Correlation = corrcoef(pop, Output);  %Correlation matrix of the data and the fit

The diagonal elements of Correlation are 1: pop is perfectly correlated with itself, as is Output. The off-diagonal element is the correlation between pop and Output. Because this value is very close to 1, the output of the fitted curve tracks the actual data closely, so the fit is good.
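
If the R-squared value itself is wanted, it is just the square of that off-diagonal element. A minimal sketch, reusing the Correlation matrix computed above:

  • r = Correlation(1,2);    %Correlation between pop and Output
  • Rsq = r^2                %R-squared of the quadratic fit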

 

Via MATLAB's GUI

 

First enter some data, and plot it.

  • x=[0 2.6 5.1 7.4 9.6 7.8 4.8 2.6 0.1];
  • y=[-0.3 23.5 51.9 70.1 98.2 82.9 46.0 28.6 4.8]/1000;  %Convert from mA to amps.
  • plot(x,y,'o');                                         %Plot data with circles
  • title('Verifying Ohms Law');
  • xlabel('Volts');
  • ylabel('Current (Amps)');

Go to Tools -> Basic Fitting and set the dialog box as shown below.

[Screenshot: Basic Fitting dialog]

Hit the right arrow at the bottom of the box.

[Screenshot: expanded Basic Fitting dialog]

The figure now looks as shown.

[Figure: plot of the data with the fitted line]

Since the plot is current versus voltage, the slope of the fitted line is 1/R; a slope of about 0.01 A/V therefore corresponds to a resistance of about 100 ohms.

 

It is also possible to use MATLAB to do the curve fit and then calculate the correlation coefficient. In the following code, "p" contains the coefficients of the linear (first-order) fit: slope = m = p(1) = 0.0100 and intercept = b = p(2) = 0.0006.

 

>> p=polyfit(x,y,1)   

 

p = 0.0100 0.0006 

 

>> R=corrcoef(x,y);

 

>> R(1,2)

 

ans =

 

    0.9961

 

A correlation coefficient with a magnitude near 1 (as in this case) represents a good fit.  As the fit gets worse, the correlation coefficient approaches zero.
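
The same comparison can also be made visually. The following sketch, reusing the x, y, p, and R defined above, overlays the fitted line on the data with POLYVAL and squares the correlation coefficient to obtain R-squared:

xs = sort(x);                              %Sort the voltages so the fitted line plots cleanly
plot(x, y, 'o', xs, polyval(p,xs), '-');   %Data as circles, first-order fit as a line
xlabel('Volts'); ylabel('Current (Amps)'); title('Verifying Ohms Law');
Rsq = R(1,2)^2                             %R-squared, about 0.992 for this data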