fitnlm

Fit nonlinear regression model

Description

mdl = fitnlm(tbl,modelfun,beta0) fits the model specified by modelfun to variables in the table or dataset array tbl, and returns the nonlinear model mdl.

fitnlm estimates model coefficients using an iterative procedure starting from the initial values in beta0.

mdl = fitnlm(X,y,modelfun,beta0) fits a nonlinear regression model using the column vector y as a response variable and the columns of the matrix X as predictor variables.

mdl = fitnlm(___,modelfun,beta0,Name,Value) fits a nonlinear regression model with additional options specified by one or more Name,Value pair arguments.

Examples

Create a nonlinear model for auto mileage based on the carbig data, using a table as input.

Load the data and create a nonlinear model.

load carbig
tbl = table(Horsepower,Weight,MPG);
modelfun = @(b,x)b(1) + b(2)*x(:,1).^b(3) + ...
    b(4)*x(:,2).^b(5);
beta0 = [-50 500 -1 500 -1];
mdl = fitnlm(tbl,modelfun,beta0)
mdl = 
Nonlinear regression model:
    MPG ~ b1 + b2*Horsepower^b3 + b4*Weight^b5

Estimated Coefficients:
          Estimate      SE        tStat       pValue 
          ________    _______    ________    ________

    b1     -49.383     119.97    -0.41164     0.68083
    b2      376.43     567.05     0.66384     0.50719
    b3    -0.78193    0.47168     -1.6578    0.098177
    b4      422.37     776.02     0.54428     0.58656
    b5    -0.24127    0.48325    -0.49926     0.61788


Number of observations: 392, Error degrees of freedom: 387
Root Mean Squared Error: 3.96
R-Squared: 0.745,  Adjusted R-Squared 0.743
F-statistic vs. constant model: 283, p-value = 1.79e-113

Create a nonlinear model for auto mileage based on the carbig data, passing the predictors as a matrix and the response as a vector.

Load the data and create a nonlinear model.

load carbig
X = [Horsepower,Weight];
y = MPG;
modelfun = @(b,x)b(1) + b(2)*x(:,1).^b(3) + ...
    b(4)*x(:,2).^b(5);
beta0 = [-50 500 -1 500 -1];
mdl = fitnlm(X,y,modelfun,beta0)
mdl = 
Nonlinear regression model:
    y ~ b1 + b2*x1^b3 + b4*x2^b5

Estimated Coefficients:
          Estimate      SE        tStat       pValue 
          ________    _______    ________    ________

    b1     -49.383     119.97    -0.41164     0.68083
    b2      376.43     567.05     0.66384     0.50719
    b3    -0.78193    0.47168     -1.6578    0.098177
    b4      422.37     776.02     0.54428     0.58656
    b5    -0.24127    0.48325    -0.49926     0.61788


Number of observations: 392, Error degrees of freedom: 387
Root Mean Squared Error: 3.96
R-Squared: 0.745,  Adjusted R-Squared 0.743
F-statistic vs. constant model: 283, p-value = 1.79e-113

Create a nonlinear model for auto mileage based on the carbig data. Strive for more accuracy by lowering the TolFun option, and observe the iterations by setting the Display option.

Load the data and create a nonlinear model.

load carbig
X = [Horsepower,Weight];
y = MPG;
modelfun = @(b,x)b(1) + b(2)*x(:,1).^b(3) + ...
    b(4)*x(:,2).^b(5);
beta0 = [-50 500 -1 500 -1];

Create options to lower TolFun and to report iterative display, and create a model using the options.

opts = statset('Display','iter','TolFun',1e-10);
mdl = fitnlm(X,y,modelfun,beta0,'Options',opts);
 
                                     Norm of         Norm of
   Iteration             SSE        Gradient           Step 
  -----------------------------------------------------------
           0     1.82248e+06
           1          678600          788810         1691.07
           2          616716     6.12739e+06         45.4738
           3          249831      3.9532e+06         293.557
           4           17675          361544         369.284
           5         11746.6         69670.5         169.079
           6         7242.22          343738         394.822
           7         6250.32          159719         452.941
           8         6172.87         91622.9         268.674
           9            6077         6957.44         100.208
          10         6076.34         6370.39         88.1905
          11         6075.75         5199.08         77.9694
          12          6075.3         4646.61          69.764
          13         6074.91         4235.96         62.9114
          14         6074.55         3885.28         57.0647
          15         6074.23          3571.1         52.0036
          16         6073.93         3286.48         47.5795
          17         6073.66         3028.34         43.6844
          18          6073.4         2794.31         40.2352
          19         6073.17         2582.15         37.1663
          20         6072.95         2389.68         34.4243
          21         6072.74         2214.84         31.9651
          22         6072.55         2055.78         29.7516
          23         6072.37         1910.83          27.753
          24         6072.21         1778.51         25.9428
          25         6072.05          1657.5         24.2986
          26          6071.9         1546.65         22.8011
          27         6071.76         1444.93         21.4338
          28         6071.63         1351.44         20.1822
          29         6071.51         1265.39         19.0339
          30         6071.39         1186.06          17.978
          31         6071.28         1112.83         17.0052
          32         6071.17         1045.13          16.107
          33         6071.07         982.465         15.2762
          34         6070.98         924.389         14.5063
          35         6070.89         870.498         13.7916
          36          6070.8         820.434          13.127
          37         6070.72         773.872         12.5081
          38         6070.64         730.521         11.9307
          39         6070.57         690.117         11.3914
          40          6070.5         652.422          10.887
          41         6070.43         617.219         10.4144
          42         6070.37         584.315         9.97115
          43         6070.31          553.53         9.55489
          44         6070.25         524.703          9.1635
          45         6070.19         497.686         8.79506
          46         6070.14         472.345         8.44785
          47         6070.08         448.557         8.12028
          48         6070.03          426.21         7.81092
          49         6069.99         405.201         7.51845
          50         6069.94         385.435          7.2417
          51          6069.9         366.825         6.97956
          52         6069.85         349.293         6.73104
          53         6069.81         332.764         6.49523
          54         6069.77         317.171         6.27127
          55         6069.74         302.453          6.0584
          56          6069.7          288.55         5.85591
          57         6069.66         275.411         5.66315
          58         6069.63         262.986         5.47949
          59          6069.6          251.23          5.3044
          60         6069.57           240.1         5.13734
          61         6069.54         229.558         4.97784
          62         6069.51         219.567         4.82545
          63         6069.48         210.094         4.67977
          64         6069.45         201.108          4.5404
          65         6069.43         192.578           4.407
          66          6069.4         184.479         4.27923
          67         6069.38         176.785         4.15677
          68         6069.35         169.472         4.03935
          69         6069.33         162.518          3.9267
          70         6069.31         155.903         3.81855
          71         6069.29         149.608         3.71468
          72         6069.26         143.615         3.61486
          73         6069.24         137.907          3.5189
          74         6069.22         132.468         3.42658
          75         6069.21         127.283         3.33774
          76         6069.19         122.339         3.25221
          77         6069.17         117.623         3.16981
          78         6069.15         113.123         3.09041
          79         6069.14         108.827         3.01386
          80         6069.12         104.725         2.94002
          81          6069.1         100.806         2.86877
          82         6069.09         97.0611             2.8
          83         6069.07         93.4814         2.73358
          84         6069.06         90.0583         2.66942
          85         6069.05         86.7842         2.60741
          86         6069.03         83.6513         2.54745
          87         6069.02         80.6528         2.48947
          88         6069.01         77.7821         2.43338
          89         6068.99         75.0328         2.37908
          90         6068.98          72.399         2.32652
          91         6068.97         69.8752         2.27561
          92         6068.96         67.4561         2.22629
          93         6068.95         65.1367         2.17849
          94         6068.94         62.9122         2.13216
          95         6068.93         60.7784         2.08723
          96         6068.92         58.7308         2.04364
          97         6068.91         56.7655         2.00135
          98          6068.9         54.8787          1.9603
          99         6068.89         4349.28         18.1917
         100         6068.77         2416.27         14.4439
         101         6068.71         1721.26         12.1305
         102         6068.66         1228.78          10.289
         103         6068.63         884.002         8.82019
         104          6068.6         639.615         7.62745
         105         6068.58          464.84         6.64627
         106         6068.56         338.878         5.82964
         107         6068.55         247.508         5.14297
         108         6068.54         180.878         4.56032
         109         6068.53         132.084         4.06194
         110         6068.52         96.2342         3.63255
         111         6068.51         69.8362         3.26019
         112         6068.51         50.3734         2.93541
         113          6068.5         36.0205         2.65062
         114          6068.5         25.4451         2.39969
         115         6068.49         17.6693         2.17764
         116         6068.49         1027.39         14.0164
         117         6068.48         544.039          5.3137
         118         6068.48         94.0569         2.86662
         119         6068.48         113.637         3.73503
         120         6068.48         0.51834         1.37051
         121         6068.48         4.59439        0.912827
         122         6068.48         1.56359        0.629276
         123         6068.48         1.13825        0.432567
         124         6068.48        0.296021        0.297532
Iterations terminated: relative change in SSE less than OPTIONS.TolFun

Specify a nonlinear regression model for estimation using a function handle or model syntax.

Load sample data.

S = load('reaction');
X = S.reactants;
y = S.rate;
beta0 = S.beta;

Use a function handle to specify the Hougen-Watson model for the rate data.

mdl = fitnlm(X,y,@hougen,beta0)
mdl = 
Nonlinear regression model:
    y ~ hougen(b,X)

Estimated Coefficients:
          Estimate       SE       tStat     pValue 
          ________    ________    ______    _______

    b1      1.2526     0.86701    1.4447    0.18654
    b2    0.062776    0.043561    1.4411    0.18753
    b3    0.040048    0.030885    1.2967    0.23089
    b4     0.11242    0.075157    1.4957    0.17309
    b5      1.1914     0.83671    1.4239     0.1923


Number of observations: 13, Error degrees of freedom: 8
Root Mean Squared Error: 0.193
R-Squared: 0.999,  Adjusted R-Squared 0.998
F-statistic vs. zero model: 3.91e+03, p-value = 2.54e-13

Alternatively, you can use an expression to specify the Hougen-Watson model for the rate data.

myfun = 'y~(b1*x2-x3/b5)/(1+b2*x1+b3*x2+b4*x3)';
mdl2 = fitnlm(X,y,myfun,beta0)
mdl2 = 
Nonlinear regression model:
    y ~ (b1*x2 - x3/b5)/(1 + b2*x1 + b3*x2 + b4*x3)

Estimated Coefficients:
          Estimate       SE       tStat     pValue 
          ________    ________    ______    _______

    b1      1.2526     0.86701    1.4447    0.18654
    b2    0.062776    0.043561    1.4411    0.18753
    b3    0.040048    0.030885    1.2967    0.23089
    b4     0.11242    0.075157    1.4957    0.17309
    b5      1.1914     0.83671    1.4239     0.1923


Number of observations: 13, Error degrees of freedom: 8
Root Mean Squared Error: 0.193
R-Squared: 0.999,  Adjusted R-Squared 0.998
F-statistic vs. zero model: 3.91e+03, p-value = 2.54e-13

Generate sample data from the nonlinear regression model

$y = b_1 + b_2 \exp(-b_3 x) + \varepsilon,$

where b1, b2, and b3 are coefficients, and the error term is normally distributed with mean 0 and standard deviation 0.5.

modelfun = @(b,x)(b(1)+b(2)*exp(-b(3)*x));

rng('default') % for reproducibility
b = [1;3;2];
x = exprnd(2,100,1);
y = modelfun(b,x) + normrnd(0,0.5,100,1);

Set robust fitting options.

opts = statset('nlinfit');
opts.RobustWgtFun = 'bisquare';

Fit the nonlinear model using the robust fitting options. Here, use an expression to specify the model.

b0 = [2;2;2];
modelstr = 'y ~ b1 + b2*exp(-b3*x)';

mdl = fitnlm(x,y,modelstr,b0,'Options',opts)
mdl = 
Nonlinear regression model (robust fit):
    y ~ b1 + b2*exp( - b3*x)

Estimated Coefficients:
          Estimate      SE       tStat       pValue  
          ________    _______    ______    __________

    b1     1.0218     0.07202    14.188    2.1344e-25
    b2     3.6619     0.25429    14.401     7.974e-26
    b3     2.9732     0.38496    7.7232    1.0346e-11


Number of observations: 100, Error degrees of freedom: 97
Root Mean Squared Error: 0.501
R-Squared: 0.807,  Adjusted R-Squared 0.803
F-statistic vs. constant model: 203, p-value = 2.34e-35

Load sample data.

S = load('reaction');
X = S.reactants;
y = S.rate;
beta0 = S.beta;

Specify a function handle for observation weights. The function accepts the model fitted values as input, and returns a vector of weights.

a = 1; b = 1;
weights = @(yhat) 1./((a + b*abs(yhat)).^2);

Fit the Hougen-Watson model to the rate data using the specified observation weights function.

mdl = fitnlm(X,y,@hougen,beta0,'Weights',weights)
mdl = 
Nonlinear regression model:
    y ~ hougen(b,X)

Estimated Coefficients:
          Estimate       SE       tStat     pValue 
          ________    ________    ______    _______

    b1     0.83085     0.58224     1.427    0.19142
    b2     0.04095    0.029663    1.3805    0.20477
    b3    0.025063    0.019673     1.274    0.23842
    b4    0.080053    0.057812    1.3847    0.20353
    b5      1.8261       1.281    1.4256    0.19183


Number of observations: 13, Error degrees of freedom: 8
Root Mean Squared Error: 0.037
R-Squared: 0.998,  Adjusted R-Squared 0.998
F-statistic vs. zero model: 1.14e+03, p-value = 3.49e-11

Load sample data.

S = load('reaction');
X = S.reactants;
y = S.rate;
beta0 = S.beta;

Fit the Hougen-Watson model to the rate data using the combined error variance model.

mdl = fitnlm(X,y,@hougen,beta0,'ErrorModel','combined')
mdl = 
Nonlinear regression model:
    y ~ hougen(b,X)

Estimated Coefficients:
          Estimate       SE       tStat     pValue 
          ________    ________    ______    _______

    b1      1.2526     0.86702    1.4447    0.18654
    b2    0.062776    0.043561    1.4411    0.18753
    b3    0.040048    0.030885    1.2967    0.23089
    b4     0.11242    0.075158    1.4957    0.17309
    b5      1.1914     0.83671    1.4239     0.1923


Number of observations: 13, Error degrees of freedom: 8
Root Mean Squared Error: 1.27
R-Squared: 0.999,  Adjusted R-Squared 0.998
F-statistic vs. zero model: 3.91e+03, p-value = 2.54e-13

Input Arguments

Input data including predictor and response variables, specified as a table or dataset array. The predictor variables and response variable must be numeric.

  • If you specify modelfun using a formula, the model specification in the formula specifies the predictor and response variables.

  • If you specify modelfun using a function handle, the last variable is the response variable and the others are the predictor variables, by default. You can set a different column as the response variable by using the ResponseVar name-value pair argument. To select a subset of the columns as predictors, use the PredictorVars name-value pair argument.

The variable names in a table do not have to be valid MATLAB® identifiers, but the names must not contain leading or trailing blanks. If the names are not valid, you cannot specify modelfun using a formula.

You can verify the variable names in tbl by using the isvarname function. If the variable names are not valid, then you can convert them by using the matlab.lang.makeValidName function.
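
The check and conversion described above might look like the following minimal sketch. The variable names are hypothetical (they are not from any shipped data set); in practice they would come from tbl.Properties.VariableNames.

```matlab
% Hypothetical variable names; in practice, read them from
% tbl.Properties.VariableNames.
names = {'Horsepower','Miles Per Gallon','2ndGear'};

% isvarname tests one name at a time, so apply it to each element.
isValid = cellfun(@isvarname,names);

% Convert any invalid names to valid MATLAB identifiers.
fixedNames = matlab.lang.makeValidName(names);

% Confirm that every converted name is now a valid identifier.
allValid = all(cellfun(@isvarname,fixedNames));
```

After conversion, the names can be assigned back to tbl.Properties.VariableNames so that modelfun can be specified using a formula.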

Data Types: table

Predictor variables, specified as an n-by-p matrix, where n is the number of observations and p is the number of predictor variables. Each column of X represents one variable, and each row represents one observation.

Data Types: single | double

Response variable, specified as an n-by-1 vector, where n is the number of observations. Each entry in y is the response for the corresponding row of X.

Data Types: single | double

Functional form of the model, specified as either of the following.

  • Function handle @modelfun or @(b,x)modelfun, where

    • b is a coefficient vector with the same number of elements as beta0.

    • x is a matrix with the same number of columns as X or the number of predictor variable columns of tbl.

    modelfun(b,x) returns a column vector that contains the same number of rows as x. Each row of the vector is the result of evaluating modelfun on the corresponding row of x. In other words, modelfun is a vectorized function, one that operates on all data rows and returns all evaluations in one function call. modelfun should return real numbers to obtain meaningful coefficients.

  • Character vector or string scalar formula in the form 'y ~ f(b1,b2,...,bj,x1,x2,...,xk)', where f represents a scalar function of the scalar coefficient variables b1,...,bj and the scalar data variables x1,...,xk. The variable names in the formula must be valid MATLAB identifiers.

Data Types: function_handle | char | string
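
To illustrate what "vectorized" means for modelfun, here is a small sketch with made-up coefficients and data. It shows that a single call on a matrix returns one value per row, matching a row-by-row evaluation. The model itself is an assumption chosen only for illustration.

```matlab
% Illustrative model with two predictors and four coefficients.
modelfun = @(b,x) b(1) + b(2)*exp(-b(3)*x(:,1)) + b(4)*x(:,2);

b = [1 2 0.5 3];         % made-up coefficient values
x = [0 1; 1 2; 2 3];     % three observations, two predictor columns

% One call evaluates all rows and returns a 3-by-1 column vector.
yhat = modelfun(b,x);

% Evaluating one row at a time gives the same values.
yrow = zeros(size(x,1),1);
for i = 1:size(x,1)
    yrow(i) = modelfun(b,x(i,:));
end
```

Because the handle indexes columns with x(:,1) and x(:,2) and uses elementwise operations, it works unchanged for any number of rows.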

Initial coefficient values for the nonlinear model, specified as a numeric vector. NonLinearModel starts its search for optimal coefficients from beta0.

Data Types: single | double

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: 'ErrorModel','combined','Exclude',2,'Options',opt specifies the error model as the combined model, excludes the second observation from the fit, and uses the options defined in the structure opt to control the iterative fitting procedure.

Names of the model coefficients, specified as the comma-separated pair consisting of 'CoefficientNames' and a string array or cell array of character vectors.

Data Types: string | cell

Form of the error variance model, specified as the comma-separated pair consisting of 'ErrorModel' and one of the following. Each model defines the error using a standard mean-zero and unit-variance variable e in combination with independent components: the function value f, and one or two parameters a and b.

'constant' (default)    y = f + ae
'proportional'          y = f + bfe
'combined'              y = f + (a + b|f|)e

The only allowed error model when using Weights is 'constant'.

Note

options.RobustWgtFun must have value [] when using an error model other than 'constant'.

Example: 'ErrorModel','proportional'
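
As a rough sketch of how the three error models differ, the following simulates responses under each one. The function values f and the parameter values a and b are illustrative assumptions, not values from any example on this page.

```matlab
rng('default')                 % for reproducibility
f = linspace(1,10,50)';        % stand-in for model function values
e = randn(50,1);               % standard mean-zero, unit-variance errors
a = 0.5;                       % illustrative constant error parameter
b = 0.1;                       % illustrative proportional error parameter

yConstant     = f + a*e;                  % 'constant'
yProportional = f + b*f.*e;               % 'proportional'
yCombined     = f + (a + b*abs(f)).*e;    % 'combined'
```

Under the 'constant' model the error spread is the same for every observation, while under 'proportional' and 'combined' it grows with the magnitude of f.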

Initial estimates of the error model parameters for the chosen ErrorModel, specified as the comma-separated pair consisting of 'ErrorParameters' and a numeric array.

Error Model       Parameters    Default Values
'constant'        a             1
'proportional'    b             1
'combined'        a, b          [1,1]

You can only use the 'constant' error model when using Weights.

Note

options.RobustWgtFun must have value [] when using an error model other than 'constant'.

For example, if 'ErrorModel' has the value 'combined', you can specify the starting value 1 for a and the starting value 2 for b as follows.

Example: 'ErrorParameters',[1,2]

Data Types: single | double

Observations to exclude from the fit, specified as the comma-separated pair consisting of 'Exclude' and a logical or numeric index vector indicating which observations to exclude from the fit.

For example, you can exclude observations 2 and 3 out of 6 using either of the following examples.

Example: 'Exclude',[2,3]

Example: 'Exclude',logical([0 1 1 0 0 0])

Data Types: single | double | logical

Options for controlling the iterative fitting procedure, specified as the comma-separated pair consisting of 'Options' and a structure created by statset. The relevant fields are the nonempty fields in the structure returned by the call statset('fitnlm').

Each option is listed with its meaning and default value.

  • DerivStep: Relative difference used in finite difference derivative calculations. A positive scalar, or a vector of positive scalars the same size as the vector of parameters estimated by the Statistics and Machine Learning Toolbox™ function using the options structure. Default: eps^(1/3)

  • Display: Amount of information displayed by the fitting algorithm. Default: 'off'

    • 'off': Displays no information.

    • 'final': Displays the final output.

    • 'iter': Displays iterative output to the Command Window.

  • FunValCheck: Character vector or string scalar indicating to check for invalid values, such as NaN or Inf, from the model function. Default: 'on'

  • MaxIter: Maximum number of iterations allowed. Positive integer. Default: 200

  • RobustWgtFun: Weight function for robust fitting. Can also be a function handle that accepts a normalized residual as input and returns the robust weights as output. If you use a function handle, give a Tune constant. See Robust Options. Default: []

  • Tune: Tuning constant used in robust fitting to normalize the residuals before applying the weight function. A positive scalar. Required if the weight function is specified as a function handle. Default: See Robust Options; the default depends on RobustWgtFun.

  • TolFun: Termination tolerance for the objective function value. Positive scalar. Default: 1e-8

  • TolX: Termination tolerance for the parameters. Positive scalar. Default: 1e-8

Data Types: struct
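
The RobustWgtFun and Tune options can be combined as in the following sketch. The residual values are made up, and the weight equations are the ones listed under Robust Options; the custom handle simply reuses the "cauchy" form with its listed tuning constant as an illustration.

```matlab
% Illustrative normalized residuals.
r = [0; 0.5; 1; 2];

% Weights from three of the built-in weight functions.
wBisquare = (abs(r)<1) .* (1 - r.^2).^2;   % "bisquare"
wHuber    = 1 ./ max(1, abs(r));           % "huber"
wCauchy   = 1 ./ (1 + r.^2);               % "cauchy"

% A custom weight function supplied as a handle needs a Tune constant.
opts = statset('fitnlm');
opts.RobustWgtFun = @(r) 1 ./ (1 + r.^2);  % same form as "cauchy"
opts.Tune = 2.385;                         % tuning constant listed for "cauchy"
```

The resulting opts structure can then be passed to fitnlm through the 'Options' argument. Large residuals receive small weights, which reduces the influence of outliers on the fit.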

Predictor variables to use in the fit, specified as the comma-separated pair consisting of 'PredictorVars' and either a string array or cell array of character vectors of the variable names in the table or dataset array tbl, or a logical or numeric index vector indicating which columns are predictor variables.

The string values or character vectors should be among the names in tbl, or the names you specify using the 'VarNames' name-value pair argument.

The default is all variables in X, or all variables in tbl except for ResponseVar.

For example, you can specify the second and third variables as the predictor variables using either of the following examples.

Example: 'PredictorVars',[2,3]

Example: 'PredictorVars',logical([0 1 1 0 0 0])

Data Types: single | double | logical | string | cell

Response variable to use in the fit, specified as the comma-separated pair consisting of 'ResponseVar' and either a variable name in the table or dataset array tbl, or a logical or numeric index vector indicating which column is the response variable.

If you specify modelfun using a formula, the formula specifies the response variable. Otherwise, when fitting a table or dataset array, 'ResponseVar' indicates which variable fitnlm should use as the response.

For example, you can specify the fourth variable, say yield, as the response out of six variables, in one of the following ways.

Example: 'ResponseVar','yield'

Example: 'ResponseVar',[4]

Example: 'ResponseVar',logical([0 0 0 1 0 0])

Data Types: single | double | logical | char | string

Names of variables, specified as the comma-separated pair consisting of 'VarNames' and a string array or cell array of character vectors including the names for the columns of X first, and the name for the response variable y last.

'VarNames' is not applicable to variables in a table or dataset array, because those variables already have names.

Example: 'VarNames',{'Horsepower','Acceleration','Model_Year','MPG'}

Data Types: string | cell

Observation weights, specified as the comma-separated pair consisting of 'Weights' and a vector of nonnegative scalar values or a function handle.

  • If you specify a vector, then it must have n elements, where n is the number of rows in tbl or y.

  • If you specify a function handle, then the function must accept a vector of predicted response values as input, and return a vector of real positive weights as output.

Given weights, W, NonLinearModel estimates the error variance at observation i by MSE*(1/W(i)), where MSE is the mean squared error.

Data Types: single | double | function_handle
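
The relationship between weights and the estimated error variance can be worked through with small numbers. In this sketch, the weights, the MSE value, and the fitted values are all made up for illustration.

```matlab
% Vector form: larger weights imply smaller estimated error variance.
W = [1; 4; 0.25];          % illustrative observation weights
MSE = 2;                   % illustrative mean squared error
errVar = MSE*(1./W);       % estimated error variance per observation

% Function handle form: weights computed from predicted response values.
a = 1; b = 1;
weightFun = @(yhat) 1./((a + b*abs(yhat)).^2);
wHat = weightFun([0; 1; 3]);   % one positive weight per fitted value
```

Here errVar is [2; 0.5; 8] and wHat is [1; 0.25; 0.0625], so observations with larger fitted values are downweighted.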

Output Arguments

Nonlinear model representing a least-squares fit of the response to the data, returned as a NonLinearModel object.

If the Options structure contains a nonempty RobustWgtFun field, the model is not a least-squares fit, but uses the RobustWgtFun robust fitting function.

For properties and methods of the nonlinear model object, mdl, see the NonLinearModel class page.
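
As a brief sketch of working with the returned object, the following fits the Hougen-Watson model to the reaction data and reads a few results from mdl. It assumes Statistics and Machine Learning Toolbox is installed; the choice of which properties to inspect is only illustrative.

```matlab
% Fit the Hougen-Watson model to the reaction sample data.
S = load('reaction');
mdl = fitnlm(S.reactants,S.rate,@hougen,S.beta);

% Estimated coefficients as a numeric column vector.
bhat = mdl.Coefficients.Estimate;

% Root mean squared error of the fit.
rmse = mdl.RMSE;

% Predicted responses for the first three observations.
ypred = predict(mdl,S.reactants(1:3,:));
```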

More About

Robust Options

Weight Function         Equation                              Default Tuning Constant
"andrews"               w = (abs(r)<pi) .* sin(r) ./ r        1.339
"bisquare" (default)    w = (abs(r)<1) .* (1 - r.^2).^2       4.685
"cauchy"                w = 1 ./ (1 + r.^2)                   2.385
"fair"                  w = 1 ./ (1 + abs(r))                 1.400
"huber"                 w = 1 ./ max(1, abs(r))               1.345
"logistic"              w = tanh(r) ./ r                      1.205
"talwar"                w = 1 * (abs(r)<1)                    2.795
"welsch"                w = exp(-(r.^2))                      2.985

Algorithms

  • fitnlm uses the same fitting algorithm as nlinfit.

  • fitnlm considers NaN values in tbl, X, and y to be missing values. When fitting a model, fitnlm does not use observations with missing values or observations at which modelfun returns NaN values. The ObservationInfo property of a fitted model contains information regarding whether or not fitnlm uses each observation in the fit.
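
One way to see how missing values were handled is to inspect the ObservationInfo property after fitting. This sketch reuses the carbig model from the examples and assumes Statistics and Machine Learning Toolbox is installed.

```matlab
% Fit the mileage model; carbig contains some NaN values.
load carbig
X = [Horsepower,Weight];
y = MPG;
modelfun = @(b,x)b(1) + b(2)*x(:,1).^b(3) + ...
    b(4)*x(:,2).^b(5);
beta0 = [-50 500 -1 500 -1];
mdl = fitnlm(X,y,modelfun,beta0);

% ObservationInfo is a table with one row per observation.
info = mdl.ObservationInfo;
nMissing = sum(info.Missing);   % observations with missing values
nUsed = sum(info.Subset);       % observations used in the fit
```

The value of nUsed should agree with the number of observations reported in the model display.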

Version History

Introduced in R2013b