# goodnessOfFit

Goodness of fit between test and reference data for analysis and validation of identified models

## Syntax

``fit = goodnessOfFit(x,xref,cost_func)``

## Description

`goodnessOfFit` returns fit values that represent the error norm between test and reference data sets. If you want to compare and visualize simulated model output with measurement data, see also `compare`.

`fit = goodnessOfFit(x,xref,cost_func)` returns the goodness of fit between the test data `x` and the reference data `xref` using the cost function `cost_func`. `fit` is a quantitative representation of the closeness of `x` to `xref`. To perform multiple test-to-reference fit comparisons, you can specify `x` and `xref` as cell arrays of equal size that contain multiple test and reference data sets. With cell array inputs, `fit` returns an array of fit values.

## Examples

Find the goodness of fit between measured output data and the simulated output of an estimated model.

Obtain the measured output.

```
load iddata1 z1
yref = z1.y;
```

`z1` is an `iddata` object containing measured input-output data. `z1.y` is the measured output.

Estimate a second-order transfer function model and simulate the model output `y_est`.

```
sys = tfest(z1,2);
y_est = sim(sys,z1(:,[],:));
```

Calculate the goodness of fit, or error norm, between the measured and estimated outputs. Specify the normalized root mean squared error (NRMSE) as the cost function.

```
cost_func = 'NRMSE';
y = y_est.y;
fit = goodnessOfFit(y,yref,cost_func)
```

```
fit = 0.2943
```

Alternatively, you can use `compare` to calculate the fit. `compare` uses the NRMSE cost function, and expresses the fit percentage using the one's complement of the error norm. The fit relationship between `compare` and `goodnessOfFit` is therefore ${\mathrm{fit}}_{\mathrm{compare}}=\left(1-{\mathrm{fit}}_{\mathrm{gof}}\right)*100$. A `compare` result of 100% is equivalent to a `goodnessOfFit` result of 0.
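As a quick numeric check of this relationship, the following minimal sketch converts the `goodnessOfFit` value reported in this example into the percentage convention that `compare` uses. The variable names `fit_gof` and `fit_compare` are illustrative and do not come from the toolbox.

```matlab
% NRMSE value reported by goodnessOfFit in this example
fit_gof = 0.2943;

% Convert the error norm to the fit percentage convention used by compare
fit_compare = (1 - fit_gof)*100;

fprintf('compare-style fit: %.2f%%\n', fit_compare);
```

A value of `fit_gof = 0` maps to 100%, and larger error norms map to lower (possibly negative) percentages.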

Specify an initial condition of zero to match the initial condition that `goodnessOfFit` assumes.

```
opt = compareOptions('InitialCondition','z');
compare(z1,sys,opt);
```

The fit results are equivalent.

Find the goodness of fit between measured and estimated outputs for two models.

Obtain the input-output measurements `z2` from `iddata2`. Copy the measured output into reference output `yref`.

```
load iddata2 z2
yref = z2.y;
```

Estimate second-order and fourth-order transfer function models using `z2`.

```
sys2 = tfest(z2,2);
sys4 = tfest(z2,4);
```

Simulate both systems to get estimated outputs.

```
y_sim2 = sim(sys2,z2(:,[],:));
y2 = y_sim2.y;
y_sim4 = sim(sys4,z2(:,[],:));
y4 = y_sim4.y;
```

Create cell arrays from the reference and estimated outputs. The reference data set is the same for both model comparisons, so create identical reference cells.

```
yrefc = {yref yref};
yc = {y2 y4};
```

Compute `fit` values for the three cost functions.

```
fit_nrmse = goodnessOfFit(yc,yrefc,'NRMSE')
```

```
fit_nrmse = 1×2

    0.1429    0.1342
```

```
fit_nmse = goodnessOfFit(yc,yrefc,'NMSE')
```

```
fit_nmse = 1×2

    0.0204    0.0180
```

```
fit_mse = goodnessOfFit(yc,yrefc,'MSE')
```

```
fit_mse = 1×2

    1.0811    0.9541
```

A fit value of 0 indicates a perfect fit between reference and estimated outputs. The fit value rises as fit goodness decreases. For all three cost functions, the fourth-order model produces a better fit than the second-order model.
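The three sets of reported values can be cross-checked against each other. The sketch below, which only reuses the numbers printed above, assumes the cost function definitions given under Input Arguments: NMSE is the square of NRMSE, and MSE differs from NMSE only in its normalization, so the MSE-to-NMSE ratio should be about the same for both models because they share `yref`. The names `nmse_from_nrmse` and `mse_to_nmse` are illustrative.

```matlab
% Fit values reported above (second-order and fourth-order models)
fit_nrmse = [0.1429 0.1342];
fit_nmse  = [0.0204 0.0180];
fit_mse   = [1.0811 0.9541];

% NMSE is the square of NRMSE, so these should agree to display precision
nmse_from_nrmse = fit_nrmse.^2;

% MSE and NMSE share the same numerator up to the 1/Ns factor, so their
% ratio depends only on yref and should match across the two models
mse_to_nmse = fit_mse./fit_nmse;

disp(nmse_from_nrmse)
disp(mse_to_nmse)
```

Small differences in the last digit are expected because the inputs are rounded to four decimal places.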

## Input Arguments

`x`: Data to test, specified as a matrix or cell array.

• For a single test data set, specify an Ns-by-N matrix, where Ns is the number of samples and N is the number of channels. You must specify `cost_func` as `'NRMSE'` or `'NMSE'` to use multiple-channel data.

• For multiple test data sets, specify a cell array of length Nd, where Nd is the number of test-to-reference pairs and each cell contains one data matrix.

`x` must not contain any `NaN` or `Inf` values.

`xref`: Reference data with which to compare `x`, specified as a matrix or cell array.

• For a single reference data set, specify an Ns-by-N matrix, where Ns is the number of samples and N is the number of channels. `xref` must be the same size as `x`. You must specify `cost_func` as `'NRMSE'` or `'NMSE'` to use multiple-channel data.

• For multiple reference data sets, specify a cell array of length Nd, where Nd is the number of test-to-reference pairs and each cell contains one reference data matrix. As with the individual data matrices, the cell array sizes for `x` and `xref` must match. The ith entry of `fit` corresponds to the pair formed by the ith cells of `x` and `xref`.

`xref` must not contain any `NaN` or `Inf` values.
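The size and finiteness requirements above can be checked before calling `goodnessOfFit`. The following is a minimal sketch in base MATLAB; `isValidPair` is a hypothetical helper written for this illustration (it is not a toolbox function), and the data values are made up.

```matlab
% Hypothetical helper (not part of the toolbox): true when a test/reference
% pair has matching sizes and contains only finite values
isValidPair = @(x,xref) isequal(size(x),size(xref)) && ...
    all(isfinite(x(:))) && all(isfinite(xref(:)));

% Illustrative data: two test-to-reference pairs, the second with a NaN
xc    = {[1; 2; 3], [1; NaN; 3]};
xrefc = {[1; 2; 4], [1; 2; 3]};

% The cell arrays must have the same length
sameLength = numel(xc) == numel(xrefc);

% Check each pair; pairOK(i) is false for any pair that would be rejected
pairOK = cellfun(isValidPair, xc, xrefc);
```

Here the second pair fails the check because its test data contains a `NaN` value.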

`cost_func`: Cost function to determine goodness of fit, specified as one of the following values. In the equations, the value fit applies to a single pairing of test and reference data sets.

| Value | Description | Equation | Notes |
|---|---|---|---|
| `'MSE'` | Mean squared error | $fit=\frac{{\lVert x-xref\rVert}^{2}}{Ns}$, where Ns is the number of samples and $\lVert\cdot\rVert$ indicates the 2-norm of a vector. | fit is a scalar. |
| `'NRMSE'` | Normalized root mean squared error | $fit\left(i\right)=\frac{\lVert xref\left(:,i\right)-x\left(:,i\right)\rVert}{\lVert xref\left(:,i\right)-mean\left(xref\left(:,i\right)\right)\rVert}$, where $\lVert\cdot\rVert$ indicates the 2-norm of a vector. `fit` is a row vector of length N and i = 1,...,N, where N is the number of channels. | fit is a row vector. `'NRMSE'` is the cost function used by `compare`. |
| `'NMSE'` | Normalized mean squared error | $fit\left(i\right)=\frac{{\lVert xref\left(:,i\right)-x\left(:,i\right)\rVert}^{2}}{{\lVert xref\left(:,i\right)-mean\left(xref\left(:,i\right)\right)\rVert}^{2}}$ | fit is a row vector. |
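To make the equations concrete, the following sketch evaluates them by hand in base MATLAB on a small made-up data set. It is an illustration of the formulas as written here, not the toolbox implementation, and the variable names `errNorm` and `devNorm` are assumptions introduced for this example. The subtraction of the column means relies on implicit expansion (R2016b or later).

```matlab
% Illustrative data: Ns = 4 samples, N = 2 channels (values are made up)
xref = [1 10; 2 20; 3 30; 4 40];
x    = [1.1 10; 1.9 22; 3.2 29; 3.9 41];

% Column-wise 2-norms of the error and of the mean-removed reference
errNorm = sqrt(sum((xref - x).^2, 1));
devNorm = sqrt(sum((xref - mean(xref,1)).^2, 1));

% NRMSE and NMSE: one value per channel (1-by-N row vectors)
fit_nrmse = errNorm./devNorm;
fit_nmse  = (errNorm./devNorm).^2;

% MSE is shown for the first channel only, since multiple-channel data
% is restricted to 'NRMSE' and 'NMSE'
Ns = size(x,1);
fit_mse = norm(x(:,1) - xref(:,1))^2/Ns;
```

For this data, `fit_nmse` is approximately `[0.014 0.012]` and `fit_mse` is approximately `0.0175`.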

## Output Arguments

`fit`: Goodness of fit between test and reference data pairs, returned as a scalar, a vector, or a matrix.

• For a single test and reference data set pair, `fit` is returned as a scalar or row vector.

  • If `cost_func` is `'MSE'`, then `fit` is a scalar.

  • If `cost_func` is `'NRMSE'` or `'NMSE'`, then `fit` is a row vector of length N, where N is the number of channels.

• For multiple test and reference data set pairs, where `x` and `xref` are cell arrays of length Nd, `fit` is returned as a vector or a matrix.

  • If `cost_func` is `'MSE'`, then `fit` is a row vector of length Nd.

  • If `cost_func` is `'NRMSE'` or `'NMSE'`, then `fit` is a matrix of size N-by-Nd, where N is the number of channels (data columns) and Nd is the number of test pairs.

Each element of `fit` contains the goodness of fit values for the corresponding test data and reference pair.

Possible values for individual fit elements depend on the selection of `cost_func`.

• If `cost_func` is `'MSE'`, each `fit` value is a nonnegative scalar that grows with the error between test and reference data. A `fit` value of `0` indicates a perfect match between test and reference data.

• If `cost_func` is `'NRMSE'` or `'NMSE'`, `fit` values vary between 0 and `Inf`, because each value is a ratio of norms.

  • `0`: Perfect fit to reference data (zero error)

  • `Inf`: Bad fit

  • `1`: `x` is no better than a straight line at matching `xref`
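The landmark values 0 and 1 follow directly from the NRMSE equation in the cost function table. The sketch below uses a hand-written single-channel NRMSE as an illustrative stand-in for the toolbox function; the anonymous function `nrmse` and the data are assumptions made for this example.

```matlab
% Hand-written NRMSE for a single channel, following the equation in the
% cost function table (illustrative stand-in, not the toolbox function)
nrmse = @(x,xref) norm(xref - x)/norm(xref - mean(xref));

% Made-up reference data
xref = [2; 4; 6; 8];

% Test data equal to the reference: zero error
fit_perfect = nrmse(xref, xref);

% Test data equal to a constant line at the reference mean
fit_meanline = nrmse(mean(xref)*ones(size(xref)), xref);

% Grossly wrong test data: the error norm, and so the fit value, grows large
fit_poor = nrmse(100*xref, xref);
```

Here `fit_perfect` is 0, `fit_meanline` is 1, and `fit_poor` is far greater than 1.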

## Version History

Introduced in R2012a
