## Data with Missing Values

Many data sets have one or more missing values. It is convenient to code missing values as `NaN` (Not a Number) to preserve the structure of data sets across multiple variables and observations.

Normal MATLAB® arithmetic operations yield `NaN` values when operands are `NaN`. Removing the `NaN` values would destroy the matrix structure. Removing the rows containing the `NaN` values would discard data. Statistics and Machine Learning Toolbox™ functions in the following table remove `NaN` values only for the purposes of computation.

| Function | Description |
| --- | --- |
| `nancov` | Covariance matrix, ignoring `NaN` values |
| `nanmax` | Maximum, ignoring `NaN` values |
| `nanmean` | Mean, ignoring `NaN` values |
| `nanmedian` | Median, ignoring `NaN` values |
| `nanmin` | Minimum, ignoring `NaN` values |
| `nanstd` | Standard deviation, ignoring `NaN` values |
| `nansum` | Sum, ignoring `NaN` values |
| `nanvar` | Variance, ignoring `NaN` values |

Other Statistics and Machine Learning Toolbox functions also ignore `NaN` values. These include `iqr`, `kurtosis`, `mad`, `prctile`, `range`, `skewness`, and `trimmean`.
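The trade-offs described above can be illustrated with a short sketch. The matrix `X` here is hypothetical sample data chosen for illustration, and the variable names are assumptions, not part of the toolbox.

```matlab
% Hypothetical sample data: two missing values coded as NaN
X = [NaN 1 6; 3 NaN 7; 4 9 2];

% Ordinary arithmetic propagates NaN through each affected column
m1 = mean(X);                       % [NaN NaN 5]

% Deleting the NaN elements flattens the matrix into a 7-by-1 column vector
v = X(~isnan(X));

% Deleting every row that contains a NaN keeps a matrix, but only
% the row [4 9 2] survives, so valid values in rows 1 and 2 are lost
Xrows = X(~any(isnan(X), 2), :);

% nanmean leaves X unchanged and skips NaN values only while computing
m2 = nanmean(X);                    % [3.5 5 5]
```

Only the last approach keeps both the 3-by-3 layout of `X` and every valid observation.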

### Working with Data with Missing Values

Create a 3-by-3 matrix of sample data. Remove two data values by replacing them with `NaN`.

```
X = magic(3);
X([1 5]) = [NaN NaN]
```

```
X = 3×3

   NaN     1     6
     3   NaN     7
     4     9     2
```

Compute the sum for each column of the sample data matrix using the `sum` function.

```
s1 = sum(X)
```

```
s1 = 1×3

   NaN   NaN    15
```

If a column contains a `NaN` value, then the `sum` function returns `NaN` as the sum of the data in that column.

For comparison, compute the sum for each column of the sample data matrix using the `nansum` function.

```
s2 = nansum(X)
```

```
s2 = 1×3

     7    10    15
```

If a column contains a `NaN` value, then the `nansum` function ignores the `NaN` value and returns the sum of the remaining values in the column.
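As a point of comparison, the base MATLAB `sum` and `mean` functions accept an `'omitnan'` option that gives the same treatment of missing values without a toolbox function. The following is a minimal sketch, assuming a MATLAB release in which `'omitnan'` is available; the variable names `s3`, `m3`, and `sAll` are illustrative.

```matlab
% Rebuild the sample data used in this example
X = magic(3);
X([1 5]) = [NaN NaN];

% Column sums that skip NaN values, matching the nansum result
s3 = sum(X, 'omitnan');             % [7 10 15]

% Column means that skip NaN values
m3 = mean(X, 'omitnan');            % [3.5 5 5]

% A column with no valid values sums to 0 rather than NaN
sAll = sum([NaN; NaN], 'omitnan');  % 0
```

Note that a sum over only `NaN` values returns 0, which cannot be distinguished from a true zero total, so it can be worth checking `all(isnan(X))` when whole columns might be missing.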