hashSimilarityModel

Hashed-feature similarity model for estimating remaining useful life

Description

Use hashSimilarityModel to estimate the remaining useful life (RUL) of a component using a hashed-feature similarity model. This model is useful when you have run-to-failure degradation path histories for an ensemble of similar components, such as multiple machines manufactured to the same specifications, and the data set is large. The hashed-feature similarity model transforms the historical degradation path data for each ensemble member into a series of hashed features, such as the mean, power, minimum, or maximum values for the data. You can then compute the hashed features of a test component and compare them to the hashed features of the ensemble data members.

To configure a hashSimilarityModel object, use fit, which computes and stores the hashed feature values of the ensemble data members. Once you configure the parameters of your similarity model, you can then predict the remaining useful life of similar components using predictRUL. For similarity models, the RUL of the test component is estimated as the median statistic of the lifetime span of the most similar components minus the current lifetime value of the test component. For a basic example illustrating RUL prediction, see Update RUL Prediction as Data Arrives.
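The RUL rule described above can be sketched in a few lines of base MATLAB. This is only an illustration of the arithmetic, not the toolbox's internal code; the life spans and the current lifetime are hypothetical values.

```matlab
% Hypothetical values for illustration only (not toolbox data).
lifeSpans   = [210 190 250 230 205];  % life spans of the most similar members, in hours
currentLife = 60;                     % current lifetime of the test component, in hours

% RUL = median life span of the similar members minus the current lifetime.
estRUL = median(lifeSpans) - currentLife;
```

Here the median life span is 210 hours, so the sketch yields an RUL of 150 hours.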

For general information on predicting remaining useful life, see Models for Predicting Remaining Useful Life.

Creation

Syntax

mdl = hashSimilarityModel

mdl = hashSimilarityModel(initModel)

mdl = hashSimilarityModel(___,Name,Value)

Description

mdl = hashSimilarityModel creates a hashed-feature similarity model for estimating RUL and initializes the model with default settings.

mdl = hashSimilarityModel(initModel) creates a hashed-feature similarity model and initializes the model parameters using an existing hashSimilarityModel object initModel.

mdl = hashSimilarityModel(___,Name,Value) specifies user-settable model properties using name-value pairs. For example, hashSimilarityModel('LifeTimeUnit',"days") creates a hashed-feature similarity model that uses days as a lifetime unit. You can specify multiple name-value pairs. Enclose each property name in quotes.

Input Arguments

`initModel` — Hashed-feature similarity model
`hashSimilarityModel` object

Hashed-feature similarity model, specified as a hashSimilarityModel object.

Properties

`HashTable` — Hashed feature values
N-by-M array

This property is read-only.

Hashed feature values generated by the fit function, specified as an N-by-M array, where N is the number of ensemble members and M is the number of hashed features. HashTable(i,j) contains the hashed feature value of the jth feature computed for the ith data member.

To specify the method for computing the hashed features, use the Method property of the model.
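To make the layout concrete, the following sketch builds a small table of hashed features in base MATLAB, with rows indexing ensemble members and columns indexing features, matching the HashTable(i,j) convention. The ensemble data and the helper featFcn are hypothetical, and the code only mimics the documented "minmaxstd" behavior; it is not the implementation used by fit.

```matlab
% Hypothetical ensemble: each cell holds one member's degradation profile.
ensemble = {[1; 2; 4], [3; 3; 9; 5], [0; 10]};

% Feature function mimicking "minmaxstd" (assumption: NaN samples are
% dropped before the statistics are computed).
featFcn = @(x) [min(x(~isnan(x))), max(x(~isnan(x))), std(x(~isnan(x)))];

% One row per ensemble member, one column per feature.
rows      = cellfun(featFcn, ensemble(:), 'UniformOutput', false);
hashTable = cell2mat(rows);
```

With three members and three features, hashTable is 3-by-3, and hashTable(2,1) holds the minimum of the second member's profile.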

`RegimeSplit` — Breakpoints for splitting historical data into multiple regimes
row vector of doubles (default) | `[]` | row vector of `duration` objects | row vector of `datetime` objects

Breakpoints for splitting historical data into multiple regimes, specified as a row vector of double values, duration objects, or datetime objects. The row vector of breakpoints must:

Be in increasing order
Have units and a format that is compatible with the training data used with the fit function

To use a single regime, specify RegimeSplit as [].

A separate hash table is generated for each regime. The RUL prediction is based on the similarity to the hashed features in the regime to which the test data belongs. If you change the value of RegimeSplit, then you must retrain your model using fit.
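As a rough illustration of how breakpoints partition lifetime values into regimes, consider the following base MATLAB sketch. The breakpoints and lifetime values are hypothetical, and the handling of a value that falls exactly on a breakpoint is an assumption made for this sketch, not documented toolbox behavior.

```matlab
breakpoints = [100 250];   % hypothetical RegimeSplit value: two breakpoints, three regimes
assert(all(diff(breakpoints) > 0), 'Breakpoints must be in increasing order.');

lifetimes = [20 100 180 400];   % hypothetical lifetime values

% Regime index = 1 + number of breakpoints at or below the lifetime value.
% Assumption: a value equal to a breakpoint belongs to the later regime.
regimeIdx = sum(lifetimes(:) >= breakpoints, 2)' + 1;
```

In this sketch the four lifetime values map to regimes 1, 2, 2, and 3.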

You can specify RegimeSplit:

Using a name-value pair when you create the model
Using dot notation after model creation

`LifeSpan` — Ensemble member life spans
double vector (default) | vector of `duration` objects

This property is read-only.

Ensemble member life spans, specified as a double vector or duration object vector and computed from the ensemble member degradation profiles by the fit function.

`NumNearestNeighbors` — Number of nearest neighbors for RUL estimation
`Inf` (default) | finite positive integer

Number of nearest neighbors for RUL estimation, specified as Inf or a finite positive integer. If NumNearestNeighbors is Inf, then predictRUL uses all the ensemble members during estimation.

You can specify NumNearestNeighbors:

Using a name-value pair when you create the model
Using dot notation after model creation

`Method` — Hashed feature computation method
`"minmaxstd"` (default) | function handle

Hashed feature computation method, specified as one of the following:

"minmaxstd" — Extract the minimum, maximum, and standard deviation of the data. This option omits observations that contain NaN. When you use this method, HashTable is M-by-3, where M is the number of ensemble members.
Function handle — Use a custom function that takes degradation data as a column vector, table, or timetable, and returns a row vector of features. For example:
```
mdl.Method = @(x) [mean(x),std(x),kurtosis(x),median(x)]
```
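As a further illustration, a custom feature function can be checked on a sample column vector before it is assigned to the model. The sketch below uses only base MATLAB functions; the handle name featFcn and the sample data are hypothetical.

```matlab
% Hypothetical custom feature function: mean, standard deviation, and
% median, each ignoring NaN samples.
featFcn = @(x) [mean(x,'omitnan'), std(x,'omitnan'), median(x,'omitnan')];

x = [1; 2; 3; NaN];   % hypothetical degradation data as a column vector
f = featFcn(x);       % 1-by-3 row vector of features

% The handle could then be assigned to a model, for example:
% mdl.Method = featFcn;
```

Checking that the function returns a row vector of fixed length for a column-vector input helps catch shape errors before training.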

You can specify Method:

Using a name-value pair when you create the model
Using dot notation after model creation

`Distance` — Distance computation method
`"euclidian"` (default) | `"absolute"` | function handle

Distance computation method, specified as one of the following:

"euclidian" — Use the 2-norm of the difference between hash vectors.
"absolute" — Use the 1-norm of the difference between hash vectors.
Function handle — Use a custom function of the form:
```
D = distanceFunction(xTest,xEnsemble)
```
Here,
- xTest is a row vector of length N that contains test component hashed features, where N is the number of hashed features.
- xEnsemble is an M-by-N array of ensemble component hashed features, where M is the number of ensemble components in the fitted model. xEnsemble(i,:) contains the hashed features for the ith ensemble member.
- D is a column vector of length M, where D(i) is the distance between the test feature vector and the feature vector of the ith ensemble member.
For an example of using a custom distance function, see Specify Custom Distance Function for Hash Similarity Model.
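The two built-in options can be written out in the same form as a custom distance function. The following base MATLAB sketch computes the 2-norm and 1-norm distances for hypothetical feature values; it illustrates the input and output shapes described above and is not the toolbox's own code.

```matlab
% Hypothetical hashed features: one test vector and three ensemble members.
xTest     = [1 2 3];
xEnsemble = [1 2 3; 4 6 3; 0 2 5];

% Row-wise differences (implicit expansion requires R2016b or later).
diffs = xEnsemble - xTest;

dTwoNorm = sqrt(sum(diffs.^2, 2));   % 2-norm, M-by-1 column vector
dOneNorm = sum(abs(diffs), 2);       % 1-norm, M-by-1 column vector
```

Both results have one entry per ensemble member, which is the shape a custom distance function must return.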

You can specify Distance:

Using a name-value pair when you create the model
Using dot notation after model creation

`IncludeTies` — Flag to include ties
`true` (default) | `false`

Flag to include ties, specified as true or false. When IncludeTies is true, the model includes all neighbors whose distance to the test component data is less than or equal to the Kth smallest distance, where K is equal to NumNearestNeighbors.
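The effect of ties on neighbor selection can be sketched in base MATLAB. The distances below are hypothetical, and the sketch assumes that a tie means a distance equal to the Kth smallest distance; it is not the selection code used by predictRUL.

```matlab
d = [0.5; 1.2; 0.8; 1.2; 2.0];   % hypothetical distances to five ensemble members
K = 3;                           % plays the role of NumNearestNeighbors

[sortedD, order] = sort(d);
kthDist = sortedD(K);            % Kth smallest distance

withTies    = find(d <= kthDist);   % keeps both members at distance 1.2
withoutTies = sort(order(1:K));     % exactly K members
```

Members 2 and 4 are tied at the third smallest distance, so including ties selects four neighbors instead of three.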

You can specify IncludeTies:

Using a name-value pair when you create the model
Using dot notation after model creation

`Standardize` — Flag for standardizing feature data
`false` (default) | `true`

Flag for standardizing feature data before generating hashed features, specified as true or false. When Standardize is true, the feature data is standardized such that feature X becomes (X-mean(X))/std(X).
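The standardization formula can be verified directly in base MATLAB. The data vector below is hypothetical.

```matlab
X  = [2; 4; 6; 8];               % hypothetical feature data
Xs = (X - mean(X)) / std(X);     % standardized feature data

% After standardization the data has zero mean and unit standard deviation.
m = mean(Xs);
s = std(Xs);
```

Standardizing keeps features with large numeric ranges from dominating the distance computation.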

You can specify Standardize:

Using a name-value pair when you create the model
Using dot notation after model creation

`LifeTimeVariable` — Lifetime variable
`""` (default) | string

Lifetime variable, specified as a string that contains a valid MATLAB® variable name or "".

When you train the model using the fit function, if your training data is a:

table, then LifeTimeVariable must match one of the variable names in the table
timetable, then LifeTimeVariable must match one of the variable names in the table or the dimension name of the time variable, data.Properties.DimensionNames{1}
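For timetable data, the dimension name of the time variable can be queried as follows. The timetable contents are hypothetical, and the commented fit call assumes a model mdl already exists.

```matlab
% Hypothetical degradation profile stored as a timetable.
data = timetable([0.1; 0.3; 0.6; 1.0], ...
    'RowTimes', hours(1:4)', 'VariableNames', {'Condition'});

% Name of the time dimension, usable as the lifetime variable.
timeName = data.Properties.DimensionNames{1};

% Possible use with a cell array of such timetables:
% fit(mdl, {data}, timeName, "Condition")
```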

You can specify LifeTimeVariable:

Using a name-value pair when you create the model
As an argument when you call the fit function
Using dot notation after model creation

`LifeTimeUnit` — Lifetime variable units
`""` (default) | string

Lifetime variable units, specified as a string.

The units of the lifetime variable do not need to be time-based. The life of the test component can be measured in terms of a usage variable, such as distance traveled (miles) or fuel consumed (gallons).

`DataVariables` — Degradation variable names
`""` (default) | string | string array

Degradation variable names, specified as a string or string array. The strings in DataVariables must be valid MATLAB variable names.

You can specify DataVariables:

Using a name-value pair when you create the model
As an argument when you call the fit function
Using dot notation after model creation

`UseParallel` — Flag for using parallel computing
`false` (default) | `true`

Flag for using parallel computing for hash table generation by the fit function, specified as either true or false.

You can specify UseParallel:

Using a name-value pair when you create the model
Using dot notation after model creation

`UserData` — Additional model information
`[]` (default) | any data type or format

Additional model information for bookkeeping purposes, specified as any data type or format. The model does not use this information.

You can specify UserData:

Using a name-value pair when you create the model
Using dot notation after model creation

Object Functions

`predictRUL`	Estimate remaining useful life for a test component
`fit`	Estimate parameters of remaining useful life model using historical data
`compare`	Compare test data to historical data ensemble for similarity models

Examples

Train Hash Similarity Model

Load training data.

load('hashTrainVectors.mat')

The training data is a cell array of column vectors. Each column vector is a degradation feature profile for a component.

Create a hash similarity model with default settings. By default, the hashed features used by the model are the signal maximum, minimum, and standard deviation values.

mdl = hashSimilarityModel;

Train the similarity model using the training data.

fit(mdl,hashTrainVectors)

Train Hash Similarity Model Using Tabular Data

Load training data.

load('hashTrainTables.mat')

The training data is a cell array of tables. Each table is a degradation feature profile for a component. Each profile consists of life time measurements in the "Time" variable and corresponding degradation feature measurements in the "Condition" variable.

Create a hash similarity model that uses the following values as hashed features:

Mean
Standard deviation
Kurtosis
Median

mdl = hashSimilarityModel('Method',@(x) [mean(x),std(x),kurtosis(x),median(x)]);

Train the similarity model using the training data. Specify the names of the life time and data variables.

fit(mdl,hashTrainTables,"Time","Condition")

Predict RUL Using Hash Similarity Model

Load training data.

load('hashTrainTables.mat')

Create a hash similarity model that uses hours as a life time unit and the following values as hashed features:

Mean
Standard deviation
Kurtosis
Median

mdl = hashSimilarityModel('Method',@(x) [mean(x),std(x),kurtosis(x),median(x)],...
                          'LifeTimeUnit',"hours");

Train the similarity model using the training data. Specify the names of the life time and data variables.

fit(mdl,hashTrainTables,"Time","Condition")

Load testing data. The test data contains the degradation feature measurements for a test component up to the current life time.

load('hashTestData.mat')

Predict the RUL of the test component using the trained similarity model.

estRUL = predictRUL(mdl,hashTestData)

estRUL = duration
   175.69 hr

The estimated RUL for the component is around 176 hours.

Specify Custom Distance Function for Hash Similarity Model

Load the training and test data.

load('hashTrainTables.mat')
load('hashTestData.mat')

Create a coordinate-weighted distance function distanceFunction that contains the following code.

type distanceFunction.m

```
function out = distanceFunction(xTest,xEnsemble)

% Use a function handle to compute a distance that weights each
% coordinate contribution differently.
W = [.1 .2 .3];            % coordinate weights
out = (sqrt((xTest - xEnsemble).^2 * W'));

end
```

Create a hash similarity model that uses hours as the life time unit and the function handle for distanceFunction as the distance measurement.

mdl = hashSimilarityModel('LifeTimeUnit',"hours", 'Distance', @distanceFunction);

Train the similarity model using the training data. Specify the names of the life time and data variables.

fit(mdl,hashTrainTables,"Time","Condition")

Predict the RUL of the test component using the trained similarity model.

estRUL = predictRUL(mdl,hashTestData)

estRUL = duration
   138.87 hr

Extended Capabilities

C/C++ Code Generation
Generate C and C++ code using MATLAB® Coder™.

Usage notes and limitations:

The predictRUL command supports code generation with MATLAB Coder™ for this RUL model type. Before generating code that uses this model, you must save the model using saveRULModelForCoder. For an example, see Generate Code for Predicting Remaining Useful Life.
You cannot change any properties of this RUL model at run time.
For the Method property of hashSimilarityModel, you must use the default value, "minmaxstd". Other values of Method are not supported for code generation.
If you specify the Distance property of hashSimilarityModel as a function handle, it cannot be an anonymous function.
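To check whether a handle is anonymous before generating code, one option is to inspect it with the base MATLAB functions command, which is intended for querying and debugging. In this sketch, isAnon is a hypothetical helper and @norm is only a stand-in for a handle to a named function such as a distance function saved in its own file.

```matlab
hAnon  = @(xTest,xEnsemble) sum(abs(xEnsemble - xTest), 2);  % anonymous handle
hNamed = @norm;   % stand-in for a handle to a named function

% Hypothetical helper: true when the handle is an anonymous function.
isAnon = @(h) strcmp(getfield(functions(h), 'type'), 'anonymous');

anonFlag  = isAnon(hAnon);    % true
namedFlag = isAnon(hNamed);   % false
```

A distance function that passes this check as a named function can be assigned with the @functionName form shown in the custom distance example.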

Automatic Parallel Support
Accelerate code by automatically running computation in parallel using Parallel Computing Toolbox™.

To evaluate these models in parallel, set the UseParallel property to true.

Version History

Introduced in R2018a
