hashSimilarityModel

Hashed-feature similarity model for estimating remaining useful life

Description

Use hashSimilarityModel to estimate the remaining useful life (RUL) of a component using a hashed-feature similarity model. This model is useful when you have run-to-failure degradation path histories for an ensemble of similar components, such as multiple machines manufactured to the same specifications, and the data set is large. The hashed-feature similarity model transforms the historical degradation path data for each ensemble member into a series of hashed features, such as the mean, power, minimum, or maximum values of the data. You can then compute the hashed features of a test component and compare them to the hashed features of the ensemble members.

To configure a hashSimilarityModel object, use fit, which computes and stores the hashed feature values of the ensemble data members. Once you configure the parameters of your similarity model, you can then predict the remaining useful life of similar components using predictRUL. For similarity models, the RUL of the test component is estimated as the median statistic of the lifetime span of the most similar components minus the current lifetime value of the test component. For a basic example illustrating RUL prediction, see Update RUL Prediction as Data Arrives.
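The RUL estimate described above can be sketched in base MATLAB. This is a minimal illustration of the median-minus-current-life calculation, not the toolbox implementation; the life spans, distances, and variable names are hypothetical values chosen for the example.

```matlab
% Hypothetical ensemble data (illustrative values only).
lifeSpan    = [200 250 180 300 220];  % total life of each ensemble member, in hours
dist        = [0.9 0.2 1.5 0.4 0.3];  % distance of each member to the test component
currentLife = 100;                    % current life of the test component, in hours
k = 3;                                % number of nearest neighbors to use

% Find the k most similar ensemble members (smallest distances).
[~,idx] = sort(dist);
nearest = idx(1:k);

% RUL estimate: median life span of the nearest members minus the current life.
estRUL = median(lifeSpan(nearest)) - currentLife;
```

In this sketch the three nearest members have life spans of 250, 220, and 300 hours, so the median life span is 250 hours and the estimate is 150 hours.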

For general information on predicting remaining useful life, see Models for Predicting Remaining Useful Life.

Creation

Description

mdl = hashSimilarityModel creates a hashed-feature similarity model for estimating RUL and initializes the model with default settings.

mdl = hashSimilarityModel(initModel) creates a hashed-feature similarity model and initializes the model parameters using an existing hashSimilarityModel object initModel.

mdl = hashSimilarityModel(___,Name,Value) specifies user-settable model properties using name-value pairs. For example, hashSimilarityModel('LifeTimeUnit',"days") creates a hashed-feature similarity model that uses days as the lifetime unit. You can specify multiple name-value pairs. Enclose each property name in quotes.

Input Arguments

initModel

Hashed-feature similarity model, specified as a hashSimilarityModel object.

Properties

HashTable

This property is read-only.

Hashed feature values generated by the fit function, specified as an M-by-N array, where M is the number of ensemble members and N is the number of hashed features. HashTable(i,j) contains the value of the jth hashed feature computed for the ith ensemble member.

To specify the method for computing the hashed features, use the Method property of the model.

RegimeSplit

Breakpoints for splitting historical data into multiple regimes, specified as a row vector of double values, duration objects, or datetime objects. The row vector of breakpoints must:

  • Be in increasing order

  • Have units and a format that are compatible with the training data used with the fit function

To use a single regime, specify RegimeSplit as [].

A separate hash table is generated for each regime. The RUL prediction is based on the similarity to the hashed features in the regime to which the test data belongs. If you change the value of RegimeSplit, then you must retrain your model using fit.

You can specify RegimeSplit:

  • Using a name-value pair when you create the model

  • Using dot notation after model creation
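As a brief sketch of the two ways to set RegimeSplit, the following uses only the name-value and dot-notation forms described above. The breakpoint values are arbitrary assumptions for illustration and must be replaced with values that are compatible with your training data.

```matlab
% Name-value pair at creation; the breakpoint values are illustrative only.
mdl = hashSimilarityModel('RegimeSplit',[100 200]);

% Dot notation after creation. After changing RegimeSplit, retrain with fit.
mdl.RegimeSplit = [50 150 250];

% Revert to a single regime.
mdl.RegimeSplit = [];
```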

LifeSpan

This property is read-only.

Ensemble member life spans, specified as a double vector or duration object vector. The fit function computes these values from the ensemble member degradation profiles.

NumNearestNeighbors

Number of nearest neighbors for RUL estimation, specified as Inf or a finite positive integer. If NumNearestNeighbors is Inf, then predictRUL uses all the ensemble members during estimation.

You can specify NumNearestNeighbors:

  • Using a name-value pair when you create the model

  • Using dot notation after model creation

Method

Hashed feature computation method, specified as one of the following:

  • "minmaxstd" — Extract the minimum, maximum, and standard deviation of the data. This option omits observations that contain NaN. When you use this method, HashTable is M-by-3, where M is the number of ensemble members.

  • Function handle — Use a custom function that takes degradation data as a column vector, table, or timetable, and returns a row vector of features. For example:

    mdl.Method = @(x) [mean(x),std(x),kurtosis(x),median(x)]

You can specify Method:

  • Using a name-value pair when you create the model

  • Using dot notation after model creation
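To make the "minmaxstd" description concrete, the following base MATLAB sketch computes one row of hashed features for a single degradation profile. It is an illustration under assumptions, not the toolbox code: the data values are hypothetical, and the column order [min max std] is assumed from the option name.

```matlab
% Hypothetical degradation profile with one missing observation.
x = [1; 2; NaN; 4; 3];

% Omit observations that contain NaN, as the "minmaxstd" description states.
xValid = x(~isnan(x));

% One row of hashed features for this ensemble member: [min max std].
features = [min(xValid), max(xValid), std(xValid)];
```

Stacking one such row per ensemble member gives an M-by-3 array, which matches the HashTable size stated for this method.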

Distance

Distance computation method, specified as one of the following:

  • "euclidian" — Use the 2-norm of the difference between hash vectors.

  • "absolute" — Use the 1-norm of the difference between hash vectors.

  • Function handle — Use a custom function of the form:

    D = distanceFunction(xTest,xEnsemble)

    Here,

    • xTest is a row vector of length N that contains test component hashed features, where N is the number of hashed features.

    • xEnsemble is an M-by-N array of ensemble component hashed features, where M is the number of ensemble components in the fitted model. xEnsemble(i,:) contains the hashed features for the ith ensemble member.

    • D is a column vector of length M, where D(i) is the distance between the test feature vector and the feature vector of the ith ensemble member.

    For an example of using a custom distance function, see Specify Custom Distance Function for Hash Similarity Model.

You can specify Distance:

  • Using a name-value pair when you create the model

  • Using dot notation after model creation
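The 2-norm and 1-norm distances described above can be written in the custom-function form D = distanceFunction(xTest,xEnsemble). The following sketch uses base MATLAB only; the handle names euclidDist and absDist and the feature values are hypothetical, and the code is not taken from the toolbox.

```matlab
% Hypothetical hashed features: N = 3 features, M = 2 ensemble members.
xTest     = [1 2 3];
xEnsemble = [1 2 3; 4 6 3];

% 2-norm of the difference between hash vectors, one distance per row.
euclidDist = @(xTest,xEnsemble) sqrt(sum((xEnsemble - xTest).^2,2));

% 1-norm of the difference between hash vectors, one distance per row.
absDist = @(xTest,xEnsemble) sum(abs(xEnsemble - xTest),2);

D2 = euclidDist(xTest,xEnsemble);   % M-by-1 column vector
D1 = absDist(xTest,xEnsemble);      % M-by-1 column vector
```

Both handles return a column vector of length M, as a custom distance function must, so either could be assigned to the Distance property.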

IncludeTies

Flag to include ties, specified as true or false. When IncludeTies is true, the model includes all neighbors whose distance to the test component data is less than or equal to the Kth smallest distance, where K is equal to NumNearestNeighbors.

You can specify IncludeTies:

  • Using a name-value pair when you create the model

  • Using dot notation after model creation
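The effect of tie inclusion on neighbor selection can be sketched as follows. This is one reading of the behavior, in which members tied at the Kth smallest distance are all kept; the distances and variable names are hypothetical.

```matlab
dist = [0.5 0.2 0.3 0.3 0.9];   % hypothetical distances to the test component
k = 2;                          % plays the role of NumNearestNeighbors

% Kth smallest distance.
sortedDist = sort(dist);
kthDist = sortedDist(k);

% Without ties: exactly k neighbors.
[~,idx] = sort(dist);
noTies = idx(1:k);

% With ties: every member at or below the Kth smallest distance.
withTies = find(dist <= kthDist);
```

Here members 3 and 4 are tied at the second smallest distance, so the selection grows from two neighbors to three when ties are included.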

Standardize

Flag for standardizing feature data before generating hashed features, specified as true or false. When Standardize is true, the feature data is standardized such that feature X becomes (X-mean(X))/std(X).

You can specify Standardize:

  • Using a name-value pair when you create the model

  • Using dot notation after model creation
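As a small worked example of the standardization formula, the following base MATLAB sketch applies (X-mean(X))/std(X) to a hypothetical feature vector. The data values are assumptions chosen for illustration.

```matlab
X = [2; 4; 6; 8];                  % hypothetical feature data
Xstd = (X - mean(X))./std(X);      % standardized feature: zero mean, unit standard deviation
```

After standardization, features measured on different scales contribute comparably to the hashed features and to the distances computed from them.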

LifeTimeVariable

Lifetime variable, specified as a string that contains a valid MATLAB® variable name or "".

When you train the model using the fit function, if your training data is a:

  • table, then LifeTimeVariable must match one of the variable names in the table

  • timetable, then LifeTimeVariable must match one of the variable names in the timetable or the dimension name of the time variable, data.Properties.DimensionNames{1}

You can specify LifeTimeVariable:

  • Using a name-value pair when you create the model

  • As an argument when you call the fit function

  • Using dot notation after model creation

LifeTimeUnit

Lifetime variable units, specified as a string.

The units of the lifetime variable do not need to be time-based. The life of the test component can be measured in terms of a usage variable, such as distance traveled (miles) or fuel consumed (gallons).

DataVariables

Degradation variable names, specified as a string or string array. The strings in DataVariables must be valid MATLAB variable names.

You can specify DataVariables:

  • Using a name-value pair when you create the model

  • As an argument when you call the fit function

  • Using dot notation after model creation

UseParallel

Flag for using parallel computing for hash table generation by the fit function, specified as either true or false.

You can specify UseParallel:

  • Using a name-value pair when you create the model

  • Using dot notation after model creation

UserData

Additional model information for bookkeeping purposes, specified as any data type or format. The model does not use this information.

You can specify UserData:

  • Using a name-value pair when you create the model

  • Using dot notation after model creation

Object Functions

predictRUL    Estimate remaining useful life for a test component
fit           Estimate parameters of remaining useful life model using historical data
compare       Compare test data to historical data ensemble for similarity models

Examples

Load training data.

load('hashTrainVectors.mat')

The training data is a cell array of column vectors. Each column vector is a degradation feature profile for a component.

Create a hash similarity model with default settings. By default, the hashed features used by the model are the signal maximum, minimum, and standard deviation values.

mdl = hashSimilarityModel;

Train the similarity model using the training data.

fit(mdl,hashTrainVectors)

Load training data.

load('hashTrainTables.mat')

The training data is a cell array of tables. Each table is a degradation feature profile for a component. Each profile consists of life time measurements in the "Time" variable and corresponding degradation feature measurements in the "Condition" variable.

Create a hash similarity model that uses the following values as hashed features:

  • Mean

  • Standard deviation

  • Kurtosis

  • Median

mdl = hashSimilarityModel('Method',@(x) [mean(x),std(x),kurtosis(x),median(x)]);

Train the similarity model using the training data. Specify the names of the life time and data variables.

fit(mdl,hashTrainTables,"Time","Condition")

Load training data.

load('hashTrainTables.mat')

The training data is a cell array of tables. Each table is a degradation feature profile for a component. Each profile consists of life time measurements in the "Time" variable and corresponding degradation feature measurements in the "Condition" variable.

Create a hash similarity model that uses hours as a life time unit and the following values as hashed features:

  • Mean

  • Standard deviation

  • Kurtosis

  • Median

mdl = hashSimilarityModel('Method',@(x) [mean(x),std(x),kurtosis(x),median(x)],...
                          'LifeTimeUnit',"hours");

Train the similarity model using the training data. Specify the names of the life time and data variables.

fit(mdl,hashTrainTables,"Time","Condition")

Load testing data. The test data contains the degradation feature measurements for a test component up to the current life time.

load('hashTestData.mat')

Predict the RUL of the test component using the trained similarity model.

estRUL = predictRUL(mdl,hashTestData)
estRUL = duration
   175.69 hr

The estimated RUL for the component is around 176 hours.

Load the training and test data.

load('hashTrainTables.mat')
load('hashTestData.mat')

Create a coordinate-weighted distance function distanceFunction that contains the following code.

type distanceFunction.m
function out = distanceFunction(xTest,xEnsemble)

% Compute a weighted distance in which each coordinate (hashed feature)
% contributes differently.
W = [.1 .2 .3];            % coordinate weights, one per hashed feature
out = sqrt((xTest - xEnsemble).^2 * W');   % column vector, one distance per ensemble member
end

Create a hash similarity model that uses hours as the life time unit and the function handle for distanceFunction as the distance measurement.

mdl = hashSimilarityModel('LifeTimeUnit',"hours", 'Distance', @distanceFunction);

Train the similarity model using the training data. Specify the names of the life time and data variables.

fit(mdl,hashTrainTables,"Time","Condition")

Predict the RUL of the test component using the trained similarity model.

estRUL = predictRUL(mdl,hashTestData)
estRUL = duration
   138.87 hr

Version History

Introduced in R2018a