outlierRemoverComponent
Description
outlierRemoverComponent is a pipeline component that removes
outliers. The pipeline component uses the functionality of the rmoutliers
function during the learn phase to identify and remove outlier values for a set of
observations. During the run phase, the component uses the values learned during the learn
phase to remove outlier values in a new data set.
Creation
Description
component = outlierRemoverComponent creates a pipeline component for removing outlier values.
component = outlierRemoverComponent(Name=Value) sets writable Properties using one or more
name-value arguments. For example, you can specify the outlier detection method by using
the Method name-value argument.
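The two creation syntaxes can be sketched as follows. This is a minimal illustration that uses only property names documented on this page; the variable names c1 and c2 are arbitrary.

```matlab
% Default component: every parameter keeps its default value.
c1 = outlierRemoverComponent;

% Same component type, configured at creation with name-value arguments.
c2 = outlierRemoverComponent(Method="quartiles",ThresholdFactor=2);

% Learn parameters stay editable with dot notation until learn is called.
c2.ThresholdFactor = 1.5;
```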
Properties
Structural Parameters
The software sets structural parameters when you create the component. You cannot modify structural parameters after the component is created.
NumDataFlow
This property is read-only after the component is created.
Number of data flow tags to include in the component, specified as a positive
integer scalar. NumDataFlow determines the number of nonzero
elements in InputTags and OutputTags.
For example, if NumDataFlow=3, then InputTags=[1 2 3] and OutputTags=[1 2 3 0].
The 0 output tag corresponds to the logical output argument that indicates which
observations have outlier values.
Example: c = outlierRemoverComponent(NumDataFlow=1)
Data Types: single | double
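The relationship between NumDataFlow and the tag properties can be checked directly after creation. A minimal sketch, assuming the tag values follow the NumDataFlow=3 case described above:

```matlab
% Three data arguments flow through the component.
c = outlierRemoverComponent(NumDataFlow=3);

% Per the description above, the tags are expected to be:
%   InputTags  = [1 2 3]
%   OutputTags = [1 2 3 0]   (0 marks the logical outlier-indicator output)
disp(c.InputTags)
disp(c.OutputTags)
```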
ReferenceInput
This property is read-only after the component is created.
Index of the data argument passed to learn that is used to
detect outliers, specified as a positive integer scalar. For example, if
ReferenceInput=3, then the software finds outliers in the third
data argument.
Example: c = outlierRemoverComponent(ReferenceInput=2)
Data Types: single | double
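As an illustration, the following sketch detects outliers in the second data argument instead of the first. It reuses the carbig data and the learn call form from the Examples section; the variable names X and y are arbitrary, and the expectation that the learned variable names come from y is an inference from the ReferenceInput description, not a documented result.

```matlab
load carbig
X = table(Acceleration,Displacement,Horsepower);
y = table(MPG);

% Use the second data argument (y) as the reference for outlier detection.
c = outlierRemoverComponent(ReferenceInput=2);
[c,newX,newy] = learn(c,X,y);

% The learned variable names should now come from y, not X.
disp(c.VariablesWithOutliers)
```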
Learn Parameters
The software sets learn parameters when you create the component. You can modify learn
parameters using dot notation any time before you use the learn object
function. Any unset learn parameters use the corresponding default values.
Method
Outlier detection method, specified as one of the following values.
| Value | Description |
|---|---|
| "gesd" | For each variable, find outliers by using the generalized extreme Studentized deviate test for outliers. Use ThresholdFactor to specify the alpha value for the test. |
| "grubbs" | For each variable, find outliers by using Grubbs' test, which removes one outlier per iteration based on hypothesis testing. Use ThresholdFactor to specify the alpha value for the test. |
| "mean" | For each variable, outliers are values more than a certain number of standard deviations from the mean. Use ThresholdFactor to specify the number of standard deviations. |
| "median" | For each variable, outliers are values more than a certain number of scaled median absolute deviations (MAD) from the median. Use ThresholdFactor to specify the number of scaled MAD. |
| "percentiles" | For each variable, outliers are values below the lower threshold or above the upper threshold, as specified by Threshold. |
| "quartiles" | For each variable, outliers are values more than a certain number of interquartile ranges below the lower quartile (25 percent) or above the upper quartile (75 percent). Use ThresholdFactor to specify the number of interquartile ranges. |
For more information, see the method argument of rmoutliers.
Example: c = outlierRemoverComponent(Method="mean")
Example: c.Method = "quartiles"
Data Types: char | string
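Because the component builds on rmoutliers, the detection rules can be previewed on a small vector with isoutlier, which accepts the same method names. This is a minimal sketch: the vector x is made-up illustration data, and the assumption is that the component applies the same rules per variable.

```matlab
x = [1 2 3 4 5 6 7 8 9 100];   % made-up data with one extreme value

isMedian    = isoutlier(x,"median");      % 3 scaled MAD from the median
isMean      = isoutlier(x,"mean");        % 3 standard deviations from the mean
isQuartiles = isoutlier(x,"quartiles");   % 1.5 IQR beyond the quartiles

find(isMedian)      % 10: the value 100 is flagged
find(isMean)        % empty: 100 inflates the standard deviation and masks itself
find(isQuartiles)   % 10
```

The "mean" result shows why the choice of method matters for small samples: with 10 observations, no single value can lie more than 3 standard deviations from the mean.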
ThresholdFactor
Outlier detection threshold factor, specified as a nonnegative scalar.
- When Method is "median", the outlier detection threshold factor is the number of scaled MAD, which is 3 by default.
- When Method is "mean", the outlier detection threshold factor is the number of standard deviations from the mean, which is 3 by default.
- When Method is "grubbs" or "gesd", the outlier detection threshold factor is a scalar in the interval (0,1), which represents the alpha value of the hypothesis test. Values close to 0 result in a smaller number of outliers, and values close to 1 result in a larger number of outliers. The default value is 0.05.
- When Method is "quartiles", the outlier detection threshold factor is the number of interquartile ranges, which is 1.5 by default.
You cannot specify ThresholdFactor when the outlier detection
method is "percentiles".
Example: c = outlierRemoverComponent(ThresholdFactor=2.5)
Example: c.ThresholdFactor = 0.01
Data Types: single | double
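The effect of the threshold factor can be sketched with isoutlier on made-up data. The assumption here is that the component applies the factor the same way isoutlier does; the vector x is hypothetical.

```matlab
x = [1 2 3 4 5 6 7 8 9 100];   % made-up data with one extreme value

% Default factor of 3 standard deviations: nothing is flagged.
nDefault = nnz(isoutlier(x,"mean"))                     % 0

% Tighter factor of 2 standard deviations: the value 100 is flagged.
nTight = nnz(isoutlier(x,"mean",ThresholdFactor=2))     % 1

% The same setting on the component, using property names from this page.
c = outlierRemoverComponent(Method="mean",ThresholdFactor=2);
```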
Threshold
Lower and upper percentile thresholds, specified as a nonnegative vector with two elements in the interval [0,100]. The first element indicates the lower percentile threshold, and the second element indicates the upper percentile threshold. The first element must be less than the second element.
You must specify Threshold when the outlier detection method
(Method) is "percentiles". You cannot specify Threshold
for any other outlier detection method.
Example: c = outlierRemoverComponent(Threshold=[10 90])
Example: c.Threshold=[5 95]
Data Types: single | double
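A minimal sketch of the percentile configuration follows. The isoutlier call previews the same kind of rule on made-up data, under the assumption that the component handles percentiles the same way.

```matlab
% Threshold is required for "percentiles"; ThresholdFactor must not be set.
c = outlierRemoverComponent(Method="percentiles",Threshold=[5 95]);

% Preview the rule on made-up data: values below the 5th percentile
% or above the 95th percentile are flagged.
x = 1:100;
flagged = isoutlier(x,"percentiles",[5 95]);
nnz(flagged)
```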
MaxNumOutliers
Maximum number of outliers to remove, specified as a positive integer scalar.
If you do not specify the MaxNumOutliers value, the software
uses the integer nearest to 10 percent of n, where
n is the number of observations in the data arguments of
learn.
You can specify MaxNumOutliers only when the outlier
detection method (Method) is "gesd".
Example: c = outlierRemoverComponent(MaxNumOutliers=20)
Example: c.MaxNumOutliers = 5
Data Types: single | double
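The default value can be worked out by hand. The sketch below uses n = 406, the number of observations in the carbig example on this page, and then shows how to override the default.

```matlab
n = 406;                       % observations in the carbig example
defaultMax = round(0.1*n)      % 41, the integer nearest to 40.6

% Overriding the default requires the "gesd" method.
c = outlierRemoverComponent(Method="gesd",MaxNumOutliers=20);
```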
Run Parameters
The software sets run parameters when you create the component. You can modify the run parameters at any time. Any unset run parameters use the corresponding default values.
RunRemoval
Flag for removing outliers during the run phase, specified as 0
(false) or 1 (true). If you
set RunRemoval to true, then the software
removes observations with outlier values when you use the run
function. If RunRemoval is set to false, the
software does not remove any observations from the data arguments passed to
run.
Example: c = outlierRemoverComponent(RunRemoval=true)
Example: c.RunRemoval = false
Data Types: logical
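The two settings can be compared on the carbig data. This is a sketch under a stated assumption: this page does not show the call form of run, so the code assumes that run mirrors learn and returns the data outputs in order. Check the run reference page before relying on it.

```matlab
load carbig
X = table(Acceleration,Displacement,Horsepower);
y = table(MPG);

c = outlierRemoverComponent;
c = learn(c,X,y);              % learn the thresholds

% Assumed call form: run returns the data outputs in the same order as learn.
c.RunRemoval = false;
[keptX,keptY] = run(c,X,y);    % expected to pass all observations through

c.RunRemoval = true;
[cleanX,cleanY] = run(c,X,y);  % expected to drop observations outside the learned thresholds
```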
Component Properties
The software sets component properties when you create the component. You can modify the
component properties (excluding HasLearnables and
HasLearned) using dot notation at any time. You cannot modify the
HasLearnables and HasLearned properties
directly.
Name
Component identifier, specified as a character vector or string scalar.
Example: c = outlierRemoverComponent(Name="OutlierRemoval")
Example: c.Name = "Removal"
Data Types: char | string
Inputs
Names of the input ports, specified as a character vector, string array, or cell array of character vectors.
Example: c = outlierRemoverComponent(Inputs=["X","Y"])
Example: c.Inputs = ["X1","Y1"]
Data Types: char | string | cell
Outputs
Names of the output ports, specified as a character vector, string array, or cell array of character vectors.
Example: c = outlierRemoverComponent(Outputs=["newX","newY","indices"])
Example: c.Outputs = ["X1","X2","Idx"]
Data Types: char | string | cell
InputTags
Tags that enable the automatic connection of the component inputs with other
components or pipelines, specified as a nonnegative integer vector. If you specify
InputTags, the number of tags must match the number of inputs
in Inputs.
Example: c = outlierRemoverComponent(InputTags=[1 0])
Example: c.InputTags = [1 2]
Data Types: single | double
OutputTags
Tags that enable the automatic connection of the component outputs with other
components or pipelines, specified as a nonnegative integer vector. If you specify
OutputTags, the number of tags must match the number of outputs
in Outputs.
Example: c = outlierRemoverComponent(OutputTags=[1 0 0])
Example: c.OutputTags = [1 2 0]
Data Types: single | double
HasLearnables
This property is read-only.
Indicator for the learnables, returned as 1 (true). A value of 1 indicates that the
component contains Learnables.
Data Types: logical
HasLearned
This property is read-only.
Indicator showing the learning status of the component, returned as
0 (false) or 1 (true). A value of 1 indicates that the
learn object function has been applied to the component, and
the Learnables are nonempty.
Data Types: logical
Learnables
The software sets learnables when you use the learn object
function. You cannot modify learnables directly.
LowerThreshold
This property is read-only.
Lower threshold for identifying outliers, returned as a table. Each value
corresponds to a variable in VariablesWithOutliers.
UpperThreshold
This property is read-only.
Upper threshold for identifying outliers, returned as a table. Each value
corresponds to a variable in VariablesWithOutliers.
Center
This property is read-only.
Center value for identifying outliers, returned as a table. Each value corresponds
to a variable in VariablesWithOutliers.
VariablesWithOutliers
This property is read-only.
Names of the variables used by the component to derive the
LowerThreshold, UpperThreshold, and
Center values. By default, the variables correspond to columns
in the first data argument of learn. You can use ReferenceInput to specify which data argument to use.
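After the learn phase, the learnables can be read with dot notation. A minimal sketch that reuses the carbig data and the learn call form from the Examples section; the variable names are arbitrary.

```matlab
load carbig
cars = table(Acceleration,Displacement,Horsepower);
y = table(MPG);

c = outlierRemoverComponent;
c = learn(c,cars,y);

% Each threshold learnable is a one-row table with one column per reference variable.
lower  = c.LowerThreshold
upper  = c.UpperThreshold
center = c.Center
names  = c.VariablesWithOutliers
```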
Object Functions
| Function | Description |
|---|---|
| learn | Initialize and evaluate pipeline or component |
| run | Execute pipeline or component for inference after learning |
| reset | Reset pipeline or component |
| series | Connect components in series to create pipeline |
| parallel | Connect components or pipelines in parallel to create pipeline |
| view | View diagram of pipeline inputs, outputs, components, and connections |
Examples
Create a pipeline component that removes outlier values in observations.
component = outlierRemoverComponent
component =
outlierRemoverComponent with properties:
Name: "OutlierRemover"
Inputs: ["DataIn1" "DataIn2"]
InputTags: [1 2]
Outputs: [1×3 string]
OutputTags: [1 2 0]
Learnables (HasLearned = false)
LowerThreshold: []
UpperThreshold: []
Center: []
VariablesWithOutliers: []
Structural Parameters (locked)
NumDataFlow: 2
ReferenceInput: 1
Run Parameters (unlocked)
RunRemoval: 0
component is an outlierRemoverComponent object
that contains four learnables: LowerThreshold,
UpperThreshold, Center, and
VariablesWithOutliers. The properties remain empty until you pass
data to the component during the learn phase.
Load the carbig data set. Create a table containing the predictor
variables Acceleration, Displacement, and
Horsepower, and create another table containing the response
variable MPG.
load carbig
cars = table(Acceleration,Displacement,Horsepower);
y = table(MPG);
Use the learn object function to remove observations with outlier
values in cars. The software removes the observations from both
cars and y.
[component,newcars,newy] = learn(component,cars,y);
component
component =
outlierRemoverComponent with properties:
Name: "OutlierRemover"
Inputs: ["DataIn1" "DataIn2"]
InputTags: [1 2]
Outputs: ["DataOut1" "DataOut2" "IsOutlier"]
OutputTags: [1 2 0]
Learnables (HasLearned = true)
LowerThreshold: [1×3 table]
UpperThreshold: [1×3 table]
Center: [1×3 table]
VariablesWithOutliers: ["Acceleration" "Displacement" "Horsepower"]
Structural Parameters (locked)
NumDataFlow: 2
ReferenceInput: 1
Run Parameters (unlocked)
RunRemoval: 0
The LowerThreshold, UpperThreshold,
Center, and VariablesWithOutliers properties are
nonempty, and the HasLearned property is set to
true.
Notice that newcars and newy have fewer
observations than cars and y.
newNumObservations = size([newcars newy],1)
originalNumObservations = size([cars y],1)
newNumObservations = 385
originalNumObservations = 406
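The reported counts give the number and share of removed observations. The sketch below only redoes the arithmetic on the two values shown above.

```matlab
originalNumObservations = 406;
newNumObservations = 385;

numRemoved = originalNumObservations - newNumObservations     % 21
fractionRemoved = numRemoved/originalNumObservations          % about 0.0517
```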
Version History
Introduced in R2026a