optimalDOE
Description
An optimalDOE
object contains a D-optimal design for an
experiment. The design points in a D-optimal design minimize the covariance of the model
coefficient estimates. Use a D-optimal design when you have a limited number of experimental
runs, or factor constraints that are not suitable for full factorial or mixture
designs.
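For reference, the criterion can be written out as follows. This is a sketch under the usual linear-model assumptions (uncorrelated errors with common variance); the symbols $\hat{\beta}$ and $\sigma^{2}$ are introduced here for illustration and are not used elsewhere on this page.

```latex
\operatorname{Cov}(\hat{\beta}) = \sigma^{2}\,(X^{T}X)^{-1},
\qquad
D = \lvert X^{T}X \rvert
```

Here X is the design matrix for the chosen model. Because the determinant of $(X^{T}X)^{-1}$ is $1/D$, a design that maximizes D minimizes the determinant of the coefficient covariance matrix.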
Creation
Syntax
Description
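dopt = optimalDOE(n,nruns) creates a D-optimal design with n factors and nruns design points.
dopt = optimalDOE(bounds,nruns) specifies the lower and upper bounds for the factors in the design.
dopt = optimalDOE(candset,nruns) specifies the candidate set for the design points.
dopt = optimalDOE(levels1,levels2,...,levelsN,nruns) specifies the number and levels for the factors in the design.
dopt = optimalDOE(___,Name=Value) specifies options using one or more name-value arguments in addition to any of the input argument combinations in the previous syntaxes. For example, you can specify fixed factors and the experiment model.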
Input Arguments
n: Number of factors in the design, specified as a positive integer.
If you do not also specify the NumLevelsPerFactor name-value argument when you pass n to optimalDOE, each factor has two levels. The default range for each factor is [-1,1].
Data Types: single
| double
nruns: Number of design points, specified as a positive integer.
Example: 100
Data Types: single
| double
bounds: Factor bounds, specified as a 2-by-n
matrix, where n is the number of factors in the design. Each column
of bounds
corresponds to a factor. The first row of
bounds
contains the lower bounds for the factors, and
the second row contains the upper bounds.
Example: [0 0.1 10; 5 0.7 50]
Data Types: single
| double
levels1,levels2,...,levelsN: Factor levels, each specified as a numeric, logical, or categorical vector, or a cell array. levels1,...,levelsN must contain levels for each factor in the design.
Example: ["cohorta","cohortb"],[0,0.25,0.5,0.75],["drug1","drug2","drug3"]
Data Types: single
| double
| logical
| char
| string
| cell
| categorical
candset: Candidate set for the design points, specified as a numeric matrix or a table.
Data Types: single
| double
| table
Name-Value Arguments
Specify optional pairs of arguments as
Name1=Value1,...,NameN=ValueN
, where Name
is
the argument name and Value
is the corresponding value.
Name-value arguments must appear after other arguments, but the order of the
pairs does not matter.
Example: optimalDOE(4,100,FixedFactors=[ones(50,1);zeros(50,1)],ModelSpecification="scheffe-quad")
specifies a fixed factor and the quadratic Scheffe model for a design with four factors
and 100 design points.
AvoidDuplicates: Flag to avoid generating duplicate design points, specified as a numeric or logical 1 (true) or 0 (false). If AvoidDuplicates is true and optimalDOE can calculate nonduplicate points, the rows of dopt.Design are unique. If AvoidDuplicates is false, the function does not attempt to avoid duplicate design points, and the rows are not necessarily unique.
Example: AvoidDuplicates=true
Data Types: logical
CategoricalFactors: Categorical factors list, specified as one of the values in this table.
Value | Description |
---|---|
Vector of positive integers | Each entry in the vector is an index value indicating that the corresponding factor is categorical. The index values are between 1 and n, where n is the number of factors in the design. |
Logical vector | A true entry indicates that the corresponding factor is categorical. |
String vector or cell array of character vectors | Each element in the array is the name of a factor. The names must match the entries in FactorNames. |
"all" | All factors are categorical. |
By default, optimalDOE
treats all nonnumeric factors as
categorical.
Example: CategoricalFactors="all"
Data Types: single
| double
| logical
| char
| string
| cell
ExchangeMethod: Algorithm for generating the D-optimal design, specified as "coordinate" or "row".
- "coordinate": Use the coordinate-exchange algorithm to generate a D-optimal design. This value is the default for ExchangeMethod when you do not specify a candidate set for the design. You cannot specify the exchange method as "coordinate" when you specify candset.
- "row": Use the row-exchange algorithm to generate a D-optimal design. This value is the default for ExchangeMethod when you specify a candidate set for the design. You cannot specify the exchange method as "row" when you specify FixedFactors.
For more information about the coordinate-exchange and row-exchange algorithms, see the Algorithms section of cordexch and rowexch.
Example: ExchangeMethod="row"
Data Types: char
| string
ExcludeFcn: Validation function, specified as a function handle. The function must accept a table of design points and return a logical vector that is true for each row that is not a valid design point.
The table input must have n variables, where n is the number of factors in the design. The names of the variables must be the names of the factors. You can specify the factor names by using FactorNames.
The logical vector output must contain the same number of elements as the number of rows in the table input.
When calculating the design, optimalDOE excludes points corresponding to true values in the vector output for the validation function.
Data Types: function_handle
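The following sketch shows one way a validation function could look. It assumes the default factor names Factor1 and Factor2 and the name-value argument name ExcludeFcn (taken from the Properties section); the constraint itself and the variable names excludeFcn, sample, and isExcluded are hypothetical.

```matlab
% Validation function: returns true for rows to exclude
% (hypothetical constraint: the two factors cannot both be high).
excludeFcn = @(tbl) tbl.Factor1 + tbl.Factor2 > 15;

% Check the function on a small hand-built table first.
sample = table([0;10;10],[10;0;10],VariableNames=["Factor1","Factor2"]);
isExcluded = excludeFcn(sample);   % one logical value per table row

% Assumed usage: two factors bounded by [0,10], six design points.
bounds = [0 0; 10 10];
dopt = optimalDOE(bounds,6,ExcludeFcn=excludeFcn);
```

Testing the function handle on a small table before passing it to optimalDOE makes it easier to confirm that the output has one element per row and that true marks the rows to exclude.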
FactorNames: Factor names, specified as a string vector or a cell array of character vectors. The number of unique values in FactorNames must equal the number of factors in the design. The default value for FactorNames is ["Factor1","Factor2",...,"FactorN"].
If you pass levels for a factor using workspace variables in the input argument levels1,levels2,...,levelsN and do not specify FactorNames, optimalDOE assigns the workspace variable name to the corresponding factor.
Example: FactorNames=["compound","quantity"]
Data Types: char
| string
| cell
FixedFactors: Fixed factor values, specified as a numeric matrix or a table.
Fixed factors are held constant while the function varies other factors, which can be useful when you create a blocked design. A blocked design orders design points by the values of a factor.
optimalDOE
uses all factors, including fixed factors, to
calculate design points. The last columns of the design contain the values specified
in FixedFactors
. FixedFactors
must have
nruns
rows.
Example: FixedFactors=[zeros(100,1);ones(100,1)]
Data Types: single
| double
| table
InitialDesign: Initial design to use for the coordinate-exchange or row-exchange algorithm,
specified as an nruns
-by-n
numeric matrix
or a table. You can specify which algorithm to use by setting the
ExchangeMethod
name-value argument. If any of the factors are
nonnumeric, InitialDesign
must be a table.
Data Types: single
| double
| table
IterationLimit: Maximum number of iterations for the algorithm that generates the design points,
specified as a positive integer. To specify the algorithm, use the
ExchangeMethod
name-value argument.
Example: IterationLimit=20
Data Types: single
| double
ModelSpecification: Experiment model, specified as one of the following values.
- A character vector or string scalar with the model name.

Value | Model Description |
---|---|
"linear" | The model contains an intercept and linear term for each factor. |
"constant" | The model contains only a constant (intercept) term. |
"interactions" | The model contains an intercept, a linear term for each factor, and all products of pairs of distinct factors (no squared terms). |
"purequadratic" | The model contains an intercept term, and linear and squared terms for each factor. |
"quadratic" | The model contains an intercept term, linear and squared terms for each factor, and all products of pairs of distinct factors. |
"scheffe-linear" | The model contains a linear term for each factor and does not include an intercept term. |
"scheffe-quad" | The model is given by the formula: $y = \sum_{i} b_i x_i + \sum_{i<j} b_{ij} x_i x_j$ |
"scheffe-special-cubic" | The model is given by the formula: $y = \sum_{i} b_i x_i + \sum_{i<j} b_{ij} x_i x_j + \sum_{i<j<k} b_{ijk} x_i x_j x_k$ |
"polyijk" | The model is a polynomial with all terms up to degree i in the first factor, degree j in the second factor, and so on. Specify the maximum degree for each factor by using numerals 0 through 9. The model contains interaction terms, but the degree of each interaction term does not exceed the maximum value of the specified degrees. For example, "poly13" has an intercept and x1, x2, x2^2, x2^3, x1*x2, and x1*x2^2 terms, where x1 and x2 are the first and second factors, respectively. |

In this table, each $x_i$ corresponds to the ith factor in the D-optimal design, and $b_i$, $b_{ij}$, and $b_{ijk}$ are coefficients for the model terms.
- A character vector or string scalar formula in Wilkinson Notation. The factor names in the formula must be factor names specified by the FactorNames name-value argument.
- A t-by-n terms matrix, where t is the number of terms and n is the number of factors in the design. A terms matrix is convenient when the number of factors is large and you want to generate the terms programmatically. For more information about terms matrices, see Terms Matrix.

ModelSpecification does not include the response variable. optimalDOE generates a design that minimizes the covariance between the estimated coefficients for ModelSpecification.
Example: ModelSpecification="quadratic"
Data Types: single
| double
| char
| string
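The "poly13" description above can be illustrated by building the matching terms matrix by hand. This is a minimal sketch in base MATLAB that follows the stated rule (per-factor degree limits, with no term exceeding the largest specified degree); it is not a description of how optimalDOE builds the model internally, and the names maxDeg and T are illustrative only.

```matlab
% Maximum degree for each factor, as encoded in the name "poly13".
maxDeg = [1 3];

% Enumerate every exponent combination within the per-factor limits.
[e1,e2] = ndgrid(0:maxDeg(1),0:maxDeg(2));
T = [e1(:) e2(:)];

% Keep only terms whose total degree does not exceed the largest limit.
T = T(sum(T,2) <= max(maxDeg),:);
disp(T)
```

Each row of T holds the exponents of x1 and x2 for one term. The seven remaining rows correspond to the intercept, x1, x2, x1*x2, x2^2, x1*x2^2, and x2^3; the x1*x2^3 term is dropped because its total degree is 4.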
NumLevelsPerFactor: Number of levels for each factor, specified as a vector of positive integers. NumLevelsPerFactor must have an element for each factor in the design.
Note
If you specify AvoidDuplicates=true, the software adds levels for any noncategorical factors as needed to avoid duplicate rows in the design.
Example: NumLevelsPerFactor=[3,4,10]
Data Types: single
| double
NumTries: Maximum number of start points for generating the design, specified as a positive integer. If NumTries > 1, then optimalDOE generates NumTries designs from different starting points. The function returns the design with the least amount of covariance between the coefficient estimates for the experiment model.
Example: NumTries=5
Data Types: single
| double
Options: Options for controlling the iterative algorithm to minimize the fitting criteria, specified as a structure array returned by statset.
This table summarizes the supported fields, which require Parallel Computing Toolbox™.
Field | Description |
---|---|
Streams | A RandStream object or cell array of such objects. Use a single object except when UseParallel is true and UseSubstreams is false. In this case, use a cell array the same size as the parallel pool. If a parallel pool is not open, then Streams must supply a single random number stream. |
UseParallel | Set to true to compute in parallel. The default is false. |
UseSubstreams | Set to true to compute in a reproducible fashion. The default is false. To compute reproducibly, set Streams to a type allowing substreams: "mlfg6331_64" or "mrg32k3a". |
To ensure more predictable results, use parpool
(Parallel Computing Toolbox) and explicitly create a parallel pool before calling
optimalDOE
with
Options=statset(UseParallel=1)
.
Example: Options=statset(UseParallel=1)
Data Types: struct
Properties
This property is read-only.
CandidateSet: Candidate set for the design points, represented as a table. This property is set by
the candset
input argument when you create the
optimalDOE
object. If ExchangeMethod
is
"row"
and you do not specify a candidate set,
optimalDOE
automatically generates a candidate set.
Data Types: table
This property is read-only after object creation.
CategoricalFactors: Categorical factors, represented as a vector of indices indicating which factors are
categorical. This property is set by the CategoricalFactors
name-value argument when you create the optimalDOE
object.
Data Types: double
This property is read-only.
Design: Generated design points, represented as a table. Each column of
Design
corresponds to a factor in the design, and each row
corresponds to a point.
Data Types: table
This property is read-only.
ExchangeMethod: Algorithm for generating the design, represented as "coordinate"
or "row"
. This property is set by the
ExchangeMethod
name-value argument when you create the
optimalDOE
object.
Data Types: string
This property is read-only.
ExcludeFcn: Validation function, represented as a function handle. This property is set by the
ExcludeFcn
name-value argument when you create the
optimalDOE
object.
Data Types: function_handle
This property is read-only.
FixedFactors: Fixed factors, represented as a vector of indices indicating which factors are
fixed. This property is set by the FixedFactors
name-value argument
when you create the optimalDOE
object.
Data Types: single
| double
This property is read-only after object creation.
Levels: Factor levels, represented as a cell array with one element per factor. If you specify bounds or levels1,levels2,...,levelsN, the software uses their values to set Levels. Otherwise, the software sets the elements of Levels to have n equally spaced levels in the range [-1 1], where n is determined as follows:
- If you do not specify ModelSpecification or NumLevelsPerFactor, then n equals 2.
- If you specify NumLevelsPerFactor, then n equals NumLevelsPerFactor.
- If you specify ModelSpecification and do not specify NumLevelsPerFactor, then n equals 1 + the maximum order of the ModelSpecification model.
Data Types: cell
This property is read-only.
ModelSpecification: Experiment model, represented as a formula in Wilkinson Notation.
ModelSpecification
indicates the model you want to fit with the
specified design. ModelSpecification
does not include the response
variable.
This property is set by the ModelSpecification
name-value argument when you create the optimalDOE
object.
Data Types: string
This property is read-only.
OptimalityValue: Optimal value for the determinant $D = |X^{T}X|$,
where X is the design matrix, represented as a numeric scalar. For
more information, see the Algorithms section of cordexch
.
Data Types: single
| double
Object Functions
Examples
Generate a D-optimal design with 10 points and four factors.
dopt = optimalDOE(4,10)
dopt = 
  optimalDOE with properties:

                Design: [10×4 table]
    ModelSpecification: "1 + Factor1 + Factor2 + Factor3 + Factor4"
       OptimalityValue: 8.6016e+04
                Levels: {[-1 1]  [-1 1]  [-1 1]  [-1 1]}
    CategoricalFactors: []
          FixedFactors: []
        ExchangeMethod: "coordinate"
dopt
is an optimalDOE
object that contains information about the generated D-optimal design. The output includes the size of the table containing the design points, model for the design, factor levels, and method used to generate the design points. By default, the levels for each factor are -1
and 1
. The output also displays the optimal value for the determinant $|X^{T}X|$, where X is the design matrix.
Display the design table.
dopt.Design
ans=10×4 table
Factor1 Factor2 Factor3 Factor4
_______ _______ _______ _______
1 -1 -1 -1
-1 1 -1 1
-1 1 1 -1
1 -1 1 1
-1 -1 -1 -1
-1 -1 1 -1
-1 -1 -1 1
1 1 -1 -1
1 1 1 1
1 1 1 -1
The design table displays the values for the 10 points in the optimal design.
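As a cross-check, the OptimalityValue shown above can be reproduced from the displayed design points. This is a minimal sketch in base MATLAB that assumes the design matrix for the default linear model is an intercept column followed by the four factor columns; the variable names design, X, and D are illustrative.

```matlab
% Design points copied from the table shown above (columns Factor1 to Factor4).
design = [ 1 -1 -1 -1
          -1  1 -1  1
          -1  1  1 -1
           1 -1  1  1
          -1 -1 -1 -1
          -1 -1  1 -1
          -1 -1 -1  1
           1  1 -1 -1
           1  1  1  1
           1  1  1 -1];

% Design matrix for "1 + Factor1 + Factor2 + Factor3 + Factor4".
X = [ones(size(design,1),1) design];

% D-optimality criterion: determinant of the information matrix X'*X.
D = det(X'*X)
```

Up to floating-point rounding, D equals 86016, which matches the displayed OptimalityValue of 8.6016e+04.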
Generate a D-optimal design and specify the factor bounds for the design points.
bounds = [10 20 30; 20 30 40];
dopt = optimalDOE(bounds,5)
dopt = 
  optimalDOE with properties:

                Design: [5×3 table]
    ModelSpecification: "1 + Factor1 + Factor2 + Factor3"
       OptimalityValue: 8.0000e+06
                Levels: {[10 20]  [20 30]  [30 40]}
    CategoricalFactors: []
          FixedFactors: []
        ExchangeMethod: "coordinate"
dopt
is an optimalDOE
object that contains information about the generated D-optimal design. By default, the levels for the factors are the same as the specified bounds.
Generate some response data for the design points.
pts = dopt.Design;
h = height(pts);
response = 2*pts.Factor1 + 3*pts.Factor2 + pts.Factor3 + 0.01*randn(h,1);
Fit a linear model using the design points in dopt
as the predictor data and response
as the response data.
mdl = fitlm(dopt,response)
mdl = 
Linear regression model:
    y ~ 1 + Factor1 + Factor2 + Factor3

Estimated Coefficients:
                    Estimate         SE         tStat        pValue  
                   __________    __________    ________    __________
    (Intercept)    -0.0085086      0.046507    -0.18295        0.8848
    Factor1            1.9998    0.00095215      2100.3    0.00030311
    Factor2            3.0006    0.00095215      3151.4    0.00020201
    Factor3            1.0001    0.00095215      1050.3    0.00060612

Number of observations: 5, Error degrees of freedom: 1
Root Mean Squared Error: 0.0102
R-squared: 1,  Adjusted R-Squared: 1
F-statistic vs. constant model: 5.53e+06, p-value = 0.000312
mdl
is a LinearModel
object that contains the results of fitting a linear model to the data. The model display includes the model formula, estimated coefficients, and model summary statistics.
Generate data for patient weight using the randi
function. Create variables containing levels for patient age and smoking status.
weight = randi([120 200],50,1);
age = [20 30 40 50];
smoker = ["Y","N"];
Generate a D-optimal design with 20 points, using the unique values in age
, weight
, and smoker
as the factor levels.
dopt = optimalDOE(age,weight,smoker,20)
dopt = 
  optimalDOE with properties:

                Design: [20×3 table]
    ModelSpecification: "1 + age + weight + smoker"
       OptimalityValue: 1.2996e+10
                Levels: {[20 30 40 50]  [122 123 127 130 131 132 133 135 142 145 150 151 154 155 156 159 164 171 172 173 174 176 177 180 181 182 184 185 186 188 193 194 195 196 197 198]  ["N" "Y"]}
    CategoricalFactors: 3
          FixedFactors: []
        ExchangeMethod: "coordinate"
dopt
is an optimalDOE
object that contains information about the generated D-optimal design.
Display the design table.
dopt.Design
ans=20×3 table
age weight smoker
___ ______ ______
20 122 "N"
20 198 "N"
50 198 "Y"
20 122 "Y"
20 198 "Y"
50 122 "N"
50 122 "Y"
20 122 "N"
50 198 "N"
20 198 "N"
50 122 "N"
20 122 "Y"
50 122 "N"
50 198 "Y"
50 198 "N"
50 122 "Y"
⋮
The design table displays the values for the 20 points in the D-optimal design.
Generate a candidate set for the D-optimal design by using the combinations
function. Use the categorical
and randi
functions to create the factor values.
compound = categorical(["compound1","compound2","compound3"]);
age = 17 + randi(83,1,10);
candset = combinations(compound,age)
candset=30×2 table
compound age
_________ ___
compound1 85
compound1 93
compound1 28
compound1 93
compound1 70
compound1 26
compound1 41
compound1 63
compound1 97
compound1 98
compound2 85
compound2 93
compound2 28
compound2 93
compound2 70
compound2 26
⋮
candset
is a table that contains every possible combination of the values in compound
and age
.
Generate a D-optimal design with 15 points using candset
as the candidate set.
dopt = optimalDOE(candset,15)
dopt = 
  optimalDOE with properties:

                Design: [15×2 table]
    ModelSpecification: "1 + compound + age"
       OptimalityValue: 2.3328e+06
                Levels: {[compound1 compound2 compound3]  [26 28 41 63 70 85 93 97 98]}
    CategoricalFactors: 1
          FixedFactors: []
        ExchangeMethod: "row"
          CandidateSet: [30×2 table]
dopt
is an optimalDOE
object that contains information about the generated D-optimal design. The levels for the design factors are the same as the unique values in the columns of candset
.
Determine if the set of design points is a subset of the candidate set by using the ismember
and all
functions.
idx = ismember(dopt.Design,candset,"rows");
all(idx)
ans = logical
1
The output shows that dopt.Design
is a subset of candset
.
Generate levels for three factors by using the categorical
, randi
, ones
, and zeros
functions.
compound = categorical(["compound1","compound2","compound3"]);
age = 17 + randi(83,1,10);
smoker = [ones(10,1);zeros(5,1)];
Generate a D-optimal design with 15 points using compound
and age
as factors and smoker
as a fixed factor. Specify the model for the design, and avoid calculating duplicate points.
dopt = optimalDOE(compound,age,15,FixedFactors=smoker,AvoidDuplicates=true,ModelSpecification="scheffe-linear")
dopt = 
  optimalDOE with properties:

                Design: [15×3 table]
    ModelSpecification: "compound + age + Factor3"
       OptimalityValue: 7.5620e+06
                Levels: {[compound1 compound2 compound3]  [26 28 41 63 70 85 93 97 98]  [0 1]}
    CategoricalFactors: 1
          FixedFactors: 3
        ExchangeMethod: "coordinate"
dopt
is an optimalDOE
object that contains information about the generated D-optimal design. The design table contains 15 points.
Determine if the design points are unique by using the unique
function.
upts = unique(dopt.Design)
upts=15×3 table
compound age Factor3
_________ ___ _______
compound1 26 1
compound1 28 1
compound1 41 1
compound1 97 0
compound1 98 0
compound2 26 1
compound2 28 1
compound2 41 1
compound2 97 0
compound2 98 0
compound2 98 1
compound3 93 1
compound3 97 1
compound3 98 0
compound3 98 1
upts
is a 15x3
table containing the unique points in dopt.Design
. This result indicates that the design points for dopt
are unique.
More About
Terms Matrix
A terms matrix T is a t-by-n matrix specifying the terms in a model, where t is the number of terms, and n is the number of factors in the design. The value of T(i,j) is the exponent of variable j in term i.
For example, suppose that a design includes three factors x1, x2, and x3. Each row of T represents one term:
- [0 0 0]: Constant term or intercept
- [0 1 0]: x2; equivalently, x1^0 * x2^1 * x3^0
- [1 0 1]: x1*x3
- [2 0 0]: x1^2
- [0 1 2]: x2*(x3^2)
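To make the exponent convention concrete, the sketch below evaluates the five example terms at two sample points using base MATLAB only. The sample points and the variable names T, pts, and X are illustrative assumptions, not part of the optimalDOE interface.

```matlab
% Terms matrix from the example: one row per term, one column per factor.
T = [0 0 0
     0 1 0
     1 0 1
     2 0 0
     0 1 2];

% Two hypothetical design points (rows) for the factors x1, x2, x3.
pts = [1 2 3
       2 0 1];

% Evaluate each term as the product of the factors raised to its exponents.
X = ones(size(pts,1),size(T,1));
for i = 1:size(T,1)
    X(:,i) = prod(pts.^T(i,:),2);
end
disp(X)
```

Each column of X holds one model term evaluated at every point. For the first point, the columns are 1, x2 = 2, x1*x3 = 3, x1^2 = 1, and x2*(x3^2) = 18.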
Wilkinson Notation
Wilkinson notation describes the terms in a model. The notation relates to the terms included in the model, not to the multipliers (coefficients) of those terms.
Wilkinson notation uses these symbols:
- + means include the next variable.
- - means do not include the next variable.
- : defines an interaction, which is a product of the terms.
- * defines an interaction and all lower order terms.
- ^ raises the predictor to a power, exactly as in * repeated, so ^ includes lower order terms as well.
- () groups the terms.
This table shows typical examples of Wilkinson notation.
Wilkinson Notation | Terms in Standard Notation |
---|---|
1 | Constant (intercept) term |
x1^k, where k is a positive integer | x1, x1^2, ..., x1^k |
x1 + x2 | x1, x2 |
x1*x2 | x1, x2, x1*x2 |
x1:x2 | x1*x2 only |
-x2 | Do not include x2 |
x1*x2 + x3 | x1, x2, x3, x1*x2 |
x1 + x2 + x3 + x1:x2 | x1, x2, x3, x1*x2 |
x1*x2*x3 - x1:x2:x3 | x1, x2, x3, x1*x2, x1*x3, x2*x3 |
x1*(x2 + x3) | x1, x2, x3, x1*x2, x1*x3 |
For more details, see Wilkinson Notation.
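As an illustration, a Wilkinson formula from the table above could be passed to optimalDOE as the experiment model. This sketch assumes that the factor names x1, x2, and x3 are supplied through FactorNames so that the formula can refer to them; the number of runs is arbitrary.

```matlab
% Assumed usage: three factors named x1, x2, x3 and 12 design points.
% The formula requests x1, x2, x3, and the x1*x2 interaction.
dopt = optimalDOE(3,12, ...
    FactorNames=["x1","x2","x3"], ...
    ModelSpecification="x1*x2 + x3");

% Inspect the model stored with the design.
disp(dopt.ModelSpecification)
```

Writing the model as a formula keeps the specification readable when only a few interactions are needed; a terms matrix is the alternative when terms are generated programmatically.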
Version History
Introduced in R2024b
See Also
fullFactorialDOE
| mixtureDOE
| taguchiDOE
| addruns
| fitlm