stats

Analysis of variance (ANOVA) table

Since R2022b

Syntax

s = stats(aov)

s = stats(aov,type)

s = stats(aov,Component=sstype)

[s,ems] = stats(___)

Description

s = stats(aov) returns a component ANOVA table for the anova object aov. The component ANOVA table contains statistics for the model terms, error, and total. For more information, see s.

s = stats(aov,type) specifies whether to return a component or summary ANOVA table. The summary ANOVA table includes summary statistics for the linear and nonlinear model terms, regression, error, and total. For more information, see s.

s = stats(aov,Component=sstype) specifies the sum of squares type used to create the component table.

[s,ems] = stats(___) also returns the table ems, which contains information about the expected mean squares for each term and the error. If you specify sstype in the call to stats, then the software creates the ems table using the specified sum of squares type.

Examples

Display Summary Table for Two-Way ANOVA

Load popcorn yield data.

load popcorn.mat

The columns of the 6-by-3 matrix popcorn contain popcorn yield observations in cups for the brands Gourmet, National, and Generic. The first three rows of popcorn correspond to popcorn that was popped using an air popper and the last three rows correspond to popcorn popped in oil.

Create string arrays of factor values for the brand and type of popper using the repmat function.

brand = [repmat("Gourmet",6,1); repmat("National",6,1); repmat("Generic",6,1)];
popperType = repmat(["Air";"Air";"Air";"Oil";"Oil";"Oil"], [3, 1]);
factors = {brand,popperType};

Perform a two-way ANOVA to test the null hypothesis that the mean popcorn yield is not affected by the brand of popcorn or the popper type.

aov = anova(factors,popcorn(:),FactorNames=["Brand","PopperType"],ModelSpecification="interactions")

aov = 
2-way anova, constrained (Type III) sums of squares.

Y ~ 1 + Brand*PopperType

                        SumOfSquares    DF    MeanSquares     F        pValue  
                        ____________    __    ___________    ____    __________

    Brand                    15.75       2        7.875      56.7     7.679e-07
    PopperType                 4.5       1          4.5      32.4    0.00010037
    Brand:PopperType      0.083333       2     0.041667       0.3       0.74622
    Error                   1.6667      12      0.13889                        
    Total                       22      17                                     


  Properties, Methods

By default, anova displays a component ANOVA table.

Generate a summary ANOVA table.

s = stats(aov,"summary")

s=5×5 table
                  SumOfSquares    DF    MeanSquares      F        pValue  
                  ____________    __    ___________    _____    __________

    Linear             20.25       3         6.75       48.6    5.4835e-07
    NonLinear       0.083333       2     0.041667        0.3       0.74622
    Regression        20.333       5       4.0667      29.28    2.5065e-06
    Error             1.6667      12      0.13889                         
    Total                 22      17       1.2941

The row Linear corresponds to the terms Brand and PopperType in the ANOVA model. The small p-value in the Linear row indicates that Brand and PopperType have a statistically significant combined effect on the popcorn yield. The row NonLinear corresponds to the term Brand:PopperType. The large p-value in the NonLinear row indicates that the interaction term does not have a statistically significant effect on the popcorn yield. The small p-value in the row Regression indicates that the ANOVA model is a better predictor of the response data than the mean of the data.
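As a quick check on how the summary rows relate to the component table displayed earlier, the arithmetic below combines the component rows of this example. The data here is balanced, so the component sums of squares add up exactly; that need not hold for unbalanced data.

$$
\begin{aligned}
SS_{Linear} &= 15.75 + 4.5 = 20.25, & DF_{Linear} &= 2 + 1 = 3, & F_{Linear} &= \frac{20.25/3}{0.13889} \approx 48.6 \\
SS_{Regression} &= 20.25 + 0.083333 \approx 20.333, & DF_{Regression} &= 3 + 2 = 5, & F_{Regression} &= \frac{20.333/5}{0.13889} \approx 29.28
\end{aligned}
$$

In both rows the denominator is the mean squared error of the full model, 1.6667/12 ≈ 0.13889.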

Display Expected Mean Squares Table for Two-Way ANOVA

Load the sample car data.

load carsmall

Data for the country of origin, model year, and mileage is stored in the variables Origin, Model_Year, and MPG, respectively.

Perform a two-way ANOVA to test the null hypothesis that mean mileage is not affected by the country of origin or model year.

aov = anova({Origin, Model_Year},MPG,RandomFactors=[1 2],FactorNames=["Origin" "Year"])

aov = 
2-way anova, constrained (Type III) sums of squares.

Y ~ 1 + Origin + Year

              SumOfSquares    DF    MeanSquares      F         pValue  
              ____________    __    ___________    ______    __________

    Origin       1078.1        5      215.62       10.675    5.3303e-08
    Year         2638.4        2      1319.2       65.312    5.5975e-18
    Error          1737       86      20.198                           
    Total        6005.3       93                                       


  Properties, Methods

Display an expected mean squares table for the ANOVA.

[~,ems] = stats(aov)

ems=3×5 table
                Type         ExpectedMeanSquares        MeanSquaresDenominator    DFDenominator    FDenominator
              ________    __________________________    ______________________    _____________    ____________

    Origin    "random"    "9.159*V(Origin)+V(Error)"            20.198                  86          MS(Error)  
    Year      "random"    "29.5014*V(Year)+V(Error)"            20.198                  86          MS(Error)  
    Error     "random"    "V(Error)"

The formulas for the expected mean squares of the random factors Origin and Year contain terms for their respective variance components. You can use the expected mean squares formulas to compare how much of the expected mean squares is due to the variance in the error and how much is due to the variance components of the random terms.
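One way to make this comparison concrete is a rough method-of-moments calculation: replace each expected mean square with the observed MeanSquares value, replace V(Error) with MS(Error), and solve for the variance component. This is an illustrative sketch under that assumption, not necessarily how the software estimates variance components.

$$
\hat{V}(Origin) \approx \frac{215.62 - 20.198}{9.159} \approx 21.3, \qquad
\hat{V}(Year) \approx \frac{1319.2 - 20.198}{29.5014} \approx 44.0
$$

Under this calculation, the estimated variance component for Origin is comparable to the error variance estimate of 20.198, and the one for Year is roughly twice as large.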

Input Arguments

`aov` — Analysis of variance results
`anova` object

Analysis of variance results, specified as an anova object. The properties of aov contain the factors and response data used by stats to compute the statistics in the ANOVA table.

`type` — Type of ANOVA table
`"component"` (default) | `"summary"`

Type of ANOVA table, specified as "component" or "summary".

Example: "summary"

Data Types: char | string

`sstype` — Type of sum of squares
`"three"` (default) | `"two"` | `"one"` | `"hierarchical"`

Type of the sum of squares used to perform the ANOVA, specified as "three", "two", "one", or "hierarchical". The stats function ignores sstype unless the ANOVA type is "component". For a model containing main effects but no interactions, the value of sstype affects the computations only when the data is unbalanced.

The sum of squares of a term ($SS_{Term}$) is defined as the reduction in the sum of squares error (SSE) obtained by adding the term to a model that excludes it. The formula for the sum of squares of a term Term has the form

$SS_{Term} = \underbrace{\sum_{i=1}^{n} \left(y_{i} - f_{excl}(g_{1}, \ldots, g_{N})\right)^{2}}_{SSE_{f_{excl}}} - \underbrace{\sum_{i=1}^{n} \left(y_{i} - f_{incl}(g_{1}, \ldots, g_{N})\right)^{2}}_{SSE_{f_{incl}}}$

where n is the number of observations, $y_{i}$ are the response data, $g_{1}, \ldots, g_{N}$ are the factors used to perform the ANOVA, $f_{excl}$ is a model that excludes Term, and $f_{incl}$ is a model that includes Term. Both $f_{excl}$ and $f_{incl}$ are determined by sstype. The quantities $SSE_{f_{excl}}$ and $SSE_{f_{incl}}$ are the sum of squares errors for $f_{excl}$ and $f_{incl}$, respectively. You can specify $f_{excl}$ and $f_{incl}$ using one of the options for sstype described in the following table.

Option	Type of Sum of Squares
`"three"` (default)	$f_{incl}$ is the full ANOVA model specified in the property `Formula`. $f_{excl}$ is a model composed of all terms in $f_{incl}$ except Term. The model $f_{excl}$ has the same sigma-restricted coding as $f_{incl}$. This type of sum of squares is known as Type III.
`"two"`	$f_{excl}$ is a model composed of all terms in the ANOVA model specified in the property `Formula` that do not contain Term. If Term is a continuous term, then powers of Term are treated as separate terms that do not contain Term. $f_{incl}$ is a model composed of Term and all the terms in $f_{excl}$. This type of sum of squares is known as Type II.
`"one"`	$f_{excl}$ is a model composed of all the terms that precede Term in the ANOVA model specified in the property `Formula`. $f_{incl}$ is a model composed of Term and all the terms in $f_{excl}$. This type of sum of squares is known as Type I.
`"hierarchical"`	$f_{excl}$ and $f_{incl}$ are defined as in Type II, except that powers of Term are treated as terms that contain Term.

Example: Component="hierarchical"

Data Types: char | string
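The definition of a term's sum of squares as a difference of two SSE values can be checked by hand. The following is a minimal sketch, assuming the popcorn data from the first example and using ordinary least squares on indicator (dummy) variables rather than the sigma-restricted coding that the software uses; for these nested main-effects models the two codings span the same column space, so the SSE values agree. The names `Xexcl`, `Xincl`, and `sse` are hypothetical helpers introduced only for this illustration.

```matlab
load popcorn.mat
y = popcorn(:);                                  % 18-by-1 response vector
brand = categorical([repmat("Gourmet",6,1); repmat("National",6,1); repmat("Generic",6,1)]);
popper = categorical(repmat(["Air";"Air";"Air";"Oil";"Oil";"Oil"],3,1));

B = dummyvar(brand);                             % 18-by-3 indicator matrix
P = dummyvar(popper);                            % 18-by-2 indicator matrix

% Drop one indicator column per factor so each design matrix has full rank.
X0    = ones(18,1);                              % intercept only
Xexcl = [X0 B(:,2:end)];                         % intercept + Brand
Xincl = [Xexcl P(:,2:end)];                      % intercept + Brand + PopperType

% Sum of squares error of a least-squares fit with design matrix X.
sse = @(X) sum((y - X*(X\y)).^2);

ssBrand  = sse(X0) - sse(Xexcl);                 % Brand entered first (Type I ordering)
ssPopper = sse(Xexcl) - sse(Xincl);              % PopperType added after Brand
```

Because this data set is balanced, `ssBrand` and `ssPopper` should agree with the SumOfSquares values 15.75 and 4.5 in the component table regardless of the sum of squares type. With unbalanced data, the choice of $f_{excl}$ and $f_{incl}$ would change the result.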

Output Arguments

`s` — ANOVA statistics
table

ANOVA statistics, returned as a table.

The contents of s depend on the ANOVA type specified in type.

If type is "component", then s contains ANOVA statistics for each variable in the model except the constant (intercept) term. The table includes these columns for each variable:

Column	Description
`SumOfSquares`	Sum of squares explained by the term and calculated depending on `sstype`.
`DF`	Degrees of freedom. `DF` of a numeric variable is 1. `DF` of a categorical variable is the number of dummy variables created for the category (number of categories – 1). `DF` of an error term is the difference between the `DF` of the total and the sum of the `DF` for the model terms. `DF` of the total is `aov.NumObservations` – 1.
`MeanSquares`	Mean squares, defined by `MeanSquares` = `SumOfSquares`/`DF`. `MeanSquares` for the error term is the mean squared error (MSE).
`F`	F-statistic value to test the null hypothesis that the corresponding coefficient is zero; computed by `F` = `MeanSquares`/`MSE`. When the null hypothesis is true, the F-statistic follows the F-distribution.
`pValue`	p-value of the F-statistic value
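The column definitions above can be reproduced from the table itself. The following is a sketch for the fixed-effects popcorn model from the first example; it assumes the returned table uses the term names as row names, as the displayed output suggests, and it uses `fcdf` from Statistics and Machine Learning Toolbox for the upper-tail probability. For models with random factors the F-statistic denominator can differ from the MSE.

```matlab
load popcorn.mat
brand = [repmat("Gourmet",6,1); repmat("National",6,1); repmat("Generic",6,1)];
popperType = repmat(["Air";"Air";"Air";"Oil";"Oil";"Oil"],3,1);
aov = anova({brand,popperType},popcorn(:), ...
    FactorNames=["Brand","PopperType"],ModelSpecification="interactions");

s = stats(aov);                                   % component table (default)
terms = ["Brand" "PopperType" "Brand:PopperType"];

mse = s{"Error","SumOfSquares"}/s{"Error","DF"};  % mean squared error
ms  = s{terms,"SumOfSquares"}./s{terms,"DF"};     % MeanSquares = SumOfSquares/DF
F   = ms./mse;                                    % F = MeanSquares/MSE
p   = fcdf(F,s{terms,"DF"},s{"Error","DF"},"upper");  % upper-tail F probability
```

The vectors `ms`, `F`, and `p` should match the MeanSquares, F, and pValue columns of `s` for the three model terms.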

If type is "summary", then s contains summary statistics of grouped terms for each row. The summary statistics are calculated using Type I sum of squares. The table includes the same columns as "component" and these rows:

Row	Description
`Total`	Total statistics. `SumOfSquares`: total sum of squares, which is the sum of the squared deviations of the response around its mean. `DF`: sum of the degrees of freedom of `Regression` and `Error`.
`Regression`	Statistics for the model as a whole. `SumOfSquares`: model sum of squares, which is the sum of the squared deviations of the fitted values around the response mean. `F` and `pValue`: a test of whether the model as a whole fits significantly better than a degenerate model consisting of only a constant term.
`Linear`	Statistics for linear terms. `SumOfSquares`: sum of squares for linear terms, which is the difference between the model sum of squares and the sum of squares for nonlinear terms. `F` and `pValue`: a test of whether the model with only linear terms fits better than a degenerate model consisting of only a constant term. `stats` uses the mean squared error that is based on the full model to compute this F-value, so the F-value obtained by dropping the nonlinear terms and repeating the test is not the same as the value in this row.
`NonLinear`	Statistics for nonlinear terms. `SumOfSquares`: sum of squares for nonlinear (higher-order or interaction) terms, which is the increase in the residual sum of squares obtained by keeping only the linear terms and dropping all nonlinear terms. `F` and `pValue`: a test of whether the full model fits significantly better than a smaller model consisting of only the linear terms.
`Error`	Statistics for error. `SumOfSquares`: residual sum of squares, which is the sum of the squared residual values. `MeanSquares`: mean squared error, used to compute the F-statistic values for `Regression`, `Linear`, and `NonLinear`. If the data contains replications (multiple observations sharing the same factor values), `s` also contains rows for `LackOfFit` and `PureError`, which break down `Error` further.
`LackOfFit`	Lack-of-fit statistics. `SumOfSquares`: sum of squares due to lack of fit, which is the difference between the residual sum of squares and the replication sum of squares. `F` and `pValue`: the F-statistic value is the ratio of lack-of-fit `MeanSquares` to pure error `MeanSquares`. The ratio provides a test of bias by measuring whether the variation of the residuals is larger than the variation of the replications. A low p-value implies that adding terms to the model can improve the fit.
`PureError`	Statistics for pure error. `SumOfSquares`: replication sum of squares, obtained by finding the sets of points with identical predictor values, computing the sum of squared deviations around the mean within each set, and pooling the computed values. `MeanSquares`: model-free pure error variance estimate of the response.
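The replication sum of squares described in the `PureError` row can be sketched directly from the data. The example below is an illustration under stated assumptions, not the software's implementation: it uses the popcorn data from the first example and the base MATLAB functions `findgroups` and `splitapply` to pool the within-set squared deviations. The variable names are hypothetical.

```matlab
load popcorn.mat
y = popcorn(:);
brand = [repmat("Gourmet",6,1); repmat("National",6,1); repmat("Generic",6,1)];
popperType = repmat(["Air";"Air";"Air";"Oil";"Oil";"Oil"],3,1);

% Label each observation with the index of its (brand, popper type) combination.
G = findgroups(brand,popperType);

% Sum of squared deviations around the mean within each replicate set.
ssWithin = splitapply(@(v) sum((v - mean(v)).^2),y,G);

ssPure = sum(ssWithin);            % pooled replication sum of squares
dfPure = numel(y) - max(G);        % observations minus number of replicate sets
msPure = ssPure/dfPure;            % model-free variance estimate
```

For this data, the model with interactions fits every cell mean, so `ssPure` should agree with the Error sum of squares (about 1.6667 on 12 degrees of freedom) and no lack of fit remains. With a main-effects-only model, the difference between the residual sum of squares and `ssPure` would be the lack-of-fit sum of squares.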

`ems` — Expected mean squares information
table

Expected mean squares information, returned as a table. The table ems contains a row for each term and a row for the error, and has the following variables.

Type — An indicator of whether the term is fixed or random.
ExpectedMeanSquares — A formula of the expected mean squares.
MeanSquaresDenominator — The value of the denominator in the calculation of the F-statistic.
DFDenominator — The value of the degrees of freedom in the calculation of the F-statistic denominator.
FDenominator — A formula for the denominator in the calculation of the F-statistic. The denominator changes depending on whether aov.Formula has random interaction terms.

You can use the ems table to determine whether the variance of a random term has a large effect on the expected mean squares.

Data Types: table
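The denominator columns of ems can be combined with the component table to recompute an F-statistic. The following sketch uses the carsmall model from the second example; it assumes that `MeanSquaresDenominator` and `DFDenominator` are numeric table variables and that both tables use the term names as row names, as the displayed output suggests.

```matlab
load carsmall
aov = anova({Origin,Model_Year},MPG,RandomFactors=[1 2], ...
    FactorNames=["Origin" "Year"]);

[s,ems] = stats(aov);

% F-statistic for Origin: its mean squares divided by the ems denominator.
F = s{"Origin","MeanSquares"}/ems{"Origin","MeanSquaresDenominator"};

% Upper-tail probability with the numerator and denominator degrees of freedom.
p = fcdf(F,s{"Origin","DF"},ems{"Origin","DFDenominator"},"upper");
```

Here the denominator is MS(Error), so `F` should match the F value for Origin in the component table. In models with random interaction terms the denominator can be a different mean square, which is why reading it from ems is safer than assuming the MSE.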

Version History

Introduced in R2022b
