# multcompare

Multiple comparison of means for analysis of variance (ANOVA)

Since R2022b

## Syntax

``m = multcompare(aov)``
``m = multcompare(aov,factors)``
``m = multcompare(___,Name=Value)``

## Description

`m = multcompare(aov)` returns a table of results `m` from a multiple comparison of means for a one-way `anova` object.

`m = multcompare(aov,factors)` performs the multiple comparison of means over the combinations of values for the factors listed in `factors`. This syntax is valid for a one-, two-, or N-way ANOVA.

`m = multcompare(___,Name=Value)` specifies additional options using one or more name-value arguments. For example, you can specify the confidence level and the type of critical value used to determine if the means are significantly different.
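As a minimal sketch of the name-value syntax, the following code requests 99% confidence intervals with Bonferroni-corrected critical values. It assumes the `popcorn` sample data set used in the examples on this page and Statistics and Machine Learning Toolbox; the variable names `y` and `brand` are illustrative only.

```matlab
% Sketch only: requires Statistics and Machine Learning Toolbox (R2022b or later).
load popcorn.mat                       % 6-by-3 matrix, one column per brand
y = popcorn(:);                        % stack the columns into one response vector
brand = [repmat("Gourmet",6,1); repmat("National",6,1); repmat("Generic",6,1)];
aov = anova(brand,y,FactorNames="Brand");

% Alpha=0.01 gives 99% confidence intervals; the critical value uses the
% Bonferroni correction instead of the default Tukey-Kramer test.
m = multcompare(aov,Alpha=0.01,CriticalValueType="bonferroni");
disp(m)
```

With three brands, `m` contains one row for each of the three pairwise comparisons.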

## Examples

`load popcorn.mat`

The columns of the 6-by-3 matrix `popcorn` contain popcorn yield observations in cups for the brands Gourmet, National, and Generic.

Convert `popcorn` to a vector.

`popcorn = popcorn(:);`

Create a string array of values for the factor `Brand` using the function `repmat`.

`brand = [repmat("Gourmet",6,1); repmat("National",6,1); repmat("Generic",6,1)];`

Perform a one-way ANOVA to test the null hypothesis that the mean yields are the same across the three brands.

`aov = anova(brand,popcorn,FactorNames="Brand")`
```
aov = 
1-way anova, constrained (Type III) sums of squares.

Y ~ 1 + Brand

             SumOfSquares    DF    MeanSquares     F        pValue  
             ____________    __    ___________    ____    __________

    Brand       15.75         2       7.875       18.9    7.9603e-05
    Error        6.25        15     0.41667                         
    Total          22        17                                     

  Properties, Methods
```

The small p-value indicates that the null hypothesis can be rejected at the 99% confidence level. Therefore, the mean popcorn yield of at least one brand differs significantly from the others. Perform Dunnett's test to determine if the mean yields of `Gourmet` and `National` differ significantly from the mean yield of `Generic`.

`m = multcompare(aov,CriticalValueType="dunnett",ControlGroup=3)`
```
m=2×6 table
      Group1       Group2      MeanDifference    MeanDifferenceLower    MeanDifferenceUpper     pValue  
    __________    _________    ______________    ___________________    ___________________    _________

    "Gourmet"     "Generic"         2.25                 1.341                  3.159          4.402e-05
    "National"    "Generic"         0.75              -0.15904                  1.659            0.11012
```

Each row of `m` contains a p-value for the null hypothesis that the means of the groups in columns `Group1` and `Group2` are equal. The p-value in the first row is small enough to reject the null hypothesis that the mean popcorn yields of `Gourmet` and `Generic` are equal. The p-value in the second row is too large to reject, at the 5% significance level, the null hypothesis that the mean popcorn yields of `National` and `Generic` are equal. The value for `MeanDifference` is positive in the first row; therefore, the mean popcorn yield of `Gourmet` is significantly higher than that of `Generic`.
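The reading of the table above can also be done programmatically. The sketch below flags the rows whose p-value falls below the default significance level of 0.05 and checks that these are the rows whose confidence interval excludes zero. The variable names `alpha`, `isSignificant`, and `excludesZero` are illustrative and not part of the `multcompare` interface.

```matlab
% Sketch only: requires Statistics and Machine Learning Toolbox (R2022b or later).
load popcorn.mat
y = popcorn(:);
brand = [repmat("Gourmet",6,1); repmat("National",6,1); repmat("Generic",6,1)];
aov = anova(brand,y,FactorNames="Brand");
m = multcompare(aov,CriticalValueType="dunnett",ControlGroup=3);

alpha = 0.05;                                   % default significance level
isSignificant = m.pValue < alpha;               % rows that reject equal means
excludesZero = m.MeanDifferenceLower > 0 | m.MeanDifferenceUpper < 0;

% Show only the comparisons that are significant at level alpha
disp(m(isSignificant,["Group1","Group2","MeanDifference","pValue"]))
```

For the results shown above, only the `Gourmet` versus `Generic` row is flagged, and its confidence interval lies entirely above zero.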

`load patients.mat`

Create a table containing variables with factor values for the smoking status and physical location of patients, and the response data for systolic blood pressure.

`tbl = table(Smoker,Location,Systolic)`
```
tbl=100×3 table
    Smoker              Location               Systolic
    ______    _____________________________    ________

    true      {'County General Hospital'  }      124   
    false     {'VA Hospital'              }      109   
    false     {'St. Mary's Medical Center'}      125   
    false     {'VA Hospital'              }      117   
    false     {'County General Hospital'  }      122   
    false     {'St. Mary's Medical Center'}      121   
    true      {'VA Hospital'              }      130   
    false     {'VA Hospital'              }      115   
    false     {'St. Mary's Medical Center'}      115   
    false     {'County General Hospital'  }      118   
    false     {'County General Hospital'  }      114   
    false     {'St. Mary's Medical Center'}      115   
    false     {'VA Hospital'              }      127   
    true      {'VA Hospital'              }      130   
    false     {'St. Mary's Medical Center'}      114   
    true      {'VA Hospital'              }      130   
      ⋮
```

Perform a two-way ANOVA to test the null hypotheses that mean systolic blood pressure is the same for smokers and non-smokers, and that it is the same across locations.

`aov = anova(tbl,"Systolic")`
```
aov = 
2-way anova, constrained (Type III) sums of squares.

Systolic ~ 1 + Smoker + Location

                SumOfSquares    DF    MeanSquares      F         pValue  
                ____________    __    ___________    ______    __________

    Smoker         2154.4        1      2154.4       94.462    5.9678e-16
    Location       46.064        2      23.032       1.0099       0.36811
    Error          2189.5       96      22.807                           
    Total          4461.2       99                                       

  Properties, Methods
```

The p-values indicate that enough evidence exists to conclude that smoking status has a significant effect on blood pressure. However, not enough evidence exists to conclude that physical location has a significant effect.

Investigate the mean differences between the response data from each group.

`m = multcompare(aov,["Smoker","Location"])`
```
m=15×6 table
                    Group1                                     Group2                     MeanDifference    MeanDifferenceLower    MeanDifferenceUpper      pValue  
    _______________________________________    _______________________________________    ______________    ___________________    ___________________    __________

    Smoker              Location               Smoker              Location                                                                                          
    ______    _____________________________    ______    _____________________________                                                                               

    false     {'County General Hospital'  }    true      {'County General Hospital'  }         -9.935             -12.908                -6.9623          7.6385e-15
    false     {'County General Hospital'  }    false     {'VA Hospital'              }          1.516             -1.6761                  4.708             0.73817
    false     {'County General Hospital'  }    true      {'VA Hospital'              }         -8.419             -12.899                -3.9394          5.3456e-06
    false     {'County General Hospital'  }    false     {'St. Mary's Medical Center'}         0.3721             -3.2806                 4.0248             0.99968
    false     {'County General Hospital'  }    true      {'St. Mary's Medical Center'}        -9.5629             -14.637                -4.4886          5.0113e-06
    true      {'County General Hospital'  }    false     {'VA Hospital'              }         11.451              7.2101                 15.692          8.3835e-11
    true      {'County General Hospital'  }    true      {'VA Hospital'              }          1.516             -1.6761                  4.708             0.73817
    true      {'County General Hospital'  }    false     {'St. Mary's Medical Center'}         10.307              5.9931                 14.621          6.5271e-09
    true      {'County General Hospital'  }    true      {'St. Mary's Medical Center'}         0.3721             -3.2806                 4.0248             0.99968
    false     {'VA Hospital'              }    true      {'VA Hospital'              }         -9.935             -12.908                -6.9623          7.6385e-15
    false     {'VA Hospital'              }    false     {'St. Mary's Medical Center'}        -1.1439             -4.8086                 2.5209             0.94367
    false     {'VA Hospital'              }    true      {'St. Mary's Medical Center'}        -11.079             -16.058                -6.0994          6.0817e-08
    true      {'VA Hospital'              }    false     {'St. Mary's Medical Center'}         8.7911              4.3482                 13.234          1.5297e-06
    true      {'VA Hospital'              }    true      {'St. Mary's Medical Center'}        -1.1439             -4.8086                 2.5209             0.94367
    false     {'St. Mary's Medical Center'}    true      {'St. Mary's Medical Center'}         -9.935             -12.908                -6.9623          7.6385e-15
```

Each p-value corresponds to the null hypothesis that the means of the two groups in the same row are equal. The table includes six p-values greater than 0.05, corresponding to the six pairs of groups with the same smoking status value. Therefore, at the 5% significance level, mean systolic blood pressure is not significantly different between groups with the same smoking status value.
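The count of six can be checked directly from the nested `Group1` and `Group2` tables. The following sketch marks the rows that compare groups with the same smoking status and confirms that they are the rows with p-values above 0.05. The names `sameSmoker` and `notSignificant` are illustrative.

```matlab
% Sketch only: requires Statistics and Machine Learning Toolbox (R2022b or later).
load patients.mat
tbl = table(Smoker,Location,Systolic);
aov = anova(tbl,"Systolic");
m = multcompare(aov,["Smoker","Location"]);

% Group1 and Group2 are nested tables, so factor values are reached with
% a second level of dot indexing.
sameSmoker = m.Group1.Smoker == m.Group2.Smoker;
notSignificant = m.pValue > 0.05;

fprintf("Pairs with the same smoking status: %d\n",nnz(sameSmoker));
fprintf("Same rows as p > 0.05: %d\n",isequal(sameSmoker,notSignificant));
```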

## Input Arguments

`aov`: Analysis of variance results, specified as an `anova` object. The properties of `aov` contain the factors and response data used by `multcompare` to compute the difference in means.

`factors`: Factors used to group the response data, specified as a string vector or cell array of character vectors. The `multcompare` function groups the response data by the combinations of values for the factors in `factors`. Each element of `factors` must be one of the names in `aov.FactorNames`.

Example: `["g1","g2"]`

Data Types: `string` | `cell`

### Name-Value Arguments

Specify optional pairs of arguments as `Name1=Value1,...,NameN=ValueN`, where `Name` is the argument name and `Value` is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Example: `Alpha=0.01,CriticalValueType="dunnett",Approximate=true` sets the significance level of the confidence intervals to 0.01 and uses an approximation of Dunnett's critical value to calculate the p-values.

`Alpha`: Significance level for the estimates, specified as a scalar value in the range (0,1). The confidence level of the confidence intervals is $100(1-\alpha)\%$, where $\alpha$ is the value of `Alpha`. The default value for `Alpha` is `0.05`, which returns 95% confidence intervals for the estimates.

Example: `Alpha=0.01`

Data Types: `single` | `double`

`CriticalValueType`: Critical value type used by the `multcompare` function to calculate p-values, specified as one of the options in the following table. Each option specifies the statistical test that `multcompare` uses to calculate the critical value.

| Option | Statistical Test |
| --- | --- |
| `"tukey-kramer"` (default) | Tukey-Kramer test |
| `"hsd"` | Honestly Significant Difference test. Same as `"tukey-kramer"`. |
| `"dunn-sidak"` | Dunn-Sidak correction |
| `"bonferroni"` | Bonferroni correction |
| `"scheffe"` | Scheffe test |
| `"dunnett"` | Dunnett's test. Can be used only when `aov` is a one-way `anova` object or when a single factor is specified in `factors`. Select the control group with the `ControlGroup` name-value argument. |
| `"lsd"` | Least Significant Difference test. Uses the critical value for a plain t-test. This option does not protect against the multiple comparisons problem unless it follows a preliminary overall test such as an F-test. |

Example: `CriticalValueType="dunn-sidak"`

Data Types: `char` | `string`
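To give a feel for how two of the corrections in the table tighten the per-comparison significance level, the sketch below applies the textbook Bonferroni and Dunn-Sidak formulas to a family of pairwise comparisons. The group count `k` is hypothetical, and it is an assumption that `multcompare` applies these formulas in exactly this form internally.

```matlab
% Textbook per-comparison levels; base MATLAB only, no toolbox required.
alpha = 0.05;                 % family-wise significance level
k = 3;                        % hypothetical number of groups
c = nchoosek(k,2);            % number of pairwise comparisons

alphaBonferroni = alpha/c;                  % Bonferroni: split alpha evenly
alphaDunnSidak = 1 - (1 - alpha)^(1/c);     % Dunn-Sidak: exact under independence

fprintf("Comparisons: %d\n",c);
fprintf("Bonferroni per-comparison level: %.6f\n",alphaBonferroni);
fprintf("Dunn-Sidak per-comparison level: %.6f\n",alphaDunnSidak);
```

The Dunn-Sidak level is always slightly larger than the Bonferroni level, so it is marginally less conservative.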

`Approximate`: Indicator to compute the Dunnett critical value approximately, specified as a numeric or logical `1` (`true`) or `0` (`false`). The approximate computation is faster than the exact one. The default for `Approximate` is `true` for an N-way ANOVA with N greater than two, and `false` otherwise. This argument is valid only when `CriticalValueType` is `"dunnett"`.

Example: `Approximate=true`

Data Types: `logical`
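One way to judge whether the approximation matters for a given data set is to run Dunnett's test both ways and put the p-values side by side. This is a sketch using the `popcorn` sample data from the examples; the table `comparison` and its variable names are illustrative, and the size of any difference between the two columns is not something this page specifies.

```matlab
% Sketch only: requires Statistics and Machine Learning Toolbox (R2022b or later).
load popcorn.mat
y = popcorn(:);
brand = [repmat("Gourmet",6,1); repmat("National",6,1); repmat("Generic",6,1)];
aov = anova(brand,y,FactorNames="Brand");

mExact = multcompare(aov,CriticalValueType="dunnett",ControlGroup=3, ...
    Approximate=false);
mApprox = multcompare(aov,CriticalValueType="dunnett",ControlGroup=3, ...
    Approximate=true);

% The mean differences do not depend on the critical value, so only the
% p-values are compared here.
comparison = table(mExact.Group1,mExact.pValue,mApprox.pValue, ...
    VariableNames=["Group1","pValueExact","pValueApprox"]);
disp(comparison)
```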

`ControlGroup`: Index of the control group factor value for Dunnett's test, specified as a positive integer. Factor values are indexed by the order in which they appear in `aov.ExpandedFactorNames`. This argument is valid only when `CriticalValueType` is `"dunnett"`.

Example: `ControlGroup=3`

Data Types: `single` | `double`
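Because the index depends on the ordering in `aov.ExpandedFactorNames`, it can help to display that property before choosing a control group. The sketch below does so for the `popcorn` sample data and then uses index 1. Which brand index 1 refers to is an assumption to confirm from the displayed names and from the resulting table.

```matlab
% Sketch only: requires Statistics and Machine Learning Toolbox (R2022b or later).
load popcorn.mat
y = popcorn(:);
brand = [repmat("Gourmet",6,1); repmat("National",6,1); repmat("Generic",6,1)];
aov = anova(brand,y,FactorNames="Brand");

% Inspect the ordering that ControlGroup indexes into
disp(aov.ExpandedFactorNames)

% Compare every other brand against the first factor value
m = multcompare(aov,CriticalValueType="dunnett",ControlGroup=1);
disp(m(:,["Group1","Group2","MeanDifference","pValue"]))
```

With three brands and one control group, `m` has two rows, one for each non-control brand.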

## Output Arguments

`m`: Multiple comparison procedure results, returned as a table. The table `m` has the following variables:

• `Group1`: Values of the factors in the first comparison group

• `Group2`: Values of the factors in the second comparison group

• `MeanDifference`: Difference in mean response between the observations in `Group1` and the observations in `Group2`

• `MeanDifferenceLower`: Lower bound of the 100(1 - `Alpha`)% confidence interval for the mean difference (95% for the default `Alpha` of 0.05)

• `MeanDifferenceUpper`: Upper bound of the 100(1 - `Alpha`)% confidence interval for the mean difference (95% for the default `Alpha` of 0.05)

• `pValue`: p-value for the null hypothesis that the mean of `Group1` equals the mean of `Group2`

If two or more factors are provided in `factors`, the columns `Group1` and `Group2` contain tables of values for the factors of the groups being compared.
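The difference between the single-factor and multi-factor layouts can be seen by calling `multcompare` both ways on the same `anova` object. This sketch uses the `patients` sample data from the examples; `mOne` and `mTwo` are illustrative names.

```matlab
% Sketch only: requires Statistics and Machine Learning Toolbox (R2022b or later).
load patients.mat
tbl = table(Smoker,Location,Systolic);
aov = anova(tbl,"Systolic");

mOne = multcompare(aov,"Location");               % one factor
mTwo = multcompare(aov,["Smoker","Location"]);    % two factors

% With two factors, Group1 is itself a table with one variable per factor
disp(class(mOne.Group1))
disp(class(mTwo.Group1))
disp(mTwo.Group1.Properties.VariableNames)

% Factor values of the first comparison group in the first row
disp(mTwo.Group1(1,:))
```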

## Version History

Introduced in R2022b