anova1

One-way analysis of variance

Syntax

p = anova1(X)
p = anova1(X,group)
p = anova1(X,group,displayopt)
[p,table] = anova1(...)
[p,table,stats] = anova1(...)

Description

p = anova1(X) performs balanced one-way ANOVA for comparing the means of two or more columns of data in the matrix X, where each column represents an independent sample containing mutually independent observations. The function returns the p-value under the null hypothesis that all samples in X are drawn from populations with the same mean.

If p is near zero, it casts doubt on the null hypothesis and suggests that at least one sample mean differs significantly from the other sample means. Commonly used significance levels are 0.05 and 0.01.

The anova1 function displays two figures: the standard ANOVA table and a box plot of the columns of X.

The standard ANOVA table divides the variability of the data into two parts:

  • Variability due to the differences among the column means (variability between groups)

  • Variability due to the differences between the data in each column and the column mean (variability within groups)

The standard ANOVA table has six columns:

  1. The source of the variability.

  2. The sum of squares (SS) due to each source.

  3. The degrees of freedom (df) associated with each source.

  4. The mean squares (MS) for each source, which is the ratio SS/df.

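  5. The F-statistic, which is the ratio of the between-groups mean square to the within-groups mean square.

  6. The p-value, which is the probability under the null hypothesis of an F-statistic at least as large as the one observed. It is derived from the cdf of the F distribution.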
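
The quantities in the table can be reproduced by hand. The following is a minimal sketch, not the function's internal implementation, that uses the beam-deflection data from the Examples section. The numeric group labels in g and all variable names are illustrative assumptions.

```matlab
% Beam deflections from the Examples section. g is a numeric group label
% chosen for this sketch (1 = steel, 2 = alloy 1, 3 = alloy 2).
y = [82 86 79 83 84 85 86 87 74 82 78 75 76 77 79 79 77 78 82 79];
g = [ones(1,8) 2*ones(1,6) 3*ones(1,6)];

k = max(g);               % number of groups
n = numel(y);             % total number of observations
grandMean = mean(y);

ssBetween = 0;            % variability between groups
ssWithin  = 0;            % variability within groups
for j = 1:k
    yj = y(g == j);
    ssBetween = ssBetween + numel(yj) * (mean(yj) - grandMean)^2;
    ssWithin  = ssWithin  + sum((yj - mean(yj)).^2);
end

dfBetween = k - 1;
dfWithin  = n - k;
msBetween = ssBetween / dfBetween;    % MS = SS/df
msWithin  = ssWithin  / dfWithin;
F = msBetween / msWithin;             % 15.4 for this data

% Upper-tail probability of the F distribution
pManual = 1 - fcdf(F, dfBetween, dfWithin);

% Compare with the built-in result, suppressing the figures
pBuiltin = anova1(y, g, 'off');
```

pManual and pBuiltin should agree to within floating-point rounding.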

The box plot of the columns of X suggests the size of the F-statistic and the p-value. Large differences in the center lines of the boxes correspond to large values of F and correspondingly small values of p.

anova1 treats NaN values as missing, and disregards them.
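
As a small illustration of the missing-value handling, the sketch below compares a matrix containing one NaN with the same data passed as a vector after the NaN is removed by hand. The data values are made up for illustration only.

```matlab
% Hypothetical data: the third column has one missing observation.
X = [1 2 3; 2 3 4; 3 4 NaN];
pWithNaN = anova1(X, [], 'off');

% Same data in vector form with the NaN dropped explicitly.
y = X(:);                       % column-major: column 1, then 2, then 3
g = repmat(1:3, size(X,1), 1);  % column index of every element of X
g = g(:);
keep = ~isnan(y);
pDropped = anova1(y(keep), g(keep), 'off');
```

Under the behavior described above, the two p-values are expected to match.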

p = anova1(X,group) performs ANOVA by group. For more information on grouping variables, see Grouping Variables.

If X is a matrix, anova1 treats each column as a separate group, and evaluates whether the population means of the columns are equal. This form of anova1 is appropriate when each group has the same number of elements (balanced ANOVA). group can be a character array or a cell array of strings, with one row per column of X, containing group names. Enter an empty array ([]) or omit this argument if you do not want to specify group names.

If X is a vector, group must be a categorical variable, vector, string array, or cell array of strings with one name for each element of X. X values corresponding to the same value of group are placed in the same group. This form of anova1 is appropriate when groups have different numbers of elements (unbalanced ANOVA).
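
The two calling forms can be compared on the same balanced data. This is a sketch with made-up values; the group names 'A', 'B', and 'C' are placeholders.

```matlab
% Hypothetical balanced data: three groups (columns), four observations each.
X = [10 12 15; 11 14 16; 9 13 17; 12 11 18];
names = {'A','B','C'};          % one name per column of X

% Matrix form: each column is a group.
pMatrix = anova1(X, names, 'off');

% Vector form: one group name per element of the data vector.
y = X(:);
g = repmat(names, size(X,1), 1);
g = g(:);                       % same column-major order as X(:)
pVector = anova1(y, g, 'off');
```

Both calls describe the same grouping, so the p-values should be equal. The vector form also accepts groups of different sizes.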

If group contains empty or NaN-valued cells or strings, the corresponding observations in X are disregarded.

p = anova1(X,group,displayopt) enables the ANOVA table and box plot displays when displayopt is 'on' (default) and suppresses the displays when displayopt is 'off'. Notches in the box plot provide a test of group medians (see boxplot), which is distinct from the F test for means in the ANOVA table.

[p,table] = anova1(...) returns the ANOVA table (including column and row labels) in the cell array table. Copy a text version of the ANOVA table to the clipboard using the Copy Text item on the Edit menu.
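
Individual entries can be read from the cell array by row and column. The sketch below assumes the layout described above: a label row first, followed by the between-groups, error, and total rows, with columns in the order Source, SS, df, MS, F, p-value. The output is named tbl here, an arbitrary choice, to avoid shadowing the MATLAB table function.

```matlab
% Beam-deflection data from the Examples section, with numeric group labels.
y = [82 86 79 83 84 85 86 87 74 82 78 75 76 77 79 79 77 78 82 79];
g = [ones(1,8) 2*ones(1,6) 3*ones(1,6)];

[p, tbl] = anova1(y, g, 'off');

header  = tbl(1,:);     % column labels
F       = tbl{2,5};     % F-statistic, between-groups row
pTable  = tbl{2,6};     % same value as the first output p
ssError = tbl{3,2};     % within-groups (error) sum of squares
```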

[p,table,stats] = anova1(...) returns a structure stats used to perform a follow-up multiple comparison test. anova1 evaluates the hypothesis that the samples all have the same mean against the alternative that the means are not all the same. Sometimes it is preferable to perform a test to determine which pairs of means are significantly different, and which are not. Use the multcompare function to perform such tests by supplying the stats structure as input.
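
A follow-up comparison might look like the sketch below, which uses the beam data from the Examples section. It assumes that columns 1 and 2 of the multcompare output hold the indices of the compared groups and that columns 3 and 5 hold the lower and upper confidence bounds for the difference in means. The exact number of columns depends on the release, so check the multcompare documentation for your version.

```matlab
strength = [82 86 79 83 84 85 86 87 74 82 ...
            78 75 76 77 79 79 77 78 82 79];
alloy = {'st','st','st','st','st','st','st','st',...
         'al1','al1','al1','al1','al1','al1',...
         'al2','al2','al2','al2','al2','al2'};

[~, ~, stats] = anova1(strength, alloy, 'off');

% One row per pair of groups; suppress the interactive figure.
c = multcompare(stats, 'Display', 'off');

% A pair differs significantly when its confidence interval excludes zero.
isSignificant = c(:,3) > 0 | c(:,5) < 0;

% Names of the two groups compared in each row.
pairNames = stats.gnames(c(:,1:2));
```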

Assumptions

The ANOVA test makes the following assumptions about the data in X:

  • All sample populations are normally distributed.

  • All sample populations have equal variance.

  • All observations are mutually independent.

The ANOVA test is known to be robust with respect to modest violations of the first two assumptions.
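
One possible way to check the equal-variance assumption before relying on the F test is the vartestn function, sketched below on the beam data from the Examples section. The choice of Levene's test is an assumption made for this illustration, not a recommendation from this page, and the name-value syntax shown may differ in older releases.

```matlab
strength = [82 86 79 83 84 85 86 87 74 82 ...
            78 75 76 77 79 79 77 78 82 79];
alloy = {'st','st','st','st','st','st','st','st',...
         'al1','al1','al1','al1','al1','al1',...
         'al2','al2','al2','al2','al2','al2'};

% Test the null hypothesis that all groups have equal variance.
% A small p-value would cast doubt on the equal-variance assumption.
pVar = vartestn(strength(:), alloy(:), ...
                'TestType', 'LeveneAbsolute', 'Display', 'off');
```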

Examples

One-Way ANOVA

Create X with columns that are constants plus random normal disturbances with mean zero and standard deviation one.

X = meshgrid(1:5);
rng default; % For reproducibility
X = X + normrnd(0,1,5,5)
X =

    1.5377    0.6923    1.6501    3.7950    5.6715
    2.8339    1.5664    6.0349    3.8759    3.7925
   -1.2588    2.3426    3.7254    5.4897    5.7172
    1.8622    5.5784    2.9369    5.4090    6.6302
    1.3188    4.7694    3.7147    5.4172    5.4889

Perform one-way ANOVA.

p = anova1(X)
p =

    0.0023

The small p-value indicates that the differences between column means are significant. The p-value is the probability, under the null hypothesis that all columns are drawn from populations with the same mean, of observing differences among the sample means at least as large as those seen in X.

Compare Beam Strength Using One-Way ANOVA

The following example is from a study of the strength of structural beams in Hogg (1987).

Input the data.

strength = [82 86 79 83 84 85 86 87 74 82 ...
            78 75 76 77 79 79 77 78 82 79];
alloy = {'st','st','st','st','st','st','st','st',...
         'al1','al1','al1','al1','al1','al1',...
         'al2','al2','al2','al2','al2','al2'};

The vector strength measures deflections of beams in thousandths of an inch under 3,000 pounds of force. The vector alloy identifies each beam as steel ('st'), alloy 1 ('al1'), or alloy 2 ('al2'). Although alloy is sorted in this example, grouping variables do not need to be sorted.

Test the null hypothesis that the steel beams are equal in strength to the beams made of the two more expensive alloys.

p = anova1(strength,alloy)
p =

   1.5264e-04

The p-value suggests rejection of the null hypothesis. The box plot shows that steel beams deflect more than beams made of the more expensive alloys.
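
The group means confirm what the box plot shows. This short check uses only the data defined above; the variable names are illustrative.

```matlab
strength = [82 86 79 83 84 85 86 87 74 82 ...
            78 75 76 77 79 79 77 78 82 79];
alloy = {'st','st','st','st','st','st','st','st',...
         'al1','al1','al1','al1','al1','al1',...
         'al2','al2','al2','al2','al2','al2'};

% Mean deflection (thousandths of an inch) for each material
meanSteel  = mean(strength(strcmp(alloy, 'st')));    % 84
meanAlloy1 = mean(strength(strcmp(alloy, 'al1')));   % 77
meanAlloy2 = mean(strength(strcmp(alloy, 'al2')));   % 79
```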

References

[1] Hogg, R. V., and J. Ledolter. Engineering Statistics. New York: Macmillan, 1987.