stats::swGOFT
The Shapiro-Wilk goodness-of-fit test for normality
MuPAD® notebooks will be removed in a future release. Use MATLAB® live scripts instead.
MATLAB live scripts support most MuPAD functionality, though there are some differences. For more information, see Convert MuPAD Notebooks to MATLAB Live Scripts.
stats::swGOFT(x_{1}, x_{2}, …)
stats::swGOFT([x_{1}, x_{2}, …])
stats::swGOFT(s, <c>)
stats::swGOFT([x_{1}, x_{2}, …]) applies the Shapiro-Wilk goodness-of-fit test for the null hypothesis: “the data x_{1}, x_{2}, … are normally distributed (with unknown mean and variance)”. The sample size must be at least 3 and at most 5000.
External statistical data stored in an ASCII file can be imported into a MuPAD® session via import::readdata. In particular, see Example 1 of the corresponding help page.
stats::swGOFT raises an error if any of the data cannot be converted to a real floating-point number, or if the sample size is smaller than 3 or larger than 5000.
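The input checks described above can be sketched in Python (used here only for illustration, not MuPAD). The helper name `validate_sample` and the choice of `ValueError` are assumptions of this sketch; only the size bounds 3 and 5000 and the requirement of real floating-point data come from the text.

```python
import math

MIN_SIZE, MAX_SIZE = 3, 5000  # sample size bounds stated in the text


def validate_sample(values):
    """Hypothetical helper: return the data as a list of finite floats.

    Raises ValueError if an entry is not convertible to a real
    floating-point number or if the sample size is out of range.
    """
    data = []
    for v in values:
        try:
            x = float(v)  # complex numbers raise TypeError, bad strings ValueError
        except (TypeError, ValueError) as exc:
            raise ValueError(f"cannot convert {v!r} to a real float") from exc
        if not math.isfinite(x):
            # Assumption of this sketch: NaN and infinities are rejected as well.
            raise ValueError(f"{v!r} is not a finite real number")
        data.append(x)
    if not MIN_SIZE <= len(data) <= MAX_SIZE:
        raise ValueError(
            f"sample size {len(data)} is outside [{MIN_SIZE}, {MAX_SIZE}]"
        )
    return data
```

Calling `validate_sample([1, "2.5", 3])` returns the three values as floats, whereas a two-element list is rejected because it is smaller than the minimum sample size.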
Let y_{1}, …, y_{n} be the input data x_{1}, …, x_{n} arranged in ascending order. stats::swGOFT returns the list [PValue = p, StatValue = w] containing the following information:
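w is the attained value of the Shapiro-Wilk statistic

\[ W = \frac{\left(\sum_{i=1}^{n} a_{i} y_{i}\right)^{2}}{\sum_{i=1}^{n} (y_{i} - \bar{y})^{2}} = \frac{\left(\sum_{i=1}^{n} a_{i} y_{i}\right)^{2}}{(n - 1) S^{2}}. \]

Here, the a_{i} are the Shapiro-Wilk coefficients, \bar{y} is the sample mean, and S^2 is the statistical variance of the sample (with divisor n - 1).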
p is the observed significance level of the Shapiro-Wilk statistic W.
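As an illustration of the statistic described above, the following Python sketch (not MuPAD) evaluates W as the squared weighted sum of the ordered data divided by the sum of squared deviations from the sample mean. The function name `shapiro_wilk_w` is hypothetical, and the coefficients are passed in by the caller rather than computed. The example uses the well-known exact coefficients for n = 3, namely (-1/√2, 0, 1/√2); larger samples need coefficients from an algorithm such as Royston's, which this sketch does not implement.

```python
import math


def shapiro_wilk_w(data, coeffs):
    """Hypothetical helper: evaluate W for given Shapiro-Wilk coefficients."""
    y = sorted(float(v) for v in data)  # y_1 <= ... <= y_n
    if len(y) != len(coeffs):
        raise ValueError("need exactly one coefficient per data point")
    mean = sum(y) / len(y)
    sum_sq_dev = sum((v - mean) ** 2 for v in y)  # equals (n - 1) * S^2
    if sum_sq_dev == 0.0:
        raise ValueError("W is undefined when all data are identical")
    weighted_sum = sum(a * v for a, v in zip(coeffs, y))
    return weighted_sum ** 2 / sum_sq_dev


# Exact coefficients for a sample of size 3.
A3 = (-math.sqrt(0.5), 0.0, math.sqrt(0.5))

# Equally spaced points are as "normal-looking" as three points can be,
# so W equals 1 up to rounding.
w_even = shapiro_wilk_w([2.0, 1.0, 3.0], A3)

# Two tied points and one outlier give the smallest possible value for n = 3.
w_skewed = shapiro_wilk_w([0.0, 0.0, 1.0], A3)
```

Values of W close to 1 are consistent with normality; smaller values point away from it. The p-value attached to W has to come from the distribution of the statistic, which is not part of this sketch.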
The observed significance level PValue = p returned by stats::swGOFT has to be interpreted in the following way: If p is smaller than a given significance level α ≪ 1, the null hypothesis may be rejected at level α. If p is larger than α, the null hypothesis should not be rejected at level α.
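The decision rule above amounts to a single comparison. A minimal Python sketch (not MuPAD) follows; the function name `reject_normality` and the default level α = 0.05 are assumptions chosen for illustration only.

```python
def reject_normality(p_value, alpha=0.05):
    """Hypothetical helper: True if the null hypothesis of normality
    may be rejected at significance level alpha."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    return p_value < alpha


# A small observed significance level leads to rejection ...
decision_small = reject_normality(0.001)  # True
# ... while a large one does not.
decision_large = reject_normality(0.4)    # False
```

Note that a result of `False` only means the test found no evidence against normality at level α; it does not prove that the data are normally distributed.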
The function is sensitive to the environment variable DIGITS, which determines the numerical working precision.
We test a list of random data that purport to be a sample of normally distributed numbers:
f := stats::normalRandom(0, 1, Seed = 123):
data := [f() $ i = 1..400]:
stats::swGOFT(data)
The observed significance level is not small. Consequently, one should not reject the null hypothesis that the data are normally distributed.
Next, we contaminate the data with some uniformly distributed deviates:
impuredata := data . [frandom() $ i = 1..101]:
stats::swGOFT(impuredata)
The impure data may be rejected as a sample of normal deviates at significance levels as small as the PValue returned by this call.
delete f, data, impuredata:
We create a sample consisting of one string column and two non-string columns:
s := stats::sample([["1996", 1242, PI - 1/2],
                    ["1997", 1353, PI + 0.3],
                    ["1998", 1142, PI + 0.5],
                    ["1999", 1201, PI - 1],
                    ["2001", 1201, PI]])

"1996"  1242  PI - 1/2
"1997"  1353  PI + 0.3
"1998"  1142  PI + 0.5
"1999"  1201  PI - 1
"2001"  1201  PI
We check whether the data of the third column are normally distributed:
stats::swGOFT(s, 3)
The observed significance level returned by the test is not small: the test gives no evidence against the hypothesis that the data are normally distributed.
delete s:

x_{1}, x_{2}, …
The statistical data: real numerical values

s
A sample of domain type stats::sample

c
An integer representing a column index of the sample s
List of two equations [PValue = p, StatValue = w] with floating-point values p and w. See the description above for the interpretation of these values.
The implemented algorithm for the computation of the Shapiro-Wilk coefficients, the Shapiro-Wilk statistic and the observed significance level is based on: Patrick Royston, “Algorithm AS R94”, Applied Statistics, Vol. 44, No. 4 (1995).
Following Royston, the Shapiro-Wilk coefficients a_{i} are computed by an approximation of

\[ a = (a_{1}, \dots, a_{n}) = \frac{M^{T} V^{-1}}{\left(M^{T} V^{-1} V^{-1} M\right)^{1/2}}, \]

where M denotes the vector of expected values of the standard normal order statistics for a sample of size n, V is the corresponding covariance matrix, and M^{T} is the transpose of M.
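To give a feeling for the shape of these coefficients, the Python sketch below (not MuPAD) makes two simplifying assumptions that are not taken from this page: the expected order statistics M are replaced by Blom's approximation Φ^{-1}((i - 3/8)/(n + 1/4)), and the covariance matrix V is replaced by the identity, so that the coefficient vector reduces to M divided by its Euclidean norm. This is a crude stand-in, not Royston's approximation, and the function names are hypothetical.

```python
import math
from statistics import NormalDist


def blom_scores(n):
    """Approximate expected standard normal order statistics (Blom's formula)."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [std_normal.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]


def identity_v_coefficients(n):
    """Coefficient vector under the simplifying assumption V = identity.

    With V = I the defining formula reduces to a = M / sqrt(M^T M).
    """
    m = blom_scores(n)
    norm = math.sqrt(sum(v * v for v in m))
    return [v / norm for v in m]


coeffs = identity_v_coefficients(10)
```

Two properties of the exact coefficients survive the simplification: the vector has unit Euclidean norm, and it is antisymmetric (a_{i} = -a_{n+1-i}), so the smallest and largest observations receive weights of equal size and opposite sign. For n = 3 the sketch reproduces the exact coefficients (-1/√2, 0, 1/√2) up to rounding.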