statistics reported by ranksum are wrong
This is less a question and more of a bug report.
The ranksum U statistic reported by the ranksum function is much too large. Here's a simple example:
a1 = 1 : 100;
a2 = a1 + 0.01;
[ p, h, stats ] = ranksum( a2, a1 )
p =
0.9037
h =
0
stats =
zval: 0.1209
ranksum: 10100
The correct ranksum, working from the formal definition of Wilcoxon ranksum, is 5050. I have verified this with an online calculator for the U statistic.
After some experimentation, I believe the value being reported is actually U + n1 * ( n1 + 1 ) / 2, where n1 is the number of instances in the first sample passed to ranksum. For the example above, that is 5050 + 5050 = 10100.
The reported p and h values agree reasonably well with what I get from other calculators.
Answers (1)
the cyclist
on 16 Apr 2013
Jeff,
Here is an excerpt from the notes to the equivalent function in R:
"The literature is not unanimous about the definitions of the Wilcoxon rank sum and Mann-Whitney tests. The two most common definitions correspond to the sum of the ranks of the first sample with the minimum value subtracted or not: R subtracts and S-PLUS does not, giving a value which is larger by m(m+1)/2 for a first sample of size m. (It seems Wilcoxon's original paper used the unadjusted sum of the ranks but subsequent tables subtracted the minimum.)"
It seems you are seeing this lack of convention.
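To make the two conventions concrete, here is a minimal sketch that recomputes both by hand for the example in the question. It uses only base MATLAB. The names W, U and offset are my own, and the sort-based ranking assumes the pooled data has no ties (true here); data with ties would need average ranks instead, for example from tiedrank.

```matlab
% Example data from the question; a2 is the first sample passed to ranksum.
a1 = 1:100;
a2 = a1 + 0.01;
n1 = numel(a2);

% Rank the pooled sample. Assumes no ties, so a plain sort is enough.
pooled = [a2, a1];
[~, order] = sort(pooled);
r = zeros(size(pooled));
r(order) = 1:numel(pooled);

% Unadjusted rank sum of the first sample (the convention without subtraction).
W = sum(r(1:n1));

% Smallest possible rank sum for a first sample of size n1.
offset = n1 * (n1 + 1) / 2;

% Rank sum with the minimum subtracted (the convention R uses).
U = W - offset;

fprintf('W = %d, offset = %d, U = %d\n', W, offset, U);
```

For this data the sketch gives W = 10100, which matches the value in stats.ranksum, and U = 5050, which matches the figure from the online calculator. The two differ by n1*(n1+1)/2 = 5050, the m(m+1)/2 term in the excerpt above.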