pingouin.mwu

pingouin.mwu(x, y, alternative='two-sided', **kwargs)
Mann-Whitney U Test (= Wilcoxon rank-sum test). It is the non-parametric version of the independent T-test.
 Parameters
    x, y : array_like
        First and second set of observations. x and y must be independent.
    alternative : string
        Defines the alternative hypothesis, or tail of the test. Must be one of “two-sided” (default), “greater” or “less”. See scipy.stats.mannwhitneyu() for more details.
    **kwargs : dict
        Additional keyword arguments that are passed to scipy.stats.mannwhitneyu().
 Returns
    stats : pandas.DataFrame
        'U-val': U-value
        'alternative': tail of the test
        'p-val': p-value
        'RBC': rank-biserial correlation
        'CLES': common language effect size
See also
Notes
The Mann–Whitney U test [1] (also called the Wilcoxon rank-sum test) is a non-parametric test of the null hypothesis that it is equally likely that a randomly selected value from one sample will be less than or greater than a randomly selected value from a second sample. The test assumes that the two samples are independent. This test corrects for ties and by default uses a continuity correction (see scipy.stats.mannwhitneyu() for details).

The rank-biserial correlation [2] is the proportion of favorable evidence minus the proportion of unfavorable evidence.
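As a rough illustration of this simple difference formula, the sketch below counts favorable and unfavorable pairs directly. It is not pingouin's implementation: the helper name `rank_biserial_sketch` and the toy data are hypothetical, "favorable" is assumed to mean a pair in which the x value exceeds the y value, and ties are assumed to count for neither side. Under a different convention the sign flips, so the result may have the opposite sign to the RBC that pingouin reports.

```python
def rank_biserial_sketch(x, y):
    """Hypothetical helper: proportion of favorable minus unfavorable pairs.

    Assumption: a pair (a, b) is 'favorable' when a > b and
    'unfavorable' when a < b; ties contribute to neither count.
    """
    n_pairs = len(x) * len(y)
    favorable = sum(1 for a in x for b in y if a > b)
    unfavorable = sum(1 for a in x for b in y if a < b)
    return (favorable - unfavorable) / n_pairs


x = [1, 2, 3, 4]
y = [2, 3, 5]
r = rank_biserial_sketch(x, y)
print(round(r, 4))  # -0.3333
```

Of the 12 pairs, 3 are favorable, 7 are unfavorable and 2 are ties, which gives (3 - 7) / 12.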
The common language effect size is the proportion of pairs where x is higher than y. It was first introduced by McGraw and Wong (1992) [3]. Pingouin uses a brute-force version of the formula given by Vargha and Delaney (2000) [4]:

\[\text{CL} = P(X > Y) + .5 \times P(X = Y)\]

The advantages of this method are twofold. First, the brute-force approach compares every observation of x with every observation of y, and therefore does not require normally distributed data. Second, the formula takes ties into account and therefore works with ordinal data.

When alternative is 'less', the CLES is then set to \(1 - \text{CL}\), which gives the proportion of pairs where x is lower than y.

References
[1] Mann, H. B., & Whitney, D. R. (1947). On a test of whether one of two random variables is stochastically larger than the other. The Annals of Mathematical Statistics, 50–60.
[2] Kerby, D. S. (2014). The simple difference formula: An approach to teaching nonparametric correlation. Comprehensive Psychology, 3, 11-IT.
[3] McGraw, K. O., & Wong, S. P. (1992). A common language effect size statistic. Psychological Bulletin, 111(2), 361.
[4] Vargha, A., & Delaney, H. D. (2000). A Critique and Improvement of the “CL” Common Language Effect Size Statistics of McGraw and Wong. Journal of Educational and Behavioral Statistics: A Quarterly Publication Sponsored by the American Educational Research Association and the American Statistical Association, 25(2), 101–132. https://doi.org/10.2307/1165329
Examples
>>> import numpy as np
>>> import pingouin as pg
>>> np.random.seed(123)
>>> x = np.random.uniform(low=0, high=1, size=20)
>>> y = np.random.uniform(low=0.2, high=1.2, size=20)
>>> pg.mwu(x, y, alternative='two-sided')
     U-val alternative    p-val    RBC    CLES
MWU   97.0   two-sided  0.00556  0.515  0.2425

Compare with SciPy

>>> import scipy.stats
>>> scipy.stats.mannwhitneyu(x, y, use_continuity=True, alternative='two-sided')
MannwhitneyuResult(statistic=97.0, pvalue=0.0055604599321374135)

One-sided test

>>> pg.mwu(x, y, alternative='greater')
     U-val alternative     p-val    RBC    CLES
MWU   97.0     greater  0.997442  0.515  0.2425

>>> pg.mwu(x, y, alternative='less')
     U-val alternative    p-val    RBC    CLES
MWU   97.0        less  0.00278  0.515  0.7575

Passing keyword arguments to scipy.stats.mannwhitneyu():

>>> pg.mwu(x, y, alternative='two-sided', method='exact')
     U-val alternative     p-val    RBC    CLES
MWU   97.0   two-sided  0.004681  0.515  0.2425
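
The common language effect size formula from the Notes can also be illustrated with a brute-force sketch in plain Python. This is an illustration under stated assumptions rather than pingouin's own code: the helper name `cles_sketch` and the toy data are hypothetical, and the handling of 'less' simply mirrors the rule described in the Notes.

```python
from itertools import product


def cles_sketch(x, y, alternative="two-sided"):
    """Hypothetical helper: CL = P(X > Y) + 0.5 * P(X = Y) over all pairs."""
    pairs = list(product(x, y))  # every x value against every y value
    greater = sum(1 for a, b in pairs if a > b)
    ties = sum(1 for a, b in pairs if a == b)
    cl = (greater + 0.5 * ties) / len(pairs)
    # Assumption taken from the Notes: report 1 - CL when alternative is 'less'.
    return 1 - cl if alternative == "less" else cl


x = [1, 2, 3, 4]
y = [2, 3, 5]
print(round(cles_sketch(x, y), 4))                      # 0.3333
print(round(cles_sketch(x, y, alternative="less"), 4))  # 0.6667
```

Of the 12 pairs, 3 have the x value above the y value and 2 are ties, so CL = (3 + 0.5 * 2) / 12. No distributional assumption is needed because the counts come from direct pairwise comparison.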