pingouin.mwu

pingouin.mwu(x, y, tail='two-sided')

Mann-Whitney U test (also known as the Wilcoxon rank-sum test). It is the non-parametric counterpart of the independent-samples t-test.

Parameters
x, y : array_like

First and second set of observations. x and y must be independent.

tail : string

Specify whether to return a 'one-sided' or 'two-sided' p-value.

Returns
stats : pandas DataFrame

Test summary

'U-val' : U-value
'p-val' : p-value
'RBC'   : rank-biserial correlation (effect size)
'CLES'  : common language effect size


Notes

mwu tests the null hypothesis that the data in x and y are samples from continuous distributions with equal medians. The test assumes that x and y are independent. It corrects for ties and, by default, applies a continuity correction (see scipy.stats.mannwhitneyu() for details).
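To make the ranking and tie handling concrete, the following is a minimal pure-Python sketch of how a U statistic can be computed from midranks. It is not the pingouin or SciPy implementation; the function name mann_whitney_u is hypothetical, and the p-value calculation (including the tie and continuity corrections) is deliberately left out.

```python
def mann_whitney_u(x, y):
    """Return the U statistic of x relative to y (illustrative sketch only).

    Tied values receive the average (mid) rank of the positions they occupy
    in the pooled, sorted sample.
    """
    pooled = sorted(list(x) + list(y))

    # Map each distinct value to its midrank (ranks are 1-based).
    midrank = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        # Values at 0-based positions i..j-1 hold 1-based ranks i+1..j.
        midrank[pooled[i]] = (i + 1 + j) / 2
        i = j

    n_x = len(x)
    rank_sum_x = sum(midrank[v] for v in x)
    # Subtract the smallest possible rank sum for a sample of size n_x.
    return rank_sum_x - n_x * (n_x + 1) / 2


# Toy data (made up for illustration): one pair with x > y and two tied pairs,
# so U for x is 1 + 0.5 * 2 = 2.0.
print(mann_whitney_u([1, 2, 3], [2, 3, 4, 5]))
```

The U values of the two samples always sum to len(x) * len(y), which gives a quick sanity check. Which of the two U values a library reports is a convention that can differ between implementations and versions.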

The rank-biserial correlation is the proportion of favorable evidence minus the proportion of unfavorable evidence (see Kerby 2014).
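As a rough illustration of this simple difference formula, the sketch below counts favorable and unfavorable pairs by brute force. It assumes that a pair (x_i, y_j) is "favorable" when x_i > y_j; that choice, and the function name rank_biserial, are assumptions made for this example, and the sign convention used by pingouin may differ.

```python
def rank_biserial(x, y):
    """Simple difference formula over all (x_i, y_j) pairs (sketch only).

    Assumption: a pair is favorable when x_i > y_j and unfavorable when
    x_i < y_j. Tied pairs count toward neither proportion.
    """
    n_pairs = len(x) * len(y)
    if n_pairs == 0:
        raise ValueError("x and y must both be non-empty")
    favorable = sum(1 for xi in x for yj in y if xi > yj)
    unfavorable = sum(1 for xi in x for yj in y if xi < yj)
    return (favorable - unfavorable) / n_pairs


# Toy data (made up): 1 favorable pair, 9 unfavorable pairs, 2 ties
# out of 12 pairs, so the result is (1 - 9) / 12, about -0.667.
print(rank_biserial([1, 2, 3], [2, 3, 4, 5]))
```

Swapping x and y flips the sign, so the value is only interpretable once the direction treated as favorable is stated.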

The common language effect size is the probability (from 0 to 1) that a randomly selected observation from the first sample will be greater than a randomly selected observation from the second sample.
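The definition above can be sketched as a brute-force count over all pairs. This is an illustration under stated assumptions rather than pingouin's implementation: the function name common_language_effect_size and the tie_weight parameter are hypothetical. With tie_weight=0.0 only strictly greater pairs count, as in the definition; counting ties as one half is a common variant for discrete data.

```python
def common_language_effect_size(x, y, tie_weight=0.0):
    """Proportion of (x_i, y_j) pairs in which x_i > y_j (sketch only).

    tie_weight is a hypothetical knob: 0.0 ignores tied pairs,
    0.5 counts each tied pair as half a "win".
    """
    n_pairs = len(x) * len(y)
    if n_pairs == 0:
        raise ValueError("x and y must both be non-empty")
    greater = sum(1 for xi in x for yj in y if xi > yj)
    ties = sum(1 for xi in x for yj in y if xi == yj)
    return (greater + tie_weight * ties) / n_pairs


# Toy data (made up): 1 of the 12 pairs has x > y, and 2 pairs are tied.
x, y = [1, 2, 3], [2, 3, 4, 5]
print(common_language_effect_size(x, y))                  # 1 / 12
print(common_language_effect_size(x, y, tie_weight=0.5))  # 2 / 12
```

With ties counted as one half, this quantity equals the U statistic of x divided by len(x) * len(y).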

References

[1] Mann, H. B., & Whitney, D. R. (1947). On a test of whether one of two random variables is stochastically larger than the other. The Annals of Mathematical Statistics, 18(1), 50-60.

[2] Kerby, D. S. (2014). The simple difference formula: An approach to teaching nonparametric correlation. Comprehensive Psychology, 3, 11-IT.

[3] McGraw, K. O., & Wong, S. P. (1992). A common language effect size statistic. Psychological Bulletin, 111(2), 361.

Examples

>>> import numpy as np
>>> import pingouin as pg
>>> np.random.seed(123)
>>> x = np.random.uniform(low=0, high=1, size=20)
>>> y = np.random.uniform(low=0.2, high=1.2, size=20)
>>> pg.mwu(x, y, tail='two-sided')
     U-val    p-val    RBC   CLES
MWU   97.0  0.00556  0.515  0.758


Compare with SciPy

>>> import scipy.stats
>>> scipy.stats.mannwhitneyu(x, y, use_continuity=True,
...                          alternative='two-sided')
MannwhitneyuResult(statistic=97.0, pvalue=0.0055604599321374135)