pingouin.mwu

pingouin.mwu(x, y, tail='two-sided')

Mann-Whitney U test (also known as the Wilcoxon rank-sum test). It is the non-parametric counterpart of the independent-samples t-test.

Parameters
x, y : array_like

First and second set of observations. x and y must be independent.

tail : string

Specify whether to return a ‘one-sided’ or ‘two-sided’ p-value. Can also be ‘greater’ or ‘less’ to specify the direction of the test. If tail='one-sided', the alternative of the test is detected automatically by comparing the medians of x and y. For instance, if median(x) < median(y) and tail='one-sided', Pingouin will automatically set tail='less', and vice versa.
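The median-based selection described above can be sketched as follows. This is a minimal illustration, not Pingouin's actual code: resolve_tail is a hypothetical helper, and the handling of equal medians (falling back to 'greater') is an assumption, since the text above does not cover that case.

```python
from statistics import median


def resolve_tail(x, y, tail="two-sided"):
    """Hypothetical helper: map 'one-sided' onto 'less' or 'greater'."""
    if tail != "one-sided":
        # Explicit choices ('two-sided', 'less', 'greater') pass through.
        return tail
    # median(x) < median(y) selects 'less'; otherwise 'greater'.
    # Assumption: equal medians fall back to 'greater'.
    return "less" if median(x) < median(y) else "greater"


x = [0.1, 0.4, 0.5]
y = [0.6, 0.7, 0.9]
print(resolve_tail(x, y, "one-sided"))  # less
print(resolve_tail(y, x, "one-sided"))  # greater
print(resolve_tail(x, y, "two-sided"))  # two-sided
```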

Returns
stats : pandas.DataFrame
  • 'U-val': U-value

  • 'tail': tail of the test

  • 'p-val': p-value

  • 'RBC': rank-biserial correlation

  • 'CLES': common language effect size

Notes

The Mann–Whitney U test [1] (also called Wilcoxon rank-sum test) is a non-parametric test of the null hypothesis that it is equally likely that a randomly selected value from one sample will be less than or greater than a randomly selected value from a second sample. The test assumes that the two samples are independent. This test corrects for ties and by default uses a continuity correction (see scipy.stats.mannwhitneyu() for details).
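To make the null hypothesis above concrete, the U statistic for the first sample can be obtained by counting, over all (x, y) pairs, how often the value from x exceeds the value from y, with ties counted as one half. The sketch below uses a hypothetical helper, u_statistic, written in plain Python for illustration only. It does not compute a p-value and is not how Pingouin or SciPy implement the test.

```python
def u_statistic(x, y):
    """U for the first sample: pairs with x_i > y_j, ties counted as 1/2."""
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u


x = [1.2, 3.4, 2.2, 5.0]
y = [2.2, 4.1, 6.3]
u_x = u_statistic(x, y)
u_y = u_statistic(y, x)
print(u_x, u_y)  # 3.5 8.5
# The two U values always sum to the number of pairs, len(x) * len(y).
print(u_x + u_y == len(x) * len(y))  # True
```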

The rank-biserial correlation [2] is the proportion of favorable evidence minus the proportion of unfavorable evidence.
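As a rough illustration of this difference of proportions, the sketch below treats a pair as favorable when the value from x exceeds the value from y and as unfavorable when it is smaller. Both the helper name rank_biserial and this choice of direction are assumptions made for the example; the sign convention used by Pingouin may differ.

```python
def rank_biserial(x, y):
    """Hypothetical helper: favorable minus unfavorable pair proportions."""
    n_pairs = len(x) * len(y)
    # Assumption: "favorable" means x_i > y_j; tied pairs count for neither.
    favorable = sum(1 for xi in x for yj in y if xi > yj)
    unfavorable = sum(1 for xi in x for yj in y if xi < yj)
    return (favorable - unfavorable) / n_pairs


x = [1, 2, 3, 4]
y = [3, 5, 6]
# 1 favorable pair, 10 unfavorable pairs, 1 tie, 12 pairs in total.
print(rank_biserial(x, y))  # -0.75
```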

The common language effect size [3] is the probability (from 0 to 1) that a randomly selected observation from the first sample will be greater than a randomly selected observation from the second sample.
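Read literally, this definition is the fraction of all (x, y) pairs in which the value from x is strictly greater. The sketch below follows that reading. The helper name cles is hypothetical, and counting tied pairs as "not greater" is an assumption; other conventions split ties evenly.

```python
def cles(x, y):
    """Hypothetical helper: share of pairs with x_i strictly above y_j."""
    greater = sum(1 for xi in x for yj in y if xi > yj)
    return greater / (len(x) * len(y))


x = [5, 7, 9]
y = [4, 6]
# 5 of the 6 pairs have x_i > y_j (only the pair 5 vs 6 does not).
print(round(cles(x, y), 4))  # 0.8333
```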

References

[1] Mann, H. B., & Whitney, D. R. (1947). On a test of whether one of two random variables is stochastically larger than the other. The Annals of Mathematical Statistics, 50-60.

[2] Kerby, D. S. (2014). The simple difference formula: An approach to teaching nonparametric correlation. Comprehensive Psychology, 3, 11-IT.

[3] McGraw, K. O., & Wong, S. P. (1992). A common language effect size statistic. Psychological Bulletin, 111(2), 361.

Examples

>>> import numpy as np
>>> import pingouin as pg
>>> np.random.seed(123)
>>> x = np.random.uniform(low=0, high=1, size=20)
>>> y = np.random.uniform(low=0.2, high=1.2, size=20)
>>> pg.mwu(x, y, tail='two-sided')
     U-val       tail    p-val    RBC    CLES
MWU   97.0  two-sided  0.00556  0.515  0.7575

Compare with SciPy

>>> import scipy.stats
>>> scipy.stats.mannwhitneyu(x, y, use_continuity=True,
...                          alternative='two-sided')
MannwhitneyuResult(statistic=97.0, pvalue=0.0055604599321374135)

One-sided test: one can either specify the alternative hypothesis manually:

>>> pg.mwu(x, y, tail='greater')
     U-val     tail     p-val    RBC    CLES
MWU   97.0  greater  0.997442  0.515  0.7575
>>> pg.mwu(x, y, tail='less')
     U-val  tail    p-val    RBC    CLES
MWU   97.0  less  0.00278  0.515  0.7575

Or simply leave it to Pingouin, using the ‘one-sided’ argument, in which case Pingouin will compare the medians of x and y and select the most appropriate tail based on that:

>>> # Since np.median(x) < np.median(y), this is equivalent to tail='less'
>>> pg.mwu(x, y, tail='one-sided')
     U-val  tail    p-val    RBC    CLES
MWU   97.0  less  0.00278  0.515  0.7575