pingouin.wilcoxon(x, y, tail='two-sided')

Wilcoxon signed-rank test. It is the non-parametric version of the paired t-test.

x, y : array_like

First and second set of observations. x and y must be related (e.g. repeated measures) and must therefore have the same number of samples. Missing values are automatically removed listwise.


tail : string

Specify whether to return a ‘one-sided’ or ‘two-sided’ p-value. Can also be ‘greater’ or ‘less’ to specify the direction of the test. If tail='one-sided', the alternative of the test is detected automatically from the sign of the median of the differences between x and y. For instance, if np.median(x - y) > 0 and tail='one-sided', Pingouin will automatically set tail='greater', and vice versa.

stats : pandas DataFrame

Test summary

'W-val' : W-value
'tail'  : tail of the test
'p-val' : p-value
'RBC'   : matched pairs rank-biserial correlation (effect size)
'CLES'  : common language effect size


The Wilcoxon signed-rank test tests the null hypothesis that two related (paired) samples come from the same distribution. In particular, it tests whether the distribution of the differences x - y is symmetric about zero. A continuity correction is applied by default (see scipy.stats.wilcoxon() for details).
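
As a rough illustration of what the W-value measures, the sketch below ranks the absolute differences and sums the ranks separately for positive and negative differences. The helper names (`average_ranks`, `signed_rank_sums`) are hypothetical, and dropping zero differences is an assumption that mirrors SciPy's default `zero_method='wilcox'`; this is not Pingouin's actual implementation.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def signed_rank_sums(x, y):
    """Return (r_plus, r_minus); zero differences are dropped (assumption)."""
    diffs = [a - b for a, b in zip(x, y) if a != b]
    ranks = average_ranks([abs(d) for d in diffs])
    r_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    r_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return r_plus, r_minus


x = [20, 22, 19, 20, 22, 18, 24, 20, 19, 24, 26, 13]
y = [38, 37, 33, 29, 14, 12, 20, 22, 17, 25, 26, 16]
r_plus, r_minus = signed_rank_sums(x, y)
w_val = min(r_plus, r_minus)  # two-sided statistic: the smaller rank sum
print(r_plus, r_minus, w_val)  # 20.5 45.5 20.5
```

If the differences were symmetric about zero, the two rank sums would be expected to be close to each other; a large imbalance is evidence against the null hypothesis.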

The rank-biserial correlation is the proportion of favorable evidence minus the proportion of unfavorable evidence (see Kerby 2014).
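
A minimal sketch of this simple difference formula, assuming that "favorable" evidence is the sum of ranks of the positive differences and "unfavorable" evidence is the sum of ranks of the negative ones. The function name `rank_biserial` is hypothetical. The rank sums 20.5 and 45.5 are the values obtained by hand for the x and y used in the examples on this page, with the single zero difference dropped.

```python
def rank_biserial(r_plus, r_minus):
    """Matched-pairs rank-biserial correlation from the two signed-rank sums.

    r_plus  : sum of ranks of the positive differences (assumed "favorable")
    r_minus : sum of ranks of the negative differences (assumed "unfavorable")
    """
    total = r_plus + r_minus
    if total == 0:
        raise ValueError("no non-zero differences to rank")
    return r_plus / total - r_minus / total


rbc = rank_biserial(20.5, 45.5)
print(round(rbc, 3))  # -0.379
```

The result ranges from -1 (all differences negative) to +1 (all differences positive).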

The common language effect size is the probability (from 0 to 1) that a randomly selected observation from the first sample will be greater than a randomly selected observation from the second sample.
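
The definition above can be checked by brute force over all pairs. The sketch below uses made-up data, and both the helper name `common_language_es` and the choice to count ties as "not greater" are assumptions; Pingouin's own computation may differ.

```python
def common_language_es(x, y):
    """Proportion of all (x_i, y_j) pairs in which x_i is strictly greater."""
    n_pairs = len(x) * len(y)
    if n_pairs == 0:
        raise ValueError("x and y must be non-empty")
    n_greater = sum(1 for a in x for b in y if a > b)
    return n_greater / n_pairs


# Illustrative data only: 6 of the 9 pairs have x_i > y_j.
cles = common_language_es([5, 7, 9], [4, 6, 8])
print(round(cles, 3))  # 0.667
```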


Versions of Pingouin below 0.2.6 gave wrong two-sided p-values for the Wilcoxon test. P-values were accidentally squared, and therefore smaller. This issue has been resolved in Pingouin>=0.2.6. Make sure to always use the latest release.



Wilcoxon, F. (1945). Individual comparisons by ranking methods. Biometrics Bulletin, 1(6), 80-83.


Kerby, D. S. (2014). The simple difference formula: An approach to teaching nonparametric correlation. Comprehensive Psychology, 3, 11-IT.


McGraw, K. O., & Wong, S. P. (1992). A common language effect size statistic. Psychological Bulletin, 111(2), 361.


Wilcoxon test on two related samples.

>>> import numpy as np
>>> import pingouin as pg
>>> x = [20, 22, 19, 20, 22, 18, 24, 20, 19, 24, 26, 13]
>>> y = [38, 37, 33, 29, 14, 12, 20, 22, 17, 25, 26, 16]
>>> pg.wilcoxon(x, y, tail='two-sided')
          W-val       tail     p-val    RBC   CLES
Wilcoxon   20.5  two-sided  0.285765 -0.379  0.583

Compare with SciPy

>>> import scipy.stats  # import the submodule explicitly
>>> scipy.stats.wilcoxon(x, y, correction=True)
WilcoxonResult(statistic=20.5, pvalue=0.2857652190231508)

One-sided tail: one can either manually specify the alternative hypothesis

>>> pg.wilcoxon(x, y, tail='greater')
          W-val     tail     p-val    RBC   CLES
Wilcoxon   20.5  greater  0.876244 -0.379  0.583
>>> pg.wilcoxon(x, y, tail='less')
          W-val  tail     p-val    RBC   CLES
Wilcoxon   20.5  less  0.142883 -0.379  0.583

Or simply leave it to Pingouin, using the ‘one-sided’ argument, in which case Pingouin will look at the sign of the median of the differences between x and y and adjust the tail accordingly:

>>> np.median(np.array(x) - np.array(y))
-1.5

The median is negative, so Pingouin will test the alternative hypothesis that the median of the differences is negative (i.e. less than 0).

>>> pg.wilcoxon(x, y, tail='one-sided')  # Equivalent to tail = 'less'
          W-val  tail     p-val    RBC   CLES
Wilcoxon   20.5  less  0.142883 -0.379  0.583
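
The auto-detection rule described above can be sketched in a few lines of plain Python. The function name `resolve_tail` is hypothetical and this is not Pingouin's source code. How a median of exactly zero is handled is not stated on this page, so mapping it to 'greater' here is an arbitrary assumption.

```python
from statistics import median


def resolve_tail(x, y, tail="two-sided"):
    """Turn tail='one-sided' into 'greater' or 'less' from the median difference."""
    if tail != "one-sided":
        return tail  # 'two-sided', 'greater' and 'less' pass through unchanged
    diffs = [a - b for a, b in zip(x, y)]
    # Assumption: a median of exactly zero falls into the 'greater' branch.
    return "less" if median(diffs) < 0 else "greater"


x = [20, 22, 19, 20, 22, 18, 24, 20, 19, 24, 26, 13]
y = [38, 37, 33, 29, 14, 12, 20, 22, 17, 25, 26, 16]
print(resolve_tail(x, y, tail="one-sided"))  # less
```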