pingouin.wilcoxon
- pingouin.wilcoxon(x, y=None, alternative='two-sided', **kwargs)
Wilcoxon signed-rank test. It is the non-parametric version of the paired T-test.
- Parameters:
- x : array_like
Either the first set of measurements (in which case y is the second set of measurements), or the differences between two sets of measurements (in which case y is not to be specified). Must be one-dimensional.
- y : array_like
Either the second set of measurements (if x is the first set of measurements), or not specified (if x is the differences between two sets of measurements). Must be one-dimensional.
- alternative : string
Defines the alternative hypothesis, or tail of the test. Must be one of “two-sided” (default), “greater” or “less”. See scipy.stats.wilcoxon() for more details.
- **kwargs : dict
Additional keyword arguments that are passed to scipy.stats.wilcoxon().
- Returns:
- stats : pandas.DataFrame
  - 'W-val': W-value
  - 'alternative': tail of the test
  - 'p-val': p-value
  - 'RBC': matched pairs rank-biserial correlation (effect size)
  - 'CLES': common language effect size
Notes
The Wilcoxon signed-rank test [1] tests the null hypothesis that two related paired samples come from the same distribution. In particular, it tests whether the distribution of the differences x - y is symmetric about zero.
Important
Pingouin automatically applies a continuity correction. Therefore, the p-values will be slightly different from those of scipy.stats.wilcoxon() unless correction=True is explicitly passed to the latter.
In addition to the test statistic and p-values, Pingouin also computes two measures of effect size. The matched pairs rank biserial correlation [2] is the simple difference between the proportion of favorable and unfavorable evidence; in the case of the Wilcoxon signed-rank test, the evidence consists of rank sums (Kerby 2014):
\[r = f - u\]
The common language effect size is the proportion of pairs where x is higher than y. It was first introduced by McGraw and Wong (1992) [3]. Pingouin uses a brute-force version of the formula given by Vargha and Delaney (2000) [4]:
\[\text{CL} = P(X > Y) + .5 \times P(X = Y)\]
The advantages of this method are twofold. First, the brute-force approach pairs each observation of x to its y counterpart, and therefore does not require normally distributed data. Second, the formula takes ties into account and therefore works with ordinal data.
When alternative is 'less', the CLES is then set to \(1 - \text{CL}\), which gives the proportion of pairs where x is lower than y.
References
[1] Wilcoxon, F. (1945). Individual comparisons by ranking methods. Biometrics Bulletin, 1(6), 80-83.
[2] Kerby, D. S. (2014). The simple difference formula: An approach to teaching nonparametric correlation. Comprehensive Psychology, 3, 11-IT.
[3] McGraw, K. O., & Wong, S. P. (1992). A common language effect size statistic. Psychological Bulletin, 111(2), 361.
[4] Vargha, A., & Delaney, H. D. (2000). A Critique and Improvement of the “CL” Common Language Effect Size Statistics of McGraw and Wong. Journal of Educational and Behavioral Statistics: A Quarterly Publication Sponsored by the American Educational Research Association and the American Statistical Association, 25(2), 101–132. https://doi.org/10.2307/1165329
Examples
Wilcoxon test on two related samples.
>>> import numpy as np
>>> import pingouin as pg
>>> x = np.array([20, 22, 19, 20, 22, 18, 24, 20, 19, 24, 26, 13])
>>> y = np.array([38, 37, 33, 29, 14, 12, 20, 22, 17, 25, 26, 16])
>>> pg.wilcoxon(x, y, alternative='two-sided')
          W-val alternative     p-val       RBC      CLES
Wilcoxon   20.5   two-sided  0.285765 -0.378788  0.395833
Same but using pre-computed differences. However, the CLES effect size cannot be computed as it requires the raw data.
>>> pg.wilcoxon(x - y)
          W-val alternative     p-val       RBC  CLES
Wilcoxon   20.5   two-sided  0.285765 -0.378788   NaN
Compare with SciPy
>>> import scipy
>>> scipy.stats.wilcoxon(x, y)
WilcoxonResult(statistic=20.5, pvalue=0.2661660677806492)
The p-value differs from Pingouin's because Pingouin automatically applies a continuity correction. Disabling it gives the same p-value as SciPy:
>>> pg.wilcoxon(x, y, alternative='two-sided', correction=False)
          W-val alternative     p-val       RBC      CLES
Wilcoxon   20.5   two-sided  0.266166 -0.378788  0.395833
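The effect of the continuity correction can be checked by hand. The sketch below is an assumption about what happens for this data set, not a copy of either library's code: it uses the usual normal approximation to the signed-rank statistic (zero differences dropped, average ranks for ties, tie-corrected variance) and, when the correction is on, shifts the statistic by 0.5 towards its mean. The helper name `signed_rank_pvalue` is hypothetical, and only the Python standard library is used.

```python
import math

x = [20, 22, 19, 20, 22, 18, 24, 20, 19, 24, 26, 13]
y = [38, 37, 33, 29, 14, 12, 20, 22, 17, 25, 26, 16]


def signed_rank_pvalue(x, y, correction):
    """Two-sided p-value from the normal approximation (illustrative helper)."""
    # Paired differences, with zero differences discarded.
    d = [a - b for a, b in zip(x, y) if a != b]
    n = len(d)
    abs_sorted = sorted(abs(v) for v in d)

    def avg_rank(value):
        # Average of the 1-based positions that `value` occupies once sorted.
        first = abs_sorted.index(value) + 1
        count = abs_sorted.count(value)
        return first + (count - 1) / 2

    r_plus = sum(avg_rank(abs(v)) for v in d if v > 0)
    r_minus = sum(avg_rank(abs(v)) for v in d if v < 0)
    w = min(r_plus, r_minus)

    mean = n * (n + 1) / 4
    var = n * (n + 1) * (2 * n + 1) / 24
    # Tie correction: each group of t tied absolute differences lowers the variance.
    for value in set(abs_sorted):
        t = abs_sorted.count(value)
        var -= (t ** 3 - t) / 48

    # w is the smaller rank sum, so w <= mean and the shift moves it towards the mean.
    shift = 0.5 if correction else 0.0
    z = (w - mean + shift) / math.sqrt(var)
    # Two-sided p-value 2 * Phi(-|z|), written with the complementary error function.
    return math.erfc(abs(z) / math.sqrt(2))


p_plain = signed_rank_pvalue(x, y, correction=False)
p_corrected = signed_rank_pvalue(x, y, correction=True)
print(round(p_plain, 4), round(p_corrected, 4))
```

Under these assumptions the two values should land close to the 0.266166 and 0.285765 reported above, which illustrates that the gap between the two libraries' defaults comes from the 0.5 shift alone.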
One-sided test
>>> pg.wilcoxon(x, y, alternative='greater')
          W-val alternative     p-val       RBC      CLES
Wilcoxon   20.5     greater  0.876244 -0.378788  0.395833
>>> pg.wilcoxon(x, y, alternative='less')
          W-val alternative     p-val       RBC      CLES
Wilcoxon   20.5        less  0.142883 -0.378788  0.604167
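The two effect sizes described in the Notes can also be reproduced by hand for this data set. The following is a minimal standard-library sketch rather than Pingouin's implementation. It takes f and u to be the shares of the total rank sum carried by the positive and negative differences, and it computes the common language effect size by comparing every value of x with every value of y, which is the reading that matches the CLES printed above. All variable names are illustrative.

```python
x = [20, 22, 19, 20, 22, 18, 24, 20, 19, 24, 26, 13]
y = [38, 37, 33, 29, 14, 12, 20, 22, 17, 25, 26, 16]

# --- Matched pairs rank-biserial correlation: r = f - u ---
d = [a - b for a, b in zip(x, y) if a != b]  # zero differences are dropped
abs_sorted = sorted(abs(v) for v in d)


def avg_rank(value):
    # Average of the 1-based positions that `value` occupies once sorted,
    # so tied absolute differences share the same rank.
    first = abs_sorted.index(value) + 1
    count = abs_sorted.count(value)
    return first + (count - 1) / 2


r_plus = sum(avg_rank(abs(v)) for v in d if v > 0)   # favorable evidence
r_minus = sum(avg_rank(abs(v)) for v in d if v < 0)  # unfavorable evidence
total = r_plus + r_minus
rbc = r_plus / total - r_minus / total

# --- Common language effect size: CL = P(X > Y) + 0.5 * P(X = Y) ---
wins = sum(1 for a in x for b in y if a > b)
ties = sum(1 for a in x for b in y if a == b)
cles = (wins + 0.5 * ties) / (len(x) * len(y))

print(r_plus, r_minus)        # 20.5 45.5
print(round(rbc, 6))          # -0.378788
print(round(cles, 6))         # 0.395833
print(round(1 - cles, 6))     # 0.604167, the value reported for alternative='less'
```

The smaller of the two rank sums, 20.5, is the W-val shown in the tables above.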