pingouin.compute_effsize

pingouin.compute_effsize(x, y, paired=False, eftype='cohen')

Calculate the effect size between two sets of observations.

Parameters:

x : np.array or list

First set of observations.

y : np.array or list

Second set of observations.

paired : boolean

If True, uses the Cohen d-avg formula to correct for repeated measurements (see Notes).

eftype : string

Desired output effect size. Available methods are:

'none': no effect size
'cohen': Cohen d (computed with unbiased sample variances)
'hedges': Hedges g
'r': Pearson correlation coefficient
'pointbiserialr': Point-biserial correlation
'eta-square': Eta-square
'odds-ratio': Odds ratio
'AUC': Area Under the Curve
'CLES': Common Language Effect Size

Returns:

ef : float

Effect size.

See also

convert_effsize: Conversion between effect sizes.
compute_effsize_from_t: Convert a T-statistic to an effect size.

Notes

Missing values are automatically removed from the data. If x and y are paired, the entire row is removed.
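
As a rough illustration of the removal rule above, the following standard-library sketch drops missing values row-wise for paired data and sample-by-sample otherwise. The helper name `drop_missing` is hypothetical and not part of Pingouin, and it only treats float NaN as missing, which is an assumption made for this example.

```python
import math

def drop_missing(x, y, paired=False):
    """Hypothetical helper: remove NaN values from two samples."""
    def is_nan(v):
        return isinstance(v, float) and math.isnan(v)

    if paired:
        # Paired data: drop the whole (x_i, y_i) row if either value is missing.
        kept = [(a, b) for a, b in zip(x, y) if not (is_nan(a) or is_nan(b))]
        return [a for a, _ in kept], [b for _, b in kept]
    # Independent data: each sample is cleaned on its own.
    return [a for a in x if not is_nan(a)], [b for b in y if not is_nan(b)]

nan = float("nan")
x = [1.0, nan, 3.0, 4.0]
y = [2.0, 5.0, nan, 8.0]
paired_x, paired_y = drop_missing(x, y, paired=True)    # two complete rows remain
indep_x, indep_y = drop_missing(x, y, paired=False)     # three values remain per sample
```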

If x and y are independent, the Cohen \(d\) is:

\[d = \frac{\overline{X} - \overline{Y}} {\sqrt{\frac{(n_{1} - 1)\sigma_{1}^{2} + (n_{2} - 1) \sigma_{2}^{2}}{n_{1} + n_{2} - 2}}}\]
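
The pooled-standard-deviation formula above can be checked by hand with a minimal standard-library sketch. The function name `cohen_d_independent` is hypothetical, and the sketch assumes sample variances with an n - 1 denominator; it is not Pingouin's implementation.

```python
from math import sqrt
from statistics import mean, variance

def cohen_d_independent(x, y):
    """Illustrative Cohen d with a pooled standard deviation (hypothetical helper)."""
    n1, n2 = len(x), len(y)
    # statistics.variance returns the sample variance (n - 1 denominator).
    pooled_var = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    return (mean(x) - mean(y)) / sqrt(pooled_var)

d = cohen_d_independent([1, 2, 3, 4], [3, 4, 5, 6, 7])
print(round(d, 4))  # -1.7078
```

Under these assumptions the result agrees with the first example in the Examples section.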

If x and y are paired, the Cohen \(d_{avg}\) is computed:

\[d_{avg} = \frac{\overline{X} - \overline{Y}} {\sqrt{\frac{(\sigma_1^2 + \sigma_2^2)}{2}}}\]
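
A minimal sketch of the paired formula above, using only the standard library. The function name `cohen_d_avg` is hypothetical, and sample variances (n - 1 denominator) are assumed.

```python
from math import sqrt
from statistics import mean, variance

def cohen_d_avg(x, y):
    """Illustrative Cohen d_avg for paired samples (hypothetical helper)."""
    # Average the two sample variances instead of pooling by sample size.
    avg_var = (variance(x) + variance(y)) / 2
    return (mean(x) - mean(y)) / sqrt(avg_var)

x = [1, 2, 3, 4, 5, 6, 7]
y = [1, 3, 5, 7, 9, 11, 13]
d_avg = cohen_d_avg(x, y)
print(round(d_avg, 4))  # -0.8783
```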

Cohen's d is a biased estimate of the population effect size, especially for small samples (n < 20). It is often preferable to use the corrected Hedges \(g\) instead:

\[g = d \times \left(1 - \frac{3}{4(n_1 + n_2) - 9}\right)\]
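
The correction can be sketched in a few lines. The helper name `hedges_g` is hypothetical; the paired Cohen d-avg is recomputed inline from the Notes formula so that the block is self-contained, with sample variances (n - 1 denominator) assumed.

```python
from math import sqrt
from statistics import mean, variance

def hedges_g(d, n1, n2):
    """Apply the small-sample correction to a Cohen d (hypothetical helper)."""
    return d * (1 - 3 / (4 * (n1 + n2) - 9))

x = [1, 2, 3, 4, 5, 6, 7]
y = [1, 3, 5, 7, 9, 11, 13]
# Paired Cohen d_avg, as defined in the Notes.
d_avg = (mean(x) - mean(y)) / sqrt((variance(x) + variance(y)) / 2)
g = hedges_g(d_avg, len(x), len(y))
print(round(g, 4))  # -0.8222
```

The correction factor here is 1 - 3/47, so g is about 6% smaller in magnitude than d for two samples of seven observations.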

The common language effect size is the proportion of pairs where x is higher than y (calculated with a brute-force approach where each observation of x is paired to each observation of y, see pingouin.wilcoxon() for more details):

\[\text{CL} = P(X > Y) + .5 \times P(X = Y)\]
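
The brute-force pairing described above can be sketched as follows. The function name `cles` is hypothetical and the code is only an illustration of the formula, not Pingouin's implementation.

```python
from itertools import product

def cles(x, y):
    """Illustrative brute-force common language effect size (hypothetical helper)."""
    # Pair every observation of x with every observation of y.
    pairs = list(product(x, y))
    greater = sum(a > b for a, b in pairs)
    ties = sum(a == b for a, b in pairs)
    # P(X > Y) + 0.5 * P(X = Y)
    return (greater + 0.5 * ties) / len(pairs)

x = [1, 2, 3, 4, 5, 6, 7]
y = [1, 3, 5, 7, 9, 11, 13]
cl = cles(x, y)
print(round(cl, 4))  # 0.2857
```

Of the 49 pairs, 12 have x greater than y and 4 are ties, giving (12 + 2) / 49.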

For other effect sizes, Pingouin first calculates a Cohen \(d\) and then uses pingouin.convert_effsize() to convert it to the desired effect size.

References

Lakens, D., 2013. Calculating and reporting effect sizes to facilitate cumulative science: a practical primer for t-tests and ANOVAs. Front. Psychol. 4, 863. https://doi.org/10.3389/fpsyg.2013.00863
Cumming, Geoff. Understanding the new statistics: Effect sizes, confidence intervals, and meta-analysis. Routledge, 2013.
https://osf.io/vbdah/

Examples

Cohen d from two independent samples.

>>> import numpy as np
>>> import pingouin as pg
>>> x = [1, 2, 3, 4]
>>> y = [3, 4, 5, 6, 7]
>>> pg.compute_effsize(x, y, paired=False, eftype='cohen')
-1.707825127659933

The sign of the Cohen d will be opposite if we reverse the order of x and y:

>>> pg.compute_effsize(y, x, paired=False, eftype='cohen')
1.707825127659933

Hedges g from two paired samples.

>>> x = [1, 2, 3, 4, 5, 6, 7]
>>> y = [1, 3, 5, 7, 9, 11, 13]
>>> pg.compute_effsize(x, y, paired=True, eftype='hedges')
-0.8222477210374874

Common Language Effect Size.

>>> pg.compute_effsize(x, y, eftype='cles')
0.2857142857142857

In other words, x is higher than y in ~29% of pairs (with ties counted as half a pair), which means that x is lower than y in ~71% of pairs. This can be easily verified by changing the order of x and y:

>>> pg.compute_effsize(y, x, eftype='cles')
0.7142857142857143