pingouin.pairwise_gameshowell

pingouin.pairwise_gameshowell(data=None, dv=None, between=None, alpha=0.05, tail='two-sided', effsize='hedges')

Pairwise Games-Howell post-hoc test.

Parameters
data : pandas DataFrame

Input DataFrame containing the dependent variable and between-factor columns.

dv : string

Name of column containing the dependent variable.

between : string

Name of column containing the between factor.

alpha : float

Significance level.

tail : string

Indicates whether to return 'two-sided' or 'one-sided' p-values.

effsize : string or None

Effect size type. Available methods are:

'none' : no effect size
'cohen' : Unbiased Cohen d
'hedges' : Hedges g
'glass' : Glass delta
'eta-square' : Eta-square
'odds-ratio' : Odds ratio
'AUC' : Area Under the Curve

Returns
stats : DataFrame

Stats summary

'A' : Name of first measurement
'B' : Name of second measurement
'mean(A)' : Mean of first measurement
'mean(B)' : Mean of second measurement
'diff' : Mean difference (= mean(A) - mean(B))
'se' : Standard error
'tail' : indicates whether the p-values are one-sided or two-sided
'T' : T-values
'df' : adjusted degrees of freedom
'pval' : Games-Howell corrected p-values
'hedges' : effect size (or any effect size defined in effsize)


Notes

Games-Howell is very similar to the Tukey-HSD post-hoc test but is much more robust to heterogeneity of variances. While the Tukey-HSD post-hoc is optimal after a classic one-way ANOVA, Games-Howell is optimal after a Welch ANOVA. Games-Howell is not valid for repeated-measures ANOVA.

Compared to the Tukey-HSD test, the Games-Howell test uses a separate pooled variance for each pair of groups instead of a single pooled variance shared across all pairs.
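
The difference in variance pooling can be illustrated with a minimal sketch in plain Python. The data and the helper names `se_tukey` and `se_games_howell` are made up for illustration and are not part of pingouin. Standard errors are expressed on the t scale, i.e. for the difference of two means.

```python
from math import sqrt
from statistics import variance

# Made-up data: three groups of equal size but very different spread
groups = {
    "low": [5.0, 5.1, 4.9, 5.0, 5.2],
    "mid": [6.0, 6.5, 5.5, 6.2, 5.8],
    "high": [7.0, 9.0, 5.0, 8.0, 6.0],
}

# Tukey-HSD style: one pooled within-group variance shared by every pair
df_within = sum(len(g) - 1 for g in groups.values())
pooled_var = sum((len(g) - 1) * variance(g) for g in groups.values()) / df_within


def se_tukey(a, b):
    """Standard error of mean(a) - mean(b) from the shared pooled variance."""
    return sqrt(pooled_var * (1 / len(a) + 1 / len(b)))


def se_games_howell(a, b):
    """Standard error of mean(a) - mean(b) from the two groups' own variances."""
    return sqrt(variance(a) / len(a) + variance(b) / len(b))


for name_a, name_b in [("low", "mid"), ("low", "high"), ("mid", "high")]:
    a, b = groups[name_a], groups[name_b]
    print(name_a, name_b, round(se_tukey(a, b), 3), round(se_games_howell(a, b), 3))
```

With equal group sizes, the pooled standard error is identical for every pair, whereas the per-pair standard error grows with the variances of the two groups actually being compared.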

The T-values are defined as:

$t = \frac{\overline{x}_i - \overline{x}_j} {\sqrt{(\frac{s_i^2}{n_i} + \frac{s_j^2}{n_j})}}$

and the corrected degrees of freedom are:

$v = \frac{(\frac{s_i^2}{n_i} + \frac{s_j^2}{n_j})^2} {\frac{(\frac{s_i^2}{n_i})^2}{n_i-1} + \frac{(\frac{s_j^2}{n_j})^2}{n_j-1}}$

where $\overline{x}_i$, $s_i^2$, and $n_i$ are the mean, variance and sample size of the first group and $\overline{x}_j$, $s_j^2$, and $n_j$ the mean, variance and sample size of the second group.
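
As a rough illustration of the two formulas above, the following sketch computes $t$ and $v$ for a single pair of groups using only the standard library. The function name `games_howell_t_df` and the data are hypothetical; this is not pingouin's implementation.

```python
from math import sqrt
from statistics import mean, variance


def games_howell_t_df(x_i, x_j):
    """Return (t, v) for one pair of groups, following the formulas above."""
    n_i, n_j = len(x_i), len(x_j)
    # Squared standard error of each group mean: s^2 / n (sample variance)
    se2_i = variance(x_i) / n_i
    se2_j = variance(x_j) / n_j
    # T-value: mean difference over the unpooled standard error
    t = (mean(x_i) - mean(x_j)) / sqrt(se2_i + se2_j)
    # Corrected (Welch-Satterthwaite) degrees of freedom
    v = (se2_i + se2_j) ** 2 / (se2_i ** 2 / (n_i - 1) + se2_j ** 2 / (n_j - 1))
    return t, v


# Made-up illustrative data with unequal sizes and variances
group_a = [4.1, 5.0, 5.6, 4.8, 5.3]
group_b = [6.2, 7.9, 5.1, 8.4, 6.6, 7.3]
t, v = games_howell_t_df(group_a, group_b)
print(f"t = {t:.3f}, v = {v:.2f}")
```

The corrected degrees of freedom always lie between $\min(n_i, n_j) - 1$ and $n_i + n_j - 2$, which gives a quick sanity check on any hand computation.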

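The p-values are then approximated using the Studentized range distribution $Q(\sqrt{2}\,|t|, r, v)$, where $r$ is the number of groups and $t$ and $v$ are the T-value and corrected degrees of freedom of the pair being compared.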
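
The p-value step can be sketched with SciPy's `scipy.stats.studentized_range` distribution (available in SciPy 1.7 and later). This is an assumption made for illustration: it is not the Gleason approximation described in this section, so results may differ slightly from pingouin's. The helper name `games_howell_pvalue` is hypothetical, and `r` is taken to be the number of groups.

```python
from math import sqrt

from scipy.stats import studentized_range


def games_howell_pvalue(t, r, v):
    """Upper-tail probability of the studentized range at sqrt(2) * |t|.

    t : T-value of the pair
    r : number of groups (assumed meaning of r)
    v : corrected degrees of freedom of the pair
    """
    q = sqrt(2) * abs(t)
    # Survival function = P(Q > q) for r groups and v degrees of freedom
    return float(studentized_range.sf(q, r, v))


# Made-up inputs: one pair out of three groups
p = games_howell_pvalue(t=-2.5, r=3, v=8.4)
print(f"p = {p:.4f}")
```

For a fixed $|t|$, the p-value increases with the number of groups, which is how the correction for multiple comparisons enters.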

Note that the p-values might differ slightly from those obtained using R or Matlab, since the studentized range approximation uses the Gleason (1999) algorithm, which is more efficient and accurate than the algorithms used in Matlab or R.

References

[1] Games, Paul A., and John F. Howell. “Pairwise multiple comparison procedures with unequal n’s and/or variances: a Monte Carlo study.” Journal of Educational Statistics 1.2 (1976): 113-125.

[2] Gleason, John R. “An accurate, non-iterative approximation for studentized range quantiles.” Computational Statistics & Data Analysis 31.2 (1999): 147-158.

Examples

Pairwise Games-Howell post-hocs on the pain threshold dataset.

>>> from pingouin import pairwise_gameshowell, read_dataset