pingouin.pairwise_gameshowell(data=None, dv=None, between=None, alpha=0.05, tail='two-sided', effsize='hedges')

Pairwise Games-Howell post-hoc test.

data: pandas DataFrame

dv: string

Name of column containing the dependent variable.

between: string

Name of column containing the between factor.


alpha: float

Significance level.


tail: string

Indicates whether to return the 'two-sided' or 'one-sided' p-values.

effsize: string or None

Effect size type. Available methods are:

'none' : no effect size
'cohen' : Unbiased Cohen d
'hedges' : Hedges g
'glass' : Glass delta
'eta-square' : Eta-square
'odds-ratio' : Odds ratio
'AUC' : Area Under the Curve

stats: pandas DataFrame

Stats summary:

'A' : Name of first measurement
'B' : Name of second measurement
'mean(A)' : Mean of first measurement
'mean(B)' : Mean of second measurement
'diff' : Mean difference (= mean(A) - mean(B))
'se' : Standard error
'tail' : indicates whether the p-values are one-sided or two-sided
'T' : T-values
'df' : adjusted degrees of freedom
'pval' : Games-Howell corrected p-values
'hedges' : effect size (or any effect size defined in ``effsize``)


Games-Howell is very similar to the Tukey HSD post-hoc test but is much more robust to heterogeneity of variances. While the Tukey-HSD post-hoc is optimal after a classic one-way ANOVA, the Games-Howell is optimal after a Welch ANOVA. Games-Howell is not valid for repeated measures ANOVA.

Compared to the Tukey-HSD test, the Games-Howell test uses a separate variance estimate for each pair of groups instead of a single pooled variance.

The T-values are defined as:

\[t = \frac{\overline{x}_i - \overline{x}_j} {\sqrt{(\frac{s_i^2}{n_i} + \frac{s_j^2}{n_j})}}\]

and the corrected degrees of freedom are:

\[v = \frac{(\frac{s_i^2}{n_i} + \frac{s_j^2}{n_j})^2} {\frac{(\frac{s_i^2}{n_i})^2}{n_i-1} + \frac{(\frac{s_j^2}{n_j})^2}{n_j-1}}\]

where \(\overline{x}_i\), \(s_i^2\), and \(n_i\) are the mean, variance and sample size of the first group and \(\overline{x}_j\), \(s_j^2\), and \(n_j\) the mean, variance and sample size of the second group.
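The two formulas above can be checked by hand for a single pair of groups. The following is a minimal sketch using only the Python standard library; the helper name `games_howell_t_df` and the sample values are made up for illustration and are not part of pingouin.

```python
from math import sqrt
from statistics import mean, variance


def games_howell_t_df(x, y):
    """Return (t, v) for one pair of groups, following the formulas above.

    Hypothetical helper for illustration only, not pingouin's implementation.
    """
    n_i, n_j = len(x), len(y)
    # s^2 / n for each group (statistics.variance is the sample variance)
    se2_i = variance(x) / n_i
    se2_j = variance(y) / n_j
    # T-value: mean difference over the unpooled standard error
    t = (mean(x) - mean(y)) / sqrt(se2_i + se2_j)
    # Corrected (Welch-Satterthwaite style) degrees of freedom
    v = (se2_i + se2_j) ** 2 / (
        se2_i ** 2 / (n_i - 1) + se2_j ** 2 / (n_j - 1)
    )
    return t, v


# Made-up samples with clearly unequal variances
group_a = [1.0, 2.0, 3.0, 4.0, 5.0]
group_b = [2.0, 4.0, 6.0, 8.0, 10.0]
t, v = games_howell_t_df(group_a, group_b)
print(round(t, 4), round(v, 4))
```

With these samples the corrected degrees of freedom fall below the value of 8 that a pooled-variance test would use, which is how the unequal variances are accounted for.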

The p-values are then approximated using the Studentized range distribution \(Q(\sqrt{2}|t_i|, r, v_i)\), where \(r\) is the total number of groups.

Note that the p-values might differ slightly from those obtained using R or Matlab, since the studentized range approximation is done using the Gleason (1999) algorithm, which is more efficient and accurate than the algorithms used in Matlab or R.
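As a rough illustration of the p-value step, the sketch below evaluates the upper tail of the studentized range distribution at \(\sqrt{2}|t|\). It assumes SciPy 1.7 or later and uses `scipy.stats.studentized_range`, which is a different numerical method from the Gleason (1999) approximation, so results may not match this function exactly. The helper name `games_howell_pvalue` and the input values are hypothetical.

```python
from math import sqrt

from scipy.stats import studentized_range


def games_howell_pvalue(t, n_groups, df):
    """Upper-tail probability of the studentized range at sqrt(2) * |t|.

    Hypothetical helper: uses SciPy's distribution, not the Gleason (1999)
    approximation described above.
    """
    q = sqrt(2) * abs(t)
    # Shape parameters: k = number of groups, df = corrected degrees of freedom
    return float(studentized_range.sf(q, n_groups, df))


# Illustrative inputs: one pairwise comparison out of three groups
p_small_t = games_howell_pvalue(t=-1.9, n_groups=3, df=5.9)
p_large_t = games_howell_pvalue(t=-4.0, n_groups=3, df=5.9)
print(p_small_t, p_large_t)
```

A larger absolute T-value gives a smaller p-value, and for the same T-value the p-value grows with the number of groups, which reflects the multiple-comparison correction.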



Games, Paul A., and John F. Howell. “Pairwise multiple comparison procedures with unequal n’s and/or variances: a Monte Carlo study.” Journal of Educational Statistics 1.2 (1976): 113-125.


Gleason, John R. “An accurate, non-iterative approximation for studentized range quantiles.” Computational statistics & data analysis 31.2 (1999): 147-158.


Pairwise Games-Howell post-hocs on the pain threshold dataset.

>>> from pingouin import pairwise_gameshowell, read_dataset
>>> df = read_dataset('anova')
>>> pairwise_gameshowell(data=df, dv='Pain threshold',
...                      between='Hair color')