pingouin.pairwise_gameshowell(data=None, dv=None, between=None, effsize='hedges')

Pairwise Games-Howell post-hoc test.




dv: string

Name of column containing the dependent variable.

between: string

Name of column containing the between factor.

effsize: string or None

Effect size type. Available methods are:

  • 'none': no effect size

  • 'cohen': Unbiased Cohen d

  • 'hedges': Hedges g

  • 'r': Pearson correlation coefficient

  • 'eta-square': Eta-square

  • 'odds-ratio': Odds ratio

  • 'AUC': Area Under the Curve

  • 'CLES': Common Language Effect Size
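To illustrate how the 'cohen' and 'hedges' options relate, here is a minimal sketch of the textbook definitions for two independent samples. The function names `cohen_d` and `hedges_g` are hypothetical, and it is an assumption (not confirmed by this page) that Pingouin uses exactly these formulas internally.

```python
import numpy as np


def cohen_d(x, y):
    """Cohen d for two independent samples, using the pooled standard deviation."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    nx, ny = x.size, y.size
    # Pooled variance: within-group variances weighted by their degrees of freedom
    pooled_var = ((nx - 1) * x.var(ddof=1) + (ny - 1) * y.var(ddof=1)) / (nx + ny - 2)
    return (x.mean() - y.mean()) / np.sqrt(pooled_var)


def hedges_g(x, y):
    """Hedges g: Cohen d multiplied by an approximate small-sample bias correction."""
    nx, ny = len(x), len(y)
    correction = 1 - 3 / (4 * (nx + ny) - 9)
    return cohen_d(x, y) * correction


# Hypothetical toy data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 3, 4, 5, 6]
print(cohen_d(x, y), hedges_g(x, y))
```

Because the correction factor is below 1, Hedges g is always slightly smaller in magnitude than Cohen d, and the gap shrinks as the sample sizes grow.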


Stats summary (one row per pair of groups), with the following columns:

  • 'A': Name of first measurement

  • 'B': Name of second measurement

  • 'mean(A)': Mean of first measurement

  • 'mean(B)': Mean of second measurement

  • 'diff': Mean difference (= mean(A) - mean(B))

  • 'se': Standard error

  • 'T': T-values

  • 'df': Adjusted degrees of freedom

  • 'pval': Games-Howell corrected p-values

  • 'hedges': Hedges effect size (or any effect size defined in effsize)


Games-Howell [1] is very similar to the Tukey-HSD post-hoc test but is much more robust to heterogeneity of variances. While the Tukey-HSD post-hoc is optimal after a classic one-way ANOVA, the Games-Howell is optimal after a Welch ANOVA. Please note that Games-Howell is not valid for repeated measures ANOVA. Only one-way ANOVA designs are supported.

Compared to the Tukey-HSD test, the Games-Howell test uses a different pooled variance for each pair of groups instead of the same pooled variance for all pairs.
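The difference can be made concrete by comparing the standard error of a mean difference under both approaches. This is only a sketch: the function names are hypothetical, and the Tukey-side formula is the standard Tukey-Kramer form based on the one-way ANOVA mean squared error, which is an assumption not taken from this page.

```python
import numpy as np


def tukey_se(groups, i, j):
    """SE of mean(i) - mean(j) using one variance pooled across ALL groups."""
    n_total = sum(len(g) for g in groups)
    # Within-group sum of squares, then the one-way ANOVA mean squared error
    ss_within = sum((len(g) - 1) * np.var(g, ddof=1) for g in groups)
    mse = ss_within / (n_total - len(groups))
    return np.sqrt(mse * (1 / len(groups[i]) + 1 / len(groups[j])))


def games_howell_se(groups, i, j):
    """SE of mean(i) - mean(j) using only the variances of groups i and j."""
    var_i = np.var(groups[i], ddof=1) / len(groups[i])
    var_j = np.var(groups[j], ddof=1) / len(groups[j])
    return np.sqrt(var_i + var_j)


# Hypothetical toy data: the second group is far more variable than the others
groups = [[1, 2, 3, 4, 5], [2, 4, 6, 8, 10, 12], [1.5, 2.5, 3.5, 4.5, 5.5]]
print(tukey_se(groups, 0, 2), games_howell_se(groups, 0, 2))
```

In this toy data the high-variance second group inflates the pooled estimate, so the comparison between the first and third groups gets a larger standard error under the pooled approach than under the pair-specific one.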

The T-values are defined as:

\[t = \frac{\overline{x}_i - \overline{x}_j} {\sqrt{(\frac{s_i^2}{n_i} + \frac{s_j^2}{n_j})}}\]

and the corrected degrees of freedom are:

\[v = \frac{(\frac{s_i^2}{n_i} + \frac{s_j^2}{n_j})^2} {\frac{(\frac{s_i^2}{n_i})^2}{n_i-1} + \frac{(\frac{s_j^2}{n_j})^2}{n_j-1}}\]

where \(\overline{x}_i\), \(s_i^2\), and \(n_i\) are the mean, variance and sample size of the first group and \(\overline{x}_j\), \(s_j^2\), and \(n_j\) the mean, variance and sample size of the second group.

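The p-values are then approximated using the Studentized range distribution \(Q(\sqrt2|t_i|, r, v_i)\), where \(t_i\) and \(v_i\) are the T-value and corrected degrees of freedom of the \(i\)-th pairwise comparison and \(r\) is the total number of groups.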
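The three steps above (T-value, corrected degrees of freedom, Studentized range p-value) can be sketched with NumPy and SciPy. This is a minimal illustration, not Pingouin's actual implementation: the function name `games_howell_sketch` and the dict-of-arrays input are hypothetical, \(r\) is assumed to be the total number of groups, and `scipy.stats.studentized_range` requires SciPy 1.7 or later.

```python
from itertools import combinations

import numpy as np
from scipy.stats import studentized_range  # available in SciPy >= 1.7


def games_howell_sketch(groups):
    """Return (name_a, name_b, t, df, pval) for every pair of groups.

    `groups` is a hypothetical dict mapping a group name to a 1-D sample.
    """
    r = len(groups)  # total number of groups (assumed meaning of r)
    results = []
    for (name_a, x_a), (name_b, x_b) in combinations(groups.items(), 2):
        x_a = np.asarray(x_a, dtype=float)
        x_b = np.asarray(x_b, dtype=float)
        n_a, n_b = x_a.size, x_b.size
        # s^2 / n for each group, using the sample variance (ddof=1)
        v_a = x_a.var(ddof=1) / n_a
        v_b = x_b.var(ddof=1) / n_b
        t = (x_a.mean() - x_b.mean()) / np.sqrt(v_a + v_b)
        df = (v_a + v_b) ** 2 / (v_a**2 / (n_a - 1) + v_b**2 / (n_b - 1))
        # Upper tail of the studentized range distribution at sqrt(2) * |t|
        pval = studentized_range.sf(np.sqrt(2) * abs(t), r, df)
        results.append((name_a, name_b, t, df, pval))
    return results


# Hypothetical toy data, for illustration only
toy = {
    "a": [1, 2, 3, 4, 5],
    "b": [2, 4, 6, 8, 10, 12],
    "c": [1.5, 2.5, 3.5, 4.5, 5.5],
}
for row in games_howell_sketch(toy):
    print(row)
```

Results from a sketch like this should be cross-checked against `pingouin.pairwise_gameshowell` itself rather than used in its place.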


Versions of Pingouin below 0.3.10 used an incorrect algorithm for the studentized range approximation [2], which resulted in (slightly) incorrect p-values. Please make sure you’re using the LATEST VERSION of Pingouin, and always DOUBLE CHECK your results with another statistical software package.



[1] Games, Paul A., and John F. Howell. “Pairwise multiple comparison procedures with unequal n’s and/or variances: a Monte Carlo study.” Journal of Educational Statistics 1.2 (1976): 113-125.


[2] Gleason, John R. “An accurate, non-iterative approximation for studentized range quantiles.” Computational Statistics & Data Analysis 31.2 (1999): 147-158.


Pairwise Games-Howell post-hocs on the Penguins dataset.

>>> import pingouin as pg
>>> df = pg.read_dataset('penguins')
>>> pg.pairwise_gameshowell(data=df, dv='body_mass_g',
...                         between='species').round(3)
           A          B   mean(A)   mean(B)      diff      se       T       df   pval  hedges
0     Adelie  Chinstrap  3700.662  3733.088   -32.426  59.706  -0.543  152.455  0.834  -0.079
1     Adelie     Gentoo  3700.662  5076.016 -1375.354  58.811 -23.386  249.643  0.001  -2.833
2  Chinstrap     Gentoo  3733.088  5076.016 -1342.928  65.103 -20.628  170.404  0.001  -3.105