pingouin.pairwise_gameshowell

pingouin.pairwise_gameshowell(data=None, dv=None, between=None, effsize='hedges')
Pairwise Games-Howell post-hoc test.
 Parameters
 data : pandas.DataFrame
DataFrame containing the dv and between columns.
 dv : string
Name of column containing the dependent variable.
 between : string
Name of column containing the between factor.
 effsize : string or None
Effect size type. Available methods are:
'none' : no effect size
'cohen' : Unbiased Cohen d
'hedges' : Hedges g
'glass' : Glass delta
'r' : Pearson correlation coefficient
'eta-square' : Eta-square
'odds-ratio' : Odds ratio
'AUC' : Area Under the Curve
'CLES' : Common Language Effect Size
 Returns
 stats : pandas.DataFrame
Stats summary:
'A' : Name of first measurement
'B' : Name of second measurement
'mean(A)' : Mean of first measurement
'mean(B)' : Mean of second measurement
'diff' : Mean difference (= mean(A) - mean(B))
'se' : Standard error
'T' : T-values
'df' : Adjusted degrees of freedom
'pval' : Games-Howell corrected p-values
'hedges' : Hedges effect size (or any effect size defined in effsize)
Notes
Games-Howell [1] is very similar to the Tukey HSD post-hoc test but is much more robust to heterogeneity of variances. While the Tukey HSD post-hoc test is optimal after a classic one-way ANOVA, the Games-Howell test is optimal after a Welch ANOVA. Please note that Games-Howell is not valid for repeated measures ANOVA. Only one-way ANOVA designs are supported.
Compared to the Tukey HSD test, the Games-Howell test uses a different pooled variance for each pair of groups instead of the same pooled variance for all pairs.
The T-values are defined as:
\[t = \frac{\overline{x}_i - \overline{x}_j}{\sqrt{\frac{s_i^2}{n_i} + \frac{s_j^2}{n_j}}}\]
and the corrected degrees of freedom are:
\[v = \frac{\left(\frac{s_i^2}{n_i} + \frac{s_j^2}{n_j}\right)^2}{\frac{\left(\frac{s_i^2}{n_i}\right)^2}{n_i - 1} + \frac{\left(\frac{s_j^2}{n_j}\right)^2}{n_j - 1}}\]
where \(\overline{x}_i\), \(s_i^2\), and \(n_i\) are the mean, variance and sample size of the first group and \(\overline{x}_j\), \(s_j^2\), and \(n_j\) the mean, variance and sample size of the second group.
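As a rough illustration of the two formulas above, the following sketch computes the T-value and the corrected degrees of freedom for a single pair of groups using only the Python standard library. The function name games_howell_t_df and the toy data are hypothetical and chosen for illustration; this is not Pingouin's internal implementation.

```python
import math
import statistics


def games_howell_t_df(x_i, x_j):
    """Return (t, v) for one pair of groups (hypothetical helper)."""
    n_i, n_j = len(x_i), len(x_j)
    # Variance of each group mean: s^2 / n, with s^2 the sample variance (ddof = 1)
    var_mean_i = statistics.variance(x_i) / n_i
    var_mean_j = statistics.variance(x_j) / n_j
    # T-value: mean difference divided by the pair-specific standard error
    t = (statistics.mean(x_i) - statistics.mean(x_j)) / math.sqrt(var_mean_i + var_mean_j)
    # Corrected (Welch-Satterthwaite) degrees of freedom
    v = (var_mean_i + var_mean_j) ** 2 / (
        var_mean_i ** 2 / (n_i - 1) + var_mean_j ** 2 / (n_j - 1)
    )
    return t, v


# Toy data, for illustration only
group_a = [1.0, 2.0, 3.0, 4.0, 5.0]
group_b = [2.0, 4.0, 6.0, 8.0, 10.0]
t, v = games_howell_t_df(group_a, group_b)
print(round(t, 3), round(v, 3))
```

Because the standard error is computed from the two groups being compared only, each pair gets its own T-value and its own degrees of freedom.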
The p-values are then approximated using the Studentized range distribution \(Q(\sqrt{2}\,|t|, r, v)\), where \(r\) is the number of groups.
Caution
The p-values might be slightly different from those obtained with R or Matlab because Pingouin uses the Gleason (1999) algorithm [2] for the studentized range approximation, which is more efficient and accurate.
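The p-value step can be sketched as the upper tail of the studentized range distribution evaluated at sqrt(2) times the absolute T-value. The sketch below assumes SciPy 1.7 or later, which provides scipy.stats.studentized_range; the helper name games_howell_pvalue is hypothetical. SciPy evaluates the distribution by numerical integration rather than with the Gleason approximation, so the result may differ slightly from Pingouin's output.

```python
import math

from scipy.stats import studentized_range


def games_howell_pvalue(t, v, n_groups):
    """Approximate p-value for one pair (hypothetical helper).

    t        : T-value of the pair
    v        : corrected degrees of freedom of the pair
    n_groups : total number of groups in the between factor
    """
    # Convert the T-value to the studentized range scale
    q = math.sqrt(2) * abs(t)
    # Survival function = probability of observing a range at least this large
    return float(studentized_range.sf(q, n_groups, v))


# Illustrative call with made-up inputs: 3 groups, T = -2.5, v = 12.3
p = games_howell_pvalue(-2.5, 12.3, 3)
print(p)
```

The absolute value makes the test two-sided: swapping the order of the two groups flips the sign of the T-value but leaves the p-value unchanged.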
References
[1] Games, Paul A., and John F. Howell. “Pairwise multiple comparison procedures with unequal n’s and/or variances: a Monte Carlo study.” Journal of Educational Statistics 1.2 (1976): 113-125.
[2] Gleason, John R. “An accurate, non-iterative approximation for studentized range quantiles.” Computational Statistics & Data Analysis 31.2 (1999): 147-158.
Examples
Pairwise Games-Howell post-hocs on the Penguins dataset.
>>> import pingouin as pg
>>> df = pg.read_dataset('penguins')
>>> pg.pairwise_gameshowell(data=df, dv='body_mass_g',
...                         between='species').round(3)
           A          B   mean(A)   mean(B)      diff      se       T       df   pval  hedges
0     Adelie  Chinstrap  3700.662  3733.088   -32.426  59.706  -0.543  152.455  0.841  -0.079
1     Adelie     Gentoo  3700.662  5076.016 -1375.354  58.811 -23.386  249.643  0.001  -2.833
2  Chinstrap     Gentoo  3733.088  5076.016 -1342.928  65.103 -20.628  170.404  0.001  -3.105