pingouin.pairwise_tukey(data=None, dv=None, between=None, effsize='hedges')

Pairwise Tukey-HSD post-hoc test.


data: pandas.DataFrame

DataFrame. Note that this function can also be used directly as a Pandas method, in which case this argument is no longer needed.


dv: string

Name of column containing the dependent variable.

between: string

Name of column containing the between factor.

effsize: string or None

Effect size type. Available methods are:

  • 'none': no effect size

  • 'cohen': Unbiased Cohen d

  • 'hedges': Hedges g

  • 'r': Pearson correlation coefficient

  • 'eta-square': Eta-square

  • 'odds-ratio': Odds ratio

  • 'AUC': Area Under the Curve

  • 'CLES': Common Language Effect Size

Returns a pandas DataFrame with the following columns:

  • 'A': Name of first measurement

  • 'B': Name of second measurement

  • 'mean(A)': Mean of first measurement

  • 'mean(B)': Mean of second measurement

  • 'diff': Mean difference (= mean(A) - mean(B))

  • 'se': Standard error

  • 'T': T-values

  • 'p-tukey': Tukey-HSD corrected p-values

  • 'hedges': Hedges effect size (or any effect size defined in effsize)


The Tukey HSD post-hoc test [1] is best suited to balanced one-way ANOVA designs.

It has been proven to be conservative for one-way ANOVA with unequal sample sizes. However, it is not robust to unequal group variances, in which case the Games-Howell test is more appropriate. Tukey HSD is not valid for repeated measures ANOVA. Only one-way ANOVA designs are supported.

The T-values are defined as:

\[t = \frac{\overline{x}_i - \overline{x}_j} {\sqrt{2 \cdot \text{MS}_w / n}}\]

where \(\overline{x}_i\) and \(\overline{x}_j\) are the means of the first and second group, respectively, \(\text{MS}_w\) is the mean squares of the error (computed using ANOVA), and \(n\) is the sample size of each group.

If the sample sizes are unequal, the Tukey-Kramer procedure is automatically used:

\[t = \frac{\overline{x}_i - \overline{x}_j}{\sqrt{\frac{\text{MS}_w}{n_i} + \frac{\text{MS}_w}{n_j}}}\]

where \(n_i\) and \(n_j\) are the sample sizes of the first and second group, respectively.
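
The T-value formulas above can be reproduced by hand. The following is a minimal sketch in plain Python, not Pingouin's own implementation; the helper name `tukey_t_values` and the toy data are hypothetical and chosen only for illustration. It uses the Tukey-Kramer denominator, which reduces to \(\sqrt{2 \cdot \text{MS}_w / n}\) when both groups have the same size.

```python
from math import sqrt
from statistics import mean


def tukey_t_values(groups):
    """Return {(label_i, label_j): t} for every pair of groups.

    `groups` maps a group label to a list of observations.
    (Hypothetical helper, for illustration only.)
    """
    labels = list(groups)
    means = {k: mean(v) for k, v in groups.items()}
    n_total = sum(len(v) for v in groups.values())
    r = len(labels)

    # MS_w: within-group sum of squares divided by its degrees of freedom N - r
    ss_within = sum((x - means[k]) ** 2 for k, v in groups.items() for x in v)
    ms_w = ss_within / (n_total - r)

    t_values = {}
    for a in range(r):
        for b in range(a + 1, r):
            i, j = labels[a], labels[b]
            # Tukey-Kramer standard error of the mean difference
            se = sqrt(ms_w / len(groups[i]) + ms_w / len(groups[j]))
            t_values[(i, j)] = (means[i] - means[j]) / se
    return t_values


# Toy data (made up): groups "a" and "b" are balanced, "c" is larger
toy = {"a": [1.0, 2.0, 3.0], "b": [2.0, 3.0, 4.0], "c": [5.0, 6.0, 7.0, 8.0]}
t_values = tukey_t_values(toy)
```

Each key of `t_values` is a pair of group labels, so `t_values[("a", "b")]` holds the T-value for the comparison of group "a" against group "b".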

The p-values are then approximated using the studentized range distribution \(Q(\sqrt{2}\,|t|, r, N - r)\), where \(r\) is the total number of groups, \(N\) is the total sample size, and \(N - r\) is the number of degrees of freedom of the error term.
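
As an illustration of this step, the sketch below converts a T-value into a p-value using `scipy.stats.studentized_range`, which is available in SciPy 1.7 and later. This is an assumption about tooling, not necessarily how Pingouin computes the p-values internally, and the helper name `tukey_p_value` is hypothetical.

```python
from math import sqrt

from scipy.stats import studentized_range, t as student_t


def tukey_p_value(t_value, n_groups, n_total):
    """Upper-tail probability of the studentized range at q = sqrt(2) * |t|.

    (Hypothetical helper, for illustration only.)
    """
    q = sqrt(2) * abs(t_value)
    df_error = n_total - n_groups  # N - r
    return float(studentized_range.sf(q, n_groups, df_error))


# Example call with made-up numbers: 3 groups, 10 observations in total
p = tukey_p_value(-1.08, n_groups=3, n_total=10)

# Sanity check: with only two groups, the result should agree with the
# two-sided p-value of a Student t-test on N - 2 degrees of freedom.
p_two_groups = tukey_p_value(2.0, n_groups=2, n_total=20)
p_t_test = 2 * float(student_t.sf(2.0, df=18))
```

The two-group comparison is a convenient way to validate the \(\sqrt{2}\) scaling, since the range of two means is just their absolute difference.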


Versions of Pingouin below 0.3.10 used a wrong algorithm for the studentized range approximation [2], which resulted in (slightly) incorrect p-values. Please make sure you’re using the LATEST VERSION of Pingouin, and always DOUBLE CHECK your results with another statistical software.



[1] Tukey, John W. “Comparing individual means in the analysis of variance.” Biometrics (1949): 99-114.

[2] Gleason, John R. “An accurate, non-iterative approximation for studentized range quantiles.” Computational statistics & data analysis 31.2 (1999): 147-158.


Pairwise Tukey post-hocs on the Penguins dataset.

>>> import pingouin as pg
>>> df = pg.read_dataset('penguins')
>>> df.pairwise_tukey(dv='body_mass_g', between='species').round(3)
           A          B   mean(A)   mean(B)      diff      se       T  p-tukey  hedges
0     Adelie  Chinstrap  3700.662  3733.088   -32.426  67.512  -0.480    0.869  -0.070
1     Adelie     Gentoo  3700.662  5076.016 -1375.354  56.148 -24.495    0.001  -2.967
2  Chinstrap     Gentoo  3733.088  5076.016 -1342.928  69.857 -19.224    0.001  -2.894
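
As a quick consistency check on the output above, the T column appears to be the mean difference divided by its standard error. The sketch below recomputes it from the rounded values printed in the table, so agreement is only expected up to rounding; the variable names are illustrative.

```python
# (A, B, diff, se, T) copied from the rounded output above
rows = [
    ("Adelie", "Chinstrap", -32.426, 67.512, -0.480),
    ("Adelie", "Gentoo", -1375.354, 56.148, -24.495),
    ("Chinstrap", "Gentoo", -1342.928, 69.857, -19.224),
]

# Recompute T as diff / se and measure the gap to the reported T-value
gaps = [abs(diff / se - t_reported) for _, _, diff, se, t_reported in rows]
max_gap = max(gaps)
```

A small `max_gap` indicates that the reported T-values are consistent with the `diff` and `se` columns.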