# pingouin.ttest

pingouin.ttest(x, y, paired=False, tail='two-sided', correction='auto', r=0.707, confidence=0.95)

T-test.

Parameters
x : array_like

First set of observations.

y : array_like or float

Second set of observations. If y is a single value, a one-sample T-test is computed against that value (= “mu” in the t.test R function).

paired : boolean

Specify whether the two observations are related (i.e. repeated measures) or independent.

tail : string

Specify whether the alternative hypothesis is ‘two-sided’ or ‘one-sided’. Can also be ‘greater’ or ‘less’ to specify the direction of the test. ‘greater’ tests the alternative that x has a larger mean than y. If tail is ‘one-sided’, Pingouin will automatically infer the one-sided alternative hypothesis based on the test statistic.

correction : string or boolean

For unpaired two-sample T-tests, specify whether or not to correct for unequal variances using Welch’s separate-variances T-test. If ‘auto’, Welch’s T-test is automatically used when the sample sizes are unequal, as recommended by Zimmerman (2004).

r : float

Cauchy scale factor for computing the Bayes Factor. Smaller values of r (e.g. 0.5) may be appropriate when small effect sizes are expected a priori; larger values of r are appropriate when large effect sizes are expected (Rouder et al., 2009). The default is 0.707 (= $$\sqrt{2} / 2$$).

confidence : float

Confidence level for the confidence intervals (0.95 = 95%).

New in version 0.3.9.

Returns
stats : pandas.DataFrame
• 'T': T-value

• 'p-val': p-value

• 'dof': degrees of freedom

• 'cohen-d': Cohen d effect size

• 'CI95%': confidence intervals of the difference in means

• 'power': achieved power of the test (= 1 - type II error)

• 'BF10': Bayes Factor of the alternative hypothesis

Notes

Missing values are automatically removed from the data. If x and y are paired, the entire row is removed (= listwise deletion).
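As an illustration of listwise deletion on paired data (a plain NumPy sketch with hypothetical values, not part of Pingouin's API), the entire pair is dropped whenever either sample contains a missing value:

```python
import numpy as np

# Hypothetical paired data with one missing value in `pre`
pre = np.array([5.5, 2.4, np.nan, 9.6, 4.2])
post = np.array([6.4, 3.4, 6.4, 11.0, 4.8])

# Listwise deletion: drop the entire pair when either value is NaN
mask = ~(np.isnan(pre) | np.isnan(post))
pre_clean, post_clean = pre[mask], post[mask]  # both now have 4 observations
```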

The T-value for unpaired samples is defined as:

$t = \frac{\overline{x} - \overline{y}} {\sqrt{\frac{s^{2}_{x}}{n_{x}} + \frac{s^{2}_{y}}{n_{y}}}}$

where $$\overline{x}$$ and $$\overline{y}$$ are the sample means, $$n_{x}$$ and $$n_{y}$$ are the sample sizes, and $$s^{2}_{x}$$ and $$s^{2}_{y}$$ are the sample variances. The degrees of freedom $$v$$ are $$n_x + n_y - 2$$ when the sample sizes are equal. When the sample sizes are unequal or when correction=True, the Welch–Satterthwaite equation is used to approximate the adjusted degrees of freedom:

$v = \frac{(\frac{s^{2}_{x}}{n_{x}} + \frac{s^{2}_{y}}{n_{y}})^{2}} {\frac{(\frac{s^{2}_{x}}{n_{x}})^{2}}{(n_{x}-1)} + \frac{(\frac{s^{2}_{y}}{n_{y}})^{2}}{(n_{y}-1)}}$

The p-value is then calculated using a T distribution with $$v$$ degrees of freedom.
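The unpaired T-value and the Welch–Satterthwaite degrees of freedom above can be sketched directly with NumPy/SciPy (the data are arbitrary and this is illustrative only, not Pingouin's implementation):

```python
import numpy as np
from scipy import stats

x = np.array([5.5, 2.4, 6.8, 9.6, 4.2])
y = np.array([6.4, 3.4, 6.4, 11.0, 4.8])

nx, ny = x.size, y.size
sx2, sy2 = x.var(ddof=1), y.var(ddof=1)   # unbiased sample variances

se2 = sx2 / nx + sy2 / ny
t = (x.mean() - y.mean()) / np.sqrt(se2)  # Welch T-value

# Welch–Satterthwaite approximation of the adjusted degrees of freedom
v = se2**2 / ((sx2 / nx) ** 2 / (nx - 1) + (sy2 / ny) ** 2 / (ny - 1))

p = 2 * stats.t.sf(abs(t), df=v)          # two-sided p-value
```

The same numbers come out of scipy.stats.ttest_ind(x, y, equal_var=False), which implements Welch's test.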

The T-value for paired samples is defined by:

$t = \frac{\overline{x}_d}{s_{\overline{x}}}$

where

$s_{\overline{x}} = \frac{s_d}{\sqrt n}$

where $$\overline{x}_d$$ is the sample mean of the differences between the two paired samples, $$n$$ is the number of observations (sample size), $$s_d$$ is the sample standard deviation of the differences and $$s_{\overline{x}}$$ is the estimated standard error of the mean of the differences. The p-value is then calculated using a T-distribution with $$n-1$$ degrees of freedom.
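The paired T-value thus reduces to a one-sample test on the pairwise differences; a minimal NumPy/SciPy sketch (illustrative data, not Pingouin's implementation):

```python
import numpy as np
from scipy import stats

pre = np.array([5.5, 2.4, 6.8, 9.6, 4.2])
post = np.array([6.4, 3.4, 6.4, 11.0, 4.8])

d = pre - post                         # pairwise differences
n = d.size
se = d.std(ddof=1) / np.sqrt(n)        # standard error of the mean difference
t = d.mean() / se                      # paired T-value
p = 2 * stats.t.sf(abs(t), df=n - 1)   # two-sided p-value
```

With these data, t rounds to -2.31, matching the paired example in the Examples section.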

The scaled Jeffreys-Zellner-Siow (JZS) Bayes Factor is approximated using the pingouin.bayesfactor_ttest() function.

Results have been tested against JASP and the t.test R function.

References

• https://www.itl.nist.gov/div898/handbook/eda/section3/eda353.htm

• Delacre, M., Lakens, D., & Leys, C. (2017). Why psychologists should by default use Welch’s t-test instead of Student’s t-test. International Review of Social Psychology, 30(1).

• Zimmerman, D. W. (2004). A note on preliminary tests of equality of variances. British Journal of Mathematical and Statistical Psychology, 57(1), 173-181.

• Rouder, J.N., Speckman, P.L., Sun, D., Morey, R.D., Iverson, G., 2009. Bayesian t tests for accepting and rejecting the null hypothesis. Psychon. Bull. Rev. 16, 225–237. https://doi.org/10.3758/PBR.16.2.225

Examples

1. One-sample T-test.

>>> from pingouin import ttest
>>> x = [5.5, 2.4, 6.8, 9.6, 4.2]
>>> ttest(x, 4).round(2)
T  dof       tail  p-val         CI95%  cohen-d   BF10  power
T-test  1.4    4  two-sided   0.23  [2.32, 9.08]     0.62  0.766   0.19

2. Paired T-test.

Note that in the example below, since tail='one-sided', Pingouin will automatically infer the alternative hypothesis based on the T-value. Here the T-value is negative, so the tail is set to ‘less’:

>>> pre = [5.5, 2.4, 6.8, 9.6, 4.2]
>>> post = [6.4, 3.4, 6.4, 11., 4.8]
>>> ttest(pre, post, paired=True, tail='one-sided').round(2)
T  dof  tail  p-val          CI95%  cohen-d   BF10  power
T-test -2.31    4  less   0.04  [-inf, -0.05]     0.25  3.122   0.12


which is indeed equivalent to directly testing that x has a smaller mean than y (tail = 'less')

>>> ttest(pre, post, paired=True, tail='less').round(2)
T  dof  tail  p-val          CI95%  cohen-d   BF10  power
T-test -2.31    4  less   0.04  [-inf, -0.05]     0.25  3.122   0.12


Now testing the opposite alternative hypothesis (tail = 'greater')

>>> ttest(pre, post, paired=True, tail='greater').round(2)
T  dof     tail  p-val         CI95%  cohen-d  BF10  power
T-test -2.31    4  greater   0.96  [-1.35, inf]     0.25  0.32   0.02

3. Paired T-test with missing values.

>>> import numpy as np
>>> pre = [5.5, 2.4, np.nan, 9.6, 4.2]
>>> post = [6.4, 3.4, 6.4, 11., 4.8]
>>> stats = ttest(pre, post, paired=True)

4. Independent two-sample T-test (equal sample size).

>>> np.random.seed(123)
>>> x = np.random.normal(loc=7, size=20)
>>> y = np.random.normal(loc=4, size=20)
>>> stats = ttest(x, y, correction='auto')

5. Independent two-sample T-test (unequal sample size).

>>> np.random.seed(123)
>>> x = np.random.normal(loc=7, size=20)
>>> y = np.random.normal(loc=6.5, size=15)
>>> stats = ttest(x, y, correction='auto')