pingouin.ttest

pingouin.ttest(x, y, paired=False, tail='two-sided', correction='auto', r=0.707, confidence=0.95)
T-test.
 Parameters
 x : array_like
First set of observations.
 y : array_like or float
Second set of observations. If y is a single value, a one-sample T-test is computed against that value (= “mu” in the t.test R function).
 paired : boolean
Specify whether the two observations are related (i.e. repeated measures) or independent.
 tail : string
Specify whether the alternative hypothesis is ‘two-sided’ or ‘one-sided’. Can also be ‘greater’ or ‘less’ to specify the direction of the test. ‘greater’ tests the alternative that x has a larger mean than y. If tail is ‘one-sided’, Pingouin will automatically infer the one-sided alternative hypothesis based on the test statistic.
 correction : string or boolean
For unpaired two-sample T-tests, specify whether or not to correct for unequal variances using Welch’s separate-variances T-test. If ‘auto’, the Welch T-test is used automatically when the sample sizes are unequal, as recommended by Zimmerman (2004).
 r : float
Cauchy scale factor for computing the Bayes Factor. Smaller values of r (e.g. 0.5) may be appropriate when small effect sizes are expected a priori; larger values of r are appropriate when large effect sizes are expected (Rouder et al. 2009). The default is 0.707 (= \(\sqrt{2} / 2\)).
 confidence : float
Confidence level for the confidence intervals (0.95 = 95%).
New in version 0.3.9.
 Returns
 stats : pandas.DataFrame
'T' : T-value
'pval' : p-value
'dof' : degrees of freedom
'cohen-d' : Cohen d effect size
'CI95%' : confidence intervals of the difference in means
'power' : achieved power of the test (= 1 - type II error)
'BF10' : Bayes Factor of the alternative hypothesis
Notes
Missing values are automatically removed from the data. If x and y are paired, the entire row is removed (= listwise deletion).
The T-value for unpaired samples is defined as:
\[t = \frac{\overline{x} - \overline{y}} {\sqrt{\frac{s^{2}_{x}}{n_{x}} + \frac{s^{2}_{y}}{n_{y}}}}\]
where \(\overline{x}\) and \(\overline{y}\) are the sample means, \(n_{x}\) and \(n_{y}\) are the sample sizes, and \(s^{2}_{x}\) and \(s^{2}_{y}\) are the sample variances. The degrees of freedom \(v\) are \(n_x + n_y - 2\) when the sample sizes are equal. When the sample sizes are unequal or when correction=True, the Welch–Satterthwaite equation is used to approximate the adjusted degrees of freedom:
\[v = \frac{(\frac{s^{2}_{x}}{n_{x}} + \frac{s^{2}_{y}}{n_{y}})^{2}} {\frac{(\frac{s^{2}_{x}}{n_{x}})^{2}}{(n_{x} - 1)} + \frac{(\frac{s^{2}_{y}}{n_{y}})^{2}}{(n_{y} - 1)}}\]
The p-value is then calculated using a T distribution with \(v\) degrees of freedom.
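The two formulas above can be sketched in pure Python as follows. This is only an illustration of the equations, not Pingouin's implementation: the helper name `unpaired_t_and_dof` is hypothetical, and the handling of `correction` is an assumption based on the description in this section.

```python
from statistics import mean, variance


def unpaired_t_and_dof(x, y, correction="auto"):
    """Return (t, dof) for two independent samples (hypothetical helper)."""
    nx, ny = len(x), len(y)
    # Squared standard errors s^2 / n, using sample variances (n - 1 denominator).
    vx, vy = variance(x) / nx, variance(y) / ny
    t = (mean(x) - mean(y)) / (vx + vy) ** 0.5

    # Assumption: 'auto' applies the Welch correction only when sizes differ.
    if correction == "auto":
        use_welch = nx != ny
    else:
        use_welch = bool(correction)

    if use_welch:
        # Welch-Satterthwaite approximation of the degrees of freedom.
        dof = (vx + vy) ** 2 / (vx ** 2 / (nx - 1) + vy ** 2 / (ny - 1))
    elif nx == ny:
        dof = nx + ny - 2
    else:
        # Unequal sizes without correction need a pooled-variance T-value,
        # which this sketch does not cover.
        raise NotImplementedError("pooled-variance case not covered")
    return t, dof
```

With equal sample sizes and no correction, the degrees of freedom are simply \(n_x + n_y - 2\); with the correction, the result is generally not an integer.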
The T-value for paired samples is defined by:
\[t = \frac{\overline{x}_d}{s_{\overline{x}}}\]
where
\[s_{\overline{x}} = \frac{s_d}{\sqrt n}\]
where \(\overline{x}_d\) is the sample mean of the differences between the two paired samples, \(n\) is the number of observations (sample size), \(s_d\) is the sample standard deviation of the differences and \(s_{\overline{x}}\) is the estimated standard error of the mean of the differences. The p-value is then calculated using a T-distribution with \(n - 1\) degrees of freedom.
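As a minimal sketch of the paired formula, the snippet below computes the T-value and degrees of freedom by hand, using the pre/post values from the Examples section. The helper name `paired_t_and_dof` is hypothetical and not part of Pingouin.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_and_dof(x, y):
    """Return (t, dof) for two paired samples (hypothetical helper)."""
    if len(x) != len(y):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    se = stdev(diffs) / sqrt(n)  # standard error of the mean difference
    return mean(diffs) / se, n - 1


pre = [5.5, 2.4, 6.8, 9.6, 4.2]
post = [6.4, 3.4, 6.4, 11.0, 4.8]
t, dof = paired_t_and_dof(pre, post)
print(round(t, 2), dof)  # -2.31 4
```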
The scaled Jeffreys-Zellner-Siow (JZS) Bayes Factor is approximated using the pingouin.bayesfactor_ttest() function.
Results have been tested against JASP and the t.test R function.
References
https://www.itl.nist.gov/div898/handbook/eda/section3/eda353.htm
Delacre, M., Lakens, D., & Leys, C. (2017). Why psychologists should by default use Welch’s t-test instead of Student’s t-test. International Review of Social Psychology, 30(1).
Zimmerman, D. W. (2004). A note on preliminary tests of equality of variances. British Journal of Mathematical and Statistical Psychology, 57(1), 173–181.
Rouder, J. N., Speckman, P. L., Sun, D., Morey, R. D., & Iverson, G. (2009). Bayesian t tests for accepting and rejecting the null hypothesis. Psychonomic Bulletin & Review, 16(2), 225–237. https://doi.org/10.3758/PBR.16.2.225
Examples
One-sample T-test.
>>> from pingouin import ttest
>>> x = [5.5, 2.4, 6.8, 9.6, 4.2]
>>> ttest(x, 4).round(2)
          T  dof       tail  pval         CI95%  cohen-d   BF10  power
T-test  1.4    4  two-sided  0.23  [2.32, 9.08]     0.62  0.766   0.19
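The T-value, degrees of freedom and effect size in this one-sample example can be checked by hand with the standard library. This is a sketch under the assumption that the one-sample Cohen d is the mean difference divided by the sample standard deviation; it does not reproduce the p-value, confidence interval, power or Bayes Factor.

```python
from math import sqrt
from statistics import mean, stdev

x = [5.5, 2.4, 6.8, 9.6, 4.2]
mu = 4  # value the sample mean is tested against

n = len(x)
se = stdev(x) / sqrt(n)  # standard error of the mean
t = (mean(x) - mu) / se
dof = n - 1
# Assumption: one-sample Cohen d = (mean - mu) / sample standard deviation.
cohen_d = (mean(x) - mu) / stdev(x)

print(round(t, 2), dof, round(cohen_d, 2))  # 1.4 4 0.62
```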
Paired T-test.
Note that in the example below, since tail='one-sided', Pingouin will automatically infer the alternative hypothesis based on the T-value. Here the T-value is negative, so the tail is set to ‘less’,
>>> pre = [5.5, 2.4, 6.8, 9.6, 4.2]
>>> post = [6.4, 3.4, 6.4, 11., 4.8]
>>> ttest(pre, post, paired=True, tail='one-sided').round(2)
           T  dof  tail  pval          CI95%  cohen-d   BF10  power
T-test -2.31    4  less  0.04  [-inf, -0.05]     0.25  3.122   0.12
which is indeed equivalent to directly testing that x has a smaller mean than y (tail = 'less')
>>> ttest(pre, post, paired=True, tail='less').round(2)
           T  dof  tail  pval          CI95%  cohen-d   BF10  power
T-test -2.31    4  less  0.04  [-inf, -0.05]     0.25  3.122   0.12
Now testing the opposite alternative hypothesis (tail = 'greater')
>>> ttest(pre, post, paired=True, tail='greater').round(2)
           T  dof     tail  pval         CI95%  cohen-d  BF10  power
T-test -2.31    4  greater  0.96  [-1.35, inf]     0.25  0.32   0.02
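The way a 'one-sided' tail is resolved from the sign of the T-value can be sketched as below. The function name `resolve_tail` is hypothetical, and treating a T-value of exactly zero as 'greater' is an assumption, not something this page states.

```python
def resolve_tail(tail, t_value):
    """Map 'one-sided' to an explicit direction from the sign of the T-value."""
    if tail == "one-sided":
        # Assumption: a T-value of exactly zero is treated as 'greater'.
        return "less" if t_value < 0 else "greater"
    # Explicit directions and 'two-sided' are passed through unchanged.
    return tail


print(resolve_tail("one-sided", -2.31))  # less
print(resolve_tail("greater", -2.31))    # greater
```

The two one-sided p-values reported in the examples above (0.04 for 'less' and 0.96 for 'greater') sum to one, as expected for opposite alternatives tested on the same T-value.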
Paired T-test with missing values.
>>> import numpy as np
>>> pre = [5.5, 2.4, np.nan, 9.6, 4.2]
>>> post = [6.4, 3.4, 6.4, 11., 4.8]
>>> stats = ttest(pre, post, paired=True)
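The listwise deletion applied to paired data can be illustrated with a small standard-library sketch. The helper `drop_incomplete_pairs` is hypothetical and only shows which observations remain; it is not Pingouin's own missing-value routine.

```python
from math import isnan


def drop_incomplete_pairs(x, y):
    """Keep only the pairs where both values are present (listwise deletion)."""
    kept = [(a, b) for a, b in zip(x, y) if not (isnan(a) or isnan(b))]
    xs = [a for a, _ in kept]
    ys = [b for _, b in kept]
    return xs, ys


pre = [5.5, 2.4, float("nan"), 9.6, 4.2]
post = [6.4, 3.4, 6.4, 11.0, 4.8]
pre_clean, post_clean = drop_incomplete_pairs(pre, post)
print(pre_clean)   # [5.5, 2.4, 9.6, 4.2]
print(post_clean)  # [6.4, 3.4, 11.0, 4.8]
```

The third pair is dropped entirely, even though only the `pre` value is missing, so the test runs on four pairs.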
Independent two-sample T-test (equal sample size).
>>> np.random.seed(123)
>>> x = np.random.normal(loc=7, size=20)
>>> y = np.random.normal(loc=4, size=20)
>>> stats = ttest(x, y, correction='auto')
Independent two-sample T-test (unequal sample size).
>>> np.random.seed(123)
>>> x = np.random.normal(loc=7, size=20)
>>> y = np.random.normal(loc=6.5, size=15)
>>> stats = ttest(x, y, correction='auto')