pingouin.ttest

pingouin.ttest(x, y, paired=False, tail='two-sided', correction='auto', r=0.707)

T-test.

Parameters:
x : array_like

First set of observations.

y : array_like or float

Second set of observations. If y is a single value, a one-sample T-test is computed.

paired : boolean

Specify whether the two sets of observations are related (i.e. repeated measures) or independent.

tail : string

Specify whether to return a two-sided ('two-sided') or a one-sided ('one-sided') p-value.

correction : string or boolean

For unpaired two-sample T-tests, specify whether or not to correct for unequal variances using the Welch separate-variances T-test. If ‘auto’, the Welch T-test is used automatically when the sample sizes are unequal, as recommended by Zimmerman (2004).

r : float

Cauchy scale factor for computing the Bayes Factor. Smaller values of r (e.g. 0.5) may be appropriate when small effect sizes are expected a priori; larger values of r are appropriate when large effect sizes are expected (Rouder et al. 2009). The default is 0.707 (= sqrt(2) / 2).

Returns:
stats : pandas DataFrame

T-test summary

'T' : T-value
'p-val' : p-value
'dof' : degrees of freedom
'cohen-d' : Cohen d effect size
'power' : achieved power of the test ( = 1 - type II error)
'BF10' : Bayes Factor of the alternative hypothesis

See also

mwu
Non-parametric alternative to the independent T-test
wilcoxon
Non-parametric alternative to the paired T-test
anova
One-way and two-way ANOVA
rm_anova
One-way and two-way repeated measures ANOVA
compute_effsize
Effect sizes

Notes

Missing values are automatically removed from the data. If x and y are paired, the entire row is removed.
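
The pairwise removal described above can be pictured with a minimal sketch. This is an illustration only, not pingouin's actual implementation; the helper name drop_missing_pairs is hypothetical, and the data are taken from Example 3 below.

```python
import math


def drop_missing_pairs(x, y):
    """Keep only the pairs in which both values are present (not NaN).

    Hypothetical helper for illustration; not part of pingouin.
    """
    kept = [(a, b) for a, b in zip(x, y)
            if not (math.isnan(a) or math.isnan(b))]
    x_clean = [a for a, _ in kept]
    y_clean = [b for _, b in kept]
    return x_clean, y_clean


nan = float("nan")
pre = [5.5, 2.4, nan, 9.6, 4.2]
post = [6.4, 3.4, 6.4, 11.0, 4.8]

# The third pair is dropped entirely because `pre` is missing there.
pre_clean, post_clean = drop_missing_pairs(pre, post)
print(pre_clean)   # [5.5, 2.4, 9.6, 4.2]
print(post_clean)  # [6.4, 3.4, 11.0, 4.8]
```

Four complete pairs remain, which is consistent with the 3 degrees of freedom reported in Example 3.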

The two-sample T-test for unpaired data is defined as:

\[t = \dfrac{\overline{x} - \overline{y}} {\sqrt{\dfrac{s^{2}_{x}}{n_{x}} + \dfrac{s^{2}_{y}}{n_{y}}}}\]

where \(\overline{x}\) and \(\overline{y}\) are the sample means, \(n_{x}\) and \(n_{y}\) are the sample sizes, and \(s^{2}_{x}\) and \(s^{2}_{y}\) are the sample variances. The degrees of freedom \(v\) are \(n_x + n_y - 2\) when the sample sizes are equal. When the sample sizes are unequal or when correction=True, the Welch–Satterthwaite equation is used to approximate the adjusted degrees of freedom:

\[v = \dfrac{(\dfrac{s^{2}_{x}}{n_{x}} + \dfrac{s^{2}_{y}}{n_{y}})^{2}} {\dfrac{(\dfrac{s^{2}_{x}}{n_{x}})^{2}}{(n_{x}-1)} + \dfrac{(\dfrac{s^{2}_{y}}{n_{y}})^{2}}{(n_{y}-1)}}\]

The p-value is then calculated using a T distribution with \(v\) degrees of freedom.
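
As a rough check of the two formulas above, the statistic and the Welch-Satterthwaite degrees of freedom can be computed by hand with the standard library. This is a sketch under stated assumptions, not pingouin's code: the function name welch_t is hypothetical, and the two samples are borrowed from the examples below and treated as independent purely for illustration. The p-value step is left out because it requires a T-distribution CDF.

```python
import math
from statistics import mean, variance


def welch_t(x, y):
    """Return (t, v) for two independent samples.

    t is the unpaired T-value and v the Welch-Satterthwaite
    degrees of freedom. Hypothetical helper, for illustration only.
    """
    nx, ny = len(x), len(y)
    # Sample variances (n - 1 denominator) divided by the sample sizes.
    vx = variance(x) / nx
    vy = variance(y) / ny
    t = (mean(x) - mean(y)) / math.sqrt(vx + vy)
    v = (vx + vy) ** 2 / (vx ** 2 / (nx - 1) + vy ** 2 / (ny - 1))
    return t, v


x = [5.5, 2.4, 6.8, 9.6, 4.2]
y = [6.4, 3.4, 6.4, 11.0, 4.8]

t, v = welch_t(x, y)
print(round(t, 3), round(v, 2))
```

With equal sample sizes and similar variances, as here, the adjusted degrees of freedom land just below the uncorrected value of n_x + n_y - 2 = 8.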

The T-value for paired samples is defined by:

\[t = \dfrac{\overline{x}_{diff}}{s_{\overline{x}}}\]

where

\[s_{\overline{x}} = \dfrac{s_{diff}}{\sqrt n}\]

where \(\overline{x}_{diff}\) is the sample mean of the differences between the two paired samples, \(n\) is the number of observations (sample size), \(s_{diff}\) is the sample standard deviation of the differences and \(s_{\overline{x}}\) is the estimated standard error of the mean of the differences.

The p-value is then calculated using a T-distribution with \(n-1\) degrees of freedom.
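
The paired formula can be sketched the same way. Again, this is an illustration rather than pingouin's implementation; the name paired_t is hypothetical, and the data come from Example 2 below.

```python
import math
from statistics import mean, stdev


def paired_t(x, y):
    """Return (t, dof) for two paired samples of equal length.

    Hypothetical helper, for illustration only.
    """
    diff = [a - b for a, b in zip(x, y)]
    n = len(diff)
    # Estimated standard error of the mean of the differences.
    se = stdev(diff) / math.sqrt(n)
    return mean(diff) / se, n - 1


pre = [5.5, 2.4, 6.8, 9.6, 4.2]
post = [6.4, 3.4, 6.4, 11.0, 4.8]

t, dof = paired_t(pre, post)
print(round(t, 3), dof)  # -2.308 4
```

The result agrees with the T and dof values reported in Example 2.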

The scaled Jeffreys-Zellner-Siow (JZS) Bayes Factor is approximated using the pingouin.bayesfactor_ttest() function.

References

[1] https://www.itl.nist.gov/div898/handbook/eda/section3/eda353.htm
[2] Delacre, M., Lakens, D., & Leys, C. (2017). Why psychologists should by default use Welch’s t-test instead of Student’s t-test. International Review of Social Psychology, 30(1).
[3] Zimmerman, D. W. (2004). A note on preliminary tests of equality of variances. British Journal of Mathematical and Statistical Psychology, 57(1), 173-181.
[4] Rouder, J. N., Speckman, P. L., Sun, D., Morey, R. D., & Iverson, G. (2009). Bayesian t tests for accepting and rejecting the null hypothesis. Psychonomic Bulletin & Review, 16(2), 225-237. https://doi.org/10.3758/PBR.16.2.225

Examples

  1. One-sample T-test.

    >>> from pingouin import ttest
    >>> x = [5.5, 2.4, 6.8, 9.6, 4.2]
    >>> ttest(x, 4)
            T   p-val  dof  cohen-d  power   BF10
        1.397  0.2348    4    0.625  0.919  0.766
    
  2. Paired two-sample T-test (one-tailed).

    >>> from pingouin import ttest
    >>> pre = [5.5, 2.4, 6.8, 9.6, 4.2]
    >>> post = [6.4, 3.4, 6.4, 11., 4.8]
    >>> ttest(pre, post, paired=True, tail='one-sided')
             T  p-val  dof  cohen-d  power   BF10
        -2.308   0.04    4     0.25  0.121  3.122
    
  3. Paired two-sample T-test with missing values.

    >>> from pingouin import ttest
    >>> from numpy import nan
    >>> pre = [5.5, 2.4, nan, 9.6, 4.2]
    >>> post = [6.4, 3.4, 6.4, 11., 4.8]
    >>> ttest(pre, post, paired=True)
             T   p-val  dof  cohen-d  power    BF10
        -5.902  0.0097    3    0.306  0.065   7.169
    
  4. Independent two-sample T-test (equal sample size).

    >>> from pingouin import ttest
    >>> import numpy as np
    >>> np.random.seed(123)
    >>> x = np.random.normal(loc=7, size=20)
    >>> y = np.random.normal(loc=4, size=20)
    >>> ttest(x, y, correction='auto')
            T     p-val  dof  cohen-d  power   BF10
        9.106  4.30e-11   38     2.88    1.0  1.4e8
    
  5. Independent two-sample T-test (unequal sample size).

    >>> from pingouin import ttest
    >>> import numpy as np
    >>> np.random.seed(123)
    >>> x = np.random.normal(loc=7, size=20)
    >>> y = np.random.normal(loc=6.5, size=15)
    >>> ttest(x, y, correction='auto')
            T     p-val  dof   dof-corr  cohen-d  power   BF10
        2.327     0.027   33      30.75    0.792  0.614  2.454