pingouin.homoscedasticity

pingouin.homoscedasticity(*args, alpha=0.05)

Test equality of variance.

Parameters
sample1, sample2, … : array_like

Arrays of sample data. The arrays may differ in length.

alpha : float

Significance level. Defaults to 0.05.

Returns
equal_var : boolean

True if data have equal variance.

p : float

P-value.

See also

normality

Test the univariate normality of one or more array(s).

sphericity

Mauchly’s test for sphericity.

Notes

This function first tests whether the data are normally distributed using the Shapiro-Wilk test. If so, the homogeneity of variances is assessed using the Bartlett test. If the data are not normally distributed, the Levene (1960) test, which is less sensitive to departures from normality, is used instead.
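
The selection logic described above can be sketched roughly as follows. This is only an illustration, not pingouin's actual implementation: `homoscedasticity_sketch` is a hypothetical helper, the underlying tests are taken from SciPy (`scipy.stats.shapiro`, `scipy.stats.bartlett`, `scipy.stats.levene`), and it assumes that every sample must pass the Shapiro-Wilk test for the data to count as normal, which the text does not state explicitly.

```python
import numpy as np
from scipy import stats


def homoscedasticity_sketch(*samples, alpha=0.05):
    """Hypothetical helper illustrating the test-selection logic."""
    # Assumption: the data count as normal only if every sample
    # passes the Shapiro-Wilk test at the given alpha.
    all_normal = all(stats.shapiro(s)[1] > alpha for s in samples)
    if all_normal:
        method = "bartlett"
        _, p = stats.bartlett(*samples)
    else:
        method = "levene"
        # center="median" uses absolute deviations from the group medians.
        _, p = stats.levene(*samples, center="median")
    # Equal variances are retained when the p-value exceeds alpha.
    return bool(p > alpha), float(p), method


rng = np.random.default_rng(0)
x = rng.normal(loc=0, scale=1.0, size=200)
y = rng.normal(loc=0, scale=3.0, size=200)
equal_var, p, method = homoscedasticity_sketch(x, y)
print(equal_var, method)
```

Returning the name of the selected test is a choice made here for illustration only; the function documented on this page returns just `equal_var` and `p`.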

The Bartlett \(T\) statistic is defined as:

\[T = \frac{(N-k) \ln{s^{2}_{p}} - \sum_{i=1}^{k}(N_{i} - 1) \ln{s^{2}_{i}}}{1 + \frac{1}{3(k-1)}\left(\sum_{i=1}^{k}\frac{1}{N_{i} - 1} - \frac{1}{N-k}\right)}\]

where \(s_i^2\) is the variance of the \(i^{th}\) group, \(N\) is the total sample size, \(N_i\) is the sample size of the \(i^{th}\) group, \(k\) is the number of groups, and \(s_p^2\) is the pooled variance.

The pooled variance is a weighted average of the group variances and is defined as:

\[s^{2}_{p} = \sum_{i=1}^{k}(N_{i} - 1)s^{2}_{i}/(N-k)\]

The p-value is then computed using a chi-square distribution:

\[T \sim \chi^2(k-1)\]
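
To make the Bartlett formulas above concrete, here is a minimal NumPy sketch that computes the pooled variance, the \(T\) statistic, and the chi-square p-value, then compares them with `scipy.stats.bartlett`. The helper name `bartlett_manual` and the simulated data are assumptions made for illustration. Group variances use the unbiased estimator (`ddof=1`), which is what the \(N_i - 1\) weights imply.

```python
import numpy as np
from scipy import stats


def bartlett_manual(*samples):
    """Hypothetical helper: Bartlett's T statistic and its p-value."""
    k = len(samples)                                        # number of groups
    n_i = np.array([len(s) for s in samples], dtype=float)  # group sizes N_i
    var_i = np.array([np.var(s, ddof=1) for s in samples])  # group variances s_i^2
    n_total = n_i.sum()                                     # total sample size N

    # Pooled variance: weighted average of the group variances.
    pooled = np.sum((n_i - 1) * var_i) / (n_total - k)

    numer = (n_total - k) * np.log(pooled) - np.sum((n_i - 1) * np.log(var_i))
    denom = 1 + (np.sum(1 / (n_i - 1)) - 1 / (n_total - k)) / (3 * (k - 1))
    t_stat = numer / denom

    # Upper tail of a chi-square distribution with k - 1 degrees of freedom.
    p_value = stats.chi2.sf(t_stat, k - 1)
    return t_stat, p_value


rng = np.random.default_rng(42)
a = rng.normal(0, 1.0, size=50)
b = rng.normal(0, 1.5, size=60)
c = rng.normal(0, 1.0, size=40)

t_stat, p_value = bartlett_manual(a, b, c)
reference = stats.bartlett(a, b, c)
print(np.isclose(t_stat, reference.statistic), np.isclose(p_value, reference.pvalue))
```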

The Levene \(W\) statistic is defined as:

\[W = \frac{(N-k)} {(k-1)} \frac{\sum_{i=1}^{k}N_{i}(\overline{Z}_{i.}-\overline{Z})^{2} } {\sum_{i=1}^{k}\sum_{j=1}^{N_i}(Z_{ij}-\overline{Z}_{i.})^{2} }\]

where \(Z_{ij} = |Y_{ij} - \mathrm{median}(Y_{i.})|\), \(\overline{Z}_{i.}\) is the mean of \(Z_{ij}\) within group \(i\), and \(\overline{Z}\) is the grand mean of all \(Z_{ij}\).

The p-value is then computed using an F-distribution:

\[W \sim F(k-1, N-k)\]
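
The Levene statistic can be reproduced in the same spirit. The sketch below follows the definition of \(W\) given above, using absolute deviations from the group medians, and checks the result against `scipy.stats.levene` with `center="median"`. The helper name `levene_manual` and the simulated skewed data are assumptions made for illustration.

```python
import numpy as np
from scipy import stats


def levene_manual(*samples):
    """Hypothetical helper: median-centered Levene W statistic and p-value."""
    k = len(samples)                                        # number of groups
    n_i = np.array([len(s) for s in samples], dtype=float)  # group sizes N_i
    n_total = n_i.sum()                                     # total sample size N

    # Z_ij: absolute deviation of each observation from its group median.
    z = [np.abs(np.asarray(s, dtype=float) - np.median(s)) for s in samples]
    z_bar_i = np.array([z_group.mean() for z_group in z])   # group means of Z
    z_bar = np.concatenate(z).mean()                        # grand mean of Z

    between = np.sum(n_i * (z_bar_i - z_bar) ** 2)
    within = sum(np.sum((z_group - m) ** 2) for z_group, m in zip(z, z_bar_i))
    w_stat = (n_total - k) / (k - 1) * between / within

    # Upper tail of an F-distribution with (k - 1, N - k) degrees of freedom.
    p_value = stats.f.sf(w_stat, k - 1, n_total - k)
    return w_stat, p_value


rng = np.random.default_rng(7)
g1 = rng.exponential(1.0, size=40)
g2 = rng.exponential(2.0, size=55)
g3 = rng.exponential(1.0, size=35)

w_stat, p_value = levene_manual(g1, g2, g3)
reference = stats.levene(g1, g2, g3, center="median")
print(np.isclose(w_stat, reference.statistic), np.isclose(p_value, reference.pvalue))
```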

References

[1] Bartlett, M. S. (1937). Properties of sufficiency and statistical tests. Proc. R. Soc. Lond. A, 160(901), 268-282.

[2] Brown, M. B., & Forsythe, A. B. (1974). Robust tests for the equality of variances. Journal of the American Statistical Association, 69(346), 364-367.

[3] NIST/SEMATECH e-Handbook of Statistical Methods, http://www.itl.nist.gov/div898/handbook/

Examples

Test the homoscedasticity of two arrays.

>>> import numpy as np
>>> from pingouin import homoscedasticity
>>> np.random.seed(123)
>>> # Scale = standard deviation of the distribution.
>>> x = np.random.normal(loc=0, scale=1., size=100)
>>> y = np.random.normal(loc=0, scale=0.8, size=100)
>>> equal_var, p = homoscedasticity(x, y, alpha=.05)
>>> print(round(np.var(x), 3), round(np.var(y), 3), equal_var, p)
1.273 0.602 False 0.0