pingouin.normality

pingouin.normality(*args, alpha=0.05)
Shapiro-Wilk univariate normality test.
 Parameters
 sample1, sample2, … : array_like
Arrays of sample data. May be of different lengths.
 alpha : float
Significance level (default 0.05).
 Returns
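 normal : boolean
True if the sample is consistent with a normal distribution, i.e. the null hypothesis is not rejected at the chosen alpha level. One value per sample when several samples are passed.
 p : float
P-value. One value per sample when several samples are passed.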
See also
homoscedasticity
Test equality of variance.
sphericity
Mauchly’s test for sphericity.
Notes
The Shapiro-Wilk test calculates a \(W\) statistic that tests whether a random sample \(x_1, x_2, ..., x_n\) comes from a normal distribution.
The \(W\) statistic is calculated as follows:
\[W = \frac{(\sum_{i=1}^n a_i x_{i})^2} {\sum_{i=1}^n (x_i - \overline{x})^2}\]where the \(x_i\) are the ordered sample values (in ascending order) and the \(a_i\) are constants generated from the means, variances and covariances of the order statistics of a sample of size \(n\) from a standard normal distribution. Specifically:
\[(a_1, ..., a_n) = \frac{m^T V^{-1}}{(m^T V^{-1} V^{-1} m)^{1/2}}\]where \(m = (m_1, ..., m_n)^T\) is the vector of expected values of the order statistics of independent and identically distributed random variables sampled from the standard normal distribution, and \(V\) is the covariance matrix of those order statistics.
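The exact weights \(a_i\) require \(m\) and \(V\), which are not available in closed form. As a rough illustration of the structure of \(W\) only, the sketch below swaps in a simpler technique: it sets \(V\) to the identity matrix (the Shapiro-Francia form, so that \(a = m / (m^T m)^{1/2}\)) and uses Blom's approximation for \(m_i\). The helper name `approx_w` is hypothetical, and this is not the algorithm used by this function or by `scipy.stats.shapiro`.

```python
import numpy as np
from scipy import stats


def approx_w(x):
    """Hypothetical helper: approximate W with V replaced by the identity."""
    x = np.sort(np.asarray(x, dtype=float))  # ordered sample values
    n = x.size
    i = np.arange(1, n + 1)
    # Blom's approximation to the expected normal order statistics m_i
    m = stats.norm.ppf((i - 0.375) / (n + 0.25))
    # With V = I the weights reduce to a = m / sqrt(m^T m)
    a = m / np.sqrt(m @ m)
    # Same ratio as in the definition of W above
    return (a @ x) ** 2 / np.sum((x - x.mean()) ** 2)


np.random.seed(123)
x = np.random.normal(size=100)
w_approx = approx_w(x)
w_scipy, _ = stats.shapiro(x)  # reference value of W from SciPy
print(w_approx, w_scipy)
```

For a sample drawn from a normal distribution both values should be close to 1, and the approximation is expected to track the SciPy statistic closely without matching it exactly.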
The null hypothesis of this test is that the population is normally distributed. Thus, if the p-value is less than the chosen alpha level (typically set at 0.05), then the null hypothesis is rejected and there is evidence that the data tested are not normally distributed.
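The decision rule can be sketched directly on top of `scipy.stats.shapiro`, which returns the \(W\) statistic and the p-value. The helper `shapiro_decision` is a hypothetical name used for illustration, and the rule "normal if p > alpha" is an assumption that follows the description above rather than a copy of this function's implementation.

```python
import numpy as np
from scipy import stats


def shapiro_decision(x, alpha=0.05):
    """Hypothetical helper: return (normal, p) for a single sample."""
    w, p = stats.shapiro(x)
    # Reject the null hypothesis of normality when p falls below alpha
    return bool(p > alpha), float(p)


np.random.seed(123)
x = np.random.normal(size=100)       # drawn from a normal distribution
y = np.random.exponential(size=100)  # strongly right-skewed sample

normal_x, p_x = shapiro_decision(x)
normal_y, p_y = shapiro_decision(y)
print(normal_x, round(p_x, 3))
print(normal_y, round(p_y, 3))
```

The skewed sample should be rejected at alpha = 0.05, whereas a sample that really is normal is still rejected about 5% of the time by construction.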
The result of the Shapiro-Wilk test should be interpreted with caution in the case of large sample sizes. Indeed, quoting from Wikipedia:
“Like most statistical significance tests, if the sample size is sufficiently large this test may detect even trivial departures from the null hypothesis (i.e., although there may be some statistically significant effect, it may be too small to be of any practical significance); thus, additional investigation of the effect size is typically advisable, e.g., a Q–Q plot in this case.”
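One way to follow up with the Q–Q check mentioned in the quote is `scipy.stats.probplot`, which returns the theoretical quantiles, the ordered sample values and a least-squares fit. The sketch below is only an illustration; the variable names and the sample size of 2000 are arbitrary choices, not part of this function.

```python
import numpy as np
from scipy import stats

np.random.seed(123)
x = np.random.normal(size=2000)

# osm: theoretical normal quantiles, osr: ordered sample values
(osm, osr), (slope, intercept, r) = stats.probplot(x, dist="norm")

# Largest vertical gap between the ordered sample and the fitted line
max_dev = np.max(np.abs(osr - (slope * osm + intercept)))
print(round(r, 4), round(max_dev, 3))
```

A correlation `r` close to 1 and small deviations from the fitted line suggest that any departure from normality is minor in practice, even if the p-value is small. Passing `plot=` to `probplot` (for example a matplotlib axes object) draws the same points as a figure.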
References
 1
Shapiro, S. S., & Wilk, M. B. (1965). An analysis of variance test for normality (complete samples). Biometrika, 52(3/4), 591-611.
 2
https://www.itl.nist.gov/div898/handbook/prc/section2/prc213.htm
 3
Examples
Test the normality of one array.
>>> import numpy as np
>>> from pingouin import normality
>>> np.random.seed(123)
>>> x = np.random.normal(size=100)
>>> normal, p = normality(x, alpha=.05)
>>> print(normal, p)
True 0.275
Test the normality of two arrays.
>>> import numpy as np
>>> from pingouin import normality
>>> np.random.seed(123)
>>> x = np.random.normal(size=100)
>>> y = np.random.rand(100)
>>> normal, p = normality(x, y, alpha=.05)
>>> print(normal, p)
[ True False] [0.275 0.001]