pingouin.multicomp

pingouin.multicomp(pvals, alpha=0.05, method='holm')

P-values correction for multiple comparisons.
Parameters

pvals : array_like
    Uncorrected p-values.
alpha : float
    Significance level.
method : string
    Method used for testing and adjustment of p-values. Can be either the
    full name or initial letters. Available methods are:

    'bonf' : one-step Bonferroni correction
    'sidak' : one-step Sidak correction
    'holm' : step-down method using Bonferroni adjustments
    'fdr_bh' : Benjamini/Hochberg FDR correction
    'fdr_by' : Benjamini/Yekutieli FDR correction
    'none' : pass-through option (no correction applied)
Returns

reject : array, boolean
    True for hypotheses that can be rejected for the given alpha.
pvals_corrected : array
    P-values corrected for multiple testing.
Notes
This function is similar to the p.adjust R function.
The correction methods include the Bonferroni correction ('bonf'), in which the p-values are multiplied by the number of comparisons. Less conservative methods are also included, such as Sidak (1967) ('sidak'), Holm (1979) ('holm'), Benjamini & Hochberg (1995) ('fdr_bh'), and Benjamini & Yekutieli (2001) ('fdr_by').

The first three methods are designed to give strong control of the family-wise error rate. Note that Holm's method is usually preferred. The 'fdr_bh' and 'fdr_by' methods control the false discovery rate, i.e. the expected proportion of false discoveries amongst the rejected hypotheses. The false discovery rate is a less stringent condition than the family-wise error rate, so these methods are more powerful than the others.

The Bonferroni [1] adjusted p-values are defined as:
\[\widetilde{p}_{(i)} = n \cdot p_{(i)}\]

where \(n\) is the number of finite p-values (i.e. excluding NaN).
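As a concrete illustration, the one-step Bonferroni adjustment can be sketched in NumPy (a minimal sketch with a hypothetical helper name, not Pingouin's actual implementation):

```python
import numpy as np

def bonf_adjust(pvals):
    """One-step Bonferroni sketch (illustrative, not Pingouin's code):
    multiply each p-value by the number of finite comparisons, clip at 1."""
    pvals = np.asarray(pvals, dtype=float)
    n = np.isfinite(pvals).sum()  # NaNs are excluded from the count
    return np.clip(pvals * n, None, 1.0)

print(bonf_adjust([0.50, 0.003, 0.32, 0.054, 0.0003]))
```

With five comparisons, 0.003 becomes 0.015 and any product above 1 is clipped back to 1.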
The Sidak [2] adjusted p-values are defined as:

\[\widetilde{p}_{(i)} = 1 - (1 - p_{(i)})^{n}\]

The Holm [3] adjusted p-values are the running maximum of the sorted p-values, each scaled by the number of remaining comparisons \((n - j + 1)\) and capped at 1:
\[\widetilde{p}_{(i)} = \max_{j \leq i} \left\{ (n - j + 1)\, p_{(j)} \right\}_{1}\]

The Benjamini–Hochberg procedure (BH step-up procedure, [4]) controls the false discovery rate (FDR) at level \(\alpha\). It works as follows:
1. For a given \(\alpha\), find the largest \(k\) such that \(P_{(k)}\leq \frac {k}{n}\alpha.\)
2. Reject the null hypothesis for all \(H_{(i)}\) for \(i = 1, \ldots, k\).
The BH procedure is valid when the \(n\) tests are independent, and also in various scenarios of dependence, but is not universally valid.
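The Holm step-down adjustment and the BH step-up rule above can be sketched in NumPy. The helper names `holm_adjust` and `bh_reject` are assumptions for illustration, not Pingouin's internals; the BY variant described next uses the same step-up rule with \(\alpha\) divided by \(c(n)\):

```python
import numpy as np

def holm_adjust(pvals):
    """Holm step-down sketch: sort, scale p_(j) by (n - j + 1),
    take the running maximum, and clip at 1."""
    pvals = np.asarray(pvals, dtype=float)
    n = len(pvals)
    order = np.argsort(pvals)
    scaled = (n - np.arange(n)) * pvals[order]      # (n - j + 1) * p_(j)
    adjusted = np.clip(np.maximum.accumulate(scaled), None, 1.0)
    out = np.empty(n)
    out[order] = adjusted                           # back to original order
    return out

def bh_reject(pvals, alpha=0.05):
    """BH step-up sketch: find the largest k with P_(k) <= (k / n) * alpha
    and reject H_(1), ..., H_(k)."""
    pvals = np.asarray(pvals, dtype=float)
    n = len(pvals)
    order = np.argsort(pvals)
    below = pvals[order] <= np.arange(1, n + 1) / n * alpha
    reject = np.zeros(n, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])            # largest passing rank
        reject[order[:k + 1]] = True
    return reject
```

On the p-values used in the Examples section, `bh_reject` reproduces the 'fdr_bh' reject mask shown there.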
The Benjamini–Yekutieli procedure (BY, [5]) controls the FDR under arbitrary dependence assumptions. This refinement modifies the threshold and finds the largest \(k\) such that:
\[P_{(k)} \leq \frac{k}{n \cdot c(n)} \alpha\]

where \(c(n) = \sum_{i=1}^{n} \frac{1}{i}\) is the harmonic number.

References
[1] Bonferroni, C. E. (1935). Il calcolo delle assicurazioni su gruppi di teste. Studi in onore del professore Salvatore Ortu Carboni, 13–60.
[2] Šidák, Z. K. (1967). Rectangular confidence regions for the means of multivariate normal distributions. Journal of the American Statistical Association, 62(318), 626–633.

[3] Holm, S. (1979). A simple sequentially rejective multiple test procedure. Scandinavian Journal of Statistics, 6, 65–70.

[4] Benjamini, Y., and Hochberg, Y. (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society Series B, 57, 289–300.

[5] Benjamini, Y., and Yekutieli, D. (2001). The control of the false discovery rate in multiple testing under dependency. Annals of Statistics, 29, 1165–1188.
Examples
FDR correction of an array of p-values

>>> import pingouin as pg
>>> pvals = [.50, .003, .32, .054, .0003]
>>> reject, pvals_corr = pg.multicomp(pvals, method='fdr_bh')
>>> print(reject, pvals_corr)
[False True False False True] [0.5 0.0075 0.4 0.09 0.0015]
Holm correction with missing values

>>> import numpy as np
>>> pvals[2] = np.nan
>>> reject, pvals_corr = pg.multicomp(pvals, method='holm')
>>> print(reject, pvals_corr)
[False True False False True] [0.5 0.009 nan 0.108 0.0012]
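As the second example shows, NaNs are kept in place and excluded from the number of comparisons. That masking pattern can be sketched as follows (`adjust_with_nan` is a hypothetical helper, not part of Pingouin, taking any callable that maps finite p-values to adjusted ones):

```python
import numpy as np

def adjust_with_nan(adjust, pvals):
    """Apply an adjustment function to the finite p-values only,
    leaving NaNs untouched (mirrors the behaviour in the example above).
    `adjust` is hypothetical: any callable mapping p-values to adjusted ones."""
    pvals = np.asarray(pvals, dtype=float)
    mask = np.isfinite(pvals)
    out = np.full(pvals.shape, np.nan)
    out[mask] = adjust(pvals[mask])
    return out

# Bonferroni on [.5, .003, nan]: n = 2 finite values, the NaN is preserved
bonf = lambda p: np.clip(p * p.size, None, 1.0)
print(adjust_with_nan(bonf, [0.5, 0.003, np.nan]))
```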