pingouin.partial_corr
- pingouin.partial_corr(data=None, x=None, y=None, covar=None, x_covar=None, y_covar=None, alternative='two-sided', method='pearson')
 Partial and semi-partial correlation.
- Parameters:
 - data : pandas.DataFrame
   Pandas DataFrame. Note that this function can also be used directly as a pandas.DataFrame method, in which case this argument is no longer needed.
 - x, y : string
   x and y. Must be names of columns in data.
 - covar : string or list
   Covariate(s). Must be names of columns in data. Use a list if there are two or more covariates.
 - x_covar : string or list
   Covariate(s) for the x variable. This is used to compute semi-partial correlation (i.e. the effect of x_covar is removed from x but not from y). Only one of covar, x_covar and y_covar can be specified.
 - y_covar : string or list
   Covariate(s) for the y variable. This is used to compute semi-partial correlation (i.e. the effect of y_covar is removed from y but not from x). Only one of covar, x_covar and y_covar can be specified.
 - alternative : string
   Defines the alternative hypothesis, or tail of the partial correlation. Must be one of “two-sided” (default), “greater” or “less”. Both “greater” and “less” return a one-sided p-value. “greater” tests against the alternative hypothesis that the partial correlation is positive (greater than zero), “less” tests against the alternative hypothesis that the partial correlation is negative (less than zero).
 - method : string
   Correlation type:
   'pearson': Pearson \(r\) product-moment correlation
   'spearman': Spearman \(\rho\) rank-order correlation
- Returns:
 - stats : pandas.DataFrame
   'n': Sample size (after removal of missing values)
   'r': Partial correlation coefficient
   'CI95%': 95% parametric confidence intervals around \(r\)
   'p-val': p-value
 
See also
Notes
Partial correlation [1] measures the degree of association between x and y, after removing the effect of one or more controlling variables (covar, or \(Z\)). Practically, this is achieved by calculating the correlation coefficient between the residuals of two linear regressions:
\[x \sim Z, \quad y \sim Z\]
Like the correlation coefficient, the partial correlation coefficient takes on a value in the range from -1 to 1, where 1 indicates a perfect positive association and -1 a perfect negative one.
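To illustrate the regression-based definition, the following sketch computes a Pearson partial correlation from the residuals of two least-squares fits, using NumPy only. The helper names (ols_residuals, partial_corr_residuals) and the synthetic data are assumptions made for this example. It is not Pingouin's implementation.

```python
import numpy as np


def ols_residuals(target, covariates):
    """Residuals of a least-squares fit of target on covariates plus an intercept."""
    design = np.column_stack([np.ones(len(target)), covariates])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return target - design @ coef


def partial_corr_residuals(x, y, z):
    """Pearson correlation between the residuals of x ~ Z and y ~ Z."""
    return np.corrcoef(ols_residuals(x, z), ols_residuals(y, z))[0, 1]


# Synthetic data (illustrative only): x and y both depend on the covariates z.
rng = np.random.default_rng(42)
z = rng.normal(size=(200, 2))
x = z @ np.array([1.0, -0.5]) + rng.normal(size=200)
y = z @ np.array([0.8, 0.3]) + 0.5 * x + rng.normal(size=200)

r_raw = np.corrcoef(x, y)[0, 1]
r_partial = partial_corr_residuals(x, y, z)
print(f"raw r = {r_raw:.3f}, partial r = {r_partial:.3f}")
```

With a single covariate, the result agrees with the classical first-order formula \((r_{xy} - r_{xz} r_{yz}) / \sqrt{(1 - r_{xz}^2)(1 - r_{yz}^2)}\).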
The semipartial correlation is similar to the partial correlation, with the exception that the set of controlling variables is only removed for either x or y, but not both.
Pingouin uses the method described in [2] to calculate the (semi)partial correlation coefficients and associated p-values. This method is based on the inverse covariance matrix and is significantly faster than the traditional regression-based method. Results have been tested against the ppcor R package.
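The sketch below shows how an inverse-covariance computation can reproduce the regression-based partial correlation, and how a semi-partial correlation differs. It uses NumPy only; the function names and the synthetic data are assumptions for illustration, and the code is a simplified stand-in rather than the exact procedure of [2] or of Pingouin.

```python
import numpy as np


def partial_corr_precision(x, y, z):
    """Partial correlation of x and y given z, from the inverse covariance matrix."""
    data = np.column_stack([x, y, z])
    precision = np.linalg.inv(np.cov(data, rowvar=False))
    return -precision[0, 1] / np.sqrt(precision[0, 0] * precision[1, 1])


def ols_residuals(target, covariates):
    """Residuals of a least-squares fit of target on covariates plus an intercept."""
    design = np.column_stack([np.ones(len(target)), covariates])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return target - design @ coef


def semi_partial_corr(x, y, x_covar):
    """Correlation between raw y and x after removing x_covar from x only."""
    return np.corrcoef(ols_residuals(x, x_covar), y)[0, 1]


# Synthetic data (illustrative only).
rng = np.random.default_rng(0)
z = rng.normal(size=(300, 3))
x = z @ np.array([0.7, -0.2, 0.4]) + rng.normal(size=300)
y = z @ np.array([0.5, 0.1, -0.3]) + 0.6 * x + rng.normal(size=300)

r_precision = partial_corr_precision(x, y, z)
r_residual = np.corrcoef(ols_residuals(x, z), ols_residuals(y, z))[0, 1]
r_semi = semi_partial_corr(x, y, z)
print(f"precision: {r_precision:.4f}, residuals: {r_residual:.4f}, semi-partial: {r_semi:.4f}")
```

The two partial estimates agree up to floating-point error. The semi-partial coefficient is never larger in magnitude than the partial one, because y keeps the variance explained by the covariates.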
Important
Rows with missing values are automatically removed from data.
References
Examples
Partial correlation with one covariate
>>> import pingouin as pg
>>> df = pg.read_dataset('partial_corr')
>>> pg.partial_corr(data=df, x='x', y='y', covar='cv1').round(3)
          n      r         CI95%  p-val
pearson  30  0.568  [0.25, 0.77]  0.001
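The confidence interval in this output is consistent with a Fisher z-transform interval whose standard error is \(1 / \sqrt{n - k - 3}\), where \(k\) is the number of covariates. The following standard-library sketch rests on that assumption. The function name partial_corr_ci is hypothetical, and the page does not state that Pingouin computes the interval this way.

```python
from math import atanh, sqrt, tanh
from statistics import NormalDist


def partial_corr_ci(r, n, k, confidence=0.95):
    """Two-sided Fisher z interval for a partial correlation with k covariates.

    Assumes a standard error of 1 / sqrt(n - k - 3) on the z scale.
    """
    z = atanh(r)
    se = 1.0 / sqrt(n - k - 3)
    crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return tanh(z - crit * se), tanh(z + crit * se)


# Values taken from the example above: r = 0.568, n = 30, one covariate.
lower, upper = partial_corr_ci(r=0.568, n=30, k=1)
print(round(lower, 2), round(upper, 2))
```

Because the input r is already rounded to three decimals, the result can only be compared with the reported interval to about two decimals.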
Spearman partial correlation with several covariates
>>> # Partial correlation of x and y controlling for cv1, cv2 and cv3
>>> pg.partial_corr(data=df, x='x', y='y', covar=['cv1', 'cv2', 'cv3'],
...                 method='spearman').round(3)
           n      r         CI95%  p-val
spearman  30  0.521  [0.18, 0.75]  0.005
Same but one-sided test
>>> pg.partial_corr(data=df, x='x', y='y', covar=['cv1', 'cv2', 'cv3'],
...                 alternative="greater", method='spearman').round(3)
           n      r        CI95%  p-val
spearman  30  0.521  [0.24, 1.0]  0.003

>>> pg.partial_corr(data=df, x='x', y='y', covar=['cv1', 'cv2', 'cv3'],
...                 alternative="less", method='spearman').round(3)
           n      r         CI95%  p-val
spearman  30  0.521  [-1.0, 0.72]  0.997
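The relationship between the two-sided and one-sided p-values can be sketched as follows. The code assumes the p-value comes from a Student t-test with \(n - k - 2\) degrees of freedom, where \(k\) is the number of covariates. That assumption is consistent with the outputs on this page but is not stated by it. The function name partial_corr_pvalue is hypothetical, and SciPy is required.

```python
from math import sqrt

from scipy import stats


def partial_corr_pvalue(r, n, k, alternative="two-sided"):
    """p-value of a partial correlation r with k covariates.

    Assumes a t-test with n - k - 2 degrees of freedom.
    """
    dof = n - k - 2
    t_stat = r * sqrt(dof / (1 - r**2))
    if alternative == "two-sided":
        return 2 * stats.t.sf(abs(t_stat), dof)
    if alternative == "greater":
        return stats.t.sf(t_stat, dof)  # upper tail: H1 is r > 0
    if alternative == "less":
        return stats.t.cdf(t_stat, dof)  # lower tail: H1 is r < 0
    raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")


# Values taken from the Spearman examples above: r = 0.521, n = 30, 3 covariates.
p_two = partial_corr_pvalue(0.521, 30, 3)
p_greater = partial_corr_pvalue(0.521, 30, 3, alternative="greater")
p_less = partial_corr_pvalue(0.521, 30, 3, alternative="less")
print(p_two, p_greater, p_less)
```

For a positive coefficient, the "greater" p-value is half the two-sided one, and the "less" p-value is its complement, which is why the last example reports a p-value close to 1.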
As a pandas method
>>> df.partial_corr(x='x', y='y', covar=['cv1'], method='spearman').round(3)
           n      r         CI95%  p-val
spearman  30  0.578  [0.27, 0.78]  0.001
Partial correlation matrix (returns only the correlation coefficients)
>>> df.pcorr().round(3)
         x      y    cv1    cv2    cv3
x    1.000  0.493 -0.095  0.130 -0.385
y    0.493  1.000 -0.007  0.104 -0.002
cv1 -0.095 -0.007  1.000 -0.241 -0.470
cv2  0.130  0.104 -0.241  1.000 -0.118
cv3 -0.385 -0.002 -0.470 -0.118  1.000
Semi-partial correlation on x
>>> pg.partial_corr(data=df, x='x', y='y', x_covar=['cv1', 'cv2', 'cv3']).round(3)
          n      r        CI95%  p-val
pearson  30  0.463  [0.1, 0.72]  0.015