pingouin.cronbach_alpha

pingouin.cronbach_alpha(data=None, items=None, scores=None, subject=None, remove_na=False, ci=0.95)

Cronbach’s alpha reliability measure.
Parameters

data : pandas dataframe
    Wide or long-format dataframe.
items : str
    Column in data with the item names (long-format only).
scores : str
    Column in data with the scores (long-format only).
subject : str
    Column in data with the subject identifier (long-format only).
remove_na : bool
    If True, remove the entire rows that contain missing values (= listwise deletion). If False, only pairwise missing values are removed when computing the covariance matrix. For more details, please refer to the pandas.DataFrame.cov() method.
ci : float
    Confidence level of the interval (0.95 = 95%).
Returns

alpha : float
    Cronbach’s alpha
Notes
This function works with both wide and long-format dataframes. If you pass a long-format dataframe, you must also pass the items, scores and subject columns (in which case the data will be converted into wide format using the pandas.DataFrame.pivot() method).

Internal consistency is usually measured with Cronbach’s alpha, a statistic calculated from the pairwise correlations between items. Alpha ranges between negative infinity and one. It will be negative whenever there is greater within-subject variability than between-subject variability.
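The long-to-wide conversion mentioned above can be sketched with plain pandas. This is only an illustration on made-up data: the column names Subj, Items and Scores and the values are assumptions, and the function's internal handling may differ in detail.

```python
import pandas as pd

# Hypothetical long-format data: one row per (subject, item) pair.
long_df = pd.DataFrame({
    "Subj": [1, 1, 2, 2, 3, 3],
    "Items": ["Q1", "Q2", "Q1", "Q2", "Q1", "Q2"],
    "Scores": [4, 5, 2, 3, 5, 5],
})

# Reshape to wide format: one row per subject, one column per item.
wide_df = long_df.pivot(index="Subj", columns="Items", values="Scores")
print(wide_df)
```

Each (subject, item) pair must appear only once, otherwise pivot() raises an error because the reshape would be ambiguous.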
Cronbach’s \(\alpha\) is defined as
\[\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k}\sigma_{y_{i}}^{2}}{\sigma_{x}^{2}}\right)\]
where \(k\) refers to the number of items, \(\sigma_{x}^{2}\) is the variance of the observed total scores, and \(\sigma_{y_{i}}^{2}\) the variance of component \(i\) for the current sample of subjects.
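As a minimal sketch of this definition, the following computes alpha directly from the item variances and the total-score variance. The helper name alpha_from_variances and the toy scores are hypothetical, sample variances (ddof=1) are assumed, and the data must be complete (no missing values).

```python
import numpy as np

def alpha_from_variances(scores):
    """Cronbach's alpha for a complete subjects x items array (hypothetical helper)."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]                          # number of items
    item_vars = scores.var(axis=0, ddof=1)       # variance of each item
    total_var = scores.sum(axis=1).var(ddof=1)   # variance of the total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Toy data: 4 subjects (rows) x 3 items (columns).
toy = [[1, 2, 3],
       [2, 3, 3],
       [3, 3, 4],
       [4, 5, 5]]
print(alpha_from_variances(toy))
```

For these toy scores the item variances sum to 25/6 and the total-score variance is 35/3, so alpha works out to 27/28.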
Another formula for Cronbach’s \(\alpha\) is
\[\alpha = \frac{k \times \bar c}{\bar v + (k - 1) \times \bar c}\]
where \(\bar c\) refers to the average of all covariances between items and \(\bar v\) to the average variance of each item.
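The covariance-based form can be sketched from the item covariance matrix: the diagonal holds the item variances and the off-diagonal entries the covariances between items. The helper name alpha_from_covariances and the toy scores are hypothetical, and complete data are assumed.

```python
import numpy as np

def alpha_from_covariances(scores):
    """Cronbach's alpha from average variance and covariance (hypothetical helper)."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]                      # number of items
    cov = np.cov(scores, rowvar=False)       # k x k sample covariance matrix
    v_bar = np.trace(cov) / k                # average item variance
    # Average covariance between distinct items (off-diagonal entries only).
    c_bar = (cov.sum() - np.trace(cov)) / (k * (k - 1))
    return (k * c_bar) / (v_bar + (k - 1) * c_bar)

# Toy data: 4 subjects (rows) x 3 items (columns).
toy = [[1, 2, 3],
       [2, 3, 3],
       [3, 3, 4],
       [4, 5, 5]]
print(alpha_from_covariances(toy))
```

Because the sum of all entries of the covariance matrix equals the variance of the total scores, this form is algebraically equivalent to the variance-based definition.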
95% confidence intervals are calculated using Feldt’s method:
\[ \begin{align}\begin{aligned}c_L = 1 - (1 - \alpha) \cdot F_{(0.025, n-1, (n-1)(k-1))}\\c_U = 1 - (1 - \alpha) \cdot F_{(0.975, n-1, (n-1)(k-1))}\end{aligned}\end{align} \]
where \(n\) is the number of subjects and \(k\) the number of items.
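A minimal sketch of these bounds using scipy.stats.f is shown below. It assumes the first subscript of \(F\) denotes an upper-tail probability (so the larger critical value yields the lower bound), and the helper name feldt_ci and the input values are hypothetical.

```python
from scipy.stats import f

def feldt_ci(alpha, n, k, ci=0.95):
    """Feldt-type confidence bounds for Cronbach's alpha (hypothetical helper)."""
    tail = (1 - ci) / 2          # 0.025 for a 95% interval
    df1 = n - 1
    df2 = (n - 1) * (k - 1)
    # f.isf(p, df1, df2) returns the value exceeded with probability p.
    lower = 1 - (1 - alpha) * f.isf(tail, df1, df2)
    upper = 1 - (1 - alpha) * f.isf(1 - tail, df1, df2)
    return lower, upper

# Made-up inputs: alpha = 0.7 observed on 10 subjects and 5 items.
lower, upper = feldt_ci(0.7, n=10, k=5)
print(lower, upper)
```

The interval is not symmetric around alpha because the F distribution is skewed.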
Results have been tested against the R package psych.
References
Feldt, Leonard S., Woodruff, David J., & Salih, Fathi A. (1987). Statistical inference for coefficient alpha. Applied Psychological Measurement, 11(1):93-103.
Examples
Binary wide-format dataframe (with missing values)

>>> import pingouin as pg
>>> data = pg.read_dataset('cronbach_wide_missing')
>>> # In R: psych::alpha(data, use="pairwise")
>>> pg.cronbach_alpha(data=data)
(0.732661, array([0.435, 0.909]))

After listwise deletion of missing values (remove the entire rows)

>>> # In R: psych::alpha(data, use="complete.obs")
>>> pg.cronbach_alpha(data=data, remove_na=True)
(0.801695, array([0.581, 0.933]))

After imputing the missing values with the median of each column

>>> pg.cronbach_alpha(data=data.fillna(data.median()))
(0.738019, array([0.447, 0.911]))

Likert-type long-format dataframe

>>> data = pg.read_dataset('cronbach_alpha')
>>> pg.cronbach_alpha(data=data, items='Items', scores='Scores',
...                   subject='Subj')
(0.591719, array([0.195, 0.84 ]))