pingouin.rm_anova

pingouin.rm_anova(data=None, dv=None, within=None, subject=None, correction='auto', detailed=False, export_filename=None)

One-way and two-way repeated measures ANOVA.

Parameters
data : pandas DataFrame

DataFrame. Note that this function can also be used directly as a pandas.DataFrame method, in which case this argument is not needed. Both wide-format and long-format DataFrames are supported for one-way repeated measures ANOVA. However, data must be in long format for two-way repeated measures ANOVA.

dv : string

Name of the column containing the dependent variable (only required if data is in long format).

within : string or list of strings

Name of the column containing the within factor (only required if data is in long format). If within is a single string, a one-way repeated measures ANOVA is computed. If within is a list of two strings, a two-way repeated measures ANOVA is computed.

subject : string

Name of column containing the subject identifier (only required if data is in long format).

correction : string or boolean

If True, also return the Greenhouse-Geisser corrected p-value. If 'auto' (default), compute Mauchly's test of sphericity to determine whether the p-values need to be corrected (see pingouin.sphericity()).

detailed : boolean

If True, return a full ANOVA table.

export_filename : string

Filename (without extension) for the output file. If None, do not export the table. By default, the file is created in the current working directory of the Python console. To write it elsewhere, specify the filename with its full path.

Returns
aov : DataFrame

ANOVA summary

'Source' : Name of the within-group factor
'ddof1' : Degrees of freedom (numerator)
'ddof2' : Degrees of freedom (denominator)
'F' : F-value
'p-unc' : Uncorrected p-value
'np2' : Partial eta-square effect size
'eps' : Greenhouse-Geisser epsilon factor (= index of sphericity)
'p-GG-corr' : Greenhouse-Geisser corrected p-value
'W-spher' : Sphericity test statistic
'p-spher' : p-value of the sphericity test
'sphericity' : sphericity of the data (boolean)

See also

anova

One-way and two-way ANOVA

mixed_anova

Two-way mixed ANOVA

friedman

Non-parametric one-way repeated measures ANOVA

Notes

Data can be in wide or long format for one-way repeated measures ANOVA but must be in long format for two-way repeated measures ANOVA.
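
The difference between the two layouts can be illustrated with plain pandas. The sketch below assumes a wide-format table in which each row is one subject and each column is one condition; the column names (Before, During, After, Time, Score) are hypothetical and not part of any Pingouin dataset.

```python
import pandas as pd

# Hypothetical wide-format data: one row per subject, one column per condition.
wide = pd.DataFrame({
    "Before": [4.0, 5.0, 6.0],
    "During": [5.5, 6.0, 7.5],
    "After": [5.0, 5.5, 6.5],
})

# Reshape to long format: one row per (subject, condition) observation.
# The row index is used as the subject identifier.
long = (
    wide.rename_axis("Subject")
        .reset_index()
        .melt(id_vars="Subject", var_name="Time", value_name="Score")
)
print(long)
```

The resulting long-format table has one column each for the subject, the within factor and the dependent variable, which is the layout expected by the dv, within and subject arguments.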

In one-way repeated measures ANOVA, the total variance (sums of squares) is divided into three components:

\[SS_{total} = SS_{treatment} + (SS_{subjects} + SS_{error})\]

with

\[SS_{total} = \sum_i^r \sum_j^n (Y_{ij} - \overline{Y})^2\]
\[SS_{treatment} = \sum_i^r n_i(\overline{Y_i} - \overline{Y})^2\]
\[SS_{subjects} = r\sum_s^n (\overline{Y}_s - \overline{Y})^2\]
\[SS_{error} = SS_{total} - SS_{treatment} - SS_{subjects}\]

where \(i=1,...,r; j=1,...,n_i\), \(r\) is the number of conditions, \(n_i\) the number of observations for each condition, \(n\) the number of subjects (so that \(n_i = n\) for every condition once subjects with missing values are removed), \(\overline{Y}\) the grand mean of the data, \(\overline{Y_i}\) the mean of the \(i^{th}\) condition and \(\overline{Y}_s\) the mean of the \(s^{th}\) subject.

The F-statistic is then defined as:

\[F^* = \frac{MS_{treatment}}{MS_{error}} = \frac{\frac{SS_{treatment}} {r-1}}{\frac{SS_{error}}{(n - 1)(r - 1)}}\]

and the p-value can be calculated using an F-distribution with \(v_{treatment} = r - 1\) and \(v_{error} = (n - 1)(r - 1)\) degrees of freedom.

The effect size reported in Pingouin is the partial eta-square, which is equivalent to eta-square for one-way repeated measures ANOVA.

\[\eta_p^2 = \frac{SS_{treatment}}{SS_{treatment} + SS_{error}}\]
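
The formulas above can be checked by hand on a small table. The following is a minimal pure-Python sketch, not Pingouin's implementation: the scores are made up, and the layout (rows are subjects, columns are conditions, no missing values) is an assumption.

```python
# Hypothetical scores: rows are subjects, columns are conditions.
scores = [
    [1.0, 2.0, 3.0],
    [2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0],
]
n = len(scores)      # number of subjects
r = len(scores[0])   # number of conditions

grand_mean = sum(sum(row) for row in scores) / (n * r)
cond_means = [sum(row[i] for row in scores) / n for i in range(r)]
subj_means = [sum(row) / r for row in scores]

# Sums of squares, following the decomposition given above.
ss_total = sum((y - grand_mean) ** 2 for row in scores for y in row)
ss_treatment = n * sum((m - grand_mean) ** 2 for m in cond_means)
ss_subjects = r * sum((m - grand_mean) ** 2 for m in subj_means)
ss_error = ss_total - ss_treatment - ss_subjects

# Degrees of freedom, F-statistic and partial eta-square.
df_treatment = r - 1
df_error = (n - 1) * (r - 1)
f_value = (ss_treatment / df_treatment) / (ss_error / df_error)
np2 = ss_treatment / (ss_treatment + ss_error)

print(ss_total, ss_treatment, ss_subjects, ss_error)
print(f_value, np2)
```

If SciPy is installed, the uncorrected p-value for these quantities can be obtained with scipy.stats.f.sf(f_value, df_treatment, df_error).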

Results have been tested against R and JASP. Note, however, that if the dataset contains one or more other within-subject factors, the dependent variable is automatically collapsed to its mean across those factors (same behavior as the ezANOVA R package). As such, results can differ from those of JASP.
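
Collapsing to the mean can be pictured as a group-by over the subject and the analysed factor. The sketch below only illustrates the idea with pandas and is not taken from Pingouin's source; the column names (Subject, Time, Task, Score) are hypothetical.

```python
import pandas as pd

# Hypothetical long-format data with two within-subject factors:
# "Time" is the factor being analysed, "Task" is an additional factor
# that is not passed to the ANOVA.
df = pd.DataFrame({
    "Subject": [1, 1, 1, 1, 2, 2, 2, 2],
    "Time": ["pre", "pre", "post", "post"] * 2,
    "Task": ["A", "B", "A", "B"] * 2,
    "Score": [2.0, 4.0, 5.0, 7.0, 1.0, 3.0, 6.0, 8.0],
})

# Average the dependent variable over the ignored factor so that each
# subject contributes exactly one value per level of "Time".
collapsed = df.groupby(["Subject", "Time"], as_index=False)["Score"].mean()
print(collapsed)
```

After this step each subject has a single observation per condition, which is what a one-way repeated measures design requires.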

Similarly, in two-way repeated measures ANOVA, Pingouin uses the lower bound epsilon factor for the interaction, which is more conservative than the Greenhouse-Geisser epsilon factor used in R or JASP. Therefore, the corrected p-values of the interaction term will be slightly higher in Pingouin. If you can, always double-check your results.
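
To see why the lower bound is the most conservative choice, recall its usual textbook definition: one divided by the degrees of freedom of the effect. The sketch below follows that definition and is not taken from Pingouin's source; the helper name lower_bound_epsilon and the design sizes are hypothetical.

```python
def lower_bound_epsilon(*n_levels):
    """Lower-bound epsilon: 1 / (degrees of freedom of the effect).

    Pass one level count for a main effect, or two for an interaction.
    """
    dof = 1
    for k in n_levels:
        dof *= k - 1
    return 1.0 / dof


# Hypothetical two-way design: 3 x 4 levels, 10 subjects.
a, b, n = 3, 4, 10
eps = lower_bound_epsilon(a, b)

# Uncorrected degrees of freedom of the interaction and of its error term.
df_effect = (a - 1) * (b - 1)
df_error = df_effect * (n - 1)

# The correction multiplies both degrees of freedom by epsilon.
df_effect_corr = eps * df_effect
df_error_corr = eps * df_error
print(eps, df_effect_corr, df_error_corr)
```

With the lower bound, the corrected numerator degrees of freedom always reduce to 1, so the corrected p-value is never smaller than one based on a larger epsilon estimate.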

Missing values are automatically removed (listwise deletion) using the pingouin.remove_rm_na() function. This could drastically decrease the power of the ANOVA if many missing values are present. In that case, it might be better to use linear mixed effects models.
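
The effect of listwise deletion can be sketched with pandas: any subject with a missing value in at least one condition is dropped entirely. This only illustrates the idea and is not the implementation of pingouin.remove_rm_na(); the column names are hypothetical.

```python
import pandas as pd

# Hypothetical long-format data; subject 2 has a missing "post" score.
long = pd.DataFrame({
    "Subject": [1, 1, 2, 2, 3, 3],
    "Time": ["pre", "post"] * 3,
    "Score": [2.0, 3.0, 4.0, float("nan"), 5.0, 6.0],
})

# One row per subject, one column per condition.
wide = long.pivot(index="Subject", columns="Time", values="Score")

# Listwise deletion: keep only subjects observed in every condition.
complete = wide.dropna()
kept = long[long["Subject"].isin(complete.index)]
print(kept)
```

Here a single missing value removes both observations of subject 2, which shows how quickly the usable sample can shrink when missing values are spread across many subjects.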

References

[1] Bakeman, R. (2005). Recommended effect size statistics for repeated measures designs. Behavior Research Methods, 37(3), 379-384.

[2] Richardson, J. T. (2011). Eta squared and partial eta squared as measures of effect size in educational research. Educational Research Review, 6(2), 135-147.

[3] https://en.wikipedia.org/wiki/Repeated_measures_design

Examples

One-way repeated measures ANOVA using a wide-format dataset

>>> import pingouin as pg
>>> data = pg.read_dataset('rm_anova_wide')
>>> pg.rm_anova(data)
   Source  ddof1  ddof2      F     p-unc    np2    eps
0  Within      3     24  5.201  0.006557  0.394  0.694

One-way repeated-measures ANOVA using a long-format dataset

>>> df = pg.read_dataset('rm_anova')
>>> aov = pg.rm_anova(dv='DesireToKill', within='Disgustingness',
...                   subject='Subject', data=df, detailed=True)
>>> print(aov)
           Source       SS  DF      MS       F        p-unc    np2 eps
0  Disgustingness   27.485   1  27.485  12.044  0.000793016  0.116   1
1           Error  209.952  92   2.282       -            -      -   -

Two-way repeated-measures ANOVA

>>> aov = pg.rm_anova(dv='DesireToKill',
...                   within=['Disgustingness', 'Frighteningness'],
...                   subject='Subject', data=df)

As a pandas.DataFrame method

>>> df.rm_anova(dv='DesireToKill', within='Disgustingness',
...             subject='Subject', detailed=True)
           Source       SS  DF      MS       F        p-unc    np2 eps
0  Disgustingness   27.485   1  27.485  12.044  0.000793016  0.116   1
1           Error  209.952  92   2.282       -            -      -   -