Short primer on power
Power analysis using powerlmm
A power function is a way to evaluate a statistical test.
It gives us a test's sensitivity as a function of a hypothetical true effect.
For a single value, the power function tells us how often a statistical test would correctly reject H0—if that hypothetical effect was actually true.
Power analysis is most often done for sample size planning.
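As a minimal base-R illustration (not part of powerlmm; all values assumed), power.t.test() evaluates this power function at a single point—a two-sample t-test with a standardized effect of 0.5 and 50 subjects per group:

```r
# Power of a two-sided, two-sample t-test at one hypothetical true effect
# (assumed values: d = 0.5, n = 50 per group, alpha = 0.05)
pw <- power.t.test(n = 50, delta = 0.5, sd = 1, sig.level = 0.05)
pw$power
```

For sample size planning you would instead fix the desired power and solve for n.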


powerlmm is an R package designed to perform power analyses for 2- and 3-level models.
Offers both fast and accurate analytical solutions, as well as convenient simulation methods.


Random intercept-only model
Variation in baseline levels
Parallel slopes

Variation in baseline levels
Variation in change over time
Possible correlation between intercepts and slopes

Within-subjects error
Variation around the subject-specific latent slope
powerlmm's syntax

The study_parameters() function.

study_parameters(): Setup

library(powerlmm)
p <- study_parameters(n1 = 11,
                      n2 = 50,
                      icc_pre_subject = 0.5,
                      var_ratio = 0.02,
                      cohend = 0.5)
Standardized inputs
Not all parameters of an LMM affect power.
Often it's the relative magnitude that matters.
I use standardized inputs in this talk; however, study_parameters() also accepts the raw parameter values.
study_parameters(): n1

library(powerlmm)
p <- study_parameters(n1 = 11, n2 = 50, icc_pre_subject = 0.5, var_ratio = 0.02, cohend = 0.5)

n1 is the number of equally spaced time points.
The argument T_end indicates the duration; it defaults to n1 - 1.
Currently, only linear change across time is supported.
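For instance, with n1 = 11 and the default T_end = n1 - 1 = 10, the implied time points are simply:

```r
# 11 equally spaced time points from 0 to T_end = 10
seq(from = 0, to = 10, length.out = 11)
# 0 1 2 3 4 5 6 7 8 9 10
```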
study_parameters(): n2

library(powerlmm)
p <- study_parameters(n1 = 11, n2 = 50, icc_pre_subject = 0.5, var_ratio = 0.02, cohend = 0.5)

n2 is the number of subjects per treatment.
It can be unbalanced: n2 = per_treatment(5, 50)
study_parameters(): Pretest ICC? 🤔️

library(powerlmm)
p <- study_parameters(n1 = 11, n2 = 50, icc_pre_subject = 0.5, var_ratio = 0.02, cohend = 0.5)

icc_pre_subject is the proportion of the total baseline variance that is between-subjects.

icc_pre_subject = var(subject_intercepts) / (var(subject_intercepts) + var(within-subject_error))

icc_pre_subject would be the 2-level ICC if there were no random slopes.
The argument cor_subject indicates the correlation between intercepts and slopes.
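As a sketch of how the standardized inputs map onto raw variance components (the within-subject error variance of 100 is an arbitrary assumed scale, chosen so the results match the values used in the study_parameters() calls in this talk):

```r
# Assumed raw variance components (hypothetical values)
var_error     <- 100   # within-subject error variance
var_intercept <- 100   # between-subject intercept variance
var_slope     <- 2     # between-subject slope variance

icc_pre_subject <- var_intercept / (var_intercept + var_error)
var_ratio       <- var_slope / var_error

c(icc_pre_subject = icc_pre_subject, var_ratio = var_ratio)
# icc_pre_subject = 0.5, var_ratio = 0.02
```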
study_parameters(): Variance ratio? 🤔️

library(powerlmm)
p <- study_parameters(n1 = 11, n2 = 50, icc_pre_subject = 0.5, var_ratio = 0.02, cohend = 0.5)

var_ratio is the ratio of the total random slope variance over the level-1 residual variance.

var_ratio = var(subject_slopes) / var(within-subject_error)

study_parameters(): the effect size

library(powerlmm)
p <- study_parameters(n1 = 11, n2 = 50, icc_pre_subject = 0.5, var_ratio = 0.02, cohend = 0.5)

Standardized effect sizes
cohend indicates the standardized difference between the two groups at posttest.
Different denominators possible:
Baseline SD (default).
Posttest SD.
Random slope SD.
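As a hypothetical illustration of why the denominator matters (all values below are assumed, not from powerlmm), the same raw difference gives different standardized effects:

```r
# Same raw difference, different standardizers (assumed values)
raw_diff <- -7    # assumed treatment effect at posttest
sd_pre   <- 14    # assumed baseline SD
sd_post  <- 20    # assumed posttest SD

raw_diff / sd_pre    # d using the baseline SD: -0.5
raw_diff / sd_post   # d using the posttest SD: -0.35
```

With random slopes the posttest SD is typically larger than the baseline SD, so the same raw difference yields a smaller d.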
library(powerlmm)
p <- study_parameters(n1 = 11, n2 = 50, icc_pre_subject = 0.5, var_ratio = 0.02, dropout = dropout_weibull(proportion = 0.4, rate = 1), cohend = 0.5)

dropout_weibull(...)
proportion is the total dropout at the last time point.
rate > 1 indicates that more dropout occurs later during treatment; rate < 1 that it occurs earlier.
Also supports dropout = dropout_manual(...).
Dropout can differ per treatment group if combined with per_treatment(...)

Just call get_power(...).
get_power(...)

get_power(p)

Power Analysis for Longitudinal Linear Mixed-Effects Models
with missing data and unbalanced designs

              n1 = 11
              n2 = 50 (treatment)
                   50 (control)
                   100 (total)
         dropout = 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 (time)
                   0, 5, 10, 14, 18, 23, 26, 30, 34, 37, 40 (%, control)
                   0, 5, 10, 14, 18, 23, 26, 30, 34, 37, 40 (%, treatment)
icc_pre_subjects = 0.5
       var_ratio = 0.02
       Cohen's d = 0.5
              df = 98
           alpha = 0.05
           power = 42 %

Let's also see what power is without dropout.
get_power(update(p, dropout = 0))$power
[1] 0.5369149


Subjects now vary around their cluster's average.
Clusters' intercepts vary.

Only clusters' slopes vary.
Reasonable if allocation is random.

Clusters can differ both in intercepts and slopes.
Cluster-level intercepts and slopes can be correlated.

study_parameters(): n2

library(powerlmm)
p <- study_parameters(n1 = 11, n2 = 5, n3 = 4, icc_pre_subject = 0.5, icc_pre_cluster = 0.1, icc_slope = 0.05, var_ratio = 0.02, cohend = 0.5)

n2 is now the number of subjects per cluster.

study_parameters(): n3

library(powerlmm)
p <- study_parameters(n1 = 11, n2 = 5, n3 = 4, icc_pre_subject = 0.5, icc_pre_cluster = 0.1, icc_slope = 0.05, var_ratio = 0.02, cohend = 0.5)

n3 is the number of clusters per treatment.

study_parameters(): pretest ICC

library(powerlmm)
p <- study_parameters(n1 = 11, n2 = 5, n3 = 4, icc_pre_subject = 0.5, icc_pre_cluster = 0.1, icc_slope = 0.05, var_ratio = 0.02, cohend = 0.5)
icc_pre_subject now also includes the level-3 intercept variance:

icc_pre_subject = (var(subject_intercepts) + var(cluster_intercepts)) / (var(subject_intercepts) + var(cluster_intercepts) + var(within-subject_error))

icc_pre_cluster indicates the amount of baseline variance at the cluster level:

icc_pre_cluster = var(cluster_intercepts) / (var(subject_intercepts) + var(cluster_intercepts) + var(within-subject_error))

cor_cluster indicates the cluster-level correlation between intercepts and slopes.

study_parameters(): icc_slope

library(powerlmm)
p <- study_parameters(n1 = 11, n2 = 5, n3 = 4, icc_pre_subject = 0.5, icc_pre_cluster = 0.1, icc_slope = 0.05, var_ratio = 0.02, cohend = 0.5)

icc_slope = var(cluster_slopes) / (var(subject_slopes) + var(cluster_slopes))
study_parameters(): The rest

library(powerlmm)
p <- study_parameters(n1 = 11, n2 = 5, n3 = 4, icc_pre_subject = 0.5, icc_pre_cluster = 0.1, icc_slope = 0.05, var_ratio = 0.02, dropout = ..., cohend = 0.5)

The remaining arguments are unchanged from the 2-level model.
Except that var_ratio now also includes the level-3 random slope variance in the numerator.
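As a base-R sketch, the standardized inputs used above pin down the raw variance components once the level-1 error variance is fixed (the value 100 is an arbitrary assumed scale):

```r
# Recover raw variance components from the standardized inputs
# (level-1 error variance of 100 is an assumed scale)
icc_pre_subject <- 0.5
icc_pre_cluster <- 0.1
var_ratio       <- 0.02
icc_slope       <- 0.05
var_error       <- 100

# Baseline (intercept) variances
total_baseline <- var_error / (1 - icc_pre_subject)                 # 200
var_cluster_i  <- icc_pre_cluster * total_baseline                  # 20
var_subject_i  <- icc_pre_subject * total_baseline - var_cluster_i  # 80

# Slope variances
total_slope   <- var_ratio * var_error           # 2
var_cluster_s <- icc_slope * total_slope         # 0.1
var_subject_s <- total_slope - var_cluster_s     # 1.9
```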
Introducing unequal_clusters(...)
p <- study_parameters(n1 = 11, n2 = unequal_clusters(2, 9), icc_pre_subject = 0.5, icc_slope = 0.05, var_ratio = 0.03, cohend = 0.5)

Assume cluster sizes are drawn from a distribution.

n2 <- unequal_clusters(func = runif(n = 5, min = 2, max = 30))
p <- study_parameters(n1 = 11, n2 = n2, icc_pre_subject = 0.5, icc_slope = 0.05, var_ratio = 0.03, cohend = 0.5)
N.B., cluster sizes are assumed to be the same in the treatment and control group.
Use n2 = per_treatment(n2, n2) to sample each group independently.
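To see roughly what one realization of these cluster sizes looks like, the underlying draw can be sketched in base R (the rounding here is only illustrative; powerlmm handles the draw internally):

```r
# One hypothetical realization of five cluster sizes between 2 and 30
set.seed(5445)
round(runif(n = 5, min = 2, max = 30))
```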
Let's do two realizations.

set.seed(5445)
get_power(p)$power
[1] 0.3504035

get_power(p)$power
[1] 0.2414357

We can average over multiple realizations, based on randomly generated cluster sizes, to get the expected power.

get_power(p, R = 50, progress = FALSE)$power
[1] 0.3291297

The argument cores can be used to run this in parallel on multiple CPU cores.
Partially nested studies
study_parameters(): Partially nested

library(powerlmm)
p <- study_parameters(n1 = 11, n2 = 5, n3 = 4, icc_pre_subject = 0.5, icc_pre_cluster = 0.1, icc_slope = 0.05, var_ratio = 0.02, partially_nested = TRUE, cohend = 0.5)

Just add partially_nested = TRUE to remove the third level from the control group.
However, this leads us to another problem...
For some designs, like partially nested studies, exact dfs are hard to specify, and they must be approximated.
For situations with few clusters, we calculate power using the non-central t-distribution.
Ignore the problem (Wald test).
Use between-clusters dfs.
Approximated dfs from data (Satterthwaite).
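The stakes can be sketched with base R's non-central t-distribution (the non-centrality parameter below is an assumed value for illustration, not one computed by powerlmm):

```r
# Two-sided power of a t-test under different df choices,
# via the non-central t-distribution (ncp is an assumed value)
power_t <- function(df, ncp) {
  tcrit <- qt(0.975, df)
  pt(-tcrit, df, ncp) + 1 - pt(tcrit, df, ncp)
}
ncp <- 3
power_t(df = 9999, ncp = ncp)  # Wald-like: df treated as effectively infinite
power_t(df = 4,    ncp = ncp)  # between-cluster df with very few clusters
```

With the same effect, the between-cluster dfs give a much larger critical value and hence much lower power.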

Substantial difference if few clusters are included.

p <- sp(n1 = 11, n2 = 20, n3 = 3, icc_pre_subject = 0.5, cor_subject = -0.5, icc_slope = 0.1, var_ratio = 0.03, partially_nested = TRUE, cohend = 1)

get_power(p, df = 9999)$power
[1] 0.8537807

get_power(p, df = "between")$power
[1] 0.3897371

get_power(p, df = "satterth")$power
[1] 0.6733066

One option is to look at the model-implied SDs, variance-covariance matrices, etc.
p <- sp(n1 = 11, n2 = 5, icc_pre_subject = 0.5, icc_slope = 0.05, var_ratio = 0.3, cohend = 0.5)
get_sds(p) %>% plot

SD at posttest is 4x the pretest SD. I do not expect that much heterogeneity.
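That 4x figure can be sketched by hand from the standardized inputs (assuming a level-1 error variance of 100, so icc_pre_subject = 0.5 implies an intercept variance of 100 and var_ratio = 0.3 a total slope variance of 30; the intercept-slope correlation is assumed to be 0):

```r
# Model-implied SD per time point for a 2-level random-slope model
var_error     <- 100
var_intercept <- 100   # from icc_pre_subject = 0.5
var_slope     <- 30    # from var_ratio = 0.3
time <- 0:10

sd_t <- sqrt(var_intercept + time^2 * var_slope + var_error)
sd_t[11] / sd_t[1]
# 4
```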
get_VPC(p) %>% plot

get_correlation_matrix(p) %>% plot

Not really true.
But it is true that few clusters can be a problem.
It is easy to test this using powerlmm's simulate() method.
simulate(): Test the model's performance

Let's see if 4 clusters per treatment is enough.

p <- study_parameters(n1 = 11, n2 = 30, n3 = 4, icc_pre_subject = 0.5, cor_subject = -0.5, var_ratio = 0.02, icc_slope = 0.05, cohend = 0)
res <- simulate(p, nsim = 5000, satterthwaite = TRUE, cores = 15)

summary(): view the results

summary(res)

Model: correct

Random effects
         parameter  M_est  theta  est_rel_bias  prop_zero  is_NA
 subject_intercept  100.0  100.0      -0.00040       0.00      0
     subject_slope    1.9    1.9       0.00129       0.00      0
     cluster_slope    0.1    0.1       0.00831       0.11      0
             error  100.0  100.0       0.00036       0.00      0
       cor_subject   -0.5   -0.5      -0.00653       0.00      0

Fixed effects
      parameter     M_est  theta  M_se  SD_est  Power  Power_bw  Power_satt
    (Intercept)  -0.01586      0  1.05    1.04  0.047        NA         NaN
      treatment   0.00878      0  1.48    1.48  0.049        NA         NaN
           time  -0.00116      0  0.22    0.22  0.071        NA         NaN
 time:treatment   0.00047      0  0.30    0.30  0.072     0.026       0.048
---
Number of simulations: 5000 | alpha: 0.05
Time points (n1): 11
Subjects per cluster (n2 x n3): 30 x 4 (treatment)
                                30 x 4 (control)
Total number of subjects: 240

(icc_slope) 😱 😱 😱
(n2 × n3) 🤑
Several of these factors interact 😭
powerlmm

More effect sizes ✅
Beyond linear change; polynomials, piecewise, or categorical time.
Non-ignorable missing data processes.
Power for 3-way interactions (tx moderators).
Clusters as a crossed factor ("subject-level randomization").
Power for alternative models that account for clustering.
Better simulation support for custom models.
More tutorials and examples are included in powerlmm's documentation.
Found a bug? Report it here: https://github.com/rpsychologist/powerlmm.
Technical paper is under review.
Slides created using the R package xaringan, which is powered by remarkjs.
FiraCode: Copyright (c) Nikita Prokopov http://tonsky.me
Lato: Copyright (c) Łukasz Dziedzic
Open Sans: Copyright (c) Steve Matteson
Noto Serif: Copyright (c) Google