
Power Analysis for Longitudinal 2- and 3-Level Models

Challenges and Some Solutions Using the R Package powerlmm

Kristoffer Magnusson @krstoffr
http://rpsychologist.com

April 11, 2018 Stockholm University

1 / 61

Overview

  • Short primer on power

  • Power analysis using powerlmm

    • 2-level models
    • 3-level models
2 / 61

A Power Primer

3 / 61

Power function

  • A power function is a method to evaluate a statistical test.

  • Gives us a test's sensitivity as a function of a hypothetical true effect.

  • For a single value, the power function tells us how often a statistical test would correctly reject H0, if the hypothetical effect were actually true.

  • Often done for sample size planning.

    • Or to choose between different statistical tests.
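
A quick non-powerlmm illustration: base R's power.t.test() evaluates such a power function for a simple two-group comparison (the effect size and group size below are hypothetical).

# Hypothetical: power to detect a standardized difference of 0.5 with 50 subjects per group
power.t.test(n = 50, delta = 0.5, sd = 1, sig.level = 0.05)
# Or solve for the per-group n needed to reach 80% power
power.t.test(power = 0.80, delta = 0.5, sd = 1, sig.level = 0.05)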

4 / 61

About powerlmm

R package designed to perform power analyses for 2- and 3-level models.

Offers fast and accurate analytical solutions as well as convenient simulation methods.
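
A minimal setup sketch, assuming installation from CRAN:

install.packages("powerlmm")
library(powerlmm)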

5 / 61

Multilevel models

2-level models

6 / 61

2-level model

7 / 61

Random intercept-only

  • Random intercept-only model

  • Variation in baseline levels

  • Parallel slopes

8 / 61

Random intercept & slope

  • Variation in baseline levels

  • Variation in change over time

  • Possible correlation between intercepts and slopes

9 / 61

And let's not forget level 1

  • Within-subjects error

  • Variation around the subject-specific latent slope

10 / 61

How to calculate power for 2-level models?

11 / 61

Introducing powerlmm's syntax

The study_parameters() function.

12 / 61

study_parameters(): Setup

library(powerlmm)
p <- study_parameters(n1 = 11,
                      n2 = 50,
                      icc_pre_subject = 0.5,
                      var_ratio = 0.02,
                      cohend = 0.5)

Standardized inputs

  • Not all parameters of an LMM affect power.

  • Often it is the relative magnitude that matters.

  • I use standardized inputs in this talk; however, study_parameters() also accepts the raw parameter values.

13 / 61

study_parameters(): n1

library(powerlmm)
p <- study_parameters(n1 = 11,
                      n2 = 50,
                      icc_pre_subject = 0.5,
                      var_ratio = 0.02,
                      cohend = 0.5)
  • n1 is the number of equally spaced time points.

  • The argument T_end indicates the study duration; it defaults to n1 - 1.

  • Currently, only linear change across time is supported.

14 / 61

study_parameters(): n2

library(powerlmm)
p <- study_parameters(n1 = 11,
                      n2 = 50,
                      icc_pre_subject = 0.5,
                      var_ratio = 0.02,
                      cohend = 0.5)
  • n2 is the number of subjects per treatment group.

  • Groups can be unbalanced, e.g., n2 = per_treatment(5, 50); see the sketch below.
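
A minimal sketch of an unbalanced setup, reusing the other arguments from above (the group sizes here are hypothetical):

p_unbal <- study_parameters(n1 = 11,
                            n2 = per_treatment(25, 50),
                            icc_pre_subject = 0.5,
                            var_ratio = 0.02,
                            cohend = 0.5)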

15 / 61

study_parameters(): Pretest ICC? 🤔️

library(powerlmm)
p <- study_parameters(n1 = 11,
                      n2 = 50,
                      icc_pre_subject = 0.5,
                      var_ratio = 0.02,
                      cohend = 0.5)
  • icc_pre_subject is the proportion of the total baseline variance that is between subjects. $$ \texttt{icc\_pre\_subject} = \frac{\text{var(subject intercepts)}}{\text{var(subject intercepts)} + \text{var(within-subject error)}} $$
  • icc_pre_subject would be the 2-level ICC if there were no random slopes.

  • The argument cor_subject indicates the correlation between subject-level intercepts and slopes.

16 / 61

study_parameters(): Variance ratio? 🤔️

library(powerlmm)
p <- study_parameters(n1 = 11,
                      n2 = 50,
                      icc_pre_subject = 0.5,
                      var_ratio = 0.02,
                      cohend = 0.5)
  • var_ratio is the ratio of the total random slope variance over the level-1 residual variance. $$ \texttt{var\_ratio} = \frac{\text{var(subject slopes)}}{\text{var(within-subject error)}} $$
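
To connect the standardized inputs to raw variance components, here is a small arithmetic sketch; the variances are made up for illustration and reproduce the 0.5 and 0.02 used above.

# Hypothetical raw variance components
var_subject_intercept <- 100  # between-subject intercept variance
var_subject_slope     <- 2    # between-subject slope variance
var_error             <- 100  # within-subject (level-1) error variance

var_subject_intercept / (var_subject_intercept + var_error)  # icc_pre_subject = 0.5
var_subject_slope / var_error                                 # var_ratio = 0.02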
17 / 61

study_parameters(): the effect size

library(powerlmm)
p <- study_parameters(n1 = 11,
                      n2 = 50,
                      icc_pre_subject = 0.5,
                      var_ratio = 0.02,
                      cohend = 0.5)

Standardized effect sizes

  • cohend indicates the standardized difference between the two groups at posttest.

  • Different denominators possible:

    • Baseline SD (default).

    • Posttest SD.

    • Random slope SD.

18 / 61

How to deal with missing data?

19 / 61
library(powerlmm)
p <- study_parameters(n1 = 11,
                      n2 = 50,
                      icc_pre_subject = 0.5,
                      var_ratio = 0.02,
                      dropout = dropout_weibull(proportion = 0.4,
                                                rate = 1),
                      cohend = 0.5)

dropout_weibull(...)

  • proportion is the total dropout at the last time point.

  • rate > 1 indicates that more dropout occurs later during treatment; rate < 1 that it occurs earlier during treatment.

  • dropout = dropout_manual(...) is also supported.

  • Dropout can differ per treatment group if combined with per_treatment(...); see the sketch below.
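
A minimal sketch of group-specific dropout; update() is shown later in this talk, and the proportions and rates here are hypothetical.

p2 <- update(p, dropout = per_treatment(dropout_weibull(proportion = 0.2, rate = 2),
                                        dropout_weibull(proportion = 0.4, rate = 1)))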

20 / 61

"OK, but show me the power!"

21 / 61

"OK, but show me the power!"

Just call get_power(...).

21 / 61

get_power(...)

get_power(p)
     Power Analysis for Longitudinal Linear Mixed-Effects Models
                with missing data and unbalanced designs

              n1 = 11
              n2 = 50 (treatment)
                   50 (control)
                   100 (total)
         dropout = 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 (time)
                   0, 5, 10, 14, 18, 23, 26, 30, 34, 37, 40 (%, control)
                   0, 5, 10, 14, 18, 23, 26, 30, 34, 37, 40 (%, treatment)
icc_pre_subjects = 0.5
       var_ratio = 0.02
       Cohen's d = 0.5
              df = 98
           alpha = 0.05
           power = 42 %

Let's also see what power is without dropout.

get_power(update(p, dropout = 0))$power
[1] 0.5369149
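
With 42% power under dropout, a rough way to explore the required sample size is to vary n2 via update() and recompute power (assuming update() accepts n2 the same way it accepts dropout; the grid of sizes is arbitrary):

sapply(seq(50, 150, by = 25), function(n) get_power(update(p, n2 = n))$power)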
22 / 61

3-level models

23 / 61

3-level model

24 / 61

3-level model (partially nested)

25 / 61

Random intercept-only

  • Subjects now vary around their cluster's average.

  • Clusters' intercepts vary.

26 / 61

Random slope-only

  • Only clusters' slopes vary.

  • Reasonable if allocation is random.

27 / 61

Random intercept & slope

  • Clusters can differ both in intercepts and slopes.

  • Cluster-level intercepts and slopes can be correlated.

28 / 61

Why you should care

29 / 61

Power analysis for 3-level models

Extending study_parameters()

30 / 61

study_parameters(): n2

library(powerlmm)
p <- study_parameters(n1 = 11,
                      n2 = 5,
                      n3 = 4,
                      icc_pre_subject = 0.5,
                      icc_pre_cluster = 0.1,
                      icc_slope = 0.05,
                      var_ratio = 0.02,
                      cohend = 0.5)
  • n2 is now the number of subjects per cluster.
31 / 61

study_parameters(): n3

library(powerlmm)
p <- study_parameters(n1 = 11,
                      n2 = 5,
                      n3 = 4,
                      icc_pre_subject = 0.5,
                      icc_pre_cluster = 0.1,
                      icc_slope = 0.05,
                      var_ratio = 0.02,
                      cohend = 0.5)
  • n3 is the number of clusters per treatment.
32 / 61

study_parameters(): pretest ICC

library(powerlmm)
p <- study_parameters(n1 = 11,
                      n2 = 5,
                      n3 = 4,
                      icc_pre_subject = 0.5,
                      icc_pre_cluster = 0.1,
                      icc_slope = 0.05,
                      var_ratio = 0.02,
                      cohend = 0.5)
  • icc_pre_subject now also includes the level-3 intercept variance.

$$ {\small \frac{\text{var(subject intercepts)} + \text{var(cluster intercepts)}}{\text{var(subject intercepts)} + \text{var(cluster intercepts)} + \text{var(within-subject error)}} } $$

33 / 61

study_parameters(): pretest ICC

library(powerlmm)
p <- study_parameters(n1 = 11,
                      n2 = 5,
                      n3 = 4,
                      icc_pre_subject = 0.5,
                      icc_pre_cluster = 0.1,
                      icc_slope = 0.05,
                      var_ratio = 0.02,
                      cohend = 0.5)
  • icc_pre_cluster indicates the proportion of the baseline variance at the cluster level.

$$ {\small \frac{\text{var(cluster intercepts)}}{\text{var(subject intercepts)} + \text{var(cluster intercepts)} + \text{var(within-subject error)}} } $$

  • cor_cluster indicates the cluster-level correlation between intercepts and slopes.
34 / 61

study_parameters(): icc_slope

library(powerlmm)
p <- study_parameters(n1 = 11,
                      n2 = 5,
                      n3 = 4,
                      icc_pre_subject = 0.5,
                      icc_pre_cluster = 0.1,
                      icc_slope = 0.05,
                      var_ratio = 0.02,
                      cohend = 0.5)
  • The proportion of the total slope variance located at the cluster level.

$$ \texttt{icc\_slope} = \frac{\text{var(cluster slopes)}}{\text{var(subject slopes)} + \text{var(cluster slopes)}} $$

  • Can be viewed as the correlation between the latent slopes of any two subjects belonging to the same cluster.
35 / 61

study_parameters(): The rest

library(powerlmm)
p <- study_parameters(n1 = 11,
                      n2 = 5,
                      n3 = 4,
                      icc_pre_subject = 0.5,
                      icc_pre_cluster = 0.1,
                      icc_slope = 0.05,
                      var_ratio = 0.02,
                      dropout = ...,
                      cohend = 0.5)
  • The remaining arguments are unchanged from the 2-level model.

  • Except that var_ratio now also includes the cluster-level (level-3) random slope variance in the numerator; see the sketch below.
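
As in the 2-level case, these ratios map onto raw variance components; a small arithmetic sketch with made-up values that reproduce the inputs above:

# Hypothetical raw variance components
var_subject_intercept <- 80
var_cluster_intercept <- 20
var_subject_slope     <- 1.9
var_cluster_slope     <- 0.1
var_error             <- 100

(var_subject_intercept + var_cluster_intercept) /
  (var_subject_intercept + var_cluster_intercept + var_error)   # icc_pre_subject = 0.5
var_cluster_intercept /
  (var_subject_intercept + var_cluster_intercept + var_error)   # icc_pre_cluster = 0.1
var_cluster_slope / (var_subject_slope + var_cluster_slope)     # icc_slope = 0.05
(var_subject_slope + var_cluster_slope) / var_error             # var_ratio = 0.02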

36 / 61

"What if cluster sizes are not balanced?"

Introducing unequal_clusters(...)

37 / 61

unequal_clusters(...)

p <- study_parameters(n1 = 11,
                      n2 = unequal_clusters(2, 9),
                      icc_pre_subject = 0.5,
                      icc_slope = 0.05,
                      var_ratio = 0.03,
                      cohend = 0.5)
38 / 61

"I don't know the exact cluster sizes!"

39 / 61

"I don't know the exact cluster sizes!"

Assume cluster sizes are drawn from a distribution.

39 / 61

Sample cluster sizes

n2 <- unequal_clusters(func = runif(n = 5, min = 2, max = 30))
p <- study_parameters(n1 = 11,
                      n2 = n2,
                      icc_pre_subject = 0.5,
                      icc_slope = 0.05,
                      var_ratio = 0.03,
                      cohend = 0.5)
  • N.B., cluster sizes are assumed to be the same in the treatment and control group.

  • Use n2 = per_treatment(n2, n2) to sample each group independently, as sketched below.
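
A sketch of independent sampling per group, mirroring the call above:

n2 <- unequal_clusters(func = runif(n = 5, min = 2, max = 30))
p_indep <- study_parameters(n1 = 11,
                            n2 = per_treatment(n2, n2),
                            icc_pre_subject = 0.5,
                            icc_slope = 0.05,
                            var_ratio = 0.03,
                            cohend = 0.5)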

41 / 61

Sample cluster sizes

Let's do two realizations.

set.seed(5445)
get_power(p)$power
[1] 0.3504035
get_power(p)$power
[1] 0.2414357

🤔️

42 / 61

Get the expected power

We can average over multiple realizations, based on randomly generated cluster sizes, to get the expected power.

get_power(p,
          R = 50,
          progress = FALSE)$power
[1] 0.3291297

The argument cores can be used to run this in parallel on multiple CPU cores.
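
For example, with an arbitrary number of cores:

get_power(p, R = 50, cores = 4, progress = FALSE)$power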

43 / 61

"The control group is unclustered"

Partially nested studies

44 / 61

study_parameters(): Partially nesting

library(powerlmm)
p <- study_parameters(n1 = 11,
                      n2 = 5,
                      n3 = 4,
                      icc_pre_subject = 0.5,
                      icc_pre_cluster = 0.1,
                      icc_slope = 0.05,
                      var_ratio = 0.02,
                      partially_nested = TRUE,
                      cohend = 0.5)

Just add partially_nested = TRUE to remove the third level from the control group.

However, this leads us to another problem...

45 / 61

⚠️

Degrees of freedom

For some designs, like partially nested studies, exact dfs are hard to specify, and they must be approximated.

46 / 61

Choice of dfs matters

For situations with few clusters, we calculate power using the non-central t-distribution.

Different df options

  • Ignore the problem (Wald test).

  • Use between-clusters dfs.

  • Approximated dfs from data (Satterthwaite).

47 / 61

Comparing dfs

Substantial difference if few clusters are included.

p <- sp(n1 = 11,
        n2 = 20,
        n3 = 3,
        icc_pre_subject = 0.5,
        cor_subject = -0.5,
        icc_slope = 0.1,
        var_ratio = 0.03,
        partially_nested = TRUE,
        cohend = 1)
get_power(p, df = 9999)$power
[1] 0.8537807
get_power(p, df = "between")$power
[1] 0.3897371
get_power(p, df = "satterth")$power
[1] 0.6733066
48 / 61

🤔️

"I have no idea what parameter values are reasonable"

One option is to look at the model-implied SDs, variance-covariance matrices, etc.

49 / 61

Unreasonable model?

p <- sp(n1 = 11,
        n2 = 5,
        icc_pre_subject = 0.5,
        icc_slope = 0.05,
        var_ratio = 0.3,
        cohend = 0.5)
get_sds(p) %>% plot

SD at posttest is 4x the pretest SD. I do not expect that much heterogeneity.

50 / 61

Unreasonable model?

get_VPC(p) %>% plot

51 / 61

Unreasonable model?

get_correlation_matrix(p) %>% plot

52 / 61

"Wait, I've heard you need at least 30 clusters?"

  • Not really true.

  • But it is true that few clusters can be a problem.

    • at least for estimating variance components.
  • It is easy to test this using powerlmm's simulate() method.

53 / 61

simulate(): Test the model's performance

Let's see if 4 clusters per treatment is enough.

p <- study_parameters(n1 = 11,
                      n2 = 30,
                      n3 = 4,
                      icc_pre_subject = 0.5,
                      cor_subject = -0.5,
                      var_ratio = 0.02,
                      icc_slope = 0.05,
                      cohend = 0)

res <- simulate(p,
                nsim = 5000,
                satterthwaite = TRUE,
                cores = 15)
54 / 61

summary(): view the results

summary(res)
Model: correct

Random effects

        parameter  M_est  theta  est_rel_bias  prop_zero  is_NA
subject_intercept  100.0  100.0      -0.00040       0.00      0
    subject_slope    1.9    1.9       0.00129       0.00      0
    cluster_slope    0.1    0.1       0.00831       0.11      0
            error  100.0  100.0       0.00036       0.00      0
      cor_subject   -0.5   -0.5      -0.00653       0.00      0

Fixed effects

     parameter     M_est  theta  M_se  SD_est  Power  Power_bw  Power_satt
   (Intercept)  -0.01586      0  1.05    1.04  0.047        NA        NaN
     treatment   0.00878      0  1.48    1.48  0.049        NA        NaN
          time  -0.00116      0  0.22    0.22  0.071        NA        NaN
time:treatment   0.00047      0  0.30    0.30  0.072     0.026      0.048
---
Number of simulations: 5000 | alpha: 0.05
Time points (n1): 11
Subjects per cluster (n2 x n3): 30 x 4 (treatment)
                                30 x 4 (control)
Total number of subjects: 240
55 / 61

Summary

Factors that influence power

57 / 61

Summary: Factors that influence power

2-level

  • Variance ratio (random slopes) 😱 😱
  • Total number of subjects
  • Number of time points
  • Effect size
    • Pretest ICCs (if standardized ES)
  • Missing data
  • Balance

3-level (in addition to 2-level factors)

  • 3-level random slope (icc_slope) 😱 😱 😱
    • Number of clusters
    • Cluster imbalance
  • If random intercept-only
    • Total number of subjects (n2 × n3) 🤑

Several of these factors interact 😭

58 / 61

Roadmap for powerlmm

  • More effect sizes ✅

  • Beyond linear change: polynomials, piecewise, or categorical time.

  • Non-ignorable missing data processes.

  • Power for 3-way interactions (tx moderators).

  • Clusters as a crossed factor ("subject-level randomization").

  • Power for alternative models that account for clustering.

  • Better simulation support for custom models.

59 / 61

Thanks!

More tutorials and examples are included in powerlmm's documentation.

Found a bug? Report it here: https://github.com/rpsychologist/powerlmm.

Technical paper is under review.

60 / 61

❤️

Slides created using the R package xaringan, which is powered by remarkjs.

Fonts

61 / 61
