Equivalence, non-inferiority and superiority testing

an interactive visualization

Created by Kristoffer Magnusson

It is not uncommon to see researchers conclude that two treatments are equally effective based on a non-significant test of the null hypothesis, or that shortening a treatment yields effects that are no worse than the standard (longer) treatment, based on p > 0.05. Neither conclusion follows: failing to reject the null hypothesis is not evidence that the effect is zero. Much has been written about this, and in medicine the appropriate tests for these kinds of hypotheses are equivalence and non-inferiority tests. When testing for equivalence, we test whether a treatment effect lies inside a prespecified equivalence margin [-Δ, Δ]. Similarly, when testing whether a treatment is no worse than another treatment, we test whether the effect is above a prespecified non-inferiority margin -Δ. My aim with this visualization is to show the decision rules associated with these different types of hypotheses. It also shows how power relates to the different tests and to different values of Δ, d and n.

Below I use a 95 % confidence interval to demonstrate the different hypotheses. You can move the CI around using the sliders or by clicking and dragging. Results of the test of treatment differences will automatically be highlighted.

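Settings

[Interactive controls: sliders for the observed effect (default d = 1), the sample size (default n = 10) and the margin (default Δ = 0.3). The plot shows the 95 % CI, the resulting conclusion (with the defaults: "Effect of new treatment is superior") and the power.]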

Superiority
H0: d = 0
Ha: d > 0

Non-inferiority
H0: d ≤ -Δ
Ha: d > -Δ

Equivalence
H0: d ≤ -Δ or d ≥ Δ
Ha: -Δ < d < Δ
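The decision rules for these three hypotheses can be sketched in a few lines of Python. This is only an illustration, not the code behind the visualization: the function name `classify` and its return format are hypothetical, and it assumes a standardized effect (σ = 1), n observations per group and a normal approximation for the CI.

```python
from math import sqrt
from statistics import NormalDist


def classify(d, n, margin, alpha=0.025):
    """Apply the CI-based decision rules to an observed standardized effect d.

    Assumes sigma = 1, n observations per group and a normal approximation.
    """
    z = NormalDist().inv_cdf(1 - alpha)  # critical value of the standard normal
    se = sqrt(2 / n)                     # standard error of d
    lower, upper = d - z * se, d + z * se
    return {
        "ci": (lower, upper),
        "superior": lower > 0,                              # CI entirely above 0
        "non_inferior": lower > -margin,                    # CI entirely above -margin
        "equivalent": lower > -margin and upper < margin,   # CI inside the margin
    }


# The default settings of the visualization: d = 1, n = 10, margin = 0.3
result = classify(d=1.0, n=10, margin=0.3)
print(result)
```

With these settings the lower CI limit is above 0, so superiority (and therefore non-inferiority) is shown, while the upper limit is far above Δ, so equivalence is not.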

Technical notes

Power is calculated using the power functions below. Note that α is 0.025 for all tests, since a 95 % CI is used. The functions rely on normal approximations, so power will be slightly off for very small sample sizes.

Power of equivalence test

$$ 1 - \beta = \Phi \left( \frac{\Delta - d}{\sqrt{2/n}} - Z_{1-\alpha}\right) + \Phi \left( \frac{d + \Delta}{\sqrt{2/n}} - Z_{1-\alpha}\right) - 1$$

Power is taken to be 0 when this expression is negative.

Power of non-inferiority test

$$ 1 - \beta =\Phi \left( \frac{d + \Delta}{\sqrt{2/n}} - Z_{1-\alpha}\right)$$

Power of superiority test

$$ 1 - \beta =\Phi \left( \frac{d}{\sqrt{2/n}} - Z_{1-\alpha}\right)$$

where \(\Phi\) is the cumulative distribution function of the standard normal distribution, \(d\) is Cohen's d, \(\Delta\) is the non-inferiority or equivalence margin, \(n\) is the sample size per group, and \(Z_{1-\alpha}\) is the 100(\(1-\alpha\))th percentile of the standard normal distribution.

Formulas are adapted from Julious, Steven A. "Sample sizes for clinical trials with normal data." Statistics in medicine 23.12 (2004): 1921-1986.
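As a rough sketch, the three power functions can be written in Python with only the standard library. The function names are hypothetical and this is not the code used by the visualization. The equivalence power is written with the signed distances Δ − d and d + Δ (identical to the absolute values whenever −Δ < d < Δ) and is truncated at zero, which is an assumption on my part about how effects outside the margin should be handled.

```python
from math import sqrt
from statistics import NormalDist

_norm = NormalDist()  # standard normal: provides cdf (Phi) and inv_cdf (Z)


def power_superiority(d, n, alpha=0.025):
    """Normal-approximation power of the one-sided superiority test."""
    return _norm.cdf(d / sqrt(2 / n) - _norm.inv_cdf(1 - alpha))


def power_non_inferiority(d, n, margin, alpha=0.025):
    """Normal-approximation power of the non-inferiority test."""
    return _norm.cdf((d + margin) / sqrt(2 / n) - _norm.inv_cdf(1 - alpha))


def power_equivalence(d, n, margin, alpha=0.025):
    """Normal-approximation power of the TOST equivalence test."""
    se = sqrt(2 / n)
    z = _norm.inv_cdf(1 - alpha)
    power = _norm.cdf((margin - d) / se - z) + _norm.cdf((d + margin) / se - z) - 1
    return max(0.0, power)  # assumption: truncate negative values at zero


# Example: true effect d = 0.5 with 64 participants per group
print(round(power_superiority(0.5, 64), 3))
print(round(power_non_inferiority(0.0, 64, 0.5), 3))
print(round(power_equivalence(0.0, 100, 0.5), 3))
```

Note that the non-inferiority power for a true effect d and margin Δ equals the superiority power for a true effect of d + Δ, which is why the first two printed values agree.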

Type I error

Non-inferiority is shown if the lower limit of a two-sided (1–2α)×100% CI is above -Δ. Here that is a 95 % CI, so the significance level is 0.025. With the two one-sided tests (TOST) procedure, equivalence is likewise tested using a (1–2α)×100% CI, so its significance level is also 0.025. In the visualization, superiority is also tested with a one-tailed test at a significance level of 0.025. To use a 0.05 significance level instead, we would use 90 % CIs.
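The correspondence between the one-sided significance level and the CI level can be checked with a small sketch. The helper names `ci_level` and `critical_value` are made up for this illustration, and a normal approximation is assumed.

```python
from statistics import NormalDist


def ci_level(alpha):
    """Level of the two-sided CI whose limits match one-sided tests at level alpha."""
    return 1 - 2 * alpha


def critical_value(alpha):
    """Standard normal critical value used for each CI limit."""
    return NormalDist().inv_cdf(1 - alpha)


for alpha in (0.025, 0.05):
    print(f"alpha = {alpha}: {ci_level(alpha):.0%} CI, z = {critical_value(alpha):.3f}")
```

A larger α gives a smaller critical value and therefore a narrower CI, which makes each of the three conclusions easier to reach.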

Suggestions, errors, and bugs

Have any suggestions, or found a bug? Send them to me; my contact info can be found here.