Cohen's d is an immensely popular effect size in psychology. However, its interpretation is not straightforward, and researchers often fall back on general guidelines, such as small (0.2), medium (0.5) and large (0.8), when interpreting an effect. Moreover, in many cases it is questionable whether the standardized mean difference is more interpretable than the unstandardized mean difference.
In order to aid the interpretation of Cohen’s d, this visualization offers these different representations of Cohen's d: visual overlap, Cohen’s U3, the probability of superiority, percentage of overlap, and the number needed to treat. It also lets you change the standard deviation and displays the unstandardized difference.
A Common Language Explanation
With a Cohen's d of 0.8, 78.8% of the "treatment" group will be above the mean of the "control" group (Cohen's U3), 68.9% of the two groups will overlap, and there is a 71.4% chance that a person picked at random from the treatment group will have a higher score than a person picked at random from the control group (probability of superiority). Moreover, in order to have one more favorable outcome in the treatment group compared to the control group, we need to treat 3.5 people on average. This means that if there are 100 people in each group, and we assume that 20 people have favorable outcomes in the control group, then 20 + 28.3 people in the treatment group will have favorable outcomes. [1]
[1] The values are averages, and it is assumed that 20% (CER) of the control group have "favorable outcomes," i.e., their outcomes are below some cut-off. Change this by pressing the settings symbol to the right of the slider. Go to the formula section for more information.
Written by Kristoffer Magnusson, a researcher in clinical psychology. You should follow him on Twitter and come hang out on the open science Discord, Git Gud Science.
FAQ
How do I use this visualization?
Change Cohen’s d
Use the slider to change Cohen’s d, or open the settings drawer and change the parameters. The inputs can also be controlled using the keyboard arrows.
Settings
You can change the following settings by clicking on the settings icon to the right of the slider.
- Parameters
  - Mean 1
  - Mean 2
  - SD
  - Control group event rate (CER)
- Labels
  - X axis
  - Distribution 1
  - Distribution 2
- Slider settings
  - Slider Max
  - Slider Step: Controls the step size of the slider
Save settings
The settings can be saved in your browser’s localStorage and will thus persist across visits.
Pan and rescale
You can pan the x axis by clicking and dragging the visualization. Double-click the visualization to center and rescale it.
Offline use
This site is cached using a service worker and will work even when you are offline.
What are the formulas?
Cohen’s d
Cohen’s d is simply the standardized mean difference,
$$\delta = \frac{\mu_2 - \mu_1}{\sigma},$$
where $\delta$ is the population parameter of Cohen’s d, $\mu_i$ is the mean of the respective population, and it is assumed that $\sigma_1 = \sigma_2 = \sigma$, i.e., homogeneous population variances.
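As a small illustration of the standardized versus unstandardized difference, the sketch below computes both in R. The means and SD are hypothetical values chosen only for this example, and treating Mean 1 as the control group and Mean 2 as the treatment group is an assumption.

```r
# Hypothetical population values, chosen only for illustration
mean_1 <- 100  # assumed control group mean
mean_2 <- 112  # assumed treatment group mean
sd_pop <- 15   # common SD, assumed equal in both populations

raw_diff <- mean_2 - mean_1    # unstandardized mean difference
cohens_d <- raw_diff / sd_pop  # standardized mean difference (Cohen's d)

raw_diff  # 12
cohens_d  # 0.8
```

The same d of 0.8 would result from a difference of 1.2 with an SD of 1.5, which is why the unstandardized difference is worth reporting alongside it.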
Cohen’s U3
Cohen (1977) defined U3 as a measure of non-overlap, where “we take the percentage of the A population which the upper half of the cases of the B population exceeds”. Cohen’s d can be converted to Cohen’s U3 using the following formula
$$U_3 = \Phi(\delta),$$
where $\Phi$ is the cumulative distribution function of the standard normal distribution, and $\delta$ the population Cohen’s d.
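A minimal sketch of this conversion in R, assuming U3 equals the standard normal CDF evaluated at d; the function name `cohens_u3` is made up for this example.

```r
# Cohen's U3 = Phi(d), with Phi the standard normal CDF (pnorm in base R)
cohens_u3 <- function(d) pnorm(d)

round(cohens_u3(0.8), 3)  # 0.788
```

This agrees with the 78.8% reported for d = 0.8 in the common language explanation.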
Overlap
Generally called the overlapping coefficient (OVL). Cohen’s d can be converted to OVL using the following formula (Reiser and Faraggi, 1999)
$$\text{OVL} = 2\Phi(-|\delta| / 2),$$
where $\Phi$ is the cumulative distribution function of the standard normal distribution, and $\delta$ the population Cohen’s d.
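A minimal sketch in R, assuming two normal distributions with equal variances so that OVL = 2Φ(−|d|/2); the function name `overlap` is illustrative.

```r
# Overlapping coefficient for two equal-variance normal distributions:
# OVL = 2 * Phi(-|d| / 2)
overlap <- function(d) 2 * pnorm(-abs(d) / 2)

round(overlap(0.8), 3)  # 0.689
overlap(0)              # 1, identical distributions overlap completely
```

Because of the absolute value, the sign of d does not matter: `overlap(-0.8)` gives the same result as `overlap(0.8)`.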
Probability of superiority
This effect size goes by many names: common language effect size (CL), area under the receiver operating characteristic curve (AUC), or just A for its non-parametric version (Ruscio & Mullen, 2012). It is meant to be more intuitive for persons without any training in statistics. The effect size gives the probability that a person picked at random from the treatment group will have a higher score than a person picked at random from the control group. Cohen’s d can be converted to CL using the following formula (Ruscio, 2008)
$$\text{CL} = \Phi\left(\frac{\delta}{\sqrt{2}}\right),$$
where $\Phi$ is the cumulative distribution function of the standard normal distribution, and $\delta$ the population Cohen’s d.
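A minimal sketch in R, assuming CL = Φ(d/√2) for two normal distributions with equal variances. The simulation underneath is only an illustrative check of the "random pair" interpretation; the function name, seed, and sample size are arbitrary choices.

```r
# Probability of superiority (CL) = Phi(d / sqrt(2))
prob_superiority <- function(d) pnorm(d / sqrt(2))

round(prob_superiority(0.8), 3)  # 0.714

# Simulation check: share of random treatment/control pairs in which
# the treatment score is higher (both groups normal with SD = 1)
set.seed(1)
n <- 1e5
treatment <- rnorm(n, mean = 0.8)
control <- rnorm(n, mean = 0)
mean(treatment > control)  # close to the analytic value
```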
Number Needed to Treat
NNT is the number of patients we would need to treat with the intervention to achieve one more favorable outcome compared to the control group. Furukawa and Leucht (2011) give the following formula for converting Cohen’s d into NNT
$$\text{NNT} = \frac{1}{\Phi(\delta + \Psi(\text{CER})) - \text{CER}},$$
where $\Phi$ is the cumulative distribution function of the standard normal distribution and $\Psi$ its inverse, CER is the control group’s event rate, and $\delta$ the population Cohen’s d. N.B. CER is set to 20% in the visualization above. You can change this by pressing the settings symbol to the right of the slider. The definition of an “event” or a “response” is arbitrary and could be defined as the proportion of patients who are in remission, e.g., below some cut-off on a standardized questionnaire. It is possible to convert Cohen’s d into a version of NNT that is invariant to the event rate of the control group. The interested reader should look at Furukawa and Leucht (2011), where a convincing argument is given as to why this complicates the interpretation of NNT.
R code to calculate NNT from Cohen’s d
Since many have asked about R code for the formula above, here it is
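The following is a minimal sketch in R of the Furukawa and Leucht (2011) conversion, NNT = 1 / (Φ(d + Ψ(CER)) − CER), using `pnorm` for Φ and `qnorm` for its inverse Ψ. The function name `nnt_from_d` and the default CER of 0.2 are choices made for this example, not necessarily the author's original snippet.

```r
# NNT from Cohen's d and the control group event rate (CER):
# NNT = 1 / (Phi(d + Psi(CER)) - CER)
nnt_from_d <- function(d, cer = 0.2) {
  eer <- pnorm(d + qnorm(cer))  # implied event rate in the treatment group
  1 / (eer - cer)
}

round(nnt_from_d(0.8), 1)  # 3.5

# Extra favorable outcomes per 100 treated, relative to control
round(100 / nnt_from_d(0.8), 1)  # 28.3
```

Both values agree with the common language explanation for d = 0.8 and CER = 20%. Note that the function is undefined for d = 0, since the two event rates are then equal.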
References
- Baguley, T. (2009). Standardized or simple effect size: What should be reported? British Journal of Psychology, 100(Pt 3), 603–617.
- Cohen, J. (1977). Statistical power analysis for the behavioral sciences. Routledge.
- Furukawa, T. A., & Leucht, S. (2011). How to obtain NNT from Cohen’s d: Comparison of two methods. PLoS ONE, 6(4).
- Reiser, B., & Faraggi, D. (1999). Confidence intervals for the overlapping coefficient: The normal equal variance case. Journal of the Royal Statistical Society, 48(3), 413–418.
- Ruscio, J. (2008). A probability-based measure of effect size: Robustness to base rates and other factors. Psychological Methods, 13(1), 19–30.
- Ruscio, J., & Mullen, T. (2012). Confidence Intervals for the Probability of Superiority Effect Size Measure and the Area Under a Receiver Operating Characteristic Curve. Multivariate Behavioral Research, 47(2), 201–223.
How do I cite this page?
Cite this page according to your favorite style guide. The references below are automatically generated and contain the correct information.
APA 7
Magnusson, K. (2020). Interpreting Cohen's d effect size: An interactive visualization (Version 2.4.2) [Web App]. R Psychologist. https://rpsychologist.com/cohend/
BibTeX
I found a bug/error/typo or want to make a suggestion!
Please report errors or suggestions by opening an issue on GitHub. If you want to ask a question, use GitHub Discussions.
I'm gonna ask a large number of students to visit this site. Will it crash your server?
No, it will be fine. The app runs in your browser so the server only needs to serve the files.
The overlap statistic differs from Cohen's calculations
This is intentional. You can read more about my reasons in this blog post: Where Cohen went wrong – the proportion of overlap between two normal distributions
Can I include this visualization in my book/article/etc?
Yes, go ahead! I did not invent plotting two overlapping Gaussian distributions. This visualization is dedicated to the public domain, which means “you can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission” (see the Creative Commons CC0 license). Although attribution is not required, it is always appreciated!
Contribute/Donate
There are many ways to contribute to free and open software. If you like my work and want to support it you can:
Pull requests are also welcome, or you can contribute by suggesting new features, adding useful references, or helping fix typos. Just open an issue on GitHub.
Sponsors
You can sponsor my open source work using GitHub Sponsors and have your name shown here.