In this post I will use the theoretical and empirical sampling distribution of Cohen’s *d* to show the expected overestimation due to selective publishing. I will look at the overestimation for various sample sizes when the population effect is 0, 0.2, 0.5 and 0.8. The conclusion is that you should be wary of effect sizes from small samples, and that the issue is type M (magnitude) errors rather than type I errors. At least in clinical psychology, the pervasive problem is overestimation of effects and not false rejection of the null hypothesis. Read more
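
As a rough illustration of the idea, the following R sketch simulates many two-group studies and compares the average Cohen’s *d* across all studies with the average among the "published" ones. It is not the code from the full post. The function name `simulate_d`, the sample size of 20 per group, the population effect of 0.2, and the publication rule (only significant results in the positive direction are published) are all assumptions made for this example.

```r
set.seed(123)

# Hypothetical helper: simulate `reps` two-group studies with `n` per group
# and a true standardized effect of `delta`; return d and the t-test p-value.
simulate_d <- function(n, delta, reps = 5000) {
  t(replicate(reps, {
    x <- rnorm(n, mean = delta)              # treatment group
    y <- rnorm(n)                            # control group
    sp <- sqrt((var(x) + var(y)) / 2)        # pooled SD (equal group sizes)
    c(d = (mean(x) - mean(y)) / sp,
      p = t.test(x, y, var.equal = TRUE)$p.value)
  }))
}

sims <- simulate_d(n = 20, delta = 0.2)

# Assumed publication rule: significant at 5 % and in the expected direction
published <- sims[sims[, "p"] < 0.05 & sims[, "d"] > 0, "d"]

mean(sims[, "d"])   # average d over all simulated studies
mean(published)     # average d among "published" studies only
```

With small samples only large observed effects reach significance, so the mean of `published` sits well above the population effect even though every simulated study is an honest estimate.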

Posts in Category **‘R’**

## Calculating the Overlap of Two Normal Distributions Using Monte Carlo Integration

I read this post over at the blog Cartesian Faith about Probability and Monte Carlo methods. The post describes how to numerically integrate using Monte Carlo methods. I thought the (…) Read more
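
A minimal sketch of how such an overlap could be estimated: the overlapping area of two densities is the integral of their pointwise minimum, and a simple Monte Carlo estimate averages that minimum at uniformly drawn points and scales by the width of the sampling interval. The function name `overlap_mc`, the ±6 SD integration limits, and the example parameters are assumptions for this illustration and not necessarily what the full post uses.

```r
set.seed(1)

# Hypothetical helper: Monte Carlo estimate of the overlap of two normals
overlap_mc <- function(mu1, sd1, mu2, sd2, n = 1e5) {
  # Integration limits wide enough to cover practically all of both densities
  lower <- min(mu1 - 6 * sd1, mu2 - 6 * sd2)
  upper <- max(mu1 + 6 * sd1, mu2 + 6 * sd2)
  x <- runif(n, lower, upper)                            # uniform draws
  heights <- pmin(dnorm(x, mu1, sd1), dnorm(x, mu2, sd2)) # min of the densities
  (upper - lower) * mean(heights)                        # interval width * mean height
}

est <- overlap_mc(0, 1, 1, 1)

# For two normals with equal SD the overlap has a closed form,
# which gives a check on the simulation
exact <- 2 * pnorm(-abs(1 - 0) / 2)
c(monte_carlo = est, exact = exact)
```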

## Creating a typical textbook illustration of statistical power using either ggplot or base graphics

A common way of illustrating the idea behind statistical power in null hypothesis significance testing is to plot the sampling distributions of the null hypothesis and the alternative hypothesis. Typically, these illustrations highlight the regions that correspond to making a type II error, making a type I error, and correctly rejecting the null hypothesis (i.e. the test’s power). In this post I will show how to create such “power plots” using both ggplot and R’s base graphics.

Read more

## Working with shapefiles, projections and world maps in ggplot

In this post I show some examples of how to work with map projections and how to plot the maps using ggplot. Many maps that use the default projection are shown in longlat format, which is far from optimal. Here I show how to use either the Robinson or Winkel Tripel projection. Read more

## Analytical and simulation-based power analyses for mixed-design ANOVAs

In this post I show some R examples of how to perform power analyses for mixed-design ANOVAs. The first example is analytical, adapted from formulas used in G*Power (Faul et al., 2007), and the second example is a Monte Carlo simulation. Read more

## How to tell when error bars correspond to a significant p-value

Can you tell when error bars based on 95 % CIs or standard errors correspond to a significant p-value? Don’t fret if you think it’s hard: a study from 2005 showed that researchers in psychology, behavioral neuroscience and medicine had a hard time judging when error bars from two independent groups signified a significant difference. Read more

## The Higgs boson: 5-sigma and the concept of p-values

Why are physicists talking about 5-sigma, and what’s it got to do with statistics? In this short post I’ll explain what 5-sigma is and why it’s not a measure of how certain scientists are that they’ve found the Higgs boson. Read more
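
For orientation, the tail probability behind a 5-sigma threshold can be computed directly in R. This small sketch assumes a one-sided test on a standard normal test statistic; the variable name `p_one_sided` is just for this example.

```r
# One-sided p-value for a result 5 standard deviations above the null mean
p_one_sided <- pnorm(5, lower.tail = FALSE)

p_one_sided       # roughly 2.87e-07
1 / p_one_sided   # roughly 1 in 3.5 million
```

This is the probability of data at least this extreme given that the null hypothesis is true, not the probability that the null hypothesis is true.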

## Effect of sample size on the accuracy of Cohen’s d estimates (95 % CI)

When talking about confidence intervals, Jacob Cohen famously said: “I suspect that the main reason they are not reported is that they are so embarrassingly large!” (Cohen, 1994). In this post I’ll take a look at the relationship between the 95 % CI for Cohen’s *d* and its corresponding sample size. Read more
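
To give a feel for that relationship, here is a small sketch that uses the common large-sample approximation for the standard error of *d* with two equal groups of size *n*, SE ≈ sqrt(2/n + d²/(4n)). The full post may well use a different method (for example one based on the noncentral *t* distribution); the function name `d_ci_approx`, the effect size of 0.5, and the chosen sample sizes are assumptions for illustration.

```r
# Hypothetical helper: approximate CI for Cohen's d with n per group
d_ci_approx <- function(d, n, level = 0.95) {
  # Large-sample SE of d for two groups of equal size n
  se <- sqrt((n + n) / (n * n) + d^2 / (2 * (n + n)))
  z <- qnorm(1 - (1 - level) / 2)   # normal critical value, 1.96 for 95 %
  data.frame(n_per_group = n,
             lower = d - z * se,
             upper = d + z * se,
             width = 2 * z * se)
}

ci <- d_ci_approx(d = 0.5, n = c(10, 20, 50, 100, 500))
print(ci)
```

The `width` column shrinks roughly with the square root of the sample size, which is why intervals from small studies look so wide.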

## PubMed publications in 2011 by 202 world countries: who’s the winner?

Which country had the most PubMed citations in 2011? To find out I used R statistical software to analyze the affiliation of 986 427 articles. Read more

## How to download complete XML records from PubMed and extract data

Yesterday I wrote an article that looked at the top 20 Cognitive Behavior Therapy journals with the most publications; today I will explain how I did it with R. Read more