How do I calculate statistical power?

May 31, 2010

The power of any test of statistical significance will be affected by four main parameters:

  1. the effect size
  2. the sample size (N)
  3. the alpha significance criterion (α)
  4. statistical power, or the chosen or implied beta (β)

All four parameters are mathematically related. If you know any three of them you can figure out the fourth.
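To make the relationship concrete, here is one common approximation for a single special case: a two-tailed test of a correlation coefficient against zero. It uses Fisher's z transformation and is an illustrative assumption on my part, not a general formula for every test. Here z_p denotes the p-th quantile of the standard normal distribution and power equals 1 − β.

```latex
z_{1-\beta} \;\approx\; \operatorname{artanh}(r)\,\sqrt{N - 3} \;-\; z_{1-\alpha/2}
```

The equation contains the effect size (r), the sample size (N), alpha, and beta, so fixing any three of them determines the fourth.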

Why is this good to know?

If you knew prior to conducting a study that you had, at best, only a 30% chance of getting a statistically significant result, would you proceed with the study? Or would you like to know in advance the minimum sample size required to have a decent chance of detecting the effect you are studying? These are the sorts of questions that power analysis can answer.

Let’s take the first example where we want to know the prospective power of our study and, by association, the implied probability of making a Type II error. In this type of analysis we would make statistical power the outcome contingent on the other three parameters. This basically means that the probability of getting a statistically significant result will be high when the effect size is large, the N is large, and the chosen level of alpha is relatively high (or relaxed).

For example, if I had a sample of N = 100 and I expected to find an effect size equivalent to r = .30, a quick calculation would reveal that I have an 86% chance of obtaining a statistically significant result using a two-tailed test with alpha set at the conventional level of .05. If I had a sample twice as large, the probability that my results would turn out to be statistically significant would be 99%.
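A calculation of this kind can be sketched in a few lines of Python. The function below is a hypothetical helper, not part of any power calculator, and it assumes the Fisher z normal approximation for a test of a correlation against zero. Exact calculators may give slightly different figures.

```python
import math
from statistics import NormalDist


def power_for_correlation(r, n, alpha=0.05, two_tailed=True):
    """Approximate power for testing H0: rho = 0 (Fisher z approximation)."""
    norm = NormalDist()
    # Critical value of the standard normal for the chosen alpha
    tail_alpha = alpha / 2 if two_tailed else alpha
    z_crit = norm.inv_cdf(1 - tail_alpha)
    # Expected value of the test statistic if the true correlation is r
    expected_z = math.atanh(r) * math.sqrt(n - 3)
    # Probability of exceeding the critical value in the predicted direction
    # (the tiny chance of rejecting in the wrong tail is ignored)
    return norm.cdf(expected_z - z_crit)


power_n100 = power_for_correlation(0.30, 100)
power_n200 = power_for_correlation(0.30, 200)
print(f"N = 100: power = {power_n100:.2f}")
print(f"N = 200: power = {power_n200:.2f}")
```

Doubling the sample size raises the expected test statistic by a factor of roughly the square root of two, which is why power climbs so quickly here.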

Or let’s say we want to know the minimum sample size required to give us a reasonable chance (.80) of detecting an effect of certain size given a conventional level of alpha (.05). We can look up a power table or plug the numbers into a power calculator to find out.

For example, if I desired an 80% probability of detecting an effect that I expect will be equivalent to r = .30 using a two-tailed test with conventional levels of alpha, a quick calculation reveals that I will need an N of at least 84. If I decide a one-tailed test is sufficient, reducing my need for power, my minimum sample size falls to 67.
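The same approximation can be turned around to solve for the sample size. The sketch below is a hypothetical helper that again assumes the Fisher z approximation for a correlation test, so its answers can differ by a case or so from those of an exact power calculator.

```python
import math
from statistics import NormalDist


def min_n_for_correlation(r, power=0.80, alpha=0.05, two_tailed=True):
    """Approximate minimum N needed to detect a correlation of size r."""
    norm = NormalDist()
    tail_alpha = alpha / 2 if two_tailed else alpha
    z_alpha = norm.inv_cdf(1 - tail_alpha)  # critical value for alpha
    z_power = norm.inv_cdf(power)           # quantile matching the desired power
    # Solve atanh(r) * sqrt(N - 3) = z_alpha + z_power for N
    n = ((z_alpha + z_power) / math.atanh(r)) ** 2 + 3
    return math.ceil(n)  # round up to a whole number of cases


n_two_tailed = min_n_for_correlation(0.30)
n_one_tailed = min_n_for_correlation(0.30, two_tailed=False)
print(f"two-tailed: N >= {n_two_tailed}")
print(f"one-tailed: N >= {n_one_tailed}")
```

Rounding up is a deliberate choice: rounding down would leave the study slightly short of the desired power.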

For more, see my e-book Statistical Power Trip.


How do I know if my study has enough statistical power?

May 31, 2010

Let’s say you have designed a study and now you want to know the probability that your study will detect an effect, assuming there is a genuine effect there to be detected. This probability can be calculated by doing a statistical power calculation with power set as the dependent variable. The only tricky part will be in estimating the size of the effect in advance. If your estimate is too high, you will think you have more power than you do.

For example, if you have a sample of N = 50 and you expect the effect size will be equivalent to r = .25, then you will have a 42% probability of getting a statistically significant result given conventional levels of alpha (two-tailed α = .05). In other words, your results are not likely to pan out. (You might want to think about ways of boosting the power of your study before proceeding.)
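Because the effect size has to be estimated in advance, it can help to see how sensitive the answer is to that estimate. The sketch below is illustrative only. It assumes the Fisher z approximation for a two-tailed correlation test, and the three candidate effect sizes are arbitrary values chosen for the comparison.

```python
import math
from statistics import NormalDist


def approx_power(r, n, alpha=0.05):
    """Approximate two-tailed power for testing H0: rho = 0."""
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)
    return norm.cdf(math.atanh(r) * math.sqrt(n - 3) - z_crit)


n = 50
# Power under a pessimistic, an expected, and an optimistic effect size
powers = {r: approx_power(r, n) for r in (0.15, 0.25, 0.35)}
for r, p in powers.items():
    print(f"assumed r = {r:.2f}: power = {p:.2f}")
```

If the true effect is closer to the pessimistic value than to the one you assumed, the study has considerably less power than the planning calculation suggested.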

Let’s say you want to determine the minimum effect size that your study will be able to detect given certain levels of alpha and power. Again, you just run a basic power calculation, perhaps using a power calculator, with the effect size set as the dependent variable.

For example, if you set alpha and power at conventional levels of .05 and .80 respectively, and you have a sample of N = 50, then the minimum detectable effect size will be equivalent to r = .38.
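Solving for the effect size works the same way. The helper below is hypothetical and assumes the Fisher z approximation for a correlation test, so the result may differ in the second decimal place from what an exact calculator reports.

```python
import math
from statistics import NormalDist


def min_detectable_r(n, power=0.80, alpha=0.05, two_tailed=True):
    """Approximate smallest correlation detectable with the given N, alpha, and power."""
    norm = NormalDist()
    tail_alpha = alpha / 2 if two_tailed else alpha
    z_alpha = norm.inv_cdf(1 - tail_alpha)
    z_power = norm.inv_cdf(power)
    # Solve atanh(r) * sqrt(N - 3) = z_alpha + z_power for r
    return math.tanh((z_alpha + z_power) / math.sqrt(n - 3))


r_min = min_detectable_r(50)
print(f"minimum detectable r with N = 50: {r_min:.2f}")
```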

For more, see The Essential Guide to Effect Sizes, chapter 3.


What’s wrong with post hoc power analyses?

May 31, 2010

When a test returns a result that is statistically nonsignificant, the question arises, “does this result mean there is no effect or did my study lack the statistical power to detect it?” It’s a fair question, but one which power analysis cannot answer.

Recall that statistical power is the probability that a test will correctly reject a false null hypothesis. Statistical power only has relevance when the null is false. The problem is that a nonsignificant result does not tell us whether the null is true or false. To calculate power after the fact is to make an assumption (that the null is false) that is not supported by the data.

Source: The Essential Guide to Effect Sizes


Can you recommend a good power calculator?

May 31, 2010

Power calculations are rarely done by hand. Instead, researchers normally refer to power tables in much the same way that tables of critical values for t, F, and other statistics were once used to assess statistical significance.

A far easier way to run a power analysis is to use a power calculator or a computer program such as G*Power (Faul et al. 2007). At the time of writing, the latest version of this freeware program was G*Power 3, which runs on both Windows XP/Vista/7/8 and Mac OS X 10.7–10.10 operating systems. This user-friendly program can be used to run all types of power analysis for a variety of distributions. Using the interface you select the outcome of interest (e.g., minimum sample size), indicate the test type, input the parameters (e.g., the desired power and alpha levels), then click “calculate” to get an answer.

For a step-by-step guide to G*Power 3, complete with screenshots, check out the e-book Statistical Power Trip.

Daniel Soper of Arizona State University has several easy-to-use calculators for all sorts of statistical calculations including power analyses relevant for multiple regression.

Russ Lenth of the University of Iowa has a number of intuitive Java applets for running power analyses.

The calculation of statistical power for multiple regression equations featuring categorical moderator variables requires some special considerations, as explained by Aguinis et al. (2005). An online calculator for this sort of analysis can be found at Herman Aguinis’s site at Indiana University.


Can you recommend a plain English introduction to power analysis?

May 30, 2010

The analysis of statistical power is an essential skill for anyone who relies on tests of statistical significance. Yet the majority of textbooks say nothing about it.

In this plain-English primer, you will learn how to avoid the many problems that arise from misunderstanding issues of statistical power.

Using simple FAQs and a class-tested approach characterized by easy-to-follow examples, Statistical Power Trip will provide you with the tools you need to design studies that work.

Statistical Power Trip is a 55-page e-book. If you’re looking for something a little more substantial but still written in jargon-free language, I recommend The Essential Guide to Effect Sizes.