What is an effect size?

May 31, 2010

An effect is the result of something. It is an outcome, a result, a reaction, a change in Y brought about by a change in X.

An effect size refers to the magnitude of the result as it occurs, or would be found, in nature, or in a population. Although effects can be observed in the artificial setting of a laboratory or a sample, effect sizes exist in the real world.

When researchers estimate effect sizes by observing representative samples, they generate an effect size estimate. This estimate is usually expressed in the form of an effect size index.
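To make "effect size index" concrete, here is a small sketch (in Python, not part of the original post) of one of the most common indices, Cohen's d, which expresses the difference between two group means in standard-deviation units. The data are invented for illustration:

```python
from statistics import mean, stdev

def cohens_d(group1, group2):
    """Cohen's d: the standardized difference between two group means.

    One common effect size index; by Cohen's conventions, values of
    0.2, 0.5, and 0.8 are read as small, medium, and large effects.
    """
    n1, n2 = len(group1), len(group2)
    # Pooled standard deviation, weighted by each group's degrees of freedom
    pooled_sd = (((n1 - 1) * stdev(group1) ** 2 +
                  (n2 - 1) * stdev(group2) ** 2) / (n1 + n2 - 2)) ** 0.5
    return (mean(group1) - mean(group2)) / pooled_sd

# Hypothetical scores for a treatment group and a control group
treatment = [5.1, 5.8, 6.2, 5.5, 6.0, 5.9]
control   = [4.8, 5.0, 5.3, 4.9, 5.2, 5.1]

print(round(cohens_d(treatment, control), 2))
```

The point is that the index describes the magnitude of the effect in the sample, which serves as an estimate of the effect size in the population the sample was drawn from.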

For a good introduction to effect sizes – how to report them, how to interpret them – check out the e-book Effect Size Matters.




Can you give me some examples of an effect size?

May 31, 2010

Examples of effect sizes are all around us. Consider the following claims which you might find advertised in your newspaper:

– “enjoy immediate pain relief through acupuncture”

– “change service providers now and save 30%”

– “look 10 years younger with Botox”

Notice how each claim promises an effect (“look younger with Botox”) of measurable size (“10 years younger”). No understanding of statistical significance is necessary to gauge the merits of each claim. Each effect is being promoted as if it were intrinsically meaningful. (Whether it is or not is up to the newspaper reader to decide.)

Many of our daily decisions are based on some analysis of effect size. We sign up for courses that we believe will enhance our career prospects. We cut back on carbohydrates to lose weight. We stop at red lights to reduce the risk of accidents. We buy stock we believe will appreciate in value. We take an umbrella if we perceive a high chance of rain.

The interpretation of effect sizes is how we make sense of the world.

In this sense researchers are no different from anybody else. Where researchers do differ is in the care taken to generate accurate effect size estimates. But while we may spend a lot of our time looking for ways to reduce sampling and measurement error, among other things, ultimately our goal is a better understanding of real world effects.

And this is why it is essential that we interpret not only the statistical significance of our results but their real world or substantive significance as well.

For more on how to do this, check out the e-book Effect Size Matters.


Why does my research methods textbook have no entry for “effect size”?

May 31, 2010

Because it’s a bad textbook!

Most textbooks are about 20-30 years behind the state of the methodological art. Prior to writing The Essential Guide to Effect Sizes I scanned more than 30 texts published between 2000 and 2009. I found that 9 out of 10 had nothing to say about effect sizes. If effect sizes were mentioned, it was only in passing.

A typical textbook will show you how to assess the statistical significance of a test, but not how to establish the substantive significance of your results. It will talk about p values but have little to say about effect sizes.

This will change. In the future textbooks will increasingly show students how to: estimate the magnitude of observed effects, gauge the power of the statistical tests used to detect effects, and interpret effect size estimates in meaningful ways.

Can you give me three reasons for reporting effect sizes?

May 31, 2010
  1. Your estimate of the effect size constitutes your study’s evidence. A p value might tell you the direction of an effect, but only the estimate will tell you how big it is.
  2. Reporting the effect size facilitates the interpretation of the substantive significance of a result. Without an estimate of the effect size, no meaningful interpretation can take place.
  3. Effect sizes can be used to quantitatively compare the results of studies done in different settings.
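On the third point, effect sizes reported on different metrics can often be converted to a common metric for comparison. A minimal sketch (in Python; the studies and numbers are invented) using the standard conversion from Cohen's d to a correlation r, which assumes equal-sized groups:

```python
import math

def d_to_r(d):
    """Convert Cohen's d to an equivalent correlation r.

    Standard conversion formula, assuming equal-sized groups.
    """
    return d / math.sqrt(d ** 2 + 4)

# Hypothetical: Study A reports d = 0.50, Study B reports r = .30.
# Converting d to r puts both results on the same metric.
r_equivalent = d_to_r(0.50)
print(round(r_equivalent, 2))  # Study A's effect expressed as r
```

Once both results sit on the same metric, their magnitudes can be compared directly, which is exactly what meta-analysts do when pooling studies.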

More here.

Why are journal editors increasingly asking authors to report effect sizes?

May 31, 2010

Because the whole point of doing research is that we may learn something about real world effects.

Editors are increasingly asking authors to provide their effect size estimates because of the growing realization that tests of statistical significance don’t tell us what we really want to know. As Cohen (1990: 1310) famously said:

“The primary product of a research inquiry is one or more measures of effect size, not p values.”

In the bad old days, researchers looked at their p values to see whether their hypotheses were supported. Get a low p value and, voila!, you had a result. But p values are confounded indexes that actually tell us very little about the phenomena we study. At best, they tell us the direction of an effect, but they don’t tell us how big it is. And if we can’t say whether the effect is large or trivial in size, how can we interpret our result?

The estimation of effect sizes is essential to the interpretation of a study’s results. In the fifth edition of its Publication Manual, the American Psychological Association or APA identified the “failure to report effect sizes” as one of seven common defects editors observed in submitted manuscripts. To help readers understand the importance of a study’s findings, authors were advised that “it is almost always necessary to include some index of effect” (APA 2001: 25).

Many editors have made similar calls, and thus it is increasingly common for submission guidelines to either encourage or mandate the reporting of effect sizes.

You say some editors have encouraged the reporting of effect sizes. Which ones?

May 31, 2010

– Campion (1993) in Personnel Psychology

– Combs (2010) in the Academy of Management Journal

– Iacobucci (2005) in the Journal of Consumer Research

– Kendall (1997) and La Greca (2005) both in the Journal of Consulting and Clinical Psychology

– Lustig and Strauser (2004) in the Journal of Rehabilitation

– Murphy (1997) and Zedeck (2003) both in the Journal of Applied Psychology

– Shaver (2006) in the Journal of International Business Studies

As usual in matters of statistical reform, psychology journals lead the way. In a recent poll of psychology editors, Cumming et al. (2007) found that a majority now advocate effect size reporting.

On his website Bruce Thompson lists 24 educational and psychology journals that now require effect size reporting.

A full list of references can be found here.

Why can’t I just judge my result by looking at the p value?

May 31, 2010

Because a low p value could reflect any number of things apart from the size of the underlying effect.

Consider two hypothetical studies examining the relationship between exam marking and academic happiness. Both studies used identical measures and procedures and generated the following results:

Study 1: N = 62, r = -.25, p > .05

Study 2: N = 63, r = -.25, p < .05

In the first study the results were found to be statistically nonsignificant (p > .05) leading the authors to conclude that exam marking has no effect on academic happiness. However, the results of the second study were found to be statistically significant (p < .05) leading the authors of Study 2 to conclude that marking adversely affects happiness.

But here’s the thing: in both studies the authors made identical estimates of the effect size (r = -.25). Both studies essentially came up with the exact same results. The conclusion that we should take away from either study is that marking has a negative effect on happiness equivalent to r = -.25.

So how is it that the authors of Study 1 reached a different conclusion?

They screwed up basically. The authors of Study 1 ignored their effect size estimate and examined only the p value associated with their test statistic. They incorrectly interpreted a statistically nonsignificant result as indicating no effect. A nonsignificant result is more accurately interpreted as an inconclusive result. There might be no effect or there might be an effect which went undetected because the study lacked statistical power.

In this example the only real difference between the two studies was that the second study had one more observation and consequently just enough statistical power to push the result across the threshold of statistical significance. In other words, sample size, rather than the effect size, explained the different conclusions drawn.

You should never judge the substantive significance of a result by looking at a p value. P values are confounded indexes and are no substitute for estimates of the effect size.

In this hypothetical example, both sets of authors would have arrived at the same conclusion if both had ignored their p values and focused on their correlation coefficients.

Source: The Essential Guide to Effect Sizes