January 4, 2017

Confidence intervals

A confidence interval (CI) is usually interpreted as the range of values that encompasses the actual population or 'true' value with a given probability. The width of the interval indicates the precision of the estimate (effect size): the wider the interval, the less precise the estimated effect size. A study with a small sample size will have greater random error, leading to a wider interval.

Confidence interval equation
CI = effect size +/- (appropriate multiplier x standard error of the difference)
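
As a rough illustration of the equation above, the short Python sketch below computes the two bounds of an interval. The function name and the numbers (a mean difference of 5.0, a standard error of 2.0 and a multiplier of 1.96) are hypothetical and chosen only to show the arithmetic.

```python
def confidence_interval(effect_size, multiplier, standard_error):
    """Return (lower, upper): effect size +/- (multiplier x standard error)."""
    margin = multiplier * standard_error
    return effect_size - margin, effect_size + margin


# Hypothetical values: mean difference 5.0, standard error of the difference 2.0,
# and 1.96 as the two-tailed 95% multiplier from the z-distribution.
lower, upper = confidence_interval(5.0, 1.96, 2.0)
print(f"95% CI: {lower:.2f} to {upper:.2f}")
```

With these made-up inputs the margin is 1.96 x 2.0 = 3.92, so the interval runs from 1.08 to 8.92.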


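Appropriate multiplier = critical value for a tail probability of (alpha level / tails)

For example, with an alpha level of 0.05 and two tails, the multiplier is the critical value that leaves 0.025 in each tail, which is 1.96 for the z-distribution.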

Alpha level
The alpha level is the probability the researcher is willing to accept that the findings are a result of sampling error. It is determined by the level of confidence the researcher decides to use, which is typically set to 90%, 95% or 99%. The alpha level is 1 minus the confidence level, so a confidence level of 95% gives an alpha level of 0.05. The greater the confidence level, the wider the interval. Sample size also affects the width of the interval, with smaller sample sizes leading to wider intervals.

The number of tails depends on the question the researcher is asking. Two tails are used if the researcher would like to know if the results of an intervention differ, in either direction, from a control or alternate group. One tail is used when only a difference in one direction is of interest. It is common to use two tails in intervention-based research.
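
The relationship between confidence level, alpha level, tails and the multiplier can be sketched in a few lines of Python. This is only an illustration: it assumes the z-distribution, reads the multiplier as the critical value at a tail probability of alpha / tails, and uses a helper name (z_multiplier) that is made up for the example. NormalDist comes from Python's standard statistics module.

```python
from statistics import NormalDist


def z_multiplier(confidence_level, tails=2):
    """Critical z value for a given confidence level and number of tails."""
    alpha = 1.0 - confidence_level      # e.g. 95% confidence -> alpha of 0.05
    tail_probability = alpha / tails    # alpha is split across the tails
    return NormalDist().inv_cdf(1.0 - tail_probability)


# Higher confidence levels give larger multipliers and so wider intervals.
for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence, two tails: {z_multiplier(level):.3f}")
```

For two tails the multipliers come out at roughly 1.645, 1.960 and 2.576, which is why a 99% interval is wider than a 95% interval built from the same data.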

Critical value

The z-distribution is used when the variance of the population is known. In some cases, an author may choose to use the z-distribution if the sample size is greater than 30.

The t-distribution should be used when the true variance is not known and has been estimated from the sample. With larger sample sizes, a t-distribution value will become similar to that of a z-distribution.
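
To see how the t-distribution approaches the z-distribution as the sample grows, the sketch below compares a handful of two-tailed 95% critical values. The t values are copied from a standard t-distribution table rather than computed, and the variable names are hypothetical. Degrees of freedom are assumed to be the sample size minus 1, as for a single mean.

```python
from statistics import NormalDist

# Two-tailed 95% critical t values from a standard table, keyed by degrees of freedom.
T_CRITICAL_95 = {5: 2.571, 10: 2.228, 20: 2.086, 30: 2.042, 60: 2.000, 120: 1.980}

# Matching z value: 0.025 in each tail.
z_critical = NormalDist().inv_cdf(0.975)

for df, t_critical in T_CRITICAL_95.items():
    gap = t_critical - z_critical
    print(f"df={df:>3}: t={t_critical:.3f}  z={z_critical:.3f}  gap={gap:.3f}")
```

The gap shrinks steadily as the degrees of freedom increase, which is the reasoning behind using the z-distribution as an approximation for larger samples.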

Link: t-distribution table


Standard deviation (SD) versus standard error of the mean (SEM)
The SEM is the SD divided by the square root of the sample size, so the SD is greater than the SEM for any sample of more than one observation. A range expressed using the SD would therefore be wider than one expressed using the SEM. Authors may inappropriately use the SEM to describe the variability of sample data, because the narrower range can imply that a significant difference exists between groups when in fact no difference exists. Authors who present data as the mean +/- SEM instead of the mean +/- SD may be trying to actively impair the reader's ability to accurately identify the variability in the study data.
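
A small Python sketch makes the size of the gap between the two measures concrete. The sample values are invented for illustration, and the SEM is calculated with the usual formula, SD divided by the square root of the sample size.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical measurements from a single group.
sample = [4.1, 5.3, 6.0, 4.8, 5.5, 6.2, 5.1, 4.9, 5.7]

n = len(sample)
sd = stdev(sample)     # sample standard deviation (n - 1 in the denominator)
sem = sd / sqrt(n)     # standard error of the mean

print(f"mean +/- SD:  {mean(sample):.2f} +/- {sd:.2f}")
print(f"mean +/- SEM: {mean(sample):.2f} +/- {sem:.2f}")
```

With nine observations the SEM is one third of the SD, so the same data look far less variable when reported as the mean +/- SEM.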

Pooled standard deviation
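The pooled SD is a weighted average of each group's standard deviation. It is calculated by weighting each group's variance (SD squared) by its degrees of freedom (sample size minus 1), averaging, and taking the square root. It should be used when comparing the mean difference between two different (independent) groups.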
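
The sketch below shows one common textbook way to calculate the pooled SD and, from it, the standard error of the difference between two independent group means. It assumes the two groups have similar variances. The function names and the example numbers are hypothetical.

```python
from math import sqrt


def pooled_sd(sd1, n1, sd2, n2):
    """Square root of the group variances averaged with weights of (n - 1)."""
    pooled_variance = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    return sqrt(pooled_variance)


def standard_error_of_difference(sd1, n1, sd2, n2):
    """Standard error of the difference between two independent means."""
    return pooled_sd(sd1, n1, sd2, n2) * sqrt(1 / n1 + 1 / n2)


# Hypothetical groups: SD 4.0 with 20 participants, SD 6.0 with 30 participants.
print(f"pooled SD: {pooled_sd(4.0, 20, 6.0, 30):.2f}")
print(f"SE of the difference: {standard_error_of_difference(4.0, 20, 6.0, 30):.2f}")
```

The pooled value always falls between the two group SDs and sits closer to the SD of the larger group, because that group carries more weight.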

1. Gaddis G, Gaddis M. Introduction to biostatistics: part 2, descriptive statistics. Ann Emerg Med 1990;19:309-315.