Wrong Words Beget Causal Confusion

In clinical medical and epidemiologic journals, most articles that report associations conclude with a discussion section in which the authors hold forth about

(1) how they have found that exposure to X “increases the risk” of Y, and

(2) how their finding makes sense because of some plausible (even if unproven) mechanism.

In an opinion piece in Significance,1 Dalmeet Singh Chawla cites a study that suggests that “because” language frequently confuses readers into believing that a causal claim is being made. The study abstract explains:

“Most researchers do not deliberately claim causal results in an observational study. But do we lead our readers to draw a causal conclusion unintentionally by explaining why significant correlations and relationships may exist? Here we perform a randomized study in a data analysis massive online open course to test the hypothesis that explaining an analysis will lead readers to interpret an inferential analysis as causal. We show that adding an explanation to the description of an inferential analysis leads to a 15.2% increase in readers interpreting the analysis as causal (95% CI 12.8% – 17.5%). We then replicate this finding in a second large scale massive online open course. Nearly every scientific study, regardless of the study design, includes explanation for observed effects. Our results suggest that these explanations may be misleading to the audience of these data analyses.”

Leslie Myint, Jeffrey T. Leek, and Leah R. Jager, “Explanation implies causation?” (Nov. 2017) (online manuscript).

Invoking the principle of charity, these authors suggest that most researchers are not deliberately claiming causal results. Indeed, the language of biomedical science itself is biased in favor of causal interpretation. The term “statistical significance” suggests causality to naive readers, as does stats talk about “effect size” and “fixed effect models” for data sets that come nowhere near establishing causality.

Common epidemiologic publication practice tolerates, if not encourages, authors' stating that their study shows (finds, demonstrates, etc.) that exposure to X “increases the risk” of Y in the study's sample. This language is deliberately causal, even if the study cannot support a causal conclusion alone or even with other studies. After all, a risk is the antecedent of a cause, and in the stochastic model of causation involved in much of biomedical research, causation will manifest in a change of a base rate to a higher or lower post-exposure rate. Given that mechanism is often unknown and not required, showing an increased risk is the whole point. The work of ruling out chance, bias, confounding, and flawed study design is often lost in the irrational exuberance of declaring the “increased risk.”

Tighter editorial control might have researchers qualify their findings by explaining that they found a higher rate in association with exposure, under the circumstances of the study, followed by an explanation that much more is needed to establish causation. But where is the fun and profit in that?

Journalists, lawyers, and advocacy scientists often use the word “link” to avoid having to endorse associations that they know, or should know, have not been shown to be causal.2 Using “link” as a noun or a verb clearly invokes a causal chain metaphor, and the implication is probably often deliberate. Perhaps publishers would defend the use of “link” by noting that it is so much shorter than “association,” and thus saves typesetting costs.

More attention is needed to word choice, even and especially when statisticians and scientists are using their technical terms and jargon.3 If, for the sake of argument, we accept the sincerity of scientists who work as expert witnesses in litigation in which causal claims are overstated, we can see that poor word choices confuse scientists as well as lay people. Or you can just read the materials and methods and the results of published study papers; skip the introduction and discussion sections, as well as the newspaper headlines.


1 Dalmeet Singh Chawla, “Mind your language,” Significance 6 (Feb. 2018).

2 See, e.g., Perri Klass, M.D., N.Y. Times (Dec. 4, 2017), https://www.nytimes.com/2017/12/04/well/family/does-an-adhd-link-mean-tylenol-is-unsafe-in-pregnancy.html; Nicholas Bakalar, “Body Chemistry: Lower Testosterone Linked to Higher Death Risk,” N.Y. Times (Aug. 15, 2006).

3 Fang Xuelan & Graeme Kennedy, “Expressing Causation in Written English,” 23 RELC J. 62 (1992); Bengt Altenberg, “Causal Linking in Spoken and Written English,” 38 Studia Linguistica 20 (1984).