In 1965, Sir Austin Bradford Hill was appropriately elected President of the Royal Society of Medicine. Along with Sir Richard Doll and others, Hill had pioneered the use of epidemiologic and statistical methods, which he applied in the 1950s to the question of tobacco smoking and lung cancer. By the late 1950s, Hill and Doll had embraced and urged a causal interpretation of the association between smoking and lung cancer, but they had formidable opponents in Joseph Berkson and Sir Ronald A. Fisher.
By 1964, Hill and Doll’s causal thesis had largely prevailed. On January 11, 1964, the Office of the Surgeon General in the United States issued a committee report that reviewed the available evidence and concluded that the relationship between smoking and lung cancer was indeed causal. See Surgeon General’s Advisory Committee on Smoking and Health, Smoking and Health (Office of the Surgeon General, United States Public Health Service 1964). See also “Profiles in Science – 1964 Report,” National Library of Medicine Website.
Almost a year to the day after the Surgeon General’s report was issued, Hill gave the President’s Address at the Royal Society of Medicine, in London. For Hill, by then Professor Emeritus of Medical Statistics in the University of London, the occasion was triumphant. Not only had he prevailed over the intellectual doubts of Berkson and Fisher, and the animadversions of the tobacco industry, but he had shown that causal relationships can be identified and established with statistical methods in population studies, even in the absence of demonstrated mechanisms or experimental randomization. Fittingly, his after-dinner speech outlined the methodology that had proved successful, and that speech was published in the Proceedings of the Royal Society of Medicine. Austin Bradford Hill, “The Environment and Disease: Association or Causation?” 58 Proc. Royal Soc’y Med. 295 (1965) [cited as Hill].
As a tribute to Hill, the publication was noteworthy, but Hill, if still alive, would be surprised, perhaps shocked, certainly annoyed, that the text of his Presidential Address was still being cited as the canonical statement of causal assessments more than half a century later. Epidemiologic science has progressed in many important ways since 1964, and science has refined and improved substantially upon Hill’s articulation of the epidemiologic method for assessing the causality of observed associations. Nonetheless, lawyers, on both sides of the bar, continue to publish analyses of Hill’s 1965 publication. A few years ago, Dr. Frank Woodside and Allison Davis published one such article.1 This month, a lawyer from the lawsuit industry published a plaintiffs’ vision of Hill’s methodology in Trial, the trade journal of his industry. R. Jason Richards, “Reflecting on Hill’s Original Causation Factors,” 52 Trial 44 (Nov. 2016) [cited as Richards]. Frankly, everyone would be better off if they simply read the original speech and understood it for what it was – a 50-year-old informal statement of a complex problem.
As do many defense lawyers, Richards treats the Hill factors as a canonical guide to causation. He states that “[t]he scientific community has generally accepted these viewpoints, and scientists regularly use them to assess causality between exposure and an outcome,” and then cites legal decisions only. Richards at 45.2
Some of Richards’ exegesis is a benign, helpful reminder that Hill’s recommendations should be considered in their original context, and that no one factor is typically dispositive. Richards at 49. What Richards fails to say is that a concordance of factors will often be needed to establish causation, as they were in showing that smoking caused lung cancer.
Richards tells us, plaintively and accurately, that “[m]any of Hill’s key insights about how to make decisions based on epidemiological evidence have been largely ignored or distorted over the years.” Richards at 46. Richards has in mind the indeterminacy of any given factor, which he casts as “no single interpretation is infallible.” Of course, if no single interpretation is infallible, then all are fallible, which Richards no doubt sees as immunizing any expert witness’s opinion against exclusionary gatekeeping.
Ultimately, Richards becomes yet another author who abridges and bastardizes Hill’s key insights. He further urges his readers to consider and invoke “seldom-cited but significant passages” from Hill (1965). One passage that Richards fails to consider and invoke himself is Hill’s important predicate for considering the nine factors in the first place. The starting point for Hill, as he clearly expressed in his President’s Address, was that
“[o]ur observations reveal an association between two variables, perfectly clear-cut and beyond what we would care to attribute to the play of chance. What aspects of that association should we especially consider before deciding that the most likely interpretation of it is causation?”
Hill at 295. The nine factors follow, with elaboration, but importantly, the nine factors answer the question Hill posed about the aspects of the association, which we have already seen to be “perfectly clear-cut and beyond what we would care to attribute to the play of chance.” To be sure, and to be fair, Hill did not use the words “valid” and “statistically significant,” but his meaning was perfectly clear, and it is completely ignored by Richards, and occasionally by trial courts led into error by lawyers who advocate for causal associations on weak, inconsistent evidence from invalid associations.
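For readers who want to see what crossing Hill’s threshold looks like in modern practice, here is a minimal sketch, in Python, of the sort of significance test that his predicate presupposes: a Pearson chi-square test on a two-by-two table. The counts are wholly hypothetical, chosen only to show the form of the calculation, not to represent any study discussed here.

```python
import math

# Hypothetical counts, illustrative only (not from any study in the text):
#                 cases   non-cases
# exposed           90       910
# unexposed         30       970
a, b, c, d = 90, 910, 30, 970
n = a + b + c + d

# Pearson chi-square statistic for a 2x2 table
chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2))
p_value = math.erfc(math.sqrt(chi2 / 2))

# Risk ratio: incidence among the exposed over incidence among the unexposed
risk_ratio = (a / (a + b)) / (c / (c + d))
print(f"risk ratio = {risk_ratio:.2f}, chi-square = {chi2:.1f}, p = {p_value:.2e}")
```

On these made-up numbers, the association (a tripling of risk) lies far beyond what anyone would care to attribute to the play of chance; only after such a showing, and a showing of validity, do Hill’s nine factors come into play.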
There is some mischief perpetuated by treating Hill’s language as legislating a decision procedure for demonstrating causality. First, Hill spoke informally, without the scholarly apparatus of footnotes or extensive research. So when Hill wrote that we should not dismiss a putative causal claim merely because the association is “slight,” he was writing in the context of known causal associations with relative risks in the hundreds (for chimney sweeps and scrotal cancer) or twenty- to thirty-fold (for smoking and lung cancer). A slight association might well be one that is merely a doubling or tripling of the base rate for the outcome. And Hill was not urging that such slight associations were much in the way of evidence for causality, only that we should not dismiss the causal claim solely because of the small size of the association.
Second, some of what Hill said was wrong when he said it in 1965. For instance, he wrote that none of the factors is necessary. The temporality factor, which specifies that the putative cause precede the putative effect, is, however, indeed necessary. Unless, of course, we have “spooky action at a distance” that permits simultaneous causality across the universe in biomedicine. Hill was probably not thinking about the philosophical problems raised by quantum physics when he incorrectly branded temporality as unnecessary.
Third, some of what Hill said is distorted in commentary, such as Richards’, by ignorance or by design. Perhaps the distortions occur because Hill was speaking colloquially to fellow scientists. Much of what he said, as recorded in his 1965 article, is nuanced and contextual. For instance, Hill wrote that:
“[n]o formal tests of significance can answer those questions. Such tests can, and should, remind us of the effects that the play of chance can create, and they will instruct us in the likely magnitude of those effects. Beyond that they contribute nothing to the ‘proof’ of our hypothesis.”
Hill at 299 (emphasis added). This passage, referenced by Richards, is a favorite of lawsuit industry lawyers, but an astute reader will note that it opens with a reference to “those questions.” To understand what “those questions” are, one would sensibly look at the paragraph above. In that previous paragraph, Hill states that the “proof” of causality requires judgment, and that the nine factors cannot be taken as providing a quod erat demonstrandum; rather, the factors structure the inquiry, which is essential, into how to explain the facts observed and how to rule out explanations that do not turn on causality. Understandably, statistical analyses do not replace the judgment involved in synthesizing the evidence with respect to the available factors.

Of course, formal tests of significance do answer specific questions, such as whether the association is one beyond that which we would care to ascribe to chance, a consideration that must be addressed before delving into the nine factors. And even after crossing the threshold of a statistically significant, valid association, an analyst would consider statistical tests in connection with consistency and exposure gradient. The former consideration today is often addressed by meta-analyses, tests of homogeneity, and p-curves, none of which was in common use in 1965, at the time of Hill’s Presidential Address. Some simple tests for exposure gradient were available, but often the judgment was left to a visual assessment of the dose-response curve. Today, formal statistical analyses would indeed be invoked to determine whether the dose-response curve was likely inconsistent with random variability.
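To make the consistency point concrete, here is a minimal sketch, in Python, of two of the modern tools just mentioned: a fixed-effect (inverse-variance) pooled estimate across studies, and Cochran’s Q statistic for homogeneity. The study estimates and standard errors are wholly hypothetical, chosen only to illustrate the arithmetic, not drawn from the literature discussed above.

```python
import math

# Hypothetical log relative risks and standard errors from four studies
# (illustrative only; not from any actual body of epidemiologic evidence)
log_rr = [math.log(2.1), math.log(2.8), math.log(1.9), math.log(3.2)]
se = [0.30, 0.25, 0.40, 0.35]

# Fixed-effect pooling: weight each study by the inverse of its variance
weights = [1 / s ** 2 for s in se]
pooled = sum(w * y for w, y in zip(weights, log_rr)) / sum(weights)

# Cochran's Q: weighted squared deviations of study estimates from the pool;
# a large Q relative to its degrees of freedom suggests heterogeneity
q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, log_rr))
df = len(log_rr) - 1

print(f"pooled RR = {math.exp(pooled):.2f}, Q = {q:.2f} on {df} df")
```

On these made-up numbers, the four studies tell a consistent story (Q is small relative to its degrees of freedom); a scattered, inconsistent body of studies would produce a large Q, which bears directly on Hill’s consistency factor.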
Fourth, what Hill said in 1965 must be viewed in the light of over 50 years of scientific developments in the fields of epidemiology and statistics. Ultimately, what Richards presents as précis is a gross distortion of Hill in 1965, and an even greater distortion of acceptable scientific method in 2016.
Just as it is helpful to study historical propaganda, Richards’ article should be studied to understand how the lawsuit industry will try to induce error in judicial judgments. Richards argues that all nine factors are “inherently subjective,” and so it is understandable and inevitable that expert witnesses will disagree. Richards at 48. The fact is that some of the Hill factors are not in the least subjective. Strength, consistency, and exposure gradient, for instance, are all objective, quantifiable variables that can be used to evaluate a body of available epidemiologic studies (after the available studies have shown associations perfectly clear-cut and beyond the play of chance).
Richards deploys a sophistical argument that because none of the factors is necessary, then causation can be inferred in the absence of any of the factors, or perhaps in the presence of only the most unimportant factors, such as analogy. Readers will be hard pressed to come up with an example of a generally accepted causal relationship evidenced solely by an analogy or by a subjective assessment of plausibility, but Richards argues for the validity of such an inference in the abstract, and in the absence of any real-world examples.
“To some lawyers, all facts are created equal.” Felix Frankfurter3
To some lawyers, all epidemiologic studies are created equal, but they too are mistaken. Richards attempts to engage in extreme deconstructionist analysis, verging on Daubert by Derrida. Richards avers that all studies are flawed, and that one can always find reasons to question a study’s validity, and “so it would be a stretch to establish any cause-and-effect relationship if statistically significant data were the only acceptable basis for asserting causation.” Richards at 48. Richards confuses validity with statistical significance, but worse, his argument ignores important qualitative and quantitative differences between and among studies in terms of their design, implementation, analysis, and interpretation. His argument is akin to saying that all human beings have flaws, so we should do away with honors and awards, as well as prisons and penalties.
Finally, no article such as Richards’ would be complete without misstating the holding and dictum in Matrixx Initiatives:
“The U.S. Supreme Court has finally recognized this as well, agreeing that statistical significance is not the touchstone of reliability under a Daubert analysis.”
Richards at 48 (citing Matrixx Initiatives, Inc. v. Siracusano, 131 S. Ct. 1309, 1319 (2011)). This is a remarkably misleading citation given that the Court, in Matrixx, noted that it was not considering whether “expert testimony was properly admitted,” and that it was not trying “to define here what constitutes reliable evidence of causation.” Matrixx, 131 S. Ct. at 1319.
In reaching, nay stretching, to address the statistical issue, the Supreme Court cited three cases, two of which involved differential etiology and contested specific causation, for which statistical analysis was absent and irrelevant. Id. (citing Best v. Lowe’s Home Centers, Inc., 563 F.3d 171, 178 (6th Cir. 2009); Westberry v. Gislaved Gummi AB, 178 F.3d 257, 263-264 (4th Cir. 1999)). The third case was relevant to statistical inference, but involved a case in which the plaintiff’s expert witness had cited at least one study with a nominally statistically significant result, which was vitiated by internal and external validity concerns. Wells v. Ortho Pharmaceutical Corp., 788 F.2d 741, 744-745 (11th Cir. 1986), cert. denied, 479 U.S. 950 (1986).4
Richards asserts that “Hill recognized that scientists could efficiently use data and draw reasonable inferences about causation in the absence of statistically significant findings.” Richards at 48. Tellingly, Richards provides no citation for his assertion. There is none; but then, nothing is the hallmark of epistemic nihilism.
1 Frank C. Woodside, III & Allison G. Davis, “The Bradford Hill Criteria: The Forgotten Predicate,” 35 Thomas Jefferson L. Rev. 103 (2013).
2 Citing In re Trasylol Prods. Liab. Litig., 2010 WL 1489734, at *8-9 (S.D. Fla. Mar. 8, 2010); In re Zoloft (Sertraline Hydrochloride) Prods. Liab. Litig., 26 F. Supp. 3d 449, 454 (E.D. Pa. 2014).
3 Quoted in Comes v. Microsoft Corp., 709 N.W.2d 114, 116 (Iowa 2006), without source information.
4 See “Wells v. Ortho Pharmaceutical Corp. Reconsidered – Part 1” (Nov. 12, 2012); “Wells v. Ortho Pharmaceutical Corp. Reconsidered – Part 2” (Nov. 13, 2012); “Wells v. Ortho Pharmaceutical Corp. Reconsidered – Part 3” (Nov. 18, 2012); “Wells v. Ortho Pharmaceutical Corp. Reconsidered – Part 4” (Nov. 19, 2012); “Wells v. Ortho Pharmaceutical Corp. Reconsidered – Part 5” (Nov. 21, 2012); “Wells v. Ortho Pharmaceutical Corp. Reconsidered – Part 6” (Nov. 21, 2012).