I am indebted to the article by Dr. Frank Woodside and Allison Davis on the so-called Bradford Hill criteria for reminding me of the distorted account of Bradford Hill’s view of statistical testing that some plaintiffs’ counsel advance in litigation. Frank C. Woodside, III & Allison G. Davis, “The Bradford Hill Criteria: The Forgotten Predicate,” 35 Thomas Jefferson L. Rev. 103 (2013). Dr. David Schwartz has also written an insightful blog post on Bradford Hill. See David Schwartz, “5 Reasons to Apply the Bradford Hill Criteria in Your Next Case” (Sept. 20, 2013).
Here is how Bradford Hill frames the research question before his famous nine factors come into the analysis:
“Disregarding then any such problem in semantics we have this situation. Our observations reveal an association between two variables, perfectly clear-cut and beyond what we would care to attribute to the play of chance. What aspects of that association should we especially consider before deciding that the most likely interpretation of it is causation?”
Austin Bradford Hill, “The Environment and Disease: Association or Causation?” 58 Proc. Royal Soc’y Med. 295, 295 (1965).
The starting point, before the Bradford Hill nine factors come into play, requires a “clear-cut” association, which is “beyond what we would care to attribute to the play of chance.” What is a “clear-cut” association? The most reasonable interpretation of Bradford Hill is that the starting point is an association that is not the result of chance, bias, or confounding.
I part company with Woodside and Davis over whether Bradford Hill was somehow dismissive of the role of assessing chance in explaining an association. In acknowledging any validity in the plaintiffs’ interpretation of Bradford Hill’s 1965 paper, Woodside and Davis do an injustice, in my view, to Bradford Hill’s careful articulation of his position.
The starting position, quoted above, seems very clear, but Woodside and Davis note that later on in his speech, Bradford Hill suggested that tests of significance do not contribute to proof of the hypothesis. Bradford Hill’s actual words are, however, fairly precise:
“No formal tests of significance can answer those questions. Such tests can, and should, remind us of the effects that the play of chance can create, and they will instruct us in the likely magnitude of those effects. Beyond that they contribute nothing to the ‘proof’ of our hypothesis.”
Bradford Hill at 299.
Plaintiffs’ counsel sometimes argue that this passage means that significance testing contributes “nothing” to proving the hypothesis, but this ignores two key points. First, the argument ignores where in the text the passage occurs: after Bradford Hill’s discussion of the nine factors. Bradford Hill’s statement can be understood only as a reflection back on the nine factors. The phrase “those questions” refers back to the nine factors, and this is the limitation that Bradford Hill is placing upon “formal tests of significance.” The starting point, before the nine factors are examined, is, after all, a “clear-cut” association, “beyond what we would care to attribute to the play of chance.”
Second, plaintiffs’ counsel’s argument ignores the clear meaning of the “[b]eyond that” phrase. Beyond what? Well, the limited role is nothing other than quantifying the play of chance in the observed results. This role is hugely important, and of course, is incorporated into the starting point before the nine factors are examined. In modern analyses, the role of random variability would actually be explored in the analysis of the exposure-outcome gradient, and perhaps in some of the other nine factors as well. Bradford Hill implied that a statistically significant association was a preliminary step, after which the really hard work began.
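To make concrete what “quantifying the play of chance” involves, here is a minimal sketch in Python. Everything in it is an illustrative assumption rather than anything drawn from Bradford Hill’s paper: the counts are hypothetical, the function name `risk_ratio_summary` is my own, and the large-sample (Wald) interval on the log risk ratio is only one common way of doing the calculation. The sketch addresses random error alone; it says nothing about bias or confounding, which must be assessed separately.

```python
import math
from statistics import NormalDist


def risk_ratio_summary(exposed_cases, exposed_total,
                       unexposed_cases, unexposed_total, alpha=0.05):
    """Risk ratio with a large-sample confidence interval and two-sided p-value.

    Hypothetical helper for illustration; assumes all counts are positive.
    """
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    rr = risk_exposed / risk_unexposed

    # Standard error of log(RR) under the usual large-sample approximation.
    se_log_rr = math.sqrt(
        1 / exposed_cases - 1 / exposed_total
        + 1 / unexposed_cases - 1 / unexposed_total
    )

    # Confidence interval computed on the log scale, then exponentiated.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    lower = math.exp(math.log(rr) - z_crit * se_log_rr)
    upper = math.exp(math.log(rr) + z_crit * se_log_rr)

    # Two-sided p-value for the null hypothesis RR = 1.
    z = math.log(rr) / se_log_rr
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))

    return {"risk_ratio": rr, "ci": (lower, upper), "p_value": p_value}


# Hypothetical cohort: 30 cases among 1,000 exposed, 15 among 1,000 unexposed.
result = risk_ratio_summary(30, 1000, 15, 1000)
print(result)
```

With these made-up counts, the interval excludes 1.0 and the p-value falls below 0.05, which is the kind of result Bradford Hill takes as given before the nine factors are considered. The output tells us how large an effect chance alone might produce, and, as Bradford Hill says, nothing beyond that.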
It would be unfair to Bradford Hill to read into his statement much about “strict” testing versus a more flexible inferential approach in selecting or interpreting a Type I error rate. By the time he presented his Presidential Address to the Royal Society of Medicine in 1965, much fur had flown in the disputes between Neyman and Fisher. Resolving Bradford Hill’s view on the dispute is not a pressing issue because, on either account, calculating the p-value is an extremely important step in evaluating scientific data.
In his textbook on medical statistics, Bradford Hill expands on the role of statistical analysis in medicine:
“Are simple methods of the interpretation of figures only a synonym for common sense or do they involve an art or knowledge which can be imparted? Familiarity with medical statistics leads inevitably to the conclusion that common sense is not enough. Mistakes which when pointed out look extremely foolish are quite frequently made by intelligent persons, and the same mistakes, or types of mistakes, crop up again and again. There is often lacking what has been called a ‘statistical tact, which is rather more than simple good sense’. That tact the majority of persons must acquire (with a minority it is undoubtedly innate) by a study of the basic principles of statistical method.”
Austin Bradford Hill, Principles of Medical Statistics at 2 (4th ed. 1948) (emphasis in original).
Even in this early work, though, Bradford Hill acknowledges the limits of statistical methods:
“It is a serious mistake to rely upon the statistical method to eliminate disturbing factors at the completion of the work. No statistical method can compensate for a badly planned experiment.”
Id. at 4 (emphasis in original). That statistical method cannot save a poorly planned experiment (or observational study) does not, however, imply that statistical methods are not needed to interpret a properly planned experiment or study.
In the summary section of the first chapter, Bradford Hill removes any doubt about his view of the importance, and the necessity, of statistical methods:
“The statistical method is required in the interpretation of figures which are at the mercy of numerous influences, and its object is to determine whether individual influences can be isolated and their effects measured.”
Id. at 10 (emphasis added).