Law reviews are not peer reviewed, not that peer review is a strong guarantor of credibility, accuracy, and truth. Most law reviews have no regular provision for letters to the editor; nor is there a PubPeer that permits readers to point out errors for the benefit of the legal community. Nonetheless, law review articles are cited by lawyers and judges, often at face value, for claims and statements made by article authors. Law review articles are thus a potent source of misleading, erroneous, and mischievous ideas and claims.
Erica Beecher-Monas is a law professor at Wayne State University Law School, or Wayne Law, which considers itself “the premier public-interest law school in the Midwest.” Beware of anyone or any institution that describes itself as working for the public interest. That claim alone should put us on our guard against whose interests are being included and excluded as legitimate “public” interest.
Back in 2006, Professor Beecher-Monas published a book on evaluating scientific evidence in court, which had a few good points in a sea of error and nonsense. See Erica Beecher-Monas, Evaluating Scientific Evidence: An Interdisciplinary Framework for Intellectual Due Process (2006)[1]. More recently, Beecher-Monas has published a law review article, which from its abstract suggests that she might have something to say about this difficult area of the law:
“Scientists and jurists may appear to speak the same language, but they often mean very different things. The use of statistics is basic to scientific endeavors. But judges frequently misunderstand the terminology and reasoning of the statistics used in scientific testimony. The way scientists understand causal inference in their writings and practice, for example, differs radically from the testimony jurists require to prove causation in court. The result is a disconnect between science as it is practiced and understood by scientists, and its legal use in the courtroom. Nowhere is this more evident than in the language of statistical reasoning.
Unacknowledged difficulties in reasoning from group data to the individual case (in civil cases) and the absence of group data in making assertions about the individual (in criminal cases) beset the courts. Although nominally speaking the same language, scientists and jurists often appear to be in dire need of translators. Since expert testimony has become a mainstay of both civil and criminal litigation, this failure to communicate creates a conundrum in which jurists insist on testimony that experts are not capable of giving, and scientists attempt to conform their testimony to what the courts demand, often well beyond the limits of their expertise.”
Beecher-Monas, “Lost in Translation: Statistical Inference in Court,” 46 Arizona St. L.J. 1057, 1057 (2014) [cited as BM].
A close read of the article shows, however, that Beecher-Monas continues to promulgate misunderstanding, error, and misdirection on statistical and scientific evidence.
Individual or Specific Causation
The key thesis of this law review is that expert witnesses have no scientific or epistemic warrant upon which to opine about individual or specific causation.
“But what statistics cannot do—nor can the fields employing statistics, like epidemiology and toxicology, and DNA identification, to name a few—is to ascribe individual causation.”
BM at 1057-58.
Beecher-Monas tells us that expert witnesses are quite willing to opine on specific causation, but that they have no scientific or statistical warrant for doing so:
“Statistics is the law of large numbers. It can tell us much about populations. It can tell us, for example, that so-and-so is a member of a group that has a particular chance of developing cancer. It can tell us that exposure to a chemical or drug increases the risk to that group by a certain percentage. What statistics cannot do is tell which exposed person with cancer developed it because of exposure. This creates a conundrum for the courts, because nearly always the legal question is about the individual rather than the group to which the individual belongs.”
BM at 1057. Clinical medicine and science come in for particular chastisement by Beecher-Monas, who acknowledges the medical profession’s legitimate role in diagnosing and treating disease. Physicians use a process of differential diagnosis to arrive at the most likely diagnosis of disease, but the etiology of the disease is not part of their normal practice. Beecher-Monas leaps beyond the generalization that physicians infrequently ascertain specific causation to the sweeping claim that ascertaining the cause of a patient’s disease is beyond the clinician’s competence and scientific justification. Beecher-Monas thus tells us, in apodictic terms, that science has nothing to say about individual or specific causation. BM at 1064, 1075.
In a variety of contexts, but especially in the toxic tort arena, expert witness testimony is not reliable with respect to the inference of specific causation, which, Beecher-Monas writes, usually without qualification, is “unsupported by science.” BM at 1061. The solution for Beecher-Monas is clear. Admitting baseless expert witness testimony is “pernicious” because the whole purpose of having expert witnesses is to help the fact finder, jury or judge, who lack the background understanding and knowledge to assess the data, interpret all the evidence, and evaluate the epistemic warrant for the claims in the case. BM at 1061-62. Beecher-Monas would thus allow the expert witnesses to testify about what they legitimately know, and let the jury draw the inference about which expert witnesses in the field cannot and should not opine. BM at 1101. In other words, Beecher-Monas is perfectly fine with juries and judges guessing their way to a verdict on an issue that science cannot answer. If her book danced around this recommendation, now her law review article has come out into the open, declaring an open season to permit juries and judges to be unfettered in their specific causation judgments. What is touching is that Beecher-Monas is sufficiently committed to gatekeeping of expert witness opinion testimony that she proposes a solution to take a complex area away from expert witnesses altogether rather than confront the reality that there is often simply no good way to connect general and specific causation in a given person.
Causal Pies
Beecher-Monas relies heavily upon Professor Rothman’s notion of causal pies or sets to describe the factors that may combine to bring about a particular outcome. In doing so, she commits a non-sequitur:
“Indeed, epidemiologists speak in terms of causal pies rather than a single cause. It is simply not possible to infer logically whether a specific factor caused a particular illness.”[2]
BM at 1063. But the question on her adopted model of causation is not whether any specific factor was the cause, but whether it was one of the multiple slices in the pie. Rothman’s statement, which she cites, that “it is not possible to infer logically whether a specific factor was the cause of an observed event,” does not describe the problem that faces factfinders in court cases.
With respect to differential etiology, Beecher-Monas claims that “‘ruling in’ all potential causes cannot be done.” BM at 1075. But why not? While it is true that disease diagnosis is often made upon signs and symptoms, BM at 1076, sometimes physicians are involved in trying to identify causes in individuals. Psychiatrists of course are frequently involved in trying to identify sources of anxiety and depression in their patients. It is not all about putting a DSM-V diagnosis on the chart, and prescribing medication. And there are times when physicians can say quite confidently that a disease has a particular genetic cause, as in a man with BrCa1, or BrCa2, and breast cancer, or certain forms of neurodegenerative diseases, or an infant with a clearly genetically determined birth defect.
Beecher-Monas confuses “the” cause with “a” cause, and wanders away from both law and science into her own twilight zone. Here is an example of how Beecher-Monas’ confusion plays out. She asserts that:
“For any individual case of lung cancer, however, smoking is no more important than any of the other component causes, some of which may be unknown.”
BM at 1078. This ignores the magnitude of the risk factor and its likely contribution to a given case. Putting aside synergistic co-exposures, for most lung cancers, smoking is the “but for” cause of individual smokers’ lung cancers. Beecher-Monas sets up a strawman argument by telling us that it is logically impossible to infer “whether a specific factor in a causal pie was the cause of an observed event.” BM at 1079. But we are usually interested in whether a specific factor was “a substantial contributing factor,” without which the disease would not have occurred. This is hardly illogical or impracticable for a given case of mesothelioma in a patient who worked for years in a crocidolite asbestos factory, or for a case of lung cancer in a patient who smoked heavily for many years right up to the time of his lung cancer diagnosis. I doubt that many people would hesitate, on either logical or scientific grounds, to attribute a child’s phocomelia birth defects to his mother’s ingestion of thalidomide during an appropriate gestational window in her pregnancy.
Unhelpfully, Beecher-Monas insists upon playing this word game by telling us that:
“Looking backward from an individual case of lung cancer, in a person exposed to both asbestos and smoking, to try to determine the cause, we cannot separate which factor was primarily responsible.”
BM at 1080. And yet that issue, of “primary responsibility” is not in any jury instruction for causation in any state of the Union, to my knowledge.
From her extreme skepticism, Beecher-Monas swings to the other extreme, asserting that anything that could have been in the causal set or pie was in the causal set:
“Nothing in relative risk analysis, in statistical analysis, nor anything in medical training, permits an inference of specific causation in the individual case. No expert can tell whether a particular exposed individual’s cancer was caused by unknown factors (was idiopathic), linked to a particular gene, or caused by the individual’s chemical exposure. If all three are present, and general causation has been established for the chemical exposure, one can only infer that they all caused the disease.115 Courts demanding that experts make a contrary inference, that one of the factors was the primary cause, are asking to be misled. Experts who have tried to point that out, however, have had a difficult time getting their testimony admitted.”
BM at 1080. There is no support for Beecher-Monas’ extreme statement. She cites, in footnote 115, to Kenneth Rothman’s introductory book on epidemiology, but what he says at the cited page is quite different. Rothman explains that “every component cause that played a role was necessary to the occurrence of that case.” In other words, for every component cause that actually participated in bringing about this case, its presence was necessary to the occurrence of the case. What Rothman clearly does not say is that for a given individual’s case, the fact that a factor can cause a person’s disease means that it must have caused it. In Beecher-Monas’ hypothetical of three factors (idiopathic, particular gene, and chemical exposure), all three, or any two, or only one of the three may have made up a given individual’s causal set. Beecher-Monas has carelessly or intentionally misrepresented Rothman’s actual discussion.
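The counting behind the three-factor hypothetical can be made explicit. The following is only an illustrative sketch, under the simplifying assumption that each candidate factor is either in or out of a given individual’s causal set; the factor labels and the function name are invented here for illustration and do not come from Rothman or from the article.

```python
from itertools import combinations

# Hypothetical labels for the three candidate factors in the example.
FACTORS = ("idiopathic", "gene", "chemical exposure")

def possible_causal_sets(factors):
    """Return every non-empty subset of the candidate factors.

    Each subset is one logically possible composition of the causal
    set that actually operated in a single individual's case.
    """
    return [
        set(combo)
        for size in range(1, len(factors) + 1)
        for combo in combinations(factors, size)
    ]

all_sets = possible_causal_sets(FACTORS)
with_chemical = [s for s in all_sets if "chemical exposure" in s]

print(len(all_sets))       # 7 logically possible causal sets
print(len(with_chemical))  # 4 of them include the chemical exposure
```

On this simplified accounting, “all three caused the disease” is one possibility out of seven, and the chemical exposure is absent from three of them. Which set operated in a given case is an empirical question, not something settled by the mere presence of the factors.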
Physicians and epidemiologists do apply group risk figures to individuals, through the lens of predictive regression equations. The Gail Model for 5 Year Risk of Breast Cancer, for instance, is a predictive equation that comes up with a prediction for an individual patient by refining the subgroup within which the patient fits. Similarly, there are prediction models for heart attack, such as the Risk Assessment Tool for Estimating Your 10-year Risk of Having a Heart Attack. Beecher-Monas might complain that these regression equations still turn on subgroup average risk, but the point is that they can be made increasingly precise as knowledge accumulates. And the regression equations can generate confidence intervals and prediction intervals for the individual’s constellation of risk factors.
Significance Probability and Statistical Significance
The discussion of significance probability and significance testing in Beecher-Monas’ book was frequently in error,[3] and this new law review article is not much improved. Beecher-Monas tells us that “judges frequently misunderstand the terminology and reasoning of the statistics used in scientific testimony,” BM at 1057, which is true enough, but this article does little to ameliorate the situation. Beecher-Monas offers the following definition of the p-value:
“The P- value is the probability, assuming the null hypothesis (of no effect) is true (and the study is free of bias) of observing as strong an association as was observed.”
BM at 1064-65. This definition misses that the p-value is a cumulative tail probability, and can be one-sided or two-sided. More seriously in error, however, is the suggestion that the null hypothesis is one of no effect, when it is merely a pre-specified expected value that is the subject of the test. Of course, the null hypothesis is often one of no disparity between the observed and the expected, but the definition should not mislead on this crucial point.
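To make the tail-probability point concrete, here is a minimal sketch using an exact binomial test. The counts (15 events in 20 trials) and the function names are hypothetical, chosen only for illustration. The null value p0 is a parameter, so the same code tests any pre-specified expected value, not only “no effect.”

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k events in n trials with event probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def upper_tail_p(k_obs, n, p0):
    """One-sided p-value: P(X >= k_obs) if the null value p0 is true."""
    return sum(binom_pmf(k, n, p0) for k in range(k_obs, n + 1))

def two_sided_p(k_obs, n, p0):
    """Two-sided p-value: total probability, under p0, of every outcome
    that is no more probable than the observed one."""
    p_obs = binom_pmf(k_obs, n, p0)
    total = sum(
        binom_pmf(k, n, p0)
        for k in range(n + 1)
        if binom_pmf(k, n, p0) <= p_obs * (1 + 1e-9)  # tolerance for float ties
    )
    return min(1.0, total)

# Hypothetical data: 15 events in 20 trials, null expected value of 0.5.
point = binom_pmf(15, 20, 0.5)         # probability of exactly 15 events
one_sided = upper_tail_p(15, 20, 0.5)  # cumulative upper tail
two_sided = two_sided_p(15, 20, 0.5)   # both tails

print(point, one_sided, two_sided)
```

The point probability of the observed result is smaller than the one-sided tail, which in turn is half the two-sided value in this symmetric case. A definition that speaks only of “observing as strong an association as was observed” leaves it unclear which of these three quantities is meant.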
For some reason, Beecher-Monas persists in describing the conventional level of statistical significance as 95%, which substitutes the coefficient of confidence for its complement, the conventionally pre-specified significance level (alpha) of 5%. Annoying but decipherable. See, e.g., BM at 1062, 1064, 1065. She misleadingly states that:
“The investigator will thus choose the significance level based on the size of the study, the size of the effect, and the trade-off between Type I (incorrect rejection of the null hypothesis) and Type II (incorrect failure to reject the null hypothesis) errors.”
BM at 1066. While this statement is occasionally true, it mostly is not. A quick review of the last several years of the New England Journal of Medicine will document the error. Almost invariably, researchers use the conventional level of alpha, at 5%, unless there is multiple testing, such as in a genetic association study.
Beecher-Monas admonishes us that “[u]sing statistical significance as a screening device is thus mistaken on many levels,” citing cases that do not provide support for this proposition.[4] BM at 1066. The Food and Drug Administration’s scientists, who review clinical trials for efficacy and safety, will no doubt be astonished to hear this admonition.
Beecher-Monas argues that courts should not factor statistical significance or confidence intervals into their gatekeeping of expert witnesses, but that they should “admit studies,” and leave it to the lawyers and expert witnesses to explain the strengths and weaknesses of the studies relied upon. BM at 1071. Of course, studies themselves are rarely admitted because they represent many levels of hearsay by unknown declarants. Given Beecher-Monas’ acknowledgment of how poorly judges and lawyers understand statistical significance, this argument is cynical indeed.
Remarkably, Beecher-Monas declares, without citation, that
“the purpose of epidemiologists’ use of statistical concepts like relative risk, confidence intervals, and statistical significance are intended to describe studies, not to weed out the invalid from the valid.”
BM at 1095. She thus excludes by ipse dixit any inferential purposes these statistical tools have. She goes further and gives us a concrete example:
“If the methodology is otherwise sound, small studies that fail to meet a P-level of 5 [sic], say, or have a relative risk of 1.3 for example, or a confidence level that includes 1 at 95% confidence, but relative risk greater than 1 at 90% confidence ought to be admissible. And understanding that statistics in context means that data from many sources need to be considered in the causation assessment means courts should not dismiss non-epidemiological evidence out of hand.”
BM at 1095. Well, again, studies are not admissible; the issue is whether they may be reasonably relied upon, and whether reliance upon them may support an opinion claiming causality. And a “P-level” of 5 is, well, let us hope a serious typographical error. Beecher-Monas’ advice is especially misleading when there is only one study, or only one study in a constellation of exonerative studies. See, e.g., In re Accutane, No. 271(MCL), 2015 WL 753674, 2015 BL 59277 (N.J. Super. Law Div. Atlantic Cty. Feb. 20, 2015) (excluding Professor David Madigan for cherry picking studies to rely upon).
Confidence Intervals
Beecher-Monas’ book provided a good deal of erroneous information on confidence intervals.[5] The current article improves on the definitions, but still manages to go astray:
“The rationale courts often give for the categorical exclusion of studies with confidence intervals including the relative risk of one is that such studies lack statistical significance.62 Well, yes and no. The problem here is the courts’ use of a dichotomous meaning for statistical significance (significant or not).63 This is not a correct understanding of statistical significance.”
BM at 1069. Well yes and no; this interpretation of a confidence interval, say with a coefficient of confidence of 95%, is a reasonable interpretation of whether the point estimate is statistically significant at an alpha of 5%. If Beecher-Monas does not like strict significance testing, that is fine, but she cannot mandate its abandonment by scientists or the courts. Certainly the cited interpretation is one proper interpretation among several.
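The correspondence between an interval that includes 1.0 and a non-significant result can be checked numerically. What follows is a rough sketch, assuming a simple cohort 2x2 table and the common large-sample (Wald) interval on the log scale. The counts are hypothetical and the function name is invented for illustration; none of this is drawn from any study discussed in the article.

```python
from math import exp, log, sqrt
from statistics import NormalDist

def rr_summary(a, n1, c, n0, conf=0.95):
    """Relative risk with a Wald interval and a two-sided p-value.

    a / n1: cases / total among the exposed
    c / n0: cases / total among the unexposed
    The p-value tests the null hypothesis that the true RR is 1.0.
    """
    nd = NormalDist()
    rr = (a / n1) / (c / n0)
    se_log = sqrt(1 / a - 1 / n1 + 1 / c - 1 / n0)  # std. error of log(RR)
    z_crit = nd.inv_cdf(1 - (1 - conf) / 2)
    lower = exp(log(rr) - z_crit * se_log)
    upper = exp(log(rr) + z_crit * se_log)
    p_value = 2 * (1 - nd.cdf(abs(log(rr) / se_log)))
    return rr, lower, upper, p_value

# Hypothetical cohort: 33/1000 exposed cases versus 20/1000 unexposed cases.
rr, lo95, hi95, p = rr_summary(33, 1000, 20, 1000, conf=0.95)
_, lo90, hi90, _ = rr_summary(33, 1000, 20, 1000, conf=0.90)

print(f"RR = {rr:.2f}, p = {p:.3f}")
print(f"95% interval includes 1.0: {lo95 <= 1.0 <= hi95}")
print(f"90% interval includes 1.0: {lo90 <= 1.0 <= hi90}")
```

With these made-up counts, the 95% interval includes 1.0 and the 90% interval does not, because the two-sided p-value falls between 0.05 and 0.10. Moving from a 95% to a 90% interval does not uncover a new effect; it simply applies a more lenient alpha of 10% to the same data.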
Power
There were several misleading references to statistical power in Beecher-Monas’ book, but the new law review tops them by giving a new, bogus definition:
“Power, the probability that the study in which the hypothesis is being tested will reject the alterative [sic] hypothesis when it is false, increases with the size of the study.”
BM at 1065. For this definition, Beecher-Monas cites to the Reference Manual on Scientific Evidence, but butchers the correct definition given by the late David Freedman and David Kaye.[6] All of which is very disturbing.
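For contrast, the standard textbook definition of power is the probability of rejecting the null hypothesis when a specified alternative is true. A minimal sketch follows, assuming a two-sided one-sample z-test with known standard deviation; the function name and the numbers are hypothetical, picked only to show the behavior.

```python
from math import sqrt
from statistics import NormalDist

def power_z_test(effect, sigma, n, alpha=0.05):
    """Power of a two-sided one-sample z-test of the null 'mean = 0'.

    Returns the probability that the test rejects the null when the
    true mean is `effect`, given standard deviation `sigma` and
    sample size `n`.
    """
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha / 2)
    shift = effect * sqrt(n) / sigma  # true mean in standard-error units
    # Probability that the test statistic lands in either rejection region.
    return (1 - nd.cdf(z_crit - shift)) + nd.cdf(-z_crit - shift)

# Hypothetical alternative: a true effect of 0.3 standard deviations.
for n in (20, 50, 100, 200):
    print(n, round(power_z_test(0.3, 1.0, n), 3))
```

Power does increase with study size, which is the one part of the quoted sentence that holds up. When the true effect is zero, the same function returns alpha, the Type I error rate, which shows why power is defined against an alternative that is true rather than one that is false.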
Relative Risks and Other Risk Measures
Beecher-Monas begins badly by misdefining the concept of relative risk:
“as the percentage of risk in the exposed population attributable to the agent under investigation.”
BM at 1068. Perhaps this percentage can be derived from the relative risk, if we know it to be the true measure with some certainty, through a calculation of attributable risk, but a law review article that takes the entire medical profession to task, and most of the judiciary to boot, should not confuse and conflate attributable and relative risk.
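The derivation alluded to here is the familiar attributable fraction among the exposed, (RR - 1) / RR. The sketch below assumes, loudly, that the relative risk is a true causal measure, free of bias and confounding and known with reasonable precision; the function name is invented for illustration.

```python
def attributable_fraction_exposed(rr):
    """Attributable fraction among the exposed, (RR - 1) / RR.

    Assumes rr is an unbiased, unconfounded causal relative risk.
    For rr below 1.0 the result is negative, and a different
    measure (the preventable fraction) is the usual choice.
    """
    if rr <= 0:
        raise ValueError("relative risk must be positive")
    return (rr - 1) / rr

for rr in (1.3, 2.0, 4.0):
    print(rr, round(attributable_fraction_exposed(rr), 3))
```

A relative risk of 1.3 corresponds to an attributable fraction of about 23 percent, and a relative risk of 2.0 to exactly 50 percent. The relative risk is a ratio of two risks; the attributable fraction is a percentage derived from that ratio. They are different quantities.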
Then Beecher-Monas tells us that the “[r]elative risk is a statistical test that (like statistical significance) depends on the size of the population being tested.” BM at 1068. Well, actually not; the calculation of the RR is unaffected by the sample size. The variance of course will vary with the sample size, but Beecher-Monas seems intent on ignoring random variability.
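A quick numerical check of this point, offered as a sketch with made-up counts and a hypothetical helper name: multiply every cell of a cohort 2x2 table by ten and compare the results.

```python
from math import sqrt

def rr_and_se(a, n1, c, n0):
    """Relative risk and the large-sample standard error of log(RR).

    a / n1: cases / total among the exposed
    c / n0: cases / total among the unexposed
    """
    rr = (a / n1) / (c / n0)
    se_log = sqrt(1 / a - 1 / n1 + 1 / c - 1 / n0)
    return rr, se_log

# Hypothetical study, and the same proportions in a study ten times larger.
rr_small, se_small = rr_and_se(30, 1000, 20, 1000)
rr_large, se_large = rr_and_se(300, 10000, 200, 10000)

print(rr_small, rr_large)   # the point estimates agree
print(se_small / se_large)  # the standard error shrinks by about sqrt(10)
```

The point estimate is the same in both studies. What changes with sample size is the standard error, and with it the width of the confidence interval around the estimate.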
Perhaps most egregious is Beecher-Monas’ assertion that:
“Any increase above a relative risk of one indicates that there is some effect.”
BM at 1067. So much for ruling out chance, bias, and confounding! Or looking at an entire body of epidemiologic research for strength, consistency, coherence, exposure-response, etc. Beecher-Monas has thus moved beyond a liberal, to a libertine, position. In case the reader has any doubts of the idiosyncrasy of her views, she repeats herself:
“As long as there is a relative risk greater than 1.0, there is some association, and experts should be permitted to base their causal explanations on such studies.”
BM at 1067-68. This is evidentiary nihilism in full glory. Beecher-Monas has endorsed relying upon studies irrespective of their study design or validity, their individual confidence intervals, their aggregate summary point estimates and confidence intervals, or the absence of important Bradford Hill considerations, such as consistency, strength, and dose-response. So an expert witness may opine about general causation from reliance upon a single study with a relative risk of 1.05, say with a 95% confidence interval of 0.8 – 1.4?[7] For this startling proposition, Beecher-Monas cites the work of Sander Greenland, a wild and wooly plaintiffs’ expert witness in various toxic tort litigations, including vaccine autism and silicone autoimmune cases.
RR > 2
Beecher-Monas’ discussion of inferring specific causation from relative risks greater than two devolves into a muddle by her failure to distinguish general from specific causation. BM at 1067. There are different relevancies for general and specific causation, depending upon context, such as clinical trials or epidemiologic studies for general causation, number of studies available, and the like. Ultimately, she adds little to the discussion and debate about this issue, or any other.
[1] See previous comments on the book at “Beecher-Monas and the Attempt to Eviscerate Daubert from Within”; “Friendly Fire Takes Aim at Daubert – Beecher-Monas And The Undue Attack on Expert Witness Gatekeeping”; and “Confidence in Intervals and Diffidence in the Courts.”
[2] Kenneth J. Rothman, Epidemiology: An Introduction 250 (2d ed. 2012).
[3] Erica Beecher-Monas, Evaluating Scientific Evidence: An Interdisciplinary Framework for Intellectual Due Process 42 n. 30, 61 (2007) (“Another way of explaining this is that it describes the probability that the procedure produced the observed effect by chance.”) (“Statistical significance is a statement about the frequency with which a particular finding is likely to arise by chance.”).
[4] See BM at 1066 & n. 44, citing “See, e.g., In re Breast Implant Litig., 11 F. Supp. 2d 1217, 1226–27 (D. Colo. 1998); Haggerty v. Upjohn Co., 950 F. Supp. 1160, 1164 (S.D. Fla. 1996), aff’d, 158 F.3d 588 (11th Cir. 1998) (“[S]cientifically valid cause and effect determinations depend on controlled clinical trials and epidemiological studies.”).”
[5] See, e.g., Erica Beecher-Monas, Evaluating Scientific Evidence 58, 67 (N.Y. 2007) (“No matter how persuasive epidemiological or toxicological studies may be, they could not show individual causation, although they might enable a (probabilistic) judgment about the association of a particular chemical exposure to human disease in general.”) (“While significance testing characterizes the probability that the relative risk would be the same as found in the study as if the results were due to chance, a relative risk of 2 is the threshold for a greater than 50 percent chance that the effect was caused by the agent in question.”)(incorrectly describing significance probability as a point probability as opposed to tail probabilities).
[6] David H. Kaye & David A. Freedman, Reference Guide on Statistics, in Federal Jud. Ctr., Reference Manual on Scientific Evidence 211, 253–54 (3d ed. 2011) (discussing the statistical concept of power).
[7] BM at 1070 (pointing to a passage in the FJC’s Reference Manual on Scientific Evidence that provides an example of one 95% confidence interval that includes 1.0, but which shrinks when calculated as a 90% interval to 1.1 to 2.2, which values “demonstrate some effect with confidence interval set at 90%”). This is nonsense in the context of observational studies.