TORTINI

For your delectation and delight, desultory dicta on the law of delicts.

Only Judges Can Change History

June 24th, 2022

Samuel Butler is credited with the quip that “God cannot alter the past, though historians can.”  Historians can only try to alter the past; judges can do it with authority. Butler also had an opinion on authority: “Authority intoxicates, and makes mere sots of magistrates. The fumes of it invade the brain, and make men giddy, proud, and vain.”

Yesterday, the Supreme Court handed down its latest decision on the Second Amendment, in New York State Rifle & Pistol Ass’n v. Bruen.[1] Justice Thomas wrote the opinion for the Court, with concurrences by Justice Alito, Chief Justice Roberts, and Justice Kavanaugh. Justice Breyer, joined by Justices Kagan and Sotomayor, dissented. A predictable 6 to 3 split, along what some regard as partisan lines. There are aspects of the decision, however, that will keep legal scholars and public intellectuals busy for a long time. In an affront to “textualists,” the Court’s decision perpetuates the earlier decision in Heller[2] in writing the word “militia” out of the Constitution.

The Court’s and the dissent’s opinions have fulsome discussions of the history of laws regarding open and concealed carry of firearms. The Bruen case came up from the United States Court of Appeals on a decision to deny an injunction against the New York state judge who had denied the petitioners a firearm carry permit. Remarkably, the decision was on the pleadings, with no testimony taken from historian expert witnesses. Remarkably, the Supreme Court took the case without requiring the petitioners to exhaust their state court appellate remedies.

As for the legal history, I defer to the legal historians. Justice Thomas, drawing from George Orwell, teaches that not all history is created equal. “Constitutional rights are enshrined with the scope they were understood to have when the people adopted them.”[3] Perhaps we should take Thomas’s teaching to heart. The Second Amendment was adopted in 1791, when there were no bullets in the form of metal casings with shaped lead projectiles. The Second Amendment right may well ensconce the entitlement to use black-powder pistols that took a minute to reload and were accurate up to five meters, but there were no handguns that could fire “bullets,” made with metal casings, repeatedly. Relevant handgun technology did not evolve until the period around 1830-40, and even then it was a novelty.

Maybe we do need a better history? In the meanwhile, New York should promptly require permits for bullets, as we know them, since they were not in existence in 1791, when the Second Amendment became law.

 

[1] New York State Rifle & Pistol Ass’n v. Bruen, No.20-843, Slip op., U.S. Supreme Court (June 23, 2022).

[2] District of Columbia v. Heller, 554 U. S. 570 (2008).

[3] Slip op. at 25, quoting from Heller, 554 U. S., at 634–635.

Statistical Significance Test Anxiety

June 20th, 2022

Although lawyers are known as a querulous lot, statisticians may not be far behind. The statistician John Wilder Tukey famously remarked that the collective noun for the statistical profession should be a “quarrel” of statisticians.[1]

Recently, philosopher Deborah Mayo, who has written insightfully about the “statistics wars,”[2] published an important article that addressed an attempt by some officers of the American Statistical Association (ASA) to pass off their personal views of statistical significance testing as the views of the ASA.[3] The attempt took the form not only of an editorial over the name of the Executive Director, published without a disclaimer, but also of an email campaign to push journal editors to abandon statistical significance testing. Professor Mayo’s recent article explores the interesting concept of intellectual conflicts of interest arising when journal editors and association leaders use their positions to advance their personal views. As discussed in some of my own posts, the conflict of interest led another ASA officer to appoint a Task Force on statistical significance testing, whose report has now, finally, been published in multiple fora.

On January 11, 2022, Professor Mayo convened a Zoom forum, “Statistical Significance Test Anxiety,” moderated by David Hand, at which she and Yoav Benjamini, a co-author of the ASA President’s Task Force Statement, presented. About 70 statisticians and scientists from around the world attended.

Professor Mayo has hosted several commentaries on her Conservation Biology editorial, including guest blog posts from:

Brian Dennis
Philip Stark
Kent Staley
Yudi Pawitan
Christian Hennig
Ionides and Ritov
Brian Haig
Daniël Lakens

and my humble post, which is set out in full, below. There are additional posts on “statistical test anxiety” coming; check Professor Mayo’s blog for further commentaries.

     *     *     *     *     *     *     *     *     *     *     *     *     *     *     *

Of Significance, Error, Confidence, and Confusion – In the Law and In Statistical Practice

The metaphor of law as an “empty vessel” is frequently invoked to describe the law generally, as well as pejoratively to describe lawyers. The metaphor rings true at least in describing how the factual content of legal judgments comes from outside the law. In many varieties of litigation, not only the facts and data, but the scientific and statistical inferences must be added to the “empty vessel” to obtain a correct and meaningful outcome.

Once upon a time, the expertise component of legal judgments came from so-called expert witnesses, who were free to opine about claims of causality solely by showing that they had more expertise than the lay jurors. In Pennsylvania, for instance, the standard for qualifying witnesses to give “expert opinions” was to show that they had “a reasonable pretense to expertise on the subject.”

In the 19th and the first half of the 20th century, causal claims, whether of personal injuries, discrimination, or whatever, virtually always turned on a conception of causation as necessary and sufficient to bring about the alleged harm. In discrimination claims, plaintiffs pointed to the “inexorable zero,” in cases in which no Black citizen was ever seated on a grand jury, in a particular county, since the demise of Reconstruction. In health claims, the mode of reasoning usually followed something like Koch’s postulates.

The second half of the 20th century was marked by the rise of stochastic models in our understanding of the world. The consequence is that statistical inference made its way into the empty vessel. The rapid introduction of statistical thinking into the law did not always go well. In a seminal discrimination case, Castaneda v. Partida, 430 U.S. 482 (1977), in an opinion by Associate Justice Blackmun, the Court calculated a binomial probability for observing the sample result (rather than a result at least as extreme as such a result), and mislabeled the measurement “standard deviations” rather than standard errors:

“As a general rule for such large samples, if the difference between the expected value and the observed number is greater than two or three standard deviations, then the hypothesis that the jury drawing was random would be suspect to a social scientist. The 11-year data here reflect a difference between the expected and observed number of Mexican-Americans of approximately 29 standard deviations. A detailed calculation reveals that the likelihood that such a substantial departure from the expected value would occur by chance is less than 1 in 10¹⁴⁰.”

Id. at 496 n.17. Justice Blackmun was graduated from Harvard College, summa cum laude, with a major in mathematics.
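For readers who want to see the arithmetic behind the “29 standard deviations,” here is a minimal sketch in Python, using the figures reported in the opinion’s footnote 17 (a 79.1% Mexican-American county population, 870 persons summoned over the 11-year period, of whom 339 were Mexican-American); the code is only an illustration of the calculation, not part of the Court’s analysis:

```python
from math import sqrt

p = 0.791          # Mexican-American share of the county population (per the opinion)
n = 870            # persons summoned for grand jury service over the 11-year period
observed = 339     # Mexican-Americans among those summoned

expected = n * p                          # about 688
se_count = sqrt(n * p * (1 - p))          # binomial standard deviation of the count,
                                          # i.e., the standard error, about 12
z = (expected - observed) / se_count      # about 29 -- the Court's "standard deviations"

print(f"expected = {expected:.0f}, standard error = {se_count:.1f}, z = {z:.1f}")
```

The quantity in the denominator is the standard error of the binomial count, which is why calling the result “standard deviations” invites the confusion noted above.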

Despite the extreme statistical disparity in the 11-year run of grand juries, Justice Blackmun’s opinion provoked a robust rejoinder, not only on the statistical analysis, but on the Court’s failure to account for obvious omitted confounding variables in its simplistic analysis. And then there were the inconvenient facts that Mr. Partida was a rapist, indicted by a grand jury (50% with “Hispanic” names), which was appointed by jury commissioners (3/5 Hispanic). Partida was convicted by a petit jury (7/12 Hispanic), in front of a trial judge who was Hispanic, and he was denied a writ of habeas corpus by Judge Garza, who went on to be a member of the Court of Appeals. In any event, Justice Blackmun’s dictum about “two or three” standard deviations soon shaped the outcome of many thousands of discrimination cases, and was translated into a necessary p-value of 5%.

Beginning in the early 1960s, statistical inference became an important feature of tort cases that involved claims based upon epidemiologic evidence. In such health-effects litigation, the judicial handling of concepts such as p-values and confidence intervals often went off the rails.  In 1989, the United States Court of Appeals for the Fifth Circuit resolved an appeal involving expert witnesses who relied upon epidemiologic studies by concluding that it did not have to resolve questions of bias and confounding because the studies relied upon had presented their results with confidence intervals.[4] Judges and expert witnesses persistently interpreted single confidence intervals from one study as having a 95 percent probability of containing the actual parameter.[5] Similarly, many courts and counsel committed the transposition fallacy in interpreting p-values as posterior probabilities for the null hypothesis.[6]
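To make the transposition fallacy concrete, here is a small, hypothetical Bayes calculation; the prior and power figures are assumptions chosen only for illustration and are not drawn from any particular case or study:

```python
# All numbers below are assumed, for illustration only.
prior_null = 0.9   # assumed prior probability that the null hypothesis is true
alpha = 0.05       # significance level: P(significant result | null true)
power = 0.8        # assumed P(significant result | null false)

# Bayes' theorem: posterior probability of the null, given a "significant" result
posterior_null = (alpha * prior_null) / (alpha * prior_null + power * (1 - prior_null))
print(f"P(null | significant result) = {posterior_null:.2f}")   # about 0.36 here

# A p-value of 0.05 is thus compatible with a posterior probability of the null far
# above 5%; the answer depends on the prior and the power, which the p-value alone
# does not supply.
```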

Against this backdrop of mistaken and misrepresented interpretation of p-values, the American Statistical Association’s p-value statement was a helpful and understandable restatement of basic principles.[7] Within a few weeks, however, citations to the p-value Statement started to show up in the briefs and examinations of expert witnesses, to support contentions that p-values (or any procedure to evaluate random error) were unimportant, and should be disregarded.[8]

In 2019, Ronald Wasserstein, the ASA executive director, along with two other authors wrote an editorial, which explicitly called for the abandonment of using “statistical significance.”[9] Although the piece was labeled “editorial,” the journal provided no disclaimer that Wasserstein was not speaking ex cathedra.

The absence of a disclaimer provoked a great deal of confusion. Indeed, Brian Tarran, the editor of Significance, a magazine published jointly by the ASA and the Royal Statistical Society, wrote an editorial interpreting the Wasserstein editorial as an official ASA “recommendation.” Tarran ultimately retracted his interpretation, but only in response to a pointed letter to the editor.[10] Tarran adverted to a misleading press release from the ASA as the source of his confusion. Inquiring minds might wonder why the ASA allowed such a press release to go out.

In addition to press releases, some people in the ASA started to send emails to journal editors, to nudge them to abandon statistical significance testing on the basis of what seemed like an ASA recommendation. For the most part, this campaign was unsuccessful in the major biomedical journals.[11]

While this controversy was unfolding, then-President Karen Kafadar of the ASA stepped into the breach to state definitively that the Executive Director was not speaking for the ASA.[12] In November 2019, the ASA board of directors approved a motion to create a “Task Force on Statistical Significance and Replicability.”[8] Its charge was “to develop thoughtful principles and practices that the ASA can endorse and share with scientists and journal editors. The task force will be appointed by the ASA President with advice and participation from the ASA Board.”

Professor Mayo’s editorial has done the world of statistics, as well as the legal world of judges, lawyers, and legal scholars, a service in calling attention to the peculiar intellectual conflicts of interest that played a role in the editorial excesses of some of the ASA’s leadership. From a lawyer’s perspective, it is clear that courts have been misled, and distracted, by some of the ASA officials who seem to have worked to undermine a consensus position paper on p-values.[13]

Curiously, the Task Force’s report did not find a home in any of the ASA’s several scholarly publications. Instead, “The ASA President’s Task Force Statement on Statistical Significance and Replicability”[14] appeared in The Annals of Applied Statistics, where it is accompanied by an editorial by former ASA President Karen Kafadar.[15] In November 2021, the ASA’s official “magazine,” Chance, also published the Task Force’s Statement.[16]

Judges and litigants who must navigate claims of statistical inference need guidance on the standard of care scientists and statisticians should use in evaluating such claims. Although the Task Force did not elaborate, it advanced five basic propositions, which had been obscured by many of the recent glosses on the ASA 2016 p-value statement and the 2019 editorial discussed above:

  1. “Capturing the uncertainty associated with statistical summaries is critical.”
  2. “Dealing with replicability and uncertainty lies at the heart of statistical science. Study results are replicable if they can be verified in further studies with new data.”
  3. “The theoretical basis of statistical science offers several general strategies for dealing with uncertainty.”
  4. “Thresholds are helpful when actions are required.”
  5. “P-values and significance tests, when properly applied and interpreted, increase the rigor of the conclusions drawn from data.”

Although the Task Force’s Statement will not end the debate or the “wars,” it will go a long way to correct the contentions made in court about the insignificance of significance testing, while giving courts a truer sense of the professional standard of care with respect to statistical inference in evaluating claims of health effects.


[1] David R. Brillinger, “. . . how wonderful the field of statistics is. . . ,” Chap. 4, 41, 44, in Xihong Lin, et al., eds., Past, Present, and Future of Statistical Science (2014).

[2] Deborah Mayo, Statistical Inference as Severe Testing: How to Get Beyond the Statistics Wars (2018).

[3] Deborah Mayo, “The Statistics Wars and Intellectual Conflicts of Interest,” Conservation Biology (2021) (in press).

[4] Brock v. Merrell Dow Pharmaceuticals, Inc., 874 F.2d 307, 311-12 (5th Cir. 1989).

[5] Richard W. Clapp & David Ozonoff, “Environment and Health: Vital Intersection or Contested Territory?” 30 Am. J. L. & Med. 189, 210 (2004) (“Thus, a RR [relative risk] of 1.8 with a confidence interval of 1.3 to 2.9 could very likely represent a true RR of greater than 2.0, and as high as 2.9 in 95 out of 100 repeated trials.”) (Both authors testify for claimants in cases involving alleged environmental and occupational harms.); Schachtman, “Confidence in Intervals and Diffidence in the Courts” (Mar. 4, 2012) (collecting numerous examples of judicial offenders).

[6] See, e.g., In re Ephedra Prods. Liab. Litig., 393 F.Supp. 2d 181, 191, 193 (S.D.N.Y. 2005) (Rakoff, J.) (credulously accepting counsel’s argument that the use of a critical value of less than 5% of significance probability increased the “more likely than not” burden of proof upon a civil litigant). The decision has been criticized in the scholarly literature, but it is still widely cited without acknowledging its error. See Michael O. Finkelstein, Basic Concepts of Probability and Statistics in the Law 65 (2009).

[7] Ronald L. Wasserstein & Nicole A. Lazar, “The ASA’s Statement on p-Values: Context, Process, and Purpose,” 70 The Am. Statistician 129 (2016); see “The American Statistical Association’s Statement on and of Significance” (March 17, 2016). The commentary beyond the “bold faced” principles was at times less helpful in suggesting that there was something inherently inadequate in using p-values. With the benefit of hindsight, this commentary appears to represent editorializing by the authors, and not the sense of the expert committee that agreed to the six principles.

[8] Schachtman, “The American Statistical Association Statement on Significance Testing Goes to Court, Part I” (Nov. 13, 2018), “Part II” (Mar. 7, 2019).

[9] Ronald L. Wasserstein, Allen L. Schirm, and Nicole A. Lazar, “Editorial: Moving to a World Beyond ‘p < 0.05’,” 73 Am. Statistician S1, S2 (2019); see Schachtman, “Has the American Statistical Association Gone Post-Modern?” (Mar. 24, 2019).

[10] Brian Tarran, “THE S WORD … and what to do about it,” Significance (Aug. 2019); Donald Macnaughton, “Who Said What,” Significance 47 (Oct. 2019).

[11] See, e.g., David Harrington, Ralph B. D’Agostino, Sr., Constantine Gatsonis, Joseph W. Hogan, David J. Hunter, Sharon-Lise T. Normand, Jeffrey M. Drazen, and Mary Beth Hamel, “New Guidelines for Statistical Reporting in the Journal,” 381 New Engl. J. Med. 285 (2019); Jonathan A. Cook, Dean A. Fergusson, Ian Ford, Mithat Gonen, Jonathan Kimmelman, Edward L. Korn, and Colin B. Begg, “There is still a place for significance testing in clinical trials,” 16 Clin. Trials 223 (2019).

[12] Karen Kafadar, “The Year in Review … And More to Come,” AmStat News 3 (Dec. 2019); see also Kafadar, “Statistics & Unintended Consequences,” AmStat News 3,4 (June 2019).

[13] Deborah Mayo, “The statistics wars and intellectual conflicts of interest,” 36 Conservation Biology (2022) (in-press, online Dec. 2021).

[14] Yoav Benjamini, Richard D. DeVeaux, Bradley Efron, Scott Evans, Mark Glickman, Barry Graubard, Xuming He, Xiao-Li Meng, Nancy Reid, Stephen M. Stigler, Stephen B. Vardeman, Christopher K. Wikle, Tommy Wright, Linda J. Young, and Karen Kafadar, “The ASA President’s Task Force Statement on Statistical Significance and Replicability,” 15 Annals of Applied Statistics (2021) (in press).

[15] Karen Kafadar, “Editorial: Statistical Significance, P-Values, and Replicability,” 15 Annals of Applied Statistics (2021).

[16] Yoav Benjamini, Richard D. De Veaux, Bradley Efron, Scott Evans, Mark Glickman, Barry I. Graubard, Xuming He, Xiao-Li Meng, Nancy M. Reid, Stephen M. Stigler, Stephen B. Vardeman, Christopher K. Wikle, Tommy Wright, Linda J. Young & Karen Kafadar, “ASA President’s Task Force Statement on Statistical Significance and Replicability,” 34 Chance 10 (2021).

Differential Etiologies – Part Two – Ruling Out

June 19th, 2022

Perhaps the most important point of this law review article, “Differential Etiology: Inferring Specific Causation in the Law from Group Data in Science,” is that general causation is necessary but insufficient, standing alone, to show specific causation. To be sure, the authors proclaimed that strong evidence of general causation somehow reduces the burden to show specific causation, but this pronouncement turned out to be an ipse dixit, without supporting analysis or citation. On general causation itself, what the authors characterized as the “ruling in” part of differential etiology, the authors offered some important considerations for courts to weigh. Not the least of this advice was the urging of caution in interpreting results when the “strength of a relationship is modest.”[1] Given that they were talking to judges and lawyers, the advice might have taken on greater salience if the authors had explicitly noted that modest strength of a putative relationship means small relative risks, such as those smaller than two or three.

Acute Onset Conditions

The authors’ stated goal of bringing clarity to the determination of differential etiology is a laudable one. In seeking clarity, they brush away some “easy” cases, such as the causal determination of acute-onset conditions. Even so, the authors do not give any concrete examples. A broken bone discovered immediately after a car crash would hardly give a court much pause, but something such as the onset of acute liver failure shortly after ingesting a new medication turns out to be much more complicated than many would anticipate. Viral infections and autoimmune disease must be eliminated, and so such events are clearly in the realm of differential etiology, despite the close temporal proximity.

So-Called Signature Diseases

The authors also try to brush aside the “easy” case of signature diseases as not requiring differential etiology. The complexity of such cases ultimately embarrasses everyone. The authors no doubt thought that they were on safe ground in proffering the example of mesothelioma as a signature cancer caused only by asbestos (without wading into the deeper complexity of what is asbestos and which minerals in what mineralogical habit actually cause the disease).[2] Unfortunately, mesothelioma has never been a truly signature disease. The authors nonetheless consider it as one, with the caveat that mesotheliomas not caused by asbestos are “very rare.” And what was the authority for this statement? The Pennsylvania Supreme Court! Now the Pennsylvania Supreme Court is no doubt, at times, an authority on Pennsylvania law, if only because the Court is the last word on this contorted body of law. The Justices of that Court, however, would probably be the first to disclaim any credibility on the causes of any disease.[3]

The authors further distort the notion of signature diseases by stating that “[v]aginal adenocarcinoma in young women appears to be a signature disease associated with maternal use of DES.”[4] This cannot be right because over 10% of vaginal cancers are adenocarcinomas. The principle of charity requires us to assume that the authors meant to indicate clear cell vaginal adenocarcinoma, but even so, charity will not correct the mistake. DES daughters do indeed have an increased risk of developing clear cell adenocarcinoma, but this type of cancer was well described before DES was ever invented and prescribed to women.[5]

Perhaps the safest ground for signature diseases is in microbiology, where we have infectious diseases defined by the microbial agent that is uniquely associated with the disease. Probably close to the infectious diseases are the nutritional deficiency diseases, defined by the absence of an essential nutrient or vitamin. To be sure, there are non-infectious diseases such as the pneumoconioses, each defined by the nature of the inhaled particle. Contrary to the authors’ contention, these diseases do not necessarily remove differential etiology from the analysis. Silicosis has a distinctive radiographic appearance, and yet that radiographic appearance is the same in many cases of coccidioidomycosis (Valley Fever). Asbestosis has a different radiographic appearance of the lungs and pleura, but the radiographic patterns might well be confused with the sequelae of rheumatoid arthritis or other interstitial lung diseases. At low levels of profusion of radiographic opacities, diseases such as asbestosis and silicosis have diagnostic criteria that are far from perfect in sensitivity and specificity. In one of the very first asbestos cases I defended, the claimant was diagnosed, by no less than the late Dr. Irving Selikoff,[6] with asbestosis, 3/3 on the ILO scale of linear, irregular radiographic lung opacities. An autopsy, however, found that there was no asbestosis at all, or even an elevated tissue fiber burden; the claimant had died of bilateral lymphangitic carcinomatosis.

Definitive Mechanistic Pathway to Individual Causation

The paper presents a limited discussion of genetic causation. In the instance of mutations of highly penetrant alleles, identifying the genetic mutation will provide both the general and the specific cause in a case. The authors also acknowledge that there may be cases involving hypothetical biomarkers that reveal a well-documented causal pathway from exposure to disease.

Differential Etiologies

So what happens when the plaintiff is claiming that he has developed a disease of ordinary life, one that has multiple known causes? Disease onset is not acute, but comes after a lengthy latency period. The plaintiff wants to inculpate the supposedly tortious exposure (the tortogen), and avoid the conclusion that any or all of the known alternative causes participated in his case. If there are cases of the disease without known causes (idiopathogens), the claimant will need to exclude the idiopathogens in favor of fingering the tortogen as responsible for his bad outcome.

The authors helpfully distinguish differential diagnosis from differential etiology. The confusion of the two concepts has led to courts’ generally over-endorsing the black box of clinical judgment in health effects litigation. At the very least, this article can perhaps help the judiciary to move on from this naïve confusion.[7]

The authors advance the vague notion that somehow “clinical information” can supplement a relative risk that is not greater than two, to augment the specific causation inference. This was, to be sure, the assertion of the New Jersey Supreme Court, based upon the improvident concession of the defense lawyer who argued the case.[8] There was nothing in the record of the New Jersey case, however, that would support the relevance of clinical information to the causal analysis of the plaintiff’s colorectal cancer.

The authors also point to a talc ovarian cancer case as exemplifying the use of clinical data to supplement a relative risk below two.[9] The cited case, however, involved expert witnesses who claimed a relative risk greater than two for the tortogen, and who failed to show how clinical information (such as the presence of talc in ovarian tissue) made the claimant any more likely to have had a cancer caused by talc.

Adverting to “clinical information” to supplement the relative risk is all too often hand waving that offers no analytical support for the specific causal inference. The clinical factors often are covariates in the multivariate model that generated the relevant relative risk. As such, the relative risk represents an assessment of the strength of the relevant association, independent of the clinical factors that are captured in the covariates of the multivariate model. In the New Jersey case, Landrigan, the plaintiff had no asbestosis that would suggest he even had a serious exposure to asbestos. In a companion case, Caterinicchio, the plaintiff claimed that he had asbestosis, and that somehow this made the causal inference for his colorectal cancer stronger.[10] The epidemiologic studies he relied upon, however, stratified their analyses by length of exposure, and by radiographic category of asbestosis, neither of which suggested any relationship between radiographic findings and colorectal cancer outcome.

Perhaps because the authors are academics, they had to ask questions no one has ever raised in a serious way in litigation, such as whether, in addition to the clinical information, claimants could assert that toxicological data could be used to supplement a low (not greater than two) relative risk. The authors state the obvious; namely, toxicologic evidence is best suited to the assessment of general causation. They do not stop there, as they might have. Throwing their stated task of explicating the scientific foundations for specific causation inferences to the wind, the authors tell us that “[t]here is no formula for when such toxicologic evidence can tip the scales on the question of specific causation.”[11] And they wind up telling us vacuously that if the relevant epidemiology showed a small effect size, such as a two percent increased risk (RR = 1.02), then it would be unclear “how any animal data could cause one to substantially alter the best estimate of a human effect to reach a more-likely-than-not threshold.”[12] At this point in their paper, the authors seem to be discussing specific causation, but they offer nothing in the way of scientific evidence or examples of how toxicologic data could supplement a low relative risk (less than or equal to two) to permit a specific causation inference.

Idiopathy

When the analysis of the putative risk is done in a multivariate model that fairly covers the other relevant risks, relative risks less than 100 or so suggest that there is a substantial baseline or background risk for the outcome of concern. When the relative risks identified in such analyses are less than 5 or so, the studies will suggest a sizeable proportion of so-called background cases with idiopathic (unknown) causes. Differential etiologies will have to rule out those mysterious idiopathogens.

If the putative specific cause is the only substance established to cause the outcome of concern, and the RR is greater than 1.0 and less than or equal to 2.0, by definition, there is a large base rate of the disease. No amount of hokey pokey will rule out the background causes. The authors deal with this scenario under the heading of differential etiology in the face of idiopathic causes, and characterize it as a “problem.”
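The arithmetic behind that observation can be sketched quickly. The function below uses the standard simplification of the attributable fraction among the exposed, offered only as an illustration and assuming no bias, no confounding, and no interaction:

```python
def attributable_fraction(rr: float) -> float:
    """Attributable fraction among the exposed: (RR - 1) / RR."""
    return (rr - 1.0) / rr

for rr in (1.5, 2.0, 3.0):
    print(f"RR = {rr}: attributable fraction = {attributable_fraction(rr):.2f}")
# RR = 1.5 -> 0.33; RR = 2.0 -> 0.50; RR = 3.0 -> 0.67
# Only when the relative risk exceeds 2.0 does the attributable fraction exceed 50%;
# below that threshold the baseline (including idiopathic) cases remain the majority.
```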

Long story short, the authors conclude that “perhaps it is reasonable for courts to disregard idiopathic causes in those cases where idiopathic causes comprise a relatively small percent of all injuries.”[13] Such cases, however, by definition will involve diseases for which most causes are known, and the attributable fractions collectively for the known risks will be very high (say, greater than 80 or 90%). Conversely, when the attributable fraction for all known risks is lower than 80%, the unexplained portion of the disease cases will represent idiopathic cases and causes that cannot be ruled out with any confidence.

Differential etiology cannot work in the situation with a substantial baseline risk because there will be a disjunct (idiopathogen(s)) in the first statement of the syllogism, which cannot be ruled out. Thus, even if every other putative cause can be eliminated, the claimant will be left with either the tortogen or the baseline risk as the cause of his injury, and the claimant will never arrive at a conclusion that is free of a disjunction that precludes judgment in his favor. In this scenario, the claimant must lose as a matter of law.

In their discussion of this issue, the authors note that this indeterminacy resulted in the exclusion of plaintiff’s expert witnesses in the notorious case of Milward v. Acuity Specialty Products Group, Inc.[14] In Milward, the plaintiff had developed a rare variety of acute myeloid leukemia (AML), which has a large attributable fraction for idiopathic causation. This factual setting simply means that no known cause exists with a large relative risk, or even a small relative risk of 1.3 or so. Remarkably, these authors state that Milward “had prevailed on the general-causation issue,” but in fact no trial was ever held. The defense prevailed at the trial court by way of Rule 702 exclusion of plaintiff’s causation expert witnesses, but the First Circuit reversed and remanded for trial. The only prevailing that took place was the questionable avoidance of exclusion and summary judgment.[15]

On remand, the defense moved again to exclude plaintiffs’ expert witnesses on specific causation. Given that about 75% of AML cases are idiopathic, the court held that the plaintiffs’ expert witnesses’ attempt to proffer a differential etiology was fatally flawed.[16]

The authors cite the Milward specific causation decision, which in turn channeled the Restatement (Third) by couching the argument in terms of probability. If the claimant is left with a disjunction, [tortogen OR idiopathogen], then they suggest that a probability value be assigned to the idiopathogen to support the inference that the probability that the tortogen was responsible for the claimant’s outcome is [(1 – P(idiopathogen)) × 100%]. Or in Judge Woodlock’s words:

“When a disease has a discrete set of causes, eliminating some number of them significantly raises the probability that the remaining option or options were the cause-in-fact of the disease. Restatement (Third) of Torts: Phys. & Emot. Harm § 28, cmt. c (2010) (‘The underlying premise [of differential etiology] is that each of the [ ] known causes is independently responsible for some proportion of the disease in a given population. Eliminating one or more of these as a possible cause for a specific plaintiff’s disease increases the probability that the agent in question was responsible for that plaintiff’s disease.’). The same cannot be said when eliminating a few possible causes leaves not only fewer possible causes but also a high probability that a cause cannot be identified. (‘When the causes of a disease are largely unknown . . . differential etiology is of little assistance.’).”[17]

The Milward approach is thus a vague, indirect invocation of relative risks and attributable fractions, without specifying the probabilities involved in quantitative terms. As with obscenity, judges are supposed to discern when the residual probability of idiopathy is too great to permit an inference of specific causation. Somehow, I have the sense we should be able to do better than this.
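A back-of-the-envelope version of the Restatement-style renormalization, using the roughly 75% idiopathic share for AML mentioned above and otherwise hypothetical attributable shares, shows why the residual disjunct matters:

```python
# All shares other than the idiopathic figure are hypothetical.
shares = {
    "idiopathic": 0.75,    # roughly 75% of AML cases, per the discussion above
    "tortogen": 0.10,      # hypothetical attributable share for the alleged cause
    "other_known": 0.15,   # hypothetical share for known causes ruled out in this claimant
}

# Rule out the other known causes and renormalize over what remains.
remaining = {cause: share for cause, share in shares.items() if cause != "other_known"}
total = sum(remaining.values())
posterior = {cause: share / total for cause, share in remaining.items()}

print(posterior)   # tortogen ends up near 0.12 -- well short of "more likely than not"
```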

Multiple Risks

To their credit, the authors tackle the difficult cases that arise when multiple risks are present. Those multiple risks may be competing risks, including the tortogen, in which case not all of them participate in bringing about the outcome. Indeed, if there is a baseline risk, the result may still have come about from an idiopathogen. The discussion in Differential Etiologies takes some twists and turns, and I will not discuss all of it here.

Strong tortogen versus one weak competing risk

The authors describe the scenario of a strong tortogen versus a single competing risk as one of the “easy cases,” at least when the alternative cause appears to be de minimis:

“If the choice of whether one’s lung cancer was the result of a lifetime of heavy smoking or by a brief encounter with a substance for which there is a significant but weak correlation with lung cancer, in most situations it should be an easy task to rule out the other substance as the specific cause of the individual’s injury.”[18]

Unfortunately, the article’s discussion leaves everything rather vague, without quantifying the risks involved. We can, without too much effort, provide some numbers, although we cannot be sure that the authors would accept the resulting quantification. If the claimant’s lifetime of heavy smoking carried a relative risk of 30, and the claimant worked for a few years in a truck depot where he was exposed to diesel fumes that carried a relative risk of 1.2, it would seem that it should be “an easy task” to rule out diesel fumes and rule in smoking. Note, however, that the ease of the inference is lubricated by the size of the relative risks involved, one much larger than two, and the other much smaller than two, and by the absence of any suggestion of interaction or synergy between them. If the tortogen in this scenario is tobacco, the plaintiff wins readily. If the tortogen is diesel fumes, the plaintiff loses. Query whether, if this scenario arises in a case against the tobacco company, the alternative causation defense of exposure to diesel fumes fails as a matter of law.
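To put rough numbers on the “easy task,” here is one simple additive apportionment of the hypothetical risks above (a relative risk of 30 for smoking, 1.2 for diesel fumes); the additivity and independence assumptions are mine, not the authors’:

```python
# Hypothetical figures from the text above; additivity and independence of the
# excess risks are assumed purely for illustration.
rr_smoking = 30.0
rr_diesel = 1.2
baseline = 1.0

excess_smoking = rr_smoking - 1.0    # 29 units of excess risk
excess_diesel = rr_diesel - 1.0      # 0.2 units of excess risk
total = baseline + excess_smoking + excess_diesel

print(f"share attributable to smoking  ≈ {excess_smoking / total:.2f}")   # ~0.96
print(f"share attributable to diesel   ≈ {excess_diesel / total:.3f}")    # ~0.007
print(f"share attributable to baseline ≈ {baseline / total:.2f}")         # ~0.03
```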

Synergy between strong tortogen and strong competing risk

The authors cannot resist the temptation to cite the Mt. Sinai catechism[19] of multiplicative risk from smoking and asbestos exposure[20]:

“A well-known example of a synergistic effect is the combined effect of asbestos exposure and smoking on the likelihood of developing lung cancer. For long-term smokers, the relative risk of developing lung cancer compared to those who have never smoked is sometimes estimated to be in the range of 10.0. For individuals substantially exposed to asbestos, the relative risk of developing lung cancer compared to non-exposed individuals is in the range of 5.0. However, if one is unfortunate enough to have been exposed to asbestos and to have been a long-term smoker, the relative risk compared to those unexposed individuals who have not smoked exceeds the sum of the relative risks. One possibility is that the relationship is multiplicative, in the range of 50.0—i.e., a 49-fold risk increment.”[21]

The synergistic interaction is often raised in an attempt to defeat causal apportionment or avoid responsibility for the larger risk, as when smokers attempt to recover for lung cancer from asbestos exposure. Some courts have, however, permitted causal apportionment. In their analysis, the authors of Differential Etiologies simply wink and tell us that “[t]he calculation of synergistic effects is fairly complex.”[22]
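The contrast between additive and multiplicative joint effects in the quoted passage can be made explicit with its illustrative relative risks (roughly 10 for smoking and 5 for asbestos):

```python
rr_smoking = 10.0   # illustrative relative risk for long-term smoking alone
rr_asbestos = 5.0   # illustrative relative risk for substantial asbestos exposure alone

additive = rr_smoking + rr_asbestos - 1.0   # 14: the excess risks simply add
multiplicative = rr_smoking * rr_asbestos   # 50: the multiplicative ("Mt. Sinai") claim

print(f"joint RR if purely additive:       {additive:.0f}")
print(f"joint RR if purely multiplicative: {multiplicative:.0f}")
# Where the true joint effect falls between these poles drives both apportionment
# and any probability-of-causation arithmetic for the combined exposures.
```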

Tortogen versus Multiple Risks

The scenario in which the tortogen has been “ruled in,” and is present in the claimant’s history along with multiple other risks, is more difficult than one might have imagined. The authors tell us that an individual claimant will fail to show that the tortogen is more likely than not a cause of her injury when one or more of the competing risks is stronger than the risk from the tortogen (assuming no synergy).[23] The authors’ analysis leaves unclear why the claimant does not similarly fail when the strength of the tortogen is equal to that of a competing risk. Similarly, the claimant would appear to have fallen short of the burden of proving the tortogen’s causal role when there are multiple competing risk factors that individually present smaller risks than the tortogen, but for which multiple subsets represent combined competing risks greater than the risk of the tortogen.
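The subset problem can be illustrated with hypothetical relative risks, again assuming, purely for illustration, that excess risks combine additively:

```python
# All relative risks here are hypothetical; excess risks are assumed to combine
# additively and independently, purely for illustration.
tortogen_rr = 1.8
competing_rrs = {"risk_A": 1.5, "risk_B": 1.4, "risk_C": 1.3}

tortogen_excess = tortogen_rr - 1.0                               # 0.8
combined_excess = sum(rr - 1.0 for rr in competing_rrs.values())  # 0.5 + 0.4 + 0.3 = 1.2

print(f"tortogen excess risk:           {tortogen_excess:.1f}")
print(f"combined competing excess risk: {combined_excess:.1f}")
# No single competitor outweighs the tortogen, yet the combination does -- the
# situation the authors' analysis leaves unresolved.
```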

Concluding Thoughts

If the authors had framed the differential enterprise by the logic of iterative disjunctive syllogism, they would have recognized that the premise of the argument must contain the disjunction of all general causes that might have been a cause of the claimant’s disease or injury. Furthermore, unless the idiopathogen(s) is eliminated, which rarely is the case, we are left with a disjunction in the conclusion that prevents judgment for the plaintiff. The extensive analysis provided in Differential Etiologies ultimately must equate risk with cause, and it must do so on a probabilistic basis, even when the probabilities are left vague and unquantified. Indeed, the authors come close to confronting the reality that we often do not know the cause of many individuals’ diseases. We do know something about a person’s antecedent risks, and we can quantify and compare those risks. Noncommittally, the authors note that courts have been receptive to the practical solution of judging whether the tortogen’s relative risk was greater than two as a measure of sufficiency for specific causation, and that they “agree that theoretically this intuition has appeal.”[24]

Although I have criticized many aspects of the article, it is an important contribution to the legal study of specific causation. Its taxonomy will not likely be the final word on the subject, but it is a major step toward making sense of an area of the law long dominated by clinical black boxes and ipse dixits.


[1] Differential Etiologies at 885. The authors noted that their advice was “especially true in those case-control studies where the cases and controls are not drawn from the same defined population at risk for the outcome under investigation.”

[2] Differential Etiologies at 895.

[3] Differential Etiologies at 895 & n. 154, citing Betz v. Pneumo Abex, LLC, 44 A.3d 27, 51 (Pa. 2012).

[4] Differential Etiologies at 895 n.156.

[5] American Cancer Soc’y website, last visited June 19, 2022.

[6] I did not know at the time that Selikoff had failed the B-reader examination.

[7] See, e.g., Bowers v. Norfolk Southern Corp., 537 F. Supp. 2d 1343, 1359–60 (M.D. Ga. 2007) (“The differential diagnosis method has an inherent reliability; the differential etiology method does not. This conclusion does not suggest that the differential etiology approach has no merit. It simply means that courts, when dealing with matters of reliability, should consider opinions based on the differential etiology method with more caution. It also means that courts should not conflate the two definitions.”)

[8] Differential Etiologies at 899 & n.176, citing Landrigan v. Celotex Corp., 127 N.J. 404, 605 A.2d 1079, 1087 (1992).

[9] Differential Etiologies at 899 & n.179, citing Johnson & Johnson Talcum Powder Cases, 249 Cal. Rptr. 3d 642, 671–72 (Cal. Ct. App. 2019).

[10] Caterinicchio v. Pittsburgh Corning Corp., 127 N.J. 428, 605 A.2d 1092 (1992).

[11] Differential Etiologies at 899.

[12] Differential Etiologies at 900.

[13] Differential Etiologies at 915.

[14] 639 F.3d 11 (1st Cir. 2011).

[15] Does it require pointing out that the reversal took place with a highly questionable, unethical amicus brief submitted by a not-for-profit that was founded by the two plaintiffs’ expert witnesses excluded by the trial court? Given that the First Circuit reversed and remanded, and then later affirmed the exclusion of plaintiffs’ expert witnesses on specific causation, and the entry of judgment, the first appellate decision became unnecessary to the final judgment and no longer a clear precedent.

[16] Differential Etiologies at 912, discussing Milward v. Acuity Specialty Prods. Group, Inc., 969 F. Supp. 2d 101, 109 (D. Mass. 2013), aff’d sub. nom., Milward v. Rust-Oleum Corp., 820 F.3d 469, 471, 477 (1st Cir. 2016).

[17] Id., quoting from Milward.

[18] Differential Etiologies at 901.

[19]  “The Mt. Sinai Catechism” (June 11, 2013).

[20] The mantra of 5-10-50 comes from early publications by Irving John Selikoff, and represents a misrepresentation of “never smoked regularly” as “never smoked,” and the use of a non-contemporaneous control group for the non-asbestos exposed, non-smoker base rate. When the external control group was updated to show a relative risk of 20, rather than 10 for smoking only, Selikoff failed to update his analysis. Selikoff’s protégés have recently updated the insulator cohort, repeating many of the original errors, but even so, finding only that “the joint effect of smoking and asbestos alone was additive.” See Steve Markowitz, Stephen Levin, Albert Miller, and Alfredo Morabia, “Asbestos, Asbestosis, Smoking and Lung Cancer: New Findings from the North American Insulator Cohort,” Am. J. Respir. & Critical Care Med. (2013).

[21] Differential Etiologies at 902. The authors do not cite the Selikoff publications, which repeated his dataset and his dubious interpretation endlessly, but rather cite David Faigman, et al., Modern Scientific Evidence: The Law and Science of Expert Testimony § 26.25 (West 2019–2020 ed.). To their credit, the authors describe multiplicative interaction as a possibility, but surely they know that plaintiffs’ expert witnesses recite the Mt. Sinai catechism in courtrooms all around the country, while intoning “reasonable degree of medical certainty.” The authors cite some contrary studies. Differential Etiologies at 902 n.188, citing several reviews including Darren Wraith & Kerrie Mengersen, “Assessing the Combined Effect of Asbestos Exposure & Smoking on Lung Cancer: A Bayesian Approach,” 26 Stats. Med. 1150, 1150 (2007) (evidence supports more than an additive model and less than a multiplicative relation).

[22] Differential Etiologies at 902 n.189.

[23] Differential Etiologies at 905. The authors note that courts have admitted differential etiology testimony when the tortogen’s risk is greater than the risk from other known risks. Id. citing Cooper v. Takeda Pharms., 191 Cal. Rptr. 3d 67, 79 (Ct. App. 2015).

[24] Differential Etiologies at 896 & n.163.

Differential Etiologies – Part One – Ruling In

June 17th, 2022

You put your right foot in

You put your right foot out

You put your right foot in

And you shake it all about

You do the Hokey Pokey and you turn yourself around

That’s what it’s all about!

 

Ever since the United States Supreme Court decided Daubert v. Merrell Dow Pharmaceuticals, Inc., 509 U.S. 579 (1993), legal scholars, judges, and lawyers have struggled with the structure and validity of expert opinion on specific causation. Professor David Faigman and others have attempted to articulate the scientific basis (if any) for opinion testimony in health-effects litigation that a given person’s disease has been caused by an exposure or condition.

In 2015, as part of a tribute to the late Judge Jack Weinstein, Professor Faigman offered the remarkable suggestion that in advancing differential etiologies, expert witnesses were inventing wholesale an approach that had no foundation or acceptance in their scientific disciplines:

 “Differential etiology is ostensibly a scientific methodology, but one not developed by, or even recognized by, physicians or scientists. As described, it is entirely logical, but has no scientific methods or principles underlying it. It is a legal invention and, as such, has analytical heft, but it is entirely bereft of empirical grounding. Courts and commentators have so far merely described the logic of differential etiology; they have yet to define what that methodology is.”[1]

Faigman is correct that courts often have left unarticulated exactly what the methodology is, but he does not quite make sense when he writes that the method of differential etiology is “entirely logical,” but has no “scientific methods or principles underlying it.” After all, Faigman starts off his essay with a quotation from Thomas Huxley that “science is nothing but trained and organized common sense.”[2] As I have written elsewhere, the form of reasoning involved in differential diagnosis is nothing other than iterative disjunctive syllogism.[3] Either-or reasoning occurs throughout the physical and biological sciences; it is not clear why Faigman declares it un- or extra-scientific.
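For what it is worth, the logical skeleton of iterative disjunctive syllogism is simple enough to sketch; the candidate causes below are hypothetical placeholders:

```python
# The candidate causes are hypothetical placeholders.
candidates = {"cause_A", "cause_B", "cause_C", "idiopathic"}
ruled_out = {"cause_B", "cause_C"}        # eliminated by the evidence in a given case

remaining = candidates - ruled_out
print(remaining)   # {'cause_A', 'idiopathic'}
# The syllogism yields a conclusion only if every disjunct but one can be eliminated;
# a residual disjunct (here, "idiopathic") leaves the conclusion as a disjunction.
```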

The strength of Faigman’s claim about the made-up nature of differential etiology appears to be undermined and contradicted by an example that he provides from clinical allergy and immunology:

“Allergists, for example, attempt to identify the etiology of allergic reactions in order to treat them (or to advise the patient to avoid what caused them), though it might still be possible to treat the allergic reactions without knowing their etiology.”

Faigman at 437. Of course, not only allergists try to determine the cause of an individual patient’s disease. Psychiatrists, in the psychoanalytic tradition, certainly do so as well. Physicians who use predictive regression models use group data, in multivariate analyses, to predict outcomes, risk, and mortality in individual patients. Faigman’s claim is similarly undermined by the existence of a few diseases (other than infectious diseases) that are defined by the causative exposure. Silicosis and manganism have played a large role in often bogus litigation, but they represent instances in which the differential diagnostic puzzle may also be an etiological puzzle. Of course, to the extent that a disease is defined in terms of causative exposures, there may be serious and even intractable problems caused by the lack of specificity and accuracy in the diagnostic criteria for the supposedly pathognomonic disease.

As I noted at the time of Faigman’s 2015 essay, his suggestion that the concept of “differential etiology” was not used in the sciences themselves, was demonstrably flawed and historically inaccurate.[4]

A year earlier, in a more sustained analysis of specific causation, Professor Faigman went astray in a different direction, this time by stating that:

“it is not customary in the ordinary practice of sociology, epidemiology, anthropology, and related fields (for example, cognitive and social psychology) for professionals to make individual diagnostic judgments derived from group-based data.”[5]

Faigman’s invocation of the “ordinary practice” of epidemiology was seriously wide of the mark. Medical practitioners and scientists frequently use epidemiologic data, based upon “group-based data,” to make individual diagnostic judgments. Inferences from group data to individuals abound in the diagnostic process itself, where the specificity and sensitivity of disease signs and symptoms are measured by group data. Physicians must rely upon group data to make prognoses for individual patients, and they rely upon group data to predict future disease risks for individual patients. Future disease risks, as in the Framingham risk score for hard coronary heart disease, or the Gail model for breast cancer risk, are, of course, based upon “group-based data.” Medical decisions to intervene, surgically, pharmacologically, or by some other method, all involve applying group data to the individual patient.

Faigman’s 2014 law review article was certainly correct, however, in noting that specific causation inferences and conclusions were often left “profoundly underdefined,” with glib identifications of risk with cause.[6] There was thus plenty of room for further elucidation of specific causation decisions, and I welcome Faigman’s most recent effort to nail conceptual jello to the wall, in a law review article that was published last year.[7]

This new article, “Differential Etiology: Inferring Specific Causation in the Law from Group Data in Science,” is the collaborative product of Professor Faigman and three other academics. Joseph Sanders will be immediately recognizable to the legal community as someone who has long pondered causation issues, both general and specific, and who has contributed greatly to the law review literature on causation of health outcomes. In addition to the law professors, Peter B. Imrey, a professor of medicine at the Cleveland Clinic Lerner College of Medicine of Case Western Reserve University, and Philip Dawid, an emeritus professor of statistics at the University of Cambridge, have joined the effort to make sense of specific causation in the law. The addition of medical and statistical expertise has added greatly to Faigman’s previous efforts; it has corrected some of his earlier errors and added much nuance to the discussion. The resulting law review article is well worth reading for practitioners. In this post, however, I have not detailed every important insight, but rather have tried to point out some of the continuing and new errors in the analysis.

The Sanders-Faigman-Imrey-Dawid analysis begins with a lament that:

“there is no body of science to which experts can turn when addressing this issue. Ultimately, much of the evidence that can be brought to bear on this causal question is the same group-level data employed to prove general causation. Consequently, the expert testimony often feels jerry-rigged, an improvisation designed to get through a tough patch.”[8]

As an assessment of the judicial decisions on specific causation, there can be no dissent or appeal from the judgment of these authors. The authors’ use of the term “jerry-rigged” is curious. At first I thought they were straining to avoid using the common phrase “jury rigged,” or to avoid inventing a neologism such as “judge rigged.” The American Heritage and Merriam-Webster dictionaries, however, describe the phrase “jerry-rigged” as a conflation of “jury-rigged,” a nautical term for a temporary but adequate repair, with “jerry-built,” a war-time pejorative term for makeshift devices put together by Germans. So jerry-rigged it is, and the authors are off and running to try to describe, clarify, and justify the process of drawing specific causation inferences by differential etiology. They might have called what passes for judicial decision making in this area the “hokey pokey.”

The authors begin their analysis of specific causation with a brief acknowledgement that our legal system could abandon any effort to set standards or require rigorous thinking on the matter by simply leaving the matter to the jury.[9] After all, this laissez-faire approach had been the rule of law for centuries. Nevertheless, despite occasional retrograde, recidivist judicial opinions,[10] the authors realize that the law has evolved to a point that some judicial control over specific causation opinions is required. And if judges are going to engage in gatekeeping of specific-causation opinions, they need to explain and justify their decisions in a coherent and cogent fashion.

Having thus dispatched legal nihilism, the authors turn their attention to what they boldly describe as “the first full-scale effort to bring scientific sensibilities – and rigorous statistical thinking – to the legally imperative concept of specific causation.”[11] The claim is remarkable given that tort law has been dealing with the issue for decades, but it is probably correct given how frequently judges have swept the issue under a judicial rug of impenetrable verbiage and shaggy thinking. The authors also walk back some of Faigman’s earlier claims that there is no science in the assessment of specific causation, although they acknowledge the obvious, that policy issues sometimes play a role in deciding both general and specific causation questions. The authors also offer the insight, for which they claim novelty, that some of the Bradford Hill guidelines, although stated as part of assessing general causation, have some relevancy to decisions concerning specific causation.[12] Their insight is indeed important, although hardly novel.

Drawing upon some of the clearer judicial decisions, the authors identify three necessary steps to reach a conclusion of specific causation:

“(a) making a proper diagnosis;

(b) supporting (“ruling in”) the plausibility of the alleged cause of the injury on the basis of general evidence and logic; and

(c) particularization, i.e., excluding (‘ruling out’) competing causes in the specific instance under consideration.”[13]

Although this article is ostensibly about specific causation, the authors do not reach a serious discussion of the matter until roughly the 42nd page of a 72-page article. Having described a three-step approach, the authors feel compelled to discuss step one (describing or defining the “diagnosis,” or the outcome of interest), and step two, the “ruling in” process that requires an assessment of general causation.

Although ascertaining general causation is not the focus of this article, the authors give an extensive discourse on it. Indeed, the authors have some useful things to say about steps one and two, and I commend the article to readers for some of its learning. As much as the lawsuit industry might wish to do away with the general causation step, it is not going anywhere soon.[14] The authors also manage to say some things that range from wrong to not even wrong. One example of professorial wish casting is the following assertion:

“Other things being equal, when the evidence for general causation is strong, and especially when the strength of the exposure–disease relationship as demonstrated in a body of research is substantial, the plaintiff faces a lower threshold in establishing the substance as the cause in a particular case than when the relationship is weaker.”[15]

This assertion appears sans citation or analysis. The generalization fails in the face of counterexamples. The causal role for estrogen in many breast cancers is extremely strong. The International Agency for Research on Cancer classifies estrogen as a Group 1, known human carcinogen for breast cancer, even though estrogen is made naturally in the human female, and male, body. In the Women’s Health Initiative clinical trial, researchers reported a hazard ratio of 1.2,[16] but plaintiffs struggled to prevail on specific causation in litigation involving claims of breast cancer caused by post-menopausal hormone therapy. Perhaps the authors meant, by strength of the exposure relationship, a high relative risk as well, but that point is taken up when the authors address the “ruling in” step of the three-step approach. In any event, the strength of the case for general causation is quite independent of the specific causation inference, especially in the face of small effect sizes.

On general causation itself, the authors begin their discussion with “threats to validity,” a topic that they characterize as mostly implicit in the Bradford Hill guidelines. But their suggestion that validity is merely implicit in the guidelines is belied by their citation to Dr. Woodside’s helpful article on the “forgotten predicate” to the nine Bradford Hill guidelines.[17] Bradford Hill explicitly noted that the starting point for considering an association to be causal occurred when “[o]ur observations reveal an association between two variables, perfectly clear-cut and beyond what we would care to attribute to the play of chance.”[18] Sir Austin told us in no uncertain terms that there is no need to consider the nine guidelines until random and systematic error have been rejected.[19]

In this article’s discussion of general causation, Professor Dawid’s influence can be seen in the unusual care taken to describe and define the p-value.[20] But the discussion devolves into more wishcasting, when the authors state that p-values are not the only way to assess random error in research results.

They double down by stating that “[m]any prominent statisticians and other scientists have questioned it, and the need for change is increasingly accepted.”[21] The source for their statement, the American Statistical Association (ASA) 2016 p-value Statement, did not question the utility of the p-value for assessing random error, and this law review provides no support for the other, unidentified methods of assessing random error. For the most part, the ASA Statement identified misuses and misstatements of p-values, with the caveat that “[s]cientific conclusions and business or policy decisions should not be based only on whether a p-value passes a specific threshold.” This is hardly questioning the importance or utility of p-values in assessing random error.
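For readers who want to see what a p-value does measure, here is a minimal sketch in Python, using a hypothetical two-by-two table of exposure and disease counts invented solely for illustration. The p-value speaks only to the compatibility of the observed counts with chance, under the null hypothesis of no association.

```python
from scipy.stats import fisher_exact

# Hypothetical counts (not drawn from any real study):
#                 disease   no disease
#   exposed           30          970
#   unexposed         20          980
table = [[30, 970], [20, 980]]

odds_ratio, p_value = fisher_exact(table, alternative="two-sided")
print(f"odds ratio = {odds_ratio:.2f}, two-sided p = {p_value:.3f}")
# The p-value quantifies random error only; systematic error (bias,
# confounding) must be assessed separately, as Bradford Hill presupposed.
```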

When one of the cited authors, Ronald Wasserstein, published an editorial in 2019, proclaiming that it was time to move past the p-value, the then-president of the ASA, Professor Karen Kafadar, commissioned a task force on the matter. That task force, consisting of many of the world’s leading statisticians, issued a short but pointed rejection of Wasserstein’s advocacy, and by implication, of the position asserted in this law review.[22] Several of the leading biomedical journals that Wasserstein had lobbied to abandon statistical significance testing reassessed their statistical guidelines and reaffirmed the use of p-values and tests.[23]

Similarly, this law review’s statements that alternatives to frequentist tests (p-values), such as Bayesian inference, are “ascendant” have no supporting citations, and are generally an inaccurate assessment of what most biomedical journals are currently publishing.

Despite the care with which this law review article has defined p-values, the authors run off the road when defining a confidence interval:

“A 95% confidence interval … is a one-sided or two-sided interval from a data sample with 95% probability of bounding a fixed, unknown parameter, for which no nondegenerate probability distribution is conceived, under specified assumptions about the data distribution.”[24]

The point of the added emphasis is that the authors ascribed to a single confidence interval the property of bounding the true parameter with 95% probability. That property, however, belongs to the procedure: the infinite set of confidence intervals generated by repeated sampling of the same size from the same population, with constant variance. No probability statement can be made about whether the true parameter lies within, or outside, any given confidence interval.
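A short simulation makes the distinction concrete. In the sketch below (Python, with parameters invented for illustration), roughly 95% of a long run of intervals constructed by the same procedure cover the fixed, unknown mean; any single interval, however, either covers it or does not, and no probability attaches to that one interval.

```python
import numpy as np

rng = np.random.default_rng(0)
true_mean, sigma, n, trials = 10.0, 2.0, 50, 10_000
half_width = 1.96 * sigma / np.sqrt(n)  # known-variance interval, for simplicity

covered = 0
for _ in range(trials):
    sample_mean = rng.normal(true_mean, sigma, n).mean()
    covered += (sample_mean - half_width <= true_mean <= sample_mean + half_width)

print(f"coverage over {trials} repeated samples: {covered / trials:.3f}")
# ~0.95: the "95%" describes the long-run performance of the procedure,
# not the probability that any particular interval contains the parameter.
```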

In an issue that is relevant to general and specific causation, the authors offer some ipse dixit on the issue of “thresholds”:

“with respect to some substance/injury relationships, it is thought that there is no safe threshold. Cancer is the injury for which it is most frequently thought that there is no safe threshold, but even here the mechanism of injury may lead to a different conclusion.”[25]

Here as elsewhere, the authors are repeating dogma, not science, and they ignore the substantial body of scientific evidence that undermines the so-called linear no-threshold dose-response curve. The only citation offered is a judicial citation to a case that rejected the no-threshold position![26]

So much for “ruling in.” In the next post, I will turn my attention to this law review’s handling of the “ruling out” step of differential etiology.


[1] David L. Faigman & Claire Lesikar, “Organized Common Sense: Some Lessons from Judge Jack Weinstein’s Uncommonly Sensible Approach to Expert Evidence,” 64 DePaul L. Rev. 421, 444 (2015).

[2] Thomas H. Huxley, “On the Education Value of the Natural History Sciences” (1854), in Lay Sermons, Addresses and Reviews 77 (1915).

[3] See, e.g., “Differential Etiology and Other Courtroom Magic” (June 23, 2014) (collecting cases); “Differential Diagnosis in Milward v. Acuity Specialty Products Group” (Sept. 26, 2013).

[4] See David Faigman’s Critique of G2i Inferences at Weinstein Symposium (Sept. 11, 2015); Kløve & D. Doehring, “MMPI in epileptic groups with differential etiology,” 18 J. Clin. Psychol. 149 (1962); Kløve & C. Matthews, “Psychometric and adaptive abilities in epilepsy with differential etiology,” 7 Epilepsia 330 (1966); Teuber & K. Usadel, “Immunosuppression in juvenile diabetes mellitus? Critical viewpoint on the treatment with cyclosporin A with consideration of the differential etiology,” 103 Fortschr. Med. 707 (1985); G. May & W. May, “Detection of serum IgA antibodies to varicella zoster virus (VZV)–differential etiology of peripheral facial paralysis. A case report,” 74 Laryngorhinootologie 553 (1995); Alan Roberts, “Psychiatric Comorbidity in White and African-American Illicit Substance Abusers: Evidence for Differential Etiology,” 20 Clinical Psych. Rev. 667 (2000); Mark E. Mullins, Michael H. Lev, Dawid Schellingerhout, Gilberto Gonzalez, and Pamela W. Schaefer, “Intracranial Hemorrhage Complicating Acute Stroke: How Common Is Hemorrhagic Stroke on Initial Head CT Scan and How Often Is Initial Clinical Diagnosis of Acute Stroke Eventually Confirmed?” 26 Am. J. Neuroradiology 2207 (2005); Qiang Fu, et al., “Differential Etiology of Posttraumatic Stress Disorder with Conduct Disorder and Major Depression in Male Veterans,” 62 Biological Psychiatry 1088 (2007); Jesse L. Hawke, et al., “Etiology of reading difficulties as a function of gender and severity,” 20 Reading and Writing 13 (2007); Mastrangelo, “A rare occupation causing mesothelioma: mechanisms and differential etiology,” 105 Med. Lav. 337 (2014).

[5] David L. Faigman, John Monahan & Christopher Slobogin, “Group to Individual (G2i) Inference in Scientific Expert Testimony,” 81 Univ. Chi. L. Rev. 417, 465 (2014).

[6] Id. at 448.

[7] Joseph Sanders, David L. Faigman, Peter B. Imrey, and Philip Dawid, “Differential Etiology: Inferring Specific Causation in the Law from Group Data in Science,” 63 Ariz. L. Rev. 851 (2021) [Differential Etiology]. I am indebted to Kirk Hartley for calling this new publication to my attention.

[8] Id. at 851, 855.

[9] Id. at 855 & n. 8 (citing A. Philip Dawid, David L. Faigman & Stephen E. Fienberg, “Fitting Science into Legal Contexts: Assessing Effects of Causes or Causes of Effects?,” 43 Sociological Methods & Research 359, 363–64 (2014)). See also Barbara Pfeffer Billauer, “The Causal Conundrum: Examining the Medical-Legal Disconnect in Toxic Tort Cases from a Cultural Perspective or How the Law Swallowed the Epidemiologist and Grew Long Legs and a Tail,” 51 Creighton L. Rev. 319 (2018) (arguing for a standard-less approach that allows clinicians to offer their ipse dixit opinions on specific causation).

[10] Differential Etiology at 915 & n.231, 919 & n.244 (citing In re Round-Up Prods. Liab. Litig., 358 F. Supp. 3d 956, 960 (N.D. Cal. 2019)).

[11] Differential Etiology at 856 (emphasis added).

[12] Differential Etiology at 857.

[13] Differential Etiology at 857 & n.14 (citing Best v. Lowe’s Home Ctrs., Inc., 563 F.3d 171, 180 (6th Cir. 2009)).

[14] See Margaret Berger, “Eliminating General Causation: Notes Toward a New Theory of Justice and Toxic Torts,” 97 Colum L. Rev. 2117 (1997).

[15] Differential Etiology at 864.

[16] Jacques E. Rossouw, et al., “Risks and benefits of estrogen plus progestin in healthy postmenopausal women: Principal results from the Women’s Health Initiative randomized controlled trial,” 288 J. Am. Med. Ass’n 321 (2002).

[17] Differential Etiology at 884 & n.104, citing Frank Woodside & Allison Davis, “The Bradford Hill Criteria: The Forgotten Predicate,” 35 Thomas Jefferson L. Rev. 103 (2013).

[18] Austin Bradford Hill, “The Environment and Disease: Association or Causation?” 58 Proc. Royal Soc’y Med. 295, 295 (1965).  

[19] Differential Etiology at 865.

[20] Differential Etiology at 869.

[21] Differential Etiology at 872, citing Ronald L. Wasserstein and Nicole A. Lazar, “The ASA Statement on p-Values: Context, Process, and Purpose,” 70 Am. Statistician 129 (2016).

[22] Yoav Benjamini, Richard D. De Veaux, Bradley Efron, Scott Evans, Mark Glickman, Barry I. Graubard, Xuming He, Xiao-Li Meng, Nancy M. Reid, Stephen M. Stigler, Stephen B. Vardeman, Christopher K. Wikle, Tommy Wright, Linda J. Young, and Karen Kafadar, “ASA President’s Task Force Statement on Statistical Significance and Replicability,” 15 Ann. Applied Statistics 1084 (2021), 34 Chance 10 (2021).

[23] See “Statistical Significance at the New England Journal of Medicine” (July 19, 2019); See also Deborah G. Mayo, “The NEJM Issues New Guidelines on Statistical Reporting: Is the ASA P-Value Project Backfiring?” Error Statistics Philosophy  (July 19, 2019).

[24] Differential Etiology at 898 n.173 (emphasis added).

[25] Differential Etiology at 890.

[26] Differential Etiology at n.134, citing Chlorine Chemistry Council v. Envt’l Protection Agency, 206 F.3d 1286 (D.C. Cir. 2000), which rejected the agency’s assumption that the carcinogenic effects of chloroform in drinking water lacked a threshold.

Improper Reliance upon Regulatory Risk Assessments in Civil Litigation

March 19th, 2022

Risk assessments would seemingly be about assessing risks, but they are not. The Reference Manual on Scientific Evidence defines “risk” as “[a] probability that an event will occur (e.g., that an individual will become ill or die within a stated period of time or by a certain age).”[1] The risk in risk assessment, however, may be zero, or uncertain, or even a probability of benefit. Agencies that must assess risks and set “action levels,” or “permissible exposure limits,” or “acceptable intakes,” often work under great uncertainty, with inspired guesswork, using unproven assumptions.

The lawsuit industry has thus often embraced the false equivalence between agency pronouncements on harmful medicinal, environmental, or occupational exposures and civil litigation adjudication of tortious harms. In the United States, federal agencies such as the Occupational Safety and Health Administration (OSHA), or the Environmental Protection Agency (EPA), and their state analogues, regularly set exposure standards that could not and should not hold up in a common-law tort case. 

Remarkably, there are state and federal court judges who continue to misunderstand and misinterpret regulatory risk assessments, notwithstanding efforts to educate the judiciary. The second edition of the Reference Manual on Scientific Evidence contained a chapter by the late Professor Margaret Berger, who took pains to point out the difference between agency assessments and the adjudication of causal claims in court:

[p]roof of risk and proof of causation entail somewhat different questions because risk assessment frequently calls for a cost-benefit analysis. The agency assessing risk may decide to bar a substance or product if the potential benefits are outweighed by the possibility of risks that are largely unquantifiable because of presently unknown contingencies. Consequently, risk assessors may pay heed to any evidence that points to a need for caution, rather than assess the likelihood that a causal relationship in a specific case is more likely than not.[2]

In March 2003, Professor Berger organized a symposium,[3] the first Science for Judges program (and the last), where the toxicologist Dr. David L. Eaton presented on the differences in the use of toxicology in regulatory pronouncements as opposed to causal assessments in civil actions. As Dr. Eaton noted:

“regulatory levels are of substantial value to public health agencies charged with ensuring the protection of the public health, but are of limited value in judging whether a particular exposure was a substantial contributing factor to a particular individual’s disease or illness.”[4]

The United States Environmental Protection Agency (EPA) acknowledges that estimating “risk” from low-level exposures based upon laboratory animal data is fraught because of inter-species differences in longevity, body habitus and size, genetics, metabolism, and excretion patterns, as well as the genetic homogeneity of laboratory animals, and differences in dosing levels and regimens. The EPA’s assumptions in conducting and promulgating regulatory risk assessments are intended to predict the upper bound of theoretical risk, while fully acknowledging that there may be no actual risk in humans:

“It should be emphasized that the linearized multistage [risk assessment] procedure leads to a plausible upper limit to the risk that is consistent with some proposed mechanisms of carcinogenesis. Such an estimate, however, does not necessarily give a realistic prediction of the risk. The true value of the risk is unknown, and may be as low as zero.”[5]

The approach of the U.S. Food and Drug Administration (FDA) to mutagenic impurities in medications provides an illustrative example of just how theoretical and hypothetical risk assessment can be.[6] The FDA’s risk assessment approach is set out in a “Guidance” document, which, like all such FDA guidances, describes itself as containing non-binding recommendations that do not preempt alternative approaches.[7] The agency’s goal is to devise a control strategy for any mutagenic impurity to keep it at or below an “acceptable cancer risk level,” even if the risk or the risk level is completely hypothetical.

The FDA guidance advances the concept of a “Threshold of Toxicological Concern” (TTC) to set an “acceptable intake” for chemical impurities that pose negligible risks of toxicity or carcinogenicity.[8] The agency describes its risk assessment methodology as “very conservative,” given the frequently unproven assumptions made to reach a quantification of an “acceptable intake”:

“The methods upon which the TTC is based are generally considered to be very conservative since they involve a simple linear extrapolation from the dose giving a 50% tumor incidence (TD50) to a 1 in 10⁶ incidence, using TD50 data for the most sensitive species and most sensitive site of tumor induction. For application of a TTC in the assessment of acceptable limits of mutagenic impurities in drug substances and drug products, a value of 1.5 micrograms (µg)/day corresponding to a theoretical 10⁻⁵ excess lifetime risk of cancer can be justified.”
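To make the guidance’s “simple linear extrapolation” concrete, the sketch below (Python, with a purely hypothetical TD50 chosen only for illustration, not the data set the FDA or ICH actually analyzed) shows how a daily dose corresponding to a 10⁻⁵ or 10⁻⁶ theoretical lifetime risk falls out of drawing a straight line from zero up to the TD50.

```python
def dose_at_risk(td50_mg_per_kg_day: float, target_risk: float,
                 body_weight_kg: float = 50.0) -> float:
    """Daily dose (µg/day) at a target lifetime risk under a linear no-threshold model.

    Risk is assumed to rise in a straight line from (0, 0) to (TD50, 0.5),
    for a 50 kg person dosed every day for a lifetime.
    """
    dose_mg_per_kg_day = td50_mg_per_kg_day * (target_risk / 0.5)
    return dose_mg_per_kg_day * body_weight_kg * 1000.0  # mg/day -> µg/day

# Hypothetical TD50 of 1.25 mg/kg/day, chosen only to show the arithmetic:
print(f"{dose_at_risk(1.25, 1e-5):.2f} µg/day at a theoretical 1-in-100,000 risk")
print(f"{dose_at_risk(1.25, 1e-6):.3f} µg/day at a theoretical 1-in-1,000,000 risk")
```

Every conservative assumption catalogued later in this post enters through that straight line, and through the choice of the most sensitive species and tumor site.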

For more potent mutagenic carcinogens, such as aflatoxin-like-, N-nitroso-, and alkyl-azoxy compounds, the acceptable intake or permissible daily exposure (PDE) is set lower, based upon available animal toxicologic data.

The important divide between regulatory practice and the litigation of causal claims in civil actions arises from the theoretical nature of the risk assessment enterprise. The FDA acknowledges, for instance, that the acceptable intake is set to mark “a small theoretical increase in risk,” and that it is a “highly hypothetical concept that should not be regarded as a realistic indication of the actual risk”; in other words, not an actual risk.[9] The hypothetical or theoretical risk corresponding to the acceptable intake level is clearly small when compared with a person’s lifetime probability of developing cancer (which the FDA states is greater than one in three, and which probably now approaches 40%).

Although the TTC concept allows a calculation of an estimated “safe exposure,” the FDA points out that:

“exceeding the TTC is not necessarily associated with an increased cancer risk given the conservative assumptions employed in the derivation of the TTC value. The most likely increase in cancer incidence is actually much less than 1 in 100,000. *** Based on all the above considerations, any exposure to an impurity that is later identified as a mutagen is not necessarily associated with an increased cancer risk for patients already exposed to the impurity. A risk assessment would determine whether any further actions would be taken.”

In other words, the FDA’s risk assessment exists to guide agency action, not to determine a person’s risk or medical status.[10]

As small and theoretical as the risks are, they are frequently based upon demonstrably incorrect assumptions, such as:

  1. humans are as sensitive as the most sensitive species;
  2. all organs are as sensitive as the most sensitive organ of the most sensitive species;
  3. the dose-response in the most sensitive species is a simple linear relationship;
  4. the linear relationship runs from zero exposure and zero risk to the exposure that yields the so-called TD50, the exposure that yields tumors in 50% of the experimental animal model;
  5. the TD50 is calculated based upon the point estimate in the animal model study, regardless of any confidence interval around the point estimate;
  6. the inclusion, in many instances, of non-malignant tumors as part of the assessment of the TD50 exposure;
  7. there is some increased risk for any exposure, no matter how small; that is, there is no threshold below which there is no increased risk; and
  8. the medication with the mutagenic impurity was used daily for 70 years, by a person who weighs 50 kg.

Although the FDA acknowledges that there may be instances in which a “less-than-lifetime” (LTL) limit is appropriate, it places the burden on manufacturers to show the appropriateness of higher LTL limits. The FDA’s M7 Guidance observes that

“[s]tandard risk assessments of known carcinogens assume that cancer risk increases as a function of cumulative dose. Thus, cancer risk of a continuous low dose over a lifetime would be equivalent to the cancer risk associated with an identical cumulative exposure averaged over a shorter duration.”[11]

Similarly, the agency acknowledges that there may be a “practical threshold,” as a result of bodily defense mechanisms, such as DNA repair, which counter any ill effects from lower-level exposures.[12]

“The existence of mechanisms leading to a dose response that is non-linear or has a practical threshold is increasingly recognized, not only for compounds that interact with non-DNA targets but also for DNA-reactive compounds, whose effects may be modulated by, for example, rapid detoxification before coming into contact with DNA, or by effective repair of induced damage. The regulatory approach to such compounds can be based on the identification of a No-Observed Effect Level (NOEL) and use of uncertainty factors (see ICH Q3C(R5), Ref. 7) to calculate a permissible daily exposure (PDE) when data are available.”

Expert witnesses often attempt to bootstrap their causation opinions by reference to determinations of regulatory agencies that are couched in similar language, but that rest upon a different quality and quantity of evidence than is required in the scientific community or in civil courts.

Supreme Court

Industrial Union Dep’t v. American Petroleum Inst., 448 U.S. 607, 656 (1980) (“OSHA is not required to support its finding that a significant risk exists with anything approaching scientific certainty” and “is free to use conservative assumptions in interpreting the data with respect to carcinogens, risking error on the side of overprotection, rather than underprotection.”).

Matrixx Initiatives, Inc. v. Siracusano, 563 U.S. 27, 131 S.Ct. 1309, 1320 (2011) (regulatory agency often makes regulatory decisions based upon evidence that gives rise only to a suspicion of causation) 

First Circuit

Sutera v. Perrier Group of America, Inc., 986 F. Supp. 655, 664-65, 667 (D. Mass. 1997) (a regulatory agency’s “threshold of proof is reasonably lower than that in tort law”; “substances are regulated because of what they might do at given levels, not because of what they will do. . . . The fact of regulation does not imply scientific certainty. It may suggest a decision to err on the side of safety as a matter of regulatory policy rather than the existence of scientific fact or knowledge. . . . The mere fact that substances to which [plaintiff] was exposed may be listed as carcinogenic does not provide reliable evidence that they are capable of causing brain cancer, generally or specifically, in [plaintiff’s] case.”); id. at 660 (warning against the danger that a jury will “blindly accept an expert’s opinion that conforms with their underlying fears of toxic substances without carefully understanding or examining the basis for that opinion.”). Sutera is an important precedent, which involved a claim that exposure to an IARC Group 1 carcinogen, benzene, caused plaintiffs’ leukemia. The plaintiff’s expert witness, Robert Jacobson, espoused a “linear, no-threshold” theory and relied upon an EPA regulation, which he claimed supported his opinion that even trace amounts of benzene can cause leukemia.

In re Neurontin Mktg., Sales Practices, and Prod. Liab. Litig., 612 F. Supp. 2d 116, 136 (D. Mass. 2009) (‘‘It is widely recognized that, when evaluating pharmaceutical drugs, the FDA often uses a different standard than a court does to evaluate evidence of causation in a products liability action. Entrusted with the responsibility of protecting the public from dangerous drugs, the FDA regularly relies on a risk-utility analysis, balancing the possible harm against the beneficial uses of a drug. Understandably, the agency may choose to ‘err on the side of caution,’ … and take regulatory action such as revising a product label or removing a drug from the marketplace ‘upon a lesser showing of harm to the public than the preponderance-of-the-evidence or more-like-than-not standard used to assess tort liability’.’’) (internal citations omitted) 

Whiting v. Boston Edison Co., 891 F. Supp. 12, 23-24 (D. Mass. 1995) (criticizing the linear no-threshold hypothesis, common to regulatory risk assessments, because it lacks any known or potential error rate, and it cannot be falsified as would any scientific theory)

Second Circuit

Wills v. Amerada Hess Corp., No. 98 CIV. 7126(RPP), 2002 WL 140542 (S.D.N.Y. Jan. 31, 2002), aff’d, 379 F.3d 32 (2d Cir. 2004) (Sotomayor, J.). In this Jones Act case, the plaintiff claimed that her husband’s exposure to benzene and polycyclic aromatic hydrocarbons on board ship caused his squamous cell lung cancer. Plaintiff’s expert witness relied heavily upon the IARC categorization of benzene as a “known” carcinogen, and an “oncogene” theory of causation that claimed there was no safe level of exposure because a single molecule could induce cancer. According to the plaintiff’s expert witness, the oncogene theory dispensed with the need to quantify exposure. Then-Judge Sotomayor, citing Sutera, rejected the plaintiff’s no-threshold theory, as well as the argument that exposures exceeding the OSHA permissible exposure limit supported the causal claim.

Mancuso v. Consolidated Edison Co., 967 F. Supp. 1437, 1448 (S.D.N.Y. 1997) (“recommended or prescribed precautionary standards cannot provide legal causation”; “[f]ailure to meet regulatory standards is simply not sufficient” to establish liability)

In re Agent Orange Product Liab. Litig., 597 F. Supp. 740, 781 (E.D.N.Y. 1984) (Weinstein, J.) (“The distinction between avoidance of risk through regulation and compensation for injuries after the fact is a fundamental one.”), aff’d in relevant part, 818 F.2d 145 (2d Cir.1987), cert. denied sub nom. Pinkney v. Dow Chemical Co., 484 U.S. 1004 (1988). Judge Weinstein explained that regulatory action would not by itself support imposing liability for an individual plaintiff.  Id. at 782. “A government administrative agency may regulate or prohibit the use of toxic substances through rulemaking, despite a very low probability of any causal relationship.  A court, in contrast, must observe the tort law requirement that a plaintiff establish a probability of more than 50% that the defendant’s action injured him.” Id. at 785.

In re Ephedra Prods. Liab. Litig., 393 F. Supp. 2d 181, 189 (S.D.N.Y. 2005) (improvidently relying in part upon FDA ban despite “the absence of definitive scientific studies establishing causation”)

Third Circuit

Gates v. Rohm & Haas Co., 655 F.3d 255, 268 (3d Cir. 2011) (affirming the denial of class certification for medical monitoring) (‘‘plaintiffs could not carry their burden of proof for a class of specific persons simply by citing regulatory standards for the population as a whole’’).

In re Schering-Plough Corp. Intron/Temodar Consumer Class Action, 2009 WL 2043604, at *13 (D.N.J. July 10, 2009)(“[T]here is a clear and decisive difference between allegations that actually contest the safety or effectiveness of the Subject Drugs and claims that merely recite violations of the FDCA, for which there is no private right of action.”)

Rowe v. E.I. DuPont de Nemours & Co., Civ. No. 06-1810 (RMB), 2008 U.S. Dist. LEXIS 103528, *46-47 (D.N.J. Dec. 23, 2008) (rejecting reliance upon regulatory findings and risk assessments in which “the basic goal underlying risk assessments . . . is to determine a level that will protect the most sensitive members of the population.”) (quoting David L. Eaton, “Scientific Judgment and Toxic Torts – A Primer in Toxicology for Judges and Lawyers,” 12 J.L. & Pol’y 5, 34 (2003) (“a number of protective, often ‘worst case’ assumptions . . . the resulting regulatory levels . . . generally overestimate potential toxicity levels for nearly all individuals.”))

Soldo v. Sandoz Pharms. Corp., 244 F. Supp. 2d 434, 543 (W.D. Pa. 2003) (finding FDA regulatory proceedings and adverse event reports not adequate or helpful in determining causation; the FDA “ordinarily does not attempt to prove that the drug in fact causes a particular adverse effect.”)

Wade-Greaux v. Whitehall Laboratories, Inc., 874 F. Supp. 1441, 1464 (D.V.I.) (“assumption[s that] may be useful in a regulatory risk-benefit context … ha[ve] no applicability to issues of causation-in-fact”), aff’d, 46 F.3d 1120 (3d Cir. 1994)

O’Neal v. Dep’t of the Army, 852 F. Supp. 327, 333 (M.D. Pa. 1994) (administrative risk figures are “appropriate for regulatory purposes in which the goal is to be particularly cautious [but] overstate the actual risk and, so, are inappropriate for use in determining” civil liability)

Fourth Circuit

Dunn v. Sandoz Pharmaceuticals Corp., 275 F. Supp. 2d 672, 684 (M.D.N.C. 2003) (FDA “risk benefit analysis” “does not demonstrate” causation in any particular plaintiff)

Yates v. Ford Motor Co., 113 F. Supp. 3d 841, 857 (E.D.N.C. 2015) (“statements from regulatory and official agencies … are not bound by standards for causation found in toxic tort law”)

Meade v. Parsley, No. 2:09-cv-00388, 2010 U.S. Dist. LEXIS 125217, * 25 (S.D.W. Va. Nov. 24, 2010) (‘‘Inasmuch as the cost-benefit balancing employed by the FDA differs from the threshold standard for establishing causation in tort actions, this court likewise concludes that the FDA-mandated [black box] warnings cannot establish general causation in this case.’’)

Rhodes v. E.I. du Pont de Nemours & Co., 253 F.R.D. 365, 377 (S.D. W.Va. 2008) (rejecting the relevance of regulatory assessments, which are precautionary and provide no information about actual risk).

Fifth Circuit

Moore v. Ashland Chemical Co., 126 F.3d 679, 708 (5th Cir. 1997) (holding that expert witness could rely upon a material safety data sheet (MSDS) because mandated by the Hazard Communication Act, 29 C.F.R. § 1910.1200), vacated 151 F.3d 269 (5th Cir. 1998) (affirming trial court’s exclusion of expert witness who had relied upon MSDS).

Johnson v. Arkema Inc., 685 F.3d 452, 464 (5th Cir. 2012) (per curiam) (affirming exclusion of expert witness who relied upon regulatory pronouncements; noting the precautionary nature of such statements, and the absence of specificity for the result claimed at the exposures experienced by plaintiff)

Allen v. Pennsylvania Eng’g Corp., 102 F.3d 194, 198-99 (5th Cir. 1996) (“Scientific knowledge of the harmful level of exposure to a chemical, plus knowledge that the plaintiff was exposed to such quantities, are minimal facts necessary to sustain the plaintiffs’ burden in a toxic tort case”; regulatory agencies, charged with protecting public health, employ a lower standard of proof in promulgating regulations than that used in tort cases). The Allen court explained that it was “also unpersuaded that the “weight of the evidence” methodology these experts use is scientifically acceptable for demonstrating a medical link. . . .  Regulatory and advisory bodies. . .utilize a “weight of the evidence” method to assess the carcinogenicity of various substances in human beings and suggest or make prophylactic rules governing human exposure.  This methodology results from the preventive perspective that the agencies adopt in order to reduce public exposure to harmful substances.  The agencies’ threshold of proof is reasonably lower than that appropriate in tort law, which traditionally makes more particularized inquiries into cause and effect and requires a plaintiff to prove that it is more likely than not that another individual has caused him or her harm.” Id.

Burst v. Shell Oil Co., C. A. No. 14–109, 2015 WL 3755953, *8 (E.D. La. June 16, 2015) (explaining Fifth Circuit’s rejection of regulatory “weight of the evidence” approaches to evaluating causation)

Sprankle v. Bower Ammonia & Chem. Co., 824 F.2d 409, 416 (5th Cir. 1987) (affirming the Rule 403 exclusion of evidence of OSHA violations in a claim by a non-employee who experienced respiratory impairment after exposure to anhydrous ammonia; the court found that the jury would likely be confused by regulatory pronouncements)

Cano v. Everest Minerals Corp., 362 F. Supp. 2d 814, 825 (W.D. Tex. 2005) (noting that a product that “has been classified as a carcinogen by agencies responsible for public health regulations is not probative of” common-law specific causation) (finding that the linear no-threshold opinion of the plaintiffs’ expert witness, Malin Dollinger, lacked a satisfactory scientific basis)

Burleson v. Glass, 268 F. Supp. 2d 699, 717 (W.D. Tex. 2003) (“the mere fact that [the product] has been classified by certain regulatory organizations as a carcinogen is not probative on the issue of whether [plaintiff’s] exposure. . .caused his. . .cancers”), aff’d, 393 F.3d 577 (5th Cir. 2004)

Newton v. Roche Labs., Inc., 243 F. Supp. 2d 672, 677, 683 (W.D. Tex. 2002) (FDA’s precautionary decisions on labeling are not a determination of causation of specified adverse events) (“Although evidence of an association may … be important in the scientific and regulatory contexts…, tort law requires a higher standard of causation.”)

Molden v. Georgia Gulf Corp., 465 F. Supp. 2d 606, 611 (M.D. La. 2006) (“regulatory and advisory bodies make prophylactic rules governing human exposure based on proof that is reasonably lower than that appropriate in tort law”)

Sixth Circuit

Nelson v. Tennessee Gas Pipeline Co., 243 F.3d 244, 252-53 (6th Cir. 2001) (exposure above regulatory levels is insufficient to establish causation)

Stites v. Sundstrand Heat Transfer, Inc., 660 F. Supp. 1516, 1525 (W.D. Mich. 1987) (rejecting use of regulatory standards to support claim of increased risk, noting the differences in goals and policies between regulation and litigation)

Mann v. CSX Transportation, Inc., case no. 1:07-Cv-3512, 2009 U.S. Dist. Lexis 106433 (N.D. Ohio Nov. 10, 2009) (rejecting expert testimony that relied upon EPA action levels, and V.A. compensation for dioxin exposure, as basis for medical monitoring opinions)

Baker v. Chevron USA, Inc., 680 F. Supp. 2d 865, 880 (S.D. Ohio 2010) (“[R]egulatory agencies are charged with protecting public health and thus reasonably employ a lower threshold of proof in promulgating their regulations than is used in tort cases.”) (“[t]he mere fact that Plaintiffs were exposed to [the product] in excess of mandated limits is insufficient to establish causation”; rejecting Dr. Dahlgren’s opinion and its reliance upon a “one-hit” or “no threshold” theory of causation in which exposure to one molecule of a cancer-causing agent has some finite possibility of causing a genetic mutation leading to cancer, a theory that may be accepted for purposes of setting regulatory standards, but not as reliable scientific knowledge)

Adams v. Cooper Indus., 2007 WL 2219212 at *7 (E.D. KY 2007).

Seventh Circuit

Wood v. Textron, Inc., No. 3:10 CV 87, 2014 U.S. Dist. LEXIS 34938 (N.D. Ind. Mar. 17, 2014); 2014 U.S. Dist. LEXIS 141593, at *11 (N.D. Ind. Oct. 3, 2014), aff’d, 807 F.3d 827 (7th Cir. 2015). The plaintiffs’ expert witness, Dr. Dahlgren, based his opinions upon the children’s water supply containing vinyl chloride in excess of regulatory levels set by state and federal agencies, including the EPA. Similarly, another expert witness, Ryer-Powder, relied upon exposure levels’ exceeding regulatory permissible limits for her causation opinions. The district court, now with the approval of the Seventh Circuit, would have none of this nonsense. Exceeding governmental regulatory exposure limits does not prove causation. The non-compliance does not help the fact finder without knowing “the specific dangers” that led the agency to set the permissible level, and thus the regulations are not relevant at all without this information. Even with respect to specific causation, the regulatory infraction may be weak or null evidence of causation. (citing Cunningham v. Masterwear Corp., 569 F.3d 673, 674–75 (7th Cir. 2009))

Eighth Circuit

Glastetter v. Novartis Pharms. Corp., 107 F. Supp. 2d 1015, 1036 (E.D. Mo. 2000) (“[T]he [FDA’s] statement fails to affirmatively state that a connection exists between [the drug] and the type of injury in this case.  Instead, it states that the evidence received by the FDA calls into question [drug’s] safety, that [the drug] may be an additional risk factor. . .and that the FDA had new evidence suggesting that therapeutic use of [the drug] may lead to serious adverse experiences.  Such language does not establish that the FDA had concluded that [the drug] can cause [the injury]; instead, it indicates that in light of the limited social utility of [the drug for the use at issue] and the reports of possible adverse effects, the drug should no longer be used for that purpose.”) (emphasis in original), aff’d, 252 F.3d 986, 991 (8th Cir. 2001) (FDA’s precautionary decisions on labeling are not a determination of causation of specified adverse events; “methodology employed by a government agency results from the preventive perspective that the agencies adopt”) (“The FDA will remove drugs from the marketplace upon a lesser showing of harm to the public than the preponderance-of-the-evidence or the more-like-than-not standard used to assess tort liability . . . . [Its] decision that [the drug] can cause [the injury] is unreliable proof of medical causation.”)

Wright v. Willamette Indus., Inc., 91 F.3d 1105, 1107 (8th Cir. 1996) (rejecting claim that plaintiffs were not required to show individual exposure levels to formaldehyde from wood particles). The Wright court elaborated upon the difference between adjudication and regulation of harm:

“Whatever may be the considerations that ought to guide a legislature in its determination of what the general good requires, courts and juries, in deciding cases, traditionally make more particularized inquiries into matters of cause and effect.  Actions in tort for damages focus on the question of whether to transfer money from one individual to another, and under common-law principles (like the ones that Arkansas law recognizes) that transfer can take place only if one individual proves, among other things, that it is more likely than not that another individual has caused him or her harm.  It is therefore not enough for a plaintiff to show that a certain chemical agent sometimes causes the kind of harm that he or she is complaining of.  At a minimum, we think that there must be evidence from which the factfinder can conclude that the plaintiff was exposed to levels of that agent that are known to cause the kind of harm that the plaintiff claims to have suffered. See Abuan v. General Elec. Co., 3 F.3d at 333.  We do not require a mathematically precise table equating levels of exposure with levels of harm, but there must be evidence from which a reasonable person could conclude that a defendant’s emission has probably caused a particular plaintiff the kind of harm of which he or she complains before there can be a recovery.”

Gehl v. Soo Line RR, 967 F.2d 1204, 1208 (8th Cir. 1992).

Nelson v. Am. Home Prods. Corp., 92 F. Supp. 2d 954, 958 (W.D. Mo. 2000) (FDA’s precautionary decisions on labeling are not a determination of causation of specified adverse events)

National Bank of Commerce v. Associated Milk Producers, Inc., 22 F. Supp. 2d 942, 961 (E.D.Ark. 1998), aff’d, 191 F.3d 858 (8th Cir. 1999) 

Junk v. Terminix Internat’l Co., 594 F. Supp. 2d 1062, 1071 (S.D. Iowa 2008) (“government agency regulatory standards are irrelevant to [plaintiff’s] burden of proof in a toxic tort cause of action because of the agency’s preventative perspective”)

Ninth Circuit

Henrickson v. ConocoPhillips Co., 605 F. Supp. 2d 1142, 1156 (E.D. Wash. 2009) (excluding expert witness causation opinions in case involving claims that benzene exposure caused leukemia) 

Lopez v. Wyeth-Ayerst Labs., Inc., 1998 WL 81296, at *2 (9th Cir. Feb. 25, 1998) (FDA’s precautionary decisions on labeling are not a determination of causation of specified adverse events)

In re Epogen & Aranesp Off-Label Marketing & Sales Practices Litig., 2009 WL 1703285, at *5 (C.D. Cal. June 17, 2009) (“have not been proven” allegations are an improper “FDA approval” standard; the FDA’s determination to require warning changes without establishing causation is established does not permit a court or jury, bound by common-law standards, to impose such a duty to warn when common-law causation requirements are not met).

In re Hanford Nuclear Reservation Litig., 1998 U.S. Dist. Lexis 15028 (E.D. Wash. 1998) (radiation and chromium VI), rev’d on other grounds, 292 F.3d 1124 (9th Cir. 2002).

Tenth Circuit

Hollander v. Sandoz Pharm. Corp., 95 F. Supp. 2d 1230, 1239 (W.D. Okla. 2000) (distinguishing the FDA’s threshold of proof as lower than that appropriate in tort law), aff’d in relevant part, 289 F.3d 1193, 1215 (10th Cir. 2002)

Mitchell v. Gencorp Inc., 165 F.3d 778, 783 n.3 (10th Cir. 1999) (benzene and CML) (quoting Allen, 102 F.3d at 198) (state administrative finding that product was a carcinogen was based upon lower administrative standard than tort standard) (“The methodology employed by a government agency “results from the preventive perspective that the agencies adopt in order to reduce public exposure to harmful substances.  The agencies’ threshold of proof is reasonably lower than that appropriate in tort law, which traditionally makes more particularized inquiries into cause and effect and requires a plaintiff to prove it is more likely than not that another individual has caused him or her harm.”)

In re Breast Implant Litig., 11 F. Supp. 2d 1217, 1229 (D.Colo. 1998)

Johnston v. United States, 597 F. Supp. 374, 393-394 (D. Kan.1984) (noting that the linear no-threshold hypothesis is based upon a prudent assumption designed to overestimate risk; speculative hypotheses are not appropriate in determining whether one person has harmed another)

Eleventh Circuit

Rider v. Sandoz Pharmaceuticals Corp., 295 F.3d 1194, 1201 (11th Cir. 2002) (FDA may take regulatory action, such as revising warning labels or withdrawing drug from the market ‘‘upon a lesser showing of harm to the public than the preponderance-of-the-evidence or more-likely-than-not standard used to assess tort liability’’) (“A regulatory agency such as the FDA may choose to err on the side of caution. Courts, however, are required by the Daubert trilogy to engage in objective review of the evidence to determine whether it has sufficient scientific basis to be considered reliable.”)

McClain v. Metabolife Internat’l, Inc., 401 F.3d 1233, 1248-1250 (11th Cir. 2005) (ephedra) (allowing that regulators “may pay heed to any evidence that points to a need for caution,” and apply “a much lower standard than that which is demanded by a court of law”) (“[U]se of FDA data and recommendations raises a more subtle methodological issue in a toxic tort case. The issue involves identifying and contrasting the type of risk assessment that a government agency follows for establishing public health guidelines versus an expert analysis of toxicity and causation in a toxic tort case.”)

In re Seroquel Products Liab. Litig., 601 F. Supp. 2d 1313, 1315 (M.D. Fla. 2009) (noting that administrative agencies “impose[] different requirements and employ[] different labeling and evidentiary standards” because a “regulatory system reflects a more prophylactic approach” than the common law)

Siharath v. Sandoz Pharmaceuticals Corp., 131 F. Supp. 2d 1347, 1370 (N.D. Ga. 2001) (“The standard by which the FDA deems a drug harmful is much lower than is required in a court of law.  The FDA’s lesser standard is necessitated by its prophylactic role in reducing the public’s exposure to potentially harmful substances.”), aff’d, 295 F.3d 1194 (11th Cir. 2002)

In re Accutane Products Liability, 511 F.Supp.2d 1288, 1291-92 (M.D. Fla. 2007)(acknowledging that regulatory risk assessments are not necessarily realistic in human populations because they are often based upon animal studies, and that the important differences between experimental animals and humans are substantial in various health outcomes).

Kilpatrick v. Breg, Inc., 2009 WL 2058384 at * 6-7 (S.D. Fla. 2009) (excluding plaintiff’s expert witness), aff’d, 613 F.3d 1329 (11th Cir. 2010)

District of Columbia Circuit

Ethyl Corp. v. E.P.A., 541 F.2d 1, 28 & n. 58 (D.C. Cir. 1976) (detailing the precautionary nature of agency regulations that may be based upon suspicions)

STATE COURTS

Arizona

Lofgren v. Motorola, 1998 WL 299925 (Ariz. Super. Ct. 1998) (finding plaintiffs’ expert witnesses’ testimony that TCE caused cancer to be not generally accepted; “it is appropriate public policy for health organizations such as IARC and the EPA to make judgments concerning the health and safety of the population based on evidence which would be less than satisfactory to support a specific plaintiff’s tort claim for damages in a court of law”)

Colorado

Salazar v. American Sterilizer Co., 5 P.3d 357 (Colo. Ct. App. 2000) (allowing testimony about harmful ethylene oxide exposure based upon OSHA regulations)

Georgia

Butler v. Union Carbide Corp., 712 S.E.2d 537, 552 & n.37 (Ga. App. 2011) (distinguishing risk assessment from causation assessment; citing the New York Court of Appeals decision in Parker for correctly rejecting reliance on regulatory pronouncements for causation determinations)

Illinois

La Salle Nat’l Bank v. Malik, 705 N.E.2d 938 (Ill. App. 3d) (reversing trial court’s exclusion of OSHA PEL for ethylene oxide), writ pet’n den’d, 714 N.E.2d 527 (Ill. 2d 1999)

New York

Parker v. Mobil Oil Corp., 7 N.Y.3d 434, 450, 857 N.E.2d 1114, 1122, 824 N.Y.S.2d 584 (N.Y. 2006) (noting that regulatory agency standards usually represent precautionary-principle efforts deliberately to err on the side of prevention; “standards promulgated by regulatory agencies as protective measures are inadequate to demonstrate legal causation.”)

In re Bextra & Celebrex, 2008 N.Y. Misc. LEXIS 720, *20, 239 N.Y.L.J. 27 (2008) (characterizing FDA Advisory Panel recommendations as regulatory standard and protective measure).

Juni v. A.O. Smith Water Products Co., 48 Misc. 3d 460, 11 N.Y.S.3d 416, 432, 433 (N.Y. Cty. 2015) (“the reports and findings of governmental agencies [declaring there to be no safe dose of asbestos] are irrelevant as they constitute insufficient proof of causation”), aff’d, 32 N.Y.3d 1116, 116 N.E.3d 75, 91 N.Y.S.3d 784 (2018)

Ohio

Valentine v. PPG Industries, Inc., 821 N.E.2d 580, 597-98 (Ohio App. 2004), aff’d, 850 N.E.2d 683 (Ohio 2006). 

Pennsylvania

Betz v. Pneumo Abex LLC, 44 A. 3d 27 (Pa. 2012).

Texas

Borg-Warner Corp. v. Flores, 232 S.W.3d 765, 770 (Tex. 2007)

Exxon Corp. v. Makofski, 116 S.W.3d 176, 187-88 (Tex. App. 2003) (describing “standards used by OSHA [and] the EPA” as inadequate for causal determinations)


[1] Michael D. Green, D. Michal Freedman, and Leon Gordis, “Reference Guide on Epidemiology,” in Reference Manual on Scientific Evidence 549, 627 (3d ed. 2011).

[2] Margaret A. Berger, “The Supreme Court’s Trilogy on the Admissibility of Expert Testimony,” in Reference Manual On Scientific Evidence at 33 (Fed. Jud. Center 2d. ed. 2000).

[3] Margaret A. Berger, “Introduction to the Symposium,” 12 J. L. & Pol’y 1 (2003). Professor Berger described the symposium as a “felicitous outgrowth of a grant from the Common Benefit Trust established in the Silicone Breast Implant Products Liability Litigation to hold a series of conferences at Brooklyn Law School.” Id. at 1. Ironically, that “Trust” was nothing more than the walking-around money of plaintiffs’ lawyers from the Silicone-Gel Breast Implant MDL 926. Although Professor Berger was often hostile to the causation requirement in tort law, her symposium included some well-qualified scientists who amplified her point from the Reference Manual about the divide between regulatory risk assessment and scientific causal assessments.

[4] David L. Eaton, “Scientific Judgment and Toxic Torts – A Primer in Toxicology for Judges and Lawyers,” 12 J.L. & Pol’y 5, 36 (2003). See also Joseph V. Rodricks and Susan H. Rieth, “Toxicological risk assessment in the courtroom: are available methodologies suitable for evaluating toxic tort and product liability claims?” 27 Regul. Toxicol. & Pharmacol. 21, 27 (1998) (“The public health-oriented resolution of scientific uncertainty [used by regulators] is not especially helpful to the problem faced by a court.”)

[5] EPA “Guidelines for Carcinogen Risk Assessment” at 13 (1986).

[6] The approach is set out in FDA, M7 (R1) Assessment and Control of DNA Reactive (Mutagenic) Impurities in Pharmaceuticals to Limit Potential Carcinogenic Risk: Guidance for Industry (2018) [FDA M7]. This FDA guidance is essentially an adoption of the M7 document of the Expert Working Group (Multidisciplinary) of the International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use (ICH).

[7] FDA M7 at 3.

[8] FDA M7 at 5.

[9] FDA M7 at 5 (emphasis added).

[10] See Labeling of Diphenhydramine Containing Drug Products for Over-the-Counter Human Use, 67 Fed. Reg. 72,555, at 72,556 (Dec. 6, 2002) (“FDA’s decision to act in an instance such as this one need not meet the standard of proof required to prevail in a private tort action. . .. To mandate a warning or take similar regulatory action, FDA need not show, nor do we allege, actual causation.”) (citing Glastetter).

[11] FDA M7 at “Acceptable Intakes in Relation to Less-Than-Lifetime (LTL) Exposure (7.3).”

[12] FDA M7 at 12 (“Mutagenic Impurities With Evidence for a Practical Threshold (7.2.2)”).

The American Tort Law Museum

March 14th, 2022

Last year, Professor Christopher J. Robinette wrote a blog post about the American Tort Law Museum. I had not heard of it, but I was curious. I have stopped by the Museum’s website on a few occasions to learn more.

The Museum’s website describes it as “the nationally acclaimed American Museum of Tort Law,” which seems hyperbolic. I suppose that, as long as it is the only museum of tort law, it might as well call itself “the” museum of tort law.

Other than Professor Robinette’s post, I have not read anything about this museum, but perhaps I was somehow left in the dark. The museum’s physical location is in Winsted, Connecticut, about 40 km northwest of downtown Hartford, in the middle of nowhere. Hardly a place for a nationally acclaimed museum, although Congressman John B. Larson is apparently very happy to have this museum in the boondocks of Connecticut.[1]

The website states that the museum seeks to “educate, inform and inspire Americans about two things: Trial by jury; and the benefits of tort law.” Well, “trial by jury” is like God and apple pie, but I am an atheist and I prefer blueberry pie. Trial by jury is great when the Crown is trying to take your property or your life, but I am a skeptic when it comes to juries’ deciding technical and scientific issues. And the “benefits of tort law”? Well, there are some, but does the museum inform about the many detriments and harms of tort law?

Browsing the website quickly answers the questions. There are case studies of what at least plaintiffs’ tort lawyers might consider benefits ($$$) of tort law, with call-outs to notable cases that resulted in large awards, and perhaps a few that may have led to safer products. The “nationally acclaimed” museum has nothing, at least in its online presence, about the detriments, irrationality, or failures of tort law. You will not find anything about crime and fraud among the ranks of plaintiffs’ lawyers; nor will you find anything about successful defenses that shut down entire litigations. Nothing here about Dickie Scruggs in prison garb, or about John Edwards’ love child. Hmm, you may be getting a sense that this is a lopsided, partisan effort. Indeed, the museum is a temple to the Lawsuit Industry, and with the exception of one anomalous defense lawyer, its “founders” are the muckety-mucks of the plaintiffs’ bar.

Among the founders are Peter Angelos, F. Scott Baldwin, Frederick Baron, Thomas V. Girardi, Robert L. Habush, James F. Humphreys, Tommy Jacks, Joseph D. Jamail Jr., and various rent-seeking organizations, such as Center for Study of Responsive Law, Public Citizen, Public Safety Institute, and Safety Systems Foundation.

You can see who else is associated with this propaganda effort. For education about civics and the right to a jury trial, I prefer the House of Terror, in Budapest.


[1] John B. Larson, “Recognizing the American Museum of Tort Law’s Second Anniversary,” Cong. Rec. E1475 (Nov. 1, 2017).

Hindsight Bias – In Science & in the Law

February 27th, 2022

In the early 1970s, Amos Tversky and Daniel Kahneman raised awareness of hindsight bias as a pervasive phenomenon in all human judgment.[1] Although these insights seem obvious in hindsight, the existence and extent of hindsight bias were first tested directly in a now-classic experimental paper by Baruch Fischhoff.[2] The lack of awareness of how hindsight bias affects our historical judgments seriously limits our ability to judge the past.

Kahneman’s participation in the planning phase of a new, fourth edition of the Reference Manual on Scientific Evidence is a hopeful sign that his insights and the research of many psychologists will gain fuller recognition in the law. Hindsight bias afflicts judges, lawyers, jurors, expert witnesses, scientists, physicians, and children of all ages.[3]

Hindsight Bias in the Law

Sixth Amendment Challenges

Challenges to the effectiveness of legal counsel are a mainstay of habeas petitions filed by convicted felons. In hindsight, their lawyers’ conduct seems woefully inadequate. In judging such claims of ineffectiveness, the United States Supreme Court acknowledged the role and influence of hindsight bias in judging trial counsel’s strategic decisions:

“A fair assessment of attorney performance requires that every effort be made to eliminate the distorting effects of hindsight, to reconstruct the circumstances of counsel’s challenged conduct, and to evaluate the conduct from counsel’s perspective at the time. Because of the difficulties inherent in making the evaluation, a court must indulge a strong presumption that counsel’s conduct falls within the wide range of reasonable professional assistance; that is, the defendant must overcome the presumption that, under the circumstances, the challenged action might be considered sound trial strategy.”[4]

This decision raises the interesting question why there is not a strong presumption of reasonableness in other legal contexts, such as the “reasonableness” of physician judgments, or of adequate warnings.

Medical Malpractice

There is little doubt that retrospective judgments of the reasonableness of medical decisions are infected, distorted, and corrupted by hindsight bias.[5] In the words of one paper on the subject:

“There is evidence that hindsight bias, which may cause the expert to simplify, trivialise and criticise retrospectively the decisions of the treating doctor, is inevitable when the expert knows there has been an adverse outcome.”[6]

Requiring the finder of fact to assess the reasonableness of complex medical judgments in hindsight, with knowledge of the real-world outcomes of the prior judgments, poses a major threat to fairness in the trial process, in both bench and jury trials. Curiously, lawyers receive a “strong presumption” of reasonableness, but physicians and manufacturers do not.

Patent Litigation

Hindsight bias plays a large role in challenges to patent validity. Works of genius seem obvious with hindsight. In the context of judging patent criteria such as non-obviousness, the Supreme Court has emphasized that:

“A factfinder should be aware, of course, of the distortion caused by hindsight bias and must be cautious of arguments reliant upon ex post reasoning.”[7]

Certainly, factfinders in every kind of litigation, not just intellectual property cases, should be made aware of the distortion caused by hindsight bias.

Remedies

In all likelihood, hindsight bias can never be fully corrected. At a minimum, factfinders should be educated about the phenomenon. In criminal cases, defendants have called psychologists to testify about the inherent difficulties in eyewitness or cross-race identification.[8] In New Jersey, trial courts must give a precautionary instruction in criminal cases that involve eyewitness identification.[9] In some but not all discrimination cases, courts have permitted expert witness opinion testimony about “implicit bias.”[10] In “long-tail” litigation, in which jurors must consider the reasonableness of warning decisions, or claims of failure to test, decades before the trial, defendants may well want to consider calling a psychologist to testify about the reality of hindsight bias, and how it leads to incorrect judgments about past events.

Another, independent remedy would be for the trial court to give a jury instruction on hindsight bias.  After all, the Supreme Court has clearly stated that “[a] factfinder should be aware, of course, of the distortion caused by hindsight bias and must be cautious of arguments reliant upon ex post reasoning.” The trial judge should set the stage for a proper consideration of past events, by alerting jurors to the reality and seductiveness of hindsight bias. What follows is a first attempt at such an instruction. I would love to hear from anyone who has submitted a proposed instruction on the issue.

Members of the jury, this case will require you to determine what scientists knew or should have known at a time in the past. At the same time that you try to make this determination, you will have been made aware of what is now known. Psychological research clearly shows that all human beings, regardless of their age, education, or life circumstances, have what is known as hindsight bias. Having this bias means that we all tend to assume that people in times past should have known what we now in fact know. Calling it a bias is a way of saying that this assumption is wrong. To decide this case fairly, you must try to determine what people, including experts in the field, actually knew and did before there were more recent discoveries, and without reference to what is now known and accepted.


[1] Amos Tversky & Daniel Kahneman, “Judgment under uncertainty: heuristics and Biases,” 185 Science 1124 (1974). See alsoPeople Get Ready – There’s a Reference Manual a Comin’ ”(June 6, 2021).

[2] Baruch Fischhoff, “Hindsight ≠ foresight: the effect of outcome knowledge on judgment under uncertainty,” 1 Experimental Psychology: Human Perception & Performance 288, 288 (1975), reprinted in 12 Quality & Safety Health Care 304 (2003); Baruch Fischhoff & Ruth Beyth, “I knew it would happen: Remembered probabilities of once – future things?” 13 Organizational Behavior & Human Performance 1 (1975); see Baruch Fischhoff, “An Early History of Hindsight Research,” 25 Social Cognition 10 (2007).

[3] See Daniel M. Bernstein, Edgar Erdfelder, Andrew N. Meltzoff, William Peria & Geoffrey R. Loftus, “Hindsight Bias from 3 to 95 Years of Age,” 37 J. Experimental Psychol., Learning, Memory & Cognition, 378 (2011).

[4] Strickland v. Washington, 466 U.S. 668, 689, 104 S.Ct. 2052, 2052 (1984); see also Feldman v. Thaler, 695 F.3d 372, 378 (5th Cir. 2012).

[5] Edward Banham-Hall & Sian Stevens, “Hindsight bias critically impacts on clinicians’ assessment of care quality in retrospective case note review,” 19 Clinical Medicine 16 (2019); Thom Petty, Lucy Stephenson, Pierre Campbell & Terence Stephenson, “Outcome Bias in Clinical Negligence Medicolegal Cases,” 26 J.Law & Med. 825 (2019); Leonard Berlin, “Malpractice Issues and Radiology – Hindsight Bias” 175 Am. J. Radiol. 597 (2000); Leonard Berlin, “Outcome Bias,” 183 Am. J. Radiol. 557 (2004); Thomas B. Hugh & Sidney W. A. Dekker, “Hindsight bias and outcome bias in the social construction of medical negligence: a review,” 16 J. Law. Med. 846 (2009).

[6] Thomas B. Hugh & G. Douglas Tracy, “Hindsight Bias in Medicolegal Expert Reports,” 176 Med. J. Australia 277 (2002).

[7] KSR International Co. v. Teleflex Inc., 550 U.S. 398, 127 S.Ct. 1727, 1742 (2007) (emphasis added; internal citations omitted).

[8] See Commonwealth v. Walker, 92 A.3d 766 (Pa. 2014) (Todd, J.) (rejecting per se inadmissibility of eyewitness expert witness opinion testimony).

[9] State v. Henderson, 208 N.J. 208, 27 A.3d 872 (2011).

[10] Samaha v. Wash. State Dep’t of Transp., No. cv-10-175-RMP, 2012 WL 11091843, at *4 (E.D. Wash. Jan. 3, 2012) (holding that an expert witness’s proffered opinions about the “concepts of implicit bias and stereotypes is relevant to the issue of whether an employer intentionally discriminated against an employee.”).

Of Significance, Error, Confidence & Confusion – In Law & Statistics

February 27th, 2022

A version of this post appeared previously on Professor Deborah Mayo’s blog, Error Statistics Philosophy. The post was invited as a comment on Professor Mayo’s article in Conservation Biology, which is cited and discussed below. Other commentators had important, insightful comments that can be found at Error Statistics Philosophy.[1] These commentators and many others participated in a virtual special session of Professor Mayo’s “Phil Stat Forum,” on January 11, 2022. This session, “Statistical Significance Test Anxiety,” was moderated by David Hand, and included presentations by Deborah Mayo and Yoav Benjamini. The presenters’ slides, as well as a video of the session, are now online.

*      *     *     *     *     *     *     *

The metaphor of law as an “empty vessel” is frequently invoked to describe the law generally, as well as pejoratively to describe lawyers. The metaphor rings true at least in describing how the factual content of legal judgments comes from outside the law. In many varieties of litigation, not only the facts and data, but the scientific and statistical inferences must be added to the “empty vessel” to obtain a correct and meaningful outcome.

Once upon a time, the expertise component of legal judgments came from so-called expert witnesses, who were free to opine about claims of causality solely by showing that they had more expertise than the lay jurors. In Pennsylvania, for instance, the standard for qualifying witnesses to give “expert opinions” was to show that they had “a reasonable pretense to expertise on the subject.”

In the 19th and the first half of the 20th century, causal claims, whether of personal injuries, discrimination, or whatever, virtually always turned on a conception of causation as necessary and sufficient to bring about the alleged harm. In discrimination claims, plaintiffs pointed to the “inexorable zero,” in cases in which no Black citizen had ever been seated on a grand jury in a particular county since the demise of Reconstruction. In health claims, the mode of reasoning usually followed something like Koch’s postulates.

The second half of the 20th century was marked by the rise of stochastic models in our understanding of the world. The consequence was that statistical inference made its way into the empty vessel. The rapid introduction of statistical thinking into the law did not always go well. In a seminal 1977 discrimination case, Castaneda v. Partida,[2] in an opinion by Associate Justice Blackmun, the Court calculated a binomial probability for observing the sample result (rather than a result at least as extreme), and mislabeled the measurement “standard deviations” rather than standard errors:

“As a general rule for such large samples, if the difference between the expected value and the observed number is greater than two or three standard deviations, then the hypothesis that the jury drawing was random would be suspect to a social scientist. The 11-year data here reflect a difference between the expected and observed number of Mexican-Americans of approximately 29 standard deviations. A detailed calculation reveals that the likelihood that such a substantial departure from the expected value would occur by chance is less than 1 in 10¹⁴⁰.”[3]

Justice Blackmun was graduated from Harvard College, summa cum laude, with a major in mathematics.
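For readers who want to see the arithmetic behind the Court’s footnote, here is a minimal sketch in Python, using the figures roughly as I read them from the opinion (a jury-eligible population about 79.1 percent Mexican-American, 870 persons summoned over the 11 years, and 339 Mexican-Americans among them); the numbers are included only for illustration. The sketch computes the binomial expectation, the standard error of the count (the apter label for what the Court called a “standard deviation”), the resulting z-score of roughly 29, and the probability of the exact observed count, which is indeed below 1 in 10¹⁴⁰.

```python
from math import lgamma, log, sqrt

def log10_binom_pmf(k, n, p):
    """Base-10 logarithm of the binomial point probability P(X = k)."""
    log_pmf = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
               + k * log(p) + (n - k) * log(1 - p))
    return log_pmf / log(10)

# Figures roughly as reported in the opinion (for illustration only)
p = 0.791        # Mexican-American share of the county population
n = 870          # persons summoned for grand jury service over 11 years
observed = 339   # Mexican-Americans among those summoned

expected = n * p                 # about 688
se = sqrt(n * p * (1 - p))       # about 12: a standard error of the count
z = (expected - observed) / se   # about 29, the Court's "standard deviations"

print(f"expected = {expected:.0f}, standard error = {se:.1f}, z = {z:.1f}")
print(f"P(exactly {observed} Mexican-American jurors) is about 10^{log10_binom_pmf(observed, n, p):.0f}")
# A more conventional tail calculation, P(X <= 339), is of the same
# astronomically small order of magnitude.
```

Nothing in the sketch turns on the precise inputs; the point is that the roughly 29 “standard deviations” are standard errors of a count under a model of race-neutral random selection.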

Despite the extreme statistical disparity in the 11-year run of grand juries, Justice Blackmun’s opinion provoked a robust rejoinder, not only on the statistical analysis, but on the Court’s failure to account for obvious omitted confounding variables in its simplistic analysis. And then there were the inconvenient facts that Mr. Partida was a rapist, indicted by a grand jury (50% with “Hispanic” names), which had been appointed by jury commissioners (3/5 Hispanic). Partida was convicted by a petit jury (7/12 Hispanic), in front of a trial judge who was Hispanic, and his petition for a writ of habeas corpus was denied by Judge Garza, who went on to become a member of the Court of Appeals. In any event, Justice Blackmun’s dictum about “two or three” standard deviations soon shaped the outcome of many thousands of discrimination cases, and was translated into a necessary p-value of 5%.

Beginning in the early 1960s, statistical inference became an important feature of tort cases that involved claims based upon epidemiologic evidence. In such health-effects litigation, the judicial handling of concepts such as p-values and confidence intervals often went off the rails. In 1989, the United States Court of Appeals for the Fifth Circuit decided an appeal involving expert witnesses who relied upon epidemiologic studies by concluding that it did not have to resolve questions of bias and confounding because the studies relied upon had presented their results with confidence intervals.[4] Judges and expert witnesses persistently interpreted a single confidence interval from one study as having a 95 percent probability of containing the actual parameter.[5] Similarly, many courts and counsel committed the transposition fallacy in interpreting p-values as posterior probabilities for the null hypothesis.[6]
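The correct interpretation is easy to demonstrate with a short simulation: the 95 percent figure describes the long-run performance of the interval-constructing procedure over repeated studies, not the probability that any single reported interval contains the true value. The sketch below is purely illustrative; the true mean, standard deviation, and sample sizes are invented for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(42)

# Illustrative "truth": these parameters are invented for the demonstration.
true_mean, true_sd = 10.0, 2.0
n, n_studies = 30, 10_000
t_crit = 2.045  # two-sided 95% critical value of Student's t with 29 degrees of freedom

covered = 0
for _ in range(n_studies):
    sample = rng.normal(true_mean, true_sd, n)
    m = sample.mean()
    se = sample.std(ddof=1) / np.sqrt(n)
    if m - t_crit * se <= true_mean <= m + t_crit * se:
        covered += 1

print(f"coverage over {n_studies:,} simulated studies: {covered / n_studies:.3f}")
# Prints a number close to 0.95. The 95% describes the long-run behavior of the
# procedure; any single computed interval either contains the true mean or it does not.
```

For any one published interval, the parameter is either inside it or not, which is why treating a single interval as having a “95 percent probability of containing the parameter” misstates what the procedure warrants.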

Against this backdrop of mistaken and misrepresented interpretation of p-values, the American Statistical Association’s p-value statement was a helpful and understandable restatement of basic principles.[7] Within a few weeks, however, citations to the p-value Statement started to show up in the briefs and examinations of expert witnesses, to support contentions that p-values (or any procedure to evaluate random error) were unimportant, and should be disregarded.[8]

In 2019, Ronald Wasserstein, the ASA executive director, along with two other authors, wrote an editorial that explicitly called for the abandonment of “statistical significance.”[9] Although the piece was labeled “editorial,” the journal provided no disclaimer that Wasserstein was not speaking ex cathedra.

The absence of a disclaimer provoked much confusion. Indeed, Brian Tarran, the editor of Significance, published jointly by the ASA and the Royal Statistical Society, wrote an editorial interpreting the Wasserstein editorial as an official ASA “recommendation.” Tarran ultimately retracted his interpretation, but only in response to a pointed letter to the editor.[10] Tarran adverted to a misleading press release from the ASA as the source of his confusion. Inquiring minds might wonder why the ASA allowed such misleading press releases to go out.

In addition to press releases, some people in the ASA started to send emails to journal editors, to nudge them to abandon statistical significance testing on the basis of what seemed like an ASA recommendation. For the most part, this campaign was unsuccessful in the major biomedical journals.[11]

While this controversy was unfolding, then-President Karen Kafadar of the ASA stepped into the breach to state definitively that the Executive Director was not speaking for the ASA.[12] In November 2019, the ASA board of directors approved a motion to create a “Task Force on Statistical Significance and Replicability.” Its charge was “to develop thoughtful principles and practices that the ASA can endorse and share with scientists and journal editors. The task force will be appointed by the ASA President with advice and participation from the ASA Board.”

Professor Mayo’s editorial has done the world of statistics, as well as the legal world of judges, lawyers, and legal scholars, a service in calling attention to the peculiar intellectual conflicts of interest that played a role in the editorial excesses of some of the ASA’s leadership. From a lawyer’s perspective, it is clear that courts have been misled and distracted by some ASA officials who seem to have worked to undermine a consensus position paper on p-values.[13]

Curiously, the Task Force’s report did not find a home in any of the ASA’s several scholarly publications. Instead, “The ASA President’s Task Force Statement on Statistical Significance and Replicability”[14] appeared in The Annals of Applied Statistics, where it is accompanied by an editorial by former ASA President Karen Kafadar.[15] In November 2021, the ASA’s official “magazine,” Chance, also published the Task Force’s Statement.[16]

Judges and litigants who must navigate claims of statistical inference need guidance on the standard of care that scientists and statisticians should use in evaluating such claims. Although the Task Force did not elaborate, it advanced five basic propositions, which had been obscured by many of the recent glosses on the ASA 2016 p-value statement, and by the 2019 editorial discussed above:

  1. “Capturing the uncertainty associated with statistical summaries is critical.”
  2. “Dealing with replicability and uncertainty lies at the heart of statistical science. Study results are replicable if they can be verified in further studies with new data.”
  3. “The theoretical basis of statistical science offers several general strategies for dealing with uncertainty.”
  4. “Thresholds are helpful when actions are required.”
  5. “P-values and significance tests, when properly applied and interpreted, increase the rigor of the conclusions drawn from data.”

Although the Task Force’s Statement will not end the debate or the “wars,” it will go a long way to correct the contentions made in court about the insignificance of significance testing, while giving courts a truer sense of the professional standard of care with respect to statistical inference in evaluating claims of health effects.


[1] Commentators included John Park, MD; Brian Dennis, Ph.D.; Philip B. Stark, Ph.D.; Kent Staley, Ph.D.; Yudi Pawitan, Ph.D.; Brian Hennig, Ph.D.; Brian Haig, Ph.D.; and Daniël Lakens, Ph.D.

[2] Castaneda v. Partida, 430 U.S. 482 (1977).

[3] Id. at 496 n.17.

[4] Brock v. Merrell Dow Pharmaceuticals, Inc., 874 F.2d 307, 311-12 (5th Cir. 1989).

[5] Richard W. Clapp & David Ozonoff, “Environment and Health: Vital Intersection or Contested Territory?” 30 Am. J. L. & Med. 189, 210 (2004) (“Thus, a RR [relative risk] of 1.8 with a confidence interval of 1.3 to 2.9 could very likely represent a true RR of greater than 2.0, and as high as 2.9 in 95 out of 100 repeated trials.”) (Both authors testify for claimants in cases involving alleged environmental and occupational harms.); Schachtman, “Confidence in Intervals and Diffidence in the Courts” (Mar. 4, 2012) (collecting numerous examples of judicial offenders).

[6] See, e.g., In re Ephedra Prods. Liab. Litig., 393 F.Supp. 2d 181, 191, 193 (S.D.N.Y. 2005) (Rakoff, J.) (credulously accepting counsel’s argument that the use of a critical value of less than 5% of significance probability increased the “more likely than not” burden of proof upon a civil litigant). The decision has been criticized in the scholarly literature, but it is still widely cited without acknowledging its error. See Michael O. Finkelstein, Basic Concepts of Probability and Statistics in the Law 65 (2009).

[7] Ronald L. Wasserstein & Nicole A. Lazar, “The ASA’s Statement on p-Values: Context, Process, and Purpose,” 70 The Am. Statistician 129 (2016); see “The American Statistical Association’s Statement on and of Significance” (March 17, 2016). The commentary beyond the “bold faced” principles was at times less helpful in suggesting that there was something inherently inadequate in using p-values. With the benefit of hindsight, this commentary appears to represent editorializing by the authors, and not the sense of the expert committee that agreed to the six principles.

[8] Schachtman, “The American Statistical Association Statement on Significance Testing Goes to Court, Part I” (Nov. 13, 2018), “Part II” (Mar. 7, 2019).

[9] Ronald L. Wasserstein, Allen L. Schirm, and Nicole A. Lazar, “Editorial: Moving to a World Beyond ‘p < 0.05’,” 73 Am. Statistician S1, S2 (2019); see Schachtman, “Has the American Statistical Association Gone Post-Modern?” (Mar. 24, 2019).

[10] Brian Tarran, “THE S WORD … and what to do about it,” Significance (Aug. 2019); Donald Macnaughton, “Who Said What,” Significance 47 (Oct. 2019).

[11] See, e.g., David Harrington, Ralph B. D’Agostino, Sr., Constantine Gatsonis, Joseph W. Hogan, David J. Hunter, Sharon-Lise T. Normand, Jeffrey M. Drazen, and Mary Beth Hamel, “New Guidelines for Statistical Reporting in the Journal,” 381 New Engl. J. Med. 285 (2019); Jonathan A. Cook, Dean A. Fergusson, Ian Ford, Mithat Gonen, Jonathan Kimmelman, Edward L. Korn, and Colin B. Begg, “There is still a place for significance testing in clinical trials,” 16 Clin. Trials 223 (2019).

[12] Karen Kafadar, “The Year in Review … And More to Come,” AmStat News 3 (Dec. 2019); see also Kafadar, “Statistics & Unintended Consequences,” AmStat News 3,4 (June 2019).

[13] Deborah Mayo, “The statistics wars and intellectual conflicts of interest,” 36 Conservation Biology (2022) (in-press, online Dec. 2021).

[14] Yoav Benjamini, Richard D. De Veaux, Bradley Efron, Scott Evans, Mark Glickman, Barry Graubard, Xuming He, Xiao-Li Meng, Nancy Reid, Stephen M. Stigler, Stephen B. Vardeman, Christopher K. Wikle, Tommy Wright, Linda J. Young, and Karen Kafadar, “The ASA President’s Task Force Statement on Statistical Significance and Replicability,” 15 Annals of Applied Statistics (2021) (in press).

[15] Karen Kafadar, “Editorial: Statistical Significance, P-Values, and Replicability,” 15 Annals of Applied Statistics (2021).

[16] Yoav Benjamini, Richard D. De Veaux, Bradley Efron, Scott Evans, Mark Glickman, Barry I. Graubard, Xuming He, Xiao-Li Meng, Nancy M. Reid, Stephen M. Stigler, Stephen B. Vardeman, Christopher K. Wikle, Tommy Wright, Linda J. Young & Karen Kafadar, “ASA President’s Task Force Statement on Statistical Significance and Replicability,” 34 Chance 10 (2021).

Confounded by Confounding in Unexpected Places

December 12th, 2021

In assessing an association for causality, the starting point is “an association between two variables, perfectly clear-cut and beyond what we would care to attribute to the play of chance.”[1] In other words, before we even take up Bradford Hill’s nine considerations, we should have ruled out chance, bias, and confounding as explanations for the claimed association.[2]

Although confounding is sometimes considered as a type of systematic bias, its importance warrants its own category. Historically, courts have been rather careless in addressing confounding. The Supreme Court, in a case decided before Daubert and the statutory modifications to Rule 702, ignored the role of confounding in a multiple regression model used to support racial discrimination claims. In language that would be reprised many times to avoid and evade the epistemic demands of Rule 702, the Court held, in Bazemore, that the omission of variables in multiple regression models raises an issue that affects “the  analysis’ probativeness, not its admissibility.”[3]
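A small simulation, with invented variable names and effect sizes, illustrates why an omitted variable can be more than a matter of “probativeness”: in the sketch below, the outcome depends only on a lurking covariate, yet a regression that omits that covariate attributes a substantial effect to the exposure with which the covariate is correlated.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5_000

# Invented illustration: the lurking variable drives the outcome entirely,
# and it is also correlated with the exposure of interest.
lurking = rng.normal(size=n)
exposure = 0.8 * lurking + rng.normal(size=n)   # correlated with the lurking variable
outcome = 2.0 * lurking + rng.normal(size=n)    # the exposure has NO true effect

def ols_coefficient(y, *covariates):
    """OLS coefficient on the first covariate, from a model with an intercept."""
    X = np.column_stack([np.ones_like(y), *covariates])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]

print(f"exposure coefficient, lurking variable omitted:  {ols_coefficient(outcome, exposure):.2f}")
print(f"exposure coefficient, lurking variable included: {ols_coefficient(outcome, exposure, lurking):.2f}")
# The first model reports a sizeable "effect" (close to 1.0); the correctly
# specified model reports an effect near zero.
```

Whether such an incomplete model merely loses “probativeness,” or instead fails the reliability requirement of Rule 702, is of course the legal question that the Bazemore dictum glosses over.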

When courts have not ignored confounding,[4] they have sidestepped its consideration by imparting magical abilities to confidence intervals to take care of the problems posed by lurking variables.[5]

The advent of the Reference Manual on Scientific Evidence allowed a ray of hope to shine on health-effects litigation. Several important cases have been decided by judges who have taken note of the importance of assessing studies for confounding.[6] As a new, fourth edition of the Manual is being prepared, its editors and authors should not lose sight of the work that remains to be done.

The Third Edition of the Federal Judicial Center’s and the National Academies of Science, Engineering & Medicine’s Reference Manual on Scientific Evidence (RMSE3d 2011) addressed confounding in several chapters, not always consistently. The chapter on statistics defined “confounder” in terms of correlation between both the independent and dependent variables:

“[a] confounder is correlated with the independent variable and the dependent variable. An association between the dependent and independent variables in an observational study may not be causal, but may instead be due to confounding”[7]

The chapter on epidemiology, on the other hand, defined a confounder as a risk factor for both the exposure and disease outcome of interest:

“A factor that is both a risk factor for the disease and a factor associated with the exposure of interest. Confounding refers to a situation in which an association between an exposure and outcome is all or partly the result of a factor that affects the outcome but is unaffected by the exposure.”[8]

Unfortunately, the epidemiology chapter never defined “risk factor.” The term certainly seems much less neutral than a “correlated” variable, which lacks any suggestion of causality. Perhaps there is some implied help from the authors of the epidemiology chapter when they described a case of confounding by “known causal risk factors,” which suggests that some risk factors may not be causal.[9] To muck up the analysis, however, the epidemiology chapter went on to define “risk” as “[a] probability that an event will occur (e.g., that an individual will become ill or die within a stated period of time or by a certain age).”[10]
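A stripped-down numerical example, with wholly invented counts, may make these definitions concrete. In the fictitious data below, “smoking” is both a risk factor for the disease and associated with the exposure; the crude comparison suggests that the exposure doubles the risk, but the association vanishes within each smoking stratum.

```python
# Wholly invented counts: "smoking" is a risk factor for the disease and is
# associated with the exposure, but the exposure itself does nothing.
#                                (cases, persons)
data = {
    ("exposed",   "smoker"):     (160, 800),
    ("exposed",   "non-smoker"): (10,  200),
    ("unexposed", "smoker"):     (40,  200),
    ("unexposed", "non-smoker"): (40,  800),
}

def risk(group, stratum=None):
    """Disease risk in a group, optionally restricted to one stratum of the confounder."""
    cells = [(c, n) for (g, s), (c, n) in data.items()
             if g == group and (stratum is None or s == stratum)]
    cases = sum(c for c, _ in cells)
    persons = sum(n for _, n in cells)
    return cases / persons

crude_rr = risk("exposed") / risk("unexposed")
print(f"crude risk ratio: {crude_rr:.2f}")        # about 2.1: looks like an effect
for s in ("smoker", "non-smoker"):
    rr = risk("exposed", s) / risk("unexposed", s)
    print(f"risk ratio among {s}s: {rr:.2f}")     # 1.00 in each stratum
```

Stratification, or an equivalent adjustment, is what the Reference Manual chapters have in view when they speak of ruling out confounding before drawing causal inferences.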

Both the statistics and the epidemiology chapters provide helpful examples of confounding and speak to the need for excluding confounding as the basis for an observed association. The statistics chapter, for instance, described confounding as a threat to “internal validity,”[11] and the need to inquire whether the adjustments in multivariate studies were “sensible and sufficient.”[12]

The epidemiology chapter in one passage instructed that when “an association is uncovered, further analysis should be conducted to assess whether the association is real or a result of sampling error, confounding, or bias.”[13] Elsewhere in the same chapter, the precatory becomes mandatory.[14]

Legally Unexplored Source of Substantial Confounding

As the Reference Manual implies, attempting to control for confounding is not enough; the controlling must be done carefully and sufficiently. Under the heading of sufficiency and due care, there are epidemiologic studies that purport to control for confounding, but fail rather dramatically. The use of administrative databases, whether based upon national healthcare or insurance claims, has become commonplace in chronic disease epidemiology. Their large size obviates many concerns about power to detect rare disease outcomes. Unfortunately, there is often a significant threat to the validity of such studies, which are based upon data sets that characterize patients as diabetic, hypertensive, obese, or smokers vel non. By dichotomizing what are continuous variables, these datasets exact a significant price in the multivariate models used in epidemiology.

Of course, physicians frequently create guidelines for normal versus abnormal, and these divisions or categories show up in medical records, in databases, and ultimately in epidemiologic studies. The actual measurements are not always available, and the use of a categorical variable may appear to simplify the statistical analysis of the dataset. Unfortunately, the results can be quite misleading. Consider the measurements of blood pressure in a study that is evaluating whether an exposure variable (such as medication use or an environmental contaminant) is associated with an outcome such as cardiovascular or renal disease. Hypertension, if present, would clearly be a confounder, but the use of a categorical variable for hypertension would greatly undermine the validity of the study. If many of the study participants with hypertension had their condition well controlled by medication, then the categorical variable will dilute the adjustment for the role of hypertension in driving the association between the exposure and outcome variables of interest. Even if none of the hypertensive patients had good control, the reduction of all hypertension to a category, rather than a continuous measurement, is a path to the loss of information and the creation of bias.

Almost 40 years ago, Jacob Cohen showed that dichotomization of continuous variables results in a loss of power.[15] Twenty years later, Peter Austin showed in a Monte Carlo simulation that categorizing a continuous variable in a logistic regression inflates the rate of false-positive associations.[16] The type I (false-positive) error rate increases with sample size, with increasing correlation between the confounding variable and the outcome of interest, and with the number of categories used for the continuous variables. Of course, the national databases often have huge sample sizes, which only serves to increase the bias from the use of categorical variables for confounding variables.
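The flavor of the Austin result can be conveyed with a rough sketch (an illustration with invented parameters, not a reproduction of his design). A continuous confounder drives both a binary “exposure” and a binary outcome; the exposure itself has no effect. A logistic model adjusted for the confounder as measured keeps the false-positive rate near the nominal 5 percent, while a model adjusted only for a dichotomized version of the same confounder lets the rate climb well above it.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)

def pvalue_for_exposure(y, exposure, confounder):
    """P-value on the exposure term in a logistic model adjusted for the confounder."""
    X = sm.add_constant(np.column_stack([exposure, confounder]))
    fit = sm.Logit(y, X).fit(disp=False)
    return fit.pvalues[1]

n, n_sims, alpha = 2_000, 500, 0.05
rejections_continuous = rejections_dichotomized = 0

for _ in range(n_sims):
    z = rng.normal(size=n)                                   # continuous confounder
    exposure = rng.binomial(1, 1 / (1 + np.exp(-z)))         # exposure driven by the confounder
    outcome = rng.binomial(1, 1 / (1 + np.exp(-(-1 + z))))   # outcome driven only by the confounder
    z_cat = (z > 0).astype(float)                            # "hypertensive: yes/no"

    rejections_continuous += pvalue_for_exposure(outcome, exposure, z) < alpha
    rejections_dichotomized += pvalue_for_exposure(outcome, exposure, z_cat) < alpha

print(f"false-positive rate, confounder as measured:  {rejections_continuous / n_sims:.3f}")
print(f"false-positive rate, confounder dichotomized: {rejections_dichotomized / n_sims:.3f}")
# The first rate hovers near the nominal 0.05; the second is substantially higher,
# despite the model's appearance of having "adjusted" for the confounder.
```

The same mechanism is at work when a claims database records only “hypertension: yes/no” for a condition that matters as a matter of degree.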

The late Douglas Altman, who did so much to steer the medical literature toward greater validity, warned that dichotomizing continuous variables was known to cause loss of information, statistical power, and reliability in medical research.[17]

In the field of pharmaco-epidemiology, the bias created by dichotomization of a continuous variable is harmful from the perspectives of both statistical estimation and hypothesis testing.[18] While readers are misled into believing that the study adjusts for important co-variates, the study will have lost information and power, with the result of presenting false-positive results that have the false allure of a fully adjusted model. Indeed, this bias from inadequate control of confounding infects several pending pharmaceutical multi-district litigations.


Supreme Court

General Electric Co. v. Joiner, 522 U.S. 136, 145-46 (1997) (holding that an expert witness’s reliance on a study was misplaced when the subjects of the study “had been exposed to numerous potential carcinogens”)

First Circuit

Bricklayers & Trowel Trades Internat’l Pension Fund v. Credit Suisse Securities (USA) LLC, 752 F.3d 82, 89 (1st Cir. 2014) (affirming exclusion of expert witness who failed to account for confounding in event studies), aff’g 853 F. Supp. 2d 181, 188 (D. Mass. 2012)

Second Circuit

Wills v. Amerada Hess Corp., 379 F.3d 32, 50 (2d Cir. 2004) (holding expert witness’s specific causation opinion that plaintiff’s squamous cell carcinoma had been caused by polycyclic aromatic hydrocarbons was unreliable, when plaintiff had smoked and drunk alcohol)

Deutsch v. Novartis Pharms. Corp., 768 F.Supp. 2d 420, 432 (E.D.N.Y. 2011) (“When assessing the reliability of a epidemiologic study, a court must consider whether the study adequately accounted for “confounding factors.”)

Schwab v. Philip Morris USA, Inc., 449 F. Supp. 2d 992, 1199–1200 (E.D.N.Y. 2006), rev’d on other grounds, 522 F.3d 215 (2d Cir. 2008) (describing confounding in studies of low-tar cigarettes, in which the authors failed to account for the confounding effects of healthier life styles among users)

Third Circuit

In re Zoloft Prods. Liab. Litig., 858 F.3d 787, 793 (3d Cir. 2017) (affirming exclusion of causation expert witness)

Magistrini v. One Hour Martinizing Dry Cleaning, 180 F. Supp. 2d 584, 591 (D.N.J. 2002), aff’d, 68 Fed. Appx. 356 (3d Cir. 2003) (bias, confounding, and chance must be ruled out before an association may be accepted as showing a causal association)

Soldo v. Sandoz Pharms. Corp., 244 F. Supp. 2d 434 (W.D.Pa. 2003) (excluding expert witnesses in Parlodel case; noting that causality assessments and case reports fail to account for confounding)

Wade-Greaux v. Whitehall Labs., Inc., 874 F. Supp. 1441 (D.V.I. 1994) (unanswered questions about confounding required summary judgment  against plaintiff in Primatene Mist birth defects case)

Fifth Circuit

Knight v. Kirby Inland Marine, Inc., 482 F.3d 347, 353 (5th Cir. 2007) (affirming exclusion of expert witnesses) (“Of all the organic solvents the study controlled for, it could not determine which led to an increased risk of cancer …. The study does not provide a reliable basis for the opinion that the types of chemicals appellants were exposed to could cause their particular injuries in the general population.”)

Burst v. Shell Oil Co., C. A. No. 14–109, 2015 WL 3755953, *7 (E.D. La. June 16, 2015) (excluding expert witness causation opinion that failed to account for other confounding exposures that could have accounted for the putative association), aff’d, 650 F. App’x 170 (5th Cir. 2016)

LeBlanc v. Chevron USA, Inc., 513 F. Supp. 2d 641, 648-50 (E.D. La. 2007) (excluding expert witness testimony that purported to show causality between plaintiff’s benzene exposure and myelofibrosis), vacated, 275 Fed. App’x 319 (5th Cir. 2008) (remanding case for consideration of new government report on health effects of benzene)

Castellow v. Chevron USA, 97 F. Supp. 2d 780 (S.D. Tex. 2000) (discussing confounding in passing; excluding expert witness causation opinion in gasoline exposure AML case)

Kelley v. American Heyer-Schulte Corp., 957 F. Supp. 873 (W.D. Tex. 1997) (confounding in breast implant studies)

Sixth Circuit

Pluck v. BP Oil Pipeline Co., 640 F.3d 671 (6th Cir. 2011) (affirming exclusion of specific causation opinion that failed to rule out confounding factors)

Nelson v. Tennessee Gas Pipeline Co., 243 F.3d 244, 252-54 (6th Cir. 2001) (expert witness’s failure to account for confounding factors in a cohort study of alleged PCB exposures rendered his opinion unreliable)

Turpin v. Merrell Dow Pharms., Inc., 959 F. 2d 1349, 1355 -57 (6th Cir. 1992) (discussing failure of some studies to evaluate confounding)

Adams v. Cooper Indus. Inc., 2007 WL 2219212, 2007 U.S. Dist. LEXIS 55131 (E.D. Ky. 2007) (differential diagnosis includes ruling out confounding causes of plaintiffs’ disease).

Seventh Circuit

People Who Care v. Rockford Bd. of Educ., 111 F.3d 528, 537–38 (7th Cir. 1997) (noting importance of considering role of confounding variables in educational achievement);

Caraker v. Sandoz Pharms. Corp., 188 F. Supp. 2d 1026, 1032, 1036 (S.D. Ill 2001) (noting that “the number of dechallenge/rechallenge reports is too scant to reliably screen out other causes or confounders”)

Eighth Circuit

Penney v. Praxair, Inc., 116 F.3d 330, 333-334 (8th Cir. 1997) (affirming exclusion of expert witness who failed to account for the confounding effects of age, medications, and medical history in interpreting PET scans)

Marmo v. Tyson Fresh Meats, Inc., 457 F.3d 748, 758 (8th Cir. 2006) (affirming exclusion of specific causation expert witness opinion)

Ninth Circuit

Coleman v. Quaker Oats Co., 232 F.3d 1271, 1283 (9th Cir. 2000) (p-value of “3 in 100 billion” was not probative of age discrimination when “Quaker never contend[ed] that the disparity occurred by chance, just that it did not occur for discriminatory reasons. When other pertinent variables were factored in, the statistical disparity diminished and finally disappeared.”)

In re Viagra & Cialis Prods. Liab. Litig., 424 F.Supp. 3d 781 (N.D. Cal. 2020) (excluding causation opinion on grounds including failure to account properly for confounding)

Avila v. Willits Envt’l Remediation Trust, 2009 WL 1813125, 2009 U.S. Dist. LEXIS 67981 (N.D. Cal. 2009) (excluding expert witness opinion that failed to rule out confounding factors of other sources of exposure or other causes of disease), aff’d in relevant part, 633 F.3d 828 (9th Cir. 2011)

In re Phenylpropanolamine Prods. Liab. Litig., 289 F.Supp.2d 1230 (W.D.Wash. 2003) (ignoring study validity in a litigation arising almost exclusively from a single observational study that had multiple internal and external validity problems; relegating assessment of confounding to cross-examination)

In re Bextra and Celebrex Marketing Sales Practice, 524 F. Supp. 2d 1166, 1172 – 73 (N.D. Calif. 2007) (discussing invalidity caused by confounding in epidemiologic studies)

In re Silicone Gel Breast Implants Products Liab. Lit., 318 F.Supp. 2d 879, 893 (C.D.Cal. 2004) (observing that controlling for potential confounding variables is required, among other findings, before accepting epidemiologic studies as demonstrating causation).

Henricksen v. ConocoPhillips Co., 605 F. Supp. 2d 1142 (E.D. Wash. 2009) (noting that confounding must be ruled out)

Valentine v. Pioneer Chlor Alkali Co., Inc., 921 F. Supp. 666 (D. Nev. 1996) (excluding plaintiffs’ expert witnesses, including Dr. Kilburn, for reliance upon study that failed to control for confounding)

Tenth Circuit

Hollander v. Sandoz Pharms. Corp., 289 F.3d 1193, 1213 (10th Cir. 2002) (noting importance of accounting for confounding variables in causation of stroke)

In re Breast Implant Litig., 11 F. Supp. 2d 1217, 1233 (D. Colo. 1998) (alternative explanations, such as confounding, should be ruled out before accepting causal claims).

Eleventh Circuit

In re Abilify (Aripiprazole) Prods. Liab. Litig., 299 F.Supp. 3d 1291 (N.D.Fla. 2018) (discussing confounding in studies but credulously accepting challenged explanations from David Madigan) (citing Bazemore, a pre-Daubert, decision that did not address a Rule 702 challenge to opinion testimony)

District of Columbia Circuit

American Farm Bureau Fed’n v. EPA, 559 F.3d 512 (D.C. Cir. 2009) (noting that data relied upon in setting particulate matter standards addressing visibility should avoid the confounding effects of humidity)

STATES

Delaware

In re Asbestos Litig., 911 A.2d 1176 (New Castle Cty., Del. Super. 2006) (discussing confounding; denying motion to exclude plaintiffs’ expert witnesses’ chrysotile causation opinions)

Minnesota

Goeb v. Tharaldson, 615 N.W.2d 800, 808, 815 (Minn. 2000) (affirming exclusion of Drs. Janette Sherman and Kaye Kilburn, in Dursban case, in part because of expert witnesses’ failures to consider confounding adequately).

New Jersey

In re Accutane Litig., 234 N.J. 340, 191 A.3d 560 (2018) (affirming exclusion of plaintiffs’ expert witnesses’ causation opinions; deprecating reliance upon studies not controlled for confounding)

In re Proportionality Review Project (II), 757 A.2d 168 (N.J. 2000) (noting the importance of assessing the role of confounders in capital sentences)

Grassis v. Johns-Manville Corp., 591 A.2d 671, 675 (N.J. Super. Ct. App. Div. 1991) (discussing the possibility that confounders may lead to an erroneous inference of a causal relationship)

Pennsylvania

Porter v. SmithKline Beecham Corp., No. 3516 EDA 2015, 2017 WL 1902905 (Pa. Super. May 8, 2017) (affirming exclusion of expert witness causation opinions in Zoloft birth defects case; discussing the importance of excluding confounding)

Tennessee

McDaniel v. CSX Transportation, Inc., 955 S.W.2d 257 (Tenn. 1997) (affirming trial court’s refusal to exclude expert witness opinion that failed to account for confounding)


[1] Austin Bradford Hill, “The Environment and Disease: Association or Causation?” 58 Proc. Royal Soc’y Med. 295, 295 (1965) (emphasis added).

[2] See, e.g., David A. Grimes & Kenneth F. Schulz, “Bias and Causal Associations in Observational Research,” 359 The Lancet 248 (2002).

[3] Bazemore v. Friday, 478 U.S. 385, 400 (1986) (reversing the Court of Appeals’ decision that would have disallowed a multiple regression analysis that omitted important variables). Buried in a footnote, the Court did note, however, that “[t]here may, of course, be some regressions so incomplete as to be inadmissible as irrelevant; but such was clearly not the case here.” Id. at 400 n.10. What the Court missed, of course, is that a regression may be so incomplete as to be unreliable or invalid. The invalidity of the regression in Bazemore does not appear to have been raised as an evidentiary issue under Rule 702. None of the briefs in the Supreme Court or the judicial opinions cited or discussed Rule 702.

[4] “Confounding in the Courts” (Nov. 2, 2018).

[5] See, e.g., Brock v. Merrell Dow Pharmaceuticals, Inc., 874 F.2d 307, 311-12 (5th Cir. 1989) (“Fortunately, we do not have to resolve any of the above questions [as to bias and confounding], since the studies presented to us incorporate the possibility of these factors by the use of a confidence interval.”). This howler has been widely acknowledged in the scholarly literature. See David Kaye, David Bernstein, and Jennifer Mnookin, The New Wigmore – A Treatise on Evidence: Expert Evidence § 12.6.4, at 546 (2d ed. 2011); Michael O. Finkelstein, Basic Concepts of Probability and Statistics in the Law 86-87 (2009) (criticizing the blatantly incorrect interpretation of confidence intervals by the Brock court).

[6] “On Praising Judicial Decisions – In re Viagra” (Feb. 8, 2021); see “Ruling Out Bias and Confounding Is Necessary to Evaluate Expert Witness Causation Opinions” (Oct. 28, 2018); “Rule 702 Requires Courts to Sort Out Confounding” (Oct. 31, 2018).

[7] David H. Kaye and David A. Freedman, “Reference Guide on Statistics,” in RMSE3d 211, 285 (3ed 2011). 

[8] Michael D. Green, D. Michal Freedman, and Leon Gordis, “Reference Guide on Epidemiology,” in RMSE3d 549, 621.

[9] Id. at 592.

[10] Id. at 627.

[11] Id. at 221.

[12] Id. at 222.

[13] Id. at 567-68 (emphasis added).

[14] Id. at 572 (describing chance, bias, and confounding, and noting that “[b]efore any inferences about causation are drawn from a study, the possibility of these phenomena must be examined”); id. at 511 n.22 (observing that “[c]onfounding factors must be carefully addressed”).

[15] Jacob Cohen, “The cost of dichotomization,” 7 Applied Psychol. Measurement 249 (1983).

[16] Peter C. Austin & Lawrence J. Brunner, “Inflation of the type I error rate when a continuous confounding variable is categorized in logistic regression analyses,” 23 Statist. Med. 1159 (2004).

[17] See, e.g., Douglas G. Altman & Patrick Royston, “The cost of dichotomising continuous variables,” 332 Brit. Med. J. 1080 (2006); Patrick Royston, Douglas G. Altman, and Willi Sauerbrei, “Dichotomizing continuous predictors in multiple regression: a bad idea,” 25 Stat. Med. 127 (2006). See also Robert C. MacCallum, Shaobo Zhang, Kristopher J. Preacher, and Derek D. Rucker, “On the Practice of Dichotomization of Quantitative Variables,” 7 Psychological Methods 19 (2002); David L. Streiner, “Breaking Up is Hard to Do: The Heartbreak of Dichotomizing Continuous Data,” 47 Can. J. Psychiatry 262 (2002); Henian Chen, Patricia Cohen, and Sophie Chen, “Biased odds ratios from dichotomization of age,” 26 Statist. Med. 3487 (2007); Carl van Walraven & Robert G. Hart, “Leave ‘em Alone – Why Continuous Variables Should Be Analyzed as Such,” 30 Neuroepidemiology 138 (2008); O. Naggara, J. Raymond, F. Guilbert, D. Roy, A. Weill, and Douglas G. Altman, “Analysis by Categorizing or Dichotomizing Continuous Variables Is Inadvisable,” 32 Am. J. Neuroradiol. 437 (Mar 2011); Neal V. Dawson & Robert Weiss, “Dichotomizing Continuous Variables in Statistical Analysis: A Practice to Avoid,” Med. Decision Making 225 (2012); Phillippa M Cumberland, Gabriela Czanner, Catey Bunce, Caroline J Doré, Nick Freemantle, and Marta García-Fiñana, “Ophthalmic statistics note: the perils of dichotomising continuous variables,” 98 Brit. J. Ophthalmol. 841 (2014).

[18] Valerii Fedorov, Frank Mannino, and Rongmei Zhang, “Consequences of dichotomization,” 8 Pharmaceut. Statist. 50 (2009).

When the American Medical Association Woke Up

November 17th, 2021

“You are more than entitled not to know what the word ‘performative’ means. It is a new word and an ugly word, and perhaps it does not mean anything very much. But at any rate there is one thing in its favor, it is not a profound word.”

J.L. Austin, “Performative Utterances,” in Philosophical Papers 233 (2nd ed. 1970).

John Langshaw Austin, J.L. to his friends, was an English philosopher who focused on language and how it actually worked in the real world. Austin developed the concept of performative utterances, which have since come to be known as “speech acts.” Little did J.L. know that performative utterances would come to dominate politics and social media.

The key aspect of spoken words that function as speech acts is that they do not simply communicate information, which might have some truth value and some epistemic basis. Speech acts consist of actual conduct, such as promising, commanding, apologizing, etc.[1] The law has long implicitly recognized the distinction between factual assertions or statements and speech acts. The Federal Rules of Evidence, for instance, limit the rule against hearsay to “statements,” meaning oral or written assertions, or nonverbal conduct (such as nodding in agreement) that is intended as an assertion.[2]

When persons in wedding ceremonies say “I do,” at the appropriate moments, they are married, by virtue of their speech acts. Similarly for contractual and other promises made under circumstances that give rise to enforceable obligations. A witness’s recounting of another’s vows or promises is not hearsay because the witness is offering the recollection only for the fact that the utterance was made, and not to prove the truth of a matter asserted.[3]

The notion of a speech act underlies much political behavior these days. When people palaver about Q, or some QAnon conspiracy, the principle of charity requires us to understand them as not speaking words that can be true or false, but simply signaling their loyalty to a lost cause, usually associated with the loser of the 2020 presidential election. By exchanging ridiculous and humiliating utterances, fellow cultists are signaling loyalty, not making a statement about the world. Their “speech acts” are similar to rituals of exchanging blood with pledges of fraternity.

Of course, there are morons who show up at concerts expecting John F. Kennedy, Jr., to appear, or who show up at pizza places in Washington, D.C., armed with semiautomatic rifles, because their credulity outstripped the linguistic nuances of performative utterances about the Clintons. In days past, members of a cult would get a secret tattoo or wear a special piece of jewelry. Now, the way to show loyalty is to say stupid things in public, and not to laugh when your fellow cultists say similar things.

Astute observers of political systems, on both the left (George Orwell) and the right (Eric Voegelin) have long recognized that ideologies destroy language, including speech acts and performative utterances. The destructive capacities of ideologies are especially disturbing when they invade science and medicine. Alas, the ideology of the Woke has arrived in the halls of the American Medical Association (AMA).

Last month, the AMA issued its guide to politically correct language, designed to advance health “equity”: “Advancing Health Equity: A Guide to Language, Narrative and Concepts (Nov. 2, 2021).” The 54-page guide is, at times, worthy of a MAD magazine parody, but the document quickly transcends parody to take us into an Orwellian nightmare of thought-control in the name of neo-Marxist “social justice” goals.[4]

In its guide to language best practices, the AMA urges us to promote health equity by adding progressive political language to what were once simple statements of fact. The AMA document begins with what seems an affected, insincere humility:

“We share this document with humility. We recognize that language evolves, and we are mindful that context always matters. This guide is not and cannot be a check list of correct answers. Instead, we hope that this guide will stimulate critical thinking about language, narrative and concepts—helping readers to identify harmful phrasing in their own work and providing alternatives that move us toward racial justice and health equity.”

This pretense of humility quickly evaporates as the document’s tone becomes increasingly censorious and strident. The AMA seems less concerned with truth, evidence-based conclusions, or dialogue, than with conformity to the social justice norms of the Woke mob.

In Table 1, the AMA introduces some “Key Principles and Associated Terms.” “Avoid use of adjectives such as vulnerable, marginalized and high-risk,” at least as to persons. Why? The AMA tells us that the use of such terms to describe individuals is “stigmatizing.” The terms are vague and imply (to the AMA) that the condition is inherent to the group rather than the actual root cause, which seems to be mostly, in the AMA’s view, the depredations of white cis-gendered men. To cure the social injustice, the AMA urges us to speak in terms of groups and communities (never individuals) that “have been historically marginalized or made vulnerable, or underserved, or under-resourced [sic], or experience disadvantage [sic].” The squishy passive voice pervades the AMA Guide, but the true subject – the oppressor – is easy to discern.

Putting aside the recurrent, barbarous use of the passive voice, we now must have medical articles that are sociological treatises. The AMA appears to be especially sensitive, perhaps hypersensitive, to what it considers “unintentional blaming.” For example, rather than discuss “[w]orkers who do not use PPE [personal protective equipment]” or “people who do not seek healthcare,” the AMA instructs authors, without any apparent embarrassment or shame, to “try” substituting “workers under-resourced with” PPE, or “people with limited access to” healthcare.

Aside from assuaging the AMA’s social justice warriors, the substitutions are not remotely synonymous. There have been, there are, and there will likely always be workers and others who do not use protective equipment. There have been, there are, and there will likely always be persons who do not seek healthcare. For example, anti-vaxxing yutzballs can be found in all social strata and walks of life. Access to equipment or healthcare is a completely independent issue and concern. The AMA’s effort to hide these facts with the twisted passive-voice contortions assaults our language and our common sense.

Table 2 of the AMA Guide provides a list of commonly used words and phrases and the “equity-focused alternatives.”

“Disadvantaged” in Woke Speak becomes “historically and intentionally excluded.” The aspirational goal of “equality” is recast as “equity.” After all, mere equality, or treating everyone alike:

“ignores the historical legacy of disinvestment and deprivation through policy of historically marginalized and minoritized [sic] communities as well as contemporary forms of discrimination that limit opportunities. Through systematic oppression and deprivation from ethnocide, genocide, forced removal from land and slavery, Indigenous and Black people have been relegated to the lowest socioeconomic ranks of this country. The ongoing xenophobic treatment of undocumented brown people and immigrants (including Indigenous people disposed of their land in other countries) is another example. Intergenerational wealth has mainly benefited and exists for white families.”

In other words, treating people equally is racist. Non-racist is also racist. “Fairness” must also be banished; the equity-focused AMA requires “Social Justice.” Mere fairness pays “no attention” to power relations, and enforced distribution outcomes.

Illegal immigrants are, per AMA guidelines, transformed into “undocumented Immigrant,” because “illegal” is “a dehumanizing, derogatory term,” and because ‘[n]o human being is illegal.” The latter is a lovely sentiment, but human beings can be in countries unlawfully, just as they can be in the Capitol Building illegally.

“Non-compliance” is transmuted into “non-adherence,” because the former term “places blame for treatment failure solely on patients.” The latter term is suggested to exculpate patients, even though patients can be solely responsible for failing to follow prescribed treatment. The AMA wants, however, to remind us that non-adherence may result from “frustration and legitimate mistrust of health care, structural barriers that limit availability and accessibility of medications (including cost, insurance barriers and pharmacy deserts), time and resource constraints (including work hours, family responsibilities), and lack of effective communication about severity of disease or symptoms.” All true, but why not add sloth, stupidity, and superstition? We are still in a pandemic that has been fueled by non-compliance that largely warrants blame on the non-compliant.

The AMA wanders into fraught territory when it tells us impassively that identifying a “social problem” is now a sign of insensitivity. The AMA Woke Guide advises that social problems are really “social injustices.” Referring to a phenomenon as a social problem risks blaming people for their own “marginalization.” The term “marginalization” is part of the Social Justice jargon, and it occurs throughout the AMA Woke Guide. A handy glossary at the end of the document is provided for those of us who have not grown up in Woke culture:

“Marginalization: Process experienced by those under- or unemployed or in poverty, unable to participate economically or socially in society, including the labor market, who thereby suffer material as well as social deprivation.”[5]

The Woke apparently know that calling something a mere “social problem” makes it “seem less serious than social injustice,” and there is some chance that labeling a social phenomenon as a social problem risks “potentially blaming people for their own marginalization.” And yet not every social problem is a social injustice. Underage drinking and unprotected sex are social problems, as is widespread obesity and prevalent diabetes. Alcoholism is a social problem that is prevalent in all social strata; hardly a social injustice.

At page 23 of the Woke Guide, the AMA’s political hostility to individual agency and autonomy breaks through in a screed against meritocracy:

“Among these ideas is the concept of meritocracy, a social system in which advancement in society is based on an individual’s capabilities and merits rather than on the basis of family, wealth or social background. Individualism is problematic in obscuring the dynamics of group domination, especially socioeconomic privilege and racism. In health care, this narrative appears as an over-emphasis on changing individuals and individual behavior instead of the institutional and structural causes of disease.”

Good grief, now physicians cannot simply treat a person for a disease, they must treat entire tribes!

Table 5

Some of the most egregious language of the Woke Guide can be seen in its Table 5, entitled “Contrasting Conventional (Well-intentioned) Phrasing with Equity-focused Language that Acknowledges Root Causes of Inequities.” Table 5 makes clear that the AMA is working from a sociological program that is supported by implicit claims of knowledge for the “root causes” of inequities, a claim that should give everyone serious pause. After all, even if often disappointed, the readers of AMA journals expect rigorous scientific studies, carefully written and edited, which contribute to evidence-based medicine. There is nothing, however, in the AMA Guide, other than its ipse dixit, to support its claimed social justice etiologies.

Table 5 of the AMA Guide provides some of its most far-reaching efforts to impose a political vision through semantic legerdemain. Despite the lack of support for its claimed root causes, the AMA would force writers to assign Social Justice approved narratives and causation. A seemingly apolitical, neutral statement, such as:

“Low-income people have the highest level of coronary artery disease in the United States.”

now must be recast into sanctimonious cant that would warm the cockles of a cold Stalinist’s heart:

“People underpaid and forced into poverty as a result of banking policies, real estate developers gentrifying neighborhoods, and corporations weakening the power of labor movements, among others, have the highest level of coronary artery disease in the United States.”

Banks, corporations, and real estate developers have agency; people do not. With such verbiage, it will be hard to enforce page limits on manuscripts submitted to AMA journals. More important, however, is that the “root cause” analysis is not true in many cases. In countries where property is banned and labor owns the means of production, low-income people have higher rates of disease. The socio-economic variable is important, and consistent, across the globe, even in democratic socialist countries such as Sweden, or in Marxist paradises such as the People’s Republic of China and the former Soviet Union. The bewildered may wonder whether the AMA has ever heard of a control group. Maybe, just maybe, the increased incidence of coronary artery disease among the poor has more to do with Cheez Doodles than the ravages of capitalism.

CRITICAL REACTIONS

The AMA’s guide to linguistic etiquette is a transparent effort to advance a political agenda under the guise of language mandates. The AMA is not merely prescribing thoughtful substitutions for common phrases; the AMA guide is nothing less than an attempt to impose a “progressive” ideology with fulsome apologies. The AMA not only embraces, unquestioningly, the ideology of “white fragility,” Ibram Kendi, and Robin DiAngelo; the AMA at times appears on the verge of medicalizing the behaviors of those who question or reject its Woke ideology. Is a psychiatric gulag the next step?

Dr. Michelle Cretella, the executive director of the American College of Pediatricians, expressed her concern that the AMA’s “social justice” plans are “rooted not in science and the medical ethics of the Hippocratic Oath, but in a host of Marxist ideologies that devalue the lives of our most vulnerable patients and seek to undermine the nuclear family which is the single most critical institution to child well-being.”[6]

Journalist Jesse Singal thinks that the AMA has gone berserk.[7] And Matt Bai, at the Washington Post, saw the AMA’s co-opting of language and narratives as having an Orwellian tone, resembling Mao’s “Little Red Book.”[8] The Post writer raised the interesting question why the AMA was even in the business of admonishing physicians and scientists about acceptable language. After all, the editors of Fowler’s Modern English Usage have managed for decades to eschew offering guidance on performing surgery. The Post opinion piece expresses a realistic concern that proposing “weird language” will worsen the current fraying of the social fabric, and pave the way for a Trump Restoration. Perhaps the AMA should stick to medicine rather than “mandating versions of history and their own lists of acceptable terminology.”

AMA Woke Speak has its antecedents,[9] and it will likely have its followers. For lawyers who work with expert witnesses, the AMA guide risks subjecting their medical witnesses to embarrassment, harassment, and impeachment for failing to comply with the new ideological orthodoxy. Just say no.


[1] See generally John L. Austin, How to Do Things with Words: The William James Lectures delivered at Harvard University in 1955 (1962).

[2] See Fed. R. Evid. Rule 801(a) & Notes of Advisory Comm. Definitions That Apply to This Article; Exclusions from Hearsay (defining statement).


[3] See, e.g., Emich Motors Corp. v. General Motors Corp., 181 F.2d 70 (7th Cir. 1950), rev’d on other grounds 340 U.S. 558 (1951).

[4] Harriet Hall, “The AMA’s Guide to Politically Correct Language: Advancing Health Equity,” Science Based Medicine (Nov. 2, 2021).

[5] Citing, Foster Osei Baah, Anne M Teitelman & Barbara Riegel, “Marginalization: Conceptualizing patient vulnerabilities in the framework of social determinants of health-An integrative review,” 26 Nurs Inq. e12268 (2019).

[6] Jeff Johnston, “Woke Medicine: ‘The AMA’s Strategic Plan to Embed Racial Justice and Advance Health Equity’,” The Daily Citizen (May 21, 2021).

[7] Jesse Singal, “The AMA jumps the Woke Shark, introduces Medspeak,” Why Evolution is True (Nov. 1, 2021).

[8] Matt Bai, “Paging Dr. Orwell. The American Medical Association takes on the politics of language,” Wash. Post (Nov. 3, 2021).

[9] Office of Minority Health, U.S. Department of Health and Human Services, “National Standards for Culturally and Linguistically Appropriate Services in Health and Health Care: A Blueprint for Advancing and Sustaining CLAS Policy and Practice” (2013); Association of State and Territorial Health Officials, “Health equity terms” (2018).

The opinions, statements, and asseverations expressed on Tortini are my own, or those of invited guests, and these writings do not necessarily represent the views of clients, friends, or family, even when supported by good and sufficient reason.