TORTINI

For your delectation and delight, desultory dicta on the law of delicts.

An Opinion to SAVOR

November 11th, 2022

The saxagliptin medications are valuable treatments for type 2 diabetes mellitus (T2DM). The SAVOR (Saxagliptin Assessment of Vascular Outcomes Recorded in Patients with Diabetes Mellitus) study was a randomized controlled trial, undertaken by manufacturers at the request of the FDA.[1] As a large (over sixteen thousand patients randomized) double-blinded cardiovascular outcomes trial, SAVOR collected data on many different end points in patients with T2DM, at high risk of cardiovascular disease, over a median of 2.1 years. The primary end point was a composite end point of cardiac death, non-fatal myocardial infarction, and non-fatal stroke. Secondary end points included each constituent of the composite, as well as hospitalizations for heart failure, coronary revascularization, or unstable angina, as well as other safety outcomes.

The SAVOR trial found no association between saxagliptin use and the primary end point, or any of the constituents of the primary end point.  The trial did, however, find a modest association between saxagliptin and one of the several secondary end points, hospitalization for heart failure (hazard ratio, 1.27; 95% C.I., 1.07 to 1.51; p = 0.007). The SAVOR authors urged caution in interpreting their unexpected finding for heart failure hospitalizations, given the multiple end points considered.[2] Notwithstanding the multiplicity, in 2016, the FDA, which does not require a showing of causation for adding warnings to a drug’s labeling, added warnings about the “risk” of hospitalization for heart failure from the use of saxagliptin medications.

And the litigation came.

The litigation evidentiary display grew to include, in addition to SAVOR, observational studies, meta-analyses, and randomized controlled trials of other DPP-4 inhibitor medications that are in the same class as saxagliptin. The SAVOR finding for heart failure was not supported by any of the other relevant human study evidence. The lawsuit industry, however, armed with an FDA warning, pressed its cases. A multi-district litigation (MDL 2809) was established. Rule 702 motions were filed by both plaintiffs’ and defendants’ counsel.

When the dust settled in this saxagliptin litigation, the court found that the defendants’ expert witnesses satisfied the relevance and reliability requirements of Rule 702, whereas the proffered opinions of plaintiffs’ expert witness, Dr. Parag Goyal, a cardiologist at Cornell-Weill Hospital in New York, did not satisfy Rule 702.[3] The court’s task was certainly made easier by the lack of any other expert witness or published opinion that saxagliptin actually causes heart failure serious enough to result in hospitalization.

The saxagliptin litigation presented an interesting array of facts for a Rule 702 showdown. First, there was an RCT that reported a nominally statistically significant association between medication use and a harm, hospitalization for heart failure. The SAVOR finding, however, was in a secondary end point, and its statistical significance was unimpressive when considered in the light of the multiple testing that took place in the context of a cardiovascular outcomes trial.
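The multiplicity point can be made concrete with a minimal sketch of a Bonferroni adjustment, the correction mentioned later in connection with Dr. Adler. The nominal p-value of 0.007 comes from the SAVOR report quoted above; the number of end points tested, and the 0.05 threshold, are hypothetical inputs chosen only for illustration, and nothing here reproduces the analysis actually presented to the MDL court.

```python
# Illustrative sketch only. The count of comparisons (num_tests) is a
# hypothetical input; the source reports only the nominal p-value.

NOMINAL_P = 0.007  # SAVOR heart failure hospitalization, as reported
ALPHA = 0.05       # conventional threshold, assumed for illustration


def bonferroni_adjust(p_value: float, num_tests: int) -> float:
    """Return the Bonferroni-adjusted p-value, capped at 1.0."""
    if num_tests < 1:
        raise ValueError("num_tests must be at least 1")
    return min(1.0, p_value * num_tests)


def survives(p_value: float, num_tests: int, alpha: float = ALPHA) -> bool:
    """True if the adjusted p-value remains below alpha."""
    return bonferroni_adjust(p_value, num_tests) < alpha


if __name__ == "__main__":
    # Hypothetical numbers of end points tested in the trial.
    for m in (1, 5, 7, 8, 10, 20):
        adjusted = bonferroni_adjust(NOMINAL_P, m)
        print(f"tests={m:2d}  adjusted p={adjusted:.3f}  "
              f"below {ALPHA}: {survives(NOMINAL_P, m)}")
```

On these assumptions, the nominal finding stays below the threshold only if seven or fewer comparisons are counted. Bonferroni is a conservative adjustment, and other methods would give somewhat different answers.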

Second, the heart failure increase was not seen in the original registration trials. Third, there was an effort to find corroboration in observational studies and meta-analyses, without success. Fourth, there was no apparent mechanism for the putative effect. Fifth, there was no support from trials or observational studies of other medications in the class of DPP-4 inhibitors.

Dr. Goyal testified that the heart failure finding in SAVOR “should be interpreted as cause and effect unless there is compelling evidence to prove otherwise.” On this record, the MDL court excluded Dr. Goyal’s causation opinions. Dr. Goyal purported to conduct a Bradford Hill analysis, but the MDL court appeared troubled by his glib dismissal of the threat to validity in SAVOR from multiple testing, and his ignoring the consistency prong of the Hill factors. SAVOR was the only heart failure finding in humans, with the remaining observational studies, meta-analyses, and other trials of DPP-4 inhibitors failing to provide supporting evidence.

The challenged defense expert witnesses defended the validity of their opinions, and ultimately the MDL court had little concern in permitting them through the judicial gate. The plaintiffs’ challenges to Suneil Koliwad, a physician with a doctorate in molecular physiology, Eric Adler, a cardiologist, and Todd Lee, a pharmaco-epidemiologist, were all denied. The plaintiffs challenged, among other things, whether Dr. Adler was qualified to apply a Bonferroni correction to the SAVOR results, and whether Dr. Lee was obligated to obtain and statistically analyze the data from the trials and studies ab initio. The MDL court quickly dispatched these frivolous challenges.

The saxagliptin MDL decision is an important reminder that litigants should remain vigilant about inaccurate assertions of “statistical significance,” even in premier, peer-reviewed journals. Not all journals are as careful as the New England Journal of Medicine in requiring qualification of claims of statistical significance in the face of multiple testing.

One legal hiccup in the court’s decision was its improvident citation to Daubert, for the proposition that the gatekeeping inquiry must focus “solely on principles and methodology, not on the conclusions they generate.”[4] That piece of obiter dictum did not survive past the Supreme Court’s 1997 decision in Joiner,[5] and it was clearly superseded by statute in 2000. Surely it is time to stop citing Daubert for this dictum.


[1] Benjamin M. Scirica, Deepak L. Bhatt, Eugene Braunwald, Gabriel Steg, Jaime Davidson, et al., for the SAVOR-TIMI 53 Steering Committee and Investigators, “Saxagliptin and Cardiovascular Outcomes in Patients with Type 2 Diabetes Mellitus,” 369 New Engl. J. Med. 1317 (2013).

[2] Id. at 1324.

[3] In re Onglyza & Kombiglyze XR Prods. Liab. Litig., MDL 2809, 2022 WL 43244 (E.D. Ky. Jan. 5, 2022).

[4] Daubert v. Merrell Dow Pharms., Inc., 509 U.S. 579, 595 (1993).

[5] General Electric Co. v. Joiner, 522 U.S. 136 (1997).

Cheng’s Proposed Consensus Rule for Expert Witnesses

September 15th, 2022

Edward K. Cheng is the Hess Professor of Law in absentia from Vanderbilt Law School, while serving this fall as a visiting professor at Harvard. Professor Cheng is one of the authors of the multi-volume treatise, Modern Scientific Evidence, and the author of many articles on scientific and statistical evidence. Cheng’s most recent article, “The Consensus Rule: A New Approach to Scientific Evidence,”[1] while thought provoking, follows in the long-standing tradition of law school professors to advocate evidence law reforms, based upon theoretical considerations devoid of practical or real-world support.

Cheng’s argument for a radical restructuring of Rule 702 is based upon his judgment that jurors and judges are epistemically incompetent to evaluate expert witness opinion testimony. The current legal approach has trial judges acting as gatekeepers of expert witness testimony, and jurors acting as judges of factual scientific claims. Cheng would abolish these roles as beyond their ken.[2] Lay persons can, however, determine which party’s position is supported by the relevant expert community, which he presumes (without evidence) possesses the needed epistemic competence. Accordingly, Cheng would rewrite the legal system’s approach to important legal disputes, such as disputes over causal claims, from:

Whether a given substance causes a given disease

to

Whether the expert community believes that a given substance causes a given disease.

Cheng channels the philosophical understanding of the ancients who realized that one must have expertise to judge whether someone else has used that expertise correctly. And he channels the contemporary understanding that knowledge is a social endeavor, not the unique perspective of an individual in isolation. From these twin premisses, Cheng derives a radical and cynical proposal to reform the law of expert witness testimony. In his vision, experts would come to court not to give their own opinions, and certainly not to try to explain how they arrive at their opinions from the available evidence. For him, the current procedure is too much like playing chess with a monkey. The expert function would consist of telling the jury what the expert witness’s community believes.[3] Jurors would not decide the “actual substantive questions,” but simply decide what they believe the relevant expert witness community accepts as a consensus. This radical restructuring is what Cheng calls the “consensus rule.”

In this proposed “consensus rule,” there is no room for gatekeeping. Parties continue to call expert witnesses, but only as conduits for the “consensus” opinions of their fields. Indeed, Cheng’s proposal would radically limit expert witnesses to service as pollsters; their testimony would present only their views of what the consensus is in their fields. This polling information is the only evidence that the jury would hear from expert witnesses, because this is the only evidence that Cheng believes the jury is epistemically competent to assess.[4]

Under Cheng’s Consensus Rule, when there is no consensus in the realm, the expert witness regime defaults to “anything goes,” without gatekeeping.[5] Judges would continue to exercise some control over who is qualified to testify, but only as far as the proposed experts must be in a position to know what the consensus is in their fields.

Cheng does not explain why, under his proposed “consensus rule,” subject matter experts are needed at all. The parties might call librarians, or sociologists of science, to talk about the relevant evidence of consensus. If a party cannot afford a librarian expert witness, then perhaps lawyers could present directly the results of their PubMed and other internet searches.

Cheng may be right that his “deferential approach” would eliminate having the inexpert passing judgment on the expert. The “consensus rule” would reduce science to polling, conducted informally, often without documentation or recording, by partisan expert witnesses. This proposal hardly better reflects, as he argues, the “true” nature of science. In Cheng’s vision, science in the courtroom is just a communal opinion, without evidence and without inference. To be sure, this alternative universe is tidier and less disputatious, but it is hardly science or knowledge. We are left with opinions about opinions, without data, without internal or external validity, and without good and sufficient facts and data.

Cheng claims that his proposed Consensus Rule is epistemically superior to Rule 702 gatekeeping. For the intellectually curious and able, his proposal is a counsel of despair. Deference to the herd, he tells us, “is not merely optimal—it is the only practical strategy.”[6] In perhaps the most extreme overstatement of his thesis, Cheng tells us that

“deference is arguably not due to any individual at all! Individual experts can be incompetent, biased, error prone, or fickle—their personal judgments are not and have never been the source of reliability. Rather, proper deference is to the community of experts, all of the people who have spent their careers and considerable talents accumulating knowledge in their field.”[7]

Cheng’s hypothesized community of experts, however, is worthy of deference only by virtue of the soundness of its judgments. If a community has not severely tested its opinions, then its existence as a community is irrelevant. Cheng’s deference is the sort of phenomenon that helped create Lysenkoism and other intellectual fads that were beyond challenge with actual data.

There is, I fear, some partial truth to Cheng’s judgment of juries and judges as epistemically incompetent, or challenged, to judge science, but his judgment seems greatly overstated. Finding aberrant jury verdicts would be easy, but Cheng provides no meaningful examples of gatekeeping gone wrong. Professor Cheng may have over-generalized in stating that judges are epistemically incompetent to make substantive expert determinations. He surely cannot be suggesting that judges never have sufficient scientific acumen to determine the relevance and reliability of expert witness opinion. If judges can, in some cases, make a reasonable go at gatekeeping, why then is Cheng advocating a general rule that strips all judges of all gatekeeping responsibility with respect to expert witnesses?

Clearly judges lack the technical resources, time, and background training to delve deeply into the methodological issues with which they may be confronted. This situation could be ameliorated by budgeting science advisors and independent expert witnesses, and by creating specialty courts staffed with judges who have scientific training. Cheng acknowledges this response, but he suggests that it conflicts with “norms about generalist judges.”[8] This retreat to norms is curious in the face of Cheng’s radical proposals, and the prevalence of using specialist judges for adjudicating commercial and patent disputes.

Although Cheng is correct that assessing validity and reliability of scientific inferences and conclusions often cannot be reduced to a cookbook or checklist approach, not all expertise is as opaque as Cheng suggests. In his view, lawyers are deluded into thinking that they can understand the relevant science, with law professors being even worse offenders.[9] Cross-examining a technical expert witness can be difficult and challenging, but lawyers on both sides of the aisle occasionally demolish the most skilled and knowledgeable expert witnesses, on substantive grounds. And these demolitions happen to expert witnesses who typically, self-servingly claim that they have robust consensuses agreeing with their opinions.

While scolding us that we must get “comfortable with relying on the expertise and authority of others,” Cheng reassures us that deferring to authority is “not laziness or an abdication of our intellectual responsibility.”[10] According to Cheng, the only reason to defer to the opinion of experts is that they are telling us what their community would say.[11] Good reasons, sound evidence, and valid inference need not worry us in Cheng’s world.

Finding Consensus

Cheng tells us that his Consensus Rule would look something like:

“Rule 702A. If the relevant scientific community believes a fact involving specialized knowledge, then that fact is established accordingly.”

Imagine the endless litigation over what the “relevant” community is. For a health effect claim about a drug and heart attacks, is it the community of cardiologists or epidemiologists? Do we accept the pronouncements of the American Heart Association or those of the American College of Cardiology? If there is a clear consensus based upon a clinical trial, which appears to be based upon suspect data, is discovery of underlying data beyond the reach of litigants because the correctness of the allegedly dispositive study is simply not in issue? Would courts have to take judicial notice of the clear consensus and shut down any attempt to get to the truth of the matter?

Cheng acknowledges that cases will involve issues that are controversial or undeveloped, without expert community consensus. Many litigations start after publication of a single study or meta-analysis, which is hardly the basis for any consensus. Cheng appears content, in this expansive area, to revert to anything goes because if the expert community has not coalesced around a unified view, or if the community is divided, then the courts cannot do better than flipping a coin! Cheng’s proposal thus has a loophole the size of the Sun.

Cheng tells us, unhelpfully, that “[d]etermining consensus is difficult in some cases, and less so in others.”[12] Determining consensus may not be straightforward, but no matter. Consensus Rule questions are not epistemically challenging and thus “far more manageable,” because they require no special expertise. (Again, why even call a subject matter expert witness, as opposed to a science journalist or librarian?) Cheng further advises that consensus is “a bit like the reasonable person standard in negligence,” but this simply conflates normative judgments with scientific judgments.[13]

Cheng’s Consensus Rule would allow the use of a systematic review or a meta-analysis, not for evidence of the correctness of its conclusions, but only as evidence of a consensus.[14] The thought experiment of how this suggestion plays out in the real world may cause some agita. The litigation over Avandia began within days of the publication of a meta-analysis in the New England Journal of Medicine.[15] So some evidence of consensus; right? But then the letters to the editor within a few weeks of publication showed that the meta-analysis was fatally flawed. Inadmissible! Under the Consensus Rule the correctness or the methodological appropriateness of the meta-analysis is irrelevant. A few months later, another meta-analysis is published, which fails to find the risk that the original meta-analysis claimed. Is the trial now about which meta-analysis represents the community’s consensus, or are we thrown into the game of anything goes, where expert witnesses just say things, without judicial supervision?  A few years go by, and now there is a large clinical trial that supersedes all the meta-analyses of small trials.[16] Is a single large clinical trial now admissible as evidence of a new consensus, or are only systematic reviews and meta-analyses relevant evidence?

Cheng’s Consensus Rule will be useless in most determinations of specific causation. It will be a very rare case indeed when a scientific organization issues a consensus statement about plaintiff John Doe. Very few tort cases involve putative causal agents that are thought to cause every instance of some disease in every person exposed to the agent. Even when a scientific community has addressed general causation, it will have rarely resolved all the uncertainty about the causal efficacy of all levels of exposure or the appropriate window of latency. So Cheng’s proposal all but guarantees that specific causation will escape the control of Rule 702 gatekeeping.

The potential for misrepresenting consensus is even greater than the misrepresentations of actual study results. At least the data are the data, but what will jurors do when they are regaled by testimony about the informal consensus reached in the hotel lobby of the latest scientific conference? Regulatory pronouncements that are based upon precautionary principles will be misrepresented as scientific consensus. Findings by the International Agency for Research on Cancer that a substance is a IIA “probable human carcinogen” will be hawked as a consensus, even though the classification specifically disclaims any quantitative meaning for “probable,” and it directly equates to “insufficient” evidence of carcinogenicity in humans.

In some cases, as Cheng notes, organizations such as the National Research Council, or the National Academies of Sciences, Engineering, and Medicine (NASEM), will have weighed in on a controversy that has found its way into court.[17] Any help from such organizations will likely be illusory. Consider the 2006 publication of a comprehensive review of the available studies on non-pulmonary cancers and asbestos exposure by NASEM. The writing group presented its assessment of colorectal cancer as not causally associated with occupational asbestos exposure.[18] By 2007, the following year, expert witnesses for plaintiffs argued that the NASEM publication was no longer a consensus because one or two (truly inconsequential) studies had been published after the report and thus not considered. Under Cheng’s proposal, this dodge would appear to be enough to oust the consensus rule, and default to the “anything goes” rule. The scientific record can change rapidly, and many true consensus statements quickly find their way into the dustbin of scientific history.

Cheng greatly underestimates the difficulty in ascertaining “consensus.” Sometimes, to be sure, professional societies issue consensus statements, but they are often tentative and inconclusive. In many areas of science, there will be overlapping realms of expertise, with different disciplines issuing inconsistent “consensus” statements. Even within a single expert community, there may be two schools of thought about a particular issue.

There are instances, perhaps more than a few, when a consensus is epistemically flawed. If, as is the case in many health effect claims, plaintiffs rely upon the so-called linear no-threshold dose-response (LNT) theory of carcinogenesis, plaintiffs will point to regulatory pronouncements that embrace LNT as “the consensus.” When scientists are being honest, they generally recognize LNT as part of a precautionary principle approach, which may make sense as the foundation of “risk assessment.” The widespread assumption of LNT in regulatory agencies, and among scientists who work in such agencies, is understandable, but LNT remains an assumption. Nonetheless, we already see LNT hawked as a consensus, which under Cheng’s Consensus Rule would become the key dispositive issue, while quashing the mountain of evidence that there are, in fact, defense mechanisms to carcinogenesis that result in practical thresholds.

Beyond regulatory pronouncements, some areas of scientific endeavor have themselves become politicized and extremist. Tobacco smoking surely causes lung cancer, but the studies of environmental tobacco smoking and lung cancer have been oversold. In areas of non-scientific disputes, such as history of alleged corporate malfeasance, juries will be treated to “the consensus” of Marxist labor historians, without having to consider the actual underlying historical documents. Cheng tells us that his Consensus Rule is a “realistic way of treating nonscientific expertise,”[19] which would seem to cover historian expert witnesses. Yet here, lawyers and lay fact finders are fully capable of exploring the glib historical conclusions of historian witnesses with cross-examination on the underlying documentary facts of the proffered opinions.

The Alleged Warrant for the Consensus Rule

If Professor Cheng is correct that the current judicial system, with decisions by juries and judges, is epistemically incompetent, does his Consensus Rule necessarily follow?  Not really. If we are going to engage in radical reforms, then the institutionalization of blue-ribbon juries would make much greater sense. As for Cheng’s claim that knowledge is “social,” the law of evidence already permits the use of true consensus statements as learned treatises, both to impeach expert witnesses who disagree, and (in federal court) to urge the truth of the learned treatise.

The gatekeeping process of Rule 702, which Professor Cheng would throw overboard, has important advantages in that judges ideally will articulate reasons for finding expert witness opinion testimony admissible or not. These reasons can be evaluated, discussed, and debated, with judges, lawyers, and the public involved. This gatekeeping process is rational and socially open.

Some Other Missteps in Cheng’s Argument

Experts on Both Sides are Too Extreme

Cheng’s proposal is based, in part, upon his assessment that the adversarial system causes the parties to choose expert witnesses “at the extremes.” Here again, Cheng provides no empirical evidence for his assessment. There is a mechanical assumption often made by people who do not bother to learn the details of a scientific dispute that the truth must somehow lie in the “middle.” For instance, in MDL 926, the silicone gel breast implant litigation, presiding Judge Sam Pointer complained about the parties’ expert witnesses being too extreme. Judge Pointer believed that MDL judges should not entertain Rule 702 challenges, which were in his view properly heard by the transferor courts. As a result, Judge Robert Jones, and then Judge Jack Weinstein, conducted thorough Rule 702 hearings and found that the plaintiffs’ expert witnesses’ opinions were unreliable and insufficiently supported by the available evidence.[20] Judge Weinstein started the process of selecting court-appointed expert witnesses for the remaining New York cases, which goaded Judge Pointer into taking the process back to the MDL court level. After appointing four highly qualified expert witnesses, Judge Pointer continued to believe that the parties’ expert witnesses were “extremists,” and that the courts’ own experts would come down somewhere between them. When the court-appointed experts filed their reports, Judge Pointer was shocked that all four of his experts sided with the defense in rejecting the tendentious claims of plaintiffs’ expert witnesses.

Statistical Significance

Along the way, in advocating his radical proposal, Professor Cheng made some other curious announcements. For instance, he tells us that “[w]hile historically used as a rule of thumb, statisticians have now concluded that using the 0.05 [p-value] threshold is more distortive than helpful.”[21] Cheng’s purpose here is unclear, but the source he cited does not remotely support his statement, and certainly not his gross overgeneralization about “statisticians.” If this is the way he envisions experts will report “consensus,” then his program seems broken at its inception. The American Statistical Association’s (ASA) p-value “consensus” statement articulated six principles, the third of which noted that

“[s]cientific conclusions and business or policy decisions should not be based only on whether a p-value passes a specific threshold.”

This is a few light years away from statisticians’ concluding that statistical significance thresholds are more distortive than helpful. The ASA p-value statement further explains that

“[t]he widespread use of ‘statistical significance’ (generally interpreted as ‘p < 0.05’) as a license for making a claim of a scientific finding (or implied truth) leads to considerable distortion of the scientific process.”[22]

In the science of health effects, statistical significance remains extremely important, but it has never been a license for making causal claims. As Sir Austin Bradford Hill noted in his famous after-dinner speech, ruling out chance (and bias) as an explanation for an association was merely a predicate for evaluating the association for causality.[23]

Over-endorsing Animal Studies

Under Professor Cheng’s Consensus Rule, the appropriate consensus might well be one generated solely by animal studies. Cheng tells us that “perhaps” scientists do not consider toxicology when the pertinent epidemiology is “clear.” When the epidemiology, however, is unclear, scientists consider toxicology.[24] Well, of course, but the key question is whether a consensus about causation in humans will be based upon non-human animal studies. Cheng seems to answer this question in the affirmative by criticizing courts that have required epidemiologic studies “even though the entire field of toxicology uses tissue and animal studies to make inferences, often in combination with and especially in the absence of epidemiology.”[25] The vitality of the field of toxicology is hardly undermined by its not generally providing sufficient grounds for judgments of human causation.

Relative Risk Greater Than Two

In the midst of his argument for the Consensus Rule, Cheng points critically to what he calls “questionable proxies” for scientific certainty. One such proxy is the judicial requirement of risk ratios in excess of two. His short discussion appears to be focused upon the inference of specific causation in a given case, but it leads to a non-sequitur:

“Some courts have required a relative risk of 2.0 in toxic tort cases, requiring a doubling of the population risk before considering causation.73 But the preponderance standard does not require that the substance more likely than not caused any case of the disease in the population, it requires that the substance more likely than not caused the plaintiff’s case.”[26]

Of course, it is exactly because we are interested in the probability of causation of the plaintiff’s case, that we advert to the risk ratio to give us some sense whether “more likely than not” the exposure caused plaintiff’s case. Unless plaintiff can show he is somehow unique, he is “any case.” In many instances, plaintiff cannot show how he is different from the participants of the study that gave rise to the risk ratio less than two.
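The arithmetic behind the risk-ratio-of-two threshold can be sketched briefly. The sketch below uses the attributable fraction among the exposed, (RR - 1) / RR, as a stand-in for the probability that exposure caused a given exposed case. That reading rests on assumptions not established here: that the risk ratio is unbiased and unconfounded, that the plaintiff resembles the study participants, and that exposure adds cases rather than merely accelerating them. The function name and the sample risk ratios are hypothetical.

```python
# Illustrative sketch only: treats the attributable fraction among the
# exposed as the probability of causation, under the assumptions stated
# in the text above. Sample risk ratios are hypothetical.


def probability_of_causation(relative_risk: float) -> float:
    """Return (RR - 1) / RR, or 0.0 when RR does not exceed 1."""
    if relative_risk <= 0:
        raise ValueError("relative_risk must be positive")
    if relative_risk <= 1.0:
        return 0.0
    return (relative_risk - 1.0) / relative_risk


def more_likely_than_not(relative_risk: float) -> bool:
    """True when the attributable fraction exceeds one half."""
    return probability_of_causation(relative_risk) > 0.5


if __name__ == "__main__":
    for rr in (1.0, 1.5, 2.0, 3.0):
        pc = probability_of_causation(rr)
        print(f"RR={rr:.1f}  attributable fraction={pc:.3f}  "
              f"exceeds 0.5: {more_likely_than_not(rr)}")
```

On these assumptions, the attributable fraction crosses one half exactly when the risk ratio exceeds two, which is why a plaintiff who is simply "any case" from the study population needs a risk ratio above two to reach "more likely than not."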


[1] Edward K. Cheng, “The Consensus Rule: A New Approach to Scientific Evidence,” 75 Vanderbilt L. Rev. 407 (2022) [Consensus Rule].

[2] Consensus Rule at 410 (“The judge and the jury, lacking in expertise, are not competent to handle the questions that the Daubert framework assigns to them.”)

[3] Consensus Rule at 467 (“Under the Consensus Rule, experts no longer offer their personal opinions on causation or teach the jury how to assess the underlying studies. Instead, their testimony focuses on what the expert community as a whole believes about causation.”)

[4] Consensus Rule at 467.

[5] Consensus Rule at 437.

[6] Consensus Rule at 434.

[7] Consensus Rule at 434.

[8] Consensus Rule at 422.

[9] Consensus Rule at 429.

[10] Consensus Rule at 432-33.

[11] Consensus Rule at 434.

[12] Consensus Rule at 456.

[13] Consensus Rule at 457.

[14] Consensus Rule at 459.

[15] Steven E. Nissen, M.D., and Kathy Wolski, M.P.H., “Effect of Rosiglitazone on the Risk of Myocardial Infarction and Death from Cardiovascular Causes,” 356 New Engl. J. Med. 2457 (2007).

[16] P.D. Home, et al., “Rosiglitazone Evaluated for Cardiovascular Outcomes in Oral Agent Combination Therapy for Type 2 Diabetes (RECORD),” 373 Lancet 2125 (2009).

[17] Consensus Rule at 458.

[18] Jonathan M. Samet, et al., Asbestos: Selected Health Effects (2006).

[19] Consensus Rule at 445.

[20] Hall v. Baxter Healthcare Corp., 947 F. Supp. 1387 (D. Or. 1996) (excluding plaintiffs’ expert witnesses’ causation opinions); In re Breast Implant Cases, 942 F. Supp. 958 (E. & S.D.N.Y. 1996) (granting partial summary judgment on claims of systemic disease causation).

[21] Consensus Rule at 424 (citing Ronald L. Wasserstein & Nicole A. Lazar, “The ASA Statement on p-Values: Context, Process, and Purpose,” 70 Am. Statistician 129, 131 (2016)).

[22] Id.

[23] Austin Bradford Hill, “The Environment and Disease: Association or Causation?” 58 Proc. Royal Soc’y Med. 295, 295 (1965). See Schachtman, “Ruling Out Bias & Confounding is Necessary to Evaluate Expert Witness Causation Opinions” (Oct. 29, 2018); “Woodside & Davis on the Bradford Hill Considerations” (Aug. 23, 2013); Frank C. Woodside, III & Allison G. Davis, “The Bradford Hill Criteria: The Forgotten Predicate,” 35 Thomas Jefferson L. Rev. 103 (2013).

[24] Consensus Rule at 444.

[25] Consensus Rule at 424 & n. 74 (citing to one of multiple court advisory expert witnesses in Hall v. Baxter Healthcare Corp., 947 F. Supp. 1387, 1449 (D. Or. 1996), who suggested that toxicology would be appropriate to consider when the epidemiology was not clear). Citing to one outlier advisor is a rather strange move for Cheng considering that the “consensus” was readily discernible to the trial judge in Hall, and to Judge Jack Weinstein, a few months later, in In re Breast Implant Cases, 942 F. Supp. 958 (E. & S.D.N.Y. 1996).

[26] Consensus Rule at 424 & n. 73 (citing Lucinda M. Finley, “Guarding the Gate to the Courthouse: How Trial Judges Are Using Their Evidentiary Screening Role to Remake Tort Causation Rules,” 49 DePaul L. Rev. 335, 348–49 (2000)). See Schachtman, “Rhetorical Strategy in Characterizing Scientific Burdens of Proof” (Nov. 15, 2014).

Amicus Curious – Gelbach’s Foray into Lipitor Litigation

August 25th, 2022

Professor Schauer’s discussion of statistical significance, covered in my last post,[1] is curious for its disclaimer that “there is no claim here that measures of statistical significance map easily onto measures of the burden of proof.” Having made the disclaimer, Schauer proceeds to fall into the transposition fallacy, which contradicts his disclaimer, and, generally speaking, is not a good thing for a law professor eager to advance the understanding of “The Proof,” to do.

Perhaps more curious than Schauer’s error is his citation support for his disclaimer.[2] The cited paper by Jonah B. Gelbach is one of several of Gelbach’s papers that advance the claim that the p-value does indeed map onto posterior probability and the burden of proof. Gelbach’s claim has also been the centerpiece of his role as an advocate in support of plaintiffs in the Lipitor (atorvastatin) multi-district litigation (MDL) over claims that ingestion of atorvastatin causes diabetes mellitus.

Gelbach’s intervention as plaintiffs’ amicus is peculiar on many fronts. At the time of the Lipitor litigation, Sonal Singh was an epidemiologist and Assistant Professor of Medicine, at the Johns Hopkins University. The MDL trial court initially held that Singh’s proffered testimony was inadmissible because of his failure to consider daily dose.[3] In a second attempt, Singh offered an opinion for 10 mg daily dose of atorvastatin, based largely upon the results of a clinical trial known as ASCOT-LLA.[4]

The ASCOT-LLA trial randomized 19,342 participants with hypertension and at least three other cardiovascular risk factors to two different anti-hypertensive medications. A subgroup with total cholesterol levels less than or equal to 6.5 mmol./l. were randomized to either daily 10 mg. atorvastatin or placebo.  The investigators planned to follow up for five years, but they stopped after 3.3 years because of clear benefit on the primary composite end point of non-fatal myocardial infarction and fatal coronary heart disease. At the time of stopping, there were 100 events of the primary pre-specified outcome in the atorvastatin group, compared with 154 events in the placebo group (hazard ratio 0.64 [95% CI 0.50 – 0.83], p = 0.0005).

The atorvastatin component of ASCOT-LLA had, in addition to its primary pre-specified outcome, seven secondary end points, and seven tertiary end points.  The emergence of diabetes mellitus in this trial population, which clearly was at high risk of developing diabetes, was one of the tertiary end points. Primary, secondary, and tertiary end points were reported in ASCOT-LLA without adjustment for the obvious multiple comparisons. In the treatment group, 3.0% developed diabetes over the course of the trial, whereas 2.6% developed diabetes in the placebo group. The unadjusted hazard ratio was 1.15 (0.91 – 1.44), p = 0.2493.[5] Given the 15 trial end points, an adjusted p-value for this particular hazard ratio, for diabetes, might well exceed 0.5, and even approach 1.0.
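To see why an adjusted p-value could approach 1.0, here is a minimal sketch of two textbook multiplicity corrections, Bonferroni and Šidák, applied to the reported p = 0.2493 across 15 end points. Treating the 15 end points as independent tests is an assumption made only for illustration (some of the trial's outcomes overlap or are correlated), the function names are hypothetical, and the trialists themselves reported no such adjustment.

```python
def bonferroni(p: float, m: int) -> float:
    """Bonferroni-adjusted p-value: multiply by the number of tests, cap at 1."""
    return min(1.0, p * m)


def sidak(p: float, m: int) -> float:
    """Sidak-adjusted p-value, which assumes the m tests are independent."""
    return 1.0 - (1.0 - p) ** m


P_DIABETES = 0.2493   # unadjusted p-value reported for the diabetes end point
N_END_POINTS = 15     # 1 primary + 7 secondary + 7 tertiary end points

p_bonf = bonferroni(P_DIABETES, N_END_POINTS)
p_sidak = sidak(P_DIABETES, N_END_POINTS)

print(f"Bonferroni-adjusted p: {p_bonf:.3f}")
print(f"Sidak-adjusted p:      {p_sidak:.3f}")
```

Under either correction, and on these assumptions, the adjusted value lands far above 0.5.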

On this record, Dr. Singh honestly acknowledged that statistical significance was important, and that the diabetes finding in ASCOT-LLA might have been the result of low statistical power or of no association at all. Based upon the trial data alone, he testified that “one can neither confirm nor deny that atorvastatin 10 mg is associated with significantly increased risk of type 2 diabetes.”[6] The trial court excluded Dr. Singh’s 10mg/day causal opinion, but admitted his 80mg/day opinion. On appeal, the Fourth Circuit affirmed the MDL district court’s rulings.[7]

Jonah Gelbach is a professor of law at the University of California at Berkeley. He attended Yale Law School, and received his doctorate in economics from MIT.

Professor Gelbach entered the Lipitor fray to present a single issue: whether statistical significance at conventionally demanding levels such as 5 percent is an appropriate basis for excluding expert testimony based on statistical evidence from a single study that did not achieve statistical significance.

Professor Gelbach is no stranger to antic proposals.[8] As amicus curious in the Lipitor litigation, Gelbach asserts that plaintiffs’ expert witness, Dr. Singh, was wrong in his testimony about not being able to confirm the ASCOT-LLA association because he, Gelbach, could confirm the association.[9] Ultimately, the Fourth Circuit did not discuss Gelbach’s contentions, which is not surprising considering that the asserted arguments and alleged factual considerations were not only dehors the record, but in contradiction of the record.

Gelbach’s curious claim is that any time a risk ratio, for an exposure and an outcome of interest, is greater than 1.0, with a p-value < 0.5,[10] the evidence should be not only admissible, but sufficient to support a conclusion of causation. Gelbach states his claim in the context of discussing a single randomized controlled trial (ASCOT-LLA), but his broad pronouncements are carelessly framed such that others may take them to apply to a single observational study, with its greater threats to internal validity.

Contra Kumho Tire

To get to his conclusion, Gelbach attempts to remove the constraints of traditional standards of significance probability. Kumho Tire teaches that expert witnesses must “employ[] in the courtroom the same level of intellectual rigor that characterizes the practice of an expert in the relevant field.”[11] For Gelbach, this “eminently reasonable admonition” does not impose any constraints on statistical inference in the courtroom. Statistical significance at traditional levels (p < 0.05) is for elitist scholarly work, not for the “practical” rent-seeking work of the tort bar. According to Gelbach, the inflation of the significance level ten-fold to p < 0.5 is merely a matter of “weight” and not admissibility of any challenged opinion testimony.

Likelihood Ratios and Posterior Probabilities

Gelbach maintains that any evidence that has a likelihood ratio greater than one (LR > 1) is relevant, and should be admissible under Federal Rule of Evidence 401.[12] This argument ignores the other operative Federal Rules of Evidence, namely 702 and 703, which impose additional criteria of admissibility for expert witness opinion testimony.

With respect to variance and random error, Gelbach tells us that any evidence that generates an LR > 1 should be admitted when “the statistical evidence is statistically significant below the 50 percent level, which will be true when the p-value is less than 0.5.”[13]

At times, Gelbach seems to be discussing the admissibility of the ASCOT-LLA study itself, and not the proffered opinion testimony of Dr. Singh. The study itself would not be admissible, although it is clearly the sort of hearsay an expert witness in the field may consider. If Dr. Singh were to have reframed and recalculated the statistical comparisons, then the Rule 703 requirement of “reasonable reliance” by scientists in the field of interest may not have been satisfied.

Gelbach also generates a posterior probability (0.77), which is based upon his calculations from data in the ASCOT-LLA trial, and not the posterior probability of Dr. Singh’s opinion. The posterior probability, as calculated, is problematic on many fronts.

Gelbach does not present his calculations – for the sake of brevity he says – but he tells us that the ASCOT-LLA data yield a likelihood ratio of roughly 1.9, and a p-value of 0.126.[14] What the clinical trialists reported was a hazard ratio of 1.15, which is a weak association on most researchers’ scales, with a two-sided p-value of 0.25, which is five times higher than the usual 5 percent. Gelbach does not explain how or why his calculated p-value for the likelihood ratio is roughly half the unadjusted, two-sided p-value for the tertiary outcome from ASCOT-LLA.
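A rough back-of-the-envelope check may help put the two p-values side by side. This is not Gelbach's calculation, which the brief does not show. The sketch assumes a normal approximation on the log hazard-ratio scale and backs a standard error out of the reported 95% interval; because the published figures are rounded, the result only approximates the reported p = 0.2493. It also shows the arithmetic fact that a one-sided p-value is half the two-sided one. Whether that accounts for Gelbach's 0.126 is conjecture, not something the brief states.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


HR, CI_LOW, CI_HIGH = 1.15, 0.91, 1.44   # as reported in ASCOT-LLA

# Standard error of log(HR), backed out of the width of the 95% interval
# (normal approximation; an assumption of this sketch).
se_log_hr = (math.log(CI_HIGH) - math.log(CI_LOW)) / (2 * 1.96)
z = math.log(HR) / se_log_hr

p_two_sided = 2.0 * (1.0 - normal_cdf(abs(z)))
p_one_sided = 1.0 - normal_cdf(z)

print(f"z = {z:.2f}")
print(f"approximate two-sided p = {p_two_sided:.2f}")
print(f"approximate one-sided p = {p_one_sided:.2f}")
```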

As noted, the reported diabetes hazard ratio of 1.15 was a tertiary outcome for the ASCOT trial, one of 15 calculated by the trialists, with p-values unadjusted for multiple comparisons.  The failure to adjust is perhaps excusable in that some (but certainly not all) of the outcome variables are overlapping or correlated. A sophisticated reader would not be misled; only when someone like Gelbach attempts to manufacture an inflated posterior probability without accounting for the gross underestimate in variance is there an insult to statistical science. Gelbach’s recalculated p-value for his LR, if adjusted for the multiplicity of comparisons in this trial, would likely exceed 0.5, rendering all his arguments nugatory.

Using the statistics as presented by the published ASCOT-LLA trial to generate a posterior probability also ignores the potential biases (systematic errors) in data collection, the unadjusted hazard ratios, the potential for departures from random sampling, errors in administering the participant recruiting and inclusion process, and other errors in measurements, data collection, data cleaning, and reporting.

Gelbach correctly notes that there is nothing methodologically inappropriate in advocating likelihood ratios, but he is less than forthcoming in explaining that such ratios translate into a posterior probability only if he posits a prior probability of 0.5.[15] His pretense to having simply stated “mathematical facts” unravels when we consider his extreme, unrealistic, and unscientific assumptions.

The Problematic Prior

Gelbach glibly assumes that the starting point, the prior probability, for his analysis of Dr. Singh’s opinion is 50%. This is an old and common mistake,[16] long since debunked.[17] Gelbach’s assumption is part of an old controversy, which surfaced in early cases concerning disputed paternity. The assumption, however, is wrong legally and philosophically.

The law simply does not hand out 0.5 prior probability to both parties at the beginning of a trial. As Professor Jaffee noted almost 35 years ago:

“In the world of Anglo-American jurisprudence, every defendant, civil and criminal, is presumed not liable. So, every claim (civil or criminal) starts at ground zero (with no legal probability) and depends entirely upon proofs actually adduced.”[18]

Gelbach assumes that assigning “equal prior probability” to two adverse parties is fair, because the fact-finder would not start hearing evidence with any notion of which party’s contentions are correct. The 0.5/0.5 starting point, however, is neither fair nor is it the law.[19] The even odds prior is also not good science.

The defense is entitled to a presumption that it is not liable, and the plaintiff must start at zero.  Bayesians understand that this is the death knell of their beautiful model.  If the prior probability is zero, then Bayes’ Theorem tells us mathematically that no evidence, no matter how large a likelihood ratio, can move the prior probability of zero towards one. Bayes’ theorem may be a correct statement about inverse probabilities, but still be an inadequate or inaccurate model for how factfinders do, or should, reason in determining the ultimate facts of a case.
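The point about a zero prior can be checked directly. The following is a minimal sketch of Bayes' theorem for a single hypothesis, written in terms of a prior probability and a likelihood ratio; the function name and the likelihood ratios fed to it are hypothetical, chosen only to show the pattern, and are not taken from Gelbach's brief.

```python
def posterior_probability(prior: float, likelihood_ratio: float) -> float:
    """Bayes' theorem: P(H | E) from P(H) and LR = P(E | H) / P(E | not H)."""
    numerator = prior * likelihood_ratio
    return numerator / (numerator + (1.0 - prior))


# Hypothetical likelihood ratios, from modest to enormous.
for lr in (2.0, 1_000.0, 1_000_000.0):
    from_zero = posterior_probability(0.0, lr)   # presumption of no liability
    from_even = posterior_probability(0.5, lr)   # posited even-odds prior
    print(f"LR = {lr:>11,.0f}: prior 0 -> {from_zero:.4f}, prior 0.5 -> {from_even:.4f}")
```

With a prior of zero the posterior stays at zero for every likelihood ratio, while with a posited prior of 0.5 the posterior reduces to LR / (1 + LR), so that any likelihood ratio above one clears the 50% mark.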

We can see how unrealistic and unfair Gelbach’s implied prior probability is if we visualize the proof process as a football field.  To win, plaintiffs do not need to score a touchdown; they need only cross the mid-field 50-yard line. Rather than making plaintiffs start at the zero-yard line, however, Gelbach would put them right on the 50-yard line. Since one toe over the mid-field line is victory, the plaintiff is spotted 99.99+% of its burden of having to present evidence to build up 50% probability. Instead, plaintiffs are allowed to scoot from the zero-yard line right up to the brink of success, where even the slightest breeze might give them winning cases. Somehow, in the model, plaintiffs no longer have to present evidence to traverse the first half of the field.

The even odds starting point is completely unrealistic in terms of the events upon which the parties are wagering. The ASCOT-LLA study might have shown a protective association between atorvastatin and diabetes, or it might have shown no association at all, or it might have shown a larger hazard ratio than measured in this particular sample. Recall that the confidence interval for hazard ratios for diabetes ran from 0.91 to 1.44. In other words, parameters from 0.91 (protective association) to 1.0 (no association), to 1.44 (harmful association) were all reasonably compatible with the observed statistic, based upon this one study’s data. The potential outcomes are not binary, which makes the even odds starting point inappropriate.[20]


[1] “Schauer’s Long Footnote on Statistical Significance” (Aug. 21, 2022).

[2] Frederick Schauer, The Proof: Uses of Evidence in Law, Politics, and Everything Else 54-55 (2022) (citing Michelle M. Burtis, Jonah B. Gelbach, and Bruce H. Kobayashi, “Error Costs, Legal Standards of Proof, and Statistical Significance,” 25 Supreme Court Economic Rev. 1 (2017)).

[3] In re Lipitor Mktg., Sales Practices & Prods. Liab. Litig., MDL No. 2:14–mn–02502–RMG, 2015 WL 6941132, at *1 (D.S.C. Oct. 22, 2015).

[4] Peter S. Sever, et al., “Prevention of coronary and stroke events with atorvastatin in hypertensive patients who have average or lower-than-average cholesterol concentrations, in the Anglo-Scandinavian Cardiac Outcomes Trial Lipid Lowering Arm (ASCOT-LLA): a multicentre randomised controlled trial,” 361 Lancet 1149 (2003). [cited here as ASCOT-LLA]

[5] ASCOT-LLA at 1153 & Table 3.

[6] In re Lipitor Mktg., Sales Practices & Prods. Liab. Litig., 174 F.Supp. 3d 911, 921 (D.S.C. 2016) (quoting Dr. Singh’s testimony).

[7] In re Lipitor Mktg., Sales Practices & Prods. Liab. Litig., 892 F.3d 624, 638-39 (4th Cir. 2018) (affirming MDL trial court’s exclusion in part of Dr. Singh).

[8] See “Expert Witness Mining – Antic Proposals for Reform” (Nov. 4, 2014).

[9] Brief for Amicus Curiae Jonah B. Gelbach in Support of Plaintiffs-Appellants, In re Lipitor Mktg., Sales Practices & Prods. Liab. Litig., 2017 WL 1628475 (April 28, 2017). [Cited as Gelbach]

[10] Gelbach at *2.

[11] Kumho Tire Co. v. Carmichael, 526 U.S. 137, 152 (1999).

[12] Gelbach at *5.

[13] Gelbach at *2, *6.

[14] Gelbach at *15.

[15] Gelbach at *19-20.

[16] See Richard A. Posner, “An Economic Approach to the Law of Evidence,” 51 Stanford L. Rev. 1477, 1514 (1999) (asserting that the “unbiased fact-finder” should start hearing a case with even odds; “[I]deally we want the trier of fact to work from prior odds of 1 to 1 that the plaintiff or prosecutor has a meritorious case. A substantial departure from this position, in either direction, marks the trier of fact as biased.”).

[17] See, e.g., Richard D. Friedman, “A Presumption of Innocence, Not of Even Odds,” 52 Stan. L. Rev. 874 (2000). [Friedman]

[18] Leonard R. Jaffee, “Prior Probability – A Black Hole in the Mathematician’s View of the Sufficiency and Weight of Evidence,” 9 Cardozo L. Rev. 967, 986 (1988).

[19] Id. at p.994 & n.35.

[20] Friedman at 877.

The Rise of Agnotology as Conspiracy Theory

July 19th, 2022

A few egregious articles in the biomedical literature have begun to endorse explicitly asymmetrical standards for inferring causation in the context of environmental or occupational exposures. Very little if anything is needed for inferring causation, and nothing counts against causation.  If authors refuse to infer causation, then they are agents of “industry,” epidemiologic malfeasors, and doubt mongers.

For an example of this genre, take the recent article, entitled “Toolkit for detecting misused epidemiological methods.”[1] [Toolkit] Please.

The asymmetry begins with Trump-like projection of the authors’ own foibles. The principal hammer in the authors’ toolkit for detecting misused epidemiologic methods is personal, financial bias. And yet, somehow, in an article that calls out other scientists for having received money from “industry,” the authors overlooked the business of disclosing their receipt of monies from one of the biggest industries around – the lawsuit industry.

Under the heading “competing interests,” the authors state that “they have no competing interests.”[2]  Lead author, Colin L. Soskolne, was, however, an active, partisan expert witness for plaintiffs’ counsel in diacetyl litigation.[3] In an asbestos case before the Pennsylvania Supreme Court, Rost v. Ford Motor Co., Soskolne signed on to an amicus brief, supporting the plaintiff, using his science credentials, without disclosing his expert witness work for plaintiffs, or his long-standing anti-asbestos advocacy.[4]

Author Shira Kramer signed on to Toolkit, without disclosing any conflicts, but with an even more impressive résumé of pro-plaintiff litigation experience.[5] Kramer is the owner of Epidemiology International, in Cockeysville, Maryland, where she services the lawsuit industry. She too was an “amicus” in Rost, without disclosing her extensive plaintiff-side litigation consulting and testifying.

Carl Cranor, another author of Toolkit, takes first place for hypocrisy on conflicts of interest. As a founder of Council for Education and Research on Toxics (CERT), he has sterling credentials for monetizing the bounty hunt against “carcinogens,” most recently against coffee.[6] He has testified in denture cream and benzene litigation, for plaintiffs. When he was excluded under Rule 702 from the Milward case, CERT filed an amicus brief on his behalf, without disclosing that Cranor was a founder of that organization.[7], [8]

The title seems reasonably fair-minded but the virulent bias of the authors is soon revealed. The Toolkit is presented as a Table in the middle of the article, but the actual “tools” are for the most part not seriously discussed, other than advice to “follow the money” to identify financial conflicts of interest.

The authors acknowledge that epidemiology provides critical knowledge of risk factors and causation of disease, but they quickly transition to an effort to silence any industry commentator on any specific epidemiologic issue. As we will see, the lawsuit industry is given a complete pass. Not surprisingly, several of the authors (Kramer, Cranor, Soskolne) have worked closely in tandem with the lawsuit industry, and have derived financial rewards for their efforts.

Repeatedly, the authors tell us that epidemiologic methods and language are misused by “powerful interests,” which have financial stakes in the outcome of research. Agents of these interests foment uncertainty and doubt about causal relationships through “disinformation,” “malfeasance,” and “doubt mongering.” There is no correlative concern about false claiming or claim mongering.

Who are these agents who plot to sabotage “social justice” and “truth”? Clearly, they are scientists with whom the Toolkit authors disagree. The Toolkit gang cites several papers as exemplifying “malfeasance,”[9] but they never explain what was wrong with them, or how the malfeasors went astray.  The Toolkit tactics seem worthy of Twitter smear and run.

The Toolkit

The authors’ chart of “tools” used by industry might have been an interesting taxonomy of error, but mostly the tools are ad hominem attacks on scientists with whom the authors disagree. Channeling Putin on Ukraine, those scientists who would impose discipline and rigor on epidemiologic science are derided as not “real epidemiologists,” and, to boot, they are guilty of ethical lapses in failing to advance “social justice.”

Mostly the authors give us a toolkit for silencing those who would get in the way of the situational science deployed at the beck and call of the lawsuit industry.[10] Indeed, the Toolkit authors are not shy about identifying their litigation goals; they tell us that the toolkit can be deployed in depositions and in cross-examinations to pursue “social justice.” These authors also outline a social agenda that greatly resembles the goals of cancel culture: expose the perpetrators who stand in the way of the authors’ preferred policy choices, diminish their adversaries’ influence on journals, and galvanize peer reviewers to reject their adversaries’ scientific publications. The Toolkit authors tell us that “[t]he scientific community should engage by recognizing and professionally calling out common practices used to distort and misapply epidemiological and other health-related sciences.”[11] What this advice translates into are covert and open ad hominem campaigns: acting as peer reviewers to block publications, denying adversaries tenure and promotions, and using social and other media outlets to attack adversaries’ motives, good faith, and competence.

None of this is really new. Twenty-five years ago, the late F. Douglas K. Liddell railed at the Mt. Sinai mob, and the phenomenon was hardly new then.[12] The Toolkit’s call to arms is, however, quite open, and raises the question whether its authors and adherents can be fair journal editors and peer reviewers of journal submissions.

Much of the Toolkit is the implementation of a strategy developed by lawsuit industry expert witnesses to demonize their adversaries by accusing them of manufacturing doubt or ignorance or uncertainty. This strategy has gained a label used to deride those who disagree with litigation overclaiming: agnotology or the creation of ignorance. According to Professor Robert Proctor, a regular testifying historian for tobacco plaintiffs, a linguist, Iain Boal, coined the term agnotology, in 1992, to describe the study of the production of ignorance.[13]

The Rise of “Agnotology” in Ngram

Agnotology has become a cottage sub-industry of the lawsuit industry, although lawsuits (or claim mongering if you like), of course, remain their main product. Naomi Oreskes[14] and David Michaels[15] gave the agnotology field greater visibility with their publications, using the less erudite but catchier phrase “manufacturing doubt.” Although the study of ignorance and uncertainty has a legitimate role in epistemology[16] and sociology,[17] much of the current literature is dominated by those who use agnotology as propaganda in support of their own litigation and regulatory agendas.[18] One lone author, however, appears to have taken agnotology study seriously enough to see that it is largely a conspiracy theory that reduces complex historical or scientific theory, evidence, opinion, and conclusions to a clash between truth and a demonic ideology.[19]

Is there any substance to the Toolkit?

The Toolkit is not entirely empty of substantive issues. The authors note that “statistical methods are a critical component of the epidemiologist’s toolkit,”[20] and they cite some articles about common statistical mistakes missed by peer reviewers. Curiously, the Toolkit omits any meaningful discussion of statistical mistakes that increase the risk of false positive results, such as multiple comparisons or dichotomizing continuous confounder variables. As for the Toolkit’s number one identified “inappropriate” technique used by its authors’ adversaries, we have:

“A1. Relying on statistical hypothesis testing; Using ‘statistical significance’ at the 0.05 level of probability as a strict decision criterion to determine the interpretation of statistical results and drawing conclusions.”

Peer into the hearing on any so-called Daubert motion in federal court, and you will see the lawsuit industry, and its hired expert witnesses, rail at statistical significance, unless of course, there is some subgroup that has nominal significance, in which case, they are all in for endorsing the finding as “conclusive.”

Welcome to asymmetric, situational science.


[1] Colin L. Soskolne, Shira Kramer, Juan Pablo Ramos-Bonilla, Daniele Mandrioli, Jennifer Sass, Michael Gochfeld, Carl F. Cranor, Shailesh Advani & Lisa A. Bero, “Toolkit for detecting misused epidemiological methods,” 20(90) Envt’l Health (2021) [Toolkit].

[2] Toolkit at 12.

[3] Watson v. Dillon Co., 797 F.Supp. 2d 1138 (D. Colo. 2011).

[4] Rost v. Ford Motor Co., 151 A.3d 1032 (Pa. 2016). See “The Amicus Curious Brief” (Jan. 4, 2018).

[5] See, e.g., Sean v. BMW of North Am., LLC, 26 N.Y.3d 801, 48 N.E.3d 937, 28 N.Y.S.3d 656 (2016) (affirming exclusion of Kramer); The Little Hocking Water Ass’n v. E.I. Du Pont De Nemours & Co., 90 F.Supp.3d 746 (S.D. Ohio 2015) (excluding Kramer); Luther v. John W. Stone Oil Distributor, LLC, No. 14-30891 (5th Cir. April 30, 2015) (mentioning Kramer as litigation consultant); Clair v. Monsanto Co., 412 S.W.3d 295 (Mo. Ct. App. 2013) (mentioning Kramer as plaintiffs’ expert witness); In re Chantix (Varenicline) Prods. Liab. Litig., No. 2:09-CV-2039-IPJ, MDL No. 2092, 2012 WL 3871562 (N.D.Ala. 2012) (excluding Kramer’s opinions in part); Frischhertz v. SmithKline Beecham Corp., 2012 U.S. Dist. LEXIS 181507, Civ. No. 10-2125 (E.D. La. Dec. 21, 2012) (excluding Kramer); Donaldson v. Central Illinois Public Service Co., 199 Ill. 2d 63, 767 N.E.2d 314 (2002) (affirming admissibility of Kramer’s opinions in absence of Rule 702 standards).

[6]  “The Council for Education & Research on Toxics” (July 9, 2013) (CERT amicus brief filed without any disclosure of conflict of interest). Among the fellow travelers who wittingly or unwittingly supported CERT’s scheme to pervert the course of justice were lawsuit industry stalwarts, Arthur L. Frank, Peter F. Infante, Philip J. Landrigan, Barry S. Levy, Ronald L. Melnick, David Ozonoff, and David Rosner. See also NAS, “Carl Cranor’s Conflicted Jeremiad Against Daubert” (Sept. 23, 2018); Carl Cranor, “Milward v. Acuity Specialty Products: How the First Circuit Opened Courthouse Doors for Wronged Parties to Present Wider Range of Scientific Evidence” (July 25, 2011).

[7] Milward v. Acuity Specialty Products Group, Inc., 664 F. Supp. 2d 137, 148 (D. Mass. 2009), rev’d, 639 F.3d 11 (1st Cir. 2011), cert. den. sub nom. U.S. Steel Corp. v. Milward, 565 U.S. 1111 (2012), on remand, Milward v. Acuity Specialty Products Group, Inc., 969 F.Supp. 2d 101 (D. Mass. 2013) (excluding specific causation opinions as invalid; granting summary judgment), aff’d, 820 F.3d 469 (1st Cir. 2016).

[8] To put this effort into a sociology of science perspective, the Toolkit article is published in a journal, Environmental Health, an Editor in Chief of which is David Ozonoff, a long-time pro-plaintiff partisan in the asbestos litigation. The journal has an “ombudsman,” Anthony Robbins, who was one of the movers-and-shakers in forming SKAPP, The Project on Scientific Knowledge and Public Policy, a group that plotted to undermine the application of federal evidence law of expert witness opinion testimony. SKAPP itself is now defunct, but its spirit of subverting law lives on with efforts such as the Toolkit. “More Antic Proposals for Expert Witness Testimony – Including My Own Antic Proposals” (Dec. 30, 2014). Robbins is also affiliated with an effort, led by historian and plaintiffs’ expert witness David Rosner, to perpetuate misleading historical narratives of environmental and occupational health. “ToxicHistorians Sponsor ToxicDocs” (Feb. 1, 2018); “Creators of ToxicDocs Show Off Their Biases” (June 7, 2019); Anthony Robbins & Phyllis Freeman, “ToxicDocs (www.ToxicDocs.org) goes live: A giant step toward leveling the playing field for efforts to combat toxic exposures,” 39 J. Public Health Pol’y 1 (2018).

[9] The exemplars cited were Paolo Boffetta, MD, MPH; Hans Olov Adami, Philip Cole, Dimitrios Trichopoulos, Jack Mandel, “Epidemiologic studies of styrene and cancer: a review of the literature,” 51 J. Occup. & Envt’l Med. 1275 (2009); Carlo LaVecchia & Paolo Boffetta, “Role of stopping exposure and recent exposure to asbestos in the risk of mesothelioma,” 21 Eur. J. Cancer Prev. 227 (2012); John Acquavella, David Garabrant, Gary Marsh, Thomas Sorahan and Douglas L. Weed, “Glyphosate epidemiology expert panel review: a weight of evidence systematic review of the relationship between glyphosate exposure and non-Hodgkin’s lymphoma or multiple myeloma,” 46 Crit. Rev. Toxicol. S28 (2016); Catalina Ciocan, Nicolò Franco, Enrico Pira, Ihab Mansour, Alessandro Godono, and Paolo Boffetta, “Methodological issues in descriptive environmental epidemiology. The example of study Sentieri,” 112 La Medicina del Lavoro 15 (2021).

[10] The Toolkit authors acknowledge that their identification of “tools” was drawn from previous publications of the same ilk, in the same journal. Rebecca F. Goldberg & Laura N. Vandenberg, “The science of spin: targeted strategies to manufacture doubt with detrimental effects on environmental and public health,” 20:33 Envt’l Health (2021).

[11] Toolkit at 11.

[12] F.D.K. Liddell, “Magic, Menace, Myth and Malice,” 41 Ann. Occup. Hyg. 3, 3 (1997). See “The Lobby – Cut on the Bias” (July 6, 2020).

[13] Robert N. Proctor & Londa Schiebinger, Agnotology: The Making and Unmaking of Ignorance (2008).

[14] Naomi Oreskes & Erik M. Conway, Merchants of Doubt: How a Handful of Scientists Obscured the Truth on Issues from Tobacco Smoke to Global Warming (2010); Naomi Oreskes & Erik M. Conway, “Defeating the merchants of doubt,” 465 Nature 686 (2010).

[15] David Michaels, The Triumph of Doubt: Dark Money and the Science of Deception (2020); David Michaels, Doubt is Their Product: How Industry’s Assault on Science Threatens Your Health (2008); David Michaels, “Science for Sale,” Boston Rev. 2020; David Michaels, “Corporate Campaigns Manufacture Scientific Doubt,” 174 Science News 32 (2008); David Michaels, “Manufactured Uncertainty: Protecting Public Health in the Age of Contested Science and Product Defense,” 1076 Ann. N.Y. Acad. Sci. 149 (2006); David Michaels, “Scientific Evidence and Public Policy,” 95 Am. J. Public Health s1 (2005); David Michaels & Celeste Monforton, “Manufacturing Uncertainty: Contested Science and the Protection of the Public’s Health and Environment,” 95 Am. J. Pub. Health S39 (2005); David Michaels & Celeste Monforton, “Scientific Evidence in the Regulatory System: Manufacturing Uncertainty and the Demise of the Formal Regulatory System,” 13 J. L. & Policy 17 (2005); David Michaels, “Doubt is Their Product,” Sci. Am. 96 (June 2005); David Michaels, “The Art of ‘Manufacturing Uncertainty’,” L.A. Times (June 24, 2005).

[16] See, e.g., Sibilla Cantarini, Werner Abraham, and Elisabeth Leiss, eds., Certainty-uncertainty – and the Attitudinal Space in Between (2014); Roger M. Cooke, Experts in Uncertainty: Opinion and Subjective Probability in Science (1991).

[17] See, e.g., Ralph Hertwig & Christoph Engel, eds., Deliberate Ignorance: Choosing Not to Know (2021); Linsey McGoey, The Unknowers: How Strategic Ignorance Rules the World (2019); Michael Smithson, “Toward a Social Theory of Ignorance,” 15 J. Theory Social Behavior 151 (1985).

[18] See Janet Kourany & Martin Carrier, eds., Science and the Production of Ignorance: When the Quest for Knowledge Is Thwarted (2020); John Launer, “The production of ignorance,” 96 Postgraduate Med. J. 179 (2020); David S. Egilman, “The Production of Corporate Research to Manufacture Doubt About the Health Hazards of Products: An Overview of the Exponent Bakelite® Simulation Study,” 28 New Solutions 179 (2018); Larry Dossey, “Agnotology: on the varieties of ignorance, criminal negligence, and crimes against humanity,” 10 Explore 331 (2014); Gerald Markowitz & David Rosner, Deceit and Denial: The Deadly Politics of Industrial Pollution (2002).

[19] See Enea Bianchi, “Agnotology: a Conspiracy Theory of Ignorance?” Ágalma: Rivista di studi culturali e di estetica 41 (2021).

[20] Toolkit at 4.

Madigan’s Shenanigans & Wells Quelled in Incretin-Mimetic Cases

July 15th, 2022

The incretin-mimetic litigation involved claims that the use of Byetta, Januvia, Janumet, and Victoza medications causes pancreatic cancer. All four medications treat diabetes mellitus through incretin hormones, which stimulate or support insulin production, which in turn lowers blood sugar. On Planet Earth, the only scientists who contend that these medications cause pancreatic cancer are those hired by the lawsuit industry.

The cases against the manufacturers of the incretin-mimetic medications were consolidated for pre-trial proceedings in federal court, pursuant to the multi-district litigation (MDL) statute, 28 U.S.C. § 1407. After years of MDL proceedings, the trial court dismissed the cases as barred by the doctrine of federal preemption, and for good measure, excluded plaintiffs’ medical causation expert witnesses from testifying.[1] If there were any doubt about the false claiming in this MDL, the district court’s dismissals were affirmed by the Ninth Circuit.[2]

The district court’s application of Federal Rule of Evidence 702 to the plaintiffs’ expert witnesses’ opinions is an important essay in patho-epistemology. The challenged expert witnesses provided many examples of invalid study design and interpretation. Of particular interest, two of the plaintiffs’ high-volume statistician testifiers, David Madigan and Martin Wells, proffered their own meta-analyses of clinical trial safety data. Although the current edition of the Reference Manual on Scientific Evidence[3] provides virtually no guidance to judges for assessing the validity of meta-analyses, judges and counsel do now have other readily available sources, such as the FDA’s Guidance on meta-analysis of safety outcomes of clinical trials.[4] Luckily for the Incretin-Mimetics pancreatic cancer MDL judge, the misuse of meta-analysis methodology by plaintiffs’ statistician expert witnesses, David Madigan and Martin Wells, was intuitively obvious.

Madigan and Wells had a large set of clinical trials at their disposal, with adverse safety outcomes assiduously collected. As is the case with many clinical trial safety outcomes, the trialists will often have a procedure for blinded or unblinded adjudication of safety events, such as pancreatic cancer diagnosis.

At deposition, Madigan testified that he counted only adjudicated cases of pancreatic cancer in his meta-analyses, which seems reasonable enough. As discovery revealed, however, Madigan applied the restrictive inclusion criterion of adjudicated pancreatic cancer only to the placebo group, not to the experimental group. His use of a restrictive inclusion criterion for only the placebo group had the effect of excluding several non-adjudicated events from that group, with the obvious spurious inflation of risk ratios. The MDL court thus found with relative ease that Madigan’s “unequal application of criteria among the two groups inevitably skews the data and critically undermines the reliability of his analysis.” The unexplained, unjustified change in methodology revealed Madigan’s unreliable “cherry-picking” and lack of scientific rigor, which produced result-driven meta-analyses.[5]
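The arithmetic behind the court’s concern can be sketched with a few lines of Python. The counts below are purely hypothetical, chosen only for illustration; they are not taken from the MDL record or from any clinical trial. The sketch assumes two equally sized arms with identical numbers of adjudicated and non-adjudicated events, so that any even-handed counting rule yields a risk ratio of 1.0.

```python
def risk_ratio(events_treated, n_treated, events_control, n_control):
    """Risk ratio: risk in the treated arm divided by risk in the control arm."""
    return (events_treated / n_treated) / (events_control / n_control)


# Hypothetical pooled trial counts (illustrative assumptions, not real data).
N_PER_ARM = 10_000
ADJUDICATED = 6        # adjudicated events in each arm
NON_ADJUDICATED = 4    # non-adjudicated events in each arm
ALL_EVENTS = ADJUDICATED + NON_ADJUDICATED

# Same rule in both arms: adjudicated events only.
rr_adjudicated_both = risk_ratio(ADJUDICATED, N_PER_ARM, ADJUDICATED, N_PER_ARM)

# Same rule in both arms: all reported events.
rr_all_both = risk_ratio(ALL_EVENTS, N_PER_ARM, ALL_EVENTS, N_PER_ARM)

# Unequal rule: all events counted in the treated arm,
# adjudicated events only in the placebo arm.
rr_unequal = risk_ratio(ALL_EVENTS, N_PER_ARM, ADJUDICATED, N_PER_ARM)

print(f"adjudicated only, both arms: {rr_adjudicated_both:.2f}")
print(f"all events, both arms:       {rr_all_both:.2f}")
print(f"unequal criteria:            {rr_unequal:.2f}")
```

Under these assumed counts, either consistent rule returns a risk ratio of 1.0, while the unequal rule returns roughly 1.67, although nothing about the underlying data has changed.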

The MDL court similarly found that Wells’ reports “were marred by a selective review of data and inconsistent application of inclusion criteria.”[6] Like Madigan, Wells cherry-picked studies. For instance, he excluded one study, EXSCEL, on grounds that it reported “a high pancreatic cancer event rate in the comparison group as compared to background rate in the general population….”[7] Wells’ explanation blatantly failed, however, given that the entire patient population of the clinical trial had diabetes, a known risk factor for pancreatic cancer.[8]

As Professor Ioannidis and others have noted, we are awash in misleading meta-analyses:

“Currently, there is massive production of unnecessary, misleading, and conflicted systematic reviews and meta-analyses. Instead of promoting evidence-based medicine and health care, these instruments often serve mostly as easily produced publishable units or marketing tools.  Suboptimal systematic reviews and meta-analyses can be harmful given the major prestige and influence these types of studies have acquired.  The publication of systematic reviews and meta-analyses should be realigned to remove biases and vested interests and to integrate them better with the primary production of evidence.”[9]

Whether created for litigation, like the Madigan-Wells meta-analyses, or published in the “peer-reviewed” literature, courts will have to up their game in assessing the validity of such studies. Published meta-analyses have grown exponentially from the 1990s to the present. To date, 248,886 meta-analyses have been published, according to the National Library of Medicine’s Pub-Med database. Last year saw over 35,000 meta-analyses published. So far, this year, 20,416 meta-analyses have been published, and we appear to be on track to have a bumper crop.

The data analytics from Pub-Med provide a helpful visual representation of the growth of meta-analyses in biomedical science.

 

Count of Publications with Keyword Meta-analysis in Pub-Med Database

In 1979, the year I started law school, one meta-analysis was published. Lawyers could still legitimately argue that meta-analyses involved novel methodology that had not been generally accepted. The novelty of meta-analysis wore off sometime between 1988, when Judge Robert Kelly excluded William Nicholson’s meta-analysis of health outcomes among PCB-exposed workers, on grounds that such analyses were “novel,” and 1990, when the Third Circuit reversed Judge Kelly, with instructions to assess study validity.[10] Fortunately, or not, depending upon your point of view, plaintiffs dropped Nicholson’s meta-analysis in subsequent proceedings. A close look at Nicholson’s non-peer reviewed calculations shows that he failed to standardize for age or sex, and that he merely added observed and expected cases, across studies, without weighting by individual study variance. The trial court never had the opportunity to assess the validity vel non of Nicholson’s ersatz meta-analysis.[11] Today, trial courts must pick up on the challenge of assessing study validity of meta-analyses relied upon by expert witnesses, regulatory agencies, and systematic reviews.


[1] In re Incretin-Based Therapies Prods. Liab. Litig., 524 F.Supp.3d 1007 (S.D. Cal. 2021).

[2] In re Incretin-Based Therapies Prods. Liab. Litig., No. 21-55342, 2022 WL 898595 (9th Cir. Mar. 28, 2022) (per curiam).

[3]  “The Treatment of Meta-Analysis in the Third Edition of the Reference Manual on Scientific Evidence” (Nov. 15, 2011).

[4] Food and Drug Administration, Center for Drug Evaluation and Research, “Meta-Analyses of Randomized Controlled Clinical Trials to Evaluate the Safety of Human Drugs or Biological Products – (Draft) Guidance for Industry” (Nov. 2018); Jonathan J. Deeks, Julian P.T. Higgins, Douglas G. Altman, “Analysing data and undertaking meta-analyses,” Chapter 10, in Julian P.T. Higgins, James Thomas, Jacqueline Chandler, Miranda Cumpston, Tianjing Li, Matthew J. Page, and Vivian Welch, eds., Cochrane Handbook for Systematic Reviews of Interventions (version 6.3 updated February 2022); Donna F. Stroup, Jesse A. Berlin, Sally C. Morton, Ingram Olkin, G. David Williamson, Drummond Rennie, David Moher, Betsy J. Becker, Theresa Ann Sipe, Stephen B. Thacker, “Meta-Analysis of Observational Studies: A Proposal for Reporting,” 283 J. Am. Med. Ass’n 2008 (2000); David Moher, Alessandro Liberati, Jennifer Tetzlaff, and Douglas G Altman, “Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement,” 6 PLoS Med e1000097 (2009).

[5] In re Incretin-Based Therapies Prods. Liab. Litig., 524 F.Supp.3d 1007, 1037 (S.D. Cal. 2021). See In re Lipitor (Atorvastatin Calcium) Mktg., Sales Practices & Prods. Liab. Litig. (No. II) MDL2502, 892 F.3d 624, 634 (4th Cir. 2018) (“Result-driven analysis, or cherry-picking, undermines principles of the scientific method and is a quintessential example of applying methodologies (valid or otherwise) in an unreliable fashion.”).

[6] In re Incretin-Based Therapies Prods. Liab. Litig., 524 F.Supp.3d 1007, 1043 (S.D. Cal. 2021).

[7] Id. at 1038.

[8] See, e.g., Albert B. Lowenfels & Patrick Maisonneuve, “Risk factors for pancreatic cancer,” 95 J. Cellular Biochem. 649 (2005).

[9] John P. Ioannidis, “The mass production of redundant, misleading, and conflicted systematic reviews and meta-analyses,” 94 Milbank Quarterly 485 (2016).

[10] In re Paoli R.R. Yard PCB Litig., 706 F. Supp. 358, 373 (E.D. Pa. 1988), rev’d and remanded, 916 F.2d 829, 856-57 (3d Cir. 1990), cert. denied, 499 U.S. 961 (1991). See also Hines v. Consol. Rail Corp., 926 F.2d 262, 273 (3d Cir. 1991).

[11] “The Shmeta-Analysis in Paoli” (July 11, 2019). See James A. Hanley, Gilles Thériault, Ralf Reintjes and Annette de Boer, “Simpson’s Paradox in Meta-Analysis,” 11 Epidemiology 613 (2000); H. James Norton & George Divine, “Simpson’s paradox and how to avoid it,” Significance 40 (Aug. 2015); George Udny Yule, “Notes on the theory of association of attributes in statistics,” 2 Biometrika 121 (1903).

The Faux Bayesian Approach in Litigation

July 13th, 2022

In an interesting series of cases, an expert witness claimed to have arrived at the specific causation of plaintiff’s stomach cancer by using “Bayesian probabilities which consider the interdependence of individual probabilities.” Courtesy of counsel in the cases, I have been able to obtain a copy of the report of the expert witness, Dr. Robert P. Gale. The cases in which Dr. Gale served were all FELA cancer cases against the Union Pacific Railroad, brought for cancers diagnosed in the plaintiffs. Given his research and writings in hematopoietic cancers and molecular biology, Dr. Gale would seem to have been a credible expert witness for the plaintiffs in their cases.[1]

The three cases involving Dr. Gale were all decisions on Rule 702 motions to exclude his causation opinions. In all three cases, the court found Dr. Gale to be qualified to opine on causation, which finding is decided by a very low standard in federal court. In two of the cases, the same judge, federal Magistrate Judge Cheryl R. Zwart, excluded Dr. Gale’s opinions.[2] In at least one of the two cases, the decision seemed rather straightforward, given that Dr. Gale claimed to have ruled out alternative causes of Mr. Hernandez’s stomach cancer. Somehow, despite his qualifications, Dr. Gale missed that Mr. Hernandez had had Helicobacter pylori infections before he was diagnosed with stomach cancer.

In the third case, the district judge denied the Rule 702 motion against Dr. Gale, in a cursory, non-searching review.[3]

The common thread in all three cases is that the courts dutifully noted that Dr. Gale had described his approach to specific causation as involving “Bayesian probabilities which consider the interdependence of individual probabilities.” The judicial decisions never described how Dr. Gale’s invocation of Bayesian probabilities contributed to his specific causation opinion, and a careful review of Dr. Gale’s report reveals no such analysis. To be explicit, there was no discussion of prior or posterior probabilities or odds, no discussion of likelihood ratios, or Bayes factors. There was absolutely nothing in Dr. Gale’s report that would warrant his claim that he had done a Bayesian analysis of specific causation or of the “interdependence of individual probabilities” of putative specific causes.
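For readers who have not seen one, a bare-bones Bayesian calculation is sketched below in Python. It is emphatically not Dr. Gale’s method, because his report contained no such analysis; it is only a minimal illustration of the ingredients listed above (a prior probability, a likelihood ratio, and the resulting posterior), using the standard odds form of Bayes’ theorem. The function name and the input values are hypothetical and chosen only for illustration.

```python
def posterior_probability(prior_probability, likelihood_ratio):
    """Update a prior probability with a likelihood ratio (odds form of Bayes' theorem).

    posterior odds = prior odds * likelihood ratio
    """
    if not 0.0 < prior_probability < 1.0:
        raise ValueError("prior_probability must lie strictly between 0 and 1")
    if likelihood_ratio <= 0.0:
        raise ValueError("likelihood_ratio must be positive")
    prior_odds = prior_probability / (1.0 - prior_probability)
    posterior_odds = prior_odds * likelihood_ratio
    # Convert the posterior odds back into a probability.
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical inputs: a 20% prior probability for a candidate cause,
# and evidence four times more likely if that cause was operative than if not.
example_posterior = posterior_probability(prior_probability=0.20, likelihood_ratio=4.0)
print(f"posterior probability: {example_posterior:.2f}")
```

With these assumed inputs, prior odds of 1:4 become posterior odds of 1:1, or a posterior probability of about one half. An expert witness who claims a Bayesian analysis should be able to identify each of these quantities and defend the source of each.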

We might forgive the credulity of the judicial officers in these cases, but why would Dr. Gale state that he had done a Bayesian analysis? The only reason that suggests itself is that Dr. Gale was bloviating in order to give his specific causation opinions an aura of scientific and mathematical respectability. Falsus in duo, falsus in omnibus.[4]


[1] See, e.g., Robert Peter Gale, et al., Fetal Liver Transplantation (1987); Robert Peter Gale & Thomas Hauser, Chernobyl: The Final Warning (1988); Kenneth A. Foon, Robert Peter Gale, et al., Immunologic Approaches to the Classification and Management of Lymphomas and Leukemias (1988); Eric Lax & Robert Peter Gale, Radiation: What It Is, What You Need to Know (2013).

[2] Byrd v. Union Pacific RR, 453 F.Supp.3d 1260 (D. Neb. 2020) (Zwart, M.J.); Hernandez v. Union Pacific RR, No. 8:18CV62 (D. Neb. Aug. 14, 2020).

[3] Langrell v. Union Pacific RR, No. 8:18CV57, 2020 WL 3037271 (D. Neb. June 5, 2020) (Bataillon, S.J.).

[4] Dr. Gale’s testimony has not fared well elsewhere. See, e.g., In re Incretin-Based Therapies Prods. Liab. Litig., 524 F.Supp.3d 1007 (S.D. Cal. 2021) (excluding Gale); Wilcox v. Homestake Mining Co., 619 F. 3d 1165 (10th Cir. 2010); June v. Union Carbide Corp., 577 F. 3d 1234 (10th Cir. 2009) (affirming exclusion of Dr. Gale and entry of summary judgment); Finestone v. Florida Power & Light Co., 272 F. App’x 761 (11th Cir. 2008); In re Rezulin Prods. Liab. Litig., 309 F.Supp.2d 531 (S.D.N.Y. 2004) (excluding Dr. Gale from offering ethical opinions).

Small Relative Risks and Causation (General & Specific)

June 28th, 2022

The Bradford Hill Predicate: Ruling Out Random and Systematic Error

In two recent posts, I spent some time discussing a recent law review article, which had some important things to say about specific causation.[1] One of several points from which I dissented was the article’s argument that Sir Austin Bradford Hill had not made explicit that ruling out random and systematic error was required before assessing his nine “viewpoints” on whether an association was causal. I take some comfort in the correctness of my interpretation of Sir Austin’s famous article from reading the analysis of his friend and colleague’s views offered by no less an authority than Sir Richard Doll:

“In summary, we have to show, first, that the association cannot reasonably be explained by chance (bearing in mind that extreme chances do turn up from time to time or no one would buy a ticket in a national lottery), by methodological bias (which can have many sources), or by confounding (which needs to be explored but should not be postulated without some idea of what it might be). Second, we have to see whether the available evidence gives positive support to the concept of causality: that is to say, how it matches up to Hill’s (1965) guidelines (Table 1).”[2]

On the issue of whether small relative risks can establish general causation, the Differential Etiology paper urged caution in interpreting results when “strength of a relationship is modest.” The strength of an association is, of course, one of the nine Bradford Hill viewpoints, which come into play after we have a “clear-cut” association beyond what we would care to attribute to chance. Additionally, strength of association is primarily a quantitative assessment, and the advice given about caution in the face of “modest” associations is not terribly helpful. The scientific literature does better.

Sir Richard’s 2002 paper is in a sense a scientific autobiography about some successes in discerning causal associations from observational studies. Unlike the reports of expert witnesses for the lawsuit industry, Sir Richard’s essay is notable for its intellectual humility. In addition to its clear and explicit articulation of the need to rule out random and systematic error before proceeding to a consideration of Sir Austin’s nine guidelines, Sir Richard Doll’s 2002 essay is instructive for judges and lawyers for other reasons. For example, he raises and explains the problem encountered for causal inference by small relative risks:

“Small relative risks of the order of 2:1 or even less are what are likely to be observed, like the risk now recorded for childhood leukemia and exposure to magnetic fields of 0.4 µT or more (Ahlbom et al. 2000) that are seldom encountered in the United Kingdom. And here the problems of eliminating bias and confounding are immense.”[3]

Sir Richard opines that relative risks under two can be shown to be causal associations, but often with massive data, randomization, and a good deal of support from experimental work.

Another Sir Richard, Sir Richard Peto, along with Sir Richard Doll, raised this concern in their classic essay on the causes of cancer, where they noted that relative risks between one and two create extremely difficult problems of interpretation because the role of the association cannot be confidently disentangled from the contribution of biases.[4] Small relative risks are thus seen as raising a concern about bias and confounding.[5]

In the legal world, courts have recognized that the larger the relative risk, or the strength of association, the more likely a general causation inference can be drawn, even when they blithely ignored the role of actual or residual confounding.[6]

The chapters on statistics and on epidemiology in the current (third) edition of the Reference Manual on Scientific Evidence directly tie the magnitude of the association to the elimination of confounding as an alternative explanation for causality of an association. A larger “effect size,” such as for smoking and lung cancer (greater than ten-fold, and often higher than 30-fold), eliminates the need to worry about confounding:

“Many confounders have been proposed to explain the association between smoking and lung cancer, but careful epidemiological studies have ruled them out, one after the other.”[7]

*  *  *  *  *  *

“A relative risk of 10, as seen with smoking and lung cancer, is so high that it is extremely difficult to imagine any bias or confounding factor that might account for it. The higher the relative risk, the stronger the association and the lower the chance that the effect is spurious. Although lower relative risks can reflect causality, the epidemiologist will scrutinize such associations more closely because there is a greater chance that they are the result of uncontrolled confounding or biases.”[8]

The Reference Manual omits the converse: the lower the relative risk, the weaker the association and the greater the chance that the apparent effect is spurious. The authors’ intent, however, is clear enough. In the Appendix, below, I have collected some pronouncements from the scientific literature that urge caution in drawing causal inferences in the face of weak associations, but with more quantitative guidance.

Small RRs and Specific Causation

Sir Richard Doll was among the first generation of epidemiologists in the academic world. He eschewed the use of epidemiology for discerning the cause of an individual’s disease:

“That asbestos is a cause of lung cancer in this practical sense is incontrovertible, but we can never say that asbestos was responsible for the production of the disease in a particular patient, as there are many other etiologically significant agents to which the individual may have been exposed, and we can speak only of the extent to which the risk of the disease was increased by the extent of his or her exposure.”[9]

On the individual attribution issue, Sir Richard’s views do not hold up as well as his analysis of general causation. Epidemiologic study results are used to predict future disease in individuals, to guide screening and prophylaxis decisions, to determine pharmacologic and surgical interventions in individuals, and to provide prognoses to individuals. Just as confounding falls by the wayside in the analysis of general causation with relative risks greater than 20, so too do the concerns about equating increased risk with specific causation.

The urn model of probability, however, gives us some insight into attributability. If we expected 100 cases of a disease in a sample of a certain size, but we observed 200 cases, then we would have 100 expected and 100 excess cases. Attribution would be no better than a flip of a coin. If, however, the relative risk were 20, we might have 100 expected cases and 1,900 excess cases, for 2,000 observed cases in all. The odds of a given case’s being an excess case are rather strong, and even the agnostics and dissenters from probabilistic reasoning in individual cases become weak-kneed about denying recovery when the claimant is similar to the cases seen in the study sample.
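The urn arithmetic can be made explicit with a short Python sketch. It assumes, as the urn model does, that every observed case is an interchangeable draw, so that the chance of a given case’s being an excess case is simply the excess cases divided by all observed cases, or (RR − 1)/RR. The function names are hypothetical, and the sketch ignores sampling error, bias, and confounding.

```python
def urn_counts(expected_cases, relative_risk):
    """Return (observed, excess) case counts implied by a relative risk."""
    if relative_risk < 1.0:
        raise ValueError("the urn model here assumes a relative risk of at least 1")
    observed = expected_cases * relative_risk
    excess = observed - expected_cases
    return observed, excess


def excess_fraction(relative_risk):
    """Fraction of observed cases that are excess cases: (RR - 1) / RR."""
    if relative_risk < 1.0:
        raise ValueError("the urn model here assumes a relative risk of at least 1")
    return (relative_risk - 1.0) / relative_risk


for rr in (2.0, 20.0):
    observed, excess = urn_counts(expected_cases=100, relative_risk=rr)
    print(
        f"RR = {rr:>4}: observed = {observed:.0f}, excess = {excess:.0f}, "
        f"chance a case is an excess case = {excess_fraction(rr):.2f}"
    )
```

With a relative risk of 2, the 100 expected cases are matched by 100 excess cases, and the fraction is one half, the coin flip. With a relative risk of 20, the 100 expected cases sit among 2,000 observed cases, 1,900 of them excess, and the fraction rises to 0.95.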

******************Appendix*************************

Norman E. Breslow & N. E. Day, “Statistical Methods in Cancer Research,” in The Analysis of Case-Control Studies 36 (IARC Pub. No. 32, 1980) (“[r]elative risks of less than 2.0 may readily reflect some unperceived bias or confounding factor”)

Richard Doll & Richard Peto, The Causes of Cancer 1219 (1981) (“when relative risk lies between 1 and 2 … problems of interpretation may become acute, and it may be extremely difficult to disentangle the various contributions of biased information, confounding of two or more factors, and cause and effect.”)

Iain K. Crombie, “The limitations of case-control studies in the detection of environmental carcinogens,” 35 J. Epidem. & Community Health 281, 281 (1981) (“The case-control study is unable to detect very small relative risks (< 1.5) even where exposure is widespread and large numbers of cases of cancer are occurring in the population.”)

Ernst L. Wynder & Geoffrey C. Kabat, “Environmental Tobacco Smoke and Lung Cancer: A Critical Assessment,” in H. Kasuga, ed., Indoor Air Quality 5, 6 (1990) (“An association is generally considered weak if the odds ratio is under 3.0 and particularly when it is under 2.0, as is the case in the relationship of ETS and lung cancer. If the observed relative risk is small, it is important to determine whether the effect could be due to biased selection of subjects, confounding, biased reporting, or anomalies of particular subgroups.”).

Ernst L. Wynder, “Epidemiological issues in weak associations,” 19 Internat’l  J. Epidemiol. S5 (1990)

David Sackett, R. Haynes, Gordon Guyatt, and Peter Tugwell, Clinical  Epidemiology: A Basic Science for Clinical Medicine (2d ed. 1991)

Muin J. Khoury, Levy M. James, W. Dana Flanders, and David J. Erickson, “Interpretation of recurring weak associations obtained from epidemiologic studies of suspected human teratogens,” 46 Teratology 69 (1992);

Lynn Rosenberg, “Induced Abortion and Breast Cancer: More Scientific Data Are Needed,” 86 J. Nat’l Cancer Inst. 1569, 1569 (1994) (“A typical difference in risk (50%) is small in epidemiologic terms and severely challenges our ability to distinguish if it reflects cause and effect or if it simply reflects bias.”) (commenting upon Janet R. Daling, K. E. Malone, L. F. Voigt, E. White, and Noel S. Weiss, “Risk of breast cancer among young women: relationship to induced abortion,” 86 J. Nat’l Cancer Inst. 1584 (1994));

Linda Anderson, “Abortion and possible risk for breast cancer: analysis and inconsistencies,” (Wash. D.C., Nat’l Cancer Institute, Oct. 26, 1994) (“In epidemiologic research, relative risks of less than 2 are considered small and are usually difficult to interpret. Such increases may be due to chance, statistical bias, or effects of confounding factors that are sometimes not evident.”);

Washington Post (Oct. 27, 1994) (quoting Dr. Eugenia Calle, Director of Analytic Epidemiology for the American Cancer Society: “Epidemiological studies, in general are probably not able, realistically, to identify with any confidence any relative risks lower than 1.3 (that is a 30% increase in risk) in that context, the 1.5 [reported relative risk of developing breast cancer after abortion] is a modest elevation compared to some other risk factors that we know cause disease.”)

Gary Taubes, “Epidemiology Faces Its Limits,” 269 Science 164, 168 (July 14, 1995) (quoting Marcia Angell, former editor of the New England Journal of Medicine, as stating that “[a]s a general rule of thumb, we are looking for a relative risk of 3 or more [before accepting a paper for publication], particularly if it is biologically implausible or if it’s a brand new finding.”) (quoting John C. Bailar: “If you see a 10-fold relative risk and it’s replicated and it’s a good study with biological backup, like we have with cigarettes and lung cancer, you can draw a strong inference. * * * If it’s a 1.5 relative risk, and it’s only one study and even a very good one, you scratch your chin and say maybe.”)

Samuel Shapiro, “Bias in the evaluation of low-magnitude associations: an empirical perspective,” 151 Am. J. Epidemiol. 939 (2000)

David A. Freedman & Philip B. Stark, “The Swine Flu Vaccine and Guillain-Barré Syndrome: A Case Study in Relative Risk and Specific Causation,” 64 Law & Contemp. Probs. 49, 61 (2001) (“If the relative risk is near 2.0, problems of bias and confounding in the underlying epidemiologic studies may be serious, perhaps intractable.”).

S. Straus, W. Richardson, P. Glasziou, and R. Haynes, Evidence-Based Medicine. How to Teach and Practice EBM (3d ed. 2005)

David F. Goldsmith & Susan G. Rose, “Establishing Causation with Epidemiology,” in Tee L. Guidotti & Susan G. Rose, eds., Science on the Witness Stand: Evaluating Scientific Evidence in Law, Adjudication, and Policy 57, 60 (2001) (“There is no clear consensus in the epidemiology community regarding what constitutes a ‘strong’ relative risk, although, at a minimum, it is likely to be one where the RR is greater than two; i.e., one in which the risk among the exposed is at least twice as great as among the unexposed.”)

Samuel Shapiro, “Looking to the 21st century: have we learned from our mistakes, or are we doomed to compound them?” 13 Pharmacoepidemiol. & Drug Safety  257 (2004)

Mark Parascandola, Douglas L Weed & Abhijit Dasgupta, “Two Surgeon General’s reports on smoking and cancer: a historical investigation of the practice of causal inference,” 3 Emerging Themes in Epidemiol. 1 (2006)

Heinemann, “Epidemiology of Selected Diseases in Women,” chap. 4, in M.A. Lewis, M. Dietel, P.C. Scriba, W.K. Raff, eds., Biology and Epidemiology of Hormone Replacement Therapy 47, 48 (2006) (discussing the “small relative risks in relation to bias/confounding and causal relation.”)

Roger D. Peng, Francesca Dominici, and Scott L. Zeger, “Reproducible Epidemiologic Research,” 163 Am. J. Epidem. 783, 784 (2006) (“The targets of current investigations tend to have smaller relative risks that are more easily confounded.”)

R. Bonita, R. Beaglehole & T. Kjellström, Basic Epidemiology 93 (W.H.O. 2d ed. 2006) (“A strong association between possible cause and effect, as measured by the size of the risk ratio (relative risk), is more likely to be causal than is a weak association, which could be influenced by confounding or bias. Relative risks greater than 2 can be considered strong.”)

David A. Grimes & Kenneth F. Schulz, “False alarms and pseudo-epidemics: the limitations of observational epidemiology,” 120 Obstet. & Gynecol. 920 (2012) (“Most reported associations in observational clinical research are false, and the minority of associations that are true are often exaggerated. This credibility problem has many causes, including the failure of authors, reviewers, and editors to recognize the inherent limitations of these studies. This issue is especially problematic for weak associations, variably defined as relative risks (RRs) or odds ratios (ORs) less than 4.”)

Kenneth F. Schulz & David A. Grimes, Essential Concepts in Clinical Research: Randomised Controlled Trials and Observational Epidemiology at 75 (2d ed. 2019) (“Even after attempts to minimise selection and information biases and after control for known potential confounding factors, bias often remains. These biases can easily account for small associations. As a result, weak associations (which dominate in published studies) must be viewed with circumspection and humility. Weak associations, defined as relative risks between 0.5 and 2.0, in a cohort study can readily be accounted for by residual bias (Fig. 7.2). Because case-control studies are more susceptible to bias than are cohort studies, the bar must be set higher. In case-control studies, weak associations can be viewed as odds ratios between 0.33 and 3.0 (Fig. 7.3). Results that fall within these zones may be due to bias. Results that fall outside these bounds in either direction may deserve attention.”)

Brian L. Strom, “Basic Principles of Clinical Epidemiology Relevant to Pharmacoepidemiologic Studies,” chap. 3, in Brian L. Strom, Stephen E. Kimmel & Sean Hennessy, eds., Pharmacoepidemiology 48 (6th ed. 2020) (“Conventionally, epidemiologists consider an association with a relative risk of less than 2.0 a weak association.”)


[1] Joseph Sanders, David L. Faigman, Peter B. Imrey, and Philip Dawid, “Differential Etiology: Inferring Specific Causation in the Law from Group Data in Science,” 63 Ariz. L. Rev. 851 (2021) [Differential Etiology].

[2] Richard Doll, “Proof of Causality: deduction from epidemiological observation,” 45 Persp. Biology & Med. 499, 501 (2002) (emphasis added).

[3] Id. at 512.

[4] Richard Doll & Richard Peto, The Causes of Cancer 1219 (1981) (“when relative risk lies between 1 and 2 … problems of interpretation may become acute, and it may be extremely difficult to disentangle the various contributions of biased information, confounding of two or more factors, and cause and effect.”).

[5] “Confounding in the Courts” (Nov. 2, 2018); “General Causation and Epidemiologic Measures of Risk Size” (Nov. 24, 2012).

[6] See King v. Burlington Northern Santa Fe Railway Co., 762 N.W.2d 24, 40 (Neb. 2009) (“the higher the relative risk, the greater the likelihood that the relationship is causal”); Landrigan v. Celotex Corp., 127 N.J. 404, 605 A.2d 1079, 1086 (1992) (“The relative risk of lung cancer in cigarette smokers as compared to nonsmokers is on the order of 10:1, whereas the relative risk of pancreatic cancer is about 2:1. The difference suggests that cigarette smoking is more likely to be a causal factor for lung cancer than for pancreatic cancer.”).

[7] RMSE3d at 219.

[8] RMSE3d at 602. 

[9] Richard Doll, “Proof of Causality: deduction from epidemiological observation,” 45 Persp. Biology & Med. 499, 500 (2002).

Differential Etiologies – Part Two – Ruling Out

June 19th, 2022

Perhaps the most important point of this law review article, “Differential Etiology: Inferring Specific Causation in the Law from Group Data in Science,”  is that general causation is necessary but insufficient, standing alone, to show specific causation. To be sure, the authors proclaimed that strong evidence of general causation somehow reduces the burden to show specific causation, but this pronouncement turned out to be an ipse dixit, without supporting analysis or citation. On general causation itself, what the authors characterized as the “ruling in” part of differential etiology, the authors offered some important considerations for courts to consider. Not the least of the important advice on general causation was urging caution in interpreting results when “strength of a relationship is modest.”[1] Given that they were talking to judges and lawyers, the advice might have taken on greater saliency if the authors explicitly noted that modest strength of a putative relationship means small relative risks, such as those smaller than two or three.

Acute Onset Conditions

The authors’ stated goal of bringing clarity to the determination of differential-etiology is a laudable one. In seeking clarity, they brush away some “easy” cases, such as the causal determination of acute onset conditions. Even so, the authors do not give any concrete examples. A broken bone discovered immediately after a car crash would hardly give a court much pause, but something such as the onset of acute liver failure shortly after ingesting a new medication turns out to be much more complicated than many would anticipate. Viral infections and autoimmune disease must be eliminated, and so such events are clearly in the realm of differential etiology, despite the close temporal proximity.

So-Called Signature Diseases

The authors also try to brush aside the “easy” case of signature diseases as not requiring differential etiology. The complexity of such cases ultimately embarrasses everyone. The authors no doubt thought that they were on safe ground in proffering the example of mesothelioma as a signature cancer caused by only asbestos (without wading into the deeper complexity of what is asbestos and which minerals in what mineralogical habit actually cause the disease).[2] Unfortunately, mesothelioma has never been a truly signature disease. The authors nonetheless consider it as one, with the caveat that mesotheliomas not caused by asbestos are “very rare.” And what was the authority for this statement? The Pennsylvania Supreme Court! Now the Pennsylvania Supreme Court is no doubt, at times, an authority on Pennsylvania law, if only because the Court is the last word on this contorted body of law. The Justices of that Court, however, would probably be the first to disclaim any credibility on the causes of any disease.[3]

The authors further distort the notion of signature diseases by stating that “[v]aginal adenocarcinoma in young women appears to be a signature disease associated with maternal use of DES.”[4] This cannot be right because over 10% of vaginal cancers are adenocarcinomas. The principle of charity requires us to assume that the authors meant to indicate clear cell vaginal adenocarcinoma, but even so, charity will not correct the mistake. DES daughters do indeed have an increased risk of developing clear cell adenocarcinoma, but this type of cancer was well described before DES was ever invented and prescribed to women.[5]

Perhaps the safest ground for signature diseases is in microbiology, where we have infectious disease defined by the microbial agent that is uniquely associated with the disease. Probably close to the infectious diseases are the nutritional deficiency diseases defined by the absence of an essential nutrient or vitamin. To be sure, there are non-infectious diseases such as the pneumoconioses, each defined by the nature of the inhaled particle. Contrary to the authors’ contention, these diseases do not necessarily remove differential etiology from the analysis. Silicosis has a distinctive radiographic appearance, and yet that radiographic appearance is the same in many cases of coccidioidomycosis (Valley Fever). Asbestosis has a different radiographic appearance of the lungs and pleura, but the radiographic patterns might well be confused with the sequelae of rheumatoid arthritis or other interstitial lung diseases. At low levels of profusion of radiographic opacities, diseases such as asbestosis and silicosis have diagnostic criteria that are far from perfect sensitivity and specificity. In one of the very first asbestos cases I defended, the claimant was diagnosed, by no less than the late Dr. Irving Selikoff,[6] with asbestosis, 3/3 on the ILO scale of linear, irregular radiographic lung opacities. An autopsy, however, found that there was no asbestosis at all, or even an elevated tissue fiber burden; the claimant had died of bilateral lymphangitic carcinomatosis.

Definitive Mechanistic Pathway to Individual Causation

The paper presents a limited discussion of genetic causation. In the instance of mutations of highly penetrant alleles, identifying the genetic mutation will provide the general and the specific cause in a case. The authors also acknowledge that there may be cases involving hypothetical biomarkers that reveal a well-documented causal pathway from exposure to disease.

Differential Etiologies

So what happens when the plaintiff is claiming that he has developed a disease of ordinary life, one that has multiple known causes? Disease onset is not acute, but rather after a lengthy latency period. The plaintiff wants to inculpate the supposedly tortious exposure (the tortogen), and avoid the conclusion that any or all of the known alternative causes participated in his case. If there are cases of the disease without known causes (idiopathogens), the claimant will need to exclude idiopathogens in favor of fingering the tortogen as responsible for his bad outcome.

The authors helpfully distinguish differential diagnosis from differential etiology. The confusion of the two concepts has led to courts’ generally over-endorsing the black box of clinical judgment in health effects litigation. At the very least, this article can perhaps help the judiciary to move on from this naïve confusion.[7]

The authors advance the vague notion that somehow “clinical information” can supplement a relative risk that is not greater than two to augment the specific causation inference. This was, to be sure, the assertion of the New Jersey Supreme Court, based upon the improvident concession of the defense lawyer who argued the case.[8] There was nothing in the record of the New Jersey case, however, that would support the relevance of clinical information to the causal analysis of the plaintiff’s colorectal cancer.

The authors also point to a talc ovarian cancer case as exemplifying the use of clinical data to supplement a relative risk below two.[9] The cited case, however, involved expert witnesses who claimed a relative risk greater than two for the tortogen, and who failed to show how clinical information (such as the presence of talc in ovarian tissue) made the claimant any more likely to have had a cancer caused by talc.

Adverting to “clinical information” to supplement the relative risk all-too-often is hand waving that offers no analytical support for the specific causal inference. The clinical factors often are covariates in the multivariate model that generated the relevant relative risk. As such, the relative risk represents an assessment of the strength of the relevant association, independent of the clinical factors that are captured in the covariates, in the multivariate model. In the New Jersey case, Landrigan, the plaintiff had no asbestosis that would suggest he even had a serious exposure to asbestos. In a companion case, Caterinicchio, the plaintiff claimed that he had asbestosis, and somehow this made the causal inference for his colorectal cancer stronger.[10] The epidemiologic studies he relied upon, however, stratified their analyses by length of exposure, and by radiographic category of asbestosis, neither of which suggested any relationship between radiographic findings and colorectal cancer outcome.

Perhaps because the authors are academics, they had to ask questions no one has ever raised in a serious way in litigation, such as whether in addition to the clinical information, claimants could assert that toxicological data could be used to supplement a low (not greater than two) relative risk. The authors state the obvious; namely, toxicologic evidence is best suited to the assessment of general causation. They do not stop there, as they might have. Throwing their stated task of explicating the scientific foundations for specific causation inferences to the wind, the authors tell us that “[t]here is no formula for when such toxicologic evidence can tip the scales on the question of specific causation.”[11] And they wind up telling us vacuously that if the relevant epidemiology showed a small effect size, such as a two percent increased risk (RR = 1.02), then it would be unclear “how any animal data could cause one to substantially alter the best estimate of a human effect to reach a more-likely-than-not threshold.”[12] At this point in their paper, the authors seem to be discussing specific causation, but they offer nothing in the way of scientific evidence or examples of how toxicologic data could supplement a low relative risk (less than or equal to two) to permit a specific causation inference.

Idiopathy

When the analysis of the putative risk is done in a multivariate model that fairly covers the other relevant risks, relative risks less than 100 or so suggest that there is a substantial baseline or background risk for the outcome of concern. When the relative risks identified in such analyses are less than 5 or so, the studies will suggest a reasonable proportion of so-called background cases with idiopathic (unknown) causes. Differential etiologies will have to rule out those mysterious idiopathogens.

If the putative specific cause is the only substance established to cause the outcome of concern, and the RR is greater than 1.0 and less than or equal to 2.0, by definition, there is a large base rate of the disease. No amount of hokey pokey will rule out the background causes. The authors deal with this scenario under the heading of differential etiology in the face of idiopathic causes, and characterize it as a “problem.”
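The arithmetic behind these thresholds can be made explicit. The sketch below is mine, not the authors'. It assumes the textbook attributable fraction among the exposed, (RR - 1)/RR, and it further assumes that the relative risk is unbiased, applies uniformly to the claimant, and that the exposure does not merely accelerate onset. The function name is hypothetical, chosen only for illustration.

```python
def attributable_fraction_exposed(rr: float) -> float:
    """Share of exposed cases attributable to the exposure, (RR - 1) / RR.

    Assumption: a valid, homogeneous relative risk; this is the standard
    textbook formula, not anything taken from the article under review.
    """
    if rr <= 0:
        raise ValueError("relative risk must be positive")
    # A relative risk at or below 1.0 attributes no excess cases to the exposure.
    return max(0.0, (rr - 1.0) / rr)


# Relative risks of the magnitudes discussed in the text.
for rr in (1.2, 2.0, 5.0, 100.0):
    af = attributable_fraction_exposed(rr)
    print(f"RR = {rr:>5}: attributable fraction = {af:.2f}")
```

On these assumptions, a relative risk of exactly two yields an attributable fraction of one half, which is why a relative risk at or below two cannot, without more, carry a more-likely-than-not inference.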

Long story short, the authors conclude that “perhaps it is reasonable for courts to disregard idiopathic causes in those cases where idiopathic causes comprise a relatively small percent of all injuries.”[13] Such cases, however, by definition will be diseases for which most causes are known, and the attributable fractions collectively for the known risks will be very high (say greater than 80 or 90%). Conversely, when the attributable fraction for all known risks is lower than 80%, the unexplained portion of the disease cases will represent idiopathic cases and causes that cannot be ruled out with any confidence.

Differential etiology cannot work in the situation with a substantial baseline risk because there will be a disjunct (idiopathogen(s)) in the first statement of the syllogism, which cannot be ruled out. Thus, even if every other putative cause can be eliminated, the claimant will be left with either the tortogen or the baseline risk as the cause of his injury, and the claimant will never arrive at a conclusion that is free of a disjunction that precludes judgment in his favor. In this scenario, the claimant must lose as a matter of law.

In their discussion of this issue, the authors note that this indeterminacy resulted in the exclusion of plaintiff’s expert witnesses in the notorious case of Milward v. Acuity Specialty Products Group, Inc.[14] In Milward, plaintiff had developed a rare variety of acute myeloid leukemia (AML), which had a large attributable fraction for idiopathic causation. This factual setting simply means that no known cause exists with a large relative risk, or even a small relative risk of 1.3 or so. Remarkably, these authors state that Milward “had prevailed on the general-causation issue” but in fact, no trial was ever held. The defense prevailed at the trial court by way of Rule 702 exclusion of plaintiff’s causation expert witnesses, but the First Circuit reversed and remanded for trial. The only prevailing that took place was the questionable avoidance of exclusion and summary judgment.[15]

On remand, the defense moved again to exclude plaintiffs’ expert witnesses on specific causation. Given that about 75% of AML cases are idiopathic, the court held that the plaintiffs’ expert witnesses’ attempt to proffer a differential etiology was fatally flawed.[16]

The authors cite the Milward specific causation decision, which in turn channeled the Restatement (Third) by couching the argument in terms of probability. If the claimant is left with a disjunction, [tortogen OR idiopathogen], then they suggest that a probability value be assigned to the idiopathogen, to support an inference about the probability that the tortogen was responsible for the claimant’s outcome, [(1 – P(idiopathogen)) x 100%]. Or in Judge Woodlock’s words:

“When a disease has a discrete set of causes, eliminating some number of them significantly raises the probability that the remaining option or options were the cause-in-fact of the disease. Restatement (Third) of Torts: Phys. & Emot. Harm § 28, cmt. c (2010) (‘The underlying premise [of differential etiology] is that each of the [ ] known causes is independently responsible for some proportion of the disease in a given population. Eliminating one or more of these as a possible cause for a specific plaintiff’s disease increases the probability that the agent in question was responsible for that plaintiff’s disease.’). The same cannot be said when eliminating a few possible causes leaves not only fewer possible causes but also a high probability that a cause cannot be identified. (‘When the causes of a disease are largely unknown . . . differential etiology is of little assistance.’).”[17]

The Milward approach is thus a vague, indirect invocation of relative risks and attributable fractions, without specifying the probabilities involved in quantitative terms.  Like obscenity, judges are supposed to discern when the residual probability of idiopathy is too great to permit an inference of specific causation. Somehow, I have the sense we should be able to do better than this.
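One way to make the quantities explicit is to write the Restatement's premise down as arithmetic. The following is only a sketch under stated assumptions: the candidate causes are treated as mutually exclusive, each with a share of cases that sums to one, and ruling out a cause simply renormalizes the remaining shares. Only the 75% idiopathic figure comes from the Milward discussion above; the split of the remaining 25% between the tortogen and another known cause is hypothetical, as is the function name.

```python
def renormalize_after_ruling_out(shares, ruled_out):
    """Drop the ruled-out causes and rescale the remaining shares to sum to 1.

    Assumption: causes are mutually exclusive and the shares are known,
    which real litigation records rarely supply.
    """
    remaining = {cause: s for cause, s in shares.items() if cause not in ruled_out}
    total = sum(remaining.values())
    if total <= 0:
        raise ValueError("at least one cause with a positive share must remain")
    return {cause: s / total for cause, s in remaining.items()}


# 75% idiopathic is the figure from Milward; the other two shares are invented.
shares = {"tortogen": 0.10, "other known cause": 0.15, "idiopathic": 0.75}
posterior = renormalize_after_ruling_out(shares, {"other known cause"})

for cause, p in posterior.items():
    print(f"{cause}: {p:.1%}")
```

Eliminating the other known cause does raise the tortogen's probability, just as the Restatement says, but on these numbers the idiopathic share still dominates and the tortogen remains far below fifty percent.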

Multiple Risks

To their credit, the authors tackle the difficult cases that arise when multiple risks are present. Those multiple risks may be competing risks, including the tortogen, in which case not all participate in bringing about the outcome. Indeed, if there is a baseline risk, the result may still have come about from an idiopathogen. The discussion in Differential Etiologies takes some twists and turns, and I will not discuss all of it here.

Strong tortogen versus one weak competing risk

The authors describe the scenario of strong tortogen versus a single competing risk as one of the “easy cases,” at least when the alternative cause appears to be de minimis:

“If the choice of whether one’s lung cancer was the result of a lifetime of heavy smoking or by a brief encounter with a substance for which there is a significant but weak correlation with lung cancer, in most situations it should be an easy task to rule out the other substance as the specific cause of the individual’s injury.”[18]

Unfortunately, the article’s discussion leaves everything rather vague, without quantifying the risks involved. We can, without too much effort, provide some numbers, although we cannot be sure that the authors would accept the resulting quantification. If the claimant’s lifetime of heavy smoking carried a relative risk of 30, and the claimant worked for a few years in a truck depot where he was exposed to diesel fumes that carried a relative risk of 1.2, it would seem that it should be “an easy task” to rule out diesel fumes and rule in smoking. Note however that ease of the inference is lubricated by the size of the relative risks involved, one much larger than two, and the other much smaller than two, and the absence of any suggestion of interaction or synergy between them. If the tortogen in this scenario is tobacco, the plaintiff wins readily. If the tortogen is diesel fumes, the plaintiff loses. Query, if this scenario arises in a case against the tobacco company, whether the alternative causation defense of exposure to diesel fumes fails as a matter of law?
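The numbers in this hypothetical can be pushed one step further. The sketch below assumes, consistent with the absence of interaction posited above, that excess risks simply add: each risk contributes (RR - 1) units of excess risk on top of one unit of background risk. That additive model is my simplifying assumption, not the authors' method, and the function name is hypothetical.

```python
def excess_risk_shares(relative_risks):
    """Split total risk into background plus each factor's excess (RR - 1).

    Assumption: purely additive excess risks with no interaction or synergy.
    """
    excess = {name: rr - 1.0 for name, rr in relative_risks.items()}
    if any(e < 0 for e in excess.values()):
        raise ValueError("this sketch handles only relative risks of 1.0 or more")
    total = 1.0 + sum(excess.values())  # one unit of background risk plus all excesses
    shares = {name: e / total for name, e in excess.items()}
    shares["background"] = 1.0 / total
    return shares


# The relative risks used in the text: heavy smoking 30, diesel fumes 1.2.
shares = excess_risk_shares({"smoking": 30.0, "diesel fumes": 1.2})
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

On these assumptions, smoking accounts for roughly 96 percent of the claimant's total risk and diesel fumes for less than one percent, which is what makes the case "easy" in either direction.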

Synergy between strong tortogen and strong competing risk

The authors cannot resist the temptation to cite the Mt. Sinai catechism[19] of multiplicative risk from smoking and asbestos exposure[20]:

“A well-known example of a synergistic effect is the combined effect of asbestos exposure and smoking on the likelihood of developing lung cancer. For long-term smokers, the relative risk of developing lung cancer compared to those who have never smoked is sometimes estimated to be in the range of 10.0. For individuals substantially exposed to asbestos, the relative risk of developing lung cancer compared to non-exposed individuals is in the range of 5.0. However, if one is unfortunate enough to have been exposed to asbestos and to have been a long-term smoker, the relative risk compared to those unexposed individuals who have not smoked exceeds the sum of the relative risks. One possibility is that the relationship is multiplicative, in the range of 50.0—i.e., a 49-fold risk increment.”[21]

The synergistic interaction is often raised in an attempt to defeat causal apportionment or avoid responsibility for the larger risk, as when smokers attempt to recover for lung cancer from asbestos exposure. Some courts have, however, permitted causal apportionment. In their analysis, the authors of Differential Etiologies simply wink and tell us that “[t]he calculation of synergistic effects is fairly complex.”[22]
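The two benchmark models, at least, involve nothing more than grade-school arithmetic. The sketch below uses the 10 and 5 from the quoted passage and compares the joint relative risk under a purely additive model (excess risks add) with a purely multiplicative one. Both models are idealizations, and the function names are hypothetical.

```python
def joint_rr_additive(rr_a: float, rr_b: float) -> float:
    """Joint relative risk when excess risks add: 1 + (RR_a - 1) + (RR_b - 1)."""
    return rr_a + rr_b - 1.0


def joint_rr_multiplicative(rr_a: float, rr_b: float) -> float:
    """Joint relative risk when the relative risks multiply."""
    return rr_a * rr_b


rr_smoking, rr_asbestos = 10.0, 5.0  # figures from the quoted passage
additive = joint_rr_additive(rr_smoking, rr_asbestos)
multiplicative = joint_rr_multiplicative(rr_smoking, rr_asbestos)

print(f"additive joint RR:       {additive:.0f}")
print(f"multiplicative joint RR: {multiplicative:.0f}")
```

The gap between 14 and 50 is the whole dispute: a joint relative risk near the additive figure suggests no synergy at all, while a figure approaching the multiplicative one is the "catechism" version. The studies cited in the notes place the truth somewhere between the two.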

Tortogen versus Multiple Risks

The scenario in which the tortogen has been “ruled in,” and is present in the claimant’s history, along with multiple other risks is more difficult than one might have imagined. The authors tell us that an individual claimant will fail to show that the tortogen is more likely than not a cause of her injury when one or more of the competing risks is stronger than the risk from the tortogen (assuming no synergy).[23] The authors’ analysis leaves unclear why the claimant does not similarly fail when the strength of the tortogen is equal to that of a competing risk. Similarly, the claimant would appear to have fallen short of the burden of proving the tortogen’s causal role when there are multiple competing risk factors that individually present smaller risks than the tortogen, but for which multiple subsets represent combined competing risks greater than the risk of the tortogen. 
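The subset problem can be illustrated with a small enumeration. The sketch assumes additive excess risks with no synergy, as the authors do for this scenario, and uses invented relative risks: a tortogen at 2.0 and three competing risks at 1.6 each. The function name and all of the figures are hypothetical.

```python
from itertools import combinations


def outweighing_subsets(tortogen_rr, competing_rrs):
    """List subsets of competing risks whose combined excess exceeds the tortogen's.

    Assumption: excess risks (RR - 1) add, with no interaction among factors.
    """
    tortogen_excess = tortogen_rr - 1.0
    names = sorted(competing_rrs)
    hits = []
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            combined = sum(competing_rrs[name] - 1.0 for name in subset)
            if combined > tortogen_excess:
                hits.append(subset)
    return hits


competing = {"risk A": 1.6, "risk B": 1.6, "risk C": 1.6}  # invented figures
hits = outweighing_subsets(2.0, competing)
for subset in hits:
    print(" + ".join(subset))
```

No single competing risk here is as strong as the tortogen, yet every pair, and the full set of three, outweighs it. On these assumptions the claimant who has all three competing risks cannot show that the tortogen was the more likely cause.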

Concluding Thoughts

If the authors had framed the differential enterprise by the logic of iterative disjunctive syllogism, they would have recognized that the premise of the argument must contain the disjunction of all general causes that might have been a cause of the claimant’s disease or injury. Furthermore, unless the idiopathogen(s) is eliminated, which rarely is the case, we are left with a disjunction in the conclusion that prevents judgment for the plaintiff. The extensive analysis provided in Differential Etiologies ultimately must equate risk with cause, and it must do so on a probabilistic basis, even when the probabilities are left vague, and unquantified. Indeed, the authors come close to confronting the reality that we often do not know the cause of many individuals’ diseases. We do know something about the person’s antecedent risks, and we can quantify and compare those risks. Noncommittally, the authors note that courts have been receptive to the practical solution of judging whether the tortogen’s relative risk was greater than two as a measure of sufficiency for specific causation, and that they “agree that theoretically this intuition has appeal.”[24]

Although I have criticized many aspects of the article, it is an important contribution to the legal study of specific causation. Its taxonomy will not likely be the final word on the subject, but it is a major step toward making sense of an area of the law long dominated by clinical black boxes and ipse dixits.


[1] Differential Etiologies at 885. The authors noted that their advice was “especially true in those case-control studies where the cases and controls are not drawn from the same defined population at risk for the outcome under investigation.”

[2] Differential Etiologies at 895.

[3] Differential Etiologies at 895 & n. 154, citing Betz v. Pneumo Abex, LLC, 44 A.3d 27, 51 (Pa. 2012).

[4] Differential Etiologies at 895 & n. 156.

[5] American Cancer Soc’y website, last visited June 19, 2022.

[6] I did not know at the time that Selikoff had failed the B-reader examination.

[7] See, e.g., Bowers v. Norfolk Southern Corp., 537 F. Supp. 2d 1343, 1359–60 (M.D. Ga. 2007) (“The differential diagnosis method has an inherent reliability; the differential etiology method does not. This conclusion does not suggest that the differential etiology approach has no merit. It simply means that courts, when dealing with matters of reliability, should consider opinions based on the differential etiology method with more caution. It also means that courts should not conflate the two definitions.”)

[8] Differential Etiologies at 899 & n.176, citing Landrigan v. Celotex Corp., 127 N.J. 404, 605 A.2d 1079, 1087 (1992).

[9] Differential Etiologies at 899 & n.179, citing Johnson & Johnson Talcum Powder Cases, 249 Cal. Rptr. 3d 642, 671–72 (Cal. Ct. App. 2019).

[10] Caterinicchio v. Pittsburgh Corning Corp., 127 N.J. 428, 605 A.2d 1092 (1992).

[11] Differential Etiologies at 899.

[12] Differential Etiologies at 900.

[13] Differential Etiologies at 915.

[14] 639 F.3d 11 (1st Cir. 2011).

[15] Does it require pointing out that the reversal took place with a highly questionable, unethical amicus brief submitted by a not-for-profit that was founded by the two plaintiffs’ expert witnesses excluded by the trial court? Given that the First Circuit reversed and remanded, and then later affirmed the exclusion of plaintiffs’ expert witnesses on specific causation, and the entry of judgment, the first appellate decision became unnecessary to the final judgment and no longer a clear precedent.

[16] Differential Etiologies at 912, discussing Milward v. Acuity Specialty Prods. Group, Inc., 969 F. Supp. 2d 101, 109 (D. Mass. 2013), aff’d sub. nom., Milward v. Rust-Oleum Corp., 820 F.3d 469, 471, 477 (1st Cir. 2016).

[17] Id., quoting from Milward.

[18] Differential Etiologies at 901.

[19]  “The Mt. Sinai Catechism” (June 11, 2013).

[20] The mantra of 5-10-50 comes from early publications by Irving John Selikoff, and represents a misrepresentation of “never smoked regularly” as “never smoked,” and the use of a non-contemporaneous control group for the non-asbestos exposed, non-smoker base rate. When the external control group was updated to show a relative risk of 20, rather than 10 for smoking only, Selikoff failed to update his analysis. Selikoff’s protégés have recently updated the insulator cohort, repeating many of the original errors, but even so, finding only that “the joint effect of smoking and asbestos alone was additive.” See Steve Markowitz, Stephen Levin, Albert Miller, and Alfredo Morabia, “Asbestos, Asbestosis, Smoking and Lung Cancer: New Findings from the North American Insulator Cohort,” Am. J. Respir. & Critical Care Med. (2013).

[21] Differential Etiologies at 902. The authors do not cite the Selikoff publications, which repeated his dataset and his dubious interpretation endlessly, but rather cite David Faigman, et al., Modern Scientific Evidence: The Law and Science of Expert Testimony § 26.25. (West 2019–2020 ed.). To their credit, the authors describe multiplicative interaction as a possibility, but surely they know that plaintiffs’ expert witnesses recite the Mt. Sinai catechism in courtrooms all around the country, while intoning “reasonable degree of medical certainty.” The authors cite some contrary studies. Differential Etiologies at 902 n.188, citing several reviews including Darren Wraith & Kerrie Mengersen, “Assessing the Combined Effect of Asbestos Exposure & Smoking on Lung Cancer: A Bayesian Approach,” 26 Stats. Med. 1150, 1150 (2007) (evidence supports more than an additive model and less than a multiplicative relation).

[22] Differential Etiologies at 902 & n.189.

[23] Differential Etiologies at 905. The authors note that courts have admitted differential etiology testimony when the tortogen’s risk is greater than the risk from other known risks. Id. citing Cooper v. Takeda Pharms., 191 Cal. Rptr. 3d 67, 79 (Ct. App. 2015).

[24] Differential Etiologies at 896 & n.163.

Differential Etiologies – Part One – Ruling In

June 17th, 2022

You put your right foot in

You put your right foot out

You put your right foot in

And you shake it all about

You do the Hokey Pokey and you turn yourself around

That’s what it’s all about!

 

Ever since the United States Supreme Court decided Daubert v. Merrell Dow Pharmaceuticals, Inc., 509 U.S. 579 (1993), legal scholars, judges, and lawyers have struggled with the structure and validity of expert opinion on specific causation. Professor David Faigman and others have attempted to articulate the scientific basis (if any) for opinion testimony in health-effects litigation that a given person’s disease has been caused by an exposure or condition.

In 2015, as part of a tribute to the late Judge Jack Weinstein, Professor Faigman offered the remarkable suggestion that in advancing differential etiologies, expert witnesses were inventing wholesale an approach that had no foundation or acceptance in their scientific disciplines:

 “Differential etiology is ostensibly a scientific methodology, but one not developed by, or even recognized by, physicians or scientists. As described, it is entirely logical, but has no scientific methods or principles underlying it. It is a legal invention and, as such, has analytical heft, but it is entirely bereft of empirical grounding. Courts and commentators have so far merely described the logic of differential etiology; they have yet to define what that methodology is.”[1]

Faigman is correct that courts often have left unarticulated exactly what the methodology is, but he does not quite make sense when he writes that the method of differential etiology is “entirely logical,” but has no “scientific methods or principles underlying it.” After all, Faigman starts off his essay with a quotation from Thomas Huxley that “science is nothing but trained and organized common sense.”[2] As I have written elsewhere, the form of reasoning involved in differential diagnosis is nothing other than iterative disjunctive syllogism.[3] Either-or reasoning occurs throughout the physical and biological sciences; it is not clear why Faigman declares it un- or extra-scientific.

The strength of Faigman’s claim about the made-up nature of differential etiology appears to be undermined and contradicted by an example that he provides from clinical allergy and immunology:

“Allergists, for example, attempt to identify the etiology of allergic reactions in order to treat them (or to advise the patient to avoid what caused them), though it might still be possible to treat the allergic reactions without knowing their etiology.”

Faigman at 437. Of course, not only allergists try to determine the cause of an individual patient’s disease. Psychiatrists, in the psychoanalytic tradition, certainly do so as well. Physicians who use predictive regression models use group data, in multivariate analyses, to predict outcomes, risk, and mortality in individual patients. Faigman’s claim is similarly undermined by the existence of a few diseases (other than infectious diseases) that are defined by the causative exposure. Silicosis and manganism have played a large role in often bogus litigation, but they represent instances in which a differential diagnosis and puzzle may also be an etiological diagnosis and a puzzle. Of course, to the extent that a disease is defined in terms of causative exposures, there may be serious and even intractable problems caused by the lack of specificity and accuracy in the diagnostic criteria for the supposedly pathognomonic disease.

As I noted at the time of Faigman’s 2015 essay, his suggestion that the concept of “differential etiology” was not used in the sciences themselves, was demonstrably flawed and historically inaccurate.[4]

A year earlier, in a more sustained analysis of specific causation, Professor Faigman went astray in a different direction, this time by stating that:

“it is not customary in the ordinary practice of sociology, epidemiology, anthropology, and related fields (for example, cognitive and social psychology) for professionals to make individual diagnostic judgments derived from group-based data.”[5]

Faigman’s invocation of “ordinary practice” of epidemiology was seriously wide of the mark. Medical practitioners and scientists frequently use epidemiologic data, which are “group-based data,” to make individual diagnostic judgments. Inferences from group data to the individual abound in the diagnostic process itself, where the specificity and sensitivity of disease signs and symptoms are measured by group data. Physicians must rely upon group data to make prognoses for individual patients, and they rely upon group data to predict future disease risks for individual patients. Future disease risks, as in the Framingham risk score for hard coronary heart disease, or the Gail model for breast cancer risk, are, of course, based upon “group-based data.” Medical decisions to intervene, surgically, pharmacologically, or by some other method, all involve applying group data to the individual patient.

Faigman’s 2014 law review article was certainly correct, however, in noting that specific causation inferences and conclusions were often left “profoundly underdefined,” with glib identifications of risk with cause.[6] There was thus plenty of room for further elucidation of specific causation decisions, and I welcome Faigman’s most recent effort to nail conceptual jello to the wall, in a law review article that was published last year.[7]

This new article, “Differential Etiology: Inferring Specific Causation in the Law from Group Data in Science,” is the collaborative product of Professor Faigman and three other academics. Joseph Sanders will be immediately recognizable to the legal community as someone who long pondered causation issues, both general and specific, and who has contributed greatly to the law review literature on causation of health outcomes. In addition to the law professors, Peter B. Imrey, a professor of medicine at the Cleveland Clinic Lerner College of Medicine of Case Western Reserve University, and Philip Dawid, an emeritus professor of statistics in Cambridge University, have joined the effort to make sense of specific causation in the law. The addition of medical and statistical expertise has added greatly to Faigman’s previous efforts, and it has corrected some of his earlier errors and added much nuance to the discussion. The resulting law review article is well worth reading for practitioners. In this post, however, I have not detailed every important insight, but rather I have tried to point out some of the continuing and new errors in the analysis.

The Sanders-Faigman-Imrey-Dawid analysis begins with a lament that:

“there is no body of science to which experts can turn when addressing this issue. Ultimately, much of the evidence that can be brought to bear on this causal question is the same group-level data employed to prove general causation. Consequently, the expert testimony often feels jerry-rigged, an improvisation designed to get through a tough patch.”[8]

As an assessment of the judicial decisions on specific causation, there can be no dissent or appeal from the judgment of these authors. The authors’ use of the term “jerry-rigged” is curious. At first I thought they were straining to avoid using the common phrase “jury rigged” or to avoid inventing a neologism such as “judge rigged.” The American Heritage and Merriam Webster dictionaries, however, describe the phrase “jerry-rigged” as a conflation of “jury-rigged,” a nautical term for a temporary but adequate repair, with “jerry-built,” a war-time pejorative term for makeshift devices put together by Germans. So jerry-rigged it is, and the authors are off and running to try to describe, clarify, and justify the process of drawing specific causation inferences by differential etiology. They might have called what passes for judicial decision making in this area the “hokey pokey.”

The authors begin their analysis of specific causation with a brief acknowledgement that our legal system could abandon any effort to set standards or require rigorous thinking on the matter by simply leaving the matter to the jury.[9] After all, this laissez-faire approach had been the rule of law for centuries. Nevertheless, despite occasional retrograde, recidivist judicial opinions,[10] the authors realize that the law has evolved to a point that some judicial control over specific causation opinions is required. And if judges are going to engage in gatekeeping of specific-causation opinions, they need to explain and justify their decisions in a coherent and cogent fashion.

Having thus dispatched legal nihilism, the authors turn their attention to what they boldly describe as “the first full-scale effort to bring scientific sensibilities – and rigorous statistical thinking – to the legally imperative concept of specific causation.”[11] The claim is remarkable given that tort law has been dealing with the issue for decades, but probably correct given how frequently judges have swept the issue under a judicial rug of impenetrable verbiage and shaggy thinking. The authors also walk back some of Faigman’s earlier claims that there is no science in the assessment of specific causation, although they acknowledge the obvious, that policy issues sometimes play a role in deciding both general and specific causation decisions. The authors also offer the insight, for which they claim novelty, that some of the Bradford Hill guidelines, although stated as part of assessing general causation, have some relevancy to decisions concerning specific causation.[12] Their insight is indeed important, although hardly novel.

Drawing upon some of the clearer judicial decisions, the authors identify three necessary steps to reach a conclusion of specific causation:

“(a) making a proper diagnosis;

(b) supporting (“ruling in”) the plausibility of the alleged cause of the injury on the basis of general evidence and logic; and

(c) particularization, i.e., excluding (‘ruling out’) competing causes in the specific instance under consideration.”[13]

Although this article is ostensibly about specific causation, the authors do not reach a serious discussion of the matter until roughly the 42nd page of a 72-page article. Having described a three-step approach, the authors feel compelled to discuss step one (describing or defining the “diagnosis,” or the outcome of interest), and step two, the “ruling in” process that requires an assessment of general causation.

Although ascertaining general causation is not the focus of this article, the authors give an extensive discourse on it. Indeed, the authors have some useful things to say about steps one and two, and I commend the article to readers for some of its learning. As much as the lawsuit industry might wish to do away with the general causation step, it is not going anywhere soon.[14] The authors also manage to say some things that range from wrong to not even wrong. One example of professoriate wish casting is the following assertion:

“Other things being equal, when the evidence for general causation is strong, and especially when the strength of the exposure–disease relationship as demonstrated in a body of research is substantial, the plaintiff faces a lower threshold in establishing the substance as the cause in a particular case than when the relationship is weaker.”[15]

This assertion appears sans citation or analysis. The generalization fails in the face of counterexamples. The causal role for estrogen in many breast cancers is extremely strong. The International Agency for Research on Cancer classifies estrogen as a Category I, known human carcinogen for breast cancer, even though estrogen is made naturally in the human female, and male, body. In the Women’s Health Initiative clinical trial, researchers reported a hazard ratio of 1.2,[16] but plaintiffs struggled to prevail on specific causation in litigation involving claims of breast cancer caused by post-menopausal hormone therapy. Perhaps the authors meant, by strength of exposure relationship, a high relative risk as well, but that point is taken up when the authors address the “ruling in” step of the three-step approach. In any event, the strength of the case for general causation is quite independent of the specific causation inference, especially in the face of small effect sizes.
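The point about small effect sizes can be made with simple arithmetic. What follows is only a rough sketch, not anything the law review authors propose: it treats the reported hazard ratio as if it were a relative risk, assumes the association is causal and free of bias and confounding, and assumes the effect is spread evenly over all exposed cases. On those assumptions (all mine, as is the function name), the share of exposed cases in excess of the baseline expectation is (RR − 1)/RR.

```python
def attributable_fraction(relative_risk: float) -> float:
    """Excess fraction among exposed cases, (RR - 1) / RR.

    Assumption-laden sketch: the relative risk is taken as causal,
    unbiased, and uniform across exposed persons.
    """
    if relative_risk <= 0:
        raise ValueError("relative risk must be positive")
    return (relative_risk - 1.0) / relative_risk


# Compare the reported hazard ratio of 1.2 with two larger, illustrative values.
for rr in (1.2, 2.0, 5.0):
    share = attributable_fraction(rr)
    print(f"RR = {rr:.1f}: excess fraction among exposed cases = {share:.1%}")
```

On these assumptions, a hazard ratio of 1.2 corresponds to an excess fraction of roughly 17 percent, well short of the more-likely-than-not mark, which the same arithmetic does not reach until the relative risk exceeds 2.0.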

On general causation itself, the authors begin their discussion with “threats to validity,” a topic that they characterize as mostly implicit in the Bradford Hill guidelines. But their suggestion that validity is merely implicit in the guidelines is belied by their citation to Dr. Woodside’s helpful article on the “forgotten predicate” to the nine Bradford Hill guidelines.[17] Bradford Hill explicitly noted that the starting point for considering an association to be causal occurred when “[o]ur observations reveal an association between two variables, perfectly clear-cut and beyond what we would care to attribute to the play of chance.”[18] Sir Austin told us in no uncertain terms that there is no need to consider the nine guidelines until random and systematic error have been rejected.[19]

In this article’s discussion of general causation, Professor Dawid’s influence can be seen in the unusual care to describe and define the p-value.[20] But the discussion devolves into more wish casting, when the authors state that p-values are not the only way to assess random error in research results.

They double down by stating that “[m]any prominent statisticians and other scientists have questioned it, and the need for change is increasingly accepted.”[21] The source for their statement, the American Statistical Association (ASA) 2016 p-value Statement, did not question the utility of the p-value for assessing random error, and this law review provides no other support for other unidentified methods to assess random error. For the most part, the ASA Statement identified misuses and misstatements of p-values, with the caveat that “[s]cientific conclusions and business or policy decisions should not be based only on whether a p-value passes a specific threshold.” This is hardly questioning the importance or utility of p-values in assessing random error.

When one of the cited authors, Ronald Wasserstein, published an editorial in 2019, proclaiming that it was time to move past the p-value, the then president of the ASA, Professor Karen Kafadar, commissioned a task force on the matter. That task force, consisting of many of the world’s leading statisticians, issued a short, but pointed rejection of Wasserstein’s advocacy, and by implication, the position asserted in this law review.[22] Several of the leading biomedical journals that were lobbied by Wasserstein to abandon statistical significance testing reassessed their statistical guidelines and reaffirmed the use of p-values and tests.[23]

Similarly, this law review’s statements that alternatives to frequentist tests (p-values) such as Bayesian inference are “ascendant” have no supporting citations, and generally are an inaccurate assessment of what most biomedical journals are currently publishing.

Despite the care with which this law review article has defined p-values, the authors run off the road when defining a confidence interval:

“A 95% confidence interval … is a one-sided or two-sided interval from a data sample with 95% probability of bounding a fixed, unknown parameter, for which no nondegenerate probability distribution is conceived, under specified assumptions about the data distribution.”[24]

The emphasis added is to point out that the authors assigned a single confidence interval with the property of bounding the true parameter with 95% probability. That property, however, belongs to the infinite set of confidence intervals based upon repeated sampling of the same size from the same population, and constant variance. There is no probability statement to be made for the true parameter, as either in or not in a given confidence interval.
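The repeated-sampling property can be illustrated with a small simulation. This is a sketch under stated assumptions, not a description of any study discussed here: the population is normal, its standard deviation is known, and the true mean, sample size, trial count, and seed are hypothetical values chosen only for illustration. Each simulated sample yields one interval, and that interval either contains the fixed true mean or it does not; the 95 percent describes the long-run share of intervals that do.

```python
import random
import statistics


def coverage(true_mean=10.0, sigma=2.0, n=30, trials=5000, seed=1):
    """Share of simulated 95% intervals that contain the fixed true mean.

    All parameter values are hypothetical; the interval is the simple
    known-sigma (z) interval: sample mean +/- 1.96 * sigma / sqrt(n).
    """
    rng = random.Random(seed)
    half_width = 1.96 * sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        sample_mean = statistics.fmean(sample)
        # This particular interval either covers the true mean or it does not.
        if abs(sample_mean - true_mean) <= half_width:
            hits += 1
    return hits / trials


print(f"share of intervals covering the true mean: {coverage():.3f}")
```

The printed share should sit close to 0.95, but that figure belongs to the procedure over many repetitions, not to any one of the five thousand intervals.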

In an issue that is relevant to general and specific causation, the authors offer some ipse dixit on the issue of “thresholds”:

“with respect to some substance/injury relationships, it is thought that there is no safe threshold. Cancer is the injury for which it is most frequently thought that there is no safe threshold, but even here the mechanism of injury may lead to a different conclusion.”[25]

Here as elsewhere, the authors are repeating dogma, not science, and they ignore the substantial body of scientific evidence that undermines the so-called linear no threshold dose-response curve. The only citation offered is a judicial citation to a case that rejected the no threshold position![26]

So much for “ruling in.” In the next post, I will turn my attention to this law review’s handling of the “ruling out” step of differential etiology.


[1] David L. Faigman & Claire Lesikar, “Organized Common Sense: Some Lessons from Judge Jack Weinstein’s Uncommonly Sensible Approach to Expert Evidence,” 64 DePaul L. Rev. 421, 444 (2015).

[2] Thomas H. Huxley, “On the Education Value of the Natural History Sciences” (1854), in Lay Sermons, Addresses and Reviews 77 (1915).

[3] See, e.g., “Differential Etiology and Other Courtroom Magic” (June 23, 2014) (collecting cases); “Differential Diagnosis in Milward v. Acuity Specialty Products Group” (Sept. 26, 2013).

[4] See “David Faigman’s Critique of G2i Inferences at Weinstein Symposium” (Sept. 11, 2015); Kløve & D. Doehring, “MMPI in epileptic groups with differential etiology,” 18 J. Clin. Psychol. 149 (1962); Kløve & C. Matthews, “Psychometric and adaptive abilities in epilepsy with differential etiology,” 7 Epilepsia 330 (1966); Teuber & K. Usadel, “Immunosuppression in juvenile diabetes mellitus? Critical viewpoint on the treatment with cyclosporin A with consideration of the differential etiology,” 103 Fortschr. Med. 707 (1985); G. May & W. May, “Detection of serum IgA antibodies to varicella zoster virus (VZV)–differential etiology of peripheral facial paralysis. A case report,” 74 Laryngorhinootologie 553 (1995); Alan Roberts, “Psychiatric Comorbidity in White and African-American Illicit Substance Abusers: Evidence for Differential Etiology,” 20 Clinical Psych. Rev. 667 (2000); Mark E. Mullins, Michael H. Lev, Dawid Schellingerhout, Gilberto Gonzalez, and Pamela W. Schaefer, “Intracranial Hemorrhage Complicating Acute Stroke: How Common Is Hemorrhagic Stroke on Initial Head CT Scan and How Often Is Initial Clinical Diagnosis of Acute Stroke Eventually Confirmed?” 26 Am. J. Neuroradiology 2207 (2005); Qiang Fu, et al., “Differential Etiology of Posttraumatic Stress Disorder with Conduct Disorder and Major Depression in Male Veterans,” 62 Biological Psychiatry 1088 (2007); Jesse L. Hawke, et al., “Etiology of reading difficulties as a function of gender and severity,” 20 Reading and Writing 13 (2007); Mastrangelo, “A rare occupation causing mesothelioma: mechanisms and differential etiology,” 105 Med. Lav. 337 (2014).

[5] David L. Faigman, John Monahan & Christopher Slobogin, “Group to Individual (G2i) Inference in Scientific Expert Testimony,” 81 Univ. Chi. L. Rev. 417, 465 (2014).

[6] Id. at 448.

[7] Joseph Sanders, David L. Faigman, Peter B. Imrey, and Philip Dawid, “Differential Etiology: Inferring Specific Causation in the Law from Group Data in Science,” 63 Ariz. L. Rev. 851 (2021) [Differential Etiology]. I am indebted to Kirk Hartley for calling this new publication to my attention.

[8] Id. at 851, 855.

[9] Id. at 855 & n. 8 (citing A. Philip Dawid, David L. Faigman & Stephen E. Fienberg, “Fitting Science into Legal Contexts: Assessing Effects of Causes or Causes of Effects?,” 43 Sociological Methods & Research 359, 363–64 (2014)). See also Barbara Pfeffer Billauer, “The Causal Conundrum: Examining the Medical-Legal Disconnect in Toxic Tort Cases from a Cultural Perspective or How the Law Swallowed the Epidemiologist and Grew Long Legs and a Tail,” 51 Creighton L. Rev. 319 (2018) (arguing for a standard-less approach that allows clinicians to offer their ipse dixit opinions on specific causation).

[10] Differential Etiology at 915 & n.231, 919 & n.244 (citing In re Round-Up Prods. Liab. Litig., 358 F. Supp. 3d 956, 960 (N.D. Cal. 2019)).

[11] Differential Etiology at 856 (emphasis added).

[12] Differential Etiology at 857.

[13] Differential Etiology at 857 & n.14 (citing Best v. Lowe’s Home Ctrs., Inc., 563 F.3d 171, 180 (6th Cir. 2009)).

[14] See Margaret Berger, “Eliminating General Causation: Notes Toward a New Theory of Justice and Toxic Torts,” 97 Colum. L. Rev. 2117 (1997).

[15] Differential Etiology at 864.

[16] Jacques E. Rossouw, et al., “Risks and benefits of estrogen plus progestin in healthy postmenopausal women: Principal results from the Women’s Health Initiative randomized controlled trial,” 288 J. Am. Med. Ass’n 321 (2002).

[17] Differential Etiology at 884 & n.104, citing Frank Woodside & Allison Davis, “The Bradford Hill Criteria: The Forgotten Predicate,” 35 Thomas Jefferson L. Rev. 103 (2013).

[18] Austin Bradford Hill, “The Environment and Disease: Association or Causation?” 58 Proc. Royal Soc’y Med. 295, 295 (1965).  

[19] Differential Etiology at 865.

[20] Differential Etiology at 869.

[21] Differential Etiology at 872, citing Ronald L. Wasserstein and Nicole A. Lazar, “The ASA Statement on p-Values: Context, Process, and Purpose,” 72 Am. Statistician 129 (2016).

[22] Yoav Benjamini, Richard D. De Veaux, Bradley Efron, Scott Evans, Mark Glickman, Barry I. Graubard, Xuming He, Xiao-Li Meng, Nancy M. Reid, Stephen M. Stigler, Stephen B. Vardeman, Christopher K. Wikle, Tommy Wright, Linda J. Young, and Karen Kafadar, “ASA President’s Task Force Statement on Statistical Significance and Replicability,” 15 Ann. Applied Statistics 1084 (2021), 34 Chance 10 (2021).

[23] See “Statistical Significance at the New England Journal of Medicine” (July 19, 2019); see also Deborah G. Mayo, “The NEJM Issues New Guidelines on Statistical Reporting: Is the ASA P-Value Project Backfiring?” Error Statistics Philosophy (July 19, 2019).

[24] Differential Etiology at 898 n.173 (emphasis added).

[25] Differential Etiology at 890.

[26] Differential Etiology at n.134, citing Chlorine Chemistry Council v. Envt’l Protection Agency, 206 F.3d 1286 (D.C. Cir. 2000), which rejected the agency’s assumption that the carcinogenic effects of chloroform in drinking water lacked a threshold.

Improper Reliance upon Regulatory Risk Assessments in Civil Litigation

March 19th, 2022

Risk assessments would seemingly be about assessing risks, but they are not. The Reference Manual on Scientific Evidence defines “risk” as “[a] probability that an event will occur (e.g., that an individual will become ill or die within a stated period of time or by a certain age).”[1] The risk in risk assessment, however, may be zero, or uncertain, or even a probability of benefit. Agencies that must assess risks and set “action levels,” or “permissible exposure limits,” or “acceptable intakes,” often work under great uncertainty, with inspired guesswork, using unproven assumptions.

The lawsuit industry has thus often embraced the false equivalence between agency pronouncements on harmful medicinal, environmental, or occupational exposures and civil litigation adjudication of tortious harms. In the United States, federal agencies such as the Occupational Safety and Health Administration (OSHA), or the Environmental Protection Agency (EPA), and their state analogues, regularly set exposure standards that could not and should not hold up in a common-law tort case. 

Remarkably, there are state and federal court judges who continue to misunderstand and misinterpret regulatory risk assessments, notwithstanding efforts to educate the judiciary. The second edition of the Reference Manual on Scientific Evidence contained a chapter by the late Professor Margaret Berger, who took pains to point out the difference between agency assessments and the adjudication of causal claims in court:

[p]roof of risk and proof of causation entail somewhat different questions because risk assessment frequently calls for a cost-benefit analysis. The agency assessing risk may decide to bar a substance or product if the potential benefits are outweighed by the possibility of risks that are largely unquantifiable because of presently unknown contingencies. Consequently, risk assessors may pay heed to any evidence that points to a need for caution, rather than assess the likelihood that a causal relationship in a specific case is more likely than not.[2]

In March 2003, Professor Berger organized a symposium,[3] the first Science for Judges program (and the last), where the toxicologist Dr. David L. Eaton presented on the differences in the use of toxicology in regulatory pronouncements as opposed to causal assessments in civil actions. As Dr. Eaton noted:

“regulatory levels are of substantial value to public health agencies charged with ensuring the protection of the public health, but are of limited value in judging whether a particular exposure was a substantial contributing factor to a particular individual’s disease or illness.”[4]

The United States Environmental Protection Agency (EPA) acknowledges that estimating “risk” from low level exposures based upon laboratory animal data is fraught because of interspecies differences in longevity, body habitus and size, genetics, metabolism, excretion patterns, genetic homogeneity of laboratory animals, and dosing levels and regimens. The EPA’s assumptions in conducting and promulgating regulatory risk assessments are intended to predict the upper bound of theoretical risk, while fully acknowledging that there may be no actual risk in humans:

“It should be emphasized that the linearized multistage [risk assessment] procedure leads to a plausible upper limit to the risk that is consistent with some proposed mechanisms of carcinogenesis. Such an estimate, however, does not necessarily give a realistic prediction of the risk. The true value of the risk is unknown, and may be as low as zero.”[5]

The approach of the U.S. Food and Drug Administration (FDA) with respect to mutagenic impurities in medications provides an illustrative example of how theoretical and hypothetical risk assessment can be.[6] The FDA’s risk assessment approach is set out in a “Guidance” document, which like all such FDA guidances, describes itself as containing non-binding recommendations, which do not preempt alternative approaches.[7] The agency’s goal is to devise a control strategy for any mutagenic impurity to keep it at or below an “acceptable cancer risk level,” even if the risk or the risk level is completely hypothetical.

The FDA guidance advances the concept of a “Threshold of Toxicological Concern (TTC),” to set an “acceptable intake,” for chemical impurities that pose negligible risks of toxicity or carcinogenicity.[8] The agency describes its risk assessment methodology as “very conservative,” given the frequently unproven assumptions made to reach a quantification of an “acceptable intake”:

“The methods upon which the TTC is based are generally considered to be very conservative since they involve a simple linear extrapolation from the dose giving a 50% tumor incidence (TD50) to a 1 in 10⁻⁶ incidence, using TD50 data for the most sensitive species and most sensitive site of tumor induction. For application of a TTC in the assessment of acceptable limits of mutagenic impurities in drug substances and drug products, a value of 1.5 micrograms (µg)/day corresponding to a theoretical 10⁻⁵ excess lifetime risk of cancer can be justified.”

For more potent mutagenic carcinogens, such as aflatoxin-like-, N-nitroso-, and alkyl-azoxy compounds, the acceptable intake or permissible daily exposure (PDE) is set lower, based upon available animal toxicologic data.

The important divide between regulatory practice and the litigation of causal claims in civil actions arises from the theoretical nature of the risk assessment enterprise. The FDA acknowledges, for instance, that the acceptable intake is set to mark “a small theoretical increase in risk,” and a “highly hypothetical concept that should not be regarded as a realistic indication of the actual risk,” and thus not an actual risk.[9] The hypothetical or theoretical risk that corresponds to the acceptable intake level is clearly small when compared with a human’s lifetime probability of developing cancer (which the FDA states is greater than 1/3, but probably now approaches 40%).

Although the TTC concept allows a calculation of an estimated “safe exposure,” the FDA points out that:

“exceeding the TTC is not necessarily associated with an increased cancer risk given the conservative assumptions employed in the derivation of the TTC value. The most likely increase in cancer incidence is actually much less than 1 in 100,000. *** Based on all the above considerations, any exposure to an impurity that is later identified as a mutagen is not necessarily associated with an increased cancer risk for patients already exposed to the impurity. A risk assessment would determine whether any further actions would be taken.”

In other words, the FDA’s risk assessment exists to guide agency action, not to determine a person’s risk or medical status.[10]

As small and theoretical as the risks are, they are frequently based upon demonstrably incorrect assumptions, such as:

  1. humans are as sensitive as the most sensitive species;
  2. all organs are as sensitive as the most sensitive organ of the most sensitive species;
  3. the dose-response in the most sensitive species is a simple linear relationship;
  4. the linear relationship runs from zero exposure and zero risk to the exposure that yields the so-called TD50, the exposure that yields tumors in 50% of the animals in the experimental model;
  5. the TD50 is calculated based upon the point estimate in the animal model study, regardless of any confidence interval around the point estimate;
  6. the inclusion, in many instances, of non-malignant tumors as part of the assessment of the TD50 exposure;
  7. there is some increased risk for any exposure, no matter how small; that is, there is no threshold below which there is no increased risk; and
  8. the medication with the mutagenic impurity was used daily for 70 years, by a person who weighs 50 kg.
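The arithmetic that these assumptions license is trivially simple, which is part of the problem. A minimal sketch follows, assuming a straight line from zero dose and zero risk up to the TD50 (assumptions 3 and 4) and a 50 kg person (assumption 8). The function name and the TD50 value are hypothetical, chosen only for illustration, and are not taken from the guidance or from any animal study.

```python
def acceptable_intake_ug_per_day(td50_mg_per_kg_day,
                                 target_risk=1e-5,
                                 body_weight_kg=50.0):
    """Linearly extrapolate from the TD50 (50% tumor incidence) to target_risk.

    Sketch only: assumes a straight line through the origin and applies
    no threshold, no confidence interval, and no species adjustment.
    """
    if td50_mg_per_kg_day <= 0:
        raise ValueError("TD50 must be positive")
    if not 0 < target_risk <= 0.5:
        raise ValueError("target risk must lie between 0 and 0.5")
    # Scale the dose by the ratio of the target risk to the 50% incidence.
    dose_mg_per_kg_day = td50_mg_per_kg_day * (target_risk / 0.5)
    # Multiply by body weight, then convert milligrams to micrograms.
    return dose_mg_per_kg_day * body_weight_kg * 1000.0


hypothetical_td50 = 10.0  # mg/kg/day, an invented value for illustration
intake = acceptable_intake_ug_per_day(hypothetical_td50)
print(f"acceptable intake: {intake:.1f} micrograms per day")
```

For a target risk of 1 in 100,000, the line amounts to dividing the TD50 by 50,000; the hypothetical TD50 of 10 mg/kg/day works out to 10 µg/day for a 50 kg person. Nothing in the calculation tests whether any of the eight assumptions is true.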

Although the FDA acknowledges that there may be some instances in which a “less than lifetime level” (LTL) may be appropriate, it places the burden on manufacturers to show the appropriateness of higher LTLs. The FDA’s M7 Guidance observes that

“[s]tandard risk assessments of known carcinogens assume that cancer risk increases as a function of cumulative dose. Thus, cancer risk of a continuous low dose over a lifetime would be equivalent to the cancer risk associated with an identical cumulative exposure averaged over a shorter duration.”[11]
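The cumulative-dose assumption in the quoted passage reduces to proportional arithmetic. The sketch below assumes strict proportionality over a 70-year lifetime of daily use, and takes the 1.5 µg/day figure from the guidance language quoted earlier; the function names are mine, and the calculation is not offered as the guidance’s own schedule of less-than-lifetime limits, which the agency may set on other grounds.

```python
LIFETIME_DAYS = 70 * 365  # 25,550 days under the 70-year daily-use assumption


def cumulative_dose_mg(daily_intake_ug, days=LIFETIME_DAYS):
    """Total dose in milligrams from a constant daily intake in micrograms."""
    return daily_intake_ug * days / 1000.0


def equal_cumulative_daily_intake_ug(lifetime_daily_ug, exposure_days):
    """Daily intake over a shorter period that gives the same cumulative dose.

    Sketch only: assumes risk tracks cumulative dose and nothing else.
    """
    if exposure_days <= 0:
        raise ValueError("exposure_days must be positive")
    return lifetime_daily_ug * LIFETIME_DAYS / exposure_days


print(f"lifetime cumulative dose at 1.5 ug/day: {cumulative_dose_mg(1.5):.3f} mg")
print(f"same cumulative dose over one year: "
      f"{equal_cumulative_daily_intake_ug(1.5, 365):.1f} ug/day")
```

On strict proportionality, 1.5 µg/day over 25,550 days is about 38.3 mg in total, and the same total taken over a single year would be 105 µg/day. Whether risk really depends only on cumulative dose is, of course, another of the assumptions.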

Similarly, the agency acknowledges that there may be a “practical threshold,” as a result of bodily defense mechanisms, such as DNA repair, which counter any ill effects from lower level exposures.[12]

“The existence of mechanisms leading to a dose response that is non-linear or has a practical threshold is increasingly recognized, not only for compounds that interact with non-DNA targets but also for DNA-reactive compounds, whose effects may be modulated by, for example, rapid detoxification before coming into contact with DNA, or by effective repair of induced damage. The regulatory approach to such compounds can be based on the identification of a No-Observed Effect Level (NOEL) and use of uncertainty factors (see ICH Q3C(R5), Ref. 7) to calculate a permissible daily exposure (PDE) when data are available.”

Expert witnesses often attempt to bootstrap their causation opinions by reference to determinations of regulatory agencies that are couched in similar language, but which use different quality and quantity of evidence than is required in the scientific community or in civil courts.

Supreme Court

Industrial Union Dep’t v. American Petroleum Inst., 448 U.S. 607, 656 (1980) (“OSHA is not required to support its finding that a significant risk exists with anything approaching scientific certainty” and “is free to use conservative assumptions in interpreting the data with respect to carcinogens, risking error on the side of overprotection, rather than underprotection.”).

Matrixx Initiatives, Inc. v. Siracusano, 563 U.S. 27, 131 S.Ct. 1309, 1320 (2011) (regulatory agency often makes regulatory decisions based upon evidence that gives rise only to a suspicion of causation) 

First Circuit

Sutera v. Perrier Group of America, Inc., 986 F. Supp. 655, 664-65, 667 (D. Mass. 1997) (a regulatory agency’s “threshold of proof is reasonably lower than that in tort law”; “substances are regulated because of what they might do at given levels, not because of what they will do. . . . The fact of regulation does not imply scientific certainty. It may suggest a decision to err on the side of safety as a matter of regulatory policy rather than the existence of scientific fact or knowledge. . . . The mere fact that substances to which [plaintiff] was exposed may be listed as carcinogenic does not provide reliable evidence that they are capable of causing brain cancer, generally or specifically, in [plaintiff’s] case.”); id. at 660 (warning against the danger that a jury will “blindly accept an expert’s opinion that conforms with their underlying fears of toxic substances without carefully understanding or examining the basis for that opinion.”). Sutera is an important precedent, which involved a claim that exposure to an IARC category I carcinogen, benzene, caused plaintiff’s leukemia. The plaintiff’s expert witness, Robert Jacobson, espoused a “linear, no threshold” theory, and relied upon an EPA regulation, which he claimed supported his opinion that even trace amounts of benzene can cause leukemia.

In re Neurontin Mktg., Sales Practices, and Prod. Liab. Litig., 612 F. Supp. 2d 116, 136 (D. Mass. 2009) (“It is widely recognized that, when evaluating pharmaceutical drugs, the FDA often uses a different standard than a court does to evaluate evidence of causation in a products liability action. Entrusted with the responsibility of protecting the public from dangerous drugs, the FDA regularly relies on a risk-utility analysis, balancing the possible harm against the beneficial uses of a drug. Understandably, the agency may choose to ‘err on the side of caution,’ … and take regulatory action such as revising a product label or removing a drug from the marketplace ‘upon a lesser showing of harm to the public than the preponderance-of-the-evidence or more-likely-than-not standard used to assess tort liability’.”) (internal citations omitted)

Whiting v. Boston Edison Co., 891 F. Supp. 12, 23-24 (D. Mass. 1995) (criticizing the linear no-threshold hypothesis, common to regulatory risk assessments, because it lacks any known or potential error rate, and it cannot be falsified as would any scientific theory)

Second Circuit

Wills v. Amerada Hess Corp., No. 98 CIV. 7126(RPP), 2002 WL 140542 (S.D.N.Y. Jan. 31, 2002), aff’d, 379 F.3d 32 (2d Cir. 2004) (Sotomayor, J.). In this Jones Act case, the plaintiff claimed that her husband’s exposure to benzene and polycyclic aromatic hydrocarbons on board ship caused his squamous cell lung cancer. Plaintiff’s expert witness relied heavily upon the IARC categorization of benzene as a “known” carcinogen, and an “oncogene” theory of causation that claimed there was no safe level of exposure because a single molecule could induce cancer. According to the plaintiff’s expert witness, the oncogene theory dispensed with the need to quantify exposure. Then-Judge Sotomayor, citing Sutera, rejected plaintiff’s no-threshold theory, and the argument that exposure that exceeded the OSHA permissible exposure level supported the causal claim.

Mancuso v. Consolidated Edison Co., 967 F. Supp. 1437, 1448 (S.D.N.Y. 1997) (“recommended or prescribed precautionary standards cannot provide legal causation”; “[f]ailure to meet regulatory standards is simply not sufficient” to establish liability)

In re Agent Orange Product Liab. Litig., 597 F. Supp. 740, 781 (E.D.N.Y. 1984) (Weinstein, J.) (“The distinction between avoidance of risk through regulation and compensation for injuries after the fact is a fundamental one.”), aff’d in relevant part, 818 F.2d 145 (2d Cir.1987), cert. denied sub nom. Pinkney v. Dow Chemical Co., 484 U.S. 1004 (1988). Judge Weinstein explained that regulatory action would not by itself support imposing liability for an individual plaintiff.  Id. at 782. “A government administrative agency may regulate or prohibit the use of toxic substances through rulemaking, despite a very low probability of any causal relationship.  A court, in contrast, must observe the tort law requirement that a plaintiff establish a probability of more than 50% that the defendant’s action injured him.” Id. at 785.

In re Ephedra Prods. Liab. Litig., 393 F. Supp. 2d 181, 189 (S.D.N.Y. 2005) (improvidently relying in part upon FDA ban despite “the absence of definitive scientific studies establishing causation”)

Third Circuit

Gates v. Rohm & Haas Co., 655 F.3d 255, 268 (3d Cir. 2011) (affirming the denial of class certification for medical monitoring) (“plaintiffs could not carry their burden of proof for a class of specific persons simply by citing regulatory standards for the population as a whole”).

In re Schering-Plough Corp. Intron/Temodar Consumer Class Action, 2009 WL 2043604, at *13 (D.N.J. July 10, 2009)(“[T]here is a clear and decisive difference between allegations that actually contest the safety or effectiveness of the Subject Drugs and claims that merely recite violations of the FDCA, for which there is no private right of action.”)

Rowe v. E.I. DuPont de Nemours & Co., Civ. No. 06-1810 (RMB), 2008 U.S. Dist. LEXIS 103528, *46-47 (D.N.J. Dec. 23, 2008) (rejecting reliance upon regulatory findings and risk assessments in which “the basic goal underlying risk assessments . . . is to determine a level that will protect the most sensitive members of the population.”) (quoting David L. Eaton, “Scientific Judgment and Toxic Torts – A Primer in Toxicology for Judges and Lawyers,” 12 J.L. & Pol’y 5, 34 (2003) (“a number of protective, often ‘worst case’ assumptions . . . the resulting regulatory levels . . . generally overestimate potential toxicity levels for nearly all individuals.”))

Soldo v. Sandoz Pharms. Corp., 244 F. Supp. 2d 434, 543 (W.D. Pa. 2003) (finding FDA regulatory proceedings and adverse event reports not adequate or helpful in determining causation; the FDA “ordinarily does not attempt to prove that the drug in fact causes a particular adverse effect.”)

Wade-Greaux v. Whitehall Laboratories, Inc., 874 F. Supp. 1441, 1464 (D.V.I.) (“assumption[s that] may be useful in a regulatory risk-benefit context … ha[ve] no applicability to issues of causation-in-fact”), aff’d, 46 F.3d 1120 (3d Cir. 1994)

O’Neal v. Dep’t of the Army, 852 F. Supp. 327, 333 (M.D. Pa. 1994) (administrative risk figures are “appropriate for regulatory purposes in which the goal is to be particularly cautious [but] overstate the actual risk and, so, are inappropriate for use in determining” civil liability)

Fourth Circuit

Dunn v. Sandoz Pharmaceuticals Corp., 275 F. Supp. 2d 672, 684 (M.D.N.C. 2003) (FDA “risk benefit analysis” “does not demonstrate” causation in any particular plaintiff)

Yates v. Ford Motor Co., 113 F. Supp. 3d 841, 857 (E.D.N.C. 2015) (“statements from regulatory and official agencies … are not bound by standards for causation found in toxic tort law”)

Meade v. Parsley, No. 2:09-cv-00388, 2010 U.S. Dist. LEXIS 125217, *25 (S.D.W. Va. Nov. 24, 2010) (“Inasmuch as the cost-benefit balancing employed by the FDA differs from the threshold standard for establishing causation in tort actions, this court likewise concludes that the FDA-mandated [black box] warnings cannot establish general causation in this case.”)

Rhodes v. E.I. du Pont de Nemours & Co., 253 F.R.D. 365, 377 (S.D. W.Va. 2008) (rejecting the relevance of regulatory assessments, which are precautionary and provide no information about actual risk).

Fifth Circuit

Moore v. Ashland Chemical Co., 126 F.3d 679, 708 (5th Cir. 1997) (holding that expert witness could rely upon a material safety data sheet (MSDS) because mandated by the Hazard Communication Act, 29 C.F.R. § 1910.1200), vacated 151 F.3d 269 (5th Cir. 1998) (affirming trial court’s exclusion of expert witness who had relied upon MSDS).

Johnson v. Arkema Inc., 685 F.3d 452, 464 (5th Cir. 2012) (per curiam) (affirming exclusion of expert witness who relied upon regulatory pronouncements; noting the precautionary nature of such statements, and the absence of specificity for the result claimed at the exposures experienced by plaintiff)

Allen v. Pennsylvania Eng’g Corp., 102 F.3d 194, 198-99 (5th Cir. 1996) (“Scientific knowledge of the harmful level of exposure to a chemical, plus knowledge that the plaintiff was exposed to such quantities, are minimal facts necessary to sustain the plaintiffs’ burden in a toxic tort case”; regulatory agencies, charged with protecting public health, employ a lower standard of proof in promulgating regulations than that used in tort cases). The Allen court explained that it was “also unpersuaded that the “weight of the evidence” methodology these experts use is scientifically acceptable for demonstrating a medical link. . . .  Regulatory and advisory bodies. . .utilize a “weight of the evidence” method to assess the carcinogenicity of various substances in human beings and suggest or make prophylactic rules governing human exposure.  This methodology results from the preventive perspective that the agencies adopt in order to reduce public exposure to harmful substances.  The agencies’ threshold of proof is reasonably lower than that appropriate in tort law, which traditionally makes more particularized inquiries into cause and effect and requires a plaintiff to prove that it is more likely than not that another individual has caused him or her harm.” Id.

Burst v. Shell Oil Co., C. A. No. 14–109, 2015 WL 3755953, *8 (E.D. La. June 16, 2015) (explaining Fifth Circuit’s rejection of regulatory “weight of the evidence” approaches to evaluating causation)

Sprankle v. Bower Ammonia & Chem. Co., 824 F.2d 409, 416 (5th Cir. 1987) (affirming Rule 403 exclusion of evidence of OSHA violations in a claim by a non-employee who experienced respiratory impairment after exposure to anhydrous ammonia; court found that the jury would likely be confused by regulatory pronouncements)

Cano v. Everest Minerals Corp., 362 F. Supp. 2d 814, 825 (W.D. Tex. 2005) (noting that a product that “has been classified as a carcinogen by agencies responsible for public health regulations is not probative of” common-law specific causation) (finding that the linear no-threshold opinion of the plaintiffs’ expert witness, Malin Dollinger, lacked a satisfactory scientific basis)

Burleson v. Glass, 268 F. Supp. 2d 699, 717 (W.D. Tex. 2003) (“the mere fact that [the product] has been classified by certain regulatory organizations as a carcinogen is not probative on the issue of whether [plaintiff’s] exposure. . .caused his. . .cancers”), aff’d, 393 F.3d 577 (5th Cir. 2004)

Newton v. Roche Labs., Inc., 243 F. Supp. 2d 672, 677, 683 (W.D. Tex. 2002) (FDA’s precautionary decisions on labeling are not a determination of causation of specified adverse events) (“Although evidence of an association may … be important in the scientific and regulatory contexts…, tort law requires a higher standard of causation.”)

Molden v. Georgia Gulf Corp., 465 F. Supp. 2d 606, 611 (M.D. La. 2006) (“regulatory and advisory bodies make prophylactic rules governing human exposure based on proof that is reasonably lower than that appropriate in tort law”)

Sixth Circuit

Nelson v. Tennessee Gas Pipeline Co., 243 F.3d 244, 252-53 (6th Cir. 2001) (exposure above regulatory levels is insufficient to establish causation)

Stites v. Sundstrand Heat Transfer, Inc., 660 F. Supp. 1516, 1525 (W.D. Mich. 1987) (rejecting use of regulatory standards to support claim of increased risk, noting the differences in goals and policies between regulation and litigation)

Mann v. CSX Transportation, Inc., case no. 1:07-Cv-3512, 2009 U.S. Dist. Lexis 106433 (N.D. Ohio Nov. 10, 2009) (rejecting expert testimony that relied upon EPA action levels, and V.A. compensation for dioxin exposure, as basis for medical monitoring opinions)

Baker v. Chevron USA, Inc., 680 F. Supp. 2d 865, 880 (S.D. Ohio 2010) (“[R]egulatory agencies are charged with protecting public health and thus reasonably employ a lower threshold of proof in promulgating their regulations than is used in tort cases.”) (“[t]he mere fact that Plaintiffs were exposed to [the product] in excess of mandated limits is insufficient to establish causation”; rejecting Dr. Dahlgren’s opinion and its reliance upon a “one-hit” or “no threshold” theory of causation in which exposure to one molecule of a cancer-causing agent has some finite possibility of causing a genetic mutation leading to cancer, a theory that may be accepted for purposes of setting regulatory standards, but not as reliable scientific knowledge)

Adams v. Cooper Indus., 2007 WL 2219212 at *7 (E.D. Ky. 2007).

Seventh Circuit

Wood v. Textron, Inc., No. 3:10 CV 87, 2014 U.S. Dist. LEXIS 34938 (N.D. Ind. Mar. 17, 2014); 2014 U.S. Dist. LEXIS 141593, at *11 (N.D. Ind. Oct. 3, 2014), aff’d, 807 F.3d 827 (7th Cir. 2015). Dahlgren based his opinions upon the children’s water supply containing vinyl chloride in excess of regulatory levels set by state and federal agencies, including the EPA. Similarly, Ryer-Powder relied upon exposure levels’ exceeding regulatory permissible limits for her causation opinions. The district court, with the approval now of the Seventh Circuit, would have none of this nonsense. Exceeding governmental regulatory exposure limits does not prove causation. The non-compliance does not help the fact finder without knowing “the specific dangers” that led the agency to set the permissible level, and thus the regulations are not relevant at all without this information. Even with respect to specific causation, the regulatory infraction may be weak or null evidence for causation (citing Cunningham v. Masterwear Corp., 569 F.3d 673, 674–75 (7th Cir. 2009)).

Eighth Circuit

Glastetter v. Novartis Pharms. Corp., 107 F. Supp. 2d 1015, 1036 (E.D. Mo. 2000) (“[T]he [FDA’s] statement fails to affirmatively state that a connection exists between [the drug] and the type of injury in this case.  Instead, it states that the evidence received by the FDA calls into question [drug’s] safety, that [the drug] may be an additional risk factor. . .and that the FDA had new evidence suggesting that therapeutic use of [the drug] may lead to serious adverse experiences.  Such language does not establish that the FDA had concluded that [the drug] can cause [the injury]; instead, it indicates that in light of the limited social utility of [the drug for the use at issue] and the reports of possible adverse effects, the drug should no longer be used for that purpose.”) (emphasis in original), aff’d, 252 F.3d 986, 991 (8th Cir. 2001) (FDA’s precautionary decisions on labeling are not a determination of causation of specified adverse events; “methodology employed by a government agency results from the preventive perspective that the agencies adopt”) (“The FDA will remove drugs from the marketplace upon a lesser showing of harm to the public than the preponderance-of-the-evidence or the more-like-than-not standard used to assess tort liability . . . . [Its] decision that [the drug] can cause [the injury] is unreliable proof of medical causation.”)

Wright v. Willamette Indus., Inc., 91 F.3d 1105, 1107 (8th Cir. 1996) (rejecting claim that plaintiffs were not required to show individual exposure levels to formaldehyde from wood particles). The Wright court elaborated upon the difference between adjudication and regulation of harm:

“Whatever may be the considerations that ought to guide a legislature in its determination of what the general good requires, courts and juries, in deciding cases, traditionally make more particularized inquiries into matters of cause and effect.  Actions in tort for damages focus on the question of whether to transfer money from one individual to another, and under common-law principles (like the ones that Arkansas law recognizes) that transfer can take place only if one individual proves, among other things, that it is more likely than not that another individual has caused him or her harm.  It is therefore not enough for a plaintiff to show that a certain chemical agent sometimes causes the kind of harm that he or she is complaining of.  At a minimum, we think that there must be evidence from which the factfinder can conclude that the plaintiff was exposed to levels of that agent that are known to cause the kind of harm that the plaintiff claims to have suffered. See Abuan v. General Elec. Co., 3 F.3d at 333.  We do not require a mathematically precise table equating levels of exposure with levels of harm, but there must be evidence from which a reasonable person could conclude that a defendant’s emission has probably caused a particular plaintiff the kind of harm of which he or she complains before there can be a recovery.”

Gehl v. Soo Line RR, 967 F.2d 1204, 1208 (8th Cir. 1992).

Nelson v. Am. Home Prods. Corp., 92 F. Supp. 2d 954, 958 (W.D. Mo. 2000) (FDA’s precautionary decisions on labeling are not a determination of causation of specified adverse events)

National Bank of Commerce v. Associated Milk Producers, Inc., 22 F. Supp. 2d 942, 961 (E.D.Ark. 1998), aff’d, 191 F.3d 858 (8th Cir. 1999) 

Junk v. Terminix Internat’l Co., 594 F. Supp. 2d 1062, 1071 (S.D. Iowa 2008) (“government agency regulatory standards are irrelevant to [plaintiff’s] burden of proof in a toxic tort cause of action because of the agency’s preventative perspective”)

Ninth Circuit

Henrickson v. ConocoPhillips Co., 605 F. Supp. 2d 1142, 1156 (E.D. Wash. 2009) (excluding expert witness causation opinions in case involving claims that benzene exposure caused leukemia) 

Lopez v. Wyeth-Ayerst Labs., Inc., 1998 WL 81296, at *2 (9th Cir. Feb. 25, 1998) (FDA’s precautionary decisions on labeling are not a determination of causation of specified adverse events)

In re Epogen & Aranesp Off-Label Marketing & Sales Practices Litig., 2009 WL 1703285, at *5 (C.D. Cal. June 17, 2009) (“have not been proven” allegations are an improper “FDA approval” standard; the FDA’s determination to require warning changes without establishing causation does not permit a court or jury, bound by common-law standards, to impose such a duty to warn when common-law causation requirements are not met).

In re Hanford Nuclear Reservation Litig., 1998 U.S. Dist. Lexis 15028 (E.D. Wash. 1998) (radiation and chromium VI), rev’d on other grounds, 292 F.3d 1124 (9th Cir. 2002).

Tenth Circuit

Hollander v. Sandoz Pharm. Corp., 95 F. Supp. 2d 1230, 1239 (W.D. Okla. 2000) (distinguishing FDA’s threshold of proof as lower than appropriate in tort law), aff’d in relevant part, 289 F.3d 1193, 1215 (10th Cir. 2002)

Mitchell v. Gencorp Inc., 165 F.3d 778, 783 n.3 (10th Cir. 1999) (benzene and CML) (quoting Allen, 102 F.3d at 198) (state administrative finding that product was a carcinogen was based upon lower administrative standard than tort standard) (“The methodology employed by a government agency ‘results from the preventive perspective that the agencies adopt in order to reduce public exposure to harmful substances.  The agencies’ threshold of proof is reasonably lower than that appropriate in tort law, which traditionally makes more particularized inquiries into cause and effect and requires a plaintiff to prove it is more likely than not that another individual has caused him or her harm.’”)

In re Breast Implant Litig., 11 F. Supp. 2d 1217, 1229 (D.Colo. 1998)

Johnston v. United States, 597 F. Supp. 374, 393-394 (D. Kan. 1984) (noting that the linear no-threshold hypothesis is based upon a prudent assumption designed to overestimate risk; speculative hypotheses are not appropriate in determining whether one person has harmed another)

Eleventh Circuit

Rider v. Sandoz Pharmaceuticals Corp., 295 F.3d 1194, 1201 (11th Cir. 2002) (FDA may take regulatory action, such as revising warning labels or withdrawing drug from the market “upon a lesser showing of harm to the public than the preponderance-of-the-evidence or more-likely-than-not standard used to assess tort liability”) (“A regulatory agency such as the FDA may choose to err on the side of caution. Courts, however, are required by the Daubert trilogy to engage in objective review of the evidence to determine whether it has sufficient scientific basis to be considered reliable.”)

McClain v. Metabolife Internat’l, Inc., 401 F.3d 1233, 1248-1250 (11th Cir. 2005) (ephedra) (allowing that regulators “may pay heed to any evidence that points to a need for caution,” and apply “a much lower standard than that which is demanded by a court of law”) (“[U]se of FDA data and recommendations raises a more subtle methodological issue in a toxic tort case. The issue involves identifying and contrasting the type of risk assessment that a government agency follows for establishing public health guidelines versus an expert analysis of toxicity and causation in a toxic tort case.”)

In re Seroquel Products Liab. Litig., 601 F. Supp. 2d 1313, 1315 (M.D. Fla. 2009) (noting that administrative agencies “impose[] different requirements and employ[] different labeling and evidentiary standards” because a “regulatory system reflects a more prophylactic approach” than the common law)

Siharath v. Sandoz Pharmaceuticals Corp., 131 F. Supp. 2d 1347, 1370 (N.D. Ga. 2001) (“The standard by which the FDA deems a drug harmful is much lower than is required in a court of law.  The FDA’s lesser standard is necessitated by its prophylactic role in reducing the public’s exposure to potentially harmful substances.”), aff’d, 295 F.3d 1194 (11th Cir. 2002)

In re Accutane Products Liability, 511 F. Supp. 2d 1288, 1291-92 (M.D. Fla. 2007) (acknowledging that regulatory risk assessments are not necessarily realistic in human populations because they are often based upon animal studies, and that the differences between experimental animals and humans are substantial for various health outcomes).

Kilpatrick v. Breg, Inc., 2009 WL 2058384 at *6-7 (S.D. Fla. 2009) (excluding plaintiff’s expert witness), aff’d, 613 F.3d 1329 (11th Cir. 2010)

District of Columbia Circuit

Ethyl Corp. v. E.P.A., 541 F.2d 1, 28 & n. 58 (D.C. Cir. 1976) (detailing the precautionary nature of agency regulations that may be based upon suspicions)

STATE COURTS

Arizona

Lofgren v. Motorola, 1998 WL 299925 (Ariz. Super. Ct. 1998) (finding plaintiffs’ expert witnesses’ testimony that TCE caused cancer to be not generally accepted; “it is appropriate public policy for health organizations such as IARC and the EPA to make judgments concerning the health and safety of the population based on evidence which would be less than satisfactory to support a specific plaintiff’s tort claim for damages in a court of law”)

Colorado

Salazar v. American Sterilizer Co., 5 P.3d 357 (Colo. Ct. App. 2000) (allowing testimony about harmful ethylene oxide exposure based upon OSHA regulations)

Georgia

Butler v. Union Carbide Corp., 712 S.E.2d 537, 552 & n.37 (Ga. App. 2011) (distinguishing risk assessment from causation assessment; citing the New York Court of Appeals decision in Parker for correctly rejecting reliance on regulatory pronouncements for causation determinations)

Illinois

La Salle Nat’l Bank v. Malik, 705 N.E.2d 938 (Ill. App. 3d) (reversing trial court’s exclusion of OSHA PEL for ethylene oxide), writ pet’n den’d, 714 N.E.2d 527 (Ill. 2d 1999)

New York

Parker v. Mobil Oil Corp., 7 N.Y.3d 434, 450, 857 N.E.2d 1114, 1122, 824 N.Y.S.2d 584 (N.Y. 2006) (noting that regulatory agency standards usually represent precautionary principle efforts deliberately to err on side of prevention; “standards promulgated by regulatory agencies as protective measures are inadequate to demonstrate legal causation.”)

In re Bextra & Celebrex, 2008 N.Y. Misc. LEXIS 720, *20, 239 N.Y.L.J. 27 (2008) (characterizing FDA Advisory Panel recommendations as regulatory standard and protective measure).

Juni v. A.O. Smith Water Products Co., 48 Misc. 3d 460, 11 N.Y.S.3d 416, 432, 433 (N.Y. Cty. 2015) (“the reports and findings of governmental agencies [declaring there to be no safe dose of asbestos] are irrelevant as they constitute insufficient proof of causation”), aff’d, 32 N.Y.3d 1116, 116 N.E.3d 75, 91 N.Y.S.3d 784 (2018)

Ohio

Valentine v. PPG Industries, Inc., 821 N.E.2d 580, 597-98 (Ohio App. 2004), aff’d, 850 N.E.2d 683 (Ohio 2006). 

Pennsylvania

Betz v. Pneumo Abex LLC, 44 A. 3d 27 (Pa. 2012).

Texas

Borg-Warner Corp. v. Flores, 232 S.W.3d 765, 770 (Tex. 2007)

Exxon Corp. v. Makofski, 116 S.W.3d 176, 187-88 (Tex. App. 2003) (describing “standards used by OSHA [and] the EPA” as inadequate for causal determinations)


[1] Michael D. Green, D. Michal Freedman, and Leon Gordis, “Reference Guide on Epidemiology,” in Reference Manual on Scientific Evidence 549, 627 (3d ed. 2011).

[2] Margaret A. Berger, “The Supreme Court’s Trilogy on the Admissibility of Expert Testimony,” in Reference Manual on Scientific Evidence at 33 (Fed. Jud. Center 2d ed. 2000).

[3] Margaret A. Berger, “Introduction to the Symposium,” 12 J. L. & Pol’y 1 (2003). Professor Berger described the symposium as a “felicitous outgrowth of a grant from the Common Benefit Trust established in the Silicone Breast Implant Products Liability Litigation to hold a series of conferences at Brooklyn Law School.” Id. at 1. Ironically, that “Trust” was nothing more than the walking-around money of plaintiffs’ lawyers from the Silicone-Gel Breast Implant MDL 926. Although Professor Berger was often hostile to the causation requirement in tort law, her symposium included some well-qualified scientists who amplified her point from the Reference Manual about the divide between regulatory risk assessment and scientific causal assessments.

[4] David L. Eaton, Scientific Judgment and Toxic Torts- A Primer in Toxicology for Judges and Lawyers, 12 J.L. & Pol’y 5, 36 (2003). See also Joseph V. Rodricks and Susan H. Rieth, “Toxicological risk assessment in the courtroom: are available methodologies suitable for evaluating toxic tort and product liability claims?” 27 Regul. Toxicol. & Pharmacol. 21, 27 (1998) (“The public health-oriented resolution of scientific uncertainty [used by regulators] is not especially helpful to the problem faced by a court.”)

[5] EPA “Guidelines for Carcinogen Risk Assessment” at 13 (1986).

[6] The approach is set out in FDA, M7 (R1) Assessment and Control of DNA Reactive (Mutagenic) Impurities in Pharmaceuticals to Limit Potential Carcinogenic Risk: Guidance for Industry (2018) [FDA M7]. This FDA guidance is essentially an adoption of the M7 document of the Expert Working Group (Multidisciplinary) of the International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use (ICH).

[7] FDA M7 at 3.

[8] FDA M7 at 5.

[9] FDA M7 at 5 (emphasis added).

[10] See Labeling of Diphenhydramine Containing Drug Products for Over-the-Counter Human Use, 67 Fed. Reg. 72,555, at 72,556 (Dec. 6, 2002) (“FDA’s decision to act in an instance such as this one need not meet the standard of proof required to prevail in a private tort action. . .. To mandate a warning or take similar regulatory action, FDA need not show, nor do we allege, actual causation.”) (citing Glastetter).

[11] FDA M7 at “Acceptable Intakes in Relation to Less-Than-Lifetime (LTL) Exposure (7.3).”

[12] FDA M7 at 12 (“Mutagenic Impurities With Evidence for a Practical Threshold (7.2.2)”).