TORTINI

For your delectation and delight, desultory dicta on the law of delicts.

The IARC-hy of Evidence – Incoherent & Inconsistent Classifications of Carcinogenicity

September 19th, 2023

Recently, two lawyers wrote an article in a legal trade magazine about excluding epidemiologic evidence in civil litigation.[1] The article was wildly wide of the mark, with several conceptual and practical errors.[2] For starters, the authors discussed Rule 702 as excluding epidemiologic studies and evidence, when the rule addresses the admissibility of expert witness opinion testimony. The authors’ most egregious recommendation, however, was that counsel urge the classifications of chemicals with respect to carcinogenicity, by the International Agency for Research on Cancer (IARC) and by regulatory agencies, as probative for or against causation.

The project of evaluating the evidence for, or against, carcinogenicity of the myriad natural and synthetic agents to which humans are exposed is certainly important, and IARC has taken the project seriously. There have been problems with IARC’s classifications of specific chemicals, pharmaceuticals, or exposure circumstances, but a more basic problem begins with the classes themselves. Classification requires defined classes. I don’t mean to be anti-semantic, but IARC’s definitions and its hierarchy of carcinogenicity are not entirely coherent.

The agency was established in 1965, and by the early 1970s, found itself in the business of preparing “monographs on the evaluation of carcinogenic risk of chemicals to man.” Originally, the IARC set out to classify the carcinogenicity of chemicals, but over the years, its scope increased to include complex mixtures, physical agents such as different forms of radiation, and biological organisms. To date, there have been 134 IARC monographs, addressing 1,045 “agents” (either substances or exposure circumstances).

From its beginnings, the IARC has conducted its classifications through working groups that meet to review and evaluate evidence, and classify the cancer hazards of “agents” under discussion. The breakdown of IARC’s classifications among four groups currently is:

Group 1 – Carcinogenic to humans (127 agents)

Group 2A – Probably carcinogenic to humans (95 agents)

Group 2B – Possibly carcinogenic to humans (323 agents)

Group 3 – Not classifiable as to its carcinogenicity to humans (500 agents)

Previously, the IARC classification included a Group 4 for agents that are probably not carcinogenic for human beings. After decades of review, the IARC placed only a single agent in Group 4, caprolactam, apparently because the agency found everything else in the world to be presumptively a cause of cancer. The IARC could not find sufficiently strong evidence even for water, air, or basic foods to declare that they do not cause cancer in humans. Ultimately, the IARC abandoned Group 4, in favor of a presumption of universal carcinogenicity.

The IARC describes its carcinogen classification procedures, requirements, and rationales in a document known as “The Preamble.” Any discussion of IARC classifications, whether in scientific publications or in legal briefs, without reference to this document should be suspect. The Preamble seeks to define many of the words in the classificatory scheme, some in ways that are not intuitive. This document has been amended over time, and the most recent iteration can be found online at the IARC website.[3]

IARC claims to build its classifications upon “consensus” evaluations, based in turn upon considerations of

(a) the strength of evidence of carcinogenicity in humans,

(b) the evidence of carcinogenicity in experimental (non-human) animals, and

(c) the mechanistic evidence of carcinogenicity.

IARC further claims that its evaluations turn on the use of “transparent criteria and descriptive terms.”[4] This last claim is, for some terms, falsifiable.

The working groups are described as engaged in consensus evaluations, although past evaluations have been reached on simple majority vote of the working group. The working groups are charged with considering the three lines of evidence, described above, for any given agent, and reaching a synthesis in the form of the IARC classificatory scheme. The chart, from the Preamble, below roughly describes how working groups may “mix and match” lines of evidence, of varying degrees of robustness and validity (vel non) to reach a classification.

 

Agents placed in Group 1 are thus “carcinogenic to humans.” Interestingly, IARC does not refer to Group 1 carcinogens as “known” carcinogens, although many commentators are prone to do so. The implication of calling Group 1 agents “known carcinogens” is to distinguish Groups 2A, 2B, and 3 as agents “not known to cause cancer.” The term that IARC uses, rather than “known,” is “sufficient” evidence in humans, but IARC also allows for reaching Group 1 with “limited,” or even “inadequate” human evidence if the other lines of evidence, in experimental animals or mechanistic evidence in humans, are sufficient.

In describing “sufficient” evidence, the IARC’s Preamble does not refer to epidemiologic evidence as potentially “conclusive” or “definitive”; rather its use of “sufficient” implies, perhaps non-transparently, that its labels of “limited” or “inadequate” evidence in humans refer to insufficient evidence. IARC gives an unscientific, inflated weight and understanding to “limited evidence of carcinogenicity,” by telling us that

“[a] causal interpretation of the positive association observed in the body of evidence on exposure to the agent and cancer is credible, but chance, bias, or confounding could not be ruled out with reasonable confidence.”[5]

Remarkably, for IARC, credible interpretations of causality can be based upon evidentiary displays that are confounded or biased. In other words, non-credible associations may support IARC’s conclusions of causality. Causal interpretations of epidemiologic evidence are “credible” according to IARC, even though Sir Austin Bradford Hill’s predicate of a valid association is absent.[6]

The IARC studiously avoids, however, noting that any classification is based upon “insufficient” evidence, even though that evidence may be less than sufficient, as in “limited,” or “inadequate.” A close look at Table 4 reveals that some Group 1 classifications, and all Group 2A, 2B, and 3 classifications, are based upon insufficient evidence of carcinogenicity in humans.

Non-Probable Probabilities

The classification immediately below Group 1 is Group 2A, for agents “probably carcinogenic to humans.” The IARC’s use of “probably” is problematic. Group 1 carcinogens require only “sufficient” evidence of human carcinogenicity, and there is no suggestion that any aspect of a Group 1 evaluation requires apodictic, conclusive, or even “definitive” evidence. Accordingly, the determination of Group 1 carcinogens will be based upon evidence that is essentially probabilistic. Group 2A is also defined as having only “limited evidence of carcinogenicity in humans”; in other words, insufficient evidence of carcinogenicity in humans, or epidemiologic studies with uncontrolled confounding and biases.

Importing IARC 2A classifications into legal or regulatory arenas will allow judgments or regulations based upon “limited evidence” in humans, which as we have seen, can be based upon inconsistent observational studies, and studies that fail to measure and adjust for known and potential confounding risk factors and systematic biases. The 2A classification thus requires little substantively or semantically, and many 2A classifications leave juries and judges to determine whether a chemical or medication caused a human being’s cancer, when the basic predicates for Sir Austin Bradford Hill’s factors for causal judgment have not been met.[7]

An IARC evaluation of Group 2A, or “probably carcinogenic to humans,” would seem to satisfy the legal system’s requirement that an exposure to the agent of interest more likely than not causes the harm in question. Appearances and word usage in different contexts, however, can be deceiving. Probability is a continuous quantitative scale from zero to one. In Bayesian analyses, zero and one are unavailable as prior probabilities because if either were our starting point, no amount of evidence could ever change our judgment of the probability of causation (Cromwell’s Rule). The IARC informs us that its use of “probably” is purely idiosyncratic; the probability that a Group 2A agent causes cancer has “no quantitative” meaning. All the IARC intends is that a Group 2A classification “signifies a greater strength of evidence than possibly carcinogenic.”[8] Group 2A classifications are thus consistent with having posterior probabilities less than 0.5 (or 50 percent). A working group could judge the probability of a substance or a process to be carcinogenic to humans to be greater than zero, but no more than say ten percent, and still vote for a 2A classification, in keeping with the IARC Preamble. This low probability threshold for a 2A classification converts the judgment of “probably carcinogenic” into little more than precautionary prescriptions, rendered when the most probable assessment is either ignorance or lack of causality. There is thus a practical certainty, close to 100%, that a 2A classification will confuse judges and juries, as well as the scientific community.
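The arithmetic behind Cromwell’s Rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the likelihoods (0.9 and 0.1), and the ten percent prior are hypothetical values chosen for the example, and nothing here comes from the IARC Preamble or from any working group’s deliberations.

```python
# Illustrative sketch of Bayes' rule and Cromwell's Rule.
# All numbers and names below are hypothetical, chosen only for the example.

LEGAL_THRESHOLD = 0.5  # "more likely than not"


def bayes_update(prior, p_evidence_if_causal, p_evidence_if_not_causal):
    """Return the posterior probability of causation after one piece of evidence.

    Both likelihoods are assumed to be strictly between 0 and 1.
    """
    numerator = prior * p_evidence_if_causal
    denominator = numerator + (1.0 - prior) * p_evidence_if_not_causal
    return numerator / denominator


def update_repeatedly(prior, n_studies, p_if_causal=0.9, p_if_not_causal=0.1):
    """Apply the same (hypothetical) supportive evidence n_studies times."""
    probability = prior
    for _ in range(n_studies):
        probability = bayes_update(probability, p_if_causal, p_if_not_causal)
    return probability


if __name__ == "__main__":
    for prior in (0.0, 0.10, 1.0):
        posterior = update_repeatedly(prior, n_studies=3)
        meets = posterior > LEGAL_THRESHOLD
        print(f"prior={prior:.2f} posterior={posterior:.4f} "
              f"more likely than not: {meets}")
```

A prior of exactly zero or one stays put no matter how many studies are fed in, whereas a prior of ten percent moves with the evidence. The sketch also makes plain that a starting judgment of ten percent, standing alone, falls well short of the legal system’s more-likely-than-not threshold of 0.5.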

In addition to being based upon limited, that is insufficient, evidence of human carcinogenicity, Group 2A evaluations of “probable human carcinogenicity” connote “sufficient evidence” in experimental animals. An agent can be classified 2A even when the sufficient evidence of carcinogenicity occurs in only one of several non-human animal species, with the other animal species failing to show carcinogenicity. IARC 2A classifications can thus raise the thorny question in court whether a claimant is more like a rat or a mouse.

Courts should, because of the incoherent and diluted criteria for “probably carcinogenic,” exclude expert witness opinions based upon IARC 2A classifications as scientifically insufficient.[9] Given the distortion of ordinary language in its use of defined terms such as “sufficient,” “limited,” and “probable,” any evidentiary value to IARC 2A classifications, and expert witness opinion based thereon, is “substantially outweighed by a danger of … unfair prejudice, confusing the issues, [and] misleading the jury….”[10]

Everything is Possible

Category 2B denotes “possibly carcinogenic.” This year, the IARC announced that a working group had concluded that aspartame, an artificial sugar substitute, was “possibly carcinogenic.”[11] Such an evaluation, however, tells us nothing. If there are no studies at all of an agent, the agent could be said to be possibly carcinogenic. If there are inconsistent studies, even if the better designed studies are exculpatory, scientists could still say that the agent of interest was possibly carcinogenic. The 2B classification does not tell us anything because everything is possible until there is sufficient evidence to inculpate or exculpate it from causing cancer in humans.

It’s a Hazard, Not a Risk

IARC’s classification does not include an assessment of exposure levels. Consequently, there is no consideration of dose or exposure level at which an agent becomes carcinogenic. IARC’s evaluations are limited to whether the agent is or is not carcinogenic. The IARC explicitly concedes that exposure to a carcinogenic agent may carry little risk, but it cannot bring itself to say no risk, or even benefit at low exposures.

As noted, the IARC classification scheme refers to the strength of the evidence that an agent is carcinogenic, and not to the quantitative risk of cancer from exposure at a given level. The Preamble explains the distinction as fundamental:

“A cancer hazard is an agent that is capable of causing cancer, whereas a cancer risk is an estimate of the probability that cancer will occur given some level of exposure to a cancer hazard. The Monographs assess the strength of evidence that an agent is a cancer hazard. The distinction between hazard and risk is fundamental. The Monographs identify cancer hazards even when risks appear to be low in some exposure scenarios. This is because the exposure may be widespread at low levels, and because exposure levels in many populations are not known or documented.”[12]

This attempted explanation reveals important aspects of IARC’s project. First, there is an unproven assumption that there will be cancer hazards regardless of the exposure levels. The IARC contemplates that there may be circumstances of low levels of risk from low levels of exposure, but it elides the important issue of thresholds. Second, IARC’s distinction between hazard and risk is obscured by its own classifications. For instance, when IARC evaluated crystalline silica and classified it in Group 1, it did so for only “occupational exposures.”[13] And yet, when IARC evaluated the hazard of coal exposure, it placed coal dust in Group 3, even though coal dust contains crystalline silica.[14] Similarly, in 2018, the IARC classified coffee as a Group 3,[15] even though every drop of coffee contains acrylamide, which is, according to IARC, a Group 2A “probable human carcinogen.”[16]


[1] Christian W. Castile & Stephen J. McConnell, “Excluding Epidemiological Evidence Under FRE 702,” For The Defense 18 (June 2023) [Castile].

[2] “Excluding Epidemiologic Evidence Under Federal Rule of Evidence 702” (Aug. 26, 2023).

[3] IARC Monographs on the Identification of Carcinogenic Hazards to Humans – Preamble (2019).

[4] Jonathan M. Samet, Weihsueh A. Chiu, Vincent Cogliano, Jennifer Jinot, David Kriebel, Ruth M. Lunn, Frederick A. Beland, Lisa Bero, Patience Browne, Lin Fritschi, Jun Kanno, Dirk W. Lachenmeier, Qing Lan, Gerard Lasfargues, Frank Le Curieux, Susan Peters, Pamela Shubat, Hideko Sone, Mary C. White, Jon Williamson, Marianna Yakubovskaya, Jack Siemiatycki, Paul A. White, Kathryn Z. Guyton, Mary K. Schubauer-Berigan, Amy L. Hall, Yann Grosse, Veronique Bouvard, Lamia Benbrahim-Tallaa, Fatiha El Ghissassi, Beatrice Lauby-Secretan, Bruce Armstrong, Rodolfo Saracci, Jiri Zavadil, Kurt Straif, and Christopher P. Wild, “The IARC Monographs: Updated Procedures for Modern and Transparent Evidence Synthesis in Cancer Hazard Identification,” 112 J. Nat’l Cancer Inst. djz169 (2020).

[5] Preamble at 31.

[6] See Austin Bradford Hill, “The Environment and Disease: Association or Causation?” 58 Proc. Royal Soc’y Med. 295 (1965) (noting that only when “[o]ur observations reveal an association between two variables, perfectly clear-cut and beyond what we would care to attribute to the play of chance,” do we move on to consider the nine articulated factors for determining whether an association is causal).

[7] Id.

[8] IARC Monographs on the Identification of Carcinogenic Hazards to Humans – Preamble 31 (2019) (“The terms probably carcinogenic and possibly carcinogenic have no quantitative significance and are used as descriptors of different strengths of evidence of carcinogenicity in humans.”).

[9] See “Is the IARC lost in the weeds” (Nov. 30, 2019); “Good Night Styrene” (Apr. 18, 2019).

[10] Fed. R. Evid. 403.

[11] Elio Riboli, et al., “Carcinogenicity of aspartame, methyleugenol, and isoeugenol,” 24 The Lancet Oncology 848-850 (2023); IARC, “Aspartame hazard and risk assessment results released” (2023).

[12] Preamble at 2.

[13] IARC Monograph 68, at 41 (1997) (“For these reasons, the Working Group therefore concluded that overall the epidemiological findings support increased lung cancer risks from inhaled crystalline silica (quartz and cristobalite) resulting from occupational exposure.”).

[14] IARC Monograph 68, at 337 (1997).

[15] IARC Monograph No. 116, Drinking Coffee, Mate, and Very Hot Beverages (2018).

[16] IARC Monograph No. 60, Some Industrial Chemicals (1994).

Excluding Epidemiologic Evidence under Federal Rule of Evidence 702

August 26th, 2023

We are 30-plus years into the “Daubert” era, in which federal district courts are charged with gatekeeping the relevance and reliability of scientific evidence. Not surprisingly, given the lawsuit industry’s propensity on occasion to use dodgy science, the burden of awakening the gatekeepers from their dogmatic slumber often falls upon defense counsel in civil litigation. It therefore behooves defense counsel to speak carefully and accurately about the grounds for Rule 702 exclusion of expert witness opinion testimony.

In the context of medical causation opinions based upon epidemiologic evidence, the first obvious point is that whichever party is arguing for exclusion should distinguish between excluding an expert witness’s opinion and prohibiting an expert witness from relying upon a particular study.  Rule 702 addresses the exclusion of opinions, whereas Rule 703 addresses barring an expert witness from relying upon hearsay facts or data unless they are reasonably relied upon by experts in the appropriate field. It would be helpful for lawyers and legal academics to refrain from talking about “excluding epidemiological evidence under FRE 702.”[1] Epidemiologic studies are rarely admissible themselves, but come into the courtroom as facts and data relied upon by expert witnesses. Rule 702 is addressed to the admissibility vel non of opinion testimony, some of which may rely upon epidemiologic evidence.

Another common lawyer mistake is the over-generalization that epidemiologic research provides the “gold standard” of general causation evidence.[2] Although epidemiology is often required, it is not “the medical science devoted to determining the cause of disease in human beings.”[3] To be sure, epidemiologic evidence will usually be required because there is no genetic or mechanistic evidence that will support the claimed causal inference, but counsel should be cautious in stating the requirement. Glib statements by courts that epidemiology is not always required are often simply an evasion of their responsibility to evaluate the validity of the proffered expert witness opinions. A more careful phrasing of the role of epidemiology will make such glib statements more readily open to rebuttal. In the absence of direct biochemical, physiological, or genetic mechanisms that can be identified as involved in bringing about the plaintiffs’ harm, epidemiologic evidence will be required, and it may well be the “gold standard” in such cases.[4]

When epidemiologic evidence is required, counsel will usually be justified in adverting to the “hierarchy of epidemiologic evidence.” Associations are shown in studies of various designs with vastly differing degrees of validity; and of course, associations are not necessarily causal. There are thus important nuances in educating the gatekeeper about this hierarchy. First, it will often be important to educate the gatekeeper about the distinction between descriptive and analytic studies, and the inability of descriptive studies such as case reports to support causal inferences.[5]

There is then the matter of confusion within the judiciary and among “scholars” about whether a hierarchy even exists. The chapter on epidemiology in the Reference Manual on Scientific Evidence appears to suggest the specious position that there is no hierarchy.[6] The chapter on medical testimony, however, takes a different approach in identifying a normative hierarchy of evidence to be considered in evaluating causal claims.[7] The medical testimony chapter specifies that meta-analyses of randomized controlled trials sit atop the hierarchy. Yet, there are divergent opinions about what should be at the top of the hierarchical evidence pyramid. Indeed, the rigorous, large randomized trial will often replace a meta-analysis of smaller trials as the more definitive evidence.[8] Back in 2007, a dubious meta-analysis of over 40 clinical trials led to a litigation frenzy over rosiglitazone.[9] A mega-trial of rosiglitazone showed that the 2007 meta-analysis was wrong.[10]

In any event, courts must purge their beliefs that once there is “some” evidence in support of a claim, their gatekeeping role is over. Randomized controlled trials really do trump observational studies, which virtually always have actual or potential confounding in their final analyses.[11] While disclaimers about the unavailability of randomized trials for putative toxic exposures are helpful, it is not quite accurate to say that it is “unethical to intentionally expose people to a potentially harmful dose of a suspected toxin.”[12] Such trials are done all the time when there is an expected therapeutic benefit that creates at least equipoise between the overall benefit and harm at the outset of the trial.[13]

At this late date, it seems shameful that courts must be reminded that evidence of associations does not suffice to show causation, but prudence dictates giving the reminder.[14] Defense counsel will generally exhibit a Pavlovian reflex to state that causality based upon epidemiology must be viewed through a lens of “Bradford Hill criteria.”[15] Rhetorically, this reflex seems wrong given that Sir Austin himself noted that his nine different considerations were “viewpoints,” not criteria. Taking a position that requires an immediate retreat seems misguided. Similarly, urging courts to invoke and apply the Bradford Hill considerations must be accompanied by the caveat that courts must first apply Bradford Hill’s predicate[16] for the nine considerations:

“Disregarding then any such problem in semantics we have this situation. Our observations reveal an association between two variables, perfectly clear-cut and beyond what we would care to attribute to the play of chance. What aspects of that association should we especially consider before deciding that the most likely interpretation of it is causation?”[17]

Courts should be mindful that the language from the famous, often-cited paper was part of an after-dinner address, in which Sir Austin was speaking informally. Scientists will understand that he was setting out a predicate that calls for

(1) an association, which is

(2) “perfectly clear-cut,” such that bias and confounding are excluded, and

(3) “beyond what we would care to attribute to the play of chance,” with random error kept to an acceptable level, before advancing to further consideration of the nine viewpoints commonly recited.

These predicate findings are the basis for advancing to investigate Bradford Hill’s nine viewpoints; the viewpoints do not replace or supersede the predicates.[18]

Within the nine viewpoints, not all are of equal importance. Consistency among studies, a particularly important consideration, implies that isolated findings in a single observational study will rarely suffice to support causal conclusions. Another important consideration, the strength of the association, has nothing to do with “statistical significance,” which is a predicate consideration, but reminds us that large risk ratios or risk differences provide some evidence that the association does not result from unmeasured confounding. Eliminating confounding, however, is one of the predicate requirements for applying the nine factors. As with any methodology, the Bradford Hill factors are not self-executing. The annals of litigation provide all-too-many examples of undue selectivity, “cherry picking,” and other deviations from the scientist’s standard of care.

Certainly lawyers must steel themselves against recommending the “carcinogen” hazard identifications advanced by the International Agency for Research on Cancer (IARC). There are several problematic aspects to the methods of IARC, not the least of which is IARC’s fanciful use of the word “probable.” According to the IARC Preamble, “probable” has no quantitative meaning.[19] In common legal parlance, “probable” typically conveys a conclusion that is more likely than not. Another problem arises from the IARC’s labeling of “probable human carcinogens,” made in some cases without any real evidence of carcinogenesis in humans. Regulatory pronouncements are even more diluted and often involve little more than precautionary principle wishcasting.[20]


[1] Christian W. Castile & Stephen J. McConnell, “Excluding Epidemiological Evidence Under FRE 702,” For The Defense 18 (June 2023) [Castile]. Although these authors provide an interesting overview of the subject, they fall into some common errors, such as failing to address Rule 703. The article is worth reading for its marshaling of recent case law on the subject, but I detail some of its errors here in the hopes that lawyers will speak more precisely about the concepts involved in challenging medical causation opinions.

[2] Id. at 18. In re Zantac (Ranitidine) Prods. Liab. Litig., No. 2924, 2022 U.S. Dist. LEXIS 220327, at *401 (S.D. Fla. Dec. 6, 2022); see also Horwin v. Am. Home Prods., No. CV 00-04523 WJR (Ex), 2003 U.S. Dist. LEXIS 28039, at *14-15 (C.D. Cal. May 9, 2003) (“epidemiological studies provide the primary generally accepted methodology for demonstrating a causal relation between a chemical compound and a set of symptoms or disease” *** “The lack of epidemiological studies supporting Plaintiffs’ claims creates a high bar to surmount with respect to the reliability requirement, but it is not automatically fatal to their case.”).

[3] See, e.g., Siharath v. Sandoz Pharm. Corp., 131 F. Supp. 2d 1347, 1356 (N.D. Ga. 2001) (“epidemiology is the medical science devoted to determining the cause of disease in human beings”).

[4] See, e.g., Lopez v. Wyeth-Ayerst Labs., No. C 94-4054 CW, 1996 U.S. Dist. LEXIS 22739, at *1 (N.D. Cal. Dec. 13, 1996) (“Epidemiological evidence is one of the most valuable pieces of scientific evidence of causation”); Horwin v. Am. Home Prods., No. CV 00-04523 WJR (Ex), 2003 U.S. Dist. LEXIS 28039, at *15 (C.D. Cal. May 9, 2003) (“The lack of epidemiological studies supporting Plaintiffs’ claims creates a high bar to surmount with respect to the reliability requirement, but it is not automatically fatal to their case”).

[5] David A. Grimes & Kenneth F. Schulz, “Descriptive Studies: What They Can and Cannot Do,” 359 Lancet 145 (2002) (“…epidemiologists and clinicians generally use descriptive reports to search for clues of cause of disease – i.e., generation of hypotheses. In this role, descriptive studies are often a springboard into more rigorous studies with comparison groups. Common pitfalls of descriptive reports include an absence of a clear, specific, and reproducible case definition, and interpretations that overstep the data. Studies without a comparison group do not allow conclusions about cause of disease.”).

[6] Michael D. Green, D. Michal Freedman, and Leon Gordis, “Reference Guide on Epidemiology,” Reference Manual on Scientific Evidence 549, 564 n.48 (citing a paid advertisement by a group of scientists, and misleadingly referring to the publication as a National Cancer Institute symposium) (citing Michele Carbone et al., “Modern Criteria to Establish Human Cancer Etiology,” 64 Cancer Res. 5518, 5522 (2004) (National Cancer Institute symposium [sic] concluding that “[t]here should be no hierarchy [among different types of scientific methods to determine cancer causation]. Epidemiology, animal, tissue culture and molecular pathology should be seen as integrating evidences in the determination of human carcinogenicity.”)).

[7] John B. Wong, Lawrence O. Gostin & Oscar A. Cabrera, “Reference Guide on Medical Testimony,” in Reference Manual on Scientific Evidence 687, 723 (3d ed. 2011).

[8] See, e.g., J.M. Elwood, Critical Appraisal of Epidemiological Studies and Clinical Trials 342 (3d ed. 2007).

[9] See Steven E. Nissen & Kathy Wolski, “Effect of Rosiglitazone on the Risk of Myocardial Infarction and Death from Cardiovascular Causes,” 356 New Engl. J. Med. 2457 (2007). See also “Learning to Embrace Flawed Evidence – The Avandia MDL’s Daubert Opinion” (Jan. 10, 2011).

[10] Philip D. Home, et al., “Rosiglitazone evaluated for cardiovascular outcomes in oral agent combination therapy for type 2 diabetes (RECORD): a multicentre, randomised, open-label trial,” 373 Lancet 2125 (2009).

[11] In re Zantac (Ranitidine) Prods. Liab. Litig., No. 2924, 2022 U.S. Dist. LEXIS 220327, at *402 (S.D. Fla. Dec. 6, 2022) (“Unlike experimental studies in which subjects are randomly assigned to exposed and placebo groups, observational studies are subject to bias due to the possibility of differences between study populations.”).

[12] Castile at 20.

[13] See, e.g., Benjamin Freedman, “Equipoise and the ethics of clinical research,” 317 New Engl. J. Med. 141 (1987).

[14] See, e.g., In re Onglyza (Saxagliptin) & Kombiglyze XR (Saxagliptin & Metformin) Prods. Liab. Litig., No. 5:18-md-2809-KKC, 2022 U.S. Dist. LEXIS 136955, at *127 (E.D. Ky. Aug. 2, 2022); Burleson v. Texas Dep’t of Criminal Justice, 393 F.3d 577, 585-86 (5th Cir. 2004) (affirming exclusion of expert causation testimony based solely upon studies showing a mere correlation between defendant’s product and plaintiff’s injury); Beyer v. Anchor Insulation Co., 238 F. Supp. 3d 270, 280-81 (D. Conn. 2017); Ambrosini v. Labarraque, 101 F.3d 129, 136 (D.C. Cir. 1996).

[15] Castile at 21. See In re Zoloft (Sertraline Hydrochloride) Prods. Liab. Litig., 26 F. Supp. 3d 449, 454-55 (E.D. Pa. 2014).

[16] “Bradford Hill on Statistical Methods” (Sept. 24, 2013); see also Frank C. Woodside, III & Allison G. Davis, “The Bradford Hill Criteria: The Forgotten Predicate,” 35 Thomas Jefferson L. Rev. 103 (2013).

[17] Austin Bradford Hill, “The Environment and Disease: Association or Causation?” 58 Proc. Royal Soc’y Med. 295, 295 (1965).

[18] Castile at 21. See, e.g., In re Onglyza (Saxagliptin) & Kombiglyze XR (Saxagliptin & Metformin) Prods. Liab. Litig., No. 5:18-md-2809-KKC, 2022 U.S. Dist. LEXIS 1821, at *43 (E.D. Ky. Jan. 5, 2022) (“The analysis is meant to apply when observations reveal an association between two variables. It addresses the aspects of that association that researchers should analyze before deciding that the most likely interpretation of [the association] is causation”); Hoefling v. U.S. Smokeless Tobacco Co., LLC, 576 F. Supp. 3d 262, 273 n.4 (E.D. Pa. 2021) (“Nor would it have been appropriate to apply them here: scientists are to do so only after an epidemiological association is demonstrated”).

[19] IARC Monographs on the Identification of Carcinogenic Hazards to Humans – Preamble 31 (2019) (“The terms probably carcinogenic and possibly carcinogenic have no quantitative significance and are used as descriptors of different strengths of evidence of carcinogenicity in humans.”).

[20] “Improper Reliance upon Regulatory Risk Assessments in Civil Litigation” (Mar. 19, 2023).

Reference Manual – Desiderata for 4th Edition – Part VI – Rule 703

February 17th, 2023

One of the most remarkable, and objectionable, aspects of the third edition was its failure to engage with Federal Rule of Evidence 703, and the need for courts to assess the validity of individual studies relied upon. The statistics chapter has a brief, but important discussion of Rule 703, as does the chapter on survey evidence. The epidemiology chapter mentions Rule 703 only in a footnote.[1]

Rule 703 appears to be the red-headed stepchild of the Federal Rules, and it is often ignored and omitted from so-called Daubert briefs.[2] Perhaps part of the problem is that Rule 703 (“Bases of an Expert’s Opinion Testimony”) is one of the most poorly drafted rules in the Federal Rules of Evidence:

“An expert may base an opinion on facts or data in the case that the expert has been made aware of or personally observed. If experts in the particular field would reasonably rely on those kinds of facts or data in forming an opinion on the subject, they need not be admissible for the opinion to be admitted. But if the facts or data would otherwise be inadmissible, the proponent of the opinion may disclose them to the jury only if their probative value in helping the jury evaluate the opinion substantially outweighs their prejudicial effect.”

Despite its tortuous wording, the rule is clear enough in authorizing expert witnesses to rely upon studies that are themselves inadmissible, and allowing such witnesses to disclose the studies that they have relied upon, when there has been the requisite showing of probative value that outweighs any prejudice.

The statistics chapter in the third edition, nonetheless, confusingly suggested that

“a particular study may use a method that is entirely appropriate but that is so poorly executed that it should be inadmissible under Federal Rules of Evidence 403 and 702. Or, the method may be inappropriate for the problem at hand and thus lack the ‘fit’ spoken of in Daubert. Or the study might rest on data of the type not reasonably relied on by statisticians or substantive experts and hence run afoul of Federal Rule of Evidence 703.”[3]

Particular studies, even when beautifully executed, are not admissible. And particular studies are not subject to evaluation under Rule 702, apart from the gatekeeping of expert witness opinion testimony that is based upon the particular studies. To be sure, the reference to Rule 703 is an important and welcome counter to the suggestion, elsewhere in the third edition, that courts should not look at individual studies. The independent review of individual studies is occasionally lost in the shuffle of litigation, and the statistics chapter is correct to note an evidentiary concern whether each individual study may or may not be reasonably relied upon by an expert witness. In any event, reasonably relied upon studies do not ipso facto become admissible.

The third edition’s chapter on Survey Research contains the most explicit direction on Rule 703, in terms of courts’ responsibilities.  In that chapter, the authors instruct that Rule 703:

“redirect[ed] attention to the ‘validity of the techniques employed’. The inquiry under Rule 703 focuses on whether facts or data are ‘of a type reasonably relied upon by experts in the particular field in forming opinions or inferences upon the subject’.”[4]

Although Rule 703 is clear enough on admissibility, the epidemiology chapter described epidemiologic studies broadly as admissible if sufficiently rigorous:

“An epidemiologic study that is sufficiently rigorous to justify a conclusion that it is scientifically valid should be admissible, as it tends to make an issue in dispute more or less likely.”[5]

The authors of the epidemiology chapter acknowledge, in a footnote, “that [h]earsay concerns may limit the independent admissibility of the study, but the study could be relied on by an expert in forming an opinion and may be admissible pursuant to Fed. R. Evid. 703 as part of the underlying facts or data relied on by the expert.”[6]

This footnote is curious, and incorrect. There is no question that hearsay “concerns” “may limit” admissibility of a study; hearsay is inadmissible unless there is a statutory exception.[7] Rule 703 is not one of the exceptions to the rule against hearsay in Article VIII of the Federal Rules of Evidence. An expert witness’s reliance upon a study does not make the study admissible. The authors cite two cases,[8] but neither case held that reasonable reliance by expert witnesses transmuted epidemiologic studies into admissible evidence. The text of Rule 703 itself, and the overwhelming weight of case law interpreting and applying the rule,[9] make clear that the rule does not render scientific studies admissible. The two cases cited by the epidemiology chapter, Kehm and Ellis, both involved “factual findings” in public investigative or evaluative reports, which were independently admissible under Federal Rule of Evidence 803(8)(C).[10] As such, the cases failed to support the chapter’s suggestion that Rule 703 is a rule of admissibility for epidemiologic studies. The third edition thus, in one sentence, confused Rule 703 with an exception to the rule against hearsay, which would prevent the statistically based epidemiologic studies from being received in evidence. The point was reasonably clear, however, that studies “may be offered” to explain an expert witness’s opinion. Under Rule 705, that offer may also be refused.

The Reference Manual was certainly not alone in advancing the notion that studies are themselves admissible. Other well-respected evidence scholars have misstated the law on this issue.[11] The fourth edition would do well to note that scientific studies, and especially epidemiologic studies, involve multiple levels of hearsay. A typical epidemiologic study may contain hearsay leaps from patient to clinician, to laboratory technicians, to specialists interpreting test results, back to the clinician for a diagnosis, to a nosologist for disease coding, to a national or hospital database, to a researcher querying the database, to a statistician analyzing the data, to a manuscript that details data, analyses, and results, to editors and peer reviewers, back to study authors, and on to publication. Those leaps do not mean that the final results are thus untrustworthy or not reasonably relied upon, but they do raise well-nigh insuperable barriers to admissibility. The inadmissibility of scientific studies is generally not problematic because Rule 703 permits testifying expert witnesses to formulate opinions based upon facts and data, which are not themselves admissible in evidence. The distinction between relied upon, and admissible, studies is codified in the Federal Rules of Evidence, and in virtually every state’s evidence law.

The fourth edition might well also note that under Rule 104(a), the Rules of Evidence themselves do not govern a trial court’s preliminary determination, under Rules 702 or 703, of the admissibility of an expert witness’s opinion, or the appropriateness of reliance upon a particular study. Although Rule 705 may allow disclosure of facts and data described in studies, it is not an invitation to permit testifying expert witnesses to become a conduit for off-hand comments and opinions in the introduction or discussion sections of relied upon articles.[12] The wholesale admission of such hearsay opinions undermines the court’s control over opinion evidence. Rule 703 authorizes reasonable reliance upon “facts and data,” not every opinion that creeps into the published literature.

Reference Manual’s Disregard of Study Validity in Favor of the “Whole Tsumish”

The third edition evidenced considerable ambivalence about whether trial judges should engage in resolving disputes about the validity of individual studies relied upon by expert witnesses. Since 2000, Rule 702 clearly required such engagement, which made the Manual’s hesitancy, on the whole, unjustifiable. The ambivalence with respect to study validity, however, was on full display in the late Professor Margaret Berger’s chapter, “The Admissibility of Expert Testimony.”[13] Berger’s chapter criticized “atomization,” or looking at individual studies in isolation, a process she described pejoratively as “slicing-and-dicing.”[14]

Drawing on the publications of Daubert-critic Susan Haack, Berger appeared to reject the notion that courts should examine the reliability of each study independently.[15] Berger asserted that the “proper” scientific method, as evidenced by works of the International Agency for Research on Cancer (IARC), the Institute of Medicine, the National Institutes of Health, the National Research Council, and the National Institute of Environmental Health Sciences, “is to consider all the relevant available scientific evidence, taken as a whole, to determine which conclusion or hypothesis regarding a causal claim is best supported by the body of evidence.”[16]

Berger’s description of the review process, however, was profoundly misleading in its incompleteness. Of course, scientists undertaking a systematic review identify all the relevant studies, but some of the “relevant” studies may well be insufficiently reliable (because of internal or external validity issues) to answer the research question at hand. All the cited agencies, and other research organizations and researchers, exclude studies that are fundamentally flawed, whether as a result of bias, confounding, erroneous data analyses, or related problems. Berger cited no support for her remarkable suggestion that scientists do not make “reliability” judgments about available studies when assessing the “totality of the evidence.”[17]

Professor Berger, who had a distinguished career as a law professor and evidence scholar, died in November 2010, before the third edition was published. She was no friend of Daubert,[18] but her antipathy remarkably outlived her. Berger’s critical discussion of “atomization” cited the notorious decision in Milward v. Acuity Specialty Products Group, Inc., 639 F.3d 11, 26 (1st Cir. 2011), which was decided four months after her passing.[19]

Professor Berger’s contention about the need to avoid assessments of individual studies in favor of the whole “tsumish” must also be rejected because Federal Rule of Evidence 703 requires that each study considered by an expert witness “qualify” for reasonable reliance by virtue of the study’s containing facts or data that are “of a type reasonably relied upon by experts in the particular field in forming opinions or inferences upon the subject.” One of the deeply troubling aspects of the Milward decision is that it reversed the trial court’s sensible decision to exclude a toxicologist, Dr. Martyn Smith, who outran his headlights on issues having to do with a field in which he was clearly inexperienced – epidemiology.

Another curious omission in the third edition’s discussions of Milward is the dark ethical cloud of misconduct that hovers over the First Circuit’s reversal of the trial court’s exclusions of Martyn Smith and Carl Cranor. On appeal, the Council for Education and Research on Toxics (CERT) filed an amicus brief in support of reversing the exclusion of Smith and Cranor. The CERT amicus brief, however, never disclosed that CERT was founded by Smith and Cranor, and that CERT funded Smith’s research.[20]

Rule 702 requires courts to pay attention to, among other things, the sufficiency of the facts and data relied upon by expert witnesses. Rule 703’s requirement that individual studies must be reasonably relied upon is an important additional protreptic against the advice given by Professor Berger, in the third edition.


[1] The index notes the following page references for Rule 703: 214, 361, 363-364, and 610 n.184.

[2] See David E. Bernstein & Eric G. Lasker, “Defending Daubert: It’s Time to Amend Federal Rule of Evidence 702,” 57 William & Mary L. Rev. 1, 32 (2015) (“Rule 703 is frequently ignored in Daubert analyses”); Schachtman, “Rule 703 – The Problem Child of Article VII,” 17 Proof 3 (Spring 2009); Schachtman, “The Effective Presentation of Defense Expert Witnesses and Cross-examination of Plaintiffs’ Expert Witnesses,” at the ALI-ABA Course on Opinion and Expert Witness Testimony in State and Federal Courts (February 14-15, 2008). See also Julie E. Seaman, “Triangulating Testimonial Hearsay: The Constitutional Boundaries of Expert Opinion Testimony,” 96 Georgetown L.J. 827 (2008); “RULE OF EVIDENCE 703 — Problem Child of Article VII” (Sept. 19, 2011); “Giving Rule 703 the Cold Shoulder” (May 12, 2012); “New Reference Manual on Scientific Evidence Short Shrifts Rule 703,” (Oct. 16, 2011).

[3] RMSE3d at 214.

[4] RMSE3d at 364 (internal citations omitted).

[5] RMSE 3d at 610 (internal citations omitted).

[6] RMSE3d at 610 n.184.

[7] Rule 802 (“Hearsay Rule”) “Hearsay is not admissible except as provided by these rules or by other rules prescribed by the Supreme Court pursuant to statutory authority or by Act of Congress.”

[8] Kehm v. Procter & Gamble Co., 580 F. Supp. 890, 902 (N.D. Iowa 1982) (“These [epidemiologic] studies were highly probative on the issue of causation—they all concluded that an association between tampon use and menstrually related TSS [toxic shock syndrome] cases exists.”), aff’d, 724 F.2d 613 (8th Cir. 1984); Ellis v. International Playtex, Inc., 745 F.2d 292, 303 (4th Cir. 1984). The chapter also cited the en banc decision in Christophersen for the proposition that “[a]s a general rule, questions relating to the bases and sources of an expert’s opinion affect the weight to be assigned that opinion rather than its admissibility. . . . ” In the Christophersen case, the Fifth Circuit was clearly addressing the admissibility of the challenged expert witness’s opinions, not the admissibility of relied-upon studies. Christophersen v. Allied-Signal Corp., 939 F.2d 1106, 1111, 1113-14 (5th Cir. 1991) (en banc) (per curiam) (trial court may exclude opinion of expert witness whose opinion is based upon incomplete or inaccurate exposure data), cert. denied, 112 S. Ct. 1280 (1992).

[9] Interestingly, the authors of this chapter abandoned their suggestion, advanced in the second edition, that studies relied upon “might qualify for the learned treatise exception to the hearsay rule, Fed. R. Evid. 803(18), or possibly the catchall exceptions, Fed. R. Evid. 803(24) & 804(5).” RMSE 2d at 335 (2000). See also RMSE 3d at 214 (discussing statistical studies as generally “admissible,” but acknowledging that admissibility may be no more than permission to explain the basis for an expert’s opinion, which is hardly admissibility at all).

[10] See Ellis, 745 F.2d at 299-303; Kehm, 724 F.2d at 617-18. These holdings predated the Supreme Court’s 1993 decision in Daubert, and the issue whether they are subject to Rule 702 has not been addressed.  Federal agency factual findings have been known to be invalid, on occasion.

[11] David L. Faigman, et al., Modern Scientific Evidence: The Law and Science of Expert Testimony v.1, § 23:1, at 206 (2009) (“Well conducted studies are uniformly admitted.”).

[12] Montori, et al., “Users’ guide to detecting misleading claims in clinical research reports,” 329 Br. Med. J. 1093, 1093 (2004) (advising readers on how to avoid being misled by published literature, and counseling readers to “Read only the Methods and Results sections; bypass the Discussion section.”)  (emphasis added).

[13] RMSE 3d 11 (2011).

[14] Id. at 19.

[15] Id. at 20 & n. 51 (citing Susan Haack, “An Epistemologist in the Bramble-Bush: At the Supreme Court with Mr. Joiner,” 26 J. Health Pol. Pol’y & L. 217–37 (1999)).

[16] Id. at 19-20 & n.52.

[17] See Berger, “The Admissibility of Expert Testimony,” RMSE 3d 11 (2011). Professor Berger never mentions Rule 703 at all! Gone and forgotten.

[18] Professor Berger filed an amicus brief on behalf of plaintiffs, in Rider v. Sandoz Pharms. Corp., 295 F.3d 1194 (11th Cir. 2002).

[19] Id. at 20 n.51. (The editors note that the published chapter was Berger’s last revision, with “a few edits to respond to suggestions by reviewers.”) The addition of the controversial Milward decision cannot seriously be considered an “edit.”

[20] “From Here to CERT-ainty” (June 28, 2018); “THE COUNCIL FOR EDUCATION AND RESEARCH ON TOXICS” (July 9, 2013).

Reference Manual – Desiderata for 4th Edition – Part V – Specific Tortogens

February 14th, 2023

Examples are certainly helpful to explain and to show judges how real scientists reach causal conclusions. The Reference Manual should certainly give such examples of how scientists determine whether a claim has been adequately tested, and whether the claim has eliminated the myriad kinds of error that threaten such claims and require us to withhold our assent. The third edition of the Manual, however, advances some dodgy examples, without any data or citations. I have already pointed out that the third edition’s reference to clear cell adenocarcinoma of the vagina in young women as a “signal” disease caused only by DES is incorrect.[1] There are, alas, other troubling examples in the third edition, which are due for pruning.

Claimed Interaction Between Asbestos and Tobacco Risks for Lung Cancer

The third edition’s chapter on epidemiology discusses the complexities raised by potential interaction between multiple exposures. The discussion appropriately suggests that a relative risk cannot be used to determine the probability of individual causation “if the agent interacts with another cause in a way that results in an increase in disease beyond merely the sum of the increased incidence due to each agent separately.” The suggestion is warranted, although the chapter then is mum on whether there are other approaches that can be invoked to derive probabilities of causation when multiple exposures interact in a known way. Then the authors provided an example:

“For example, the relative risk of lung cancer due to smoking is around 10, while the relative risk for asbestos exposure is approximately 5. The relative risk for someone exposed to both is not the arithmetic sum of the two relative risks, that is, 15, but closer to the product (50- to 60-fold), reflecting an interaction between the two. Neither of the individual agent’s relative risks can be employed to estimate the probability of causation in someone exposed to both asbestos and cigarette smoke.”[2]

Putting aside for the moment the general issue of interaction, the chapter’s use of the Mt. Sinai catechism, of 5-10-50, for asbestos and tobacco smoking and lung cancer, is a poor choice. The evidence for multiplicative interaction was advanced by the late Irving Selikoff, and frankly the evidence was never very good. The supposed “non-smokers” were really “never smoked regularly,” and the smoking histories were taken by postcard surveys. The cohort of asbestos insulators was well aware of the study hypothesis, in that many of its members had compensation claims, and they had an interest in downplaying their smoking. Indeed, the asbestos workers’ union helped fund Selikoff’s work, and Selikoff had served as a testifying expert witness for claimants.

Given that “never smoked regularly” is not the same as never having smoked, and given that the ten-fold figure already underestimated the lung cancer risk from smoking alone, the multiplicative model never was on a firm basis. The smoking-alone risk ratio was doubled in the American Cancer Society’s Cancer Prevention Studies One and Two, but the Mt. Sinai physicians, who frequently testified in lawsuits for claimants, steadfastly held to their outdated statistical control group.[3] It is thus disturbing that the third edition’s authors trotted out a summary of asbestos / smoking lung cancer risks based upon Selikoff’s dodgy studies of asbestos insulation workers. The 5-10-50 dogma was already incorrect when the first edition went to press.

Not only were Selikoff’s studies probably incorrect when originally published, but updates to the insulation worker cohort, published after his death, specifically undermine the multiplicative claim. In a 2013 publication by Selikoff’s successors, asbestos and smoking failed to show multiplicative interaction. Indeed, occupational asbestos exposure that had not manifested in clinically apparent asbestosis did not show any interaction with smoking. Only in a subgroup of insulators with clinically detectable asbestosis did the asbestosis and smoking show “supra-additive” (but not multiplicative) interaction.[4]
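
For readers who want the arithmetic behind the competing models spelled out, here is a minimal Python sketch. It is my own illustration, not anything found in the Manual or in the insulator studies: the function names and the 5 percent tolerance are hypothetical, and the inputs are simply the Manual’s round numbers. On the conventional additive model, the excess relative risks add, so that relative risks of 5 and 10 predict a joint relative risk of 14 (1 + 4 + 9), which the Manual loosely gives as 15; on the multiplicative model, they predict 50. A “supra-additive” finding falls between the two.

```python
def additive_joint_rr(rr_a: float, rr_b: float) -> float:
    """Joint relative risk expected if the two excess risks simply add."""
    return 1.0 + (rr_a - 1.0) + (rr_b - 1.0)


def multiplicative_joint_rr(rr_a: float, rr_b: float) -> float:
    """Joint relative risk expected if the two relative risks multiply."""
    return rr_a * rr_b


def classify_interaction(rr_a: float, rr_b: float, observed_joint_rr: float,
                         tolerance: float = 0.05) -> str:
    """Place an observed joint relative risk relative to the two benchmarks.

    The tolerance is an arbitrary illustrative margin (an assumption),
    not a substitute for a confidence interval.
    """
    additive = additive_joint_rr(rr_a, rr_b)
    multiplicative = multiplicative_joint_rr(rr_a, rr_b)
    if observed_joint_rr <= additive * (1.0 + tolerance):
        return "additive or less"
    if observed_joint_rr >= multiplicative * (1.0 - tolerance):
        return "multiplicative or more"
    return "supra-additive but not multiplicative"


if __name__ == "__main__":
    # The Manual's round numbers: asbestos 5, smoking 10.
    print(additive_joint_rr(5, 10))
    print(multiplicative_joint_rr(5, 10))
    # A hypothetical observed joint relative risk of 25.
    print(classify_interaction(5, 10, 25))
```

A real analysis would turn on the confidence intervals around the observed joint estimate, which this sketch ignores.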

Manganese and Parkinson’s Disease

Table 1, of the toxicology chapter in the third edition, presented a “Sample of Selected Toxicological End Points and Examples of Agents of Concern in Humans.” The authors cautioned that the table was “not an exhaustive or inclusive list of organs, end points, or agents. Absence from this list does not indicate a relative lack of evidence for a causal relation as to any agent of concern.”[5] Among the examples presented in this Table 1 was neurotoxicity in the form of “Parkinson’s disease and manganese.”[6]

The presence of this example in Table 1 is curious on a number of fronts. First, one of the members of the Development Committee for the third edition was Judge Kathleen O’Malley, who presided over a multi-district litigation involving claims for parkinsonism and Parkinson’s disease against manufacturers of welding rods. It seemed unlikely that Judge O’Malley would have overlooked this section. See, e.g., In re Welding Fume Prods. Liab. Litig., 245 F.R.D. 279 (N.D. Ohio 2007) (exposure to manganese fumes allegedly increased the risk of later developing brain damage). More important, however, the authors’ inclusion of Parkinson’s disease as an outcome from manganese exposure is remarkable because that putative relationship has been extensively studied and rejected by leading researchers in the field of movement disorders.[7] In 2012, neuro-epidemiologists published a comprehensive meta-analysis that confirmed the absence of a relationship between manganese exposure and Parkinson’s disease.[8] The inclusion in Table 1 of a highly controversial relationship, manganese-Parkinson’s disease, suggests either undisclosed partisanship or ignorance of the relevant scientific evidence.

Mesothelioma

The toxicology chapter of the third edition also weighed in on mesothelioma as a supposed signature disease of asbestos exposure. The chapter’s authors described mesothelioma as “almost always caused by asbestos,”[9] which was no doubt true when mesothelioma was first identified as caused by fibrous amphibole minerals.[10] The last two decades, however, have seen a shift in the incidence of mesothelioma among industrially exposed workers, which reveals more cases without asbestos exposure and with other potential causes. Leading scientists in the field have acknowledged non-asbestos causes,[11] and recently researchers have identified genetic mutations that completely account for the causation of individual cases of mesothelioma.[12] It is time for the fourth edition to acknowledge other causes of mesothelioma, and to offer judges and lawyers guidance on genetic causes of sporadic diseases.


[1] See “Reference Manual – Desiderata for the Fourth Edition – Signature Disease” (Jan. 30, 2023).

[2] RMSE3d at 615 & n. 200. The chapter fails to cite support for the 5-10-50 dogma, but it is readily recognizable as the Mt. Sinai Catechism that was endlessly repeated by Irving Selikoff and his protégés.

[3] Michael J. Thun, Cathy A. Day-Lally, Eugenia E. Calle, W. Dana Flanders, and Clark W Heath, “Excess mortality among cigarette smokers: Changes in a 20-year interval,” 85 Am. J. Public Health 1223 (1995).

[4] Steve Markowitz, Stephen Levin, Albert Miller, and Alfredo Morabia, “Asbestos, Asbestosis, Smoking and Lung Cancer: New Findings from the North American Insulator Cohort,” 188 Am. J. Respir. & Critical Care Med. 90 (2013); see “The Mt. Sinai Catechism” (June 7, 2013).

[5] RMSE3d at 653-54.

[6] Reference Manual at 653.

[7] See, e.g., Karin Wirdefeldt, Hans-Olaf Adami, Philip Cole, Dimitrios Trichopoulos, and Jack Mandel, “Epidemiology and etiology of Parkinson’s disease: a review of the evidence,” 26 European J. Epidemiol. S1, S20-21 (2011); Tomas R. Guilarte, “Manganese and Parkinson’s Disease: A Critical Review and New Findings,” 118 Environ Health Perspect. 1071, 1078 (2010) (“The available evidence from human and nonhuman primate studies using behavioral, neuroimaging, neurochemical, and neuropathological end points provides strong support to the hypothesis that, although excess levels of [manganese] accumulation in the brain results in an atypical form of parkinsonism, this clinical outcome is not associated with the degeneration of nigrostriatal dopaminergic neurons as is the case in PD [Parkinson’s disease].”)

[8] James Mortimer, Amy Borenstein, and Lorene Nelson, “Associations of welding and manganese exposure with Parkinson disease: Review and meta-analysis,” 79 Neurology 1174 (2012).

[9] Bernard D. Goldstein & Mary Sue Henifin, “Reference Guide on Toxicology,” RMSE3d 633, 635 (2011).

[10] See J. Christopher Wagner, C.A. Sleggs, and Paul Marchand, “Diffuse pleural mesothelioma and asbestos exposure in the North Western Cape Province,” 17 Br. J. Indus. Med. 260 (1960); J. Christopher Wagner, “The discovery of the association between blue asbestos and mesotheliomas and the aftermath,” 48 Br. J. Indus. Med. 399 (1991); see also Harriet Hardy, M.D., Challenging Man-Made Disease:  The Memoirs of Harriet L. Hardy, M.D. 95 (1983); “Harriet Hardy’s Views on Asbestos Issues” (Mar. 13, 2013).

[11] Richard L. Attanoos, Andrew Churg, Allen R. Gibbs, and Victor L. Roggli, “Malignant Mesothelioma and Its Non-Asbestos Causes,” 142 Arch. Pathol. & Lab. Med. 753 (2018).

[12] Angela Bononi, Qian Wang, Alicia A. Zolondick, Fang Bai, Mika Steele-Tanji, Joelle S. Suarez, Sandra Pastorino, Abigail Sipes, Valentina Signorato, Angelica Ferro, Flavia Novelli, Jin-Hee Kim, Michael Minaai, Yasutaka Takinishi, Laura Pellegrini, Andrea Napolitano, Ronghui Xu, Christine Farrar, Chandra Goparaju, Cristian Bassi, Massimo Negrini, Ian Pagano, Greg Sakamoto, Giovanni Gaudino, Harvey I. Pass, José N. Onuchic, Haining Yang, and Michele Carbone, “BAP1 is a novel regulator of HIF-1α,” 120 Proc. Nat’l Acad. Sci. e2217840120 (2023).

Reference Manual – Desiderata for 4th Edition – Part III – Differential Etiology

February 1st, 2023

Admittedly, I am playing the role of the curmudgeon here by pointing out errors or confusions in the third edition of the Reference Manual.  To be sure, there are many helpful and insightful discussions throughout the Manual, but they do not need to be revised.  Presumably, the National Academies and the Federal Judicial Center are undertaking the project of producing a fourth edition because they understand that revisions, updates, and corrections are needed. Otherwise, why bother?

To be sure, there are aspects of the third edition’s epidemiology chapter that get some important points right. 

(1) The chapter at least acknowledges that small relative risks (1 < RR < 3) may be insufficient to support causal inferences.[1]

(2) The chapter correctly notes that the method known as “differential etiology” addresses only specific causation, and that the method presupposes that general causation has been established.[2]

(3) The third edition correctly observes that clinicians generally are not concerned with etiology as much as with diagnosis of disease.[3] The authors of the epidemiology chapter correctly observe that “[f]or many health conditions, the cause of the disease or illness has no relevance to its treatment, and physicians, therefore, do not employ this term or pursue that question.”[4] This observation alone should help trial courts question whether many clinicians have even the pretense of expertise to offer expert causation opinions.[5]

(4) With respect to so-called differential etiology, the third edition correctly states that this mode of reasoning is a logically valid argument if premises are true; that is, general causation must be established for each “differential etiology.” The epidemiology chapter observes that “like any scientific methodology, [differential etiology] can be performed in an unreliable manner.”[6]

(5) The third edition reports that the differential etiology argument as applied in litigation is often invalid because not all the differentials other than the litigation claim have been ruled out.[7]

(6) The third edition properly notes that for diseases for which the causes are largely unknown, such as most birth defects, a differential etiology is of little benefit.[8] Unfortunately, the third edition offered no meaningful guidance for how courts should consider differential etiologies offered when idiopathic cases make up something less than “largely” (0% < Idiopathic < 10%, 20%, 30%, 40%, 50%, etc.). The chapter acknowledges that:

“Although differential etiologies are a sound methodology in principle, this approach is only valid if … a substantial proportion of competing causes are known. Thus, for diseases for which the causes are largely unknown, such as most birth defects, a differential etiology is of little benefit.”[9]

Accordingly, many cases reject proffered expert witness testimony on differential etiology, when the witnesses failed to rule out idiopathic causes in the case at issue. What is a substantial proportion?  Unfortunately, the third edition did not attempt to quantify or define “substantial.” The inability to rule out unknown etiologies remains the fatal flaw in much expert witness opinion testimony on specific causation.

Errant Opinions on Differential Etiology

The third edition’s treatment of differential etiology does leave room for improvement. One glaring error is the epidemiology chapter’s assertion that “differential etiology is a legal invention not used by physicians.”[10] Indeed, the third edition provides a definition for “differential etiology” that reinforces the error:

“differential etiology. Term used by the court or witnesses to establish or refute external causation for a plaintiff’s condition. For physicians, etiology refers to cause.”[11]

The third edition’s assertion about legal provenance and exclusivity can be quickly dispelled by a search on “differential etiology” in the National Library of Medicine’s PubMed database, which turns up dozens of results, going back to the early 1960s. Some citations are supplied in the notes.[12] A Google Ngram for “differential etiology” in American English shows prevalent usage well before any of the third edition’s cited cases.

The third edition’s erroneous assertion about the provenance of “differential etiology” has been echoed by other law professors. David Faigman, for instance, has claimed that in advancing differential etiologies, expert witnesses were inventing wholesale an approach that had no foundation or acceptance in their scientific disciplines:

“Differential etiology is ostensibly a scientific methodology, but one not developed by, or even recognized by, physicians or scientists. As described, it is entirely logical, but has no scientific methods or principles underlying it. It is a legal invention and, as such, has analytical heft, but it is entirely bereft of empirical grounding. Courts and commentators have so far merely described the logic of differential etiology; they have yet to define what that methodology is.”[13]

Faigman’s claim that courts and commentators have not defined the methodology underlying differential etiology is wrong. Just as hypothesis testing is predicated upon a probabilistic version of modus tollens, differential etiology is based upon “iterative disjunctive syllogism,” or modus tollendo ponens. Basic propositional logic recognizes that such syllogisms are valid arguments,[14] in which one premise is a disjunction (P v Q), and the other premise is the negation of one of the disjuncts:

P v Q
~P
_____
∴ Q

If we expand the disjunctive premise to more than one disjunction, we can repeat the inference (iteratively), eliminating one disjunct at a time, until we arrive at a conclusion that is a simple, affirmative proposition, without any disjunctions in it.

P v Q v R
~P
_____
∴ Q v R
~Q
_____
∴ R

Hence, the term “iterative disjunctive syllogism.” Sherlock Holmes’ fans, of course, will recognize that iterative disjunctive syllogism is nothing other than the process of elimination, as explained by the hero of Sir Arthur Conan Doyle’s short stories.[15]
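
The elimination procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and the sketch takes the disjunctive premise (the list of candidate causes) as given, which is exactly the premise that must be true for the argument to be sound.

```python
from typing import List, Optional


def eliminate(candidates: List[str], ruled_out: List[str]) -> List[str]:
    """Apply disjunctive syllogism repeatedly, one negated disjunct at a time.

    Each pass mirrors the step: P v Q v R, ~P, therefore Q v R.
    """
    remaining = list(candidates)
    for cause in ruled_out:
        if cause in remaining:
            remaining.remove(cause)
    return remaining


def sole_conclusion(candidates: List[str], ruled_out: List[str]) -> Optional[str]:
    """Return the single surviving disjunct, or None if elimination
    leaves more than one candidate (or none at all)."""
    remaining = eliminate(candidates, ruled_out)
    if len(remaining) == 1:
        return remaining[0]
    return None


if __name__ == "__main__":
    print(eliminate(["P", "Q", "R"], ["P"]))             # leaves Q v R
    print(sole_conclusion(["P", "Q", "R"], ["P", "Q"]))  # leaves R alone
    print(sole_conclusion(["P", "Q", "R"], ["P"]))       # no single conclusion
```

The sketch returns a single conclusion only when every other listed disjunct has been negated; if the list of candidates is incomplete, the inference remains formally valid but rests on a false premise.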

The fourth edition should correct the error of the third edition, and it should dispel the strange notion that differential etiology is not used by scientists or clinicians themselves.

Supreme Nonsense on Differential Etiology

In 2011, the Supreme Court addressed differential etiology in a case, Matrixx Initiatives, in stunningly irrelevant and errant dicta. The third edition did not discuss this troublesome case, in which the defense improvidently moved to dismiss a class action complaint for securities violations allegedly arising from the failure to disclose multiple adverse event reports of anosmia from the use of the defendant’s product, Zicam. The basic reason for the motion on the pleadings was that the plaintiffs failed to allege a statistically significant and causally related increased risk of anosmia. The Supreme Court made short work of the defense argument because material events, such as an FDA recall, did not require the existence of a causal relationship between Zicam use and anosmia. The defense complaints about statistical significance, causation, and their absence, were thus completely beside the point of the case. Nonetheless, it became the Court’s turn for improvidence in addressing statistical and causation issues not properly before it. With respect to causation, the Court offered this by way of obiter dictum:

“We note that courts frequently permit expert testimony on causation based on evidence other than statistical significance. See, e.g., Best v. Lowe’s Home Centers, Inc., 563 F. 3d 171, 178 (6th Cir. 2009); Westberry v. Gislaved Gummi AB, 178 F. 3d 257, 263–264 (4th Cir. 1999) (citing cases); Wells v. Ortho Pharmaceutical Corp., 788 F. 2d 741, 744–745 (11th Cir. 1986). We need not consider whether the expert testimony was properly admitted in those cases, and we do not attempt to define here what constitutes reliable evidence of causation.”[16]

This part of the Court’s opinion was stunningly wrong about the Court of Appeals’ decisions on statistical significance[17] and on causation. The Best and the Westberry decisions were both cases that turned on specific, not general, causation. Statistical significance was not part of the reasoning or rationale of the cited cases on specific causation. Both cases assumed that general causation was established, and inquired into whether expert witnesses could reasonably and validly attribute the health outcome in the case to the exposures that were established causes of such outcomes. The Court’s selection of these cases, quite irrelevant to its discussion, appears to have come from the Solicitor General’s amicus brief in Matrixx, and to have been mindlessly adopted by the Court.

Although cited for an irrelevant proposition, the Supreme Court’s selection of the Best case was puzzling because the Sixth Circuit’s discussion of the issue is particularly muddled. Here is the relevant language from Best:

“[A] doctor’s differential diagnosis is reliable and admissible where the doctor

(1) objectively ascertains, to the extent possible, the nature of the patient’s injury…,

(2) ‘rules in’ one or more causes of the injury using a valid methodology,

and

(3) engages in ‘standard diagnostic techniques by which doctors normally rule out alternative causes’ to reach a conclusion as to which cause is most likely.”[18]

Of course, as the authors of the third edition’s epidemiology chapter correctly note, physicians rarely use this iterative process to arrive at causes of diseases in an individual; they use it to identify the disease or disease process that is responsible for the patient’s signs and symptoms.[19] The Best court’s description does not make sense in that it characterizes the process as ruling in “one or more” causes, and then ruling out alternative causes.  If an expert had ruled in only one cause, then there would be no need or opportunity to rule out an alternative cause.  If the one ruled-in cause was ruled out for other reasons, then the expert witness would be left with a case of idiopathic disease.

In any event, differential etiology was irrelevant to the general causation issue raised by the defense in Matrixx Initiatives. After the Supreme Court correctly recognized that causation was largely irrelevant to the securities fraud claim, it had no reason to opine on general causation. Certainly, the Supreme Court had no reason to cite two cases on differential etiology in a case that did not even require allegations of general causation. The fourth edition of the Reference Manual should put Matrixx Initiatives in its proper (and very limited) place.


[1] RMSE3d at 612 & n.193 (noting that “one commentator contends that, because epidemiology is sufficiently imprecise to accurately measure small increases in risk, in general, studies that find a relative risk less than 2.0 should not be sufficient for causation. The concern is not with specific causation but with general causation and the likelihood that an association less than 2.0 is noise rather than reflecting a true causal relationship. See Michael D. Green, “The Future of Proportional Liability,” in Exploring Tort Law (Stuart Madden ed., 2005); see also Samuel M. Lesko & Allen A. Mitchell, “The Use of Randomized Controlled Trials for Pharmacoepidemiology Studies,” in Pharmacoepidemiology 599, 601 (Brian Strom ed., 4th ed. 2005) (“it is advisable to use extreme caution in making causal inferences from small relative risks derived from observational studies”); Gary Taubes, “Epidemiology Faces Its Limits,” 269 Science 164 (1995) (explaining views of several epidemiologists about a threshold relative risk of 3.0 to seriously consider a causal relationship); N.E. Breslow & N.E. Day, “Statistical Methods in Cancer Research,” in The Analysis of Case-Control Studies 36 (IARC Pub. No. 32, 1980) (“[r]elative risks of less than 2.0 may readily reflect some unperceived bias or confounding factor”); David A. Freedman & Philip B. Stark, “The Swine Flu Vaccine and Guillain-Barré Syndrome: A Case Study in Relative Risk and Specific Causation,” 64 Law & Contemp. Probs. 49, 61 (2001) (“If the relative risk is near 2.0, problems of bias and confounding in the underlying epidemiologic studies may be serious, perhaps intractable.”). For many other supporting comments and observations, see “Small Relative Risks and Causation” (June 28, 2022).

[2] RMSE3d at 618 (“Although differential etiologies are a sound methodology in principle, this approach is only valid if general causation exists … .”). In the case of a novel putative cause, the case may give rise to a hypothesis that the putative cause can cause the outcome, in general, and did so in the specific case. That hypothesis must, of course, then be tested and supported by appropriate analytical methods before it can be accepted for general causation and as a putative specific cause in a particular individual.

[3] RMSE3d at 617.

[4] RMSE3d at 617 & n. 211 (citing Zandi v. Wyeth, Inc., No. 27-CV-06-6744, 2007 WL 3224242 (D. Minn. Oct. 15, 2007) (observing that physicians do assess the cause of patients’ breast cancers)).

[5] See, e.g., Tamraz v. BOC Group Inc., No. 1:04-CV-18948, 2008 WL 2796726 (N.D. Ohio July 18, 2008) (denying Rule 702 challenge to treating physician’s causation opinion), rev’d sub nom. Tamraz v. Lincoln Elec. Co., 620 F.3d 665 (6th Cir. 2010) (carefully reviewing record of trial testimony of plaintiffs’ treating physician; reversing judgment for plaintiff based in substantial part upon treating physician’s speculative causal assessment created by plaintiffs’ counsel), cert. denied, ___ U.S. ___, 131 S. Ct. 2454 (2011).

[6] RMSE3d at 617-18 & n. 215.

[7] See, e.g., Milward v. Acuity Specialty Products Group, Inc., Civil Action No. 07–11944–DPW, 2013 WL 4812425 (D. Mass. Sept. 6, 2013) (excluding plaintiffs’ expert witnesses on specific causation), aff’d sub nom. Milward v. Rust-Oleum Corp., 820 F.3d 469 (1st Cir. 2016). Interestingly, the earlier appellate journey taken by the Milward litigants resulted in a reversal of a Rule 702 exclusion of plaintiffs’ general causation expert witnesses. That reversal meant that there was no longer a final judgment. The exclusion of specific causation witnesses was affirmed by the First Circuit, and the general causation opinion was no longer necessary to the final judgment. See “Differential Diagnosis in Milward v. Acuity Specialty Products Group” (Sept. 26, 2013); “Differential Etiology and Other Courtroom Magic” (June 23, 2014).

[8] RMSE3d at 617-18 & n. 214.

[9] See RMSE3d at 618 (internal citations omitted).

[10] RMSE3d at 691 (emphasis added).

[11] RMSE3d at 743.

[12] See, e.g., Kløve & D. Doehring, “MMPI in epileptic groups with differential etiology,” 18 J. Clin. Psychol. 149 (1962); Kløve & C. Matthews, “Psychometric and adaptive abilities in epilepsy with differential etiology,” 7 Epilepsia 330 (1966); Teuber & K. Usadel, “Immunosuppression in juvenile diabetes mellitus? Critical viewpoint on the treatment with cyclosporin A with consideration of the differential etiology,” 103 Fortschr. Med. 707 (1985); G. May & W. May, “Detection of serum IgA antibodies to varicella zoster virus (VZV)–differential etiology of peripheral facial paralysis. A case report,” 74 Laryngorhinootologie 553 (1995); Alan Roberts, “Psychiatric Comorbidity in White and African-American Illicit Substance Abusers: Evidence for Differential Etiology,” 20 Clinical Psych. Rev. 667 (2000); Mark E. Mullins, Michael H. Lev, Dawid Schellingerhout, Gilberto Gonzalez, and Pamela W. Schaefer, “Intracranial Hemorrhage Complicating Acute Stroke: How Common Is Hemorrhagic Stroke on Initial Head CT Scan and How Often Is Initial Clinical Diagnosis of Acute Stroke Eventually Confirmed?” 26 Am. J. Neuroradiology 2207 (2005); Qiang Fu, et al., “Differential Etiology of Posttraumatic Stress Disorder with Conduct Disorder and Major Depression in Male Veterans,” 62 Biological Psychiatry 1088 (2007); Jesse L. Hawke, et al., “Etiology of reading difficulties as a function of gender and severity,” 20 Reading and Writing 13 (2007); Mastrangelo, “A rare occupation causing mesothelioma: mechanisms and differential etiology,” 105 Med. Lav. 337 (2014).

[13] David L. Faigman & Claire Lesikar, “Organized Common Sense: Some Lessons from Judge Jack Weinstein’s Uncommonly Sensible Approach to Expert Evidence,” 64 DePaul L. Rev. 421, 439, 444 (2015). See also “David Faigman’s Critique of G2i Inferences at Weinstein Symposium” (Sept. 25, 2015).

[14] See Irving Copi & Carl Cohen, Introduction to Logic at 362 (2005).

[15] See, e.g., Doyle, The Blanched Soldier (“…when you have eliminated all which is impossible, then whatever remains, however improbable, must be the truth.”); Doyle, The Beryl Coronet (“It is an old maxim of mine that when you have excluded the impossible, whatever remains, however improbable, must be the truth.”); Doyle, The Hound of the Baskervilles (1902) (“We balance probabilities and choose the most likely. It is the scientific use of the imagination.”); Doyle, The Sign of the Four, ch. 6 (1890) (“‘You will not apply my precept,’ he said, shaking his head. ‘How often have I said to you that when you have eliminated the impossible, whatever remains, however improbable, must be the truth? We know that he did not come through the door, the window, or the chimney. We also know that he could not have been concealed in the room, as there is no concealment possible. Whence, then, did he come?’”).

[16] Matrixx Initiatives, Inc. v. Siracusano, 131 S. Ct. 1309, 1319 (2011). 

[17] The citation to Wells was clearly wrong in that the plaintiffs in that case had, in fact, relied upon studies that were nominally statistically significant, and so the Wells court could not have held that statistical significance was unnecessary.

[18] Best v. Lowe’s Home Centers, Inc., 563 F.3d 171, 179, 183-84 (6th Cir. 2009).

[19] See generally Harold C. Sox, Michael C. Higgins, and Douglas K. Owens, Medical Decision Making (2d ed. 2014). 

Reference Manual – Desiderata for 4th Edition – Part I – Signature Diseases

January 30th, 2023

The fourth edition of the Reference Manual on Scientific Evidence is by all accounts under way. Each of the first three editions represented an improvement over previous editions, but the last edition continued to have substantive problems. The bar, the judiciary, and the scientific community hopefully await an improved fourth edition. Although I have posted previously about issues in the third edition, I am updating and adding to what I have written.[1]  There were only a few reviews and acknowledgments of the third edition.[2] The editorial staff provided little to no opportunity for comments in advance of the third edition, and to date, there has been no call for public comment about the pending fourth edition. I hope there will be more opportunity for the legal and scientific community to comment in the production of the fourth edition.

There are several issues raised by the third edition’s treatment of specific causation, which I hope will be improved in the fourth edition. One such issue is the epidemiology chapter’s brief discussion of so-called signature diseases. The chapter takes the curious position that epidemiology has nothing to say about individual or specific causation, a position I will discuss in later posts. The chapter, however, carves out a limited exception to its (questionable) edict that epidemiology does not concern itself with specific causation. The chapter tells us, uncontroversially, that some diseases do not occur without exposure to a specific chemical or substance. In my view, the authors of this chapter then go astray in telling us that “[a]sbestosis is a signature disease for asbestos, and vaginal adenocarcinoma (in young adult women) is a signature disease for in utero DES exposure.”

Now, by definition, only asbestos can cause asbestosis, but asbestosis presents clinically in a way that is indistinguishable in many cases from idiopathic pulmonary fibrosis and other interstitial fibrotic diseases of the lungs. Over the years, the diagnostic criteria for asbestosis have changed, but these criteria have always had a specificity and sensitivity less than 100%. Saying that a case of asbestosis must have been caused by asbestos begs the clinical question whether the case really is asbestosis.

The chapter’s characterization of vaginal adenocarcinoma as a signature disease of in utero DES exposure is also not correct.  Although this cancer in young women is extremely rare, there is a baseline risk that allows the calculation of relative risks for young women exposed in utero. In older women, the relative risks are lower because the baseline risks are higher, and because the effect of DES is diminished for older onset cases.[3] The disease was known before the use of DES in pregnant women began after World War II.[4]

For support of their discussion of “signature diseases,” the authors of the epidemiology chapter chose, remarkably, to cite an article that was over 25 years old at the time the third edition was published (and is now over 35 years old).[5] The referenced passage asks us to:

“Consider tort claims for what have come to be called signature disease. These are diseases characteristically caused by only a few substances – such as the vaginal adenocarcinoma usually associated with exposure to DES in utero – and mesothelioma, a cancer of the pleura caused almost exclusively by exposure to asbestos fibers in the air.”[6]

Well, “usually associated” does not equal signature disease.[7] The relative risks for smoking and some kinds of lung cancer are higher than for DES in utero and clear cell vaginal adenocarcinoma, but no one calls lung cancer a signature disease of smoking. (Admittedly, smoking is the major cause and perhaps the most preventable cause of lung cancer in Western countries.)

The third edition’s reference to a source that describes mesothelioma as “caused almost exclusively by exposure to asbestos fibers” is also out of date.[8] Recognizing that casual comments and citations can influence credulous judges, the authors of the fourth edition should strive for greater accuracy in their discussions of such scientific issues. It may be time to find new examples of signature disease.


[1] “Reference Manual on Scientific Evidence v4.0” (Feb. 28, 2021); “Reference Manual on Scientific Evidence – 3rd Edition is Past Its Expiry” (Oct. 17, 2021).

[2] See, e.g., Adam Dutkiewicz, “Book Review: Reference Manual on Scientific Evidence, Third Edition,” 28 Thomas M. Cooley L. Rev. 343 (2011); John A. Budny, “Book Review: Reference Manual on Scientific Evidence, Third Edition,” 31 Internat’l J. Toxicol. 95 (2012); James F. Rogers, Jim Shelson, and Jessalyn H. Zeigler, “Changes in the Reference Manual on Scientific Evidence (Third Edition),” Internat’l Ass’n Def. Csl. Drug, Device & Biotech. Comm. Newsltr. (June 2012). See also Schachtman, “New Reference Manual’s Uneven Treatment of Conflicts of Interest” (Oct. 12, 2011).

[3] Janneke Verloop, Flora E. van Leeuwen, Theo J. M. Helmerhorst, Hester H. van Boven, and Matti A. Rookus, “Cancer risk in DES daughters,” 21 Cancer Causes & Control 999 (2010).

[4] See “Risk Factors for Vaginal Cancer,” American Cancer Soc’y website (last visited Jan. 29, 2023).

[5] Kenneth S. Abraham & Richard A. Merrill, Scientific Uncertainty in the Courts, 2 Issues Sci. & Tech. 93, 101 (Winter 1986).

[6] Id.

[7] See, e.g., Kadir Güzin, Semra Kayataş Eserm, Ayşe Yiğit, and Ebru Zemheri, “Primary clear cell carcinoma of the vagina that is not related to in utero diethylstilbestrol use,” 3 Gynecol. Surg. 281 (2006).

[8] Michele Carbone, Harvey I. Pass, Guntulu Ak, H. Richard Alexander Jr., Paul Baas, Francine Baumann, Andrew M. Blakely, Raphael Bueno, Aleksandra Bzura, Giuseppe Cardillo, Jane E. Churpek, Irma Dianzani, Assunta De Rienzo, Mitsuru Emi, Salih Emri, Emanuela Felley-Bosco, Dean A. Fennell, Raja M. Flores, Federica Grosso, Nicholas K. Hayward, Mary Hesdorffer, Chuong D. Hoang, Peter A. Johansson, Hedy L. Kindler, Muaiad Kittaneh, Thomas Krausz, Aaron Mansfield, Muzaffer Metintas, Michael Minaai, Luciano Mutti, Maartje Nielsen, Kenneth O’Byrne, Isabelle Opitz, Sandra Pastorino, Francesca Pentimalli, Marc de Perrot, Antonia Pritchard, Robert Taylor Ripley, Bruce Robinson, and Valerie Rusch, “Medical and Surgical Care of Patients With Mesothelioma and Their Relatives Carrying Germline BAP1 Mutations,” 17 J. Thoracic Oncol. 873 (2022). See also Mitchell Cheung, Yuwaraj Kadariya, Eleonora Sementino, Michael J. Hall, Ilaria Cozzi, Valeria Ascoli, Jill A. Ohar, and Joseph R. Testa, “Novel LRRK2 mutations and other rare, non-BAP1-related candidate tumor predisposition gene variants in high-risk cancer families with mesothelioma and other tumors,” 30 Human Molecular Genetics 1750 (2021); Thomas Wiesner, Isabella Fried, Peter Ulz, Elvira Stacher, Helmut Popper, Rajmohan Murali, Heinz Kutzner, Sigurd Lax, Freya Smolle-Jüttner, Jochen B. Geigl, and Michael R. Speicher, “Toward an Improved Definition of the Tumor Spectrum Associated With BAP1 Germline Mutations,” 30 J. Clin. Oncol. e337 (2012); Alexandra M. Haugh, Ching-Ni Njauw, Jeffrey A. Bubley, et al., “Genotypic and Phenotypic Features of BAP1 Cancer Syndrome: A Report of 8 New Families and Review of Cases in the Literature,” 153 J. Am. Med. Ass’n Dermatol. 999 (2017).

An Opinion to SAVOR

November 11th, 2022

The saxagliptin medications are valuable treatments for type 2 diabetes mellitus (T2DM). The SAVOR (Saxagliptin Assessment of Vascular Outcomes Recorded in Patients with Diabetes Mellitus) study was a randomized controlled trial, undertaken by manufacturers at the request of the FDA.[1] As a large (over sixteen thousand patients randomized) double-blinded cardiovascular outcomes trial, SAVOR collected data on many different end points in patients with T2DM, at high risk of cardiovascular disease, over a median of 2.1 years. The primary end point was a composite end point of cardiac death, non-fatal myocardial infarction, and non-fatal stroke. Secondary end points included each constituent of the composite, as well as hospitalizations for heart failure, coronary revascularization, or unstable angina, as well as other safety outcomes.

The SAVOR trial found no association between saxagliptin use and the primary end point, or any of the constituents of the primary end point.  The trial did, however, find a modest association between saxagliptin and one of the several secondary end points, hospitalization for heart failure (hazard ratio, 1.27; 95% C.I., 1.07 to 1.51; p = 0.007). The SAVOR authors urged caution in interpreting their unexpected finding for heart failure hospitalizations, given the multiple end points considered.[2] Notwithstanding the multiplicity, in 2016, the FDA, which does not require a showing of causation for adding warnings to a drug’s labeling, added warnings about the “risk” of hospitalization for heart failure from the use of saxagliptin medications.

And the litigation came.

The litigation evidentiary display grew to include, in addition to SAVOR, observational studies, meta-analyses, and randomized controlled trials of other DPP-4 inhibitor medications that are in the same class as saxagliptin. The SAVOR finding for heart failure was not supported by any of the other relevant human study evidence. The lawsuit industry, however, armed with an FDA warning, pressed its cases. A multi-district litigation (MDL 2809) was established. Rule 702 motions were filed by both plaintiffs’ and defendants’ counsel.

When the dust settled in this saxagliptin litigation, the court found that the defendants’ expert witnesses satisfied the relevance and reliability requirements of Rule 702, whereas the proffered opinions of plaintiffs’ expert witness, Dr. Parag Goyal, a cardiologist at Cornell-Weill Hospital in New York, did not satisfy Rule 702.[3] The court’s task was certainly made easier by the lack of any other expert witness or published opinion that saxagliptin actually causes heart failure serious enough to result in hospitalization.

The saxagliptin litigation presented an interesting array of facts for a Rule 702 showdown. First, there was an RCT that reported a nominally statistically significant association between medication use and a harm, hospitalization for heart failure. The SAVOR finding, however, was in a secondary end point, and its statistical significance was unimpressive when considered in the light of the multiple testing that took place in the context of a cardiovascular outcomes trial.

Second, the heart failure increase was not seen in the original registration trials. Third, there was an effort to find corroboration in observational studies and meta-analyses, without success. Fourth, there was no apparent mechanism for the putative effect. Fifth, there was no support from trials or observational studies of other medications in the class of DPP-4 inhibitors.

Dr. Goyal testified that the heart failure finding in SAVOR “should be interpreted as cause and effect unless there is compelling evidence to prove otherwise.” On this record, the MDL court excluded Dr. Goyal’s causation opinions. Dr. Goyal purported to conduct a Bradford Hill analysis, but the MDL court appeared troubled by his glib dismissal of the threat to validity in SAVOR from multiple testing, and his ignoring the consistency prong of the Hill factors. SAVOR was the only heart failure finding in humans, with the remaining observational studies, meta-analyses, and other trials of DPP-4 inhibitors failing to provide supporting evidence.

The challenged defense expert witnesses defended the validity of their opinions, and ultimately the MDL court had little concern in permitting them through the judicial gate. The plaintiffs’ challenges to Suneil Koliwad, a physician with a doctorate in molecular physiology, Eric Adler, a cardiologist, and Todd Lee, a pharmaco-epidemiologist, were all denied. The plaintiffs challenged, among other things, whether Dr. Adler was qualified to apply a Bonferroni correction to the SAVOR results, and whether Dr. Lee was obligated to obtain and statistically analyze the data from the trials and studies ab initio. The MDL court quickly dispatched these frivolous challenges.
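The arithmetic of a Bonferroni correction is simple enough to sketch. The snippet below is only an illustration under stated assumptions: the function names are mine, the conventional alpha of 0.05 is assumed, and the counts of comparisons are hypothetical, because the number of end points that ought to be counted in SAVOR is not stated here. The only figure taken from the trial report is the p-value of 0.007 for hospitalization for heart failure.

```python
def bonferroni_threshold(alpha, n_comparisons):
    """Per-comparison significance threshold under a Bonferroni correction."""
    return alpha / n_comparisons


def survives_bonferroni(p_value, n_comparisons, alpha=0.05):
    """True if the p-value is still at or below the corrected threshold."""
    return p_value <= bonferroni_threshold(alpha, n_comparisons)


SAVOR_HF_P = 0.007  # reported p-value, hospitalization for heart failure

# Hypothetical numbers of comparisons, chosen only to show the pattern;
# they are NOT the actual count of end points tested in SAVOR.
results = {m: survives_bonferroni(SAVOR_HF_P, m) for m in (1, 5, 7, 8, 10, 20)}

for m, ok in results.items():
    verdict = "survives" if ok else "does not survive"
    print(f"{m:>2} comparisons: threshold {bonferroni_threshold(0.05, m):.5f}; "
          f"p = {SAVOR_HF_P} {verdict}")
```

On these assumptions, the heart failure finding stays below the corrected threshold only if seven or fewer comparisons are counted; at eight or more, it does not. Bonferroni is a deliberately conservative adjustment, and other methods of handling multiplicity would draw the line somewhat differently.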

The saxagliptin MDL decision is an important reminder that litigants should remain vigilant about inaccurate assertions of “statistical significance,” even in premier, peer-reviewed journals. Not all journals are as careful as the New England Journal of Medicine in requiring qualification of claims of statistical significance in the face of multiple testing.

One legal hiccup in the court’s decision was its improvident citation to Daubert, for the proposition that the gatekeeping inquiry must focus “solely on principles and methodology, not on the conclusions they generate.”[4] That piece of obiter dictum did not survive past the Supreme Court’s 1997 decision in Joiner,[5] and it was clearly superseded by statute in 2000. Surely it is time to stop citing Daubert for this dictum.


[1] Benjamin M. Scirica, Deepak L. Bhatt, Eugene Braunwald, Gabriel Steg, Jaime Davidson, et al., for the SAVOR-TIMI 53 Steering Committee and Investigators, “Saxagliptin and Cardiovascular Outcomes in Patients with Type 2 Diabetes Mellitus,” 369 New Engl. J. Med. 1317 (2013).

[2] Id. at 1324.

[3] In re Onglyza & Kombiglyze XR Prods. Liab. Litig., MDL 2809, 2022 WL 43244 (E.D. Ky. Jan. 5, 2022).

[4] Daubert v. Merrell Dow Pharms., Inc., 509 U.S. 579, 595 (1993).

[5] General Electric Co. v. Joiner, 522 U.S. 136 (1997).

Cheng’s Proposed Consensus Rule for Expert Witnesses

September 15th, 2022

Edward K. Cheng is the Hess Professor of Law in absentia from Vanderbilt Law School, while serving this fall as a visiting professor at Harvard. Professor Cheng is one of the authors of the multi-volume treatise, Modern Scientific Evidence, and the author of many articles on scientific and statistical evidence. Cheng’s most recent article, “The Consensus Rule: A New Approach to Scientific Evidence,”[1] while thought-provoking, follows in the long-standing tradition of law school professors who advocate evidence law reforms based upon theoretical considerations devoid of practical or real-world support.

Cheng’s argument for a radical restructuring of Rule 702 is based upon his judgment that jurors and judges are epistemically incompetent to evaluate expert witness opinion testimony. The current legal approach has trial judges acting as gatekeepers of expert witness testimony, and jurors acting as judges of factual scientific claims. Cheng would abolish these roles as beyond their ken.[2] Lay persons can, however, determine which party’s position is supported by the relevant expert community, which he presumes (without evidence) possesses the needed epistemic competence. Accordingly, Cheng would rewrite the legal system’s approach to important legal disputes, such as disputes over causal claims, from:

Whether a given substance causes a given disease

to

Whether the expert community believes that a given substance causes a given disease.

Cheng channels the philosophical understanding of the ancients who realized that one must have expertise to judge whether someone else has used that expertise correctly. And he channels the contemporary understanding that knowledge is a social endeavor, not the unique perspective of an individual in isolation. From these twin premisses, Cheng derives a radical and cynical proposal to reform the law of expert witness testimony. In his vision, experts would come to court not to give their own opinions, and certainly not to try to explain how they arrive at their opinions from the available evidence. For him, the current procedure is too much like playing chess with a monkey. The expert function would consist of telling the jury what the expert witness’s community believes.[3] Jurors would not decide the “actual substantive questions,” but simply decide what they believe the relevant expert witness community accepts as a consensus. This radical restructuring is what Cheng calls the “consensus rule.”

In this proposed “consensus rule,” there is no room for gatekeeping. Parties continue to call expert witnesses, but only as conduits for the “consensus” opinions of their fields. Indeed, Cheng’s proposal would radically limit expert witnesses to service as pollsters; their testimony would present only their views of what the consensus is in their fields. This polling information is the only evidence that the jury would hear from expert witnesses, because this is the only evidence that Cheng believes the jury is epistemically competent to assess.[4]

Under Cheng’s Consensus Rule, when there is no consensus in the realm, the expert witness regime defaults to “anything goes,” without gatekeeping.[5] Judges would continue to exercise some control over who is qualified to testify, but only as far as the proposed experts must be in a position to know what the consensus is in their fields.

Cheng does not explain why, under his proposed “consensus rule,” subject matter experts are needed at all. The parties might call librarians, or sociologists of science, to talk about the relevant evidence of consensus. If a party cannot afford a librarian expert witness, then perhaps lawyers could present directly the results of their PubMed and other internet searches.

Cheng may be right that his “deferential approach” would eliminate having the inexpert passing judgment on the expert. The “consensus rule” would reduce science to polling, conducted informally, often without documentation or recording, by partisan expert witnesses. This proposal hardly better reflects, as he argues, the “true” nature of science. In Cheng’s vision, science in the courtroom is just a communal opinion, without evidence and without inference. To be sure, this alternative universe is tidier and less disputatious, but it is hardly science or knowledge. We are left with opinions about opinions, without data, without internal or external validity, and without good and sufficient facts and data.

Cheng claims that his proposed Consensus Rule is epistemically superior to Rule 702 gatekeeping. For the intellectually curious and able, his proposal is a counsel of despair. Deference to the herd, he tells us, “is not merely optimal—it is the only practical strategy.”[6] In perhaps the most extreme overstatement of his thesis, Cheng tells us that

“deference is arguably not due to any individual at all! Individual experts can be incompetent, biased, error prone, or fickle—their personal judgments are not and have never been the source of reliability. Rather, proper deference is to the community of experts, all of the people who have spent their careers and considerable talents accumulating knowledge in their field.”[7]

Cheng’s hypothesized community of experts, however, is worthy of deference only by virtue of the soundness of its judgments. If a community has not severely tested its opinions, then its existence as a community is irrelevant. Cheng’s deference is the sort of phenomenon that helped create Lysenkoism and other intellectual fads that were beyond challenge with actual data.

There is, I fear, some partial truth to Cheng’s judgment of juries and judges as epistemically incompetent, or challenged, to judge science, but his judgment seems greatly overstated. Finding aberrant jury verdicts would be easy, but Cheng provides no meaningful examples of gatekeeping gone wrong. Professor Cheng may have over-generalized in stating that judges are epistemically incompetent to make substantive expert determinations. He surely cannot be suggesting that judges never have sufficient scientific acumen to determine the relevance and reliability of expert witness opinion. If judges can, in some cases, make a reasonable go at gatekeeping, why then is Cheng advocating a general rule that strips all judges of all gatekeeping responsibility with respect to expert witnesses?

Clearly judges lack the technical resources, time, and background training to delve deeply into the methodological issues with which they may be confronted. This situation could be ameliorated by budgeting science advisors and independent expert witnesses, and by creating specialty courts staffed with judges who have scientific training. Cheng acknowledges this response, but he suggests that it conflicts with “norms about generalist judges.”[8] This retreat to norms is curious in the face of Cheng’s radical proposals, and the prevalence of using specialist judges for adjudicating commercial and patent disputes.

Although Cheng is correct that assessing validity and reliability of scientific inferences and conclusions often cannot be reduced to a cookbook or checklist approach, not all expertise is as opaque as Cheng suggests. In his view, lawyers are deluded into thinking that they can understand the relevant science, with law professors being even worse offenders.[9] Cross-examining a technical expert witness can be difficult and challenging, but lawyers on both sides of the aisle occasionally demolish the most skilled and knowledgeable expert witnesses, on substantive grounds. And these demolitions happen to expert witnesses who typically, self-servingly claim that they have robust consensuses agreeing with their opinions.

While scolding us that we must get “comfortable with relying on the expertise and authority of others,” Cheng reassures us that deferring to authority is “not laziness or an abdication of our intellectual responsibility.”[10] According to Cheng, the only reason to defer to the opinion of experts is that they are telling us what their community would say.[11] Good reasons, sound evidence, and valid inference need not worry us in Cheng’s world.

Finding Consensus

Cheng tells us that his Consensus Rule would look something like:

“Rule 702A. If the relevant scientific community believes a fact involving specialized knowledge, then that fact is established accordingly.”

Imagine the endless litigation over what the “relevant” community is. For a health effect claim about a drug and heart attacks, is it the community of cardiologists or epidemiologists? Do we accept the pronouncements of the American Heart Association or those of the American College of Cardiology? If there is a clear consensus based upon a clinical trial, which appears to be based upon suspect data, is discovery of underlying data beyond the reach of litigants because the correctness of the allegedly dispositive study is simply not in issue? Would courts have to take judicial notice of the clear consensus and shut down any attempt to get to the truth of the matter?

Cheng acknowledges that cases will involve issues that are controversial or undeveloped, without expert community consensus. Many litigations start after publication of a single study or meta-analysis, which is hardly the basis for any consensus. Cheng appears content, in this expansive area, to revert to anything goes because if the expert community has not coalesced around a unified view, or if the community is divided, then the courts cannot do better than flipping a coin! Cheng’s proposal thus has a loophole the size of the Sun.

Cheng tells us, unhelpfully, that “[d]etermining consensus is difficult in some cases, and less so in others.”[12] Determining consensus may not be straightforward, but no matter. Consensus Rule questions are not epistemically challenging and thus “far more manageable,” because they require no special expertise. (Again, why even call a subject matter expert witness, as opposed to a science journalist or librarian?) Cheng further advises that consensus is “a bit like the reasonable person standard in negligence,” but this simply conflates normative judgments with scientific judgments.[13]

Cheng’s Consensus Rule would allow the use of a systematic review or a meta-analysis, not for evidence of the correctness of its conclusions, but only as evidence of a consensus.[14] The thought experiment of how this suggestion plays out in the real world may cause some agita. The litigation over Avandia began within days of the publication of a meta-analysis in the New England Journal of Medicine.[15] So some evidence of consensus; right? But then the letters to the editor within a few weeks of publication showed that the meta-analysis was fatally flawed. Inadmissible! Under the Consensus Rule the correctness or the methodological appropriateness of the meta-analysis is irrelevant. A few months later, another meta-analysis is published, which fails to find the risk that the original meta-analysis claimed. Is the trial now about which meta-analysis represents the community’s consensus, or are we thrown into the game of anything goes, where expert witnesses just say things, without judicial supervision?  A few years go by, and now there is a large clinical trial that supersedes all the meta-analyses of small trials.[16] Is a single large clinical trial now admissible as evidence of a new consensus, or are only systematic reviews and meta-analyses relevant evidence?

Cheng’s Consensus Rule will be useless in most determinations of specific causation.  It will be a very rare case indeed when a scientific organization issues a consensus statement about plaintiff John Doe. Very few tort cases involve putative causal agents that are thought to cause every instance of some disease in every person exposed to the agent. Even when a scientific community has addressed general causation, it will have rarely resolved all the uncertainty about the causal efficacy of all levels of exposure or the appropriate window of latency. So Cheng’s proposal guarantees to remove specific causation from the control of Rule 702 gatekeeping.

The potential for misrepresenting consensus is even greater than the misrepresentations of actual study results. At least the data are the data, but what will jurors do when they are regaled by testimony about the informal consensus reached in the hotel lobby of the latest scientific conference? Regulatory pronouncements that are based upon precautionary principles will be misrepresented as scientific consensus. Findings by the International Agency for Research on Cancer that a substance is a Group 2A “probable human carcinogen” will be hawked as a consensus, even though the classification specifically disclaims any quantitative meaning for “probable,” and it directly equates to “insufficient” evidence of carcinogenicity in humans.

In some cases, as Cheng notes, organizations such as the National Research Council, or the National Academies of Sciences, Engineering, and Medicine (NASEM), will have weighed in on a controversy that has found its way into court.[17] Any help from such organizations will likely be illusory. Consider the 2006 publication of a comprehensive review of the available studies on non-pulmonary cancers and asbestos exposure by NASEM. The writing group presented its assessment of colorectal cancer as not causally associated with occupational asbestos exposure.[18] By 2007, the following year, expert witnesses for plaintiffs argued that the NASEM publication was no longer a consensus because one or two (truly inconsequential) studies had been published after the report and thus not considered. Under Cheng’s proposal, this dodge would appear to be enough to oust the consensus rule, and default to the “anything goes” rule. The scientific record can change rapidly, and many true consensus statements quickly find their way into the dustbin of scientific history.

Cheng greatly underestimates the difficulty in ascertaining “consensus.” Sometimes, to be sure, professional societies issue consensus statements, but they are often tentative and inconclusive. In many areas of science, there will be overlapping realms of expertise, with different disciplines issuing inconsistent “consensus” statements. Even within a single expert community, there may be two schools of thought about a particular issue.

There are instances, perhaps more than a few, when a consensus is epistemically flawed. If, as is the case in many health effect claims, plaintiffs rely upon the so-called linear no-threshold dose-response (LNT) theory of carcinogenesis, plaintiffs will point to regulatory pronouncements that embrace LNT as “the consensus.” When scientists are being honest, they generally recognize LNT as part of a precautionary principle approach, which may make sense as the foundation of “risk assessment.” The widespread assumption of LNT in regulatory agencies, and among scientists who work in such agencies, is understandable, but LNT remains an assumption. Nonetheless, we already see LNT hawked as a consensus, which under Cheng’s Consensus Rule would become the key dispositive issue, while quashing the mountain of evidence that there are, in fact, defense mechanisms to carcinogenesis that result in practical thresholds.

Beyond regulatory pronouncements, some areas of scientific endeavor have themselves become politicized and extremist. Tobacco smoking surely causes lung cancer, but the studies of environmental tobacco smoking and lung cancer have been oversold. In areas of non-scientific disputes, such as history of alleged corporate malfeasance, juries will be treated to “the consensus” of Marxist labor historians, without having to consider the actual underlying historical documents. Cheng tells us that his Consensus Rule is a “realistic way of treating nonscientific expertise,”[19] which would seem to cover historian expert witnesses. Yet here, lawyers and lay fact finders are fully capable of exploring the glib historical conclusions of historian witnesses with cross-examination on the underlying documentary facts of the proffered opinions.

The Alleged Warrant for the Consensus Rule

If Professor Cheng is correct that the current judicial system, with decisions by juries and judges, is epistemically incompetent, does his Consensus Rule necessarily follow?  Not really. If we are going to engage in radical reforms, then the institutionalization of blue-ribbon juries would make much greater sense. As for Cheng’s claim that knowledge is “social,” the law of evidence already permits the use of true consensus statements as learned treatises, both to impeach expert witnesses who disagree, and (in federal court) to urge the truth of the learned treatise.

The gatekeeping process of Rule 702, which Professor Cheng would throw overboard, has important advantages in that judges ideally will articulate reasons for finding expert witness opinion testimony admissible or not. These reasons can be evaluated, discussed, and debated, with judges, lawyers, and the public involved. This gatekeeping process is rational and socially open.

Some Other Missteps in Cheng’s Argument

Experts on Both Sides are Too Extreme

Cheng’s proposal is based, in part, upon his assessment that the adversarial system causes the parties to choose expert witnesses “at the extremes.” Here again, Cheng provides no empirical evidence for his assessment. There is a mechanical assumption often made by people who do not bother to learn the details of a scientific dispute that the truth must somehow lie in the “middle.” For instance, in MDL 926, the silicone gel breast implant litigation, presiding Judge Sam Pointer complained about the parties’ expert witnesses being too extreme. Judge Pointer  believed that MDL judges should not entertain Rule 702 challenges, which were in his view properly heard by the transferor courts. As a result, Judge Robert Jones, and then Judge Jack Weinstein, conducted thorough Rule 702 hearings and found that the plaintiffs’ expert witnesses’ opinions were unreliable and insufficiently supported by the available evidence.[20] Judge Weinstein started the process of selecting court-appointed expert witnesses for the remaining New York cases, which goaded Judge Pointer into taking the process back to the MDL court level. After appointing four, highly qualified expert witnesses, Judge Pointer continued to believe that the parties’ expert witnesses were “extremists,” and that the courts’ own experts would come down somewhere between them.  When the court-appointed experts filed their reports, Judge Pointer was shocked that all four of his experts sided with the defense in rejecting the tendentious claims of plaintiffs’ expert witnesses.

Statistical Significance

Along the way, in advocating his radical proposal, Professor Cheng made some other curious announcements. For instance, he tells us that “[w]hile historically used as a rule of thumb, statisticians have now concluded that using the 0.05 [p-value] threshold is more distortive than helpful.”[21] Cheng’s purpose here is unclear, but the source he cited does not remotely support his statement, and certainly not his gross overgeneralization about “statisticians.” If this is the way he envisions experts will report “consensus,” then his program seems broken at its inception. The American Statistical Association’s (ASA) p-value “consensus” statement articulated six principles, the third of which noted that

“[s]cientific conclusions and business or policy decisions should not be based only on whether a p-value passes a specific threshold.”

This is a few light years away from statisticians’ concluding that statistical significance thresholds are more distortive than helpful. The ASA p-value statement further explains that

“[t]he widespread use of ‘statistical significance’ (generally interpreted as ‘p < 0.05’) as a license for making a claim of a scientific finding (or implied truth) leads to considerable distortion of the scientific process.”[22]

In the science of health effects, statistical significance remains extremely important, but it has never been a license for making causal claims. As Sir Austin Bradford Hill noted in his famous after-dinner speech, ruling out chance (and bias) as an explanation for an association was merely a predicate for evaluating the association for causality.[23]

Over-endorsing Animal Studies

Under Professor Cheng’s Consensus Rule, the appropriate consensus might well be one generated solely by animal studies. Cheng tells us that “perhaps” scientists do not consider toxicology when the pertinent epidemiology is “clear.” When the epidemiology, however, is unclear, scientists consider toxicology.[24] Well, of course, but the key question is whether a consensus about causation in humans will be based upon non-human animal studies. Cheng seems to answer this question in the affirmative by criticizing courts that have required epidemiologic studies “even though the entire field of toxicology uses tissue and animal studies to make inferences, often in combination with and especially in the absence of epidemiology.”[25] The vitality of the field of toxicology is hardly undermined by its not generally providing sufficient grounds for judgments of human causation.

Relative Risk Greater Than Two

In the midst of his argument for the Consensus Rule, Cheng points critically to what he calls “questionable proxies” for scientific certainty. One such proxy is the judicial requirement of risk ratios in excess of two. His short discussion appears to be focused upon the inference of specific causation in a given case, but it leads to a non-sequitur:

“Some courts have required a relative risk of 2.0 in toxic tort cases, requiring a doubling of the population risk before considering causation.73 But the preponderance standard does not require that the substance more likely than not caused any case of the disease in the population, it requires that the substance more likely than not caused the plaintiff’s case.”[26]

Of course, it is exactly because we are interested in the probability of causation of the plaintiff’s case, that we advert to the risk ratio to give us some sense whether “more likely than not” the exposure caused plaintiff’s case. Unless plaintiff can show he is somehow unique, he is “any case.” In many instances, plaintiff cannot show how he is different from the participants of the study that gave rise to the risk ratio less than two.
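The arithmetic behind the risk-ratio threshold can be sketched in a few lines. The sketch assumes the usual simplifications: the risk ratio is unbiased and unconfounded, and the plaintiff is exchangeable with the study participants, so that the probability of causation among exposed cases is (RR - 1)/RR. The function name is hypothetical, chosen only for illustration.

```python
def probability_of_causation(risk_ratio: float) -> float:
    """Attributable fraction among the exposed: (RR - 1) / RR.

    Assumes an unbiased, unconfounded risk ratio and a plaintiff who is
    exchangeable with the study participants (illustrative assumptions).
    """
    if risk_ratio <= 0:
        raise ValueError("risk ratio must be positive")
    # A risk ratio at or below 1.0 attributes no excess cases to exposure.
    return max(0.0, (risk_ratio - 1.0) / risk_ratio)


for rr in (1.15, 1.5, 2.0, 3.0):
    pc = probability_of_causation(rr)
    print(f"RR = {rr:.2f} -> probability of causation = {pc:.3f}")
```

Under these assumptions, the computed fraction reaches one half only at a risk ratio of 2.0, and exceeds it only above 2.0, which is why courts advert to that figure when asking whether exposure “more likely than not” caused the plaintiff’s case.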


[1] Edward K. Cheng, “The Consensus Rule: A New Approach to Scientific Evidence,” 75 Vanderbilt L. Rev. 407 (2022) [Consensus Rule].

[2] Consensus Rule at 410 (“The judge and the jury, lacking in expertise, are not competent to handle the questions that the Daubert framework assigns to them.”)

[3] Consensus Rule at 467 (“Under the Consensus Rule, experts no longer offer their personal opinions on causation or teach the jury how to assess the underlying studies. Instead, their testimony focuses on what the expert community as a whole believes about causation.”)

[4] Consensus Rule at 467.

[5] Consensus Rule at 437.

[6] Consensus Rule at 434.

[7] Consensus Rule at 434.

[8] Consensus Rule at 422.

[9] Consensus Rule at 429.

[10] Consensus Rule at 432-33.

[11] Consensus Rule at 434.

[12] Consensus Rule at 456.

[13] Consensus Rule at 457.

[14] Consensus Rule at 459.

[15] Steven E. Nissen, M.D., and Kathy Wolski, M.P.H., “Effect of Rosiglitazone on the Risk of Myocardial Infarction and Death from Cardiovascular Causes,” 356 New Engl. J. Med. 2457 (2007).

[16] P.D. Home, et al., “Rosiglitazone Evaluated for Cardiovascular Outcomes in Oral Agent Combination Therapy for Type 2 Diabetes (RECORD),” 373 Lancet 2125 (2009).

[17] Consensus Rule at 458.

[18] Jonathan M. Samet, et al., Asbestos: Selected Health Effects (2006).

[19] Consensus Rule at 445.

[20] Hall v. Baxter Healthcare Corp., 947 F. Supp. 1387 (D. Or. 1996) (excluding plaintiffs’ expert witnesses’ causation opinions); In re Breast Implant Cases, 942 F. Supp. 958 (E. & S.D.N.Y. 1996) (granting partial summary judgment on claims of systemic disease causation).

[21] Consensus Rule at 424 (citing Ronald L. Wasserstein & Nicole A. Lazar, “The ASA Statement on p-Values: Context, Process, and Purpose,” 70 Am. Statistician 129, 131 (2016)).

[22] Id.

[23] Austin Bradford Hill, “The Environment and Disease: Association or Causation?” 58 Proc. Royal Soc’y Med. 295, 295 (1965). See Schachtman, “Ruling Out Bias & Confounding is Necessary to Evaluate Expert Witness Causation Opinions” (Oct. 29, 2018); “Woodside & Davis on the Bradford Hill Considerations” (Aug. 23, 2013); Frank C. Woodside, III & Allison G. Davis, “The Bradford Hill Criteria: The Forgotten Predicate,” 35 Thomas Jefferson L. Rev. 103 (2013).

[24] Consensus Rule at 444.

[25] Consensus Rule at 424 & n. 74 (citing to one of multiple court advisory expert witnesses in Hall v. Baxter Healthcare Corp., 947 F. Supp. 1387, 1449 (D. Or. 1996), who suggested that toxicology would be appropriate to consider when the epidemiology was not clear). Citing to one outlier advisor is a rather strange move for Cheng considering that the “consensus” was readily discernible to the trial judge in Hall, and to Judge Jack Weinstein, a few months later, in In re Breast Implant Cases, 942 F. Supp. 958 (E. & S.D.N.Y. 1996).

[26] Consensus Rule at 424 & n. 73 (citing Lucinda M. Finley, “Guarding the Gate to the Courthouse: How Trial Judges Are Using Their Evidentiary Screening Role to Remake Tort Causation Rules,” 49 DePaul L. Rev. 335, 348–49 (2000)). See Schachtman, “Rhetorical Strategy in Characterizing Scientific Burdens of Proof” (Nov. 15, 2014).

Amicus Curious – Gelbach’s Foray into Lipitor Litigation

August 25th, 2022

Professor Schauer’s discussion of statistical significance, covered in my last post,[1] is curious for its disclaimer that “there is no claim here that measures of statistical significance map easily onto measures of the burden of proof.” Having made the disclaimer, Schauer proceeds to fall into the transposition fallacy, which contradicts his disclaimer, and, generally speaking, is not a good thing for a law professor eager to advance the understanding of “The Proof” to do.

Perhaps more curious than Schauer’s error is his citation support for his disclaimer.[2] The cited paper by Jonah B. Gelbach is one of several of Gelbach’s papers that advance the claim that the p-value does indeed map onto posterior probability and the burden of proof. Gelbach’s claim has also been the centerpiece in his role as an advocate in support of plaintiffs in the Lipitor (atorvastatin) multi-district litigation (MDL) over claims that ingestion of atorvastatin causes diabetes mellitus.

Gelbach’s intervention as plaintiffs’ amicus is peculiar on many fronts. At the time of the Lipitor litigation, Sonal Singh was an epidemiologist and Assistant Professor of Medicine, at the Johns Hopkins University. The MDL trial court initially held that Singh’s proffered testimony was inadmissible because of his failure to consider daily dose.[3] In a second attempt, Singh offered an opinion for 10 mg daily dose of atorvastatin, based largely upon the results of a clinical trial known as ASCOT-LLA.[4]

The ASCOT-LLA trial randomized 19,342 participants with hypertension and at least three other cardiovascular risk factors to two different anti-hypertensive medications. A subgroup with total cholesterol levels less than or equal to 6.5 mmol./l. were randomized to either daily 10 mg. atorvastatin or placebo.  The investigators planned to follow up for five years, but they stopped after 3.3 years because of clear benefit on the primary composite end point of non-fatal myocardial infarction and fatal coronary heart disease. At the time of stopping, there were 100 events of the primary pre-specified outcome in the atorvastatin group, compared with 154 events in the placebo group (hazard ratio 0.64 [95% CI 0.50 – 0.83], p = 0.0005).

The atorvastatin component of ASCOT-LLA had, in addition to its primary pre-specified outcome, seven secondary end points, and seven tertiary end points.  The emergence of diabetes mellitus in this trial population, which clearly was at high risk of developing diabetes, was one of the tertiary end points. Primary, secondary, and tertiary end points were reported in ASCOT-LLA without adjustment for the obvious multiple comparisons. In the treatment group, 3.0% developed diabetes over the course of the trial, whereas 2.6% developed diabetes in the placebo group. The unadjusted hazard ratio was 1.15 (0.91 – 1.44), p = 0.2493.[5] Given the 15 trial end points, an adjusted p-value for this particular hazard ratio, for diabetes, might well exceed 0.5, and even approach 1.0.
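A rough sense of what a multiplicity adjustment does to the reported p-value can be had from two textbook corrections. The sketch below is only illustrative: the trial report specifies no adjustment, the helper names are hypothetical, and both corrections are conservative when end points are correlated, as some of these are.

```python
def bonferroni(p: float, m: int) -> float:
    """Bonferroni-adjusted p-value for m comparisons, capped at 1.0."""
    return min(1.0, p * m)


def sidak(p: float, m: int) -> float:
    """Sidak-adjusted p-value, which assumes m independent comparisons."""
    return 1.0 - (1.0 - p) ** m


p_unadjusted = 0.2493  # reported for the diabetes end point
n_end_points = 15      # 1 primary + 7 secondary + 7 tertiary

print(f"Bonferroni: {bonferroni(p_unadjusted, n_end_points):.3f}")
print(f"Sidak:      {sidak(p_unadjusted, n_end_points):.3f}")
```

Either correction pushes the adjusted value far above 0.5. A less conservative method that models the correlation among end points would give a smaller adjustment, so the output is best read as an upper range and not a definitive figure.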

On this record, Dr. Singh honestly acknowledged that statistical significance was important, and that the diabetes finding in ASCOT-LLA might have been the result of low statistical power or of no association at all. Based upon the trial data alone, he testified that “one can neither confirm nor deny that atorvastatin 10 mg is associated with significantly increased risk of type 2 diabetes.”[6] The trial court excluded Dr. Singh’s 10mg/day causal opinion, but admitted his 80mg/day opinion. On appeal, the Fourth Circuit affirmed the MDL district court’s rulings.[7]

Jonah Gelbach is a professor of law at the University of California at Berkeley. He attended Yale Law School, and received his doctorate in economics from MIT.

Professor Gelbach entered the Lipitor fray to present a single issue: whether statistical significance at conventionally demanding levels such as 5 percent is an appropriate basis for excluding expert testimony based on statistical evidence from a single study that did not achieve statistical significance.

Professor Gelbach is no stranger to antic proposals.[8] As amicus curious in the Lipitor litigation, Gelbach asserts that plaintiffs’ expert witness, Dr. Singh, was wrong in his testimony about not being able to confirm the ASCOT-LLA association because he, Gelbach, could confirm the association.[9] Ultimately, the Fourth Circuit did not discuss Gelbach’s contentions, which is not surprising considering that the asserted arguments and alleged factual considerations were not only dehors the record, but in contradiction of the record.

Gelbach’s curious claim is that any time a risk ratio, for an exposure and an outcome of interest, is greater than 1.0, with a p-value < 0.5,[10] the evidence should be not only admissible, but sufficient to support a conclusion of causation. Gelbach states his claim in the context of discussing a single randomized controlled trial (ASCOT-LLA), but his broad pronouncements are carelessly framed such that others may take them to apply to a single observational study, with its greater threats to internal validity.

Contra Kumho Tire

To get to his conclusion, Gelbach attempts to remove the constraints of traditional standards of significance probability. Kumho Tire teaches that expert witnesses must “employ[] in the courtroom the same level of intellectual rigor that characterizes the practice of an expert in the relevant field.”[11] For Gelbach, this “eminently reasonable admonition” does not impose any constraints on statistical inference in the courtroom. Statistical significance at traditional levels (p < 0.05) is for elitist scholarly work, not for the “practical” rent-seeking work of the tort bar. According to Gelbach, the inflation of the significance level ten-fold to p < 0.5 is merely a matter of “weight” and not admissibility of any challenged opinion testimony.

Likelihood Ratios and Posterior Probabilities

Gelbach maintains that any evidence that has a likelihood ratio greater than one (LR > 1) is relevant, and should be admissible under Federal Rule of Evidence 401.[12] This argument ignores the other operative Federal Rules of Evidence, namely 702 and 703, which impose additional criteria of admissibility for expert witness opinion testimony.

With respect to variance and random error, Gelbach tells us that any evidence that generates a LR > 1, should be admitted when “the statistical evidence is statistically significant below the 50 percent level, which will be true when the p-value is less than 0.5.”[13]

At times, Gelbach seems to be discussing the admissibility of the ASCOT-LLA study itself, and not the proffered opinion testimony of Dr. Singh. The study itself would not be admissible, although it is clearly the sort of hearsay an expert witness in the field may consider. If Dr. Singh were to have reframed and recalculated the statistical comparisons, then the Rule 703 requirement of “reasonable reliance” by scientists in the field of interest may not have been satisfied.

Gelbach also generates a posterior probability (0.77), which is based upon his calculations from data in the ASCOT-LLA trial, and not the posterior probability of Dr. Singh’s opinion. The posterior probability, as calculated, is problematic on many fronts.

Gelbach does not present his calculations – for the sake of brevity he says – but he tells us that the ASCOT-LLA data yield a likelihood ratio of roughly 1.9, and a p-value of 0.126.[14] What the clinical trialists reported was a hazard ratio of 1.15, which is a weak association on most researchers’ scales, with a two-sided p-value of 0.25, which is five times higher than the usual 5 percent. Gelbach does not explain how or why his calculated p-value for the likelihood ratio is roughly half the unadjusted, two-sided p-value for the tertiary outcome from ASCOT-LLA.
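For arithmetic context only, p-values can be approximately back-calculated from a published ratio and its 95% confidence interval, using a normal approximation on the log scale. The sketch below applies that approximation to the reported hazard ratio; it is not the trialists’ method, and it makes no claim about how the brief arrived at its figure. It does show that a one-sided p-value for a symmetric test statistic is half the two-sided value. The function names are hypothetical.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def p_values_from_ci(ratio: float, lower: float, upper: float) -> tuple[float, float]:
    """Approximate (two-sided, one-sided) p-values from a ratio and its 95% CI.

    Normal approximation on the log scale; an illustrative back-calculation,
    not the method used in the trial report or in the brief.
    """
    se = (math.log(upper) - math.log(lower)) / (2 * 1.96)
    z = math.log(ratio) / se
    one_sided = 1.0 - normal_cdf(z)            # tests only for harm
    two_sided = 2.0 * (1.0 - normal_cdf(abs(z)))
    return two_sided, one_sided


two_sided, one_sided = p_values_from_ci(1.15, 0.91, 1.44)
print(f"two-sided p ~ {two_sided:.3f}, one-sided p ~ {one_sided:.3f}")
```

Because the published interval is rounded, the back-calculated two-sided value will not exactly match the reported 0.2493.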

As noted, the reported diabetes hazard ratio of 1.15 was a tertiary outcome for the ASCOT trial, one of 15 calculated by the trialists, with p-values unadjusted for multiple comparisons.  The failure to adjust is perhaps excusable in that some (but certainly not all) of the outcome variables are overlapping or correlated. A sophisticated reader would not be misled; only when someone like Gelbach attempts to manufacture an inflated posterior probability without accounting for the gross underestimate in variance is there an insult to statistical science. Gelbach’s recalculated p-value for his LR, if adjusted for the multiplicity of comparisons in this trial, would likely exceed 0.5, rendering all his arguments nugatory.

Using the statistics as presented by the published ASCOT-LLA trial to generate a posterior probability also ignores the potential biases (systematic errors) in data collection, the unadjusted hazard ratios, the potential for departures from random sampling, errors in administering the participant recruiting and inclusion process, and other errors in measurements, data collection, data cleaning, and reporting.

Gelbach correctly notes that there is nothing methodologically inappropriate in advocating likelihood ratios, but he is less than forthcoming in explaining that such ratios translate into a posterior probability only if he posits a prior probability of 0.5.[15] His pretense to having simply stated “mathematical facts” unravels when we consider his extreme, unrealistic, and unscientific assumptions.

The Problematic Prior

Gelbach glibly assumes that the starting point, the prior probability, for his analysis of Dr. Singh’s opinion is 50%. This is an old and common mistake,[16] long since debunked.[17] Gelbach’s assumption is part of an old controversy, which surfaced in early cases concerning disputed paternity. The assumption, however, is wrong legally and philosophically.

The law simply does not hand out 0.5 prior probability to both parties at the beginning of a trial. As Professor Jaffee noted almost 35 years ago:

“In the world of Anglo-American jurisprudence, every defendant, civil and criminal, is presumed not liable. So, every claim (civil or criminal) starts at ground zero (with no legal probability) and depends entirely upon proofs actually adduced.”[18]

Gelbach assumes that assigning “equal prior probability” to two adverse parties is fair, because the fact-finder would not start hearing evidence with any notion of which party’s contentions are correct. The 0.5/0.5 starting point, however, is neither fair nor is it the law.[19] The even odds prior is also not good science.

The defense is entitled to a presumption that it is not liable, and the plaintiff must start at zero.  Bayesians understand that this is the death knell of their beautiful model.  If the prior probability is zero, then Bayes’ Theorem tells us mathematically that no evidence, no matter how large a likelihood ratio, can move the prior probability of zero towards one. Bayes’ theorem may be a correct statement about inverse probabilities, but still be an inadequate or inaccurate model for how factfinders do, or should, reason in determining the ultimate facts of a case.
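The dependence of the posterior on the prior is easy to see in the odds form of Bayes’ theorem, in which posterior odds equal the likelihood ratio times the prior odds. The sketch below is a minimal illustration with a hypothetical function name. The likelihood ratio of 1.9 is borrowed from the figure quoted above purely as an input; the sketch is not an attempt to reproduce any calculation in the brief.

```python
def posterior_probability(prior: float, likelihood_ratio: float) -> float:
    """Posterior probability from a prior, via the odds form of Bayes' theorem."""
    if not 0.0 <= prior <= 1.0:
        raise ValueError("prior must lie in [0, 1]")
    if likelihood_ratio < 0:
        raise ValueError("likelihood ratio must be non-negative")
    if prior == 1.0:
        return 1.0  # certainty cannot be revised downward by finite odds
    prior_odds = prior / (1.0 - prior)
    posterior_odds = likelihood_ratio * prior_odds
    return posterior_odds / (1.0 + posterior_odds)


for prior in (0.0, 0.01, 0.1, 0.5):
    for lr in (1.9, 1e6):
        post = posterior_probability(prior, lr)
        print(f"prior = {prior:<4} LR = {lr:<9g} posterior = {post:.4f}")
```

With a prior of exactly zero, the posterior stays at zero however large the likelihood ratio, and with a modest likelihood ratio the posterior crosses one half only if the prior is already close to one half.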

We can see how unrealistic and unfair Gelbach’s implied prior probability is if we visualize the proof process as a football field. To win, plaintiffs do not need to score a touchdown; they need only cross the mid-field 50-yard line. Rather than making plaintiffs start at the zero-yard line, however, Gelbach would put them right on the 50-yard line. Since one toe over the mid-field line is victory, the plaintiff is spotted 99.99+% of its burden of having to present evidence to build up 50% probability. Plaintiffs are allowed to scoot from the zero-yard line right up to the brink of success, where even the slightest breeze might give them winning cases. Somehow, in the model, plaintiffs no longer have to present evidence to traverse the first half of the field.

The even odds starting point is completely unrealistic in terms of the events upon which the parties are wagering. The ASCOT-LLA study might have shown a protective association between atorvastatin and diabetes, or it might have shown no association at all, or it might have shown a larger hazard ratio than measured in this particular sample. Recall that the confidence interval for hazard ratios for diabetes ran from 0.91 to 1.44. In other words, parameters from 0.91 (protective association) to 1.0 (no association), to 1.44 (harmful association) were all reasonably compatible with the observed statistic, based upon this one study’s data. The potential outcomes are not binary, which makes the even odds starting point inappropriate.[20]


[1] “Schauer’s Long Footnote on Statistical Significance” (Aug. 21, 2022).

[2] Frederick Schauer, The Proof: Uses of Evidence in Law, Politics, and Everything Else 54-55 (2022) (citing Michelle M. Burtis, Jonah B. Gelbach, and Bruce H. Kobayashi, “Error Costs, Legal Standards of Proof, and Statistical Significance,” 25 Supreme Court Economic Rev. 1 (2017)).

[3] In re Lipitor Mktg., Sales Practices & Prods. Liab. Litig., MDL No. 2:14–mn–02502–RMG, 2015 WL 6941132, at *1 (D.S.C. Oct. 22, 2015).

[4] Peter S. Sever, et al., “Prevention of coronary and stroke events with atorvastatin in hypertensive patients who have average or lower-than-average cholesterol concentrations, in the Anglo-Scandinavian Cardiac Outcomes Trial Lipid Lowering Arm (ASCOT-LLA): a multicentre randomised controlled trial,” 361 Lancet 1149 (2003). [cited here as ASCOT-LLA]

[5] ASCOT-LLA at 1153 & Table 3.

[6] In re Lipitor Mktg., Sales Practices & Prods. Liab. Litig., 174 F. Supp. 3d 911, 921 (D.S.C. 2016) (quoting Dr. Singh’s testimony).

[7] In re Lipitor Mktg., Sales Practices & Prods. Liab. Litig., 892 F.3d 624, 638-39 (4th Cir. 2018) (affirming MDL trial court’s exclusion in part of Dr. Singh).

[8] See “Expert Witness Mining – Antic Proposals for Reform” (Nov. 4, 2014).

[9] Brief for Amicus Curiae Jonah B. Gelbach in Support of Plaintiffs-Appellants, In re Lipitor Mktg., Sales Practices & Prods. Liab. Litig., 2017 WL 1628475 (April 28, 2017). [Cited as Gelbach]

[10] Gelbach at *2.

[11] Kumho Tire Co. v. Carmichael, 526 U.S. 137, 152 (1999).

[12] Gelbach at *5.

[13] Gelbach at *2, *6.

[14] Gelbach at *15.

[15] Gelbach at *19-20.

[16] See Richard A. Posner, “An Economic Approach to the Law of Evidence,” 51 Stanford L. Rev. 1477, 1514 (1999) (asserting that the “unbiased fact-finder” should start hearing a case with even odds; “[I]deally we want the trier of fact to work from prior odds of 1 to 1 that the plaintiff or prosecutor has a meritorious case. A substantial departure from this position, in either direction, marks the trier of fact as biased.”).

[17] See, e.g., Richard D. Friedman, “A Presumption of Innocence, Not of Even Odds,” 52 Stan. L. Rev. 874 (2000). [Friedman]

[18] Leonard R. Jaffee, “Prior Probability – A Black Hole in the Mathematician’s View of the Sufficiency and Weight of Evidence,” 9 Cardozo L. Rev. 967, 986 (1988).

[19] Id. at p.994 & n.35.

[20] Friedman at 877.

The Rise of Agnotology as Conspiracy Theory

July 19th, 2022

A few egregious articles in the biomedical literature have begun to endorse explicitly asymmetrical standards for inferring causation in the context of environmental or occupational exposures. Very little, if anything, is needed for inferring causation, and nothing counts against causation. If authors refuse to infer causation, then they are agents of “industry,” epidemiologic malfeasors, and doubt mongers.

For an example of this genre, take the recent article, entitled “Toolkit for detecting misused epidemiological methods.”[1] [Toolkit] Please.

The asymmetry begins with Trump-like projection of the authors’ own foibles. The principal hammer in the authors’ toolkit for detecting misused epidemiologic methods is personal, financial bias. And yet, somehow, in an article that calls out other scientists for having received money from “industry,” the authors overlooked the business of disclosing their receipt of monies from one of the biggest industries around – the lawsuit industry.

Under the heading “competing interests,” the authors state that “they have no competing interests.”[2] Lead author, Colin L. Soskolne, was, however, an active, partisan expert witness for plaintiffs’ counsel in diacetyl litigation.[3] In an asbestos case before the Pennsylvania Supreme Court, Rost v. Ford Motor Co., Soskolne signed on to an amicus brief, supporting the plaintiff, using his science credentials, without disclosing his expert witness work for plaintiffs, or his long-standing anti-asbestos advocacy.[4]

Author Shira Kramer signed on to Toolkit, without disclosing any conflicts, but with an even more impressive résumé of pro-plaintiff litigation experience.[5] Kramer is the owner of Epidemiology International, in Cockeysville, Maryland, where she services the lawsuit industry. She too was an “amicus” in Rost, without disclosing her extensive plaintiff-side litigation consulting and testifying.

Carl Cranor, another author of Toolkit, takes first place for hypocrisy on conflicts of interest. As a founder of the Council for Education and Research on Toxics (CERT), he has sterling credentials for monetizing the bounty hunt against “carcinogens,” most recently against coffee.[6] He has testified in denture cream and benzene litigation, for plaintiffs. When he was excluded under Rule 702 from the Milward case, CERT filed an amicus brief on his behalf, without disclosing that Cranor was a founder of that organization.[7], [8]

The title seems reasonably fair-minded, but the virulent bias of the authors is soon revealed. The Toolkit is presented as a Table in the middle of the article, but the actual “tools” are for the most part not seriously discussed, other than advice to “follow the money” to identify financial conflicts of interest.

The authors acknowledge that epidemiology provides critical knowledge of risk factors and causation of disease, but they quickly transition to an effort to silence any industry commentator on any specific epidemiologic issue. As we will see, the lawsuit industry is given a complete pass. Not surprisingly, several of the authors (Kramer, Cranor, Soskolne) have worked closely in tandem with the lawsuit industry, and have derived financial rewards for their efforts.

Repeatedly, the authors tell us that epidemiologic methods and language are misused by “powerful interests,” which have financial stakes in the outcome of research. Agents of these interests foment uncertainty and doubt about causal relationships through “disinformation,” “malfeasance,” and “doubt mongering.” There is no correlative concern about false claiming or claim mongering.

Who are these agents who plot to sabotage “social justice” and “truth”? Clearly, they are scientists with whom the Toolkit authors disagree. The Toolkit gang cites several papers as exemplifying “malfeasance,”[9] but they never explain what was wrong with them, or how the malfeasors went astray. The Toolkit tactics seem worthy of a Twitter smear and run.

The Toolkit

The authors’ chart of “tools” used by industry might have been an interesting taxonomy of error, but mostly the “tools” are ad hominem attacks on scientists with whom the authors disagree. Channeling Putin on Ukraine, those scientists who would impose discipline and rigor on epidemiologic science are derided as not “real epidemiologists,” and, to boot, they are guilty of ethical lapses in failing to advance “social justice.”

Mostly the authors give us a toolkit for silencing those who would get in the way of the situational science deployed at the beck and call of the lawsuit industry.[10] Indeed, the Toolkit authors are not shy about identifying their litigation goals; they tell us that the toolkit can be deployed in depositions and in cross-examinations to pursue “social justice.” These authors also outline a social agenda that greatly resembles the goals of cancel culture: expose the perpetrators who stand in the way of the authors’ preferred policy choices, diminish their adversaries’ influence on journals, and galvanize peer reviewers to reject their adversaries’ scientific publications. The Toolkit authors tell us that “[t]he scientific community should engage by recognizing and professionally calling out common practices used to distort and misapply epidemiological and other health-related sciences.”[11] What this advice translates into are covert and open ad hominem campaigns as peer reviewers to block publications, to deny adversaries tenure and promotions, and to use social and other media outlets to attack adversaries’ motives, good faith, and competence.

None of this is really new. Twenty-five years ago, the late F. Douglas K. Liddell railed at the Mt. Sinai mob, and the phenomenon was hardly new then.[12] The Toolkit’s call to arms is, however, quite open, and raises the question whether its authors and adherents can be fair journal editors and peer reviewers of journal submissions.

Much of the Toolkit is the implementation of a strategy developed by lawsuit industry expert witnesses to demonize their adversaries by accusing them of manufacturing doubt or ignorance or uncertainty. This strategy has gained a label used to deride those who disagree with litigation overclaiming: agnotology or the creation of ignorance. According to Professor Robert Proctor, a regular testifying historian for tobacco plaintiffs, a linguist, Iain Boal, coined the term agnotology, in 1992, to describe the study of the production of ignorance.[13]

The Rise of “Agnotology” in Ngram

Agnotology has become a cottage sub-industry of the lawsuit industry, although lawsuits (or claim mongering if you like), of course, remain their main product. Naomi Oreskes[14] and David Michaels[15] gave the agnotology field greater visibility with their publications, using the less erudite but catchier phrase “manufacturing doubt.” Although the study of ignorance and uncertainty has a legitimate role in epistemology[16] and sociology,[17] much of the current literature is dominated by those who use agnotology as propaganda in support of their own litigation and regulatory agendas.[18] One lone author, however, appears to have taken agnotology study seriously enough to see that it is largely a conspiracy theory that reduces complex historical or scientific theory, evidence, opinion, and conclusions to a clash between truth and a demonic ideology.[19]

Is there any substance to the Toolkit?

The Toolkit is not entirely empty of substantive issues. The authors note that “statistical methods are a critical component of the epidemiologist’s toolkit,”[20] and they cite some articles about common statistical mistakes missed by peer reviewers. Curiously, the Toolkit omits any meaningful discussion of statistical mistakes that increase the risk of false positive results, such as multiple comparisons or dichotomizing continuous confounder variables. As for the Toolkit’s number one identified “inappropriate” technique used by its authors’ adversaries, we have:

“A1. Relying on statistical hypothesis testing; Using ‘statistical significance’ at the 0.05 level of probability as a strict decision criterion to determine the interpretation of statistical results and drawing conclusions.”

Peer into the hearing on any so-called Daubert motion in federal court, and you will see the lawsuit industry, and its hired expert witnesses, rail at statistical significance, unless, of course, there is some subgroup that has nominal significance, in which case they are all in for endorsing the finding as “conclusive.”

Welcome to asymmetric, situational science.


[1] Colin L. Soskolne, Shira Kramer, Juan Pablo Ramos-Bonilla, Daniele Mandrioli, Jennifer Sass, Michael Gochfeld, Carl F. Cranor, Shailesh Advani & Lisa A. Bero, “Toolkit for detecting misused epidemiological methods,” 20(90) Envt’l Health (2021) [Toolkit].

[2] Toolkit at 12.

[3] Watson v. Dillon Co., 797 F.Supp. 2d 1138 (D. Colo. 2011).

[4] Rost v. Ford Motor Co., 151 A.3d 1032 (Pa. 2016). See “The Amicus Curious Brief” (Jan. 4, 2018).

[5] See, e.g., Sean v. BMW of North Am., LLC, 26 N.Y.3d 801, 48 N.E.3d 937, 28 N.Y.S.3d 656 (2016) (affirming exclusion of Kramer); The Little Hocking Water Ass’n v. E.I. Du Pont De Nemours & Co., 90 F.Supp.3d 746 (S.D. Ohio 2015) (excluding Kramer); Luther v. John W. Stone Oil Distributor, LLC, No. 14-30891 (5th Cir. April 30, 2015) (mentioning Kramer as litigation consultant); Clair v. Monsanto Co., 412 S.W.3d 295 (Mo. Ct. App. 2013) (mentioning Kramer as plaintiffs’ expert witness); In re Chantix (Varenicline) Prods. Liab. Litig., No. 2:09-CV-2039-IPJ, MDL No. 2092, 2012 WL 3871562 (N.D. Ala. 2012) (excluding Kramer’s opinions in part); Frischhertz v. SmithKline Beecham Corp., 2012 U.S. Dist. LEXIS 181507, Civ. No. 10-2125 (E.D. La. Dec. 21, 2012) (excluding Kramer); Donaldson v. Central Illinois Public Service Co., 199 Ill. 2d 63, 767 N.E.2d 314 (2002) (affirming admissibility of Kramer’s opinions in absence of Rule 702 standards).

[6] “The Council for Education & Research on Toxics” (July 9, 2013) (CERT amicus brief filed without any disclosure of conflict of interest). Among the fellow travelers who wittingly or unwittingly supported CERT’s scheme to pervert the course of justice were lawsuit industry stalwarts, Arthur L. Frank, Peter F. Infante, Philip J. Landrigan, Barry S. Levy, Ronald L. Melnick, David Ozonoff, and David Rosner. See also NAS, “Carl Cranor’s Conflicted Jeremiad Against Daubert” (Sept. 23, 2018); Carl Cranor, “Milward v. Acuity Specialty Products: How the First Circuit Opened Courthouse Doors for Wronged Parties to Present Wider Range of Scientific Evidence” (July 25, 2011).

[7] Milward v. Acuity Specialty Products Group, Inc., 664 F. Supp. 2d 137, 148 (D. Mass. 2009), rev’d, 639 F.3d 11 (1st Cir. 2011), cert. den. sub nom. U.S. Steel Corp. v. Milward, 565 U.S. 1111 (2012), on remand, Milward v. Acuity Specialty Products Group, Inc., 969 F.Supp. 2d 101 (D. Mass. 2013) (excluding specific causation opinions as invalid; granting summary judgment), aff’d, 820 F.3d 469 (1st Cir. 2016).

[8] To put this effort into a sociology of science perspective, the Toolkit article is published in a journal, Environmental Health, an Editor in Chief of which is David Ozonoff, a long-time pro-plaintiff partisan in the asbestos litigation. The journal has an “ombudsman,” Anthony Robbins, who was one of the movers-and-shakers in forming SKAPP, The Project on Scientific Knowledge and Public Policy, a group that plotted to undermine the application of federal evidence law of expert witness opinion testimony. SKAPP itself is now defunct, but its spirit of subverting law lives on with efforts such as the Toolkit. “More Antic Proposals for Expert Witness Testimony – Including My Own Antic Proposals” (Dec. 30, 2014). Robbins is also affiliated with an effort, led by historian and plaintiffs’ expert witness David Rosner, to perpetuate misleading historical narratives of environmental and occupational health. “ToxicHistorians Sponsor ToxicDocs” (Feb. 1, 2018); “Creators of ToxicDocs Show Off Their Biases” (June 7, 2019); Anthony Robbins & Phyllis Freeman, “ToxicDocs (www.ToxicDocs.org) goes live: A giant step toward leveling the playing field for efforts to combat toxic exposures,” 39 J. Public Health Pol’y 1 (2018).

[9] The exemplars cited were Paolo Boffetta, Hans Olov Adami, Philip Cole, Dimitrios Trichopoulos, and Jack Mandel, “Epidemiologic studies of styrene and cancer: a review of the literature,” 51 J. Occup. & Envt’l Med. 1275 (2009); Carlo LaVecchia & Paolo Boffetta, “Role of stopping exposure and recent exposure to asbestos in the risk of mesothelioma,” 21 Eur. J. Cancer Prev. 227 (2012); John Acquavella, David Garabrant, Gary Marsh, Thomas Sorahan, and Douglas L. Weed, “Glyphosate epidemiology expert panel review: a weight of evidence systematic review of the relationship between glyphosate exposure and non-Hodgkin’s lymphoma or multiple myeloma,” 46 Crit. Rev. Toxicol. S28 (2016); Catalina Ciocan, Nicolò Franco, Enrico Pira, Ihab Mansour, Alessandro Godono, and Paolo Boffetta, “Methodological issues in descriptive environmental epidemiology. The example of study Sentieri,” 112 La Medicina del Lavoro 15 (2021).

[10] The Toolkit authors acknowledge that their identification of “tools” was drawn from previous publications of the same ilk, in the same journal. Rebecca F. Goldberg & Laura N. Vandenberg, “The science of spin: targeted strategies to manufacture doubt with detrimental effects on environmental and public health,” 20:33 Envt’l Health (2021).

[11] Toolkit at 11.

[12] F.D.K. Liddell, “Magic, Menace, Myth and Malice,” 41 Ann. Occup. Hyg. 3, 3 (1997). See “The Lobby – Cut on the Bias” (July 6, 2020).

[13] Robert N. Proctor & Londa Schiebinger, Agnotology: The Making and Unmaking of Ignorance (2008).

[14] Naomi Oreskes & Erik M. Conway, Merchants of Doubt: How a Handful of Scientists Obscured the Truth on Issues from Tobacco Smoke to Global Warming (2010); Naomi Oreskes & Erik M. Conway, “Defeating the merchants of doubt,” 465 Nature 686 (2010).

[15] David Michaels, The Triumph of Doubt: Dark Money and the Science of Deception (2020); David Michaels, Doubt is Their Product: How Industry’s Assault on Science Threatens Your Health (2008); David Michaels, “Science for Sale,” Boston Rev. 2020; David Michaels, “Corporate Campaigns Manufacture Scientific Doubt,” 174 Science News 32 (2008); David Michaels, “Manufactured Uncertainty: Protecting Public Health in the Age of Contested Science and Product Defense,” 1076 Ann. N.Y. Acad. Sci. 149 (2006); David Michaels, “Scientific Evidence and Public Policy,” 95 Am. J. Public Health s1 (2005); David Michaels & Celeste Monforton, “Manufacturing Uncertainty: Contested Science and the Protection of the Public’s Health and Environment,” 95 Am. J. Pub. Health S39 (2005); David Michaels & Celeste Monforton, “Scientific Evidence in the Regulatory System: Manufacturing Uncertainty and the Demise of the Formal Regulatory System,” 13 J. L. & Policy 17 (2005); David Michaels, “Doubt is Their Product,” Sci. Am. 96 (June 2005); David Michaels, “The Art of ‘Manufacturing Uncertainty’,” L.A. Times (June 24, 2005).

[16] See, e.g., Sibilla Cantarini, Werner Abraham, and Elisabeth Leiss, eds., Certainty-uncertainty – and the Attitudinal Space in Between (2014); Roger M. Cooke, Experts in Uncertainty: Opinion and Subjective Probability in Science (1991).

[17] See, e.g., Ralph Hertwig & Christoph Engel, eds., Deliberate Ignorance: Choosing Not to Know (2021); Linsey McGoey, The Unknowers: How Strategic Ignorance Rules the World (2019); Michael Smithson, “Toward a Social Theory of Ignorance,” 15 J. Theory Social Behavior 151 (1985).

[18] See Janet Kourany & Martin Carrier, eds., Science and the Production of Ignorance: When the Quest for Knowledge Is Thwarted (2020); John Launer, “The production of ignorance,” 96 Postgraduate Med. J. 179 (2020); David S. Egilman, “The Production of Corporate Research to Manufacture Doubt About the Health Hazards of Products: An Overview of the Exponent Bakelite® Simulation Study,” 28 New Solutions 179 (2018); Larry Dossey, “Agnotology: on the varieties of ignorance, criminal negligence, and crimes against humanity,” 10 Explore 331 (2014); Gerald Markowitz & David Rosner, Deceit and Denial: The Deadly Politics of Industrial Pollution (2002).

[19] See Enea Bianchi, “Agnotology: a Conspiracy Theory of Ignorance?” Ágalma: Rivista di studi culturali e di estetica 41 (2021).

[20] Toolkit at 4.