TORTINI

For your delectation and delight, desultory dicta on the law of delicts.

A Bayesian Toehold in the New Reference Guide to Epidemiology

April 4th, 2026

The most recent edition of the Reference Manual on Scientific Evidence distinguishes more carefully between Bayesian and frequentist approaches to statistical analysis than did its previous iterations. In past editions, the authors conflated confidence and credible intervals, an error that is studiously avoided in the text of the fourth edition’s chapter on epidemiology.[1]

The chapter acknowledges that “most published research does not” use Bayesian credible intervals of posterior probabilities. The authors then offer a largely unsupported conclusion about a “toehold”:

“Epidemiologic studies assessed by Bayesian statistical analyses have begun to gain a toehold in litigation, although court opinions are still dominated by discussion of traditional significance testing.”[2]

The authors do not define what a toehold is; nor do they specify whether it is a big toe or a pinky toe. The new chapter cites three cases, which, out of the universe of cases, seems like a tiny toe. The three cases cited by the Reference Manual as a toehold raise serious questions about the legitimacy of using Bayesian analyses in litigation, at least to date.

  1. Langrell.

In Langrell,[3] one of the three cases cited by the Manual, an expert witness claimed to have used a “Bayesian approach,” but in reality no Bayesian statistics were involved. The Manual describes the result in Langrell as admitting the testimony of a specific causation expert witness who had used a Bayesian approach for specific causation of a cancer “so rare that it was ‘unlikely or impossible for epidemiological studies to be performed’.”[4]

Citing Langrell for the stated proposition was questionable scholarship at best. The case was one of several cancer claims against railroad employers in which Robert Peter Gale served as an expert witness. Dr. Gale is a well-credentialed clinician whose career has focused on lymphopoietic cancers.[5] He has no apparent expertise in statistics or epidemiology.

In one reported decision, Byrd, Dr. Gale attempted to offer a “Bayesian” opinion that railroad yard exposures caused a worker’s lung cancer. The claimant had also been a two-pack-per-day smoker for many years.[6] The published opinion refers to Dr. Gale’s having used Bayesian methods, but nothing in it suggests that such methods were actually used.[7] Gale appeared to equate Bayesian analysis with a non-quantitative differential etiology. Given the claimant’s extensive smoking history, the trial court excluded Dr. Gale’s proffered opinion on the cause of the claimant’s lung cancer as unreliable.

In another railroad case, brought by Saul Hernandez, Gale also claimed to use Bayesian methods to assess the causation of the claimant’s stomach cancer. There is, however, only one mention of Bayes in Gale’s report:

“My opinion is based in Bayesian probabilities which consider the interdependence of individual probabilities. This process is sometimes referred to as differential diagnosis or differential causation determination or differential etiology. Differential diagnosis is a method of reasoning widely-accepted in medicine.”[8]

To be explicit, there was no discussion of prior or posterior probabilities or odds, and no discussion of likelihood ratios or Bayes factors. There was absolutely nothing in Dr. Gale’s report to warrant his claim that he had done a Bayesian analysis of specific causation, or of the “interdependence of individual probabilities” of putative specific causes. The court excluded Dr. Gale’s proffered opinion in Hernandez, with its scant reference to a Bayesian analysis.[9]

The third instance of Gale’s purported use of a Bayesian analysis occurred in the Langrell case, cited by the Manual. The authors of the new Manual do not specify what kind of rare cancer was involved. For the record, Mr. Langrell developed squamous cell carcinoma of the tonsils, the most common type of oropharyngeal cancer, a malignancy that has been studied for many decades. Alcohol, tobacco, and human papillomavirus (HPV) have long been associated with the occurrence of such cancers. Mr. Langrell had a history of exposure to all three risk factors. Contrary to Gale’s poor-mouthing about a lack of data, there are many large cohort studies of railroad yard workers with diesel fume exposure.[10]

The full extent of the district court’s exposition about Gale’s “Bayesian” method was to state that:

“He testified he used a Bayesian approach, allowing him to ‘consider interdependence of individual probabilities’ and to render an opinion as to ‘whether the weight of the evidence indicates it is more likely than not to a reasonable degree of medical probability that exposure to the carcinogens discussed was a cause of tonsil cancer in Mr. Langrell’.”[11]

There is no evidence that Dr. Gale had the competence to conduct a Bayesian analysis, or that he actually did one. Dr. Gale’s participation in the Langrell, Byrd, and Hernandez cases seems like poor evidence of a toehold for Bayesian methods. Not even a pinky toe.

We might forgive the credulity of the judicial officers in these cases, but why would Dr. Gale state that he had done a Bayesian analysis? The only reason that suggests itself is that Dr. Gale was bloviating in order to give his specific causation opinions an aura of scientific and mathematical respectability. Falsus in uno, falsus in omnibus.[12] In two of the three related cases, his opinion was rejected. The Manual cites only the case in which Gale’s opinion was admitted. The cited opinion offers no support for Gale’s having actually conducted a Bayesian analysis of any sort.

  2. In re Abilify.

The second cited example of a toehold was the use of a Bayesian analysis by a statistician, David Madigan, in the Abilify litigation. Madigan has published on Bayesian statistics, but his litigation activities have repeatedly raised questions about whether his Bayesian analyses are reliable.

The Abilify litigation involved claims that the anti-psychotic medication caused impulsive gambling, eating, shopping, and sex. Of course, psychotic behavior itself involves those impulsive behaviors and many others. The Manual cited a decision of the multi-district litigation court that noted that “[n]umerous federal courts have found Dr. Madigan’s methodology of detecting safety signals using a combination of frequentist and Bayesian algorithms to be reliable under Rule 702 and Daubert.”[13]

The “signals” to which the Manual citation refers are suggestions of possible causal associations; they are hypotheses generated from pharmacovigilance reviews of adverse event reports, not tests of those hypotheses. Signals are not causes; they may not rise even to the level of associations. The particular analyses proffered by Madigan in Abilify, and in many other litigations for plaintiffs, involve comparing the rate of reporting specific adverse events for the drug of interest with the reporting rate for all drugs, or for comparator drugs. The outcome of these analyses is a reporting rate ratio, not an incidence ratio.

The following 2 × 2 table illustrates how adverse event data are used to create “signals” of disproportionate reporting.
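A minimal sketch of such a table, with purely hypothetical counts (invented for illustration, not drawn from any real database), together with the two disproportionality statistics most commonly computed from it, the reporting odds ratio (ROR) and the proportional reporting ratio (PRR):

```python
# Standard pharmacovigilance 2 x 2 layout (hypothetical counts):
#
#                      event of interest   all other events
#  drug of interest         a = 20              b = 980
#  all other drugs          c = 100             d = 98900
a, b, c, d = 20, 980, 100, 98900

# Reporting odds ratio: odds of the event among reports for the drug,
# divided by the odds of the event among reports for all other drugs.
ror = (a / b) / (c / d)

# Proportional reporting ratio: proportion of the drug's reports that
# involve the event, relative to the same proportion for other drugs.
prr = (a / (a + b)) / (c / (c + d))

print(round(ror, 2), round(prr, 2))  # 20.18 19.8
```

Both statistics measure disproportion in reporting only; even a large ROR is a reporting rate ratio, not an incidence ratio, and by itself says nothing about causation.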

The FDA provides very clear guidance on the meaning and use of such signal-finding algorithms or disproportionality analyses (DPAs):

“In the context of spontaneous report systems, some authors use the term ‘signal of disproportionate reporting’ (SDR) when discussing associations highlighted by DPA methods. In reality, most SDRs that emerge from spontaneous report databases represent non-causal effects because the reports are associated with treatment indications (i.e., confounding by indication), co-prescribing patterns, co-morbid illnesses, protopathic bias, channeling bias, or other reporting artifacts, or, the reported adverse events are already labeled or are medically trivial.”[14]

Disproportionality analyses are not part of analytical epidemiology, but Madigan has tried to pass them off as such in any number of litigations. More discerning courts have excluded his attempts. In the Accutane litigation in Atlantic County, New Jersey, Judge Johnson conducted an extensive pre-trial hearing on challenges to Madigan’s causation opinions, and found them wanting under the New Jersey analogue of Federal Rule of Evidence 702.[15] On appeal, the New Jersey Supreme Court reviewed and affirmed the exclusion of Madigan’s litigation opinions that isotretinoin causes Crohn’s disease.[16]

The pattern of adverse event report filing in connection with isotretinoin has been carefully studied; it illustrates the FDA’s point about artifacts. One such study of isotretinoin adverse event reporting showed that attorneys filed 87.8% of the reports, while physicians filed 6.0%, and consumers only 5.1%. For the entire FAERS database, only 3.6% of reports for all drug reactions during the same time period were filed by attorneys (p < 0.01).[17]

In other areas, less affected by litigation-created reporting bias, the results of DPAs have been compared with analytical epidemiology. A DPA of statin use and bladder cancer suggested a reporting odds ratio of 1.48 (95% CI, 1.36–1.61). The authors, in a peer-reviewed publication, reported the result with clearly inappropriate causal language: “Multi-methodological approaches suggest that statins are associated with an increased risk for bladder cancer.”[18] An appropriate meta-analysis of analytical epidemiologic studies reported an actual odds ratio of 1.07 (95% CI, 0.95–1.21), a finding interpreted as suggesting “that there was no association between statin use and risk of bladder cancer.”[19]

Dr. Madigan’s use of Bayesian methods to analyze reporting ratios, and his passing them off as evidence that can support causal inference, is a paradigmatic instance of an inappropriate methodology. His litigation work seems like poor evidence of a toehold for Bayesian methods.

  3. In re Testosterone.

The third case cited by the Manual for the toehold proposition arose in the multi-district litigation created for claims against manufacturers of testosterone. This MDL aggregated cases based upon a speculative Public Citizen petition claiming that transdermal testosterone used by men with low testosterone levels causes heart attacks and strokes. The plaintiffs adopted what appeared to be a strategy of deploying complex arguments and analyses to obfuscate and defeat Rule 702 gatekeeping. As part of this strategy, two of the plaintiffs’ expert witnesses conducted a Bayesian “hypothesis test,” by which they took an out-of-date meta-analysis,[20] removed some of the studies that they incorrectly decided were duplicative, and recalculated a credible interval instead of a confidence interval.

This Bayesian hypothesis test came up in several decisions of the MDL court. The Manual cited only a decision dated August 23, 2018, which it characterized as denying a motion to exclude expert witness testimony that advanced a Bayesian critique of epidemiologic studies.[21]

Looking at the cited decision of August 23, 2018, we see a reference to a previous ruling in May 2017, in which the court held that an expert witness’s failure and inability to “quantify the cardiovascular risk he finds in his Bayesian analysis … is an issue affecting the weight to be accorded to his analysis, not its admissibility.”[22] On its face, this ruling does not quite make sense, given that a Bayesian analysis necessarily involves a quantification of posterior probability. The referenced May 2017 opinion also demonstrates the court’s failure to understand basic frequentist concepts, in reciting incorrect definitions of the p-value and the confidence interval:

“According to conventional statistical practice, such a result—that is, a finding of a positive association between smoking and development of the disease—would be considered statistically significant if there is a 95% probability, also expressed as a “p-value” of <0.05, that the observed association is not the product of chance. If, however, the p-value were greater than 0.05, the observed association would not be regarded as statistically significant, according to prevailing conventions, because there is a greater than 5% probability that the association observed was the result of chance.

* * *

Statistical significance can also be expressed equivalently in terms of a confidence interval. A confidence interval consists of a range of values. For a 95% confidence interval, one would expect future studies sampling the same population to produce values within the range 95% of the time.”[23]
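The error in these definitions can be made concrete with a small simulation (our own illustration, not anything in the record). The “95%” describes the long-run behavior of the interval-construction procedure over repeated samples; it is not the probability that any single observed association is real:

```python
import random
import statistics

# Draw repeated samples from a population with a KNOWN mean, build a
# large-sample 95% confidence interval around each sample mean, and
# count how often the interval covers the true mean.
random.seed(1)
true_mean, n, trials = 10.0, 50, 2000
covered = 0
for _ in range(trials):
    sample = [random.gauss(true_mean, 2.0) for _ in range(n)]
    m = statistics.fmean(sample)
    se = statistics.stdev(sample) / n ** 0.5
    lo, hi = m - 1.96 * se, m + 1.96 * se  # large-sample 95% CI
    covered += lo <= true_mean <= hi

print(covered / trials)  # long-run coverage, close to 0.95
```

Any given interval either covers the true value or it does not; the 95% attaches to the procedure over repeated sampling, which is exactly the distinction the quoted definitions miss.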

There is, however, also a discussion in the May 2017 decision of the Bayesian hypothesis test, which had been developed by plaintiffs’ expert witnesses, Burt Gerstman and Martin Wells.[24] The new Manual’s citation to the testosterone MDL case seems to be to this Bayesian analysis.

While the testosterone MDL case cited by the Manual refers only obliquely to a putative Bayesian analysis that had no quantification, the May 2017 decision, not cited by the Manual, actually involved a Bayesian analysis that supposedly yielded a posterior probability of 85% that there was some increased risk for a composite of heart attack and stroke outcomes from use of testosterone therapies.

In the May 2017 decision, the MDL court rejected AbbVie’s Rule 702 motion to exclude Gerstman’s opinion based upon the Bayesian hypothesis test. AbbVie’s challenge to the Gerstman-Wells Bayesian analysis seemed to avoid the complexity inherent in the analysis. The AbbVie motion included several grounds for excluding the Bayesian analysis, not all of which were discussed in the court’s May 2017 decision, including:

“1) the plaintiffs’ witnesses’ failure to publish their analysis;

2) the challenged witness’s having never published a significant Bayesian analysis previously;

3) the absence of Bayesian analyses in the relevant studies on testosterone;

4) the rarity of Bayesian analyses in product liability cases;

5) the witnesses’ failure to state what the actual risk was, as opposed to the probability that it exceeded 1.0; and

6) the defense expert witness’s calculation that the “Increased [cardiovascular] risk meets only a 70% level of evidence, which is far below the 95% level required.”[25]

Grounds one through four were extremely weak as stated, and ground five did not affect the relevancy of the analysis to general causation. Ground six was the shot in the foot, with the defense’s falling into the trap of conflating the coefficient of confidence (95%) with the posterior probability of a Bayesian analysis.

According to the district court’s opinion, AbbVie challenged Gerstman’s Bayesian analysis because Gerstman never used or published on Bayesian statistics, and thus he lacked expertise in Bayesian analysis. This part of the challenge was readily dismissed because the level of qualifications for an expert witness is very low. A somewhat more substantive objection complained that the Bayesian analysis was “inappropriately based on subjective assumptions.”

The MDL court refused to exclude Gerstman’s Bayesian analysis, relying in part upon the suggestion in the statistics chapter of the Reference Manual’s third edition that Bayesians constitute a “well-established minority” in the field of statistics.[26]

On AbbVie’s claim that Bayesian methods are excessively “subjective,” the court declared that AbbVie had failed to explain how the subjective aspect of Bayesian analysis made the proffered Bayesian analysis “any less reliable than frequentist approaches to statistics, which also involve subjective judgments in interpretation of study results.”

Unfortunately, important issues raised by the plaintiffs’ Bayesian meta-analysis were not raised by counsel or addressed by the MDL court’s initial gatekeeping opinion of May 2017. The court briefly revisited the Bayesian analysis as proffered by Martin Wells, with the same lack of specificity, in August 2018.[27] The Bayesian analysis had been prepared jointly by Gerstman and Wells, and the August 2018 decision followed the earlier decision from 2017, without adding any analysis or explanation.

A third challenge to Wells’ Bayesian analysis was filed in 2019, by a different defendant in the testosterone MDL. This challenge was supported by an expert witness report that carefully identified the invalidity of the proffered Bayesian analysis.

Bayes’ Rule is a theorem that yields a posterior probability for a claim or proposition from a prior probability and the strength of the evidence at hand. Unlike frequentist statistics, which treat the population parameter (a mean or a risk ratio) as fixed but unknown, Bayesian analyses represent uncertainty about the parameter with probability distributions, both before (the prior) and after (the posterior) the evidence is considered. Every Bayesian analysis must start with a prior probability, and therein lies a serious methodological problem, not addressed by the testosterone MDL court in May 2017.
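A schematic numerical illustration, with invented numbers unrelated to the litigation, shows the machinery: with a normal prior on the log risk ratio and a normally distributed study estimate, Bayes’ Rule yields a normal posterior whose mean is a precision-weighted average of prior and data:

```python
import math

# Prior: log RR ~ Normal(mu0, tau0^2), centered on "no effect" (RR = 1).
mu0, tau0 = 0.0, 1.0
# Hypothetical study estimate: RR = 1.3, with standard error 0.2 on the
# log scale (both numbers invented for illustration).
est, se = math.log(1.3), 0.2

# Conjugate normal-normal update: precisions (inverse variances) add,
# and the posterior mean is the precision-weighted average.
w_prior, w_data = 1 / tau0 ** 2, 1 / se ** 2
post_mean = (w_prior * mu0 + w_data * est) / (w_prior + w_data)
post_sd = (w_prior + w_data) ** -0.5

# Posterior probability that the risk ratio exceeds 1 (log RR > 0):
z = (0.0 - post_mean) / post_sd
p_rr_gt_1 = 1 - 0.5 * (1 + math.erf(z / math.sqrt(2)))

print(round(post_mean, 3), round(p_rr_gt_1, 3))
```

Everything downstream depends on the choice of mu0 and tau0; that dependence is what makes the selection of a prior a methodological battleground.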

The Bayesian hypothesis test advanced by the plaintiffs’ expert witnesses in the testosterone cases was based on a method described by John Carlin.[28] The analysis invoked a prior risk ratio of 1.0, which standing alone might seem like a perfectly fair and disinterested prior. The chosen variance around 1.0, which makes up the prior probability distribution, however, was extremely wide and flat, essentially encompassing no risk at the low end and absolute risk at the high end. A flat distribution implies that, as starting points, testosterone’s causing all heart attacks and strokes, its preventing all such outcomes, and its having no effect at all were roughly equally likely. Given our very good understanding that testosterone neither prevents nor causes all heart attacks and strokes, these starting points are unrealistic. The starting assumptions of the plaintiffs’ meta-analysis were, therefore, completely unrealistic and counterfactual.
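The problem with such a flat prior can be made concrete. A minimal sketch, assuming a Normal(0, 10) prior on the log risk ratio (the standard deviation of 10 is our own illustrative choice, not the value used by the litigation experts), shows how little mass a “vague” prior places on plausible effect sizes and how much it places on absurd ones:

```python
import math

def norm_cdf(x, mu=0.0, sd=1.0):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1 + math.erf((x - mu) / (sd * math.sqrt(2))))

sd = 10.0  # illustrative "vague" prior sd on the log risk ratio scale

# Prior probability of a plausible effect: risk ratio between 0.5 and 2.
p_plausible = norm_cdf(math.log(2), 0, sd) - norm_cdf(math.log(0.5), 0, sd)

# Prior probability of an extreme effect: RR above 100 or below 1/100.
p_extreme = 2 * (1 - norm_cdf(math.log(100), 0, sd))

print(round(p_plausible, 3), round(p_extreme, 3))
```

Under this prior, only about five percent of the prior mass falls on risk ratios between 0.5 and 2, while well over half falls on more than hundred-fold effects in one direction or the other; far from being neutral, such a “non-informative” prior is informatively absurd.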

Carlin’s method, used in the proffered Bayesian meta-analysis in the testosterone cases, further assumed a “hierarchical normal model.” Carlin described his assumption as reasonable “as long as the studies are large and observed counts are not too small.”[29] In the dataset used by plaintiffs’ expert witnesses, however, virtually all the studies had very low event counts, often zero or one, in the TRT arm or the placebo arm, or both. Carlin acknowledged that it was difficult to assess the validity of the normal model, and emphasized that

“[a] study of the sensitivity of conclusions to the choice of prior would be important.”[30]

Subsequent simulation studies of Carlin’s approach have shown that so-called “vague” or “non-informative” priors, such as were used by plaintiffs’ expert witnesses, can exercise an “unintentionally large degree of influence on any inferences.”[31]
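The sparse-data problem can also be illustrated directly. A hierarchical normal model of this kind treats each study’s log odds ratio as approximately normal with variance 1/a + 1/b + 1/c + 1/d; a hypothetical small trial with a zero cell (the counts below are invented) shows how the approximation comes to depend on an arbitrary continuity correction, with a variance driven by a single event:

```python
import math

def log_or_with_correction(a, b, c, d):
    """Log odds ratio and its approximate variance from a 2 x 2 table,
    adding the conventional 0.5 continuity correction when a cell is 0."""
    if 0 in (a, b, c, d):
        a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
    est = math.log((a * d) / (b * c))
    var = 1 / a + 1 / b + 1 / c + 1 / d
    return est, var

# Hypothetical trial: 1 event among 100 treated, 0 among 100 on placebo.
est, var = log_or_with_correction(1, 99, 0, 100)
print(round(est, 2), round(var, 2))  # 1.11 2.69
```

A single event yields a log odds ratio near 1.1 (an odds ratio near 3) with a variance so large that the normality assumption is doing all the work; these are exactly the “small observed counts” for which Carlin cautioned that his model may not hold.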

AbbVie’s earlier challenges to Gerstman and Wells failed to note that they had offered no tests of the validity of Carlin’s method in the context of meta-analyzing clinical trials for sparse safety outcomes. The 2019 challenge in the Martin case, by contrast, attacked the unsupported assumptions of the proffered Bayesian hypothesis test. This Rule 702 challenge pointed out not only the subjectivity of the assumed prior probability distribution, but also its counterfactual nature, and the failure of the proffered Bayesian analysis to comply with the methodological requirements of Carlin’s method.

There were additional problems with the Bayesian hypothesis test as put forward by plaintiffs’ expert witnesses. First, advancing a causal claim with an 85% posterior probability was bound to be confused with the plaintiffs’ burden of proof of greater than 50%, notwithstanding that the calculated posterior probability did not account for uncertainty from bias and other non-random errors in the aggregated clinical trial data, which were out of date and which reflected questionable inclusion and exclusion criteria. Second, the posterior probability was based upon a composite end point that combined heart attack and stroke. As a later deposition of one of the Bayesian analysts, Martin Wells, showed, had the Carlin method been applied to the heart attack summary point estimate alone, the posterior probability that TRT causes heart attack would have been less than 50%, and the probability that testosterone does not cause heart attack correspondingly greater than 50%.[32]

Notwithstanding the plaintiffs’ failure to rebut the very specific methodological challenges to their witnesses’ Bayesian analysis, the MDL court denied the third Rule 702 motion to exclude, without meaningful analysis.[33] The case (Martin) was later tried to a jury that returned a verdict for the defense. Neither in Martin nor in any other testosterone case that was tried did plaintiffs actually present their Bayesian analysis to the trier of fact. The likely interpretation of this failure is that the Bayesian analysis was always meant to obfuscate the weaknesses of their causation case and to help deflect Rule 702 challenges.

The ultimate verdict on the plaintiffs’ case, and on the Bayesian hypothesis test with its ill-informed “non-informative” priors, was returned only after most of the MDL cases were tried or had settled. In 2023, a “mega-trial,” a large, well-conducted randomized controlled trial, was concluded and published, with findings of no increased risk of heart attack or stroke after long-term use of TRT in men who resembled the TRT plaintiffs. The trial enrolled over 5,000 men; a primary composite cardiovascular end-point event occurred in 182 men (7.0%) on testosterone therapy and in 190 men (7.3%) receiving placebo, with a hazard ratio below one (HR 0.96; 95% CI, 0.78–1.17). None of the components of the composite (heart attack, stroke) showed an increased risk.[34]

“Falshood flies, and Truth comes limping after it; so that when Men come to be undeceived, it is too late, the Jest is over, and the Tale has had its Effect: Like a Man who has thought of a good Repartee, when the Discourse is changed, or the Company parted: Or, like a Physician who hath found out an infallible Medicine after the Patient is dead.”[35]

CONCLUSION

The Reference Manual’s chapter on epidemiology claims that Bayesian analyses have gained a toehold in litigation. The authors cited three cases, all involving the evaluation of health effects. The first (Langrell) involved a claim of specific causation, and the cited opinion showed no evidence of an actual Bayesian analysis. It was one of three cases in which the same expert witness, Dr. Gale, claimed to use a Bayesian analysis; the other two, not cited, rejected the admissibility of his proffered testimony.

The second case cited (In re Abilify) actually involved a Bayesian analysis, but one used for a so-called disproportionality analysis, a technique for generating signals of a possible health effect. The misuse of the analysis by the Bayesian analyst (David Madigan) was overlooked by the court, and by the Reference Manual.

The third case cited by the Manual, In re Testosterone, also involved an actual Bayesian analysis, in the form of a Bayesian hypothesis test. The proffered analysis did, in theory, speak to a material issue of general causation. The Manual’s credulous citation, and the MDL court’s gatekeeping, however, overlooked that the methodology was misspecified and misapplied in multiple ways.

If these three citations are a toehold, then we need a tow-truck for these wrecks!


[1] Steve C. Gold, Michael D. Green, Jonathan Chevrier & Brenda Eskenazi, Reference Guide on Epidemiology, in National Academies of Sciences, Engineering, and Medicine & Federal Judicial Center, REFERENCE MANUAL ON SCIENTIFIC EVIDENCE 939 (4th ed. 2025) [cited as GGCE].

[2] GGCE at 963 n.178.

[3] Langrell v. Union Pac. Ry. Co., No. 8:18CV57, 2020 WL 3037271, at *3 (D. Neb. June 5, 2020).

[4] Id.

[5] See, e.g., Robert Peter Gale et al., FETAL LIVER TRANSPLANTATION (1987); Robert Peter Gale & Thomas Hauser, CHERNOBYL: THE FINAL WARNING (1988); Kenneth A. Foon, Robert Peter Gale, et al., IMMUNOLOGIC APPROACHES TO THE CLASSIFICATION AND MANAGEMENT OF LYMPHOMAS AND LEUKEMIAS (1988); Eric Lax & Robert Peter Gale, RADIATION: WHAT IT IS, WHAT YOU NEED TO KNOW (2013).

[6] Byrd v. Union Pacific RR, 453 F. Supp. 3d 1260 (D. Neb. 2020).

[7] Id. at 1270 (“Dr. Gale states that his opinion is based on Bayesian probabilities which consider the interdependence of individual probabilities. This process is sometimes referred to as differential diagnosis or differential etiology.”).

[8] Report of Robert Peter Gale in Saul Hernandez at 13 (July 23, 2019)[on file with author]. There was no evidence that Mr. Hernandez was tested for infection by helicobacter pylori.

[9] Hernandez v. Union Pacific RR, No. 8:18CV62 (D. Neb. Aug. 14, 2020).

[10] See, e.g., Monireh Sadat Seyyedsalehi, Giulia Collatuzzo, Federica Teglia & Paolo Boffetta, Occupational exposure to diesel exhaust and head and neck cancer: a systematic review and meta-analysis of cohort studies, 33 EUR. J. CANCER PREV. 435 (2024).

[11] Langrell v. Union Pac. Ry. Co., No. 8:18CV57, 2020 WL 3037271, at *3-4 (D. Neb. June 5, 2020).

[12] Dr. Gale’s testimony has not fared well elsewhere. See, e.g., In re Incretin-Based Therapies Prods. Liab. Litig., 524 F. Supp. 3d 1007 (S.D. Cal. 2021) (excluding Gale); Wilcox v. Homestake Mining Co., 619 F.3d 1165 (10th Cir. 2010); June v. Union Carbide Corp., 577 F.3d 1234 (10th Cir. 2009) (affirming exclusion of Dr. Gale and entry of summary judgment); Finestone v. Florida Power & Light Co., 272 F. App’x 761 (11th Cir. 2008); In re Rezulin Prods. Liab. Litig., 309 F. Supp. 2d 531 (S.D.N.Y. 2004) (excluding Dr. Gale from offering ethical opinions); Cundy v. BNSF Ry., No. 40095-6-III, Wash. Ct. App. (Mar. 5, 2026) (affirming dismissal of case; Gale was one of plaintiffs’ expert witnesses); Russo v. Metro-North RR., Index No. 159201/2019, 2025 NY Slip Op 34659(U), N.Y. S. Ct., N.Y. Cty. (Dec. 5, 2025); Saverino v. Metro-North RR, 2024 NY Slip Op 31326(U), Index No. 161353/2019, N.Y. S. Ct., N.Y. Cty. (Apr. 8, 2024).

[13] In re Abilify (Aripiprazole) Prods. Liab. Litig., No. 3:16MD2734, 2021 WL 4951944, at *5 (N.D. Fla. July 15, 2021).

[14] FDA Adverse Event Reporting System (FAERS) (Last updated Sept. 8, 2014), available at <http://www.fda.gov/Drugs/GuidanceComplianceRegulatoryInformation/Surveillance/AdverseDrugEffects/default.htm>.

[15] In re Accutane Litig., No. 271(MCL), 2015 WL 753674, at *15 (N.J. Super. Law Div., Feb. 20, 2015) (Hon. Nelson C. Johnson, also known as the author of Boardwalk Empire).

[16] In re Accutane, 234 N.J. 340 (2018) (affirming exclusion of David Madigan).

[17] Derrick J. Stobaugh, et al., Alleged isotretinoin-associated inflammatory bowel disease: Disproportionate reporting by attorneys to the Food and Drug Administration Adverse Event Reporting System, 69 J. AM. ACAD. DERMATOL. 393 (2013).

[18] Mai Fujimoto, et al., Association between Statin Use and Bladder Cancer: Data Mining of a Spontaneous Reporting Database and a Claim Database, 1 J. PHARMACOL. & PHARMACOVIGILANCE 1 (2015).

[19] Xiao-long Zhang, et al., Statin use and risk of bladder cancer: a meta-analysis, 24 CANCER CAUSES & CONTROL 769 (2013).

[20] S. Albert & J. Morley, Testosterone therapy, association with age, initiation and mode of therapy with cardiovascular events: a systematic review, 95 CLIN. ENDOCRINOL. 436 (2016).

[21] GGCE at 963 n.178 (citing In re Testosterone Replacement Therapy Prods. Liab. Litig., No. 14 C 1748, 2018 WL 4030585, at *8 (N.D. Ill. Aug. 23, 2018), and explaining that the court had denied a “motion to exclude testimony of expert ‘whose Bayesian critiques of epidemiological studies’ were similar to those of another expert whose testimony ‘the Court has previously found admissible’.”).

[22] In re Testosterone Replacement Therapy Prods. Liab. Litig., No. 14 C 1748, 2017 WL 1833173, at *4 (N.D. Ill. May 8, 2017).

[23] Id.

[24] This is the same Martin Wells found to be a methodological shapeshifter in the paraquat parkinsonism litigation. In re Paraquat Prods. Liab. Litig., Case No. 3:21-md-3004-NJR, MDL No. 3004, 730 F. Supp. 3d 793, 838 (S.D. Ill. 2024). See also Schachtman, Paraquat Shape-Shifting Expert Witness Quashed, TORTINI (Apr. 24, 2024).

[25] Defendants’ Motion to Exclude Plaintiffs’ Expert Testimony on the Issue of Causation, and for Summary Judgment, and Mem. of Law in Support, No. 1:14-CV-01748, MDL 2545, 2017 WL 1104501, at *69–70 (N.D. Ill. Feb. 20, 2017) (citing Reference Manual 259 (3rd ed. 2011), for the proposition that “‘subjective Bayesians are a well-established minority’ of scientists whose methods ‘have rarely been used in court.’”). See also Plaintiffs’ Mem. of Law in Opp. to Motion of AbbVie Defendants to Exclude Plaintiffs’ Expert Testimony on Causation, and for Summary Judgment, MDL No. 2545, Dkt. No. 1753 (N.D. Ill. Mar. 23, 2017).

[26] See David H. Kaye & David Freedman, Reference Guide on Statistics, in National Academies of Sciences, Engineering, and Medicine & Federal Judicial Center, REFERENCE MANUAL ON SCIENTIFIC EVIDENCE 529 (3rd ed. 2011).

[27] In re Testosterone Replacement Therapy Prods. Liab. Litig., MDL No. 2545, 2018 WL 4030585, at *8 (N.D. Ill. Aug. 23, 2018).

[28] John Carlin, Meta-analysis for 2 x 2 tables: a Bayesian approach, 11 STAT. MED. 141 (1992) [hereinafter Carlin].

[29] Carlin at 157.

[30] Id.

[31] See P. Lambert et al., How vague is vague? A simulation study of the impact of the use of vague prior distributions in MCMC using WinBUGS, 24 STAT. MED. 2401, 2402 (2005). See also Andrew Gelman, Prior distributions for variance parameters in hierarchical models, 1 BAYESIAN ANALYSIS 515 (2006); E. Pullenayegum, An informed reference prior for between-study heterogeneity in meta-analyses of binary outcomes, 30 STAT. MED. 3082 (2010).

[32] Deposition of Martin Wells, in Martin v. Actavis, Inc., No. 15-cv-4292, 2018 WL 7350886 (N.D. Ill. Apr. 2, 2018).

[33] Martin v. Actavis, Inc., Case No. 15 C 4292, MDL No. 2545, 430 F. Supp. 3d 516, 534 (N.D. Ill. 2019).

[34] A. Lincoff et al., Cardiovascular Safety of Testosterone-Replacement Therapy, 389 NEW ENGL. J. MED. 107, 114 (2023).

[35] Jonathan Swift, The Examiner No. 14 (Nov. 9, 1710), in THE EXAMINER & OTHER PIECES WRITTEN IN 1710-11 at 8, 11-12 (Herbert Davis, ed. 1966).

How Science Works in the New Reference Manual on Scientific Evidence

March 12th, 2026

The Second and Third Editions of the Reference Manual on Scientific Evidence contained a chapter, “How Science Works,” by Professor David Goodstein. This chapter ambitiously set out to cover philosophy and sociology of science to help orient judges as strangers in a strange land. Goodstein’s chapter had been a useful introduction to scientific methodology, and it countered some of the antic ideas seen in some judicial opinions, as well as in some other chapters of the Manual. Goodstein brought a good deal of experience and expertise to the task. He was a distinguished professor of physics and Vice Provost at the California Institute of Technology, and he had written engagingly about scientific discovery and the pathology of science.[1] Sadly, Goodstein died in April 2024. His death may have had some role in the delayed publication of the Fourth Edition of the Manual,[2] and the improvident replacement of his chapter with a new chapter written by authors less articulate about how science works.

The substitute chapter on “How Science Works” was written by two authors considerably less accomplished than the late Professor Goodstein.[3] Michael Weisberg is a professor of philosophy at the University of Pennsylvania, where he is the deputy director of Perry World House, which “analyzes global policy challenges through the realms of climate, democracy, global justice and human rights, and security.” The connection with Perry World House may explain the new chapter’s heavy reliance upon the development of the chlorofluorocarbon (CFC) connection to ozone layer depletion as an exemplar of scientific discovery and knowledge. The University of Pennsylvania webpage describes Weisberg as “educat[ing] the next generation of environmental leaders in the classroom, at the negotiating table, and in the field, ensuring that their voices have maximal impact on addressing the climate crisis.”[4] So we have a philosopher of advocacy science, as it were. Some readers might think those credentials are not optimal for preparing a nuts-and-bolts description of how science works. Reading sections of the new chapter will not diminish their concerns.

Joining Weisberg on this new version of “How Science Works” is Anastasia Thanukos, who works at the University of California Museum of Paleontology. Thanukos has a master’s degree in integrative biology and a doctorate in science education.[5]

The new “method” chapter has some virtues. As Goodstein’s chapter did, the new authors put peer review into a realistic perspective that should keep judges from being snookered into admitting weak or bogus evidence because it had been published in a peer-reviewed journal.[6] The authors should have gone much farther in pointing out that the rise of predatory and pay-to-play journals, as well as journals controlled by advocacy groups, has undermined much of the publishing model of modern science.

Weisberg and Thanukos discuss “expertise” in a way that is interesting but irrelevant to legal cases. They seem blithely unaware that the standard for qualifying an expert witness is extremely low. Who will disabuse them when they argue that “[i]t is worth evaluating the closeness of a scientist’s disciplinary expertise to a scientific topic on which expert testimony is delivered”?[7] In what emerges as a consistent pattern of giving anti-manufacturing-industry examples, the authors point to Richard Scorer, an accomplished scientist who had no specific expertise in CFC ozone depletion. Notwithstanding the lack of specific expertise, an industry-backed group promoted Scorer’s views criticizing the CFC-ozone depletion hypothesis.[8] Citing Naomi Oreskes, the new Manual chapter states that “[t]he problem of scientists with legitimate expertise in one field weighing in on a scientific question outside their area of expertise is a pernicious one that has affected public acceptance of science and policy on issues such as climate change and tobacco exposure.”[9] Later, when Weisberg and Thanukos discuss the Milward case, they miss the pernicious influence that flowed from allowing Martyn Smith, a toxicologist, to give methodologically muddled opinion testimony on epidemiology. Pernicious is where you find it, and the authors of the new chapter find virtually all untoward instances of poor scientific method and conduct to originate from manufacturing industry.

Weisberg and Thanukos introduce a discussion of the “replication crisis,” a phrase and concept absent from the third edition of the Reference Manual.[10] The authors express some skepticism that there is an actual crisis over replication,[11] but their focus on climate science may mean that they are simply blinded by groupthink in that discipline. Their discussion of retractions omits the steep rise in retraction rates in most scientific disciplines,[12] and the authors ignore the proliferation of poor quality journals. Positively, the authors introduce a discussion of study preregistration, a notion absent from the third edition of the Manual, and they explain that such preregistration may serve as a bulwark against data dredging and post hoc analyses.[13] Negatively, the authors ignore how frequently preregistered protocols are not used, or are used and then violated.

Weisberg and Thanukos appropriately ignore “weight of the evidence” (WOE) and “inference to the best explanation” (IBE). Readers might (mistakenly) think that the new chapter implicitly rejects WOE, as put forth by Carl Cranor and credulously accepted by the First Circuit in Milward, when the chapter authors insist that 

“the judge’s task requires a deeper examination of the available evidence and methods by which it was arrived at, as well as an assessment of how the community of experts in this area has evaluated or would evaluate the evidence and reasoning in question.”[14]

Contrary to the Milward decision from 2011, the new authors are not shy about stating the obvious: there is good science, and there is bad science. Not all “judgment” about causality is acceptable and fit for submission to juries.[15] Given the judicial resistance to Rule 702, the obvious here requires stating. Weisberg and Thanukos acknowledge that some scientific judgment is unreliable or invalid because it was based upon work not carried out in accordance with current standards for scientific investigation and inference.[16] It should not surprise anyone that most of their examples of bad science are the product of manufacturing industry; the authors are oblivious to bad science sponsored by the lawsuit industry or by non-governmental advocacy organizations (NGOs).

Weisberg and Thanukos frame scientific disagreements and debates as governed by both data and ethical norms. Science is not infinitely contestable. There are identifiable norms, including a norm that scientists should “seek relevant information,” and “scrutinize ideas and evidence.”[17] Contrary to Milward’s standard of judicial abstention and credulity in the face of dodgy causal claims, these authors state what should be obvious, that scientific scrutiny involves, among other things, “an evaluation of methods, considering potential biases and oversights.”[18]

The chapter’s authors, non-lawyers, get closer to the heart of the error in Milward’s abstention doctrine with their recognition of what should have been obvious to the authors of the law chapter (Richter & Capra):

“When research relevant to a trial has not yet been scrutinized by a community with the appropriate technical expertise, a judge may be placed in the position of providing or requesting this scrutiny.”[19]  

Rather than some vague, subjective, and content-free WOE standard, Weisberg and Thanukos urge scientists, and by implication judges as well, to engage in serious efforts to “identify and avoid bias” and abide by ethical guidelines.[20] In other (my) words, the new authors agree that there is a standard of care reflected in the norms of science, and consequently there can be deviations from that standard. For Weisberg and Thanukos, compliance with the normative structure of scientific investigations is at the heart of building up accurate and predictive conclusions from data.[21] As part of their communitarian and normative conception of the scientific process, the authors appear to accept the reality and necessity for judges to act as gatekeepers.[22]

And while this recognition of standards and the need to police against deviations from standards is commendable, Weisberg and Thanukos proceed to give an abridgment of scientific method and process that is distorted and erroneous. They steadfastly ignore the concept of hierarchy of evidence, and thus provide illegitimate cover for levelers of evidence. In discussing randomized controlled trials, for instance, they note that such trials are often taken as “the gold standard,” but then they counter, without citation, support, or argument, that such trials “are just one line of evidence among many.”[23] The authors elide discussion and reconciliation of when that “just one line of evidence” conflicts with observational studies.

Notwithstanding their helpful comments about the need to evaluate studies for bias and other errors, these authors enter into the Milward controversy with the observation that assessing many lines of evidence is required, can be difficult for courts, and has led to “controversy.” Citing papers that include one by the late Margaret Berger from the notorious lawsuit-industry, SKAPP-funded Coronado Conference, Weisberg and Thanukos float the observation that:

“In science, the available evidence (some of which may come from other research programs not designed to test the hypothesis under consideration) is evaluated as a body, along with the strengths, weaknesses, and caveats relating to each type of data, an approach which, some scholars have argued, the judiciary has not always followed.”98[24]

This claim that the available evidence is evaluated as “a body” is presented as a fact about how science works, without any citation or argument. Several comments are in order. First, the claim is at odds with the authors’ own statements that scientific norms require evaluating each study for biases and other disqualifying flaws. Second, the claim is at odds with the authors’ own reference to systematic reviews and meta-analyses,[25] which are governed by protocols with inclusionary and exclusionary criteria for individual studies, and which require consideration of individual study validity before it enters the “body” of evidence that is quantitatively or qualitatively evaluated. In the authors’ words, “authors delineate both the criteria that studies must meet for inclusion in the review and the methods that will be used to assess the studies.”[26] The Milward case involved an expert witness who had proffered the very opposite of a systematic review in the form of post hoc rejiggering of studies and their data to fit a pre-conceived litigation goal. In the context of addressing the replication crisis, Weisberg and Thanukos correctly observe “peer review alone cannot ensure that the conclusions of published studies are actually correct, highlighting the responsibility judges bear in evaluating the validity of the methodologies that contributed to a particular piece of research.”[27] Of course, the Milward case involved a hired expert witness whose unprincipled re-analysis of studies was never peer reviewed or published.

Third, the authors could easily have found additional support for the contrary proposition that individual studies must be evaluated before being considered as part of the entire evidentiary display. The IARC Preamble, which roughly describes how that agency arrives at its so-called hazard classifications of human carcinogenicity, specifies that individual studies within each of three streams of evidence are evaluated for validity and soundness before contributing to a sub-conclusion with respect to (1) epidemiology, (2) toxicology, and (3) mechanistic lines of evidence.[28] Each of those three lines of evidence is adjudged “sufficient,” “limited,” or “inadequate,” by specialists in the three respective areas, before an overall evaluation is reached. There is much that is objectionable in the IARC working group procedures, but this division of labor, and the need to consider disparate lines of evidence and studies within each line separately before attempting a synthesis, are present in all systematic review methodology. The suggestion from Weisberg and Thanukos that “the available evidence” in science is “evaluated as a body” is not only unsupported, but demonstrably false and misleading.

This claim about holistic evaluation is a fairly transparent but failed attempt to support a claim made in the chapter on the admissibility of expert witness evidence by Liesa Richter and Daniel Capra, who present an exposition of the notorious Milward case, without criticism, in a way that suggests the case represents appropriate judicial gatekeeping under Rule 702, and that the case is consistent with scientific norms.[29] The chapter on how science works, after having stated a false claim about scientific methodology for synthesizing and integrating disparate lines of evidence, attempts to provide a gloss on the similar and equally benighted claim of Richter and Capra, in footnote 98:

“98. Some scholars have raised concerns that the courts have on occasion unfairly dismissed numerous individual lines of evidence as being flawed or insufficiently conclusive and concluded that evidence is lacking, when in fact the body of evidence, taken as a whole, points to a clear conclusion. For more, see discussion of Milward v. Acuity Specialty Products Group, Inc.; see also Liesa L. Richter & Daniel J. Capra, The Admissibility of Expert Testimony, in this manual; Berger 2005, supra note 97; and Steve C. Gold, A Fitting Vision of Science for the Courtroom, 3 Wake Forest J.L. & Pol’y 1 (2013).”

Some “scholars” have indeed said such things in their more unscholarly moments; some scholars have criticized Milward, but they are not cited in this new methods chapter. The footnote is accurate, but highly misleading by omission. The First Circuit in Milward also said as much, also without support or justification, and Richter and Capra, in their chapter of the Manual, fourth edition, parrot the Milward case. Weisberg and Thanukos cite two articles, by Margaret Berger and by Steven Gold, both law professors, not scientists, and both ideologically hostile to Rule 702 gatekeeping. The Berger article came from a lawsuit-industry, SKAPP-funded symposium known as the Coronado Conference, and the Gold paper came out of a symposium sponsored by the lawsuit industry itself and the Center for Progressive Reform, an advocacy NGO to which one of Mr. Milward’s expert witnesses, Carl Cranor, belongs. So the authors of the new science methodology chapter failed to cite any scientific source, but cited papers by lawyers captured by the lawsuit industry, and a single (infamous) decision that ignored Rules 702 and 703, as well as the extensive literature on systematic reviews. Weisberg and Thanukos could have cited many sources that contradicted their claim, and the claim of the lawsuit-industry-sponsored lawyers, but they did not. This is what biased and subversive scholarship looks like.

Funding Bias – The New McCarthyism

The selective citation to articles sponsored by the lawsuit industry is ironic in the context of what Weisberg and Thanukos have to say elsewhere about the “funding effect.” Some of what the authors say about personal bias is almost reasonable. For instance, they suggest that funding source is a “valid consideration” in evaluating methodologies and conclusions of expert testimony, and presumably of published studies as well, but not a sufficient reason to exclude such testimony or reliance.[30] Interestingly, these authors ignored the funding and the ideological interests of the symposia they cited in support of the repudiated Milward abstention doctrine.

Over three decades ago, Kenneth Rothman, the founder of Epidemiology, the official journal of the International Society for Environmental Epidemiology (ISEE), wrote his protest against the obsession with funding in an article that should have been cited in the new chapter, for balance. Rothman described the fixation on funding as the “new McCarthyism in science,” which manifested as intolerance toward industry-sponsored studies and strict scrutiny of “conflict-of-interest” (COI) disclosures.[31] The new McCarthyites amplify the gamesmanship over COI disclosures by excusing or justifying non-disclosure of COIs from scientists who have positional conflicts, or who are aligned with advocacy groups or with the lawsuit industry.

This asymmetrical standard for adjudging conflicts is on full display in the Weisberg and Thanukos chapter, when they claim that “in pharmaceuticals, there is a strong tendency for industry-sponsored trials to favor the industry’s product.”[32] The chapter authors, and their cited source, ignore the context in which pharmaceutical industry scientists publish clinical trial results. A successful clinical trial that shows efficacy with minimal adverse events is the result of years of prior research, including phase I and II trials and preclinical testing. If any of that prior research fails to show efficacy, or shows unreasonable harm, the phase III trial is never done and so never published. If the medication is never licensed, the phase III trial will generally not be published. The selection effects are obvious and overwhelming in determining that the published results of phase III trials will be work that favors the sponsor. A “failed” phase III trial may result in a securities class action against the pharmaceutical company. In the realm of observational studies, some work commissioned by manufacturing industry has its origins in the poorly conducted, flawed work of environmental zealots and NGOs. Manufacturing industry has an obvious interest in correcting the scientific record, and again, any carefully done study would rebut that of the zealots and favor the industry sponsor.

Elsewhere, the authors offer a more balanced assessment when they observe that “[a]ll research is potentially influenced by bias, and every funder of research has the potential to introduce a source of bias.”[33] Similarly, the fourth edition chapter notes that “[a]ll scientists have some sort of motivation for their work, and this does not preclude scientific knowledge building, so long as biased methodologies and interpretations are avoided.”[34] Their recognition that motivated reasoning is everywhere suggests that all research should receive scrutiny regardless of apparent or disclosed funding source.[35]

When it comes to providing examples of funding-effect distortions of science, Weisberg and Thanukos seem to blank on instances created by the lawsuit industry or by environmental NGOs. The reader should contrast how readily and stridently the authors point to bias in industry-sponsored research with how the authors tie themselves up with double negatives when making the same point about NGOs:

“That is not to suggest that government-or nongovernmental organization (NGO)-sponsored research is necessarily free from bias.”[36]

The cognitive dissonance is palpable. The only conclusion that could be drawn from such a locution is that Weisberg and Thanukos have not worked very hard to identify and disclose their own biases.

STATISTICS DONE POORLY

When it comes to explaining and discussing the role of statistical methods in the scientific process, Weisberg and Thanukos go off the rails. The new chapter is an unmitigated disaster, which should have been corrected in the peer review and oversight process. The first sign of trouble became apparent upon checking the definition of “p-value” in the chapter’s glossary:

“p-value. A statistic that gives the calculated probability that the null hypothesis could be true even given the observed differences between conditions.”[37]

This definition is the transposition fallacy on steroids. Obviously, a p-value cannot be the probability that the null hypothesis “could be true” when the procedure for calculating a p-value must assume that the null hypothesis is true, along with a specified probability model. Equally important, the p-value attaches no probability to the null hypothesis; it is the probability of observing data that diverge from the null expectation at least as much as the data in the particular sample did, assuming the null hypothesis is true. The statistics chapter in the Manual by Hall and Kaye states the meaning correctly. The coverage of statistical concepts by Weisberg and Thanukos should be studiously ignored.
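The point is easy to demonstrate with a short simulation. In the sketch below (my illustration in Python; the simulated data and the permutation test are hypothetical, not anything in the Manual), the null hypothesis is true by construction, yet the p-values spread across the whole unit interval, with roughly five percent falling below 0.05. A p-value is computed assuming the null; it cannot be the probability that the null “could be true.”

```python
import random
import statistics

random.seed(1)

def perm_p_value(a, b, n_perm=200):
    """Two-sided permutation p-value for a difference in means."""
    observed = abs(statistics.mean(a) - statistics.mean(b))
    pooled = a + b
    count = 0
    for _ in range(n_perm):
        random.shuffle(pooled)
        diff = abs(statistics.mean(pooled[:len(a)]) -
                   statistics.mean(pooled[len(a):]))
        if diff >= observed:
            count += 1
    return count / n_perm

# Simulate 200 "studies" in which the null is TRUE by construction:
# both groups are drawn from the same distribution.
p_values = []
for _ in range(200):
    a = [random.gauss(0, 1) for _ in range(20)]
    b = [random.gauss(0, 1) for _ in range(20)]
    p_values.append(perm_p_value(a, b))

# If the glossary's definition were right, these p-values would all be
# near 1, since the null IS true.  Instead they spread roughly uniformly
# over [0, 1], and about 5% fall below 0.05 -- that is the false-positive
# rate of the procedure, not the probability of the null hypothesis.
below_05 = sum(p < 0.05 for p in p_values) / len(p_values)
print(f"fraction of p-values below 0.05 under a true null: {below_05:.2f}")
```

The simulation shows in forty lines what the glossary gets wrong in one sentence: the p-value is a statement about the data given the null, not about the null given the data.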

The outrageously incorrect definition of p-value in the glossary is not an isolated error. The authors are clearly statistically challenged. In the text of their chapter, they incorrectly describe the p-value, consistent with their aberrant glossary entry:

“the commonly used p-value approach, scientists compare a test hypothesis (e.g., that drug X is effective) to a null (e.g., that there is no difference in cure rates between those who took drug X and those who took a placebo). Scientists then calculate the probability that the null hypothesis could be true even with the observed difference between conditions (e.g., the cure rate of patients taking drug X compared to that of those taking a placebo).”[38]

Weisberg and Thanukos thus conflate frequentist and Bayesian statistics. They also obliterate the meaning of the confidence interval, an important concept for judges and lawyers to understand. Here is how the authors describe the confidence interval in their chapter:

Evaluating estimates: In science (and in contrast to their lay meanings), the terms uncertainty and error refer to the variability of a set of data that is intended to estimate a single number. Uncertainty and error are generally expressed as a range, within which we are confident that, if the study were repeated, the new result would fall. Scientists often use a 95% confidence interval for this purpose.”[39]

Describing the confidence interval in the same sentence as “uncertainty and error” is bound to induce uncertainty and error. The confidence interval provides a range of estimates based upon random error, and captures uncertainty only in the form of imprecision in the point estimate. There are of course myriad other kinds of uncertainty and error not captured by the confidence interval. The most important of the authors’ errors is their incorrect assertion that the confidence interval provides a range within which the results of a repetition of the study would fall. This is, again, a variant on the transposition fallacy that the authors commit in their definition of the p-value. The confidence interval provides a range of results that would not be rejected as alternative null hypotheses by the data in the obtained sample. Because of random error, future samples would give different results, with different confidence intervals, which would not be co-extensive with the first obtained confidence interval. To be sure, the statistics chapter states the matter correctly, and the epidemiology chapter finally gets it correct in its text (after having mangled the concept in the second and third editions), but the epidemiology chapter perpetuates its previous errors in defining confidence intervals in its glossary. This sort of issue, and it is a serious one, could have been eliminated had there been meaningful peer review and editorial oversight for consistency and accuracy of the Manual as a whole.
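A simple simulation (again my own illustration in Python, with made-up numbers, not the Manual’s) shows what the “95%” actually refers to: across repeated samples, roughly 95 percent of the computed intervals cover the true parameter, while the estimate from a repetition of the study lands inside the first study’s interval only about 83 percent of the time. The authors’ description is wrong on exactly this point.

```python
import random
import math

random.seed(2)
TRUE_MEAN, SIGMA, N = 10.0, 2.0, 25
# 95% z-interval half-width for a sample mean with known sigma
HALF_WIDTH = 1.96 * SIGMA / math.sqrt(N)

def sample_mean():
    return sum(random.gauss(TRUE_MEAN, SIGMA) for _ in range(N)) / N

covers_truth = replication_inside = 0
TRIALS = 5000
for _ in range(TRIALS):
    m1 = sample_mean()
    lo, hi = m1 - HALF_WIDTH, m1 + HALF_WIDTH
    # What the 95% CI actually promises: coverage of the TRUE parameter
    covers_truth += lo <= TRUE_MEAN <= hi
    # What Weisberg and Thanukos claim: a repeated study's result
    # would fall in the first interval
    m2 = sample_mean()
    replication_inside += lo <= m2 <= hi

print(f"intervals covering the true mean:       {covers_truth / TRIALS:.3f}")   # ~0.95
print(f"repeat-study estimates inside first CI: {replication_inside / TRIALS:.3f}")  # ~0.83
```

The gap between the two printed numbers is the authors’ error in miniature: the interval’s coverage property concerns the unknown parameter, not the scatter of future point estimates.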

Weisberg and Thanukos address statistical power in a way that may also mislead readers. They tell us that “[p]ower refers to a test’s ability to reject a hypothesis that is indeed false.” W&T at 88. If only it were so. The authors omit that power is the probability that, given a specified level of significance (say p < 0.05), a specified alternative hypothesis, sample size, and probability model, the sample result will reject the null hypothesis in favor of the alternative hypothesis. Then the authors suggest, confusingly, that “[w]ell-designed studies have sufficient power to detect the differences of interest, but it may not be apparent when a test lacks power.”[40]

If the study at issue presents a confidence interval around a point estimate of interest, then it will be clear what alternative null hypotheses are statistically compatible with the sample result at the pre-specified level of alpha (significance). Any point outside the interval would be rejected by such a test of significance, and so the casual reader will have a rather good idea of what could and could not be rejected by the sample data. And of course, virtually every study will have low power to detect extremely small increased risks, say relative risk of 1.00001. And most studies will have high power to detect risk ratios of over 1,000.
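For readers who want to see how power is jointly fixed by its inputs, here is a minimal sketch (Python, with illustrative numbers of my own choosing, not anything in the Manual) of the approximate power of a two-sided, two-sample z-test at a significance level of 0.05, as a function of the assumed effect size:

```python
import math

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def power_two_sample_z(effect, sigma, n_per_group, z_crit=1.96):
    """Approximate power of a two-sided, two-sample z-test.

    z_crit = 1.96 corresponds to alpha = 0.05; power depends jointly on
    the significance level, the assumed effect, sigma, and sample size.
    """
    se = sigma * math.sqrt(2.0 / n_per_group)
    noncentrality = effect / se
    # Probability that the test statistic lands beyond either critical value
    return (1.0 - normal_cdf(z_crit - noncentrality)) + normal_cdf(-z_crit - noncentrality)

# The same test is near-powerless against a tiny effect and near-certain
# against a large one; "sufficient power" is always relative to the
# alternative one cares to detect.
for effect in (0.05, 0.5, 2.0):
    print(f"effect {effect:>4}: power ≈ {power_two_sample_z(effect, sigma=1.0, n_per_group=50):.3f}")
```

As the text above notes, virtually every study has low power against a minuscule alternative and overwhelming power against an enormous one, which is why a bare claim of “sufficient power” is meaningless without specifying the alternative.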

This new chapter on “How Science Works” also propagates some well-known fallacies about statistical significance testing. Implicit in the authors’ commission of the transposition fallacy is a conceptual and mathematical confusion between the coefficient of confidence (1 − α) and the posterior probability of an hypothesis.

The authors’ mistake comes in their insistence upon labeling precision in a test result as “certainty.” In the quote below, the authors’ confusion is clear and obvious:

“Note that the 95% and 5% cutoffs are somewhat arbitrary, and a higher degree of confidence might be required if more certainty were desired—for example if an impactful policy decision depended on the conclusion.”[41]

An impactful [sic] policy decision might well call for more certainty, or a higher posterior probability, but a higher coefficient of confidence will not necessarily map onto the probability of the hypothesis at all. The authors’ confusion and conflation of alpha with the Bayesian posterior probability arise elsewhere within the chapter:

“(1) A p-value lower than 0.05 does not prove that a null hypothesis is false. It is strong evidence, but there is a small chance that the difference observed could be the result of chance alone.

(2) Using a low p-value (e.g., 0.05) as a criterion for significance sets a high bar for rejecting the null hypothesis, minimizing the chance of getting a false positive… .”[42]

Again, a p-value less than five percent is hardly strong evidence in the context of large database studies, especially when there are multiple comparisons and the outcome is not the pre-specified outcome of the analysis. The authors’ confusion is on full display when they discuss the Zoloft birth defects litigation, where the Third Circuit affirmed the exclusion of plaintiffs’ expert witnesses’ causation opinions and the grant of summary judgment to the defendants. According to the authors’ narrative:

“plaintiffs’ expert’s testimony would have argued that multiple, nonsignificant associations between Zoloft use and birth defects indicated a causal relationship. The testimony was excluded because these results were consistent with a weak causal relationship (a small effect size), one that is ‘so weak that one cannot conclude that the risk is greater than that seen in the general population’.”[43]

Of course, in the Zoloft litigation, the excluded plaintiffs’ expert witnesses were caught red-handed at cherry picking, and at attempting to circumvent the lack of significance with a methodologically incorrect meta-analysis.[44]

If the risk of birth defects among children born to mothers who used Zoloft in pregnancy was no greater than seen in the general population, then there would be no risk, not risk “so weak” it cannot be seen. Locutions such as the “results were consistent with a weak causal relationship,” when the results were equally consistent with no causal relationship suggest that the writers cannot bring themselves to say that the causal hypothesis was simply not supported at all. Of course, no study may exclude an increased risk of 0.01 percent, or a relative risk of 1.01, but at some point, when multiple attempts fail to reveal an increased risk, we may conclude that the proponents of the causal claim have failed to make their case.

META-SHMETA-ANALYSIS

Weisberg and Thanukos address meta-analysis incompletely in the context of systematic reviews. The authors do not provide any insights into how meta-analyses are done, and more glaringly, they fail to mention that not all systematic reviews can or should result in quantitative syntheses of estimates of association. On the positive side, they state that meta-analyses are important in litigation, and that the application of rigorous methodologies should be required.[45] With clearly unintended irony, Weisberg and Thanukos offer, as support for their statement, the Paoli Railroad Yard case, “in which the exclusion of a contested meta-analysis was overturned.”[46]

Weisberg and Thanukos have stepped into the wet corner of a pigsty. The issue in the Paoli case arose from a meta-analysis of mortality rates associated with polychlorinated biphenyl (PCB) exposures. The district court excluded the proffered meta-analysis, not because it was unreliable, but because it was novel. Holding the case up in conjunction with a statement about the application of rigorous or reliable methodologies was way off the relevant legal point.

The expert witness who proffered the meta-analysis in Paoli was William Nicholson, a physicist with no professional training in epidemiology. For his opinion that PCBs were causally associated with human liver cancer, Nicholson relied upon a non-peer-reviewed, unpublished report he wrote for the Ontario Ministry of Labor.[47] Nicholson described his report as a “study of the data of all the PCB worker epidemiological studies that had been published,” from which he concluded that there was “substantial evidence for a causal association between excess risk of death from cancer of the liver, biliary tract, and gall bladder and exposure to PCBs.”[48]

The defense challenged Nicholson’s opinion, not under Rule 702, but under case law that pre-dated the Daubert decision.[49] The challenge included pointing out the unreliability of Nicholson’s meta-analysis, but also asserted (incorrectly) the novelty of meta-analysis generally. The district court sustained the defense objection on the grounds of “novelty,” without reaching the reliability analysis.[50] The Third Circuit appropriately reversed and remanded for consideration of the reliability of Nicholson’s meta-analysis.[51]

The consideration of Nicholson’s “meta-analysis” never occurred on remand; plaintiffs’ counsel and their expert witnesses withdrew their reliance upon Nicholson’s analysis. Their about-face was highly prudent. Nicholson’s report presented SMRs (standardized mortality ratios); for the all-cancers statistic, he reported an SMR of 95. What Nicholson did, in this analysis and in all other instances, was simply divide the observed number of deaths by the expected, and multiply by 100. This crude, simplistic calculation fails to produce a standardized mortality ratio, which requires taking into account the age distribution of the exposed and the unexposed groups, and a weighting of the contribution of cases within each age stratum. Nicholson’s presentation of data was nothing short of a fraud.
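The arithmetic of a genuine SMR is worth a short illustration (the numbers below are hypothetical, chosen by me for exposition; they are not Nicholson’s data). The expected count must be built from the reference population’s age-specific rates applied to the cohort’s person-years in each age stratum; a crude observed-over-expected ratio computed from an all-ages rate can be wildly misleading for a cohort whose age distribution differs from the reference population’s.

```python
# Indirect standardization: expected deaths come from applying the
# reference population's AGE-SPECIFIC rates to the cohort's person-years
# in each age stratum.  All numbers below are hypothetical.

# age stratum: (cohort person-years, reference deaths per person-year)
strata = {
    "40-49": (8000, 0.002),
    "50-59": (1500, 0.008),
    "60-69": (500,  0.030),
}
observed_deaths = 45

# Expected deaths, built stratum by stratum
expected = sum(py * rate for py, rate in strata.values())   # 16 + 12 + 15 = 43
smr = 100.0 * observed_deaths / expected                    # 45 / 43 -> ~105

# The crude shortcut: apply the reference population's OVERALL rate,
# ignoring that this cohort is much younger than the reference population.
overall_reference_rate = 0.010                              # hypothetical all-ages rate
crude_expected = overall_reference_rate * sum(py for py, _ in strata.values())
crude_ratio = 100.0 * observed_deaths / crude_expected      # 45 / 100 -> 45

print(f"age-standardized SMR: {smr:.0f}")    # slight excess (~105)
print(f"crude O/E ratio:      {crude_ratio:.0f}")  # spurious deficit (45)
```

With these hypothetical numbers, the crude ratio of 45 suggests a dramatic mortality deficit, while the properly standardized SMR of about 105 shows a slight excess; the difference is entirely an artifact of ignoring the age structure, which is precisely the computation Nicholson skipped.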

Nicholson’s report was replete with many other methodological sins. He used a composite of three organs (liver, gall bladder, bile duct) without any biological rationale. His analysis combined male and female results, and even then his analysis of the composite outcome was based upon only seven cases. Of those seven, some were not confirmed as primary liver cancer, and at least one was confirmed as not being a primary liver cancer.[52]

As noted, Nicholson failed to standardize the analysis for the age distribution of the observed and expected cases, and he failed to present meaningful analysis of random or systematic error. When he did present p-values, he presented one-tailed values, and he made no corrections for his many comparisons from the same set of data.

Finally, and most egregiously, Nicholson’s meta-analysis was meta-analysis in name only. What he had done was simply to add “observed” and “expected” events across studies to arrive at totals, and to recalculate a bogus risk ratio, which he fraudulently called a standardized mortality ratio. Adding events across studies, without weighting by the inverse of study variance, is not a valid meta-analysis; indeed, it is a well-known example of how to generate the error known as Simpson’s Paradox, which can change the direction or magnitude of any association.[53]

In citing to the Paoli case as a reversal of exclusion of a contested meta-analysis, Weisberg and Thanukos give a truncated analysis that misleads readers, judges, and lawyers. There never was a proper consideration of the reliability vel non of Nicholson’s meta-analysis in the Paoli litigation, and in the final analysis, the Paoli plaintiffs abandoned reliance upon Nicholson’s ill-conceived meta-analysis.

VIRTUE SIGNALING

Although there are no land acknowledgments for the property on which the Federal Judicial Center building sits, Weisberg and Thanukos miss few opportunities to let us know that they are woke scholars. There is the gratuitous and triggering “pregnant people,”[54] which begs any number of biological questions. Then there is the authors’ statement that they are limiting their focus to the “Western conception of science,” which begs another question: why we would call any epistemically valid approach, from any corner of the globe, anything other than “science.”[55]

Equally gratuitous are the authors’ endorsements of DEI and “diversity,” with overbroad generalizations that diversity per se advances science,[56] and a claim that “women, people of color, other historically oppressed groups, and non-Western people” are not taken seriously as scientists.[57] In over 40 years of litigating technical and scientific issues, I have never seen a judge or a lawyer disrespect an expert witness based upon sex, race, ethnicity, or national origin. Of course, I have seen expert witnesses treated roughly for propounding bad science, and that seems perfectly appropriate.


[1] See David Goodstein, ON FACT AND FRAUD: CAUTIONARY TALES FROM THE FRONT LINES OF SCIENCE (2010).

[2] Weisberg and Thanukos frequently refer to other chapters in the Manual, which suggests that their chapter was written late in the development of the Fourth Edition, and perhaps contributed to the delayed publication.

[3] Michael Weisberg & Anastasia Thanukos, How Science Works, in National Academies of Sciences, Engineering, and Medicine & Federal Judicial Center, REFERENCE MANUAL ON SCIENTIFIC EVIDENCE 47 (4th ed. 2025) [cited as W&T].

[4] See Michael Weisberg, University of Pennsylvania Philosophy, at https://philosophy.sas.upenn.edu/people/michael-weisberg.

[5] Anna Thanukos, Staff, available at https://ucmp.berkeley.edu/people/anna-thanukos/

[6] W&T at 72-75.

[7] W&T at 81.

[8] W&T at 81.

[9] W&T at 81 & n.85 (emphasis added), citing Naomi Oreskes & Erik M. Conway, MERCHANTS OF DOUBT: HOW A HANDFUL OF SCIENTISTS OBSCURED THE TRUTH ON ISSUES FROM TOBACCO SMOKE TO GLOBAL WARMING (2010).

[10] W&T at 94-96.

[11] W&T at 95 n.120.

[12] Richard Van Noorden, More than 10,000 research papers were retracted in 2023 — a new record, 624 NATURE 479 (2023).

[13] W&T at 95.

[14] W&T at 55.

[15] W&T at 63, 68.

[16] W&T at 68.

[17] W&T at 65.

[18] W&T at 70.

[19] W&T at 71.

[20] W&T at 66.

[21] W&T at 75.

[22] W&T at 49.

[23] W&T at 83.

[24] W&T at 86 (citing Richter and Capra’s discussion of Milward in chapter one of the Manual, and Professor Gold’s article from the lawsuit industry celebratory conference on the Milward case).

[25] W&T at 99-100.

[26] W&T at 99.

[27] W&T 96 (emphasis added).

[28] IARC MONOGRAPHS ON THE IDENTIFICATION OF CARCINOGENIC HAZARDS TO HUMANS – PREAMBLE (2019), available at https://monographs.iarc.who.int/wp-content/uploads/2019/07/Preamble-2019.pdf

[29] Liesa L. Richter & Daniel J. Capra, The Admissibility of Expert Testimony, National Academies of Sciences, Engineering, and Medicine & Federal Judicial Center, REFERENCE MANUAL ON SCIENTIFIC EVIDENCE 1, 32-33 (4th ed. 2025).

[30] W&T at 76.

[31] Kenneth J. Rothman, “Conflict of interest: the new McCarthyism in science,” 269 J. AM. MED. ASS’N 2782 (1993). See Schachtman, The Rhetoric and Challenge of Conflicts of Interest, TORTINI (July 30, 2013).

[32] W&T at 76 & n.67, citing Sergio Sismondo, Pharmaceutical Company Funding and Its Consequences: A Qualitative Systematic Review, 29 CONTEMP. CLINICAL TRIALS 109 (2008).

[33] W&T at 77.

[34] W&T at 59-60.

[35] W&T at 59-60.

[36] W&T at 76.

[37] W&T at 111.

[38] W&T at 87.

[39] W&T at 90.

[40] W&T at 88.

[41] W&T at 90 (emphasis added).

[42] W&T at 88.

[43] W&T at 90 (internal citations omitted).

[44] In re Zoloft (Sertraline Hydrochloride) Prods. Liab. Litig., 26 F. Supp. 3d 449 (E.D. Pa. 2014); No. 12-md-2342, 2015 WL 314149, at *3 (E.D. Pa. Jan. 23, 2015) (rejecting proffered expert witness opinion based upon “cherry-picking of studies and data within studies”), aff’d, 858 F.3d 787 (3rd Cir. 2017).

[45] W&T at 99.

[46] W&T at 99 & n.134, citing In re Paoli R.R. Yard PCB Litig., 916 F.2d 829 (3d Cir. 1990).

[47] William Nicholson, Report to the Workers’ Compensation Board on Occupational Exposure to PCBs and Various Cancers, for the Industrial Disease Standards Panel (ODP); IDSP Report No. 2 (Toronto Dec. 1987) [Report].

[48] Id. at 373.

[49] See United States v. Downing, 753 F.2d 1224 (3d Cir.1985).

[50] In re Paoli RR Yard Litig., 706 F. Supp. 358, 372-73 (E.D. Pa. 1988).

[51] In re Paoli RR Yard PCB Litig., 916 F.2d 829 (3d Cir. 1990), cert. denied sub nom. General Elec. Co. v. Knight, 499 U.S. 961 (1991).

[52] Report, Table 22.

[53] See James A. Hanley, et al., Simpson’s Paradox in Meta-Analysis, 11 EPIDEMIOLOGY 613 (2000); H. James Norton & George Divine, Simpson’s paradox and how to avoid it, SIGNIFICANCE 40 (Aug. 2015); George Udny Yule, Notes on the theory of association of attributes in statistics, 2 BIOMETRIKA 121 (1903).

[54] W&T at 84.

[55] W&T at 50.

[56] W&T at 71 n. 52-54.

[57] W&T at 102.

Reference Manual’s Chapter on Expert Witness Testimony Admissibility – Part 5

March 7th, 2026

By ignoring Milward’s expert witnesses’ omissions from, and abridgements of, WOE and IBE, the appellate court blinded itself to these witnesses’ distortions of scientific method. The need for judgment, which the Milward court was keen to honor, does not mean that there are no aberrant or deviant judgments, or no disqualifying deviations from the standard of scientific care. The need for judgment must also allow for equipoise and uncertainty that stand in the way of an inculpatory or exonerative verdict. And then there is the business of questionable research practices that subvert causal judgment. The district court had recognized and credited the showing of questionable research practices that pervaded Martyn Smith’s for-litigation opinions. The cheerleaders for Milward seem eager to obscure these practices by their insistence that causation is, after all, only a judgment.

The Milward decision, in its embrace of some truly aberrant methodology and judgment, and some absence of methodology, made some whoppers of its own. Martyn Smith’s incompetent analyses of the epidemiologic evidence had been thoroughly debunked in the district court, but the circuit court glibly adopted Smith’s characterizations. The appellate court failed to understand and come to grips with Smith’s rejiggering of data, and his inconsistent redefinition of exposures and outcomes in epidemiologic studies to make up new, fanciful results that favored his WOE-ful opinion. The appellate court also failed to understand that scientific judgment is not some vague, amorphous, unstructured decision that turns on whatever looks to be “explanatory.” Even the International Agency for Research on Cancer, which issues hazard classifications distorted by non-scientific precautionary-principle reasoning, insists that three streams of evidence (epidemiologic, toxicologic, mechanistic) be considered separately, in accordance with criteria, with attention to the validity of each study, and synthesized into a judgment of causality following a carefully structured analysis.[1]

The appellate court in Milward treated the demonstration of Smith’s failure to calculate odds ratios correctly as a matter that merely went to the weight, not the admissibility, of his opinion, on the theory that a jury, which does not have access to the Reference Manual or to the actual studies as published, could sort it all out. And yet, when the court improvidently set out a definition of what an odds ratio is, it bungled the definition beyond understanding:

“An odds ratio represents the difference in the incidence of a disease between a population that has been exposed to benzene and one that has not.”[2]

The court’s definition is not even wrong. The difference between the incidence of a disease in an exposed group and in a non-exposed group is the risk difference; it is not an odds ratio. The court might have recalled what most third graders know: that there is a difference between a ratio (division) and a difference (subtraction). And of course, the odds of exposure are not the same as the incidence of a disease. The relevant odds ratio represents the odds of exposure in cases with APML diagnoses divided by the odds of exposure in study subjects without APML. The odds ratio does not itself involve measurements of incidence, although in some circumstances it will approximate a risk ratio, which does involve a ratio of incidences. This is not some hyper-technicality; it is a vivid display that Chief Judge Lynch, writing for a panel of three judges of the First Circuit, had no idea of what she was reviewing or writing.
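The distinction is simple enough to compute. The counts in the sketch below are hypothetical, chosen only to illustrate how the three measures differ:

```python
# Invented cohort counts, for illustration only.
cases_exposed, n_exposed = 30, 1000        # incidence among exposed: 3%
cases_unexposed, n_unexposed = 10, 1000    # incidence among unexposed: 1%

p1 = cases_exposed / n_exposed             # incidence in the exposed group
p0 = cases_unexposed / n_unexposed         # incidence in the unexposed group

risk_difference = p1 - p0                        # 0.02: a SUBTRACTION of incidences
risk_ratio = p1 / p0                             # 3.0:  a RATIO of incidences
odds_ratio = (p1 / (1 - p1)) / (p0 / (1 - p0))   # ~3.06: a ratio of ODDS

# In a case-control study, the same odds ratio is obtained from the odds of
# EXPOSURE among cases versus controls; no incidence is measured at all.
```

Because the disease in this toy example is rare (3% and 1%), the odds ratio sits close to the risk ratio; with a common disease the two diverge sharply, and neither is a “difference in the incidence of a disease.”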

Richter and Capra devote two pages to a discussion of the Milward case and its embrace of WOE and IBE. There is not, in this discussion, a single adjective of approval or of disapproval. The attention to this one intermediate appellate court opinion far exceeds any other case decided at a level below the Supreme Court, and an engaged reader must ask why the authors of the first chapter of the new Reference Manual wrote about this case at all, especially given the 2023 amendments to Rule 702, which would suggest that Milward was bad law when decided in 2011, and clearly and emphatically bad law in December 2025, when the new Manual was published.

The chapter provides one not-so-subtle clue of the authors’ intent. At the conclusion of their extended, uncritical, and incomplete exposition of Milward,[3] Richter and Capra refer the reader to a law review symposium,[4] “[f]or a detailed analysis of the Milward decision and the weight of the evidence approach to scientific reasoning.” Like Richter and Capra’s coverage of Milward, the cited symposium was hardly an objective analysis; rather, it was more like a drunken celebration at a family reunion.

There have been many law review articles that have discussed the Milward case, but Richter and Capra chose to cite one particular symposium, sponsored by two corporations: the Center for Progressive Reform (CPR) and the Robert A. Habush Foundation. CPR is a not-for-profit corporation whose website describes it as a “research and advocacy organization that works in the service of responsive government; climate justice, mitigation, and adaptation; and protecting against environmental harm.”[5] CPR describes one of its key activities as defending science from corporate interference. Presumably its own corporate activities, and those of the lawsuit industry, are acceptable, but those of the corporate manufacturing industry are not. From reviewing CPR’s website, it is not clear that CPR believes manufacturing corporations should even be allowed to defend against lawsuits. Milward’s retained expert witness Carl Cranor is a “member scholar” at CPR, which makes CPR’s sponsorship of the symposium rather incestuous.[6]

CPR is also apparently comfortable with one highly politicized “corporation,” namely the American Association for Justice (AAJ), which is the trade group for the American lawsuit industry.[7] The AAJ describes itself as a corporation, or a “collective,” that supports plaintiff trial lawyers as their “collective voice … on Capitol Hill and in courthouses across the nation … .” The Robert A. Habush Foundation is endowed by the AAJ, and serves its “educational” mission.  Through the Habush Foundation, the AAJ funds educational programs, “think tanks,” and writing projects designed to influence judges, law professors, lawyers, and the public, on issues of importance to the AAJ:  “the civil justice system and individual rights” for bigger, better, and more profitable litigation outcomes. The AAJ may be a “not-for-profit” corporation, but it represents the interests of one of the most powerful, wealthiest, interest groups in American society — the plaintiffs’ bar.

The Milward symposium agenda and papers from its participants were published at the website for the Wake Forest Journal of Law and Public Policy, but now are marked as “currently private. If you would like to request access, we’ll send your username to the site owner for approval.”

The symposium cited by Richter and Capra for “analysis” was very much a family affair. The choice of venue, at the Wake Forest Law School, was connected to the web of interests involved. CPR board member Sid Shapiro is a law professor at Wake Forest. Shapiro presented at the symposium, along with Wake Forest professor Michael Green. Cranor, Shapiro’s CPR colleague, and party expert witness for the plaintiff, presented as well.[8] Only one practicing lawyer presented at the symposium: Steven Baughman Jensen, a past chair of the AAJ’s Section on Toxic, Environmental, and Pharmaceutical Torts. Jensen represented Milward, and hired Cranor as one of the plaintiff’s expert witnesses. Attorney Jensen’s contribution was published, along with Cranor’s, in the proceedings of the Milward symposium, in volume 3, no. 1 of the Wake Forest Journal of Law and Public Policy,[9] which is now also marked private. Jensen also published an abbreviated paean to Milward in the AAJ’s trade journal.[10] No defense counsel or defense expert witness participated in the symposium referenced by Richter and Capra.

Consistent with the financial, advocacy, and political interests of the symposium sponsors, the articles are almost all partisan high-fives for the Milward decision. Writing on the law of expert witnesses for the Federal Judicial Center and the National Academies, Richter and Capra should have been aware of the partisan nature of the CPR-AAJ sponsored symposium. They should have flagged the advocacy nature of the symposium, and identified the funding sources and the conflicts they created. Furthermore, Richter and Capra should have cited papers that criticized the Milward case, from various perspectives, including its failure to adhere to the law of Rule 702.[11] Their failure to do so is a significant shortcoming of this chapter.


[1] IARC MONOGRAPHS ON THE IDENTIFICATION OF CARCINOGENIC HAZARDS TO HUMANS – PREAMBLE (2019).

[2] Milward, 639 F.3d at 23.

[3] Richter & Capra at 33 n.96 (“For a detailed analysis of the Milward decision and the weight of the evidence approach to scientific reasoning…”).

[4] Symposium: Toxic Tort Litigation: After Milward v. Acuity Products, 3 WAKE FOREST JOURNAL OF LAW & POLICY 1 (2013).

[5] The Center for Progressive Reform, at https://progressivereform.org/, last visited on Feb. 24, 2026

[6] Carl Cranor Biography, Center for Progressive Reform, Member Scholars, at https://progressivereform.org/member-scholars/

[7] The AAJ was previously known by the more revealing name, Association of Trial Lawyers of America (ATLA®). 

[8] Carl F. Cranor, Milward v. Acuity Specialty Products: Advances in General Causation Testimony in Toxic Tort Litigation, 3 WAKE FOREST JOURNAL OF LAW & POLICY 105 (2013).

[9] Steve Baughman Jensen, Sometimes Doubt Doesn’t Sell: A Plaintiffs’ Lawyer’s Perspective on Milward v. Acuity Products, 3 WAKE FOREST JOURNAL OF LAW & POLICY 177 (2013).

[10] Steve Baughman Jensen, Reframing the Daubert Issue in Toxic Tort Cases, 49 TRIAL 46 (Feb. 2013).

[11] See Eric Lasker, Manning the Daubert Gate: A Defense Primer in Response to Milward v. Acuity Specialty Products, 79 DEF. COUNS. J. 128, 128 (2012); David E. Bernstein, The Misbegotten Judicial Resistance to the Daubert Revolution, 89 NOTRE DAME L. REV. 27, 29, 53-58 (2013); David E. Bernstein & Eric G. Lasker, Defending Daubert: It’s Time to Amend Federal Rule of Evidence 702, 57 WM. & MARY L. REV. 1, 33 (2015); Richard Collin Mangrum, Comment on the Proposed Revision of Federal Rule 702: “Clarifying” the Court’s Gatekeeping Responsibility over Expert Testimony, 56 CREIGHTON LAW REVIEW 97, 106 & n.45 (2022); Thomas D. Schroeder, Toward a More Apparent Approach to Considering the Admission of Expert Testimony, 95 NOTRE DAME L. REV. 2039, 2045 (2020); Lawrence A. Kogan, Weight of the Evidence: A Lower Expert Evidence Standard Metastasizes in Federal Court, Washington Legal Foundation Critical Legal Issues WORKING PAPER Series no. 215 (Mar. 2020); Note, Judicial Conference Amends Rule 702. — Federal Rule of Evidence 702, 138 HARV. L. REV. 899, 903 (2025); Nathan A. Schachtman, Desultory Thoughts on Milward v. Acuity Specialty Products, DOI: 10.13140/RG.2.1.5011.5285 (Oct. 2015), available at https://www.researchgate.net/publication/282816421_Desultory_Thoughts_on_Milward_v_Acuity_Specialty_Products .

Reference Manual’s Chapter on Expert Witness Testimony Admissibility – Part 4

March 5th, 2026

In the district court, Judge George O’Toole conducted a pre-trial hearing over four days, and heard testimony from Smith and Cranor, as well as from defense expert witnesses. Judge O’Toole’s published opinion carefully and accurately stated the facts, the applicable law, and presented a well-reasoned judgment as to why Smith’s opinion was not admissible under Rule 702. Without admissible opinions on general causation to support Milward’s case, Judge O’Toole granted summary judgment to the defendants.

Milward appealed the judgment. A panel of judges in the First Circuit heard argument, and reversed in an opinion that is riddled with serious errors.[1] In reviewing the district court’s application of Rule 702, the panel, in an opinion written by Chief Judge Lynch, credulously accepted most of Smith’s and Cranor’s arguments that an ill-defined WOE approach is an acceptable method of guiding scientific judgment. Cranor equated WOE, as used by Smith, to the approach that Sir Austin Bradford Hill described, in 1965, for identifying causal associations from epidemiologic data.[2] Chief Judge Lynch’s opinion accurately tracked Cranor’s and Milward’s lawyers’ misrepresentations about Sir Austin’s paper:

“Dr. Smith’s opinion was based on a ‘weight of the evidence’ methodology in which he followed the guidelines articulated by world-renowned epidemiologist Sir Arthur [sic] Bradford Hill in his seminal methodological article on inferences of causality. See Arthur [sic] Bradford Hill, The Environment and Disease: Association or Causation?, 58 Proc. Royal Soc’y Med. 295 (1965).

Hill’s article explains that one should not conclude that an observed association between a disease and a feature of the environment (e.g., a chemical) is causal without first considering a variety of ‘viewpoints’ on the issue.”[3]

The quoted language from the First Circuit opinion, which twice refers to “Arthur Bradford Hill,” rather than Austin Bradford Hill, may suggest that neither Chief Judge Lynch nor her judicial colleagues and their law clerks read the classic paper. An even stronger indicator that the appellate court did not actually read the paper is the court’s equating WOE to the Bradford Hill viewpoints, without consideration of the necessary predicate for those nine viewpoints. In his short paper, Sir Austin clearly spelled out that a foundation was needed before parsing the nine viewpoints:

“Disregarding then any such problem in semantics we have this situation. Our observations reveal an association between two variables, perfectly clear-cut and beyond what we would care to attribute to the play of chance. What aspects of that association should we especially consider before deciding that the most likely interpretation of it is causation?”[4]

Whatever Sir Arthur had to say about the matter, Sir Austin defined the starting point of causal analysis as an association free of invalidating bias and random error. The Milward decision ignored this all-important predicate for assessing the various considerations that might allow a valid association to be considered causal.[5] The resulting abridgement was a failure of scientific due process that distorted the Bradford Hill paper.

The First Circuit amplified its error when it asserted that, among the nine considerations, “no one type of evidence must be present before causality may be inferred.”[6] Although Sir Austin said something similar, one of the considerations he noted was “temporality”: the putative cause must come before the effect. Most scientists would consider temporality essential, unless they were observing events moving faster than the speed of light. The other eight considerations are more dependent upon the context of the exposures and outcomes of interest, but surely strength and consistency of the clear-cut association across multiple studies are extremely important considerations.

The First Circuit proceeded from misreading Sir Austin’s paper to misunderstanding another paper invoked by Cranor and by Milward’s lawyers. Carelessly tracking Cranor, the appellate court suggested that there was no “hierarchy of evidence”:

“For example, when a group from the National Cancer Institute was asked to rank the different types of evidence, it concluded that ‘[t]here should be no such hierarchy.’ Michele Carbon [sic] et al., Modern Criteria to Establish Human Cancer Etiology, 64 Cancer Res. 5518, 5522 (2004); see also Sheldon Krimsky, The Weight of Scientific Evidence in Policy and Law, 95 Am. J. Pub. Health S129, S130 (2005).”[7]

This quoted language from the Milward opinion shows how slavishly and credulously the court adopted and regurgitated the plaintiff’s argument. Sheldon Krimsky was actively involved with SKAPP, and his article was presented at the SKAPP-funded Coronado Conference, discussed earlier in this series. Krimsky himself acknowledged that although “the term [WOE] is applied quite liberally in the regulatory literature, the methodology behind it is rarely explicated.”

As for the article by Carbon [sic], this publication never rejected a hierarchy of evidence. The court’s language, quoted above, follows immediately after the court’s discussion of Sir Austin’s nine types of corroborating evidence that would support the causal interpretation of an association. As such, the court seems to imply, incorrectly, that there was no hierarchy of these considerations.[8]

The court’s language also suggests that the quotation came from the National Cancer Institute (NCI), but its provenance is quite different. The cited article’s lead author, Michele Carbone (not Carbon), was reporting on a workshop held at an NCI building; it was not an official NCI event or publication. The NCI did not sponsor or conduct the meeting, and Carbone’s paper was not an official statement of the NCI. Carbone’s paper was styled a “Meeting Report,” and was published as a paid advertisement in Cancer Research, not as a scholarly article in the Journal of the National Cancer Institute.

The discipline of epidemiology was not strongly represented at the meeting; most of the chairpersons and scientists in attendance were pathologists, cell biologists, virologists, and toxicologists. The authors of the meeting report reflect the interests and focus of the scientists in attendance. The lead author, Michele Carbone, a pathologist at the University of Hawaii, was an enthusiastic proponent of Simian Virus 40 as a cause of mesothelioma, a hypothesis that has not fared terribly well in the crucible of epidemiologic science.

The cited article did report some suggestions for modifying Bradford Hill’s criteria in the light of modern molecular biology, as well as a sense of the group that there was no “hierarchy” in which epidemiology sat atop the disciplines. The group definitely did not address the established concept that some types of epidemiologic studies are analytically more powerful than others for supporting inferences of causality — the hierarchy of epidemiologic evidence. The group also did not address or reject a ranking of importance of Bradford Hill’s nine viewpoints. There was nothing remarkable about the tumor biologists’ statement that in some cases causality can be determined by careful identification of genetic inheritance or molecular biological pathways. There was no evidence of this sort in the Milward case, and the citation by Cranor and Milward’s lawyers was nothing more than hand waving.

Carbone’s meeting report summarizes informal discussion sessions at the 2003 meeting. Those in attendance broke out into two groups, one chaired by Brooke Mossman, a pathologist, and the other chaired by Dr. Harald zur Hausen, a virologist. The meeting report included a narrative of how the two groups responded to twelve questions. Drawing from the plaintiff’s (and Cranor’s) argument, the court’s citation to this meeting report rests on one sentence in Carbone’s report, about one of the twelve questions:

“6. What is the hierarchy of state-of-the-art approaches needed for confirmation criteria, and which bioassays are critical for decisions: epidemiology, animal testing, cell culture, genomics, and so forth?

There should be no such hierarchy. Epidemiology, animal, tissue culture and molecular pathology should be seen as integrating evidences in the determination of human carcinogenicity.”[9]

Considering the fuller context of the meeting, there is nothing particularly surprising about this statement. The full question and answer in the meeting report do not even remotely support the weight given to them by the court. There was quite a bit of disagreement among meeting participants over criteria for different kinds of carcinogens, as seen in the report on another question:

“2. Should the criteria be the same for different agents (viruses, chemicals, physical agents, promoting agents versus initiating DNA-damaging agents)?

There were different opinions. Group 1 debated this issue and concluded that the current listing of criteria should remain the same because we lack sufficient evidence to develop a separate classification. Group 2 strongly supported the view that it is useful to separate the biological or infectious agents from chemical and physical carcinogens due to their frequently entirely different mode of action.”[10]

Carbone and the other authors of the meeting report noted the importance of epidemiology for general causation, while acknowledging its limitations for determining specific causation:

“Concerning the respective roles of epidemiology and molecular pathology, it was noted that epidemiology allows the determination of the overall effect of a given carcinogen in the human population (e.g., hepatitis B virus and hepatocellular carcinoma) but cannot prove causality in the individual tumor patient.”[11]

Clearly, the report was not disavowing the necessity for epidemiology to confirm carcinogenicity in humans. Specific causation of Mr. Milward’s APML was irrelevant to his first appeal to the First Circuit. Carbone’s report emphasized the need to integrate epidemiologic findings with molecular biology; it did not suggest that epidemiology was not necessary or urge that epidemiology be ignored or disregarded:

“A general consensus was often reached on several topics such as the need to integrate molecular pathology and epidemiology for a more accurate and rapid identification of human carcinogens.”[12]

                 * * * * *

“Ideally, before labeling an agent as a human carcinogen, it is important to have epidemiological, experimental animals, and mechanistic evidence (molecular pathology).”[13]

The court’s implication that there was “no hierarchy of evidence” is unsupported by the meeting report. The suggestion that WOE allows some loosey-goosey, ad hoc, unstructured assessment of diverse lines of evidence is rejected in the meeting report with a careful admonition about the lack of validity of some animal models and mechanistic research:

“Moreover, carcinogens and anticarcinogens can have different effects in different situations. As shown by the example of addition of β-carotene in the diet, β-carotene has chemopreventive effects in many experimental systems, yet it appears to have increased the incidence of lung cancer in heavy smokers. Animal experiments can be very useful in predicting the carcinogenicity of a given chemical. However, there are significant differences in susceptibility among species and within organs in the same species, and differences in the metabolic pathway of a given chemical among human and animals could lead to error.”[14]

Inference to the Best Explanation

The First Circuit asserted that “no serious argument can be made that the weight of the evidence approach is inherently unreliable.”[15] As discussed above, this assertion is demonstrably false. In his testimony at the Rule 702 pre-trial hearing, Cranor classified WOE as based upon “inference to the best explanation,” and the First Circuit obsequiously accepted this claim. In articulating and accepting Cranor’s reduction of scientific method to IBE, the appellate court seemed unaware that IBE as an epistemic theory has been roundly criticized. In a very general sense, IBE draws on Charles Peirce’s description of abduction as a mode of reasoning, although many writers have been eager to distinguish abduction from IBE. Bas van Fraassen criticized IBE as lacking merit as a mode of argument in a way germane to Cranor’s presentation of the notion, and to the First Circuit’s uncritical acceptance of it:

“As long as the pattern of Inference to the Best Explanation—henceforth, IBE—is left vague, it seems to fit much rational activity. But when we scrutinize its credentials, we find it seriously wanting.”[16]

The IBE approach raises thorny problems of knowing how to discern the best explanation, or how to tell whether an explanation is simply the best of a bad lot. Other philosophers of science have questioned why explanatoriness should matter as opposed to predictive ability and resistance to falsification upon severe or robust testing.

In the hands of Smith and Cranor, these philosophical quandaries become largely beside the point. For Smith and Cranor, IBE becomes the telling of just-so stories, which transform “but for” causation into “could be” causation. Drawing directly from Cranor, the Circuit Court explained that an inference to the best explanation involves six general steps for scientists:

“(1) identify an association between an exposure and a disease,

(2) consider a range of plausible explanations for the association,

(3) rank the rival explanations according to their plausibility,

(4) seek additional evidence to separate the more plausible from the less plausible explanations,

(5) consider all of the relevant available evidence, and

(6) integrate the evidence  using professional judgment to come to a conclusion about the best explanation.”[17]

Of course assessing causation requires judgment, but Cranor and Smith radically abridge the process of judging by eliminating:

  • the robust testing of, and attempts to falsify, hypotheses,
  • the weighting of study designs,
  • the pre-specification of kinds of studies to be included or excluded, the assignment of weights to different kinds and qualities of studies, and
  • the pre-specification of criteria of study validity, experimental design, consistency, and exposure-response.

The vague, contentless IBE and WOE, in the hands of Smith, operate just as van Fraassen anticipated. With Cranor’s “philosophizing,” IBE creates a permission structure for reaching any desired conclusion. Indeed, Cranor’s approach makes no allowance for the careful scientist who withholds judgment because the evidence is inadequate to the task. Furthermore, Cranor’s approach and the Milward decision would cheerily approve cherry-picking of studies and of data within studies, post hoc weighing of evidence, and even the fabricating and rejiggering of evidence, all of which were on display in Smith’s for-litigation opinion.

The First Circuit uttered its mantra of approval of Smith’s scientific delicts in language that became the target of the revision of Rule 702 in 2023:

“the alleged flaws identified by the [district] court go to the weight of Dr. Smith’s opinion, not its admissibility. There is an important difference between what is unreliable support and what a trier of fact may conclude is insufficient support for an expert’s conclusion.”[18]

Earlier in its opinion, the appellate court quoted from the version of Rule 702 in effect when it heard the appeal:

“if (1) the testimony is based upon sufficient facts or data, (2) the testimony is the product of reliable principles and methods, and (3) the witness has applied the principles and methods reliably to the facts of the case.”[19]

Sufficiency, reliability, and validity were all preliminary questions to be decided by the court as part of its gatekeeping responsibility.  The appellate court simply ignored the law in its decision to green light Smith’s testimony.

                    (to be continued)


[1] Milward v. Acuity Specialty Products Group, Inc., 639 F.3d 11 (1st Cir. 2011), cert. denied sub nom., U.S. Steel Corp. v. Milward, 565 U.S. 1111 (2012).

[2] Austin Bradford Hill, The Environment and Disease: Association or Causation?, 58 PROC. ROYAL SOC’Y MED. 295 (1965).

[3] Milward, 639 F.3d at 17.

[4] Id. at 295.

[5] See Frank C. Woodside, III & Allison G. Davis, The Bradford Hill Criteria: The Forgotten Predicate, 35 THOMAS JEFFERSON L. REV. 103 (2013).

[6] Milward, 639 F.3d at 17.

[7] Id. (internal citations omitted).

[8] The Reference Manual chapter on medical testimony carefully discusses the hierarchy of evidence as it factors into the assessment of medical causation. John B. Wong, Lawrence O. Gostin & Oscar A. Cabrera, Reference Guide on Medical Testimony, in National Academies of Sciences, Engineering and Medicine & Federal Judicial Center, REFERENCE MANUAL ON SCIENTIFIC EVIDENCE 687, 723-24 (2011); John B. Wong, Lawrence O. Gostin & Oscar A. Cabrera, Reference Guide on Medical Testimony, in National Academies of Sciences, Engineering and Medicine & Federal Judicial Center, REFERENCE MANUAL ON SCIENTIFIC EVIDENCE 1105, 1150-52 (4th ed. 2025). Interestingly, the chapter on epidemiology in the third edition of the Reference Manual cited to the Carbone workshop with apparent approval, but the same chapter in the fourth edition has dropped the reference. Compare Michael D. Green, D. Michal Freedman & Leon Gordis, Reference Guide on Epidemiology, in National Academies of Sciences, Engineering and Medicine & Federal Judicial Center, REFERENCE MANUAL ON SCIENTIFIC EVIDENCE 549, 564 n.48 (3d ed. 2011) with Steve C. Gold, Michael D. Green, Jonathan Chevrier & Brenda Eskenazi, Reference Guide on Epidemiology, in National Academies of Sciences, Engineering and Medicine & Federal Judicial Center, REFERENCE MANUAL ON SCIENTIFIC EVIDENCE 897 (4th ed. 2025).

[9] Carbone at 5522.

[10] Carbone at 5521.

[11] Carbone at 5518 (emphasis added).

[12] Carbone at 5518.

[13] Carbone at 5519.

[14] Carbone at 5521.

[15] Milward, 639 F.3d at 18-19.

[16] Bas van Fraassen, LAWS AND SYMMETRY 131 (1989).

[17] Milward, 639 F.3d at 18.

[18] Milward, 639 F.3d at 22.

[19] Milward, 639 F.3d at 14.

Reference Manual’s Chapter on Expert Witness Testimony Admissibility – Part 3

March 2nd, 2026

Richter and Capra treat WOE in Justice Stevens’ lone dissenting opinion in Joiner as if it were the law. Of course, it was not; nor was it a particularly insightful analysis of scientific method, Rule 702, or the law of expert witnesses. The Manual authors elevate WOE by their complete failure either to offer any criticisms of it or to cite the scientific and legal scholars who have criticized WOE.

Richter and Capra do cite to a couple of cases that are skeptical of expert witnesses who had offered WOE opinions, but they fail to cite to any cases that disparage WOE itself.[1] In aggravation of their misplaced focus on the Joiner dissent, Richter and Capra proceed to spend two full pages on the Milward case, which had posthumously appeared in Professor Berger’s version of the law chapter in the 2011, third edition of the Reference Manual. The attention given to Milward in the fourth edition is greater than to any other non-Supreme Court case, including Frye. Richter and Capra offer no commentary or analysis critical of the case, although many legal commentators have criticized the Milward opinion on WOE.[2]

Richter and Capra’s chapter fails to note that a dark cloud hangs over the Milward case due to the unethical non-disclosure of CERT’s amicus brief filed in support of reversing the exclusion of CERT’s founders, Carl Cranor and Martyn Smith,[3] or CERT’s funding Smith’s research, or CERT’s involvement in shaking down corporations in California for Prop 65 bounties.

In their extensive coverage of the 2011 Milward decision, Richter and Capra failed to report that after the First Circuit reversed and remanded, the trial court again excluded plaintiffs’ expert witnesses for failing to give a valid opinion on specific causation. On the second appeal, the First Circuit affirmed the exclusion of specific causation expert witness testimony and the entry of final judgment for defendants.[4] Given that the first appellate decision was no longer necessary to the final disposition of the case, it is questionable whether there is any holding with respect to general causation in the case.

The most salient aspect of Richter and Capra’s uncritical coverage of the Milward case is their complete failure to identify the legal errors made by the First Circuit in its decision on Rule 702 and general causation. As the Reporter to the Rules Advisory Committee, Professor Capra was intimately involved in many meetings and memoranda that addressed the failings of courts to engage properly in gatekeeping. These failings were the gravamen of the 2023 amendments to Rule 702. The Milward decision in 2011 managed to check almost every box for bad decision making: the appellate panel ignored the text of Rule 702, disregarded Supreme Court precedent in the Joiner case, relied upon overruled, obsolete, pre-Daubert decisions, ignored the policy considerations urged by the Supreme Court, bungled basic scientific concepts, and egregiously and credulously endorsed WOE as a scientific methodology. Professor David E. Bernstein has pointed to the 2011 Milward decision as “the most notorious,” and “[t]he most prominent example of such judicial truculence” in resisting the requirements of Rule 702, as it existed in 2011.[5]

Milward is an important case, much as the Berenstain Bears stories are important and helpful in teaching children what not to do. Unfortunately, Richter and Capra discuss Milward in a way that might lead readers to believe that the case represents a reasonable or proper treatment of the science involved in the case. To correct this biased coverage of Milward, readers will have to roll up their sleeves and actually look at what the court did and did not do, and what scientific methodology issues were involved.

Perhaps the best place to begin is the beginning. Brian Milward filed a lawsuit in which he claimed that he was exposed to benzene as a refrigerator technician.[6] He developed acute promyelocytic leukemia (APML), and claimed that he had been exposed to benzene from having used products made or sold by roughly two dozen companies. APML is a rare disease, type M3 of acute myeloid leukemia (AML), defined by specific chromosomal abnormalities that are necessary but not sufficient to result in APML. APML has an incidence of fewer than five cases per million per year. APML occurs with equal frequency in both sexes; there are no known environmental or occupational causes of APML.[7] APML occurs in the general population without benzene exposure, and its occurrence in all populations is sparse. There is no biomarker to suggest that some putative benzene-related mechanism is involved in some APML cases, a biomarker that might identify how rarely benzene is involved in causation.

Milward’s General Causation Expert Witness, Martyn T. Smith

Milward did not serve a report from an epidemiologist, or anyone with significant expertise in epidemiology. His only general causation expert witness was Martyn Smith, a toxicologist, who testified that the “weight of the evidence” supported his opinion that benzene exposure causes APML.[8] As noted above, Smith is a member of the advocacy group, the Collegium Ramazzini; and for over 30 years, he has been a frequent testifier for plaintiffs in chemical exposure cases.[9]

Despite the low but widespread prevalence of APML in the general population, with no sex specificity, and the absence of any identifying biomarker of supposed benzene-related etiology in individual cases, Smith maintained that epidemiology was not necessary to reach a causal opinion about benzene and APML. The principal thrust of Smith’s proffered testimony was that APML is a plausible outcome of benzene exposure, because benzene can cause other varieties of AML by structurally altering chromosomes (a clastogenic effect), breaking them and causing re-arrangements.[10]

The trial court found that Smith’s extrapolations were problematic and lacking in supporting evidence. The clear differences among AML subtypes made the extrapolation to APML, a unique clinical entity, inappropriate. The characteristic translocation in APML is absent from other varieties of AML, and APML, unlike other AML varieties, is treatable with all-trans retinoic acid.[11]

Smith advanced speculation that benzene targeted cells in the pathway of leukemic transformation to APML, but the state of science was clearly devoid of sufficient evidence to show that benzene was involved in the APML translocations. Although the parties agreed that mechanistic evidence showed that benzene can effectuate chromosome damage that is characteristic of some AML subtypes other than APML, the trial court found that:

“[n]o evidence has been published making a similar connection between benzene exposure and the t(15;17) translocation, characteristic of APL [APML].”[12]

The trial court assessed Smith’s extrapolation from benzene’s clastogenic effect in breaking and rearranging chromosomes to induce some types of AML to its causing the specific APML t(15;17) translocation, as a

“bull in the china shop generalization: since the bull smashes the teacups, it must also smash the crystal. Whether that is so, of course, would depend on the bull having equal access to both teacups and crystal. If the teacups were easily knocked over, but the crystal securely stored away, a reason would exist to question, if not to reject, the proposition that the crystal was in as much danger as the teacups.”[13]

The trial judge clearly saw that Smith’s plausibility proved too much, and would support attributing virtually any disease to benzene through a putative mechanism of breaking chromosomes.

Lacking the courage of his convictions, Smith, a non-epidemiologist, proceeded to offer opinions about the epidemiology of benzene and APML, some of them quite fanciful. No published or unpublished study showed a statistically significant increase in APML among benzene-exposed workers. The most Smith could draw from the published epidemiologic studies on benzene was one Chinese study that found a small risk ratio, without even nominal statistical significance: a crude odds ratio of 1.42 for benzene exposure and APML. Despite Smith’s hand waving about lack of power,[14] this Chinese study suggested that chloramphenicol was a risk factor for APML (M3), and it was able to identify a nominally statistically significant association between benzene and another sub-type of AML (M2a), with an odds ratio of 1.54.[15]
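The point made in footnote 14, that complaining about a study’s “power” is meaningless without specifying the alternative hypothesis, can be made concrete with a back-of-the-envelope calculation. The following sketch (stdlib Python only; the function name and all of the numbers are hypothetical illustrations, not taken from any study in the litigation) uses a simple normal approximation for a two-group comparison of disease risks:

```python
import math

def norm_cdf(x: float) -> float:
    """Standard normal cumulative distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def power_two_proportions(p0: float, rr: float, n: int, z_alpha: float = 1.96) -> float:
    """Approximate power of a two-sided test comparing disease risk in two
    groups of size n each, to detect a relative risk `rr` over a baseline
    risk p0 (crude normal approximation; a sketch, not a study design tool)."""
    p1 = p0 * rr
    se = math.sqrt((p0 * (1 - p0) + p1 * (1 - p1)) / n)
    return norm_cdf(abs(p1 - p0) / se - z_alpha)

# Hypothetical numbers: a baseline risk of 5 per million (on the order of
# APML's incidence) and 5,000 subjects per group. Power is negligible for a
# doubling of risk, and approaches certainty only for an enormous relative risk.
print(power_two_proportions(5e-6, 2, 5000))      # tiny
print(power_two_proportions(5e-6, 1000, 5000))   # near 1
```

The exercise illustrates why, for an outcome as rare as APML, “power” talk cuts both ways: the same study that is hopeless against a modest alternative may have ample power against a very large one.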

Smith offered no meta-analysis to show that the available studies collectively established a summary estimate of increased risk for APML among benzene workers. Undaunted, Smith set about re-jiggering the numbers in published studies to make something out of nothing. Neither physician nor epidemiologist, Smith altered diagnoses and exposure status as reported in published papers so that his reclassified cases and controls would yield associations where none existed. These re-analyses were done speculatively, inconsistently, and incompetently, driven by the motivation to reach a desired result. His approach was unsupported, unprincipled, and lacking in any reasonable methodology. The proffered re-analyses were never published, never presented at a professional society meeting, and could never comply with the standards that epidemiologists use in their non-litigation activities. As a toxicologist, Smith did not have any non-litigation epidemiologic activities of note.

Smith’s representation of the relevant epidemiologic methods and studies was misleading and contained numerous errors that cumulatively led to erroneous conclusions; his own re-jiggering was carried out to reach a preferred conclusion to support plaintiff’s litigation case.[16]

One of the epidemiologic studies relied upon by Smith was Golomb (1982).[17] This study did not explore associations with benzene; it was a study of insecticides, chemicals and solvents, and petroleum. Crude oil contains very little benzene, typically about 0.1 percent.[18] Smith, without any evidentiary support, assumed that petroleum exposure equated to benzene exposure.

There were eight cases of leukemia with petroleum exposure; one of those cases was APML. The authors of Golomb (1982) reported that this particular APML case was actually a crane operator.[19]

In analyzing published epidemiologic studies, Smith insisted that, when the karyotype was normal, he could re-classify APML cases in study subjects as non-APML. Karyotype analysis identifies the defining translocations of specific chromosomes in APML, translocations that are found in virtually all such cases. The obvious result of Smith’s ad hoc reclassifications was to increase risk ratios for APML among benzene-exposed subjects. His arbitrary reclassifications of data allowed him to create the result he desired. In reviewing other published studies, Smith insisted that a normal karyotype did not require reclassifying cases out of the APML category, when this approach would yield a risk ratio above one.

Taking data from the Golomb 1982 paper, Smith attempted to inflate his calculation of an odds ratio to support his causation opinion. He arbitrarily discarded two APML cases from the non-exposed group, and he discarded eight non-APML cases from the exposed subjects. He did not report p-values or confidence intervals for his reanalyses. At the hearing, the defense epidemiologist showed that Smith’s rejiggered odds ratio (1.51) had a p-value of 0.72, and a 95 percent confidence interval of 0.15 – 14.91. Not only was the result not statistically significant, but the confidence interval showed a range of alternative hypotheses spanning roughly two orders of magnitude, none of which could be rejected on the sample data at an alpha of 0.05. Without the rejiggering of exposed and unexposed cases, the odds ratio would have been 0.71, p = 0.76. All results, both as reported in the published article and as rejiggered by Smith, were highly compatible with no association whatsoever.
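For readers who want to see how such numbers arise, an odds ratio and its Woolf (log-normal) confidence interval can be computed directly from a 2×2 table. The sketch below uses purely hypothetical counts (not the Golomb data, whose cell counts are not reproduced here) to show how sparse cells produce intervals spanning orders of magnitude:

```python
import math

def odds_ratio_woolf(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Odds ratio from a 2x2 table (a = exposed cases, b = exposed controls,
    c = unexposed cases, d = unexposed controls), with a Woolf (log-normal)
    approximate 95% confidence interval."""
    or_ = (a * d) / (b * c)
    se_log = math.sqrt(1/a + 1/b + 1/c + 1/d)  # std. error of ln(OR)
    lo = math.exp(math.log(or_) - z * se_log)
    hi = math.exp(math.log(or_) + z * se_log)
    return or_, lo, hi

# Hypothetical counts: 2 exposed cases, 10 exposed controls,
# 3 unexposed cases, 30 unexposed controls.
or_, lo, hi = odds_ratio_woolf(2, 10, 3, 30)
print(f"OR = {or_:.2f}, 95% CI {lo:.2f} - {hi:.2f}")
```

Even a nominal doubling of the odds, when built on a handful of cases, comes with an interval that includes the null and stretches across two orders of magnitude, which is the sense in which such results are “highly compatible with no association whatsoever.”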

In discussing other studies, Smith repeated his re-labeling of leukemia cases as APML, in the absence of karyotyping, to support his claims that there were more APML cases observed than expected on general population rates.[20] Smith also cited studies improvidently in supposed support of his opinion (Rinsky 1981; updated in 1994), where there was no association at all. Even workers heavily exposed to benzene in these studies did not develop APML.[21] Similarly, in support of his opinion, Smith cited another Chinese study, which actually declared that:

“Acute promyelocytic leukemia has been reported infrequently in benzene-exposed groups as well as in t-ANLL. Although ANLL-M3 occurred in at least 4 patients in this series, its general representation among the subtypes of ANLL was similar in its distribution in de novo ANLL in China.”[22]

Smith’s methodological improprieties were the subject of a four-day pre-trial hearing before Judge O’Toole. In the course of the hearings, Smith attempted to defend his methods, but like Donny Kerabatsos in The Big Lebowski, Smith was out of his depth. The trial court found that Dr. Smith’s arbitrary creation and selection of data to support his beliefs was unreliable and not in accordance with generally accepted scientific methodology in the fields of medicine or epidemiology. Smith was simply fabricating data to fit his made-for-litigation beliefs.

Carl Cranor’s Attempt to Bolster Smith

Milward also submitted a report from Carl Forest Cranor, Smith’s business partner in founding the Prop 65 bounty-hunting CERT, and a fellow member of the advocacy group Collegium Ramazzini. Cranor has no expertise in toxicology or epidemiology, and he has never published on the causes of APML. As a professor of philosophy, Cranor has written about scientific methodology, including WOE and “inference to the best explanation” (IBE). Cranor’s publications are riddled with basic misunderstandings of statistical concepts.[23] Essentially, Cranor testified at the Rule 702 hearing as a cheerleader for Smith, and as an advocate for the open admission of dodgy scientific conclusions reached by a methodology he described as WOE or IBE. Cranor stretched to resurrect Justice Stevens’ use of WOE, and attempted to pass it off as a generally accepted scientific mode of reasoning.

The trial court carefully reviewed the proffered opinion testimony in a four-day pre-trial hearing, and found that Smith had shown that his hypothesis was plausible and possible, but not that it was “scientific knowledge,” as required by Rule 702. Lacking sufficient scientific methodological validity and support, Smith’s opinions failed to satisfy the requirements of Rule 702, and were thus inadmissible. As a result of excluding plaintiff’s sole general causation expert witness, the trial court granted summary judgment to the defendants.[24]

(to be continued)


[1] See, e.g., Allen v. Pennsylvania Eng’g Corp., 102 F.3d 194, 197-98 (5th Cir. 1996) (“We are also unpersuaded that the ‘weight of the evidence’ methodology these experts use is scientifically acceptable for demonstrating a medical link between Allen’s EtO [ethylene oxide] exposure and brain cancer.”); Magistrini v. One Hour Martinizing Dry Cleaning, 180 F. Supp. 2d 584, 601-02 (D.N.J. 2002) (excluding David Ozonoff, whose WOE analysis of whether perchloroethylene causes acute myelomonocytic leukemia was criticized by court-appointed technical advisor), aff’d, 68 F. App’x 356 (3d Cir. 2003).

[2] See Eric Lasker, Manning the Daubert Gate: A Defense Primer in Response to Milward v. Acuity Specialty Products, 79 DEF. COUNS. J. 128, 128 (2012); David E. Bernstein, The Misbegotten Judicial Resistance to the Daubert Revolution, 89 NOTRE DAME L. REV. 27, 29, 53-58 (2013); David E. Bernstein & Eric G. Lasker, Defending Daubert: It’s Time to Amend Federal Rule of Evidence 702, 57 WM. & MARY L. REV. 1, 33 (2015); Richard Collin Mangrum, Comment on the Proposed Revision of Federal Rule 702: “Clarifying” the Court’s Gatekeeping Responsibility over Expert Testimony, 56 CREIGHTON LAW REVIEW 97, 106 & n.45 (2022); Thomas D. Schroeder, Toward a More Apparent Approach to Considering the Admission of Expert Testimony, 95 NOTRE DAME L. REV. 2039, 2045 (2020); Lawrence A. Kogan, Weight of the Evidence: A Lower Expert Evidence Standard Metastasizes in Federal Court, Washington Legal Foundation Critical Legal Issues WORKING PAPER Series no. 215 (Mar. 2020); Note, Judicial Conference Amends Rule 702. — Federal Rule of Evidence 702, 138 HARV. L. REV. 899, 903 (2025); Nathan A. Schachtman, Desultory Thoughts on Milward v. Acuity Specialty Products, DOI: 10.13140/RG.2.1.5011.5285 (Oct. 2015), available at https://www.researchgate.net/publication/282816421_Desultory_Thoughts_on_Milward_v_Acuity_Specialty_Products .

[3] See David DeMatteo & Kellie Wiltsie, When Amicus Curiae Briefs are Inimicus Curiae Briefs: Amicus Curiae Briefs and the Bypassing of Admissibility Standards, 72 AM. UNIV. L. REV. 1871 (2022) (noting that amicus briefs often include “unvetted and potentially inaccurate, misleading, or mischaracterized expert information,” without the procedural safeguards in place for vetting expert witnesses at trial).

[4] Milward v. Acuity Specialty Prods. Group, Inc., 969 F. Supp. 2d 101, 109 (D. Mass. 2013), aff’d sub. nom., Milward v. Rust-Oleum Corp., 820 F.3d 469, 471, 477 (1st Cir. 2016).

[5] David E. Bernstein, The Misbegotten Judicial Resistance to the Daubert Revolution, 89 NOTRE DAME L. REV. 27, 53, 29 (2013).

[6] Milward v. Acuity Specialty Products Group, Inc., 664 F. Supp. 2d 137 (D. Mass. 2009) (O’Toole, J.), rev’d, 639 F.3d 11 (1st Cir. 2011), cert. denied, U.S. Steel Corp. v. Milward, 565 U.S. 1111 (2012).

[7] Andrew Y. Li, et al., Clustered incidence of adult acute promyelocytic leukemia in the vicinity of Baltimore, 61 LEUKEMIA & LYMPHOMA 2743 (2021); Hassan Ali, et al., Epidemiology and Survival Outcomes of Acute Promyelocytic Leukemia in Adults: A SEER Database Analysis, 144 BLOOD 5942 S1 (2024).

[8] Milward, 664 F. Supp. 2d at 142.

[9] See, e.g., PPG Industries, Inc. v. Wells, No. 21-0232 (Feb. 10, 2023 W.Va.S.Ct.); Hall v. ConocoPhillips, 248 F. Supp. 3d 1177 (W.D. Okla. 2017); In re Levaquin Prods. Liab. Litig., 739 F.3d 401 (8th Cir. 2014); Jacoby v. Rite Aid Corp., No. 1508 EDA 2012 (Dec. 9, 2013 Pa. Super.); Harris v. CSX Transp., Inc., 232 W.Va. 617, 753 S.E.2d 275 (2013); In re Baycol Prods. Litig., 495 F. Supp. 2d 977 (D. Minn. 2007); In re Rezulin Prods. Liab. Litig., MDL 1348, 441 F.Supp.2d 567 (S.D.N.Y. 2006) (advocating mythological “silent injury”); Perry v. Novartis, 564 F.Supp.2d 452 (E.D. Pa. 2008); Dodge v. Cotter Corp., 328 F.3d 1212 (10th Cir. 2003); Sutera v. The Perrier Group of America Inc., 986 F. Supp. 655 (D. Mass. 1997); Redland Soccer Club, Inc. v. Dep’t of Army, 835 F.Supp. 803 (M.D. Pa. 1993).

[10] Milward, 664 F.Supp. 2d at 143-44.

[11] Milward, 664 F.Supp. 2d at 144.

[12] Id. at 146.

[13] Id.

[14] The claim that a study lacks power is meaningless without a specification of the alternative hypothesis, the risk ratio the researcher thinks is the population parameter, at a specified level of alpha (typically p < 0.05), and a specified probability model. While virtually all studies would have reasonable statistical power (say 80 percent probability) to reject an alternative hypothesis that the risk ratio exceeded 10,000, no study would have power to detect a risk ratio of 1.0001, at a high level of probability.

[15] Yi Zhongguo, et al. (National Investigative Group for the Survey of Leukemia & Aplastic Anemia), Countrywide Analysis of Risk Factors for Leukemia and Aplastic Anemia, 14 ACTA ACADEMIAE MEDICINAE SINICAE 185 (1992).

[16] Milward, 664 F. Supp. 2d at 148-49.

[17] Harvey M. Golomb, et al., Correlation of Occupation and Karyotype in Adults With Acute Nonlymphocytic Leukemia, 60 BLOOD 404 (1982).

[18] Bo Holmberg, Per Lundberg, Benzene: standards, occurrence, and exposure, 7 AM. J. INDUS. MED. 375 (1985).

[19] Golumb, supra at note 17, at 407.

[20] See, e.g., Song-Nian Yin, et al., A cohort study of cancer among benzene-exposed workers in China: overall results, 29 AM. J. INDUS. MED. 227 (1996).

[21] Robert A. Rinsky, et al., Leukemia in Benzene Workers, 2 AM. J. INDUS. MED. 217 (1981); Mary B. Paxton, et al., Leukemia Risk Associated with Benzene Exposure in the Pliofilm Cohort: I. Mortality Update and Exposure Distribution, 14 RISK ANALYSIS 147 (1994); Mary B. Paxton, et al., Leukemia Risk Associated with Benzene Exposure in the Pliofilm Cohort II. Risk Estimates, 14 RISK ANALYSIS 155 (1994).

[22] Lois B. Travis, et al., Hematopoietic Malignancies and Related Disorders Among Benzene-Exposed Workers in China, 14 LEUKEMIA & LYMPHOMA 91, 99 (1994).

[23] See, e.g., Carl F. Cranor, REGULATING TOXIC SUBSTANCES: A PHILOSOPHY OF SCIENCE AND THE LAW at 33-34 (1993) (conflating random error with posterior probabilities: “One can think of α, β (the chances of type I and type II errors, respectively) and 1 - β as measures of the ‘risk of error’ or ‘standards of proof.’”); id. at 44, 47, 55, 72-76.

[24] 664 F. Supp. 2d at 140, 149.

The Fourth Edition’s Chapter on Admissibility of Expert Witness Testimony – Part 2

February 24th, 2026

The Manual’s new law chapter on the admissibility (vel non) of expert witness testimony was written by two law professors who teach evidence, and who often write articles with one another.[1] Liesa Richter teaches at the University of Oklahoma College of Law. Daniel Capra teaches at Fordham School of Law, in Manhattan. For the last three decades, Capra has been the Reporter for the Judicial Conference Advisory Committee on the Federal Rules of Evidence. There probably is no evidence law scholar more involved with the Federal Rules, including the key expert witness rules, Rule 702 and Rule 703, than Capra.

The new chapter’s strengths follow from Professor Capra’s involvement in the evolution of Rule 702. The chapter plainly acknowledges that the Supreme Court decisions of the 1990s follow from an epistemic standard, embodied in the use of the terms “scientific” and “knowledge” in Rule 702. Counting heads, as suggested by the Frye case, was at times a weak and ambiguous proxy for knowledge.[2] The new chapter has the important advantage of not having authors entwined in the advocacy of dodgy groups such as SKAPP and the Collegium Ramazzini. Gone from the new chapter are Berger’s gratuitous and unwarranted endorsements and mischaracterizations of carcinogenicity evaluations by the International Agency for Research on Cancer (IARC).

Like Berger’s previous versions of this chapter, the new chapter carefully explains the Supreme Court decisions on expert witness admissibility and the changes in Rule 702 over time, including the 2023 amendment. One glaring omission from the new chapter is the absence of any mention of the fourth Supreme Court case in the 1993–2000 quartet: Weisgram v. Marley.[3] This important opinion by Justice Ginsburg was a clear expression of the seriousness with which the Court took the gatekeeping enterprise:

“Since Daubert, moreover, parties relying on expert testimony have had notice of the exacting standards of reliability such evidence must meet… . It is implausible to suggest, post-Daubert, that parties will initially present less than their best expert evidence in the expectation of a second chance should their first trial fail.”[4]

Professor Berger discussed this case in her last chapter, but the new authors fail to mention it at all.[5]

On the plus side, Richter and Capra discuss, although all too briefly, the role that Federal Rule of Evidence 703 plays in governing expert witness testimony.[6] Rule 703 does not address the admissibility of expert witnesses’ opinions, but it does give trial courts control over the hearsay facts and data (such as published studies), otherwise inadmissible, upon which expert witnesses may rely.[7] Richter and Capra do not, however, come to grips with how Rule 703 will often require trial courts to engage with the validity and flaws of specific studies in order to evaluate the reasonableness of expert witness reliance upon them. Berger, in the third edition of the Manual, completely failed to address Rule 703 and its important role in gatekeeping.

Richter and Capra helpfully advise judges to be cautious in relying upon pre-2023 amendment cases because that most recent amendment was designed to correct clearly erroneous applications of Rule 702 in both federal trial and appellate courts.[8] The new Manual authors also deserve credit for being willing to call out judges for ignoring the Rule 702 sufficiency prong and for invoking the evasive dodge of many courts in characterizing expert witness challenges as going to “weight not admissibility.”[9]

Richter and Capra improve upon past chapters by reporting plainly that the 2023 amendment to Rule 702 addressed important concerns that courts were failing to keep expert witnesses “within the bounds of what can be concluded from a reliable application of the expert’s basis and methodology,” and that Rule 702(d) was amended to emphasize courts’ legal obligation to do so.[10] Berger could have discussed this phenomenon even back in 2010–11, but failed to do so.

The new authors report that the Rules Advisory Committee had been concerned that expert witnesses regularly overstate or overclaim the appropriate level of certainty for their opinions, especially in the context of forensic science.[11] Although the recognition of problematic overclaiming in forensic science is a welcome development, Richter and Capra fail to recognize that overclaiming is at the heart of the Milward case, involving benzene exposure and acute promyelocytic leukemia (APML). And they seem unaware that overclaiming is baked into the precautionary principle that drives IARC pronouncements, the advocacy positions of groups such as the Collegium Ramazzini, and much of regulatory rule-making.

In several respects, Richter and Capra have improved upon the past three editions in presenting the law of expert witness testimony. The new chapter gives a brief exposition of the Joiner case,[12] in which the Court concluded that there was an “analytical gap” between the plaintiffs’ expert witnesses’ conclusion on causation and the animal and human studies upon which they relied. The authors’ summary of the case explains that the Supreme Court majority concluded that the trial court was well within its discretion to find a cavernous analytical gap between the evidence the plaintiffs’ expert witnesses relied upon and their conclusion that polychlorinated biphenyls (PCBs) caused Mr. Joiner’s lung cancer.

Richter and Capra’s analysis goes sideways in addressing the dissent by Justice Stevens and in giving it uncritical, disproportionate attention. As a dissent that never gained acceptance from any other member of the high court, Justice Stevens’ opinion in Joiner hardly deserved any mention at all. Richter and Capra note, however, early in the chapter, that Justice Stevens criticized the majority in Joiner for having “examined each study relied upon by the plaintiff’s experts in a piecemeal fashion and concluded that the experts’ opinions on causation were unreliable because no one study supported causation.”[13] Stevens’ criticism was wide of the mark in that the Court specifically addressed the “mosaic” theory, which was a reprise of the plaintiffs’ unsuccessful strategy in the Bendectin litigation.[14]

Justice Stevens’ dissent wantonly embraced Joiner’s expert witnesses’ use of a “weight of the evidence” (WOE) methodology. Stevens asserted that WOE is accepted in regulatory circles, which is true but irrelevant, and that it is accepted in scientific circles, which is a gross exaggeration and misrepresentation. Richter and Capra somehow manage to discuss Stevens’ WOE argument twice,[15] thereby giving it undue, uncritical emphasis and appearing to endorse it over the majority opinion, which, after all, contained the holding of the Joiner case. The authors give credence to the WOE argument in Joiner by suggesting that the majority had not adequately addressed it, and by failing to provide or cite any critical commentary on WOE.

Careful readers will be left wondering why their time is being wasted with the emphasis on a dissent that was never the law, that mischaracterized the majority opinion, that endorsed a method, WOE, that has been widely criticized, and that never persuaded any other justice to join.

The scientific community has never been seriously impressed by the so-called WOE approach to determining causality.  The phrase is vague and ambiguous; its use, inconsistent.[16] Although the phrase, WOE, is thrown around a lot, especially in regulatory contexts, it has no clear, consistent meaning or mode of application.[17]

Many lawyers, like Justice Stevens, Richter, and Capra, may feel comfortable with WOE because the phrase is used often in the law, where the subjectivity, vagueness, lack of structure and hierarchy to the metaphor “weighing” evidence is seen as a virtue that avoids having to worry too much about the evidential soundness of verdicts.[18] The process of science, however, is not like that of a jury’s determination of a fact such as who had the right of way in a car collision case. Not all evidence is the same in science, and a scientific judgment is not acceptable when it hangs on weak evidence and invalid inferences.

The lawsuit industry and its expert witnesses have adopted WOE, much as they have the equally vague term “link,” for its permissiveness toward causal inferences. WOE frees them from the requirement of any meaningful methodology, which means that any conclusion is possible, including their preferred conclusion. Under WOE, any conclusion can survive gatekeeping as an opinion. WOE frees the putative expert witness from the need to consider the quality of research. WOE-ful enthusiasts such as Carl Cranor invoke WOE, or seek to inflict WOE, without mentioning the crucial “nuts and bolts” of scientific inference, such as concepts of

  • Internal and external validity
  • A hierarchy of evidence
  • Assessment of random error
  • Assessment of known and residual confounding
  • Known and potential threats to validity
  • Pre-specification of end points and statistical analyses
  • Pre-specification of weights to be assigned, and inclusionary and exclusionary criteria for studies
  • Appropriate synthesis across studies, such as systematic review and meta-analysis

These important concepts are lost in the miasma of WOE.

If Richter and Capra wished to take a deeper dive into the Joiner case, rather than elevate the rank speculation of the lone dissenter, Justice Stevens, they might have asked whether Joiner’s expert witnesses relied upon all, or the most carefully conducted, epidemiologic studies.

As the record was fashioned, the Supreme Court’s discussion of the plaintiffs’ expert witnesses’ methodological excesses and failures did not include a discussion of why the excluded witnesses had failed to rely upon all the available epidemiology. The challenged witnesses relied upon an unpublished Monsanto study, but apparently ignored an unpublished investigation by NIOSH government researchers, who found that there were “no excess deaths from cancers of the … lung” among PCB-exposed workers at a Westinghouse Electric manufacturing facility. Indeed, the NIOSH report indicated a statistically non-significant decrease in the lung cancer rate among PCB-exposed workers, with a fairly narrow confidence interval; SMR = 0.7 (95% CI, 0.4–1.2).[19] By the time the Joiner case was litigated, this NIOSH report had been published, and it was unjustifiably ignored by Joiner’s expert witnesses.[20] Years after Joiner was decided in the Supreme Court, NIOSH scientists published updated data from this cohort, which showed that the long-term lung cancer mortality for PCB-exposed workers remained reduced, with a standardized mortality ratio of 0.88 (95% C.I., 0.7–1.1) for the cohort, and an even lower ratio, 0.82 (95% C.I., 0.5–1.3), for the workers with the highest levels of exposure.[21]

At the time the Joiner case was on its way up to the Supreme Court, two Swedish studies were available, but they were perhaps too small to add much to the mix of evidence.[22] Another study, published in 1987 and not cited by Joiner’s expert witnesses, was also conducted in a cohort of North American PCB-exposed capacitor workers, and showed less-than-expected mortality from lung cancer.[23] Joiner thus represents not only an analytical gap case, but also a cherry-picking case. The Supreme Court was eminently correct to affirm the exclusion of the shoddy evidence proffered in the Joiner case.

Nearly thirty years after the Supreme Court decided Joiner, the claim that PCBs cause lung cancer in humans remains unsubstantiated. Subsequent studies bore out the point that Joiner’s expert witnesses were using an improper, unsafe methodology and invalid inferences to advance a specious claim.[24] In 2015, researchers published a large, updated cohort study, funded by General Electric, on the mortality experience of workers in a plant that manufactured capacitors with PCBs. The study design was much stronger than anything relied upon by Joiner’s expert witnesses, and its results are consistent with the NIOSH study available to, but ignored by, them. The results are not uniformly good for General Electric, but on the end point of lung cancer for men, the standardized mortality ratio was 81 (95% C.I., 68–96), nominally statistically significantly below the expected SMR of 100.[25]

There is also the legal aftermath of Joiner, in which the Supreme Court reversed and remanded the case to the 11th Circuit, which in turn remanded the case back to the district court to address claims that Mr. Joiner had also been exposed to furans and dioxins, and that these other chemicals had caused, or contributed to, his lung cancer, as well.[26] 

Thus the dioxins were left in the case even after the Supreme Court ruled on the admissibility of expert witnesses’ opinions on PCBs and lung cancer. Anthony Roisman, a lawyer with the plaintiff-side National Legal Scholars Law Firm, P.C., argued that the Court had addressed an artificial question when asked about PCBs alone, because the case was really about an alleged mixture of exposures, and he held out hope that the Joiners would do better on remand.[27]

Alas, the Joiner case evaporated in the district court. In February 1998, Judge Orinda Evans, who had been the original trial judge, and who had sustained defendants’ Rule 702 challenges and granted their motions for summary judgment, received the case upon remand from the 11th Circuit and reopened it. Judge Evans set a deadline for a pre-trial order, and then extended the deadline at plaintiff’s request. After Joiner’s lawyers withdrew, and then their replacements withdrew, the parties ultimately stipulated to the dismissal of the case, with prejudice, in February 1999. The case had run its course, and so had the claim that dioxins were responsible for plaintiff’s lung cancer.

In 2006, the National Research Council published a monograph on dioxin, which took the controversial approach of focusing on all cancer mortality rather than specific cancers that had been suggested as likely outcomes of interest.[28] The validity of this approach, and the committee’s conclusions, were challenged vigorously in subsequent publications.[29] In 2013, the Industrial Injuries Advisory Council (IIAC), an independent scientific advisory body in the United Kingdom, published a review of lung cancer and dioxin. The Council found the epidemiologic studies mixed, and declined to endorse the compensability of lung cancer for dioxin-exposed industrial workers.[30]

Justice Stevens dissented in Joiner in 1997, and over the course of the three decades since, his assessments of science, scientific methodology, and law have been wrong. His viewpoints never gained acceptance from any other justice on the Supreme Court. Richter and Capra, in writing the first chapter of the new Reference Manual, lead judges and lawyers astray by improvidently elevating the dissent, as though it were law, and by failing to provide sufficient context, analysis, and criticism.

(To be continued.)


[1] Liesa L. Richter & Daniel J. Capra, The Admissibility of Expert Testimony, in National Academies of Sciences, Engineering, and Medicine & Federal Judicial Center, REFERENCE MANUAL ON SCIENTIFIC EVIDENCE 1 (4th ed. 2025).

[2] Id. at 6.

[3] 528 U.S. 440 (2000).

[4] 528 U.S. at 445 (internal citations omitted).

[5] Margaret A. Berger, The Admissibility of Expert Testimony, in National Academies of Sciences, Engineering and Medicine & Federal Judicial Center, REFERENCE MANUAL ON SCIENTIFIC EVIDENCE 11, 18-19 (3d ed. 2011).

[6] Richter & Capra at 17.

[7] See Nathan A. Schachtman, Rule of Evidence 703—Problem Child of Article VII, PROOF 3 (Spring 2009).

[8] Id. at 13.

[9] Id. at 16.

[10] Id. at 22-23.

[11] Id. at 23, 39.

[12] General Electric Co. v. Joiner, 522 U.S. 136 (1997).

[13] Richter & Capra at 10 (citing General Electric Co. v. Joiner, 522 U.S. 136, 150-155 (1997) (Stevens, J.)).

[14] Joiner, 522 U.S. at 147-48.

[15] Richter & Capra at 10, 31.

[16] See, e.g., V. H. Dale, G.R. Biddinger, M.C. Newman, J.T. Oris, G.W. Suter II, T. Thompson, et al., Enhancing the ecological risk assessment process, 4 INTEGRATED ENVT’L ASSESS. MANAGEMENT 306 (2008) (“An approach to interpreting lines of evidence and weight of evidence is critically needed for complex assessments, and it would be useful to develop case studies and/or standards of practice for interpreting lines of evidence.”); Igor Linkov, Drew Loney, Susan M. Cormier, F. Kyle Satterstrom & Todd Bridges, Weight-of-evidence evaluation in environmental assessment: review of qualitative and quantitative approaches, 407 SCI. TOTAL ENV’T 5199–205 (2009); Douglas L. Weed, Weight of Evidence: A Review of Concept and Methods, 25 RISK ANALYSIS 1545 (2005) (noting the vague, ambiguous, indefinite nature of the concept of WOE review); R.G. Stahl Jr., Issues addressed and unaddressed in EPA’s ecological risk guidelines, 17 RISK POLICY REPORT 35 (1998) (noting that U.S. EPA’s guidelines for ecological WOE approaches to risk assessment fail to provide meaningful guidance); Glenn W. Suter & Susan M. Cormier, Why and how to combine evidence in environmental assessments: Weighing evidence and building cases, 409 SCI. TOTAL ENV’T 1406, 1406 (2011) (noting arbitrariness and subjectivity of WOE “methodology”).

[17] See Charles Menzie, et al., “A weight-of-evidence approach for evaluating ecological risks; report of the Massachusetts Weight-of-Evidence Work Group,” 2 HUMAN ECOL. RISK ASSESS. 277, 279 (1996)  (“although the term ‘weight of evidence’ is used frequently in ecological risk assessment, there is no consensus on its definition or how it should be applied”); Sheldon Krimsky, “The weight of scientific evidence in policy and law,” 95 AM. J. PUB. HEALTH S129 (2005) (“However, the term [WOE] is applied quite liberally in the regulatory literature, the methodology behind it is rarely explicated.”).

[18] See, e.g., People v. Collier, 146 A.D.3d 1146, 1147-48, 2017 NY Slip Op 00342 (N.Y. App. Div. 3d Dep’t, Jan. 19, 2017) (rejecting appeal based upon defendant’s claim that conviction was against “weight of the evidence”); Venson v. Altamirano, 749 F.3d 641, 656 (7th Cir. 2014) (noting “new trial is appropriate if the jury’s verdict is against the manifest weight of the evidence”).

[19] Thomas Sinks, et al., Health Hazard Evaluation Report, HETA 89-116-209 (Jan. 1991).

[20] Thomas Sinks, et al., Mortality among workers exposed to polychlorinated biphenyls,” 136 AM. J. EPIDEMIOL. 389 (1992).

[21] Avima M. Ruder, et al., Mortality among Workers Exposed to Polychlorinated Biphenyls (PCBs) in an Electrical Capacitor Manufacturing Plant in Indiana: An Update, 114 ENVT’L HEALTH PERSP. 18, 21 (2006).

[22] P. Gustavsson, et al., Short-term mortality and cancer incidence in capacitor manufacturing workers exposed to polychlorinated biphenyls (PCBs), 10 AM. J. INDUS. MED. 341 (1986); P. Gustavsson & C. Hogstedt, A cohort study of Swedish capacitor manufacturing workers exposed to polychlorinated biphenyls (PCBs), 32 AM. J. INDUS. MED. 234 (1997) (cancer incidence for entire cohort, SIR = 86; 95% CI, 51–137).

[23] David P. Brown, Mortality of workers exposed to polychlorinated biphenyls – an update, 42 ARCH. ENVT’L HEALTH 333, 336 (1987).

[24] See Mary M. Prince, et al., Mortality and exposure response among 14,458 electrical capacitor manufacturing workers exposed to polychlorinated biphenyls (PCBs), 114 ENVT’L HEALTH PERSP. 1508, 1511 (2006) (reporting a nominally statistically significant decreased mortality ratio of 0.78, 95% C.I. 0.65–0.93, for men exposed to PCBs); Avima M. Ruder, Mortality among 24,865 workers exposed to polychlorinated biphenyls (PCBs) in three electrical capacitor manufacturing plants: a ten-year update, 217 INT’L J. HYG. & ENVT’L HEALTH 176, 181 (2014) (reporting no increase in the lung cancer standardized mortality ratio for long-term workers, 0.99, 95% C.I., 0.91–1.07).

[25] Renate D. Kimbrough, et al., Mortality among capacitor workers exposed to polychlorinated biphenyls (PCBs), a long-term update, 88 INT’L ARCH. OCCUP. & ENVT’L HEALTH 85 (2015).

[26] Joiner v. General Electric Co., 134 F.3d 1457 (11th Cir. 1998) (per curiam).

[27] Anthony Z. Roisman, The Implications of G.E. v. Joiner for Admissibility of Expert Testimony, 65 VT. J. ENVT’L L. 1 (1998).

[28] See David L. Eaton (Chairperson), HEALTH RISKS FROM DIOXIN AND RELATED COMPOUNDS – EVALUATION OF THE EPA REASSESSMENT (2006).

[29] Paolo Boffetta, et al., TCDD and cancer: A critical review of epidemiologic studies, 41 CRIT. REV. TOXICOL. 622 (2011) (“In conclusion, recent epidemiological evidence falls far short of conclusively demonstrating a causal link between TCDD exposure and cancer risk in humans.”).

[30] Industrial Injuries Advisory Council – Information Note on Lung cancer and Dioxin (December 2013). See also Mann v. CSX Transp., Inc., 2009 WL 3766056, 2009 U.S. Dist. LEXIS 106433 (N.D. Ohio 2009) (Polster, J.) (dioxin exposure case) (“Plaintiffs’ medical expert, Dr. James Kornberg, has opined that numerous organizations have classified dioxins as a known human carcinogen. However, it is not appropriate for one set of experts to bring the conclusions of another set of experts into the courtroom and then testify merely that they ‘agree’ with that conclusion.”), citing Thorndike v. DaimlerChrysler Corp., 266 F. Supp. 2d 172 (D. Me. 2003) (court excluded expert who was “parroting” other experts’ conclusions).

The Reference Manual’s Chapter on Expert Witness Testimony Admissibility – Part One

February 23rd, 2026

With the retraction of the climate science chapter, The Reference Manual on Scientific Evidence is now one chapter shorter, at least in the Federal Judicial Center’s version. At the time of this writing, for curious souls, the National Academies version is still sporting the climate advocacy chapter. Even without the climate chapter, the Manual is over 1,000 pages, and more than a casual weekend read. Many judges, finding this tome on their desks, will read individual subject-matter chapters pro re nata. The first chapter in the Manual, however, is about the law, not science, and might be the starting place for the ordinary workaday judge. As in past editions of the Manual, the new edition has a chapter on The Admissibility of Expert Testimony. In the first, second, and third editions, this chapter was written by Professor Margaret Berger. In the fourth edition, the chapter on the law was written by law professors Liesa Richter and Daniel Capra. To understand and evaluate the most recent iteration, the reader should have some sense of what has gone before.

Previous Chapters on Admissibility of Expert Witness Testimony

Professor Berger’s past chapters had been idiosyncratic productions.[1] Berger was an evidence law scholar, who wrote often about expert witness admissibility issues.[2] She was also known for her antic proposals, such as calling for abandoning the element of causation in products liability cases.[3] As an outspoken ideological opponent of expert witness gatekeeping, Berger was a strange choice to write the law chapter of the Manual.[4] Berger’s chapters in the first through the third editions made her opposition to gatekeeping obvious, and this hostility may have been responsible for some of the judicial resistance to applying the clear language of Rule 702, even after its 2000 revision.

Berger was not only a law professor; she was at the center of ideologically and financially conflicted groups that worked to undermine the application of Rule 702 in health-effects cases. One of the key players in this concerted action was David Michaels. Currently, Michaels teaches epidemiology at the George Washington University Milken Institute School of Public Health. He is a card-carrying member of the Collegium Ramazzini, an organization that has participated in efforts to corrupt state and federal judges by funding ex parte conferences with lawsuit industry expert witnesses.[5] Michaels is the author of two books, both highly anti-manufacturing industry and biased in favor of the lawsuit industry.[6] Both books are provocatively titled anti-industry diatribes, which have little scholarly value, but are used regularly by plaintiffs’ counsel solely to smear corporate defendants and defense expert witnesses. Most clear-eyed trial judges have quashed these efforts on various grounds, including Rule 703, because the books are not the sort of material upon which scientists would reasonably rely.[7]

In 2002, David Michaels created an anti-Daubert advocacy organization, the Project on Scientific Knowledge and Public Policy (SKAPP), with money siphoned from the plaintiffs’ common-benefit fund in MDL 926 (the silicone gel breast implant litigation).[8] Michaels lavished some of the misdirected money on preparing and publishing an anti-Daubert pamphlet for SKAPP, in 2003.[9] In this publication, and many others sponsored by SKAPP, Michaels and the SKAPP grantees typically acknowledged the source of SKAPP funding only obliquely, to hide that it was nothing more than plaintiffs’ counsels’ walking-around money:

“I am also grateful for the support SKAPP has received from the Common Benefit Trust, a fund established pursuant to a court order in the Silicone Gel Breast Implant Liability litigation.”[10]

Many credulous lawyers, judges, and legal scholars were duped into believing that SKAPP, SKAPP publications, and SKAPP-sponsored publications were supported by the Federal Judicial Center.

Michaels directed a good amount of SKAPP’s anti-Daubert funding to support Professor Berger’s efforts in organizing a series of symposia on science and the law. Several of Berger’s SKAPP conferences were held in Coronado, California, and featured a predominance of scientists who work for the lawsuit industry and are affiliated with advocacy organizations, such as the Collegium Ramazzini. The papers from one of the Coronado Conferences were published in a special issue of the American Journal of Public Health, the official journal of the American Public Health Association,[11] which has issued position papers highly critical of Rule 702 gatekeeping.[12]

The spider web of connections among SKAPP, the Collegium Ramazzini, the American Public Health Association, the Tellus Institute, the lawsuit industry, Professor Berger, and others hostile to Rule 702 is a testament to the concerted action to undermine the Supreme Court’s decisions in the area, and the codification of those decisions in Rule 702. That Professor Berger was within this web of connections while writing the chapter on the admissibility of expert witness opinion testimony, in the first three editions of the Reference Manual, explains but does not justify many of the opinions contained within those chapters.

Professor David Bernstein, who has written extensively on expert witness issues, restated the situation thus:

“In 2003, the toxic tort plaintiffs’ bar used money from a fund established as part of the silicone breast implant litigation settlement to sponsor four conferences in Coronado, California, that resulted in a slew of policy papers excoriating the Daubert gatekeeping requirement.”[13]

The active measures of these groups and Professor Berger explain the straight line between Berger’s symposia and the First Circuit’s decision in Milward v. Acuity Specialty Products Group, Inc.[14] Carl Cranor was one of the speakers at the Coronado Conferences, and along with Martyn Smith, another member of the Collegium Ramazzini, he founded a Proposition 65 bounty-hunting organization, the Council for Education and Research on Toxics (CERT). Cranor has long advocated a loosey-goosey “weight of the evidence” approach of the sort rejected by the Supreme Court in Joiner.[15] Cranor and Smith, unsurprisingly, turned up as expert witnesses for the plaintiff in Milward, in which case they reprised their weight-of-the-evidence opinions. When Milward appealed the exclusion of Cranor and Smith, CERT filed an amicus brief without disclosing that Cranor and Smith were founders of the organization, and that CERT funded Smith’s research through donations to his university, from CERT’s shake-down operations under Prop 65. The First Circuit’s 2011 decision in Milward resulted from a fraud on the court.

Professor Berger died in November 2010, but when the third edition of the Manual was released in 2011, it contained Berger’s chapter on the law of expert witnesses, with a citation to the Milward case, decided after her death.[16] An editorial note from an unnamed editor to her posthumous chapter suggested that

“[w]hile revising this chapter Professor Berger became ill and, tragically, passed away. We have published her last revision, with a few edits to respond to suggestions by reviewers.”

Given that Berger was an ideological opponent of expert witness gatekeeping, there can be little doubt that she would have endorsed the favorable references to Milward added after her passing, but adding them can hardly be considered non-substantive edits. Curious readers might wonder who the editor was who took such liberties in adding the chapter’s citations to Milward. Curious readers do not have to wonder, however, what would have happened if the incestuous relationships among Berger, SKAPP, the plaintiffs’ bar, and others had been replicated by similar efforts of manufacturing industry to influence the interpretation and application of the law. In 2008, the Supreme Court decided an important case involving constitutional aspects of punitive damages. The Court went out of its way to decline to rely upon empirical research that showed the unpredictability of punitive damage awards, because it was funded in part by Exxon:

“The Court is aware of a body of literature running parallel to anecdotal reports, examining the predictability of punitive awards by conducting numerous ‘mock juries’, where different ‘jurors’ are confronted with the same hypothetical case. See, e.g., C. Sunstein, R. Hastie, J. Payne, D. Schkade, & W. Viscusi, Punitive Damages: How Juries Decide (2002); Schkade, Sunstein, & Kahneman, Deliberating About Dollars: The Severity Shift, 100 Colum. L.Rev. 1139 (2000); Hastie, Schkade, & Payne, Juror Judgments in Civil Cases: Effects of Plaintiff’s Requests and Plaintiff’s Identity on Punitive Damage Awards, 23 Law & Hum. Behav. 445 (1999); Sunstein, Kahneman, & Schkade, Assessing Punitive Damages (with Notes on Cognition and Valuation in Law), 107 Yale L.J. 2071 (1998). Because this research was funded in part by Exxon, we decline to rely on it.”[17]

Unlike the situation with SKAPP, David Michaels, the plaintiffs’ bar, and Professor Berger, the studies sponsored in part by Exxon had disclosed their funding clearly. Those studies involved outstanding scientists whose integrity was unquestionable, and for its trouble, Exxon was rewarded with gratuitous shaming from Justice Souter. The anti-Daubert papers sponsored by the plaintiffs’ bar through SKAPP, and Professor Berger’s ideological conflicts of interest, have received a free pass. This disparate treatment of conflicts of interest within manufacturing industry and those within the lawsuit industry and its advocacy-group allies is a serious social, political, and legal problem. It was a problem on full display in the now-retracted climate science chapter in the Manual. In evaluating the new fourth edition’s chapter on the law of expert witness admissibility (and other chapters), we should be asking whether there are signs of undue political influence.


[1] See Schachtman, The Late Professor Berger’s Introduction to the Reference Manual on Scientific Evidence, TORTINI (Oct. 23, 2011).

[2] See generally Edward K. Cheng, Introduction: Festschrift in Honor of Margaret A. Berger, 75 BROOKLYN L. REV. 1057 (2010). 

[3] Margaret A. Berger, Eliminating General Causation: Notes towards a New Theory of Justice and Toxic Torts, 97 COLUM. L. REV. 2117 (1997).

[4] See, e.g., Margaret A. Berger & Aaron D. Twerski, “Uncertainty and Informed Choice:  Unmasking Daubert,” 104 MICH. L.  REV. 257 (2005). 

[5] In re School Asbestos Litig., 977 F.2d 764 (3d Cir. 1992). See Cathleen M. Devlin, Disqualification of Federal Judges – Third Circuit Orders District Judge James McGirr Kelly to Disqualify Himself So As To Preserve ‘The Appearance of Justice’ Under 28 U.S.C. § 455 – In re School Asbestos Litigation (1992), 38 VILL. L. REV. 1219 (1993); Bruce A. Green, May Judges Attend Privately Funded Educational Programs? Should Judicial Education Be Privatized?: Questions of Judicial Ethics and Policy, 29 FORDHAM URB. L. J. 941, 996-98 (2002).

[6] David Michaels, DOUBT IS THEIR PRODUCT: HOW INDUSTRY’S WAR ON SCIENCE THREATENS YOUR HEALTH (2008); David Michaels, THE TRIUMPH OF DOUBT (2020).

[7] See In re DePuy Orthopaedics, Inc. Pinnacle Hip Implant Prods. Liab. Litig., 888 F.3d 753, 787 n.71 (5th Cir. 2018) (advising the district court to weigh carefully whether Doubt is Their Product has any legal relevance); King v. DePuy Orthopaedics, Inc., 2024 WL 6953089, at *2 (D. Ariz. July 9, 2024) (finding Michaels’ books to be legally irrelevant); Sarjeant v. Foster Wheeler LLC, 2024 WL 4658407, at *1 (N.D. Cal. Oct. 24, 2024) (ruling that Doubt Is Their Product is legally irrelevant hearsay, and not the type of material upon which an expert witness would rely to form scientific opinion). See also Evans v. Biomet, Inc., 2022 WL 3648250, at *4 (D. Alaska Feb. 1, 2022) (quashing plaintiff’s subpoena to defendant’s expert for material in connection with Doubt Is Their Product).

[8] See Ralph Klier v. Elf Atochem North America Inc., 2011 U.S. App. LEXIS 19650 (5th Cir. 2011) (holding that district court abused its discretion in distributing residual funds from class action over arsenic exposure to charities; directing that residual funds be distributed to class members with manifest personal injuries). A “common benefit” fund is commonplace in multi-district litigation of mass torts.  In such cases, federal courts may require the defendant to “hold back” a certain percentage of settlement proceeds, to pay into a fund, which is available to those plaintiffs’ counsel who did “common benefit work,” work for the benefit of all claimants.  Plaintiffs’ counsel who worked for the common benefit of all claimants may petition the MDL court for compensation or reimbursement for their work or expenses.  See, e.g., William Rubenstein, On What a ‘Common Benefit Fee’ Is, Is Not, and Should Be, CLASS ACTION ATT’Y FEE DIG. 87, 89 (Mar. 2009).  In the silicone gel breast implant litigation (MDL 926), plaintiffs’ counsel on the MDL Steering Committee undertook common benefit work in the form of developing expert witnesses for trial, and funding scientific studies.  By MDL Orders 13, and 13A, the Court set hold-back amounts of 5 or 6%, and later reduced the amount to 4%.  Id. at 94.

[9] Eula Bingham, Leslie Boden, Richard Clapp, Polly Hoppin, Sheldon Krimsky, David Michaels, David Ozonoff & Anthony Robbins, Daubert: The Most Influential Supreme Court Ruling You’ve Never Heard Of (June 2003). The authors described the publication as a publication of SKAPP, coordinated by the Tellus Institute, and funded by The Bauman Foundation, a private foundation that supports “progressive social change advocacy.” Boden, Hoppin, Michaels, and Ozonoff are fellows of the Collegium Ramazzini.

[10] David Michaels, DOUBT IS THEIR PRODUCT: HOW INDUSTRY’S WAR ON SCIENCE THREATENS YOUR HEALTH 267 (2008). See Nathan Schachtman, “SKAPP A LOT,” TORTINI (April 30, 2010); “Manufacturing Certainty” TORTINI (Oct. 25, 2011); “David Michaels’ Public Relations Problem” TORTINI (Dec. 2, 2011); “Conflicted Public Interest Groups” TORTINI (Nov. 3, 2013).

[11] 95 AM. J. PUB. HEALTH S1 (2005).

[12] See, e.g., Am. Pub. Health Ass’n, Threats to Public Health Science, Policy Statement 2004-11 (Nov. 9, 2004), available at https://www.apha.org/policy-and-advocacy/public-health-policy-briefs/policy-database/2014/07/02/08/52/threats-to-public-health-science

[13] David E. Bernstein & Eric G. Lasker, Defending Daubert: It’s Time to Amend Federal Rule of Evidence 702, 57 WM. & MARY L. REV. 1, 39 (2015), available at https://scholarship.law.wm.edu/wmlr/vol57/iss1/2. See David Michaels & Neil Vidmar, Foreword, 72 LAW & CONTEMP. PROBS. i, ii (2009) (“SKAPP has convened four Coronado Conferences.”).

[14] Milward v. Acuity Specialty Products Group, Inc., 639 F.3d 11 (1st Cir. 2011), cert. denied sub nom., U.S. Steel Corp. v. Milward, 132 S. Ct. 1002 (2012).

[15] General Electric Co. v. Joiner, 522 U.S. 136, 136-37 (1997).

[16] Margaret A. Berger, The Admissibility of Expert Testimony, in National Academies of Sciences, Engineering and Medicine & Federal Judicial Center, REFERENCE MANUAL ON SCIENTIFIC EVIDENCE 11, 20 n.51, 23-24 n.61 (3d ed. 2011).

[17] Exxon Shipping Co. v. Baker, 554 U.S. 471, 128 S. Ct. 2605, 2626 n.17 (2008).

The FJC Retracts Climate Science – Postscriptum

February 11th, 2026

The version of the Reference Manual on the NASEM website still has the climate science chapter.

The FJC website has a version without the climate science chapter. There is a note that the chapter was removed on February 6, 2026.

Perhaps the two organizations should talk?

The FJC retraction has been covered by many media outlets, but the failure of NASEM to act has not been reported, as far as I can see.[1]


[1] See, e.g., Nate Raymond, US judiciary scraps climate chapter from scientific evidence manual, REUTERS (Feb. 9, 2026)

The FJC Retracts Climate Science Chapter in New Reference Manual

February 10th, 2026

When the new, fourth edition of the Reference Manual on Scientific Evidence was released late last year,[1] I remarked that there were some new chapters,[2] including one on climate change. I found the addition of a chapter on climate change curious, largely because I was unfamiliar with the science or the need to address the area for federal judges, and because I thought there were other, more pressing topics, such as genetic causation, from which judges could benefit, but which were not included.

I confess that I did not read the new chapter on climate change,[3] which is not a subject that comes up in my practice or in my writing. Writers at the National Review, however, did read the chapter on climate, and found it objectionable. Writing on January 17th of this year, Michael Fragoso observed that the chapter on climate science was an advocacy piece that would resolve climate change litigation in favor of plaintiffs.[4]

If Fragoso’s charge is correct, the implications are extremely serious. Judges have an ethical obligation not to go beyond the adversary process to educate themselves about the factual issues before them in pending litigation. In the past, judges who have done so have found themselves on the wrong end of a petition for a writ of mandamus, and have been disqualified and removed from cases.[5] The Federal Judicial Center (FJC), which is the research and educational division of the federal courts, has tried to create a safe space for teaching judges about technical subjects that arise in litigation, in a way that is balanced and removed from partisan advocacy. The last edition, the third, and the current edition, the fourth, of the Manual have been the joint product of both the FJC and the National Academies of Sciences, Engineering, and Medicine (NASEM), in the hope of producing disinterested tutorials on key areas of science that are important to judges in their adjudication of civil and criminal cases, as well as in their judicial review of regulation and agency action.

Following up on the National Review article, on January 29, 2026, the Attorneys General of 24 states[6] wrote a letter to Judge Robin Rosenberg, the director of the Federal Judicial Center. The letter identified the advocacy perspective of the climate chapter and its authors, who wrote what the Attorneys General described as an amicus brief that placed a thumb on the scales of justice, with respect to issues currently pending at all levels of the federal courts. The Attorneys General requested the immediate withdrawal of the offending chapter.

Judge Rosenberg is a savvy judge of scientific horse flesh. She presided over the Zantac multi-district litigation (MDL No. 2924), in which she excluded plaintiffs’ expert witnesses in a detailed, analytically careful opinion of over 300 pages.[7] On February 6, a week after the request to withdraw the climate chapter was made, Judge Rosenberg wrote to West Virginia Attorney General John McCuskey to report that the chapter had been omitted.[8] Given the prompt response from Judge Rosenberg, the decision was likely not a difficult one. A decision not to include the chapter, as written, in the first place would have been an even easier one.

Retractions of publications of the NASEM, which includes what was formerly the Institute of Medicine, are rarer than hens’ teeth. This one received coverage and some intense harrumphing.[9] The retraction of the climate science chapter comes on the heels of a high-profile retraction, in December 2025, of an article in the prestigious journal Nature,[10] which argued that the costs of climate change would reach $38 trillion a year by 2049.[11]

The climate science chapter appears to be the outcome of what the late Daniel Kahneman called poor decision hygiene. The chapter in question had two authors, both from the same institution; they had published together and shared the same advocacy perspectives on climate change. Hardly a team of rivals. The editors of the Manual certainly could have done better in selecting these authors and in editing their work product.

Jessica Wentz is a Non-Resident Senior Fellow at the Sabin Center for Climate Change Law, at Columbia Law School. The Sabin Center’s website describes the Center as “develop[ing] legal techniques to combat the climate crisis and advance climate justice, and train the next generation of leaders in the field.” The language of “combat” and “crisis” certainly suggests a hardened, adversarial stance. Wentz’s writings reveal her advocacy and adversarial positions.[12]

Radley Horton is a Professor at Columbia University’s Climate School. He describes his research as focusing on climate extremes and related topics. Horton’s curriculum vitae, social media, testimony,[13] and professional work certainly mark him as an advocate for “attribution science” in litigation to address climate crises. Horton and Wentz previously published a law review article that reads like a brief for plaintiffs’ positions in climate litigation.[14] One of the key issues in climate litigation is whether litigation is an appropriate avenue for addressing climate issues, and Horton and Wentz have both clearly committed to endorsing litigation strategies, and the plaintiffs’ positions to boot.

This kerfuffle at the FJC and NASEM has a larger meaning. There is a glib assumption afoot that the only conflicts of interest that matter are ones attributed to industrial stakeholders and their scientific supporters. This naïve view was attacked and debunked back in 1980 by Sir Richard Peto, writing in the pages of Nature. Sir Richard noted that whereas industry may downplay risks, “environmentalists usually exaggerate the likely hazards and are largely indifferent to the costs of control.” Positional conflicts can be, and often are, more powerful than the ones created by profit.[15]


[1] National Academies of Sciences, Engineering, and Medicine & Federal Judicial Center, REFERENCE MANUAL ON SCIENTIFIC EVIDENCE (4th ed. 2025) (cited as RMSE 4th ed.).

[2] Nathan Schachtman, A New Year, A New Reference Manual, in TORTINI (Jan. 5, 2026).

[3] Jessica Wentz & Radley Horton, Reference Guide on Climate Science, RMSE 4th ed.

[4] Michael A. Fragoso, Bias and the Federal Judicial Center’s ‘Climate Science’, NAT’L REV. (Jan. 17, 2026). Fragoso also took umbrage at the use of the silly phrase “pregnant people” elsewhere in the Manual. RMSE 4th ed. at 84.

[5] In re School Asbestos Litigation, 977 F.2d 764 (3d Cir. 1992). See Cathleen M. Devlin, Disqualification of Federal Judges – Third Circuit Orders District Judge James McGirr Kelly to Disqualify Himself So As To Preserve ‘The Appearance of Justice’ Under 28 U.S.C. § 455 – In re School Asbestos Litigation (1992), 38 VILL. L. REV. 1219 (1993).

[6] Alabama, Alaska, Arkansas, Florida, Georgia, Idaho, Indiana, Iowa, Missouri, Montana, Nebraska, New Hampshire, North Dakota, Ohio, Oklahoma, Pennsylvania, South Carolina, South Dakota, Tennessee, Texas, Utah, West Virginia, and Wyoming.

[7] In re Zantac (Ranitidine) Prods. Liab. Litig., 644 F. Supp. 3d 1075 (S.D. Fla. 2022).

[8] Hon. Robin Rosenberg, Letter in Response to Attorneys General (Feb. 6, 2026).

[9] Editorial Board, A Failed Climate Coup in the Courts, WALL ST. J. (Feb. 9, 2026); Charles Creitz, Judicial research center cuts climate section from judges’ manual, FOX NEWS (Feb. 9, 2026); Suzanne Monyak, Judiciary Cuts Climate Part of Science Manual after Backlash, BLOOMBERG LAW (Feb. 9, 2026).

[10] Maximilian Kotz, Anders Levermann & Leonie Wenz, The economic commitment of climate change, 628 NATURE 551 (2024) (retracted on Dec. 3, 2025).

[11] Authors retract Nature paper projecting high costs of climate change, RETRACTION WATCH (Dec. 3, 2025).

[12] Michael Burger, Jessica Wentz & Daniel Metzger, Climate science in rights-based advocacy contexts (June 28, 2020).

[13] Written Testimony of Radley Horton, Lamont Associate Research Professor, Columbia University, before the Committee on Science, Space, and Technology Subcommittee on Environment Sea Change: Impacts of Climate Change on Our Oceans and Coasts (Feb. 27, 2019).

[14] Michael Burger, Radley M. Horton & Jessica Wentz, The Law and Science of Climate Change Attribution, 45 COLUM. J. ENVTL. L. 57 (2020).

[15] Richard Peto, Distorting the epidemiology of cancer: the need for a more balanced overview, 284 NATURE 297, 297 (1980).

The 4th Reference Manual’s Treatment of Genetic Causes of Disease

January 23rd, 2026

After checking to see whether the new Reference Manual on Scientific Evidence[1] attended to some long overdue corrections, I turned my attention to the substance of the chapter on epidemiology. A cursory comparison between the third[2] and fourth[3] editions of the epidemiology chapter in the Reference Manual reveals a lot of carryover from the third edition, some change in authorship, and at least one interesting change.

The two lawyer authors, Steve Gold and Michael Green, remain, but the authors with a reasonable pretense to subject-matter expertise have changed. Gold and Green are both law professors with a long history of commenting on American tort and evidence law. Both are aligned with the lawsuit industry. The previous epidemiology authors, Daryl Michal Freedman and Leon Gordis, are now gone from the chapter. Leon Gordis, who had been a chairman of the department of epidemiology in the Bloomberg School of Public Health, Johns Hopkins University, died in September 2015, after the third edition was published. Daryl Michal Freedman, who had been the other subject-matter expert on the third edition’s chapter on epidemiology, has been an epidemiologist with the Biostatistics Branch of the National Cancer Institute for many years. It is not clear why Freedman left the project.

Replacing Gordis and Freedman are Jonathan Chevrier and Brenda Eskenazi. Chevrier is an associate professor in the department of epidemiology, in the faculty of medicine, at McGill University. The focus of his work is on “common environmental contaminants” and their role in the development and health of children. Brenda Eskenazi is professor emerita in the University of California Berkeley School of Public Health, where she is the Director of the Center for Environmental Research and Children’s Health. Eskenazi is a member of a dodgy group known as the Collegium Ramazzini, which was responsible for staging an ex parte presentation of plaintiffs’ expert witnesses to judges presiding in asbestos litigation.[4] Eskenazi was not, however, a member of the Collegium at the time the group conspired with the late Irving Selikoff to pervert the course of justice in American asbestos litigation.

The second significant change is substantive; the fourth edition has added a new subsection to the epidemiology chapter. Comparing the texts of the third and fourth editions of this chapter reveals a new subheading in the new edition:[5]

Genetic and Molecular Epidemiologic Studies

Alas, there is not much substance to the new subsection, which runs fewer than four pages. Lawyers in the trenches might well have hoped for a more substantive treatment of genetic epidemiology and genetic causation. The chapter’s authors explain their abbreviated treatment with the comment:

“Although commentators have long forecast that the output of genetic and molecular epidemiology would revolutionize causal proof, as of this writing few judicial opinions have addressed these types of studies, and it is far from clear that a revolution is in the offing.”[6] 

The chapter authors are correct that some authors in the past proffered unrealistic predictions of how genetics would supplant correlational studies. Nonetheless, this area has not been as quiescent as the authors’ parsimonious treatment would suggest.

On the question of how prevalent genetic causation issues are, whether raised by plaintiffs or defendants, the chapter might have benefitted from the contributions of a practicing lawyer. Genetic issues come up with some frequency in the litigation of cases involving mesothelioma. The days of plaintiffs who had 30 years of amphibole asbestos exposure in the workplace are largely over. Today’s cases involve little to no exposure, and it stands to reason that the origins of the recently diagnosed cases are different from those diagnosed in the 1970s and 1980s.[7] Genetic causation of mesothelioma is a salient current issue that is passed over in this new Reference Manual.

The authors acknowledge a single birth defects case in which genetic causation was litigated,[8] which was already old news when the last edition of the Manual was published. There are now many more reported cases that cry out for discussion in this under-covered area of the Manual.[9] There are also many unreported cases that have turned on genetic issues. For instance, in some cancer and birth defect cases, the existence of a highly penetrant genetic mutation that could completely explain the occurrence of a disease raises a serious question whether a plaintiff who fails to test for the mutation can possibly have carried his burden of proof.[10] And then there are myriad cases in which the parties have engaged in motion practice, sometimes extended, over access to genetic testing materials.

Genetic issues have arisen in the litigation of high-profile general causation disputes. For instance, the failure to control for genetic effects in epidemiologic studies was a significant issue in the acetaminophen-autism litigation, with both sides presenting geneticists to explain whether the relevant studies were undermined by failure to control for genetic effects.[11]

In the Manual’s epidemiology chapter’s new section on genetics, the authors describe some basic terms and explain that genetic epidemiology may provide evidence for, or against, claims of health effects. The authors’ views come through most clearly in the following short passage:

“Alternatively, genetic epidemiology may reveal associations between genetic variations and a plaintiff’s disease, raising the issue of whether or not a genetic variation may be a competing cause of the disease. This requires assessment of whether the gene–disease association is causal in a general sense, whether it acts independently of the exposure, and whether it is a competing cause in the plaintiff’s specific instance. The extreme, though not typical, example would be a health outcome or disease entirely determined by genetics,55 as is the case with sickle cell anemia.56”[12]

The authors never explain or defend their claim that cases involving diseases caused entirely by genetics are “extreme” and “not typical.” At several points, the authors emphasize that gene-environment interactions are the more prevalent determinants of diseases.[13] If we were to catalog the currently known genetic determinants of diseases, the authors may be correct on a percentage basis, but the issue in any given case is whether the disease or harm claimed by the plaintiff is one of the “extreme” cases of complete genetic causation, or an instance of genetic susceptibility. The authors’ generalization, even if it were correct, would not be very helpful or informative for any specific case.

Perhaps even more important for lawyers, there is a substantive issue on which the new chapter manages to provide confusing guidance. The epidemiology chapter appears to create a false dichotomy between rare, highly penetrant genetic mutations that are uncommon causes of certain diseases, and the more prevalent genetic mutations and polymorphisms that leave persons more susceptible to the deleterious effects of exogenous exposures to toxic chemicals.[14] There is, however, another scenario omitted in the chapter’s discussion of genetic causation. Genetic mutations and polymorphisms may leave persons susceptible to normal, endogenous chemicals, stochastic cellular events, and biological processes that result in diseases such as cancers. In other words, the knee-jerk reflex to invoke exogenous, external toxic chemical exposures promotes a false dichotomy and obscures the obvious implication that susceptibility mutations and polymorphisms may lead to cancer without environmental exposures to harmful chemicals.[15]

The number of endogenous events leading to DNA alterations is enormous, and requires us to rethink the mantra that attributes chronic diseases to gene-environment interaction. At the very least, we need to stop thinking of the “environment” solely as chemical exposures from outside ourselves. The epidemiology chapter’s authors, like many writers, point to external chemical exposures as the culprits in gene-environment interactions, but they ignore the normal, endogenous events that lead to DNA damage, for which genetic susceptibility may be relevant. Mutations that increase susceptibility to cancer may affect DNA alterations arising from endogenous and metabolic factors as well as from exposures to external chemicals.

Ignorance is never a good thing, and the chapter does the bar and bench a disservice in not adequately exploring genetic susceptibility in view of both exogenous and endogenous exposures that may be responsible for chronic diseases, such as cancers.


[1] National Academies of Sciences, Engineering, and Medicine & Federal Judicial Center, REFERENCE MANUAL ON SCIENTIFIC EVIDENCE (4th ed. 2025) (cited as RMSE 4th ed.)

[2] Michael D. Green, D. Michal Freedman & Leon Gordis, Reference Guide on Epidemiology, 549, in RMSE 3rd ed.

[3] Steve C. Gold, Michael D. Green, Jonathan Chevrier, & Brenda Eskenazi, Reference Guide on Epidemiology, in RMSE 4th ed.

[4] See In re School Asbestos Litigation, 977 F.2d 764 (3d Cir. 1992). See also Cathleen M. Devlin, Disqualification of Federal Judges – Third Circuit Orders District Judge James McGirr Kelly to Disqualify Himself So As To Preserve ‘The Appearance of Justice’ Under 28 U.S.C. § 455 – In re School Asbestos Litigation (1992), 38 VILL. L. REV. 1219 (1993); Bruce A. Green, May Judges Attend Privately Funded Educational Programs? Should Judicial Education Be Privatized?:  Questions of Judicial Ethics and Policy, 29 FORDHAM URB. L. J. 941, 996-98 (2002).

[5] Steve Gold, et al., Reference Guide on Epidemiology, at 914, in RMSE 4th ed.

[6] Id. at 916.

[7] ToxicoGenomica, The Litigator’s Guide to Using Genomics in a Toxic Tort Case (2018).

[8] Id. at 917 & n.55 (citing Bowen v. E.I. Du Pont de Nemours & Co., No. CIV.A. 97C-06-194 CH, 2005 WL 1952859 (Del. Super. Ct. June 23, 2005), aff’d, 906 A.2d 787 (Del. 2006) (discussing the importance of a test for a genetic mutation, which was the defense’s alternative causation theory to plaintiff’s claim that a toxic exposure caused the birth defect at issue)). The authors fail to mention that the Bowen case was actually dismissed.

[9] See, e.g., Oliver v. Sec’y Health & Human Servs., 900 F.3d 1357 (Fed. Cir. 2018); Ortega v. United States, 2021 WL 4477896, 2021 U.S. Dist. LEXIS 188969 (N.D. Ill. Sept. 30, 2021); Vanslembrouck ex rel. Braverman v. Halperin, 2014 WL 5462596 (Mich. App. 2014).

[10] See, e.g., Halter v. Boehringer Ingelheim Pharms. Inc., No. 2023-L-001382 (Cir. Ct. Cook Cty., Ill. Aug. 27, 2025) (defense jury verdict in colorectal cancer case in which plaintiff failed to test for genetic mutation); see also Lauraann Wood, Boehringer Wins Another Zantac Cancer Trial In Illinois, LAW360, Chicago (Aug. 27, 2025).

[11] See, e.g., In re Acetaminophen – ASD-ADHD Prods. Liab. Litig., 707 F. Supp. 3d 309, 320 (S.D.N.Y. 2023).

[12] Id. at 916-17 (emphasis added).

[13] Id. at 915.

[14] See, e.g., id. at 967 n.190, citing McMillan v. Dep’t of Veterans Affairs, 294 F. Supp. 2d 305, 312 (E.D.N.Y. 2003) (“It is generally accepted that genetic susceptibility plays a key role in determining the adverse effects of environmental chemicals. . . . [I]f polymorphisms of the gene encoding the AhR [protein] exist in humans as they do in laboratory animals, some people would be at greater risk or at lesser risk for the toxic and carcinogenic effects of TCDD [dioxin].”).

[15] See Edward J. Calabrese, Changing the paradigm: The biggest polluter and threat to your health is your body, J. OCCUP. & ENVT’L HYG. (2025), published online ahead of print.