TORTINI

For your delectation and delight, desultory dicta on the law of delicts.

How Science Works in the New Reference Manual on Scientific Evidence

March 12th, 2026

The Second and Third Editions of the Reference Manual on Scientific Evidence contained a chapter, “How Science Works,” by Professor David Goodstein. This chapter ambitiously set out to cover philosophy and sociology of science to help orient judges as strangers in a strange land. Goodstein’s chapter had been a useful introduction to scientific methodology, and it countered some of the antic ideas seen in some judicial opinions, as well as in some other chapters of the Manual. Goodstein brought a good deal of experience and expertise to the task. He was a distinguished professor of physics and Vice Provost at the California Institute of Technology, and he had written engagingly about scientific discovery and the pathology of science.[1] Sadly, Goodstein died in April 2024. His death may have had some role in the delayed publication of the Fourth Edition of the Manual,[2] and the improvident replacement of his chapter with a new chapter written by authors less articulate about how science works.

The substitute chapter on “How Science Works” was written by two authors considerably less accomplished than the late Professor Goodstein.[3] Michael Weisberg is a professor of philosophy at the University of Pennsylvania, where he is the deputy director of Perry World House, which “analyzes global policy challenges through the realms of climate, democracy, global justice and human rights, and security.” The connection with Perry World House may explain the new chapter’s heavy reliance upon the development of the chlorofluorocarbon (CFC) connection to ozone layer depletion as an exemplar of scientific discovery and knowledge. The University of Pennsylvania webpage describes Weisberg as “educat[ing] the next generation of environmental leaders in the classroom, at the negotiating table, and in the field, ensuring that their voices have maximal impact on addressing the climate crisis.”[4] So we have a philosopher of advocacy science, as it were. Some readers might think those credentials are not optimal for preparing a nuts-and-bolts description of how science works. Reading sections of the new chapter will not diminish their concerns.

Joining Weisberg on this new version of “How Science Works” is Anastasia Thanukos, who works at the University of California Museum of Paleontology. Thanukos has her master’s degree in integrative biology, and her doctorate in science education.[5]

The new “method” chapter has some virtues. As did Goodstein’s chapter, the new authors put peer review into a realistic perspective that should keep judges from being snookered into admitting weak or bogus evidence because it had been published in a peer-reviewed journal.[6] The authors should have gone much farther in pointing out that the rise of predatory and pay-to-play journals, as well as journals controlled by advocacy groups, has undermined much of the publishing model of modern science.

Weisberg and Thanukos discuss “expertise” in a way that is interesting but irrelevant to legal cases. They seem blithely unaware that the standard for qualifying an expert witness is extremely low. Who will disabuse them when they argue that “[i]t is worth evaluating the closeness of a scientist’s disciplinary expertise to a scientific topic on which expert testimony is delivered”?[7] In what emerges as a consistent pattern of giving anti-manufacturing industry examples, the authors point to Richard Scorer as an accomplished scientist, who had no specific expertise in CFC ozone depletion. Notwithstanding the lack of specific expertise, an industry-backed group promoted Scorer’s views that criticized the CFC-ozone depletion hypothesis.[8] Citing Naomi Oreskes, the new Manual chapter states that “[t]he problem of scientists with legitimate expertise in one field weighing in on a scientific question outside their area of expertise is a pernicious one that has affected public acceptance of science and policy on issues such as climate change and tobacco exposure.”[9] Later, when Weisberg and Thanukos discuss the Milward case, they miss the pernicious influence that flowed from allowing Martyn Smith, a toxicologist, to give methodologically muddled opinion testimony on epidemiology. Pernicious is where you find it, and the authors of the new chapter find virtually all untoward instances of poor scientific method and conduct to originate from manufacturing industry.

Weisberg and Thanukos introduce a discussion of the “replication crisis,” a phrase and concept absent from the third edition of the Reference Manual.[10] The authors express some skepticism that there is an actual crisis over replication,[11] but their focus on climate science may mean that they are simply blinded by groupthink in that discipline. Their discussion of retractions omits the steep rise in retraction rates in most scientific disciplines,[12] and the authors ignore the proliferation of poor quality journals. Positively, the authors introduce a discussion of study preregistration, a notion absent from the third edition of the Manual, and they explain that such preregistration may serve as a bulwark against data dredging post hoc analyses.[13] Negatively, the authors ignore how frequently preregistered protocols are not used, or are used and then violated.

Weisberg and Thanukos appropriately ignore “weight of the evidence” (WOE) and “inference to the best explanation” (IBE). Readers might (mistakenly) think that the new chapter implicitly rejects WOE, as put forth by Carl Cranor and credulously accepted by the First Circuit in Milward, when the chapter authors insist that 

“the judge’s task requires a deeper examination of the available evidence and methods by which it was arrived at, as well as an assessment of how the community of experts in this area has evaluated or would evaluate the evidence and reasoning in question.”[14]

Contrary to the Milward decision from 2011, the new authors are not shy about stating the obvious: there is good science, and there is bad science. Not all “judgment” about causality is acceptable and fit for submission to juries.[15] Given the judicial resistance to Rule 702, the obvious here requires stating. Weisberg and Thanukos acknowledge that some scientific judgment is unreliable or invalid because it was based upon work that was not carried out in accordance with current standards for scientific investigation and inference.[16] It should not surprise anyone that most of their examples of bad science are the product of manufacturing industry; the authors are oblivious to bad science sponsored by the lawsuit industry or by non-governmental advocacy organizations (NGOs).

Weisberg and Thanukos frame scientific disagreements and debates as governed by both data and ethical norms. Science is not infinitely contestable. There are identifiable norms, including a norm that scientists should “seek relevant information,” and “scrutinize ideas and evidence.”[17] Contrary to Milward’s standard of judicial abstention and credulity in the face of dodgy causal claims, these authors state what should be obvious, that scientific scrutiny involves, among other things, “an evaluation of methods, considering potential biases and oversights.”[18]

The chapter’s authors, non-lawyers, get closer to the heart of the error in Milward’s abstention doctrine with their recognition of what should have been obvious to the authors of the law chapter (Richter & Capra):

“When research relevant to a trial has not yet been scrutinized by a community with the appropriate technical expertise, a judge may be placed in the position of providing or requesting this scrutiny.”[19]  

Rather than some vague, subjective, and content-free WOE standard, Weisberg and Thanukos urge scientists, and by implication judges as well, to engage in serious efforts to “identify and avoid bias” and abide by ethical guidelines.[20] In other (my) words, the new authors agree that there is a standard of care reflected in the norms of science, and consequently there can be deviations from that standard. For Weisberg and Thanukos, compliance with the normative structure of scientific investigations is at the heart of building up accurate and predictive conclusions from data.[21] As part of their communitarian and normative conception of the scientific process, the authors appear to accept the reality and necessity for judges to act as gatekeepers.[22]

And while this recognition of standards and the need to police against deviations from standards is commendable, Weisberg and Thanukos proceed to give an abridgment of scientific method and process that is distorted and erroneous. They steadfastly ignore the concept of hierarchy of evidence, and thus provide illegitimate cover for levelers of evidence. In discussing randomized controlled trials, for instance, they note that such trials are often taken as “the gold standard,” but then they counter, without citation, support, or argument, that such trials “are just one line of evidence among many.”[23] The authors elide discussion and reconciliation of when that “just one line of evidence” conflicts with observational studies.

Notwithstanding their helpful comments about the need to evaluate studies for bias and other errors, these authors enter the Milward controversy by observing that assessing many lines of evidence is required, can be difficult for courts, and has led to “controversy.” Citing papers, including one by the late Margaret Berger presented at the notorious lawsuit-industry, SKAPP-funded Coronado Conference, Weisberg and Thanukos float the observation that:

“In science, the available evidence (some of which may come from other research programs not designed to test the hypothesis under consideration) is evaluated as a body, along with the strengths, weaknesses, and caveats relating to each type of data, an approach which, some scholars have argued, the judiciary has not always followed.”[24]

This claim that the available evidence is evaluated as “a body” is presented as a fact about how science works, without any citation or argument. Several comments are in order. First, the claim is at odds with the authors’ own statements that scientific norms require evaluating each study for biases and other disqualifying flaws. Second, the claim is at odds with the authors’ own reference to systematic reviews and meta-analyses,[25] which are governed by protocols with inclusionary and exclusionary criteria for individual studies, and which require consideration of individual study validity before it enters the “body” of evidence that is quantitatively or qualitatively evaluated. In the authors’ words, “authors delineate both the criteria that studies must meet for inclusion in the review and the methods that will be used to assess the studies.”[26] The Milward case involved an expert witness who had proffered the very opposite of a systematic review in the form of post hoc rejiggering of studies and their data to fit a pre-conceived litigation goal. In the context of addressing the replication crisis, Weisberg and Thanukos correctly observe “peer review alone cannot ensure that the conclusions of published studies are actually correct, highlighting the responsibility judges bear in evaluating the validity of the methodologies that contributed to a particular piece of research.”[27] Of course, the Milward case involved a hired expert witness whose unprincipled re-analysis of studies was never peer reviewed or published.

Third, the authors could easily have found additional support for the contrary proposition that individual studies must be evaluated before being considered as part of the entire evidentiary display. The IARC Preamble, which roughly describes how that agency arrives at its so-called hazard classifications of human carcinogenicity, specifies that individual studies within each of three streams of evidence are evaluated for validity and soundness before contributing to a sub-conclusion with respect to (1) epidemiology, (2) toxicology, and (3) mechanistic lines of evidence.[28] Each of those three lines of evidence is adjudged “sufficient,” “limited,” or “inadequate,” by specialists in the three respective areas, before an overall evaluation is reached. There is much that is objectionable in the IARC working group procedures, but this division of labor and the need to consider disparate lines of evidence and studies within each line separately before attempting a synthesis, is present in all systematic review methodology. The suggestion from Weisberg and Thanukos that “the available evidence” in science is “evaluated as a body” is not only unsupported, but it is demonstrably false and misleading.

This claim about holistic evaluation is a fairly transparent but failed attempt to support a claim made in the chapter on the admissibility of expert witness evidence by Liesa Richter and Daniel Capra, who present an exposition of the notorious Milward case, without criticism, in a way to suggest that the case represents appropriate judicial gatekeeping under Rule 702, and that the case is consistent with scientific norms.[29] The chapter on how science works, after having stated a false claim about scientific methodology for synthesis and integrating disparate lines of evidence, attempts to provide a gloss on the similar and equally benighted claim of Richter and Capra, in footnote 98:

“98. Some scholars have raised concerns that the courts have on occasion unfairly dismissed numerous individual lines of evidence as being flawed or insufficiently conclusive and concluded that evidence is lacking, when in fact the body of evidence, taken as a whole, points to a clear conclusion. For more, see discussion of Milward v. Acuity Specialty Products Group, Inc.; see also Liesa L. Richter & Daniel J. Capra, The Admissibility of Expert Testimony, in this manual; Berger 2005, supra note 97; and Steve C. Gold, A Fitting Vision of Science for the Courtroom, 3 Wake Forest J.L. & Pol’y 1 (2013).”

Some “scholars” have indeed said such things in their more unscholarly moments; some scholars have criticized Milward, but they are not cited in this new methods chapter. The footnote is accurate, but highly misleading by omission. The First Circuit in Milward also said as much, also without support or justification, and Richter and Capra, in their chapter of the Manual, fourth edition, parrot the Milward case. Weisberg and Thanukos cite to two articles, by Margaret Berger and by Steven Gold, both law professors, not scientists, and both ideologically hostile to Rule 702 gatekeeping. The Berger article was from a lawsuit-industry, SKAPP-funded symposium known as the Coronado Conference, and the Gold paper comes out of a symposium sponsored by the lawsuit industry itself and the Center for Progressive Reform, an advocacy NGO to which one of Mr. Milward’s expert witnesses, Carl Cranor, belongs. So the authors of the new science methodology chapter failed to cite any scientific source, but cited to papers by lawyers in the capture of the lawsuit industry, and a single (infamous) decision that ignored Rules 702 and 703, as well as the extensive literature on systematic reviews. Weisberg and Thanukos could have cited many sources that contradicted their claim, and the claim of the lawsuit industry sponsored lawyers, but they did not. This is what biased and subversive scholarship looks like.

Funding Bias – The New McCarthyism

The selective citation to articles sponsored by the lawsuit industry is ironic in the context of what Weisberg and Thanukos have to say elsewhere about the “funding effect.” Some of what the authors say about personal bias is almost reasonable. For instance, they suggest that funding source is a “valid consideration” in evaluating methodologies and conclusions of expert testimony, and presumably of published studies as well, but not a sufficient reason to exclude such testimony or reliance.[30] Interestingly, these authors ignored the funding and the ideological interests of the symposia they cited in support of the repudiated Milward abstention doctrine.

Over three decades ago, Kenneth Rothman, the founder of Epidemiology, the official journal of the International Society for Environmental Epidemiology (ISEE), wrote his protest against the obsession with funding in an article that should have been cited in the new chapter, for balance. Rothman described the fixation on funding as the “new McCarthyism in science,” which manifested as intolerance toward industry-sponsored studies, and strict scrutiny of “conflict-of-interest” (COI) disclosures.[31] The new McCarthyites amplify the gamesmanship over COI disclosures by excusing or justifying non-disclosure of COIs from scientists who have positional conflicts, or who are aligned with advocacy groups or with the lawsuit industry.

This asymmetrical standard for adjudging conflicts is on full display in the Weisberg and Thanukos chapter, when they claim that “in pharmaceuticals, there is a strong tendency for industry-sponsored trials to favor the industry’s product.”[32] The chapter authors, and their cited source, ignore the context in which the pharmaceutical industry scientists publish clinical trial results. A successful clinical trial that showed efficacy with minimal adverse events is the result of years of prior research, including phase I and II trials, and preclinical testing. If the research fails to show efficacy, or shows unreasonable harm, in any of this prior research, the phase III trial is never done and so never published. If the medication is never licensed, the phase III trial will generally not be published. The selection effects are obvious and overwhelming in determining that the published results of phase III trials will be work that favors the sponsor. The “failed” phase III trial may result in a securities class action against the pharmaceutical company. In the realm of observational studies, some work commissioned by manufacturing industry has its origins in the poorly conducted, flawed work of environmental zealots and NGOs. Manufacturing industry has an obvious interest in correcting the scientific record, and again, any carefully done study would rebut that of the zealots and favor the industry sponsor.

Elsewhere, the authors offer a more balanced assessment when they observe that “[a]ll research is potentially influenced by bias, and every funder of research has the potential to introduce a source of bias.”[33] Similarly, the fourth edition chapter notes that “[a]ll scientists have some sort of motivation for their work, and this does not preclude scientific knowledge building, so long as biased methodologies and interpretations are avoided.”[34] Their recognition that motivated reasoning is everywhere suggests that all research should receive scrutiny regardless of apparent or disclosed funding source.[35]

When it comes to providing examples of funding-effect distortions of science, Weisberg and Thanukos seem to blank on instances created by the lawsuit industry or by environmental NGOs. The reader should contrast how readily and stridently the authors point to bias in industry-sponsored research with how the authors tie themselves up with double negatives when making the same point about NGOs:

“That is not to suggest that government- or nongovernmental organization (NGO)-sponsored research is necessarily free from bias.”[36]

The cognitive dissonance is palpable. The only conclusion that could be drawn from such a locution is that Weisberg and Thanukos have not worked very hard to identify and disclose their own biases.

STATISTICS DONE POORLY

When it comes to explaining and discussing the role of statistical methods in the scientific process, Weisberg and Thanukos go off the rails. The new chapter is an unmitigated disaster, which should have been corrected in the peer review and oversight process. The first sign of trouble became apparent upon checking the definition of “p-value” in the chapter’s glossary:

“p-value. A statistic that gives the calculated probability that the null hypothesis could be true even given the observed differences between conditions.”[37]

This definition is the transposition fallacy on steroids. Obviously, a p-value cannot be the probability that the null hypothesis “could be true” when the procedure for calculating a p-value must assume that the null hypothesis is true, along with a specified probability model. Equally important, the p-value does not give a probability for the null hypothesis at all; it gives the probability of observing data at least as divergent from the null expectation as the data seen in this particular sample, on the assumption that the null is true. The statistics chapter in the Manual by Hall and Kaye states the meaning correctly. The coverage of statistical concepts by Weisberg and Thanukos should be studiously ignored.
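The distinction can be made concrete in a few lines of code. The following sketch computes a two-sided p-value for a z-statistic; the numbers are purely hypothetical, and the point is only that the calculation itself presupposes the null:

```python
import math

def two_sided_p_from_z(z: float) -> float:
    # The p-value is computed ASSUMING the null hypothesis (and a normal
    # probability model) are true. It is P(data at least this extreme | null),
    # never P(null is true | data) -- the transposition the glossary commits.
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical sample: observed difference 0.30, standard error 0.12
z = 0.30 / 0.12            # z = 2.5
p = two_sided_p_from_z(z)  # ≈ 0.0124
```

Nothing in the computation yields a probability that the null hypothesis is true; the null’s truth was the premise, not the output.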

The outrageously incorrect definition of p-value in the glossary is not an isolated error. The authors are clearly statistically challenged. In the text of their chapter, they incorrectly describe the p-value, consistently with their aberrant glossary entry:

“the commonly used p-value approach, scientists compare a test hypothesis (e.g., that drug X is effective) to a null (e.g., that there is no difference in cure rates between those who took drug X and those who took a placebo). Scientists then calculate the probability that the null hypothesis could be true even with the observed difference between conditions (e.g., the cure rate of patients taking drug X compared to that of those taking a placebo).”[38]

Weisberg and Thanukos thus conflate frequentist and Bayesian statistics. They also obliterate the meaning of the confidence interval, an important concept for judges and lawyers to understand. Here is how the authors describe the confidence interval in their chapter:

“Evaluating estimates: In science (and in contrast to their lay meanings), the terms uncertainty and error refer to the variability of a set of data that is intended to estimate a single number. Uncertainty and error are generally expressed as a range, within which we are confident that, if the study were repeated, the new result would fall. Scientists often use a 95% confidence interval for this purpose.”[39]

Describing the confidence interval in the same sentence as “uncertainty and error” is bound to induce uncertainty and error. The confidence interval provides a range of estimates based upon random error, and uncertainty only in the form of imprecision in the point estimate. There are of course myriad other kinds of uncertainty and error not captured by the confidence interval. The most important of the authors’ errors is that they assert incorrectly that the confidence interval provides a range within which results from a repetition of the study would fall. This is, again, a variant on the transposition fallacy that the authors commit in their definition of the p-value. The confidence interval provides a range of results that would not be rejected as alternative null hypotheses by the data in the obtained sample. Because of random error, future samples would give different results, with different confidence intervals, which would not be co-extensive with the first obtained confidence interval. To be sure, the statistics chapter states the matter correctly, and the epidemiology chapter finally gets it correct in its text (after having mangled the concept in the second and third editions), but the epidemiology chapter perpetuates its previous errors in defining confidence intervals in its glossary. This sort of issue, and it is a serious one, could have been eliminated had there been meaningful peer review and editorial oversight for consistency and accuracy of the Manual as a whole.
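A short simulation shows why the chapter’s gloss is wrong. Using entirely hypothetical numbers, a 95% interval from one sample covers the true mean about 95% of the time, but a replication’s point estimate falls inside the original interval only about 83% of the time, which is precisely the distinction the chapter elides:

```python
import random
import statistics

# Hypothetical population: mean 10, SD 2; samples of n = 50.
random.seed(1)
TRUE_MEAN, SD, N, TRIALS = 10.0, 2.0, 50, 20_000
half_width = 1.96 * SD / N ** 0.5   # half-width of a 95% CI

covers_truth = replication_inside = 0
for _ in range(TRIALS):
    first = statistics.fmean(random.gauss(TRUE_MEAN, SD) for _ in range(N))
    second = statistics.fmean(random.gauss(TRUE_MEAN, SD) for _ in range(N))
    covers_truth += abs(first - TRUE_MEAN) <= half_width       # CI covers truth
    replication_inside += abs(second - first) <= half_width    # replication in CI

coverage = covers_truth / TRIALS            # ≈ 0.95
replication_rate = replication_inside / TRIALS  # ≈ 0.83, not 0.95
```

The gap arises because the difference between two independent sample means is more variable than either mean alone; the “95%” attaches to the interval-generating procedure, not to where future results will land.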

Weisberg and Thanukos address statistical power in a way that may also mislead readers. They tell us that “[p]ower refers to a test’s ability to reject a hypothesis that is indeed false.” W&T at 88. If only it were so. The authors omit that power is the probability that, given a specified level of significance (say p < 0.05), and a specified alternative hypothesis, sample size, and probability model, the sample result will reject the null hypothesis in favor of the alternative hypothesis. Then the authors suggest confusingly that “[w]ell-designed studies have sufficient power to detect the differences of interest, but it may not be apparent when a test lacks power.”[40]

If the study at issue presents a confidence interval around a point estimate of interest, then it will be clear what alternative null hypotheses are statistically compatible with the sample result at the pre-specified level of alpha (significance). Any point outside the interval would be rejected by such a test of significance, and so the casual reader will have a rather good idea of what could and could not be rejected by the sample data. And of course, virtually every study will have low power to detect extremely small increased risks, say relative risk of 1.00001. And most studies will have high power to detect risk ratios of over 1,000.
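The point that power is always relative to a specified alternative can be illustrated with a normal-approximation sketch; all numbers are hypothetical:

```python
import math

def normal_cdf(x: float) -> float:
    # Standard normal cumulative distribution function.
    return 0.5 * math.erfc(-x / math.sqrt(2))

def power_two_sided_z(effect: float, sd: float, n: int,
                      z_crit: float = 1.96) -> float:
    # Probability that a two-sided z-test at the given critical value
    # rejects the null when the TRUE effect is `effect`. Power is defined
    # only once alpha, the alternative, n, and the model are all specified.
    shift = effect / (sd / math.sqrt(n))
    return (1 - normal_cdf(z_crit - shift)) + normal_cdf(-z_crit - shift)

tiny = power_two_sided_z(0.00001, sd=1.0, n=100)  # ≈ 0.05: no power for a trivial effect
big = power_two_sided_z(1.00, sd=1.0, n=100)      # ≈ 1.0: ample power for a large effect
```

The same test is “underpowered” or “overpowered” depending entirely on which alternative hypothesis one asks about, which is why a bare statement that a study “lacks power” is meaningless.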

This new chapter on “How Science Works” also propagates some well-known fallacies about statistical significance testing. Implicit in the authors’ committing the transposition fallacy, is a conceptual and mathematical confusion between the coefficient of confidence (1-α) and the posterior probability of an hypothesis.

The authors’ mistake comes in their insistence upon labeling precision in a test result as “certainty.” In the quote below, the authors’ confusion is clear and obvious:

“Note that the 95% and 5% cutoffs are somewhat arbitrary, and a higher degree of confidence might be required if more certainty were desired—for example if an impactful policy decision depended on the conclusion.”[41]

An impactful [sic] policy decision might well call for more certainty, or a higher posterior probability, but a higher coefficient of confidence will not necessarily map to hypothesis probability at all. The authors’ confusion and conflation of the confidence coefficient and the Bayesian posterior probability arises elsewhere within the chapter:

“(1) A p-value lower than 0.05 does not prove that a null hypothesis is false. It is strong evidence, but there is a small chance that the difference observed could be the result of chance alone.

(2) Using a low p-value (e.g., 0.05) as a criterion for significance sets a high bar for rejecting the null hypothesis, minimizing the chance of getting a false positive… .”[42]

Again, a p-value less than five percent is hardly strong evidence in the context of large database studies, especially when there are multiple comparisons and the outcome is not the pre-specified outcome of the analysis. The authors’ confusion is on full display when they discuss the Zoloft birth defects litigation, where the Third Circuit affirmed the exclusion of plaintiffs’ expert witnesses’ causation opinions and the grant of summary judgment to the defendants. According to the authors’ narrative:

“plaintiffs’ expert’s testimony would have argued that multiple, nonsignificant associations between Zoloft use and birth defects indicated a causal relationship. The testimony was excluded because these results were consistent with a weak causal relationship (a small effect size), one that is ‘so weak that one cannot conclude that the risk is greater than that seen in the general population’.”[43]

Of course, in the Zoloft litigation, the excluded plaintiffs’ expert witnesses were caught red-handed at cherry picking, and at attempting to circumvent the lack of significance with methodologically incorrect meta-analyses.[44]

If the risk of birth defects among children born to mothers who used Zoloft in pregnancy was no greater than seen in the general population, then there would be no risk, not risk “so weak” it cannot be seen. Locutions such as the “results were consistent with a weak causal relationship,” when the results were equally consistent with no causal relationship suggest that the writers cannot bring themselves to say that the causal hypothesis was simply not supported at all. Of course, no study may exclude an increased risk of 0.01 percent, or a relative risk of 1.01, but at some point, when multiple attempts fail to reveal an increased risk, we may conclude that the proponents of the causal claim have failed to make their case.
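The earlier point about multiple comparisons, and why an isolated p < 0.05 from a large, multiply-compared database study is hardly “strong evidence,” can be made concrete with a one-line calculation (the count of 20 comparisons is, of course, a hypothetical illustration):

```python
# Under 20 independent comparisons in which every null hypothesis is TRUE,
# the chance of at least one nominally "significant" p < 0.05 result is
# roughly 64% -- chance alone routinely delivers "significance" somewhere.
comparisons = 20
prob_at_least_one_false_positive = 1 - 0.95 ** comparisons  # ≈ 0.64
```

This is why pre-specification of outcomes matters, and why post hoc rummaging through a database for a “significant” association proves little.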

META-SHMETA-ANALYSIS

Weisberg and Thanukos address meta-analysis incompletely in the context of systematic reviews. The authors do not provide any insights into how meta-analyses are done, and more glaringly, they fail to mention that not all systematic reviews can or should result in quantitative syntheses of estimates of association. On the positive side, they state that meta-analyses are important in litigation, and that the application of rigorous methodologies should be required.[45] With clearly unintended irony, Weisberg and Thanukos offer, as support for their statement, the Paoli Railroad Yard case, “in which the exclusion of a contested meta-analysis was overturned.”[46]

Weisberg and Thanukos have stepped into the wet corner of a pigsty. The issue in the Paoli case arose from a meta-analysis of mortality rates associated with polychlorobiphenyl (PCB) exposures. The district court excluded the proffered meta-analysis, not because it was unreliable, but because it was novel. Holding the case up in conjunction with a statement about the application of rigorous or reliable methodologies was thus far off the relevant legal point.

The expert witness who proffered the meta-analysis in Paoli was William Nicholson, who was a physicist with no professional training in epidemiology. For his opinion that PCBs were causally associated with human liver cancer, Nicholson relied upon a non-peer-reviewed, unpublished report he wrote for the Ontario Ministry of Labor.[47] Nicholson described his report as a “study of the data of all the PCB worker epidemiological studies that had been published,” from which he concluded that there was “substantial evidence for a causal association between excess risk of death from cancer of the liver, biliary tract, and gall bladder and exposure to PCBs.”[48]

The defense challenged Nicholson’s opinion, not on Rule 702, but on case law that pre-dated the Daubert decision.[49] The challenge included pointing out the unreliability of the Nicholson’s meta-analysis, but also asserted (incorrectly) the novelty of meta-analysis generally. The district court sustained the defense objection on the grounds of “novelty,” without reaching the reliability analysis.[50] The Third Circuit appropriately reversed and remanded for consideration of the reliability of Nicholson’s meta-analysis.[51]

The consideration of Nicholson’s “meta-analysis” never occurred on remand; plaintiffs’ counsel and their expert witnesses withdrew their reliance upon Nicholson’s analysis. Their about-face was highly prudent. Nicholson’s report presented SMRs (standardized mortality ratios); for the all-cancers statistic, he reported an SMR of 95. What Nicholson did, in this analysis, and in all other instances, was simply divide the observed number of deaths by the expected, and multiply by 100. This crude, simplistic calculation fails to present a standardized mortality ratio, which requires taking into account the age distribution of the exposed and the unexposed groups, and a weighting of the contribution of cases within each age stratum. Nicholson’s presentation of data was nothing short of a fraud.
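What a genuine SMR requires can be sketched in a few lines. The expected count must be built up stratum by stratum, applying reference-population death rates to the cohort’s person-years in each age band; all of the numbers below are hypothetical, chosen only to show the mechanics:

```python
# Hypothetical cohort: person-years of observation by age stratum, and
# reference (general-population) death rates per person-year per stratum.
cohort_person_years = {"40-49": 5_000, "50-59": 3_000, "60-69": 1_000}
reference_rates = {"40-49": 0.001, "50-59": 0.004, "60-69": 0.012}
observed_deaths = 35

# Expected deaths are age-standardized: each stratum's person-years times
# that stratum's reference rate, summed. 5 + 12 + 12 = 29 here.
expected_deaths = sum(py * reference_rates[age]
                      for age, py in cohort_person_years.items())

smr = 100 * observed_deaths / expected_deaths  # ≈ 120.7
```

A crude observed-over-expected ratio that ignores the age structure of the cohort is not an SMR at all, which is the nub of the complaint against Nicholson’s arithmetic.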

Nicholson’s report was replete with many other methodological sins. He used a composite of three organs (liver, gall bladder, bile duct) without any biological rationale. His analysis combined male and female results, and even so, his analysis of the composite outcome was based upon only seven cases. Of those seven cases, some were not confirmed as primary liver cancer, and at least one was confirmed as not being a primary liver cancer.[52]

As noted, Nicholson failed to standardize the analysis for the age distribution of the observed and expected cases, and he failed to present meaningful analysis of random or systematic error. When he did present p-values, he presented one-tailed values, and he made no corrections for his many comparisons from the same set of data.
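To make the standardization point concrete, here is a minimal sketch, in Python and with invented person-years, reference rates, and death counts, of how an indirectly standardized SMR is computed: the expected deaths are built up stratum by stratum from the cohort’s own age distribution and age-specific reference rates, rather than taken from a single crude observed-over-expected division.

```python
# Hypothetical illustration (all numbers invented): an SMR requires computing
# expected deaths stratum-by-stratum from the cohort's age distribution and
# age-specific reference rates, then comparing the observed total to that sum.

# age stratum -> (person-years in cohort, reference death rate per person-year,
#                 observed deaths in the cohort)
strata = {
    "40-49": (12000, 0.0005, 8),
    "50-59": (8000,  0.0015, 14),
    "60-69": (3000,  0.0040, 15),
}

observed = sum(obs for _, _, obs in strata.values())          # 37
expected = sum(py * rate for py, rate, _ in strata.values())  # 6 + 12 + 12 = 30
smr = 100 * observed / expected                               # 123.3
print(round(smr, 1))
```

A single undifferentiated observed/expected quotient, computed without age-specific expected counts, is not standardized at all, which is the gravamen of the criticism of Nicholson’s figures.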

Finally, and most egregiously, Nicholson’s meta-analysis was meta-analysis in name only. What he had done was simply to add “observed” and “expected” events across studies to arrive at totals, and to recalculate a bogus risk ratio, which he fraudulently called a standardized mortality ratio. Adding events across studies, without weighting by the inverse of study variance, is not a valid meta-analysis; indeed, it is a well-known example of how to generate the error known as Simpson’s Paradox, which can change the direction or magnitude of any association.[53]
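The pooling error is easy to demonstrate. The following sketch, with invented counts, shows two studies in which the exposed group has the higher risk in each study taken separately, yet naively summing the cell counts across studies reverses the direction of the association: the Simpson’s Paradox described in the Hanley article.

```python
# Hypothetical counts (invented for illustration). In each study the exposed
# group has a HIGHER risk, but adding cells across studies makes exposure
# look protective, because the studies have very different sizes and baselines.
studies = [
    # (exposed events, exposed N, unexposed events, unexposed N)
    (9, 10, 80, 100),
    (30, 100, 2, 10),
]

def risk_ratio(a, n1, b, n0):
    return (a / n1) / (b / n0)

per_study = [risk_ratio(*s) for s in studies]   # [1.125, 1.5] -- both above 1
pooled = risk_ratio(
    sum(s[0] for s in studies), sum(s[1] for s in studies),
    sum(s[2] for s in studies), sum(s[3] for s in studies),
)                                               # about 0.48 -- direction reversed
print(per_study, round(pooled, 2))
```

A valid meta-analysis avoids this by combining the study-specific effect estimates, weighted by the inverse of each study’s variance, rather than combining the raw counts.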

In citing the Paoli case as a reversal of the exclusion of a contested meta-analysis, Weisberg and Thanukos give a truncated account that misleads readers, judges, and lawyers. There never was a proper consideration of the reliability vel non of Nicholson’s meta-analysis in the Paoli litigation, and in the final analysis, the Paoli plaintiffs abandoned reliance upon Nicholson’s ill-conceived meta-analysis.

VIRTUE SIGNALING

Although there are no land acknowledgments for the property on which the Federal Judicial Center building is located, Weisberg and Thanukos miss few opportunities to let us know that they are woke scholars. There is the gratuitous and triggering “pregnant people,”[54] which begs any number of biological questions. Then there is the authors’ statement that they are limiting their focus to the “Western conception of science,” which begs another question: why would we call any epistemically valid approach, from any corner of the globe, anything other than “science”?[55]

Equally gratuitous are the authors’ endorsements of DEI and “diversity,” with overbroad generalizations that diversity per se advances science,[56] and a claim that “women, people of color, other historically oppressed groups, and non-Western people” are not taken seriously as scientists.[57] In over 40 years of litigating technical and scientific issues, I have never seen a judge or a lawyer disrespect an expert witness based upon sex, race, ethnicity, or national origin. Of course, I have seen expert witnesses treated roughly for propounding bad science, and that seems perfectly appropriate.


[1] See David Goodstein, ON FACT AND FRAUD: CAUTIONARY TALES FROM THE FRONT LINES OF SCIENCE (2010).

[2] Weisberg and Thanukos frequently refer to other chapters in the Manual, which suggests that their chapter was written late in the development of the Fourth Edition, and perhaps contributed to the delayed publication.

[3] Michael Weisberg & Anastasia Thanukos, How Science Works, in National Academies of Sciences, Engineering, and Medicine & Federal Judicial Center, REFERENCE MANUAL ON SCIENTIFIC EVIDENCE 47 (4th ed. 2025) [cited as W&T].

[4] See Michael Weisberg, University of Pennsylvania Philosophy, at https://philosophy.sas.upenn.edu/people/michael-weisberg.

[5] Anna Thanukos, Staff, available at https://ucmp.berkeley.edu/people/anna-thanukos/.

[6] W&T at 72-75.

[7] W&T at 81.

[8] W&T at 81.

[9] W&T at 81 & n.85 (emphasis added), citing Naomi Oreskes & Erik M. Conway, MERCHANTS OF DOUBT: HOW A HANDFUL OF SCIENTISTS OBSCURED THE TRUTH ON ISSUES FROM TOBACCO SMOKE TO GLOBAL WARMING (2010).

[10] W&T at 94-96.

[11] W&T at 95 n.120.

[12] Richard Van Noorden, More than 10,000 research papers were retracted in 2023 — a new record, 624 NATURE 479 (2023).

[13] W&T at 95.

[14] W&T at 55.

[15] W&T at 63, 68.

[16] W&T at 68.

[17] W&T at 65.

[18] W&T at 70.

[19] W&T at 71.

[20] W&T at 66.

[21] W&T at 75.

[22] W&T at 49.

[23] W&T at 83.

[24] W&T at 86 (citing Richter and Capra’s discussion of Milward in chapter one of the Manual, and Professor Gold’s article from the lawsuit industry celebratory conference on the Milward case).

[25] W&T at 99-100.

[26] W&T at 99.

[27] W&T 96 (emphasis added).

[28] IARC MONOGRAPHS ON THE IDENTIFICATION OF CARCINOGENIC HAZARDS TO HUMANS – PREAMBLE (2019), available at https://monographs.iarc.who.int/wp-content/uploads/2019/07/Preamble-2019.pdf

[29] Liesa L. Richter & Daniel J. Capra, The Admissibility of Expert Testimony, National Academies of Sciences, Engineering, and Medicine & Federal Judicial Center, REFERENCE MANUAL ON SCIENTIFIC EVIDENCE 1, 32-33 (4th ed. 2025).

[30] W&T at 76.

[31] Kenneth J. Rothman, “Conflict of interest: the new McCarthyism in science,” 269 J. AM. MED. ASS’N 2782 (1993). See Schachtman, The Rhetoric and Challenge of Conflicts of Interest, TORTINI (July 30, 2013).

[32] W&T at 76 & n.67, citing Sergio Sismondo, Pharmaceutical Company Funding and Its Consequences: A Qualitative Systematic Review, 29 CONTEMP. CLINICAL TRIALS 109 (2008).

[33] W&T at 77.

[34] W&T at 59-60.

[35] W&T at 59-60.

[36] W&T at 76.

[37] W&T at 111.

[38] W&T at 87.

[39] W&T at 90.

[40] W&T at 88.

[41] W&T at 90 (emphasis added).

[42] W&T at 88.

[43] W&T at 90 (internal citations omitted).

[44] In re Zoloft (Sertraline Hydrochloride) Prods. Liab. Litig., 26 F. Supp. 3d 449 (E.D. Pa. 2014); No. 12-md-2342, 2015 WL 314149, at *3 (E.D. Pa. Jan. 23, 2015) (rejecting proffered expert witness opinion based upon “cherry-picking of studies and data within studies”), aff’d, 858 F.3d 787 (3rd Cir. 2017).

[45] W&T at 99.

[46] W&T at 99 & n.134, citing In re Paoli R.R. Yard PCB Litig., 916 F.2d 829 (3d Cir. 1990).

[47] William Nicholson, Report to the Workers’ Compensation Board on Occupational Exposure to PCBs and Various Cancers, for the Industrial Disease Standards Panel (ODP); IDSP Report No. 2 (Toronto Dec. 1987) [Report].

[48] Id. at 373.

[49] See United States v. Downing, 753 F.2d 1224 (3d Cir.1985).

[50] In re Paoli RR Yard Litig., 706 F. Supp. 358, 372-73 (E.D. Pa. 1988).

[51] In re Paoli RR Yard PCB Litig., 916 F.2d 829 (3d Cir. 1990), cert. denied sub nom. General Elec. Co. v. Knight, 499 U.S. 961 (1991).

[52] Report, Table 22.

[53] See James A. Hanley, et al., Simpson’s Paradox in Meta-Analysis, 11 EPIDEMIOLOGY 613 (2000); H. James Norton & George Divine, Simpson’s paradox and how to avoid it, SIGNIFICANCE 40 (Aug. 2015); George Udny Yule, Notes on the theory of association of attributes in statistics, 2 BIOMETRIKA 121 (1903).

[54] W&T at 84.

[55] W&T at 50.

[56] W&T at 71 n. 52-54.

[57] W&T at 102.

Reference Manual’s Chapter on Expert Witness Testimony Admissibility – Part 5

March 7th, 2026

By ignoring Milward’s expert witnesses’ omissions from, and abridgements of, WOE and IBE, the appellate court blinded itself to these witnesses’ distortions of scientific method. The need for judgment, which the Milward court was keen to honor, does not mean that there are not aberrant or deviant judgments, or deviations from the standard of scientific care that are disqualifying. The need for judgment must also allow for equipoise and uncertainty that stand in the way of an inculpatory or exonerative verdict. And then there is the business of questionable research practices that subvert causal judgment. The district court had recognized and credited the showing of questionable research practices that pervaded Martyn Smith’s for-litigation opinions. The cheerleaders for Milward seem eager to obscure these practices by their insistence that causation is, after all, only a judgment.

The Milward decision, in its embrace of some truly aberrant methodology and judgment, and some absence of methodology, made some whoppers of its own. Martyn Smith’s incompetent analyses of the epidemiologic evidence had been thoroughly debunked in the district court, but the circuit court glibly adopted Smith’s characterizations. The appellate court failed to understand and come to grips with Smith’s rejiggering of data, and his inconsistently redefining exposures and outcomes in epidemiologic studies to make up new, fanciful results that favored his WOE-ful opinion. The appellate court also failed to understand that scientific judgment is not some vague, amorphous, unstructured decision that turns on whatever looks to be “explanatory.” Even the International Agency for Research on Cancer, which issues hazard classifications that are distorted by non-scientific precautionary principle reasoning, insists that three streams of evidence (epidemiologic, toxicologic, mechanistic) be considered separately, in accordance with criteria, with attention to the validity of each study, and synthesized into a judgment of causality following a carefully structured analysis.[1]

The appellate court in Milward took the demonstration of Smith’s failure to calculate odds ratios correctly to be something that merely went to the weight, not the admissibility, on the theory that a jury, which does not have access to the Reference Manual or to the actual studies as published, could sort it all out. And yet, when the court improvidently set out a definition of what an odds ratio is, it bungled the definition beyond understanding:

“An odds ratio represents the difference in the incidence of a disease between a population that has been exposed to benzene and one that has not.”[2]

The court’s definition is not even wrong. The difference between the incidence of a disease in an exposed group and in a non-exposed group is the risk difference. It is not an odds ratio. Perhaps the court might have recalled what most third graders know, that there is a difference between a ratio (division) and a difference (subtraction). And of course, the odds of exposure are not the same as the incidence of a disease. The relevant odds ratio represents the odds of exposure among cases with APML diagnoses divided by the odds of exposure among study subjects without APML. The odds ratio does not involve direct measurements of incidence, although in some cases the odds ratio will approximate a risk ratio, which does involve a ratio of incidences. This is not some hyper-technicality; it is a vivid display that Chief Judge Lynch, writing for a panel of three judges of the First Circuit, had no idea of what she was reviewing or writing.
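The distinctions are simple enough to compute. A short sketch, with invented counts, shows that the risk difference, risk ratio, and odds ratio derived from the same 2×2 table are three different quantities:

```python
# Hypothetical 2x2 cohort data (invented numbers), showing that the measures
# the Milward opinion conflated are distinct quantities.
exposed_cases, exposed_n = 30, 100      # incidence 0.30 among exposed
unexposed_cases, unexposed_n = 10, 100  # incidence 0.10 among unexposed

r1 = exposed_cases / exposed_n
r0 = unexposed_cases / unexposed_n

risk_difference = r1 - r0                           # subtraction: 0.20
risk_ratio = r1 / r0                                # ratio of incidences: 3.0
odds_ratio = (r1 / (1 - r1)) / (r0 / (1 - r0))      # ratio of odds: about 3.86
print(risk_difference, risk_ratio, round(odds_ratio, 2))
```

When the outcome is rare, the odds ratio approximates the risk ratio, but with common outcomes, as here, the two diverge, and neither is a “difference” in incidence.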

Richter and Capra devote two pages to a discussion of the Milward case and its embrace of WOE and IBE. There is not, in this discussion, a single adjective of approval or of disapproval. The attention to this one intermediate appellate court opinion far exceeds any other case decided at a level below the Supreme Court, and an engaged reader must ask why the authors of the first chapter of the new Reference Manual wrote about this case at all, especially given the 2023 amendments to Rule 702, which would suggest that Milward was bad law when decided in 2011, and clearly and emphatically bad law in December 2025, when the new Manual was published.

The chapter provides one not-so-subtle clue of the authors’ intent. At the conclusion of their extended, uncritical, and incomplete exposition of Milward,[3] Richter and Capra refer the reader to a law review symposium,[4] “[f]or a detailed analysis of the Milward decision and the weight of the evidence approach to scientific reasoning.” Like Richter and Capra’s coverage of Milward, the cited symposium was hardly an objective analysis; rather, it was more like a drunken celebration at a family reunion.

There have been many law review articles that have discussed the Milward case, but Richter and Capra chose to cite one particular symposium, sponsored by two corporations, the Center for Progressive Reform (CPR) and the Robert A. Habush Foundation. The CPR is a not-for-profit corporation. Its website describes the CPR as a “research and advocacy organization that works in the service of responsive government; climate justice, mitigation, and adaptation; and protecting against environmental harm.”[5] CPR describes one of its key activities as defending science from corporate interference. Presumably its own corporate activities and those of the lawsuit industry are acceptable, but those of corporate manufacturing industry are not. From reviewing CPR’s website, it is not clear that the CPR believes manufacturing corporations should even be allowed to defend against lawsuits. Milward’s retained expert witness Carl Cranor is a “member scholar” at CPR, which makes CPR’s sponsorship of the symposium rather incestuous.[6]

CPR is also apparently comfortable with one highly politicized “corporation,” namely the American Association for Justice (AAJ), which is the trade group for the American lawsuit industry.[7] The AAJ describes itself as a corporation, or a “collective,” that supports plaintiff trial lawyers as their “collective voice … on Capitol Hill and in courthouses across the nation … .” The Robert A. Habush Foundation is endowed by the AAJ, and serves its “educational” mission.  Through the Habush Foundation, the AAJ funds educational programs, “think tanks,” and writing projects designed to influence judges, law professors, lawyers, and the public, on issues of importance to the AAJ:  “the civil justice system and individual rights” for bigger, better, and more profitable litigation outcomes. The AAJ may be a “not-for-profit” corporation, but it represents the interests of one of the most powerful, wealthiest, interest groups in American society — the plaintiffs’ bar.

The Milward symposium agenda and papers from its participants were published at the website of the Wake Forest Journal of Law & Policy, but are now marked as “currently private. If you would like to request access, we’ll send your username to the site owner for approval.”

The symposium cited by Richter and Capra for “analysis” was very much a family affair. The choice of venue, at the Wake Forest Law School, was connected to the web of interests involved. CPR board member Sid Shapiro is a law professor at Wake Forest. Shapiro presented at the symposium, along with the Wake Forest professor Michael Green. Cranor, Shapiro’s CPR colleague and the plaintiff’s retained expert witness, presented as well.[8] There was only one practicing lawyer who presented at the symposium, Steven Baughman Jensen, a past chair of the AAJ’s Section on Toxic, Environmental, and Pharmaceutical Torts. Jensen represented Milward, and hired Cranor as one of the plaintiff’s expert witnesses. Jensen’s contribution, along with Cranor’s, was published in the proceedings of the Milward symposium, in volume 3, no. 1 of the Wake Forest Journal of Law & Policy,[9] which is now also marked private. Jensen also published an abbreviated paean to Milward in the AAJ’s trade journal.[10] No defense counsel or defense expert witness participated in the symposium referenced by Richter and Capra.

Consistent with the financial, advocacy, and political interests of the symposium sponsors, the articles are almost all partisan high-fives for the Milward decision. Writing for the Federal Judicial Center and the National Academies, the authors of the Reference Manual’s chapter on the law of expert witness testimony should have been aware of the partisan nature of the CPR-AAJ sponsored symposium. They should have flagged the advocacy nature of the symposium, and identified its funding sources and the conflicts they created. Furthermore, Richter and Capra should have cited papers that criticized the Milward case, from various perspectives, including its failure to adhere to the law of Rule 702.[11] Their failure to do so is a significant defect of the chapter.


[1] IARC MONOGRAPHS ON THE IDENTIFICATION OF CARCINOGENIC HAZARDS TO HUMANS – PREAMBLE (2019).

[2] Milward, 639 F.3d at 23.

[3] Richter & Capra at 33n.96 (“For a detailed analysis of the Milward decision and the weight of the evidence approach to scientific reasoning…”).

[4] Symposium: Toxic Tort Litigation: After Milward v. Acuity Products, 3 WAKE FOREST JOURNAL OF LAW & POLICY 1 (2013).

[5] The Center for Progressive Reform, at https://progressivereform.org/, last visited on Feb. 24, 2026.

[6] Carl Cranor Biography, Center for Progressive Reform, Member Scholars, at https://progressivereform.org/member-scholars/

[7] The AAJ was previously known by the more revealing name, Association of Trial Lawyers of America (ATLA®). 

[8] Carl F. Cranor, Milward v. Acuity Specialty Products: Advances in General Causation Testimony in Toxic Tort Litigation, 3 WAKE FOREST JOURNAL OF LAW & POLICY 105 (2013).

[9] Steve Baughman Jensen, Sometimes Doubt Doesn’t Sell: A Plaintiffs’ Lawyer’s Perspective on Milward v. Acuity Products, 3 WAKE FOREST JOURNAL OF LAW & POLICY 177 (2013).

[10] Steve Baughman Jensen, Reframing the Daubert Issue in Toxic Tort Cases, 49 TRIAL 46 (Feb. 2013).

[11] See Eric Lasker, Manning the Daubert Gate: A Defense Primer in Response to Milward v. Acuity Specialty Products, 79 DEF. COUNS. J. 128, 128 (2012); David E. Bernstein, The Misbegotten Judicial Resistance to the Daubert Revolution, 89 NOTRE DAME L. REV. 27, 29, 53-58 (2013); David E. Bernstein & Eric G. Lasker, Defending Daubert: It’s Time to Amend Federal Rule of Evidence 702, 57 WM. & MARY L. REV. 1, 33 (2015); Richard Collin Mangrum, Comment on the Proposed Revision of Federal Rule 702: “Clarifying” the Court’s Gatekeeping Responsibility over Expert Testimony, 56 CREIGHTON L. REV. 97, 106 & n.45 (2022); Thomas D. Schroeder, Toward a More Apparent Approach to Considering the Admission of Expert Testimony, 95 NOTRE DAME L. REV. 2039, 2045 (2020); Lawrence A. Kogan, Weight of the Evidence: A Lower Expert Evidence Standard Metastasizes in Federal Court, Washington Legal Foundation Critical Legal Issues Working Paper Series No. 215 (Mar. 2020); Note, Judicial Conference Amends Rule 702 — Federal Rule of Evidence 702, 138 HARV. L. REV. 899, 903 (2025); Nathan A. Schachtman, Desultory Thoughts on Milward v. Acuity Specialty Products, DOI: 10.13140/RG.2.1.5011.5285 (Oct. 2015), available at https://www.researchgate.net/publication/282816421_Desultory_Thoughts_on_Milward_v_Acuity_Specialty_Products.

Reference Manual’s Chapter on Expert Witness Testimony Admissibility – Part 4

March 5th, 2026

In the district court, Judge George O’Toole conducted a pre-trial hearing over four days, and heard testimony from Smith and Cranor, as well as from defense expert witnesses. Judge O’Toole’s published opinion carefully and accurately stated the facts, the applicable law, and presented a well-reasoned judgment as to why Smith’s opinion was not admissible under Rule 702. Without admissible opinions on general causation to support Milward’s case, Judge O’Toole granted summary judgment to the defendants.

Milward appealed the judgment. A panel of judges in the First Circuit heard argument, and reversed in an opinion that is riddled with serious errors.[1] In reviewing the district court’s application of Rule 702, the panel, in an opinion written by Chief Judge Lynch, credulously accepted most of Smith’s and Cranor’s arguments that an ill-defined WOE approach is an acceptable method of guiding scientific judgment. Cranor equated WOE, as used by Smith, to the approach that Sir Austin Bradford Hill described, in 1965, for identifying causal associations from epidemiologic data.[2] Chief Judge Lynch’s opinion accurately tracked Cranor’s and Milward’s lawyers’ misrepresentations about Sir Austin’s paper:

“Dr. Smith’s opinion was based on a ‘‘weight of the evidence’’ methodology in which he followed the guidelines articulated by world-renowned epidemiologist Sir Arthur [sic] Bradford Hill in his seminal methodological article on inferences of causality. See Arthur [sic] Bradford Hill, The Environment and Disease: Association or Causation?, 58 Proc. Royal Soc’y Med. 295 (1965).

Hill’s article explains that one should not conclude that an observed association between a disease and a feature of the environment (e.g., a chemical) is causal without first considering a variety of ‘viewpoints’ on the issue.”[3]

The quoted language from the First Circuit opinion, which twice refers to “Arthur Bradford Hill,” rather than Austin Bradford Hill, may suggest that neither Chief Judge Lynch, nor her judicial colleagues, nor their law clerks read the classic paper. An even stronger indication that the appellate court did not actually read the paper is its equating of WOE with the Bradford Hill viewpoints, without consideration of the necessary predicate for those nine viewpoints. In his short paper, Sir Austin clearly spelled out the foundation needed before parsing the nine viewpoints:

“Disregarding then any such problem in semantics we have this situation. Our observations reveal an association between two variables, perfectly clear-cut and beyond what we would care to attribute to the play of chance. What aspects of that association should we especially consider before deciding that the most likely interpretation of it is causation?”[4]

Whatever Sir Arthur had to say about the matter, Sir Austin defined the starting point of causal analysis as an association free of invalidating bias and random error. The Milward decision ignored this all-important predicate for assessing the various considerations that might allow a valid association to be considered causal.[5] The resulting abridgement was a failure of scientific due process that distorted the Bradford Hill paper.

The First Circuit amplified its error when it asserted that, of the nine considerations, “no one type of evidence must be present before causality may be inferred.”[6] Although Sir Austin said something similar, one of the considerations he noted was “temporality,” which requires that the putative cause come before the effect. Most scientists would consider this consideration essential, unless they were observing events moving faster than the speed of light. The other eight considerations are more dependent upon the context of the exposures and outcomes of interest, but surely the strength and consistency of a clear-cut association across multiple studies are extremely important considerations.

The First Circuit proceeded from misreading Sir Austin’s paper to misunderstanding another paper invoked by Cranor and by Milward’s lawyers. Carelessly tracking Cranor, the appellate court suggested that there was no “hierarchy of evidence”:

“For example, when a group from the National Cancer Institute was asked to rank the different types of evidence, it concluded that ‘‘[t]here should be no such hierarchy.’’ Michele Carbon [sic] et al., Modern Criteria to Establish Human Cancer Etiology, 64 Cancer Res. 5518, 5522 (2004); see also Sheldon Krimsky, The Weight of Scientific Evidence in Policy and Law, 95 Am. J. Pub. Health S129, S130 (2005).”[7]

This quoted language from the Milward opinion shows how slavishly and credulously the court adopted and regurgitated plaintiff’s argument. Sheldon Krimsky was actively involved with SKAPP, and his article was presented at the SKAPP-funded Coronado Conference, discussed earlier in this series. Krimsky actually acknowledged that although “the term [WOE] is applied quite liberally in the regulatory literature, the methodology behind it is rarely explicated.”

As for the article by Carbon [sic], this publication never rejected a hierarchy of evidence. The court’s language, quoted above, follows immediately after the court’s discussion of Sir Austin’s nine types of corroborating evidence that would support the causal interpretation of an association. As such, the court seems to imply, incorrectly, that there was no hierarchy of these considerations.[8]

The court’s citation also suggests that the quoted language came from the National Cancer Institute (NCI), but its provenance is quite different. The cited article’s lead author, Michele Carbone (not Carbon), was reporting on a workshop held at an NCI building; it was not an official NCI event or publication. The NCI did not sponsor or conduct the meeting, and Carbone’s paper was not an official statement of the NCI. Carbone’s paper was styled a “Meeting Report,” and was published as a paid advertisement in Cancer Research, not as a scholarly article in the Journal of the National Cancer Institute.

The discipline of epidemiology was not strongly represented at the meeting; most of the chairpersons and scientists in attendance were pathologists, cell biologists, virologists, and toxicologists. The authors of the meeting report reflect the interests and focus of the scientists in attendance. The lead author, Michele Carbone, a pathologist at the University of Hawaii, was an enthusiastic proponent of Simian Virus 40 as a cause of mesothelioma, a hypothesis that has not fared terribly well in the crucible of epidemiologic science.

The cited article did report some suggestions for modifying Bradford Hill’s criteria in the light of modern molecular biology, as well as a sense of the group that there was no “hierarchy” in which epidemiology was at the top of disciplines.  The group definitely did not address the established concept that some types of epidemiologic studies are analytically more powerful to support inferences of causality than others — the hierarchy of epidemiologic evidence. The group also did not address or reject a ranking of importance of Bradford Hill’s nine viewpoints. There was nothing remarkable about the tumor biologists’ statement that in some cases causality can be determined by careful identification of genetic inheritance or molecular biological pathways. There was no evidence of this sort in the Milward case, and the citation by Cranor and Milward’s lawyers was nothing more than hand waving.

Carbone’s meeting report summarizes informal discussion sessions at the 2003 meeting. Those in attendance broke out into two groups, one chaired by Brooke Mossman, a pathologist, and the other chaired by Dr. Harald zur Hausen, a virologist. The meeting report included a narrative of how the two groups responded to twelve questions. Drawing from plaintiff’s (and Cranor’s) argument, the court’s citation to this meeting report is based upon one sentence in Carbone’s report, about one of the twelve questions:

“6. What is the hierarchy of state-of-the-art approaches needed for confirmation criteria, and which bioassays are critical for decisions: epidemiology, animal testing, cell culture, genomics, and so forth?

There should be no such hierarchy. Epidemiology, animal, tissue culture and molecular pathology should be seen as integrating evidences in the determination of human carcinogenicity.”[9]

Considering the fuller context of the meeting, there is nothing particularly surprising about this statement. The full question and answer in the meeting report do not even remotely support the weight given to them by the court. There was quite a bit of disagreement among meeting participants over criteria for different kinds of carcinogens, as seen in the report on another question:

“2. Should the criteria be the same for different agents (viruses, chemicals, physical agents, promoting agents versus initiating DNA-damaging agents)?

There were different opinions. Group 1 debated this issue and concluded that the current listing of criteria should remain the same because we lack sufficient evidence to develop a separate classification. Group 2 strongly supported the view that it is useful to separate the biological or infectious agents from chemical and physical carcinogens due to their frequently entirely different mode of action.”[10]

Carbone and the other authors of the meeting report noted the importance to epidemiology for general causation, while acknowledging its limitations for determining specific causation:

“Concerning the respective roles of epidemiology and molecular pathology, it was noted that epidemiology allows the determination of the overall effect of a given carcinogen in the human population (e.g., hepatitis B virus and hepatocellular carcinoma) but cannot prove causality in the individual tumor patient.”[11]

Clearly, the report was not disavowing the necessity for epidemiology to confirm carcinogenicity in humans. Specific causation of Mr. Milward’s APML was irrelevant to his first appeal to the First Circuit. Carbone’s report emphasized the need to integrate epidemiologic findings with molecular biology; it did not suggest that epidemiology was not necessary or urge that epidemiology be ignored or disregarded:

“A general consensus was often reached on several topics such as the need to integrate molecular pathology and epidemiology for a more accurate and rapid identification of human carcinogens.”[12]

                 * * * * *

“Ideally, before labeling an agent as a human carcinogen, it is important to have epidemiological, experimental animals, and mechanistic evidence (molecular pathology).”[13]

The court’s implication that there was “no hierarchy of evidence” is unsupported by the meeting report. The suggestion that WOE allows some loosey-goosey, ad hoc, unstructured assessment of diverse lines of evidence is rejected in the meeting report with a careful admonition about the lack of validity of some animal models and mechanistic research:

“Moreover, carcinogens and anticarcinogens can have different effects in different situations. As shown by the example of addition of β-carotene in the diet, β-carotene has chemopreventive effects in many experimental systems, yet it appears to have increased the incidence of lung cancer in heavy smokers. Animal experiments can be very useful in predicting the carcinogenicity of a given chemical. However, there are significant differences in susceptibility among species and within organs in the same species, and differences in the metabolic pathway of a given chemical among human and animals could lead to error.”[14]

Inference to the Best Explanation

The First Circuit asserted that “no serious argument can be made that the weight of the evidence approach is inherently unreliable.”[15] As discussed above, this assertion is demonstrably false. In his testimony at the Rule 702 pre-trial hearing, Cranor classified WOE as based upon “inference to the best explanation,” and the First Circuit obsequiously accepted this claim. In articulating and accepting Cranor’s reduction of scientific method to IBE, the appellate court seemed unaware that IBE as an epistemic theory has been roundly criticized. In a very general sense, IBE draws on Charles Peirce’s description of abduction as a mode of reasoning, although many writers have been eager to distinguish abduction from IBE. Bas van Fraassen criticized IBE as lacking merit as a mode of argument in a way germane to Cranor’s presentation of the notion, and the First Circuit’s uncritical acceptance:

“As long as the pattern of Inference to the Best Explanation—henceforth, IBE—is left vague, it seems to fit much rational activity. But when we scrutinize its credentials, we find it seriously wanting.”[16]

The IBE approach raises thorny problems of knowing how to discern the best explanation, or how to tell whether an explanation is simply the best of a bad lot. Other philosophers of science have questioned why explanatoriness should matter as opposed to predictive ability and resistance to falsification upon severe or robust testing.

In the hands of Smith and Cranor, these philosophical quandaries become largely beside the point. For Smith and Cranor, IBE becomes the telling of just-so stories, which transform “but for” causation into “could be” causation. Drawing directly from Cranor, the Circuit Court explained that an inference to the best explanation involves six general steps for scientists:

“(1) identify an association between an exposure and a disease,

(2) consider a range of plausible explanations for the association,

(3) rank the rival explanations according to their plausibility,

(4) seek additional evidence to separate the more plausible from the less plausible explanations,

(5) consider all of the relevant available evidence, and

(6) integrate the evidence using professional judgment to come to a conclusion about the best explanation.”[17]

Of course assessing causation requires judgment, but Cranor and Smith radically abridge the process of judging by eliminating:

  • the robust testing of, and attempts to falsify, hypotheses,
  • the weighting of study designs,
  • the pre-specification of kinds of studies to be included or excluded, the assignment of weights to different kinds and qualities of studies, and
  • the pre-specification of criteria of study validity, experimental design, consistency, and exposure-response.

The vague, contentless IBE and WOE, in the hands of Smith, operate just as van Fraassen anticipated. With Cranor’s “philosophizing,” IBE creates a permission structure for reaching any desired conclusion. Indeed, Cranor’s approach makes no allowance for the careful scientist who withholds judgment because the evidence is inadequate to the task. Furthermore, Cranor’s approach and the Milward decision would cheerily approve the cherry picking of studies and of data within studies, post hoc weighing of evidence, and even the fabrication and rejiggering of evidence, all of which were on display in Smith’s for-litigation opinion.

The First Circuit uttered its mantra of approval of Smith’s scientific delicts in language that became the target of the revision of Rule 702 in 2023:

“the alleged flaws identified by the [district] court go to the weight of Dr. Smith’s opinion, not its admissibility. There is an important difference between what is unreliable support and what a trier of fact may conclude is insufficient support for an expert’s conclusion.”[18]

Earlier in its opinion, the appellate court quoted from the version of Rule 702 in effect when it heard the appeal:

“if (1) the testimony is based upon sufficient facts or data, (2) the testimony is the product of reliable principles and methods, and (3) the witness has applied the principles and methods reliably to the facts of the case.”[19]

Sufficiency, reliability, and validity were all preliminary questions to be decided by the court as part of its gatekeeping responsibility. The appellate court simply ignored the law in its decision to green-light Smith’s testimony.

                    (to be continued)


[1] Milward v. Acuity Specialty Products Group, Inc., 639 F.3d 11 (1st Cir. 2011), cert. denied sub nom., U.S. Steel Corp. v. Milward, 565 U.S. 1111 (2012).

[2] Austin Bradford Hill, The Environment and Disease: Association or Causation?, 58 PROC. ROYAL SOC’Y MED. 295 (1965).

[3] Milward, 639 F.3d at 17.

[4] Id. at 295.

[5] See Frank C. Woodside, III & Allison G. Davis, The Bradford Hill Criteria: The Forgotten Predicate, 35 THOMAS JEFFERSON L. REV. 103 (2013).

[6] Milward, 639 F.3d at 17.

[7] Id. (internal citations omitted).

[8] The Reference Manual chapter on medical testimony carefully discusses the hierarchy of evidence as it factors into the assessment of medical causation. John B. Wong, Lawrence O. Gostin & Oscar A. Cabrera, Reference Guide on Medical Testimony, in National Academies of Sciences, Engineering and Medicine & Federal Judicial Center, REFERENCE MANUAL ON SCIENTIFIC EVIDENCE 687, 723-24 (2011); John B. Wong, Lawrence O. Gostin & Oscar A. Cabrera, Reference Guide on Medical Testimony, in National Academies of Sciences, Engineering and Medicine & Federal Judicial Center, REFERENCE MANUAL ON SCIENTIFIC EVIDENCE 1105, 1150-52 (4th ed. 2025). Interestingly, the chapter on epidemiology in the third edition of the Reference Manual cited to the Carbone workshop with apparent approval, but the same chapter in the fourth edition has dropped the reference. Compare Michael D. Green, D. Michal Freedman & Leon Gordis, Reference Guide on Epidemiology, in National Academies of Sciences, Engineering and Medicine & Federal Judicial Center, REFERENCE MANUAL ON SCIENTIFIC EVIDENCE 549, 564 n.48 (3rd ed. 2011) with Steve C. Gold, Michael D. Green, Jonathan Chevrier & Brenda Eskenazi, Reference Guide on Epidemiology, in National Academies of Sciences, Engineering and Medicine & Federal Judicial Center, REFERENCE MANUAL ON SCIENTIFIC EVIDENCE 897 (4th ed. 2025).

[9] Carbone at 5522.

[10] Carbone at 5521.

[11] Carbone at 5518 (emphasis added).

[12] Carbone at 5518.

[13] Carbone at 5519.

[14] Carbone at 5521.

[15] Milward, 639 F.3d at 18-19.

[16] Bas van Fraassen, LAWS AND SYMMETRY 131 (1989).

[17] Milward, 639 F.3d at 18.

[18] Milward, 639 F.3d at 22.

[19] Milward, 639 F.3d at 14.

Reference Manual’s Chapter on Expert Witness Testimony Admissibility – Part 3

March 2nd, 2026

Richter and Capra treat WOE in Justice Stevens’ lone dissenting opinion in Joiner as if it were the law. Of course, it was not; nor was it a particularly insightful analysis of scientific method, Rule 702, or the law of expert witnesses. The Manual authors elevate WOE by their complete failure either to offer any criticisms of it or to cite the scientific and legal scholars who have criticized WOE.

Richter and Capra do cite to a couple of cases that are skeptical of expert witnesses who had offered WOE opinions, but they fail to cite to any cases that disparage WOE itself.[1] In aggravation of their misplaced focus on the Joiner dissent, Richter and Capra proceed to spend two full pages on the Milward case, which had posthumously appeared in Professor Berger’s version of the law chapter in the 2011, third edition of the Reference Manual. The attention given to Milward in the fourth edition is greater than that given to any other non-Supreme Court case, including Frye. Richter and Capra offer no commentary or analysis critical of the case, although many legal commentators have criticized the Milward opinion on WOE.[2]

Richter and Capra’s chapter fails to note the dark cloud that hangs over the Milward case: CERT’s amicus brief, filed in support of reversing the exclusion of the opinions of CERT’s founders, Carl Cranor and Martyn Smith,[3] unethically failed to disclose that relationship, CERT’s funding of Smith’s research, or CERT’s involvement in shaking down corporations in California for Prop 65 bounties.

In their extensive coverage of the 2011 Milward decision, Richter and Capra failed to report that after the First Circuit reversed and remanded, the trial court again excluded plaintiffs’ expert witnesses for failing to give a valid opinion on specific causation. On the second appeal, the First Circuit affirmed the exclusion of specific causation expert witness testimony and the entry of final judgment for defendants.[4] Given that the first appellate decision was no longer necessary to the final disposition of the case, it is questionable whether there is any holding with respect to general causation in the case.

The most salient aspect of Richter and Capra’s uncritical coverage of the Milward case is their complete failure to identify the legal errors made by the First Circuit in its decision on Rule 702 and general causation. As the Reporter to the Rules Advisory Committee, Professor Capra was intimately involved in many meetings and memoranda that addressed the failings of courts to engage properly in gatekeeping. These failings were the gravamen of the 2023 amendments to Rule 702. The Milward decision in 2011 managed to check almost every box for bad decision making: the appellate panel ignored the text of Rule 702, disregarded Supreme Court precedent in the Joiner case, relied upon overruled, obsolete, pre-Daubert decisions, ignored the policy considerations urged by the Supreme Court, bungled basic scientific concepts, and egregiously and credulously endorsed WOE as a scientific methodology. Professor David E. Bernstein has pointed to the 2011 Milward decision as “the most notorious,” and “[t]he most prominent example of such judicial truculence” in resisting the requirements of Rule 702, as it existed in 2011.[5]

Milward is an important case, much as the Berenstain Bears stories are important and helpful in teaching children what not to do. Unfortunately, Richter and Capra discuss Milward in a way that might lead readers to believe that the case represents a reasonable or proper treatment of the science involved in the case. To correct this biased coverage of Milward, readers will have to roll up their sleeves and actually look at what the court did and did not do, and what scientific methodology issues were involved.

Perhaps the best place to begin is the beginning. Brian Milward filed a lawsuit in which he claimed that he was exposed to benzene in his work as a refrigerator technician.[6] He developed acute promyelocytic leukemia (APML), and claimed that he had been exposed to benzene from having used products made or sold by roughly two dozen companies. APML is a rare disease, type M3 of acute myeloid leukemia (AML), defined by specific chromosomal abnormalities that are necessary but not sufficient to result in APML. APML has an incidence of fewer than five cases per million per year. APML occurs with equal frequency in both sexes; there are no known environmental or occupational causes of APML.[7] APML occurs in the general population without benzene exposure, and its occurrence in all populations is sparse. There is no biomarker of some putative benzene-related mechanism that would identify the rare APML case, if any, in which benzene played a causal role.

Milward’s General Causation Expert Witness, Martyn T. Smith

Milward did not serve a report from an epidemiologist, or anyone with significant expertise in epidemiology. His only general causation expert witness was Martyn Smith, a toxicologist, who testified that the “weight of the evidence” supported his opinion that benzene exposure causes APML.[8] As noted above, Smith is a member of the advocacy group, the Collegium Ramazzini; and for over 30 years, he has been a frequent testifier for plaintiffs in chemical exposure cases.[9]

Despite the low but widespread prevalence of APML in the general population, with no sex specificity, and the absence of any identifying biomarker of supposed benzene-related etiology in individual cases, Smith maintained that epidemiology was not necessary to reach a causal opinion about benzene and APML. The principal thrust of Smith’s proffered testimony was that APML is a plausible outcome of benzene exposure, because benzene can cause other varieties of AML by structurally altering chromosomes (clastogenesis), breaking them and causing re-arrangements.[10]

The trial court found that Smith’s extrapolations were problematic and lacking in supporting evidence. The clear differences among AML subtypes made the extrapolation to APML, a unique clinical entity, inappropriate. The characteristic translocation in APML is absent from other varieties of AML, and APML, unlike other AML varieties, is treatable with all-trans retinoic acid.[11]

Smith advanced speculation that benzene targeted cells in the pathway of leukemic transformation to APML, but the state of the science was clearly devoid of sufficient evidence to show that benzene was involved in the APML translocations. Although the parties agreed that mechanistic evidence showed that benzene can effectuate chromosomal damage of the kind characteristic of some AML subtypes other than APML, the trial court found that:

“[n]o evidence has been published making a similar connection between benzene exposure and the t(15;17) translocation, characteristic of APL [APML].”[12]

The trial court assessed Smith’s extrapolation from benzene’s clastogenic effect in breaking and rearranging chromosomes to induce some types of AML to its causing the specific APML t(15;17) translocation, as a

“bull in the china shop generalization: since the bull smashes the teacups, it must also smash the crystal. Whether that is so, of course, would depend on the bull having equal access to both teacups and crystal. If the teacups were easily knocked over, but the crystal securely stored away, a reason would exist to question, if not to reject, the proposition that the crystal was in as much danger as the teacups.”[13]

The trial judge clearly saw that Smith’s plausibility argument proved too much, and would support attributing virtually any disease to benzene through a putative mechanism of breaking chromosomes.

Lacking the courage of his convictions, Smith, a non-epidemiologist, proceeded to offer opinions about the epidemiology of benzene and APML, some of them quite fanciful. No published or unpublished study showed a statistically significant increase in APML among benzene-exposed workers. The most Smith could draw from the published epidemiologic studies on benzene was one Chinese study that found a small risk ratio, without even nominal statistical significance: a crude odds ratio of 1.42 for benzene exposure and APML. Despite Smith’s hand waving about lack of power,[14] this Chinese study suggested that chloramphenicol was a risk factor for APML (M3), and it was able to identify a nominally statistically significant association between benzene and another sub-type of AML (M2a), with an odds ratio of 1.54.[15]

Smith offered no meta-analysis to show that the available studies collectively established a summary estimate of increased risk of APML among benzene workers. Undaunted, Smith set about to re-jigger the numbers in published studies to make something out of nothing. Neither physician nor epidemiologist, Smith altered diagnoses and exposure status as reported in published papers so that his reclassified cases and controls would yield associations where none existed. These re-analyses were done speculatively, inconsistently, and incompetently, driven by his preferred conclusion. His approach was unsupported, unprincipled, and lacking in any reasonable methodology. The proffered re-analyses were never published, never presented at a professional society meeting, and could never comply with the standards epidemiologists use in their non-litigation activities. As a toxicologist, Smith did not have any non-litigation epidemiologic activities of note.

Smith’s representation of the relevant epidemiologic methods and studies was misleading and contained numerous errors that cumulatively led to erroneous conclusions; his own re-jiggering was carried out to reach a preferred conclusion to support plaintiff’s litigation case.[16]

One of the epidemiologic studies relied upon by Smith was Golomb (1982).[17] This study did not explore associations with benzene; it was a study of insecticides, chemicals and solvents, and petroleum. Crude oil contains very little benzene, typically about 0.1 percent.[18] Smith, without any evidentiary support, assumed that petroleum exposure equated to benzene exposure.

There were eight cases of leukemia with petroleum exposure; one of those cases was APML. The authors of Golomb (1982) reported that this particular APML case was actually a crane operator.[19]

In analyzing published epidemiologic studies, Smith insisted that he could reclassify reported APML cases as non-APML when the karyotype was normal. Karyotype analysis identifies the defining translocation of specific chromosomes in APML, which is found in virtually all such cases. The obvious result of Smith’s ad hoc reclassifications was to increase risk ratios for APML among benzene-exposed subjects. His arbitrary reclassifications of data allowed him to create the result he desired. In reviewing other published studies, however, Smith insisted that a normal karyotype did not require reclassifying cases out of the APML category, when leaving the cases in place would yield a risk ratio above one.

Taking data from the Golomb 1982 paper, Smith attempted to inflate his calculation of an odds ratio that would support his causation opinion. He arbitrarily discarded two APML cases from the non-exposed group, and he discarded eight non-APML cases from the exposed group. He did not report p-values or confidence intervals for his re-analyses. At the hearing, the defense epidemiologist showed that Smith’s rejiggered odds ratio (1.51) had a p-value of 0.72, and a 95 percent confidence interval of 0.15 to 14.91. Not only was the result not statistically significant, but the confidence interval showed that a range of alternative hypotheses spanning roughly two orders of magnitude could not be rejected on the sample data at an alpha of 0.05. Without the rejiggering of exposed and unexposed cases, the odds ratio would have been 0.71, p = 0.76. All the results, both as reported in the published article and as rejiggered by Smith, were highly compatible with no association whatsoever.
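For readers who want to see where such numbers come from, the arithmetic is simple. The following is a minimal sketch, not a reconstruction of the actual Golomb data (which the opinion does not reproduce); the function name and the 2×2 cell counts are hypothetical, chosen only to show how sparse cells produce an enormous confidence interval around a seemingly elevated odds ratio:

```python
import math

def odds_ratio_stats(a, b, c, d):
    """Crude odds ratio from a 2x2 table:
         a = exposed cases,    b = exposed controls,
         c = unexposed cases,  d = unexposed controls.
    Returns (OR, 95% CI low, 95% CI high, two-sided p), using the
    standard Woolf log-odds approximation for the interval and a
    normal approximation for the p-value."""
    or_ = (a * d) / (b * c)
    se = math.sqrt(1/a + 1/b + 1/c + 1/d)        # SE of ln(OR)
    lo = math.exp(math.log(or_) - 1.96 * se)
    hi = math.exp(math.log(or_) + 1.96 * se)
    z = math.log(or_) / se
    # two-sided p from the standard normal CDF (via math.erf)
    p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return or_, lo, hi, p

# Hypothetical sparse counts: one exposed APML case, a handful of
# exposed controls, two unexposed cases, and 21 unexposed controls.
or_, lo, hi, p = odds_ratio_stats(1, 7, 2, 21)
print(f"OR={or_:.2f}, 95% CI {lo:.2f}-{hi:.2f}, p={p:.2f}")
```

With cells this small, the point estimate (1.5) is meaningless: the interval runs from well below 1.0 to nearly 20, and the p-value is far from any conventional threshold, just as with Smith’s rejiggered figures.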

In discussing other studies, Smith repeated his re-labeling of leukemia cases as APML, in the absence of karyotyping, to support his claims that more APML cases were observed than would be expected based upon general population rates.[20] Smith also improvidently cited studies (Rinsky 1981, updated in 1994) in supposed support of his opinion, although they showed no association at all. Even workers heavily exposed to benzene in these studies did not develop APML.[21] Similarly, in support of his opinion, Smith cited another Chinese study, which actually declared that:

“Acute promyelocytic leukemia has been reported infrequently in benzene-exposed groups as well as in t-ANLL. Although ANLL-M3 occurred in at least 4 patients in this series, its general representation among the subtypes of ANLL was similar in its distribution in de novo ANLL in China.”[22]

Smith’s methodological improprieties were the subject of a four-day pre-trial hearing before Judge O’Toole. In the course of the hearings, Smith attempted to defend his methods, but like Donny Kerabatsos in The Big Lebowski, Smith was out of his depth. The trial court found that Dr. Smith’s arbitrary creation and selection of data to support his beliefs was unreliable and not in accordance with generally accepted scientific methodology in the fields of medicine or epidemiology. Smith was simply fabricating data to fit his made-for-litigation beliefs.

Carl Cranor’s Attempt to Bolster Smith

Milward also submitted a report from Carl Forest Cranor, Smith’s business partner in founding the Prop 65 bounty-hunting CERT, and a fellow member of the advocacy group Collegium Ramazzini. Cranor has no expertise in toxicology or epidemiology, and he has never published on the causes of APML. As a professor of philosophy, Cranor has written about scientific methodology, including WOE and “inference to the best explanation” (IBE). Cranor’s publications are riddled with basic misunderstandings of statistical concepts.[23] Essentially, Cranor testified at the Rule 702 hearing as a cheerleader for Smith, and as an advocate for the open admission of dodgy scientific conclusions reached by a methodology he described as WOE or IBE. Cranor stretched to resurrect Justice Stevens’ use of WOE, and attempted to pass it off as a generally accepted scientific mode of reasoning.

The trial court carefully reviewed the proffered opinion testimony from the four-day pre-trial hearing. The trial court found that Smith had shown that his hypothesis was plausible and possible, but not that it was “scientific knowledge,” as required by Rule 702. Lacking sufficient methodological validity and support, Smith’s opinions failed to satisfy the requirements of Rule 702, and were thus inadmissible. Having excluded plaintiff’s sole general causation expert witness, the trial court granted summary judgment to the defendants.[24]

(to be continued)


[1] See, e.g., Allen v. Pennsylvania Eng’g Corp., 102 F.3d 194, 197-98 (5th Cir. 1996) (“We are also unpersuaded that the ‘weight of the evidence’ methodology these experts use is scientifically acceptable for demonstrating a medical link between Allen’s EtO [ethylene oxide] exposure and brain cancer.”); Magistrini v. One Hour Martinizing Dry Cleaning, 180 F. Supp. 2d 584, 601-02 (D.N.J. 2002) (excluding David Ozonoff, whose WOE analysis of whether perchloroethylene causes acute myelomonocytic leukemia was criticized by court-appointed technical advisor), aff’d, 68 F. App’x 356 (3d Cir. 2003).

[2] See Eric Lasker, Manning the Daubert Gate: A Defense Primer in Response to Milward v. Acuity Specialty Products, 79 DEF. COUNS. J. 128, 128 (2012); David E. Bernstein, The Misbegotten Judicial Resistance to the Daubert Revolution, 89 NOTRE DAME L. REV. 27, 29, 53-58 (2013); David E. Bernstein & Eric G. Lasker, Defending Daubert: It’s Time to Amend Federal Rule of Evidence 702, 57 WM. & MARY L. REV. 1, 33 (2015); Richard Collin Mangrum, Comment on the Proposed Revision of Federal Rule 702: “Clarifying” the Court’s Gatekeeping Responsibility over Expert Testimony, 56 CREIGHTON LAW REVIEW 97, 106 & n.45 (2022); Thomas D. Schroeder, Toward a More Apparent Approach to Considering the Admission of Expert Testimony, 95 NOTRE DAME L. REV. 2039, 2045 (2020); Lawrence A. Kogan, Weight of the Evidence: A Lower Expert Evidence Standard Metastasizes in Federal Court, Washington Legal Foundation Critical Legal Issues WORKING PAPER Series no. 215 (Mar. 2020); Note, Judicial Conference Amends Rule 702. — Federal Rule of Evidence 702, 138 HARV. L. REV. 899, 903 (2025); Nathan A. Schachtman, Desultory Thoughts on Milward v. Acuity Specialty Products, DOI: 10.13140/RG.2.1.5011.5285 (Oct. 2015), available at https://www.researchgate.net/publication/282816421_Desultory_Thoughts_on_Milward_v_Acuity_Specialty_Products .

[3] See David DeMatteo & Kellie Wiltsie, When Amicus Curiae Briefs are Inimicus Curiae Briefs: Amicus Curiae Briefs and the Bypassing of Admissibility Standards, 72 AM. UNIV. L. REV. 1871 (2022) (noting that amicus briefs often include “unvetted and potentially inaccurate, misleading, or mischaracterized expert information,” without the procedural safeguards in place for vetting expert witnesses at trial).

[4] Milward v. Acuity Specialty Prods. Group, Inc., 969 F. Supp. 2d 101, 109 (D. Mass. 2013), aff’d sub nom., Milward v. Rust-Oleum Corp., 820 F.3d 469, 471, 477 (1st Cir. 2016).

[5] David E. Bernstein, The Misbegotten Judicial Resistance to the Daubert Revolution, 89 NOTRE DAME L. REV. 27, 53, 29 (2013).

[6] Milward v. Acuity Specialty Products Group, Inc., 664 F. Supp. 2d 137 (D. Mass. 2009) (O’Toole, J.), rev’d, 639 F.3d 11 (1st Cir. 2011), cert. denied, U.S. Steel Corp. v. Milward, 565 U.S. 1111 (2012).

[7] Andrew Y. Li, et al., Clustered incidence of adult acute promyelocytic leukemia in the vicinity of Baltimore, 61 LEUKEMIA & LYMPHOMA 2743 (2021); Hassan Ali, et al., Epidemiology and Survival Outcomes of Acute Promyelocytic Leukemia in Adults: A SEER Database Analysis, 144 BLOOD 5942 S1 (2024).

[8] Milward, 664 F. Supp. 2d at 142.

[9] See, e.g., PPG Industries, Inc. v. Wells, No. 21-0232 (Feb. 10, 2023 W.Va.S.Ct.); Hall v. ConocoPhillips, 248 F. Supp. 3d 1177 (W.D. Okla. 2017); In re Levaquin Prods. Liab. Litig., 739 F.3d 401 (8th Cir. 2014); Jacoby v. Rite Aid Corp., No. 1508 EDA 2012 (Dec. 9, 2013 Pa. Super.); Harris v. CSX Transp., Inc., 232 W.Va. 617, 753 S.E.2d 275 (2013); In re Baycol Prods. Litig., 495 F. Supp. 2d 977 (D. Minn. 2007); In re Rezulin Prods. Liab. Litig., MDL 1348, 441 F.Supp.2d 567 (S.D.N.Y. 2006) (advocating mythological “silent injury”); Perry v. Novartis, 564 F.Supp.2d 452 (E.D. Pa. 2008); Dodge v. Cotter Corp., 328 F.3d 1212 (10th Cir. 2003); Sutera v. The Perrier Group of America Inc., 986 F. Supp. 655 (D. Mass. 1997); Redland Soccer Club, Inc. v. Dep’t of Army, 835 F.Supp. 803 (M.D. Pa. 1993).

[10] Milward, 664 F.Supp. 2d at 143-44.

[11] Milward, 664 F.Supp. 2d at 144.

[12] Id. at 146.

[13] Id.

[14] The claim that a study lacks power is meaningless without a specification of the alternative hypothesis (the risk ratio the researcher posits as the population parameter), a specified level of alpha (typically 0.05), and a specified probability model. While virtually all studies would have reasonable statistical power (say, 80 percent) to detect a risk ratio of 10,000, no study would have power to detect a risk ratio of 1.0001 with any reasonable probability.
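The point of the footnote above can be made concrete with a rough sample-size calculation. This is a generic sketch using the usual two-proportion normal approximation; the function name, the baseline risk, and the illustrative risk ratios are assumptions for illustration, not figures from the Milward record:

```python
import math

def n_per_group(p0, rr):
    """Approximate subjects needed per group to detect a risk ratio
    `rr` against a baseline risk `p0`, with a two-sided alpha of 0.05
    and 80 percent power, via the standard two-proportion normal
    approximation (no continuity correction)."""
    p1 = p0 * rr
    z_a, z_b = 1.96, 0.84        # z-values for alpha=0.05 (two-sided), power=0.80
    pbar = (p0 + p1) / 2
    num = (z_a * math.sqrt(2 * pbar * (1 - pbar))
           + z_b * math.sqrt(p0 * (1 - p0) + p1 * (1 - p1))) ** 2
    return math.ceil(num / (p1 - p0) ** 2)

# For a disease as rare as APML (assume a baseline risk of 5 per
# million per year), even a 10-fold risk ratio needs on the order of
# 200,000 subjects per group; a risk ratio of 1.5 needs millions.
print(n_per_group(5e-6, 10.0))
print(n_per_group(5e-6, 1.5))
```

The exercise shows why a bare complaint about "lack of power" is empty: power is always power to detect some specified alternative, and for tiny hypothesized risk ratios no feasible study of a rare disease has meaningful power.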

[15] Yi Zhongguo, et al. (National Investigative Group for the Survey of Leukemia & Aplastic Anemia), Countrywide Analysis of Risk Factors for Leukemia and Aplastic Anemia, 14 ACTA ACADEMIAE MEDICINAE SINICAE 185 (1992).

[16] Milward, 664 F. Supp. 2d at 148-49.

[17] Harvey M. Golomb, et al., Correlation of Occupation and Karyotype in Adults With Acute Nonlymphocytic Leukemia, 60 BLOOD 404 (1982).

[18] Bo Holmberg, Per Lundberg, Benzene: standards, occurrence, and exposure, 7 AM. J. INDUS. MED. 375 (1985).

[19] Golomb, supra note 17, at 407.

[20] See, e.g., Song-Nian Yin, et al., A cohort study of cancer among benzene-exposed workers in China: overall results, 29 AM. J. INDUS. MED. 227 (1996).

[21] Robert A. Rinsky, et al., Leukemia in Benzene Workers, 2 AM. J. INDUS. MED. 217 (1981); Mary B. Paxton, et al., Leukemia Risk Associated with Benzene Exposure in the Pliofilm Cohort: I. Mortality Update and Exposure Distribution, 14 RISK ANALYSIS 147 (1994); Mary B. Paxton, et al., Leukemia Risk Associated with Benzene Exposure in the Pliofilm Cohort II. Risk Estimates, 14 RISK ANALYSIS 155 (1994).

[22] Lois B. Travis, et al., Hematopoietic Malignancies and Related Disorders Among Benzene-Exposed Workers in China, 14 LEUKEMIA & LYMPHOMA 91, 99 (1994).

[23] See, e.g., Carl F. Cranor, REGULATING TOXIC SUBSTANCES: A PHILOSOPHY OF SCIENCE AND THE LAW at 33-34 (1993) (conflating random error with posterior probabilities: “One can think of α, β (the chances of type I and type II errors, respectively) and 1- β as measures of the “risk of error” or “standards of proof.”); id. at 44, 47, 55, 72-76.

[24] 664 F. Supp. 2d at 140, 149.

The Fourth Edition’s Chapter on Admissibility of Expert Witness Testimony – Part 2

February 24th, 2026

The Manual’s new law chapter on the admissibility (vel non) of expert witness testimony was written by two law professors who teach evidence, and who often write articles together.[1] Liesa Richter teaches at the University of Oklahoma College of Law. Daniel Capra teaches at Fordham School of Law, in Manhattan. For the last three decades, Capra has been the Reporter for the Judicial Conference Advisory Committee on the Federal Rules of Evidence. There probably is no evidence law scholar more involved with the Federal Rules, including the key expert witness rules, Rule 702 and Rule 703, than Capra.

The new chapter’s strengths follow from Professor Capra’s involvement in the evolution of Rule 702. The chapter plainly acknowledges that the Supreme Court decisions in the 1990s follow from an epistemic standard, and the use of the terms “scientific” and “knowledge” in Rule 702. Counting heads, as suggested by the Frye case, was at times a weak and ambiguous proxy for knowledge.[2] The new chapter has the important advantage of not having authors entwined in the advocacy of dodgy groups such as SKAPP, and the Collegium Ramazzini. Gone from the new chapter are Berger’s gratuitous and unwarranted endorsements and mischaracterizations of carcinogenicity evaluations by the International Agency for Research on Cancer (IARC).

Like Berger’s previous versions of this chapter, the new chapter carefully explains the Supreme Court decisions on expert witness admissibility and the changes in Rule 702 over time, including the 2023 amendment to Rule 702. One glaring omission from the new chapter is the absence of any mention of the fourth Supreme Court case in the 1993-2000 quartet: Weisgram v. Marley.[3] This important opinion by Justice Ginsburg was a clear expression of the seriousness with which the Court took the gatekeeping enterprise:

“Since Daubert, moreover, parties relying on expert testimony have had notice of the exacting standards of reliability such evidence must meet… . It is implausible to suggest, post-Daubert, that parties will initially present less than their best expert evidence in the expectation of a second chance should their first trial fail.”[4]

Professor Berger discussed this case in her last chapter, but the new authors fail to mention it at all.[5]

On the plus side, Richter and Capra discuss, although much too briefly, the role that Federal Rule of Evidence 703 plays in governing expert witness testimony.[6] Rule 703 does not address the admissibility of expert witnesses’ opinions, but it does give trial courts control over the otherwise inadmissible hearsay facts and data (such as published studies) upon which expert witnesses may rely.[7] Richter and Capra do not, however, come to grips with how Rule 703 will often require trial courts to engage with the validity and flaws of specific studies in order to evaluate the reasonableness of expert witness reliance upon them. Berger, in the third edition of the Manual, completely failed to address Rule 703 and its important role in gatekeeping.

Richter and Capra helpfully advise judges to be cautious in relying upon pre-2023 amendment cases because that most recent amendment was designed to correct clearly erroneous applications of Rule 702 in both federal trial and appellate courts.[8] The new Manual authors also deserve credit for being willing to call out judges for ignoring the Rule 702 sufficiency prong and for invoking the evasive dodge of many courts in characterizing expert witness challenges as going to “weight not admissibility.”[9]

Richter and Capra improve upon past chapters by simply reporting that the 2023 amendment to Rule 702 addressed important concerns that courts were failing to keep expert witnesses “within the bounds of what can be concluded from a reliable application of the expert’s basis and methodology,” and that Rule 702(d) was amended to emphasize courts’ legal obligation to do so.[10] Berger could have discussed this phenomenon even back in 2010-11, but failed to do so.

The new authors report that the Rules Advisory Committee had been concerned that expert witnesses regularly overstate or overclaim the appropriate level of certainty for their opinions, especially in the context of forensic science.[11] Although the recognition of problematic overclaiming in forensic science is a welcome development, Richter and Capra fail to recognize that overclaiming is at the heart of the Milward case involving benzene exposure and acute promyelocytic leukemia (APML). And they seem unaware that overclaiming is baked into the precautionary principle that drives IARC pronouncements, advocacy positions of groups such as the Collegium Ramazzini, and much of regulatory rule-making.

In several respects, Richter and Capra have improved upon the past three editions in presenting the law of expert witness testimony. The new chapter gives a brief exposition of the Joiner case,[12] in which the Court concluded that there was an “analytical gap” between the plaintiffs’ expert witnesses’ conclusion on causation and the animal and human studies upon which they relied. The authors’ summary of the case explains that the Supreme Court majority held that the trial court was well within its discretion to find a cavernous analytical gap between the expert witnesses’ relied-upon evidence and their conclusion that polychlorinated biphenyls (PCBs) caused Mr. Joiner’s lung cancer.

Richter and Capra go sideways in addressing the dissent by Justice Stevens, giving it uncritical, disproportionate attention. As a dissent that never gained any serious acceptance from any other member of the high court, Justice Stevens’ opinion in Joiner hardly deserved any mention at all. Richter and Capra note, however, early in the chapter, that Justice Stevens criticized the majority in Joiner for having “examined each study relied upon by the plaintiff’s experts in a piecemeal fashion and concluded that the experts’ opinions on causation were unreliable because no one study supported causation.”[13] Stevens’ criticism was wide of the mark in that the Court specifically addressed the “mosaic” theory, which was a reprise of the plaintiffs’ unsuccessful strategy in the Bendectin litigation.[14]

Justice Stevens’ dissent wantonly embraced Joiner’s expert witnesses’ use of a “weight of the evidence” (WOE) methodology. Stevens asserted that WOE is accepted in regulatory circles, which is true but irrelevant, and that it is accepted in scientific circles, which is a gross exaggeration and misrepresentation. Richter and Capra somehow manage to discuss Stevens’ WOE argument twice,[15] thereby giving it undue, uncritical emphasis and appearing to endorse it over the majority opinion, which after all contained the holding of the Joiner case. The authors give credence to the WOE argument in Joiner by suggesting that the majority had not adequately addressed it, and by failing to provide or cite any critical commentary on WOE.

Careful readers will be left wondering why their time is being wasted with the emphasis on a dissent that was never the law, that mischaracterized the majority opinion, that endorsed a method, WOE, that has been widely criticized, and that never persuaded any other justice to join.

The scientific community has never been seriously impressed by the so-called WOE approach to determining causality. The phrase is vague and ambiguous; its use, inconsistent.[16] Although the phrase WOE is thrown around a lot, especially in regulatory contexts, it has no clear, consistent meaning or mode of application.[17]

Many lawyers, like Justice Stevens, Richter, and Capra, may feel comfortable with WOE because the phrase is used often in the law, where the subjectivity, vagueness, and lack of structure and hierarchy in the metaphor of “weighing” evidence are seen as virtues that avoid having to worry too much about the evidential soundness of verdicts.[18] The process of science, however, is not like that of a jury’s determination of a fact such as who had the right of way in a car collision case. Not all evidence is the same in science, and a scientific judgment is not acceptable when it hangs on weak evidence and invalid inferences.

The lawsuit industry and its expert witnesses have adopted WOE, much as they have adopted the equally vague term “link,” for WOE’s permissiveness toward causal inference. WOE frees them from the requirement of any meaningful methodology, which means that any conclusion is possible, including their preferred conclusion. Under WOE, any conclusion can survive gatekeeping as an opinion. WOE frees the putative expert witness from the need to consider the quality of research. WOE-ful enthusiasts such as Carl Cranor invoke WOE, or seek to inflict WOE, without mentioning the crucial “nuts and bolts” of scientific inference, such as concepts of

  • Internal and external validity
  • A hierarchy of evidence
  • Assessment of random error
  • Assessment of known and residual confounding
  • Known and potential threats to validity
  • Pre-specification of end points and statistical analyses
  • Pre-specification of weights to be assigned, and inclusionary and exclusionary criteria for studies
  • Appropriate synthesis across studies, such as systematic review and meta-analysis

These important concepts are lost in the miasma of WOE.
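The contrast between WOE and a structured synthesis can be made concrete. The sketch below shows a minimal fixed-effect (inverse-variance) meta-analysis, the standard textbook method for the last item in the list above. The three study results are hypothetical numbers invented purely for illustration; they are not data from any PCB study.

```python
import math

# Hypothetical studies: (relative risk, lower 95% CI bound, upper 95% CI bound).
# These values are invented for illustration only.
studies = [(0.9, 0.6, 1.4), (1.1, 0.8, 1.5), (0.8, 0.5, 1.3)]

def pool_fixed_effect(studies):
    """Inverse-variance pooling of log relative risks (fixed-effect model)."""
    weights, weighted_logs = [], []
    for rr, lo, hi in studies:
        # Recover the standard error of log(RR) from the reported CI width.
        se = (math.log(hi) - math.log(lo)) / (2 * 1.96)
        w = 1.0 / se**2          # weight = inverse variance
        weights.append(w)
        weighted_logs.append(w * math.log(rr))
    log_pooled = sum(weighted_logs) / sum(weights)
    se_pooled = 1.0 / math.sqrt(sum(weights))
    return (math.exp(log_pooled),
            math.exp(log_pooled - 1.96 * se_pooled),
            math.exp(log_pooled + 1.96 * se_pooled))

rr, lo, hi = pool_fixed_effect(studies)
print(f"Pooled RR = {rr:.2f} (95% CI, {lo:.2f}-{hi:.2f})")
```

The point of the sketch is that every input, weight, and error assumption is explicit and auditable in advance, which is precisely the discipline that an unstructured “weighing” of the evidence lets an expert witness escape.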

If Richter and Capra wished to take a deeper dive into the Joiner case, rather than elevate the rank speculation of the lone dissenter, Justice Stevens, they might have asked whether Joiner’s expert witnesses relied upon all, or the most carefully conducted, epidemiologic studies.

As the record was fashioned, the Supreme Court’s discussion of the plaintiffs’ expert witnesses’ methodological excesses and failures did not include a discussion of why the excluded witnesses had failed to rely upon all the available epidemiology. The challenged witnesses relied upon an unpublished Monsanto study, but apparently ignored an unpublished investigation by NIOSH government researchers, who found that there were “no excess deaths from cancers of the … the lung,” among PCB-exposed workers at a Westinghouse Electric manufacturing facility. Actually, the NIOSH report indicated a statistically non-significant decrease in the lung cancer rate among PCB-exposed workers, with a fairly narrow confidence interval; SMR = 0.7 (95% CI, 0.4 – 1.2).[19] By the time the Joiner case was litigated, this NIOSH report had been published, yet Joiner’s expert witnesses unjustifiably ignored it.[20] Nearly a decade after Joiner was decided in the Supreme Court, NIOSH scientists published updated data from this cohort, which showed that the long-term lung cancer mortality for PCB-exposed workers remained reduced, with a standardized mortality ratio of 0.88 (95% C.I., 0.7–1.1) for the cohort, and even lower, 0.82 (95% C.I., 0.5–1.3), for the workers with the highest levels of exposure.[21]
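For readers unaccustomed to the notation, an SMR is simply observed deaths divided by the deaths expected from reference-population rates, with a confidence interval conventionally computed on the log scale. The sketch below uses hypothetical counts (14 observed deaths against 20 expected), chosen only to show how an SMR of 0.7 with an interval like 0.4–1.2 arises; these are not the NIOSH figures.

```python
import math

def smr_with_ci(observed, expected, z=1.96):
    """Standardized mortality ratio with a large-sample 95% CI.

    Uses the common log-scale approximation SMR * exp(+/- z / sqrt(observed)),
    treating observed deaths as a Poisson count.
    """
    smr = observed / expected
    factor = math.exp(z / math.sqrt(observed))
    return smr, smr / factor, smr * factor

# Hypothetical: 14 observed lung cancer deaths against 20 expected.
smr, lo, hi = smr_with_ci(14, 20)
print(f"SMR = {smr:.2f} (95% CI, {lo:.2f}-{hi:.2f})")
# → SMR = 0.70 (95% CI, 0.41-1.18)
```

An SMR below 1.0 (or below 100 when expressed as a percentage) signifies fewer deaths observed than expected, which is why the studies discussed here cut against the causal claim.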

At the time the Joiner case was on its way up to the Supreme Court, two Swedish studies were available, but they were perhaps too small to add much to the mix of evidence.[22] Another study, published in 1987 and not cited by Joiner’s expert witnesses, was conducted in a cohort of North American PCB-exposed capacitor workers, and showed less than expected mortality from lung cancer.[23] Joiner thus represents not only an analytical-gap case, but also a cherry-picking case. The Supreme Court was eminently correct to affirm the exclusion of the shoddy evidence proffered in the Joiner case.

Nearly thirty years after the Supreme Court decided Joiner, the claim that PCBs cause lung cancer in humans remains unsubstantiated. Subsequent studies bore out the point that Joiner’s expert witnesses were using an improper, unsafe methodology and invalid inferences to advance a specious claim.[24] In 2015, researchers published a large, updated cohort study, funded by General Electric, on the mortality experience of workers in a plant that manufactured capacitors with PCBs. The study design was much stronger than anything relied upon by Joiner’s expert witnesses, and its results are consistent with the NIOSH study available to, but ignored by, them. The results are not uniformly good for General Electric, but on the end point of lung cancer for men, the standardized mortality ratio was 81 (95% C.I., 68–96), nominally statistically significantly below the expected value of 100.[25]

There is also the legal aftermath of Joiner, in which the Supreme Court reversed and remanded the case to the 11th Circuit, which in turn remanded the case back to the district court to address claims that Mr. Joiner had also been exposed to furans and dioxins, and that these other chemicals had caused, or contributed to, his lung cancer, as well.[26] 

Thus the dioxins were left in the case even after the Supreme Court ruled on admissibility of expert witnesses’ opinions on PCBs and lung cancer. Anthony Roisman, a lawyer with the plaintiff-side National Legal Scholars Law Firm, P.C., argued that the Court had addressed an artificial question when asked about PCBs alone because the case was really about an alleged mixture of exposures, and he held out hope that the Joiners would do better on remand.[27]

Alas, the Joiner case evaporated in the district court. In February 1998, Judge Orinda Evans, the original trial judge, who had sustained defendants’ Rule 702 challenges and granted their motions for summary judgment, received the case on remand from the 11th Circuit and reopened it. Judge Evans set a deadline for a pre-trial order, and then extended the deadline at plaintiff’s request. After Joiner’s lawyers withdrew, and then their replacements withdrew, the parties ultimately stipulated to the dismissal of the case with prejudice, in February 1999. The case had run its course, and so had the claim that dioxins were responsible for plaintiff’s lung cancer.

In 2006, the National Research Council published a monograph on dioxin, which took the controversial approach of focusing on all cancer mortality rather than specific cancers that had been suggested as likely outcomes of interest.[28] The validity of this approach, and the committee’s conclusions, were challenged vigorously in subsequent publications.[29] In 2013, the Industrial Injuries Advisory Council (IIAC), an independent scientific advisory body in the United Kingdom, published a review of lung cancer and dioxin. The Council found the epidemiologic studies mixed, and declined to endorse the compensability of lung cancer for dioxin-exposed industrial workers.[30]

When Justice Stevens dissented in Joiner in 1997, his assessment of science, scientific methodology, and law was wrong, and it has remained wrong over the nearly three decades since. His viewpoints never gained acceptance from any other justice on the Supreme Court. Richter and Capra, in writing the first chapter of the new Reference Manual, lead judges and lawyers astray by improvidently elevating the dissent, as though it were law, and by failing to provide sufficient context, analysis, and criticism.

(To be continued.)


[1] Liesa L. Richter & Daniel J. Capra, The Admissibility of Expert Testimony, National Academies of Sciences, Engineering, and Medicine & Federal Judicial Center, REFERENCE MANUAL ON SCIENTIFIC EVIDENCE 1 (4th ed. 2025).

[2] Id. at 6.

[3] 528 U.S. 440 (2000).

[4] 528 U.S. at 445 (internal citations omitted).

[5] Margaret A. Berger, The Admissibility of Expert Testimony, in National Academies of Sciences, Engineering and Medicine & Federal Judicial Center, REFERENCE MANUAL ON SCIENTIFIC EVIDENCE 11, 18-19 (3d ed. 2011).

[6] Richter & Capra at 17.

[7] See Nathan A. Schachtman, Rule of Evidence 703—Problem Child of Article VII, PROOF 3 (Spring 2009).

[8] Id. at 13.

[9] Id. at 16.

[10] Id. at 22-23.

[11] Id. at 23, 39.

[12] General Electric Co. v. Joiner, 522 U.S. 136 (1997).

[13] Richter & Capra at 10 (citing General Electric Co. v. Joiner, 522 U.S. 136, 150-155 (1997) (Stevens, J.)).

[14] Joiner, 522 U.S. at 147-48.

[15] Richter & Capra at 10, 31.

[16] See, e.g., V. H. Dale, G.R. Biddinger, M.C. Newman, J.T. Oris, G.W. Suter II, T. Thompson, et al., Enhancing the ecological risk assessment process, 4 INTEGRATED ENVT’L ASSESS. MANAGEMENT 306 (2008) (“An approach to interpreting lines of evidence and weight of evidence is critically needed for complex assessments, and it would be useful to develop case studies and/or standards of practice for interpreting lines of evidence.”); Igor Linkov, Drew Loney, Susan M. Cormier, F. Kyle Satterstrom & Todd Bridges, Weight-of-evidence evaluation in environmental assessment: review of qualitative and quantitative approaches, 407 SCI. TOTAL ENV’T 5199–205 (2009); Douglas L. Weed, Weight of Evidence: A Review of Concept and Methods, 25 RISK ANALYSIS 1545 (2005) (noting the vague, ambiguous, indefinite nature of the concept of WOE review); R.G. Stahl Jr., Issues addressed and unaddressed in EPA’s ecological risk guidelines, 17 RISK POLICY REPORT 35 (1998) (noting that U.S. EPA’s guidelines for ecological WOE approaches to risk assessment fail to provide meaningful guidance); Glenn W. Suter & Susan M. Cormier, Why and how to combine evidence in environmental assessments: Weighing evidence and building cases, 409 SCI. TOTAL ENV’T 1406, 1406 (2011) (noting arbitrariness and subjectivity of WOE “methodology”).

[17] See Charles Menzie, et al., “A weight-of-evidence approach for evaluating ecological risks; report of the Massachusetts Weight-of-Evidence Work Group,” 2 HUMAN ECOL. RISK ASSESS. 277, 279 (1996) (“although the term ‘weight of evidence’ is used frequently in ecological risk assessment, there is no consensus on its definition or how it should be applied”); Sheldon Krimsky, “The weight of scientific evidence in policy and law,” 95 AM. J. PUB. HEALTH S129 (2005) (“However, the term [WOE] is applied quite liberally in the regulatory literature, the methodology behind it is rarely explicated.”).

[18] See, e.g., People v. Collier, 146 A.D.3d 1146, 1147-48, 2017 NY Slip Op 00342 (N.Y. App. Div. 3d Dep’t, Jan. 19, 2017) (rejecting appeal based upon defendant’s claim that conviction was against “weight of the evidence”); Venson v. Altamirano, 749 F.3d 641, 656 (7th Cir. 2014) (noting “new trial is appropriate if the jury’s verdict is against the manifest weight of the evidence”).

[19] Thomas Sinks, et al., Health Hazard Evaluation Report, HETA 89-116-209 (Jan. 1991).

[20] Thomas Sinks, et al., Mortality among workers exposed to polychlorinated biphenyls, 136 AM. J. EPIDEMIOL. 389 (1992).

[21] Avima M. Ruder, et al., Mortality among Workers Exposed to Polychlorinated Biphenyls (PCBs) in an Electrical Capacitor Manufacturing Plant in Indiana: An Update, 114 ENVT’L HEALTH PERSP. 18, 21 (2006).

[22] P. Gustavsson, et al., Short-term mortality and cancer incidence in capacitor manufacturing workers exposed to polychlorinated biphenyls (PCBs), 10 AM. J. INDUS. MED. 341 (1986); P. Gustavsson & C. Hogstedt, “A cohort study of Swedish capacitor manufacturing workers exposed to polychlorinated biphenyls (PCBs),” 32 AM. J. INDUS. MED. 234 (1997) (cancer incidence for entire cohort, SIR = 86; 95% CI, 51-137).

[23] David P. Brown, “Mortality of workers exposed to polychlorinated biphenyls–an update,” 42 ARCH. ENVT’L HEALTH 333, 336 (1987).

[24] See Mary M. Prince, et al., Mortality and exposure response among 14,458 electrical capacitor manufacturing workers exposed to polychlorinated biphenyls (PCBs), 114 ENVT’L HEALTH PERSP. 1508, 1511 (2006) (reporting a nominally statistically significant decreased mortality ratio of 0.78, 95% C.I. 0.65–0.93, for men exposed to PCBs); Avima M. Ruder, Mortality among 24,865 workers exposed to polychlorinated biphenyls (PCBs) in three electrical capacitor manufacturing plants: a ten-year update, 217 INT’L J. HYG. & ENVT’L HEALTH 176, 181 (2014) (reporting no increase in the lung cancer standardized mortality ratio for long-term workers, 0.99, 95% C.I., 0.91–1.07).

[25] Renate D. Kimbrough, et al., Mortality among capacitor workers exposed to polychlorinated biphenyls (PCBs), a long-term update, 88 INT’L ARCH. OCCUP. & ENVT’L HEALTH 85 (2015).

[26] Joiner v. General Electric Co., 134 F.3d 1457 (11th Cir. 1998) (per curiam).

[27] Anthony Z. Roisman, The Implications of G.E. v. Joiner for Admissibility of Expert Testimony, 65 VT. J. ENVT’L L. 1 (1998).

[28] See David L. Eaton (Chairperson), HEALTH RISKS FROM DIOXIN AND RELATED COMPOUNDS – EVALUATION OF THE EPA REASSESSMENT (2006).

[29] Paolo Boffetta, et al., TCDD and cancer: A critical review of epidemiologic studies, 41 CRIT. REV. TOXICOL. 622 (2011) (“In conclusion, recent epidemiological evidence falls far short of conclusively demonstrating a causal link between TCDD exposure and cancer risk in humans.”).

[30] Industrial Injuries Advisory Council – Information Note on Lung cancer and Dioxin (December 2013). See also Mann v. CSX Transp., Inc., 2009 WL 3766056, 2009 U.S. Dist. LEXIS 106433 (N.D. Ohio 2009) (Polster, J.) (dioxin exposure case) (“Plaintiffs’ medical expert, Dr. James Kornberg, has opined that numerous organizations have classified dioxins as a known human carcinogen. However, it is not appropriate for one set of experts to bring the conclusions of another set of experts into the courtroom and then testify merely that they ‘agree’ with that conclusion.”), citing Thorndike v. DaimlerChrysler Corp., 266 F. Supp. 2d 172 (D. Me. 2003) (court excluded expert who was “parroting” other experts’ conclusions).

The Reference Manual’s Chapter on Expert Witness Testimony Admissibility – Part One

February 23rd, 2026

With the retraction of the climate science chapter, the Reference Manual on Scientific Evidence is now one chapter shorter, at least in the Federal Judicial Center’s version. At the time of this writing, for curious souls, the National Academies version is still sporting the climate advocacy chapter. Even without the climate chapter, the Manual is over 1,000 pages, and more than a casual weekend read. Many judges, finding this tome on their desks, will read individual subject-matter chapters pro re nata. The first chapter in the Manual, however, is about the law, not science, and might be the starting place for the ordinary workaday judge. As in past editions of the Manual, the new edition has a chapter on The Admissibility of Expert Testimony. In the first, second, and third editions, this chapter was written by Professor Margaret Berger. In the fourth edition, the chapter on the law was written by law professors Liesa Richter and Daniel Capra. To understand and evaluate the most recent iteration, the reader should have some sense of what has gone before.

Previous Chapters on Admissibility of Expert Witness Testimony

Professor Berger’s past chapters had been idiosyncratic productions.[1] Berger was an evidence law scholar, who wrote often about expert witness admissibility issues.[2] She was also known for her antic proposals, such as calling for abandoning the element of causation in products liability cases.[3] As an outspoken ideological opponent of expert witness gatekeeping, Berger was a strange choice to write the law chapter of the Manual.[4] Berger’s chapters in the first through the third editions made her opposition to gatekeeping obvious, and this hostility may have been responsible for some of the judicial resistance to applying the clear language of Rule 702, even after its 2000 revision.

Berger was not only a law professor; she was at the center of ideologically and financially conflicted groups that worked to undermine the application of Rule 702 in health effects cases. One of the key players in this concerted action was David Michaels. Currently, Michaels teaches epidemiology at the George Washington University Milken Institute School of Public Health. He is a card-carrying member of the Collegium Ramazzini, an organization that has participated in efforts to corrupt state and federal judges by funding ex parte conferences with lawsuit industry expert witnesses.[5] Michaels is the author of two books, both hostile to manufacturing industry and biased in favor of the lawsuit industry.[6] Both books are provocatively titled anti-industry diatribes, which have little scholarly value, but are used regularly by plaintiffs’ counsel solely to smear corporate defendants and defense expert witnesses. Most clear-eyed trial judges have quashed these efforts on various grounds, including Rule 703, because the books are not the sort of material upon which scientists would reasonably rely.[7]

In 2002, David Michaels created an anti-Daubert advocacy organization, the Project on Scientific Knowledge and Public Policy (SKAPP), from money siphoned from the plaintiffs’ common-benefit fund in MDL 926 (silicone gel breast implant litigation).[8] Michaels used some of the misdirected money to prepare and publish an anti-Daubert pamphlet for SKAPP, in 2003.[9] In this anti-Daubert publication, and in many others sponsored by SKAPP, Michaels and the SKAPP grantees typically acknowledged the source of SKAPP funding obliquely, to hide that it was nothing more than plaintiffs’ counsel’s walking-around money:

“I am also grateful for the support SKAPP has received from the Common Benefit Trust, a fund established pursuant to a court order in the Silicone Gel Breast Implant Liability litigation.”[10]

Many credulous lawyers, judges, and legal scholars were duped into believing that SKAPP, SKAPP publications, and SKAPP-sponsored publications were supported by the Federal Judicial Center.

Michaels directed a good amount of SKAPP’s anti-Daubert funding to support Professor Berger’s efforts in organizing a series of symposia on science and the law. Several of Berger’s SKAPP conferences were held in Coronado, California, and featured a predominance of scientists who work for the lawsuit industry and are affiliated with advocacy organizations, such as the Collegium Ramazzini. The papers from one of the Coronado Conferences were published in a special issue of the American Journal of Public Health, the official journal of the American Public Health Association,[11] which has issued position papers highly critical of Rule 702 gatekeeping.[12]

The spider web of connections between SKAPP, the Collegium Ramazzini, the American Public Health Association, the Tellus Institute, the lawsuit industry, Professor Berger, and others hostile to Rule 702 is a testament to the concerted action to undermine the Supreme Court’s decisions in the area, and the codification of those decisions in Rule 702. That Professor Berger was within this web of connections, and was writing the chapter on the admissibility of expert witness opinion testimony in the first three editions of the Reference Manual, explains but does not justify many of the opinions contained within those chapters.

Professor David Bernstein, who has written extensively on expert witness issues, restated the situation thus:

“In 2003, the toxic tort plaintiffs’ bar used money from a fund established as part of the silicone breast implant litigation settlement to sponsor four conferences in Coronado, California, that resulted in a slew of policy papers excoriating the Daubert gatekeeping requirement.”[13]

The active measures of these groups and Professor Berger explain the straight line between Berger’s symposia and the First Circuit’s decision in Milward v. Acuity Specialty Products Group, Inc.[14] Carl Cranor was one of the speakers at the Coronado Conferences, and, along with Martyn Smith, another member of the Collegium Ramazzini, founded a Proposition 65 bounty-hunting organization, the Council for Education and Research on Toxics (CERT). Cranor has long advocated a loosey-goosey “weight of the evidence” approach of the sort rejected by the Supreme Court in Joiner.[15] Cranor and Smith unsurprisingly turned up as expert witnesses for the plaintiff in Milward, in which case they reprised their weight-of-the-evidence opinions. When Milward appealed the exclusion of Cranor and Smith, CERT filed an amicus brief, without disclosing that Cranor and Smith were founders of the organization, and that CERT funded Smith’s research through donations to his university, from CERT’s shake-down operations under Prop 65. The First Circuit’s 2011 decision in Milward resulted from a fraud on the court.

Professor Berger died in November 2010, but when the third edition of the Manual was released in 2011, it contained Berger’s chapter on the law of expert witnesses, with a citation to the Milward case, decided after her death.[16] An editorial note from an unnamed editor to her posthumous chapter suggested that

“[w]hile revising this chapter Professor Berger became ill and, tragically, passed away. We have published her last revision, with a few edits to respond to suggestions by reviewers.”

Given that Berger was an ideological opponent of expert witness gatekeeping, there can be little doubt that she would have endorsed the favorable references to Milward made after her passing, but adding them can hardly be considered non-substantive edits. Curious readers might wonder who the editor was who took such liberties in adding the chapter’s citations to Milward. Curious readers do not have to wonder, however, what would have happened if the incestuous relationships among Berger, SKAPP, the plaintiffs’ bar, and others had been replicated by similar efforts of manufacturing industry to influence the interpretation and application of the law. In 2008, the Supreme Court decided an important case involving constitutional aspects of punitive damages. The Court went out of its way to decline to rely upon empirical research that showed the unpredictability of punitive damage awards, because it was funded in part by Exxon:

“The Court is aware of a body of literature running parallel to anecdotal reports, examining the predictability of punitive awards by conducting numerous ‘mock juries’, where different ‘jurors’ are confronted with the same hypothetical case. See, e.g., C. Sunstein, R. Hastie, J. Payne, D. Schkade, & W. Viscusi, Punitive Damages: How Juries Decide (2002); Schkade, Sunstein, & Kahneman, Deliberating About Dollars: The Severity Shift, 100 Colum. L.Rev. 1139 (2000); Hastie, Schkade, & Payne, Juror Judgments in Civil Cases: Effects of Plaintiff’s Requests and Plaintiff’s Identity on Punitive Damage Awards, 23 Law & Hum. Behav. 445 (1999); Sunstein, Kahneman, & Schkade, Assessing Punitive Damages (with Notes on Cognition and Valuation in Law), 107 Yale L.J. 2071 (1998). Because this research was funded in part by Exxon, we decline to rely on it.”[17]

Unlike the situation with SKAPP, David Michaels, the plaintiffs’ bar, and Professor Berger, the studies sponsored in part by Exxon had disclosed their funding clearly. Those studies involved outstanding scientists whose integrity was unquestionable, and for its trouble, Exxon was rewarded with gratuitous shaming from Justice Souter. The anti-Daubert papers sponsored by the plaintiffs’ bar through SKAPP, and Professor Berger’s ideological conflicts of interest, have received a free pass. This disparate treatment of conflicts of interest within manufacturing industry and those within the lawsuit industry and its advocacy group allies is a serious social, political, and legal problem. It was a problem on full display in the now-retracted climate science chapter in the Manual. In evaluating the new fourth edition’s chapter on the law of expert witness admissibility (and other chapters), we should be asking whether there are signs of undue political influence.


[1] See Schachtman, The Late Professor Berger’s Introduction to the Reference Manual on Scientific Evidence, TORTINI (Oct. 23, 2011).

[2] See generally Edward K. Cheng, Introduction: Festschrift in Honor of Margaret A. Berger, 75 BROOKLYN L. REV. 1057 (2010). 

[3] Margaret A. Berger, Eliminating General Causation: Notes towards a New Theory of Justice and Toxic Torts, 97 COLUM. L. REV. 2117 (1997).

[4] See, e.g., Margaret A. Berger & Aaron D. Twerski, “Uncertainty and Informed Choice:  Unmasking Daubert,” 104 MICH. L.  REV. 257 (2005). 

[5] In re School Asbestos Litig., 977 F.2d 764 (3d Cir. 1992). See Cathleen M. Devlin, Disqualification of Federal Judges – Third Circuit Orders District Judge James McGirr Kelly to Disqualify Himself So As To Preserve ‘The Appearance of Justice’ Under 28 U.S.C. § 455 – In re School Asbestos Litigation (1992), 38 VILL. L. REV. 1219 (1993); Bruce A. Green, May Judges Attend Privately Funded Educational Programs? Should Judicial Education Be Privatized?: Questions of Judicial Ethics and Policy, 29 FORDHAM URB. L. J. 941, 996-98 (2002).

[6] David Michaels, DOUBT IS THEIR PRODUCT: HOW INDUSTRY’S WAR ON SCIENCE THREATENS YOUR HEALTH (2008); David Michaels, THE TRIUMPH OF DOUBT (2020).

[7] See In re DePuy Orthopaedics, Inc. Pinnacle Hip Implant Prods. Liab. Litig., 888 F.3d 753, 787 n.71 (5th Cir. 2018) (advising the district court to weigh carefully whether Doubt is Their Product has any legal relevance); King v. DePuy Orthopaedics, Inc., 2024 WL 6953089, at *2 (D. Ariz. July 9, 2024) (finding Michaels’ books to be legally irrelevant); Sarjeant v. Foster Wheeler LLC, 2024 WL 4658407, at *1 (N.D. Cal. Oct. 24, 2024) (ruling that Doubt Is Their Product is legally irrelevant hearsay, and not the type of material upon which an expert witness would rely to form scientific opinion). See also Evans v. Biomet, Inc., 2022 WL 3648250, at *4 (D. Alaska Feb. 1, 2022) (quashing plaintiff’s subpoena to defendant’s expert for material in connection with Doubt Is Their Product).

[8] See Ralph Klier v. Elf Atochem North America Inc., 2011 U.S. App. LEXIS 19650 (5th Cir. 2011) (holding that district court abused its discretion in distributing residual funds from class action over arsenic exposure to charities; directing that residual funds be distributed to class members with manifest personal injuries). A “common benefit” fund is commonplace in multi-district litigation of mass torts.  In such cases, federal courts may require the defendant to “hold back” a certain percentage of settlement proceeds, to pay into a fund, which is available to those plaintiffs’ counsel who did “common benefit work,” work for the benefit of all claimants.  Plaintiffs’ counsel who worked for the common benefit of all claimants may petition the MDL court for compensation or reimbursement for their work or expenses.  See, e.g., William Rubenstein, On What a ‘Common Benefit Fee’ Is, Is Not, and Should Be, CLASS ACTION ATT’Y FEE DIG. 87, 89 (Mar. 2009).  In the silicone gel breast implant litigation (MDL 926), plaintiffs’ counsel on the MDL Steering Committee undertook common benefit work in the form of developing expert witnesses for trial, and funding scientific studies.  By MDL Orders 13, and 13A, the Court set hold-back amounts of 5 or 6%, and later reduced the amount to 4%.  Id. at 94.

[9] Eula Bingham, Leslie Boden, Richard Clapp, Polly Hoppin, Sheldon Krimsky, David Michaels, David Ozonoff & Anthony Robbins, Daubert: The Most Influential Supreme Court Ruling You’ve Never Heard Of (June 2003). The authors described the publication as a publication of SKAPP, coordinated by the Tellus Institute, and funded by The Bauman Foundation, a private foundation that supports “progressive social change advocacy.” Boden, Hoppin, Michaels, and Ozonoff are fellows of the Collegium Ramazzini.

[10] David Michaels, DOUBT IS THEIR PRODUCT: HOW INDUSTRY’S WAR ON SCIENCE THREATENS YOUR HEALTH 267 (2008). See Nathan Schachtman, “SKAPP A LOT,” TORTINI (April 30, 2010); “Manufacturing Certainty,” TORTINI (Oct. 25, 2011); “David Michaels’ Public Relations Problem,” TORTINI (Dec. 2, 2011); “Conflicted Public Interest Groups,” TORTINI (Nov. 3, 2013).

[11] 95 AM. J. PUB. HEALTH S1 (2005).

[12] See, e.g., Am. Pub. Health Assn, Threats to Public Health Science, Policy Statement 2004-11 (Nov. 9, 2004), available at https://www.apha.org/policy-and-advocacy/public-health-policy-briefs/policy-database/2014/07/02/08/52/threats-to-public-health-science

[13] David E. Bernstein & Eric G. Lasker, Defending Daubert: It’s Time to Amend Federal Rule of Evidence 702, 57 WM. & MARY L. REV. 1, 39 (2015), available at https://scholarship.law.wm.edu/wmlr/vol57/iss1/2. See David Michaels & Neil Vidmar, Foreword, 72 LAW & CONTEMP. PROBS. i, ii (2009) (“SKAPP has convened four Coronado Conferences.”).

[14] Milward v. Acuity Specialty Products Group, Inc., 639 F.3d 11 (1st Cir. 2011), cert. denied sub nom., U.S. Steel Corp. v. Milward, 132 S. Ct. 1002 (2012).

[15] General Electric Co. v. Joiner, 522 U.S. 136, 136-37 (1997).

[16] Margaret A. Berger, The Admissibility of Expert Testimony, in National Academies of Sciences, Engineering and Medicine & Federal Judicial Center, REFERENCE MANUAL ON SCIENTIFIC EVIDENCE 11, 20 n.51, 23-24 n.61 (3d ed. 2011).

[17] Exxon Shipping Co. v. Baker, 554 U.S. 471, 128 S. Ct. 2605, 2626 n.17 (2008).

The First Daubert Motion

February 20th, 2026

As every school child knows, or at least every law student in the United States knows, Daubert was a Bendectin case. The plaintiff claimed that his mother’s use of Bendectin, a prescription anti-nausea medication, during pregnancy caused him to be born with a major limb reduction defect.

Filed in 1984, the Daubert case was pending, in summer 1989, before Judge Earl Ben Gilliam, in the Southern District of California. A trial date was approaching, as was a deadline for motions for summary judgment. The first Daubert motion was filed in August 1989, in Daubert v. Merrell Dow Pharmaceuticals, Inc.[1] It was a motion for summary judgment, not a motion specifically to exclude plaintiffs’ expert witness’s proffered testimony.

By the time of the first Daubert motion, the plaintiff was relying upon the anticipated testimony of John Davis Palmer, M.D. For the time, John Davis Palmer was not an unlikely expert witness. Although Palmer practiced internal medicine, he had a doctorate in pharmacology. Palmer, however, had no experience studying Bendectin, and no real expertise in epidemiology. He had never designed or published an epidemiologic study, and he had never done any kind of research on Bendectin. The standard for qualifying an expert witness, even in federal court, has always been very low, and thus not an effective way to police the quality of scientific evidence.

Palmer was a rather late substitute for expert witnesses previously listed by the plaintiff. Alan Kimball Done, a pediatrician, had been the main warhorse of the Bendectin plaintiffs, but he was withdrawn by plaintiff’s counsel after he was found to have committed perjury about his academic credentials in another Bendectin case.[2]

Plaintiff also needed to drop another expert witness, William Griffith McBride, who had been a star in plaintiff’s counsel’s stable. McBride helped show the teratogenicity of thalidomide in the early 1960s,[3] and his work in the Bendectin litigation gave these dodgy cases some patina of respectability. In 1988, however, McBride was accused of fraud, for which he would eventually lose his medical license.[4] McBride also chose, rather improvidently, to sue journalists, journals, and Merrell Dow executives, for reporting his rather extensive fees, only to lose that litigation.[5] When plaintiff’s counsel withdrew McBride, plaintiff was left with Dr. Palmer as the sole expert witness on both general and specific causation.

At the time that the first Daubert motion was filed, manufacturer Merrell Dow had voluntarily withdrawn Bendectin from the market, without any suggestion from the FDA that this action was necessary or in the public interest. The manufacturer had also enjoyed considerable success in court. In 1985, before Chief Judge Carl Rubin of the Southern District of Ohio, the company had tried a case consolidating the general causation claims of over 800 plaintiffs to a defense jury verdict.[6] Despite some isolated trial losses, the company had been vindicated in three federal circuits by the time its lawyers filed the “Daubert” motion.[7] The United States Courts of Appeals for the First, Fifth, and District of Columbia Circuits had all held that the plaintiffs’ case was legally insufficient to sustain a verdict against the defendant, or that the expert testimony involved was inadmissible.

In the Daubert case, Merrell Dow Pharmaceuticals was represented by the law firm Dickson, Carlson & Campillo. The important task of drafting the motion for summary judgment landed on the desk of a first-year associate, Pamela Yates, who is now a partner at Arnold & Porter. Given that Merrell Dow had succeeded in other appellate courts, the task may have seemed straightforward, but the legal theories were actually all over the map.

The first Daubert motion was not styled as a motion to exclude expert witness opinion testimony, but rather as a motion for summary judgment, pursuant to Federal Rule of Civil Procedure 56, on the issue of causation. Merrell Dow’s supporting brief did not clearly invoke the distinction between general and specific causation, which distinction was not widely drawn until later in the 1990s. The supporting brief implicitly addressed both general and specific causation.

At the time that the first Daubert motion was made, there was no clear consensus of precedent identifying the authority for a trial court to rule peremptorily on a weak evidentiary display on the causation issue. The evidence supporting the defense expert witnesses’ opinion that Bendectin had not been shown to cause birth defects generally, and limb reduction defects specifically, was strong. For all major congenital defects, there had been no change in overall incidence during the years in which Bendectin was marketed. Such an ecological argument usually has no validity, but in the case of Bendectin, for several years, roughly half of all pregnant women used the medication. When the medication was abruptly withdrawn,[8] not because of the science but because of the cost of the litigation, the rate of birth defects remained unaffected. The great majority of birth defects have no known cause, and there was no scientific consensus that Bendectin caused birth defects; indeed, by 1989, the nearly universal consensus was that Bendectin did not cause birth defects.[9]

There were also many analytical epidemiologic studies, which individually and in combination failed to support a conclusion of causation.

In the face of the defense’s affirmative evidence, the plaintiff relied upon a potpourri of evidence:

1) chemical structure-activity analysis;

2) in vitro (test tube) studies;

3) in vivo (animal teratology) studies; and

4) reanalysis of epidemiologic studies.

Plaintiff’s lead counsel Barry Nace[10] had concocted this potpourri approach, which he called “mosaic theory,” and which might more aptly be called the tsemish or the schmegegge theory.[11] Whatever Nace called it, he fed it to his expert witness to argue that:

“Like the pieces of a mosaic, the individual studies showed little or nothing when viewed separately from one another, but they combined to produce a whole that was greater than the sum of its parts: a foundation for Dr. Done’s opinion that Bendectin caused appellant’s birth defects.”[12]

Although philosopher Harry Frankfurt had not yet written his seminal treatise on the subject, most courts saw that this was bullshit, which tends to result “whenever a person’s obligations or opportunities to speak about some topic exceed his knowledge of the facts that are relevant to that topic.”[13]

In addition to favorable opinions from the First, Fifth, and District of Columbia Circuits, Merrell Dow had the Zeitgeist working in its favor. The influential, plaintiff-friendly Judge Jack Weinstein had rolled up his sleeves and taken a hard look at the plaintiffs’ scientific evidence in the Agent Orange litigation. Judge Weinstein found that evidence wanting in an important opinion in 1985.[14] Although the alleged causal agent in Agent Orange was not Bendectin, Judge Weinstein recognized that epidemiologic studies were, in a similar medico-legal context, “the only useful studies having any bearing on causation.”[15] Judge Weinstein relied heavily upon Federal Rule of Evidence 703, which governed what inadmissible studies expert witnesses could rely upon, to whittle down the reliance list of plaintiffs’ expert witnesses before declaring their opinions too fragile to support a reasonable jury’s verdict in favor of plaintiffs.

More generally, discerning members of the legal system were reaching the end of their tolerance for the common law laissez-faire approach to expert witness evidence. In 1986, the Department of Justice issued a report that explicitly called for meaningful judicial gatekeeping of expert witnesses.[16] And in that same year, 1986, Judge Patrick Higginbotham wrote an influential opinion, in which he warned that expert witness opinion testimony was out of control, with expert witnesses becoming mouthpieces for the lawyers and advocates of policy beyond their proper role. Judge Higginbotham observed that trial judges (with support from appellate court judges) had a duty to address the problem by policing the soundness of opinions proffered in litigation, and to reject the system’s reliance upon expert witnesses simply because they “say[] it is so:”[17]

“we recognize the temptation to answer objections to receipt of expert testimony with the shorthand remark that the jury will give it ‘the weight it deserves’. This nigh reflexive explanation may be sound in some cases, but in others it can mask a failure by the trial judge to come to grips with an important trial decision. Trial judges must be sensitive to the qualifications of persons claiming to be experts … . Our message to our able trial colleagues: It is time to take hold of expert testimony in federal trials.”[18]

Although Merrell Dow had a substantial tailwind behind its motion for summary judgment, there was no one clear theory upon which it could rely. Some of the Bendectin appellate court opinions were based upon the insufficiency of the plaintiffs’ expert witness evidence, on the basis of the entire record after trial. The evidence in Daubert was virtually the same as, if not more restricted than, the record in some of those appellate cases. The ecological evidence was clear.

Some of the judgments relied upon by Merrell Dow were based upon the Frye test, and some were based upon Rule 703, which addresses what kinds of otherwise inadmissible evidence expert witnesses may rely upon in formulating their opinions. Finally, some courts, such as the Fifth Circuit in In re Air Crash Disaster at New Orleans, were beginning to see Rule 702 as the source of their authority to control wayward expert witness opinion testimony.

Merrell Dow advanced multiple lines of analysis to show that plaintiffs could not establish causation on the then-current scientific record. The first Daubert motion had no single clear line of authority, and so, understandably, it cast a wide net over all available legal rules and doctrines to oppose the plaintiff’s potpourri Bendectin causation theory. The motion harnessed precedents based upon the sufficiency of the plaintiffs’ proffered expert witness evidence, Federal Rules of Evidence 702 and 703, as well as the 1923 Frye case.[19]

The cases that invoked Frye doctrine presented several interpretative problems. Frye was a criminal case that prohibited expert witnesses from testifying about their interpretations of the output of a mechanical device. The Frye case’s insistence upon general acceptance, when imported into a causation dispute in a tort case, was ambiguous as to what exactly had to be generally accepted: the specific causal claim, or the method used to reach the causal claim, or the method used as applied to the facts of the case. Furthermore, Frye’s requirement of general acceptance was not explicitly incorporated into either Rule 702 or 703, when promulgated in 1975.[20]

Merrell Dow had ample evidence that there was no general acceptance of the plaintiff’s causal claim, but its counsel also showed that by applying generally accepted methodology, scientists could not reach the plaintiff’s causal conclusion, and no scientist outside of the litigation had done so. In particular, there was general acceptance of the propositions that non-human in vivo and in vitro teratology experiments have little if any predictive ability for human outcomes. Because randomized controlled trials were never an option for testing human teratogenicity, observational epidemiology was required, and the available studies were largely exonerative. Only by post-publication data dredging and manipulation was plaintiffs’ expert witness Palmer (following what Shanna Swan had done in previous cases) able to raise questions about possible associations. Plaintiff’s expert witness Palmer could not show that these manipulations were a generally accepted method for interpreting or re-analyzing published studies.

In its last point, the first Daubert motion also maintained that the standard for medical causation required that the relevant relative risk exceed two.[21] As noted, the brief did not distinguish general from specific causation, a distinction that had not entered the legal lexicon fully in 1989. The brief’s citation to swine-flu cases, however, clarifies the nature of Merrell Dow’s argument. In the swine-flu litigation, the United States government assumed liability for adverse effects of a vaccine for swine flu. The government recognized that within a certain time window after vaccination, patients had more than a doubled risk of Guillain-Barré syndrome (GBS), an autoimmune neurological condition. The government refused compensation for claimants outside that window. Merrell Dow relied heavily upon one swine flu case, Cook v. United States, which articulated and applied the principle:

“Wherever the relative risk to vaccinated persons is greater than two times the risk to unvaccinated persons, there is a greater than 50% chance that a given GBS case among vaccinees of that latency period is attributable to vaccination, thus sustaining plaintiff’s burden of proof on causation.”[22]

In other words, the government had conceded that the swine-flu vaccine could cause GBS in some temporal situations, but not others. The magnitude of the causal association had been quantified in relative risk terms by epidemiologic studies. Only for those claimants vaccinated in time windows with relative risks greater than two could courts conclude that GBS was, more likely than not, caused by vaccination.
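The doubling-of-risk rule in Cook rests on simple attributable-fraction arithmetic: if the relative risk is RR, the probability that a given exposed case was caused by the exposure is (RR − 1)/RR, which exceeds 50 percent only when RR is greater than two. A minimal sketch of that arithmetic (the function name is mine, for illustration, not the court's):

```python
def probability_of_causation(relative_risk: float) -> float:
    """Attributable fraction among the exposed: (RR - 1) / RR.

    This exceeds 0.5 (more likely than not) only when RR > 2,
    which is the logic of the swine-flu GBS compensation windows.
    """
    if relative_risk <= 1.0:
        return 0.0  # no excess risk, nothing to attribute
    return (relative_risk - 1.0) / relative_risk

# RR = 2 is the break-even point: exactly half of exposed cases
# would be attributable to the exposure.
print(probability_of_causation(2.0))  # 0.5
# An RR below 2 cannot carry a preponderance burden on this logic.
print(probability_of_causation(1.5))  # about 0.33
```

The design point is that the formula says nothing about general causation; it only allocates cases once a causal relative risk is conceded or proven, which is why Merrell Dow deployed it as an arguendo fallback.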

Unlike the federal government in the swine-flu GBS litigation, however, Merrell Dow was not conceding general causation for any exposure scenario. The first Daubert motion can only be read to deny general causation, and to explain further that even if the court were to assume, arguendo, that Bendectin causes limb reduction defects based upon Palmer’s schmegegge and Swan’s re-jiggered risk ratios, there would still be no proper inference that Bendectin more likely than not caused Jason Daubert’s birth defects.

In response to these arguments, the plaintiff’s counsel argued their mosaic, potpourri, schmegegge theories. Although plaintiffs were down to Dr. Palmer, they filed transcripts and affidavits from a host of other expert witnesses, from previous Bendectin cases.

As for the legal rules of decision, Barry Nace, on behalf of plaintiffs, argued that Rule 703 had “absorbed” the Frye rule. Having been shown to be qualified under the minimal standard of Rule 702, these expert witnesses then satisfied Rule 703 by relying upon “scientific evidence” of the sort that experts in their field rely upon, even if other scientists would not rely upon such evidence in support of a conclusion. Otherwise those expert witnesses were unrestrained by the law, and they were free to assess their relied upon facts and data as sufficient to show that Bendectin probably causes birth defects and that Bendectin caused Jason Daubert’s birth defects. Nace argued that as long as expert witnesses, properly qualified, offered relevant opinions, based upon “things of science,” they could opine that the earth was flat, and it was for the jury to sort out whether to believe them.

Judge Earl Gilliam found Nace’s position untenable, and granted summary judgment later in 1989.[23] Interestingly, Judge Gilliam’s opinion in the district court never cited Federal Rule of Evidence 702. Instead, the opinion pointed to Rule 703, as restricting evidence, even if “science,” unless the proponent showed that the underlying principle had gained general acceptance in the relevant field.[24] Opinions not based upon facts or data “of a type reasonably relied upon by experts in the particular field” would be confusing, misleading, and unhelpful, and thus inadmissible. The reference to helpfulness might perhaps be taken as an implicit invocation of Rule 702.

Judge Gilliam had the benefit of the Circuit decisions in Brock, Richardson, and Lynch, with their various holdings of insufficiency or inadmissibility of plaintiffs’ expert witness evidence. In particular, Judge Gilliam cited Brock for the proposition that trial courts must “critically evaluate the reasoning process by which the experts connect data to their conclusions in order for courts to consistently and rationally resolve the disputes before them.”[25] Following Judge Weinstein on Agent Orange, and the previous federal decisions on Bendectin, Judge Gilliam observed that causation in the Bendectin cases could be established, under the circumstances of plaintiffs’ evidentiary display, only through reliance upon epidemiologic evidence. Dr. Done’s schmegegge, concocted as it was by Barry Nace, would not get plaintiffs to a jury.

Judge Gilliam went further to point out that some of plaintiffs’ proffer did not even purport to claim causation. Shanna Swan’s prior testimony asserted that Bendectin was “associated” with limb reduction. Jay Glasser, a specialist in biostatistics, epidemiology, and biometry, had opined that “Bendectin is within a reasonable degree of epidemiological certainty associated with congenital disorders, including limb defects.” Dr. Johannes Thiersch, a specialist in pathology and pharmacology, proclaimed that “structure analysis” was “of great interest.”[26] In other words, there was a good deal of true, true, but immaterial opinion in what Mr. Nace had thrown over the transom, in opposition to the motion for summary judgment.

Nace appealed, and the Daubert case was argued to the Ninth Circuit in 1991. In a short opinion by Judge Kozinski, the appellate court affirmed the judgment below.[27] The affirmance did not mention Rule 702; rather it relied upon the decisions of other Circuits, in which the plaintiffs’ evidentiary display had been found insufficient to sustain a reasonable jury verdict.

Judge Kozinski’s opinion tilted towards Rule 703 and the Frye standard in citing cases that stated, based upon Frye, that expert witnesses must use techniques generally accepted in the scientific community. The determination of general acceptance vel non was a legal one, reviewable de novo. For its de novo decision on general acceptance, the Ninth Circuit relied upon the cases from the First, Fifth, and District of Columbia Circuits,[28] and of course, the record below.

By 1991, another Circuit, the Third, had weighed in on the same evidentiary display, when it reversed summary judgment for Merrell Dow, and remanded for reconsideration under the Third Circuit’s approach to Rule 702. Judge Kozinski declared that the Third Circuit’s approach was not followed in the Ninth Circuit, and proceeded to ignore the DeLuca case.[29]

Judge Kozinski treated the insufficiency and the invalidity of the Nace/Done schmegegge theory as settled by precedent, and thus the court’s opinion gave little expository attention to the problems with the four evidentiary categories (in vitro, in vivo, structure-activity analysis, and re-analysis of epidemiologic studies). As Judge Kozinski put the matter:

 “For the convincing reasons articulated by our sister circuits, we agree with the district court that the available animal and chemical studies, together with plaintiffs’ expert reanalysis of epidemiological studies, provide insufficient foundation to allow admission of expert testimony to the effect that Bendectin caused plaintiffs’ injuries.”[30]

And thus, summary judgment was proper in Daubert. Judge Kozinski, like Judge Gilliam in the district court, never reached the specific causation argument that involved risk ratios less than two.

Some of the Circuit court cases relied upon by Judge Kozinski delved into the invalidity of these methods for determining the causes of human birth defects. The Lynch decision explored in some detail the Shanna Swan made-for-litigation rejiggering of a study based upon data from the Metropolitan Atlanta Congenital Defects Program, which included a challenge to whether it could be reasonably relied upon (Rule 703), as well as its pretense to support a scientific conclusion (Rule 702).[31] Later commentators would skirt the validity issue by asserting that re-analysis, in the abstract, is not impermissible or invalid, without addressing the specific issues discussed in the reported decisions. Other commentators have misrepresented Swan’s re-analysis as a meta-analysis, which it was not.

Some commentators have complained that the defense in Daubert made too much of the lack of statistical significance. Their complaint, in the abstract, might have some salience. In some contexts, an isolated risk ratio, whether greater or less than one, may well carry important information even if the p-value is a bit above 0.05. The lack of statistical significance at the conventional five percent level, however, conveys important information about a finding’s imprecision, especially when there was a large dataset to evaluate. In 1994, a meta-analysis was published that found a summary estimate for all birth defects in the available epidemiologic studies of an odds ratio of 0.95 (95% C.I., 0.88-1.04), and a summary estimate for limb reduction defects of an odds ratio of 1.12 (95% C.I., 0.83-1.48).[32]
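Both confidence intervals reported in the 1994 meta-analysis straddle 1.0, which is what the lack of statistical significance means here. As a rough check, a reported 95% confidence interval for an odds ratio can be converted back into an approximate two-sided p-value by standard log-odds arithmetic; the following is a minimal sketch using the numbers quoted above (the function name is mine, for illustration):

```python
import math

def p_value_from_or_ci(odds_ratio: float, lo: float, hi: float) -> float:
    """Approximate two-sided p-value for H0: OR = 1, recovered from a
    reported 95% confidence interval by working on the log-odds scale."""
    se = (math.log(hi) - math.log(lo)) / (2 * 1.96)  # SE of log(OR)
    z = math.log(odds_ratio) / se
    return math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area

# Summary estimates quoted above from the 1994 meta-analysis:
p_all = p_value_from_or_ci(0.95, 0.88, 1.04)   # all birth defects
p_limb = p_value_from_or_ci(1.12, 0.83, 1.48)  # limb reduction defects
# Both intervals include 1.0, so both p-values comfortably exceed 0.05.
```

This back-of-the-envelope conversion assumes the interval was computed symmetrically on the log scale, which is the usual practice for odds ratios; it is a sanity check, not a substitute for the published analysis.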


[1] Defendant’s Memorandum of Points and Authorities in Support of Its Motion for Summary Judgment on the Issue of Causation, Daubert v. Merrell Dow Pharms., Inc., Case No. 84-2013-G(I) (S.D. Cal. Aug. 2, 1989). The motion was made in a companion case before Judge Gilliam as well, Schuller v. Merrell Dow Pharms., Inc., Case No. 84-2929-G(I). The first Daubert motion may not have been the first one drafted. The linked brief is the first one as filed.

[2] See Oxendine v. Merrell Dow Pharms., Inc., 563 A.2d 330 (D.C. Ct. App. 1989).

[3] William Griffith McBride, Thalidomide and Congenital Abnormalities, 278 LANCET 1358 (1961).

[4] William Griffith McBride, McBride criticizes inquiry, 336 NATURE 614 (1988); Norman Swan, Disciplinary tribunal for McBride, 299 BRIT. MED. J. 1360 (1989); G. F. Humphrey, Scientific fraud: the McBride case, 32 MED. SCI. LAW 199 (1992); Mark Lawson, McBride found guilty of fraud, 361 NATURE 673 (1993); Leigh Dayton, Thalidomide hero found guilty of scientific fraud, NEW SCI. (Feb. 27, 1993); William McBride: alerted the world to the dangers of thalidomide in fetal development, 362 BRIT. MED. J. k3415 (2018).

[5] McBride v. Merrell Dow & Pharms., Inc., 800 F.2d 1208 (D.C. Cir. 1986). McBride ultimately failed against all his litigation targets.

[6] See In re Richardson-Merrell, Inc. Bendectin Prods. Liab. Litig., 624 F. Supp. 1212 (S.D. Ohio 1985), aff’d sub nom. In re Bendectin Litig., 857 F.2d 290 (6th Cir. 1988), cert. denied, 488 U.S. 1006 (1989).

[7] Brock v. Merrell Dow Pharmaceuticals Inc., 874 F.2d 307 (5th Cir. 1989); Richardson v. Richardson-Merrell, Inc., 857 F.2d 823 (D.C. Cir. 1988); Lynch v. Merrell-National Labs., 830 F.2d 1190 (1st Cir. 1987) (affirming grant of summary judgment).

[8] US Food & Drug Admin., Determination That Bendectin Was Not Withdrawn from Sale for Reasons of Safety or Effectiveness, 64 FED. REG. 43190–1 (1999).

[9] Brief at 3-4.

[10] Barry Nace was one of the lead plaintiffs’ counsel in the Bendectin litigation, and he represented the Daubert family. Nace was also formerly President of the lawsuit industry’s principal lobbying organization, the American Trial Lawyers Association (now the AAJ). See also In re Barry J. Nace, A Member of the Bar of the District of Columbia Court of Appeals (Bar Registration No. 130724), No. 13–BG–1439, Slip op. (Sept. 4, 2014), available at <https://www.dccourts.gov/sites/default/files/pdf-opinions/13-BG-1439.pdf>, last visited on Feb. 8, 2026.

[11] See Michael D. Green, Pessimism about Milward, 3 WAKE FOREST J. L & POL’Y 41, 63 (2013) (paraphrasing Nace as describing the mosaic theory as “[d]amn brilliant, and I was the one who thought of it and fed it to Alan [Done].”).

[12] Id. at 61 (citing Oxendine v. Merrell Dow Pharm., Inc., 506 A.2d 1100, 1110 (D.C. 1986)).

[13] Harry Frankfurt, ON BULLSHIT 63 (2005).

[14] In re “Agent Orange” Prod. Liab. Litig., 611 F. Supp. 1223 (E.D.N.Y. 1985), aff’d, 818 F.2d 187 (2d Cir. 1987), cert. denied, 487 U.S. 1234 (1988).

[15] Id. at p. 1231.

[16] United States Dep’t of Justice, Tort Policy Working Group, Report of the Tort Policy Working Group on the causes, extent and policy implications of the current crisis in insurance availability and affordability at 35 (Report No. 027-000-01251-5) (Wash. DC 1986), available at https://archive.org/details/micro_IA41152903_0369.

[17] In Re Air Crash Disaster at New Orleans, 795 F.2d 1230, 1233-34 (5th Cir. 1986).

[18] Id. at 1233-34.

[19] The Brief, at 2, cited United States v. Kilgus, 571 F.2d 508, 510 (9th Cir. 1978) (citing Frye).

[20] An Act to Establish Rules of Evidence for Certain Courts and Proceedings. Pub. L. 93–595, 88 Stat. 1926 (1975).

[21] Brief at 17.

[22] 545 F.Supp. 306, 308 (N.D. Cal. 1982). See generally Richard E. Neustadt & Harvey V. Fineberg, THE SWINE FLU AFFAIR: DECISION-MAKING ON A SLIPPERY DISEASE (Nat’l Acad. Sci. 1978).

[23] Daubert v. Merrell Dow Pharms., Inc., 727 F.Supp. 570 (S.D. Cal. 1989).

[24] Id. at 571, citing United States v. Kilgus, 571 F.2d 508, 510 (9th Cir.1978).

[25] Id. at 572 (citing Brock, 874 F.2d at 310).

[26] Id. at 574. The use of “association” was at best ambiguous, because it begged the question whether it was an association that was “clear cut” (reasonably free from bias and confounding), and beyond that which we would care to attribute to chance.

[27] Daubert v. Merrell Dow Pharms., Inc., 951 F.2d 1128 (9th Cir. 1991).

[28] Brock v. Merrell Dow Pharms., Inc., 874 F.2d 307, modified, 884 F.2d 166 (5th Cir.1989), cert. denied, 494 U.S. 1046 (1990); Richardson v. Richardson–Merrell, Inc., 857 F.2d 823 (D.C.Cir.1988), cert. denied, 493 U.S. 882 (1989); Lynch v. Merrell–National Labs., 830 F.2d 1190 (1st Cir.1987).

[29] DeLuca v. Merrell Dow Pharmaceuticals, Inc., 131 F.R.D. 71 (D.N.J.) (granting summary judgment), rev’d and remanded, 911 F.2d 941 (3d Cir.1990). On remand, the district court entered summary judgment on the alternative reasoning of Rule 702, as interpreted by the Third Circuit. DeLuca v. Merrell Dow Pharms., Inc., 791 F.Supp. 1042, 1048 (D.N.J. 1992) (re-entering summary judgment after considering Rule 702), aff’d, 6 F.3d 778 (3d Cir.1993) (per curiam), cert. denied, 510 U.S. 1044 (1994).

[30] Daubert v. Merrell Dow Pharms., Inc., 951 F.2d 1128, 1131 (9th Cir. 1991).

[31] Lynch v. Merrell–National Labs., 830 F.2d 1190, 1194-95 (1st Cir.1987).

[32] Paul M. McKeigue, Steven H. Lamm, Shai Linn & Jeffrey S. Kutcher, Bendectin and Birth Defects: I. A Meta-Analysis of the Epidemiologic Studies, 50 TERATOLOGY 27 (1994). This meta-analysis made no correction for multiple comparisons in examining many different types of birth defects.

A New Year, A New Reference Manual

January 5th, 2026

The fourth edition of the Reference Manual on Scientific Evidence was quietly released in the waning hours of 2025, in the twilight of American democracy.[1] The Manual had been slated for publication in 2023, but that date slid to 2024, and then to 2025. Perhaps the change in directorship of the Federal Judicial Center slowed things down. (Judge Robin Rosenberg, of Zantac fame, is now the Director.)

The new volume is available for download at:

https://www.nationalacademies.org/publications/26919

Although I was a reviewer of one chapter of the Manual, I am just seeing this new edition for the first time today. The basic structure of the volume has not changed, although it has now grown to over 1,600 pages. Many of the key chapters on statistics, epidemiology, toxicology, and medical testimony are carried over from previous editions, with some new authors added and some previous authors no longer participating. In addition, there are some new chapters on exposure science, artificial intelligence, climate science, mental health, neuroscience, and eyewitness identification.

The individual chapters and authors in the new edition of the Manual are:

Liesa L. Richter & Daniel J. Capra, The Admissibility of Expert Testimony, at 1

Michael Weisberg & Anastasia Thanukos, How Science Works, at 47

Valena E. Beety, Jane Campbell Moriarty, & Andrea L. Roth, Reference Guide on Forensic Feature Comparison Evidence, at 113

David H. Kaye, Reference Guide on Human DNA Identification Evidence, at 207

Thomas D. Albright & Brandon L. Garrett, Reference Guide on Eyewitness Identification, at 361

David H. Kaye & Hal S. Stern, Reference Guide on Statistics and Research Methods, at 463

Daniel L. Rubinfeld & David Card, Reference Guide on Multiple Regression and Advanced Statistical Models, at 577

Shari Seidman Diamond, Matthew Kugler, & James N. Druckman, Reference Guide on Survey Research, at 681

Mark A. Allen, Carlos Brain, & Filipe Lacerda, Reference Guide on Estimation of Economic Damages, at 749

Prologue to the Reference Guide on Exposure Science and Exposure Assessment, the Reference Guide on Epidemiology, and the Reference Guide on Toxicology, at 829

Elizabeth Marder & Joseph V. Rodricks, Reference Guide on Exposure Science and Exposure Assessment, at 831

Steve C. Gold, Michael D. Green, Jonathan Chevrier, & Brenda Eskenazi, Reference Guide on Epidemiology, at 897

David L. Eaton, Bernard D. Goldstein, & Mary Sue Henifin, Reference Guide on Toxicology, at 1027

John B. Wong, Lawrence O. Gostin, & Oscar A. Cabrera, Reference Guide on Medical Testimony, at 1105

Henry T. Greely & Nita A. Farahany, Reference Guide on Neuroscience, at 1185

Kirk Heilbrun, David DeMatteo, & Paul S. Appelbaum, Reference Guide on Mental Health Evidence, at 1269

Chaouki T. Abdallah, Bert Black, & Edl Schamiloglu, Reference Guide on Engineering, at 1353

Brian N. Levine, Joanne Pasquarelli, & Clay Shields, Reference Guide on Computer Science, at 1409

James E. Baker & Laurie N. Hobart, Reference Guide on Artificial Intelligence, at 1481

Jessica Wentz & Radley Horton, Reference Guide on Climate Science, at 1561

Some quick comments on changes in authorship in some of the chapters. Bernard Goldstein, a member of the dodgy Collegium Ramazzini, remains an author of the toxicology chapter in the new edition. David Eaton, however, has been added. Professor Eaton is a past president of the Society of Toxicology, and perhaps he has brought some balance to the new edition’s work on toxicology.

An author of the statistics chapter, David Kaye, is also the sole author of the chapter on DNA evidence. Professor Kaye is a distinguished scholar of DNA evidence with serious statistical expertise. David Freedman had been a co-author of the statistics chapter in the third edition, but sadly Professor Freedman died before the third edition was published. Freedman is replaced by Hal Stern, an accomplished statistician from the University of California.

The chapter on epidemiology lost Leon Gordis, who died in 2015. The chapter in the fourth edition has the return of law professors Steve C. Gold and Michael D. Green, whose pro-plaintiff biases are well known, along with two new authors, epidemiology professors Jonathan Chevrier and Brenda Eskenazi. Like Goldstein, Eskenazi is a fellow of the Collegium Ramazzini.

The Reference Manual, for better or worse, has had substantial influence on the litigation of scientific and technical issues in federal court, and in some state courts as well. I hope to write more substantively about the new edition in 2026.


[1] National Academies of Sciences, Engineering, and Medicine & Federal Judicial Center, Reference Manual on Scientific Evidence (4th ed. 2025).

Prada – Fashionable, But Unreliable Review on Acetaminophen and Autism

September 30th, 2025

Back in the first week of this month, I posted about a paper (Prada 2025),[1] which featured a so-called navigation-guide systematic review of the scientific evidence on whether pregnant women’s ingestion of acetaminophen causes their children to develop autism.[2] The focus of my post was on some dodgy aspects of the Prada review, such as its anemic disclosures of interest, and its squirrely claim to have been “NIH funded.”

Since posting, the Prada review has been very much in the news. Last week, President Trump held a news conference, where we learned that he cannot pronounce acetaminophen and that he has a strongly held opinion that acetaminophen causes autism.[3] Trump was surrounded by officials in his administration, including plaintiffs’ lawyer Robert Kennedy, Jr., and three physicians, Drs. Oz, Makary, and Bhattacharya, who looked on in apparent approval. Once upon a time, a risk communication such as this one about acetaminophen would have come from a non-political FDA official, such as Janet Woodcock, who was head of drug safety and, for many years, the Director of the Center for Drug Evaluation and Research. Over her tenure, Dr. Woodcock weighed in on many pharmaceutical safety issues. Those of us who have been involved in litigation of those safety issues remember that Dr. Woodcock chose her language very carefully. She did not just give opinions; she marshalled facts.

Admittedly, Trump’s autism press conference was not as deranged as his 2020 press conference at which he suggested that injecting sodium hypochlorite (bleach) into patients would cure Covid-19 infections. Still, most of the world was left with the impression that Trump was replacing (DOGE-ing) scientific research with irrational speculation. Trump’s press conference on acetaminophen and vaccines was widely met with skepticism and disbelief. Medical ethicist Dr. Arthur Caplan, who is not given to hyperbole, called the conference “the saddest display of a lack of evidence, rumors, recycling old myths, lousy advice, outright lies, and dangerous advice I have ever witnessed by anyone in authority.”[4]

When the administration physicians communicated with the public, they said something very different from Trump’s presentation. In her press release, Press Secretary Karoline Leavitt used the meaningless locution, “suggested link,” and cited the Prada review, which eschewed causal conclusions:[5]

“Andrea Baccarelli, M.D., Ph.D., Dean of the Faculty, Harvard T.H. Chan School of Public Health: “Colleagues and I recently conducted a rigorous review, funded by a grant from the National Institutes of Health (NIH), of the potential risks of acetaminophen use during pregnancy… We found evidence of an association between exposure to acetaminophen during pregnancy and increased incidence of neurodevelopmental disorders in children.”

Harvard University: Using acetaminophen during pregnancy may increase children’s autism and ADHD risk.”

Of course, saying that something “may increase risk” is not even close to saying that it causes the outcome in question. And Baccarelli’s description of his paper, the Prada review, as funded by the National Institutes of Health is misleading at best.[6]

Leavitt went on to declare that “[t]he Trump Administration does not believe popping more pills is always the answer for better health.” Unless, of course, it is Propecia for Mr. Trump, testosterone for Mr. Kennedy, or ketamine for Mr. Musk.

FDA Commissioner Martin A. Makary issued a Notice, the same day, in which he declared:

“In recent years, evidence has accumulated suggesting that the use of acetaminophen by pregnant women may be associated with an increased risk of neurological conditions such as autism and ADHD in children.

* * *

To be clear, while an association between acetaminophen and autism has been described in many studies, a causal relationship has not been established and there are contrary studies in the scientific literature.”[7]

So the FDA is clearly not declaring that acetaminophen causes autism.

Dr. Mehmet Oz, former surgeon and television talking head, who stood mute by Trump’s side at the infamous press conference, found his voice later in the week, when he acknowledged that pregnant women of course should take acetaminophen when physicians direct them to do so.

In Europe, where pharmaceutical regulation is typically more precautionary than in the United States, both the European Medicines Agency and the U.K.’s Medicines and Healthcare products Regulatory Agency announced that using acetaminophen during pregnancy was safe, with no showing that it causes autism in offspring.[8] Steffen Thirstrup, the EMA’s Chief Medical Officer, announced, a day after the Trump bungle, that:

“Paracetamol [acetaminophen] remains an important option to treat pain or fever in pregnant women. Our advice is based on a rigorous assessment of the available scientific data and we have found no evidence that taking paracetamol during pregnancy causes autism in children.”

Most medical organizations were appalled at the administration’s sloppy messaging. The day after the press conference, the American College of Medical Toxicology (ACMT) issued a statement in response, to affirm the safety of acetaminophen in pregnancy.[9] The ACMT noted that its position was in agreement with the American College of Obstetricians and Gynecologists, the Society for Maternal-Fetal Medicine, the American Academy of Pediatrics, and the Society for Developmental and Behavioral Pediatrics.

The acetaminophen kerfuffle seems always to come back to the Prada “navigation guide” systematic review and its authors, including the Harvard Dean, Andrea Baccarelli, who was a well-paid member of the plaintiffs’ expert witness team in the acetaminophen litigation.[10] Why did Dr. Andrea Baccarelli use this curious, arcane, and infrequently used method of review in the Prada paper? Why did Baccarelli and his co-authors publish the review in Environmental Health, which is dedicated to publishing “manuscripts on important aspects of environmental and occupational medicine,” a focus that places maternal ingestion of a licensed pharmaceutical outside the journal’s stated competence? Why did Baccarelli offer a litigation opinion that acetaminophen causes autism, but retreat to “association” when writing for the scientific community? And why did Baccarelli and his co-authors not disclose that Baccarelli had submitted essentially the same navigation guide systematic review as his proffered expert witness testimony, and that a federal court had rejected his opinion as not “the product of reliable principles and methods,” and not “a reliable application of the principles and methods to the facts of the case”[11]? Perhaps the answers are obvious to most observers, but candid disclosures certainly would have provided important context, and saved some people the embarrassment of relying upon the Prada review.

In digging deeper into the history of the navigation guide method itself, the earliest citation I could find to such systematic reviews was a 2009 conference paper that discussed the approach as a proposal.[12] The authors who made up the Navigating the Scientific Evidence to Improve Prevention Workshop Organizing Committee were not particularly well known or distinguished in the field of research synthesis. Still, if the Organizing Committee had truly been on to something important, one would expect “navigation guide” reviews to be more prevalent than they are.

The Committee never identified a rationale for a new systematic review approach. When the Organizing Committee outlined its approach in 2009, there were well over three decades of experience with systematic reviews,[13] with well-regarded full-length textbook treatment by experts in the field.[14]

In addition to the lack of experience among its authors and the preemption of the subject by comprehensive treatments elsewhere, there were three additional curious takeaways from even a cursory reading of the Organizing Committee’s 2009 manuscript. First, the Committee emphasized the alleged need for a review methodology specific to environmental exposures. This emphasis was never accompanied by a showing that the well-described methodologies long in use were somehow inadequate or inappropriate for environmental exposures.

Second, the authors urged the need for precautionary assessments, which might make their method suitable where syntheses for precautionary pronouncements are called for. In the United States, regulatory assessments vary depending upon the governing statutes that create the regulatory mandate. In personal injury litigation, however, the precautionary principle is nothing less than an end run around the burden of proof borne by the party claiming harm and suing in tort. The designation of environmental exposures as the subject matter for the proposed systematic review technique offers an insight into why these authors believed they had to propose a newfangled systematic review methodology: previously described methods interfered with authors’ ability to elevate “iffy” associations into conclusions of causality in the name of the precautionary principle.

The third curiosity in the 2009 manuscript is that the authors never described the need for a pre-specified protocol. Later articles on this proposed methodology similarly failed to describe the need for such a protocol,[15] although by 2014, authors from the original Organizing Committee reversed course to add a pre-specified protocol to the requirements for a navigation guide systematic review.[16]

A recent article defines a systematic review essentially in terms of a protocol:

“Systematic review (SR) is a rigorous, protocol-driven approach designed to minimise error and bias when summarising the body of research evidence relevant to a specific scientific question.”[17]

The purpose of a protocol may be obvious to anyone who has been paying attention to the replication crisis in biomedical literature, but the same article offers a helpful description of its rationale:

“The purposes of the protocol are to discourage ad-hoc changes to methodology during the review process which may introduce bias, to allow any justifiable methodological changes to be tracked, and also to allow peer-review of the work that it is proposed, to help ensure the utility and validity of its objectives and methods.”[18]

Systematic reviews vary widely in quality, methodological rigor, and validity, and one of the key determinants of their validity is whether they were preceded by a pre-specified protocol. Although systematic reviews are often described as the “gold standard” for evidence synthesis, reviews that lack a pre-specified protocol are decidedly less rigorous than those that employ one.[19] The absence of a protocol is thus an important tell that a systematic review may be untrustworthy.

The Prada paper put together by Baccarelli’s team has no protocol. It may satisfy the Trump administration’s Fool’s Gold Standard for Science, but that is far short of the requirements of Federal Rule of Evidence 702. Given Baccarelli’s abridgement of scientific method, we should not be overly surprised by Judge Cote’s judgment of the failures of Baccarelli’s and the other plaintiffs’ expert witnesses’ proffered opinions in the acetaminophen litigation:

“their analyses have not served to enlighten but to obfuscate the weakness of the evidence on which they purport to rely and the contradictions in the research. As performed by the plaintiffs’ experts, their transdiagnostic analysis has obscured instead of informing the inquiry on causation.”[20]

Judge Cote carefully reviewed Baccarelli’s proffered testimony and found it replete with cursory analyses, cherry-picked data, and result-driven assessments of studies.[21] Her Honor’s findings would seem to apply with equal measure to the Prada review.


[1] Diddier Prada, Beate Ritz, Ann Z. Bauer and Andrea A. Baccarelli, “Evaluation of the evidence on acetaminophen use and neurodevelopmental disorders using the Navigation Guide methodology,” 24 Envt’l Health 56 (2025).

[2] See Schachtman, “Acetaminophen & Autism – Prada Review Misleadingly Claims to Be NIH Funded,” Tortini (Sept. 9, 2025).

[3] Jeff Mason, Ahmed Aboulenein, and Julie Steenhuysen, “Trump Links Autism to Tylenol and Vaccines, Claims Not Backed by Science,” Reuters (Sept. 22, 2025); Brianna Abbott & Andrea Petersen, “The Trump administration said acetaminophen could cause autism. Doctors maintain it is safe during pregnancy,” Wall St. J. (Sept. 22, 2025) (“Studies looking at a link [sic] between acetaminophen and autism are inconclusive.”); Will Weissert, “Dr. Trump? The president reprises his COVID era, this time sharing unproven medical advice on autism,” Wash. Post (Sept. 23, 2025).

[4] Ali Swenson & Lauran Neergaard, “Trump makes unfounded claims about Tylenol and repeats discredited link between vaccines and autism,” Assoc. Press (Sept. 23, 2025).

[5] Leavitt, “FACT: Evidence Suggests Link Between Acetaminophen, Autism,” The White House (Sept. 22, 2025).

[6] See Schachtman, “Acetaminophen & Autism – Prada Review Misleadingly Claims to Be NIH Funded,” Tortini (Sept. 9, 2025). The referenced grants had nothing to do with acetaminophen and autism, or even autism generally. The NIEHS granted Dr. Baccarelli money to study air pollution and brain aging. The exposure of interest was not acetaminophen, and the outcome of interest was not autism. By claiming that his research was “NIH funded,” Baccarelli was attempting to boost the prestige of the research even though his acetaminophen review was done for litigation, not for the federal government. Apparently the NIEHS acquiesces in this charade because it suggests to the uninitiated that its research grants result in more published papers, even though the topics of those papers are unrelated to the funded research proposal and never received committee peer review.

[7] Martin A. Makary, “Notice to Physicians on the Use of Acetaminophen During Pregnancy,” (Sept. 22, 2025).

[8] E.M.A., “Use of paracetamol during pregnancy unchanged in the EU,” (Sept. 23, 2025).

[9] ACMT Supports the Safe Use of Acetaminophen in Pregnancy (Sept. 23, 2025).

[10] Rebecca Robbins & Azeen Ghorayshi, “Harvard Dean Was Paid $150,000 as an Expert Witness in Tylenol Lawsuits,” N.Y. Times (Sept. 23, 2025).

[11] Fed. R. Evid. 702.

[12] Patrice Sutton, Heather Sarantis, Julia Quint, Mark Miller, Michele Ondeck, Rivka Gordon, and Tracey Woodruff, “Navigating the Scientific Evidence to Improve Prevention: A Proposal to Develop A Transparent and Systematic Methodology to Sort the Scientific Evidence Linking Environmental Exposures to Reproductive Health Outcomes,”  (July 29, 2009).

[13] See Quan Nha Hong & Pierre Pluye, “Systematic reviews: A brief historical overview,” 34 Education for Information 261, 261 (2018) (describing the evolution of systematic reviews as made up of a “foundation period 1970-1989,” an “institutionalization period 1990-2000,” and a “diversification period” from 2001 forward).

[14] Matthias Egger, Julian P. T. Higgins, and George Davey Smith, Systematic Reviews in Health Research: Meta-Analysis in Context (3rd ed. 2022). The first edition of this text was published in 1995.

[15] Tracey J. Woodruff, Patrice Sutton, and The Navigation Guide Work Group, “An Evidence-Based Medicine Methodology To Bridge The Gap Between Clinical And Environmental Health Sciences,” 30 Health Affairs 931 (2011); Julia R. Barrett, “The Navigation Guide Systematic Review for the Environmental Health Sciences,” 122 Envt’l Health Persp. A283 (2014).

[16] Tracey J. Woodruff & Patrice Sutton, “The Navigation Guide Systematic Review Methodology: A Rigorous and Transparent Method for Translating Environmental Health Science into Better Health Outcomes,” 122 Environ Health Perspect. 1007 (2014).

[17] Paul Whaley, Crispin Halsall, Marlene Ågerstrand, Elisa Aiassa, Diane Benford, Gary Bilotta, David Coggon, Chris Collins, Ciara Dempsey, Raquel Duarte-Davidson, Rex Fitzgerald, Malyka Galay-Burgos, David Gee, Sebastian Hoffmann, Juleen Lam, Toby Lasserson, Len Levy, Steven Lipworth, Sarah Mackenzie Ross, Olwenn Martin, Catherine Meads, Monika Meyer-Baron, James Miller, Camilla Pease, Andrew Rooney, Alison Sapiets, Gavin Stewart, and David Taylor, “Implementing systematic review techniques in chemical risk assessment: Challenges, opportunities and recommendations,” 92-93 Env’t Internat’l 556 (2016).

[18] Id. at 560.

[19] Julia Menon, Fréderique Struijs & Paul Whaley, “The methodological rigour of systematic reviews in environmental health,” 52 Critical Rev Toxicol. 167 (2022).

[20] In re Acetaminophen ASD-ADHD Prods. Liab. Litig., 707 F. Supp. 3d 309, 334, 2023 WL 8711617 (S.D.N.Y. 2023) (Cote, J.).

[21] Id. at 354-56.

Specific Causation – The Process of Elimination

September 24th, 2025

Specific causation causes some courts to become costive, and sometimes, courts overuse so-called differential etiology as a laxative. The phrase “differential etiology” is an analogy to differential diagnosis, the reasoning process by which clinicians assess the identity of a disease or disorder. Differential etiology, like laxatives, can be overused and misused.

Last month, the Ninth Circuit affirmed a district court’s summary judgment in a glyphosate case. Engilis v. Monsanto Co., No. 23-4201, D.C. No. 3:19-cv-07859-VC (9th Cir. August 12, 2025). The trial court found the plaintiff’s expert witness’s differential etiology unreliable because the witness acknowledged that obesity could be a cause of plaintiff’s disease but then failed reliably to rule obesity out as an alternative cause. Instead, the excluded expert witness glibly inferred that glyphosate was a cause of plaintiff’s cancer. The trial and appellate courts were faced with a great example of invalid, motivated reasoning, or of the lack of reasoning altogether.

The Ninth Circuit’s affirmance was significant because it clearly acknowledged that there was no presumption of admissibility, and that the district court was well within its discretion to find that the proffered expert witness opinion had failed to meet the requirements of Rule 702.[1]

The decision in Engilis was simple and straightforward; it was based upon specific or individual causation or its absence. In cases involving diseases with multiple potential causes, none of which is necessary for the outcome, an exposure or lifestyle factor may be capable of causing a particular disease, but that factor may not have played a causal role in everyone who experienced the exposure or lifestyle factor and who developed the disease. (Not everyone who smoked cigarettes develops lung cancer, and not all lung cancer patients smoked.) Courts and litigants are thus left with the puzzle of individual causation.

In a case such as Engilis, courts can basically assume, arguendo, that glyphosate can cause the claimed outcome (Non-Hodgkin’s Lymphoma or NHL), but then insist that there is competent and sufficient evidence to show that the claimant’s specific case of NHL was caused by the claimed exposure.

Some courts and commentators have suggested that a process of “differential etiology,” by analogy to differential diagnosis, can get a claimant to the finish line. This attempted solution assumes, arguendo, that glyphosate can cause NHL, but it must still resolve whether this specific case of NHL (or whatever disease is claimed) was caused by the claimed exposure.

As suggested above, differential etiology is something like constipation, which is resolved by the process of elimination. Formally, the reasoning process is an “iterative disjunctive syllogism.” We start with an exhaustive listing of the possible established general causes of the claimed disease:

A or B or C (exhausting the possible general causes of the claimed disease).

Because the disease may be multifactorial, the set of disjuncts may be more complex:

A or B or C or A*B or B*C or A*C or A*B*C.

But if the claimant had never been exposed to A, we can deduce:

B or C or B*C.

And if the claimant had never been exposed to B, we can infer that:

C.

And if C is the tortogen under investigation, for which general causation has been established, the claimant would have an unequivocally submissible case for the jury.

Of course, many diseases have unknown causes, so-called idiopathic or sporadic cases. In such instances, any proper differential etiology must include a disjunct D, for idiopathic cause. We can see that the iterative disjunctive syllogism in such cases leaves D uneliminated in some of the remaining disjuncts, and the claimant cannot reach an unequivocal conclusion in support of his claim.
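The elimination at work in the syllogism can be sketched in a few lines of Python. This is a purely hypothetical illustration using the abstract cause labels A, B, C, and D from the discussion above, not anything drawn from the Engilis record:

```python
from itertools import combinations

def candidate_etiologies(causes):
    """All non-empty combinations of general causes (single and joint)."""
    return [frozenset(c) for r in range(1, len(causes) + 1)
            for c in combinations(causes, r)]

def eliminate(disjuncts, ruled_out):
    """Drop every disjunct that involves a ruled-out cause."""
    return [d for d in disjuncts if not (d & ruled_out)]

# Without an idiopathic disjunct, ruling out A and B leaves only C:
remaining = eliminate(candidate_etiologies({"A", "B", "C"}), {"A", "B"})
assert remaining == [frozenset({"C"})]

# Add an idiopathic disjunct D, which exposure history can never rule
# out. Now ruling out A and B is no longer conclusive: {C}, {D}, and
# {C, D} all survive, so C is not the only remaining explanation.
remaining = eliminate(candidate_etiologies({"A", "B", "C", "D"}), {"A", "B"})
```

The sketch makes the structural point concrete: the moment an irremovable disjunct D enters the candidate set, no amount of eliminating known alternative causes yields C alone.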

There may perhaps be a solution to this problem that turns on the effect size, and the probability of attribution associated with each uneliminated disjunct, but that is a story for another day.
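For background on that deferred story: the probability of attribution for a single risk factor is commonly estimated from the relative risk by the attributable-fraction formula (RR − 1)/RR, a heuristic that assumes a single, independently acting factor. A minimal sketch:

```python
def probability_of_causation(relative_risk):
    """Attributable fraction among the exposed: (RR - 1) / RR.
    A common (and contested) heuristic for the chance that an
    exposure with relative risk RR caused a given exposed case."""
    if relative_risk <= 1:
        return 0.0
    return (relative_risk - 1) / relative_risk

# A relative risk of 2.0 corresponds to a 50% probability of
# causation -- the familiar "doubling of the risk" threshold.
assert probability_of_causation(2.0) == 0.5
assert probability_of_causation(1.0) == 0.0
```

Whether such a figure can rescue a claimant whose differential etiology leaves an idiopathic disjunct uneliminated is, as the author says, a story for another day.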


[1] See Paul Driessen, “Nation’s most liberal court rejects plaintiff expert’s claims that glyphosate caused couple’s cancer,” Eurasia Review (Sept. 23, 2025).