TORTINI

For your delectation and delight, desultory dicta on the law of delicts.

The 4th Reference Manual’s Treatment of Genetic Causes of Disease

January 23rd, 2026

After checking to see whether the new Reference Manual on Scientific Evidence[1] attended to some long overdue corrections, I turned my attention to the substance of the chapter on epidemiology. A cursory comparison between the third[2] and fourth[3] editions of the epidemiology chapter in the Reference Manual shows a lot of carryover from the third edition, some change in authorship, and at least one interesting change.

The two lawyer authors, Steve Gold and Michael Green, remain, but the authors with reasonable pretense to subject-matter expertise have changed. Gold and Green are both law professors with a long history of commenting on American tort and evidence law. Both are aligned with the lawsuit industry. Previous epidemiology authors, Daryl Michal Freedman and Leon Gordis, are now gone from the chapter. Leon Gordis, who had been a chairman of the department of epidemiology, in the Bloomberg School of Public Health, Johns Hopkins University, died in September 2015, after the third edition was published. Daryl Michal Freedman, who had been the other subject-matter expert on the third edition’s chapter on epidemiology, has been an epidemiologist with the Biostatistics Branch of the National Cancer Institute, for many years. It is not clear why he left the project.

Replacing Gordis and Freedman are Jonathan Chevrier and Brenda Eskenazi. Chevrier is an associate professor on the faculty of medicine, in the department of epidemiology, at McGill University. The focus of his work is on “common environmental contaminants,” and their role in the development and health of children. Brenda Eskenazi is professor emerita, in the University of California Berkeley School of Public Health, where she is the Director of the Center for Environmental Research and Children’s Health. Eskenazi is a member of a dodgy group known as the Collegium Ramazzini, which was responsible for staging an ex parte presentation of plaintiffs’ expert witnesses to judges presiding in asbestos litigation.[4] Eskenazi was not, however, a member of the Collegium at the time the group conspired with the late Irving Selikoff to pervert the course of justice in American asbestos litigation.

The second significant change is substantive; the fourth edition has added a new subsection to the epidemiology chapter. Comparing the texts of the third and fourth editions of this chapter reveals a new subheading in the new edition:[5]

Genetic and Molecular Epidemiologic Studies

Alas, there is not much substance to the new subsection, which runs less than four pages. Lawyers in the trenches might well have hoped for more substantive treatment of genetic epidemiology, and genetic causation. The chapter’s authors explain their abbreviated treatment with the comment:

“Although commentators have long forecast that the output of genetic and molecular epidemiology would revolutionize causal proof, as of this writing few judicial opinions have addressed these types of studies, and it is far from clear that a revolution is in the offing.”[6] 

The chapter authors are correct that some authors in the past proffered unrealistic predictions of how genetics would supplant correlational studies. Nonetheless, this area has not been as quiescent as the authors’ parsimonious treatment would suggest.

On the question of how prevalent are genetic causation issues, whether raised by plaintiffs or defendants, the chapter might have benefitted from the contributions of a practicing lawyer. Genetic issues come up with some frequency in the litigation of cases involving mesothelioma. The days of plaintiffs who had 30 years of amphibole asbestos exposure in the workplace are largely over. Today’s cases involve little to no exposure, and it stands to reason that the origins of the recently diagnosed cases are different from those diagnosed in the 1970s and 1980s.[7] Genetic cause of mesothelioma is a salient current issue that is passed over in this new Reference Manual.

The authors acknowledge a single birth defects case in which genetic causation was litigated,[8] which was already old news when the last edition of the Manual was published. There are now many more reported cases that cry out for discussion in this under-covered area of the Manual.[9] There are also many cases not reported that have turned on genetic issues. For instance, in some cancer and birth defect cases, the existence of a highly penetrant genetic mutation that could explain the occurrence of a disease completely raises a serious question whether the plaintiff who fails to test for the mutation can possibly have carried his burden of proof.[10] And then there are myriad cases in which the parties have engaged in motion practice, sometimes extended, over access to genetic testing materials.

Genetic issues have arisen in the litigation of high-profile general causation disputes. For instance, the failure to control for genetic effects in epidemiologic studies was a significant issue in the acetaminophen-autism litigation, with both sides presenting geneticists to explain whether the relevant studies were undermined by failure to control for genetic effects.[11]

In the Manual’s epidemiology chapter’s new section on genetics, the authors describe some basic terms and explain that genetic epidemiology may provide evidence for, or against, claims of health effects. The authors’ views come through most clearly in the following short passage:

“Alternatively, genetic epidemiology may reveal associations between genetic variations and a plaintiff’s disease, raising the issue of whether or not a genetic variation may be a competing cause of the disease. This requires assessment of whether the gene–disease association is causal in a general sense, whether it acts independently of the exposure, and whether it is a competing cause in the plaintiff’s specific instance. The extreme, though not typical, example would be a health outcome or disease entirely determined by genetics,55 as is the case with sickle cell anemia.56”[12]

The authors never explain or defend their claim that cases involving diseases caused entirely by genetics are “extreme” and “not typical.” At several points, the authors emphasize that gene-environment interactions are the more prevalent determinants of diseases.[13] If we were to catalog the currently known genetic determinants of diseases, the authors may be correct on a percentage basis, but the issue in any given case is whether the disease or harm claimed by the plaintiff is one of the “extreme” cases of complete genetic causation, or an instance of genetic susceptibility. The authors’ generalization, even if it were correct, would not be very helpful or informative for any specific case.

Perhaps even more important for lawyers, there is a substantive issue on which the new chapter manages to provide confusing guidance. The epidemiology chapter appears to create a false dichotomy between rare, highly penetrant genetic mutations that are uncommon causes of certain diseases, and the more prevalent genetic mutations and polymorphisms that leave persons more susceptible to the deleterious effects of exogenous exposures to toxic chemicals.[14] There is, however, another scenario omitted in the chapter’s discussion of genetic causation. Genetic mutations and polymorphisms may leave persons susceptible to normal, endogenous chemicals, stochastic cellular events, and biological processes that result in diseases such as cancers. In other words, the knee-jerk reflex to invoke exogenous, external toxic chemical exposures promotes a false dichotomy and obscures the obvious implication that susceptibility mutations and polymorphisms may lead to cancer without environmental exposures to harmful chemicals.[15]

The number of endogenous events leading to DNA alterations is enormous, and requires us to rethink the mantra that attributes chronic diseases to gene-environment interaction. At the very least, we need to stop thinking of “environment” as chemical exposures from without ourselves. The epidemiology chapter authors, like many writers, point to external chemical exposures as the culprits in gene-environment reactions, but they ignore the normal, endogenous events that lead to DNA damage, for which genetic susceptibility may be relevant. Mutations that result in increased susceptibility to cancer may affect DNA alterations from both endogenous and metabolic factors as well as from exposures to external chemicals.

Ignorance is never a good thing, and the chapter does the bar and bench a disservice in not adequately exploring genetic susceptibility in view of both exogenous and endogenous exposures that may be responsible for chronic diseases, such as cancers.


[1] National Academies of Sciences, Engineering, and Medicine & Federal Judicial Center, REFERENCE MANUAL ON SCIENTIFIC EVIDENCE (4th ed. 2025) (cited as RMSE 4th ed.)

[2] Michael D. Green, D. Michal Freedman & Leon Gordis, Reference Guide on Epidemiology, 549, in RMSE 3rd ed.

[3] Steve C. Gold, Michael D. Green, Jonathan Chevrier, & Brenda Eskenazi, Reference Guide on Epidemiology, in RMSE 4th ed.

[4] See In re School Asbestos Litigation, 977 F.2d 764 (3d Cir. 1992). See also Cathleen M. Devlin, Disqualification of Federal Judges – Third Circuit Orders District Judge James McGirr Kelly to Disqualify Himself So As To Preserve ‘The Appearance of Justice’ Under 28 U.S.C. § 455 – In re School Asbestos Litigation (1992), 38 VILL. L. REV. 1219 (1993); Bruce A. Green, May Judges Attend Privately Funded Educational Programs? Should Judicial Education Be Privatized?:  Questions of Judicial Ethics and Policy, 29 FORDHAM URB. L. J. 941, 996-98 (2002).

[5] Steve Gold, et al., Reference Guide on Epidemiology, at 914, in RMSE 4th ed.

[6] Id. at 916.

[7] ToxicoGenomica, The Litigator’s Guide to Using Genomics in a Toxic Tort Case (2018).

[8] Id. at 917 & n.55 (citing Bowen v. E.I. Du Pont de Nemours & Co., No. CIV.A. 97C-06-194 CH, 2005 WL 1952859 (Del. Super. Ct. June 23, 2005), aff’d, 906 A.2d 787 (Del. 2006) (discussing the importance of a test for a genetic mutation, which was the defense’s alternative causation theory to plaintiff’s claim that a toxic exposure caused the birth defect at issue)). The authors fail to mention that the Bowen case was actually dismissed.

[9] See, e.g., Oliver v. Sec’y Health & Human Servs., 900 F.3d 1357 (Fed. Cir. 2018); Ortega v. United States, 2021 WL 4477896, 2021 U.S. Dist. LEXIS 188969 (N.D.Ill. Sept. 30, 2021); Vanslembrouck ex rel. Braverman v. Halperin, 2014 WL 5462596 (Mich. App. 2014).

[10] See, e.g., Halter v. Boehringer Ingelheim Pharms. Inc., no. 2023-L-001382, Cir. Ct. Cook Cty., Illinois, jury verdict (Aug. 27, 2025) (defense verdict in colorectal cancer case in which plaintiff failed to test for genetic mutation); see also Lauraann Wood, Boehringer Wins Another Zantac Cancer Trial In Illinois, LAW360, Chicago (Aug. 27, 2025).

[11] See, e.g., In re Acetaminophen – ASD-ADHD Prods. Liab. Litig., 707 F.Supp.3d 309, 320  (S.D.N.Y. 2023).

[12] Id. at 916-17 (emphasis added).

[13] Id. at 915.

[14] See, e.g., id. at 967 n.190, citing McMillan v. Dep’t of Veterans Affairs, 294 F. Supp. 2d 305, 312 (E.D.N.Y. 2003) (“It is generally accepted that genetic susceptibility plays a key role in determining the adverse effects of environmental chemicals. . . . [I]f polymorphisms of the gene encoding the AhR [protein] exist in humans as they do in laboratory animals, some people would be at greater risk or at lesser risk for the toxic and carcinogenic effects of TCDD [dioxin].”).

[15] See Edward J. Calabrese, Changing the paradigm: The biggest polluter and threat to your health is your body, J. OCCUP. & ENVT’L HYG. (2025), published on-line, ahead of print.

Signature Diseases in the New Reference Manual

January 16th, 2026

The third edition of the Reference Manual on Scientific Evidence[1] had some problems with its discussion of so-called signature diseases.[2] There was a distinct need for the  epidemiology chapter in particular to improve in its fourth edition[3] on the issue of so-called signature diseases, diseases caused by only a single cause. The third edition carved out a limited exception to its questionable generalization that epidemiology had nothing useful to say about specific causation by stating that some diseases do not occur without exposure to a specific chemical or substance.

The new, fourth edition carries forward its assertion that “[t]here are some diseases that do not occur without exposure to an agent; these are known as signature diseases.”[4] And in a footnote, the authors of the epidemiology chapter, fourth edition, attempt to explain:

“There are, however, some diseases that do not occur without exposure to a given toxic agent. This is the same as saying that the toxic agent is a necessary cause for the disease, and the disease is sometimes referred to as a signature disease (also, the agent is pathognomonic) because the existence of the disease necessarily implies the causal role of the agent. Two examples are asbestosis, which is a signature disease for asbestos, and vaginal adenocarcinoma (in young adult women), which is a signature disease for in utero DES exposure. See Kenneth S. Abraham & Richard A. Merrill, Scientific Uncertainty in the Courts, in Issues Sci. & Tech. 93, 101 (1986).”[5]

Much of this language in the footnote is repeated from the third edition, as is the citation to the article by Abraham and Merrill. That article was written by lawyers, not scientists, and is now 40 years old, inaccurate and out of date.

With respect to asbestosis, the epidemiology chapter is correct, at least in part. By definition, only asbestos can cause asbestosis, but asbestosis presents clinically in ways that are indistinguishable in many cases from idiopathic pulmonary fibrosis and other interstitial fibrotic diseases of the lungs. Over the years, the diagnostic criteria for asbestosis have changed, but these criteria have always had a specificity and sensitivity less than 100%. Saying that a case of asbestosis must have been caused by asbestos begs the clinical question whether the case really is asbestosis. The situation might be clearer for a pathology diagnosis of asbestosis, but even then there is often the problem of coincidental findings of asbestos bodies in the presence of interstitial fibrosis from another cause.

On the other hand, the chapter’s characterization of vaginal adenocarcinoma as a signature disease of in utero DES exposure is clearly not correct. Although this cancer in young women is rather rare, there is a baseline risk that allows the calculation of relative risks for young women exposed in utero.[6] In older women, the relative risks are lower because the baseline risks are higher, and because the effect of DES is diminished for older onset cases.[7] The disease, however, was known before the use of DES in pregnant women, which began after World War II,[8] and thus is not an apt or accurate example of signature disease.

The Reference Manual should really not weigh in on controversies that may arise in courtroom litigations, unless it has a very solid basis. Here the chapter on epidemiology cited to a decades old article, by lawyers, on a technical topic. The proposition about DES was readily falsified by a wee bit of research in PubMed.


[1] National Academies of Sciences, Engineering, and Medicine & Federal Judicial Center, REFERENCE MANUAL ON SCIENTIFIC EVIDENCE (3rd ed. 2011) (cited as RMSE 3rd ed.)

[2] See Schachtman, Reference Manual – Desiderata for 4th Edition – Part I – Signature Diseases, TORTINI (Jan. 30, 2023); see also Reference Manual on Scientific Evidence v4.0 (Feb. 28, 2021); Reference Manual on Scientific Evidence – 3rd Edition is Past Its Expiry (Oct. 17, 2021).

[3] National Academies of Sciences, Engineering, and Medicine & Federal Judicial Center, REFERENCE MANUAL ON SCIENTIFIC EVIDENCE (4th ed. 2025) (cited as RMSE 4th ed.).

[4] RMSE 4th ed. at 927-28 n.90.

[5] RMSE 4th ed. at 990 n.274, citing Kenneth S. Abraham & Richard A. Merrill, Scientific Uncertainty in the Courts, 2 ISSUES SCI. & TECH. 93, 101 (Winter 1986). Thankfully, the new epidemiology chapter did not put its finger on the scale about the now discredited view that mesothelioma is a signature disease of asbestos exposure. See Michele Carbone, Harvey Pass, et al., “Medical and Surgical Care of Patients With Mesothelioma and Their Relatives Carrying Germline BAP1 Mutations,” 17 J. THORACIC ONCOL. 873 (2022). See also Mitchell Cheung, et al., Novel LRRK2 mutations and other rare, non-BAP1-related candidate tumor predisposition gene variants in high-risk cancer families with mesothelioma and other tumors, 30 HUMAN MOL. GENETICS 1750 (2021); Thomas Wiesner, et al., “Toward an Improved Definition of the Tumor Spectrum Associated With BAP1 Germline Mutations,” 30 J. CLIN. ONCOL. e337 (2012); Alexandra M. Haugh, et al., Genotypic and Phenotypic Features of BAP1 Cancer Syndrome: A Report of 8 New Families and Review of Cases in the Literature, 153 J. AM. MED. ASS’N DERMATOL. 999 (2017).

[6] See, e.g., Kadir Güzin, et al., “Primary clear cell carcinoma of the vagina that is not related to in utero diethylstilbestrol use,” 3 GYNECOL. SURG. 281 (2006).

[7] Janneke Verloop, et al., Cancer risk in DES daughters, 21 CANCER CAUSES & CONTROL 999 (2010).

[8] See Risk Factors for Vaginal Cancer, American Cancer Soc’y website (last visited Jan. 16, 2026).

Reference Manual 4th Edition Corrects Some, Not All, Mistakes on Confidence Intervals

January 9th, 2026

So now that the new, fourth, edition of the Reference Manual on Scientific Evidence,[1] has been released, inquiring minds may want to know whether it has corrected errors in the previous, third, edition.[2] The authors of the new edition have had 14 years to ponder and reflect upon errors and to correct them.

Judges and lawyers look to the Manual for guidance and understanding of basic concepts, and the first three editions contained significant errors in addressing statistical concepts. There is probably no better place to jump in to see whether the new edition has corrected the prevalent mistakes in defining the statistical concept of a confidence interval, which was botched in several chapters in the third edition.[3] The concept of a confidence interval is important in many statistical applications, but it is especially important in the interpretation of epidemiologic studies.

Contrition is good for the soul. The new edition, in places, evinces an awareness that earlier editions had misled readers, and that the fourth edition needed to do better. And in several key places, including in particular the epidemiology chapter, the fourth edition has improved in its discussion of confidence intervals.

Professor David Kaye has two chapters in the new edition, one on DNA evidence, and another chapter, with Professor Hal Stern, on statistical evidence.[4] Kaye is a careful writer with substantial statistical expertise. His contributions to the third edition were anodyne treatments of statistical concepts, and his chapters in the new edition seem excellent as well upon first reading. In his chapter on DNA evidence, Kaye alludes to the misunderstandings and misrepresentations of the confidence interval,[5] and in his chapter on statistical evidence, Kaye, along with Stern, gives careful definitions and explications of confidence intervals.

Kaye and Stern call out several cases, frequently cited, for having given clearly incorrect definitions of confidence intervals. This sort of candor to the court is necessary if judges, and lawyers, are going to correct bad practices.[6] The statistics chapter in the fourth edition also does not shy away from calling out the authors of another chapter [epidemiology] in the Reference Manual’s third edition for having given erroneous definitions:

“Language from another reference guide in the previous edition of this Reference Manual that is often quoted may inadvertently convey the incorrect impression that a confidence coefficient such as 95% refers to the percentage of results in (hypothetically) repeated studies that would be expected to lie within the interval reported in the study before the court.”[7]

A very gentle criticism indeed; the epidemiology chapter was manifestly incorrect, and we can all agree that its error was negligent, not intentional. The epidemiology chapter from the third edition did not merely convey the incorrect impression; that chapter contained erroneous definitions of confidence intervals.

Kaye and Stern correctly note that a given confidence interval “does not give the probability that the unknown parameter lies within the confidence interval.”[8] And they helpfully point out that there is no tendency for the point estimate near the center of a confidence interval to be closer to the true value than any other value within the interval.[9]

The authors of the new edition’s chapter on epidemiology obviously got the message from Professors Kaye and Stern.[10] Fourth time is a charm. The epidemiology chapter in the third edition had been a mess on statistical issues.[11] Without any acknowledgment or confession of error committed in the first three editions, the authors of the epidemiology chapter in the fourth edition now note:

“Just as the p-value does not provide the probability that the risk estimate found in a study is correct, the confidence interval does not provide the range within which the true risk is likely to lie. In other words, it is a misconception to interpret a 95% confidence interval as representing an interval within which the true value has a 95% probability of being found.”[12]

Unfortunately, in the glossary at the end of the new edition’s epidemiology chapter, the erroneous definition of confidence interval was carried forward from the third edition, without change or correction:

“confidence interval. A range of values that reflects random error. Thus, if a confidence level of 0.95 is selected for a study, 95% of similar studies would result in the true relative risk falling within the confidence interval.”[13]

What the authors no doubt meant to write was that:

“95% of similar studies would result in the true relative risk falling within the confidence intervals.”

By putting “interval” in the singular, the authors fell into the trap described by Professors Kaye and Stern, and into the error that the previous chapters on epidemiology committed.
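The difference between the singular and the plural can be checked with a small simulation. What follows is only a sketch under stated assumptions, not anything taken from the Manual: the true value, the standard deviation, the sample size, and the function name are all hypothetical, and each simulated "study" is simply a sample mean with a known standard error. Every study computes its own interval, and the question asked is how often those many intervals contain the one fixed true value.

```python
import math
import random


def coverage_of_repeated_intervals(true_value=0.0, sigma=1.0, n=50,
                                   studies=10_000, seed=1):
    """Fraction of simulated studies whose OWN 95% interval contains true_value.

    All parameter values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    se = sigma / math.sqrt(n)  # standard error of each study's sample mean
    covered = 0
    for _ in range(studies):
        # One simulated study: n observations, summarized by their mean.
        sample_mean = sum(rng.gauss(true_value, sigma) for _ in range(n)) / n
        lower = sample_mean - 1.96 * se
        upper = sample_mean + 1.96 * se
        # Each study gets a different interval; the true value never moves.
        if lower <= true_value <= upper:
            covered += 1
    return covered / studies


if __name__ == "__main__":
    print(coverage_of_repeated_intervals())
```

Under these assumptions the printed fraction should land close to 0.95. The 95% describes the long-run performance of the procedure that generates the intervals (plural), which is why the singular "interval" in the glossary misstates the concept.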

The new edition of the Reference Manual appears to suffer, at least on this statistical issue, from the lack of high-level editing across chapters. The interaction between authors of the statistics and the epidemiology chapters sorted out a serious error, but the error pops up in new chapters. Michael Weisberg and Anastasia Thanukos have an introductory chapter on How Science Works, which crudely and incorrectly describes confidence intervals:

“Uncertainty and error are generally expressed as a range, within which we are confident that, if the study were repeated, the new result would fall. Scientists often use a 95% confidence interval for this purpose.”[14]

Confidence intervals model only random error, and the “range” around one point estimate does not give us “confidence” that the next point estimate would fall into that range.
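A rough simulation illustrates the point. The sketch below assumes an idealized setting that is not drawn from the Manual: two independent studies of the same size estimate the same true value, each estimate is normally distributed with the same known standard error, and the function name and parameter values are hypothetical. It counts how often the second study's point estimate falls inside the first study's 95% interval.

```python
import math
import random
from statistics import NormalDist


def next_estimate_capture_rate(true_value=0.0, se=0.2, pairs=20_000, seed=2):
    """Fraction of replications whose point estimate lands inside the
    first study's 95% interval. Parameter values are hypothetical."""
    rng = random.Random(seed)
    captured = 0
    for _ in range(pairs):
        first = rng.gauss(true_value, se)   # point estimate, original study
        second = rng.gauss(true_value, se)  # point estimate, replication
        # The original interval is first +/- 1.96 * se.
        if abs(second - first) <= 1.96 * se:
            captured += 1
    return captured / pairs


# Under the same assumptions, the difference between two independent
# estimates has standard deviation se * sqrt(2), which gives this value.
theoretical = 2 * NormalDist().cdf(1.96 / math.sqrt(2)) - 1

if __name__ == "__main__":
    print(next_estimate_capture_rate(), theoretical)
```

In this idealized setup the capture rate comes out at roughly 83%, not 95%, because the first estimate and the replication are both subject to random error. Real studies, with bias and differing designs, would not be expected to do better.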

The chapter on regression analyses in the third edition of the Reference Manual incorrectly defined confidence intervals.[15] Alas the fourth edition did not auto-correct:

“Loosely speaking, a confidence interval represents an interval of values in which the true value of a regression coefficient falls within some pre-specified probability (where the true value is the estimate that would be obtained from the same model with a very large sample).”[16]

Why the authors of a highly technical chapter chose to speak loosely, rather than accurately, is a mystery. All the authors of the regression chapter had to do was refer to the accurate, helpful definitions in the statistics chapter.

Why should we care about the Reference Manual’s misleading, incorrect definitions of confidence intervals (or p-values for that matter)? The erroneous definitions and misuses typically place a Bayesian interpretation upon the confidence interval by claiming that the coefficient of confidence (typically 95% when alpha is set at 0.05) states the probability that the parameter, the true population measure, falls within the interval around the point estimate. This misinterpretation might suffice for a Bayesian 95% credible interval, but almost invariably the calculation under discussion is the point estimate ± 1.96 standard errors. Good statistics, like good grammar, costs nothing.
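For readers who want to see the arithmetic, here is a minimal sketch of the usual "point estimate ± 1.96 standard errors" calculation for a relative risk from a cohort study, carried out on the logarithmic scale. The counts are invented for illustration and come from no study, the function name is hypothetical, and the standard error formula is the common large-sample approximation for the log of a risk ratio.

```python
import math


def relative_risk_with_ci(exposed_cases, exposed_total,
                          unexposed_cases, unexposed_total, z=1.96):
    """Relative risk with a large-sample confidence interval (z = 1.96 for 95%)."""
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    rr = risk_exposed / risk_unexposed
    # Approximate standard error of ln(RR) for cumulative-incidence data.
    se_log_rr = math.sqrt(1 / exposed_cases - 1 / exposed_total
                          + 1 / unexposed_cases - 1 / unexposed_total)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


if __name__ == "__main__":
    # Hypothetical counts: 30 cases among 1,000 exposed persons,
    # 15 cases among 1,000 unexposed persons.
    rr, lower, upper = relative_risk_with_ci(30, 1000, 15, 1000)
    print(f"RR = {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

Nothing in this calculation uses a prior probability, so its output is a frequentist confidence interval and cannot be read as a statement that the true relative risk has a 95% probability of lying between the two limits.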

Whether the conflation of confidence intervals with credible intervals results from ignorance or willful efforts to mislead, it is wrong.  And the conflation is part of a long-running rhetorical campaign to mislead about the meaning of the burden of proof and statistical significance in order to abandon statistical tests, and to green-light precautionary principle judgments as “scientific.”[17]

In past posts, I have cited and quoted any number of scientists and lawyers who have engaged in the effort, either intentional or negligent, to mislead readers about the nature of science, by idealizing and falsely elevating the burden of proof in science, and declaring it to be different from the legal and regulatory burden of proof.[18]

To pick one particularly notorious author, consider junk science writer Naomi Oreskes.[19] In her 2010 book, Oreskes declares:

“The 95 percent confidence standard means that there is only 1 chance in 20 that you believe something that isn’t true.

* * * * *

That is a very high bar. It reflects a scientific worldview in which skepticism is a virtue, credulity is not.”[20]

In fact, in statistics, science, and law, the confidence interval has nothing to do with the burden of proof; rather it reflects the precision of a single point estimate. Truth is a virtue that may be lost on the likes of Naomi Oreskes, but it is essential to litigating scientific issues. Given that many lawyers in the past had cited the Reference Manual’s chapter on epidemiology for its incorrect definitions of the statistical confidence interval, we should rejoice that this one error has been corrected.


[1] National Academies of Sciences, Engineering, and Medicine & Federal Judicial Center, REFERENCE MANUAL ON SCIENTIFIC EVIDENCE (4th ed. 2025) (cited as RMSE 4th ed.)

[2] National Academies of Sciences, Engineering, and Medicine & Federal Judicial Center, REFERENCE MANUAL ON SCIENTIFIC EVIDENCE (3rd ed. 2011) (cited as RMSE 3rd ed.)

[3] See Nathan Schachtman, Reference Manual – Desiderata for 4th Edition – Part IV – Confidence Intervals, TORTINI (Feb. 10, 2023).

[4] In RMSE 3rd ed., Professor Kaye, along with David Freedman, wrote the chapter on statistical evidence; the two gave careful definitions and explications of confidence intervals.  Professor Freedman sadly died before the third edition was released, and he is replaced by Hal Stern in the chapter on statistics in the fourth edition.

[5] David H. Kaye, Reference Guide on Human DNA Identification Evidence in RMSE 4th ed. at 261, (noting that “the meaning of a confidence interval is subtle, and the estimate commonly is misconstrued”).

[6] See Kaye & Stern, RMSE 4th ed. at 511 n.125 (citing Turpin v. Merrell Dow Pharm., Inc., 959 F.2d 1349, 1353 (6th Cir. 1992) (“If a confidence interval of ‘95 percent between 0.8 and 3.10’ is cited, this means that random repetition of the study should produce, 95 percent of the time, a relative risk somewhere between 0.8 and 3.10.”); Garcia v. Tyson Foods, Inc., 890 F. Supp. 2d 1273, 1285 (D. Kan. 2012) (“Dr. Radwin testified that his study was conducted within a confidence interval of 95 — that is ‘if I did this study over and over again, 95 out of a hundred times I would expect to get an average between that interval.’”); In re Silicone Gel Breast Implants Prods. Liab. Litig., 318 F. Supp. 2d 879, 897 (C.D. Cal. 2004) (“a margin of error between 0.5 and 8.0 at the 95% confidence level . . . means that 95 times out of 100 a study of that type would yield a relative risk value somewhere between 0.5 and 8.0”)).

[7] See Kaye & Stern, RMSE 4th ed. at 511 n.125 (citing Rhyne v. U.S. Steel Corp., 474 F. Supp. 3d 733, 744 (W.D.N.C. 2020) (“‘If a 95% confidence interval is specified, the range encompasses the results we would expect 95% of the time if samples for new studies were repeatedly drawn from the population.’ Reference Guide on Epidemiology, at 580.”)).

[8] Kaye & Stern, RMSE 4th ed. at 512 & n. 126 (citing additional errant judicial decisions, and Geoff Cumming & Robert Maillardet, Confidence Intervals and Replication: Where Will the Next Mean Fall?, 11 PSYCH. METHODS 217 (2006).)

[9] Id. at 512.

[10] Steve C. Gold, Michael D. Green, Jonathan Chevrier, & Brenda Eskenazi, Reference Guide on Epidemiology, in RMSE 4th ed. at 897

[11] Michael D. Green, D. Michal Freedman & Leon Gordis, Reference Guide on Epidemiology, 549, 573, 580, in RMSE 3rd ed.

[12] Steve C. Gold, Michael D. Green, Jonathan Chevrier, & Brenda Eskenazi, Reference Guide on Epidemiology, RMSE 4th ed. at 897, 939.

[13] Id. at 1011.

[14] Michael Weisberg & Anastasia Thanukos, How Science Works, in RMSE 4th ed. at 47, 90.

[15] Daniel Rubinfeld, Reference Guide on Multiple Regression, RMSE 3rd ed. at 303, 342, 352.

[16] Daniel Rubinfeld & David Card, Reference Guide on Multiple Regression and Advanced Statistical Models, in RMSE 4th ed. at 577, 613.

[17] Schachtman, Rhetorical Strategy in Characterizing Scientific Burdens of Proof, TORTINI (Nov. 11, 2014).

[18] See, e.g., Kevin C. Elliott & David B. Resnik, Science, Policy, and the Transparency of Values, 122 ENVT’L HEALTH PERSP. 647 (2014) (exemplifying the rhetorical strategy that idealizes and elevates a burden of proof in science, and then declares it to be different from legal and regulatory burdens of proof).

[19] Schachtman, Playing Dumb on Statistical Significance, TORTINI (Jan. 4, 2015); The Rhetoric of Playing Dumb on Statistical Significance – Further Comments on Oreskes, TORTINI (Jan. 17, 2015).

[20] Naomi Oreskes & Erik M. Conway, MERCHANTS OF DOUBT: HOW A HANDFUL OF SCIENTISTS OBSCURED THE TRUTH ON ISSUES FROM TOBACCO SMOKE TO GLOBAL WARMING at 156-57 (2010).

A New Year, A New Reference Manual

January 5th, 2026

The fourth edition of the Reference Manual on Scientific Evidence was quietly released in the waning hours of 2025, in the twilight of American democracy.[1] The Manual had been slated to be published in 2023, but that date slid to 2024, and then to 2025. Perhaps the change in directorship of the Federal Judicial Center slowed things up. (Judge Robin Rosenberg of Zantac fame is now the Director.)

The new volume is available for download at:

https://www.nationalacademies.org/publications/26919

Although I was a reviewer of one chapter of the Manual, I am just seeing this new edition for the first time today. The basic structure of the volume has not changed, although it has now grown to over 1,600 pages. Many of the key chapters on statistics, epidemiology, toxicology, and medical testimony are carried over from previous editions, with some new authors added and some previous authors no longer participating. In addition, there are some new chapters on exposure science, artificial intelligence, climate science, mental health, neuroscience, and eyewitness identification.

The individual chapters and authors in the new edition of the Manual are:

Liesa L. Richter & Daniel J. Capra, The Admissibility of Expert Testimony, at 1.

Michael Weisberg & Anastasia Thanukos, How Science Works, at 47

Valena E. Beety, Jane Campbell Moriarty, & Andrea L. Roth, Reference Guide on Forensic Feature Comparison Evidence, at 113

David H. Kaye, Reference Guide on Human DNA Identification Evidence, at 207

Thomas D. Albright & Brandon L. Garrett, Reference Guide on Eyewitness Identification, at 361

David H. Kaye & Hal S. Stern, Reference Guide on Statistics and Research Methods, at 463

Daniel L. Rubinfeld & David Card, Reference Guide on Multiple Regression and Advanced Statistical Models, at 577

Shari Seidman Diamond, Matthew Kugler, & James N. Druckman, Reference Guide on Survey Research, at 681

Mark A. Allen, Carlos Brain, & Filipe Lacerda, Reference Guide on Estimation of Economic Damages, at 749

Prologue to the Reference Guide on Exposure Science and Exposure Assessment, the Reference Guide on Epidemiology, and the Reference Guide on Toxicology, at 829i

Elizabeth Marder & Joseph V. Rodricks, Reference Guide on Exposure Science and Exposure Assessment, at 831

Steve C. Gold, Michael D. Green, Jonathan Chevrier, & Brenda Eskenazi, Reference Guide on Epidemiology, at 897

David L. Eaton, Bernard D. Goldstein, & Mary Sue Henifin, Reference Guide on Toxicology, at 1027

John B. Wong, Lawrence O. Gostin, & Oscar A. Cabrera, Reference Guide on Medical Testimony, at 1105

Henry T. Greely & Nita A. Farahany, Reference Guide on Neuroscience, at 1185

Kirk Heilbrun, David DeMatteo, & Paul S. Appelbaum, Reference Guide on Mental Health Evidence, at 1269

Chaouki T. Abdallah, Bert Black, & Edl Schamiloglu, Reference Guide on Engineering, at 1353

Brian N. Levine, Joanne Pasquarelli, & Clay Shields, Reference Guide on Computer Science, at 1409

James E. Baker & Laurie N. Hobart, Reference Guide on Artificial Intelligence, at 1481

Jessica Wentz & Radley Horton, Reference Guide on Climate Science, at 1561

Some quick comments on changes in authorship in some of the chapters. Bernard Goldstein, a member of the dodgy Collegium Ramazzini, remains an author of the toxicology chapter in the new edition. David Eaton, however, has been added. Professor Eaton was the president of the Society of Toxicology for many years, and perhaps he has brought some balance to the new edition’s work on toxicology.

An author of the statistics chapter, David Kaye, is also the sole author of the chapter on DNA evidence. Professor Kaye is a distinguished scholar of DNA evidence with serious statistical expertise. David Freedman had been a co-author of the statistics chapter in the third edition, but sadly Professor Freedman died before the third edition was published. Freedman is replaced by Hal Stern, an accomplished statistician from the University of California.

The chapter on epidemiology lost Leon Gordis, who died in 2015. The chapter in the fourth edition has the return of law professors Steve C. Gold and Michael D. Green, whose pro-plaintiff biases are well known, along with two new authors, epidemiology professors Jonathan Chevrier and Brenda Eskenazi. Like Goldstein, Eskenazi is a fellow of the Collegium Ramazzini.

The Reference Manual, for better or worse, has had substantial influence on the litigation of scientific and technical issues in federal court, and in some state courts as well. I hope to write more substantively about the new edition in 2026.


[1] National Academies of Sciences, Engineering, and Medicine & Federal Judicial Center, Reference Manual on Scientific Evidence (4th ed. 2025).

AAAS Conference on Scientific Evidence and the Courts

September 8th, 2025

Back in September 2023, the American Association for the Advancement of Science (AAAS), with its Center for Scientific Responsibility and Justice, sponsored a two-day meeting on Scientific Evidence and the Courts. If there were notices for this conference, I missed them. The meeting presentations are now available online. Judging from camera views of the audience, the conference did not appear to be well attended. Most of the material was forgettable, but some of the presentations are worth watching.

Jennifer L. Mnookin opened the conference with a keynote presentation on “Where Law and Science Meet.” Chancellor Mnookin presented a broad overview and some interesting insights on the development of the evidence law of expert witness testimony.

Following Mnookin, Professors Ronald Allen and Andrew Jurs presented on the “Unintended Impacts [sic] of the Daubert Standard.” The conference took place only a few months before the amendment to Rule 702 became effective, and the reference to a “Daubert” standard was untoward. Allen’s comments followed the path of his previous articles. Jurs presented some empirical legal research, which seemed flawed for its assumption that the Frye standard was universally applied in federal court before the advent of Daubert. As an assessment of whether these standards lead to different outcomes, Jurs’ empirical research seemed both invalid and very much beside the point, given that both standards have been applied heterogeneously, that one standard, Frye, is often not applied at all, and that Daubert is often flyblown by judges hostile to the gatekeeping enterprise. Both presenters missed the key point of Daubert, in which case plaintiff’s counsel advocated for no standard at all, beyond basic subject-matter qualification, for giving expert opinions in court.

In a session on “An International Perspective,” Scott Carlson discussed the efforts of the American Bar Association (ABA), and its Center of Global Programs, on supporting judges in foreign countries. Prateek Sibal discussed the history and work of the UNESCO Global Judges Initiative. My sincere wish is that the ABA would support judges more in the United States.

Panelists Valerie P. Hans, Emily Murphy, and Dr. Michael J. Saks presented on various jury issues, in a session “In the Minds of the Jury.” The presentations on how foreign countries process expert witness testimony were lacking any mention of how juries rarely if ever sit in civil cases that involve complex technical and scientific issues.

Two editors of scientific journals, Adriana Bankston and Valda Vinson, along with law professor Michael Saks, spoke about peer review and publication, in a session “As a Matter of Fact: ‘General Acceptance’ in Emerging vs. Established Science.” Their discussion on the publication process shed very little light on how courts and juries should assess the validity of specific papers, particularly in view of the lax practices at many journals. Towards the end of this session, a question from the audience proved to be very revealing of the prejudices of the law professor on the panel. The questioner rose to complain that after beginning research on a topic that has litigation relevance, her research is now frequently questioned. She asked the panel how she might deal with the annoyance of being questioned. Some on the panel basically urged her to buck up, but the law professor invoked the spirit of agnotologist, and lawsuit industry expert witness, David Michaels, to suggest that “manufacturing doubt” was just a corporate tactic in the face of scientific evidence. The prejudice against corporate speech is remarkable when the lawsuit industry has a long history of playing the ad hominem game in advancing its pecuniary interests.

The session that followed addressed how trustworthy science might best be put before courts. The organizers described this session, “Utilizing Scientific and Technical Expertise,” as going to the heart of the issues targeted by the conference. Joe S. Cecil, Deanne M. Ottaviano, and Shari Seidman Diamond discussed how scientific expertise enters into the evidentiary record in American courtrooms. Their presentations were interesting, but curiously no one mentioned that the primary avenue for expert witness opinion is through oral testimony!

Joe Cecil discussed methods judges have to obtain scientific and technical evidence to advance justice. (By this I hope he meant the truth, and not just the outcome preferred by social justice warriors.) As noted, Joe Cecil did not focus on the ordinary methods of direct and cross-examination of party expert witnesses, but rather, he identified other methods of introducing expertise into the courtroom for the benefit of the judge or the jury. Only one suggestion really affects jury comprehension, namely the appointment of non-party expert witnesses by the court. The other methods really only provide expertise to the trial judge, who perhaps is challenged to make a ruling under Federal Rule of Evidence 702. The federal courts have the inherent supervisory power to appoint technical advisors to act as special law clerks on issues. Similarly, appointed special masters can address technical implementation issues, subject to the district judges’ control. The judges are always free to read outside the briefs and testimony, but there are ethical and notice issues for such conduct. The Reference Manual on Scientific Evidence (RMSE) sits on every federal judge’s bookshelf, even if in pristine, unused condition. Judges can at least read the RMSE on specific issues without having to disclose their extra-curricular research to the parties. Of course, parties are well advised to consider any materials in the RMSE that support or oppose their contentions.

In discussing the RMSE, Cecil noted that the fourth edition was in the works. He also mentioned that all the old chapter topics would be carried forward to the fourth edition, and that new topics would include eyewitness identification, computer science, artificial intelligence, and climate science. Sadly, there will be no chapter on genetic determination of disease, but perhaps the clinical medicine chapter will take on the subject in greater detail than previous editions. This conference took place two years ago, and yet the RMSE, fourth edition, is still not published. The National Academies website previously listed the project as completed, but the site now describes the work as “in progress.”

Joe Cecil’s analysis of the various extraordinary expert techniques was pretty much spot on, especially his assessment that “experiments” with court-appointed experts were often failures or at best modest successes. The discussion of Judge Pointer’s Rule 706 independent expert witnesses in the silicon [sic] breast implant litigation, MDL926, seemed to lack context. Cecil acknowledged that the court’s expert witnesses contributed some value to admissibility decisions, but Judge Pointer notoriously did not believe that he, as the MDL judge, had any responsibility for Rule 702 determinations, and he made none except in cases that he tried in the Northern District of Alabama. (And these decisions were before the Science Panel was appointed.) So the Rule 706 witnesses really could not have aided in admissibility decisions.

The real value – in my view – of the Science Panel was that it demonstrated that Judge Pointer was quite wrong in believing that both sides’ expert witnesses were simply “too extreme,” or too partisan, and that the truth was somehow in the middle. Indeed, Judge Pointer said so on many occasions, and he was judicially gobsmacked when all four of his experts roundly rejected the plaintiffs’ distortions of the science of immunology, epidemiology, toxicology, and rheumatology. The court’s expert witnesses sat for discovery depositions, and then gave testimony de bene esse. To my knowledge, their testimony was never admitted in any of the subsequent trials.

Judge Jed Rakoff gave an interesting presentation, “Strengthening Cooperation Between the Scientific Enterprise and the Justice System,” on the intersection between scientific and legal expertise and the need for their better integration. Judge Rakoff focused on the astonishing lack of compliance of trial judges with the gatekeeping requirements of Rule 702 in addressing the admissibility of forensic evidence. Several subsequent panels also addressed forensic topics, including “A Texas Case Study in Accountability for Forensic Sciences,” “Innovations in Investigative Technologies Improvements and Drawbacks,” “Artificial Intelligence and the Courts,” “Wrongful Convictions and Changed Science: Statutes,” and “Standing Up for Justice: When the Law and Science Work Hand-in-Hand.”

One of the more curious sessions was on “Statistical Modeling and Causation Science,” presented by the American Statistical Association along with the AAAS. Maria Cuellar, from the University of Pennsylvania, discussed the role of statistical thinking in causal assessment, with slides that referred to a nonparametric estimator for the probability of causation. Cuellar, however, never defined what an estimator was; nor did she differentiate nonparametric from parametric estimators. She displayed other equations, again without explaining their origin and meaning, or identifying their symbols. Similarly, Rochelle E. Tractenberg discussed the use of statistics as evidence and as part of causal inference in litigation, in a model of unclarity. At one point, Tractenberg appeared to suggest that general causation could be taken from regulatory pronouncements. Her discussion of glyphosate implied that general causation was established, which may have led me to disregard her presentation.

Finally, the conference sported a discussion, “Toxic Tort 2.0: Emerging Trends in Climate Change Related Litigation.” The two presenters were Dr. L. Delta Merner, the “Lead Scientist” for the Science Hub for Climate Litigation, Union of Concerned Scientists, and Dr. Paul A. Hanle, Visiting Scholar and Founder of the Climate Judiciary Project, Environmental Law Institute. The Science Hub actively promotes climate change litigation, which made me wonder whether its scientists are involved in that new chapter in the upcoming fourth edition of the Reference Manual.

Systematic Reviews versus Expert Witness Reports

July 2nd, 2025

Back in November 2024, I posted that the fourth edition of the Reference Manual on Scientific Evidence was completed, and that its publication was imminent. I based my prediction upon the National Academies’ website, which reported that the project had been completed. Alas, when no Manual was forthcoming, I checked back, and the project was, and is as of today, marked as “in progress.” The NASEM website provides no explanation for the retrograde movement. Could the Manual have been DOGE’d? Did Robert F. Kennedy Jr. insist that a chapter on miasma theory be added?

Ever since the third edition of the Manual arrived, I have tried to identify its strengths and weaknesses, and to highlight topics and coverage that should be improved in the next edition. In 2023, knowing that people were working on submissions for the fourth edition, I posted a series of desiderata for the new edition.[1] I might well have extended the desiderata, but I thought that work was close to completion.

One gaping omission in the third edition of the Manual, which I did not address, is the dearth of coverage of the synthesis of data and evidence across studies. To be sure, the chapter on medical testimony does discuss the “hierarchy of medical evidence,” and places the systematic review at the apex.[2] The chapter on epidemiology, however, fails to discuss systematic reviews in a meaningful way, and treats meta-analysis, which ideally pre-supposes a systematic review, with some hostility and neglect.[3]

Notwithstanding the glaring omission in the 2011 version of the Reference Manual, the legal academy had been otherwise well aware of the importance of properly conducted systematic reviews. Back in 2006, Professor Margaret Berger organized a symposium on law and science, at which John Ioannidis presented on the importance of systematic reviews.[4] Lisa Bero also presented on systematic reviews and meta-analyses, and identified a significant source of bias in such reviews that results when authors limit their citations to studies that support their pre-selected, preferred conclusion.[5] Bero’s contribution, however, missed the point that a well-conducted systematic review makes cherry picking much more difficult, as well as obvious to the reader.

The high prevalence of biased citation of, and biased reliance upon, studies is a major source of methodological error in courtroom proceedings. Even when the studies relied upon are reasonably well done, expert witnesses can manipulate the evidentiary display through biased selection and exclusion of what to present in support of their opinions. Sometimes astute judges recognize and bar expert witnesses who would pass off their opinions as well considered when they are propped up only by biased citation. Unfortunately, courts have not always been vigilant and willing to exclude expert witnesses who proffer biased, invalid opinions based upon cherry-picked evidence.[6] Given that cherry picking or “biased citation” is recognized in the professional community as a rather serious methodological sin, judges may be astonished to learn that neither phrase, “cherry picking” or “biased citation,” appears in the third edition of the Reference Manual on Scientific Evidence. With the delay in publishing the fourth edition, there is still time to add citations to careful debunking of biased citation, such as the court’s rejection of a reverse-engineered systematic review and meta-analysis in last year’s decision in the paraquat parkinsonism litigation.[7]

When I began my courtroom career, systematic reviews of the evidence for a causal claim were virtually non-existent. Most reviews and textbook chapters were hipshots that identified a few studies that supported the author’s preferred opinion, with perhaps a few disparaging words about a study that contradicted the author’s preferred outcome. On a controversial issue, lawyers could generally find a textbook or review article on either side of an issue. Cross-examination on a so-called “learned treatise,” however, was limited. In state courts, the learned treatise was not admissible for its truth, but only to show that expert witnesses should not be believed when they disagreed with the statement. It was all too easy for an expert witness to declare, “yes, I disagree with that one sentence, on one page, out of 1,500 pages, in that one book.”

In federal courts, the applicable rule of evidence makes the learned treatise statement admissible for its truth:

“Rule 803. Exceptions to the Rule Against Hearsay

The following are not excluded by the rule against hearsay, regardless of whether the declarant is available as a witness:

(18) Statements in Learned Treatises, Periodicals, or Pamphlets. A statement contained in a treatise, periodical, or pamphlet if:

(A) the statement is called to the attention of an expert witness on cross-examination or relied on by the expert on direct examination; and

(B) the publication is established as a reliable authority by the expert’s admission or testimony, by another expert’s testimony, or by judicial notice.

If admitted, the statement may be read into evidence but not received as an exhibit.”

While this rule historically had some importance in showing the finder of fact that the opinion given in court was not shared with the relevant expert community, the rule was and is problematic. Exactly what counts as “learned” is undefined. Expert witnesses on either side can simply endorse a treatise, a periodical, or a pamphlet as learned to enable a lawyer to use it on direct or cross-examination, and make its contents admissible. The rule was drafted and enacted in 1975, when another rule, Rule 702, was generally interpreted to place no epistemic restraints upon expert witnesses. Allowing Rule 803(18) to be invoked without the epistemic constraints of Rules 702 and 703 raised few concerns in 1975, but in the aftermath of Daubert (1993), the tension within the Federal Rules of Evidence means that the admissibility of a statement in a learned treatise cannot save an expert witness opinion that is not otherwise sufficiently grounded and valid.[8]

Systematic reviews are a different kettle of fish from the sort of textbook opinions of the 1970s and 1980s, which often lacked comprehensive assessments and consistent application of criteria for validity. The intersection of the evolution of Rule 702 and systematic reviews is remarkable. When Rule 702 was drafted, systematic reviews were non-existent. When the Supreme Court decided the Daubert case in 1993, systematic reviews were just emerging as a different and superior form of evidence synthesis.[9] The lesson for judges, regulators, and lawyers is that the standards for valid synthesis of studies and lines of evidence have changed and become more demanding.

In 2009, several professional groups produced an important guidance for reporting systematic reviews, “the Preferred Reporting Items for Systematic reviews and Meta-Analyses,” or PRISMA.[10] Although the PRISMA guidance ostensibly addresses reporting, if authors have not done something that should be reported, their failure to do it and report about it can be identified as a significant omission from their publication. One of the PRISMA specifications called for the writing of a protocol for any systematic review, and for making this protocol available to the scientific community and the public. The protocol will identify the exact clinical issue under review, the kinds of evidence that bear on the issue, and the criteria for including or excluding studies. The requirement of pre-registration has the ability to damp down data dredging in observational studies and experiments, and to help readers see when authors reverse engineered systematic reviews by declaring their criteria for inclusion and exclusion after reading candidate studies and their conclusions.

In 2011, the Centre for Reviews and Dissemination, at the University of York in England, developed an internet archive, PROSPERO, for prospectively registering systematic reviews. In addition to reducing duplication of systematic reviews, PROSPERO aimed to increase transparency, validity, and integrity of the systematic reviews. Around the same time, the Center for Open Science also set up a web-based archive for systematic review protocols.[11]

Reviews purporting to be systematic are now commonplace. By 2018, PROSPERO had registered over 30,000 records, but of course, some scientists may have registered systematic reviews which they never completed.[12] Despite the publication of professional guidances, carefully performed systematic reviews can still be hard to find.[13]

In federal court, expert witnesses must proffer their opinions in a specified form. Back in the 1980s, federal court practice on expert witnesses was “loose” not only on admissibility issues, but also on the requirements for pre-trial disclosure of opinions. In some federal districts, such as those within Pennsylvania, federal judges took their cues not from the language of the Federal Rules of Civil Procedure, but from state court practice, which required only cursory disclosure of top-level opinions without identifying all facts and data relied upon by the proposed expert witness. In many state courts, and in some federal judicial districts, lawyers had difficulty obtaining judicial authorization to conduct examinations before trial to discover all the bases and reasoning (if any) behind an expert witness’s opinion. Under the current version of the Federal Rules of Civil Procedure, trial by ambush has generally given way to full discovery. The current version of Rule 26 provides:

Rule 26. Duty to Disclose; General Provisions Governing Discovery

(a) Required Disclosures.

* * *

(2) Disclosure of Expert Testimony.

(A) In General. In addition to the disclosures required by Rule 26(a)(1), a party must disclose to the other parties the identity of any witness it may use at trial to present evidence under Federal Rule of Evidence 702, 703, or 705.

(B) Witnesses Who Must Provide a Written Report. Unless otherwise stipulated or ordered by the court, this disclosure must be accompanied by a written report—prepared and signed by the witness—if the witness is one retained or specially employed to provide expert testimony in the case or one whose duties as the party’s employee regularly involve giving expert testimony. The report must contain:

(i) a complete statement of all opinions the witness will express and the basis and reasons for them;

(ii) the facts or data considered by the witness in forming them;

(iii) any exhibits that will be used to summarize or support them;

(iv) the witness’s qualifications, including a list of all publications authored in the previous 10 years;

(v) a list of all other cases in which, during the previous 4 years, the witness testified as an expert at trial or by deposition; and

(vi) a statement of the compensation to be paid for the study and testimony in the case.

An expert’s report or disclosure under Rule 26 remains a far cry from a systematic review, but the Rule goes a long way towards eliminating trial by ambush and surprise in requiring a complete statement of all opinions, all the bases and reasons for the opinions, and all the facts or data considered in reaching the opinions. The requirements of Rule 26, combined with a mandatory oral deposition, go a long way to help reveal cherry picking and motivated reasoning in an expert witness’s opinions.


[1] Schachtman, “Reference Manual – Desiderata for 4th Edition – Part I – Signature Diseases,” Tortini (Jan. 30, 2023); “Reference Manual – Desiderata for 4th Edition – Part II – Epidemiology & Specific Causation,” Tortini (Jan. 31, 2023); “Reference Manual – Desiderata for 4th Edition – Part III – Differential Etiology,” Tortini (Feb. 1, 2023); “Reference Manual – Desiderata for 4th Edition – Part IV – Confidence Intervals,” Tortini (Feb. 10, 2023); “Reference Manual – Desiderata for 4th Edition – Part V – Specific Tortogens,” Tortini (Feb. 14, 2023); “Reference Manual – Desiderata for 4th Edition – Part VI – Rule 703,” Tortini (Feb. 17, 2023).

[2] See John B. Wong, Lawrence O. Gostin, and Oscar A. Cabrera, “Reference Guide on Medical Testimony,” in Reference Manual on Scientific Evidence 687, 723-24 (3d ed. 2011) (discussing hierarchy of medical evidence, with systematic reviews at the apex).

[3] Schachtman, “The Treatment of Meta-Analysis in the Third Edition of the Reference Manual on Scientific Evidence,” Tortini (Nov. 14, 2011).

[4] John P.A. Ioannidis & Joseph Lau, Systematic Review of Medical Evidence, 12 J.L. & Pol’y 509 (2004).

[5] Lisa Bero, “Evaluating Systematic Reviews and Meta-Analyses,” 14 J. L. & Policy 569, 576 (2006).

[6] See Schachtman, “Cherry Picking; Systematic Reviews; Weight of the Evidence,” Tortini (April 5, 2015); “The Fallacy of Cherry Picking As Seen in American Courtrooms,” Tortini (May 3, 2014);  “The Cherry-Picking Fallacy in Synthesizing Evidence,” Tortini (June 15, 2012).

[7] In re Paraquat Prods. Liab. Litig., 730 F. Supp. 3d 793 (S.D. Ill. 2024); see also Schachtman, “Paraquat Shape-Shifting Expert Witness Quashed,” Tortini (Apr. 24, 2024).

[8] See Schachtman, “Unlearning the Learned Treatise Exception,” Tortini (Aug. 21, 2010).

[9] Iain Chalmers, Larry V. Hedges, Harris Cooper, “A Brief History of Research Synthesis,” 25 Evaluation & the Health Professions 12 (2002); Mark Starr, Iain Chalmers, Mike Clarke, Andrew D. Oxman, “The origins, evolution, and future of The Cochrane Database of Systematic Reviews,” 25 Int J. Technol. Assess. Health Care s182 (2009); Mike Clarke, “History of evidence synthesis to assess treatment effects: personal reflections on something that is very much alive,” 109 J. Roy. Soc. Med. 154 (2016). See also Wen-Lin Lee, R. Barker Bausell & Brian M. Berman, “The growth of health-related meta-analyses published from 1980 to 2000,” 24 Eval. Health Prof. 327 (2001).

[10] Alessandro Liberati, Douglas G. Altman, Jennifer Tetzlaff, Cynthia Mulrow, Peter C. Gøtzsche, John P.A. Ioannidis, Mike Clarke, P.J. Devereaux, Jos Kleijnen, and David Moher, “The PRISMA Statement for Reporting Systematic Reviews and Meta-Analyses of Studies That Evaluate Health Care Interventions: Explanation and Elaboration,” 151 Ann Intern Med. W-65 (2009); “The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate health care interventions: explanation and elaboration,” 6 PLoS Med. e1000100 (2009).

[11] Alison Booth, Mike Clarke, Gordon Dooley, Davina Ghersi, David Moher, Mark Petticrew & Lesley Stewart, “The nuts and bolts of PROSPERO: an international prospective register of systematic reviews,” 1 Systematic Reviews 1 (2012); Alison Booth, Mike Clarke, Davina Ghersi, David Moher, Mark Petticrew, Lesley Stewart, “An international registry of systematic review protocols,” 377 Lancet 108 (2011).

[12] Matthew J. Page, Larissa Shamseer, and Andrea C. Tricco, “Registration of systematic reviews in PROSPERO: 30,000 records and counting,” 7 Systematic Reviews 32 (2018).

[13] John P. Ioannidis, “The Mass Production of Redundant, Misleading, and Conflicted Systematic Reviews and Meta-analyses,” 94 Milbank Q. 485 (2016).

Science for Judges – Reference Manual v4.0

November 6th, 2024

By the time the third edition of the Reference Manual on Scientific Evidence (RMSE) arrived in 2011, the work had evolved into a massive doorstop. The third edition generally got favorable, but unsearching, reviews. In some ways it was an impressive effort, but it left a lot to be desired in terms of comprehensiveness and consistency.[1] A decade passed, and the National Academies of Sciences, Engineering, and Medicine (NASEM), along with the Federal Judicial Center, opened work on a fourth edition, in early 2021.[2]

A look at the NASEM website shows that work on the fourth edition of the RMSE is now completed. There is, however, no announced publication date. The website’s description of the RMSE project suggests that the fourth edition will continue the practice of individual chapters with different authors. The topics to be covered are listed as:

Behavioral and Social Sciences, Biology and Life Sciences, Computers and Information Technology, Earth Sciences, Education, Engineering and Technology, Environment and Environmental Studies, Health and Medicine, Math, Chemistry, and Physics, Policy for Science and Technology, and Surveys and Statistics.

It seems unlikely that the chapters will actually track these topics. Previous editions had specific chapters on epidemiology, toxicology, regression, and clinical medicine, among others. The listing of topics strikes me as a higher level of generality than the actual chapter headings.

The following project description is provided:

“In collaboration with the Federal Judicial Center (FJC), a committee of the National Academies of Sciences, Engineering, and Medicine will develop the fourth edition of the Reference Manual on Scientific Evidence.  The Reference Manual is a primary reference source for federal judges on questions of science in litigation.  It does not instruct judges on how to rule regarding admissibility of particular types of evidence, but instead offers judges advice on how to manage expert testimony, discusses emerging problems with expert testimony, and provides information on the methodology of areas of science that often present difficult issues when introduced in the form of expert testimony.

The manual is a compilation of individually-authored chapters on various topics of science and technology relevant to the courts. The fourth edition will include updates of existing chapters as well as new chapters that reflect emerging areas. The committee will select the topics to be included in the manual, commission expert authors to revise the current chapters or draft new ones, approve the chapters, and submit the manual for external review.”

This description, at least as to previous editions, seems misleading. The first, second, and third editions contained very specific advice on specific issues. Indeed, it is unfathomable how a reference manual could avoid prescriptive judgments as to how scientific judgments should and should not be reached.

The Co-Chairs of the fourth edition are Hon. Nancy D. Freudenthal and Dr. Fred H. Gage. Members of the committee responsible for the new edition are:

Dr. Russ B. Altman (biomedical data, pharmacogenomics)

Hon. David G. Campbell (D. Ariz.)

Dr. Alicia L. Carriquiry (statistics, forensics)

Dr. Lynn R. Goldman (occupational and environmental health)

Dr. Brian W. Kernighan (engineering)

Dr. Pramod P. Khargonekar (engineering)

Hon. Goodwin Liu (California Supreme Court)

Dr. Shobita Parthasarathy (science, technology, and public policy)

Hon. Patti B. Saris (D. Mass.)

Hon. Thomas Schroeder (M.D.N.C.)

Hon. David S. Tatel (D.C. Circuit)

The Staff Officer for the project is Dr. Anne-Marie C. Mazza.

There is much that is needed in a new edition.  We will soon know whether the wait was worth it.[3]


[1] See, e.g., Adam Dutkiewicz, “Book Review: Reference Manual on Scientific Evidence, Third Edition,” 28 Thomas M. Cooley L. Rev. 343 (2011); John A. Budny, “Book Review: Reference Manual on Scientific Evidence, Third Edition,” 31 Internat’l J. Toxicol. 95 (2012); James F. Rogers, Jim Shelson, and Jessalyn H. Zeigler, “Changes in the Reference Manual on Scientific Evidence (Third Edition),” Internat’l Ass’n Def. Csl. Drug, Device & Biotech. Comm. Newsltr. (June 2012). See Schachtman “New Reference Manual’s Uneven Treatment of Conflicts of Interest,” Tortini (Oct. 12, 2011).

[2] Schachtman, “Reference Manual on Scientific Evidence v4.0,” Tortini (Feb. 28, 2021); Schachtman, “People Get Ready – There’s A Reference Manual A’Comin’,” Tortini (July 16, 2021); “Reference Manual on Scientific Evidence – 3rd Edition is Past Its Expiry,” Tortini (Oct. 17, 2021).

[3] I have written elsewhere of some of the issues that cry out for attention. Schachtman, “Reference Manual – Desiderata for the 4th Edition – Part I – Signature Diseases,” Tortini (Jan. 30, 2023); “Reference Manual – Desiderata for the 4th Edition – Part II – Epidemiology and Specific Causation,” Tortini (Jan. 31, 2023); “Reference Manual – Desiderata for the 4th Edition – Part III – Differential Diagnosis,” Tortini (Feb. 1, 2023); “Reference Manual – Desiderata for the 4th Edition – Part IV – Confidence Intervals,” Tortini (Feb. 10, 2023); “Reference Manual – Desiderata for the 4th Edition – Part V – Specific Tortogens,” Tortini (Feb. 14, 2023); “Reference Manual – Desiderata for the 4th Edition – Part VI – Rule 703,” Tortini (Feb. 17, 2023).

Paraquat Shape-Shifting Expert Witness Quashed

April 24th, 2024

Another multi-district litigation (MDL) has hit a jarring speed bump. Claims for Parkinson’s disease (PD), allegedly caused by exposure to paraquat dichloride (paraquat), were consolidated, in June 2021, for pre-trial coordination in MDL No. 3004, in the Southern District of Illinois, before Chief Judge Nancy J. Rosenstengel. Like many health-effects litigation claims, the plaintiffs’ claims in these paraquat cases turn on epidemiologic evidence. In the first MDL trial cases, plaintiffs’ counsel nominated a statistician, Martin T. Wells, to present their causation case. Last week, Judge Rosenstengel found Wells’ opinion so infected by invalid methodologies and inferences as to be inadmissible under the most recent version of Rule 702.[1] Summary judgment in the trial cases followed.[2]

Back in the 1980s, paraquat gained some legal notoriety in one of the most retrograde Rule 702 decisions.[3] Both the herbicide and Rule 702 survived, however, and they both remain in wide use. For the last two decades, there have been widespread challenges to the safety of paraquat, and in particular there have been claims that paraquat can cause PD or parkinsonism under some circumstances. Despite this background, the plaintiffs’ counsel in MDL 3004 began with four problems.

First, paraquat is closely regulated for agricultural use in the United States. Under federal law, paraquat can be used to control the growth of weeds only “by or under the direct supervision of a certified applicator.”[4] The regulatory record created an uphill battle for plaintiffs.[5] Under the Federal Insecticide, Fungicide, and Rodenticide Act (“FIFRA”), the U.S. EPA has regulatory and enforcement authority over the use, sale, and labeling of paraquat.[6] As part of its regulatory responsibilities, in 2019, the EPA systematically reviewed available evidence to assess whether there was an association between paraquat and PD. The agency’s review concluded that “there is limited, but insufficient epidemiologic evidence at this time to conclude that there is a clear associative or causal relationship between occupational paraquat exposure and PD.”[7] In 2021, the EPA issued its Interim Registration Review Decision, and reapproved the registration of paraquat. In doing so, the EPA concluded that “the weight of evidence was insufficient to link paraquat exposure from pesticidal use of U.S. registered products to Parkinson’s disease in humans.”[8]

Second, the EPA was not alone in its assessment. No published review, systematic or otherwise, had reached a conclusion that paraquat causes PD.[9]

Third, the plaintiffs’ claims faced another serious impediment. Their counsel placed their reliance upon Professor Martin Wells, a statistician on the faculty of Cornell University. Unfortunately for plaintiffs, Wells has been known to operate as a “cherry picker,” and his methodology has been previously reviewed in an unfavorable light. Another MDL court, which examined a review and meta-analysis propounded by Wells, found that his reports “were marred by a selective review of data and inconsistent application of inclusion criteria.”[10]

Fourth, the plaintiffs’ claims were before Chief Judge Nancy J. Rosenstengel, who was willing to do the hard work required under Rule 702, especially as it has been recently amended for clarification and emphasis of the gatekeeper’s responsibilities to evaluate validity issues in the proffered opinions of expert witnesses. As her 97-page decision evinces, Judge Rosenstengel conducted four days of hearings, which included viva voce testimony from Martin Wells, and she obviously read the underlying papers and reviews, as well as the briefs and the Reference Manual on Scientific Evidence, with great care. What followed did not go well for Wells or the plaintiffs’ claims.[11] Judge Rosenstengel has written an opinion that may be the first careful judicial consideration of the basic requirements of systematic review.

The court noted that systematic reviewers carefully define a research question and what kinds of empirical evidence will be reviewed, and then collect, summarize, and, if feasible, synthesize the available evidence into a conclusion.[12] The court emphasized that systematic reviewers should “develop a protocol for the review before commencement and adhere to the protocol regardless of the results of the review.”[13]

Wells proffered a meta-analysis, and a “weight of the evidence” (WOE) review from which he concluded that paraquat causes PD and nearly triples the risk of the disease among workers exposed to the herbicide.[14] In his reports, Wells identified a universe of at least 36 studies, but included seven in his meta-analysis. The defense had identified another two studies that were germane.[15]

Chief Judge Rosenstengel’s opinion is noteworthy for its fine attention to detail, detail that matters to the validity of the expert witness’s enterprise. Martin Wells set out to do a meta-analysis, which was all fine and good. With a universe of 36 studies, with sub-findings, alternative analyses, and changing definitions of relevant exposure, the devil lay in the details.

The MDL court was careful to point out that it was not gainsaying Wells’ decision to limit his meta-analysis to case-control studies, or his grading of any particular study as being of low quality. Systematic reviews and meta-analyses are generally accepted techniques that are part of a scientific approach to causal inference, but each has standards, predicates, and requirements for valid use. Expert witnesses must not only use a reliable methodology; Rule 702(d) requires that they also reliably apply their chosen methodology to the facts at hand in reaching their conclusions.[16]

The MDL court concluded that Wells’ meta-analysis was not sufficiently reliable under Rule 702 because he failed faithfully and reliably to apply his own articulated methodology. The court followed Wells’ lead in identifying the source and content of his chosen methodology, and simply examined his proffered opinion for compliance with that methodology.[17] The basic principles of validity for conducting meta-analyses were not, in any event, really contested. These principles and requirements were clearly designed to ensure and enhance the reliability of meta-analyses by pre-empting results-driven, reverse-engineered summary estimates of association.

The court found that Wells failed clearly to pre-specify his eligibility criteria. He then proceeded to redefine exposure criteria, study inclusion or eligibility criteria, and study quality criteria, after looking at the evidence. He also inconsistently applied his stated criteria, all in an apparent effort to exclude less favorable study outcomes. These ad hoc steps were some of Wells’ deviations from the standards to which he paid lip service.

The court did not exclude Wells because it disagreed with his substantive decisions to include or exclude any particular study, or his quality grading of any study. Rather, Dr. Wells’ meta-analysis does not pass muster under Rule 702 because its methodology was unclear, inconsistently applied, not replicable, and at times transparently reverse-engineered.[18]

The court’s evaluation of Wells was unflinchingly critical. Wells’ proffered opinions “required several methodological contortions and outright violations of the scientific standards he professed to apply.”[19] From his first involvement in this litigation, Wells had violated the basic rules of conducting systematic reviews and meta-analyses.[20] His definition of “occupational” exposure meandered to suit his desire to include one study (with low variance) that might otherwise have been excluded.[21] Rather than pre-specifying his review process, his study inclusion criteria, and his quality scores, Wells engaged in an unwritten “holistic” review process, which he conceded was not objectively replicable. Wells’ approach left him free to include studies he wanted in his meta-analysis, and then provide post hoc justifications.[22] His failure to identify his inclusion/exclusion criteria was a “methodological red flag” in Dr. Wells’ meta-analysis, which suggested his reverse engineering of the whole analysis, the “very antithesis of a systematic review.”[23]

In what the court described as “methodological shapeshifting,” Wells blatantly and inconsistently graded studies he wanted to include, and had already decided to include in his meta-analysis, to be of higher quality.[24] The paraquat MDL court found, unequivocally, that Wells had “failed to apply the same level of intellectual rigor to his work in the four trial selection cases that would be required of him and his peers in a non-litigation setting.”[25]

It was also not lost upon the MDL court that Wells had shifted from a fixed effect to a random effects meta-analysis, between his principal and rebuttal reports.[26] Basic to the meta-analytical enterprise is a predicate systematic review, properly done, with pre-specification of inclusion and exclusion criteria for what studies would go into any meta-analysis. The MDL court noted that both sides had cited Borenstein’s textbook on meta-analysis,[27] and that Wells had himself cited the Cochrane Handbook[28] for the basic proposition that objective and scientifically valid study selection criteria should be clearly stated in advance to ensure the objectivity of the analysis.
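For readers unfamiliar with the difference between the two models, the following is a minimal sketch, in Python, of inverse-variance pooling under a fixed effect model and under a random effects model. It assumes the DerSimonian-Laird estimator for between-study variance, which is one common choice and not necessarily what Wells used. The function name pool and all of the input numbers are hypothetical; none is taken from the paraquat studies or from the court's opinion.

```python
import math

def pool(log_ors, ses, model="fixed"):
    """Inverse-variance pooling of log odds ratios.

    model="fixed"  : each study is weighted by 1 / se^2
    model="random" : DerSimonian-Laird; weights are 1 / (se^2 + tau^2)
    Returns (pooled OR, lower 95% limit, upper 95% limit, tau^2).
    """
    w = [1.0 / se ** 2 for se in ses]
    fixed_mean = sum(wi * y for wi, y in zip(w, log_ors)) / sum(w)
    tau2 = 0.0
    if model == "random":
        # Cochran's Q: weighted scatter of studies around the fixed-effect mean
        q = sum(wi * (y - fixed_mean) ** 2 for wi, y in zip(w, log_ors))
        df = len(log_ors) - 1
        c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
        tau2 = max(0.0, (q - df) / c)  # between-study variance, floored at zero
        w = [1.0 / (se ** 2 + tau2) for se in ses]
    mean = sum(wi * y for wi, y in zip(w, log_ors)) / sum(w)
    se_mean = math.sqrt(1.0 / sum(w))
    return (math.exp(mean),
            math.exp(mean - 1.96 * se_mean),
            math.exp(mean + 1.96 * se_mean),
            tau2)

# Hypothetical inputs for illustration only
log_ors = [math.log(3.0), math.log(1.1), math.log(0.9), math.log(1.4)]
ses = [0.25, 0.30, 0.35, 0.40]

fixed_result = pool(log_ors, ses, "fixed")
random_result = pool(log_ors, ses, "random")
print("fixed :", fixed_result)
print("random:", random_result)
```

With heterogeneous inputs such as these, the random effects model spreads weight more evenly across studies and yields a wider interval than the fixed effect model. The choice of model can therefore change both the summary estimate and its apparent precision, which is one reason to settle the choice before seeing the results.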

There was of course legal authority for this basic proposition about prespecification. Given that the selection of studies that go into a systematic review and meta-analysis can be dispositive of its conclusion, undue subjectivity or ad hoc inclusion can easily arrange a desired outcome.[29] Furthermore, meta-analysis carries with it the opportunity to mislead a lay jury with a single (and inflated) risk ratio,[30] which is obtained by the operator’s manipulation of inclusion and exclusion criteria. This opportunity required the MDL court to examine the methodological rigor of the proffered meta-analysis carefully to evaluate whether it reflects a valid pooling of data or it was concocted to win a case.[31]

Martin Wells had previously acknowledged the dangers of manipulation and subjective selectivity inherent in systematic reviews and meta-analyses. The MDL court quoted from Wells’ testimony in Martin v. Actavis:

QUESTION: You would certainly agree that the inclusion-exclusion criteria should be based upon objective criteria and not simply because you were trying to get to a particular result?

WELLS: No, you shouldn’t load the – sort of cook the books.

QUESTION: You should have prespecified objective criteria in advance, correct?

WELLS: Yes.[32]

The MDL court also picked up on a subtle but important methodological point about which odds ratio to use in a meta-analysis when a study provides multiple analyses of the same association. In his first paraquat deposition, Wells cited the Cochrane Handbook, for the proposition that if a crude risk ratio and a risk ratio from a multivariate analysis are both presented in a given study, then the adjusted risk ratio (and its corresponding measure of standard error seen in its confidence interval) is generally preferable to reduce the play of confounding.[33] Wells violated this basic principle by ignoring the multivariate analysis in the study that dominated his meta-analysis (Liou) in favor of the unadjusted bivariate analysis. Given that Wells accepted this basic principle, the MDL court found that Wells likely selected the minimally adjusted odds ratio over the multivariate adjusted odds ratio for inclusion in his meta-analysis in order to have the smaller variance (and thus greater weight) from the former. This maneuver was disqualifying under Rule 702.[34]
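The link between a smaller variance and a greater weight can be made concrete with a short sketch in Python. It assumes standard inverse-variance weighting, with the standard error of the log odds ratio recovered from a reported 95% confidence interval. The helper names se_from_ci and weight_share, and every number below, are hypothetical illustrations; they are not the figures from Liou or from any other study before the court.

```python
import math

def se_from_ci(lower, upper, z=1.96):
    """Standard error of a log odds ratio, recovered from its 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2 * z)

def weight_share(ses):
    """Each study's share of the total inverse-variance weight."""
    w = [1.0 / se ** 2 for se in ses]
    total = sum(w)
    return [wi / total for wi in w]

# One hypothetical study that reports two analyses of the same association
crude_se = se_from_ci(2.0, 6.0)       # crude analysis, narrower interval
adjusted_se = se_from_ci(1.2, 12.0)   # multivariate analysis, wider interval

# Hypothetical standard errors for the remaining studies in the pool
other_ses = [0.45, 0.50, 0.60]

share_with_crude = weight_share([crude_se] + other_ses)[0]
share_with_adjusted = weight_share([adjusted_se] + other_ses)[0]
print(f"weight share using crude estimate:    {share_with_crude:.2f}")
print(f"weight share using adjusted estimate: {share_with_adjusted:.2f}")
```

Because the weight is the reciprocal of the squared standard error, halving a study's standard error roughly quadruples its weight. Swapping in the estimate with the narrower interval can thus let a single study dominate the summary estimate.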

All in all, the paraquat MDL court’s Rule 702 ruling was a convincing demonstration that non-expert generalist judges, with assistance from subject-matter experts, treatises, and legal counsel, can evaluate and identify deviations from methodological standards of care.


[1] In re Paraquat Prods. Liab. Litig., Case No. 3:21-md-3004-NJR, MDL No. 3004, Slip op., ___ F. Supp. 3d ___ (S.D. Ill. Apr. 17, 2024) [Slip op.]

[2] In re Paraquat Prods. Liab. Litig., Op. sur motion for judgment, Case No. 3:21-md-3004-NJR, MDL No. 3004 (S.D. Ill. Apr. 17, 2024). See also Brendan Pierson, “Judge rejects key expert in paraquat lawsuits, tosses first cases set for trial,” Reuters (Apr. 17, 2024); Hailey Konnath, “Trial-Ready Paraquat MDL Cases Tossed After Testimony Axed,” Law360 (Apr. 18, 2024).

[3] Ferebee v. Chevron Chem. Co., 552 F. Supp. 1297 (D.D.C. 1982), aff’d, 736 F.2d 1529 (D.C. Cir.), cert. denied, 469 U.S. 1062 (1984). See “Ferebee Revisited,” Tortini (Dec. 28, 2017).

[4] See 40 C.F.R. § 152.175.

[5] Slip op. at 31.

[6] 7 U.S.C. § 136w; 7 U.S.C. § 136a(a); 40 C.F.R. § 152.175. The agency must periodically review the registration of the herbicide. 7 U.S.C. § 136a(g)(1)(A). See Ruckelshaus v. Monsanto Co., 467 U.S. 986, 991-92 (1984).

[7] See Austin Wray & Aaron Niman, Memorandum, Paraquat Dichloride: Systematic review of the literature to evaluate the relationship between paraquat dichloride exposure and Parkinson’s disease at 35 (June 26, 2019).

[8] See also Jeffrey Brent and Tammi Schaeffer, “Systematic Review of Parkinsonian Syndromes in Short- and Long-Term Survivors of Paraquat Poisoning,” 53 J. Occup. & Envt’l Med. 1332 (2011) (“An analysis the world’s entire published experience found no connection between high-dose paraquat exposure in humans and the development of parkinsonism.”).

[9] Douglas L. Weed, “Does paraquat cause Parkinson’s disease? A review of reviews,” 86 Neurotoxicology 180, 180 (2021).

[10] In re Incretin-Based Therapies Prods. Liab. Litig., 524 F. Supp. 3d 1007, 1038, 1043 (S.D. Cal. 2021), aff’d, No. 21-55342, 2022 WL 898595 (9th Cir. Mar. 28, 2022) (per curiam). See “Madigan’s Shenanigans and Wells Quelled in Incretin-Mimetic Cases,” Tortini (July 15, 2022).

[11] The MDL court obviously worked hard to learn the basic principles of epidemiology. The court relied extensively upon the epidemiology chapter in the Reference Manual on Scientific Evidence. Much of that material is very helpful, but its exposition on statistical concepts is at times confused and erroneous. It is unfortunate that courts do not pay more attention to the more precise and accurate exposition in the chapter on statistics. Citing the epidemiology chapter, the MDL court gave an incorrect interpretation of the p-value: “A statistically significant result is one that is unlikely the product of chance.” Slip op. at 17 n.11. And then again, citing the Reference Manual, the court declared that “[a] p-value of .1 means that there is a 10% chance that values at least as large as the observed result could have been the product of random error.” Id. Similarly, the MDL court gave an incorrect interpretation of the confidence interval. In a footnote, the court tells us that “[r]esearchers ordinarily assert a 95% confidence interval, meaning that ‘there is a 95% chance that the “true” odds ratio value falls within the confidence interval range’. In re Zoloft (Sertraline Hydrochloride) Prod. Liab. Litig., MDL No. 2342, 2015 WL 7776911, at *2 (E.D. Pa. Dec. 2, 2015).” Slip op. at 17 n.12. Citing another court for the definition of a statistical concept is a risky business.

[12] Slip op. at 20, citing Lisa A. Bero, “Evaluating Systematic Reviews and Meta-Analyses,” 14 J.L. & Pol’y 569, 570 (2006).

[13] Slip op. at 21, quoting Bero, at 575.

[14] Slip op. at 3.

[15] The nine studies at issue were as follows: (1) H.H. Liou, et al., “Environmental risk factors and Parkinson’s disease; A case-control study in Taiwan,” 48 Neurology 1583 (1997); (2) Caroline M. Tanner, et al., “Rotenone, Paraquat and Parkinson’s Disease,” 119 Envt’l Health Persps. 866 (2011) (a nested case-control study within the Agricultural Health Study (“AHS”)); (3) Clyde Hertzman, et al., “A Case-Control Study of Parkinson’s Disease in a Horticultural Region of British Columbia,” 9 Movement Disorders 69 (1994); (4) Anne-Maria Kuopio, et al., “Environmental Risk Factors in Parkinson’s Disease,” 14 Movement Disorders 928 (1999); (5) Katherine Rugbjerg, et al., “Pesticide exposure and risk of Parkinson’s disease – a population-based case-control study evaluating the potential for recall bias,” 37 Scandinavian J. of Work, Env’t & Health 427 (2011); (6) Jordan A. Firestone, et al., “Occupational Factors and Risk of Parkinson’s Disease: A Population-Based Case-Control Study,” 53 Am. J. of Indus. Med. 217 (2010); (7) Amanpreet S. Dhillon, “Pesticide / Environmental Exposures and Parkinson’s Disease in East Texas,” 13 J. of Agromedicine 37 (2008); (8) Marianne van der Mark, et al., “Occupational exposure to pesticides and endotoxin and Parkinson’s disease in the Netherlands,” 71 J. Occup. & Envt’l Med. 757 (2014); (9) Srishti Shrestha, et al., “Pesticide use and incident Parkinson’s disease in a cohort of farmers and their spouses,” Envt’l Research 191 (2020).

[16] Slip op. at 75.

[17] Slip op. at 73.

[18] Slip op. at 75, citing In re Mirena IUS Levonorgestrel-Related Prod. Liab. Litig. (No. II), 341 F. Supp. 3d 213, 241 (S.D.N.Y. 2018) (“Opinions that assume a conclusion and reverse-engineer a theory to fit that conclusion are . . . inadmissible.”) (internal citation omitted), aff’d, 982 F.3d 113 (2d Cir. 2020); In re Zoloft (Sertraline Hydrochloride) Prod. Liab. Litig., No. 12-md-2342, 2015 WL 7776911, at *16 (E.D. Pa. Dec. 2, 2015) (excluding expert’s opinion where he “failed to consistently apply the scientific methods he articulat[ed], . . . deviated from or downplayed certain well established principles of his field, and . . . inconsistently applied methods and standards to the data so as to support his a priori opinion.”), aff’d, 858 F.3d 787 (3d Cir. 2017).

[19] Slip op. at 35.

[20] Slip op. at 58.

[21] Slip op. at 55.

[22] Slip op. at 41, 64.

[23] Slip op. at 59-60, citing In re Lipitor (Atorvastatin Calcium) Mktg., Sales Pracs. & Prod. Liab. Litig., 892 F.3d 624, 634 (4th Cir. 2018) (“Result-driven analysis, or cherry-picking, undermines principles of the scientific method and is a quintessential example of applying methodologies (valid or otherwise) in an unreliable fashion.”).

[24] Slip op. at 67, 69-70, citing In re Zoloft (Sertraline Hydrochloride) Prod. Liab. Litig., 858 F.3d 787, 795-97 (3d Cir. 2017) (“[I]f an expert applies certain techniques to a subset of the body of evidence and other techniques to another subset without explanation, this raises an inference of unreliable application of methodology.”); In re Bextra and Celebrex Mktg. Sales Pracs. & Prod. Liab. Litig., 524 F. Supp. 2d 1166, 1179 (N.D. Cal. 2007) (excluding an expert witness’s causation opinion because of his result-oriented, inconsistent evaluation of data sources).

[25] Slip op. at 40.

[26] Slip op. at 61 n.44.

[27] Michael Borenstein, Larry V. Hedges, Julian P. T. Higgins, and Hannah R. Rothstein, Introduction to Meta-Analysis (2d ed. 2021).

[28] Jacqueline Chandler, James Thomas, Julian P. T. Higgins, Matthew J. Page, Miranda Cumpston, Tianjing Li, Vivian A. Welch, eds., Cochrane Handbook for Systematic Reviews of Interventions (2d ed. 2023).

[29] Slip op. at 56, citing In re Zimmer Nexgen Knee Implant Prod. Liab. Litig., No. 11 C 5468, 2015 WL 5050214, at *10 (N.D. Ill. Aug. 25, 2015).

[30] Slip op. at 22. The court noted that the Reference Manual on Scientific Evidence cautions that “[p]eople often tend to have an inordinate belief in the validity of the findings when a single number is attached to them, and many of the difficulties that may arise in conducting a meta-analysis, especially of observational studies such as epidemiological ones, may consequently be overlooked.” Id., quoting from Manual, at 608.

[31] Slip op. at 57, citing Deutsch v. Novartis Pharms. Corp., 768 F. Supp. 2d 420, 457-58 (E.D.N.Y. 2011) (“[T]here is a strong risk of prejudice if a Court permits testimony based on an unreliable meta-analysis because of the propensity for juries to latch on to the single number.”).

[32] Slip op. at 64, quoting from Notes of Testimony of Martin Wells, in In re Testosterone Replacement Therapy Prod. Liab. Litig., Nos. 1:14-cv-1748, 15-cv-4292, 15-cv-426, 2018 WL 7350886 (N.D. Ill. Apr. 2, 2018).

[33] Slip op. at 70.

[34] Slip op. at 71-72, citing People Who Care v. Rockford Bd. of Educ., 111 F.3d 528, 537-38 (7th Cir. 1997) (“[A] statistical study that fails to correct for salient explanatory variables . . . has no value as causal explanation and is therefore inadmissible in federal court.”); In re Roundup Prod. Liab. Litig., 390 F. Supp. 3d 1102, 1140 (N.D. Cal. 2018). Slip op. at 17 n. 12.

Excluding Epidemiologic Evidence under Federal Rule of Evidence 702

August 26th, 2023

We are 30-plus years into the “Daubert” era, in which federal district courts are charged with gatekeeping the relevance and reliability of scientific evidence. Not surprisingly, given the lawsuit industry’s propensity on occasion to use dodgy science, the burden of awakening the gatekeepers from their dogmatic slumber often falls upon defense counsel in civil litigation. It therefore behooves defense counsel to speak carefully and accurately about the grounds for Rule 702 exclusion of expert witness opinion testimony.

In the context of medical causation opinions based upon epidemiologic evidence, the first obvious point is that whichever party is arguing for exclusion should distinguish between excluding an expert witness’s opinion and prohibiting an expert witness from relying upon a particular study.  Rule 702 addresses the exclusion of opinions, whereas Rule 703 addresses barring an expert witness from relying upon hearsay facts or data unless they are reasonably relied upon by experts in the appropriate field. It would be helpful for lawyers and legal academics to refrain from talking about “excluding epidemiological evidence under FRE 702.”[1] Epidemiologic studies are rarely admissible themselves, but come into the courtroom as facts and data relied upon by expert witnesses. Rule 702 is addressed to the admissibility vel non of opinion testimony, some of which may rely upon epidemiologic evidence.

Another common lawyer mistake is the over-generalization that epidemiologic research provides the “gold standard” of general causation evidence.[2] Although epidemiology is often required, it is not “the medical science devoted to determining the cause of disease in human beings.”[3] To be sure, epidemiologic evidence will usually be required because there is no genetic or mechanistic evidence that will support the claimed causal inference, but counsel should be cautious in stating the requirement. Glib statements by courts that epidemiology is not always required are often simply an evasion of their responsibility to evaluate the validity of the proffered expert witness opinions. A more careful phrasing of the role of epidemiology will make such glib statements more readily open to rebuttal. In the absence of direct biochemical, physiological, or genetic mechanisms that can be identified as involved in bringing about the plaintiffs’ harm, epidemiologic evidence will be required, and it may well be the “gold standard” in such cases.[4]

When epidemiologic evidence is required, counsel will usually be justified in adverting to the “hierarchy of epidemiologic evidence.” Associations are shown in studies of various designs with vastly differing degrees of validity; and of course, associations are not necessarily causal. There are thus important nuances in educating the gatekeeper about this hierarchy. First, it will often be important to educate the gatekeeper about the distinction between descriptive and analytic studies, and the inability of descriptive studies such as case reports to support causal inferences.[5]

There is then the matter of confusion within the judiciary and among “scholars” about whether a hierarchy even exists. The chapter on epidemiology in the Reference Manual on Scientific Evidence appears to suggest the specious position that there is no hierarchy.[6] The chapter on medical testimony, however, takes a different approach in identifying a normative hierarchy of evidence to be considered in evaluating causal claims.[7] The medical testimony chapter specifies that meta-analyses of randomized controlled trials sit atop the hierarchy. Yet, there are divergent opinions about what should be at the top of the hierarchical evidence pyramid. Indeed, the rigorous, large randomized trial will often replace a meta-analysis of smaller trials as the more definitive evidence.[8] Back in 2007, a dubious meta-analysis of over 40 clinical trials led to a litigation frenzy over rosiglitazone.[9] A mega-trial of rosiglitazone showed that the 2007 meta-analysis was wrong.[10]

In any event, courts must purge their beliefs that once there is “some” evidence in support of a claim, their gatekeeping role is over. Randomized controlled trials really do trump observational studies, which virtually always have actual or potential confounding in their final analyses.[11] While disclaimers about the unavailability of randomized trials for putative toxic exposures are helpful, it is not quite accurate to say that it is “unethical to intentionally expose people to a potentially harmful dose of a suspected toxin.”[12] Such trials are done all the time when there is an expected therapeutic benefit that creates at least equipoise between the overall benefit and harm at the outset of the trial.[13]

At this late date, it seems shameful that courts must be reminded that evidence of associations does not suffice to show causation, but prudence dictates giving the reminder.[14] Defense counsel will generally exhibit a Pavlovian reflex to state that causality based upon epidemiology must be viewed through a lens of “Bradford Hill criteria.”[15] Rhetorically, this reflex seems wrong given that Sir Austin himself noted that his nine different considerations were “viewpoints,” not criteria. Taking a position that requires an immediate retreat seems misguided. Similarly, urging courts to invoke and apply the Bradford Hill considerations must be accompanied by the caveat that courts must first apply Bradford Hill’s predicate[16] for the nine considerations:

“Disregarding then any such problem in semantics we have this situation. Our observations reveal an association between two variables, perfectly clear-cut and beyond what we would care to attribute to the play of chance. What aspects of that association should we especially consider before deciding that the most likely interpretation of it is causation?”[17]

Courts should be mindful that the language from the famous, often-cited paper was part of an after-dinner address, in which Sir Austin was speaking informally. Scientists will understand that he was setting out a predicate that calls for

(1) an association, which is

(2) “perfectly clear cut,” such that bias and confounding are excluded, and

(3) “beyond what we would care to attribute to the play of chance,” with random error kept to an acceptable level, before advancing to further consideration of the nine viewpoints commonly recited.

These predicate findings are the basis for advancing to investigate Bradford Hill’s nine viewpoints; the viewpoints do not replace or supersede the predicates.[18]

Within the nine viewpoints, not all are of equal importance. Consistency among studies, a particularly important consideration, implies that isolated findings in a single observational study will rarely suffice to support causal conclusions. Another important consideration, the strength of the association, has nothing to do with “statistical significance,” which is a predicate consideration, but reminds us that large risk ratios or risk differences provide some evidence that the association does not result from unmeasured confounding. Eliminating confounding, however, is one of the predicate requirements for applying the nine factors. As with any methodology, the Bradford Hill factors are not self-executing. The annals of litigation provide all-too-many examples of undue selectivity, “cherry picking,” and other deviations from the scientist’s standard of care.

Certainly lawyers must steel themselves against recommending the “carcinogen” hazard identifications advanced by the International Agency for Research on Cancer (IARC). There are several problematic aspects to the methods of IARC, not the least of which is IARC’s fanciful use of the word “probable.” According to the IARC Preamble, “probable” has no quantitative meaning.[19] In common legal parlance, “probable” typically conveys a conclusion that is more likely than not. Another problem arises from the IARC’s labeling of “probable human carcinogens” made in some cases without any real evidence of carcinogenesis in humans. Regulatory pronouncements are even more diluted and often involve little more than precautionary principle wishcasting.[20]


[1] Christian W. Castile & Stephen J. McConnell, “Excluding Epidemiological Evidence Under FRE 702,” For The Defense 18 (June 2023) [Castile]. Although these authors provide an interesting overview of the subject, they fall into some common errors, such as failing to address Rule 703. The article is worth reading for its marshaling of recent case law on the subject, but I detail some of its errors here in the hope that lawyers will speak more precisely about the concepts involved in challenging medical causation opinions.

[2] Id. at 18. In re Zantac (Ranitidine) Prods. Liab. Litig., No. 2924, 2022 U.S. Dist. LEXIS 220327, at *401 (S.D. Fla. Dec. 6, 2022); see also Horwin v. Am. Home Prods., No. CV 00-04523 WJR (Ex), 2003 U.S. Dist. LEXIS 28039, at *14-15 (C.D. Cal. May 9, 2003) (“epidemiological studies provide the primary generally accepted methodology for demonstrating a causal relation between a chemical compound and a set of symptoms or disease” *** “The lack of epidemiological studies supporting Plaintiffs’ claims creates a high bar to surmount with respect to the reliability requirement, but it is not automatically fatal to their case.”).

[3] See, e.g., Siharath v. Sandoz Pharm. Corp., 131 F. Supp. 2d 1347, 1356 (N.D. Ga. 2001) (“epidemiology is the medical science devoted to determining the cause of disease in human beings”).

[4] See, e.g., Lopez v. Wyeth-Ayerst Labs., No. C 94-4054 CW, 1996 U.S. Dist. LEXIS 22739, at *1 (N.D. Cal. Dec. 13, 1996) (“Epidemiological evidence is one of the most valuable pieces of scientific evidence of causation”); Horwin v. Am. Home Prods., No. CV 00-04523 WJR (Ex), 2003 U.S. Dist. LEXIS 28039, at *15 (C.D. Cal. May 9, 2003) (“The lack of epidemiological studies supporting Plaintiffs’ claims creates a high bar to surmount with respect to the reliability requirement, but it is not automatically fatal to their case”).

[5] David A. Grimes & Kenneth F. Schulz, “Descriptive Studies: What They Can and Cannot Do,” 359 Lancet 145 (2002) (“…epidemiologists and clinicians generally use descriptive reports to search for clues of cause of disease – i.e., generation of hypotheses. In this role, descriptive studies are often a springboard into more rigorous studies with comparison groups. Common pitfalls of descriptive reports include an absence of a clear, specific, and reproducible case definition, and interpretations that overstep the data. Studies without a comparison group do not allow conclusions about cause of disease.”).

[6] Michael D. Green, D. Michal Freedman, and Leon Gordis, “Reference Guide on Epidemiology,” Reference Manual on Scientific Evidence 549, 564 n.48 (citing a paid advertisement by a group of scientists, and misleadingly referring to the publication as a National Cancer Institute symposium) (citing Michele Carbone et al., “Modern Criteria to Establish Human Cancer Etiology,” 64 Cancer Res. 5518, 5522 (2004) (National Cancer Institute symposium [sic] concluding that “[t]here should be no hierarchy [among different types of scientific methods to determine cancer causation]. Epidemiology, animal, tissue culture and molecular pathology should be seen as integrating evidences in the determination of human carcinogenicity.”)).

[7] John B. Wong, Lawrence O. Gostin & Oscar A. Cabrera, “Reference Guide on Medical Testimony,” in Reference Manual on Scientific Evidence 687, 723 (3d ed. 2011).

[8] See, e.g., J.M. Elwood, Critical Appraisal of Epidemiological Studies and Clinical Trials 342 (3d ed. 2007).

[9] See Steven E. Nissen & Kathy Wolski, “Effect of Rosiglitazone on the Risk of Myocardial Infarction and Death from Cardiovascular Causes,” 356 New Engl. J. Med. 2457 (2007). See also “Learning to Embrace Flawed Evidence – The Avandia MDL’s Daubert Opinion” (Jan. 10, 2011).

[10] Philip D. Home, et al., “Rosiglitazone evaluated for cardiovascular outcomes in oral agent combination therapy for type 2 diabetes (RECORD): a multicentre, randomised, open-label trial,” 373 Lancet 2125 (2009).

[11] In re Zantac (Ranitidine) Prods. Liab. Litig., No. 2924, 2022 U.S. Dist. LEXIS 220327, at *402 (S.D. Fla. Dec. 6, 2022) (“Unlike experimental studies in which subjects are randomly assigned to exposed and placebo groups, observational studies are subject to bias due to the possibility of differences between study populations.”).

[12] Castile at 20.

[13] See, e.g., Benjamin Freedman, “Equipoise and the ethics of clinical research,” 317 New Engl. J. Med. 141 (1987).

[14] See, e.g., In re Onglyza (Saxagliptin) & Kombiglyze XR (Saxagliptin & Metformin) Prods. Liab. Litig., No. 5:18-md-2809-KKC, 2022 U.S. Dist. LEXIS 136955, at *127 (E.D. Ky. Aug. 2, 2022); Burleson v. Texas Dep’t of Criminal Justice, 393 F.3d 577, 585-86 (5th Cir. 2004) (affirming exclusion of expert causation testimony based solely upon studies showing a mere correlation between defendant’s product and plaintiff’s injury); Beyer v. Anchor Insulation Co., 238 F. Supp. 3d 270, 280-81 (D. Conn. 2017); Ambrosini v. Labarraque, 101 F.3d 129, 136 (D.C. Cir. 1996).

[15] Castile at 21. See In re Zoloft (Sertraline Hydrochloride) Prods. Liab. Litig., 26 F. Supp. 3d 449, 454-55 (E.D. Pa. 2014).

[16] “Bradford Hill on Statistical Methods” (Sept. 24, 2013); see also Frank C. Woodside, III & Allison G. Davis, “The Bradford Hill Criteria: The Forgotten Predicate,” 35 Thomas Jefferson L. Rev. 103 (2013).

[17] Austin Bradford Hill, “The Environment and Disease: Association or Causation?” 58 Proc. Royal Soc’y Med. 295, 295 (1965).

[18] Castile at 21. See, e.g., In re Onglyza (Saxagliptin) & Kombiglyze XR (Saxagliptin & Metformin) Prods. Liab. Litig., No. 5:18-md-2809-KKC, 2022 U.S. Dist. LEXIS 1821, at *43 (E.D. Ky. Jan. 5, 2022) (“The analysis is meant to apply when observations reveal an association between two variables. It addresses the aspects of that association that researchers should analyze before deciding that the most likely interpretation of [the association] is causation”); Hoefling v. U.S. Smokeless Tobacco Co., LLC, 576 F. Supp. 3d 262, 273 n.4 (E.D. Pa. 2021) (“Nor would it have been appropriate to apply them here: scientists are to do so only after an epidemiological association is demonstrated”).

[19] IARC Monographs on the Identification of Carcinogenic Hazards to Humans – Preamble 31 (2019) (“The terms probably carcinogenic and possibly carcinogenic have no quantitative significance and are used as descriptors of different strengths of evidence of carcinogenicity in humans.”).

[20] “Improper Reliance upon Regulatory Risk Assessments in Civil Litigation” (Mar. 19, 2023).

Judicial Flotsam & Jetsam – Retractions

June 12th, 2023

In scientific publishing, when scientists make a mistake, they publish an erratum or a corrigendum. If the mistake vitiates the study, then the erring scientists retract their article. To be sure, sometimes the retraction comes after an obscene delay, with the authors kicking and screaming.[1] Sometimes the retraction comes at the request of the authors, better late than never.[2]

Retractions in the biomedical journals, whether voluntary or not, are on the rise.[3] The process and procedures for retraction of articles often lack transparency. Many articles are retracted without explanation or disclosure of specific problems about the data or the analysis.[4] Sadly, however, misconduct in the form of plagiarism and data falsification is a frequent reason for retractions.[5] The lack of transparency for retractions, and sloppy scholarship, combine to create Zombie papers, which are retracted but continue to be cited in subsequent publications.[6]

LEGAL RETRACTIONS

The law treats errors very differently. Being a judge usually means that you never have to say you are sorry. Judge Andrew Hurwitz has argued that our legal system would be better served if judges could and did “freely acknowledge[] and transparently correct[] the occasional ‘goof’.”[7] Alas, as Judge Hurwitz notes, very few published decisions acknowledge mistakes.[8]

In the world of scientific jurisprudence, the judicial reticence to acknowledge mistakes is particularly dangerous, and it leads directly to the proliferation of citations to cases that make egregious mistakes. In the niche area of judicial assessment of scientific and statistical evidence, the proliferation of erroneous statements is especially harmful because it interferes with thinking clearly about the issues before courts. Judges believe that they have argued persuasively for a result, not by correctly marshaling statistical and scientific concepts, but by relying upon precedents erroneously arrived at by other judges in earlier cases. Regardless of how many cases are cited (and there are many possible “precedents”), the true parameter does not have a 95% probability of lying within any given 95% confidence interval.[9] Similarly, as much as judges would like p-values and confidence intervals to eliminate the need to worry about systematic error, their saying so cannot make it so.[10] Even a mighty federal judge cannot make the p-value probability, or its complement, substitute for the posterior probability of a causal claim.[11]
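The point about confidence intervals can be made concrete with a small simulation. What follows is only a sketch under stated assumptions: the true mean, standard deviation, sample size, number of trials, and the function name are all hypothetical choices of mine, and the interval is the textbook normal interval with a known standard deviation. The 95% describes how often the interval-generating procedure captures the true value over many repeated samples; it is not a probability statement about any one interval already computed, which either contains the true value or does not.

```python
import random
import statistics


def coverage_of_95_percent_intervals(true_mean=10.0, sigma=2.0, n=50,
                                     trials=2000, seed=1):
    """Return the share of repeated-sample 95% intervals that contain true_mean.

    All parameter values are illustrative assumptions, not data from any case.
    """
    rng = random.Random(seed)
    z = 1.96  # two-sided 95% critical value for a normal interval, sigma known
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        # Draw a fresh sample and build its interval around the sample mean.
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        sample_mean = statistics.fmean(sample)
        # Any single realized interval either covers true_mean or it does not.
        if sample_mean - half_width <= true_mean <= sample_mean + half_width:
            hits += 1
    return hits / trials


coverage = coverage_of_95_percent_intervals()
print(f"share of intervals covering the true mean: {coverage:.3f}")
```

The share printed should fall close to 0.95, which is a property of the procedure across repetitions. In litigation, of course, there is typically one study and one interval, and the simulation says nothing about bias or confounding, which no interval can cure.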

Some cases in the books are so egregiously decided that it is truly remarkable that they would be cited for any proposition. I call these scientific Dred Scott cases, which illustrate that sometimes science has no criteria of validity that the law is bound to respect. One such Dred Scott case was the result of a bench trial in a federal district court in Atlanta, in Wells v. Ortho Pharmaceutical Corporation.[12]

Wells was notorious for its poor assessment of all the determinants of scientific causation.[13] The decision was met with a storm of opprobrium from the legal and medical communities.[14] No scientists or legal scholars offered a serious defense of Wells on the scientific merits. Even the notorious plaintiffs’ expert witness, Carl Cranor, could muster only a distanced agnosticism:

“In Wells v. Ortho Pharmaceutical Corp., which involved a claim that birth defects were caused by a spermicidal jelly, the U.S. Court of Appeals for the 11th Circuit followed the principles of Ferebee and affirmed a plaintiff’s verdict for about five million dollars. However, some members of the medical community chastised the legal system essentially for ignoring a well-established scientific consensus that spermicides are not teratogenic. We are not in a position to judge this particular issue, but the possibility of such results exists.”[15]

Cranor apparently could not bring himself to note that it was not just scientific consensus that was ignored; the Wells case ignored the basic scientific process of examining relevant studies for both internal and external validity.

Notwithstanding this scholarly consensus and condemnation, we have witnessed the repeated recrudescence of the Wells decision. In Matrixx Initiatives, Inc. v. Siracusano,[16] in 2011, the Supreme Court, speaking through Justice Sotomayor, wandered into a discussion, irrelevant to its holding, of whether statistical significance was necessary for a determination of the causality of an association:

“We note that courts frequently permit expert testimony on causation based on evidence other than statistical significance. See, e.g., Best v. Lowe’s Home Centers, Inc., 563 F. 3d 171, 178 (6th Cir 2009); Westberry v. Gislaved Gummi AB, 178 F. 3d 257, 263–264 (4th Cir. 1999) (citing cases); Wells v. Ortho Pharmaceutical Corp., 788 F. 2d 741, 744–745 (11th Cir. 1986). We need not consider whether the expert testimony was properly admitted in those cases, and we do not attempt to define here what constitutes reliable evidence of causation.”[17]

The quoted language is remarkable for two reasons. First, the Best and Westberry cases did not involve statistics at all. They addressed specific causation inferences from what is generally known as differential etiology. Second, the citation to Wells was noteworthy because the case has nothing to do with adverse event reports or the lack of statistical significance.

Wells involved a claim of birth defects caused by the use of a spermicidal jelly contraceptive, which had been the subject of several studies, at least one of which yielded a nominally statistically significant increase in detected birth defects over what was expected.

Wells could thus hardly be an example of a case in which there was a judgment of causation based upon a scientific study that lacked statistical significance in its findings. Of course, finding statistical significance is just the beginning of assessing the causality of an association. The most remarkable and disturbing aspect of the citation to Wells, however, was that the Court was unaware of, or ignored, the case’s notoriety, and the scholarly and scientific consensus that criticized the decision for its failure to evaluate the entire evidentiary display, as well as for its failure to rule out bias and confounding in the studies relied upon by the plaintiff.

Justice Sotomayor’s decision for a unanimous Court is not alone in its failure of scholarship and analysis in embracing the dubious precedent of Wells. Many other courts have done much the same, both in state[18] and in federal courts,[19] and both before and after the Supreme Court decided Daubert, and even after Rule 702 was amended in 2000.[20] Perhaps even more disturbing is that the current edition of the Reference Manual on Scientific Evidence glibly cites to the Wells case, for the dubious proposition that

“Generally, researchers are conservative when it comes to assessing causal relationships, often calling for stronger evidence and more research before a conclusion of causation is drawn.”[21]

We are coming up on the 40th anniversary of the Wells judgment. It is long past time to stop citing the case. Perhaps we have reached the stage of dealing with scientific evidence at which errant and aberrant cases should be retracted, and clearly marked as retracted in the official reporters, and in the electronic legal databases. Certainly the technology exists to link the scholarly criticism with a case citation, just as we link subsequent judicial treatment by overruling, limiting, and criticizing.


[1] Laura Eggertson, “Lancet retracts 12-year-old article linking autism to MMR vaccines,” 182 Canadian Med. Ass’n J. E199 (2010).

[2] Notice of retraction for Teng Zeng & William Mitch, “Oral intake of ranitidine increases urinary excretion of N-nitrosodimethylamine,” 37 Carcinogenesis 625 (2016), published online (May 4, 2021) (retraction requested by authors with an acknowledgement that they had used incorrect analytical methods for their study).

[3] Tianwei He, “Retraction of global scientific publications from 2001 to 2010,” 96 Scientometrics 555 (2013); Bhumika Bhatt, “A multi-perspective analysis of retractions in life sciences,” 126 Scientometrics 4039 (2021); Raoul R. Wadhwa, Chandruganesh Rasendran, Zoran B. Popovic, Steven E. Nissen, and Milind Y. Desai, “Temporal Trends, Characteristics, and Citations of Retracted Articles in Cardiovascular Medicine,” 4 JAMA Network Open e2118263 (2021); Mario Gaudino, N. Bryce Robinson, Katia Audisio, Mohamed Rahouma, Umberto Benedetto, Paul Kurlansky, Stephen E. Fremes, “Trends and Characteristics of Retracted Articles in the Biomedical Literature, 1971 to 2020,” 181 J. Am. Med. Ass’n Internal Med. 1118 (2021); Nicole Shu Ling Yeo-Teh & Bor Luen Tang, “Sustained Rise in Retractions in the Life Sciences Literature during the Pandemic Years 2020 and 2021,” 10 Publications 29 (2022).

[4] Elizabeth Wager & Peter Williams, “Why and how do journals retract articles? An analysis of Medline retractions 1988-2008,” 37 J. Med. Ethics 567 (2011).

[5] Ferric C. Fang, R. Grant Steen, and Arturo Casadevall, “Misconduct accounts for the majority of retracted scientific publications,” 109 Proc. Nat’l Acad. Sci. 17028 (2012); L.M. Chambers, C.M. Michener, and T. Falcone, “Plagiarism and data falsification are the most common reasons for retracted publications in obstetrics and gynaecology,” 126 Br. J. Obstetrics & Gyn. 1134 (2019); M.S. Marsh, “Separating the good guys and gals from the bad,” 126 Br. J. Obstetrics & Gyn. 1140 (2019).

[6] Tzu-Kun Hsiao and Jodi Schneider, “Continued use of retracted papers: Temporal trends in citations and (lack of) awareness of retractions shown in citation contexts in biomedicine,” 2 Quantitative Science Studies 1144 (2021).

[7] Andrew D. Hurwitz, “When Judges Err: Is Confession Good for the Soul?” 56 Ariz. L. Rev. 343, 343 (2014).

[8] See id. at 343-44 (quoting Justice Story who dealt with the need to contradict a previously published opinion, and who wrote “[m]y own error, however, can furnish no ground for its being adopted by this Court.” U.S. v. Gooding, 25 U.S. 460, 478 (1827)).

[9] See, e.g., DeLuca v. Merrell Dow Pharms., Inc., 791 F. Supp. 1042, 1046 (D.N.J. 1992) (“A 95% confidence interval means that there is a 95% probability that the ‘true’ relative risk falls within the interval”), aff’d, 6 F.3d 778 (3d Cir. 1993); In re Silicone Gel Breast Implants Prods. Liab. Litig., 318 F.Supp.2d 879, 897 (C.D. Cal. 2004); Eli Lilly & Co. v. Teva Pharms, USA, 2008 WL 2410420, *24 (S.D.Ind. 2008) (stating incorrectly that “95% percent of the time, the true mean value will be contained within the lower and upper limits of the confidence interval range”). See also “Confidence in Intervals and Diffidence in the Courts” (Mar. 4, 2012).

[10] See, e.g., Brock v. Merrell Dow Pharmaceuticals, Inc., 874 F.2d 307, 311-12 (5th Cir. 1989) (“Fortunately, we do not have to resolve any of the above questions [as to bias and confounding], since the studies presented to us incorporate the possibility of these factors by the use of a confidence interval.”). This howler has been widely acknowledged in the scholarly literature. See David Kaye, David Bernstein, and Jennifer Mnookin, The New Wigmore – A Treatise on Evidence: Expert Evidence § 12.6.4, at 546 (2d ed. 2011); Michael O. Finkelstein, Basic Concepts of Probability and Statistics in the Law 86-87 (2009) (criticizing the blatantly incorrect interpretation of confidence intervals by the Brock court).

[11] In re Ephedra Prods. Liab. Litig., 393 F.Supp. 2d 181, 191 (S.D.N.Y. 2005) (Rakoff, J.) (“Generally accepted scientific convention treats a result as statistically significant if the P-value is not greater than .05. The expression ‘P=.05’ means that there is one chance in twenty that a result showing increased risk was caused by a sampling error — i.e., that the randomly selected sample accidentally turned out to be so unrepresentative that it falsely indicates an elevated risk.”); see also In re Phenylpropanolamine (PPA) Prods. Liab. Litig., 289 F.Supp. 2d 1230, 1236 n.1 (W.D. Wash. 2003) (“P-values measure the probability that the reported association was due to chance… .”). Although the erroneous Ephedra opinion continues to be cited, it has been debunked in the scholarly literature. See Michael O. Finkelstein, Basic Concepts of Probability and Statistics in the Law 65 (2009); Nathan A. Schachtman, “Statistical Evidence in Products Liability Litigation,” at 28-13, chap. 28, in Stephanie A. Scharf, George D. Sax, & Sarah R. Marmor, eds., Product Liability Litigation: Current Law, Strategies and Best Practices (2d ed. 2021).

[12] Wells v. Ortho Pharm. Corp., 615 F. Supp. 262 (N.D. Ga. 1985), aff’d & modified in part, remanded, 788 F.2d 741 (11th Cir.), cert. denied, 479 U.S. 950 (1986).

[13] I have discussed the Wells case in a series of posts, “Wells v. Ortho Pharm. Corp., Reconsidered,” (2012), part one, two, three, four, five, and six.

[14] See, e.g., James L. Mills and Duane Alexander, “Teratogens and ‘Litogens’,” 315 New Engl. J. Med. 1234 (1986); Samuel R. Gross, “Expert Evidence,” 1991 Wis. L. Rev. 1113, 1121-24 (1991) (“Unfortunately, Judge Shoob’s decision is absolutely wrong. There is no scientifically credible evidence that Ortho-Gynol Contraceptive Jelly ever causes birth defects.”). See also Editorial, “Federal Judges v. Science,” N.Y. Times, December 27, 1986, at A22 (unsigned editorial) (“That Judge Shoob and the appellate judges ignored the best scientific evidence is an intellectual embarrassment.”); David E. Bernstein, “Junk Science in the Courtroom,” Wall St. J. at A 15 (Mar. 24, 1993) (pointing to Wells as a prominent example of how the federal judiciary had embarrassed the American judicial system with its careless, non-evidence based approach to scientific evidence); Bert Black, Francisco J. Ayala & Carol Saffran-Brinks, “Science and the Law in the Wake of Daubert: A New Search for Scientific Knowledge,” 72 Texas L. Rev. 715, 733-34 (1994) (lawyers and leading scientist noting that the district judge “found that the scientific studies relied upon by the plaintiffs’ expert were inconclusive, but nonetheless held his testimony sufficient to support a plaintiffs’ verdict. *** [T]he court explicitly based its decision on the demeanor, tone, motives, biases, and interests that might have influenced each expert’s opinion. Scientific validity apparently did not matter at all.”) (internal citations omitted); Bert Black, “A Unified Theory of Scientific Evidence,” 56 Fordham L. Rev. 595, 672-74 (1988); Paul F. Strain & Bert Black, “Dare We Trust the Jury – No,” 18 Brief 7 (1988); Bert Black, “Evolving Legal Standards for the Admissibility of Scientific Evidence,” 239 Science 1508, 1511 (1988); Diana K. Sheiness, “Out of the Twilight Zone: The Implications of Daubert v. Merrill Dow Pharmaceuticals, Inc.,” 69 Wash. L. Rev. 481, 493 (1994); David E. Bernstein, “The Admissibility of Scientific Evidence after Daubert v. Merrell Dow Pharmaceuticals, Inc.,” 15 Cardozo L. Rev. 2139, 2140 (1993) (embarrassing decision); Troyen A. Brennan, “Untangling Causation Issues in Law and Medicine: Hazardous Substance Litigation,” 107 Ann. Intern. Med. 741, 744-45 (1987) (describing the result in Wells as arising from the difficulties created by the Ferebee case; “[t]he Wells case can be characterized as the court embracing the hypothesis when the epidemiologic study fails to show any effect”); Troyen A. Brennan, “Causal Chains and Statistical Links: Some Thoughts on the Role of Scientific Uncertainty in Hazardous Substance Litigation,” 73 Cornell L. Rev. 469, 496-500 (1988); David B. Brushwood, “Drug induced birth defects: difficult decisions and shared responsibilities,” 91 W. Va. L. Rev. 51, 74 (1988); Kenneth R. Foster, David E. Bernstein, and Peter W. Huber, eds., Phantom Risk: Scientific Inference and the Law 28-29, 138-39 (1993) (criticizing Wells decision); Peter Huber, “Medical Experts and the Ghost of Galileo,” 54 Law & Contemp. Problems 119, 158 (1991); Edward W. Kirsch, “Daubert v. Merrell Dow Pharmaceuticals: Active Judicial Scrutiny of Scientific Evidence,” 50 Food & Drug L.J. 213 (1995) (“a case in which a court completely ignored the overwhelming consensus of the scientific community”); Hans Zeisel & David Kaye, Prove It With Figures: Empirical Methods in Law and Litigation § 6.5, at 93 (1997) (noting the multiple comparisons in studies of birth defects among women who used spermicides, based upon the many reported categories of birth malformations, and the large potential for even more unreported categories); id. at § 6.5 n.3, at 271 (characterizing Wells as “notorious,” and noting that the case became a “lightning rod for the legal system’s ability to handle expert evidence.”); Edward K. Cheng, “Independent Judicial Research in the ‘Daubert’ Age,” 56 Duke L. J. 1263 (2007) (“notoriously concluded”); Edward K. Cheng, “Same Old, Same Old: Scientific Evidence Past and Present,” 104 Michigan L. Rev. 1387, 1391 (2006) (“judge was fooled”); Harold P. Green, “The Law-Science Interface in Public Policy Decisionmaking,” 51 Ohio St. L.J. 375, 390 (1990); Stephen L. Isaacs & Renee Holt, “Drug regulation, product liability, and the contraceptive crunch: Choices are dwindling,” 8 J. Legal Med. 533 (1987); Neil Vidmar & Shari S. Diamond, “Juries and Expert Evidence,” 66 Brook. L. Rev. 1121, 1169-1170 (2001); Adil E. Shamoo, “Scientific evidence and the judicial system,” 4 Accountability in Research 21, 27 (1995); Michael S. Davidson, “The limitations of scientific testimony in chronic disease litigation,” 10 J. Am. Coll. Toxicol. 431, 435 (1991); Charles R. Nesson & Yochai Benkler, “Constitutional Hearsay: Requiring Foundational Testing and Corroboration under the Confrontation Clause,” 81 Virginia L. Rev. 149, 155 (1995); Stephen D. Sugarman, “The Need to Reform Personal Injury Law Leaving Scientific Disputes to Scientists,” 248 Science 823, 824 (1990); Jay P. Kesan, “A Critical Examination of the Post-Daubert Scientific Evidence Landscape,” 52 Food & Drug L. J. 225, 225 (1997); Ora Fred Harris, Jr., “Communicating the Hazards of Toxic Substance Exposure,” 39 J. Legal Ed. 97, 99 (1989) (“some seemingly horrendous decisions”); Ora Fred Harris, Jr., “Complex Product Design Litigation: A Need for More Capable Fact-Finders,” 79 Kentucky L. J. 510 & n.194 (1991) (“uninformed judicial decision”); Barry L. Shapiro & Marc S. Klein, “Epidemiology in the Courtroom: Anatomy of an Intellectual Embarrassment,” in Stanley A. Edlavitch, ed., Pharmacoepidemiology 87 (1989); Marc S. Klein, “Expert Testimony in Pharmaceutical Product Liability Actions,” 45 Food, Drug, Cosmetic L. J. 393, 410 (1990); Michael S. Lehv, “Medical Product Liability,” Ch. 39, in Sandy M. Sanbar & Marvin H. Firestone, eds., Legal Medicine 397, 397 (7th ed. 2007); R. Ryan Stoll, “A Question of Competence – Judicial Role in Regulation of Pharmaceuticals,” 45 Food, Drug, Cosmetic L. J. 279, 287 (1990); Note, “A Question of Competence: The Judicial Role in the Regulation of Pharmaceuticals,” Harvard L. Rev. 773, 781 (1990); Peter H. Schuck, “Multi-Culturalism Redux: Science, Law, and Politics,” 11 Yale L. & Policy Rev. 1, 13 (1993); Howard A. Denemark, “Improving Litigation Against Drug Manufacturers for Failure to Warn Against Possible Side Effects: Keeping Dubious Lawsuits from Driving Good Drugs off the Market,” 40 Case Western Reserve L. Rev. 413, 438-50 (1989-90); Howard A. Denemark, “The Search for Scientific Knowledge in Federal Courts in the Post-Frye Era: Refuting the Assertion that Law Seeks Justice While Science Seeks Truth,” 8 High Technology L. J. 235 (1993).

[15] Carl Cranor & Kurt Nutting, “Scientific and Legal Standards of Statistical Evidence in Toxic Tort and Discrimination Suits,” 9 Law & Philosophy 115, 123 (1990) (internal citations omitted).

[16] 131 S.Ct. 1309 (2011) [Matrixx]

[17] Id. at 1319.

[18] Baroldy v. Ortho Pharmaceutical Corp., 157 Ariz. 574, 583, 760 P.2d 574 (Ct. App. 1988); Earl v. Cryovac, A Div. of WR Grace, 115 Idaho 1087, 772 P. 2d 725, 733 (Ct. App. 1989); Rubanick v. Witco Chemical Corp., 242 N.J. Super. 36, 54, 576 A. 2d 4 (App. Div. 1990), aff’d in part, 125 N.J. 421, 442, 593 A. 2d 733 (1991); Minnesota Min. & Mfg. Co. v. Atterbury, 978 S.W. 2d 183, 193 n.7 (Tex. App. 1998); E.I. Dupont de Nemours v. Castillo ex rel. Castillo, 748 So. 2d 1108, 1120 (Fla. Dist. Ct. App. 2000); Bell v. Lollar, 791 N.E.2d 849, 854 (Ind. App. 2003); King v. Burlington Northern & Santa Fe Ry, 277 Neb. 203, 762 N.W.2d 24, 35 & n.16 (2009).

[19] City of Greenville v. WR Grace & Co., 827 F. 2d 975, 984 (4th Cir. 1987); American Home Products Corp. v. Johnson & Johnson, 672 F. Supp. 135, 142 (S.D.N.Y. 1987); Longmore v. Merrell Dow Pharms., Inc., 737 F. Supp. 1117, 1119 (D. Idaho 1990); Conde v. Velsicol Chemical Corp., 804 F. Supp. 972, 1019 (S.D. Ohio 1992); Joiner v. General Elec. Co., 864 F. Supp. 1310, 1322 (N.D. Ga. 1994) (which case ultimately ended up in the Supreme Court); Bowers v. Northern Telecom, Inc., 905 F. Supp. 1004, 1010 (N.D. Fla. 1995); Pick v. American Medical Systems, 958 F. Supp. 1151, 1158 (E.D. La. 1997); Baker v. Danek Medical, 35 F. Supp. 2d 875, 880 (N.D. Fla. 1998).

[20] Rider v. Sandoz Pharms. Corp., 295 F. 3d 1194, 1199 (11th Cir. 2002); Kilpatrick v. Breg, Inc., 613 F. 3d 1329, 1337 (11th Cir. 2010); Siharath v. Sandoz Pharms. Corp., 131 F. Supp. 2d 1347, 1359 (N.D. Ga. 2001); In re Meridia Prods. Liab. Litig., Case No. 5:02-CV-8000 (N.D. Ohio 2004); Henricksen v. ConocoPhillips Co., 605 F. Supp. 2d 1142, 1177 (E.D. Wash. 2009); Doe v. Northwestern Mutual Life Ins. Co., (D. S.C. 2012); In re Chantix (Varenicline) Prods. Liab. Litig., 889 F. Supp. 2d 1272, 1286, 1288, 1290 (N.D. Ala. 2012); Farmer v. Air & Liquid Systems Corp. at n.11 (M.D. Ga. 2018); In re Abilify Prods. Liab. Litig., 299 F. Supp. 3d 1291, 1306 (N.D. Fla. 2018).

[21] Michael D. Green, D. Michal Freedman & Leon Gordis, “Reference Guide on Epidemiology,” 549, 599 n.143, in Federal Judicial Center, National Research Council, Reference Manual on Scientific Evidence (3d ed. 2011).