TORTINI

For your delectation and delight, desultory dicta on the law of delicts.

Dipak Panigrahy – Expert Witness & Putative Plagiarist

March 27th, 2024

Citing an IARC monograph may be in itself questionable, given the IARC’s deviations from good systematic review practice. Taking the language of an IARC monograph and passing it off as your own, without citation or attribution, and leaving out the qualifications and limitations stated in the monograph, should be disqualifying for an expert witness.

And in one federal court, it is.

Last week, on March 18, Senior Judge Roy Bale Dalton, Jr., of Orlando, Florida, granted defendant Lockheed Martin’s Rule 702 motion to exclude the proffered testimony of Dr. Dipak Panigrahy.[1] Panigrahy had opined in his Rule 26 report that seven substances[2] present in the Orlando factory cause eight different types of cancer[3] in 22 of the plaintiffs. Lockheed’s motion asserted that Panigrahy copied IARC verbatim, except for its qualifications and limitations. Judge Dalton reportedly found Panigrahy’s conduct so “blatant that it represents deliberate lack of candor” and an “unreliable methodology.” Although Judge Dalton’s opinion is not yet posted on Westlaw or Google Scholar,[4] the report from Legal Newsline quoted the opinion extensively:

“Here, there is no question that Dr. Panigrahy extensively plagiarized his report… .”

“And his deposition made the plagiarism appear deliberate, as he repeatedly outright refused to acknowledge the long swaths of his report that quote other work verbatim without any quotation marks at all – instead stubbornly insisting that he cited over 1,100 references, as if that resolves the attribution issue (it does not).”

“Indeed, the plagiarism is so ubiquitous throughout the report that it is frankly overwhelming to try to make heads or tails of just what is Dr. Panigrahy’s own work – a task that neither he nor Plaintiffs’ counsel even attempts to tackle.”

There is a wide range of questionable research practices and dubious inferences that lead to the exclusion of expert witnesses under Rule 702, but I would have thought that Panigrahy was the first witness to have been excluded for plagiarism. Judge Dalton did, however, cite cases involving plagiarism by expert witnesses.[5] Although plagiarism might be framed as a credibility issue, the extent of the plagiarism by Panigrahy represented such an egregious lack of candor that it may justify exclusion under Rule 702.

Judge Dalton’s gatekeeping analysis, however, did not stop with the finding of blatant plagiarism from the IARC monograph. Panigrahy’s report was further methodologically marred by reliance upon the IARC, and by his confusion of the IARC hazard evaluation with the required determination of causation in the law of torts. Judge Dalton explained that

“the plagiarism here reflects even deeper methodological problems because the report lifts a great deal of its analysis from IARC in particular. As the Court discussed in the interim causation Order, research agencies like IARC are, understandably, focused on protecting public health and recommending protective standards, rather than evaluating causation from an expert standpoint in the litigation context. IARC determines qualitatively whether substances are carcinogenic to humans; its descriptors have “no quantitative significance” such as more likely than not. Troublingly, Dr. Panigrahy did not grasp this crucial distinction between IARC’s classifications and the general causation preponderance standard. Because so much of Dr. Panigrahy’s report is merely a wholesale adoption of IARC’s findings under the guise of his own expertise, and IARC’s findings in and of themselves are insufficient, he fails to reliably establish general causation.”[6]

Dr. Panigrahy was accepted into medical school at the age of 17. His accelerated education may have left him without a firm understanding of the ethical requirements of scholarship.

Earlier this month, Senior Judge Dalton excluded another expert witness’s opinion testimony, from Dr. Donald Mattison, on autism, multiple sclerosis, and Parkinson’s disease, but permitted his opinions on the causation of various birth defects.[7] Judge Dalton’s decisions arise from a group of companion cases, brought by more than 60 claimants against Lockheed Martin for various health conditions alleged to have been caused by Lockheed’s supposed contamination of the air, soil, and groundwater with chemicals from its weapons manufacturing plant.

The unreliability of Panigrahy’s report led to the entry of summary judgment against the 22 plaintiffs whose cases turned on the Panigrahy report.

The putative plagiarist, Dr. Panigrahy, is an assistant professor of pathology at Harvard Medical School, in the department of pathology of the Beth Israel Deaconess Medical Center, in Boston. “The Expert Institute” has a profile of Panigrahy, with a compilation of background information and litigation activities. His opinions were excluded in the federal multi-district litigation concerning Zantac/ranitidine.[8] Very similar opinions were permitted over defense challenges, in a short, perfunctory order, even shorter on reasoning, in the valsartan multi-district litigation.[9]


[1] John O’Brien, “‘A mess’: Expert in Florida toxic tort plagiarizes cancer research of others, tries to submit it to court,” Legal Newsline (Mar. 25, 2024).

[2] trichloroethylene, tetrachloroethylene, formaldehyde, arsenic, hexavalent chromium, trichloroethylene, and styrene.

[3] cancers of the kidney, breast, thyroid, pancreas, liver and bile duct, testicles, and anus, as well as Hodgkin’s lymphoma, non-Hodgkin’s lymphoma, and leukemia.

[4] Henderson v. Lockheed Martin Corp., case no. 6:21-cv-1363-RBD-DCI, document 399 (M.D. Fla. Mar. 18, 2024) (Dalton, S.J.).

[5] Henderson Order at 6, citing Moore v. BASF Corp., No. CIV.A. 11-1001, 2012 WL 6002831, at *7 (E.D. La. Nov. 30, 2012) (excluding expert testimony from Bhaskar Kura), aff’d, 547 F. App’x 513 (5th Cir. 2013); Spiral Direct, Inc. v. Basic Sports Apparel, Inc., No. 6:15-cv-641, 2017 WL 11457208, at *2, 293 F. Supp. 3d 1334, 1363 n.20 (M.D. Fla. Apr. 13, 2017); Legier & Materne v. Great Plains Software, Inc., No. CIV.A. 03-0278, 2005 WL 2037346, at *4 (E.D. La. Aug. 3, 2005) (denying motion to exclude the proffered testimony of an expert witness who plagiarized a paragraph in his report).

[6] Henderson Order at 8 -10 (internal citations omitted), citing McClain v. Metabolife Internat’l, Inc., 401 F.3d 1233, 1249 (11th Cir. 2005) (distinguishing agency assessment of risk from judicial assessment of causation); Williams v. Mosaic Fertilizer, LLC, 889 F.3d 1239, 1247 (11th Cir. 2018) (identifying “methodological perils” in relying extensively on regulatory agencies’ precautionary standards to determine causation); Allen v. Pennsylvania Eng’g Corp., 102 F.3d 194, 198 (5th Cir. 1996) (noting that IARC’s “threshold of proof is reasonably lower than that appropriate in tort law, which traditionally makes more particularized inquiries into cause and effect and requires a plaintiff to prove that it is more likely than not that another individual has caused him or her harm”); In re Roundup Prods. Liab. Litig., 390 F. Supp. 3d 1102, 1109 (N.D. Cal. 2018) (“IARC classification is insufficient to get the plaintiffs over the general causation hurdle.”), aff’d, 997 F.3d 941 (9th Cir. 2021).

[7] John O’Brien, “Autism plaintiffs rejected from Florida Lockheed Martin toxic tort,” Legal Newsline (Mar. 15, 2024).

[8] In re Zantac (Ranitidine) Prods. Liab. Litig., MDL No. 2924, 644 F. Supp. 3d 1075, 1100 (S.D. Fla. 2022).

[9] In re Valsartan, Losartan, and Irbesartan Prods. Liab. Litig., Case 1:19-md-02875-RBK-SAK, document 1958 (D.N.J. Mar. 4, 2022).

A π-Day Celebration of Irrational Numbers and Other Things – Philadelphia Glyphosate Litigation

March 14th, 2024

Science can often be more complicated and nuanced than we might like. Back in 1897, the Indiana legislature attempted to establish that π was equal to 3.2.[1] Sure, that was simpler and easier to use in calculations, but also wrong. The irreducible fact is that π is an irrational number, and Indiana’s attempt to change that fact was, well, irrational. And to celebrate irrationality, consider the lawsuit industry’s jihad against glyphosate, including its efforts to elevate a dodgy IARC evaluation while suppressing evidence of glyphosate’s scientific exonerations.


After Bayer lost three consecutive glyphosate cases in Philadelphia last year, observers were scratching their heads over why the company had lost when the scientific evidence strongly supports the defense. The Philadelphia Court of Common Pleas, not to be confused with Common Fleas, can be a rough place for corporate defendants. The local newspapers, to the extent people still read newspapers, are insufferably slanted in their coverage of health claims.

The plaintiffs’ verdicts garnered a good deal of local media coverage in Philadelphia.[2] Defense verdicts generally receive no ink from sensationalist newspapers such as the Philadelphia Inquirer. Regardless, media accounts, both lay and legal, are generally inadequate to tell us what happened, or what went wrong in the courtroom. The defense losses could be attributable to partial judges or juries, or the difficulty in communicating subtle issues of scientific validity. Plaintiffs’ expert witnesses may seem more sure of themselves than defense experts, or plaintiffs’ counsel may connect better with juries primed by fear-mongering media. Without being in the courtroom, or at least studying trial transcripts, outside observers are challenged to explain fully jury verdicts that go against the scientific evidence. The one thing jury verdicts are not, however, is a valid assessment of the strength of scientific evidence, inferences, and conclusions.

Although Philadelphia juries can be rough, they like to see a fight. (Remember Rocky.) It is not a place for genteel manners or delicate and subtle distinctions. Last week, Bayer broke its Philadelphia losing streak, with a win in Kline v. Monsanto Co.[3] Mr. Kline claimed that he developed non-Hodgkin’s lymphoma (NHL) from his long-term use of Roundup. The two-week trial, before Judge Ann Butchart, went to the jury last week; the jurors deliberated two hours before returning a unanimous defense verdict. The jury found that the defendants, Monsanto and Nouryon Chemicals LLC, were not negligent, and that the plaintiff’s use of Roundup was not a factual cause of his lymphoma.[4]

Law360 reported that the Kline verdict was the first to follow a ruling on Valentine’s Day, February 14, 2024, which excluded any courtroom reference to the hazard evaluation of glyphosate by the International Agency for Research on Cancer (IARC). The Law360 article indicated that the IARC found that glyphosate can cause cancer, although of course IARC has never reached such a conclusion.

The IARC working group evaluated the evidence for glyphosate and classified the substance as a category IIA carcinogen, which it labels as “probably” causing human cancer. This label sounds close to what might be useful in a courtroom, except that the IARC declares that “probably,” as used in its IIA classification, does not mean what people generally, and lawyers and judges specifically, mean by the word probably. For IARC, “probable” has no quantitative meaning. In other words, for IARC, probability, a quantitative concept that everyone understands to be measured on a scale from 0 to 1, or from 0% to 100%, is not quantitative. An IARC IIA classification could thus represent a posterior probability of 1% in favor of carcinogenicity (and a 99% probability of not being a carcinogen). In short, on whether glyphosate causes cancer in humans, IARC says maybe, in its own made-up epistemic modality.

To find the idiosyncratic definition of “probable,” a diligent reader must go outside the monograph of interest to the so-called Preamble, a separate document, last revised in 2019. The first time the jury will hear of the IARC pronouncement will be in the plaintiff’s case, and if the defense wishes to inform the jury of the special, idiosyncratic meaning of IARC “probable,” it must do so on cross-examination of hostile plaintiffs’ witnesses, or wait until it presents its own witnesses. Disclosing the IARC IIA classification hurts because the “probable” language lines up with what the trial judges will instruct the juries at the end of the case, when the jurors are told that they need not believe that the plaintiff has eliminated all doubt; they need only find that the plaintiff has shown that each element of his case is “probable,” or more likely than not, in order to prevail. Once the jury has heard “probable,” the defense will have a hard time putting the toothpaste back in the tube. Of course, this is why the lawsuit industry loves IARC evaluations, with their fallacies of semantical distortion.[5]

Although identifying the causes of a jury verdict is more difficult than even determining carcinogenicity, Rosemary Pinto, one of plaintiff Kline’s lawyers, suggested that the exclusion of the IARC evaluation sank her case:

“We’re very disappointed in the jury verdict, which we plan to appeal, based upon adverse rulings in advance of the trial that really kept core components of the evidence out of the case. These included the fact that the EPA safety evaluation of Roundup has been vacated, who IARC (the International Agency for Research on Cancer) is and the relevance of their finding that Roundup is a probable human carcinogen [sic], and also the allowance into evidence of findings by foreign regulatory agencies disguised as foreign scientists. All of those things collectively, we believe, tilted the trial in Monsanto’s favor, and it was inconsistent with the rulings in previous Roundup trials here in Philadelphia and across the country.”[6]

Pinto was involved in the case, and so she may have some insight into why the jury ruled as it did. Still, issuing this pronouncement before interviewing the jurors seems little more than wishcasting. As philosopher Harry Frankfurt explained, “the production of bullshit is stimulated whenever a person’s obligations or opportunities to speak about some topic exceed his knowledge of the facts that are relevant to that topic.”[7] Pinto’s real aim was revealed in her statement that the IARC review was “crucial evidence that juries should be hearing.”[8]  

What is the genesis of Pinto’s complaint about the exclusion of IARC’s conclusions? The Valentine’s Day Order, issued by Judge Joshua H. Roberts, who heads up the Philadelphia County mass tort court, provided:

AND NOW, this 14th day of February, 2024, upon consideration of Defendants’ Motion to Clarify the Court’s January 4, 2024 Order on Plaintiffs Motion in Limine No. 5 to Exclude Foreign Regulatory Registrations and/or Approvals of Glyphosate, GBHs, and/or Roundup, Plaintiffs’ Response, and after oral argument, it is ORDERED as follows:

  1. The Court’s Order of January 4, 2024, is AMENDED to read as follows: [ … ] it is ORDERED that the Motion is GRANTED without prejudice to a party’s introduction of foreign scientific evidence, provided that the evidence is introduced through an expert witness who has been qualified pursuant to Pa. R. E. 702.

  2. The Court specifically amends its Order of January 4, 2024, to exclude reference to IARC, and any other foreign agency and/or foreign regulatory agency.

  3. The Court reiterates that no party may introduce any testimony or evidence regarding a foreign agency and/or foreign regulatory agency which may result in a mini-trial regarding the protocols, rules, and/or decision making process of the foreign agency and/or foreign regulatory agency. [fn1]

  4. The trial judge shall retain full discretion to make appropriate evidentiary rulings on the issues covered by this Order based on the testimony and evidence elicited at trial, including but not limited to whether a party or witness has “opened the door.”[9]

Now what was not covered in the legal media accounts was the curious irony that the exclusion of the IARC evaluation resulted from plaintiffs’ motion, an own goal of sorts. In previous Philadelphia trials, plaintiffs’ counsel vociferously objected to defense counsel’s and experts’ references to the determinations by foreign regulators, such as the European Union Assessment Group on Glyphosate (2017, 2022), Health Canada (2017), the European Food Safety Authority (2017, 2023), the Australian Pesticides and Veterinary Medicines Authority (2017), the German Federal Institute for Risk Assessment (2019), and others, which rejected the IARC evaluation and reported that glyphosate has not been shown to be carcinogenic.[10]

The gravamen of the plaintiffs’ objection was that such regulatory determinations were hearsay, and that they resulted from various procedures, using various criteria, which would require explanation, and would be subject to litigants’ challenges.[11] In other words, for each regulatory agency’s determination, there would be a “mini-trial,” or a “trial within a trial,” about the validity and accuracy of the foreign agency’s assessment.

In the earlier Philadelphia trials, the plaintiffs’ objections were largely sustained, which created a significant evidentiary bias in the courtrooms. Plaintiffs’ expert witnesses could freely discuss the IARC glyphosate evaluation, but the defense and its experts could not discuss the many determinations of the safety of glyphosate. Jurors were apparently left with the erroneous impression that the IARC evaluation was a consensus view of the entire world’s scientific community.

Now plaintiffs’ objection has a point, even though it seems to prove too much and must ultimately fail. In a trial, each side has expert witnesses who can offer an opinion about the key causal issue, whether glyphosate can cause NHL, and whether it caused this plaintiff’s NHL. Each expert witness will have written a report that identifies the facts and data relied upon, and that explains the inferences drawn and conclusions reached. The adversary can challenge the validity of the data, inferences, and conclusions because the opposing expert witness will be subject to cross-examination.

The facts and data relied upon will, however, be “hearsay,” coming from published studies not written by the expert witnesses at trial. Many aspects of the relied-upon studies will be taken on faith without the testimony of the study participants, their healthcare providers, or the scientists who collected the data, chose how to analyze the data, conducted the statistical and scientific analyses, and wrote up the methods and study findings. Permitting reliance upon any study thus allows for a “mini-trial,” or a “trial within a trial,” on each study cited and relied upon by the testifying expert witnesses. This complexity involved in expert witness opinion testimony is one of the foundational reasons for Rule 702’s gatekeeping regime in federal court and most state courts, a regime usually conspicuously absent in Pennsylvania courtrooms.

Furthermore, the plaintiffs’ objections to foreign regulatory determinations would apply to any review paper, and more important, it would apply to the IARC glyphosate monograph itself. After all, if expert witnesses are supposed to have reviewed the underlying studies themselves, and be competent to do so, and to have arrived at an opinion in some reliable way from the facts and data available, then they would have no need to advert to the IARC’s review on the general causation issue.  If an expert witness were allowed to invoke the IARC conclusion, presumably to bolster his or her own causation opinion, then the jury would need to resolve questions about:

  • who was on the working group;
  • how were working group members selected, or excluded;
  • how the working group arrived at its conclusion;
  • what did the working group rely upon, or not rely upon, and why;
  • what was the group’s method for synthesizing facts and data to reach its conclusion;
  • was the working group faithful to its stated methodology;
  • did the working group commit any errors of statistical or scientific judgment along the way;
  • what potential biases did the working group members have;
  • what is the basis for the IARC’s classificatory scheme; and
  • how are IARC’s key terms such as “sufficient,” “limited,” “probable,” “possible,” etc., defined and used by working groups.

Indeed, a very substantial trial could be had on the bona fides and methods of the IARC, and the glyphosate IARC working group in particular.

The curious irony behind the Valentine’s Day order is that plaintiffs’ counsel were generally winning their objections to the defense’s references to foreign regulatory determinations. But pigs get fat, and hogs get slaughtered. Last year, plaintiffs’ counsel moved to “exclude foreign regulatory registrations and/or approvals of glyphosate.”[12] To be sure, plaintiffs’ counsel were not seeking merely the exclusion of glyphosate registrations, but also the exclusion of the scientific evaluations of regulatory agencies and their staff scientists and consulting scientists. Plaintiffs wanted trials in which juries would hear only about IARC, as though it were a scientific consensus. The many scientific regulatory considerations and rejections of the IARC evaluation would be purged from the courtroom.

On January 4, 2024, plaintiffs’ counsel obtained what they sought, an order that memorialized the tilted playing field they had largely been enjoying in Philadelphia courtrooms. Judge Roberts’ order was short and somewhat ambiguous:

“upon consideration of plaintiff’s motion in limine no. 5 to exclude foreign regulatory registrations and/or approvals of glyphosate, GBHs, and/or Roundup, any response thereto, the supplements of the parties, and oral argument, it is ORDERED that the motion is GRANTED without prejudice to a party’s introduction of foreign scientific evidence including, but not limited to, evidence from the International Agency for Research on Cancer (IARC), provided that such introduction does not refer to foreign regulatory agencies.”

The courtroom “real world” outcome after Judge Roberts’ order was an obscene verdict in the McKivison case. Again, there may have been many contributing causes to the McKivison verdict, including Pennsylvania’s murky and retrograde law of expert witness opinion testimony.[13] Mr. McKivison was in remission from NHL and had sustained no economic damages, and yet, on January 26, 2024, a jury in his case returned a punitive compensatory damages award of $250 million, and an even more punitive punitive damage award of $2 billion.[14] It seems at least plausible that the imbalance between admitting the IARC evaluation while excluding foreign regulatory assessments helped create a false narrative that scientists and regulators everywhere had determined glyphosate to be unsafe.

On February 2, 2024, the defense moved for a clarification of Judge Roberts’ January 4, 2024 order that applied globally in the Philadelphia glyphosate litigation. The defendants complained that in their previous trial, after Judge Roberts’ Order of January 4, 2024, they were severely prejudiced by being prohibited from referring to the conclusions and assessments of foreign scientists who worked for regulatory agencies. The complaint seems well founded. If a hearsay evaluation of glyphosate by an IARC working group is relevant and admissible, the conclusions of foreign scientists about glyphosate are relevant and admissible, whether or not they are employed by foreign regulatory agencies. Indeed, plaintiffs’ counsel routinely complained about Monsanto/Bayer’s “influence” over the United States Environmental Protection Agency, but the suggestion that the European Union’s regulators are in the pockets of Bayer is pretty farfetched. Moreover, the complaint about bias is peculiar coming from plaintiffs’ counsel, who command an outsized influence within the Collegium Ramazzini,[15] which in turn often dominates IARC working groups. Every agency and scientific group, including the IARC, has its “method,” its classificatory schemes, its definitions, and the like. By privileging the IARC conclusion, while excluding those of the many other agencies and groups, and allowing plaintiffs’ counsel to argue that there is no real-world debate over glyphosate, Philadelphia courts play a malignant role in helping to generate the huge verdicts seen in glyphosate litigation.

The defense motion for clarification also stressed that the issue whether glyphosate causes NHL or other human cancer is not the probandum for which foreign agency and scientific group statements are relevant.  Pennsylvania has a most peculiar, idiosyncratic law of strict liability, under which such statements may not be relevant to liability questions. Plaintiffs’ counsel, in glyphosate and most tort litigations, however, routinely assert negligence as well as punitive damages claims. Allowing plaintiffs’ counsel to create a false and fraudulent narrative that Monsanto has flouted the consensus of the entire scientific and regulatory community in failing to label Roundup with cancer warnings is a travesty of the rule of law.

What was too clever by half in the plaintiffs’ litigation approach was that their complaints about foreign regulatory assessments applied equally, if not more so, to the IARC glyphosate hazard evaluation. The glyphosate litigation is likely not as interminable as π, but it is irrational.

*      *     *      *      *     * 

Post Script.  Ten days after the verdict in Kline, and one day after the above post, the Philadelphia Inquirer released a story about the defense verdict. See Nick Vadala, “Monsanto wins first Roundup court case in recent string of Philadelphia lawsuits,” Phila. Inq. (Mar. 15, 2024).


[1] Bill 246, Indiana House of Representatives (1897); Petr Beckmann, A History of π at 174 (1971).

[2] See Robert Moran, “Philadelphia jury awards $175 million after deciding 83-year-old man got cancer from Roundup weed killer,” Phila. Inq. (Oct. 27, 2023); Nick Vadala, “Philadelphia jury awards $2.25 billion to man who claimed Roundup weed killer gave him cancer,” Phila. Inq. (Jan. 29, 2024).

[3] Phila. Ct. C.P. 2022-01641.

[4] George Woolston, “Monsanto Nabs 1st Win In Philly’s Roundup Trial Blitz,” Law360 (Mar. 5, 2024); Nicholas Malfitano, “After three initial losses, Roundup manufacturers get their first win in Philly courtroom,” Pennsylvania Record (Mar. 6, 2024).

[5] See David Hackett Fischer, “Fallacies of Semantical Distortion,” chap. 10, in Historians’ Fallacies: Toward a Logic of Historical Thought (1970); see also “IARC’s Fundamental Distinction Between Hazard and Risk – Lost in the Flood” (Feb. 1, 2024); “The IARC-hy of Evidence – Incoherent & Inconsistent Classification of Carcinogenicity” (Sept. 19, 2023).

[6] Malfitano, note 4 (quoting Pinto); see also Woolston, note 4 (quoting Pinto).

[7] Harry Frankfurt, On Bullshit at 63 (2005); see “The Philosophy of Bad Expert Witness Opinion Testimony” (Oct. 2, 2010).

[8] See Malfitano, note 4 (quoting Pinto).

[9] In re Roundup Prods. Litig., Phila. Cty. Ct. C.P., May Term 2022-0550, Control No. 24020394 (Feb. 14, 2024) (Roberts, J.). In a footnote, the court explained that “an expert may testify that foreign scientists have concluded that Roundup and glyphosate can be used safely and they do not cause cancer. In the example provided, there is no specific reference to an agency or regulatory body, and the jury is free to make a credibility determination based on the totality of the expert’s testimony. It is, however, impossible for this Court, in a pre-trial posture, to anticipate every iteration of a question asked or answer provided; it remains within the discretion of the trial judge to determine whether a question or answer is appropriate based on the context and the trial circumstances.”

[10] See National Ass’n of Wheat Growers v. Bonta, 85 F.4th 1263, 1270 (9th Cir. 2023) (“A significant number of . . . organizations disagree with IARC’s conclusion that glyphosate is a probable carcinogen”; … “[g]lobal studies from the European Union, Canada, Australia, New Zealand, Japan, and South Korea have all concluded that glyphosate is unlikely to be carcinogenic to humans.”).

[11] See, e.g., In re Seroquel, 601 F. Supp. 2d 1313, 1318 (M.D. Fla. 2009) (noting that references to foreign regulatory actions or decisions “without providing context concerning the regulatory schemes and decision-making processes involved would strip the jury of any framework within which to evaluate the meaning of that evidence”).

[12] McKivison v. Monsanto Co., Phila. Cty. Ct. C.P., No. 2022-00337, Plaintiff’s Motion in Limine No. 5 to Exclude Foreign Regulatory Registration and/or Approvals of Glyphosate, GBHs and/or Roundup.

[13] See Sherman Joyce, “New Rule 702 Helps Judges Keep Bad Science Out Of Court,” Law360 (Feb. 13, 2024) (noting Pennsylvania’s outlier status on evidence law that enables dodgy opinion testimony).

[14] P.J. D’Annunzio, “Monsanto Fights $2.25B Verdict After Philly Roundup Trial,” Law360 (Feb. 8, 2024).

[15] “Collegium Ramazzini & Its Fellows – The Lobby” (Nov. 19, 2023).

Purging Compurgation

March 12th, 2024

“You could file briefs on a napkin right now and get it granted.”

Alan Lange & Tom Dawson, Kings of Torts 87 (2d ed. 2010) (quoting convicted former lawyer Zach Scruggs)

Back in the 1980s, I started to see expert witnesses stray into the business of psychoanalysis of corporate defendants. Perhaps it took place earlier; it seemed to be a tactic when I first started to try cases. Not only did expert witnesses wish to indict products as causes of plaintiffs’ harms, they wanted to indict the motives and intentions of the manufacturers. Such “motive” testimony should have been cleared from courtrooms by the basic rule of expert witness opinion testimony; namely, the warrant for expert witness testimony is that the subject matter is “beyond the ken” of the jury. Given that the tendentious witnesses had no special skills in divining motives, and that jurors were routinely called upon to infer motives, the offending testimony should have been readily quashed. Almost 100 years ago, Judge Learned Hand, confronted with similar argumentative opinion testimony, held, in his magisterial way, that “[a]rgument is argument whether in the box or at the bar, and its proper place is the last.”[1]

What I found when I started trying cases was that many states had hard rules on expert witnesses, but soft judges. In some litigations, plaintiffs’ counsel offered a witness, such as the late Marc Lappé, not only to assess motives, but also to make ethical pronouncements about defendants’ conduct. More typically, the ethical judgments came from historian witnesses or regulatory expert witnesses. Occasionally, expert witnesses on health effects issues offered psychoanalytic opinions as well. Plaintiffs’ counsel typically argued that Federal Rule of Evidence 704, which declared that “[a]n opinion is not objectionable just because it embraces an ultimate issue,” green-lighted their witnesses’ amateur or professional psychoanalysis. Defendants typically argued that the common law requirement that opinions be “beyond the ken” of jurors was carried forward in Rule 702’s requirement of relevant expertise, knowledge, and helpfulness to the trier of fact. State court analogues to these rules replicated the debate around the country.

The attempt to deprecate the intentions or motives of a party was not necessarily enhanced when the expert witness compurgator had some semblance of subject-matter expertise. In one case, a statistician and frequent testifier for the lawsuit industry, Martin Wells, expressed the opinion that the study at issue in the litigation “was seriously flawed by bad epidemiological practice. The combination of bias and poor epidemiologic practice is so rampant that one can easily conclude the study was intentionally designed to achieve a desired result regardless of the actual findings in the data.”[2]

Wells may have been entitled to his opinion about the quality of the study at issue, and if he had good grounds and a reliable methodology, perhaps he should have been permitted to share that opinion with a jury. The court, however, held that opinions based upon “inferences about the intent or motive of parties … lie outside the bounds of expert testimony, but are instead classic jury questions.”[3] The acceptability of Wells’s compurgation was not improved or made more admissible by coating it with a patina of expertise about interpreting studies. The trial court found that:

“Dr. Wells’ statements represent his subjective beliefs regarding an alleged bad motive or intent on the part of defendants or others who designed the study. The Court finds that his speculation about the reason for alleged methodological issues in the study are not the product of reliable methods, and will be excluded.”[4]

By 2011 or so, the case law interpreting common law and statutory rules about ethics and motive opinion generally tilted in favor of the defense.[5] Courts routinely excluded expert witness opinions about corporate knowledge, motivations, and intent, as irrelevant and inadmissible under Rule 702.

As though ethicist and historian testimony were not bad enough, imagine an economist offering testimony to deprecate lobbying efforts that are protected First Amendment speech. In one multi-district litigation, thinking that they could get away with almost anything, plaintiffs’ counsel offered just such an expert witness.

The expert witness at issue was an economist, Glen W. Harrison, of no particular distinction, who sought to serve as a compurgator in litigation. Harrison is an accomplished litigation witness, who was developed and trained by the Motley Rice firm and others in many tobacco cases.[6] What is clear is that he was deployed, in MDL 1535, to lobby the fact-finder inappropriately, without any real expertise in the material science, toxicology, or epidemiology issues in the litigation.

The essence of Glen Harrison’s opinion was that the “manufacturing industry” saw itself as having an “economic incentive” to engage in lobbying. This opinion was either tautologically or trivially true, but plaintiffs sought the opportunity to cast lawful (and constitutionally protected) lobbying as nefarious and tortious. A disinterested observer might have thought that the important issue was whether the lobbying was unlawful and thus inappropriate, but Harrison was not an expert on the law governing stakeholders’ submissions to agencies or to organizations that promulgate standards.

Harrison’s opinion on “inappropriateness” was based upon his inexpert factual review of documents, with occasional inferences or comments about whether the documents were incomplete, or inconsistent with other pieces of evidence. What was remarkable about this bold attempt to subvert the MDL trial process was that Harrison had absolutely no expertise or competence to discuss documents that involved issues of epidemiology, risk assessment, neurology, neuropsychology, toxicology, or exposure measurements. Harrison tried to squeeze out some bare relevance by commenting upon documents with his personal, lay observations that they seemed inaccurate, or that they were incomplete. Of course, a lawyer could equally well argue the point to the jury in summation. Clearly, the goal of proffering Harrison was to have a summation from a witness, with a pleasant Australian accent, in the middle of the plaintiffs’ case in chief. If you listened closely, you could hear a roar of disapproval from the Albany Rural Cemetery.[7]

For some time, the MDL 1535 judge winked at the plaintiffs’ and Harrison’s improper ploy to demonize lawful, appropriate industry conduct, and the MDL resolved before the parties obtained a ruling on Harrison’s proffered testimony. While the issue was before the MDL court, it appeared unmoved by considerations of the First Amendment or of the Noerr-Pennington doctrine,[8] or even the statutory invitation and right to comment upon proposed regulations.[9] Of course, both the manufacturing and lawsuit industries have a right to participate in notice-and-comment periods of rulemaking. The courtroom asymmetry threatened by Harrison’s proffered testimony was that plaintiffs’ counsel could comment upon defendants’ lobbying, but defense counsel had no equivalent opportunity to comment upon the lawsuit industry’s extensive rent-seeking.[10]


[1] Nichols v. Universal Pictures Corp., 45 F.2d 119, 123 (2d Cir. 1930) (Hand, J.).

[2] In re Trasylol Prods. Liab. Litig., 08-md-01928, 2010 WL 1489793, at *2 (S.D. Fla. Feb. 24, 2010) (quoting from Rule 26 report of Martin T. Wells ¶ 4, Van Steenburgh Affidavit, Exhibit B, Docket No. 1677).

[3] Id. at *8 (internal quotation marks omitted).

[4] Id. at *2.

[5] See Beck, “Experts Offering Evidence of Corporate Intent, Ethics, And The Like,” Drug & Device Law (May 19, 2011) (collecting cases). See, e.g., Kidder, Peabody & Co., Inc. v. IAG Int’l Acceptance Grp., N.V., 14 F. Supp. 2d 391, 404 (S.D.N.Y. 1998); Crown Cork, 2013 WL 978980, at *7 (excluding expert opinions of parties’ knowledge, state of mind, and intent); DePaepe v. General Motors Corp., 141 F.3d 714, 720 (7th Cir. 1998) (disallowing opinion of expert witness, who “lacked any scientific basis for an opinion about … motives,” about defendant’s failure to add safety measure in order to “save money”); In re Diet Drugs Prods. Liab. Litig., 2000 WL 876900, at *9 (E.D. Pa. June 20, 2000) (noting that “question of intent is a classic jury question and not one for experts”); Smith v. Wyeth-Ayerst Laboratories Co., 278 F. Supp. 2d 684, 700 (W.D.N.C. 2003) (expert witnesses may not opine about corporate intent and motive) (barring Dr. Moye from giving such testimony); In re Rezulin Prods. Liab. Litig., 309 F. Supp. 2d 531, 543, 545 n.37 (S.D.N.Y. 2004) (excluding opinions on intent and motive, as well as historical narrative gleaned from otherwise admissible documentary evidence); In re Baycol Prods. Liab. Litig., 495 F. Supp. 2d 977, 1001 (D. Minn. 2007) (holding expert witness to have exceeded proper proffer “to the extent that he speculates as to Bayer’s motive, intent, or state of mind”); 532 F. Supp. 2d 1029, 1069 (D. Minn. 2007) (“[A]n expert may not testify as to ethical issues or to his personal views”; “[t]he question of corporate intent is one for the jury, not for an expert”); Reece v. Astrazeneca Pharms., LP, 500 F. Supp. 2d 736, 744-46 (S.D. Ohio 2007) (advisability of tests; warnings needed for particular medical conditions; lack of methodology); In re Guidant Corp. Implantable Defibrillators Prods. Liab. Litig., 2007 WL 1964337, at *8 (D. Minn. June 29, 2007); Singh v. Edwards Lifesciences Corp., 2008 WL 5758387, ¶ELS 6 (Wash. Super. Snohomish Cty. Jan. 31, 2008); In re Fosamax Prods. Liab. Litig., 645 F. Supp. 2d 164, 192 (S.D.N.Y. 2009) (granting defendant’s motion to exclude testimony from Dr. Furberg about purported general ethical standards for conducting clinical trials); In re Xerox Corp. Sec. Litig., 746 F. Supp. 2d 402, 415 (D. Conn. 2010) (“Inferences about the intent or motive of parties or others lie outside the bounds of expert testimony.”) (internal citations omitted); In re Gadolinium-Based Contrast Agents Prods. Liab. Litig., 2010 WL 1796334, at *13 (N.D. Ohio May 4, 2010); In re Levaquin Prods. Liab. Litig., 2010 WL 11470977 (D. Minn. Nov. 10, 2010); Deutsch v. Novartis Pharms. Corp., 768 F. Supp. 2d 420, 467 (E.D.N.Y. 2011); In re Heparin Prods. Liab. Litig., 2011 WL 1059660, at *8 (N.D. Ohio March 21, 2011); Lemons v. Novartis Pharms. Corp., 849 F. Supp. 2d 608, 615 (W.D.N.C. 2012); Hill v. Novartis Pharms. Corp., 2012 WL 5451809, at *2 (E.D. Cal. Nov. 7, 2012); Georges v. Novartis Pharms. Corp., 2012 WL 9064768, at *13 (C.D. Cal. Nov. 2, 2012); Johnson v. Wyeth LLC, 2012 WL 1204081, at *3 (D. Ariz. Apr. 11, 2012); Pritchett v. I-Flow Corp., 2012 WL 1059948, at *6 (D. Colo. Mar. 28, 2012); Chandler v. Greenstone Ltd., 2012 WL 882756, at *1 (W.D. Wash. Mar. 14, 2012); Winter v. Novartis Pharms. Corp., 2012 WL 827305, at *5 (W.D. Mo. March 8, 2012); Earp v. Novartis Pharms. Corp., 2013 WL 4854488, at *4 (E.D.N.C. Sept. 11, 2013).

[6] See, e.g., Group Health Plan, Inc. v. Philip Morris, Inc., 188 F. Supp. 2d 1122 (D. Minn. 2002); Blue Cross & Blue Shield of N.J. v. Philip Morris, 178 F. Supp. 2d 198 (E.D.N.Y. 2001); Rent-A-Center West Inc.  v. Dept. of Revenue, 418 S.C. 320, 792 S.E.2d 260 (2016).

[7] Where Judge Learned Hand was buried.

[8] See, e.g., Video Int’l Prod., Inc. v. Warner-Amex Cable Comm., Inc., 858 F.2d 1075, 1084 (5th Cir. 1988) (applying Noerr-Pennington doctrine to bar use of evidence of lobbying in tort case); Hamilton v. AccuTek, 935 F. Supp. 1307, 1321 (E.D.N.Y. 1996) (granting summary judgment to gun makers on product liability and fraud claims based upon their efforts to influence federal policies by lawful lobbying); In re Municipal Stormwater Pond, No. 18-cv-3495 (JNE/KMM), 2019 U.S. Dist. LEXIS 227887, at *12 (D. Minn. Dec. 20, 2019) (dismissing claims of fraudulent misrepresentation against maker of coal-tar sealant on grounds that the Noerr-Pennington doctrine protected manufacturer’s lobbying before state and local governments); Eiser v. Brown & Williamson Tobacco Corp., 2005 Phila. Ct. Com. Pleas LEXIS 43, *20, 2005 WL 1323030 (2005) (invoking Noerr-Pennington doctrine to bar evidence of defendant manufacturer’s lobbying in products liability case). See generally James M. Sabovich, “Petition without Exception: Against the Fraud Exception to Noerr-Pennington Immunity from the Toxic Tort Perspective,” 17 Penn State Envt’l L. Rev. 101 (2008).

[9] See Admin. Procedures Act, 5 U.S.C. § 553; Attorney General’s Manual on the Administrative Procedure Act 31 (1947) (“[t]he objective should be to assure informed administrative action and adequate protection to private interests”).

[10] The lawsuit industry certainly exercises its rent-seeking through legitimate lobbying, and occasionally through illegitimate means. See U.S. v. Scruggs, 691 F.3d 660 (5th Cir. 2012).

A Citation for Jurs & DeVito’s Unlawful U-Turn

February 27th, 2024

Antic proposals abound in the legal analysis of expert witness opinion evidence. In the courtroom, the standards for admitting or excluding such evidence are found in judicial decisions or in statutes. When legislatures have specified standards for admitting expert witness opinions, courts have a duty to apply the standards to the facts before them. Law professors are, of course, untethered from either precedent or statute, and so we may expect chaos to ensue when they wade into disputes about the proper scope of expert witness gatekeeping.

Andrew Jurs teaches about science and the law at the Drake University Law School, and Scott DeVito is an associate professor of law at the Jacksonville University School of Law. Together, they have recently produced one of the most antic of antic proposals in a fatuous call for the wholesale revision of the law of expert witnesses.[1]

Jurs and DeVito rightly point out that since the Supreme Court, in Daubert,[2] waded into the dispute over whether the historical Frye decision survived the enactment of the Federal Rules of Evidence, we have seen lower courts apply the legal standard inconsistently and sometimes incoherently. These authors, however, like many other academics, incorrectly label one or the other standard, Frye or Daubert, as being stricter than the other. Applying the labels of stricter and weaker standards ignores that they are standards that measure completely different things. Frye advances a sociological standard, and a Frye test challenge can be answered by conducting a survey. Rule 702, as interpreted by Daubert, and as since revised and adopted by the Supreme Court and Congress, is an epistemic standard. Jurs and DeVito, like many other legal academic writers, apply a single adjective to standards that measure two different, incommensurate things. The authors’ repetition of the now 30-plus year-old mistake is a poor start for a law review article that sets out to reform the widespread inconsistency in the application of Rule 702, in federal and in state courts.

In seeking greater adherence to the actual rule, and consistency among decisions, Jurs and DeVito might have urged for judicial education, or blue-ribbon juries, or science courts, or greater use of court-appointed expert witnesses. Instead, they have put their marker down on abandoning all meaningful gatekeeping. Jurs and DeVito are intent upon repairing the inconsistency and incoherency in the application of Daubert by removing the standard altogether.

“To resolve the problem, we propose that the Courts replace the multiple Daubert factors with a single factor—testability—and that once the evidence meets this standard the judge should provide the jury with a proposed jury instruction to guide their analysis of the fact question addressed by the expert evidence.”[3]

In other words, because lower federal courts have routinely ignored the actual statutory language of Rule 702, and Supreme Court precedents, Jurs and DeVito would have courts invent a new standard that virtually excludes nothing, as long as someone can imagine a test for the asserted opinion. Remarkably, although they carry on about the “rule of law,” the authors fail to mention that judges have no authority to ignore the requirements of Rule 702. And perhaps even more stunning is that they have advanced their nihilistic proposal in the face of the remedial changes in Rule 702, designed to address judicial lawlessness in ignoring previously enacted versions of Rule 702. This antic proposal would bootstrap previous judicial “flyblowing” of a Congressional mandate into a prescription for abandoning any meaningful standard. They have articulated the Cole Porter standard: anything goes. Any opinion that “can be tested” is science; end of discussion. The rest is for the jury to decide as a question of fact, subject to the fact finder’s credibility determinations. This would be a Scott v. Sandford rule[4] for scientific validity; science has no claims of validity that the law is bound to respect.

Jurs and DeVito attempt a cynical trick. They argue that they would fix the problem of “an unpredictable standard” by reverting to what they say is Daubert’s first principle of ensuring the reliability of expert witness testimony, and limiting the evidentiary display at trial to “good science.” Cloaking their nihilism, the authors say that they want to promote “good science,” but advocate the admissibility of any and every opinion, as long as it is theoretically “testable.” In order to achieve this befuddled goal, they simply redefine scientific knowledge as “essentially” equal to testable propositions.[5]

Jurs and DeVito marshal evidence of judicial ignorance of key aspects of scientific method, such as error rate. We can all agree that judges frequently misunderstand key scientific concepts, but their misunderstandings and misapplications do not mean that the concepts are unimportant or unnecessary. Many judges seem unable to deliver an opinion that correctly defines p-value or confidence interval, but their inabilities do not allow us to dispense with the need to assess random error in statistical tests. Our faint-hearted authors never explain why the prevalence of judicial error must be a counsel of despair that drives us to bowdlerize scientific evidence into something it is not. We may simply need better training for judges, or better assistance for them in addressing complex claims. Ultimately, we need better judges.

For those judges who have taken their responsibility seriously, and who have engaged with the complexities of evaluating validity concerns raised in Rule 702 and 703 challenges, the Jurs and DeVito proposal must seem quite patronizing. The “Daubert” factors are simply too complex for you, so we will just give you crayons, or a single, meaningless factor that you cannot screw up.[6]

The authors set out a breezy, selective review of statements by a few scientists and philosophers of science. Rather than supporting their extreme reductionism, this review reveals that science is much more than identifying a “testable” proposition. Indeed, the article’s discussion of the philosophy and practice of science weighs strongly against the authors’ addled proposal.[7]

The authors, for example, note that Sir Isaac Newton emphasized the importance of empirical method.[8] Contrary to the article’s radical reductionism, the authors note that Sir Karl Popper and Albert Einstein stressed that the failure to obtain a predicted experimental result may render a theory “untenable,” which of course requires data and valid tests and inferences to assess. Quite a bit of motivated reasoning has led Jurs and DeVito to confuse a criterion of testability with the whole enterprise of science, and to ignore the various criteria of validity for collecting data, testing hypotheses, and interpreting results.

The authors suggest that their proposal will limit the judicial inquiry to the legal question of reliability, but this suggestion is mere farce. Reliability means obtaining the same or sufficiently similar results upon repeated testing, but these authors abjure testing itself. Furthermore, reliability as contemplated by the Supreme Court, in 1993, and by FRE 702 ever since, has meant the validity of the actual test that an expert witness argues in support of his or her opinion or claims.

Whimsically, and without evidence, Jurs and DeVito claim that their radical abandonment of gatekeeping will encourage scientists, in “fields that are testable, but not yet tested,” to perform “real, objective, and detailed research.” Their proposal, however, works to remove any such incentive, because untested but testable research becomes freely admissible. Why would the lawsuit industry fund studies, which might not support its litigation claims, when the industry’s witnesses need only imagine a possible test to advance their claims, without the potential embarrassment of facts? The history of modern tort law teaches us that cheap speculation would quickly push out actual scientific studies.

The authors’ proposal would simply open the floodgates of speculation, conjecture, and untested hypotheses, and leave the rest to the vagaries of trials, mostly in front of jurors untrained in evaluating scientific and statistical evidence. Admittedly, some incurious and incompetent gatekeepers and triers of fact will be relieved to know that they will not have to evaluate actual scientific evidence, because it will have been eliminated by the Jurs and DeVito proposal to make mere testability the touchstone of admissibility.

To be sure, in Aristotelian terms, testability is logically and practically prior to testing, but these relationships do not justify holding out testability as the “essence” of science, its alpha and omega.[9] Of course, one must have an hypothesis to engage in hypothesis testing, but science lies in the clever interrogation of nature, guided by the hypothesis. The scientific process lies in answering the question, not simply in formulating the question.

As for the authors’ professed concern about the “rule of law,” readers should note that the Jurs and DeVito article completely ignores the remedial amendment to Rule 702, which went into effect on December 1, 2023, to address the myriad inconsistencies and failures to engage in required gatekeeping of expert witness opinion testimony.[10]

The new Rule 702 is now law, with its remedial clarification that the proponent of expert witness opinion must show the court that the opinion is sufficiently supported by facts or data, Rule 702(b), and that the opinion “reflects a reliable application of the principles and methods to the facts of the case,” Rule 702(d). The Rule prohibits deferring the evaluation of sufficiency of support, or reliability of the application of method, to the trier of fact; there is no statutory support for suggesting that these inquiries always or usually go to “weight and not admissibility.”

The Jurs and DeVito proposal would indeed be a U-Turn in the law of expert witness opinion testimony. Rather than promote the rule of law, they have issued an open, transparent call for licentiousness in the adjudication of scientific and technical issues.


[1] Andrew Jurs & Scott DeVito, “A Return to Rationality: Restoring the Rule of Law After Daubert’s Disastrous U-Turn,” 54 New Mexico L. Rev. 164 (2024) [cited below as U-Turn].

[2] Daubert v. Merrell Dow Pharmaceuticals, Inc., 509 U.S. 579 (1993).

[3] U-Turn at 164, Abstract.

[4] 60 U.S. 393 (1857).

[5] U-Turn at 167.

[6] U-Turn at 192.

[7] See, e.g., U-Turn at 193 n.179, citing David C. Gooding, “Experiment,” in W.H. Newton-Smith, ed., A Companion to the Philosophy of Science 117 (2000) (emphasizing the role of actual experimentation, not the possibility of experimentation, in the development of science).

[8] U-Turn at 194.

[9] See U-Turn at 196.

[10] See Supreme Court Order, at 3 (Apr. 24, 2023); Supreme Court Transmittal Package (Apr. 24, 2023).

The Proper Study of Mankind

December 24th, 2023

“Know then thyself, presume not God to scan;

The proper study of Mankind is Man.”[1]

 

Kristen Ranges recently earned her law degree from the University of Miami School of Law, and her doctorate in Environmental Science and Policy, from the University of Miami Rosenstiel School of Marine, Atmospheric, and Earth Science. Ranges’ dissertation was titled Animals Aiding Justice: The Deepwater Horizon Oil Spill and Ensuing Neurobehavioral Impacts as a Case Study for Using Animal Models in Toxic Tort Litigation – A Dissertation.[2] At first blush, Ranges would seem to be a credible interlocutor in the never-ending dispute over the role of whole animal toxicology (and in vitro toxicology) in determining human causation in tort litigation. Her dissertation title is, however, as Martin Short would say, a bit of a tell. Zebrafish become sad when exposed to oil spills, as do we all.

Ranges recently published a spin-off of her dissertation as a law review article with one of her professors. “Vermin of Proof: Arguments for the Admissibility of Animal Model Studies as Proof of Causation in Toxic Tort Litigation.”[3] Arguments for; no arguments against. We can thus understand this is an advocacy piece, which is fair enough. The paper was not designed or titled to mislead anyone into thinking it would be a consideration of arguments for and against extrapolation from (non-human) animal studies to human beings. Perhaps you will think it churlish of me to point out that animal studies will rarely be admissible as evidence. They come into consideration in legal cases only through expert witnesses’ reliance upon them. So the issue is not whether animal studies are admissible, but rather whether expert witness opinion testimony that relies solely or excessively on animal studies for purposes of inferring causation is admissible under the relevant evidentiary rules. Talking about the admissibility of animal model studies signals, if nothing else, a serious lack of familiarity with the relevant evidentiary rules.

Ranges’ law review article is clearly, and without subtlety, an advocacy piece. She argues:

“However, judges, scholars, and other legal professionals are skeptical of the use of animal studies because of scientific and legal concerns, which range from interspecies disparities to prejudice of juries. These concerns are either unfounded or exaggerated. Animal model studies can be both reliable and relevant in toxic tort cases. Given the Federal Rules of Evidence, case law relevant to scientific evidence, and one of the goals of tort law-justice-judges should more readily admit these types of studies as evidence to help plaintiffs meet the burden of proof in toxic tort litigation.”[4]

For those of you who labor in this vineyard, I would suggest you read Ranges’ article and judge for yourself. What I see is a serious lack of scientific evidence for her claims, and a serious misunderstanding of the relevant law. One might ask, for starters, putting aside the Agency’s epistemic dilution, whether there are any IARC category I (“known”) carcinogens based solely upon animal evidence. Or has the U.S. Food & Drug Administration ever approved a medication as reasonably safe and effective based upon animal studies alone?

Every dog owner and lover has likely been told by a veterinarian, or the Humane Society, that we should resist their lupine entreaties and withhold chocolate, raisins, walnuts, avocados, and certain other human foods. Despite their obvious intelligence and capacity for affection, when it comes to toxicology, dogs are not people, although some people act like the less reputable varieties of dogs.

Back in 1985, in connection with the Agent Orange litigation, the late Judge Jack Weinstein wrote what was correct then, and is even more so today: “laboratory animal studies are generally viewed with more suspicion than epidemiological studies, because they require making the assumption that chemicals behave similarly in different species.”[5] Judge Weinstein was no push-over for strident defense counsel or expert witnesses, but the legal consequences were nonetheless obvious to him when he looked carefully at the animal studies that plaintiffs’ expert witnesses claimed supported their opinions: “[A]nimal studies are of so little probative value as to be inadmissible. They cannot be a predicate for an opinion under Rule 703.”[6] One of the several disconnects between the plaintiffs’ expert witnesses’ animal studies and the human diseases claimed was the disparity of dose and duration between the relied-upon studies and the servicemen claimants. Judge Weinstein observed that when the hand waving stopped, “[t]here is no evidence that plaintiffs were exposed to the far higher concentrations involved in both animal and industrial exposure studies.”[7]

Ranges and Owley unfairly deprecate the Supreme Court’s treatment of animal evidence in the 1997 Joiner opinion.[8] Mr. Joiner had been an electrician employed by a small city in Georgia, where he experienced dermal exposure, over several years, to polychlorinated biphenyls (PCBs), chemicals found in electrical transformer coolant. He alleged that he had developed small-cell lung cancer from his occasional occupational exposure. In the district court, a careful judge excluded the plaintiffs’ expert witnesses, who relied heavily upon animal studies and who cherry-picked and distorted the available epidemiology.[9] The Court of Appeals reversed, in an unsigned, non-substantive opinion that interjected an asymmetric standard of review.[10]

After granting review, the Supreme Court engaged with the substantive validity issues passed over by the intermediate appellate court. In addressing the plaintiff’s expert witnesses’ reliance upon animal studies, the Court was struck by an extrapolation from a different species, a different route of administration, a different dose, a different duration of exposure, and a different disease.[11] Joiner was an adult human whose alleged exposure to PCBs was far less than the exposure of the baby mice that received injections of PCBs in high concentrations. The mice developed alveologenic adenomas, a rare tumor that is usually benign, not malignant.[12] The Joiner Court recognized that these multiple extrapolations were a bridge to nowhere; it reversed the Court of Appeals and reinstated the judgment of the district court. What is particularly salient about the Joiner decision, and about which you will find no discussion in the law review paper by Ranges and Owley, is how well the Joiner opinion has held up over the quarter of a century since it was decided. Today, in the waning moments of 2023, there is still no valid, scientifically sound support for the claim that the sort of exposure Mr. Joiner had can cause small-cell lung cancer.[13]

Perhaps the most egregious lapses in scholarship occur when Ranges, a newly minted scientist, and her co-author, a full professor of law, write:

“For example, Bendectin, an antinausea medication prescribed to pregnant women, caused a slew of birth defects (hence its nickname ‘The Second Thalidomide’).”[14]

I had to re-read this sentence many times to make sure I was not hallucinating. Ranges’ and Owley’s statement is, of course, demonstrably false. A double whopper, at least, and a jarring deviation from the standard of scholarly care.

But their statement is footnoted, you say. Here is what the cited article, footnote 40 in “Vermin of Proof,” says:

“RESULTS: The temporal trends in prevalence rates for specific birth defects examined from 1970 through 1992 did not show changes that reflected the cessation of Bendectin use over the 1980–84 period. Further, the NVP hospitalization rate doubled when Bendectin use ceased.

CONCLUSIONS: The population results of the ecological analyses complement the person-specific results of the epidemiological analyses in finding no evidence of a teratogenic effect from the use of Bendectin.”[15]

So the cited source actually says the exact opposite of what the authors assert. Apparently, students on law review at the Georgetown University Law Center do not check citations for accuracy. Not only was the statement wrong in 1993, when the Supreme Court decided the famous Daubert case; it was wrong 20 years later, in 2013, when the United States Food and Drug Administration (FDA) approved Diclegis, a combination of doxylamine succinate and pyridoxine hydrochloride, the essential ingredients in Bendectin, for sale in the United States, for pregnant women experiencing nausea and vomiting.[16] The return of Bendectin to the market, although under a different name, was nothing less than a triumph of science over the will of the lawsuit industry.[17]

Channeling the likes of plaintiffs’ expert witness Carl Cranor (whom they cite liberally and credulously), Ranges and Owley argue for a vague “weight of the evidence” (WOE) methodology, in which several inconclusive and lighter-than-air pieces of evidence somehow magically combine, as in cold fusion, to warrant a conclusion of causation. Others have gone down this dubious path before, but these authors’ embrace of the plaintiffs’ expert witnesses’ opinions in the Bendectin litigation reveals the insubstantiality and the invalidity of their method.[18] As Professor Ronald Allen put the matter:

“Given the weight of evidence in favor of Bendectin’s safety, it seems peculiar to argue for mosaic evidence [WOE] from a case in which it would have plainly been misleading.”[19]

It surely seems like a reductio ad absurdum of the proposed methodology.

One thing these authors get right is that most courts disparage and exclude expert witness opinion that relies exclusively or excessively upon animal toxicology.[20] They wrongly chastise these courts, however, for ignoring scientific opinion. In 2005, the Teratology Society issued a position paper on causation in teratology-related litigation,[21] in which the Society specifically addressed the authors’ claims:

“6. Human data are required for conclusions that there is a causal relationship between an exposure and an outcome in humans. Experimental animal data are commonly and appropriately used in establishing regulatory exposure limits and are useful in addressing biologic plausibility and mechanism questions, but are not by themselves sufficient to establish causation in a lawsuit. In vitro data may be helpful in exploring mechanisms of toxicity but are not by themselves evidence of causation.”[22]

Ranges and Owley are flummoxed that courts exclude expert witnesses who have relied upon animal studies when regulatory agencies use such studies with abandon. The case law on the distinction between precautionary standards in regulation and causation standards in tort law is clear, and explains the difference in approach, but these authors are determined to ignore the obvious difference.[23] The Teratology Society emphasized what should be hornbook law; namely, that regulatory standards for testing and warnings are not particularly germane to tort law standards for causation:

“2. The determination of causation in a lawsuit is not the same as a regulatory determination of a protective level of exposure. If a government agency has determined a regulatory exposure level for a chemical, the existence of that level is not evidence that the chemical produces toxicity in humans at that level or any other level. Regulatory levels use default assumptions that are improper in lawsuits. One such assumption is that humans will be as sensitive to the toxicity of a chemical as is the most sensitive experimental animal species. This assumption may be very useful in regulation but is not evidence that exposure to that chemical caused an adverse outcome in an individual plaintiff. Regulatory levels often incorporate uncertainty factors or margins of exposure. These factors may result in a regulatory level much lower than an exposure level shown to be harmful in any organism and are an additional reason for the lack of utility of regulatory levels in causation considerations.”[24]
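The Society’s point about uncertainty factors is easy to illustrate with the standard regulatory arithmetic (the numbers here are hypothetical and mine, not the Society’s). A regulatory reference dose is typically derived by dividing the no-observed-adverse-effect level (NOAEL) from the most sensitive animal study by stacked safety factors:

RfD = NOAEL ÷ (UF interspecies × UF intraspecies) = 50 mg/kg/day ÷ (10 × 10) = 0.5 mg/kg/day.

On those assumptions, the regulatory level sits a full two orders of magnitude below any exposure shown to harm any organism, which is why exceeding a regulatory level is not, without much more, evidence that an exposure caused a particular plaintiff’s disease.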

The suggestion from Ranges and Owley that the judicial treatment of reliance upon animal studies is based upon ossified, ancient precedent, prejudice, and uncritical acceptance of defense counsel’s unsupported argument is simply wrong. There are numerous discussions of the difficulty of extrapolating teratogenicity from animal data to humans,[25] and ample basis for criticism of the glib extension of rodent carcinogenicity to humans.[26]

Ranges and Owley ignore the extensive scientific literature questioning extrapolation from high-exposure rodent models to the much lower exposures experienced by humans.[27] The invalidity of extrapolation can result in both false positives and false negatives. The thalidomide case is itself a compelling example of the failure of animal testing. Thalidomide was tested on pregnant rats and rabbits without detecting teratogenicity; indeed, most animal species do not metabolize thalidomide, or exhibit its teratogenicity, as humans do. Animal models simply do not have a sufficient positive predictive value to justify a conclusion of causation in humans, even if we accept a precautionary-principle role for such animal testing in regulatory settings.[28]
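The claim about positive predictive value can be made concrete with hypothetical numbers of my own choosing (not drawn from the cited literature). Suppose a rodent bioassay detects 80 percent of true human carcinogens, falsely flags 40 percent of chemicals that are not human carcinogens, and 10 percent of the chemicals tested truly are human carcinogens. Then the predictive value of a positive animal result is:

PPV = (0.8 × 0.1) ÷ [(0.8 × 0.1) + (0.4 × 0.9)] = 0.08 ÷ 0.44 ≈ 18%.

On those assumptions, fewer than one in five positive animal results would mark a true human carcinogen, far short of “more likely than not.”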

As improvident as Ranges’ pronouncements may be, finding her message amplified by Professor Ed Cheng, on his podcast series Excited Utterances, was even more disturbing. In November 2023, Cheng interviewed Kristen Ranges in an episode titled “Vermin of Proof,” in which he gave Ranges a chance to reprise her complaints about the judiciary’s handling of animal evidence, without much in the way of specificity, and with some credulous cheerleading to aid and abet. In his epilogue, Cheng wondered why toxicologic evidence is disfavored when such evidence is routinely used by scientists and regulators. What Cheng misses is that regulators use toxicologic evidence for regulation, not for assessments of human causation, and that the two enterprises are quite different. The regulatory exercise goes something like asking about the stall speed of a pig. It does not matter that pigs cannot fly; we skip that fact and press on to ask what the pig’s takeoff and stall speeds are.

Seventy years ago, no less than Sir Austin Bradford Hill observed that:

“We may subject mice, or other laboratory animals, to such an atmosphere of tobacco smoke that they can — like the old man in the fairy story — neither sleep nor slumber; they can neither breed nor eat. And lung cancers may or may not develop to a significant degree. What then? We may have thus strengthened the evidence, we may even have narrowed the search, but we must, I believe, invariably return to man for the final proof or proofs.”[29]


[1] Alexander Pope, “An Essay on Man” (1733), in Robin Sowerby, ed., Alexander Pope: Selected Poetry and Prose at 153 (1988).

[2] Kristen Ranges, Animals Aiding Justice: The Deepwater Horizon Oil Spill and Ensuing Neurobehavioral Impacts as a Case Study for Using Animal Models in Toxic Tort Litigation – A Dissertation (2023).

[3] Kristen Ranges & Jessica Owley, “Vermin of Proof: Arguments for the Admissibility of Animal Model Studies as Proof of Causation in Toxic Tort Litigation,” 34 Georgetown Envt’l L. Rev. 303 (2022) [Vermin]

[4] Vermin at 303.

[5] In re Agent Orange Prod. Liab. Litig., 611 F. Supp. 1223, 1241 (E.D.N.Y. 1985), aff’d, 818 F.2d 187 (2d Cir. 1987), cert. denied, 487 U.S. 1234 (1988).

[6] Id.

[7] Id.

[8] General Elec. Co. v. Joiner, 522 U.S. 136, 144 (1997) [Joiner]

[9] Joiner v. General Electric Co., 864 F. Supp. 1310 (N.D. Ga. 1994).

[10] Joiner v. General Electric Co., 78 F.3d 524 (11th Cir. 1996), rev’d, 522 U.S. 136 (1997).

[11] Joiner, 522 U.S. at 144-45.

[12] See Leonid Roshkovan, Jeffrey C. Thompson, Sharyn I. Katz, Charuhas Deshpande, Taylor Jenkins, Anna K. Nowak, Roslyn Francis, Carole Dennie, Dominique Fabre, Sunil Singhal, and Maya Galperin-Aizenberg, “Alveolar adenoma of the lung: multidisciplinary case discussion and review of the literature,” 12 J. Thoracic Dis. 6847 (2020).

[13] See “How Have Important Rule 702 Holdings Held Up With Time?” (Mar. 20, 2015); “The Joiner Finale” (Mar. 23, 2015).

[14] Vermin at 312.

[15] Jeffrey S. Kutcher, Arnold Engle, Jacqueline Firth & Steven H. Lamm, “Bendectin and Birth Defects II: Ecological Analyses,” 67 Birth Defects Research Part A: Clinical and Molecular Teratology 88, 88 (2003).

[16] See FDA News Release, “FDA approves Diclegis for pregnant women experiencing nausea and vomiting,” (April 8, 2013).

[17] See Gideon Koren, “The Return to the USA of the Doxylamine-Pyridoxine Delayed Release Combination (Diclegis®) for Morning Sickness — A New Morning for American Women,” 20 J. Popul. Ther. Clin. Pharmacol. e161 (2013).

[18] Michael D. Green, “Pessimism About Milward,” 3 Wake Forest J. Law & Policy 41, 62-63 (2013); Susan Haack, “Irreconcilable Differences? The Troubled Marriage of Science and Law,” 72 Law & Contemporary Problems 1, 17 (2009); Susan Haack, “Proving Causation: The Holism of Warrant and the Atomism of Daubert,” 4 J. Health & Biomedical Law 273, 274-78 (2008).

[19] Ronald J. Allen & Esfand Nafisi, “Daubert and its Discontents,” 76 Brooklyn L. Rev. 132, 148 (2010). 

[20] See In re Zoloft (Sertraline Hydrochloride) Prods. Liab. Litig., 26 F. Supp. 3d 466, 475 (E.D. Pa. 2014) (noting that “causation opinions based primarily upon in vitro and live animal studies are unreliable and do not meet the Daubert standard”), aff’d, 858 F.3d 787 (3d Cir. 2017); Chapman v. Procter & Gamble Distrib., LLC, 766 F.3d 1296, 1308 (11th Cir. 2014) (affirming exclusion of testimony based on “secondary methodologies,” including animal studies, which offer “insufficient proof of general causation”); The Sugar Ass’n v. McNeil-PPC, Inc., 2008 WL 11338092, at *3 (C.D. Calif. July 21, 2008) (finding that plaintiffs’ expert witnesses, including Dr. Abou-Donia, “failed to provide the requisite analytical support for the extrapolation of their Five Opinions from rats to humans”); In re Silicone Gel Breast Implants Prods. Liab. Litig., 318 F. Supp. 2d 879, 891 (C.D. Cal. 2004) (observing that failure to compare similarities and differences across animals and humans could lead to the exclusion of opinion evidence); Cagle v. The Cooper Companies, 318 F. Supp. 2d 879, 891 (C.D. Calif. 2004) (citing Joiner for the observation that animal studies are not generally admissible when contrary epidemiologic studies are available, and detailing significant disadvantages in relying upon animal studies, such as (1) differences in absorption, distribution, and metabolism; (2) the unrealistic, non-physiological exposures used in animal studies; and (3) the use of unverified assumptions about dose-response); Wills v. Amerada Hess Corp., No. 98 CIV. 7126(RPP), 2002 WL 140542, at *12 (S.D.N.Y. Jan. 31, 2002) (faulting expert’s reliance on animal studies because there was no evidence plaintiff had injected suspected carcinogen in same manner as studied animals, or at same dosage levels), aff’d, 379 F.3d 32 (2d Cir. 2004) (Sotomayor, J.); Bourne v. E.I. du Pont de Nemours & Co., 189 F. Supp. 2d 482, 501 (S.D. W. Va. 2002) (Benlate and birth defects), aff’d, 85 F. App’x 964 (4th Cir.), cert. denied, 543 U.S. 917 (2004); Magistrini v. One Hour Martinizing Dry Cleaning, 180 F. Supp. 2d 584, 593 (D.N.J. 2002) (noting that “[a]nimal bioassays are of limited use in determining whether a particular chemical causes a particular disease, or type of cancer, in humans”); Soutiere v. BetzDearborn, Inc., No. 2:99-CV-299, 2002 WL 34381147, at *4 (D. Vt. July 24, 2002) (holding expert’s evidence inadmissible when “[a]t best there are animal studies that suggest a link between massive doses of [the substance in question] and the development of certain kinds of cancers, such that [the substance in question] is listed as a ‘suspected’ or ‘probable’ human carcinogen”); Glastetter v. Novartis Pharms. Corp., 252 F.3d 986, 991 (8th Cir. 2001); Hollander v. Sandoz Pharm. Corp., 95 F. Supp. 2d 1230, 1238 (W.D. Okla. 2000), aff’d, 289 F.3d 1193, 1209 (10th Cir. 2002) (rejecting the relevance of animal studies to causation arguments in the circumstances of the case); Allison v. McGhan Medical Corp., 184 F.3d 1300, 1313-14 (11th Cir. 1999); Raynor v. Merrell Pharms. Inc., 104 F.3d 1371, 1375-77 (D.C. Cir. 1997) (observing that animal studies are unreliable, especially when “sound epidemiological studies produce opposite results from non-epidemiological ones, the rate of error of the latter is likely to be quite high”); Lust v. Merrell Dow Pharms., Inc., 89 F.3d 594, 598 (9th Cir. 1996); Barrett v. Atlantic Richfield Co., 95 F.3d 375 (5th Cir. 1996) (extrapolation from a rat study was speculation); Nat’l Bank of Comm. v. Dow Chem. Co., 965 F. Supp. 1490, 1527 (E.D. Ark. 1996) (“because of the difference in animal species, the methods and routes of administration of the suspect chemical agent, maternal metabolisms and other factors, animal studies, taken alone, are unreliable predictors of causation in humans”), aff’d, 133 F.3d 1132 (8th Cir. 1998); Hall v. Baxter Healthcare Corp., 947 F. Supp. 1387, 1410-11 (D. Or. 1996) (with the help of court-appointed technical advisors, observing that animal studies taken alone fail to predict human disease reliably); Daubert v. Merrell Dow Pharms., Inc., 43 F.3d 1311, 1322 (9th Cir. 1995) (on remand from the Supreme Court, with directions to apply an epistemic standard derived from Rule 702 itself); Sorensen v. Shaklee Corp., 31 F.3d 638, 650 (8th Cir. 1994) (affirming exclusion of expert witness opinions based upon animal mutagenicity data not germane to the claimed harm); Elkins v. Richardson-Merrell, Inc., 8 F.3d 1068, 1073 (6th Cir. 1993); Wade-Greaux v. Whitehall Labs., Inc., 874 F. Supp. 1441, 1482 (D.V.I. 1994), aff’d, 46 F.3d 1120 (3d Cir. 1994) (per curiam); Renaud v. Martin Marietta Corp., Inc., 972 F.2d 304, 307 (10th Cir. 1992) (“The etiological evidence proffered by the plaintiff was not sufficiently reliable, being drawn from tests on non-human subjects without confirmatory epidemiological data.”) (“Dr. Jackson performed no calculations to determine whether the dose or route of administration of antidepressants to rats and monkeys in the papers that she cited in her report was equivalent to or substantially similar to human beings taking prescribed doses of Prozac.”); Bell v. Swift Adhesives, Inc., 804 F. Supp. 1577, 1579-81 (S.D. Ga. 1992) (excluding expert opinion of Dr. Janette Sherman, who opined that methylene chloride caused liver cancer, based largely upon animal studies); Conde v. Velsicol Chem. Corp., 804 F. Supp. 972, 1025-26 (S.D. Ohio 1992) (noting that epidemiology is “the primary generally accepted methodology for demonstrating a causal relation between a chemical compound and a set of symptoms or a disease”), aff’d, 24 F.3d 809 (6th Cir. 1994); Turpin v. Merrell Dow Pharm., Inc., 959 F.2d 1349, 1360-61 (6th Cir. 1992) (“The analytical gap between the [animal study] evidence presented and the inferences to be drawn on the ultimate issue of human birth defects is too wide. Under such circumstances, a jury should not be asked to speculate on the issue of causation.”); Brock v. Merrell Dow Pharm., 874 F.2d 307, 313 (5th Cir. 1989) (noting the “very limited usefulness of animal studies when confronted with questions of toxicity”); Richardson v. Richardson-Merrell, Inc., 857 F.2d 823, 830 (D.C. Cir. 1988) (“Positive results from in vitro studies may provide a clue signaling the need for further research, but alone do not provide a satisfactory basis for opining about causation in the human context.”); Lynch v. Merrell-Nat’l Labs., 830 F.2d 1190, 1194 (1st Cir. 1987) (“Studies of this sort [animal studies], singly or in combination, do not have the capability of proving causation in human beings in the absence of any confirmatory epidemiological data.”). See also Merrell Dow Pharms., Inc. v. Havner, 953 S.W.2d 706, 730 (Tex. 1997); DePyper v. Navarro, No. 83-303467-NM, 1995 WL 788828, at *34 (Mich. Cir. Ct. Nov. 27, 1995), aff’d, No. 191949, 1998 WL 1988927 (Mich. Ct. App. Nov. 6, 1998); Nelson v. American Sterilizer Co., 566 N.W.2d 671 (Mich. Ct. App. 1997) (high-dose animal studies not reliable). But see Ambrosini v. Labarraque, 101 F.3d 129, 137-40 (D.C. Cir. 1996); Dyson v. Winfield, 113 F. Supp. 2d 44, 50-51 (D.D.C. 2000).

[21] Teratology Society Public Affairs Committee, “Position Paper: Causation in Teratology-Related Litigation,” 73 Birth Defects Research (Part A) 421 (2005) [Teratology Position Paper]

[22] Id. at 423.

[23] See “Improper Reliance Upon Regulatory Risk Assessments in Civil Litigation” (Mar. 19, 2023) (collecting cases).

[24] Teratology Position Paper at 422-423.

[25] See, e.g., Gideon Koren, Anne Pastuszak & Shinya Ito, “Drugs in Pregnancy,” 338 New England J. Med. 1128, 1131 (1998); Louis Lasagna, “Predicting Human Drug Safety from Animal Studies: Current Issues,” 12 J. Toxicological Sci. 439, 442-43 (1987).

[26] Bruce N. Ames & Lois S. Gold, “Too Many Rodent Carcinogens: Mitogenesis Increases Mutagenesis,” 249 Science 970, 970 (1990) (noting that chronic irritation induced by many chemicals at high exposures is itself a cause of cancer in rodent models); Bruce N. Ames & Lois Swirsky Gold, “Environmental Pollution and Cancer: Some Misconceptions,” in Jay H. Lehr, ed., Rational Readings on Environmental Concerns 151, 153 (1992); Mary Eubanks, “The Danger of Extrapolation: Humans and Rodents Differ in Response to PCBs,” 112 Envt’l Health Persps. A113 (2004).

[27] Andrea Gawrylewski, “The Trouble with Animal Models: Why did human trials fail?” 21 The Scientist 44 (2007); Michael B. Bracken, “Why animal studies are often poor predictors of human reactions to exposure,” 101 J. Roy. Soc. Med. 120 (2008); Fiona Godlee, “How predictive and productive is animal research?” 348 Brit. Med. J. g3719 (2014); John P. A. Ioannidis, “Extrapolating from Animals to Humans,” 4 Science Translational Med. 15 (2012); Pandora Pound & Michael Bracken, “Is animal research sufficiently evidence based to be a cornerstone of biomedical research?” 348 Brit. Med. J. g3387 (2014); Pandora Pound, Shah Ebrahim, Peter Sandercock, Michael B. Bracken, and Ian Roberts, “Where is the evidence that animal research benefits humans?” 328 Brit. Med. J. 514 (2004) (writing on behalf of the Reviewing Animal Trials Systematically (RATS) Group).

[28] See Ray Greek, Niall Shanks, and Mark J. Rice, “The History and Implications of Testing Thalidomide on Animals,” 11 J. Philosophy, Sci. & Law 1, 19 (2011).

[29] Austin Bradford Hill, “Observation and Experiment,” 248 New Engl. J. Med. 995, 999 (1953).

The Role of Peer Review in Rule 702 and 703 Gatekeeping

November 19th, 2023

“There is no expedient to which man will not resort to avoid the real labor of thinking.”
              Sir Joshua Reynolds (1723-92)

Some courts appear to duck the real labor of thinking, and the duty to gatekeep expert witness opinions, by deferring to expert witnesses who advert to their reliance upon peer-reviewed published studies. Does the law really support such deference, especially when problems with the relied-upon studies are revealed in discovery? A careful reading of the Supreme Court’s decision in Daubert, and of the Reference Manual on Scientific Evidence, provides no support for admitting expert witness opinion testimony that relies upon peer-reviewed published studies when the studies are invalid or are based upon questionable research practices.[1]

In Daubert v. Merrell Dow Pharmaceuticals, Inc.,[2] the Supreme Court suggested that peer review of studies relied upon by a challenged expert witness should be a factor in determining the admissibility of that expert witness’s opinion. In thinking about the role of peer-review publication in expert witness gatekeeping, it is helpful to remember the context of how and why the Supreme Court was talking about peer review in the first place. In the trial court, the Daubert plaintiff had proffered an expert witness opinion that featured reliance upon an unpublished reanalysis of published studies. On the defense motion, the trial court excluded the claimant’s witness,[3] and the Ninth Circuit affirmed.[4] The intermediate appellate court expressed its view that unpublished, non-peer-reviewed reanalyses were deviations from generally accepted scientific discourse, and that other appellate courts, considering the alleged risks of Bendectin, had refused to admit opinions based upon unpublished, non-peer-reviewed reanalyses of epidemiologic studies.[5] The Circuit took the view that reanalyses are generally accepted by scientists when they have been verified and scrutinized by others in the field. Unpublished reanalyses done solely for litigation would be an insufficient foundation for expert witness opinion.[6]

The Supreme Court, in Daubert, evaded the difficult issues involved in evaluating a statistical analysis that has not been published, by deciding the case on the ground that the lower courts had applied the wrong standard. The so-called Frye test, or what I call the “twilight zone” test, comes from the heralded 1923 case excluding opinion testimony based upon a lie detector:

“Just when a scientific principle or discovery crosses the line between the experimental and demonstrable stages is difficult to define. Somewhere in this twilight zone the evidential force of the principle must be recognized, and while the courts will go a long way in admitting expert testimony deduced from a well recognized scientific principle or discovery, the thing from which the deduction is made must be sufficiently established to have gained general acceptance in the particular field in which it belongs.”[7]

The Supreme Court, in Daubert, held that with the promulgation of the Federal Rules of Evidence in 1975, the twilight zone test was no longer legally valid. The guidance for admitting expert witness opinion testimony lay in Federal Rule of Evidence 702, which outlined an epistemic test for “knowledge” that would be helpful to the trier of fact. The Court then proceeded to articulate several non-definitive factors for “good science,” which might guide trial courts in applying Rule 702, such as testability or falsifiability, and a showing of a known or potential error rate. General acceptance, carried over from Frye, remained another consideration.[8] Courts have continued to build on this foundation to identify other relevant considerations in gatekeeping.[9]

One of the Daubert Court’s pertinent considerations was “whether the theory or technique has been subjected to peer review and publication.”[10] The Court, speaking through Justice Blackmun, provided a reasonably cogent, but probably now outdated, discussion of peer review:

 “Publication (which is but one element of peer review) is not a sine qua non of admissibility; it does not necessarily correlate with reliability, see S. Jasanoff, The Fifth Branch: Science Advisors as Policymakers 61-76 (1990), and in some instances well-grounded but innovative theories will not have been published, see Horrobin, “The Philosophical Basis of Peer Review and the Suppression of Innovation,” 263 JAMA 1438 (1990). Some propositions, moreover, are too particular, too new, or of too limited interest to be published. But submission to the scrutiny of the scientific community is a component of “good science,” in part because it increases the likelihood that substantive flaws in methodology will be detected. See J. Ziman, Reliable Knowledge: An Exploration of the Grounds for Belief in Science 130-133 (1978); Relman & Angell, “How Good Is Peer Review?,” 321 New Eng. J. Med. 827 (1989). The fact of publication (or lack thereof) in a peer reviewed journal thus will be a relevant, though not dispositive, consideration in assessing the scientific validity of a particular technique or methodology on which an opinion is premised.”[11]

To the extent that peer review was touted by Justice Blackmun, it was because the peer-review process advanced the ultimate consideration of the scientific validity of the opinion or claim under consideration. Validity was the thing; peer review was just a crude proxy.

If the Court were writing today, it might well have written that peer review is often a feature of bad science, advanced by scientists who know that peer-reviewed publication is the price of admission to the advocacy arena. And of course, the wild proliferation of journals, including the “pay-to-play” journals, facilitates the festschrift.

Reference Manual on Scientific Evidence

Certainly, judicial thinking has evolved since 1993, and the decision in Daubert. Other considerations for gatekeeping have been added. Importantly, Daubert involved the interpretation of a statute, and in 2000, the statute was amended.

Since the Daubert decision, the Federal Judicial Center and the National Academies of Science have weighed in with what is intended to be guidance for judges and lawyers litigating scientific and technical issues. The Reference Manual on Scientific Evidence is currently in its third edition, but a fourth edition is expected in 2024.

How does the third edition[12] treat peer review?

An introduction by now-retired Associate Justice Stephen Breyer blandly reports the Daubert considerations, without elaboration.[13]

The most revealing and important chapter in the Reference Manual is the one on scientific method and procedure, and sociology of science, “How Science Works,” by Professor David Goodstein.[14] This chapter’s treatment is not always consistent. In places, the discussion of peer review is trenchant. At other places, it can be misleading. Goodstein’s treatment, at first, appears to be a glib endorsement of peer review as a substitute for critical thinking about a relied-upon published study:

“In the competition among ideas, the institution of peer review plays a central role. Scientific articles submitted for publication and proposals for funding often are sent to anonymous experts in the field, in other words, to peers of the author, for review. Peer review works superbly to separate valid science from nonsense, or, in Kuhnian terms, to ensure that the current paradigm has been respected.11 It works less well as a means of choosing between competing valid ideas, in part because the peer doing the reviewing is often a competitor for the same resources (space in prestigious journals, funds from government agencies or private foundations) being sought by the authors. It works very poorly in catching cheating or fraud, because all scientists are socialized to believe that even their toughest competitor is rigorously honest in the reporting of scientific results, which makes it easy for a purposefully dishonest scientist to fool a referee. Despite all of this, peer review is one of the venerated pillars of the scientific edifice.”[15]

A more nuanced and critical view emerges in footnote 11, from the above-quoted passage, when Goodstein discusses how peer review was framed by some amici curiae in the Daubert case:

“The Supreme Court received differing views regarding the proper role of peer review. Compare Brief for Amici Curiae Daryl E. Chubin et al. at 10, Daubert v. Merrell Dow Pharms., Inc., 509 U.S. 579 (1993) (No. 92-102) (“peer review referees and editors limit their assessment of submitted articles to such matters as style, plausibility, and defensibility; they do not duplicate experiments from scratch or plow through reams of computer-generated data in order to guarantee accuracy or veracity or certainty”), with Brief for Amici Curiae New England Journal of Medicine, Journal of the American Medical Association, and Annals of Internal Medicine in Support of Respondent, Daubert v. Merrell Dow Pharm., Inc., 509 U.S. 579 (1993) (No. 92-102) (proposing that publication in a peer-reviewed journal be the primary criterion for admitting scientific evidence in the courtroom). See generally Daryl E. Chubin & Edward J. Hackett, Peerless Science: Peer Review and U.S. Science Policy (1990); Arnold S. Relman & Marcia Angell, How Good Is Peer Review? 321 New Eng. J. Med. 827–29 (1989). As a practicing scientist and frequent peer reviewer, I can testify that Chubin’s view is correct.”[16]

So, if, as Professor Goodstein attests, Chubin is correct that peer review does not “guarantee accuracy or veracity or certainty,” the basis for veneration is difficult to fathom.

Later in Goodstein’s chapter, in a section entitled “V. Some Myths and Facts about Science,” the gloves come off:[17]

“Myth: The institution of peer review assures that all published papers are sound and dependable.

Fact: Peer review generally will catch something that is completely out of step with majority thinking at the time, but it is practically useless for catching outright fraud, and it is not very good at dealing with truly novel ideas. Peer review mostly assures that all papers follow the current paradigm (see comments on Kuhn, above). It certainly does not ensure that the work has been fully vetted in terms of the data analysis and the proper application of research methods.”[18]

Goodstein is not a post-modern nihilist. He acknowledges that “real” science can be distinguished from “not real science.” But he can hardly be seen to have given a full-throated endorsement to peer review as satisfying the gatekeeper’s obligation to evaluate whether a study can reasonably be relied upon, whether reliance upon a particular peer-reviewed study can constitute sufficient evidence to render an expert witness’s opinion helpful, or whether that opinion is the product of a reliably applied methodology.

Goodstein cites, with apparent approval, the amicus brief filed by the New England Journal of Medicine and other journals, which advised the Supreme Court that “good science” requires a “rigorous trilogy of publication, replication and verification before it is relied upon.”[19]

“Peer review’s ‘role is to promote the publication of well-conceived articles so that the most important review, the consideration of the reported results by the scientific community, may occur after publication.’”[20]

Outside of Professor Goodstein’s chapter, the Reference Manual devotes very little ink or analysis to the role of peer review in assessing Rule 702 or 703 challenges to witness opinions or specific studies.  The engineering chapter acknowledges that “[t]he topic of peer review is often raised concerning scientific and technical literature,” and helpfully supports Goodstein’s observations by noting that peer review “does not ensure accuracy or validity.”[21]

The chapter on neuroscience is one of the few chapters in the Reference Manual, other than Professor Goodstein’s, to address the limitations of peer review. Peer review, if absent, is highly suspicious, but its presence is only the beginning of an evaluation process that continues after publication:

Daubert’s stress on the presence of peer review and publication corresponds nicely to scientists’ perceptions. If something is not published in a peer-reviewed journal, it scarcely counts. Scientists only begin to have confidence in findings after peers, both those involved in the editorial process and, more important, those who read the publication, have had a chance to dissect them and to search intensively for errors either in theory or in practice. It is crucial, however, to recognize that publication and peer review are not in themselves enough. The publications need to be compared carefully to the evidence that is proffered.[22]

The neuroscience chapter goes on to discuss peer review also in the narrow context of functional magnetic resonance imaging (fMRI). The authors note that fMRI, as a medical procedure, has been the subject of thousands of peer-reviewed publications, but those peer reviews do little to validate the use of fMRI as a high-tech lie detector.[23] The mental health chapter notes in a brief footnote that the science of memory is now well accepted and has been subjected to peer review, and that “[c]areful evaluators” use only tests that have had their “reliability and validity confirmed in peer-reviewed publications.”[24]

Echoing other chapters, the engineering chapter also mentions peer review briefly in connection with qualifying as an expert witness, and in validating the value of accrediting societies.[25] Finally, the chapter points out that engineering issues in litigation are often sufficiently novel that they have not been explored in peer-reviewed literature.[26]

Most of the other chapters of the Reference Manual, third edition, discuss peer review only in the context of qualifications and membership in professional societies.[27] The chapter on exposure science discusses peer review only in the narrow context of a claim that EPA guidance documents on exposure assessment are peer reviewed and are considered “authoritative.”[28]

Other chapters discuss peer review briefly and again only in very narrow contexts. For instance, the epidemiology chapter discusses peer review in connection with two very narrow issues peripheral to Rule 702 gatekeeping. First, the chapter raises the question (without providing a clear answer) whether non-peer-reviewed studies should be included in meta-analyses.[29] Second, the chapter asserts that “[c]ourts regularly affirm the legitimacy of employing differential diagnostic methodology,” to determine specific causation, on the basis of several factors, including the questionable claim that the methodology “has been subjected to peer review.”[30] There appears to be no discussion in this key chapter about whether, and to what extent, peer review of published studies can or should be considered in the gatekeeping of epidemiologic testimony. There is certainly nothing in the epidemiology chapter, or for that matter elsewhere in the Reference Manual, to suggest that reliance upon a peer-reviewed published study pretermits analysis of that study to determine whether it is indeed internally valid or reasonably relied upon by expert witnesses in the field.


[1] See Jop de Vrieze, “Large survey finds questionable research practices are common: Dutch study finds 8% of scientists have committed fraud,” 373 Science 265 (2021); Yu Xie, Kai Wang, and Yan Kong, “Prevalence of Research Misconduct and Questionable Research Practices: A Systematic Review and Meta-Analysis,” 27 Science & Engineering Ethics 41 (2021).

[2] 509 U.S. 579 (1993).

[3]  Daubert v. Merrell Dow Pharmaceuticals, Inc., 727 F.Supp. 570 (S.D.Cal.1989).

[4] 951 F. 2d 1128 (9th Cir. 1991).

[5]  951 F. 2d, at 1130-31.

[6] Id. at 1131.

[7] Frye v. United States, 293 F. 1013, 1014 (D.C. Cir. 1923) (emphasis added).

[8]  Daubert v. Merrell Dow Pharmaceuticals, Inc., 509 U.S. 579, 590 (1993).

[9] See, e.g., In re TMI Litig. II, 911 F. Supp. 775, 787 (M.D. Pa. 1995) (considering the relationship of the technique to methods that have been established to be reliable, the uses of the method in the actual scientific world, the logical or internal consistency and coherence of the claim, the consistency of the claim or hypothesis with accepted theories, and the precision of the claimed hypothesis or theory).

[10] Id. at  593.

[11] Id. at 593-94.

[12] National Research Council, Reference Manual on Scientific Evidence (3rd ed. 2011) [RMSE]

[13] Id., “Introduction” at 1, 13.

[14] David Goodstein, “How Science Works,” RMSE 37.

[15] Id. at 44-45.

[16] Id. at 44-45 n. 11 (emphasis added).

[17] Id. at 48 (emphasis added).

[18] Id. at 49 n.16 (emphasis added).

[19] David Goodstein, “How Science Works,” RMSE 64 n.45 (citing Brief for the New England Journal of Medicine, et al., as Amici Curiae supporting Respondent, 1993 WL 13006387, at *2, in Daubert v. Merrell Dow Pharms., Inc., 509 U.S. 579 (1993)).

[20] Id. (citing Brief for the New England Journal of Medicine, et al., 1993 WL 13006387, at *3).

[21] Channing R. Robertson, John E. Moalli, David L. Black, “Reference Guide on Engineering,” RMSE 897, 938 (emphasis added).

[22] Henry T. Greely & Anthony D. Wagner, “Reference Guide on Neuroscience,” RMSE 747, 786.

[23] Id. at 776, 777.

[24] Paul S. Appelbaum, “Reference Guide on Mental Health Evidence,” RMSE 813, 866, 886.

[25] Channing R. Robertson, John E. Moalli, David L. Black, “Reference Guide on Engineering,” RMSE 897, 901, 931.

[26] Id. at 935.

[27] Daniel Rubinfeld, “Reference Guide on Multiple Regression,” RMSE 303, 328 (“[w]ho should be qualified as an expert?”); Shari Seidman Diamond, “Reference Guide on Survey Research,” RMSE 359, 375; Bernard D. Goldstein & Mary Sue Henifin, “Reference Guide on Toxicology,” RMSE 633, 677, 678 (noting that membership in some toxicology societies turns in part on having published in peer-reviewed journals).

[28] Joseph V. Rodricks, “Reference Guide on Exposure Science,” RMSE 503, 508 (noting that EPA guidance documents on exposure assessment often are issued after peer review).

[29] Michael D. Green, D. Michal Freedman, and Leon Gordis, “Reference Guide on Epidemiology,” RMSE 549, 608.

[30] Id. at 617-18 n.212.

Consensus is Not Science

November 8th, 2023

Ted Simon, a toxicologist and a fellow board member at the Center for Truth in Science, has posted an intriguing piece in which he labels scientific consensus a fool’s errand.[1] Ted begins his piece by channeling the late Michael Crichton, who famously derided consensus in science in his 2003 Caltech Michelin Lecture:

“Let’s be clear: the work of science has nothing whatever to do with consensus. Consensus is the business of politics. Science, on the contrary, requires only one investigator who happens to be right, which means that he or she has results that are verifiable by reference to the real world. In science, consensus is irrelevant. What is relevant is reproducible results. The greatest scientists in history are great precisely because they broke with the consensus.

* * * *

There is no such thing as consensus science. If it’s consensus, it isn’t science. If it’s science, it isn’t consensus. Period.”[2]

Crichton’s (and Simon’s) critique of consensus is worth remembering in the face of recent proposals by Professor Edward Cheng,[3] and others,[4] to make consensus the touchstone for the admissibility of scientific opinion testimony.

Consensus or general acceptance can be a proxy for conclusions drawn from valid inferences, within reliably applied methodologies, based upon sufficient evidence, quantitatively and qualitatively. When expert witnesses opine contrary to a consensus, they raise serious questions about how they came to their conclusions. Carl Sagan declaimed that “extraordinary claims require extraordinary evidence,” but his principle was hardly novel. Some authors quote the French polymath Pierre Simon Marquis de Laplace, who wrote in 1812: “[p]lus un fait est extraordinaire, plus il a besoin d’être appuyé de fortes preuves,”[5] but as the Quote Investigator documents,[6] the basic idea is much older, going back at least another century to a church rector who expressed his skepticism of a contemporary’s claim of direct communication with the almighty: “Sure, these Matters being very extraordinary, will require a very extraordinary Proof.”[7]
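The Sagan-Laplace principle has a simple Bayesian statement, which may help explain its bearing on expert witness opinions (the formulation and the numbers are mine, offered only as a sketch):

posterior odds = likelihood ratio × prior odds.

If a causal claim is extraordinary, its prior odds are low, say 1 to 1,000. Evidence with a respectable likelihood ratio of 10 moves the posterior odds only to 1 to 100; the claim remains very probably false. Only evidence of extraordinary probative force can carry an extraordinary claim across the “more likely than not” line.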

Ted Simon’s essay is also worth consulting because he notes that many sources of apparent consensus are really faux consensus, nothing more than self-appointed intellectual authoritarians who systematically have excluded some points of view, while turning a blind eye to their own positional conflicts.

Lawyers, courts, and academics should be concerned that Cheng’s “consensus principle” will change the focus from evidence, methodology, and inference, to a surrogate or proxy for validity. And the sociological notion of consensus will then require litigation of whether some group really has announced a consensus. Consensus statements in some areas abound, but inquiring minds may want to know whether they are the result of rigorous, systematic reviews of the pertinent studies, and whether the available studies can support the claimed consensus.

Professor Cheng is hard at work on a book-length explication of his proposal, and some criticism will have to await the event.[8] Perhaps Cheng will overcome the objections placed against his proposal.[9] Some of the examples Professor Cheng has given, however, give pause. Consider his errant, dramatic misreading of the American Statistical Association’s 2016 p-value consensus statement to mean, in Cheng’s words:

“[w]hile historically used as a rule of thumb, statisticians have now concluded that using the 0.05 [p-value] threshold is more distortive than helpful.”[10]

The 2016 Statement said no such thing, although a few statisticians attempted to distort the statement in the way that Cheng suggests. In 2021, a select committee of leading statisticians, appointed by the President of the ASA, issued a statement to make clear that the ASA had not embraced the Cheng misinterpretation.[11] This one example alone does not bode well for the viability of Cheng’s consensus principle.
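For readers who want the definition that underlies the dispute (a standard textbook formulation, not language from either ASA document): a p-value is the probability, computed on the assumption that the null hypothesis and the statistical model’s other assumptions hold, of obtaining a result at least as extreme as the one actually observed:

p = Pr(result at least as extreme as that observed | null hypothesis and model assumptions).

The 2016 ASA statement cautioned that a p-value does not measure the probability that the studied hypothesis is true, and that scientific conclusions should not be based solely on whether a p-value crosses a threshold such as 0.05; it did not pronounce the threshold “more distortive than helpful.”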


[1] Ted Simon, “Scientific consensus is a fool’s errand made worse by IARC” (Oct. 2023).

[2] Michael Crichton, “Aliens Cause Global Warming,” Caltech Michelin Lecture (Jan. 17, 2003).

[3] Edward K. Cheng, “The Consensus Rule: A New Approach to Scientific Evidence,” 75 Vanderbilt L. Rev. 407 (2022) [Consensus Rule]

[4] See Norman J. Shachoy Symposium, The Consensus Rule: A New Approach to the Admissibility of Scientific Evidence, 67 Villanova L. Rev. (2022); David S. Caudill, “The ‘Crisis of Expertise’ Reaches the Courtroom: An Introduction to the Symposium on, and a Response to, Edward Cheng’s Consensus Rule,” 67 Villanova L. Rev. 837 (2022); Harry Collins, “The Owls: Some Difficulties in Judging Scientific Consensus,” 67 Villanova L. Rev. 877 (2022); Robert Evans, “The Consensus Rule: Judges, Jurors, and Admissibility Hearings,” 67 Villanova L. Rev. 883 (2022); Martin Weinel, “The Adversity of Adversarialism: How the Consensus Rule Reproduces the Expert Paradox,” 67 Villanova L. Rev. 893 (2022); Wendy Wagner, “The Consensus Rule: Lessons from the Regulatory World,” 67 Villanova L. Rev. 907 (2022); Edward K. Cheng, Elodie O. Currier & Payton B. Hampton, “Embracing Deference,” 67 Villanova L. Rev. 855 (2022).

[5] Pierre-Simon Laplace, Théorie analytique des probabilités (1812) (“The more extraordinary a fact, the more it needs to be supported by strong proofs.”). See Patrizio E. Tressoldi, “Extraordinary Claims Require Extraordinary Evidence: The Case of Non-Local Perception, a Classical and Bayesian Review of Evidences,” 2 Frontiers Psych. 117 (2011); Charles Coulston Gillispie, Pierre-Simon Laplace, 1749-1827: A Life in Exact Science (1997).

[6]Extraordinary Claims Require Extraordinary Evidence” (Dec. 5, 2021).

[7] Benjamin Bayly, An Essay on Inspiration 362, part 2 (2nd ed. 1708).

[8] The Consensus Principle, under contract with the University of Chicago Press.

[9] SeeCheng’s Proposed Consensus Rule for Expert Witnesses” (Sept. 15, 2022);
Further Thoughts on Cheng’s Consensus Rule” (Oct. 3, 2022); “Consensus Rule – Shadows of Validity” (Apr. 26, 2023).

[10] Consensus Rule at 424 (citing but not quoting Ronald L. Wasserstein & Nicole A. Lazar, “The ASA Statement on p-Values: Context, Process, and Purpose,” 70 Am. Statistician 129, 131 (2016)).

[11] Yoav Benjamini, Richard D. De Veaux, Bradley Efron, Scott Evans, Mark Glickman, Barry Graubard, Xuming He, Xiao-Li Meng, Nancy Reid, Stephen M. Stigler, Stephen B. Vardeman, Christopher K. Wikle, Tommy Wright, Linda J. Young, and Karen Kafadar, “The ASA President’s Task Force Statement on Statistical Significance and Replicability,” 15 Annals of Applied Statistics 1084 (2021); see also “A Proclamation from the Task Force on Statistical Significance” (June 21, 2021).

Science & the Law – from the Proceedings of the National Academies of Science

October 5th, 2023

The current issue of the Proceedings of the National Academy of Sciences (PNAS) features a medley of articles on science generally, and forensic science in particular, in the law.[1] The general editor of the compilation appears to be editorial board member Thomas D. Albright, the Conrad T. Prebys Professor of Vision Research at the Salk Institute for Biological Studies.

I have not had time to plow through the set of offerings, but even a superficial inspection reveals that the articles will be of interest to lawyers and judges involved in the litigation of scientific issues. The authors seem to agree that, descriptively and prescriptively, validity is more important than expertise in the legal consideration of scientific evidence.

1. Thomas D. Albright, “A scientist’s take on scientific evidence in the courtroom,” 120 Proceedings of the National Academy of Sciences e2301839120 (2023).

Albright’s essay was edited by Henry Roediger, a psychologist at Washington University in St. Louis.

Abstract

Scientific evidence is frequently offered to answer questions of fact in a court of law. DNA genotyping may link a suspect to a homicide. Receptor binding assays and behavioral toxicology may testify to the teratogenic effects of bug repellant. As for any use of science to inform fateful decisions, the immediate question raised is one of credibility: Is the evidence a product of valid methods? Are results accurate and reproducible? While the rigorous criteria of modern science seem a natural model for this evaluation, there are features unique to the courtroom that make the decision process scarcely recognizable by normal standards of scientific investigation. First, much science lies beyond the ken of those who must decide; outside “experts” must be called upon to advise. Second, questions of fact demand immediate resolution; decisions must be based on the science of the day. Third, in contrast to the generative adversarial process of scientific investigation, which yields successive approximations to the truth, the truth-seeking strategy of American courts is terminally adversarial, which risks fracturing knowledge along lines of discord. Wary of threats to credibility, courts have adopted formal rules for determining whether scientific testimony is trustworthy. Here, I consider the effectiveness of these rules and explore tension between the scientists’ ideal that momentous decisions should be based upon the highest standards of evidence and the practical reality that those standards are difficult to meet. Justice lies in carefully crafted compromise that benefits from robust bonds between science and law.

2. Thomas D. Albright, David Baltimore, Anne-Marie Mazza, Jennifer L. Mnookin, and David S. Tatel, “Science, evidence, law, and justice,” 120 Proceedings of the National Academy of Sciences e2301839120 (2023).

Professor Baltimore is a Nobel laureate and researcher in biology, now at the California Institute of Technology. Anne-Marie Mazza is the director of the Committee on Science, Technology, and Law, of the National Academies of Sciences, Engineering, and Medicine. Jennifer Mnookin is the chancellor of the University of Wisconsin, Madison; previously, she was the dean of the UCLA School of Law. Judge Tatel is a federal judge on the United States Court of Appeals for the District of Columbia Circuit.

Abstract

For nearly 25 y, the Committee on Science, Technology, and Law (CSTL), of the National Academies of Sciences, Engineering, and Medicine, has brought together distinguished members of the science and law communities to stimulate discussions that would lead to a better understanding of the role of science in legal decisions and government policies and to a better understanding of the legal and regulatory frameworks that govern the conduct of science. Under the leadership of recent CSTL co-chairs David Baltimore and David Tatel, and CSTL director Anne-Marie Mazza, the committee has overseen many interdisciplinary discussions and workshops, such as the international summits on human genome editing and the science of implicit bias, and has delivered advisory consensus reports focusing on topics of broad societal importance, such as dual use research in the life sciences, voting systems, and advances in neural science research using organoids and chimeras. One of the most influential CSTL activities concerns the use of forensic evidence by law enforcement and the courts, with emphasis on the scientific validity of forensic methods and the role of forensic testimony in bringing about justice. As coeditors of this Special Feature, CSTL alumni Tom Albright and Jennifer Mnookin have recruited articles at the intersection of science and law that reveal an emerging scientific revolution of forensic practice, which we hope will engage a broad community of scientists, legal scholars, and members of the public with interest in science-based legal policy and justice reform.

3. Nicholas Scurich, David L. Faigman, and Thomas D. Albright, “Scientific guidelines for evaluating the validity of forensic feature-comparison methods,” 120 Proceedings of the National Academy of Sciences (2023).

Nicholas Scurich is the chair of the Department of Psychological Science at the University of California, Irvine. David Faigman has written prolifically about science in the law. He is now the chancellor and dean of the University of California College of the Law, San Francisco.

Abstract

When it comes to questions of fact in a legal context—particularly questions about measurement, association, and causality—courts should employ ordinary standards of applied science. Applied sciences generally develop along a path that proceeds from a basic scientific discovery about some natural process to the formation of a theory of how the process works and what causes it to fail, to the development of an invention intended to assess, repair, or improve the process, to the specification of predictions of the instrument’s actions and, finally, empirical validation to determine that the instrument achieves the intended effect. These elements are salient and deeply embedded in the cultures of the applied sciences of medicine and engineering, both of which primarily grew from basic sciences. However, the inventions that underlie most forensic science disciplines have few roots in basic science, and they do not have sound theories to justify their predicted actions or results of empirical tests to prove that they work as advertised. Inspired by the “Bradford Hill Guidelines”—the dominant framework for causal inference in epidemiology—we set forth four guidelines that can be used to establish the validity of forensic comparison methods generally. This framework is not intended as a checklist establishing a threshold of minimum validity, as no magic formula determines when particular disciplines or hypotheses have passed a necessary threshold. We illustrate how these guidelines can be applied by considering the discipline of firearm and tool mark examination.

4. Peter Stout, “The secret life of crime labs,” 120 Proceedings of the National Academy of Sciences e2303592120 (2023).

Peter Stout is a scientist with the Houston Forensic Science Center, in Houston, Texas. The Center describes itself as “an independent local government corporation,” which provides forensic “services” to the Houston police.

Abstract

Houston TX experienced a widely known failure of its police forensic laboratory. This gave rise to the Houston Forensic Science Center (HFSC) as a separate entity to provide forensic services to the City of Houston. HFSC is a very large forensic laboratory and has made significant progress at remediating the past failures and improving public trust in forensic testing. HFSC has a large and robust blind testing program, which has provided many insights into the challenges forensic laboratories face. HFSC’s journey from a notoriously failed lab to a model also gives perspective to the resource challenges faced by all labs in the country. Challenges for labs include the pervasive reality of poor-quality evidence. Also that forensic laboratories are necessarily part of a much wider system of interdependent functions in criminal justice making blind testing something in which all parts have a role. This interconnectedness also highlights the need for an array of oversight and regulatory frameworks to function properly. The major essential databases in forensics need to be a part of blind testing programs and work is needed to ensure that the results from these databases are indeed producing correct results and those results are being correctly used. Last, laboratory reports of “inconclusive” results are a significant challenge for laboratories and the system to better understand when these results are appropriate, necessary and most importantly correctly used by the rest of the system.

5. Brandon L. Garrett & Cynthia Rudin, “Interpretable algorithmic forensics,” 120 Proceedings of the National Academies of Science e2301842120 (2023).

Garrett teaches at the Duke University School of Law. Rudin teaches statistics at Duke University.

Abstract

One of the most troubling trends in criminal investigations is the growing use of “black box” technology, in which law enforcement rely on artificial intelligence (AI) models or algorithms that are either too complex for people to understand or they simply conceal how it functions. In criminal cases, black box systems have proliferated in forensic areas such as DNA mixture interpretation, facial recognition, and recidivism risk assessments. The champions and critics of AI argue, mistakenly, that we face a catch 22: While black box AI is not understandable by people, they assume that it produces more accurate forensic evidence. In this Article, we question this assertion, which has so powerfully affected judges, policymakers, and academics. We describe a mature body of computer science research showing how “glass box” AI—designed to be interpretable—can be more accurate than black box alternatives. Indeed, black box AI performs predictably worse in settings like the criminal system. Debunking the black box performance myth has implications for forensic evidence, constitutional criminal procedure rights, and legislative policy. Absent some compelling—or even credible—government interest in keeping AI as a black box, and given the constitutional rights and public safety interests at stake, we argue that a substantial burden rests on the government to justify black box AI in criminal cases. We conclude by calling for judicial rulings and legislation to safeguard a right to interpretable forensic AI.

6. Jed S. Rakoff & Goodwin Liu, “Forensic science: A judicial perspective,” 120 Proceedings of the National Academies of Science e2301838120 (2023).

Judge Rakoff has written previously on forensic evidence. He is a federal district court judge in the Southern District of New York. Goodwin Liu is a justice on the California Supreme Court. Their article was edited by Professor Mnookin.

Abstract

This article describes three major developments in forensic evidence and the use of such evidence in the courts. The first development is the advent of DNA profiling, a scientific technique for identifying and distinguishing among individuals to a high degree of probability. While DNA evidence has been used to prove guilt, it has also demonstrated that many individuals have been wrongly convicted on the basis of other forensic evidence that turned out to be unreliable. The second development is the US Supreme Court precedent requiring judges to carefully scrutinize the reliability of scientific evidence in determining whether it may be admitted in a jury trial. The third development is the publication of a formidable National Academy of Sciences report questioning the scientific validity of a wide range of forensic techniques. The article explains that, although one might expect these developments to have had a major impact on the decisions of trial judges whether to admit forensic science into evidence, in fact, the response of judges has been, and continues to be, decidedly mixed.

7. Jonathan J. Koehler, Jennifer L. Mnookin, and Michael J. Saks, “The scientific reinvention of forensic science,” 120 Proceedings of the National Academies of Science e2301840120 (2023).

Koehler is a professor of law at the Northwestern Pritzker School of Law. Mnookin is the chancellor of the University of Wisconsin–Madison. Saks is a professor of psychology at Arizona State University, and Regents Professor of Law at the Sandra Day O’Connor College of Law.

Abstract

Forensic science is undergoing an evolution in which a long-standing “trust the examiner” focus is being replaced by a “trust the scientific method” focus. This shift, which is in progress and still partial, is critical to ensure that the legal system uses forensic information in an accurate and valid way. In this Perspective, we discuss the ways in which the move to a more empirically grounded scientific culture for the forensic sciences impacts testing, error rate analyses, procedural safeguards, and the reporting of forensic results. However, we caution that the ultimate success of this scientific reinvention likely depends on whether the courts begin to engage with forensic science claims in a more rigorous way.

8. William C. Thompson, “Shifting decision thresholds can undermine the probative value and legal utility of forensic pattern-matching evidence,” 120 Proceedings of the National Academies of Science e2301844120 (2023).

Thompson is professor emeritus in the Department of Criminology, Law & Society, University of California, Irvine.

Abstract

Forensic pattern analysis requires examiners to compare the patterns of items such as fingerprints or tool marks to assess whether they have a common source. This article uses signal detection theory to model examiners’ reported conclusions (e.g., identification, inconclusive, or exclusion), focusing on the connection between the examiner’s decision threshold and the probative value of the forensic evidence. It uses a Bayesian network model to explore how shifts in decision thresholds may affect rates and ratios of true and false convictions in a hypothetical legal system. It demonstrates that small shifts in decision thresholds, which may arise from contextual bias, can dramatically affect the value of forensic pattern-matching evidence and its utility in the legal system.

9. Marlene Meyer, Melissa F. Colloff, Tia C. Bennett, Edward Hirata, Amelia Kohl, Laura M. Stevens, Harriet M. J. Smith, Tobias Staudigl & Heather D. Flowe, “Enabling witnesses to actively explore faces and reinstate study-test pose during a lineup increases discriminability,” 120 Proceedings of the National Academies of Science e2301845120 (2023).

Marlene Meyer, Melissa F. Colloff, Tia C. Bennett, Edward Hirata, Amelia Kohl, and Heather D. Flowe are psychologists at the School of Psychology, University of Birmingham (United Kingdom). Harriet M. J. Smith is a psychologist in the School of Psychology, Nottingham Trent University, Nottingham, United Kingdom, and Tobias Staudigl is a psychologist in the Department of Psychology, Ludwig-Maximilians-Universität München, in Munich, Germany.

Abstract

Accurate witness identification is a cornerstone of police inquiries and national security investigations. However, witnesses can make errors. We experimentally tested whether an interactive lineup, a recently introduced procedure that enables witnesses to dynamically view and explore faces from different angles, improves the rate at which witnesses identify guilty over innocent suspects compared to procedures traditionally used by law enforcement. Participants encoded 12 target faces, either from the front or in profile view, and then attempted to identify the targets from 12 lineups, half of which were target present and the other half target absent. Participants were randomly assigned to a lineup condition: simultaneous interactive, simultaneous photo, or sequential video. In the front-encoding and profile-encoding conditions, Receiver Operating Characteristics analysis indicated that discriminability was higher in interactive compared to both photo and video lineups, demonstrating the benefit of actively exploring the lineup members’ faces. Signal-detection modeling suggested interactive lineups increase discriminability because they afford the witness the opportunity to view more diagnostic features such that the nondiagnostic features play a proportionally lesser role. These findings suggest that eyewitness errors can be reduced using interactive lineups because they create retrieval conditions that enable witnesses to actively explore faces and more effectively sample features.


[1] 120 Proceedings of the National Academies of Science (Oct. 10, 2023).

The IARC-hy of Evidence – Incoherent & Inconsistent Classifications of Carcinogenicity

September 19th, 2023

Recently, two lawyers wrote an article in a legal trade magazine about excluding epidemiologic evidence in civil litigation.[1] The article was wildly wide of the mark, with several conceptual and practical errors.[2] For starters, the authors discussed Rule 702 as excluding epidemiologic studies and evidence, when the rule addresses the admissibility of expert witness opinion testimony. The authors’ most egregious recommendation, however, was that counsel urge the carcinogenicity classifications of the International Agency for Research on Cancer (IARC), and of regulatory agencies, as probative for or against causation.

The project of evaluating the evidence for, or against, the carcinogenicity of the myriad natural and synthetic agents to which humans are exposed is certainly important, and IARC has taken the project seriously. There have been problems with IARC’s classifications of specific chemicals, pharmaceuticals, and exposure circumstances, but a basic problem with the classifications begins with the classes themselves. Classification requires defined classes. I don’t mean to be anti-semantic, but IARC’s definitions and its hierarchy of carcinogenicity are not entirely coherent.

The agency was established in 1965, and by the early 1970s, found itself in the business of preparing “monographs on the evaluation of carcinogenic risk of chemicals to man.” Originally, the IARC set out to classify the carcinogenicity of chemicals, but over the years, its scope increased to include complex mixtures, physical agents such as different forms of radiation, and biological organisms. To date, there have been 134 IARC monographs, addressing 1,045 “agents” (either substances or exposure circumstances).

From its beginnings, the IARC has conducted its classifications through working groups that meet to review and evaluate evidence, and classify the cancer hazards of “agents” under discussion. The breakdown of IARC’s classifications among four groups currently is:

Group 1 – Carcinogenic to humans (127 agents)

Group 2A – Probably carcinogenic to humans (95 agents)

Group 2B – Possibly carcinogenic to humans (323 agents)

Group 3 – Not classifiable as to its carcinogenicity to humans (500 agents)

Previously, the IARC classification included a Group 4 for agents that are probably not carcinogenic for human beings. After decades of review, the IARC placed only a single agent in Group 4, caprolactam, apparently because the agency found everything else in the world to be presumptively a cause of cancer. The IARC could not find sufficiently strong evidence even for water, air, or basic foods to declare that they do not cause cancer in humans. Ultimately, the IARC abandoned Group 4, in favor of a presumption of universal carcinogenicity.

The IARC describes its carcinogen classification procedures, requirements, and rationales in a document known as “The Preamble.” Any discussion of IARC classifications, whether in scientific publications or in legal briefs, without reference to this document should be suspect. The Preamble seeks to define many of the words in the classificatory scheme, some in ways that are not intuitive. This document has been amended over time, and the most recent iteration can be found online at the IARC website.[3]

IARC claims to build its classifications upon “consensus” evaluations, based in turn upon considerations of

(a) the strength of evidence of carcinogenicity in humans,

(b) the evidence of carcinogenicity in experimental (non-human) animals, and

(c) the mechanistic evidence of carcinogenicity.

IARC further claims that its evaluations turn on the use of “transparent criteria and descriptive terms.”[4] This last claim is, for some terms, falsifiable.

The working groups are described as engaged in consensus evaluations, although past evaluations have been reached on simple majority vote of the working group. The working groups are charged with considering the three lines of evidence, described above, for any given agent, and reaching a synthesis in the form of the IARC classificatory scheme. The chart from the Preamble, below, roughly describes how working groups may “mix and match” lines of evidence, of varying degrees of robustness and validity (vel non), to reach a classification.

[Table 4 of the IARC Preamble, showing the combinations of human, animal, and mechanistic evidence that support each classification, appears here.]

Agents placed in Category I are thus “carcinogenic to humans.” Interestingly, IARC does not refer to Category I carcinogens as “known” carcinogens, although many commentators are prone to do so. The implication of calling Category I agents “known carcinogens” is to distinguish Category IIA, IIB, and III as agents “not known to cause cancer.” The adjective that IARC uses, rather than “known,” is “sufficient” evidence in humans, but IARC also allows for reaching Category I with “limited,” or even “inadequate” human evidence if the other lines of evidence, in experimental animals or mechanistic evidence in humans, are sufficient.

In describing “sufficient” evidence, the IARC’s Preamble does not refer to epidemiologic evidence as potentially “conclusive” or “definitive”; rather, its use of “sufficient” implies, perhaps non-transparently, that its labels of “limited” or “inadequate” evidence in humans refer to insufficient evidence. IARC gives an unscientific, inflated weight and understanding to “limited evidence of carcinogenicity,” by telling us that

“[a] causal interpretation of the positive association observed in the body of evidence on exposure to the agent and cancer is credible, but chance, bias, or confounding could not be ruled out with reasonable confidence.”[5]

Remarkably, for IARC, credible interpretations of causality can be based upon evidentiary displays that are confounded or biased. In other words, non-credible associations may support IARC’s conclusions of causality. Causal interpretations of epidemiologic evidence are “credible” according to IARC, even though Sir Austin’s predicate of a valid association is absent.[6]

The IARC studiously avoids, however, noting that any classification is based upon “insufficient” evidence, even though that evidence may be less than sufficient, as in “limited,” or “inadequate.” A close look at Table 4 reveals that some Category I classifications, and all Category IIA, IIB, and III classifications are based upon insufficient evidence of carcinogenicity in humans.

Non-Probable Probabilities

The classification immediately below Category or Group I is Group 2A, for agents “probably carcinogenic to humans.” The IARC’s use of “probably” is problematic. Group I carcinogens require only “sufficient” evidence of human carcinogenicity, and there is no suggestion that any aspect of a Group I evaluation requires apodictic, conclusive, or even “definitive” evidence. Accordingly, the determination of Group I carcinogens will be based upon evidence that is essentially probabilistic. Group 2A is also defined as having only “limited evidence of carcinogenicity in humans”; in other words, insufficient evidence of carcinogenicity in humans, or epidemiologic studies with uncontrolled confounding and biases.

Importing IARC 2A classifications into legal or regulatory arenas will allow judgments or regulations based upon “limited evidence” in humans, which as we have seen, can be based upon inconsistent observational studies, and studies that fail to measure and adjust for known and potential confounding risk factors and systematic biases. The 2A classification thus requires little substantively or semantically, and many 2A classifications leave juries and judges to determine whether a chemical or medication caused a human being’s cancer, when the basic predicates for Sir Austin Bradford Hill’s factors for causal judgment have not been met.[7]

An IARC evaluation of Group 2A, or “probably carcinogenic to humans,” would seem to satisfy the legal system’s requirement that an exposure to the agent of interest more likely than not causes the harm in question. Appearances and word usage in different contexts, however, can be deceiving. Probability is a continuous quantitative scale from zero to one. In Bayesian analyses, zero and one are unavailable because if either were our starting point, no amount of evidence could ever change our judgment of the probability of causation (Cromwell’s Rule). The IARC informs us that its use of “probably” is purely idiosyncratic; the probability that a Group 2A agent causes cancer has “no quantitative” meaning. All the IARC intends is that a Group 2A classification “signifies a greater strength of evidence than possibly carcinogenic.”[8] Group 2A classifications are thus consistent with having posterior probabilities less than 0.5 (or 50 percent). A working group could judge the probability of a substance or a process being carcinogenic to humans to be greater than zero, but no more than, say, ten percent, and still vote for a 2A classification, in keeping with the IARC Preamble. This low probability threshold for a 2A classification converts the judgment of “probably carcinogenic” into little more than a precautionary prescription, rendered when the most probable assessment is either ignorance or lack of causality. There is thus a practical certainty, close to 100%, that a 2A classification will confuse judges and juries, as well as the scientific community.

In addition to being based upon limited, that is insufficient, evidence of human carcinogenicity, Group 2A evaluations of “probable human carcinogenicity” connote “sufficient evidence” in experimental animals. An agent can be classified 2A even when the sufficient evidence of carcinogenicity occurs in only one of several non-human animal species, with the other animal species failing to show carcinogenicity. IARC 2A classifications can thus raise the thorny question in court whether a claimant is more like a rat or a mouse.

Courts should, because of the incoherent and diluted criteria for “probably carcinogenic,” exclude expert witness opinions based upon IARC 2A classifications as scientifically insufficient.[9] Given the distortion of ordinary language in its use of defined terms such as “sufficient,” “limited,” and “probable,” any evidentiary value to IARC 2A classifications, and expert witness opinion based thereon, is “substantially outweighed by a danger of … unfair prejudice, confusing the issues, [and] misleading the jury….”[10]

Everything is Possible

Category 2B denotes “possibly carcinogenic.” This year, the IARC announced that a working group had concluded that aspartame, an artificial sweetener, was “possibly carcinogenic.”[11] Such an evaluation, however, tells us nothing. If there are no studies at all of an agent, the agent could be said to be possibly carcinogenic. If there are inconsistent studies, even if the better designed studies are exculpatory, scientists could still say that the agent of interest was possibly carcinogenic. The 2B classification does not tell us anything because everything is possible until there is sufficient evidence to inculpate the agent as a cause of human cancer, or to exculpate it.

It’s a Hazard, Not a Risk

IARC’s classification does not include an assessment of exposure levels. Consequently, there is no consideration of dose or exposure level at which an agent becomes carcinogenic. IARC’s evaluations are limited to whether the agent is or is not carcinogenic. The IARC explicitly concedes that exposure to a carcinogenic agent may carry little risk, but it cannot bring itself to say no risk, or even a benefit, at low exposures.

As noted, the IARC classification scheme refers to the strength of the evidence that an agent is carcinogenic, and not to the quantitative risk of cancer from exposure at a given level. The Preamble explains the distinction as fundamental:

“A cancer hazard is an agent that is capable of causing cancer, whereas a cancer risk is an estimate of the probability that cancer will occur given some level of exposure to a cancer hazard. The Monographs assess the strength of evidence that an agent is a cancer hazard. The distinction between hazard and risk is fundamental. The Monographs identify cancer hazards even when risks appear to be low in some exposure scenarios. This is because the exposure may be widespread at low levels, and because exposure levels in many populations are not known or documented.”[12]

This attempted explanation reveals important aspects of IARC’s project. First, there is an unproven assumption that there will be cancer hazards regardless of the exposure levels. The IARC contemplates that there may be circumstances of low levels of risk from low levels of exposure, but it elides the important issue of thresholds. Second, IARC’s distinction between hazard and risk is obscured by its own classifications. For instance, when IARC evaluated crystalline silica and classified it in Group I, it did so for only “occupational exposures.”[13] And yet, when IARC evaluated the hazard of coal exposure, it placed coal dust in Group 3, even though coal dust contains crystalline silica.[14] Similarly, in 2018, the IARC classified coffee as a Group 3,[15] even though every drop of coffee contains acrylamide, which is, according to IARC, a Group 2A “probable human carcinogen.”[16]


[1] Christian W. Castile & Stephen J. McConnell, “Excluding Epidemiological Evidence Under FRE 702,” For The Defense 18 (June 2023) [Castile].

[2] “Excluding Epidemiologic Evidence Under Federal Rule of Evidence 702” (Aug. 26, 2023).

[3] IARC Monographs on the Identification of Carcinogenic Hazards to Humans – Preamble (2019).

[4] Jonathan M. Samet, Weihsueh A. Chiu, Vincent Cogliano, Jennifer Jinot, David Kriebel, Ruth M. Lunn, Frederick A. Beland, Lisa Bero, Patience Browne, Lin Fritschi, Jun Kanno, Dirk W. Lachenmeier, Qing Lan, Gerard Lasfargues, Frank Le Curieux, Susan Peters, Pamela Shubat, Hideko Sone, Mary C. White, Jon Williamson, Marianna Yakubovskaya, Jack Siemiatycki, Paul A. White, Kathryn Z. Guyton, Mary K. Schubauer-Berigan, Amy L. Hall, Yann Grosse, Veronique Bouvard, Lamia Benbrahim-Tallaa, Fatiha El Ghissassi, Beatrice Lauby-Secretan, Bruce Armstrong, Rodolfo Saracci, Jiri Zavadil, Kurt Straif, and Christopher P. Wild, “The IARC Monographs: Updated Procedures for Modern and Transparent Evidence Synthesis in Cancer Hazard Identification,” 112 J. Nat’l Cancer Inst. djz169 (2020).

[5] Preamble at 31.

[6] See Austin Bradford Hill, “The Environment and Disease: Association or Causation?” 58 Proc. Royal Soc’y Med. 295 (1965) (noting that only when “[o]ur observations reveal an association between two variables, perfectly clear-cut and beyond what we would care to attribute to the play of chance,” do we move on to consider the nine articulated factors for determining whether an association is causal).

[7] Id.

[8] IARC Monographs on the Identification of Carcinogenic Hazards to Humans – Preamble 31 (2019) (“The terms probably carcinogenic and possibly carcinogenic have no quantitative significance and are used as descriptors of different strengths of evidence of carcinogenicity in humans.”).

[9] See “Is the IARC lost in the weeds” (Nov. 30, 2019); “Good Night Styrene” (Apr. 18, 2019).

[10] Fed. R. Evid. 403.

[11] Elio Riboli, et al., “Carcinogenicity of aspartame, methyleugenol, and isoeugenol,” 24 The Lancet Oncology 848-50 (2023); IARC, “Aspartame hazard and risk assessment results released” (2023).

[12] Preamble at 2.

[13] IARC Monograph 68, at 41 (1997) (“For these reasons, the Working Group therefore concluded that overall the epidemiological findings support increased lung cancer risks from inhaled crystalline silica (quartz and cristobalite) resulting from occupational exposure.”).

[14] IARC Monograph 68, at 337 (1997).

[15] IARC Monograph No. 116, Drinking Coffee, Mate, and Very Hot Beverages (2018).

[16] IARC Monograph no. 60, Some Industrial Chemicals (1994).

Excluding Epidemiologic Evidence under Federal Rule of Evidence 702

August 26th, 2023

We are 30-plus years into the “Daubert” era, in which federal district courts are charged with gatekeeping the relevance and reliability of scientific evidence. Not surprisingly, given the lawsuit industry’s propensity on occasion to use dodgy science, the burden of awakening the gatekeepers from their dogmatic slumber often falls upon defense counsel in civil litigation. It therefore behooves defense counsel to speak carefully and accurately about the grounds for Rule 702 exclusion of expert witness opinion testimony.

In the context of medical causation opinions based upon epidemiologic evidence, the first obvious point is that whichever party is arguing for exclusion should distinguish between excluding an expert witness’s opinion and prohibiting an expert witness from relying upon a particular study.  Rule 702 addresses the exclusion of opinions, whereas Rule 703 addresses barring an expert witness from relying upon hearsay facts or data unless they are reasonably relied upon by experts in the appropriate field. It would be helpful for lawyers and legal academics to refrain from talking about “excluding epidemiological evidence under FRE 702.”[1] Epidemiologic studies are rarely admissible themselves, but come into the courtroom as facts and data relied upon by expert witnesses. Rule 702 is addressed to the admissibility vel non of opinion testimony, some of which may rely upon epidemiologic evidence.

Another common lawyer mistake is the over-generalization that epidemiologic research provides the “gold standard” of general causation evidence.[2] Although epidemiology is often required, it is not “the medical science devoted to determining the cause of disease in human beings.”[3] To be sure, epidemiologic evidence will usually be required because there is no genetic or mechanistic evidence that will support the claimed causal inference, but counsel should be cautious in stating the requirement. Glib statements by courts that epidemiology is not always required are often simply an evasion of their responsibility to evaluate the validity of the proffered expert witness opinions. A more careful phrasing of the role of epidemiology will make such glib statements more readily open to rebuttal. In the absence of direct biochemical, physiological, or genetic mechanisms that can be identified as involved in bringing about the plaintiffs’ harm, epidemiologic evidence will be required, and it may well be the “gold standard” in such cases.[4]

When epidemiologic evidence is required, counsel will usually be justified in adverting to the “hierarchy of epidemiologic evidence.” Associations are shown in studies of various designs with vastly differing degrees of validity; and of course, associations are not necessarily causal. There are thus important nuances in educating the gatekeeper about this hierarchy. First, it will often be important to educate the gatekeeper about the distinction between descriptive and analytic studies, and the inability of descriptive studies such as case reports to support causal inferences.[5]

There is then the matter of confusion within the judiciary and among “scholars” about whether a hierarchy even exists. The chapter on epidemiology in the Reference Manual on Scientific Evidence appears to suggest the specious position that there is no hierarchy.[6] The chapter on medical testimony, however, takes a different approach in identifying a normative hierarchy of evidence to be considered in evaluating causal claims.[7] The medical testimony chapter specifies that meta-analyses of randomized controlled trials sit atop the hierarchy. Yet, there are divergent opinions about what should be at the top of the hierarchical evidence pyramid. Indeed, the rigorous, large randomized trial will often replace a meta-analysis of smaller trials as the more definitive evidence.[8] Back in 2007, a dubious meta-analysis of over 40 clinical trials led to a litigation frenzy over rosiglitazone.[9] A mega-trial of rosiglitazone showed that the 2007 meta-analysis was wrong.[10]

In any event, courts must purge their beliefs that once there is “some” evidence in support of a claim, their gatekeeping role is over. Randomized controlled trials really do trump observational studies, which virtually always have actual or potential confounding in their final analyses.[11] While disclaimers about the unavailability of randomized trials for putative toxic exposures are helpful, it is not quite accurate to say that it is “unethical to intentionally expose people to a potentially harmful dose of a suspected toxin.”[12] Such trials are done all the time when there is an expected therapeutic benefit that creates at least equipoise between the overall benefit and harm at the outset of the trial.[13]

At this late date, it seems shameful that courts must be reminded that evidence of associations does not suffice to show causation, but prudence dictates giving the reminder.[14] Defense counsel will generally exhibit a Pavlovian reflex to state that causality based upon epidemiology must be viewed through a lens of “Bradford Hill criteria.”[15] Rhetorically, this reflex seems wrong given that Sir Austin himself noted that his nine different considerations were “viewpoints,” not criteria. Taking a position that requires an immediate retreat seems misguided. Similarly, urging courts to invoke and apply the Bradford Hill considerations must be accompanied by the caveat that courts must first apply Bradford Hill’s predicate[16] for the nine considerations:

“Disregarding then any such problem in semantics we have this situation. Our observations reveal an association between two variables, perfectly clear-cut and beyond what we would care to attribute to the play of chance. What aspects of that association should we especially consider before deciding that the most likely interpretation of it is causation?”[17]

Courts should be mindful that the language from the famous, often-cited paper was part of an after-dinner address, in which Sir Austin was speaking informally. Scientists will understand that he was setting out a predicate that calls for

(1) an association, which is

(2) “perfectly clear cut,” such that bias and confounding are excluded, and

(3) “beyond what we would care to attribute to the play of chance,” with random error kept to an acceptable level, before advancing to further consideration of the nine viewpoints commonly recited.

These predicate findings are the basis for advancing to investigate Bradford Hill’s nine viewpoints; the viewpoints do not replace or supersede the predicates.[18]

Not all of the nine viewpoints are of equal importance. Consistency among studies, a particularly important consideration, implies that isolated findings in a single observational study will rarely suffice to support causal conclusions. Another important consideration, the strength of the association, has nothing to do with “statistical significance,” which is a predicate consideration, but reminds us that large risk ratios or risk differences provide some evidence that the association does not result from unmeasured confounding. Eliminating confounding, however, is one of the predicate requirements for applying the nine factors. As with any methodology, the Bradford Hill factors are not self-executing. The annals of litigation provide all-too-many examples of undue selectivity, “cherry picking,” and other deviations from the scientist’s standard of care.

Certainly lawyers must steel themselves against recommending the “carcinogen” hazard identifications advanced by the International Agency for Research on Cancer (IARC). There are several problematic aspects to the methods of IARC, not the least of which is IARC’s fanciful use of the word “probable.” According to the IARC Preamble, “probable” has no quantitative meaning.[19] In common legal parlance, “probable” typically conveys a conclusion that is more likely than not. Another problem arises from the IARC’s labeling of “probable human carcinogens” made in some cases without any real evidence of carcinogenesis in humans. Regulatory pronouncements are even more diluted and often involve little more than precautionary-principle wishcasting.[20]


[1] Christian W. Castile & Stephen J. McConnell, “Excluding Epidemiological Evidence Under FRE 702,” For The Defense 18 (June 2023) [Castile]. Although these authors provide an interesting overview of the subject, they fall into some common errors, such as failing to address Rule 703. The article is worth reading for its marshaling of recent case law on the subject, but I detail its errors here in the hopes that lawyers will speak more precisely about the concepts involved in challenging medical causation opinions.

[2] Id. at 18. In re Zantac (Ranitidine) Prods. Liab. Litig., No. 2924, 2022 U.S. Dist. LEXIS 220327, at *401 (S.D. Fla. Dec. 6, 2022); see also Horwin v. Am. Home Prods., No. CV 00-04523 WJR (Ex), 2003 U.S. Dist. LEXIS 28039, at *14-15 (C.D. Cal. May 9, 2003) (“epidemiological studies provide the primary generally accepted methodology for demonstrating a causal relation between a chemical compound and a set of symptoms or disease” *** “The lack of epidemiological studies supporting Plaintiffs’ claims creates a high bar to surmount with respect to the reliability requirement, but it is not automatically fatal to their case.”).

[3] See, e.g., Siharath v. Sandoz Pharm. Corp., 131 F. Supp. 2d 1347, 1356 (N.D. Ga. 2001) (“epidemiology is the medical science devoted to determining the cause of disease in human beings”).

[4] See, e.g., Lopez v. Wyeth-Ayerst Labs., No. C 94-4054 CW, 1996 U.S. Dist. LEXIS 22739, at *1 (N.D. Cal. Dec. 13, 1996) (“Epidemiological evidence is one of the most valuable pieces of scientific evidence of causation”); Horwin v. Am. Home Prods., No. CV 00-04523 WJR (Ex), 2003 U.S. Dist. LEXIS 28039, at *15 (C.D. Cal. May 9, 2003) (“The lack of epidemiological studies supporting Plaintiffs’ claims creates a high bar to surmount with respect to the reliability requirement, but it is not automatically fatal to their case”).

[5] David A. Grimes & Kenneth F. Schulz, “Descriptive Studies: What They Can and Cannot Do,” 359 Lancet 145 (2002) (“…epidemiologists and clinicians generally use descriptive reports to search for clues of cause of disease – i.e., generation of hypotheses. In this role, descriptive studies are often a springboard into more rigorous studies with comparison groups. Common pitfalls of descriptive reports include an absence of a clear, specific, and reproducible case definition, and interpretations that overstep the data. Studies without a comparison group do not allow conclusions about cause of disease.”).

[6] Michael D. Green, D. Michal Freedman, and Leon Gordis, “Reference Guide on Epidemiology,” in Reference Manual on Scientific Evidence 549, 564 n.48 (3d ed. 2011) (citing a paid advertisement by a group of scientists, and misleadingly referring to the publication as a National Cancer Institute symposium) (citing Michele Carbone et al., “Modern Criteria to Establish Human Cancer Etiology,” 64 Cancer Res. 5518, 5522 (2004) (National Cancer Institute symposium [sic] concluding that “[t]here should be no hierarchy [among different types of scientific methods to determine cancer causation]. Epidemiology, animal, tissue culture and molecular pathology should be seen as integrating evidences in the determination of human carcinogenicity.”)).

[7] John B. Wong, Lawrence O. Gostin & Oscar A. Cabrera, “Reference Guide on Medical Testimony,” in Reference Manual on Scientific Evidence 687, 723 (3d ed. 2011).

[8] See, e.g., J.M. Elwood, Critical Appraisal of Epidemiological Studies and Clinical Trials 342 (3d ed. 2007).

[9] See Steven E. Nissen & Kathy Wolski, “Effect of Rosiglitazone on the Risk of Myocardial Infarction and Death from Cardiovascular Causes,” 356 New Engl. J. Med. 2457 (2007). See also “Learning to Embrace Flawed Evidence – The Avandia MDL’s Daubert Opinion” (Jan. 10, 2011).

[10] Philip D. Home, et al., “Rosiglitazone evaluated for cardiovascular outcomes in oral agent combination therapy for type 2 diabetes (RECORD): a multicentre, randomised, open-label trial,” 373 Lancet 2125 (2009).

[11] In re Zantac (Ranitidine) Prods. Liab. Litig., No. 2924, 2022 U.S. Dist. LEXIS 220327, at *402 (S.D. Fla. Dec. 6, 2022) (“Unlike experimental studies in which subjects are randomly assigned to exposed and placebo groups, observational studies are subject to bias due to the possibility of differences between study populations.”).

[12] Castile at 20.

[13] See, e.g., Benjamin Freedman, “Equipoise and the ethics of clinical research,” 317 New Engl. J. Med. 141 (1987).

[14] See, e.g., In Re Onglyza (Saxagliptin) & Kombiglyze Xr (Saxagliptin & Metformin) Prods. Liab. Litig., No. 5:18-md-2809-KKC, 2022 U.S. Dist. LEXIS 136955, at *127 (E.D. Ky. Aug. 2, 2022); Burleson v. Texas Dep’t of Criminal Justice, 393 F.3d 577, 585-86 (5th Cir. 2004) (affirming exclusion of expert causation testimony based solely upon studies showing a mere correlation between defendant’s product and plaintiff’s injury); Beyer v. Anchor Insulation Co., 238 F. Supp. 3d 270, 280-81 (D. Conn. 2017); Ambrosini v. Labarraque, 101 F.3d 129, 136 (D.C. Cir. 1996).

[15] Castile at 21. See In re Zoloft (Sertraline Hydrochloride) Prods. Liab. Litig., 26 F. Supp. 3d 449, 454-55 (E.D. Pa. 2014).

[16] “Bradford Hill on Statistical Methods” (Sept. 24, 2013); see also Frank C. Woodside, III & Allison G. Davis, “The Bradford Hill Criteria: The Forgotten Predicate,” 35 Thomas Jefferson L. Rev. 103 (2013).

[17] Austin Bradford Hill, “The Environment and Disease: Association or Causation?” 58 Proc. Royal Soc’y Med. 295, 295 (1965).

[18] Castile at 21. See, e.g., In re Onglyza (Saxagliptin) & Kombiglyze XR (Saxagliptin & Metformin) Prods. Liab. Litig., No. 5:18-md-2809-KKC, 2022 U.S. Dist. LEXIS 1821, at *43 (E.D. Ky. Jan. 5, 2022) (“The analysis is meant to apply when observations reveal an association between two variables. It addresses the aspects of that association that researchers should analyze before deciding that the most likely interpretation of [the association] is causation”); Hoefling v. U.S. Smokeless Tobacco Co., LLC, 576 F. Supp. 3d 262, 273 n.4 (E.D. Pa. 2021) (“Nor would it have been appropriate to apply them here: scientists are to do so only after an epidemiological association is demonstrated”).

[19] IARC Monographs on the Identification of Carcinogenic Hazards to Humans – Preamble 31 (2019) (“The terms probably carcinogenic and possibly carcinogenic have no quantitative significance and are used as descriptors of different strengths of evidence of carcinogenicity in humans.”).

[20] “Improper Reliance upon Regulatory Risk Assessments in Civil Litigation” (Mar. 19, 2023).