TORTINI

For your delectation and delight, desultory dicta on the law of delicts.

Consensus is Not Science

November 8th, 2023

Ted Simon, a toxicologist and a fellow board member at the Center for Truth in Science, has posted an intriguing piece in which he labels scientific consensus as a fool’s errand.[1]  Ted begins his piece by channeling the late Michael Crichton, who famously derided consensus in science, in his 2003 Caltech Michelin Lecture:

“Let’s be clear: the work of science has nothing whatever to do with consensus. Consensus is the business of politics. Science, on the contrary, requires only one investigator who happens to be right, which means that he or she has results that are verifiable by reference to the real world. In science, consensus is irrelevant. What is relevant is reproducible results. The greatest scientists in history are great precisely because they broke with the consensus.

* * * *

There is no such thing as consensus science. If it’s consensus, it isn’t science. If it’s science, it isn’t consensus. Period.”[2]

Crichton’s (and Simon’s) critique of consensus is worth remembering in the face of recent proposals by Professor Edward Cheng,[3] and others,[4] to make consensus the touchstone for the admissibility of scientific opinion testimony.

Consensus or general acceptance can be a proxy for conclusions drawn from valid inferences, within reliably applied methodologies, based upon sufficient evidence, quantitatively and qualitatively. When expert witnesses opine contrary to a consensus, they raise serious questions regarding how they came to their conclusions. Carl Sagan declaimed that “extraordinary claims require extraordinary evidence,” but his principle was hardly novel. Some authors quote the French polymath Pierre Simon Marquis de Laplace, who wrote in 1810: “[p]lus un fait est extraordinaire, plus il a besoin d’être appuyé de fortes preuves,”[5] but as the Quote Investigator documents,[6] the basic idea is much older, going back at least another century to a church rector who expressed his skepticism of a contemporary’s claim of direct communication with the almighty: “Sure, these Matters being very extraordinary, will require a very extraordinary Proof.”[7]

Ted Simon’s essay is also worth consulting because he notes that many sources of apparent consensus are really faux consensus: the pronouncements of self-appointed intellectual authoritarians who have systematically excluded some points of view, while turning a blind eye to their own positional conflicts.

Lawyers, courts, and academics should be concerned that Cheng’s “consensus principle” will change the focus from evidence, methodology, and inference, to a surrogate or proxy for validity. And the sociological notion of consensus will then require litigation of whether some group really has announced a consensus. Consensus statements in some areas abound, but inquiring minds may want to know whether they are the result of rigorous, systematic reviews of the pertinent studies, and whether the available studies can support the claimed consensus.

Professor Cheng is hard at work on a book-length explication of his proposal, and some criticism will have to await the event.[8] Perhaps Cheng will overcome the objections placed against his proposal.[9] Some of the examples Professor Cheng has given, however, are not encouraging, such as his dramatic misreading of the American Statistical Association’s 2016 p-value consensus statement to represent, in Cheng’s words:

“[w]hile historically used as a rule of thumb, statisticians have now concluded that using the 0.05 [p-value] threshold is more distortive than helpful.”[10]

The 2016 Statement said no such thing, although a few statisticians attempted to distort the statement in the way that Cheng suggests. In 2021, a select committee of leading statisticians, appointed by the President of the ASA, issued a statement to make clear that the ASA had not embraced the Cheng misinterpretation.[11] This one example alone does not bode well for the viability of Cheng’s consensus principle.


[1] Ted Simon, “Scientific consensus is a fool’s errand made worse by IARC” (Oct. 2023).

[2] Michael Crichton, “Aliens Cause Global Warming,” Caltech Michelin Lecture (Jan. 17, 2003).

[3] Edward K. Cheng, “The Consensus Rule: A New Approach to Scientific Evidence,” 75 Vanderbilt L. Rev. 407 (2022) [Consensus Rule].

[4] See Norman J. Shachoy Symposium, The Consensus Rule: A New Approach to the Admissibility of Scientific Evidence, 67 Villanova L. Rev. (2022); David S. Caudill, “The ‘Crisis of Expertise’ Reaches the Courtroom: An Introduction to the Symposium on, and a Response to, Edward Cheng’s Consensus Rule,” 67 Villanova L. Rev. 837 (2022); Harry Collins, “The Owls: Some Difficulties in Judging Scientific Consensus,” 67 Villanova L. Rev. 877 (2022); Robert Evans, “The Consensus Rule: Judges, Jurors, and Admissibility Hearings,” 67 Villanova L. Rev. 883 (2022); Martin Weinel, “The Adversity of Adversarialism: How the Consensus Rule Reproduces the Expert Paradox,” 67 Villanova L. Rev. 893 (2022); Wendy Wagner, “The Consensus Rule: Lessons from the Regulatory World,” 67 Villanova L. Rev. 907 (2022); Edward K. Cheng, Elodie O. Currier & Payton B. Hampton, “Embracing Deference,” 67 Villanova L. Rev. 855 (2022).

[5] Pierre-Simon Laplace, Théorie analytique des probabilités (1812) (“The more extraordinary a fact, the more it needs to be supported by strong proofs.”). See Tressoldi, “Extraordinary Claims Require Extraordinary Evidence: The Case of Non-Local Perception, a Classical and Bayesian Review of Evidences,” 2 Frontiers Psych. 117 (2011); Charles Coulston Gillispie, Pierre-Simon Laplace, 1749-1827: a life in exact science (1997).

[6] “Extraordinary Claims Require Extraordinary Evidence” (Dec. 5, 2021).

[7] Benjamin Bayly, An Essay on Inspiration 362, part 2 (2nd ed. 1708).

[8] The Consensus Principle, under contract with the University of Chicago Press.

[9] See “Cheng’s Proposed Consensus Rule for Expert Witnesses” (Sept. 15, 2022); “Further Thoughts on Cheng’s Consensus Rule” (Oct. 3, 2022); “Consensus Rule – Shadows of Validity” (Apr. 26, 2023).

[10] Consensus Rule at 424 (citing but not quoting Ronald L. Wasserstein & Nicole A. Lazar, “The ASA Statement on p-Values: Context, Process, and Purpose,” 70 Am. Statistician 129, 131 (2016)).

[11] Yoav Benjamini, Richard D. DeVeaux, Bradley Efron, Scott Evans, Mark Glickman, Barry Graubard, Xuming He, Xiao-Li Meng, Nancy Reid, Stephen M. Stigler, Stephen B. Vardeman, Christopher K. Wikle, Tommy Wright, Linda J. Young, and Karen Kafadar, “The ASA President’s Task Force Statement on Statistical Significance and Replicability,” 15 Annals of Applied Statistics 1084 (2021); see also “A Proclamation from the Task Force on Statistical Significance” (June 21, 2021).

Just Dissertations

October 27th, 2023

One of my childhood joys was roaming the stacks of libraries and browsing for arcane learning stored in aging books. Often, I had no particular goal in my roaming, and I flitted from topic to topic. Occasionally, however, I came across useful learning. It was in one college library, for instance, that I discovered the process for making nitrogen tri-iodide, which provided me with some simple-minded amusement for years. (I only narrowly avoided detection by Dean Brownlee for a prank involving NI3 in chemistry lab.)

Nowadays, most old books are off limits to the casual library visitor, but digital archives can satisfy my occasional compulsion to browse what is new and compelling in the world of research on topics of interest. And there can be no better source for new and topical research than browsing dissertations and theses, which are usually required to break new ground in scholarly research and debate. There are several online search tools for dissertations, such as ProQuest, EBSCO Open Dissertations, Theses and Dissertations, WorldCat Dissertations and Theses, Open Access Theses and Dissertations, and Yale Library Resources to Find Dissertations.

Some universities generously share the scholarship of their graduate students online, and there are some great gems freely available.[1] Other universities provide a catalogue of their students’ dissertations, the titles of which can be browsed and the texts of which can be downloaded. For lawyers interested in medico-legal issues, the London School of Hygiene & Tropical Medicine has a website, “LSHTM Research Online,” which is a delightful place to browse on a rainy afternoon, and which features a free, open-access repository of research. Most of the publications are dissertations, some 1,287 at present, on various medical and epidemiologic topics, from 1938 to the present.

The prominence of the London School of Hygiene & Tropical Medicine makes its historical research germane to medico-legal issues such as “state of the art,” notice, priority, knowledge, and intellectual provenance. A 1959 dissertation by J. D. Walters, a Surgeon Lieutenant of the Royal Navy, is included in the repository.[2] Walters’ dissertation is a treasure trove for the state-of-the-art case – who knew what, when – about asbestos health hazards, written before litigation distorted perspectives on the matter. Walters’ dissertation shows, in contemporaneous scholarship rather than hindsight second-guessing, that Sir Richard Doll’s 1955 study, flawed as it was by contemporaneous standards, was seen as establishing an association between asbestosis (not asbestos exposure) and lung cancer. Walters’ careful assessment of how asbestos was actually used in British dockyards documents the differences between British and American product use. The British dockyards had employed full-time laggers since 1946, who used spray asbestos, asbestos (amosite and crocidolite) mattresses, as well as lower asbestos-content insulation.

Walters reported cases of asbestosis among the laggers. Written four years before Irving Selikoff published on an asbestosis hazard among laggers, the predominant end-users of asbestos-containing insulation, Walters’ dissertation preempts Selikoff’s claim of priority in identifying the asbestos hazard, and it shows that large employers, such as the Royal Navy and the United States Navy, were well aware of asbestos hazards before companies began placing warning labels. Like Selikoff, Walters typically had no information about worker compliance with safety regulations, such as respirator use. Walters emphasized the need for industrial medical officers to be aware of the asbestosis hazard, and the means to prevent it. Noticeably absent was any suggestion that a warning label on bags of asbestos or boxes of pre-fabricated insulation was relevant to the medical officer’s work in controlling the hazard.

Among the litigation-relevant finds in the repository is the doctoral thesis of Francis Douglas Kelly Liddell,[3] on the mortality of the Quebec chrysotile workers, with most of the underlying data.[4] A dissertation by Keith Richard Sullivan reported on the mortality patterns of civilian workers at Royal Navy dockyards in England.[5] Sullivan found no increased risk of lung cancer, although excesses of asbestosis and mesothelioma occurred at all dockyards. A critical look at meta-analyses of formaldehyde and cancer outcomes in one dissertation shows prevalent biases in available studies, and insufficient evidence of causation.[6]

Some of the other interesting dissertations with historical medico-legal relevance are:

Francis, The evaluation of small airway disease in the human lung with special reference to tests which are suitable for epidemiological screening, PhD thesis, London School of Hygiene & Tropical Medicine (1978) DOI: https://doi.org/10.17037/PUBS.04655290

Gillian Mary Regan, A Study of pulmonary function in asbestosis, PhD thesis, London School of Hygiene & Tropical Medicine (1977) DOI: https://doi.org/10.17037/PUBS.04655127

Christopher J. Sirrs, Health and Safety in the British Regulatory State, 1961-2001: the HSC, HSE and the Management of Occupational Risk, PhD thesis, London School of Hygiene & Tropical Medicine (2016) DOI: https://doi.org/10.17037/PUBS.02548737

Michael Etrata Rañopa, Methodological issues in electronic healthcare database studies of drug cancer associations: identification of cancer, and drivers of discrepant results, PhD thesis, London School of Hygiene & Tropical Medicine (2016). DOI: https://doi.org/10.17037/PUBS.02572609

Melanie Smuk, Missing Data Methodology: Sensitivity analysis after multiple imputation, PhD thesis, London School of Hygiene & Tropical Medicine (2015) DOI: https://doi.org/10.17037/PUBS.02212896

John Ross Tazare, High-dimensional propensity scores for data-driven confounder adjustment in UK electronic health records, PhD thesis, London School of Hygiene & Tropical Medicine (2022). DOI: https://doi.org/10.17037/PUBS.046647276/

Rebecca Jane Hardy, Meta-analysis techniques in medical research: a statistical perspective, PhD thesis, London School of Hygiene & Tropical Medicine (1995) DOI: https://doi.org/10.17037/PUBS.00682268

Jemma Walker, Bayesian modelling in genetic association studies, PhD thesis, London School of Hygiene & Tropical Medicine (2012) DOI: https://doi.org/10.17037/PUBS.01635516

Marieke Schoonen, Pharmacoepidemiology of autoimmune diseases, PhD thesis, London School of Hygiene & Tropical Medicine (2007) DOI: https://doi.org/10.17037/PUBS.04646551

Claudio John Verzilli, Method for the analysis of incomplete longitudinal data, PhD thesis, London School of Hygiene & Tropical Medicine (2003) DOI: https://doi.org/10.17037/PUBS.04646517

Martine Vrijheid, Risk of congenital anomaly in relation to residence near hazardous waste landfill sites, PhD thesis, London School of Hygiene & Tropical Medicine (2000) DOI: https://doi.org/10.17037/PUBS.00682274


[1] See, e.g., Benjamin Nathan Schachtman, Traumedy: Dark Comedic Negotiations of Trauma in Contemporary American Literature (2016).

[2] J.D. Walters, Asbestos – a potential hazard to health in the ship building and ship repairing industries, DrPH thesis, London School of Hygiene & Tropical Medicine (1959); https://doi.org/10.17037/PUBS.01273049.

[3] “The Lobby – Cut on the Bias” (July 6, 2020).

[4] Francis Douglas Kelly Liddell, Mortality of Quebec chrysotile workers in relation to radiological findings while still employed, PhD thesis, London School of Hygiene & Tropical Medicine (1978); DOI: https://doi.org/10.17037/PUBS.04656049

[5] Keith Richard Sullivan, Mortality patterns among civilian workers in Royal Navy Dockyards, PhD thesis, London School of Hygiene & Tropical Medicine (1994) DOI: https://doi.org/10.17037/PUBS.04656717

[6] Damien Martin McElvenny, Meta-analysis of Rare Diseases in Occupational Epidemiology, PhD thesis, London School of Hygiene & Tropical Medicine (2017) DOI: https://doi.org/10.17037/PUBS.03894558

Science & the Law – from the Proceedings of the National Academy of Sciences

October 5th, 2023

The current issue of the Proceedings of the National Academy of Sciences (PNAS) features a medley of articles on science generally, and forensic science specifically, in the law.[1] The general editor of the compilation appears to be editorial board member Thomas D. Albright, the Conrad T. Prebys Professor of Vision Research at the Salk Institute for Biological Studies.

I have not had time to plow through the set of offerings, but even a superficial inspection reveals that the articles will be of interest to lawyers and judges involved in the litigation of scientific issues. The authors seem to agree that, descriptively and prescriptively, validity is more important than expertise in the legal consideration of scientific evidence.

1. Thomas D. Albright, “A scientist’s take on scientific evidence in the courtroom,” 120 Proceedings of the National Academy of Sciences e2301839120 (2023).

Albright’s essay was edited by Henry Roediger, a psychologist at Washington University in St. Louis.

Abstract

Scientific evidence is frequently offered to answer questions of fact in a court of law. DNA genotyping may link a suspect to a homicide. Receptor binding assays and behavioral toxicology may testify to the teratogenic effects of bug repellant. As for any use of science to inform fateful decisions, the immediate question raised is one of credibility: Is the evidence a product of valid methods? Are results accurate and reproducible? While the rigorous criteria of modern science seem a natural model for this evaluation, there are features unique to the courtroom that make the decision process scarcely recognizable by normal standards of scientific investigation. First, much science lies beyond the ken of those who must decide; outside “experts” must be called upon to advise. Second, questions of fact demand immediate resolution; decisions must be based on the science of the day. Third, in contrast to the generative adversarial process of scientific investigation, which yields successive approximations to the truth, the truth-seeking strategy of American courts is terminally adversarial, which risks fracturing knowledge along lines of discord. Wary of threats to credibility, courts have adopted formal rules for determining whether scientific testimony is trustworthy. Here, I consider the effectiveness of these rules and explore tension between the scientists’ ideal that momentous decisions should be based upon the highest standards of evidence and the practical reality that those standards are difficult to meet. Justice lies in carefully crafted compromise that benefits from robust bonds between science and law.

2. Thomas D. Albright, David Baltimore, Anne-Marie Mazza, Jennifer L. Mnookin & David S. Tatel, “Science, evidence, law, and justice,” 120 Proceedings of the National Academy of Sciences e2301839120 (2023).

Professor Baltimore is a Nobel laureate and researcher in biology, now at the California Institute of Technology. Anne-Marie Mazza is the director of the Committee on Science, Technology, and Law, of the National Academies of Sciences, Engineering, and Medicine. Jennifer Mnookin is the chancellor of the University of Wisconsin, Madison; previously, she was the dean of the UCLA School of Law. Judge Tatel is a federal judge on the United States Court of Appeals for the District of Columbia Circuit.

Abstract

For nearly 25 y, the Committee on Science, Technology, and Law (CSTL), of the National Academies of Sciences, Engineering, and Medicine, has brought together distinguished members of the science and law communities to stimulate discussions that would lead to a better understanding of the role of science in legal decisions and government policies and to a better understanding of the legal and regulatory frameworks that govern the conduct of science. Under the leadership of recent CSTL co-chairs David Baltimore and David Tatel, and CSTL director Anne-Marie Mazza, the committee has overseen many interdisciplinary discussions and workshops, such as the international summits on human genome editing and the science of implicit bias, and has delivered advisory consensus reports focusing on topics of broad societal importance, such as dual use research in the life sciences, voting systems, and advances in neural science research using organoids and chimeras. One of the most influential CSTL activities concerns the use of forensic evidence by law enforcement and the courts, with emphasis on the scientific validity of forensic methods and the role of forensic testimony in bringing about justice. As coeditors of this Special Feature, CSTL alumni Tom Albright and Jennifer Mnookin have recruited articles at the intersection of science and law that reveal an emerging scientific revolution of forensic practice, which we hope will engage a broad community of scientists, legal scholars, and members of the public with interest in science-based legal policy and justice reform.

3. Nicholas Scurich, David L. Faigman, and Thomas D. Albright, “Scientific guidelines for evaluating the validity of forensic feature-comparison methods,” 120 Proceedings of the National Academy of Sciences (2023).

Nicholas Scurich is the chair of the Department of Psychological Science at the University of California, Irvine. David Faigman has written prolifically about science in the law; he is now the chancellor and dean of the University of California College of the Law, San Francisco.

Abstract

When it comes to questions of fact in a legal context—particularly questions about measurement, association, and causality—courts should employ ordinary standards of applied science. Applied sciences generally develop along a path that proceeds from a basic scientific discovery about some natural process to the formation of a theory of how the process works and what causes it to fail, to the development of an invention intended to assess, repair, or improve the process, to the specification of predictions of the instrument’s actions and, finally, empirical validation to determine that the instrument achieves the intended effect. These elements are salient and deeply embedded in the cultures of the applied sciences of medicine and engineering, both of which primarily grew from basic sciences. However, the inventions that underlie most forensic science disciplines have few roots in basic science, and they do not have sound theories to justify their predicted actions or results of empirical tests to prove that they work as advertised. Inspired by the “Bradford Hill Guidelines”—the dominant framework for causal inference in epidemiology—we set forth four guidelines that can be used to establish the validity of forensic comparison methods generally. This framework is not intended as a checklist establishing a threshold of minimum validity, as no magic formula determines when particular disciplines or hypotheses have passed a necessary threshold. We illustrate how these guidelines can be applied by considering the discipline of firearm and tool mark examination.

4. Peter Stout, “The secret life of crime labs,” 120 Proceedings of the National Academy of Sciences e2303592120 (2023).

Peter Stout is a scientist with the Houston Forensic Science Center, in Houston, Texas. The Center describes itself as “an independent local government corporation,” which provides forensic “services” to the Houston police.

Abstract

Houston TX experienced a widely known failure of its police forensic laboratory. This gave rise to the Houston Forensic Science Center (HFSC) as a separate entity to provide forensic services to the City of Houston. HFSC is a very large forensic laboratory and has made significant progress at remediating the past failures and improving public trust in forensic testing. HFSC has a large and robust blind testing program, which has provided many insights into the challenges forensic laboratories face. HFSC’s journey from a notoriously failed lab to a model also gives perspective to the resource challenges faced by all labs in the country. Challenges for labs include the pervasive reality of poor-quality evidence. Also that forensic laboratories are necessarily part of a much wider system of interdependent functions in criminal justice making blind testing something in which all parts have a role. This interconnectedness also highlights the need for an array of oversight and regulatory frameworks to function properly. The major essential databases in forensics need to be a part of blind testing programs and work is needed to ensure that the results from these databases are indeed producing correct results and those results are being correctly used. Last, laboratory reports of “inconclusive” results are a significant challenge for laboratories and the system to better understand when these results are appropriate, necessary and most importantly correctly used by the rest of the system.

5. Brandon L. Garrett & Cynthia Rudin, “Interpretable algorithmic forensics,” 120 Proceedings of the National Academy of Sciences e2301842120 (2023).

Garrett teaches at the Duke University School of Law. Rudin teaches statistics at Duke University.

Abstract

One of the most troubling trends in criminal investigations is the growing use of “black box” technology, in which law enforcement rely on artificial intelligence (AI) models or algorithms that are either too complex for people to understand or they simply conceal how it functions. In criminal cases, black box systems have proliferated in forensic areas such as DNA mixture interpretation, facial recognition, and recidivism risk assessments. The champions and critics of AI argue, mistakenly, that we face a catch 22: While black box AI is not understandable by people, they assume that it produces more accurate forensic evidence. In this Article, we question this assertion, which has so powerfully affected judges, policymakers, and academics. We describe a mature body of computer science research showing how “glass box” AI—designed to be interpretable—can be more accurate than black box alternatives. Indeed, black box AI performs predictably worse in settings like the criminal system. Debunking the black box performance myth has implications for forensic evidence, constitutional criminal procedure rights, and legislative policy. Absent some compelling—or even credible—government interest in keeping AI as a black box, and given the constitutional rights and public safety interests at stake, we argue that a substantial burden rests on the government to justify black box AI in criminal cases. We conclude by calling for judicial rulings and legislation to safeguard a right to interpretable forensic AI.

6. Jed S. Rakoff & Goodwin Liu, “Forensic science: A judicial perspective,” 120 Proceedings of the National Academy of Sciences e2301838120 (2023).

Judge Rakoff has written previously on forensic evidence. He is a federal district court judge in the Southern District of New York. Goodwin Liu is a justice on the California Supreme Court. Their article was edited by Professor Mnookin.

Abstract

This article describes three major developments in forensic evidence and the use of such evidence in the courts. The first development is the advent of DNA profiling, a scientific technique for identifying and distinguishing among individuals to a high degree of probability. While DNA evidence has been used to prove guilt, it has also demonstrated that many individuals have been wrongly convicted on the basis of other forensic evidence that turned out to be unreliable. The second development is the US Supreme Court precedent requiring judges to carefully scrutinize the reliability of scientific evidence in determining whether it may be admitted in a jury trial. The third development is the publication of a formidable National Academy of Sciences report questioning the scientific validity of a wide range of forensic techniques. The article explains that, although one might expect these developments to have had a major impact on the decisions of trial judges whether to admit forensic science into evidence, in fact, the response of judges has been, and continues to be, decidedly mixed.

7. Jonathan J. Koehler, Jennifer L. Mnookin, and Michael J. Saks, “The scientific reinvention of forensic science,” 120 Proceedings of the National Academy of Sciences e2301840120 (2023).

Koehler is a professor of law at the Northwestern Pritzker School of Law. Saks is a professor of psychology at Arizona State University and Regents Professor of Law at the Sandra Day O’Connor College of Law.

Abstract

Forensic science is undergoing an evolution in which a long-standing “trust the examiner” focus is being replaced by a “trust the scientific method” focus. This shift, which is in progress and still partial, is critical to ensure that the legal system uses forensic information in an accurate and valid way. In this Perspective, we discuss the ways in which the move to a more empirically grounded scientific culture for the forensic sciences impacts testing, error rate analyses, procedural safeguards, and the reporting of forensic results. However, we caution that the ultimate success of this scientific reinvention likely depends on whether the courts begin to engage with forensic science claims in a more rigorous way.

8. William C. Thompson, “Shifting decision thresholds can undermine the probative value and legal utility of forensic pattern-matching evidence,” 120 Proceedings of the National Academy of Sciences e2301844120 (2023).

Thompson is professor emeritus in the Department of Criminology, Law & Society, University of California, Irvine.

Abstract

Forensic pattern analysis requires examiners to compare the patterns of items such as fingerprints or tool marks to assess whether they have a common source. This article uses signal detection theory to model examiners’ reported conclusions (e.g., identification, inconclusive, or exclusion), focusing on the connection between the examiner’s decision threshold and the probative value of the forensic evidence. It uses a Bayesian network model to explore how shifts in decision thresholds may affect rates and ratios of true and false convictions in a hypothetical legal system. It demonstrates that small shifts in decision thresholds, which may arise from contextual bias, can dramatically affect the value of forensic pattern-matching evidence and its utility in the legal system.

9. Marlene Meyer, Melissa F. Colloff, Tia C. Bennett, Edward Hirata, Amelia Kohl, Laura M. Stevens, Harriet M. J. Smith, Tobias Staudigl & Heather D. Flowe, “Enabling witnesses to actively explore faces and reinstate study-test pose during a lineup increases discriminability,” 120 Proceedings of the National Academy of Sciences e2301845120 (2023).

Marlene Meyer, Melissa F. Colloff, Tia C. Bennett, Edward Hirata, Amelia Kohl, and Heather D. Flowe are psychologists at the School of Psychology, University of Birmingham (United Kingdom). Harriet M. J. Smith is a psychologist in the School of Psychology, Nottingham Trent University, Nottingham, United Kingdom, and Tobias Staudigl is a psychologist in the Department of Psychology, Ludwig-Maximilians-Universität München, in Munich, Germany.

Abstract

Accurate witness identification is a cornerstone of police inquiries and national security investigations. However, witnesses can make errors. We experimentally tested whether an interactive lineup, a recently introduced procedure that enables witnesses to dynamically view and explore faces from different angles, improves the rate at which witnesses identify guilty over innocent suspects compared to procedures traditionally used by law enforcement. Participants encoded 12 target faces, either from the front or in profile view, and then attempted to identify the targets from 12 lineups, half of which were target present and the other half target absent. Participants were randomly assigned to a lineup condition: simultaneous interactive, simultaneous photo, or sequential video. In the front-encoding and profile-encoding conditions, Receiver Operating Characteristics analysis indicated that discriminability was higher in interactive compared to both photo and video lineups, demonstrating the benefit of actively exploring the lineup members’ faces. Signal-detection modeling suggested interactive lineups increase discriminability because they afford the witness the opportunity to view more diagnostic features such that the nondiagnostic features play a proportionally lesser role. These findings suggest that eyewitness errors can be reduced using interactive lineups because they create retrieval conditions that enable witnesses to actively explore faces and more effectively sample features.


[1] 120 Proceedings of the National Academy of Sciences (Oct. 10, 2023).

The IARC-hy of Evidence – Incoherent & Inconsistent Classifications of Carcinogenicity

September 19th, 2023

Recently, two lawyers wrote an article in a legal trade magazine about excluding epidemiologic evidence in civil litigation.[1] The article was wildly wide of the mark, with several conceptual and practical errors.[2] For starters, the authors discussed Rule 702 as excluding epidemiologic studies and evidence, when the rule addresses the admissibility of expert witness opinion testimony. The authors’ most egregious error, however, was their recommendation that counsel urge the classifications of chemicals with respect to carcinogenicity, by the International Agency for Research on Cancer (IARC) and by regulatory agencies, as probative for or against causation.

The project of evaluating the evidence for, or against, carcinogenicity of the myriad natural and synthetic agents to which humans are exposed is certainly important. Certainly, IARC has taken the project seriously. There have, however, been problems with IARC’s classifications of specific chemicals, pharmaceuticals, or exposure circumstances, but a basic problem with the classifications begins with the classes themselves. Classification requires defined classes. I don’t mean to be anti-semantic, but IARC’s definitions and its hierarchy of carcinogenicity are not entirely coherent.

The agency was established in 1965, and by the early 1970s, found itself in the business of preparing “monographs on the evaluation of carcinogenic risk of chemicals to man.” Originally, the IARC set out to classify the carcinogenicity of chemicals, but over the years, its scope increased to include complex mixtures, physical agents such as different forms of radiation, and biological organisms. To date, there have been 134 IARC monographs, addressing 1,045 “agents” (either substances or exposure circumstances).

From its beginnings, the IARC has conducted its classifications through working groups that meet to review and evaluate evidence, and classify the cancer hazards of “agents” under discussion. The breakdown of IARC’s classifications among four groups currently is:

Group 1 – Carcinogenic to humans (127 agents)

Group 2A – Probably carcinogenic to humans (95 agents)

Group 2B – Possibly carcinogenic to humans (323 agents)

Group 3 – Not classifiable as to its carcinogenicity to humans (500 agents)

Previously, the IARC classification included a Group 4, for agents that are probably not carcinogenic to humans. After decades of review, the IARC placed only a single agent, caprolactam, in Group 4, apparently because the agency found everything else in the world to be presumptively a cause of cancer. The IARC could not find sufficiently strong evidence even for water, air, or basic foods to declare that they do not cause cancer in humans. Ultimately, the IARC abandoned Group 4, in favor of a presumption of universal carcinogenicity.

The IARC describes its carcinogen classification procedures, requirements, and rationales in a document known as “The Preamble.” Any discussion of IARC classifications, whether in scientific publications or in legal briefs, without reference to this document should be suspect. The Preamble seeks to define many of the words in the classificatory scheme, some in ways that are not intuitive. This document has been amended over time, and the most recent iteration can be found online at the IARC website.[3]

IARC claims to build its classifications upon “consensus” evaluations, based in turn upon considerations of

(a) the strength of evidence of carcinogenicity in humans,

(b) the evidence of carcinogenicity in experimental (non-human) animals, and

(c) the mechanistic evidence of carcinogenicity.

IARC further claims that its evaluations turn on the use of “transparent criteria and descriptive terms.”[4] This last claim is, for some terms, falsifiable.

The working groups are described as engaged in consensus evaluations, although past evaluations have been reached on a simple majority vote of the working group. The working groups are charged with considering the three lines of evidence, described above, for any given agent, and with reaching a synthesis in the form of the IARC classificatory scheme. The chart below, from the Preamble, roughly describes how working groups may “mix and match” lines of evidence, of varying degrees of robustness and validity (vel non), to reach a classification.

[Table 4, from the IARC Preamble]

Agents placed in Group 1 are thus “carcinogenic to humans.” Interestingly, IARC does not refer to Group 1 carcinogens as “known” carcinogens, although many commentators are prone to do so. The implication of calling Group 1 agents “known carcinogens” is to distinguish Groups 2A, 2B, and 3 as agents “not known to cause cancer.” Rather than “known,” the adjective IARC uses is “sufficient,” describing the evidence in humans; but IARC also allows an agent to reach Group 1 on “limited,” or even “inadequate,” human evidence, if the other lines of evidence (carcinogenicity in experimental animals, or mechanistic evidence in humans) are sufficient.

In describing “sufficient” evidence, the IARC’s Preamble does not refer to epidemiologic evidence as potentially “conclusive” or “definitive”; rather its use of “sufficient” implies, perhaps non-transparently, that its labels of “limited” or “inadequate” evidence in humans refer to insufficient evidence. IARC gives an unscientific, inflated weight and understanding to “limited evidence of carcinogenicity,” by telling us that

“[a] causal interpretation of the positive association observed in the body of evidence on exposure to the agent and cancer is credible, but chance, bias, or confounding could not be ruled out with reasonable confidence.”[5]

Remarkably, for IARC, credible interpretations of causality can be based upon evidentiary displays that are confounded or biased.  In other words, non-credible associations may support IARC’s conclusions of causality. Causal interpretations of epidemiologic evidence are “credible” according to IARC, even though Sir Austin’s predicate of a valid association is absent.[6]

The IARC studiously avoids, however, noting that any classification is based upon “insufficient” evidence, even though that evidence may be less than sufficient, as in “limited” or “inadequate.” A close look at Table 4 reveals that some Group 1 classifications, and all Group 2A, 2B, and 3 classifications, are based upon insufficient evidence of carcinogenicity in humans.

Non-Probable Probabilities

The classification immediately below Group 1 is Group 2A, for agents “probably carcinogenic to humans.” The IARC’s use of “probably” is problematic. Group 1 carcinogens require only “sufficient” evidence of human carcinogenicity, and there is no suggestion that any aspect of a Group 1 evaluation requires apodictic, conclusive, or even “definitive” evidence. Accordingly, the determination of Group 1 carcinogens will be based upon evidence that is essentially probabilistic. Group 2A is also defined as having only “limited evidence of carcinogenicity in humans”; in other words, insufficient evidence of carcinogenicity in humans, or epidemiologic studies with uncontrolled confounding and biases.

Importing IARC 2A classifications into legal or regulatory arenas will allow judgments or regulations based upon “limited evidence” in humans, which as we have seen, can be based upon inconsistent observational studies, and studies that fail to measure and adjust for known and potential confounding risk factors and systematic biases. The 2A classification thus requires little substantively or semantically, and many 2A classifications leave juries and judges to determine whether a chemical or medication caused a human being’s cancer, when the basic predicates for Sir Austin Bradford Hill’s factors for causal judgment have not been met.[7]

An IARC evaluation of Group 2A, or “probably carcinogenic to humans,” would seem to satisfy the legal system’s requirement that an exposure to the agent of interest more likely than not causes the harm in question. Appearances and word usage in different contexts, however, can be deceiving. Probability is a continuous quantitative scale from zero to one. In Bayesian analyses, zero and one are unavailable because if either were our starting point, no amount of evidence could ever change our judgment of the probability of causation. (Cromwell’s Rule). The IARC informs us that its use of “probably” is purely idiosyncratic; the probability that a Group 2A agent causes cancer has “no quantitative” meaning. All the IARC intends is that a Group 2A classification “signifies a greater strength of evidence than possibly carcinogenic.”[8] Group 2A classifications are thus consistent with having posterior probabilities less than 0.5 (or 50 percent). A working group could judge the probability of a substance or a process to be carcinogenic to humans to be greater than zero, but no more than say ten percent, and still vote for a 2A classification, in keeping with the IARC Preamble. This low probability threshold for a 2A classification converts the judgment of “probably carcinogenic” into little more than precautionary prescriptions, rendered when the most probable assessment is either ignorance or lack of causality. There is thus a practical certainty, close to 100%, that a 2A classification will confuse judges and juries, as well as the scientific community.
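The Bayesian point can be made concrete in a few lines. A minimal sketch, with an invented likelihood ratio, of why priors of exactly zero or one never move (Cromwell’s Rule), and of how modest a posterior can lurk behind the word “probably”:

```python
def posterior(prior: float, likelihood_ratio: float) -> float:
    """Posterior probability after one piece of evidence, via
    posterior odds = prior odds x likelihood ratio."""
    if prior in (0.0, 1.0):
        return prior  # degenerate priors can never be updated (Cromwell's Rule)
    prior_odds = prior / (1.0 - prior)
    post_odds = prior_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)

LR = 20.0  # invented: evidence 20 times likelier if the agent is carcinogenic

print(posterior(0.0, LR))    # 0.0 -- certainty of non-causation is immovable
print(posterior(1.0, LR))    # 1.0 -- certainty of causation is immovable
print(posterior(0.10, LR))   # ~0.69, still short of certainty
```

Under the Preamble, however, a 2A vote requires no particular posterior at all; a working group judging the probability to be 0.10 may still vote 2A, even though nothing in Bayes’ rule makes 0.10 “probable” in the ordinary sense.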

In addition to being based upon limited, that is insufficient, evidence of human carcinogenicity, Group 2A evaluations of “probable human carcinogenicity” connote “sufficient evidence” in experimental animals. An agent can be classified 2A even when the sufficient evidence of carcinogenicity occurs in only one of several non-human animal species, with the other animal species failing to show carcinogenicity. IARC 2A classifications can thus raise the thorny question in court whether a claimant is more like a rat or a mouse.

Courts should, because of the incoherent and diluted criteria for “probably carcinogenic,” exclude expert witness opinions based upon IARC 2A classifications as scientifically insufficient.[9] Given the distortion of ordinary language in its use of defined terms such as “sufficient,” “limited,” and “probable,” any evidentiary value to IARC 2A classifications, and expert witness opinion based thereon, is “substantially outweighed by a danger of … unfair prejudice, confusing the issues, [and] misleading the jury….”[10]

Everything is Possible

Group 2B denotes “possibly carcinogenic.” This year, the IARC announced that a working group had concluded that aspartame, an artificial sweetener, was “possibly carcinogenic.”[11] Such an evaluation, however, tells us nothing. If there are no studies at all of an agent, the agent could be said to be possibly carcinogenic. If there are inconsistent studies, even if the better designed studies are exculpatory, scientists could still say that the agent of interest was possibly carcinogenic. The 2B classification does not tell us anything because everything is possible until there is sufficient evidence to inculpate an agent as a cause of cancer in humans, or to exculpate it.

It’s a Hazard, Not a Risk

IARC’s classification does not include an assessment of exposure levels. Consequently, there is no consideration of dose or exposure level at which an agent becomes carcinogenic. IARC’s evaluations are limited to whether the agent is or is not carcinogenic. The IARC explicitly concedes that exposure to a carcinogenic agent may carry little risk, but it cannot bring itself to say no risk, or even benefit at low exposures.

As noted, the IARC classification scheme refers to the strength of the evidence that an agent is carcinogenic, and not to the quantitative risk of cancer from exposure at a given level. The Preamble explains the distinction as fundamental:

“A cancer hazard is an agent that is capable of causing cancer, whereas a cancer risk is an estimate of the probability that cancer will occur given some level of exposure to a cancer hazard. The Monographs assess the strength of evidence that an agent is a cancer hazard. The distinction between hazard and risk is fundamental. The Monographs identify cancer hazards even when risks appear to be low in some exposure scenarios. This is because the exposure may be widespread at low levels, and because exposure levels in many populations are not known or documented.”[12]

This attempted explanation reveals important aspects of IARC’s project. First, there is an unproven assumption that there will be cancer hazards regardless of the exposure levels. The IARC contemplates that there may be circumstances of low levels of risk from low levels of exposure, but it elides the important issue of thresholds. Second, IARC’s distinction between hazard and risk is obscured by its own classifications. For instance, when IARC evaluated crystalline silica and classified it in Group 1, it did so for only “occupational exposures.”[13] And yet, when IARC evaluated the hazard of coal exposure, it placed coal dust in Group 3, even though coal dust contains crystalline silica.[14] Similarly, in 2018, the IARC classified coffee in Group 3,[15] even though every drop of coffee contains acrylamide, which is, according to IARC, a Group 2A “probable human carcinogen.”[16]


[1] Christian W. Castile & Stephen J. McConnell, “Excluding Epidemiological Evidence Under FRE 702,” For The Defense 18 (June 2023) [Castile].

[2] “Excluding Epidemiologic Evidence Under Federal Rule of Evidence 702” (Aug. 26, 2023).

[3] IARC Monographs on the Identification of Carcinogenic Hazards to Humans – Preamble (2019).

[4] Jonathan M. Samet, Weihsueh A. Chiu, Vincent Cogliano, Jennifer Jinot, David Kriebel, Ruth M. Lunn, Frederick A. Beland, Lisa Bero, Patience Browne, Lin Fritschi, Jun Kanno, Dirk W. Lachenmeier, Qing Lan, Gerard Lasfargues, Frank Le Curieux, Susan Peters, Pamela Shubat, Hideko Sone, Mary C. White, Jon Williamson, Marianna Yakubovskaya, Jack Siemiatycki, Paul A. White, Kathryn Z. Guyton, Mary K. Schubauer-Berigan, Amy L. Hall, Yann Grosse, Veronique Bouvard, Lamia Benbrahim-Tallaa, Fatiha El Ghissassi, Beatrice Lauby-Secretan, Bruce Armstrong, Rodolfo Saracci, Jiri Zavadil, Kurt Straif, and Christopher P. Wild, “The IARC Monographs: Updated Procedures for Modern and Transparent Evidence Synthesis in Cancer Hazard Identification,” 112 J. Nat’l Cancer Inst. djz169 (2020).

[5] Preamble at 31.

[6] See Austin Bradford Hill, “The Environment and Disease: Association or Causation?” 58 Proc. Royal Soc’y Med. 295 (1965) (noting that only when “[o]ur observations reveal an association between two variables, perfectly clear-cut and beyond what we would care to attribute to the play of chance,” do we move on to consider the nine articulated factors for determining whether an association is causal).

[7] Id.

[8] IARC Monographs on the Identification of Carcinogenic Hazards to Humans – Preamble 31 (2019) (“The terms probably carcinogenic and possibly carcinogenic have no quantitative significance and are used as descriptors of different strengths of evidence of carcinogenicity in humans.”).

[9] See “Is the IARC lost in the weeds” (Nov. 30, 2019); “Good Night Styrene” (Apr. 18, 2019).

[10] Fed. R. Evid. 403.

[11] Elio Riboli, et al., “Carcinogenicity of aspartame, methyleugenol, and isoeugenol,” 24 The Lancet Oncology 848–50 (2023); IARC, “Aspartame hazard and risk assessment results released” (2023).

[12] Preamble at 2.

[13] IARC Monograph 68, at 41 (1997) (“For these reasons, the Working Group therefore concluded that overall the epidemiological findings support increased lung cancer risks from inhaled crystalline silica (quartz and cristobalite) resulting from occupational exposure.”).

[14] IARC Monograph 68, at 337 (1997).

[15] IARC Monograph No. 116, Drinking Coffee, Mate, and Very Hot Beverages (2018).

[16] IARC Monograph no. 60, Some Industrial Chemicals (1994).

PLPs & Five-Legged Dogs

September 1st, 2023

All lawyers have heard the puzzle of “how many legs does a dog have if you call his tail a leg?” The puzzle is often misattributed to Abraham Lincoln, who used the puzzle at various times, including in jury speeches. The answer of course is: “Four. Saying that a tail is a leg does not make it a leg.” Quote investigators have traced the puzzle as far back as 1825, when newspapers quoted legislator John W. Hulbert as saying that something “reminded him of the story.”[1]

What do we call a person who becomes pregnant and delivers a baby?

A woman.

The current, trending fashion is to call such a person a PLP, a person who becomes pregnant and lactates. This façon de parler is particularly misleading if it is meant as an accommodation to the transgender population. Transgender women will not show up as pregnant or lactating, and transgender men will show up only if their transition is incomplete and has left them with functional female reproductive organs.

In 2010, Guinness World Records named Thomas Beatie the “World’s First Married Man to Give Birth.” Thomas Beatie is now legally a man, which is just another way of saying that he chose to identify as a man, and gained legal recognition for his choice. Beatie was born as a female, matured into a woman, and had ovaries and a uterus. Beatie was, in other words, biologically a female when she went through puberty and became biologically a woman.

Beatie underwent partial gender reassignment surgery, consisting of at least a double mastectomy, and took testosterone replacement therapy (off label), but retained ovaries and a uterus.

Guinness makes a fine stout, and we may look upon it kindly for having nurtured the statistical thinking of William Sealy Gosset. Guinness, however, cannot make a dog have five legs simply by agreeing to call its tail a leg. Beatie was not the first pregnant man; rather he was the first person, born with functional female reproductive organs, to have his male gender identity recognized by a state, who then conceived and delivered a newborn. If Guinness wants to call this the first “legal man” to give birth, by semantic legerdemain, that is fine. Certainly we can and should publicly be respectful of transgendered persons, and work to prevent them from being harassed or embarrassed. There may well be many situations in which we would change our linguistic usage to acknowledge a transsexual male as the mother of a child.[2] We do not, however, have to change biology to suit their choices, or to make useless gestures to have them feel included when their inclusion is not relevant to important scientific and medical issues.

Sadly, the NASEM would impose this politico-semanticism upon us while addressing the serious issue whether women of child-bearing age should be included in clinical trials.  At a recent workshop on “Developing a Framework to Address Legal, Ethical, Regulatory, and Policy Issues for Research Specific to Pregnant and Lactating Persons,”[3] the Academies introduced a particularly ugly neologism, “pregnant and lactating persons,” or PLP for short. The workshop reports:

“Approximately 4 million pregnant people in the United States give birth annually, and 70 percent of these individuals take at least one prescription medication during their pregnancy. Yet, pregnant and lactating persons are often excluded from clinical trials, and often have to make treatment decisions without an adequate understanding of the benefits and risks to themselves and their developing fetus or newborn baby. An ad hoc committee of the National Academies of Sciences, Engineering, and Medicine will develop a framework for addressing medicolegal and liability issues when planning or conducting research specific to pregnant and lactating persons.”[4]

The full report from NASEM, with fulsome use of the PLP phrase, is now available.[5]

J.K. Rowling is not the only one who is concerned about the erasure of the female from our discourse. Certainly we can acknowledge that transgenderism is real, without allowing the exception to erase biological facts about reproduction. After all, Guinness’s first pregnant “legal man” could not lactate, as a result of bilateral mastectomies, and thus the “legal man” was not a pregnant person who could lactate. And the pregnant “legal man” had functioning ovaries and uterus, which is not a matter of gender identity, but physiological functioning of biological female sex organs. Furthermore, including transgendered women, or “legal women,” without functional ovaries and uterus, in clinical trials will not answer difficult questions about whether experimental therapies may harm women’s reproductive function or their offspring in utero or after birth.

The inclusion of women in clinical trials is a serious issue precisely because experimental therapies may hold risks for participating women’s offspring in utero. The law may not permit a proper informed consent by women for their conceptus. And because of the new latitude legislatures enjoy to impose religion-based bans on abortion, a woman who conceives while taking an experimental drug may not be able to terminate a pregnancy that has been irreparably harmed by the drug.

The creation of the PLP category really confuses rather than elucidates how we answer the ethical and medical questions involved in testing new drugs or treatments for women. NASEM’s linguistic gerrymandering may allow some persons who have suffered from gender dysphoria to feel “included,” and perhaps to have their choices “validated,” but the inclusion of transgender women, or partially transgendered men, will not help answer the important questions facing clinical researchers. Taxpayers who fund NASEM and NIH deserve better clarity and judgment in the use of governmental funds in supporting clinical trials.

When and whence comes this PLP neologism?  An N-Gram search shows that “pregnant person” was found in the database before 1975, and that the phrase has waxed and waned since.

N-Gram for pregnant person, conducted September 1, 2023

A search of the National Library of Medicine PubMed database found several dozen hits, virtually all within the last two years. The earliest use was in 1970,[6] with a recrudescence 11 years later.[7]


From PubMed search for “pregnant person,” conducted Sept. 1, 2023 
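A search of this sort can be scripted against the National Library of Medicine’s public E-utilities interface. A sketch that merely constructs the query URL (the phrase and date range follow the text; the Title/Abstract field tag is one plausible choice, and fetching the URL returns a JSON hit count):

```python
from urllib.parse import urlencode

# Public NCBI E-utilities endpoint for searching PubMed.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

params = {
    "db": "pubmed",
    "term": '"pregnant person"[Title/Abstract]',
    "datetype": "pdat",  # filter on publication date
    "mindate": "1970",
    "maxdate": "2023",
    "retmode": "json",
}

url = f"{ESEARCH}?{urlencode(params)}"
print(url)
```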

In 2021, the New England Journal of Medicine published a paper on the safety of Covid-19 vaccines in “pregnant persons.”[8] As of last year, the Association of American Medical Colleges sponsored a report about physicians advocating for inclusion of “pregnant people” in clinical trials, in a story that noted that “[p]regnant patients are often excluded from clinical trials for fear of causing harm to them or their babies, but leaders in maternal-fetal medicine say the lack of data can be even more harmful.”[9] And currently, the New York State Department of Health advises that “[d]ue to changes that occur during pregnancy, pregnant people may be more susceptible to viral respiratory infections.”[10]

The PLP neologism was not always with us. Back in the dark ages, 2008, the National Cancer Institute issued guidelines on the inclusion of pregnant and breast-feeding women in clinical trials.[11] As recently as June 2021, the World Health Organization was still old school in discussing “pregnant and lactating women.”[12] The same year, over a dozen female scientists published a call to action about the inclusion of “pregnant women” in COVID-19 trials.[13]

Two years ago, I gingerly criticized the American Medical Association’s issuance of a linguistic manifesto on how physicians and scientists should use language to advance the Association’s notions of social justice.[14] Even then, however, the Association’s guide to “correct” usage was devoid of the phrase “pregnant persons” or “lactating persons.”[15] Pregnancy is a function of sex, not of gender.


[1]Suppose You Call a Sheep’s Tail a Leg, How Many Legs Will the Sheep Have?” QuoteResearch (Nov. 15, 2015).

[2] Sam Dylan More, “The pregnant man – an oxymoron?” 7 J. Gender Studies 319 (1998).

[3] National Academies of Sciences, Engineering, and Medicine, “Research with Pregnant and Lactating Persons: Mitigating Risk and Liability: Proceedings of a Workshop in Brief,” (2023).

[4] NASEM, “Research with Pregnant and Lactating Persons: Mitigating Risk and Liability: Proceedings of a Workshop–in Brief” (2023).

[5] National Academies of Sciences, Engineering, and Medicine, Inclusion of pregnant and lactating persons in clinical trials: Proceedings of a workshop (2023).

[6] W.K. Keller, “The pregnant person,” 68 J. Ky. Med. Ass’n 454 (1970).

[7] Vibiana M. Andrade, “The toxic workplace: Title VII protection for the potentially pregnant person,” 4 Harvard Women’s Law J. 71 (1981).

[8] Tom T. Shimabukuro, Shin Y. Kim, Tanya R. Myers, Pedro L. Moro, Titilope Oduyebo, Lakshmi Panagiotakopoulos, Paige L. Marquez, Christine K. Olson, Ruiling Liu, Karen T. Chang, Sascha R. Ellington, Veronica K. Burkel, et al., for the CDC v-safe COVID-19 Pregnancy Registry Team, “Preliminary Findings of mRNA Covid-19 Vaccine Safety in Pregnant Persons,” 384 New Engl. J. Med. 2273 (2021).

[9] Bridget Balch, “Prescribing without data: Doctors advocate for the inclusion of pregnant people in clinical research,” AAMC (Mar. 22, 2022).

[10] New York State Department of Health, “Pregnancy & COVID-19,” last visited August 31, 2023.

[11] NCI, “Guidelines Regarding the Inclusion of Pregnant and Breast-Feeding Women on Cancer Clinical Treatment Trials,” (May 29, 2008).

[12] WHO, “Update on WHO Interim recommendations on COVID-19 vaccination of pregnant and lactating women,” (June 10, 2021).

[13] Melanie M. Taylor, Loulou Kobeissi, Caron Kim, Avni Amin, Anna E. Thorson, Nita B. Bellare, Vanessa Brizuela, Mercedes Bonet, Edna Kara, Soe Soe Thwin, Hamsadvani Kuganantham, Moazzam Ali, Olufemi T. Oladapo, Nathalie Broutet, “Inclusion of pregnant women in COVID-19 treatment trials: a review and global call to action,” 9 The Lancet Global Health e366 (2021).

[14] American Medical Association, “Advancing Health Equity: A Guide to Language, Narrative and Concepts,” (2021); see Harriet Hall, “The AMA’s Guide to Politically Correct Language: Advancing Health Equity,” Science Based Medicine (Nov. 2, 2021).

[15] “When the American Medical Association Woke Up” (Nov. 17, 2021).

Excluding Epidemiologic Evidence under Federal Rule of Evidence 702

August 26th, 2023

We are 30-plus years into the “Daubert” era, in which federal district courts are charged with gatekeeping the relevance and reliability of scientific evidence. Not surprisingly, given the lawsuit industry’s propensity on occasion to use dodgy science, the burden of awakening the gatekeepers from their dogmatic slumber often falls upon defense counsel in civil litigation. It therefore behooves defense counsel to speak carefully and accurately about the grounds for Rule 702 exclusion of expert witness opinion testimony.

In the context of medical causation opinions based upon epidemiologic evidence, the first obvious point is that whichever party is arguing for exclusion should distinguish between excluding an expert witness’s opinion and prohibiting an expert witness from relying upon a particular study.  Rule 702 addresses the exclusion of opinions, whereas Rule 703 addresses barring an expert witness from relying upon hearsay facts or data unless they are reasonably relied upon by experts in the appropriate field. It would be helpful for lawyers and legal academics to refrain from talking about “excluding epidemiological evidence under FRE 702.”[1] Epidemiologic studies are rarely admissible themselves, but come into the courtroom as facts and data relied upon by expert witnesses. Rule 702 is addressed to the admissibility vel non of opinion testimony, some of which may rely upon epidemiologic evidence.

Another common lawyer mistake is the over-generalization that epidemiologic research provides the “gold standard” of general causation evidence.[2] Although epidemiology is often required, it is not “the medical science devoted to determining the cause of disease in human beings.”[3] To be sure, epidemiologic evidence will usually be required because there is no genetic or mechanistic evidence that will support the claimed causal inference, but counsel should be cautious in stating the requirement. Glib statements by courts that epidemiology is not always required are often simply an evasion of their responsibility to evaluate the validity of the proffered expert witness opinions. A more careful phrasing of the role of epidemiology will make such glib statements more readily open to rebuttal. In the absence of direct biochemical, physiological, or genetic mechanisms that can be identified as involved in bringing about the plaintiffs’ harm, epidemiologic evidence will be required, and it may well be the “gold standard” in such cases.[4]

When epidemiologic evidence is required, counsel will usually be justified in adverting to the “hierarchy of epidemiologic evidence.” Associations are shown in studies of various designs with vastly differing degrees of validity; and of course, associations are not necessarily causal. There are thus important nuances in educating the gatekeeper about this hierarchy. First, it will often be important to educate the gatekeeper about the distinction between descriptive and analytic studies, and the inability of descriptive studies such as case reports to support causal inferences.[5]

There is then the matter of confusion within the judiciary and among “scholars” about whether a hierarchy even exists. The chapter on epidemiology in the Reference Manual on Scientific Evidence appears to suggest the specious position that there is no hierarchy.[6] The chapter on medical testimony, however, takes a different approach in identifying a normative hierarchy of evidence to be considered in evaluating causal claims.[7] The medical testimony chapter specifies that meta-analyses of randomized controlled trials sit atop the hierarchy. Yet, there are divergent opinions about what should be at the top of the hierarchical evidence pyramid. Indeed, the rigorous, large randomized trial will often replace a meta-analysis of smaller trials as the more definitive evidence.[8] Back in 2007, a dubious meta-analysis of over 40 clinical trials led to a litigation frenzy over rosiglitazone.[9] A mega-trial of rosiglitazone showed that the 2007 meta-analysis was wrong.[10]

In any event, courts must purge their beliefs that once there is “some” evidence in support of a claim, their gatekeeping role is over. Randomized controlled trials really do trump observational studies, which virtually always have actual or potential confounding in their final analyses.[11] While disclaimers about the unavailability of randomized trials for putative toxic exposures are helpful, it is not quite accurate to say that it is “unethical to intentionally expose people to a potentially harmful dose of a suspected toxin.”[12] Such trials are done all the time when there is an expected therapeutic benefit that creates at least equipoise between the overall benefit and harm at the outset of the trial.[13]

At this late date, it seems shameful that courts must be reminded that evidence of associations does not suffice to show causation, but prudence dictates giving the reminder.[14] Defense counsel will generally exhibit a Pavlovian reflex to state that causality based upon epidemiology must be viewed through a lens of “Bradford Hill criteria.”[15] Rhetorically, this reflex seems wrong given that Sir Austin himself noted that his nine different considerations were “viewpoints,” not criteria. Taking a position that requires an immediate retreat seems misguided. Similarly, urging courts to invoke and apply the Bradford Hill considerations must be accompanied by the caveat that courts must first apply Bradford Hill’s predicate[16] for the nine considerations:

“Disregarding then any such problem in semantics we have this situation. Our observations reveal an association between two variables, perfectly clear-cut and beyond what we would care to attribute to the play of chance. What aspects of that association should we especially consider before deciding that the most likely interpretation of it is causation?”[17]

Courts should be mindful that the language from the famous, often-cited paper was part of an after-dinner address, in which Sir Austin was speaking informally. Scientists will understand that he was setting out a predicate that calls for

(1) an association, which is

(2) “perfectly clear cut,” such that bias and confounding are excluded, and

(3) “beyond what we would care to attribute to the play of chance,” with random error kept to an acceptable level, before advancing to further consideration of the nine viewpoints commonly recited.

These predicate findings are the basis for advancing to investigate Bradford Hill’s nine viewpoints; the viewpoints do not replace or supersede the predicates.[18]

Within the nine viewpoints, not all are of equal importance. Consistency among studies, a particularly important consideration, implies that isolated findings in a single observational study will rarely suffice to support causal conclusions. Another important consideration, the strength of the association, has nothing to do with “statistical significance,” which is a predicate consideration, but reminds us that large risk ratios or risk differences provide some evidence that the association does not result from unmeasured confounding. Eliminating confounding, however, is one of the predicate requirements for applying the nine factors. As with any methodology, the Bradford Hill factors are not self-executing. The annals of litigation provide all too many examples of undue selectivity, “cherry picking,” and other deviations from the scientist’s standard of care.

Certainly lawyers must steel themselves against recommending the “carcinogen” hazard identifications advanced by the International Agency for Research on Cancer (IARC). There are several problematic aspects to the methods of IARC, not the least of which is IARC’s fanciful use of the word “probable.” According to the IARC Preamble, “probable” has no quantitative meaning.[19] In common legal parlance, “probable” typically conveys a conclusion that is more likely than not. Another problem arises from the IARC’s labeling of “probable human carcinogens” made in some cases without any real evidence of carcinogenesis in humans. Regulatory pronouncements are even more diluted and often involve little more than precautionary-principle wishcasting.[20]


[1] Christian W. Castile & Stephen J. McConnell, “Excluding Epidemiological Evidence Under FRE 702,” For The Defense 18 (June 2023) [Castile]. Although these authors provide an interesting overview of the subject, they fall into some common errors, such as failing to address Rule 703. The article is worth reading for its marshaling of recent case law on the subject, but I detail some of its errors here in the hopes that lawyers will speak more precisely about the concepts involved in challenging medical causation opinions.

[2] Id. at 18. In re Zantac (Ranitidine) Prods. Liab. Litig., No. 2924, 2022 U.S. Dist. LEXIS 220327, at *401 (S.D. Fla. Dec. 6, 2022); see also Horwin v. Am. Home Prods., No. CV 00-04523 WJR (Ex), 2003 U.S. Dist. LEXIS 28039, at *14-15 (C.D. Cal. May 9, 2003) (“epidemiological studies provide the primary generally accepted methodology for demonstrating a causal relation between a chemical compound and a set of symptoms or disease” *** “The lack of epidemiological studies supporting Plaintiffs’ claims creates a high bar to surmount with respect to the reliability requirement, but it is not automatically fatal to their case.”).

[3] See, e.g., Siharath v. Sandoz Pharm. Corp., 131 F. Supp. 2d 1347, 1356 (N.D. Ga. 2001) (“epidemiology is the medical science devoted to determining the cause of disease in human beings”).

[4] See, e.g., Lopez v. Wyeth-Ayerst Labs., No. C 94-4054 CW, 1996 U.S. Dist. LEXIS 22739, at *1 (N.D. Cal. Dec. 13, 1996) (“Epidemiological evidence is one of the most valuable pieces of scientific evidence of causation”); Horwin v. Am. Home Prods., No. CV 00-04523 WJR (Ex), 2003 U.S. Dist. LEXIS 28039, at *15 (C.D. Cal. May 9, 2003) (“The lack of epidemiological studies supporting Plaintiffs’ claims creates a high bar to surmount with respect to the reliability requirement, but it is not automatically fatal to their case”).

[5] David A. Grimes & Kenneth F. Schulz, “Descriptive Studies: What They Can and Cannot Do,” 359 Lancet 145 (2002) (“…epidemiologists and clinicians generally use descriptive reports to search for clues of cause of disease – i.e., generation of hypotheses. In this role, descriptive studies are often a springboard into more rigorous studies with comparison groups. Common pitfalls of descriptive reports include an absence of a clear, specific, and reproducible case definition, and interpretations that overstep the data. Studies without a comparison group do not allow conclusions about cause of disease.”).

[6] Michael D. Green, D. Michal Freedman, and Leon Gordis, “Reference Guide on Epidemiology,” Reference Manual on Scientific Evidence 549, 564 n.48 (citing a paid advertisement by a group of scientists, and misleadingly referring to the publication as a National Cancer Institute symposium) (citing Michele Carbone et al., “Modern Criteria to Establish Human Cancer Etiology,” 64 Cancer Res. 5518, 5522 (2004) (National Cancer Institute symposium [sic] concluding that “[t]here should be no hierarchy [among different types of scientific methods to determine cancer causation]. Epidemiology, animal, tissue culture and molecular pathology should be seen as integrating evidences in the determination of human carcinogenicity.”)).

[7] John B. Wong, Lawrence O. Gostin & Oscar A. Cabrera, “Reference Guide on Medical Testimony,” in Reference Manual on Scientific Evidence 687, 723 (3d ed. 2011).

[8] See, e.g., J.M. Elwood, Critical Appraisal of Epidemiological Studies and Clinical Trials 342 (3d ed. 2007).

[9] See Steven E. Nissen & Kathy Wolski, “Effect of Rosiglitazone on the Risk of Myocardial Infarction and Death from Cardiovascular Causes,” 356 New Engl. J. Med. 2457 (2007). See also “Learning to Embrace Flawed Evidence – The Avandia MDL’s Daubert Opinion” (Jan. 10, 2011).

[10] Philip D. Home, et al., “Rosiglitazone evaluated for cardiovascular outcomes in oral agent combination therapy for type 2 diabetes (RECORD): a multicentre, randomised, open-label trial,” 373 Lancet 2125 (2009).

[11] In re Zantac (Ranitidine) Prods. Liab. Litig., No. 2924, 2022 U.S. Dist. LEXIS 220327, at *402 (S.D. Fla. Dec. 6, 2022) (“Unlike experimental studies in which subjects are randomly assigned to exposed and placebo groups, observational studies are subject to bias due to the possibility of differences between study populations.”).

[12] Castile at 20.

[13] See, e.g., Benjamin Freedman, “Equipoise and the ethics of clinical research,” 317 New Engl. J. Med. 141 (1987).

[14] See, e.g., In re Onglyza (Saxagliptin) & Kombiglyze XR (Saxagliptin & Metformin) Prods. Liab. Litig., No. 5:18-md-2809-KKC, 2022 U.S. Dist. LEXIS 136955, at *127 (E.D. Ky. Aug. 2, 2022); Burleson v. Texas Dep’t of Criminal Justice, 393 F.3d 577, 585-86 (5th Cir. 2004) (affirming exclusion of expert causation testimony based solely upon studies showing a mere correlation between defendant’s product and plaintiff’s injury); Beyer v. Anchor Insulation Co., 238 F. Supp. 3d 270, 280-81 (D. Conn. 2017); Ambrosini v. Labarraque, 101 F.3d 129, 136 (D.C. Cir. 1996).

[15] Castile at 21. See In re Zoloft (Sertraline Hydrochloride) Prods. Liab. Litig., 26 F. Supp. 3d 449, 454-55 (E.D. Pa. 2014).

[16] “Bradford Hill on Statistical Methods” (Sept. 24, 2013); see also Frank C. Woodside, III & Allison G. Davis, “The Bradford Hill Criteria: The Forgotten Predicate,” 35 Thomas Jefferson L. Rev. 103 (2013).

[17] Austin Bradford Hill, “The Environment and Disease: Association or Causation?” 58 Proc. Royal Soc’y Med. 295, 295 (1965).

[18] Castile at 21. See, e.g., In re Onglyza (Saxagliptin) & Kombiglyze XR (Saxagliptin & Metformin) Prods. Liab. Litig., No. 5:18-md-2809-KKC, 2022 U.S. Dist. LEXIS 1821, at *43 (E.D. Ky. Jan. 5, 2022) (“The analysis is meant to apply when observations reveal an association between two variables. It addresses the aspects of that association that researchers should analyze before deciding that the most likely interpretation of [the association] is causation”); Hoefling v. U.S. Smokeless Tobacco Co., LLC, 576 F. Supp. 3d 262, 273 n.4 (E.D. Pa. 2021) (“Nor would it have been appropriate to apply them here: scientists are to do so only after an epidemiological association is demonstrated”).

[19] IARC Monographs on the Identification of Carcinogenic Hazards to Humans – Preamble 31 (2019) (“The terms probably carcinogenic and possibly carcinogenic have no quantitative significance and are used as descriptors of different strengths of evidence of carcinogenicity in humans.”).

[20] “Improper Reliance upon Regulatory Risk Assessments in Civil Litigation” (Mar. 19, 2023).

Tenpenny Down to Tuppence

August 22nd, 2023

Over two years ago, an osteopathic physician by the name of Sherri Tenpenny created a stir when she told the Ohio state legislature that Covid vaccines magnetize people or cause them to “interface with 5G towers.”[1] What became apparent at that time was that Tenpenny was herself a virulent disease vector of disinformation. Indeed, in its March 2021 report, the Center for Countering Digital Hate listed Tenpenny as a top anti-vaccination shyster. As a social media vector, she is ranked in the top dozen “influencers.”[2] No surprise, then, that in addition to bloviating about Covid vaccines, someone with such quirky, non-evidence-based opinions turns up in litigation as an expert witness.[3]


At the time of Tenpenny’s ludicrous testimony before the Ohio state legislature, one astute observer remarked that the AMA Ethical Guidelines specify that medical societies and medical licensing boards are responsible for maintaining high standards for medical testimony, and must assess “claims of false or misleading testimony.”[4] When the testimony is false or misleading, these bodies should discipline the offender “as appropriate.”[5]

The State Medical Board of Ohio stepped up to its responsibility. After receiving hundreds (roughly 350) of complaints about Tenpenny’s testimony, the Ohio Board launched an investigation of Tenpenny, who was first licensed as an osteopathic physician in 1984.  The Board’s investigators tried to contact Tenpenny, who apparently evaded engaging with them.[6] Eventually, Thomas Renz, a lawyer for Tenpenny, informed the Board that Tenpenny’s “[d]eclin[ing] to cooperate in the Board’s bad faith and unjustified assault on her licensure, livelihood, and constitutional rights cannot be construed as an admission of any allegations against her.”

After multiple unsuccessful attempts to reach Tenpenny, the Board issued a citation, in 2022, against her for stonewalling the investigation. Tenpenny requested an administrative hearing, set for April 2023, when she would be able to submit her defense in writing. The Board refused to let Tenpenny evade questioning, and suspended her license for failure to comply with the investigation. According to the Board’s Order, “Dr. Tenpenny did not simply fail to cooperate with a Board investigation, she refused to cooperate. *** And that refusal was based on her unsupported and subjective belief regarding the Board’s motive for the investigation. Licensees of the Board cannot simply refuse to cooperate in investigations because they decide they do not like what they assume is the reason for the investigation.”[7]

According to the Board’s Order, Tenpenny has been fined $3,000, and she must satisfy the Board’s conditions before applying for reinstatement. The Ohio Board’s decision is largely based upon a procedural ruling that flowed from Tenpenny’s refusal to cooperate with the Board’s investigation. Most state medical boards have done little to nothing to address the substance of physician misconduct arising out of the COVID pandemic. This month, the American Board of Internal Medicine (ABIM) announced that it was revoking the board certifications of two physicians, Drs. Paul Marik and Pierre Kory, members of the Front Line COVID-19 Critical Care Alliance, for promoting disinformation and invalid opinions about COVID-19 therapies.[8] Ron Johnson, the quack senator from Wisconsin, predictably and transparently criticized the ABIM’s action with an ad hominem attack on the ABIM as a corporate cabal. Quack physicians of course have a first amendment right to say whatever, but their licensure and their board certification are contingent on basic competence. Both the state boards and the certifying private groups have the right and responsibility to revoke licenses and privileges when physicians demonstrate incompetence and callousness in the face of a pandemic. There is no unqualified right to professional licenses or certifications.


[1] Andrea Salcedo, “A doctor falsely told lawmakers vaccines magnetize people: ‘They can put a key on their forehead. It sticks’,” Washington Post (June 9, 2021); Andy Downing, “What an exceedingly dumb time to be alive,” Columbus Alive (June 10, 2021); Jake Zuckerman, “She says vaccines make you magnetized. This West Chester lawmaker invited her testimony, chair says,” Ohio Capital Journal (July 14, 2021).

[2] The Disinformation Dozen (2021).

[3] Shaw v. Sec’y Health & Human Servs., No. 01-707V, 2009 U.S. Claims LEXIS 534, *84 n.40 (Fed. Cl. Spec. Mstr. Aug. 31, 2009) (excluding expert witness opinion testimony from Tenpenny).

[4] “Epistemic Virtue – Dropping the Dime on Tenpenny” Tortini (July 18, 2021).

[5] A.M.A. Code of Medical Ethics Opinion 9.7.1.

[6] Michael DePeau-Wilson, “Doc Who Said COVID Vax Magnetized People Has License Suspended,” MedPageToday (Aug. 11, 2023); David Gorski, “The Ohio State Medical Board has finally suspended the medical license of antivax quack Sherri Tenpenny,” Science-Based Medicine (Aug. 14, 2023).

[7] In re Sherri J. Tenpenny, D.O., Case No. 22-CRF-0168, State Medical Board of Ohio (Aug. 9, 2023).

[8] David Gorski, “The American Board of Internal Medicine finally acts against two misinformation-spreading doctors,” Science-Based Medicine (Aug. 7, 2023).

Links, Ties, and Other Hook Ups in Risk Factor Epidemiology

July 5th, 2023

Many journalists struggle with reporting the results from risk factor epidemiology. Recently, JAMA Network Open published an epidemiologic study (“Williams study”) that explored whether exposure to Agent Orange among United States military veterans was associated with bladder cancer.[1] The study found little to no association, but lay and scientific journalists described the study as finding a “link,”[2] or a “tie,”[3] thus suggesting causality. One web-based media report stated, without qualification, that Agent Orange “increases bladder cancer risk.”[4]


Even the authors of the Williams study described the results inconsistently and hyperbolically. Within the four corners of the published article, the authors described having found a “modestly increased risk of bladder cancer,” and then, on the same page, they reported that “the association was very slight (hazard ratio, 1.04; 95% C.I., 1.02-1.06).”

In one place, the Williams study states it looked at a cohort of 868,912 veterans with exposure to Agent Orange, and evaluated bladder cancer outcomes against outcomes in 2,427,677 matched controls. Elsewhere, they report different numbers, which are hard to reconcile. In any event, the authors had a very large sample size, which had the power to detect theoretically small differences as “statistically significant” (p < 0.05). Indeed, the study was so large that even a very slight disparity in rates between the exposed and unexposed cohort members could be “statistically significantly” different, notwithstanding that systematic error certainly played a much larger role in the results than random error. In terms of absolute numbers, the researchers found 50,781 bladder cancer diagnoses, on follow-up of 28,672,655 person-years. There were overall 2.1% bladder cancers among the exposed servicemen, and 2.0% among the unexposed. Calling this putative disparity a “modest association” is a gross overstatement, and it is difficult to square the authors’ pronouncement of a “modest association” with a “very slight increased risk.”
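A back-of-the-envelope calculation, using the cohort sizes and crude rates as reported above, illustrates how a sample this large renders a trivial 0.1 percentage-point difference “statistically significant.” This is only a sketch with a simple two-proportion z-test; the study’s actual analysis used multivariable models:

```python
from math import sqrt, erf

# Cohort sizes and crude bladder cancer rates as reported in the Williams study
n_exposed, n_control = 868_912, 2_427_677
p_exposed, p_control = 0.021, 0.020

# Two-proportion z-test on the crude rates
p_pooled = (p_exposed * n_exposed + p_control * n_control) / (n_exposed + n_control)
se = sqrt(p_pooled * (1 - p_pooled) * (1 / n_exposed + 1 / n_control))
z = (p_exposed - p_control) / se
p_value = 2 * (1 - 0.5 * (1 + erf(z / sqrt(2))))  # two-sided normal tail

print(f"z = {z:.1f}, p = {p_value:.2g}")
```

With roughly 3.3 million subjects, a 2.1% versus 2.0% disparity yields a z-statistic well beyond conventional significance thresholds, even though the absolute difference is clinically negligible and systematic error dwarfs random error at this scale.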

The authors also reported that there was no association between Agent Orange exposure and aggressiveness of bladder cancer, with bladder wall muscle invasion taken to be the marker of aggressiveness. Given that the authors were willing to proclaim a hazard ratio of 1.04 as an association, this report of no association with aggressiveness is manifestly false. The Williams study found a decreased odds of a diagnosis of muscle-invasive bladder cancer among the exposed cases, with an odds ratio of 0.91, 95% CI 0.85-0.98 (p = 0.009). The study thus did not find an absence of an association, but rather an inverse association.

Causality

Under the heading of “Meaning,” the authors wrote that “[t]hese findings suggest an association between exposure to Agent Orange and bladder cancer, although the clinical relevance of this was unclear.” Despite disclaiming a causal interpretation of their results, Williams and colleagues wrote that their results “support prior investigations and further support bladder cancer to be designated as an Agent Orange-associated disease.”

Williams and colleagues note that the Institute of Medicine had suggested that the association between Agent Orange exposure and bladder cancer outcomes required further research.[5] Requiring additional research was apparently sufficient for the Department of Veterans Affairs, in 2021, to assume facts not in evidence, and to designate “bladder cancer as a cancer caused by Agent Orange exposure.”[6]

Williams and colleagues themselves appear to disavow a causal interpretation of their results: “we cannot determine causality given the retrospective nature of our study design.” They also acknowledged their inability to “exclude potential selection bias and misclassification bias.” Although the authors did not explore the issue, exposed servicemen may well have been under greater scrutiny, creating surveillance and diagnostic biases.

The authors failed to grapple with other, perhaps more serious biases and inadequacy of methodology in their study. Although the authors claimed to have controlled for the most important confounders, they failed to include diabetes as a co-variate in their analysis, even though diabetic patients have a more than doubled increased risk for bladder cancer, even after adjustment for smoking.[7] Diabetic patients would also have been likely to have had more visits to VA centers for healthcare and more opportunity to have been diagnosed with bladder cancer.

Furthermore, with respect to the known confounding variable of smoking, the authors trichotomized smoking history as “never,” “former,” or “current” smoker. The authors were missing smoking information for about 13% of the cohort. In a univariate analysis based upon smoking status (Table 4), the authors reported the following hazard ratios for bladder cancer, by smoking status:

Smoking status at bladder cancer diagnosis

Never smoked      1   [Reference]

Current smoker   1.10 (1.00-1.21)

Former smoker    1.08 (1.00-1.18)

Unknown              1.17 (1.05-1.31)

This analysis for smoking risk points to the fragility of the Agent Orange analyses. First, “unknown” smoking status is associated with roughly twice the excess risk of current or former smokers. Second, the hazard ratios for bladder cancer were understandably higher for current smokers (HR 1.10, 95% CI 1.00-1.21) and former smokers (HR 1.08, 95% CI 1.00-1.18) than for non-smoking veterans.

Third, the Williams’ study’s univariate analysis of smoking and bladder cancer generates risk ratios that are quite out of line with independent studies of smoking and bladder cancer risk. For instance, meta-analyses of studies of smoking and bladder cancer risk report risk ratios of 2.58 (95% C.I., 2.37–2.80) for any smoking, 3.47 (3.07–3.91) for current smoking, and 2.04 (1.85–2.25) for past smoking.[8] These smoking-related bladder cancer risks are thus order(s) of magnitude greater than the univariate analysis of smoking risk in the Williams study, as well as the multivariate analysis of Agent Orange risk reported.

Fourth, the authors engage in the common, but questionable, practice of categorizing a known confounder, smoking, which should ideally be reported as a continuous variable with respect to quantity consumed, years smoked, and years since quitting.[9] The question here, given that the study is very large, is not the loss of power,[10] but bias away from the null. Peter Austin has shown, by Monte Carlo simulation, that categorizing a continuous variable in a logistic regression inflates the rate of false positive associations.[11] The type I (false-positive) error rate increases with sample size, with increasing correlation between the confounding variable and the outcome of interest, and with the number of categories used for the continuous variables. The large dataset used by Williams and colleagues, which they see as a plus, works against them by increasing the bias from the use of categorical variables for confounding variables.[12]
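Austin’s published simulations used logistic regression; the mechanism behind his result, however, can be sketched in a few lines of standard-library Python. The data-generating model below is hypothetical: a confounder drives both “exposure” and “outcome,” the exposure has no true effect, and adjusting for the confounder only as a coarse category leaves residual confounding behind:

```python
import random
from math import sqrt

random.seed(0)

def corr(xs, ys):
    """Pearson correlation, computed from scratch with the standard library."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Confounder C drives both "exposure" X and "outcome" Y; X has no effect on Y.
n = 50_000
C = [random.gauss(0, 1) for _ in range(n)]
X = [c + random.gauss(0, 1) for c in C]
Y = [c + random.gauss(0, 1) for c in C]

# Adjusting for C as a continuous variable (partial correlation) removes
# the spurious X-Y association almost entirely.
r_xy, r_xc, r_yc = corr(X, Y), corr(X, C), corr(Y, C)
partial = (r_xy - r_xc * r_yc) / sqrt((1 - r_xc**2) * (1 - r_yc**2))

# "Adjusting" by dichotomizing C at its median leaves residual confounding:
# within the high-C stratum, X and Y still correlate through C's remaining
# within-stratum variation.
med = sorted(C)[n // 2]
high = [i for i in range(n) if C[i] >= med]
r_stratum = corr([X[i] for i in high], [Y[i] for i in high])

print(f"partial r (continuous adjustment) = {partial:.3f}")
print(f"within-stratum r (dichotomized)   = {r_stratum:.3f}")
```

The continuous adjustment yields a near-zero association, while the dichotomized “adjustment” leaves a clearly positive one; in a regression with a large sample, that residual association is precisely what gets declared “statistically significant.”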

The Williams study raises serious questions not only about the quality of medical journalism, but also about how an executive agency such as the Veterans Administration evaluates scientific evidence. If the Williams study were to play a role in compensation determinations, it would seem that veterans with muscle-invasive bladder cancer would be turned away, while those veterans with less serious cancers would be compensated. But with 2.1% incidence versus 2.0%, how can compensation be rationally permitted in every case?


[1] Stephen B. Williams, Jessica L. Janes, Lauren E. Howard, Ruixin Yang, Amanda M. De Hoedt, Jacques G. Baillargeon, Yong-Fang Kuo, Douglas S. Tyler, Martha K. Terris, Stephen J. Freedland, “Exposure to Agent Orange and Risk of Bladder Cancer Among US Veterans,” 6 JAMA Network Open e2320593 (2023).

[2] Elana Gotkine, “Exposure to Agent Orange Linked to Risk of Bladder Cancer,” Buffalo News (June 28, 2023); Drew Amorosi, “Agent Orange exposure linked to increased risk for bladder cancer among Vietnam veterans,” HemOnc Today (June 27, 2023).

[3] Andrea S. Blevins Primeau, “Agent Orange Exposure Tied to Increased Risk of Bladder Cancer,” Cancer Therapy Advisor (June 30, 2023); Mike Bassett, “Agent Orange Exposure Tied to Bladder Cancer Risk in Veterans — Increased risk described as ‘modest’, and no association seen with aggressiveness of cancer,” Medpage Today (June 27, 2023).

[4] Darlene Dobkowski, “Agent Orange Exposure Modestly Increases Bladder Cancer Risk in Vietnam Veterans,” Cure Today (June 27, 2023).

[5] Institute of Medicine – Committee to Review the Health Effects in Vietnam Veterans of Exposure to Herbicides (Tenth Biennial Update), Veterans and Agent Orange: Update 2014 at 10 (2016) (upgrading previous finding of “inadequate” to “suggestive”).

[6] Williams study, citing U.S. Department of Veterans Affairs, “Agent Orange exposure and VA disability compensation.”

[7] Yeung Ng, I. Husain, N. Waterfall, “Diabetes Mellitus and Bladder Cancer – An Epidemiological Relationship?” 9 Path. Oncol. Research 30 (2003) (diabetic patients had an increased, significant odds ratio for bladder cancer compared with non diabetics even after adjustment for smoking and age [OR: 2.69 p=0.049 (95% CI 1.006-7.194)]).

[8] Marcus G. Cumberbatch, Matteo Rota, James W.F. Catto, and Carlo La Vecchia, “The Role of Tobacco Smoke in Bladder and Kidney Carcinogenesis: A Comparison of Exposures and Meta-analysis of Incidence and Mortality Risks?” 70 European Urology 458 (2016).

[9] See generally, “Confounded by Confounding in Unexpected Places” (Dec. 12, 2021).

[10] Jacob Cohen, “The cost of dichotomization,” 7 Applied Psychol. Measurement 249 (1983).

[11] Peter C. Austin & Lawrence J. Brunner, “Inflation of the type I error rate when a continuous confounding variable is categorized in logistic regression analyses,” 23 Statist. Med. 1159 (2004).

[12] See, e.g., Douglas G. Altman & Patrick Royston, “The cost of dichotomising continuous variables,” 332 Brit. Med. J. 1080 (2006); Patrick Royston, Douglas G. Altman, and Willi Sauerbrei, “Dichotomizing continuous predictors in multiple regression: a bad idea,” 25 Stat. Med. 127 (2006); Valerii Fedorov, Frank Mannino, and Rongmei Zhang, “Consequences of dichotomization,” 8 Pharmaceut. Statist. 50 (2009). See also Robert C. MacCallum, Shaobo Zhang, Kristopher J. Preacher, and Derek D. Rucker, “On the Practice of Dichotomization of Quantitative Variables,” 7 Psychological Methods 19 (2002); David L. Streiner, “Breaking Up is Hard to Do: The Heartbreak of Dichotomizing Continuous Data,” 47 Can. J. Psychiatry 262 (2002); Henian Chen, Patricia Cohen, and Sophie Chen, “Biased odds ratios from dichotomization of age,” 26 Statist. Med. 3487 (2007); Carl van Walraven & Robert G. Hart, “Leave ‘em Alone – Why Continuous Variables Should Be Analyzed as Such,” 30 Neuroepidemiology 138 (2008); O. Naggara, J. Raymond, F. Guilbert, D. Roy, A. Weill, and Douglas G. Altman, “Analysis by Categorizing or Dichotomizing Continuous Variables Is Inadvisable,” 32 Am. J. Neuroradiol. 437 (Mar 2011); Neal V. Dawson & Robert Weiss, “Dichotomizing Continuous Variables in Statistical Analysis: A Practice to Avoid,” Med. Decision Making 225 (2012); Phillippa M. Cumberland, Gabriela Czanner, Catey Bunce, Caroline J. Doré, Nick Freemantle, and Marta García-Fiñana, “Ophthalmic statistics note: the perils of dichotomising continuous variables,” 98 Brit. J. Ophthalmol. 841 (2014); Julie R. Irwin & Gary H. McClelland, “Negative Consequences of Dichotomizing Continuous Predictor Variables,” 40 J. Marketing Research 366 (2003); Stanley E. Lazic, “Four simple ways to increase power without increasing the sample size,” PeerJ Preprints (23 Oct 2017).

Is the Scientific Method Fascist?

June 14th, 2023

Just before the pandemic, when our country seems to have gone tits up, there was a studied effort to equate any emphasis on scientific method, and the valuation of “[o]bjective, rational linear thinking,” “[c]ause and effect relationships,” and “[q]uantitative emphasis,” with white privilege and microaggression against non-white people.

I am not making up this claim; I am not creative enough. Indeed, for a while, the Smithsonian National Museum of African American History & Culture featured a graphic that included “emphasis on scientific method” as an aspect of white culture, and implied it was an unsavory aspect of “white privilege.”[1]

Well, as it turns out, scientific method is not only racist, but fascist as well.

With pretentious citations to Deleuze,[2] Foucault,[3] and Lyotard,[4] a group of Canadian authors[5] set out to decolonize science and medicine from the fascist grip of scientific methodology and organizations such as the Cochrane Group. The grand insight is that the health sciences have been “colonized” by a scientific research “paradigm” that is “outrageously exclusionary and dangerously normative with regards to scientific knowledge.” By excluding “alternative forms of knowledge,” evidence-based medicine acts as a “fascist structure.” The Cochrane Group in particular is singled out for having created an exclusionary and non-egalitarian hierarchy of evidence. Intolerance for non-approved modes of inference and thinking is, in these authors’ view, among the “manifestations of fascism,” which are more “pernicious,” even if less brutal, than the fascism practiced by Hitler and Mussolini.[6]

Clutch the pearls!

Never mind that “deconstruction” itself sounds a bit fascoid,[7] not to mention a rather vague concept. The authors seem intent on promoting multiple ways of knowing without epistemic content. Indeed, our antifa authors do not attempt to show that evidence-based medicine leads regularly to incorrect results, or that their unspecified alternatives have greater predictive value. Nonetheless, decolonization of medicine and deconstruction of hierarchical methodology remain key for them to achieve an egalitarian epistemology, by which everyone is equally informed and equally stupid. In the inimitable words of the authors, “many scientists find themselves interpellated by hegemonic discourses and come to disregard all others.”[8]

These epistemic freedom fighters want to divorce the idea of evidence from objective reality, and make evidence bend to “values.”[9] Apparently, the required deconstruction of the “knowing subject” is that the subject is implicitly male, white, Western and heterosexual. Medicine’s fixation on binaries such as normal and pathological, male and female, shows that evidence-based medicine is simply not queer enough. Our intrepid authors must be credited for having outed the “hidden political agenda” of those who pretend simply to find the truth, but who salivate over imposing their “hegemonic norms,” asserted in the “name of ‘truth’.”

These Canadian authors leave us with a battle cry: “scholars have not only a scientific duty, but also an ethical obligation to deconstruct these regimes of power.”[10] Scientists of the world, you have nothing to lose but your socially constructed, nonsensical conception of scientific truth.

Although it is easy to make fun of post-modernist pretensions,[11] there is a point about the force of argument and evidence. The word “valid” comes to us from the 16th century French word “valide,” which in turn comes from the Latin validus, meaning strong. Similarly, we describe a well-conducted study with robust findings as compelling our belief.

I recall the late Robert Nozick, back in the 1970s, expressing the view that someone who embraced a contradiction might pop out of existence, the way an electron and a positron might cancel each other. If only it were so, we might have people exercising more care in their thinking and speaking.


[1]Is Your Daubert Motion Racist?” (July 17, 2020). The Smithsonian has since seen fit to remove the chart reproduced here, but we know what they really believe.

[2] Gilles Deleuze and Félix Guattari, Anti-oedipus: Capitalism and Schizophrenia (1980); Gilles Deleuze and Félix Guattari, A Thousand Plateaus: Capitalism and Schizophrenia (1987). This dross enjoyed funding from the Canadian Institutes of Health Research, and the Social Science and Humanities Research Council of Canada.

[3] Michel Foucault, The Birth of the Clinic: An Archaeology of Medical Perception (1973); Michel Foucault, The History of Sexuality, Volume 1: An Introduction (trans. Robert Hurley 1978); Michel Foucault, Society Must Be Defended: Lectures at the Collège de France, 1975–1976 (2003); Michel Foucault, Power/Knowledge: Selected Interviews and Other Writings, 1972–1977 (1980); Michel Foucault, Fearless Speech (2001).

[4] Jean-François Lyotard, The Postmodern Condition: A Report on Knowledge (1984).

[5] Dave Holmes, Stuart J Murray, Amélie Perron, and Geneviève Rail, “Deconstructing the evidence-based discourse in health sciences: truth, power and fascism,” 4 Internat’l J. Evidence-Based Health 180 (2006) [Deconstructing].

[6] Deconstructing at 181.

[7] Pace David Frum.

[8] Deconstructing at 182.

[9] Deconstructing at 183.

[10] Deconstructing  at 180-81.

[11] Alan D. Sokal, “Transgressing the Boundaries: Toward a Transformative Hermeneutics of Quantum Gravity,” 46 Social Text 217 (1994).

Judicial Flotsam & Jetsam – Retractions

June 12th, 2023

In scientific publishing, when scientists make a mistake, they publish an erratum or a corrigendum. If the mistake vitiates the study, then the erring scientists retract their article. To be sure, sometimes the retraction comes after an obscene delay, with the authors kicking and screaming.[1] Sometimes the retraction comes at the request of the authors, better late than never.[2]

Retractions in the biomedical journals, whether voluntary or not, are on the rise.[3] The process and procedures for retraction of articles often lack transparency. Many articles are retracted without explanation or disclosure of specific problems about the data or the analysis.[4] Sadly, however, misconduct in the form of plagiarism and data falsification is a frequent reason for retractions.[5] The lack of transparency for retractions, and sloppy scholarship, combine to create Zombie papers, which are retracted but continue to be cited in subsequent publications.[6]

LEGAL RETRACTIONS

The law treats errors very differently. Being a judge usually means that you never have to say you are sorry. Judge Andrew Hurwitz has argued that our legal system would be better served if judges “freely acknowledged and transparently corrected the occasional ‘goof’.”[7] Alas, as Judge Hurwitz notes, very few published decisions acknowledge mistakes.[8]

In the world of scientific jurisprudence, the judicial reticence to acknowledge mistakes is particularly dangerous, and it leads directly to the proliferation of citations to cases that make egregious mistakes. In the niche area of judicial assessment of scientific and statistical evidence, the proliferation of erroneous statements is especially harmful because it interferes with thinking clearly about the issues before courts. Judges believe that they have argued persuasively for a result, not by correctly marshaling statistical and scientific concepts, but by relying upon precedents erroneously arrived at by other judges in earlier cases. Regardless of how many cases are cited (and there are many possible “precedents”), the true parameter does not have a 95% probability of lying within any particular 95% confidence interval.[9] Similarly, as much as judges would like p-values and confidence intervals to eliminate the need to worry about systematic error, their saying so cannot make it so.[10] Even a mighty federal judge cannot make the p-value probability, or its complement, substitute for the posterior probability of a causal claim.[11]
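The confidence-interval point can be demonstrated with a short simulation (a minimal sketch in Python; the true mean, variance, sample size, and seed are illustrative assumptions): the “95%” describes the long-run coverage of the interval-constructing procedure over many repeated samples, not the probability that any one computed interval contains the true value.

```python
import random
import statistics

# Illustrative assumptions: a known "true" mean and standard deviation,
# repeated sampling, and a z-based 95% interval (sigma treated as known).
random.seed(42)
TRUE_MEAN = 10.0
SIGMA = 2.0
N = 30          # observations per simulated study
TRIALS = 10_000 # number of repeated studies
Z = 1.96        # normal critical value for 95% coverage

covered = 0
for _ in range(TRIALS):
    sample = [random.gauss(TRUE_MEAN, SIGMA) for _ in range(N)]
    m = statistics.fmean(sample)
    half_width = Z * SIGMA / N ** 0.5
    # Does this particular interval happen to contain the true mean?
    if m - half_width <= TRUE_MEAN <= m + half_width:
        covered += 1

# Across repeated studies, roughly 95% of the intervals cover the truth;
# any single interval either contains the true mean or it does not.
print(f"coverage: {covered / TRIALS:.3f}")
```

The procedure's long-run coverage is what the 95% figure warrants; once a specific interval is computed, the true parameter is not a random quantity sitting inside it with 95% probability, which is the misreading the cases in note 9 make.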

Some cases in the books are so egregiously decided that it is truly remarkable that they would be cited for any proposition. I call these scientific Dred Scott cases, which illustrate that sometimes science has no criteria of validity that the law is bound to respect. One such Dred Scott case was the result of a bench trial in a federal district court in Atlanta, in Wells v. Ortho Pharmaceutical Corporation.[12]

Wells was notorious for its poor assessment of all the determinants of scientific causation.[13] The decision was met with a storm of opprobrium from the legal and medical community.[14] No scientists or legal scholars offered a serious defense of Wells on the scientific merits. Even the notorious plaintiffs’ expert witness, Carl Cranor, could muster only a distanced agnosticism:

“In Wells v. Ortho Pharmaceutical Corp., which involved a claim that birth defects were caused by a spermicidal jelly, the U.S. Court of Appeals for the 11th Circuit followed the principles of Ferebee and affirmed a plaintiff’s verdict for about five million dollars. However, some members of the medical community chastised the legal system essentially for ignoring a well-established scientific consensus that spermicides are not teratogenic. We are not in a position to judge this particular issue, but the possibility of such results exists.”[15]

Cranor apparently could not bring himself to note that it was not just scientific consensus that was ignored; the Wells case ignored the basic scientific process of examining relevant studies for both internal and external validity.

Notwithstanding this scholarly consensus and condemnation, we have witnessed the repeated recrudescence of the Wells decision. In Matrixx Initiatives, Inc. v. Siracusano,[16] in 2011, the Supreme Court, speaking through Justice Sotomayor, wandered into a discussion, irrelevant to its holding, whether statistical significance was necessary for a determination of the causality of an association:

“We note that courts frequently permit expert testimony on causation based on evidence other than statistical significance. See, e.g., Best v. Lowe’s Home Centers, Inc., 563 F. 3d 171, 178 (6th Cir. 2009); Westberry v. Gislaved Gummi AB, 178 F. 3d 257, 263–264 (4th Cir. 1999) (citing cases); Wells v. Ortho Pharmaceutical Corp., 788 F. 2d 741, 744–745 (11th Cir. 1986). We need not consider whether the expert testimony was properly admitted in those cases, and we do not attempt to define here what constitutes reliable evidence of causation.”[17]

The quoted language is remarkable for two reasons. First, the Best and Westberry cases did not involve statistics at all. They addressed specific causation inferences from what is generally known as differential etiology. Second, the citation to Wells was noteworthy because the case has nothing to do with adverse event reports or the lack of statistical significance.

Wells involved a claim of birth defects caused by the use of a spermicidal jelly contraceptive, which had been the subject of several studies, at least one of which yielded a nominally statistically significant increase in detected birth defects over what was expected.

Wells could thus hardly be an example of a case in which there was a judgment of causation based upon a scientific study that lacked statistical significance in its findings. Of course, finding statistical significance is just the beginning of assessing the causality of an association. The most remarkable and disturbing aspect of the citation to Wells, however, was that the Court was unaware of, or ignored, the case’s notoriety, and the scholarly and scientific consensus that criticized the decision for its failure to evaluate the entire evidentiary display, as well as for its failure to rule out bias and confounding in the studies relied upon by the plaintiff.

Justice Sotomayor’s decision for a unanimous Court is not alone in its failure of scholarship and analysis in embracing the dubious precedent of Wells. Many other courts have done much the same, both in state[18] and in federal courts,[19] both before and after the Supreme Court decided Daubert, and even after Rule 702 was amended in 2000.[20] Perhaps even more disturbing is that the current edition of the Reference Manual on Scientific Evidence glibly cites the Wells case for the dubious proposition that

“Generally, researchers are conservative when it comes to assessing causal relationships, often calling for stronger evidence and more research before a conclusion of causation is drawn.”[21]

We are coming up on the 40th anniversary of the Wells judgment. It is long past time to stop citing the case. Perhaps we have reached the stage of dealing with scientific evidence at which errant and aberrant cases should be retracted, and clearly marked as retracted in the official reporters, and in the electronic legal databases. Certainly the technology exists to link the scholarly criticism with a case citation, just as we link subsequent judicial treatment by overruling, limiting, and criticizing.


[1] Laura Eggertson, “Lancet retracts 12-year-old article linking autism to MMR vaccines,” 182 Canadian Med. Ass’n J. E199 (2010).

[2] Notice of retraction for Teng Zeng & William Mitch, “Oral intake of ranitidine increases urinary excretion of N-nitrosodimethylamine,” 37 Carcinogenesis 625 (2016), published online (May 4, 2021) (retraction requested by authors with an acknowledgement that they had used incorrect analytical methods for their study).

[3] Tianwei He, “Retraction of global scientific publications from 2001 to 2010,” 96 Scientometrics 555 (2013); Bhumika Bhatt, “A multi-perspective analysis of retractions in life sciences,” 126 Scientometrics 4039 (2021); Raoul R. Wadhwa, Chandruganesh Rasendran, Zoran B. Popovic, Steven E. Nissen, and Milind Y. Desai, “Temporal Trends, Characteristics, and Citations of Retracted Articles in Cardiovascular Medicine,” 4 JAMA Network Open e2118263 (2021); Mario Gaudino, N. Bryce Robinson, Katia Audisio, Mohamed Rahouma, Umberto Benedetto, Paul Kurlansky, and Stephen E. Fremes, “Trends and Characteristics of Retracted Articles in the Biomedical Literature, 1971 to 2020,” 181 J. Am. Med. Ass’n Internal Med. 1118 (2021); Nicole Shu Ling Yeo-Teh & Bor Luen Tang, “Sustained Rise in Retractions in the Life Sciences Literature during the Pandemic Years 2020 and 2021,” 10 Publications 29 (2022).

[4] Elizabeth Wager & Peter Williams, “Why and how do journals retract articles? An analysis of Medline retractions 1988-2008,” 37 J. Med. Ethics 567 (2011).

[5] Ferric C. Fang, R. Grant Steen, and Arturo Casadevall, “Misconduct accounts for the majority of retracted scientific publications,” 109 Proc. Nat’l Acad. Sci. 17028 (2012); L.M. Chambers, C.M. Michener, and T. Falcone, “Plagiarism and data falsification are the most common reasons for retracted publications in obstetrics and gynaecology,” 126 Br. J. Obstetrics & Gyn. 1134 (2019); M.S. Marsh, “Separating the good guys and gals from the bad,” 126 Br. J. Obstetrics & Gyn. 1140 (2019).

[6] Tzu-Kun Hsiao and Jodi Schneider, “Continued use of retracted papers: Temporal trends in citations and (lack of) awareness of retractions shown in citation contexts in biomedicine,” 2 Quantitative Science Studies 1144 (2021).

[7] Andrew D. Hurwitz, “When Judges Err: Is Confession Good for the Soul?” 56 Ariz. L. Rev. 343, 343 (2014).

[8] See id. at 343-44 (quoting Justice Story who dealt with the need to contradict a previously published opinion, and who wrote “[m]y own error, however, can furnish no ground for its being adopted by this Court.” U.S. v. Gooding, 25 U.S. 460, 478 (1827)).

[9] See, e.g., DeLuca v. Merrell Dow Pharms., Inc., 791 F. Supp. 1042, 1046 (D.N.J. 1992) (“A 95% confidence interval means that there is a 95% probability that the ‘true’ relative risk falls within the interval”), aff’d, 6 F.3d 778 (3d Cir. 1993); In re Silicone Gel Breast Implants Prods. Liab. Litig., 318 F. Supp. 2d 879, 897 (C.D. Cal. 2004); Eli Lilly & Co. v. Teva Pharms. USA, 2008 WL 2410420, *24 (S.D. Ind. 2008) (stating incorrectly that “95% percent of the time, the true mean value will be contained within the lower and upper limits of the confidence interval range”). See also “Confidence in Intervals and Diffidence in the Courts” (Mar. 4, 2012).

[10] See, e.g., Brock v. Merrell Dow Pharmaceuticals, Inc., 874 F.2d 307, 311-12 (5th Cir. 1989) (“Fortunately, we do not have to resolve any of the above questions [as to bias and confounding], since the studies presented to us incorporate the possibility of these factors by the use of a confidence interval.”). This howler has been widely acknowledged in the scholarly literature. See David Kaye, David Bernstein, and Jennifer Mnookin, The New Wigmore – A Treatise on Evidence: Expert Evidence § 12.6.4, at 546 (2d ed. 2011); Michael O. Finkelstein, Basic Concepts of Probability and Statistics in the Law 86-87 (2009) (criticizing the blatantly incorrect interpretation of confidence intervals by the Brock court).

[11] In re Ephedra Prods. Liab. Litig., 393 F.Supp. 2d 181, 191 (S.D.N.Y. 2005) (Rakoff, J.) (“Generally accepted scientific convention treats a result as statistically significant if the P-value is not greater than .05. The expression ‘P=.05’ means that there is one chance in twenty that a result showing increased risk was caused by a sampling error — i.e., that the randomly selected sample accidentally turned out to be so unrepresentative that it falsely indicates an elevated risk.”); see also In re Phenylpropanolamine (PPA) Prods. Liab. Litig., 289 F.Supp. 2d 1230, 1236 n.1 (W.D. Wash. 2003) (“P-values measure the probability that the reported association was due to chance… .”). Although the erroneous Ephedra opinion continues to be cited, it has been debunked in the scholarly literature. See Michael O. Finkelstein, Basic Concepts of Probability and Statistics in the Law 65 (2009); Nathan A. Schachtman, “Statistical Evidence in Products Liability Litigation,” at 28-13, chap. 28, in Stephanie A. Scharf, George D. Sax, & Sarah R. Marmor, eds., Product Liability Litigation: Current Law, Strategies and Best Practices (2d ed. 2021).

[12] Wells v. Ortho Pharm. Corp., 615 F. Supp. 262 (N.D. Ga.1985), aff’d & modified in part, remanded, 788 F.2d 741 (11th Cir.), cert. denied, 479 U.S. 950 (1986).

[13] I have discussed the Wells case in a series of posts, “Wells v. Ortho Pharm. Corp., Reconsidered,” (2012), part one, two, three, four, five, and six.

[14] See, e.g., James L. Mills and Duane Alexander, “Teratogens and ‘Litogens’,” 315 New Engl. J. Med. 1234 (1986); Samuel R. Gross, “Expert Evidence,” 1991 Wis. L. Rev. 1113, 1121-24 (1991) (“Unfortunately, Judge Shoob’s decision is absolutely wrong. There is no scientifically credible evidence that Ortho-Gynol Contraceptive Jelly ever causes birth defects.”). See also Editorial, “Federal Judges v. Science,” N.Y. Times, December 27, 1986, at A22 (unsigned editorial) (“That Judge Shoob and the appellate judges ignored the best scientific evidence is an intellectual embarrassment.”); David E. Bernstein, “Junk Science in the Courtroom,” Wall St. J. at A15 (Mar. 24, 1993) (pointing to Wells as a prominent example of how the federal judiciary had embarrassed the American judicial system with its careless, non-evidence-based approach to scientific evidence); Bert Black, Francisco J. Ayala & Carol Saffran-Brinks, “Science and the Law in the Wake of Daubert: A New Search for Scientific Knowledge,” 72 Texas L. Rev. 715, 733-34 (1994) (lawyers and a leading scientist noting that the district judge “found that the scientific studies relied upon by the plaintiffs’ expert were inconclusive, but nonetheless held his testimony sufficient to support a plaintiffs’ verdict. *** [T]he court explicitly based its decision on the demeanor, tone, motives, biases, and interests that might have influenced each expert’s opinion. Scientific validity apparently did not matter at all.”) (internal citations omitted); Bert Black, “A Unified Theory of Scientific Evidence,” 56 Fordham L. Rev. 595, 672-74 (1988); Paul F. Strain & Bert Black, “Dare We Trust the Jury – No,” 18 Brief 7 (1988); Bert Black, “Evolving Legal Standards for the Admissibility of Scientific Evidence,” 239 Science 1508, 1511 (1988); Diana K. Sheiness, “Out of the Twilight Zone: The Implications of Daubert v. Merrill Dow Pharmaceuticals, Inc.,” 69 Wash. L. Rev. 481, 493 (1994); David E. Bernstein, “The Admissibility of Scientific Evidence after Daubert v. Merrell Dow Pharmaceuticals, Inc.,” 15 Cardozo L. Rev. 2139, 2140 (1993) (embarrassing decision); Troyen A. Brennan, “Untangling Causation Issues in Law and Medicine: Hazardous Substance Litigation,” 107 Ann. Intern. Med. 741, 744-45 (1987) (describing the result in Wells as arising from the difficulties created by the Ferebee case; “[t]he Wells case can be characterized as the court embracing the hypothesis when the epidemiologic study fails to show any effect”); Troyen A. Brennan, “Causal Chains and Statistical Links: Some Thoughts on the Role of Scientific Uncertainty in Hazardous Substance Litigation,” 73 Cornell L. Rev. 469, 496-500 (1988); David B. Brushwood, “Drug induced birth defects: difficult decisions and shared responsibilities,” 91 W. Va. L. Rev. 51, 74 (1988); Kenneth R. Foster, David E. Bernstein, and Peter W. Huber, eds., Phantom Risk: Scientific Inference and the Law 28-29, 138-39 (1993) (criticizing Wells decision); Peter Huber, “Medical Experts and the Ghost of Galileo,” 54 Law & Contemp. Problems 119, 158 (1991); Edward W. Kirsch, “Daubert v. Merrell Dow Pharmaceuticals: Active Judicial Scrutiny of Scientific Evidence,” 50 Food & Drug L.J. 213 (1995) (“a case in which a court completely ignored the overwhelming consensus of the scientific community”); Hans Zeisel & David Kaye, Prove It With Figures: Empirical Methods in Law and Litigation § 6.5, at 93 (1997) (noting the multiple comparisons in studies of birth defects among women who used spermicides, based upon the many reported categories of birth malformations, and the large potential for even more unreported categories); id. at § 6.5 n.3, at 271 (characterizing Wells as “notorious,” and noting that the case became a “lightning rod for the legal system’s ability to handle expert evidence.”); Edward K. Cheng, “Independent Judicial Research in the ‘Daubert’ Age,” 56 Duke L. J. 1263 (2007) (“notoriously concluded”); Edward K. Cheng, “Same Old, Same Old: Scientific Evidence Past and Present,” 104 Michigan L. Rev. 1387, 1391 (2006) (“judge was fooled”); Harold P. Green, “The Law-Science Interface in Public Policy Decisionmaking,” 51 Ohio St. L.J. 375, 390 (1990); Stephen L. Isaacs & Renee Holt, “Drug regulation, product liability, and the contraceptive crunch: Choices are dwindling,” 8 J. Legal Med. 533 (1987); Neil Vidmar & Shari S. Diamond, “Juries and Expert Evidence,” 66 Brook. L. Rev. 1121, 1169-1170 (2001); Adil E. Shamoo, “Scientific evidence and the judicial system,” 4 Accountability in Research 21, 27 (1995); Michael S. Davidson, “The limitations of scientific testimony in chronic disease litigation,” 10 J. Am. Coll. Toxicol. 431, 435 (1991); Charles R. Nesson & Yochai Benkler, “Constitutional Hearsay: Requiring Foundational Testing and Corroboration under the Confrontation Clause,” 81 Virginia L. Rev. 149, 155 (1995); Stephen D. Sugarman, “The Need to Reform Personal Injury Law Leaving Scientific Disputes to Scientists,” 248 Science 823, 824 (1990); Jay P. Kesan, “A Critical Examination of the Post-Daubert Scientific Evidence Landscape,” 52 Food & Drug L. J. 225, 225 (1997); Ora Fred Harris, Jr., “Communicating the Hazards of Toxic Substance Exposure,” 39 J. Legal Ed. 97, 99 (1989) (“some seemingly horrendous decisions”); Ora Fred Harris, Jr., “Complex Product Design Litigation: A Need for More Capable Fact-Finders,” 79 Kentucky L. J. 510 & n.194 (1991) (“uninformed judicial decision”); Barry L. Shapiro & Marc S. Klein, “Epidemiology in the Courtroom: Anatomy of an Intellectual Embarrassment,” in Stanley A. Edlavitch, ed., Pharmacoepidemiology 87 (1989); Marc S. Klein, “Expert Testimony in Pharmaceutical Product Liability Actions,” 45 Food, Drug, Cosmetic L. J. 393, 410 (1990); Michael S. Lehv, “Medical Product Liability,” Ch. 39, in Sandy M. Sanbar & Marvin H. Firestone, eds., Legal Medicine 397, 397 (7th ed. 2007); R. Ryan Stoll, “A Question of Competence – Judicial Role in Regulation of Pharmaceuticals,” 45 Food, Drug, Cosmetic L. J. 279, 287 (1990); Note, “A Question of Competence: The Judicial Role in the Regulation of Pharmaceuticals,” Harvard L. Rev. 773, 781 (1990); Peter H. Schuck, “Multi-Culturalism Redux: Science, Law, and Politics,” 11 Yale L. & Policy Rev. 1, 13 (1993); Howard A. Denemark, “Improving Litigation Against Drug Manufacturers for Failure to Warn Against Possible Side Effects: Keeping Dubious Lawsuits from Driving Good Drugs off the Market,” 40 Case Western Reserve L. Rev. 413, 438-50 (1989-90); Howard A. Denemark, “The Search for Scientific Knowledge in Federal Courts in the Post-Frye Era: Refuting the Assertion that Law Seeks Justice While Science Seeks Truth,” 8 High Technology L. J. 235 (1993).

[15] Carl Cranor & Kurt Nutting, “Scientific and Legal Standards of Statistical Evidence in Toxic Tort and Discrimination Suits,” 9 Law & Philosophy 115, 123 (1990) (internal citations omitted).

[16] 131 S. Ct. 1309 (2011) [Matrixx].

[17] Id. at 1319.

[18] Baroldy v. Ortho Pharmaceutical Corp., 157 Ariz. 574, 583, 760 P.2d 574 (Ct. App. 1988); Earl v. Cryovac, A Div. of WR Grace, 115 Idaho 1087, 772 P.2d 725, 733 (Ct. App. 1989); Rubanick v. Witco Chemical Corp., 242 N.J. Super. 36, 54, 576 A.2d 4 (App. Div. 1990), aff’d in part, 125 N.J. 421, 442, 593 A.2d 733 (1991); Minnesota Min. & Mfg. Co. v. Atterbury, 978 S.W.2d 183, 193 n.7 (Tex. App. 1998); E.I. Dupont de Nemours v. Castillo ex rel. Castillo, 748 So. 2d 1108, 1120 (Fla. Dist. Ct. App. 2000); Bell v. Lollar, 791 N.E.2d 849, 854 (Ind. App. 2003); King v. Burlington Northern & Santa Fe Ry, 277 Neb. 203, 762 N.W.2d 24, 35 & n.16 (2009).

[19] City of Greenville v. WR Grace & Co., 827 F.2d 975, 984 (4th Cir. 1987); American Home Products Corp. v. Johnson & Johnson, 672 F. Supp. 135, 142 (S.D.N.Y. 1987); Longmore v. Merrell Dow Pharms., Inc., 737 F. Supp. 1117, 1119 (D. Idaho 1990); Conde v. Velsicol Chemical Corp., 804 F. Supp. 972, 1019 (S.D. Ohio 1992); Joiner v. General Elec. Co., 864 F. Supp. 1310, 1322 (N.D. Ga. 1994) (a case that ultimately ended up in the Supreme Court); Bowers v. Northern Telecom, Inc., 905 F. Supp. 1004, 1010 (N.D. Fla. 1995); Pick v. American Medical Systems, 958 F. Supp. 1151, 1158 (E.D. La. 1997); Baker v. Danek Medical, 35 F. Supp. 2d 875, 880 (N.D. Fla. 1998).

[20] Rider v. Sandoz Pharms. Corp., 295 F.3d 1194, 1199 (11th Cir. 2002); Kilpatrick v. Breg, Inc., 613 F.3d 1329, 1337 (11th Cir. 2010); Siharath v. Sandoz Pharms. Corp., 131 F. Supp. 2d 1347, 1359 (N.D. Ga. 2001); In re Meridia Prods. Liab. Litig., Case No. 5:02-CV-8000 (N.D. Ohio 2004); Henricksen v. ConocoPhillips Co., 605 F. Supp. 2d 1142, 1177 (E.D. Wash. 2009); Doe v. Northwestern Mutual Life Ins. Co. (D. S.C. 2012); In re Chantix (Varenicline) Prods. Liab. Litig., 889 F. Supp. 2d 1272, 1286, 1288, 1290 (N.D. Ala. 2012); Farmer v. Air & Liquid Systems Corp. at n.11 (M.D. Ga. 2018); In re Abilify Prods. Liab. Litig., 299 F. Supp. 3d 1291, 1306 (N.D. Fla. 2018).

[21] Michael D. Green, D. Michal Freedman & Leon Gordis, “Reference Guide on Epidemiology,” 549, 599 n.143, in Federal Judicial Center, National Research Council, Reference Manual on Scientific Evidence (3d ed. 2011).

The opinions, statements, and asseverations expressed on Tortini are my own, or those of invited guests, and these writings do not necessarily represent the views of clients, friends, or family, even when supported by good and sufficient reason.