TORTINI

For your delectation and delight, desultory dicta on the law of delicts.

Specious Claiming in Multi-District Litigation

May 2nd, 2019

In a recent article in an American Bar Association newsletter, Paul Rheingold notes with some concern that, in the last two years or so, there has been a rash of dismissals of entire multi-district litigations (MDLs) based upon plaintiffs’ failure to produce expert witnesses who can survive Rule 702 gatekeeping.[1]  Paul D. Rheingold, “Multidistrict Litigation Mass Terminations for Failure to Prove Causation,” A.B.A. Mass Tort Litig. Newsletter (April 24, 2019) [cited as Rheingold]. According to Rheingold, judges historically involved in the MDL processing of products liability cases did not grant summary judgments across the board. In other words, federal judges felt that if plaintiffs’ lawyers aggregated a sufficient number of cases, then their judicial responsibility was to push settlements or to remand the cases to the transferor courts for trial.

Missing from Rheingold’s account is the prevalent judicial view, in the early going of MDL of products cases, which held that judges lacked the authority to consider Rule 702 motions for all cases in the MDL. Gatekeeping motions were considered extreme and best avoided by pushing them off to the transferor courts upon remand. In MDL 926, involving silicone gel breast implants, the late Judge Sam Pointer, who was a member of the Rules Advisory Committee, expressed the view that Rule 702 gatekeeping was a trial court function, for the trial judge who received the case on remand from the MDL.[2] Judge Pointer’s view was a commonplace in the 1990s. As mass tort litigation moved into MDL “camps,” judges more frequently adopted a managerial rather than a judicial role, and exerted great pressure on the parties, and the defense in particular, to settle cases. These judges frequently expressed their view that the two sides so stridently disagreed on causation that the truth must be somewhere in between, and even with “a little causation,” the defendants should offer a little compensation. These litigation managers thus eschewed dispositive motion practice, or gave it short shrift.

Rheingold cites five recent MDL terminations based upon “Daubert failure,” and he acknowledges other MDLs collapsed because of federal pre-emption issues (Eliquis, Incretins, and possibly Fosamax), and that other fatally weak causal MDL claims settled for nominal compensation (NuvaRing). He omits other MDLs, such as In re Silica, in which an entire MDL collapsed because of prevalent fraud in the screening and diagnosing of silicosis claimants by plaintiffs’ counsel and their expert witnesses.[3] Also absent from his reckoning is the collapse of MDL cases against Celebrex[4] and Viagra[5].

Rheingold does concede that the recent across-the-board dismissals of MDLs were due to very weak causal claims.[6] He softens his judgment by suggesting that the weaknesses were apparent “at least in retrospect,” but the weaknesses were clearly discernible before litigation from the refusal of regulatory agencies, such as the FDA, to accept the litigation-driven causal claims. Rheingold also tries to assuage fellow plaintiffs’ counsel by suggesting that plaintiffs’ lawyers somehow fell prey to the pressure to file cases because of internet advertising and the encouragement of records collection and analysis firms. This attribution of naiveté to Plaintiffs’ Steering Committee (PSC) members does not ring true given the wealth and resources of lawyers on PSCs. Furthermore, the suggestion that PSC members may be newcomers to the MDL playing fields does not hold water given that most of the lawyers involved are “repeat players,” with substantial experience and financial incentives to sort out invalid expert witness opinions.[7]

Rheingold offers the wise counsel that plaintiffs’ lawyers “should take [their] time and investigate for [themselves] the potential proof available for causation and adequacy of labeling.” If history is any guide, his advice will not be followed.


[1] Rheingold cites five MDLs that were “Daubert failures” in recent times: (1) In re Lipitor (Atorvastatin Calcium) Marketing, Sales Practices & Prods. Liab. Litig. (MDL 2502), 892 F.3d 624 (4th Cir. 2018) (affirming Rule 702 dismissal of claims that atorvastatin use caused diabetes); (2) In re Mirena IUD Products Liab. Litig. (Mirena I, MDL 2434), 713 F. App’x 11 (2d Cir. 2017) (excluding expert witnesses’ opinion testimony that the intrauterine device caused embedment and perforation); (3) In re Mirena IUS Levonorgestrel-Related Prods. Liab. Litig. (Mirena II), 341 F. Supp. 3d 213 (S.D.N.Y. 2018) (Rule 702 exclusion of opinions that the product caused pseudotumor cerebri); (4) In re Zoloft (Sertraline Hydrochloride) Prods. Liab. Litig., 858 F.3d 787 (3d Cir. 2017) (affirming MDL trial court’s Rule 702 exclusions of opinions that Zoloft is teratogenic); (5) Jones v. SmithKline Beecham, 652 F. App’x 848 (11th Cir. 2016) (affirming MDL court’s Rule 702 exclusions of expert witness opinions that denture adhesive creams caused metal deficiencies).

[2]  Not only was Judge Pointer a member of the Rules committee, he was the principal author of the 1993 Amendments to the Federal Rules of Civil Procedure, as well as the editor-in-chief of the Federal Judicial Center’s Manual for Complex Litigation. At an ALI-ABA conference in 1997, Judge Pointer complained about the burden of gatekeeping. 3 Federal Discovery News 1 (Aug. 1997). He further opined that, under Rule 104(a), he could “look to decisions from the Southern District of New York and Eastern District of New York, where the same expert’s opinion has been offered and ruled upon by those judges. Their rulings are hearsay, but hearsay is acceptable. So I may use their rulings as a basis for my decision on whether to allow it or not.” Id. at 4. Even after Judge Jack Weinstein, one of the leading judges from the Southern and Eastern Districts of New York, excluded plaintiffs’ expert witnesses’ causal opinions in the silicone litigation, however, Judge Pointer avoided having to make an MDL-wide decision of comparable scope. See In re Breast Implant Cases, 942 F. Supp. 958 (E. & S.D.N.Y. 1996). Judge Pointer repeated his anti-Daubert views three years later at a symposium on expert witness opinion testimony. See Sam C. Pointer, Jr., “Response to Edward J. Imwinkelried, the Taxonomy of Testimony Post-Kumho: Refocusing on the Bottom Lines of Reliability and Necessity,” 30 Cumberland L. Rev. 235 (2000).

[3]  In re Silica Products Liab. Litig., MDL No. 1553, 398 F. Supp. 2d 563 (S.D. Tex. 2005).

[4]  In re Bextra & Celebrex Marketing Sales Practices & Prod. Liab. Litig., 524 F. Supp. 2d 1166 (N.D. Cal. 2007) (excluding virtually all relevant expert witness testimony proffered to support claims that ordinary dosages of these COX-2 inhibitors caused cardiovascular events).

[5]  In re Viagra Products Liab. Litig., 572 F. Supp. 2d 1071 (D. Minn. 2008) (addressing claims that sildenafil causes vision loss from non-arteritic anterior ischemic optic neuropathy (NAION)).

[6]  Rheingold (“Examining these five mass terminations, at least in retrospect[,] it is apparent that they were very weak on causation.”)

[7] See Elizabeth Chamblee Burch & Margaret S. Williams, “Repeat Players in Multidistrict Litigation: The Social Network,” 102 Cornell L. Rev. 1445 (2017); Margaret S. Williams, Emery G. Lee III & Catherine R. Borden, “Repeat Players in Federal Multidistrict Litigation,” 5 J. Tort L. 141, 149–60 (2014).

Good Night Styrene

April 18th, 2019

Perri Klass is a pediatrician who writes fiction and non-fiction. Her editorial article on “disruptive chemicals,” in this week’s Science Section of the New York Times contained large segments of fiction.[1]  The Times gives Dr. Klass, along with Nicholas Kristof and others, a generous platform to advance their chemophobic propaganda, on pesticides, phthalates, bisphenols, and flame retardants, without the bother of having to cite evidence. It has been just two weeks since the Times published another Klass fear piece on hormone disrupters.[2]

In her Science Times piece, Klass plugged Leonardo Trasande’s book, Sicker, Fatter, Poorer: The Urgent Threat of Hormone-Disrupting Chemicals to Our Health and Future . . . and What We Can Do About It (2019), to help wind up parents about chemical threats everywhere. Trasande is “an internationally renowned leader in environmental health”; his website tells us so. Klass relies so extensively upon Trasande that it is difficult to discern whether she is presenting anything other than his opinions, which in some places she notes he has qualified as disputed and dependent upon correlational associations that have not established causal associations.

When it comes to recyclable plastic, number 6, Klass throws all journalistic caution and scientific scruple aside and tells us that “[a] number 6 denotes styrene, which is a known carcinogen.”[3] Known to whom? To Trasande? To Klass? To eco-zealots?

The first gaffe is that number 6 plastic, of course, is not styrene; rather it is polystyrene. Leaching of monomer certainly can occur,[4] and is worth noting, but equating polystyrene with styrene is simply wrong. The second gaffe, more serious yet, is that styrene is not a “known” carcinogen.

The International Agency for Research on Cancer, which has been known to engage in epistemic inflation about carcinogenicity, addressed styrene in its monograph 82.[5] Styrene was labeled a “2B” carcinogen, that is, possible, not probable, and certainly not “known.” Last year, an IARC working group revisited the assessment of styrene, and in keeping with its current practice of grade inflation bumped styrene up to Group 2A, “probably carcinogenic to humans,” based upon limited evidence in human beings and sufficient evidence in rats and close relatives.[6] In any event, the IARC Monograph number 121, which will address styrene, is under preparation.

A responsible journalist, or scientist, regulator, or lawyer, is obligated, however, to note that “probably” does not mean “more likely than not” in IARC-jargon.[7] Given that all empirical propositions have a probability of being true, somewhere between 0 and 100%, but never actually equal to 0 or 100%, the IARC classifications of “probably” causing cancer are probably not particularly meaningful. Everything “probably” causes cancer, in this mathematical sense.[8]

In the meanwhile, what does the scientific community have to say about the carcinogenicity of styrene?

Recent reviews and systematic reviews of the styrene carcinogenicity issue have mostly concluded that there is no causal relationship between styrene exposure and any form of cancer in humans.[9] Of course, the “Lobby,” scientists in service to the litigation industry, disagree.[10]


[1]  Perri Klass, “Beware of Disruptive Chemicals,” N.Y. Times (April 16, 2019).

[2] Perri Klass, “How to Minimize Exposures to Hormone Disrupters,” N.Y. Times (April 1, 2019).

[3]  Klass (April 16, 2019), at D6, col. 3.

[4]  See, e.g., Despoina Paraskevopoulou, Dimitris Achilias, and Adamantini Paraskevopoulou, “Migration of styrene from plastic packaging based on polystyrene into food simulants,” 61 Polymer Int’l 141 (2012); J. R. Withey, “Quantitative Analysis of Styrene Monomer in Polystyrene and Foods Including Some Preliminary Studies of the Uptake and Pharmacodynamics of the Monomer in Rats,” 17 Envt’l Health Persp. 125 (1976).

[5]  IARC Monograph No. 82, at 437-78 (2002).

[6]  IARC Working Group, “Carcinogenicity of quinoline, styrene, and styrene-7,8-oxide,” 19 Lancet Oncology 728 (2018).

[7]  The IARC Preamble definition of probable reveals that “probable” does not mean greater than 50%. See also “The IARC Process is Broken” (May 4, 2016).

[8] See Ed Yong, “Beefing With the World Health Organization’s Cancer Warnings,” The Atlantic (Oct 26, 2015).

[9]  Boffetta, P., Adami, H. O., Cole, P., Trichopoulos, D. and Mandel, J. S., “Epidemiologic studies of styrene and cancer: a review of the literature,” 51 J. Occup. & Envt’l Med. 1275 (2009) (“The available epidemiologic evidence does not support a causal relationship between styrene exposure and any type of human cancer.”); James J. Collins & Elizabeth Delzell, “A systematic review of epidemiologic studies of styrene and cancer,” 48 Critical Revs. Toxicol. 443 (2018)  (“Consideration of all pertinent data, including substantial recent research, indicates that the epidemiologic evidence on the potential carcinogenicity of styrene is inconclusive and does not establish that styrene causes any form of cancer in humans.”).

[10] James Huff & Peter F. Infante, “Styrene exposure and risk of cancer,” 26 Mutagenesis 583 (2011).

Expert Witnesses Who Don’t Mean What They Say

March 24th, 2019

‘Then you should say what you mean’, the March Hare went on.
‘I do’, Alice hastily replied; ‘at least–at least I mean what I say–that’s the same thing, you know’.
‘Not the same thing a bit!’ said the Hatter. ‘You might just as well say that “I see what I eat” is the same thing as “I eat what I see!”’

Lewis Carroll, Alice’s Adventures in Wonderland, Chapter VII (1865)

Anick Bérard is an epidemiologist at the Université de Montréal. Most of her publications involve birth outcomes and maternal medication use, but Dr. Bérard’s advocacy also involves social media (Facebook, YouTube) and expert witnessing in litigation against the pharmaceutical industry.

When the FDA issued its alert about cardiac malformations in children born to women who took Paxil (paroxetine) in their first trimesters of pregnancy, the agency characterized its assessment of the “early results of new studies for Paxil” as “suggesting that the drug increases the risk for birth defects, particularly heart defects, when women take it during the first three months of pregnancy.”1 The agency also disclaimed any conclusion of “class effect” among the other selective serotonin reuptake inhibitors (SSRIs), such as Zoloft (sertraline), Celexa (citalopram), and Prozac (fluoxetine). Indeed, the FDA requested the manufacturer of paroxetine to undertake additional research to look at teratogenicity of paroxetine, as well as the possibility of class effects. That research never showed an SSRI teratogenicity class effect.

A “suggestion” from the FDA of an adverse effect is sufficient to launch a thousand litigation complaints, which were duly filed against GlaxoSmithKline. The plaintiffs’ counsel recruited Dr. Bérard to serve as an expert witness in support of a wide array of birth defects in Paxil cases. In her hands, the agency’s “suggestion” of causation became a conclusion. The defense challenged Bérard’s opinions, but the federal court took the motion to exclude her causal opinions under advisement, without decision. Hayes v. SmithKline Beecham Corp., 2009 WL 4912178 (N.D. Okla. Dec. 14, 2009). One case in state court went to trial, with a verdict for plaintiffs.

Despite Dr. Bérard’s zealous advocacy for a causal association between Paxil and birth defects, she declined to assert any association between maternal use of the other, non-paroxetine SSRIs and birth defects. Here is an excerpt from her Rule 26 report in a paroxetine case:

“Taken together, the available scientific evidence makes it clear that Paxil use during the first trimester of pregnancy is an independent risk factor that at least doubles the risk of cardiovascular malformations in newborns at all commonly used doses. This risk has been consistent and was further reinforced by repeated observational study findings as well as meta-analyses results. No such associations were found with other types of SSRI exposures during gestation.”2

In her sworn testimony, Dr. Bérard made clear that she really meant what she had written in her report, about exculpating the non-paroxetine SSRIs of any association with birth defects:

“Q. Is it fair to say that you will not be offering an opinion that SSRIs as a class, or individual SSRIs other than Paxil increased the risk of cardiovascular malformations in newborns?

A. This is not what I was asked to do.

Q. But in fact you actually write in your report that you don’t believe there’s sufficient data to reach any conclusion about other SSRIs, true?

A. Correct.”3

In 2010, Dr. Bérard, along with two professional colleagues, published what they called a systematic review of antidepressant use in pregnancy and birth outcomes.4 In this review, Bérard specifically advised that paroxetine should be avoided by women of childbearing age, but she and her colleagues affirmatively encouraged use of other SSRIs, such as fluoxetine, sertraline, and citalopram:

Clinical Approach: A Brief Overview

For women planning a pregnancy or when a treatment initiation during pregnancy is deemed necessary, the decision should rely not only on drug safety data but also on other factors such as the patient’s condition, previous response to other antidepressants, comorbidities, expected adverse effects and potential interactions with other current pharmacological treatments. Since there is a more extensive clinical experience with SSRIs such as fluoxetine, sertraline, and citalopram, these agents should be used as first-line therapies. Whenever possible, one should refrain from prescribing paroxetine to women of childbearing potential or planning a pregnancy. However, antenatal screening such as fetal echocardiography should be considered in a woman exposed prior to finding out about her pregnancy.5

When Bérard wrote and published her systematic review, she was still actively involved as an expert witness for plaintiffs in lawsuits against the manufacturers of paroxetine. In her 2010 review, Dr. Bérard gave no acknowledgment of monies earned in her capacity as an expert witness, and her disclosure of potential conflicts of interest was limited to noting that she was “a consultant for a plaintiff in the litigation involving Paxil.”6 In fact, Bérard had submitted multiple reports, testified at deposition, and had been listed as a testifying expert witness in many cases involving Paxil or paroxetine.

Not long after the 2010 review article, Glaxo settled most of the pending paroxetine birth defect cases, and the plaintiffs’ bar pivoted to recast their expert witnesses’ opinions as causal teratogenic conclusions about the entire class of SSRIs. In 2012, the federal courts established a “multi-district litigation,” MDL 2342, for birth defect cases involving Zoloft (sertraline), in the Philadelphia courtroom of Judge Cynthia Rufe, in the Eastern District of Pennsylvania.

Notwithstanding her 2010 clinical advice that pregnant women with depression should use fluoxetine, sertraline, or citalopram, Dr. Bérard became actively involved in the new litigation against the other, non-Paxil SSRI manufacturers. By 2013, Dr. Bérard was on record as a party expert witness for plaintiffs, opining that sertraline causes virtually every major congenital malformation.7

In the same year, 2013, Dr. Bérard published another review article on teratogens, but now she gave a more equivocal view of the other SSRIs, claiming that they were “known teratogens,” but acknowledging in a footnote that teratogenicity of the SSRIs was “controversial.”8 Incredibly, this review article states that “Anick Bérard and Sonia Chaabane have no potential conflicts of interest to disclose.”9

Ultimately, Dr. Bérard could not straddle her own contradictory statements and remain upright, which encouraged the MDL court to examine her opinions closely for methodological shortcomings and failures. Although Bérard had evolved to claim a teratogenic “class effect” for all the SSRIs, the scientific support for her claim was somewhere between weak and absent.10 Perhaps even more distressing, many of the pending claims involving the other SSRIs arose from pregnancies and births that predated Bérard’s epiphany about class effect. Finding ample evidence of specious claiming, the federal court charged with oversight of the sertraline birth defect claims excluded Dr. Bérard’s causal opinions for failing to meet the requirements of Federal Rule of Evidence 702.11

Plaintiffs sought to substitute Nicholas Jewell for Dr. Bérard, but Dr. Jewell fared no better, and was excluded for other methodological shenanigans.12 Ultimately, a unanimous panel of the United States Court of Appeals, for the Third Circuit, upheld the expert witness exclusions.13


1 See “FDA Advising of Risk of Birth Defects with Paxil; Agency Requiring Updated Product Labeling,” P05-97 (Dec. 8, 2005) (emphasis added).

2 Bérard Report in Hayes v. SmithKline Beecham Corp., 2009 WL 3072955, at *4 (N.D. Okla. Feb. 4, 2009) (emphasis added).

3 Deposition Testimony of Anick Bérard, in Hayes v. SmithKline Beecham Corp., at 120:16-25 (N.D. Okla. April 2009).

4 Marieve Simoncelli, Brigitte-Zoe Martin & Anick Bérard, “Antidepressant Use During Pregnancy: A Critical Systematic Review of the Literature,” 5 Current Drug Safety 153 (2010).

5 Id. at 168b.

6 Id. at 169 (emphasis added).

7 See Anick Bérard, “Expert Report” (June 19, 2013).

8 Sonia Chaabane & Anick Bérard, “Epidemiology of Major Congenital Malformations with Specific Focus on Teratogens,” 8 Current Drug Safety 128, 136 (2013).

9 Id. at 137b.

10 See, e.g., Nicholas Myles, Hannah Newall, Harvey Ward, and Matthew Large, “Systematic meta-analysis of individual selective serotonin reuptake inhibitor medications and congenital malformations,” 47 Australian & New Zealand J. Psychiatry 1002 (2013).

11 See In re Zoloft (Sertraline Hydrochloride) Prods. Liab. Litig., MDL No. 2342; 26 F. Supp. 3d 449 (E.D. Pa. 2014) (Rufe, J.). Plaintiffs, through their Plaintiffs’ Steering Committee, moved for reconsideration, but Judge Rufe reaffirmed her exclusion of Dr. Bérard. In re Zoloft (Sertraline Hydrochloride) Prods. Liab. Litig., MDL No. 2342; 12-md-2342, 2015 WL 314149 (E.D. Pa. Jan. 23, 2015) (Rufe, J.) (denying PSC’s motion for reconsideration). See “Zoloft MDL Relieves Matrixx Depression” (Jan. 30, 2015).

12 See In re Zoloft Prods. Liab. Litig., No. 12–md–2342, 2015 WL 7776911 (E.D. Pa. Dec. 2, 2015) (excluding Jewell’s opinions as scientifically unwarranted and methodologically flawed); In re Zoloft Prod. Liab. Litig., MDL No. 2342, 12-MD-2342, 2016 WL 1320799 (E.D. Pa. April 5, 2016) (granting summary judgment after excluding Dr. Jewell). See also “The Education of Judge Rufe – The Zoloft MDL” (April 9, 2016).

The Joiner Finale

March 23rd, 2019

“This is the end
Beautiful friend

This is the end
My only friend, the end”

Jim Morrison, “The End” (c. 1966)


The case of General Electric Co. v. Joiner, 522 U.S. 136 (1997), was based only in part upon polychlorinated biphenyl (PCB) exposures. The PCB part did not hold up well legally in the Supreme Court; nor was the PCB lung cancer claim vindicated by later scientific evidence. See “How Have Important Rule 702 Holdings Held Up With Time?” (Mar. 20, 2015).

The Supreme Court in Joiner reversed and remanded the case to the 11th Circuit, which then remanded the case back to the district court to address claims that Mr. Joiner had been exposed to furans and dioxins, and that these other chemicals had caused, or contributed to, his lung cancer, as well. Joiner v. General Electric Co., 134 F.3d 1457 (11th Cir. 1998) (per curiam). Thus the dioxins were left in the case even after the Supreme Court ruled.

After the Supreme Court’s decision, Anthony Roisman argued that the Court had addressed an artificial question when asked about PCBs alone because the case was really about an alleged mixture of exposures, and he held out hope that the Joiners would do better on remand. Anthony Z. Roisman, “The Implications of G.E. v. Joiner for Admissibility of Expert Testimony,” 1 Res Communes 65 (1999).

Many Daubert observers (including me) are unaware of the legal fate of the Joiners’ claims on remand. In the only reference I could find, the commentator simply noted that the case resolved before trial.[1] I am indebted to Michael Risinger, and Joseph Cecil, for pointing me to documents from PACER, which shed some light upon the Joiner “endgame.”

In February 1998, Judge Orinda Evans, who had been the original trial judge, and who had sustained defendants’ Rule 702 challenges and granted their motions for summary judgments, received and reopened the case upon remand from the 11th Circuit. In March, Judge Evans directed the parties to submit a new pre-trial order by April 17, 1998. At a status conference in April 1998, Judge Evans permitted the plaintiffs additional discovery, to be completed by June 17, 1998. Five days before the expiration of their additional discovery period, the plaintiffs moved for additional time; defendants opposed the request. In July, Judge Evans granted the requested extension, and gave defendants until November 1, 1998, to file for summary judgment.

Meanwhile, in June 1998, new counsel entered their appearances for plaintiffs – William Sims Stone, Kevin R. Dean, Thomas Craig Earnest, and Stanley L. Merritt. The docket does not reflect much of anything about the new discovery other than a request for a protective order for an unpublished study. But by October 6, 1998, the new counsel, Earnest, Dean, and Stone (but not Merritt) withdrew as attorneys for the Joiners, and by the end of October 1998, Judge Evans entered an order to dismiss the case, without prejudice.

A few months later, in February 1999, the parties filed a stipulation, approved by the Clerk, dismissing the action with prejudice, and with each party to bear its own costs. Given the flight of plaintiffs’ counsel, and the dismissals without and then with prejudice, a settlement seems never to have been involved in the resolution of the Joiner case. In the end, the Joiners’ case fizzled perhaps to avoid being Frye’d.

And what has happened since to the science of dioxins and lung cancer?

Not much.

In 2006, the National Research Council published a monograph on dioxin, which took the controversial approach of focusing on all cancer mortality rather than specific cancers that had been suggested as likely outcomes of interest. See David L. Eaton (Chairperson), Health Risks from Dioxin and Related Compounds – Evaluation of the EPA Reassessment (2006). The validity of this approach, and the committee’s conclusions, were challenged vigorously in subsequent publications. Paolo Boffetta, Kenneth A. Mundt, Hans-Olov Adami, Philip Cole, and Jack S. Mandel, “TCDD and cancer: A critical review of epidemiologic studies,” 41 Critical Rev. Toxicol. 622 (2011) (“In conclusion, recent epidemiological evidence falls far short of conclusively demonstrating a causal link between TCDD exposure and cancer risk in humans.”).

In 2013, the Industrial Injuries Advisory Council (IIAC), an independent scientific advisory body in the United Kingdom, published a review of lung cancer and dioxin. The Council found the epidemiologic studies mixed, and declined to endorse the compensability of lung cancer for dioxin-exposed industrial workers. Industrial Injuries Advisory Council – Information Note on Lung cancer and Dioxin (December 2013). See also Mann v. CSX Transp., Inc., 2009 WL 3766056, 2009 U.S. Dist. LEXIS 106433 (N.D. Ohio 2009) (Polster, J.) (dioxin exposure case) (“Plaintiffs’ medical expert, Dr. James Kornberg, has opined that numerous organizations have classified dioxins as a known human carcinogen. However, it is not appropriate for one set of experts to bring the conclusions of another set of experts into the courtroom and then testify merely that they ‘agree’ with that conclusion.”), citing Thorndike v. DaimlerChrysler Corp., 266 F. Supp. 2d 172 (D. Me. 2003) (court excluded expert who was “parroting” other experts’ conclusions).

Last year, a study of an industrial cohort followed for two decades found no increased risk of lung cancer among workers exposed to dioxin. David I. McBride, James J. Collins, Thomas John Bender, Kenneth M. Bodner, and Lesa L. Aylward, “Cohort study of workers at a New Zealand agrochemical plant to assess the effect of dioxin exposure on mortality,” 8 Brit. Med. J. Open e019243 (2018) (reporting SMR for lung cancer 0.95, 95% CI: 0.56 to 1.53).


[1] Morris S. Zedeck, Expert Witness in the Legal System: A Scientist’s Search for Justice 49 (2010) (noting that, after remand from the Supreme Court, Joiner v. General Electric resolved before trial).

 

Apportionment and Pennsylvania’s Fair Share Act

March 14th, 2019

In 2011, Pennsylvania enacted the Fair Share Act, which was remedial legislation designed to mitigate the unfairness of joint and several liability in mass, and other, tort litigation by abrogating joint and several liability in favor of apportionment of shares among multiple defendants, including settled defendants.1

Although the statute stated the general rule in terms of negligence,2 the Act was clearly intended to apply to actions for so-called strict liability:3

“(1) Where recovery is allowed against more than one person, including actions for strict liability, and where liability is attributed to more than one defendant, each defendant shall be liable for that proportion of the total dollar amount awarded as damages in the ratio of the amount of that defendant’s liability to the amount of liability attributed to all defendants and other persons to whom liability is apportioned under subsection.”

The intended result of the legislation was for courts to enter separate and several judgments against defendants held liable in the amount apportioned to each defendant’s liability.4 The Act created exceptions for intentional torts and for cases in which a defendant receives a 60% or greater share in the apportionment.5
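The arithmetic of the quoted provision can be illustrated with a minimal sketch. Everything in it is hypothetical: the party names, the $1,000,000 award, and the liability percentages are invented for illustration, and the function name `apportion` is an assumption of this sketch, not anything found in the statute. The 60% exception is represented only as a flag on a defendant's share; the sketch does not attempt to model how a court would actually enter judgment, and it ignores the intentional tort exception entirely.

```python
from fractions import Fraction

# Assumption of this sketch: the 60% exception is modeled only as a flag.
JOINT_AND_SEVERAL_THRESHOLD = Fraction(60, 100)


def apportion(total_damages, liability_by_party):
    """Split total_damages in the ratio of each party's attributed liability
    to the liability attributed to all parties (defendants and other persons)."""
    total_liability = sum(liability_by_party.values())
    if total_liability <= 0:
        raise ValueError("attributed liability must be positive")
    shares = {}
    for party, liability in liability_by_party.items():
        ratio = Fraction(liability, total_liability)
        shares[party] = {
            "ratio": ratio,
            "several_share": total_damages * ratio,
            # True when the share reaches the 60% threshold described above.
            "joint_and_several": ratio >= JOINT_AND_SEVERAL_THRESHOLD,
        }
    return shares


# Hypothetical verdict: three defendants, liability attributed in percent.
example = apportion(
    1_000_000,
    {"Defendant A": 10, "Defendant B": 25, "Defendant C": 65},
)
for party, result in example.items():
    print(party, int(result["several_share"]), result["joint_and_several"])
```

On these invented numbers, the minor defendant with a 10% attribution answers for a tenth of the award rather than the whole of it, which is the point of the legislation.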

In Pennsylvania, as in other states, judges sometimes fall prey to the superstition that the law, procedural and substantive, does not apply to asbestos cases. Roverano v. John Crane, Inc., is an asbestos case in which the plaintiff claimed his lung cancer was caused by exposure to multiple defendants’ products. The trial judge, falling under the sway of asbestos exceptionalism, refused to apply the Fair Share Act, suggesting that “the jury was not presented with evidence that would permit an apportionment to be made by it.”

The Roverano trial judge’s suggestion is remarkable, given that any plaintiff is exposed to different asbestos products in distinguishable amounts, and for distinguishable durations. Furthermore, asbestos products have distinguishable, relative levels of friability, with different levels of respirable fiber exposure for the plaintiff. In some cases, the products contain different kinds of asbestos minerals, which have distinguishable and relative levels of potency to cause the plaintiff’s specific disease. Asbestos cases, whether involving asbestosis, lung cancer, or mesothelioma claims, are more amenable to apportionment of shares among co-defendants than are “red car / blue car” cases.

Pennsylvania’s intermediate appellate court reversed the trial court’s asbestos exceptionalism, and held that upon remand, the court must:

“[a]pply a non-per capita allocation to negligent joint tortfeasors and strict liability joint tortfeasors; and permit evidence of settlements reached between plaintiffs and bankrupt entities to be included in the calculation of allocation of liability.”

Roverano v. John Crane, 2017 Pa. Super. 415, 177 A.3d 892 (2017).

The Superior Court’s decision did not sit well with the litigation industry, which likes joint and several liability, with equal shares. Joint and several liability permits plaintiffs’ counsel to extort large settlements from minor defendants who face the prospect of out-sized pro rata shares after trial, without the benefit of reductions for the shares of settled bankrupt defendants. The Roverano plaintiff appealed from the Superior Court’s straightforward application of a remedial statute.

What should be a per curiam affirmance of the Superior Court, however, could result in another act of asbestos exceptionalism by the Pennsylvania Supreme Court. Media reports of the oral argument in Roverano suggest that several of the justices invoked the specter of “junk science” in apportioning shares among asbestos co-defendants.6 Disrespectfully, Justice Max Baer commented:

“Respectfully, your theory is interjecting junk science. We’ve never held that duration of contact corresponds with culpability.”7

The Pennsylvania Justices’ muddle can be easily avoided. First, the legislature clearly expressed its intention that apportionment be permitted in strict liability cases.

Second, failure-to-warn strict liability cases are, as virtually all scholars and most courts recognize, essentially negligence cases, in any event.8

Third, apportionment is a well-recognized procedure in the law of torts, including the Pennsylvania law of torts. Apportionment of damages among various causes was recognized in the Restatement (Second) of Torts, Section 433A (Apportionment of Harm to Causes), which specifies that:

(1) Damages for harm are to be apportioned among two or more causes where

(a) there are distinct harms, or

(b) there is a reasonable basis for determining the contribution of each cause to a single harm.

Restatement (Second) of Torts § 433A(1) (1965) [hereinafter cited as Section 433A].

The comments to Section 433A suggest a liberal application for apportionment. The rules set out in Section 433A apply “whenever two or more causes have combined to bring about harm to the plaintiff, and each has been a substantial factor in producing the harm … .”

Id., comment a. The independent causes may be tortious or innocent, “and it is immaterial whether all or any of such persons are joined as defendants in the particular action.” Id. Indeed, apportionment also applies when the defendant’s conduct combines “with the operation of a force of nature, or with a pre-existing condition which the defendant has not caused, to bring about the harm to the plaintiff.” Just as the law of grits applies in everyone’s kitchen, the law of apportionment applies in Pennsylvania courts.

Apportionment of damages is an accepted legal principle in Pennsylvania law. Martin v. Owens-Corning Fiberglas Corp., 515 Pa. 377, 528 A.2d 947 (1987). Courts, applying Pennsylvania law, have permitted juries to apportion damages between asbestos and cigarette smoking as causal factors in plaintiffs’ lung cancers, based upon a reasonable basis for determining the contribution of each source of harm to a single harm.9

In Parker, none of the experts assigned exact mathematical percentages to the probability that asbestos rather than smoking caused the lung cancer. The Court of Appeals noted that on the record before it:

“we cannot say that no reasonable basis existed for determining the contribution of cigarette smoking to the cancer suffered by the decedent.”10

The Pennsylvania Supreme Court has itself affirmed the proposition that “liability attaches to a negligent act only to the degree that the negligent act caused the employee’s injury.”11 Thus, even in straight-up negligence cases, causal apportionment must play a role, even when the relative causal contributions are much harder to determine than in the quasi-quantitative setting of an asbestos exposure claim.

Let’s hope that Justice Baer and his colleagues read the statute and the case law before delivering judgment. The first word in the name of the legislation is Fair.


1 42 Pa.C.S.A. § 7102.

2 42 Pa.C.S.A. § 7102(a)

3 42 Pa.C.S.A. § 7102(a)(1) (emphasis added).

4 42 Pa.C.S.A. § 7102(a)(2).

5 42 Pa.C.S.A. § 7102 (a)(3)(ii), (iii).

7 Id. (quoting Baer, J.).

8 See, e.g., Restatement (Third) of Torts: Products Liability § 2, and comment i (1998); Fane v. Zimmer, Inc., 927 F.2d 124, 130 (2d Cir. 1991) (“Failure to warn claims purporting to sound in strict liability and those sounding in negligence are essentially the same.”).

9 Parker v. Bell Asbestos Mines, No. 86-1197, unpublished slip op. at 5 (3d Cir., Dec. 30, 1987) (per curiam) (citing Section 433A as Pennsylvania law, and Martin v. Owens-Corning Fiberglas Corp., 515 Pa. 377, 528 A.2d 947, 949 (1987)).

10 Id. at 7.

11 Dale v. Baltimore & Ohio RR., 520 Pa. 96, 106, 552 A.2d 1037, 1041 (1989). See also McAllister v. Pennsylvania RR., 324 Pa. 65, 69-70, 187 A. 415, 418 (1936) (holding that plaintiff’s impairment, and pain and suffering, can be apportioned between two tortious causes; plaintiff need not separate damages with exactitude); Shamey v. State Farm Mutual Auto. Ins. Co., 229 Pa. Super. 215, 223, 331 A.2d 498, 502 (1974) (citing, and relying upon, Section 433A; difficulties in proof do not constitute sufficient reason to hold a defendant liable for the damage inflicted by another person). Pennsylvania law is in accord with the law of other states as well, on apportionment. See Waterson v. General Motors Corp., 111 N.J. 238, 544 A.2d 357 (1988) (holding that a strict liability claim against General Motors for an unreasonably dangerous product defect was subject to apportionment for contribution from failing to wear a seat belt) (the jury’s right to apportion furthered the public policy of properly allocating the costs of accidents and injuries).

ASA Statement Goes to Court – Part 2

March 7th, 2019

It has been almost three years since the American Statistical Association (ASA) issued its statement on statistical significance. Ronald L. Wasserstein & Nicole A. Lazar, “The ASA’s Statement on p-Values: Context, Process, and Purpose,” 70 The American Statistician 129 (2016) [ASA Statement]. Before the ASA’s Statement, courts and lawyers from all sides routinely misunderstood, misstated, and misrepresented the meaning of statistical significance.1 These errors were pandemic despite the efforts of the Federal Judicial Center and the National Academies of Science to educate judges and lawyers, through their Reference Manuals on Scientific Evidence and seminars. The interesting question is whether the ASA’s Statement has improved, or will improve, the unfortunate situation.2

The ASA Statement on Testosterone

“Ye blind guides, who strain out a gnat and swallow a camel!”
Matthew 23:24

To capture the state of the art, or the state of correct and flawed interpretations of the ASA Statement, a review of a recent, now resolved, large so-called mass tort may be illustrative. Pharmaceutical products liability cases almost always turn on evidence from pharmaco-epidemiologic studies that compare the rate of an outcome of interest among patients taking a particular medication with the rate among similar, untreated patients. These studies compare the observed with the expected rates, and invariably assess the differences as either a “risk ratio,” or a “risk difference,” for both the magnitude of the difference and for “significance probability” of observing a rate at least as large as seen in the exposed group, given the assumptions that the medication did not change the rate and that the data followed a given probability distribution. In these alleged “health effects” cases, claims and counterclaims of misuse of significance probability have been pervasive. After the ASA Statement was released, some lawyers began to modify their arguments to suggest that their adversaries’ arguments offend the ASA’s pronouncements.
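For readers who want to see the quantities described above in concrete form, here is a minimal sketch of how a risk ratio, a risk difference, a confidence interval, and a significance probability might be computed from a simple two-group comparison. The counts are invented, the function name is hypothetical, and the large-sample normal approximation on the log scale is only one of several accepted methods; nothing here comes from any study in the testosterone litigation.

```python
import math
from statistics import NormalDist

def risk_ratio_summary(exposed_cases, exposed_total,
                       unexposed_cases, unexposed_total, confidence=0.95):
    """Large-sample summary of a two-group cohort comparison (illustrative only)."""
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    rr = risk_exposed / risk_unexposed
    # Standard error of log(RR) under the usual large-sample approximation.
    se_log_rr = math.sqrt(1 / exposed_cases - 1 / exposed_total
                          + 1 / unexposed_cases - 1 / unexposed_total)
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - (1 - confidence) / 2)
    z = math.log(rr) / se_log_rr
    return {
        "risk_ratio": rr,
        "risk_difference": risk_exposed - risk_unexposed,
        "ci_lower": math.exp(math.log(rr) - z_crit * se_log_rr),
        "ci_upper": math.exp(math.log(rr) + z_crit * se_log_rr),
        # Probability of a result at least this large, assuming no true difference.
        "p_one_sided": 1 - std_normal.cdf(z),
        # Probability of a result at least this extreme in either direction.
        "p_two_sided": 2 * (1 - std_normal.cdf(abs(z))),
    }

# Hypothetical cohort: 30 events among 1,000 exposed, 20 among 1,000 unexposed.
summary = risk_ratio_summary(30, 1000, 20, 1000)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

On these hypothetical counts, the risk ratio is 1.5, the 95% interval runs from roughly 0.86 to 2.62, and the two-sided p-value is about 0.16. Every one of those numbers is computed on the assumption of no true difference and a specified probability model; none is a probability that the assumption itself is true.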

One litigation that showcases the use and misuse of the ASA Statement arose from claims that AbbVie, Inc.’s transdermal testosterone medication (TRT) causes heart attacks, strokes, and venous thromboembolism. The FDA had reviewed the plaintiffs’ claims, made in a Public Citizen complaint, and resoundingly rejected the causal interpretation of two dubious observational studies, and an incomplete meta-analysis that used an off-beat composite end point.3 The Public Citizen petition probably did succeed in pushing the FDA to convene an Advisory Committee meeting, which again resulted in a rejection of the causal claims. The FDA did, however, modify the class labeling for TRT with respect to indication and a possible association with cardiovascular outcomes. And then the litigation came.

Notwithstanding the FDA’s determination that a causal association had not been shown, thousands of plaintiffs sued several companies, with most of the complaints falling on AbbVie, Inc., which had the largest presence in the market. The ASA Statement came up occasionally in pre-trial depositions, but became a major brouhaha, when AbbVie moved to exclude plaintiffs’ causation expert witnesses.4

The Defense’s Anticipatory Parry of the ASA Statement

As AbbVie described the situation:

“Plaintiffs’ experts uniformly seek to abrogate the established methods and standards for determining … causal factors in favor of precisely the kind of subjective judgments that Daubert was designed to avoid. Tests for statistical significance are characterized as ‘misleading’ and rejected [by plaintiffs’ expert witnesses] in favor of non-statistical ‘estimates’, ‘clinical judgment’, and ‘gestalt’ views of the evidence.”5

AbbVie’s brief in support of excluding plaintiffs’ expert witnesses barely mentioned the ASA Statement, but in a footnote, the defense anticipated the Plaintiffs’ opposition would be based on rejecting the importance of statistical significance testing and the claim that this rejection was somehow supported by the ASA Statement:

“The statistical community is currently debating whether scientists who lack expertise in statistics misunderstand p-values and overvalue significance testing. [citing ASA Statement] The fact that there is a debate among professional statisticians on this narrow issue does not validate Dr. Gerstman’s [plaintiffs’ expert witness’s] rejection of the importance of statistical significance testing, or undermine Defendants’ reliance on accepted methods for determining association and causation.”6

In its brief in support of excluding causation opinions, the defense took pains to define statistical significance, and managed to do so, painfully, or at least in ways that the ASA conferees would have found objectionable:

“Any association found must be tested for its statistical significance. Statistical significance testing measures the likelihood that the observed association could be due to chance variation among samples. Scientists evaluate whether an observed effect is due to chance using p-values and confidence intervals. The prevailing scientific convention requires that there be 95% probability that the observed association is not due to chance (expressed as a p-value < 0.05) before reporting a result as “statistically significant.” * * * This process guards against reporting false positive results by setting a ceiling for the probability that the observed positive association could be due to chance alone, assuming that no association was actually present.”7

AbbVie’s brief proceeded to characterize the confidence interval as a tool of significance testing, again in a way that misstates the mathematical meaning and importance of the interval:

“The determination of statistical significance can be described equivalently in terms of the confidence interval calculated in connection with the association. A confidence interval indicates the level of uncertainty that exists around the measured value of the association (i.e., the OR or RR). A confidence interval defines the range of possible values for the actual OR or RR that are compatible with the sample data, at a specified confidence level, typically 95% under the prevailing scientific convention. Reference Manual, at 580 (Ex. 14) (“If a 95% confidence interval is specified, the range encompasses the results we would expect 95% of the time if samples for new studies were repeatedly drawn from the same population.”). * * * If the confidence interval crosses 1.0, this means there may be no difference between the treatment group and the control group, therefore the result is not considered statistically significant.”8

Perhaps AbbVie’s counsel should be permitted a plea in mitigation by having cited to, and quoted from, the Reference Manual on Scientific Evidence’s chapter on epidemiology, which was also wide of the mark in its description of the confidence interval. Counsel would have been better served by the Manual’s more rigorous and accurate chapter on statistics. Even so, the above-quoted statements give an inappropriate interpretation of random error as a probability about the hypothesis being tested.9 Particularly dangerous, in terms of failing to advance AbbVie’s own objectives, was the characterization of the confidence interval as measuring the level of uncertainty, as though there were no other sources of uncertainty other than random error in the measurement of the risk ratio.
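Why the quoted description of the confidence interval misses the mark can be shown with a small simulation. What follows is a sketch under idealized assumptions that are mine, not the brief’s or the Manual’s: every study estimates the same true value, the estimates are normally distributed with a known standard error, and there is no bias or confounding. The true value, the standard error, and the function names are hypothetical.

```python
import math
import random
from statistics import NormalDist

Z95 = NormalDist().inv_cdf(0.975)  # about 1.96

def simulate(true_value=0.4, se=0.2, n_sims=50_000, seed=1):
    """Compare two different questions about 95% confidence intervals."""
    rng = random.Random(seed)
    covers_truth = 0
    captures_replicate = 0
    for _ in range(n_sims):
        first = rng.gauss(true_value, se)
        lower, upper = first - Z95 * se, first + Z95 * se
        # Question 1: does the interval contain the true value?
        if lower <= true_value <= upper:
            covers_truth += 1
        # Question 2: does an independent repeat study's estimate land inside it?
        replicate = rng.gauss(true_value, se)
        if lower <= replicate <= upper:
            captures_replicate += 1
    return covers_truth / n_sims, captures_replicate / n_sims

def expected_capture_rate():
    """Analytic capture rate for a same-sized replicate under the normal model."""
    return 2 * NormalDist().cdf(Z95 / math.sqrt(2)) - 1

coverage, capture = simulate()
print(f"intervals containing the true value: {coverage:.3f}")
print(f"intervals containing a replicate estimate: {capture:.3f}")
print(f"analytic capture rate: {expected_capture_rate():.3f}")
```

Under these assumptions, about 95% of the intervals contain the true value, which is what the coefficient of confidence describes. Only about 83% of them contain the estimate from a same-sized repeat study, so a given interval is not a range within which 95% of future results are expected to fall. And in neither case does the interval say anything about uncertainty from bias or confounding.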

The Plaintiffs’ Attack on Significance Testing

The Plaintiffs, of course, filed an opposition brief that characterized the defense position as an attempt to:

“elevate statistical significance, as measured by confidence intervals and so-called p-values, to the status of an absolute requirement to the establishment of causation.”10

Tellingly, the plaintiffs’ brief fails to point to any modern-era example of a scientific determination of causation based upon epidemiologic evidence, in which the pertinent studies were not assessed for, and found to show, statistical significance.

After citing a few judicial opinions that underplayed the importance of statistical significance, the Plaintiffs’ opposition turned to the ASA Statement for what it perceived to be support for its loosey-goosey approach to causal inference.11 The Plaintiffs’ opposition brief quoted a series of propositions from the ASA Statement, without the ASA’s elaborations and elucidations, and without much in the way of explanation or commentary. At the very least, the Plaintiffs’ heavy reliance upon, despite their distortions of, the ASA Statement helped them to define key statistical concepts more carefully than had AbbVie in its opening brief.

The ASA Statement, however, was not immune from being misrepresented in the Plaintiffs’ opposition brief. Many of the quoted propositions were quite beside the points of the dispute over the validity and reliability of Plaintiffs’ expert witnesses’ conclusions of causation about testosterone and heart attacks, conclusions not reached or shared by the FDA, any consensus statement from medical organizations, or any serious published systematic review:

“P-values do not measure the probability that the studied hypothesis is true, … .”12

This proposition from the ASA Statement is true, but trivially true. (Of course, this ASA principle is relevant to the many judicial decisions that have managed to misstate what p-values measure.) The above-quoted proposition follows from the definition and meaning of the p-value; only someone who did not understand significance probability would confuse it with the probability of the truth of the studied hypothesis. P-values’ not measuring the probability of the null hypothesis, or any alternative hypothesis, is not a flaw in p-values, but arguably their strength.
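The point can be made with simple arithmetic. The sketch below uses Bayes’ theorem to compute the probability that the null hypothesis is true given a “statistically significant” result, which is the quantity that the transposition fallacy confuses with the significance level. The prior probabilities and the power figure are hypothetical inputs chosen only for illustration, and the calculation conditions on the bare fact of a result below the threshold rather than on an exact p-value.

```python
def prob_null_given_significant(prior_null, alpha=0.05, power=0.8):
    """P(null is true | result is significant), by Bayes' theorem.

    prior_null: assumed probability, before the study, that there is no effect.
    alpha:      probability of a significant result when the null is true.
    power:      probability of a significant result when the null is false.
    """
    false_positive = alpha * prior_null
    true_positive = power * (1 - prior_null)
    return false_positive / (false_positive + true_positive)

# Same significance level, different (hypothetical) prior probabilities.
for prior in (0.5, 0.9, 0.99):
    posterior = prob_null_given_significant(prior)
    print(f"prior P(null) = {prior:.2f} -> P(null | significant) = {posterior:.3f}")
```

With the significance level held at 0.05, the resulting probability ranges from under 6% to over 85%, depending entirely on inputs that the p-value does not contain. That is why a p-value below 0.05 cannot be restated as a 95% probability that the association is real.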

“A p-value, or statistical significance, does not measure the size of an effect or the importance of a result.”13

Again, true, true, and immaterial. The existence of other importance metrics, such as the magnitude of an association or correlation, hardly detracts from the importance of assessing the random error in an observed statistic. The need to assess clinical or practical significance of an association or correlation also does not detract from the importance of the assessed random error in a measured statistic.

“By itself, a p-value does not provide a good measure of evidence regarding a model or hypothesis.”14

The Plaintiffs’ attempt to spin the above ASA statement as a criticism of p-values involves an ignoratio elenchi. Once again, the p-value assumes a probability model and a null hypothesis, and so it cannot provide a “measure” of the model’s or the hypothesis’s probability.

The Plaintiffs’ final harrumph on the ASA Statement was their claim that the ASA Statement’s conclusion was “especially significant” to the testosterone litigation:

“Good statistical practice, as an essential component of good scientific practice, emphasizes principles of good study design and conduct, a variety of numerical and graphical summaries of data, understanding of the phenomenon under study, interpretation of results in context, complete reporting and proper logical and quantitative understanding of what data summaries mean. No single index should substitute for scientific reasoning.”15

The existence of other important criteria in the evaluation and synthesis of a complex body of studies does not erase or supersede the importance of assessing stochastic error in the epidemiologic studies. Plaintiffs’ Opposition Brief asserted that the Defense’s attempt:

“to substitute the single index, the p-value, for scientific reasoning in the reports of Plaintiffs’ experts should be rejected.”16

Some of the defense’s opening brief could indeed be read as reducing causal inference to the determination of statistical significance. A sympathetic reading of the entire AbbVie brief, however, shows that it had criticized the threats to validity in the observational epidemiologic studies, as well as some of the clinical trials, and other rampant flaws in the Plaintiffs’ expert witnesses’ reasoning. The Plaintiffs’ citations to the ASA Statement’s “negative” propositions about p-values (to emphasize what they are not) appeared to be the stuffing of a strawman, used to divert attention from other failings of their own claims and proffered analyses. In other words, the substance of the Rule 702 application had much more to do with data quality and study validity than statistical significance.

What did the trial court make of this back and forth about statistical significance and the ASA Statement? For the most part, the trial court denied both sides’ challenges to proffered expert witness testimony on causation and statistical issues. In sorting the controversy over the ASA Statement, the trial court apparently misunderstood key statistical concepts and paid little attention to the threats to validity other than random variability in study results.17 The trial court summarized the controversy as follows:

“In arguing that the scientific literature does not support a finding that TRT is associated with the alleged injuries, AbbVie emphasize [sic] the importance of considering the statistical significance of study results. Though experts for both AbbVie and plaintiffs agree that statistical significance is a widely accepted concept in the field of statistics and that there is a conventional method for determining the statistical significance of a study’s findings, the parties and their experts disagree about the conclusions one may permissibly draw from a study result that is deemed to possess or lack statistical significance according to conventional methods of making that determination.”18

Of course, there was never a controversy presented to the court about drawing a conclusion from “a study.” By the time the briefs were filed, both sides had multiple observational studies, clinical trials, and meta-analyses to synthesize into opinions for or against causal claims.

Ironically, AbbVie might claim to have prevailed in having the trial court adopt its misleading definitions of p-values and confidence intervals:

“Statisticians test for statistical significance to determine the likelihood that a study’s findings are due to chance. *** According to conventional statistical practice, such a result *** would be considered statistically significant if there is a 95% probability, also expressed as a “p-value” of <0.05, that the observed association is not the product of chance. If, however, the p-value were greater than 0.05, the observed association would not be regarded as statistically significant, according to prevailing conventions, because there is a greater than 5% probability that the association observed was the result of chance.”19

The MDL court similarly appeared to accept AbbVie’s dubious description of the confidence interval:

“A confidence interval consists of a range of values. For a 95% confidence interval, one would expect future studies sampling the same population to produce values within the range 95% of the time. So if the confidence interval ranged from 1.2 to 3.0, the association would be considered statistically significant, because one would expect, with 95% confidence, that future studies would report a ratio above 1.0 – indeed, above 1.2.”20

The court’s opinion clearly evidences the danger in stating the importance of statistical significance without placing equal emphasis on the need to exclude bias and confounding. Having found an observational study and one meta-analysis of clinical trial safety outcomes that were statistically significant, the trial court held that any dispute over the probativeness of the studies was for the jury to assess.

Some but not all of AbbVie’s brief might have encouraged this lax attitude by failing to emphasize study validity at the same time as emphasizing the importance of statistical significance. In any event, the trial court continued with its précis of the plaintiffs’ argument that:

“a study reporting a confidence interval ranging from 0.9 to 3.5, for example, should certainly not be understood as evidence that there is no association and may actually be understood as evidence in favor of an association, when considered in light of other evidence. Thus, according to plaintiffs’ experts, even studies that do not show a statistically significant association between TRT and the alleged injuries may plausibly bolster their opinions that TRT is capable of causing such injuries.”21

Of course, a single study that reported a risk ratio greater than 1.0, with a confidence interval of 0.9 to 3.5, might be reasonably incorporated into a meta-analysis that in turn could support, or not support, a causal inference. In the TRT litigation, however, the well-conducted, most up-to-date meta-analyses did not report statistically significant elevated rates of cardiovascular events among users of TRT. The court’s insistence that a study with a confidence interval of 0.9 to 3.5 cannot be interpreted as evidence of no association is, of course, correct. Equally correct would be to say that the interval shows that the study failed to show an association. The trial court never grappled with the reality that the best conducted meta-analyses failed to show statistically significant increases in the rates of cardiovascular events.
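How a single imprecise study gets folded into a meta-analysis can be sketched with one common approach, fixed-effect inverse-variance pooling on the log scale. This is a simplified illustration, not the method of any meta-analysis in the TRT litigation. The three studies are invented; the first is constructed to have the 0.9 to 3.5 interval from the court’s example, and the function name is hypothetical.

```python
import math
from statistics import NormalDist

Z95 = NormalDist().inv_cdf(0.975)  # about 1.96

def pool_fixed_effect(studies):
    """Pool (risk_ratio, lower95, upper95) tuples by inverse-variance weighting."""
    weighted_sum = 0.0
    total_weight = 0.0
    for rr, lower, upper in studies:
        # Recover the standard error of log(RR) from the reported 95% interval.
        se = (math.log(upper) - math.log(lower)) / (2 * Z95)
        weight = 1 / se ** 2  # more precise studies get more weight
        weighted_sum += weight * math.log(rr)
        total_weight += weight
    pooled_log = weighted_sum / total_weight
    pooled_se = math.sqrt(1 / total_weight)
    return (math.exp(pooled_log),
            math.exp(pooled_log - Z95 * pooled_se),
            math.exp(pooled_log + Z95 * pooled_se))

# Hypothetical studies: one small study with a wide interval, two larger ones.
studies = [
    (1.77, 0.90, 3.50),
    (0.95, 0.80, 1.13),
    (1.05, 0.85, 1.30),
]
pooled, low, high = pool_fixed_effect(studies)
print(f"pooled RR = {pooled:.2f} (95% CI {low:.2f} to {high:.2f})")
```

On these made-up inputs, the small study with the elevated point estimate carries little weight, and the pooled estimate sits close to 1.0 with an interval that includes 1.0. Whether any such pooling is appropriate still depends on the validity of the underlying studies, which this arithmetic does not address.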

The American Statistical Association and its members would likely have been deeply disappointed by how both parties used the ASA Statement for their litigation objectives. AbbVie’s suggestion that the ASA Statement reflects a debate about “whether scientists who lack expertise in statistics misunderstand p-values and overvalue significance testing” would appear to have no support in the Statement itself or any other commentary to come out of the meeting leading up to the Statement. The Plaintiffs’ argument that p-values properly understood are unimportant and misleading similarly finds no support in the ASA Statement. Conveniently, the Plaintiffs’ brief ignored the Statement’s insistence upon transparency in pre-specification of analyses and outcomes, and in handling of multiple comparisons:

“P-values and related analyses should not be reported selectively. Conducting multiple analyses of the data and reporting only those with certain p-values (typically those passing a significance threshold) renders the reported p-values essentially uninterpretable. Cherrypicking promising findings, also known by such terms as data dredging, significance chasing, significance questing, selective inference, and ‘p-hacking’, leads to a spurious excess of statistically significant results in the published literature and should be vigorously avoided.”22

Most if not all of the plaintiffs’ expert witnesses’ reliance materials would have been eliminated under this principle set forth by the ASA Statement.
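The reason selective reporting renders p-values “essentially uninterpretable” is a matter of arithmetic. As a minimal sketch, assume a set of independent tests, each conducted at the 5% level, with no true effect anywhere; the independence assumption and the function name are mine, offered for illustration only.

```python
def familywise_error_rate(n_tests, alpha=0.05):
    """Chance of at least one 'significant' result among independent null tests."""
    return 1 - (1 - alpha) ** n_tests

for n in (1, 5, 20, 50):
    print(f"{n:>2} tests -> {familywise_error_rate(n):.3f}")
```

With twenty such analyses, the chance of at least one nominally significant finding is about 64%, even though nothing real is there to be found. A reader shown only the one finding that crossed the threshold has no way to know how many analyses were run to get it.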


1 See, e.g., In re Ephedra Prods. Liab. Litig., 393 F.Supp. 2d 181, 191 (S.D.N.Y. 2005). See also “Confidence in Intervals and Diffidence in the Courts” (March 4, 2012); “Scientific illiteracy among the judiciary” (Feb. 29, 2012).

3 Letter of Janet Woodcock, Director of FDA’s Center for Drug Evaluation and Research, to Sidney Wolfe, Director of Public Citizen’s Health Research Group (July 16, 2014) (denying citizen petition for “black box” warning).

4 Defendants’ (AbbVie, Inc.’s) Motion to Exclude Plaintiffs Expert Testimony on the Issue of Causation, and for Summary Judgment, and Memorandum of Law in Support, Case No. 1:14-CV-01748, MDL 2545, Document #: 1753, 2017 WL 1104501 (N.D. Ill. Feb. 20, 2017) [AbbVie Brief].

5 AbbVie Brief at 3; see also id. at 7-8 (“Depending upon the expert, even the basic tests of statistical significance are simply ignored, dismissed as misleading… .”) AbbVie’s definitions of statistical significance occasionally wandered off track and into the transposition fallacy, but generally its point was understandable.

6 AbbVie Brief at 63 n.16 (emphasis in original).

7 AbbVie Brief at 13 (emphasis in original).

8 AbbVie Brief at 13-14 (emphasis in original).

9 The defense brief further emphasized statistical significance almost as though it were a sufficient basis for inferring causality from observational studies: “Regardless of this debate, courts have routinely found the traditional epidemiological method—including bedrock principles of significance testing—to be the most reliable and accepted way to establish general causation. See, e.g., In re Zoloft, 26 F. Supp. 3d 449, 455; see also Rosen v. Ciba-Geigy Corp., 78 F.3d 316, 319 (7th Cir. 1996) (“The law lags science; it does not lead it.”).” AbbVie Brief at 63-64 & n.16. The defense’s language about “including bedrock principles of significance testing” absolves it of having totally ignored other necessary considerations, but still the defense might have advantageously pointed out the other needed considerations for causal inference at the same time.

10 Plaintiffs’ Steering Committee’s Memorandum of Law in Opposition to Motion of AbbVie Defendants to Exclude Plaintiffs’ Expert Testimony on the Issue of Causation, and for Summary Judgment at p.34, Case No. 1:14-CV-01748, MDL 2545, Document No. 1753 (N.D. Ill. Mar. 23, 2017) [Opp. Brief].

11 Id. at 35 (appending the ASA Statement and the commentary of more than two dozen interested commentators).

12 Id. at 38 (quoting from the ASA Statement at 131).

13 Id. at 38 (quoting from the ASA Statement at 132).

14 Id. at 38 (quoting from the ASA Statement at 132).

15 Id. at 38 (quoting from the ASA Statement at 132).

16 Id. at 38.

17 In re Testosterone Replacement Therapy Prods. Liab. Litig., MDL No. 2545, C.M.O. No. 46, 2017 WL 1833173 (N.D. Ill. May 8, 2017) [In re TRT].

18 In re TRT at *4.

19 In re TRT at *4.

20 Id.

21 Id. at *4.

22 ASA Statement at 131-32.

The Advocates’ Errors in Daubert

December 28th, 2018

Over 25 years ago, the United States Supreme Court answered a narrow legal question about whether the so-called Frye rule was incorporated into Rule 702 of the Federal Rules of Evidence. Plaintiffs in Daubert v. Merrell Dow Pharmaceuticals, Inc., 509 U.S. 579 (1993), appealed a Ninth Circuit ruling that the Frye rule survived, and was incorporated into, the enactment of a statutory evidentiary rule, Rule 702. As most legal observers can now discern, plaintiffs won the battle and lost the war. The Court held that the plain language of Rule 702 does not memorialize Frye; rather the rule requires an epistemic warrant for the opinion testimony of expert witnesses.

Many of the sub-issues of the Daubert case are now so much water over the dam. The case involved claims of birth defects from maternal use of an anti-nausea medication, Bendectin. Litigation over Bendectin is long over, and the medication is now approved for use in pregnant women, on the basis of a full new drug application, supported by clinical trial evidence.

In revisiting Daubert, therefore, we might imagine that legal scholars and scientists would be interested in the anatomy of the errors that led Bendectin plaintiffs stridently to maintain their causal claims. The oral argument before the Supreme Court is telling with respect to some of the sources of error. Two law professors, Michael H. Gottesman, for plaintiffs, and Charles Fried, for the defense, squared off one Tuesday morning in March 1993. A review of Gottesman’s argument reveals several fallacious lines of argument, which are still relevant today:

A. Regulation is Based Upon Scientific Determinations of Causation

In his oral argument, Gottesman asserted that regulators (as opposed to the scientific community) are in charge of determining causation,1 and environmental regulations are based upon scientific causation determinations.2 By the time that the Supreme Court heard argument in the Daubert case, this conflation of scientific and regulatory standards for causal conclusions was fairly well debunked.3 Gottesman’s attempt to mislead the Court failed, but the effort continues in courtrooms around the United States.

B. Similar Chemical Structures Have the Same Toxicities

Gottesman asserted that human teratogenicity can be determined from similarity in chemical structures with other established teratogens.4 Close may count in horseshoes, but in chemical structural activities, small differences in chemical structures can result in huge differences in toxicologic or pharmacologic properties. A silly little methyl group on a complicated hydrocarbon ring structure can make a world of difference, as in the difference between estrogen and testosterone.

C. All Animals React the Same to Any Given Substance

Gottesman, in his oral argument, maintained that human teratogenicity can be determined from teratogenicity in non-human, non-primate, murine species.5 The Court wasted little time on this claim, the credibility of which has continued to decline in the last 25 years.

D. The Transposition Fallacy

Perhaps of greatest interest to me was Gottesman’s claim that the probability of the claimed causal association can be determined from the p-value or from the coefficient of confidence taken from the observational epidemiologic studies of birth defects among children of women who ingested Bendectin in pregnancy; a.k.a. the transposition fallacy.6

All these errors are still in play in American courtrooms, despite efforts of scientists and scientific organizations to disabuse judges and lawyers. The transposition fallacy, which has been addressed in these pages and elsewhere at great length, seems especially resilient to educational efforts. Still, the fallacy was as well recognized at the time of the Daubert argument as it is today, and it is noteworthy that the law professor who argued the plaintiffs’ case, in the highest court of the land, advanced this fallacious argument, and that the scientific and statistical community did little to nothing to correct the error.7

Although Professor Gottesman’s meaning in the oral argument is not entirely clear, on multiple occasions, he appeared to have conflated the coefficient of confidence, from confidence intervals, with the posterior probability that attaches to the alternative hypothesis of some association:

“What the lower courts have said was yes, but prove to us to a degree of statistical certainty which would give us 95 percent confidence that the human epidemiological data is reflective, that these higher numbers for the mothers who used Bendectin were not the product of random chance but in fact are demonstrating the linkage between this drug and the symptoms observed.”8

* * * * *

“… what was demonstrated by Shanna Swan was that if you used a degree of confidence lower than 95 percent but still sufficient to prove the point as likelier than not, the epidemiological evidence is positive… .”9

* * * * *

“The question is, how confident can we be that that is in fact probative of causation, not at a 95 percent level, but what Drs. Swan and Glassman said was applying the Rothman technique, a published technique and doing the arithmetic, that you find that this does link causation likelier than not.”10

Professor Fried’s oral argument for the defense largely refused or failed to engage with plaintiffs’ argument on statistical inference. With respect to the “Rothman” approach, Fried pointed out that plaintiffs’ statistical expert witness, Shanna Swan, never actually employed “the Rothman principle.”11

With respect to plaintiffs’ claim that individual studies had low power to detect risk ratios of two, Professor Fried missed the opportunity to point out that such post-hoc power calculations, whatever validity they might possess, embrace the concept of statistical significance at the customary 5% level. Fried did note that a meta-analysis, based upon all the epidemiologic studies, rendered plaintiffs’ power complaint irrelevant.12
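The dependence of power upon the chosen significance level can be shown with a short sketch. It uses a textbook normal approximation for a two-sided test on the log risk ratio; the standard error of 0.45 is a hypothetical value picked for illustration, not a figure from any Bendectin study, and the function name is invented here.

```python
from math import log
from statistics import NormalDist


def approx_power(risk_ratio, se_log_rr, alpha=0.05):
    """Approximate power of a two-sided test of RR = 1.0 at level alpha.

    Assumes the estimated log risk ratio is roughly normal with the
    given standard error (a large-sample approximation).
    """
    std = NormalDist()
    z_crit = std.inv_cdf(1.0 - alpha / 2.0)
    shift = abs(log(risk_ratio)) / se_log_rr
    # Probability that the test statistic lands in either rejection region.
    return std.cdf(shift - z_crit) + std.cdf(-shift - z_crit)


# Hypothetical study: true risk ratio of 2.0, standard error of 0.45 on the log scale.
for alpha in (0.05, 0.10, 0.49):
    print(f"alpha = {alpha:.2f}  ->  power = {approx_power(2.0, 0.45, alpha):.3f}")
```

No power figure exists until alpha has been specified, which is why a complaint about “low power” presupposes the very significance testing that plaintiffs elsewhere disclaimed.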

Some readers may believe that judging advocates speaking extemporaneously about statistical concepts might be overly harsh. How well then did the lawyers explain and represent statistical concepts in their written briefs in the Daubert case?

Petitioners’ Briefs

Petitioners’ Opening Brief

The petitioners’ briefs reveal that Gottesman’s statements at oral argument represent a consistent misunderstanding of statistical concepts. The plaintiffs consistently conflated significance probability or the coefficient of confidence with the civil burden of proof probability:

“The crux of the disagreement between Merrell’s experts and those whose testimony is put forward by plaintiffs is that the latter are prepared to find causation more probable than not when the epidemiological evidence is strongly positive (albeit not at a 95% confidence level) and when it is buttressed with animal and chemical evidence predictive of causation, while the former are unwilling to find causation in the absence of an epidemiological study that satisfies the 95% confidence level.”13

After giving a reasonable facsimile of a definition of statistical significance, the plaintiffs’ brief proceeds to confuse the complement of alpha, or the coefficient of confidence (typically 95%), with the probability that the observed risk ratio in a sample is the actual population parameter of risk:

But in toxic tort lawsuits, the issue is not whether it is certain that a chemical caused a result, but rather whether it is likelier than not that it did. It is not self-evident that the latter conclusion would require eliminating the null hypothesis (i.e. non-causation) to a confidence level of 95%.3014

The plaintiffs’ brief cited heavily to Rothman’s textbook, Modern Epidemiology, with the specious claim that the textbook supported the plaintiffs’ use of the coefficient of confidence to derive a posterior probability (> 50%) of the correctness of an elevated risk ratio for birth defects in children born to mothers who had taken Bendectin in their first trimesters of pregnancy:

“An alternative mechanism has been developed by epidemiologists in recent years to give a somewhat more informative picture of what the statistics mean. At any given confidence level (e.g. 95%) a confidence interval can be constructed. The confidence interval identifies the range of relative risks that collectively comprise the 95% universe. Additional confidence levels are then constructed exhibiting the range at other confidence levels, e.g., at 90%, 80%, etc. From this set of nested confidence intervals the epidemiologist can make assessments of how likely it is that the statistics are showing a true association. Rothman, Tab 9, pp. 122-25. By calculating nested confidence intervals for the data in the Bendectin studies, Dr. Swan was able to determine that it is far more likely than not that a true association exists between Bendectin and human limb reduction birth defects. Swan, Tab 12, at 3618-28.”15

The heavy reliance upon Rothman’s textbook at first blush appears confusing. Modern Epidemiology makes one limited mention of nested confidence intervals, and certainly never suggests that such intervals can provide a posterior probability of the correctness of the hypothesis. Rothman’s complaints about reliance upon “statistical significance,” however, are well-known, and Rothman himself submitted an amicus brief16 in Daubert, a brief that has its own problems.17

In direct response to the Rothman Brief,18 Professor Alvan Feinstein filed an amicus brief in Daubert, wherein he acknowledged that meta-analyses and re-analyses can be valid, but these techniques are subject to many sources of invalidity, and their employment by careful practitioners in some instances should not be a blank check to professional witnesses who are supported by plaintiffs’ counsel. Similarly, Feinstein acknowledged that standards of statistical significance:

“should be appropriately flexible, but they must exist if science is to preserve its tradition of intellectual discipline and high quality research.”19

Petitioners’ Reply Brief

The plaintiffs’ statistical misunderstandings are further exemplified in their Reply Brief, where they reassert the transposition fallacy and alternatively state that associations with p-values greater than 5%, or 95% confidence intervals that include the risk ratio of 1.0, do not show the absence of an association.20 The latter point was, of course, irrelevant in the Daubert case, in which plaintiffs had the burden of persuasion. As in their oral argument through Professor Gottesman, the plaintiffs’ appellate briefs misunderstand the crucial point that confidence intervals are conditioned upon the data observed from a particular sample, and do not provide posterior probabilities for the correctness of a claimed hypothesis.

Defense Brief

The defense brief spent little time on the statistical issue or plaintiffs’ misstatements, but dispatched the issue in a trenchant footnote:

“Petitioners stress the controversy some epidemiologists have raised about the standard use by epidemiologists of a 95% confidence level as a condition of statistical significance. Pet. Br. 8-10. See also Rothman Amicus Br. It is hard to see what point petitioners’ discussion establishes that could help their case. Petitioners’ experts have never developed and defended a detailed analysis of the epidemiological data using some alternative well-articulated methodology. Nor, indeed, do they show (or could they) that with some other plausible measure of confidence (say, 90%) the many published studies would collectively support an inference that Bendectin caused petitioners’ limb reduction defects. At the very most, all that petitioners’ theoretical speculations do is question whether these studies – as the medical profession and regulatory authorities in many countries have concluded – affirmatively prove that Bendectin is not a teratogen.”21

The defense never responded to the specious argument, stated or implied within the plaintiffs’ briefs, and in Gottesman’s oral argument, that a coefficient of confidence of 51% would have generated confidence intervals that routinely excluded the null hypothesis of risk ratio of 1.0. The defense did, however, respond to plaintiffs’ power argument by adverting to a meta-analysis that failed to find a statistically significant association.22
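A short sketch shows why lowering the coefficient of confidence proves nothing about posterior probability. The counts below are entirely hypothetical (12 cases among 1,000 exposed and 8 among 1,000 unexposed), chosen only to show the mechanics, and the interval is the usual large-sample (Wald) approximation on the log scale. The function name is an invention for this illustration.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def risk_ratio_interval(cases_exp, n_exp, cases_unexp, n_unexp, level):
    """Point estimate and approximate (Wald) confidence interval for a risk ratio."""
    rr = (cases_exp / n_exp) / (cases_unexp / n_unexp)
    # Large-sample standard error of the log risk ratio.
    se = sqrt(1 / cases_exp - 1 / n_exp + 1 / cases_unexp - 1 / n_unexp)
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return rr, exp(log(rr) - z * se), exp(log(rr) + z * se)


# Hypothetical data only: the same sample, examined at nested confidence levels.
for level in (0.95, 0.90, 0.80, 0.51):
    rr, lower, upper = risk_ratio_interval(12, 1000, 8, 1000, level)
    verdict = "excludes" if lower > 1.0 or upper < 1.0 else "includes"
    print(f"{level:.0%} interval: {lower:.2f} to {upper:.2f} ({verdict} 1.0)")
```

In this hypothetical, the 95% interval includes 1.0 and the 51% interval excludes it. The data have not changed; only the width of the interval has. Shrinking the interval until it clears 1.0 does not convert the coefficient of confidence into a 51% probability that the association is real.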

The defense also advanced two important arguments to which the plaintiffs’ briefs never meaningfully responded. First, the defense detailed the “cherry picking” or selective reliance engaged in by plaintiffs’ expert witnesses.23 Second, the defense noted that plaintiffs had a specific causation problem in that their expert witnesses had been attempting to infer specific causation based upon relative risks well below 2.0.24

To some extent, the plaintiffs’ statistical misstatements were taken up by an amicus brief submitted by the United States government, speaking through the office of the Solicitor General.25 Drawing upon the Supreme Court’s decisions in race discrimination cases,26 the government asserted that epidemiologists “must determine” whether a finding of an elevated risk ratio “could have arisen due to chance alone.”27

Unfortunately, the government’s brief butchered the meaning of confidence intervals. Rather than describe the confidence interval as showing what point estimates of risk ratios are reasonably compatible with the sample result, the government stated that confidence intervals show “how close the real population percentage is likely to be to the figure observed in the sample”:

“since there is a 95 percent chance that the ‘true’ value lies within two standard deviations of the sample figure, that particular ‘confidence interval’ (i.e., two standard deviations) is therefore said to have a ‘confidence level’ of about 95 percent.”28

The Solicitor General’s office seemed to have had some awareness that it was giving offense with the above definition because it quickly added:

“While it is customary (and, in many cases, easier) to speak of ‘a 95 percent chance’ that the actual population percentage is within two standard deviations of the figure obtained from the sample, ‘the chances are in the sampling procedure, not in the parameter’.”29

Easier perhaps, but clearly erroneous to speak that way, and customary only among the unwashed. The government half apologized for misleading the Court when it followed up with a better definition from David Freedman’s textbook, but sadly the government lawyers were not content to let the matter sit there. The Solicitor General’s office’s brief obscured the textbook definition with a further inaccurate précis:

“if the sampling from the general population were repeated numerous times, the ‘real’ population figure would be within the confidence interval 95 percent of the time. The ‘real’ figure would be outside that interval the remaining five percent of the time.”30

The lawyers in the Solicitor General’s office thus made the rookie mistake of forgetting that in the long run, after numerous repeated samples, there would be numerous confidence intervals, not one. The 95% probability of containing the true population value belongs to the set of the numerous confidence intervals, not “the confidence interval” obtained in the first go around.
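The point about the long run can be seen in a small simulation. This is a sketch under simplifying assumptions: the population is normal, its standard deviation is treated as known, and the “true” mean of 10 is an arbitrary hypothetical value. The function and parameter names are invented for the illustration.

```python
import random
from statistics import NormalDist, fmean


def simulate_coverage(true_mean=10.0, sigma=2.0, n=50, trials=2000,
                      level=0.95, seed=1):
    """Draw many samples; build one interval per sample; report the coverage rate."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    half_width = z * sigma / n ** 0.5  # known-sigma interval for a mean
    intervals = []
    for _ in range(trials):
        sample_mean = fmean(rng.gauss(true_mean, sigma) for _ in range(n))
        intervals.append((sample_mean - half_width, sample_mean + half_width))
    covered = sum(lo <= true_mean <= hi for lo, hi in intervals)
    return covered / trials, intervals


coverage, intervals = simulate_coverage()
print(f"{len(set(intervals))} distinct intervals from {len(intervals)} samples")
print(f"share of intervals containing the true mean: {coverage:.3f}")
print("first three intervals:", [(round(lo, 2), round(hi, 2)) for lo, hi in intervals[:3]])
```

Each repetition yields a different interval, and roughly 95% of those intervals contain the fixed true value. Any one interval, such as the first, either contains the true value or it does not; the 95% describes the procedure, not the particular interval.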

The Daubert case has been the subject of nearly endless scholarly comment, but few authors have chosen to revisit the parties’ briefs. Two authors have published a paper that reviewed the scientists’ amici briefs in Daubert.31 The Rothman brief was outlined in detail; the Feinstein rebuttal was not substantively discussed. The plaintiffs’ invocation of the transposition fallacy in Daubert has apparently gone unnoticed.


1 Oral Argument in Daubert v. Merrell Dow Pharmaceuticals, Inc., U.S. Supreme Court no. 92-102, 1993 WL 754951, *5 (Tuesday, March 30, 1993) [Oral Arg.]

2 Oral Arg. at *6.

3 In re Agent Orange Product Liab. Litig., 597 F. Supp. 740, 781 (E.D.N.Y.1984) (“The distinction between avoidance of risk through regulation and compensation for injuries after the fact is a fundamental one.”), aff’d in relevant part, 818 F.2d 145 (2d Cir. 1987), cert. denied sub nom. Pinkney v. Dow Chemical Co., 484 U.S. 1004 (1988).

4 Oral Arg. at *19.

5 Oral Arg. at *18-19.

6 Oral Arg. at *19.

7 See, e.g., “Sander Greenland on ‘The Need for Critical Appraisal of Expert Witnesses in Epidemiology and Statistics’” (Feb. 8, 2015) (noting biostatistician Sander Greenland’s publications, which selectively criticize only defense expert witnesses and lawyers for statistical misstatements); see also “Some High-Value Targets for Sander Greenland in 2018” (Dec. 27, 2017).

8 Oral Arg. at *19.

9 Oral Arg. at *20.

10 Oral Arg. at *44. At the oral argument, this last statement was perhaps Gottesman’s clearest misstatement of statistical principles, in that he directly suggested that the coefficient of confidence translates into a posterior probability of the claimed association at the observed size.

11 Oral Arg. at *37.

12 Oral Arg. at *32.

13 Petitioner’s Brief in Daubert v. Merrell Dow Pharmaceuticals, Inc., U.S. Supreme Court No. 92-102, 1992 WL 12006442, *8 (U.S. Dec. 2, 1992) [Petitioner’s Brief].

14 Petitioner’s Brief at *9.

15 Petitioner’s Brief at *n. 36.

16 Brief Amici Curiae of Professors Kenneth Rothman, Noel Weiss, James Robins, Raymond Neutra and Steven Stellman, in Support of Petitioners, 1992 WL 12006438, Daubert v. Merrell Dow Pharmaceuticals, Inc., U.S. S. Ct. No. 92-102 (Dec. 2, 1992).

18 Brief Amicus Curiae of Professor Alvan R. Feinstein in Support of Respondent, in Daubert v. Merrell Dow Pharmaceuticals, Inc., U.S. Supreme Court no. 92-102, 1993 WL 13006284, at *2 (U.S., Jan. 19, 1993) [Feinstein Brief].

19 Feinstein Brief at *19.

20 Petitioner’s Reply Brief in Daubert v. Merrell Dow Pharmaceuticals, Inc., U.S. Supreme Court No. 92-102, 1993 WL 13006390, at *4 (U.S., Feb. 22, 1993).

21 Respondent’s Brief in Daubert v. Merrell Dow Pharmaceuticals, Inc., U.S. Supreme Court No. 92-102, 1993 WL 13006277, at n. 32 (U.S., Jan. 19, 1993) [Respondent Brief].

22 Respondent Brief at *4.

23 Respondent Brief at *42 n.32 and 47.

24 Respondent Brief at *40-41 (citing DeLuca v. Merrell Dow Pharms., Inc., 911 F.2d 941, 958 (3d Cir. 1990)).

25 Brief for the United States as Amicus Curiae Supporting Respondent in Daubert v. Merrell Dow Pharmaceuticals, Inc., U.S. Supreme Court No. 92-102, 1993 WL 13006291 (U.S., Jan. 19, 1993) [U.S. Brief].

26 See, e.g., Hazelwood School District v. United States, 433 U.S. 299, 308-312 (1977); Castaneda v. Partida, 430 U.S. 482, 495-499 & nn.16-18 (1977) (“As a general rule for such large samples, if the difference between the expected value and the observed number is greater than two or three standard deviations, then the hypothesis that the jury drawing was random would be suspect to a social scientist.”).

27 U.S. Brief at *3-4. Nearly two decades later, when politically convenient, the United States government submitted an amicus brief in a case involving alleged securities fraud for failing to disclose adverse events of an over-the-counter medication. In Matrixx Initiatives Inc. v. Siracusano, 131 S. Ct. 1309 (2011), the securities fraud plaintiffs contended that they need not plead “statistically significant” evidence for adverse drug effects. The Solicitor General’s office, along with counsel for the Food and Drug Division of the Department of Health & Human Services, in their zeal to assist plaintiffs disclaimed the necessity, or even the importance, of statistical significance:

“[w]hile statistical significance provides some indication about the validity of a correlation between a product and a harm, a determination that certain data are not statistically significant … does not refute an inference of causation.”

Brief for the United States as Amicus Curiae Supporting Respondents, in Matrixx Initiatives, Inc. v. Siracusano, 2010 WL 4624148, at *14 (Nov. 12, 2010).

28 U.S. Brief at *5.

29 U.S. Brief at *5-6 (citing David Freedman, R. Pisani, R. Purves & A. Adhikari, Statistics 351, 397 (2d ed. 1991)).

30 U.S. Brief at *6 (citing Freedman’s text at 351) (emphasis added).

31 See Joan E. Bertin & Mary S. Henifin, “Science, Law, and the Search for Truth in the Courtroom: Lessons from Daubert v. Merrell Dow,” 22 J. Law, Medicine & Ethics 6 (1994); Joan E. Bertin & Mary Sue Henifin, “Scientists Talk to Judges: Reflections on Daubert v. Merrell Dow,” 4(3) New Solutions 3 (1994). The authors’ choice of the New Solutions journal is interesting and curious. New Solutions: A Journal of Environmental and Occupational Health Policy was published by the Oil, Chemical and Atomic Workers International Union, under the control of Anthony Mazzocchi (June 13, 1926 – Oct. 5, 2002), who was the union’s secretary-treasurer. Anthony Mazzocchi, “Finding Common Ground: Our Commitment to Confront the Issues,” 1 New Solutions 3 (1990); see also Steven Greenhouse, “Anthony Mazzocchi, 76, Dies; Union Officer and Party Father,” N.Y. Times (Oct. 9, 2002). Even a cursory review of this journal’s contents reveals how interested and invested, even obsessed, the union was in the litigation industry and that industry’s expert witnesses.

 

“Each and Every Exposure” Is a Substantial Factor

December 3rd, 2018

“Every time a bell rings an angel gets his wings”
It’s a Wonderful Life (1946)

Every time a plaintiff shows the smallest imaginable exposure, there is a full recovery.
… The American tort system.

 

In 1984, Philadelphia County had a non-jury system for asbestos personal injury cases, with a right to “appeal” for a de novo trial with a jury. The non-jury trials were a wonderful training ground for a generation of trial lawyers, and for a generation or two of testifying expert witnesses. When I started to try asbestos cases as a young lawyer, the plaintiffs’ counsel had already taught their expert witnesses to include the “each and every exposure” talismanic language in their direct examination testimonies on the causation of the plaintiffs’ condition. The litigation industry had figured out that this expression would help avoid a compulsory non-suit on proximate causation.

Back in those wild, woolly frontier days, I encountered the slick Dr. Joseph Sokolowski (“Sok”), a pulmonary physician in private practice in New Jersey. Sok, like many other pulmonary physicians in the Delaware Valley area, had seen civilian workers referred by Philadelphia Naval Shipyard to be evaluated for asbestosis. When the plaintiff-friendly physicians diagnosed asbestosis, a few preferred firms would then pursue their claims under the Federal Employees Compensation Act (FECA). The United States government would notify the workers of their occupational disease, and urge them to pursue the government’s outside vendors of asbestos-containing materials, with a reminder that the government had a lien against any civil action recovery. The federal government thus made common cause with the niche law practices of workers’ compensation lawyers,1 and helped launch the tsunami of asbestos litigation.2

Sok was perfect for his role in the federal kick-back scheme. He could deliver the most implausible testimony, and weather brutal cross-examination without flinching. He had the face of a choir boy, and his service as an outside examiner for the Navy Yard employees gave his diagnoses the apparent imprimatur of the federal government. Although Sok had no real understanding of epidemiology, he could readily master the Selikoff litany of 5-10-50, for relative risks for lung cancer, from asbestos alone (supposedly), from smoking alone, and from asbestos and smoking combined, respectively. And he similarly mastered his lines that “each and every exposure” is substantial, when pressed on whether and how exposure to a minor vendor’s product was a substantial factor. Back in those days, before Johns-Manville (JM) Corporation went bankrupt, honest witnesses at the Navy Yard acknowledged that JM supplied the vast majority of asbestos products, but that testimony changed literally over the course of a trial day, when the plaintiffs’ bar learned of the JM bankruptcy.

It was into this topsy-turvy litigation world that I was thrown. I had the sense that there was no basis for the “each and every exposure” opinion, but my elders at the defense bar seemed to avoid the opinion studiously on cross-examination. I recall co-defendants’ counsels’ looks of horror and disapproval when I broached the topic in my first cross-examination. Sok had known to incorporate the “each and every exposure” opinion into his direct testimony, but he had no intelligible response to my question about what possible basis there was for the opinion. “Well, we have to blame each and every exposure because we have no way to distinguish among exposures.” I could not let it lie there, and so I asked: “So your opinion about each and every exposure is based upon your ignorance?” My question was quickly met with an objection, and just as quickly with a rather loud and disapproving, “Sustained!” When Sok finished his testimony, I moved to strike his substantial factor opinion as having no foundation, but my motion was met with judicial annoyance and apathy.

And so I learned that science and logic had nothing to do with asbestos litigation. Some determined defense counsel persevered, however, and in the face of over one hundred bankruptcies,3 a few courts started to take seriously the evidence and arguments against the “every exposure” testimony. Last week, the New York Court of Appeals, New York’s highest court, agreed to state out loud that the plaintiffs’ “every exposure” theory had no clothes, no foundation, and no science. Juni v. A.O. Smith Water Products Co., No. 123, N.Y. Court of Appeals (Nov. 27, 2018).4

In a short, concise opinion, with a single dissent, the Court held that plaintiffs’ evidence (any exposure, no matter how trivial) in a mesothelioma death case was “insufficient as a matter of law to establish that respondent Ford Motor Co.’s conduct was a proximate cause of the decedent’s injuries.” The ruling affirmed the First Department’s affirmance of a trial court’s judgment notwithstanding the $11 million jury verdict against Ford.5 Arguing for the proposition that every exposure is substantial, over three dozen scientists, physicians, and historians, most of whom regularly support and testify for the litigation industry, filed a brief in support of the plaintiffs.6 The Atlantic Legal Foundation filed an amicus brief on behalf of several scientists,7 and I had the privilege of filing an amicus brief on behalf of the Coalition for Litigation Justice and nine other organizations in support of Ford’s positions.8

It has been 34 years since I first encountered the “every exposure is substantial” dogma in a Philadelphia courtroom. Sometimes in litigation, it takes a long time to see the truth come out.


1 E.g., Shein and Brookman; Greitzer & Locks; both of Philadelphia.

2 Encouraging litigation against its suppliers, the federal government pulled off a coup of misdirection. First, it deflected public censure from the Navy and other governmental branches for their own carelessness in the use, installation, and removal of asbestos-containing insulations. Second, the government winnowed the ranks of older, better compensated workers. Third, and most diabolically, the government, which was self-insured for FECA claims, recovered most of its outlay when its former employees recovered judgments or settlements against the government’s outside asbestos product vendors. “The United States Government’s Role in the Asbestos Mess” (Jan. 31, 2012). See also Walter Olson, “Asbestos awareness pre-Selikoff,” Point of Law (Oct. 19, 2007); “The U.S. Navy and the asbestos calamity,” Point of Law (Oct. 9, 2007).

4 The plaintiffs were represented by Alani Golanski of Weitz & Luxenberg LLP.

6 Abby Lippman, Annie Thebaud Mony, Arthur L. Frank, Barry Castleman, Bruce P. Lanphear, Celeste Monforton, Colin L. Soskolne, Daniel Thau Teitelbaum, Dario Consonni, Dario Mirabelli, David Egilman, David F. Goldsmith, David Ozonoff, David Rosner, Fiorella Belpoggi, James Huff, John Heinzow, John M. Dement, John Coulter Maddox, Karl T. Kelsey, Kathleen Ruff, Kenneth D. Rosenman, L. Christine Oliver, Laura Welch, Leslie Thomas Stayner, Morris Greenberg, Nachman Brautbar, Philip J. Landrigan, Xaver Baur, Hans-Joachim Woitowitz, Bice Fubini, Richard Kradin, T.K. Joshi, Theresa S. Emory, Thomas H. Gassert, Tony Fletcher, and Yv Bonnier Viger.

7 John Henderson Duffus, Ronald E. Gots, Arthur M. Langer, Robert Nolan, Gordon L. Nord, Alan John Rogers, and Emanuel Rubin.

8 Amici Curiae Brief of Coalition for Litigation Justice, Inc., Business Council of New York State, Lawsuit Reform Alliance of New York, New York Insurance Association, Inc., Northeast Retail Lumber Association, National Association of Manufacturers, Chamber of Commerce of the U.S.A., American Tort Reform Association, American Insurance Association, and NFIB Small Business Legal Center Supporting Defendant-Respondent Ford Motor Company.

The “Rothman” Amicus Brief in Daubert v. Merrell Dow Pharmaceuticals

November 17th, 2018

“Then time will tell just who fell
And who’s been left behind”

                  Dylan, “Most Likely You Go Your Way” (1966)

 

When the Daubert case headed to the Supreme Court, it had 22 amicus briefs in tow. Today that number is routine for an appeal to the high court, but in 1992, it was a signal of intense interest in the case in both the scientific and legal communities. To the litigation industry, the prospect of judicial gatekeeping of expert witness testimony was anathema. To the manufacturing industry, the prospect was precious to defend against specious claiming.

With the benefit of 25 years of hindsight, a look at some of those amicus briefs reveals a good deal about the scientific and legal acumen of the “friends of the court.” Not all amicus briefs in the case were equal; not all have held up well in the face of time. The amicus brief of the American Association for the Advancement of Science and the National Academy of Sciences was a good example of advocacy for the full implementation of gatekeeping on scientific principles of valid inference.1 Other amici urged an anything-goes approach to judicial oversight of expert witnesses.

One amicus brief often praised by plaintiffs’ counsel was submitted by Professor Kenneth Rothman and colleagues.2 This amicus brief is still cited by parties who find support in the brief for their excuses for not having consistent, valid, strong, and statistically significant evidence to support their claims of causation. To be sure, Rothman did target statistical significance as a strict criterion of causal inference, but there is little support in the brief for the loosey-goosey style of causal claiming that is so prevalent among lawyers for the litigation industry. Unlike the brief filed by the AAAS and the National Academy of Sciences, Rothman’s brief abstained from the social policies implied by judicial gatekeeping or its rejection. Instead, Rothman’s brief set out to make three narrow points:

(1) courts should not rely upon strict statistical significance testing for admissibility determinations;

(2) peer review is not an appropriate touchstone for the validity of an expert witness’s opinion; and

(3) unpublished, non-peer-reviewed “reanalysis” of studies is a routine part of the scientific process, and regularly practiced by epidemiologists and other scientists.

Rothman was encouraged to target these three issues by the lower courts’ opinions in the Daubert case, in which the courts made blanket statements about the role of absent statistical significance and peer review, and the illegitimacy of “re-analyses” of published studies.

Professor Rothman has made many admirable contributions to epidemiologic practice, but the amicus brief submitted by him and his colleagues falls into the trap of making the sort of blanket general statements that they condemned in the lower courts’ opinions. Of the brief’s three points, the first, about statistical significance, is the most important for epidemiologic and legal practice. Despite reports of an odd journal here or there “abolishing” p-values, most medical journals continue to require the presentation of either p-values or confidence intervals. In the majority of medical journals, 95% confidence intervals that exclude a null hypothesis risk ratio of 1.0, or risk difference of 0, are labelled “statistically significant,” sometimes improvidently in the presence of multiple comparisons and lack of pre-specification of outcome.

For over three decades, Rothman has criticized the prevailing practice on statistical significance. Professor Rothman is also well known for his advocacy for the superiority of confidence intervals over p-values in conveying important information about what range of values are reasonably compatible with the observed data.3 His criticisms of p-values and his advocacy for estimation with intervals have pushed biomedical publishing to embrace confidence intervals as more informative than just p-values. Still, his views on statistical significance have never gained complete acceptance at most clinical journals. Biomedical scientists continue to interpret 95% confidence intervals, at least in part, as to whether they show “significance” by excluding the null hypothesis value of no risk difference or of risk ratios equal to 1.0.

The first point in Rothman’s amicus brief is styled:

“THE LOWER COURTS’ FOCUS ON SIGNIFICANCE TESTING IS BASED ON THE INACCURATE ASSUMPTION THAT ‘STATISTICAL SIGNIFICANCE’ IS REQUIRED IN ORDER TO DRAW INFERENCES FROM EPIDEMIOLOGICAL INFORMATION”

The challenge by Rothman and colleagues to the “assumption” that statistical significance is necessary is what, of course, has endeared this brief to the litigation industry. A close read of the brief, however, shows that Rothman’s critique of the assumption is equivocal. Rothman et amici characterized the lower courts as having given:

“blind deference to inappropriate and arcane publication standards and ‘significance testing’.”4

The brief is silent about what might be knowing deference, or appropriate publication standards. To be sure, judges have often poorly expressed their reasoning for deciding scientific evidentiary issues, and perhaps poor communication or laziness by judges was responsible for Rothman’s interest in joining the Daubert fray. Putting aside the unclear, rhetorical, and somewhat hyperbolic use of “arcane” in the quote above, the suggestion of inappropriate blind deference is itself expressed in equivocal terms in the brief. At times the authors rail at the use of statistical significance as the “sole” criterion, and at times, they seem to criticize its use at all.

At least twice in their brief, Rothman and friends declare that the lower court:

“misconstrues the validity and appropriateness of significance testing as a decision making tool, apparently deeming it the sole test of epidemiological hypotheses.”5

* * * * * *

“this Court should reject significance testing as the sole acceptable criterion of scientific validity in epidemiology.”6

Characterizing “statistical significance” as not the sole test or criterion of scientific inference is hardly controversial, and it implies that statistical significance is one test, criterion, or factor among others. This position is consistent with the current ASA Statement on Significance Testing.7 There is, of course, much more to evaluate in a study or a body of studies, than simply whether they individually or collectively help us to exclude chance as an explanation for their findings.

Statistical Significance Is Not Necessary At All

Elsewhere, Rothman and friends take their challenge to statistical significance testing beyond merely suggesting that such testing is only one test or criterion among others. Indeed, their brief in other places states their opinion that significance testing is not necessary at all:

“Testing for significance, however, is often mistaken for a sine qua non of scientific inference.”8

And at other times, Rothman and friends go further yet and claim not only that significance is not necessary, but that it is not even appropriate or useful:

“Significance testing, however, is neither necessary nor appropriate as a requirement for drawing inferences from epidemiologic data.”9

Rothman compares statistical significance testing with “scientific inference,” which is not a mechanical, mathematical procedure, but rather a “thoughtful evaluation[] of possible explanations for what is being observed.”10 “Significance testing, in contrast,” is “merely a statistical tool,” used inappropriately “in the process of developing inferences.”11 Rothman suggests that the term “statistical significance” could be eliminated from scientific discussions without loss of meaning, and this linguistic legerdemain shows that the phrase is unimportant in science and in law.12 Rothman’s suggestion, however, ignores that causal assessments have always required an evaluation of the play of chance, especially for putative causes, which are neither necessary nor sufficient, and which modify underlying stochastic processes by increasing or decreasing the probability of a specified outcome. Asserting that statistical significance is misleading because it never describes the size of an association, which the Rothman brief does, is like telling us that color terms tell us nothing about the mass of a body.

The Rothman brief does make the salutary point that labeling a study outcome as not “statistically significant” carries the danger that the study’s data will be taken to have no value, or that the study will be taken to reject the hypothesized association. In 1992, such an interpretation may have been more common, but today, in the face of the proliferation of meta-analyses, the risk of such interpretations of single study outcomes is remote.

Questionable History of Statistics

Rothman suggests that the development of statistical hypothesis testing occurred in the context of agricultural and quality-control experiments, which required yes-no answers for future action.13 This suggestion clearly points at Sir Ronald Fisher and Jerzy Neyman, and their foundational work on frequentist statistical theory and practice. In part, the amici correctly identified the experimental milieu in which Fisher worked, but the description of Fisher’s work is neither accurate nor fair. Fisher spent a lifetime thinking and writing about statistical tests, in much more nuanced ways than implied by the claim that such testing occurred in the context of agricultural and quality-control experiments. Although Fisher worked on agricultural experiments, his writings acknowledged that when statistical tests and analyses were applied to observational studies, much more searching analyses of bias and confounding were required. Fisher’s and Berkson’s reactions to the observational studies of Hill and Doll on smoking and lung cancer are telling in this regard. These statisticians criticized the early smoking and lung cancer studies, not for lack of statistical significance, but for failing to address confounding by a potential common genetic propensity to smoke and to develop lung cancer.

Questionable History of Drug Development

Twice in Rothman’s amicus brief, the authors suggest that “undue reliance” on statistical significance has resulted in overlooking “effective new treatments” because observed benefits were considered “not significant,” despite an “indication” of efficacy.14 The brief never provided any insight on what is due reliance and what is undue reliance on statistical significance. Their criticism of “undue reliance” implies that there are modes or instances of “due reliance” upon statistical significance. The amicus brief fails also to inform readers exactly what “effective new treatments” have been overlooked because the outcomes were considered “not significant.” This omission is regrettable because it leaves the reader with only abstract recommendations, without concrete examples of what such effective treatments might be. The omission was unfortunate because Rothman almost certainly could have marshalled examples. Recently, Rothman tweeted just such an example:15

“30% ↓ in cancer risk from Vit D/Ca supplements ignored by authors & editorial. Why? P = 0.06. http://bit.ly/2oanl6w http://bit.ly/2p0CRj7. The 95% confidence interval for the risk ratio was 0.42–1.02.”

Of course, this was a large, carefully reported randomized clinical trial, with a narrow confidence interval that just missed “statistical significance.” It is not an example that would have given succor to Bendectin plaintiffs, who were attempting to prove an association by identifying flaws in noisy observational studies that generally failed to show an association.
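The relationship between the tweeted confidence interval and the tweeted p-value can be checked with a short calculation. What follows is only a rough sketch, which assumes that the interval was a conventional Wald-type interval, symmetric on the logarithmic scale; the function name is hypothetical, and the trial authors’ actual method may have differed. Under that assumption, the quoted bounds of 0.42 and 1.02 imply a two-sided p-value of roughly 0.06, in line with the tweet. The back-calculated point estimate is simply the geometric midpoint of the bounds, and it need not match the reported estimate exactly.

```python
from math import exp, log
from statistics import NormalDist


def p_value_from_ratio_ci(lower, upper, level=0.95):
    """Back-calculate a point estimate and two-sided p-value from a ratio CI.

    Assumes (hypothetically) a Wald interval that is symmetric on the log
    scale, which is the usual convention for risk ratios and hazard ratios.
    """
    std_normal = NormalDist()
    # Critical value for the stated confidence level (about 1.96 for 95%).
    z_crit = std_normal.inv_cdf(0.5 + level / 2.0)
    log_lower, log_upper = log(lower), log(upper)
    # The log-scale estimate is the midpoint of the log-scale bounds.
    log_estimate = (log_lower + log_upper) / 2.0
    # The half-width of the interval equals z_crit times the standard error.
    std_error = (log_upper - log_lower) / (2.0 * z_crit)
    # Test statistic against the null hypothesis of a ratio of 1.0 (log = 0).
    z = log_estimate / std_error
    p_two_sided = 2.0 * std_normal.cdf(-abs(z))
    return exp(log_estimate), p_two_sided


if __name__ == "__main__":
    estimate, p = p_value_from_ratio_ci(0.42, 1.02)
    print(f"implied ratio = {estimate:.2f}, two-sided p = {p:.3f}")
```

The confidence interval and the p-value thus carry much the same information about random variability; an upper bound that barely exceeds 1.0 and a p-value that barely exceeds 0.05 are two descriptions of the same result.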

Readers of the 1992 amicus brief can only guess at what might be “indications of efficacy”; no explanation or examples are provided.16 The reality of FDA approvals of new drugs is that a pre-specified 5% level of statistical significance is virtually always enforced.17 If a drug sponsor has “indication of efficacy,” it is, of course, free to follow up with an additional, larger, better-designed clinical trial. Rothman’s recent tweet about the vitamin D clinical trial does provide some context and meaning to what the amici may have meant over 25 years ago by indication of efficacy. The tweet also illustrates Rothman’s acknowledgment of the need to address random variability in a data set, whether by p-value or confidence interval, or both. Clearly, Rothman was criticizing the authors of the vitamin D trial for stopping short of claiming that they had shown (or “demonstrated”) a cancer survival benefit. There is, however, a rich literature on vitamin D and cancer outcomes, and such a claim could be made, perhaps, in the context of a meta-analysis or meta-regression of multiple clinical trials, with a synthesis of other experimental and observational data.18

Questionable History of Statistical Analyses in Epidemiology

Rothman’s amicus brief deserves credit for introducing a misinterpretation of Sir Austin Bradford Hill’s famous paper on inferring causal associations, which has become catechism in the briefs of plaintiffs in pharmaceutical and other products liability cases:

“No formal tests of significance can answer those questions. Such tests can, and should, remind us of the effects that the play of chance can create, and they will instruct us in the likely magnitude of those effects. Beyond that they contribute nothing to the ‘proof’ of our hypothesis.”

Austin Bradford Hill, “The Environment and Disease: Association or Causation?” 58 Proc. Royal Soc’y Med. 295, 299 (1965) (quoted at Rothman Brief at *6).

As exegesis of Hill’s views, this quote is misleading. The language quoted above was used by Hill in the context of his nine causal viewpoints or criteria. The Rothman brief ignores Hill’s admonition to his readers, that before reaching the nine criteria, there is a serious, demanding predicate that must be shown:

“Disregarding then any such problem in semantics we have this situation. Our observations reveal an association between two variables, perfectly clear-cut and beyond what we would care to attribute to the play of chance. What aspects of that association should we especially consider before deciding that the most likely interpretation of it is causation?”

Id. at 295 (emphasis added). Rothman and co-authors did not have to invoke the prestige and authority of Sir Austin, but once they did, they were obligated to quote him fully and with accurate context. Elsewhere, in his famous textbook, Hill expressed his view that common sense was insufficient to interpret data, and that the statistical method was necessary to interpret data in medical studies.19

Rothman complains that statistical significance focuses the reader on conjecture on the role of chance in the observed data rather than the information conveyed by the data themselves.20 The “incompleteness” of statistical analysis for arriving at causal conclusions, however, is not an argument against its necessity.

The Rothman brief does make the helpful point that statistical significance cannot be sufficient to support a conclusion of causation because many statistically significant associations or correlations will be non-causal. They give a trivial example of wearing dresses and breast cancer, but the point is well-taken. Associations, even when statistically significant, are not necessarily causal conclusions. Who ever suggested otherwise, other than expert witnesses for the litigation industry?

Unnecessary Fears

The motivation for Rothman’s challenge to the assumption that statistical significance is necessary is revealed at the end of the argument on Point I. The authors plainly express their concern that false negatives will shut down important research:

“To give weight to the failure of epidemiological studies to meet strict ‘statistical significant’ standards — to use such studies to close the door on further inquiry — is not good science.”21

The relevance of this concern to the proceedings is a mystery. The judicial decisions in the case are not referenda on funding initiatives. Scientists were as free in 1993, after Daubert was decided, as they were in 1992, when Rothman wrote, to pursue the hypothesis that Bendectin caused birth defects. The decision had the potential to shut down tort claims, and left scientists to their tasks.

Reanalyses Are Appropriate Scientific Tools to Assess and Evaluate Data, and to Forge Causal Opinions

The Rothman brief took issue with the lower courts’ dismissal of plaintiffs’ expert witnesses’ re-analyses of data in published studies. The authors argued that reanalyses were part of the scientific method, and not “an arcane or specialized enterprise,” deserving of heightened or skeptical scrutiny.22

Remarkably, the Rothman brief, if accepted by the Supreme Court on the re-analysis point, would have led to the sort of unthinking blanket acceptance of a methodology, which the brief’s authors condemned in the context of blanket acceptance of significance testing. The brief covertly urges “blind deference” to its authors on the blanket approval of re-analyses.

Although amici have tight page limits, the brief’s authors made clear that they were offering no substantive opinions on the data involved in the published epidemiologic studies on Bendectin, or on the plaintiffs’ expert witnesses’ re-analyses. With the benefit of hindsight, we can see that the sweeping language used by the Ninth Circuit on re-analyses might have been taken to foreclose important and valid meta-analyses or similar approaches. The Rothman brief is not terribly explicit on what re-analysis techniques were part of the scientific method, but meta-analyses surely had been on the authors’ minds:

“by focusing on inappropriate criteria applied to determine what conclusions, if any, can be reached from any one study, the trial court forecloses testimony about inferences that can be drawn from the combination of results reported by many such studies, even when those studies, standing alone, might not justify such inferences.”23

The plaintiffs’ statistical expert witness in Daubert had proffered a re-analysis of at least one study by substituting a different control sample, as well as a questionable meta-analysis. By failing to engage on the propriety of the specific analyses at issue in Daubert, the Rothman brief failed to offer meaningful guidance to the appellate court.

Reanalyses Are Not Invalid Just Because They Have Not Been Published

Rothman was certainly correct that the value of peer review was overstated by the defense in Bendectin litigation.24 The quality of pre-publication peer review is spotty, at best. Predatory journals deploy a pay-to-play scheme, which makes a mockery of scientific publishing. Even at respectable journals, peer review cannot effectively guard against fraud, or ensure that statistical analyses have been appropriately done.25 At best, peer review is a weak proxy for study validity, and an unreliable one at that.

The Rothman brief may have moderated the Supreme Court’s reaction to the defense’s argument that peer review is a requirement for studies, or “re-analyses,” relied upon by expert witnesses. The Court in Daubert opined, in dicta, that peer review is a non-dispositive consideration:

“The fact of publication (or lack thereof) in a peer reviewed journal … will be a relevant, though not dispositive, consideration in assessing the scientific validity of a particular technique or methodology on which an opinion is premised.”26

To the extent that Rothman and colleagues might have been disappointed in this outcome, they missed some important context of the Bendectin cases. Most of the cases had been resolved by a consolidated causation issues trial, but many opt-out cases had to be tried in state and federal courts around the country.27 The expert witnesses challenged in Daubert (Drs. Swan and Done) participated in many of these opt-out cases, and in each case, they opined that Bendectin was a public health hazard. The failure of these witnesses to publish their analyses and re-analyses spoke volumes about their bona fides. Courts (and juries if the Swan and Done proffered testimony were admissible) could certainly draw negative inferences from the plaintiffs’ expert witnesses’ failure to publish their opinions and re-analyses.

The Fate of the “Rothman Approach” in the Courts

The so-called “Rothman approach” was urged by Bendectin plaintiffs in opposing summary judgment in a case pending in federal court, in New Jersey, before the Supreme Court decided Daubert. Plaintiffs resisted exclusion of their expert witnesses, who had relied upon inconsistent and statistically non-significant studies on the supposed teratogenicity of Bendectin. The trial court excluded the plaintiffs’ witnesses, and granted summary judgment.28

On appeal, the Third Circuit reversed and remanded the DeLucas’ case for a hearing under Rule 702:

“by directing such an overall evaluation, however, we do not mean to reject at this point Merrell Dow’s contention that a showing of a .05 level of statistical significance should be a threshold requirement for any statistical analysis concluding that Bendectin is a teratogen regardless of the presence of other indicia of reliability. That contention will need to be addressed on remand. The root issue it poses is what risk of what type of error the judicial system is willing to tolerate. This is not an easy issue to resolve and one possible resolution is a conclusion that the system should not tolerate any expert opinion rooted in statistical analysis where the results of the underlying studies are not significant at a .05 level.”29

After remand, the district court excluded the DeLuca plaintiffs’ expert witnesses, and granted summary judgment, based upon the dubious methods employed by plaintiffs’ expert witnesses in cherry picking data, recalculating risk ratios in published studies, and ignoring bias and confounding in studies. The Third Circuit affirmed the judgment for Merrell Dow.30

In the end, the decisions in the DeLuca case never endorsed the Rothman approach, although Professor Rothman can take credit perhaps for forcing the trial court, on remand, to come to grips with the informational content of the study data, and the many threats to validity, which severely undermined the relied-upon studies and the plaintiffs’ expert witnesses’ opinions.

More recently, in litigation over alleged causation of birth defects in offspring of mothers who used Zoloft during pregnancy, plaintiffs’ counsel attempted to resurrect, through their expert witnesses, the Rothman approach. The multidistrict court saw through counsel’s assertions that the Rothman approach had been adopted in DeLuca, or that it had become generally accepted.31 After protracted litigation in the Zoloft cases, the district court excluded plaintiffs’ expert witnesses and entered summary judgment for the defense. The Third Circuit found that the district court’s handling of the statistical significance issues was fully consistent with the Circuit’s previous pronouncements on the issue of statistical significance.32


1 filed in Daubert v. Merrell Dow Pharmaceuticals, Inc., U.S. Supreme Court No. 92-102 (Jan. 19, 1993), was submitted by Richard A. Meserve and Lars Noah, of Covington & Burling, and by Bert Black, 12 Biotechnology Law Report 198 (No. 2, March-April 1993); see “Daubert’s Silver Anniversary – Retrospective View of Its Friends and Enemies” (Oct. 21, 2018).

2 Brief Amici Curiae of Professors Kenneth Rothman, Noel Weiss, James Robins, Raymond Neutra and Steven Stellman, in Support of Petitioners, 1992 WL 12006438, Daubert v. Merrell Dow Pharmaceuticals, Inc., U.S. S. Ct. No. 92-102 (Dec. 2, 1992). [Rothman Brief].

3 Id. at *7.

4 Rothman Brief at *2.

5 Id. at *2-*3 (emphasis added).

6 Id. at *7 (emphasis added).

7 See Ronald L. Wasserstein & Nicole A. Lazar, “The ASA’s Statement on p-Values: Context, Process, and Purpose,” 70 The American Statistician 129 (2016).

8 Id. at *3.

9 Id. at *2.

10 Id. at *3 – *4.

11 Id. at *3.

12 Id. at *3.

13 Id. at *4 -*5.

14 Id. at *5, *6.

15 at <https://twitter.com/ken_rothman/status/855784253984051201> (April 21, 2017). The tweet pointed to: Joan Lappe, Patrice Watson, Dianne Travers-Gustafson, Robert Recker, Cedric Garland, Edward Gorham, Keith Baggerly, and Sharon L. McDonnell, “Effect of Vitamin D and Calcium Supplementation on Cancer Incidence in Older Women: A Randomized Clinical Trial,” 317 J. Am. Med. Ass’n 1234 (2017).

16 In the case of United States v. Harkonen, Professors Ken Rothman and Tim Lash, and I made common cause in support of Dr. Harkonen’s petition to the United States Supreme Court. The circumstances of Dr. Harkonen’s indictment and conviction provide a concrete example of what Dr. Rothman probably was referring to as “indication of efficacy.” I supported Dr. Harkonen’s appeal because I agreed that there had been a suggestion of efficacy, even if Harkonen had overstated what his clinical trial, standing alone, had shown. (There had been a previous clinical trial, which demonstrated a robust survival benefit.) From my perspective, the facts of the case supported Dr. Harkonen’s exercise of speech in a press release, but it would hardly have justified FDA approval for the indication that Dr. Harkonen was discussing. If Harkonen had indeed committed “wire fraud,” as claimed by the federal prosecutors, then I had (and still have) a rather long list of expert witnesses who stand in need of criminal penalties and rehabilitation for their overreaching opinions in court cases.

17 Robert Temple, “How FDA Currently Makes Decisions on Clinical Studies,” 2 Clinical Trials 276, 281 (2005); Lee Kennedy-Shaffer, “When the Alpha is the Omega: P-Values, ‘Substantial Evidence’, and the 0.05 Standard at FDA,” 72 Food & Drug L.J. 595 (2017); see also “The 5% Solution at the FDA” (Feb. 24, 2018).

18 See, e.g., Stefan Pilz, Katharina Kienreich, Andreas Tomaschitz, Eberhard Ritz, Elisabeth Lerchbaum, Barbara Obermayer-Pietsch, Veronika Matzi, Joerg Lindenmann, Winfried Marz, Sara Gandini, and Jacqueline M. Dekker, “Vitamin D and cancer mortality: systematic review of prospective epidemiological studies,” 13 Anti-Cancer Agents in Medicinal Chem. 107 (2013).

19 Austin Bradford Hill, Principles of Medical Statistics at 2, 10 (4th ed. 1948) (“The statistical method is required in the interpretation of figures which are at the mercy of numerous influences, and its object is to determine whether individual influences can be isolated and their effects measured.”) (emphasis added).

20 Id. at *6 -*7.

21 Id. at *9.

22 Id.

23 Id. at *10.

24 Rothman Brief at *12.

25 See William Childs, “Peering Behind The Peer Review Curtain,” Law360 (Aug. 17, 2018).

26 Daubert v. Merrell Dow Pharms., 509 U.S. 579, 594 (1993).

27 See “Diclegis and Vacuous Philosophy of Science” (June 24, 2015).

28 DeLuca v. Merrell Dow Pharms., Inc., 131 F.R.D. 71 (D.N.J. 1990).

29 DeLuca v. Merrell Dow Pharms., Inc., 911 F.2d 941, 955 (3d Cir. 1990).

30 DeLuca v. Merrell Dow Pharms., Inc., 791 F. Supp. 1042 (D.N.J. 1992), aff’d, 6 F.3d 778 (3d Cir. 1993).

31 In re Zoloft (Sertraline Hydrochloride) Prods. Liab. Litig., MDL No. 2342; 12-md-2342, 2015 WL 314149 (E.D. Pa. Jan. 23, 2015) (Rufe, J.) (denying PSC’s motion for reconsideration), aff’d, 858 F.3d 787 (3d Cir. 2017) (affirming exclusion of plaintiffs’ expert witnesses’ dubious opinions, which involved multiple methodological flaws and failures to follow any methodology faithfully). See generally “Zoloft MDL Relieves Matrixx Depression” (Jan. 30, 2015); “WOE — Zoloft Escapes a MDL While Third Circuit Creates a Conceptual Muddle” (July 31, 2015).

32 See Pritchard v. Dow Agro Sciences, 430 F. App’x 102, 104 (3d Cir. 2011) (excluding Concussion hero, Dr. Bennet Omalu).

The American Statistical Association Statement on Significance Testing Goes to Court – Part I

November 13th, 2018

It has been two and one-half years since the American Statistical Association (ASA) issued its statement on statistical significance. Ronald L. Wasserstein & Nicole A. Lazar, “The ASA’s Statement on p-Values: Context, Process, and Purpose,” 70 The American Statistician 129 (2016) [ASA Statement]. When the ASA Statement was published, I commended it as a needed counterweight to the exaggerated criticisms of significance testing.1 Lawyers and expert witnesses for the litigation industry had routinely pooh-poohed the absence of statistical significance, but over-endorsed its presence in poorly designed and biased studies. Courts and lawyers from all sides routinely misunderstood, misstated, and misrepresented the meaning of statistical significance.2

The ASA Statement had potential to help resolve judicial confusion. It is written in non-technical language, which is easily understood by non-statisticians. Still, the Statement has to be read with care. The principle of charity led me to believe that lawyers and judges would read the Statement carefully, and that it would improve judicial gatekeeping of expert witnesses’ opinion testimony that involved statistical evidence. I am less sanguine now about the prospect of progress.

No sooner had the ASA issued its Statement than the spinning started. One scientist, an editor of PLoS Biology, blogged that “the ASA notes, the importance of the p-value has been greatly overstated and the scientific community has become over-reliant on this one – flawed – measure.”3 Lawyers for the litigation industry were even less restrained in promoting wild misrepresentations about the Statement, with claims that the ASA had condemned the use of p-values, significance testing, and significance probabilities, as “flawed.”4 And yet, nowhere in the ASA’s statement does the group suggest that the p-value was a “flawed” measure.

Criminal Use of the ASA Statement

Where are we now, two plus years out from the ASA Statement? Not surprisingly, the Statement has made its way into the legal arena. The Statement has been used in any number of depositions, relied upon in briefs, and cited in at least a couple of judicial decisions, in the last two years. The empirical evidence of how the ASA Statement has been used, or might be used in the future, is still sparse. Just last month, the ASA Statement was cited by the Washington State Supreme Court, in a ruling that held the death penalty unconstitutional. State of Washington v. Gregory, No. 88086-7, (Wash. S.Ct., Oct. 11, 2018) (en banc). Mr. Gregory was facing the death penalty, after being duly convicted of rape, robbery, and murder. The prosecution was supported by DNA matches, fingerprint identification, and other evidence. Mr. Gregory challenged the constitutionality of his imposed punishment, not on per se grounds of unconstitutionality, but on race disparities in the imposition of the death penalty. On this claim, the Washington Supreme Court commented on the empirical evidence marshalled on Mr. Gregory’s behalf:

The most important consideration is whether the evidence shows that race has a meaningful impact on imposition of the death penalty. We make this determination by way of legal analysis, not pure science. At the very most, there is an 11 percent chance that the observed association between race and the death penalty in Beckett’s regression analysis is attributed to random chance rather than true association. Commissioner’s Report at 56-68 (the p-values range from 0.048-0.111, which measures the probability that the observed association is the result of random chance rather than a true association).[8] Just as we declined to require ‘precise uniformity’ under our proportionality review, we decline to require indisputably true social science to prove that our death penalty is impermissibly imposed based on race.

Id. (internal citations omitted).

Whatever you think of the death penalty, or how it is imposed in the United States, you will have to agree that the Court’s discussion of statistics is itself criminal. In the above quotation from the Court’s opinion, the Court badly misinterpreted the p-values generated in various regression analyses that were offered to support claims of race disparity. The Court’s equating statistically significant evidence of race disparity in these regression analyses with “indisputably true social science” also reflects a rhetorical strategy that imputes ridiculously high certainty (“indisputably true”) to social science conclusions, so that the Court could dismiss the need for such conclusions before accepting a causal race disparity claim on empirical evidence.5
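A p-value is computed on the assumption that chance alone is at work; it cannot also be the probability that chance alone is at work. The arithmetic below is a minimal sketch of why the two quantities differ. The function name and all of the inputs (the share of tested hypotheses that are truly null, the significance level, and the power of the test) are hypothetical, chosen only for illustration, and have nothing to do with the regression analyses before the Washington court.

```python
def null_share_among_significant(prior_null, alpha, power):
    """Share of 'significant' findings that arise when the null is true.

    prior_null: hypothetical fraction of tested hypotheses that are truly null
    alpha:      significance level, P(significant | null is true)
    power:      P(significant | a true association exists)
    """
    # Significant results produced by chance alone.
    false_positives = alpha * prior_null
    # Significant results produced by real associations.
    true_positives = power * (1.0 - prior_null)
    # Bayes' rule: P(null is true | result is significant).
    return false_positives / (false_positives + true_positives)


if __name__ == "__main__":
    # Same significance level and power, different (assumed) base rates.
    for prior_null in (0.5, 0.9):
        share = null_share_among_significant(prior_null, alpha=0.05, power=0.8)
        print(f"prior_null = {prior_null}: share of null results = {share:.3f}")
```

With the significance level and power held fixed, the share of significant results that are due to chance moves with the assumed base rate of true null hypotheses. The probability that an association is the result of random chance therefore cannot be read off a p-value, which is what the Court purported to do.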

Gregory’s counsel had briefed the Washington Court on statistical significance, and raised the ASA Statement as excuse and justification for not presenting statistically significant empirical evidence of race disparity.6 Footnote 8, in the above quote from the Gregory decision, shows that the Court was aware of the ASA Statement, which makes the Court’s errors even more unpardonable:

“[8] The most common p-value used for statistical significance is 0.05, but this is not a bright line rule. The American Statistical Association (ASA) explains that the ‘mechanical “bright-line” rules (such as “p < 0.05”) for justifying scientific claims or conclusions can lead to erroneous beliefs and poor decision making’.”7

Conveniently, Gregory’s counsel did not cite to other parts of the ASA Statement, which would have called for a more searching review of the statistical regression analyses:

“Good statistical practice, as an essential component of good scientific practice, emphasizes principles of good study design and conduct, a variety of numerical and graphical summaries of data, understanding the phenomenon under study, interpretation of results in context, complete reporting and proper logical and quantitative understanding of what data summaries mean. No single index should substitute for scientific reasoning.”8

The Supreme Court of Washington first erred in its assessment of what scientific evidence requires in terms of a burden of proof. It then accepted spurious arguments to excuse the absence of statistical significance in the statistical evidence before it, on the basis of a distorted representation of the ASA Statement. Finally, the Court erred in claiming support from social science evidence, by ignoring other methodological issues in Gregory’s empirical claims. Ironically, the Court had made significance testing the be-all and end-all of its analysis, and when it dispatched statistical significance as a consideration, the Court jumped to the conclusion it wanted to reach. Clearly, the intended message of the ASA Statement had been subverted by counsel and the Court.

2 See, e.g., In re Ephedra Prods. Liab. Litig., 393 F. Supp. 2d 181, 191 (S.D.N.Y. 2005). See also “Confidence in Intervals and Diffidence in the Courts” (March 4, 2012); “Scientific illiteracy among the judiciary” (Feb. 29, 2012).

5 Moultrie v. Martin, 690 F.2d 1078, 1082 (4th Cir. 1982) (internal citations omitted) (“When a litigant seeks to prove his point exclusively through the use of statistics, he is borrowing the principles of another discipline, mathematics, and applying these principles to the law. In borrowing from another discipline, a litigant cannot be selective in which principles are applied. He must employ a standard mathematical analysis. Any other requirement defies logic to the point of being unjust. Statisticians do not simply look at two statistics, such as the actual and expected percentage of blacks on a grand jury, and make a subjective conclusion that the statistics are significantly different. Rather, statisticians compare figures through an objective process known as hypothesis testing.”).

6 Supplemental Brief of Allen Eugene Gregory, at 15, filed in State of Washington v. Gregory, No. 88086-7, (Wash. S.Ct., Jan. 22, 2018).

7 State of Washington v. Gregory, No. 88086-7, (Wash. S.Ct., Oct. 11, 2018) (en banc) (internal citations omitted).

8 ASA Statement at 132.