TORTINI

For your delectation and delight, desultory dicta on the law of delicts.

Don’t Double Dip Data

March 9th, 2015

Meta-analyses have become commonplace in epidemiology and in other sciences. When well conducted and transparently reported, meta-analyses can be extremely helpful. In several litigations, meta-analyses determined the outcome of the medical causation issues. In the silicone gel breast implant litigation, after defense expert witnesses proffered meta-analyses[1], court-appointed expert witnesses adopted the approach and featured meta-analyses in their reports to the MDL court[2].

In the welding fume litigation, plaintiffs’ expert witness offered a crude, non-quantified, “vote counting” exercise to argue that welding causes Parkinson’s disease[3]. In rebuttal, one of the defense expert witnesses offered a quantitative meta-analysis, which provided strong evidence against plaintiffs’ claim.[4] Although the welding fume MDL court excluded the defense expert’s meta-analysis from the pre-trial Rule 702 hearing as untimely, plaintiffs’ counsel soon thereafter initiated settlement discussions of the entire set of MDL cases. Subsequently, the defense expert witness, with his professional colleagues, published an expanded version of the meta-analysis.[5]

And last month, a meta-analysis proffered by a defense expert witness helped dispatch a long-festering litigation in New Jersey’s multi-county isotretinoin (Accutane) litigation. In re Accutane Litig., No. 271(MCL), 2015 WL 753674 (N.J. Super., Law Div., Atlantic Cty., Feb. 20, 2015) (excluding plaintiffs’ expert witness David Madigan).

Of course, when a meta-analysis is done improperly, the resulting analysis may be worse than none at all. Some methodological flaws involve arcane statistical concepts and procedures, and may be easily missed. Other flaws are flagrant and call for a gatekeeping bucket brigade.

When a merchant puts his hand on the scale at the check-out counter, we call that fraud. When George Costanza dipped his chip twice in the chip dip, he was properly called out for his boorish and unsanitary practice. When a statistician or epidemiologist produces a meta-analysis that double counts crucial data to inflate a summary estimate of association, or to create spurious precision in the estimate, we don’t need to crack open Modern Epidemiology or the Reference Manual on Scientific Evidence to know that something fishy has taken place.

In litigation involving claims that selective serotonin reuptake inhibitors cause birth defects, plaintiffs’ expert witness, a perinatal epidemiologist, relied upon two published meta-analyses[6]. In an examination before trial, this epidemiologist was confronted with the double counting (and other data entry errors) in the relied-upon meta-analyses, and she readily agreed that the meta-analyses were improperly done and that she had to abandon her reliance upon them.[7] The result of the expert witness’s deposition epiphany, however, was that she no longer had the illusory benefit of an aggregation of data, with an outcome supporting her opinion. The further consequence was that her opinion succumbed to a Rule 702 challenge. See In re Zoloft (Sertraline Hydrochloride) Prods. Liab. Litig., MDL No. 2342; 12-md-2342, 2014 U.S. Dist. LEXIS 87592; 2014 WL 2921648 (E.D. Pa. June 27, 2014) (Rufe, J.).

Double counting of studies, or subgroups within studies, is a flaw that most careful readers can identify in a meta-analysis, without advance training. According to statistician Stephen Senn, double counting of evidence is a serious problem in published meta-analytical studies. Stephen J. Senn, “Overstating the evidence – double counting in meta-analysis and related problems,” 9 BMC Medical Research Methodology 10, at *1 (2009). Senn observes that he had little difficulty in finding examples of meta-analyses gone wrong, including meta-analyses with double counting of studies or data, in some of the leading clinical medical journals. Id. Senn urges analysts to “[b]e vigilant about double counting,” id. at *4, and recommends that journals “withdraw meta-analyses promptly when mistakes are found,” id. at *1.
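Senn’s point about spurious precision is easy to demonstrate. Here is a minimal sketch, with made-up study results and the standard inverse-variance fixed-effect formulas, showing how entering one study twice both shifts the summary estimate and narrows the confidence interval:

```python
import math

# Hypothetical (log odds ratio, standard error) pairs for five studies.
studies = [(0.10, 0.20), (0.05, 0.15), (0.20, 0.25), (-0.05, 0.18), (0.12, 0.22)]

def fixed_effect(data):
    """Inverse-variance fixed-effect summary: weighted mean log-OR and its SE."""
    weights = [1.0 / se ** 2 for _, se in data]
    total = sum(weights)
    estimate = sum(w * logor for w, (logor, _) in zip(weights, data)) / total
    return estimate, math.sqrt(1.0 / total)

est, se = fixed_effect(studies)
est_dup, se_dup = fixed_effect(studies + [studies[2]])  # third study counted twice

print(f"correct:        OR {math.exp(est):.2f}, log-scale 95% CI half-width {1.96 * se:.3f}")
print(f"double counted: OR {math.exp(est_dup):.2f}, log-scale 95% CI half-width {1.96 * se_dup:.3f}")
```

The duplicated study gets double weight, so the summary drifts toward it, and the extra (phantom) information shrinks the standard error, exactly the spurious precision Senn warns about.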

Similar advice abounds in books and journals[8]. Professor Sander Greenland addresses the issue in his chapter on meta-analysis in Modern Epidemiology:

Conducting a Sound and Credible Meta-Analysis

Like any scientific study, an ideal meta-analysis would follow an explicit protocol that is fully replicable by others. This ideal can be hard to attain, but meeting certain conditions can enhance soundness (validity) and credibility (believability). Among these conditions we include the following:

  • A clearly defined set of research questions to address.

  • An explicit and detailed working protocol.

  • A replicable literature-search strategy.

  • Explicit study inclusion and exclusion criteria, with a rationale for each.

  • Nonoverlap of included studies (use of separate subjects in different included studies), or use of statistical methods that account for overlap. * * * * *

Sander Greenland & Keith O’Rourke, “Meta-Analysis – Chapter 33,” in Kenneth J. Rothman, Sander Greenland, Timothy L. Lash, Modern Epidemiology 652, 655 (3d ed. 2008) (emphasis added).

Just remember George Costanza; don’t double dip that chip, and don’t double dip in the data.


[1] See, e.g., Otto Wong, “A Critical Assessment of the Relationship between Silicone Breast Implants and Connective Tissue Diseases,” 23 Regulatory Toxicol. & Pharmacol. 74 (1996).

[2] See Barbara Hulka, Betty Diamond, Nancy Kerkvliet & Peter Tugwell, “Silicone Breast Implants in Relation to Connective Tissue Diseases and Immunologic Dysfunction:  A Report by a National Science Panel to the Hon. Sam Pointer Jr., MDL 926 (Nov. 30, 1998)”; Barbara Hulka, Nancy Kerkvliet & Peter Tugwell, “Experience of a Scientific Panel Formed to Advise the Federal Judiciary on Silicone Breast Implants,” 342 New Engl. J. Med. 812 (2000).

[3] Deposition of Dr. Juan Sanchez-Ramos, Street v. Lincoln Elec. Co., Case No. 1:06-cv-17026, 2011 WL 6008514 (N.D. Ohio May 17, 2011).

[4] Deposition of Dr. James Mortimer, Street v. Lincoln Elec. Co., Case No. 1:06-cv-17026, 2011 WL 6008054 (N.D. Ohio June 29, 2011).

[5] James Mortimer, Amy Borenstein & Laurene Nelson, “Associations of Welding and Manganese Exposure with Parkinson’s Disease: Review and Meta-Analysis,” 79 Neurology 1174 (2012).

[6] Shekoufeh Nikfar, Roja Rahimi, Narjes Hendoiee, and Mohammad Abdollahi, “Increasing the risk of spontaneous abortion and major malformations in newborns following use of serotonin reuptake inhibitors during pregnancy: A systematic review and updated meta-analysis,” 20 DARU J. Pharm. Sci. 75 (2012); Roja Rahimi, Shekoufeh Nikfar & Mohammad Abdollahi, “Pregnancy outcomes following exposure to serotonin reuptake inhibitors: a meta-analysis of clinical trials,” 22 Reproductive Toxicol. 571 (2006).

[7] “Q So the question was: Have you read it carefully and do you understand everything that was done in the Nikfar meta-analysis?

A Yes, I think so.

* * *

Q And Nikfar stated that she included studies, correct, in the cardiac malformation meta-analysis?

A That’s what she says.

* * *

Q So if you look at the STATA output, the demonstrative, the — the forest plot, the second study is Kornum 2010. Do you see that?

A Am I —

Q You’re looking at figure four, the cardiac malformations.

A Okay.

Q And Kornum 2010, —

A Yes.

Q — that’s a study you relied upon.

A Mm-hmm.

Q Is that right?

A Yes.

Q And it’s on this forest plot, along with its odds ratio and confidence interval, correct?

A Yeah.

Q And if you look at the last study on the forest plot, it’s the same study, Kornum 2010, same odds ratio and same confidence interval, true?

A You’re right.

Q And to paraphrase My Cousin Vinny, no self-respecting epidemiologist would do a meta-analysis by including the same study twice, correct?

A Well, that was an error. Yeah, you’re right.

***

Q Instead of putting 2 out of 98, they extracted the data and put 9 out of 28.

A Yeah. You’re right.

Q So there’s a numerical transposition that generated a 25-fold increased risk; is that right?

A You’re correct.

Q And, again, to quote My Cousin Vinny, this is no way to do a meta-analysis, is it?

A You’re right.”

Testimony of Anick Bérard, Kuykendall v. Forest Labs, at 223:14-17; 238:17-20; 239:11-240:10; 245:5-12 (Cole County, Missouri; Nov. 15, 2013). According to a Google Scholar search, the Rahimi 2006 meta-analysis had been cited 90 times; the Nikfar 2012 meta-analysis, 11 times, as recently as this month. See, e.g., Etienne Weisskopf, Celine J. Fischer, Myriam Bickle Graz, Mathilde Morisod Harari, Jean-Francois Tolsa, Olivier Claris, Yvan Vial, Chin B. Eap, Chantal Csajka & Alice Panchaud, “Risk-benefit balance assessment of SSRI antidepressant use during pregnancy and lactation based on best available evidence,” 14 Expert Op. Drug Safety 413 (2015); Kimberly A. Yonkers, Katherine A. Blackwell & Ariadna Forray, “Antidepressant Use in Pregnant and Postpartum Women,” 10 Ann. Rev. Clin. Psychol. 369 (2014); Abbie D. Leino & Vicki L. Ellingrod, “SSRIs in pregnancy: What should you tell your depressed patient?” 12 Current Psychiatry 41 (2013).
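For readers who want to check the arithmetic of the transposition described in the testimony: swapping “2 out of 98” for “9 out of 28” multiplies the odds in that study arm by a factor of roughly 23, which inflates the study’s odds ratio by about the same factor against whatever comparison arm was used, on the order of the “25-fold” figure in the testimony. A back-of-the-envelope check (the comparator counts are not in the excerpt, but they cancel out of the inflation factor):

```python
# Exposed-arm odds under the correct extraction versus the transposed one.
correct_odds = 2 / (98 - 2)      # 2 events out of 98 -> odds of about 0.021
transposed_odds = 9 / (28 - 9)   # 9 events out of 28 -> odds of about 0.474
print(transposed_odds / correct_odds)  # ~ 22.7-fold inflation of the odds ratio
```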

[8] Julian Higgins & Sally Green, eds., Cochrane Handbook for Systematic Reviews of Interventions 152 (2008) (“7.2.2 Identifying multiple reports from the same study. Duplicate publication can introduce substantial biases if studies are inadvertently included more than once in a meta-analysis (Tramèr 1997). Duplicate publication can take various forms, ranging from identical manuscripts to reports describing different numbers of participants and different outcomes (von Elm 2004). It can be difficult to detect duplicate publication, and some ‘detective work’ by the review authors may be required.”); see also id. at 298 (Table 10.1.a “Definitions of some types of reporting biases”); id. at 304-05 (10.2.2.1 Duplicate (multiple) publication bias … “The inclusion of duplicated data may therefore lead to overestimation of intervention effects.”); Julian P.T. Higgins, Peter W. Lane, Betsy Anagnostelis, Judith Anzures-Cabrera, Nigel F. Baker, Joseph C. Cappelleri, Scott Haughie, Sally Hollis, Steff C. Lewis, Patrick Moneuse & Anne Whitehead, “A tool to assess the quality of a meta-analysis,” 4 Research Synthesis Methods 351, 363 (2013) (“A common error is to double-count individuals in a meta-analysis.”); Alessandro Liberati, Douglas G. Altman, Jennifer Tetzlaff, Cynthia Mulrow, Peter C. Gøtzsche, John P.A. Ioannidis, Mike Clarke, P.J. Devereaux, Jos Kleijnen, and David Moher, “The PRISMA Statement for Reporting Systematic Reviews and Meta-Analyses of Studies That Evaluate Health Care Interventions: Explanation and Elaboration,” 151 Ann. Intern. Med. W-65, W-75 (2009) (“Some studies are published more than once. Duplicate publications may be difficult to ascertain, and their inclusion may introduce bias. We advise authors to describe any steps they used to avoid double counting and piece together data from multiple reports of the same study (e.g., juxtaposing author names, treatment comparisons, sample sizes, or outcomes).”) (internal citations omitted); Erik von Elm, Greta Poglia, Bernhard Walder, and Martin R. Tramèr, “Different patterns of duplicate publication: an analysis of articles used in systematic reviews,” 291 J. Am. Med. Ass’n 974 (2004); John Andy Wood, “Methodology for Dealing With Duplicate Study Effects in a Meta-Analysis,” 11 Organizational Research Methods 79, 79 (2008) (“Dependent studies, duplicate study effects, nonindependent studies, and even covert duplicate publications are all terms that have been used to describe a threat to the validity of the meta-analytic process.”) (internal citations omitted); Martin R. Tramèr, D. John M. Reynolds, R. Andrew Moore, Henry J. McQuay, “Impact of covert duplicate publication on meta-analysis: a case study,” 315 Brit. Med. J. 635 (1997); Beverley J. Shea, Jeremy M. Grimshaw, George A. Wells, Maarten Boers, Neil Andersson, Candyce Hamel, Ashley C. Porter, Peter Tugwell, David Moher, and Lex M. Bouter, “Development of AMSTAR: a measurement tool to assess the methodological quality of systematic reviews,” 7 BMC Medical Research Methodology 10 (2007) (systematic reviews must inquire whether there was “duplicate study selection and data extraction”).

Sander Greenland on “The Need for Critical Appraisal of Expert Witnesses in Epidemiology and Statistics”

February 8th, 2015

Sander Greenland is one of the few academics who have served as expert witnesses and written post-mortems of their involvement in various litigations[1]. Although settling scores with opposing expert witnesses can be a risky business[2], the practice can provide important insights for judges and lawyers who want to avoid the errors of the past. Greenland correctly senses that many errors seem endlessly recycled, and that courts could benefit from disinterested commentary on cases. And so, there should be a resounding affirmation from federal and state courts of the proclaimed “need for critical appraisal of expert witnesses in epidemiology and statistics,” as well as in many other disciplines.

A recent exchange[3] with Professor Greenland led me to revisit his Wake Forest Law Review article. His article raises some interesting points, some mistaken, but some valuable and thoughtful considerations about how to improve the state of statistical expert witness testimony. For better and worse[4], lawyers who litigate health effects issues should read it.

Other Misunderstandings

Greenland posits criticisms of defense expert witnesses[5], whom he believes have misinterpreted or misstated the appropriate inferences to be drawn from null studies. In one instance, Greenland revisits one of his own cases, without any clear acknowledgment that his views were largely rejected.[6] The State of California had declared, pursuant to Proposition 65 (the Safe Drinking Water and Toxic Enforcement Act of 1986, Health and Safety Code sections 25249.5, et seq.), that the State “knew” that di(2-ethylhexyl)phthalate, or “DEHP,” caused cancer. Baxter Healthcare challenged the classification, and according to Greenland, the defense experts erroneously interpreted inconclusive studies as evidence supporting a conclusion that DEHP does not cause cancer.

Greenland argues that the Baxter expert’s reference[7] to an IARC working group’s classification of DEHP as “not classifiable as to its carcinogenicity to humans” did not support the expert’s conclusion that DEHP does not cause cancer in humans. If Baxter’s expert invoked the IARC working group’s classification for complete exoneration of DEHP, then Greenland’s point is fair enough. In his single-minded attack on Baxter’s expert’s testimony, however, Greenland missed a more important point, which is that the IARC’s determination that DEHP is not classifiable as to carcinogenicity directly contradicts California’s epistemic claim to “know” that DEHP causes cancer. And Greenland conveniently omits any discussion that the IARC working group had reclassified DEHP from “possibly carcinogenic” to “not classifiable,” in light of its conclusion that mechanistic evidence of carcinogenesis in rodents did not pertain to humans.[8] Greenland maintains that Baxter’s experts misrepresented the IARC working group’s conclusion[9], but that conclusion, at the very least, demonstrates that California was on very shaky ground when it declared that it “knew” that DEHP was a carcinogen. California’s semantic gamesmanship over its epistemic claims is at the root of the problem, not a misstep by defense experts in describing inconclusive evidence as exonerative.

Greenland goes on to complain that in litigation over health claims:

“A verdict of ‛uncertain’ is not allowed, yet it is the scientific verdict most often warranted. Elimination of this verdict from an expert’s options leads to the rather perverse practice (illustrated in the DEHP testimony cited above) of applying criminal law standards to risk assessments, as if chemicals were citizens to be presumed innocent until proven guilty.”

39 Wake Forest Law Rev. at 303. Despite Greenland’s alignment with California in the Denton case, the fact of the matter is that a verdict of “uncertain” was allowed, and he was free to criticize California for making a grossly exaggerated epistemic claim on inconclusive evidence.

Perhaps recognizing that he may readily be seen as an advocate coming to the defense of California on the DEHP issue, Greenland protests that:

“I am not suggesting that judgments for plaintiffs or actions against chemicals should be taken when evidence is inconclusive.”

39 Wake Forest Law Rev. at 305. And yet, his involvement in the Denton case (as well as other cases, such as silicone gel breast implant cases, thimerosal cases, etc.) suggests that he is willing to lend aid and support to judgments for plaintiffs when the evidence is inconclusive.

Important Advice and Recommendations

The foregoing points are rather severe limitations of Greenland’s article, but lawyers and judges should also look to what is good and helpful here. Greenland is correct to call out expert witnesses, regardless of party affiliation, who opine that inconclusive studies are “proof” of the null hypothesis. Although some of Greenland’s arguments against the use of significance probability may be overstated, his corrections to the misstatements and misunderstandings of significance probability should command greater attention in the legal community. In one strained passage, however, Greenland uses a disjunction to juxtapose null hypothesis testing with proof beyond a reasonable doubt[10]. Greenland of course understands the difference, but the context could lead some untutored readers to think he has equated the two probabilistic assessments. Writing in a law review for lawyers and judges might have led him to be more careful. Given the prevalence of plaintiffs’ counsel’s confusing the 95% confidence coefficient with a burden of proof akin to beyond a reasonable doubt, great care in this area is, indeed, required.

Despite his appearing for plaintiffs’ counsel in health effects litigation, some of Greenland’s suggestions are balanced and perhaps more truth-promoting than many plaintiffs’ counsel would abide. His article provides an important argument in favor of raising the legal criteria for witnesses who purport to have expertise to address and interpret epidemiologic and experimental evidence[11]. And beyond raising qualification requirements above mere “reasonable pretense at expertise,” Professor Greenland offers some thoughtful, helpful recommendations for improving expert witness testimony in the courts:

  • “Begin publishing projects in which controversial testimony (a matter of public record) is submitted, and as space allows, published on a regular basis in scientific or law journals, perhaps with commentary. An online version could provide extended excerpts, with additional context.
  • Give courts the resources and encouragement to hire neutral experts to peer-review expert testimony.
  • Encourage universities and established scholarly societies (such as AAAS, ASA, APHA, and SER) to conduct workshops on basic epidemiologic and statistical inference for judges and other legal professionals.”

39 Wake Forest Law Rev. at 308.

Each of these three suggestions is valuable and constructive, and worthy of an independent paper. The recommendation of neutral expert witnesses and scholarly tutorials for judges is hardly new; many defense counsel and judges have argued for them in litigation and in commentary. The first recommendation, publishing “controversial testimony,” is part of the purpose of this blog. There would be great utility in making expert witness testimony, and analysis thereof, more available for didactic purposes. Perhaps the more egregious testimonial adventures should be republished in professional journals, as Greenland suggests. Greenland qualifies his recommendation with “as space allows,” but space is hardly the limiting consideration in the digital age.

Causation

Professor Greenland correctly points out that causal concepts and conclusions are often essentially contested[12], but his argument might well be incorrectly taken for “anything goes.” More helpfully, Greenland argues that various academic ideals should infuse expert witness testimony. He suggests that greater scholarship, with acknowledgment of all viewpoints, and all evidence, is needed in expert witnessing. 39 Wake Forest Law Rev. at 293.

Greenland’s argument provides an important corrective to the rhetoric of Oreskes, Cranor, Michaels, Egilman, and others on “manufacturing doubt”:

“Never force a choice among competing theories; always maintain the option of concluding that more research is needed before a defensible choice can be made.”

Id. Despite his position in the Denton case, and others, Greenland and all expert witnesses are free to maintain that more research is needed before a causal claim can be supported. Greenland also maintains that expert witnesses should “look past” the conclusions drawn by authors, and base their opinions on the “actual data” on which the statistical analyses are based, and from which conclusions have been drawn. Courts have generally rejected this view, but if courts were to insist upon real expertise in epidemiology and statistics, then the testifying expert witnesses should not be constrained by the hearsay opinions in the discussion sections of published studies – sections which by nature are incomplete and tendentious. See “Follow the Data, Not the Discussion” (May 2, 2010).

Greenland urges expert witnesses and legal counsel to be forthcoming about their assumptions and their uncertainty about conclusions:

“Acknowledgment of controversy and uncertainty is a hallmark of good science as well as good policy, but clashes with the very time limited tasks faced by attorneys and courts.”

39 Wake Forest Law Rev. at 293-94. This recommendation would be helpful in assuring courts that the data may simply not support conclusions sufficiently certain to be submitted to lay judges and jurors. Rosen v. Ciba-Geigy Corp., 78 F.3d 316, 319, 320 (7th Cir. 1996) (“But the courtroom is not the place for scientific guesswork, even of the inspired sort. Law lags science; it does not lead it.”) (internal citations omitted).

Threats to Validity

One of the serious mistakes counsel often make in health effects litigation is to invite courts to believe that statistical significance is sufficient for causal inferences. Greenland emphasizes that validity considerations often are much stronger, and more important considerations than the play of random error[13]:

“For very imperfect data (e.g., epidemiologic data), the limited conclusions offered by statistics must be further tempered by validity considerations.”

*   *   *   *   *   *

“Examples of validity problems include non-random distribution of the exposure in question, non-random selection or cooperation of subjects, and errors in assessment of exposure or disease.”

39 Wake Forest Law Rev. at 302-03. Greenland’s abbreviated list of threats to validity should remind courts that they cannot sniff a p-value below five percent and then safely kick the can to the jury. The literature on evaluating bias and confounding is huge, but Greenland was a co-author on an important recent paper, which needs to be added to the required reading lists of judges charged with gatekeeping expert witness opinion testimony about health effects. See Timothy L. Lash, et al., “Good practices for quantitative bias analysis,” 43 Internat’l J. Epidem. 1969 (2014).
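The Lash paper is about quantifying, rather than merely naming, such biases. As a flavor of the simplest technique in that literature, here is a minimal, illustrative sketch of a point-value adjustment for an unmeasured binary confounder; every bias parameter below is assumed for illustration, not drawn from any real study:

```python
def adjusted_rr(rr_obs, rr_cd, p1, p0):
    """Point-value bias analysis for an unmeasured binary confounder.

    rr_obs -- observed exposure-disease risk ratio
    rr_cd  -- assumed confounder-disease risk ratio
    p1, p0 -- assumed confounder prevalence among the exposed / unexposed
    """
    bias = (rr_cd * p1 + (1 - p1)) / (rr_cd * p0 + (1 - p0))
    return rr_obs / bias

# An observed RR of 1.5 shrinks toward the null if a strong confounder
# (RR = 3) is assumed twice as prevalent among the exposed as the unexposed.
print(adjusted_rr(rr_obs=1.5, rr_cd=3.0, p1=0.4, p0=0.2))  # ~ 1.17
```

The exercise shows how quickly an apparently elevated risk ratio can be explained by plausible confounding, which is precisely why validity can matter more than a small p-value.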


[1] For an influential example of this sparse genre, see James T. Rosenbaum, “Lessons from litigation over silicone breast implants: A call for activism by scientists,” 276 Science 1524 (1997) (describing the exaggerations, distortions, and misrepresentations of plaintiffs’ expert witnesses in silicone gel breast implant litigation, from the perspective of a highly accomplished scientist physician, who served as a defense expert witness, in proceedings before Judge Robert Jones, in Hall v. Baxter Healthcare Corp., 947 F. Supp. 1387 (D. Or. 1996)). In one attempt to “correct the record” in the aftermath of a case, Greenland excoriated a defense expert witness, Professor Robert Makuch, for stating that Bayesian methods are rarely used in medicine or in the regulation of medicines. Sander Greenland, “The Need for Critical Appraisal of Expert Witnesses in Epidemiology and Statistics,” 39 Wake Forest Law Rev. 291, 306 (2004). Greenland heaped adjectives upon his adversary: “ludicrous claim,” “disturbing,” “misleading expert testimony,” and “demonstrably quite false.” See “The Infrequency of Bayesian Analyses in Non-Forensic Court Decisions” (Feb. 16, 2014) (debunking Prof. Greenland’s claims).

[2] One almost comical example of trying too hard to settle a score occurs in a footnote, where Greenland cites a breast implant case as having been reversed in part by another case in the same appellate court. See 39 Wake Forest Law Rev. at 309 n.68, citing Allison v. McGhan Med. Corp., 184 F.3d 1300, 1310 (11th Cir. 1999), aff’d in part & rev’d in part, United States v. Baxter Int’l, Inc., 345 F.3d 866 (11th Cir. 2003). The subsequent case was not by any stretch of the imagination a reversal of the earlier Allison case; the egregious citation is a legal fantasy. Furthermore, Allison had no connection with the procedures for court-appointed expert witnesses or technical advisors. Perhaps the most charitable interpretation of this footnote is that it was injected by the law review editors or supervisors.

[3] See “Significance Levels are Made a Whipping Boy on Climate Change Evidence: Is .05 Too Strict? (Schachtman on Oreskes)” (Jan. 4, 2015).

[4] In addition to the unfair attack on Professor Makuch, see supra, n.1, there is much that some will find “disturbing,” “misleading,” and even “ludicrous” (some of Greenland’s favorite pejorative adjectives) in the article. Greenland repeats in brief his arguments against the legal system’s use of probabilities of causation, which I have addressed elsewhere.

[5] One of Baxter’s expert witnesses appeared to be the late Professor Patricia Buffler.

[6] See 39 Wake Forest Law Rev. at 294-95, citing Baxter Healthcare Corp. v. Denton, No. 99CS00868, 2002 WL 31600035, at *1 (Cal. App. Dep’t Super. Ct. Oct. 3, 2002) (unpublished); Baxter Healthcare Corp. v. Denton, 120 Cal. App. 4th 333 (2004).

[7] Although Greenland cites to a transcript, the citation is to a judicial opinion, and the actual transcript of testimony is not available at the citation given.

[8] See Denton, supra.

[9] 39 Wake Forest L. Rev. at 297.

[10] 39 Wake Forest L. Rev. at 305 (“If it is necessary to prove causation ‛beyond a reasonable doubt’–or be ‛compelled to give up the null’ – then action can be forestalled forever by focusing on any aspect of available evidence that fails to conform neatly with the causal (alternative) hypothesis. And in medical and social science there is almost always such evidence available, not only because of the ‛play of chance’ (the focus of ordinary statistical theory), but also because of the numerous validity problems in human research.”).

[11] See Peter Green, “Letter from the President to the Lord Chancellor regarding the use of statistical evidence in court cases” (Jan. 23, 2002) (writing on behalf of The Royal Statistical Society; “Although many scientists have some familiarity with statistical methods, statistics remains a specialised area. The Society urges you to take steps to ensure that statistical evidence is presented only by appropriately qualified statistical experts, as would be the case for any other form of expert evidence.”).

[12] 39 Wake Forest Law Rev. at 291 (“In reality, there is no universally accepted method for inferring presence or absence of causation from human observational data, nor is there any universally accepted method for inferring probabilities of causation (as courts often desire); there is not even a universally accepted definition of cause or effect.”).

[13] 39 Wake Forest Law Rev. at 302-03 (“If one is more concerned with explaining associations scientifically, rather than with mechanical statistical analysis, evidence about validity can be more important than statistical results.”).


Fixodent Study Causes Lockjaw in Plaintiffs’ Counsel

February 4th, 2015

Litigation Drives Science

Back in 2011, the Fixodent MDL Court sustained Rule 702 challenges to plaintiffs’ expert witnesses. “Hypotheses are verified by testing, not by submitting them to lay juries for a vote.” In re Denture Cream Prods. Liab. Litig., 795 F. Supp. 2d 1345, 1367 (S.D.Fla.2011), aff’d, Chapman v. Procter & Gamble Distrib., LLC, 766 F.3d 1296 (11th Cir. 2014). The Court found that the plaintiffs had raised a superficially plausible hypothesis, but that they had not verified the hypothesis by appropriate testing[1].

Like dentures to Fixodent, the plaintiffs stuck to their claims, and set out to create the missing evidence. Plaintiffs’ counsel contracted with Dr. Salim Shah and his companies Sarfez Pharmaceuticals, Inc. and Sarfez USA, Inc. (“Sarfez”) to conduct human research in India, to support their claims that zinc in denture cream causes neurological damage[2]. In re Denture Cream Prods. Liab. Litig., Misc. Action 13-384 (RBW), 2013 U.S. Dist. LEXIS 93456, *2 (D.D.C. July 3, 2013). When the defense learned of this study, and of plaintiffs’ counsel’s payments of over $300,000 to support it, they sought discovery of raw data, the study protocol, statistical analyses, and other materials from plaintiffs’ counsel. Plaintiffs’ counsel protested that they did not have all the materials, and directed defense counsel to Sarfez. Although other courts have made counsel produce similar materials from the scientists and independent contractors they engaged, in this case, defense counsel followed the trail of documents to the contractor, Sarfez, with subpoenas in hand. Id. at *3-4.

The defense served a Rule 45 subpoena on Sarfez, which produced some, but not all, responsive documents. Procter & Gamble pressed for the missing materials, including study protocols, analytical reports, and raw data. Id. at *12-13. Judge Reggie Walton upheld the subpoena, finding its requests for underlying data and non-privileged correspondence to be within the scope of Rules 26(b) and 45, and not unduly burdensome. Id. at *9-10, *20. Sarfez attempted to argue that the requested materials, listed as email attachments, might not exist, but Judge Walton branded the suggestion “disingenuous.” Attachments to emails should be produced along with the emails. Id. at *12 (citing and collecting cases). Although Judge Walton did not grant a request for forensic recovery of hard-drive data or for sanctions, His Honor warned Sarfez that it might be required to bear the cost of forensic data recovery if it did not comply with the court’s order. Id. at *15, *22.

Plaintiffs Put Their Study Into Play

The study at issue in the subpoena was designed by Frederick K. Askari, M.D., Ph.D., an associate professor of hepatology in the University of Michigan Health System. In re Denture Cream Prods. Liab. Litig., No. 09–2051–MD, 2015 WL 392021, at *7 (S.D. Fla. Jan. 28, 2015). At the instruction of plaintiffs’ counsel, Dr. Askari sought to study the short-term effects of Fixodent on copper absorption in humans. Working in India, Askari conducted the study on 24 participants, who were given a controlled diet for 36 days. Of the 24 participants, 12, randomly selected, received 12 grams of Fixodent per day (containing 204 mg of zinc). Another six participants, randomly selected, were given zinc acetate three times per day (150 mg of zinc), and the remaining six participants received placebo three times per day.

A study protocol was approved by an independent group[3], id. at *9, and the study was supposed to be conducted with a double blind. Id. at *7. Not surprisingly, those participants who received doses of Fixodent or zinc acetate had higher urinary levels of zinc (p < 0.05). The important issue, however, was whether the dietary zinc levels affected copper excretion in a way that would support plaintiffs’ claims that copper levels were lowered sufficiently by Fixodent to cause a syndromic neurological disorder. The MDL Court ultimately concluded that plaintiffs’ expert witnesses’ opinions on general causation claims were not sufficiently supported to satisfy the requirements of Rule 702, and upheld defense challenges to those expert witnesses. In doing so, the MDL Court had much of interest to say about case reports, weight of the evidence, and other important issues. This post, however, concentrates on the deviations of one study, commissioned by plaintiffs’ counsel, from the scientific standard of care. The Askari “research” makes for a fascinating case study of how not to conduct a study in a litigation caldron.

Non-Standard Deviations

The First Deviation – Changing the Ascertainment Period After the Data Are Collected

The protocol apparently identified the primary end point as:

“the mean increase in [copper 65] excretion in fecal matter above the baseline (mg/day) averaged over the study period … to test the hypothesis that the release of [zinc] either from Fixodent or Zinc Acetate impairs [copper 65] absorption as measured in feces.”

The study outcome, on the primary end point, was clear. The plaintiffs’ testifying statistician, Hongkun Wang, stated in her deposition that the fecal copper (whether isotope Cu63 or Cu65) was not different across the three groups (Fixodent, zinc acetate, and placebo). Id. at *9[4]. Even Dr. Askari himself admitted that the total fecal copper levels were not increased in the Fixodent group compared with the placebo control group. Id. at *9.[5]

Apparently after obtaining the data, and finding no difference in the pre-specified end point of average fecal copper levels between Fixodent and placebo groups, Askari turned to a new end point, measured in a different way, not described in the protocol as the primary end point.

The Second Deviation – Changing Primary End Point After the Data Are Collected

In the early (days 3, 4, and 5) and late (days 31, 32, and 33) parts of the Study, participants received a dose of purified copper 65[6] to help detect the “blockade of copper.” Id. at *8. The participants’ fecal copper 65 levels were compared to their naturally occurring copper 63 levels. According to Dr. Askari:

“if copper is being blocked in the Fixodent and zinc acetate test subjects from exposure to the zinc in the test product (Fixodent) and positive control (zinc acetate), the ratio of their fecal output of copper 65 as compared to their fecal output of copper 63 would increase relative to the control subjects, who were not dosed with zinc. In short, a higher ratio of copper 65 to copper 63 reflects blocking of copper.”

Id.

Askari analyzed the ratio of the two copper isotopes (Cu65/Cu63), limiting the period of observation to study days 31 to 33. Id. at *9. Askari thus changed the outcome to be measured, the timing of the measurement, and the manner of measurement (average over the entire period versus amounts on days 31 to 33). On this post hoc, non-prespecified end point, Askari claimed to have found “significant” differences.

The MDL Court expressed its skepticism and concern over the difference between the protocol’s specified end point, and one that came into the study only after the data were obtained and analyzed. The plaintiffs claimed that it was their (and Askari’s) intention from the initial stages of designing the Fixodent Blockade Study to use the Cu65/Cu63 ratio as the primary end point. According to the plaintiffs, the isotope ratio was simply better articulated and “clarified” as the primary end point in the final report than it was in the protocol. The Court was not amused or assuaged by the plaintiffs’ assurances. The study sponsor, Dr. Salim Shah, could not point to a draft protocol that indicated the isotope ratio as the end point; nor could Dr. Shah identify a request for this analysis by Wang until after the study was concluded. Id. at *9.[7]

Ultimately, the Court declared that whether the protocol was changed post hoc after the primary end point yielded disappointing results, or the isotope ratio was carelessly omitted from the protocol, the design or conduct of the study was “incompatible with reliable scientific methodology.”

The Third Deviation – Changing the Standard of “Significance” After the Data Are Collected and P-Values Are Computed

The protocol for the Blockade Study called for a pre-determined Type I error rate (significance level) of no more than 5 percent.[8] Id. at *10. The difference in the isotope ratio showed an attained significance probability (p-value) of 5.7 percent, and thus even the post hoc end point missed the prespecified level of significance. The final protocol changed the level of “significance” to 10 percent, to permit the plaintiffs to declare a “statistically significant” result. Dr. Wang admitted in deposition that she doubled the acceptable level of Type I error only after she obtained the data and calculated the p-value of 0.057. Id. at *10.[9]
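The effect of the maneuver is easy to quantify. Under the null hypothesis, the p-value is uniformly distributed, so the rule “test at 5 percent, and if that fails, relax to 10 percent” is simply a 10 percent test wearing a 5 percent label. A short, purely illustrative simulation makes the point:

```python
import random

random.seed(12345)
trials = 100_000
rejections = 0
for _ in range(trials):
    p = random.random()  # under the null, the p-value is uniform on (0, 1)
    # Post hoc rule: declare "significance" at 0.05 and, failing that, at 0.10.
    if p < 0.05 or p < 0.10:  # logically equivalent to just p < 0.10
        rejections += 1
print(rejections / trials)  # ~ 0.10: the nominal 5% Type I error rate has doubled
```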

The Court found that this deliberate moving of the statistical goal post reflected a “lack of objectivity and reliability,” which smacked of contrivance[10].

The Court found that the study’s deviations from the protocol demonstrated a lack of objectivity. The inadequacy of the Study’s statistical analysis plan supported the Court’s conclusion that Dr. Askari’s supposed finding of a “statistically significant” difference in fecal copper isotope ratio between Fixodent and placebo group participants was “not based on sufficiently reliable and objective scientific methodology” and thus could not support plaintiffs’ expert witnesses’ general causation claims.

The Fourth Deviation – Failing to Take Steps to Preserve the Blind

The protocol called for a double-blinded study, with neither the participants nor the clinical investigators knowing which participant was in which group. Rather than delivering similar-looking capsules to the three different groups, however, the investigators gave each group starkly different-looking capsules. Id. at *11. The capsules for one set were apparently so large that the investigators worried whether the participants would comply with the dosing regimen.

The Fifth Deviation – Failing to Take Steps to Keep Biological Samples From Becoming Contaminated

Documents and emails from Dr. Shah acknowledged that there had been “difficulties in storing samples at appropriate temperature.” Id. at *11. Fecal samples were “exposed to unfrozen and undesirable temperature conditions.” Dr. Shah called for remedial steps from the Study manager, but there was no documentation that such steps were taken to correct the problem. Id.

The Consequences of Discrediting the Study

Dr. Askari opined that the Study, along with other evidence, shows that Fixodent can cause copper deficiency myeloneuropathy (“CDM”). The plaintiffs, of course, argued that the Defendants’ criticisms of the Fixodent Study’s methodology went merely to the “weight rather than admissibility.” Id. at *9. Askari’s study was but one leg of the stool, but the defense’s thorough discrediting of the study was an important step in collapsing the support for the plaintiffs’ claims. As the MDL Court explained:

“The Court cannot turn a blind eye to the myriad, serious methodological flaws in the Fixodent Blockade Study and conclude they go to weight rather than admissibility. While some of these flaws, on their own, may not be serious enough to justify exclusion of the Fixodent Blockade Study; taken together, the Court finds Fixodent Blockade Study is not “good science,” and is not admissible. Daubert, 509 U.S. at 593 (internal quotation marks and citation omitted).”

Id. at *11.

A study, such as the Fixodent Blockade Study, is not itself admissible, but the deconstruction of the study upon which plaintiffs’ expert witnesses relied led directly to the Court’s decision to exclude those witnesses. The Court omitted any reference to Federal Rule of Evidence 703, which addresses the requirements for facts and data, otherwise inadmissible, upon which expert witnesses may rely in reaching their opinions.


 

[1] See “Philadelphia Plaintiff’s Claims Against Fixodent Prove Toothless” (May 2, 2012); Jacoby v. Rite Aid Corp., 2012 Phila. Ct. Com. Pl. LEXIS 208 (2012), aff’d, 93 A.3d 503 (Pa. Super. 2013); “Pennsylvania Superior Court Takes The Bite Out of Fixodent Claims” (Dec. 12, 2013).

[2] See “Using the Rule 45 Subpoena to Obtain Research Data” (July 24, 2013).

[3] The group was identified as the Ethica Norma Ethical Committee.

[4] Citing Wang Dep. at 56:7–25 (Aug. 13, 2013), and Wang Analysis of Fixodent Blockade Study [ECF No. 2197–56] (noting “no clear treatment effect on Cu63 or Cu65”).

[5] Askari Dep. at 69:21–24, June 20, 2013.

[6] Copper 65 is not a typical tracer; it is not radioactive. Naturally occurring copper consists almost exclusively of two stable (non-radioactive) isotopes: Cu65, about 31 percent, and Cu63, about 69 percent. See, e.g., Manuel Olivares, Bo Lönnerdal, Steve A Abrams, Fernando Pizarro, and Ricardo Uauy, “Age and copper intake do not affect copper absorption, measured with the use of 65Cu as a tracer, in young infants,” 76 Am. J. Clin. Nutr. 641 (2002); T.D. Lyon, et al., “Use of a stable copper isotope (65Cu) in the differential diagnosis of Wilson’s disease,” 88 Clin. Sci. 727 (1995).

[7] Shah Dep. at 87:12–25; 476:2–536:12; 138:6–142:12 (June 5, 2013).

[8] The reported decision leaves unclear how the analysis would proceed, whether by ANOVA for the three groups, or t-tests, and whether there was multiple testing.

[9] Wang Dep. at 151:13–152:7; 153:15–18.

[10] 2015 WL 392021, at *10, citing Perry v. United States, 755 F.2d 888, 892 (11th Cir. 1985) (“A scientist who has a formed opinion as to the answer he is going to find before he even begins his research may be less objective than he needs to be in order to produce reliable scientific results.”); Rink v. Cheminova, Inc., 400 F.3d 1286, 1293 n.7 (11th Cir. 2005) (“In evaluating the reliability of an expert’s method … a district court may properly consider whether the expert’s methodology has been contrived to reach a particular result.” (alteration added)).


Zoloft MDL Relieves Matrixx Depression

January 30th, 2015

When the Supreme Court delivered its decision in Matrixx Initiatives, Inc. v. Siracusano, 131 S. Ct. 1309 (2011), a colleague, David Venderbush from Alston & Bird LLP, and I wrote a Washington Legal Foundation Legal Backgrounder, in which we predicted that plaintiffs’ counsel would distort the holding, and inflate the dicta of the opinion. Schachtman & Venderbush, “Matrixx Unbounded: High Court’s Ruling Needlessly Complicates Scientific Evidence Principles,” 26 (14) Legal Backgrounder (June 17, 2011)[1]. Our prediction was sadly all too accurate. Not only was the context of Matrixx distorted, but several district courts appeared to adopt its dicta on statistical significance as though they represented the holding of the case[2].

The Matrixx decision, along with the few district court opinions that had embraced its dicta[3], was urged as the basis for denying a defense challenge to the proffered testimony of Dr. Anick Bérard, a Canadian perinatal epidemiologist, in the Zoloft MDL. The trial court, however, correctly discerned several methodological shortcomings and failures, including Dr. Bérard’s reliance upon claims of statistical significance from studies that conducted dozens and hundreds of multiple comparisons. See In re Zoloft (Sertraline Hydrochloride) Prods. Liab. Litig., MDL No. 2342; 12-md-2342, 2014 U.S. Dist. LEXIS 87592; 2014 WL 2921648 (E.D. Pa. June 27, 2014) (Rufe, J.).
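
The concern about multiple comparisons is easily made concrete. A hypothetical back-of-the-envelope calculation (assuming, unrealistically, independent tests) shows how quickly nominal “significance” degenerates into noise when dozens or hundreds of associations are each tested at the conventional 5 percent level:

    # Hypothetical illustration of the family-wise error rate: the chance of
    # at least one nominally "significant" result by luck alone, when k
    # independent comparisons are each tested at alpha = 0.05. Real studies'
    # comparisons are correlated, so this is an idealized intuition pump,
    # not a reanalysis of any Zoloft study.
    alpha = 0.05
    for k in (1, 20, 100):
        family_wise = 1 - (1 - alpha) ** k
        print(f"{k:3d} comparisons -> P(at least one false positive) = {family_wise:.2f}")
    # prints 0.05, 0.64, and 0.99, respectively

With a hundred comparisons, at least one spuriously “significant” finding is a near certainty, which is why a witness’s cherry picking of nominally significant results from such studies is methodologically suspect.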

Plaintiffs (through their Plaintiffs’ Steering Committee (PSC) in the Zoloft MDL) were undaunted and moved for reconsideration, asserting that the MDL trial court had failed to give appropriate weight to the Supreme Court’s decision in Matrixx, and a Third Circuit decision in DeLuca v. Merrell Dow Pharms., Inc., 911 F.2d 941 (3d Cir. 1990). The MDL trial judge, however, deftly rebuffed the plaintiffs’ use of Matrixx, and their attempt to banish consideration of random error in the interpretation of epidemiologic studies. In re Zoloft (Sertraline Hydrochloride) Prods. Liab. Litig., MDL No. 2342; 12-md-2342, 2015 WL 314149 (E.D. Pa. Jan. 23, 2015) (Rufe, J.) (denying PSC’s motion for reconsideration).

In rejecting the motion for reconsideration, the Zoloft MDL trial judge noted that the PSC had previously cited Matrixx, and that the Court had addressed the case in its earlier ruling. 2015 WL 314149, at *2-3. The MDL Court then proceeded to expand upon its earlier ruling, and to explain how Matrixx was largely irrelevant to the Rule 702 context of Pfizer’s challenge to Dr. Bérard. There were, to be sure, some studies with nominally statistically significant results, for some birth defects, among children of mothers who took Zoloft in their first trimester of pregnancy. As Judge Rufe explained, statistical significance, or the lack thereof, was only one item in a fairly long list of methodological deficiencies in Dr. Bérard’s causation opinions:

“The [original] opinion set forth a detailed and multi-faceted rationale for finding Dr. Bérard’s testimony unreliable, including her inattention to the principles of replication and statistical significance, her use of certain principles and methods without demonstrating either that they are recognized by her scientific community or that they should otherwise be considered scientifically valid, the unreliability of conclusions drawn without adequate hypothesis testing, the unreliability of opinions supported by a ‛cherry-picked’ sub-set of research selected because it was supportive of her opinions (without adequately addressing non-supportive findings), and Dr. Bérard’s failure to reconcile her currently expressed opinions with her prior opinions and her published, peer-reviewed research. Taking into account all these factors, as well as others discussed in the Opinion, the Court found that Dr. Bérard departed from well-established epidemiological principles and methods, and that her opinion on human causation must be excluded.”

Id. at *1.

In citing the multiple deficiencies of the proffered expert witness, the Zoloft MDL Court thus put its decision well within the scope of the Third Circuit’s recent precedent of affirming the exclusion of Dr. Bennet Omalu, in Pritchard v. Dow Agro Sciences, 430 F. App’x 102, 104 (3d Cir. 2011). The Zoloft MDL Court further defended its ruling by pointing out that it had not created a legal standard requiring statistical significance, but rather had made a factual finding that epidemiologists, such as the challenged witness, Dr. Anick Bérard, would use some measure of statistical significance in reaching conclusions in their discipline. 2015 WL 314149, at *2[4].

On the plaintiffs’ motion for reconsideration, the Zoloft Court revisited the Matrixx case, properly distinguishing the case as a securities fraud case about materiality of non-disclosed information, not about causation. 2015 WL 314149, at *4. Although the MDL Court could and should have identified the Matrixx language as clearly obiter dicta, it did confidently distinguish the Supreme Court holding about pleading materiality from its own task of gatekeeping expert witness testimony on causation in a products liability case:

“Because the facts and procedural posture of the Zoloft MDL are so dissimilar from those presented in Matrixx, this Court reviewed but did not rely upon Matrixx in reaching its decision regarding Dr. Bérard. However, even accepting the PSC’s interpretation of Matrixx, the Court’s Opinion is consistent with that ruling, as the Court reviewed Dr. Bérard’s methodology as a whole, and did not apply a bright-line rule requiring statistically significant findings.”

Id. at *4.

In mounting their challenge to the MDL Court’s earlier ruling, the Zoloft plaintiffs asserted that the Court had failed to credit Dr. Bérard’s reliance upon what she called the “Rothman approach.” This approach, attributed to Professor Kenneth Rothman, had received some attention in the Bendectin litigation in the Third Circuit, where plaintiffs sought to be excused from their failure to show statistically significant associations when claiming causation between maternal use of Bendectin and infant birth defects. DeLuca v. Merrell Dow Pharms., Inc., 911 F.2d 941 (3d Cir. 1990). The Zoloft MDL Court pointed out that the Circuit, in DeLuca, had never affirmatively endorsed Professor Rothman’s “approach,” but had reversed and remanded the Bendectin case to the district court for a hearing under Rule 702:

“by directing such an overall evaluation, however, we do not mean to reject at this point Merrell Dow’s contention that a showing of a .05 level of statistical significance should be a threshold requirement for any statistical analysis concluding that Bendectin is a teratogen regardless of the presence of other indicial of reliability. That contention will need to be addressed on remand. The root issue it poses is what risk of what type of error the judicial system is willing to tolerate. This is not an easy issue to resolve and one possible resolution is a conclusion that the system should not tolerate any expert opinion rooted in statistical analysis where the results of the underlying studies are not significant at a .05 level.”

2015 WL 314149, at *4 (quoting from DeLuca, 911 F.2d at 955). After remand, the district court excluded the DeLuca plaintiffs’ expert witnesses, and granted summary judgment, based upon the dubious methods employed by plaintiffs’ expert witnesses in cherry picking data, recalculating risk ratios in published studies, and ignoring bias and confounding in studies. The Third Circuit affirmed the judgment for Merrell Dow. DeLuca v. Merrell Dow Pharms., Inc., 791 F. Supp. 1042 (D.N.J. 1992), aff’d, 6 F.3d 778 (3d Cir. 1993).

In the Zoloft MDL, the plaintiffs not only offered an erroneous interpretation of the Third Circuit’s precedents in DeLuca, they also failed to show that the “Rothman” approach had become generally accepted in over two decades since DeLuca. 2015 WL 314149, at *4. Indeed, the hearing record was quite muddled about what the “Rothman” approach involved, other than glib, vague suggestions that the approach would have countenanced Dr. Bérard’s selective, over-reaching analysis of the extant epidemiologic studies. The plaintiffs did not call Rothman as an expert witness; nor did they offer any of Rothman’s publications as exhibits at the Zoloft hearing. Although Professor Rothman has criticized the overemphasis upon p-values and significance testing, he has never suggested that researchers and scientists should ignore random error in interpreting research data. Nevertheless, plaintiffs attempted to invoke some vague notion of a Rothman approach that would ignore confidence intervals, attained significance probability, multiplicity, bias, and confounding. Ultimately, the MDL Court would have none of it. The Court held that the Rothman Approach (whatever that is), as applied by Dr. Bérard, did not satisfy Rule 702.

The testimony at the Rule 702 hearing on the so-called “Rothman approach” had been sketchy at best. Dr. Bérard protested, perhaps too much, when asked about her having ignored p-values:

“I’m not the only one saying that. It’s really the evolution of the thinking of the importance of statistical significance. One of my professors and also a friend of mine at Harvard, Ken Rothman, actually wrote on it – wrote on the topic. And in his book at the end he says obviously what I just said, validity should not be confused with precision, but the third bullet point, it’s saying that the lack of statistical significance does not invalidate results because sometimes you are in the context of rare events, few cases, few exposed cases, small sample size, exactly – you know even if you start with hundreds of thousands of pregnancies because you are looking at rare events and if you want to stratify by exposure category, well your stratum becomes smaller and smaller and your precision decreases. I’m not the only one saying that. Ken Rothman says it as well, so I’m not different from the others. And if you look at many of the studies published nowadays, they also discuss that as well.”

Notes of Testimony of Dr. Anick Bérard, at 76:21–77:14 (April 9, 2014). See also Notes of Testimony of Dr. Anick Bérard, at 211 (April 11, 2014) (discussing non-statistically significant findings as a “trend,” and asserting that the lack of a significant finding does not mean that there is “no effect”). Bérard’s invocation of Rothman here is accurate but unhelpful. Rothman and Bérard are not alone in insisting that confidence intervals provide a measure of precision of an estimate, and that we should be careful not to interpret the lack of significance to mean no effect. But the lack of significance cannot be used to interpret data to show an effect.
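
The distinction between precision and validity that Bérard invoked can be illustrated with a small, hypothetical calculation; the counts below are invented, not drawn from any Zoloft study. A risk ratio estimated from few exposed cases carries a wide 95 percent confidence interval. The interval measures precision, and its crossing 1.0 neither demonstrates an effect nor rules one out:

    import math

    # Hypothetical counts: 4 cases among 1,000 exposed; 20 among 10,000 unexposed.
    a, n1 = 4, 1_000
    b, n2 = 20, 10_000

    rr = (a / n1) / (b / n2)  # risk ratio point estimate = 2.0
    se_log_rr = math.sqrt(1/a - 1/n1 + 1/b - 1/n2)  # SE of ln(RR), large-sample approximation
    lower = math.exp(math.log(rr) - 1.96 * se_log_rr)
    upper = math.exp(math.log(rr) + 1.96 * se_log_rr)
    print(f"RR = {rr:.1f}, 95% CI ({lower:.2f}, {upper:.2f})")  # RR = 2.0, 95% CI (0.69, 5.84)

An interval running from 0.69 to 5.84 is simply too imprecise to support a conclusion either way; the data are compatible with a halving of risk and with a near-sextupling of it.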

At the Rule 702 hearing, the PSC tried to bolster Dr. Bérard’s supposed reliance upon the “Rothman approach” in cross-examining Pfizer’s expert witness, Dr. Stephen Kimmel:

“Q. You know who Dr. Rothman is, the epidemiologist?
A. Yes.
Q. You actually took a course from Dr. Rothman, didn’t you?
A. I did when I was a student way back.
Q. He is a well-known epidemiologist, isn’t he?
A. Yes, he is.
Q. He has published this book, Modern Epidemiology. Do you have a copy of this?
A. I do.
Q. Do you – Have you ever read it?
A. I read his earlier edition. I have not read the most recent edition.
Q. There’s two other authors, Sander Greenland and Tim Lash. Do you know either one of them?
A. I know Sander. I don’t know Tim.
Q. Dr. Rothman has some – he has written about confidence intervals and statistical significance for some time, hasn’t he?
A. He has.
Q. Do you agree with him that statistical significance is not a matter of validity. It’s a matter of precision?
A. It’s a matter of – well, confidence intervals are matters of precision. P-values are not.
Q. Okay. I want to put up a table and see if you are in agreement with Dr. Rothman. This is the third edition of Modern Epidemiology. And he has – and ignore my brother’s handwriting. But there is an hypothesized rate ratio under 10-3. It says: p-value function from which one can find all confidence limits for a hypothetical study with a rate ratio estimate of 3.1. Do you see that there?
A. Yes. I don’t see the top of the figure, not that it matters.
Q. I want to make sure. The way I understand this, he is giving us a hypothesis that we have a relative risk of 3.1 and it [presumably a 95% confidence interval] crosses 1, meaning it’s not statistically significant. Is that fair?
A. Well, if you are using a value of .05, yes. And again, if this is a single test and there’s a lot of things that go behind it. But, yes, so this is a total hypothetical.
Q. Yes.
A. I’m sorry. He’s saying here is a hypothetical based on math. And so here is – this is what we would propose.
Q. Yes, I want to highlight what he says about this figure and get your thoughts on it. He says:
The message of figure 10-3 is that the example data are more compatible with a moderate to strong association than with no association, assuming the statistical model used to construct the function is correct.
A. Yes.
Q. Would you agree with that statement?
A. Assuming the statistical model is correct. And the problem is, this is a hypothetical.
Q. Sure. So let’s just assume. So what this means to sort of put some meat on the bone, this means that although we cross 1 and therefore are statistically significant [sic, non-significant], he says the more likely truth here is that there is a moderate to strong effect rather than no effect?
A. Well, you know he has hypothesized this. This is not used in common methods practice in pharmacoepi. Dr. Rothman has lots of ideas but it’s not part of our standard scientific method.

Notes of Testimony of Dr. Stephen Kimmel, at 126:2 to 128:20.

Nothing very concrete about the “Rothman approach” is put before the MDL Court, either through Dr. Bérard or Dr. Kimmel. There are, however, other instructive aspects to the plaintiff’s counsel’s examination. First, the referenced portion of the text, Modern Epidemiology, is a discussion of p-value functions, not of p-values or of confidence intervals per se. Modern Epidemiology at 158-59 (3d ed. 2008). Dr. Bérard never discussed p-value functions in her report or in her testimony, and Dr. Kimmel testified, without contradiction, that such p-value functions are “not used in common methods practice.” Second, the plaintiff’s counsel never marked and offered the Rothman text as an exhibit for the MDL Court to consider. Third, the cross-examiner first asked about the implication for a hypothetical association, and then, when he wanted to “put some meat on the bone” changed the word used in Rothman’s text, “association,” to “effect.” The word “effect” does not appear in Rothman’s text at the referenced discussion about p-value functions. Fortunately, the MDL Court was not poisoned by the “meat on the bone.”
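
For readers curious about what a p-value function actually is, a brief sketch follows. Under the usual large-sample assumption that the log rate ratio is approximately normal, the function gives the two-sided p-value for every hypothesized rate ratio, tracing out all confidence intervals at once. The numbers are invented to mimic the flavor of Rothman’s Figure 10–3 (an estimate of 3.1 whose 95 percent interval barely crosses 1.0); they are not taken from the book:

    import math

    def norm_cdf(z: float) -> float:
        # Standard normal cumulative distribution function.
        return 0.5 * (1 + math.erf(z / math.sqrt(2)))

    def p_value_function(rr_hat: float, se_log: float, rr_null: float) -> float:
        # Two-sided p-value for the hypothesis that the true rate ratio is rr_null,
        # treating ln(RR) as approximately normal with standard error se_log.
        z = abs(math.log(rr_hat) - math.log(rr_null)) / se_log
        return 2 * (1 - norm_cdf(z))

    rr_hat, se_log = 3.1, 0.60  # invented estimate and standard error of ln(RR)
    for rr_null in (1.0, 2.0, 3.1, 6.0):
        print(f"H0: RR = {rr_null} -> p = {p_value_function(rr_hat, se_log, rr_null):.2f}")
    # H0: RR = 1.0 -> p = 0.06; RR = 2.0 -> p = 0.47; RR = 3.1 -> p = 1.00; RR = 6.0 -> p = 0.27

Read this way, Rothman’s point is modest: such example data are more compatible with rate ratios near 3 than with 1.0, even though the test against 1.0 misses the 0.05 level. None of that licenses calling a non-significant association an “effect.”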

The Pit and the Pendulum

Another document glibly referenced but not provided to the MDL Court was the publication of Sir Austin Bradford Hill’s presidential address to the Royal Society of Medicine on causation. The MDL Court acknowledged that the PSC had argued that the emphasis upon statistical significance was contrary to Hill’s work and teaching. 2015 WL 314149, at *5. In the Court’s words:

“the PSC argues that the Court’s finding regarding the importance of statistical significance in the field of epidemiology is inconsistent with the work of Bradford Hill. The PSC points to a 1965 address by Sir Austin Bradford Hill, which it has not previously presented to the Court, except in opening statements of the Daubert hearings.20 The PSC failed to put forth evidence establishing that Bradford Hill’s statement that ‛I wonder whether the pendulum has not swung too far [in requiring statistical significance before drawing conclusions]’ has, in the decades since that 1965 address, altered the importance of statistical significance to scientists in the field of epidemiology.”

Id. This failure, identified by the Court, is hardly surprising. The snippet of a quotation from Hill would not sustain the plaintiffs’ sweeping generalization. The quoted language in context may help to explain why Hill’s paper was not provided:

“I wonder whether the pendulum has not swung too far – not only with the attentive pupils but even with the statisticians themselves. To decline to draw conclusions without standard errors can surely be just as silly? Fortunately I believe we have not yet gone so far as our friends in the USA where, I am told, some editors of journals will return an article because tests of significance have not been applied. Yet there are innumerable situations in which they are totally unnecessary – because the difference is grotesquely obvious, because it is negligible, or because, whether it be formally significant or not, it is too small to be of any practical importance. What is worse the glitter of the t table diverts attention from the inadequacies of the fare. Only a tithe, and an unknown tithe, of the factory personnel volunteer for some procedure or interview, 20% of patients treated in some particular way are lost to sight, 30% of a randomly-drawn sample are never contacted. The sample may, indeed, be akin to that of the man who, according to Swift, ‘had a mind to sell his house and carried a piece of brick in his pocket, which he showed as a pattern to encourage purchasers.’ The writer, the editor and the reader are unmoved. The magic formulae are there.”

Austin Bradford Hill, “The Environment and Disease: Association or Causation?” 58 Proc. Royal Soc’y Med. 295, 299 (1965).

In the Zoloft cases, no expert witness was prepared to state that the disparity was “grotesquely obvious,” or “negligible.” And Bradford Hill’s larger point was that bias and confounding often dwarf considerations of random error, and that there are many instances in which significance testing is unavailing or unhelpful. And in some studies, with large “effect sizes,” statistical significance testing may be beside the point.

Hill’s presidential address to the Royal Society of Medicine commemorated his successes in epidemiology, and we need only turn to Hill’s own work to see how prevalent was his use of measurements of significance probability. See, e.g., Richard Doll & Austin Bradford Hill, “Smoking and Carcinoma of the Lung: Preliminary Report,” Brit. Med. J. 740 (Sept. 30, 1950); Medical Research Council, “Streptomycin Treatment of Pulmonary Tuberculosis,” Brit. Med. J. 769 (Oct. 30, 1948).

Considering the misdirection on Rothman and on Hill, the Zoloft MDL Court did an admirable job in unraveling the Matrixx trap set by counsel. The Court insisted upon parsing the Bradford Hill factors[5], over Pfizer’s objection, despite the plaintiffs’ failure to show “an association between two variables, perfectly clear-cut and beyond what we would care to attribute to the play of chance,” which Bradford Hill insisted was the prerequisite for the exploration of the nine factors he set out in his classic paper. Austin Bradford Hill, “The Environment and Disease: Association or Causation?” 58 Proc. Royal Soc’y Med. 295, 295 (1965). Given the outcome, the Court’s questionable indulgence of plaintiffs’ position was ultimately harmless.


[1] See also “The Matrixx – A Comedy of Errors”; “Matrixx Unloaded” (Mar. 29, 2011); “The Matrixx Oversold”; and “De-Zincing the Matrixx.”

[2] See “Siracusano Dicta Infects Daubert Decisions” (Sept. 22, 2012).

[3] See, e.g., In re Chantix (Varenicline) Prods. Liab. Litig., 2012 U.S. Dist. LEXIS 130144, at *22 (N.D. Ala. 2012); Cheek v. Wyeth Pharm. Inc., 2012 U.S. Dist. LEXIS 123485 (E.D. Pa. Aug. 30, 2012); In re Celexa & Lexapro Prods. Liab. Litig., ___ F. Supp. 2d ___, 2013 WL 791780 (E.D. Mo. 2013).

[4] The Court’s reasoning on this point begged the question whether an ordinary clinician, ignorant of the standards, requirements, and niceties of statistical reasoning and inference, would be allowed to testify, unconstrained by any principled epidemiologic reasoning about random or systematic error. It is hard to imagine that Rule 702 would countenance such an end-run around the requirements of sound science.

[5] Adhering to Bradford Hill’s own admonition might have saved the Court the confusion of describing statistical significance as a measure of strength of association. 2015 WL 314149, at *2.

More Antic Proposals for Expert Witness Testimony – Including My Own Antic Proposals

December 30th, 2014

The late Professor Margaret Berger epitomized a person you could like and even admire, while finding many of her ideas erroneous, incoherent, and even dangerous. Berger was frequently on the losing side of expert witness admissibility issues, and she fell under the influence of the plaintiffs’ bar, holding conferences with their walking-around money, laundered through SKAPP, The Project on Scientific Knowledge and Public Policy.[1] In appellate cases, Berger often lent the credibility of her scholarship to support plaintiffs’ efforts to strip away admissibility criteria for expert witness causation opinion.[2] Still, she was always polite and respectful in debate. When Judge Weinstein appointed her to chair a committee to search for appropriate court-appointed expert witnesses in the silicone gel breast implant litigation, Professor Berger proved a careful, impartial listener to all the parties involved.

In 2009, before the publication of the Third Edition of the Reference Manual on Scientific Evidence, Professor Berger gave a presentation for an American Law Institute continuing legal education program, in which she aired her antipathy toward gatekeeping.[3] With her sights set primarily on defense expert witnesses, Berger opined that a monetary relationship between an expert witness and the defendant could be grounds for a Rule 702 exclusion. While the jingle of coin doth soothe the hurt that conscience must feel (for some expert witnesses), the focus of the Rule 702 inquiry is properly on relevance, reliability, and validity. Judge Shira Scheindlin, who sat on the same panel as Professor Berger, diplomatically pointed out that employee expert witnesses are offered all the time, and any bias is a subject for cross-examination, not disqualification. Remarkably, neither Professor Berger nor Judge Scheindlin acknowledged that conflicts of interest, actual or potential, are not relevant to the Daubert or Rule 702 factors that guide admissibility. If Berger’s radical position of identifying conflict of interest with unreliability were correct, we might dismiss her views without any consideration[4], given her conflicts of interest from her association with SKAPP, and her several amicus briefs filed on behalf of plaintiffs, seeking to avoid the exacting requirements of expert witness evidence gatekeeping.

In her ALI-CLE lecture, Professor Berger waxed enthusiastically about what was then a recent federal trial court decision in Allen v. Martin Surfacing, 263 F.R.D. 47 (D. Mass. 2009). Berger asserted that the case was unpublished and that the case, like many other “Daubert” cases, was hidden from view. Berger thought that Allen’s obscurity was unfortunate because the decision was “fabulous” and was based upon astute opinions of “outstanding” experts[5]. Berger was wrong on every point, from the chemical involved, to the unavailability of the opinion, to the quality of the expert witnesses (who were not ALS experts, but frequent, willing testifiers), and to the carefulness of the exposure and causation opinions offered.[6] See James L. Bernat & Richard Beresford, Ethical and Legal Issues in Neurology 59-60 (Amsterdam 2013) (discussing Allen and characterizing the court’s decision to admit plaintiffs’ expert witnesses’ opinions as based upon plausibility without more).

Implicit in Berger’s errors, however, may be the beginnings of some concrete suggestions for improving the gatekeeping process. After all, Berger thought that no one would likely find and read the Allen decision. She may thus have believed that she had some freedom from scrutiny when she praised the decision and the expert witnesses involved. Just as there is a groundswell of support for greater disclosure of underlying data to accompany scientific publications, there should be support for wide dissemination of the underlying materials behind Rule 702 opinions. Most judges cannot or will not write sufficiently comprehensive opinions describing and supporting their decisions to admit or exclude expert witness opinion to permit vigorous public scrutiny. Some judges fail to cite to the underlying studies or data that are the bases of the challenged opinions. As a result, the “Daubert” scholarship suffers because it frequently lacks access to the actual reports, testimony, studies, and data themselves. Often the methodological flaws discussed in judicial opinions are just the tip of the iceberg, with flaws running all the way to the bottom.

And while I am on “antic proposals” of my own, courts should consider requiring all parties to file proposed findings of fact and conclusions of law, with record cites, to support their litigation positions. Lawyers on both sides of the “v.” have proven themselves cavalier and careless in their descriptions and characterizations of scientific evidence, inference, and analysis. Proposed findings would permit reviewing courts, scientists, and scholars to identify errors for the benefit of appellate courts and later trial courts.



[1] SKAPP claimed to have aimed at promoting transparent decision making, but deceived the public with its disclosure of having been supported by the “Common Benefit Trust, a fund established pursuant to a court order in the Silicone Gel Breast Implant Products Liability litigation.” Somehow SKAPP forgot to disclose that this court order simply created a common-benefit fund for plaintiffs’ lawyers to pursue their litigation goals. How money from the silicone gel breast implant MDL was diverted to advocate anti-Daubert policies is a mystery that no amount of transparent decision making has to date uncovered. Fortunately, for the commonweal, SKAPP appears to have been dissolved. The SKAPP website lists those who guided and supported SKAPP’s attempts to subvert expert witness validity requirements; not surprisingly, the SKAPP supporters were mostly plaintiffs’ expert witnesses:

Eula Bingham, PhD
Les Boden, PhD
Richard Clapp, DSc, MPH
Polly Hoppin, ScD
Sheldon Krimsky, PhD
David Michaels, PhD, MPH
David Ozonoff, MD, MPH
Anthony Robbins, MD, MPA

[2] See, e.g., Parker v. Mobil Oil Corp., N.Y. Ct. App., Brief Amicus Curiae of Profs. Margaret A. Berger, Edward J. Imwinkelried, Sheila Jasanoff, and Stephen A. Saltzburg (July 28, 2006) (represented by Anthony Z. Roisman, of the National Legal Scholars Law Firm).

[3] Berger, “Evidence, Procedure, and Trial Update: How You Can Win (Or Lose) Your Case (Expert Witnesses, Sanctions, Spoliation, Daubert, and More)” (Mar. 27, 2009).

[4] We can see this position carried to its natural, probable, and extreme endpoint in Elizabeth Laposata, Richard Barnes, and Stanton Glantz, “Tobacco Industry Influence on the American Law Institute’s Restatements of Torts and Implications for Its Conflict of Interest Policies,” 98 Iowa Law Rev. 1 (2012), where the sanctimonious authors, all anti-tobacco advocates, criticize the American Law Institute for permitting the participation of lawyers who represent the tobacco industry. The authors fail to recognize that ALI members include lawyers representing plaintiffs in tobacco litigation, and that it is possible, contrary to their ideological worldview, to discuss and debate an issue without reference to ad hominem “conflicts” issues. The authors might be surprised by the degree to which the plaintiffs’ bar has lobbied (successfully) for many provisions in various Restatements.

[5] Including Richard Clapp, who served as an advisor to SKAPP, which lavished money on Professor Berger’s conferences.

[6] See “Bad Gatekeeping or Missed Opportunity – Allen v. Martin Surfacing” (Nov. 30, 2012); “Gatekeeping in Allen v. Martin Surfacing — Postscript” (April 11, 2013).

New Standard for Scientific Evidence – The Mob

December 27th, 2014

A few years ago, a law student published a Note that argued for the dismantling of judicial gatekeeping. Note, “Admitting Doubt: A New Standard for Scientific Evidence,” 123 Harvard Law Review 2021 (2010). The anonymous Harvard law student asserted that juries are at least as good, if not better, at handling technical questions than are “gatekeeping” federal trial judges. The empirical evidence for such a suggestion is slim, and ignores the geographic variability in jury pools.

To be sure, some jurors have much greater scientific and mathematical aptitude than some judges, but the law student’s run at Rule 702 ignores some important institutional differences between judges and juries, including that judicial errors are subject to scrutiny and review, and public comment based upon written judicial opinions. Most judges have 20 years of schooling and 10 years of job experience, which should account for some superiority.

Misplaced Sovereignty

Another student this year has published a much more sophisticated version of the Harvard student’s Note, an antic proposal with a similar policy agenda that would overthrow the regime of judicial scrutiny and gatekeeping of expert witness opinion testimony. Krista M. Pikus, “We the People: Juries, Not Judges, Should be the Gatekeepers of Expert Evidence,” 90 Notre Dame L. Rev. 453 (2014). This more recent publication, while conceding that judges may be no better than juries at evaluating scientific evidence, asserts that jury involvement is required by a political commitment to popular sovereignty. Ms. Pikus begins with the simplistic notion that:

“[o]ur system of government is based on the idea that the people are sovereign.”

Id. at 470. Since juries are made up of people, jury determinations are required to implement popular sovereignty.

This notion of sovereignty is really quite foreign to our Constitution and our system of government. “We, the People” generally do not make laws or apply them, with the exception of grand and petit jury factual determinations. The vast legislative and decision-making processes are entrusted to Congress, the Executive, and the ever-expanding system of administrative agencies. The Constitution was indeed motivated to prevent governmental tyranny, but mob rule was not an acceptable alternative. For the founders, juries were a bulwark of liberty, and a shield against an overbearing Crown. Jurors were white men who owned property.

Pikus argues that judges somehow lack the political authority to serve as fact finders because they are not elected, but in some states judges are elected, and in other states and in the federal system, judges are appointed and confirmed by elected officials. Juries are, of course, not elected, and with many jurisdictions permitting juries of six persons or fewer, juries are hardly representative of the “popular sovereign.” The systematic exclusion of intelligent and well-educated jurors by plaintiffs’ counsel, along with the aversion to jury service by self-employed and busy citizens, helps ensure that juries fail to represent a fair cross-section of the population. Curiously, Pikus asserts that the “right to a trial by one’s peers is an integral part of our legal system,” but the peerage concept is nowhere in the Constitution. If it were, defendants in complicated tort cases might well have a right to juries composed of scientists or engineers.

The Right to Trial by Jury

There is, of course, a federal constitutional right to trial by jury, guaranteed by the Seventh Amendment:

“In Suits at common law … the right of trial by jury shall be preserved, and no fact tried by a jury shall be otherwise reexamined in any Court of the United States, than according to the rules of the common law.”

A strict textualist might hold that federal courts could dispense with juries in cases brought under statutory tort legislation, such as the New Jersey Products Liability Act, or for claims or defenses that were not available at the time the Seventh Amendment was enacted. Even the textualist might hold that the change in complexity of fact-finding endeavors, over two centuries, might mean that both the language and the spirit of the Seventh Amendment point away from maintaining the jury in cases of sufficient complexity.

Judges versus Juries

The fact is that judges and juries can, and do, act tyrannically, in deciding factual issues, including scientific and technical issues. Ms. Pikus would push the entire responsibility for ensuring accuracy in scientific fact finding to the least reviewable entity, the petit jury. Juries can ignore facts, decide emotively or irrationally, without fear of personal scrutiny or criticism. Pikus worries that judges “insert their policy opinions into their decisions,” which they have been known to do, but she fails to explain why we should tolerate the same from unelected, unreviewable juries. Id. at 472.

Inconsistently, Pikus acknowledges that “many can agree that some cases might be better suited for a judge instead of a jury,” such as “patent, bankruptcy, or tax” cases that “typically require additional expertise.” Id. at 471 & n. 185. To this list, we could add family law, probate, and equity matters, but the real question is what it is about a tax case that makes it more intractable for a jury than a products case. State power is much more likely to be abused or at issue in a tax case than in a modern products liability case, with a greater need for a “bulwark of liberty.” And the products liability case is much more likely to require scientific and technical expertise than a tax case.

The law of evidence, in federal and in most state courts, permits expert witnesses to present conclusory opinions, without having to account for the methodological correctness of their relied-upon studies, data, and analyses. Jurors, who are poorly paid, and pulled away from their occupations and professions, do not have the aptitude, patience, time, or interest to explore the full range of inferences and analyses performed by expert witnesses. Without some form of gatekeeping, trial outcomes are reduced to juror assessment of weak, inaccurate proxies for scientific determinations.

Pikus claims that juries, and only juries, should assess the reliability of an expert witness’s testimony. Id. at 455. As incompetent as some judges may be in adjudicating scientific issues, their errors are on display for all to see, whereas the jury’s determinations are opaque and devoid of public explanation. Judges can be singled out for technical competency, with appropriate case assignments, and they can be required to participate in professional legal education, including training in statistics, epidemiology, toxicology, genetics, and other subjects. It is difficult to imagine a world in which the jurors are sent home with the Reference Manual on Scientific Evidence, before being allowed to sit on a case. Nor is it feasible to have lay jurors serve on an extended trial that includes a close assessment of the expert witnesses’ opinions, as well as all the facts and data underlying those opinions.

Pikus criticizes the so-called “Daubert” regime as a manifestation of judicial activism, but she ignores that Daubert has been subsumed into an Act of Congress, in the form of a revised and expanded Federal Rules of Evidence.

In the end, this Note, like so much of the anti-Daubert law review literature is a complaint against removing popular, political, and emotive fact finding from technical and scientific issues in litigation. To the critics, science has no criteria of validity which the law is bound to respect. And yet, as John Adams argued before the Revolution:

“Facts are stubborn things; and whatever may be our wishes, our inclinations, or the dictates of our passion, they cannot alter the state of facts and evidence[1].”

Due process requires more than the enemies of Daubert would allow.


[1] John Adams, “Argument in Defense of the Soldiers in the Boston Massacre Trials” (Dec. 1770).

Showing Causation in the Absence of Controlled Studies

December 17th, 2014

The Federal Judicial Center’s Reference Manual on Scientific Evidence has avoided any clear, consistent guidance on the issue of case reports. The Second Edition waffled:

“Case reports lack controls and thus do not provide as much information as controlled epidemiological studies do. However, case reports are often all that is available on a particular subject because they usually do not require substantial, if any, funding to accomplish, and human exposure may be rare and difficult to study. Causal attribution based on case studies must be regarded with caution. However, such studies may be carefully considered in light of other information available, including toxicological data.”

F.J.C. Reference Manual on Scientific Evidence at 474-75 (2d ed. 2000). Note the complete lack of discussion of base-line risk, prevalence of exposure, and external validity of the “toxicological data.”

The second edition’s more analytically acute and rigorous chapter on statistics generally acknowledged the unreliability of anecdotal evidence of causation. See David Kaye & David Freedman, “Reference Guide on Statistics,” in F.J.C. Reference Manual on Scientific Evidence 91 – 92 (2d ed. 2000).

The Third Edition of the Reference Manual is even less coherent. Professor Berger’s introductory chapter[1] begrudgingly acknowledges, without approval, that:

“[s]ome courts have explicitly stated that certain types of evidence proffered to prove causation have no probative value and therefore cannot be reliable.59

The chapter on statistical evidence, which had been relatively clear in the second edition, now states that controlled studies may be better but case reports can be helpful:

“When causation is the issue, anecdotal evidence can be brought to bear. So can observational studies or controlled experiments. Anecdotal reports may be of value, but they are ordinarily more helpful in generating lines of inquiry than in proving causation.14

Reference Manual at 217 (3d ed. 2011). The “ordinarily” is given no context or contour for readers. These authors fail to provide any guidance on what will come from anecdotal evidence, or on when and why anecdotal reports may do more than merely generate “lines of inquiry.”

In Matrixx Initiatives Inc. v. Siracusano, 131 S. Ct. 1309 (2011), the Supreme Court went out of its way, way out of its way, to suggest that statistical significance was not always necessary to support conclusions of causation in medicine. Id. at 1319. The Court cited three Circuit court decisions to support its suggestion, but two of the three involved specific causation inferences from so-called differential etiologies. General causation was assumed in those two cases, and not at issue[2]. The third case, the notorious Wells v. Ortho Pharmaceutical Corp., 788 F. 2d 741, 744–745 (11th Cir. 1986), was also cited in support of the suggestion that statistical significance was not necessary, but in Wells, the plaintiffs’ expert witnesses actually relied upon studies that claimed at least nominal statistical significance. Wells was and remains representative of what results when trial judges ignore the constraints of study validity. The Supreme Court, in any event, abjured any intent to specify “whether the expert testimony was properly admitted in those cases [Wells and others],” and the Court made no “attempt to define here what constitutes reliable evidence of causation.” 131 S. Ct. at 1319.

The causal claim in Siracusano involved anosmia, loss of the sense of smell, from the use of Zicam, zinc gluconate. The case arose from a motion to dismiss the complaint; no evidence was ever presented or admitted. No baseline risk of anosmia was pleaded; nor did plaintiffs allege that any controlled study demonstrated an increased risk of anosmia from nasal instillation of zinc gluconate. There were, however, clinical trials conducted in the 1930s, with zinc sulfate for poliomyelitis prophylaxis, which showed a substantial incidence of anosmia in the treated children[3]. Matrixx tried to argue that this evidence was unreliable, in part because it involved a different compound, but this argument (1) in turn demonstrated a factual issue that required discovery and perhaps a trial, and (2) traded on a clear error in asserting that the zinc in zinc sulfate and zinc gluconate were different, when in fact they are both ionic compounds that result in zinc ion exposure, as the active constituent.

The position stridently staked out in Matrixx Initiatives is not uncommon among defense counsel in tort cases. Certainly, similar, unqualified statements, rejecting the use of case reports for supporting causal conclusions, can be found in the medical literature[4].

When the disease outcome has an expected value, a baseline rate, in the exposed population, then case reports simply confirm what we already know: cases of the disease happen in people regardless of their exposure status. For this reason, medical societies, such as the Teratology Society, have issued guidances that generally downplay or dismiss the role that case reports may have in the assessment and determination of causality for birth defects:

“5. A single case report by itself is not evidence of a causal relationship between an exposure and an outcome.  Combinations of both exposures and adverse developmental outcomes frequently occur by chance. Common exposures and developmental abnormalities often occur together when there is no causal link at all. Multiple case reports may be appropriate as evidence of causation if the exposures and outcomes are both well-defined and low in incidence in the general population. The use of multiple case reports as evidence of causation is analogous to the use of historical population controls: the co-occurrence of thalidomide ingestion in pregnancy and phocomelia in the offspring was evidence of causation because both thalidomide use and phocomelia were highly unusual in the population prior to the period of interest. Given how common exposures may be, and how common adverse pregnancy outcome is, reliance on multiple case reports as the sole evidence for causation is unsatisfactory.”

The Public Affairs Committee of the Teratology Society, “Teratology Society Public Affairs Committee Position Paper Causation in Teratology-Related Litigation,” 73 Birth Defects Research (Part A) 421, 423 (2005).
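
The Committee’s point about chance co-occurrence is a matter of simple arithmetic. With invented but plausible round numbers, the expected count of births that are both exposed and affected, with no causal link whatever, is substantial:

    # Back-of-the-envelope arithmetic with illustrative round numbers; the
    # rates are assumptions for the sketch, not data from any study.
    births = 4_000_000   # annual births in a large country
    p_exposed = 0.05     # share of pregnancies with some common exposure
    p_defect = 0.03      # baseline prevalence of major birth defects

    expected = births * p_exposed * p_defect  # assumes independence, i.e., no causation
    print(f"{expected:,.0f} exposed-and-affected births expected by chance")  # 6,000

Six thousand coincidental co-occurrences a year, on the assumption of no effect at all, is why case reports of common exposures and common outcomes carry so little evidential weight; thalidomide and phocomelia were probative precisely because both were vanishingly rare.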

When the base rate for the outcome is near zero, and other circumstantial evidence is present, some commentators insist that causality may be inferred from well-documented case reports:

“However, we propose that some adverse drug reactions are so convincing, even without traditional chronological causal criteria such as challenge tests, that a well documented anecdotal report can provide convincing evidence of a causal association and further verification is not needed.”

Jeffrey K. Aronson & Manfred Hauben, “Drug safety: Anecdotes that provide definitive evidence,” 333 Brit. Med. J. 1267, 1267 (2006) (Dr. Hauben was medical director of risk management strategy for Pfizer, in New York, at the time of publication). But which ones are convincing, and why?

        *        *        *        *        *        *        *        *        *

Dr. David Schwartz, in a recent blog post, picked up on some of my discussion of the gadolinium case reports (see here and there), and posited the ultimate question: when are case reports sufficient to show causation? David Schwartz, “8 Examples of Causal Inference Without Data from Controlled Studies” (Dec. 14, 2014).

Dr. Schwartz discusses several causal claims, all of which gave rise to litigation at some point, in which case reports or case series played an important, if not dispositive, role:

  1.      Gadolinium-based contrast agents and NSF
  2.      Amphibole asbestos and malignant mesothelioma
  3.      Ionizing radiation and multiple cancers
  4.      Thalidomide and teratogenicity
  5.      Rezulin and acute liver failure
  6.      DES and clear cell vaginal adenocarcinoma
  7.      Vinyl chloride and angiosarcoma
  8.      Manganese exposure and manganism

Dr. Schwartz’s discussion is well worth reading in its entirety, but I wanted to emphasize some of Dr. Schwartz’s caveats. Most of the exposures are rare, as are the outcomes. In some cases, the outcomes occur almost exclusively with the identified exposures. All eight examples pose some danger of misinterpretation. Gadolinium-based contrast agents appear to create a risk of NSF only in the presence of chronic renal failure. Amphibole asbestos, and most importantly, crocidolite causes malignant mesothelioma after a very lengthy latency period. Ionizing radiation causes some cancers that are all-too common, but the presence of multiple cancers in the same person, after a suitable latency period, is distinctly uncommon, as is the level of radiation needed to overwhelm bodily defenses and induce cancers. Thalidomide was associated by case reports fairly quickly with phocomelia, which has an extremely low baseline risk. Other birth defects were not convincingly demonstrated by the case series. Rezulin, an oral antidiabetic medication, was undoubtedly causally responsible for rare cases of acute liver failure. Chronic liver disease, however, which is common among type 2 diabetic patients, required epidemiologic evidence, which never materialized[5].

Manganese, by definition, is the cause of manganism, but extremely high levels of manganese exposure, and specific speciation of the manganese, are essential to the causal connection. Manganism raises another issue often seen in so-called signature diseases: diagnostic accuracy. Unless the diagnostic criteria have perfect (100%) specificity, with no false-positive diagnoses, we should expect false-positive cases to appear whenever the criteria are applied to large numbers of people. In the welding fume litigation, where plaintiffs’ counsel and physicians engaged in widespread, if not wanton, medico-legal screenings, it was not surprising that they might find occasional cases that appeared to satisfy their criteria. Of course, the more the criteria are diluted to accommodate litigation goals, the more likely there will be false positive cases.[6]
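
The arithmetic of screening makes the point starkly. Even a diagnostic protocol with high specificity, applied to a large screened population in which the true disease is absent or vanishingly rare, will manufacture apparent cases. The numbers below are illustrative assumptions, not data from the welding fume litigation:

    # Illustrative screening arithmetic with assumed numbers.
    screened = 10_000    # workers examined in a mass litigation screening
    prevalence = 0.0     # assume no true manganism among them
    specificity = 0.98   # 2% false-positive rate per examination

    false_positives = screened * (1 - prevalence) * (1 - specificity)
    print(f"{false_positives:.0f} false-positive 'cases' expected")  # 200

Relax the diagnostic criteria, and the specificity falls, and the “case” count climbs, with no change whatever in the underlying biology.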

Dr. Schwartz identifies some common themes and important factors in identifying the bases for inferring causality from uncontrolled evidence:

“(a) low or no background rate of the disease condition;

(b) low background rate of the exposure;

(c) a clear understanding of the mechanism of action.”

These factors, and perhaps others, should not be mistaken for strict criteria. The exemplar cases suggest a family resemblance of overlapping factors that help support the inference, even against the most robust skepticism.

In litigation, defense counsel typically argue that analytical epidemiology is always necessary, and plaintiffs’ counsel claim epidemiology is never needed. The truth is more nuanced and conditional, but the great majority of litigated cases do require epidemiology for health effects because the claimed harms are outcomes that have an expected incidence or prevalence in the exposed population irrespective of exposure.


[1] Reference Manual on Scientific Evidence at 23 (3d ed. 2011) (citing “Cloud v. Pfizer Inc., 198 F. Supp. 2d 1118, 1133 (D. Ariz. 2001) (stating that case reports were merely compilations of occurrences and have been rejected as reliable scientific evidence supporting an expert opinion that Daubert requires); Haggerty v. Upjohn Co., 950 F. Supp. 1160, 1164 (S.D. Fla. 1996), aff’d, 158 F.3d 588 (11th Cir. 1998) (“scientifically valid cause and effect determinations depend on controlled clinical trials and epidemiological studies”); Wade-Greaux v. Whitehall Labs., Inc., 874 F. Supp. 1441, 1454 (D.V.I. 1994), aff’d, 46 F.3d 1120 (3d Cir. 1994) (stating there is a need for consistent epidemiological studies showing statistically significant increased risks).”)

[2] Best v. Lowe’s Home Centers, Inc., 563 F. 3d 171, 178 (6th Cir 2009); Westberry v. Gislaved Gummi AB, 178 F. 3d 257, 263–264 (4th Cir. 1999).

[3] There may have been a better argument for Matrixx in distinguishing the method and place of delivery of the zinc sulfate in the polio trials of the 1930s, but when Matrixx’s counsel was challenged at oral argument, he asserted simply, and wrongly, that the two compounds were different.

[4] Johnston & Hauser, “The value of a case report,” 62 Ann. Neurology A11 (2007) (“No matter how compelling a vignette may seem, one must always be concerned about the reliability of inference from an “n of one.” No statistics are possible in case reports. Inference is entirely dependent, then, on subjective judgment. For a case meant to suggest that agent A leads to event B, the association of these two occurrences in the case must be compared to the likelihood that the two conditions could co-occur by chance alone …. Such a subjective judgment is further complicated by the fact that case reports are selected from a vast universe of cases.”); David A. Grimes & Kenneth F. Schulz, “Descriptive studies: what they can and cannot do,” 359 Lancet 145, 145, 148 (2002) (“A frequent error in reports of descriptive studies is overstepping the data: studies without a comparison group allow no inferences to be drawn about associations, causal or otherwise.”) (“Common pitfalls of descriptive reports include an absence of a clear, specific, and reproducible case definition, and interpretations that overstep the data. Studies without a comparison group do not allow conclusions about cause and disease.”); Troyen A. Brennan, “Untangling Causation Issues in Law and Medicine: Hazardous Substance Litigation,” 107 Ann. Intern. Med. 741, 746 (1987) (recommending that testifying physicians “[a]void anecdotal evidence; clearly state the opposing side is relying on anecdotal evidence and why that is not good science.”).

[5] See In re Rezulin, 2004 WL 2884327, at *3 (S.D.N.Y. 2004).

[6] This gaming of diagnostic criteria has been a major invitation to diagnostic invalidity in litigation over asbestosis and silicosis in the United States.

More Case Report Mischief in the Gadolinium Litigation

November 28th, 2014

The Decker case is a curious decision, by both the MDL trial court and the Sixth Circuit. Decker v. GE Healthcare Inc., ___ F.3d ___, 2014 FED App. 0258P, 2014 U.S. App. LEXIS 20049 (6th Cir. Oct. 20, 2014). First, the Circuit went out of its way to emphasize that the trial court had discretion, not only in evaluating the evidence on a Rule 702 challenge, but also in devising the criteria of validity[1]. Second, the courts ignored the role and the weight being assigned to Federal Rule of Evidence 703, in winnowing the materials upon which the defense expert witnesses could rely. Third, the Circuit approved what appeared to be extremely asymmetric gatekeeping of plaintiffs’ and defendant’s expert witnesses. The asymmetrical standards probably were the basis for emphasizing the breadth of the trial court’s discretion to devise the criteria for assessing scientific validity[2].

In barring GEHC’s expert witnesses from testifying about gadolinium-naïve nephrogenic systemic fibrosis (NSF) cases, Judge Dan Polster, the MDL judge, appeared to invoke a double standard. Plaintiffs could adduce any case report or adverse event report (AER) on the theory that the reports were relevant to “notice” of a “safety signal” between gadolinium-based contrast agents in MRI and NSF. Defendants’ expert witnesses, however, were held to the most exacting standards of clinical identity with the plaintiff’s particular presentation of NSF, biopsy-proven presence of Gd in affected tissue, and documentation of lack of GBCA exposure, before case reports would be permitted as reliance materials to support the existence of gadolinium-naïve NSF.

A fourth issue with the Decker opinion is the latitude it permitted the district court in allowing plaintiffs’ pharmacovigilance expert witness, Cheryl Blume, Ph.D., to testify, over objections, about the “signal” created by the NSF AERs available to GEHC. Decker at *11. At the same trial, the MDL judge prohibited GEHC’s expert witness, Dr. Anthony Gaspari, from testifying that the AERs described by Blume did not support a clinical diagnosis of NSF.

On a motion for reconsideration, Judge Polster reaffirmed his ruling on grounds that

(1) the AERs were too incomplete to rule in or rule out a diagnosis of NSF, although they were sufficient to create a “signal”;

(2) whether the AERs were actual cases of NSF was not relevant to their being safety signals;

(3) Dr. Gaspari was not an expert in pharmacovigilance, which studied “signals” as opposed to causation; and

(4) Dr. Gaspari’s conclusion that the AERs were not NSF was made without reviewing all the information available to GEHC at the time of the AERs.

Decker at *12.

The fallacy of this stingy approach to Dr. Gaspari’s testimony lies in the courts’ stubborn refusal to recognize that if an AER was not, as a matter of medical science, a case of NSF, then it could not be a “signal” of a possible causal relationship between GBCA and NSF. Pharmacovigilance does not end with ascertaining signals; yet the courts privileged Blume’s opinions on signals even though she could not proceed to the next step and evaluate diagnostic accuracy and causality. This twisted logic makes a mockery of pharmacovigilance. It also led to the exclusion of Dr. Gaspari’s testimony on a key aspect of plaintiffs’ liability evidence.
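
The opinions do not disclose how Dr. Blume quantified a “signal,” but a standard pharmacovigilance screen, the proportional reporting ratio (PRR), illustrates why signal detection cannot be the end of the inquiry. The counts below are invented for the sketch:

    # A proportional reporting ratio (PRR) sketch with invented counts; the
    # record does not reveal what disproportionality method, if any, was used.
    a = 30       # NSF reports for the drug of interest
    b = 970      # all other adverse event reports for that drug
    c = 60       # NSF reports for all other drugs
    d = 98_940   # all other reports for all other drugs

    prr = (a / (a + b)) / (c / (c + d))
    print(f"PRR = {prr:.1f}")  # about 49.5: a screening signal, nothing more

A large PRR flags a reporting disproportion worth investigating; it says nothing about whether the underlying reports are even genuine cases of the disease, which is exactly why excluding Dr. Gaspari’s diagnostic testimony left the jury with only half the science.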

The erroneous approach pioneered by Judge Polster was compounded by the district court’s refusal to give a jury instruction that AERs were only relevant to notice, and not to causation. Judge Polster offered his reasoning that “the instruction singles out one type of evidence, and adds, rather than minimizes, confusion.” Judge Polster cited the lack of any expert witness testimony that suggested that AERs showed causation and “besides, it doesn’t matter because those patients are not, are not the plaintiffs.” Decker at *17.

The lack of dispute about the meaning of AERs would have seemed all the more reason to control jury speculation about their import, and to give a binding instruction on AERs and their limited significance. As for the AER patients’ not being the plaintiffs, well, the case report patients were not the plaintiffs, either. This last reason is not even wrong[3]. The Circuit, in affirming, turned a blind eye to the district court’s exercise of discretion in a way that systematically increased the importance of Blume’s testimony on signals, while systematically hobbling the defendant’s expert witnesses.


[1] “The Standard of Appellate Review for Rule 702 Decisions” (Nov. 12, 2014).

[2] “Gadolinium, Nephrogenic Systemic Fibrosis, and Case Reports” (Nov. 24, 2014).

[3] “Das ist nicht nur nicht richtig, es ist nicht einmal falsch!” (“That is not only not right; it is not even wrong!”) The quote is attributed to Wolfgang Pauli in R. E. Peierls, “Wolfgang Ernst Pauli, 1900-1958,” 5 Biographical Memoirs Fellows Royal Soc’y 175, 186 (1960).


History – Lies My Teacher Told Me

November 26th, 2014

James W. Loewen, a professor of history, has been one of the most untiring critics of how history is taught and practiced in the United States. A large part of his criticism derives from the overt politicization of the teaching of history, especially the heavy hand of school boards and textbook committees in their selection of “appropriate textbooks” for high school students. See James W. Loewen, Lies My Teacher Told Me: Everything Your American History Textbook Got Wrong (2007). The disgraceful “conservative” sanitizing of United States high schoolers’ history textbooks is almost equal to the heavy-handed Marxist bent of some university professors. The politicization of history may be unavoidable, but we should be alert to the intellectual depredations from the right and the left.

*     *     *     *     *     *

I recently saw the self-styled social history of silicosis, Deadly Dust, by David Rosner and Gerald Markowitz, cited in a trial court brief. The cite was to the original edition, but it led me to read the “new and expanded” edition[1], published in 2006. Expanded, but not exactly super-sized, and with the same empty calories as before. The authors’ Preface to the second edition conveys an air of excitement about recent (at the time of publication) media suggestions that silicosis might be the “new asbestosis.”[2] Of course, the authors were excited because the uptick in silicosis litigation around 2003, based almost exclusively upon fraudulent filings, brought them engagements as compensated expert witnesses for plaintiffs’ counsel.

The Preface also confesses that before their initial edition, the authors were ignorant about silicosis. And because they are so well read, they assumed that their not having heard of silicosis meant that silicosis must have disappeared from the literature. Id. at xiii. This fallacious confusion between absence of evidence and evidence of absence pervades the entire book. Their first edition was written with this confirmation bias dominating their narrative:

“The book we wrote tells the story of a condition that dominated public health, medical, labor, and popular discourse on disease in the 1930s but that virtually vanished from popular and professional consciousness after World War II. How, we asked, could a chronic disease that took decades to develop and that was assumed to affect hundreds of thousands of American workers disappear from the literature and public notice in less than a decade? This question is the basis for Deadly Dust, and we believe that we answered it, providing a cultural, medical, and political model of how we, as a society, decide to recognize or forget about illness.”

Deadly Dust at xiv (emphasis added). The second edition is more of the same biased narrative.

Also clear from their Preface is the authors’ messianic complex. I now know why they have repeatedly attacked me for having criticized them: it is important for them to be seen as victims who resisted industry, and, better yet, as victims who triumphed:

“We are particularly proud that lawyers for various industries have sought to get judges to exclude our book from court cases.”

Id. at xvi. Of course, from the lawyers’ perspective, a book such as Deadly Dust presents many evidentiary problems, running from authentication of documents, to multiple layers of hearsay, to legal and logical relevancy, to rampant, subjective opinion spread throughout the narrative.

The “virtually vanished” phrase caused me to revisit[3] my previous quantitative assessment of discussions of silicosis in the popular and medical literature. The National Library of Medicine’s PubMed database is expanding back into the past, adding old journals and their articles. Here is the most recent tally, by decade, of articles retrieved by a keyword search for “silicosis” (a sketch of one way to reproduce such a tally appears after the table):

Date Range          Number of Articles (keyword “silicosis”)

1940–1949                119
1950–1959              1,436
1960–1969              1,868
1970–1979              1,176
1980–1989                940
1990–1999                883
2000–2009                860
2010–present             498
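
For readers who want to check or update these numbers, such a tally can be reproduced programmatically. The following Python sketch counts PubMed records matching “silicosis” in each decade, using NCBI’s public E-utilities “esearch” endpoint. It is a minimal illustration under stated assumptions, not the method used for the table above; the exact counts will drift over time as PubMed adds older journals, and the choice of search field (here, an unrestricted keyword query) affects the totals.

```python
# Minimal sketch: counting PubMed hits by decade via NCBI E-utilities.
import time
import requests

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def count_pubmed_hits(term: str, start_year: int, end_year: int) -> int:
    """Return the number of PubMed records matching `term` whose
    publication dates fall within [start_year, end_year]."""
    params = {
        "db": "pubmed",
        "term": term,
        "datetype": "pdat",      # filter on publication date
        "mindate": str(start_year),
        "maxdate": str(end_year),
        "rettype": "count",      # ask only for the hit count
        "retmode": "json",
    }
    resp = requests.get(ESEARCH, params=params, timeout=30)
    resp.raise_for_status()
    return int(resp.json()["esearchresult"]["count"])

if __name__ == "__main__":
    for start in range(1940, 2020, 10):
        n = count_pubmed_hits("silicosis", start, start + 9)
        print(f"{start}-{start + 9}: {n:,}")
        time.sleep(0.4)  # stay under NCBI's ~3 requests/second courtesy limit
```

Running the loop once per decade keeps the script well within NCBI’s usage guidance; anyone tallying many terms should register an API key with NCBI.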

The Rosner/Markowitz claim that silicosis “virtually vanished” from professional discourse after World War II is completely belied by the evidence. Google’s Ngram viewer further confirms the incorrectness of the fundamental premise of Deadly Dust:

[Figure: Google Ngram for “silicosis,” 1920–2010]

The Google chart shows that although there was a peak around 1940, the frequency of references to silicosis remained at or above its mid-1930s level until 1960, and never retreated to levels as low as those for 1930–32.
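
The chart itself can be regenerated interactively at books.google.com/ngrams. For the programmatically inclined, the viewer has also been observed to serve its data as JSON from an undocumented endpoint; the sketch below reflects that endpoint’s commonly observed form, but every detail here (the URL, the corpus identifier, the response shape) is an assumption about an unsupported interface that Google may change or remove without notice.

```python
# Hedged sketch: pulling the "silicosis" time series from the Google
# Ngram Viewer's undocumented JSON endpoint. All parameters below are
# assumptions based on the viewer's observed behavior, not a supported API.
import requests

NGRAM_URL = "https://books.google.com/ngrams/json"  # undocumented endpoint (assumption)

params = {
    "content": "silicosis",
    "year_start": 1920,
    "year_end": 2010,
    "corpus": "en-2019",  # assumed corpus identifier; older versions used numeric codes
    "smoothing": 3,
}
resp = requests.get(NGRAM_URL, params=params, timeout=30)
resp.raise_for_status()
data = resp.json()  # observed shape: a list of {"ngram": ..., "timeseries": [...]} objects
if data:
    for year, freq in zip(range(1920, 2011), data[0]["timeseries"]):
        print(year, freq)
```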

This false premise, that silicosis vanished, or virtually vanished, from the medical literature, is the starting point for Rosner and Markowitz’s faux conspiracy charge that industry suppressed discussion, when the reality was exactly the opposite. What follows from the false premise is a false set of conclusions.


[1] David Rosner & Gerald Markowitz, Deadly Dust: Silicosis and the On-Going Struggle to Protect Workers’ Health (Ann Arbor 2006).

[2] Citing Jonathan Glater, “Suits on Silica Being Compared to Asbestos Cases,” New York Times (Sept. 6, 2003), at C-1 (quoting one defense lawyer as saying, “I actually thought that we had made the world safe for sand.”).

[3] See Schachtman, “Conspiracy Theories: Historians, In and Out of Court” (April 17, 2013).