For your delectation and delight, desultory dicta on the law of delicts.

WOE — Zoloft Escapes a MDL While Third Circuit Creates a Conceptual Muddle

July 31st, 2017

Multidistrict Litigations (MDLs) can be “muddles” that are easy to get into but hard to get out of. Pfizer and subsidiary Greenstone fabulously escaped a muddle through persistent lawyering and the astute gatekeeping of a district judge, in the Eastern District of Pennsylvania. That judge, the Hon. Cynthia Rufe, sustained objections to the admissibility of plaintiffs’ epidemiologic expert witness Anick Bérard. When the MDL’s plaintiffs’ steering committee (PSC) demanded, requested, and begged for a do-over, Judge Rufe granted them one more chance. The PSC put their litigation industry eggs in a single basket, carried by statistician Nicholas Jewell. Unfortunately for the PSC, Judge Rufe found Jewell’s basket to be as methodologically defective as Bérard’s, and Her Honor excluded Jewell’s proffered testimony. Motions, paper, and appeals followed, but on June 2, 2017, the Third Circuit declared that the PSC and its clients had had enough opportunities to get through the gate. Their baskets of methodological deplorables were not up to snuff. In re Zoloft Prod. Liab. Litig., No. 16-2247, __ F.3d __, 2017 WL 2385279, 2017 U.S. App. LEXIS 9832 (3d Cir. June 2, 2017) (affirming exclusion of Jewell’s dodgy opinions, which involved multiple methodological flaws and failures to follow any methodology faithfully) [Slip op. cited below as Zoloft].

Plaintiffs Attempt to Substitute WOE for Depressingly Bad Expert Witness Opinion

The ruse of conflating “weight of the evidence,” as used to describe the appellate standard of review for sustaining or reversing a trial court’s factual finding, with a purported scientific methodology for inferring causation, was on full display by the PSC in their attack on Judge Rufe’s gatekeeping. In their appellate brief in the Court of Appeals for the Third Circuit, the PSC asserted that Jewell had used a “weight of the evidence method,” even though that phrase, “weight of the evidence” (WOE), was never used in Jewell’s litigation reports. The full context of the PSC’s argument and citations to Milward make clear a deliberate attempt to conflate WOE as an appellate judicial standard for reviewing jury fact finding and a purported scientific methodology. See Appellants’ Opening Brief at 54 (Aug. 10, 2016) [cited as PSC] (asserting that “[a]t all times, the ultimate evaluation of the weight of the evidence is a jury question”; citing Milward v. Acuity Specialty Products Group, Inc., 639 F.3d 11, 20 (1st Cir. 2011), cert. denied, 133 S. Ct. 63 (2012)).

Having staked the ground that WOE is akin to a jury’s factual finding, and thus immune to any but the most extraordinary trial court action or appellate intervention, the PSC then pivoted to claim that Jewell’s WOE-ful method was nothing much more than an assessment of “the totality of the available scientific evidence, guided by the well-accepted Bradford-Hill criteria.” PSC at 3, 4, 7. This maneuver allowed the PSC to argue, apparently with a straight face, that WOE methodology as used by Jewell, had been generally accepted in the scientific community, as well as by the Third Circuit, in previous cases in which the court accepted the use of Bradford Hill’s considerations as a reliable method for establishing general causation. See PSC at 4 (citing Gannon v. United States, 292 F. App’x 170, 173 n.1 (3d Cir. 2008)). Jewell then simply plugged in his expertise and “40 years of experience,” and the desired conclusion of causation popped out. Id. Quod erat demonstrandum.

In pressing its point, the PSC took full advantage of loose, inaccurate language from the notorious comment c to the American Law Institute’s Restatement:

“No algorithm exists for applying the Hill guidelines to determine whether an association truly reflects a causal relationship or is spurious.”

PSC at 33-34, citing Restatement (Third) of Torts: Physical and Emotional Harm § 28 cmt. c(3) (2010). Well, true, but the absence of a mathematical algorithm hardly means that causal judgments are devoid of principles and standards. The PSC was undeterred, by text or by shame, from equating an unarticulated use of WOE methodology with some vague invocation of Bradford Hill’s considerations for evaluating associations for causality. See PSC at 43 (citing cases that never mentioned WOE but only Bradford Hill’s 50-plus-year-old heuristic as somehow supporting the claimed identity of the two approaches)1.

Pfizer Rebuffs WOE

Pfizer filed a comprehensive brief that unraveled the PSC’s duplicity. For unknown reasons, tactical or otherwise, however, Pfizer did not challenge the specifics of PSC’s equation of WOE with an abridged, distorted application of Bradford Hill’s considerations. See generally Opposition Brief of Defendants-Appellees Pfizer Inc., Pfizer International LLC, and Greenstone LLC [cited as Pfizer]. Perhaps given page limits and limited judicial attention spans, and just how woefully bad Jewell’s opinions were, Pfizer may well have decided that a more directed approach of assuming arguendo WOE’s methodological appropriateness was a more economical, pragmatic approach. A close reading of Pfizer’s brief, however, makes clear that it never conceded the validity of WOE as a scientific methodology.

Pfizer did point to the recasting of Jewell’s aborted attempt to apply Bradford Hill considerations as an employment of WOE methodology. Pfizer at 46-47. The argument reminded me of Abraham Lincoln’s famous argument:

“How many legs does a dog have if you call his tail a leg?

Saying that a tail is a leg doesn’t make it a leg.”

Allen Thorndike Rice, Reminiscences of Abraham Lincoln by Distinguished Men of His Time at 242 (1909). Calling Jewell’s supposed method WOE or Bradford Hill or WOE/Bradford Hill did not cure the “fatal methodological flaws in his opinions.” Pfizer at 47.

Pfizer understandably and properly objected to the PSC’s attempt to cast Jewell’s “methodology” at such a high level of generality that any consideration of the many instances of methodological infidelity would be relegated to mere jury questions. Acquiescence in the PSC’s rhetorical move would constitute a complete abandonment of the inquiry whether Jewell had used a proper method. Pfizer at 15-16.

Interestingly, none of the amici curiae addressed the slippery WOE arguments advanced by the PSC. See generally Brief of Amici Curiae American Tort Reform Ass’n & Pharmaceutical Research and Manufacturers of America (Oct. 18, 2016); Brief of Washington Legal Fdtn. as Amicus Curiae (Oct. 18, 2016). There was no meaningful discussion of WOE as a supposedly scientific methodology at oral argument. See Transcript of Oral Argument in In re Zoloft Prod. Liab. Litig., No. 16-2247 (Jan. 25, 2017).

The Third Circuit Acknowledges that Some Methodological Infelicities, Flaws, and Fallacies Are Properly the Subject of Judicial Gatekeeping

Fortunately, Jewell’s methodological infidelities were easily recognized by the Circuit judges. Jewell treated multiple studies, which were nested within one another, and thus involved overlapping and included populations, as though they were independent verifications of the same hypothesis. When the population at issue (from the Danish cohort) was included in a more inclusive pan-Scandinavian study, the relied-upon association dissipated, and Jewell utterly failed to explain or account for these data. Zoloft at 5-6.

Jewell relied upon a study by Anick Bérard, even though he later had to concede that the study had serious flaws that invalidated its conclusions, and which flaws caused him to have a lack of confidence in the paper’s findings.2 In another instance, Jewell relied innocently upon a study that purported to report a statistically significant association, but the authors of this paper were later required by the journal, The New England Journal of Medicine, to correct the very calculated confidence interval upon which Jewell had relied. Despite his substantial mathematical prowess, Jewell missed the miscalculation and relied (uncritically) upon a finding as statistically significant when in fact it was not.

Jewell rejected a meta-analysis of Zoloft studies for questionable methodological quibbles, even though he had relied upon the very same meta-analysis, with the same methodology, in his litigation efforts involving Prozac and birth defects. Not to be corralled by methodological punctilio, Jewell conducted his own meta-analysis with two studies, Huybrechts (2014) and Jimenez-Solem (2012), but failed to explain why he excluded other studies, the inclusion of which would have undone his claimed result. Zoloft at 9. Jewell purported to reanalyze and recalculate point estimates in two studies, Jimenez-Solem (2012) and Huybrechts (2014), without any clear protocol or consistency in his approach to other studies. Zoloft at 9. The list goes on, but in sum, Jewell’s handling of these technical issues did not inspire confidence, either in the district or in the appellate court.

WOE to the Third Circuit

The Circuit gave the PSC every conceivable break. Because Pfizer had not engaged specifically on whether WOE was a proper, or any kind of, scientific method, the Circuit treated the issue as virtually conceded:

“Pfizer does not seem to contest the reliability of the Bradford Hill criteria or weight of the evidence analysis generally; the dispute centers on whether the specific methodology implemented by Dr. Jewell is reliable. Flexible methodologies, such as the “weight of the evidence,” can be implemented in multiple ways; despite the fact that the methodology is generally reliable, each application is distinct and should be analyzed for reliability.”

Zoloft at 18. The Court acknowledged that WOE arose only in the PSC’s appellate brief, which would have made the entire dubious argument waived under general appellate jurisdictional principles, but the Court, in a footnote, indulged the assumption, “for the sake of argument,” that WOE was Jewell’s purported method from the inception. Zoloft at 18 n. 39. Without any real evidentiary support or analysis or concession from Pfizer, the Circuit accepted that WOE analyses were “generally reliable.” Zoloft at 21.

The Circuit accepted, rather uncritically, that Jewell used a combination of WOE analysis and Bradford Hill considerations. Zoloft at 17. Although Jewell had never described WOE in his litigation report, and WOE was not a feature of his hearing testimony, the Circuit impermissibly engrafted Carl Cranor’s description of WOE as involving inference to the best explanation. Zoloft at 17 & n.37, citing Milward v. Acuity Specialty Prods. Grp., Inc., 639 F.3d 11, 17 (1st Cir. 2011) (internal quotation marks and citation omitted).

There was, however, a limit to the Circuit’s credulousness and empathy. As the Court noted, there must be some assurance that the purported Bradford Hill/WOE method is something more than a “mere conclusion-oriented selection process.” Zoloft at 20. Ultimately, the Court put its markers down for Jewell’s putative WOE methodology:

“there must be a scientific method of weighting that is used and explained.”

Zoloft at 20. Calling the method WOE did not, in the final analysis, exclude Jewell from Rule 702 gatekeeping. Try as the PSC might, there was just no mistaking Jewell’s approach as anything other than a crazy patchwork quilt of numerical wizardry in aid of subjective, result-oriented conclusion mongering.

In the Court’s words:

“we find that Dr. Jewell did not 1) reliably apply the ‘techniques’ to the body of evidence or 2) adequately explain how this analysis supports specified Bradford Hill criteria. Because ‘any step that renders the analysis unreliable under the Daubert factors renders the expert’s testimony inadmissible’, this is sufficient to show that the District Court did not abuse its discretion in excluding Dr. Jewell’s testimony.”

Zoloft at 28. As heartening as the Circuit’s conclusion is, the Court’s couching its observation as a finding (“we find”) is disheartening with respect to the Third Circuit’s apparent inability to distinguish abuse-of-discretion review from de novo appellate findings. Equally distressing is the Court’s invocation of Daubert factors, which were dicta in a Supreme Court case that was superseded by an amended statute over 17 years ago, in Federal Rule of Evidence 702.

On the crucial question whether Jewell had engaged in an unreliable application of methods or techniques that superficially, at a very high level of generality, claim to be generally accepted, the Court stayed on course. The Court “found” that Jewell had applied techniques, analyses, and critiques so obviously inconsistently that no amount of judicial indulgence, assumptions arguendo, or careless glosses could save Jewell and his fatuous opinions from judicial banishment. Zoloft at 28-29. Returning to the correct standard of review (abuse of discretion), but the wrong governing law (Daubert instead of Rule 702), the Court announced that:

“[b]ecause ‘any step that renders the analysis unreliable under the Daubert factors renders the expert’s testimony inadmissible’, this is sufficient to show that the District Court did not abuse its discretion in excluding Dr. Jewell’s testimony.”

Zoloft at 21 n.50 (citation omitted). The Court found itself unable to say simply and directly that “the MDL trial court decided the case well within its discretion.”

The Zoloft case was not the Third Circuit’s first WOE rodeo. WOE had raised its unruly head in Magistrini v. One Hour Martinizing Dry Cleaning, 180 F. Supp. 2d 584, 602 (D.N.J. 2002), aff’d, 68 F. App’x 356 (3d Cir. 2003), where an expert witness, David Ozonoff, offered what purported to be a WOE opinion. The Magistrini trial court did not fuss with the assertion that WOE was generally reliable, but took issue with how Ozonoff tried to pass off his analysis as a comprehensive treatment of the totality of the evidence. In Magistrini, Judge Hochberg noted that regardless of the rubric of the methodology, the witness must show that in conducting a WOE analysis:

“all of the relevant evidence must be gathered, and the assessment or weighing of that evidence must not be arbitrary, but must itself be based on methods of science.”

Magistrini, 180 F. Supp. 2d at 602. The witness must show that the methodology is more than a “mere conclusion-oriented selection process,” and that it has “a scientific method of weighting that is used and explained.” Id. at 607. Asserting the use of WOE was not an excuse or escape from judicial gatekeeping as specified by Rule 702.

Although the Third Circuit gave the Zoloft MDL trial court’s findings a searching review (certainly much tougher than the prescribed abuse-of-discretion review), the MDL court’s finding that Jewell “failed to consistently apply the scientific methods he articulates, has deviated from or downplayed certain well-established principles of his field, and has inconsistently applied methods and standards to the data so as to support his a priori opinion” was ultimately vindicated by the Court of Appeals. Zoloft at 10.

All’s well that ends well. Perhaps. It remains unfortunate, however, that a hypothetical method, WOE (which was never actually advocated by the challenged expert witnesses, which lacks serious support in the scientific community, and which was merely assumed arguendo to be valid), will be taken by careless readers to have been endorsed by the Third Circuit.

1 Among the cases cited without any support for the PSC’s dubious contention were Gannon v. United States, 292 F. App’x 170, 173 n.1 (3d Cir. 2008); Bitler v. A.O. Smith Corp., 391 F.3d 1114, 1124-25 (10th Cir. 2004); In re Joint E. & S. Dist. Asbestos Litig., 52 F.3d 1124, 1128 (2d Cir. 1995); In re Avandia Mktg., Sales Practices & Prods. Liab. Litig., No. 2007-MD-1871, 2011 WL 13576, at *3 (E.D. Pa. Jan. 4, 2011) (“Bradford-Hill criteria are used to assess whether an established association between two variables actually reflects a causal relationship.”).

2 Anick Bérard, Sertraline Use During Pregnancy and the Risk of Major Malformations, 212 Am. J. Obstet. Gynecol. 795 (2015).

Weight of the Evidence in Science and in Law

July 29th, 2017

“woe to that man by whom the offense cometh”

         Matthew 18:7

Weight of the evidence (WOE) has cropped up again in recent trial and appellate court proceedings involving the admissibility of scientific expert witness opinion testimony. With some consistency, the WOE approach advocated is vacuous. The proponents of WOE do not specify what type of evidence is considered, whether all evidence was considered, or how competing and conflicting evidence was weighed.

Interpreted sympathetically, WOE might be taken to mean that “scientific judgment” was exercised with respect to causal inference, without describing exactly what was done. Although sympathetic, this interpretation renders the purported methodology meaningless. WOE-ful scientists might just as well say that they used scientific method. Not surprisingly, WOE is absent from virtually all major epidemiology textbooks.

Despite the vacuity of WOE, or because of it, some lawyers, who constitute the lawsuit industry, are particularly fond of WOE.1 Expert witnesses who support the lawsuit industry have defended their “right” to inflict WOE on the litigation system, tooth and nail.2

Carl Cranor, a philosophy professor and a hired expert witness in litigation for plaintiffs’ counsel, has written about WOE and attempted to defend WOE as a scientific methodology. Cranor has caricatured criticisms of WOE, including mine, by suggesting that the International Agency for Research on Cancer’s use of WOE rebuts my suggestion that WOE is no method at all.3 Cranor’s defense fails, however, because IARC’s method, for all its deficiencies, never invokes a method mired in WOE.

Perhaps the Lawsuit Industry likes WOE as much as it likes the equally vague term, “link.” WOE frees them from the requirement of any meaningful methodology, which means that any conclusion is possible. Under WOE, any conclusion can survive gatekeeping as an opinion. WOE frees the putative expert witness from the need to consider the quality of research. WOE-ful authors such as Carl Cranor invoke WOE or seek to inflict WOE without mentioning the crucial “nuts and bolts” of scientific inference, such as concepts of

  • Internal and external validity
  • Assessment of random error
  • Assessment of known and residual confounding
  • Known and potential threats to validity in
  • Appropriate methods of systematic review
  • Appropriate synthesis across studies, such as systematic review and meta-analysis

These important concepts are lost in the miasma of WOE.

In the published scientific literature, it is a commonplace that WOE is either poorly or not defined and specified. The phrase is vague and ambiguous; its use, inconsistent.4  Even authors sympathetic to the WOE mission have reluctantly concluded that the term is most often used in a way that “does not lend itself to transparency or repeatability except in simple cases.”5

Another reason that WOE resonates so strongly with the Lawsuit Industry is that having expert witnesses proclaim WOE as their methodology permits trial counsel to claim that the proffered opinions are immune to gatekeeping because, after all, weight-of-the-evidence questions are for the jury. Lawyers learn early on about WOE factual issues in appellate review of a wide variety of evidentiary and sufficiency issues in criminal and civil cases.6 Unless against the great WOE, WOE questions are for the jury.

Even venerable judges fall for this semantic confusion. In 1995, the Second Circuit, before the major revision of Rule 702, in 2000, noted that in discharging their gatekeeping role, trial judges do not assume:

“‘the role of St. Peter at the gates of heaven, performing a searching inquiry into the depth of an expert witness’s soul’ that would ‘inexorably lead to evaluating witness credibility and weight of the evidence, the ageless role of the jury’.”

McCullock v. H.B. Fuller Co., 61 F.3d 1038, 1045 (2d Cir.1995) (internal citations omitted).

Of course, the expert witness’s soul is not at issue, but his methodology is. More important, however, note how the appellate court adverted to “weight of the evidence” as something that the jury must evaluate, along with witness credibility. The expert witness WOE litigation strategy deliberately trades upon the confusion between WOE in the allocation between judge and jury, and valid scientific methodology in causal inference. McCullock is proof that judges can be, and are, bamboozled by the litigation strategy.

Twenty years after McCullock, federal appellate judges are still falling for the deliberate confusion between legal and scientific WOE. The Ninth Circuit recently held that the reliability test of Federal Rule of Evidence 702

“‘is not the correctness of the expert’s conclusions but the soundness of his methodology’, and when an expert meets the threshold established by Rule 702, the expert may testify and the fact finder decides how much weight to give that testimony. Challenges that go to the weight of the evidence are within the province of a fact finder, not a trial court judge. A district court should not make credibility determinations that are reserved for the jury.”

City of Pomona v. SQM North America Corp., 750 F.3d 1036, 1044 (9th Cir. 2014) (internal citation omitted), cert. denied, 135 S. Ct. 870 (2014). Characterizing a methodological dispute as one that “merely” concerns the “weight of the evidence” is a strategy to remove the dispute from judicial gatekeeping altogether.

Recently, the Third Circuit displayed this confusion of WOE with methodological impropriety by mischaracterizing failure to correct for multiple testing as merely an improper calculation that ordinarily goes to the weight of the evidence, not its admissibility. Karlo v. Pittsburgh Glass Works, LLC, 849 F.3d 61, 83 (3d Cir. 2017).
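Why an uncorrected multiple-testing analysis is a methodological problem, and not a mere slip in arithmetic, can be illustrated with a short Python sketch. It assumes, purely for illustration, that the tests are independent and that each is conducted at the same significance level; the function names are hypothetical, and nothing here reproduces the analysis actually at issue in Karlo.

```python
def familywise_error_rate(alpha: float, m: int) -> float:
    """Chance of at least one false positive across m independent tests,
    each run at level alpha, when every null hypothesis is true.
    (Independence is an assumption made only for this illustration.)"""
    return 1.0 - (1.0 - alpha) ** m


def bonferroni_threshold(alpha: float, m: int) -> float:
    """Per-test significance level that keeps the family-wise error rate
    at or below alpha (the Bonferroni correction, one common approach)."""
    return alpha / m


if __name__ == "__main__":
    for m in (1, 5, 20):
        fwer = familywise_error_rate(0.05, m)
        per_test = bonferroni_threshold(0.05, m)
        print(f"{m:>2} tests: uncorrected FWER = {fwer:.3f}, "
              f"Bonferroni per-test level = {per_test:.4f}")
```

Under these assumptions, the nominal 5% error rate applies only to a single pre-specified test; as the number of comparisons grows, so does the chance that some “significant” result is spurious, which is why the failure to correct bears on the reliability of the method itself.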

The Third Circuit, in Karlo, cited to a Supreme Court case that predated Daubert v. Merrell Dow Pharmaceuticals, 509 U.S. 579 (1993), and which did not involve any Rule 702 challenge to the use of a flawed statistical analysis. In Bazemore v. Friday, 478 U.S. 385, 400 (1986), plaintiffs sued as a class for employment discrimination, and sought to show the discrimination through the use of a regression analysis. The defense challenged the plaintiffs’ regression on grounds that key variables were omitted. The Court rejected a sufficiency challenge to a finding of discrimination in plaintiffs’ class action, and noted:

“Normally, failure to include variables will affect the analysis’ probativeness, not its admissibility.”

The lesson of the last two decades of judicial gatekeeping is that methodological infirmity will affect both probativeness and admissibility.7 Courts cannot escape their important gatekeeping duties by shifting their responsibility to juries under the guise of WOE.

2 See Schachtman, “Desultory Thoughts on Milward v. Acuity Specialty Products,” (Oct. 2015).

3 Carl F. Cranor, Toxic Torts: Science, Law, and the Possibility of Justice 146 (2d ed. 2016) (citing and selectively quoting from Schachtman, “WOE-fully Inadequate Methodology – An Ipse Dixit By Another Name” (May 1, 2012)).

4 See Charles Menzie, Miranda Hope Henning, Jerome Cura, Kenneth Finkelstein, Jack Gentile, James Maughan, David Mitchell, Stephen Petron, Bonnie Potocki, Susan Svirsky & Patti Tyler, “A weight-of-evidence approach for evaluating ecological risks; report of the Massachusetts Weight-of-Evidence Work Group,” 2 Human Ecological Risk Assessment 277, 279 (1996) (“although the term ‘weight of evidence’ is used frequently in ecological risk assessment, there is no consensus on its definition or how it should be applied”); Sheldon Krimsky, “The weight of scientific evidence in policy and law,” 95 Am. J. Pub. Health S129 (2005) (“However, the term [WOE] is applied quite liberally in the regulatory literature, the methodology behind it is rarely explicated.”); V. H. Dale, G.R. Biddinger, M.C. Newman, J.T. Oris, G.W. Suter II, T. Thompson, et al., “Enhancing the ecological risk assessment process,” 4 Integrated Envt’l Assess. Management 306 (2008) (“An approach to interpreting lines of evidence and weight of evidence is critically needed for complex assessments, and it would be useful to develop case studies and/or standards of practice for interpreting lines of evidence.”); Douglas L. Weed, “Weight of Evidence: A Review of Concept and Methods,” 25 Risk Analysis 1545 (2005) (noting the “lack of definition of the term weight of evidence, multiple uses of the term and a lack of consensus about its meaning, and the many different kinds of weights, both qualitative and quantitative which can be used in risk assessment”); R.G. Stahl Jr., “Issues addressed and unaddressed in EPA’s ecological risk guidelines,” 17 Risk Policy Report 35 (1998) (noting that U.S. Environmental Protection Agency’s guidelines for ecological weight-of-evidence approaches to risk assessment fail to provide guidance); Glenn W. Suter, Susan M. Cormier, “Why and how to combine evidence in environmental assessments: Weighing evidence and building cases,” 409 Sci. Total Env’t 1406, 1406 (2011) (noting arbitrariness and subjectivity of WOE “methodology”).

5 See Igor Linkov, Drew Loney, Susan Cormier, F. Kyle Satterstrom, and Todd Bridges, “Weight-of-evidence evaluation in environmental assessment: review of qualitative and quantitative approaches,” 407 Sci. Total Env’t 5199, 5203 (2009).

6 See, e.g., People v. Collier, 146 A.D.3d 1146, 1147-48, 2017 NY Slip Op 00342 (N.Y. App. Div. 3d Dep’t, Jan. 19, 2017) (rejecting appeal based upon defendant’s claim that conviction was against “weight of the evidence”); Venson v. Altamirano, 749 F.3d 641, 656 (7th Cir. 2014) (noting “new trial is appropriate if the jury’s verdict is against the manifest weight of the evidence”).

7 David L. Faigman, Christopher Slobogin & John Monahan, “Gatekeeping Science: Using the Structure of Scientific Research to Distinguish Between Admissibility and Weight in Expert Testimony,” 110 Northwestern L. Rev. 859, 865 (2016) (“An expert economist in an employment discrimination case who admittedly fails to control for a key variable such as seniority or wage structure in a regression analysis has committed a general error that should lead to exclusion by a judge… .”).

Slemp Trial Part 4 – Graham Colditz

July 22nd, 2017

The Witness

Somehow, in opposition to two epidemiologists presented by the plaintiff in Slemp, the defense managed to call none. The first of the plaintiffs’ two epidemiology expert witnesses was Graham A. Colditz, a physician with doctoral level training in epidemiology. For many years, Colditz was a professor at the Harvard School of Public Health. Colditz left Harvard to become the Niess-Gain Professor at Washington University St. Louis School of Medicine, where he is also the Associate Director for Prevention and Control at the Alvin J. Siteman Cancer Center.

Colditz is a senior epidemiologist, with many book and article publications to his credit. Although he has not published a causal analysis of ovarian cancer and talc, Colditz was an investigator on the well-known Nurses’ Health Study. One of Colditz’s publications on the Nurses’ cohort featured an analysis of talc use and ovarian cancer outcomes.

Although he is not a frequent testifying expert witness, Colditz is no stranger to the courtroom. He was a regular protagonist in the estrogen-progestin hormone replacement therapy (HRT) litigation, which principally involves claims of female breast cancer. Colditz has a charming Australian accent, with a voice tremor that makes him sound older than 63, and perhaps even more distinguished. He charges $1,500 per hour for his testimonial efforts, but is quick to point out that he has given thousands to charity. At his hourly rate, we can be sure he needs tax deductions of some kind.

In discussing his own qualifications, Colditz was low-key and modest except for what seemed like a strange claim that his HRT litigation work for plaintiffs led the FDA to require a boxed warning of breast cancer risk on the package insert for HRT medications. This claim is certainly false, and an extreme instance of post hoc ergo propter hoc. Colditz gilded the lily by claiming that he does not get involved unless he believes that general causation exists between the exposure or medication and the disease claimed. Since he has only been a plaintiffs’ expert witness, this self-serving claim is quite circular.

The Examinations

The direct and cross-examinations of Dr. Colditz were long and tedious. Most lawyers are reluctant to have an epidemiologist testify at all, and try to limit the length of their examinations when they must present epidemiologic testimony. Indeed, the defense in Slemp may have opted to present a clinician based upon the prejudice against epidemiologists testifying about quantitative data and analysis. In any event, Colditz’s direct examination went not hours, but days, as did the defense’s cross-examination.

The tedium of the direct examination was exacerbated by the shameless use of leading, loaded, and argumentative questions by plaintiff’s counsel, Allen Smith. A linguistic analysis might well show that Smith spoke 25 to 30 words for every one word spoken by Colditz on direct examination. Even aside from the niceties of courtroom procedure, the direct examination was lacking in aesthetic qualities. Still, it is hard to argue with a $110 million verdict, which cries out for explanation.

There were virtually no objections to Smith’s testifying in lieu of Colditz, with Colditz reduced to just “yes.” Sometimes, Colditz waxed loquacious, and answered, “yes, sir.” From judicial responses to other objections, however, it was clear that the trial court would have provided little control of the leading and argumentative questions.

Smith’s examination also took Colditz beyond the scope of his epidemiologic expertise into ethics, social policy, and legal requirements of warnings, again without judicial management or control. We learned, over objection, from Colditz of all witnesses that the determination of causation has nothing to do with whether a warning should be given.

The Subject Matter

Colditz was clearly familiar with the subject matter, and allowed Smith to testify for him on a fairly simplistic level. The testimony was a natural outgrowth of his professional interests, and Colditz must have appeared to have been a credible expert witness, especially in a St. Louis courtroom, given that he was in a leadership role at the leading cancer center in that city.

With Smith’s lead, Colditz broached technical issues of bias evaluation, meta-analysis and pooling, which would never be addressed later by a defense expert witness at an equal level of expertise, sophistication, and credibility. Colditz offered criticisms of the Gonzalez (Sister Study) and the latency built into the observation period of that cohort, and he introduced the concept of Berkson bias in some of the case-control studies. Neither of these particular criticisms was rebutted in the defense case, again raising the question whether the defense expert witness, Dr. Huh, a clinician specializing in gynecologic oncology, was an appropriate foil for the lineup of plaintiffs’ expert witnesses. Dr. Colditz was able to talk authoritatively (and in some cases misleadingly) about issues, which Dr. Huh could not contradict effectively, even if he were to have tried.

Colditz characterized his involvement in the talc cases as starting with his conducting a systematic review, undertaken for litigation, but still systematic. As a professor of epidemiology, Colditz should know what a systematic review is, although he never fully described the process on either direct or cross-examinations. No protocol for the systematic review was adduced into evidence. Sadly, the defense expert witness, Dr. Huh, never stated that he had done a systematic review; nor did he offer any criticisms of Dr. Colditz’s systematic review. Indeed, Huh admitted that he had not read Colditz’s testimony. In general, observing Colditz’s testimony after having watched Dr. Huh testify shouted MISMATCH.

The Issues

Statistical Significance

The beginning point of a case such as Slemp, involving a claim that talc causes ovarian cancer, and that it caused her ovarian cancer, is whether there is supporting epidemiology for the claim. As Sir Austin Bradford Hill put it over 50 years ago:

“Disregarding then any such problem in semantics we have this situation. Our observations reveal an association between two variables, perfectly clear-cut and beyond what we would care to attribute to the play of chance. What aspects of that association should we especially consider before deciding that the most likely interpretation of it is causation?”

Austin Bradford Hill, “The Environment and Disease: Association or Causation?” 58 Proc. Royal Soc’y Med. 295, 295 (1965). Colditz, and plaintiff’s counsel, did not run away from the challenge; they embraced statistical significance and presented an argument for why the association was “clear-cut” (not created by bias or confounding).

In one of his lengthy, leading questions, plaintiffs’ counsel attempted to suggest that statistical significance, or a confidence interval that excluded a risk ratio of 1.0, excluded bias as well as chance. Colditz to his credit broke from the straitjacket of “yes, sirs,” and disagreed as to bias. Smith, perhaps chastised, then took a chance and asked an open-ended question about what a confidence interval was. With the bit in his mouth, Colditz managed to describe the observed confidence interval incorrectly as providing the range within which the point estimate would fall 95% of the time if the same study were repeated many times! The 95% refers instead to the procedure: if the study were repeated many times, the resulting confidence intervals would cover the true parameter 95% of the time, assuming a correct statistical model, random sampling, and no bias or confounding. For the one observed confidence interval, the true value is either included or not. Perhaps Colditz was thinking of a prediction interval, but Smith had asked for a definition of a confidence interval, and the jury got nonsense.
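The difference between the two descriptions can be checked by simulation. What follows is a minimal sketch in Python, not anything offered at trial. It assumes a normally distributed outcome with a known standard deviation, and every constant in it (TRUE_MEAN, SIGMA, N, TRIALS) is an arbitrary value chosen for illustration. It counts how often a study's 95% interval covers the true value, and how often that same interval captures the point estimate of a repeat study.

```python
import random
import statistics

TRUE_MEAN = 0.0   # hypothetical true parameter
SIGMA = 1.0       # hypothetical, known standard deviation
N = 30            # hypothetical sample size per study
TRIALS = 10000    # number of simulated pairs of studies
Z = 1.96          # normal critical value for a 95% interval

def one_study(rng):
    """Return (point estimate, lower limit, upper limit) for one simulated study."""
    sample = [rng.gauss(TRUE_MEAN, SIGMA) for _ in range(N)]
    mean = statistics.fmean(sample)
    half_width = Z * SIGMA / N ** 0.5
    return mean, mean - half_width, mean + half_width

def simulate(seed=1):
    """Estimate coverage of the truth and capture of a repeat study's estimate."""
    rng = random.Random(seed)
    covers_truth = 0
    captures_repeat = 0
    for _ in range(TRIALS):
        _, lower, upper = one_study(rng)      # the "observed" study
        repeat_mean, _, _ = one_study(rng)    # an independent repeat study
        covers_truth += lower <= TRUE_MEAN <= upper
        captures_repeat += lower <= repeat_mean <= upper
    return covers_truth / TRIALS, captures_repeat / TRIALS

coverage, capture = simulate()
print(f"intervals covering the true value:        {coverage:.3f}")
print(f"intervals capturing the repeat estimate:  {capture:.3f}")
```

Under these assumptions, about 95% of the intervals cover the true value, but an observed interval captures a repeat study's point estimate only about 83% of the time. The 95% figure belongs to the coverage of the truth, not to where future point estimates will fall.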

Dose Response

Colditz parsed the remaining Bradford Hill factors, and opined that exposure-gradient or dose response was good to have but not necessary to support a causal conclusion. Colditz opined, with respect to whether the statistical assessment of a putative dose-response should include non-exposed women, that the non-exposed women should be excluded. This was one of the few technical issues that Dr. Huh engaged with, in the defense case, but Dr. Colditz was not confronted with any textbooks or writings that cast doubt on his preference for excluding non-users.


Plaintiff’s counsel spent a great deal of time, mostly reading lengthy passages of articles on this or that plausible mechanism for talc’s causing human ovarian cancer, only to have Colditz, with little or no demonstrated expertise in biological mechanism, say “yes.” Some articles discussed that talc use was a modifiable risk and that avoiding perineal talc use “may” reduce ovarian cancer risk. Smith would read (accurately) and then ask Colditz whether he agreed that avoiding talc use would reduce ovarian cancer in women. Colditz himself caught and corrected Smith sometimes, but not always.

Smith read from an article that invokes a claim that asbestos (without definition as to what mineral) causes ovarian cancer. Colditz agreed. Smith testified that talc has asbestos in it, and Colditz agreed. Smith read from an article that stated vaguely that talc is chemically similar to asbestos and thus this creates plausibility for a causal connection between talc and cancer. Colditz agreed, without any suggestion that he understands whether or not talc is morphologically similar to asbestos. It seems unlikely that Colditz had any real expertise to offer here, but Smith could not resist touching all bases with Colditz; and the defense did not object or follow up on these excesses.

Smith and Colditz, well mostly Smith, testified that tubal ligation reduces the otherwise observed increased risk of ovarian cancer from talc use. Smith here entrusts Colditz with providing the common-sense explanation. There is no meaningful cross-examination on this “jury friendly” point.


Colditz testified that the studies, both case-control and cohort studies, were consistent in showing an increased risk of ovarian cancer in association with talc use. Indeed, the studies are mostly consistent; the issue is whether they are consistently biased or consistently showing the true population risk. The defense chose to confront Colditz with the lack of statistical significance in some studies (with elevated risk ratios) as though these studies were inconsistent with the studies that found similar risk ratios, with p-values less than 5%. This confrontation did not go well for the defense, either on cross-examination of Colditz, or on direct examination of Dr. Huh. Colditz backed up his opinion on consistency with the available meta-analyses, which find very low p-values for the summary estimate of risk ratio for talc use and ovarian cancer.

Unlike the Zoloft case1, in which consistency was generated across different end points by cherry picking, the consistency in the talc case was evidenced by a consistent elevation of risk ratios for the same end point, across studies. When subgroups of ovarian cell or tumor types were examined, statistical significance was sometimes lost, but the direction of the risk ratio above one was maintained. Meta-analyses generated summary point estimates with very low p-values.
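Why a set of individually “non-significant” studies with similar risk ratios can yield a summary estimate with a very low p-value is a matter of arithmetic. The following is a minimal sketch in Python of fixed-effect, inverse-variance pooling on the log scale. The five studies in STUDIES are invented for illustration and are not the talc studies; the standard errors are back-calculated from the hypothetical 95% limits.

```python
import math

Z = 1.96  # normal critical value for a 95% interval

# Hypothetical studies: (risk ratio, lower 95% limit, upper 95% limit).
# Every interval includes 1.0, so none is "statistically significant" alone.
STUDIES = [
    (1.20, 0.90, 1.60),
    (1.30, 0.95, 1.78),
    (1.25, 0.92, 1.70),
    (1.15, 0.85, 1.56),
    (1.35, 0.98, 1.86),
]

def log_se(lower, upper):
    """Back-calculate the standard error of ln(RR) from a 95% interval."""
    return (math.log(upper) - math.log(lower)) / (2 * Z)

def pool_fixed_effect(studies):
    """Inverse-variance weighted summary risk ratio with its 95% interval."""
    weights = [1.0 / log_se(lo, hi) ** 2 for _, lo, hi in studies]
    weighted_logs = [w * math.log(rr) for w, (rr, _, _) in zip(weights, studies)]
    pooled_log = sum(weighted_logs) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return (
        math.exp(pooled_log),
        math.exp(pooled_log - Z * pooled_se),
        math.exp(pooled_log + Z * pooled_se),
    )

summary, summary_lower, summary_upper = pool_fixed_effect(STUDIES)
print(f"summary RR = {summary:.2f} (95% CI, {summary_lower:.2f}-{summary_upper:.2f})")
```

With these invented inputs the pooled interval excludes 1.0 although no single study's interval does. Pooling narrows the interval around whatever the studies share, which includes any bias they share; the calculation says nothing on whether the common elevation is real.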

The Gold Standard

Colditz further gilded the consistency lily by claiming that the Terry study2, a pooled analysis of available case-control studies, was the “gold standard” in this area of observational epidemiology. Smith and Colditz presented at some length as to how the Cochrane Collaboration has labeled combined “individual patient data” (IPD) analyses as the gold standard. Colditz skimmed over the Cochrane’s endorsement of IPD analyses as having been made in the context of systematic reviews, involving primarily randomized clinical trials, for which IPD analyses allow time-to-event measurements, which can substantially modify observed risk ratios, and even reverse their direction. The case-control studies in the Terry pooled analysis did not have anything like the kind of prospectively collected individual patient data, which would warrant holding the Terry paper up as a “gold standard,” and Terry and her co-authors never made such a claim for their analysis. Colditz’s claim about the Terry study cried out for strong rebuttal, which never came.

The defense should have known that this hyperbolic testimony would be forthcoming, but they seemed not to have a rebuttal planned, other than dismissing case-control studies generally as smaller than cohort studies. Rather than “getting into the weeds” about the merits of pooled analyses of observational studies, as opposed to clinical trials, the defense continued with its bizarre stance that the cohort studies were better because larger, while ignoring that they are smaller with respect to number of ovarian cancer cases and have less precision than the case-control studies. See “New Jersey Kemps Ovarian Cancer – Talc Cases” (Sept. 16, 2016). The defense also largely ignored Colditz’s testimony that exposure data collected in the available cohort studies was of limited value because lacking in details about frequency and intensity of use, and in some cases, collected on only one occasion.
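The point about size and precision can be made concrete. Below is a minimal sketch in Python, using the standard large-sample (Woolf) standard error for the log odds ratio. Both 2x2 tables are invented for illustration and do not come from any talc study; the odds ratio is used for the cohort only as a rare-disease approximation.

```python
import math

def se_log_odds_ratio(a, b, c, d):
    """Woolf standard error of ln(OR) for a 2x2 table with cell counts a, b, c, d."""
    return math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)

def upper_to_lower_ratio(se, z=1.96):
    """Ratio of the upper to the lower 95% limit; larger means less precise."""
    return math.exp(2 * z * se)

# Hypothetical cohort: 60,000 women, but only 150 ovarian cancer cases.
# Cells: exposed cases, exposed non-cases, unexposed cases, unexposed non-cases.
COHORT = (60, 19940, 90, 39910)

# Hypothetical case-control study: 2,000 women, of whom 1,000 are cases.
# Cells: exposed cases, exposed controls, unexposed cases, unexposed controls.
CASE_CONTROL = (400, 330, 600, 670)

se_cohort = se_log_odds_ratio(*COHORT)
se_case_control = se_log_odds_ratio(*CASE_CONTROL)
print(f"cohort:       SE = {se_cohort:.3f}, limit ratio = {upper_to_lower_ratio(se_cohort):.2f}")
print(f"case-control: SE = {se_case_control:.3f}, limit ratio = {upper_to_lower_ratio(se_case_control):.2f}")
```

The standard error is dominated by the smallest cells, which in a cohort of a rare disease are the case counts. In this invented example the cohort is thirty times larger than the case-control study, yet its interval is wider.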

Specific Causation

Colditz disclaimed the ability or intention to offer a specific causation opinion about Ms. Slemp’s ovarian cancer. Nonetheless, Colditz volunteered that “cancer is multifactorial,” which says very little because it says so much. In plaintiffs’ counsel’s hands, this characterization became a smokescreen to indict every possible present risk factor as playing a part in the actual causation of a particular case, such as Ms. Slemp’s case. No matter that the plaintiff was massively obese, and a smoker; every risk factor present must be, by fiat, in the “causal pie.”

But this would seem not to be Colditz’s own opinion. Graham Colditz has elsewhere asserted that an increased risk of disease cannot be translated into the “but-for” standard of causation3:

“Knowledge that a factor is associated with increased risk of disease does not translate into the premise that a case of disease will be prevented if a specific individual eliminates exposure to that risk factor. Disease pathogenesis at the individual level is extremely complex.”

Just because a risk factor (assuming it is real and causal) is present does not put it in the causal set.


The direct examination of Graham Colditz included scurrilous attacks on J & J’s lobbying, paying FDA user fees, and other corporate conduct, based upon documents of which Colditz had no personal knowledge. Colditz was reduced to nothing more than a backboard, off which plaintiff’s counsel could make his shots. On cross, the defense carefully dissected this direct examination and obtained disavowals from Colditz that he had suggested any untoward conduct by J & J. The jury could have been spared their valuable time by a trial judge who did not allow the scurrilous, collateral attacks in the first place.

The defense also tried to diminish Dr. Colditz’s testimony as an opinion coming from a non-physician. The problem, however, was that Colditz is a physician, who understands the biological issues, even if he is not a pathologist, toxicologist, or oncologist. Colditz did not offer opinions about Slemp’s medical treatment, and there was nothing in this line of cross-examination that lessened the impact of Colditz’s general causation testimony.

Generally, the cross-examination did not hurt Dr. Colditz’s strongly stated opinion that talc causes ovarian cancer. The defense (and plaintiff’s counsel before them) spent an inordinate amount of time on why Dr. Colditz has not updated his website to state publicly that talc causes ovarian cancer. Colditz blamed the “IT” guys, a rather disingenuous excuse. His explanation on direct, and on cross, as to why he could not post his opinion on his public-service website was so convoluted, however, that there was no clear admission or inference of dereliction. Colditz was permitted to bill his opinion, never posted to his institution’s website, as a “consensus opinion,” endorsed by several researchers, based upon hearsay emails and oral conversations.

1 See In re Zoloft Prod. Liab. Litig., No. 16-2247, __ F.3d __, 2017 WL 2385279, 2017 U.S. App. LEXIS 9832 (3d Cir. June 2, 2017) (affirming exclusion of dodgy opinion, which involved changing subgroup end points across studies of maternal sertraline use and infant cardiac birth defects).

2 Kathryn L. Terry, et al., “Genital powder use and risk of ovarian cancer: a pooled analysis of 8,525 cases and 9,859 controls,” 6 Cancer Prev. & Research 811 (2013).

3 Graham A. Colditz, “From epidemiology to cancer prevention: implications for the 21st Century,” 18 Cancer Causes Control 117, 118 (2007).

Welding Litigation – Another Positive Example of Litigation-Generated Science

July 11th, 2017

In a recent post1, I noted Samuel Tarry’s valuable article2 for its helpful, contrarian discussion of the importance of some scientific articles with litigation provenances. Public health debates can spill over to the courtroom, and developments in the courtroom can, on occasion, inform and even resolve those public health debates that gave rise to the litigation. Tarry provided an account of three such articles, and I provided a brief account of another article, a published meta-analysis, from the welding fume litigation.

The welding litigation actually accounted for several studies, but in this post, I detail the background of another published study, this one an epidemiologic study by a noted Harvard epidemiologist. Not every expert witness’s report has the making of a published paper. In theory, if the expert witness has conducted a systematic review, and reached a conclusion that is not populated among already published papers, we might well expect that the witness had achieved the “least publishable unit.” The reality is that most causal claims are not based upon what could even remotely be called a systematic review. Given the lack of credibility to the causal claim, rebuttal reports are likely to have little interest to serious scientists.

Martin Wells

In the welding fume cases, one of plaintiffs’ hired expert witnesses, Martin Wells, a statistician, proffered an analysis of Parkinson’s disease (PD) mortality among welders and welding tradesmen. Using the National Center for Health Statistics (NCHS) database, Wells aggregated data from 1993 to 1999, for PD among welders and compared this to PD mortality among non-welders. Wells claimed to find an increased risk of PD mortality among younger (under age 65 at death) welders and welding tradesmen in this dataset.

The defense sought discovery of Wells’s methods and materials, and obtained the underlying data from the NCHS. Wells had no protocol, no pre-stated commitment to which years in the dataset he would use, and no pre-stated statistical analysis plan. At a Rule 702 hearing, Wells was unable to state how many welders were included in his analysis, why he selected some years but not others, or why he had selected age 65 as the cut off. His analyses appeared to be pure data dredging.

As the defense discovered, the NCHS dataset contained mortality data for many more years than the limited range employed by Wells in his analysis. Working with an expert witness at the Harvard School of Public Health, the defense discovered that Wells had gerrymandered the years included (and excluded) in his analysis in a way that just happened to generate a marginally, nominally statistically significant association.

NCHS Welder Age Distribution

The defense was thus able to show that the data overall, and in each year, were very sparse. For most years, the value was either 0 or 1, for PD deaths under age 65. Because of the huge denominators, however, the calculated mortality odds ratios were nominally statistically significant. The value of four PD deaths in 1998 is clearly an outlier. If the value were three rather than four, the statistical significance of the calculated OR would have been lost. Alternatively, a simple sensitivity test suggests that if instead of overall n = 7, n were 6, statistical significance would have been lost. The chart below, prepared at the time with help from Dr. David Schwartz of Innovative Science Solutions, shows the actual number of “underlying cause” PD deaths that were in the dataset for each year in the NCHS dataset, and how “sparse and granular” these data were:

A couple of years later, Wells’ litigation analysis showed up as a manuscript, with only minor changes in its analyses, and with authors listed as Martin T. Wells and Katherine W. Eisenberg, in the editorial offices of Neurology. Katherine W. Eisenberg, AB and Martin T. Wells, Ph.D., “A Mortality Odds Ratio Study of Welders and Parkinson Disease.” Wells disclosed that he had testified for plaintiffs in the welding fume litigation, but Eisenberg declared no conflicts. Having only an undergraduate degree, and attending medical school at the time of submission, Ms. Eisenberg would not seem to have had the opportunity to accumulate any conflicts of interest. Undisclosed to the editors of Neurology, however, was that Ms. Eisenberg was the daughter of Theodore (Ted) Eisenberg, a lawyer who taught at Cornell University and who represented plaintiffs in the same welding MDL as the one in which Wells testified. Inquiring minds might have wondered whether Ms. Eisenberg’s tuition, room, and board were subsidized by Ted’s earnings in the welding fume and other litigations. Ted Eisenberg and Martin Wells had collaborated on many other projects, but in the welding fume litigation, Ted worked as an attorney for MDL welding plaintiffs, and Martin Wells was compensated handsomely as an expert witness. The acknowledgment at the end of the manuscript thanked Theodore Eisenberg for his thoughtful comments and discussion, without noting that he had been a paid member of the plaintiffs’ litigation team. Nor did Wells and Eisenberg tell the Neurology editors that the article had grown out of Wells’ 2005 litigation report in the welding MDL.

The disclosure lapses and oversights by Wells and the younger Eisenberg proved harmless error because Neurology rejected the Wells and Eisenberg paper for publication, and it was never submitted elsewhere. The paper used the same restricted set of years of NCHS data, 1993-1999. The defense had already shown, through its own expert witness’s rebuttal report, that the manuscript’s analysis achieved statistical significance only because it omitted years from the analysis. For instance, if the authors had analyzed 1992 through 1999, their Parkinson’s disease mortality point estimate for younger welding tradesmen would no longer have been statistically significant.
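How a result resting on seven deaths can flip with one death more or less is easy to illustrate. The following is a minimal sketch in Python, and it is not Wells’s mortality odds ratio calculation: it uses a simple one-sided Poisson tail probability, and the expected count (EXPECTED_DEATHS) is an invented number, not a figure from the NCHS data.

```python
import math

def poisson_upper_tail(observed, expected):
    """Return P(X >= observed) when X follows a Poisson(expected) distribution."""
    below = sum(
        math.exp(-expected) * expected ** k / math.factorial(k)
        for k in range(observed)
    )
    return 1.0 - below

EXPECTED_DEATHS = 3.0  # hypothetical expected count, NOT taken from the NCHS data
ALPHA = 0.05           # conventional nominal significance level

p_seven = poisson_upper_tail(7, EXPECTED_DEATHS)
p_six = poisson_upper_tail(6, EXPECTED_DEATHS)
for observed, p_value in ((7, p_seven), (6, p_six)):
    verdict = "nominally significant" if p_value < ALPHA else "not significant"
    print(f"observed deaths = {observed}: one-sided p = {p_value:.3f} ({verdict})")
```

Under this invented expectation, seven observed deaths give a one-sided p-value of about 0.03, and six give about 0.08. With counts this small, a single death, or a single added or omitted year, can carry the result across the 5% line.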

Robert Park

One reason that Wells and Eisenberg may have abandoned their gerrymandered statistical analysis of the NCHS dataset was that an ostensibly independent group3 of investigators published a paper that presented a competing analysis. Robert M. Park, Paul A. Schulte, Joseph D. Bowman, James T. Walker, Stephen C. Bondy, Michael G. Yost, Jennifer A. Touchstone, and Mustafa Dosemeci, “Potential Occupational Risks for Neurodegenerative Diseases,” 48 Am. J. Ind. Med. 63 (2005) [cited as Park (2005)]. The authors accessed the same NCHS dataset, and looked at hundreds of different occupations, including welding tradesmen, and four neurodegenerative diseases.

Park, et al., claimed that they looked at occupations that had shown elevated proportional mortality ratios (PMR) in a previous publication of the NIOSH. A few other occupations were included; in all there were hundreds of independent analyses, without any adjustment for multiple testing. Welding occupations4 were included “[b]ecause of reports of Parkinsonism in welders [Racette et al., 2001; Levy and Nassetta, 2003], possibly attributable to manganese exposure (from welding rods and steel alloys)… .”5 Racette was a consultant for the Lawsuit Industry, which had funded his research on parkinsonism among welders. Levy was a testifying expert witness for Lawsuit, Inc. A betting person would conclude that Park had consulted with Wells and Eisenberg, and their colleagues.
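The cost of running hundreds of analyses without adjustment is simple arithmetic. Here is a minimal sketch in Python, assuming independent tests at a nominal 5% level with no true associations at all. The counts in the loop are illustrative values of “hundreds,” not the number of analyses actually run in Park (2005).

```python
ALPHA = 0.05  # nominal per-test significance level

def familywise_error(tests, alpha=ALPHA):
    """Chance of at least one false positive among independent null tests."""
    return 1 - (1 - alpha) ** tests

def expected_false_positives(tests, alpha=ALPHA):
    """Average number of nominally significant results when nothing is real."""
    return tests * alpha

def bonferroni_threshold(tests, alpha=ALPHA):
    """Per-test level that would hold the familywise error rate near alpha."""
    return alpha / tests

for tests in (1, 20, 100, 200):  # illustrative counts only
    print(
        f"{tests:>3} tests: P(at least one false positive) = {familywise_error(tests):.3f}, "
        f"expected false positives = {expected_false_positives(tests):.1f}, "
        f"Bonferroni level = {bonferroni_threshold(tests):.5f}"
    )
```

At 100 independent tests, at least one nominally significant finding is nearly certain even if no occupation has any effect, and at 200 tests about ten such findings are expected by chance alone.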

These authors looked at four neurological degenerative diseases (NDDs): Alzheimer’s disease, Parkinson’s disease, motor neuron disease, and pre-senile dementia. The authors looked at NCHS death certificate occupational information from 1992 to 1998, which was remarkable because Wells had insisted that 1992 somehow was not available for inclusion in his analyses. During 1992 to 1998, in 22 states, there were 2,614,346 deaths with 33,678 from Parkinson’s disease (p. 65b). Then for each of the four disease outcomes, the authors conducted an analysis for deaths below age 65. For the welding tradesmen, none of the four NDDs showed any associations. Park went on to conduct subgroup analyses for each of the four NDDs for death below age 65. In these subgroup analyses for welding tradesmen, the authors purported to find an association only with Parkinson’s disease:

“Of the four NDDs under study, only PD was associated with occupations where arc-welding of steel is performed, and only for the 20 PD deaths below age 65 (MOR=1.77, 95% CI=1.08-2.75) (Table V).”

Park (2005), at 70.

The exact nature of the subgroup was obscure, to say the least. Remarkably, Park and his colleagues had not calculated an odds ratio for welding tradesmen under age 65 at death compared with non-welding tradesmen under age 65 at death. The table’s legend attempts to explain the authors’ calculation:

“Adjusted for age, race, gender, region and SES. Model contains multiplicative terms for exposure and for exposure if age at death <65; thus MOR is estimate for deaths occurring age 65+, and MOR, age <65 is estimate of enhanced risk: age <65 versus age 65+”

In other words, Park looked to see whether welding tradesmen who died at a younger age (below age 65) were more likely to have a PD cause of death than welding tradesmen who died at an older age (over age 65). The meaning of this internal comparison is totally unclear, but it cannot represent a comparison of welders with non-welders. Indeed, every time Park and his colleagues calculated and reported this strange odds ratio for any occupational group in the published paper, the odds ratio was elevated. If the odds ratio means anything, it is that younger Parkinson’s patients, regardless of occupation, are more likely to die of their neurological disease than older patients. Older men, regardless of occupation, are more likely to die of cancer, cardiovascular disease, and other chronic diseases. Furthermore, this age association within (not between) an occupational group may be nothing other than a reflection of the greater severity of early-onset Parkinson’s disease in anyone, regardless of their occupation.
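One way to read the table legend is as a logistic model with two exposure terms: one for the occupation, and a second that applies only to deaths under age 65. The following is a minimal sketch in Python of what each term would then estimate. This reading is an assumption, since the paper's model is not set out in full here, and both coefficients are invented for illustration; they are not values from Park (2005).

```python
import math

# Hypothetical coefficients on the log-odds scale; NOT values from Park (2005).
BETA_EXPOSURE = math.log(0.80)        # occupation term: applies to deaths at age 65+
BETA_EXPOSURE_YOUNG = math.log(1.50)  # extra occupation term: deaths under age 65 only

# What the legend calls "MOR": the occupation's odds ratio for deaths at 65+.
mor_65_plus = math.exp(BETA_EXPOSURE)

# What the legend calls "MOR, age <65": a ratio of two odds ratios
# (under 65 versus 65+), not a comparison of the occupation with other workers.
enhancement_under_65 = math.exp(BETA_EXPOSURE_YOUNG)

# The comparison of interest, occupation versus others among deaths under 65,
# would be the product of the two terms.
mor_under_65_direct = math.exp(BETA_EXPOSURE + BETA_EXPOSURE_YOUNG)

print(f"MOR, deaths at 65+:                 {mor_65_plus:.2f}")
print(f"'enhanced risk', <65 versus 65+:    {enhancement_under_65:.2f}")
print(f"MOR, deaths under 65 (the product): {mor_under_65_direct:.2f}")
```

On this reading, an “enhanced risk” of 1.50 can sit alongside a direct odds ratio of only 1.20 for the occupation among deaths under 65, and of 0.80 at older ages. The second term cannot be quoted by itself as a comparison of welders with non-welders.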

Like the manuscript by Eisenberg and Wells, the Park paper was an exercise in data dredging. The Park study reported increased odds ratios for Parkinson’s disease among the following groups on the primary analysis:

biological, medical scientists [MOR 2.04 (95% CI, 1.37-2.92)]

clergy [MOR 1.79 (95% CI, 1.58-2.02)]

religious workers [MOR 1.70 (95% CI, 1.27-2.21)]

college teachers [MOR 1.61 (95% CI, 1.39-1.85)]

social workers [MOR 1.44 (95% CI, 1.14-1.80)]

As noted above, the Park paper reported that all of the internal mortality odds ratios for below versus above age 65, within occupational groups, were nominally statistically significantly elevated. Nonetheless, the Park authors were on a mission, and determined to make something out of nothing, at least when it came to welding and Parkinson’s disease among younger patients. The authors’ conclusion reflected stunningly poor scholarship:

“Studies in the US, Europe, and Korea implicate manganese fumes from arc-welding of steel in the development of a Parkinson’s-like disorder, probably a manifestation of manganism [Sjogren et al., 1990; Kim et al., 1999; Luccini, et al., 1999; Moon et al., 1999]. The observation here that PD mortality is elevated among workers with likely manganese exposures from welding, below age 65 (based on 20 deaths), supports the welding-Parkinsonism connection.”

Park (2005) at 73.

Stunningly bad because the cited papers by Sjogren, Luccini, Kim, and Moon did not examine Parkinson’s disease as an outcome; indeed, they did not even examine a parkinsonian movement disorder. More egregious, however, was the authors’ assertion that their analysis, which compared the odds of Parkinson’s disease mortality between welders under age 65 to that mortality for welders over age 65, supported an association between welding and “Parkinsonism.” 

Every time the authors conducted this analysis internal to an occupational group, they found an elevation among under age 65 deaths compared with over age 65 deaths within the occupational group. They did not report comparisons of any age-defined subgroup of a single occupational group with similarly aged mortality in the remaining dataset.

Elan Louis

The plaintiffs’ lawyers used the Park paper as “evidence” of an association that they claimed was causal. They were aided by a cadre of expert witnesses who could cite to a paper’s conclusions, but could not understand its methods. Occasionally, one of the plaintiffs’ expert witnesses would confess ignorance about exactly what Robert Park had done in this paper. Elan Louis, one of the better qualified expert witnesses on the side of claimants, for instance, testified in the plaintiffs’ attempt to certify a national medical monitoring class action for welding tradesmen. His testimony about what to make of the Park paper was more honest than most of the plaintiffs’ expert witnesses:

Q. My question to you is, is it true that that 1.77 point estimate of risk, is not a comparison of this welder and allied tradesmen under this age 65 mortality, compared with non-welders and allied tradesmen who die under age 65?

A. I think it’s not clear that the footnote — I think that the footnote is not clearly written. When you read the footnote, you didn’t read the punctuation that there are semicolons and colons and commas in the same sentence. And it’s not a well constructed sentence. And I’ve gone through this sentence many times. And I’ve gone through this sentence with Ted Eisenberg many times. This is a topic of our discussion. One of the topics of our discussions. And it’s not clear from this sentence that that’s the appropriate interpretation. *  *  *  However, the footnote, because it’s so poorly written, it obscures what he actually did. And then I think it opens up alternative interpretations.

Q. And if we can pursue that for a moment. If you look at other tables for other occupational titles, or exposure related variables, is it true that every time that Mr. Park reports on that MOR age under 65, that the estimate is elevated and statistically significantly so?

A. Yes. And he uses the same footnote every time. He’s obviously cut and paste that footnote every single time, down to the punctuation is exactly the same. And I would agree that if you look for example at table 4, the mortality odds ratios are elevated in that manner for Parkinson’s Disease, with reference to farming, with reference to pesticides, and with reference to farmers excluding horticultural deaths.

Deposition testimony of Elan Louis, at p. 401-04, in Steele v. A. O. Smith Corp., no. 1:03 CV-17000, MDL 1535 (Jan. 18, 2007). Other less qualified, or less honest expert witnesses on the plaintiffs’ side were content to cite Park (2005) as support for their causal opinions.

Meir Stampfer

The empathetic MDL trial judge denied the plaintiffs’ request for class certification in Steele, but individual personal injury cases continued to be litigated. Steele v. A.O. Smith Corp., 245 F.R.D. 279 (N.D. Ohio 2007) (denying class certification); In re Welding Fume Prods. Liab. Litig., No. 1:03-CV-17000, MDL 1535, 2008 WL 3166309 (N.D. Ohio Aug. 4, 2008) (striking pendent state-law class action claims).

Although Elan Louis was honest enough to acknowledge his own confusion about the Park paper, other expert witnesses continued to rely upon it, and plaintiffs’ counsel continued to cite the paper in their briefs and to use the apparently elevated point estimate for welders in their cross-examinations of defense expert witnesses. With the NCHS data in hand (on a DVD), defense counsel returned to Meir Stampfer, who had helped them unravel the Martin Wells’ litigation analysis. The question for Professor Stampfer was whether Park’s reported point estimate for PD mortality odds ratio was truly a comparison of welders versus non-welders, or whether it was some uninformative internal comparison of younger welders versus older welders.

The one certainty available to the defense was that it had the same dataset that had been used by Martin Wells in the earlier litigation analysis, and now by Robert Park and his colleagues in their published analysis. Using the NCHS dataset, and Park’s definition of a welder or a welding tradesman, Professor Stampfer calculated PD mortality odds ratios for each definition, as well as for each definition for deaths under age 65. None of these analyses yielded a statistically significant increase in risk. Park’s curious results could not be replicated from the NCHS dataset.

For welders, the overall PD mortality odds ratio (MOR) was 0.85 (95% CI, 0.77–0.94), for years 1985 through 1999, in the NCHS dataset. If the definition of welders was expanded to include welding tradesmen, as used by Robert Park, the MOR was 0.83 (95% CI, 0.78–0.88) for all years available in the NCHS dataset.

When Stampfer conducted an age-restricted analysis, which properly compared welders or welding tradesmen with non-welding tradesmen, with death under age 65, he similarly obtained no associations for PD MOR. For the years 1985-1991, death under 65 from PD, Stampfer found MORs 0.99 (95% CI, 0.44–2.22) for just welders, and 0.83 (95% CI, 0.48–1.44) all welding tradesmen.

And for 1992-1999, the years used by Park (2005), and similar to the date range used by Martin Wells, for PD deaths at under age 65, for welders only, Stampfer found a MOR of 1.44 (95% CI, 0.79–2.62), and for all welding tradesmen, 1.20 (95% CI, 0.79–1.84).
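The reported intervals can be checked mechanically against a mortality odds ratio of 1.0. Below is a minimal sketch in Python that uses only the point estimates and 95% intervals quoted in this post; the labels are shorthand added for illustration.

```python
# (label, MOR, lower 95% limit, upper 95% limit), as quoted in the text above.
STAMPFER = [
    ("welders, all ages, 1985-1999", 0.85, 0.77, 0.94),
    ("welding tradesmen, all ages, all years", 0.83, 0.78, 0.88),
    ("welders, under 65, 1985-1991", 0.99, 0.44, 2.22),
    ("welding tradesmen, under 65, 1985-1991", 0.83, 0.48, 1.44),
    ("welders, under 65, 1992-1999", 1.44, 0.79, 2.62),
    ("welding tradesmen, under 65, 1992-1999", 1.20, 0.79, 1.84),
]
PARK = ("Park (2005), welding tradesmen, 'MOR, age <65'", 1.77, 1.08, 2.75)

def classify(lower, upper):
    """Say where a 95% interval sits relative to an odds ratio of 1.0."""
    if lower > 1.0:
        return "interval entirely above 1.0"
    if upper < 1.0:
        return "interval entirely below 1.0"
    return "interval includes 1.0"

for label, mor, lower, upper in STAMPFER + [PARK]:
    print(f"{label}: MOR {mor:.2f} ({lower:.2f}-{upper:.2f}) -> {classify(lower, upper)}")
```

The two all-ages estimates fall entirely below 1.0, and all four under-65 estimates include 1.0. Only Park's reported figure sits entirely above 1.0, and that figure is the internal age comparison discussed above.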

None of Park’s slicing, dicing, and subgrouping of welding and PD results could be replicated. Although Dr. Stampfer submitted a report in Steele, there remained the problem that Park (2005) was a peer-reviewed paper, and that plaintiffs’ counsel, expert witnesses, and other published papers were citing it for its claimed results and errant discussion. The defense asked Dr. Stampfer whether the “least publishable unit” had been achieved, and Stampfer reluctantly agreed. He wrote up his analysis, and published it in 2009, with an appropriate disclosure6. Meir J. Stampfer, “Welding Occupations and Mortality from Parkinson’s Disease and Other Neurodegenerative Diseases Among United States Men, 1985–1999,” 6 J. Occup. & Envt’l Hygiene 267 (2009).

Professor Stampfer’s paper may not be the most important contribution to the epidemiology of Parkinson’s disease, but it corrected the distortions and misrepresentations of data in Robert Park’s paper. His paper has since been cited by well-known researchers in support of their conclusion that there is no association between welding and Parkinson’s disease7. Park’s paper has been criticized on PubPeer, with no rebuttal8.

Almost comically, Park has cited Stampfer’s study tendentiously for a claim that there is a healthy worker bias present in the available epidemiology of welding and PD, without noting, or responding to, the devastating criticism of his own Park (2005) work:

“For a mortality study of neurodegenerative disease deaths in the United States during 1985 – 1999, Stampfer [61] used the Cause of Death database of the US National Center for Health Statistics and observed adjusted mortality odds ratios for PD of 0.85 (95% CI, 0.77 – 0.94) and 0.83 (95% CI, 0.78 – 0.88) in welders, using two definitions of welding occupations [61]. This supports the presence of a significant HWE [healthy worker effect] among welders. An even stronger effect was observed in welders for motor neuron disease (amyotrophic lateral sclerosis, OR 0.71, 95% CI, 0.56 – 0.89), a chronic condition that clearly would affect welders’ ability to work.”

Robert M. Park, “Neurobehavioral Deficits and Parkinsonism in Occupations with Manganese Exposure: A Review of Methodological Issues in the Epidemiological Literature,” 4 Safety & Health at Work 123, 126 (2013). Amyotrophic lateral sclerosis has a sudden onset, usually in middle age, without any real prodromal signs or symptoms, which would keep a young man from entering welding as a trade. Just shows you can get any opinion published in a peer-reviewed journal, somewhere. Stampfer’s paper, along with Mortimer’s meta-analysis, helped put the kibosh on welding fume litigation.


A few weeks ago, the Sixth Circuit affirmed the dismissal of a class action that was attempted based upon claims of environmental manganese exposure. Abrams v. Nucor Steel Marion, Inc., Case No. 3:13 CV 137, 2015 WL 6872511 (N.D. Ohio Nov. 9, 2015) (finding testimony of neurologist Jonathan Rutchik to be nugatory, and excluding his proffered opinions), aff’d, 2017 U.S. App. LEXIS 9323 (6th Cir. May 25, 2017). Class plaintiffs employed one of the regulars, Jonathan Rutchik, from the welding fume parkinsonism litigation.

2 Samuel L. Tarry, Jr., “Can Litigation-Generated Science Promote Public Health?” 33 Am. J. Trial Advocacy 315 (2009)

3 Ostensibly, but not really. Robert M. Park was an employee of NIOSH, but he had spent most of his career working as an employee for the United Autoworkers labor union. The paper acknowledged help from Ed Baker, David Savitz, and Kyle Steenland. Baker is a colleague and associate of B.S. Levy, who was an expert witness for plaintiffs in the welding fume litigation, as well as many others. The article was published in the “red” journal, the American Journal of Industrial Medicine.

4 The welding tradesmen included in the analyses were welders and cutters, boilermakers, structural metal workers, millwrights, plumbers, pipefitters, and steamfitters. Robert M. Park, Paul A. Schulte, Joseph D. Bowman, James T. Walker, Stephen C. Bondy, Michael G. Yost, Jennifer A. Touchstone, and Mustafa Dosemeci, “Potential Occupational Risks for Neurodegenerative Diseases,” 48 Am. J. Ind. Med. 63, 65a, ¶2 (2005).

5 Id.

6 “The project was supported in part through a consulting agreement with a group of manufacturers of welding consumables who had no role in the analysis, or in preparing this report, did not see any draft of this manuscript prior to submission for publication, and had no control over any aspect of the work or its publication.” Stampfer, at 272.

7 Karin Wirdefeldt, Hans-Olov Adami, Philip Cole, Dimitrios Trichopoulos, and Jack Mandel, “Epidemiology and etiology of Parkinson’s disease: a review of the evidence,” 26 Eur. J. Epidemiol. S1 (2011).

8 The criticisms can be found at <>, last visited on July 10, 2017.