TORTINI

For your delectation and delight, desultory dicta on the law of delicts.

David Madigan’s Graywashed Meta-Analysis in Taxotere MDL

June 12th, 2020

Once again, a meta-analysis is advanced as a basis for an expert witness’s causation opinion, and once again, the opinion is the subject of a Rule 702 challenge. The litigation is In re Taxotere (Docetaxel) Products Liability Litigation, a multi-district litigation (MDL) proceeding before Judge Jane Triche Milazzo, who sits on the United States District Court for the Eastern District of Louisiana.

Taxotere is the brand name for docetaxel, a chemotherapeutic medication used either alone or in conjunction with other chemotherapies to treat a number of different cancers. Hair loss is a known side effect of Taxotere, but in the MDL, plaintiffs claim that they experienced permanent hair loss, of which, in their view, they were not adequately warned. The litigation thus involved issues of exactly what “permanent” means, medical causation, the adequacy of warnings in the Taxotere package insert, and warnings causation.

Defendant Sanofi challenged plaintiffs’ statistical expert witness, David Madigan, a frequent testifier for the lawsuit industry. In its Rule 702 motion, Sanofi argued that Madigan had relied upon two randomized clinical trials (TAX 316 and GEICAM 9805) that evaluated “ongoing alopecia” to reach conclusions about “permanent alopecia.” Sanofi made the point that “ongoing” is not “permanent,” and that trial participants who had ongoing alopecia may have had their hair grow back. Madigan’s reliance upon an end point different from the one plaintiffs complained about, in Sanofi’s view, made his analysis irrelevant. The MDL court rejected Sanofi’s argument, with the observation that Madigan’s analysis was not irrelevant for using the wrong end point, only less persuasive, and that Sanofi’s criticism was one that “Sanofi can highlight for the jury on cross-examination.”[1]

Did Judge Milazzo engage in judicial dodging by rejecting the relevancy argument and emphasizing the truism that Sanofi could highlight the discrepancy on cross-examination? In the sense that the disconnect can easily be shown by highlighting the different event rates for the differently defined alopecia outcomes, the Sanofi argument seems like one that a jury could readily grasp and refute. The judicial shrug, however, leaves unanswered the question why the defendant should have to address a data analysis that does not support the plaintiffs’ contention about “permanence.” The federal rules are supposed to advance the finding of the truth and the fair, speedy resolution of cases.

Sanofi’s more interesting argument, from the perspective of Rule 702 case law, was its claim that Madigan had relied upon a flawed methodology in analyzing the two clinical trials:

“Sanofi emphasizes that the results of each study individually produced no statistically significant results. Sanofi argues that Dr. Madigan cannot now combine the results of the studies to achieve statistical significance. The Court rejects Sanofi’s argument and finds that Sanofi’s concern goes to the weight of Dr. Madigan’s testimony, not to its admissibility.34”[2]

There seems to be a lot going on in the Rule 702 challenge that is not revealed in the cryptic language of the MDL district court. First, the court deployed the jurisprudentially horrific, conclusory language to dismiss a challenge that “goes to the weight …, not to … admissibility.” As discussed elsewhere, this judicial locution is rarely true, fails to explain the decision, and shows a lack of engagement with the actual challenge.[3] Of course, aside from the inanity of the expression, and the failure to explain or justify the denial of the Rule 702 challenge, the MDL court may have been able to provide a perfectly adequate explanation.

Second, the footnote in the quoted language, number 34, was to the infamous Milward case,[4] with the explanatory parenthetical that the First Circuit had reversed a district court for excluding testimony of an expert witness who had sought to “draw conclusions based on combination of studies, finding that alleged flaws identified by district court go to weight of testimony not admissibility.”[5] As discussed previously, the widespread use of the “weight not admissibility” locution, even by the Court of Appeals, does not justify it. More important, however, the invocation of Milward suggests that any alleged flaws in combining study results in a meta-analysis are always matters for the jury, no matter how arcane, technical, or threatening to validity they may be.

So was Judge Milazzo engaged in judicial dodging in Her Honor’s opinion in Taxotere? Although the citation to Milward tends to inculpate, the cursory description of the challenge raises questions about whether the challenge itself was valid in the first place. Fortunately, in this era of electronic dockets, finding the actual Rule 702 motion is not difficult, and we can inspect the challenge to see whether it was dodged or given short shrift. Remarkably, the reality is much more complicated than the simplistic rejection by the MDL court would suggest.

Sanofi’s brief attacked three separate analyses proffered by David Madigan, and not surprisingly, the MDL court did not address every point made by Sanofi.[6] Sanofi’s point about the inappropriateness of the meta-analysis was the third in its supporting brief:

“Third, Dr. Madigan conducted a statistical analysis on the TAX316 and GEICAM9805/TAX301 clinical trials separately and combined them to do a ‘meta-analysis’. But Dr. Madigan based his analysis on unproven assumptions, rendering his methodology unreliable. Even without those assumptions, Dr. Madigan did not find statistical significance for either of the clinical trials independently, making this analysis unhelpful to the trier of fact.”[7]

This introductory statement of the issue is itself not particularly helpful because it fails to explain why combining two individual randomized clinical trials (“RCTs”), each without “statistically significant” results, by meta-analysis would be unhelpful. Sanofi’s brief identified other problems with Madigan’s analyses, but eventually returned to the meta-analysis issue, with this heading:

“Dr. Madigan’s analysis of the individual clinical trials did not result in statistical significance, thus is unhelpful to the jury and will unfairly prejudice Sanofi.”[8]

After a discussion of some of the case law about statistical significance, Sanofi pressed its case against Madigan. Madigan’s statistical analysis of each of two RCTs apparently did not reach statistical significance, and Sanofi complained that permitting Madigan to present these two analyses with results that were “not statistically very impressive,” would confuse and mislead the jury.[9]

“Dr. Madigan tried to avoid that result here [of having two statistically non-significant results] by conducting a ‘meta-analysis’ — a greywashed term meaning that he combined two statistically insignificant results to try to achieve statistical significance. Madigan Report at 20 ¶ 53. Courts have held that meta-analyses are admissible, but only when used to reduce the numerical instability on existing statistically significant differences, not as a means to achieve statistical significance where it does not exist. RMSE at 361–362, fn76.”

Now the claims here are quite unsettling, especially considering that they were lodged in a defense brief, in an MDL, with many cases at stake, made on behalf of an important pharmaceutical company, represented by two large, capable national or international law firms.

First, what does the defense brief signify by placing ‘meta-analysis’ in quotes? Are these scare quotes, to suggest that Madigan was passing off something as a meta-analysis that failed to be one? If so, nothing in the remainder of the brief explains such an interpretation. Meta-analysis has been around for decades, and the reporting of meta-analyses of observational or experimental studies has been the subject of numerous consensus and standard-setting papers over the last two decades. Furthermore, the FDA has now issued a draft guidance for the use of meta-analyses in pharmacoepidemiology. The scare quotes are at best unexplained; at worst, inappropriate. If the authors had something else in mind, they did not explain it.

Second, the defense lawyers referred to meta-analysis as a “greywashed” term. I am always eager to expand my vocabulary, and so I looked up the word in various dictionaries of statistical and epidemiologic terms. Nothing there. Perhaps it was not a technical term, so I checked with the venerable Oxford English Dictionary. No relevant entries.

Pushed to the wall, I checked the font of all knowledge – the internet. To be sure, I found definitions, but nothing that could explain this odd locution in a brief filed in an important motion:

gray-washing: “noun In calico-bleaching, an operation following the singeing, consisting of washing in pure water in order to wet out the cloth and render it more absorbent, and also to remove some of the weavers’ dressing.”

graywashed: “adj. adopting all the world’s cultures but not really belonging to any of them; in essence, liking a little bit of everything but not everything of a little bit.”

Those definitions do not appear pertinent.

Another website offered a definition based upon the “blogsphere”:

Graywash: “A fairly new term in the blogsphere, this means an investigation that deals with an offense strongly, but not strongly enough in the eyes of the speaker.”

Hmmm. Still not on point.

Another one from “Urban Dictionary” might capture something of what was being implied:

Graywashing: “The deliberate, malicious act of making art having characters appear much older and uglier than they are in the book, television, or video game series.”

Still, I am not sure how this is an argument that a federal judge can respond to in a motion affecting many cases.

Perhaps, you say, I am quibbling with word choices, and I am not sufficiently in tune with the way people talk in the Eastern District of Louisiana. I plead guilty to both counts. But the third, and most important point, is the defense assertion that meta-analyses are only admissible “when used to reduce the numerical instability on existing statistically significant differences, not as a means to achieve statistical significance where it does not exist.”

This assertion is truly puzzling. Meta-analyses involve so many layers of hearsay that they will virtually never be admissible. Admissibility of the meta-analyses is virtually never the issue. When an expert witness has conducted a meta-analysis, or has relied upon one, the important legal question is whether the witness may reasonably rely upon the meta-analysis (under Rule 703) for an inference that satisfies Rule 702. The meta-analysis itself does not come into evidence, and does not go out to the jury for its deliberations.

But what about the defense brief’s “only when” language that clearly implies that courts have held that expert witnesses may rely upon meta-analyses only to reduce “numerical instability on existing statistically significant differences”? This seems clearly wrong because achieving statistical significance from studies that individually lack statistical significance, but show no “instability” in their point estimates, is a perfectly legitimate and valid goal. Consider a situation in which, for some reason, sample size in each study is limited by the available observations, but we have ten studies, each with a point estimate of 1.5, and each with a 95% confidence interval of (0.88, 2.5). This hypothetical presents no instability of point estimates, and the meta-analysis would yield the same summary point estimate with a narrower confidence interval, whose lower bound excludes 1.0, in a perfectly valid analysis. In the real world, meta-analyses are conducted on studies with point estimates of risk that vary, because of random and non-random error, but there is no reason that a meta-analysis cannot reduce random error to show that the summary point estimate is statistically significant at a pre-specified alpha, even though no constituent study was statistically significant.
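The arithmetic of the hypothetical can be checked in a few lines. This is an illustrative fixed-effect (inverse-variance) calculation on the made-up numbers in the text (risk ratio 1.5, 95% confidence interval 0.88 to 2.50, ten identical studies), not a reanalysis of any actual trial:

```python
import math

# Hypothetical from the text: ten studies, each reporting a risk
# ratio of 1.5 with a 95% confidence interval of (0.88, 2.50).
# Illustrative numbers only, not data from any actual trial.
rr, lo, hi, n_studies = 1.5, 0.88, 2.50, 10

# Work on the log scale; back out each study's standard error
# from the width of its confidence interval.
log_rr = math.log(rr)
se = (math.log(hi) - math.log(lo)) / (2 * 1.96)

# Any single study misses statistical significance at alpha = 0.05 ...
z_single = log_rr / se                    # about 1.52, below 1.96

# ... but fixed-effect (inverse-variance) pooling of ten identical
# studies divides the standard error by sqrt(10).
se_pooled = se / math.sqrt(n_studies)
z_pooled = log_rr / se_pooled             # about 4.81, well above 1.96

ci_pooled = (math.exp(log_rr - 1.96 * se_pooled),
             math.exp(log_rr + 1.96 * se_pooled))
print(round(z_single, 2), round(z_pooled, 2))    # 1.52 4.81
print(tuple(round(x, 2) for x in ci_pooled))     # (1.27, 1.77)
```

Each study alone falls short of the conventional 1.96 cutoff, yet the pooled estimate, with the same point estimate and a standard error shrunk by the square root of ten, is statistically significant with a lower confidence bound that excludes 1.0. That is precisely the legitimate use of meta-analysis that the brief’s “only when” assertion would rule out.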

Sanofi’s lawyers did not cite to any case for the remarkable proposition they advanced, but they did cite the Reference Manual for Scientific Evidence (RMSE). Earlier in the brief, the defense cited to this work in its third edition (2011), and so I turned to the cited page (“RMSE at 361–362, fn76”) only to find the introduction to the chapter on survey research, with footnotes 1 through 6.

After a diligent search through the third edition, I could not find any other language remotely supportive of the assertion by Sanofi’s counsel. There are important discussions about how a poorly conducted meta-analysis, or a meta-analysis that was heavily weighted in a direction by a methodologically flawed study, could render an expert witness’s opinion inadmissible under Rule 702.[10] Indeed, the third edition has a more sustained discussion of meta-analysis under the heading “VI. What Methods Exist for Combining the Results of Multiple Studies,”[11] but nothing in that discussion comes close to supporting the remarkable assertion by defense counsel.

On a hunch, I checked the second edition of RMSE, published in the year 2000. There was indeed a footnote 76, on page 361, which discussed meta-analysis, in the midst of the superseded edition’s chapter on epidemiology. Nothing in the text or in the cited footnote, however, supports the defense’s contention that meta-analyses are appropriate only when each included clinical trial has independently reported a statistically significant result.

If this analysis is correct, the MDL court was fully justified in rejecting the defense argument that combining two statistically non-significant clinical trials to yield a statistically significant result was methodologically infirm. No cases were cited, and the Reference Manual does not support the contention. Furthermore, no statistical text or treatise on meta-analysis supports the Sanofi claim. Sanofi did not support its motion with any affidavits of experts on meta-analysis.

Now there were other arguments advanced in support of excluding David Madigan’s testimony. Indeed, there was a very strong methodological challenge to Madigan’s decision to include the two RCTs in his meta-analysis, other than those RCTs’ lack of statistical significance on the end point at issue. In the words of the Sanofi brief:

“Both TAX clinical trials examined two different treatment regimens, TAC (docetaxel in combination with doxorubicin and cyclophosphamide) versus FAC (5-fluorouracil in combination with doxorubicin and cyclophosphamide). Madigan Report at 18–19 ¶¶ 47–48. Dr. Madigan admitted that TAC is not Taxotere alone, Madigan Dep. 305:21–23 (Ex. B); however, he did not rule out doxorubicin or cyclophosphamide in his analysis. Madigan Dep. 284:4–12 (“Q. You can’t rule out other chemotherapies as causes of irreversible alopecia? … A. I can’t rule out — I do not know, one way or another, whether other chemotherapy agents cause irreversible alopecia.”).”[12]

Now unlike the statistical significance argument, this argument is rather straightforward and turns on the clinical heterogeneity of the two trials, which points to the invalidity of a meta-analysis combining them. Sanofi’s lawyers could easily have supported this point with statements from standard textbooks and non-testifying experts (but alas did not). Sanofi did support its challenge, however, with citations to an important litigation and Fifth Circuit precedent.[13]

This closer look at the actual challenge to David Madigan’s opinions suggests that Sanofi’s counsel may have diluted very strong arguments about heterogeneity in the exposure variable, and in the outcome variable, by advancing what seems a very doubtful argument based upon the lack of statistical significance of the individual studies in Madigan’s meta-analysis.

Sanofi advanced two very strong points, first about the irrelevant outcome variable definitions used by Madigan, and second about the complexity of Taxotere’s being used with other, and different, chemotherapeutic agents in each of the two trials that Madigan combined.[14] The MDL court addressed the first point in a perfunctory and ultimately unsatisfactory fashion, but did not address the second point at all.

Ultimately, the result was that Madigan was given a pass to offer extremely tenuous causation opinions in an MDL. Given that Madigan has proffered tendentious opinions in the past, and has been characterized as “an expert on a mission,” whose opinions are “conclusion driven,”[15] the missteps in the briefing and the MDL court’s abridgement of the gatekeeping process are regrettable. Also regrettable is that the merits or demerits of a Rule 702 challenge cannot be fairly evaluated from cursory, conclusory judicial decisions riddled with meaningless verbiage such as “the challenge goes to the weight and not the admissibility of the witness.” Access to the actual Rule 702 motion shed important light on the inadequacy of one point in the motion, but also on the complexity and fullness of the challenge, which the MDL court’s decision did not fully address. It is possible that a Reply or a Supplemental brief, or oral argument, filled in gaps, corrected errors, or modified the motion, and that the above analysis has missed some important aspect of what happened in the Taxotere MDL. If so, all the more reason that we need better judicial gatekeeping, especially when a decision can affect thousands of pending cases.[16]


[1]  In re Taxotere (Docetaxel) Prods. Liab. Litig., 2019 U.S. Dist. LEXIS 143642, at *13 (E.D. La. Aug. 23, 2019) [Op.]

[2]  Op. at *13-14.

[3]  “Judicial Dodgers – Weight not Admissibility” (May 28, 2020).

[4]  Milward v. Acuity Specialty Prods. Grp., Inc., 639 F.3d 11, 17-22 (1st Cir. 2011).

[5]  Op. at *13-14 (quoting and citing Milward, 639 F.3d at 17-22).

[6]  Memorandum in Support of Sanofi Defendants’ Motion to Exclude Expert Testimony of David Madigan, Ph.D., Document 6144, in In re Taxotere (Docetaxel) Prods. Liab. Litig. (E.D. La. Feb. 8, 2019) [Brief].

[7]  Brief at 2; see also Brief at 14 (restating without initially explaining why combining two statistically non-significant RCTs by meta-analysis would be unhelpful).

[8]  Brief at 16.

[9]  Brief at 17 (quoting from Madigan Dep. 256:14–15).

[10]  Michael D. Green, Michael Freedman, and Leon Gordis, “Reference Guide on Epidemiology,” at 581n.89, in Fed. Jud. Center, Reference Manual on Scientific Evidence (3d ed. 2011).

[11]  Id. at 606.

[12]  Brief at 14.

[13]  Brief at 14, citing Burst v. Shell Oil Co., C. A. No. 14–109, 2015 WL 3755953, at *7 (E.D. La. June 16, 2015) (Vance, J.) (quoting LeBlanc v. Chevron USA, Inc., 396 F. App’x 94, 99 (5th Cir. 2010)) (“[A] study that notes ‘that the subjects were exposed to a range of substances and then nonspecifically note[s] increases in disease incidence’ can be disregarded.”), aff’d, 650 F. App’x 170 (5th Cir. 2016). See “The One Percent Non-solution – Infante Fuels His Own Exclusion in Gasoline Leukemia Case” (June 25, 2015).

[14]  Brief at 14-16.

[15]  In re Accutane Litig., 2015 WL 753674, at *19 (N.J.L.Div., Atlantic Cty., Feb. 20, 2015), aff’d, 234 N.J. 340, 191 A.3d 560 (2018). See “Johnson of Accutane – Keeping the Gate in the Garden State” (Mar. 28, 2015); “N.J. Supreme Court Uproots Weeds in Garden State’s Law of Expert Witnesses” (Aug. 8, 2018).

[16]  Cara Salvatore, “Sanofi Beats First Bellwether In Chemo Drug Hair Loss MDL,” Law360 (Sept. 27, 2019).

Dodgy Data Duck Daubert Decisions

March 11th, 2020

Judges say the darndest things, especially when it comes to their gatekeeping responsibilities under Federal Rules of Evidence 702 and 703. One of the darndest things judges say is that they do not have to assess the quality of the data underlying an expert witness’s opinion.

Even when acknowledging their obligation to “assess the reasoning and methodology underlying the expert’s opinion, and determine whether it is both scientifically valid and applicable to a particular set of facts,”[1] judges have excused themselves from having to look at the trustworthiness of the underlying data for assessing the admissibility of an expert witness’s opinion.

In McCall v. Skyland Grain LLC, the defendant challenged an expert witness’s reliance upon oral reports of clients. The witness, Mr. Bradley Walker, asserted that he regularly relied upon such reports in contexts similar to the allegations that the defendant had misapplied herbicide to plaintiffs’ crops. The trial court ruled that the defendant could cross-examine the declarant, who was available at trial, and concluded that the “reliability of that underlying data can be challenged in that manner and goes to the weight to be afforded Mr. Walker’s conclusions, not their admissibility.”[2] Remarkably, the district court never evaluated the reasonableness of Mr. Walker’s reliance upon client reports in this or any context.

In another federal district court case, Rodgers v. Beechcraft Corporation, the trial judge explicitly acknowledged the responsibility to assess whether the expert witness’s opinion was based upon “sufficient facts and data,” but disclaimed any obligation to assess the quality of the underlying data.[3] The trial court in Rodgers cited a Tenth Circuit case from 2005,[4] which in turn cited the Supreme Court’s 1993 decision in Daubert, for the proposition that the admissibility review of an expert witness’s opinion was limited to a quantitative sufficiency analysis, and precluded a qualitative analysis of the underlying data’s reliability. Quoting from another district court criminal case, the court in Rodgers announced that “the Court does not examine whether the facts obtained by the witness are themselves reliable – whether the facts used are qualitatively reliable is a question of the weight to be given the opinion by the factfinder, not the admissibility of the opinion.”[5]

In a 2016 decision, United States v. DishNetwork LLC, the court explicitly disclaimed that it was required to “evaluate the quality of the underlying data or the quality of the expert’s conclusions.”[6] This district court pointed to a Seventh Circuit decision, which maintained that  “[t]he soundness of the factual underpinnings of the expert’s analysis and the correctness of the expert’s conclusions based on that analysis are factual matters to be determined by the trier of fact, or, where appropriate, on summary judgment.”[7] The Seventh Circuit’s decision, however, issued in June 2000, several months before the effective date of the amendments to Federal Rule of Evidence 702 (December 2000).

In 2012, a magistrate judge issued an opinion along the same lines, in Bixby v. KBR, Inc.[8] After acknowledging what must be done in ruling on a challenge to an expert witness, the judge took joy in what could be overlooked. If the facts or data upon which the expert witness has relied are “minimally sufficient,” then the gatekeeper can declare that questions about “the nature or quality of the underlying data bear upon the weight to which the opinion is entitled or to the credibility of the expert’s opinion, and do not bear upon the question of admissibility.”[9]

There need not be any common law mysticism to the governing standard. The relevant law is, of course, a statute, which appears to be forgotten in many of the failed gatekeeping decisions:

Rule 702. Testimony by Expert Witnesses

A witness who is qualified as an expert by knowledge, skill, experience, training, or education may testify in the form of an opinion or otherwise if:

(a) the expert’s scientific, technical, or other specialized knowledge will help the trier of fact to understand the evidence or to determine a fact in issue;

(b) the testimony is based on sufficient facts or data;

(c) the testimony is the product of reliable principles and methods; and

(d) the expert has reliably applied the principles and methods to the facts of the case.

It would seem that you could not produce testimony that is the product of reliable principles and methods by starting with unreliable underlying facts and data. Certainly, having a reliable method would require selecting reliable facts and data from which to start. What good is the reliable application of reliable principles to crummy data?

The Advisory Committee Note to Rule 702 hints at an answer to the problem:

“There has been some confusion over the relationship between Rules 702 and 703. The amendment makes clear that the sufficiency of the basis of an expert’s testimony is to be decided under Rule 702. Rule 702 sets forth the overarching requirement of reliability, and an analysis of the sufficiency of the expert’s basis cannot be divorced from the ultimate reliability of the expert’s opinion. In contrast, the ‘reasonable reliance’ requirement of Rule 703 is a relatively narrow inquiry. When an expert relies on inadmissible information, Rule 703 requires the trial court to determine whether that information is of a type reasonably relied on by other experts in the field. If so, the expert can rely on the information in reaching an opinion. However, the question whether the expert is relying on a sufficient basis of information—whether admissible information or not—is governed by the requirements of Rule 702.”

The answer is only partially satisfactory. First, if the underlying data are independently admissible, then there may indeed be no gatekeeping of an expert witness’s reliance upon such data. Rule 703 imposes a reasonableness test for reliance upon inadmissible underlying facts and data, but appears to give otherwise admissible facts and data a pass. Second, the above judicial decisions do not mention any Rule 703 challenge to the expert witnesses’ reliance. If so, then there is a clear lesson for counsel. When framing a challenge to the admissibility of an expert witness’s opinion, show that the witness has unreasonably relied upon facts and data, from whatever source, in violation of Rule 703. Then show that without the unreasonably relied upon facts and data, the witness cannot show that his or her opinion satisfies Rule 702(a)-(d).


[1]  See, e.g., McCall v. Skyland Grain LLC, Case 1:08-cv-01128-KHV-BNB, Order (D. Colo. June 22, 2010) (Brimmer, J.) (citing Dodge v. Cotter Corp., 328 F.3d 1212, 1221 (10th Cir. 2003), citing in turn Daubert v. Merrell Dow Pharms., Inc., 509 U.S. 579, 592-93 (1993)).

[2]  McCall v. Skyland Grain LLC Case 1:08-cv-01128-KHV-BNB, Order at p.9 n.6 (D. Colo. June 22, 2010) (Brimmer, J.)

[3]  Rodgers v. Beechcraft Corp., Case No. 15-CV-129-CVE-PJC, Report & Recommendation at p.6 (N.D. Okla. Nov. 29, 2016).

[4]  Id., citing United States v. Lauder, 409 F.3d 1254, 1264 (10th Cir. 2005) (“By its terms, the Daubert opinion applies only to the qualifications of an expert and the methodology or reasoning used to render an expert opinion” and “generally does not, however, regulate the underlying facts or data that an expert relies on when forming her opinion.”), citing Daubert v. Merrell Dow Pharms., Inc., 509 U.S. 579, 592-93 (1993).

[5]  Id., citing and quoting United States v. Crabbe, 556 F. Supp. 2d 1217, 1223 (D. Colo. 2008) (emphasis in original). In Crabbe, the district judge mostly excluded the challenged expert witness, thus rendering its verbiage on the quality of data obiter dicta. The pronouncements about the nature of gatekeeping proved harmless error when the court dismissed the case on other grounds. Rodgers v. Beechcraft Corp., 248 F. Supp. 3d 1158 (N.D. Okla. 2017) (granting summary judgment).

[6]  United States v. DishNetwork LLC, No. 09-3073, Slip op. at 4-5 (C.D. Ill. Jan. 13, 2016) (Myerscough, J.)

[7]  Smith v. Ford Motor Co., 215 F.3d 713, 718 (7th Cir. 2000).

[8]  Bixby v. KBR, Inc., Case 3:09-cv-00632-PK, Slip op. at 6-7 (D. Ore. Aug. 29, 2012) (Papak, M.J.)

[9]  Id. (citing Hangarter v. Provident Life & Accident Ins. Co., 373 F.3d 998, 1017 (9th Cir. 2004), quoting Children’s Broad. Corp. v. Walt Disney Co., 357 F.3d 860, 865 (8th Cir. 2004) (“The factual basis of an expert opinion goes to the credibility of the testimony, not the admissibility, and it is up to the opposing party to examine the factual basis for the opinion in cross-examination.”)).

Science Bench Book for Judges

July 13th, 2019

On July 1st of this year, the National Judicial College and the Justice Speakers Institute, LLC released an online publication of the Science Bench Book for Judges [Bench Book]. The Bench Book sets out to cover much of the substantive material already covered by the Federal Judicial Center’s Reference Manual:

Acknowledgments

Table of Contents

  1. Introduction: Why This Bench Book?
  2. What is Science?
  3. Scientific Evidence
  4. Introduction to Research Terminology and Concepts
  5. Pre-Trial Civil
  6. Pre-trial Criminal
  7. Trial
  8. Juvenile Court
  9. The Expert Witness
  10. Evidence-Based Sentencing
  11. Post Sentencing Supervision
  12. Civil Post Trial Proceedings
  13. Conclusion: Judges—The Gatekeepers of Scientific Evidence

Appendix 1 – Frye/Daubert—State-by-State

Appendix 2 – Sample Orders for Criminal Discovery

Appendix 3 – Biographies

The Bench Book gives some good advice in very general terms about the need to consider study validity,[1] and to approach scientific evidence with care and “healthy skepticism.”[2] When the Bench Book attempts to instruct on what it represents as the scientific method of hypothesis testing, the good advice unravels:

“A scientific hypothesis simply cannot be proved. Statisticians attempt to solve this dilemma by adopting an alternate [sic] hypothesis – the null hypothesis. The null hypothesis is the opposite of the scientific hypothesis. It assumes that the scientific hypothesis is not true. The researcher conducts a statistical analysis of the study data to see if the null hypothesis can be rejected. If the null hypothesis is found to be untrue, the data support the scientific hypothesis as true.”[3]

Even in experimental settings, a statistical analysis of the data does not lead to a conclusion that the null hypothesis is untrue, as opposed to not reasonably compatible with the study’s data. In observational studies, the statistical analysis must also acknowledge whether, and to what extent, the study has excluded bias and confounding. When the Bench Book turns to speak of statistical significance, more trouble ensues:

“The goal of an experiment, or observational study, is to achieve results that are statistically significant; that is, not occurring by chance.”[4]

In the world of result-oriented science, and scientific advocacy, it is perhaps true that scientists seek to achieve statistically significant results. Still, it seems crass to come right out and say so, as opposed to saying that the scientists are querying the data to see whether they are compatible with the null hypothesis. This first pass at statistical significance is only mildly astray compared with the Bench Book’s more serious attempts to define statistical significance and confidence intervals:

“4.10 Statistical Significance

The research field agrees that study outcomes must demonstrate they are not the result of random chance. Leaving room for an error of .05, the study must achieve a 95% level of confidence that the results were the product of the study. This is denoted as p ≤ 05. (or .01 or .1).”[5]

and

“The confidence interval is also a way to gauge the reliability of an estimate. The confidence interval predicts the parameters within which a sample value will fall. It looks at the distance from the mean a value will fall, and is measured by using standard deviations. For example, if all values fall within 2 standard deviations from the mean, about 95% of the values will be within that range.”[6]

Of course, the interval speaks to the precision of the estimate, not its reliability, but that is a small point. These definitions are virtually guaranteed to confuse judges into conflating statistical significance and the coefficient of confidence with the legal burden of proof probability.
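The precision point is easy to see with a back-of-the-envelope Python sketch, using the standard Katz log-scale approximation for a risk ratio’s confidence interval (the counts are my own invented numbers): two studies with the same observed risk ratio but very different sizes yield intervals of very different widths.

```python
from math import exp, sqrt

def rr_ci(a, n1, b, n2, z=1.96):
    """Approximate 95% confidence interval for a risk ratio,
    via the Katz log-scale method: (rr, lower, upper)."""
    rr = (a / n1) / (b / n2)
    se = sqrt(1/a - 1/n1 + 1/b - 1/n2)  # SE of log(rr)
    return rr, rr * exp(-z * se), rr * exp(z * se)

# Same observed risk ratio of 2.0, tenfold difference in study size:
small = rr_ci(6, 100, 3, 100)
large = rr_ci(60, 1000, 30, 1000)
# Both intervals are centered on RR = 2.0, but the larger study's
# interval is far narrower -- the interval measures the precision of
# the estimate, not the probability that the true RR lies inside it.
```

The narrowing of the interval with sample size is exactly what “precision” means here; no probability attaches to any single realized interval containing the parameter.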

The Bench Book runs into problems in interpreting legal decisions, which would seem softer grist for the judicial mill. The authors present dictum from the Daubert decision as though it were a holding:[7]

“As noted in Daubert, ‘[t]he focus, of course, must be solely on principles and methodology, not on the conclusions they generate’.”

The authors fail to mention that this dictum was abandoned in Joiner, and that it is specifically rejected by statute, in the 2000 revision to Federal Rule of Evidence 702.

Early in the Bench Book, its authors present a subsection entitled “The Myth of Scientific Objectivity,” which they might have borrowed from Feyerabend or Derrida. The heading appears misleading because the text contradicts it:

“Scientists often develop emotional attachments to their work—it can be difficult to abandon an idea. Regardless of bias, the strongest intellectual argument, based on accepted scientific hypotheses, will always prevail, but the road to that conclusion may be fraught with scholarly cul-de-sacs.”[8]

In a similar vein, the authors misleadingly tell readers that “the forefront of science is rarely encountered in court,” and so “much of the science mentioned there shall be considered established….”[9] Of course, the reality is that many causal claims presented in court have already been rejected, or judged indeterminate, by the scientific community. And just when readers may think themselves safe from the goblins of nihilism, the authors launch into a theory of naïve probabilism, under which science is merely the placing of subjective probabilities upon data, based upon preconceived biases and beliefs:

“All of these biases and beliefs play into the process of weighing data, a critical aspect of science. Placing weight on a result is the process of assigning a probability to an outcome. Everything in the universe can be expressed in probabilities.”[10]

So help the expert witness who honestly (and correctly) testifies that the causal claim or its rejection cannot be expressed as a probability statement!

Although I have not read all of the Bench Book closely, there appears to be no meaningful discussion of Rule 703, or of the need to access underlying data to ensure that a proffered scientific opinion has used appropriate methodologies at every step in its development. Even a 412-page text cannot address every issue, but this one does little to point the judicial reader toward more in-depth treatments of the statistical and scientific methodological issues that arise in occupational and environmental disease claims, and in pharmaceutical products litigation.

The organizations involved in this Bench Book appear to be honest brokers of remedial education for judges. The writing of the Bench Book was funded by the State Justice Institute (SJI), a creation of federal legislation enacted with the laudatory goal of improving the quality of judging in state courts.[11] Despite its provenance in federal legislation, the SJI is a private, nonprofit corporation, governed by 11 directors appointed by the President and confirmed by the Senate. Six of the directors, a majority, are state court judges; the remainder are one state court administrator and four members of the public (no more than two from any one political party). The function of the SJI is to award grants to improve judging in state courts.

The National Judicial College (NJC) originated in the early 1960s, from the efforts of the American Bar Association, the American Judicature Society, and the Institute of Judicial Administration, to provide education for judges. In 1977, the NJC became a Nevada not-for-profit 501(c)(3) educational corporation, with its campus at the University of Nevada, Reno, where judges can go for training and recreational activities.

The Justice Speakers Institute appears to be a for-profit company that provides educational resources for judges. A press release touts the Bench Book and follow-on webinars. Caveat emptor.

The rationale for this Bench Book is open to question. Unlike the Reference Manual for Scientific Evidence, which was co-produced by the Federal Judicial Center and the National Academies of Science, the Bench Book’s authors are lawyers and judges, without any subject-matter expertise. Unlike the Reference Manual, the Bench Book’s chapters have no scientist or statistician authors, and it shows. Remarkably, the Bench Book does not appear to cite to the Reference Manual or the Manual on Complex Litigation, at any point in its discussion of the federal law of expert witnesses or of scientific or statistical method. Perhaps taxpayers would have been spared substantial expense if state judges were simply encouraged to read the Reference Manual.


[1]  Bench Book at 190.

[2]  Bench Book at 174 (“Given the large amount of statistical information contained in expert reports, as well as in the daily lives of the general society, the ability to be a competent consumer of scientific reports is challenging. Effective critical review of scientific information requires vigilance, and some healthy skepticism.”).

[3]  Bench Book at 137; see also id. at 162.

[4]  Bench Book at 148.

[5]  Bench Book at 160.

[6]  Bench Book at 152.

[7]  Bench Book at 233, quoting Daubert v. Merrell Dow Pharms., Inc., 509 U.S. 579, 595 (1993).

[8]  Bench Book at 10.

[9]  Id. at 10.

[10]  Id. at 10.

[11] See State Justice Institute Act of 1984 (42 U.S.C. ch. 113, § 10701 et seq.).

Daubert Retrospective – Statistical Significance

January 5th, 2019

The holiday break was an opportunity and an excuse to revisit the briefs filed in the Supreme Court by parties and amici, in the Daubert case. The 22 amicus briefs in particular provided a wonderful basis upon which to reflect how far we have come, and also how far we have to go, to achieve real evidence-based fact finding in technical and scientific litigation. Twenty-five years ago, Rules 702 and 703 vied for control over errant and improvident expert witness testimony. With Daubert decided, Rule 702 emerged as the winner. Sadly, most courts seem to ignore or forget about Rule 703, perhaps because of its awkward wording. Rule 702, however, received the judicial imprimatur to support the policing and gatekeeping of dysepistemic claims in the federal courts.

As noted last week,1 the petitioners (plaintiffs) in Daubert advanced several lines of fallacious and specious argument, some of which were lost in the shuffle and page limitations of the Supreme Court briefings. The plaintiffs’ transposition fallacy received barely a mention, although it did draw at least a footnote in an important and overlooked amicus brief filed by the American Medical Association (AMA), the American College of Physicians, and over a dozen other medical specialty organizations.2 These organizations emphasized both the importance of statistical significance in interpreting epidemiologic studies and the fallacy of interpreting 95% confidence intervals as providing a measure of certainty about the estimated association as a parameter. The language of their amicus brief is noteworthy and still relevant to today’s controversies.

The AMA’s amicus brief, like the brief filed by the National Academies of Science and the American Association for the Advancement of Science, strongly endorsed a gatekeeping role for trial courts to exclude testimony not based upon rigorous scientific analysis:

“The touchstone of Rule 702 is scientific knowledge. Under this Rule, expert scientific testimony must adhere to the recognized standards of good scientific methodology including rigorous analysis, accurate and statistically significant measurement, and reproducibility.”3

Having incorporated the term “scientific knowledge,” Rule 702 could not permit anything less in expert witness testimony, lest it pollute federal courtrooms across the land.

Elsewhere, the AMA elaborated upon its reference to “statistically significant measurement”:

“Medical researchers acquire scientific knowledge through laboratory investigation, studies of animal models, human trials, and epidemiological studies. Such empirical investigations frequently demonstrate some correlation between the intervention studied and the hypothesized result. However, the demonstration of a correlation does not prove the hypothesized result and does not constitute scientific knowledge. In order to determine whether the observed correlation is indicative of a causal relationship, scientists necessarily rely on the concept of “statistical significance.” The requirement of statistical reliability, which tends to prove that the relationship is not merely the product of chance, is a fundamental and indispensable component of valid scientific methodology.”4

And then again, the AMA spelled out its position, in case the Court missed its other references to the importance of statistical significance:

“Medical studies, whether clinical trials or epidemiologic studies, frequently demonstrate some correlation between the action studied … . To determine whether the observed correlation is not due to chance, medical scientists rely on the concept of ‘statistical significance’. A ‘statistically significant’ correlation is generally considered to be one in which statistical analysis suggests that the observed relationship is not the result of chance. A statistically significant correlation does not ‘prove’ causation, but in the absence of such a correlation, scientific causation clearly is not proven.9”5

In its footnote 9, in the above quoted section of the brief, the AMA called out the plaintiffs’ transposition fallacy, without specifically citing to plaintiffs’ briefs:

“It is misleading to compare the 95% confidence level used in empirical research to the 51% level inherent in the preponderance of the evidence standard.”6

Actually, the plaintiffs’ ruse was much worse than misleading. The plaintiffs did not merely compare the two probabilities; they equated them. Some might call this ruse an outright fraud on the court. In any event, the AMA amicus brief remains an available, citable source for opposing this fraud and the casual dismissal of the importance of statistical significance.

One other amicus brief touched on the plaintiffs’ statistical shenanigans. The Product Liability Advisory Council, the National Association of Manufacturers, the Business Roundtable, and the Chemical Manufacturers Association jointly filed an amicus brief to challenge some of the excesses of the plaintiffs’ submissions.7 Plaintiffs’ expert witness, Shanna Swan, had calculated type II error rates and post-hoc power for selected epidemiologic studies relied upon by the defense. Swan’s complaint had been that some studies had only a 20% probability (power) of detecting a statistically significant doubling of limb-reduction risk, with significance at p < 0.05.8

The PLAC Brief pointed out that power calculations must assume an alternative hypothesis, and that the doubling-of-risk hypothesis had no basis in the evidentiary record. Although the PLAC complaint was correct, it missed the plaintiffs’ point: the defense had set a risk ratio exceeding 2.0 as an important benchmark for specific-causation attributability. Swan’s calculation of post-hoc power would have yielded an even lower probability for detecting risk ratios of 1.2 or so. More to the point, PLAC noted that other studies had much greater power, and that collectively the available studies had ample power for at least one study to achieve statistical significance without dodgy re-analyses.
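The dependence of power on the assumed alternative is easy to verify with a rough normal-approximation sketch in Python. The baseline risk and study size below are my own hypothetical numbers, not Swan’s: the point is only that a study with middling power to detect a doubling of risk has far less power to detect a risk ratio of 1.2.

```python
from math import log, sqrt
from statistics import NormalDist

def power_log_rr(rr_alt, p0, n_per_arm, alpha=0.05):
    """Approximate power of a two-arm study to detect risk ratio rr_alt
    against the null RR = 1, via the normal approximation on the log scale."""
    p1 = rr_alt * p0  # risk in the exposed arm under the alternative
    se = sqrt((1 - p1) / (p1 * n_per_arm) + (1 - p0) / (p0 * n_per_arm))
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return 1 - NormalDist().cdf(z_crit - log(rr_alt) / se)

# Hypothetical baseline risk of 1%, 1,000 subjects per arm:
power_double = power_log_rr(2.0, 0.01, 1000)  # power to detect RR = 2.0
power_small = power_log_rr(1.2, 0.01, 1000)   # far lower power for RR = 1.2
```

With these invented inputs, the power against RR = 1.2 is a small fraction of the power against RR = 2.0, which is the arithmetic behind the observation that Swan’s post-hoc power figures would have been even lower for modest risk ratios.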


1 “The Advocates’ Errors in Daubert” (Dec. 28, 2018).

2 American Academy of Allergy and Immunology, American Academy of Dermatology, American Academy of Family Physicians, American Academy of Neurology, American Academy of Orthopaedic Surgeons, American Academy of Pain Medicine, American Association of Neurological Surgeons, American College of Obstetricians and Gynecologists, American College of Pain Medicine, American College of Physicians, American College of Radiology, American Society of Anesthesiologists, American Society of Plastic and Reconstructive Surgeons, American Urological Association, and College of American Pathologists.

3 Brief of the American Medical Association, et al., as Amici Curiae, in Support of Respondent, in Daubert v. Merrell Dow Pharmaceuticals, Inc., U.S. Supreme Court no. 92-102, 1993 WL 13006285, at *27 (U.S., Jan. 19, 1993)[AMA Brief].

4 AMA Brief at *4-*5 (emphasis added).

5 AMA Brief at *14-*15 (emphasis added).

6 AMA Brief at *15 & n.9.

7 Brief of the Product Liability Advisory Council, Inc., National Association of Manufacturers, Business Roundtable, and Chemical Manufacturers Association, as Amici Curiae in Support of Respondent, in Daubert v. Merrell Dow Pharmaceuticals, Inc., U.S. Supreme Court no. 92-102, 1993 WL 13006288 (U.S., Jan. 19, 1993) [PLAC Brief].

8 PLAC Brief at *21.