In yet another law review article on Daubert, Susan Haack has managed mostly to repeat her past mistakes, while adding a few new ones to her exegesis of the law of expert witnesses. See Susan Haack, “Mind the Analytical Gap! Tracing a Fault Line in Daubert,” 61 Wayne L. Rev. 653 (2016) [cited as Gap]. Like some other commentators on the law of evidence, Haack purports to discuss this area of law without ever citing or quoting the current version of the relevant statute, Federal Rule of Evidence 702. She pores over Daubert and Joiner, as she has done before, with mostly the same errors of interpretation. In discussing Joiner, Haack misses the importance of the Supreme Court’s reversal of the Eleventh Circuit’s asymmetric standard of review for Rule 702 trial court decisions. Gap at 677. And Haack’s analysis of this area of law omits any mention of Rule 703 and its role in Rule 702 determinations. Although you can safely skip yet another Haack article, you should expect to see this one, along with her others, cited in briefs, right up there with David Michaels’s Manufacturing Doubt.
A Matter of Degree
“It may be said that the difference is only one of degree. Most differences are, when nicely analyzed.”[1]
Quoting Holmes, Haack appears to complain that the courts’ admissibility decisions on expert witnesses’ opinions are dichotomous and categorical, whereas the component parts of those decisions, involving relevance and reliability, are qualitative and gradational. True, true, and immaterial.
How do you boil a live frog so it does not jump out of the water? You slowly turn up the heat on the frog by degrees. The frog is lulled into complacency, but at the end of the process, the frog is quite, categorically, and sincerely dead. By a matter of degrees, you can boil a frog alive in water, with a categorically ascertainable outcome.
Humans use categorical assignments in all walks of life. We rely upon our conceptual abilities to differentiate sinners and saints, criminals and paragons, scholars and skells. And we do this even though IQ, and virtues, come in degrees. In legal contexts, the finder of fact (whether judge or jury) must resolve disputed facts and render a verdict, which will usually be dichotomous, not gradational.
Haack finds “the elision of admissibility into sufficiency disturbing,” Gap at 654, but that is life, reason, and the law. She suggests that the difference in the nature of relevancy and reliability, on the one hand, and admissibility, on the other, creates a conceptual “mismatch.” Gap at 669. The suggestion is rubbish, a Briticism that Haack is fond of using herself. Clinical pathologists may diagnose cancer by counting the number of mitotic spindles in cells removed from an organ on biopsy. The count may be characterized as a percentage of cells in mitosis, a gradational measure that can run from zero to 100 percent, but what comes out of the pathologist’s review is a categorical diagnosis. The pathologist must decide whether the biopsy result is benign or malignant. And so it is with many human activities and ways of understanding the world.
The Problems with Daubert (in Haack’s View)
Atomism versus Holism
Haack repeats a litany of complaints about Daubert, but she generally misses the boat. Daubert was decisional law, in 1993, which interpreted a statute, Federal Rule of Evidence 702. The current version of Rule 702, which was not available to, or binding on, the Court in Daubert, focuses on both validity and sufficiency concerns:
A witness who is qualified as an expert by knowledge, skill, experience, training, or education may testify in the form of an opinion or otherwise if:
(a) the expert’s scientific, technical, or other specialized knowledge will help the trier of fact to understand the evidence or to determine a fact in issue;
(b) the testimony is based on sufficient facts or data;
(c) the testimony is the product of reliable principles and methods; and
(d) the expert has reliably applied the principles and methods to the facts of the case.
Subsection (b) renders most of Haack’s article a legal ignoratio elenchi.
Relative Risks Greater Than Two
Modern chronic disease epidemiology has fostered an awareness that there is a legitimate category of disease causation that involves identifying causes that are neither necessary nor sufficient to produce their effects. Today it is a commonplace that an established cause of lung cancer is cigarette smoking, and yet, not all smokers develop lung cancer, and not all lung cancer patients were smokers. Epidemiology can identify lung cancer causes such as smoking because it looks at stochastic processes that are modified from base rates, or population rates. This model of causation is not expected to produce uniform and consistent categorical outcomes in all exposed individuals, such as lung cancer in all smokers.
A necessary implication of categorizing an exposure or lifestyle variable as a “cause” in this way is that the evidence that helps establish causation cannot answer whether a given individual case of the outcome of interest was caused by the exposure of interest, even when that exposure is a known cause. We can certainly say that the exposure was a risk for the person’s later developing the disease, but we often have no way to make the individual attribution. In some cases, more the exception than the rule, there may be an identified mechanism that allows the detection of a “fingerprint” of causation. For the most part, however, risk and cause are two completely different things.
The magnitude of risk, expressed as a risk ratio, can be used to calculate a population attributable risk, which can in turn, with some caveats, be interpreted as approximating a probability of causation. When the attributable risk is 95%, as it would be for people with heavy smoking habits and lung cancer, treating the existence of the prior risk as evidence of specific causation seems perfectly reasonable. Treating a 25% attributable risk as evidence to support a conclusion of specific causation, without more, is simply wrong. A simple probabilistic urn model tells us that we would most likely be incorrect in attributing a random case to the exposure on the basis of such a low attributable risk. Although we can fuss over whether the urn model is correct, the typical case in litigation allows no other model to be asserted, and it would be the plaintiffs’ burden of proof to establish any alternative model in any event.
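The arithmetic behind these attributable-risk figures can be set out in a few lines. The sketch below assumes the standard epidemiologic formula for the attributable fraction among the exposed, AF = (RR - 1)/RR; the specific risk-ratio values are illustrative, not taken from any case record:

```python
def attributable_fraction(rr):
    """Attributable fraction among the exposed: (RR - 1) / RR.
    Returns 0.0 when the risk ratio shows no elevated risk."""
    if rr <= 1:
        return 0.0
    return (rr - 1) / rr

# A risk ratio near 20 (heavy smoking and lung cancer) gives roughly 95%;
# a risk ratio of 2 is the break-even point (50%); below 2, a random case
# is more likely a background case than one attributable to the exposure.
for rr in (20, 2, 1.33):
    print(f"RR = {rr}: AF = {attributable_fraction(rr):.0%}")
```

The break-even risk ratio of two is what gives the relative-risk threshold, discussed below in connection with Judge Kozinski's opinion, its probabilistic bite.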
As she has done many times before, Haack criticizes Judge Kozinski’s opinion in Daubert,[2] on remand, where he entered judgment for the defendant because further proceedings were futile given the small relative risks claimed by plaintiffs’ expert witnesses. Those relative risks, advanced by Shanna Swan and Alan Done, lacked reliability; they were the product of a for-litigation juking of the stats, which was the original target of the defendant and the medical community in the Supreme Court briefing. Judge Kozinski simplified the case by using a common legal stratagem: assuming arguendo that general causation was established. With this assumption favorable to plaintiffs made, though never proven or accepted, Judge Kozinski could shine his analytical light on the fatal weakness of the specific causation opinions. When all the hand waving was put to rest, all that propped up the plaintiffs’ specific causation claim was the existence of a claimed relative risk, which was less than two. Haack is unhappy with the analytical clarity achieved by Kozinski, and implicitly urges a conflation of general and specific causation so that “all the evidence” can be counted. The evidence of general causation, however, does not advance the plaintiffs’ specific causation case when the nature of the causation is an (assumed) non-necessary and non-sufficient risk. Haack quotes Dean McCormick as having observed that “[a] brick is not a wall,” and accuses Judge Kozinski of the atomistic fallacy of ruling out a wall simply because the party had only bricks. Gap at 673, quoting Charles McCormick, Handbook of the Law of Evidence at 317 (1954).
There is a fallacy opposite to the atomistic fallacy, however, namely the holistic “too much of nothing fallacy” so nicely put by Poincaré:
“Science is built up with facts, as a house is with stones. But a collection of facts is no more a science than a heap of stones is a house.”[3]
Poincaré’s metaphor is more powerful than Haack’s call for holistic evidence because it acknowledges that interlocking pieces of evidence may cohere as a building, or they may be no more than a pile of rubble. Poorly constructed walls may soon revert to the pile of stones from which they came.
Haack proceeds to criticize Judge Kozinski for his “extraordinary argument” that
“(a) equates degrees of proof with statistical probabilities;
(b) assesses each expert’s testimony individually; and
(c) raises the standard of admissibility under the relevance prong to the standard of proof.”
Gap at 672.
Haack misses the point that a low relative risk, with no other valid evidence of specific causation, translates into a low probability of specific causation, even if general causation were apodictically certain. Aggregating the testimony, say between animal toxicologists and epidemiologists, simply does not advance the epistemic ball on specific causation because all the evidence collectively does not help identify the cause of Jason Daubert’s birth defects on the very model of causation that plaintiffs’ expert witnesses advanced.
All this would be bad enough, but Haack then goes on to commit a serious category mistake in confusing the probabilistic inference (for specific causation) of an urn model with the prosecutor’s fallacy of interpreting a random match probability as the evidence of innocence. (Or the complement of the random match probability as the evidence of guilt.) Judge Kozinski was not working with random match probabilities, and he did not commit the prosecutor’s fallacy.
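The urn model invoked above can be made concrete with a short simulation. This is a minimal sketch under assumed numbers: a hypothetical 1% background risk and a hypothetical relative risk of 1.5 (below two). Neither figure comes from the Daubert record, and treating background and excess risk as independent draws is a simplifying assumption:

```python
import random

random.seed(12345)

def share_attributable(rr, base_risk=0.01, n=200_000):
    """Simulate n exposed individuals and return the share of their
    disease cases attributable to the exposure rather than background."""
    cases = attributable = 0
    for _ in range(n):
        background = random.random() < base_risk           # would occur anyway
        excess = random.random() < base_risk * (rr - 1)    # added by exposure
        if background or excess:
            cases += 1
            if excess and not background:
                attributable += 1
    return attributable / cases

# With a relative risk below two, most exposed cases remain background
# cases, so a random plaintiff's case is more probably not exposure-caused.
print(share_attributable(1.5))   # roughly (1.5 - 1) / 1.5, i.e., about a third
```

The simulated share tracks the attributable fraction (RR - 1)/RR, which crosses one half only when the relative risk exceeds two.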
Take Some Sertraline and Call Me in the Morning
As depressing as Haack’s article is, she manages to make matters even gloomier by attempting a discussion of Judge Rufe’s recent decision in the sertraline birth defects litigation. Haack’s discussion of this decision illustrates and typifies her analyses of other cases, including various decisions on causation opinion testimony on phenylpropanolamine, silicone, bendectin, t-PA, and other occupational, environmental, and therapeutic exposures. Maybe 100 mg sertraline is in order.
Haack criticizes what she perceives to be the conflation of admissibility and sufficiency in how the sertraline MDL court addressed the defendants’ motion to exclude the proffered testimony of Dr. Anick Bérard. Gap at 683. The conflation is imaginary, however, and the direct result of Haack’s refusal to look at the specific, multiple methodological flaws in the approach that plaintiffs’ expert witness Anick Bérard took to reach her causal conclusion. These flaws are not gradational, and they are detailed in the MDL court’s opinion[4] excluding Bérard. Haack, however, fails to look at the details. Instead she focuses on what she suggests is the sertraline MDL court’s conclusion that epidemiology was necessary:
“Judge Rufe argues that reliable testimony about human causation should generally be supported by epidemiological studies, and that ‘when epidemiological studies are equivocal or inconsistent with a causation opinion, experts asserting causation opinions must thoroughly analyze the strengths and weaknesses of the epidemiological research and explain why [it] does not contradict or undermine their opinion’. * * *
Judge Rufe acknowledges the difference between admissibility and sufficiency but, when it comes to the part of their testimony he [sic] deems inadmissible, his [sic] argument seems to be that, in light of the defendant’s epidemiological evidence, the plaintiffs’ expert testimony is insufficient.”
Gap at 682.
This précis is a remarkable distortion of the material facts of the case. There was no division between plaintiffs’ epidemiologic evidence and defendants’ epidemiologic evidence. Rather, there was the epidemiologic evidence, and Bérard ignored, misreported, or misrepresented a good deal of the total evidentiary display. She embraced studies when she could use their risk ratios to support her opinions, but criticized or ignored the same studies when their risk ratios pointed toward no association or even a protective association. To add to this methodological duplicity, Bérard had published many statements, in peer-reviewed journals, that sertraline was not shown to cause birth defects, but then changed her opinion solely for litigation. The court’s observation that there was a need for consistent epidemiologic evidence flowed not only from the conception of causation at issue (non-necessary and non-sufficient), but from Bérard’s and her fellow plaintiffs’ expert witnesses’ concessions that epidemiology was needed. Haack’s glib approach to criticizing judicial opinions fails to do justice to the difficulties of the task; nor does she advance any meaningful criteria to separate successful from unsuccessful efforts.
In attempting to make her case for the gradational nature of relevance and reliability, Haack acknowledges that the details of the evidence relied upon can render the evidence, and presumably the conclusion based thereon, more or less reliable. Thus, we are told that epidemiologic studies based upon self-reported diagnoses are highly unreliable because such diagnoses are often wrong. Gap at 667-68. Similarly, we are told that, in considering a claim that a plaintiff suffered an adverse effect from a medication, epidemiologic evidence showing a risk ratio of three would not be reliable if it had inadequate or inappropriate controls,[5] was not double blinded, and lacked randomization. Gap at 668-69. Even if the boundaries between reliable and unreliable are not always as clear as we might like, Haack fails to show that the gatekeeping process lacks a suitable epistemic, scientific foundation.
Curiously, Haack calls out Carl Cranor, plaintiffs’ expert witness in the Milward case, for advancing a confusing, vacuous “weight of the evidence” rationale for the methodology employed by the other plaintiffs’ causation expert witnesses in Milward.[6] Haack argues that Cranor’s invocation of “inference to the best explanation” and “weight of the evidence” fails to answer the important questions at issue in the case, namely how to weigh the inference to causation as strong, weak, or absent. Gap at 688 & nn. 223, 224. And yet, when Haack discusses court decisions that detailed voluminous records of evidence about how causal inferences should be made and supported, she flies over the details to give us confused, empty conclusions that the trial courts conflated admissibility with sufficiency.
[1] Rideout v. Knox, 148 Mass. 368, 19 N.E. 390, 392 (1889).
[2] Daubert v. Merrell Dow Pharm., Inc., 43 F.3d 1311, 1320 (9th Cir. 1995).
[3] Jules Henri Poincaré, La Science et l’Hypothèse, ch. 9, “Les Hypothèses en Physique” (1905) (“[O]n fait la science avec des faits comme une maison avec des pierres; mais une accumulation de faits n’est pas plus une science qu’un tas de pierres n’est une maison.”).
[4] In re Zoloft Prods. Liab. Litig., 26 F. Supp. 3d 466 (E.D. Pa. 2014).
[5] Actually, Haack’s suggestion is that a study with a relative risk of three would not be very reliable if it had no controls, but that suggestion is incoherent: a risk ratio could not be calculated at all without controls.
[6] Milward v. Acuity Specialty Prods., 639 F.3d 11, 17-18 (1st Cir. 2011), cert. denied, 132 S.Ct. 1002 (2012).