Haack’s Holism vs. Too Much of Nothing

Professor Haack has been an unflagging critic of Daubert and its progeny. Haack’s major criticism of the Daubert and Joiner cases is based upon the notion that the Supreme Court engaged in a “divide and conquer” strategy in its evaluation of plaintiffs’ evidence, when it should have considered the “whole gemish” (my phrase, not Haack’s). See Susan Haack, “Warrant, Causation, and the Atomism of Evidence Law,” 5 Episteme 253, 261 (2008) [hereafter “Warrant”]; “Proving Causation: The Holism of Warrant and the Atomism of Daubert,” 4 J. Health & Biomedical Law 273, 304 (2008) [hereafter “Proving Causation”].

ATOMISM vs. HOLISM

Haack’s concern is that combined pieces of evidence, none individually sufficient to warrant an opinion of causation, may provide the warrant when considered jointly. Haack reads Daubert to require courts to screen each piece of evidence relied upon by an expert witness for reliability, a process that can interfere with discerning the conclusion most warranted by the totality or “the mosaic” of the evidence:

“The epistemological analysis offered in this paper reveals that a combination of pieces of evidence, none of them sufficient by itself to warrant a causal conclusion to the legally required degree of proof, may do so jointly. The legal analysis offered here, interlocking with this, reveals that Daubert’s requirement that courts screen each item of scientific expert testimony for reliability can actually impede the process of arriving at the conclusion most warranted by the evidence proffered.”

Warrant at 253.

But there is nothing in Daubert, or its progeny, to support this crude characterization of the judicial gatekeeping function.  Indeed, there is another federal rule of evidence, Rule 703, which is directed at screening the reasonableness of reliance upon a single piece of evidence.

Surely there are times when a single study relied upon is one that an expert in the relevant field would not, and should not, rely upon because of invalidity in the data, in the conduct of the study, or in the study’s analysis of the data. Indeed, there may well be times, especially in litigation contexts, when an expert witness has relied upon a collection of studies, none of which is reasonably relied upon by experts in the discipline.

Rule 702, which Daubert was interpreting, was, and is, focused upon an expert witness’s opinion:

A witness who is qualified as an expert by knowledge, skill, experience, training, or education may testify in the form of an opinion or otherwise if:

(a) the expert’s scientific, technical, or other specialized knowledge will help the trier of fact to understand the evidence or to determine a fact in issue;

(b) the testimony is based on sufficient facts or data;

(c) the testimony is the product of reliable principles and methods; and

(d) the expert has reliably applied the principles and methods to the facts of the case.

To be sure, Chief Justice Rehnquist, in explicating why plaintiffs’ expert witnesses’ opinions had to be excluded in Joiner, noted the wild, irresponsible, unwarranted inferential leaps made in interpreting specific pieces of evidence. The plaintiffs’ expert witnesses’ interpretation of a study, involving massive injections of PCBs into the peritoneum of baby mice, with consequent alveologenic adenomas, provided an amusing example of how they, the putative experts, had outrun their scientific headlights by over-interpreting a study in a different species, at different stages of maturation, with different routes of exposure, and with different, non-cancerous outcomes. These examples were effectively aimed at showing that the overall opinions advanced by Rabbi Teitelbaum and others, on behalf of plaintiffs in Joiner, were unreliable. Haack, however, sees a philosophical kinship with Justice Stevens, who, in dissent, argued for giving plaintiffs’ expert witnesses a “pass” based upon the whole evidentiary display. General Electric Co. v. Joiner, 522 U.S. 136, 153 (1997) (Justice Stevens, dissenting in part) (“It is not intrinsically ‘unscientific’ for experienced professionals to arrive at a conclusion by weighing all available evidence.”). The problem, of course, is that sometimes “all available evidence” includes a good deal of junk, irrelevant, or invalid studies. Sometimes “all available evidence” is just too much of nothing.

Perhaps Professor Haack was hurt that she was not cited by Justice Blackmun in Daubert, along with Popper and Hempel.  Haack has written widely on philosophy of science, and on epistemology, and she clearly believes her theory of knowledge would provide a better guide to the difficult task of screening expert witness opinions.

When Professor Haack describes the “degree to which evidence warrants a conclusion,” she identifies three factors, which, in part, require assessment of the strength of individual studies:

(i) how strong the connection is between the evidence and the conclusion (supportiveness);

(ii) how solid each of the elements of the evidence is, independent of the conclusion (independent security); and

(iii) how much of the relevant evidence the evidence includes (comprehensiveness).

Warrant at 258.

Of course, supportiveness includes interconnectedness, but nothing in her theory of “warrant” excuses omitting rigorous examination of individual pieces of evidence in assessing a causal claim.

DONE WRONG

Haack seems enamored of the holistic approach taken by Dr. Done, plaintiffs’ expert witness in the Bendectin litigation. Done tried to justify his causal opinions based upon the entire “mosaic” of evidence. See, e.g., Oxendine v. Merrell Dow Pharms., Inc., 506 A.2d 1100, 1108 (D.C. 1986) (“[Dr. Done] conceded his inability to conclude that Bendectin is a teratogen based on any of the individual studies which he discussed, but he also made quite clear that all these studies must be viewed together, and that, so viewed, they supplied his conclusion”).

Haack tilts at windmills by trying to argue the plausibility of Dr. Done’s mosaic in some of the Bendectin cases. She rightly points out that Done challenged the internal and external validity of the defendant’s studies. Such challenges to the validity of either side’s studies are a legitimate part of scientific discourse, and certainly a part of legal argumentation, but attacks on the validity of null studies are not affirmative evidence of an association. Haack correctly notes that “absence of evidence that p is just that — an absence of evidence; it is not evidence that not-p.” Proving Causation at 300. But the same point holds with respect to Done’s challenges to Merrell Dow’s studies. If those studies are invalid, and Merrell Dow lacks evidence that “not-p,” that lack is not evidence for Done in favor of p.

Given the lack of supporting epidemiologic data, and the weak and invalid data upon which he relied, Done’s causal claims were suspect at the time and have since been discredited. Professor Ronald Allen notes that invoking the Bendectin litigation in defense of a “mosaic theory” of evidentiary admissibility is a rather peculiar move for epistemology:

“[T]here were many such hints of risk at the time of litigation, but it is now generally accepted that those slight hints were statistical aberrations or the results of poorly conducted studies. Bendectin is still prescribed in many places in the world, including Europe, is endorsed by the World Health Organization as safe, and has been vindicated by meta-analyses and the support of a number of epidemiological studies. Given the weight of evidence in favor of Bendectin’s safety, it seems peculiar to argue for mosaic evidence from a case in which it would have plainly been misleading.”

Ronald J. Allen & Esfand Nafisi, “Daubert and its Discontents,” 76 Brooklyn L. Rev. 131, 148 (2010).

Screening each item of “expert evidence” for reliability may deprive the judge of “the mosaic,” but that is not all that the judicial gatekeepers were doing in Bendectin or other Rule 702 cases.   It is all well and good to speak metaphorically about mosaics, but the metaphor and its limits were long ago acknowledged in the philosophy of science.  The suggestion that scraps of evidence from different kinds of scientific studies can establish scientific knowledge was rejected by the great mathematician, physicist, and philosopher of science, Henri Poincaré:

“[O]n fait la science avec des faits comme une maison avec des pierres; mais une accumulation de faits n’est pas plus une science qu’un tas de pierres n’est une maison.”

Jules Henri Poincaré, La Science et l’Hypothèse (1905) (chapter 9, Les Hypothèses en Physique) (“Science is built up with facts, as a house is with stones. But a collection of facts is no more a science than a heap of stones is a house.”). Poincaré’s metaphor is more powerful than Haack’s and Done’s “mosaic” because it acknowledges that interlocking pieces of evidence may cohere as a building, or they may be no more than a pile of rubble. Poorly constructed walls may soon revert to the pile of stones from which they came. Much more is required than simply invoking the “mosaic” theory to bless this mess as a “warranted” claim to knowledge.

Haack’s point about aggregation of evidence is, at one level, unexceptionable. Surely, individual pieces of evidence, each inconclusive alone, may be powerful when combined. An easy example is a series of studies, each finding more disease than expected, but none with a statistically significant result. None of the studies alone can rule out chance as an explanation, and the defense might be tempted to argue that it is inappropriate to rely upon any of the studies because none is statistically significant.

The defense argument may be wrong in cases in which a valid meta-analysis can be deployed to combine the results into a summary estimate of association. If a meta-analysis is appropriate, the studies collectively may allow the exclusion of chance as an explanation for the disparity from expected rates of disease in the observed populations. [Haack misinterprets study “effect size” to be relevant to ruling out chance as an explanation for the increased rate of the outcome of interest. Proving Causation at 297.]

The availability of meta-analysis, in some cases, does not mean that hand waving about the “combined evidence” or “mosaics” automatically supports admissibility of the causal opinion.  The gatekeeper would still have to contend with the criteria of validity for meta-analysis, as well as with bias and confounding in the underlying studies.
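To make the aggregation point concrete, here is a minimal, purely illustrative sketch of a fixed-effect (inverse-variance) meta-analysis, written in Python. The study figures are invented for illustration only; the sketch simply shows how several studies, none individually statistically significant, can yield a pooled estimate that excludes chance at the conventional level.

```python
import math

# Hypothetical, invented study results (illustration only): each tuple is
# (relative risk, 95% CI lower bound, 95% CI upper bound).
# Every individual confidence interval crosses 1.0, so no single study
# is "statistically significant" on its own.
studies = [
    (1.30, 0.90, 1.88),
    (1.25, 0.85, 1.84),
    (1.35, 0.92, 1.98),
    (1.20, 0.88, 1.64),
]

def fixed_effect_meta(studies):
    """Inverse-variance (fixed-effect) pooling of relative risks."""
    total_weight = 0.0
    weighted_sum = 0.0
    for rr, lo, hi in studies:
        log_rr = math.log(rr)
        # Recover the standard error of log(RR) from the width of the 95% CI.
        se = (math.log(hi) - math.log(lo)) / (2 * 1.96)
        weight = 1.0 / se ** 2
        total_weight += weight
        weighted_sum += weight * log_rr
    pooled_log_rr = weighted_sum / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_log_rr / pooled_se
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return math.exp(pooled_log_rr), z, p_two_sided

pooled_rr, z, p = fixed_effect_meta(studies)
print(f"Pooled RR = {pooled_rr:.2f}, z = {z:.2f}, two-sided p = {p:.4f}")
# With these invented numbers, the pooled p-value falls below 0.05 even
# though no individual study reached statistical significance.
```

Note that the pooled relative risk and the p-value answer different questions: the effect size measures the magnitude of the association, while the p-value (or confidence interval) addresses whether chance can be excluded, which is the distinction the bracketed note above draws against Haack’s reading.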

NECESSITY OF JUDGMENT

Of course, unlike the meta-analysis example, most instances of evaluating an entire evidentiary display are not quantitative exercises. Haack is troubled by the tension between the qualitative, continuous nature of reliability and the “in or out” nature of rulings on the admissibility of expert witness opinions. Warrant at 262. The continuous nature of a reliability spectrum, however, does not preclude the practical need for a decision. We distinguish young people from old, although we age imperceptibly, in units of time that are continuous and can be specified in ever smaller magnitudes. Differences of opinion and close cases are likely, but such decisions are made in scientific contexts all the time.

FAGGOT FALLACY

Although Haack criticizes defendants for beguiling courts with the claimed “faggot fallacy,” she occasionally acknowledges that there simply is not sufficient valid evidence to support a conclusion. Indeed, she makes the case for why, in legal contexts, we will frequently be dealing with “unwarranted” claims:

“Against this background, it isn’t hard to see why the legal system has had difficulties in handling scientific testimony. It often calls on the weaker areas of science and/or on weak or marginal scientists in an area; moreover, its adversarial character may mean that even solid scientific information gets distorted; it may suppress or sequester relevant data; it may demand scientific answers when none are yet well-warranted; it may fumble in applying general scientific findings to specific cases; and it may fail to adapt appropriately as a relevant scientific field progresses.”

Susan Haack, “Of Truth, in Science and in Law,” 73 Brooklyn L. Rev. 985, 1000 (2008). It is difficult to imagine a more vigorous call for, and defense of, judicial gatekeeping of expert witness opinion testimony.

Haack seems to object to the scope and intensity of federal judicial gatekeeping, but her characterization of the legal context should awaken her to the need to resist admitting opinions on scientific issues when “none are yet well-warranted.” Id. at 1004 (noting that “the legal system quite often want[s] scientific answers when no warranted answers are available”). The legal system, however, does not “want” unwarranted “scientific” answers; only an interested party on one side or the other wants such a thing. The legal system wants a procedure for ensuring the rejection of unwarranted claims, which may be passed off as properly warranted due to the lack of sophistication of the intended audience.

TOO MUCH OF NOTHING

Despite her flirtation with Dr. Done’s holistic medicine, Haack acknowledges that sometimes a study, or an entire line of studies, is simply not valid and should not be part of the “gemish.” For instance, in the context of meta-analysis, which requires pre-specified inclusionary and exclusionary criteria for studies, Haack acknowledges that a “well-designed and well-conducted meta-analysis” will include a determination of “which studies are good enough to be included … and which are best disregarded.” Proving Causation at 286. Exactly correct. Sometimes we simply must drill down to the individual study, and what we find may require us to exclude it from the meta-analysis. The same could be said of any study that is excluded by appropriate exclusionary criteria.

Elsewhere, Haack acknowledges myriad considerations of validity or invalidity, which must be weighed as part of the gemish:

“The effects of S on animals may be different from its effects on humans. The effects of b when combined with a and c may be different from its effects alone, or when combined with x and/or y. Even an epidemiological study showing a strong association between exposure to S and elevated risk of D would be insufficient by itself: it might be poorly-designed and/or poorly-executed, for example (moreover, what constitutes a well-designed study – e.g., what controls are needed – itself depends on further information about the kinds of factor that might be relevant). And even an excellent epidemiological study may pick up, not a causal connection between S and D, but an underlying cause both of exposure to S and of D; or possibly reflect the fact that people in the very early stages of D develop a craving for S. Nor is evidence that the incidence of D fell after S was withdrawn sufficient by itself to establish causation – perhaps vigilance in reporting D was relaxed after S was withdrawn, or perhaps exposure to x, y, z was also reduced, and one or all of these cause D, etc.”

Proving Causation at 288.  These are precisely the sorts of reasons that make gatekeeping of expert witness opinions an important part of the judicial process in litigation.

RATS TO YOU

Similarly, Haack acknowledges that animal studies may be quite irrelevant to the issue at hand:

“The elements of E will also interlock more tightly the more physiologically similar the animals used in any animal studies are to human beings. The results of tests on hummingbirds or frogs would barely engage at all with epidemiological evidence of risk to humans, while the results of tests on mice, rats, guinea-pigs, or rabbits would interlock more tightly with such evidence, and the results of tests on primates more tightly yet. Of course, “similar” has to be understood as elliptical for “similar in the relevant respects;” and which respects are relevant may depend on, among other things, the mode of exposure: if humans are exposed to S by inhalation, for example, it matters whether the laboratory animals used have a similar rate of respiration. (Sometimes animal studies may themselves reveal relevant differences; for example, the rats on which Thalidomide was tested were immune to the sedative effect it had on humans; which should have raised suspicions that rats were a poor choice of experimental animal for this drug.) Again, the results of animal tests will interlock more tightly with evidence of risk to humans the more similar the dose of S involved. (One weakness of Joiner’s expert testimony was that the animal studies relied on involved injecting massive doses of PCBs into a baby mouse’s peritoneum, whereas Mr. Joiner had been exposed to much smaller doses when the contaminated insulating oil splashed onto his skin and into his eyes.) The timing of the exposure may also matter, e.g., when the claim at issue is that a pregnant woman’s being exposed to S causes this or that specific type of damage to the fetus.”

Proving Causation at 290.

WEIGHT OF THE EVIDENCE (WOE)

Just as she criticizes General Electric for advancing the “faggot fallacy” in Joiner, Haack criticizes the plaintiffs’ appeal to a “weight of evidence methodology” as misleadingly suggesting “that there is anything like an algorithm or protocol, some effective, mechanical procedure for calculating the combined worth of evidence.” Proving Causation at 293.

INFERENCE TO BEST EXPLANATION

Professor Haack cautiously evaluates the glib invocation of “inference to the best explanation” as a substitute for actual warrant of a claim to knowledge.  Haack acknowledges the obvious: the legal system is often confronted with claims lacking sufficient warrant.  She appropriately refuses to permit such claims to be dressed up as scientific conclusions by invoking their plausibility:

“Can we infer from the fact that the causes of D are as yet unknown, and that a plaintiff developed D after being exposed to S, that it was this exposure that caused Ms. X’s or Mr. Y’s D? No. Such evidence would certainly give us reason to look into the possibility that S is the, or a, cause of D. But loose talk of ‘inference to the best explanation’ disguises the fact that what presently seems like the most plausible explanation may not really be so – indeed, may not really be an explanation at all. We may not know all the potential causes of D, or even which other candidate-explanations we would be wise to investigate.”

Proving Causation at 305. See also Warrant at 261 (invoking the epistemic category of Rumsfeld’s “known unknowns” and “unknown unknowns” to describe a recurring situation in law’s treatment of scientific claims) (U.S. Sec’y of Defense Donald Rumsfeld: “[T]here are known knowns; there are things we know we know. We also know there are known unknowns; that is to say we know there are some things we do not know. But there are also unknown unknowns – there are things we do not know we don’t know.” (Feb. 12, 2002)).

It is easy to see why the folks at SKAPP are so fond of Professor Haack’s writings, and why they have invited her to their conferences and meetings. She has written close to a dozen articles critical of Daubert, each repeating the same mistaken criticisms of the gatekeeping process. She has provided SKAPP and its plaintiffs’ lawyer sponsors with sound bites to throw at impressionable judges about the epistemological weakness of Daubert and its progeny. In advancing this critique and SKAPP’s propaganda purposes, Professor Haack has misunderstood the gatekeeping enterprise. She has, however, correctly identified the gatekeeping process as an exercise in determining whether an opinion possesses sufficient epistemic warrant. Despite her enthusiasm for the dubious claims of Dr. Done, Haack acknowledges that “warrant” requires close attention to the internal and external validity of studies, and to rigorous analysis of a body of evidence. Haack’s own epistemic analysis would be hugely improved and advanced by focusing on how the mosaic theory, or WOE, failed to hold up in some of the more egregious, pathological claims of health “effects” — Bendectin, silicone breast implants, electromagnetic fields, asbestos and colorectal cancer, etc.