For your delectation and delight, desultory dicta on the law of delicts.

Daubert Retrospective – Statistical Significance

January 5th, 2019

The holiday break was an opportunity and an excuse to revisit the briefs filed in the Supreme Court by parties and amici, in the Daubert case. The 22 amicus briefs in particular provided a wonderful basis upon which to reflect on how far we have come, and also how far we have to go, to achieve real evidence-based fact finding in technical and scientific litigation. Twenty-five years ago, Rules 702 and 703 vied for control over errant and improvident expert witness testimony. With Daubert decided, Rule 702 emerged as the winner. Sadly, most courts seem to ignore or forget about Rule 703, perhaps because of its awkward wording. Rule 702, however, received the judicial imprimatur to support the policing and gatekeeping of dysepistemic claims in the federal courts.

As noted last week,1 the petitioners (plaintiffs) in Daubert advanced several lines of fallacious and specious argument, some of which were lost in the shuffle and page limitations of the Supreme Court briefings. The plaintiffs’ transposition fallacy received barely a mention, although it did bring forth at least a footnote in an important and overlooked amicus brief filed by the American Medical Association (AMA), the American College of Physicians, and over a dozen other medical specialty organizations,2 all of which both emphasized the importance of statistical significance in interpreting epidemiologic studies, and the fallacy of interpreting 95% confidence intervals as providing a measure of certainty about the estimated association as a parameter. The language of these associations’ amicus brief is noteworthy and still relevant to today’s controversies.

The AMA’s amicus brief, like the brief filed by the National Academies of Science and the American Association for the Advancement of Science, strongly endorsed a gatekeeping role for trial courts to exclude testimony not based upon rigorous scientific analysis:

“The touchstone of Rule 702 is scientific knowledge. Under this Rule, expert scientific testimony must adhere to the recognized standards of good scientific methodology including rigorous analysis, accurate and statistically significant measurement, and reproducibility.”3

Having incorporated the term “scientific knowledge,” Rule 702 could not permit anything less in expert witness testimony, lest it pollute federal courtrooms across the land.

Elsewhere, the AMA elaborated upon its reference to “statistically significant measurement”:

“Medical researchers acquire scientific knowledge through laboratory investigation, studies of animal models, human trials, and epidemiological studies. Such empirical investigations frequently demonstrate some correlation between the intervention studied and the hypothesized result. However, the demonstration of a correlation does not prove the hypothesized result and does not constitute scientific knowledge. In order to determine whether the observed correlation is indicative of a causal relationship, scientists necessarily rely on the concept of ‘statistical significance.’ The requirement of statistical reliability, which tends to prove that the relationship is not merely the product of chance, is a fundamental and indispensable component of valid scientific methodology.”4

And then again, the AMA spelled out its position, in case the Court missed its other references to the importance of statistical significance:

“Medical studies, whether clinical trials or epidemiologic studies, frequently demonstrate some correlation between the action studied … . To determine whether the observed correlation is not due to chance, medical scientists rely on the concept of ‘statistical significance’. A ‘statistically significant’ correlation is generally considered to be one in which statistical analysis suggests that the observed relationship is not the result of chance. A statistically significant correlation does not ‘prove’ causation, but in the absence of such a correlation, scientific causation clearly is not proven.9”5

In its footnote 9, in the above quoted section of the brief, the AMA called out the plaintiffs’ transposition fallacy, without specifically citing to plaintiffs’ briefs:

“It is misleading to compare the 95% confidence level used in empirical research to the 51% level inherent in the preponderance of the evidence standard.”6

Actually, the plaintiffs’ ruse was much worse than misleading. The plaintiffs did not compare the two probabilities; they equated them. Some might call this ruse an outright fraud on the court. In any event, the AMA amicus brief remains an available, citable source for opposing this fraud and the casual dismissal of the importance of statistical significance.
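A small calculation may help show why the two probabilities cannot be equated. The sketch below is only an illustration: it assumes a simplified world with one null and one alternative hypothesis, and the prior probabilities and the 20% power figure are hypothetical inputs, not numbers from any brief. The function name is mine. The point is that the significance level is a probability of the data given the null hypothesis, whereas the probability of a hypothesis given the data depends on further inputs that the significance level does not supply.

```python
def prob_null_given_significant(alpha: float, power: float, prior_null: float) -> float:
    """Posterior probability that the null hypothesis is true, given a
    statistically significant result, by Bayes' theorem.

    alpha      -- P(significant result | null true), e.g. 0.05
    power      -- P(significant result | alternative true)
    prior_null -- assumed prior probability of the null (hypothetical input)
    """
    false_positive = alpha * prior_null
    true_positive = power * (1.0 - prior_null)
    return false_positive / (false_positive + true_positive)


# Hypothetical inputs: the same alpha = 0.05 yields very different
# posterior probabilities, depending on the power and the assumed prior.
for prior_null in (0.5, 0.9):
    posterior = prob_null_given_significant(0.05, 0.20, prior_null)
    print(f"prior P(null) = {prior_null:.1f} -> P(null | significant) = {posterior:.3f}")
```

Under these assumed inputs, neither posterior probability comes out at 5%, which is all the sketch is meant to show; it takes no position on how priors should be chosen.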

One other amicus brief touched on the plaintiffs’ statistical shenanigans. The Product Liability Advisory Council, National Association of Manufacturers, Business Roundtable, and Chemical Manufacturers Association jointly filed an amicus brief to challenge some of the excesses of the plaintiffs’ submissions.7 Plaintiffs’ expert witness, Shanna Swan, had calculated type II error rates and post-hoc power for some selected epidemiologic studies relied upon by the defense. Swan’s complaint had been that some studies had only 20% probability (power) to detect a statistically significant doubling of limb reduction risk, with significance at p < 5%.8

The PLAC Brief pointed out that power calculations must assume an alternative hypothesis, and that the doubling of risk hypothesis had no basis in the evidentiary record. Although the PLAC complaint was correct, it missed the plaintiffs’ point that the defense had set a risk ratio exceeding 2.0 as an important benchmark for specific causation attributability. Swan’s calculation of post-hoc power would have yielded an even lower probability for detecting risk ratios of 1.2 or so. More to the point, PLAC noted that other studies had much greater power, and that collectively, all the available studies would have had much greater power to have at least one study achieve statistical significance without dodgy re-analyses.
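The dependence of power on the assumed alternative hypothesis can be sketched numerically. What follows is a rough illustration, not a reconstruction of Swan’s calculations: the cohort sizes and baseline risk are hypothetical, the function names are mine, and the power formula is the usual large-sample normal approximation for the log risk ratio. The second function assumes independent studies, and shows why a collection of individually low-powered studies can have a high probability of yielding at least one significant result if the alternative were true.

```python
from math import log, prod, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def approx_power(n_exposed, n_unexposed, baseline_risk, risk_ratio, alpha=0.05):
    """Approximate two-sided power of a cohort study to detect `risk_ratio`,
    using the large-sample normal approximation for the log risk ratio."""
    exposed_cases = n_exposed * baseline_risk * risk_ratio  # expected count
    unexposed_cases = n_unexposed * baseline_risk           # expected count
    se = sqrt(1 / exposed_cases - 1 / n_exposed
              + 1 / unexposed_cases - 1 / n_unexposed)
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = log(risk_ratio) / se
    return _STD_NORMAL.cdf(shift - z_crit) + _STD_NORMAL.cdf(-shift - z_crit)


def prob_at_least_one_significant(powers):
    """Chance that at least one of several independent studies reaches
    significance, given each study's power under the same alternative."""
    return 1 - prod(1 - p for p in powers)


# Hypothetical cohort: 2,000 exposed, 2,000 unexposed, 1% baseline risk.
for rr in (2.0, 1.2):
    print(f"risk ratio {rr}: power ~ {approx_power(2000, 2000, 0.01, rr):.2f}")

# Ten independent studies, each with only 20% power.
print(f"at least one significant: {prob_at_least_one_significant([0.2] * 10):.2f}")
```

The approximation is crude when expected case counts are small, as they often are for rare birth defects, so exact methods would be preferable in a real analysis.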

1 “The Advocates’ Errors in Daubert” (Dec. 28, 2018).

2 American Academy of Allergy and Immunology, American Academy of Dermatology, American Academy of Family Physicians, American Academy of Neurology, American Academy of Orthopaedic Surgeons, American Academy of Pain Medicine, American Association of Neurological Surgeons, American College of Obstetricians and Gynecologists, American College of Pain Medicine, American College of Physicians, American College of Radiology, American Society of Anesthesiologists, American Society of Plastic and Reconstructive Surgeons, American Urological Association, and College of American Pathologists.

3 Brief of the American Medical Association, et al., as Amici Curiae, in Support of Respondent, in Daubert v. Merrell Dow Pharmaceuticals, Inc., U.S. Supreme Court no. 92-102, 1993 WL 13006285, at *27 (U.S., Jan. 19, 1993)[AMA Brief].

4 AMA Brief at *4-*5 (emphasis added).

5 AMA Brief at *14-*15 (emphasis added).

6 AMA Brief at *15 & n.9.

7 Brief of the Product Liability Advisory Council, Inc., National Association of Manufacturers, Business Roundtable, and Chemical Manufacturers Association as Amici Curiae in Support of Respondent, in Daubert v. Merrell Dow Pharmaceuticals, Inc., U.S. Supreme Court no. 92-102, 1993 WL 13006288 (U.S., Jan. 19, 1993) [PLAC Brief].

8 PLAC Brief at *21.

Confounding in Daubert, and Daubert Confounded

November 4th, 2018


The Daubert trilogy and the statutory revisions to Rule 702 have not brought universal enlightenment. Many decisions reflect a curmudgeonly and dismissive approach to gatekeeping.

The New Jersey Experience

Until recently, New Jersey law looked as though it favored vigorous gatekeeping of invalid expert witness opinion testimony. The law as applied, however, was another matter, with most New Jersey judges keen to find ways to escape the logical and scientific implications of the articulated standards, at least in civil cases.1 For example, in Grassis v. Johns-Manville Corp., 248 N.J. Super. 446, 591 A.2d 671, 675 (App. Div. 1991), the intermediate appellate court discussed the possibility that confounders may lead to an erroneous inference of a causal relationship. Plaintiffs’ counsel claimed that occupational asbestos exposure causes colorectal cancer, but the available studies, inconsistent as they were, failed to assess the role of smoking, family history, and dietary factors. The court essentially shrugged its judicial shoulders and let a plaintiffs’ verdict stand, even though it was supported by expert witness testimony that had relied upon seriously flawed and confounded studies. Not surprisingly, 15 years after the Grassis case, the scientific community acknowledged what should have been obvious in 1991: the studies did not support a conclusion that asbestos causes colorectal cancer.2

This year, however, saw the New Jersey Supreme Court step in to help extricate the lower courts from their gatekeeping doldrums. In a case that involved the dismissal of plaintiffs’ expert witnesses’ testimony in over 2,000 Accutane cases, the New Jersey Supreme Court demonstrated how to close the gate on testimony that is based upon flawed studies and involves tenuous and unreliable inferences.3 There were other remarkable aspects of the Supreme Court’s Accutane decision. For instance, the Court put its weight behind the common-sense and accurate interpretation of Sir Austin Bradford Hill’s famous articulation of factors for causal judgment, which requires that sampling error, bias, and confounding be eliminated before assessing whether the observed association is strong, consistent, plausible, and the like.4

Cook v. Rockwell International

The litigation over radioactive contamination from the Colorado Rocky Flats nuclear weapons plant is illustrative of the retrograde tendency in some federal courts. The defense objected to plaintiffs’ expert witness, Dr. Clapp, whose study failed to account for known confounders.5 Judge Kane denied the challenge, claiming that the defense could:

“cite no authority, scientific or legal, that compliance with all, or even one, of these factors is required for Dr. Clapp’s methodology and conclusions to be deemed sufficiently reliable to be admissible under Rule 702. The scientific consensus is, in fact, to the contrary. It identifies Defendants’ list of factors as some of the nine factors or lenses that guide epidemiologists in making judgments about causation. Ref. Guide on Epidemiology at 375.”6

In Cook, the trial court or the parties or both missed the obvious references in the Reference Manual to the need to control for confounding. Certainly many other scientific sources could be cited as well. Judge Kane apparently took a defense expert witness’s statement that ecological studies do not account for confounders to mean that the presence of confounding does not render such studies unscientific. Id. True but immaterial. Ecological studies may be “scientific,” but they do not warrant inferences of causation. Some so-called scientific studies are merely hypothesis generating, preliminary, tentative, or data-dredging exercises. Judge Kane employed the flaws-are-features approach, and opined that ecological studies are merely “less probative” than other studies, and the relative weights of studies do not render them inadmissible.7 This approach is, of course, a complete abdication of gatekeeping responsibility. First, studies themselves are not admissible; it is the expert witness whose testimony is challenged. The witness’s reliance upon studies is relevant to the Rule 702 and 703 analyses, but admissibility is not the issue. Second, Rule 702 requires that the proffered opinion be “scientific knowledge,” and ecological studies simply lack the necessary epistemic warrant to support a causal conclusion. Third, the trial court in Cook had to ignore the federal judiciary’s own reference manual’s warnings about the inability of ecological studies to provide causal inferences.8 The Cook case is part of an unfortunate trend to regard all studies as “flawed,” and their relative weights simply a matter of argument and debate for the litigants.9


Another example of sloppy reasoning about confounding can be found in a recent federal trial court decision, In re Abilify Products Liability Litigation,10 where the trial court advanced a futility analysis. All observational studies have potential confounding, and so confounding is not an error but a feature. Given this simplistic position, it follows that failure to control for every imaginable potential confounder does not invalidate an epidemiologic study.11 From its nihilistic starting point, the trial court readily found that an expert witness could reasonably dispense with controlling for confounding factors of psychiatric conditions in studies of a putative association between the antipsychotic medication Abilify and gambling disorders.12

Under this sort of “reasoning,” some criminal defense lawyers might argue that since all human beings are “flawed,” we have no basis to distinguish sinners from saints. We have a long way to go before our courts are part of the evidence-based world.

1 In the context of a “social justice” issue such as whether race disparities exist in death penalty cases, a New Jersey court has carefully considered confounding in its analyses. See In re Proportionality Review Project (II), 165 N.J. 206, 757 A.2d 168 (2000) (noting that bivariate analyses of race and capital sentences were confounded by missing important variables). Unlike the New Jersey courts (until the recent decision in Accutane), the Texas courts were quick to adopt the principles and policies of gatekeeping expert witness opinion testimony. See Merrell Dow Pharms., Inc. v. Havner, 953 S.W.2d 706, 714, 724 (Tex. 1997) (reviewing court should consider whether the studies relied upon were scientifically reliable, including consideration of the presence of confounding variables). Even some so-called Frye jurisdictions “get it.” See, e.g., Porter v. SmithKline Beecham Corp., No. 3516 EDA 2015, 2017 WL 1902905 *6 (Phila. Super., May 8, 2017) (unpublished) (affirming exclusion of plaintiffs’ expert witness on epidemiology, under Frye test, for relying upon an epidemiologic study that failed to exclude confounding as an explanation for a putative association), affirming, Mem. Op., No. 03275, 2015 WL 5970639 (Phila. Ct. Com. Pl. Oct. 5, 2015) (Bernstein, J.), and Op. sur Appellate Issues (Phila. Ct. Com. Pl., Feb. 10, 2016) (Bernstein, J.).

3 In re Accutane Litig., ___ N.J. ___, ___ A.3d ___, 2018 WL 3636867 (2018); see “N.J. Supreme Court Uproots Weeds in Garden State’s Law of Expert Witnesses” (Aug. 8, 2018).

4 2018 WL 3636867, at *20 (citing the Reference Manual 3d ed., at 597-99).

5 Cook v. Rockwell Internat’l Corp., 580 F. Supp. 2d 1071, 1098 (D. Colo. 2006) (“Defendants next claim that Dr. Clapp’s study and the conclusions he drew from it are unreliable because they failed to comply with four factors or criteria for drawing causal inferences from epidemiological studies: accounting for known confounders … .”), rev’d and remanded on other grounds, 618 F.3d 1127 (10th Cir. 2010), cert. denied, ___ U.S. ___ (May 24, 2012). For another example of a trial court refusing to see through important qualitative differences between and among epidemiologic studies, see In re Welding Fume Prods. Liab. Litig., 2006 WL 4507859, *33 (N.D. Ohio 2006) (reducing all studies to one level, and treating all criticisms as though they rendered all studies invalid).

6 Id.

7 Id.

8 RMSE3d at 561-62 (“[ecological] studies may be useful for identifying associations, but they rarely provide definitive causal answers”) (internal citations omitted); see also David A. Freedman, “Ecological Inference and the Ecological Fallacy,” in Neil J. Smelser & Paul B. Baltes, eds., 6 Internat’l Encyclopedia of the Social and Behavioral Sciences 4027 (2001).

9 See also McDaniel v. CSX Transportation, Inc., 955 S.W.2d 257 (Tenn. 1997) (considering confounding but holding that it was a jury issue); Perkins v. Origin Medsystems Inc., 299 F. Supp. 2d 45 (D. Conn. 2004) (striking reliance upon a study with uncontrolled confounding, but allowing expert witness to testify anyway).

10 In re Abilify (Aripiprazole) Prods. Liab. Litig., 299 F. Supp. 3d 1291 (N.D. Fla. 2018).

11 Id. at 1322-23 (citing Bazemore as a purported justification for the court’s nihilistic approach); see Bazemore v. Friday, 478 U.S. 385, 400 (1986) (“Normally, failure to include variables will affect the analysis’ probativeness, not its admissibility.”).

12 Id. at 1325.

Appendix – Some Federal Court Decisions on Confounding

1st Circuit

Bricklayers & Trowel Trades Internat’l Pension Fund v. Credit Suisse Sec. (USA) LLC, 752 F.3d 82, 85 (1st Cir. 2014) (affirming exclusion of expert witness whose event study and causal conclusion failed to consider relevant confounding variables and information that entered market on the event date)

2d Circuit

In re “Agent Orange” Prod. Liab. Litig., 597 F. Supp. 740, 783 (E.D.N.Y. 1984) (noting that confounding had not been sufficiently addressed in a study of U.S. servicemen exposed to Agent Orange), aff’d, 818 F.2d 145 (2d Cir. 1987) (approving district court’s analysis), cert. denied sub nom. Pinkney v. Dow Chemical Co., 484 U.S. 1004 (1988)

3d Circuit

In re Zoloft Prods. Liab. Litig., 858 F.3d 787, 793, 799 (2017) (acknowledging that statistically significant findings occur in the presence of inadequately controlled confounding or bias; affirming the exclusion of statistical expert witness, Nicholas Jewell, in part for using an admittedly non-rigorous approach to adjusting for confounding by indication)

4th Circuit

Gross v. King David Bistro, Inc., 83 F. Supp. 2d 597 (D. Md. 2000) (excluding expert witness who opined shigella infection caused fibromyalgia, given the existence of many confounding factors that muddled the putative association)

5th Circuit

Kelley v. American Heyer-Schulte Corp., 957 F. Supp. 873 (W.D. Tex. 1997) (noting that observed association may be causal or spurious, and that confounding factors must be considered to distinguish spurious from real associations)

Brock v. Merrell Dow Pharms., Inc., 874 F.2d 307, 311 (5th Cir. 1989) (noting that “[o]ne difficulty with epidemiologic studies is that often several factors can cause the same disease.”)

6th Circuit

Nelson v. Tennessee Gas Pipeline Co., WL 1297690, at *4 (W.D. Tenn. Aug. 31, 1998) (excluding an expert witness who failed to take into consideration confounding factors), aff’d, 243 F.3d 244, 252 (6th Cir. 2001), cert. denied, 534 U.S. 822 (2001)

Adams v. Cooper Indus. Inc., 2007 WL 2219212, 2007 U.S. Dist. LEXIS 55131 (E.D. Ky. 2007) (differential diagnosis includes ruling out confounding causes of plaintiffs’ disease).

7th Circuit

People Who Care v. Rockford Bd. of Educ., 111 F.3d 528, 537-38 (7th Cir. 1997) (Posner, J.) (“a statistical study that fails to correct for salient explanatory variables, or even to make the most elementary comparisons, has no value as causal explanation and is therefore inadmissible in a federal court”) (educational achievement in multiple regression);

Sheehan v. Daily Racing Form, Inc., 104 F.3d 940 (7th Cir. 1997) (holding that expert witness’s opinion, which failed to correct for any potential explanatory variables other than age, was inadmissible)

Allgood v. General Motors Corp., 2006 WL 2669337, at *11 (S.D. Ind. 2006) (noting that confounding factors must be carefully addressed; holding that selection bias rendered expert testimony inadmissible)

9th Circuit

In re Bextra & Celebrex Marketing Sales Practices & Prod. Liab. Litig., 524 F. Supp. 2d 1166, 1178-79 (N.D. Cal. 2007) (noting plaintiffs’ expert witnesses’ inconsistent criticism of studies for failing to control for confounders; excluding opinions that Celebrex at 200 mg/day can cause heart attacks, as failing to satisfy Rule 702)

Avila v. Willits Envt’l Remediation Trust, 2009 WL 1813125, 2009 U.S. Dist. LEXIS 67981 (N.D. Cal. 2009) (excluding expert witness’s opinion in part because of his failure to rule out confounding exposures and risk factors for the outcomes of interest), aff’d in relevant part, 633 F.3d 828 (9th Cir.), cert. denied, 132 S.Ct. 120 (2011)

Hendricksen v. ConocoPhillips Co., 605 F. Supp. 2d 1142, 1158 (E.D. Wash. 2009) (“In general, epidemiology studies are probative of general causation: a relative risk greater than 1.0 means the product has the capacity to cause the disease. ‘Where the study properly accounts for potential confounding factors and concludes that exposure to the agent is what increases the probability of contracting the disease, the study has demonstrated general causation – that exposure to the agent is capable of causing [the illness at issue] in the general population.’”) (internal quotation marks and citation omitted)

Valentine v. Pioneer Chlor Alkali Co., Inc., 921 F. Supp. 666, 677 (D. Nev. 1996) (“In summary, Dr. Kilburn’s study suffers from very serious flaws. He took no steps to eliminate selection bias in the study group, he failed to identify the background rate for the observed disorders in the Henderson community, he failed to control for potential recall bias, he simply ignored the lack of reliable dosage data, he chose a tiny sample size, and he did not attempt to eliminate so-called confounding factors which might have been responsible for the incidence of neurological disorders in the subject group.”)

Claar v. Burlington No. RR, 29 F.3d 499 (9th Cir. 1994) (affirming exclusion of plaintiffs’ expert witnesses, and grant of summary judgment, when plaintiffs’ witnesses concluded that the plaintiffs’ injuries were caused by exposure to toxic chemicals, without investigating any other possible causes).

10th Circuit

Hollander v. Sandoz Pharms. Corp., 289 F.3d 1193, 1213 (10th Cir. 2002) (affirming exclusion in Parlodel case involving stroke; confounding makes case reports inappropriate bases for causal inferences, and even observational epidemiologic studies must be evaluated carefully for confounding)

D.C. Circuit

American Farm Bureau Fed’n v. EPA, 559 F.3d 512 (2009) (noting that in setting particulate matter standards addressing visibility, agency should avoid relying upon data that failed to control for the confounding effects of humidity)

Rule 702 Requires Courts to Sort Out Confounding

October 31st, 2018


Back in 2000, several law professors wrote an essay, in which they detailed some of the problems courts experienced in expert witness gatekeeping. Their article noted that judges easily grasped the problem of generalizing from animal evidence to human experience, and thus they simplistically emphasized human (epidemiologic) data. But in their emphasis on the problems in toxicological evidence, the judges missed problems of internal validity, such as confounding, in epidemiologic studies:

“Why do courts have such a preference for human epidemiological studies over animal experiments? Probably because the problem of external validity (generalizability) is one of the most obvious aspects of research methodology, and therefore one that non-scientists (including judges) are able to discern with ease – and then give excessive weight to (because whether something generalizes or not is an empirical question; sometimes things do and other times they do not). But even very serious problems of internal validity are harder for the untrained to see and understand, so judges are slower to exclude inevitably confounded epidemiological studies (and give insufficient weight to that problem). Sophisticated students of empirical research see the varied weaknesses, want to see the varied data, and draw more nuanced conclusions.”2

I am not sure that the problems are dependent in the fashion suggested by the authors, but their assessment that judges may be reluctant to break the seal on the black box of epidemiology, and that judges frequently lack the ability to make nuanced evaluations of the studies on which expert witnesses rely, seems fair enough. Judges continue to miss important validity issues, perhaps because the adversarial process levels all studies to debating points in litigation.3

The frequent existence of validity issues undermines the partisan suggestion that Rule 702 exclusions are merely about “sufficiency of the evidence.” Sometimes, there is just too much of nothing to rise even to a problem of insufficiency. Some studies are “not even wrong.”4 Similarly, validity issues are an embarrassment to those authors who argue that we must assemble all the evidence and consider the entirety under ethereal standards, such as “weight of the evidence,” or “inference to the best explanation.” Sometimes, some or much of the available evidence does not warrant inclusion in the data set at all, and any causal inference is unacceptable.

Threats to validity come in many forms, but confounding is a particularly dangerous one. In claims that substances such as diesel fume or crystalline silica cause lung cancer, confounding is a huge problem. The proponents of the claims suggest relative risks in the range of 1.1 to 1.6 for such substances, but tobacco smoking results in relative risks in excess of 20, and some claim that passive smoking at home or in the workplace results in relative risks of the same magnitude as the risk ratios claimed for diesel particulate or silica. Furthermore the studies behind these claims frequently involve exposures to other known or suspected lung carcinogens, such as arsenic, radon, dietary factors, asbestos, and others.
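The arithmetic behind this concern is simple enough to sketch. The function below, whose name and inputs are hypothetical illustrations of mine, computes the risk ratio that would appear between exposed and unexposed groups solely because the groups differ in the prevalence of a binary confounder such as smoking, on the assumptions that the exposure itself has no effect at all and that the confounder carries the same relative risk in both groups.

```python
def spurious_risk_ratio(confounder_rr, prevalence_exposed, prevalence_unexposed):
    """Apparent exposure risk ratio produced by a binary confounder alone,
    assuming the exposure under study has no effect on the outcome.

    confounder_rr        -- relative risk of the outcome for the confounder
    prevalence_exposed   -- proportion of the exposed group with the confounder
    prevalence_unexposed -- proportion of the unexposed group with the confounder
    """
    numerator = 1 + prevalence_exposed * (confounder_rr - 1)
    denominator = 1 + prevalence_unexposed * (confounder_rr - 1)
    return numerator / denominator


# Hypothetical: smoking carries a relative risk of 20; half of the exposed
# workers smoke, compared with 40% of the unexposed comparison group.
print(f"{spurious_risk_ratio(20, 0.50, 0.40):.2f}")
```

With these assumed prevalences, a difference of ten percentage points in smoking yields an apparent risk ratio of about 1.22, squarely within the range claimed for diesel fume or silica, without any contribution from the exposure itself.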

Definition of Confounding

Confounding results from the presence of a so-called confounding (or lurking) variable, helpfully defined in the chapter on statistics in the Reference Manual on Scientific Evidence:

“confounding variable; confounder. A confounder is correlated with the independent variable and the dependent variable. An association between the dependent and independent variables in an observational study may not be causal, but may instead be due to confounding. See controlled experiment; observational study.”5

This definition suggests that the confounder need not be known to cause the dependent variable/outcome; the confounder need be only correlated with the outcome and an independent variable, such as exposure. Furthermore, the confounder may be actually involved in such a way as to increase or decrease the estimated relationship between dependent and independent variables. A confounder that is known to be present typically is referred to as an “actual” confounder, as opposed to one that may be at work, and known as a “potential” confounder. Furthermore, even after exhausting known and potential confounders, studies may be affected by “residual” confounding, especially when the total array of causes of the outcome of interest is not understood, and these unknown causes are not randomly distributed between exposed and unexposed groups in epidemiologic studies. Litigation frequently involves diseases or outcomes with unknown causes, and so the reality of unidentified residual confounders is unavoidable.

In some instances, especially in studies of pharmaceutical adverse outcomes, there is the danger that the hypothesized outcome is also a feature of the underlying disease being treated. This phenomenon is known as confounding by indication, or as indication bias.6

Kaye and Freedman’s statistics chapter notes that confounding is a particularly important consideration when evaluating observational studies. In randomized clinical trials, one goal of the randomization is the elimination of the role of bias and confounding by the random assignment of exposures:

“2. Randomized controlled experiments

In randomized controlled experiments, investigators assign subjects to treatment or control groups at random. The groups are therefore likely to be comparable, except for the treatment. This minimizes the role of confounding.”7

In observational studies, confounding may completely invalidate an association. Kaye and Freedman give an example from the epidemiologic literature:

“Confounding remains a problem to reckon with, even for the best observational research. For example, women with herpes are more likely to develop cervical cancer than other women. Some investigators concluded that herpes caused cancer: In other words, they thought the association was causal. Later research showed that the primary cause of cervical cancer was human papilloma virus (HPV). Herpes was a marker of sexual activity. Women who had multiple sexual partners were more likely to be exposed not only to herpes but also to HPV. The association between herpes and cervical cancer was due to other variables.”8

The problem identified as confounding by Freedman and Kaye cannot be dismissed as an issue that goes to the “weight” of the study; the confounding goes to the heart of the ability of the herpes studies to show an association that can be interpreted to be causal. Invalidity from confounding renders the studies “weightless” in any “weight of the evidence” approach. There are, of course, many ways to address confounding in studies: stratification, multivariate analyses, multiple regression, propensity scores, etc. Consideration of the propriety and efficacy of these methods is a whole other level of analysis, which does not arise unless and until the threshold question of confounding is addressed.
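The herpes example can be mimicked with made-up numbers to show what stratification, the first of the listed methods, accomplishes. The counts below are entirely hypothetical and are not taken from any study; they are constructed so that herpes has no effect on cancer risk within either HPV stratum, but herpes is far more common among women with HPV. The function names are mine; the adjusted estimate uses the standard Mantel-Haenszel formula for a pooled risk ratio.

```python
def risk_ratio(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """Ratio of the risk in the exposed group to the risk in the unexposed."""
    return (cases_exposed / n_exposed) / (cases_unexposed / n_unexposed)


def mantel_haenszel_rr(strata):
    """Mantel-Haenszel pooled risk ratio across strata of a confounder.
    Each stratum is (cases_exposed, n_exposed, cases_unexposed, n_unexposed)."""
    numerator = 0.0
    denominator = 0.0
    for cases_exp, n_exp, cases_unexp, n_unexp in strata:
        total = n_exp + n_unexp
        numerator += cases_exp * n_unexp / total
        denominator += cases_unexp * n_exp / total
    return numerator / denominator


# Hypothetical counts (herpes = exposure, stratified by HPV status).
# Within each stratum the risk is identical with or without herpes.
strata = [
    (80, 800, 20, 200),  # HPV present: 10% risk in both herpes groups
    (2, 200, 8, 800),    # HPV absent:   1% risk in both herpes groups
]

# Collapsing the strata gives the crude (unadjusted) risk ratio.
crude = risk_ratio(*(sum(column) for column in zip(*strata)))
adjusted = mantel_haenszel_rr(strata)
print(f"crude RR = {crude:.2f}, HPV-adjusted RR = {adjusted:.2f}")
```

In this constructed example the crude risk ratio is nearly 3, and the stratified estimate is 1.0. Of course, the adjustment works only because the confounder was identified and measured, which is the threshold question.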

Reference Manual on Scientific Evidence

The epidemiology chapter of the Second Edition of the Manual stated that ruling out confounding is an obligation of the expert witness who chooses to rely upon the study.9 Although the same chapter in the Third Edition occasionally waffles, its authors come down on the side of describing confounding as a threat to validity, which must be ruled out before the study can be relied upon. In one place, the authors indicate “care” is required, and that analysis for random error, confounding, bias “should be conducted”:

“Although relative risk is a straightforward concept, care must be taken in interpreting it. Whenever an association is uncovered, further analysis should be conducted to assess whether the association is real or a result of sampling error, confounding, or bias. These same sources of error may mask a true association, resulting in a study that erroneously finds no association.”10

Elsewhere in the same chapter, the authors note that “chance, bias, and confounding” must be looked at, but again, the authors stop short of noting that these threats to validity must be eliminated:

“Three general categories of phenomena can result in an association found in a study to be erroneous: chance, bias, and confounding. Before any inferences about causation are drawn from a study, the possibility of these phenomena must be examined.”11

                *  *  *  *  *  *  *  *

“To make a judgment about causation, a knowledgeable expert must consider the possibility of confounding factors.”12

Eventually, however, the epidemiology chapter takes a stand, and an important one:

“When researchers find an association between an agent and a disease, it is critical to determine whether the association is causal or the result of confounding.”13

Mandatory Not Precatory

The better reasoned cases decided under Federal Rule of Evidence 702, and state-court analogues, follow the Reference Manual in making clear that confounding factors must be carefully addressed and eliminated. Failure to rule out the role of confounding renders a conclusion of causation, reached in reliance upon confounded studies, invalid.14

The inescapable mandate of Rules 702 and 703 is to require judges to evaluate the bases of a challenged expert witness’s opinion. Threats to internal validity, such as confounding, in a study may make reliance upon any given study, or an entire set of studies, unreasonable, which thus implicates Rule 703. Importantly, stacking up more invalid studies does not overcome the problem; it merely presents a heap of evidence, incompetent to show anything.


Before the Supreme Court decided Daubert, few federal or state courts were willing to roll up their sleeves to evaluate the internal validity of relied upon epidemiologic studies. Issues of bias and confounding were typically dismissed by courts as issues that went to “weight, not admissibility.”

Judge Weinstein’s handling of the Agent Orange litigation, in the mid-1980s, marked a milestone in judicial sophistication and willingness to think critically about the evidence that was being funneled into the courtroom.15 The Bendectin litigation also was an important proving ground in which the defendant pushed courts to keep their eyes and minds open to issues of random error, bias, and confounding, when evaluating scientific evidence, on both pre-trial and on post-trial motions.16


When the United States Supreme Court addressed the admissibility of plaintiffs’ expert witnesses in Daubert, its principal focus was on the continuing applicability of the so-called Frye rule after the enactment of the Federal Rules of Evidence. The Court left the details of applying the then newly clarified “Daubert” standard to the facts of the case on remand to the intermediate appellate court. The Ninth Circuit, upon reconsidering the case, re-affirmed the trial court’s previous grant of summary judgment, on grounds of the plaintiffs’ failure to show specific causation.

A few years later, the Supreme Court itself engaged with the actual evidentiary record on appeal, in a lung cancer claim, which had been dismissed by the district court. Confounding was one among several validity issues in the studies relied upon by plaintiffs’ expert witnesses. The Court concluded that the plaintiffs’ expert witnesses’ bases did not individually or collectively support their conclusions of causation in a reliable way. With respect to one particular epidemiologic study, the Supreme Court observed that a study that looked at workers who “had been exposed to numerous potential carcinogens” could not show that PCBs cause lung cancer. General Elec. Co. v. Joiner, 522 U.S. 136, 146 (1997).17

1 An earlier version of this post can be found at “Sorting Out Confounded Research – Required by Rule 702” (June 10, 2012).

2 David Faigman, David Kaye, Michael Saks, and Joseph Sanders, “How Good is Good Enough? Expert Evidence Under Daubert and Kumho,” 50 Case Western Reserve L. Rev. 645, 661 n.55 (2000).

3 See, e.g., In re Welding Fume Prods. Liab. Litig., 2006 WL 4507859, *33 (N.D.Ohio 2006) (reducing all studies to one level, and treating all criticisms as though they rendered all studies invalid).

4 R. Peierls, “Wolfgang Ernst Pauli, 1900-1958,” 5 Biographical Memoirs of Fellows of the Royal Society 186 (1960) (quoting Wolfgang Pauli’s famous dismissal of a particularly bad physics paper).

5 David Kaye & David Freedman, “Reference Guide on Statistics,” in Reference Manual on Scientific Evidence 211, 285 (3d ed. 2011) [hereafter the RMSE3d].

6 See, e.g., R. Didham, et al., “Suicide and Self-Harm Following Prescription of SSRIs and Other Antidepressants: Confounding By Indication,” 60 Br. J. Clinical Pharmacol. 519 (2005).

7 RMSE3d at 220.

8 RMSE3d at 219 (internal citations omitted).

9 Reference Guide on Epidemiology at 369-70 (2d ed. 2000) (“Even if an association is present, epidemiologists must still determine whether the exposure causes the disease or if a confounding factor is wholly or partly responsible for the development of the outcome.”).

10 RMSE3d at 567-68 (internal citations omitted).

11 RMSE3d at 572.

12 RMSE3d at 591 (internal citations omitted).

13 RMSE3d at 591.

14 Similarly, an exonerative conclusion of no association might be vitiated by confounding with a protective factor, not accounted for in a multivariate analysis. Practically, such confounding seems less prevalent than confounding that generates a positive association.

15 In re “Agent Orange” Prod. Liab. Litig., 597 F. Supp. 740, 783 (E.D.N.Y. 1984) (noting that confounding had not been sufficiently addressed in a study of U.S. servicemen exposed to Agent Orange), aff’d, 818 F.2d 145 (2d Cir. 1987) (approving district court’s analysis), cert. denied sub nom. Pinkney v. Dow Chemical Co., 484 U.S. 1004 (1988).

16 Brock v. Merrell Dow Pharms., Inc., 874 F.2d 307, 311, modified on reh’g, 884 F.2d 166 (5th Cir. 1989) (noting that “[o]ne difficulty with epidemiologic studies is that often several factors can cause the same disease.”).

17 The Court’s discussion related to the reliance of plaintiffs’ expert witnesses upon, among other studies, Kuratsune, Nakamura, Ikeda, & Hirohata, “Analysis of Deaths Seen Among Patients with Yusho – A Preliminary Report,” 16 Chemosphere 2085 (1987).

Daubert’s Silver Anniversary – Retrospective View of Its Friends and Enemies

October 21st, 2018

Science is inherently controversial because when done properly it has no respect for power or political correctness or dogma or entrenched superstition. We should thus not be surprised that the scientific process has many detractors in houses of worship, houses of representatives, and houses of the litigation industry. And we have more than a few “Dred Scott” decisions, in which courts have held that science has no criteria of validity that they are bound to follow.

To be sure, many judges have recognized a different danger in scientific opinion testimony, namely, its ability to overwhelm the analytical faculties of lay jurors. Fact-finders may view scientific expert witness opinion testimony as having an overwhelming certainty and authority, which swamps their ability to evaluate the testimony.1

One errant judicial strategy for dealing with judges’ own difficulty in evaluating scientific evidence was to invent a fictitious divide between a scientific and a legal burden of proof:2

“Petitioners demand sole reliance on scientific facts, on evidence that reputable scientific techniques certify as certain. Typically, a scientist will not so certify evidence unless the probability of error, by standard statistical measurement, is less than 5%. That is, scientific fact is at least 95% certain. Such certainty has never characterized the judicial or the administrative process. It may be that the ‘beyond a reasonable doubt’ standard of criminal law demands 95% certainty. Cf. McGill v. United States, 121 U.S.App. D.C. 179, 185 n.6, 348 F.2d 791, 797 n.6 (1965). But the standard of ordinary civil litigation, a preponderance of the evidence, demands only 51% certainty. A jury may weigh conflicting evidence and certify as adjudicative (although not scientific) fact that which it believes is more likely than not.”

By falsely elevating the scientific standard, judges see themselves as free to decide expeditiously and without constraints, because they are operating at a much lower epistemic level.
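The arithmetic behind the supposed “95% certainty” can be checked with a short calculation. What follows is only a sketch under stated assumptions: the prior probability, the statistical power, and the function name are hypothetical values chosen for illustration, not figures taken from any case or study. The calculation applies Bayes’ theorem to ask how probable a real association is, given a result that is statistically significant at the 5% level.

```python
def prob_true_given_significant(prior, power, alpha):
    """Probability that an association is real, given a significant result.

    prior: assumed probability, before the study, that the association is real
    power: assumed probability of a significant result if the association is real
    alpha: probability of a significant result if there is no association
    """
    true_positive = power * prior
    false_positive = alpha * (1 - prior)
    return true_positive / (true_positive + false_positive)


# Hypothetical inputs: one in ten tested hypotheses is true, power is 80%,
# and the significance level is the conventional 5%.
skeptical = prob_true_given_significant(prior=0.10, power=0.80, alpha=0.05)

# The same study design, starting from even odds that the hypothesis is true.
even_odds = prob_true_given_significant(prior=0.50, power=0.80, alpha=0.05)

print(round(skeptical, 2))
print(round(even_odds, 2))
```

Under the first set of assumptions, the probability comes to 0.64; under the second, to roughly 0.94. Neither figure is fixed by the 5% significance level alone, which is why equating “p < 0.05” with “95% certain” transposes the conditional probability.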

Another response advocated by “the Lobby,” scientists in service to the litigation industry, has been to deprecate gatekeeping altogether. Perhaps the most brazen anti-science response to the Supreme Court’s decision in Daubert was advanced by David Michaels and his Project on Scientific Knowledge and Public Policy (SKAPP). In its heyday, SKAPP organized meetings and conferences, and cranked out anti-gatekeeping propaganda to the delight of the litigation industry3, while obfuscating and equivocating about the source of its funding (from the litigation industry).4

SKAPP principal David Michaels was also behind the efforts of the American Public Health Association (APHA) to criticize the judicial move to scientific standards in gatekeeping. In 2004, Michaels and fellow litigation industrialists prevailed upon the APHA to adopt a policy statement that attacked evidence-based science and data transparency in the form of “Policy Number: 2004-11 Threats to Public Health Science.”5

SKAPP appears to have gone the way of the dodo, although the defunct organization still has a Wikipedia page with the misleading claim that a federal court had funded its operation, and the old link for this sketchy outfit now redirects to the website for the Union of Concerned Scientists. In 2009, David Michaels, fellow in the Collegium Ramazzini, and formerly the driving force of SKAPP, went on to become an under-secretary of Labor, and OSHA administrator in the Obama administration.6

With the end of his regulatory work, Michaels is now back in the litigation saddle. In April 2018, Michaels participated in a ruse in which he allowed himself to be “subpoenaed” by Mark Lanier, to give testimony in a case involving claims that personal talc use caused ovarian cancers.7 Michaels had no real subject matter expertise, but he readily made himself available so that Mr. Lanier could inject Michaels’ favorite trope of “doubt is their product” into his trial.

Against this backdrop of special pleading from the litigation industry’s go-to expert witnesses, it is helpful to revisit the Daubert decision, which is now 25 years old. The decision followed the grant of the writ of certiorari by the Supreme Court, full briefing by the parties on the merits, oral argument, and twenty-two amicus briefs. Not all briefs are created equal, and this inequality is especially true of amicus briefs, for which the quality of argument, and the reputation of the interested third parties, can vary greatly. Given the shrill ideological ranting of SKAPP and the APHA, we might find some interest in what two leading scientific organizations, the American Association for the Advancement of Science (AAAS) and the National Academy of Sciences (NAS), contributed to the debate over the proper judicial role in policing expert witness opinion testimony.

The Amicus Brief of the AAAS and the NAS, filed in Daubert v. Merrell Dow Pharmaceuticals, Inc., U.S. Supreme Court No. 92-102 (Jan. 19, 1993), was submitted by Richard A. Meserve and Lars Noah, of Covington & Burling, and by Bert Black, of Weinberg & Green. Unfortunately, the brief does not appear to be available on Westlaw, but it was republished shortly after filing, at 12 Biotechnology Law Report 198 (No. 2, March-April 1993) [all citations below are to this republication].

The amici were and are well known to the scientific community. The AAAS is a not-for-profit scientific society, which publishes the prestigious journal Science, and engages in other activities to advance public understanding of science. The NAS was created by congressional charter in the administration of Abraham Lincoln, to examine scientific, medical, and technological issues of national significance. Brief at 208. Meserve, counsel of record for these Amici Curiae, is a member of the National Academy, a president emeritus of the Carnegie Institution for Science, and a former chair of the U.S. Nuclear Regulatory Commission. He received his doctorate in applied physics from Stanford University, and his law degree from Harvard. Noah is now a professor of law in the University of Florida, and Black is still a practicing lawyer, ironically for the litigation industry.

The brief of the AAAS and the NAS did not take a position on the merits of whether Bendectin can cause birth defects, but it had a great deal to say about the scientific process, and the need for courts to intervene to ensure that expert witness opinion testimony was developed and delivered with appropriate methodological rigor.

A Clear and Present Danger

The amici, AAAS and NAS, clearly recognized a threat to the integrity of scientific fact-finding in the regime of uncontrolled and unregulated expert witness testimony. The amici cited the notorious case of Wells v. Ortho Pharmaceutical Corp.8, which had provoked an outcry from the scientific community, and a particularly scathing article by two scientists from the National Institute of Child Health and Human Development.9

The amici also cited several judicial decisions on the need for robust gatekeeping, including the observations of Judge Jack Weinstein that

“[t]he uncertainty of the evidence in [toxic tort] cases, dependent as it is on speculative scientific hypotheses and epidemiological studies, creates a special need for robust screening of experts and gatekeeping under Rules 403 and 703 by the court.”10

The AAAS and the NAS saw the “obvious danger that research results generated solely for litigation may be skewed.” Brief at 217 & n.11.11 The AAAS and the NAS thus saw a real, substantial threat in countenancing expert witnesses who proffered “putatively scientific evidence that does not in fact reflect the application of scientific principles.” Brief at 208. The Supreme Court probably did not need the AAAS and the NAS to tell it that “[s]uch evidence can lead to incorrect decisions and can serve to discredit the contributions of science,” id., but it may have helped ensure that the Court articulated meaningful guidelines to trial judges to police their courtrooms against scientific conclusions that were not reached in accordance with scientific principles. The amici saw and stated that

“[t]he unique persuasive power of scientific evidence and its inherent limitations requires that courts engage special efforts to ensure that scientific evidence is valid and reliable before it is admitted. In performing that task, courts can look to the same criteria that scientists themselves use to evaluate scientific claims.”

Brief at 212.

It may seem quaint to the post-modernists at the APHA, but the AAAS and the NAS were actually concerned “to avoid outcomes that are at odds with reality,” and they were willing to urge that “courts must exercise special care to assure that such evidence is based on valid and reliable scientific methodologies.” Brief at 209 (emphasis added). The amici also urged caution in allowing opinion testimony that conflicted with existing learning, and which had not been presented to the scientific community for evaluation. Brief at 218-19. In the words of the amici:

“Courts should admit scientific evidence only if it conforms to scientific standards and is derived from methods that are generally accepted by the scientific community as valid and reliable. Such a test promotes sound judicial decisionmaking by providing workable means for screening and assessing the quality of scientific expert testimony in advance of trial.”

Brief at 233. After all, part of the scientific process itself is weeding out false ideas.

Authority for Judicial Control

The AAAS and NAS and their lawyers gave their full support to Merrell Dow’s position that “courts have the authority and the responsibility to exclude expert testimony that is based upon unreliable or misapplied methodologies.” Brief at 209. The Federal Rules of Evidence, and Rules 702, 703, and 403 in particular, gave trial courts “ample authority for empowering courts to serve as gatekeepers.” Brief at 230. The amici argued what ultimately would become the law, that judicial control, in the spirit and text of the Federal Rules, of “[t]hreshold determinations concerning the admissibility of scientific evidence are necessary to ensure accurate decisions and to avoid unnecessary expenditures of judicial resources on collateral issues.” Brief at 210. The AAAS and NAS further recommended that:

“Determinations concerning the admissibility of expert testimony based on scientific evidence should be made by a judge in advance of trial. Such judicial control is explicitly called for under Rule 104(a) of the Federal Rules of Evidence, and threshold admissibility determinations by a judge serve several important functions, including simplification of issues at trial (thereby increasing the speed of trial), improvement in the consistency and predictability of results, and clarification of the issues for purposes of appeal. Indeed, it is precisely because a judge can evaluate the evidence in a focused and careful manner, free from the prejudices that might infect deliberations by a jury, that the determination should be made as a threshold matter.”

Brief at 228 (internal citations omitted).

Criteria of Validity

The AAAS and NAS did not shrink from the obvious implications of their position. They insisted that “[i]n evaluating scientific evidence, courts should consider the same factors that scientists themselves employ to assess the validity and reliability of scientific assertions.” Brief at 209, 210. The amici may have exhibited an aspirational view of the ability of judges, but they shared their optimistic view that “judges can understand the fundamental characteristics that separate good science from bad.” Brief at 210. Under the gatekeeping regime contemplated by the AAAS and the NAS, judges would have to think and analyze, rather than delegating to juries. In carrying out their task, judges would not be starting with a blank slate:

“When faced with disputes about expert scientific testimony, judges should make full use of the scientific community’s criteria and quality-control mechanisms. To be admissible, scientific evidence should conform to scientific standards and should be based on methods that are generally accepted by the scientific community as valid and reliable.”

Brief at 210. Questions such as whether an hypothesis has survived repeated severe, rigorous tests, whether the hypothesis is consistent with other existing scientific theories, whether the results of the tests have been presented to the scientific community, need to be answered affirmatively before juries are permitted to weigh in with their verdicts. Brief at 216, 217.

The AAAS and the NAS acknowledged implicitly and explicitly that courtrooms were not good places to trot out novel hypotheses, which lacked severe testing and sufficient evidentiary support. New theories must survive repeated testing and often undergo substantial refinements before they can be accepted in the scientific community. The scientific method requires nothing less. Brief at 219. These organizational amici also acknowledged that there will be occasionally “truly revolutionary advances” in the form of an hypothesis not fully tested. The danger of injecting bad science into broader decisions (such as encouraging meritless litigation, or the abandonment of useful products) should cause courts to view unestablished hypotheses with “heightened skepticism pending further testing and review.” Brief at 229. In other words, some hypotheses simply have not matured to the point at which they can support tort or other litigation.

The AAAS and the NAS contemplated that the gatekeeping process could and should incorporate the entire apparatus of scientific validity determinations into Rule 104(a) adjudications. Nowhere in their remarkable amicus brief do they suggest that if there is some evidence (however weak) favoring a causal claim, with nothing yet available to weigh against it, expert witnesses can declare that they have the “weight of the evidence” on their side, and gain a ticket to the courthouse door. The scientists at SKAPP, or now those at the Union of Concerned Scientists, prefer to brand gatekeeping as a trick to sell “doubt.” What they fail to realize is that their propaganda threatens both universalism and organized skepticism, two of the four scientific institutional norms, described by sociologist of science Robert K. Merton.12

1 United States v. Brown, 557 F.2d 541, 556 (6th Cir. 1977) (“Because of its apparent objectivity, an opinion that claims a scientific basis is apt to carry undue weight with the trier of fact”); United States v. Addison, 498 F.2d 741, 744 (D.C. Cir. 1974) (“scientific proof may in some instances assume a posture of mystic infallibility in the eyes of a jury of laymen”). Some people say that our current political morass reflects poorly on the ability of United States citizens to assess and evaluate evidence and claims to the truth.

2 See, e.g., Ethyl Corp. v. EPA, 541 F.2d 1, 28 n.58 (D.C. Cir.), cert. denied, 426 U.S. 941 (1976). See also “Rhetorical Strategy in Characterizing Scientific Burdens of Proof” (Nov. 15, 2014).

3 See, e.g., Project on Scientific Knowledge and Public Policy, “Daubert: The Most Influential Supreme Court Ruling You’ve Never Heard Of” (2003).

4 See, e.g., “SKAPP A LOT” (April 30, 2010); “Manufacturing Certainty” (Oct. 25, 2011); “David Michaels’ Public Relations Problem” (Dec. 2, 2011); “Conflicted Public Interest Groups” (Nov. 3, 2013).

7 Notes of Testimony by David Michaels, in Ingham v. Johnson & Johnson, Case No. 1522-CC10417-01, St. Louis Circuit Ct, Missouri (April 17, 2018).

8 788 F.2d 741, 744-45 (11th Cir.), cert. denied, 479 U.S. 950 (1986). Remarkably, consultants for the litigation industry have continued to try to “rehabilitate” the Wells decision. See “Carl Cranor’s Conflicted Jeremiad Against Daubert” (Sept. 23, 2018).

9 James L. Mills & Duane Alexander, “Teratogens and Litogens,” 315 New Engl. J. Med. 1234, 1235 (1986).

10 Brief at n. 31, citing In re Agent Orange Product Liab. Litig., 611 F. Supp. 1267, 1269 (E.D.N.Y. 1985), aff’d, 818 F.2d 187 (2d Cir. 1987), cert. denied, 487 U.S. 1234 (1988).

11 Citing, among other cases, Perry v. United States, 755 F.2d 888, 892 (11th Cir. 1985) (“A scientist who has a formed opinion as to the answer he is going to find before he even begins his research may be less objective than he needs to be in order to produce reliable scientific results.”).

12 Robert K. Merton, “The Normative Structure of Science,” in Robert K. Merton, The Sociology of Science: Theoretical and Empirical Investigations, chap. 13, at 267, 270 (1973).