CONFOUNDING1
Back in 2000, several law professors wrote an essay, in which they detailed some of the problems courts experienced in expert witness gatekeeping. Their article noted that judges easily grasped the problem of generalizing from animal evidence to human experience, and thus they simplistically emphasized human (epidemiologic) data. But in their emphasis on the problems in toxicological evidence, the judges missed problems of internal validity, such as confounding, in epidemiologic studies:
“Why do courts have such a preference for human epidemiological studies over animal experiments? Probably because the problem of external validity (generalizability) is one of the most obvious aspects of research methodology, and therefore one that non-scientists (including judges) are able to discern with ease – and then give excessive weight to (because whether something generalizes or not is an empirical question; sometimes things do and other times they do not). But even very serious problems of internal validity are harder for the untrained to see and understand, so judges are slower to exclude inevitably confounded epidemiological studies (and give insufficient weight to that problem). Sophisticated students of empirical research see the varied weaknesses, want to see the varied data, and draw more nuanced conclusions.”2
I am not sure that the problems are dependent in the fashion suggested by the authors, but their assessment that judges may be reluctant to break the seal on the black box of epidemiology, and that judges frequently lack the ability to make nuanced evaluations of the studies on which expert witnesses rely, seems fair enough. Judges continue to miss important validity issues, perhaps because the adversarial process levels all studies to debating points in litigation.3
The frequent existence of validity issues undermines the partisan suggestion that Rule 702 exclusions are merely about “sufficiency of the evidence.” Sometimes, there is just too much of nothing to rise even to a problem of insufficiency. Some studies are “not even wrong.”4 Similarly, validity issues are an embarrassment to those authors who argue that we must assemble all the evidence and consider the entirety under ethereal standards, such as “weight of the evidence,” or “inference to the best explanation.” Sometimes, some or much of the available evidence does not warrant inclusion in the data set at all, and any causal inference is unacceptable.
Threats to validity come in many forms, but confounding is a particularly dangerous one. In claims that substances such as diesel fume or crystalline silica cause lung cancer, confounding is a huge problem. The proponents of the claims suggest relative risks in the range of 1.1 to 1.6 for such substances, but tobacco smoking results in relative risks in excess of 20, and some claim that passive smoking at home or in the workplace results in relative risks of the same magnitude as the risk ratios claimed for diesel particulate or silica. Furthermore, the studies behind these claims frequently involve exposures to other known or suspected lung carcinogens, such as arsenic, radon, dietary factors, asbestos, and others.
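The arithmetic behind this concern can be sketched with a simple bias calculation. The sketch is illustrative only: the function name is my own, and the smoking prevalences (40% among the exposed, 30% among the unexposed) are hypothetical figures, not taken from any study. It assumes that the exposure of interest has no effect at all, and that smoking multiplies lung cancer risk twentyfold in both groups.

```python
def apparent_relative_risk(rr_confounder, prev_exposed, prev_unexposed):
    """Relative risk for the exposure produced by a confounder alone.

    Assumes the exposure itself has no effect (true relative risk = 1.0)
    and that the confounder multiplies baseline risk by rr_confounder
    in both the exposed and the unexposed groups.
    """
    # Average risk in each group, in units of the baseline (non-smoker) risk.
    risk_exposed = 1 + prev_exposed * (rr_confounder - 1)
    risk_unexposed = 1 + prev_unexposed * (rr_confounder - 1)
    return risk_exposed / risk_unexposed


# Hypothetical inputs: smokers are 40% of exposed workers, 30% of unexposed.
apparent_rr = apparent_relative_risk(20, 0.40, 0.30)
print(round(apparent_rr, 2))
```

On these assumed numbers, a ten-point difference in smoking prevalence yields an apparent relative risk of about 1.28, which falls within the 1.1 to 1.6 range claimed for diesel or silica, even though the exposure contributes nothing.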
Definition of Confounding
Confounding results from the presence of a so-called confounding (or lurking) variable, helpfully defined in the chapter on statistics in the Reference Manual on Scientific Evidence:
“confounding variable; confounder. A confounder is correlated with the independent variable and the dependent variable. An association between the dependent and independent variables in an observational study may not be causal, but may instead be due to confounding. See controlled experiment; observational study.”5
This definition suggests that the confounder need not be known to cause the dependent variable/outcome; the confounder need be only correlated with the outcome and an independent variable, such as exposure. Furthermore, the confounder may be actually involved in such a way as to increase or decrease the estimated relationship between dependent and independent variables. A confounder that is known to be present typically is referred to as an “actual” confounder, as opposed to one that may be at work, which is known as a “potential” confounder. Furthermore, even after exhausting known and potential confounders, studies may be affected by “residual” confounding, especially when the total array of causes of the outcome of interest is not understood, and these unknown causes are not randomly distributed between exposed and unexposed groups in epidemiologic studies. Litigation frequently involves diseases or outcomes with unknown causes, and so the reality of unidentified residual confounders is unavoidable.
In some instances, especially in studies of pharmaceutical adverse outcomes, there is the danger that the hypothesized outcome is also a feature of the underlying disease being treated. This phenomenon is known as confounding by indication, or as indication bias.6
Kaye and Freedman’s statistics chapter notes that confounding is a particularly important consideration when evaluating observational studies. In randomized clinical trials, one goal of the randomization is the elimination of the role of bias and confounding by the random assignment of exposures:
“2. Randomized controlled experiments
In randomized controlled experiments, investigators assign subjects to treatment or control groups at random. The groups are therefore likely to be comparable, except for the treatment. This minimizes the role of confounding.”7
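The balancing effect of random assignment can be illustrated with a small simulation. This is only a sketch under assumptions of my own: the function name, the 30% smoking prevalence, the sample size, and the seed are all hypothetical, chosen to show that a characteristic fixed before assignment tends to end up about equally common in both arms.

```python
import random


def smoker_share_by_arm(n_subjects, smoker_prevalence, seed):
    """Randomly split subjects into two arms; return the smoker share in each."""
    rng = random.Random(seed)
    # Smoking status is fixed for each subject before any assignment occurs.
    subjects = [rng.random() < smoker_prevalence for _ in range(n_subjects)]
    # Random assignment: shuffle, then cut the list into two equal arms.
    rng.shuffle(subjects)
    half = n_subjects // 2
    treated, control = subjects[:half], subjects[half:]
    return sum(treated) / len(treated), sum(control) / len(control)


treated_share, control_share = smoker_share_by_arm(10_000, 0.30, seed=1)
print(f"smokers in treated arm: {treated_share:.3f}")
print(f"smokers in control arm: {control_share:.3f}")
```

With arms of this size, the two shares should differ only by chance, and by little. In an observational study, nothing guarantees such balance.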
In observational studies, confounding may completely invalidate an association. Kaye and Freedman give an example from the epidemiologic literature:
“Confounding remains a problem to reckon with, even for the best observational research. For example, women with herpes are more likely to develop cervical cancer than other women. Some investigators concluded that herpes caused cancer: In other words, they thought the association was causal. Later research showed that the primary cause of cervical cancer was human papilloma virus (HPV). Herpes was a marker of sexual activity. Women who had multiple sexual partners were more likely to be exposed not only to herpes but also to HPV. The association between herpes and cervical cancer was due to other variables.”8
The problem identified as confounding by Kaye and Freedman cannot be dismissed as an issue that goes merely to the “weight” of the study; the confounding goes to the heart of the ability of the herpes studies to show an association that can be interpreted to be causal. Invalidity from confounding renders the studies “weightless” in any “weight of the evidence” approach. There are, of course, many ways to address confounding in studies: stratification, multivariate analyses, multiple regression, propensity scores, etc. Consideration of the propriety and efficacy of these methods is a whole other level of analysis, which does not arise unless and until the threshold question of confounding is addressed.
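A toy example may help show what stratification, the first of the listed methods, does. The counts below are invented for illustration and come from no actual study. They are arranged so that the exposure has no effect within either stratum of the confounder, yet the collapsed data show a strong association. The Mantel-Haenszel summary risk ratio is one conventional way to combine strata; the function names are my own.

```python
# Hypothetical cohort counts, invented for illustration:
# (exposed_cases, exposed_total, unexposed_cases, unexposed_total)
strata = {
    "confounder present": (80, 800, 20, 200),
    "confounder absent": (2, 200, 8, 800),
}


def risk_ratio(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Risk among the exposed divided by risk among the unexposed."""
    return (exposed_cases / exposed_total) / (unexposed_cases / unexposed_total)


def crude_risk_ratio(strata):
    """Risk ratio after collapsing all strata into a single 2x2 table."""
    totals = [sum(column) for column in zip(*strata.values())]
    return risk_ratio(*totals)


def mantel_haenszel_risk_ratio(strata):
    """Mantel-Haenszel summary risk ratio across the strata."""
    numerator = 0.0
    denominator = 0.0
    for a, n1, c, n0 in strata.values():
        stratum_size = n1 + n0
        numerator += a * n0 / stratum_size
        denominator += c * n1 / stratum_size
    return numerator / denominator


crude = crude_risk_ratio(strata)
adjusted = mantel_haenszel_risk_ratio(strata)
print(f"crude risk ratio: {crude:.2f}")
for name, counts in strata.items():
    print(f"{name}: {risk_ratio(*counts):.2f}")
print(f"Mantel-Haenszel risk ratio: {adjusted:.2f}")
```

On these invented counts, the crude risk ratio is about 2.9, while the risk ratio within each stratum, and the Mantel-Haenszel summary, is 1.0. The apparent association is entirely the work of the confounder, which is the pattern described in the herpes example.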
Reference Manual on Scientific Evidence
The epidemiology chapter of the Second Edition of the Manual treated the ruling out of confounding as an obligation of the expert witness who chooses to rely upon the study.9 Although the same chapter in the Third Edition occasionally waffles, its authors come down on the side of describing confounding as a threat to validity, which must be ruled out before the study can be relied upon. In one place, the authors indicate “care” is required, and that analysis for random error, confounding, and bias “should be conducted”:
“Although relative risk is a straightforward concept, care must be taken in interpreting it. Whenever an association is uncovered, further analysis should be conducted to assess whether the association is real or a result of sampling error, confounding, or bias. These same sources of error may mask a true association, resulting in a study that erroneously finds no association.”10
Elsewhere in the same chapter, the authors note that “chance, bias, and confounding” must be looked at, but again, the authors stop short of noting that these threats to validity must be eliminated:
“Three general categories of phenomena can result in an association found in a study to be erroneous: chance, bias, and confounding. Before any inferences about causation are drawn from a study, the possibility of these phenomena must be examined.”11
* * * * * * * *
“To make a judgment about causation, a knowledgeable expert must consider the possibility of confounding factors.”12
Eventually, however, the epidemiology chapter takes a stand, and an important one:
“When researchers find an association between an agent and a disease, it is critical to determine whether the association is causal or the result of confounding.”13
Mandatory Not Precatory
The better reasoned cases decided under Federal Rule of Evidence 702, and state-court analogues, follow the Reference Manual in making clear that confounding factors must be carefully addressed and eliminated. Failure to rule out the role of confounding renders a conclusion of causation, reached in reliance upon confounded studies, invalid.14
The inescapable mandate of Rules 702 and 703 is to require judges to evaluate the bases of a challenged expert witness’s opinion. Threats to internal validity, such as confounding, in a study may make reliance upon any given study, or an entire set of studies, unreasonable, which thus implicates Rule 703. Importantly, stacking up more invalid studies does not overcome the problem by presenting a heap of evidence, incompetent to show anything.
Pre-Daubert
Before the Supreme Court decided Daubert, few federal or state courts were willing to roll up their sleeves to evaluate the internal validity of relied upon epidemiologic studies. Issues of bias and confounding were typically dismissed by courts as issues that went to “weight, not admissibility.”
Judge Weinstein’s handling of the Agent Orange litigation, in the mid-1980s, marked a milestone in judicial sophistication and willingness to think critically about the evidence that was being funneled into the courtroom.15 The Bendectin litigation also was an important proving ground in which the defendant pushed courts to keep their eyes and minds open to issues of random error, bias, and confounding, when evaluating scientific evidence, on both pre-trial and on post-trial motions.16
Post-Daubert
When the United States Supreme Court addressed the admissibility of plaintiffs’ expert witnesses in Daubert, its principal focus was on the continuing applicability of the so-called Frye rule after the enactment of the Federal Rules of Evidence. The Court left the details of applying the then newly clarified “Daubert” standard to the facts of the case on remand to the intermediate appellate court. The Ninth Circuit, upon reconsidering the case, re-affirmed the trial court’s previous grant of summary judgment, on grounds of the plaintiffs’ failure to show specific causation.
A few years later, the Supreme Court itself engaged with the actual evidentiary record on appeal, in a lung cancer claim, which had been dismissed by the district court. Confounding was one among several validity issues in the studies relied upon by plaintiffs’ expert witnesses. The Court concluded that the plaintiffs’ expert witnesses’ bases did not individually or collectively support their conclusions of causation in a reliable way. With respect to one particular epidemiologic study, the Supreme Court observed that a study that looked at workers who “had been exposed to numerous potential carcinogens” could not show that PCBs cause lung cancer. General Elec. Co. v. Joiner, 522 U.S. 136, 146 (1997).17
1 An earlier version of this post can be found at “Sorting Out Confounded Research – Required by Rule 702” (June 10, 2012).
2 David Faigman, David Kaye, Michael Saks, and Joseph Sanders, “How Good is Good Enough? Expert Evidence Under Daubert and Kumho,” 50 Case Western Reserve L. Rev. 645, 661 n.55 (2000).
3 See, e.g., In re Welding Fume Prods. Liab. Litig., 2006 WL 4507859, *33 (N.D.Ohio 2006) (reducing all studies to one level, and treating all criticisms as though they rendered all studies invalid).
4 R. Peierls, “Wolfgang Ernst Pauli, 1900-1958,” 5 Biographical Memoirs of Fellows of the Royal Society 186 (1960) (quoting Wolfgang Pauli’s famous dismissal of a particularly bad physics paper).
5 David Kaye & David Freedman, “Reference Guide on Statistics,” in Reference Manual on Scientific Evidence 211, 285 (3d ed. 2011) [hereafter the RMSE3d].
6 See, e.g., R. Didham, et al., “Suicide and Self-Harm Following Prescription of SSRIs and Other Antidepressants: Confounding By Indication,” 60 Br. J. Clinical Pharmacol. 519 (2005).
7 RMSE3d at 220.
8 RMSE3d at 219 (internal citations omitted).
9 Reference Guide on Epidemiology at 369-70 (2d ed. 2000) (“Even if an association is present, epidemiologists must still determine whether the exposure causes the disease or if a confounding factor is wholly or partly responsible for the development of the outcome.”).
10 RMSE3d at 567-68 (internal citations omitted).
11 RMSE3d at 572.
12 RMSE3d at 591 (internal citations omitted).
13 RMSE3d at 591.
14 Similarly, an exonerative conclusion of no association might be vitiated by confounding with a protective factor, not accounted for in a multivariate analysis. Practically, such confounding seems less prevalent than confounding that generates a positive association.
15 In re “Agent Orange” Prod. Liab. Litig., 597 F. Supp. 740, 783 (E.D.N.Y. 1984) (noting that confounding had not been sufficiently addressed in a study of U.S. servicemen exposed to Agent Orange), aff’d, 818 F.2d 145 (2d Cir. 1987) (approving district court’s analysis), cert. denied sub nom. Pinkney v. Dow Chemical Co., 484 U.S. 1004 (1988).
16 Brock v. Merrell Dow Pharms., Inc., 874 F.2d 307, 311, modified on reh’g, 884 F.2d 166 (5th Cir. 1989) (noting that “[o]ne difficulty with epidemiologic studies is that often several factors can cause the same disease.”).
17 The Court’s discussion related to the reliance of plaintiffs’ expert witnesses upon, among other studies, Kuratsune, Nakamura, Ikeda, & Hirohata, “Analysis of Deaths Seen Among Patients with Yusho – A Preliminary Report,” 16 Chemosphere 2085 (1987).