TORTINI

For your delectation and delight, desultory dicta on the law of delicts.

The “Rothman” Amicus Brief in Daubert v. Merrell Dow Pharmaceuticals

November 17th, 2018

“Then time will tell just who fell
And who’s been left behind”

                  Dylan, “Most Likely You Go Your Way” (1966)

 

When the Daubert case headed to the Supreme Court, it had 22 amicus briefs in tow. Today that number is routine for an appeal to the high court, but in 1992, it was a signal of intense interest in the case among both the scientific and legal communities. To the litigation industry, the prospect of judicial gatekeeping of expert witness testimony was anathema. To the manufacturing industry, the prospect was precious to defend against specious claiming.

With the benefit of 25 years of hindsight, a look at some of those amicus briefs reveals a good deal about the scientific and legal acumen of the “friends of the court.” Not all amicus briefs in the case were equal; not all have held up well in the face of time. The amicus brief of the American Association for the Advancement of Science and the National Academy of Sciences was a good example of advocacy for the full implementation of gatekeeping on scientific principles of valid inference.1 Other amici urged an anything-goes approach to judicial oversight of expert witnesses.

One amicus brief often praised by Plaintiffs’ counsel was submitted by Professor Kenneth Rothman and colleagues.2 This amicus brief is still cited by parties who find support in the brief for their excuses for not having consistent, valid, strong, and statistically significant evidence to support their claims of causation. To be sure, Rothman did target statistical significance as a strict criterion of causal inference, but there is little support in the brief for the loosey-goosey style of causal claiming that is so prevalent among lawyers for the litigation industry. Unlike the brief filed by the AAAS and the National Academy of Sciences, Rothman’s brief abstained from the social policies implied by judicial gatekeeping or its rejection. Instead, Rothman’s brief set out to make three narrow points:

(1) courts should not rely upon strict statistical significance testing for admissibility determinations;

(2) peer review is not an appropriate touchstone for the validity of an expert witness’s opinion; and

(3) unpublished, non-peer-reviewed “reanalysis” of studies is a routine part of the scientific process, and regularly practiced by epidemiologists and other scientists.

Rothman was encouraged to target these three issues by the lower courts’ opinions in the Daubert case, in which the courts made blanket statements about the role of absent statistical significance and peer review, and the illegitimacy of “re-analyses” of published studies.

Professor Rothman has made many admirable contributions to epidemiologic practice, but the amicus brief submitted by him and his colleagues falls into the trap of making the sort of blanket general statements that they condemned in the lower courts’ opinions. Of the brief’s three points, the first, about statistical significance, is the most important for epidemiologic and legal practice. Despite reports of an odd journal here or there “abolishing” p-values, most medical journals continue to require the presentation of either p-values or confidence intervals. In the majority of medical journals, 95% confidence intervals that exclude a null hypothesis risk ratio of 1.0, or risk difference of 0, are labelled “statistically significant,” sometimes improvidently in the presence of multiple comparisons and lack of pre-specification of outcome.

For over three decades, Rothman has criticized the prevailing practice on statistical significance. Professor Rothman is also well known for his advocacy for the superiority of confidence intervals over p-values in conveying important information about what range of values is reasonably compatible with the observed data.3 His criticisms of p-values and his advocacy for estimation with intervals have pushed biomedical publishing to embrace confidence intervals as more informative than just p-values. Still, his views on statistical significance have never gained complete acceptance at most clinical journals. Biomedical scientists continue to interpret 95% confidence intervals, at least in part, as to whether they show “significance” by excluding the null hypothesis value of no risk difference or of risk ratios equal to 1.0.

The first point in Rothman’s amicus brief is styled:

“THE LOWER COURTS’ FOCUS ON SIGNIFICANCE TESTING IS BASED ON THE INACCURATE ASSUMPTION THAT ‘STATISTICAL SIGNIFICANCE’ IS REQUIRED IN ORDER TO DRAW INFERENCES FROM EPIDEMIOLOGICAL INFORMATION”

The challenge by Rothman and colleagues to the “assumption” that statistical significance is necessary is what, of course, has endeared this brief to the litigation industry. A close read of the brief, however, shows that Rothman’s critique of the assumption is equivocal. Rothman et amici characterized the lower courts as having given:

“blind deference to inappropriate and arcane publication standards and ‘significance testing’.”4

The brief is silent about what might be knowing deference, or appropriate publication standards. To be sure, judges have often poorly expressed their reasoning for deciding scientific evidentiary issues, and perhaps poor communication or laziness by judges was responsible for Rothman’s interest in joining the Daubert fray. Putting aside the unclear, rhetorical, and somewhat hyperbolic use of “arcane” in the quote above, the suggestion of inappropriate blind deference is itself expressed in equivocal terms in the brief. At times the authors rail at the use of statistical significance as the “sole” criterion, and at times, they seem to criticize its use at all.

At least twice in their brief, Rothman and friends declare that the lower court:

“misconstrues the validity and appropriateness of significance testing as a decision making tool, apparently deeming it the sole test of epidemiological hypotheses.”5

* * * * * *

“this Court should reject significance testing as the sole acceptable criterion of scientific validity in epidemiology.”6

Characterizing “statistical significance” as not the sole test or criterion of scientific inference is hardly controversial, and it implies that statistical significance is one test, criterion, or factor among others. This position is consistent with the current ASA Statement on Significance Testing.7 There is, of course, much more to evaluate in a study or a body of studies, than simply whether they individually or collectively help us to exclude chance as an explanation for their findings.

Statistical Significance Is Not Necessary At All

Elsewhere, Rothman and friends take their challenge to statistical significance testing beyond merely suggesting that such testing is only one test or criterion among others. Indeed, their brief in other places states their opinion that significance testing is not necessary at all:

“Testing for significance, however, is often mistaken for a sine qua non of scientific inference.”8

And at other times, Rothman and friends go further yet and claim not only that significance is not necessary, but that it is not even appropriate or useful:

“Significance testing, however, is neither necessary nor appropriate as a requirement for drawing inferences from epidemiologic data.”9

Rothman compares statistical significance testing with “scientific inference,” which is not a mechanical, mathematical procedure, but rather a “thoughtful evaluation[] of possible explanations for what is being observed.”10 “Significance testing, in contrast,” is “merely a statistical tool,” used inappropriately “in the process of developing inferences.”11 Rothman suggests that the term “statistical significance” could be eliminated from scientific discussions without loss of meaning, and this linguistic legerdemain shows that the phrase is unimportant in science and in law.12 Rothman’s suggestion, however, ignores that causal assessments have always required an evaluation of the play of chance, especially for putative causes, which are neither necessary nor sufficient, and which modify underlying stochastic processes by increasing or decreasing the probability of a specified outcome. Asserting that statistical significance is misleading because it never describes the size of an association, which the Rothman brief does, is like telling us that color terms tell us nothing about the mass of a body.

The Rothman brief does make the salutary point that labeling a study outcome as not “statistically significant” carries the danger that the study’s data will be taken to have no value, or that the study may be taken to reject the hypothesized association. In 1992, such an interpretation may have been more common, but today, in the face of the proliferation of meta-analyses, the risk of such interpretations of single study outcomes is remote.

Questionable History of Statistics

Rothman suggests that the development of statistical hypothesis testing occurred in the context of agricultural and quality-control experiments, which required yes-no answers for future action.13 This suggestion clearly points at Sir Ronald Fisher and Jerzy Neyman, and their foundational work on frequentist statistical theory and practice. In part, the amici correctly identified the experimental milieu in which Fisher worked, but the description of Fisher’s work is neither accurate nor fair. Fisher spent a lifetime thinking and writing about statistical tests, in much more nuanced ways than implied by the claim that such testing occurred in the context of agricultural and quality-control experiments. Although Fisher worked on agricultural experiments, his writings acknowledged that when statistical tests and analyses were applied to observational studies, much more searching analyses of bias and confounding were required. Fisher’s and Berkson’s reactions to the observational studies of Hill and Doll on smoking and lung cancer are telling in this regard. These statisticians criticized the early studies of smoking and lung cancer, not for lack of statistical significance, but for failing to address confounding by a potential common genetic propensity to smoke and to develop lung cancer.

Questionable History of Drug Development

Twice in Rothman’s amicus brief, the authors suggest that “undue reliance” on statistical significance has resulted in overlooking “effective new treatments” because observed benefits were considered “not significant,” despite an “indication” of efficacy.14 The brief never provided any insight on what is due reliance and what is undue reliance on statistical significance. Their criticism of “undue reliance” implies that there are modes or instances of “due reliance” upon statistical significance. The amicus brief fails also to inform readers exactly what “effective new treatments” have been overlooked because the outcomes were considered “not significant.” This omission is regrettable because it leaves the reader with only abstract recommendations, without concrete examples of what such effective treatments might be. The omission was unfortunate because Rothman almost certainly could have marshalled examples. Recently, Rothman tweeted just such an example:15

“30% ↓ in cancer risk from Vit D/Ca supplements ignored by authors & editorial. Why? P = 0.06. http://bit.ly/2oanl6w http://bit.ly/2p0CRj7. The 95% confidence interval for the risk ratio was 0.42–1.02.”

Of course, this was a large, carefully reported randomized clinical trial, with a narrow confidence interval that just missed “statistical significance.” It is not an example that would have given succor to Bendectin plaintiffs, who were attempting to prove an association by identifying flaws in noisy observational studies that generally failed to show an association.
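The relationship between the interval and the p-value quoted in the tweet can be checked with a few lines of arithmetic. The following is only a rough sketch, assuming the interval is a Wald-type interval that is symmetric on the log scale; the function names are illustrative, and the midpoint it computes is the geometric midpoint of the quoted interval, which need not coincide with the point estimate reported in the trial.

```python
import math


def wald_p_from_ci(lower, upper, z_crit=1.959964):
    """Approximate two-sided p-value for a ratio measure from its 95% CI.

    Assumption: the interval is symmetric on the log scale (a Wald interval).
    """
    log_lo, log_hi = math.log(lower), math.log(upper)
    se = (log_hi - log_lo) / (2 * z_crit)   # standard error of the log ratio
    log_est = (log_lo + log_hi) / 2         # midpoint on the log scale
    z = log_est / se                        # test statistic against a ratio of 1.0
    p = math.erfc(abs(z) / math.sqrt(2))    # two-sided normal tail area
    return math.exp(log_est), p


def excludes_null(lower, upper, null=1.0):
    """True when the interval does not contain the null value."""
    return not (lower <= null <= upper)


# Interval endpoints as quoted in the tweet above.
estimate, p_value = wald_p_from_ci(0.42, 1.02)
print(round(estimate, 2), round(p_value, 3), excludes_null(0.42, 1.02))
```

Under these assumptions the approximate p-value comes out close to the 0.06 quoted in the tweet, and the interval does not exclude 1.0. The p-value and the interval convey the same information about random error; the interval additionally shows the range of effect sizes compatible with the data.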

Readers of the 1992 amicus brief can only guess at what might be “indications of efficacy”; no explanation or examples are provided.16 The reality of FDA approvals of new drugs is that a pre-specified 5% level of statistical significance is virtually always enforced.17 If a drug sponsor has “indication of efficacy,” it is, of course, free to follow up with an additional, larger, better-designed clinical trial. Rothman’s recent tweet about the vitamin D clinical trial does provide some context and meaning to what the amici may have meant over 25 years ago by indication of efficacy. The tweet also illustrates Rothman’s acknowledgment of the need to address random variability in a data set, whether by p-value or confidence interval, or both. Clearly, Rothman was criticizing the authors of the vitamin D trial for stopping short of claiming that they had shown (or “demonstrated”) a cancer survival benefit. There is, however, a rich literature on vitamin D and cancer outcomes, and such a claim could be made, perhaps, in the context of a meta-analysis or meta-regression of multiple clinical trials, with a synthesis of other experimental and observational data.18

Questionable History of Statistical Analyses in Epidemiology

Rothman’s amicus brief deserves credit for introducing a misinterpretation of Sir Austin Bradford Hill’s famous paper on inferring causal associations, which has become catechism in the briefs of plaintiffs in pharmaceutical and other products liability cases:

“No formal tests of significance can answer those questions. Such tests can, and should, remind us of the effects that the play of chance can create, and they will instruct us in the likely magnitude of those effects. Beyond that they contribute nothing to the ‘proof’ of our hypothesis.”

Austin Bradford Hill, “The Environment and Disease: Association or Causation?” 58 Proc. Royal Soc’y Med. 295, 299 (1965) (quoted at Rothman Brief at *6).

As exegesis of Hill’s views, this quote is misleading. The language quoted above was used by Hill in the context of his nine causal viewpoints or criteria. The Rothman brief ignores Hill’s admonition to his readers, that before reaching the nine criteria, there is a serious, demanding predicate that must be shown:

“Disregarding then any such problem in semantics we have this situation. Our observations reveal an association between two variables, perfectly clear-cut and beyond what we would care to attribute to the play of chance. What aspects of that association should we especially consider before deciding that the most likely interpretation of it is causation?”

Id. at 295 (emphasis added). Rothman and co-authors did not have to invoke the prestige and authority of Sir Austin, but once they did, they were obligated to quote him fully and with accurate context. Elsewhere, in his famous textbook, Hill expressed his view that common sense was insufficient to interpret data, and that the statistical method was necessary to interpret data in medical studies.19

Rothman complains that statistical significance focuses the reader on conjecture on the role of chance in the observed data rather than the information conveyed by the data themselves.20 The “incompleteness” of statistical analysis for arriving at causal conclusions, however, is not an argument against its necessity.

The Rothman brief does make the helpful point that statistical significance cannot be sufficient to support a conclusion of causation because many statistically significant associations or correlations will be non-causal. They give a trivial example of wearing dresses and breast cancer, but the point is well-taken. Associations, even when statistically significant, are not necessarily causal conclusions. Who ever suggested otherwise, other than expert witnesses for the litigation industry?

Unnecessary Fears

The motivation for Rothman’s challenge to the assumption that statistical significance is necessary is revealed at the end of the argument on Point I. The authors plainly express their concern that false negatives will shut down important research:

“To give weight to the failure of epidemiological studies to meet strict ‘statistical significant’ standards — to use such studies to close the door on further inquiry — is not good science.”21

The relevance of this concern to the proceedings is a mystery. The judicial decisions in the case are not referenda on funding initiatives. Scientists were as free in 1993, after Daubert was decided, as they were in 1992, when Rothman wrote, to pursue the hypothesis that Bendectin caused birth defects. The decision had the potential to shut down tort claims, and left scientists to their tasks.

Reanalyses Are Appropriate Scientific Tools to Assess and Evaluate Data, and to Forge Causal Opinions

The Rothman brief took issue with the lower courts’ dismissal of plaintiffs’ expert witnesses’ re-analyses of data in published studies. The authors argued that reanalyses were part of the scientific method, and not “an arcane or specialized enterprise,” deserving of heightened or skeptical scrutiny.22

Remarkably, the Rothman brief, if accepted by the Supreme Court on the re-analysis point, would have led to the sort of unthinking blanket acceptance of a methodology, which the brief’s authors condemned in the context of blanket acceptance of significance testing. The brief covertly urges “blind deference” to its authors on the blanket approval of re-analyses.

Although amici have tight page limits, the brief’s authors made clear that they were offering no substantive opinions on the data involved in the published epidemiologic studies on Bendectin, or on the plaintiffs’ expert witnesses’ re-analyses. With the benefit of hindsight, we can see that the sweeping language used by the Ninth Circuit on re-analyses might have been taken to foreclose important and valid meta-analyses or similar approaches. The Rothman brief is not terribly explicit on what re-analysis techniques were part of the scientific method, but meta-analyses surely had been on the authors’ minds:

“by focusing on inappropriate criteria applied to determine what conclusions, if any, can be reached from any one study, the trial court forecloses testimony about inferences that can be drawn from the combination of results reported by many such studies, even when those studies, standing alone, might not justify such inferences.”23

The plaintiffs’ statistical expert witness in Daubert had proffered a re-analysis of at least one study by substituting a different control sample, as well as a questionable meta-analysis. By failing to engage on the propriety of the specific analyses at issue in Daubert, the Rothman brief failed to offer meaningful guidance to the appellate court.

Reanalyses Are Not Invalid Just Because They Have Not Been Published

Rothman was certainly correct that the value of peer review was overstated by the defense in Bendectin litigation.24 The quality of pre-publication peer review is spotty, at best. Predatory journals deploy a pay-to-play scheme, which makes a mockery of scientific publishing. Even at respectable journals, peer review cannot effectively guard against fraud, or ensure that statistical analyses have been appropriately done.25 At best, peer review is a weak proxy for study validity, and an unreliable one at that.

The Rothman brief may have moderated the Supreme Court’s reaction to the defense’s argument that peer review is a requirement for studies, or “re-analyses,” relied upon by expert witnesses. The Court in Daubert opined, in dicta, that peer review is a non-dispositive consideration:

“The fact of publication (or lack thereof) in a peer reviewed journal … will be a relevant, though not dispositive, consideration in assessing the scientific validity of a particular technique or methodology on which an opinion is premised.”26

To the extent that Rothman and colleagues might have been disappointed in this outcome, they missed some important context of the Bendectin cases. Most of the cases had been resolved by a consolidated causation issues trial, but many opt-out cases had to be tried in state and federal courts around the country.27 The expert witnesses challenged in Daubert (Drs. Swan and Done) participated in many of these opt-out cases, and in each case, they opined that Bendectin was a public health hazard. The failure of these witnesses to publish their analyses and re-analyses spoke volumes about their bona fides. Courts (and juries if the Swan and Done proffered testimony were admissible) could certainly draw negative inferences from the plaintiffs’ expert witnesses’ failure to publish their opinions and re-analyses.

The Fate of the “Rothman Approach” in the Courts

The so-called “Rothman approach” was urged by Bendectin plaintiffs in opposing summary judgment in a case pending in federal court, in New Jersey, before the Supreme Court decided Daubert. Plaintiffs resisted exclusion of their expert witnesses, who had relied upon inconsistent and statistically non-significant studies on the supposed teratogenicity of Bendectin. The trial court excluded the plaintiffs’ witnesses, and granted summary judgment.28

On appeal, the Third Circuit reversed and remanded the DeLucas’ case for a hearing under Rule 702:

“by directing such an overall evaluation, however, we do not mean to reject at this point Merrell Dow’s contention that a showing of a .05 level of statistical significance should be a threshold requirement for any statistical analysis concluding that Bendectin is a teratogen regardless of the presence of other indicia of reliability. That contention will need to be addressed on remand. The root issue it poses is what risk of what type of error the judicial system is willing to tolerate. This is not an easy issue to resolve and one possible resolution is a conclusion that the system should not tolerate any expert opinion rooted in statistical analysis where the results of the underlying studies are not significant at a .05 level.”29

After remand, the district court excluded the DeLuca plaintiffs’ expert witnesses, and granted summary judgment, based upon the dubious methods employed by plaintiffs’ expert witnesses in cherry picking data, recalculating risk ratios in published studies, and ignoring bias and confounding in studies. The Third Circuit affirmed the judgment for Merrell Dow.30

In the end, the decisions in the DeLuca case never endorsed the Rothman approach, although Professor Rothman can take credit perhaps for forcing the trial court, on remand, to come to grips with the informational content of the study data, and the many threats to validity, which severely undermined the relied-upon studies and the plaintiffs’ expert witnesses’ opinions.

More recently, in litigation over alleged causation of birth defects in offspring of mothers who used Zoloft during pregnancy, plaintiffs’ counsel attempted to resurrect, through their expert witnesses, the Rothman approach. The multidistrict court saw through counsel’s assertions that the Rothman approach had been adopted in DeLuca, or that it had become generally accepted.31 After protracted litigation in the Zoloft cases, the district court excluded plaintiffs’ expert witnesses and entered summary judgment for the defense. The Third Circuit found that the district court’s handling of the statistical significance issues was fully consistent with the Circuit’s previous pronouncements on the issue of statistical significance.32


1 The brief, filed in Daubert v. Merrell Dow Pharmaceuticals, Inc., U.S. Supreme Court No. 92-102 (Jan. 19, 1993), was submitted by Richard A. Meserve and Lars Noah, of Covington & Burling, and by Bert Black, 12 Biotechnology Law Report 198 (No. 2, March-April 1993); see “Daubert’s Silver Anniversary – Retrospective View of Its Friends and Enemies” (Oct. 21, 2018).

2 Brief Amici Curiae of Professors Kenneth Rothman, Noel Weiss, James Robins, Raymond Neutra and Steven Stellman, in Support of Petitioners, 1992 WL 12006438, Daubert v. Merrell Dow Pharmaceuticals, Inc., U.S. S. Ct. No. 92-102 (Dec. 2, 1992). [Rothman Brief].

3 Id. at *7.

4 Rothman Brief at *2.

5 Id. at *2-*3 (emphasis added).

6 Id. at *7 (emphasis added).

7 See Ronald L. Wasserstein & Nicole A. Lazar, “The ASA’s Statement on p-Values: Context, Process, and Purpose,” 70 The American Statistician 129 (2016).

8 Rothman Brief at *3.

9 Id. at *2.

10 Id. at *3-*4.

11 Id. at *3.

12 Id. at *3.

13 Id. at *4-*5.

14 Id. at *5, *6.

15 Available at <https://twitter.com/ken_rothman/status/855784253984051201> (April 21, 2017). The tweet pointed to: Joan Lappe, Patrice Watson, Dianne Travers-Gustafson, Robert Recker, Cedric Garland, Edward Gorham, Keith Baggerly, and Sharon L. McDonnell, “Effect of Vitamin D and Calcium Supplementation on Cancer Incidence in Older Women: A Randomized Clinical Trial,” 317 J. Am. Med. Ass’n 1234 (2017).

16 In the case of United States v. Harkonen, Professors Ken Rothman and Tim Lash, and I made common cause in support of Dr. Harkonen’s petition to the United States Supreme Court. The circumstances of Dr. Harkonen’s indictment and conviction provide a concrete example of what Dr. Rothman probably was referring to as “indication of efficacy.” I supported Dr. Harkonen’s appeal because I agreed that there had been a suggestion of efficacy, even if Harkonen had overstated what his clinical trial, standing alone, had shown. (There had been a previous clinical trial, which demonstrated a robust survival benefit.) From my perspective, the facts of the case supported Dr. Harkonen’s exercise of speech in a press release, but it would hardly have justified FDA approval for the indication that Dr. Harkonen was discussing. If Harkonen had indeed committed “wire fraud,” as claimed by the federal prosecutors, then I had (and still have) a rather long list of expert witnesses who stand in need of criminal penalties and rehabilitation for their overreaching opinions in court cases.

17 Robert Temple, “How FDA Currently Makes Decisions on Clinical Studies,” 2 Clinical Trials 276, 281 (2005); Lee Kennedy-Shaffer, “When the Alpha is the Omega: P-Values, ‘Substantial Evidence’, and the 0.05 Standard at FDA,” 72 Food & Drug L.J. 595 (2017); see also “The 5% Solution at the FDA” (Feb. 24, 2018).

18 See, e.g., Stefan Pilz, Katharina Kienreich, Andreas Tomaschitz, Eberhard Ritz, Elisabeth Lerchbaum, Barbara Obermayer-Pietsch, Veronika Matzi, Joerg Lindenmann, Winfried Marz, Sara Gandini, and Jacqueline M. Dekker, “Vitamin D and cancer mortality: systematic review of prospective epidemiological studies,” 13 Anti-Cancer Agents in Medicinal Chem. 107 (2013).

19 Austin Bradford Hill, Principles of Medical Statistics at 2, 10 (4th ed. 1948) (“The statistical method is required in the interpretation of figures which are at the mercy of numerous influences, and its object is to determine whether individual influences can be isolated and their effects measured.”) (emphasis added).

20 Rothman Brief at *6-*7.

21 Id. at *9.

22 Id.

23 Id. at *10.

24 Rothman Brief at *12.

25 See William Childs, “Peering Behind The Peer Review Curtain,” Law360 (Aug. 17, 2018).

26 Daubert v. Merrell Dow Pharms., 509 U.S. 579, 594 (1993).

27 See “Diclegis and Vacuous Philosophy of Science” (June 24, 2015).

28 DeLuca v. Merrell Dow Pharms., Inc., 131 F.R.D. 71 (D.N.J. 1990).

29 DeLuca v. Merrell Dow Pharms., Inc., 911 F.2d 941, 955 (3d Cir. 1990).

30 DeLuca v. Merrell Dow Pharms., Inc., 791 F. Supp. 1042 (D.N.J. 1992), aff’d, 6 F.3d 778 (3d Cir. 1993).

31 In re Zoloft (Sertraline Hydrochloride) Prods. Liab. Litig., MDL No. 2342; 12-md-2342, 2015 WL 314149 (E.D. Pa. Jan. 23, 2015) (Rufe, J.) (denying PSC’s motion for reconsideration), aff’d, 858 F.3d 787 (3d Cir. 2017) (affirming exclusion of plaintiffs’ expert witnesses’ dubious opinions, which involved multiple methodological flaws and failures to follow any methodology faithfully). See generally “Zoloft MDL Relieves Matrixx Depression” (Jan. 30, 2015); “WOE — Zoloft Escapes a MDL While Third Circuit Creates a Conceptual Muddle” (July 31, 2015).

32 See Pritchard v. Dow Agro Sciences, 430 F. App’x 102, 104 (3d Cir. 2011) (excluding Concussion hero, Dr. Bennet Omalu).

The American Statistical Association Statement on Significance Testing Goes to Court – Part I

November 13th, 2018

It has been two and one-half years since the American Statistical Association (ASA) issued its statement on statistical significance. Ronald L. Wasserstein & Nicole A. Lazar, “The ASA’s Statement on p-Values: Context, Process, and Purpose,” 70 The American Statistician 129 (2016) [ASA Statement]. When the ASA Statement was published, I commended it as a needed counterweight to the exaggerated criticisms of significance testing.1 Lawyers and expert witnesses for the litigation industry had routinely pooh-poohed the absence of statistical significance, but over-endorsed its presence in poorly designed and biased studies. Courts and lawyers from all sides routinely misunderstood, misstated, and misrepresented the meaning of statistical significance.2

The ASA Statement had potential to help resolve judicial confusion. It is written in non-technical language, which is easily understood by non-statisticians. Still, the Statement has to be read with care. The principle of charity led me to believe that lawyers and judges would read the Statement carefully, and that it would improve judicial gatekeeping of expert witnesses’ opinion testimony that involved statistical evidence. I am less sanguine now about the prospect of progress.

No sooner had the ASA issued its Statement than the spinning started. One scientist, and an editor of PLoS Biology, blogged that “the ASA notes, the importance of the p-value has been greatly overstated and the scientific community has become over-reliant on this one – flawed – measure.”3 Lawyers for the litigation industry were even less restrained in promoting wild misrepresentations about the Statement, with claims that the ASA had condemned the use of p-values, significance testing, and significance probabilities, as “flawed.”4 And yet, nowhere in the ASA’s statement does the group suggest that the p-value was a “flawed” measure.

Criminal Use of the ASA Statement

Where are we now, two plus years out from the ASA Statement? Not surprisingly, the Statement has made its way into the legal arena. The Statement has been used in any number of depositions, relied upon in briefs, and cited in at least a couple of judicial decisions, in the last two years. The empirical evidence of how the ASA Statement has been used, or might be used in the future, is still sparse. Just last month, the ASA Statement was cited by the Washington State Supreme Court, in a ruling that held the death penalty unconstitutional. State of Washington v. Gregory, No. 88086-7, (Wash. S.Ct., Oct. 11, 2018) (en banc). Mr. Gregory was facing the death penalty, after being duly convicted of rape, robbery, and murder. The prosecution was supported by DNA matches, fingerprint identification, and other evidence. Mr. Gregory challenged the constitutionality of his imposed punishment, not on per se grounds of unconstitutionality, but on race disparities in the imposition of the death penalty. On this claim, the Washington Supreme Court commented on the empirical evidence marshalled on Mr. Gregory’s behalf:

The most important consideration is whether the evidence shows that race has a meaningful impact on imposition of the death penalty. We make this determination by way of legal analysis, not pure science. At the very most, there is an 11 percent chance that the observed association between race and the death penalty in Beckett’s regression analysis is attributed to random chance rather than true association. Commissioner’s Report at 56-68 (the p-values range from 0.048-0.111, which measures the probability that the observed association is the result of random chance rather than a true association).[8] Just as we declined to require ‘precise uniformity’ under our proportionality review, we decline to require indisputably true social science to prove that our death penalty is impermissibly imposed based on race.

Id. (internal citations omitted).

Whatever you think of the death penalty, or how it is imposed in the United States, you will have to agree that the Court’s discussion of statistics is itself criminal. In the above quotation from the Court’s opinion, the Court badly misinterpreted the p-values generated in various regression analyses that were offered to support claims of race disparity. The Court’s equating statistically significant evidence of race disparity in these regression analyses with “indisputably true social science” also reflects a rhetorical strategy that imputes ridiculously high certainty (indisputably true) to social science conclusions, in order to dismiss the need for them before accepting a causal race disparity claim on empirical evidence.5

Gregory’s counsel had briefed the Washington Court on statistical significance, and raised the ASA Statement as excuse and justification for not presenting statistically significant empirical evidence of race disparity.6 Footnote 8, in the above quote from the Gregory decision, shows that the Court was aware of the ASA Statement, which makes the Court’s errors even more unpardonable:

“[8] The most common p-value used for statistical significance is 0.05, but this is not a bright line rule. The American Statistical Association (ASA) explains that the ‘mechanical “bright-line” rules (such as “p < 0.05”) for justifying scientific claims or conclusions can lead to erroneous beliefs and poor decision making’.”7

Conveniently, Gregory’s counsel did not cite to other parts of the ASA Statement, which would have called for a more searching review of the statistical regression analyses:

“Good statistical practice, as an essential component of good scientific practice, emphasizes principles of good study design and conduct, a variety of numerical and graphical summaries of data, understanding the phenomenon under study, interpretation of results in context, complete reporting and proper logical and quantitative understanding of what data summaries mean. No single index should substitute for scientific reasoning.”8

The Supreme Court of Washington first erred in its assessment of what scientific evidence requires in terms of a burden of proof. It then accepted spurious arguments to excuse the absence of statistical significance in the statistical evidence before it, on the basis of a distorted representation of the ASA Statement. Finally, the Court erred in claiming support from social science evidence, by ignoring other methodological issues in Gregory’s empirical claims. Ironically, the Court had made significance testing the end all and be all of its analysis, and when it dispatched statistical significance as a consideration, the Court jumped to the conclusion it wanted to reach. Clearly, the intended message of the ASA Statement had been subverted by counsel and the Court.
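The gap between a p-value and “the probability that the observed association is the result of random chance” can be illustrated with simple arithmetic. The sketch below is a textbook calculation, not anything taken from the Gregory record or the ASA Statement. It assumes a hypothetical share of tested associations that are real (`prior_true_effect`), a significance threshold (`alpha`), and a hypothetical statistical power (`power`), and it computes what fraction of results with p at or below the threshold would come from cases with no true association. All of the input values are invented for illustration.

```python
def share_of_true_nulls(prior_true_effect, alpha, power):
    """Fraction of 'significant' results (p <= alpha) that arise when there is
    no true association, given hypothetical inputs.

    prior_true_effect: assumed share of tested associations that are real
    alpha:             significance threshold, e.g. 0.05
    power:             assumed chance of p <= alpha when the association is real
    """
    false_positives = (1.0 - prior_true_effect) * alpha
    true_positives = prior_true_effect * power
    return false_positives / (false_positives + true_positives)


if __name__ == "__main__":
    # Same threshold and power, three different hypothetical priors.
    for prior in (0.1, 0.5, 0.9):
        share = share_of_true_nulls(prior, alpha=0.05, power=0.8)
        print(f"prior={prior:.1f}: share of chance findings = {share:.3f}")
```

With the threshold held at 0.05, the computed share moves from roughly a third to under one percent as the assumed prior changes. The p-value is calculated on the assumption that there is no association, and so it cannot, by itself, be the probability that an observed association is due to chance.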

2 See, e.g., In re Ephedra Prods. Liab. Litig., 393 F.Supp. 2d 181, 191 (S.D.N.Y. 2005). See also “Confidence in Intervals and Diffidence in the Courts” (March 4, 2012); “Scientific illiteracy among the judiciary” (Feb. 29, 2012).

5 Moultrie v. Martin, 690 F.2d 1078, 1082 (4th Cir. 1982) (internal citations omitted) (“When a litigant seeks to prove his point exclusively through the use of statistics, he is borrowing the principles of another discipline, mathematics, and applying these principles to the law. In borrowing from another discipline, a litigant cannot be selective in which principles are applied. He must employ a standard mathematical analysis. Any other requirement defies logic to the point of being unjust. Statisticians do not simply look at two statistics, such as the actual and expected percentage of blacks on a grand jury, and make a subjective conclusion that the statistics are significantly different. Rather, statisticians compare figures through an objective process known as hypothesis testing.”).

6 Supplemental Brief of Allen Eugene Gregory, at 15, filed in State of Washington v. Gregory, No. 88086-7, (Wash. S.Ct., Jan. 22, 2018).

7 State of Washington v. Gregory, No. 88086-7, (Wash. S.Ct., Oct. 11, 2018) (en banc) (internal citations omitted).

8 ASA Statement at 132.

Passing Hypotheses Off as Causal Conclusions – Allen v. Martin Surfacing

November 11th, 2018

The November 2018 issue of the American Bar Association Journal (ABAJ) featured an exposé-style article on the hazards of our chemical environment, worthy of Mother Jones, or the International Journal of Health Nostrums, by a lawyer, Alan Bell.1 Alan Bell, according to his website, is a self-described “environmental health warrior.” Channeling Chuck McGill, Bell also describes himself as a:

“[v]ictim, survivor, advocate and avenger. This former organized crime prosecutor almost died from an environmentally linked illness. He now devotes his life to giving a voice for those too weak or sick to fight for themselves.”

Bell apparently is not so ill that he cannot also serve as “a fierce advocate” for victims of chemicals. Here is how Mr. Bell described his own “environmentally linked illness” (emphasis added):

“Over the following months, Alan developed high fevers, sore throats, swollen glands and impaired breathing. Eventually, he experienced seizures and could barely walk. His health continued to worsen until he became so ill he was forced to stop working. Despite being examined and tested by numerous world-renowned doctors, none of them could help. Finally, a doctor diagnosed him with Multiple Chemical Sensitivity, a devastating illness caused by exposure to environmental toxins. The medical profession had no treatment to offer Alan: no cure, and no hope. Doctors could only advise him to avoid all synthetic chemicals and live in complete isolation within a totally organic environment.”

Multiple chemical sensitivity (MCS)? Does anyone still remember “clinical ecology”? Despite the strident advocacy of support groups and self-proclaimed victims, MCS is not recognized as a chemically caused illness by the World Health Organization, the American Medical Association, the American Academy of Allergy and Immunology, and the American College of Physicians.2 Double-blinded, placebo-controlled clinical trials have shown that putative MCS patients respond to placebo as strongly as they react to chemicals.3

Still, Bell’s claims must be true; Bell has written a book, Poisoned, about his ordeal and that of others.4 After recounting his bizarre medical symptoms, he describes his miraculous cure in a sterile bubble in the Arizona desert. From safe within his bubble, Bell has managed to create the “Environmental Health Foundation,” which is difficult if not impossible to find on the internet, although there are some cheesy endorsements to be found on YouTube.

According to Bell’s narrative, Daniel Allen, the football coach of the College of the Holy Cross, was experiencing neurological signs and symptoms that could not be explained by physicians in the Boston area, home to some of the greatest teaching hospitals in the world. Allen and his wife, Laura, reached out to Bell through his Foundation. Bell describes how he put the Allens in touch with Marcia Ratner, who sits on the Scientific Advisory Board of his Environmental Health Foundation. Bell sent the Allens to see “the world renown” Marcia Ratner, who diagnosed Mr. Allen with amyotrophic lateral sclerosis (ALS). Bell’s story may strike some as odd, considering that Ratner is not a physician. Ratner could not provide a cure for Mr. Allen’s tragic disease, but she could help provide the Allens with a lawsuit.

According to Bell:

Testimony from a sympathetic widow, combined with powerful evidence that the chemicals Dan was exposed to caused him to die long before his time, would smash their case to bits. The defense opted to seek a settlement. The case settled in 2009.5

The ABAJ article on the Allen case is a reprise of chapter 15 of Bell’s book, “Chemicals Take Down a Football Coach.” Shame on the A.B.A. for not marking the article as unpaid advertising. More shame on the A.B.A. for not fact checking the glib causal claims made in the article, some of which have been the subject of a recently published “case report” in the red journal, the American Journal of Industrial Medicine, by Dr. Ratner and some, but not all, of the other expert witnesses for Mr. Allen’s litigation team.6 Had the editors of the ABAJ compared Mr. Bell’s statements and claims about the Allen case with that case report, they would have seen that Dr. Ratner, et al., ten years after beating back the defendants’ Daubert motion in the Allen case, described their literature review and assessment of Mr. Allen’s case as merely “hypothesis generating”:

“This literature review and clinical case report about a 45-year-old man with no family history of motor neuron disease who developed overt symptoms of a neuromuscular disorder in close temporal association with his unwitting occupational exposure to volatile organic compounds (VOCs) puts forth the hypothesis that exposure to VOCs such as toluene, which disrupt motor function and increase oxidative stress, can unmask latent ALS type neuromuscular disorder in susceptible individuals.”7

         * * * * * * *

“In conclusion, this hypothesis generating case report provides additional support for the suggestion that exposure to chemicals that share common mechanisms of action with those implicated in the pathogenesis of ALS type neuromuscular disorders can unmask latent disease in susceptible persons. Further research is needed to elucidate these relationships.”8

So in 2018, the Allen case was merely a “hypothesis generating” case report. Ten years earlier, however, in 2008, when Ratner, Abou-Donia, Oliver, Ewing, and Clapp gave solemn oaths and testified under penalty of perjury to a federal district judge, the facts of the same case warranted a claim to scientific knowledge, under Rule 702. Judges, lawyers, and legal reformers should take note of how expert witnesses will characterize facile opinions as causal conclusions when speaking as paid witnesses, and as mere hypotheses in need of evidentiary support when speaking in professional journals to scientists. You’re shocked; eh?

Sometimes when federal courts permit dubious causation opinion testimony over Rule 702 objections, the culprit is bad lawyering by the opponent of the proffered testimony. The published case report by Ratner helps demonstrate that Allen v. Martin Surfacing, 263 F.R.D. 47 (D. Mass. 2009), was the result of litigation overreach by plaintiffs’ counsel and their paid expert witnesses, and a failure of organized skepticism by defense counsel and the judiciary.

Marcia H. Ratner, Ph.D.

I first encountered Dr. Ratner as an expert witness for the litigation industry in cases involving manganese-containing welding rods. Plaintiffs’ counsel, Dickie Scruggs, et al., withdrew her before the defense could conduct an examination before trial. When I came across the Daubert decision in the Allen case, I was intrigued because I had read Ratner’s dissertation9 and her welding litigation report, and saw what appeared to be fallacies10 similar to those that plagued the research of Dr. Brad Racette, who also had worked with Scruggs in conducting screenings, from which he extracted “data” for a study, which for a while became the centerpiece of Scruggs’ claims.11

The Allen case provoked some research on my part, and then a blog post about that case and Dr. Ratner.12 Dr. Ratner took umbrage at my blog post; and in email correspondence, she threatened to sue me for tortious interference with her prospective business opportunities. She also felt that the blog post had put her in a bad light by commenting upon her criminal conviction for unlawful gun possession.13 As a result of our correspondence, and seeing that Dr. Ratner was no stranger to the courtroom,14 I wrote a post-script to add some context and her perspective on my original post.15

One fact Dr. Ratner wished me to include in the blog post-script was that plaintiffs’ counsel in the Allen case had pressured her to opine that toluene and isocyanates caused Mr. Allen’s ALS, and that she had refused. Dr. Ratner of course was making a virtue of necessity since there was, and is, a mountain of medical opinion, authoritative and well-supported, that there is no known cause of sporadic ALS.16 Dr. Ratner was very proud, however, of having devised a work-around, by proffering an opinion that toluene caused the acceleration of Mr. Allen’s ALS. This causal claim about accelerated onset could have been tested with an observational study, but the litigation claim about earlier onset was as lacking in evidential support as the more straightforward claim of causation.

Bell’s article in the ABAJ – or rather his advertisement17 – cited an unpublished write up of the Allen case, by Ratner, The Allen Case: Our Daubert Strategy, Victory, and Its Legal and Medical Landmark Ramifications, in which she kvelled about how the Allen case was cited in the Reference Manual on Scientific Evidence. The Manual’s citations, however, were about the admissibility of the industrial hygienist’s proffered testimony on exposure, based in turn on Mr. Allen’s account of acute-onset symptoms.18 The Manual does not address the dubious acceleration aspect of Ratner’s causal opinion in the Allen case.

The puff piece in the ABAJ caused me to look again at Dr. Ratner’s activities. The Better Business Bureau reports that Dr. Marcia Ratner is a medical consultant in occupational and environmental toxicology. Since early 2016, she has been the sole proprietor of a consulting firm, Neurotoxicants.com, located in Mendon, Vermont. The firm’s website advertises that:

The Principals and Consultants of Neurotoxicants.com provide expert consulting in neurotoxicology and the relationships between neurotoxic chemical exposures and neurodegenerative disease onset and progression.

Only Ratner is identified as working on consulting through the firm. According to the LinkedIn entry for Neurotoxicants.com, Ratner is also the founder and director of Medical-Legal Research at Neurotoxicants.com. Ratner’s website advertises her involvement in occupational exposure litigation as an expert witness for claimants.19 Previously, Ratner was the Vice President and Director of Research at Chemical Safety Net, Inc., another consulting firm that she had founded with the late Robert G. Feldman, MD.

Conflict of Interest

The authors of the published Allen case report gave a curious conflict-of-interest disclosure at the end of their article:

“The authors have no current specific competing interests to declare. However, Drs. Ratner, Abou-Donia and Oliver, and Mr. Ewing all served as expert witnesses in this case which settled favorably for the patient over 10 years ago with an outcome that is a fully disclosed matter of public record. Drs. Ratner, Abou-Donia and Oliver and Mr. Ewing are occasionally asked to serve as expert witnesses and/or consultants in occupational and environmental chemical exposure injury cases.”20

The disclosure conveniently omitted that Dr. Ratner owns a business that she set up to provide medico-legal consulting, and that Dr. Oliver testifies with some frequency in asbestos cases. None of the authors was, or is, an expert in the neuroepidemiology of ALS. Dr. Ratner’s conflict-of-interest disclosure in the Allen case report was, however, better than her efforts in previous publications that touched on the subject matter of her commercial consulting practice.21


1 Alan Bell, “Devastated by office chemicals, an attorney helps others fight toxic torts,” Am. Bar. Ass’n J. (Nov. 2018).

2 See, e.g., American Academy of Allergy, Asthma and Immunology, “Idiopathic environmental intolerances,” 103 J. Allergy Clin. Immunol. 36 (1999).

3 See Susanne Bornschein, Constanze Hausteiner, Horst Römmelt, Dennis Nowak, Hans Förstl, and Thomas Zilker, “Double-blind placebo-controlled provocation study in patients with subjective Multiple Chemical Sensitivity and matched control subjects,” 46 Clin. Toxicol. 443 (2008); Susanne Bornschein, Hans Förstl, and Thomas Zilker, “Idiopathic environmental intolerances (formerly multiple chemical sensitivity) psychiatric perspectives,” 250 J. Intern. Med. 309 (2001).

4 Poisoned: How a Crime-Busting Prosecutor Turned His Medical Mystery into a Crusade for Environmental Victims (Skyhorse Publishing 2017).

5 Steven H. Foskett Jr., “Late Holy Cross coach’s family, insurers settle lawsuit for $681K,” Telegram & Gazette (Oct. 1, 2009). Obviously, the settlement amount represented a deep compromise over any plaintiff’s verdict.

6 Marcia H. Ratner, Joe F. Jabre, William M. Ewing, Mohamed Abou-Donia, and L. Christine Oliver, “Amyotrophic lateral sclerosis—A case report and mechanistic review of the association with toluene and other volatile organic compounds,” 61 Am. J. Ind. Med. 251 (2018).

7 Id. at 251.

8 Id. at 258 (emphasis added).

9 Marcia Hillary Ratner, Age at Onset of Parkinson’s Disease Among Subjects Occupationally Exposed to Metals and Pesticides; Doctoral Dissertation, UMI Number 3125932, Boston University (2004). Neither Ratner’s dissertation supervisor nor her three readers were epidemiologists.

11 See Brad A. Racette, S.D. Tabbal, D. Jennings, L. Good, Joel S. Perlmutter, and Brad Evanoff, “Prevalence of parkinsonism and relationship to exposure in a large sample of Alabama welders,” 64 Neurology 230 (2005).

13 See “Quincy District Court News,” Patriot Ledger (June 9, 2010) (reporting that Ratner pleaded guilty to criminal possession of mace and a firearm).

14 Ratner v. Village Square at Pico Condominium Owners Ass’n, Inc., No. 91-2-11 Rdcv (Teachout, J., Aug. 28, 2012).

17 Bell is a client of the Worthy Marketing Group.

18 RMSE3d at 505-06 n.5, 512-13 n. 26, 540 n.88; see also Allen v. Martin Surfacing, 2009 WL 3461145, 2008 U.S. Dist. LEXIS 111658, 263 F.R.D. 47 (D. Mass. 2008) (holding that an industrial hygienist was qualified to testify about the concentration and duration of plaintiffs’ exposure to toluene and isocyanates).

20 Id. at 259. One of the plaintiffs’ expert witnesses, Richard W. Clapp, opted out of co-author status on this publication.

21 See Marcia H. Ratner & Edward Fitzgerald, “Understanding of the role of manganese in parkinsonism and Parkinson disease,” 88 Neurology 338 (2017) (claiming no relevant conflicts of interest); Marcia H. Ratner, David H. Farb, Josef Ozer, Robert G. Feldman, and Raymon Durso, “Younger age at onset of sporadic Parkinson’s disease among subjects occupationally exposed to metals and pesticides,” 7 Interdiscip. Toxicol. 123 (2014) (failing to make any disclosure of conflicts of interest). In one short case report written with Dr. Jonathan Rutchik, another expert witness who actively participated for the plaintiffs’ litigation industry in welding fume cases, Dr. Ratner let on that she “occasionally” is asked to serve as an expert witness, but she failed to disclose that she has a business enterprise set up to commercialize her expert witness work. Jonathan Rutchik & Marcia H. Ratner, “Is it Possible for Late-Onset Schizophrenia to Masquerade as Manganese Psychosis?” 60 J. Occup. & Envt’l Med. E207 (2018) (“The authors have no current specific competing interests to declare. However, Dr. Rutchik served as expert witnesses [sic] in this case. Drs. Rutchik and Ratner are occasionally asked to serve as expert witnesses and/or consultants in occupational and environmental chemical exposure injury cases.”)

Confounding in Daubert, and Daubert Confounded

November 4th, 2018

ABERRANT DECISIONS

The Daubert trilogy and the statutory revisions to Rule 702 have not brought universal enlightenment. Many decisions reflect a curmudgeonly and dismissive approach to gatekeeping.

The New Jersey Experience

Until recently, New Jersey law looked as though it favored vigorous gatekeeping of invalid expert witness opinion testimony. The law as applied, however, was another matter, with most New Jersey judges keen to find ways to escape the logical and scientific implications of the articulated standards, at least in civil cases.1 For example, in Grassis v. Johns-Manville Corp., 248 N.J. Super. 446, 591 A.2d 671, 675 (App. Div. 1991), the intermediate appellate court discussed the possibility that confounders may lead to an erroneous inference of a causal relationship. Plaintiffs’ counsel claimed that occupational asbestos exposure causes colorectal cancer, but the available studies, inconsistent as they were, failed to assess the role of smoking, family history, and dietary factors. The court essentially shrugged its judicial shoulders and let a plaintiffs’ verdict stand, even though it was supported by expert witness testimony that had relied upon seriously flawed and confounded studies. Not surprisingly, 15 years after the Grassis case, the scientific community acknowledged what should have been obvious in 1991: the studies did not support a conclusion that asbestos causes colorectal cancer.2

This year, however, saw the New Jersey Supreme Court step in to help extricate the lower courts from their gatekeeping doldrums. In a case that involved the dismissal of plaintiffs’ expert witnesses’ testimony in over 2,000 Accutane cases, the New Jersey Supreme Court demonstrated how to close the gate on testimony that is based upon flawed studies and involves tenuous and unreliable inferences.3 There were other remarkable aspects of the Supreme Court’s Accutane decision. For instance, the Court put its weight behind the common-sense and accurate interpretation of Sir Austin Bradford Hill’s famous articulation of factors for causal judgment, which requires that sampling error, bias, and confounding be eliminated before assessing whether the observed association is strong, consistent, plausible, and the like.4

Cook v. Rockwell International

The litigation over radioactive contamination from the Colorado Rocky Flats nuclear weapons plant is illustrative of the retrograde tendency in some federal courts. The defense objected to plaintiffs’ expert witness, Dr. Clapp, whose study failed to account for known confounders.5 Judge Kane denied the challenge, claiming that the defense could:

“cite no authority, scientific or legal, that compliance with all, or even one, of these factors is required for Dr. Clapp’s methodology and conclusions to be deemed sufficiently reliable to be admissible under Rule 702. The scientific consensus is, in fact, to the contrary. It identifies Defendants’ list of factors as some of the nine factors or lenses that guide epidemiologists in making judgments about causation. Ref. Guide on Epidemiology at 375.”6

In Cook, the trial court or the parties or both missed the obvious references in the Reference Manual to the need to control for confounding. Certainly many other scientific sources could be cited as well. Judge Kane apparently took a defense expert witness’s statement that ecological studies do not account for confounders to mean that the presence of confounding does not render such studies unscientific. Id. True but immaterial. Ecological studies may be “scientific,” but they do not warrant inferences of causation. Some so-called scientific studies are merely hypothesis generating, preliminary, tentative, or data-dredging exercises. Judge Kane employed the flaws-are-features approach, and opined that ecological studies are merely “less probative” than other studies, and the relative weights of studies do not render them inadmissible.7 This approach is, of course, a complete abdication of gatekeeping responsibility. First, studies themselves are not admissible; it is the expert witness, whose testimony is challenged. The witness’s reliance upon studies is relevant to the Rule 702 and 703 analyses, but admissibility is not the issue. Second, Rule 702 requires that the proffered opinion be “scientific knowledge,” and ecological studies simply lack the necessary epistemic warrant to support a causal conclusion. Third, the trial court in Cook had to ignore the federal judiciary’s own reference manual’s warnings about the inability of ecological studies to provide causal inferences.8 The Cook case is part of an unfortunate trend to regard all studies as “flawed,” and their relative weights simply a matter of argument and debate for the litigants.9
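Why an uncontrolled confounder matters can be seen in a small numerical sketch. The counts below are entirely invented, and are not drawn from Grassis, Cook, or any actual study. They describe a hypothetical cohort in which smoking is more common among the exposed and also raises the risk of disease. Within each smoking stratum the exposed and unexposed have the same risk, and yet the crude comparison, which ignores smoking, shows an apparent risk ratio well above 1.0.

```python
# Hypothetical cohort, invented for illustration only.
# Each entry is (cases, persons) for exposed and unexposed groups,
# listed separately for smokers and non-smokers.
STRATA = {
    "smokers": {"exposed": (80, 800), "unexposed": (20, 200)},
    "non_smokers": {"exposed": (4, 200), "unexposed": (16, 800)},
}


def risk(cases, persons):
    """Proportion of persons who developed the disease."""
    return cases / persons


def crude_risk_ratio(strata):
    """Risk ratio computed by pooling all strata, ignoring the confounder."""
    exp_cases = sum(s["exposed"][0] for s in strata.values())
    exp_total = sum(s["exposed"][1] for s in strata.values())
    unexp_cases = sum(s["unexposed"][0] for s in strata.values())
    unexp_total = sum(s["unexposed"][1] for s in strata.values())
    return risk(exp_cases, exp_total) / risk(unexp_cases, unexp_total)


def stratum_risk_ratios(strata):
    """Risk ratio computed separately within each level of the confounder."""
    return {
        name: risk(*s["exposed"]) / risk(*s["unexposed"])
        for name, s in strata.items()
    }


if __name__ == "__main__":
    print("crude risk ratio:", round(crude_risk_ratio(STRATA), 2))
    for name, rr in stratum_risk_ratios(STRATA).items():
        print(f"risk ratio among {name}:", round(rr, 2))
```

In this invented cohort the crude risk ratio is about 2.3, while the risk ratio within each smoking stratum is 1.0. A study that never measured smoking would report the first number and could not reveal the second.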

Abilify

Another example of sloppy reasoning about confounding can be found in a recent federal trial court decision, In re Abilify Products Liability Litigation,10 where the trial court advanced a futility analysis. All observational studies have potential confounding, and so confounding is not an error but a feature. Given this simplistic position, it follows that failure to control for every imaginable potential confounder does not invalidate an epidemiologic study.11 From its nihilistic starting point, the trial court readily found that an expert witness could reasonably dispense with controlling for confounding factors of psychiatric conditions in studies of a putative association between the antipsychotic medication Abilify and gambling disorders.12

Under this sort of “reasoning,” some criminal defense lawyers might argue that since all human beings are “flawed,” we have no basis to distinguish sinners from saints. We have a long way to go before our courts are part of the evidence-based world.


1 In the context of a “social justice” issue such as whether race disparities exist in death penalty cases, the New Jersey Supreme Court has carefully considered confounding in its analyses. See In re Proportionality Review Project (II), 165 N.J. 206, 757 A.2d 168 (2000) (noting that bivariate analyses of race and capital sentences were confounded by missing important variables). Unlike the New Jersey courts (until the recent decision in Accutane), the Texas courts were quick to adopt the principles and policies of gatekeeping expert witness opinion testimony. See Merrell Dow Pharms., Inc. v. Havner, 953 S.W.2d 706, 714, 724 (Tex.1997) (reviewing court should consider whether the studies relied upon were scientifically reliable, including consideration of the presence of confounding variables). Even some so-called Frye jurisdictions “get it.” See, e.g., Porter v. SmithKline Beecham Corp., No. 3516 EDA 2015, 2017 WL 1902905 *6 (Phila. Super., May 8, 2017) (unpublished) (affirming exclusion of plaintiffs’ expert witness on epidemiology, under Frye test, for relying upon an epidemiologic study that failed to exclude confounding as an explanation for a putative association), affirming, Mem. Op., No. 03275, 2015 WL 5970639 (Phila. Ct. Com. Pl. Oct. 5, 2015) (Bernstein, J.), and Op. sur Appellate Issues (Phila. Ct. Com. Pl., Feb. 10, 2016) (Bernstein, J.).

3 In re Accutane Litig., ___ N.J. ___, ___ A.3d ___, 2018 WL 3636867 (2018); see “N.J. Supreme Court Uproots Weeds in Garden State’s Law of Expert Witnesses” (Aug. 8, 2018).

4 2018 WL 3636867, at *20 (citing the Reference Manual 3d ed., at 597-99).

5 Cook v. Rockwell Internat’l Corp., 580 F. Supp. 2d 1071, 1098 (D. Colo. 2006) (“Defendants next claim that Dr. Clapp’s study and the conclusions he drew from it are unreliable because they failed to comply with four factors or criteria for drawing causal interferences from epidemiological studies: accounting for known confounders … .”), rev’d and remanded on other grounds, 618 F.3d 1127 (10th Cir. 2010), cert. denied, ___ U.S. ___ (May 24, 2012). For another example of a trial court refusing to see through important qualitative differences between and among epidemiologic studies, see In re Welding Fume Prods. Liab. Litig., 2006 WL 4507859, *33 (N.D. Ohio 2006) (reducing all studies to one level, and treating all criticisms as though they rendered all studies invalid).

6 Id.   

7 Id.

8 RMSE3d at 561-62 (“[ecological] studies may be useful for identifying associations, but they rarely provide definitive causal answers”) (internal citations omitted); see also David A. Freedman, “Ecological Inference and the Ecological Fallacy,” in Neil J. Smelser & Paul B. Baltes, eds., 6 Internat’l Encyclopedia of the Social and Behavioral Sciences 4027 (2001).

9 See also McDaniel v. CSX Transportation, Inc., 955 S.W.2d 257 (Tenn. 1997) (considering confounding but holding that it was a jury issue); Perkins v. Origin Medsystems Inc., 299 F. Supp. 2d 45 (D. Conn. 2004) (striking reliance upon a study with uncontrolled confounding, but allowing expert witness to testify anyway)

10 In re Abilify (Aripiprazole) Prods. Liab. Litig., 299 F. Supp. 3d 1291 (N.D. Fla. 2018).

11 Id. at 1322-23 (citing Bazemore as a purported justification for the court’s nihilistic approach); see Bazemore v. Friday, 478 U.S. 385, 400 (1986) (“Normally, failure to include variables will affect the analysis’ probativeness, not its admissibility.”).

12 Id. at 1325.


Appendix – Some Federal Court Decisions on Confounding

1st Circuit

Bricklayers & Trowel Trades Internat’l Pension Fund v. Credit Suisse Sec. (USA) LLC, 752 F.3d 82, 85 (1st Cir. 2014) (affirming exclusion of expert witness whose event study and causal conclusion failed to consider relevant confounding variables and information that entered market on the event date)

2d Circuit

In re “Agent Orange” Prod. Liab. Litig., 597 F. Supp. 740, 783 (E.D.N.Y. 1984) (noting that confounding had not been sufficiently addressed in a study of U.S. servicemen exposed to Agent Orange), aff’d, 818 F.2d 145 (2d Cir. 1987) (approving district court’s analysis), cert. denied sub nom. Pinkney v. Dow Chemical Co., 484 U.S. 1004 (1988)

3d Circuit

In re Zoloft Prods. Liab. Litig., 858 F.3d 787, 793, 799 (3d Cir. 2017) (acknowledging that statistically significant findings occur in the presence of inadequately controlled confounding or bias; affirming the exclusion of statistical expert witness, Nicholas Jewell, in part for using an admittedly non-rigorous approach to adjusting for confounding by indication)

4th Circuit

Gross v. King David Bistro, Inc., 83 F. Supp. 2d 597 (D. Md. 2000) (excluding expert witness who opined shigella infection caused fibromyalgia, given the existence of many confounding factors that muddled the putative association)

5th Circuit

Kelley v. American Heyer-Schulte Corp., 957 F. Supp. 873 (W.D. Tex. 1997) (noting that observed association may be causal or spurious, and that confounding factors must be considered to distinguish spurious from real associations)

Brock v. Merrell Dow Pharms., Inc., 874 F.2d 307, 311 (5th Cir. 1989) (noting that “[o]ne difficulty with epidemiologic studies is that often several factors can cause the same disease.”)

6th Circuit

Nelson v. Tennessee Gas Pipeline Co., WL 1297690, at *4 (W.D. Tenn. Aug. 31, 1998) (excluding an expert witness who failed to take into consideration confounding factors), aff’d, 243 F.3d 244, 252 (6th Cir. 2001), cert. denied, 534 U.S. 822 (2001)

Adams v. Cooper Indus. Inc., 2007 WL 2219212, 2007 U.S. Dist. LEXIS 55131 (E.D. Ky. 2007) (differential diagnosis includes ruling out confounding causes of plaintiffs’ disease).

7th Circuit

People Who Care v. Rockford Bd. of Educ., 111 F.3d 528, 537-38 (7th Cir. 1997) (Posner, J.) (“a statistical study that fails to correct for salient explanatory variables, or even to make the most elementary comparisons, has no value as causal explanation and is therefore inadmissible in a federal court”) (educational achievement in multiple regression);

Sheehan v. Daily Racing Form, Inc., 104 F.3d 940 (7th Cir. 1997) (holding that expert witness’s opinion, which failed to correct for any potential explanatory variables other than age, was inadmissible)

Allgood v. General Motors Corp., 2006 WL 2669337, at *11 (S.D. Ind. 2006) (noting that confounding factors must be carefully addressed; holding that selection bias rendered expert testimony inadmissible)

9th Circuit

In re Bextra & Celebrex Marketing Sales Practices & Prod. Liab. Litig., 524 F.Supp. 2d 1166, 1178-79 (N.D. Cal. 2007) (noting plaintiffs’ expert witnesses’ inconsistent criticism of studies for failing to control for confounders; excluding opinions that Celebrex at 200 mg/day can cause heart attacks, as failing to satisfy Rule 702)

Avila v. Willits Envt’l Remediation Trust, 2009 WL 1813125, 2009 U.S. Dist. LEXIS 67981 (N.D. Cal. 2009) (excluding expert witness’s opinion in part because of his failure to rule out confounding exposures and risk factors for the outcomes of interest), aff’d in relevant part, 633 F.3d 828 (9th Cir.), cert denied, 132 S.Ct. 120 (2011)

Hendricksen v. ConocoPhillips Co., 605 F. Supp. 2d 1142, 1158 (E.D. Wash. 2009) (“In general, epidemiology studies are probative of general causation: a relative risk greater than 1.0 means the product has the capacity to cause the disease. ‘Where the study properly accounts for potential confounding factors and concludes that exposure to the agent is what increases the probability of contracting the disease, the study has demonstrated general causation – that exposure to the agent is capable of causing [the illness at issue] in the general population.’”) (internal quotation marks and citation omitted)

Valentine v. Pioneer Chlor Alkali Co., Inc., 921 F. Supp. 666, 677 (D. Nev. 1996) (“In summary, Dr. Kilburn’s study suffers from very serious flaws. He took no steps to eliminate selection bias in the study group, he failed to identify the background rate for the observed disorders in the Henderson community, he failed to control for potential recall bias, he simply ignored the lack of reliable dosage data, he chose a tiny sample size, and he did not attempt to eliminate so-called confounding factors which might have been responsible for the incidence of neurological disorders in the subject group.”)

Claar v. Burlington No. RR, 29 F.3d 499 (9th Cir. 1994) (affirming exclusion of plaintiffs’ expert witnesses, and grant of summary judgment, when plaintiffs’ witnesses concluded that the plaintiffs’ injuries were caused by exposure to toxic chemicals, without investigating any other possible causes).

10th Circuit

Hollander v. Sandoz Pharms. Corp., 289 F.3d 1193, 1213 (10th Cir. 2002) (affirming exclusion in Parlodel case involving stroke; confounding makes case reports inappropriate bases for causal inferences, and even observational epidemiologic studies must be evaluated carefully for confounding)

D.C. Circuit

American Farm Bureau Fed’n v. EPA, 559 F.3d 512 (D.C. Cir. 2009) (noting that in setting particulate matter standards addressing visibility, agency should avoid relying upon data that failed to control for the confounding effects of humidity)

Confounding in the Courts

November 2nd, 2018

Confounding in the Lower Courts

To some extent, lower courts, especially in the federal court system, got the message: Rule 702 required them to think about the evidence, and to consider threats to validity. Institutionally, there were signs of resistance to the process. Most judges were clearly much more comfortable with proxies of validity, such as qualification, publication, peer review, and general acceptance. Unfortunately for them, the Supreme Court had spoken, and then, in 2000, the Rules Committee and Congress spoke by revising Rule 702 to require a searching review of the studies upon which challenged expert witnesses were relying. Some of the cases involving confounding of one sort or another follow.

Confounding and Statistical Significance

Some courts and counsel confuse statistical significance with confounding, and suggest that a showing of statistical significance eliminates concern over confounding. This is, as several commentators have indicated, quite wrong.1 Despite the widespread criticism of this mistake in the Brock opinion, lawyers continue to repeat the mistake. One big-firm defense lawyer, for instance, claimed that “a statistically significant confidence interval helps ensure that the findings of a particular study are not due to chance or some other confounding factors.”2
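The distinction can be made concrete with a small, purely hypothetical calculation. The sketch below assumes an exposure that has no effect at all on disease, and a single binary confounder that is more common among the exposed and that raises disease risk on its own. The counts, the function names, and the choice of a Wald interval and a Mantel-Haenszel summary are illustrative assumptions, not drawn from any case or study discussed here.

```python
import math

def risk_ratio_ci(cases_exp, n_exp, cases_unexp, n_unexp, z=1.96):
    """Risk ratio with a Wald 95% confidence interval on the log scale."""
    rr = (cases_exp / n_exp) / (cases_unexp / n_unexp)
    se = math.sqrt(1 / cases_exp - 1 / n_exp + 1 / cases_unexp - 1 / n_unexp)
    return rr, math.exp(math.log(rr) - z * se), math.exp(math.log(rr) + z * se)

def mantel_haenszel_rr(strata):
    """Mantel-Haenszel summary risk ratio across strata of the confounder."""
    num = sum(a * n0 / (n1 + n0) for a, n1, c, n0 in strata)
    den = sum(c * n1 / (n1 + n0) for a, n1, c, n0 in strata)
    return num / den

# Hypothetical counts per stratum:
# (exposed cases, exposed total, unexposed cases, unexposed total)
strata = [
    (160, 800, 40, 200),  # confounder present: 20% risk whether exposed or not
    (10, 200, 40, 800),   # confounder absent: 5% risk whether exposed or not
]

# Collapse the strata, as a study that ignored the confounder would.
crude = [sum(col) for col in zip(*strata)]
crude_rr, lo, hi = risk_ratio_ci(*crude)
adjusted_rr = mantel_haenszel_rr(strata)

print(f"crude RR = {crude_rr:.2f}, 95% CI ({lo:.2f}, {hi:.2f})")
print(f"Mantel-Haenszel RR adjusted for the confounder = {adjusted_rr:.2f}")
```

On these made-up numbers, the crude risk ratio is a little over 2, with a confidence interval of roughly 1.7 to 2.7 that comfortably excludes 1.0, yet the association disappears entirely once the analysis is stratified on the confounder. The interval speaks only to random error; it says nothing about whether the association is confounded.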

Confounding and “Effect Size”

There is a role for study “effect size” in evaluating potential invalidity due to confounding, but it is frequently more nuanced than acknowledged by courts. The phrase “effect size,” of course, is misleading in that it is used to refer to the magnitude of an association, which may or may not be causal. This is one among many instances of sloppy terminology in statistical and epidemiologic science. Nonetheless, the magnitude of the relative risk may play a role in evaluating observational analytical epidemiologic studies for their ability to support a causal inference.

Small Effect Size

If the so-called effect size is low, say about 2.0 or less, actual, potential, or residual confounding (or bias) may well account for the entirety of the association.3 Many other well-known authors have concurred, with some setting the bar considerably higher, asking for risk ratios of three or more, before accepting that a “clear-cut” association has been shown, unthreatened by confounding.4

Large Effect Size

Some courts have acknowledged that a strong association, with a high relative risk (without committing to what is “high”), increases the likelihood of a causal relationship, even while proceeding to ignore the effects of confounding.5 The Reference Manual suggests that a large effect size, such as for smoking and lung cancer (greater than ten-fold, and often higher than 30-fold), eliminates the need to worry about confounding:

“Many confounders have been proposed to explain the association between smoking and lung cancer, but careful epidemiological studies have ruled them out, one after the other.”6

*  *  *  *  *  *

“A relative risk of 10, as seen with smoking and lung cancer, is so high that it is extremely difficult to imagine any bias or confounding factor that might account for it. The higher the relative risk, the stronger the association and the lower the chance that the effect is spurious. Although lower relative risks can reflect causality, the epidemiologist will scrutinize such associations more closely because there is a greater chance that they are the result of uncontrolled confounding or biases.”7

The point about “difficult to imagine” is fair enough in the context of smoking and lung cancer, but that is because no other putative confounder presents such a high relative risk in most studies. In studying other epidemiologic associations, of a high magnitude, the absence of competing risk or correlation from lurking variables would need to be independently shown, rather than relying upon the “case study” of smoking and lung cancer.
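The arithmetic behind both the small and the large effect-size points can be sketched with a standard textbook bias formula for a single binary confounder. The sketch assumes that the exposure has no true effect, that the confounder multiplies disease risk by the same factor in exposed and unexposed persons, and that the only thing that differs between the groups is how common the confounder is. The function name and the input values are hypothetical, chosen only for illustration.

```python
def confounding_rr(p_exposed, p_unexposed, rr_confounder):
    """Apparent exposure-disease risk ratio produced by a binary confounder
    alone, when the exposure itself has no effect.

    p_exposed, p_unexposed: prevalence of the confounder in each group.
    rr_confounder: risk ratio linking the confounder to the disease.
    """
    return (1 + p_exposed * (rr_confounder - 1)) / (
        1 + p_unexposed * (rr_confounder - 1)
    )

# A moderately strong, unevenly distributed confounder can mimic an RR near 2.
weak = confounding_rr(0.8, 0.2, 4.0)

# Even in the most extreme imbalance (every exposed person has the confounder,
# no unexposed person does), the spurious RR equals the confounder's own RR,
# so a 9-fold confounder still cannot manufacture a 10-fold association.
extreme = confounding_rr(1.0, 0.0, 9.0)

print(f"spurious RR from a 4-fold confounder: {weak:.3f}")
print(f"largest spurious RR from a 9-fold confounder: {extreme:.1f}")
```

Under these assumptions, a four-fold risk factor that is found in 80 percent of the exposed and 20 percent of the unexposed generates a spurious risk ratio of a bit more than 2 by itself. To explain away a ten-fold association, by contrast, the confounder would have to be at least a ten-fold risk factor and almost perfectly aligned with exposure, which is why the absence of such a competing risk has to be shown for each association rather than assumed.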

Regression and Other Statistical Analyses

The failure to include a lurking or confounding variable may render a regression analysis invalid and meaningless. The Supreme Court, however, in Bazemore, a case decided before its own decision in Daubert, and before Rule 702 was statutorily modified,8 issued a Supreme ipse dixit, to hold that the selection or omission of variables in multiple regression raises an issue that affects the weight of the analysis:

“Normally, failure to include variables will affect the analysis’ probativeness, not its admissibility.”9

The Supreme Court did, however, acknowledge in Bazemore that:

“There may, of course, be some regressions so incomplete as to be inadmissible as irrelevant; but such was clearly not the case here.”10

The footnote in Bazemore is telling; the majority could imagine or hypothesize a multiple regression so incomplete that it would be irrelevant, but it never thought to ask whether a relevant regression could be so incomplete as to be unreliable or invalid. The invalidity of the regression in Bazemore does not appear to have been raised as an evidentiary issue under Rule 702. None of the briefs in the Supreme Court or the judicial opinions cited or discussed Rule 702.

Despite the inappropriateness of considering the Bazemore precedent after the Court decided Daubert, many lower court decisions have treated Bazemore as dispositive of reliability challenges to regression analyses, without any meaningful discussion.11 In the last several years, however, the appellate courts have awakened on occasion to their responsibilities to ensure that opinions of statistical expert witnesses, based upon regression analyses, are evaluated through the lens of Rule 702.12
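How an omitted variable can render a regression not merely incomplete but invalid can be shown with a toy calculation. The sketch below uses made-up data in which the outcome y is determined entirely by a variable z, while a second variable x merely travels with z. The variable names, the data, and the helper function are hypothetical; nothing here reflects the data in Bazemore or in any other case.

```python
def cross_dev(u, v):
    """Sum of cross-products of deviations from the means of u and v."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))

# Hypothetical data: y depends only on z (y = 3z); x is correlated with z
# but has no effect of its own on y.
z = [0, 0, 1, 1, 2, 2, 3, 3]
x = [0, 1, 1, 2, 2, 3, 3, 4]
y = [3 * zi for zi in z]

sxx, szz, sxz = cross_dev(x, x), cross_dev(z, z), cross_dev(x, z)
sxy, szy = cross_dev(x, y), cross_dev(z, y)

# Least-squares regression of y on x alone (z omitted).
slope_x_only = sxy / sxx

# Least-squares regression of y on x and z together, solving the
# 2x2 normal equations for the centered variables.
det = sxx * szz - sxz ** 2
coef_x = (sxy * szz - sxz * szy) / det
coef_z = (sxx * szy - sxz * sxy) / det

print(f"y on x alone: coefficient on x = {slope_x_only:.2f}")
print(f"y on x and z: coefficient on x = {coef_x:.2f}, on z = {coef_z:.2f}")
```

With z left out, the regression credits x with a coefficient of 2.5; with z included, the coefficient on x falls to zero and the whole relationship is assigned to z. An analysis of this sort is perfectly relevant to the question asked, and yet its answer is wrong, which is a question of validity and not merely of weight.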


1 Brock v. Merrell Dow Pharmaceuticals, Inc., 874 F.2d 307, 311-12 (5th Cir. 1989) (“Fortunately, we do not have to resolve any of the above questions [as to bias and confounding], since the studies presented to us incorporate the possibility of these factors by the use of a confidence interval.”). See, e.g., David Kaye, David Bernstein, and Jennifer Mnookin, The New Wigmore – A Treatise on Evidence: Expert Evidence § 12.6.4, at 546 (2d ed. 2011); Michael O. Finkelstein, Basic Concepts of Probability and Statistics in the Law 86-87 (2009) (criticizing the blatantly incorrect interpretation of confidence intervals by the Brock court).

2 Zach Hughes, “The Legal Significance of Statistical Significance,” 28 Westlaw Journal: Pharmaceutical 1, 2 (Mar. 2012).

3 See Norman E. Breslow & N. E. Day, “Statistical Methods in Cancer Research,” in The Analysis of Case-Control Studies 36 (IARC Pub. No. 32, 1980) (“[r]elative risks of less than 2.0 may readily reflect some unperceived bias or confounding factor”); David A. Freedman & Philip B. Stark, “The Swine Flu Vaccine and Guillain-Barré Syndrome: A Case Study in Relative Risk and Specific Causation,” 64 Law & Contemp. Probs. 49, 61 (2001) (“If the relative risk is near 2.0, problems of bias and confounding in the underlying epidemiologic studies may be serious, perhaps intractable.”).

4 See, e.g., Richard Doll & Richard Peto, The Causes of Cancer 1219 (1981) (“when relative risk lies between 1 and 2 … problems of interpretation may become acute, and it may be extremely difficult to disentangle the various contributions of biased information, confounding of two or more factors, and cause and effect.”); Ernst L. Wynder & Geoffrey C. Kabat, “Environmental Tobacco Smoke and Lung Cancer: A Critical Assessment,” in H. Kasuga, ed., Indoor Air Quality 5, 6 (1990) (“An association is generally considered weak if the odds ratio is under 3.0 and particularly when it is under 2.0, as is the case in the relationship of ETS and lung cancer. If the observed relative risk is small, it is important to determine whether the effect could be due to biased selection of subjects, confounding, biased reporting, or anomalies of particular subgroups.”); David A. Grimes & Kenneth F. Schulz, “False alarms and pseudo-epidemics: the limitations of observational epidemiology,” 120 Obstet. & Gynecol. 920 (2012) (“Most reported associations in observational clinical research are false, and the minority of associations that are true are often exaggerated. This credibility problem has many causes, including the failure of authors, reviewers, and editors to recognize the inherent limitations of these studies. This issue is especially problematic for weak associations, variably defined as relative risks (RRs) or odds ratios (ORs) less than 4.”); Ernst L. Wynder, “Epidemiological issues in weak associations,” 19 Internat’l J. Epidemiol. S5 (1990); Straus S, Richardson W, Glasziou P, Haynes R., Evidence-Based Medicine. How to Teach and Practice EBM (3d ed. 2005); Samuel Shapiro, “Bias in the evaluation of low-magnitude associations: an empirical perspective,” 151 Am. J. Epidemiol. 939 (2000); Samuel Shapiro, “Looking to the 21st century: have we learned from our mistakes, or are we doomed to compound them?” 13 Pharmacoepidemiol. & Drug Safety 257 (2004); Muin J. Khoury, Levy M. James, W. Dana Flanders, and David J. Erickson, “Interpretation of recurring weak associations obtained from epidemiologic studies of suspected human teratogens,” 46 Teratology 69 (1992); Mark Parascandola, Douglas L Weed & Abhijit Dasgupta, “Two Surgeon General’s reports on smoking and cancer: a historical investigation of the practice of causal inference,” 3 Emerging Themes in Epidemiol. 1 (2006); David Sackett, R. Haynes, Gordon Guyatt, and Peter Tugwell, Clinical Epidemiology: A Basic Science for Clinical Medicine (2d ed. 1991); Gary Taubes, “Epidemiology Faces Its Limits,” 269 Science 164, 168 (July 14, 1995) (quoting Marcia Angell, former editor of the New England Journal of Medicine, as stating that “[a]s a general rule of thumb, we are looking for a relative risk of 3 or more [before accepting a paper for publication], particularly if it is biologically implausible or if it’s a brand new finding.”) (quoting John C. Bailar: “If you see a 10-fold relative risk and it’s replicated and it’s a good study with biological backup, like we have with cigarettes and lung cancer, you can draw a strong inference. * * * If it’s a 1.5 relative risk, and it’s only one study and even a very good one, you scratch your chin and say maybe.”); Lynn Rosenberg, “Induced Abortion and Breast Cancer: More Scientific Data Are Needed,” 86 J. Nat’l Cancer Instit. 1569, 1569 (1994) (“A typical difference in risk (50%) is small in epidemiologic terms and severely challenges our ability to distinguish if it reflects cause and effect or if it simply reflects bias.”) (commenting upon Janet R. Daling, K. E. Malone, L. F. Voigt, E. White, and Noel S. Weiss, “Risk of breast cancer among young women: relationship to induced abortion,” 86 J. Nat’l Cancer Instit. 1584 (1994)); Linda Anderson, “Abortion and possible risk for breast cancer: analysis and inconsistencies,” (Wash. D.C., Nat’l Cancer Institute, Oct. 26, 1994) (“In epidemiologic research, relative risks of less than 2 are considered small and are usually difficult to interpret. Such increases may be due to chance, statistical bias, or effects of confounding factors that are sometimes not evident.”); Washington Post (Oct 27, 1994) (quoting Dr. Eugenia Calle, Director of Analytic Epidemiology for the American Cancer Society: “Epidemiological studies, in general are probably not able, realistically, to identify with any confidence any relative risks lower than 1.3 (that is a 30% increase in risk) in that context, the 1.5 [reported relative risk of developing breast cancer after abortion] is a modest elevation compared to some other risk factors that we know cause disease.”). See also “General Causation and Epidemiologic Measures of Risk Size” (Nov. 24, 2012). Even expert witnesses for the litigation industry have agreed that small risk ratios (under two) are questionable for potential and residual confounding. David F. Goldsmith & Susan G. Rose, “Establishing Causation with Epidemiology,” in Tee L. Guidotti & Susan G. Rose, eds., Science on the Witness Stand: Evaluating Scientific Evidence in Law, Adjudication, and Policy 57, 60 (2001) (“There is no clear consensus in the epidemiology community regarding what constitutes a ‘strong’ relative risk, although, at a minimum, it is likely to be one where the RR is greater than two; i.e., one in which the risk among the exposed is at least twice as great as among the unexposed.”)

5 See King v. Burlington Northern Santa Fe Railway Co., 762 N.W.2d 24, 40 (Neb. 2009) (“the higher the relative risk, the greater the likelihood that the relationship is causal”).

6 RMSE3d at 219.

7 RMSE3d at 602. See Landrigan v. Celotex Corp., 127 N.J. 404, 605 A.2d 1079, 1086 (1992) (“The relative risk of lung cancer in cigarette smokers as compared to nonsmokers is on the order of 10:1, whereas the relative risk of pancreatic cancer is about 2:1. The difference suggests that cigarette smoking is more likely to be a causal factor for lung cancer than for pancreatic cancer.”).

8 See Federal Rule of Evidence 702, Pub. L. 93–595, § 1, Jan. 2, 1975, 88 Stat. 1937; Apr. 17, 2000 (eff. Dec. 1, 2000); Apr. 26, 2011 (eff. Dec. 1, 2011).

9 Bazemore v. Friday, 478 U.S. 385, 400 (1986) (reversing Court of Appeals’ decision that would have disallowed a multiple regression analysis that omitted important variables).

10 Id. at 400 n. 10.

11 See, e.g., Manpower, Inc. v. Insurance Company of the State of Pennsylvania, 732 F.3d 796, 799 (7th Cir. 2013) (“the Supreme Court and this Circuit have confirmed on a number of occasions that the selection of the variables to include in a regression analysis is normally a question that goes to the probative weight of the analysis rather than to its admissibility.”); Cullen v. Indiana Univ. Bd. of Trustees, 338 F.3d 693, 701‐02 & n.4 (7th Cir. 2003) (citing Bazemore in rejecting challenge to expert witness’s omission of variables in regression analysis); In re High Fructose Corn Syrup Antitrust Litigation, 295 F.3d 651, 660‐61 (7th Cir. 2002) (refusing to exclude expert witness opinion testimony based upon regression analyses, flawed by omission of key variables); Adams v. Ameritech Servs., Inc., 231 F.3d 414, 423 (7th Cir. 2000) (relying upon Bazemore to affirm statistical analysis based upon correlation with no regression analysis). See also “The Seventh Circuit Regresses on Rule 702” (Oct. 29, 2013).

12 See, e.g., ATA Airlines, Inc. v. Fed. Express Corp., 665 F.3d 882, 888–89 (7th Cir. 2011) (Posner, J.) (reversing on grounds that plaintiff’s regression analysis should never have been admitted), cert. denied, 2012 WL 189940 (Oct. 7, 2012); Zenith Elec. Corp. v. WH-TV Broad. Corp., 395 F.3d 416 (7th Cir.) (affirming exclusion of expert witness opinion whose extrapolations were mere “ipse dixit”), cert. denied, 125 S. Ct. 2978 (2005); Sheehan v. Daily Racing Form, Inc., 104 F.3d 940 (7th Cir. 1997) (Posner, J.) (discussing specification error). See also Munoz v. Orr, 200 F.3d 291 (5th Cir. 2000). For a more enlightened and educated view of regression and the scope and application of Rule 702, from another Seventh Circuit panel, Judge Posner’s decision in ATA Airlines, supra, is a good starting place. See “Judge Posner’s Digression on Regression” (April 6, 2012).