TORTINI

For your delectation and delight, desultory dicta on the law of delicts.

Rule 702 Requires Courts to Sort Out Confounding

October 31st, 2018

CONFOUNDING1

Back in 2000, several law professors wrote an essay in which they detailed some of the problems courts experienced in expert witness gatekeeping. Their article noted that judges easily grasped the problem of generalizing from animal evidence to human experience, and thus simplistically emphasized human (epidemiologic) data. But in their emphasis on the problems in toxicological evidence, the judges missed problems of internal validity, such as confounding, in epidemiologic studies:

Why do courts have such a preference for human epidemiological studies over animal experiments? Probably because the problem of external validity (generalizability) is one of the most obvious aspects of research methodology, and therefore one that non-scientists (including judges) are able to discern with ease – and then give excessive weight to (because whether something generalizes or not is an empirical question; sometimes things do and other times they do not). But even very serious problems of internal validity are harder for the untrained to see and understand, so judges are slower to exclude inevitably confounded epidemiological studies (and give insufficient weight to that problem). Sophisticated students of empirical research see the varied weaknesses, want to see the varied data, and draw more nuanced conclusions.”2

I am not sure that the problems are related in quite the fashion suggested by the authors, but their assessment that judges are reluctant to break the seal on the black box of epidemiology, and that they frequently lack the ability to make nuanced evaluations of the studies on which expert witnesses rely, seems fair enough. Judges continue to miss important validity issues, perhaps because the adversarial process reduces all studies to debating points in litigation.3

The frequent existence of validity issues undermines the partisan suggestion that Rule 702 exclusions are merely about “sufficiency of the evidence.” Sometimes, there is just too much of nothing to rise even to a problem of insufficiency. Some studies are “not even wrong.”4 Similarly, validity issues are an embarrassment to those authors who argue that we must assemble all the evidence and consider the entirety under ethereal standards, such as “weight of the evidence,” or “inference to the best explanation.” Sometimes, some or much of the available evidence does not warrant inclusion in the data set at all, and any causal inference is unacceptable.

Threats to validity come in many forms, but confounding is a particularly dangerous one. In claims that substances such as diesel fume or crystalline silica cause lung cancer, confounding is a huge problem. The proponents of these claims suggest relative risks in the range of 1.1 to 1.6 for such substances, but tobacco smoking results in relative risks in excess of 20, and some claim that passive smoking at home or in the workplace results in relative risks of the same magnitude as the risk ratios claimed for diesel particulate or silica. Furthermore, the studies behind these claims frequently involve exposures to other known or suspected lung carcinogens, such as arsenic, radon, dietary factors, and asbestos, among others.
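The arithmetic of confounding shows why these numbers matter. When a confounder multiplies baseline risk by RRc, and its prevalence is p1 among the exposed and p0 among the unexposed, the crude risk ratio attributable to the confounder alone is (p1 × RRc + 1 − p1) / (p0 × RRc + 1 − p0). A minimal sketch in Python, with prevalence figures assumed purely for illustration rather than drawn from any actual study, shows that a smoking-strength confounder needs only a modest imbalance to generate the entire range of claimed relative risks:

    # Hypothetical illustration: how much apparent relative risk can a
    # confounder alone produce? All numbers are assumed, not drawn from
    # any actual study.

    def crude_rr(rr_conf, prev_exposed, prev_unexposed):
        # Crude risk ratio produced solely by a confounder; the baseline
        # risk cancels out of the ratio.
        risk_exposed = prev_exposed * rr_conf + (1 - prev_exposed)
        risk_unexposed = prev_unexposed * rr_conf + (1 - prev_unexposed)
        return risk_exposed / risk_unexposed

    # Smoking at RR 20, with exposed workers smoking somewhat more often.
    for p1, p0 in [(0.45, 0.40), (0.50, 0.40), (0.60, 0.40)]:
        print(f"smoking prevalence {p1:.2f} vs {p0:.2f}: "
              f"spurious RR = {crude_rr(20, p1, p0):.2f}")
    # prints spurious RRs of about 1.11, 1.22, and 1.47, with no
    # contribution at all from the studied exposure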

Definition of Confounding

Confounding results from the presence of a so-called confounding (or lurking) variable, helpfully defined in the chapter on statistics in the Reference Manual on Scientific Evidence:

confounding variable; confounder. A confounder is correlated with the independent variable and the dependent variable. An association between the dependent and independent variables in an observational study may not be causal, but may instead be due to confounding. See controlled experiment; observational study.”5

This definition suggests that the confounder need not be known to cause the dependent variable/outcome; the confounder need only be correlated with the outcome and with an independent variable, such as exposure. Furthermore, the confounder may operate in such a way as to increase or decrease the estimated relationship between dependent and independent variables. A confounder that is known to be present typically is referred to as an “actual” confounder, as opposed to one that may be at work, known as a “potential” confounder. Furthermore, even after exhausting known and potential confounders, studies may be affected by “residual” confounding, especially when the total array of causes of the outcome of interest is not understood, and these unknown causes are not randomly distributed between exposed and unexposed groups in epidemiologic studies. Litigation frequently involves diseases or outcomes with unknown causes, and so the reality of unidentified residual confounders is unavoidable.

In some instances, especially in studies of pharmaceutical adverse outcomes, there is the danger that the hypothesized outcome is also a feature of the underlying disease being treated. This phenomenon is known as confounding by indication, or as indication bias.6

Kaye and Freedman’s statistics chapter notes that confounding is a particularly important consideration when evaluating observational studies. In randomized clinical trials, one goal of randomization is to eliminate the role of bias and confounding through the random assignment of exposures:

2. Randomized controlled experiments

In randomized controlled experiments, investigators assign subjects to treatment or control groups at random. The groups are therefore likely to be comparable, except for the treatment. This minimizes the role of confounding.”7

In observational studies, confounding may completely invalidate an association. Kaye and Freedman give an example from the epidemiologic literature:

Confounding remains a problem to reckon with, even for the best observational research. For example, women with herpes are more likely to develop cervical cancer than other women. Some investigators concluded that herpes caused cancer: In other words, they thought the association was causal. Later research showed that the primary cause of cervical cancer was human papilloma virus (HPV). Herpes was a marker of sexual activity. Women who had multiple sexual partners were more likely to be exposed not only to herpes but also to HPV. The association between herpes and cervical cancer was due to other variables.”8

The problem identified as confounding by Freedman and Kaye cannot be dismissed as an issue that merely goes to the “weight” of a study; the confounding goes to the heart of the ability of the herpes studies to show an association that can be interpreted to be causal. Invalidity from confounding renders the studies “weightless” in any “weight of the evidence” approach. There are, of course, many ways to address confounding in studies: stratification, multivariate analyses, multiple regression, propensity scores, and the like. Consideration of the propriety and efficacy of these methods is a whole other level of analysis, which does not arise unless and until the threshold question of confounding is addressed.
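To make the stratification idea concrete, here is a minimal sketch, with invented counts, in which a crude risk ratio above 2 disappears once the data are stratified on the confounder and summarized with the standard Mantel-Haenszel estimator:

    # Invented data: exposure looks associated with disease until the
    # analysis is stratified on smoking. Each stratum holds
    # (exposed cases, exposed total, unexposed cases, unexposed total).
    strata = {
        "smokers":     (16, 80, 4, 20),  # risk 0.20 in both groups
        "non-smokers": (1, 20, 4, 80),   # risk 0.05 in both groups
    }

    # Crude analysis: collapse the strata.
    a = sum(s[0] for s in strata.values())   # exposed cases: 17
    n1 = sum(s[1] for s in strata.values())  # exposed total: 100
    b = sum(s[2] for s in strata.values())   # unexposed cases: 8
    n0 = sum(s[3] for s in strata.values())  # unexposed total: 100
    print(f"crude RR = {(a / n1) / (b / n0):.2f}")  # 2.12

    # Mantel-Haenszel summary risk ratio across the strata.
    num = sum(ai * n0i / (n1i + n0i) for ai, n1i, bi, n0i in strata.values())
    den = sum(bi * n1i / (n1i + n0i) for ai, n1i, bi, n0i in strata.values())
    print(f"Mantel-Haenszel RR = {num / den:.2f}")  # 1.00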

Reference Manual on Scientific Evidence

The epidemiology chapter of the Second Edition of the Manual described ruling out confounding as an obligation of the expert witness who chooses to rely upon a study.9 Although the same chapter in the Third Edition occasionally waffles, its authors come down on the side of describing confounding as a threat to validity, which must be ruled out before a study can be relied upon. In one place, the authors indicate that “care” is required, and that analysis for random error, confounding, and bias “should be conducted”:

Although relative risk is a straightforward concept, care must be taken in interpreting it. Whenever an association is uncovered, further analysis should be conducted to assess whether the association is real or a result of sampling error, confounding, or bias. These same sources of error may mask a true association, resulting in a study that erroneously finds no association.”10

Elsewhere in the same chapter, the authors note that “chance, bias, and confounding” must be looked at, but again they stop short of saying that these threats to validity must be eliminated:

Three general categories of phenomena can result in an association found in a study to be erroneous: chance, bias, and confounding. Before any inferences about causation are drawn from a study, the possibility of these phenomena must be examined.”11

                *  *  *  *  *  *  *  *

To make a judgment about causation, a knowledgeable expert must consider the possibility of confounding factors.”12

Eventually, however, the epidemiology chapter takes a stand, and an important one:

When researchers find an association between an agent and a disease, it is critical to determine whether the association is causal or the result of confounding.”13

Mandatory, Not Precatory

The better reasoned cases decided under Federal Rule of Evidence 702, and state-court analogues, follow the Reference Manual in making clear that confounding factors must be carefully addressed and eliminated. Failure to rule out the role of confounding renders a conclusion of causation, reached in reliance upon confounded studies, invalid.14

The inescapable mandate of Rules 702 and 703 is to require judges to evaluate the bases of a challenged expert witness’s opinion. Threats to internal validity, such as confounding, may make reliance upon any given study, or an entire set of studies, unreasonable, thus implicating Rule 703. Importantly, stacking up more invalid studies does not overcome the problem; a heap of evidence incompetent to show anything still shows nothing.

Pre-Daubert

Before the Supreme Court decided Daubert, few federal or state courts were willing to roll up their sleeves to evaluate the internal validity of relied-upon epidemiologic studies. Issues of bias and confounding were typically dismissed by courts as going to “weight, not admissibility.”

Judge Weinstein’s handling of the Agent Orange litigation, in the mid-1980s, marked a milestone in judicial sophistication and willingness to think critically about the evidence that was being funneled into the courtroom.15 The Bendectin litigation also was an important proving ground in which the defendant pushed courts to keep their eyes and minds open to issues of random error, bias, and confounding, when evaluating scientific evidence, on both pre-trial and on post-trial motions.16

Post-Daubert

When the United States Supreme Court addressed the admissibility of plaintiffs’ expert witnesses in Daubert, its principal focus was on the continuing applicability of the so-called Frye rule after the enactment of the Federal Rules of Evidence. The Court left the details of applying the newly clarified “Daubert” standard to the facts of the case to the intermediate appellate court, on remand. The Ninth Circuit, upon reconsidering the case, re-affirmed the trial court’s previous grant of summary judgment, on grounds of the plaintiffs’ failure to show specific causation.

A few years later, the Supreme Court itself engaged with the actual evidentiary record on appeal, in a lung cancer claim, which had been dismissed by the district court. Confounding was one among several validity issues in the studies relied upon by plaintiffs’ expert witnesses. The Court concluded that the plaintiffs’ expert witnesses’ bases did not individually or collectively support their conclusions of causation in a reliable way. With respect to one particular epidemiologic study, the Supreme Court observed that a study that looked at workers who “had been exposed to numerous potential carcinogens” could not show that PCBs cause lung cancer. General Elec. Co. v. Joiner, 522 U.S. 136, 146 (1997).17


1 An earlier version of this post can be found at “Sorting Out Confounded Research – Required by Rule 702” (June 10, 2012).

2 David Faigman, David Kaye, Michael Saks, and Joseph Sanders, “How Good is Good Enough? Expert Evidence Under Daubert and Kumho,” 50 Case Western Reserve L. Rev. 645, 661 n.55 (2000).

3 See, e.g., In re Welding Fume Prods. Liab. Litig., 2006 WL 4507859, *33 (N.D.Ohio 2006) (reducing all studies to one level, and treating all criticisms as though they rendered all studies invalid).

4 R. Peierls, “Wolfgang Ernst Pauli, 1900-1958,” 5 Biographical Memoirs of Fellows of the Royal Society 186 (1960) (quoting Wolfgang Pauli’s famous dismissal of a particularly bad physics paper).

5 David Kaye & David Freedman, “Reference Guide on Statistics,” in Reference Manual on Scientific Evidence 211, 285 (3d ed. 2011) [hereafter the RMSE3d].

6 See, e.g., R. Didham, et al., “Suicide and Self-Harm Following Prescription of SSRIs and Other Antidepressants: Confounding By Indication,” 60 Br. J. Clinical Pharmacol. 519 (2005).

7 RMSE3d at 220.

8 RMSE3d at 219 (internal citations omitted).

9 Reference Guide on Epidemiology at 369-70 (2d ed. 2000) (“Even if an association is present, epidemiologists must still determine whether the exposure causes the disease or if a confounding factor is wholly or partly responsible for the development of the outcome.”).

10 RMSE3d at 567-68 (internal citations omitted).

11 RMSE3d at 572.

12 RMSE3d at 591 (internal citations omitted).

13 RMSE3d at 591.

14 Similarly, an exonerative conclusion of no association might be vitiated by confounding with a protective factor, not accounted for in a multivariate analysis. Practically, such confounding seems less prevalent than confounding that generates a positive association.

15 In re “Agent Orange” Prod. Liab. Litig., 597 F. Supp. 740, 783 (E.D.N.Y. 1984) (noting that confounding had not been sufficiently addressed in a study of U.S. servicemen exposed to Agent Orange), aff’d, 818 F.2d 145 (2d Cir. 1987) (approving district court’s analysis), cert. denied sub nom. Pinkney v. Dow Chemical Co., 484 U.S. 1004 (1988).

16 Brock v. Merrell Dow Pharms., Inc., 874 F.2d 307, 311, modified on reh’g, 884 F.2d 166 (5th Cir. 1989) (noting that “[o]ne difficulty with epidemiologic studies is that often several factors can cause the same disease.”).

17 The Court’s discussion related to the reliance of plaintiffs’ expert witnesses upon, among other studies, Kuratsune, Nakamura, Ikeda, & Hirohata, “Analysis of Deaths Seen Among Patients with Yusho – A Preliminary Report,” 16 Chemosphere 2085 (1987).

Ruling Out Bias & Confounding is Necessary to Evaluate Expert Witness Causation Opinions

October 29th, 2018

In 2000, Congress amended the Federal Rules of Evidence to clarify, among other things, that Rule 702 had grown past the Supreme Court’s tentative, preliminary statement in Daubert, to include over a decade and a half of further judicial experience and scholarly comment. One point of clarification in the 2000 amendments, carried forward since, was that expert witness testimony is admissible only if “the testimony is based on sufficient facts or data.” Rule 702(b). In other words, an expert witness’s opinions could fail the legal requirement of reliability and validity by lacking sufficient facts or data.

The American Law Institute (ALI), in its 2010 revision to The Restatement of Torts, purported to address the nature and quantum of evidence for causation in so-called toxic tort cases as a matter of substantive law only, without addressing admissibility of expert witness opinion testimony, by noting that the Restatement did “not address any other requirements for the admissibility of an expert witness’s testimony, including qualifications, expertise, investigation, methodology, or reasoning.” Restatement (Third) of Torts: Liability for Physical and Emotional Harm § 28, cmt. E (2010). The qualifying language seems to have come from a motion advanced by ALI member Larry S. Stewart.

The Restatement, however, was not faithful to its own claim; nor could it be. Rule 702(b) made sufficiency an explicit part of the admissibility calculus in 2000. The ALI should have known better than to claim that its Restatement would not delve, and had not wandered, into the area of expert witness admissibility. The strategic goal for ignoring a key part of Rule 702 seems to have been to redefine expert witness reliability and validity as a “sufficiency” or “weight of the evidence” question, which the trial court would be required to leave to the finder of fact (usually a lay jury) to resolve. The Restatement’s pretense of avoiding the admissibility of expert witness opinion turns on the incorrect assumption that sufficiency plays no role in judicial gatekeeping of opinion testimony.

At the time of the release of the Restatement (Third) of Torts: Liability for Physical and Emotional Harm, one of its Reporters, Michael D. Green, published an article in Trial, the glossy journal of the Association of Trial Lawyers of America (now known by the self-congratulatory name of the American Association for Justice), the trade organization for the litigation industry in the United States. Professor Green’s co-author was Larry S. Stewart, a former president of the plaintiffs’ lawyers’ group, and the ALI member who pressed the motion that led to the comment e language quoted above. Their article indecorously touted the then-new Restatement as a toolbox for plaintiffs’ lawyers.1

According to Green and Stewart, “Section 28, comment c [of the Restatement], seeks to clear the air.” Green at 46. These authors suggest that the Restatement sought to avoid “bright-line rules,” by recognizing that causal inference is a

matter of informed judgment, not scientific certainty; scientific analysis is informed by numerous factors (commonly known as the Hill criteria); and, in some cases, reasonable scientists can come to differing conclusions.”

Id.

There are several curious aspects to these pronouncements. First, the authors concede that the comment e caveat was violated, because the Hill criteria certainly involve the causation expert witness’s methodology and reasoning. Second, the authors’ claim to have avoided “bright-line” rules is muddled by their attempt to bifurcate “informed judgment” from “scientific certainty.” The latter phrase, “scientific certainty,” is not a requirement in science or the law, which makes the comparison with informed judgment confusing. Understandably, Green and Stewart wished to note that in some cases scientists could reasonably come to different conclusions about causation from a given data set, but their silence about the many cases in which scientists, outside the courtroom, do not reach the causal conclusion contended for by party-advocate expert witnesses is telling, given the obvious pro-litigation bias of their audience.

Perhaps the most problematic aspect of the authors’ approach to causal analysis is their reductionist statement that “scientific analysis is informed by numerous factors (commonly known as the Hill criteria).” The nine Hill criteria, to be sure, are important, but they follow an assessment of whether the prerequisites for the criteria have been met,2 namely an “association between two variables, perfectly clear-cut and beyond what we would care to attribute to the play of chance.”3

The problematic aspects of this litigation-industry magazine article raise the question whether the Restatement itself similarly provides erroneous guidance. The relevant discussion occurs in Chapter 5, on “Factual Cause,” at § 28, comment c(3), “General Causation.” At one place, the comment seems to treat the Hill criteria as the whole of the relevant consideration:

Observational group studies are subject to a variety of errors — sampling error, bias, and confounding — and may, as a result, find associations that are spurious and not causal. Only after an evaluative judgment, based on the Hill criteria, that the association is likely to be causal rather than spurious, is a study valid evidence of general causation and specific causation.”

Restatement at 449b.

This passage, like the Green and Stewart article, appears to treat the Hill criteria as the be-all and end-all of the evaluative judgment, which leaves out the need to assess and eliminate “sampling error, bias, and confounding” before proceeding to measure the available evidence against the Hill criteria. The first sentence, however, does suggest that addressing sampling error, bias, and confounding is part of causal inference, at least if spurious associations are to be avoided. Indeed, earlier in comment c, the reporters describe the examination of whether an association is explained by random error or bias as scientifically required:

when epidemiology finds an association, the observational (rather than experimental) nature of these studies requires an examination of whether the association is truly causal or spurious and due to random error or deficiencies in the study (bias).”

Restatement at 440b (emphasis added). This crucial explanation was omitted from the Green and Stewart article.

An earlier draft of comment c offered the following observation:

Epidemiologists use statistical methods to estimate the range of error that sampling error could produce; assessing the existence and impact of biases and uncorrected confounding is usually qualitative. Whether an inference of causation based on an association is appropriate is a matter of informed judgment, not scientific inquiry, as is a judgment whether a study that finds no association is exonerative or inconclusive.”

Fortunately, this observation was removed in the drafting process. The reason for the deletion is unclear, but its removal was well advised. The stricken language was at best misleading in suggesting that the assessment of bias and confounding is “usually qualitative.” Elimination of confounding is the goal of multivariate analyses such as logistic regression and propensity-score matching models, among other approaches, all of which are quantitative methods. Assessing bias quantitatively has been the subject of book-length treatment in the field of epidemiology.4
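For readers who want to see what such a quantitative treatment looks like, the following sketch simulates data in which smoking drives both exposure and disease, and then fits logistic regressions with and without the confounder; the crude association collapses toward the null once the confounder enters the model. The data and parameters are invented for illustration, and the code assumes the numpy and statsmodels libraries:

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(42)
    n = 50_000

    smoker = rng.binomial(1, 0.5, n)                    # the confounder
    exposure = rng.binomial(1, 0.2 + 0.4 * smoker, n)   # correlated with smoking
    disease = rng.binomial(1, 0.01 + 0.09 * smoker, n)  # driven by smoking only

    # Crude model: exposure appears "associated" with disease.
    crude = sm.Logit(disease, sm.add_constant(exposure)).fit(disp=False)

    # Adjusted model: adding the confounder removes the association.
    X = sm.add_constant(np.column_stack([exposure, smoker]))
    adjusted = sm.Logit(disease, X).fit(disp=False)

    print("crude OR:   ", np.exp(crude.params[1]))     # well above 1
    print("adjusted OR:", np.exp(adjusted.params[1]))  # close to 1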

In comment c as published, the Reporters acknowledged that confounding can be identified and analyzed:

The observational nature of epidemiologic studies virtually always results in concerns about the results being skewed by biases or unidentified confounders. * * * Sometimes potential confounders can be identified and data gathered that permits analysis of whether confounding exists. Unidentified confounders, however, cannot be analyzed. Often potential biases can be identified, but assessing the extent to which they affected the study’s outcome is problematical. * * * Thus, interpreting the results of epidemiologic studies requires informed judgment and is subject to uncertainty. Unfortunately, contending adversarial experts, because of the pressures of the adversarial system, rarely explore this uncertainty and provide the best, objective assessment of the scientific evidence.”

Restatement at 448a.

It would be a very poorly done epidemiologic study that failed to identify and analyze confounding variables in a multivariate analysis. The key question will be whether the authors have done this analysis with due care, and with all the appropriate covariates to address confounding thoroughly. The Restatement comment acknowledges that expert witnesses in our courtrooms often fail to explore the uncertainty created by bias and confounding. Given the pressure on those witnesses claiming causal associations, we might well expect that this failure will not be equally distributed among all expert witnesses.


1 Michael D. Green & Larry S. Stewart, “The New Restatement’s Top 10 Tort Tools,” Trial 44 (April 2010) [cited as Green]. See “The Top Reason that the ALI’s Restatement of Torts Should Steer Clear of Partisan Conflicts.”

2 See Frank C. Woodside, III & Allison G. Davis, “The Bradford Hill Criteria: The Forgotten Predicate,” 35 Thomas Jefferson L. Rev. 103 (2013); see also “Woodside & Davis on the Bradford Hill Considerations” (Aug. 23, 2013).

3 Austin Bradford Hill, “The Environment and Disease: Association or Causation?” 58 Proc. Royal Soc’y Med. 295 (1965).

4 See, e.g., Timothy L. Lash, Matthew P. Fox, and Aliza K. Fink, Applying Quantitative Bias Analysis to Epidemiologic Data (2009).

Daubert’s Silver Anniversary – Retrospective View of Its Friends and Enemies

October 21st, 2018

Science is inherently controversial because when done properly it has no respect for power or political correctness or dogma or entrenched superstition. We should thus not be surprised that the scientific process has many detractors in houses of worship, houses of representatives, and houses of the litigation industry. And we have more than a few “Dred Scott” decisions, in which courts have held that science has no criteria of validity that they are bound to follow.

To be sure, many judges have recognized a different danger in scientific opinion testimony, namely, its ability to overwhelm the analytical faculties of lay jurors. Fact-finders may view scientific expert witness opinion testimony as having an overwhelming certainty and authority, which swamps their ability to evaluate the testimony.1

One errant judicial strategy for dealing with judges’ own difficulty in evaluating scientific evidence was to invent a fictitious divide between scientific and legal burdens of proof:2

Petitioners demand sole reliance on scientific facts, on evidence that reputable scientific techniques certify as certain. Typically, a scientist will not so certify evidence unless the probability of error, by standard statistical measurement, is less than 5%. That is, scientific fact is at least 95% certain. Such certainty has never characterized the judicial or the administrative process. It may be that the ‘beyond a reasonable doubt’ standard of criminal law demands 95% certainty. Cf. McGill v. United States, 121 U.S.App. D.C. 179, 185 n.6, 348 F.2d 791, 797 n.6 (1965). But the standard of ordinary civil litigation, a preponderance of the evidence, demands only 51% certainty. A jury may weigh conflicting evidence and certify as adjudicative (although not scientific) fact that which it believes is more likely than not.”

By falsely elevating the scientific standard, judges free themselves to decide expeditiously and without constraint, because they are operating at a much lower epistemic level.
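The error is easy to quantify. A five percent significance level caps the rate of false positive results when there is no true effect; it says nothing about the probability that a particular significant finding is correct, which depends on statistical power and on how often the hypotheses being tested are true. A back-of-the-envelope sketch in Python, with all three inputs assumed purely for illustration:

    # Why "p < 0.05" is not "95% certain": the probability that a
    # significant finding is true depends on power and on the base rate
    # of true hypotheses. All three inputs are assumed for illustration.
    alpha, power, prior_true = 0.05, 0.50, 0.10

    true_positives = prior_true * power
    false_positives = (1 - prior_true) * alpha
    ppv = true_positives / (true_positives + false_positives)
    print(f"P(claim true | significant result) = {ppv:.2f}")  # about 0.53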

Another response, advocated by “the Lobby,” scientists in service to the litigation industry, has been to deprecate gatekeeping altogether. Perhaps the most brazen anti-science response to the Supreme Court’s decision in Daubert was advanced by David Michaels and his Project on Scientific Knowledge and Public Policy (SKAPP). In its heyday, SKAPP organized meetings and conferences, and cranked out anti-gatekeeping propaganda to the delight of the litigation industry,3 while obfuscating and equivocating about the source of its funding (from that same litigation industry).4

SKAPP principal David Michaels was also behind the efforts of the American Public Health Association (APHA) to criticize the judicial move to scientific standards in gatekeeping. In 2004, Michaels and fellow litigation industrialists prevailed upon the APHA to adopt a policy statement that attacked evidence-based science and data transparency in the form of “Policy Number: 2004-11 Threats to Public Health Science.”5

SKAPP appears to have gone the way of the dodo, although the defunct organization still has a Wikipedia page with the misleading claim that a federal court had funded its operation, and the old link for this sketchy outfit now redirects to the website of the Union of Concerned Scientists. In 2009, David Michaels, fellow of the Collegium Ramazzini and formerly the driving force of SKAPP, became an Assistant Secretary of Labor and the OSHA administrator in the Obama administration.6

With the end of his regulatory work, Michaels is now back in the litigation saddle. In April 2018, Michaels participated in a ruse in which he allowed himself to be “subpoenaed” by Mark Lanier, to give testimony in a case involving claims that personal talc use caused ovarian cancers.7 Michaels had no real subject-matter expertise, but he readily made himself available so that Mr. Lanier could inject Michaels’ favorite trope of “doubt is their product” into his trial.

Against this backdrop of special pleading from the litigation industry’s go-to expert witnesses, it is helpful to revisit the Daubert decision, which is now 25 years old. The decision followed the grant of the writ of certiorari by the Supreme Court, full briefing by the parties on the merits, oral argument, and twenty-two amicus briefs. Not all briefs are created equal, and this inequality is especially true of amicus briefs, for which the quality of argument, and the reputation of the interested third parties, can vary greatly. Given the shrill ideological ranting of SKAPP and the APHA, we might find some interest in what two leading scientific organizations, the American Association for the Advancement of Science (AAAS) and the National Academy of Sciences (NAS), contributed to the debate over the proper judicial role in policing expert witness opinion testimony.

The Amicus Brief of the AAAS and the NAS, filed in Daubert v. Merrell Dow Pharmaceuticals, Inc., U.S. Supreme Court No. 92-102 (Jan. 19, 1993), was submitted by Richard A. Meserve and Lars Noah, of Covington & Burling, and by Bert Black, of Weinberg & Green. Unfortunately, the brief does not appear to be available on Westlaw, but it was republished shortly after filing, at 12 Biotechnology Law Report 198 (No. 2, March-April 1993) [all citations below are to this republication].

The amici were and are well known to the scientific community. The AAAS is a not-for-profit scientific society, which publishes the prestigious journal Science, and engages in other activities to advance public understanding of science. The NAS was created by congressional charter in the administration of Abraham Lincoln, to examine scientific, medical, and technological issues of national significance. Brief at 208. Meserve, counsel of record for these amici curiae, is a member of the National Academy, a president emeritus of the Carnegie Institution for Science, and a former chair of the U.S. Nuclear Regulatory Commission. He received his doctorate in applied physics from Stanford University, and his law degree from Harvard. Noah is now a professor of law at the University of Florida, and Black is still a practicing lawyer, ironically for the litigation industry.

The brief of the AAAS and the NAS did not take a position on the merits of whether Bendectin can cause birth defects, but it had a great deal to say about the scientific process, and about the need for courts to intervene to ensure that expert witness opinion testimony was developed and delivered with appropriate methodological rigor.

A Clear and Present Danger

The amici, AAAS and NAS, clearly recognized a threat to the integrity of scientific fact-finding in the regime of uncontrolled and unregulated expert witness testimony. The amici cited the notorious case of Wells v. Ortho Pharmaceutical Corp.8, which had provoked an outcry from the scientific community, and a particularly scathing article by two scientists from the National Institute of Child Health and Human Development.9

The amici also cited several judicial decisions on the need for robust gatekeeping, including the observations of Judge Jack Weinstein that

[t]he uncertainty of the evidence in [toxic tort] cases, dependent as it is on speculative scientific hypotheses and epidemiological studies, creates a special need for robust screening of experts and gatekeeping under Rules 403 and 703 by the court.”10

The AAAS and the NAS saw the “obvious danger that research results generated solely for litigation may be skewed.” Brief at 217 & n.11.11 The AAAS and the NAS thus saw a real, substantial threat in countenancing expert witnesses who proffered “putatively scientific evidence that does not in fact reflect the application of scientific principles.” Brief at 208. The Supreme Court probably did not need the AAAS and the NAS to tell them that “[s]uch evidence can lead to incorrect decisions and can serve to discredit the contributions of science,” id., but it may have helped ensure that the Court articulated meaningful guidelines for trial judges to police their courtrooms against scientific conclusions that were not reached in accordance with scientific principles. The amici saw and stated that

[t]he unique persuasive power of scientific evidence and its inherent limitations requires that courts engage special efforts to ensure that scientific evidence is valid and reliable before it is admitted. In performing that task, courts can look to the same criteria that scientists themselves use to evaluate scientific claims.”

Brief at 212.

It may seem quaint to the post-modernists at the APHA, but the AAAS and the NAS were actually concerned “to avoid outcomes that are at odds with reality,” and they were willing to urge that “courts must exercise special care to assure that such evidence is based on valid and reliable scientific methodologies.” Brief at 209 (emphasis added). The amici also urged caution in allowing opinion testimony that conflicted with existing learning, and which had not been presented to the scientific community for evaluation. Brief at 218-19. In the words of the amici:

Courts should admit scientific evidence only if it conforms to scientific standards and is derived from methods that are generally accepted by the scientific community as valid and reliable. Such a test promotes sound judicial decisionmaking by providing workable means for screening and assessing the quality of scientific expert testimony in advance of trial.”

Brief at 233. After all, part of the scientific process itself is weeding out false ideas.

Authority for Judicial Control

The AAAS and NAS and their lawyers gave their full support to Merrell Dow’s position that “courts have the authority and the responsibility to exclude expert testimony that is based upon unreliable or misapplied methodologies.” Brief at 209. The Federal Rules of Evidence, and Rules 702, 703, and 403 in particular, gave trial courts “ample authority for empowering courts to serve as gatekeepers.” Brief at 230. The amici argued what ultimately would become the law: in the spirit and text of the Federal Rules, “[t]hreshold determinations concerning the admissibility of scientific evidence are necessary to ensure accurate decisions and to avoid unnecessary expenditures of judicial resources on collateral issues.” Brief at 210. The AAAS and NAS further recommended that:

Determinations concerning the admissibility of expert testimony based on scientific evidence should be made by a judge in advance of trial. Such judicial control is explicitly called for under Rule 104(a) of the Federal Rules of Evidence, and threshold admissibility determinations by a judge serve several important functions, including simplification of issues at trial (thereby increasing the speed of trial), improvement in the consistency and predictability of results, and clarification of the issues for purposes of appeal. Indeed, it is precisely because a judge can evaluate the evidence in a focused and careful manner, free from the prejudices that might infect deliberations by a jury, that the determination should be made as a threshold matter.”

Brief at 228 (internal citations omitted).

Criteria of Validity

The AAAS and NAS did not shrink from the obvious implications of their position. They insisted that “[i]n evaluating scientific evidence, courts should consider the same factors that scientists themselves employ to assess the validity and reliability of scientific assertions.” Brief at 209, 210. The amici may have held an aspirational view of the ability of judges, but they expressed the optimistic view that “judges can understand the fundamental characteristics that separate good science from bad.” Brief at 210. Under the gatekeeping regime contemplated by the AAAS and the NAS, judges would have to think and analyze, rather than delegate to juries. In carrying out their task, judges would not be starting with a blank slate:

When faced with disputes about expert scientific testimony, judges should make full use of the scientific community’s criteria and quality-control mechanisms. To be admissible, scientific evidence should conform to scientific standards and should be based on methods that are generally accepted by the scientific community as valid and reliable.”

Brief at 210. Questions such as whether an hypothesis has survived repeated severe, rigorous tests, whether the hypothesis is consistent with other existing scientific theories, and whether the results of the tests have been presented to the scientific community need to be answered affirmatively before juries are permitted to weigh in with their verdicts. Brief at 216, 217.

The AAAS and the NAS acknowledged implicitly and explicitly that courtrooms were not good places to trot out novel hypotheses, which lacked severe testing and sufficient evidentiary support. New theories must survive repeated testing and often undergo substantial refinements before they can be accepted in the scientific community. The scientific method requires nothing less. Brief at 219. These organizational amici also acknowledged that there will occasionally be “truly revolutionary advances” in the form of an hypothesis not fully tested. The danger of injecting bad science into broader decisions (such as encouraging meritless litigation, or the abandonment of useful products) should cause courts to view unestablished hypotheses with “heightened skepticism pending further testing and review.” Brief at 229. In other words, some hypotheses simply have not matured to the point at which they can support tort or other litigation.

The AAAS and the NAS contemplated that the gatekeeping process could and should incorporate the entire apparatus of scientific validity determinations into Rule 104(a) adjudications. Nowhere in their remarkable amicus brief do they suggest that if there is some evidence (however weak) favoring a causal claim, with nothing yet available to weigh against it, expert witnesses can declare that they have the “weight of the evidence” on their side, and gain a ticket to the courthouse door. The scientists at SKAPP, or now those at the Union of Concerned Scientists, prefer to brand gatekeeping as a trick to sell “doubt.” What they fail to realize is that their propaganda threatens both universalism and organized skepticism, two of the four scientific institutional norms, described by sociologist of science Robert K. Merton.12


1 United States v. Brown, 557 F.2d 541, 556 (6th Cir. 1977) (“Because of its apparent objectivity, an opinion that claims a scientific basis is apt to carry undue weight with the trier of fact”); United States v. Addison, 498 F.2d 741, 744 (D.C. Cir. 1974) (“scientific proof may in some instances assume a posture of mystic infallibility in the eyes of a jury of laymen”). Some people say that our current political morass reflects poorly on the ability of United States citizens to assess and evaluate evidence and claims to the truth.

2 See, e.g., Ethyl Corp. v. EPA, 541 F.2d 1, 28 n.58 (D.C. Cir.), cert. denied, 426 U.S. 941 (1976). See also “Rhetorical Strategy in Characterizing Scientific Burdens of Proof” (Nov. 15, 2014).

3 See, e.g., Project on Scientific Knowledge and Public Policy, “Daubert: The Most Influential Supreme Court Ruling You’ve Never Heard Of” (2003).

4 See, e.g., “SKAPP A LOT” (April 30, 2010); “Manufacturing Certainty” (Oct. 25, 2011); “David Michaels’ Public Relations Problem” (Dec. 2, 2011); “Conflicted Public Interest Groups” (Nov. 3, 2013).

7 Notes of Testimony by David Michaels, in Ingham v. Johnson & Johnson, Case No. 1522-CC10417-01, St. Louis Circuit Ct, Missouri (April 17, 2018).

8 788 F.2d 741, 744-45 (11th Cir.), cert. denied, 479 U.S. 950 (1986). Remarkably, consultants for the litigation industry have continued to try to “rehabilitate” the Wells decision. See “Carl Cranor’s Conflicted Jeremiad Against Daubert” (Sept. 23, 2018).

9 James L. Mills & Duane Alexander, “Teratogens and Litogens,” 315 New Engl. J. Med. 1234, 1235 (1986).

10 Brief at n. 31, citing In re Agent Orange Product Liab. Litig., 611 F. Supp. 1267, 1269 (E.D.N.Y. 1985), aff’d, 818 F.2d 187 (2d Cir. 1987), cert. denied, 487 U.S. 1234 (1988).

11 citing among other cases, Perry v. United States, 755 F.2d 888, 892 (11th Cir. 1985) (“A scientist who has a formed opinion as to the answer he is going to find before he even begins his research may be less objective than he needs to be in order to produce reliable scientific results.”).

12 Robert K. Merton, “The Normative Structure of Science,” in Robert K. Merton, The Sociology of Science: Theoretical and Empirical Investigations, chap. 13, at 267, 270 (1973).

The Hazard of Composite End Points – More Lumpenepidemiology in the Courts

October 20th, 2018

One of the challenges of epidemiologic research is selecting the right outcome of interest to study. What seems like a simple and obvious choice can often be the most complicated aspect of the design of clinical trials or studies.1 Lurking in this choice of end point is a particular threat to validity in the use of composite end points, when the real outcome of interest is one constituent among multiple end points aggregated into the composite. There may, for instance, be strong evidence in favor of one of the constituents of the composite, but using the composite end point results to support a causal claim for a different constituent begs the question that needs to be answered, whether in science or in law.

The dangers of extrapolating from one disease outcome to another are well recognized in the medical literature. Remarkably, however, the problem received no meaningful discussion in the Reference Manual on Scientific Evidence (3d ed. 2011). The handbook, designed to help judges decide threshold issues of admissibility of expert witness opinion testimony, discusses extrapolation from sample to population, from in vitro to in vivo, from one species to another, from high to low dose, and from long to short duration of exposure. The Manual, however, has no discussion of “lumping,” or of the appropriate (and inappropriate) use of composite or combined end points.

Composite End Points

Composite end points are typically defined, perhaps circularly, as a single group of health outcomes, which group is made up of constituent or single end points. Curtis Meinert defined a composite outcome as “an event that is considered to have occurred if any of several different events or outcomes is observed.”2 Similarly, Montori defined composite end points as “outcomes that capture the number of patients experiencing one or more of several adverse events.”3 Composite end points are also sometimes referred to as combined or aggregate end points.

Many composite end points are clearly defined for a clinical trial, and the component end points are specified. In some instances, the composite nature of an outcome may be subtle or be glossed over by the study’s authors. In the realm of cardiovascular studies, for example, investigators may look at stroke as a single endpoint, without acknowledging that there are important clinical and pathophysiological differences between ischemic strokes and hemorrhagic strokes (intracerebral or subarachnoid). The Fletchers’ textbook4 on clinical epidemiology gives the example:

In a study of cardiovascular disease, for example, the primary outcomes might be the occurrence of either fatal coronary heart disease or non-fatal myocardial infarction. Composite outcomes are often used when the individual elements share a common cause and treatment. Because they comprise more outcome events than the component outcomes alone, they are more likely to show a statistical effect.”

Utility of Composite End Points

The quest for statistical “power” is often cited as a basis for using composite end points. Reduction in the number of “events,” such as myocardial infarction (MI), through improvements in medical care has led to decreased rates of MI in studies and clinical trials. These low event rates have caused power problems for clinical trialists, who have responded by turning to composite end points to capture more events. Composite end points permit smaller sample sizes and shorter follow-up times without sacrificing power, that is, the ability to detect a statistically significant increase of a prespecified size at a given Type I error rate. Increasing study power, while reducing sample size or observation time, is perhaps the most frequently cited rationale for using composite end points.
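A simple calculation illustrates the incentive. In the sketch below, with event rates and sample size assumed purely for illustration (and using the statsmodels library), the same relative effect is far easier to detect on a composite end point simply because the composite accumulates more events:

    # Hypothetical illustration of the power rationale for composites:
    # the same relative effect (RR 1.3) at a higher event rate yields
    # much greater power at a fixed sample size.
    from statsmodels.stats.power import NormalIndPower
    from statsmodels.stats.proportion import proportion_effectsize

    n_per_arm = 2000
    analysis = NormalIndPower()

    for label, p_treated, p_control in [
        ("single component end point", 0.026, 0.020),
        ("composite end point       ", 0.104, 0.080),
    ]:
        h = proportion_effectsize(p_treated, p_control)  # Cohen's h
        power = analysis.solve_power(effect_size=h, nobs1=n_per_arm, alpha=0.05)
        print(f"{label}: power = {power:.2f}")  # roughly 0.25 vs 0.75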

Competing Risks

Another reason sometimes offered in support of using composite end points is that composites provide a strategy to avoid the problem of competing risks.5 Death (from any cause) is sometimes added to a distinct clinical morbidity because patients who are taken out of the trial by death are “unavailable” to experience the morbidity outcome.

Multiple Testing

By aggregating several individual end points into a single pre-specified outcome, trialists can avoid corrections for multiple testing. Trials that seek data on multiple outcomes, or on multiple subgroups, inevitably raise concerns about the appropriate choice of the significance level (alpha) used to determine whether to reject the null hypothesis. According to some authors, “[c]omposite endpoints alleviate multiplicity concerns”:

If designated a priori as the primary outcome, the composite obviates the multiple comparisons associated with testing of the separate components. Moreover, composite outcomes usually lead to high event rates thereby increasing power or reducing sample size requirements. Not surprisingly, investigators frequently use composite endpoints.”6

Other authors have similarly acknowledged that the need to avoid false positive results from multiple testing is an important rationale for composite end points:

Because the likelihood of observing a statistically significant result by chance alone increases with the number of tests, it is important to restrict the number of tests undertaken and limit the type 1 error to preserve the overall error rate for the trial.”7
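The underlying arithmetic is straightforward: with several independent end points each tested at an alpha of 0.05, the chance of at least one false positive result grows quickly, which is what a single pre-specified composite (or a Bonferroni-style correction) is meant to control:

    # Familywise error rate for k independent tests at alpha = 0.05.
    alpha = 0.05
    for k in (1, 2, 5, 10):
        familywise = 1 - (1 - alpha) ** k
        print(f"{k} end points: P(at least one false positive) = {familywise:.2f}")
    # prints 0.05, 0.10, 0.23, 0.40; a single pre-specified composite
    # keeps one test at 0.05, while Bonferroni tests each at alpha / k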

Indecision about an Appropriate Single Outcome

The International Conference on Harmonization suggests that the inability to select a single outcome variable may lead to the adoption of a composite outcome:

If a single primary variable cannot be selected …, another useful strategy is to integrate or combine the multiple measurements into a single or composite variable.”8

The “indecision” rationale has also been criticized as “generally not a good reason to use a composite end point.”9

Validity of Composite End Points

The validity of composite end points depends upon methodological assumptions, which must be made at the time of study design and protocol creation. After the data are collected and analyzed, the assumptions may or may not be supported. Among the assumptions that support the validity of using composites are:10

  • similarity in patient importance for the included component end points,

  • similarity in the size of association across the components, and

  • similarity in the number of events across the components.

The use of composite end points can sometimes be appropriate in the “first look” at a class of diseases or disorders, with the understanding that further research will sort out and refine the associated end point. Research into the causes of human birth defects, for instance, often starts out with a look at “all major malformations,” before focusing in on specific organ and tissue systems. To some extent, the legal system, in its gatekeeping function, has recognized the dangers and invalidity of lumping in the epidemiology of birth defects.11 The Frischhertz decision, for instance, clearly acknowledged that, given the clear evidence that different birth defects arise at different times, based upon interference with different embryological processes, “lumping” of end points was methodologically inappropriate. 2012 U.S. Dist. LEXIS 181507, at *8 (citing Chambers v. Exxon Corp., 81 F. Supp. 2d 661 (M.D. La. 2000), aff’d, 247 F.3d 240 (5th Cir. 2001) (unpublished)).

The Chambers decision involved a challenge to the causation opinion of frequent litigation-industry witness Peter Infante,12 who attempted to defend his opinion about benzene and chronic myelogenous leukemia (CML) based upon epidemiology of benzene and acute myelogenous leukemia (AML). Plaintiffs’ witnesses and counsel sought to evade the burden of producing evidence of a CML association by pointing to a study that reported “excess leukemias,” without specifying the relevant type. Chambers, 81 F. Supp. 2d at 664. The trial court, however, perspicaciously recognized the claimants’ failure to identify relevant evidence of the specific association needed to support the causal claim.

The Frischhertz and Chambers cases are hardly unique. Several state and federal courts have concurred in the context of cancer causation claims.13 In the context of birth defects litigation, the Public Affairs Committee of the Teratology Society has weighed in with strong guidance that counsels against extrapolation between different birth defects in litigation:

Determination of a causal relationship between a chemical and an outcome is specific to the outcome at issue. If an expert witness believes that a chemical causes malformation A, this belief is not evidence that the chemical causes malformation B, unless malformation B can be shown to result from malformation A. In the same sense, causation of one kind of reproductive adverse effect, such as infertility or miscarriage, is not proof of causation of a different kind of adverse effect, such as malformation.”14

The threat to validity in attributing a suggested risk for a composite end point to all included component end points is not, unfortunately, recognized by all courts. The trial court, in Ruff v. Ensign-Bickford Industries, Inc.,15 permitted plaintiffs’ expert witness to reanalyze a study by grouping together two previously distinct cancer outcomes to generate a statistically significant result. The result in Ruff is disappointing, but not uncommon. It is also surprising, considering the guidance provided by the American Law Institute’s Restatement:

Even when satisfactory evidence of general causation exists, such evidence generally supports proof of causation only for a specific disease. The vast majority of toxic agents cause a single disease or a series of biologically-related diseases. (Of course, many different toxic agents may be combined in a single product, such as cigarettes.) When biological-mechanism evidence is available, it may permit an inference that a toxic agent caused a related disease. Otherwise, proof that an agent causes one disease is generally not probative of its capacity to cause other unrelated diseases. Thus, while there is substantial scientific evidence that asbestos causes lung cancer and mesothelioma, whether asbestos causes other cancers would require independent proof. Courts refusing to permit use of scientific studies that support general causation for diseases other than the one from which the plaintiff suffers unless there is evidence showing a common biological mechanism include Christophersen v. Allied-Signal Corp., 939 F.2d 1106, 1115-1116 (5th Cir. 1991) (applying Texas law) (epidemiologic connection between heavy-metal agents and lung cancer cannot be used as evidence that same agents caused colon cancer); Cavallo v. Star Enters., 892 F. Supp. 756 (E.D. Va. 1995), aff’d in part and rev’d in part, 100 F.3d 1150 (4th Cir. 1996); Boyles v. Am. Cyanamid Co., 796 F. Supp. 704 (E.D.N.Y. 1992). In Austin v. Kerr-McGee Ref. Corp., 25 S.W.3d 280, 290 (Tex. Ct. App. 2000), the plaintiff sought to rely on studies showing that benzene caused one type of leukemia to prove that benzene caused a different type of leukemia in her decedent. Quite sensibly, the court insisted that before plaintiff could do so, she would have to submit evidence that both types of leukemia had a common biological mechanism of development.”

Restatement (Third) of Torts § 28 cmt. c, at 406 (2010). Notwithstanding some of the Restatement’s excesses on other issues, the guidance on composites seems sane and consonant with the scientific literature.

Role of Mechanism in Justifying Composite End Points

A composite end point may make sense when the individual end points are biologically related, and the investigators can reasonably expect that the individual end points would be affected in the same direction, and approximately to the same extent:16

Confidence in a composite end point rests partly on a belief that similar reductions in relative risk apply to all the components. Investigators should therefore construct composite endpoints in which the biology would lead us to expect similar effects across components.”

The important point, missed by some investigators and many courts, is that the assumption of similar “effects” must be tested by examining the individual component end points, and especially the end point that is the harm claimed by plaintiffs in a given case.
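A small numerical sketch, with invented counts and equal-sized arms, shows the bait-and-switch at work: the composite shows an elevated risk ratio even though one component carries the entire signal and the other sits exactly at the null:

    # Invented trial counts (equal arms, no overlap between components),
    # so each risk ratio reduces to a ratio of event counts.
    events = {  # component: (treated cases, control cases)
        "myocardial infarction": (60, 30),
        "stroke":                (30, 30),
    }

    composite_treated = sum(t for t, c in events.values())
    composite_control = sum(c for t, c in events.values())
    print(f"composite RR = {composite_treated / composite_control:.2f}")  # 1.50

    for name, (t, c) in events.items():
        print(f"  {name}: RR = {t / c:.2f}")  # 2.00 for MI, 1.00 for stroke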

Methodological Issues

The acceptability of composite end points is often a delicate balance between the statistical power and efficiency gained and the reliability concerns raised by using the composite. As with any statistical or interpretative tool, the key questions turn on how the tool is used, and for what purpose. The reliability issues raised by the use of composites are likely to be highly contextual.

For instance, there is an important asymmetry between justifying the use of a composite for measuring efficacy and the use of the same composite for safety outcomes. A biological improvement in type 2 diabetes might be expected to lead to a reduction in all the macrovascular complications of that disease, but a medication for type 2 diabetes might have a very specific toxicity or drug interaction, which affects only one constituent end point among all macrovascular complications, such as myocardial infarction. The asymmetry between efficacy and safety outcomes is specifically addressed by cardiovascular epidemiologists in an important methodological paper:17

Varying definitions of composite end points, such as MACE, can lead to substantially different results and conclusions. Therefore, the term MACE, in particular, should not be used, and when composite study end points are desired, researchers should focus separately on safety and effectiveness outcomes, and construct separate composite end points to match these different clinical goals.”

There are many clear, published statements that caution consumers of medical studies against being misled by claims based upon composite end points. Several years ago, for example, the British Medical Journal published a paper with six methodological suggestions for consumers of studies, one of which deals explicitly with composite end points:18

“Guide to avoid being misled by biased presentation and interpretation of data

1. Read only the Methods and Results sections; bypass the Discussion section

2. Read the abstract reported in evidence based secondary publications

3. Beware faulty comparators

4. Beware composite endpoints

5. Beware small treatment effects

6. Beware subgroup analyses”

The paper elaborates on the problems that arise from the use of composite end points:19

Problems in the interpretation of these trials arise when composite end points include component outcomes to which patients attribute very different importance… .”

Problems may also arise when the most important end point occurs infrequently or when the apparent effect on component end points differs.”

When the more important outcomes occur infrequently, clinicians should focus on individual outcomes rather than on composite end points. Under these circumstances, inferences about the end points (which because they occur infrequently will have very wide confidence intervals) will be weak.”

Authors generally acknowledge that “[w]hen large variations exist between components the composite end point should be abandoned.”20

Methodological Issues Concerning Causal Inferences from Composite End Points to Individual End Points

Several authors have criticized pharmaceutical companies for using composite end points to “game” their trials. Composites allow smaller sample size, but they lend themselves to broader claims for outcomes included within the composite. The same criticism applies to attempts to infer that there is risk of an individual endpoint based upon a showing of harm in the composite endpoint.

“If a trial report specifies a composite endpoint, the components of the composite should be in the well-known pathophysiology of the disease. The researchers should interpret the composite endpoint in aggregate rather than as showing efficacy of the individual components. However, the components should be specified as secondary outcomes and reported beside the results of the primary analysis.”21

Virtually the entire field of epidemiology and clinical trial methodology has urged caution in inferring risk for a component end point from suggested risk in a composite end point:

“In summary, evaluating trials that use composite outcome requires scrutiny in regard to the underlying reasons for combining endpoints and its implications and has impact on medical decision-making (see below in Sect. 47.8). Composite endpoints are credible only when the components are of similar importance and the relative effects of the intervention are similar across components (Guyatt et al. 2008a).”22

Not only do important methodologists urge caution in the interpretation of composite end points,23 they emphasize a basic point of scientific (and legal) relevancy:

“[A] positive result for a composite outcome applies only to the cluster of events included in the composite and not to the individual components.”24

Even regular testifying expert witnesses for the litigation industry insist upon the “principle of full disclosure”:

“The analysis of the effect of therapy on the combined end point should be accompanied by a tabulation of the effect of the therapy for each of the component end points.”25

Gatekeepers in our judicial system need to be more vigilant against bait-and-switch inferences based upon composite end points. The quest for statistical power hardly justifies larding up an end point with irrelevant data points.


1 See, e.g., Milton Packer, “Unbelievable! Electrophysiologists Embrace ‘Alternative Facts’,” MedPage (May 16, 2018) (describing clinical trialists’ abandoning pre-specified intention-to-treat analysis).

2 Curtis Meinert, Clinical Trials Dictionary (Johns Hopkins Center for Clinical Trials 1996).

3 Victor M. Montori, et al., “Validity of composite end points in clinical trials,” 330 Brit. Med. J. 594, 596 (2005).

4 R. Fletcher & S. Fletcher, Clinical Epidemiology: The Essentials at 109 (4th ed. 2005).

5 Neaton, et al., “Key issues in end point selection for heart failure trials: composite end points,” 11 J. Cardiac Failure 567, 569a (2005).

6 Schulz & Grimes, “Multiplicity in randomized trials I: endpoints and treatments,” 365 Lancet 1591, 1593a (2005).

7 Freemantle & Calvert, “Composite and surrogate outcomes in randomized controlled trials,” 334 Brit. Med. J. 756, 756a–b (2007).

8 International Conference on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use, “ICH harmonized tripartite guideline: statistical principles for clinical trials,” 18 Stat. Med. 1905 (1999).

9 Neaton, et al., “Key issues in end point selection for heart failure trials: composite end points,” 11 J. Cardiac Failure 567, 569b (2005).

10 Montori, et al., “Validity of composite end points in clinical trials,” 330 Brit. Med. J. 594, 596, Summary Point No. 2 (2005).

11 See “Lumpenepidemiology” (Dec. 24, 2012), discussing Frischhertz v. SmithKline Beecham Corp., 2012 U.S. Dist. LEXIS 181507 (E.D. La. 2012). Frischhertz was decided in the same month that a New York City trial judge ruled Dr. Shira Kramer out of bounds in the commission of similarly invalid lumping, in Reeps v. BMW of North America, LLC, 2012 NY Slip Op 33030(U), N.Y.S.Ct., Index No. 100725/08 (New York Cty. Dec. 21, 2012) (York, J.), 2012 WL 6729899, aff’d on rearg., 2013 WL 2362566, aff’d, 115 A.D.3d 432, 981 N.Y.S.2d 514 (2013), aff’d sub nom. Sean R. v. BMW of North America, LLC, ___ N.E.3d ___, 2016 WL 527107 (2016). See also “New York Breathes Life Into Frye Standard – Reeps v. BMW” (Mar. 5, 2013).

12 “Infante-lizing the IARC” (May 13, 2018).

13 Knight v. Kirby Inland Marine, 363 F.Supp. 2d 859, 864 (N.D. Miss. 2005), aff’d, 482 F.3d 347 (5th Cir. 2007) (excluding opinion of B.S. Levy on Hodgkin’s disease based upon studies of other lymphomas and myelomas); Allen v. Pennsylvania Eng’g Corp., 102 F.3d 194, 198 (5th Cir. 1996) (noting that evidence suggesting a causal connection between ethylene oxide and human lymphatic cancers is not probative of a connection with brain cancer); Current v. Atochem North America, Inc., 2001 WL 36101283, at *3 (W.D. Tex. Nov. 30, 2001) (excluding expert witness opinion of Michael Gochfeld, who asserted that arsenic causes rectal cancer on the basis of studies that show association with lung and bladder cancer; Hill’s consistency factor in causal inference does not apply to cancers generally); Exxon Corp. v. Makofski, 116 S.W.3d 176, 184-85 (Tex. App. Houston 2003) (“While lumping distinct diseases together as ‘leukemia’ may yield a statistical increase as to the whole category, it does so only by ignoring proof that some types of disease have a much greater association with benzene than others.”).

14 The Public Affairs Committee of the Teratology Society, “Teratology Society Public Affairs Committee Position Paper: Causation in Teratology-Related Litigation,” 73 Birth Defects Research (Part A) 421, 423 (2005).

15 168 F. Supp. 2d 1271, 1284–87 (D. Utah 2001).

16 Montori, et al., “Validity of composite end points in clinical trials,” 330 Brit. Med. J. 594, 595b (2005).

17 Kevin Kip, et al., “The problem with composite end points in cardiovascular studies,” 51 J. Am. Coll. Cardiol. 701, 701 (2008) (Abstract – Conclusions) (emphasis in original).

18 Montori, et al., “Users’ guide to detecting misleading claims in clinical research reports,” 329 Brit. Med. J. 1093 (2004) (emphasis added).

19 Id. at 1094b, 1095a.

20 Montori, et al., “Validity of composite end points in clinical trials,” 330 Brit. Med. J. 594, 596 (2005).

21 Schulz & Grimes, “Multiplicity in randomized trials I: endpoints and treatments,” 365 Lancet 1591, 1595a (2005) (emphasis added). These authors acknowledge that composite end points often lack clinical relevancy, and that the gain in statistical efficiency comes at the high cost of interpretational difficulties. Id. at 1593.

22 Wolfgang Ahrens & Iris Pigeot, eds., Handbook of Epidemiology 1840 (2d ed. 2014) (§ 47.5.8, Use of Composite Endpoints).

23 See, e.g., Stuart J. Pocock, John J.V. McMurray, and Tim J. Collier, “Statistical Controversies in Reporting of Clinical Trials: Part 2 of a 4-Part Series on Statistics for Clinical Trials,” 66 J. Am. Coll. Cardiol. 2648, 2650-51 (2015) (“Interpret composite endpoints carefully.”) (“COMPOSITE ENDPOINTS. These are commonly used in CV RCTs to combine evidence across 2 or more outcomes into a single primary endpoint. But, there is a danger of oversimplifying the evidence by putting too much emphasis on the composite, without adequate inspection of the contribution from each separate component.”); Eric Lim, Adam Brown, Adel Helmy, Shafi Mussa, and Douglas G. Altman, “Composite Outcomes in Cardiovascular Research: A Survey of Randomized Trials,” 149 Ann. Intern. Med. 612, 612, 615-16 (2008) (“Individual outcomes do not contribute equally to composite measures, so the overall estimate of effect for a composite measure cannot be assumed to apply equally to each of its individual outcomes.”) (“Therefore, readers are cautioned against assuming that the overall estimate of effect for the composite outcome can be interpreted to be the same for each individual outcome.”); Freemantle, et al., “Composite outcomes in randomized trials: Greater precision but with greater uncertainty,” 289 J. Am. Med. Ass’n 2554, 2559a (2003) (“To avoid the burying of important components of composite primary outcomes for which on their own no effect is discerned, . . . the components of a composite outcome should always be declared as secondary outcomes, and the results described alongside the result for the composite outcome.”).

24 Freemantle & Calvert, “Composite and surrogate outcomes in randomized controlled trials,” 334 Brit. Med. J. 756, 757a (2007).

25 Lem Moyé, “Statistical Methods for Cardiovascular Researchers,” 118 Circulation Research 439, 451 (2016).

Cartoon Advocacy for Causal Claims

October 5th, 2018

I saw him today at the courthouse
On his table was a sawed-in-half man
He was practiced at the art of deception
Well I could tell by his blood-stained hands
Ah yeah! Yeah1

Mark Lanier’s Deceptive Cartoon Advocacy

A recent book by Kurt Andersen details the extent of American fantasy, in matters religious, political, and scientific.2 Andersen’s book is a good read and a broad-ranging dissection of the American psyche’s weakness for codswallop. The book has one gaping hole, however. It completely omits the penchant for fantasy in American courtrooms.

Ideally, the trial lawyers in a case balance each other, and their distractions drop out of the judge or jury’s search for the truth. Sometimes, probably too frequently in so-called toxic tort cases, plaintiffs’ counsel’s penchant for fantasy is so great and persistent that it overwhelms the factfinder’s respect for the truth, and results in an unjust award. In a telling article in Forbes, Mr. Daniel Fisher has turned his sights upon plaintiffs’ lawyer Mark Lanier and his role in helping a jury deliver a $5 billion verdict (give or take a few shekels).3

The $5 billion verdict came in the St. Louis, Missouri, courtroom of Judge Rex Burlison, who presided over a multi-plaintiff case in which the plaintiffs claimed that they had developed ovarian cancer from using Johnson & Johnson’s talcum powder. In previous trials, plaintiffs’ counsel and expert witnesses attempted to show that talc itself could cause ovarian cancer, with inconsistent jury results. Mr. Lanier took a different approach in claiming that the talcum powder was contaminated with asbestos, which caused his clients to develop ovarian cancer.

The asserted causal relationship between occupational or personal exposure to talc and ovarian cancer is tenuous at best, but there is at least a debatable issue about the claimed association between occupational asbestos exposure and ovarian cancer. The more thoughtful reviews of the issue, however, caution that disease outcome misclassification (mesotheliomas that would be expected in these occupational cohorts may be misdiagnosed as ovarian cancers) makes conclusions difficult. See, e.g., Alison Reid, Nick de Klerk and Arthur W. (Bill) Musk, “Does Exposure to Asbestos Cause Ovarian Cancer? A Systematic Literature Review and Meta-analysis,” 20 Cancer Epidemiol. Biomarkers & Prevention 1287 (2011).

Fisher reported that Lanier, after obtaining the $5 billion verdict, presented to a litigation industry meeting, held at a plush Napa Valley resort. In this presentation, Lanier described his St. Louis achievement by likening himself to a magician, and explained “how I sawed the man in half.” Of course, if Lanier had sawed the man in half, he would be a murderer, and the principle of charity requires us to believe that he is merely a purveyor of magical thinking, a deceiver, practiced in the art of deception.

Lanier’s boast about his magical skills is telling. The whole point of the magician’s act is to thrill an audience by the seemingly impossible suspension of the laws of nature. Deception, of course, is the key to success for a magician, or an illusionist of any persuasion. It is comforting to think that Lanier regards himself as an illusionist because his self-characterization suggests that he does not really believe in his own courtroom illusions.

Lanier’s magical thinking and acts have gotten him into trouble before. Fisher noted that Lanier had been branded as deceptive by the second-highest court in the United States, the United States Court of Appeals for the Fifth Circuit, in Christopher v. DePuy Orthopaedics, Inc., Nos. 16-11051, et al., 2018 U.S. App. LEXIS 10476 (5th Cir. April 25, 2018). In Christopher, Lanier appeared to have engineered payments to expert witnesses in a way that he thought allowed him to tell the jury that the witnesses had no pecuniary interest in the case. Id. at *67. The Court noted that “[l]awyers cannot engage with a favorable expert, pay him ‘for his time’, then invite him to testify as a purportedly ‘non-retained’ neutral party. That is deception, plain and simple.” Id. at *67. The Court concluded that “Lanier’s deceptions furnish[ed] independent grounds for a new trial,” id. at *68, because Lanier’s “deceptions [had] obviously prevented defendants from ‘fully and fairly’ defending themselves.” Id. at *69.

Cartoon Advocacy

In his presentation to the litigation industry meeting in Napa Valley, Lanier explained that “Every judge lives by certain rules, just like in sports, but every stadium is also allowed to size themselves appropriately to the game.” See Fisher at note 3. Lanier’s magic act thrives in courtrooms where anything goes. And apparently, Lanier was telling his litigation industry audience that anything goes in the St. Louis courtroom of Judge Burlison.

In some of the ovarian cancer cases, Lanier had a problem: the women had a BRCA2 deletion mutation, which put them at a very high lifetime risk of ovarian cancer, irrespective of what exogenous exposures they may have had. Lanier was undaunted by this adverse evidence, and he spun a story that these women were standing at the edge of a cliff when evil Johnson & Johnson’s baby powder came along and pushed them over:

Lanier Exhibit (from Fisher’s article in Forbes)

Whatever this cartoon lacks in artistic ability, we should give the magician his due; this is a powerful rhetorical metaphor, but it is not science. If it were, there would be a study showing that ovarian cancers occurred more often in women with BRCA2 mutations and talcum exposure than in women with BRCA2 mutations without talcum exposure (a sketch of that comparison follows the quotation below). The cartoon also imputes an intention to harm specific plaintiffs, which is not supported by the evidence. Lanier’s argument about the “edge of the cliff” does not change the scientific or legal requirement that the alleged harm be the sine qua non of the tortious exposure. In the language of the American Law Institute’s Restatement of Torts4:

“An actor’s tortious conduct must be a factual cause of another’s physical harm for liability to be imposed. Conduct is a factual cause of harm when the harm would not have occurred absent the conduct.”
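
Returning to the cartoon’s missing evidence, the comparison the metaphor needs is a stratified one: hold the mutation constant and ask whether talc adds anything to the carriers’ baseline risk. A minimal sketch, with invented counts, of that within-stratum comparison:

    # Invented counts for a hypothetical cohort of BRCA2 carriers only.
    def risk_ratio(cases_exp, n_exp, cases_unexp, n_unexp):
        return (cases_exp / n_exp) / (cases_unexp / n_unexp)

    # 200 carriers exposed to talc, 200 not exposed, same cancer count in each:
    print(risk_ratio(40, 200, 40, 200))   # RR = 1.0: on these numbers the mutation,
                                          # not the talc, accounts for the high risk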

Lanier’s cartoon also mistakes risk, if risk it be, for cause in fact. Reverting to basic principles, Kenneth Rothman reminds us5:

“An elementary but essential principle to keep in mind is that a person may be exposed to an agent and then develop disease without there being any causal connection between the exposure and the disease. For this reason, we cannot consider the incidence proportion or the incidence rate among exposed people to measure a causal effect.”
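
Rothman’s point can be put in numbers. With invented figures, chosen only for illustration and not taken from the talc literature, even a genuinely causal but small relative risk would leave most exposed cases attributable to baseline risk rather than to the exposure:

    # Invented illustration: high baseline risk plus a small relative risk.
    baseline_risk = 0.20    # hypothetical lifetime risk among unexposed carriers
    relative_risk = 1.1     # hypothetical causal effect of the exposure
    exposed_risk = baseline_risk * relative_risk

    attributable_fraction = (exposed_risk - baseline_risk) / exposed_risk
    print(round(attributable_fraction, 2))   # ~0.09: about 1 exposed case in 11
                                             # would be attributable to the exposure

On those numbers, no individual plaintiff could show that the exposure was the sine qua non of her disease; the overwhelming majority of exposed cases would have occurred anyway.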

Chain, Chain, Chain — Chain of Foolish Custody

Johnson & Johnson has moved for a new trial, complaining about Lanier’s illusionist antics, as well as his cheesy lawyering. Apparently, Lanier used a block of cheese to illustrate his view of talc mining. In most courtrooms, argument is confined to counsel’s closing statements, but in Judge Burlison’s courtroom, Lanier seems to have engaged in one non-stop argument from the opening bell.

Whether there was asbestos in Johnson & Johnson’s baby powder was obviously a key issue in Lanier’s cases. According to Fisher’s article, Lanier was permitted, over defense objections, to present expert witness opinion testimony based upon old baby powder samples bought from collectors on eBay, for which chain of custody was lacking or incomplete. If this reporting is accurate, then Mr. Lanier is truly a magician, with the ability to make well-established law disappear.6

The Lanier Firm’s Website

One indication of how far out of control Judge Burlison’s courtroom was appears in Johnson & Johnson’s motion for a new trial, as reported by Fisher. Somehow, defense counsel had injected the content of Lanier’s firm’s website into the trial. According to the motion for new trial, that website had stated that talc “used in modern consumer products” was not contaminated with asbestos. In his closing argument, however, Lanier told the jury he had looked at his website, and the alleged admission was not there.

How the defense was permitted to talk about what was on Lanier’s website is a deep jurisprudential puzzle. Such a statement would be hearsay, without an authorizing exception. Perhaps the defense argued that Lanier’s website was an admission by an agent of the plaintiffs, authorized to speak for them. The attorney-client relationship does create an agent-principal relationship, but it is difficult to fathom that it extends to every statement that Mr. Lanier made outside the record of the trials before the court. If you, dear reader, are aware of authority to the contrary, please let me know.

Whatever tenuous basis the defense may have advanced, in this cartoon trial, for injecting Mr. Lanier’s personal extrajudicial statements into evidence, Mr. Lanier went one parsec farther, according to Fisher. In his closing argument, Lanier blatantly testified, telling the jury that he had checked the cited website and that the suggested statement was not there.

Sounds like a cartoon and a circus trial all bound up together; something that would bring smiles to the faces of Penn Jillette, P.T. Barnum, and Donald Duck.


1 With apologies to Mick Jagger and Keith Richards, and their “You Can’t Always Get What You Want,” from which I have borrowed.

2 Kurt Andersen, Fantasyland: How America Went Haywire – A 500-Year History (2017).

4 “Factual Cause,” A.L.I. Restatement of the Law of Torts (Third): Liability for Physical & Emotional Harm § 26 (2010).

5 Kenneth J. Rothman, Epidemiology: An Introduction at 57 (2d ed. 2012).

6 Paul C. Giannelli, “Chain of Custody,” Crim. L. Bull. 446 (1996); R. Thomas Chamberlain, “Chain of Custody: Its Importance and Requirements for Clinical Laboratory Specimens,” 20 Lab. Med. 477 (1989).

The Judicial Labyrinth for Scientific Evidence

October 3rd, 2018

The real Daedalus (not the musician), as every school child knows, was the creator of the Cretan Labyrinth, where the Minotaur resided. The Labyrinth had been the undoing of many Greeks and barbarians, until an Athenian, Theseus, took up the challenge of slaying the Minotaur. With the help of Ariadne’s thread, Theseus solved the labyrinthine puzzle and slew the Minotaur.

Theseus and the Minotaur on 6th-century black-figure pottery (Wikimedia Commons 2005)

Dædalus is also the Journal of the American Academy of Arts and Sciences. The Academy has, for over 230 years, addressed issues in both the humanities and the sciences. In the fall 2018 issue of Dædalus (volume 147, No. 4), the Academy has published a dozen essays by noted scholars in the field, who report on the murky interface of science and law in the courtrooms of the United States. Several of the essays focus on the sorry state of forensic “science” in the criminal justice system, which has been the subject of several critical official investigations, only to be dismissed and downplayed by both the Obama and Trump administrations. Other essays address the equally sorry state of judicial gatekeeping in civil actions, with some limited suggestions on how the process of scientific fact finding might be improved. In any event, this issue, “Science & the Legal System,” is worth reading even if you do not agree with the diagnoses or the proposed therapies. There is still room for a collaboration between a modern-day Daedalus and Ariadne to help us find the way out of this labyrinth.

Introduction

Shari Seidman Diamond & Richard O. Lempert, “Introduction” (pp. 5–14)

Connecting Science and Law

Sheila Jasanoff, “Science, Common Sense & Judicial Power in U.S. Courts” (pp. 15–27)

Linda Greenhouse, “The Supreme Court & Science: A Case in Point” (pp. 28–40)

Shari Seidman Diamond & Richard O. Lempert, “When Law Calls, Does Science Answer? A Survey of Distinguished Scientists & Engineers” (pp. 41–60)

Accommodation or Collision: When Science and Law Meet

Jules Lobel & Huda Akil, “Law & Neuroscience: The Case of Solitary Confinement” (pp. 61–75)

Rebecca S. Eisenberg & Robert Cook-Deegan, “Universities: The Fallen Angels of Bayh-Dole?” (pp. 76–89)

Jed S. Rakoff & Elizabeth F. Loftus, “The Intractability of Inaccurate Eyewitness Identification” (pp. 90–98)

Jennifer L. Mnookin, “The Uncertain Future of Forensic Science” (pp. 99–118)

Joseph B. Kadane & Jonathan J. Koehler, “Certainty & Uncertainty in Reporting Fingerprint Evidence” (pp. 119–134)

Communicating Science in Court

Nancy Gertner & Joseph Sanders, “Alternatives to Traditional Adversary Methods of Presenting Scientific Expertise in the Legal System” (pp. 135–151)

Daniel L. Rubinfeld & Joe S. Cecil, “Scientists as Experts Serving the Court” (pp. 152–163)

Valerie P. Hans & Michael J. Saks, “Improving Judge & Jury Evaluation of Scientific Evidence” (pp. 164–180)

Continuing the Dialogue

David Baltimore, David S. Tatel & Anne-Marie Mazza, “Bridging the Science-Law Divide” (pp. 181–194)