The Hazard of Composite End Points – More Lumpenepidemiology in the Courts

One of the challenges of epidemiologic research is selecting the right outcome of interest to study. What seems like a simple and obvious choice can often be the most complicated aspect of the design of clinical trials or studies.1 Lurking in this choice of end point is a particular threat to validity in the use of composite end points, when the real outcome of interest is one constituent among multiple end points aggregated into the composite. There may, for instance, be strong evidence in favor of one of the constituents of the composite, but using the composite end point results to support a causal claim for a different constituent begs the question that needs to be answered, whether in science or in law.

The danger of extrapolating from one disease outcome to another is well recognized in the medical literature. Remarkably, however, the problem received no meaningful discussion in the Reference Manual on Scientific Evidence (3d ed. 2011). The handbook designed to help judges decide threshold issues of admissibility of expert witness opinion testimony discusses the extrapolation from sample to population, from in vitro to in vivo, from one species to another, from high to low dose, and from long to short duration of exposure. The Manual, however, has no discussion of “lumping,” or of the appropriate (and inappropriate) use of composite or combined end points.

Composite End Points

Composite end points are typically defined, perhaps circularly, as a single group of health outcomes, which group is made up of constituent or single end points. Curtis Meinert defined a composite outcome as “an event that is considered to have occurred if any of several different events or outcomes is observed.”2 Similarly, Montori defined composite end points as “outcomes that capture the number of patients experiencing one or more of several adverse events.”3 Composite end points are also sometimes referred to as combined or aggregate end points.

Many composite end points are clearly defined for a clinical trial, and the component end points are specified. In some instances, the composite nature of an outcome may be subtle or be glossed over by the study’s authors. In the realm of cardiovascular studies, for example, investigators may look at stroke as a single endpoint, without acknowledging that there are important clinical and pathophysiological differences between ischemic strokes and hemorrhagic strokes (intracerebral or subarachnoid). The Fletchers’ textbook4 on clinical epidemiology gives the example:

“In a study of cardiovascular disease, for example, the primary outcomes might be the occurrence of either fatal coronary heart disease or non-fatal myocardial infarction. Composite outcomes are often used when the individual elements share a common cause and treatment. Because they comprise more outcome events than the component outcomes alone, they are more likely to show a statistical effect.”

Utility of Composite End Points

The quest for statistical “power” is often cited as a basis for using composite end points. Improvements in medical care have reduced the rates of “events,” such as myocardial infarction (MI), in studies and clinical trials. These low event rates have caused power issues for clinical trialists, who have responded by turning to composite end points to capture more events. Composite end points permit smaller sample sizes and shorter follow-up times without sacrificing power, that is, the ability to detect an increased rate of a prespecified size as statistically significant at a specified Type I error rate. Increasing study power, while reducing sample size or observation time, is perhaps the most frequently cited rationale for using composite end points.
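The arithmetic behind this rationale can be sketched with the textbook normal-approximation formula for comparing two proportions. The event rates below are hypothetical, chosen only for illustration, and the function name n_per_arm is an assumption rather than anything drawn from the cited sources.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_arm(p_control, p_exposed, alpha=0.05, power=0.80):
    """Approximate patients needed per arm for a two-sided test of two proportions."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for Type I error
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    p_bar = (p_control + p_exposed) / 2
    spread_null = z_alpha * sqrt(2 * p_bar * (1 - p_bar))
    spread_alt = z_beta * sqrt(p_control * (1 - p_control)
                               + p_exposed * (1 - p_exposed))
    return ceil((spread_null + spread_alt) ** 2 / (p_control - p_exposed) ** 2)


# Hypothetical rates: the same 50% relative increase in both scenarios.
n_single = n_per_arm(0.02, 0.03)     # a single, infrequent end point
n_composite = n_per_arm(0.10, 0.15)  # a composite that captures more events

print(n_single, n_composite)
```

On these assumed numbers, the single end point requires several times as many patients per arm as the composite, even though both scenarios posit the same relative increase. The saving in sample size is real, but it says nothing about which component drives the result.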

Competing Risks

Another reason sometimes offered in support of using composite end points is that composites provide a strategy to avoid the problem of competing risks.5 Death (any cause) is sometimes added to a distinct clinical morbidity because patients who are taken out of the trial by death are “unavailable” to experience the morbidity outcome.

Multiple Testing

By aggregating several individual end points into a single pre-specified outcome, trialists can avoid corrections for multiple testing. Trials that seek data on multiple outcomes, or on multiple subgroups, inevitably raise concerns about the appropriate choice of the significance level (alpha) for the statistical tests used to determine whether to reject the null hypothesis. According to some authors, “[c]omposite endpoints alleviate multiplicity concerns”:

“If designated a priori as the primary outcome, the composite obviates the multiple comparisons associated with testing of the separate components. Moreover, composite outcomes usually lead to high event rates thereby increasing power or reducing sample size requirements. Not surprisingly, investigators frequently use composite endpoints.”6

Other authors have similarly acknowledged that the need to avoid false positive results from multiple testing is an important rationale for composite end points:

“Because the likelihood of observing a statistically significant result by chance alone increases with the number of tests, it is important to restrict the number of tests undertaken and limit the type 1 error to preserve the overall error rate for the trial.”7
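The inflation of the error rate that these authors describe can be illustrated with a short calculation. The sketch assumes that the tests are independent and that every null hypothesis is true; the components of a real composite are usually correlated, so the figures are illustrative only. The function names are hypothetical.

```python
def familywise_error(alpha, n_tests):
    """Chance of at least one false positive across independent tests (all nulls true)."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_alpha(alpha, n_tests):
    """Per-test significance level that caps the overall error rate at alpha."""
    return alpha / n_tests


for n_tests in (1, 3, 5):
    print(n_tests,
          round(familywise_error(0.05, n_tests), 3),
          round(bonferroni_alpha(0.05, n_tests), 4))
```

Testing three component end points separately at alpha = 0.05 raises the chance of at least one spurious “significant” finding to roughly 14 percent under these assumptions. A single pre-specified composite is tested once, which is why it sidesteps the correction.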

Indecision about an Appropriate Single Outcome

The International Conference on Harmonisation suggests that the inability to select a single outcome variable may lead to the adoption of a composite outcome:

“If a single primary variable cannot be selected …, another useful strategy is to integrate or combine the multiple measurements into a single or composite variable.”8

The “indecision” rationale has also been criticized as “generally not a good reason to use a composite end point.”9

Validity of Composite End Points

The validity of composite end points depends upon methodological assumptions, which will have to be made at the time of the study design and protocol creation. After the data are collected and analyzed, the assumptions may or may not be supported. Among the supporting assumptions about the validity of using composites are:10

  • similarity in patient importance for included component end points,

  • similarity of association size of the components, and

  • similarity in the number of events across the components.

The use of composite end points can sometimes be appropriate in the “first look” at a class of diseases or disorders, with the understanding that further research will sort out and refine the associated end point. Research into the causes of human birth defects, for instance, often starts out with a look at “all major malformations,” before focusing in on specific organ and tissue systems. To some extent, the legal system, in its gatekeeping function, has recognized the dangers and invalidity of lumping in the epidemiology of birth defects.11 The Frischhertz decision, for instance, clearly acknowledged that given the clear evidence that different birth defects arise at different times, based upon interference with different embryological processes, “lumping” of end points was methodologically inappropriate. 2012 U.S. Dist. LEXIS 181507, at *8 (citing Chamber v. Exxon Corp., 81 F. Supp. 2d 661 (M.D. La. 2000), aff’d, 247 F.3d 240 (5th Cir. 2001) (unpublished)).

The Chamber decision involved a challenge to the causation opinion of a frequent litigation industry witness, Peter Infante,12 who attempted to defend his opinion about benzene and chronic myelogenous leukemia (CML), based upon epidemiology of benzene and acute myelogenous leukemia (AML). Plaintiffs’ witnesses and counsel sought to evade the burden of producing evidence of a CML association by pointing to a study that reported “excess leukemias,” without specifying the relevant type. Chamber, 81 F. Supp. 2d at 664. The trial court, however, perspicaciously recognized the claimants’ failure to identify relevant evidence of the specific association needed to support the causal claim.

The Frischhertz and Chamber cases are hardly unique. Several state and federal courts have concurred in the context of cancer causation claims.13 In the context of birth defects litigation, the Public Affairs Committee of the Teratology Society has weighed in with strong guidance that counsels against extrapolation between different birth defects in litigation:

“Determination of a causal relationship between a chemical and an outcome is specific to the outcome at issue. If an expert witness believes that a chemical causes malformation A, this belief is not evidence that the chemical causes malformation B, unless malformation B can be shown to result from malformation A. In the same sense, causation of one kind of reproductive adverse effect, such as infertility or miscarriage, is not proof of causation of a different kind of adverse effect, such as malformation.”14

The threat to validity in attributing a suggested risk for a composite end point to all included component end points is not, unfortunately, recognized by all courts. The trial court, in Ruff v. Ensign-Bickford Industries, Inc.,15 permitted plaintiffs’ expert witness to reanalyze a study by grouping together two previously distinct cancer outcomes to generate a statistically significant result. The result in Ruff is disappointing, but not uncommon. The result is also surprising, considering the guidance provided by the American Law Institute’s Restatement:

“Even when satisfactory evidence of general causation exists, such evidence generally supports proof of causation only for a specific disease. The vast majority of toxic agents cause a single disease or a series of biologically-related diseases. (Of course, many different toxic agents may be combined in a single product, such as cigarettes.) When biological-mechanism evidence is available, it may permit an inference that a toxic agent caused a related disease. Otherwise, proof that an agent causes one disease is generally not probative of its capacity to cause other unrelated diseases. Thus, while there is substantial scientific evidence that asbestos causes lung cancer and mesothelioma, whether asbestos causes other cancers would require independent proof. Courts refusing to permit use of scientific studies that support general causation for diseases other than the one from which the plaintiff suffers unless there is evidence showing a common biological mechanism include Christophersen v. Allied-Signal Corp., 939 F.2d 1106, 1115-1116 (5th Cir. 1991) (applying Texas law) (epidemiologic connection between heavy-metal agents and lung cancer cannot be used as evidence that same agents caused colon cancer); Cavallo v. Star Enters., 892 F. Supp. 756 (E.D. Va. 1995), aff’d in part and rev’d in part, 100 F.3d 1150 (4th Cir. 1996); Boyles v. Am. Cyanamid Co., 796 F. Supp. 704 (E.D.N.Y. 1992). In Austin v. Kerr-McGee Ref. Corp., 25 S.W.3d 280, 290 (Tex. Ct. App. 2000), the plaintiff sought to rely on studies showing that benzene caused one type of leukemia to prove that benzene caused a different type of leukemia in her decedent. Quite sensibly, the court insisted that before plaintiff could do so, she would have to submit evidence that both types of leukemia had a common biological mechanism of development.”

Restatement (Third) of Torts § 28 cmt. c, at 406 (2010). Notwithstanding some of the Restatement’s excesses on other issues, the guidance on composites seems sane and consonant with the scientific literature.

Role of Mechanism in Justifying Composite End Points

A composite end point may make sense when the individual end points are biologically related, and the investigators can reasonably expect that the individual end points would be affected in the same direction, and approximately to the same extent:16

“Confidence in a composite end point rests partly on a belief that similar reductions in relative risk apply to all the components. Investigators should therefore construct composite endpoints in which the biology would lead us to expect similar effects across components.”

The important point, missed by some investigators and many courts, is that the assumption of similar “effects” must be tested by examining the individual component end points, and especially the end point that is the harm claimed by plaintiffs in a given case.
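A small worked example may make the point concrete. The counts below are entirely hypothetical, invented for illustration, and the sketch assumes that no patient experiences both component events, so that the component counts simply add up to the composite.

```python
def risk_ratio(events_exposed, n_exposed, events_unexposed, n_unexposed):
    """Risk in the exposed group divided by risk in the unexposed group."""
    return (events_exposed / n_exposed) / (events_unexposed / n_unexposed)


N_PER_ARM = 5000  # hypothetical arm size, same in both groups

# Hypothetical (exposed, unexposed) event counts for each component end point.
components = {
    "ischemic stroke": (120, 60),
    "hemorrhagic stroke": (20, 20),
}

# Composite counts: assumes no patient has both events (a simplification).
exposed_total = sum(exposed for exposed, _ in components.values())
unexposed_total = sum(unexposed for _, unexposed in components.values())

rr_composite = risk_ratio(exposed_total, N_PER_ARM, unexposed_total, N_PER_ARM)
rr_by_component = {
    name: risk_ratio(exposed, N_PER_ARM, unexposed, N_PER_ARM)
    for name, (exposed, unexposed) in components.items()
}

print("composite:", round(rr_composite, 2))
for name, rr in rr_by_component.items():
    print(name + ":", round(rr, 2))
```

In this invented data set the composite “stroke” end point shows an elevated risk ratio, but the elevation comes entirely from one component; the other component shows no increase at all. A claimant whose injury is the second component gains nothing from the composite result.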

Methodological Issues

The acceptability of composite end points is often a delicate balance between the statistical power and efficiency gained and the reliability concerns raised by using the composite. As with any statistical or interpretative tool, the key questions turn on how the tool is used, and for what purpose. The reliability issues raised by the use of composites are likely to be highly contextual.

For instance, there is an important asymmetry between justifying the use of a composite for measuring efficacy and the use of the same composite for safety outcomes. A biological improvement in type 2 diabetes might be expected to lead to a reduction in all the macrovascular complications of that disease, but a medication for type 2 diabetes might have a very specific toxicity or drug interaction, which affects only one constituent end point among all macrovascular complications, such as myocardial infarction. The asymmetry between efficacy and safety outcomes is specifically addressed by cardiovascular epidemiologists in an important methodological paper:17

“Varying definitions of composite end points, such as MACE, can lead to substantially different results and conclusions. Therefore, the term MACE, in particular, should not be used, and when composite study end points are desired, researchers should focus separately on safety and effectiveness outcomes, and construct separate composite end points to match these different clinical goals.”

There are many clear, published statements that caution consumers of medical studies against being misled by claims based upon composite end points. Several years ago, for example, the British Medical Journal published a paper with six methodological suggestions for consumers of studies, one of which deals explicitly with composite end points:18

“Guide to avoid being misled by biased presentation and interpretation of data

1. Read only the Methods and Results sections; bypass the Discussion section

2. Read the abstract reported in evidence based secondary publications

3. Beware faulty comparators

4. Beware composite endpoints

5. Beware small treatment effects

6. Beware subgroup analyses”

The paper elaborates on the problems that arise from the use of composite end points:19

“Problems in the interpretation of these trials arise when composite end points include component outcomes to which patients attribute very different importance… .”

“Problems may also arise when the most important end point occurs infrequently or when the apparent effect on component end points differs.”

“When the more important outcomes occur infrequently, clinicians should focus on individual outcomes rather than on composite end points. Under these circumstances, inferences about the end points (which because they occur infrequently will have very wide confidence intervals) will be weak.”
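The remark about very wide confidence intervals can be illustrated with the usual large-sample interval for a risk ratio, computed on the log scale. The counts are hypothetical and the helper name risk_ratio_ci is an assumption; with counts as small as those of the rare component, an exact method would ordinarily be preferred, so the output is only a rough guide.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def risk_ratio_ci(events_exposed, n_exposed, events_unexposed, n_unexposed, level=0.95):
    """Risk ratio with a large-sample confidence interval computed on the log scale."""
    rr = (events_exposed / n_exposed) / (events_unexposed / n_unexposed)
    se_log_rr = sqrt(1 / events_exposed - 1 / n_exposed
                     + 1 / events_unexposed - 1 / n_unexposed)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return rr, exp(log(rr) - z * se_log_rr), exp(log(rr) + z * se_log_rr)


# Hypothetical counts, 5,000 patients per arm.
ci_composite = risk_ratio_ci(140, 5000, 80, 5000)  # composite end point
ci_rare = risk_ratio_ci(6, 5000, 3, 5000)          # infrequent component

for label, (rr, low, high) in (("composite", ci_composite), ("rare component", ci_rare)):
    print(f"{label}: RR {rr:.2f}, 95% CI {low:.2f} to {high:.2f}")
```

With these invented numbers the interval for the composite excludes 1.0, while the interval for the infrequent component runs from well below 1.0 to several times above it. The data are thus consistent with anything from a protective effect to a large harm for that component, which is what makes inferences about it weak.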

Authors generally acknowledge that “[w]hen large variations exist between components the composite end point should be abandoned.”20

Methodological Issues Concerning Causal Inferences from Composite End Points to Individual End Points

Several authors have criticized pharmaceutical companies for using composite end points to “game” their trials. Composites allow smaller sample size, but they lend themselves to broader claims for outcomes included within the composite. The same criticism applies to attempts to infer that there is risk of an individual endpoint based upon a showing of harm in the composite endpoint.

“If a trial report specifies a composite endpoint, the components of the composite should be in the well-known pathophysiology of the disease. The researchers should interpret the composite endpoint in aggregate rather than as showing efficacy of the individual components. However, the components should be specified as secondary outcomes and reported beside the results of the primary analysis.”21

Virtually the entire field of epidemiology and clinical trial study has urged caution in inferring risk for a component end point from suggested risk in a composite end point:

“In summary, evaluating trials that use composite outcome requires scrutiny in regard to the underlying reasons for combining endpoints and its implications and has impact on medical decision-making (see below in Sect. 47.8). Composite endpoints are credible only when the components are of similar importance and the relative effects of the intervention are similar across components (Guyatt et al. 2008a).”22

Not only do important methodologists urge caution in the interpretation of composite end points,23 they emphasize a basic point of scientific (and legal) relevancy:

“[A] positive result for a composite outcome applies only to the cluster of events included in the composite and not to the individual components.”24

Even regular testifying expert witnesses for the litigation industry insist upon the “principle of full disclosure”:

“The analysis of the effect of therapy on the combined end point should be accompanied by a tabulation of the effect of the therapy for each of the component end points.”25

Gatekeepers in our judicial system need to be more vigilant against bait-and-switch inferences based upon composite end points. The quest for statistical power hardly justifies larding up an end point with irrelevant data points.


1 See, e.g., Milton Packer, “Unbelievable! Electrophysiologists Embrace ‘Alternative Facts’,” MedPage (May 16, 2018) (describing clinical trialists’ abandoning pre-specified intention-to-treat analysis).

2 Curtis Meinert, Clinical Trials Dictionary (Johns Hopkins Center for Clinical Trials 1996).

3 Victor M. Montori, et al., “Validity of composite end points in clinical trials,” 330 Brit. Med. J. 594, 596 (2005).

4 R. Fletcher & S. Fletcher, Clinical Epidemiology: The Essentials at 109 (4th ed. 2005).

5 Neaton, et al., “Key issues in end point selection for heart failure trials: composite end points,” 11 J. Cardiac Failure 567, 569a (2005).

6 Schulz & Grimes, “Multiplicity in randomized trials I: endpoints and treatments,” 365 Lancet 1591, 1593a (2005).

7 Freemantle & Calvert, “Composite and surrogate outcomes in randomized controlled trials,” 334 Brit. Med. J. 756, 756a – b (2007).

8 International Conference on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use, “ICH harmonized tripartite guideline: statistical principles for clinical trials,” 18 Stat. Med. 1905 (1999).

9 Neaton, et al., “Key issues in end point selection for heart failure trials: composite end points,” 11 J. Cardiac Failure 567, 569b (2005).

10 Montori, et al., “Validity of composite end points in clinical trials,” 330 Brit. Med. J. 594, 596, Summary Point No. 2 (2005).

11 See “Lumpenepidemiology” (Dec. 24, 2012), discussing Frischhertz v. SmithKline Beecham Corp., 2012 U.S. Dist. LEXIS 181507 (E.D. La. 2012). Frischhertz was decided in the same month that a New York City trial judge ruled Dr. Shira Kramer out of bounds in the commission of similarly invalid lumping, in Reeps v. BMW of North America, LLC, 2012 NY Slip Op 33030(U), N.Y.S.Ct., Index No. 100725/08 (New York Cty. Dec. 21, 2012) (York, J.), 2012 WL 6729899, aff’d on rearg., 2013 WL 2362566, aff’d, 115 A.D.3d 432, 981 N.Y.S.2d 514 (2013), aff’d sub nom. Sean R. v. BMW of North America, LLC, ___ N.E.3d ___, 2016 WL 527107 (2016). See also “New York Breathes Life Into Frye Standard – Reeps v. BMW” (Mar. 5, 2013).

12 “Infante-lizing the IARC” (May 13, 2018).

13 Knight v. Kirby Inland Marine, 363 F. Supp. 2d 859, 864 (N.D. Miss. 2005), aff’d, 482 F.3d 347 (5th Cir. 2007) (excluding opinion of B.S. Levy on Hodgkin’s disease based upon studies of other lymphomas and myelomas); Allen v. Pennsylvania Eng’g Corp., 102 F.3d 194, 198 (5th Cir. 1996) (noting that evidence suggesting a causal connection between ethylene oxide and human lymphatic cancers is not probative of a connection with brain cancer); Current v. Atochem North America, Inc., 2001 WL 36101283, at *3 (W.D. Tex. Nov. 30, 2001) (excluding expert witness opinion of Michael Gochfeld, who asserted that arsenic causes rectal cancer on the basis of studies that show association with lung and bladder cancer; Hill’s consistency factor in causal inference does not apply to cancers generally); Exxon Corp. v. Makofski, 116 S.W.3d 176, 184-85 (Tex. App. Houston 2003) (“While lumping distinct diseases together as ‘leukemia’ may yield a statistical increase as to the whole category, it does so only by ignoring proof that some types of disease have a much greater association with benzene than others.”).

14 The Public Affairs Committee of the Teratology Society, “Teratology Society Public Affairs Committee Position Paper Causation in Teratology-Related Litigation,” 73 Birth Defects Research (Part A) 421, 423 (2005).

15 168 F. Supp. 2d 1271, 1284–87 (D. Utah 2001).

16 Montori, et al., “Validity of composite end points in clinical trials,” 330 Brit. Med. J. 594, 595b (2005).

17 Kevin Kip, et al., “The problem with composite end points in cardiovascular studies,” 51 J. Am. Coll. Cardiol. 701, 701 (2008) (Abstract – Conclusions) (emphasis in original).

18 Montori, et al., “Users’ guide to detecting misleading claims in clinical research reports,” 329 Brit. Med. J. 1093 (2004) (emphasis added).

19 Id. at 1094b, 1095a.

20 Montori, et al., “Validity of composite end points in clinical trials,” 330 Brit. Med. J. 594, 596 (2005).

21 Schulz & Grimes, “Multiplicity in randomized trials I: endpoints and treatments,” 365 Lancet 1591, 1595a (2005) (emphasis added). These authors acknowledge that composite end points often lack clinical relevancy, and that the gain in statistical efficiency comes at the high cost of interpretational difficulties. Id. at 1593.

22 Wolfgang Ahrens & Iris Pigeot, eds., Handbook of Epidemiology 1840 (2d ed. 2014) (47.5.8 Use of Composite Endpoints).

23 See, e.g., Stuart J. Pocock, John J.V. McMurray, and Tim J. Collier, “Statistical Controversies in Reporting of Clinical Trials: Part 2 of a 4-Part Series on Statistics for Clinical Trials,” 66 J. Am. Coll. Cardiol. 2648, 2650-51 (2015) (“Interpret composite endpoints carefully.”)(“COMPOSITE ENDPOINTS. These are commonly used in CV RCTs to combine evidence across 2 or more outcomes into a single primary endpoint. But, there is a danger of oversimplifying the evidence by putting too much emphasis on the composite, without adequate inspection of the contribution from each separate component.”); Eric Lim, Adam Brown, Adel Helmy, Shafi Mussa, and Douglas G. Altman, “Composite Outcomes in Cardiovascular Research: A Survey of Randomized Trials,” 149 Ann. Intern. Med. 612, 612, 615-16 (2008) (“Individual outcomes do not contribute equally to composite measures, so the overall estimate of effect for a composite measure cannot be assumed to apply equally to each of its individual outcomes.”) (“Therefore, readers are cautioned against assuming that the overall estimate of effect for the composite outcome can be interpreted to be the same for each individual outcome.”); Freemantle, et al., “Composite outcomes in randomized trials: Greater precision but with greater uncertainty.” 289 J. Am. Med. Ass’n 2554, 2559a (2003) (“To avoid the burying of important components of composite primary outcomes for which on their own no effect is concerned, . . . the components of a composite outcome should always be declared as secondary outcomes, and the results described alongside the result for the composite outcome.”).

24 Freemantle & Calvert, “Composite and surrogate outcomes in randomized controlled trials,” 334 Brit. Med. J. 756, 757a (2007).

25 Lem Moyé, “Statistical Methods for Cardiovascular Researchers,” 118 Circulation Research 439, 451 (2016).