Gatekeeping the Lumpers and Splitters – Composite End Points

The battle between lumpers and splitters is fought in many disciplines, and so it is no surprise that it finds its way into litigation.

The battle is often entrenched in the discipline of epidemiology, where practitioners tussle over the definition of the end point of a study or clinical trial.  Lumping has the advantage of increasing the number of outcome events counted, with attendant increases in statistical power.  The down side of lumping is that the “lumped” or composite outcome may no longer be meaningful with respect to the more precise outcome of interest.  In other words, the lumping threatens the external validity of the study.  Splitting preserves external validity with respect to the outcome of interest, but decreases the number of events counted, with a greater risk of Type II error.

The issue arises in birth defect litigation, such as the claims made against the manufacturer of Bendectin, where the claimants’ expert witnesses frequently tried to increase power by lumping different birth defects together, despite the lack of embryological plausibility.  The issue has come up in cardiovascular end point trials and meta-analyses, involving thrombo-embolic outcomes, such as stroke and heart attack.  The Celebrex litigation, for instance, involved contested issues of what cardiovascular end points to combine to capture the postulated thrombotic causal mechanism.  In re Pfizer Inc. Securities Litig., 2010 WL 1047618 (S.D.N.Y. 2010).

Despite the recurrence of lumping/splitting issues in litigation of epidemiologic evidence, the Reference Manual on Scientific Evidence (3d ed. 2011) does not treat the subject at all.  Federal and state judges are often at sea (without sextant or compass) in disputes over lumping and splitting, where the methodology selected can often determine the result.  The following is a collection of some observations, comments, and guidance from the biomedical literature on the use of composite end points.

 

Composite Endpoints

A.  Definition

Composite end points are typically defined, perhaps circularly, as a single group of health outcomes, which group is made up of constituent or single end points.  Meinert defined a composite outcome as “an event that is considered to have occurred if any of several different events or outcomes is observed.”  C. Meinert, Clinical Trials Dictionary (Johns Hopkins Center for Clinical Trials 1996).  Similarly, Montori defined composite end points as “outcomes that capture the number of patients experiencing one or more of several adverse events.”  Montori, et al., “Validity of composite end points in clinical trials,” 330 Brit. Med. J. 594, 596 (2005).  Composite end points are also sometimes referred to as combined or aggregate end points.

Many composite end points are clearly defined for a clinical trial, and the component end points are specified.  In some instances, the composite nature of an outcome may be subtle or be glossed over by the study’s authors.  In the realm of cardiovascular studies, for example, investigators may look at stroke as a single endpoint, without acknowledging that there are important clinical and pathophysiological differences between ischemic strokes and hemorrhagic strokes (intracerebral or subarachnoid).  The Fletchers give the example:

“In a study of cardiovascular disease, for example, the primary outcomes might be the occurrence of either fatal coronary heart disease or non-fatal myocardial infarction.  Composite outcomes are often used when the individual elements share a common cause and treatment.  Because they comprise more outcome events than the component outcomes alone, they are more likely to show a statistical effect.”

R. Fletcher & S. Fletcher, Clinical Epidemiology: The Essentials 109 (4th ed. 2005).

B.  Utility of Composite End Points

1.  Power

Use of composite end points frequently occurs in the context of studying heart attacks as the outcome of interest.  Improvements in medical care have led to decreased frequency in rates of myocardial infarction (MI) and repeat MIs.  In clinical trials, because of the close medical attention received by participants, event rates are even lower than what might be expected from the relevant general patient population.  These low event rates have caused power issues for clinical trialists, who have responded by turning to composite end points to capture more events.  Composite end points permit smaller sample sizes and shorter follow-up times.  Increasing study power, while reducing sample size or observation time, is perhaps the most frequently cited rationale for using composite end points.

Typical statements from the medical literature:

“Clinical trials, particularly in cardiology, often use composite end points to reduce sample size requirements and to capture the overall impact of therapeutic interventions.”

(Ferreira-Gonzalez 2007, p. 1b, Introduction)

“The widespread use of composite end points reflects their elegant simplicity as a solution to the problem of declining event rates.”

(Montori 2005, at 596, Conclusions)

“The primary rationale for considering a composite primary outcome instead of a single event outcome is sample size.”

(Neaton 2005, at 598b)

“Clinical trialists use composite end points, outcomes that capture the number of patients who have one or more of several events, to increase event rates and statistical power.”

(Ferreira-Gonzalez 2007, p. 6a, Box)

“Although dealing with multiple testing is an important factor in the design and analysis of clinical trials, this may not be the only motivation behind the popularity of composite outcome measures.  Instead, issues of statistical efficiency appear to be prominent, with composite outcomes in time-to-event trials leading to higher event rates and thus enabling smaller sample sizes or shorter follow-up (or both).”

(Freemantle 2003, at 2555 b-c)

“Investigators often use composite end points to enhance the statistical efficiency of clinical trials.”

(Montori 2004, at 1094b)
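
The sample-size rationale in these statements can be illustrated with a standard textbook approximation for comparing two proportions.  The following is only a minimal sketch, not any cited author's method: the function name n_per_arm, the event rates (2% for a single end point, 6% for a composite), and the 25% relative risk reduction are all hypothetical values chosen for illustration.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(p_control, p_treated, alpha=0.05, power=0.80):
    """Approximate patients needed per arm for a two-sided z-test
    comparing two proportions (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for two-sided alpha
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    variance = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p_control - p_treated) ** 2)


# Hypothetical event rates; the same relative risk is assumed for both end points.
RELATIVE_RISK = 0.75
n_single = n_per_arm(0.02, 0.02 * RELATIVE_RISK)     # rare single end point
n_composite = n_per_arm(0.06, 0.06 * RELATIVE_RISK)  # more frequent composite

print("per-arm sample size, single end point:", n_single)
print("per-arm sample size, composite end point:", n_composite)
```

Under these assumed numbers, the composite needs roughly one third as many patients per arm.  The saving depends entirely on the assumption that the treatment affects every component to the same relative degree, which is the very assumption the validity discussion below calls into question.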

2.  Competing Risks

Another reason offered in support of using composite end points is that composites provide a strategy to avoid the problem of competing risks.  (Neaton 2005, at 569a)  Death (any cause) is sometimes added to a distinct clinical morbidity because patients who are taken out of the trial by death are “unavailable” to experience the morbidity outcome.

3.  Multiple Testing

By aggregating several individual end points into a single pre-specified outcome, trialists can avoid corrections for multiple testing.  Trials that seek data on multiple outcomes, or on multiple subgroups, inevitably raise concerns about the appropriate choice of the significance level (alpha) for the statistical test used to determine whether to reject the null hypothesis.  According to some authors, “[c]omposite endpoints alleviate multiplicity concerns.”  Schulz & Grimes, “Multiplicity in randomized trials I:  endpoints and treatments,” 365 Lancet 1591, 1593a (2005).  Schulz and Grimes, who have written extensively about methodological issues, comment further:

“If designated a priori as the primary outcome, the composite obviates the multiple comparisons associated with testing of the separate components.  Moreover, composite outcomes usually lead to high event rates thereby increasing power or reducing sample size requirements.  Not surprisingly, investigators frequently use composite endpoints.”

Id.  Freemantle and Calvert acknowledge that the need to avoid false positive results from multiple testing is an important rationale for composite end points:

“Because the likelihood of observing a statistically significant result by chance alone increases with the number of tests, it is important to restrict the number of tests undertaken and limit the type 1 error to preserve the overall error rate for the trial.”

Freemantle & Calvert, “Composite and surrogate outcomes in randomized controlled trials,” 334 Brit. Med. J. 756, 756a-b (2007).  Freemantle previously had articulated a similar rationale:

“[T]he correct (a priori) identification of a composite end point can increase the statistical precision and thus the efficiency of a trial.”

(Freemantle 2003, at 2558a)
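
The multiplicity concern described by these authors can be made concrete with simple arithmetic.  The sketch below assumes, purely for illustration, that the separate tests are statistically independent (component end points in a real trial generally are not); the function names are hypothetical.

```python
def familywise_error_rate(alpha, n_tests):
    """Chance of at least one false positive among n_tests independent
    tests, each run at significance level alpha, when every null is true."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_alpha(alpha, n_tests):
    """Per-test significance level that keeps the overall error rate
    at or below alpha (Bonferroni correction)."""
    return alpha / n_tests


for k in (1, 3, 5):
    print(
        f"{k} test(s): familywise error rate = "
        f"{familywise_error_rate(0.05, k):.3f}, "
        f"Bonferroni per-test alpha = {bonferroni_alpha(0.05, k):.4f}"
    )
```

A single pre-specified composite is one test, so the overall error rate stays at the nominal alpha; testing several components separately either inflates the chance of a false positive or, if corrected, demands a stricter threshold for each test.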

4.  Indecision about an Appropriate Single Outcome

The International Conference on Harmonization suggests that the inability to select a single outcome variable may lead to the adoption of a composite outcome:

“If a single primary variable cannot be selected …, another useful strategy is to integrate or combine the multiple measurements into a single or composite variable.”

International Conference on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use, “ICH harmonized tripartite guideline:  statistical principles for clinical trials,” 18 Stat. Med. 1905 (1999).

Freemantle gives this rationale some measure of approval:

“Composite outcomes can help in avoiding arbitrary decisions between different candidate outcomes when prespecifying the primary outcome … .”

(Freemantle & Calvert 2007, at 757a)

“[A] composite outcome may help investigators who are having difficulty in deciding which outcome to elect as the primary outcome measure in a trial and deal with the issue of multiplicity in an efficient manner, avoiding the need for arbitrary choices.”

(Freemantle 2003, at 2558a-b)

The “indecision” rationale has also been criticized:

“Inability to reach consensus on a single outcome is generally not a good reason to use a composite end point.”

(Neaton 2005, at 569b)

 

C.  Validity of Composite End Points

The validity of composite end points depends upon assumptions, which will have to be made at the time of the study design and protocol creation.  After the data are collected and analyzed, the assumptions may or may not be supported.

“The validity of composite end points depends on

  • similarity in patient importance,
  • [similarity in] treatment effect, and
  • number of events across the components.”

(Montori 2005, at 596, Summary Point No. 2)

“Use of composite end points is usually justified by the assumption that the effect on each of the components will be similar and that patients will attach similar importance to each component.”

(Montori 2005, at 594a, paragraph 2)

 

D.  Role of Mechanism in Justifying Composite End Points

A composite end point will obviously make sense when the individual end points are biologically related, and the investigators reasonably expect that the individual end points would be affected in the same direction, and in the same approximate amount.

“Confidence in a composite end point rests partly on a belief that similar reductions in relative risk apply to all the components.  Investigators should therefore construct composite endpoints in which the biology would lead us to expect similar effects across components.”

(Montori 2005, at 595b)

 

E.  Methodological Issues

The acceptability of composite end points is often a delicate balance between the statistical power and efficiency gained and the reliability concerns raised by using the composite.  As with any statistical or interpretative tool, the key questions revolve around how the tool is used, and for what purpose.  The reliability issues raised by the use of composites are likely to be highly contextual.

For instance, there is an important asymmetry between justifying the use of a composite for measuring efficacy and the use of the same composite for safety outcomes.  A biological improvement in type 2 diabetes might be expected to lead to a reduction in all the macrovascular complications of that disease, but a medication for type 2 diabetes might have a very specific toxicity or drug interaction, which affects only one constituent end point, such as myocardial infarction, among all macrovascular complications.  The asymmetry between efficacy and safety outcomes is specifically addressed in a recent publication:

“Varying definitions of composite end points, such as MACE, can lead to substantially different results and conclusions.  Therefore, the term MACE, in particular, should not be used, and when composite study end points are desired, researchers should focus separately on safety and effectiveness outcomes, and construct separate composite end points to match these different clinical goals.”

(Kip 2008, 701, Abstract – Conclusions)(emphasis in original)
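
The safety side of this asymmetry can be shown with a short calculation.  The sketch below is hypothetical in every number: it assumes three mutually exclusive components with equal 2% control-arm rates, and a drug that raises the risk of myocardial infarction by half while leaving stroke and death untouched.  The function name composite_relative_risk is likewise invented for illustration.

```python
def composite_relative_risk(control_rates, component_rr):
    """Relative risk for a composite of mutually exclusive components,
    given control-arm rates and per-component relative risks.
    Components missing from component_rr are assumed unaffected (RR = 1)."""
    control_total = sum(control_rates.values())
    treated_total = sum(
        rate * component_rr.get(name, 1.0) for name, rate in control_rates.items()
    )
    return treated_total / control_total


# Hypothetical control-arm event rates for a three-part composite.
control_rates = {"mi": 0.02, "stroke": 0.02, "death": 0.02}

# Hypothetical toxicity confined to one component: MI risk raised by 50%.
rr_composite = composite_relative_risk(control_rates, {"mi": 1.5})

print(f"relative risk for MI alone: 1.50")
print(f"relative risk for the composite: {rr_composite:.2f}")
```

On these assumptions, a 50% increase in one component shows up as an increase of only about 17% in the composite, a smaller signal that a study may lack the power to detect.  A composite that serves well for efficacy can thus mask a component-specific harm.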

The medical literature contains many clear statements cautioning consumers of medical studies against being misled by claims based upon composite end points.  Several years ago, the British Medical Journal published a paper by Montori, et al., “Users’ guide to detecting misleading claims in clinical research reports,” 329 Brit. Med. J. 1093 (2004).  The authors distill their advice down to six suggestions, one of which deals explicitly with composite end points:

“Guide to avoid being misled by biased presentation and interpretation of data

1.  Read only the Methods and Results sections; bypass the Discussion section

2.  Read the abstract reported in evidence based secondary publications

3.  Beware faulty comparators

4.  Beware composite endpoints

5.  Beware small treatment effects

6.  Beware subgroup analyses”

Id. at 1093a (emphasis added).  The authors elaborate on the problems that arise from the use of composite end points:

“Problems in the interpretation of these trials arise when composite end points include component outcomes to which patients attribute very different importance… .”

(Montori 2004, at 1094b.)

“Problems may also arise when the most important end point occurs infrequently or when the apparent effect on component end points differs.”

(Montori 2004, at 1095a.)

“When the more important outcomes occur infrequently, clinicians should focus on individual outcomes rather than on composite end points.  Under these circumstances, inferences about the end points (which because they occur infrequently will have very wide confidence intervals) will be weak.”

(Montori 2004, at 1095a.)

“When large variations exist between components the composite end point should be abandoned.”

(Montori 2005, at 596, Summary Point No. 3)

“Occasionally, composite end points prove useful and informative for clinical decision making.  Often, they do not.”

(Montori 2005, at 596, Conclusions)

“Composite endpoints frequently lack clinical relevancy.  Thus, composite endpoints address multiplicity and generally yield statistical efficiency at the risk of creating interpretational difficulties.”

(Schulz & Grimes 2005, at 1593a-b)

“The disadvantages of composite outcomes may arise when the constituents do not move in line with each other.”

(Freemantle 2003, at 2558a)

“Composite end points, as currently used in cardiovascular trials, may often be misleading.”

(Ferreira-Gonzalez 2007, p. 6a, Box)

“Trialists should report complete data on individual component end points to facilitate appropriate interpretation; clinicians should view with caution the results of cardiovascular trials that use composite end points to report their results.”

(Ferreira-Gonzalez 2007, p. 7a)

 

F.  Methodological Issues Concerning Causal Inferences from Composite End Points to Individual End Points

Several authors have criticized pharmaceutical companies for using composite end points to “game” their trials.  Composites allow smaller sample size, but they lend themselves to broader claims for outcomes included within the composite.  The same criticism appears to be valid when applied to attempts to infer that there is risk of an individual endpoint based upon a showing of harm in the composite endpoint.

“If a trial report specifies a composite endpoint, the components of the composite should be in the well-known pathophysiology of the disease.  The researchers should interpret the composite endpoint in aggregate rather than as showing efficacy of the individual components.  However, the components should be specified as secondary outcomes and reported beside the results of the primary analysis.”

(Schulz & Grimes 2005, at 1595a)(emphasis added)

“[A] positive result for a composite outcome applies only to the cluster of events included in the composite and not to the individual components.”

(Freemantle & Calvert 2007, at 757a) [Freemantle and Calvert urge “health warnings” that a composite end point benefit cannot be interpreted to mean an actual benefit in every constituent end point.]

“To avoid the burying of important components of composite primary outcomes for which on their own no effect is discerned, . . . the components of a composite outcome should always be declared as secondary outcomes, and the results described alongside the result for the composite outcome.”

(Freemantle 2003, at 2559a, Point No. 3; 2559b-c, Box)

“Authors and journal editors should ensure that the reporting of composite outcomes is clear and avoids the suggestion that individual components of the composite have been demonstrated to be effective.”

(Freemantle 2003, at 2559b-c, Box Point No. 4)

 

G.  Regulatory Experience

“Regulatory behavior may have led to the addition of ‘death’ to many composite primary end points used in trials, and it is our experience that the Food and Drug Administration has actively promoted the use of such composite outcome measures in the heart failure trials.”

(Freemantle & Calvert 2007, at 757a)

The FDA addressed composite end points in the context of its recommendations for looking at cardiovascular outcomes in Phase III and Phase IV clinical trials for anti-diabetic therapies.

“In cardiovascular trials, as in all trials, the primary endpoint should be predefined, justified, and accurately captured and analyzed. Powering the study on an individual type of event (e.g., myocardial infarction) is usually not feasible because of low incidence rates. Therefore, many cardiovascular trials use the MACE (Major Adverse Cardiovascular Event) composite endpoint, which contains all-cause mortality (or cardiovascular death), non-fatal myocardial infarction, and stroke. Some cardiovascular trials include other macrovascular events, such as coronary revascularization and lower-extremity amputation. Use of all-cause mortality as part of the MACE endpoint in a trial with excellent follow-up has the advantage of certainty as to whether the event occurred. However, the cause of death should still be determined in a well-designed trial to ensure that there are no imbalances in particular fatal events (e.g., neoplasms or strokes). Use of cardiovascular death as part of the MACE endpoint may be more relevant but, like myocardial infarction and stroke, requires adjudication by an independent and blinded committee with pre-specified case definitions and methodology for ascertaining events (e.g., access to medical records and laboratory data).  If the study is powered on a composite endpoint, there will likely be too few events for the individual components (e.g., acute myocardial infarction) of the composite to provide conclusive evidence of a difference between treatment groups with regard to these individual endpoints. In addition, a difference between treatment groups in the composite endpoint may primarily be driven by one or more of the individual components that comprise the endpoint. As a result, secondary efficacy measures often include analyses of the individual components as initial and total events to determine their contribution to the overall primary efficacy results.”

(FDA Background Introductory Memorandum, for Endocrinologic and Metabolic Drugs Advisory Committee meeting, July 1-2, 2008, at p. 17 – 18.)

 

H.  Specific Composite End Points

1.  Myocardial ischemia 

In the Avandia litigation, some investigators chose to look at a composite of “myocardial ischemia.”  Plaintiffs’ counsel, and even some publications, appear to equate a finding of this composite end point with one of myocardial infarction.  For instance, Curt Furberg equated MI with myocardial ischemia in a JAMA publication of his meta-analysis of rosiglitazone trials.  See, e.g., Singh, et al., “Long-term risk of cardiovascular events with rosiglitazone:  a meta-analysis,” 298 JAMA 1189, 1193 (2007)(“Two previous meta-analyses showed that the risk of MI was significantly increased by rosiglitazone. An unpublished meta-analysis (ZM 2005/00181/01) conducted in 2005 involving 14,237 participants from 42 double-blind RCTs determined the incidence of MI in the rosiglitazone group to be 1.99% vs. 1.51% in controls (hazard ratio, 1.31; 95% CI, 1.01-1.70).”)(emphasis added; internal references omitted).  From his endnotes, it is clear that Furberg is referencing GlaxoSmithKline’s own meta-analysis, which used myocardial ischemia, not MI, as an end point.  See Alexander Cobitz, et al., “A retrospective evaluation of congestive heart failure and myocardial ischemia events in 14 237 patients with type 2 diabetes mellitus enrolled in 42 short-term, double-blind, randomized clinical studies with rosiglitazone,” 17 Pharmacoepi. and Drug Safety 769 (2008) (reporting GSK’s meta-analysis of 42 clinical trials for a broad definition of myocardial ischemia).  Furberg’s confusion seems the sort of carelessness that trial judges should be alert to guard against.

Myocardial ischemia may be variously defined, but at a minimum it may include MI and angina.  Sometimes revascularization is added.  Subjective symptoms as vague as “dyspnea,” or as specific as sub-sternal pain, may be part of the definition.  A definition of myocardial ischemia used in an exploratory, hypothesis-generating analysis, for purposes of “pharmacovigilance,” may have different validity and operational characteristics from a definition used in a study that is trying to determine whether a medication does, in fact, cause any one of the constituent end points within the composite.

2.  MACE

Recently, the use of the MACE composite end point has been subjected to greater scrutiny and criticism.  Kip summarizes his group’s recent analysis:

“In light of the approximate prior 15 years of the term MACE and its wide heterogeneity in definition and research applications, it is unlikely that a consensus definition will either be universally desired or practical for future research.  Therefore, we recommend against the routine use of MACE as a composite end point at large.  However, if a broad heterogeneous composite end point such as MACE is ultimately desired, minimally, it must be clearly defined, and the individual as well as composite end points need to be analyzed, presented, and discussed.”

(Kip 2008, at 706b)

Kip notes that his group’s recommendations are consistent with those of the Academic Research Consortium, which has tried to establish consensus composite end point definitions for stent trials.  See Cutlip, et al., “Clinical end points in coronary stent trials:  a case for standardized definitions,” 115 Circulation 2344 (2007).

3.  Cardiovascular or cardiac death

The use of a composite end point of cardiac death has elicited some strong criticism in the published literature, most notably from Dr. Nissen’s former colleague, Dr. Eric Topol.  See generally, Lauer & Topol, “Clinical trials – Multiple treatments, multiple end points, and multiple lessons,” 289 JAMA 2575 (2003).

“Among fatal end points, only all-cause mortality can be considered objective, unbiased, and clinically relevant.  As previously reviewed in depth, the use of end points such as ‘cardiac death’, ‘vascular death’, and ‘arrhythmic death’ are inherently subject to error due to biased assessment and to the biological complexities of disease, especially among elderly individuals.”

(Lauer & Topol 2003, at 2575b)

“When mortality is considered, only all-cause mortality is a valid end point, while end points such as ‘cardiac death’ and ‘arrhythmic death’ should be actively discouraged.”

(Lauer & Topol 2003, at 2577a)

4.  All-cause death

Although most authors accept “any death” as a potential corrective to competing risks, and as the ultimate, objective outcome, Lauer and Topol do not completely spare the inclusion of all-cause death in outcome composites from criticism:

“A composite end point that includes death as well as nonfatal events is subject to biases related to competing risks.  Obviously, patients who die cannot later experience nonfatal myocardial infarction or be hospitalized.  A treatment that leads to an increased risk of death may therefore appear to reduce the risk of nonfatal events.  Although formal methods have been developed to analyze competing risks in an unbiased manner, the optimal approach to this problem is unclear.”

(Lauer & Topol 2003, at 2576a)
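
The competing-risks point made by Lauer and Topol can be illustrated with hypothetical arithmetic.  The sketch below makes a strong simplifying assumption, namely that every death occurs before a nonfatal myocardial infarction could have occurred, and it uses invented rates: a treatment that doubles the risk of death (from 10% to 20%) but has no effect at all on the risk of MI among survivors (10% in both arms).

```python
def arm_summary(p_death, p_mi_if_alive):
    """Observed event proportions in one trial arm, assuming that deaths
    precede, and so preclude, any nonfatal myocardial infarction."""
    nonfatal_mi = (1 - p_death) * p_mi_if_alive  # only survivors can have an MI
    return {
        "death": p_death,
        "nonfatal_mi": nonfatal_mi,
        "composite": p_death + nonfatal_mi,  # death or nonfatal MI
    }


# Hypothetical arms: the treatment doubles mortality and does nothing to MI risk.
control_arm = arm_summary(p_death=0.10, p_mi_if_alive=0.10)
treated_arm = arm_summary(p_death=0.20, p_mi_if_alive=0.10)

for outcome in ("death", "nonfatal_mi", "composite"):
    print(f"{outcome}: control {control_arm[outcome]:.2f}, treated {treated_arm[outcome]:.2f}")
```

On these assumed numbers, nonfatal MI falls from 9% to 8% in the treated arm solely because fewer patients survive to experience it, while the composite of death or MI rises from 19% to 28%.  Read alone, the nonfatal component would suggest a benefit that does not exist.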

 

I.  Bibliography

Cutlip, et al., “Clinical end points in coronary stent trials:  a case for standardized definitions,” 115 Circulation 2344 (2007).

FDA Background Introductory Memorandum, for Endocrinologic and Metabolic Drugs Advisory Committee meeting (July 1-2, 2008).

Ferreira-Gonzalez, et al., “Problems with the use of composite end points in cardiovascular trials:  systematic review of randomized controlled trials,” 334 Brit. Med. J. (published online 2 April 2007).

R. Fletcher & S. Fletcher, Clinical Epidemiology:  The Essentials (4th ed. 2005).

Freemantle, et al., “Composite outcomes in randomized trials:  Greater precision but with greater uncertainty,” 289 J. Am. Med. Ass’n 2554 (2003).

Freemantle & Calvert, “Composite and surrogate outcomes in randomized controlled trials,” 334 Brit. Med. J. 756 (2007).

International Conference on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use, “ICH harmonized tripartite guideline:  statistical principles for clinical trials,” 18 Stat. Med. 1905 (1999).

Kip, et al., “The problem with composite end points in cardiovascular studies,” 51 J. Am. Coll. Cardiol. 701 (2008).

Lauer & Topol, “Clinical trials – Multiple treatments, multiple end points, and multiple lessons,” 289 J. Am. Med. Ass’n 2575 (2003).

Montori, et al., “Users’ guide to detecting misleading claims in clinical research reports,” 329 Brit. Med. J. 1093 (2004).

Montori, et al., “Validity of composite end points in clinical trials,” 330 Brit. Med. J. 594 (2005).

Neaton, et al., “Key issues in end point selection for heart failure trials:  composite end points,” 11 J. Cardiac Failure 567 (2005).

Schulz & Grimes, “Multiplicity in randomized trials I:  endpoints and treatments,” 365 Lancet 1591 (2005).