Meta-Meta-Analysis – Celebrex Litigation – The Claims – Part 2


As I noted in part one, the tables were turned on imputation, with plaintiffs making the same accusation that G.E. made in the gadolinium litigation:  imputation involves adding “phantom events” or “imaginary events to each arm of ‘zero event’ trials.”  See Plaintiffs’ Reply Mem. of Law in Further Support of Their Motion to Exclude Expert Testimony by Defendants’ Expert Dr. Lee-Jen Wei at 8, 9 (May 5, 2010), in Securities Litig.

The plaintiffs claimed that Wei “created” an artifact of a risk ratio of 1.0 by using imputation in each of the zero-event trials.  The reality, however, is that each of those trials had zero risk difference, and the event rates in the drug and placebo arms were both low and equal to one another.  The plaintiffs’ claim that Wei “diluted” the risk is little more than saying that he failed to inflate the risk by excluding zero-event trials.  But zero-event trials represent a test in which the risk of events in both arms is equal, and relatively low.

The plaintiffs seemed to make their point half-heartedly.  They admitted that “imputation in and of itself is a commonly used methodology,” id. at 10, but they claimed that “adding zero-event trials to a meta-analysis is debated among scientists.”  Id.  A debate over methodology in the realm of meta-analysis procedures hardly makes any one of the debated procedures “not generally accepted,” especially in the context of meta-analysis of uncommon adverse events arising in clinical trials designed for other outcomes.  After all, investigators do not design trials to assess a suspected causal association between a medication and an adverse outcome as their primary outcome.  The debate over the ethics of such a trial would be much greater than any gentle debate over whether to include zero-event trials by using either the risk difference or imputation procedures.

The gravamen of the plaintiffs’ complaint against Wei seems to be that he included too many zero-event trials, “skewing the numbers greatly, and notably cites to no publications in which the dominant portion of the meta-analysis was comprised of studies with no events.”  Id. The plaintiffs further argued that Wei could have minimized the “distortion” created by imputation by using a fractional event, “a smaller number like .000000001 to each trial.”  Id. The plaintiffs notably cited no texts or articles for this strategy.  In any event, if the zero-event trials are small, as they typically are, then they will have large study variances.  Because meta-analyses typically weight each trial by the inverse of the variance, studies with large variances have little weight in the summary estimate of association.  Including small studies with imputation methods will generally not affect the outcome very much, and their contribution may well reflect the reality of lower or non-differential risk from the medication.
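The weighting point can be illustrated with a minimal sketch.  The trial counts below are invented for illustration, and the fixed-effect, inverse-variance pooling of log risk ratios with a 0.5 continuity correction is an assumption of the sketch, not a description of the method any expert in the litigation actually used.

```python
import math

def log_rr_and_variance(events_t, n_t, events_c, n_c, correction=0.5):
    """Log risk ratio and its approximate variance for one trial.

    If either arm has zero events, `correction` is added to the event and
    non-event cells of both arms (so each arm size grows by 2 * correction).
    """
    if events_t == 0 or events_c == 0:
        events_t += correction
        events_c += correction
        n_t += 2 * correction
        n_c += 2 * correction
    log_rr = math.log((events_t / n_t) / (events_c / n_c))
    variance = 1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c
    return log_rr, variance

def pooled_rr(trials, correction=0.5):
    """Fixed-effect, inverse-variance pooled risk ratio and weight shares."""
    stats = [log_rr_and_variance(*t, correction=correction) for t in trials]
    weights = [1 / v for _, v in stats]
    total = sum(weights)
    pooled_log = sum(w * lr for w, (lr, _) in zip(weights, stats)) / total
    return math.exp(pooled_log), [w / total for w in weights]

# Hypothetical data: (events_treated, n_treated, events_control, n_control)
trials = [
    (30, 2000, 20, 2000),  # one large trial with events
    (0, 50, 0, 50),        # small zero-event trial
    (0, 40, 0, 40),        # small zero-event trial
]

rr_all, shares = pooled_rr(trials)                 # 0.5 correction
rr_large_only, _ = pooled_rr(trials[:1])           # zero-event trials dropped
rr_tiny, shares_tiny = pooled_rr(trials, correction=1e-9)

print(f"large trial only:      RR = {rr_large_only:.3f}")
print(f"with zero-event trials: RR = {rr_all:.3f}, weight shares = {shares}")
print(f"correction of 1e-9:     RR = {rr_tiny:.3f}")
```

In this sketch the two small zero-event trials together carry only a few percent of the weight, so the pooled risk ratio moves only slightly toward 1.0.  Under this particular weighting scheme, a correction as small as .000000001 inflates the variance of each zero-event trial so much that its weight is negligible, which is arithmetically close to excluding the trial altogether.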

Eliminating trials on the grounds that they had zero events has also been criticized for throwing away important data.  See Charles H. Hennekens, David L. DeMets, C. Noel Bairey Merz, Steven L. Borzak, Jeffrey S. Borer, “Doing More Harm Than Good,” 122 Am. J. Med. 315 (2009) (criticizing Nissen’s meta-analysis of rosiglitazone, in which he excluded zero-event trials, as biased towards overestimating the magnitude of the summary estimate of association); George A. Diamond, L. Bax, S. Kaul, “Uncertain effects of rosiglitazone on the risk for myocardial infarction and cardiovascular death,” 147 Ann. Intern. Med. 578 (2007) (conducting sensitivity analyses on Nissen’s meta-analysis of rosiglitazone to show that Nissen’s findings lost statistical significance when continuity corrections were made for zero-event trials).



The plaintiffs are correct that the risk difference is not the predominant risk measure used in meta-analysis or in clinical trials for that matter.  Researchers prefer risk ratios because they reflect base rates in the ratio.  As one textbook explains:

“the limitation of the [risk difference] statistic is its insensitivity to base rates. For example, a risk that increases from 50% to 52% may be less important than one that increases from 2% to 4%, although in both instances RD = 0.02.”

Julia Littell, Jacqueline Corcoran, and Vijayan Pillai, Systematic Reviews and Meta-Analysis 85 (Oxford 2008).  This feature of the risk difference hardly makes its use unreliable, however.
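The textbook’s two scenarios can be worked through directly.  The following is a small illustrative sketch; the function names are hypothetical and the only inputs are the percentages from the quoted passage.

```python
def risk_difference(risk_exposed, risk_control):
    """Absolute difference in risk between the two arms."""
    return risk_exposed - risk_control

def risk_ratio(risk_exposed, risk_control):
    """Relative risk: exposed risk divided by control risk."""
    return risk_exposed / risk_control

# (control risk, exposed risk) pairs taken from the textbook example
scenarios = [(0.50, 0.52), (0.02, 0.04)]

for control, exposed in scenarios:
    rd = risk_difference(exposed, control)
    rr = risk_ratio(exposed, control)
    print(f"{control:.0%} -> {exposed:.0%}: RD = {rd:.2f}, RR = {rr:.2f}")
```

Both scenarios yield a risk difference of 0.02, while the risk ratios are roughly 1.04 and 2.0, which is the base-rate sensitivity the textbook describes.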

Pfizer pointed out that at least one other case addressed the circumstances in which the risk difference would be superior to risk ratios in meta-analyses:

“The risk difference method is often used in meta-analyses where many of the individual studies (which are all being pooled together in one, larger analysis) do not contain any individuals who developed the investigated side effect.FN17  Whereas such studies would have to be excluded from an odds ratio calculation, they can be included in a risk difference calculation. FN18

FN17. This scenario is more likely to occur when studying a particularly rare event, such as suicide.

FN18. Studies where no individuals experienced the effect must be excluded from an odds ratio calculation because their inclusion would necessitate dividing by zero, which, as perplexed middle school math students come to learn, is impossible. The risk difference’s reliance on subtraction, rather than division, enables studies with zero incidences to remain in a meta-analysis. (Hr’g Tr. 310-11, June 20, 2008 (Gibbons.)).”

In re Neurontin Marketing, Sales Practices, and Products Liab. Litig.,  612 F.Supp. 2d 116, 126 (D. Mass. 2009) (MDL 1629).  See Pfizer’s Defendants’ Mem. of Law in Opp. to Plaintiffs’ Motion to Exclude Expert Testimony by Defendants’ Expert Dr. Lee-Jen Wei (Sept. 8, 2009), in Securities Litig. (citing In re Neurontin).
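The arithmetic behind the court’s footnote can be sketched as follows.  The trial counts are invented, and the Mantel-Haenszel weighting used to pool the risk differences is an assumption chosen for illustration; nothing in the opinion or the briefs indicates which pooling method any witness used.

```python
def odds_ratio(events_t, n_t, events_c, n_c):
    """Odds ratio for one trial; fails when the control arm has zero events."""
    odds_t = events_t / (n_t - events_t)
    odds_c = events_c / (n_c - events_c)
    return odds_t / odds_c

def risk_difference(events_t, n_t, events_c, n_c):
    """Risk difference for one trial; defined even with zero events."""
    return events_t / n_t - events_c / n_c

def mantel_haenszel_rd(trials):
    """Mantel-Haenszel pooled risk difference across trials."""
    numerator = 0.0
    denominator = 0.0
    for events_t, n_t, events_c, n_c in trials:
        total = n_t + n_c
        numerator += (events_t * n_c - events_c * n_t) / total
        denominator += n_t * n_c / total
    return numerator / denominator

# Hypothetical data: (events_treated, n_treated, events_control, n_control)
trials = [
    (30, 2000, 20, 2000),  # large trial with events
    (0, 50, 0, 50),        # zero-event trial
]

try:
    odds_ratio(*trials[1])
    or_defined = True
except ZeroDivisionError:
    or_defined = False  # 0/0: the odds ratio cannot be computed

rd_zero_event = risk_difference(*trials[1])   # 0.0, perfectly well defined
rd_large_only = mantel_haenszel_rd(trials[:1])
rd_pooled = mantel_haenszel_rd(trials)

print(or_defined, rd_zero_event, rd_large_only, rd_pooled)
```

In this sketch the zero-event trial adds nothing to the numerator but does add to the denominator, so it pulls the pooled risk difference modestly toward zero rather than being discarded.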

Pfizer also pointed out that Wei had employed both the risk ratio and the risk difference in conducting his meta-analyses, and that none of his summary estimates of association were statistically significant.  Id. at 19, 24.


The plaintiffs argued that the use of “exact confidence” intervals was not scientifically reliable and could not have been used by Pfizer at the time period covered by the securities class’s allegations.  See Plaintiffs’ Reply Mem. of Law in Further Support of Their Motion to Exclude Expert Testimony by Defendants’ Expert Dr. Lee-Jen Wei at 15 (May 5, 2010).  Exact intervals, however, are hardly a novelty, and there is often no single way to calculate a confidence interval.  See E. B. Wilson,  “Probable inference, the law of succession, and statistical inference,” 22 J. Am. Stat. Ass’n 209 (1927); C. Clopper, E. S. Pearson, “The use of confidence or fiducial limits illustrated in the case of the binomial,” 26 Biometrika 404 (1934).  Approximation methods are often used, despite their lack of precision, because of their ease in calculation.
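To make the contrast between exact and approximate intervals concrete, here is a minimal sketch for a single binomial proportion, comparing the Clopper-Pearson exact interval with the Wilson and simple normal-approximation (Wald) intervals.  It is only an illustration of the two 1927 and 1934 papers’ single-proportion setting, not the exact procedure for risk differences across trials that was at issue in the litigation, and the 0-in-100 example is invented.

```python
import math
from statistics import NormalDist

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def _root(f, lo=0.0, hi=1.0, iters=60):
    """Bisection for a monotone function that changes sign on [lo, hi]."""
    f_lo = f(lo)
    for _ in range(iters):
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return (lo + hi) / 2

def clopper_pearson_interval(x, n, alpha=0.05):
    """Exact interval obtained by inverting the binomial tail probabilities."""
    lower = 0.0 if x == 0 else _root(
        lambda p: (1 - binom_cdf(x - 1, n, p)) - alpha / 2)
    upper = 1.0 if x == n else _root(
        lambda p: binom_cdf(x, n, p) - alpha / 2)
    return lower, upper

def wilson_interval(x, n, alpha=0.05):
    """Wilson score interval."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    p = x / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

def wald_interval(x, n, alpha=0.05):
    """Normal-approximation interval; collapses to a point when x == 0."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    p = x / n
    se = math.sqrt(p * (1 - p) / n)
    return p - z * se, p + z * se

# Hypothetical arm: zero events among 100 patients
for name, fn in [("exact", clopper_pearson_interval),
                 ("Wilson", wilson_interval),
                 ("Wald", wald_interval)]:
    lo, hi = fn(0, 100)
    print(f"{name:>6}: ({lo:.4f}, {hi:.4f})")
```

With zero observed events, the simple approximation degenerates to a zero-width interval, while the exact and Wilson intervals both give a usable upper bound.  That behavior is one reason exact methods are attractive for rare events, despite the heavier computation.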

Plaintiffs further claimed that the combination of risk difference and exact intervals is novel, not reliable, and not in existence during the class period.  Plaintiffs’ Reply Mem. at 15.  The plaintiffs’ argument traded on Wei’s having published on the use of exact intervals in conjunction with the risk difference for heart attacks in clinical trials of Avandia.  See L. Tian, T. Cai, M.A. Pfeffer, N. Piankov, P.Y. Cremieux, and L.J. Wei, “Exact and efficient inference procedure for meta-analysis and its application to the analysis of independent 2 x 2 tables with all available data but without artificial continuity correction,” 10 Biostatistics 275 (2009).  Their argument ignored that Wei combined two well-understood statistical techniques, in a transparent way, with empirical testing of the validity of his approach.  Contrary to plaintiffs’ innuendo, Wei did not develop his approach as an expert witness for GlaxoSmithKline; a version of the manuscript describing his approach was posted online well before he was ever contacted by GSK counsel (L.J. Wei, personal communication).  Plaintiffs also claimed that Wei’s use of exact intervals for risk difference showed no increased risk of heart attack for Avandia, contrary to a well-known meta-analysis by Dr. Steven Nissen.  See Steven E. Nissen, M.D., and Kathy Wolski, M.P.H., “Effect of Rosiglitazone on the Risk of Myocardial Infarction and Death from Cardiovascular Causes,” 356 New Engl. J. Med. 2457, 2457 (2007).  This claim, however, is a crude distortion of Wei’s paper, which showed that there was a positive risk difference for heart attacks in the same dataset used by Nissen, but the confidence intervals included zero (no risk difference), and thus chance could not be excluded as explaining Nissen’s result.



Pfizer was ultimately successful in defending the Celebrex litigation on the basis of lack of risk associated with 200 mg/day use.  Pfizer also attempted to argue a duration effect on grounds that in one large trial that saw a statistically significant hazard ratio associated with higher doses, the result occurred for the first time among trial participants on medication, at 33 months into the trial.  Judge Breyer rejected this challenge, without explanation.  In re Bextra & Celebrex Marketing Sales Practices & Prod. Liab. Litig., 524 F.Supp. 2d 1166, 1183 (2007).  The reasonable inference, however, is that the meta-analyses showed statistically significant results across trials with less duration of use, for 400 mg and 800 mg/day use.

Clearly duration of use is a potential consideration unless the mechanism of causation is such that a causally related adverse event would occur from the first use or very short-term use of the medication.  See In re Vioxx Prods. Liab. Litig., MDL No. 1657, 414 F. Supp. 2d. 574, 579 (E.D. La. 2006) (“A trial court may consider additional factors in assessing the scientific reliability of expert testimony . . . includ[ing] whether the expert’s opinion is based on incomplete or inaccurate dosage or duration data.”).  In the Celebrex litigation, plaintiffs’ counsel appeared to want to have duration effects both ways; they did not want to disenfranchise plaintiffs whose claims turned on short-term use, but at the same time, they criticized Professor Wei for including short-term trials of Celebrex.

One form that the plaintiffs’ criticism of Wei took was his failure to weight the trials included in his meta-analyses by duration.  In the plaintiffs’ words:

“Wei failed to utilize important information regarding the duration of the clinical trials that he analyzed, information that is critical to interpreting and understanding the Celebrex and Bextra safety information that is contained within those clinical trials.3 Because the types of cardiovascular events that are at issue in this case occur relatively rarely and are more likely to be observed after an extended period of exposure, the scientific community is in agreement that they would not be expected to appear in trials of very short duration.”

Plaintiffs’ Mem. of Law in Support of Their Motion to Exclude Expert Testimony by Defendants’ Expert Dr. Lee-Jen Wei at 2 (July 23, 2009), submitted in In re Pfizer, Inc. Securities Litig., Nos. 04 Civ. 9866(LTS)(JLC), 05 md 1688(LTS) (S.D.N.Y.)[hereafter Securities Litig.]  The plaintiffs maintained that Wei’s meta-analyses were “fatally flawed” because he ignored trial duration, such as would be factored in by performing the analyses in terms of patient years.  Id. at 3.

Many of the sources cited by plaintiffs do not support their argument. For instance, the plaintiffs cited articles that noted that weighted averages should be used, but virtually all methods, including Wei’s, weight studies by their variance, which takes into account sample size. Id. at 9 n.3, citing Egger, et al. “Meta-analysis: Principles and Procedures,” 315 Brit. Med. J. 1533 (1997) (an arithmetic average from all trials gives misleading results as results from small studies are more subject to the play of chance and should be given less weight. Meta-analyses use weighted results in which larger trials have more influence than smaller ones). See also id. at 22.  True, true, and immaterial.  No one in the Celebrex cases was using an arithmetic average of risk across trials or studies.

Most of the short-term studies were small, and thus contributed little to the overall summary estimate of association.  Some of the plaintiffs’ citations actually supported using “individual patient data” in the form of time-to-event analyses, which was not possible with many of the clinical trials available.  Indeed, the article the plaintiffs cited, by Dahabreh, did not use time-to-event data for rosiglitazone, because such data were not generally available.  Id. at 9 n.3, citing Dahabreh, “Meta-Analysis Of Rare Events: An Update And Sensitivity Analysis Of Cardiovascular Events In Randomized Trials Of Rosiglitazone,” 5 Clinical Trials 116 (2008).

The plaintiffs’ claim was thus a fairly weak challenge to using simple 2 x 2 tables for the included studies in Wei’s meta-analysis. Both sides failed to mention that many published meta-analyses eschew “patient years” in favor of a simple odds ratio for dichotomous count data from each included study.  See, e.g., Steven E. Nissen, M.D., and Kathy Wolski, M.P.H., “Effect of Rosiglitazone on the Risk of Myocardial Infarction and Death from Cardiovascular Causes,” 356 New Engl. J. Med. 2457, 2457 (2007)(using Peto method with count data, for fixed effect model).  Patient years would be a crude tool to modify the fairly common 2 x 2 table.  The analysis for large studies, with a high number of patient years, would still not reveal whether the adverse events occurred early or late in the trials.  Only a time-to-event analysis could provide the missing information about “duration,” and neither side’s expert witnesses appeared to use a time-to-event analysis.
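For readers unfamiliar with how a count-based method such as the Peto one-step odds ratio works on simple 2 x 2 tables, the following is a minimal sketch under stated assumptions.  The trial counts are invented, the function name is hypothetical, and the code is a generic fixed-effect Peto calculation rather than a reconstruction of any published analysis.

```python
import math

def peto_odds_ratio(trials):
    """Fixed-effect Peto one-step pooled odds ratio from 2 x 2 count data.

    Each trial is (events_treated, n_treated, events_control, n_control).
    Raises ZeroDivisionError if no trial has any events.
    """
    sum_o_minus_e = 0.0
    sum_v = 0.0
    for events_t, n_t, events_c, n_c in trials:
        total = n_t + n_c
        events = events_t + events_c
        expected = n_t * events / total  # expected treated-arm events
        variance = (n_t * n_c * events * (total - events)
                    / (total**2 * (total - 1)))  # hypergeometric variance
        sum_o_minus_e += events_t - expected
        sum_v += variance
    return math.exp(sum_o_minus_e / sum_v)

# Hypothetical data: the second trial has no events in either arm
trials = [
    (30, 2000, 20, 2000),
    (0, 50, 0, 50),
]

or_with_zero_trial = peto_odds_ratio(trials)
or_without_zero_trial = peto_odds_ratio(trials[:1])
print(or_with_zero_trial, or_without_zero_trial)
```

Note that only event counts and arm sizes enter the calculation; neither trial duration nor the timing of events plays any role.  In this sketch a trial with no events in either arm contributes zero to both sums, so it drops out of the pooled estimate automatically.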

Interestingly, plaintiffs’ expert witness, Prof. Madigan, appears to have received the patient-level data from Pfizer’s clinical trials, but still did not conduct a time-to-event analysis.  Plaintiffs’ Mem. of Law in Support of Their Motion to Exclude Expert Testimony by Defendants’ Expert Dr. Lee-Jen Wei at 12 (July 23, 2009), submitted in In re Pfizer, Inc. Securities Litig., Nos. 04 Civ. 9866(LTS)(JLC), 05 md 1688(LTS) (S.D.N.Y.)[hereafter Securities Litig.] (noting that Madigan had examined all SAS data files produced by Pfizer, and that “[t]hese files contained voluminous information on each subject in the trials, including information about duration of exposure to the drug (or placebo), any adverse events experienced and a wide variety of other information.”).  Of course, even with time-to-event data from the Pfizer clinical trials, Madigan had the problem of whether to limit himself to just the Pfizer trials or use all the data, including non-Pfizer trials.  If he opted for completeness, he would have been forced to include trials for which he did not have underlying data.  In all likelihood, Madigan used patient-years in his analyses because he could not conduct a complete analysis with time-to-event data for all trials.

The plaintiffs’ point appears well taken if the court were to assume that there really was a duration issue, but the plaintiffs’ theories were to the contrary, and Pfizer lost its attempt to limit claims to those events that appeared 33 months (or some other fixed time) after first ingestion.  It is certainly correct that patient-year analyses, in the absence of time-to-event analyses, are generally preferred.  Pfizer had used patient-year information to analyze combined trials in its submission to the FDA’s Advisory Committee.  See Pfizer’s Submission of Advisory Committee Briefing Document at 15 (January 12, 2005).  See also  FDA Reviewer Guidance: Conducting a Clinical Safety Review of a New Product Application and Preparing a Report on the Review at 22 (2005); see also id. at 15 (“If there is a substantial difference in exposure across treatment groups, incidence rates should be calculated using person-time exposure in the denominator, rather than number of patients in the denominator.”);  R. H. Friis & T. A. Sellers, Epidemiology for Public Health Practice at 105 (2008) (“To allow for varying periods of observation of the subjects, one uses a modification of the formula for incidence in which the denominator becomes person-time of observation”).
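The difference between the two denominators can be shown with a minimal sketch.  The trial sizes, durations, and event counts below are invented for illustration, and the helper names are hypothetical.

```python
def cumulative_incidence(events, patients):
    """Events per patient: ignores how long each patient was observed."""
    return events / patients

def incidence_rate(events, person_years):
    """Events per person-year of observation."""
    return events / person_years

def summarize(trial):
    """Return (events per patient, events per person-year) for one trial."""
    person_years = trial["patients"] * trial["years_each"]
    return (cumulative_incidence(trial["events"], trial["patients"]),
            incidence_rate(trial["events"], person_years))

# Hypothetical trials with the same size and event count, different durations
short_trial = {"events": 6, "patients": 1000, "years_each": 12 / 52}  # 12 weeks
long_trial = {"events": 6, "patients": 1000, "years_each": 3.0}       # 3 years

short_ci, short_rate = summarize(short_trial)
long_ci, long_rate = summarize(long_trial)

print(f"short trial: {short_ci:.4f} per patient, {short_rate:.4f} per person-year")
print(f"long trial:  {long_ci:.4f} per patient, {long_rate:.4f} per person-year")
```

The two hypothetical trials look identical when the denominator is the number of patients, but the person-time rates differ by more than tenfold.  Even so, the person-time rate says nothing about whether the events came early or late in follow-up, which only a time-to-event analysis would show.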

Professor Wei chose not to do a “patient-year” analysis because such a methodological commitment would have required him to drop over a dozen Celebrex clinical trials involving thousands of patients, and dozens of heart attack and stroke events of interest.  Madigan’s approach led him to disregard a large amount of data.  Wei could, of course, have stratified the summary estimates for different length clinical trials, and analyzed whether there were differences as a function of trial duration.  Pfizer claimed that Wei conducted a variety of sensitivity analyses, but it is unclear whether he ever used this technique.  Wei should have been allowed in any event to take plaintiffs at their word that thrombotic events from Celebrex occurred shortly after first ingestion.   Pfizer Mem. of Law in Opp. to Plaintiffs’ Motion to Exclude Defendants’ Expert Dr. Lee-Jen Wei at 2 (Sept. 8, 2009), in Secur. Litig.



According to Pfizer, Professor Madigan reached different results from Wei’s largely because he had used different event counts and end points.  The defendants’ challenge to Madigan turned largely upon the unreliable way he went about counting events to include in his meta-analyses.

Data concerning unexpected adverse events in clinical trials often are collected as reports of treating physicians, whose descriptions may be incomplete, inaccurate, or inadequate.  When there is a suggestion that a particular adverse event – say heart attack – occurred more frequently in the medication arm as opposed to the placebo or comparator arms, the usual course of action is to have a panel of clinical experts review all the adverse event reports, and supporting medical charts, to provide diagnoses that can be used in a more complete statistical analysis.  Obviously, the reviewers should be blinded to the patients’ assignment to medication or placebo, and the reviewers should be clinical experts in the clinical specialty of the adverse event.  Cardiologists should be making the call for heart attacks.

In addition to event definition and adjudication, clinical trial interpretation sometimes leads to the use of “composite end points,” which consist of related diagnostic categories, aggregated in some way that makes biological sense.  For instance, if the concern is that a medication causes cardiovascular thrombotic events, a suitable cardiovascular composite end point might include heart attack and ischemic stroke.  Inclusion of hemorrhagic stroke, endocarditis, and valvular disease in the composite, however, would be inappropriate, given the concern over thrombosis.

Professor Madigan is a highly qualified statistician, but, as Pfizer argued, he had no clinical expertise to reassign diagnoses or determine appropriate composite end points.  The essence of the defendants’ challenges revolved around claims of flawed outcome and endpoint ascertainment and definitions.  According to Pfizer’s briefing, the event definition process was unblinded, and conducted by inexpert, partisan reviewers.  Madigan apparently relied upon the work of another plaintiffs’ witness, cardiologist Dr. Lawrence Baruch, as well as that of Dr. Curt Furberg.  Furberg was not a cardiologist; indeed he has never been licensed to practice medicine in the United States, and he had not treated a patient in over 30 years. Pfizer Mem. of Law in Opp. to Plaintiffs’ Motion to Exclude Defendants’ Expert Dr. Lee-Jen Wei at 29 (Sept. 8, 2009), in Secur. Litig.  Furthermore, Furberg was not familiar with current diagnostic criteria for heart attack.  Plaintiffs’ counsel asked Furberg to rework some but not all of Baruch’s classifications, but only for fatal events.  Baruch could not explain why Furberg made these reclassifications.  Furberg acknowledged that he had never used “one-line descriptions to classify events,” which he did in the Celebrex litigation, when he received the assignment from plaintiffs’ counsel on the eve of the Court’s deadline for disclosures.  Id. According to Pfizer, if the plaintiffs’ witnesses had used appropriate end points and event counts, their meta-analyses would not have differed from Professor Wei’s work.  Id.

Pfizer pointed to Madigan’s testimony to claim that he had admitted that, based upon the impropriety of Furberg’s changing end point definitions, and his own changes, made without the assistance of a clinician, he would not submit the earlier version of his meta-analysis for peer review.  Pfizer’s [Proposed] Findings of Fact and Conclusions of Law with Respect to Motion to Exclude Certain Plaintiffs’ Experts’ Opinions Regarding Celebrex and Bextra, and Plaintiffs’ Motion to Exclude Defendants’ Expert Dr. Lee-Jen Wei, Document 175, at 33, 43 (Dec. 4, 2009), submitted in Securities Litig.  The plaintiffs countered that Furberg’s reclassifications did not change Madigan’s reports, at least for certain years. Plaintiffs’ Reply Mem. of Law in Further Support of Their Motion to Exclude Expert Testimony by Defendants’ Expert Dr. Lee-Jen Wei at 18 (May 5, 2010), in Securities Litig.

The trial court denied Pfizer’s challenges to Madigan’s meta-analysis in the securities fraud class action.  The court attributed any weakness in the classification of fatal adverse events by Baruch and Furberg to the limitations of the underlying data created and produced by Pfizer itself.  In re Pfizer Inc. Securities Litig., 2010 WL 1047618, *4 (S.D.N.Y. 2010).



Pfizer also argued that Madigan put together composite outcomes that did not make biological sense in view of the plaintiffs’ causal theories.  For instance, Madigan left out strokes in his composite, although he included both heart attack and stroke in his primary end point for his Vioxx litigation analysis, and he had no reason to distinguish Vioxx and Celebrex in terms of claimed thrombotic effects.  Pfizer’s [Proposed] Findings of Fact and Conclusions of Law with Respect to Motion to Exclude Certain Plaintiffs’ Experts’ Opinions Regarding Celebrex and Bextra, and Plaintiffs’ Motion to Exclude Defendants’ Expert Dr. Lee-Jen Wei, Document 175, at 13-14, 18 (Dec. 4, 2009), submitted in Securities Litig.  According to Pfizer, Madigan’s composite was novel and unvalidated by relevant, clinical opinion.  Id. at 29, 33.

The plaintiffs’ response is obscure.  The plaintiffs seemed to claim that Madigan was justified in excluding strokes because some kinds of stroke, hemorrhagic strokes, are unrelated to thrombosis.  Plaintiffs’ Reply Memorandum of Law in Further Support of Their Motion to Exclude Expert Testimony by Defendants’ Expert Dr. Lee-Jen Wei at 14 (May 5, 2010), in Securities Litig.  This argument is undermined by the facts:  better than 85% of strokes are ischemic in origin, and even some hemorrhagic strokes start as a result of an ischemic event.

In any event, Pfizer’s argument about Madigan’s composite end points did not gain any traction with the trial judge in the securities fraud class action:

“Dr. Madigan’s written submissions and testimony described clearly and justified cogently his statistical methods, selection of endpoints, decisions regarding event classification, sources of data, as well as the conclusions he drew from his analysis. Indeed, Dr. Madigan’s meta-analysis was based largely on data and endpoints developed by Pfizer. All four of the endpoints that Dr. Madigan used in his analysis-Hard CHD, Myocardial Thromboembolic Events, Cardiovascular Thromboembolic Events, and CV Mortality-have been employed by Pfizer in its own research and analysis. The use of Hard CHD in the relevant literature combined with the use of the other three endpoints by Pfizer in its own 2005 meta-analysis will assist the trier of fact in determining Pfizer’s knowledge and understanding of the pre-December 17, 2004, cardiovascular safety profile of Celebrex.”

In re Pfizer Inc. Securities Litig., 2010 WL 1047618, *4 (S.D.N.Y. 2010).
