TORTINI

For your delectation and delight, desultory dicta on the law of delicts.

Samuel Tarry’s Protreptic for Litigation-Sponsored Publications

July 9th, 2017

Litigation-related research has been the punching bag of self-appointed public health advocates for some time. Remarkably, and perhaps not surprising to readers of this blog, many of the most strident critics have deep ties to the lawsuit industry, and have served the plaintiffs’ bar loyally and zealously for many years.1,2,3,4 And many of these critics have ignored or feigned ignorance of the litigation provenance of much research that they hold dear, such as Irving Selikoff’s asbestos research undertaken for the asbestos workers’ union and its legal advocates. These critics’ campaign is an exquisite study in hypocrisy.

For some time, I have argued that the standards for conflict-of-interest disclosures should be applied symmetrically and comprehensively to include positional conflicts, public health and environmental advocacy, as well as litigation consulting or testifying for any party. Conflicts should be disclosed, but they should not become a facile excuse or false justification for dismissing research, regardless of the party that sponsored it.5 Scientific studies should be interpreted scientifically – that is carefully, thoroughly, and rigorously – regardless whether they are conducted and published by industry-sponsored, union-sponsored, or Lord help us, even lawyer-sponsored scientists.

Several years ago, a defense lawyer, Samuel Tarry, published a case series of industry-sponsored research or analysis, which grew out of litigation, but made substantial contributions to the scientific understanding of claimed health risks. See Samuel L. Tarry, Jr., “Can Litigation-Generated Science Promote Public Health?” 33 Am. J. Trial Advocacy 315 (2009). Tarry’s paper is a helpful corrective to the biased (and often conflicted) criticisms of industry-sponsored research and analysis by the lawsuit industry and its scientific allies and consultants. It an ocean of uninformative papers about “Daubert,” Tarry’s paper stands out and should be required reading for all lawyers who practice in the area of “health effects litigation.”

Tarry presented a brief summary of the litigation context for three publications that deserve to remembered and used as exemplars of important, sound, scientific publications that helped changed the course of litigations, as well as the scientific community’s appreciation of prior misleading contentions and publications. His three case studies grew out of the silicone-gel breast implant litigation, the latex allergy litigation, and the never-ending asbestos litigation.

1. Silicone

There are some glib characterizations of the silicone gel breast implant litigation as having had no evidentiary basis. A more careful assessment would allow that there was some evidence, much of it fraudulent and irrelevant. See, e.g., Hon. Jack B. Weinstein, “Preliminary Reflections on Administration of Complex Litigation” 2009 Cardozo L. Rev. de novo 1, 14 (2009) (describing plaintiffs’ expert witnesses in the silicone gel breast implant litigation as “charlatans” and the litigation as largely based upon fraud). The lawsuit industry worked primarily through so-called support groups, which in turn funded friendly, advocate physicians, who in turn testified for plaintiffs and their lawyers in personal injury cases.

When the defendants, such as Dow Corning, reacted by sponsoring serious epidemiologic analyses of the issue whether exposure to silicone gel was associated with specific autoimmune or connective tissue diseases, the plaintiffs’ bar mounted a conflict-of-interest witch hunt over industry funding.6 Ultimately, the source of funding became obviously irrelevant; the concordance between industry-funded and all high quality research on the litigation claims was undeniable. Obvious that is to court-appointed expert witnesses7, and to a blue-ribbon panel of experts in the Institute of Medicine8.

2. Latex Hypersensitivity

Tarry’s second example comes from the latex hypersensitivity litigation. Whatever evidentiary basis may have existed for isolated cases of latex allergy, the plaintiffs’ bar had taken and expanded into a full-scale mass tort. A defense expert witness, Dr. David Garabrant, a physician and an epidemiologist, published a meta-analysis and systematic review of the extant scientific evidence. David H. Garabrant & Sarah Schweitzer, “Epidemiology of latex sensitization and allergies in health care workers,” 110 J. Allergy & Clin. Immunol. S82 (2002). Garabrant’s formal, systematic review documented his litigation opinions that the risk of latex hypersensitivity was much lower than claimed and not the widespread hazard asserted by plaintiffs and their retained expert witnesses. Although Garabrant’s review did not totally end the litigation and public health debate about latex, it went a long way toward ending both.

3. Fraudulent Asbestos-Induced Radiography

I still recall, sitting at my desk, my secretary coming into my office to tell me excitedly that a recent crop of silicosis claimants had had previous asbestosis claims. When I asked how she knew, she showed me the computer print out for closed files for another client. Some of the names were so distinctive that the probability that there were two men with the same name was minuscule. When we obtained the closed files from storage, sure enough, the social security numbers matched, as did all other pertinent data, except that what had been called asbestosis previously was now called silicosis.

My secretary’s astute observation was mirrored in the judicial proceedings of Judge Janis Graham Jack, who presided over MDL 1553. Judge Jack, however, discovered something even more egregious: in some cases, a single physician interpreted a single chest radiograph as showing either asbestosis or silicosis, but not both. The two, alternative diagnoses were recorded in two, separate reports, for two different litigation cases against different defendants. This fraudulent practice, as well as others, are documented in Judge Jack’s extraordinary, thorough opinion. See In re Silica Prods. Liab. Litig., 398 F. Supp. 2d 563 (S.D. Tex. 2005)9.

The revelations of fraud in Judge Jack’s opinion were not entirely surprising. As everyone involved in asbestos litigation has always known, there is a disturbing degree of subjectivity in the interpretation of chest radiographs for pneumoconiosis. The federal government has long been aware of this problem, and through the Centers for Disease Control and the National Institute of Occupational Safety and Health, has tried to subdue extreme subjectivity by creating a pneumoconiosis classification schemed for chest radiographs known as the “B-reader” system. Unfortunately, B-reader certification meant only that physicians could achieve inter-observer and intra-observer reproducibility of interpretations on the examination, but they were free to peddle extreme interpretations for litigation. Indeed, the B-reader certification system exacerbated the problem by creating a credential that was marketed to advance the credibility of some of the most biased, over-reading physicians in asbestos, silica, and coal pneumoconiosis litigation.

Tarry’s third example is a study conducted under the leadership of the late Joseph Gitlin, at Johns Hopkins Medical School. With funding from defendants and insurers, Dr. Joseph Gitlin conducted a concordance study of films that had been read by predatory radiologists and physicians as showing pneumoconiosis. The readers in his study found a very low level of positive films (less than 5%), despite their having been interpreted as showing pneumoconiosis by the litigation physicians. See Joseph N. Gitlin, Leroy L. Cook, Otha W. Linton, and Elizabeth Garrett-Mayer, “Comparison of ‘B’ Readers’ Interpretations of Chest Radiographs for Asbestos Related Changes,” 11 Acad. Radiol. 843 (2004); Marjorie Centofanti, “With thousands of asbestos workers demanding compensation for lung disease, a radiology researcher here finds that most cases lack merit,” Hopkins Medicine (2006). As with the Sokol hoax, the practitioners of post-modern medicine cried “foul,” and decried industry sponsorship, but the disparity between courtroom and hospital medicine was sufficient proof for most disinterested observers that there was a need to fix the litigation process.

Meretricious Mensuration10 – Manganese Litigation Example

Tarry’s examples are important reminders that corporate sponsorship, whether from the plaintiffs’ lawsuit industry or from manufacturing industry, does not necessarily render research tainted or unreliable. Although lawyers often confront exaggerated or false claims, and witness important, helpful correctives in the form of litigation-sponsored studies, the demands of legal practice and “the next case” typically prevent lawyers from documenting the scientific depredations and their rebuttals. Sadly, unlike litigations such as those involving Bendectin and silicone, the chronicles of fraud and exaggeration are mostly closed books in closed files in closed offices. These examples need the light of day and a fresh breeze to disseminate them widely in both the scientific and legal communities, so that all may have a healthy appreciation for the value of appropriately conducted studies generated in litigation contexts.

As I have intimated elsewhere, the welding fume litigation is a great example of specious claiming, which ultimately was unhorsed by publications inspired or funded by the defense. In the typical welding fume case, plaintiff claimed that exposure to manganese in welding fume caused Parkinson’s disease or manganism. Although manganism sounds as though it must be a disease that can be caused only by manganese, in the hands of plaintiffs’ expert witnesses, manganism became whatever ailment plaintiffs claimed to have suffered. Circularity and perfect definitional precision were achieved by semantic fiat.

The Sanchez-Ramos Meta-Analysis

Manganese Madness was largely the creation of the Litigation Industry, under the dubious leadership of Dickie Scruggs & Company. Although the plaintiffs enjoyed a strong tail wind in the courtroom of an empathetic judge, they had difficulties in persuading juries and ultimately decamped from MDL 1535, in favor of more lucrative targets. In their last hurrah, however, plaintiffs retained a neurologist, Juan Sanchez-Ramos, who proffered a biased, invalid synthesis, which he billed as a meta-analysis11.

Sanchez-Ramos’s meta-analysis, such as it was, provoked professional disapproval and criticism from the defense expert witness, Dr. James Mortimer. Because the work product of Sanchez-Ramos was first disclosed in deposition, and not in his Rule 26 report, Dr. Mortimer undertook belatedly a proper meta-analysis.12 Even though Dr. Mortimer’s meta-analysis was done in response to the Sanchez-Ramos’s improper, tardy disclosure, the MDL judge ruled that Mortimer’s meta-analysis was too late. The effect, however, of Mortimer’s meta-analysis was clear in showing that welding had no positive association with Parkinson’s disease outcomes. The MDL 1535 resolved quickly thereafter, and with only slight encouragement, Dr. Mortimer published a further refined meta-analysis with two other leading neuro-epidemiologists. See James Mortimer, Amy Borenstein, and Lorene Nelson, “Associations of welding and manganese exposure with Parkinson disease: Review and meta-analysis,” 79 Neurology 1174 (2012). See also Manganese Meta-Analysis Further Undermines Reference Manual’s Toxicology Chapter(Oct. 15, 2012).


1 See, e.g., David Michaels & Celeste Monforton, “Manufacturing Uncertainty Contested Science and the Protection ofthe Public’s Health and Environment,” 95 Am. J. Pub. Health S39, S40 (2005); David Michaels & Celeste Monforton, “How Litigation Shapes the Scientific Literature: Asbestos and Disease Among Automobile Mechanics,” 15 J. L. & Policy 1137, 1165 (2007). Michaels had served as a plaintiffs’ paid expert witness in chemical exposure litigation, and Monforton had been employed by labor unions before these papers were published, without disclosure of conflicts.

2 Leslie Boden & David Ozonoff, “Litigation-Generated Science: Why Should We Care?” 116 Envt’l Health Persp. 121, 121 (2008) (arguing that systematic distortion of the scientific record will result from litigation-sponsored papers even with disclosure of conflicts of interest). Ozonoff had served as a hired plaintiffs’ expert witnesses on multiple occasion before the publication of this article, which was unadorned by disclosure.

3 Lennart Hardell, Martin J. Walker, Bo Walhjalt, Lee S. Friedman, and Elihu D. Richter, “Secret Ties to Industry and Conflicting Interest in Cancer Research,” 50 Am. J. Indus. Med. 227, 233 (2007) (criticizing “powerful industrial interests” for “undermining independent research on hazard and risk,” in a “red” journal that is controlled by allies of the lawsuit industry). Hardell was an expert witness for plaintiffs in mobile phone litigation in which plaintiffs claimed that non-ionizing radiation caused brain cancer. In federal litigation, Hardell was excluded as an expert witness when his proffered opinions were found to be scientifically unreliable. Newman v. Motorola, Inc., 218 F. Supp. 2d. 769, 777 (D. Md. 2002), aff’d, 78 Fed. Appx. 292 (4th Cir. 2003).

4 See David Egilman & Susanna Bohme, “IJOEH and the Critique of Bias,” 14 Internat’l J. Occup. & Envt’l Health 147, 148 (2008) (urging a Marxist critique that industry-sponsored research is necessarily motivated by profit considerations, and biased in favor of industry funders). Although Egilman usually gives a disclosure of his litigation activities, he typically characterizes those activities as having been for both plaintiffs and defendants, even though his testimonial work for defendants is minuscule.

5 Kenneth J. Rothman, “Conflict of Interest: The New McCarthyism in Science,” 269 J. Am. Med. Ass’n 2782 (1993).

6 See Charles H. Hennekens, I-Min Lee, Nancy R. Cook, Patricia R. Hebert, Elizabeth W. Karlson, Fran LaMotte; JoAnn E. Manson, and Julie E. Buring, “Self-reported Breast Implants and Connective- Tissue Diseases in Female Health Professionals: A Retrospective Cohort Study, 275 J. Am. Med. Ass’n 616-19 (1998) (analyzing established cohort for claimed associations, with funding from the National Institutes of Health and Dow Corning Corporation).

7 See Barbara Hulka, Betty Diamond, Nancy Kerkvliet & Peter Tugwell, “Silicone Breast Implants in Relation to Connective Tissue Diseases and Immunologic Dysfunction: A Report by a National Science Panel to the Hon. Sam Pointer Jr., MDL 926 (Nov. 30, 1998).” The court-appointed expert witnesses dedicated a great deal of their professional time to their task of evaluating the plaintiffs’ claims and the evidence. At the end of the process, they all published their litigation work in leading journals. See Barbara Hulka, Nancy Kerkvliet & Peter Tugwell, “Experience of a Scientific Panel Formed to Advise the Federal Judiciary on Silicone Breast Implants,” 342 New Engl. J. Med. 812 (2000); Esther C. Janowsky, Lawrence L. Kupper., and Barbara S. Hulka, “Meta-Analyses of the Relation between Silicone Breast Implants and the Risk of Connective-Tissue Diseases,” 342 New Engl. J. Med. 781 (2000); Peter Tugwell, George Wells, Joan Peterson, Vivian Welch, Jacqueline Page, Carolyn Davison, Jessie McGowan, David Ramroth, and Beverley Shea, “Do Silicone Breast Implants Cause Rheumatologic Disorders? A Systematic Review for a Court-Appointed National Science Panel,” 44 Arthritis & Rheumatism 2477 (2001).

8 Stuart Bondurant, Virginia Ernster, and Roger Herdman, eds., Safety of Silicone Breast Implants (Institute of Medicine) (Wash. D.C. 1999).

9 See also Lester Brickman, “On the Applicability of the Silica MDL Proceeding to Asbestos Litigation, 12 Conn. Insur. L. J. 289 (2006); Lester Brickman, “Disparities Between Asbestosis and Silicosis Claims Generated By Litigation Screenings and Clinical Studies,” 29 Cardozo L. Rev. 513 (2007).

10 This apt phraseology is due to the late Keith Morgan, whose wit, wisdom, and scientific acumen are greatly missed. See W. Keith C. Morgan, “Meretricious Mensuration,” 6 J. Eval. Clin. Practice 1 (2000).

11 See Deposition of Dr. Juan Sanchez-Ramos, in Street v. Lincoln Elec. Co., Case No. 1:06-cv-17026, 2011 WL 6008514 (N.D. Ohio May 17, 2011).

12 See Deposition of Dr. James Mortimer, in Street v. Lincoln Elec. Co., Case No. 1:06-cv-17026, 2011 WL 6008054 (N.D. Ohio June 29, 2011).

The Education of Judge Rufe – The Zoloft MDL

April 9th, 2016

The Honorable Cynthia M. Rufe is a judge on the United States District Court, for the Eastern District of Pennsylvania.  Judge Rufe was elected to a judgeship on the Bucks County Court of Common Pleas in 1994.  She was appointed to the federal district court in 2002. Like most state and federal judges, little in her training and experience as a lawyer prepared her to serve as a gatekeeper of complex expert witness scientific opinion testimony.  And yet, the statutory code of evidence, and in particular, Federal Rules of Evidence 702 and 703, requires her do just that.

The normal approach to MDL cases is marked by the Field of Dreams: “if you build it, they will come.” Last week, Judge Rufe did something that is unusual in pharmaceutical litigation; she closed the gate and sent everyone home. In re Zoloft Prod. Liab. Litig., MDL NO. 2342, 12-MD-2342, 2016 WL 1320799 (E.D. Pa. April 5, 2016).

Her Honor’s decision was hardly made in haste.  The MDL began in 2012, and proceeded in a typical fashion with case management orders that required the exchange of general causation expert witness reports. The plaintiffs’ steering committee (PSC), acting for the plaintiffs, served the report of only one epidemiologist, Anick Bérard, who took the position that Zoloft causes virtually every major human congenital anomaly known to medicine. The defendants challenged the admissibility of Bérard’s opinions.  After extensive briefings and evidentiary hearings, the trial court found that Bérard’s opinions were riddled with inconsistent assessments of studies, eschewed generally accepted methods of causal inference, ignored contrary evidence, adopted novel, unreliable methods of endorsing “trends” in studies, and failed to address epidemiologic studies that did not support her subjective opinions. In re Zoloft Prods. Liab. Litig., 26 F. Supp. 3d 449 (E.D.Pa.2014). The trial court permitted plaintiffs an opportunity to seek reconsideration of Bérard’s exclusion, which led to the trial court’s reaffirming its previous ruling. In re Zoloft Prods. Liab. Litig., No. 12–md–2342, 2015 WL 314149, at *2 (E.D.Pa. Jan. 23, 2015).

Notwithstanding the PSC’s claims that Bérard was the best qualified expert witness in her field and that she was the only epidemiologist needed to support the plaintiffs’ causal claims, the MDL court indulged the PSC by permitting plaintiffs another bite at the apple.  Over defendants’ objections, the court permitted the PSC to name yet another expert witness, statistician Nicholas Jewell, to do what Bérard had failed to do: proffer an opinion on general causation supported by sound science.  In re Zoloft Prods. Liab. Litig., No. 12–md–2342, 2015 WL 115486, at * 2 (E.D.Pa. Jan. 7, 2015).

As a result of this ruling, the MDL dragged on for over a year, in which time, the PSC served a report by Jewell, and then the defendants conducted a discovery deposition of Jewell, and lodged a new Rule 702 challenge.  Although Jewell brought more statistical sophistication to the task, he could not transmute lead into gold; nor could he support the plaintiffs’ causal claims without committing most of the same fallacies found in Bérard’s opinions.  After another round of Rule 702 briefs and hearings, the MDL court excluded Jewell’s unwarranted causal opinions. In re Zoloft Prods. Liab. Litig., No. 12–md–2342, 2015 WL 7776911 (E.D.Pa. Dec. 2, 2015).

The successive exclusions of Bérard and Jewell left the MDL court in a peculiar position. There were other witnesses, Robert Cabrera, a teratologist, Michael Levin, a molecular biologist, and Thomas Sadler, an embryologist, whose opinions addressed animal toxicologic studies, biological plausibility, and putative mechanisms.  These other witnesses, however, had little or no competence in epidemiology, and they explicitly relied upon Bérard’s opinions with respect to human outcomes.  As a result of Bérard’s exclusion, these witnesses were left free to offer their views about what happens in animals at high doses, or about theoretical mechanisms, but they were unable to address human causation.

Although the PSC had no expert witnesses who could legitimately offer reasonably supported opinions about the causation of human birth defects, the plaintiffs refused to decamp and leave the MDL forum. Faced with the prospect of not trying their cases to juries, the PSC instead tried the patience of the MDL judge. The PSC pulled out the stops in adducing weak, irrelevant, and invalid evidence to support their claims, sans epidemiologic expertise. The PSC argued that adverse event reports, internal company documents that discussed possible associations, the biological plausibility opinions of Levin and Sadler, the putative mechanism opinions of Cabrera, differential diagnoses offered to support specific causation, and the hip-shot opinions of a former-FDA-commissioner-for-hire, David Kessler could come together magically to supply sufficient evidence to have their cases submitted to juries. Judge Rufe saw through the transparent effort to manufacture evidence of causation, and granted summary judgment on all remaining Zoloft cases in the MDL. s In re Zoloft Prod. Liab. Litig., MDL NO. 2342, 12-MD-2342, 2016 WL 1320799, at *4 (E.D. Pa. April 5, 2016).

After a full briefing and hearing on Bérard’s opinion, a reconsideration of Bérard, a permitted “do over” of general causation with Jewell, a full briefing and hearing on Jewell’s opinions, the MDL court was able to deal deftly with the snippets of evidence “cobbled together” to substitute for evidence that might support a conclusion of causation. The PSC’s cobbled case was puffed up to give the appearance of voluminous evidence, in 200 exhibits that filled six banker’s boxes.  Id. at *5. The ruse was easily undone; most of the exhibits and purported evidence were obvious rubbish. “The quantity of the evidence is not, however, coterminous with the quality of evidence with regard to the issues now before the Court.” Id. The banker’s boxes contained artifices such as untranslated foreign-language documents, and company documents relating to the development and marketing of the medication. The PSC resubmitted reports from Levin, Cabrera, and Sadler, whose opinions were already adjudicated to be incompetent, invalid, irrelevant, or inadequate to support general causation.  The PSC pointed to the specific causation opinions of a clinical cardiologist, Ra-Id Abdulla, M.D., who proffered dubious differential etiologies, ruling in Zoloft as a cause of individual children’s birth defects, despite his inability to rule out truly known and unknown causes in the differential reasoning.  The MDL court, however, recognized that “[a] differential diagnosis assumes that general causation has been established,” id. at *7, and that Abdulla could not bootstrap general causation by purporting to reach a specific causation opinion (even if those specific causation opinions were legitimate).

The PSC submitted the recent consensus statement of the American Statistical Association (ASA)[1], which it misrepresented to be an epidemiologic study.  Id. at *5. The consensus statement makes some pedestrian pronouncements about the difference between statistical and clinical significance, about the need for other considerations in addition to statistical significance, in supporting causal claims, and the lack of bright-line distinctions for statistical significance in assessing causality.  All true, but immaterial to the PSC’s expert witnesses’ opinions that over-endorsed statistical significance in the few instances in which it was shown, and over-interpreted study data that was based upon data mining and multiple comparisons, in blatant violation of the ASA’s declared principles.

Stretching even further for “human evidence,” the PSC submitted documentary evidence of adverse event reports, as though they could support a causal conclusion.[2]  There are about four million live births each year, with an expected rate of serious cardiac malformations of about one per cent.[3]  The prevalence of SSRI anti-depressant use is at least two per cent, which means that we would expect 800 cardiac birth defects each year to occur in children of mother’s who took SSRI anti-depressants in the first trimester. If Zoloft had an average market share of all the SSRIs of about 25 per cent, then 200 cardiac defects each year would occur in children born to mothers who took Zoloft.  Given that Zoloft has been on the market since the early 1990s, we would expect that there would be thousands of children, exposed to Zoloft during embryogenesis, born with cardiac defects, if there was nothing untoward about maternal exposure to the medication.  Add the stimulated reporting of adverse events from lawyers, lawyer advertising, and lawyer instigation, you have manufactured evidence not probative of causation at all.[4] The MDL court cut deftly and swiftly through the smoke screen:

“These reports are certainly relevant to the generation of study hypotheses, but are insufficient to create a material question of fact on general causation.”

Id. at *9. The MDL court recognized that epidemiology was very important in discerning a causal connection between a common exposure and a common outcome, especially when the outcome has an expected rate in the general population. The MDL court stopped short of holding that epidemiologic evidence was required (which on the facts of the case would have been amply justified), but instead supported its ratio decidendi on the need to account for the extant epidemiology that contradicted or failed to support the strident and subjective opinions of the plaintiffs’ expert witnesses. The MDL court thus gave plaintiffs every benefit of the doubt by limiting its holding on the need for epidemiology to:

“when epidemiological studies are equivocal or inconsistent with a causation opinion, experts asserting causation opinions must thoroughly analyze the strengths and weaknesses of the epidemiological research and explain why that body of research does not contradict or undermine their opinion.”

Id. at *5, quoting from In re Zoloft Prods. Liab. Litig., 26 F. Supp. 3d 449, 476 (E.D. Pa. 2014).

The MDL court also saw through the thin veneer of respectability of the testimony of David Kessler, a former FDA commissioner who helped make large fortunes for some of the members of the PSC by the feeding frenzy he created with his moratorium on silicone gel breast implants.  Even viewing Kessler’s proffered testimony in the most charitable light, the court recognized that he offered little support for a causal conclusion other than to delegate the key issues to epidemiologists. Id. at *9. As for the boxes of regulatory documents, foreign labels, and internal company memoranda, the MDL court found that these documents did not raise a genuine issue of material fact concerning general causation:

“Neither these documents, nor draft product documents or foreign product labels containing language that advises use of birth control by a woman taking Zoloft constitute an admission of causation, as opposed to acknowledging a possible association.”

Id.

In the end, the MDL court found that the PSC’s many banker boxes of paper contained too much of nothing for the issue at hand.  Having put the defendants through the time and expense of litigating and re-litigating these issues, nothing short of dismissing the pending cases was a fair and appropriate outcome to the Zoloft MDL.

_______________________________________

Given the denouement of the Zoloft MDL, it is worth considering the MDL judge’s handling of the scientific issues raised, misrepresented, argued, or relied upon by the parties.  Judge Rufe was required, by Rules 702 and 703, to roll up her sleeves and assess the methodological validity of the challenged expert witnesses’ opinions.  That Her Honor was able to do this is a testament to her hard work. Zoloft was not Judge Rufe’s first MDL, and she clearly learned a lot from her previous judicial assignment to an MDL for Avandia personal injury actions.

On May 21, 2007, the New England Journal of Medicine published online a seriously flawed meta-analysis of cardiovascular disease outcomes and rosiglitazone (Avandia) use.  See Steven E. Nissen, M.D., and Kathy Wolski, M.P.H., “Effect of Rosiglitazone on the Risk of Myocardial Infarction and Death from Cardiovascular Causes,” 356 New Engl. J. Med. 2457 (2007).  The Nissen article did not appear in print until June 14, 2007, but the first lawsuits resulted within a day or two of the in-press version. The lawsuits soon thereafter reached a critical mass, with the inevitable creation of a federal court Multi-District Litigation.

Within a few weeks of Nissen’s article, the Annals of Internal Medicine published an editorial by Cynthia Mulrow, and other editors, in which questioned the Nissen meta-analysis[5], and introduced an article that attempted to replicate Nissen’s work[6].  The attempted replication showed that the only way Nissen could have obtained his nominally statistically significant result was to have selected a method, Peto’s fixed effect method, known to be biased for use with clinical trials with uneven arms. Random effect methods, more appropriate for the clinically heterogeneous clinical trials, consistently failed to replicate the Nissen result. Other statisticians weighed in and pointed out that using the risk difference made much more sense when there were multiple trials with zero events in one or the other or both arms of the trials. Trials with zero cardiovascular events in both arms represented important evidence of low, but equal risk, of heart attacks, which should be captured in an appropriate analysis.  When the risk difference approach was used, with exact statistical methods, there was no statistically significant increase in risk in the dataset used by Nissen.[7] Other scientists, including some of Nissen’s own colleagues at the Cleveland Clinic, and John Ioannidis, weighed in to note how fragile and insubstantial the Nissen meta-analysis was[8]:

“As rosiglitazone case demonstrates, minor modifications of the meta-analysis protocol can change the statistical significance of the result.  For small effects, even the direction of the treatment effect estimate may change.”

Nissen achieved his political objective with his shaky meta-analysis.  The FDA convened an Advisory Committee meeting, which in turn resulted in a negative review of the safety data, and the FDA’s imposition of warnings and a Risk Evaluation and Mitigation Strategy, which all but prohibited use of rosiglizone.[9]  A clinical trial, RECORD, had already started, with support from the drug sponsor, GlaxoSmithKline, which fortunately was allowed to continue.

On a parallel track to the regulatory activities, the federal MDL, headed by Judge Rufe, proceeded to motions and a hearing on GSK’s Rule 702 challenge to plaintiffs’ evidence of general causation. The federal MDL trial judge denied GSK’s motions to exclude plaintiffs’ causation witnesses in an opinion that showed significant diffidence in addressing scientific issues.  In re Avandia Marketing, Sales Practices and Product Liability Litigation, 2011 WL 13576, *12 (E.D. Pa. 2011).  SeeLearning to Embrace Flawed Evidence – The Avandia MDL’s Daubert Opinion” (Jan. 10, 2011.

After Judge Rufe denied GSK’s challenges to the admissibility of plaintiffs’ expert witnesses’ causation opinions in the Avandia MDL, the RECORD trial was successfully completed and published.[10]  RECORD was a long term, prospectively designed randomized cardiovascular trial in over 4,400 patients, followed on average of 5.5 yrs.  The trial was designed with a non-inferiority end point of ruling out a 20% increased risk when compared with standard-of-care diabetes treatment The trial achieved its end point, with a hazard ratio of 0.99 (95% confidence interval, 0.85-1.16) for cardiovascular hospitalization and death. A readjudication of outcomes by the Duke Clinical Research Institute confirmed the published results.

On Nov. 25, 2013, after convening another Advisory Committee meeting, the FDA announced the removal of most of its restrictions on Avandia:

“Results from [RECORD] showed no elevated risk of heart attack or death in patients being treated with Avandia when compared to standard-of-care diabetes drugs. These data do not confirm the signal of increased risk of heart attacks that was found in a meta-analysis of clinical trials first reported in 2007.”

FDA Press Release, “FDA requires removal of certain restrictions on the diabetes drug Avandia” (Nov. 25, 2013). And in December 2015, the FDA abandoned its requirement of a Risk Evaluation and Mitigation Strategy for Avandia. FDA, “Rosiglitazone-containing Diabetes Medicines: Drug Safety Communication – FDA Eliminates the Risk Evaluation and Mitigation Strategy (REMS)” (Dec. 16, 2015).

GSK’s vindication came too late to reverse Judge Rufe’s decision in the Avandia MDL.  GSK spent over six billion dollars on resolving Avandia claims.  And to add to the company’s chagrin, GSK lost patent protection for Avandia in April 2012.[11]

Something good, however, may have emerged from the Avandia litigation debacle.  Judge Rufe heard from plaintiffs’ expert witnesses in Avandia about the hierarchy of evidence, about how observational studies must be evaluated for bias and confounding, about the importance of statistical significance, and about how studies that lack power to find relevant associations may still yield conclusions with appropriate meta-analysis. Important nuances of meta-analysis methodology may have gotten lost in the kerfuffle, but given that plaintiffs had reasonable quality clinical trial data, Avandia plaintiffs’ counsel could eschew their typical reliance upon weak and irrelevant lines of evidence, based upon case reports, adverse event disproportional reporting, and the like.

The Zoloft litigation introduced Judge Rufe to a more typical pharmaceutical litigation. Because the outcomes of interest were birth defects, there were no clinical trials.  To be sure, there were observational epidemiologic studies, but now the defense expert witnesses were carefully evaluating the studies for bias and confounding, and the plaintiffs’ expert witnesses were double counting studies and ignoring multiple comparisons and validity concerns.  Once again, in the Zoloft MDL, plaintiffs’ expert witnesses made their non-specific complaints about “lack of power” (without ever specifying the relevant alternative hypothesis), but it was the defense expert witnesses who cited relevant meta-analyses that attempted to do something about the supposed lack of power. Plaintiffs’ expert witnesses inconsistently argued “lack of power” to disregard studies that had outcomes that undermined their opinions, even when those studies had narrow confidence intervals surrounding values at or near 1.0.

The Avandia litigation laid the foundation for Judge Rufe’s critical scrutiny by exemplifying the nature and quantum of evidence to support a reasonable scientific conclusion.  Notwithstanding the mistakes made in the Avandia litigation, this earlier MDL created an invidious distinction with the Zoloft PSC’s evidence and arguments, which looked as weak and insubstantial as they really were.


[1] Ronald L. Wasserstein & Nicole A. Lazar, “The ASA’s Statement on p-Values: Context, Process, and Purpose,” The American Statistician, available online (Mar. 7, 2016), in-press at DOI:10.1080/00031305.2016.1154108, <http://dx.doi.org/10.1080/>. SeeThe American Statistical Association’s Statement on and of Significance” (Mar. 17, 2016); “The ASA’s Statement on Statistical Significance – Buzzing from the Huckabees” (Mar. 19, 2016).

[2] See 21 C.F.R. § 314.80 (a) Postmarketing reporting of adverse drug experiences (defining “[a]dverse drug experience” as “[a]ny adverse event associated with the use of a drug in humans, whether or not considered drug related”).

[3] See Centers for Disease Control and Prevention, “Birth Defects Home Page” (last visited April 8, 2016).

[4] See, e.g., Derrick J. Stobaugh, Parakkal Deepak, & Eli D. Ehrenpreis, “Alleged isotretinoin-associated inflammatory bowel disease: Disproportionate reporting by attorneys to the Food and Drug Administration Adverse Event Reporting System,” 69 J. Am. Acad. Dermatol. 393 (2013) (documenting stimulated reporting from litigation activities).

[5] Cynthia D. Mulrow, John Cornell & A. Russell Localio, “Rosiglitazone: A Thunderstorm from Scarce and Fragile Data,” 147 Ann. Intern. Med. 585 (2007).

[6] George A. Diamond, Leon Bax & Sanjay Kaul, “Uncertain Effects of Rosiglitazone on the Risk for Myocardial Infartion and Cardiovascular Death,” 147 Ann. Intern. Med. 578 (2007).

[7] Tian, et al., “Exact and efficient inference procedure for meta-analysis and its application to the analysis of independent 2 × 2 tables with all available data but without artificial continuity correction” 10 Biostatistics 275 (2008)

[8] Adrian V. Hernandez, Esteban Walker, John P.A. Ioannidis,  and Michael W. Kattan, “Challenges in meta-analysis of randomized clinical trials for rare harmful cardiovascular events: the case of rosiglitazone,” 156 Am. Heart J. 23, 28 (2008).

[9] Janet Woodcock, FDA Decision Memorandum (Sept. 22, 2010).

[10] Philip D. Home, et al., “Rosiglitazone evaluated for cardiovascular outcomes in oral agent combination therapy for type 2 diabetes (RECORD): a multicentre, randomised, open-label trial,” 373 Lancet 2125 (2009).

[11]Pharmacovigilantism – Avandia Litigation” (Nov. 27, 2013).

Systematic Reviews and Meta-Analyses in Litigation

February 5th, 2016

Kathy Batty is a bellwether plaintiff in a multi-district litigation[1] (MDL) against Zimmer, Inc., in which hundreds of plaintiffs claim that Zimmer’s NexGen Flex implants are prone to have their femoral and tibial elements prematurely aseptically loosen (independent of any infection). Batty v. Zimmer, Inc., MDL No. 2272, Master Docket No. 11 C 5468, No. 12 C 6279, 2015 WL 5050214 (N.D. Ill. Aug. 25, 2015) [cited as Batty].

PRISMA Guidelines for Systematic Reviews

Zimmer proffered Dr. Michael G. Vitale, an orthopedic surgeon, with a master’s degree in public health, to testify that, in his opinion, Batty’s causal claims were unfounded. Batty at *4. Dr. Vitale prepared a Rule 26 report that presented a formal, systematic review of the pertinent literature. Batty at *3. Plaintiff Batty challenged the admissibility of Dr. Vitale’s opinion on grounds that his purportedly “formal systematic literature review,” done for litigation, was biased and unreliable, and not conducted according to generally accepted principles for such reviews. The challenged was framed, cleverly, in terms of Dr. Vitale’s failure to comply with a published set of principles outlined in “PRISMA” guidelines (Preferred Reporting Items for Systematic reviews and Meta-Analyses), which enjoy widespread general acceptance among the clinical journals. See David Moher , Alessandro Liberati, Jennifer Tetzlaff, Douglas G. Altman, & The PRISMA Group, “Preferred Reporting Items for Systematic Reviews and Meta-Analyses: The PRISMA Statement,” 6 PLoS Med e1000097 (2009) [PRISMA]. Batty at *5. The trial judge, Hon. Rebecca R. Pallmeyer, denied plaintiff’s motion to exclude Dr. Vitale, but in doing so accepted, arguendo, the plaintiff’s implicit premise that an expert witness’s opinion should be reached in the manner of a carefully constructed systematic review.

The plaintiff’s invocation of the PRISMA guidelines presented several difficult problems for her challenge and for the court. PRISMA provides a checklist of 27 items for journal editors to assess the quality and completeness of systematic reviews that are submitted for publication. Plaintiff Batty focused on several claimed deviations from the guidelines:

  • “failing to explicitly state his study question,
  • failing to acknowledge the limitations of his review,
  • failing to present his findings graphically, and failing to reproduce his search results.”

Batty’s challenge to Dr. Vitale thus turned on whether Zimmer’s expert witness had failed to deploy “same level of intellectual rigor,” as someone in the world of clinical medicine would [should] have in conducting a similar systematic review. Batty at *6.

Zimmer deflected the challenge, in part by arguing that PRISMA’s guidelines are for the reporting of systematic reviews, and they are not necessarily criteria for valid reviews. The trial court accepted this rebuttal, Batty at *7, but missed the point that some of the guidelines call for methods that are essential for rigorous, systematic reviews, in any forum, and do not merely specify “publishability.” To be sure, PRISMA itself does not always distinguish between what is essential for journal publication, as opposed to what is needed for a sufficiently valid systematic review. The guidelines, for instance, call for graphical displays, but in litigation, charts, graphs, and other demonstratives are often not produced until the eve of trial, when case management orders call for the parties to exchange such materials. In any event, Dr. Vitale’s omission of graphical representations of his findings was consistent with his finding that the studies were too clinical heterogeneous in study design, follow-up time, and pre-specified outcomes, to permit nice, graphical summaries. Batty at *7-8.

Similarly, the PRISMA guidelines call for a careful specification of the clinical question to be answered, but in litigation, the plaintiff’s causal claims frame the issue to be addressed by the defense expert witness’s literature review. The trial court readily found that Dr. Vitale’s research question was easily discerned from the context of his report in the particular litigation. Batty at *7.

Plaintiff Batty’s challenge pointed to Dr. Vitale’s failure to acknowledge explicitly the limitations of his systematic review, an omission that virtually defines expert witness reports in litigation. Given the availability of discovery tools, such as a deposition of Dr. Vitale (at which he readily conceded the limitations of his review), and the right of confrontation and cross-examination (which are not available, alas, for published articles), the trial court found that this alleged deviation was not particularly relevant to the plaintiff’s Rule 702 challenge. Batty at *8.

Batty further charged that Dr. Vitale had not “reproduced” his own systematic review. Arguing that a systematic review’s results must be “transparent and reproducible,” Batty claimed that Zimmer’s expert witness’s failure to compile a list of studies that were originally retrieved from his literature search deprived her, and the trial court, of the ability to determine whether the search was complete and unbiased. Batty at *8. Dr. Vitale’s search protocol and inclusionary and exclusionary criteria were, however, stated, explained, and reproducible, even though Dr. Vitale did not explain the application of his criteria to each individual published paper. In the final analysis, the trial court was unmoved by Batty’s critique, especially given that her expert witnesses failed to identify any relevant studies omitted from Dr. Vitale’s review. Batty at *8.

Lumping or Commingling of Heterogeneous Studies

The plaintiff pointed to Dr. Vitale’s “commingling” of studies, heterogeneous in terms of “study length, follow-up, size, design, power, outcome, range of motion, component type” and other clinical features, as a deep flaw in the challenged expert witness’s methodology. Batty at *9. Batty’s own retained expert witness, Dr. Kocher, supported Batty’s charge by adverting to the clinical variability in studies included in Dr. Vitale’s review, and suggesting that “[h]igh levels of heterogeneity preclude combining study results and making conclusions based on combining studies.” Dr. Kocher’s argument was rather beside the point because Dr. Vitale had not impermissibly combined clinically or statistically heterogeneous outcomes.[2] Similarly, the plaintiff’s complaint that Dr. Vitale had used inconsistent criteria of knee implant survival rates was dismissed by the trial court, which easily found Dr. Vitale’s survival criteria both pre-specified and consistent across his review of studies, and relevant to the specific alleged by Ms. Batty. Batty at *9.

Cherry Picking

The trial court readily agreed with Plaintiff’s premise that an expert witness who used inconsistent inclusionary and exclusionary criteria would have to be excluded under Rule 702. Batty at *10, citing In Re Zoloft, 26 F. Supp. 3d 449, 460–61 (E.D. Pa.2014) (excluding epidemiologist Dr. Anick Bérard proffered testimony because of her biased cherry picking and selection of studies to support her studies, and her failure to account for contradictory evidence). The trial court, however, did find that Dr. Vitale’s review was corrupted by the kind of biased cherry picking that Judge Rufe found to have been committed by Dr. Anick Bérard, in the Zoloft MDL.

Duplicitous Duplication

Plaintiff’s challenge of Dr. Vitale did manage to spotlight an error in Dr. Vitale’s inclusion of two studies that were duplicate analyses of the same cohort. Apparently, Dr. Vitale had confused the studies as not being of the same cohort because the two papers reported different sample sizes. Dr. Vitale admitted that his double counting the same cohort “got by the peer-review process and it got by my filter as well.” Batty at *11, citing Vitale Dep. 284:3–12. The trial court judged Dr. Vitale’s error to have been:

“an inadvertent oversight, not an attempt to distort the data. It is also easily correctable by removing one of the studies from the Group 1 analysis so that instead of 28 out of 35 studies reporting 100% survival rates, only 27 out of 34 do so.”

Batty at *11.

The error of double counting studies in quantitative reviews and meta-analyses has become a prevalent problem in both published studies[3] and in litigation reports. Epidemiologic studies are sometimes updated and extended with additional follow up. The prohibition against double counting data is so obvious that it often is not even identified on checklists, such as PRISMA. Furthermore, double counting of studies, or subgroups within studies, is a flaw that most careful readers can identify in a meta-analysis, without advance training. According to statistician Stephen Senn, double counting of evidence is a serious problem in published meta-analytical studies.[4] Senn observes that he had little difficulty in finding examples of meta-analyses gone wrong, including meta-analyses with double counting of studies or data, in some of the leading clinical medical journals. Senn urges analysts to “[b]e vigilant about double counting,” and recommends that journals should withdraw meta-analyses promptly when mistakes are found.”[5]

An expert witness who wished to skate over the replication and consistency requirement might be tempted, as was Dr. Michael Freeman, to count the earlier and later iteration of the same basic study to count as “replication.” Proper methodology, however, prohibits double dipping data to count the later study that subsumes the early one as a “replication”:

“Generally accepted methodology considers statistically significant replication of study results in different populations because apparent associations may reflect flaws in methodology. Dr. Freeman claims the Alwan and Reefhuis studies demonstrate replication. However, the population Alwan studied is only a subset of the Reefhuis population and therefore they are effectively the same.”

Porter v. SmithKline Beecham Corp., No. 03275, 2015 WL 5970639, at *9 (Phila. Cty. Pennsylvania, Ct. C.P. October 5, 2015) (Mark I. Bernstein, J.)

Conclusions

The PRISMA and similar guidelines do not necessarily map the requisites of admissible expert witness opinion testimony, but they are a source of some important considerations for the validity of any conclusion about causality. On the other hand, by specifying the requisites of a good publication, some PRISMA guidelines are irrelevant to litigation reports and testimony of expert witnesses. Although Plaintiff Batty’s challenge overreached and failed, the premise of her challenge is noteworthy, as is the trial court’s having taken the premise seriously. Ultimately, the challenge to Dr. Vitale’s opinion failed because the specified PRISMA guidelines, supposedly violated, were either irrelevant or satisfied.


[1] Zimmer Nexgen Knee Implant Products Liability Litigation.

[2] Dr. Vitale’s review is thus easily distinguished from what has become commonplace in litigation of birth defect claims, where, for instance, some well-known statisticians [names available upon request] have conducted qualitative reviews and quantitative meta-analyses of highly disparate outcomes, such as any and all cardiovascular congenital anomalies. In one such case, a statistician expert witness hired by plaintiffs presented a meta-analysis that included study results of any nervous system defect, and central nervous system defect, and any neural tube defect, without any consideration of clinical heterogeneity or even overlap with study results.

[3] See, e.g., Shekoufeh Nikfar, Roja Rahimi, Narjes Hendoiee, and Mohammad Abdollahi, “Increasing the risk of spontaneous abortion and major malformations in newborns following use of serotonin reuptake inhibitors during pregnancy: A systematic review and updated meta-analysis,” 20 DARU J. Pharm. Sci. 75 (2012); Roja Rahimi, Shekoufeh Nikfara, Mohammad Abdollahic, “Pregnancy outcomes following exposure to serotonin reuptake inhibitors: a meta-analysis of clinical trials,” 22 Reproductive Toxicol. 571 (2006); Anick Bérard, Noha Iessa, Sonia Chaabane, Flory T. Muanda, Takoua Boukhris, and Jin-Ping Zhao, “The risk of major cardiac malformations associated with paroxetine use during the first trimester of pregnancy: A systematic review and meta-analysis,” 81 Brit. J. Clin. Pharmacol. (2016), in press, available at doi: 10.1111/bcp.12849.

[4] Stephen J. Senn, “Overstating the evidence – double counting in meta-analysis and related problems,” 9, at *1 BMC Medical Research Methodology 10 (2009).

[5] Id. at *1, *4.


DOUBLE-DIP APPENDIX

Some papers and textbooks, in addition to Stephen Senn’s paper, cited above, which note the impermissible method of double counting data or studies in quantitative reviews.

Aaron Blair, Jeanne Burg, Jeffrey Foran, Herman, Gibb, Sander Greenland, Robert Morris, Gerhard Raabe, David Savitz, Jane Teta, Dan Wartenberg, Otto Wong, and Rae Zimmerman, “Guidelines for Application of Meta-analysis in Environmental Epidemiology,” 22 Regulatory Toxicol. & Pharmacol. 189, 190 (1995).

“II. Desirable and Undesirable Attributes of Meta-Analysis

* * *

Redundant information: When more than one study has been conducted on the same cohort, the later or updated version should be included and the earlier study excluded, provided that later versions supply adequate information for the meta-analysis. Exclusion of, or in rare cases, carefully adjusting for overlapping or duplicated studies will prevent overweighting of the results by one study. This is a critical issue where the same cohort is reexamined or updated several times. Where duplication exists, decision criteria should be developed to determine which of the studies are to be included and which excluded.”

Sander Greenland & Keith O’Rourke, “Meta-Analysis – Chapter 33,” in Kenneth J. Rothman, Sander Greenland, Timothy L. Lash, Modern Epidemiology 652, 655 (3d ed. 2008) (emphasis added)

Conducting a Sound and Credible Meta-Analysis

Like any scientific study, an ideal meta-analysis would follow an explicit protocol that is fully replicable by others. This ideal can be hard to attain, but meeting certain conditions can enhance soundness (validity) and credibility (believability). Among these conditions we include the following:

  • A clearly defined set of research questions to address.

  • An explicit and detailed working protocol.

  • A replicable literature-search strategy.

  • Explicit study inclusion and exclusion criteria, with a rationale for each.

  • Nonoverlap of included studies (use of separate subjects in different included studies), or use of statistical methods that account for overlap.* * * * *”

Matthias Egger, George Davey Smith, and Douglas G. Altman, Systematic Reviews in Health Care: Meta-Analysis in Context 59 – 60 (2001).

Duplicate (multiple) publication bias

***

The production of multiple publications from single studies can lead to bias in a number of ways.85 Most importantly, studies with significant results are more likely to lead to multiple publications and presentations,45 which makes it more likely that they will be located and included in a meta-analysis. The inclusion of duplicated data may therefore lead to overestimation of treatment effects, as recently demonstrated for trials of the efficacy of ondansetron to prevent postoperative nausea and vomiting86.”

Khalid Khan, Regina Kunz, Joseph Kleijnen, and Gerd Antesp, Systematic Reviews to Support Evidence-Based Medicine: How to Review and Apply Findings of Healthcare Research 35 (2d ed. 2011)

“2.3.5 Selecting studies with duplicate publication

Reviewers often encounter multiple publications of the same study. Sometimes these will be exact  duplications, but at other times they might be serial publications with the more recent papers reporting increasing numbers of participants or lengths of follow-up. Inclusion of duplicated data would inevitably bias the data synthesis in the review, particularly because studies with more positive results are more likely to be duplicated. However, the examination of multiple reports of the same study may provide useful information about its quality and other characteristics not captured by a single report. Therefore, all such reports should be examined. However, the data should only be counted once using the largest, most complete report with the longest follow-up.”

Julia H. Littell, Jacqueline Corcoran, and Vijayan Pillai, Systematic Reviews and Meta-Analysis 62-63 (2008)

Duplicate and Multiple Reports

***

It is a bit more difficult to identify multiple reports that emanate from a single study. Sometimes these reports will have the same authors, sample sizes, program descriptions, and methodological details. However, author lines and sample sizes may vary, especially when there are reports on subsamples taken from the original study (e.g., preliminary results or special reports). Care must be taken to ensure that we know which reports are based on the same samples or on overlapping samples—in meta-analysis these should be considered multiple reports from a single study. When there are multiple reports on a single study, we put all of the citations for that study together in summary information on the study.”

Kay Dickersin, “Publication Bias: Recognizing the Problem, Understanding Its Origins and Scope, and Preventing Harm,” Chapter 2, in Hannah R. Rothstein, Alexander J. Sutton & Michael Borenstein, Publication Bias in Meta-Analysis – Prevention, Assessment and Adjustments 11, 26 (2005)

“Positive results appear to be published more often in duplicate, which can lead to overestimates of a treatment effect (Timmer et al., 2002).”

Julian P.T. Higgins & Sally Green, eds., Cochrane Handbook for Systematic Reviews of Interventions 152 (2008)

“7.2.2 Identifying multiple reports from the same study

Duplicate publication can introduce substantial biases if studies are  inadvertently included more than once in a meta-analysis (Tramer 1997). Duplicate publication can take various forms, ranging from identical manuscripts to reports describing different numbers of participants and different outcomes (von Elm 2004). It can be difficult to detect duplicate publication, and some ‘detectivework’ by the review authors may be required.”

Don’t Double Dip Data

March 9th, 2015

Meta-analyses have become commonplace in epidemiology and in other sciences. When well conducted and transparently reported, meta-analyses can be extremely helpful. In several litigations, meta-analyses determined the outcome of the medical causation issues. In the silicone gel breast implant litigation, after defense expert witnesses proffered meta-analyses[1], court-appointed expert witnesses adopted the approach and featured meta-analyses in their reports to the MDL court[2].

In the welding fume litigation, plaintiffs’ expert witness offered a crude, non-quantified, “vote counting” exercise to argue that welding causes Parkinson’s disease[3]. In rebuttal, one of the defense expert witnesses offered a quantitative meta-analysis, which provided strong evidence against plaintiffs’ claim.[4] Although the welding fume MDL court excluded the defense expert’s meta-analysis from the pre-trial Rule 702 hearing as untimely, plaintiffs’ counsel soon thereafter initiated settlement discussions of the entire set of MDL cases. Subsequently, the defense expert witness, with his professional colleagues, published an expanded version of the meta-analysis.[5]

And last month, a meta-analysis proffered by a defense expert witness helped dispatch a long-festering litigation in New Jersey’s multi-county isotretinoin (Accutane) litigation. In re Accutane Litig., No. 271(MCL), 2015 WL 753674 (N.J. Super., Law Div., Atlantic Cty., Feb. 20, 2015) (excluding plaintiffs’ expert witness David Madigan).

Of course, when a meta-analysis is done improperly, the resulting analysis may be worse than none at all. Some methodological flaws involve arcane statistical concepts and procedures, and may be easily missed. Other flaws are flagrant and call for a gatekeeping bucket brigade.

When a merchant puts his hand the scale at the check-out counter, we call that fraud. When George Costanza double dipped his chip twice in the chip dip, he was properly called out for his boorish and unsanitary practice. When a statistician or epidemiologist produces a meta-analysis that double counts crucial data to inflate a summary estimate of association, or to create spurious precision in the estimate, we don’t need to crack open Modern Epidemiology or the Reference Manual on Scientific Evidence to know that something fishy has taken place.

In litigation involving claims that selective serotonin reuptake inhibitors cause birth defects, plaintiffs’ expert witness, a perinatal epidemiologist, relied upon two published meta-analyses[6]. In an examination before trial, this epidemiologist was confronted with the double counting (and other data entry errors) in the relied-upon meta-analyses, and she readily agreed that the meta-analyses were improperly done and that she had to abandon her reliance upon them.[7] The result of the expert witness’s deposition epiphany, however, was that she no longer had the illusory benefit of an aggregation of data, with an outcome supporting her opinion. The further consequence was that her opinion succumbed to a Rule 702 challenge. See In re Zoloft (Sertraline Hydrochloride) Prods. Liab. Litig., MDL No. 2342; 12-md-2342, 2014 U.S. Dist. LEXIS 87592; 2014 WL 2921648 (E.D. Pa. June 27, 2014) (Rufe, J.).

Double counting of studies, or subgroups within studies, is a flaw that most careful readers can identify in a meta-analysis, without advance training. According to statistician Stephen Senn, double counting of evidence is a serious problem in published meta-analytical studies. Stephen J. Senn, “Overstating the evidence – double counting in meta-analysis and related problems,” 9, at *1 BMC Medical Research Methodology 10 (2009). Senn observes that he had little difficulty in finding examples of meta-analyses gone wrong, including meta-analyses with double counting of studies or data, in some of the leading clinical medical journals. Id. Senn urges analysts to “[b]e vigilant about double counting,” id. at *4, and recommends that journals should withdraw meta-analyses promptly when mistakes are found,” id. at *1.

Similar advice abounds in books and journals[8]. Professor Sander Greenland addresses the issue in his chapter on meta-analysis in Modern Epidemiology:

Conducting a Sound and Credible Meta-Analysis

Like any scientific study, an ideal meta-analysis would follow an explicit protocol that is fully replicable by others. This ideal can be hard to attain, but meeting certain conditions can enhance soundness (validity) and credibility (believability). Among these conditions we include the following:

  • A clearly defined set of research questions to address.

  • An explicit and detailed working protocol.

  • A replicable literature-search strategy.

  • Explicit study inclusion and exclusion criteria, with a rationale for each.

  • Nonoverlap of included studies (use of separate subjects in different included studies), or use of statistical methods that account for overlap. * * * * *”

Sander Greenland & Keith O’Rourke, “Meta-Analysis – Chapter 33,” in Kenneth J. Rothman, Sander Greenland, Timothy L. Lash, Modern Epidemiology 652, 655 (3d ed. 2008) (emphasis added).

Just remember George Costanza; don’t double dip that chip, and don’t double dip in the data.


[1] See, e.g., Otto Wong, “A Critical Assessment of the Relationship between Silicone Breast Implants and Connective Tissue Diseases,” 23 Regulatory Toxicol. & Pharmacol. 74 (1996).

[2] See Barbara Hulka, Betty Diamond, Nancy Kerkvliet & Peter Tugwell, “Silicone Breast Implants in Relation to Connective Tissue Diseases and Immunologic Dysfunction:  A Report by a National Science Panel to the Hon. Sam Pointer Jr., MDL 926 (Nov. 30, 1998)”; Barbara Hulka, Nancy Kerkvliet & Peter Tugwell, “Experience of a Scientific Panel Formed to Advise the Federal Judiciary on Silicone Breast Implants,” 342 New Engl. J. Med. 812 (2000).

[3] Deposition of Dr. Juan Sanchez-Ramos, Street v. Lincoln Elec. Co., Case No. 1:06-cv-17026, 2011 WL 6008514 (N.D. Ohio May 17, 2011).

[4] Deposition of Dr. James Mortimer, Street v. Lincoln Elec. Co., Case No. 1:06-cv-17026, 2011 WL 6008054 (N.D. Ohio June 29, 2011).

[5] James Mortimer, Amy Borenstein & Laurene Nelson, Associations of Welding and Manganese Exposure with Parkinson’s Disease: Review and Meta-Analysis, 79 Neurology 1174 (2012).

[6] Shekoufeh Nikfar, Roja Rahimi, Narjes Hendoiee, and Mohammad Abdollahi, “Increasing the risk of spontaneous abortion and major malformations in newborns following use of serotonin reuptake inhibitors during pregnancy: A systematic review and updated meta-analysis,” 20 DARU J. Pharm. Sci. 75 (2012); Roja Rahimi, Shekoufeh Nikfara, Mohammad Abdollahic, “Pregnancy outcomes following exposure to serotonin reuptake inhibitors: a meta-analysis of clinical trials,” 22 Reproductive Toxicol. 571 (2006).

[7] “Q So the question was: Have you read it carefully and do you understand everything that was done in the Nikfar meta-analysis?

A Yes, I think so.

* * *

Q And Nikfar stated that she included studies, correct, in the cardiac malformation meta-analysis?

A That’s what she says.

* * *

Q So if you look at the STATA output, the demonstrative, the — the forest plot, the second study is Kornum 2010. Do you see that?

A Am I —

Q You’re looking at figure four, the cardiac malformations.

A Okay.

Q And Kornum 2010, —

A Yes.

Q — that’s a study you relied upon.

A Mm-hmm.

Q Is that right?

A Yes.

Q And it’s on this forest plot, along with its odds ratio and confidence interval, correct?

A Yeah.

Q And if you look at the last study on the forest plot, it’s the same study, Kornum 2010, same odds ratio and same confidence interval, true?

A You’re right.

Q And to paraphrase My Cousin Vinny, no self-respecting epidemiologist would do a meta-analysis by including the same study twice, correct?

A Well, that was an error. Yeah, you’re right.

***

Q Instead of putting 2 out of 98, they extracted the data and put 9 out of 28.

A Yeah. You’re right.

Q So there’s a numerical transposition that generated a 25-fold increased risk; is that right?

A You’re correct.

Q And, again, to quote My Cousin Vinny, this is no way to do a meta-analysis, is it?

A You’re right.”

Testimony of Anick Bérard, Kuykendall v. Forest Labs, at 223:14-17; 238:17-20; 239:11-240:10; 245:5-12 (Cole County, Missouri; Nov. 15, 2013). According to a Google Scholar search, the Rahimi 2005 meta-analysis had been cited 90 times; the Nikfar 2012 meta-analysis, 11 times, as recently as this month. See, e.g., Etienne Weisskopf, Celine J. Fischer, Myriam Bickle Graz, Mathilde Morisod Harari, Jean-Francois Tolsa, Olivier Claris, Yvan Vial, Chin B. Eap, Chantal Csajka & Alice Panchaud, “Risk-benefit balance assessment of SSRI antidepressant use during pregnancy and lactation based on best available evidence,” 14 Expert Op. Drug Safety 413 (2015); Kimberly A. Yonkers, Katherine A. Blackwell & Ariadna Forray, “Antidepressant Use in Pregnant and Postpartum Women,” 10 Ann. Rev. Clin. Psychol. 369 (2014); Abbie D. Leino & Vicki L. Ellingrod, “SSRIs in pregnancy: What should you tell your depressed patient?” 12 Current Psychiatry 41 (2013).

[8] Julian Higgins & Sally Green, eds., Cochrane Handbook for Systematic Reviews of Interventions 152 (2008) (“7.2.2 Identifying multiple reports from the same study. Duplicate publication can introduce substantial biases if studies are inadvertently included more than once in a meta-analysis (Tramèr 1997). Duplicate publication can take various forms, ranging from identical manuscripts to reports describing different numbers of participants and different outcomes (von Elm 2004). It can be difficult to detect duplicate publication, and some ‘detectivework’ by the reviewauthors may be required.”); see also id. at 298 (Table 10.1.a “Definitions of some types of reporting biases”); id. at 304-05 (10.2.2.1 Duplicate (multiple) publication bias … “The inclusion of duplicated data may therefore lead to overestimation of intervention effects.”); Julian P.T. Higgins, Peter W. Lane, Betsy Anagnostelis, Judith Anzures-Cabrera, Nigel F. Baker, Joseph C. Cappelleri, Scott Haughie, Sally Hollis, Steff C. Lewis, Patrick Moneuse & Anne Whitehead, “A tool to assess the quality of a meta-analysis,” 4 Research Synthesis Methods 351, 363 (2013) (“A common error is to double-count individuals in a meta-analysis.”); Alessandro Liberati, Douglas G. Altman, Jennifer Tetzlaff, Cynthia Mulrow, Peter C. Gøtzsche, John P.A. Ioannidis, Mike Clarke, Devereaux, Jos Kleijnen, and David Moher, “The PRISMA Statement for Reporting Systematic Reviews and Meta-Analyses of Studies That Evaluate Health Care Interventions: Explanation and Elaboration,” 151 Ann. Intern. Med. W-65, W-75 (2009) (“Some studies are published more than once. Duplicate publications may be difficult to ascertain, and their inclusion may introduce bias. We advise authors to describe any steps they used to avoid double counting and piece together data from multiple reports of the same study (e.g., juxtaposing author names, treatment comparisons, sample sizes, or outcomes).”) (internal citations omitted); Erik von Elm, Greta Poglia; Bernhard Walder, and Martin R. Tramèr, “Different patterns of duplicate publication: an analysis of articles used in systematic reviews,” 291 J. Am. Med. Ass’n 974 (2004); John Andy Wood, “Methodology for Dealing With Duplicate Study Effects in a Meta-Analysis,” 11 Organizational Research Methods 79, 79 (2008) (“Dependent studies, duplicate study effects, nonindependent studies, and even covert duplicate publications are all terms that have been used to describe a threat to the validity of the meta-analytic process.”) (internal citations omitted); Martin R. Tramèr, D. John M. Reynolds, R. Andrew Moore, Henry J. McQuay, “Impact of covert duplicate publication on meta­analysis: a case study,” 315 Brit. Med. J. 635 (1997); Beverley J Shea, Jeremy M Grimshaw, George A. Wells, Maarten Boers, Neil Andersson, Candyce Hamel, Ashley C. Porter, Peter Tugwell, David Moher, and Lex M. Bouter, “Development of AMSTAR: a measurement tool to assess the methodological quality of systematic reviews,” 7(10) BMC Medical Research Methodology 2007 (systematic reviews must inquire whether there was “duplicate study selection and data extraction”).