TORTINI

For your delectation and delight, desultory dicta on the law of delicts.

The Supreme Court’s Unsteady Gatekeeping Pre-Daubert

September 8th, 2012

Some writers assert that the United States Supreme Court did not wade into the troubled waters of medical causation and expert witness testimony until it decided Daubert v. Merrell Dow Pharmaceuticals, 509 U.S. 579 (1993).  Actually, the Court swam in these stormy waters in admiralty and FELA cases, at least up through the 1950s.

In 1953, Mr. Sentilles, a marine engineer, was thrown to the deck of his ship, and washed off deck, by a wave.  He became ill with tuberculosis, and he brought a personal injury action (for “maintenance and cure”) against the vessel owner.  Inter-Caribbean Shipping Corp. v. Sentilles, 256 F.2d 156 (5th Cir. 1958).  The vessel owner defended on the theory that the plaintiff’s diabetes pre-disposed him to TB, and that the plaintiff’s expert witnesses were equivocal in their conclusions about causation or aggravation.  The jury nonetheless found for the plaintiff.

The Fifth Circuit reversed the judgment entered on the jury’s verdict for the seaman, finding the plaintiff’s expert witnesses’ testimony inadequate to support submission of the case to the jury:

“The rule as to the medical testimony respecting causation which is required to take a case to a jury has been thus stated:

It appears to be well settled that medical testimony as to the possibility of a causal relation between a given accident or injury and the subsequent death or impaired physical or mental condition of the person injured is not sufficient, standing alone, to establish such relation. By testimony as to possibility is meant testimony in which the witness asserts that the accident or injury ‘might have’, ‘may have’, or ‘could have’ caused, or ‘possibly did’ cause the subsequent physical condition or death or that a given physical condition (or death) ‘might have’, ‘may have’, or ‘could have’ resulted or ‘possibly did’ result from a previous accident or injury — testimony, that is, which is confined to words indicating the possibility or chance of the existence of the causal relation in question and does not include words indicating the probability or likelihood of its existence.”

Id. (internal citations omitted).

The Supreme Court granted a writ of certiorari, heard argument, and reversed the Court of Appeals.  Sentilles v. Inter-Caribbean Shipping Corp., 361 U.S. 107 (1959).  In announcing the Court’s opinion, Justice Brennan voiced the remarkable doctrine that the jury could find reasonable probability when the expert witnesses could not:

“The jury’s power to draw the inference that the aggravation of petitioner’s tubercular condition, evident so shortly after the accident, was in fact caused by that accident, was not impaired by the failure of any medical witness to testify that it was in fact the cause.  Neither can it be impaired by the lack of medical unanimity as to the respective likelihood of the potential causes of the aggravation, or by the fact that other potential causes of the aggravation existed and were not conclusively negated by the proofs.  The matter does not turn on the use of a particular form of words by the physicians in giving their testimony.  The members of the jury, not the medical witnesses, were sworn to make a legal determination of the question of causation.  They were entitled to take all the circumstances, including the medical testimony, into consideration.  Though this case involves a medical issue, it is no exception to the admonition that, ‘It is not the function of a court to search the record for conflicting circumstantial evidence in order to take the case away from the jury on a theory that the proof gives equal support to inconsistent and uncertain inferences.  The focal point of judicial review is the reasonableness of the particular inference or conclusion drawn by the jury. * * * The very essence of its function is to select from conflicting inferences and conclusions that which it considers most reasonable.  * * * Courts are not free to reweigh the evidence and set aside the jury verdict merely because the jury could have drawn different inferences or conclusions or because judges feel that other results are more reasonable.’”

Id. at 109-10.  Justice Brennan thus ignored equally venerable precedent that juries are not free to speculate, and he failed to consider how the jury in this case could reach a determination in the face of conflicting evidence, and without ruling out alternative causes.

Sentilles was decided before the enactment of the Federal Rules of Evidence, and there was no challenge to the plaintiff’s expert witnesses’ testimony under the Frye doctrine.  Another crucial difference, of course, is that Sentilles was an isolated case, not likely to recur frequently in the federal courts.  With the rise of product liability law, and the emergence of epidemiology as a basis for inferring causality, the federal courts would soon see mass exposure situations resulting in mass torts.  Dubious expert witness testimony resulting in dubious judgments of causation would attain much greater notoriety, for the expert witnesses, for the trial bar, and for the courts that tolerated the results.

David Egilman’s Methodology for Divining Causation

September 6th, 2012

If the Method Yields An Erroneous Conclusion, then the Method is Wrong

David Stephen Egilman wanted very much to testify in a diacetyl case.  One judge, however, did not think that this was such a good idea, and excluded Dr. Egilman’s testimony. Newkirk v. Conagra Foods, Inc., 727 F. Supp. 2d 1006 (E.D. Wash. 2010).

Egilman was so distraught by being excluded that he sought to file a personal appeal to the United States Court of Appeal. See “Declaration of David Egilman, M.D., M.P.H., in Support of Opposition to Motion for Order to Show Cause Why Appeal Should Not Be Dismissed for Lack of Standing.”  (Attached: Egilman Motion Appeal Diacetyl Exclusion 2011 and Egilman Declaration Newkirk Diacetyl Appeal 2011.)

Egilman improvidently, if not scurrilously, attacked the district judge for having excluded Egilman’s proffered testimony.  If Egilman’s attack on the trial judge were not sufficiently odd, Egilman also claimed a right to intervene in the appeal by advancing the claim that the Rule 702 exclusion hurt his livelihood.  Here is how Egilman put the matter:

“The Daubert ruling eliminates my ability to testify in this case and in others. I will lose the opportunity to bill for services in this case and in others (although I generally donate most fees related to courtroom testimony to charitable organizations, the lack of opportunity to do so is an injury to me). Based on my experience, it is virtually certain that some lawyers will choose not to attempt to retain me as a result of this ruling. Some lawyers will be dissuaded from retaining my services because the ruling is replete with unsubstantiated pejorative attacks on my qualifications as a scientist and expert. The judge’s rejection of my opinion is primarily an ad hominem attack and not based on an actual analysis of what I said – in an effort to deflect the ad hominem nature of the attack the judge creates ‘straw man’ arguments and then knocks the straw men down, without ever addressing the substance of my positions.”

Egilman Declaration at ¶ 11.

The Ninth Circuit, unmoved by the prospect of an impoverished Dr. Egilman, denied his personal appeal, and affirmed the district court’s exclusion. Newkirk v. Conagra Foods, Inc., 438 Fed. Appx. 607 (9th Cir. 2011).

In his appellate papers, Egilman did not stop at simply citing his pecuniary interest.  With no sense of shame or false modesty, Egilman recited what a wonderful expert witness he has been.  Egilman suggested that courts have been duly impressed by his views on the scientific assessment of causation:

“My views on the scientific standards for the determination of cause-effect relationships (medical epistemology) have been cited by the Massachusetts Supreme Court (Vassallo v. Baxter Healthcare Corporation, 428 Mass. 1 (1998)):

‘Although there was conflicting testimony at the Oregon hearing as to the necessity of epidemiological data to establish causation of a disease, the judge appears to have accepted the testimony of an expert epidemiologist that, in the absence of epidemiology, it is “sound science…. to rely on case reports, clinical studies, in vivo tests and animal tests.” The judge may also have relied on the affidavit of the plaintiff’s epidemiological expert, Dr. David S. Egilman, who identified several examples in which disease causation has been established based on animal and clinical case studies alone to demonstrate that “doctors utilize epidemiological data as one tool among many”.’”

Egilman Declaration at p.5-6.

We may excuse Dr. Egilman, a non-lawyer, for incorrectly referring to a non-existent court.  Massachusetts does not have a “Supreme Court,” but the quoted language did indeed come from the Supreme Judicial Court of Massachusetts, in Vassallo v. Baxter Healthcare Corporation, 428 Mass. 1, 12, 696 N.E.2d 909, 917 (1998).

The Massachusetts court’s suggestion that there was conflicting testimony at the “Oregon hearing” about the need for epidemiologic evidence is itself rather bizarre.  The Oregon hearing was the Rule 702 hearing before Judge Jones, of the District of Oregon.  Judge Jones appointed four technical advisors to assist him in ruling on the defendants’ motions to exclude plaintiffs’ causation opinions.  One of the appointed advisors was an epidemiologist.  More important, the plaintiffs’ counsel presented the testimony of an epidemiologist, Dr. David Goldsmith.  The Massachusetts court did not, and indeed could not, cite the Oregon District Court’s opinion, or the underlying record, for any suggestion that epidemiologic testimony was not needed to show a causal relationship between silicone breast implants and the development of autoimmune disease.  See Hall v. Baxter Healthcare Corp., 947 F. Supp. 1387 (D. Or. 1996). Judge Jones made his views very clear:  epidemiology was needed, but lacking, in the plaintiffs’ case.  The argument that epidemiology was unnecessary came from Dr. Egilman’s report, and the plaintiffs’ counsel’s briefs.

There is more, however, to the disingenuousness of Dr. Egilman’s citation to the Vassallo case.  The Newkirk court, in receiving his curious affidavit, would not likely know that Vassallo was a silicone gel breast implant case, and one may suspect that Dr. Egilman wanted to keep the Ninth Circuit uninformed of his role in the silicone litigation.  If Dr. Egilman submitted an affidavit in connection with the so-called Oregon hearings, which took place during the summer of 1996, it was not a particularly important piece of evidence.  Egilman is not mentioned by name in the Hall decision, even though the district court clearly rejected the plaintiffs’ witnesses and affiants, in their efforts to make a case for silicone as a cause of autoimmune disease.

A few months after the Oregon hearings, in the fall of 1996, Judge Weinstein, along with other federal and state judges, held a “Daubert” hearing on the admissibility of expert witness opinion testimony in breast implant cases pending in New York state and federal courts.  Plaintiffs’ counsel suggested that Egilman might testify, but ultimately he was a no-show.  After the New York hearings, Judge Weinstein granted, sua sponte, partial summary judgment against all plaintiffs’ claims of systemic immune-system injury.  In re Breast Implant Cases, 942 F. Supp. 958 (E. & S.D.N.Y. 1996).

At the New York hearings, plaintiffs’ counsel again attempted to make an epidemiologic case, and once again called Dr. David Goldsmith.  Marshaling the evidentiary display that Egilman would have presented had he shown up in New York, Dr. Goldsmith did not fare well. At one point, Judge Weinstein interrupted and offered his interim assessment of Dr. Goldsmith and the plaintiffs’ causation case:

THE COURT: Why are you presenting this witness, for epidemiological purposes?

MR. GORDON: That’s correct.

THE COURT: And I can tell you for epidemiological purposes, based on the only testimony I have seen, he doesn’t meet my standard of anybody who can be helpful to a jury, not because he isn’t a great epidemiologist, I’m sure he is, but because the data he is relying on admittedly is almost useless. I’m not going to go forward with a trial on this kind of haphazard abstract without any basic definition or explication.

Transcript at p.159:7-18, from Nyitray v. Baxter Healthcare Corp., CV 93-159 (E.D.N.Y. Oct. 9, 1996)(pre-trial hearing before Judge Jack Weinstein, Justice Lobis, and Magistrate Cheryl Pollak).  In his semi-autobiographical writings, Judge Jack B. Weinstein elaborated upon his published breast-implant decision, with a bit more detail about how he viewed the plaintiffs’ expert witnesses.  Judge Jack B. Weinstein, “Preliminary Reflections on Administration of Complex Litigation” 2009 Cardozo L. Rev. de novo 1, 14 (2009) (describing plaintiffs’ expert witnesses in silicone litigation as “charlatans”; “[t]he breast implant litigation was largely based on a litigation fraud. … Claims—supported by medical charlatans—that enormous damages to women’s systems resulted could not be supported.”)

When Judge Weinstein began to create a process for the selection of Rule 706 court-appointed expert witnesses, plaintiffs’ counsel rushed to have Judge Pointer take control over the process.  Because Judge Pointer believed that there must be some germ of validity in the plaintiffs’ case, the plaintiffs were hoping that his courtroom, the center of MDL 926, would be a more favorable forum than Judge Weinstein’s withering skepticism.  Ultimately, Judge Pointer, through a select nominating committee, appointed four expert witnesses, in the fields of toxicology, immunology, rheumatology, and epidemiology.  MDL 926 Order No. 31 (Appointment of Rule 706 Expert Witnesses).

Each of the four witnesses prepared, presented, and defended his or her own report, but all the reports soundly rejected plaintiffs’ causation theories.  Laural L. Hooper, Joe S. Cecil, and Thomas E. Willging, Neutral Science Panels: Two Examples of Panels of Court-Appointed Experts in the Breast Implants Product Liability Litigation (Fed. Jud. Ctr. 2001).

In the United Kingdom, the British Minister of Health ordered an independent review of the breast implant controversy, which led to the formation of the Independent Review Group (IRG) to evaluate the causal claims that were being made by claimants and advocates. The IRG concluded that there was no demonstrable risk of connective tissue disease from silicone breast implants. Independent Review Group, Silicone Breast Implants: The Report of the Independent Review Group 8, 22-23 (July 1998).

In 1999, the Institute of Medicine delivered its assessment of the safety of silicone breast implants.  Again, the plaintiffs’ theories were rejected.  Stuart Bondurant, Virginia Ernster, and Roger Herdman, eds., Safety of Silicone Breast Implants (1999).

Still, Egilman persisted.  As late as 2000, Egilman was posting his breast-implant litigation report at his Brown University website.  His conclusion, however awkwardly worded, was clear enough:

“Although a prospective, large epidemiological study investigating atypical symptoms and disease would clearly contribute to underestimating of the strength of association between silicone breast implants and disease, the available epidemiologic evidence is suggestive of a causal association for silicone breast implants and atypical connective tissue diseases and scleroderma.”

David S. Egilman, “Breast Implants and Disease” (2000) (“For purposes of this report SBI induced disease is considered an iatrogenic environmental disease.”) (<http://209.67.232.40/brown/implants/sbi.html> last visited on Mar. 28, 2000).

Sometime after 2000, Egilman developed a sensitivity to being associated with the plaintiffs’ side of the silicone litigation.  In 2009, Dr. Laurence Hirsch published an article critical of Egilman’s conflict-of-interest disclosures in some of his published articles.  Hirsch struck a sensitive nerve in mentioning Egilman’s involvement in the breast implant litigation:

“Egilman reports having testified for plaintiffs in legal cases involving asbestosis, occupational lung disease, beryllium poisoning, silicone breast implants and connective tissue disease (characterized as the epitome of junk science91), selective serotonin reuptake inhibitor and suicide risk, atypical antipsychotics and metabolic changes, and selective COX-2 inhibitors and cardiovascular disease, an amazing breadth of medical expertise.”

Laurence J. Hirsch, “Conflicts of Interest, Authorship, and Disclosures in Industry-Related Scientific Publications: The Tort Bar and Editorial Oversight of Medical Journals,” 84 Mayo Clin. Proc. 811, 815 (2009).

Egilman apparently besieged Dr. Hirsch and the Mayo Clinic Proceedings with his protests, and it seems that he was able to induce the author or the journal to publish a “correction”:

“Dr Egilman has not testified in court in breast implant and connective tissue disease, or in antidepressant or antipsychotic drug cases.”

Laurence J. Hirsch, “Corrections,” 85 Mayo Clin. Proc. 99 (2010).  But this correction is itself incorrect because Dr. Egilman testified over the course of three days, in court, in the same Vassallo v. Baxter Healthcare case he holds up as having embraced his causal “principles.”  The Vassallo case involved allegations that silicone had caused systemic autoimmune disease, an allegation that was ultimately shown to be meritless by the MDL court’s neutral expert witnesses, as well as the Institute of Medicine.

Perhaps this history helps explain Dr. Egilman’s coyness in what he told the Newkirk appellate court about his involvement in the Vassallo case.  More likely, Dr. Egilman understands, all too well, the logical implications of his having been wrong in the breast implant litigation.  If his vaunted method leads to an erroneous conclusion, then the method must be wrong.  It is a simple matter of modus tollens:  if the method were sound, it would not have yielded a false conclusion; the conclusion was false; therefore the method is not sound.

Open Admissions for Expert Witnesses in Chantix Litigation

September 1st, 2012

Chantix is a medication that helps people stop smoking.  Smoking kills people, but make a licensed drug and the lawsuits will come.

Earlier this month, Judge Inge Prytz Johnson, the MDL trial judge in the Chantix litigation, filed an opinion that rejected Pfizer’s challenges to plaintiffs’ general causation expert witnesses.  Memorandum Opinion and Order, In re Chantix (Varenicline) Products Liability Litigation, MDL No. 2092, Case 2:09-cv-02039-IPJ Document 642 (N.D. Ala. Aug. 21, 2012)[hereafter cited as Chantix].

Plaintiffs claimed that Chantix causes depression and suicidality, sometimes severe enough to result in suicide, attempted or completed.  Chantix at 3-4.  Others have written about Judge Johnson’s decision.  See Lacayo, “Win Some, Lose Some: Recent Federal Court Rulings on Daubert Challenges to Plaintiffs’ Experts,” (Aug. 30, 2012).

The breadth and depth of the errors in the trial court’s analysis, however, remain to be explored.


STATISTICAL SIGNIFICANCE

The Chantix MDL court notes several times that the defendant “harped” on this or that issue; the reader might think the defendant was a music label rather than a pharmaceutical manufacturer.  One of the defendant’s chords that failed to resonate with the trial judge was the point that the plaintiffs’ expert witnesses relied upon statistically non-significant results.  Here is how the trial court reported the issue:

“While the defendant repeatedly harps on the importance of statistically significant data, the United States Supreme Court recently stated that ‘[a] lack of statistically significant data does not mean that medical experts have no reliable basis for inferring a causal link between a drug and adverse events …. medical experts rely on other evidence to establish an inference of causation.’ Matrixx Initiatives, Inc. v. Siracusano, 131 S.Ct. 1309, 1319 (2011).”

Chantix at 22.

Well, it was only a matter of time before the Supreme Court’s dictum would be put to this predictably erroneous interpretation.  See “The Matrixx Oversold” (April 4, 2011).

Matrixx involved a motion to dismiss the complaint, which the trial court granted, but the Ninth Circuit reversed.  No evidence was offered; nor was any ruling that evidence was unreliable or insufficient at issue. The Supreme Court affirmed the Ninth Circuit on the question whether pleading statistical significance was necessary.  Matrixx Initiatives had urged a statistical-significance pleading requirement in the hopes of avoiding the merits, and so the issue of causation was never before the Supreme Court.  A unanimous Supreme Court held that because FDA regulatory action does not require reliable evidence to support a causal conclusion, pleading materiality for a securities fraud suit does not require an allegation of causation, and thus does not require an allegation of statistically significant evidence. Everything that the Court said about statistical significance and causation was obiter dictum, and rather ill-considered dictum at that.

The Supreme Court thus wandered far beyond its holding to suggest that courts “frequently permit expert testimony on causation based on evidence other than statistical significance.” Matrixx Initiatives, Inc. v. Siracusano, 131 S.Ct. 1309, 1319 (2011) (citing Wells v. Ortho Pharm. Corp., 788 F.2d 741, 744-745 (11th Cir. 1986)).  But the Supreme Court’s citation to Wells, in Justice Sotomayor’s opinion, failed to support the point she was trying to make, or the decision that the trial court announced in Chantix.

Wells involved a claim of birth defects caused by the use of a spermicidal contraceptive jelly.  At least one study relied upon reported a statistically significant increase in detected birth defects over the expected rate.  Wells v. Ortho Pharmaceutical Corp., 615 F. Supp. 262 (N.D. Ga. 1985), aff’d, and rev’d in part on other grounds, 788 F.2d 741 (11th Cir.), cert. denied, 479 U.S. 950 (1986).  Wells is not an example of a case in which an expert witness opined about causation in the absence of a scientific study with statistical significance. Of course, finding statistical significance is just the beginning of assessing the causality of an association; the Wells case was and remains notorious for the expert witness’s poor assessment of all the determinants of scientific causation, including the validity of the studies relied upon.

The Wells decision was met with severe criticism in the 1980s, both for its failure to evaluate the entire evidentiary display, and for its failure to rule out bias and confounding in the studies relied upon by the plaintiff.  See, e.g., James L. Mills and Duane Alexander, “Teratogens and ‘Litogens’,” 315 New Engl. J. Med. 1234 (1986); Samuel R. Gross, “Expert Evidence,” 1991 Wis. L. Rev. 1113, 1121-24 (1991) (“Unfortunately, Judge Shoob’s decision is absolutely wrong. There is no scientifically credible evidence that Ortho-Gynol Contraceptive Jelly ever causes birth defects.”). See also Editorial, “Federal Judges v. Science,” N.Y. Times, December 27, 1986, at A22 (unsigned editorial);  David E. Bernstein, “Junk Science in the Courtroom,” Wall St. J. at A15 (Mar. 24, 1993) (pointing to Wells as a prominent example of how the federal judiciary had embarrassed the American judicial system with its careless, non-evidence-based approach to scientific evidence). A few years later, another case in the same judicial district, against the same defendant, for the same product, resulted in the grant of summary judgment.  Smith v. Ortho Pharmaceutical Corp., 770 F. Supp. 1561 (N.D. Ga. 1991) (supposedly distinguishing Wells on the basis of more recent studies).

Neither the Justices in Matrixx Initiatives nor the trial court in Chantix can be excused for their poor scholarship, or their failure to note that Wells was overruled sub silentio by the Supreme Court’s own subsequent decisions in Daubert, Joiner, Kumho Tire, and Weisgram.  And if the weight of precedent did not kill the concept, then there is the simple matter of a supervening statute:  the 2000 amendment of Rule 702 of the Federal Rules of Evidence.


CONFUSING REGULATORY ACTION WITH CAUSAL ASSESSMENTS

The Supreme Court in Matrixx Initiatives was careful to distinguish causal judgments from regulatory action, but then went on in dictum to conflate the two.  The trial judge in Chantix showed no similar analytical care.  Judge Johnson held that the asserted absence of statistical significance was not a basis for excluding plaintiffs’ expert witnesses’ opinions on general causation.  Her Honor adverted to the Matrixx Initiatives dictum that the FDA “does not apply any single metric for determining when additional inquiry or action is necessary.” Matrixx, 131 S.Ct. at 1320; Chantix at 22.  Judge Johnson noted:

“that ‘[n]ot only does the FDA rely on a wide range of evidence of causation, it sometimes acts on the basis of evidence that suggests, but does not prove, causation…. the FDA may make regulatory decisions against drugs based on postmarketing evidence that gives rise to only a suspicion of causation’.  Matrixx, id. The court declines to hold the plaintiffs’ experts to a more exacting standard as the defendant requests.”

Chantix at 23.

In the trial court’s analysis, the difference between regulatory action and civil litigation fact adjudication is obliterated.  This, however, is not the law of the United States, which has consistently acknowledged the difference. See, e.g., IUD v. API, 448 U.S. 607, 656 (1980)(“agency is free to use conservative assumptions in interpreting the data on the side of overprotection rather than underprotection.”)

As the Second Edition of the Reference Manual on Scientific Evidence (the outdated edition cited by the court in Chantix) explains:

“[p]roof of risk and proof of causation entail somewhat different questions because risk assessment frequently calls for a cost-benefit analysis. The agency assessing risk may decide to bar a substance or product if the potential benefits are outweighed by the possibility of risks that are largely unquantifiable because of presently unknown contingencies. Consequently, risk assessors may pay heed to any evidence that points to a need for caution, rather than assess the likelihood that a causal relationship in a specific case is more likely than not.”

Margaret A. Berger, “The Supreme Court’s Trilogy on the Admissibility of Expert Testimony,” in Reference Manual On Scientific Evidence at 33 (Fed. Jud. Ctr. 2d. ed. 2000).


CONCLUSIONS VS. METHODOLOGY

Judge Johnson insisted that the “court’s focus was solely on the principles and methodology, not on the conclusions they generate.” Chantix at 9.  This insistence, however, is contrary to the established law of Rule 702.

Although the United States Supreme Court attempted, in Daubert, to draw a distinction between the reliability of an expert witness’s methodology and conclusion, the Court soon realized that the distinction was flawed. If an expert witness’s proffered testimony is discordant with regulatory and scientific conclusions, a reasonable, disinterested scientist would be led to question the reliability of the testimony’s methodology and its inferences from facts and data to its conclusion.  The Supreme Court recognized this connection in General Electric v. Joiner, and the connection between methodology and conclusions was ultimately incorporated into a statute, the revised Federal Rule of Evidence 702:

“[I]f scientific, technical or other specialized knowledge will assist the trier of fact to understand the evidence or to determine a fact in issue, a witness qualified as an expert by knowledge, skill, experience, training or education, may testify thereto in the form of an opinion or otherwise, if

  1. the testimony is based upon sufficient facts or data,
  2. the testimony is the product of reliable principles and methods; and
  3. the witness has applied the principles and methods reliably to the facts of the case.”

When the testimony is a conclusion about causation, Rule 702 directs an inquiry into whether that conclusion is based upon sufficient facts or data, and whether that conclusion is the product of reliable principles and methods.  The court’s focus should indeed be on the conclusion as well as the methodology claimed to generate the conclusion.  The Chantix MDL court thus ignored the clear mandate of a statute, Rule 702(1), and applied dictum from Daubert that had been superseded by Joiner and by an Act of Congress.  The ruling is thus legally invalid to the extent it departs from the statute.


EPIDEMIOLOGY

For obscure reasons, Judge Johnson sought to deprecate the need to rely upon epidemiologic studies, whether placebo-controlled clinical trials or observational studies.  See Chantix at 25 (citing Rider v. Sandoz Pharm. Corp., 295 F.3d 1194, 1198-99 (11th Cir. 2002)). Of course, the language cited in Rider came from a pre-Daubert, pre-Joiner case, Wells v. Ortho Pharm. Corp., 788 F.2d 741, 745 (11th Cir. 1986) (holding that “a cause-effect relationship need not be clearly established by animal or epidemiological studies”).  This dubious legal lineage cannot support the glib dismissal of the need for epidemiologic evidence.


WEIGHT OF THE EVIDENCE (WOE)

According to Judge Johnson, plaintiffs’ expert witness Shira Kramer considered all the evidence relevant to Chantix and neuropsychiatric side effects, in what Kramer described as a “weight of the evidence” analysis.  Chantix at 26.  In her report, Kramer had written that determinations about the weight of evidence are “subjective interpretations” based upon “various lines of scientific evidence.” Id. (citing and quoting Kramer’s report). Kramer also claimed that every scientist “brings a unique set of experiences, training and expertise …. Philosophical differences exist between experts…. Therefore, it is not surprising that differences of opinion exist among scientists. Such differences of opinion are not necessarily evidence of flawed scientific reasoning or methodology, but rather differences in judgment between scientists.” Id.

Without any support from the scientific literature, or the Reference Manual on Scientific Evidence, Judge Johnson accepted Kramer’s explanation of a totally subjective, unprincipled approach as a scientific methodology.  Not surprisingly, Judge Johnson also cited the First Circuit’s similarly vacuous embrace of a WOE analysis in Milward v. Acuity Specialty Products Group, Inc., 639 F.3d 11, 22 (1st Cir. 2011).  Chantix at 51.


CHERRY PICKING

Judge Johnson noted, contrary to her earlier suggestion that Shira Kramer had considered all the studies, that Kramer had excluded data from her analysis.  Kramer’s exclusions may have rested upon pre-specified exclusionary principles, or they may have been completely ad hoc, like the missing weighting principles in her WOE analysis.  In its gatekeeping role, however, the trial court expressed complete indifference to Kramer’s selectivity in excluding data:  “Why Dr. Kramer chose to include or exclude data from specific clinical trials is a matter for cross-examination.”  Chantix at 27.  This indifference is an abdication of the court’s gatekeeping responsibility.


POWER

The trial court attempted to justify its willingness to mute defendant’s harping on statistical significance by adverting to the concept of statistical power:

“Oftentimes, epidemiological studies lack the statistical power needed for definitive conclusions, either because they are small or the suspected adverse effect is particularly rare. Id. [Michael D. Green et al., “Reference Guide on Epidemiology,” in Reference Manual on Scientific Evidence 333, 335 (Fed. Judicial Ctr. 2d ed. 2000)] … .”

Chantix at 29 n.16.

To be fair to the trial court, the Reference Manual invited this illegitimate use of statistical power because it, at times, omits the specification that statistical power requires not only a level of statistical significance to be attained, but also a specified alternative hypothesis against which power is assessed.  See Power in the Courts — Part One; Power in the Courts — Part Two.  The trial court offered no alternative hypothesis against which any measure of power was to be assessed.

Judge Johnson did not report any power analyses, and she certainly did not report any quantification of power or lack thereof against some specific alternative hypothesis.  Judge Johnson’s invocation of power was just that – power used arbitrarily, without data, evidence, or reason.
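Power, moreover, is always computed against a stated alternative. A minimal sketch in Python, assuming a two-sided, two-sample z-test for proportions and using wholly hypothetical event rates and sample sizes, shows how the very same study can be badly underpowered against one alternative and amply powered against another:

    from math import sqrt
    from scipy.stats import norm

    def power_two_proportions(p0, p1, n_per_arm, alpha=0.05):
        # Approximate power of a two-sided, two-sample z-test for proportions.
        # p0: baseline event rate; p1: event rate under the chosen alternative.
        se = sqrt(p0 * (1 - p0) / n_per_arm + p1 * (1 - p1) / n_per_arm)
        z_crit = norm.ppf(1 - alpha / 2)
        shift = abs(p1 - p0) / se
        return norm.sf(z_crit - shift) + norm.cdf(-z_crit - shift)

    # One hypothetical study (500 subjects per arm, 1% baseline rate),
    # assessed against three different alternatives:
    for p1 in (0.02, 0.04, 0.08):
        print(f"alternative rate {p1:.2f}: "
              f"power = {power_two_proportions(0.01, p1, 500):.2f}")

Until the alternative (here, the rate p1) is specified, the assertion that a study “lacked power” has no quantitative content.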


CONFIDENCE INTERVALS

As with the invocation of statistical power, the trial court also invoked the concept of confidence intervals to suggest that such intervals provide a more refined approach to assessing statistical significance:

“A study found to have ‘results that are unlikely to be the result of random error’ is ‘statistically significant’. Reference Guide on Epidemiology, supra, at 354. Statistical significance, however, does not indicate the strength of an association found in a study. Id. at 359. ‘A study may be statistically significant but may find only a very weak association; conversely, a study with small sample sizes may find a high relative risk but still not be statistically significant.’ Id. To reach a ‘more refined assessment of appropriate inferences about the association found in an epidemiologic study’, researchers rely on another statistical technique known as a confidence interval. Id. at 360.”

Chantix at 30 n.17.  True, true, but immaterial.  The trial court, again, never carries through with the direction given by the Reference Manual.  Not a single confidence interval is presented.  No confidence intervals are subjected to this more refined assessment.  Why have more refined assessments when even the cruder assessments are not done?
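The Reference Manual’s point is easy to demonstrate. In the following sketch, with invented risk ratios and standard errors used purely for illustration, a weak association from a large study attains statistical significance, while a strong association from a small study does not:

    from math import exp, log
    from scipy.stats import norm

    def describe(rr, se_log):
        # rr: estimated relative risk; se_log: standard error of log(rr)
        z = log(rr) / se_log
        lo = exp(log(rr) - 1.96 * se_log)
        hi = exp(log(rr) + 1.96 * se_log)
        p_value = 2 * norm.sf(abs(z))
        return f"RR {rr:.2f}: 95% CI ({lo:.2f}, {hi:.2f}), p = {p_value:.3f}"

    print(describe(1.05, 0.02))  # weak association, large study: significant
    print(describe(3.00, 0.80))  # strong association, small study: not significant

Had the court run even one of the studies’ intervals through this elementary exercise, the “more refined assessment” would have been at hand.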


OPEN ADMISSIONS IN SCHOOL OF EXPERT WITNESSING

The trial court somehow had the notion that all it had to do was state that every disputed fact and opinion went to the weight, not the admissibility, and then pass the dispute to a presumably more scientifically literate jury.  To be sure, the court engaged in a good deal of hand waving, going through the motions of deciding contested issues.  Not only did Judge Johnson smash poor Pfizer’s harp, Her Honor unhinged the gate that federal judges are supposed to keep.  Chantix declares that it is now open admissions for expert witnesses testifying to causation in federal cases.  This is a judgment in search of an appeal.

Eighth Circuit Holds That Increased Risk Is Not Cause

August 4th, 2012

The South Dakota legislature took it upon itself to specify the “risks” to be included in the informed consent required by state law for an abortion procedure:

(1) A statement in writing providing the following information:
* * *
(e) A description of all known medical risks of the procedure and statistically significant risk factors to which the pregnant woman would be subjected, including:
(i) Depression and related psychological distress;
(ii) Increased risk of suicide ideation and suicide;
* * *

S.D.C.L. § 34-23A-10.1(1)(e)(i)(ii).  Planned Parenthood challenged the law on constitutional grounds, and the district court granted a preliminary injunction against the South Dakota statute, which a panel of the Eighth Circuit affirmed, only to have the en banc Circuit reverse and remand the case for further proceedings.  Planned Parenthood Minn. v. Rounds, 530 F.3d 724 (8th Cir. 2008) (en banc).

On remand, the parties filed cross-motions for summary judgment.  The district court held that the so-called suicide advisory was unconstitutional.  On the second appeal to the Eighth Circuit, a divided panel affirmed the trial court’s holding on the suicide advisory. 653 F.3d 662 (8th Cir. 2011).  The Circuit, however, again granted rehearing en banc, and reversed the summary judgment for Planned Parenthood on the advisory.  Planned Parenthood Minnesota v. Rounds, Slip op. (8th Cir. July 24, 2012) (en banc) [hereafter Slip op.].

In support of the injunction, Planned Parenthood argued that the state’s mandatory suicide advisory violated women’s abortion rights and physicians’ free speech rights. The en banc court rejected this argument, holding that the required advisory was “truthful, non-misleading information,” which did not unduly burden abortion rights, even if it might cause women to forgo abortion.  See Planned Parenthood of Southeastern Pennsylvania v. Casey, 505 U.S. 833, 882-83 (1992).

Risk  ≠ Cause

Planned Parenthood’s success in the trial court turned on its identification of risk (or increased risk) with cause, and its expert witness evidence that causation had not been accepted in the medical literature. In other words, Planned Parenthood argued that the advisory required disclosure of a conclusive causal “link” between abortion and suicide or suicidal ideation.  See 650 F. Supp. 2d 972, 982 (D.S.D. 2009).  The en banc court, on the second appeal, sought to save the statute by rejecting Planned Parenthood’s reading.  The court parsed the statute to suggest that the term “increased risk” is more precise and limited than the umbrella term of “risk,” standing alone.  Slip op. at 6.  The statute does not define “increased risk,” which the en banc court noted had various meanings in medicine.  Id. at 7.

Reviewing the medical literature, the en banc court held that the term “increased risk” does not refer to causation but to a much more modest finding of “a relatively higher probability of an adverse outcome in one group compared to other groups—that is, to ‘relative risk’.”  Id.  The en banc majority seemed to embroil itself in some considerable semantic confusion.  On the one hand, the majority, in a rhetorical riff, proclaimed that:

“It would be nonsensical for those in the field to distinguish a relationship of ‘increased risk’ from one of causation if the term ‘risk’ itself was equivalent to causation.”

Id. at 9.  The majority’s “nonsensical” labeling is, well, … nonsensical.  There is a compelling difference between assessments of risk and of causation.  Risk is an ex ante concept, applied before the effect has occurred. Assessment or attribution of causation takes place after the effect. Of course, there is a sense of risk, or “increased risk,” that is epistemologically more modest, but that hardly makes the more rigorous, ex ante use of risk as cause nonsensical.

On the other hand, the majority is not content to leave the matter alone.  Elsewhere, the en banc court contradicts itself, and endorses a view that risk equals causation.  For instance, in citing to a civil action involving a claimed causal relationship between Bendectin and a birth defect, the Eighth Circuit reduces risk to cause.  See Slip op. at 26 n.9 (citing Brock v. Merrell Dow Pharms., Inc., 874 F.2d 307, 312, modified on reh’g, 884 F.2d 166 (5th Cir. 1989)).  The en banc court’s “explanatory” parenthetical reveals the depths of its confusion:

“explaining that if studies establish, within an acceptable confidence interval, that those who use a pharmaceutical have a relative risk of greater than 1.0—that is, an increased risk—of an adverse outcome, those studies might be considered sufficient to support a jury verdict of liability on a failure-to-warn claim.”

This reading of Brock is wrong on two counts.  First, the Fifth Circuit, in Brock and consistently since, has required that a relative risk greater than 1.0 be statistically significant at the conventional level, and that other indicia of causality, such as the Bradford Hill factors, be present.  So Brock and its progeny did not confuse or conflate risk with cause, or dilute the meaning of cause such that it could be satisfied by a mere showing of an increased relative risk.

Second, Brock itself made a serious error in interpreting statistical significance and confidence intervals. The Bendectin studies at issue in Brock were not statistically significant, and the confidence intervals did not exclude the measure of no association (relative risk = 1.0). Brock, however, in notoriously incorrect dicta, claimed that the computation of confidence intervals took into account bias and confounding as well as sampling variability.  Brock v. Merrell Dow Pharmaceuticals, Inc., 874 F.2d 307, 311-12 (5th Cir. 1989) (“Fortunately, we do not have to resolve any of the above questions [as to bias and confounding], since the studies presented to us incorporate the possibility of these factors by the use of a confidence interval.”) (emphasis in original).  See, e.g., David H. Kaye, David E. Bernstein, and Jennifer L. Mnookin, The New Wigmore – A Treatise on Evidence:  Expert Evidence § 12.6.4, at 546 (2d ed. 2011); Michael O. Finkelstein, Basic Concepts of Probability and Statistics in the Law 86-87 (2009) (criticizing the over-interpretation of confidence intervals by the Brock court); Schachtman, “Confidence in Intervals and Diffidence in the Courts” (Mar. 4, 2012).
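Brock’s error is easy to expose with a simulation.  In the sketch below, a hypothetical survey systematically overstates a true rate of 10% by five percentage points; the confidence interval narrows as the sample grows, but it converges on the biased value, because a confidence interval quantifies sampling variability only, not bias or confounding:

    import numpy as np

    rng = np.random.default_rng(0)
    true_rate, bias = 0.10, 0.05    # hypothetical systematic over-reporting
    for n in (100, 1_000, 10_000):
        p = rng.binomial(n, true_rate + bias) / n
        half = 1.96 * (p * (1 - p) / n) ** 0.5
        print(f"n = {n:>6}: 95% CI = ({p - half:.3f}, {p + half:.3f}); "
              f"true rate = {true_rate}")

More data shrink the random error; the systematic error remains untouched.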

The en banc majority’s discussion of the studies of abortion and suicidality makes clear that the presence of bias and confounding in a study may prevent an inference of causation, but does not undermine the conclusion that the studies show an increased risk.  A conclusion that the body of epidemiologic studies was inconclusive, and that it failed “to disentangle confounding factors and establish relative risks of abortion compared to its alternatives,” did not, therefore, render the suicide advisory about risk or increased risk unsupported, untruthful, or misleading.  Slip op. at 20.  Indeed, the en banc court provided an example, outside the context of abortion, to illustrate its meaning.  The court’s use of the example of prolonged television viewing and “increased risk” of mortality suggests that it took risk to mean any association, no matter how likely it was the result of bias or confounding.  See id. at 10 n.3 (citing Anders Grøntved et al., “Television Viewing and Risk of Type 2 Diabetes, Cardiovascular Disease, and All-Cause Mortality,” 305 J. Am. Med. Ass’n 2448 (2011)). The en banc majority held that the advisory would be misleading only if Planned Parenthood could show that the available epidemiologic studies conclusively ruled out causation.  Slip op. at 24-25.

The Suicide Advisory Has Little Content Because Risk Is Not Cause

The majority decision clarified that the mandatory disclosure does not require a physician to inform a patient that abortion causes suicide or suicidal thoughts.  Slip op. at 25.  The en banc court took solace in its realization that physicians, reviewing the available studies, could provide a disclosure that captures the difference between risk, relative risk, and causation.  In other words, physicians are free to tell patients that this thing called increased risk is not concerning because the studies are highly confounded, and they do not show causation.  Id. at 25-26.  Indeed, it would be hard to imagine an ethical physician telling patients anything else.

Dissent

Four of the Eighth Circuit’s judges dissented, pointing to evidence that the South Dakota legislators intended to mandate a disclosure about causality.  Slip op. at 29.  Putting aside whether the truthfulness of the suicide advisory can be saved by reverting to a more modest interpretation of risk or of increased risk, the dissenters appear to have the better argument that the advisory is misleading.  The majority, however, by driving its wedge between causation and increased risk, has allowed physicians to explain that the advisory has little or no meaning.

NOCEBO

The nocebo effect is the dark side of the placebo effect.  As pointed out recently in the Journal of the American Medical Association, nocebos can induce harmful outcomes because of the expectation of injury from the “psychosocial context or therapeutic environment” affecting patients’ perception of their health.  Luana Colloca & Damien Finniss, “Nocebo Effects, Patient-Clinician Communication, and Therapeutic Outcomes,” 307 J. Am. Med. Ass’n 567, 567 (2012).  It is fairly well accepted that clinicians can inadvertently prejudice health outcomes by how they frame outcome information to patients.  Colloca and Finniss note that the negative expectations created by nocebo communication can take place in the process of obtaining informed consent.

Unfortunately, there is no discussion of nocebo effects in the Eighth Circuit’s decision. Planned Parenthood might well consider the role the nocebo effect plays in the risk-benefit balance of an informed consent disclosure about a risk that really is not a risk; that is, not a factor that will bring about the putative harm, but only something that is under study and that cannot be separated from many confounding factors.  Surely, physicians in South Dakota will figure out how to give truthful, non-misleading disclosures that incorporate the mandatory suicide advisory, as well as the scientific evidence.

Viagra, Part II — MDL Court Sees The Light – Bad Data Trump Nuances of Statistical Inference

July 8th, 2012

In the Viagra vision loss MDL, the first Daubert hearing did not end well for the defense.  Judge Magnuson refused to go beyond conclusory statements by the plaintiffs’ expert witness, Gerald McGwin, and to examine the qualitative and quantitative evaluative errors invoked to support plaintiffs’ health claims.  The weakness of McGwin’s evidence, however, appeared to  encourage Judge Magnuson to authorize extensive discovery into McGwin’s study.  In re Viagra Products Liab. Litig., 572 F. Supp. 2d 1071, 1090 (D. Minn. 2008).

The discovery into McGwin’s study had already been underway, with subpoenas to him and to his academic institution.  As it turned out, defendant’s discovery into the data and documents underlying McGwin’s study won the day.  Although Judge Magnuson struggled with inferential statistics, he understood the direct attack on the integrity of McGwin’s data.  Over a year after denying defendant’s Rule 702 motion to exclude Gerald McGwin, the MDL court reconsidered and granted the motion.  In re Viagra Products Liab. Litig., 658 F. Supp. 2d 936, 945 (D. Minn. 2009).

The basic data on prior exposures and risk factors for the McGwin study were collected by telephone surveys, from which the information was coded into an electronic dataset.  In analyzing the data, McGwin used the electronic dataset and not the survey forms.  Id. at 939.  The transfer from survey forms to electronic dataset did not go smoothly; about 11 patients were miscoded as “exposed” when their use of Viagra post-dated the onset of NAION. Id. at 942.  Furthermore, the published article incorrectly stated personal history of heart attack as a “risk factor”; the survey had inquired about family, not personal, history of heart attack. Id. at 944.
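The arithmetic shows why such coding errors matter.  A minimal sketch, with wholly hypothetical counts rather than the actual study data, illustrates how re-coding a handful of “exposed” cases can move an odds ratio from an apparent elevation to a deficit:

    def odds_ratio(a, b, c, d):
        # a = exposed cases, b = unexposed cases,
        # c = exposed controls, d = unexposed controls
        return (a * d) / (b * c)

    # Hypothetical: 11 of 14 "exposed" cases turn out to have used the drug
    # only after disease onset, and belong in the unexposed column.
    print(odds_ratio(14, 86, 9, 91))   # as miscoded:  about 1.65
    print(odds_ratio(3, 97, 9, 91))    # as corrected: about 0.31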

The plaintiffs threw several bombs in response, but without legal effect.  First, the plaintiffs claimed that the study participants had been recontacted and the database had been corrected, but they were unable to document this process or the alleged corrections.  Id. at 943.  Furthermore, the plaintiffs could not explain how, if their contention had been true, McGwin would not have committed serious violations of his university’s institutional review board regulations with respect to deviations from the original protocol.  Id. at 943 n.7.

Second, the plaintiffs argued that the underlying survey forms were “inadmissible” and thus the defense could not use them to impeach the McGwin study.  Some might think this a duplicitous argument, utterly at odds with Rule 703 – rely upon a study but prevent use of the underlying data and documents to show that the study does not prove what it purports to prove.  The MDL court spared the plaintiffs the embarrassment of ruling that the documents on which McGwin had based his study were inadmissible; it found that the forms were business records, admissible under Federal Rule of Evidence 803(6).  The court could have gone further to point out that McGwin’s reliance upon hearsay in the form of his study, McGwin 2006, opened the door to impeaching the hearsay relied upon with other hearsay.  See Rule 806.

When defense counsel sat down with McGwin in a deposition, they found that he had not undertaken any new analyses of corrected data; plaintiffs’ counsel had directed him not to do so.  Id. at 940-41.  But then, after the deposition was over, McGwin submitted a letter to the journal to report a corrected analysis.  Pfizer’s counsel obtained the letter in response to their subpoena to McGwin’s university, the University of Alabama, Birmingham.  Mirabile dictu; now the increased risk appeared limited only to the defendant’s medication, Viagra!

The trial court was not amused.  First, the new analysis was no longer peer reviewed, and the court had placed a great deal of emphasis on peer review in denying the first challenge to McGwin.  Second, the new analysis was no longer that of an independent scientist, but was conducted and submitted as a letter to the editor, while McGwin was working for plaintiffs’ counsel.  Third, the plaintiffs and McGwin conceded that the data were not accurate.  Last, but not least, the trial court clearly was not pleased that the plaintiffs’ counsel had deliberately delayed McGwin’s further analyses until after the deposition, and then tried to submit yet another supplemental report with those further analyses. In sum:

“the Court finds good reason to vacate its original Daubert Order permitting Dr. McGwin to testify as a general causation expert based on the McGwin Study as published. Almost every indicia of reliability the Court relied on in its previous Daubert Order regarding the McGwin Study has been shown now to be unreliable.  Peer review and publication mean little if a study is not based on accurate underlying data. Likewise, the known rate of error is also meaningless if it is based on inaccurate data. Even if the McGwin Study as published was conducted according to generally accepted epidemiologic research and did not result from post-litigation research, the fact that the McGwin Study appears to have been based on data that cannot now be documented or supported renders it inadmissibly unreliable. The Court concludes that under Daubert, Dr. McGwin’s opinion, to the extent that it is based on the McGwin Study as published, lacks sufficient indicia of reliability to be admitted as a general causation opinion.”

Id. at 945-46.  The remaining evidence was the Margo & French study, but McGwin had previously criticized that study as lacking data that ensured that Viagra use preceded onset of NAION.  In the end, McGwin was left with bupkes, and the plaintiffs were left with even less.

*******************

McGwin 2006 Was Also A Pain in the Rear End for McGwin

The Rule 702 motions and hearings on McGwin’s proposed testimony had consequences in the scientific world itself.  In 2011, the British Journal of Ophthalmology retracted McGwin’s 2006 paper.  “Retraction: Non-arteritic anterior ischaemic optic neuropathy and the treatment of erectile dysfunction,” 95 Brit. J. Ophthalmol. 595 (2011).

Interestingly, the retraction was reported in the Retraction Watch blog, “Retractile dysfunction? Author says journal yanked paper linking Viagra, Cialis to vision problem after legal threats.”  The blog treated the retraction as routine except for the hint of “legal pressure”:

“One of the authors of the paper, a researcher at the University of Alabama named Gerald McGwin Jr., told us that the journal retracted the article because it had become a tool in a lawsuit involving Pfizer, which makes Viagra, and, presumably, men who’d developed blindness after taking the drug:

‘The article just became too much of a pain in the rear end. It became one of those things where we couldn’t provide all the relevant documentation [to the university, which had to provide records for attorneys].’

Ultimately, however, McGwin said that the BJO pulled the plug on the paper.”

Id. The legal threat is hard to discern other than the fact that lawyers wanted to see something that peer reviewers almost never see – the documentation underlying the published paper.  So now, the study that formed the basis for the original ruling against Pfizer floats aimlessly as a derelict on the sea of science.  McGwin is, however, still at his craft.  In a study he published in 2010, he claimed that Viagra but not Cialis use was associated with hearing impairment.  Gerald McGwin, Jr, “Phosphodiesterase Type 5 Inhibitor Use and Hearing Impairment,” 136 Arch. Otolaryngol. Head & Neck Surgery 488 (2010).

Where are Senator Grassley and Congressman Waxman when you need them?

Love is Blind but What About Judicial Gatekeeping of Expert Witnesses? – Viagra Part I

July 7th, 2012

The Viagra litigation over claimed vision loss vividly illustrates the difficulties that trial judges have in understanding and applying the concept of statistical significance.  In this MDL, plaintiffs sued for a specific form of vision loss, non-arteritic anterior ischemic optic neuropathy (NAION), which they claimed was caused by their use of the defendant’s medication, Viagra.  In re Viagra Products Liab. Litig., 572 F. Supp. 2d 1071 (D. Minn. 2008).  Plaintiffs’ key expert witness, Gerald McGwin, considered three epidemiologic studies; none found a statistically significant elevation of risk of NAION after Viagra use.  Id. at 1076. The defense filed a Rule 702 motion to exclude McGwin’s testimony, based in part upon the lack of statistical significance of the risk ratios he relied upon for his causal opinion.  The trial court held that this lack did not render McGwin’s testimony unreliable and inadmissible.  Id. at 1090.

One of the three studies considered by McGwin was his own published paper.  G. McGwin, Jr., M. Vaphiades, T. Hall, C. Owsley, “Non-arteritic anterior ischaemic optic neuropathy and the treatment of erectile dysfunction,” 90 Br. J. Ophthalmol. 154 (2006) [“McGwin 2006”].  The MDL court noted that McGwin had stated that his paper reported an odds ratio (OR) of 1.75, with a 95% confidence interval (CI) of 0.48 to 6.30.  Id. at 1080.  The study also presented multiple subgroup analyses of men who had reported Viagra use after a history of heart attack (OR = 10.7) or hypertension (OR = 6.9), but the MDL court did not provide p-values or confidence intervals for the subgroup analysis results.

Curiously, Judge Magnuson eschewed the guidance of the Reference Manual on Scientific Evidence in dealing with statistics of sampling estimates of means or proportions.  The Reference Manual on Scientific Evidence (2d ed. 2000) urges that:

“[w]henever possible, an estimate should be accompanied by its standard error.”

RMSE 2d ed. at 117-18.  The new third edition again conveys the same basic message:

“What is the standard error? The confidence interval?

An estimate based on a sample is likely to be off the mark, at least by a small amount, because of random error. The standard error gives the likely magnitude of this random error, with smaller standard errors indicating better estimates.”

RMSE 3d ed. at 243.

The point of the RMSE’s guidance is, of course, that the standard error, or the confidence interval (C.I.) based upon a specified number of standard errors, is an important component of the sample statistic, without which the sample estimate is virtually meaningless.  Just as a narrative statement should not be truncated, a statistical or numerical expression should not be unduly abridged.
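The guidance is simple to follow.  A short sketch, using hypothetical 2x2 counts chosen only to reproduce an odds ratio of 1.75 with an interval roughly as wide as the one reported in this litigation, computes the estimate together with its standard error and 95% confidence interval:

    from math import exp, log, sqrt

    def or_with_ci(a, b, c, d):
        # Odds ratio with a Woolf 95% confidence interval from a 2x2 table:
        # a = exposed cases, b = unexposed cases,
        # c = exposed controls, d = unexposed controls.
        odds_ratio = (a * d) / (b * c)
        se = sqrt(1/a + 1/b + 1/c + 1/d)    # standard error of log(OR)
        lo = exp(log(odds_ratio) - 1.96 * se)
        hi = exp(log(odds_ratio) + 1.96 * se)
        return odds_ratio, lo, hi

    print(or_with_ci(7, 4, 72, 72))   # about (1.75, 0.49, 6.24)

An odds ratio of 1.75 reported without its interval looks like evidence of an effect; the same estimate reported with an interval running from well below 1.0 to above 6.0 looks like what it is: statistical noise.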

The statistical data on which McGwin based his opinion were readily available from McGwin 2006:

“Overall, males with NAION were no more likely to report a history of Viagra … use compared to similarly aged controls (odds ratio (OR) 1.75, 95% confidence interval (CI) 0.48 to 6.30).  However, for those with a history of myocardial infarction, a statistically significant association was observed (OR 10.7, 95% CI 1.3 to 95.8). A similar association was observed for those with a history of hypertension though it lacked statistical significance (OR 6.9, 95% CI 0.8 to 63.6).”

McGwin 2006, at 154.  Following the RMSE’s guidance would have assisted the MDL court in its gatekeeping responsibility in several distinct ways.  First, the court would have focused on how wide the 95% confidence intervals were.  The width of the intervals pointed to statistical imprecision and instability in the point estimates urged by McGwin.  Second, the MDL court would have confronted the extent to which there were multiple ad hoc subgroup analyses in McGwin’s paper.  See Newman v. Motorola, Inc., 218 F. Supp. 2d 769, 779 (D. Md. 2002) (“It is not good scientific methodology to highlight certain elevated subgroups as significant findings without having earlier enunciated a hypothesis to look for or explain particular patterns.”). Third, the court would have confronted the extent to which the study’s validity was undermined by several potent biases.  Statistical significance was the least of the problems faced by McGwin 2006.
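The subgroup problem is also quantifiable.  Assuming, for simplicity, independent tests each run at the conventional 5% level (subgroup analyses are rarely strictly independent, so this is only a rough guide), the chance of at least one nominally “significant” finding grows quickly with the number of comparisons:

    alpha = 0.05
    for k in (1, 5, 10, 20):
        fwer = 1 - (1 - alpha) ** k
        print(f"{k:>2} subgroup tests: P(at least one false positive) = {fwer:.2f}")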

The second study considered and relied upon by McGwin was referred to as Margo & French.  McGwin cited this paper for an “elevated OR of 1.10,” id. at 1081, but again, had the court engaged with the actual evidence, it would have found that McGwin had cherry-picked the data he chose to emphasize.  The Margo & French study was a retrospective cohort study using the National Veterans Health Administration’s pharmacy and clinical databases.  C. Margo & D. French, “Ischemic optic neuropathy in male veterans prescribed phosphodiesterase-5 inhibitors,” 143 Am. J. Ophthalmol. 538 (2007).  There were two outcomes ascertained:  NAION and “possible” NAION.  The relative risk of NAION among men prescribed a PDE-5 inhibitor (the class to which Viagra belongs) was 1.02 (95% confidence interval [CI]: 0.92 to 1.12).  In other words, the Margo & French paper had very high statistical precision, and it reported essentially no increased risk at all.  Judge Magnuson uncritically cited McGwin’s endorsement of a risk ratio that included “possible” NAION cases, which did not bode well for a gatekeeping process that is supposed to protect against speculative evidence and conclusions.

McGwin’s citation of Margo & French for the proposition that men who had taken the PDE-5 inhibitors had a 10% increased risk was wrong on several counts.  First, he relied upon an outcome measure that included “possible” cases of NAION.  Second, he completely ignored the sampling error that is captured in the confidence interval.  The MDL court failed to note or acknowledge the p-value or confidence interval for any result in Margo & French.  The consideration of random error was not an optional exercise for the expert witness or the court; nor was ignoring it a methodological choice that simply went to the “disagreement among experts.”

The Viagra MDL court not only lost its way by ignoring the guidance of the RMSE; it also appeared to confuse the magnitude of the associations with the concept of statistical significance.  In the midst of the discussion of statistical significance, the court digressed to address the notion that the small relative risk in Margo & French might mean that no plaintiff could show specific causation, and then in the same paragraph returned to state that “persuasive authority” supported the notion that the lack of statistical significance did not detract from the reliability of a study.  Id. at 1081 (citing In re Phenylpropanolamine (PPA) Prods. Liab. Litig., MDL No. 1407, 289 F. Supp. 2d 1230, 1241 (W.D. Wash. 2003)).  The magnitude of the observed odds ratio is an independent concept from whether an odds ratio as extreme or more so would have occurred by chance if there really were no elevation.
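
The two concepts can be disentangled with the same back-calculation used above.  A short sketch (again a normal approximation on the log scale, using the numbers quoted above from the two studies) shows that a large odds ratio can be statistically fragile while a near-null odds ratio is precisely estimated:

```python
from math import erf, log, sqrt

def p_value_from_ci(odds_ratio, lo, hi):
    """Two-sided p-value for H0: OR = 1, recovered from a reported
    95% CI using the normal approximation on the log scale."""
    se = (log(hi) - log(lo)) / (2 * 1.96)
    z = log(odds_ratio) / se
    return 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))

# McGwin 2006 overall result: large OR, very wide interval
print(round(p_value_from_ci(1.75, 0.48, 6.30), 2))   # ~0.39
# Margo & French: near-null OR, tight interval
print(round(p_value_from_ci(1.02, 0.92, 1.12), 2))   # ~0.69
```

Neither result is statistically significant, but for opposite reasons: the first estimate is too imprecise to tell us anything, while the second is precise enough to suggest there was nothing to find.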

Citing one case, at odds with a great many others, did not create an epistemic warrant for ignoring the lack of statistical significance.  The entire notion of citing caselaw for the meaning and importance of statistical significance in drawing inferences is wrongheaded.  Even more to the point, the lack of statistical significance in the key study in the PPA litigation did not detract from the reliability of the study, although other features of that study certainly did.  The lack of statistical significance in the PPA study did, however, detract from the reliability of the inference from the study’s estimate of “effect size” to a conclusion of causal association.  Indeed, nowhere in the key PPA study did its authors draw a causal conclusion with respect to PPA ingestion and hemorrhagic stroke.  See Walter Kernan, Catherine Viscoli, Lawrence Brass, Joseph Broderick, Thomas Brott, Edward Feldmann, Lewis Morgenstern, Janet Lee Wilterdink, and Ralph Horwitz, “Phenylpropanolamine and the Risk of Hemorrhagic Stroke,” 343 New England J. Med. 1826 (2000).

The MDL court did attempt to distinguish the Eighth Circuit’s decision in Glastetter v. Novartis Pharms. Corp., 252 F.3d 986 (8th Cir. 2001), cited by the defense:

“[I]n Glastetter … expert evidence was excluded because ‘rechallenge and dechallenge data’ presented statistically insignificant results and because the data involved conditions ‘quite distinct’ from the conditions at issue in the case. Here, epidemiologic data is at issue and the studies’ conditions are not distinct from the conditions present in the case. The Court does not find Glastetter to be controlling.”

Id. at 1081 (internal citations omitted; emphasis in original).  This reading of Glastetter, however, misses important features of that case and the Parlodel litigation more generally.  First, the Eighth Circuit commented not only upon the rechallenge-dechallenge data, which involved arterial spasms, but also upon an epidemiologic study of stroke, the injury from which Ms. Glastetter suffered.  The Glastetter court did not review the epidemiologic evidence itself, but cited another court, which did discuss and criticize the study for various “statistical and conceptual flaws.”  See Glastetter, 252 F.3d at 992 (citing Siharath v. Sandoz Pharms. Corp., 131 F. Supp. 2d 1347, 1356-59 (N.D. Ga. 2001)).  Glastetter was binding authority, and not so easily dismissed and distinguished.

The Viagra MDL court ultimately placed its holding upon the facts that:

“the McGwin et al. and Margo et al. studies were peer-reviewed, published, contain known rates of error, and result from generally accepted epidemiologic research.”

In re Viagra, 572 F. Supp. 2d at 1081 (citations omitted).  This holding was a judicial ipse dixit substituting for the expert witness’s ipse dixit.  There were no known rates of error for the systematic errors in the McGwin study, and the “known” rates of random error in McGwin 2006 were intolerably high.  The MDL court never considered any of the error rates, systematic or random, for the Margo & French study.  The court appeared to have abdicated its gatekeeping responsibility by delegating it to unknown peer reviewers, who never considered whether the studies at issue, in isolation or together, could support a causal health claim.

With respect to the last of the three studies considered, the Gorkin study, McGwin opined that it was too small, and that its data were not suited to assessing the temporal relationship.  Id.  The court did not appear inclined to go beyond McGwin’s ipse dixit.  The Gorkin study was hardly small: it was based upon more than 35,000 patient-years of observation in epidemiologic studies and clinical trials, and it provided an estimate of the incidence of NAION among users of Viagra that was not statistically different from that of the general U.S. population.  See L. Gorkin, K. Hvidsten, R. Sobel, and R. Siegel, “Sildenafil citrate use and the incidence of nonarteritic anterior ischemic optic neuropathy,” 60 Internat’l J. Clin. Pract. 500, 500 (2006).

Judge Magnuson did proceed, in his 2008 opinion, to exclude all the other expert witnesses put forward by the plaintiffs.  McGwin survived the defendant’s Rule 702 challenge largely because the court refused to consider the substantial random variability in the point estimates from the studies upon which McGwin relied.  There was no consideration of the magnitude of random error or, for that matter, of the systematic error in McGwin’s study.  The MDL court found that the studies upon which McGwin relied had a known and presumably acceptable “rate of error.”  In fact, the court did not consider the random or sampling error in any of the three cited studies; it failed to consider the multiple testing and interaction analyses; and it failed to consider the actual and potential biases in the McGwin study.
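
The multiple-testing problem the court ignored is also easy to quantify.  If each of k independent subgroup analyses tests a true null hypothesis at the conventional 5% level, the chance of at least one spurious “significant” finding grows quickly (the independence assumption is a simplification, but the direction of the problem is the same):

```python
# Family-wise error rate for k independent tests of true nulls at alpha = 0.05
alpha = 0.05
for k in (1, 5, 10, 20):
    fwer = 1 - (1 - alpha) ** k
    print(f"{k:2d} tests -> {fwer:.0%} chance of at least one false positive")
# 1 test -> 5%; 5 tests -> 23%; 10 tests -> 40%; 20 tests -> 64%
```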

Some legal commentators have argued that statistical significance should not be a litmus test.  David Faigman, Michael Saks, Joseph Sanders, and Edward Cheng, Modern Scientific Evidence: The Law and Science of Expert Testimony § 23:13, at 241 (“Statistical significance should not be a litmus test. However, there are many situations where the lack of significance combined with other aspects of the research should be enough to exclude an expert’s testimony.”).  While I agree that significance probability should not be evaluated in a mechanical fashion, without consideration of study validity, multiple testing, bias, confounding, and the like, hand waving about litmus tests does not excuse courts or commentators from totally ignoring random variability in studies based upon population sampling.  The data in the Viagra litigation did not present a close call.

Maryland Puts the Brakes on Each and Every Asbestos Exposure

July 3rd, 2012

Last week, the Maryland Court of Special Appeals reversed a plaintiffs’ verdict in Dixon v. Ford Motor Company, 2012 WL 2483315 (Md. App. June 29, 2012).  Jane Dixon died of pleural mesothelioma.  The plaintiffs, her survivors, claimed that her last illness and death were caused by her household improvement projects, which involved exposure to spackling/joint compound, and by her husband’s work with car parts and brake linings, which involved “take home” exposure on his clothes.  Id. at *1.

All the expert witnesses appeared to agree that mesothelioma is a “dose-response disease,” meaning that the more the exposure, the greater the likelihood that a person exposed will develop the disease. Id. at *2.  Plaintiffs’ expert witness, Dr. Laura Welch, testified that “every exposure to asbestos is a substantial contributing cause and so brake exposure would be a substantial cause even if [Mrs. Dixon] had other exposures.” On cross-examination, Dr. Welch elaborated upon her opinion to explain that any “discrete” exposure would be a contributing factor. Id.

Welch, of course, criticized the entire body of epidemiology of car mechanics and brake repairmen, which generally finds no increased risk of mesothelioma above overall population rates.  With respect to the take-home exposure, Welch had to acknowledge that there were no epidemiologic studies that investigated the risk to wives of brake mechanics.  Welch argued that the studies of car mechanics did not involve exposure to brake shoes as would have been experienced by brake repairmen, but her argument only served to make her attribution based upon take-home exposure to brake linings seem more preposterous.  Id. at *3.  The court recognized that Dr. Welch’s opinion may have been trivially true, but still unhelpful.  Each discrete exposure, even one as attenuated as a take-home exposure from the repair of a single brake shoe, may have “contributed,” but that opinion did not help the jury assess whether the contribution was substantial.

The court sidestepped the issues of fiber type and threshold, and homed in on the agreement that mesothelioma risk showed a dose-response relationship with asbestos exposure.  (There is a sense that the court confused the dose-response concept with the absence of a threshold.)  The court credited hyperbolic risk-assessment figures from the United States Environmental Protection Agency, which suggested that even ambient air exposure to asbestos leads to an increase in mesothelioma risk, but then realized that such claims made the legal need to characterize the risk from the defendant’s product all the more important before the jury could reasonably have concluded that any particular exposure experienced by Ms. Dixon was “a substantial contributing factor.”  Id. at *5.

Having recognized that the best the plaintiffs could offer was a claim of increased risk, and perhaps a crude quantification of the relative risks resulting from each product’s exposure, the court could not escape the conclusion that Dr. Welch’s empty recitation that “every exposure” is substantial was nothing more than an unscientific assertion.  Welch’s claim was either tautologically true or empirical nonsense.  The court also recognized that allowing risk to substitute for causation opened the door to essentially probabilistic evidence:

“If risk is our measure of causation, and substantiality is a threshold for risk, then it follows—as intimated above—that ‘substantiality’ is essentially a burden of proof. Moreover, we can explicitly derive the probability of causation from the statistical measure known as ‘relative risk’ … .  For reasons we need not explore in detail, it is not prudent to set a singular minimum ‘relative risk’ value as a legal standard.12 But even if there were some legal threshold, Dr. Welch provided no information that could help the finder of fact to decide whether the elevated risk in this case was ‘substantial’.”

Id. at *7.  The court’s discussion here of “the elevated risk” seems wrong unless we understand it to mean the elevated risk attributable to the particular defendant’s product, in the context of an overall exposure that we accept as having been sufficient to cause the decedent’s mesothelioma.  Despite the lack of any quantification of relative risks in the case, overall or from particular products, and the court’s own admonition against setting a minimum relative risk as a legal standard, the court proceeded to discuss relative risks at length.  For instance, the court criticized Judge Kozinski’s opinion in Daubert, upon remand from the Supreme Court, for not going far enough:

“In other words, the Daubert court held that a plaintiff’s risk of injury must have at least doubled in order to hold that the defendant’s action was ‘more likely than not’ the actual cause of the plaintiff’s injury. The problem with this holding is that relative risk does not behave like a ‘binary’ hypothesis that can be deemed ‘true’ or ‘false’ with some degree of confidence; instead, the uncertainty inherent in any statistical measure means that relative risk does not resolve to a certain probability of specific causation. In order for a study of relative risk to truly fulfill the preponderance standard, it would have to result in 100% confidence that the relative risk exceeds two, which is a statistical impossibility. In short, the Daubert approach to relative risk fails to account for the twin statistical uncertainty inherent in any scientific estimation of causation.”

Id. at *7 n.12 (citing Daubert v. Merrell Dow Pharms., Inc., 43 F.3d 1311, 1320-21 (9th Cir. 1995) (holding that a preponderance standard requires causation to be shown by probabilistic evidence of relative risk greater than two) (opinion on remand from Daubert v. Merrell Dow Pharms., 509 U.S. 579 (1993))).  The statistical impossibility derives from the asymptotic nature of the normal distribution, but the court failed to explain why a relative risk of two must be excluded as statistically implausible based upon the sample statistic.  After all, a relative risk greater than two, with the lower bound of a 95% confidence interval above one, based upon unbiased sampling, suggests that our best evidence is that the population parameter is greater than two as well.  The court, however, insisted upon stating the relative-risk-greater-than-two rule with a vengeance:

“All of this is not to say, however, that any and all attempts to establish a burden of proof of causation using relative risk will fail. Decisions can be – and in science or medicine are – premised on the lower limit of the relative risk ratio at a requisite confidence level. The point of this minor discussion is that one cannot apply the usual, singular ‘preponderance’ burden to the probability of causation when the only estimate of that probability is statistical relative risk. Instead, a statistical burden of proof of causation must consist of two interdependent parts: a requisite confidence of some minimum relative risk. As we explain in the body of our discussion, the flaws in Dr. Welch’s testimony mean we need not explore this issue any further.”

Id. (emphasis in original).
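
The logic that ties a relative risk of two to the preponderance standard can be stated in one line.  On the simplest assumptions (a valid, unconfounded, unbiased risk estimate), the probability that an exposed person’s disease is attributable to the exposure is the attributable fraction:

$$AF = \frac{RR - 1}{RR}, \qquad AF > \frac{1}{2} \iff RR > 2.$$

The court’s point, in effect, was that the RR in this formula is the unknown population parameter, not the sample estimate, and so some account must be taken of the uncertainty in the estimate before the inequality can do any legal work.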

And despite having declared the improvidence of addressing the relative risk issue, and then the lack of necessity for addressing the issue given Dr. Welch’s flawed testimony, the court nevertheless tackled the issue once more, a couple of pages later:

“It would be folly to require an expert to testify with absolute certainty that a plaintiff was exposed to a specific dose or suffered a specific risk. Dose and risk fall on a spectrum and are not ‘true or false’. As such, any scientific estimate of those values must be expressed as one or more possible intervals and, for each interval, a corresponding confidence that the true value is within that interval.”

Id. at *9 (emphasis in original; internal citations omitted).  The court captured the frequentist concept of the confidence interval as being defined operationally by repeated samplings and their random variability.  But the “confidence” of the confidence interval means that the specified coefficient represents the percentage of all such intervals that include the “true” value, not the probability that a particular interval, calculated from a given sample, contains the true value.  The true value is either in, or not in, the interval generated from a single sample statistic.  Again, it is unclear why the court was weighing in on this aspect of probabilistic evidence when the plaintiffs’ expert witness, Welch, offered no quantification of the overall risk or of the risk attributable to a specific product exposure.
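
A small simulation makes the distinction concrete: the “95%” belongs to the procedure, not to any single interval.  Here is a sketch with an arbitrary true proportion (illustrative only):

```python
import random

random.seed(1)
true_p, n, trials, covered = 0.3, 200, 10_000, 0

for _ in range(trials):
    x = sum(random.random() < true_p for _ in range(n))
    p_hat = x / n
    half = 1.96 * (p_hat * (1 - p_hat) / n) ** 0.5  # Wald interval
    covered += (p_hat - half) <= true_p <= (p_hat + half)

print(covered / trials)  # close to 0.95: the long-run coverage rate
```

Each individual interval either contains 0.3 or it does not; the 95% figure describes only the long run of repeated samples.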

The court indulged the plaintiffs’ no-threshold fantasy but recognized that the risks of low-level asbestos exposure were low, and likely below a doubling of risk, an issue that the court stressed it wanted to avoid.  The court cited one study that suggested a risk (odds) ratio of 1.1 for exposures less than 0.5 fiber/ml-years.  See id. at *5 (citing Y. Iwatsubo et al., “Pleural mesothelioma: dose-response relation at low levels of asbestos exposure in a French population-based case-control study,” 148 Am. J. Epidemiol. 133 (1998) (estimating an odds ratio of 1.1 for exposures less than 0.5 fibers/ml-years)).  But the court, which tried to be precise elsewhere, appears to have lost its way in citing Iwatsubo here.  After all, how can a single odds ratio of 1.1 describe all exposures from 0 all the way up to 0.5 f/ml-years?  How can a single odds ratio describe all exposures in this range, regardless of fiber type, when chrysotile asbestos carries little to no risk for mesothelioma, and certainly orders of magnitude less risk than amphibole fibers such as amosite and crocidolite?  And if a low-level exposure has a risk ratio of 1.1, how could plaintiffs’ hired expert witness, Welch, even make the attribution of Dixon’s mesothelioma to the entirety of her exposure, let alone to the speculative take-home chrysotile exposure involved from Ford’s brake linings?  Obviously, had the court posed these questions, it would have realized that “it is not possible” to permit Welch’s testimony at all.

The court further lost its way in addressing the exculpatory epidemiology put forward by the defense expert witnesses:

“Furthermore, the leading epidemiological report cited by Ford and its amici that specifically studied ‘brake mechanics’, P.A. Hessel et al., ‘Mesothelioma Among Brake Mechanics: An Expanded Analysis of a Case-control Study’, 24 Risk Analysis 547 (2004), does not at all dispel the notion that this population faced an increased risk of mesothelioma due to their industrial asbestos exposure. … When calculated at the 95% confidence level, Hessel et al. estimated that the odds ratio of mesothelioma could have been as low as 0.01 or as high as 4.71, implying a nearly quintupled risk of mesothelioma among the population of brake mechanics. 24 Risk Analysis at 550–51.”

Id. at *8.  Again, the court fixated upon the confidence interval, to the exclusion of the estimated magnitude of the association!  This time, after earlier insisting that it is the lower limit of the interval that matters scientifically, the court emphasized the upper bound.  The court strayed far from the actual data, and from any plausible interpretation of them:

“The odds ratio (OR) for employment in brake installation or repair was 0.71 (95% CI: 0.30-1.60) when controlled for insulation or shipbuilding. When a history of employment in any of the eight occupations with potential asbestos exposure was controlled, the OR was 0.82 (95% CI: 0.36-1.80). ORs did not increase with increasing duration of brake work. Exclusion of those with any of the eight exposures resulted in an OR of 0.62 (95% CI: 0.01-4.71) for occupational brake work.”

P.A. Hessel et al., “Mesothelioma Among Brake Mechanics: An Expanded Analysis of a Case-control Study,” 24 Risk Analysis 547, 547 (2004).  All of Hessel’s estimates of effect size were below 1.0, and the study found no trend with increasing duration of brake work.  Cherry-picking the upper bound of a single subgroup analysis for emphasis was unwarranted, and hardly did justice to the facts or the science.

Dr. Welch’s conclusion that the exposure and risk in this case were “substantial” simply was not a scientific conclusion, and without it her testimony did not provide information for the jury to use in reaching its conclusion as to substantial-factor causation.  Id. at *7.  The court noted that Welch, and the plaintiffs, may have lacked scientific data with which to estimate Dixon’s exposure to asbestos or her relative risk of mesothelioma, but ignorance or uncertainty hardly warranted an expert witness’s belief that the relevant exposures and risks were “substantial.”  Id. at *10.  The court was well justified in being discomforted by the conclusory, unscientific opinion rendered by Laura Welch.

In the final puzzle of the Dixon case, the court vacated the judgment, and remanded for a new trial, “either without her opinion on substantiality or else with some quantitative testimony that will help the jury fulfill its charge.”  Id. at *10.  The court thus seemed to imply that an expert witness need not utter the magic word, “substantial,” for the case to be submitted to the jury against a brake defendant in a take-home exposure case.  Given the state of the record, the court should have simply reversed and rendered judgment for Ford.

Ecological Fallacy Goes to Court

June 30th, 2012

In previous posts, I have bemoaned the judiciary’s tin ear for important qualitative differences between and among different research study designs.  The Reference Manual on Scientific Evidence (3d ed. 2011) (RMSE3d) offers inconsistent advice, ranging from Margaret Berger’s counsel to abandon any hierarchy of evidence, to other chapters’ emphasis on the importance of a hierarchy.

The Cook case is one of the more aberrant decisions; it elevated an ecological study, without a statistically significant result, into an acceptable basis for a causal conclusion under Rule 702.  Senior Judge Kane’s decision in the litigation over radioactive contamination from the Colorado Rocky Flats nuclear weapons plant illustrates a judicial refusal to engage with the substantive differences among studies, and a willingness to ignore the inability of some study designs to support causality.  See Cook v. Rockwell Internat’l Corp., 580 F. Supp. 2d 1071, 1097-98 (D. Colo. 2006) (“Defendants assert that ecological studies are inherently unreliable and therefore inadmissible under Rule 702.  Ecological studies, however, are one of several methods of epidemiological study that are well-recognized and accepted in the scientific community.”), rev’d and remanded on other grounds, 618 F.3d 1127 (10th Cir. 2010), cert. denied, ___ U.S. ___ (May 24, 2012).  Senior Judge Kane’s point about the recognition and acceptance of ecological studies has nothing to do with their ability to support conclusions of causality.  This basic non sequitur led the trial judge into ruling that the challenge “goes to the weight, not the admissibility” of the challenged opinion testimony.  This is a bit like using an election-day exit poll, with 5% of returns counted, as “reliable” evidence to support a prediction of the winner.  The poll may have been conducted most expertly, but it lacks the ability to predict the winner.

The issue is not whether ecological studies are “scientific”; they are part of the epidemiologists’ toolkit.  The issue is whether they warrant inferences of causation.  Some so-called scientific studies are merely hypothesis-generating, preliminary, tentative, or data-dredging exercises.  Judge Kane opined that ecological studies are merely “less probative” than other studies, and that the relative weights of studies do not render them inadmissible.  Id.  This is a misunderstanding, or an abdication, of gatekeeping responsibility.  First, studies themselves are not admissible; it is the expert witness whose testimony is challenged.  Second, Rule 702 requires that the proffered opinion be “scientific knowledge,” and ecological studies simply lack the necessary epistemic warrant.

The legal sources cited by Senior Judge Kane provide at best equivocal and minimal support for his decision.  The court pointed to RMSE 2d at 344-45, for the proposition that ecological studies are useful for establishing associations, but are weak evidence for causality.  The other legal citations seem equally unhelpful.  In re Hanford Nuclear Reservation Litig., No. CY–91–3015–AAM, 1998 WL 775340, at *106 (E.D. Wash. Aug. 21, 1998) (citing RMSE 2d and the National Academy of Sciences Committee on Radiation Dose Reconstruction for Epidemiological Uses, which states that “ecological studies are usually regarded as hypothesis generating at best, and their results must be regarded as questionable until confirmed with cohort or case-control studies.” National Research Council, Radiation Dose Reconstruction for Epidemiologic Uses at 70 (1995)), rev’d on other grounds, 292 F.3d 1124 (9th Cir. 2002); Ruff v. Ensign–Bickford Indus., Inc., 168 F. Supp. 2d 1271, 1282 (D. Utah 2001) (reviewing evidence that consisted of a case-control study in addition to an ecological study; “It is well established in the scientific community that ecological studies are correlational studies and generally provide relatively weak evidence for establishing a conclusive cause and effect relationship.”); see also id. at 1274 n.3 (“Ecological studies tend to be less reliable than case–control studies and are given little evidentiary weight with respect to establishing causation.”).

 

ERROR COMPOUNDED

The new edition of the RMSE cites the Cook case in several places.  In an introductory chapter, the late Professor Margaret Berger cites the case incorrectly for having excluded expert witness testimony.  See Margaret A. Berger, “The Admissibility of Expert Testimony,” 11, 24 n.62, in RMSE3d (“See Cook v. Rockwell Int’l Corp., 580 F. Supp. 2d 1071 (D. Colo. 2006) (discussing why the court excluded expert’s testimony, even though his epidemiological study did not produce statistically significant results).”).  The chapter on epidemiology cites Cook correctly for having refused to exclude the plaintiffs’ expert witness, Dr. Richard Clapp, who relied upon an ecological study of two cancer outcomes in the area adjacent to the Rocky Flats Nuclear Weapons Plant.  See Michael D. Green, D. Michal Freedman, and Leon Gordis, “Reference Guide on Epidemiology,” 549, 561 n.34, in Reference Manual on Scientific Evidence (3d ed. 2011).  The authors, however, abstain from any judgmental comments about the Cook case, which is curious given their careful treatment of ecological studies and their limitations:

“4. Ecological studies

Up to now, we have discussed studies in which data on both exposure and health outcome are obtained for each individual included in the study.33 In contrast, studies that collect data only about the group as a whole are called ecological studies.34 In ecological studies, information about individuals is generally not gathered; instead, overall rates of disease or death for different groups are obtained and compared. The objective is to identify some difference between the two groups, such as diet, genetic makeup, or alcohol consumption, that might explain differences in the risk of disease observed in the two groups.35 Such studies may be useful for identifying associations, but they rarely provide definitive causal answers.36”

Id. at 561.  The epidemiology chapter proceeds to note that the lack of information about individual exposure and disease outcome in an ecological study “detracts from the usefulness of the study,” and renders it prone to erroneous inferences about the association between exposure and outcome, “a problem known as an ecological fallacy.”  Id. at 562.  The chapter authors define the ecological fallacy:

“Also, aggregation bias, ecological bias. An error that occurs from inferring that a relationship that exists for groups is also true for individuals.  For example, if a country with a higher proportion of fishermen also has a higher rate of suicides, then inferring that fishermen must be more likely to commit suicide is an ecological fallacy.”

Id. at 623.  Although the ecological study design is weak and generally unsuitable to support causal inferences, the authors note that such studies can be useful in generating hypotheses for future research using studies that gather data about individuals.  Id. at 562.  See also David Kaye & David Freedman, “Reference Guide on Statistics,” 211, 266 n.130 (citing the epidemiology chapter “for suggesting that ecological studies of exposure and disease are ‘far from conclusive’ because of the lack of data on confounding variables (a much more general problem) as well as the possible aggregation bias”); Leon Gordis, Epidemiology 205-06 (3d ed. 2004) (ecologic studies can be of value to suggest future research, but “[i]n and of themselves, however, they do not demonstrate conclusively that a causal association exists”).
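
The fishermen example in the Manual’s definition can be made concrete with a few invented numbers.  In the sketch below (entirely hypothetical data), the fishermen in every region have exactly the same suicide rate as their neighbors, yet the regions with more fishermen have higher overall rates:

```python
# Hypothetical regional data: within each region, fishermen and
# non-fishermen have the SAME suicide rate; regions simply differ
# in their baseline rates.
regions = [
    # (fishermen, others, baseline suicide rate for everyone)
    (1000, 9000, 0.0001),
    (3000, 7000, 0.0003),
    (5000, 5000, 0.0005),
]

for fishermen, others, rate in regions:
    share = fishermen / (fishermen + others)
    print(f"fisherman share {share:.0%} -> overall suicide rate {rate:.2%}")

# The group-level pattern is perfectly positive (more fishermen, more
# suicide), yet the individual-level association is null: in every
# region the rate among fishermen equals the rate among everyone else.
# Inferring individual risk from the regional pattern is the fallacy.
```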

The views expressed in the Reference Manual on Scientific Evidence about ecological studies are hardly unique.  The following quotes show how ecological studies are typically evaluated in epidemiology texts:

“Ecological fallacy

An ecological fallacy or bias results if inappropriate conclusions are drawn on the basis of ecological data. The bias occurs because the association observed between variables at the group level does not necessarily represent the association that exists at the individual level (see Chapter 2).

***

Such ecological inferences, however limited, can provide a fruitful start for more detailed epidemiological work.”

R. Bonita, R. Beaglehole, and T. Kjellström, Basic Epidemiology 43 (2d ed. WHO 2006).

“A first observation of a presumed relationship between exposure and disease is often done at the group level by correlating one group characteristic with an outcome, i.e. in an attempt to relate differences in morbidity or mortality of population groups to differences in their local environment, living habits or other factors. Such correlational studies that are usually based on existing data are prone to the so-called ‘ecological fallacy’ since the compared populations may also differ in many other uncontrolled factors that are related to the disease. Nevertheless, ecological studies can provide clues to etiological hypotheses and may serve as a gateway towards more detailed investigations.”

Wolfgang Ahrens & Iris Pigeot, eds., Handbook of Epidemiology 17-18 (2005).

The Cook case is a wonderful illustration of the judicial mindset that avoids and evades gatekeeping by resorting to the conclusory reasoning that a challenge “goes to the weight, not the admissibility” of an expert witness’s opinion.

Let’s Require Health Claims to Be Evidence Based

June 28th, 2012

Litigation arising from the FDA’s refusal to approve “health claims” for foods and dietary supplements is a fertile area for disputes over the interpretation of statistical evidence.  A “health claim” is “any claim made on the label or in labeling of a food, including a dietary supplement, that expressly or by implication … characterizes the relationship of any substance to a disease or health-related condition.” 21 C.F.R. § 101.14(a)(1); see also 21 U.S.C. § 343(r)(1)(A)-(B).

Unlike the federal courts exercising their gatekeeping responsibility, the FDA has committed to pre-specified principles of interpretation and evaluation. By regulation, the FDA gives notice of the standards for evaluating complex evidentiary displays for the “significant scientific agreement” required for approving a food or dietary supplement health claim.  21 C.F.R. § 101.14.  See FDA – Guidance for Industry: Evidence-Based Review System for the Scientific Evaluation of Health Claims – Final (2009).

If the FDA’s refusal to approve a health claim requires pre-specified criteria of evaluation, then we should ask why the federal courts have failed to develop a set of criteria for evaluating health-effects claims as part of their Rule 702 (“Daubert”) gatekeeping responsibilities.  Why, close to 20 years after the Supreme Court decided Daubert, can lawyers make “health claims” without having to satisfy evidence-based criteria?

Although the FDA’s guidance is not always as precise as might be hoped, it is far better than the suggestion of the new Reference Manual on Scientific Evidence (3d ed. 2011) that there is no hierarchy of evidence.  See RMSE 3d at 564 & n.48 (citing and quoting an idiosyncratic symposium paper for the view that “[t]here should be no hierarchy [among different types of scientific methods to determine cancer causation]”); see also “Late Professor Berger’s Introduction to the Reference Manual on Scientific Evidence” (Oct. 23, 2011).

The FDA’s attempt to articulate an evidence-based hierarchy is noteworthy because the agency must evaluate a wide range of evidence, from in vitro studies, to animal studies, to observational studies of varying kinds, to clinical trials, to meta-analyses and reviews.  The FDA’s criteria are a good start, and I imagine that they will develop and improve over time.  Although imperfect, the criteria are light years ahead of the situation in federal and state court gatekeeping.  Unlike gatekeeping in civil actions, the FDA criteria are pre-stated, not devised post hoc.  The FDA’s attempt to implement evidence-based principles in the evaluation of health claims is a model that would much improve the Reference Manual on Scientific Evidence.  See Christopher Guzelian & Philip Guzelian, “Prevention of false scientific speech: a new role for an evidence-based approach,” 27 Human & Experimental Toxicol. 733 (2008).

The FDA’s evidence-based criteria need work in some areas.  For instance, the FDA’s Guidance on meta-analysis is not particularly specific or helpful:

“Research Synthesis Studies

Reports that discuss a number of different studies, such as review articles, do not provide sufficient information on the individual studies reviewed for FDA to determine critical elements such as the study population characteristics and the composition of the products used. Similarly, the lack of detailed information on studies summarized in review articles prevents FDA from determining whether the studies are flawed in critical elements such as design, conduct of studies, and data analysis. FDA must be able to review the critical elements of a study to determine whether any scientific conclusions can be drawn from it. Therefore, FDA intends to use review articles and similar publications to identify reports of additional studies that may be useful to the health claim review and as background about the substance/disease relationship. If additional studies are identified, the agency intends to evaluate them individually. Most meta-analyses, because they lack detailed information on the studies summarized, will only be used to identify reports of additional studies that may be useful to the health claim review and as background about the substance-disease relationship.  FDA, however, intends to consider as part of its health claim review process a meta-analysis that reviews all the publicly available studies on the substance/disease relationship. The reviewed studies should be consistent with the critical elements, quality and other factors set out in this guidance and the statistical analyses adequately conducted.”

FDA – Guidance for Industry: Evidence-Based Review System for the Scientific Evaluation of Health Claims – Final at 10 (2009).

The dismissal of review articles as a secondary source is welcome, but meta-analyses are quantitative reviews that can add additional insights and evidence, if methodologically appropriate, by providing a summary estimate of association, sensitivity analyses, meta-regression, etc.  The FDA’s guidance was applied in connection with the agency’s refusal to approve a health claim for vitamin C and lung cancer.  Proponents claimed that a particular meta-analysis supported their health claim, but the FDA disagreed.  The proponents sought injunctive relief in federal district court, which upheld the FDA’s decision on vitamin C and lung cancer.  Alliance for Natural Health US v. Sebelius, 786 F.Supp. 2d 1, 21 (D.D.C. 2011).  The district court found that the FDA’s refusal to approve the health claim was neither arbitrary nor capricious with respect to its evaluation of the cited meta-analysis:

“The FDA discounted the Cho study because it was a ‘meta-analysis’ of studies reflected in a review article. FDA Decision at 2523. As explained in the 2009 Guidance Document, ‘research synthesis studies’ and ‘review articles’, including ‘most meta-analyses’, ‘do not provide sufficient information on the individual studies reviewed’ to determine critical elements of the studies and whether those elements were flawed. 2009 Guidance Document at A.R. 2432. The Guidance Document makes an exception for meta-analyses ‘that review[ ] all the publicly available studies on the substance/disease relationship’. Id. Based on the Court’s review of the Cho article, the FDA’s decision to exclude this article as a meta-analysis was not arbitrary and capricious.”

Id. at 19.

The FDA’s Guidance was adequate for its task in the vitamin C/lung cancer health claim, but notably absent from the Guidance are any criteria to evaluate competing meta-analyses that do include “all the publicly available studies on the substance/disease relationship.”  The modeling assumptions of meta-analyses (fixed effect versus random effects), the assessment of heterogeneity, and other considerations will need to be spelled out in advance.  Still, not a bad start.  Implementing evidence-based criteria in Rule 702 gatekeeping has the potential to tame the gatekeeper’s discretion.
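
The basic computations that such pre-specified criteria would govern are themselves straightforward.  Here is a minimal sketch of fixed-effect, inverse-variance pooling, with Cochran’s Q and the I² statistic for heterogeneity (the inputs are hypothetical; a real guidance document would also have to specify when a random-effects model is required):

```python
import math

def fixed_effect_pool(log_ors, ses):
    """Inverse-variance, fixed-effect pooling of log odds ratios,
    with Cochran's Q and the I^2 heterogeneity statistic."""
    weights = [1 / se ** 2 for se in ses]
    pooled = sum(w * y for w, y in zip(weights, log_ors)) / sum(weights)
    se_pooled = math.sqrt(1 / sum(weights))
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, log_ors))
    df = len(log_ors) - 1
    i_sq = max(0.0, (q - df) / q) if q > 0 else 0.0
    return math.exp(pooled), se_pooled, q, i_sq

# Three hypothetical studies: log odds ratios and their standard errors
pooled_or, se, q, i_sq = fixed_effect_pool([0.10, -0.05, 0.20],
                                           [0.15, 0.20, 0.25])
print(f"pooled OR {pooled_or:.2f}, Q {q:.2f}, I^2 {i_sq:.0%}")
```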

Meta-Meta-Analysis – Celebrex Litigation – The Claims – Part 2

June 25th, 2012

IMPUTATION

As I noted in part one, the tables were turned on imputation, with plaintiffs making the same accusation that G.E. made in the gadolinium litigation:  imputation involves adding “phantom events” or “imaginary events to each arm of ‘zero event’ trials.”  See Plaintiffs’ Reply Mem. of Law in Further Support of Their Motion to Exclude Expert Testimony by Defendants’ Expert Dr. Lee-Jen Wei at 8, 9 (May 5, 2010), in Securities Litig.

The plaintiffs claimed that Wei “created” an artifact of a risk ratio of 1.0 by using imputation in each of the zero-event trials.  The reality, however, is that each of those trials had a zero risk difference: the event rates in the drug and placebo arms were both low and equal to one another.  The plaintiffs’ claim that Wei “diluted” the risk is little more than saying that he failed to inflate the risk by excluding zero-event trials.  But zero-event trials represent a test in which the risk of events in both arms is equal, and relatively low.

The plaintiffs seemed to make their point half-heartedly.  They admitted that “imputation in and of itself is a commonly used methodology,” id. at 10, but they claimed that “adding zero-event trials to a meta-analysis is debated among scientists.”  Id.  A debate over methodology in the realm of meta-analysis procedures hardly makes any one of the debated procedures “not generally accepted,” especially in the context of meta-analysis of uncommon adverse events arising in clinical trials designed for other outcomes.  After all, investigators do not design trials to assess a suspected causal association between a medication and an adverse outcome as their primary outcome.  The debate over the ethics of such a trial would be much greater than any gentle debate over whether to include zero-event trials by using either the risk difference or imputation procedures.

The gravamen of the plaintiffs’ complaint against Wei seems to be that he included too many zero-event trials, “skewing the numbers greatly,” and that he “notably cites to no publications in which the dominant portion of the meta-analysis was comprised of studies with no events.”  Id.  The plaintiffs further argued that Wei could have minimized the “distortion” created by imputation by imputing a fractional event, “a smaller number like .000000001,” to each trial.  Id.  The plaintiffs notably cited no texts or articles for this strategy.  In any event, if the zero-event trials are small, as they typically are, then they will have large study variances.  Because meta-analyses weight each trial by the inverse of its variance, studies with large variances have little weight in the summary estimate of association.  Including small studies with imputation methods will generally not affect the outcome very much, and their contribution may well reflect the reality of lower or non-differential risk from the medication.
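
The mechanics bear this out.  Here is a minimal sketch of the standard 0.5 continuity correction applied to a zero-event 2 x 2 table (the numbers are hypothetical; the point is the weighting):

```python
import math

def log_or_and_se(a, b, c, d, cc=0.5):
    """Log odds ratio and its standard error from a 2x2 table
    (a, b = drug events / non-events; c, d = placebo events / non-events),
    applying a continuity correction when any cell is zero."""
    if 0 in (a, b, c, d):
        a, b, c, d = a + cc, b + cc, c + cc, d + cc
    return math.log((a * d) / (b * c)), math.sqrt(1/a + 1/b + 1/c + 1/d)

# A small zero-event trial: 0/100 events on drug, 0/100 on placebo
log_or, se = log_or_and_se(0, 100, 0, 100)
print(log_or, round(1 / se ** 2, 2))   # log OR = 0.0, weight ~ 0.25

# A trial with events: 10/990 on drug vs. 5/995 on placebo
log_or, se = log_or_and_se(10, 990, 5, 995)
print(round(log_or, 2), round(1 / se ** 2, 1))  # log OR ~ 0.70, weight ~ 3.3
```

Under inverse-variance weighting, the imputed zero-event trial contributes a null log odds ratio with less than a tenth of the weight of the modest trial with events; calling that “dilution” is simply objecting that the zero-event data are, in fact, data.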

Eliminating trials on the ground that they had zero events has also been criticized for throwing away important data.  Charles H. Hennekens, David L. DeMets, C. Noel Bairey Merz, Steven L. Borzak, Jeffrey S. Borer, “Doing More Harm Than Good,” 122 Am. J. Med. 315 (2009) (criticizing Nissen’s meta-analysis of rosiglitazone, which excluded zero-event trials, as biased towards overestimating the magnitude of the summary estimate of association); George A. Diamond, L. Bax, S. Kaul, “Uncertain effects of rosiglitazone on the risk for myocardial infarction and cardiovascular death,” 147 Ann. Intern. Med. 578 (2007) (conducting sensitivity analyses on Nissen’s meta-analysis of rosiglitazone to show that Nissen’s findings lost statistical significance when continuity corrections were made for zero-event trials).

 

RISK DIFFERENCE

The plaintiffs are correct that the risk difference is not the predominant risk measure used in meta-analysis, or in clinical trials for that matter.  Researchers prefer risk ratios because the base rate is built into the ratio.  As one textbook explains:

“the limitation of the [risk difference] statistic is its insensitivity to base rates. For example, a risk that increases from 50% to 52% may be less important than one that increases from 2% to 4%, although in both instances RD = 0.02.”

Julia Littell, Jacqueline Corcoran, and Vijayan Pillai, Systematic Reviews and Meta-Analysis 85 (Oxford 2008).  This feature of the risk difference hardly makes its use unreliable, however.
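
The textbook’s base-rate point reduces to two lines of arithmetic:

```python
# Identical risk differences, very different risk ratios
for p_control, p_treated in ((0.50, 0.52), (0.02, 0.04)):
    rd = p_treated - p_control
    rr = p_treated / p_control
    print(f"RD = {rd:.2f}, RR = {rr:.2f}")
# RD = 0.02, RR = 1.04
# RD = 0.02, RR = 2.00
```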

Pfizer pointed out that at least one other case addressed the circumstances in which the risk difference would be superior to risk ratios in meta-analyses:

“The risk difference method is often used in meta-analyses where many of the individual studies (which are all being pooled together in one, larger analysis) do not contain any individuals who developed the investigated side effect.FN17 Whereas such studies would have to be excluded from an odds ratio calculation, they can be included in a risk difference calculation.FN18

FN17. This scenario is more likely to occur when studying a particularly rare event, such as suicide.

FN18. Studies where no individuals experienced the effect must be excluded from an odds ratio calculation because their inclusion would necessitate dividing by zero, which, as perplexed middle school math students come to learn, is impossible. The risk difference’s reliance on subtraction, rather than division, enables studies with zero incidences to remain in a meta-analysis. (Hr’g Tr. 310-11, June 20, 2008 (Gibbons.)).”

In re Neurontin Marketing, Sales Practices, and Products Liab. Litig., 612 F. Supp. 2d 116, 126 (D. Mass. 2009) (MDL 1629).  See Pfizer Defendants’ Mem. of Law in Opp. to Plaintiffs’ Motion to Exclude Expert Testimony by Defendants’ Expert Dr. Lee-Jen Wei (Sept. 8, 2009), in Securities Litig. (citing In re Neurontin).

Pfizer also pointed out that Wei had employed both the risk ratio and the risk difference in conducting his meta-analyses, and that none of his summary estimates of association were statistically significant.  Id. at 19, 24.


EXACT CONFIDENCE INTERVALS

The plaintiffs argued that the use of “exact confidence intervals” was not scientifically reliable and could not have been used by Pfizer during the time period covered by the securities class’s allegations.  See Plaintiffs’ Reply Mem. of Law in Further Support of Their Motion to Exclude Expert Testimony by Defendants’ Expert Dr. Lee-Jen Wei at 15 (May 5, 2010).  Exact intervals, however, are hardly a novelty, and there is often no single way to calculate a confidence interval.  See E. B. Wilson, “Probable inference, the law of succession, and statistical inference,” 22 J. Am. Stat. Ass’n 209 (1927); C. Clopper & E. S. Pearson, “The use of confidence or fiducial limits illustrated in the case of the binomial,” 26 Biometrika 404 (1934).  Approximation methods are often used, despite their lack of precision, because of their ease of calculation.
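
One standard construction of an exact interval is the Clopper-Pearson interval, which inverts the binomial distribution directly rather than relying on a normal approximation.  A sketch (illustrative of the general technique, not of Wei’s actual computations) shows how far the approximation can stray with rare events:

```python
from scipy.stats import beta

def clopper_pearson(k, n, alpha=0.05):
    """Exact (Clopper-Pearson) two-sided CI for a binomial proportion."""
    lo = beta.ppf(alpha / 2, k, n - k + 1) if k > 0 else 0.0
    hi = beta.ppf(1 - alpha / 2, k + 1, n - k) if k < n else 1.0
    return lo, hi

# Two events in 400 patients: a rare-event setting
k, n = 2, 400
p = k / n
lo, hi = clopper_pearson(k, n)
half = 1.96 * (p * (1 - p) / n) ** 0.5  # Wald (normal) approximation
print(f"exact CI: ({lo:.4f}, {hi:.4f})")
print(f"Wald CI:  ({p - half:.4f}, {p + half:.4f})")  # lower limit is negative
```

The Wald interval’s lower limit is negative, an impossible value for a proportion; the exact interval respects the boundary.  Nothing about the exact construction was novel in the class period; Clopper and Pearson published it in 1934.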

Plaintiffs further claimed that the combination of the risk difference and exact intervals was novel, not reliable, and not in existence during the class period.  Plaintiffs’ Reply Mem. at 15.  The plaintiffs’ argument traded on Wei’s having published on the use of exact intervals in conjunction with the risk difference for heart attacks in clinical trials of Avandia.  See L. Tian, T. Cai, M.A. Pfeffer, N. Piankov, P.Y. Cremieux, and L.J. Wei, “Exact and efficient inference procedure for meta-analysis and its application to the analysis of independent 2 x 2 tables with all available data but without artificial continuity correction,” 10 Biostatistics 275 (2009).  Their argument ignored that Wei had combined two well-understood statistical techniques, in a transparent way, with empirical testing of the validity of his approach.  Contrary to plaintiffs’ innuendo, Wei did not develop his approach as an expert witness for GlaxoSmithKline; a version of the manuscript describing his approach was posted online well before he was ever contacted by GSK counsel.  (L.J. Wei, personal communication.)  Plaintiffs also claimed that Wei’s use of exact intervals for the risk difference showed no increased risk of heart attack for Avandia, contrary to a well-known meta-analysis by Dr. Steven Nissen.  See Steven E. Nissen, M.D., and Kathy Wolski, M.P.H., “Effect of Rosiglitazone on the Risk of Myocardial Infarction and Death from Cardiovascular Causes,” 356 New Engl. J. Med. 2457, 2457 (2007).  This claim, however, is a crude distortion of Wei’s paper, which showed that there was a positive risk difference for heart attacks in the same dataset used by Nissen, but that the confidence intervals included zero (no risk difference), and thus chance could not be excluded as an explanation of Nissen’s result.

 

DURATION OF TRIALS

Pfizer was ultimately successful in defending the Celebrex litigation on the basis of a lack of risk associated with 200 mg/day use.  Pfizer also attempted to argue a duration effect, on the ground that in the one large trial that saw a statistically significant hazard ratio associated with higher doses, the result occurred for the first time among trial participants on medication at 33 months into the trial.  Judge Breyer rejected this challenge, without explanation.  In re Bextra & Celebrex Marketing Sales Practices & Prod. Liab. Litig., 524 F. Supp. 2d 1166, 1183 (N.D. Cal. 2007).  The reasonable inference, however, is that the meta-analyses showed statistically significant results across trials with shorter durations of use, for 400 mg and 800 mg/day use.

Clearly duration of use is a potential consideration, unless the mechanism of causation is such that a causally related adverse event would occur from the first use or very short-term use of the medication.  See In re Vioxx Prods. Liab. Litig., MDL No. 1657, 414 F. Supp. 2d 574, 579 (E.D. La. 2006) (“A trial court may consider additional factors in assessing the scientific reliability of expert testimony . . . includ[ing] whether the expert’s opinion is based on incomplete or inaccurate dosage or duration data.”).  In the Celebrex litigation, plaintiffs’ counsel appeared to want to have duration effects both ways; they did not want to disenfranchise plaintiffs whose claims turned on short-term use, but at the same time, they criticized Professor Wei for including short-term trials of Celebrex.

One form of the plaintiffs’ criticism was that Wei failed to weight the trials included in his meta-analyses by duration.  In the plaintiffs’ words:

“Wei failed to utilize important information regarding the duration of the clinical trials that he analyzed, information that is critical to interpreting and understanding the Celebrex and Bextra safety information that is contained within those clinical trials.3 Because the types of cardiovascular events that are at issue in this case occur relatively rarely and are more likely to be observed after an extended period of exposure, the scientific community is in agreement that they would not be expected to appear in trials of very short duration.”

Plaintiffs’ Mem. of Law in Support of Their Motion to Exclude Expert Testimony by Defendants’ Expert Dr. Lee-Jen Wei at 2 (July 23, 2009), submitted in In re Pfizer, Inc. Securities Litig., Nos. 04 Civ. 9866(LTS)(JLC), 05 md 1688(LTS) (S.D.N.Y.) [hereafter Securities Litig.].  The plaintiffs maintained that Wei’s meta-analyses were “fatally flawed” because he ignored trial duration, such as would be factored in by performing the analyses in terms of patient-years.  Id. at 3.

Many of the sources cited by plaintiffs do not support their argument.  For instance, the plaintiffs cited articles noting that weighted averages should be used, but virtually all methods, including Wei’s, weight studies by their variance, which takes sample size into account.  Id. at 9 n.3, citing M. Egger et al., “Meta-analysis: Principles and Procedures,” 315 Brit. Med. J. 1533 (1997) (an arithmetic average of all trials gives misleading results, as results from small studies are more subject to the play of chance and should be given less weight; meta-analyses use weighted results in which larger trials have more influence than smaller ones).  See also id. at 22.  True, true, and immaterial.  No one in the Celebrex cases was using an arithmetic average of risk across trials or studies.

Most of the short-term studies were small, and thus contributed little to the overall summary estimate of association.  Some of the plaintiffs’ citations actually supported using “individual patient data” in the form of time-to-event analyses, which was not possible with many of the clinical trials available.  Indeed, the article the plaintiffs cited, by Dahabreh, did not use time-to-event data for rosiglitazone, because such data were not generally available.  Id. at 9 n.3, citing Dahabreh, “Meta-analysis of rare events: an update and sensitivity analysis of cardiovascular events in randomized trials of rosiglitazone,” 5 Clinical Trials 116 (2008).

The plaintiffs’ claim was thus a fairly weak challenge to using simple 2 x 2 tables for the studies included in Wei’s meta-analysis.  Both sides failed to mention that many published meta-analyses eschew patient-years in favor of a simple odds ratio computed from dichotomous count data for each included study.  See, e.g., Steven E. Nissen, M.D., and Kathy Wolski, M.P.H., “Effect of Rosiglitazone on the Risk of Myocardial Infarction and Death from Cardiovascular Causes,” 356 New Engl. J. Med. 2457, 2457 (2007) (using the Peto method with count data, in a fixed-effect model).  Patient-years would be a crude tool with which to modify the fairly common 2 x 2 table.  An analysis of large studies, with a high number of patient-years, would still not reveal whether the adverse events occurred early or late in the trials.  Only a time-to-event analysis could provide the missing information about “duration,” and neither side’s expert witnesses appeared to have used a time-to-event analysis.

Interestingly, plaintiffs’ expert witness, Prof. Madigan, appears to have received the patient-level data from Pfizer’s clinical trials, but he still did not conduct a time-to-event analysis.  Plaintiffs’ Mem. of Law in Support of Their Motion to Exclude Expert Testimony by Defendants’ Expert Dr. Lee-Jen Wei at 12 (July 23, 2009), in Securities Litig. (noting that Madigan had examined all SAS data files produced by Pfizer, and that “[t]hese files contained voluminous information on each subject in the trials, including information about duration of exposure to the drug (or placebo), any adverse events experienced and a wide variety of other information.”).  Of course, even with time-to-event data from the Pfizer clinical trials, Madigan had the problem of whether to limit himself to just the Pfizer trials or to use all the data, including non-Pfizer trials.  If he opted for completeness, he would have been forced to include trials for which he did not have underlying data.  In all likelihood, Madigan used patient-years in his analyses because he could not conduct a complete analysis with time-to-event data for all trials.

The plaintiffs’ point would appear well taken if the court were to assume that there really was a duration issue, but the plaintiffs’ theories were to the contrary, and Pfizer lost its attempt to limit claims to those events that appeared 33 months (or some other fixed time) after first ingestion.  It is certainly correct that patient-year analyses, in the absence of time-to-event analyses, are generally preferred.  Pfizer had used patient-year information to analyze combined trials in its submission to the FDA’s Advisory Committee.  See Pfizer’s Submission of Advisory Committee Briefing Document at 15 (January 12, 2005).  See also FDA Reviewer Guidance: Conducting a Clinical Safety Review of a New Product Application and Preparing a Report on the Review at 22 (2005); id. at 15 (“If there is a substantial difference in exposure across treatment groups, incidence rates should be calculated using person-time exposure in the denominator, rather than number of patients in the denominator.”); R. H. Friis & T. A. Sellers, Epidemiology for Public Health Practice at 105 (2008) (“To allow for varying periods of observation of the subjects, one uses a modification of the formula for incidence in which the denominator becomes person-time of observation”).
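
The person-time arithmetic behind the FDA guidance is simple enough.  A hypothetical sketch of why unequal exposure time distorts a per-patient comparison:

```python
# Hypothetical arms with unequal time at risk
events_drug, py_drug = 12, 4000       # 12 events over 4,000 patient-years
events_placebo, py_placebo = 5, 2500  # 5 events over 2,500 patient-years

rate_drug = 1000 * events_drug / py_drug           # 3.0 per 1,000 patient-years
rate_placebo = 1000 * events_placebo / py_placebo  # 2.0 per 1,000 patient-years
print(f"rate ratio = {rate_drug / rate_placebo:.2f}")  # 1.50

# A per-patient comparison would mislead if, say, the drug-arm trials
# ran twice as long: more time at risk mechanically produces more events.
```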

Professor Wei chose not to do a patient-year analysis because such a methodological commitment would have required him to drop over a dozen Celebrex clinical trials, involving thousands of patients and dozens of the heart attack and stroke events of interest.  Madigan’s approach led him to disregard a large amount of data.  Wei could, of course, have stratified the summary estimates by clinical trial length, and analyzed whether there were differences as a function of trial duration.  Pfizer claimed that Wei conducted a variety of sensitivity analyses, but it is unclear whether he ever used this particular technique.  In any event, Wei should have been allowed to take plaintiffs at their word that thrombotic events from Celebrex occurred shortly after first ingestion.  Pfizer Mem. of Law in Opp. to Plaintiffs’ Motion to Exclude Defendants’ Expert Dr. Lee-Jen Wei at 2 (Sept. 8, 2009), in Secur. Litig.

 

MADIGAN’S META-ANALYSIS

According to Pfizer, Professor Madigan reached different results from Wei’s largely because he had used different event counts and end points.  The defendants’ challenge to Madigan turned largely upon the unreliable way he went about counting events to include in his meta-analyses.

Data concerning unexpected adverse events in clinical trials often are collected as reports from treating physicians, whose descriptions may be incomplete, inaccurate, or inadequate.  When there is a suggestion that a particular adverse event – say, heart attack – occurred more frequently in the medication arm as opposed to the placebo or comparator arms, the usual course of action is to have a panel of clinical experts review all the adverse event reports, and supporting medical charts, to provide diagnoses that can be used in more complete statistical analyses.  Obviously, the reviewers should be blinded to the patients’ assignment to medication or placebo, and the reviewers should be experts in the clinical specialty of the adverse event.  Cardiologists should be making the call for heart attacks.

In addition to event definition and adjudication, clinical trial interpretation sometimes leads to the use of “composite end points,” which consist of related diagnostic categories, aggregated in some way that makes biological sense.  For instance, if the concern is that a medication causes cardiovascular thrombotic events, a suitable cardiovascular composite end point might include heart attack and ischemic stroke.  Inclusion of hemorrhagic stroke, endocarditis, and valvular disease in the composite, however, would be inappropriate, given the concern over thrombosis.
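
A minimal sketch of how such a composite might be assembled from adjudicated events follows.  The category names are illustrative, not the coding terms actually used in the Pfizer trials.

```python
# Assembling a composite end point from adjudicated event categories.
# The category names are illustrative, not Pfizer's actual coding terms.

# A thrombotic composite includes only events plausibly caused by clot
# formation; hemorrhagic and structural cardiac events do not belong.
THROMBOTIC_COMPOSITE = {"myocardial_infarction", "ischemic_stroke"}
EXCLUDED = {"hemorrhagic_stroke", "endocarditis", "valvular_disease"}
assert THROMBOTIC_COMPOSITE.isdisjoint(EXCLUDED)

def composite_count(adjudicated_events, composite):
    """Count adjudicated events whose category falls within the composite."""
    return sum(1 for e in adjudicated_events if e["category"] in composite)

# Hypothetical adjudicated events from one treatment arm:
events = [
    {"subject": 101, "category": "myocardial_infarction"},
    {"subject": 102, "category": "hemorrhagic_stroke"},
    {"subject": 103, "category": "ischemic_stroke"},
]

print(composite_count(events, THROMBOTIC_COMPOSITE))   # 2: the bleed does
# not count toward a composite limited to thrombotic events.
```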

Professor Madigan is a highly qualified statistician, but, as Pfizer argued, he had no clinical expertise with which to reassign diagnoses or to determine appropriate composite end points.  The essence of the defendants’ challenges revolved around claims of flawed outcome and end point ascertainment and definition.  According to Pfizer’s briefing, the event definition process was unblinded, and conducted by inexpert, partisan reviewers.  Madigan apparently relied upon the work of another plaintiffs’ witness, cardiologist Dr. Lawrence Baruch, as well as that of Dr. Curt Furberg.  Furberg was not a cardiologist; indeed, he had never been licensed to practice medicine in the United States, and he had not treated a patient in over 30 years. Pfizer Mem. of Law in Opp. to Plaintiffs’ Motion to Exclude Defendants’ Expert Dr. Lee-Jen Wei at 29 (Sept. 8, 2009), in Securities Litig.  Furthermore, Furberg was not familiar with current diagnostic criteria for heart attack.  Plaintiffs’ counsel asked Furberg to rework some, but not all, of Baruch’s classifications, and only for fatal events.  Baruch could not explain why Furberg made these reclassifications.  Furberg acknowledged that, before receiving the assignment from plaintiffs’ counsel on the eve of the court’s disclosure deadline, he had never used “one-line descriptions to classify events,” as he did in the Celebrex litigation.  Id.  According to Pfizer, if the plaintiffs’ witnesses had used appropriate end points and event counts, their meta-analyses would not have differed from Professor Wei’s work.  Id.

Pfizer pointed to Madigan’s testimony to claim that he had admitted that, given the impropriety of Furberg’s changed end point definitions, and of his own changes made without the assistance of a clinician, he would not submit the earlier version of his meta-analysis for peer review.  Pfizer’s [Proposed] Findings of Fact and Conclusions of Law with Respect to Motion to Exclude Certain Plaintiffs’ Experts’ Opinions Regarding Celebrex and Bextra, and Plaintiffs’ Motion to Exclude Defendants’ Expert Dr. Lee-Jen Wei, Document 175, submitted in Securities Litig. (Dec. 4, 2009), at 33, 43.  The plaintiffs countered that Furberg’s reclassifications did not change Madigan’s results, at least for certain years.  Plaintiffs’ Reply Mem. of Law in Further Support of Their Motion to Exclude Expert Testimony by Defendants’ Expert Dr. Lee-Jen Wei at 18 (May 5, 2010), in Securities Litig.

The trial court denied Pfizer’s challenges to Madigan’s meta-analysis in the securities fraud class action.  The court attributed any weakness in the classification of fatal adverse events by Baruch and Furberg to the limitations of the underlying data created and produced by Pfizer itself.  In re Pfizer Inc. Securities Litig., 2010 WL 1047618, *4 (S.D.N.Y. 2010).


COMPOSITES

Pfizer also argued that Madigan put together composite outcomes that did not make biological sense in view of the plaintiffs’ causal theories.  For instance, Madigan left strokes out of his composite here, although he had included both heart attack and stroke in his primary end point for his Vioxx litigation analysis, and he had no reason to distinguish Vioxx from Celebrex in terms of their claimed thrombotic effects.  Pfizer’s [Proposed] Findings of Fact and Conclusions of Law with Respect to Motion to Exclude Certain Plaintiffs’ Experts’ Opinions Regarding Celebrex and Bextra, and Plaintiffs’ Motion to Exclude Defendants’ Expert Dr. Lee-Jen Wei, Document 175, submitted in Securities Litig. (Dec. 4, 2009), at 13-14, 18.  According to Pfizer, Madigan’s composite was novel and unvalidated by relevant clinical opinion.  Id. at 29, 33.

The plaintiffs’ response is obscure.  The plaintiffs seemed to claim that Madigan was justified in excluding strokes because some kinds of stroke, the hemorrhagic ones, are unrelated to thrombosis.  Plaintiffs’ Reply Mem. of Law in Further Support of Their Motion to Exclude Expert Testimony by Defendants’ Expert Dr. Lee-Jen Wei at 14 (May 5, 2010), in Securities Litig.  This argument is undermined by the facts:  better than 85% of strokes are ischemic in origin, and even some hemorrhagic strokes begin as ischemic events.

In any event, Pfizer’s argument about Madigan’s composite end points did not gain any traction with the trial judge in the securities fraud class action:

“Dr. Madigan’s written submissions and testimony described clearly and justified cogently his statistical methods, selection of endpoints, decisions regarding event classification, sources of data, as well as the conclusions he drew from his analysis. Indeed, Dr. Madigan’s meta-analysis was based largely on data and endpoints developed by Pfizer. All four of the endpoints that Dr. Madigan used in his analysis – Hard CHD, Myocardial Thromboembolic Events, Cardiovascular Thromboembolic Events, and CV Mortality – have been employed by Pfizer in its own research and analysis. The use of Hard CHD in the relevant literature combined with the use of the other three endpoints by Pfizer in its own 2005 meta-analysis will assist the trier of fact in determining Pfizer’s knowledge and understanding of the pre-December 17, 2004, cardiovascular safety profile of Celebrex.”

In re Pfizer Inc. Securities Litig., 2010 WL 1047618, *4 (S.D.N.Y. 2010).