
TORTINI

For your delectation and delight, desultory dicta on the law of delicts.

Fixodent Study Causes Lockjaw in Plaintiffs’ Counsel

February 4th, 2015

Litigation Drives Science

Back in 2011, the Fixodent MDL Court sustained Rule 702 challenges to plaintiffs’ expert witnesses. “Hypotheses are verified by testing, not by submitting them to lay juries for a vote.” In re Denture Cream Prods. Liab. Litig., 795 F. Supp. 2d 1345, 1367 (S.D.Fla.2011), aff’d, Chapman v. Procter & Gamble Distrib., LLC, 766 F.3d 1296 (11th Cir. 2014). The Court found that the plaintiffs had raised a superficially plausible hypothesis, but that they had not verified the hypothesis by appropriate testing[1].

Like dentures to Fixodent, the plaintiffs stuck to their claims, and set out to create the missing evidence. Plaintiffs’ counsel contracted with Dr. Salim Shah and his companies Sarfez Pharmaceuticals, Inc. and Sarfez USA, Inc. (“Sarfez”) to conduct human research in India, to support their claims that zinc in denture cream causes neurological damage[2]. In re Denture Cream Prods. Liab. Litig., Misc. Action 13-384 (RBW), 2013 U.S. Dist. LEXIS 93456, *2 (D.D.C. July 3, 2013). When the defense learned of this study, and of plaintiffs’ counsel’s payments of over $300,000 to support it, they sought discovery of raw data, study protocol, statistical analyses, and other materials from plaintiffs’ counsel. Plaintiffs’ counsel protested that they did not have all the materials, and directed defense counsel to Sarfez. Although other courts have made counsel produce similar materials from the scientists and independent contractors they engaged, in this case, defense counsel followed the trail of documents to the contractor, Sarfez, with subpoenas in hand. Id. at *3-4.

The defense served a Rule 45 subpoena on Sarfez, which produced some, but not all, responsive documents. Procter & Gamble pressed for the missing materials, including study protocols, analytical reports, and raw data. Id. at *12-13. Judge Reggie Walton upheld the subpoena, finding that its requests for underlying data and non-privileged correspondence were within the scope of Rules 26(b) and 45, and not unduly burdensome. Id. at *9-10, *20. Sarfez attempted to argue that the requested materials, listed as email attachments, might not exist, but Judge Walton branded the suggestion “disingenuous.” Attachments to emails should be produced along with the emails. Id. at *12 (citing and collecting cases). Although Judge Walton did not grant a request for forensic recovery of hard-drive data or for sanctions, His Honor warned Sarfez that it might be required to bear the cost of forensic data recovery if it did not comply with the court’s order. Id. at *15, *22.

Plaintiffs Put Their Study Into Play

The study at issue in the subpoena was designed by Frederick K. Askari, M.D., Ph.D., an associate professor of hepatology, in the University of Michigan Health System. In re Denture Cream Prods. Liab. Litig., No. 09–2051–MD, 2015 WL 392021, at *7 (S.D. Fla. Jan. 28, 2015). At the instruction of plaintiffs’ counsel, Dr. Askari sought to study the short-term effects of Fixodent on copper absorption in humans. Working in India, Askari conducted the study on 24 participants, who were given a controlled diet for 36 days. Of the 24 participants, 12, randomly selected, received 12 grams of Fixodent per day (containing 204 mg. of zinc). Another six participants, randomly selected, were given zinc acetate, three times per day (150 mg of zinc), and the remaining six participants received placebo, three times per day.
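The allocation described above (24 participants, with 12 assigned to Fixodent, 6 to zinc acetate, and 6 to placebo) can be illustrated with a minimal sketch. The reported decision does not describe the randomization procedure actually used in the study; the function name, the seed, and the participant labels below are hypothetical and serve only to show what a fixed-size random allocation looks like.

```python
import random
from collections import Counter

# Arm sizes as reported in the decision; everything else is illustrative.
ARM_SIZES = {"Fixodent": 12, "zinc acetate": 6, "placebo": 6}

def allocate(participants, arm_sizes, seed):
    """Randomly assign participants to arms of fixed size by shuffling."""
    if len(participants) != sum(arm_sizes.values()):
        raise ValueError("arm sizes must add up to the number of participants")
    rng = random.Random(seed)  # fixed seed so the allocation can be reproduced
    shuffled = list(participants)
    rng.shuffle(shuffled)
    assignment = {}
    start = 0
    for arm, size in arm_sizes.items():
        # Each arm takes the next `size` participants from the shuffled list.
        for person in shuffled[start:start + size]:
            assignment[person] = arm
        start += size
    return assignment

participants = [f"P{i:02d}" for i in range(1, 25)]  # hypothetical IDs
assignment = allocate(participants, ARM_SIZES, seed=2015)
print(Counter(assignment.values()))
```

Recording the seed and the allocation list before dosing begins is one way to make the randomization auditable later, which matters when the study is destined for a courtroom.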

A study protocol was approved by an independent group[3], id. at *9, and the study was supposed to be conducted with a double blind. Id. at *7. Not surprisingly, those participants who received doses of Fixodent or zinc acetate had higher urinary levels of zinc (pee < 0.05). The important issue, however, was whether the dietary zinc levels affect copper excretion in a way that would support plaintiffs’ claims that copper levels were lowered sufficiently by Fixodent to cause a syndromic neurological disorder. The MDL Court ultimately concluded that plaintiffs’ expert witnesses’ opinions on general causation claims were not sufficiently supported to satisfy the requirements of Rule 702, and upheld defense challenges to those expert witnesses. In doing so, the MDL Court had much of interest to say about case reports, weight of the evidence, and other important issues. This post, however, concentrates on the deviations of one study, commissioned by plaintiffs’ counsel, from the scientific standard of care. The Askari “research” makes for a fascinating case study of how not to conduct a study in a litigation caldron.

Non-Standard Deviations

The First Deviation – Changing the Ascertainment Period After the Data Are Collected

The protocol apparently identified a primary endpoint to be:

“the mean increase in [copper 65] excretion in fecal matter above the baseline (mg/day) averaged over the study period … to test the hypothesis that the release of [zinc] either from Fixodent or Zinc Acetate impairs [copper 65] absorption as measured in feces.”

The study outcome, on the primary end point, was clear. The plaintiffs’ testifying statistician, Hongkun Wang, stated in her deposition that the fecal copper (whether isotope Cu63 or Cu65) was not different across the three groups (Fixodent, zinc acetate, and placebo). Id. at *9[4]. Even Dr. Askari himself admitted that the total fecal copper levels were not increased in the Fixodent group compared with the placebo control group. Id. at *9.[5]

Apparently after obtaining the data, and finding no difference in the pre-specified end point of average fecal copper levels between Fixodent and placebo groups, Askari turned to a new end point, measured in a different way, not described in the protocol as the primary end point.

The Second Deviation – Changing Primary End Point After the Data Are Collected

In the early (days 3, 4, and 5) and late (days 31, 32, and 33) parts of the Study, participants received a dose of purified copper 65[6] to help detect the “blockade of copper.” Id. at *8. The participants’ fecal copper 65 levels were compared to their naturally occurring copper 63 levels. According to Dr. Askari:

“if copper is being blocked in the Fixodent and zinc acetate test subjects from exposure to the zinc in the test product (Fixodent) and positive control (zinc acetate), the ratio of their fecal output of copper 65 as compared to their fecal output of copper 63 would increase relative to the control subjects, who were not dosed with zinc. In short, a higher ratio of copper 65 to copper 63 reflects blocking of copper.”

Id.

Askari analyzed the ratio of two copper isotopes (Cu65/Cu63), limiting the period of observation to study days 31 to 33. Id. at *9. Askari thus changed the outcome to be measured, the timing of the measurement, and the manner of measurement (average over the entire period versus amount on days 31 to 33). On this post hoc, non-prespecified end point, Askari claimed to have found “significant” differences.
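For readers who want to see what the isotope ratio measures, here is a minimal sketch. It uses only the approximate natural abundances given in note 6 (Cu65 about 31 percent, Cu63 about 69 percent) to compute a baseline ratio. The function name is hypothetical, and no study data are used, because the reported decision does not give the underlying measurements.

```python
# Approximate natural abundances of the two stable copper isotopes (see note 6).
NATURAL_CU65 = 0.31
NATURAL_CU63 = 0.69

def isotope_ratio(cu65, cu63):
    """Return the Cu65/Cu63 ratio for two amounts measured in the same units."""
    if cu63 <= 0:
        raise ValueError("Cu63 amount must be positive")
    return cu65 / cu63

# Baseline ratio expected from naturally occurring copper alone.
baseline = isotope_ratio(NATURAL_CU65, NATURAL_CU63)
print(f"natural Cu65/Cu63 ratio: {baseline:.3f}")
```

On Dr. Askari's theory, unabsorbed tracer copper 65 appearing in feces would push a participant's ratio above this baseline; whether that comparison was the prespecified end point is the issue the Court addressed.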

The MDL Court expressed its skepticism and concern over the difference between the protocol’s specified end point, and one that came into the study only after the data were obtained and analyzed. The plaintiffs claimed that it was their (and Askari’s) intention from the initial stages of designing the Fixodent Blockade Study to use the Cu65/Cu63 ratio as the primary end point. According to the plaintiffs, the isotope ratio was simply better articulated and “clarified” as the primary end point in the final report than it was in the protocol. The Court was not amused or assuaged by the plaintiffs’ assurances. The study sponsor, Dr. Salim Shah, could not point to a draft protocol that indicated the isotope ratio as the end point; nor could Dr. Shah identify a request for this analysis by Wang until after the study was concluded. Id. at *9.[7]

Ultimately, the Court declared that whether the protocol was changed post hoc after the primary end point provided disappointing analysis, or the isotope ratio was carelessly omitted from the protocol, the design or conduct of the study was “incompatible with reliable scientific methodology.”

The Third Deviation – Changing the Standard of “Significance” After the Data Are Collected and P-Values Are Computed

The protocol for the Blockade study called for a pre-determined Type I error rate (significance level, or alpha) of no more than 5 percent.[8] Id. at *10. The difference in the isotope ratio showed an attained level of significance probability of 5.7 percent, and thus even the post hoc end point missed the prespecified level of significance. The final protocol changed the value of “significance” to 10 percent, to permit the plaintiffs to declare a “statistically significant” result. Dr. Wang admitted in deposition that she doubled the acceptable level of Type I error only after she obtained the data and calculated the p-value of 0.057. Id. at *10.[9]
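The arithmetic of the moved goal post is simple enough to sketch. The three numbers below come from the decision as described above (attained p-value of 0.057, protocol threshold of 5 percent, post hoc threshold of 10 percent); the function and variable names are hypothetical.

```python
def is_significant(p_value, alpha):
    """A result is declared 'statistically significant' when p < alpha."""
    return p_value < alpha

P_ATTAINED = 0.057       # p-value computed after the data were collected
ALPHA_PROTOCOL = 0.05    # threshold specified in the protocol
ALPHA_POST_HOC = 0.10    # threshold adopted after the p-value was known

for label, alpha in [("protocol", ALPHA_PROTOCOL), ("post hoc", ALPHA_POST_HOC)]:
    verdict = "significant" if is_significant(P_ATTAINED, alpha) else "not significant"
    print(f"{label} alpha = {alpha:.2f}: p = {P_ATTAINED} is {verdict}")
```

The same p-value fails the prespecified test and passes the revised one. The threshold is meant to be fixed before the data are seen precisely so that the verdict cannot be chosen afterwards.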

The Court found that this deliberate moving of the statistical goal post reflected a “lack of objectivity and reliability,” which smacked of contrivance[10].

The Court found that the study’s deviations from the protocol demonstrated a lack of objectivity. The inadequacy of the Study’s statistical analysis plan supported the Court’s conclusion that Dr. Askari’s supposed finding of a “statistically significant” difference in fecal copper isotope ratio between Fixodent and placebo group participants was “not based on sufficiently reliable and objective scientific methodology” and thus could not support plaintiffs’ expert witnesses’ general causation claims.

The Fourth Deviation – Failing to Take Steps to Preserve the Blind

The protocol called for a double-blinded study, with neither the participants nor the clinical investigators knowing which participant was in which group. Rather than receiving capsules that looked similar, the three groups each received starkly different-looking capsules. Id. at *11. The capsules for one set were apparently so large that the investigators worried whether the participants would comply with the dosing regimen.

The Fifth Deviation – Failing to Take Steps to Keep Biological Samples From Becoming Contaminated

Documents and emails from Dr. Shah acknowledged that there had been “difficulties in storing samples at appropriate temperature.” Id. at *11. Fecal samples were “exposed to unfrozen and undesirable temperature conditions.” Dr. Shah called for remedial steps from the Study manager, but there was no documentation that such steps were taken to correct the problem. Id.

The Consequences of Discrediting the Study

Dr. Askari opined that the Study, along with other evidence, shows that Fixodent can cause copper deficiency myeloneuropathy (“CDM”). The plaintiffs, of course, argued that the Defendants’ criticisms of the Fixodent Study’s methodology went merely to the “weight rather than admissibility.” Id. at *9. Askari’s study was but one leg of the stool, but the defense’s thorough discrediting of the study was an important step in collapsing the support for the plaintiffs’ claims. As the MDL Court explained:

“The Court cannot turn a blind eye to the myriad, serious methodological flaws in the Fixodent Blockade Study and conclude they go to weight rather than admissibility. While some of these flaws, on their own, may not be serious enough to justify exclusion of the Fixodent Blockade Study; taken together, the Court finds Fixodent Blockade Study is not “good science,” and is not admissible. Daubert, 509 U.S. at 593 (internal quotation marks and citation omitted).”

Id. at *11.

A study, such as the Fixodent Blockade Study, is not itself admissible, but the deconstruction of the study upon which plaintiffs’ expert witnesses relied, led directly to the Court’s decision to exclude those witnesses. The Court omitted any reference to Federal Rule of Evidence 703, which addresses the requirements of facts and data, otherwise inadmissible, which may be relied upon by expert witnesses in reaching their opinions.

[1] See “Philadelphia Plaintiff’s Claims Against Fixodent Prove Toothless” (May 2, 2012); Jacoby v. Rite Aid Corp., 2012 Phila. Ct. Com. Pl. LEXIS 208 (2012), aff’d, 93 A.3d 503 (Pa. Super. 2013); “Pennsylvania Superior Court Takes The Bite Out of Fixodent Claims” (Dec. 12, 2013).

[2] See “Using the Rule 45 Subpoena to Obtain Research Data” (July 24, 2013)

[3] The group was identified as the Ethica Norma Ethical Committee.

[4] Citing Wang Dep. at 56:7–25, Aug. 13, 2013, and Wang Analysis of Fixodent Blockade Study [ECF No. 2197–56] (noting “no clear treatment effect on Cu63 or Cu65”).

[5] Askari Dep. at 69:21–24, June 20, 2013.

[6] Copper 65 is not a typical tracer; it is not radioactive. Naturally occurring copper consists almost exclusively of two stable (non-radioactive) isotopes: Cu65, about 31 percent, and Cu63, about 69 percent. See, e.g., Manuel Olivares, Bo Lönnerdal, Steve A Abrams, Fernando Pizarro, and Ricardo Uauy, “Age and copper intake do not affect copper absorption, measured with the use of 65Cu as a tracer, in young infants,” 76 Am. J. Clin. Nutr. 641 (2002); T.D. Lyon, et al., “Use of a stable copper isotope (65Cu) in the differential diagnosis of Wilson’s disease,” 88 Clin. Sci. 727 (1995).

[7] Shah Dep. at 87:12–25; 476:2–536:12, 138:6–142:12, June 5, 2013.

[8] The reported decision leaves unclear how the analysis would proceed, whether by ANOVA for the three groups, or t-tests, and whether there was multiple testing.

[9] Wang Dep. at 151:13–152:7; 153:15–18.

[10] 2015 WL 392021, at *10, citing Perry v. United States, 755 F.2d 888, 892 (11th Cir. 1985) (“A scientist who has a formed opinion as to the answer he is going to find before he even begins his research may be less objective than he needs to be in order to produce reliable scientific results.”); Rink v. Cheminova, Inc., 400 F.3d 1286, 1293 n. 7 (11th Cir.2005) (“In evaluating the reliability of an expert’s method … a district court may properly consider whether the expert’s methodology has been contrived to reach a particular result.” (alteration added)).


More Case Report Mischief in the Gadolinium Litigation

November 28th, 2014

The Decker case is one curious decision, by the MDL trial court, and the Sixth Circuit. Decker v. GE Healthcare Inc., ___ F.3d ___, 2014 FED App. 0258P, 2014 U.S. App. LEXIS 20049 (6th Cir. Oct. 20, 2014). First, the Circuit went out of its way to emphasize that the trial court had discretion, not only in evaluating the evidence on a Rule 702 challenge, but also in devising the criteria of validity[1]. Second, the courts ignored the role and the weight being assigned to Federal Rule of Evidence 703, in winnowing the materials upon which the defense expert witnesses could rely. Third, the Circuit approved what appeared to be extremely asymmetric gatekeeping of plaintiffs’ and defendant’s expert witnesses. The asymmetrical standards probably were the basis for emphasizing the breadth of the trial court’s discretion to devise the criteria for assessing scientific validity[2].

In barring GEHC’s expert witnesses from testifying about gadolinium-naïve nephrogenic systemic fibrosis (NSF) cases, Judge Dan Polster, the MDL judge, appeared to invoke a double standard. Plaintiffs could adduce any case report or adverse event report (AER) on the theory that the reports were relevant to “notice” of a “safety signal” between gadolinium-based contrast agents in MRI and NSF. Defendants’ expert witnesses, however, were held to the most exacting standards of clinical identity with the plaintiff’s particular presentation of NSF, biopsy-proven presence of Gd in affected tissue, and documentation of lack of GBCA-exposure, before case reports would be permitted as reliance materials to support the existence of gadolinium-naïve NSF.

A fourth issue with the Decker opinion is the latitude it gave the district court to allow plaintiffs’ pharmacovigilance expert witness, Cheryl Blume, Ph.D., to testify, over objections, about the “signal” created by the NSF AERs available to GEHC. Decker at *11. At the same trial, the MDL judge prohibited GEHC’s expert witness, Dr. Anthony Gaspari, from testifying that the AERs described by Blume did not support a clinical diagnosis of NSF.

On a motion for reconsideration, Judge Polster reaffirmed his ruling on grounds that

(1) the AERs were too incomplete to rule in or rule out a diagnosis of NSF, although they were sufficient to create a “signal”;

(2) whether the AERs were actual cases of NSF was not relevant to their being safety signals;

(3) Dr. Gaspari was not an expert in pharmacovigilance, which studied “signals” as opposed to causation; and

(4) Dr. Gaspari’s conclusion that the AERs were not NSF was made without reviewing all the information available to GEHC at the time of the AERs.

Decker at *12.

The fallacy of this stingy approach to Dr. Gaspari’s testimony lies in the courts’ stubborn refusal to recognize that if an AER was not, as a matter of medical science, a case of NSF, then it could not be a “signal” of a possible causal relationship between GBCA and NSF. Pharmacovigilance does not end with ascertaining signals; yet the courts privileged Blume’s opinions on signals even though she could not proceed to the next step and evaluate diagnostic accuracy and causality. This twisted logic makes a mockery of pharmacovigilance. It also led to the exclusion of Dr. Gaspari’s testimony on a key aspect of plaintiffs’ liability evidence.

The erroneous approach pioneered by Judge Polster was compounded by the district court’s refusal to give a jury instruction that AERs were only relevant to notice, and not to causation. Judge Polster offered his reasoning that “the instruction singles out one type of evidence, and adds, rather than minimizes, confusion.” Judge Polster cited the lack of any expert witness testimony that suggested that AERs showed causation and “besides, it doesn’t matter because those patients are not, are not the plaintiffs.” Decker at *17.

The lack of dispute about the meaning of AERs would have seemed all the more reason to control jury speculation about their import, and to give a binding instruction on AERs and their limited significance. As for the AER patients’ not being the plaintiffs, well, the case report patients were not the plaintiffs, either. This last reason is not even wrong[3]. The Circuit, in affirming, turned a blind eye to the district court’s exercise of discretion in a way that systematically increased the importance of Blume’s testimony on signals, while systematically hobbling the defendant’s expert witnesses.

[1] “THE STANDARD OF APPELLATE REVIEW FOR RULE 702 DECISIONS” (Nov. 12, 2014).

[2] “Gadolinium, Nephrogenic Systemic Fibrosis, and Case Reports” (Nov. 24, 2014).

[3] “Das ist nicht nur nicht richtig, es ist nicht einmal falsch!” The quote is attributed to Wolfgang Pauli in R. E. Peierls, “Wolfgang Ernst Pauli, 1900-1958,” 5 Biographical Memoirs Fellows Royal Soc’y 175, 186 (1960).


Gadolinium, Nephrogenic Systemic Fibrosis, and Case Reports

November 24th, 2014

Gadolinium (Gd) is a rare earth element. In its ionic form (+3), gadolinium is known to be highly toxic to humans. Gadolinium is strongly paramagnetic, which makes it a valuable contrast agent for magnetic resonance imaging (MRI). The gadolinium is administered intravenously in a chelated form before MRI. In its chelated form, the ion is escorted out of the body through the kidneys before exposure to free Gd ion occurs. Or that was the theory.

Nephrogenic systemic fibrosis (NSF) is a rare, painful, incurable progressive connective tissue disease. NSF manifests with skin thickening, fibrosis, and tethering, which means that the skin cannot be pulled away from the body. Some patients may develop extracutaneous fibrosis of muscle, lymph nodes, pleura, and other internal organs. Elana J. Bernstein, Christian Schmidt-Lauber, and Jonathan Kay, “Nephrogenic systemic fibrosis: A systemic fibrosing disease resulting from gadolinium exposure,” 26 Best Practice & Research Clin. Rheum. 489, 489 (2012).

As a diagnostic entity, NSF is a relatively recent discovery. The first case was noted in 1997, in California. Within a few years, the differential diagnostic criteria to distinguish NSF from other fibrotic diseases were developed. Centers for Disease Control, “Fibrosing skin condition among patients with renal disease–United States and Europe, 1997–2002,” 51 MMWR Morbidity and Mortality Weekly Report 25 (2002). Physicians identified the condition among patients with renal insufficiency who had received MRI with a gadolinium-based contrast agent (GBCA). Given the rarity of both the exposure (GBCA and renal insufficiency) and the outcome (NSF), the relationship between NSF and the use of gadolinium-containing contrast agents for magnetic resonance imaging (MRI) was discovered largely from case reports. A case registry is maintained at Yale University, and has identified 380 cases to date. Shawn E. Cowper, “Nephrogenic Systemic Fibrosis” at the website for The International Center for Nephrogenic Systemic Fibrosis Research (ICNSFR) (last updated June 15, 2013).

The little epidemiology that exists on the subject generally has found that all “cases” had exposure to Gd[1]. Or almost all. There have been occasional cases found without reported exposure to GBCA. Indeed, one case of NSF without prior GBCA was reported last month in the dermatological literature. C. Ross, N. De Rosa, G. Marshman, D. Astill, “Nephrogenic systemic fibrosis in a gadolinium-naïve patient: Successful treatment with oral sirolimus,” Australas. J. Dermatol. (2014); doi: 10.1111/ajd.12176. [Epub ahead of print].

In litigation, the usual scenario is that plaintiffs and their counsel and expert witnesses want to offer case reports or case series as probative of a causal association between an exposure and a particular disease outcome. In the silicone gel breast implant litigation, women who characterized themselves as “victims” shouted outside courtrooms, “We are the evidence.”

When the outcome in question has a baseline rate, and the exposure is widespread, this strategy is usually illegitimate and most courts have limited or prohibited the obvious attempt to prejudice the jury by the use of evidence that has little or no probative value.

The causal connection between NSF and GBCA, described above, was postulated on the basis of case reports, but this is not really a rejection of the general rule about case reports. NSF is an extremely rare outcome, and GBCA administered to patients with serious kidney insufficiency is a fairly rare exposure. In addition, gadolinium ion has a known human toxicity, and the connection between renal insufficiency and Gd toxicity is rather straightforward. The insufficiency of the kidney function results in longer “in residence” times for the GBCA, with the consequence that the gadolinium disassociates from its chelating agent, and the free Gd ion does its damage. Furthermore, biopsies of affected tissues show an uptake of gadolinium in NSF patients.

* * * * * * * *

GE Healthcare manufactures Omniscan, a GBCA, for use as an MRI-contrast medium. Given the recently discovered dangers of GBCAs in vulnerable patients, Omniscan has been a magnet for lawsuits, with the peak intensity of the litigation field in the MDL courtroom of federal district Judge Dan Polster. Judge Polster tried the first Omniscan case, which resulted in a verdict for the plaintiff. GE appealed, complaining about several of Judge Polster’s rulings, including the uneven handling of case reports. Last month, the Sixth Circuit affirmed. Decker v. GE Healthcare Inc., ___ F.3d ___, 2014 FED App. 0258P, 2014 U.S. App. LEXIS 20049 (6th Cir. Oct. 20, 2014).

General causation between GBCAs and NSF was apparently not disputed in Decker. Although plaintiffs in the GBCA litigation established the causality of GBCA in producing NSF, by case reports, Judge Polster refused to permit GEHC’s expert witnesses to testify about their reliance upon case reports of gadolinium-naïve cases of NSF; that is, the court disallowed testimony about reported cases that occurred in the absence of GBCA exposure[2]. Id. at *9. Judge Polster found that the reported gadolinium-naïve case reports were “methodologically flawed” because they did not adequately show that the NSF patients in question lacked Gd exposure, with tissue biopsy or other means. Id. at *10. The district court speculated that there may have been Gd exposure from a non-MRI procedure, but never explained what non-MRI procedure would involve internal administration of GBCA. Nor did the district court address the temporal relationship between this undocumented, conjectured non-MRI gadolinium-based imaging procedure and the onset of the reported patient’s NSF.

Before trial, defendant GEHC moved for reconsideration of the district court’s previous decision on defensive use of gadolinium-naïve case reports, based upon a then recent publication of a “purported” case of gadolinium-naïve NSF. Id. at *8. A quick read of the late-breaking case study shows that it was more than a “purported” case. A.A. Lemy, et al., “Revisiting nephrogenic systemic fibrosis in 6 kidney transplant recipients: a single-center experience,” 63 J. Am. Acad. Dermatol. 389 (2010). The cited paper by Lemy had diagnosed NSF in a patient without GBCA exposure, and mass spectrometry testing of affected tissue revealed no Gd. The district court, however, dismissed the Lemy case as irrelevant unless GEHC’s expert witnesses could demonstrate that Lemy’s patient number 5 and the plaintiff were so clinically similar that “it was probable that Mr. Decker’s NSF was not caused by his 2005 Omniscan [exposure].”

The Sixth Circuit affirmed this “tails they win; heads you lose” approach to gatekeeping as all within the scope of the district court’s exercise of discretion. Lemy’s case number 5 and Mr. Decker both had NSF, and yet the courts do not describe clinical varieties among NSF, which vary based upon their relatedness to gadolinium exposure. It would seem that the courts were imposing an extremely heavy burden on the defense to show that the gadolinium-naïve cases were absolutely free of Gd exposure, and that they resembled the particular plaintiff’s NSF diagnosis in every respect. Without any evidence of diagnostic disease criteria sensitivity and specificity, and positive predictive value for the criteria, the district and the appellate courts seem to have accepted glib demands for absolute identity between the plaintiff’s NSF manifestation and any candidate Gd-free NSF case. Given that there is clinical heterogeneity among Gd-NSF cases, and that causality was basically inferred from cases and case series, the courts’ reasoning seems strained.

The appellate court also seemed blithely unaware of the fallacious circularity of permitting a diagnostic entity to be defined based upon exposure, thereby preventing any fair test of the hypothesis that all NSF cases are caused by gadolinium. This fallacy was advanced in the silicone gel breast implant litigation, where the litigation industry shrank from claims that silicone caused classic connective tissue diseases, in the face of exculpatory epidemiologic studies. The claimants retreated to a claim that silicone caused a “new” disease that was defined by mostly vague, self-reported symptoms [so very different from NSF in this respect], in conjunction with silicone exposure. The court-appointed expert witnesses, however, would have none of these shenanigans:

“The National Science Panel concluded that they do not yet support the inclusion of SSRD [systemic silicone-related disease] in the list of accepted diseases, for 4 reasons. First, the requirement of the inclusion of the putative cause (silicone exposure) as one of the criteria does not allow the criteria set to be tested objectively without knowledge of the presence of implants, thus incurring incorporation bias (27).”

Peter Tugwell, George Wells, Joan Peterson, Vivian Welch, Jacqueline Page, Carolyn Davison, Jessie McGowan, David Ramroth, and Beverley Shea, “Do Silicone Breast Implants Cause Rheumatologic Disorders? A Systematic Review for a Court-Appointed National Science Panel,” 44 Arthritis & Rheumatism 2477, 2479 (2001) (citing David Sackett, “Bias in analytic research,” 32 J. Chronic Dis. 51 (1979)).

Of course, NSF does not share the dubious provenance of SSRD, or SAD [silicone-associated disorder] as it was sometimes known. Still, the analytic studies that have shown that NSF cases all, or mostly, had GBCA exposure, explicitly refrained from defining the NSF case as including gadolinium exposure.

Decker is thus a curious case. The trial and appellate courts talked about preventing the defense expert witnesses from relying upon case reports that were “methodologically flawed,” but the courts never mentioned Federal Rule of Evidence 703, which should have been the basis for such selective pruning of the expert witnesses’ reliance materials. And then there is the matter that even if GEHC were correct about Gd-free NSF cases, the attributable risk for NSF to prior Gd exposure is almost certainly very high, and the debate over whether NSF is a “signature” disease was not likely going to affect the case outcome.

Decker can perhaps best be understood as a dispute about specific causation, with established general causation, in which the relative risk of NSF from GBCA exposure is extraordinarily high among patients with renal insufficiency. If there are other causes of NSF, they are considerably rarer than GBCA/renal insufficiency exposed cases. In the face of this very high attributable risk, GE’s expert witnesses’ discussions of an idiopathic or other cause were too speculative to pass muster under Rule 702.

[1] Elana J. Bernstein, Tamara Isakova, Mary E. Sullivan, Lori B. Chibnik, Myles Wolf & Jonathan Kay, “Nephrogenic systemic fibrosis is associated with hypophosphataemia: a case–control study,” 53 Rheumatology 1613 (2014); T.R. Elmholdt, M. Pedersen, B. Jørgensen, K. Søndergaard, J.D. Jensen, M. Ramsing, and A.B. Olesen, “Nephrogenic systemic fibrosis is found only among gadolinium-exposed patients with renal insufficiency: a case-control study from Denmark,” 165 Br. J. Dermatol. 828 (2011); P. Marckmann, “An epidemic outbreak of nephrogenic systemic fibrosis in a Danish hospital,” 66 Eur. J. Radiol. 187 (2008) (reporting all patients had gadodiamide-enhanced magnetic resonance imaging and severe renal insufficiency before onset of NSF); P. Marckmann, L. Skov, K. Rossen, J.G. Heaf, and H.S. Thomsen, “Case-control study of gadodiamide-related nephrogenic systemic fibrosis,” 22 Nephrol. Dialysis &Transplant. 3174 (2007) (all 19 cases in case-control study had prior exposure to gadolinium (Gd)-containing magnetic resonance imaging contrast agents); Centers for Disease Control, “Nephrogenic Fibrosing Dermopathy Associated with Exposure to Gadolinium-Containing Contrast Agents — St. Louis, Missouri, 2002–2006,” 56 MMWR Morbidity and Mortality Weekly Report (Feb. 23, 2007).

[2] T.A. Collidge, P.C. Thomson, P.B. Mark, et al., “Gadolinium-Enhanced MR Imaging and Nephrogenic Systemic Fibrosis: Retrospective Study of a Renal Replacement Therapy Cohort,” 245 Radiology 168-175 (2007); I.M. Wahba, E.L. Simpson, and K. White, “Gadolinium Is Not The Only Trigger For Nephrogenic Systemic Fibrosis: Insights From Two Cases And Review Of The Recent Literature,” 7 Am. J. Transplant. 1 (2007); A. Deng, D.B. Martin, et al., “Nephrogenic Systemic Fibrosis with a Spectrum of Clinical and Histopathological Presentation: A Disorder of Aberrant Dermal Remodeling,” 37 J. Cutan. Pathol. 204 (2009).

Posted in Causation, Rule 702, Rule 703

The Seventh Circuit Regresses on Rule 702

October 29th, 2013

Earlier this month, a panel of the United States Court of Appeals for the Seventh Circuit decided a relatively straightforward case by reversing the trial court’s exclusion of a forensic accountant’s damages calculation. Manpower, Inc. v. Insurance Company of the State of Pennsylvania, No. 12‐2688 (7th Cir. Oct. 16, 2013). In reversing, the appellate court disregarded a congressional statute, Supreme Court precedent, and Circuit decisional law.

The case involved a dispute over insurance coverage and an assessment of Manpower, Inc.’s economic losses that followed a building collapse. The trial court excluded Manpower’s accounting expert witness, Sullivan, who projected a growth rate (7.76%) for the plaintiff by comparing total revenues for a five-month period in 2006 to the same five months in the previous year. Id. at 8. The historical performance, however, included a negative annual growth rate of 4.79%, over the years 2003 to 2009. Over the five months immediately preceding Sullivan’s chosen period in 2006, the growth rate was merely 3.8%, less than half his projected growth rate. Id. Sullivan tried to justify his rather extreme selectivity in data reliance by adverting to information that he obtained from the company about its having initiated new policies and installed new managers by the end of 2005. Id.

The trial court held that Sullivan, who was not an expert on business management, had uncritically accepted the claimant’s proffered explanation for a very short-term swing in profitability and revenue. Id. at 9. While suggesting that Sullivan’s opinion was not “bulletproof,” the panel of the Seventh Circuit reversed. The panel, which should have been reviewing the district court for potential “abuse of discretion,” appears to have made its own independent determination that Sullivan’s opinion was “sufficiently reliable to present to a jury.” Id. at 17. In reversing, the panel explained that “the district court exercised its gatekeeping role under Daubert with too much vigor.” Id.

The panel attempted to justify its reversal by suggesting that a district court “usurps the role of the jury, and therefore abuses its discretion, if it unduly scrutinizes the quality of the expert’s data and conclusions rather than the reliability of the methodology the expert employed.” Id. at 18. The panel’s reversal illustrates several methodological and legal confusions that make this case noteworthy beyond its mundane subject matter.

Of course, the most striking error in the panel’s approach is citing to a Supreme Court case, Daubert, which has been effectively superseded by a Congressional statute, Federal Rule of Evidence 702, in 2000:

“A witness who is qualified as an expert … may testify in the form of an opinion or otherwise if:

(a) the expert’s scientific, technical, or other specialized knowledge will help the trier of fact to understand the evidence or to determine a fact in issue;

(b) the testimony is based on sufficient facts or data;

(c) the testimony is the product of reliable principles and methods; and

(d) the expert has reliably applied the principles and methods to the facts of the case.”

(Pub. L. 93–595, § 1, Jan. 2, 1975, 88 Stat. 1937; Apr. 17, 2000 (eff. Dec. 1, 2000); Apr. 26, 2011, eff. Dec. 1, 2011.) Ironically, the Supreme Court’s Daubert case itself, had the Manpower panel paid attention to it, reversed the Ninth Circuit for applying a standard, the so-called Frye test, which predated the adoption of the Federal Rules of Evidence in 1975. Rather than following the holding of the Daubert case, the panel got mired down in its dicta about a distinction between methodology and conclusion. The Supreme Court itself abandoned this distinction a few years later in General Electric Co. v. Joiner, when it noted that

“conclusions and methodology are not entirely distinct from one another.”

522 U.S. 136, 146 (1997).

The panel of the Seventh Circuit concluded, without much real analysis, that the district court had excluded Sullivan’s opinions on a basis that implicated his conclusion and data selection, not his methodology. Id. at 19-20. The problem, of course, is that how one selects data of past performance to project future performance is part and parcel of the methodology of making the economic projection. The supposed distinction advanced by the panel is illusory, and contrary to post-Daubert decisions, and the Congressional revision of the statute, which requires attention to whether “the testimony is based on sufficient facts or data; the testimony is the product of reliable principles and methods; and, the expert has reliably applied the principles and methods to the facts of the case.” Rule 702.
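The sensitivity of a projection to the choice of baseline period can be illustrated with a little arithmetic. The sketch below is only an illustration, not Sullivan’s actual calculation: it takes the three growth rates reported in the opinion, treats each as if it were an annual rate, and compounds it from a hypothetical revenue index of 100 over a hypothetical three-year horizon. The function name, the base, the horizon, and the annualizing assumption are all supplied for the example.

```python
# Growth rates as reported in the opinion; everything else is hypothetical.
RATES = {
    "selected five-month comparison": 0.0776,
    "preceding five months": 0.038,
    "annual rate, 2003 to 2009": -0.0479,
}


def project(base, rate, years):
    """Compound a base amount at a constant annual rate."""
    return base * (1 + rate) ** years


BASE = 100.0  # hypothetical revenue index, not a figure from the case
YEARS = 3     # hypothetical projection horizon

# Same projection method applied to each candidate baseline.
projections = {label: project(BASE, rate, YEARS) for label, rate in RATES.items()}

for label, value in projections.items():
    print(f"{label}: {value:.2f}")
```

The method and the arithmetic are identical in each run; only the data window differs, and the projected figures diverge accordingly.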

To make matters worse, the appellate court in Manpower proceeded to attempt to justify its reversal on grounds of “[t]he latitude we afford to statisticians employing regression analysis, a proven statistical methodology used in a wide variety of contexts.” Id. at 21. Here the appellate court suggests that if expert witnesses use a statistical test or analysis, such as regression analysis, it does not matter how badly they apply the test, or how worthless their included data are. Id. at 22. According to the Manpower panel:

“the Supreme Court and this Circuit have confirmed on a number of occasions that the selection of the variables to include in a regression analysis is normally a question that goes to the probative weight of the analysis rather than to its admissibility. See, e.g., Bazemore v. Friday, 478 U.S. 385, 400 (1986) (reversing lower court’s exclusion of regression analysis based on its view that the analysis did not include proper selection of variables); Cullen v. Indiana Univ. Bd. of Trustees, 338 F.3d 693, 701‐02 & n.4 (7th Cir. 2003) (citing Bazemore in rejecting challenge to expert based on omission of variables in regression analysis); In re High Fructose Corn Syrup Antitrust Litigation, 295 F.3d 651, 660‐61 (7th Cir. 2002) (detailing arguments of counsel about omission of variables and other flaws in application of the parties’ respective regression analyses and declining to exclude analyses on that basis); Adams v. Ameritech Servs., Inc., 231 F.3d 414, 423 (7th Cir. 2000) (citing Bazemore in affirming use of statistical analysis based solely on correlations—in other words, on a statistical comparison that employed no regression analysis of any independent variables at all). These precedents teach that arguments about how the selection of data inputs affect the merits of the conclusions produced by an accepted methodology should normally be left to the jury.”

Id. at 22.

Again, the Seventh Circuit’s approach in Manpower is misguided. Bazemore involved a multivariate regression analysis in the context of a discrimination case. Neither the Supreme Court nor the Fourth Circuit considered the regression at issue in Bazemore as evidence; rather the analysis was focused upon whether, within the framework of discrimination law, the plaintiffs’ regression satisfied their burden of establishing a prima facie case that shifted the burden to the defendant. No admissibility challenge was made to the regression in Bazemore under Rule 702. Of course, the Bazemore litigation predates the Supreme Court’s decision in Daubert by several years. Furthermore, even the Bazemore decision acknowledged that there may be

“some regressions so incomplete as to be inadmissible as irrelevant… .”

478 U.S. 385, 400 n.10 (1986).

The need for quantitative analysis of race and other suspect class discrimination under the equal protection clause no doubt led the Supreme Court, and subsequent lower courts, to avoid looking too closely at regression analyses. Some courts, such as the Manpower panel, view Bazemore as excluding regression analysis from gatekeeping of statistical evidence, which magically survives Daubert. The better reasoned cases, however, even within the Seventh Circuit, fully apply the principles of Rule 702 to statistical inference and analyses. See, e.g., ATA Airlines, Inc. v. Fed. Express Corp., 665 F.3d 882, 888–89 (7th Cir. 2011) (Posner, J.) (reversing on grounds that plaintiff’s regression analysis should never have been admitted), cert. denied, 2012 WL 189940 (Oct. 7, 2012); Zenith Elecs. Corp. v. WH-TV Broad. Corp., 395 F.3d 416 (7th Cir.) (affirming exclusion of expert witness opinion whose extrapolations were mere “ipse dixit”), cert. denied, 125 S. Ct. 2978 (2005); Sheehan v. Daily Racing Form, Inc., 104 F.3d 940 (7th Cir. 1997) (Posner, J.) (discussing specification error). See also Munoz v. Orr, 200 F.3d 291 (5th Cir. 2000). For a more enlightened and educated view of regression and the scope and application of Rule 702, from another Seventh Circuit panel, Judge Posner’s decision in ATA Airlines, supra, is an essential starting place. See “Judge Posner’s Digression on Regression” (April 6, 2012).

There is yet one more flaw in the Manpower decision and its rejection of the relevancy of data quality for judicial gatekeeping. Federal Rule of Evidence 703 specifically addresses the bases of an expert witness’s opinion testimony. The Rule, in relevant part, provides that:

“If experts in the particular field would reasonably rely on those kinds of facts or data in forming an opinion on the subject, they need not be admissible for the opinion to be admitted.”

Here the district court had acted prudently in excluding an expert witness who accepted the assertions of new management that it had, within a very short time span, turned a company from a money loser into a money earner. As any observer of the market knows, there are too many short-term “fixes,” such as cutting personnel, selling depreciated property, and the like, to accredit any such short-term data as “reasonably relied upon.” See In re Agent Orange Product Liability Lit., 611 F. Supp. 1223, 1246 (E.D.N.Y. 1985) (excluding opinions under Rule 703 of proffered expert witnesses who relied upon checklists of symptoms prepared by the litigants; “no reputable physician relies on hearsay checklists by litigants to reach a conclusion with respect to the cause of their affliction”), aff’d on other grounds, 818 F.2d 187 (2d Cir. 1987), cert. denied, 487 U.S. 1234 (1988).

Manpower represents yet another example of a Court of Appeals abrogating gatekeeping by reversing a district judge who attempted to apply the Rules and the relevant Supreme Court precedent. The panel in Manpower ignored Congressional statutory enactments and precedents of its own Circuit, and it relied upon cases superseded and overruled by later Supreme Court cases. That’s regression for you.

Posted in Rule 702, Rule 703, statistical evidence

Using the Rule 45 Subpoena to Obtain Research Data

July 24th, 2013

Back in June, Mr. William Ruskin posted a blog post, “When Should Data Underlying Scientific Studies Be Discoverable?” on his firm’s Toxic Tort Litigation Blog. Earlier this week, the Defense Research Institute’s blog, dri-today republished the post, and I posted a response, “Research Data from Published Papers Generally Should Be Available,” on the DRI blog.

Mr. Ruskin’s blog post calls attention to the important problem of access to research data in litigation and other contexts. The effort to obtain Dr. Racette’s underlying data is an interesting case study in these legal discovery battles. Ruskin notes that there is the potential for “injustice” from such discovery, but he fails to acknowledge that the National Research Council has been urging scientists for decades to have a plan for data sharing as part of their protocol, and that the National Institutes of Health now requires such planning. Some journals require a commitment to data sharing as a condition to publication. The Annals of Internal Medicine, which is probably the most rigorously edited internal medicine journal, requires authors to state to what extent they will share data when their articles appear in print. Ultimately, litigants are entitled to “everyman’s” and “every woman’s” evidence, regardless of whether they are scientists. If scientists complied with the best practices, guidances, and regulations on planning for data sharing, the receipt of a subpoena for underlying data would not be a particularly disruptive event in their laboratories.

In the case of Dr. Racette, it was clear that the time he needed to spend to respond to defense counsel’s subpoena was largely caused by his failure to comply with NIH guidelines on data sharing. Racette was represented by university counsel, who refused to negotiate over the subpoena, and raised frivolous objections. Ultimately, the costs of production were visited upon the defendants who paid what seemed like rather exorbitant amounts for Racette and his colleagues to redact individual identifier information. The MDL court suggested that Racette was operating independently of plaintiffs’ counsel, but the fact was that plaintiffs’ counsel recruited the study participants and brought them to the screenings, where Racette and colleagues videotaped them to make their assessments of Parkinsonism. Much more could be said but for a protective order that was put in place by the MDL court. What I can say is that after the defense obtained a good part of the underlying data, the Racette study was no longer actively used by plaintiffs’ counsel in the welding fume cases.

It is not only litigation that gives rise to needs for transparency and openness. Regulation and public policy disputes similarly create need for data access. As Mr. Ruskin acknowledges, the case of Weitz & Luxenberg v. Georgia-Pacific LLC, 2013 WL 2435565, 2013 NY Slip Op 04127 (June 6, 2013), is very different, but at bottom is the same secrecy and false sense of entitlement to privilege underlying data. The Appellate Division’s invocation of the crime-fraud exception seems to be hyperbolic precisely because no attorney-client privilege attached in the first place. Basic tenets of openness and transparency in science should have guided the Appellate Division.

The Georgia-Pacific effort was misguided on many levels, but we should at least rejoice that science won, and that G-P will be required to share underlying data with plaintiffs’ counsel or whoever wants access. Without reviewing the underlying data and documents, it is hard to say what the studies were designed to do, but saying that they were designed “to cast doubt,” as Mr. Ruskin does, is uncharitable to G-P. After all, G-P may well have found itself responding in court to some rather dodgy data, and thought it could sponsor stronger studies that were likely to refute the published papers. And the published papers may have been undertaken to “cast certainty,” or even a faux certainty, over issues that were not what they were portrayed to be in the papers.

Earlier this month, Judge Reggie Walton granted a litigant’s motion to compel production of underlying research data in the denture cream litigation. Plaintiffs’ counsel contracted with Dr. Salim Shah and his companies Sarfez Pharmaceuticals, Inc. and Sarfez USA, Inc. (“Sarfez”) to conduct human research in India, to support their claims that zinc in denture cream causes neurological damage. In re Denture Cream Prods. Liab. Litig., Misc. Action 13-384 (RBW), 2013 U.S. Dist. LEXIS 93456, *2 (D.D.C. July 3, 2013). When defense counsel learned of the Sarfez study, known as the Zinc/077/12 Study, and the plaintiffs’ counsel’s payments of over $300,000 to support the study, they sought discovery of raw data, study protocol, statistical analyses, and other materials from plaintiffs’ counsel. Plaintiffs’ counsel protested that they did not have all the materials, and directed defense counsel to Sarfez. Although other courts have made counsel produce similar materials from the scientists and independent contractors they engaged, in this case, defense counsel followed the trail of documents to the contractor, Sarfez. Id. at *3-4.

After the defense served a Rule 45 subpoena on Sarfez, things got interesting. Raising no objections, and asserting no privileges, Sarfez produced about 1,500 pages of responsive documents. Some of the documents were emails, but crucial attachments were missing, including protocols, analytical reports, and raw data. Id. at *12-13. When the defendant, Procter & Gamble Company (P&G), pressed, Sarfez resisted further production. P&G filed a motion to compel, and Sarfez objected on various grounds, including lack of relevancy.

The objections did not go very far. Plaintiffs’ counsel, who probably should have been tasked with producing the subpoenaed materials in the first instance, had already declared their intent to rely upon the study that they contracted for with Sarfez. Id. at *9. Judge Walton noted that relevancy did not require that the subpoenaed materials be admissible at trial, but only that they may be relevant to the claim or defense of a party. Id. at *6. Judge Walton also upheld the subpoena, which sought underlying data and non-privileged correspondence, as within the scope of Rules 26(b) and 45, and not unduly burdensome. Id. at *9-10, *20.

Sarfez attempted to suggest that the email attachments might not exist, but Judge Walton branded the suggestion “disingenuous.” Attachments to emails should be produced along with the emails. Id. at *12 (citing and collecting cases). Although Judge Walton did not grant a request for forensic recovery of hard-drive data or for sanctions, His Honor warned Sarfez that it might be required to bear the cost of forensic data recovery if it did not comply with the district court’s order. Id. at *15, *22.

The Denture Cream case is a helpful reminder that not only industrial defendants sponsor scientific studies in litigation contexts. Plaintiffs’ counsel, and sometimes their interest proxies (labor unions, support groups, advocacy groups, zealous scientists, and regulatory agencies), sponsor and conduct studies as well. Procter & Gamble should not have been put to the expense and trouble of a Rule 45 subpoena, but it is encouraging to see that Judge Walton cut through the evasions and disingenuous claims, and enforced the research subpoena in this case.

Posted in Rule 703, Underlying Data

Wells v. Ortho Pharmaceutical Corp. Reconsidered – Part 6

November 21st, 2012

In 1984, before Judge Shoob gave his verdict in the Wells case, another firm filed a birth defects case against Ortho for failure to warn in connection with its non-ionic surfactant spermicides, in the same federal district court, the Northern District of Georgia. The mother in Smith used Ortho’s product about the same time as the mother in Wells (in 1980). The case was assigned to Judge Shoob, who recused himself. Smith v. Ortho Pharmaceutical Corp., 770 F. Supp. 1561, 1562 n.1 (N.D. Ga. 1991) (no reasons for the recusal provided). The Smith case was reassigned to Judge Horace Ward, who entertained Ortho’s motion for summary judgment in July 1988. Two and one-half years later, Judge Ward granted summary judgment to Ortho on grounds that the plaintiffs’ expert witnesses’ testimony was not based upon the type of data reasonably relied upon by experts in the field, and was thus inadmissible under Federal Rule of Evidence 703. 770 F. Supp. at 1581.

A prevalent interpretation of the split between Wells and Smith is that the scientific evidence developed with new studies, and that the scientific community’s views matured in the five years between the two district court opinions. The discussion in Modern Scientific Evidence is typical:

“As epidemiological evidence develops over time, courts may change their view as to whether testimony based on other evidence is admissible. In this regard it is worth comparing Wells v. Ortho Pharmaceutical Corp., 788 F.2d 741 (11th Cir. 1986), with Smith v. Ortho Pharmaceutical Corp., 770 F. Supp. 1561 (N.D. Ga. 1991). Both involve allegations that the use of spermicide caused a birth defect. At the time of the Wells case there was limited epidemiological evidence and this type of claim was relatively novel. In a bench trial the court found for the plaintiff. *** The Smith court, writing five years later, noted that, ‘The issue of causation with respect to spermicide and birth defects has been extensively researched since the Wells decision.’ Smith v. Ortho Pharmaceutical Corp., 770 F. Supp. 1561, 1563 (N.D. Ga. 1991).”

1 David L. Faigman, Michael J. Saks, Joseph Sanders, and Edward K. Cheng, Modern Scientific Evidence: The Law and Science of Expert Testimony, “Chapter 23 – Epidemiology,” § 23:4, at 213 n.12 (West 2011) (internal citations omitted).

Although Judge Ward was being charitable to his judicial colleague, this attempt to reconcile Wells and Smith does a disservice to Judge Ward’s hard work in Smith, and Judge Shoob’s errors in Wells.

Even a casual reading of Smith and Wells reveals that the injuries were completely different. Plaintiff Crystal Smith was born with a chromosomal defect known as Trisomy-18; Plaintiff Katie Wells was born with limb reduction deficits. Some studies relevant to one injury had no information about the other. Other studies, which addressed both injuries, yielded different results for the different injuries. Although some additional studies were available to Judge Ward in 1988, this difference is hardly the compelling difference between the two cases.

Perhaps the most important difference between the cases is that in Smith, the biological plausibility that spermicides could cause a Trisomy-18 was completely absent. The chromosomal defect arises from a meiotic nondisjunction, an error in meiosis that is part of the process in which germ cells are formed. Simply put, spermicides arrive on the scene too late to cause a Trisomy-18. Notwithstanding the profound differences between the injuries involved in Wells and Smith, the Smith plaintiffs sought the application of collateral estoppel. Judge Ward denied this motion, on the basis of the factual differences in the cases, as well as the availability of new evidence. 770 F.Supp. at 1562.

The difference in injuries, however, was not the only important difference between these two cases. Wells was actually tried, apparently without any challenge under Frye, or Rules 702 or 703, to the admissibility of expert witness testimony. There is little to no discussion of scientific validity of studies, or analysis of the requisites for evaluating associations for causality. It is difficult to escape the conclusion that Judge Shoob decided the Wells case on the basis of superficial appearances, and that he frequently ignored validity concerns in drawing invidious distinctions between plaintiffs’ and defendant’s expert witnesses and their “credibility.” Smith, on the other hand, was never tried. Judge Ward entertained and granted dispositive motions for summary judgment, on grounds that the plaintiffs’ expert witnesses’ testimony was inadmissible. Legally, the cases are light years apart.

In Smith, Judge Ward evaluated the same FDA reports and decisions seen by Judge Shoob. Judge Ward did not, however, dismiss these agency materials simply because one or two of dozens of independent scientists involved had some fleeting connection with industry. 770 F.Supp. at 1563-64.

Judge Ward engaged with the structure and bases of the expert witnesses’ opinions, under Rules 702 and 703. The Smith case thus turned on whether expert witness opinions were admissible, an issue not considered or discussed in Wells. As was often the case before the Supreme Court decided Daubert in 1993, Judge Ward paid little attention to Rule 702’s requirement of helpfulness or knowledge. The court’s 702 analysis was limited to qualifications. Id. at 1566-67. The qualifications of the plaintiffs’ witnesses were rather marginal. They relied upon genetic and epidemiologic studies, but they had little training or experience in these disciplines. Finding the plaintiffs’ expert witnesses to meet the low threshold for qualification to offer an opinion in court, Judge Ward focused on Rule 703’s requirement that expert witnesses reasonably rely upon facts and data that are not otherwise admissible.

The trial court in Smith struggled with how it should analyze the underpinnings of plaintiffs’ witnesses’ proffered testimony. The court acknowledged that conflicts between expert witnesses typically raise questions of weight, not admissibility. Id. at 1569. Ortho had, however, challenged plaintiffs’ witnesses for having given opinions that lacked a “sound underlying methodology.” Id. The trial court found at least one Fifth Circuit case that suggested that Rule 703 requires trial courts to evaluate the reliability of expert witnesses’ sources. Id. (citing Soden v. Freightliner Corp., 714 F.2d 498, 505 (5th Cir. 1983)). Elsewhere, the trial court also found precedent from Judge Weinstein’s opinion in Agent Orange, as well as Court of Appeals decisions involving Bendectin, all of which turned to Rule 703 as the legal basis for reviewing, and in some cases limiting or excluding expert witness opinion testimony. Id.

The defendant’s argument under Rule 703 was strained; Ortho argued that the plaintiffs’

“experts’ selection and use of the epidemiological data is faulty and thus provides an insufficient basis upon which experts in the field of diagnosing the source of birth defects normally form their opinions. The defendant also contends that the plaintiffs’ experts’ data on genetics is not of the kind reasonably relied upon by experts in field of determining causation of birth defects.”

Id. at 1572. Nothing in Rule 703 addresses the completeness or thoroughness of expert witnesses in their consideration of facts and data; nor does Rule 703 address the sufficiency of data or the validity vel non of inferences drawn from facts and data considered. Nonetheless, the trial court in Smith took Rule 703 as its legal basis for exploring the epistemic warrant for plaintiffs’ witnesses’ causation opinions.

Although plaintiffs’ expert witnesses stated that they had relied upon epidemiologic studies and method, the trial court in Smith went beyond their asseverations. The Smith trial court explored the credibility of these witnesses at a whole other level. The court reviewed and discussed the basic structure of epidemiologic studies, and noted that the objective of such studies is to provide a statistical analysis:

“The objective of both case-control and cohort studies is to determine whether the difference observed in the two groups, if any, is ‘statistically significant’, (that is whether the difference found in the particular study did not occur by chance alone).⁴⁰ However, statistical methods alone, or the finding of a statistically significant association in one study, do not establish a causal relationship.⁴¹ As one authority states:

‘Statistical methods alone cannot establish proof of a causal relationship in an association’.⁴²

As a result, once a statistical association is found in an epidemiological study, that data must then be evaluated in a systematic manner to determine causation. If such an association is present, then the researcher looks for ‘bias’ in the study. Bias refers to the existence of factors in the design of a study or in the manner in which the study was carried out which might distort the result.⁴³

If a statistically significant association is found and there is no apparent ‘bias’, an inference is created that there may be a cause-and-effect relationship between the agent and the medical effect. To confirm or rebut that inference, an epidemiologist must apply five criteria in making judgments as to whether the associations found reflect a cause-and-effect relationship.⁴⁴ The five criteria are:

1. The consistency of the association;

2. The strength of the association;

3. The specificity of the association;

4. The temporal relationship of the association; and,

5. The coherence of the association.

Assuming there is some statistical association, it is these five criteria that provide the generally accepted method of establishing causation between drugs or chemicals and birth defects.⁴⁵ ”

The Smith court acknowledged that there were differences of opinion in weighting these five factors, but that some of them were very important to drawing a reliable inference of causality. Id. at 1575.
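The first step the court describes, asking whether an association in a case-control study is statistically significant, can be sketched numerically. What follows is a minimal illustration under stated assumptions, not anything drawn from the Smith record: the counts are hypothetical, the function name is invented for the example, and the interval uses the standard log-based (Woolf) approximation.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and approximate 95% interval for a 2x2 case-control table.

    a: exposed cases      b: unexposed cases
    c: exposed controls   d: unexposed controls
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of the log odds ratio (Woolf approximation).
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, not taken from any spermicide study.
or_, lower, upper = odds_ratio_ci(a=20, b=80, c=10, d=90)

# Conventionally "significant" at the 5% level only if the interval excludes 1.0.
significant = not (lower <= 1.0 <= upper)

print(f"OR = {or_:.2f}, 95% CI = ({lower:.2f}, {upper:.2f}), significant: {significant}")
```

Even when such an interval excludes 1.0, the calculation answers only the chance question; bias and the five criteria listed in the quoted passage still have to be addressed separately.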

A major paradigm shift thus separates Wells and Smith. The trial court in Wells contented itself with superficial and subjective indicia of witnesses’ personal credibility; the trial court in Smith delved into the methodology of drawing an appropriate scientific conclusion about causation. Telling was the Smith court’s citation to Moultrie v. Martin, 690 F.2d 1078, 1082 (4th Cir. 1982) (“In borrowing from another discipline, a litigant cannot be selective in which principles are applied.”). 770 F.Supp. at 1575 & n.45. Gone is the Wells retreat from engagement with science, and the dodge that the court must make a legal, not a scientific decision.

Applying the relevant principles, the Smith court found that the plaintiffs’ expert witnesses had deviated from the scientific standards of reasoning and analysis:

“It is apparent to the court that the testimony of Doctors Bussey and Holbrook is insufficiently grounded in any reliable evidence. * * * The conclusions Doctors Bussey and Holbrook reach are also insufficient as a basis for a finding of causality because they fail to consider critical information, such as the most relevant epidemiologic studies and the other possible causes of disease.⁸¹

The court finds that the opinions of plaintiffs’ experts are not based upon the type of data reasonably relied upon by experts in determining the cause of birth defects. Experts in determining birth defects rely upon a consensus in genetic or epidemiological investigations or specific generally accepted studies in these fields. While a consensus in genetics or epidemiology is not a prerequisite to a finding of causation in any and all birth defect cases, Rule 703 requires some reliable evidence for the basis of an expert’s opinion.

Experts in determining birth defects also utilize methodologies and protocols not followed by plaintiffs’ experts. Without a well-founded methodology, opinions which run contrary to the consensus of the scientific community and are not supported by any reliable data are necessarily speculative and lacking in the type of foundation necessary to be admissible.

For the foregoing reasons, the court finds that plaintiffs have failed to produce admissible evidence sufficient to show that defendant’s product caused Crystal’s birth defects.”

Id. at 1581. Rule 703 was forced into service to filter out methodologically specious opinions.

Not all was smooth sailing for Judge Ward. Like Judge Shoob, Judge Ward seemed to think that a physical examination of the plaintiff provided helpful, relevant evidence, but he never articulated what the basis for this opinion was. (His Honor did note that the parties agreed that the physical examination offered no probative evidence about causation. Id. at 1572 n.32.) No harm came of this opinion. Judge Ward wrestled with the lack of peer review in some unpublished studies, and the existence of a study only in abstract form. See, e.g., id. at 1579 (“a scientific study not subject to peer review has little probative value”); id. at 1578 (insightfully noting that an abstract had insufficient data to permit a reader to evaluate its conclusions). The Smith court recognized the importance of statistical analysis, but it confused Bayesian posterior probabilities with significance probabilities:

“Because epidemiology involves evidence on causation derived from group based information, rather than specific conclusions regarding causation in an individual case, epidemiology will not conclusively prove or disprove that an agent or chemical causes a particular birth defect. Instead, its probative value lies in the statistical likelihood of a specific agent causing a specific defect. If the statistical likelihood is negligible, it establishes a reasonable degree of medical certainty that there is no cause-and-effect relationship absent some other evidence.”

The confusion here is hardly unique, but ultimately it did not prevent Judge Ward from reaching a sound result in Smith.
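The difference between the two probabilities can be made concrete with a minimal sketch. The numbers below are invented for illustration, and the function name is hypothetical. The sketch also simplifies by treating "the study reached statistical significance" as the only datum. It applies Bayes' theorem to show that the probability of no effect, given a significant result, depends on the prior probability and the study's power, not on the significance level alone.

```python
def posterior_null_given_significant(prior_null, alpha, power):
    """Bayes' theorem: P(no effect | study is statistically significant).

    prior_null: assumed prior probability that there is no effect
    alpha:      P(significant result | no effect), the significance level
    power:      P(significant result | real effect)
    """
    p_significant = alpha * prior_null + power * (1.0 - prior_null)
    return alpha * prior_null / p_significant


if __name__ == "__main__":
    alpha, power = 0.05, 0.80  # hypothetical design values
    for prior_null in (0.50, 0.90, 0.99):
        post = posterior_null_given_significant(prior_null, alpha, power)
        print(f"prior P(no effect)={prior_null:.2f} -> "
              f"posterior P(no effect | significant)={post:.3f}")
```

The significance level is held at 0.05 in every row, yet the posterior probability of no effect moves with the assumed prior. A significance probability therefore cannot, by itself, be read as the "statistical likelihood" that an agent does or does not cause a defect.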

What intervened between Wells and Smith was not any major change in the scientific evidence on spermicides and birth defects; the sea change came in the form of judicial attitudes toward the judge’s role in evaluating expert witness opinion testimony. In 1986, for instance, after the Court of Appeals affirmed the judgment in Wells, Judge Higginbotham, speaking for a panel of the Fifth Circuit, declared:

“Our message to our able trial colleagues: it is time to take hold of expert testimony in federal trials.”

In re Air Crash Disaster at New Orleans, 795 F.2d 1230, 1234 (5th Cir. 1986). By the time the motion for summary judgment in Smith was decided, that time had come.


Haack’s Holism vs. Too Much of Nothing

May 24th, 2012

Professor Haack has been an unflagging critic of Daubert and its progeny. Haack’s major criticism of the Daubert and Joiner cases is based upon the notion that the Supreme Court engaged in a “divide and conquer” strategy in its evaluation of plaintiffs’ evidence, when it should have considered the “whole gemish” (my phrase, not Haack’s). See Susan Haack, “Warrant, Causation, and the Atomism of Evidence Law,” 5 Episteme 253, 261 (2008) [hereafter “Warrant”]; “Proving Causation: The Holism of Warrant and the Atomism of Daubert,” 4 J. Health & Biomedical Law 273, 304 (2008) [hereafter “Proving Causation”].

ATOMISM vs. HOLISM

Haack’s concern is that combined pieces of evidence, none individually sufficient to warrant an opinion of causation, may provide the warrant when considered jointly. Haack reads Daubert to require courts to screen each piece of evidence relied upon by an expert witness for reliability, a process that can interfere with discerning the conclusion most warranted by the totality or “the mosaic” of the evidence:

“The epistemological analysis offered in this paper reveals that a combination of pieces of evidence, none of them sufficient by itself to warrant a causal conclusion to the legally required degree of proof, may do so jointly. The legal analysis offered here, interlocking with this, reveals that Daubert’s requirement that courts screen each item of scientific expert testimony for reliability can actually impede the process of arriving at the conclusion most warranted by the evidence proffered.”

Warrant at 253.

But there is nothing in Daubert, or its progeny, to support this crude characterization of the judicial gatekeeping function. Indeed, there is another federal rule of evidence, Rule 703, which is directed at screening the reasonableness of reliance upon a single piece of evidence.

Surely there are times when the single, relied upon study is one that an expert in the relevant field should not and would not rely upon because of invalidity of the data, the conduct of the study, or the study’s analysis of the data. Indeed, there may well be times, especially in litigation contexts, when an expert witness has relied upon a collection of studies, none of which is reasonably relied upon by experts in the discipline.

Rule 702, which Daubert was interpreting, was, and is, focused on an expert witness’s opinion:

A witness who is qualified as an expert by knowledge, skill, experience, training, or education may testify in the form of an opinion or otherwise if:

(a) the expert’s scientific, technical, or other specialized knowledge will help the trier of fact to understand the evidence or to determine a fact in issue;

(b) the testimony is based on sufficient facts or data;

(c) the testimony is the product of reliable principles and methods; and

(d) the expert has reliably applied the principles and methods to the facts of the case.

To be sure, Chief Justice Rehnquist, in explicating why plaintiffs’ expert witnesses’ opinions must be excluded in Joiner, noted the wild, irresponsible, unwarranted inferential leaps made in interpreting specific pieces of evidence. The plaintiffs’ expert witnesses’ interpretation of a study, involving massive injections of PCBs into the peritoneum of baby mice, with consequent alveologenic adenomas, provided an amusing example of how they, the putative experts, had outrun their scientific headlights by over-interpreting a study in a different species, at different stages of maturation, with different routes of exposure, with different, non-cancerous outcomes. These examples were effectively aimed at showing that the overall opinions advanced by Rabbi Teitelbaum and others, on behalf of plaintiffs in Joiner, were unreliable. Haack, however, sees a philosophical kinship with Justice Stevens, who, in dissent, argued to give plaintiffs’ expert witnesses a “pass,” based upon the whole evidentiary display. General Electric Co. v. Joiner, 522 U.S. 136, 153 (1997) (Justice Stevens, dissenting) (“It is not intrinsically ‘unscientific’ for experienced professionals to arrive at a conclusion by weighing all available scientific evidence.”). The problem, of course, is that sometimes “all available evidence” includes a good deal of junk, irrelevant, or invalid studies. Sometimes “all available evidence” is just too much of nothing.

Perhaps Professor Haack was hurt that she was not cited by Justice Blackmun in Daubert, along with Popper and Hempel. Haack has written widely on philosophy of science, and on epistemology, and she clearly believes her theory of knowledge would provide a better guide to the difficult task of screening expert witness opinions.

When Professor Haack describes the “degree to which evidence warrants a conclusion,” she identifies three factors, which, in part, require assessment of the strength of individual studies:

(i) how strong the connection is between the evidence and the conclusion (supportiveness);

(ii) how solid each of the elements of the evidence is, independent of the conclusion (independent security); and

(iii) how much of the relevant evidence the evidence includes (comprehensiveness).

Warrant at 258.

Of course, supportiveness includes interconnectedness, but nothing in her theory of “warrant” excuses or omits rigorous examination of individual pieces of evidence in assessing a causal claim.

DONE WRONG

Haack seems enamored of the holistic approach taken by Dr. Done, plaintiffs’ expert witness in the Bendectin litigation. Done tried to justify his causal opinions based upon the entire “mosaic” of evidence. See, e.g., Oxendine v. Merrell Dow Pharms., Inc., 506 A.2d 1100, 1108 (D.C. 1986) (“[Dr. Done] conceded his inability to conclude that Bendectin is a teratogen based on any of the individual studies which he discussed, but he also made quite clear that all these studies must be viewed together, and that, so viewed, they supplied his conclusion”).

Haack tilts at windmills by trying to argue the plausibility of Dr. Done’s mosaic in some of the Bendectin cases. She rightly points out that Done challenged the internal and external validity of the defendant’s studies. Such challenges to the validity of either side’s studies are a legitimate part of scientific discourse, and certainly a part of legal argumentation, but attacks on validity of null studies are not affirmative evidence of an association. Haack correctly notes that “absence of evidence that p is just that — an absence of evidence; it is not evidence that not-p.” Proving Causation at 300. But the same point holds with respect to Done’s challenges to Merrell Dow’s studies. If those studies are invalid, and Merrell Dow lacks evidence that “not-p,” this lack is not evidence for Done in favor of p.

Given the lack of supporting epidemiologic data in many studies, and the weak and invalid data relied upon, Done’s causal claims were suspect and have come to be discredited. Professor Ronald Allen notes that invoking the Bendectin litigation in defense of a “mosaic theory” of evidentiary admissibility is a rather peculiar move for epistemology:

“[T]here were many such hints of risk at the time of litigation, but it is now generally accepted that those slight hints were statistical aberrations or the results of poorly conducted studies.⁷⁶ Bendectin is still prescribed in many places in the world, including Europe, is endorsed by the World Health Organization as safe, and has been vindicated by meta-analyses and the support of a number of epidemiological studies.⁷⁷ Given the weight of evidence in favor of Bendectin’s safety, it seems peculiar to argue for mosaic evidence from a case in which it would have plainly been misleading.”

Ronald J. Allen & Esfand Nafisi, “Daubert and its Discontents,” 76 Brooklyn L. Rev. 131, 148 (2010).

Screening each item of “expert evidence” for reliability may deprive the judge of “the mosaic,” but that is not all that the judicial gatekeepers were doing in Bendectin or other Rule 702 cases. It is all well and good to speak metaphorically about mosaics, but the metaphor and its limits were long ago acknowledged in the philosophy of science. The suggestion that scraps of evidence from different kinds of scientific studies can establish scientific knowledge was rejected by the great mathematician, physicist, and philosopher of science, Henri Poincaré:

“[O]n fait la science avec des faits comme une maison avec des pierres; mais une accumulation de faits n’est pas plus une science qu’un tas de pierres n’est une maison.”

Jules Henri Poincaré, La Science et l’Hypothèse (1905) (chapter 9, Les Hypothèses en Physique)( “Science is built up with facts, as a house is with stones. But a collection of facts is no more a science than a heap of stones is a house.”). Poincaré’s metaphor is more powerful than Haack’s and Done’s “mosaic” because it acknowledges that interlocking pieces of evidence may cohere as a building, or they may be no more than a pile of rubble. Poorly constructed walls may soon revert to the pile of stones from which they came. Much more is required than simply invoking the “mosaic” theory to bless this mess as a “warranted” claim to knowledge.

Haack’s point about aggregation of evidence is, at one level, unexceptionable. Surely, the individual pieces of evidence, each inconclusive alone, may be powerful when combined. An easy example is a series of studies, each with a non-statistically significant result of finding more disease than expected. None of the studies alone can rule out chance as an explanation, and the defense might be tempted to argue that it is inappropriate to rely upon any of the studies because none is statistically significant.

The defense argument may be wrong in cases in which a valid meta-analysis can be deployed to combine the results into a summary estimate of association. If a meta-analysis is appropriate, the studies collectively may allow the exclusion of chance as an explanation for the disparity from expected rates of disease in the observed populations. [Haack misinterprets study “effect size” to be relevant to ruling out chance as explanation for the increased rate of the outcome of interest. Proving Causation at 297.]

The availability of meta-analysis, in some cases, does not mean that hand waving about the “combined evidence” or “mosaics” automatically supports admissibility of the causal opinion. The gatekeeper would still have to contend with the criteria of validity for meta-analysis, as well as with bias and confounding in the underlying studies.
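The arithmetic behind the point about individually inconclusive studies can be shown with a minimal sketch of a fixed-effect, inverse-variance summary estimate. The five studies and their confidence intervals are invented for illustration and come from no litigation. The fixed-effect model, the function names, and the premise that the studies are valid and homogeneous enough to pool are all assumptions of the sketch.

```python
import math

# Hypothetical studies: (relative risk, lower 95% bound, upper 95% bound).
# These numbers are invented for illustration only.
STUDIES = [
    (1.40, 0.90, 2.18),
    (1.30, 0.85, 1.99),
    (1.50, 0.88, 2.56),
    (1.35, 0.92, 1.98),
    (1.25, 0.80, 1.95),
]

Z95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def log_rr_and_se(rr, lower, upper):
    """Recover the log relative risk and its standard error from a 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2 * Z95)
    return math.log(rr), se


def fixed_effect_pool(studies):
    """Inverse-variance (fixed-effect) summary: returns (RR, lower, upper)."""
    total_weight, weighted_sum = 0.0, 0.0
    for rr, lower, upper in studies:
        log_rr, se = log_rr_and_se(rr, lower, upper)
        weight = 1.0 / se ** 2  # more precise studies count for more
        total_weight += weight
        weighted_sum += weight * log_rr
    pooled = weighted_sum / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return (
        math.exp(pooled),
        math.exp(pooled - Z95 * pooled_se),
        math.exp(pooled + Z95 * pooled_se),
    )


def excludes_null(lower, upper):
    """True when the 95% interval does not contain a relative risk of 1.0."""
    return lower > 1.0 or upper < 1.0


if __name__ == "__main__":
    for rr, lower, upper in STUDIES:
        print(f"study RR={rr:.2f} ({lower:.2f}-{upper:.2f}) "
              f"excludes 1.0: {excludes_null(lower, upper)}")
    rr, lower, upper = fixed_effect_pool(STUDIES)
    print(f"pooled RR={rr:.2f} ({lower:.2f}-{upper:.2f}) "
          f"excludes 1.0: {excludes_null(lower, upper)}")
```

With these made-up inputs, no single study’s interval excludes 1.0, but the pooled interval does. The sketch addresses only chance. It says nothing about bias, confounding, or whether the studies should have been combined in the first place, which remain the gatekeeper’s questions.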

NECESSITY OF JUDGMENT

Of course, unlike the meta-analysis example, most instances of evaluating an entire evidentiary display are not quantitative exercises. Haack is troubled by the tension between the qualitative, continuous nature of reliability and the “in or out” aspect of ruling on the admissibility of expert witness opinion. Warrant at 262. The continuous nature of a reliability spectrum, however, does not preclude the practical need for a decision. We distinguish young from old people, although we age imperceptibly by units of time that are continuous and capable of being specified with increasingly small magnitudes. Differences of opinions or close cases are likely, but decisions are made in scientific contexts all the time.

FAGGOT FALLACY

Although Haack criticizes defendants for beguiling courts with the claimed “faggot fallacy,” she occasionally acknowledges that there simply is not sufficient valid evidence to support a conclusion. Indeed, she makes the case for why, in legal contexts, we will frequently be dealing with “unwarranted” claims:

“Against this background, it isn’t hard to see why the legal system has had difficulties in handling scientific testimony. It often calls on the weaker areas of science and/or on weak or marginal scientists in an area; moreover, its adversarial character may mean that even solid scientific information gets distorted; it may suppress or sequester relevant data; it may demand scientific answers when none are yet well-warranted; it may fumble in applying general scientific findings to specific cases; and it may fail to adapt appropriately as a relevant scientific field progresses.”

Susan Haack, “Of Truth, in Science and in Law,” 73 Brooklyn L. Rev. 985, 1000 (2008). It is difficult to imagine a more vigorous call for, and defense of, judicial gatekeeping of expert witness opinion testimony.

Haack seems to object to the scope and intensity of federal judicial gatekeeping, but her characterization of the legal context should awaken her to the need to resist admitting opinions on scientific issues when “none are yet well-warranted.” Id. at 1004 (noting that “the legal system quite often want[s] scientific answers when no warranted answers are available”). The legal system, however, does not “want” unwarranted “scientific” answers; only an interested party on one side or the other wants such a thing. The legal system wants a procedure for ensuring rejection of unwarranted claims, which may be passed off as properly warranted, due to the lack of sophistication of the intended audience.

TOO MUCH OF NOTHING

Despite her flirtation with Dr. Done’s holistic medicine, Haack acknowledges that sometimes a study or an entire line of studies is simply not valid, and they should not be part of the “gemish.” For instance, in the context of meta-analysis, which requires pre-specified inclusionary and exclusionary criteria for studies, Haack acknowledges that a “well-designed and well-conducted meta-analysis” will include a determination “which studies are good enough to be included … and which are best disregarded.” Proving Causation at 286. Exactly correct. Sometimes we simply must drill down to the individual study, and what we find may require us to exclude it from the meta-analysis. The same could be said of any study that is excluded by appropriate exclusionary criteria.

Elsewhere, Haack acknowledges myriad considerations of validity or invalidity, which must be weighed as part of the gemish:

“The effects of S on animals may be different from its effects on humans. The effects of b when combined with a and c may be different from its effects alone, or when combined with x and/or y.⁵² Even an epidemiological study showing a strong association between exposure to S and elevated risk of D would be insufficient by itself: it might be poorly-designed and/or poorly-executed, for example (moreover, what constitutes a well-designed study – e.g., what controls are needed – itself depends on further information about the kinds of factor that might be relevant). And even an excellent epidemiological study may pick up, not a causal connection between S and D, but an underlying cause both of exposure to S and of D; or possibly reflect the fact that people in the very early stages of D develop a craving for S. Nor is evidence that the incidence of D fell after S was withdrawn sufficient by itself to establish causation – perhaps vigilance in reporting D was relaxed after S was withdrawn, or perhaps exposure to x, y, z was also reduced, and one or all of these cause D, etc.⁵³”

Proving Causation at 288. These are precisely the sorts of reasons that make gatekeeping of expert witness opinions an important part of the judicial process in litigation.

RATS TO YOU

Similarly, Haack acknowledges that animal studies may be quite irrelevant to the issue at hand:

“The elements of E will also interlock more tightly the more physiologically similar the animals used in any animal studies are to human beings. The results of tests on hummingbirds or frogs would barely engage at all with epidemiological evidence of risk to humans, while the results of tests on mice, rats, guinea-pigs, or rabbits would interlock more tightly with such evidence, and the results of tests on primates more tightly yet. Of course, “similar” has to be understood as elliptical for “similar in the relevant respects;” and which respects are relevant may depend on, among other things, the mode of exposure: if humans are exposed to S by inhalation, for example, it matters whether the laboratory animals used have a similar rate of respiration. (Sometimes animal studies may themselves reveal relevant differences; for example, the rats on which Thalidomide was tested were immune to the sedative effect it had on humans; which should have raised suspicions that rats were a poor choice of experimental animal for this drug.)⁵⁵ Again, the results of animal tests will interlock more tightly with evidence of risk to humans the more similar the dose of S involved. (One weakness of Joiner’s expert testimony was that the animal studies relied on involved injecting massive doses of PCBs into a baby mouse’s peritoneum, whereas Mr. Joiner had been exposed to much smaller doses when the contaminated insulating oil splashed onto his skin and into his eyes.)⁵⁶ The timing of the exposure may also matter, e.g., when the claim at issue is that a pregnant woman’s being exposed to S causes this or that specific type of damage to the fetus.”

Proving Causation at 290.

WEIGHT OF THE EVIDENCE (WOE)

Just as she criticizes General Electric for advancing the “faggot fallacy” in Joiner, Haack criticizes the plaintiffs’ appeal to “weight of evidence methodology,” as misleadingly suggesting “that there is anything like an algorithm or protocol, some effective, mechanical procedure for calculating the combined worth of evidence.” Proving Causation at 293.

INFERENCE TO BEST EXPLANATION

Professor Haack cautiously evaluates the glib invocation of “inference to the best explanation” as a substitute for actual warrant of a claim to knowledge. Haack acknowledges the obvious: the legal system is often confronted with claims lacking sufficient warrant. She appropriately refuses to permit such claims to be dressed up as scientific conclusions by invoking their plausibility:

“Can we infer from the fact that the causes of D are as yet unknown, and that a plaintiff developed D after being exposed to S, that it was this exposure that caused Ms. X’s or Mr. Y’s D?¹⁰² No. Such evidence would certainly give us reason to look into the possibility that S is the, or a, cause of D. But loose talk of ‘inference to the best explanation’ disguises the fact that what presently seems like the most plausible explanation may not really be so – indeed, may not really be an explanation at all. We may not know all the potential causes of D, or even which other candidate-explanations we would be wise to investigate.”

Proving Causation at 305. See also Warrant at 261 (invoking the epistemic category of Rumsfeld’s “known unknowns” and “unknown unknowns” to describe a recurring situation in law’s treatment of scientific claims) (U.S. Sec’y of Defense Donald Rumsfeld: “[T]here are known knowns; there are things we know we know. We also know there are known unknowns; that is to say we know there are some things we do not know. But there are also unknown unknowns – there are things we do not know we don’t know.” (Feb. 12, 2002)).

It is easy to see why the folks at SKAPP are so fond of Professor Haack’s writings, and why they have invited her to their conferences and meetings. She has written close to a dozen articles critical of Daubert, each repeating the same mistaken criticisms of the gatekeeping process. She has provided SKAPP and its plaintiffs’ lawyer sponsors with sound bites to throw at impressionable judges about the epistemological weakness of Daubert and its progeny. In advancing this critique and SKAPP’s propaganda purposes, Professor Haack has misunderstood the gatekeeping enterprise. She has, however, correctly identified the gatekeeping process as an exercise in determining whether an opinion possesses sufficient epistemic warrant. Despite her enthusiasm for the dubious claims of Dr. Done, Haack acknowledges that “warrant” requires close attention to the internal and external validity of studies, and to rigorous analysis of a body of evidence. Haack’s own epistemic analysis would be hugely improved and advanced by focusing on how the mosaic theory, or WOE, failed to hold up in some of the more egregious, pathological claims of health “effects” — Bendectin, silicone, electro-magnetic frequency, asbestos and colorectal cancer, etc.


Giving Rule 703 the Cold Shoulder

May 12th, 2012

I have written previously about the gap in Rule 702, which provides a multi-factorial test for the admissibility of an opinion from a properly qualified expert witness:

(a) the expert’s scientific, technical, or other specialized knowledge will help the trier of fact to understand the evidence or to determine a fact in issue;

(b) the testimony is based on sufficient facts or data;

(c) the testimony is the product of reliable principles and methods; and

(d) the expert has reliably applied the principles and methods to the facts of the case.

Noticeably absent from Rule 702 is any requirement that the facts or data upon which the expert witness relies be worth a damn. From Rule 702(b), (c), and (d) alone, an expert witness, armed with sufficient unreliable, fraudulent, imaginary, or simply incorrect facts and data, using reliable principles and methods, and applying those principles and methods reliably to the facts of the case, gets to testify at trial. Arguably, the first subsection, Rule 702(a), which limits testimony to helpful “knowledge” provides an overriding condition that helps to qualify the next three. It is difficult to imagine that knowledge is based upon unreliable facts and data.

Still, the failure to require reliable data explicitly within the scope of Rule 702 is disturbing. This unhappy state of affairs, in which courts do not exercise gatekeeping over the quality of the data themselves, is apparently the law of the United States Court of Appeals for the Tenth Circuit.

In Pritchett v. I-Flow Corporation, the plaintiff had shoulder surgery, which required the use of a “pain pump” to inject anesthetic medication into the shoulder post-operatively. The plaintiff went on to develop “chondrolysis” in his shoulder joint, a condition that involves partial or complete loss of cartilage in the shoulder joint. Pritchett v. I-Flow Corp., Civil Action No. 09-cv-02433-WJM-KLM (D. Colo. April 17, 2012) (Mix, J., Magistrate Judge).

The opinion is a mechanical recitation of Daubert procedure and method, with little analysis of the expert witness’s opinion, until the magistrate judge describes the requirement of Rule 702(b) for “sufficient facts and data”:

“i. Sufficient Facts and Data

The proponent of the opinion must first show that the witness gathered “sufficient facts and data” to formulate the opinion. In the Tenth Circuit, assessment of the sufficiency of the facts and data used by the witness is a quantitative, rather than a qualitative, analysis. Fed. R. Evid. 702, Advisory Committee Notes to 2000 Amendments; see also United States v. Lauder, 409 F.3d 1254, 1264 n.5 (10th Cir. 2005). That is to say, the Court does not examine whether the facts obtained by the witness are themselves reliable; whether the facts used are qualitatively reliable is a question of the weight that should be given to the opinion by the fact-finder, not the admissibility of the opinion. Lauder, 409 F.3d at 1264. Instead, “this inquiry examines only whether the witness obtained the amount of data that the methodology itself demands.” Crabbe, 556 F. Supp. 2d at 1223.”

Pritchett v. I-Flow Corp. (emphasis added). That is to say: the whole gatekeeping enterprise is really about appearances and not about trying to ensure more accurate fact finding.

Even if the court’s analysis of Rule 702 is correct, it is an incomplete analysis that omits the important role of Rule 703:

Rule 703. Bases of an Expert’s Opinion Testimony

An expert may base an opinion on facts or data in the case that the expert has been made aware of or personally observed. If experts in the particular field would reasonably rely on those kinds of facts or data in forming an opinion on the subject, they need not be admissible for the opinion to be admitted. But if the facts or data would otherwise be inadmissible, the proponent of the opinion may disclose them to the jury only if their probative value in helping the jury evaluate the opinion substantially outweighs their prejudicial effect.

According to Magistrate Mix, the reliability of the facts and data does not count for gatekeeping. Chalk up another loophole to the law’s requirement of reliable scientific evidence.


WOE-fully Inadequate Methodology – An Ipse Dixit By Another Name

May 1st, 2012

Take all the evidence, throw it into the hopper, close your eyes, open your heart, and guess the weight. You could be a lucky winner! The weight of the evidence suggests that the weight-of-the-evidence (WOE) method is little more than subjective opinion, but why care if it helps you to get to a verdict?

The scientific community has never been seriously impressed by the so-called weight of the evidence (WOE) approach to determining causality. The phrase is vague and ambiguous; its use, inconsistent. See, e.g., V. H. Dale, G.R. Biddinger, M.C. Newman, J.T. Oris, G.W. Suter II, T. Thompson, et al., “Enhancing the ecological risk assessment process,” 4 Integrated Envt’l Assess. Management 306 (2008)(“An approach to interpreting lines of evidence and weight of evidence is critically needed for complex assessments, and it would be useful to develop case studies and/or standards of practice for interpreting lines of evidence.”); Igor Linkov, Drew Loney, Susan M. Cormier, F.Kyle Satterstrom, Todd Bridges, “Weight-of-evidence evaluation in environmental assessment: review of qualitative and quantitative approaches,” 407 Science of Total Env’t 5199–205 (2009); Douglas L. Weed, “Weight of Evidence: A Review of Concept and Methods,” 25 Risk Analysis 1545 (2005) (noting the vague, ambiguous, indefinite nature of the concept of “weight of evidence” review); R.G. Stahl Jr., “Issues addressed and unaddressed in EPA’s ecological risk guidelines,” 17 Risk Policy Report 35 (1998) (noting that U.S. Environmental Protection Agency’s guidelines for ecological weight-of-evidence approaches to risk assessment fail to provide guidance); Glenn W. Suter II, Susan M. Cormier, “Why and how to combine evidence in environmental assessments: Weighing evidence and building cases,” 409 Science of the Total Environment 1406, 1406 (2011)(noting arbitrariness and subjectivity of WOE “methodology”).

General Electric v. Joiner

Most savvy judges quickly figured out that weight of the evidence (WOE) was suspect methodology, woefully lacking, and indeed, not really a methodology at all.

The WOE method was part of the hand waving in Joiner by plaintiffs’ expert witnesses, including the frequent testifier Rabbi Teitelbaum. The majority recognized that Rabbi Teitelbaum’s WOE weighed in at less than a peppercorn, and affirmed the district court’s exclusion of his opinions. The Joiner Court’s assessment provoked a dissent from Justice Stevens, who was troubled by the Court’s undressing of the WOE methodology:

“Dr. Daniel Teitelbaum elaborated on that approach in his deposition testimony: ‘[A]s a toxicologist when I look at a study, I am going to require that that study meet the general criteria for methodology and statistical analysis, but that when all of that data is collected and you ask me as a patient, Doctor, have I got a risk of getting cancer from this? That those studies don’t answer the question, that I have to put them all together in my mind and look at them in relation to everything I know about the substance and everything I know about the exposure and come to a conclusion. I think when I say, “To a reasonable medical probability as a medical toxicologist, this substance was a contributing cause,” … to his cancer, that that is a valid conclusion based on the totality of the evidence presented to me. And I think that that is an appropriate thing for a toxicologist to do, and it has been the basis of diagnosis for several hundred years, anyway’.

* * * *

Unlike the District Court, the Court of Appeals expressly decided that a ‘weight of the evidence’ methodology was scientifically acceptable. To this extent, the Court of Appeals’ opinion is persuasive. It is not intrinsically “unscientific” for experienced professionals to arrive at a conclusion by weighing all available scientific evidence—this is not the sort of ‘junk science’ with which Daubert was concerned. After all, as Joiner points out, the Environmental Protection Agency (EPA) uses the same methodology to assess risks, albeit using a somewhat different threshold than that required in a trial. Petitioners’ own experts used the same scientific approach as well. And using this methodology, it would seem that an expert could reasonably have concluded that the study of workers at an Italian capacitor plant, coupled with data from Monsanto’s study and other studies, raises an inference that PCB’s promote lung cancer.”

General Electric v. Joiner, 522 U.S. 136, 152-54 (1997)(Stevens, J., dissenting)(internal citations omitted)(confusing critical assessment of studies with WOE; and quoting Rabbi Teitelbaum’s attempt to conflate diagnosis with etiological attribution). Justice Stevens could reach his assessment only by ignoring the serious lack of internal and external validity in the studies relied upon by Rabbi Teitelbaum. Those studies did not support his opinion individually or collectively.

Justice Stevens was wrong as well about the claimed scientific adequacy of WOE. Courts have long understood that precautionary, preventive judgments of regulatory agencies are different from scientific conclusions that are admissible in civil and criminal litigation. See Allen v. Pennsylvania Engineering Corp., 102 F.3d 194 (5th Cir. 1996)(WOE, although suitable for regulatory risk assessment, is not appropriate in civil litigation). Justice Stevens’ characterization of WOE was little more than judicial ipse dixit, and it was, in any event, not the law; it was the argument of a dissenter.

Milward v. Acuity Specialty Products

Admittedly, dissents can sometimes help lower court judges chart a path of evasion and avoidance of a higher court’s holding. In Milward, Justice Stevens’ mischaracterization of WOE and scientific method was adopted as the legal standard for expert witness testimony by a panel of the United States Court of Appeals for the First Circuit. Milward v. Acuity Specialty Products Group, Inc., 664 F. Supp. 2d 137 (D. Mass. 2009), rev’d, 639 F.3d 11 (1st Cir. 2011), cert. denied, U.S. Steel Corp. v. Milward, ___ U.S. ___, 2012 WL 33303 (2012).

Mr. Milward claimed that he was exposed to benzene as a refrigerator technician, and developed acute promyelocytic leukemia (APL) as a result. 664 F. Supp. 2d at 140. In support of his claim, Mr. Milward offered the testimony of Dr. Martyn T. Smith, a toxicologist, who testified that the “weight of the evidence” supported his opinion that benzene exposure causes APL. Id. Smith, in his litigation report, described his methodology as an application of WOE:

“The term WOE has come to mean not only a determination of the statistical and explanatory power of any individual study (or the combined power of all the studies) but the extent to which different types of studies converge on the hypothesis. In assessing whether exposure to benzene may cause APL, I have applied the Hill considerations. Nonetheless, application of those factors to a particular causal hypothesis, and the relative weight to assign each of them, is both context dependent and subject to the independent judgment of the scientist reviewing the available body of data. For example, some WOE approaches give higher weight to mechanistic information over epidemiological data.”

Smith Report at ¶¶19, 21 (citing Sheldon Krimsky, “The Weight of Scientific Evidence in Policy and Law,” 95(S1) Am. J. Public Health 5130, 5130-31 (2005))(March 9, 2009). Smith marshaled several bodies of evidence, which he claimed collectively supported his opinion that benzene causes APL. Milward, 664 F. Supp. 2d at 143.

Milward also offered the testimony of a philosophy professor, Carl F. Cranor, for the opinion that WOE was an acceptable methodology, and that all scientific inference is subject to judgment. This is the same Cranor who, advocating for open admissions of all putative scientific opinions, showcased his confusion between statistical significance probability and the posterior probability involved in a conclusion of causality. Carl F. Cranor, Regulating Toxic Substances: A Philosophy of Science and the Law at 33-34 (Oxford 1993)(“One can think of α, β (the chances of type I and type II errors, respectively) and 1 - β as measures of the ‘risk of error’ or ‘standards of proof.’”). See also id. at 44, 47, 55, 72-76.
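The difference between a significance probability and a posterior probability can be made concrete with a little arithmetic. The following is only a minimal sketch, not anything drawn from Cranor’s book or the Milward record: the function name, and the prior, α, and power values, are assumptions chosen for illustration. It applies Bayes’ theorem to show that the probability a hypothesis is true, given a statistically significant result, depends upon the prior probability and the power of the study, and so cannot be read off from α alone.

```python
def posterior_given_significant(prior: float, alpha: float, power: float) -> float:
    """P(hypothesis true | significant result), by Bayes' theorem.

    prior: assumed probability that the hypothesis is true before the study
    alpha: type I error rate, P(significant | hypothesis false)
    power: 1 - beta, P(significant | hypothesis true)
    """
    true_positive = power * prior
    false_positive = alpha * (1.0 - prior)
    return true_positive / (true_positive + false_positive)


# Hypothetical inputs: the same alpha of 0.05, with different priors and power.
for prior, power in [(0.5, 0.8), (0.1, 0.8), (0.1, 0.2)]:
    post = posterior_given_significant(prior, alpha=0.05, power=power)
    print(f"prior={prior:.2f} power={power:.2f} -> posterior={post:.3f}")
```

With α held at 0.05 throughout, the posterior probability in these hypothetical scenarios moves from well above one half to well below it, which is why α cannot serve as a “standard of proof.”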

After a four-day evidentiary hearing, the district court found that Martyn Smith’s opinion was merely a plausible hypothesis, and not admissible. Milward, 664 F. Supp. 2d at 149. The Court of Appeals, in an opinion by Chief Judge Lynch, however, reversed and ruled that an inference of general causation based on a WOE methodology satisfied the reliability requirement for admission under Federal Rule of Evidence 702. 639 F.3d at 26. According to the Circuit, WOE methodology was scientifically sound. Id. at 22-23.

WOE Cometh

Because the WOE methodology is not well described, either in the published literature or in Martyn Smith’s litigation report, it is difficult to understand exactly what the First Circuit approved by reversing Smith’s exclusion. Usually the burden is on the proponent of the opinion testimony, and one would have thought that the vagueness of the described methodology would count against admissibility. It is hard to escape the conclusion that the Circuit elevated a poorly described method, best characterized as hand waving, into a description of scientific method.

The Panel appeared to have been misled by Carl F. Cranor, who described “inference to the best explanation” as requiring a scientist to “consider all of the relevant evidence” and “integrate the evidence using professional judgment to come to a conclusion about the best explanation.” Id. at 18. The available explanations are then weighed, and a would-be expert witness is free to embrace the one he feels offers the “best” explanation. The appellate court’s opinion takes WOE, combined with Cranor’s “inference to the best explanation,” to hold that an expert witness need only opine that he has considered the range of plausible explanations for the association, and that he believes that the causal explanation is the best or “most plausible.” Id. at 20 (upholding this approach as “methodologically reliable”).

What is missing of course is the realization that plausible does not mean established, reasonably certain, or even more likely than not. The Circuit’s invocation of plausibility also obscures the indeterminacy of the available data for supporting a reliable conclusion of causation in many cases.

Curiously, the Panel likened WOE to the use of differential diagnosis, which is a method for inferring the specific cause of a particular patient’s disease or disorder. Id. at 18. This is a serious confusion between a method concerned with general causation and one concerned with specific causation. By the principle of charity, we might allow that the First Circuit was thinking of some process of differential etiology rather than diagnosis, given that diagnoses (other than for infectious diseases and a few pathognomonic disorders) do not usually carry with them information about unique etiologic agents. But even such a process of differential etiology is a well-structured disjunctive syllogism of the form:

A ∨ B ∨ C

~A ∧ ~B

∴ C

There is nothing subjective about assigning weights or drawing inferences in applying such a syllogism. In the Milward case, one of the propositional facts that might well have explained the available evidence was chance, but plaintiff’s expert witness Smith could not and did not rule out chance in that the studies upon which he relied were not statistically significant. Smith could thus never get past “therefore” in any syllogism or in any other recognizable process of reasoning.
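The mechanical character of the syllogism can be shown in a few lines. What follows is a minimal sketch under stated assumptions: the function name and the three labels for competing explanations are hypothetical, chosen only to illustrate elimination, and are not taken from any court’s opinion or any witness’s report. The conclusion issues only when every alternative but one has been ruled out.

```python
from typing import Optional, Set


def surviving_explanation(candidates: Set[str], ruled_out: Set[str]) -> Optional[str]:
    """Disjunctive syllogism by elimination.

    Returns the single explanation left standing, or None when more than
    one candidate (or none) remains, in which case no conclusion follows.
    """
    remaining = candidates - ruled_out
    if len(remaining) == 1:
        return next(iter(remaining))
    return None


# Hypothetical labels for the competing explanations of an association.
candidates = {"chance", "bias or confounding", "causation"}

# Both alternatives excluded: the conclusion follows.
print(surviving_explanation(candidates, {"chance", "bias or confounding"}))

# Chance not excluded: two candidates remain, so nothing follows.
print(surviving_explanation(candidates, {"bias or confounding"}))
```

The second call returns nothing at all, which is the position of a witness who has not excluded chance.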

The Circuit Court provided no insight into the process Smith used to weigh the available evidence, and it failed to address the analytical gaps and evidentiary insufficiencies addressed by the trial court, other than to invoke the mantra that all these issues go to “the weight, not the admissibility” of Smith’s opinions. This, of course, is a conclusion, not an explanation or a legal theory.

There is also a cute semantic trick lurking in plaintiffs’ position in Milward, which results from their witnesses describing their methodology as “WOE.” Since the jury is charged with determining the “weight of the evidence,” any evaluation of the WOE would be an invasion of the province of the jury. Milward, 639 F.3d at 20. QED by the semantic device of deliberately conflating the name of the putative scientific methodology with the term traditionally used to describe jury fact finding.

In any event, the Circuit’s chastisement of the district court for evaluating Smith’s implementation of the WOE methodology, his logical, mathematical, and epidemiological errors, his result-driven reinterpretation of study data, threatens to read an Act of Congress — the Federal Rules of Evidence, and especially Rules 702 and 703 — out of existence by judicial fiat. The Circuit’s approach is also at odds with Supreme Court precedent (now codified in Rule 702) on the importance and the requirement of evaluating opinion testimony for analytical gaps and the ipse dixit of expert witnesses. General Electric Co. v. Joiner, 522 U.S. 136, 146 (1997).

Smith’s Errors in Recalculating Odds Ratios of Published Studies

In the district court, the defendants presented testimony of an epidemiologist, Dr. David H. Garabrant, who took Smith to task for calculating risk ratios incorrectly. Smith did not have any particular expertise in epidemiology, and his faulty calculations were problematic from the perspective of both Rule 702 and Rule 703. The district court found the criticisms of Smith’s calculations convincing, 664 F. Supp. 2d at 149, but the appellate court held that the technical dispute was for the jury; “both experts’ opinions are supported by evidence and sound scientific reasoning,” Milward, 639 F.3d at 24. This ruling is incomprehensible. Plaintiffs had the burden of showing the admissibility of Smith’s opinion generally, but also the reasonableness of his reliance upon the calculated odds ratio. The defendants had no burden of persuasion on the issue of Smith’s calculations, but they presented testimony, which apparently carried the day. The appellate court had no basis for reversing the specific ruling with respect to the erroneously calculated risk ratio.

Smith’s Reliance upon Statistically Insignificant Studies

Smith relied upon studies that were not statistically significant at any accepted level. An opinion of causality requires a showing that chance, bias, and confounding have been excluded in assessing an existing association. Smith failed to exclude chance as an explanation for the association, and the burden to make this exclusion was on the plaintiffs. This failure was not something that could readily be patched by adverting to other evidence of studies in animals or in test tubes. The Court of Appeals excused the important analytical gap in plaintiffs’ witness’s opinion because APL is rare, and data collection is difficult in the United States. Id. at 24. Evidence “consistent with” and “suggestive of” the challenged witness’s opinion thus suffices. This is a remarkable homeopathic dilution of both legal and scientific causation. Now we have a rule of law that allows plaintiffs to be excused from having to prove their case with reliable evidence if they allege a rare disease for which they lack evidence.
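What it means for a study to fail to exclude chance can be illustrated with a simple calculation. The sketch below is only an illustration under stated assumptions: the counts are invented for the example and are not data from any study in the Milward record, the function name is hypothetical, and the interval uses the common Woolf (logit) approximation rather than an exact method. It computes an odds ratio from a small case-control table, along with an approximate 95% confidence interval.

```python
import math
from typing import Tuple


def odds_ratio_with_ci(a: int, b: int, c: int, d: int,
                       z: float = 1.96) -> Tuple[float, float, float]:
    """Odds ratio and approximate 95% interval (Woolf logit method).

    a: exposed cases      b: unexposed cases
    c: exposed controls   d: unexposed controls
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of the log odds ratio.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical small case-control study (invented counts).
or_, lo, hi = odds_ratio_with_ci(a=6, b=14, c=10, d=50)
print(f"OR={or_:.2f}, 95% CI {lo:.2f} to {hi:.2f}")
print("chance excluded at the 5% level:", not (lo <= 1.0 <= hi))
```

In this invented example the point estimate is above 2, but the interval spans 1.0, so the data are compatible with no association at all. An elevated but imprecise estimate of this kind is “consistent with” causation and equally consistent with chance.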

Leveling the Hierarchy of Evidence

Imagine trying to bring a medication to market with a small case-control study, with a non-statistically significant odds ratio! Oh, but these clinical trials are so difficult and expensive; and they take such a long time. Like a moment’s thought, when thinking is so hard and a moment such a long time. We would be quite concerned if the FDA abridged the standard for causal efficacy in the licensing of new medications; we should be just as concerned about judicial abridgments of standards for causation of harm in tort actions.

Leveling the hierarchy of evidence has been an explicit or implicit goal of several law professors. Some of the leveling efforts even show up in the new Reference Manual on Scientific Evidence (RMSE 3d ed. 2011). See “New-Age Levellers – Flattening Hierarchy of Evidence.”

The Circuit, in Milward, quoted an article published in the Journal of the National Cancer Institute by Michele Carbone and others who suggest that there should be no hierarchy, but the Court ignored a huge body of literature that explains and defends the need for recognizing that not all study designs or types are equal. Interestingly, the RMSE chapter on epidemiology by Professor Green (see more below) cites the same article. RMSE 3d at 564 & n.48 (citing and quoting symposium paper that “[t]here should be no hierarchy [among different types of scientific methods to determine cancer causation]. Epidemiology, animal, tissue culture and molecular pathology should be seen as integrating evidences in the determination of human carcinogenicity.” Michele Carbone et al., “Modern Criteria to Establish Human Cancer Etiology,” 64 Cancer Res. 5518, 5522 (2004).) Carbone, of course, is best known for his advocacy of a viral cause (SV40) of human mesothelioma, a claim unsupported, and indeed contradicted, by epidemiologic studies. Carbone’s statement does not support the RMSE chapter’s leveling of epidemiology and toxicology, and Carbone is, in any event, an unlikely source to cite.

The First Circuit, in Milward, studiously ignored a mountain of literature on evidence-based medicine, including the RMSE 3d chapter, “Reference Guide on Medical Testimony,” which teaches that leveling of study designs and types is inappropriate. The RMSE chapter devotes several pages to explaining the role of study design in assessing an etiological issue:

“3. Hierarchy of medical evidence

With the explosion of available medical evidence, increased emphasis has been placed on assembling, evaluating, and interpreting medical research evidence. A fundamental principle of evidence-based medicine (see also Section IV.C.5, infra) is that the strength of medical evidence supporting a therapy or strategy is hierarchical.

When ordered from strongest to weakest, systematic review of randomized trials (meta-analysis) is at the top, followed by single randomized trials, systematic reviews of observational studies, single observational studies, physiological studies, and unsystematic clinical observations.¹⁵⁰ An analysis of the frequency with which various study designs are cited by others provides empirical evidence supporting the influence of meta-analysis followed by randomized controlled trials in the medical evidence hierarchy.¹⁵¹ Although they are at the bottom of the evidence hierarchy, unsystematic clinical observations or case reports may be the first signals of adverse events or associations that are later confirmed with larger or controlled epidemiological studies (e.g., aplastic anemia caused by chloramphenicol,¹⁵² or lung cancer caused by asbestos¹⁵³). Nonetheless, subsequent studies may not confirm initial reports (e.g., the putative association between coffee consumption and pancreatic cancer).¹⁵⁴”

John B. Wong, Lawrence O. Gostin, and Oscar A. Cabrera, “Reference Guide on Medical Testimony,” RMSE 3d 687, 723-24 (2011). The implication that there is no hierarchy of evidence in causal inference, and that tissue culture studies are as relevant as epidemiology, is patently absurd. The Circuit not only went out on a limb, it managed to saw the limb off, while “out there.”

Milward – Responses Critical and Otherwise

The First Circuit’s decision in Milward made an immediate impression upon those writers who have worked hard to dismantle or marginalize Rule 702. The Circuit’s decision was mysteriously cited with obvious approval by Professor Margaret Berger, even though she had died before the decision was published! Margaret A. Berger, “The Admissibility of Expert Testimony,” RMSE 3d at 20 & n. 51 (2011). Professor Michael Green, one of the reporters for the ALI’s Restatement (Third) of Torts, hyperbolically called Milward “[o]ne of the most significant toxic tort causation cases in recent memory.” Michael D. Green, “Introduction: Restatement of Torts as a Crystal Ball,” 37 Wm. Mitchell L. Rev. 993, 1009 n.53 (2011).

The WOE approach, and its embrace in Milward, obscures the reality that sometimes the evidence does not logically or analytically support the offered conclusion, and at other times, the best explanation is uncertainty. By adopting the WOE approach, vague and ambiguous as it is, the Milward Court was beguiled into holding that WOE determinations are for the jury. The lack of meaningful content of WOE means that decisions such as Milward effectively remove the gatekeeping function, or permit that function to be minimally satisfied by accepting an expert witness’s claim to have employed WOE. The epistemic warrant required by Rule 702 is diluted if not destroyed. Scientific hunch and speculation, proper in their place, can be passed off for scientific knowledge to gullible or result-oriented judges and juries.


Admissibility versus Sufficiency of Expert Witness Evidence

April 18th, 2012

Professors Michael Green and Joseph Sanders are two of the longest serving interlocutors in the never-ending discussion and debate about the nature and limits of expert witness testimony on scientific questions about causation. Both have made important contributions to the conversation, and both have been influential in academic and judicial circles. Professor Green has served as the co-reporter for the American Law Institute’s Restatement (Third) of Torts: Liability for Physical Harm. Whether wrong or right, new publications about expert witness issues by Green or Sanders call for close attention.

Early last month, Professors Green and Sanders presented together at a conference, on “Admissibility Versus Sufficiency: Controlling the Quality of Expert Witness Testimony in the United States.” Video and audio of their presentation can be found online. The authors posted a manuscript of their draft article on expert witness testimony to the Social Science Research Network. See Michael D. Green & Joseph Sanders, “Admissibility Versus Sufficiency: Controlling the Quality of Expert Witness Testimony in the United States,” <downloaded on March 25, 2012>.

The authors argue that most judicial exclusions of expert witness causal opinion testimony are based upon a judgment that the challenged witness’s opinion is based upon insufficient evidence. They point to litigations, such as the Bendectin and silicone gel breast implant cases, where the defense challenges were supported in part by a body of “exonerative” epidemiologic studies. Legal theory construction is always fraught with danger in that it either stands to be readily refuted by counterexample, or it is put forward as a normative, prescriptive tool to change the world, and thus lacks any descriptive or explanatory component. Green and Sanders, however, seem to be earnest in suggesting that their reductionist approach is both descriptive and elucidative of actual judicial practice.

The authors’ reductionist approach in this area, and especially as applied to the Bendectin and silicone decisions, however, ignores that even before the so-called exonerative epidemiology on Bendectin and silicone was available, the plaintiffs’ expert witnesses were presenting opinions on general and specific causation, based upon studies and evidence of dubious validity. Given that the silicone litigation erupted before Daubert was decided, and Bendectin cases pretty much ended with Daubert, neither litigation really permits a clean before and after picture. Before Daubert, courts struggled with how to handle both the invalidity and the insufficiency (once the impermissible inferences were stripped away) in the Bendectin cases. And before Daubert, all silicone cases went to the jury. Even after Daubert, for some time, silicone cases resulted in jury verdicts, which were upheld on appeal. It took defendants some time to uncover the nature and extent of the invalidity in plaintiffs’ expert witnesses’ opinions, the invalidity of the studies upon which these witnesses relied, and the unreasonableness of the witnesses’ reliance upon various animal and in vitro toxicologic and immunologic studies. And it took trial courts a few years after the Supreme Court’s 1993 Daubert decision to warm up to their new assignment. Indeed, Green and Sanders get a good deal of mileage in their reductionist approach from trial and appellate courts that were quite willing to collapse the distinction between reliability or validity on the one hand, and sufficiency on the other. Some of those “back benching” courts used consensus statements and reviews, which both marshaled the contrary evidence as well as documented the invalidity of the would-be affirmative evidence. This judicial reliance upon external sources that encompassed both sufficiency and reliability should not be understood to mean that reliability (or validity) is nothing other than sufficiency.

A post-Daubert line of cases is more revealing: the claim that the ethyl mercury vaccine preservative, thimerosal, causes autism. Professors Green and Sanders touch briefly upon this litigation. See Blackwell v. Wyeth, 971 A.2d 235 (Md. 2009). Plaintiff’s expert witness, David Geier, had published several articles in which he claimed to have supported a causal nexus between thimerosal and autism. Green and Sanders dutifully note that the Maryland courts ultimately rejected the claims based upon Geier’s data as wholly inadequate, standing alone, to support the inference he zealously urged to be drawn. Id. at 32. Whether this is sufficiency or the invalidity of his ultimate inference of causation from an inadequate data set perhaps can be debated, but surely the validity concerns should not be lost in the shuffle of evaluating the evidence available. Of course, exculpatory epidemiologic studies ultimately were published, based upon high quality data and inferences, but strictly speaking, these studies were not necessary to the process of ruling Geier’s advocacy science out of bounds for valid scientific discourse and legal proceedings.

Some additional comments.

1. Questionable reductionism. The authors describe the thrust of their argument as a need to understand judicial decisions on expert witness admissibility as “sufficiency judgments.” While their analysis simplifies the gatekeeping decisions, it also abridges the process in a way that omits important determinants of the law and its application. Sufficiency, or the lack thereof, is often involved as a fatal deficiency in expert witness opinion testimony on causal issues, but the authors’ attempt to reduce many exclusionary decisions to insufficiency determinations ignores the many ways that expert witnesses (and scientists in the real world outside of courtrooms) go astray. The authors’ reductionism seems a weak, if not flawed, predictive, explanatory, and normative theory of expert witness gatekeeping. Furthermore, this reductionism holds a false allure to judges who may be tempted to oversimplify their gatekeeping task by conflating gatekeeping with the jury’s role: exclude the proffered expert witness opinion testimony because, considering all the available evidence, the testimony is probably wrong.

2. Weakness of peer review, publication, and general acceptance in predicting gatekeeping decisions. The authors further describe a “sufficiency approach” as openly acknowledging the relative unimportance of peer review, publication, and general acceptance. Id. at 39. These factors do not lack importance because they are unrelated to sufficiency; they are unimportant because they are weak proxies for validity. Their presence or absence does not really help predict whether the causal opinion offered is invalid, or otherwise unreliable. The existence of published, high-quality, peer-reviewed systematic reviews does, however, bear on sufficiency of the evidence. At least in some cases, courts consider such reviews and rely upon them heavily in reaching a decision on Rule 702, but we should ask to what extent the court has simply avoided the hard work of thinking through the problem on its own.

3. Questionable indictment of juries and the adversarial system for the excesses of expert witnesses. Professors Green and Sanders describe the development of common law, and rules, to control expert witness testimony as “a judicial attempt to moderate the worst consequences of two defining characteristics of United States civil trials: party control of experts and the widespread use of jury decision makers.” Id. at 2. There is no doubt that these are two contributing factors in some of the worst excesses, but the authors really offer no support for their causal judgment. The experience of courts in Europe, where civil juries and party control of expert witnesses are often absent from the process, raises questions about Green and Sanders’ attribution. See, e.g., R. Meester, M. Collins, R.D. Gill, M. van Lambalgen, “On the (ab)use of statistics in the legal case against the nurse Lucia de B,” 5 Law, Probability and Risk 233 (2007) (describing the conviction of Nurse Lucia de Berk in the Netherlands, based upon shabby statistical evidence).

Perhaps a more general phenomenon is at play, such as an epistemologic pathology of expert witnesses who feel empowered and unconstrained by speaking in court, to untutored judges or juries. The thrill of power, the arrogance of asserted opinion, the advancement of causes and beliefs, the lure of lucre, the freedom from contradiction, and a whole array of personality quirks are strong inducements for expert witnesses, in many countries, to outrun their scientific headlights. See Judge Jack B. Weinstein, “Preliminary Reflections on Administration of Complex Litigation” 2009 Cardozo L. Rev. de novo 1, 14 (2009) (describing plaintiffs’ expert witnesses in silicone litigation as “charlatans”; “[t]he breast implant litigation was largely based on a litigation fraud. … Claims—supported by medical charlatans—that enormous damages to women’s systems resulted could not be supported.”)

In any event, there have been notoriously bad verdicts in cases decided by trial judges as the finders of fact. See, e.g., Wells v. Ortho Pharmaceutical Corp., 615 F. Supp. 262 (N.D.Ga. 1985), aff’d and rev’d in part on other grounds, 788 F.2d 741 (11th Cir.), cert. denied, 479 U.S. 950 (1986); Barrow v. Bristol-Myers Squibb Co., 1998 WL 812318, at *23 (M.D. Fla., Oct. 29, 1998)(finding for breast implant plaintiff whose claims were supported by dubious scientific studies), aff’d, 190 F. 3d 541 (11th Cir. 1999). Bad things can happen in the judicial process even without the participation of lay juries.

Green and Sanders are correct to point out that juries are often confused by scientific evidence, and lack the time, patience, education, and resources to understand it. Same for judges. The real difference is that the decisions of judges are public. Judges are expected to explain their reasoning, and there is some, even if limited, appellate review for judicial gatekeeping decisions. In this vein, Green and Sanders dismiss the hand wringing over disagreements among courts on admissibility decisions by noting that similar disagreements over evidentiary sufficiency issues fill the appellate reporters. Id. at 37. Green and Sanders might well add that at least the disagreements are out in the open, advanced with supporting reasoning, for public discussion and debate, unlike the unimpeachable verdicts of juries and their cloistered, secretive reasoning or lack thereof.

In addition, Green and Sanders fail to mention a considerable problem: the admission of weak, pathologic, or overstated scientific opinion undermines confidence in the judicial judgments based upon verdicts that come out of a process that featured the dubious opinions of the expert witnesses. The public embarrassment of the court system for its judgments, based upon questionable expert witness opinion testimony, was a strong inducement to changing the libertine pre-Daubert laissez-faire approach.

4. Failure to consider the important role of Rule 703, which is quite independent of any “sufficiency” considerations, in the gatekeeping process. Green and Sanders properly acknowledge the historical role that Rule 703, of the Federal Rules of Evidence, played in judicial attempts to regain some semblance of control over expert witness opinion. They do not pursue the issue of its present role, which is often neglected and underemphasized. In part, Rule 703, with its requirement that courts screen expert witness reliance upon independently inadmissible evidence (which means virtually all epidemiologic and animal studies and their data analyses), goes to the heart of gatekeeping by requiring judges to examine the quality of study data, and the reasonableness of reliance upon such data, by testifying expert witnesses. See Schachtman, RULE OF EVIDENCE 703 — Problem Child of Article VII (Sept. 19, 2011). Curiously, the authors try to force Rule 703 into their sufficiency pigeonhole even though it calls for a specific inquiry into the reasonableness (vel non) of reliance upon specific (hearsay or otherwise inadmissible) studies. In my view, Rule 703 is predominantly a validity, and not a sufficiency, inquiry.

Judge Weinstein’s use of Rule 703, in In re Agent Orange, to strip out the most egregiously weak evidence did not predominantly speak to the evidentiary insufficiency of the plaintiffs’ expert witnesses’ reliance materials; nor did it look to the defendants’ expert witnesses’ reliance upon contradicting evidence. Judge Weinstein was troubled by the plaintiffs’ expert witnesses’ reliance upon hearsay statements, from biased witnesses, of the plaintiffs’ medical condition. Judge Weinstein did, of course, famously apply sufficiency criteria, including relative risks too low to permit an inference of specific causation, and the insubstantial totality of the evidence, but Judge Weinstein’s judicial philosophy then was to reject Rule 702 as a quality-control procedure for expert witness opinion testimony. See In re Agent Orange Product Liab. Litig., 597 F. Supp. 740, 785, 817 (E.D.N.Y. 1984)(plaintiffs must prove at least a two-fold increase in rate of disease allegedly caused by the exposure), aff’d, 818 F.2d 145, 150-51 (2d Cir. 1987)(approving district court’s analysis), cert. denied sub nom. Pinkney v. Dow Chemical Co., 484 U.S. 1004 (1988); see also In re “Agent Orange” Prod. Liab. Litig., 611 F. Supp. 1223, 1240, 1262 (E.D.N.Y. 1985), aff’d, 818 F.2d 187 (2d Cir. 1987), cert. denied, 487 U.S. 1234 (1988). A decade later, in the breast implant litigation, Judge Weinstein adhered to his refusal to use Rule 702 to make explicit expert witness validity rulings or sufficiency determinations, and instead granted summary judgment on the entire evidentiary display. This assessment of sufficiency was not, however, driven by the rules of evidence; it was based firmly upon Federal Rule of Civil Procedure 56’s empowerment of the trial judge to make an overall assessment that plaintiffs lack a submissible case. See In re Breast Implant Cases, 942 F.Supp. 958 (E. & S.D.N.Y. 1996)(granting summary judgment because of insufficiency of plaintiffs’ evidence, but specifically declining to rule on defendants’ Rule 702 and Rule 703 motions). Within a few years, court-appointed expert witnesses, and the Institute of Medicine, weighed in with withering criticisms of plaintiffs’ attempted scientific case. Given that there was so little valid evidence, sufficiency really never was at issue for these experts, but Judge Weinstein chose to frame the issue as sufficiency to avoid ruling on the pending motions under Rule 702.

5. Re-analyzing Re-analysis. In the Bendectin litigation, some of the plaintiffs’ expert witnesses sought to offer various re-analyses of published papers. Defendant Merrell Dow objected, and appeared to have framed its objections in general terms to unpublished re-analyses of published papers. Green and Sanders properly note that some of the defense arguments, to the extent stated generally as prohibitions against re-analyses, were overblown and overstated. Re-analyses can take so many forms, and the quality of peer reviewed papers is so variable, it would be foolhardy to frame a judicial rule as a prohibition against re-analyzing data in published studies. Indeed, so many studies are published with incorrect statistical analyses that parties and expert witnesses have an obligation to call the problems to the courts’ attention, and to correct the problems when possible.

The notion that peer review was important in any way to serve as a proxy for reliability or validity has not been borne out. Similarly, the suggestion that reanalyses of existing data from published papers were presumptively suspect was also not well considered. Id. at 13.

6. Comments dismissive of statistical significance and methodological rigor. Judgments of causality are, at the end of the day, qualitative judgments, but is it really true that:

“Ultimately, of course, regardless of how rigorous the methodology of more probative studies, the magnitude of any result and whether it is statistically significant, judgment and inference is required as to whether the available research supports an inference of causation.”

Id. at 16 (citing among sources a particularly dubious case, Milward v. Acuity Specialty Prods. Group, Inc., 639 F.3d 11 (1st Cir. 2011), cert. denied, ___ U.S. ___ (2012)). Can the authors really intend to say that the judgment of causal inference is or should be made “regardless” of the rigor of methodology, regardless of statistical significance, regardless of a hierarchy of study evidentiary probativeness? Perhaps the authors simply meant to say that, at the end of the day, judgments of causal inference are qualitative judgments. As much as I would like to extend the principle of charity to the authors, their own labeling of appellate decisions contrary to Milward as “silly” makes the benefit of the doubt seem inappropriate.

7. The shame of scientists and physicians opining on specific causation. Green and Sanders acknowledge that judgments of specific causation – the causation of harm in a specific person – are often uninformed by scientific considerations, and that Daubert criteria are unhelpful.

“Unfortunately, outside the context of litigation this is an inquiry to which most doctors devote very little time.46 True, they frequently serve as expert witnesses in such cases (because the law demands evidence on this issue) but there is no accepted scientific methodology for determining the cause of an individual’s disease and, therefore, the error rate is simply unknown and unquantifiable.47”

Id. at 18. (Professor Green’s comments at the conference seemed even more apodictic.) The authors, however, seem to have no sense of outrage that expert witnesses offer opinions on this topic, for which the witnesses have no epistemic warrant, and that courts accept these facile, if not fabricated, judgments. Furthermore, specific causation is very much a scientific issue. Scientists may, as a general matter, concentrate on population studies that show associations, which may be found to be causal, but some scientists have worked on gene associations that define extremely high-risk sub-populations, which in turn determine the overall population risk. As Green and Sanders acknowledge, when the relative risks are extremely high (say > 100), we do not need to use any fancy math to know that most cases in the exposed group would not have occurred but for the exposure. A tremendous amount of scientific work has been done to identify biomarkers of increased risk, and to tie the increased risk to an agent-specific causal mechanism. See, e.g., Gregory L. Erexson, James L. Wilmer, and Andrew D. Kligerman, “Sister Chromatid Exchange Induction in Human Lymphocytes Exposed to Benzene and Its Metabolites in Vitro,” 45 Cancer Research 2471 (1985).
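The arithmetic behind the high-relative-risk point can be sketched briefly. The sketch below uses the standard epidemiologic formula for the attributable fraction among the exposed, (RR - 1) / RR. It is not a formula taken from Green and Sanders’ paper, and it assumes that the relative risk is an unbiased, unconfounded estimate that applies to the exposed group as a whole. The function name is hypothetical, chosen only for illustration.

```python
def attributable_fraction(relative_risk: float) -> float:
    """Share of exposed cases in excess of the baseline rate: (RR - 1) / RR.

    Assumes the relative risk is a valid causal estimate (no bias or
    confounding); that assumption is ours, not the paper's.
    A relative risk below 1 yields a negative value (apparent protection).
    """
    if relative_risk <= 0:
        raise ValueError("relative risk must be positive")
    return (relative_risk - 1.0) / relative_risk


if __name__ == "__main__":
    # Illustrative relative risks only; none is drawn from a real study.
    for rr in (1.5, 2.0, 10.0, 100.0):
        share = attributable_fraction(rr)
        print(f"RR = {rr:g}: attributable fraction = {share:.1%}")
```

On these assumptions, a relative risk of 2 puts the attributable fraction at one half, the familiar “more likely than not” threshold, whereas a relative risk of 100 attributes roughly 99 percent of exposed cases to the exposure.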

8. Sufficiency versus admissibility. Green and Sanders opine that many gatekeeping decisions, such as the Bendectin and breast implant cases, should be understood as sufficiency decisions that have incorporated the significant exculpatory epidemiologic evidence offered by defendants. Id. at 20. The “mature epidemiologic evidence” overwhelmed the plaintiffs’ meager evidence to the point that a jury verdict was not sustainable as a matter of law. Id. The authors’ approach requires a weighing of the complete evidentiary display, “entirely apart from the [plaintiffs’] expert’s testimony,” to determine the overall sufficiency and reasonableness of the claimed inference of causation. Id. at 21. What is missing from this approach, however, is that even without the defendants’ mature or solid body of epidemiologic evidence, the plaintiff’s expert witness was urging an inference of causation based upon fairly insubstantial evidence. Green and Sanders are concerned, no doubt, that if sufficiency were the main driver of exclusionary rulings, then there would be a disconnect between the appellate standard of review for expert witness opinion admissibility, under which rulings are reversed only for an “abuse of discretion” by the trial court, and the standard of review for typical grants of summary judgment, which are evaluated “de novo” by the appellate court. Green and Sanders hint that the expert witness decisions, which they see as mainly sufficiency judgments, may not be appropriate for the non-searching “abuse of discretion” standard. See id. at 40-41 (citing the asymmetric “hard look” approach taken in In re Paoli RR Yard PCB Litig., 35 F.3d 717, 749-5- (3d Cir. 1994), and in the intermediate appellate court in Joiner itself). Of course, the Supreme Court’s decision in Joiner was an abandonment of something akin to de novo hard-look appellate review, lopsidedly applied to exclusions only. Decisions to admit did not lead to summary dispositions without trial and thus were never given any meaningful appellate review.

Elsewhere, Green and Sanders note that they do not necessarily share the doubts of the “hand wringers” over the inconsistent exclusionary rulings that result from an abuse of discretion standard. At the end of their article, however, the authors note that viewing expert witness opinion exclusions as “sufficiency determinations” raises the question whether appellate courts should review these determinations de novo, as they would review ordinary factual “no evidence” or “insufficient evidence” grants of summary judgment. Id. at 40. There are reasonable arguments both ways, but it is worth pointing out that appellate decisions affirming rulings going both ways on the same expert witnesses, opining about the same litigated causal issue, are different from jury verdicts going both ways on causation. First, the reasoning of the courts is, we hope, set out for public consumption, discussion, and debate, in a way that a jury’s deliberations are not. Second, the fact of decisions “going both ways” is a statement that the courts view the issue as close and subject to debate. Third, if the scientific and legal communities are paying attention, as they should, they can weigh in on the disparity, and on the stated reasons. Assuming that courts are amenable to good reasons, they may have the opportunity to revisit the issue in a way that juries, which serve only once on the causal issue, can never do. We might hope that the better reasoned decisions, especially those that were supported by the disinterested scientific community, would have some persuasive authority.

9. Abridgment of Rule 702’s approach to gatekeeping. The authors’ approach to sufficiency also suffers from ignoring not only Rule 703’s inquiry into the reasonableness of reliance upon individual studies, but also Rule 702(c) and (d), which require that:

(c) the testimony is the product of reliable principles and methods; and

(d) the expert has reliably applied the principles and methods to the facts of the case.

These subsections of Rule 702 do not readily allow the use of proxy or substitute measures of validity or reliability; they require the trial court to assess the expert witnesses’ reasoning from data to conclusions. In large part, Green and Sanders have been misled by the instincts of courts to retreat to proxies for validity in the form of “general acceptance,” “peer review,” and contrary evidence that makes the challenged opinion appear “insubstantial.”

There is a substantial danger that Green and Sanders’ reductionist approach, and their equation of admissibility with sufficiency, will undermine trial courts’ willingness to assess the more demanding, and time-consuming, validity claims that are inherent in all expert witness causation opinions.

10. Weight-of-the evidence (WOE) reasoning. The authors appear captivated by the use of so-called weight-of-the evidence (WOE) reasoning, questionably featured in some recent judicial decisions. The so-called WOE method is really not much of a method at all, but rather a hand-waving process that often excuses the poverty of data and valid analysis. See, e.g., Douglas L. Weed, “Weight of Evidence: A Review of Concept and Methods,” 25 Risk Analysis 1545 (2005) (noting the vague, ambiguous, indefinite nature of the concept of “weight of evidence” review). See also Schachtman, “Milward — Unhinging the Courthouse Door to Dubious Scientific Evidence” (Sept. 2, 2011).

In Allen v. Pennsylvania Engineering Corp., 102 F.3d 194 (5th Cir. 1996), the appellate court disparaged WOE as a regulatory tool for making precautionary judgments, not fit for civil litigation that involves actual causation as opposed to “as if” judgments. Green and Sanders pejoratively label the Allen court’s approach as “silly”:

“The idea that a regulatory agency would make a carcinogenicity determination if it were not the best explanation of the evidence, i.e., more likely than not, is silly.”

Id. at 29 n.82 (emphasis added). But silliness is as silliness does. Only a few pages later in their paper, Green and Sanders admit that:

“As some courts have noted, the regulatory threshold is lower than required in tort claims. With respect to the decision of the FDA to withdraw approval of Parlodel, the court in Glastetter v. Novartis Pharmaceuticals Corp., 107 F. Supp. 2d 1015 (E.D. Mo. 2000), judgment aff’d, 252 F.3d 986 (8th Cir. 2001), commented that the FDA’s withdrawal statement, “does not establish that the FDA had concluded that bromocriptine can cause an ICH [intracerebral hemorrhage]; instead, it indicates that in light of the limited social utility of bromocriptine in treating lactation and the reports of possible adverse effects, the drug should no longer be used for that purpose. For these reasons, the court does not believe that the FDA statement alone establishes the reliability of plaintiffs’ experts’ causation testimony.” Glastetter v. Novartis Pharmaceuticals Corp., 107 F. Supp. 2d 1015 (E.D. Mo. 2000), aff’d, 252 F.3d 986 (8th Cir. 2001).”

Id. at 34 n.101. Not only do the authors appear to contradict themselves on the burden of persuasion for regulatory decisions, they offer no support for their silliness indictment. Certainly, regulatory decisions, and not only the FDA’s, are frequently based upon precautionary principles that involve applying uncertain, ambiguous, or confusing data analyses to the process of formulating protective rules and regulations in the absence of scientific knowledge. Unlike regulatory agencies, which operate under the Administrative Procedure Act, federal courts, and many state courts, operate under the requirements of Rules 702 and 703 that expert witness opinion have the epistemic warrant of “knowledge,” not hunch, conjecture, or speculation.
