TORTINI

For your delectation and delight, desultory dicta on the law of delicts.

Meta-Meta-Analysis – Celebrex Litigation – The Claims – Part One

June 21st, 2012

In the Celebrex/Bextra litigation, both sides acknowledged the general acceptance and validity of meta-analysis, for both observational studies and clinical trials, but attacked the other side’s witnesses’ meta-analyses on grounds specific to how they were conducted.  See, e.g., Pfizer Defendants’ Motion to Exclude Certain Plaintiffs’ Experts’ Causation Opinion Regarding Celebrex – Memorandum of Points and Authorities in Support Thereof at 14, 16 (describing meta-analysis as “appropriate” and a “useful way to evaluate the presence and consistency of an effect,” and “a valid technique for analyzing the results of both randomized clinical trials and observational studies”)(dated July 20, 2007), submitted in MDL 1699, In re Bextra and Celebrex Marketing Sales Practices & Prod. Liab. Litig., Case No. 05-CV-01699 CRB (N.D. Calif.) [hereafter MDL 1699]; Plaintiffs’ Memorandum of Law in Support of Their Motion to Exclude Expert Testimony by Defendants’ Expert Dr. Lee-Jen Wei at 2 (July 23, 2009) (“While use of a properly conducted meta-analysis is appropriate, there are underlying scientific principles and techniques to be used in meta-analysis that are widely accepted among biostatisticians and epidemiologists. Wei’s meta-analysis – which he acknowledges is based in part on an admittedly novel approach that is not generally recognized by the scientific community – fails to follow certain of these key principles.”), submitted in In re Pfizer, Inc. Securities Litig., Nos. 04 Civ. 9866(LTS)(JLC), 05 md 1688(LTS) (S.D.N.Y.)[hereafter Securities Litig.]

The plaintiffs and defendants expended a great deal of energy in attacking the other side’s meta-analyses as conducted.  With all the briefing in the federal MDL, the New York state cases, and the securities fraud class action, hundreds of pages were written on the suspected flaws in meta-analyses.  The courts, in both the products liability MDL cases and in the securities case, denied the challenges in a few sentences.  Indeed, it is difficult if not impossible to discern what the challenges were from reading the courts’ decisions. In re Pfizer Inc. Securities Litig., 2010 WL 1047618 (S.D.N.Y. 2010); In re Bextra and Celebrex, 2008 N.Y. Misc. LEXIS 720; 239 N.Y.L.J. 27(2008); In re Bextra and Celebrex Marketing Sales Practices and Product Liability Litig., MDL No. 1699, 524 F.Supp. 2d 1166 (N.D. Calif. 2007)

Although the issues shifted some over the course of these litigations, certain important themes recurred.  The plaintiffs focused their attack upon the meta-analyses conducted by defense expert witness, Lee-Jen Wei, a professor of biostatistics at the Harvard School of Public Health.

The plaintiffs maintained that Professor Wei’s meta-analyses should be excluded under Rule 702, or the New York case law, because of

  • inclusion of short-term clinical trials
  • failure to weight risk ratios by person years
  • inclusion of zero-event trials with use of imputation methods
  • use of risk difference instead of risk ratios
  • use of exact confidence intervals instead of estimated intervals

See generally Plaintiffs’ Memorandum of Law in Support of Their Motion to Exclude Expert Testimony by Defendants’ Expert Dr. Lee-Jen Wei (July 23, 2009), in Securities Litig.

The plaintiffs advanced meta-analyses conducted by Professor David Madigan, Professor and Chair in the Department of Statistics, Columbia University.  The essence of the defendants’ challenges revolved around claims of flawed outcome and endpoint ascertainment and definitions:

  • invalid clinical endpoints
  • flawed data collection procedures
  • ad hoc changes in procedure and methods
  • novel methodologies “never used in the history of clinical research”
  • lack of documentation for classifying events
  • absence of expert clinical judgment in classifying event for inclusion in meta-analysis
  • creation of composite endpoints that included events unrelated to plaintiffs’ theory of thrombotic mechanism
  • lack of blinding to medication use when categorizing events
  • failure to adjust for multiple comparisons in meta-analyses

See generally Pfizer Defendants’ Motion to Exclude Certain Plaintiffs’ Experts’ Causation Opinion Regarding Celebrex – Memorandum of Points and Authorities in Support Thereof (dated July 20, 2007), in MDL 1699; Pfizer defendants’ [Proposed] Findings of Fact and Conclusions of Law with Respect to Motion to Exclude Certain Plaintiffs’ Experts’ Opinions Regarding Celebrex and Bextra, and Plaintiffs’ Motion to Exclude Defendants’ Expert Dr. Lee-Jen Wei, Document 175, submitted in Securities Litig. (Dec. 4, 2009).

Why did the three judges involved (Judge Breyer in the federal MDL; Justice Kornreich in the New York state cases; and Judge Swain in the federal securities putative class action) give such cursory attention to these Rule 702/Frye challenges?  The complexity of the issues, the lack of clarity in the lawyers’ briefings, and the stridency of both sides perhaps contributed to shorten judicial attention span.  Some of the claims were simply untenable, and may have obliterated more telling critiques.

ZERO-EVENT TRIALS

Many of the Celebrex parties’ claims can be traced to a broader issue of what to include or exclude in a meta-analysis.  Consider for instance the plaintiffs’ challenge to Wei’s meta-analysis.  The plaintiffs faulted Wei for including short-term clinical trials in his meta-analysis, while sponsoring their own expert witness testimony that Celebrex could induce heart attack or stroke after first ingestion of the medication.  Having made the claim, the plaintiffs were hard pressed to exclude short-term trials, other than to argue that such trials frequently had zero adverse events in either the medication or placebo arms.  Many meta-analytic methods, which treat each included study as a 2 x 2 contingency table, and calculate an odds ratio for each table, cannot accommodate zero event data.

Whether or not hard pressed, the plaintiffs made the claim. The plaintiffs analogized to the lack of reliability of underpowered clinical trials to provide evidence of safety.  See Plaintiffs’ Reply Memorandum of Law in Further Support of Their Motion to Exclude Expert Testimony by Defendants’ Expert Dr. Lee-Jen Wei at 6 (May 5, 2010), in Securities Litig. (citing In re Neurontin Mktg., Sales Practices, and Prod. Liab. Litig., 612 F. Supp. 2d 116, 141 (D. Mass. 2009) (noting that many of Pfizer’s studies were “underpowered” to detect the alleged connection between Neurontin and suicide)).  The power argument, however, does not make sense in the context of a meta-analysis, which aggregates data across studies to overcome the alleged lack of power in a single study.
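The point about aggregation can be illustrated with a minimal sketch, assuming a simple fixed-effect, inverse-variance pooling of log odds ratios. The three trials, their estimates, and their standard errors are hypothetical numbers chosen for illustration; they are not data from the Celebrex litigation, and the sketch is not a reconstruction of either side's analysis.

```python
import math

def pool_fixed_effect(estimates, std_errors):
    """Inverse-variance (fixed-effect) pooling of per-study estimates."""
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se

# Hypothetical log odds ratios and standard errors for three small trials.
log_ors = [0.40, 0.25, 0.35]
ses = [0.50, 0.45, 0.60]

pooled, pooled_se = pool_fixed_effect(log_ors, ses)
print(f"smallest single-trial SE: {min(ses):.3f}")
print(f"pooled SE:                {pooled_se:.3f}")
```

Under these assumptions the pooled standard error is smaller than that of any single trial, which is why a complaint about the power of individual trials loses much of its force once the trials are combined.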

Not surprisingly, clinical trials of a non-cardiac medication will often report no event of the outcome of interest, such as heart attack.  These trials are referred to as “zero-event” trials; the absence of events can occur in one or both arms of a given trial.  Some researchers exclude these studies from a meta-analysis because of the impossibility of calculating an odds ratio without using imputation in the zero cells of the 2 x 2 tables. Although there are methods to address zero-event trials, some researchers believe that the existence of several zero-event trials essentially means that the sparse data from rare outcomes deprives statistical tests of their usual meaning.  Traditional statistical standards of significance (p < 0.05) are described as “tenuous,” and too high, in this situation. A.V. Hernandez, E. Walker, J. P. Ioannidis, M.W. Kattan, “Challenges in meta-analysis of randomized clinical trials for rare harmful cardiovascular events: the case of rosiglitazone,” 156 Am. Heart J. 23, 28 (2008).

The exclusion of zero-event trials from meta-analyses of rare outcomes can yield biased results. See generally M.J. Bradburn, J.J. Deeks, J.A. Berlin, and A. Russell Localio, “Much ado about nothing: a comparison of the performance of meta-analytical methods with rare events,” 26 Statistics in Med. 53 (2007); M.J. Sweeting, A.J. Sutton, and P.C. Lambert, “What to add to nothing? Use and avoidance of continuity corrections in meta-analysis of sparse data,” 23 Statistics in Med. 1351 (2004) (erratum at 25 Statistics in Med. 2700 (2006)) (“Many routinely used summary methods provide widely ranging estimates when applied to sparse data with high imbalance between the size of the studies’ arms. A sensitivity analysis using several methods and continuity correction factors is advocated for routine practice.”).

Other researchers include zero-event trials as providing helpful information about the absence of risk. Zero-event trials:

“provide relevant data by showing that event rates for both the intervention and control groups are low and relatively equal. Excluding such trial data potentially increases the risk of inflating the magnitude of the pooled treatment effect.”

J.O. Friedrich, N.K. Adhikari, J. Beyene, “Inclusion of zero total event trials in meta-analyses maintains analytic consistency and incorporates all available data,” 5 BMC Med. Res. Methodol. 2 (2007)[cited as Friedrich].  Zero event trials can be included in meta-analyses by using something called a standard “continuity correction,” which involves imputing events, or fractional events, in all cells of the 2 x 2 table. In one approach, the zero is replaced with 0.5 and all other numbers are increased by 0.5. Friedrich at 7.
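The 0.5 correction just described can be sketched in a few lines. The function name, the cell labels (a, b for events and non-events on the medication; c, d for events and non-events on placebo), and the trial counts are hypothetical and used only for illustration; the sketch is not Professor Wei's code.

```python
import math

def log_odds_ratio(a, b, c, d, k=0.0):
    """Return (log odds ratio, standard error) after adding k to every cell.

    Returns None when any cell is still zero, because neither the log odds
    ratio nor its usual standard error can then be computed.
    """
    a, b, c, d = (x + k for x in (a, b, c, d))
    if min(a, b, c, d) == 0:
        return None
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return log_or, se

# Hypothetical trial: 0 events among 100 on the drug, 2 among 100 on placebo.
uncorrected = log_odds_ratio(0, 100, 2, 98)        # no usable estimate
corrected = log_odds_ratio(0, 100, 2, 98, k=0.5)   # 0.5 added to all four cells

# Hypothetical trial with no events in either arm.
double_zero = log_odds_ratio(0, 100, 0, 100, k=0.5)
```

With equal arm sizes, the corrected double-zero trial has an odds ratio of exactly 1.0, so including it pulls a pooled estimate toward the null, consistent with the passage quoted from Friedrich above.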

After examining the bias in several meta-analyses from excluding zero-event trials, Friedrich and colleagues recommended:

“We believe these trials [with zero events] should also be included if RR [relative risks] or OR [odds ratios] are the effect measures to provide a more conservative estimate of effect size (even if this change in effect size is very small for RR and OR), and to provide analytic consistency and include the same number of trials in the meta-analysis, regardless of the summary effect measure used. Inclusion of zero total event trials would enable the inclusion of all available randomized controlled data in a meta-analysis, thereby providing the most generalizable estimate of treatment effect.”

Friedrich at 5-6.

Wei addressed the problem of zero-event trials by using common imputation methods, not so different from what plaintiffs’ expert witness Dr. Ix used in the gadolinium litigation. See Meta-Meta-Analysis — The Gadolinium MDL — More Than Ix’se Dixit.  Given that plaintiffs advanced a mechanistic theory, which would explain cardiovascular thrombotic events almost immediately upon first ingestion of Celebrex, Professor Wei’s attempt to save the data inherent in zero-event trials by “continuity correction” or imputation methods seems reasonable and well within meta-analytic procedures.

 

RISK DIFFERENCE

Professor Wei did not limit himself to a single method or approach.  In addition to using imputation methods, Wei used risk difference, rather than risk ratios, as the parameter of interest.  The risk difference is simply the difference between two risks: the risk or probability of an event in one group less the risk or probability of that event in another group.  Contrary to the plaintiffs’ claims, there is nothing novel or subversive about conducting a meta-analysis with the risk difference as the parameter of interest, rather than a risk ratio.  In the context of randomized clinical trials, the risk difference is expected as a measure of absolute effect.  See generally, Michael Borenstein, L. V. Hedges, J. P. T. Higgins, and H. R. Rothstein, Introduction to Meta-Analysis (2009); Julian PT Higgins and Sally Green, eds., Cochrane Handbook for Systematic Reviews of Interventions (2008)

Like risk ratios, the risk difference yields a calculated confidence interval at any desired coefficient of confidence.  Confidence intervals for dichotomous events are often based upon approximate methods that build upon the normal approximation to the binomial distribution.  These approximate methods require assumptions of sample size that may not be met in cases involving sparse data.  With modern computers, calculating exact confidence intervals is not particularly difficult, and Professor Wei has published a methods paper in which he explains the desirability of using the risk difference with exact intervals in addressing meta-analyses of sparse data, such as was involved in the Celebrex litigation.  See L. Tian, T. Cai, M.A. Pfeffer, N. Piankov, P.Y. Cremieux, and L.J. Wei, “Exact and efficient inference procedure for meta-analysis and its application to the analysis of independent 2 x 2 tables with all available data but without artificial continuity correction,” 10 Biostatistics 275 (2009).
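A minimal sketch shows why the normal approximation struggles with sparse data. It computes a single-trial risk difference with the familiar approximate (Wald) interval; it does not implement the exact procedure of the Tian paper, and the trial counts are hypothetical.

```python
import math

def risk_difference_wald(events_t, n_t, events_c, n_c, z=1.96):
    """Risk difference with a normal-approximation (Wald) confidence interval."""
    p_t, p_c = events_t / n_t, events_c / n_c
    rd = p_t - p_c
    se = math.sqrt(p_t * (1 - p_t) / n_t + p_c * (1 - p_c) / n_c)
    return rd, (rd - z * se, rd + z * se)

# Hypothetical trial with plenty of events: the approximation behaves sensibly.
ordinary = risk_difference_wald(30, 200, 20, 200)

# Hypothetical zero-event trial: the estimated standard error is zero,
# so the approximate interval collapses to a single point.
zero_event = risk_difference_wald(0, 200, 0, 200)

print(ordinary)
print(zero_event)
```

Unlike the odds ratio, the risk difference is defined for a zero-event trial without any imputation, but the approximate interval around it has zero width and so claims a certainty that 200 patients per arm cannot support. That is the kind of gap an exact method is meant to fill.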

Plaintiffs attacked Wei’s approach as “novel” and not generally accepted.  Judge Swain appropriately dismissed this attack:

“Dr. Wei’s methodology, the validity of which Plaintiffs contest and the novelty of which Plaintiffs seek to highlight, appears to have survived the rigors of peer review at least once, and is subject to critique by virtue of its transparency. Dr. Wei’s report, supplemented by his declaration, is sufficient to meet Defendants’ burden of demonstrating that his testimony is the product of reliable principles and methods. He has explained his methods, which can be tested. Plaintiffs’ critiques of Dr. Wei’s choices regarding which trials to include in his own meta-analysis, the origins of the data he used, the date at which he undertook his meta-analysis, and at whose behest he performed his analysis all go to the weight of Dr. Wei’s testimony.”

In re Pfizer Inc. Securities Litig., 2010 WL 1047618, *7 (S.D.N.Y. 2010).  The approach taken by Wei is novel only in the sense that researchers have not previously tried to push the methodological envelope of meta-analysis to deploy the technique for rare outcomes and sparse data, with many zero-event trials.  The risk difference approach is well suited to the situation, and the use of exact confidence intervals is hardly novel or dubious.

The Cherry-Picking Fallacy in Synthesizing Evidence

June 15th, 2012

What could be wrong with picking cherries?  At the end of the process you have cherries, and if you do it right, you have all ripe, and no rotten, cherries.  Your collection of ripe cherries, however, will be unrepresentative of the universe of cherries, but at least we understand how and why your cherries were selected.

Elite colleges pick the best high school students; leading law schools pick the top college students; and top law firms and federal judges cherry pick the best students of the best law schools.  Lawyers are all-too-comfortable with “cherry picking.”  Of course, the cherry-picking process here has at least some objective criteria, which can be stated in advance of the selection.

In litigation, each side is expected to “cherry pick” the favorable evidence, and ignore or flyblow the contrary evidence.  Judges are thus often complacent about selectivity in the presentation of evidence by parties and their witnesses.  In science, this kind of adversarial selectivity is a sure way to inject bias and subjectivity into claims of knowledge.  The development of the systematic review, in large measure, has been supported by the widespread recognition that studies cannot be evaluated with post hoc, subjective evaluative criteria. Cynthia D. Mulrow, Deborah J. Cook, Frank Davidoff, “Systematic Reviews: Critical Links in the Great Chain of Evidence,” 126 Ann. Intern. Med. 389 (1997)

The Internet Encyclopedia of Philosophy describes “cherry picking” as a fallacy, “a kind of error in reasoning.”  Cherry-picking the evidence, also known as “suppressed evidence,” is:

“[i]ntentionally failing to use information suspected of being relevant and significant is committing the fallacy of suppressed evidence. This fallacy usually occurs when the information counts against one’s own conclusion. * * * If the relevant information is not intentionally suppressed but rather inadvertently overlooked, the fallacy of suppressed evidence also is said to occur, although the fallacy’s name is misleading in this case.”

Bradley Dowden, “Suppressed Evidence,” Internet Encyclopedia of Philosophy (Last updated: December 31, 2010).

Cherry picking is a main rhetorical device for the litigator, and many judges simply do not understand what is so wrong with each side’s selection of the studies that it wishes to emphasize.  Whatever the acceptability of lawyers’ cherry picking in the presentation of evidence, it is antithetical to scientific methodology.  “Cherry picking (fallacy),” Wikipedia (describing cherry picking as the pointing to data that appears to confirm one’s opinion, while ignoring contradictory data)[last visited on June 14, 2012]

Given the pejorative connotations of “cherry picking,” no one should be very surprised that lawyers and judges couch their Rule 702 arguments and opinions in terms of whether expert witnesses engaged in this fruitful behavior.  Although I had heard plaintiffs’ and defendants’ counsel use the phrase, I only recently came across it in a judicial opinion.  Since the phrase nicely describes a fallacious form of reasoning, I thought it would be helpful to collect pertinent cases that describe the fallaciousness of fruit-pickin’ expert witness testimony.

United States Court of Appeals

Barber v. United Airlines, Inc., 17 Fed.Appx. 433, 437 (7th Cir. 2001) (affirming exclusion of “cherry-picking” expert witness who failed to explain why he ignored certain data while accepting others)

District Courts

Dwyer v. Sec’y of Health & Human Servs., No. 03-1202V, 2010 WL 892250 (Fed. Cl. Spec. Mstr. Mar. 12, 2010)(recommending rejection of thimerosal autism claim)(“In general, respondent’s experts provided more responsive answers to such questions.  Respondent’s experts were generally more careful and nuanced in their expert reports and testimony. In contrast, petitioners’ experts were more likely to offer opinions that exceeded their areas of expertise, to “cherry-pick” data from articles that were otherwise unsupportive of their position, or to draw conclusions unsupported by the data cited… .”)

In re Bausch & Lomb, Inc., 2009 WL 2750462 at 13 (D. S.C. 2009)( “Dr. Cohen did not address [four contradictory] studies in her expert reports or affidavit, and did not include them on her literature reviewed list [. . .] This failure to address this contrary data renders plaintiffs’ theory inherently unreliable.”)

Rimbert v. Eli Lilly & Co., No. 06-0874, 2009 WL 2208570, *19 (D.N.M. July 21, 2009) (“Even more damaging . . . is her failure to grapple with any of the myriad epidemiological studies that refute her conclusion.”), aff’d, 647 F.3d 1247 (10th Cir. 2011) (affirming exclusion but remanding to permit plaintiff to find a new expert witness)

In re Bextra & Celebrex Prod. Liab. Litig., 524 F. Supp.2d 1166, 1176, 1179, 1181, 1184 (N.D. Cal. 2007) (criticizing plaintiffs’ expert witnesses for “cherry-picking studies”); id. at 1181 (“these experts ignore the great weight of the observational studies that contradict their conclusion and rely on the handful that appear to support their litigation-created opinion.”)

LeClerq v. Lockformer Co., No. 00 C 7164, 2005 U.S. Dist. LEXIS 7602, at *15 (N.D. Ill. Apr. 28, 2005) (holding that expert witness’s “cherry-pick[ing] the facts he considered to render his opinion, and such selective use of facts fail[s] to satisfy the scientific method and Daubert.”)(internal citations and quotations omitted)

Holden Metal & Aluminum Works v. Wismarq Corp., No. 00 C 0191, 2003 WL 1797844, at *2 (N.D. Ill. Apr. 2, 2003).

State Courts

Betz v. Pneumo Abex LLC, 2012 WL 1860853, *16 (May 23, 2012 Pa. S. Ct.)(“According to Appellants, moreover, the pathologist’s self-admitted selectivity in his approach to the literature is decidedly inconsistent with the scientific method. Accord Brief for Amici Scientists at 17 n.2 (“‘Cherry picking’ the literature is also a departure from ‘accepted procedure’.”)).

George v. Vermont League of Cities and Towns, 2010 Vt. 1, 993 A.2d 367, 398 (Vt. 2010)(expressing concern about how and why plaintiff’s expert witnesses selected some studies to include in their “weight of evidence” methodology.  Without an adequate explanation of selection and weighting criteria, the choices seemed arbitrary)

Scaife v. AstraZeneca LP, 2009 WL 1610575 at 8 (Del. Super. 2009) (“Simply stated, the expert cannot accept some but reject other data from the medical literature without explaining the bases for her acceptance or rejection.”)

In re Bextra & Celebrex, 2008 N.Y. Misc. LEXIS 720, *20, 239 N.Y.L.J. 27 (2008) (holding that New York’s Frye rule requires proponent to show that its expert witness had “look[ed] at the totality of the evidence and [did] not ignore contrary data.”); see also id. at *36 (“Moreover, out of 32 studies (29 published) cited by defendants, plaintiffs chose only 8 to plead their case.  This smacks of ‘cherry-picking,’ skewing their analysis by only looking at the helpful studies. Such practice contradicts the accepted method for an expert’s analysis of epidemiological data.”)

Bowen v. E.I. DuPont de Nemours & Co., 906 A.2d 787, 797 (Del. 2006) (noting that expert witnesses cannot ignore studies contrary to their opinions)

Selig v. Pfizer, Inc., 185 Misc. 2d 600, 607, 713 N.Y.S.2d 898 (Sup. Ct. N.Y. Cty. 2000) (holding that expert witness failed to satisfy Frye test’s requirement of following an accepted methodology when he ignored studies contrary to his opinion), aff’d, 290 A.D.2d 319, 735 N.Y.S.2d 549 (1st Dep’t 2002)

******************

Most, but not all, of the caselaw recognizes that it is fallacious for an expert witness to engage in ad hoc selectivity in choosing the studies upon which to rely.  In the following cases, the cherry-picking was identified, but acquiesced in by judges.

McClellan v. I-Flow Corp., 710 F. Supp. 2d 1092, 1114 (D. Ore. 2010)(discussing cherry picking but rejecting “document by document” review)(“Finally, defendants contend that plaintiffs’ experts employ unreliable methodologies by ‘cherry-picking’ facts from certain studies and asserting reliance on the ‘totality’ or ‘global gestalt of medical evidence’. Defendants argue that in  doing so, plaintiffs’ experts fail to ‘painstakingly’ link each piece of data to their conclusions or explain how the evidence supports their opinions.”)

United States v. Paracha, 2006 WL 12768 (S.D. N.Y. Jan. 3, 2006)(rejecting challenge to terrorism expert on grounds that he cherry picked evidence in conspiracy prosecution involving al Qaeda)

King v. Burlington No. Santa Fe Ry, ___N.W.2d___, 277 Neb. Reports 203, 234 (2009)(noting that the law does “not preclude a trial court from considering as part of its reliability inquiry whether an expert has cherry-picked a couple of supporting studies from an overwhelming contrary body of literature,” but ignoring the force of the fallacious expert witness testimony by noting that the questionable expert witness (Frank) had some studies that showed associations between exposure to diesel exhaust or benzene and multiple myeloma).

Another Confounder in Lung Cancer Occupational Epidemiology — Diesel Engine Fumes

June 13th, 2012

Researchers obviously need to be aware of, and control for, potential and known confounders.  In the context of investigating the etiologies of lung cancer, there is a long list of potential confounding exposures, often ignored in peer-reviewed papers, which focus on one particular outcome of interest.  Just last week, I wrote to emphasize the need to account for potential and known confounding agents, and how this need was particularly strong in studies of weak alleged carcinogens such as crystalline silica.  See Sorting Out Confounded Research – Required by Rule 702.  Yesterday, the World Health Organization (WHO) added another “known” confounder for lung cancer epidemiology:  diesel fume.

According to the International Agency for Research on Cancer (IARC), a division of the WHO, a working group of international experts voted to reclassify diesel engine exhaust as a “Group 1” carcinogen.  IARC: Diesel engine exhaust carcinogenic (2012).  This classification means, in IARC parlance, that “there is sufficient evidence of carcinogenicity in humans. Exceptionally, an agent may be placed in this category when evidence of carcinogenicity in humans is less than sufficient but there is sufficient evidence of carcinogenicity in experimental animals and strong evidence in exposed humans that the agent acts through a relevant mechanism of carcinogenicity.”  The Group was headed up by Dr. Christopher Portier, who is the director of the National Center for Environmental Health and the Agency for Toxic Substances and Disease Registry at the Centers for Disease Control and Prevention.  Id.

The reclassification removes diesel exhaust from its previous categorization as a Group 2A carcinogen, which is interpreted “as probably carcinogenic to humans.”  Diesel exhaust has been on a high-priority list for re-evaluation since 1998, as a result of epidemiologic research from many countries.  The Working Group specifically found that there was sufficient evidence to conclude that diesel exhaust is a cause of lung cancer in humans, and limited evidence to support an association with bladder cancer.  The Group rejected any change in classification of gasoline engine exhaust from its current IARC rating as “possibly carcinogenic to humans” (Group 2B).

Unlike other IARC Working Group decisions (such as crystalline silica), which were weakened by close votes and significant dissents, the diesel Group’s conclusion was unanimous.  The diesel Group appeared to be impressed by two recent studies of lung cancer in underground miners, released in March 2012.  One study was in a large cohort, conducted by NIOSH, and the other was a nested case-control study, conducted by the National Cancer Institute (NCI).  See Debra T. Silverman, Claudine M. Samanic, Jay H. Lubin, Aaron E. Blair, Patricia A. Stewart, Roel Vermeulen, Joseph B. Coble, Nathaniel Rothman, Patricia L. Schleiff, William D. Travis, Regina G. Ziegler, Sholom Wacholder, Michael D. Attfield, “The Diesel Exhaust in Miners Study: A Nested Case-Control Study of Lung Cancer and Diesel Exhaust,” J. Nat’l Cancer Instit. (2012)(in press and open access); and Michael D. Attfield, Patricia L. Schleiff, Jay H. Lubin, Aaron Blair, Patricia A. Stewart, Roel Vermeulen, Joseph B. Coble, and Debra T. Silverman, “The Diesel Exhaust in Miners Study: A Cohort Mortality Study With Emphasis on Lung Cancer,” J. Nat’l Cancer Instit. (2012)(in press).

According to a story in the New York Times, the IARC Working Group described diesel engine exhaust as “more carcinogenic than secondhand cigarette smoke.”  Donald McNeil, “W.H.O. Declares Diesel Fumes Cause Lung Cancer,” N.Y. Times (June 12, 2012).  The Times also quoted Dr. Debra Silverman, NCI chief of environmental epidemiology, at length.  Dr. Silverman, who was the lead author of the nested case-control study cited by the IARC Press Release, noted that her large study showed that long-term heavy exposure to diesel fumes increased lung cancer risk sevenfold. Dr. Silverman described this risk as much greater than that thought to be created by passive smoking, but much smaller than smoking two packs of cigarettes a day.  She stated that she “totally” supported the IARC reclassification, and that she believed that governmental agencies would use the IARC analysis as the basis for changing the regulatory classification of diesel exhaust.

Silverman’s nested case-control study appears to have been based upon careful diesel exhaust exposure information, as well as smoking histories.  The study also searched and analyzed for other potential confounders, which might be expected to be involved in underground mining:

“Other potential confounders [ie, duration of cigar smoking; frequency of pipe smoking; environmental tobacco smoke; family history of lung cancer in a first-degree relative; education; body mass index based on usual adult weight and height; leisure time physical activity; diet; estimated cumulative exposure to radon, asbestos, silica, polycyclic aromatic hydrocarbons (PAHs) from non-diesel sources, and respirable dust in the study facility based on air measurement and other data (14)] were evaluated but not included in the final models because they had little or no impact on odds ratios (ie, inclusion of these factors in the final models changed point estimates for diesel exposure by ≤ 10%).”

Silverman, et al., at 4.  The absence of an association between lung cancer and silica exposure is noteworthy in such a large study of underground miners.
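The screening rule quoted from the Silverman paper, under which a covariate is dropped when it moves the point estimate by no more than 10%, amounts to a simple change-in-estimate check. The sketch below is an illustration of that arithmetic only; the function name and the odds ratios are hypothetical, and it assumes the change is measured relative to the estimate from the model without the covariate.

```python
def changes_estimate(base_or, adjusted_or, threshold=0.10):
    """True if adding a covariate moves the odds ratio by more than the threshold.

    base_or:     odds ratio from the model without the candidate covariate
    adjusted_or: odds ratio after the candidate covariate is added
    """
    return abs(adjusted_or - base_or) / base_or > threshold

# Hypothetical example: adjustment moves the odds ratio from 3.00 to 2.85 (5%).
keep_first = changes_estimate(3.00, 2.85)

# Hypothetical example: adjustment moves the odds ratio from 3.00 to 2.40 (20%).
keep_second = changes_estimate(3.00, 2.40)
```

On these made-up numbers the first covariate would be left out of the final model and the second retained.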

Meta-Meta-Analysis — The Gadolinium MDL — More Than Ix’se Dixit

June 8th, 2012

There is a tendency, for better or worse, for legal bloggers to be partisan cheerleaders over litigation outcomes.  I admit that most often I am dismayed by judicial failures or refusals to exclude dubious plaintiffs’ expert witnesses’ opinion testimony, and I have been known to criticize such decisions.  Indeed, I wouldn’t mind seeing courts exclude dubious defendants’ expert witnesses.  I have written approvingly about cases in which judges have courageously engaged with difficult scientific issues, seen through the smoke screen, and properly assessed the validity of the opinions expressed.  The Gadolinium MDL (No. 1909) Daubert motions and decision offer a fascinating case study of a challenge to an expert witness’s meta-analysis, an effective defense of the meta-analysis, and a judicial decision to admit the testimony, based upon the meta-analysis.  In re Gadolinium-Based Contrast Agents Prods. Liab. Litig., 2010 WL 1796334 (N.D. Ohio May 4, 2010) [hereafter Gadolinium], reconsideration denied, 2010 WL 5173568 (June 18, 2010).

Plaintiffs proffered general causation opinions (between gadolinium contrast media and Nephrogenic Systemic Fibrosis (“NSF”)), by a nephrologist, Joachim H. Ix, M.D., with training in epidemiology.  Dr. Ix’s opinions were based in large part upon a meta-analysis he conducted on data in published observational studies.  Judge Dan Aaron Polster, the MDL judge, itemized the defendant’s challenges to Dr. Ix’s proposed testimony:

“The previously-used procedures GEHC takes issue with are:

(1) the failure to consult with experts about which studies to include;

(2) the failure to independently verify which studies to select for the meta-analysis;

(3) using retrospective and non-randomized studies;

(4) relying on studies with wide confidence intervals; and

(5) using a “more likely than not” standard for causation that would not pass scientific scrutiny.”

Gadolinium at *23.  Judge Polster confidently dispatched these challenges.  Dr. Ix, as a nephrologist, had subject-matter expertise with which to develop inclusionary and exclusionary criteria on his own.  The defendant never articulated what, if any, studies were inappropriately included or excluded.  The complaint that Dr. Ix had used retrospective and non-randomized studies also rang hollow in the absence of any showing that there were randomized clinical trials with pertinent data at hand.  Once a serious concern of nephrotoxicity arose, clinical trials were unethical, and the defendant never explained why observational studies were somehow inappropriate for inclusion in a meta-analysis.

Relying upon studies with wide confidence intervals can be problematic, but that is one of the reasons to conduct a meta-analysis, assuming the model assumptions for the meta-analysis can be verified.  The plaintiffs effectively relied upon a published meta-analysis, which pre-dated their expert witness’s litigation effort, in which the authors used less conservative inclusionary criteria, and reported a statistically significant summary estimate of risk, with an even wider confidence interval.  R. Agarwal, et al., “Gadolinium-based contrast agents and nephrogenic systemic fibrosis: a systematic review and meta-analysis,” 24 Nephrol. Dialysis & Transplantation 856 (2009).  As the plaintiffs noted in their opposition to the challenge to Dr. Ix:

“Furthermore, while GEHC criticizes Dr. Ix’s CI from his meta-analysis as being “wide” at (5.18864 and 25.326) it fails to share with the court that the peer-reviewed Agarwal meta-analysis, reported a wider CI of (10.27–69.44)… .”

Plaintiff’s Opposition to GE Healthcare’s Motion to Exclude the Opinion Testimony of Joachim Ix at 28 (Mar. 12, 2010)[hereafter Opposition].
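Because odds ratios are conventionally analyzed on the logarithmic scale, the "width" of the two intervals is better compared as a ratio of upper to lower bound than as a simple difference. The following is a minimal sketch of that comparison, using only the interval endpoints quoted above. It assumes, without confirmation from either report, that both intervals are ordinary 95 percent intervals symmetric on the log scale; the function name and the z value of 1.96 are illustrative choices.

```python
import math

def log_scale_summary(lower, upper, z=1.96):
    """Summarize a ratio-measure confidence interval on the log scale.

    Assumes (hypothetically) a Wald-type interval, symmetric in log units.
    """
    center = math.sqrt(lower * upper)   # geometric midpoint of the bounds
    se_log = (math.log(upper) - math.log(lower)) / (2 * z)  # implied SE of log OR
    fold_width = upper / lower          # how many-fold the interval spans
    return center, se_log, fold_width

# Endpoints as quoted in the plaintiffs' Opposition
ix = log_scale_summary(5.18864, 25.326)
agarwal = log_scale_summary(10.27, 69.44)

for name, (center, se_log, fold) in (("Ix", ix), ("Agarwal", agarwal)):
    print(f"{name}: implied center {center:.2f}, "
          f"implied SE(log OR) {se_log:.3f}, {fold:.2f}-fold wide")
```

On these assumptions, the Agarwal interval spans more "folds" than Dr. Ix's interval, which is the sense in which it is wider, and both intervals lie entirely above 1.0.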

Wider confidence intervals certainly suggest greater levels of random error, but Dr. Ix’s intervals suggested statistical significance, and he had carefully considered statistical heterogeneity.  Opposition at 19. (Heterogeneity was never advanced by the defense as an attack on Dr. Ix’s meta-analysis).  Remarkably, the defendant never advanced a sensitivity analysis to suggest or to show that reasonable changes to the evidentiary dataset could result in loss of statistical significance, as might be expected from the large intervals.  Rather, the defendant relied upon the fact that Dr. Ix had published other meta-analyses in which the confidence interval was much narrower, and then claimed that he had “required” these narrower confidence intervals for his professional, published research.  Memorandum of Law of GE Healthcare’s Motion to Exclude Certain Testimony of Plaintiffs’ Generic Expert, Joachim H. Ix, MD, MAS, In re Gadolinium MDL No. 1909, Case: 1:08-gd-50000-DAP  Doc #: 668   (Filed Feb. 12, 2010)[hereafter Challenge].  There never was, however, a showing that narrower intervals were required for publication, and the existence of the published Agarwal meta-analysis contradicted the suggestion.

Interestingly, the defense did not call attention to Dr. Ix’s providing an incorrect definition of the confidence interval!  Here is how Dr. Ix described the confidence interval, in language quoted by plaintiffs in their Opposition:

“The horizontal lines display the “95% confidence interval” around this estimate. This 95% confidence interval reflects the range of odds ratios that would be observed 95 times if the study was repeated 100 times, thus the narrower these confidence intervals, the more precise the estimate.”

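Opposition at 20.  The confidence interval does not provide a probability distribution of the parameter of interest, and it does not describe the range within which the estimates from repeated studies would fall; rather, the intervals generated by repeating the study have a stated probability (here, 95 percent) of covering the hypothesized “true value” of the parameter.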
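The difference between the two readings can be illustrated with a small simulation. This is a sketch under simplifying assumptions that are mine, not Dr. Ix's: the estimate (think of a log odds ratio) is normally distributed around a fixed true value with a known standard error, and every name and parameter below is hypothetical.

```python
import random

def coverage_and_capture(true_value=0.0, se=1.0, reps=4000, z=1.96, seed=1):
    """Contrast two readings of a 95% confidence interval by simulation.

    coverage: share of intervals that contain the true value
              (the correct, long-run reading).
    capture:  share of intervals that contain the estimate from one
              independent repetition of the study (Dr. Ix's reading).
    """
    rng = random.Random(seed)
    covers = captures = 0
    for _ in range(reps):
        estimate = rng.gauss(true_value, se)    # the "original" study
        replicate = rng.gauss(true_value, se)   # one repetition of it
        lower, upper = estimate - z * se, estimate + z * se
        covers += lower <= true_value <= upper
        captures += lower <= replicate <= upper
    return covers / reps, captures / reps

coverage, capture = coverage_and_capture()
print(f"intervals covering the true value:      {coverage:.3f}")
print(f"intervals capturing a repeat estimate:  {capture:.3f}")
```

Under these assumptions the first proportion settles near 95 percent, while the second settles noticeably lower, at about 83 percent, because the repeat estimate and the original interval are both subject to sampling error.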

Finally, the defendant never showed any basis for suggesting that a scientific opinion on causation requires something more than a “more likely than not” basis.

Judge Polster also addressed some more serious challenges:

“Defendants contend that Dr. Ix’s testimony should also be excluded because the methodology he utilized for his generic expert report, along with varying from his normal practice, was unreliable. Specifically, Defendants assert that:

(1) Dr. Ix could not identify a source he relied upon to conduct his meta-analysis;

(2) Dr. Ix imputed data into the study;

(3) Dr. Ix failed to consider studies not reporting an association between GBCAs and NSF; and

(4) Dr. Ix ignored confounding factors.”

Gadolinium at *24.

IMPUTATION

The first point, above – the alleged failure to identify a source for conducting the meta-analysis – rings fairly hollow, and Judge Polster easily deflected it.  The second point raised a more interesting challenge.  In the words of defense counsel:

“However, in arriving at this estimate, Dr. Ix imputed, i.e., added, data into four of the five studies.  (See Sept. 22 Ix Dep. Tr. (Ex. 20), at 149:10-151:4.)  Specifically, Dr. Ix added a single case of NSF without antecedent GBCA exposure to the patient data in the underlying studies.

* * *

During his deposition, Dr. Ix could not provide any authority for his decision to impute the additional data into his litigation meta-analysis.  (See Sept. 22 Ix Dep. Tr. (Ex. 20), at 149:10-151:4.)  When pressed for any authority supporting his decision, Dr. Ix quipped that ‘this may be a good question to ask a Ph.D level biostatistician about whether there are methods to [calculate an odds ratio] without imputing a case [of NSF without antecedent GBCA exposure]’.”

Challenge at 12-13.

The deposition reference suggests that the examiner had scored a debating point by catching Dr. Ix unprepared, but by the time the parties briefed the challenge, the plaintiffs had the issue well in hand, citing A. W. F. Edwards, “The Measure of Association in a 2 × 2 Table,” 126 J. Royal Stat. Soc. Series A 109 (1963); R.L. Plackett, “The Continuity Correction in 2 x 2 Tables,” 51 Biometrika 327 (1964).  Opposition at 36 (describing the process of imputation in the event of zero counts in the cells of a 2 x 2 table for odds ratios).  There are qualms to be stated about imputation, but the defense failed to make them.  As a result, the challenge overall lost momentum and credibility.  As the trial court stated the matter:

“Next, there is no dispute that Dr. Ix imputed data into his meta-analysis. However, as Defendants acknowledge, there are valid scientific reasons to impute data into a study. Here, Dr. Ix had a valid basis for imputing data. As explained by Plaintiffs, Dr. Ix’s imputed data is an acceptable technique for avoiding the calculation of an infinite odds ratio that does not accurately measure association.7 Moreover, Dr. Ix chose the most conservative of the widely accepted approaches for imputing data.8 Therefore, Dr. Ix’s decision to impute data does not call into question the reliability of his meta-analysis.”

Gadolinium at *24.
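The arithmetic behind the imputation dispute is simple enough to sketch. The counts below are hypothetical and are not taken from any of the studies in Dr. Ix's meta-analysis; the two corrections shown (adding one unexposed case, and adding one half to every cell) are common conventions and are offered only as assumptions about what "widely accepted approaches" might include.

```python
import math

def odds_ratio(exp_cases, unexp_cases, exp_controls, unexp_controls):
    """Odds ratio for a 2 x 2 table; infinite when the denominator is zero.

    Assumes the numerator (exposed cases x unexposed controls) is positive.
    """
    denominator = unexp_cases * exp_controls
    if denominator == 0:
        return math.inf
    return (exp_cases * unexp_controls) / denominator

# Hypothetical table: every case was exposed, so one cell is zero
a, b, c, d = 12, 0, 30, 60   # exposed cases, unexposed cases, exposed/unexposed controls

raw = odds_ratio(a, b, c, d)                               # undefined (infinite)
one_case = odds_ratio(a, b + 1, c, d)                      # impute one unexposed case
half_cell = odds_ratio(a + 0.5, b + 0.5, c + 0.5, d + 0.5) # add 0.5 to every cell

print(raw, one_case, round(half_cell, 2))
```

With these made-up counts, imputing a whole unexposed case yields a smaller odds ratio than the half-cell correction, which is one sense in which adding a full case might be called the more conservative choice.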

FAILURE TO CONSIDER NULL STUDIES

The defense’s challenge included a claim that Dr. Ix had arbitrarily excluded studies in which there was no reported incidence of NSF. The defense brief unfortunately does not describe the studies excluded, and what, if any, effect their inclusion in the meta-analysis would have had.  This was, after all, the crucial issue. The abstract nature of the defense claim left the matter ripe for misrepresentation by the plaintiffs:

“GEHC continues to misunderstand the role of a meta-analysis and the need for studies that included patients both that did or did not receive GBCAs and reported on the incidence of NSF, despite Dr. Ix’s clear elucidation during his deposition. (Ix Depo. TR [Exh.1] at 97-98).  Meta-analyses such as performed by Dr. Ix and Dr. Agarwal search for whether or not there is a statistically valid association between exposure and disease event. In order to ascertain the relationship between the exposure and event one must have an event to evaluate. In other words, if you have a study in which the exposed group consists of 10,000 people that are exposed to GBCAs and none develop NSF, compared to a non-exposed group of 10,000 who were not exposed to GBCAs and did not develop NSF, the study provides no information about the association between GBCAs and NSF or the relative risk of developing NSF.”

Opposition at 37 – 38 (emphasis in original).  What is fascinating about this particular challenge, and the plaintiffs’ response, is the methodological hypocrisy exhibited.  In essence, the plaintiffs argued that imputation was appropriate in a case-control study, in which one cell contained a zero, but they would ignore a great deal of data in a cohort study with data.  To be sure, case-control studies are more efficient than cohort studies for identifying and assessing risk ratios for rare outcomes.  Nevertheless, the plaintiffs could easily have been hoisted with their own hypothetical petard.  No one in 10,000 gadolinium-exposed patients developed NSF; and no one in a control group did either.  The hypothetical study suggests that the rate of NSF is low and not different in the exposed and in the unexposed patients.  The risk ratio could be obtained by imputing an integer for the cells containing zero, and a confidence interval calculated.  The risk ratio, of course, would be 1.0.
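The plaintiffs' own hypothetical can be worked through in a few lines. The sketch below imputes a single notional case in each arm of the 10,000 versus 10,000 cohort and computes a risk ratio with a conventional log-scale (Wald) interval. The imputation convention, the function name, and the choice of interval are my assumptions for illustration, not anything found in the briefs.

```python
import math

def imputed_risk_ratio(events_exposed, n_exposed,
                       events_unexposed, n_unexposed,
                       imputed=1, z=1.96):
    """Risk ratio after adding `imputed` notional cases to each arm.

    Hypothetical convention: each imputed case is added to both the
    event count and the denominator of its arm.
    """
    a = events_exposed + imputed
    c = events_unexposed + imputed
    n1 = n_exposed + imputed
    n0 = n_unexposed + imputed
    rr = (a / n1) / (c / n0)
    # Standard error of log(RR) for cohort data
    se_log = math.sqrt(1 / a - 1 / n1 + 1 / c - 1 / n0)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper

# The plaintiffs' hypothetical: no NSF among 10,000 exposed or 10,000 unexposed
rr, lower, upper = imputed_risk_ratio(0, 10_000, 0, 10_000)
print(f"RR = {rr:.2f}, 95% CI ({lower:.3f}, {upper:.2f})")
```

The point estimate is 1.0, but the interval around it is very wide, so such a study would carry little weight in a pooled analysis. Little weight, however, is not the same as no information.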

Unfortunately, the defense did not make this argument; nor did it explore where the meta-analysis might have come out had a more even-handed methodology been taken by Dr. Ix.  The gap allowed the trial court to brush the challenge aside:

“The failure to consider studies not reporting an association between GBCAs and NSF also does not render Dr. Ix’s meta-analysis unreliable. The purpose of Dr. Ix’s meta-analysis was to study the strength of the association between an exposure (receiving GBCA) and an outcome (development of NSF). In order to properly do this, Dr. Ix necessarily needed to examine studies where the exposed group developed NSF.”

Gadolinium at *24.  Judge Polster, with no help from the defense brief, missed the irony of Dr. Ix’s willingness to impute data in the case-control 2 x 2 contingency tables, but not in the relative risk tables.

CONFOUNDING

Defendants complained that Dr. Ix had ignored the possibility that confounding factors had contributed to the development of NSF.  Challenge at 13.  Defendants went so far as to charge Dr. Ix with misleading the court by failing to consider other possible causative exposures or conditions.  Id.

Defendants never identified the existence, source, and likely magnitude of confounding factors.  As a result, the plaintiffs’ argument, based in the Reference Manual, that confounding was an unlikely explanation for a very large risk ratio was enthusiastically embraced by the trial court, virtually verbatim from the plaintiffs’ Opposition (at 14):

“Finally, the Court rejects Defendants’ argument that Dr. Ix failed to consider confounding factors. Plaintiffs argued and Defendants did not dispute that, applying the Bradford Hill criteria, Dr. Ix calculated a pooled odds ratio of 11.46 for the five studies examined, which is higher than the 10 to 1 odds ratio of smoking and lung cancer that the Reference Manual on Scientific Evidence deemed to be “so high that it is extremely difficult to imagine any bias or confounding factor that may account for it.” Id. at 376.  Thus, from Dr. Ix’s perspective, the odds ratio was so high that a confounding factor was improbable. Additionally, in his deposition, Dr. Ix acknowledged that the cofactors that have been suggested are difficult to confirm and therefore he did not try to specifically quantify them. (Doc # : 772-20, at 27.) This acknowledgement of cofactors is essentially equivalent to the Agarwal article’s representation that “[t]here may have been unmeasured variables in the studies confounding the relationship between GBCAs and NSF,” cited by Defendants as a representative model for properly considering confounding factors. (See Doc # : 772, at 4-5.)”

Gadolinium at *24.

The real problem is that the defendant’s challenge pointed only to possible, unidentified causal agents.  The smoking/lung cancer analogy, provided by the Reference Manual, was inapposite.  Smoking is indeed a large risk factor for lung cancer, with relative risks over 20.  Although there are other human lung carcinogens, none is consistently in the same order of magnitude (not even asbestos), and as a result, confounding can generally be excluded as an explanation for the large risk ratios seen in smoking studies.  It would be easy to imagine that there are confounders for NSF, especially given that it has only relatively recently been identified, and that they might be of the same or greater magnitude as that suggested for the gadolinium contrast media.  The defense, however, failed to identify confounders that actually threatened the validity of any of the individual studies, or of the meta-analysis.

CONCLUSION

The defense hinted at the general unreliability of meta-analysis, with references to Reference Manual on Scientific Evidence at 381 (2d ed. 2000)(noting problems with meta-analysis), and other, relatively dated papers.  See, e.g., John Bailar, “Assessing Assessments,” 277 Science 529 (1997)(arguing that “problems have been so frequent and so deep, and overstatements of the strength of conclusions so extreme, that one might well conclude there is something seriously and fundamentally wrong with [meta-analysis].”).  The Reference Manual language carried over into the third edition, is out of date, and represents a failing of the new edition.  See “The Treatment of Meta-Analysis in the Third Edition of the Reference Manual on Scientific Evidence” (Nov. 14, 2011).

The plaintiffs came forward with some descriptive statistics of the prevalence of meta-analysis in contemporary biomedical literature.  The defendants gave mostly argument; there is a dearth of citation to defense expert witnesses, affidavits, consensus papers on meta-analysis, textbooks, papers by leading authors, and the like.  The defense challenge suffered from being diffuse and unfocused; it lost persuasiveness by including weak, collateral issues such as claiming that Dr. Ix was opining “only” on a “more likely than not” basis, and that he had not consulted with other experts, and that he had failed to use randomized trial data.  The defense was quick to attack perceived deficiencies, but it did not illustrate how or why the alleged deficiencies threatened the validity of Dr. Ix’s meta-analysis.  Indeed, even when the defense made strong points, such as the exclusion of zero-event cohort studies, it failed to document that such studies existed, and that their inclusion might have made a difference.

 

Politics of Expert Witnesses – The Treating Physician

June 7th, 2012

If a party retains an expert witness who has actually conducted research on the issue in controversy, the witness’s underlying data and analyses will be sought in discovery.  Of course, litigants are entitled to every man’s (and woman’s) evidence, and independent research, but the involvement of an investigator-author as an expert witness will almost certainly increase the scope of discovery.  Counsel will seek manuscript drafts, emails with co-authors, interim data, protocols and protocol amendments, preliminary analyses, among other documents.  Many would-be expert witnesses are reluctant to put their own research into issue.  The result is that expert witnesses frequently do not have “hands-on” experience with respect to the exact issue raised by the litigation in which they serve.

The combination of these factors creates vulnerabilities for witnesses.  Expert witnesses who have not conducted research or written about the issue end up being more attractive to lawyers.  But even these witnesses will be flawed in the eyes of a jury or trial judge:  they have been paid for their time in reviewing literature, preparing reports, sitting for depositions, traveling, appearing at trial.  The compensation of a highly skilled and experienced professional can lead to large amounts of money, amounts sufficient to make juries skeptical and lawyers uncomfortable.

Physicians who care for and treat a claimant represent a litigation Holy Grail:  the prospect of having a neutral, disinterested, and caring expert witness opine about causation, diagnosis, damages, or prognosis, without the baggage of having been selected and paid by lawyers.  A lot of sharp elbows are thrown in the process of trying to align treating physicians with one side or the other’s litigation positions.

In some litigations, in some states, ex parte interviews by defense counsel are forbidden, but similar interviews by plaintiffs’ counsel are allowed.  Much mischief results.  The practice of trying to turn the treating physician into a “causation” or “damages” witness runs amuck, especially when trial courts do not require full Federal Rules of Civil Procedure Rule 26 disclosures from the treating physicians.

Jurors will want to know what treating physicians said, and may regard them as disinterested.  Indeed, the supposed neutrality and beneficence of the treating physician is often emphasized by counsel in their addresses to juries.  See, e.g., Simmons v. Novartis Pharm. Corp., 2012 WL 2016246, *2, *7 (6th Cir. 2012)(affirming exclusion of retained expert witness, as well as a treating physician who relied solely upon a limited selection of medical studies given to him by plaintiffs’ counsel); Tamraz v. BOC Group Inc., No. 1:04-CV-18948, 2008 WL 2796726 (N.D.Ohio July 18, 2008)(denying Rule 702 challenge to treating physician’s causation opinion), rev’d sub nom. Tamraz v. Lincoln Elec. Co., 620 F.3d 665 (6th Cir. 2010)(carefully reviewing record of trial testimony of plaintiffs’ treating physician; reversing judgment for plaintiff based in substantial part upon treating physician’s speculative causal assessment created by plaintiffs’ counsel), cert. denied, ___ U.S. ___ , 131 S. Ct. 2454, 2011 WL 863879 (2011).  See generally Robert Ambrogi, “A ‘Masterly’ Opinion on Expert Testimony,” Bullseye: October 2010;   David Walk, “A masterly Daubert opinion” (Sept. 15, 2010);  Ellen Melville, “Comment, Gating the Gatekeeper: Tamraz v. Lincoln Electric Co. and the Expansion of Daubert Reviewing Authority,” 53 B.C. L. Rev. 195 (2012) (student review that mistakenly equates current Rule 702 law with the Supreme Court’s 1993 Daubert decision, while ignoring subsequent precedent and revision of Rule 702).

In the silicone gel breast implant litigation, plaintiffs corralled a herd of rheumatologists who were sympathetic to their claims of connective tissue disease, and who would support their “creative” causation theories.  As a result, defense rheumatologists were not likely to have seen many of the claimants in their practice.  The plaintiffs’ counsel capitalized upon this “deficiency” in their experience, by attacking the defense experts’ expertise and their experience with the newly emergent phenomenon of “silicone-associated disease” (SAD).  The treating physicians were involved early on in the SAD litigation exploit.

In New Jersey, defense counsel have a limited right to ex parte interviews of treating physicians.  Stempler v. Speidell, 100 N.J. 368, 495 A.2d 857 (1985).  Certain New Jersey state trial judges, however, have ignored the Stempler holding in mass tort contexts, and have severely limited defendants’ ability to get information from treating physicians.  Last week, the New Jersey Appellate Division waded into this contentious area, by reversing an aberrant trial judge’s decision that severely restricted defendants’ retention of any physician who had treated a plaintiff in the mass tort.  In Re Pelvic Mesh/Gynecare Litig., No. A-5685-10T4 (N.J. Super. App. Div. June 1, 2012).

The defendants, Johnson & Johnson and Ethicon, Inc., designed, made, marketed, and sold pelvic mesh medical devices for the treatment of pelvic organ prolapse and stress urinary incontinence.  In re Pelvic at 2.  Several hundred personal injury cases against the defendants were assigned to the Atlantic County law division.  In a pretrial order, the trial court barred “defendants from consulting with or retaining as an expert witness any physician who has at any time treated one or more of the plaintiffs.”  Id. Remarkably, the trial court’s order was not limited to attempts to contact a physician for purposes of discussing a particular plaintiff’s case.  The trial court’s order had the effect of severely limiting defendants’ access to expert witnesses, as well as disqualifying expert witnesses already retained.  Plaintiffs’ counsel, however, were free to line up their clients’ treating physicians, and other treating physicians with substantial clinical experience with the allegedly defective device.

The Appellate Division reversed the trial court’s asymmetrical rules regarding treating physicians as manifestly inconsistent with the New Jersey Supreme Court’s mandate in Stempler and other cases.  The Appellate Division showed little patience for the trial court’s weak attempt to justify the uneven-handed treatment of access to treating physicians.  The trial court had invoked the potential for interference with the doctor-patient privilege as a basis for its pretrial order, but hornbook law, in New Jersey and in virtually every state, treats the filing of a lawsuit as a waiver of the privilege.  Id. at 11.  Similarly, the Appellate Division rejected the trial court’s insistence that a treating physician was obligated to protect and advance patients’ litigation interests by either testifying for patients or refraining from testifying for defendants. Id. at 15.  A treating physician has no “duty of loyalty” to help advance a patient’s litigious goals.  Id. at 26. The trial court had myopically confused a duty to provide medical care and treatment with helping plaintiffs’ counsel advance their view of the patients’ welfare.

The Appellate Division’s reversal is a welcome return of sanity and equity to New Jersey law of expert witnesses.  The over-reaching rationale of the trial court posed some incredible implications.  The appellate court noted, as an example, that “radiologists, orthopedists, and neurologists who routinely testify as experts for the defense in numerous personal injury cases in our courts are likely to be treating or consulting physicians for other patients with similar injuries, and some of those patients may also have filed lawsuits or may do so in the future.”  Id. at 16.  The trial court’s reasoning would strip defendants in virtually all personal injury litigation of access to expert physician opinion.  In asbestos litigation, for instance, the defense would find any and all pulmonary physicians who were treating a worker with asbestos-related disease to be off limits to consulting or testifying.  The Appellate Division’s strong ruling should be seen as a cloud on the validity of the continuing practice of barring defense counsel from ex parte interviews of treating physicians in mass or other tort litigation.

 

Haack’s Holism vs. Too Much of Nothing

May 24th, 2012

Professor Haack has been an unflagging critic of Daubert and its progeny.  Haack’s major criticism of the Daubert and Joiner cases is based upon the notion that the Supreme Court engaged in a “divide and conquer” strategy in its evaluation of plaintiffs’ evidence, when it should have considered the “whole gemish” (my phrase, not Haack’s).  See Susan Haack, “Warrant, Causation, and the Atomism of Evidence Law,” 5 Episteme 253, 261 (2008)[hereafter “Warrant”];  “Proving Causation: The Holism of Warrant and the Atomism of Daubert,” 4 J. Health & Biomedical Law 273, 304 (2008)[hereafter “Proving Causation”].

ATOMISM vs. HOLISM

Haack’s concern is that combined pieces of evidence, none individually sufficient to warrant an opinion of causation, may provide the warrant when considered jointly.  Haack reads Daubert to require courts to screen each piece of evidence relied upon by an expert witness for reliability, a process that can interfere with discerning the conclusion most warranted by the totality or “the mosaic” of the evidence:

“The epistemological analysis offered in this paper reveals that a combination of pieces of evidence, none of them sufficient by itself to warrant a causal conclusion to the legally required degree of proof, may do so jointly. The legal analysis offered here, interlocking with this, reveals that Daubert’s requirement that courts screen each item of scientific expert testimony for reliability can actually impede the process of arriving at the conclusion most warranted by the evidence proffered.”

Warrant at 253.

But there is nothing in Daubert, or its progeny, to support this crude characterization of the judicial gatekeeping function.  Indeed, there is another federal rule of evidence, Rule 703, which is directed at screening the reasonableness of reliance upon a single piece of evidence.

Surely there are times when the single, relied upon study is one that an expert in the relevant field should and would not rely upon because of invalidity of the data, the conduct of the study, or the study’s analysis of the data.  Indeed, there may well be times, especially in litigation contexts, when an expert witness has relied upon a collection of studies, none of which is reasonably relied upon by experts in the discipline.

Rule 702, which Daubert was interpreting, was, and is, focused on an expert witness’s opinion:

A witness who is qualified as an expert by knowledge, skill, experience, training, or education may testify in the form of an opinion or otherwise if:

(a) the expert’s scientific, technical, or other specialized knowledge will help the trier of fact to understand the evidence or to determine a fact in issue;

(b) the testimony is based on sufficient facts or data;

(c) the testimony is the product of reliable principles and methods; and

(d) the expert has reliably applied the principles and methods to the facts of the case.

To be sure, Chief Justice Rehnquist, in explicating why plaintiffs’ expert witnesses’ opinions must be excluded in Joiner, noted the wild, irresponsible, unwarranted inferential leaps made in interpreting specific pieces of evidence.  The plaintiffs’ expert witnesses’ interpretation of a study, involving massive injections of PCBs into the peritoneum of baby mice, with consequent alveologenic adenomas, provided an amusing example of how they, the putative experts, had outrun their scientific headlights by over-interpreting a study in a different species, at different stages of maturation, with different routes of exposure, with different, non-cancerous outcomes.  These examples were effectively aimed at showing that the overall opinion advanced by Rabbi Teitelbaum and others, on behalf of plaintiffs in Joiner, was unreliable.  Haack, however, sees a philosophical kinship with Justice Stevens, who in dissent, argued to give plaintiffs’ expert witnesses a “pass,” based upon the whole evidentiary display.  General Electric Co. v. Joiner, 522 U.S. 136, 153 (1997) (Justice Stevens, dissenting) (“It is not intrinsically ‘unscientific’ for experienced professionals to arrive at a conclusion by weighing all available evidence.”). The problem, of course, is that sometimes “all available evidence” includes a good deal of junk, irrelevant, or invalid studies.  Sometimes “all available evidence” is just too much of nothing.

Perhaps Professor Haack was hurt that she was not cited by Justice Blackmun in Daubert, along with Popper and Hempel.  Haack has written widely on philosophy of science, and on epistemology, and she clearly believes her theory of knowledge would provide a better guide to the difficult task of screening expert witness opinions.

When Professor Haack describes the “degree to which evidence warrants a conclusion,” she identifies three factors, which, in part, require assessment of the strength of individual studies:

(i) how strong the connection is between the evidence and the conclusion (supportiveness);

(ii) how solid each of the elements of the evidence is, independent of the conclusion (independent security); and

(iii) how much of the relevant evidence the evidence includes (comprehensiveness).

Warrant at 258.

Of course, supportiveness includes interconnectedness, but nothing in her theory of “warrant” excuses or omits rigorous examination of individual pieces of evidence in assessing a causal claim.

DONE WRONG

Haack seems enamored of the holistic approach taken by Dr. Done, plaintiffs’ expert witness in the Bendectin litigation. Done tried to justify his causal opinions based upon the entire “mosaic” of evidence. See, e.g., Oxendine v. Merrell Dow Pharms. Inc., 506 A.2d 1100, 1108 (D.C. 1986)(“[Dr. Done] conceded his inability to conclude that Bendectin is a teratogen based on any of the individual studies which he discussed, but he also made quite clear that all these studies must be viewed together, and that, so viewed, they supplied his conclusion”).

Haack tilts at windmills by trying to argue the plausibility of Dr. Done’s mosaic in some of the Bendectin cases.  She rightly points out that Done challenged the internal and external validity of the defendant’s studies.  Such challenges to the validity of either side’s studies are a legitimate part of scientific discourse, and certainly a part of legal argumentation, but attacks on validity of null studies are not affirmative evidence of an association.  Haack correctly notes that “absence of evidence that p is just that — an absence of evidence; it is not evidence that not-p.”  Proving Causation at 300.  But the same point holds with respect to Done’s challenges to Merrell Dow’s studies.  If those studies are invalid, and Merrell Dow lacks evidence that “not-p,” this lack is not evidence for Done in favor of p.

Given the lack of supporting epidemiologic data in many studies, and the weak and invalid data relied upon, Done’s causal claims were suspect and have come to be discredited.  Professor Ronald Allen notes that invoking the Bendectin litigation in defense of a “mosaic theory” of evidentiary admissibility is a rather peculiar move for epistemology:

“[T]here were many such hints of risk at the time of litigation, but it is now generally accepted that those slight hints were statistical aberrations or the results of poorly conducted studies.76 Bendectin is still prescribed in many places in the world, including Europe, is endorsed by the World Health Organization as safe, and has been vindicated by meta-analyses and the support of a number of epidemiological studies.77 Given the weight of evidence in favor of Bendectin’s safety, it seems peculiar to argue for mosaic evidence from a case in which it would have plainly been misleading.”

Ronald J. Allen & Esfand Nafisi, “Daubert and its Discontents,” 76 Brooklyn L. Rev. 131, 148 (2010).

Screening each item of “expert evidence” for reliability may deprive the judge of “the mosaic,” but that is not all that the judicial gatekeepers were doing in Bendectin or other Rule 702 cases.   It is all well and good to speak metaphorically about mosaics, but the metaphor and its limits were long ago acknowledged in the philosophy of science.  The suggestion that scraps of evidence from different kinds of scientific studies can establish scientific knowledge was rejected by the great mathematician, physicist, and philosopher of science, Henri Poincaré:

“[O]n fait la science avec des faits comme une maison avec des pierres; mais une accumulation de faits n’est pas plus une science qu’un tas de pierres n’est une maison.”

Jules Henri Poincaré, La Science et l’Hypothèse (1905) (chapter 9, Les Hypothèses en Physique)(“Science is built up with facts, as a house is with stones. But a collection of facts is no more a science than a heap of stones is a house.”).  Poincaré’s metaphor is more powerful than Haack’s and Done’s “mosaic” because it acknowledges that interlocking pieces of evidence may cohere as a building, or they may be no more than a pile of rubble.  Poorly constructed walls may soon revert to the pile of stones from which they came.  Much more is required than simply invoking the “mosaic” theory to bless this mess as a “warranted” claim to knowledge.

Haack’s point about aggregation of evidence is, at one level, unexceptionable.  Surely, the individual pieces of evidence, each inconclusive alone, may be powerful when combined.  An easy example is a series of studies, each with a non-statistically significant result of finding more disease than expected.  None of the studies alone can rule out chance as an explanation, and the defense might be tempted to argue that it is inappropriate to rely upon any of the studies because none is statistically significant.

The defense argument may be wrong in cases in which a valid meta-analysis can be deployed to combine the results into a summary estimate of association.  If a meta-analysis is appropriate, the studies collectively may allow the exclusion of chance as an explanation for the disparity from expected rates of disease in the observed populations.  [Haack misinterprets study “effect size” to be relevant to ruling out chance as explanation for the increased rate of the outcome of interest. Proving Causation at 297.]
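A toy example shows how pooling can exclude chance when no single study can. The sketch below applies a fixed-effect, inverse-variance summary to five invented studies, each reporting an elevated risk ratio with an interval that includes 1.0. The study values are hypothetical, and the fixed-effect model assumes the studies estimate a common effect, an assumption that would itself need to be defended in any real case.

```python
import math

# Hypothetical studies: (risk ratio, standard error of the log risk ratio)
studies = [(1.4, 0.25), (1.6, 0.30), (1.5, 0.28), (1.3, 0.22), (1.7, 0.35)]

def interval(log_estimate, se, z=1.96):
    """95% confidence interval, computed on the log scale and exponentiated."""
    return math.exp(log_estimate - z * se), math.exp(log_estimate + z * se)

def fixed_effect_pool(studies):
    """Inverse-variance weighted average of the log risk ratios."""
    weights = [1 / se ** 2 for _, se in studies]
    pooled_log = sum(w * math.log(rr)
                     for w, (rr, _) in zip(weights, studies)) / sum(weights)
    pooled_se = math.sqrt(1 / sum(weights))
    return pooled_log, pooled_se

individual = [interval(math.log(rr), se) for rr, se in studies]
pooled_log, pooled_se = fixed_effect_pool(studies)
pooled_rr = math.exp(pooled_log)
pooled_ci = interval(pooled_log, pooled_se)

for (rr, _), (lo, hi) in zip(studies, individual):
    print(f"study RR {rr:.2f}  95% CI ({lo:.2f}, {hi:.2f})")
print(f"pooled RR {pooled_rr:.2f}  95% CI ({pooled_ci[0]:.2f}, {pooled_ci[1]:.2f})")
```

Every individual interval in this invented data set includes 1.0, yet the pooled interval does not. The sketch addresses random error only; it does nothing to address bias or confounding in the underlying studies.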

The availability of meta-analysis, in some cases, does not mean that hand waving about the “combined evidence” or “mosaics” automatically supports admissibility of the causal opinion.  The gatekeeper would still have to contend with the criteria of validity for meta-analysis, as well as with bias and confounding in the underlying studies.
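The arithmetic behind the "easy example" can be sketched in a few lines of Python.  The four studies below are entirely hypothetical (invented relative risks and 95% confidence intervals, each interval including 1.0), and the function name pool_fixed_effect is an illustrative label, not a reference to any analysis in the cases or articles discussed here.  The sketch uses a simple fixed-effect, inverse-variance method; it assumes the studies are valid and combinable, which is exactly what the gatekeeper would still have to examine.

```python
import math

Z95 = 1.959964  # two-sided 95% critical value of the standard normal


def pool_fixed_effect(studies):
    """Fixed-effect (inverse-variance) pooling of relative risks.

    studies: list of (rr, lower95, upper95) tuples.
    Returns (pooled_rr, pooled_lower95, pooled_upper95).
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for rr, lower, upper in studies:
        # Recover the standard error of log(RR) from the reported interval.
        se = (math.log(upper) - math.log(lower)) / (2 * Z95)
        weight = 1.0 / se ** 2
        weighted_sum += weight * math.log(rr)
        total_weight += weight
    pooled_log = weighted_sum / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return (
        math.exp(pooled_log),
        math.exp(pooled_log - Z95 * pooled_se),
        math.exp(pooled_log + Z95 * pooled_se),
    )


# Hypothetical studies: every interval includes 1.0, so none is
# statistically significant on its own.
studies = [
    (1.4, 0.80, 2.45),
    (1.6, 0.90, 2.84),
    (1.5, 0.90, 2.50),
    (1.3, 0.85, 1.99),
]

summary = pool_fixed_effect(studies)
print("pooled RR = %.2f (95%% CI %.2f to %.2f)" % summary)
```

With these invented inputs, the lower bound of the pooled interval lies above 1.0 even though no single study excludes 1.0.  The calculation addresses chance only; it says nothing about bias or confounding in the underlying studies, or about whether the studies should have been combined in the first place.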

NECESSITY OF JUDGMENT

Of course, unlike the meta-analysis example, most instances of evaluating an entire evidentiary display are not quantitative exercises.  Haack is troubled by the tension between the qualitative, continuous nature of reliability and the “in or out” aspect of ruling on expert witness opinion admissibility.  Warrant at 262.  The continuous nature of a reliability spectrum, however, does not preclude the practical need for a decision.  We distinguish young from old people, although we age imperceptibly by units of time that are continuous and capable of being specified with increasingly small magnitudes.  Differences of opinions or close cases are likely, but decisions are made in scientific contexts all the time.

FAGGOT FALLACY

Although Haack criticizes defendants for beguiling courts with the claimed “faggot fallacy,” she occasionally acknowledges that there simply is not sufficient valid evidence to support a conclusion.  Indeed, she makes the case for why, in legal contexts, we will frequently be dealing with “unwarranted” claims:

“Against this background, it isn’t hard to see why the legal system has had difficulties in handling scientific testimony. It often calls on the weaker areas of science and/or on weak or marginal scientists in an area; moreover, its adversarial character may mean that even solid scientific information gets distorted; it may suppress or sequester relevant data; it may demand scientific answers when none are yet well-warranted; it may fumble in applying general scientific findings to specific cases; and it may fail to adapt appropriately as a relevant scientific field progresses.”

Susan Haack, “Of Truth, in Science and in Law,” 73 Brooklyn L. Rev. 985, 1000 (2008).  It is difficult to imagine a more vigorous call for, and defense of, judicial gatekeeping of expert witness opinion testimony.

Haack seems to object to the scope and intensity of federal judicial gatekeeping, but her characterization of the legal context should awaken her to the need to resist admitting opinions on scientific issues when “none are yet well-warranted.” Id. at 1004 (noting that “the legal system quite often want[s] scientific answers when no warranted answers are available”).  The legal system, however, does not “want” unwarranted “scientific” answers; only an interested party on one side or the other wants such a thing.  The legal system wants a procedure for ensuring rejection of unwarranted claims, which may be passed off as properly warranted, due to the lack of sophistication of the intended audience.

TOO MUCH OF NOTHING

Despite her flirtation with Dr. Done’s holistic medicine, Haack acknowledges that sometimes a study or an entire line of studies is simply not valid, and they should not be part of the “gemish.”  For instance, in the context of meta-analysis, which requires pre-specified inclusionary and exclusionary criteria for studies, Haack acknowledges that a “well-designed and well-conducted meta-analysis” will include a determination “which studies are good enough to be included … and which are best disregarded.”  Proving Causation at 286.  Exactly correct.  Sometimes we simply must drill down to the individual study, and what we find may require us to exclude it from the meta-analysis.  The same could be said of any study that is excluded by appropriate exclusionary criteria.

Elsewhere, Haack acknowledges myriad considerations of validity or invalidity, which must be weighed as part of the gemish:

“The effects of S on animals may be different from its effects on humans. The effects of b when combined with a and c may be different from its effects alone, or when combined with x and/or y.52 Even an epidemiological study showing a strong association between exposure to S and elevated risk of D would be insufficient by itself: it might be poorly-designed and/or poorly-executed, for example (moreover, what constitutes a well-designed study – e.g., what controls are needed – itself depends on further information about the kinds of factor that might be relevant). And even an excellent epidemiological study may pick up, not a causal connection between S and D, but an underlying cause both of exposure to S and of D; or possibly reflect the fact that people in the very early stages of D develop a craving for S. Nor is evidence that the incidence of D fell after S was withdrawn sufficient by itself to establish causation – perhaps vigilance in reporting D was relaxed after S was withdrawn, or perhaps exposure to x, y, z was also reduced, and one or all of these cause D, etc.53”

Proving Causation at 288.  These are precisely the sorts of reasons that make gatekeeping of expert witness opinions an important part of the judicial process in litigation.

RATS TO YOU

Similarly, Haack acknowledges that animal studies may be quite irrelevant to the issue at hand:

“The elements of E will also interlock more tightly the more physiologically similar the animals used in any animal studies are to human beings. The results of tests on hummingbirds or frogs would barely engage at all with epidemiological evidence of risk to humans, while the results of tests on mice, rats, guinea-pigs, or rabbits would interlock more tightly with such evidence, and the results of tests on primates more tightly yet. Of course, “similar” has to be understood as elliptical for “similar in the relevant respects;” and which respects are relevant may depend on, among other things, the mode of exposure: if humans are exposed to S by inhalation, for example, it matters whether the laboratory animals used have a similar rate of respiration. (Sometimes animal studies may themselves reveal relevant differences; for example, the rats on which Thalidomide was tested were immune to the sedative effect it had on humans; which should have raised suspicions that rats were a poor choice of experimental animal for this drug.)55 Again, the results of animal tests will interlock more tightly with evidence of risk to humans the more similar the dose of S involved. (One weakness of Joiner’s expert testimony was that the animal studies relied on involved injecting massive doses of PCBs into a baby mouse’s peritoneum, whereas Mr. Joiner had been exposed to much smaller doses when the contaminated insulating oil splashed onto his skin and into his eyes.)56 The timing of the exposure may also matter, e.g., when the claim at issue is that a pregnant woman’s being exposed to S causes this or that specific type of damage to the fetus.”

Proving Causation at 290.

WEIGHT OF THE EVIDENCE (WOE)

Just as she criticizes General Electric for advancing the “faggot fallacy” in Joiner, Haack criticizes the plaintiffs’ appeal to “weight of evidence methodology,” as misleadingly suggesting “that there is anything like an algorithm or protocol, some effective, mechanical procedure for calculating the combined worth of evidence.”  Proving Causation at 293.

INFERENCE TO BEST EXPLANATION

Professor Haack cautiously evaluates the glib invocation of “inference to the best explanation” as a substitute for actual warrant of a claim to knowledge.  Haack acknowledges the obvious: the legal system is often confronted with claims lacking sufficient warrant.  She appropriately refuses to permit such claims to be dressed up as scientific conclusions by invoking their plausibility:

“Can we infer from the fact that the causes of D are as yet unknown, and that a plaintiff developed D after being exposed to S, that it was this exposure that caused Ms. X’s or Mr. Y’s D?102  No. Such evidence would certainly give us reason to look into the possibility that S is the, or a, cause of D. But loose talk of ‘inference to the best explanation’ disguises the fact that what presently seems like the most plausible explanation may not really be so – indeed, may not really be an explanation at all. We may not know all the potential causes of D, or even which other candidate-explanations we would be wise to investigate.”

Proving Causation at 305.  See also Warrant at 261 (invoking the epistemic category of Rumsfeld’s “known unknowns” and “unknown unknowns” to describe a recurring situation in law’s treatment of scientific claims)(U.S. Sec’y of Defense Donald Rumsfeld: “[T]here are known knowns; there are things we know we know. We also know there are known unknowns; that is to say we know there are some things we do not know. But there are also unknown unknowns – there are things we do not know we don’t know.” (Feb. 12, 2002)).

It is easy to see why the folks at SKAPP are so fond of Professor Haack’s writings, and why they have invited her to their conferences and meetings.  She has written close to a dozen articles critical of Daubert, each repeating the same mistaken criticisms of the gatekeeping process.  She has provided SKAPP and its plaintiffs’ lawyer sponsors with sound bites to throw at impressionable judges about the epistemological weakness of Daubert and its progeny.  In advancing this critique and SKAPP’s propaganda purposes, Professor Haack has misunderstood the gatekeeping enterprise.  She has, however, correctly identified the gatekeeping process as an exercise in determining whether an opinion possesses sufficient epistemic warrant.  Despite her enthusiasm for the dubious claims of Dr. Done, Haack acknowledges that “warrant” requires close attention to the internal and external validity of studies, and to rigorous analysis of a body of evidence.  Haack’s own epistemic analysis would be hugely improved and advanced by focusing on how the mosaic theory, or WOE, failed to hold up in some of the more egregious, pathological claims of health “effects” — Bendectin, silicone, electro-magnetic frequency, asbestos and colorectal cancer, etc.

Exposure, Epidemiology, and External Validity under Rule 702

May 14th, 2012

Sometimes legal counsel take positions in court determined solely by the expediency of what expert witnesses are available, and what opinions are held by those witnesses.

Back in the early days of the asbestos litigation in Philadelphia, a hotbed of early asbestos litigation, plaintiffs and defendants each identified a pool of available expert witnesses on lung diseases.  Each side found witnesses who held views on important issues, such as whether asbestos caused lung cancer, with or without pre-existing asbestosis, whether all types of asbestos caused mesothelioma, whether asbestos caused gastrointestinal cancers, and whether “each and every exposure was a substantial factor” in producing an asbestos-related disease.  Some expert witnesses adopted opinions as a matter of convenience and malleability, but most witnesses expressed sincerely held opinions.  Either way, each expert witness active in the asbestos litigation came to be seen as a partisan of one side.  Because of the volume of cases, there was the opportunity to be engaged in a large number of cases, and to earn sizable fees as an expert witness.  Both sides’ expert witnesses struggled to avoid being labeled hired guns.

A few expert witnesses, eager to avoid being locked in as either a “plaintiff’s” or a “defendant’s” expert witness, with perhaps some damage to their professional reputations, balanced their views in a way to avoid being classified as working exclusively for one side or the other.  The late Paul Epstein, MD, adopted this strategy to great effect.  Dr. Epstein had excellent credentials, and he was an excellent physician.  He was on the faculty at the University of Pennsylvania, and he was a leader in the American College of Physicians, where he was the deputy editor of the Annals of Internal Medicine.  Dr. Epstein exemplified gravitas and learning.  He was not, however, above adopting views in such a way as to balance out his commitments to both the plaintiffs’ and defense bars.  By doing so, Dr. Epstein made himself invaluable to both sides, and he made aggressive cross-examination difficult, if not impossible, when he testified.  I suspect his positions had this strategic goal.

In his first testimonies, in the late 1970’s and early 1980’s, Dr. Epstein expressed the view that asbestos exposure caused parietal pleural plaques, but these plaques rarely interfered with respiration.  Pleural plaques did not cause impairment or disability, and thus they were not an “injury.”  Dr. Epstein’s views were very helpful in obtaining defense verdicts in cases of disputed pleural thickening or plaques, and they led to his being much sought after by defense counsel for their independent medical examinations.  Dr. Epstein also strongly believed, based upon the epidemiologic evidence, that asbestos did not cause gastrointestinal or laryngeal cancer.

Dr. Epstein was wary of being labeled a “defendants’ expert” in the asbestos litigation, especially given the social opprobrium that attached to working for the “asbestos industry.”  And so, by the mid-1980’s, Dr. Epstein surprised the defense bar by showing up in a plaintiff’s lung cancer case, without underlying asbestosis.  Dr. Epstein took the position that if the plaintiff worked around asbestos, and later developed lung cancer, then asbestos caused his lung cancer, and “each and every exposure to asbestos” contributed substantially to the outcome.  Risk was causation; ipse dixit.  Dr. Epstein recited the Selikoff multiplicative “synergy” theory, with relative risks of 5 (for non-smoking asbestos workers), 10 (for smoking non-asbestos workers), and 50 (for smoking asbestos-exposed workers).  Every worker was described with the same set of risk ratios.  Remarkably, and unscientifically, Dr. Epstein gave the same risk figures in every plaintiff’s lung cancer case, regardless of the duration or level of exposure.  In mesothelioma cases, Dr. Epstein took the unscientific position that all fiber types (chrysotile, amosite, crocidolite, and anthophyllite) contributed to any patient’s mesothelioma.

Dr. Epstein’s views made him off limits to plaintiffs in non-malignancy cases, and off limits to defendants in lung cancer and mesothelioma cases.

Because of his careful alignment with both plaintiffs’ and defense bars, Dr. Epstein’s views were never forcefully challenged.  Of course, the Pennsylvania case law in the 1980’s and 1990’s was not particularly favorable to challenges to the validity of opinions about causation, but even as Rule 702 evolved in federal court, both plaintiffs’ and defense counsel were unable to antagonize Dr. Epstein.  The inanity of “each and every exposure” was not seriously hurtful in the early asbestos litigation, when the defendants were almost all manufacturers of asbestos-containing insulation, and if a manufacturer had supplied insulation to a worksite, then the proportion of asbestos exposure for that manufacturer would likely have been “substantial.”

Today, the nature of the asbestos litigation has changed, but when we examine Pennsylvania law and procedure, it is not surprising to see that Dr. Epstein’s views have had a long-lasting effect.  Claimants with only pleural plaques have been relegated to an “inactive” docket.  Plaintiffs’ expert witnesses still opine that each and every exposure was substantial, without any basis in evidence, and they still recite the same 5x, 10x, and 50x risk ratios, based upon Selikoff’s insulator studies, even though the Philadelphia Court of Common Pleas probably has not seen more than a handful of insulators’ cases in the last decade.  Dozens of epidemiologic studies have shown that the asbestos exposures of bystander trades, chrysotile factory workers, and other non-insulator occupations carry lower risks of asbestos-related diseases.

The failure to challenge the Selikoff risk ratios is regrettable, especially considering that the failure was based upon politics and personalities, and not on scientific or legal evidentiary grounds.

As Irving Selikoff observed about his frequently cited statistics:

“These particular figures apply to the particular groups of asbestos workers in this study.  The net synergistic effect would not have been the same if their smoking habits had been different; and it probably would have been different if their lapsed time from first exposure to asbestos dust had been different or if the amount of asbestos dust they had inhaled had been different.”

E. Cuyler Hammond, Irving Selikoff, and Herbert Seidman, “Asbestos Exposure, Cigarette Smoking and Death Rates,” 330 Ann. N.Y. Acad. Sci. 473, 487 (1979).

The Selikoff risk figures were unreliable even for insulators, given that the so-called non-smokers were admittedly occasional smokers, and the low relative risk for smokers in the general population came from an historical cohort of relatively healthy American Cancer Society volunteers. The updated risk figures for smokers in the general population placed their lung cancer risk closer to, and above, 20-fold, which raised doubts about Selikoff’s neat multiplicative theory.
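The arithmetic of the "neat multiplicative theory" can be laid out in a short Python sketch.  The 5, 10, and 50 figures are the ones recited above; the 20-fold figure is the rough updated smoking risk mentioned in the preceding paragraph.  The function names are illustrative, the additive model is included only as a standard point of comparison from epidemiology, and holding the asbestos-only and joint figures fixed while changing the smoking figure is an assumption made solely to show why the updated number is awkward for the theory.

```python
def joint_rr_multiplicative(rr_asbestos, rr_smoking):
    """Joint relative risk if the two exposures multiply."""
    return rr_asbestos * rr_smoking


def joint_rr_additive(rr_asbestos, rr_smoking):
    """Joint relative risk if the two excess risks simply add."""
    return rr_asbestos + rr_smoking - 1


RR_ASBESTOS_ONLY = 5    # non-smoking asbestos workers (recited figure)
RR_SMOKING_ONLY = 10    # smoking non-asbestos workers (recited figure)
RR_JOINT_RECITED = 50   # smoking asbestos-exposed workers (recited figure)
RR_SMOKING_UPDATED = 20 # rough updated figure for smokers generally

# With the recited figures, the multiplicative model reproduces 50 exactly,
# while an additive model would predict only 14.
recited_multiplicative = joint_rr_multiplicative(RR_ASBESTOS_ONLY, RR_SMOKING_ONLY)
recited_additive = joint_rr_additive(RR_ASBESTOS_ONLY, RR_SMOKING_ONLY)

# Swap in the updated smoking figure (an illustrative assumption): the
# multiplicative model would now call for 100, not the recited 50.
updated_multiplicative = joint_rr_multiplicative(RR_ASBESTOS_ONLY, RR_SMOKING_UPDATED)

print(recited_multiplicative, recited_additive, updated_multiplicative)
```

On these figures, the tidy 5 x 10 = 50 relationship holds only so long as the smoking-only risk is taken to be 10.  Once that input moves toward 20, the recited joint figure of 50 is only half of what a multiplicative model would predict.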

The more important lesson though is that the Philadelphia courts, with acquiescence from most defense counsel, never challenged the use of Selikoff’s 5x, 10x, and 50x risk ratios to describe asbestos effects and smoking interactions.  Dr. Epstein made such a challenge impolitic and imprudent.  In Philadelphia, the Selikoff risk ratios gained a measure of respectability that they never deserved in science, or in the courtroom.

*****

Under Rule 702, the law has evolved to require reasonable exposure assessments of plaintiffs’ exposures, and supporting epidemiology that shows relevant increased risks at the level and the latency actually experienced by each plaintiff.  This criterion does not come from a “sufficiency” review as some have suggested; it is clearly a requirement of external validity of the epidemiologic studies relied upon by expert witnesses.

The following cases excluded or limited expert witness opinion testimony with respect to epidemiological studies that the court concluded were not sufficiently similar to the facts of the case to warrant the admission of an expert’s opinion based on their results:

SUPREME COURT

General Electric Co. v. Joiner, 522 U.S. 136 (1997)(questioning the external validity of a study of massive injected doses of PCBs in baby mice, with an outcome unrelated to the cancer claimed by plaintiff)

1st Circuit

Sutera v. Perrier Group of America Inc., 986 F. Supp. 655 (D. Mass. 1997)(occupational epidemiology of benzene exposure does not inform health effects from vanishingly low exposure to benzene in bottled water)

Whiting v. Boston Edison Co., 891 F. Supp. 12 (D. Mass. 1995) (excluding plaintiff’s expert witnesses; holding that epidemiology of Japanese atom bomb victims, and of patients treated with X-rays for spinal arthritis, and acute lymphocytic leukemia (ALL), was an invalid extrapolative model for plaintiff’s much lower exposure)

2d Circuit

Wills v. Amerada Hess Corp., 2002 WL 140542 (S.D. N.Y. 2002)(excluding plaintiff’s expert witness who attempted to avoid exposure assessment by arguing no threshold)(‘‘[E]ven though benzene and PAHs have been shown to cause some types of cancer, it is too difficult a leap to allow testimony that says any amount of exposure to these toxins caused squamous cell carcinoma of the head and neck in the decedent… . It is not grounded in reliable scientific methods, but only Dr. Bidanset’s presumptions. It fails all of the Daubert factors.’’), aff’d, 379 F.3d 32 (2d Cir. 2004)(Sotomayor, J.), cert. denied, 126 S.Ct. 355 (2005)

Amorgianos v. National RR Passenger Corp., 137 F. Supp. 2d 147 (E.D. N.Y. 2001), aff’d, 303 F.3d 256 (2d Cir. 2002);

Mancuso v. Consolidated Edison Co., 967 F.Supp. 1437, 1444 (S.D.N.Y. 1997)

3d Circuit

Magistrini v. One Hour Martinizing Dry Cleaning, 180 F. Supp. 2d 584(D.N.J. 2002), aff’d, 68 Fed. Appx. 356 (3d Cir. 2003);

In re W.R. Grace & Co., 355 B.R. 462 (Bankr. D. Del. 2006)

4th Circuit

White v. Dow Chemical Co., 321 F.Appx. 266, 273 (4th Cir. 2009)

Newman v. Motorola, Inc., 78 Fed. Appx. 292 (4th Cir. 2003)

Cavallo v. Star Enterprise, 892 F. Supp. 756, 764, 773 (E.D. Va. 1995) (excluding opinion of expert witness who failed to identify plaintiff ’s exposure levels to jet fuel, and failed to characterize the relevant dose-response relationship), aff’d in relevant part, 100 F.3d 1150, 1159 (4th Cir. 1996)

5th Circuit

LeBlanc v. Chevron USA, Inc., 396 Fed. Appx. 94 (5th Cir. 2010)

Knight v. Kirby Inland Marine Inc., 482 F.3d 347 (5th Cir. 2007);

Cotroneo v. Shaw Environmental & Infrastructure, Inc., 2007 WL 3145791 (S.D. Tex. 2007)

Castellow v. Chevron USA, 97 F. Supp. 2d 780, 796 (S.D. Tex. 2000) (‘‘[T]here is no reliable evidence before this court on the amount of benzene, from gasoline or any other source, to which Mr. Castellow was exposed.’’)

Moore v. Ashland Chemical Inc., 151 F.3d 269, 278 (5th Cir. 1998) (en banc);

Allen v. Pennsylvania Engineering Corp., 102 F.3d 194, 198-99 (5th Cir. 1996)

6th Circuit

Pluck v. BP Oil Pipeline Co., 640 F.3d 671 (6th Cir. 2011)(affirming district court’s exclusion of Dr. James Dahlgren; noting that he lacked reliable data to support his conclusion of heavy benzene exposure; holding that without quantifiable exposure data, Dahlgren’s causation opinion was mere “speculation and conjecture”)

Nelson v. Tennessee Gas Pipeline Co., 243 F.3d 244, 252 (6th Cir. 2001)(noting ‘‘with respect to the question of dose, plaintiffs cannot dispute that [their expert] made no attempt to determine what amount of PCB exposure the Lobelville subjects had received and simply assumed that it was sufficient to make them ill.’’)

Conde v. Velsicol Chemical Corp., 24 F.3d 809, 810 (6th Cir. 1994)(excluding expert testimony that chlordane, although an acknowledged carcinogen that was applied in a manner that violated federal criminal law, caused plaintiff’s injuries when expert witness’s opinion was based upon high-dose animal studies as opposed to the low-exposure levels experienced by the plaintiffs)

7th Circuit

Cunningham v. Masterwear Corp., 2007 WL 1164832 (S.D. Ind., Apr. 19, 2007)(excluding plaintiff’s expert witnesses who opined without valid evidence of plaintiffs’ exposure to perchloroethylene (PCE)), aff’d, 569 F.3d 673 (7th Cir. 2009) (Posner, J.)(affirming exclusion of expert witness and grant of summary judgment)

Wintz v. Northrop Corp., 110 F.3d 508, 513 (7th Cir. 1997)

Schmaltz v. Norfolk & Western Ry. Co., 878 F. Supp. 1119, 1122 (N.D. Ill. 1995) (excluding expert witness opinion testimony that was offered in ignorance of plaintiff’s level of exposure to herbicide)

8th Circuit

Junk v. Terminix Intern. Co. Ltd. Partnership, 594 F. Supp. 2d 1062, 1073 (S.D. Iowa 2008).

Medalen v. Tiger Drylac U.S.A., Inc., 269 F. Supp. 2d 1118, 1132 (D. Minn. 2003)

National Bank of Commerce v. Associated Milk Producers, Inc., 22 F. Supp. 2d 942 (E.D. Ark. 1998)(excluding causation opinion that lacked exposure level data), aff’d, 191 F.3d 858 (8th Cir. 1999)

Bednar v. Bassett Furniture Mfg. Co., Inc., 147 F.3d 737, 740 (8th Cir. 1998) (“The Bednars had to make a threshold showing that the dresser exposed the baby to levels of gaseous formaldehyde known to cause the type of injuries she suffered”)

Wright v. Willamette Industries, Inc., 91 F.3d 1105, 1106 (8th Cir. 1996) (affirming exclusion; requiring evidence of actual exposure to levels of substance known to cause claimed injury)

National Bank of Commerce v. Dow Chemical Co., 965 F. Supp. 1490, 1502 (E.D. Ark., 1996)

9th Circuit

In re Bextra & Celebrex Marketing Sales Practices & Product Liab. Litig., 524 F. Supp. 2d 1166, 1180 (N.D. Cal. 2007)(granting Rule 702 exclusion of expert witness’s opinions with respect to low dose, but admitting opinions with respect to high dose Bextra and Celebrex)

Henricksen v. ConocoPhillips Co., 605 F. Supp. 2d 1142, 1157 (E.D. Wash. 2009)

Valentine v. Pioneer Chlor Alkali Co., Inc., 921 F. Supp. 666, 676 (D. Nev. 1996)

Abuan v. General Electric Co., 3 F.3d 329, 333 (9th Cir. 1993) (Guam)

10th Circuit

Maddy v. Vulcan Materials Co., 737 F.Supp. 1528, 1533 (D.Kan. 1990) (noting the lack of any scientific evidence of the level or duration of plaintiff’s exposure to specific toxins).

Estate of Mitchell v. Gencorp, Inc., 968 F. Supp. 592, 600 (D. Kan. 1997), aff’d, 165 F.3d 778, 781 (10th Cir. 1999)

11th Circuit

Brooks v. Ingram Barge Co., 2008 WL 5070243 *5 (N.D. Miss. 2008) (noting that plaintiff’s expert witness “acknowledges that it is unclear how much exhaust Brooks was exposed to, how much exhaust it takes to make developing cancer a probability, or how much other factors played a role in Brooks developing cancer.”)

Cuevas v. E.I. DuPont de Nemours & Co., 956 F. Supp. 1306, 1312 (S.D. Miss. 1997)

Chikovsky v. Ortho Pharmaceutical Corp., 832 F. Supp. 341, 345–46 (S.D. Fla. 1993)(excluding opinion of an expert witness who did not know plaintiff’s actual exposure or dose of Retin-A, and the level of absorbed Retin-A that is unsafe for gestating women)

Savage v. Union Pacific RR, 67 F. Supp. 2d 1021 (E.D. Ark. 1999)

 

STATE CASES

California

Jones v. Ortho Pharmaceutical Corp., 163 Cal. App. 3d 396, 404, 209 Cal. Rptr. 456, 461 (1985)(duration of use in relied upon studies not relevant to plaintiffs’ use)

Michigan

Nelson v. American Sterilizer Co., 566 N.W. 2d 671 (Mich. Ct. App. 1997)(affirming exclusion of expert witness who opined, based upon high-dose animal studies, that plaintiff’s liver disease was caused by low-level exposure to chemicals used in sterilizing medical equipment)

Mississippi

Watts v. Radiator Specialty Co., 2008 WL 2372694 *3 (Miss.2008);

Ohio

Valentine v. PPG Indus., Inc., 158 Ohio App. 3d 615, 821 N.E.2d 580 (2004)

Oklahoma

Christian v. Gray, 2003 Okla. 10, 65 P.3d 591, 601 (2003);

Holstine v. Texaco, 2001 WL 605137 (Okla. Dist. Ct. 2001)(excluding expert witness testimony that failed to assess plaintiff’s short-term, low-level benzene exposure as fitting the epidemiology relied upon to link plaintiff’s claimed injury with his exposure)

Texas

Merrell Dow Pharm., Inc. v. Havner, 953 S.W.2d 706, 720 (Tex. 1997) (“To raise a fact issue on causation and thus to survive legal sufficiency review, a claimant must do more than simply introduce into evidence epidemiological studies that show a substantially elevated risk. A claimant must show that he or she is similar to those in the studies.”).

Merck & Co. v. Garza, 347 S.W.3d 256 (Tex. 2011)

Frias v. Atlantic Richfield Co., 104 S.W.3d 925, 929 (Tex. App. Houston 2003)(holding that plaintiffs’ expert witness’s testimony was inadmissible for relying upon epidemiologic studies that involved much higher levels of exposure than experienced by plaintiff)

Daniels v. Lyondell-Citgo Refining Co, 99 S.W.3d 722 (Tex. App. 2003) (claim that benzene exposure caused plaintiff’s lung cancer had to be supported with studies of comparable exposure, and latency, as that observed and reported in the studies)

Austin v. Kerr-McGee Refining Corp., 25 S.W.3d 280, 292 (Tex. App. Texarkana 2000)

Giving Rule 703 the Cold Shoulder

May 12th, 2012

I have written previously about the gap in Rule 702, which provides a multi-factorial test for the admissibility of an opinion from a properly qualified expert witness:

(a) the expert’s scientific, technical, or other specialized knowledge will help the trier of fact to understand the evidence or to determine a fact in issue;

(b) the testimony is based on sufficient facts or data;

(c) the testimony is the product of reliable principles and methods; and

(d) the expert has reliably applied the principles and methods to the facts of the case.

Noticeably absent from Rule 702 is any requirement that the facts or data upon which the expert witness relies be worth a damn.  From Rule 702(b), (c), and (d) alone, an expert witness, armed with sufficient unreliable, fraudulent, imaginary, or simply incorrect facts and data, using reliable principles and methods, and applying those principles and methods reliably to the facts of the case, gets to testify at trial.  Arguably, the first subsection, Rule 702(a), which limits testimony to helpful “knowledge” provides an overriding condition that helps to qualify the next three.  It is difficult to imagine that knowledge is based upon unreliable facts and data.

Still, the failure to require reliable data explicitly within the scope of Rule 702 is disturbing.  This unhappy state of affairs, in which courts do not exercise gatekeeping over the quality of the data themselves, is apparently the law of the United States Court of Appeals for the Tenth Circuit.

In Pritchett v. I-Flow Corporation, the plaintiff had shoulder surgery, which required the use of a “pain pump” to inject anesthetic medication into the shoulder post-operatively.  The plaintiff went on to develop “chondrolysis” in his shoulder joint, a condition that involves partial or complete loss of cartilage in the shoulder joint.  Pritchett v. I-Flow Corp., Civil Action No. 09-cv-02433-WJM-KLM. (D. Colo. April 17, 2012) (Mix, J., Magistrate Judge).

The opinion is a mechanical recitation of Daubert procedure and method, with little analysis of the expert witness’s opinion, until the magistrate judge describes the requirement of Rule 702 (b) for “sufficient facts and data”:

“i. Sufficient Facts and Data

The proponent of the opinion must first show that the witness gathered “sufficient facts and data” to formulate the opinion. In the Tenth Circuit, assessment of the sufficiency of the facts and data used by the witness is a quantitative, rather than a qualitative, analysis. Fed. R. Evid. 702, Advisory Committee Notes to 2000 Amendments; see also United States v. Lauder, 409 F.3d 1254, 1264 n.5 (10th Cir. 2005). That is to say, the Court does not examine whether the facts obtained by the witness are themselves reliable; whether the facts used are qualitatively reliable is a question of the weight that should be given to the opinion by the fact-finder, not the admissibility of the opinion. Lauder, 409 F.3d at 1264. Instead, “this inquiry examines only whether the witness obtained the amount of data that the methodology itself demands.” Crabbe, 556 F. Supp. 2d at 1223.”

Pritchett v. I-Flow Corp. (emphasis added).  That is to say: the whole gatekeeping enterprise is really about appearances and not about trying to ensure more accurate fact finding.

If the court’s analysis of Rule 702 should be correct, it is in any event an incomplete analysis that omits the important role of Rule 703:

Rule 703. Bases of an Expert’s Opinion Testimony

An expert may base an opinion on facts or data in the case that the expert has been made aware of or personally observed. If experts in the particular field would reasonably rely on those kinds of facts or data in forming an opinion on the subject, they need not be admissible for the opinion to be admitted. But if the facts or data would otherwise be inadmissible, the proponent of the opinion may disclose them to the jury only if their probative value in helping the jury evaluate the opinion substantially outweighs their prejudicial effect.

According to Magistrate Judge Mix, the reliability of the facts and data does not count for gatekeeping.  Chalk up another loophole to the law’s requirement of reliable scientific evidence.

 

Philadelphia Plaintiff’s Claims Against Fixodent Prove Toothless

May 2nd, 2012

In Milward, Martyn Smith got a pass from the U.S. Court of Appeals for the First Circuit on his “weight of the evidence” (WOE) approach to formulating an opinion as an expert witness.  Last week, Smith’s WOE did not fare so well.  The Honorable Sandra Mazer Moss, in one of her last rulings as judge presiding over the Philadelphia Court of Common Pleas mass tort program, sprinkled some cheer to dispel WOE in Jacoby v. Rite Aid PCCP (Order of April 27, 2012; Opinion of April 12, 2012).

Applying Pennsylvania’s Frye standard, Judge Moss upheld Procter & Gamble’s challenge to Dr. Martyn Smith, as well as two other plaintiff expert witnesses, Dr. Ebbing Lautenbach and Dr. Frederick Askari.  The plaintiff, Mr. Mark Jacoby, used Fixodent for six years before he first experienced paresthesias and numbness in his hands and feet.  Jacoby’s expert witnesses claimed that Fixodent contains zinc compounds, which are released upon use, and are absorbed into the blood stream.  Very high zinc levels suppress copper levels, and cause a copper deficiency myeloneuropathy.  Finding that the plaintiff’s causal claims were toothless in the face of sound science, Judge Moss excluded the reports and proffered testimony of Drs. Smith, Askari, and Lautenbach.

Although Pennsylvania courts follow a Frye standard, Judge Moss followed the lead of a federal judge, who had previously examined the same body of evidence, and who excluded plaintiff’s expert witnesses, under Federal Rule of Evidence 702, in In re Denture Cream Prods. Liab. Litig., 795 F. Supp. 2d 1345 (S.D. Fla. 2011).  Without explication, Judge Moss stated that Judge Altonaga’s reasoning and conclusions, reached under federal law, were “very persuasive” under Frye.  Moss Opinion at 5.  In particular, Judge Moss appeared to be impressed by the lack of baseline incidence data on copper deficiency myeloneuropathy, the lack of exposure-response information, and the lack of risk ratios for any level of use of Fixodent.  Id. at 6-10.

Judge Moss accepted at face value Martyn Smith’s claims that WOE can be used to demonstrate causation when no individual study is conclusive.  Her Honor did, however, look more critically at the component parts of Smith’s particular application of WOE in the Jacoby case.  Smith used various steps of extrapolation, dose-response, and differential diagnosis in applying WOE, but these steps were woefully unsound.  Id. at 9.  There was no evidence of how low, and for how long, a person’s copper levels must drop before injury results.  Having attacked Procter & Gamble’s pharmacokinetic studies, the plaintiffs’ expert witnesses had no basis for inferring levels for any plaintiff.  Furthermore, the plaintiffs’ witnesses had no baseline incidence data, and no risk ratios to apply for any level of exposure to, or use of, defendant’s product.

Predictably, plaintiffs invoked the pass that Smith received in Milward, but Judge Moss easily distinguished Milward as having involved baseline rates and risk ratios (even if Smith may have imagined the data to calculate those ratios).

Another plaintiff witness, Dr. Askari, used a method he called the “totality of the evidence” (TOE) approach.  In short, TOE is WOE is NO good, as applied in this case.  Id. at 10-11.

Finally, another plaintiff’s witness, Dr. Lautenbach, applied the Naranjo Adverse Drug Reaction Probability Scale, by which he purported to transmute case reports and case series into a conclusion of causality.  Actually, Lautenbach seems to have claimed that the lack of analytical epidemiologic studies supporting an association between Fixodent and myeloneuropathy did not refute the existence of a causal relationship.  Of course, this lack of evidence hardly supports the causal relationship.  Judge Moss assumed that Lautenbach was actually asserting a causal relationship, but since he was relying upon the same woefully, toefully flawed body of evidence, Her Honor excluded Dr. Lautenbach as well.  Id. at 12.

WOE-fully Inadequate Methodology – An Ipse Dixit By Another Name

May 1st, 2012

Take all the evidence, throw it into the hopper, close your eyes, open your heart, and guess the weight.  You could be a lucky winner!  The weight of the evidence suggests that the weight-of-the-evidence (WOE) method is little more than subjective opinion, but why care if it helps you to get to a verdict?

The scientific community has never been seriously impressed by the so-called weight of the evidence (WOE) approach to determining causality.  The phrase is vague and ambiguous; its use, inconsistent. See, e.g., V. H. Dale, G.R. Biddinger, M.C. Newman, J.T. Oris, G.W. Suter II, T. Thompson, et al., “Enhancing the ecological risk assessment process,” 4 Integrated Envt’l Assess. Management 306 (2008)(“An approach to interpreting lines of evidence and weight of evidence is critically needed for complex assessments, and it would be useful to develop case studies and/or standards of practice for interpreting lines of evidence.”);  Igor Linkov, Drew Loney, Susan M. Cormier, F. Kyle Satterstrom, Todd Bridges, “Weight-of-evidence evaluation in environmental assessment: review of qualitative and quantitative approaches,” 407 Science of Total Env’t 5199–205 (2009); Douglas L. Weed, “Weight of Evidence: A Review of Concept and Methods,” 25 Risk Analysis 1545 (2005) (noting the vague, ambiguous, indefinite nature of the concept of “weight of evidence” review);   R.G. Stahl Jr., “Issues addressed and unaddressed in EPA’s ecological risk guidelines,” 17 Risk Policy Report 35 (1998) (noting that U.S. Environmental Protection Agency’s guidelines for ecological weight-of-evidence approaches to risk assessment fail to provide guidance); Glenn W. Suter II, Susan M. Cormier, “Why and how to combine evidence in environmental assessments:  Weighing evidence and building cases,” 409 Science of the Total Environment 1406, 1406 (2011)(noting arbitrariness and subjectivity of WOE “methodology”).

 

General Electric v. Joiner

Most savvy judges quickly figured out that weight of the evidence (WOE) was suspect methodology, woefully lacking, and indeed, not really a methodology at all.

The WOE method was part of the hand waving in Joiner by plaintiffs’ expert witnesses, including the frequent testifier Rabbi Teitelbaum.  The majority recognized that Rabbi Teitelbaum’s WOE weighed in at less than a peppercorn, and affirmed the district court’s exclusion of his opinions.  The Joiner Court’s assessment provoked a dissent from Justice Stevens, who was troubled by the Court’s undressing of the WOE methodology:

“Dr. Daniel Teitelbaum elaborated on that approach in his deposition testimony: ‘[A]s a toxicologist when I look at a study, I am going to require that that study meet the general criteria for methodology and statistical analysis, but that when all of that data is collected and you ask me as a patient, Doctor, have I got a risk of getting cancer from this? That those studies don’t answer the question, that I have to put them all together in my mind and look at them in relation to everything I know about the substance and everything I know about the exposure and come to a conclusion. I think when I say, “To a reasonable medical probability as a medical toxicologist, this substance was a contributing cause,” … to his cancer, that that is a valid conclusion based on the totality of the evidence presented to me. And I think that that is an appropriate thing for a toxicologist to do, and it has been the basis of diagnosis for several hundred years, anyway’.

* * * *

Unlike the District Court, the Court of Appeals expressly decided that a ‘weight of the evidence’ methodology was scientifically acceptable. To this extent, the Court of Appeals’ opinion is persuasive. It is not intrinsically “unscientific” for experienced professionals to arrive at a conclusion by weighing all available scientific evidence—this is not the sort of ‘junk science’ with which Daubert was concerned. After all, as Joiner points out, the Environmental Protection Agency (EPA) uses the same methodology to assess risks, albeit using a somewhat different threshold than that required in a trial.  Petitioners’ own experts used the same scientific approach as well. And using this methodology, it would seem that an expert could reasonably have concluded that the study of workers at an Italian capacitor plant, coupled with data from Monsanto’s study and other studies, raises an inference that PCB’s promote lung cancer.”

General Electric v. Joiner, 522 U.S. 136, 152-54 (1997)(Stevens, J., dissenting)(internal citations omitted)(confusing critical assessment of studies with WOE; and quoting Rabbi Teitelbaum’s attempt to conflate diagnosis with etiological attribution).  Justice Stevens could reach his assessment only by ignoring the serious lack of internal and external validity in the studies relied upon by Rabbi Teitelbaum.  Those studies did not support his opinion individually or collectively.

Justice Stevens was wrong as well about the claimed scientific adequacy of WOE.  Courts have long understood that precautionary, preventive judgments of regulatory agencies are different from scientific conclusions that are admissible in civil and criminal litigation.  See Allen v. Pennsylvania Engineering Corp., 102 F.3d 194 (5th Cir. 1996)(WOE, although suitable for regulatory risk assessment, is not appropriate in civil litigation).  Justice Stevens’ characterization of WOE was little more than judicial ipse dixit, and it was, in any event, not the law; it was the argument of a dissenter.

 

Milward v. Acuity Specialty Products

Admittedly, dissents can sometimes help lower court judges chart a path of evasion and avoidance of a higher court’s holding.  In Milward, Justice Stevens’ mischaracterization of WOE and scientific method was adopted as the legal standard for expert witness testimony by a panel of the United States Court of Appeals for the First Circuit.  Milward v. Acuity Specialty Products Group, Inc., 664 F.Supp. 2d 137 (D. Mass. 2009), rev’d, 639 F.3d 11 (1st Cir. 2011), cert. denied, U.S. Steel Corp. v. Milward, ___ U.S. ___, 2012 WL 33303 (2012).

Mr. Milward claimed that he was exposed to benzene as a refrigerator technician, and developed acute promyelocytic leukemia (APL) as a result.  664 F. Supp. 2d at 140. In support of his claim, Mr. Milward offered the testimony of Dr. Martyn T. Smith, a toxicologist, who testified that the “weight of the evidence” supported his opinion that benzene exposure causes APL. Id. Smith, in his litigation report, described his methodology as an application of WOE:

“The term WOE has come to mean not only a determination of the statistical and explanatory power of any individual study (or the combined power of all the studies) but the extent to which different types of studies converge on the hypothesis. In assessing whether exposure to benzene may cause APL, I have applied the Hill considerations. Nonetheless, application of those factors to a particular causal hypothesis, and the relative weight to assign each of them, is both context dependent and subject to the independent judgment of the scientist reviewing the available body of data. For example, some WOE approaches give higher weight to mechanistic information over epidemiological data.”

Smith Report at ¶¶19, 21 (citing Sheldon Krimsky, “The Weight of Scientific Evidence in Policy and Law,” 95(S1) Am. J. Public Health 5130, 5130-31 (2005))(March 9, 2009).  Smith marshaled several bodies of evidence, which he claimed collectively supported his opinion that benzene causes APL.  Milward, 664 F. Supp. 2d at 143.

Milward also offered the testimony of a philosophy professor, Carl F. Cranor, for the opinion that WOE was an acceptable methodology, and that all scientific inference is subject to judgment.  This is the same Cranor who, advocating for open admissions of all putative scientific opinions, showcased his confusion between statistical significance probability and the posterior probability involved in a conclusion of causality.  Carl F. Cranor, Regulating Toxic Substances: A Philosophy of Science and the Law at 33-34(Oxford 1993)(“One can think of α, β (the chances of type I and type II errors, respectively) and 1- β as measures of the “risk of error” or “standards of proof.”) See also id. at 44, 47, 55, 72-76.
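The gap between the two quantities can be shown with a few lines of arithmetic.  The sketch below, in Python, applies Bayes’ theorem to purely hypothetical inputs: a prior probability of 0.10 that the causal hypothesis is true, an α of 0.05, and power (1 - β) of 0.80.  None of these numbers comes from Cranor’s book or from the Milward record, and the function name is an invention for illustration.  The only point is that α is not the probability that a conclusion of causality is mistaken.

```python
def posterior_given_significant(prior, alpha, power):
    """P(hypothesis true | statistically significant result), by Bayes' theorem."""
    true_positive = power * prior            # hypothesis true, and the test detects it
    false_positive = alpha * (1.0 - prior)   # hypothesis false, but the test "rejects the null" anyway
    return true_positive / (true_positive + false_positive)


# Hypothetical inputs, chosen only for illustration.
prior = 0.10   # assumed prior probability that the causal hypothesis is true
alpha = 0.05   # type I error rate
power = 0.80   # 1 - beta

posterior = posterior_given_significant(prior, alpha, power)
print(f"alpha = {alpha:.2f}; posterior = {posterior:.2f}")
```

On these assumed inputs, the posterior probability comes out at 0.64, nowhere near 1 - α = 0.95.  A different prior would yield a different posterior, which is exactly why α and β cannot serve as “standards of proof.”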

After a four-day evidentiary hearing, the district court found that Martyn Smith’s opinion was merely a plausible hypothesis, and not admissible.  Milward, 664 F. Supp. 2d at 149.  The Court of Appeals, in an opinion by Chief Judge Lynch, however, reversed and ruled that an inference of general causation based on a WOE methodology satisfied the reliability requirement for admission under Federal Rule of Evidence 702.  639 F.3d at 26.  According to the Circuit, WOE methodology was scientifically sound.  Id. at 22-23.

 

WOE Cometh

Because the WOE methodology is not well described, either in the published literature or in Martyn Smith’s litigation report, it is difficult to understand exactly what the First Circuit approved by reversing Smith’s exclusion.  Usually the burden is on the proponent of the opinion testimony, and one would have thought that the vagueness of the described methodology would count against admissibility.  It is hard to escape the conclusion that the Circuit elevated a poorly described method, best characterized as hand waving, into a description of scientific method.

The Panel appeared to have been misled by Carl F. Cranor, who described “inference to the best explanation” as requiring a scientist to “consider all of the relevant evidence” and “integrate the evidence using professional judgment to come to a conclusion about the best explanation.”  Id. at 18.  The available explanations are then weighed, and a would-be expert witness is free to embrace the one he feels offers the “best” explanation.  The appellate court’s opinion takes WOE, combined with Cranor’s “inference to the best explanation,” to hold that an expert witness need only opine that he has considered the range of plausible explanations for the association, and that he believes that the causal explanation is the best or “most plausible.”  Id. at 20 (upholding this approach as “methodologically reliable”).

What is missing of course is the realization that plausible does not mean established, reasonably certain, or even more likely than not.  The Circuit’s invocation of plausibility also obscures the indeterminacy of the available data for supporting a reliable conclusion of causation in many cases.

Curiously, the Panel likened WOE to the use of differential diagnosis, which is a method for inferring the specific cause of a particular patient’s disease or disorder.  Id. at 18.  This is a serious confusion between a method concerned with general causation and one concerned with specific causation.  By the principle of charity, we might allow that the First Circuit was thinking of some process of differential etiology rather than diagnosis, given that diagnoses (other than for infectious diseases and a few pathognomonic disorders) do not usually carry with them information about unique etiologic agents.  But even such a process of differential etiology is a well-structured disjunctive syllogism of the form:

A v B v C

~A ∧ ~B

∴ C

There is nothing subjective about assigning weights or drawing inferences in applying such a syllogism.  In the Milward case, one of the propositional facts that might well have explained the available evidence was chance, but plaintiff’s expert witness Smith could not and did not rule out chance, because the studies upon which he relied were not statistically significant.  Smith could thus never get past “therefore” in any syllogism or in any other recognizable process of reasoning.
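The logical point can be checked mechanically.  The following is a minimal sketch in Python, not anything drawn from the Milward record: it enumerates every truth assignment for three propositions and tests whether the premises entail the conclusion.  The labels are assumptions made for illustration, with A standing in for chance, B for some other non-causal explanation, and C for causation.

```python
from itertools import product


def entails(premises, conclusion, n_vars=3):
    """True if every truth assignment satisfying all premises also satisfies the conclusion."""
    return all(
        conclusion(*vals)
        for vals in product([False, True], repeat=n_vars)
        if all(p(*vals) for p in premises)
    )


# Illustrative labels (assumptions): a = chance, b = another non-causal explanation, c = causation.
disjunction = lambda a, b, c: a or b or c              # A v B v C
not_a_and_not_b = lambda a, b, c: (not a) and (not b)  # both alternatives ruled out
only_not_b = lambda a, b, c: not b                     # chance (A) left standing
conclusion_c = lambda a, b, c: c                       # therefore C

full_syllogism_valid = entails([disjunction, not_a_and_not_b], conclusion_c)
chance_not_excluded_valid = entails([disjunction, only_not_b], conclusion_c)

print(full_syllogism_valid)       # True
print(chance_not_excluded_valid)  # False
```

With both alternatives excluded, the conclusion follows in every case.  Leave chance standing, and the assignment in which A is true and C is false satisfies the premises while defeating the conclusion.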

The Circuit Court provides no insight into the process Smith used to weigh the available evidence, and it failed to address the analytical gaps and evidentiary insufficiencies addressed by the trial court, other than to invoke the mantra that all these issues go to “the weight, not the admissibility” of Smith’s opinions.  This, of course, is a conclusion, not an explanation or a legal theory.

There is also a cute semantic trick lurking in plaintiffs’ position in Milward, which results from their witnesses describing their methodology as “WOE.”  Since the jury is charged with determining the “weight of the evidence,” any evaluation of the WOE would be an invasion of the province of the jury.  Milward, 639 F.3d at 20. QED by the semantic device of deliberately conflating the name of the putative scientific methodology with the term traditionally used to describe jury fact finding.

In any event, the Circuit’s chastisement of the district court for evaluating Smith’s implementation of the WOE methodology, his logical, mathematical, and epidemiological errors, his result-driven reinterpretation of study data, threatens to read an Act of Congress (the Federal Rules of Evidence, and especially Rules 702 and 703) out of existence by judicial fiat.  The Circuit’s approach is also at odds with Supreme Court precedent (now codified in Rule 702) on the importance and the requirement of evaluating opinion testimony for analytical gaps and the ipse dixit of expert witnesses.  General Electric Co. v. Joiner, 522 U.S. 136, 146 (1997).

 

Smith’s Errors in Recalculating Odds Ratios of Published Studies

In the district court, the defendants presented testimony of an epidemiologist, Dr. David H. Garabrant, who took Smith to task for calculating risk ratios incorrectly.  Smith did not have any particular expertise in epidemiology, and his faulty calculations were problematic from the perspective of both Rule 702 and Rule 703.  The district court found the criticisms of Smith’s calculations convincing, 664 F. Supp. 2d at 149, but the appellate court held that the technical dispute was for the jury; “both experts’ opinions are supported by evidence and sound scientific reasoning,” Milward, 639 F.3d at 24.  This ruling is incomprehensible.  Plaintiffs had the burden of showing not only the admissibility of Smith’s opinion generally, but also the reasonableness of his reliance upon the calculated odds ratio.  The defendants had no burden of persuasion on the issue of Smith’s calculations, but they presented testimony, which apparently carried the day.  The appellate court had no basis for reversing the specific ruling with respect to the erroneously calculated risk ratio.

 

Smith’s Reliance upon Statistically Insignificant Studies

Smith relied upon studies that were not statistically significant at any accepted level.  An opinion of causality requires a showing that chance, bias, and confounding have been excluded in assessing an existing association.  Smith failed to exclude chance as an explanation for the association, and the burden to make this exclusion was on the plaintiffs. This failure was not something that could readily be patched by adverting to other evidence of studies in animals or in test tubes.    The Court of Appeals excused the important analytical gap in plaintiffs’ witness’s opinion because APL is rare, and data collection is difficult in the United States.  Id. at 24.  Evidence “consistent with” and “suggestive of” the challenged witness’s opinion thus suffices.  This is a remarkable homeopathic dilution of both legal and scientific causation.  Now we have a rule of law that allows plaintiffs to be excused from having to prove their case with reliable evidence if they allege a rare disease for which they lack evidence.
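What it means for a small study to fail to exclude chance can be illustrated with a short calculation.  The sketch below, in Python, computes an odds ratio and an approximate 95% confidence interval by the standard log (Woolf) method from a 2x2 table.  The counts are hypothetical, invented only to mimic a small case-control study of a rare disease; they are not taken from any study on which Smith relied.

```python
import math


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Odds ratio and approximate confidence interval (log/Woolf method).

    a: exposed cases      b: unexposed cases
    c: exposed controls   d: unexposed controls
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts for a small case-control study (illustration only).
odds_ratio, lower, upper = odds_ratio_with_ci(a=6, b=4, c=30, d=40)

print(f"OR = {odds_ratio:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
print("interval includes 1.0:", lower < 1.0 < upper)
```

On these invented counts, the point estimate is a doubling of the odds, but the interval runs from well below 1.0 to well above it.  An association of that sort is “consistent with” causation and equally consistent with no association at all.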

 

Leveling the Hierarchy of Evidence

Imagine trying to bring a medication to market with a small case-control study, with a non-statistically significant odds ratio!  Oh, but these clinical trials are so difficult and expensive; and they take such a long time.  Like a moment’s thought, when thinking is so hard and a moment such a long time.  We would be quite concerned if the FDA abridged the standard for causal efficacy in the licensing of new medications; we should be just as concerned about judicial abridgments of standards for causation of harm in tort actions.

Leveling the hierarchy of evidence has been an explicit or implicit goal of several law professors.  Some of the leveling efforts even show up in the new Reference Manual for Scientific Evidence (RMSE 3d ed. 2011).  See “New-Age Levellers – Flattening Hierarchy of Evidence.”

The Circuit, in Milward, quoted an article published in the Journal of the National Cancer Institute by Michele Carbone and others who suggest that there should be no hierarchy, but the Court ignored a huge body of literature that explains and defends the need for recognizing that not all study designs or types are equal.  Interestingly, the RMSE chapter on epidemiology by Professor Green (see more below) cites the same article.  RMSE 3d at 564 & n.48 (citing and quoting symposium paper that “[t]here should be no hierarchy [among different types of scientific methods to determine cancer causation]. Epidemiology, animal, tissue culture and molecular pathology should be seen as integrating evidences in the determination of human carcinogenicity.” Michele Carbone et al., “Modern Criteria to Establish Human Cancer Etiology,” 64 Cancer Res. 5518, 5522 (2004).)  Carbone, of course, is best known for his advocacy of a viral cause (SV40) of human mesothelioma, a claim unsupported, and indeed contradicted, by epidemiologic studies.  Carbone’s statement does not support the RMSE chapter’s leveling of epidemiology and toxicology, and Carbone is, in any event, an unlikely source to cite.

The First Circuit, in Milward, studiously ignored a mountain of literature on evidence-based medicine, including the RMSE 3d chapter, “Reference Guide on Medical Testimony,” which teaches that leveling of study designs and types is inappropriate. The RMSE chapter devotes several pages to explaining the role of study design in assessing an etiological issue:

3. Hierarchy of medical evidence

With the explosion of available medical evidence, increased emphasis has been placed on assembling, evaluating, and interpreting medical research evidence.  A fundamental principle of evidence-based medicine (see also Section IV.C.5, infra) is that the strength of medical evidence supporting a therapy or strategy is hierarchical.

When ordered from strongest to weakest, systematic review of randomized trials (meta-analysis) is at the top, followed by single randomized trials, systematic reviews of observational studies, single observational studies, physiological studies, and unsystematic clinical observations. An analysis of the frequency with which various study designs are cited by others provides empirical evidence supporting the influence of meta-analysis followed by randomized controlled trials in the medical evidence hierarchy. Although they are at the bottom of the evidence hierarchy, unsystematic clinical observations or case reports may be the first signals of adverse events or associations that are later confirmed with larger or controlled epidemiological studies (e.g., aplastic anemia caused by chloramphenicol, or lung cancer caused by asbestos). Nonetheless, subsequent studies may not confirm initial reports (e.g., the putative association between coffee consumption and pancreatic cancer).

John B. Wong, Lawrence O. Gostin, and Oscar A. Cabrera, “Reference Guide on Medical Testimony,” RMSE 3d 687, 723-24 (2011)(internal footnotes omitted).   The implication that there is no hierarchy of evidence in causal inference, and that tissue culture studies are as relevant as epidemiology, is patently absurd. The Circuit not only went out on a limb, it managed to saw the limb off, while “out there.”

 

Milward – Responses Critical and Otherwise

The First Circuit’s decision in Milward made an immediate impression upon those writers who have worked hard to dismantle or marginalize Rule 702.  The Circuit’s decision was mysteriously cited with obvious approval by Professor Margaret Berger, even though she had died before the decision was published!  Margaret A. Berger, “The Admissibility of Expert Testimony,” RMSE 3d at 20 & n.51 (2011).  Professor Michael Green, one of the reporters for the ALI’s Restatement (Third) of Torts, hyperbolically called Milward “[o]ne of the most significant toxic tort causation cases in recent memory.”  Michael D. Green, “Introduction: Restatement of Torts as a Crystal Ball,” 37 Wm. Mitchell L. Rev. 993, 1009 n.53 (2011).

The WOE approach, and its embrace in Milward, obscures the reality that sometimes the evidence does not logically or analytically support the offered conclusion, and at other times, the best explanation is uncertainty.  By adopting the WOE approach, vague and ambiguous as it is, the Milward Court was beguiled into holding that WOE determinations are for the jury.  The lack of meaningful content of WOE means that decisions such as Milward effectively remove the gatekeeping function, or permit that function to be minimally satisfied by accepting an expert witness’s claim to have employed WOE.  The epistemic warrant required by Rule 702 is diluted if not destroyed.  Scientific hunch and speculation, proper in their place, can be passed off for scientific knowledge to gullible or result-oriented judges and juries.