TORTINI

For your delectation and delight, desultory dicta on the law of delicts.

Federal Rule of Evidence 702 Requires Perscrutations — Samaan v. St. Joseph Hospital (2012)

February 4th, 2012

After the dubious decision in Milward, the First Circuit would seem an unlikely forum for perscrutations of expert witness opinion testimony.  Milward v. Acuity Specialty Products Group, Inc., 639 F.3d 11 (1st Cir. 2011), cert. denied, ___ U.S. ___ (2012).  See “Milward – Unhinging the Courthouse Door to Dubious Scientific Evidence” (Sept. 2, 2011).  Late last month, however, a panel of the United States Court of Appeals for the First Circuit held that Rule 702 required perscrutation of expert witness opinion, and then proceeded to perscrutate perspicaciously, in Samaan v. St. Joseph Hospital, 2012 WL 34262 (1st Cir. 2012).

The plaintiff, Mr. Samaan, suffered an ischemic stroke, for which he was treated by the defendant hospital and physician.  Plaintiff claimed that the defendants’ treatment deviated from the standard of care by failing to administer intravenous tissue plasminogen activator (t-PA).  Id. at *1.  The plaintiff’s only causation expert witness, Dr. Ravi Tikoo, opined that the defendants’ failure to administer t-PA caused plaintiff’s neurological injury.  Id. at *2.  Dr. Tikoo’s opinions, as well as those of the defense expert witness, were based in large part upon data from a study done by one of the National Institutes of Health:  The National Institute of Neurological Disorders and Stroke rt-PA Stroke Study Group, “Tissue Plasminogen Activator for Acute Ischemic Stroke,” 333 New Engl. J. Med. 1581 (1995).

Both the District Court and the Court of Appeals noted that the problem with Dr. Tikoo’s opinions lay not in the unreliability of the data, or in the generally accepted view that t-PA can, under certain circumstances, mitigate the sequelae of ischemic stroke; rather the problem lay in the analytical gap between those data and Dr. Tikoo’s conclusion that the failure to administer t-PA caused Mr. Samaan’s stroke-related injuries.

The district court held that Dr. Tikoo’s opinion failed to satisfy the requirements of Rule 702. Id. at *8 – *9.  Dr. Tikoo examined odds ratios from the NINDS study, and others, and concluded that a patient’s chances of improved outcome after stroke increased 50% with t-PA, and thus Mr. Samaan’s healthcare providers’ failure to provide t-PA had caused his poor post-stroke outcome.  Id. at *9.  The appellate court similarly rejected the inference from an increased odds ratio to specific causation:

“Dr. Tikoo’s first analysis depended upon odds ratios drawn from the literature. These odds ratios are, as the term implies, ratios of the odds of an adverse outcome, which reflect the relative likelihood of a particular result.FN5 * * * Dr. Tikoo opined that the plaintiff more likely than not would have recovered had he received the drug.”

Id. at *10.

The Court correctly identified the expert witness’s mistake in inferring specific causation from an odds ratio of about 1.5, without any additional information.  The Court characterized the testimonial flaw as one of “lack of fit,” but it was equally an unreliable inference from epidemiologic data to a conclusion about specific causation.

While the Court should be applauded for rejecting the incorrect inference about specific causation, we might wish that it had been more careful about important details.  The Court misinterpreted the meaning of an odds ratio to be a relative risk.  The NINDS study reported risk ratio results both as an odds ratio and as a relative risk.  The Court’s sloppiness should be avoided; the two statistics are different, especially when the outcome of interest is not particularly rare.

Still, the odds ratio is interesting and important as an approximation for the relative risk, and neither measure of risk can substitute for causation, especially when the magnitude of the risk is small, and less than two-fold.  The First Circuit recognized and focused in on this gap between risk and causal attribution in an individual’s case:

“[Dr. Tikoo’s] reasoning is structurally unsound and leaves a wide analytical gap between the results produced through the use of odds ratios and the conclusions drawn by the witness. When a person’s chances of a better outcome are 50% greater with treatment (relative to the chances of those who were not treated), that is not the same as a person having a greater than 50% chance of experiencing the better outcome with treatment. The latter meets the required standard for causation; the former does not.  To illustrate, suppose that studies have shown that 10 out of a group of 100 people who do not eat bananas will die of cancer, as compared to 15 out of a group of 100 who do eat bananas. The banana-eating group would have an odds ratio of 1.5 or a 50% greater chance of getting cancer than those who eschew bananas. But this is a far cry from showing that a person who eats bananas is more likely than not to get cancer.

Even if we were to look only at the fifteen persons in the banana-eating group who did get cancer, it would not be likely that any particular person in that cohort got it from the consumption of bananas. Correlation is not causation, and a substantial number of persons with cancer within the banana-eating group would in all probability have contracted the disease whether or not they ate bananas.FN6

We think that this example exposes the analytical gap between Dr. Tikoo’s methods and his conclusions.  Although he could present figures ranging higher than 50%, those figures were not responsive to the question of causation. Let us take the “stroke scale” figure from the NINDS study as an example. This scale measures the neurological deficits in different parts of the nervous system. Twenty percent of patients who experienced a stroke and were not treated with t-PA had a favorable outcome according to this scale, whereas that figure escalated to 31% when t-PA was administered.

Although this means that the patients treated with t-PA had over a 50% better chance of recovery than they otherwise would have had, 69% of those patients experienced the adverse outcome (stroke-related injury) anyway.FN7  The short of it is that while the odds ratio analysis shows that a t-PA patient may have a better chance of recovering than he otherwise would have had without t-PA, such an analysis does not show that a person has a better than even chance of avoiding injury if the drug is administered. The odds ratio, therefore, does not show that the failure to give t-PA was more likely than not a substantial factor in causing the plaintiff’s injuries. The unavoidable conclusion from the studies deemed authoritative by Dr. Tikoo is that only a small number of patients overall (and only a small fraction of those who would otherwise have experienced stroke-related injuries) experience improvement when t-PA is administered.”

*11 and n.6 (citing Milward).

The court in Samaan thus suggested, but did not state explicitly, that the study would have to have shown better than a 100% increase in the rate of recovery for attributability to have exceeded 50%.  The Court’s timidity is regrettable. Yes, Dr. Tikoo’s confusing the percentage increased risk with the percentage of attributability was quite knuckleheaded.  I doubt that many would want to subject themselves to Dr. Tikoo’s quality of care, at least not his statistical care.  The First Circuit, however, stopped short of stating what magnitude increase in risk would permit an inference of specific causation for Mr. Samaan’s post-stroke sequelae.
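The arithmetic behind the 100% threshold can be sketched briefly. What follows is only an illustration, resting on simplifying assumptions that are mine and not the Court’s: that the relative risk is free of bias and confounding, and that the share of outcomes attributable to the exposure (here, treatment) among the exposed is (RR − 1)/RR. The function name is hypothetical; the inputs are the figures quoted in the opinion.

```python
def attributable_fraction(relative_risk: float) -> float:
    """Share of outcomes among the exposed attributable to the exposure.

    Uses the textbook formula (RR - 1) / RR, which assumes an unbiased,
    unconfounded relative risk greater than 1. Returns 0.0 otherwise.
    """
    if relative_risk <= 1.0:
        return 0.0
    return (relative_risk - 1.0) / relative_risk


# Banana hypothetical from the opinion: 15 of 100 versus 10 of 100.
banana_rr = (15 / 100) / (10 / 100)
# NINDS stroke-scale figures quoted by the Court: 31% versus 20% favorable outcome.
ninds_rr = 0.31 / 0.20

for label, rr in [
    ("banana hypothetical", banana_rr),
    ("NINDS stroke scale", ninds_rr),
    ("exact doubling", 2.0),
]:
    print(f"{label}: RR = {rr:.2f}, attributable fraction = {attributable_fraction(rr):.1%}")
```

On these assumptions, a relative risk of 1.5 corresponds to an attributable fraction of about one third, and the fraction does not pass 50% until the relative risk exceeds two.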

The Circuit noted that expert witnesses may present epidemiologic statistics in a variety of forms:

“to indicate causation. Either absolute or relative calculations may suffice in particular circumstances to achieve the causation standard. See, e.g., Smith v. Bubak, 643 F.3d 1137, 1141–42 (8th Cir.2011) (rejecting relative benefit testimony and suggesting in dictum that absolute benefit “is the measure of a drug’s overall effectiveness”); Young v. Mem’l Hermann Hosp. Sys., 573 F.3d 233, 236 (5th Cir.2009) (holding that Texas law requires a doubling of the relative risk of an adverse outcome to prove causation), cert. denied, ___ U.S. ___, 130 S.Ct. 1512, 176 L.Ed.2d 111 (2010).”

 Id. at *11.

Although the citation to Texas law with its requirement of a doubling of a relative risk is welcome and encouraging, the Court seems to have gone out of its way to muddle its holding.  First, the Young case involved t-PA and a claimed deviation from the standard of care in a stroke case, and was exactly on point.  The Fifth Circuit’s reliance upon Texas substantive law left unclear to what extent the same holding would have been required by Federal Rule of Evidence 702.

Second, the First Circuit, with its banana hypothetical, appeared to confuse an odds ratio with a relative risk.  The odds ratio is different from a relative risk, and typically an odds ratio will be higher than the corresponding relative risk, unless the outcome is rare.  See Michael O. Finkelstein & Bruce Levin, Statistics for Lawyers at 37 (2d ed. 2001). In studies of medication efficacy, however, the benefit will not be particularly rare, and the rare disease assumption cannot be made.
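The difference between the two measures is easy to see with the Court’s own numbers. The sketch below is illustrative only; the function names are hypothetical, and the third, rare-outcome row is an invented comparison that appears nowhere in the opinion or the NINDS study.

```python
def risk_ratio(p_exposed: float, p_unexposed: float) -> float:
    """Relative risk: ratio of the two outcome probabilities."""
    return p_exposed / p_unexposed


def odds_ratio(p_exposed: float, p_unexposed: float) -> float:
    """Odds ratio: ratio of the two odds, where odds = p / (1 - p)."""
    odds_exposed = p_exposed / (1.0 - p_exposed)
    odds_unexposed = p_unexposed / (1.0 - p_unexposed)
    return odds_exposed / odds_unexposed


examples = {
    "banana hypothetical (15% vs 10%)": (0.15, 0.10),
    "NINDS stroke scale (31% vs 20%)": (0.31, 0.20),
    # Invented rare-outcome comparison, for contrast only.
    "rare outcome (0.15% vs 0.10%)": (0.0015, 0.0010),
}

for label, (p1, p0) in examples.items():
    print(f"{label}: RR = {risk_ratio(p1, p0):.2f}, OR = {odds_ratio(p1, p0):.2f}")
```

For the banana figures, the relative risk is 1.5 but the odds ratio is roughly 1.59; for the NINDS stroke-scale figures, the gap widens to roughly 1.55 versus 1.80. Only in the rare-outcome row do the two measures nearly coincide.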

Third, risk is not causation, regardless of magnitude.  If the magnitude of risk is used to infer specific causation, then what is the basis for the inference, and how large must the risk be?  In what way can epidemiologic statistics be used “to indicate” specific causation?  The opinion tells us that Dr. Tikoo’s reliance upon an odds ratio of 1.5 was unhelpful, but why?  The Court, which spoke so clearly and well in identifying the fallacious reasoning of Dr. Tikoo, faltered in identifying what use of risk statistics would permit an inference of specific causation in this case, where general causation was never in doubt.

The Fifth Circuit’s decision in Young, supra, invoked a greater than doubling of risk required by Texas law.  This requirement is nothing more than a logical, common-sense recognition that risk is not causation, and that small risks alone cannot support an inference of specific causation.  Requiring a relative risk greater than two makes practical sense despite the apoplectic objections of Professor Sander Greenland.  See “Relative Risks and Individual Causal Attribution Using Risk Size” (Mar. 18, 2011).

Importantly, the First Circuit panel in Samaan did not engage in the hand-waving arguments that were advanced in Milward, and stuck to clear, transparent rational inferences.  In footnote 6, the Samaan Court cited its earlier decision in Milward, but only with double negatives, and for the relevancy of odds ratios to the question of general causation:

“This is not to say that the odds ratio may not help to prove causation in some instances.  See, e.g., Milward v. Acuity Specialty Prods. Group, Inc., 639 F.3d 11, 13–14, 23–25 (1st Cir.2011) (reversing exclusion of expert prepared to testify as to general rather than specific causation using in part the odds ratio).”

Id. at n.6.

The Samaan Court went on to suggest that inferring specific causation from the magnitude of risk was “theoretically possible”:

“Indeed, it is theoretically possible that a particular odds ratio calculation might show a better-than-even chance of a particular outcome. Here, however, the odds ratios relied on by Dr. Tikoo have no such probative force.”

Id. (emphasis added).  But why and how? The implication of the Court’s dictum is that when the risk ratio is small, less than or equal to two, the ratio cannot be taken to have supported the showing of “better than even chance.” In Milward, one of the key studies relied upon by plaintiff’s expert witness reported an increased risk of only 40%.  Although Milward presented primarily a challenge on general causation, the Samaan decision suggests that the low-dose benzene exposure plaintiffs are doomed, not by benzene, but by the perscrutation required by Rule 702.

Epidemiology, Risk, and Causation – Report of Workshops

November 15th, 2011

This month’s issue of Preventive Medicine includes a series of papers arising from last year’s workshops on “Epidemiology, Risk, and Causation,” at Cambridge University. The workshops were organized by philosopher Alex Broadbent, a member of the Department of History and Philosophy of Science at Cambridge University.  The workshops were financially sponsored by the Foundation for Genomics and Population Health (PHG), a not-for-profit British organization.

Broadbent’s workshops were intended for philosophers of science, statisticians, and epidemiologists, but lawyers involved in health effects litigation will find the papers of interest as well.  The themes of the workshops included:

  • the nature of epidemiologic causation,
  • the competing claims of observational and experimental research for establishing causation,
  • the role of explanation and prediction in assessing causality,
  • the role of moral values in causal judgments, and
  • the role of statistical and epistemic uncertainty in causal judgments

See Alex Broadbent, ed., “Special Section: Epidemiology, Risk, and Causation,” 53 Preventive Medicine 213-356 (October-November 2011).  Preventive Medicine is published by Elsevier Inc., so you know that the articles are not free.  Still, you may want to read these at your local library to determine what may be useful in challenging and defending causal judgments in the courtroom.  One of the interlocutors, Sander Greenland, is of particular interest because he shows up as an expert witness with some regularity.

Here are the individual papers published in this special issue:

Alfredo Morabia, Michael C. Costanza, Philosophy and epidemiology

Alex Broadbent, Conceptual and methodological issues in epidemiology: An overview

Alfredo Morabia, Until the lab takes it away from epidemiology

Nancy Cartwright, Predicting what will happen when we act. What counts for warrant?

Sander Greenland, Null misinterpretation in statistical testing and its impact on health risk assessment

Daniel M. Hausman, How can irregular causal generalizations guide practice

Mark Parascandola, Causes, risks, and probabilities: Probabilistic concepts of causation in chronic disease epidemiology

John Worrall, Causality in medicine: Getting back to the Hill top

Olaf M. Dekkers, On causation in therapeutic research: Observational studies, randomised experiments and instrumental variable analysis

Alexander Bird, The epistemological function of Hill’s criteria

Michael Joffe, The gap between evidence discovery and actual causal relationships

Stephen John, Why the prevention paradox is a paradox, and why we should solve it: A philosophical view

Jonathan Wolff, How should governments respond to the social determinants of health?

Alex Broadbent, What could possibly go wrong? A heuristic for predicting population health outcomes of interventions

The Treatment of Meta-Analysis in the Third Edition of the Reference Manual on Scientific Evidence

November 14th, 2011

Meta-analysis is a statistical procedure for aggregating data and statistics from individual studies into a single summary statistical estimate of the population measurement of interest.  The first meta-analysis is typically attributed to Karl Pearson, circa 1904, who sought a method to overcome the limitations of small sample size and low statistical power.  Statistical methods for meta-analysis, however, did not mature until the 1970s.  Even then, the biomedical scientific community remained skeptical of, if not outright hostile to, meta-analysis until relatively recently.

The hostility to meta-analysis, especially in the context of observational epidemiologic studies, was colorfully expressed by Samuel Shapiro and Alvan Feinstein, as late as the 1990s:

“Meta-analysis begins with scientific studies….  [D]ata from these studies are then run through computer models of bewildering complexity which produce results of implausible precision.”

* * * *

“I propose that the meta-analysis of published non-experimental data should be abandoned.”

Samuel Shapiro, “Meta-analysis/Smeta-analysis,” 140 Am. J. Epidem. 771, 777 (1994).  See also Alvan Feinstein, “Meta-Analysis: Statistical Alchemy for the 21st Century,” 48 J. Clin. Epidem. 71 (1995).

The professional skepticism about meta-analysis was reflected in some of the early judicial assessments of meta-analysis in court cases.  In the 1980s and early 1990s, some trial judges erroneously dismissed meta-analysis as a flawed statistical procedure that claimed to make something out of nothing. Allen v. Int’l Bus. Mach. Corp., No. 94-264-LON, 1997 U.S. Dist. LEXIS 8016, at *71–*74 (suggesting that meta-analysis of observational studies was controversial among epidemiologists).

In In re Paoli Railroad Yard PCB Litigation, Judge Robert Kelly excluded plaintiffs’ expert witness Dr. William Nicholson and his testimony based upon his unpublished meta-analysis of health outcomes among PCB-exposed workers.  Judge Kelly found that the meta-analysis was a novel technique, and that Nicholson’s meta-analysis was not peer reviewed.  Furthermore, the meta-analysis assessed health outcomes not experienced by any of the plaintiffs before the trial court.  706 F. Supp. 358, 373 (E.D. Pa. 1988).

The Court of Appeals for the Third Circuit reversed the exclusion of Dr. Nicholson’s testimony, and remanded for reconsideration with instructions.  In re Paoli R.R. Yard PCB Litig., 916 F.2d 829, 856-57 (3d Cir. 1990), cert. denied, 499 U.S. 961 (1991); Hines v. Consol. Rail Corp., 926 F.2d 262, 273 (3d Cir. 1991).  The Circuit noted that meta-analysis was not novel, and that the lack of peer-review was not an automatic disqualification.  Acknowledging that a meta-analysis could be performed poorly using invalid methods, the appellate court directed the trial court to evaluate the validity of Dr. Nicholson’s work on his meta-analysis.

In one of many skirmishes over colorectal cancer claims in asbestos litigation, Judge Sweet in the Southern District of New York was unimpressed by efforts to aggregate data across studies.  Judge Sweet declared that “no matter how many studies yield a positive but statistically insignificant SMR for colorectal cancer, the results remain statistically insignificant. Just as adding a series of zeros together yields yet another zero as the product, adding a series of positive but statistically insignificant SMRs together does not produce a statistically significant pattern.”  In re Joint E. & S. Dist. Asbestos Litig., 827 F. Supp. 1014, 1042 (S.D.N.Y. 1993).  The plaintiffs’ expert witness who had offered the unreliable testimony, Dr. Steven Markowitz, like Nicholson, another foot soldier in Dr. Irving Selikoff’s litigation machine, did not offer a formal meta-analysis to justify his assessment that multiple non-significant studies, taken together, rule out chance as a likely explanation for an aggregate finding of an increased risk.

Judge Sweet was quite justified in rejecting this back-of-the-envelope, non-quantitative meta-analysis.  His suggestion, however, that multiple non-significant studies could never collectively serve to rule out chance as an explanation for an overall increased rate of disease in the exposed groups is wrong.  Judge Sweet would have done better to focus on the validity issues in key studies, the presence of bias and confounding, and the completeness of the proffered meta-analysis.  The Second Circuit reversed the entry of summary judgment, and remanded the colorectal cancer claim for trial.  52 F.3d 1124 (2d Cir. 1995).  Over a decade later, with even more accumulated studies and data, the Institute of Medicine found the evidence for asbestos plaintiffs’ colorectal cancer claims to be scientifically insufficient.  Institute of Medicine, Asbestos: Selected Cancers (Wash. D.C. 2006).

Courts continue to go astray with an erroneous belief that multiple studies, all without statistically significant results, cannot yield a statistically significant summary estimate of increased risk.  See, e.g., Baker v. Chevron USA, Inc., 2010 WL 99272, *14-15 (S.D.Ohio 2010) (addressing a meta-analysis by Dr. Infante on multiple myeloma outcomes in studies of benzene-exposed workers).  There were many sound objections to Infante’s meta-analysis, but the suggestion that multiple studies without statistical significance could not yield a summary estimate of risk with statistical significance was not one of them.
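A small numerical sketch shows why the belief is erroneous. The five studies below are invented for illustration (they are not Infante’s data or anyone else’s), and the pooling method is a plain fixed-effect, inverse-variance average of log relative risks, which is only one of several accepted approaches. The function names are hypothetical.

```python
import math


def z_and_p(log_rr, se):
    """Two-sided z-test of log(RR) = 0 under a normal approximation."""
    z = log_rr / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided p-value
    return z, p


def fixed_effect_pool(log_rrs, ses):
    """Inverse-variance weighted mean of log relative risks and its standard error."""
    weights = [1.0 / se**2 for se in ses]
    pooled = sum(w * y for w, y in zip(weights, log_rrs)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se


# Hypothetical studies: relative risks and standard errors of log(RR).
rrs = [1.30, 1.25, 1.40, 1.20, 1.35]
ses = [0.20, 0.18, 0.25, 0.15, 0.22]
log_rrs = [math.log(rr) for rr in rrs]

individual_p = [z_and_p(y, se)[1] for y, se in zip(log_rrs, ses)]
for rr, p in zip(rrs, individual_p):
    print(f"single study: RR = {rr:.2f}, two-sided p = {p:.3f}")

pooled, pooled_se = fixed_effect_pool(log_rrs, ses)
pooled_z, pooled_p = z_and_p(pooled, pooled_se)
print(f"pooled: RR = {math.exp(pooled):.2f}, z = {pooled_z:.2f}, two-sided p = {pooled_p:.4f}")
```

Each invented study, standing alone, fails to reach the conventional 5% level, yet the pooled estimate clears it comfortably, because the pooled standard error shrinks as the studies’ information is combined. Whether such pooling is valid in a given case remains a separate question of bias, confounding, and study selection.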

In the last two decades, meta-analysis has emerged as an important technique for addressing random variation in studies, as well as some of the limitations of frequentist statistical methods.  In the 1980s, articles reporting meta-analyses were rare to non-existent.  In 2009, there were over 2,300 articles with “meta-analysis” in their title, or in their keywords, indexed in the PubMed database of the National Library of Medicine.  See Michael O. Finkelstein and Bruce Levin, “Meta-Analysis of ‘Sparse’ Data: Perspectives from the Avandia Cases” (2011) (forthcoming in Jurimetrics).

The techniques for aggregating data have been studied, refined, and employed extensively in thousands of methods and application papers in the last decade. Consensus guideline papers have been published for meta-analyses of clinical trials as well as observational studies.  See Donna Stroup, et al., “Meta-analysis of Observational Studies in Epidemiology: A Proposal for Reporting,” 283 J. Am. Med. Ass’n 2008 (2000) (MOOSE statement); David Moher, Deborah Cook, Susan Eastwood, Ingram Olkin, Drummond Rennie, and Donna Stroup, “Improving the quality of reports of meta-analyses of randomised controlled trials: the QUOROM statement,” 354 Lancet 1896 (1999).  See also Jesse Berlin & Carin Kim, “The Use of Meta-Analysis in Pharmacoepidemiology,” in Brian Strom, ed., Pharmacoepidemiology 681, 683–84 (4th ed. 2005); Zachary Gerbarg & Ralph Horwitz, “Resolving Conflicting Clinical Trials: Guidelines for Meta-Analysis,” 41 J. Clin. Epidemiol. 503 (1988).

Meta-analyses, of observational studies and of randomized clinical trials, routinely are relied upon by expert witnesses in pharmaceutical and so-called toxic tort litigation. Id. See also In re Bextra and Celebrex Marketing Sales Practices and Prod. Liab. Litig., 524 F. Supp. 2d 1166, 1174, 1184 (N.D. Cal. 2007) (holding that reliance upon “[a] meta-analysis of all available published and unpublished randomized clinical trials” was reasonable and appropriate, and criticizing the expert witnesses who urged the complete rejection of meta-analysis of observational studies).

The second edition of the Reference Manual on Scientific Evidence gave very little attention to meta-analysis.  With this historical backdrop, it is interesting to see what the new third edition provides for guidance to the federal judiciary on this important topic.

STATISTICS CHAPTER

The statistics chapter of the third edition continues to give scant attention to meta-analysis.  The chapter notes, in a footnote, that there are formal procedures for aggregating data across studies, and that the power of the aggregated data will exceed the power of the individual, included studies.  The footnote then cautions that meta-analytic procedures “have their own weakness,” without detailing what that one weakness is.  RMSE 3d at 254 n. 107.

The glossary at the end of the statistics chapter offers a definition of meta-analysis:

“meta-analysis. Attempts to combine information from all studies on a certain topic. For example, in the epidemiological context, a meta-analysis may attempt to provide a summary odds ratio and confidence interval for the effect of a certain exposure on a certain disease.”

Id. at 289.

This definition is inaccurate in ways that could yield serious mischief.  Virtually all meta-analyses are built upon a systematic review that sets out to collect all available studies on a research issue of interest.  It is a rare meta-analysis, however, that includes “all” studies in its quantitative analysis.  The meta-analytic process involves a pre-specification of inclusionary and exclusionary criteria for the quantitative analysis of the summary estimate of risk.  Those criteria may limit the quantitative analysis to randomized trials, or to analytical epidemiologic studies.  Furthermore, meta-analyses frequently and appropriately have pre-specified exclusionary criteria that relate to study design or quality.

On a more technical note, the offered definition suggests that the summary estimate of risk will be an odds ratio, which may or may not be true.  Meta-analyses of risk ratios may yield summary estimates of risk in terms of relative risk or hazard ratios, or even of risk differences.  The meta-analysis may combine data of means rather than proportions as well.

EPIDEMIOLOGY CHAPTER

The chapter on epidemiology delves into meta-analysis in greater detail than the statistics chapter, and offers apparently inconsistent advice.  The overall gist of the chapter, however, can perhaps best be summarized by the definition offered in this chapter’s glossary:

“meta-analysis. A technique used to combine the results of several studies to enhance the precision of the estimate of the effect size and reduce the plausibility that the association found is due to random sampling error.  Meta-analysis is best suited to pooling results from randomly controlled experimental studies, but if carefully performed, it also may be useful for observational studies.”

Reference Guide on Epidemiology, RMSE 3d at 624.  See also id. at 581 n. 89 (“Meta-analysis is better suited to combining results from randomly controlled experimental studies, but if carefully performed it may also be helpful for observational studies, such as those in the epidemiologic field.”).  The epidemiology chapter appropriately notes that meta-analysis can help address concerns over random error in small studies.  Id. at 579; see also id. at 607 n. 171.

Having told us that properly conducted meta-analyses of observational studies can be helpful, the chapter hedges considerably:

“Meta-analysis is most appropriate when used in pooling randomized experimental trials, because the studies included in the meta-analysis share the most significant methodological characteristics, in particular, use of randomized assignment of subjects to different exposure groups. However, often one is confronted with nonrandomized observational studies of the effects of possible toxic substances or agents. A method for summarizing such studies is greatly needed, but when meta-analysis is applied to observational studies – either case-control or cohort – it becomes more controversial.174 The reason for this is that often methodological differences among studies are much more pronounced than they are in randomized trials. Hence, the justification for pooling the results and deriving a single estimate of risk, for example, is problematic.175”

Id. at 607.  The stated objection to pooling results for observational studies is certainly correct, but many research topics have sufficient studies available to allow for appropriate selectivity in framing inclusionary and exclusionary criteria to address the objection.  The chapter goes on to credit the critics of meta-analyses of observational studies.  As they did in the second edition of the RMSE, the authors repeat their cites to, and quotes from, early papers by John Bailar, who was then critical of such meta-analyses:

“Much has been written about meta-analysis recently and some experts consider the problems of meta-analysis to outweigh the benefits at the present time. For example, John Bailar has observed:

‘[P]roblems have been so frequent and so deep, and overstatements of the strength of conclusions so extreme, that one might well conclude there is something seriously and fundamentally wrong with the method. For the present . . . I still prefer the thoughtful, old-fashioned review of the literature by a knowledgeable expert who explains and defends the judgments that are presented. We have not yet reached a stage where these judgments can be passed on, even in part, to a formalized process such as meta-analysis.’

John Bailar, “Assessing Assessments,” 277 Science 528, 529 (1997).”

Id. at 607 n.177.  Bailar’s subjective preference for “old-fashioned” reviews, which often cherry-picked the included studies, is, well, “old fashioned.”  More to the point, it is questionable science, and a distinctly minority viewpoint in the light of substantial improvements in the conduct and reporting of meta-analyses of observational studies.  Bailar may be correct that some meta-analyses should have never left the protocol stage, but the RMSE 3d fails to provide the judiciary with the tools to appreciate the distinction between good and bad meta-analyses.

This categorical rejection, cited with apparent approval, is amplified by a recitation of some real or apparent problems with meta-analyses of observational studies.  What is missing is a discussion of how many of these problems can be and are dealt with in contemporary practice:

“A number of problems and issues arise in meta-analysis. Should only published papers be included in the meta-analysis, or should any available studies be used, even if they have not been peer reviewed? Can the results of the meta-analysis itself be reproduced by other analysts? When there are several meta-analyses of a given relationship, why do the results of different meta-analyses often disagree? The appeal of a meta-analysis is that it generates a single estimate of risk (along with an associated confidence interval), but this strength can also be a weakness, and may lead to a false sense of security regarding the certainty of the estimate. A key issue is the matter of heterogeneity of results among the studies being summarized.  If there is more variance among study results than one would expect by chance, this creates further uncertainty about the summary measure from the meta-analysis. Such differences can arise from variations in study quality, or in study populations or in study designs. Such differences in results make it harder to trust a single estimate of effect; the reasons for such differences need at least to be acknowledged and, if possible, explained.176 People often tend to have an inordinate belief in the validity of the findings when a single number is attached to them, and many of the difficulties that may arise in conducting a meta-analysis, especially of observational studies such as epidemiologic ones, may consequently be overlooked.177”

Id. at 608.  The authors are entitled to their opinion, but their discussion leaves the judiciary uninformed about current practice, and best practices, in epidemiology.  A categorical rejection of meta-analyses of observational studies is at odds with the chapter’s own claim that such meta-analyses can be helpful if properly performed.  What was needed, and is missing, is a meaningful discussion to help the judiciary determine whether a meta-analysis of observational studies was properly performed.
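One example of how contemporary practice addresses the heterogeneity concern is to quantify it. The sketch below computes two commonly reported measures, Cochran’s Q and the I² statistic; neither is named in the RMSE chapter, and I offer them only as an illustration of the kind of tool that was left undiscussed. Both sets of studies are invented, and the function name is hypothetical.

```python
import math


def heterogeneity(log_rrs, ses):
    """Return Cochran's Q, its degrees of freedom, and I-squared.

    Q is the weighted sum of squared deviations of each study's log(RR)
    from the fixed-effect pooled estimate. Under homogeneity, Q is
    expected to be close to its degrees of freedom (number of studies - 1).
    I-squared = max(0, (Q - df) / Q) is the share of total variation
    ascribed to between-study differences rather than chance.
    """
    weights = [1.0 / se**2 for se in ses]
    pooled = sum(w * y for w, y in zip(weights, log_rrs)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, log_rrs))
    df = len(log_rrs) - 1
    i_squared = max(0.0, (q - df) / q) if q > 0 else 0.0
    return q, df, i_squared


# Invented, broadly consistent studies.
consistent = ([math.log(rr) for rr in [1.30, 1.25, 1.40, 1.20, 1.35]],
              [0.20, 0.18, 0.25, 0.15, 0.22])
# Invented, conflicting studies with equal precision.
conflicting = ([math.log(rr) for rr in [0.80, 1.00, 1.50, 2.20, 1.10]],
               [0.15, 0.15, 0.15, 0.15, 0.15])

for label, (ys, ses) in [("consistent", consistent), ("conflicting", conflicting)]:
    q, df, i2 = heterogeneity(ys, ses)
    print(f"{label}: Q = {q:.2f} on {df} df, I-squared = {i2:.0%}")
```

In the consistent set, Q falls well below its degrees of freedom and I² is zero; in the conflicting set, Q far exceeds its degrees of freedom and most of the variation is ascribed to real differences among studies. A court presented with the second pattern would have good reason to ask for an explanation before crediting a single summary estimate.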

MEDICAL TESTIMONY CHAPTER

The chapter on medical testimony is the third pass at meta-analysis in RMSE 3d.   The second edition’s chapter on medical testimony ignored meta-analysis completely; the new edition addresses meta-analysis in the context of the hierarchy of study designs:

“Other circumstances that set the stage for an intense focus on medical evidence included

(1) the development of medical research, including randomized controlled trials and other observational study designs;

(2) the growth of diagnostic and therapeutic interventions;141

(3) interest in understanding medical decision making and how physicians reason;142 and

(4) the acceptance of meta-analysis as a method to combine data from multiple randomized trials.143”

RMSE 3d at 722-23.

The chapter curiously omits observational studies, but the footnote reference (note 143) then inconsistently discusses two meta-analyses of observational, rather than experimental, studies:

“143. Video Software Dealers Ass’n v. Schwarzenegger, 556 F.3d 950, 963 (9th Cir. 2009) (analyzing a meta-analysis of studies on video games and adolescent behavior); Kennecott Greens Creek Min. Co. v. Mine Safety & Health Admin., 476 F.3d 946, 953 (D.C. Cir. 2007) (reviewing the Mine Safety and Health Administration’s reliance on epidemiological studies and two meta-analyses).”

Id. at 723 n.143.

The medical testimony chapter then provides further confusion by giving a more detailed listing of the hierarchy of medical evidence in the form of different study designs:

3. Hierarchy of medical evidence

With the explosion of available medical evidence, increased emphasis has been placed on assembling, evaluating, and interpreting medical research evidence.  A fundamental principle of evidence-based medicine (see also Section IV.C.5, infra) is that the strength of medical evidence supporting a therapy or strategy is hierarchical.  When ordered from strongest to weakest, systematic review of randomized trials (meta-analysis) is at the top, followed by single randomized trials, systematic reviews of observational studies, single observational studies, physiological studies, and unsystematic clinical observations.150 An analysis of the frequency with which various study designs are cited by others provides empirical evidence supporting the influence of meta-analysis followed by randomized controlled trials in the medical evidence hierarchy.151 Although they are at the bottom of the evidence hierarchy, unsystematic clinical observations or case reports may be the first signals of adverse events or associations that are later confirmed with larger or controlled epidemiological studies (e.g., aplastic anemia caused by chloramphenicol,152 or lung cancer caused by asbestos153). Nonetheless, subsequent studies may not confirm initial reports (e.g., the putative association between coffee consumption and pancreatic cancer).154

Id. at 723-24.  This discussion further muddies the water by using a parenthetical to suggest that meta-analyses of randomized clinical trials are equivalent to systematic reviews of such studies: “systematic review of randomized trials (meta-analysis).” Of course, systematic reviews are not meta-analyses, although they are a necessary precondition for conducting a meta-analysis.  The relationship between the procedures for a systematic review and those for a meta-analysis is in need of clarification, but the judiciary will not find it in the new Reference Manual.

Reference Manual on Scientific Evidence v3.0 – Disregarding Study Validity in Favor of the “Whole Gamish”

October 14th, 2011

There is much to digest in the new Reference Manual on Scientific Evidence, third edition (RMSE 3d).  Much of what is covered is solid information on the individual scientific and technical disciplines covered.  Although the information is easily available from other sources, there is some value in collecting the material in a single volume for the convenience of judges.  Of course, given that this information is provided to judges from an ostensibly neutral, credible source, lawyers will naturally focus on what is doubtful or controversial in the RMSE.

I have already noted some preliminary concerns, however, with some of the comments in the Preface, by Judge Kessler and Dr. Kassirer.  See “New Reference Manual’s Uneven Treatment of Conflicts of Interest.”  In addition, there is a good deal of overlap among the chapters on statistics, epidemiology, and medical testimony.  This overlap is at first blush troubling because the RMSE has the potential to confuse and obscure issues by having multiple authors address them inconsistently.  This is an area where reviewers should pay close attention.

From first looks at the RMSE 3d, there is a good deal of equivocation between encouraging judges to look at scientific validity, and discouraging them from any meaningful analysis by emphasizing inaccurate proxies for validity, such as conflicts of interest.  (As I have pointed out, the new RMSE did not do quite so well in addressing its own conflicts of interest.  See “Toxicology for Judges – The New Reference Manual on Scientific Evidence (2011).”)

The strengths of the chapter on statistical evidence, updated from the second edition, remain, as do some of the strengths and flaws of the chapter on epidemiology.  I hope to write more about each of these important chapters at a later date.

The late Professor Margaret Berger has an updated version of her chapter from the second edition, “The Admissibility of Expert Testimony,” RMSE 3d 11 (2011).  Berger’s chapter has a section criticizing “atomization,” a process she describes pejoratively as a “slicing-and-dicing” approach.  Id. at 19.  Drawing on the publications of Daubert-critic Susan Haack, Berger rejects the notion that courts should examine the reliability of each study independently. Id. at 20 & n. 51 (citing Susan Haack, “An Epistemologist in the Bramble-Bush: At the Supreme Court with Mr. Joiner,” 26 J. Health Pol. Pol’y & L. 217–37 (1999)).  Berger contends that the “proper” scientific method, as evidenced by works of the International Agency for Research on Cancer, the Institute of Medicine, the National Institutes of Health, the National Research Council, and the National Institute of Environmental Health Sciences, “is to consider all the relevant available scientific evidence, taken as a whole, to determine which conclusion or hypothesis regarding a causal claim is best supported by the body of evidence.” Id. at 19-20 & n.52.  This contention, however, is profoundly misleading.  Of course, scientists undertaking a systematic review should identify all the relevant studies, but some of the “relevant” studies may well be insufficiently reliable (because of internal or external validity issues) to answer the research question at hand. All the cited agencies, and other research organizations and researchers, exclude studies that are fundamentally flawed, whether as a result of bias, confounding, erroneous data analyses, or related problems.  Berger cites no support for the remarkable suggestion that scientists do not make “reliability” judgments about available studies when assessing the “totality of the evidence.”

Professor Berger, who had a distinguished career as a law professor and evidence scholar, died in November 2010.  She was no friend of Daubert, but remarkably her antipathy has outlived her.  Her critical discussion of “atomization” cites the notorious decision in Milward v. Acuity Specialty Products Group, Inc., 639 F.3d 11, 26 (1st Cir. 2011), which was decided four months after her passing. Id. at 20 n.51. (The editors note that the published chapter was Berger’s last revision, with “a few edits to respond to suggestions by reviewers.”)

Professor Berger’s contention about the need to avoid assessments of individual studies in favor of the whole gamish must also be rejected because Federal Rule of Evidence 703 requires that each study considered by an expert witness “qualify” for reasonable reliance by virtue of the study’s containing facts or data that are “of a type reasonably relied upon by experts in the particular field forming opinions or inferences upon the subject.”  One of the deeply troubling aspects of the Milward decision is that it reversed the trial court’s sensible decision to exclude a toxicologist, Dr. Martyn Smith, who outran his headlights on issues having to do with a field in which he was clearly inexperienced – epidemiology.

Scientific studies, and especially epidemiologic studies, involve multiple levels of hearsay.  A typical epidemiologic study may contain hearsay leaps from patient to clinician, to laboratory technicians, to specialists interpreting test results, back to the clinician for a diagnosis, to a nosologist for disease coding, to a national or hospital database, to a researcher querying the database, to a statistician analyzing the data, to a manuscript that details data, analyses, and results, to editors and peer reviewers, back to study authors, and on to publication.  Those leaps do not mean that the final results are untrustworthy, only that the study itself is not likely admissible in evidence.

The inadmissibility of scientific studies is not problematic because Rule 703 permits testifying expert witnesses to formulate opinions based upon facts and data, which are not themselves admissible in evidence. The distinction between relied upon, and admissible, studies is codified in the Federal Rules of Evidence, and in virtually every state’s evidence law.

Referring to studies, without qualification, as admissible in themselves is wrong as a matter of evidence law.  The error has the potential to encourage carelessness in gatekeeping expert witnesses’ opinions for their reliance upon inadmissible studies.  The error is doubly wrong if this approach to expert witness gatekeeping is taken as license to permit expert witnesses to rely upon any marginally relevant study of their choosing.  It is therefore disconcerting that the new Reference Manual on Scientific Evidence (RMSE 3d) fails to make the appropriate distinction between admissibility of studies and admissibility of expert witness opinion that has reasonably relied upon appropriate studies.

Consider the following statement from the chapter on epidemiology:

“An epidemiologic study that is sufficiently rigorous to justify a conclusion that it is scientifically valid should be admissible,184 as it tends to make an issue in dispute more or less likely.185”

RMSE 3d at 610.  Curiously, the authors of this chapter have ignored Professor Berger’s caution against slicing and dicing, and speak to a single study’s ability to justify a conclusion. The authors of the epidemiology chapter seem to be stressing that scientifically valid studies should be admissible.  The footnote emphasizes the point:

See DeLuca v. Merrell Dow Pharms., Inc., 911 F.2d 941, 958 (3d Cir. 1990); cf. Kehm v. Procter & Gamble Co., 580 F. Supp. 890, 902 (N.D. Iowa 1982) (“These [epidemiologic] studies were highly probative on the issue of causation—they all concluded that an association between tampon use and menstrually related TSS [toxic shock syndrome] cases exists.”), aff’d, 724 F.2d 613 (8th Cir. 1984). Hearsay concerns may limit the independent admissibility of the study, but the study could be relied on by an expert in forming an opinion and may be admissible pursuant to Fed. R. Evid. 703 as part of the underlying facts or data relied on by the expert. In Ellis v. International Playtex, Inc., 745 F.2d 292, 303 (4th Cir. 1984), the court concluded that certain epidemiologic studies were admissible despite criticism of the methodology used in the studies. The court held that the claims of bias went to the studies’ weight rather than their admissibility. Cf. Christophersen v. Allied-Signal Corp., 939 F.2d 1106, 1109 (5th Cir. 1991) (“As a general rule, questions relating to the bases and sources of an expert’s opinion affect the weight to be assigned that opinion rather than its admissibility. . . .”).”

RMSE 3d at 610 n.184 (emphasis in bold, added).  This statement, that studies relied upon by an expert in forming an opinion may be admissible pursuant to Rule 703, is unsupported by Rule 703 and the overwhelming weight of case law interpreting and applying the rule.  (Interestingly, the authors of this chapter seem to abandon their suggestion that studies relied upon “might qualify for the learned treatise exception to the hearsay rule, Fed. R. Evid. 803(18), or possibly the catchall exceptions, Fed. R. Evid. 803(24) & 804(5),” which was part of their argument in the Second Edition of the RMSE.  RMSE 2d at 335 (2000).)  See also RMSE 3d at 214 (discussing statistical studies as generally “admissible,” but acknowledging that admissibility may be no more than permission to explain the basis for an expert’s opinion).

The cases cited by the epidemiology chapter, Kehm and Ellis, both involved “factual findings” in public investigative or evaluative reports, which were independently admissible under Federal Rule of Evidence 803(8)(C).  See Ellis, 745 F.2d at 299-303; Kehm, 724 F.2d at 617-18.  As such, the cases hardly support the chapter’s suggestion that Rule 703 is a rule of admissibility for epidemiologic studies.

Here the RMSE, in one sentence, confuses Rule 703 with an exception to the rule against hearsay, which would prevent the statistical studies from being received in evidence.  The point is reasonably clear, however, that the studies “may be offered” to explain an expert witness’s opinion.  Under Rule 705, that offer may also be refused. The offer, however, is to “explain,” not to have the studies admitted in evidence.

The RMSE is certainly not alone in advancing this notion that studies are themselves admissible.  Other well-respected evidence scholars lapse into this position:

“Well conducted studies are uniformly admitted.”

David L. Faigman, et al., Modern Scientific Evidence:  The Law and Science of Expert Testimony v.1, § 23:1, at 206 (2009).

Evidence scholars should not conflate admissibility of the epidemiologic (or other) studies with the ability of an expert witness to advert to a study to explain his or her opinion.  The testifying expert witness really has no need to become a conduit for off-hand comments and opinions in the introduction or discussion section of relied upon articles, and the wholesale admission of such hearsay opinions undermines the court’s control over opinion evidence.  Rule 703 authorizes reasonable reliance upon “facts and data,” not every opinion that creeps into the published literature.

New Reference Manual’s Uneven Treatment of Conflicts of Interest

October 12th, 2011

The new, third edition of the Reference Manual on Scientific Evidence (RMSE) appears to get off to a good start in the Preface by Judge Kessler and Dr. Kassirer, when they note that the Supreme Court mandated federal courts to

“examine the scientific basis of expert testimony to ensure that it meets the same rigorous standard employed by scientific researchers and practitioners outside the courtroom.”

RMSE at xiii.  The preface falters, however, on two key issues, causation and conflicts of interest, which are taken up as an introduction to the new volume.

1. CAUSATION

The authors tell us in squishy terms that causal assessments are judgments:

“Fundamentally, the task is an inferential process of weighing evidence and using judgment to conclude whether or not an effect is the result of some stimulus. Judgment is required even when using sophisticated statistical methods. Such methods can provide powerful evidence of associations between variables, but they cannot prove that a causal relationship exists. Theories of causation (evolution, for example) lose their designation as theories only if the scientific community has rejected alternative theories and accepted the causal relationship as fact. Elements that are often considered in helping to establish a causal relationship include predisposing factors, proximity of a stimulus to its putative outcome, the strength of the stimulus, and the strength of the events in a causal chain.”

RMSE at xiv.

The authors leave the inferential process as a matter of “weighing evidence,” but without saying anything about how the scientific community does its “weighing.”  Language about “proving” causation is also unclear because “proof” in scientific parlance connotes a demonstration, which we typically find in logic or in mathematics.  Proving empirical propositions suggests a bar set too high, such that the courts must inevitably lower the bar considerably.  The question is, of course, how low will judges go to admit evidence.

The authors thus introduce hand waving and excuses for why evidence can be weighed differently in court proceedings from the world of science:

“Unfortunately, judges may be in a less favorable position than scientists to make causal assessments. Scientists may delay their decision while they or others gather more data. Judges, on the other hand, must rule on causation based on existing information. Concepts of causation familiar to scientists (no matter what stripe) may not resonate with judges who are asked to rule on general causation (i.e., is a particular stimulus known to produce a particular reaction) or specific causation (i.e., did a particular stimulus cause a particular consequence in a specific instance). In the final analysis, a judge does not have the option of suspending judgment until more information is available, but must decide after considering the best available science.”

RMSE at xiv.  But the “best available science” may be pretty crummy, and the temptation to turn desperation into evidence (“well, it’s the best we have now”) is often severe.  The authors of the Preface signal that “inconclusive” is not a judgment open to judges charged with expert witness gatekeeping.  If the authors truly mean to suggest that judges should go with whatever is dished out as “the best available science,” then they have overlooked the obvious:  Rule 702 opens the door to “scientific, technical, or other specialized knowledge,” not to hunches, suggestive but inconclusive evidence, and wishful thinking about how the science may turn out when further along.  Courts have a choice to exclude expert witness opinion testimony that is based upon incomplete or inconclusive evidence.

2. CONFLICTS OF INTEREST

Surprisingly, given the scope of the scientific areas covered in the RMSE, the authors discuss conflicts of interest (COI) at some length.  Conflicts of interest are a fact of life in all endeavors, and it is understandable to counsel judges and juries to try to identify, assess, and control them.  COIs, however, are weak proxies for unreliability.  The emphasis given here is undue because federal judges are misled into thinking that they can discern unreliability from COI, when they should be focused on the data and the analysis.

The authors of the Preface set about to use COI as a basis for giving litigation plaintiffs a pass, and for holding back studies sponsored by corporate defendants.

“Conflict of interest manifests as bias, and given the high stakes and adversarial nature of many courtroom proceedings, bias can have a major influence on evidence, testimony, and decisionmaking. Conflicts of interest take many forms and can be based on religious, social, political, or other personal convictions. The biases that these convictions can induce may range from serious to extreme, but these intrinsic influences and the biases they can induce are difficult to identify. Even individuals with such prejudices may not appreciate that they have them, nor may they realize that their interpretations of scientific issues may be biased by them. Because of these limitations, we consider here only financial conflicts of interest; such conflicts are discoverable. Nonetheless, even though financial conflicts can be identified, having such a conflict, even one involving huge sums of money, does not necessarily mean that a given individual will be biased. Having a financial relationship with a commercial entity produces a conflict of interest, but it does not inevitably evoke bias. In science, financial conflict of interest is often accompanied by disclosure of the relationship, leaving to the public the decision whether the interpretation might be tainted. Needless to say, such an assessment may be difficult. The problem is compounded in scientific publications by obscure ways in which the conflicts are reported and by a lack of disclosure of dollar amounts.

Judges and juries, however, must consider financial conflicts of interest when assessing scientific testimony. The threshold for pursuing the possibility of bias must be low. In some instances, judges have been frustrated in identifying expert witnesses who are free of conflict of interest because entire fields of science seem to be co-opted by payments from industry. Judges must also be aware that the research methods of studies funded specifically for purposes of litigation could favor one of the parties. Though awareness of such financial conflicts in itself is not necessarily predictive of bias, such information should be sought and evaluated as part of the deliberations.”

RMSE at xiv-xv.  All in all, rather misleading advice.  Financial conflicts are not the only conflicts that can be “discovered.”  Often expert witnesses will have political and organizational alignments, which will show deep-seated ideological alignments with the party for which they are testifying.  For instance, in one silicosis case, an expert witness in the field of history of medicine testified, at an examination before trial, that his father suffered from a silica-related disease.  This witness’s alignment with Marxist historians and his identification with radical labor movements made his non-financial conflicts obvious, although these COI would not necessarily have been apparent from his scholarly publications alone.

How low will the bar be set for discovering COI?  If testifying expert witnesses are relying upon textbooks, articles, essays, will federal courts open the authors/hearsay declarants up to searching discovery of their finances?

Also misleading is the suggestion that “entire fields of science seem to be co-opted by payments from industry.”  Do the authors mean to exclude the plaintiffs’ lawyer litigation industry, which has grown so large and politically powerful in this country?  In litigations in which I have been involved, I have certainly seen plaintiffs’ counsel, or their proxies (labor unions or “victim support groups”), provide substantial funding for studies.  The Preface authors themselves show an untoward bias by pointing out industry payments without giving balanced attention to other interested parties’ funding of scientific studies.

The attention to COI is also surprising given that one of the key chapters, for toxic tort practitioners, was written by Dr. Bernard D. Goldstein, who has testified in toxic tort cases, mostly (but not exclusively) for plaintiffs.  See, e.g., Parker v. Mobil Oil Corp., 7 N.Y.3d 434, 857 N.E.2d 1114, 824 N.Y.S.2d 584 (2006); Exxon Corp. v. Makofski, 116 S.W.3d 176 (Tex. Ct. App. 2003).  The Makofski case is particularly interesting because Dr. Goldstein was forced to explain why he was willing to opine that benzene caused acute lymphocytic leukemia, despite the plethora of published studies finding no statistically significant relationship.  Dr. Goldstein resorted to the inaccurate notion that scientific “proof” of causation requires 95 percent certainty, whereas he imposed only a 51 percent certainty for his medico-legal testimonial adventures. Dr. Goldstein also attempted to justify the discrepancy from the published literature by adverting to the lower standards used by federal regulatory agencies and treating physicians. Id.

These explanations are particularly concerning because they reflect basic errors in statistics and in causal reasoning.  The 95 percent derives from the use of the same percentage in confidence intervals, but the probability involved there is not the probability of the association’s being correct, and it has nothing to do with the probability that an association is real or causal.  (Thankfully the RMSE chapter on statistics gets this right, but my fear is that judges will skip over the more demanding chapter on statistics and place undue weight on the toxicology chapter, written by Dr. Goldstein.)  The reference to federal agencies (OSHA, EPA, etc.) and to treating physicians was meant, no doubt, to invoke precautionary principle concepts as a justification for some vague, ill-defined, lower standard of causal assessment.
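
The statistical point can be illustrated with a small simulation. The sketch below rests on assumed values (a normal population with known standard deviation, a hypothetical true mean, an arbitrary sample size and seed); none of it comes from the Reference Manual or from the testimony discussed. It shows that the 95 percent describes the interval-generating procedure over repeated samples: roughly 95 percent of the intervals so constructed cover the true value. It says nothing about the probability that any particular claimed association is real or causal.

```python
import random
import statistics


def coverage(true_mean=0.0, sigma=1.0, n=50, trials=4000, seed=1):
    """Fraction of 95% confidence intervals that contain the true mean."""
    rng = random.Random(seed)
    z = statistics.NormalDist().inv_cdf(0.975)  # two-sided 95% critical value
    half_width = z * sigma / n ** 0.5           # known-sigma interval half-width
    hits = 0
    for _ in range(trials):
        # Draw one sample and check whether its interval covers the truth.
        sample_mean = statistics.fmean(rng.gauss(true_mean, sigma) for _ in range(n))
        if abs(sample_mean - true_mean) <= half_width:
            hits += 1
    return hits / trials


print(f"share of intervals covering the true mean: {coverage():.3f}")
```

The coverage rate is fixed by the procedure and stays near 95 percent whatever the true value happens to be, which is why it cannot be read as a "certainty" about causation to be compared against a 51 percent burden of proof.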

The Preface authors might well have taken their own counsel and conducted a more searching assessment of COI among authors of the Reference Manual.  Better yet, the authors might have focused the judiciary on the data and the analysis.

Toxicology for Judges – The New Reference Manual on Scientific Evidence (2011)

October 5th, 2011

I have begun to dip into the massive third edition of the Reference Manual on Scientific Evidence.  To date, there have been only a couple of acknowledgments of this new work, which was released to the public on September 28, 2011.  See “A New Day – A New Edition of the Reference Manual of Scientific Evidence”; and David Kaye, “Prometheus Unbound: Releasing the New Edition of the FJC Reference Manual on Scientific Evidence.”

As in previous editions, the substantive scientific areas are covered in discrete chapters, written by subject matter specialists, often along with a lawyer who addresses the legal implications and judicial treatment of that subject matter.  For my practice and teaching, the chapters on statistics, epidemiology, and toxicology are the most important, and I decided to start with toxicology.  The toxicology chapter, “Reference Guide on Toxicology,” in the third edition is written by Professor Bernard D. Goldstein, of the University of Pittsburgh Graduate School of Public Health, and Mary Sue Henifin, a partner in the law firm of Buchanan Ingersoll, P.C.

CONFLICTS OF INTEREST

At the question and answer session of the public release ceremony, one gentleman rose to note that some of the authors were lawyers with big firm affiliations, which he supposed must mean that they represent mostly defendants.  Based upon his premise, he asked what the review committee had done to ensure that conflicts of interest did not skew or distort the discussions in the affected chapters.  Dr. Kassirer and Judge Kessler responded by pointing out that the chapters were peer reviewed by outside reviewers, and reviewed by members of the supervising review committee.  The questioner seemed reassured, but now that I have looked at the toxicology chapter, I am not so sure.

The questioner’s premise that a member of a large firm will represent mostly defendants and thus have a pro-defense  bias is probably a common perception among unsophisticated lay observers.  What is missing from their analysis is the realization that although gatekeeping helps the defense lawyers’ clients, it takes away legal work from firms that represent defendants in the litigations that are pretermitted by effective judicial gatekeeping.  Erosion of gatekeeping concepts, however, inures to the benefit of plaintiffs, their counsel, as well as the expert witnesses engaged on behalf of plaintiffs in litigation.

The questioner’s supposition in the case of the toxicology chapter, however, is doubly flawed.  If he had known more about the authors, he would probably not have asked his question.  First, the lawyer author, Ms. Henifin, is known for having taken virulently anti-manufacturer positions.  See Richard M. Lynch and Mary S. Henifin, “Causation in Occupational Disease: Balancing Epidemiology, Law and Manufacturer Conduct,” 9 Risk: Health, Safety & Environment 259, 269 (1998) (conflating distinct causal and liability concepts, and arguing that legal and scientific causal criteria should be abrogated when manufacturing defendant has breached a duty of care).

As for the scientist author of the toxicology chapter, Professor Goldstein, the casual reader of the chapter may want to know that he has testified in any number of toxic tort cases, almost invariably on the plaintiffs’ side.  Unlike the defense lawyer, who loses business revenue when courts shut down unreliable claims, plaintiffs’ testifying or consulting expert witnesses stand to gain by minimalist expert witness opinion gatekeeping.  Given the economic asymmetries, the reader must thus want to know that Prof. Goldstein was excluded as an expert witness in some high-profile toxic tort cases.  See, e.g., Parker v. Mobil Oil Corp., 7 N.Y.3d 434, 857 N.E.2d 1114, 824 N.Y.S.2d 584 (2006) (dismissing leukemia (AML) claim based upon claimed low-level benzene exposure from gasoline), aff’g 16 A.D.3d 648 (App. Div. 2d Dep’t 2005).  No; you will not find the Parker case cited in the Manual’s chapter on toxicology. (Parker is, however, cited in the chapter on exposure science.)

I have searched but I could not find any disclosure of Professor Goldstein’s conflicts of interest in this new edition of the Reference Manual.  I would welcome a correction if I am wrong.  Having pointed out this conflict, I would note that financial conflicts of interest are nothing really compared to ideological conflicts of interest, which often propel scientists into service as expert witnesses.

HORMESIS

One way that ideological conflicts might be revealed is to look for imbalances in the presentation of toxicologic concepts.  Most lawyers who litigate cases that involve exposure-response issues are familiar with the “linear no threshold” (LNT) concept that is used frequently in regulatory risk assessments, and which has metastasized to toxic tort litigation, where LNT often has no proper place.

LNT is a dubious assumption because it claims to “know” the dose response at very low exposure levels in the absence of data.  There is a thin plausibility for genotoxic chemicals claimed to be carcinogens, but even that plausibility evaporates when one realizes that there are defense and repair mechanisms to genotoxicity, which must first be saturated before there can be a carcinogenic response.  Hormesis is today an accepted concept that describes a dose-response relationship that shows a benefit at low doses, but harm at high doses.
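
The contrast between the two dose-response shapes can be sketched in a few lines. The functional forms and parameter values below are hypothetical, chosen only to illustrate the shapes described above; they are not fitted to any data and come from no cited source. The LNT curve assigns some excess risk to every dose above zero, whereas the hormetic (biphasic) curve dips below baseline at low doses before rising.

```python
import math


def lnt_response(dose, slope=0.01):
    """Linear no-threshold: excess risk proportional to dose, all the way to zero."""
    return slope * dose


def hormetic_response(dose, slope=0.01, benefit=0.03, scale=5.0):
    """Illustrative biphasic curve: a low-dose benefit that fades as dose grows."""
    # The exponential term models a protective effect that decays with dose;
    # the form and the parameters are assumptions made for illustration only.
    return slope * dose - benefit * dose * math.exp(-dose / scale)


for dose in (0, 1, 5, 20, 50):
    print(f"dose {dose:>2}: LNT {lnt_response(dose):+.4f}   hormetic {hormetic_response(dose):+.4f}")
```

With these assumed parameters the hormetic curve is negative (a net benefit) at low doses and positive at higher doses, while the LNT line is positive at every dose above zero. Which shape describes a given agent is an empirical question that the low-dose data must answer.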

The toxicology chapter in the Reference Manual has several references to LNT but none to hormesis.  That font of all knowledge, Wikipedia, reports that hormesis is controversial, but so is LNT.  This is the sort of imbalance that may well reflect an ideological bias.

One of the leading textbooks on toxicology describes hormesis:

“There is considerable evidence to suggest that some non-nutritional toxic substances may also impart beneficial or stimulatory effects at low doses but that, at higher doses, they produce adverse effects. This concept of “hormesis” was first described for radiation effects but may also pertain to most chemical responses.”

Curtis D. Klaassen, Casarett & Doull’s Toxicology: The Basic Science of Poisons 23 (7th ed. 2008) (internal citations omitted).

Similarly, the Encyclopedia of Toxicology describes hormesis as an important phenomenon in toxicologic science:

“This type of dose–response relationship is observed in a phenomenon known as hormesis, with one explanation being that exposure to small amounts of a material can actually confer resistance to the agent before frank toxicity begins to appear following exposures to larger amounts.  However, analysis of the available mechanistic studies indicates that there is no single hormetic mechanism. In fact, there are numerous ways for biological systems to show hormetic-like biphasic dose–response relationship. Hormetic dose–response has emerged in recent years as a dose–response phenomenon of great interest in toxicology and risk assessment.”

Philip Wexler, Bethesda, et al., eds., 2 Encyclopedia of Toxicology 96 (2005).  One might think that hormesis would also be of great interest to federal judges, but they will not learn about it from reading the Reference Manual.

Hormesis research has come into its own.  The International Dose-Response Society, which “focus[es] on the dose-response in the low-dose zone,” publishes a journal, Dose-Response, and a newsletter, BELLE:  Biological Effects of Low Level Exposure.  In 2009, two leading researchers in the area of hormesis published a collection of important papers:  Mark P. Mattson and Edward J. Calabrese, eds., Hormesis: A Revolution in Biology, Toxicology and Medicine (N.Y. 2009).

A check in PubMed shows that LNT has more “hits” than “hormesis” or “hormetic,” but still the latter terms exceed 1,267 references, hardly insubstantial.  In actuality, there are many more hormetic relationships identified in the scientific literature, which often fails to identify the relationship by the term hormesis or hormetic.  See Edward J. Calabrese and Robyn B. Blain, “The hormesis database: The occurrence of hormetic dose responses in the toxicological literature,” 61 Regulatory Toxicology and Pharmacology 73 (2011) (reviewing about 9,000 dose-response relationships for hormesis, to create a database of various aspects of hormesis).  See also Edward J. Calabrese and Robyn B. Blain, “The occurrence of hormetic dose responses in the toxicological literature, the hormesis database: An overview,” 202 Toxicol. & Applied Pharmacol. 289 (2005) (earlier effort to establish hormesis database).

The Reference Manual’s omission of hormesis is regrettable.  Its inclusion of references to LNT but not to hormesis appears to result from an ideological bias.

QUESTIONABLE SUBSTANTIVE OPINIONS

One would hope that the toxicology chapter would not put forward partisan substantive positions on issues that are currently the subject of active litigation.  Fondly we would hope that any substantive position advanced would at least be well documented.

For at least one issue, the toxicology chapter dashes our fondest hopes.  Table 1 in the chapter presents a “Sample of Selected Toxicological End Points and Examples of Agents of Concern in Humans.” No documentation or citations are provided for this table.  Most of the exposure agent/disease outcome relationships in the table are well accepted, but curiously at least one agent-disease pair, which is the subject of current litigation, is wildly off the mark:

Parkinson’s disease and manganese

Reference Manual at 653.  If the chapter’s authors had looked, they would have found that Parkinson’s disease is almost universally accepted to have no known cause, except among a few plaintiffs’ litigation expert witnesses.  They would also have found that the issue has been addressed carefully and the claimed relationship or “concern” has been rejected by the leading researchers in the field (who have no litigation ties).  See, e.g., Karin Wirdefeldt, Hans-Olov Adami, Philip Cole, Dimitrios Trichopoulos, and Jack Mandel, “Epidemiology and etiology of Parkinson’s disease: a review of the evidence,” 26 European J. Epidemiol. S1, S20-21 (2011); Tomas R. Guilarte, “Manganese and Parkinson’s Disease: A Critical Review and New Findings,” 118 Environ Health Perspect. 1071, 1078 (2010) (“The available evidence from human and nonhuman primate studies using behavioral, neuroimaging, neurochemical, and neuropathological end points provides strong support to the hypothesis that, although excess levels of [manganese] accumulation in the brain results in an atypical form of parkinsonism, this clinical outcome is not associated with the degeneration of nigrostriatal dopaminergic neurons as is the case in PD.”).

WHEN ALL YOU HAVE IS A HAMMER, EVERYTHING LOOKS LIKE A NAIL

The substantive specialist author, Professor Goldstein, is not a physician; nor is he an epidemiologist.  His professional focus on animal and cell research shows, and biases the opinions offered in this chapter.

“In qualitative extrapolation, one can usually rely on the fact that a compound causing an effect in one mammalian species will cause it in another species. This is a basic principle of toxicology and pharmacology.  If a heavy metal, such as mercury, causes kidney toxicity in laboratory animals, it is highly likely to do so at some dose in humans.”

Reference Manual at 646.

Such extrapolations may make sense in regulatory contexts, where precautionary judgments are of interest, but they hardly can be said to be generally accepted in controversies in civil actions over actual causation.  Crystalline silica, for instance, causes something resembling lung cancer in rats, but not in mice, guinea pigs, or hamsters.  It hardly makes sense to ask juries to decide whether the plaintiff is more like a rat than a mouse.

For a sober second opinion to the toxicology chapter, one may consider the views of some well-known authors:

“Whereas the concordance was high between cancer-causing agents initially discovered in humans and positive results in animal studies (Tomatis et al., 1989; Wilbourn et al., 1984), the same could not be said for the reverse relationship: carcinogenic effects in animals frequently lacked concordance with overall patterns in human cancer incidence (Pastoor and Stevens, 2005).”

Hans-Olov Adami, Sir Colin L. Berry, Charles B. Breckenridge, Lewis L. Smith, James A. Swenberg, Dimitrios Trichopoulos, Noel S. Weiss, and Timothy P. Pastoor, “Toxicology and Epidemiology: Improving the Science with a Framework for Combining Toxicological and Epidemiological Evidence to Establish Causal Inference,” 122 Toxicological Sciences 223, 224 (2011).

Once again, there is a sense that the scholarship of the toxicology chapter is not as complete or thorough as we would hope.

Diluting “Reasonable Degree of Medical Certainty” – An AAJ-Restatement “Tool” to Help Plaintiffs

October 3rd, 2011

In “the Top Reason that the ALI’s Restatement of Torts Should Steer Clear of Partisan Conflicts,” I pointed out the inappropriateness of advertising the ALI’s Restatement of Torts to the organized plaintiffs’ bar, much as the plaintiffs’ bar advertises potential huge recoveries for the latest tort du jour.  See Michael D. Green & Larry S. Stewart, “The New Restatement’s Top 10 Tort Tools,” Trial 44 (April 2010).

Some of the authors’ tort tool kit may be unexceptionable.  Among these authors’ top ten tort tools, however, is the new Restatement’s edict that “reasonable degree of medical certainty” means, or should mean, nothing more than saying “more likely than not.”  The authors criticize the reasonable certainty standard with an abbreviated rendition of the Restatement’s critique:

“Many courts hold that expert opinion must be expressed in terms of ‘medical or scientific certainty’. Requiring certainty seems to impose a criminal law-like burden of proof that is inconsistent with civil burdens of preponderance of the evidence to establish a fact. Such a requirement is also problematic at best because medical and scientific communities have no such ‘reasonable certainty’ standard. The standard then becomes whatever the attorney who hired the expert tells the expert it means or, absent that, whatever the expert imagines it means. Section 28, comment e, of the Restatement criticizes this standard and makes clear that the same preponderance standard (or ‘more likely than not’ standard), which is universally applied in all aspects of civil cases, also applies to expert testimony.”

Id. at 46-47.

Well, the more likely than not standard is not “universally applied in all aspects of civil cases,” because several states require exemplary damages to be proven by “clear and convincing” or greater evidence.  In some states, the burden of proof in fraud cases is higher than a mere preponderance of the evidence. This premise of the authors’ article is incorrect.

But even if the authors were correct that the preponderance standard applied “in all aspects” of civil cases, their scholarship would remain suspect, as others and I have previously pointed out.  See “Reasonable Degree of Medical Certainty,” and “More Uncertainty About Reasonable Degree of Medical Certainty.”

1. The Restatement’s Treatment of Expert Witness Evidentiary Rules Exceeded the Scope of the Tort Restatement.

The most peculiar aspect of this “top tool” is that it has nothing to do with the law of torts.  The level of certitude required of an expert witness is an evidentiary and a procedural issue. Of course the issue comes up in tort cases, which frequently involve medical and scientific causation opinions, as well as other expert witness opinions.  The issue, however, comes up in all cases that involve expert witnesses:  trusts and estates, regulatory, environmental, securities fraud, commercial, and other cases.

The Restatement of Torts weakly acknowledges its frolic and detour in treating a procedural issue concerning the admissibility of expert witness opinion testimony, by noting that it does “not address any other requirements for the admissibility of an expert witness’s testimony, including qualifications, expertise, investigation, methodology, or reasoning.” Restatement (Third) of Torts: Liability for Physical and Emotional Harm § 28, cmt. e (2010).  The certitude issue has nothing special to do with the substantive law of torts, and should not have been addressed in the torts restatement.

2. The Restatement’s Treatment of “Reasonable Degree of Medical Certainty” Has No Relevance to the Burden of Proof in Tort Cases.

The expert witness certitude issue has nothing to do with the burden of proof, and the Restatement should not have confused and conflated the burden of proof with the standard of certitude for expert witnesses.  The clear but unacceptable implication is that expert witnesses in criminal cases must testify to certitude “beyond a reasonable doubt,” and in claims for equitable relief, expert witnesses may share only opinions that are made, in their minds, by “clear and convincing evidence.”  There is no support in law or logic for the identification of witness certitude with parties’ burdens of proof.

Comment e states the critique more fully:

“If courts do interpret the reasonable-certainty standard to require a level of certitude greater than the preponderance-of-the-evidence standard requires, this creates a troubling inconsistency between standards for the admissibility of evidence and the threshold required for sufficiency of proof. The threshold for admissibility should not be higher than the threshold to sufficiency.  Moreover, the reasonable-certainty standard provides no assurance of the quality of the expert’s qualifications, expertise, investigation, methodology, or reasoning.  Thus, the Section adopts the same preponderance standard that is universally adopted in civil cases.  Direct and cross-examination can be employed to flesh out the degree of certainty with which an expert’s opinion is held and to identify opinions that are speculative and therefore inadmissible.”

Id. The critique badly misfires because there is no inconsistency and no trouble in having different standards for the admissibility of opinion evidence and the burden of proof.  As noted, expert witnesses testify on causation and other issues in criminal, equity, and tort cases, all with different burdens of proof.  Juries in criminal and tort cases must apply instructions on burdens of proof to an entire evidentiary display, not just the expert witnesses’ opinions.  In logic and law, there ultimately must be different burdens for admissibility of expert witness testimony and for sufficiency of a party’s proofs.

3. The Restatement’s Treatment of “Reasonable Degree of Medical Certainty” Incoherently Confuses Two Different Standards.

We can see that Comment e’s approach to legislating an equivalence between expert witness certitude and the burden must fail even on its own terms.  Consider the legal consequences of tort claimants, with the burden of proof, who produce expert witnesses to opine about key elements (e.g., causation) of torts by stating that their opinions were held by a mere “preponderance of the evidence.”

If this probability is understood to be only infinitesimally greater than 50%, then courts would have to direct verdicts in many (and perhaps most) cases.

Courts must ensure that a rational jury can find for the party with the burden of proof.  Juries must evaluate the credibility and reliability of expert witnesses, their opinions, as well as the predicate facts for those opinions.  If those expert witness opinions were barely greater than 50% probable on an essential element, then unless the witnesses had perfect credibility, and all predicate facts were as probable as claimed by the witnesses, then juries would frequently have to reject the witnesses’ opinions.  The bare preponderance of the expert witnesses’ opinions would result in an overall probability of the essential element less than 50%.
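A rough numeric sketch may make the point concrete.  The figures below are hypothetical, and the sketch assumes, as a simplification, that the probability of the opinion, the credibility of the witness, and the probability of the predicate facts are independent, so that they simply multiply:

```python
def overall_probability(opinion, credibility, predicate_facts):
    """Probability that the essential element is established, assuming
    (hypothetically) that the three factors are independent and multiply."""
    return opinion * credibility * predicate_facts

# Hypothetical figures: a bare-preponderance opinion, a fairly credible
# witness, and predicate facts that are probable but not certain.
p = overall_probability(opinion=0.51, credibility=0.90, predicate_facts=0.90)
print(f"{p:.3f}")  # prints 0.413
```

On these assumed numbers, the overall probability falls to about 41%, well short of a preponderance, even though the opinion itself was held at 51%.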

4. The Restatement Incorrectly Implies that Expert Witnesses Can Quantify Their Opinions in Probabilistic Terms.

There are even more far-reaching problems with simply substituting “more likely than not” for RDMC as a threshold requirement of expert witness testimony.  Comment e implies that expert witnesses can discern the difference between an opinion that they believe is “more likely than not” and another which is “as likely as not.” On some occasions, there may be opinions that derive from quantitative reasoning, for which an expert witness could truly say, with some level of certainty, that his or her opinion is “more likely than not.” On most occasions, an expert witness’s degree of certainty is a qualitative opinion that simply does not admit of a quantitative characterization. The Restatement’s comment perpetuates this confusion by casting the reasonable certainty standard as a bare probability.

Comment e further suggests that expert witnesses are themselves expert in assessing their own level of certainty, and that they have the training and experience to distinguish an opinion that is 50.1% likely from another that is only 50% likely. The assignment of precise mathematical probabilities to personal, subjective beliefs is a doubtful exercise, at best. See, e.g., Amos Tversky and Daniel Kahneman, “Judgment under Uncertainty: Heuristics and Biases,” 185 Science 1124 (1974).

5. The Restatement Incorrectly Labels “Reasonable Degree of Medical Certainty” As An Empty Formalism.

Comment e ignores the epistemic content of reasonable certainty, which bears an uncanny resemblance to the knowledge requirement of Rule 702.  The “mantra” is helpful to the extent it imposes an objective epistemic standard, especially in states that have failed to impose, or that have abrogated, expert witness gatekeeping.  In some states, there is no meaningful expert witness gatekeeping under either the Frye standard or Rule 702. See, e.g., “Expert Evidence Free-for-All in Washington State.”  See also Joseph Sanders, “Science, Law, and the Expert Witness,” 72 Law & Contemporary Problems 63, 87 & n. 118 (2009) (noting that the meaning of “reasonable degree of scientific certainty” is unclear, but that it can be understood as an alternative formulation of Kumho’s “same intellectual rigor” test).

Some of these “top” tools may be defective.  The authors may need good defense counsel.

Playing Hide the Substantial Factors in Asbestos Litigation

September 27th, 2011

In previous posts, I have noted that Dr. Selikoff, who did so much to shine light on the health hazards of asbestos, did much to keep fiber type differential causation in the dark.  Selikoff was a “crocidolite denier,” who went so far as to deny that American workers had crocidolite exposure at all.  See “Selikoff and the Mystery of the Disappearing Amphiboles.”

Dr. Selikoff’s extreme positions on crocidolite are difficult to explain in terms of the data known to him.  In addition to some of the data already presented, consider the following statistical tables from the 1965 volume of the Annals of the New York Academy of Sciences, edited by Dr. Selikoff:

US Dept. of Commerce statistics on imported amosite and crocidolite (short tons)

year      amosite      crocidolite
1957       14,197           17,820
1958       16,994           19,690
1959       16,614           18,006
1960       19,581           14,899
1961       15,501           14,978
1962        9,602           20,235

App. 3, Statistical Tables – Asbestos, prepared by T. May, United States Bureau of Mines, in I.J. Selikoff & J. Churg, eds., “Biological Effects of Asbestos,” 132 Ann. N.Y. Acad. Sci. at 753, Table 17 (1965).

Blue wins by about 13,000 short tons, over these six years.  Dr. Selikoff presided over the Academy meeting that gave rise to this publication, and he edited the volume that contained these statistics.  Why did Selikoff deny the obvious?
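For anyone who wants to check the margin, the following short sketch simply totals the two columns of the table above; the variable names are mine, and the figures are transcribed from the table as given:

```python
# Short tons imported, by year, transcribed from the table above:
# year -> (amosite, crocidolite)
imports = {
    1957: (14_197, 17_820),
    1958: (16_994, 19_690),
    1959: (16_614, 18_006),
    1960: (19_581, 14_899),
    1961: (15_501, 14_978),
    1962: (9_602, 20_235),
}

amosite_total = sum(amosite for amosite, _ in imports.values())
crocidolite_total = sum(crocidolite for _, crocidolite in imports.values())
margin = crocidolite_total - amosite_total

print(amosite_total, crocidolite_total, margin)  # prints 92489 105628 13139
```

Crocidolite (“blue”) imports exceed amosite imports by 13,139 short tons over the tabulated years, although amosite led in 1960 and 1961.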

A fair historical hypothesis, to be investigated, would posit that Dr. Selikoff was well aware of the fiber type differential, but he was also aware that the Canadian mining concerns were poised to play up the difference in mesothelioma potency, both in regulatory and litigation contexts.  We have seen how Dr. Selikoff was in close touch with plaintiffs’ advocates, such as Barry Castleman.  The hypothesis is that people like Barry Castleman and his principals, the plaintiffs’ asbestos bar, encouraged or pressured Dr. Selikoff to promote the notion that all asbestos minerals were equally pathogenic to undermine a substantial factor defense from companies that mined or used chrysotile fiber.

Dr. Selikoff almost certainly was aware that the South African companies were judgment proof in U.S. courtrooms.  South Africa was a renegade nation at the time, increasingly the subject of disinvestment campaigns and economic boycotts.  South Africa would not honor court judgments based upon verdicts in U.S. asbestos personal injury cases, and the intermediaries, distributors of amosite and crocidolite, were little more than shell corporations.

Plaintiffs’ counsel, as far back as the late 1970s, surely anticipated the substantial-factor battles ahead.  They obviously had talked to Dr. Schepers, who told them that in his view, chrysotile was innocuous with respect to mesothelioma causation.  The plaintiffs’ lawyers needed to keep the solvent North American companies in the courtroom.

I do not have a Castleman letter to, or a tape recording of a Ron Motley conversation with, Dr. Selikoff to document my postulated scenario.  It is hard, however, to fathom any good reason as to why Dr. Selikoff was so motivated to be a crocidolite denier, when the evidence on both prevalence of, and health effects from, the use of crocidolite and amosite, was so obvious.

Law school professors are fond of analogizing asbestos mesothelioma cases to the famous “two fires” hypothetical in the law of torts. See, e.g., Anderson v. Minneapolis, St. Paul & Sault Ste. Marie Railway, 146 Minn. 430, 179 N.W. 45 (1920) (abandoning “but for” causation when two fires, each of which would have tortiously burned the house); Restatement Second of Torts Sec. 432(2).  The analogy is far removed from the typical mesothelioma case, which involves multiple fiber types, with widely varying levels of exposure.

Rather than 10 defendants, each responsible for 10% of the total risk, the real world court cases illustrate the misuse of joint and several liability, and the abuse from hiding exposures to products of bankrupt and judgment proof companies.  The following hypothetical is more typical of cases I have litigated:

Plaintiff was a shipyard worker, with 30 years of worksite exposure.  Plaintiff worked with a range of insulation products, some of which had crocidolite or amosite content, but most had only chrysotile asbestos in their makeup.  All or mostly all of the insulation manufacturers are bankrupt.  The plaintiff claims to have changed his car’s brake linings, and that he was exposed to chrysotile once a year, when he did this car repair.

To put some figures to the hypothetical, suppose a range of varying “potency factors” for different fiber types, with different breakdown of the three major asbestos mineral varieties:

10% crocidolite, with a potency factor 200x

20% amosite, with a potency factor 50x

70% chrysotile, with a potency factor 1x

These potency factors are realistic, although not everyone would agree.  On these facts, the chrysotile exposure, although quantitatively substantial, would have an insubstantial role in producing mesothelioma in such a shipyard worker.  The total relative chrysotile role would be about 2.28% of the total.  Realistically, all chrysotile products, considered together, would not be a substantial factor in producing a mesothelioma.

Now the brake linings exposure claimed from changing brakes once a year supposedly involved only chrysotile exposure.  Compared to the occupational exposure in the hulls of ships, this outdoor work rarely took more than a couple of hours.  A conservative estimate would put the chrysotile exposure somewhere at 0 to 0.01% of all the chrysotile exposure sustained, or somewhere from 0% to 0.00023% of causation, assuming that chrysotile can even cause mesothelioma (a doubtful assumption).
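The arithmetic behind these percentages can be sketched as follows.  The exposure mix and potency factors are the hypothetical ones given above; weighting each fiber type by exposure share times potency is my reading of how the shares were computed, and the 0.01% figure is the upper bound of the brake-lining estimate:

```python
# Hypothetical mix from the text: fiber -> (share of exposure, potency factor)
mix = {
    "crocidolite": (0.10, 200),
    "amosite": (0.20, 50),
    "chrysotile": (0.70, 1),
}

# Weight each fiber type by exposure share times relative potency.
weighted = {fiber: share * potency for fiber, (share, potency) in mix.items()}
total = sum(weighted.values())

chrysotile_share = weighted["chrysotile"] / total

# Upper bound from the text: brakes are 0.01% of all chrysotile exposure.
brake_fraction_of_chrysotile = 0.0001
brake_share = chrysotile_share * brake_fraction_of_chrysotile

print(f"{chrysotile_share:.2%}")  # prints 2.28%
print(f"{brake_share:.5%}")       # prints 0.00023%
```

On these assumptions, all chrysotile exposure together accounts for roughly 2.28% of the potency-weighted total, and the brake work for at most about 0.00023%.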

Dr. Selikoff surely did not envision the gritty details of today’s world of asbestos litigation, in the wake of 90 bankruptcies, with its cynical game of hiding the bankrupt and judgment-proof companies’ shares of liability.  He did, however, likely see that chrysotile mining and manufacturing firms would press the relative innocuousness of chrysotile fiber in causing mesothelioma.  The groundwork for the injustice of the mantra that “each and every exposure” to asbestos is a substantial factor was laid a long time ago.

Expert Evidence Free-for-All in Washington State

September 23rd, 2011

Daubert/Frye issues are fact specific. Meaningful commentary about expert witness decisions requires a close familiarity with the facts and data in the case under scrutiny.  A recent case in point comes from the Washington Supreme Court.   The plaintiff alleged that her child was born with birth defects as a result of her workplace exposure to solvents from mixing paints.  The trial court dismissed the case on summary judgment, after excluding plaintiff’s expert witnesses’ causation opinions. On appeal, the Court, en banc, reversed the summary judgment, and remanded for trial.  Anderson v. Akzo Nobel Coatings Inc., No. 82264-6, Wash. Sup.; 2011 Wash. LEXIS 669 (Sept. 8, 2011).

Anderson worked for Akzo Nobel Coatings, Inc., until the time she was fired, which occurred shortly after she filed a safety complaint.  Her last position was plant environmental coordinator for health and safety. Her job occasionally required her to mix paints.  Akzo’s safety policies required respirator usage when mixing paints, although Anderson claimed that enforcement was lax.  Slip op. at 2.  Anderson gave birth to a son, who was diagnosed with congenital nervous and renal system defects.  Id. at 3.

Anderson apparently had two expert witnesses:  one of her child’s treating physicians and Dr. Khattak, an author of an epidemiologic study on birth defects in women exposed to organic solvents. Sohail Khattak, et al., “Pregnancy Outcome Following Gestational Exposure to Organic Solvents,” 281 J. Am. Med. Ass’n 1106 (1999). See Slip op. at 3.

The conclusions of the published paper were modest, and no claim to causality was made from either the study alone or from the study combined with the prior knowledge in the field.  When the author, Dr. Khattak, donned the mantle of expert witness, intellectual modesty went out the door:  He opined that the association was causal.  The treating physician echoed Dr. Khattak’s causal opinion.

The fact-specific nature of the decision makes it difficult to assess the accuracy or validity of the plaintiff’s expert witnesses’ opinions.  The claimed teratogenicity of paint solvents is an interesting issue, but I confess it is one with which I am not familiar.  Perhaps others will address the claim.  Regardless of whether the claim has scientific merit, the Anderson decision is itself seriously defective.  The Washington Supreme Court’s opinion shows that it did little to familiarize itself with the factual issue, and holds that judges need not tax themselves very much to understand the application of scientific principles to the facts and data of their cases.  Indeed, what is disturbing about this decision is that it sets the bar so low for medical causation claims. Although Anderson does not mark a reversion to the old Ferebee standard, which would allow any qualified, willing expert witness to testify to any conclusion, the decision does appear to permit any opinion based upon a generally accepted methodology, without gatekeeping analysis of whether the expert has actually faithfully and appropriately applied the claimed methodology.  The decision eschews the three subparts of Federal Rule of Evidence 702, which requires that the proffered opinion:

(1) … is based upon sufficient facts or data,

(2) … is the product of reliable principles and methods, and

(3) …[is the product of the application of] the principles and methods reliably to the facts of the case.

Federal Rule of Evidence 702.

In abrogating standards for expert witness opinion testimony, the Washington Supreme Court manages to commit several important errors about the nature of scientific and medical testimony.  These errors are much more serious than any possible denial of intellectual due process in the Anderson case because they virtually ensure that meaningful gatekeeping will not take place in future Washington state court cases.

I. The Court Confused Significance Probability with Expert Witnesses’ Subjective Assessment of Posterior Probability

The Washington Supreme Court advances two grounds for abrogating gatekeeping in medical causation cases.  First, the Court mistakenly states that the degree of certainty for scientific propositions is greater in the scientific world than it is in a civil proceeding:

“Generally, the degree of certainty required for general acceptance in the scientific community is much higher than the concept of probability used in civil courts.  While the standard of persuasion in criminal cases is “beyond a reasonable doubt,” the standard in most civil cases is a mere “preponderance.””

Id. at 14.  No citation is provided for the proposition that the scientific degree of certainty is “much higher,” other than a misleading reference to a book by Marcia Angell, former editor of the New England Journal of Medicine:

“By contrast, “[f]or a scientific finding to be accepted, it is customary to require a 95 percent probability that it is not due to chance alone.”  Marcia Angell, M.D., Science on Trial: The Clash of Medical Evidence and the Law in the Breast Implant Case 114 (1996).  The difference in degree of confidence to satisfy the Frye “general acceptance” standard and the substantially lower standard of “preponderance” required for admissibility in civil matters has been referred to as “comparing apples to oranges.” Id. To require the exacting level of scientific certainty to support opinions on causation would, in effect, change the standard for opinion testimony in civil cases.”

Id. at 15.  This popular press book hardly supports the Court’s contention. The only charitable interpretation of the 95% probability is that the Court, through Dr. Angell, is taking an acceptable rate of false positive errors (α) to be no more than the customary 5%, and is looking at a confidence interval with a coefficient of 1 – α based upon this specified error rate. This error rate, however, is not the probability that the null hypothesis is true.  If the Court had read the very next sentence, after the first sentence it quotes from Dr. Angell, it would have seen:

“(I am here giving a shorthand version of a much more complicated statistical concept.)”

Science on Trial at 114 (1996).  The Court failed to note that Dr. Angell was talking about significance probability, which is used to assess the strength of the evidence in a single study against the null hypothesis of no association.  Dr. Angell was well aware that she was simplifying the meaning of significance probability in order to distinguish it from a totally different concept, the probability of attribution of a specific case to a known cause of the disease.  It is the probability of attribution that has some relevance to the Court’s preponderance standard; and the probability of attribution standard is not different from the civil preponderance standard.
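The probability of attribution can be given a rough numerical form.  The sketch below assumes the common simplification (my assumption here, not a formula stated by the Court or Dr. Angell) that the fraction of exposed cases attributable to an exposure is (RR – 1)/RR, where RR is the relative risk:

```python
def attributable_fraction(relative_risk):
    """Fraction of exposed cases attributable to the exposure, under the
    simplifying assumption AF = (RR - 1) / RR, for RR >= 1 only."""
    if relative_risk < 1:
        raise ValueError("this sketch covers only relative risks of 1 or more")
    return (relative_risk - 1) / relative_risk

# Hypothetical relative risks, chosen to bracket the 50% threshold.
for rr in (1.5, 2.0, 3.0):
    af = attributable_fraction(rr)
    print(rr, round(af, 3), af > 0.5)
```

Under that assumption, the attributable fraction exceeds 50% only when the relative risk exceeds 2, which is a statement about the size of an effect, and says nothing about the 5% significance level.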

The Court’s citation of Dr. Angell for the proposition that the “degree of confidence” and the “preponderance” standard are like “comparing apples to oranges,” is a complete distortion of Dr. Angell’s book.  She is comparing the attributable risk, which is based upon an effect size (the relative risk) and which need be only greater than 50% for specific causation, with a significance probability for the interpretation of the data from a single study, based upon the assumption of the null hypothesis:

“Comparing the size of an effect with the probability that a given finding isn’t due to chance is comparing apples and oranges.”

Id. This statement is a far cry from the Court’s misleading paraphrase, and is no support at all for the Court’s statistical solecism. Implicit in the Court’s error is its commission of the transpositional fallacy; it has confused significance probability (the probability of the evidence given the null hypothesis) with Bayesian posterior probabilities (the probability of the null hypothesis given all the data and evidence in the case).
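A small Bayesian sketch shows why the transposition matters.  The prior probabilities and the 80% power figure below are hypothetical, chosen only for illustration; the point is that a 5% significance level does not translate into a 5% probability that the null hypothesis is true:

```python
def posterior_null_given_significant(prior_null, alpha, power):
    """P(null hypothesis | statistically significant result), by Bayes' theorem.

    alpha: probability of a significant result when the null is true
    power: probability of a significant result when the null is false
    """
    p_significant = alpha * prior_null + power * (1 - prior_null)
    return alpha * prior_null / p_significant

# Hypothetical priors and power, for illustration only.
for prior in (0.5, 0.9):
    posterior = posterior_null_given_significant(prior, alpha=0.05, power=0.8)
    print(prior, round(posterior, 3))
```

With a prior of 0.9 on the null hypothesis, the posterior probability of the null after a “significant” result is still about 0.36, nowhere near 0.05.  The posterior depends on the prior and the power, neither of which the significance level supplies.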

Having misunderstood significance probability to be at odds with the preponderance standard, the Court notes that the “absence of a statistically significant basis” for an expert witness’s opinion does not implicate Frye or render the expert witness’s opinion inadmissible.  Id. at 16.  In the Anderson case, this musing is pure dictum because Dr. Khattak’s study showed a highly statistically significant difference in the rate of birth defects among women with solvent exposures compared with women without such exposures.

II.  The Court Abandons Evidence or Data as Necessary to Support Judgments of Causality

The Anderson Court did not stop with its misguided distinction between burdens of proof in science and in law.  The Court went on to offer the remarkable suggestion that gatekeeping is unnecessary for medical opinions because they are not, in any event, evidence-based:

“Many expert medical opinions are pure opinions and are based on experience and training rather than scientific data.  We only require that ‘medical expert testimony . . . be based upon ‘a reasonable degree of medical certainty’ or probability.”

Slip op. at 16-17 (internal citations omitted).  There may be some opinions that are experientially based, but the Court did not, and could not, adduce any support for the proposition that judgments of teratogenic causation do not require scientific data.  Troublingly, the Court appears to allow medical expert opinions to be “pure opinions,” unsupported by empirical, scientific data.

Presumably as an example of non-evidence based medical opinions, the Anderson Court offers the example of differential diagnosis:

“Many medical opinions on causation are based upon differential diagnoses. A physician or other qualified expert may base a conclusion about causation through a process of ruling out potential causes with due consideration to temporal factors, such as events and the onset of symptoms.”

Id. at 17. This example, however, does not explain or justify anything the Court claimed.  Differential diagnosis, or more accurately “differential etiology,” is a process of reasoning by iterative disjunctive syllogism to the most likely cause of a particular patient’s disease.  The syllogism assumes that any disjunct – possible cause of this specific case – has previously, independently been shown to be capable of causing the outcome in question.  There is no known methodology by which this syllogism itself can show general causation.
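The logical structure can be put in a few lines of code.  This is only an illustrative sketch, with hypothetical names and placeholder causes, of why elimination presupposes general causation and cannot establish it:

```python
def differential_etiology(candidates, established_causes, ruled_out):
    """Iterative disjunctive syllogism: strike the ruled-out causes and
    return whatever candidates remain.

    Every candidate must already be an established cause of the outcome
    (general causation); the elimination step cannot supply that premise.
    """
    unproven = [c for c in candidates if c not in established_causes]
    if unproven:
        raise ValueError(f"general causation not established for: {unproven}")
    return [c for c in candidates if c not in ruled_out]

# Hypothetical placeholders, not actual medical claims.
established = {"cause A", "cause B", "cause C"}
remaining = differential_etiology(
    candidates=["cause A", "cause B", "cause C"],
    established_causes=established,
    ruled_out={"cause B", "cause C"},
)
print(remaining)  # prints ['cause A']
```

If a candidate such as “agent X” is not already in the set of established causes, the function refuses to proceed; ruling out the alternatives does nothing to put it there.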

Not surprisingly, the Court makes no attempt to support its mistaken claim that differential diagnosis permits the assessment of general causation without the necessity of “scientific data.”

The Court’s confusion between significance probability (1 – α)% and posterior probability based upon all the evidence, as well as its confusion between differential diagnosis and evidence-based assessments of general causation, allowed the Court to take a short way with medical causation evidence.  The denial of scientific due process followed inevitably.

III.  The Court Abandoned All Gatekeeping for Expert Witness Opinion Testimony

The Anderson Court suggested that gatekeeping was required by Washington’s continued adherence to the stringent Frye test, but the Court then created an exception bigger than the rule:

“Once a methodology is accepted in the scientific community, then application of the science to a particular case is a matter of weight and admissibility under ER 702, the Frye test is only implicated where the opinion offered is based upon novel science.  It applies where either the theory and technique or method of arriving at the data relied upon is so novel that it is not generally accepted by the relevant scientific community.  There is nothing novel about the theory that organic solvent exposure may cause brain damage and encephalopathy.  See, e.g., Berry v. CSX Transp., Inc., 709 So. 2d 552, 568 & n.12, 571-72 (Fla. Dist. Ct. App. 1998) (surveying medical literature). Nor does it appear that there is anything novel about the methods of the study about which Dr. Khattak wrote. Khattak, supra, at 1106. Frye does not require that the specific conclusions drawn from the scientific data upon which Dr. Khattak relied be generally accepted in the scientific community.  Frye does not require every deduction drawn from generally accepted theories to be generally accepted.”

Slip op. at 18-19 (internal citations omitted).

By excepting the specific inferences and conclusions from judicial review, the Court has sanctioned any nonsense as long as the expert witness can proclaim that he used the methods of “toxicology,” or of “epidemiology,” or some other generally accepted branch of science.  The Court left no room to challenge whether the claim is correct at any level other than the most general.  The studies cited in support of causation may completely lack internal or external validity, but if they are of a class of studies that are “scientific,” and purport to use a method that is generally accepted (e.g., cohort or case-control studies), then the inquiry is over. Indeed, the Court left no room at all for challenges to expert witnesses who give dubious opinions about medical causation.

IV. Fault Issues

Not content to banish science from the judicial assessment of scientific causality judgments, the Anderson Court went further to take away any defense based upon the mother’s fault in engaging in unprotected mixing of paints while pregnant, or the mother’s fault in smoking while pregnant.   Slip op. at 20.  Suing the mother as a tortfeasor may not be an attractive litigation option to the defendant in a case arising out of workplace exposure to an alleged teratogen, but clearly the mother could be at fault with respect to the causation of her child’s harm. She was in charge of environmental health and safety, and she may well have been aware of the hazards of solvent exposures.  In this case, there were grounds to assert the mother’s fault both in failing to comply with workplace safety rules, and in smoking during her pregnancy (assuming that there was evidence, at the same level as paint fumes, for the teratogenicity of smoking).

Milward — Unhinging the Courthouse Door to Dubious Scientific Evidence

September 2nd, 2011

It has been an interesting year in the world of expert witnesses.  We have seen David Egilman attempt a personal appeal of a district court’s order excluding him as an expert.  Stephen Ziliak has prattled on about how he steered the Supreme Court from the brink of disaster by helping them to avoid the horrors of statistical significance.  And then we had a philosophy professor turned expert witness, Carl Cranor, publicly touting an appellate court’s decision that held his testimony admissible.  Cranor, under the banner of the Center for Progressive Reform (CPR), hails the First Circuit’s opinion as the greatest thing since Sir Isaac Newton.   Carl Cranor, “Milward v. Acuity Specialty Products: How the First Circuit Opened Courthouse Doors for Wronged Parties to Present Wider Range of Scientific Evidence” (July 25, 2011).

Philosophy Professor Carl Cranor has been trying for decades to dilute the scientific approach to causal conclusions to permit the precautionary principle to find its way into toxic tort cases.  Cranor, along with others, has also criticized federal court expert witness gatekeeping for deconstructing individual studies, showing that the individual studies are weak, and ignoring the overall pattern of evidence from different disciplines.  This criticism has some theoretical merit, but the criticism is typically advanced as an excuse for “manufacturing certainty” from weak, inconsistent, and incoherent scientific evidence.  The criticism also ignores the actual text of the relevant rule – Rule 702, which does not limit the gatekeeping court to assessing individual “pieces” of evidence.  The scientific community acknowledges that there are times when a weaker epidemiologic dataset may be supplemented by strong experimental evidence that leads appropriately to a conclusion of causation.  See, e.g., Hans-Olov Adami, Sir Colin L. Berry, Charles B. Breckenridge, Lewis L. Smith, James A. Swenberg, Dimitrios Trichopoulos, Noel S. Weiss, and Timothy P. Pastoor, “Toxicology and Epidemiology: Improving the Science with a Framework for Combining Toxicological and Epidemiological Evidence to Establish Causal Inference,” 122 Toxicological Sci. 223 (2011) (noting the lack of a systematic, transparent way to integrate toxicologic and epidemiologic data to support conclusions of causality; proposing a “grid” to permit disparate lines of evidence to be integrated into more straightforward conclusions).

For the most part, Cranor’s publications have been ignored in the Rule 702 gatekeeping process.  Perhaps that is why he shrugged off his academic regalia and took on the mantle of the expert witness, in Milward v. Acuity Specialty Products, a case involving a claim that benzene exposure caused plaintiff’s acute promyelocytic leukemia (APL), one of several types of acute myeloid leukemia.  Milward v. Acuity Specialty Products Group, Inc., 664 F.Supp. 2d 137 (D.Mass. 2009) (O’Toole, J.).

Philosophy might seem like the wrong discipline to help a court or a jury decide general and specific causation of a rare cancer, with an incidence of less than 8 cases per million per year.  (A PubMed search on leukemia and Cranor yielded no hits.)  Cranor supplemented the other, more traditional testimony from a toxicologist, by attempting to show that the toxicologist’s testimony was based upon sound scientific method.  Cranor was particularly intent to show that the toxicologist, Dr. Martyn Smith, had used sound method to reach a scientific conclusion, even though he lacked strong epidemiologic studies to support his opinion.

The district court excluded Cranor’s testimony, along with plaintiff’s scientific expert witnesses.  The Court of Appeals, however, reversed, and remanded with instructions that plaintiff’s scientific expert witnesses’ opinions were admissible.  639 F.3d 11 (1st Cir. 2011).  Hence Cranor’s and the CPR’s hyperbole about the opening of the courthouse doors.

The district court was appropriately skeptical about plaintiff’s expert witnesses’ reliance upon epidemiologic studies, the results of which were not statistically significant.  Before reaching the issue of statistical significance, however, the district court found that Dr. Smith had relied upon studies that did not properly support his opinion.  664 F.Supp. 2d at 148.  The defense presented Dr. David Garabrant, an expert witness with substantial qualifications and accomplishments in epidemiologic science.  Dr. Garabrant persuaded the Court that Dr. Smith had relied upon some studies that tended to show no association, and others that presented faulty statistical analyses.  Other studies, relied upon by Dr. Smith, presented data on AML, but Dr. Smith speculated that these AML cases could have been APL cases.  Id.

None of the studies relied upon by plaintiffs’ Dr. Smith had a statistically significant result for APL.  Id. at 144. The district court pointed out that scientists typically take care to rely only upon data that show “statistical significance,” and Dr. Smith (plaintiff’s expert witness) deviated from sound scientific method in attempting to support his conclusion with studies that had not ruled out chance as an explanation for their increased risk ratios.  Id.  The district court did not summarize the studies’ results, and so the unsoundness of plaintiff’s method is difficult to evaluate.  Rather than engaging in hand waving and speculating about “trends” and suggestions, those witnesses could have performed a meta-analysis to increase the statistical precision of a summary point estimate beyond what was achieved in any single, small study.  Neither the plaintiff nor the district court addressed the issue of aggregating study results to address the role of chance in producing the observed results.
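For readers who have not seen how aggregation sharpens precision, here is a minimal sketch of a fixed-effect, inverse-variance meta-analysis.  The three studies are entirely hypothetical; they are not taken from the Milward record, which did not summarize the studies’ results.  The function name and the choice of a fixed-effect model are assumptions made for illustration only.

```python
import math

def pooled_risk_ratio(studies, z=1.96):
    """Fixed-effect, inverse-variance pooling of risk ratios.

    studies: list of (risk_ratio, ci_lower, ci_upper) tuples with 95% intervals.
    Returns (pooled_rr, pooled_lower, pooled_upper).
    """
    weight_sum = 0.0
    weighted_log_sum = 0.0
    for rr, lower, upper in studies:
        # Recover the standard error of log(RR) from the width of the 95% CI.
        se = (math.log(upper) - math.log(lower)) / (2 * z)
        weight = 1.0 / se ** 2  # more precise studies get more weight
        weight_sum += weight
        weighted_log_sum += weight * math.log(rr)
    pooled_log = weighted_log_sum / weight_sum
    pooled_se = math.sqrt(1.0 / weight_sum)
    return (math.exp(pooled_log),
            math.exp(pooled_log - z * pooled_se),
            math.exp(pooled_log + z * pooled_se))

# Hypothetical small studies, invented for illustration only.
# Each interval includes 1.0, so none is "statistically significant" alone.
hypothetical_studies = [(2.0, 0.8, 5.0), (1.8, 0.7, 4.6), (2.2, 0.9, 5.4)]

pooled_rr, pooled_lower, pooled_upper = pooled_risk_ratio(hypothetical_studies)
print(f"pooled RR = {pooled_rr:.2f}, "
      f"95% CI {pooled_lower:.2f} to {pooled_upper:.2f}")
```

With these invented inputs, the pooled interval excludes 1.0 although each individual interval includes it.  The same arithmetic can just as readily show that a set of studies, taken together, remains consistent with no association; and a real analysis would have to address heterogeneity among studies, which this sketch ignores.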

The inability to show a statistically significant result was not surprising given how rare the APL subtype of AML is.  Sample size might legitimately interfere with the ability of epidemiologic studies to detect a statistically significant association that really existed.  If this were truly the case, the lack of a statistically significant association could not be interpreted to mean the absence of an association without potentially committing a type II error. In any event, the district court in Milward was willing to credit the plaintiffs’ claim that epidemiologic evidence may not always be essential for establishing causality.  If causality does exist, however, epidemiologic studies are usually required to confirm the existence of the causal relationship.  Id. at 148.
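The sample-size point can be made concrete with a rough power calculation.  The sketch below assumes a one-sided exact Poisson test at the 5% level, a baseline rate of 8 cases per million person-years (the upper bound mentioned above), and a hypothetical true risk ratio of 2.0; the cohort sizes are likewise invented for illustration.

```python
import math

def poisson_cdf(k, mean):
    """P(X <= k) for X ~ Poisson(mean)."""
    if k < 0:
        return 0.0
    term = math.exp(-mean)
    total = term
    for i in range(1, k + 1):
        term *= mean / i
        total += term
    return total

def power_to_detect(person_years, baseline_rate, true_rr, alpha=0.05):
    """Return (critical_count, power) for a one-sided exact Poisson test."""
    expected_null = person_years * baseline_rate
    # Smallest case count whose upper-tail probability under the null
    # hypothesis is no greater than alpha.
    critical = 0
    while 1.0 - poisson_cdf(critical - 1, expected_null) > alpha:
        critical += 1
    expected_alt = expected_null * true_rr
    power = 1.0 - poisson_cdf(critical - 1, expected_alt)
    return critical, power

BASELINE_RATE = 8e-6   # assumed: 8 cases per million person-years
TRUE_RR = 2.0          # assumed: exposure truly doubles the risk

small_critical, small_power = power_to_detect(100_000, BASELINE_RATE, TRUE_RR)
large_critical, large_power = power_to_detect(1_000_000, BASELINE_RATE, TRUE_RR)

print(f"100,000 person-years: power = {small_power:.2f}")
print(f"1,000,000 person-years: power = {large_power:.2f}")
```

Under these assumptions, the smaller cohort would miss a true doubling of risk most of the time, which is the type II error described above.  A tenfold larger cohort fares much better, which is one reason pooling data across studies matters.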

The district court also took a close look at Smith’s mechanistic biological evidence, and found it equally speculative.  Although plausibility is a desirable feature of a causal hypothesis, it only sets the stage for actual data:

“Dr. Smith’s opinion is that ‘[s]ince benzene is clastogenic and has the capability of breaking and rearranging chromosomes, it is biologically plausible for benzene to cause’ the t(15;17) translocation. (Smith Decl. ¶ 28.b.) This is a kind of ‘bull in the china shop’ generalization: since the bull smashes the teacups, it must also smash the crystal. Whether that is so, of course, would depend on the bull having equal access to both teacups and crystal.”

Id. at 146.

“Since general extrapolation is not justified and since there is no direct observational evidence that benzene causes the t(15;17) translocation, Dr. Smith’s opinion — that because benzene is an agent that can cause some chromosomal mutations, it is ‘plausible’ that it causes the one critical to APL—is simply an hypothesis, not a reliable scientific conclusion.”

Id. at 147.

Judge O’Toole’s opinion is a careful, detailed consideration of the facts and data upon which Dr. Smith relied, but the First Circuit found an abuse of discretion, and reversed. 639 F.3d 11 (1st Cir. 2011).

The Circuit incorrectly suggested that Smith’s opinion was based upon a “weight of the evidence” methodology described by “the world-renowned epidemiologist Sir Arthur Bradford Hill in his seminal methodological article on inferences of causality. See Arthur Bradford Hill, The Environment and Disease: Association or Causation?, 58 Proc. Royal Soc’y Med. 295 (1965).” Id. at 17.  This suggestion is remarkable because everyone knows that it was Arthur’s much smarter brother, Austin, who wrote the seminal article and gave the Bradford Hill name to the famous presidential address published by the Royal Society of Medicine.  Arthur Bradford Hill was not even a knight if he existed at all.

The Circuit’s suggestion is also remarkable for confusing a vague “weight of the evidence” methodology with the statistical and epidemiologic approach of one of the 20th century’s great methodologists.  Sir Austin is known for having conducted the first double-blinded randomized clinical trial, as well as having shown, with fellow knight Sir Richard Doll, the causal relationship between smoking and lung cancer.  Sir Austin wrote one of the first texts on medical statistics, Principles of Medical Statistics (London 1937).  Sir Austin no doubt was turning in his grave when he was associated with Cranor’s loosey-goosey “weight of the evidence” methodology.  See, e.g., Douglas L. Weed, “Weight of Evidence: A Review of Concept and Methods,” 25 Risk Analysis 1545 (2005) (noting the vague, ambiguous, indefinite nature of the concept of “weight of evidence” review).

The Circuit adopted a dismissive attitude towards epidemiology in general, citing to an opinion piece by several cancer tumor biologists, whom the court described as a group from the National Cancer Institute (NCI).  The group was actually a workshop sponsored by the NCI, with participants from many institutions.  Id. at 17 (citing Michele Carbon[e] et al., “Modern Criteria to Establish Human Cancer Etiology,” 64 Cancer Res. 5518, 5522 (2004)).  The cited article did report some suggestions for modifying Bradford Hill’s criteria in the light of modern molecular biology, as well as a sense of the group that there was no “hierarchy” in which epidemiology was at the top.  (The group definitely did not address the established concept that some types of epidemiologic studies are analytically more powerful to support inferences of causality than others — the hierarchy of epidemiologic evidence.)

The Circuit then proceeded to evaluate Dr. Smith’s consideration of the available epidemiologic studies.  The Circuit mistakenly defined an “odds ratio” as “the difference in the incidence of a disease between a population that has been exposed to benzene and one that has not.”  Id. at 24. Having failed to engage with the evidence sufficiently to learn what an odds ratio was, the Circuit Court then proceeded to state that the difference between Dr. Garabrant and Dr. Smith, as to how to calculate the odds ratio in some of the studies, was a mere difference in opinion between experts, and Dr. Garabrant’s criticisms of Dr. Smith’s approach went to the weight, not the admissibility, of the evidence.  These sparse words are, of course, a legal conclusion, not an explanation, and the Circuit leaves us without any real understanding of how Dr. Smith may have gone astray, but still have been advancing a legitimate opinion within epidemiology, which was not his discipline.  Id. at 22. If Dr. Smith’s idea of an odds ratio was as incorrect as the Circuit’s, his calculation may have had no validity whatsoever, and thus his opinions derived from his flawed ideas may have clearly failed the requirements of Rule 702.  The Circuit’s opinion is not terribly helpful in understanding anything other than its summary rejection of the district court’s more detailed analysis.
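The distinction is not a quibble.  An odds ratio is a ratio of odds, with 1.0 as the value of no association; a difference in incidence is a subtraction, with 0 as the value of no association.  A short sketch, using a hypothetical two-by-two table (the counts are invented, and the function names are mine, not drawn from any study in the case):

```python
def odds_ratio(exposed_cases, exposed_noncases,
               unexposed_cases, unexposed_noncases):
    """Odds of disease among the exposed divided by odds among the unexposed."""
    exposed_odds = exposed_cases / exposed_noncases
    unexposed_odds = unexposed_cases / unexposed_noncases
    return exposed_odds / unexposed_odds

def incidence_difference(exposed_cases, exposed_noncases,
                         unexposed_cases, unexposed_noncases):
    """Incidence proportion among the exposed minus that among the unexposed."""
    exposed_risk = exposed_cases / (exposed_cases + exposed_noncases)
    unexposed_risk = unexposed_cases / (unexposed_cases + unexposed_noncases)
    return exposed_risk - unexposed_risk

# Hypothetical cohort: 100 exposed (20 cases), 100 unexposed (10 cases).
table = (20, 80, 10, 90)

example_or = odds_ratio(*table)
example_diff = incidence_difference(*table)

print(f"odds ratio = {example_or:.2f}")
print(f"incidence difference = {example_diff:.2f}")
```

The same table yields an odds ratio of 2.25 and an incidence difference of 0.10.  The two measures sit on different scales and answer different questions, and in a case-control study, in which the investigator fixes the number of cases and controls, the incidence difference cannot be computed at all.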

The Circuit also advanced the “impossibility” defense for Dr. Smith’s failure to rely upon epidemiologic studies with statistically significant results.  Id. at 24. As noted above, studies that lack statistically significant results fail to rule out chance for their finding of risk ratios above or below 1.0 (the measure of no association).  Because the likelihood of obtaining a risk ratio of exactly 1.0 is vanishingly small, epidemiologic science must and does consider the role of chance in explaining data that diverge from a measure of no association.  Dr. Smith’s hand waving about the large size of the studies needed to show an increased risk may have some validity in the context of benzene exposure and APL, but it does not explain or justify the failure to use aggregative techniques such as meta-analysis.  The hand waving also does nothing to rule out the role of chance in producing the results he relied upon in court.

The Circuit Court appeared to misunderstand the very nature of the need for statistical evaluation of stochastic biological events, such as APL incidence in a population.  According to the Circuit, Dr. Smith’s reliance upon epidemiologic data was merely

“meant to challenge the theory that benzene exposure could not cause APL, and to highlight that the limited data available was consistent with the conclusions that he had reached on the basis of other bodies of evidence. He stated that ‘[i]f epidemiologic studies of benzene-exposed workers were devoid of workers who developed APL, one could hypothesize that benzene does not cause this particular subtype of AML.’ The fact that, on the contrary, ‘APL is seen in studies of workers exposed to benzene where the subtypes of AML have been separately analyzed and has been found at higher levels than expected’ suggested to him that the limited epidemiological evidence was at the very least consistent with, and suggestive of, the conclusion that benzene can cause APL.

* * *

Dr. Smith did not infer causality from this suggestion alone, but rather from the accumulation of multiple scientifically acceptable inferences from different bodies of evidence.”

Id. at 25.

But challenging the theory that benzene exposure does not cause APL does not help show the validity of the studies relied upon, or the inferences drawn from them.  This was plaintiffs’ and Dr. Smith’s burden under Rule 702, and the Circuit seemed to lose sight of the law and the science with Professor Cranor’s and Dr. Smith’s sleight of hand.  As for the Circuit’s suggestion that scraps of evidence from different kinds of scientific studies can establish scientific knowledge, this approach was rejected by the great mathematician, physicist, and philosopher of science, Henri Poincaré:

“[O]n fait la science avec des faits comme une maison avec des pierres; mais une accumulation de faits n’est pas plus une science qu’un tas de pierres n’est une maison.”

Henri Poincaré, La Science et l’Hypothèse (1905) (chapter 9, Les Hypothèses en Physique).  Litigants, either plaintiff or defendant, should not be allowed to pick out isolated findings in a variety of studies, and throw them together as if that were science.

As unclear and dubious as the Circuit’s opinion is, the court did not throw out the last 18 years of Rule 702 law.  The Court distinguished the Milward case, with its sparse epidemiologic studies, from those cases “in which the available epidemiological studies found that there is no causal link.”  Id. at 24 (citing Norris v. Baxter Healthcare Corp., 397 F.3d 878, 882 (10th Cir.2005), and Allen v. Pa. Eng’g Corp., 102 F.3d 194, 197 (5th Cir.1996)).  The Court, however, provided no insight into why the epidemiologic studies must rise to the level of showing no causal link before an expert can torture weak, inconsistent, and contradictory data to claim such a link.  This legal sleight of hand is simply a shifting of the burden of proof, which should have been on plaintiffs and Dr. Smith.  Desperation is not a substitute for adequate scientific evidence to support a scientific conclusion.

The Court’s failure to engage more directly with the actual data, facts, and inferences, however, is likely to cause mischief in federal cases around the country.