TORTINI

For your delectation and delight, desultory dicta on the law of delicts.

Confidence in Intervals and Diffidence in the Courts

March 4th, 2012

Next year, the Supreme Court’s Daubert decision will turn 20.  The decision, in interpreting Federal Rule of Evidence 702, dramatically changed the landscape of expert witness testimony.  Still, there are many who would turn back the clock by disabling the gatekeeping function.  In past posts, I have identified scholars, such as Erica Beecher-Monas and the late Margaret Berger, who tried to eviscerate judicial gatekeeping.  Recently a student note argued for the complete abandonment of all judicial control of expert witness testimony.  See Note, “Admitting Doubt: A New Standard for Scientific Evidence,” 123 Harv. L. Rev. 2021 (2010)(arguing that courts should admit all relevant evidence).

One advantage that comes from requiring trial courts to serve as gatekeepers is that the expert witnesses’ reasoning is approved or disapproved in an open, transparent, and rational way.  Trial courts subject themselves to public scrutiny in a way that jury decision making does not permit.  The critics of Daubert often engage in a cynical attempt to remove all controls over expert witnesses in order to empower juries to act on their populist passions and prejudices.  When courts misinterpret statistical and scientific evidence, there is some hope of changing subsequent decisions by pointing out their errors.  Jury errors, on the other hand, unless they involve determinations of issues for which there was “no evidence,” are immune to institutional criticism or correction.

Despite my whining, not all courts butcher statistical concepts.  There are many astute judges out there who see error and call it error.  Take for instance, the trial judge who was confronted with this typical argument:

“While Giles admits that a p-value of .15 is three times higher than what scientists generally consider statistically significant—that is, a p-value of .05 or lower—she maintains that this ‘‘represents 85% certainty, which meets any conceivable concept of preponderance of the evidence.’’ (Doc. 103 at 16).”

Giles v. Wyeth, Inc., 500 F.Supp. 2d 1048, 1056-57 (S.D.Ill. 2007), aff’d, 556 F.3d 596 (7th Cir. 2009).  Despite having case law cited to it (such as In re Ephedra), the trial court looked to the Reference Manual on Scientific Evidence, a resource that seems to be ignored by many federal judges, and rejected the bogus argument.  Unfortunately, the lawyers who made the bogus argument still are licensed, and at large, to incite the same error in other cases.

This business perhaps would be amenable to an empirical analysis.  An enterprising sociologist of the law could conduct some survey research on the science and math training of the federal judiciary, on whether the federal judges have read chapters of the Reference Manual before deciding cases involving statistics or science, and on whether federal judges have expressed the need for further education.  This survey evidence could be capped by an analysis of the prevalence of certain kinds of basic errors, such as the transpositional fallacy committed by so many judges (but decisively rejected in the Giles case).  Perhaps such an empirical analysis would advance our understanding of whether we need specialty science courts.

One of the reasons that the Reference Manual on Scientific Evidence is worthy of so much critical attention is that the volume has the imprimatur of the Federal Judicial Center, and now the National Academies of Science.  Putting aside the idiosyncratic chapter by the late Professor Berger, the Manual clearly presents guidance on many important issues.  To be sure, there are gaps, inconsistencies, and mistakes, but the statistics chapter should be a must-read for federal (and state) judges.

Unfortunately, the Manual has competition from lesser authors whose work obscures, misleads, and confuses important issues.  Consider an article by two would-be expert witnesses, who testify for plaintiffs, and confidently misstate the meaning of a confidence interval:

“Thus, a RR [relative risk] of 1.8 with a confidence interval of 1.3 to 2.9 could very likely represent a true RR of greater than 2.0, and as high as 2.9 in 95 out of 100 repeated trials.”

Richard W. Clapp & David Ozonoff, “Environment and Health: Vital Intersection or Contested Territory?” 30 Am. J. L. & Med. 189, 210 (2004).  This misstatement was then cited and quoted with obvious approval by Professor Beecher-Monas, in her text on scientific evidence.  Erica Beecher-Monas, Evaluating Scientific Evidence: An Interdisciplinary Framework for Intellectual Due Process 60-61 n. 17 (2007).   Beecher-Monas goes on, however, to argue that confidence interval coefficients are not the same as burdens of proof, but then implies that scientific standards of proof are different from the legal preponderance of the evidence.  She provides no citation or support for the higher burden of scientific proof:

“Some commentators have attributed the causation conundrum in the courts to the differing burdens of proof in science and law.28 In law, the civil standard of ‘more probable than not’ is often characterized as a probability greater than 50 percent.29 In science, on the other hand, the most widely used standard is a 95 percent confidence interval (corresponding to a 5 percent level of significance, or p-level).30 Both sound like probabilistic assessment. As a result, the argument goes, civil judges should not exclude scientific testimony that fails scientific validity standards because the civil legal standards are much lower. The transliteration of the ‘more probable than not’ standard of civil factfinding into a quantitative threshold of statistical evidence is misconceived. The legal and scientific standards are fundamentally different. They have different goals and different measures.  Therefore, one cannot justifiably argue that evidence failing to meet the scientific standards nonetheless should be admissible because the scientific standards are too high for preponderance determinations.”

Id. at 65.  This seems to be on the right track, although Beecher-Monas does not state clearly whether she subscribes to the notion that the burdens of proof in science and law differ.  The argument then takes a wrong turn:

“Equating confidence intervals with burdens of persuasion is simply incoherent. The goal of the scientific standard – the 95 percent confidence interval – is to avoid claiming an effect when there is none (i.e., a false positive).31”

Id. at 66.   But this is crazy error; confidence intervals are not burdens of persuasion, legal or scientific.  Beecher-Monas is not, however, content to leave this alone:

“Scientists using a 95 percent confidence interval are making a prediction about the results being due to something other than chance.”

Id. at 66 (emphasis added).  Other than chance?  Well, this implies causality, as well as bias and confounding, but the confidence interval, like the p-value, addresses only random or sampling error.  Beecher-Monas’s error is neither random nor scientific.  Indeed, she perpetuates the same error committed by the Fifth Circuit in a frequently cited Bendectin case, which interpreted the confidence interval as resolving questions of the role of matters “other than chance,” such as bias and confounding.  Brock v. Merrell Dow Pharmaceuticals, Inc., 874 F.2d 307, 311-12 (5th Cir. 1989)(“Fortunately, we do not have to resolve any of the above questions [as to bias and confounding], since the studies presented to us incorporate the possibility of these factors by the use of a confidence interval.”)(emphasis in original).  See, e.g., David H. Kaye, David E. Bernstein, and Jennifer L. Mnookin, The New Wigmore – A Treatise on Evidence:  Expert Evidence § 12.6.4, at 546 (2d ed. 2011); Michael O. Finkelstein, Basic Concepts of Probability and Statistics in the Law 86-87 (2009)(criticizing the overinterpretation of confidence intervals by the Brock court).

Clapp, Ozonoff, and Beecher-Monas are not alone in offering bad advice to judges who must help resolve statistical issues.  Déirdre Dwyer, a prominent scholar of expert evidence in the United Kingdom, manages to bundle up the transpositional fallacy and a misstatement of the meaning of the confidence interval into one succinct exposition:

“By convention, scientists require a 95 per cent probability that a finding is not due to chance alone. The risk ratio (e.g. ‘2.2’) represents a mean figure. The actual risk has a 95 per cent probability of lying somewhere between upper and lower limits (e.g. 2.2 ±0.3, which equals a risk somewhere between 1.9 and 2.5) (the ‘confidence interval’).”

Déirdre Dwyer, The Judicial Assessment of Expert Evidence 154-55 (Cambridge Univ. Press 2008).

Of course, Clapp, Ozonoff, Beecher-Monas, and Dwyer build upon a long tradition of academics’ giving errant advice to judges on this very issue.  See, e.g., Christopher B. Mueller, “Daubert Asks the Right Questions:  Now Appellate Courts Should Help Find the Right Answers,” 33 Seton Hall L. Rev. 987, 997 (2003)(describing the 95% confidence interval as “the range of outcomes that would be expected to occur by chance no more than five percent of the time”); Arthur H. Bryant & Alexander A. Reinert, “The Legal System’s Use of Epidemiology,” 87 Judicature 12, 19 (2003)(“The confidence interval is intended to provide a range of values within which, at a specified level of certainty, the magnitude of association lies.”) (incorrectly citing the first edition of Rothman & Greenland, Modern Epidemiology 190 (Philadelphia 1998));  John M. Conley & David W. Peterson, “The Science of Gatekeeping: The Federal Judicial Center’s New Reference Manual on Scientific Evidence,” 74 N.C.L.Rev. 1183, 1212 n.172 (1996)(“a 95% confidence interval … means that we can be 95% certain that the true population average lies within that range”).

Who has prevailed?  The statistically correct authors of the statistics chapter of the Reference Manual on Scientific Evidence, or the errant commentators?  It would be good to have some empirical evidence to help evaluate the judiciary’s competence. Here are some cases, many drawn from the Manual’s discussions, arranged chronologically, before and after the first appearance of the Manual:

Before First Edition of the Reference Manual on Scientific Evidence:

DeLuca v. Merrell Dow Pharms., Inc., 911 F.2d 941, 948 (3d Cir. 1990)(“A 95% confidence interval is constructed with enough width so that one can be confident that it is only 5% likely that the relative risk attained would have occurred if the true parameter, i.e., the actual unknown relationship between the two studied variables, were outside the confidence interval.   If a 95% confidence interval thus contains ‘1’, or the null hypothesis, then a researcher cannot say that the results are ‘statistically significant’, that is, that the null hypothesis has been disproved at a .05 level of significance.”)(internal citations omitted)(citing in part, D. Barnes & J. Conley, Statistical Evidence in Litigation § 3.15, at 107 (1986), as defining a CI as “a limit above or below or a range around the sample mean, beyond which the true population is unlikely to fall”).

United States ex rel. Free v. Peters, 806 F. Supp. 705, 713 n.6 (N.D. Ill. 1992) (“A 99% confidence interval, for instance, is an indication that if we repeated our measurement 100 times under identical conditions, 99 times out of 100 the point estimate derived from the repeated experimentation will fall within the initial interval estimate … .”), rev’d in part, 12 F.3d 700 (7th Cir. 1993)

DeLuca v. Merrell Dow Pharms., Inc., 791 F. Supp. 1042, 1046 (D.N.J. 1992)(“A 95% confidence interval means that there is a 95% probability that the ‘true’ relative risk falls within the interval”), aff’d, 6 F.3d 778 (3d Cir. 1993)

Turpin v. Merrell Dow Pharms., Inc., 959 F.2d 1349, 1353-54 & n.1 (6th Cir. 1992)(describing a 95% CI of 0.8 to 3.10, to mean that “random repetition of the study should produce, 95 percent of the time, a relative risk somewhere between 0.8 and 3.10”)

Hilao v. Estate of Marcos, 103 F.3d 767, 787 (9th Cir. 1996)(Rymer, J., dissenting and concurring in part).

After the first publication of the Reference Manual on Scientific Evidence:

American Library Ass’n v. United States, 201 F.Supp. 2d 401, 439 & n.11 (E.D.Pa. 2002), rev’d on other grounds, 539 U.S. 194 (2003)

SmithKline Beecham Corp. v. Apotex Corp., 247 F.Supp.2d 1011, 1037-38 (N.D. Ill. 2003)(“the probability that the true value was between 3 percent and 7 percent, that is, within two standard deviations of the mean estimate, would be 95 percent”)(also confusing attained significance probability with posterior probability: “This need not be a fatal concession, since 95 percent (i.e., a 5 percent probability that the sign of the coefficient being tested would be observed in the test even if the true value of the sign was zero) is an  arbitrary measure of statistical significance.  This is especially so when the burden of persuasion on an issue is the undemanding ‘preponderance’ standard, which  requires a confidence of only a mite over 50 percent. So recomputing Niemczyk’s estimates as significant only at the 80 or 85 percent level need not be thought to invalidate his findings.”), aff’d on other grounds, 403 F.3d 1331 (Fed. Cir. 2005)

In re Silicone Gel Breast Implants Prods. Liab. Litig., 318 F.Supp.2d 879, 897 (C.D. Cal. 2004) (interpreting a relative risk of 1.99, in a subgroup of women who had had polyurethane foam covered breast implants, with a 95% CI that ran from 0.5 to 8.0, to mean that “95 out of 100 a study of that type would yield a relative risk somewhere between on 0.5 and 8.0.  This huge margin of error associated with the PUF-specific data (ranging from a potential finding that implants make a woman 50% less likely to develop breast cancer to a potential finding that they make her 800% more likely to develop breast cancer) render those findings meaningless for purposes of proving or disproving general causation in a court of law.”)(emphasis in original)

Ortho–McNeil Pharm., Inc. v. Kali Labs., Inc., 482 F.Supp. 2d 478, 495 (D.N.J.2007)(“Therefore, a 95 percent confidence interval means that if the inventors’ mice experiment was repeated 100 times, roughly 95 percent of results would fall within the 95 percent confidence interval ranges.”)(apparently relying on a party’s expert witness’s report), aff’d in part, vacated in part, sub nom. Ortho McNeil Pharm., Inc. v. Teva Pharms Indus., Ltd., 344 Fed.Appx. 595 (Fed. Cir. 2009)

Eli Lilly & Co. v. Teva Pharms, USA, 2008 WL 2410420, *24 (S.D.Ind. 2008)(stating incorrectly that “95% percent of the time, the true mean value will be contained within the lower and upper limits of the confidence interval range”)

Benavidez v. City of Irving, 638 F.Supp. 2d 709, 720 (N.D. Tex. 2009)(interpreting a 90% CI to mean that “there is a 90% chance that the range surrounding the point estimate contains the truly accurate value.”)

Estate of George v. Vermont League of Cities and Towns, 993 A.2d 367, 378 n.12 (Vt. 2010)(erroneously describing a confidence interval to be a “range of values within which the results of a study sample would be likely to fall if the study were repeated numerous times”)

Correct Statements

There is no reason for any of these courts to have struggled so with the concept of statistical significance or of the confidence interval.  These concepts are well elucidated in the Reference Manual on Scientific Evidence (RMSE):

“To begin with, ‘confidence’ is a term of art. The confidence level indicates the percentage of the time that intervals from repeated samples would cover the true value. The confidence level does not express the chance that repeated estimates would fall into the confidence interval.91

* * *

According to the frequentist theory of statistics, probability statements cannot be made about population characteristics: Probability statements apply to the behavior of samples. That is why the different term ‘confidence’ is used.”

RMSE 3d at 247 (2011).
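The Manual's distinction can be checked with a small simulation.  What follows is only a sketch under assumptions of my own choosing (a normally distributed measurement, a hypothetical true mean of 10, samples of 50, and a normal-approximation interval); none of these numbers comes from any of the cases.  It counts two different things: how often the interval from a study covers the true value, and how often the point estimate from a repeat study lands inside the interval from an initial study.

```python
import random
import statistics

def mean_ci(sample, z=1.96):
    """Return (estimate, lower, upper): a normal-approximation 95% CI for a mean."""
    n = len(sample)
    est = statistics.fmean(sample)
    se = statistics.stdev(sample) / n ** 0.5
    return est, est - z * se, est + z * se

def simulate(true_mean=10.0, sd=2.0, n=50, reps=2000, seed=1):
    """All parameter values are hypothetical and chosen only for illustration."""
    rng = random.Random(seed)

    def draw():
        return [rng.gauss(true_mean, sd) for _ in range(n)]

    covers_truth = 0     # intervals that contain the true mean
    repeat_in_first = 0  # repeat estimates that fall inside the initial interval
    for _ in range(reps):
        _, lo, hi = mean_ci(draw())          # interval from an initial study
        repeat_est, _, _ = mean_ci(draw())   # point estimate from a repeat study
        covers_truth += lo <= true_mean <= hi
        repeat_in_first += lo <= repeat_est <= hi
    return covers_truth / reps, repeat_in_first / reps

coverage, replication = simulate()
print(f"share of intervals covering the true mean:       {coverage:.3f}")
print(f"share of repeat estimates inside first interval: {replication:.3f}")
```

The first share should come out close to the nominal 95 percent; that is the property the confidence level describes.  The second share should come out noticeably lower (in the idealized known-variance case, theory puts it at roughly 83 percent), which is why the “repeat the study and 95 percent of results fall in this interval” formulation in several of the opinions above misdescribes the interval.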

Even before the Manual, many capable authors tried to reach the judiciary to help judges learn and apply statistical concepts more confidently.  Professors Michael Finkelstein and Bruce Levin, of Columbia University’s Law School and Mailman School of Public Health, respectively, have worked hard to educate lawyers and judges in the important concepts of statistical analyses:

“It is the confidence limits PL and PU that are random variables based on the sample data. Thus, a confidence interval (PL, PU ) is a random interval, which may or may not contain the population parameter P. The term ‘confidence’ derives from the fundamental property that, whatever the true value of P, the 95% confidence interval will contain P within its limits 95% of the time, or with 95% probability. This statement is made only with reference to the general property of confidence intervals and not to a probabilistic evaluation of its truth in any particular instance with realized values of PL and PU.”

Michael O. Finkelstein & Bruce Levin, Statistics for Lawyers at 169-70 (2d ed. 2001)

Courts have no doubt been confused to some extent between the operational definition of a confidence interval and the role of the sample point estimate as an estimator of the population parameter.  In some instances, the sample statistic may be the best estimate of the population parameter, but that estimate may be rather crummy because of the sampling error involved.  See, e.g., Kenneth J. Rothman, Sander Greenland, Timothy L. Lash, Modern Epidemiology 158 (3d ed. 2008) (“Although a single confidence interval can be much more informative than a single P-value, it is subject to the misinterpretation that values inside the interval are equally compatible with the data, and all values outside it are equally incompatible. * * *  A given confidence interval is only one of an infinite number of ranges nested within one another. Points nearer the center of these ranges are more compatible with the data than points farther away from the center.”); Nicholas P. Jewell, Statistics for Epidemiology 23 (2004)(“A popular interpretation of a confidence interval is that it provides values for the unknown population proportion that are ‘compatible’ with the observed data.  But we must be careful not to fall into the trap of assuming that each value in the interval is equally compatible.”); Charles Poole, “Confidence Intervals Exclude Nothing,” 77 Am. J. Pub. Health 492, 493 (1987)(“It would be more useful to the thoughtful reader to acknowledge the great differences that exist among the p-values corresponding to the parameter values that lie within a confidence interval … .”).
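The point that values inside an interval are not equally compatible with the data can be made concrete by computing a p-value for each hypothesized relative risk.  The sketch below assumes a hypothetical study (a relative risk of 2.0 with a 95% interval of 1.0 to 4.0) and a normal approximation on the log scale; the numbers are mine, chosen for round arithmetic, and do not come from any study discussed here.

```python
import math

def two_sided_p(estimate_rr, hypothesized_rr, se_log_rr):
    """Two-sided p-value for a hypothesized RR (normal approximation on the log scale)."""
    z = (math.log(estimate_rr) - math.log(hypothesized_rr)) / se_log_rr
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical study result: RR = 2.0, 95% CI from 1.0 to 4.0.
estimate, lower, upper = 2.0, 1.0, 4.0

# Back out the standard error of log(RR) from the width of the interval.
se = (math.log(upper) - math.log(lower)) / (2 * 1.96)

p_values = {rr: two_sided_p(estimate, rr, se) for rr in (1.0, 1.5, 2.0, 3.0, 4.0)}
for rr, p in p_values.items():
    print(f"hypothesized RR {rr:.1f}: p = {p:.3f}")
```

Under these assumptions the p-value is 1.0 at the point estimate and falls to about 0.05 at either limit, with intermediate values in between.  The limits of the interval are simply the places where the curve crosses 0.05; they do not mark off a set of equally plausible values.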

Admittedly, I have given an impressionistic account, and I have used anecdotal methods, to explore the question whether the courts have improved in their statistical assessments in the 20 years since the Supreme Court decided Daubert.  Many decisions go unreported, and perhaps many errors are cut off from the bench in the course of testimony or argument.  I personally doubt that judges exercise greater care in their comments from the bench than they do in published opinions.  Still, the quality of care exercised by the courts would be a worthy area of investigation by the Federal Judicial Center, or perhaps by other sociologists of the law.

Relative Risk > Two in the Courts – Updated

March 3rd, 2012

See , for the updated case law on the issue of using relative and attributable risks to satisfy a plaintiff’s burden of showing, more likely than not, that an exposure or condition caused the plaintiff’s disease or injury.

Scientific illiteracy among the judiciary

February 29th, 2012

Ken Feinberg, speaking at a symposium on mass torts, asks what legal challenges do mass torts confront in the federal courts.  The answer seems obvious.

Pharmaceutical cases that warrant federal court multi-district litigation (MDL) treatment typically involve complex scientific and statistical issues.  The public deserves having MDL cases assigned to judges who have special experience and competence to preside in cases in which these complex issues predominate.  There appears to be no procedural device to ensure that the judges selected in the MDL process have the necessary experience and competence, and a good deal of evidence to suggest that the MDL judges are not up to the task at hand.

In the aftermath of the Supreme Court’s decision in Daubert, the Federal Judicial Center assumed responsibility for producing science and statistics tutorials to help judges grapple with technical issues in their cases.  The Center has produced videotaped lectures as well as the Reference Manual on Scientific Evidence, now in its third edition.  Despite the Center’s best efforts, many federal judges have shown themselves to be incorrigible.  It is time to revive the discussions and debates about implementing a “science court.”

The following three federal MDLs all involved pharmaceutical products, well-respected federal judges, and a fundamental error in statistical inference.

Avandia

Avandia is a prescription oral anti-diabetic medication licensed by GlaxoSmithKline (GSK).  Concerns over Avandia’s association with excess heart attack risk resulted in regulatory revisions of its availability, as well as thousands of lawsuits.  In a decision that affected virtually all of those several thousand claims, aggregated for pretrial handling in a federal MDL, a federal judge, in ruling on a Rule 702 motion, described a clinical trial with a risk ratio greater than 1.0, with a p-value of 0.08, as follows:

“The DREAM and ADOPT studies were designed to study the impact of Avandia on prediabetics and newly diagnosed diabetics. Even in these relatively low-risk groups, there was a trend towards an adverse outcome for Avandia users (e.g., in DREAM, the p-value was .08, which means that there is a 92% likelihood that the difference between the two groups was not the result of mere chance).FN72”

In re Avandia Marketing, Sales Practices and Product Liability Litigation, 2011 WL 13576, *12 (E.D. Pa. 2011)(Rufe, J.).  This is a remarkable error by a trial judge given the responsibility for pre-trial handling of so many cases.  There are many things you can argue about a p-value of 0.08, but Judge Rufe’s interpretation is not an argument; it is error.  That such an error, explicitly warned against in the Reference Manual on Scientific Evidence, could be made by an MDL judge, over 15 years since the first publication of the Manual, highlights the seriousness and the extent of the illiteracy problem.
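Why the “92% likelihood” reading is an error, and not merely loose talk, can be shown with Bayes’ theorem.  The sketch below rests on assumptions that are entirely mine: it treats the observed event as “p-value at or below 0.08” (a simplification of having observed exactly 0.08), and it plugs in hypothetical values for the study’s power and for the prior probability that the null hypothesis is true.  A frequentist would decline to assign that prior at all; the only point is that the p-value alone cannot supply the answer.

```python
def prob_null_given_result(alpha, power, prior_null):
    """P(null is true | p-value <= alpha), by Bayes' theorem.

    alpha:      P(p-value <= alpha | null true)
    power:      P(p-value <= alpha | null false), a hypothetical input
    prior_null: prior probability that the null is true, a hypothetical input
    """
    false_positive = alpha * prior_null
    true_positive = power * (1.0 - prior_null)
    return false_positive / (false_positive + true_positive)

alpha = 0.08
# (power, prior probability of the null): two hypothetical scenarios.
scenarios = [(0.5, 0.5), (0.3, 0.9)]
for power, prior_null in scenarios:
    posterior = prob_null_given_result(alpha, power, prior_null)
    print(f"power={power}, prior={prior_null}: P(null | result) = {posterior:.3f}")
```

With the same 0.08 in hand, the first scenario yields a probability of about 14 percent that chance alone is at work, and the second about 71 percent.  Neither is 8 percent, and neither can be read off the p-value, which is computed on the assumption that the null hypothesis is true.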

What possible basis could the Avandia MDL court have to support this clearly erroneous interpretation of crucial studies in the litigation?  Footnote 72 in Judge Rufe’s opinion references a report by plaintiffs’ expert witness, Allan D. Sniderman, M.D., “a cardiologist, medical researcher, and professor at McGill University.” Id. at *10.  The trial court goes on to note that:

“GSK does not challenge Dr. Sniderman’s qualifications as a cardiologist, but does challenge his ability to analyze and draw conclusions from epidemiological research, since he is not an epidemiologist. GSK’s briefs do not elaborate on this challenge, and in any event the Court finds it unconvincing given Dr. Sniderman’s credentials as a researcher and published author, as well as clinician, and his ability to analyze the epidemiological research, as demonstrated in his report.”

Id.

What more evidence could the Avandia MDL trial court possibly have needed to show that Sniderman was incompetent to give statistical and epidemiologic testimony?  Fundamentally at odds with the Manual on an uncontroversial point, Sniderman had given the court a baseless, incorrect interpretation of a p-value.  Everything else he might have to say on the subject was likely suspect.  If, as the court suggested, GSK did not elaborate upon its challenge with specific examples, then shame on GSK. The trial court, however, could have readily determined that Sniderman was speaking nonsense by reading the chapter on statistics in the Reference Manual on Scientific Evidence.  For all my complaints about gaps in coverage in the Manual, the text on this issue is clear and concise. It really is not too much to expect an MDL trial judge to be conversant with the basic concepts of scientific and statistical evidence set out in the Manual, which is prepared to help federal judges.

Phenylpropanolamine (PPA) Litigation

Litigation over phenylpropanolamine was aggregated, within the federal system, before Judge Barbara Rothstein.  Judge Rothstein is not only a respected federal trial judge, she was the director of the Federal Judicial Center, which produces the Reference Manual on Scientific Evidence.  Her involvement in overseeing the preparation of the third edition of the Manual, however, did not keep Judge Rothstein from badly misunderstanding and misstating the meaning of a p-value in the PPA litigation.  See In re Phenylpropanolamine (PPA) Prods. Liab. Litig., 289 F.Supp. 2d 1230, 1236 n.1 (W.D. Wash. 2003)(“P-values measure the probability that the reported association was due to chance… .”).  Tellingly, Judge Rothstein denied, in large part, the defendants’ Rule 702 challenges.  Juries, however, overwhelmingly rejected plaintiffs’ claims that PPA caused their strokes.

Ephedra Litigation

Judge Rakoff, of the Southern District of New York, notoriously committed the transposition fallacy in the Ephedra litigation:

“Generally accepted scientific convention treats a result as statistically significant if the P-value is not greater than .05. The expression ‘P=.05’ means that there is one chance in twenty that a result showing increased risk was caused by a sampling error—i.e., that the randomly selected sample accidentally turned out to be so unrepresentative that it falsely indicates an elevated risk.”

In re Ephedra Prods. Liab. Litig., 393 F.Supp. 2d 181, 191 (S.D.N.Y. 2005).

Judge Rakoff then fallaciously argued that the use of a critical value of less than 5% of significance probability increased the “more likely than not” burden of proof upon a civil litigant.  Id. at 188, 193.  See Michael O. Finkelstein, Basic Concepts of Probability and Statistics in the Law 65 (2009).

Judge Rakoff may well have had help in confusing the probability used to characterize the plaintiff’s burden of proof with the probability of attained significance.  At least one of the defense expert witnesses in the Ephedra cases gave an erroneous definition of “statistically significant association,” which may have invited the judicial error:

“A statistically significant association is an association between exposure and disease that meets rigorous mathematical criteria demonstrating that the finding is unlikely to be the result of chance.”

Report of John Concato, MD, MS, MPH, at 7, ¶29 (Sept. 13, 2004).  Dr. Concato’s error was picked up and repeated in the defense briefing of its motion to preclude:

“The likelihood that an observed association could occur by chance alone is evaluated using tests for statistical significance.”

Memorandum of Law in Support of Motion by Ephedra Defendants to Exclude Expert Opinions of Charles Buncher, [et alia] …That Ephedra Causes Hemorrhagic Stroke, Ischemic Stroke, Seizure, Myocardial Infarction, Sudden Cardiac Death, and Heat-Related Illnesses at 9 (Dec. 3, 2004).

Judge Rakoff’s insistence that requiring “statistical significance” at the customary 5% level would change the plaintiffs’ burden of proof, and require greater certitude for epidemiologists than for other expert witnesses who opine in less “rigorous” fields of learning, is wrong as a matter of fact.  His Honor’s comparison, moreover, ignores the Supreme Court’s observation that the point of Rule 702 is:

‘‘to make certain that an expert, whether basing testimony upon professional studies or personal experience, employs in the courtroom the same level of intellectual rigor that characterizes the practice of an expert in the relevant field.’’

Kumho Tire Co. v. Carmichael, 526 U.S. 137, 152 (1999).

Judge Rakoff not only ignored the conditional nature of significance probability, but he overinterpreted the role of significance testing in arriving at a conclusion of causality.  Statistical significance may answer the question of the strength of the evidence for ruling out chance in producing the data observed, based upon an assumption of no risk, but it doesn’t alone answer the question whether the study result shows an increased risk.  Bias and confounding must be considered, along with other Bradford Hill factors.

Even if the p-value could be turned into a posterior probability of the null hypothesis, there would be many other probabilities that would necessarily diminish that probability.  Some of the other factors (which could be expressed as objective or subjective probabilities) include:

  • accuracy of the data reporting
  • data collection
  • data categorization
  • data cleaning
  • data handling
  • data analysis
  • internal validity of the study
  • external validity of the study
  • credibility of study participants
  • credibility of study researchers
  • credibility of the study authors
  • accuracy of the study authors’ expression of their research
  • accuracy of the editing process
  • accuracy of the testifying expert witness’s interpretation
  • credibility of the testifying expert witness
  • other available studies, and their respective data and analysis factors
  • all the other Bradford Hill factors

If these largely independent factors each had a probability or accuracy of 95%, the conjunction of their probabilities would likely be below the needed feather weight on top of 50%.  In sum, Judge Rakoff’s confusing significance probability and the posterior probability of the null hypothesis does not subvert the usual standards of proof in civil cases.  See also Sander Greenland, “Null Misinterpretation in Statistical Testing and Its Impact on Health Risk Assessment,” 53 Preventive Medicine 225 (2011).
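The arithmetic behind the conjunction is short enough to check directly.  The sketch takes the seventeen factors listed above, assumes (as the text does for the sake of argument) that each is independent and 95% likely to hold, and multiplies.

```python
# Assumptions carried over from the text: 17 listed factors, each independent,
# each with a 95% probability of holding.
factors = 17
per_factor = 0.95

# Probability that all factors hold at once.
joint = per_factor ** factors

# Smallest number of independent 95% factors whose conjunction drops below 50%.
needed = next(k for k in range(1, 100) if per_factor ** k < 0.5)

print(f"joint probability of all {factors} factors: {joint:.3f}")
print(f"factors needed to fall below 50%: {needed}")
```

On these assumptions the joint probability is roughly 0.42, and fourteen such factors would already be enough to push the conjunction below one half.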

WHENCE COMES THIS ERROR

As a matter of intellectual history, I wonder where this error entered into the judicial system.  As a general matter, there was not much judicial discussion of statistical evidence before the 1970s.  The earliest manifestation of the transpositional fallacy in connection with scientific and statistical evidence appears in an opinion of the United States Court of Appeals, for the District of Columbia Circuit.  Ethyl Corp. v. EPA, 541 F.2d 1, 28 n.58 (D.C. Cir.), cert. denied, 426 U.S. 941 (1976).  The Circuit’s language is worth looking at carefully:

“Petitioners demand sole reliance on scientific facts, on evidence that reputable scientific techniques certify as certain.

Typically, a scientist will not so certify evidence unless the probability of error, by standard statistical measurement, is less than 5%. That is, scientific fact is at least 95% certain.  Such certainty has never characterized the judicial or the administrative process. It may be that the ‘beyond a reasonable doubt’ standard of criminal law demands 95% certainty.  Cf. McGill v. United States, 121 U.S.App.D.C. 179, 185 n.6, 348 F.2d 791, 797 n.6 (1965). But the standard of ordinary civil litigation, a preponderance of the evidence, demands only 51% certainty. A jury may weigh conflicting evidence and certify as adjudicative (although not scientific) fact that which it believes is more likely than not. ***”

Id.  The 95% certainty appears to derive from 95% confidence intervals, although “confidence” is a technical term in statistics, and it most certainly does not mean the probability of the alternative hypothesis under consideration.  Similarly, the error that is less than 5% is not the probability of error of the belief in the hypothesis of no difference between observations and expectations, but rather the probability of observing the data, or data even more extreme, on the assumption that the observed would equal the expected.  The District of Columbia Circuit thus created a strawman:  scientific certainty is 95%, whereas civil and administrative law certainty is 51%.  This is rubbish, which confuses the frequentist probability from hypothesis testing with the subjective probability for belief in a fact.
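To see what the “less than 5%” actually measures, it helps to compute a p-value from its definition.  The sketch below uses a hypothetical example of my own, 60 events in 100 trials where the null hypothesis expects an even split, and adds up the probability of that result or anything more extreme, assuming the null hypothesis to be true.

```python
from math import comb

def binomial_upper_tail(n, k, p_null=0.5):
    """P(X >= k) for X ~ Binomial(n, p_null): a one-sided exact p-value."""
    return sum(
        comb(n, i) * p_null ** i * (1 - p_null) ** (n - i)
        for i in range(k, n + 1)
    )

# Hypothetical data: 60 events observed in 100 trials; the null expects 50.
p_value = binomial_upper_tail(100, 60)
print(f"one-sided p-value: {p_value:.4f}")
```

Every term in the sum is a probability of data computed on the premise that the null hypothesis holds.  Nothing in the calculation assigns a probability to the null hypothesis itself, which is why a p-value below 0.05 cannot be restated as 95% certainty that the hypothesis is false.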

The transpositional fallacy has a good pedigree, but that does not make it correct.  Only a lawyer would suggest that a mistake once made was somehow binding upon future litigants.  The following collection of citations and references illustrates how widespread the fundamental misunderstanding of statistical inference is, in the courts, in the academy, and at the bar.  If courts cannot deliver fair, accurate adjudication of scientific facts, then it is time to reform the system.


Courts

U.S. Supreme Court

Vasquez v. Hillery, 474 U.S. 254, 259 n.3 (1986) (“the District Court . . . accepted . . . a probability of 2 in 1,000 that the phenomenon was attributable to chance”)

U.S. Court of Appeals

First Circuit

Fudge v. Providence Fire Dep’t, 766 F.2d 650, 658 (1st Cir. 1985) (“Widely accepted statistical techniques have been developed to determine the likelihood an observed disparity resulted from mere chance.”)

Second Circuit

Nat’l Abortion Fed. v. Ashcroft, 330 F. Supp. 2d 436 (S.D.N.Y. 2004), aff’d in part, 437 F.3d 278 (2d Cir. 2006), vacated, 224 Fed. App’x 88 (2d Cir. 2007) (reporting an expert witness’s interpretation of a p-value of 0.30 to mean that there was a 30% probability that the study results were due to chance alone)

Smith v. Xerox Corp., 196 F.3d 358, 366 (2d Cir. 1999) (“If an obtained result varies from the expected result by two standard deviations, there is only about a .05 probability that the variance is due to chance.”)

Waisome v. Port Auth., 948 F.2d 1370, 1376 (2d Cir. 1991) (“about one chance in 20 that the explanation for a deviation could be random”)

Ottaviani v. State Univ. of New York at New Paltz, 875 F.2d 365, 372 n.7 (2d Cir. 1989)

Murphy v. General Elec. Co., 245 F. Supp. 2d 459, 467 (N.D.N.Y. 2003) (“less than a 5% probability that age was related to termination by chance”)

Third Circuit

United States v. State of Delaware, 2004 WL 609331, *10 n.27 (D. Del. 2004) (“there is a 5% (or 1 in 20) chance that the relationship observed is purely random”)

Magistrini v. One Hour Martinizing Dry Cleaning, 180 F. Supp. 2d 584, 605 n.26 (D.N.J. 2002) (“only 5% probability that an observed association is due to chance”)

Fifth Circuit

EEOC v. Olson’s Dairy Queens, Inc., 989 F.2d 165, 167 (5th Cir. 1993) (“Dr. Straszheim concluded that the likelihood that [the] observed hiring patterns resulted from truly race-neutral hiring practices was less than one chance in ten thousand.”)

Capaci v. Katz & Besthoff, Inc., 711 F.2d 647, 652 (5th Cir. 1983) (“the highest probability of unbiased hiring was 5.367 × 10⁻²⁰”), cert. denied, 466 U.S. 927 (1984)

Rivera v. City of Wichita Falls, 665 F.2d 531, 545 n.22 (5th Cir. 1982) (“A variation of two standard deviations would indicate that the probability of the observed outcome occurring purely by chance would be approximately five out of 100; that is, it could be said with a 95% certainty that the outcome was not merely a fluke. Sullivan, Zimmer & Richards, supra n.9 at 74.”)

Vuyanich v. Republic Nat’l Bank, 505 F. Supp. 224, 272 (N.D.Tex. 1980) (“the chances are less than one in 20 that the true coefficient is actually zero”), judgment vacated, 723 F.2d 1195 (5th Cir. 1984).

Seventh Circuit

Adams v. Ameritech Services, Inc., 231 F.3d 414, 424, 427 (7th Cir. 2000) (“it is extremely unlikely (that is, there is less than a 5% probability) that the disparity is due to chance.”)

Sheehan v. Daily Racing Form, Inc., 104 F.3d 940, 941 (7th Cir. 1997) (“An affidavit by a statistician . . . states that the probability that the retentions . . . are uncorrelated with age is less than 5 percent.”)

Eighth Circuit

Craik v. Minnesota State Univ. Bd., 731 F.2d 465, 476 n.13 (8th Cir. 1984) (“Statistical significance is a measure of the probability that an observed disparity is not due to chance. Baldus & Cole, Statistical Proof of Discrimination § 9.02, at 290 (1980). A finding that a disparity is statistically significant at the 0.05 or 0.01 level means that there is a 5 per cent. or 1 per cent. probability, respectively, that the disparity is due to chance.”)

Ninth Circuit

Good v. Fluor Daniel Corp., 222 F.Supp. 2d 1236, 1241 n.9 (E.D. Wash. 2002) (describing “statistical tools to calculate the probability that the difference seen is caused by random variation”)

D.C. Circuit

National Lime Ass’n v. EPA, 627 F.2d 416, 453 (D.C. Cir. 1980)

Federal Circuit

Hodges v. Secretary Dep’t Health & Human Services, 9 F.3d 958, 967 (Fed. Cir. 1993) (Newman, J., dissenting) (“Scientists as well as judges must understand: ‘the reality that the law requires a burden of proof, or confidence level, other than the 95 percent confidence level that is often used by scientists to reject the possibility that chance alone accounted for observed differences’.”) (citing and quoting from the Report of the Carnegie Commission on Science, Technology, and Government, Science and Technology in Judicial Decision Making 28 (1993)).


Regulatory Guidance

OSHA’s Guidance for Compliance with Hazard Communication Act:

“Statistical significance is a mathematical determination of the confidence in the outcome of a test. The usual criterion for establishing statistical significance is the p-value (probability value). A statistically significant difference in results is generally indicated by p < 0.05, meaning there is less than a 5% probability that the toxic effects observed were due to chance and were not caused by the chemical. Another way of looking at it is that there is a 95% probability that the effect is real, i.e., the effect seen was the result of the chemical exposure.”

U.S. Dep’t of Labor, Guidance for Hazard Determination for Compliance with the OSHA Hazard Communication Standard (29 CFR § 1910.1200) Section V (July 6, 2007).


Academic Commentators

Lucinda M. Finley, “Guarding the Gate to the Courthouse:  How Trial Judges Are Using Their Evidentiary Screening Role to Remake Tort Causation Rules,” 49 DePaul L. Rev. 335, 348 n. 49 (1999):

“Courts also require that the risk ratio in a study be ‘statistically significant,’ which is a statistical measurement of the likelihood that any detected association has occurred by chance, or is due to the exposure. Tests of statistical significance are intended to guard against what are called ‘Type I’ errors, or falsely ascribing a relationship when there in fact is not one (a false positive).  See SANDERS, supra note 5, at 51. The discipline of epidemiology is inherently conservative in making causal ascriptions, and regards Type I errors as more serious than Type II errors, or falsely assuming no association when in fact there is one (false negative). Thus, epidemiology conventionally requires a 95% level of statistical significance, i.e. that in statistical terms it is 95% likely that the association is due to exposure, rather than to chance. See id. at 50-52; Thompson, supra note 3, at 256-58. Despite courts’ use of statistical significance as an evidentiary screening device, this measurement has nothing to do with causation. It is most reflective of a study’s sample size, the relative rarity of the disease being studied, and the variance in study populations. Thompson, supra note 3, at 256.”

 

Erica Beecher-Monas, Evaluating Scientific Evidence: An Interdisciplinary Framework for Intellectual Due Process 42 n. 30 (2007):

 “‘By rejecting a hypothesis only when the test is statistically significant, we have placed an upper bound, .05, on the chance of rejecting a true hypothesis’. Fienberg et al., p. 22. Another way of explaining this is that it describes the probability that the procedure produced the observed effect by chance.”

Professor Fienberg stated the matter correctly, but Beecher-Monas goes on to restate the matter in her own words, erroneously.  Later, she repeats her incorrect interpretation:

“Statistical significance is a statement about the frequency with which a particular finding is likely to arise by chance.19”

Id. at 61 (citing a paper by Sander Greenland, who correctly stated the definition).

Mark G. Haug, “Minimizing Uncertainty in Scientific Evidence,” in Cynthia H. Cwik & Helen E. Witt, eds., Scientific Evidence Review:  Current Issues at the Crossroads of Science, Technology, and the Law – Monograph No. 7, at 87 (2006)

Carl F. Cranor, Regulating Toxic Substances: A Philosophy of Science and the Law at 33-34 (Oxford 1993) (One can think of α, β (the chances of type I and type II errors, respectively) and 1- β as measures of the “risk of error” or “standards of proof.”) See also id. at 44, 47, 55, 72-76.

Arnold Barnett, “An Underestimated Threat to Multiple Regression Analyses Used in Job Discrimination Cases,” 5 Indus. Rel. L.J. 156, 168 (1982) (“The most common rule is that evidence is compelling if and only if the probability the pattern obtained would have arisen by chance alone does not exceed five percent.”)

David W. Barnes, Statistics as Proof: Fundamentals of Quantitative Evidence 162 (1983)(“Briefly, however, the findings of statistical significance at the P < .05, P < .04, and P < .02 levels indicate that the court can be 95%, 96%, and 98% certain, respectively, that the null hypotheses involved in the specific tests carried out … should be rejected.”)

Wayne Roth-Nelson & Kathey Verdeal, “Risk Evidence in Toxic Torts,” 2 Envt’l Lawyer 405, 415-16 (1996) (confusing burden of proof with standard for hypothesis testing; and apparently endorsing the erroneous views given by Judge Newman, dissenting in Hodges). Caveat: Roth-Nelson is now a “forensic” toxicologist, who testifies in civil and criminal trials.

Steven R. Weller, “Book Review: Regulating Toxic Substances: A Philosophy of Science and Law,” 6 Harv. J. L. & Tech. 435, 436, 437-38 (1993) (“only when the statistical evidence gathered from studies shows that it is more than ninety-five percent likely that a test substance causes cancer will the substance be characterized scientifically as carcinogenic … to determine legal causality, the plaintiff need only establish that the probability with which it is true that the substance in question causes cancer is at least fifty percent, rather than the ninety-five percent to prove scientific causality”).

The Carnegie Commission on Science, Technology, and Government, Report on Science and Technology in Judicial Decision Making 28 (1993) (“The reality is that courts often decide cases not on the scientific merits, but on concepts such as burden of proof that operate differently in the legal and scientific realms. Scientists may misperceive these decisions as based on a misunderstanding of the science, when in actuality the decision may simply result from applying a different norm, one that, for the judiciary, is appropriate.  Much, for instance, has been written about ‘junk science’ in the courtroom. But judicial decisions that appear to be based on ‘bad’ science may actually reflect the reality that the law requires a burden of proof, or confidence level, other than the 95 percent confidence level that is often used by scientists to reject the possibility that chance alone accounted for observed differences.”).


Plaintiffs’ Counsel

Steven Rotman, “Don’t Know Much About Epidemiology?” Trial (Sept. 2007) (Author’s question answered in the affirmative:  “P values.  These measure the probability that a reported association between a drug and condition was due to chance.  A P-value of 0.05, which is generally considered the standard for statistical significance, means there is a 5 percent probability that the association was due to chance.”)

Defense Counsel

Bruce R. Parker & Anthony F. Vittoria, “Debunking Junk Science: Techniques for Effective Use of Biostatistics,” 65 Defense Csl. J. 35, 44 (2002) (“a P value of .01 means the researcher can be 99 percent sure that the result was not due to chance”).

Meta-Analysis of Observational Studies in Non-Pharmaceutical Litigations

February 26th, 2012

Yesterday, I posted on several pharmaceutical litigations that have involved meta-analytic studies.   Meta-analytic studies have also figured prominently in non-pharmaceutical product liability litigation, as well as in litigation over videogames, criminal recidivism, and eyewitness testimony.  Some, but not all, of the cases in these other areas of litigation are collected below.  In some cases, the reliability or validity of the meta-analyses was challenged; in some cases, the court fleetingly referred to meta-analyses relied upon by the parties.  Some of the courts’ treatments of meta-analysis are woefully inadequate or erroneous.  The failure of the Reference Manual on Scientific Evidence to update its treatment of meta-analysis is telling.  See “The Treatment of Meta-Analysis in the Third Edition of the Reference Manual on Scientific Evidence” (Nov. 14, 2011).

 

Abortion (Breast Cancer)

Christ’s Bride Ministries, Inc. v. Southeastern Pennsylvania Transportation Authority, 937 F.Supp. 425 (E.D. Pa. 1996), rev’d, 148 F.3d 242 (3d Cir. 1997)

Asbestos

In re Joint E. & S. Dist. Asbestos Litig., 827 F. Supp. 1014, 1042 (S.D.N.Y. 1993)(“adding a series of positive but statistically insignificant SMRs [standardized mortality ratios] together does not produce a statistically significant pattern”), rev’d, 52 F.3d 1124 (2d Cir. 1995).

In Re Asbestos Litigation, Texas Multi District Litigation Cause No. 2004-03964 (June 30, 2005)(Davidson, J.)(“The Defendants’ response was presented by Dr. Timothy Lash.  I found him to be highly qualified and equally credible.  He largely relied on the report submitted to the Environmental Protection Agency by Berman and Crump (“B&C”).  He found the meta-analysis contained in B&C credible and scientifically based.  B&C has not been published or formally accepted by the EPA, but it does perform a valuable study of the field.  If the question before me was whether B&C is more credible than the Plaintiffs’ studies taken together, my decision might well be different.”)

Jones v. Owens-Corning Fiberglas, 288 N.J. Super. 258, 672 A.2d 230 (1996)

Berger v. Amchem Prods., 818 N.Y.S.2d 754 (2006)

Grenier v. General Motors Corp., 2009 WL 1034487 (Del.Super. 2009)

Benzene

Knight v. Kirby Inland Marine, Inc., 363 F. Supp. 2d 859 (N.D. Miss. 2005)(precluding proffered opinion that benzene caused bladder cancer and lymphoma; noting without elaboration or explanation, that meta-analyses are “of limited value in combining the results of epidemiologic studies based on observation”), aff’d, 482 F.3d 347 (5th Cir. 2007)

Baker v. Chevron USA, Inc., 680 F.Supp. 2d 865 (S.D. Ohio 2010)

Diesel Exhaust Exposure

King v. Burlington Northern Santa Fe Ry. Co., 277 Neb. 203, 762 N.W.2d 24 (2009)

Kennecott Greens Creek Mining Co. v. Mine Safety & Health Admin., 476 F.3d 946 (D.C. Cir. 2007)

Eyewitness Testimony

State of New Jersey v. Henderson, 208 N.J. 208, 27 A.3d 872 (2011)

Valle v. Scribner, 2010 WL 4671466 (C.D. Calif. 2010)

People v. Banks, 16 Misc.3d 929, 842 N.Y.S.2d 313 (2007)

Lead

Palmer v. Asarco Inc., 510 F.Supp.2d 519 (N.D. Okla. 2007)

PCBs

In re Paoli R.R. Yard PCB Litigation, 916 F.2d 829, 856-57 (3d Cir.1990) (‘‘There is some evidence that half the time you shouldn’t believe meta-analysis, but that does not mean that meta-analyses are necessarily in error. It means that they are, at times, used in circumstances in which they should not be.’’) (internal quotation marks and citations omitted), cert. denied, 499 U.S. 961 (1991)

Repetitive Stress

Allen v. International Business Machines Corp., 1997 U.S. Dist. LEXIS 8016 (D. Del. 1997)

Tobacco

Flue-Cured Tobacco Cooperative Stabilization Corp. v. United States Envt’l Protection Agency, 4 F.Supp.2d 435 (M.D.N.C. 1998), vacated by, 313 F.3d 852 (4th Cir. 2002)

Tocolytics – Medical Malpractice

Hurd v. Yaeger, 2009 WL 2516874 (M.D. Pa. 2009)

Toluene

Black v. Rhone-Poulenc, Inc., 19 F.Supp.2d 592 (S.D.W.Va. 1998)

Video Games (Violent Behavior)

Brown v. Entertainment Merchants Ass’n, ___ U.S.___, 131 S.Ct. 2729 (2011)

Entertainment Software Ass’n v. Blagojevich, 404 F.Supp.2d 1051 (N.D. Ill. 2005)

Entertainment Software Ass’n v. Hatch, 443 F.Supp.2d 1065 (D. Minn. 2006)

Video Software Dealers Ass’n v. Schwarzenegger, 556 F.3d 950 (9th Cir. 2009)

Vinyl Chloride

Taylor v. Airco, 494 F. Supp. 2d 21 (D. Mass. 2007)(permitting opinion testimony that vinyl chloride caused intrahepatic cholangiocarcinoma, without commenting upon the reasonableness of reliance upon the meta-analysis cited)

Welding

Cooley v. Lincoln Electric Co., 693 F.Supp.2d 767 (N.D. Ohio 2010)

Meta-Analysis in Pharmaceutical Cases

February 25th, 2012

The Third Edition of the Reference Manual on Scientific Evidence attempts to cover a lot of ground to give the federal judiciary guidance on scientific, medical, statistical, and engineering issues.  It has some successes, and some failures.  One of the major problems in coverage in the new Manual is its inconsistent, sparse, and at points outdated treatment of meta-analysis.   See “The Treatment of Meta-Analysis in the Third Edition of the Reference Manual on Scientific Evidence” (Nov. 14, 2011).

As I have pointed out elsewhere, the gaps and problems in the Manual’s coverage are not “harmless error,” when some courts have struggled to deal with methodological and evaluative issues in connection with specific meta-analyses.  See “Learning to Embrace Flawed Evidence – The Avandia MDL’s Daubert Opinion” (Jan. 10, 2011).

Perhaps the reluctance to treat meta-analysis more substantively comes from a perception that the technique for analyzing multiple studies does not come up frequently in litigation.  If so, let me help dispel the notion.  I have collected a partial list of drug and medical device cases that have confronted meta-analysis in one form or another.  In some cases, such as the Avandia MDL, a meta-analysis was a key, or the key, piece of evidence.  In other cases, meta-analysis may have been treated more peripherally.  Still, there are over 20 pharmaceutical cases in the last two decades that have dealt with the statistical techniques involved in meta-analysis.  In another post, I will collect the non-pharmaceutical cases as well.

 

Aredia – Zometa

Deutsch v. Novartis Pharm. Corp., 768 F. Supp. 2d 420 (E.D.N.Y. 2011)

 

Avandia

In re Avandia Marketing, Sales Practices and Product Liability Litigation, 2011 WL 13576, *12 (E.D. Pa. 2011)

Avon Pension Fund v. GlaxoSmithKline PLC, 343 Fed.Appx. 671 (2d Cir. 2009)

 

Baycol

In re Baycol Prods. Litig., 532 F.Supp. 2d 1029 (D. Minn. 2007)

 

Bendectin

Daubert v. Merrell Dow Pharm., 43 F.3d 1311 (9th Cir. 1995) (on remand from Supreme Court)

DePyper v. Navarro, 1995 WL 788828 (Mich.Cir.Ct. 1995)

 

Benzodiazepine

Vinitski v. Adler, 69 Pa. D. & C.4th 78, 2004 WL 2579288 (Phila. Cty. Ct. Common Pleas 2004)

 

Celebrex – Bextra

In re Bextra & Celebrex Marketing Sales Practices & Prod. Liab. Litig., 524 F.Supp.2d 1166 (2007)


E5 (anti-endotoxin monoclonal antibody for gram-negative sepsis)

Warshaw v. Xoma Corp., 74 F.3d 955 (1996)

 

Excedrin vs. Tylenol

McNeil-P.C.C., Inc. v. Bristol-Myers Squibb Co., 938 F.2d 1544 (2d Cir. 1991)

 

Fenfluramine, Phentermine

In re Diet Drugs Prod. Liab. Litig., 2000 WL 1222042 (E.D.Pa. 2000)

 

Fosamax

In re Fosamax Prods. Liab. Litig., 645 F.Supp.2d 164 (S.D.N.Y. 2009)

 

Gadolinium

In re Gadolinium-Based Contrast Agents Prod. Liab. Litig., 2010 WL 1796334 (N.D. Ohio 2010)

 

Neurontin

In re Neurontin Marketing, Sales Practices, and Products Liab. Litig., 612 F.Supp.2d 116 (D. Mass. 2009)

 

Paxil (SSRI)

Tucker v. Smithkline Beecham Corp., 2010 U.S. Dist. LEXIS 30791 (S.D.Ind. 2010)

 

Prozac (SSRI)

Rimberg v. Eli Lilly & Co., 2009 WL 2208570 (D.N.M. 2009)

 

Seroquel

In re Seroquel Products Liab. Litig., 2009 WL 3806434 *5 (M.D. Fla. 2009)

 

Silicone – Breast Implants

Allison v. McGhan Med. Corp., 184 F.3d 1300, 1315 n. 12 (11th Cir. 1999) (noting, in passing, that the district court had found a meta-analysis (the “Kayler study”) unreliable “because it was a re-analysis of other studies that had found no statistical correlation between silicone implants and disease”)

Thimerosal – Vaccine

Salmond v. Sec’y Dep’t of Health & Human Services, 1999 WL 778528 (Fed.Cl. 1999)

Hennessey v. Sec’y Dep’t Health & Human Services, 2009 WL 1709053 (Fed.Cl. 2009)

 

Trasylol

In re Trasylol Prods. Liab. Litig., 2010 WL 1489793 (S.D. Fla. 2010)

 

Vioxx

Merck & Co., Inc. v. Ernst, 296 S.W.3d 81 (Tex. Ct. App. 2009)
Merck & Co., Inc. v. Garza, 347 S.W.3d 256 (Tex. 2011)

 

X-Ray Contrast Media (Nephrotoxicity of Visipaque versus Omnipaque)

Bracco Diagnostics, Inc. v. Amersham Health, Inc., 627 F.Supp.2d 384 (D.N.J. 2009)

Zestril

E.R. Squibb & Sons, Inc. v. Stuart Pharms., 1990 U.S. Dist. LEXIS 15788 (D.N.J. 1990) (Zestril versus Squibb’s competing product, Capoten)

 

Zoloft (SSRI)

Miller v. Pfizer, Inc., 356 F.3d 1326 (10th Cir. 2004)

 

Zymar

Senju Pharmaceutical Co. Ltd. v. Apotex Inc., 2011 WL 6396792 (D.Del. 2011)

 

Zyprexa

In re Zyprexa Products Liab. Litig., 489 F.Supp.2d 230 (E.D.N.Y. 2007) (Weinstein, J.)

The MDL Pocket Guide

February 22nd, 2012

Multi-district litigation is the way that the great bulk of products liability cases are now handled in the federal courts.  Once the Judicial Panel on Multi-District Litigation decides that MDL treatment is appropriate, the district courts, around the country, where cases have been filed, transfer their cases to the single district court judge for pre-trial consolidation.  Along with the products cases, the districts will transfer related cases, such as consumer and securities fraud and medical monitoring class actions to the transferee court.

The Federal Judicial Center has recently published a “pocket guide” to describe the process of managing an MDL for products liability cases:   Barbara J. Rothstein and Catherine R. Borden, Managing Multidistrict Litigation in Products Liability Cases: A Pocket Guide for Transferee Judges (2011).  The guide weighs in at 53 pages with some standard, and some non-standard, guidance for judges managing MDL products liability cases.  Judge Rothstein, a veteran judge in MDL products litigation, recently stepped down as head of the Federal Judicial Center.

Because there is a significant risk that your MDL judge will read the Pocket Guide, this pamphlet should be on your and your clients’ reading list.  Much of the pamphlet is unexceptional, but there is some non-standard guidance, which you may want to flag for the MDL judge in early briefings.  What follows are just some brief comments on the Guide.

Expert Discovery

Much of the Guide’s discussion on expert discovery is hornbook law, but the following passage gives some novel, dubious guidance:

“You should be aware of the possibility that not only the parties’ testifying experts, but also the published research on which the experts rely, may be subject to charges of bias. For example, where parties directly or indirectly fund authors of research articles and studies that are relied upon by testifying experts, such funding may be discoverable as relevant to the issue of bias.48  In cases involving disputed evidence on causation, there will often be ongoing scientific studies addressing the disputed issue. You may need to establish procedures for discovery regarding such studies. Generally, courts protect researchers from disclosure of data or opinions relating to an ongoing unpublished study. By contrast, courts generally allow discovery into party-sponsored studies.49”

Pocket Guide at § 9.e (citations omitted). The Guide suggests that courts protect researchers from compulsory process to obtain data from “ongoing unpublished” studies, but this raises the question of what should be done about studies that already have been published and are being relied upon by one side or the other, or both, in litigation.  More troubling is the Guide’s suggestion that an MDL court should unleash discovery against authors of published works for evidence of bias, with a citation to a case that ordered parties to produce lists of payments to authors of articles relied upon by expert witnesses.

The case-law support for the suggested approach is thin – just one case – and it involves serious problems.  First, expert witnesses must itemize all studies, publications, data, and the like, which they have “considered.”  The expert witnesses’ reports must give a detailed recitation of their opinions and the bases for their opinions.  Does the mere appearance of an article on an expert witness’s “consideration” list trigger this invasive discovery?  The Guide’s language and citation suggest so, but there is little reason or logic to support such an inquiry.  The authors of a study relied upon might appear to be more appropriate targets for this inquiry, but sometimes studies cut different ways, and an expert witness for one side or the other might reasonably rely upon some data and analyses and not others from a single study.  The process contemplated by the Guide appears to dichotomize, in a simple-minded way, the entire body of research that might bear upon scientific questions in a litigation.

Second, the discovery exercise described in the pamphlet raises concerns about the confidentiality of consultations made with experts who were never considered for a testifying role in litigation. These consultations may have been made with the understanding that the fact and the substance of the consultation would be confidential. Some consultants, on both sides of litigations, may be concerned about powerful superiors in their universities who are allied with litigants or their regulatory allies on one side or the other.

Third, both sides in MDL cases are likely to speak to a good number of experts in the field. The parties on all sides will generally interview experts based upon their reported views or their interest in issues that are relevant to the litigation. Before courts create lists of “tainted” scientific papers, they might well consider the timing of the authorship and whether the payments were made before or after the author in question wrote the article that is relied upon.

Fourth, there is an unfair asymmetry involved in this exercise. Many MDL cases involve one or a few defendants, and it is generally feasible for those defendants and their counsel to trace all payments made to scientists, for whatever reason. Plaintiffs’ counsel, serving on a Steering Committee, may express an inability to contact every plaintiffs’ counsel who has taken state or federal cases related to the MDL, or who has considered taking cases, and who has spoken to experts as part of their research or representations.  While that claimed inability may well be real (or not), it leaves the reporting on one side incomplete, and creates prejudice to the side (usually the defense) that has the ability to provide a definitive list.

Fifth, the scope of the disclosure exercise cannot be easily and fairly circumscribed. The defendants or the Plaintiffs’ Steering Committee may not have paid any money to scientists who have worked with other litigants in other litigations. Those scientists, who are not financially tied to the parties in the particular MDL, may still have substantial biases as a result of having worked with counsel – indeed, they may be the same counsel as are involved in the MDL – but the disclosure rules obscure their biases and create an imbalanced view of who is “interested,” and who is “disinterested.”  For instance, a prominent plaintiffs’ counsel on the MDL’s Plaintiffs’ Steering Committee may have worked with an expert, and even may have encouraged that expert to publish on a topic that would affect a wide array of litigations, including the MDL where discovery is proposed into the expert’s “biases.”  If the expert, however, had no engagement for the MDL itself, the association with plaintiffs’ counsel in other cases would appear to immunize this expert from discovery into payments and biases.

Sixth, the suggested procedure will not bring in information from plaintiffs’ counsel, whose cases are filed only in state courts, and who are thus not subject to orders of the MDL court.  The state court plaintiffs could work up any number of consulting expert witnesses, and have them publish extensively on the MDL issues, but the federal MDL court’s discovery will not reveal the subterfuge.  The practice of the state court consultations will be “privileged” under most states’ rules on expert witnesses.  The defendants, of course, will be in both state and federal courts, and thus all their consulting expert witnesses will be subject to discovery.

The Guide’s suggestion does not appear to have been thought through very carefully.

 

Attorney Fees

Who can be against attorney’s fees?  Still, common-benefit funds raise some thorny cy pres problems when the MDL has wound down:

“In a large MDL, many courts appoint common benefit fee committees, charged either with auditing and recommending common benefit compensation requests, or determining the final allocation of a common benefit fee award among the competing common benefit attorneys.”

Pocket Guide at § 4.b.

The discussion of common-benefit funds could benefit from discussing some of the mechanics of ensuring that monies in the funds are returned to claimants at the conclusion of the litigation to avoid improprieties, such as have been seen in MDL 926, In re Silicone Gel Breast Implants Litigation.  See SKAPP A LOT (April 30, 2010).

 

Name that MDL

The Pocket Guide has no suggestions about how to name the MDL, but while I am whining, here is another complaint:  why are product MDLs typically given names like:  In re Widget Products Liability Litigation?  Doesn’t this prejudge the issue in a way that is unfair to the defendant?  Every videotaped deposition will begin with a statement from the videographer to the effect that the deposition is being taken in the Widget liability litigation, or something like that.  Why aren’t these MDLs named:  In re Widget Safety Litigation?  Or, In re Widget Alleged Product Liability Litigation?  The names are already a mouthful; they should at least be fair.

Unreported Decisions on Expert Witness Opinion in New Jersey

February 21st, 2012

In New Jersey, as in other states, unpublished opinions have a quasi-outlaw existence.  According to the New Jersey Rules of Court, unpublished opinions are not precedential.  By court fiat, the court system has declared that it can act a certain way in a given case, and not have to follow its own lead in other cases:

No unpublished opinion shall constitute precedent or be binding upon any court. Except for appellate opinions not approved for publication that have been reported in an authorized administrative law reporter, and except to the extent required by res judicata, collateral estoppel, the single controversy doctrine or any other similar principle of law, no unpublished opinion shall be cited by any court. No unpublished opinion shall be cited to any court by counsel unless the court and all other parties are served with a copy of the opinion and of all contrary unpublished opinions known to counsel.

New Jersey Rule of Court 1:36-3 (Unpublished Opinions).

Litigants down the road may feel that they are not being given the equal protection of the law, but never mind.  Res judicata and collateral estoppel are in, but stare decisis is out.  Consistency and coherence are so difficult; surely it is better to be free from these criteria of rationality unless we decide to “opt in” by publishing opinions with our decisions.  As many other scholars and commentators have noted, rules of this sort allow decisions from other states, and even other countries, to be potentially persuasive, whereas by court rule and fiat, an unpublished decision of the deciding court cannot have any precedential value.  Why then permit unpublished cases to be cited at all?

Having tracked decisions, published and un-, in New Jersey for many years, I am left with an impression that the Appellate Division has a tendency to refuse to publish opinions of decisions in which it has reversed the trial court’s refusal to exclude expert witness testimony, or in which it has affirmed the trial court’s exclusion of expert testimony.  Opinions that explain the affirmance of a denial of expert witness exclusion or the reversal of a trial court’s grant of exclusion appear to be published more often.  Stated as a four-fold table:

                            Trial Court Permits Expert    Trial Court Bars Expert
Appellate Court Affirms     Published                     Not Published
Appellate Court Reverses    Not Published                 Published

My impression is that there is an institutional bias against creating a body of law that illuminates the criteria for admission and for exclusion of expert witness opinion testimony. This is only an impression, and I do not have statistics, descriptive or inferential, on these judicial behaviors.  From a jurisprudential perspective, publishing the affirmance of an exclusion below, or the reversal of a denial of exclusion below, should be at least as important as publishing the reversal of an exclusion below.  The goal of announcing to the Bar and to trial judges the criteria for inclusion and exclusion would seem to suggest greater publication of the opinions from the two unpublished cells in the contingency table above.

No-citation and no-precedent rules are deeply problematic, and have attracted a great deal of scholarly attention.  See Erica Weisgerber, “Unpublished Opinions: A Convenient Means to an Unconstitutional End,” 97 Georgetown L.J. 621 (2009);  Rafi Moghadam, “Judge Nullification: A Perception of Unpublished Opinions,” 62 Hastings L.J. 1397 (2011);  Norman R. Williams, “The Failings of Originalism:  The Federal Courts and the Power of Precedent,” 37 U.C. Davis L. Rev. 761 (2004);  Dione C. Greene, “The Federal Courts of Appeals, Unpublished Decisions, and the ‘No-Citation Rule,’” 81 Indiana L.J. 1503 (2006);  Vincent M. Cox, “Freeing Unpublished Opinions from Exile: Going Beyond the Citation Permitted by Proposed Federal Rule of Appellate Procedure 32.1,” 44 Washburn L.J. 105 (2004);  Sarah E. Ricks, “The Perils of Unpublished Non-Precedential Federal Appellate Opinions: A Case Study of The Substantive Due Process State-Created Danger Doctrine in One Circuit,” 81 Wash. L. Rev. 217 (2006);  Michael J. Woodruff, “State Supreme Court Opinion Publication in the Context of Ideology and Electoral Incentives,” New York University Department of Politics (March 2011);  Michael B. W. Sinclair, “Anastasoff versus Hart: The Constitutionality and Wisdom of Denying Precedential Authority to Circuit Court Decisions.”  See generally The Committee for the Rule of Law (website) (collecting scholarship and news on the issue of unpublished and supposedly non-precedential opinions).

What would be useful is an empirical analysis of the New Jersey Appellate Division’s judicial behavior in deciding whether or not to publish decisions for each of the four cells, in the four-fold table, above.  If my impression is correct, the suggestion of institutional bias would give further support to the abandonment of N.J. Rule of Court 1:36-3.
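For anyone inclined to take up that project, the tally itself is simple; the hard part is collecting the opinions.  What follows is only a rough sketch in Python of how publication rates might be computed for each cell of the four-fold table.  The records are invented purely for illustration, and the names (records, publication_rates) are hypothetical; nothing here reflects actual Appellate Division data.

```python
from collections import defaultdict

# Hypothetical records, invented purely for illustration.
# Each entry: (trial court ruling, appellate outcome, opinion published?)
records = [
    ("permits", "affirms", True),
    ("permits", "affirms", True),
    ("permits", "reverses", False),
    ("bars", "affirms", False),
    ("bars", "affirms", True),
    ("bars", "reverses", True),
]


def publication_rates(rows):
    """Return the share of published opinions for each (trial, appellate) cell."""
    counts = defaultdict(lambda: [0, 0])  # cell -> [published, total]
    for trial, appellate, published in rows:
        cell = (trial, appellate)
        counts[cell][1] += 1
        if published:
            counts[cell][0] += 1
    return {cell: pub / total for cell, (pub, total) in counts.items()}


rates = publication_rates(records)
for cell, rate in sorted(rates.items()):
    print(cell, f"{rate:.0%} published")
```

With real data, the four rates could then be compared to see whether the pattern described above holds up, or is merely an impression.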

When There Is No Risk in Risk Factor

February 20th, 2012

Some of the terminology of statistics and epidemiology is not only confusing, but it is misleading.  Consider the terms “effect size,” “random effects,” and “fixed effect,” which are all used to describe associations even if known to be non-causal.  Biostatisticians and epidemiologists know that the terms are about putative or potential effects, but the sloppy, short-hand nomenclature can be misleading.

Although “risk” has a fairly precise meaning in scientific parlance, the usage for “risk factor” is fuzzy, loose, and imprecise.  Journalists and plaintiffs’ lawyers use “risk factor,” much as they use another frequently abused term in their vocabulary:  “link.”  Both “risk factor” and “link” sound as though they are “causes,” or at least as though they have something to do with causation.  The reality is usually otherwise.

The business of exactly what “risk factor” means is puzzling and disturbing.  The phrase seems to have gained currency because it is squishy and without a definite meaning.  Like the use of “link” by journalists, the use of “risk factor” protects the speaker against contradiction, but appears to imply a scientifically valid conclusion.  Plaintiffs’ counsel and witnesses love to throw this phrase around precisely because of its ambiguity.  In journal articles, authors sometimes refer to any exposure inquired about in a case-control study as a “risk factor,” regardless of the study result.  So a risk factor can be merely an “exposure of interest,” or a possible cause, or a known cause.

The author’s meaning in using the phrase “risk factor” can often be discerned from context.  When an article reports a case-control study that finds an association with an exposure to some chemical, the article will likely report in the discussion section that the study found that chemical to be a risk factor.  The context here makes clear that the chemical was found to be associated with the outcome, and that chance was excluded as a likely explanation because the odds ratio was statistically significant.  The context is equally clear that the authors did not conclude that the chemical was a cause of the outcome:  they did not rule out bias or confounding, they did not do any appropriate analysis to reach a causal conclusion, and their single study would not have justified a causal conclusion in any event.
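To make concrete what “statistically significant” does and does not establish in such a study, here is a minimal sketch in Python.  The counts are hypothetical, invented for illustration only, and the function name odds_ratio_with_ci is my own.  The interval is the common large-sample (Woolf, or logit) approximation, which is only one of several methods an author might use.

```python
import math


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Odds ratio and approximate 95% CI (Woolf's logit method) for a 2x2 table.

    a: exposed cases,    b: unexposed cases
    c: exposed controls, d: unexposed controls
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of the log odds ratio
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical case-control counts, for illustration only
or_hat, lo, hi = odds_ratio_with_ci(a=40, b=60, c=20, d=80)
print(f"OR = {or_hat:.2f}, 95% CI ({lo:.2f}, {hi:.2f})")
```

An interval that excludes 1.0, as this one does, speaks only to random error.  Nothing in the arithmetic addresses bias or confounding, which is why calling the exposure a “risk factor” on this showing is a statement about association, not causation.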

Sometimes authors qualify “risk factor” with an adjective to give more specific meaning to their usage.  Some of the adjectives used in connection with the phrase include:

– putative, possible, potential, established, well-established, known, certain, causal, and causative

The use of the adjective highlights the absence of a precise meaning for “risk factor,” standing alone.  Adjectives such as “established,” or “known” imply earlier similar findings, which are corroborated by the study at hand.  Unless “causal” is used to modify “risk factor,” however, there is no reason to interpret the unqualified phrase to imply a cause.

Here is how the phrase “risk factor” is described in some noteworthy texts and treatises.

Legal Treatises

Professor David Faigman, and colleagues, with some understatement, note that the term “risk factor is loosely used”:

“Risk Factor An aspect of personal behavior or life-style, an environmental exposure, or an inborn or inherited characteristic, which on the basis of epidemiologic evidence is known to be associated with health-related condition(s) considered important to prevent. The term risk factor is rather loosely used, with any of the following meanings:

1. An attribute or exposure that is associated with an increased probability of a specified outcome, such as the occurrence of a disease. Not necessarily a causal factor.

2. An attribute or exposure that increases the probability of occurrence of disease or other specified outcome.

3. A determinant that can be modified by intervention, thereby reducing the probability of occurrence of disease or other specified outcomes.”

David L. Faigman, Michael J. Saks, Joseph Sanders, and Edward Cheng, Modern Scientific Evidence:  The Law and Science of Expert Testimony 301, vol. 1 (2010)(emphasis added).

The Reference Manual on Scientific Evidence (2011) (RMSE3d) does not offer much in the way of meaningful guidance here.  The chapter on statistics in the third edition provides a somewhat circular, and unhelpful definition.  Here is the entry in that chapter’s glossary:

risk factor. See independent variable.

RMSE3d at 295.  If the glossary defined “independent variable” as simply a quantifiable variable that was being examined for some potential relationship with the outcome, or dependent, variable, the RMSE would have avoided error.  Instead the chapter’s glossary, as well as its text, defines independent variables as “causes,” which begs the question why do a study to determine whether the “independent variable” is even a candidate for a causal factor?  Here is how the statistics chapter’s glossary defines independent variable:

“Independent variables (also called explanatory variables, predictors, or risk factors) represent the causes and potential confounders in a statistical study of causation; the dependent variable represents the effect. ***.”

RMSE3d at 288.  This is surely circular.  Studies of causation are using independent variables that represent causes?  There would be no reason to do the study if we already knew that the independent variables were causes.

The text of the RMSE chapter on statistics propagates the same confusion:

“When investigating a cause-and-effect relationship, the variable that represents the effect is called the dependent variable, because it depends on the causes.  The variables that represent the causes are called independent variables. With a study of smoking and lung cancer, the independent variable would be smoking (e.g., number of cigarettes per day), and the dependent variable would mark the presence or absence of lung cancer. Dependent variables also are called outcome variables or response variables. Synonyms for independent variables are risk factors, predictors, and explanatory variables.”

RMSE3d at 219.  In the text, the identification of causes with risk factors is explicit.  Independent variables are the causes, and a synonym for an independent variable is “risk factor.”  The chapter could have avoided this error simply by the judicious use of “putative,” or “candidate” in front of “causes.”

The chapter on epidemiology exercises more care by using “potential” to modify and qualify the risk factors that are considered in a study:

“In contrast to clinical studies in which potential risk factors can be controlled, epidemiologic investigations generally focus on individuals living in the community, for whom characteristics other than the one of interest, such as diet, exercise, exposure to other environmental agents, and genetic background, may distort a study’s results.”

RMSE3d at 556 (emphasis added).

 

Scientific Texts

Turning our attention to texts on epidemiology written for professionals rather than judges, we find that the term “risk factor” is sometimes used with a careful awareness of its ambiguity.

Herbert I. Weisberg is a statistician whose firm, Correlation Research Inc., specializes in applied statistics in legal issues.  Weisberg recently published an interesting book on bias and causation, which is recommended reading for lawyers who litigate claimed health effects.  Weisberg’s book defines “risk factor” as merely an exposure of interest in a study that is looking for associations with a harmful outcome.  He insightfully notes that authors use the phrase “risk factor” and similar phrases to avoid causal language:

“We will often refer to this factor of interest as a risk factor, although the outcome event is not necessarily something undesirable.”

Herbert I. Weisberg, Bias and Causation:  Models and Judgment for Valid Comparisons 27 (2010).

“Causation is discussed elliptically if at all; statisticians typically employ circumlocutions such as ‘independent risk factor’ or ‘explanatory variable’ to avoid causal language.”

Id. at 35.

“Risk factor: The risk factor is the exposure of interest in an epidemiological study and often has the connotation that the outcome event is harmful or in some way undesirable.”

Id. at 317.   This last definition is helpful in illustrating a balanced, fair definition that does not conflate risk factor with causation.

*******************

Lemuel A. Moyé is an epidemiologist who testified in pharmaceutical litigation, mostly for plaintiffs.  His text, Statistical Reasoning in Medicine:  The Intuitive P-Value Primer, is in places a helpful source of guidance on key concepts.  Moyé puts no stock in something’s being a risk factor unless studies show a causal relationship, established through a proper analysis.  Accordingly, he uses “risk factor” to signify simply an exposure of interest:

“4.2.1 Association versus Causation

An associative relationship between a risk factor and a disease is one in which the two appear in the same patient through mere coincidence. The occurrence of the risk factor does not engender the appearance of the disease.

Causal relationships on the other hand are much stronger. A relationship is causal if the presence of the risk factor in an individual generates the disease. The causative risk factor excites the production of the disease. This causal relationship is tight, containing an embedded directionality in the relationship, i.e., (1) the disease is absent in the patient, (2) the risk factor is introduced, and (3) the risk factor’s presence produces the disease.

The declaration that a relationship is causal has a deeper meaning than the mere statement that a risk factor and disease are associated. This deeper meaning and its implications for healthcare require that the demonstration of a causal relationship rise to a higher standard than just the casual observation of the risk factor and disease’s joint occurrence.

Often limited by logistics and the constraints imposed by ethical research, the epidemiologist commonly cannot carry out experiments that identify the true nature of the risk factor–disease relationship. They have therefore become experts in observational studies. Through skillful use of observational research methods and logical thought, epidemiologists assess the strength of the links between risk factors and disease.”

Lemuel A. Moyé, Statistical Reasoning in Medicine:  The Intuitive P-Value Primer 92 (2d ed. 2006).

***************************

In A Dictionary of Epidemiology, which is put out by the International Epidemiological Association, a range of meanings is acknowledged, although the range is weighted toward causality:

“RISK FACTOR (Syn: risk indicator)

1. An aspect of personal behavior or lifestyle, an environmental exposure, or an inborn or inherited characteristic that, on the basis of scientific evidence, is known to be associated with meaningful health-related condition(s). In the twentieth century multiple cause era, a synonymous with determinant acting at the individual level.

2. An attribute or exposure that is associated with an increased probability of a specified outcome, such as the occurrence of a disease. Not necessarily a causal factor: it may be a risk marker.

3. A determinant that can be modified by intervention, thereby reducing the probability of occurrence of disease or other outcomes. It may be referred to as a modifiable risk factor, and logically must be a cause of the disease.

The term risk factor became popular after its frequent use by T. R. Dawber and others in papers from the Framingham study. The pursuit of risk factors has motivated the search for causes of chronic disease over the past half-century. Ambiguities in risk and in risk-related concepts, uncertainties inherent to the concept, and different legitimate meanings across cultures (even if within the same society) must be kept in mind in order to prevent medicalization of life and iatrogenesis.”

Miquel Porta, Sander Greenland, John M. Last, eds., A Dictionary of Epidemiology 218-19 (5th ed. 2008).  We might add that the uncertainties inherent in risk concepts should be kept in mind to prevent overcompensation for outcomes not shown to be caused by alleged tortogens.

***************

One introductory text uses “risk factor” as a term to describe the independent variable, while acknowledging that the variable does not become a risk factor until after the study shows an association between the factor and the outcome of interest:

“A case-control study is one in which the investigator seeks to establish an association between the presence of a characteristic (a risk factor).”

Sylvia Wassertheil-Smoller, Biostatistics and Epidemiology: A Primer for Health and Biomedical Professionals 104 (3d ed. 2004).  See also id. at 198 (“Here, also, epidemiology plays a central role in identifying risk factors, such as smoking for lung cancer”).  Although it should be clear that much more must happen in order to show a risk factor is causally associated with an outcome, such as lung cancer, it would be helpful to spell this out.  Some texts simply characterize risk factors as associations, not necessarily causal in nature.  Another basic text provides:

“Analytical studies examine an association, i.e. the relationship between a risk factor and a disease in detail and conduct a statistical test of the corresponding hypothesis … .”

Wolfgang Ahrens & Iris Pigeot, eds., Handbook of Epidemiology 18 (2005).  See also id. at 111 (Table describing the reasoning in a case-control study:    “Increased prevalence of risk factor among diseased may indicate a causal relationship.”)(emphasis added).

These texts, both legal and scientific, indicate a wide range of usage and ambiguity for “risk factor.”  There is a tremendous potential for the unscrupulous expert witness, or the uneducated lawyer, to take advantage of this linguistic latitude.  Courts and counsel must be sensitive to the ambiguity and imprecision in usages of “risk factor,” and the mischief that may result.  The Reference Manual on Scientific Evidence needs to sharpen and update its coverage of this and other statistical and epidemiologic issues.

When Is Risk Really Risk?

February 14th, 2012

The term “risk” has a fairly precise meaning in scientific parlance.  The following is a typical definition:

RISK The probability that an event will occur, e.g., that an individual will become ill or die within a stated period of time or by a certain age. Also, a nontechnical term encompassing a variety of measures of the probability of a (generally) unfavorable outcome. See also probability.

Miquel Porta, ed., A Dictionary of Epidemiology 212-18 (5th ed. 2008)(sponsored by the Internat’l Epidemiological Ass’n).

In other words, a risk is an ex ante cause.  The probability is not a qualification about whether there is a causal relationship, but rather whether any person at risk will develop the outcome of interest.  Such is the nature of stochastic risks.

Regulatory agencies often use the term “risk” metaphorically, as a fiction to justify precautionary regulations.  Although there may be nothing wrong with such precautionary initiatives, regulators often imply a real threat of harm from what can only be a hypothetical harm.  Why?  If for no other reason, regulators operate with a “wish bias” in favor of the reality of the risk they wish to avert if risk it should be.  We can certainly imagine the cognitive slippage that results from the need to motivate the regulated actors to comply with regulations, and at times, to prosecute the noncompliant.

Plaintiffs’ counsel in personal injury and class action litigation have none of the regulators’ socially useful motives for engaging in distortions of the meaning of the word “risk.”  In the context of civil litigation, plaintiffs’ counsel use the term “risk” in a sense borrowed from the Humpty-Dumpty playbook:

“When I use a word,” Humpty Dumpty said, in rather a scornful tone, “it means just what I choose it to mean—neither more nor less.”
“The question is,” said Alice, “whether you can make words mean so many different things.”
“The question is,” said Humpty Dumpty, “which is to be master — that’s all.”

Lewis Carroll, Through the Looking-Glass 72 (Raleigh 1872).

Undeniably, the word mangling and distortion have had some success with weak-minded judges, but Humpty-Dumpty linguistics had a fall recently in the Third Circuit.  Others have written about it, but I am only just getting around to reading the analytically precise and insightful decision in Gates v. Rohm and Haas Co., 655 F.3d 255 (3d Cir. 2011).  See Sean Wajert, “Court of Appeals Rejects Medical Monitoring Class Action” (Aug. 31, 2011); Carl A. Solano, “Appellate Court Consensus on Medical Monitoring Class Actions Solidifies” (Sept. 12, 2011).

Gates was an attempted class action, in which the district court denied plaintiffs’ motion for certification of a medical monitoring and property damage class.  265 F.R.D. 208 (E.D.Pa. 2010)(Pratter, J.).  Plaintiffs contended that they were exposed to varying amounts of vinyl chloride in air, and perhaps in water at levels too low to detect. Gates, 655 F.3d at 258-59.   The class’s request for medical monitoring foundered because plaintiffs were unable to prove that they were all exposed to a level of vinyl chloride that created a significant risk of serious latent disease for all class members. Id. at 267-68.

With no scientific evidence in hand, the plaintiffs tried to maintain that they were “at risk” on the basis of EPA regulations, which set a very low, precautionary threshold, but the district and circuit courts rebuffed this use of regulatory “risk” language:

The court identified two problems with the proposed evidence. First, it rejected the plaintiffs’ proposed threshold—exposure above 0.07 µg/m3, developed as a regulatory threshold by the EPA for mixed populations of adults and children—as a proper standard for determining liability under tort law. Second, the court correctly noted, even if the 0.07 µg/m3 standard were a correct measurement of the aggregate threshold, it would not be the threshold for each class member who may be more or less susceptible to diseases from exposure to vinyl chloride. Although the positions of regulatory policymakers are relevant, their risk assessments are not necessarily conclusive in determining what risk exposure presents to specified individuals. See Federal Judicial Center, Reference Manual on Scientific Evidence 413 (2d ed. 2000) (“While risk assessment information about a chemical can be somewhat useful in a toxic tort case, at least in terms of setting reasonable boundaries as to the likelihood of causation, the impetus for the development of risk assessment has been the regulatory process, which has different goals.”); id. at 423 (“Particularly problematic are generalizations made in personal injury litigation from regulatory positions…. [I]f regulatory standards are discussed in toxic tort cases to provide a reference point for assessing exposure levels, it must be recognized that there is a great deal of variability in the extent of evidence required to support different regulations.”).

Thus, plaintiffs could not carry their burden of proof for a class of specific persons simply by citing regulatory standards for the population as a whole. Cf. Wright v. Willamette Indus., Inc., 91 F.3d 1105, 1107 (8th Cir.1996) (“Whatever may be the considerations that ought to guide a legislature in its determination of what the general good requires, courts and juries, in deciding cases, traditionally make more particularized inquiries into matters of cause and effect.”).

Plaintiffs have failed to propose a method of proving the proper point where exposure to vinyl chloride presents a significant risk of developing a serious latent disease for each class member.

Plaintiffs propose a single concentration without accounting for the age of the class member being exposed, the length of exposure, other individual factors such as medical history, or showing the exposure was so toxic that such individual factors are irrelevant. The court did not abuse its discretion in concluding individual issues on this point make trial as a class unfeasible, defeating cohesion.

Id. at 268.  For class actions, the inability to invoke a low threshold of “permissible” exposure may be the death knell of medical monitoring and personal injury class actions.  The implications of the Gates court’s treatment of “regulatory risk” are, however, more far reaching.  Sometimes risk is not really risk at all.  The ambiguity of the risk in risk assessment has confused judges from the lowest magistrate up to Supreme Court justices.  It is time to disambiguate.  See General Electric v. Joiner, 522 U.S. 136, 153-54 (1997) (Stevens, J., dissenting in part) (erroneously assuming that plaintiffs’ expert witness was justified in relying upon a weight-of-evidence methodology because such methodology is often used in risk assessment).

Two Articles of Interest in JAMA – Nocebo Effects; Medical Screening

February 12th, 2012

Two articles in this week’s Journal of the American Medical Association (JAMA) are of interest to lawyers who litigate, or counsel about, health effects.

One article deals with the nocebo effect, which is the dark side of the placebo effect.  Placebos can induce beneficial outcomes because of the expectation of useful therapy; nocebos can induce harmful outcomes because of the expectation of injury. The viewpoint article in JAMA points out that nocebo effects, like placebo effects, result from the “psychosocial context or therapeutic environment” affecting a patient’s perception of his state of health or illness.  Luana Colloca, MD, PhD, and Damien Finniss, MSc Med., “Nocebo Effects, Patient-Clinician Communication, and Therapeutic Outcomes,” 307 J. Am. Med. Ass’n 567, 567 (2012).

The authors discuss how clinicians can inadvertently prejudice health outcomes by how they frame outcome information to patients.  Importantly, Colloca and Finniss also note that the negative expectations created by the nocebo communication can take place in the process of obtaining informed consent.

The litigation significance is substantial because the creation of negative expectations is not the exclusive domain of clinicians.  Plaintiffs’ counsel, support and advocacy groups, and expert witnesses, even when well meaning, can similarly create negative expectations for health outcomes.  These actors often enjoy undeserved authority among their audience of litigants or claimants.  The extremely high rate of psychogenic illness found in many litigations is the result.  The harmful communications, however, are not limited to plaintiffs’ lawyers and their auxiliaries.  As Colloca and Finniss point out, nocebo effects can be induced by well-meaning warnings and disclosure of information from healthcare providers to patients.  Id. at 567.  The potential to induce harms in this way has an obvious consequence for the tort system:  more warnings are not always beneficial.  Indeed, warnings themselves can bring about harm.  This realization should temper courts’ enthusiasms for the view that more warnings are always better.  Warnings about adverse health outcomes should be based upon good scientific bases.

*************

The other article from this week’s issue of JAMA addresses the harms of screening.  Steven H. Woolf, MD, MPH, and Russell Harris, MD, MPH, “The Harms of Screening: New Attention to an Old Concern,” 307 J. Am. Med. Ass’n 565 (2012).  As I pointed out on these pages, screening for medical illnesses carries significant health risks to patients and ethical risks for the healthcare providers.  See “Ethics and Daubert: The Scylla and Charybdis of Medical Monitoring” (Feb. 1, 2012).  Bayes’ Theorem teaches us that even very high likelihood ratios for screening tests will yield true positive cases swamped by false positive cases when the baseline prevalence is low.  See Jonathan Deeks and Douglas Altman, “Diagnostic tests 4: likelihood ratios,” 329 Brit. Med. J. 168 (2004) (providing a useful nomogram to illustrate how even highly accurate tests, with high likelihood ratios, will produce more false than true positive cases when the baseline prevalence of disease is low).
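The arithmetic behind the Bayesian point is easy to check.  The following is a minimal sketch in Python, using hypothetical numbers that I have chosen only for illustration:  a test with 95% sensitivity and 95% specificity (a positive likelihood ratio of 19), applied to a population in which 1 person in 1,000 has the disease.  The function names are my own and are not drawn from Deeks and Altman.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Probability of disease given a positive test, by Bayes' theorem."""
    true_pos = prevalence * sensitivity
    false_pos = (1.0 - prevalence) * (1.0 - specificity)
    return true_pos / (true_pos + false_pos)


def post_test_probability(prevalence, likelihood_ratio):
    """Same result via odds: post-test odds = pre-test odds x likelihood ratio."""
    pre_odds = prevalence / (1.0 - prevalence)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


# Hypothetical screening scenario, for illustration only
prevalence = 0.001
sensitivity = 0.95
specificity = 0.95
lr_positive = sensitivity / (1.0 - specificity)  # about 19

ppv = positive_predictive_value(prevalence, sensitivity, specificity)
ppv_from_lr = post_test_probability(prevalence, lr_positive)
false_per_true = (1.0 - ppv) / ppv

print(f"LR+ = {lr_positive:.0f}, PPV = {ppv:.1%}")
print(f"False positives per true positive: {false_per_true:.0f}")
```

On these assumed numbers, fewer than 2 in 100 positive results reflect actual disease; every true positive comes with roughly 50 false positives.  Change the prevalence to 0.2 and the picture reverses, which is the whole point about baseline rates.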

The viewpoint piece by Woolf and Harris emphasizes the potential iatrogenic harms from screening:

  • physical injury from the test itself (as in colonic perforations from colonoscopy);
  • cascade of further testing, with further risk of harm, both physical and emotional;
  • anxiety and emotional distress over abnormal results;
  • overdiagnosis; and
  • the overtreatment of conditions that are not substantial threats to patients’ health.

These issues should have an appropriately chilling effect on judicial enthusiasm for medical monitoring and surveillance claims.  Great care is required to fashion a screening plan for patients or claimants.  Of course, there are legal risks as well, as when plaintiffs’ counsel fail to obtain the necessary prescriptions or permits to conduct radiological screenings.  See Schachtman, “State Regulators Impose Sanction for Unlawful Silicosis Screenings,” 17(13) Wash. Leg. Fdtn. Legal Op. Ltr. (May 25, 2007).  Caveat litigator.

The opinions, statements, and asseverations expressed on Tortini are my own, or those of invited guests, and these writings do not necessarily represent the views of clients, friends, or family, even when supported by good and sufficient reason.