TORTINI

For your delectation and delight, desultory dicta on the law of delicts.

Learning to Embrace Flawed Evidence – The Avandia MDL’s Daubert Opinion

January 10th, 2011

If GlaxoSmithKline (GSK) did not have bad luck when it comes to its oral anti-diabetic medication Avandia, it would have no luck at all.

On January 4, 2011, the federal judge who oversees the Avandia multi-district litigation (MDL) in Philadelphia entered an order denying GSK’s motion to exclude the causation opinion testimony of plaintiffs’ expert witnesses.  In re Avandia Marketing, Sales Practices, and Products Liab. Litig., MDL 1871, Mem. Op. and Order (E.D. Pa. Jan. 3, 2011) (Rufe, J.) [cited as “Op.”].  The decision is available on the CBS Interactive Business Network news blog, BNET.

Based largely upon a meta-analysis of randomized clinical trials (RCTs) by Dr. Steven Nissen and Ms. Kathleen Wolski, plaintiffs’ witnesses opined that Avandia (rosiglitazone) causes heart attacks and strokes.  Because meta-analysis has received so little serious judicial attention in connection with Rule 702 or 703 motions, this opinion by the Hon. Cynthia Rufe deserves careful attention by all students of “Daubert” law.  Unfortunately, that attention is likely to be critical — Judge Rufe’s opinion fails to engage the law and facts of the case, while committing serious mistakes on both fronts.

The Law

The reader will know that things are not going well for a sound legal analysis when the trial court begins by misstating the controlling law for decision:

“Under the Third Circuit framework, the focus of the Court’s inquiry must be on the experts’ methods, not their conclusions. Therefore, the fact that Plaintiffs’ experts and defendants’ experts reach different conclusions does not factor into the Court’s assessment of the reliability of their methods.”

Op. at 2 (internal citation omitted).

and

“As noted, the experts are not required to use the best possible methods, but rather are required to use scientifically reliable methods.”

Op. at 26.

Although the United States Supreme Court attempted, in Daubert, to draw a distinction between the reliability of an expert witness’s methodology and the reliability of his conclusion, the Court soon realized that the distinction was flawed.  If an expert witness’s proffered testimony is discordant with regulatory and scientific conclusions, a reasonable, disinterested scientist would be led to question the reliability of the testimony’s methodology and of its inferences from facts and data to its conclusion.  The Supreme Court recognized this connection in General Electric v. Joiner, and the connection between methodology and conclusions was ultimately incorporated into a statute, the revised Federal Rule of Evidence 702:

“[I]f scientific, technical or other specialized knowledge will assist the trier of fact to understand the evidence or to determine a fact in issue, a witness qualified as an expert by knowledge, skill, experience, training or education, may testify thereto in the form of an opinion or otherwise, if

  1. the testimony is based upon sufficient facts or data,
  2. the testimony is the product of reliable principles and methods; and
  3. the witness has applied the principles and methods reliably to the facts.”

The Avandia MDL court thus ignored the clear mandate of a statute, Rule 702(1), and applied an unspecified “Third Circuit” framework, which is legally invalid to the extent it departs from the statute.

The Avandia court’s ruling, however, goes beyond this clear error in applying the wrong law.  Judge Rufe notes that:

“The experts must use good grounds to reach their conclusions, but not necessarily the best grounds or unflawed methods.”

Op. at 2-3 (internal citations omitted).

Here the trial court’s double negative is confusing.  The court clearly suggests that plaintiffs’ experts must use “good grounds,” but that their methods can be flawed and still survive challenge.  We can certainly hope that the trial court did not intend to depart so far from the statute, scientific method, and common sense, but the court’s own language suggests that it abused its discretion in applying a clearly incorrect standard.

Misstatements of Fact

The apparent errors of the Avandia decision transcend mistaken legal standards, and go to key facts of the case.  Some errors perhaps show inadvertence or inattention, as when the court states that the RECORD trial, an RCT conducted by GSK, set out “specifically to compare the cardiovascular safety of Avandia to that of Actos (a competitor medication in the same class).”  Op. at 4.  In fact, Actos (pioglitazone) was not involved in the RECORD trial, which involved Avandia along with two other oral anti-diabetic medications, metformin and sulfonylurea.

Erroneous Reliance upon p-values to the Exclusion of Confidence Intervals

Other misstatements of fact, however, suggest that the trial court did not understand the scientific evidence in the case.  By way of example, the trial court erroneously over-emphasized p-values, and ignored the important interpretative value of the corresponding confidence intervals.  We are told that “[t]he NISSEN meta-analysis combined 42 clinical trials, including the RECORD trial and other RCTs, and found that Avandia increased the risk of myocardial infarction by 43%, a statistically significant result (p = .031).”  Op. at 5.  Ignoring for the moment that the cited meta-analysis did not include the RECORD RCT, the Court should have reported the p-value along with the corresponding two-sided 95% confidence interval:

“the odds ratio for myocardial infarction was 1.43 (95% confidence interval [CI], 1.03 to 1.98; P = 0.03).”

Steven E. Nissen, M.D., and Kathy Wolski, M.P.H., “Effect of Rosiglitazone on the Risk of Myocardial Infarction and Death from Cardiovascular Causes,” 356 New Engl. J. Med. 2457, 2457 (2007).
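The reported confidence interval and p-value are two views of the same summary estimate, and the interval is the more informative of the two.  As a minimal sketch (in Python, using only the figures quoted above; the arithmetic is the standard log-odds calculation, not anything taken from the opinion), one can recover the p-value from the published interval and see how close the interval’s lower bound comes to 1.0:

    from math import log
    from scipy.stats import norm

    # Published summary estimate from Nissen & Wolski (2007):
    # OR = 1.43, 95% CI 1.03 to 1.98, P = 0.03.
    or_hat, lo, hi = 1.43, 1.03, 1.98

    # On the log-odds scale, a two-sided 95% CI spans +/- 1.96 standard errors.
    se = (log(hi) - log(lo)) / (2 * 1.96)

    # Two-sided p-value for the null hypothesis that the true OR = 1.
    z = log(or_hat) / se
    p = 2 * norm.sf(abs(z))

    print(f"SE(log OR) = {se:.3f}, z = {z:.2f}, p = {p:.3f}")   # p comes out near 0.03

The interval’s lower bound of 1.03 barely excludes 1.0, which is precisely the fragility that a bare recital of “statistically significant (p = .031)” conceals.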

The Court repeats this error later in its opinion:

“In 2007, the New England Journal of Medicine published the NISSEN meta-analysis, which combined results from 42 double-blind RCTs and found that patients taking Avandia had a statistically significant 43% increase in myocardial ischemic events. NISSEN used all publicly available data from double-blind RCTs of Avandia in which cardiovascular disease events were recorded, thereby eliminating one major drawback of meta-analysis: the biased selection of studies.”

Op. at 17.  In this second passage, however, the Court introduced new factual errors.  The Court erred in suggesting that Nissen used all publicly available data.  There were, in fact, studies available to Nissen and to the public that met Nissen’s inclusion criteria, but that he failed to include in his meta-analysis.  Nissen’s meta-analysis was thus biased by its failure to conduct a complete, thorough review of the medical literature for qualifying RCTs.  Furthermore, contrary to the Court’s statement, Nissen included non-double-blinded RCTs, as his own published paper makes clear.

Erroneous Interpretation of p-values

The court erred in its interpretation of p-values:

 “The DREAM and ADOPT studies were designed to study the impact of Avandia on prediabetics and newly diagnosed diabetics. Even in these relatively low-risk groups, there was a trend towards an adverse outcome for Avandia users (e.g., in DREAM, the p-value was .08, which means that there is a 92% likelihood that the difference between the two groups was not the result of mere chance). “

Op. at 25 (internal citation omitted).  The p-value is, of course, the probability of observing results as large as, or larger than, those actually observed, given the truth of the null hypothesis that there is no difference between Avandia and its comparator medications.  The p-value does not permit a probabilistic assessment of the correctness of the null hypothesis; nor does it permit a straightforward probabilistic assessment of the correctness of the alternative hypothesis.

See Federal Judicial Center, Reference Manual on Scientific Evidence 122, 357 (2d ed. 2000).
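The point can be made concrete with a simple simulation (in Python; the data are wholly invented and have nothing to do with DREAM).  When two trial arms are drawn from the same population, so that the null hypothesis is true by construction, p-values at or below .08 still occur about 8% of the time; the p-value measures the probability of data at least this extreme under the null hypothesis, not the probability that the null hypothesis is false:

    import numpy as np
    from scipy import stats

    # Hypothetical simulation: both arms drawn from the SAME population,
    # so there is no true difference between "treatment" and "control."
    rng = np.random.default_rng(0)
    n_trials, n_per_arm = 20_000, 100
    a = rng.normal(0.0, 1.0, (n_trials, n_per_arm))
    b = rng.normal(0.0, 1.0, (n_trials, n_per_arm))
    p = stats.ttest_ind(a, b, axis=1).pvalue

    # Under a true null, p <= .08 turns up in about 8% of experiments.
    # A p-value of .08 therefore does not mean a "92% likelihood" that
    # the observed difference was not the result of chance.
    print((p <= 0.08).mean())   # roughly 0.08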

Hand Waving over Statin Use

The Court appeared to have been confused by plaintiffs’ rhetoric that statin use masked a real risk of heart attacks in the Avandia RCTs. 

“It is not clear whether statin use was allowed in the DREAM study.”

Op. at 25.  The problem is that the Court fails to point to any evidence that the use of statins differed between the Avandia and comparator arms of the RCTs.  Statins have been one of the great pharmaceutical success stories of the last 15 years, and it is reasonable to believe that today most diabetic patients (who often have high blood fats) would be taking statins.  At the time of the DREAM study, the prevalence of statin use would have been lower than it is today, but the Court mentioned no evidence that the use differed between the Avandia and other arms of the DREAM trial.

Errors in Interpreting RCTs by Intention to Treat Analyses

For unexplained reasons, the court was impressed by what it called a high dropout rate in one of the larger Avandia RCTs:

“The ADOPT study was marred by a very high dropout rate (more than 40% of the subjects did not complete the four year follow up) and the use of statins during the trial.”

Op. at 25.  Talk about being hoist with one’s own petard!  The high dropout rate in ADOPT resulted from the fact that this RCT was a long-term test of “glycemic control.”  Avandia did better with respect to durable glycemic control than two major, accepted medications, metformin and sulfonylurea, and thus the dropouts came mostly in the comparator arms, as patients not taking Avandia required more and stronger medications, or even injected insulin.  The study investigators were obligated to analyze their data in accord with “intention to treat” principles, and so patients removed from the trial for lack of glycemic control could no longer be counted with respect to any outcome of interest.  Avandia patients thus had longer follow-up time, and more opportunity to have events due to their underlying pathophysiology (diabetes and diabetes-related heart attacks).

Ignoring Defense Arguments

GSK may have hurt itself by electing not to call an expert witness at the Daubert hearing in this MDL.  Still, the following statement by the Court is hard to square with opening argument given at the hearing:

“GSK points out no specific flaws or limitations in the design or implementation of the NISSEN meta-analysis”

Op. at 6.  If true, then shame on GSK; but somehow this statement seems too incredible to be true.

Ignoring the Difference between Myocardial Ischemic Events and Myocardial Infarction (MI)

MI occurs when heart muscle dies as a result of a blockage in a blood vessel that supplies it with oxygenated blood.  An ischemic event was defined very broadly in GSK’s study:

“To minimize the possibility of missing events of interest, all events coded with broadly inclusive AE terms captured from investigator reports were reviewed. SAEs identified from the trials database included cardiac failure, angina pectoris, acute pulmonary edema, all cases of chest pain without a clear non-cardiac etiology and myocardial infarction/myocardial ischemia.”

Alexander Cobitz MD, PhD, et al., “A retrospective evaluation of congestive heart failure and myocardial ischemia events in 14 237 patients with type 2 diabetes mellitus enrolled in 42 short-term, double-blind, randomized clinical studies with rosiglitazone,” 17 Pharmacoepidem. & Drug Safety 769, 770 (2008).

In its pooled analysis, GSK was clearly erring on the side of safety in creating its composite end point, but the crucial point is that GSK included events that had nothing to do with MI.  The MDL court appears to have accepted uncritically the plaintiffs’ expert witnesses’ claim that the difference between myocardial ischemic events and MI is only a matter of degree.  The Court found “that the experts were able to draw reliable conclusions about myocardial infarction” from a meta-analysis about a different end point, “by virtue of their expertise and the available data.”  Op. at 10.  This is hand waving, or medical alchemy.

Uncritical Acceptance of Mechanistic Evidence Related to Increased Congestive Heart Failure (CHF) in Avandia Users

The court noted that plaintiffs’ expert witnesses relied upon a well-established relationship between Avandia and congestive heart failure (CHF).  Op. at 14.  True, true, but immaterial.  Avandia causes fluid retention, but so do other drugs in this class.  Actos causes fluid retention, and carries the same warning for CHF, but there is no evidence that Actos causes MI or stroke.  Although the Court’s desire to have a mechanism of causation is understandable, that desire cannot substitute for actual evidence.

Misuse of Power Analyses

The Avandia MDL Court mistakenly referred to inadequate statistical power in the context of interpreting heart attack data from the Avandia RCTs.

“If the sample size is too small to adequately assess whether the substance is associated with the outcome of interest, statisticians say that the study lacks the power necessary to test the hypothesis. Plaintiffs’ experts argue, among other points, that the RCTs upon which GSK relies are all underpowered to study cardiac risks.”

Op. at 5.

The Court might have helped itself by adverting to the Reference Manual on Scientific Evidence:

“Power is the chance that a statistical test will declare an effect when there is an effect to declare. This chance depends on the size of the effect and the size of the sample.”

Federal Judicial Center, Reference Manual on Scientific Evidence 125-26, 357 (2d ed. 2000) (internal citations omitted).  In other words, you cannot assess the power of a study unless you specify the size of the association posited by the alternative hypothesis, and the sample size, among other things.  It is true that most of the Avandia trials were not powered to detect heart attacks, but the concept of power requires the user to specify at least the alternative hypothesis against which the study’s power is being assessed.  Once the studies were completed, and the data became available, there was no longer any need or use for considerations of power; the statistical precision of the studies’ results was given by their confidence intervals.
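A rough sketch shows why power cannot be computed without positing an alternative (in Python, using the standard normal approximation; the 0.6% event rate, the posited 43% relative increase, and the sample sizes are hypothetical, chosen only to mimic a rare cardiac outcome):

    from math import sqrt
    from scipy.stats import norm

    def power_two_proportions(p0, rr, n_per_arm, alpha=0.05):
        """Approximate power of a two-sided, two-sample test of proportions."""
        p1 = p0 * rr                                         # event rate posited under H1
        pbar = (p0 + p1) / 2
        se0 = sqrt(2 * pbar * (1 - pbar) / n_per_arm)        # SE under the null
        se1 = sqrt((p0*(1 - p0) + p1*(1 - p1)) / n_per_arm)  # SE under the alternative
        z_crit = norm.ppf(1 - alpha / 2)
        return norm.sf((z_crit * se0 - (p1 - p0)) / se1)

    # No posited alternative (rr), no power calculation.
    for n in (1_000, 5_000, 20_000):
        print(n, round(power_two_proportions(p0=0.006, rr=1.43, n_per_arm=n), 2))

With a rare outcome, even a trial with several thousand patients per arm has little chance of declaring a 43% relative increase; but once a trial is done, it is the confidence interval around the observed result, not a power calculation, that tells us what the data can and cannot rule out.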

Incorrect Use of the Concept of Replication

The MDL court erred in accepting the plaintiffs’ expert witnesses’ bolstering of Nissen’s meta-analytic results by their claim that Nissen’s results had been “replicated”:

“[T]he NISSEN results have been replicated by other researchers. For example, the SINGH meta-analysis pooled data from four long-term clinical trials, and also found a statistically significant increase in the risk of myocardial infarction for patients taking Avandia. GSK and the FDA have also replicated the results of NISSEN through their own meta-analyses.”

Op. at 6 (internal citations omitted).

“The SINGH, GSK and FDA meta-analyses replicated the key findings of the NISSEN study.”

Op. at 17.

These statements mistakenly suggest that Nissen’s meta-analysis was able to generate a reliable conclusion that there was a statistically significant association between Avandia use and MI.  The Court’s insistence that Nissen was replicated does not become more true for having been stated twice.  Nissen’s meta-analysis was not an observational study in the usual sense.  His publication made very clear what studies were included (and not at all clear what studies were excluded), and what meta-analytic model he used.  Thus, it is trivially true that anyone could have reproduced his analysis, and indeed, several researchers did so.  See, e.g., George A. Diamond, MD, et al., “Uncertain Effects of Rosiglitazone on the Risk for Myocardial Infarction and Cardiovascular Death,” 147 Ann. Intern. Med. 578 (2007).

But Nissen’s results were not replicated by Singh, GSK, or the FDA, because these other meta-analyses used different statistical methods, different end points (in GSK’s analysis), different inclusion criteria, and different data.  Most important, GSK and the FDA could not reproduce the statistically significant finding for their summary estimates of the association between Avandia and heart attacks.

One definition of replication that the MDL court might have consulted makes clear that replication is a repeat of the same experiment to determine whether the same (or a consistent) result is obtained:

“REPLICATION — The execution of an experiment or survey more than once so as to confirm the findings, increase precision, and obtain a closer estimation of sampling error.  Exact replication should be distinguished from consistency of results on replication.  Exact replication is often possible in the physical sciences, but in the biological and behavioral sciences, to which epidemiology belongs, consistency of results on replication is often the best that can be attained. Consistency of results on replication is perhaps the most important criterion in judgments of causality.”

Miquel Porta, Sander Greenland, and John M. Last, eds., A Dictionary of Epidemiology 214 (5th ed. 2008).  The meta-analyses of Singh, GSK, and the FDA did not, and could not, replicate Nissen’s.  Singh’s meta-analysis obtained a result similar to Nissen’s, but the other meta-analyses, by GSK, the FDA, and Mannucci, failed to yield a statistically significant result for MI.  This is replication only in Wonderland.
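The difference between replicating a result and re-analyzing different data with different methods is easy to see in a generic fixed-effect pooling sketch (in Python; the study-level odds ratios are invented for illustration, and this inverse-variance approach is not the Peto method that Nissen used for his sparse-event data):

    import numpy as np

    def pool_fixed_effect(ors, los, his):
        """Inverse-variance fixed-effect pooling of odds ratios reported
        with two-sided 95% confidence intervals."""
        log_or = np.log(ors)
        se = (np.log(his) - np.log(los)) / (2 * 1.96)   # SE recovered from each CI
        w = 1 / se**2                                   # inverse-variance weights
        pooled = np.sum(w * log_or) / np.sum(w)
        pooled_se = np.sqrt(1 / np.sum(w))
        lo, hi = np.exp(pooled - 1.96 * pooled_se), np.exp(pooled + 1.96 * pooled_se)
        return np.exp(pooled), (lo, hi)

    # Hypothetical study-level results, for illustration only.
    ors = np.array([1.6, 1.2, 1.5, 0.9])
    los = np.array([0.9, 0.7, 0.8, 0.5])
    his = np.array([2.8, 2.1, 2.8, 1.6])

    print(pool_fixed_effect(ors, los, his))              # one set of inclusion criteria
    print(pool_fixed_effect(ors[:3], los[:3], his[:3]))  # drop one study: new estimate

Change the inclusion criteria, the end point, or the pooling model, and the summary estimate changes with them; that is a new analysis, not a replication of the old one.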

It is hard to escape the conclusion that the MDL court denied GSK intellectual due process of law.

Beecher-Monas and the Attempt to Eviscerate Daubert from Within

November 23rd, 2010

Part 2, of a Critique of Evaluating Scientific Evidence, by Erica Beecher-Monas (EBM)

Giving advice to trial and appellate judges on how they should review scientific evidence can be a tricky business.  Such advice must reliably capture the nature of scientific reasoning in several different fields, such as epidemiology and toxicology, and show how such reasoning can and should be incorporated within a framework of statutes, rules, and common law.  Erica Beecher-Monas’ book, Evaluating Scientific Evidence, fails to accomplish these goals.  What she does accomplish is the conflation of regulatory assumptions and the precautionary principle with the science of health effects in humans.

7.  “Empowering one type of information or one kind of study to the exclusion of another makes no scientific evidentiary sense.”  Id. at 59.

It is telling that Erica Beecher-Monas (EBM) does not mention either the systematic review or the technique of meta-analysis, which is based upon the systematic review.  Of course, these approaches, whether qualitative or quantitative, require a commitment to pre-specify a hierarchy of evidence, and inclusionary and exclusionary criteria for studies.  What EBM seems to hope to accomplish is the flattening of the hierarchy of evidence, and making all types of evidence comparable in probative value.  This is not science or scientific, but part of an agenda to turn Daubert into a standard of bare relevancy.  Systematic reviews do not literally exclude any “one kind” of study, but they recognize that not all study designs are equal.  The omission in EBM’s book speaks volumes.

8. “[T]he likelihood that someone whose health was adversely affected will have the courthouse doors slammed in his or her face,”  id. at 64, troubles EBM. 

EBM recognizes that inferences and scientific methodologies involve false positives and false negatives, but she appears disproportionately concerned by false negatives.  Of course, this solicitude begs the question whether we have reasonably good knowledge that the claimant really was adversely affected.  A similar solicitude for the defendant who has had the courthouse door slammed on its head, in cases in which its product caused no harm, is missing.  This imbalance leads EBM to excuse and defend gaps in plaintiffs’ evidentiary displays on scientific issues.

9.  “Gaps in scientific knowledge are inevitable, not fatal flaws.”  Id. at 51 (citing a work on risk assessment).

The author also seems to turn a blind eye to the size of gaps.  Some gaps are simply too big to be bridged by assumptions.  Scientists have to be honest about their assumptions, and temper their desire to reach conclusions.  Expert witnesses often lack the requisite scientific temper to remain agnostic; they take positions when they should rightfully press for the gaps to be filled.  Expert witnesses outrun their headlights, but EBM cites virtually no example of a gatekeeping decision with approval.

Excusing gaps in risk assessment may make some sense given that risk assessment is guided by the precautionary principle.  The proofs in a toxic tort case are not.  EBM’s assertion about the inevitability of “gaps” skirts the key question:  When are gaps too large to countenance, and to support a judgment?  The Joiner case made clear that when the gaps are supported only by the ipse dixit of an expert witness, courts should look hard to determine whether the conclusion is reasonably, reliably supported by the empirical evidence.  The alternative, which EBM seems to invite, is intellectual anarchy.

10.  “Extrapolation from rodent studies to human cancer causation is universally accepted as valid (at least by scientists) because ‘virtually all of the specific chemicals known to be carcinogenic in humans are also positive in rodent bioassays, and sometimes even at comparable dose and with similar organ specificity’.” Id. at 71 n.55 (quoting Bernard Weinstein, “Mitogenesis is only one factor in carcinogenesis,” 251 Science 387, 388 (1991)).

When it comes to urging the primacy and superiority of animal evidence, EBM’s brief is relentless and baseless.

Remarkably, in the sentence quoted above, EBM has committed the logical fallacy of affirming the consequent:  if all human carcinogens are rat carcinogens, then all rat carcinogens are human carcinogens.  This argument form is invalid; the converse does not follow from the original claim.  And it is the converse that provides the desired, putative validity for extrapolating from rodent studies to humans.  Not only does EBM commit a non sequitur, she quotes Dr. Weinstein’s article out of context, because his article makes quite clear that not all rat carcinogens are accepted causes of cancer in human beings.

11.  “Post-Daubert courts often exclude expert testimony in toxic tort cases simply because the underlying tests relate to animals rather than humans.”  Id. at 71 n.54.

Given EBM’s radical mission to “empower” animal evidence, we should not be too surprised that she is critical of Daubert decisions that have given lesser weight to animal evidence.  The above statement is another example of EBM’s over- and misstatement.  The cases cited, for instance the Hall decision by Judge Jones in the breast implant litigation, and the Texas Supreme Court’s decision in Havner, do not support the “simply because.”  Those cases involved complex evidentiary displays of animal, in vitro, chemical-analytical, and epidemiologic studies.  The Hall decision was based upon Rule 702, but it was followed by Judge Jack Weinstein, who, after conducting two weeks of hearings, entered summary judgment sua sponte against the plaintiffs (animal evidence and all).  Recently, Judge Weinstein characterized the expert witnesses who supported the plaintiffs’ claims as “charlatans.”  See Judge Jack B. Weinstein, “Preliminary Reflections on Administration of Complex Litigation,” Cardozo Law Review De Novo at 14, http://www.cardozolawreview.com/content/denovo/WEINSTEIN_2009_1.pdf (“[t]he breast implant litigation was largely based on a litigation fraud. …  Claims—supported by medical charlatans—that enormous damages to women’s systems resulted could not be supported.”) (emphasis added).

Given the widespread rejection of the junk science behind breast implant claims, by courts, scientists, court-appointed experts, and the Institute of Medicine, EBM’s insertion of “simply” in the sentence above simply speaks volumes about how she would evaluate the evidentiary display in Hall.  See also Evaluating Scientific Evidence at 81 n.99 (arguing that Hall was mistaken).  If the gatekeeping in the silicone breast implant litigation was mistaken, as EBM argues, it is difficult to imagine what slop would be kept out by a gatekeeper who chose to apply EBM’s “intellectual due process.”

12.  “Animal studies are more persuasive than epidemiology for demonstrating small increases of risk.”  Id. at 70.

EBM offers no support for this contention, and there is none unless one is concerned to demonstrate small risks for animals.  Even for the furry beasts themselves, the studies do not “demonstrate” (a mathematical concept) small increased risks at low doses comparable to the doses experienced by human beings. 

EBM’s urging of “scientifically justifiable default assumptions” turns into advocacy for regulatory pronouncements of the precautionary principle, which courts have consistently rejected as inapplicable to toxic tort litigation for personal injuries.

13.  “Nonthreshold effects, on the other hand, are characteristic of diseases (like some cancers) that are caused by genetic mutations.” Id. at 75.

EBM offers no support for this assertion, and she ignores the growing awareness that the dose-response curves for many substances are hormetic; that is, the substance often exerts a beneficial or therapeutic effect at low doses, but may be harmful at high doses.  Alcohol is a known human carcinogen, but at low doses, alcohol reduces cardiovascular mortality.  At moderate to high doses, alcohol causes female breast cancer and liver cancer.  Alcohol must, however, be consumed at sufficiently high, prolonged doses to cause permanent fibrotic and architectural changes in the liver (cirrhosis) before it increases the risk of liver cancer.  These counterexamples, and others, show that thresholds are often important features of the dose-response curves of carcinogens.

Similarly, EBM incorrectly argues that the default assumption of a linear dose-response pattern is reasonable because it is, according to her, widely accepted.  Id. at 74 n.65.  Her supporting citation is, however, to an EPA document on risk assessment, which has nothing to do with determinations of causality.  Risk assessments assume causality and attempt to place an upper bound on the magnitude of the hypothetical risk.  Again, EBM’s commitment to the precautionary principle and to regulatory approaches preempts scientific thinking.  If EBM had considered the actual and postulated mechanisms of carcinogenesis, even in sources she cites, she would have to acknowledge that the linear no-threshold model makes no sense, because it ignores the operation of multiple protective mechanisms that must be saturated and overwhelmed before carcinogenic exposures can actually induce clinically meaningful tumors in animals.  See, e.g., Bernard Weinstein, “Mitogenesis is only one factor in carcinogenesis,” 251 Science 387, 388 (1991) (mistakenly cited by EBM for the proposition that rodent carcinogens should be “assumed” to cause cancer in humans).
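The competing dose-response models are easy to state precisely.  In the toy sketch below (in Python; the slopes and the threshold are invented solely for illustration), the linear no-threshold model attributes excess risk to every dose, the threshold model attributes none until protective mechanisms are saturated, and the hormetic model shows benefit at low doses and harm only at high doses:

    import numpy as np

    dose = np.linspace(0, 10, 6)

    linear_nt = 0.02 * dose                        # excess risk from the first unit of dose
    threshold = 0.05 * np.clip(dose - 4, 0, None)  # no excess risk until defenses saturate
    hormetic  = 0.02 * dose * (dose - 3)           # J-shaped: benefit low, harm high

    for d, a, b, c in zip(dose, linear_nt, threshold, hormetic):
        print(f"dose {d:4.1f}:  LNT {a:+.2f}   threshold {b:+.2f}   hormetic {c:+.2f}")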

14.  “Under this assumption [of the EPA], demonstrating the development of lung cancer in mice would be admissible to show human causation in any organ.  Because we know so little about cancer causation, there is justification for this as a workable but questionable assumption with respect to cancer.”  Id. at 77.

Extrapolation across species, across organs, and across disparate doses!  No gap is too wide or too deep to be traversed by EBM’s gatekeepers.  In arguing that extrapolation is a routine part of EPA risk assessment, EBM ignores that such extrapolation is not a basis for reaching scientific conclusions about health effects in human beings.  Regulatory science is “mandating certainty” — the opposite side of David Michaels’ caricature of industry’s “manufacturing doubt.”

15.  “[T]he court in Hall was mistaken when it excluded the expert testimony because the studies relied on only showed that silicone could have caused the plaintiff’s diseases, not that it did.”  Id. at 81 n.99.

Admittedly, it is difficult to tell whether EBM is discussing general or specific causation in this sentence, but it certainly seems as if she is criticizing the Hall decision, by Judge Jones, because the expert witnesses for the plaintiff were unable to say that silicone did, in fact, cause Hall’s illness.  EBM appears to be diluting specific causation to a “might have had some effect” standard. 

Readers who have actually read the Hall decision, or who are familiar with the record in Hall, will know that one key expert witness for plaintiffs, an epidemiologist, Dr. David Goldsmith, conceded that he could not say that silicone more likely than not caused autoimmune disease.  A few weeks after testifying in Hall, Goldsmith changed his testimony.  In October 1996, in Judge Weinstein’s courtroom, based upon an abstract of a study that he had seen the night before testifying, Goldsmith asserted that he believed that silicone did cause autoimmune connective tissue disease, more likely than not.  Before Goldsmith left the stand, Judge Weinstein declared that he did not believe that Goldsmith’s testimony would be helpful to a jury.

So perhaps EBM is indeed claiming that testimony that purports to provide the causal conclusion need not be expressed to some degree of certainty other than possibility.  This interpretation is consistent with what appears to be EBM’s dilution of “intellectual due process” to permit virtually any testimony at all that has the slightest patina of scientific opinion.

16.  “The underlying reason that courts appear to founder in this area [toxic torts] is that causation – an essential element for liability – is highly uncertain, scientifically speaking, and courts do not deal well with this uncertainty.”  Id. at 57.

Regulation in the face of uncertainty makes sense as an application of the precautionary principle, but litigation requires expert witness opinion that rises to the level of “scientific knowledge.”  Rule 702.  EBM’s candid acknowledgment is the very reason that Daubert is an essential tool to strip out regulatory “science,” which may well support regulation against a potential, unproven hazard.  Regulations can be abrogated; judgments in litigation are forever.  The social goals and the evidentiary standards are different.

17.  “Causal inference is a matter of explanation.”  Id. at 43.

Here and elsewhere, EBM talks of causality as though it were only about explanations, when in fact, the notion of causal inference includes an element of prediction, as well.  EBM seems to downplay the predictive nature of scientific theories, perhaps because this is where theories founder and confront their error rate.  Inherent in any statement of causal inference is a prediction that if the factual antecedents are the same, the result will be the same.  Causation is more than a narrative of why the effect followed the cause.

EBM’s work feeds the illusion that courts can act as gatekeepers, wrapped in the appearance of “intellectual due process,” but at the end of the day find just about any opinion to be admissible.  I could give further examples of the faux pas, ipse dixit, and non sequitur in EBM’s Evaluating Scientific Evidence, but the reader will appreciate the overall point.  Her topic is important, but there are better places for judges and lawyers to seek guidance in this difficult area.  The Federal Judicial Center’s Reference Manual on Scientific Evidence, although not perfect, is at least free of the sustained ideological noise that afflicts EBM’s text.