TORTINI

For your delectation and delight, desultory dicta on the law of delicts.

Open Admissions for Expert Witnesses in Chantix Litigation

September 1st, 2012

Chantix is a medication that helps people stop smoking.  Smoking kills people, but make a licensed drug and the lawsuits will come.

Late last month, Judge Inge Prytz Johnson, the MDL trial judge in the Chantix litigation, filed an opinion that rejected Pfizer’s challenges to plaintiffs’ general causation expert witnesses.  Memorandum Opinion and Order, In re Chantix (Varenicline) Products Liability Litigation, MDL No. 2092, Case 2:09-cv-02039-IPJ Document 642 (N.D. Ala. Aug. 21, 2012) [hereafter cited as Chantix].

Plaintiffs claimed that Chantix causes depression and suicidality, sometimes severe enough to result in suicide, attempted or completed.  Chantix at 3-4.  Others have written about Judge Johnson’s decision.  See Lacayo, “Win Some, Lose Some: Recent Federal Court Rulings on Daubert Challenges to Plaintiffs’ Experts,” (Aug. 30, 2012).

The breadth and depth of error of the trial court’s analysis, or lack thereof, remains, however, to be explored.

 

STATISTICAL SIGNIFICANCE

The Chantix MDL court notes several times that the defendant “harped” on this or that issue; the reader might think the defendant was a music label rather than a pharmaceutical manufacturer.  One of the defendant’s chords that failed to resonate with the trial judge was the point that the plaintiffs’ expert witnesses relied upon statistically non-significant results.  Here is how the trial court reported the issue:

“While the defendant repeatedly harps on the importance of statistically significant data, the United States Supreme Court recently stated that ‘[a] lack of statistically significant data does not mean that medical experts have no reliable basis for inferring a causal link between a drug and adverse events …. medical experts rely on other evidence to establish an inference of causation.’ Matrixx Initiatives, Inc. v. Siracusano, 131 S.Ct. 1309, 1319 (2011).”

Chantix at 22.

Well, it was only a matter of time before the Supreme Court’s dictum would be put to this predictably erroneous interpretation.  See “The Matrixx Oversold” (April 4, 2011).

Matrixx involved a motion to dismiss the complaint, which the trial court granted, but the Ninth Circuit reversed.  No evidence was offered; nor was any ruling that evidence was unreliable or insufficient at issue. The Supreme Court affirmed the Circuit on the issue whether pleading statistical significance was necessary.  Matrixx Initiatives took this position in the hopes of avoiding the merits, and so the issue of causation was never before the Supreme Court.  A unanimous Supreme Court held that because FDA regulatory action does not require reliable evidence to support a causal conclusion, pleading materiality for a securities fraud suit does not require an allegation of causation, and thus does not require an allegation of statistically significant evidence. Everything that the Court said about statistical significance and causation was obiter dictum, and rather ill-considered dictum at that.

The Supreme Court thus wandered far beyond its holding to suggest that courts “frequently permit expert testimony on causation based on evidence other than statistical significance.” Matrixx Initiatives, Inc. v. Siracusano, 131 S.Ct. 1309, 1319 (2011) (citing Wells v. Ortho Pharm. Corp., 788 F.2d 741, 744-745 (11th Cir. 1986)).  But the Supreme Court’s citation to Wells, in Justice Sotomayor’s opinion, failed to support the point she was trying to make, or the decision that the trial court announced in Chantix.

Wells involved a claim of birth defects caused by the use of a spermicidal jelly contraceptive.  At least one study reported a statistically significant increase in detected birth defects over the expected rate.  Wells v. Ortho Pharmaceutical Corp., 615 F. Supp. 262 (N.D. Ga. 1985), aff’d, and rev’d in part on other grounds, 788 F.2d 741 (11th Cir.), cert. denied, 479 U.S. 950 (1986).  Wells is not an example of a case in which an expert witness opined about causation in the absence of a scientific study with statistical significance. Of course, finding statistical significance is just the beginning of assessing the causality of an association; the Wells case was and remains notorious for the expert witness’s poor assessment of all the determinants of scientific causation, including the validity of the studies relied upon.

The Wells decision met with severe criticism in the 1980s, both for its failure to evaluate the entire evidentiary display and for its failure to rule out bias and confounding in the studies relied upon by the plaintiff.  See, e.g., James L. Mills and Duane Alexander, “Teratogens and ‘Litogens’,” 315 New Engl. J. Med. 1234 (1986); Samuel R. Gross, “Expert Evidence,” 1991 Wis. L. Rev. 1113, 1121-24 (1991) (“Unfortunately, Judge Shoob’s decision is absolutely wrong. There is no scientifically credible evidence that Ortho-Gynol Contraceptive Jelly ever causes birth defects.”). See also Editorial, “Federal Judges v. Science,” N.Y. Times, December 27, 1986, at A22 (unsigned editorial);  David E. Bernstein, “Junk Science in the Courtroom,” Wall St. J. at A15 (Mar. 24, 1993) (pointing to Wells as a prominent example of how the federal judiciary had embarrassed the American judicial system with its careless, non-evidence based approach to scientific evidence). A few years later, another case in the same judicial district, against the same defendant, for the same product, resulted in the grant of summary judgment.  Smith v. Ortho Pharmaceutical Corp., 770 F. Supp. 1561 (N.D. Ga. 1991) (supposedly distinguishing Wells on the basis of more recent studies).

Neither the Justices in Matrixx Initiatives nor the trial court in Chantix can be excused for their poor scholarship, or their failure to note that Wells was overruled sub silentio by the Supreme Court’s own subsequent decisions in Daubert, Joiner, Kumho Tire, and Weisgram.  And if the weight of precedent did not kill the concept, then there is the simple matter of a supervening statute:  the 2000 amendment of Rule 702 of the Federal Rules of Evidence.

 

CONFUSING REGULATORY ACTION WITH CAUSAL ASSESSMENTS

The Supreme Court in Matrixx Initiatives was careful to distinguish causal judgments from regulatory action, but then went on in dictum to conflate the two.  The trial judge in Chantix showed no similar analytical care.  Judge Johnson held that the asserted absence of statistical significance was not a basis for excluding plaintiffs’ expert witnesses’ opinions on general causation.  Her Honor adverted to the Matrixx Initiatives dictum that the FDA “does not apply any single metric for determining when additional inquiry or action is necessary.” Matrixx, 131 S.Ct. at 1320.  Chantix at 22.  Judge Johnson noted

“that ‘[n]ot only does the FDA rely on a wide range of evidence of causation, it sometimes acts on the basis of evidence that suggests, but does not prove, causation…. the FDA may make regulatory decisions against drugs based on postmarketing evidence that gives rise to only a suspicion of causation’.  Matrixx, id. The court declines to hold the plaintiffs’ experts to a more exacting standard as the defendant requests.”

Chantix at 23.

In the trial court’s analysis, the difference between regulatory action and fact adjudication in civil litigation is obliterated.  This, however, is not the law of the United States, which has consistently acknowledged the difference. See, e.g., IUD v. API, 448 U.S. 607, 656 (1980) (“agency is free to use conservative assumptions in interpreting the data on the side of overprotection rather than underprotection.”).

As the Second Edition of the Reference Manual on Scientific Evidence (which was the out-dated edition cited by the court in Chantix) explains:

“[p]roof of risk and proof of causation entail somewhat different questions because risk assessment frequently calls for a cost-benefit analysis. The agency assessing risk may decide to bar a substance or product if the potential benefits are outweighed by the possibility of risks that are largely unquantifiable because of presently unknown contingencies. Consequently, risk assessors may pay heed to any evidence that points to a need for caution, rather than assess the likelihood that a causal relationship in a specific case is more likely than not.”

Margaret A. Berger, “The Supreme Court’s Trilogy on the Admissibility of Expert Testimony,” in Reference Manual On Scientific Evidence at 33 (Fed. Jud. Ctr. 2d. ed. 2000).

 

CONCLUSIONS VS. METHODOLOGY

Judge Johnson insisted that the “court’s focus was solely on the principles and methodology, not on the conclusions they generate.” Chantix at 9.  This insistence, however, is contrary to the established law of Rule 702.

Although the United States Supreme Court attempted, in Daubert, to draw a distinction between the reliability of an expert witness’s methodology and conclusion, that Court soon realized that the distinction was flawed. If an expert witness’s proffered testimony is discordant with regulatory and scientific conclusions, a reasonable, disinterested scientist would be led to question the reliability of the testimony’s methodology and its inferences from facts and data to its conclusion.  The Supreme Court recognized this connection in General Electric v. Joiner, and the connection between methodology and conclusions was ultimately incorporated into a statute, the revised Federal Rule of Evidence 702:

“[I]f scientific, technical or other specialized knowledge will assist the trier of fact to understand the evidence or to determine a fact in issue, a witness qualified as an expert by knowledge, skill, experience, training or education, may testify thereto in the form of an opinion or otherwise, if

  1. the testimony is based upon sufficient facts or data,
  2. the testimony is the product of reliable principles and methods, and
  3. the witness has applied the principles and methods reliably to the facts of the case.”

When the testimony is a conclusion about causation, Rule 702 directs an inquiry into whether that conclusion is based upon sufficient facts or data, and whether that conclusion is the product of reliable principles and methods.  The court’s focus should indeed be on the conclusion as well as the methodology claimed to generate the conclusion.  The Chantix MDL court thus ignored the clear mandate of a statute, Rule 702(1), and applied dictum from Daubert, superseded by Joiner and by an Act of Congress.  The ruling is thus legally invalid to the extent it departs from the statute.

 

EPIDEMIOLOGY

For obscure reasons, Judge Johnson sought to deprecate the need to rely upon epidemiologic studies, whether placebo-controlled clinical trials or observational studies.  See Chantix at 25 (citing Rider v. Sandoz Pharm. Corp., 295 F.3d 1194, 1198-99 (11th Cir. 2002)). Of course, the language cited in Rider came from a pre-Daubert, pre-Joiner case, Wells v. Ortho Pharm. Corp., 788 F.2d 741, 745 (11th Cir. 1986) (holding that “a cause-effect relationship need not be clearly established by animal or epidemiological studies”).  This dubious legal lineage cannot support the glib dismissal of the need for epidemiologic evidence.

 

WEIGHT OF THE EVIDENCE (WOE)

According to Judge Johnson, plaintiffs’ expert witness Shira Kramer considered all the evidence relevant to Chantix and neuropsychiatric side effects, in what Kramer described as a “weight of the evidence” analysis.  Chantix at 26.  In her report, Kramer had written that determinations about the weight of evidence are “subjective interpretations” based upon “various lines of scientific evidence.” Id. (citing and quoting Kramer’s report). Kramer also claimed that every scientist “brings a unique set of experiences, training and expertise …. Philosophical differences exist between experts…. Therefore, it is not surprising that differences of opinion exist among scientists. Such differences of opinion are not necessarily evidence of flawed scientific reasoning or methodology, but rather differences in judgment between scientists.” Id.

Without any support from scientific literature, or the Reference Manual on Scientific Evidence, Judge Johnson accepted Kramer’s explanation of a totally subjective, unprincipled approach as a scientific methodology.  Not surprisingly, Judge Johnson cited the First Circuit’s embrace of a similarly vacuous WOE analysis in Milward v. Acuity Specialty Products Group, Inc., 639 F.3d 11, 22 (1st Cir. 2011).  Chantix at 51.

 

CHERRY PICKING

Judge Johnson noted, contrary to her earlier suggestion that Shira Kramer had considered all the studies, that Kramer had excluded data from her analysis.  Kramer’s exclusions may have rested upon pre-specified exclusionary principles, or they may have been completely ad hoc, as was her WOE analysis, which lacked any weighting principles.  In its gatekeeping role, however, the trial court expressed complete indifference to Kramer’s selectivity in excluding data.  “Why Dr. Kramer chose to include or exclude data from specific clinical trials is a matter for cross-examination.”  Chantix at 27.  This indifference is an abdication of the court’s gatekeeping responsibility.

 

POWER

The trial court attempted to justify its willingness to mute defendant’s harping on statistical significance by adverting to the concept of statistical power:

“Oftentimes, epidemiological studies lack the statistical power needed for definitive conclusions, either because they are small or the suspected adverse effect is particularly rare. Id. [Michael D. Green et al., “Reference Guide on Epidemiology,” in Reference Manual on Scientific Evidence 333, 335 (Fed. Judicial Ctr. 2d ed. 2000)] … .”

Chantix at 29 n.16.

To be fair to the trial court, the Reference Manual invited this illegitimate use of statistical power because it, at times, omits the specification that statistical power requires not only a level of statistical significance to be attained, but also a specified alternative hypothesis to assess power.  See Power in the Courts — Part One; Power in the Courts — Part Two.  The trial court offered no alternative hypothesis against which any measure of power was to be assessed.

Judge Johnson did not report any power analyses, and she certainly did not report any quantification of power or lack thereof against some specific alternative hypothesis.  Judge Johnson’s invocation of power was just that – power used arbitrarily, without data, evidence, or reason.
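The dependence of power on a specified alternative hypothesis can be shown with a minimal sketch.  Nothing here comes from the Chantix record: the 1,000 patients per arm, the 1% baseline event rate, and the candidate relative risks are all hypothetical, and the calculation uses a simple normal approximation for a two-sided comparison of two proportions.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(p_control, relative_risk, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two proportions.

    Normal approximation only; every input is a hypothetical illustration.
    """
    p_treated = p_control * relative_risk
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Standard error of the difference in proportions under the alternative.
    se = sqrt(p_control * (1 - p_control) / n_per_arm
              + p_treated * (1 - p_treated) / n_per_arm)
    shift = abs(p_treated - p_control) / se
    # Probability that the test statistic lands in either rejection region.
    return (1 - NormalDist().cdf(z_crit - shift)) + NormalDist().cdf(-z_crit - shift)


# Same study, same significance level, different alternative hypotheses.
for rr in (1.2, 1.5, 2.0, 3.0):
    power = power_two_proportions(0.01, rr, 1000)
    print(f"alternative RR = {rr}: approximate power = {power:.2f}")
```

The same hypothetical study has a different power against each alternative, which is why a bare statement that studies “lack power” says nothing until the alternative is named.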

 

CONFIDENCE INTERVALS

As with the invocation of statistical power, the trial court also invoked the concept of confidence intervals to suggest that such intervals provide a more refined approach to assessing statistical significance:

“A study found to have ‘results that are unlikely to be the result of random error’ is ‘statistically significant’. Reference Guide on Epidemiology, supra, at 354. Statistical significance, however, does not indicate the strength of an association found in a study. Id. at 359. ‘A study may be statistically significant but may find only a very weak association; conversely, a study with small sample sizes may find a high relative risk but still not be statistically significant.’ Id. To reach a ‘more refined assessment of appropriate inferences about the association found in an epidemiologic study’, researchers rely on another statistical technique known as a ‘confidence interval’. Id. at 360.”

Chantix at 30 n.17.  True, true, but immaterial.  The trial court, again, never carries through with the direction given by the Reference Manual.  Not a single confidence interval is presented.  No confidence intervals are subjected to this more refined assessment.  Why have more refined assessments when even the cruder assessments are not done?
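For readers who want to see what the missing exercise might look like, here is a minimal sketch of a confidence interval for a relative risk.  The counts are invented for illustration and have nothing to do with any Chantix study; the interval is the common Wald-type interval computed on the log scale, which is one of several accepted methods.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def relative_risk_ci(events_exposed, n_exposed, events_unexposed, n_unexposed,
                     level=0.95):
    """Relative risk with a Wald-type confidence interval on the log scale."""
    rr = (events_exposed / n_exposed) / (events_unexposed / n_unexposed)
    se_log_rr = sqrt(1 / events_exposed - 1 / n_exposed
                     + 1 / events_unexposed - 1 / n_unexposed)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return rr, exp(log(rr) - z * se_log_rr), exp(log(rr) + z * se_log_rr)


# Two hypothetical studies with the same point estimate but different sizes.
small = relative_risk_ci(6, 200, 3, 200)
large = relative_risk_ci(600, 20000, 300, 20000)

for label, (rr, lower, upper) in (("small study", small), ("large study", large)):
    verdict = "includes" if lower <= 1.0 <= upper else "excludes"
    print(f"{label}: RR = {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f} ({verdict} 1.0)")
```

Both hypothetical studies report a doubling of risk, but only the larger one yields an interval that excludes 1.0.  The interval reflects sampling variability alone; it says nothing about bias or confounding.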

 

OPEN ADMISSIONS IN SCHOOL OF EXPERT WITNESSING

The trial court somehow had the notion that all it had to do was state that every disputed fact and opinion went to the weight, not the admissibility, and then pass the case to a presumably more scientifically literate jury.  To be sure, the court engaged in a good deal of hand waving, going through the motions of deciding contested issues.  Not only did Judge Johnson smash poor Pfizer’s harp, Her Honor unhinged the gate that federal judges are supposed to keep.  Chantix declares that it is now open admissions for expert witnesses testifying to causation in federal cases.  This is a judgment in search of an appeal.

Canadian Remedy for American Taliban

August 29th, 2012

A few years ago, Quebec introduced a very interesting religious education program for public schools.  The Province’s Ethics and Religious Culture (“ERC”) Program, which went into effect in 2008, requires that children learn facts about the many different religions practiced in Canada.  The intent and the content of the ERC Program were to maintain neutrality between faiths, and to help children understand the beliefs of others in the Province.

Two parents of school children sought to have their children removed from the education program on the ground that their children’s “freedom of religion” was infringed by their having to learn facts about other religions.  The challenge might seem peculiar because nothing in the ERC Program kept the children from practicing their own faith, or the faith of their parents thrust upon them; nor did the Program require them to practice any faith, or cult for that matter.

The briefs of the parents and of organized churches, however, made clear what the gravamen of the complaint was.  Being required to learn about other faiths (and cults) would undermine the parents’ claims that their faith was the “one true faith,” and would lead to the children’s rejection of their parents’ faith.  The school system, by opening children’s eyes to the existence of many different faiths, making competing claims to truth and understanding, would interfere with the parents’ “obligation” to indoctrinate the Catholic faith in their children by causing their children to question their faith.  Maybe more to the point, but unstated, the education program would not just cause children to question their faith, but rather it would allow the children to see that the existence of competing faiths undermined any claim to Truth in one.  All the faiths might take on an arbitrary and capricious appearance.

Now if the parents believed that their Catholic faith was somehow privileged and True, it would have been a relatively simple matter to teach their children the how and why of their own religious beliefs.  We would think that the children would be inoculated against the heretical views of the diverse religions practiced in Canada.  Perhaps the parents’ anxiety, and their resort to pleadings, reveals some insecurity about their faith’s ability to withstand critical scrutiny.  Better to put off the day of reckoning until the brainwashing of the children is complete.

On February 17, 2012 the Supreme Court of Canada upheld Quebec’s Ethics & Religious Culture Program, in S.L. v. Commission scolaire des Chênes, 2012 SCC 7.  The Court held that the parents, whose names are not revealed (due to shame?), and their children suffered no infringement of their freedom of religion.  Accepting that the parents were sincere in their professions of faith, the Court unanimously held that the ERC Program did not interfere with those beliefs.  Parents in Canada remain free to do their best to indoctrinate their children in parental religious beliefs, whether those beliefs be Protestant, Catholic, Muslim, Jewish, Jain, Scientology, Satanic, Astrological, or even Pastafarian.

Merely causing children to open their eyes and compare religions in a factual way is not an infringement of the Canadian Charter of Rights and Freedoms.  Learning about the diversity of faiths is not a restraint of the free exercise of religion.  The Supreme Court of Canada noted that the ERC Program maintained neutrality in presenting facts about religion and morals.

Refusing to accept that the ERC Program interfered with parental “obligations” to inculcate and indoctrinate their own faith was perhaps non-empirical.  The parents’ claim is not implausible, and it might well be true.  The Court’s holding ignored that neutrality was the LAST thing the litigious parents wanted in matters of religion.  The parents, S.L. and D.J., took their anonymous children out of public school, and placed them in Catholic schools, where they can have their children indoctrinated without scrutiny or appeal to law or reason.

Indeed, let’s hope that it is true that teaching facts about competing faiths, which cannot all be equally correct, might lead to some epistemic humility and even skepticism.  Surely that would be welcomed. We have something here to learn from our northern neighbor.  Teaching “anthropology of religion” in the United States might have great benefits to break the stranglehold of cults on our politics.  The Quebec ERC Program would be a step in moving from a faith-based to an evidence-based world.  American Taliban beware.

The Dow-Bears Debate the Decline of Daubert

August 10th, 2012

Last month, I posted a short screenplay about how judicial gatekeeping of expert witnesses has slackened recently.  See “Daubert Approaching the Age of Majority” (July 17, 2012).

Dr. David Schwartz, of Innovative Science Solutions, has adapted the screenplay to the cinematic screen, and directed a full-length feature movie, The Daubert Will Set Your Client Free, using text-to-talk technology. Dr. Schwartz is not only a first-rate scientist, but he is also an aspiring film maker and artist.

OK; full-length is only a little more than 90 seconds, but you may still enjoy our movie-making debut.  And it is coming to a YouTube screen near you, now.

Eighth Circuit Holds That Increased Risk Is Not Cause

August 4th, 2012

The South Dakota legislature took it upon itself to specify the “risks” to be included in the informed consent required by state law for an abortion procedure:

(1) A statement in writing providing the following information:
* * *
(e) A description of all known medical risks of the procedure and statistically significant risk factors to which the pregnant woman would be subjected, including:
(i) Depression and related psychological distress;
(ii) Increased risk of suicide ideation and suicide;
* * *

S.D.C.L. § 34-23A-10.1(1)(e)(i)(ii).  Planned Parenthood challenged the law on constitutional grounds, and the district court granted a preliminary injunction against the South Dakota statute, which a panel of the Eighth Circuit affirmed, only to have that Circuit en banc reverse and remand the case for further proceedings.  Planned Parenthood Minn. v. Rounds, 530 F.3d 724 (8th Cir. 2008) (en banc).

On remand, the parties filed cross-motions for summary judgment.  The district court held that the so-called suicide advisory was unconstitutional.  On the second appeal to the Eighth Circuit, a divided panel affirmed the trial court’s holding on the suicide advisory. 653 F.3d 662 (8th Cir. 2011).  The Circuit, however, again granted rehearing en banc, and reversed the summary judgment for Planned Parenthood on the advisory.  Planned Parenthood Minnesota v. Rounds, Slip op. July 24, 2012 (en banc) [Slip op.].

In support of the injunction, Planned Parenthood argued that the state’s mandatory suicide advisory violated women’s abortion rights and physicians’ free speech rights. The en banc court rejected this argument, holding that the required advisory was “truthful, non-misleading information,” which did not unduly burden abortion rights, even if it might cause women to forgo abortion.  See Planned Parenthood of Southeastern Pennsylvania v. Casey, 505 U.S. 833, 882-83 (1992).

Risk  ≠ Cause

Planned Parenthood’s success in the trial court turned on its identification of risk (or increased risk) with cause, and its expert witness evidence that causation had not been accepted in the medical literature. In other words, Planned Parenthood argued that the advisory required disclosure of a conclusive causal “link” between abortion and suicide or suicidal ideation.  See 650 F. Supp. 2d 972, 982 (D.S.D. 2009).  The en banc court, on the second appeal, sought to save the statute by rejecting Planned Parenthood’s reading.  The court parsed the statute to suggest that the term “increased risk” is more precise and limited than the umbrella term of “risk,” standing alone.  Slip op. at 6.  The statute does not define “increased risk,” which the en banc court noted had various meanings in medicine.  Id. at 7.

Reviewing the medical literature, the en banc court held that the term “increased risk” does not refer to causation but to a much more modest finding of “a relatively higher probability of an adverse outcome in one group compared to other groups—that is, to ‘relative risk’.”  Id.  The en banc majority seemed to embroil itself in some considerable semantic confusion.  On the one hand, the majority, in a rhetorical riff, proclaimed that:

“It would be nonsensical for those in the field to distinguish a relationship of ‘increased risk’ from one of causation if the term ‘risk’ itself was equivalent to causation.”

Id. at 9.  The majority’s nonsensical labeling is, well, … nonsensical.  There is a compelling difference between the assessment of risk and the assessment of causation.  Risk is an ex ante concept, applied before the effect has occurred. Assessment or attribution of causation takes place after the effect. Of course, there is a sense of risk or “increased risk,” which is epistemologically more modest, but that hardly makes the more rigorous use of risk, as an ex ante cause, nonsensical.

The majority, however, is not content to leave the matter alone.  Elsewhere, the en banc court contradicts itself, and endorses a view that risk = causation.  For instance, in citing to a civil action involving a claimed causal relationship between Bendectin and a birth defect, the Eighth Circuit reduces risk to cause.  See Slip op. at 26 n. 9 (citing Brock v. Merrell Dow Pharms., Inc., 874 F.2d 307, 312, modified on reh’g, 884 F.2d 166 (5th Cir. 1989)).  The en banc court’s “explanatory” parenthetical explains the depths of its confusion:

“explaining that if studies establish, within an acceptable confidence interval, that those who use a pharmaceutical have a relative risk of greater than 1.0—that is, an increased risk—of an adverse outcome, those studies might be considered sufficient to support a jury verdict of liability on a failure-to-warn claim.”

This reading of Brock is wrong on two counts.  First, the Fifth Circuit, in Brock, and consistently since, has required the relative risk greater than 1.0 to be statistically significant at the conventional significance probability, as well as other indicia of causality, such as the Bradford Hill factors.  So Brock and its progeny did not confuse or conflate risk with cause, or dilute the meaning of cause such that it could be satisfied by a mere showing of an increased relative risk.

Second, Brock itself made a serious error in interpreting statistical significance and confidence intervals. The Bendectin studies at issue in Brock were not statistically significant, and their confidence intervals accordingly included the measure of no association (relative risk = one). Brock, however, in notoriously incorrect dicta claimed that the computation of confidence intervals took into account bias and confounding as well as sampling variability.  Brock v. Merrell Dow Pharmaceuticals, Inc., 874 F.2d 307, 311-12 (5th Cir. 1989) (“Fortunately, we do not have to resolve any of the above questions [as to bias and confounding], since the studies presented to us incorporate the possibility of these factors by the use of a confidence interval.”) (emphasis in original).  See, e.g., David H. Kaye, David E. Bernstein, and Jennifer L. Mnookin, The New Wigmore – A Treatise on Evidence:  Expert Evidence § 12.6.4, at 546 (2d ed. 2011); Michael O. Finkelstein, Basic Concepts of Probability and Statistics in the Law 86-87 (2009) (criticizing the over-interpretation of confidence intervals by the Brock court); Schachtman, “Confidence in Intervals and Diffidence in the Courts” (Mar. 4, 2012).

The en banc majority’s discussion of the studies of abortion and suicidality makes clear that the presence of bias and confounding in a study may prevent inference of causation, but does not undermine the conclusion that the studies show an increased risk.  A conclusion that the body of epidemiologic studies was inconclusive, and that it failed “to disentangle confounding factors and establish relative risks of abortion compared to its alternatives,” did not, therefore, render the suicide advisory about risk or increased risk unsupported, untruthful, or misleading.  Slip op. at 20.  Indeed, the en banc court provided an example, outside the context of abortion, to illustrate its meaning.  The en banc court’s use of the example of prolonged television viewing and “increased risk” of mortality suggests that the court took risk to mean any association, no matter how likely it was the result of bias or confounding.  See id. at 10 n. 3 (citing Anders Grøntved, et al., “Television Viewing and Risk of Type 2 Diabetes, Cardiovascular Disease, and All-Cause Mortality,” 305 J. Am. Med. Ass’n 2448 (2011)). The en banc majority held that the advisory would be misleading only if Planned Parenthood could show that the available epidemiologic studies conclusively ruled out causation.  Slip op. at 24-25.
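The gap between an association and a cause, on which the majority’s reading depends, can be sketched with a small worked example of confounding.  The counts below are entirely hypothetical, chosen only so that the exposure has no effect within either stratum of a confounding factor, yet the crude comparison shows a large “increased risk.”  The adjusted figure uses the Mantel-Haenszel summary relative risk, one standard way to stratify.

```python
def risk_ratio(events_exposed, n_exposed, events_unexposed, n_unexposed):
    """Ratio of the risk among the exposed to the risk among the unexposed."""
    return (events_exposed / n_exposed) / (events_unexposed / n_unexposed)


# Hypothetical strata of a confounder:
# (events_exposed, n_exposed, events_unexposed, n_unexposed)
strata = {
    "high baseline risk": (80, 800, 20, 200),
    "low baseline risk": (2, 200, 8, 800),
}

# Crude relative risk: collapse the strata and ignore the confounder.
crude = risk_ratio(*(sum(column) for column in zip(*strata.values())))

# Relative risk within each stratum of the confounder.
per_stratum = {name: risk_ratio(*counts) for name, counts in strata.items()}

# Mantel-Haenszel summary relative risk, adjusted for the stratifying factor.
numerator = sum(a * n0 / (n1 + n0) for a, n1, c, n0 in strata.values())
denominator = sum(c * n1 / (n1 + n0) for a, n1, c, n0 in strata.values())
adjusted = numerator / denominator

print(f"crude RR = {crude:.2f}")
for name, rr in per_stratum.items():
    print(f"{name}: RR = {rr:.2f}")
print(f"Mantel-Haenszel adjusted RR = {adjusted:.2f}")
```

In this invented data set, the exposed are concentrated in the high-risk stratum, so the crude relative risk is well above 1.0 even though the exposure changes nothing within either stratum.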

The Suicide Advisory Has Little Content Because Risk Is Not Cause

The majority decision clarified that the mandatory disclosure does not require a physician to inform a patient that abortion causes suicide or suicidal thoughts.  Slip op. at 25.  The en banc court took solace in its realization that physicians who review the available studies could provide a disclosure that captures the difference between risk, relative risk, and causation.  In other words, physicians are free to tell patients that this thing called increased risk is not concerning because the studies are highly confounded, and they do not show causation.  Id. at 25-26.  Indeed, it would be hard to imagine an ethical physician telling patients anything else.

Dissent

Four of the Eighth Circuit judges dissented, pointing to evidence that the South Dakota legislators intended to mandate a disclosure about causality.  Slip op. at 29.  Putting aside whether the truthfulness of the suicide advisory can be saved by reverting to a more modest interpretation of risk or of increased risk, the dissenters appear to have the better argument that the advisory is misleading.  The majority, however, by driving its wedge between causation and increased risk, has allowed physicians to explain that the advisory has little or no meaning.

NOCEBO

The nocebo effect is the dark side of the placebo effect.  As pointed out recently in the Journal of the American Medical Association, nocebos can induce harmful outcomes because of the expectation of injury from the “psychosocial context or therapeutic environment” affecting patients’ perception of their health.  Luana Colloca & Damien Finniss, “Nocebo Effects, Patient-Clinician Communication, and Therapeutic Outcomes,” 307 J. Am. Med. Ass’n 567, 567 (2012).  It is fairly well accepted that clinicians can inadvertently prejudice health outcomes by how they frame outcome information to patients.  Colloca and Finniss note that the negative expectations created by nocebo communication can take place in the process of obtaining informed consent.

Unfortunately, there is no discussion of nocebo effects in the Eighth Circuit’s decision. Planned Parenthood might well consider the role the nocebo effect has on the risk-benefit of an informed consent disclosure about a risk that really is not a risk, or is not a risk in the sense that it is a factor that will result in the putative effect, but rather only something that is under study and which cannot be separated from many confounding factors.  Surely, physicians in South Dakota will figure out how to give truthful, non-misleading disclosures that incorporate the mandatory suicide advisory, as well as the scientific evidence.

Tal Golan’s Preliminary History of Epidemiologic Evidence in U.S. Courts

July 10th, 2012

Tal Golan is an historian, with a special interest in the history of science in the 18th and 19th centuries, and in historical relationships among science, technology, and the law.  He now teaches history at the University of California, San Diego.  Golan’s book on the history of expert witnesses in the common law is an important starting place in understanding the evolution of the adversarial expert witness system in English and American courts.  Tal Golan, Laws of Man and Laws of Nature: A History of Scientific Expert Testimony (Harvard 2004).

Last year, Golan led a faculty seminar at the University of Haifa’s Law School on the history of epidemiologic evidence in 20th century American litigation.  A draft of Golan’s paper is available at the school’s website, and for those interested in the evolution of the American courts’ treatment of statistical and epidemiologic evidence, the paper is worth a look.  Tal Golan, “A preliminary history of epidemiological evidence in the twentieth-century American Courtroom” manuscript (2011) [Golan 2011].

There are problems, however, with Golan’s historical narrative.  Golan points to tobacco cases as the earliest forays into the use of epidemiologic evidence to prove health claims in court:

“I found only four toxic tort cases in the 1960s that involved epidemiological evidence – two tobacco and two vaccine cases. In the tobacco cases, the plaintiffs tried and failed to establish a causal relation between smoking and cancer via the testimony of epidemiological experts. In both cases the judges dismissed the epidemiological evidence and directed summary verdicts for the tobacco companies.38”

Golan 2011 at 11 & n. 38 (citing Pritchard v. Liggett & Myers Tobacco Co., 295 F.2d 292 (1961); Lartigue v. R.J. Reynolds Tobacco Co., 317 F.2d 19 (1963)).  Golan may be correct that some of the early tobacco cases were dismissive of statistical and epidemiologic evidence, but these citations do not support his assertion.  The Lartigue case resulted in a defense verdict after a jury trial.  The judgment for the defendant was affirmed on appeal, with specific reference to the plaintiff’s use of epidemiologic evidence.  Lartigue v. R.J. Reynolds Tobacco Co., 317 F.2d 19 (5th Cir. 1963) (“The plaintiff contends that the jury’s verdict was contrary to the manifest weight of the evidence. The record consists of twenty volumes, not to speak of exhibits, most of it devoted to medical opinion. The jury had the benefit of chemical studies, epidemiological studies, reports of animal experiments, pathological evidence, reports of clinical observations, and the testimony of renowned doctors. The plaintiff made a convincing case, in general, for the causal connection between tobacco and cancer and, in particular, for the causal connection between Lartigue’s smoking and his cancer. The defendants made a convincing case for the lack of any causal connection.”), cert. denied, 375 U.S. 865 (1963), and cert. denied, 379 U.S. 869 (1964).  Golan is thus wrong to suggest that the plaintiffs in Lartigue suffered a summary judgment or a directed verdict on their causation claims.

In Pritchard, the plaintiff had three trials in the course of litigating his tobacco-related claims.  See Pritchard v. Liggett & Myers Tobacco Co., 134 F. Supp. 829 (W.D. Pa. 1955), rev’d, 295 F.2d 292, 294 (3d Cir. 1961), 350 F.2d 479 (3d Cir. 1965), cert. denied, 382 U.S. 987 (1966), amended, 370 F.2d 95 (3d Cir. 1966), cert. denied, 386 U.S. 1009 (1967).  The Pritchard case ultimately turned on liability more than causation issues.  In both cases, Golan’s citations are abridged and incorrect.

Golan also wades into a discussion of statistical significance in which he misstates the meaning of the concept and he incorrectly describes how it was handled in at least one important case:

“Statistics provides such an assurance by calculating the probability of false association, and the epidemiological dogma demands it to be smaller than 5% (i.e, less than 1 in 20) for the association to be considered statistically significant.”

Golan 2011, at 18.  This statement is wrong.  Statistics do not provide a probability of the truth or falsity of the association.  The significance probability to which Golan refers measures the probability of data at least as extreme as those observed if the null hypothesis of no difference is correct.
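A small numerical sketch may make the distinction concrete.  The numbers are hypothetical (20 observations, 15 events, and a null hypothesis under which each observation has a 50% chance of being an event), and the function name is mine, not anything drawn from Golan’s paper.  The calculation assumes the null hypothesis is true and asks how often data at least this extreme would arise; it says nothing about the probability that the null hypothesis itself is true or false.

```python
from math import comb

def binomial_tail_p_value(n, k, p_null):
    """Return P(X >= k) for X ~ Binomial(n, p_null).

    This is a one-sided significance probability: the chance of data at
    least as extreme as k events, computed on the ASSUMPTION that the
    null hypothesis (event probability = p_null) is correct.
    """
    return sum(
        comb(n, i) * p_null**i * (1 - p_null) ** (n - i)
        for i in range(k, n + 1)
    )

# Hypothetical data: 15 events in 20 observations, null probability 0.5.
p_value = binomial_tail_p_value(n=20, k=15, p_null=0.5)
print(f"one-sided p-value = {p_value:.4f}")  # about 0.0207

# Note what is NOT computed here: the probability that the null
# hypothesis is true given the data. That would require a prior
# probability for the null, which frequentist testing does not supply.
```

The result, roughly 0.02, is a statement about the data given the null hypothesis, not a statement that the association has a 2% chance of being false.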

Having misunderstood and misstated the meaning of significance probability, Golan proceeds to make the classic misidentification of statistical significance probability with the probability of either the null hypothesis or the observed result.  Frequentist statistical testing cannot do this, and Golan’s error has no place in a history of these concepts other than to point out that courts have frequently made this mistake:

“The ‘statistical significance’ standard is far more demanding than the ‘preponderance of the evidence’ or ‘more likely than not’ standard used in civil law. It reflects the cautious attitude of scientists who wish to be 95% certain that their measurements are not spurious.

**********

Epidemiologists have considered the price well worth paying. So has criminal law, which emphasizes the minimization of false conviction, even at the price of overlooking true crime. But civil law does not share this concern.”

This narrative misstates what epidemiologists are doing in using significance probability and null hypothesis significance testing.  The confusion between epidemiologic statistical standards and burden of proof in criminal cases is a serious error.

Golan compares and contrasts the approaches of the trial judges in Allen v. United States, and in In re Agent Orange:

“Judge Weinstein, on the other hand, was far less concerned with the strictness of the epidemiology. A scholar of evidence, member of the Advisory Committee that drafted the Federal Rules of Evidence during the early 1970s, and a critic of the partisan deployment of science in the adversarial courtroom, Weinstein embraced the stringent 95% significance threshold as a ready-made admissibility test that could validate the veracity of the statistical evidence used in court. Thus, while he referred to epidemiological studies as ‘the best (if not the sole) available evidence in mass exposure cases,’ he nevertheless refused to accept them in evidence, unless they were statistically significant.64”

Golan 2011 at 19.  Weinstein is all that and more, but he never simplistically embraced statistical significance as a “ready-made admissibility test.”  Of course 95% is the coefficient of confidence, and the complement of an alpha of 0.05 (that is, 5%), but this alpha is not a particularly stringent threshold unless it is misunderstood as a burden of proof.  Contrary to Golan’s suggestion, Judge Weinstein was not being conservative or restrictive in his approach in In re Agent Orange.

Golan’s “preliminary” history is a good start, but it misses an important perspective.  After World War II, biological science, in the form of genetics, as well as epidemiology and other areas, grew to encompass stochastic processes as well as mechanistic processes.  To a large extent, in permitting judgments to be based upon statistical and epidemiologic evidence, the law was struggling to catch up with developments in science.   There is quite a bit of evidence that the law is still struggling.

Maryland Puts the Brakes on Each and Every Asbestos Exposure

July 3rd, 2012

Last week, the Maryland Court of Special Appeals reversed a plaintiffs’ verdict in Dixon v. Ford Motor Company, 2012 WL 2483315 (Md. App. June 29, 2012).  Jane Dixon died of pleural mesothelioma.  The plaintiffs, her survivors, claimed that her last illness and death were caused by her household improvement projects, which involved exposure to spackling/joint compound, and by her husband’s work with car parts and brake linings, which involved “take home” exposure on his clothes.  Id. at *1.

All the expert witnesses appeared to agree that mesothelioma is a “dose-response disease,” meaning that the more the exposure, the greater the likelihood that a person exposed will develop the disease. Id. at *2.  Plaintiffs’ expert witness, Dr. Laura Welch, testified that “every exposure to asbestos is a substantial contributing cause and so brake exposure would be a substantial cause even if [Mrs. Dixon] had other exposures.” On cross-examination, Dr. Welch elaborated upon her opinion to explain that any “discrete” exposure would be a contributing factor. Id.

Welch, of course, criticized the entire body of epidemiology of car mechanics and brake repairmen, which generally finds no increased risk of mesothelioma above overall population rates.  With respect to the take-home exposure, Welch had to acknowledge that there were no epidemiologic studies that investigated the risk of wives of brake mechanics.  Welch argued that the studies of car mechanics did not involve exposure to brake shoes as would have been experienced by brake repairmen, but her argument only served to make her attribution based upon take-home exposure to brake linings seem more preposterous.  Id. at *3.  The court recognized that Dr. Welch’s opinion may have been trivially true, but still unhelpful.  Each discrete exposure, even as attenuated as a take-home exposure from having repaired a single brake shoe may have “contributed,” but that opinion did not help the jury assess whether the contribution was substantial.

The court sidestepped the issue of fiber type, and threshold, and homed in on the agreement that mesothelioma risk showed a dose-response relationship with asbestos exposure.  (There is a sense that the court confused the dose-response concept to mean no threshold.)  The court credited hyperbolic risk assessment figures from the United States Environmental Protection Agency, which suggested that even ambient air exposure to asbestos leads to an increase in mesothelioma risk, but then realized that such claims made the legal need to characterize the risk from the defendant’s product all the more important before the jury could reasonably have concluded that any particular exposure experienced by Mrs. Dixon was “a substantial contributing factor.”  Id. at *5.

Having recognized that the best the plaintiffs could offer was a claim of increased risk, and perhaps crude quantification of the relative risks resulting from each product’s exposure, the court could not escape the conclusion that Dr. Welch’s recitation that “every exposure” is substantial was nothing more than an unscientific and empty assertion.  Welch’s claim was either tautologically true or empirical nonsense.  The court also recognized that risk substituting for causation opened the door to essentially probabilistic evidence:

“If risk is our measure of causation, and substantiality is a threshold for risk, then it follows—as intimated above—that ‘substantiality’ is essentially a burden of proof. Moreover, we can explicitly derive the probability of causation from the statistical measure known as ‘relative risk’ … .  For reasons we need not explore in detail, it is not prudent to set a singular minimum ‘relative risk’ value as a legal standard.12 But even if there were some legal threshold, Dr. Welch provided no information that could help the finder of fact to decide whether the elevated risk in this case was ‘substantial’.”

Id. at *7.  The court’s discussion here of “the elevated risk” seems wrong unless we understand it to mean the elevated risk attributable to the particular defendant’s product, in the context of an overall exposure that we accept as having been sufficient to cause the decedent’s mesothelioma.  Despite the lack of any quantification of relative risks in the case, overall or from particular products, and the court’s own admonition against setting a minimum relative risk as a legal standard, the court proceeded to discuss relative risks at length.  For instance, the court criticized Judge Kozinski’s opinion in Daubert, upon remand from the Supreme Court, for not going far enough:

“In other words, the Daubert court held that a plaintiff’s risk of injury must have at least doubled in order to hold that the defendant’s action was ‘more likely than not’ the actual cause of the plaintiff’s injury. The problem with this holding is that relative risk does not behave like a ‘binary’ hypothesis that can be deemed ‘true’ or ‘false’ with some degree of confidence; instead, the uncertainty inherent in any statistical measure means that relative risk does not resolve to a certain probability of specific causation. In order for a study of relative risk to truly fulfill the preponderance standard, it would have to result in 100% confidence that the relative risk exceeds two, which is a statistical impossibility. In short, the Daubert approach to relative risk fails to account for the twin statistical uncertainty inherent in any scientific estimation of causation.”

Id. at *7 n.12 (citing Daubert v. Merrell Dow Pharms., Inc., 43 F.3d 1311, 1320-21 (9th Cir.1995) (holding that a preponderance standard requires causation to be shown by probabilistic evidence of relative risk greater than two) (opinion on remand from Daubert v. Merrell Dow Pharms., 509 U.S. 579 (1993))).  The statistical impossibility derives from the asymptotic nature of the normal distribution, but the court failed to explain why a relative risk of two must be excluded as statistically implausible based upon the sample statistic.  After all, a relative risk greater than two, with a lower bound of a 95% confidence interval above one, based upon an unbiased sampling, suggests that our best evidence is that the population parameter is greater than two, as well.  The court, however, insisted upon stating the relative-risk-greater-than-two rule with a vengeance:

“All of this is not to say, however, that any and all attempts to establish a burden of proof of causation using relative risk will fail. Decisions can be – and in science or medicine are – premised on the lower limit of the relative risk ratio at a requisite confidence level. The point of this minor discussion is that one cannot apply the usual, singular ‘preponderance’ burden to the probability of causation when the only estimate of that probability is statistical relative risk. Instead, a statistical burden of proof of causation must consist of two interdependent parts: a requisite confidence of some minimum relative risk. As we explain in the body of our discussion, the flaws in Dr. Welch’s testimony mean we need not explore this issue any further.44”

Id. (emphasis in original).
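The arithmetic behind the relative-risk-greater-than-two rule is simple enough to sketch.  On the usual simplifying assumptions (an unbiased, unconfounded estimate, and exposure that adds cases rather than merely accelerating them), the fraction of cases among the exposed that is attributable to the exposure is (RR − 1)/RR, which crosses 50% only when the relative risk exceeds two.  The function name below is my own shorthand, and the formula is the standard attributable fraction among the exposed, not anything the Dixon court wrote out.

```python
def probability_of_causation(relative_risk):
    """Attributable fraction among the exposed: (RR - 1) / RR.

    Assumes the relative risk is an unbiased estimate and that exposure
    adds cases rather than merely accelerating them. Returns 0.0 when
    the relative risk shows no excess risk (RR <= 1).
    """
    if relative_risk <= 1.0:
        return 0.0
    return (relative_risk - 1.0) / relative_risk

for rr in (1.1, 1.5, 2.0, 3.0):
    pc = probability_of_causation(rr)
    verdict = "more likely than not" if pc > 0.5 else "not more likely than not"
    print(f"RR = {rr:.1f} -> attributable fraction = {pc:.3f} ({verdict})")
```

A relative risk of exactly two yields 0.5, the point of equipoise; a relative risk of 1.1 yields an attributable fraction of about 9%.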

And despite having declared the improvidence of addressing the relative risk issue, and then the lack of necessity for addressing the issue given Dr. Welch’s flawed testimony, the court nevertheless tackled the issue once more, a couple of pages later:

“It would be folly to require an expert to testify with absolute certainty that a plaintiff was exposed to a specific dose or suffered a specific risk. Dose and risk fall on a spectrum and are not ‘true or false’. As such, any scientific estimate of those values must be expressed as one or more possible intervals and, for each interval, a corresponding confidence that the true value is within that interval.”

Id. at *9 (emphasis in original; internal citations omitted).  The court captured the frequentist concept of the confidence interval as being defined operationally by repeated samplings and their random variability, but the confidence of the confidence interval means that the specified coefficient represents the percentage of all such intervals that include the “true” value, not the probability that a particular interval, calculated from a given sample, contains the true value.  The true value is either in or not in the interval generated from a single sample risk statistic.  Again, it is unclear why the court was weighing in on this aspect of probabilistic evidence when plaintiffs’ expert witness, Welch, offered no quantitation of the overall risk or of the risk attributable to a specific product exposure.
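The repeated-sampling meaning of the confidence coefficient can be illustrated with a short simulation.  Everything in it is hypothetical: a normally distributed quantity with a known true mean of 10 and known standard deviation of 2, samples of 50, and 2,000 repetitions.  The sketch counts how many of the resulting 95% intervals cover the true mean.

```python
import random
from math import sqrt

def coverage_of_intervals(true_mean=10.0, sigma=2.0, n=50, trials=2000, seed=1):
    """Fraction of 95% z-intervals (known sigma) that contain true_mean."""
    rng = random.Random(seed)           # fixed seed so the run is repeatable
    half_width = 1.96 * sigma / sqrt(n)  # 95% interval half-width
    hits = 0
    for _ in range(trials):
        sample_mean = sum(rng.gauss(true_mean, sigma) for _ in range(n)) / n
        # Each individual interval either contains the true mean or it does not.
        if sample_mean - half_width <= true_mean <= sample_mean + half_width:
            hits += 1
    return hits / trials

coverage = coverage_of_intervals()
print(f"share of intervals covering the true mean: {coverage:.3f}")
```

The share of intervals that cover the true value should come out close to 95%.  That figure describes the procedure over many repetitions; for any one interval, the true value is simply inside it or outside it.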

The court indulged the plaintiffs’ no-threshold fantasy but recognized that the risks of low-level asbestos exposure were low, and likely below a doubling of risk, an issue that the court stressed it wanted to avoid.  The court cited one study that suggested a risk (odds) ratio of 1.1 for exposures less than 0.5 fiber/ml-years.  See id. at *5 (citing Y. Iwatsubo et al., “Pleural mesothelioma: dose-response relation at low levels of asbestos exposure in a French population-based case-control study,” 148 Am. J. Epidemiol. 133 (1998) (estimating an odds ratio of 1.1 for exposures less than 0.5 fibers/ml-years)).  But the court, which tried to be precise elsewhere, appears to have lost its way in citing Iwatsubo here.  After all, how can a single odds ratio of 1.1 describe all exposures from 0 all the way up to 0.5 f/ml-years?  How can a single odds ratio describe all exposures in this range, regardless of fiber type, when chrysotile asbestos carries little to no risk for mesothelioma, and certainly orders of magnitude less risk than amphibole fibers such as amosite and crocidolite?  And if a low-level exposure has a risk ratio of 1.1, how can plaintiffs’ hired expert witness, Welch, even make the attribution of Dixon’s mesothelioma to the entirety of her exposure, let alone the speculative take-home chrysotile exposure involved from Ford’s brake linings?  Obviously, had the court posed these questions, it would have realized that “it is not possible” to permit Welch’s testimony at all.

The court further lost its way in addressing the exculpatory epidemiology put forward by the defense expert witnesses:

“Furthermore, the leading epidemiological report cited by Ford and its amici that specifically studied ‘brake mechanics’, P.A. Hessel et al., ‘Mesothelioma Among Brake Mechanics: An Expanded Analysis of a Case-control Study’, 24 Risk Analysis 547 (2004), does not at all dispel the notion that this population faced an increased risk of mesothelioma due to their industrial asbestos exposure. … When calculated at the 95% confidence level, Hessel et al. estimated that the odds ratio of mesothelioma could have been as low as 0.01 or as high as 4.71, implying a nearly quintupled risk of mesothelioma among the population of brake mechanics. 24 Risk Analysis at 550–51.”

Id. at *8.  Again, the court is fixated on the confidence interval, to the exclusion of the estimated magnitude of the association!  This time, after earlier shouting that it was the lower bound of the interval that matters scientifically, the court emphasizes the upper bound.  The court here has strayed far from the actual data, and any plausible interpretation of them:

“The odds ratio (OR) for employment in brake installation or repair was 0.71 (95% CI: 0.30-1.60) when controlled for insulation or shipbuilding. When a history of employment in any of the eight occupations with potential asbestos exposure was controlled, the OR was 0.82 (95% CI: 0.36-1.80). ORs did not increase with increasing duration of brake work. Exclusion of those with any of the eight exposures resulted in an OR of 0.62 (95% CI: 0.01-4.71) for occupational brake work.”

P.A. Hessel et al., “Mesothelioma Among Brake Mechanics: An Expanded Analysis of a Case-control Study,” 24 Risk Analysis 547, 547 (2004).  All of Dr. Hessel’s estimates of effect sizes were below 1.0, and he found no trend for duration of brake work.  Cherry picking out the upper bound of a single subgroup analysis for emphasis was unwarranted, and hardly did justice to the facts or the science.
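The three estimates quoted above can be lined up side by side in a few lines of code.  The odds ratios and interval bounds are taken from the quoted abstract; the ratio of upper to lower bound is only my own rough, illustrative gauge of imprecision, not a statistic reported by Hessel.

```python
# (description, odds ratio, lower 95% bound, upper 95% bound), as quoted above
hessel_estimates = [
    ("controlled for insulation or shipbuilding", 0.71, 0.30, 1.60),
    ("controlled for any of eight occupations", 0.82, 0.36, 1.80),
    ("excluding those with any of eight exposures", 0.62, 0.01, 4.71),
]

def summarize(estimates):
    """Flag whether each point estimate is below 1.0, whether its interval
    includes 1.0, and how wide the interval is (upper bound / lower bound)."""
    rows = []
    for label, odds_ratio, lower, upper in estimates:
        rows.append({
            "label": label,
            "point_below_one": odds_ratio < 1.0,
            "interval_includes_one": lower <= 1.0 <= upper,
            "upper_to_lower_ratio": upper / lower,
        })
    return rows

summary = summarize(hessel_estimates)
for row in summary:
    print(row)
```

Every point estimate is below 1.0 and every interval includes 1.0.  The subgroup interval that the court seized upon spans a range hundreds of times wider, from bottom to top, than the other two, which marks it as the least informative of the three analyses.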

Dr. Welch’s conclusion that the exposure and risk in this case were “substantial” simply was not a scientific conclusion, and without it her testimony did not provide information for the jury to use in reaching its conclusion as to substantial factor causation. Id. at *7.  The court noted that Welch, and the plaintiffs, may have lacked scientific data to provide estimates of Dixon’s exposure to asbestos or relative risk of mesothelioma, but ignorance or uncertainty was hardly the basis to warrant an expert witness’s belief that the relevant exposures and risks are “substantial.” Id. at *10.  The court was well justified in being discomforted by the conclusory, unscientific opinion rendered by Laura Welch.

In the final puzzle of the Dixon case, the court vacated the judgment, and remanded for a new trial, “either without her opinion on substantiality or else with some quantitative testimony that will help the jury fulfill its charge.”  Id. at *10.  The court thus seemed to imply that an expert witness need not utter the magic word, “substantial,” for the case to be submitted to the jury against a brake defendant in a take-home exposure case.  Given the state of the record, the court should have simply reversed and rendered judgment for Ford.

Ecological Fallacy Goes to Court

June 30th, 2012

In previous posts, I have bemoaned the judiciary’s tin ear for important qualitative differences between and among different research study designs.  The Reference Manual on Scientific Evidence (3d ed. 2011)(RMSE3d) offers inconsistent advice, ranging from Margaret Berger’s counsel to abandon any hierarchy of evidence, to other chapters’ emphasizing the importance of a hierarchy.

The Cook case is one of the more aberrant decisions, which elevated an ecological study, without a statistically significant result, into an acceptable basis for a causal conclusion under Rule 702.  Senior Judge Kane’s decision in the litigation over radioactive contamination from the Colorado Rocky Flats nuclear weapons plant is illustrative of a judicial refusal to engage with the substantive differences among studies, and of a willingness to ignore the inability of some study designs to support causality.  See Cook v. Rockwell Internat’l Corp., 580 F. Supp. 2d 1071, 1097-98 (D. Colo. 2006) (“Defendants assert that ecological studies are inherently unreliable and therefore inadmissible under Rule 702.  Ecological studies, however, are one of several methods of epidemiological study that are well-recognized and accepted in the scientific community.”), rev’d and remanded on other grounds, 618 F.3d 1127 (10th Cir. 2010), cert. denied, ___ U.S. ___ (May 24, 2012).  Senior Judge Kane’s point about the recognition and acceptance of ecological studies has nothing to do with their ability to support conclusions of causality.  This basic non sequitur led the trial judge into ruling that the challenge “goes to the weight, not the admissibility” of the challenged opinion testimony.  This is a bit like using an election day exit poll, with 5% returns, for “reliable” evidence to support a prediction of the winner.  The poll may have been conducted most expertly, but it lacks the ability to predict the winner.

The issue is not whether ecological studies are “scientific”; they are part of the epidemiologists’ toolkit.  The issue is whether they warrant inferences of causation.  Some so-called scientific studies are merely hypothesis generating, preliminary, tentative, or data-dredging exercises.  Judge Kane opined that ecological studies are merely “less probative” than other studies, and the relative weights of studies do not render them inadmissible.  Id.  This is a misunderstanding or an abdication of gatekeeping responsibility.  First, studies themselves are not admissible; it is the expert witness’s testimony that is challenged.  Second, Rule 702 requires that the proffered opinion be “scientific knowledge,” and ecological studies simply lack the necessary epistemic warrant.

The legal sources cited by Senior Judge Kane provide only equivocal and minimal support at best for his decision.  The court pointed to RMSE2d at 344-45, for the proposition that ecological studies are useful for establishing associations, but are weak evidence for causality. The other legal citations seem equally unhelpful.  In re Hanford Nuclear Reservation Litig., No. CY–91–3015–AAM, 1998 WL 775340 at *106 (E.D.Wash. Aug.21, 1998) (citing RMSE2d and the National Academy of Science Committee on Radiation Dose Reconstruction for Epidemiological Uses, which states that “ecological studies are usually regarded as hypothesis generating at best, and their results must be regarded as questionable until confirmed with cohort or case‑control studies.” National Research Council, Radiation Dose Reconstruction for Epidemiologic Uses at 70 (1995)), rev’d on other grounds, 292 F.3d 1124 (9th Cir. 2002).  Ruff v. Ensign–Bickford Indus., Inc., 168 F.Supp. 2d 1271, 1282 (D. Utah 2001) (reviewing evidence that consisted of a case-control study in addition to an ecological study; “It is well established in the scientific community that ecological studies are correlational studies and generally provide relatively weak evidence for establishing a conclusive cause and effect relationship.”); see also id. at 1274 n.3 (“Ecological studies tend to be less reliable than case–control studies and are given little evidentiary weight with respect to establishing causation.”).

 

ERROR COMPOUNDED

The new edition of RMSE cites the Cook case at several places.  In an introductory chapter, the late Professor Margaret Berger cites the case incorrectly for having excluded expert witness testimony.  See Margaret A. Berger, “The Admissibility of Expert Testimony,” 11, 24 n.62, in RMSE3d (“See Cook v. Rockwell Int’l Corp., 580 F. Supp. 2d 1071 (D. Colo. 2006) (discussing why the court excluded expert’s testimony, even though his epidemiological study did not produce statistically significant results).”).  The chapter on epidemiology cites Cook correctly for having refused to exclude the plaintiffs’ expert witness, Dr. Richard Clapp, who relied upon an ecological study of two cancer outcomes in the area adjacent to the Rocky Flats Nuclear Weapons Plant.  See Michael D. Green, D. Michal Freedman, and Leon Gordis, “Reference Guide on Epidemiology,” 549, 561 n. 34, in Reference Manual on Scientific Evidence (3d ed. 2011).  The authors, however, abstain from any judgmental comments about the Cook case, which is curious given their careful treatment of ecological studies and their limitations:

“4. Ecological studies

Up to now, we have discussed studies in which data on both exposure and health outcome are obtained for each individual included in the study.33 In contrast, studies that collect data only about the group as a whole are called ecological studies.34 In ecological studies, information about individuals is generally not gathered; instead, overall rates of disease or death for different groups are obtained and compared. The objective is to identify some difference between the two groups, such as diet, genetic makeup, or alcohol consumption, that might explain differences in the risk of disease observed in the two groups.35 Such studies may be useful for identifying associations, but they rarely provide definitive causal answers.36”

Id. at 561.  The epidemiology chapter proceeds to note that the lack of information about individual exposure and disease outcome in an ecological study “detracts from the usefulness of the study,” and renders it prone to erroneous inferences about the association between exposure and outcome, “a problem known as an ecological fallacy.”  Id. at 562.  The chapter authors define the ecological fallacy:

“Also, aggregation bias, ecological bias. An error that occurs from inferring that a relationship that exists for groups is also true for individuals.  For example, if a country with a higher proportion of fishermen also has a higher rate of suicides, then inferring that fishermen must be more likely to commit suicide is an ecological fallacy.”

Id. at 623.  Although the ecological study design is weak and generally unsuitable to support causal inferences, the authors note that such studies can be useful in generating hypotheses for future research using studies that gather data about individuals. Id. at 562.  See also David Kaye & David Freedman, “Reference Guide on Statistics,” 211, 266 n.130 (citing the epidemiology chapter “for suggesting that ecological studies of exposure and disease are ‘far from conclusive’ because of the lack of data on confounding variables (a much more general problem) as well as the possible aggregation bias”); Leon Gordis, Epidemiology 205-06 (3d ed. 2004)(ecologic studies can be of value to suggest future research, but “[i]n and of themselves, however, they do not demonstrate conclusively that a causal association exists”).
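The fishermen example lends itself to a toy calculation.  The three countries and all of the figures below are invented for illustration.  Within each hypothetical country, fishermen have a lower suicide rate than everyone else, yet the countries with more fishermen have higher overall rates, so the group-level correlation points in exactly the wrong direction for an inference about individuals.

```python
from math import sqrt

# Hypothetical countries:
# (share of fishermen, fishermen's rate, everyone else's rate), rates per 100,000
countries = [
    (0.10, 8.0, 10.0),
    (0.20, 16.0, 20.0),
    (0.30, 24.0, 30.0),
]

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

shares = [share for share, _, _ in countries]
# Overall national rate is the weighted average of the two subgroups' rates.
overall_rates = [s * fish + (1 - s) * other for s, fish, other in countries]

# Group-level (ecological) association: share of fishermen vs. national rate.
ecological_r = pearson(shares, overall_rates)

# Individual-level comparison within each country: fishermen vs. everyone else.
within_country_ratios = [fish / other for _, fish, other in countries]

print(f"ecological correlation: {ecological_r:.4f}")
print(f"within-country rate ratios (fishermen / others): {within_country_ratios}")
```

The ecological correlation is nearly perfect and positive, while the rate ratio for fishermen within every country is below one.  Group-level data alone cannot distinguish this scenario from one in which fishermen really are at higher risk.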

The views expressed in the Reference Manual on Scientific Evidence, about ecological studies, are hardly unique.  The following quotes show how ecological studies are typically evaluated in epidemiology texts:

“Ecological fallacy

An ecological fallacy or bias results if inappropriate conclusions are drawn on the basis of ecological data. The bias occurs because the association observed between variables at the group level does not necessarily represent the association that exists at the individual level (see Chapter 2).

***

Such ecological inferences, however limited, can provide a fruitful start for more detailed epidemiological work.”

R. Bonita, R. Beaglehole, and T. Kjellström, Basic Epidemiology 43 (2d ed. WHO 2006).

“A first observation of a presumed relationship between exposure and disease is often done at the group level by correlating one group characteristic with an outcome, i.e. in an attempt to relate differences in morbidity or mortality of population groups to differences in their local environment, living habits or other factors. Such correlational studies that are usually based on existing data are prone to the so-called ‘ecological fallacy’ since the compared populations may also differ in many other uncontrolled factors that are related to the disease. Nevertheless, ecological studies can provide clues to etiological hypotheses and may serve as a gateway towards more detailed investigations.”

Wolfgang Ahrens & Iris Pigeot, eds., Handbook of Epidemiology 17-18 (2005).

The Cook case is a wonderful illustration of the judicial mindset that avoids and evades gatekeeping by resorting to the conclusory reasoning that a challenge “goes to the weight, not the admissibility” of an expert witness’s opinion.

Let’s Require Health Claims to Be Evidence Based

June 28th, 2012

Litigation arising from the FDA’s refusal to approve “health claims” for foods and dietary supplements is a fertile area for disputes over the interpretation of statistical evidence.  A “health claim” is “any claim made on the label or in labeling of a food, including a dietary supplement, that expressly or by implication … characterizes the relationship of any substance to a disease or health-related condition.” 21 C.F.R. § 101.14(a)(1); see also 21 U.S.C. § 343(r)(1)(A)-(B).

Unlike the federal courts exercising their gatekeeping responsibility, the FDA has committed to pre-specified principles of interpretation and evaluation. By regulation, the FDA gives notice of standards for evaluating complex evidentiary displays for the “significant scientific agreement” required for approving a food or dietary supplement health claim.  21 C.F.R. § 101.14.  See FDA – Guidance for Industry: Evidence-Based Review System for the Scientific Evaluation of Health Claims – Final (2009).

If the FDA’s refusal to approve a health claim requires pre-specified criteria of evaluation, then we should be asking ourselves why the federal courts have failed to develop a set of criteria for evaluating health effects claims as part of their Rule 702 (“Daubert”) gatekeeping responsibilities.  Why, close to 20 years after the Supreme Court decided Daubert, can lawyers make “health claims” without having to satisfy evidence-based criteria?

Although the FDA’s guidance is not always as precise as might be hoped, it is far better than the suggestion of the new Reference Manual on Scientific Evidence (3d ed. 2011) that there is no hierarchy of evidence.   See RMSE 3d at 564 & n.48 (citing and quoting idiosyncratic symposium paper that “[t]here should be no hierarchy [among different types of scientific methods to determine cancer causation]”); “Late Professor Berger’s Introduction to the Reference Manual on Scientific Evidence” (Oct. 23, 2011).

The FDA’s attempt to articulate an evidence-based hierarchy is noteworthy because the agency must evaluate a wide range of evidence, from in vitro, to animal studies, to observational studies of varying kinds, to clinical trials, to meta-analyses and reviews.  The FDA’s criteria are a good start, and I imagine that they will develop and improve over time.  Although imperfect, the criteria are light years ahead of the situation in federal and state court gatekeeping.  Unlike gatekeeping in civil actions, the FDA criteria are pre-stated and not devised post hoc.  The FDA’s attempt to implement evidence-based principles in the evaluation of health claims is a model that would much improve the Reference Manual on Scientific Evidence.  See Christopher Guzelian & Philip Guzelian, “Prevention of false scientific speech: a new role for an evidence-based approach,” 27 Human & Experimental Toxicol. 733 (2008).

The FDA’s evidence-based criteria need work in some areas.  For instance, the FDA’s Guidance on meta-analysis is not particularly specific or helpful:

Research Synthesis Studies

“Reports that discuss a number of different studies, such as review articles, do not provide sufficient information on the individual studies reviewed for FDA to determine critical elements such as the study population characteristics and the composition of the products used. Similarly, the lack of detailed information on studies summarized in review articles prevents FDA from determining whether the studies are flawed in critical elements such as design, conduct of studies, and data analysis. FDA must be able to review the critical elements of a study to determine whether any scientific conclusions can be drawn from it. Therefore, FDA intends to use review articles and similar publications to identify reports of additional studies that may be useful to the health claim review and as background about the substance/disease relationship. If additional studies are identified, the agency intends to evaluate them individually. Most meta-analyses, because they lack detailed information on the studies summarized, will only be used to identify reports of additional studies that may be useful to the health claim review and as background about the substance-disease relationship.  FDA, however, intends to consider as part of its health claim review process a meta-analysis that reviews all the publicly available studies on the substance/disease relationship. The reviewed studies should be consistent with the critical elements, quality and other factors set out in this guidance and the statistical analyses adequately conducted.”

FDA – Guidance for Industry: Evidence-Based Review System for the Scientific Evaluation of Health Claims – Final at 10 (2009).

The dismissal of review articles as a secondary source is welcome, but meta-analyses are quantitative reviews that can add insights and evidence, if methodologically appropriate, by providing a summary estimate of association, sensitivity analyses, meta-regression, etc.  The FDA’s guidance was applied in connection with the agency’s refusal to approve a health claim for vitamin C and lung cancer.  Proponents claimed that a particular meta-analysis supported their health claim, but the FDA disagreed.  The proponents sought injunctive relief in federal district court, which upheld the FDA’s decision on vitamin C and lung cancer.  Alliance for Natural Health US v. Sebelius, 786 F.Supp. 2d 1, 21 (D.D.C. 2011).  The district court found that the FDA’s refusal to approve the health claim was neither arbitrary nor capricious with respect to its evaluation of the cited meta-analysis:

‘‘The FDA discounted the Cho study because it was a ‘meta-analysis’ of studies reflected in a review article. FDA Decision at 2523. As explained in the 2009 Guidance Document, ‘research synthesis studies’, and ‘review articles’, including ‘most meta-analyses’, ‘do not provide sufficient information on the individual studies reviewed’ to determine critical elements of the studies and whether those elements were flawed. 2009 Guidance Document at A.R. 2432. The Guidance Document makes an exception for meta-analyses ‘that review[ ] all the publicly available studies on the substance/disease relationship’. Id. Based on the Court’s review of the Cho article, the FDA’s decision to exclude this article as a meta-analysis was not arbitrary and capricious.’’

Id. at 19.

The FDA’s Guidance was adequate for its task in the vitamin C/lung cancer health claim, but notably absent from the Guidance are any criteria to evaluate competing meta-analyses that do include “all the publicly available studies on the substance/disease relationship.”  The model assumptions of meta-analyses, fixed effect versus random effects, lack of heterogeneity, as well as other considerations will need to be spelled out in advance.  Still not a bad start.  Implementing evidence-based criteria in Rule 702 gatekeeping has the potential to tame the gatekeeper’s discretion.
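To make the model-assumption point concrete, here is a minimal sketch, in Python, of how the same set of studies can yield different summary estimates and confidence intervals under a fixed-effect model and a random-effects model.  The sketch assumes inverse-variance weighting and the DerSimonian-Laird estimator of between-study variance, which is only one of several available estimators; the function names and the four study results are hypothetical and are not drawn from the Guidance or from any study discussed here.

```python
import math

def pool_fixed(log_rr, se):
    """Inverse-variance fixed-effect summary of log relative risks.

    Returns (summary log RR, standard error of the summary).
    """
    weights = [1.0 / s ** 2 for s in se]
    estimate = sum(w * y for w, y in zip(weights, log_rr)) / sum(weights)
    return estimate, math.sqrt(1.0 / sum(weights))

def pool_random(log_rr, se):
    """DerSimonian-Laird random-effects summary.

    Returns (summary log RR, standard error, tau^2, Cochran's Q).
    """
    weights = [1.0 / s ** 2 for s in se]
    fixed, _ = pool_fixed(log_rr, se)
    # Cochran's Q measures how far the studies scatter around the fixed-effect summary.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, log_rr))
    df = len(log_rr) - 1
    c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
    # Between-study variance, truncated at zero when Q is below its degrees of freedom.
    tau2 = max(0.0, (q - df) / c)
    re_weights = [1.0 / (s ** 2 + tau2) for s in se]
    estimate = sum(w * y for w, y in zip(re_weights, log_rr)) / sum(re_weights)
    return estimate, math.sqrt(1.0 / sum(re_weights)), tau2, q

def as_rr_with_ci(estimate, se, z=1.96):
    """Convert a log-scale summary to a relative risk with an approximate 95% interval."""
    return tuple(math.exp(v) for v in (estimate, estimate - z * se, estimate + z * se))

if __name__ == "__main__":
    # Hypothetical relative risks and log-scale standard errors, for illustration only.
    rr = [0.70, 1.10, 0.95, 1.40]
    se = [0.10, 0.15, 0.20, 0.12]
    log_rr = [math.log(r) for r in rr]

    fe, fe_se = pool_fixed(log_rr, se)
    re, re_se, tau2, q = pool_random(log_rr, se)
    print("fixed effect  :", as_rr_with_ci(fe, fe_se))
    print("random effects:", as_rr_with_ci(re, re_se))
    print("tau^2 =", tau2, " Q =", q)
```

When the studies are heterogeneous, the random-effects interval is wider than the fixed-effect interval, and the two summaries can fall on different sides of a significance threshold.  That is why the choice of model, and the handling of heterogeneity, would have to be specified before the data are examined.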

Johnson v. Arkema Inc. – The Fifth Circuit Proves to Be Sophisticated Consumer of Science

June 21st, 2012

Yesterday, in celebration of the first day of summer, the Fifth Circuit handed down a decision in a case that looks like a laundry list of expert witness fallacies.  Fortunately, the district judge and two of the three appellate judges kept their analytical faculties intact.  Johnson v. Arkema Inc., Slip op., 2012 WL ___ (5th Cir. June 20, 2012) (per curiam) (affirming exclusion of expert witnesses).

The plaintiff had worked in a glass bottling plant, where on two occasions in 2007, he was in close proximity to the defendant’s ventilation hood, designed to be used with a chemical, Certincoat, composed of monobutyltin trichloride (MBTC), an organometallic compound.  Plaintiff claimed that the ventilation was inadequate and that as a result he was exposed to MBTC as well as hydrochloric acid.

The plaintiff sustained some acute symptoms and ultimately was diagnosed with a “chemical pneumonia” by his treating physician.  The plaintiff further claimed that his condition progressively worsened, and that he was ultimately diagnosed with “pulmonary fibrosis,” a “severe restrictive lung disease.” The plaintiff filed reports from two expert witnesses – Richard Schlesinger, a toxicologist, and Charles Grodzin, a pulmonary physician – in support of his claim that his pulmonary fibrosis was caused by overexposure to MBTC and hydrochloric acid (HCl).

Plaintiff’s claim led to defendant’s Rule 702 challenge, which the trial court sustained, and the appellate court affirmed.

A basic problem faced by plaintiff is that there was virtually no evidence that MBTC or HCl causes pulmonary fibrosis. Undaunted, the plaintiff and his expert witnesses pushed on, but the lack of epidemiologic evidence associating MBTC or HCl with pulmonary fibrosis proved reliably harmful to plaintiff’s case.

General Acceptance

Plaintiff could point to no evidence that MBTC or HCl causes pulmonary fibrosis.  Slip op. at 7. Given the delay in manifestation of the fibrosis after the plaintiff’s rather limited, discrete exposures, the court recognized that epidemiologic evidence was important, if not essential, to plaintiff’s case. Without epidemiology, the plaintiff retreated to generalities – the chemicals cause lung irritation, lung injury, etc.  One concurring judge was taken in, but the majority of the panel saw through the dodge.

Anecdotal Evidence

Without epidemiologic evidence, the plaintiff invoked anecdotal evidence that other employees sustained similar lung injuries. The problem, however, for even this low-level evidence was that other employees experienced only transitory symptoms, which quickly resolved.  Id. at 4-5, 27.

Post Hoc, Ergo Propter Hoc

Focusing only on himself as an anecdote with n = 1, the plaintiff, and his expert witnesses, argued that the temporal sequence of his exposure and his pulmonary fibrosis was itself evidence of causality.  Neither the trial court nor the appellate court found this much of an argument.  Id. at 16 n.13, 18.

Mechanism in Search of Data – Schlesinger’s irritant theory

Schlesinger argued that both MBTC and HCl are pulmonary irritants, which can cause inflammation, and pulmonary fibrosis results from inflammation. Id. at 8.  True, but not all irritants cause pulmonary fibrosis.  Chronicity and dose are important considerations.  Whether these chemicals, under exposure conditions experienced by plaintiff, were capable of causing pulmonary fibrosis, cried out for evidence.

The Material Safety Data Sheets (MSDS)

The plaintiff argued that the MSDS for HCl established that this chemical was “severely corrosive to the respiratory system.” Id. at 11-12.  The defendant’s own MSDS for MBTC stated that MBTC “causes respiratory tract irritation.” Id. at 16.  The courts saw these arguments as transparently absent evidence. None of the MSDS identified pulmonary fibrosis; nor did they specify (1) the underlying scientific support, or (2) the relevant duration and exposure needed to induce any particular adverse outcome.

Animal Studies

For both MBTC and HCl, plaintiff adverted to animal studies, but the courts found that the animal studies failed to support the plaintiff’s expert witnesses’ opinions and the plaintiff’s claims.  The studies were readily distinguishable in terms of dose, duration, and disease outcome.  In particular, none of the studies showed that the chemicals caused pulmonary fibrosis. Id. at 7, 12 (baboon study of HCl showed impairment but not fibrosis at 10,000 ppm for one year, quite unlike plaintiff’s exposure), 16-17 (rat inhalation study of MBTC, six hrs/day, five days/wk, up to 30 mg/m3, with toxicity but no mention of lung fibrosis).

Regulatory Limits

Plaintiff argued that HCl levels were multiples of the OSHA limits, but the courts would not credit regulatory exposure limits as evidence of harmfulness because of the precautionary nature of many regulations.  Id. at 14.  Furthermore, the disease outcomes of regulatory concern did not appear to be pulmonary fibrosis for the chemicals involved.

Res Ipsa Loquitur

The plaintiff argued that causation was a matter of common sense and general experience.  Even if his expert witnesses did not have valid, reliable evidence, the jury could make the causal determination without scientific evidence. Id. at 26.  Rejected.

Chemical Analogies

The defendant’s expert witness acknowledged that tin oxide can cause pulmonary fibrosis.  Id. at 28.  This admission, however, came without any qualification about what exposure or duration data might be needed to support a conclusion about specific causation in the plaintiff.  Id.  Furthermore, tin pneumoconiosis, or stannosis, is known as a benign lung disease, unassociated with impairment or disability.  Like simple silicosis, stannosis is a picture change on chest radiograph, without diminution of performance on pulmonary function tests.  Agency for Toxic Substances and Disease Registry, A Toxicological Profile for Tin and Tin Compounds at 30 (2005).

Differential Diagnosis

Plaintiff’s pulmonary expert witness, Dr. Grodzin, tried to bootstrap specific causation by assuming general causation and putting it in the “differentials” for him to embrace.  Id. at 19.  A fallacious form of reasoning, but the courts here were on top of it.

* * * * *

The panel did reverse the trial court’s grant of summary judgment.  The gate closed a little too fast to permit scrutiny of plaintiff’s claim of acute injuries and symptoms, which were less dependent upon epidemiologic evidence.

 

Another Confounder in Lung Cancer Occupational Epidemiology — Diesel Engine Fumes

June 13th, 2012

Researchers obviously need to be aware of, and control for, potential and known confounders.  In the context of investigating the etiologies of lung cancer, there is a long list of potential confounding exposures, often ignored in peer-reviewed papers, which focus on one particular outcome of interest.  Just last week, I wrote to emphasize the need to account for potential and known confounding agents, and how this need was particularly strong in studies of weak alleged carcinogens such as crystalline silica.  See Sorting Out Confounded Research – Required by Rule 702.  Yesterday, the World Health Organization (WHO) added another “known” confounder for lung cancer epidemiology:  diesel fume.

According to the International Agency for Research on Cancer (IARC), a division of the WHO, a working group of international experts voted to reclassify diesel engine exhaust as a “Group 1” carcinogen.  IARC: Diesel engine exhaust carcinogenic (2012).  This classification means, in IARC parlance, that “there is sufficient evidence of carcinogenicity in humans. Exceptionally, an agent may be placed in this category when evidence of carcinogenicity in humans is less than sufficient but there is sufficient evidence of carcinogenicity in experimental animals and strong evidence in exposed humans that the agent acts through a relevant mechanism of carcinogenicity.”  The Group was headed up by Dr. Christopher Portier, who is the director of the National Center for Environmental Health and the Agency for Toxic Substances and Disease Registry at the Centers for Disease Control and Prevention.  Id.

The reclassification removes diesel exhaust from its previous categorization as a Group 2A carcinogen, which is interpreted “as probably carcinogenic to humans.”  Diesel exhaust has been on a high-priority list for re-evaluation since 1998, as a result of epidemiologic research from many countries.  The Working Group specifically found that there was sufficient evidence to conclude that diesel exhaust is a cause of lung cancer in humans, and limited evidence to support an association with bladder cancer.  The Group rejected any change in classification of gasoline engine exhaust from its current IARC rating as “possibly carcinogenic to humans” (Group 2B).

Unlike other IARC Working Group decisions (such as crystalline silica), which were weakened by close votes and significant dissents, the diesel Group’s conclusion was unanimous.  The diesel Group appeared to be impressed by two recent studies of lung cancer in underground miners, released in March 2012.  One study was in a large cohort, conducted by NIOSH, and the other was a nested case-control study, conducted by the National Cancer Institute (NCI).  See Debra T. Silverman, Claudine M. Samanic, Jay H. Lubin, Aaron E. Blair, Patricia A. Stewart, Roel Vermeulen, Joseph B. Coble, Nathaniel Rothman, Patricia L. Schleiff, William D. Travis, Regina G. Ziegler, Sholom Wacholder, Michael D. Attfield, “The Diesel Exhaust in Miners Study: A Nested Case-Control Study of Lung Cancer and Diesel Exhaust,” J. Nat’l Cancer Instit. (2012)(in press and open access); and Michael D. Attfield, Patricia L. Schleiff, Jay H. Lubin, Aaron Blair, Patricia A. Stewart, Roel Vermeulen, Joseph B. Coble, and Debra T. Silverman, “The Diesel Exhaust in Miners Study: A Cohort Mortality Study With Emphasis on Lung Cancer,” J. Nat’l Cancer Instit. (2012)(in press).

According to a story in the New York Times, the IARC Working Group described diesel engine exhaust as “more carcinogenic than secondhand cigarette smoke.”  Donald McNeil, “W.H.O. Declares Diesel Fumes Cause Lung Cancer,” N.Y. Times (June 12, 2012).  The Times also quoted Dr. Debra Silverman, NCI chief of environmental epidemiology, at length.  Dr. Silverman, who was the lead author of the nested case-control study cited by the IARC Press Release, noted that her large study showed that long-term heavy exposure to diesel fumes increased lung cancer risk sevenfold. Dr. Silverman described this risk as much greater than that thought to be created by passive smoking, but much smaller than smoking two packs of cigarettes a day.  She stated that she “totally” supported the IARC reclassification, and that she believed that governmental agencies would use the IARC analysis as the basis for changing the regulatory classification of diesel exhaust.

Silverman’s nested case-control study appears to have been based upon careful diesel exhaust exposure information, as well as smoking histories.  The study also searched and analyzed for other potential confounders, which might be expected to be involved in underground mining:

“Other potential confounders [ie, duration of cigar smoking; frequency of pipe smoking; environmental tobacco smoke; family history of lung cancer in a first-degree relative; education; body mass index based on usual adult weight and height; leisure time physical activity; diet; estimated cumulative exposure to radon, asbestos, silica, polycyclic aromatic hydrocarbons (PAHs) from non-diesel sources, and respirable dust in the study facility based on air measurement and other data (14)] were evaluated but not included in the final models because they had little or no impact on odds ratios (ie, inclusion of these factors in the final models changed point estimates for diesel exposure by ≤ 10%).”

Silverman, et al., at 4.  The absence of an association between lung cancer and silica exposure is noteworthy in such a large study of underground miners.
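
The 10% change-in-estimate screen described in the quoted passage can be illustrated with a short Python sketch.  This is not the authors’ method: the sketch assumes a single binary candidate confounder and substitutes Mantel-Haenszel stratification for the regression models used in the study, and it measures the change relative to the unadjusted estimate, which is also an assumption.  The function names and the counts are hypothetical and serve only to show the arithmetic.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for one 2x2 table.

    a = exposed cases, b = exposed controls,
    c = unexposed cases, d = unexposed controls.
    """
    return (a * d) / (b * c)

def crude_or(strata):
    """Odds ratio after collapsing all strata, i.e. ignoring the candidate confounder."""
    a, b, c, d = (sum(column) for column in zip(*strata))
    return odds_ratio(a, b, c, d)

def mantel_haenszel_or(strata):
    """Mantel-Haenszel summary odds ratio, adjusted for the stratifying factor."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator

def percent_change(unadjusted, adjusted):
    """Percent change in the point estimate when the factor is accounted for."""
    return abs(adjusted - unadjusted) / unadjusted * 100.0

def retain_as_confounder(strata, threshold=10.0):
    """Apply the change-in-estimate screen: keep the factor only if the shift exceeds the threshold."""
    return percent_change(crude_or(strata), mantel_haenszel_or(strata)) > threshold

if __name__ == "__main__":
    # Hypothetical counts, one tuple (a, b, c, d) per level of the candidate confounder.
    strata = [(40, 60, 20, 80), (30, 20, 15, 35)]
    print("crude OR   :", crude_or(strata))
    print("adjusted OR:", mantel_haenszel_or(strata))
    print("retain?    :", retain_as_confounder(strata))
```

A factor that moves the estimate by no more than the threshold is dropped from the final model, as the quoted passage describes.  A factor that fails the screen in one study may still confound another, because the screen depends on how the factor is distributed among the exposed and unexposed in the particular data set.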