TORTINI

For your delectation and delight, desultory dicta on the law of delicts.

Railroading Scientific Evidence of Causation in Court

August 31st, 2014

Harold Tanfield spent 40 years or so working for Consolidated Rail Corporation (and its predecessors), from 1952 to 1992.  Mr. Tanfield’s widow sued Conrail, under the Federal Employers’ Liability Act (“FELA”), 45 U.S.C.A. §§ 51-60, for negligently overexposing her late husband to diesel fumes, which allegedly caused him to develop lung cancer. Tanfield v. Leigh RR, No. A-4170-12T2, New Jersey Superior Court, App. Div. (Aug. 11, 2014) Slip op. at 3. [cited below as Tanfield].

The trial court granted Conrail summary judgment on grounds that plaintiff failed to show that Conrail had breached a duty of care.  The appellate court reversed and remanded for trial. The Appellate Division’s decision is “per curiam,” and franked “not for publication without the approval of the Appellate Division.” Only two of the usual three appellate judges participated.  The panel decided the case one week after it was submitted.

The plaintiff relied upon two witnesses, a co-worker of her husband, and an expert witness, Steven R. Tahan, M.D.  Dr. Tahan is a pathologist, an Associate Professor, Department of Pathology, Harvard Medical School, and the Director of Dermatopathology, Beth Israel Deaconess Medical Center.  Dr. Tahan’s website lists melanoma as his principal research interest. A PubMed search reveals no publications on diesel fume, occupational disease, or lung cancer.  Dr. Tahan’s principal research interest, skin pathology, was decidedly not at issue in the Tanfield case.

The panel of the Appellate Division quoted from the relevant paragraphs of Tahan’s report:

“Mr. Tanfield was a railroad worker for 35 years, where he was exposed to a large number of carcinogenic chemicals and fumes, including asbestos, antimony, arsenic, benzene, beryllium, cadmium, carbon disulfide, cyanide, DDT, diesel fumes, diesel fuel, dioxins, ethylbenzene, lead, methylene chloride, mercury, naphthalene, petroleum hydrocarbon, polychlorinated biphenyls, polynuclear aromatic hydrocarbons, toluene, vinyl acetate, and other volatile organics.

I have reviewed the cytology and biopsy slides from the right lung and confirm that he had a poorly differentiated malignant non-small cell carcinoma with both adenocarcinomatous and squamous features.  I have reached the following conclusions to a reasonable degree of medical certainty based on review of the above materials, my education, training, and experience, and review of published studies.

Mr. Tanfield’s more than 35 year substantial occupational exposure to an extensive array of carcinogens and diesel fumes without provision of protective equipment such as masks, respirators, and other filters created a long-term hazard that substantially multiplied his risk for developing lung cancer over the baseline he had as a former smoker.  It is more likely than not that his occupational exposure to diesel fumes and other carcinogenic toxins present in his workplace was a significant causative factor for his development of lung cancer and death from his cancer.”

Tanfield at 6-7.

Mr. Tanfield’s co-worker testified to what appeared to him to be excessive diesel fumes in the workplace, but there is no mention of any quantitative or qualitative evidence of exposure to any other lung carcinogen.  The Appellate Division states that the above three paragraphs represent the substance of Dr. Tahan’s report, and so it appears that there is no quantification of Tanfield’s smoking abuse, or the length of time between his discontinuing his smoking and the diagnosis of his lung cancer.  There is no discussion of any support for the alleged interaction between risks, or for any quantification of the extent of his increased risk from his lifestyle choices as opposed to his workplace exposure(s). There is no discussion of what Dr. Tahan visualized in his review of cytology and pathology slides, which permitted him to draw inferences about the actual causes of Mr. Tanfield’s lung cancer.

The trial judge proceeded on the assumption that there was an adequate proffer of expert opinion on causation, but that Dr. Tahan’s opinion on the failure to provide masks or respirators was a “net opinion,” a bit out of Tahan’s area of expertise.  Tanfield at 8. The Appellate Division apparently thought having a skin pathologist opine about the duty of care for a railroad was good enough for government work.  The appellate court gave the widow the benefit of the lower evidentiary threshold for negligence under FELA, which supposedly excuses the lack of an industrial hygiene opinion.  Tanfield at 10.  According to the two-judge panel, “[t]he doctor’s [Tahan’s] opinions are backed by professional literature and by his own considerable years of research and experience.” Tanfield at 11.  The panel’s statement is all the more remarkable given that Tahan had never published on lung cancer, exposure assessments, or industrial hygiene measures; the vaunted experience of this witness was irrelevant to the issues in the case. Perhaps even more disturbing are the gaps in the proofs: the lack of any showing of a causal connection between many of the alleged exposures and lung cancer generally, and the absence of any discussion whether the level of exposure to diesel fumes, from 1952 to 1992, was such that the railroads knew or should have known that that level of diesel fume caused lung cancer in workers.  And then there is the lurking probability that Mr. Tanfield’s smoking was the sole cause of his lung cancer.

Over 50 years ago, the New York Court of Appeals rejected a claim for leukemia, based upon allegations of benzene exposure, without any quantification of risk from the alleged exposure.  Miller v. National Cabinet Co., 8 N.Y.2d 277, 283-84, 168 N.E.2d 811, 813-15, 204 N.Y.S.2d 129, 132-34, modified on other grounds, 8 N.Y.2d 1025, 70 N.E.2d 214, 206 N.Y.S.2d 795 (1960). It is time to raise the standard for New Jersey courts’ consideration of epidemiologic evidence.

Pritchard v. Dow Agro – Gatekeeping Exemplified

August 25th, 2014

Robert T. Pritchard was diagnosed with Non-Hodgkin’s Lymphoma (NHL) in August 2005; by fall 2005, his cancer was in remission. Mr. Pritchard had been a pesticide applicator, and so, of course, he and his wife sued the deepest pockets around, including Dow Agro Sciences, the manufacturer of Dursban. Pritchard v. Dow Agro Sciences, 705 F.Supp. 2d 471 (W.D.Pa. 2010).

The principal active ingredient of Dursban is chlorpyrifos, along with some solvents, such as xylene, cumene, and ethyltoluene. Id. at 474.  Dursban was licensed for household insecticide use until 2000, when the EPA phased out certain residential applications.  The EPA’s concern, however, was not carcinogenicity:  the EPA categorizes chlorpyrifos as “Group E,” non-carcinogenic in humans. Id. at 474-75.

According to the American Cancer Society (ACS), the cause or causes of NHL cases are unknown.  Over 60,000 new cases are diagnosed annually, in people from all walks of life, occupations, and lifestyles. The ACS identifies some risk factors, such as age, gender, race, and ethnicity, but the ACS emphasizes that chemical exposures are not proven risk factors or causes of NHL.  See Pritchard, 705 F.Supp. 2d at 474.

The litigation industry does not need scientific conclusions of causal connections; their business is manufacturing certainty in courtrooms. Or at least, the appearance of certainty. The Pritchards found their way to the litigation industry in Pittsburgh, Pennsylvania, in the form of Goldberg, Persky & White, P.C. The Goldberg Persky firm sued Dow Agro, and then put the Pritchards in touch with Dr. Bennet Omalu, to serve as their expert witness.  A lawsuit ensued.

Alas, the Pritchards’ lawsuit ran into a wall, or at least a gate, in the form of Federal Rule of Evidence 702. In the capable hands of Judge Nora Barry Fischer, Rule 702 became an effective barrier against weak and poorly considered expert witness opinion testimony.

Dr. Omalu, no stranger to lost causes, was the medical examiner of San Joaquin County, California, at the time of his engagement in the Pritchard case. After careful consideration of the Pritchards’ claims, Omalu prepared a four page report, with a single citation, to Harrison’s Principles of Internal Medicine.  Id. at 477 & n.6.  This research, however, sufficed for Omalu to conclude that Dursban caused Mr. Pritchard to develop NHL, as well as a host of ailments he had never even sued Dow Agro for, including “neuropathy, fatigue, bipolar disorder, tremors, difficulty concentrating and liver disorder.” Id. at 478. Dr. Omalu did not cite or reference any studies, in his report, to support his opinion that Dursban caused Mr. Pritchard’s ailments.  Id. at 480.

After counsel objected to Omalu’s report, plaintiffs’ counsel supplemented the report with some published articles, including the “Lee” study.  See Won Jin Lee, Aaron Blair, Jane A. Hoppin, Jay H. Lubin, Jennifer A. Rusiecki, Dale P. Sandler, Mustafa Dosemeci, and Michael C. R. Alavanja, “Cancer Incidence Among Pesticide Applicators Exposed to Chlorpyrifos in the Agricultural Health Study,” 96 J. Nat’l Cancer Inst. 1781 (2004) [cited as Lee].  At his deposition, and in opposition to defendants’ 702 motion, Omalu became more forthcoming with actual data and argument.  According to Omalu, “the 2004 Lee Study strongly supports a conclusion that high-level exposure to chlorpyrifos is associated with an increased risk of NHL.” Id. at 480.

This opinion put forward by Omalu bordered on scientific malpractice.  No; it was malpractice.  The Lee study looked at many different cancer end points, without adjustment for multiple comparisons.  The lack of adjustment means at the very least that any interpretation of p-values or confidence intervals would have to be modified to acknowledge the higher rate of random error.  Now for NHL, the overall relative risk (RR) for chlorpyrifos exposure was 1.03, with a 95% confidence interval, 0.62 to 1.70.  Lee at 1783.  In other words, the study that Omalu claimed supported his opinion was about as null a study as can be, with a reasonably tight confidence interval that made a doubling of the risk rather unlikely given the sample RR.
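The arithmetic behind that reading can be sketched in a few lines. The snippet below is an illustration only, not something taken from the Lee paper or the court record. It assumes that the reported 95% interval is a symmetric Wald interval on the log scale, backs out the standard error, and asks how many standard errors separate the point estimate from a doubling of risk. The function names are made up for the example, and the count of 20 independent end points in the last step is hypothetical, chosen only to show how quickly unadjusted testing inflates the chance of a false positive.

```python
import math

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal


def log_se_from_ci(lower, upper, z=Z_95):
    """Standard error of log(RR), assuming a symmetric Wald interval on the log scale."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)


def z_against(rr_hat, rr_hypothesized, se_log):
    """Number of standard errors between the estimate and a hypothesized RR."""
    return (math.log(rr_hat) - math.log(rr_hypothesized)) / se_log


def familywise_error(alpha, k):
    """Chance of at least one false positive in k independent tests at level alpha."""
    return 1.0 - (1.0 - alpha) ** k


# Overall NHL result as reported above: RR = 1.03, 95% CI 0.62 to 1.70.
se = log_se_from_ci(0.62, 1.70)
z_doubling = z_against(1.03, 2.0, se)
print(f"SE of log RR: {se:.3f}")
print(f"z-score of the estimate against RR = 2.0: {z_doubling:.2f}")

# Hypothetical: 20 independent end points, each tested at 0.05.
fwer = familywise_error(0.05, 20)
print(f"Family-wise error rate for 20 tests: {fwer:.2f}")
```

A z-score well beyond two in magnitude is what "rather unlikely" looks like on this scale. The published figures are rounded, so the output is approximate.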

If the multiple endpoint testing were not sufficient to dissuade a scientist, intent on supporting the Pritchards’ claims, then the exposure subgroup analysis would have scared any prudent scientist away from supporting the plaintiffs’ claims.  The Lee study authors provided two different exposure-response analyses, one with lifetime exposure and the other with an intensity-weighted exposure, both in quartiles.  Neither analysis revealed an exposure-response trend.  For the lifetime exposure-response trend, the Lee study reported an NHL RR of 1.01, for the highest quartile of chlorpyrifos exposure. For the intensity-weighted analysis, for the highest quartile, the authors reported RR = 1.61, with a 95% confidence interval, 0.74 to 3.53.

Although the defense and the district court did not call out Omalu on his fantasy statistical inference, the district judge certainly appreciated that Omalu had no statistically significant associations between chlorpyrifos and NHL, to support his opinion. Given the weakness of relying upon a single epidemiologic study (and torturing the data therein), the district court believed that a showing of statistical significance was important to give some credibility to Omalu’s claims.  705 F.Supp. 2d at 486 (citing General Elec. Co. v. Joiner, 522 U.S. 136, 144-46 (1997);  Soldo v. Sandoz Pharm. Corp., 244 F.Supp. 2d 434, 449-50 (W.D. Pa. 2003)).

Figure 3 adapted from Lee

What to do when there is really no evidence supporting a claim?  Make up stuff.  Here is how the trial court describes Omalu’s declaration opposing exclusion:

 “Dr. Omalu interprets and recalculates the findings in the 2004 Lee Study, finding that ‘an 80% confidence interval for the highly-exposed applicators in the 2004 Lee Study spans a relative risk range for NHL from slightly above 1.0 to slightly above 2.5.’ Dr. Omalu concludes that ‘this means that there is a 90% probability that the relative risk within the population studied is greater than 1.0’.”

705 F.Supp. 2d at 481 (internal citations omitted); see also id. at 488. The calculations and the rationale for an 80% confidence interval were not provided, but plaintiffs’ counsel assured Judge Fischer at oral argument that the calculation was done using high school math. Id. at 481 n.12. Judge Fischer seemed unimpressed, especially given that there was no record of the calculation.  Id. at 481, 488.
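Since no calculation was put in the record, the following is only a guess at what such "high school math" might look like. It takes the highest-quartile, intensity-weighted result quoted above (RR = 1.61, 95% CI 0.74 to 3.53), assumes a Wald interval on the log scale, and re-expresses the interval at another confidence level. The function name is hypothetical, and nothing here reproduces Omalu's actual method, which remains unknown.

```python
import math
from statistics import NormalDist


def rescale_interval(rr_hat, lower95, upper95, level):
    """Re-express a reported 95% Wald interval for a relative risk at another level.

    Assumes the interval is symmetric around log(rr_hat) on the log scale.
    """
    z95 = NormalDist().inv_cdf(0.975)
    z_new = NormalDist().inv_cdf(0.5 + level / 2.0)
    se_log = (math.log(upper95) - math.log(lower95)) / (2.0 * z95)
    centre = math.log(rr_hat)
    return (math.exp(centre - z_new * se_log),
            math.exp(centre + z_new * se_log))


low80, high80 = rescale_interval(1.61, 0.74, 3.53, 0.80)
print(f"80% interval: {low80:.2f} to {high80:.2f}")

# Sanity check: asking for 95% should roughly give back the published limits.
low95, high95 = rescale_interval(1.61, 0.74, 3.53, 0.95)
print(f"95% interval: {low95:.2f} to {high95:.2f}")
```

Because the published estimate and limits are rounded, the recomputed limits are approximate. Narrowing the confidence level shrinks the interval but does not change what an interval is: a statement about the long-run coverage of the procedure.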

The larger offense, however, was that Omalu’s interpretation of the 80% confidence interval as a probability statement of the true relative risk’s exceeding 1.0, was bogus. Dr. Omalu further displayed his lack of statistical competence when he attempted to defend his posterior probability derived from his 80% confidence interval by referring to a power calculation of a different disease in the Lee study:

“He [Omalu] further declares that ‘the authors of the 2004 Lee Study themselves endorse the probative value of a finding of elevated risk with less than a 95% confidence level when they point out that “this analysis had a 90% statistical power to detect a 1.5–fold increase in lung cancer incidence.”’”

Id. at 488 (court’s quoting of Omalu’s quoting from the Lee study). To quote Wolfgang Pauli, Omalu is so far off that he is “not even wrong.” Lee and colleagues were offering a pre-study power calculation, which they used to justify their looking at the cohort for lung cancer, not NHL, outcomes.  Lee at 1787. The power calculation does not apply to the data observed for lung cancer; and the calculation has absolutely nothing to do with NHL. The power calculation certainly has nothing to do with Omalu’s misguided attempt to offer a calculation of a posterior probability for NHL based upon a subgroup confidence interval.

Given that there were epidemiologic studies available, Judge Fischer noted that expert witnesses were obligated to factor such studies into their opinions. See 705 F.Supp. 2d at 483 (citing Soldo, 244 F.Supp. 2d at 532).  Omalu’s sins against Rule 702 included his failure to consider any studies other than the Lee study, regardless of how unsupportive the Lee study was of his opinion.  The defense experts pointed to several studies that found lower NHL rates among exposed workers than among controls, and Omalu completely failed to consider and to explain his opinion in the face of the contradictory evidence.  See 705 F.Supp. 2d at 485 (citing Perry v. Novartis Pharm. Corp. 564 F.Supp. 2d 452, 465 (E.D. Pa. 2008)). In other words, Omalu was shown to have been a cherry picker. Id. at 489.

In addition to the abridged epidemiology, Omalu relied upon an analogy between the ethyl-toluene and other solvents that contained benzene rings and benzene itself to argue that these chemicals, supposedly like benzene, cause NHL.  Id. at 487. The analogy was never supported by any citations to published studies, and, of course, the analogy is seriously flawed. Many chemicals, including chemicals made and used by the human body, have benzene rings, without the slightest propensity to cause NHL.  Indeed, the evidence that benzene itself causes NHL is weak and inconsistent.  See, e.g., Knight v. Kirby Inland Marine Inc., 482 F.3d 347 (2007) (affirming the exclusion of Dr. B.S. Levy in a case involving benzene exposure and NHL).

Looking at all the evidence, Judge Fischer found Omalu’s general causation opinions unreliable.  Relying upon a single, statistically non-significant epidemiologic study (Lee), while ignoring contrary studies, was not sound science.  It was not even science; it was courtroom rhetoric.

Omalu’s approach to specific causation, the identification of what caused Mr. Pritchard’s NHL, was equally spurious. Omalu purportedly conducted a “differential diagnosis” or a “differential etiology,” but he never examined Mr. Pritchard; nor did he conduct a thorough evaluation of Mr. Pritchard’s medical records. 705 F.Supp. 2d at 491. Judge Fischer found that Omalu had not conducted a thorough differential diagnosis, and that he had made no attempt to rule out idiopathic or unknown causes of NHL, despite the general absence of known causes of NHL. Id. at 492. The one study identified by Omalu reported a non-statistically significant 60% increase in NHL risk, for a subgroup in one of two different exposure-response analyses.  Although Judge Fischer treated the relative risk less than two as a non-dispositive factor in her decision, she recognized that

“The threshold for concluding that an agent was more likely than not the cause of an individual’s disease is a relative risk greater than 2.0… . When the relative risk reaches 2.0, the agent is responsible for an equal number of cases of disease as all other background causes. Thus, a relative risk of 2.0 … implies a 50% likelihood that an exposed individual’s disease was caused by the agent. A relative risk greater than 2.0 would permit an inference that an individual plaintiff’s disease was more likely than not caused by the implicated agent.”

Id. at 485-86 (quoting from Reference Manual on Scientific Evidence at 384 (2d ed. 2000)).
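The reasoning in the quoted passage reduces to a one-line formula, the attributable fraction among the exposed, (RR - 1) / RR. The sketch below simply evaluates it for a few values. It assumes, as the quoted passage implicitly does, that the relative risk is an unbiased estimate and that it applies evenly to every exposed individual; the function name is made up for illustration.

```python
def probability_of_causation(rr):
    """Attributable fraction among the exposed, (RR - 1) / RR, floored at zero."""
    if rr <= 0:
        raise ValueError("relative risk must be positive")
    return max(0.0, (rr - 1.0) / rr)


for rr in (1.0, 1.61, 2.0, 3.0):
    print(f"RR = {rr:.2f} -> attributable fraction = {probability_of_causation(rr):.3f}")
```

At RR = 2.0 the fraction is exactly one half, which is why a relative risk greater than 2.0 is the figure tied to "more likely than not." The subgroup figure of 1.61 from the Lee study, even taken at face value, falls short of that mark.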

Left with nowhere to run, plaintiffs’ counsel swung for the bleachers by arguing that the federal court, sitting in diversity, was required to apply Pennsylvania law of evidence because the standards of Rule 702 constitute “substantive,” not procedural law. The argument, which had been previously rejected within the Third Circuit, was as legally persuasive as Omalu’s scientific opinions.  Judge Fischer excluded Omalu’s proffered opinions and granted summary judgment to the defendants. The Third Circuit affirmed in a per curiam decision. 430 Fed. Appx. 102, 2011 WL 2160456 (3d Cir. 2011).

Practical Evaluation of Scientific Claims

The evaluative process that took place in the Pritchard case missed some important details and some howlers committed by Dr. Omalu, but it was more than good enough for government work. The gatekeeping decision in Pritchard was nonetheless the target of criticism in a recent book.

Kristin Shrader-Frechette (S-F) is a professor of science who wants to teach us how to expose bad science. S-F has published, or will soon publish, a book that suggests that philosophy of science can help us expose “bad science.”  See Kristin Shrader-Frechette, Tainted: How Philosophy of Science Can Expose Bad Science (Oxford U.P. 2014) [cited below as Tainted; selections available on Google books]. S-F’s claim is intriguing, as is her move away from the demarcation problem to the difficult business of evaluation and synthesis of scientific claims.

In her introduction, S-F tells us that her book shows “how practical philosophy of science” can counteract biased studies done to promote special interests and PROFITS.  Tainted at 8. Refreshingly, S-F identifies special-interest science, done for profit, as including “individuals, industries, environmentalists, labor unions, or universities.” Id. The remainder of the book, however, appears to be a jeremiad against industry, with a blind eye towards the litigation industry (plaintiffs’ bar) and environmental zealots.

The book promises to address “public concerns” in practical, jargon-free prose. Id. at 9-10. Some of the aims of the book are to provide support for “rejecting demands for only human evidence to support hypotheses about human biology (chapter 3), avoiding using statistical-significance tests with observational data (chapter 12), and challenging use of pure-science default rules for scientific uncertainty when one is doing welfare-affecting science (chapter 14).”

Id. at 10. Hmmm.  Avoiding statistical significance tests for observational data?!?  If avoided, what does S-F hope to use to assess random error?

And then S-F refers to plaintiffs’ hired expert witness (from the Milward case), Carl Cranor, as providing “groundbreaking evaluations of causal inferences [that] have helped to improve courtroom verdicts about legal liability that otherwise put victims at risk.” Id. at 7. Whether someone is a “victim” and has been “at risk” turns on assessing causality. Cranor is not a scientist, and his philosophy of science turns on “weight of the evidence” (WOE), a subjective, speculative approach that is deaf, dumb, and blind to scientific validity.

There are other “teasers,” in the introduction to Tainted.  S-F advertises that her Chapter 5 will teach us that “[c]ontrary to popular belief, animal and not human data often provide superior evidence for human-biological hypotheses.”  Tainted at 11. Chapter 6 will show that “[c]ontrary to many physicists’ claims, there is no threshold for harm from exposure to ionizing radiation.” Id.  S-F tells us that her Chapter 7 will criticize “a common but questionable way of discovering hypotheses in epidemiology and medicine—looking at the magnitude of some effect in order to discover causes. The chapter shows instead that the likelihood, not the magnitude, of an effect is the better key to causal discovery.” Id. at 13. Discovering hypotheses: what is that about? You might have thought that hypotheses were framed from observations and then tested.

Which brings us to the trailer for Chapter 8, in which S-F promises to show that “[c]ontrary to standard statistical and medical practice, statistical-significance tests are not causally necessary to show medical and legal evidence of some effect.” Tainted at 11. Again, the teaser raises lots of questions such as what could S-F possibly mean when she says statistical tests are not causally necessary to show an effect.  Later in the introduction, S-F says that her chapter on statistics “evaluates the well-known statistical-significance rule for discovering hypotheses and shows that because scientists routinely misuse this rule, they can miss discovering important causal hypotheses.” Id. at 13. Discovering causal hypotheses is not what courts and regulators must worry about; their task is to establish such hypotheses with sufficient, valid evidence.

Paging through the book reveals a rhetoric that is thick and unremitting, with little philosophy of science or meaningful advice on how to evaluate scientific studies.  The statistics chapter calls out, and lo, it features a discussion of the Pritchard case. See Tainted, Chapter 8, “Why Statistics Is Slippery: Easy Algorithms Fail in Biology.”

The chapter opens with an account of German scientist Fritz Haber’s development of organophosphate pesticides, and the Nazis’ use of related compounds as chemical weapons.  Tainted at 99. Then, in a fevered non-sequitur and rhetorical flourish, S-F states, with righteous indignation, that although the Nazi researchers “clearly understood the causal-neurotoxic effects of organophosphate pesticides and nerve gas,” chemical companies today “claim that the causal-carcinogenic effects of these pesticides are controversial.” Is S-F saying that a chemical that is neurotoxic must be carcinogenic for every kind of human cancer?  So it seems.

Consider the Pritchard case.  Really, the Pritchard case?  Yup; S-F holds up the Pritchard case as her exemplar of what is wrong with civil adjudication of scientific claims.  Despite the promise of jargon-free language, S-F launches into a discussion of how the judges in Pritchard assumed that statistical significance was necessary “to hypothesize causal harm.”  Tainted at 100. In this vein, S-F tells us that she will show that:

“the statistical-significance rule is not a legitimate requirement for discovering causal hypotheses.”

Id. Again, the reader is left to puzzle why statistical significance is discussed in the context of hypothesis discovery, whatever that may be, as opposed to hypothesis testing or confirmation. And whatever it may be, we are warned that “unless the [statistical significance] rule is rejected as necessary for hypothesis-discovery, it will likely lead to false causal claims, questionable scientific theories, and massive harm to innocent victims like Robert Pritchard.”

Id. S-F is decidedly not adverting to Mr. Pritchard’s victimization by the litigation industry and the likes of Dr. Omalu, although she should. S-F not only believes that the judges in Pritchard bungled their gatekeeping, she knows that Dr. Omalu was correct, and the defense experts wrong, and that Pritchard was a victim of Dursban and of questionable scientific theories that were used to embarrass Omalu and his opinions.

S-F promised to teach her readers how to evaluate scientific claims and detect “tainted” science, but all she delivers here is an ipse dixit.  There is no discussion of the actual measurements, extent of random error, or threats to validity, for studies cited either by the plaintiffs or the defendants in Pritchard.  To be sure, S-F cites the Lee study in her endnotes, but she never provides any meaningful discussion of that study or any other that has any bearing on chlorpyrifos and NHL.  S-F also cited two review articles, the first of which provides no support for her ipse dixit:

“Although mutagenicity and chronic animal bioassays for carcinogenicity of chlorpyrifos were largely negative, a recent epidemiological study of pesticide applicators reported a significant exposure response trend between chlorpyrifos use and lung and rectal cancer. However, the positive association was based on small numbers of cases, i.e., for rectal cancer an excess of less than 10 cases in the 2 highest exposure groups. The lack of precision due to the small number of observations and uncertainty about actual levels of exposure warrants caution in concluding that the observed statistical association is consistent with a causal association. This association would need to be observed in more than one study before concluding that the association between lung or rectal cancer and chlorpyrifos was consistent with a causal relationship.

There is no evidence that chlorpyrifos is hepatotoxic, nephrotoxic, or immunotoxic at doses less than those that cause frank cholinesterase poisoning.”

David L. Eaton, Robert B. Daroff, Herman Autrup, James Bridges, Patricia Buffler, Lucio G. Costa, Joseph Coyle, Guy McKhann, William C. Mobley, Lynn Nadel, Diether Neubert, Rolf Schulte-Hermann, and Peter S. Spencer, “Review of the Toxicology of Chlorpyrifos With an Emphasis on Human Exposure and Neurodevelopment,” 38 Critical Reviews in Toxicology 1, 5-6 (2008).

The second cited review article was written by clinical ecology zealot[1], William J. Rea. William J. Rea, “Pesticides,” 6 Journal of Nutritional and Environmental Medicine 55 (1996). Rea’s article does not appear in PubMed.

Shrader-Frechette’s Criticisms of Statistical Significance Testing

What is the statistical significance against which S-F rails? She offers several definitions, none of which is correct or consistent with the others.

“The statistical-significance level p is defined as the probability of the observed data, given that the null hypothesis is true.”8

Tainted at 101 (citing D. H. Johnson, “What Hypothesis Tests Are Not,” 16 Behavioral Ecology 325 (2004)). Well, not quite; attained significance probability is the probability of data observed or those more extreme, given the null hypothesis.  A Tainted definition.
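The difference between the two definitions is easy to see with a toy example that has nothing to do with pesticides. The sketch below uses a fair-coin null hypothesis and a hypothetical result of 8 heads in 10 tosses, and compares the probability of exactly the observed data with the probability of the observed data or data more extreme (here, one-sided). The function names are illustrative only.

```python
from math import comb


def prob_exactly(k, n, p=0.5):
    """Probability of exactly k successes in n trials under the null value p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def p_value_upper(k, n, p=0.5):
    """One-sided p-value: probability of k or more successes under the null."""
    return sum(prob_exactly(i, n, p) for i in range(k, n + 1))


observed_heads, tosses = 8, 10
point_prob = prob_exactly(observed_heads, tosses)
tail_prob = p_value_upper(observed_heads, tosses)
print(f"P(exactly {observed_heads} heads | fair coin) = {point_prob:.4f}")
print(f"P({observed_heads} or more heads | fair coin) = {tail_prob:.4f}")
```

In this example the two numbers land on opposite sides of 0.05, so the choice of definition is not a quibble.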

Later in Chapter 8, S-F discusses significance probability in a way that overtly commits the transposition fallacy, not a good thing to do in a book that sets out to teach how to evaluate scientific evidence:

“However, typically scientists view statistical significance as a measure of how confidently one might reject the null hypothesis. Traditionally they have used a 0.05 statistical-significance level, p < or = 0.05, and have viewed the probability of a false-positive (incorrectly rejecting a true null hypothesis), or type-1, error as 5 percent. Thus they assume that some finding is statistically significant and provides grounds for rejecting the null if it has at least a 95-percent probability of not being due to chance.”

Tainted at 101. Not only does the last sentence ignore the extent of error due to bias or confounding, it erroneously assigns a posterior probability that is the complement of the significance probability.  This error is not an isolated occurrence; here is another example:

“Thus, when scientists used the rule to examine the effectiveness of St. John’s Wort in relieving depression,14 or when they employed it to examine the efficacy of flutamide to treat prostate cancer,15 they concluded the treatments were ineffective because they were not statistically significant at the 0.05 level. Only at p < or = 0.14 were the results statistically significant. They had an 86-percent chance of not being due to chance.”16

Tainted at 101-02 (citing papers by Shelton (endnote 14)[2], by Eisenberger (endnote 15) [3], and Rothman’s text (endnote 16)[4]). Although Ken Rothman has criticized the use of statistical significance tests, his book surely does not interpret a p-value of 0.14 as an 86% chance that the results were not due to chance.
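Why the complement of a significance level is not the chance that a result is real can be shown with Bayes’ rule. The sketch below is a simplification: it treats the test as a bare accept/reject decision at a threshold alpha, and the prior probabilities and power values are hypothetical numbers chosen for illustration, not figures from any study discussed here.

```python
def prob_null_given_significant(prior_null, alpha, power):
    """P(null is true | result significant at level alpha), by Bayes' rule.

    prior_null: assumed prior probability that the null hypothesis is true
    alpha:      probability of a significant result when the null is true
    power:      probability of a significant result when the null is false
    """
    false_positives = alpha * prior_null
    true_positives = power * (1.0 - prior_null)
    return false_positives / (false_positives + true_positives)


# Hypothetical scenarios, all using a 0.14 threshold.
even_odds = prob_null_given_significant(prior_null=0.5, alpha=0.14, power=0.8)
long_shot = prob_null_given_significant(prior_null=0.9, alpha=0.14, power=0.5)
print(f"Even prior odds, 80% power: P(null | significant) = {even_odds:.2f}")
print(f"Null favored 9 to 1, 50% power: P(null | significant) = {long_shot:.2f}")
```

The answer moves with the prior and the power, neither of which a p-value supplies. Reading 0.14 as a 14% chance that the null is true, or 86% that it is false, requires inputs the significance test never had.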

Although S-F previously stated that statistical significance is interpreted as the probability that the null is true, she actually goes on to correct the mistake, sort of:

“Requiring the statistical-significance rule for hypothesis-development also is arbitrary in presupposing a nonsensical distinction between a significant finding if p = 0.049, but a nonsignificant finding if p = 0.051.26 Besides, even when one uses a 90-percent (p < or = 0.10), an 85-percent (p < or = 0.15), or some other confidence level, it still may not include the null point. If not, these other p values also show the data are consistent with an effect. Statistical-significance proponents thus forget that both confidence levels and p values are measures of consistency between the data and the null hypothesis, not measures of the probability that the null is true. When results do not satisfy the rule, this means merely that the null cannot be rejected, not that the null is true.”

Tainted at 103.

S-F repeats some criticisms of significance testing, most of which involve their own misunderstandings of the concept.  It hardly suffices to argue that evaluating the magnitude of random error is worthless because it does not measure the extent of bias and confounding.  The flaw lies in those who would interpret the p-value as the sole measure of error involved in a measurement.

S-F takes the criticisms of significance probability to be sufficient to justify an alternative approach: evaluating causal hypotheses “on a preponderance of evidence,47 whether effects are more likely than not.”[5] Her citations, however, do not support the notion that an overall assessment of the causal hypothesis is a true alternative to statistical testing, but rather only a later step in the causal assessment, which presupposes the previous elimination of random variability in the observed associations.

S-F compounds her confusion by claiming that this purported alternative is superior to significance testing or any evaluation of random variability, and by noting that juries in civil cases must decide causal claims on the preponderance of the evidence, not on attained significance probabilities:

“In welfare-affecting areas of science, a preponderance-of-evidence rule often is better than a statistical-significance rule because it could take account of evidence based on underlying mechanisms and theoretical support, even if evidence did not satisfy statistical significance. After all, even in US civil law, juries need not be 95 percent certain of a verdict, but only sure that a verdict is more likely than not. Another reason for requiring the preponderance-of-evidence rule, for welfare-related hypothesis development, is that statistical data often are difficult or expensive to obtain, for example, because of large sample-size requirements. Such difficulties limit statistical-significance applicability.”

Tainted at 105-06. S-F’s assertion that juries need not have 95% certainty in their verdict is either a misunderstanding or a misrepresentation of the meaning of a confidence interval, and a conflation of two very different kinds of probability or certainty.  S-F invites a reading that commits the transposition fallacy by confusing the probability involved in a confidence interval with that involved in a posterior probability.  S-F’s claim that sample size requirements often limit the ability to use statistical significance evaluations is obviously highly contingent upon the facts of the case, but in civil cases, such as Pritchard, this limitation is rarely at play.  Of course, if the sample size is too small to evaluate the role of chance, then a scientist should probably declare the evidence too fragile to support a causal conclusion.
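The transposition fallacy can be made concrete with a few lines of arithmetic. The sketch below is only an illustration: the prior probabilities, the 5% significance level, and the 80% power are hypothetical inputs chosen for the example, and the function name is invented here. It applies Bayes' theorem to show that the probability of a significant result given a true null is a different quantity from the probability that the null is true given a significant result.

```python
def posterior_null_given_significant(prior_null: float, alpha: float, power: float) -> float:
    """Return P(null true | significant result) by Bayes' theorem.

    alpha is P(significant | null true); power is P(significant | null false).
    All inputs are hypothetical illustration values, not estimates from any study.
    """
    false_alarm = alpha * prior_null            # null true, yet result significant
    true_signal = power * (1.0 - prior_null)    # null false, and result significant
    return false_alarm / (false_alarm + true_signal)


# Same alpha and power, two different (assumed) prior probabilities for the null.
for prior in (0.5, 0.9):
    post = posterior_null_given_significant(prior, alpha=0.05, power=0.8)
    print(f"prior P(null) = {prior:.1f} -> P(null | significant) = {post:.3f}")
```

With these assumed inputs the posterior probability of the null is roughly 6% in the first case and 36% in the second, although the significance level is 5% in both. The posterior moves with the prior; the significance probability does not, which is why the one cannot be read as the other.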

S-F also postulates that a posterior probability rather than a significance probability approach would “better counteract conflicts of interest that sometimes cause scientists to pay inadequate attention to public-welfare consequences of their work.” Tainted at 106. This claim is a remarkable assertion, which is not supported by any empirical evidence.  The varieties of evidence that go into an overall assessment of a causal hypothesis are often quantitatively incommensurate.  The so-called preponderance-of-the-evidence described by S-F is often little more than a subjective overall assessment of weight of the evidence.  The approving citations to the work of Carl Cranor support interpreting S-F to endorse this subjective, anything-goes approach to weight of the evidence.  As for WOE eliminating inadequate attention to “public welfare,” S-F’s citations actually suggest the opposite. S-F’s citations to the 1961 reviews by Wynder and by Little illustrate how subjective narrative reviews can be, with diametrically opposed results.  Rather than curbing conflicts of interest, these subjective, narrative reviews illustrate how contrary results may be obtained by the failure to pre-specify criteria of validity, and inclusion and exclusion of admissible evidence. Still, S-F asserts that “up to 80 percent of welfare-related statistical studies have false-negative or type-II errors, failing to reject a false null.” Tainted at 106. The support for this assertion is a citation to a review article by David Resnik. See David Resnik, “Statistics, Ethics, and Research: An Agenda for Education and Reform,” 8 Accountability in Research 163, 183 (2000). Resnik’s paper is a review article, not an empirical study, but at the page cited by S-F, Resnik in turn cites to well-known papers that present actual data:

“There is also evidence that many of the errors and biases in research are related to the misuses of statistics. For example, Williams et al. (1997) found that 80% of articles surveyed that used t-tests contained at least one test with a type II error. Freiman et al. (1978)  * * *  However, empirical research on statistical errors in science is scarce, and more work needs to be done in this area.”

Id. The papers cited by Resnik, Williams (1997)[6] and Freiman (1978)[7], did identify previously published studies that over-interpreted statistically non-significant results, but the identified type-II errors were potential errors, not ascertained errors, because the authors made no claim that every non-statistically significant result actually represented a missed true association. In other words, S-F is not entitled to say that these empirical reviews actually identified failures to reject false null hypotheses. Furthermore, the empirical analyses in the studies cited by Resnik, who was in turn cited by S-F, did not look at correlations between alleged conflicts of interest and statistical errors. The cited research calls for greater attention to proper interpretation of statistical tests, not for their abandonment.

In the end, at least in the chapter on statistics, S-F fails to deliver much if anything on her promise to show how to evaluate science from a philosophic perspective.  Her discussion of the Pritchard case is not an analysis; it is a harangue. There are certainly more readable, accessible, scholarly, and accurate treatments of the scientific and statistical issues in this book.  See, e.g., Michael B. Bracken, Risk, Chance, and Causation: Investigating the Origins and Treatment of Disease (2013).


[1] Not to be confused with the deceased federal judge by the same name, William J. Rea. William J. Rea, 1 Chemical Sensitivity – Principles and Mechanisms (1992); 2 Chemical Sensitivity – Sources of Total Body Load (1994),  3 Chemical Sensitivity – Clinical Manifestation of Pollutant Overload (1996), 4 Chemical Sensitivity – Tools of Diagnosis and Methods of Treatment (1998).

[2] R. C. Shelton, M. B. Keller, et al., “Effectiveness of St. John’s Wort in Major Depression,” 285 Journal of the American Medical Association 1978 (2001).

[3] M. A. Eisenberger, B. A. Blumenstein, et al., “Bilateral Orchiectomy With or Without Flutamide for Metastic [sic] Prostate Cancer,” 339 New England Journal of Medicine 1036 (1998).

[4] Kenneth J. Rothman, Epidemiology 123–127 (NY 2002).

[5] Endnote 47 references the following papers: E. Hammond, “Cause and Effect,” in E. Wynder, ed., The Biologic Effects of Tobacco 193–194 (Boston 1955); E. L. Wynder, “An Appraisal of the Smoking-Lung-Cancer Issue,” 264 New England Journal of Medicine 1235 (1961); see C. Little, “Some Phases of the Problem of Smoking and Lung Cancer,” 264 New England Journal of Medicine 1241 (1961); J. R. Stutzman, C. A. Luongo, and S. A. McLuckey, “Covalent and Non-Covalent Binding in the Ion/Ion Charge Inversion of Peptide Cations with Benzene-Disulfonic Acid Anions,” 47 Journal of Mass Spectrometry 669 (2012). Although the paper on ionic charges of peptide cations is unfamiliar, the other papers do not eschew traditional statistical significance testing techniques. By the time these early (1961) reviews were written, the association that was reported between smoking and lung cancer was clearly accepted as not likely explained by chance.  Discussion focused upon bias and potential confounding in the available studies, and the lack of animal evidence for the causal claim.

[6] J. L. Williams, C. A. Hathaway, K. L. Kloster, and B. H. Layne, “Low power, type II errors, and other statistical problems in recent cardiovascular research,” 42 Am. J. Physiology Heart & Circulation Physiology H487 (1997).

[7] Jennie A. Freiman, Thomas C. Chalmers, Harry Smith and Roy R. Kuebler, “The importance of beta, the type II error and sample size in the design and interpretation of the randomized control trial: survey of 71 ‛negative’ trials,” 299 New Engl. J. Med. 690 (1978).

Contra Parascandola’s Reduction of Specific Causation to Risk

August 22nd, 2014

Mark Parascandola is a photographer who splits his time between Washington, DC, and Almeria, Spain.  Before his career in photography, Parascandola studied philosophy (Cambridge), and did graduate work in epidemiology (Johns Hopkins, MPH). From 1997 to 1998, he studied the National Cancer Institute’s role in determining that smoking causes some kinds of cancer.  He went on to serve as a staff epidemiologist at NCI, at its Tobacco Control Research Branch, in the Division of Cancer Control and Population Sciences (DCCPS).

Back in the 1990s, Parascandola wrote an article, which is a snapshot and embellishment of arguments given by Sander Greenland, on the use and alleged abuse of relative risks to derive a “probability of causation.” See Mark Parascandola, “What’s Wrong with the Probability of Causation?” 39 Jurimetrics J. 29 (1998)[cited here as Parascandola]. Parascandola’s article is a locus of arguments that have recurred from time to time, and it is worth revisiting.

Parascandola offers an interesting historical factoid, which is a useful reminder to those who suggest that the RR > 2 argument was the brainchild of lawyers:  The argument was first suggested in 1959, by Dr. Victor P. Bond, a physician with expertise in medical physics at the Brookhaven National Laboratory.  See Parascandola at 31 n. 6 (citing Victor P. Bond, The Medical Effects of Radiation (1960), reprinted in NACCA 13th Annual Convention 1959, at 126 (1960)).

Unfortunately, Parascandola is a less reliable reporter when it comes to the judicial use of the relative risk greater than two (RR > 2) argument.  He argues that Judge Jack Weinstein opposed the RR > 2 argument on policy grounds, when in fact, Judge Weinstein rejected the anti-probabilistic argument that probabilistic inference could never establish specific causation, and embraced the RR > 2 argument as a logical policy compromise that would allow evidence of risk to substitute for specific causation in a limited fashion. Parascandola at 33-34 & n.20. Given Judge Weinstein’s many important contributions to tort and procedural law, and the importance of the Agent Orange litigation, it is worth describing Judge Weinstein’s views accurately. See In re Agent Orange Product Liab. Litig., 597 F. Supp. 740, 785, 817, 836 (E.D.N.Y. 1984) (“A government administrative agency may regulate or prohibit the use of toxic substances through rulemaking, despite a very low probability of any causal relationship.  A court, in contrast, must observe the tort law requirement that a plaintiff establish a probability of more than 50% that the defendant’s action injured him. … This means that at least a two-fold increase in incidence of the disease attributable to Agent Orange exposure is required to permit recovery if epidemiological studies alone are relied upon.”), aff’d 818 F.2d 145, 150-51 (2d Cir. 1987)(approving district court’s analysis), cert. denied sub nom. Pinkney v. Dow Chemical Co., 487 U.S. 1234 (1988); see also In re “Agent Orange” Prod. Liab. Litig., 611 F. Supp. 1223, 1240, 1262 (E.D.N.Y. 1985)(excluding plaintiffs’ expert witnesses), aff’d, 818 F.2d 187 (2d Cir. 1987), cert. denied, 487 U.S. 1234 (1988).[1]

Parascandola’s failure to cite and describe Judge Weinstein’s views raises some question of the credibility of his analyses, and his assertion that “[he] will demonstrate that the PC formula is invalid in many situations and cannot fill the role it is given.” Parascandola at 30 (emphasis added).

Parascandola describes the basic arithmetic of probability of causation (PC) in terms of a disease for which we “expect cases” and for which we have “excess cases.” The rate of observed cases in an exposed population divided by the rate of expected cases in an unexposed population provides an estimate of the population relative risk (RR). The excess cases can be obtained simply from the difference between observed cases in the exposed group and the expected cases in the unexposed group.  The attributable fraction is the ratio of excess cases to total cases.

The probability of causation “PC” = 1 – (1/RR).
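The step from the attributable fraction to this formula, and from the formula to the RR > 2 threshold, can be written out in a few lines. The symbols R_e (rate of disease among the exposed) and R_u (rate among the unexposed) are labels introduced here for illustration; they are not Parascandola's notation, and the derivation assumes RR > 1.

```latex
\begin{align*}
RR &= \frac{R_e}{R_u}, \\[4pt]
PC &= \frac{R_e - R_u}{R_e} \;=\; 1 - \frac{R_u}{R_e} \;=\; 1 - \frac{1}{RR}, \\[4pt]
PC > \tfrac{1}{2} &\iff \frac{1}{RR} < \tfrac{1}{2} \iff RR > 2 .
\end{align*}
```

On these definitions, the "more likely than not" standard and a relative risk greater than two are the same condition stated in two ways.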

Heterogeneity Yields Uncertainty Argument

The RR describes a group statistic, and an individual’s altered risk will almost certainly not be exactly equal to the group’s average risk. Parascandola notes that sometimes this level of uncertainty can be remedied by risk measurements for subgroups that better fit an individual plaintiff’s characteristics.  All true, but this is hardly an argument against RR > 2.  At best, the heterogeneity argument is an expression of inference skepticism of the sort that led Judge Weinstein to accept RR > 2 as a reasonable compromise. The presence of heterogeneity of this sort simply increases the burden upon plaintiff to provide RR statistics from studies that very tightly resemble plaintiff in terms of exposure and other characteristics.

Urning for Probabilistic Certainty

Parascandola describes how the PC formula arises from a consideration of the “urn model” of disease causation.  Suppose that in a group of sufficient size, 200 stomach cancer cases were expected within a certain time, but 300 were observed. We can model the situation with an urn of 300 marbles, 200 of which are green, and 100 of which are red. Blindfolded or colorblind, we pull a single marble from the urn, and we have only a 1/3 chance of obtaining a red, “excess” marble case. Parascandola at 36-37 (borrowing from David Kaye, “The Limits of the Preponderance of the Evidence Standard: Justifiably Naked Statistical Evidence and Multiple Causation,” 7 Am. Bar Fdtn. Res. J. 487, 501 (1982)).
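The urn arithmetic is easy to check. The following is a minimal sketch, with function names invented for the illustration, that computes the chance of drawing an "excess" marble directly from the counts and confirms that it agrees with 1 – (1/RR). It assumes, as the urn model does, that expected and observed cases refer to groups of the same size, so that the ratio of counts stands in for the ratio of rates.

```python
from fractions import Fraction


def relative_risk(observed: int, expected: int) -> Fraction:
    """Ratio of observed to expected cases (equal-sized groups assumed)."""
    if observed <= 0 or expected <= 0:
        raise ValueError("case counts must be positive")
    return Fraction(observed, expected)


def excess_share(observed: int, expected: int) -> Fraction:
    """Chance that a case drawn at random from the urn is an 'excess' case."""
    if observed <= 0 or expected <= 0:
        raise ValueError("case counts must be positive")
    excess = max(observed - expected, 0)   # red marbles
    return Fraction(excess, observed)      # red marbles / all marbles


# The figures from the text: 200 expected cases, 300 observed cases.
observed, expected = 300, 200
rr = relative_risk(observed, expected)
pc_from_urn = excess_share(observed, expected)
pc_from_formula = 1 - 1 / rr               # PC = 1 - (1/RR)

print("RR =", rr)
print("PC from urn counts =", pc_from_urn)
print("PC from formula    =", pc_from_formula)
```

Exact fractions are used so that the two routes to the answer can be compared without rounding. Doubling the expected count (400 observed against 200 expected) puts exactly half the marbles in the excess pile, which is the RR = 2 boundary.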

Parascandola argues that the urn model is not necessarily correct.  Causation cannot always be reduced to a single cause. Complex etiologic mechanisms and pathways are common.  Interactions between and among causes frequently occur.  Biological phenomena are sometimes “over-determined.” Parascandola asks us to assume that some of the non-excess cases are also “etiologic cases,” which were caused by the exposure but which would have occurred anyway, even without the exposure.  Id. at 37. Borrowing from Greenland, Parascandola asserts that “[a]ll excess cases are etiologic cases, but not vice versa.” Id. at 38 & n.37 (quoting from Sander Greenland & James M. Robins, “Conceptual Problems in the Definition and Interpretation of Attributable Fractions,” 128 Am. J. Epidem. 1185, 1185 (1988)).

Parascandola’s argument, if accepted, proves too much to help plaintiffs who hope to establish specific causation with evidence of increased risk. His argument posits a different, more complex model of causation, for which plaintiffs usually have no evidence.  (If they did have such evidence, then they would have nothing to fear in the assumptions of the simplistic urn model; they could rebut those assumptions.) Parascandola’s argument pushes the speculation envelope by asking us to believe that some “non-excess” cases are etiologic cases, but providing no basis for identifying which ones they are.  Unless and until such evidence is forthcoming, Parascandola’s argument is simply uncontrolled multi-leveled conjecture.

Again borrowing from Sander Greenland’s speculation, Parascandola advances a variant of the argument above by suggesting that an exposure may not increase the overall number of excess cases, but that it may accelerate the onset of the harm in question. While it is true that the element of time is important, both in law and in life, the invoked speculation can be, and usually is, tested by time windows or time series analyses in observational epidemiology and clinical trials.  The urn model is “flat” with respect to the temporal dimension, but if plaintiffs want to claim acceleration, then they should adduce Kaplan-Meier curves and the like.  But again, with the modification of the time dimension, plaintiffs will still need hazard ratios or other risk ratios greater than two to make out their case, unless there is a biomarker/fingerprint of individual causation. The introduction of the temporal element is important to an understanding of risk, but Parascandola’s argument does not help transmute evidence of risk in a group to causation in an individual.

Joint Chancy Causation

In line with his other speculative arguments, Parascandola asks:  what if a given cancer in the exposed group is the product of two causes rather than due to one or another of the two causes? Parascandola at 40. This question restates the speculative argument in only slightly different terms.  We could multiply the possible causal sets by suggesting that the observed effect resulted from one or the other or both or none of the causes.  Parascandola calls this “joint chancy causation,” but he manages to show only that the inference of causation from prior chance or risk is a very chancy (or dicey) step in his argument.  Parascandola argues that we should not assume that the urn model is true, when multiple causation models are “plausible and consistent” with other causal theories.

Interestingly, this line of argument would raise the burden upon plaintiffs by requiring them to specify the applicable causal model in ways that (1) they often cannot, and (2) they now, under current law, are not required to do.

Conclusion

In the end, Parascandola realizes that he has raised, not lowered, the burden for plaintiffs.  His counter is to suggest, contrary to law and science, that “the existence of alternative hypotheses should not prevent the plaintiff’s case from proceeding.” Parascandola at 41 n.50.  Because he says so. In other words, Parascandola is telling us that irrespective of how poorly established a hypothesis is, or of how speculative an inference is, or of the existence and strength of alternative hypotheses,

“This trial must be tried.”

W.S. Gilbert, Trial by Jury (1875).

With bias of every kind, no doubt.

That is not science, law, or justice.


[1] An interesting failure or lack of peer review in a legal journal.

 

Silicosis, Lung Cancer, and Evidence-Based Medicine in North America

July 4th, 2014

According to her biographies[1], Madge Thurlow Macklin excelled in mathematics, graduated from Goucher College, received a fellowship to study physiology at Johns Hopkins University, and then went on to graduate with honors from the Johns Hopkins Medical School, in 1919.  Along the way, she acquired a husband, Charles C. Macklin, an associate professor of anatomy at Hopkins, and had her first child.

In 1921, the Macklins moved to London, Ontario, to take positions at the University of Western Ontario.  Charles received an appointment as a professor of histology and embryology, and went on to distinguish himself in pulmonary pathology. Madge Macklin received an appointment as a part-time instructor at Western, but faced decades of resistance because of her sex and her marriage to a professor. She was never promoted beyond part-time assistant professor, at Western.

Despite the hostile work environment, Madge Macklin published and lectured on statistical and medical genetics.  Her papers made substantial contributions to the inheritable aspects of human cancer and other diseases.

Macklin advocated tirelessly for the inclusion of medical genetics in the American medical school curriculum. See, e.g., Madge T. Macklin, “Should The Teaching Of Genetics As Applied To Medicine Have A Place In The Medical Curriculum?” 7 J. Ass’n Am. Med. Coll. 368 (1932); “The Teaching of Inheritance of Disease to Medical Students: A Proposed Course in Medical Genetics,” 6 Ann. Intern. Med. 1335 (1933). Her advocacy largely succeeded both in medical education and in the recognition of the importance of genetics for human diseases.

Macklin’s commitment to medical genetics led her to believe that physicians had a social responsibility to engage in sensible genetics counseling, and reasonable guidance on procreation and birth control. In 1930, Macklin helped found the Eugenics Society of Canada, and went on to serve as its Director in 1935. Her writings show none of the grandiosity or pretension of creating a master race; they show instead a concern with avoiding procreation among imbeciles. See, e.g., Madge Macklin, “Genetical Aspects of Sterilization of the Mentally Unfit,” 30 Can. Med. Ass’n J. 190 (1934).

Some of her biographers suggest that Macklin lost her position at Western due to her views on eugenics, and others suggest that her trenchant criticisms of the inequity of the University’s sexism led her to go to Ohio State University in 1946, as a cancer researcher, funded by the National Research Council. Macklin taught genetics at Ohio State, something that Western never permitted her to do. In 1959, three years before her death, Macklin was elected president of the American Society of Human Genetics.

By all accounts, Macklin was an extraordinary woman and a gifted scientist, but my interest in her work stems from her recognition, in the 1930s and 1940s, of the need for greater rigor in drawing etiological inferences in medical science.  Well ahead of her North American colleagues, Macklin emphasized the need to rule out bias, confounding, and chance before accepting apparent associations as causal. She wrote with unusual clarity and strength on the subject, decades before Sir Austin Bradford Hill. Her early mathematical prowess served her well in rebutting case reports and associations that were often embraced uncritically.

 *  *  *  *  *  *  *

In 1939, Professor Max Klotz of the University of Toronto reported a very crude analysis from which he inferred a putative association between silicosis and lung cancer. Max O. Klotz, “The Association of Silicosis and Carcinoma of the Lung,” 35 Am. J. Cancer 38 (1939). Klotz was a pathologist, and he worked with autopsy series, without statistical tools or understanding, as was common at the time. Macklin wrote a thorough refutation, which amply illustrates her abilities and her clear thinking:

“Another type of improper control for analysing cancer data arises through ignoring the fact that every cancer has a specific age incidence, and sex predilection. I have already mentioned breast, uterine and prostatic cancers, but other types of cancer, not of the generative organs,  have marked sex predilection. Cancer of the lung is a good example. It occurs four times as frequently in the male as in the female. If we desire to make any study of causative factors in lung cancer we must be sure that our control group is comparable to our experimental group. Again I will take an example from the literature. A worker was investigating the possible role of silicosis in inducing lung cancer. He compared the incidence of lung cancer in a group of 50 cases of silicosis, and in a large necropsy group of 4500 ‘unselected’ cases from a general hospital. He found that lung cancer was 7 times as frequent in the silicosis group as in the unselected necropsies. This is an excellent example of misunderstanding as to what is meant by ‘random’ sample. Because the 4500 necropsies were ‘unselected’ the worker thought that he had a good control group. As a matter of fact, in order to have a good control, he needed to select very carefully from these 4500 necropsies, those which he was to use as his standard. He forgot two things:

(1) that lung cancer is 4 times as common in the male as in the female and that all his silicosis cases were males, therefore his unselected necropsies should have been highly selected to contain only males. Assuming that half of his 4500 necropsies were females, and that among them one fifth of the lung cancers occurred, one can easily show that had his control group been all males as was his silicosis group, lung cancer would have been only 4.8 times as common among the silicosis patients as among the general necropsy group instead of 7 times as he found it.

(2) The second thing he forgot is that silicosis does not develop until 15 or 20 years of exposure have passed by. That placed all his silicosis patients in the late forties or early fifties, just when lung cancer becomes most common. Many of his general necropsy group were in the age range below 45, hence not in the lung cancer age. He should have selected only those males from the necropsy group who matched the age distribution of his silicosis patients. If he then found a significantly higher percentage of lung cancer among his silicosis patients he could have suggested a relationship between the two. Until that control group is properly studied, his results are valueless.”

****

SUMMARY

* * *

“The second point to be noted is that the control group should correspond as nearly as possible in all respects with the group under investigation, with the single exception of the etiologic factor being investigated. If silicosis is being considered as a causative agent in lung cancer, the control group should be as nearly like the experimental or observed group as possible in sex, age distribution, race, facilities for diagnosis, other possible carcinogenic factors, etc. The only point in which the control group should differ in an ideal study would be that they were not exposed to free silica, whereas the experimental group was. The incidence of lung cancer could then be compared in the two groups of patients.

This necessity is often ignored; and a ‘random’ control group is obtained for comparison on the assumption that any group taken at random is a good group for comparison. Fallacious results based on such studies are discussed briefly.”

Madge Thurlow Macklin, “Pitfalls in Dealing with Cancer Statistics, Especially as Related to Cancer of the Lung,” 14 Diseases Chest 525, 532-33, 529-30 (1948).
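Macklin's first correction, for sex, can be sketched in a few lines. The function below is an illustration only; its name and parameters are invented here, and it uses nothing beyond the simplifying assumptions stated in the quoted passage (half of the control necropsies were female, and one fifth of the control lung cancers occurred among them). Restricting the controls to males raises the control group's lung cancer rate, which in turn shrinks the crude seven-fold ratio.

```python
def male_only_ratio(crude_ratio: float,
                    male_share_of_controls: float,
                    male_share_of_control_cancers: float) -> float:
    """Rescale a crude rate ratio after restricting the control group to males.

    Dropping the females multiplies the control group's cancer rate by
    (male share of control cancers) / (male share of controls),
    so the ratio of the all-male index group to the controls falls by that factor.
    """
    if not 0 < male_share_of_controls <= 1:
        raise ValueError("male share of controls must lie in (0, 1]")
    if not 0 < male_share_of_control_cancers <= 1:
        raise ValueError("male share of control cancers must lie in (0, 1]")
    rate_inflation = male_share_of_control_cancers / male_share_of_controls
    return crude_ratio / rate_inflation


# Assumptions taken from the quoted passage: crude ratio of 7, half of the
# controls male, four fifths of the control lung cancers among the males.
adjusted = male_only_ratio(7.0, male_share_of_controls=0.5,
                           male_share_of_control_cancers=0.8)
print(f"ratio against male-only controls: about {adjusted:.1f}")
```

On those assumptions alone the sketch lands near 4.4 rather than Macklin's 4.8, so her figure presumably rests on details of the necropsy series that the passage does not reproduce. Either way, the direction and rough size of the correction are the same, and the age correction she describes next would shrink the ratio further.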

The recognition that uncontrolled, or improperly controlled, research was worthless was a great advance in thinking about medical causation.  In the 1940s, Macklin was ahead of her time; indeed, if she were alive today, she would be ahead of many contemporary epidemiologists.

——

[1] Barry Mehler, “Madge Thurlow Macklin,” from Barbara Sicherman and Carl Hurd Green, eds., Notable American Women: The Modern Period 451-52 (1980); Laura Lynn Windsor, Women in Medicine: An Encyclopedia 134 (2002).

 

Zoloft MDL Excludes Proffered Testimony of Anick Bérard, Ph.D.

June 27th, 2014

Anick Bérard is a Canadian perinatal epidemiologist at the Université de Montréal.  Bérard was named by plaintiffs’ counsel in the Zoloft MDL to offer an opinion that selective serotonin reuptake inhibitor (SSRI) antidepressants as a class, and Zoloft (sertraline) specifically, cause a wide range of birth defects. Bérard previously testified against GSK about her claim that paroxetine, another SSRI antidepressant, is a teratogen.

Pfizer challenged Bérard’s proffered testimony under Federal Rules of Evidence 104(a), 702, 703, and 403.  Today, the Zoloft MDL transferee court handed down its decision to exclude Dr. Bérard’s testimony at the time of trial.  In re Zoloft (Sertraline Hydrochloride) Prods. Liab. Litig., MDL 2342, Document 979 (June 27, 2014).  The MDL court acknowledged the need to consider the selectivity (“cherry picking”) of studies upon which Dr. Bérard relied, as well as her failure to consider multiple comparisons, ascertainment bias, confounding by indication, and lack of replication of specific findings across the different SSRI medications, and across studies. Interestingly, the MDL court recognized that Dr. Bérard’s critique of studies as “underpowered” was undone by her failure to consider available meta-analyses or to conduct one of her own. The MDL court seemed especially impressed by Dr. Bérard’s having published several papers that rejected a class effect of teratogenicity for all SSRIs, as recently as 2012, while failing to identify anything that was published subsequently that could explain her dramatic change in opinion for litigation.

Differential Etiology and Other Courtroom Magic

June 23rd, 2014

ITERATIVE DISJUNCTIVE SYLLOGISM

Basic propositional logic teaches that the disjunctive syllogism (modus tollendo ponens) is a valid argument, in which one of its premises is a disjunction (P v Q), and the other premise is the negation of one of the disjuncts:

P v Q

~P

_____

∴ Q

See Irving Copi & Carl Cohen, Introduction to Logic at 362 (2005). If we expand the disjunctive premise to more than one disjunction, we can repeat the inference (iteratively), eliminating one disjunct at a time, until we arrive at a conclusion that is a simple, affirmative proposition, without any disjunctions in it.

P v Q v R

~P

_____

∴ Q v R

~Q

_____

∴ R

Hence the term, “iterative disjunctive syllogism.” Fans of Sir Arthur Conan Doyle will recognize that iterative disjunctive syllogism is nothing other than the process of elimination, as explained by Doyle’s fictional detective, Sherlock Holmes. See, e.g., Doyle, The Blanched Soldier (“…when you have eliminated all which is impossible, then whatever remains, however improbable, must be the truth.”); Doyle, The Beryl Coronet (“It is an old maxim of mine that when you have excluded the impossible, whatever remains, however improbable, must be the truth.”); Doyle, The Hound of the Baskervilles (1902) (“We balance probabilities and choose the most likely. It is the scientific use of the imagination.”); Doyle, The Sign of the Four, ch. 6 (1890)(“‘You will not apply my precept’, he said, shaking his head. ‘How often have I said to you that when you have eliminated the impossible, whatever remains, however improbable, must be the truth? We know that he did not come through the door, the window, or the chimney. We also know that he could not have been concealed in the room, as there is no concealment possible. Whence, then, did he come?’”).
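The validity of the iterated argument, and its dependence on eliminating every alternative, can be checked mechanically with a truth table. The sketch below is an illustration with helper names invented here: an argument form is valid when no assignment of truth values makes all the premises true and the conclusion false.

```python
from itertools import product


def is_valid(premises, conclusion, n_vars: int) -> bool:
    """True if no truth assignment satisfies every premise yet falsifies the conclusion."""
    for values in product([True, False], repeat=n_vars):
        if all(premise(*values) for premise in premises) and not conclusion(*values):
            return False   # counterexample found
    return True


# Premises of the iterated syllogism: P v Q v R, ~P, ~Q.
premises = [
    lambda p, q, r: p or q or r,
    lambda p, q, r: not p,
    lambda p, q, r: not q,
]


def conclusion(p, q, r):
    """The conclusion R."""
    return r


# With both alternatives eliminated, R follows.
print("P v Q v R, ~P, ~Q, therefore R:", is_valid(premises, conclusion, 3))

# With only P eliminated, R does not follow: Q may be the true disjunct.
print("P v Q v R, ~P, therefore R:", is_valid(premises[:2], conclusion, 3))
```

The first check reports a valid form and the second an invalid one. The process of elimination delivers its conclusion only when every competing disjunct has actually been negated; leaving even one alternative standing breaks the inference.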

The process of elimination sometimes surfaces in court cases in which expert witnesses attempt to attribute a health outcome in a specific person to that person’s prior environmental, occupational, or lifestyle exposures.  A few general conclusions can be advanced about this mode of reasoning:

1. Differential Etiology NOT Differential Diagnosis

Although courts and expert witnesses sometimes refer to this process of ruling out as “differential diagnosis,” their terminology is a misnomer.  Their usage is not an innocent diction error because diagnosis is almost never involved, and the usage attempts to suggest that the causal attribution is part of a process typically conducted by a treating physician, when in fact, the treating physician rarely determines the actual cause in the person. Etiology is usually not needed to determine the nature of the disease or the proper course of treatment. Biomarkers, other than diagnostic criteria, rarely point to a specific cause(s) in a given case. The “differential diagnosis” misnomer tends to obscure clear reasoning about physician witnesses, who are often not experts in epidemiology or other sciences needed to assess general causation, not familiar with systematic reviews, and not published on the scientific issue of general causation.  The specific causal attribution is analogous to differential diagnosis, in its process of ruling in, and then ruling out, and therefore is sometimes called differential etiology. See, e.g., Michael D. Green, D. Michal Freedman, and Leon Gordis, Reference Guide on Epidemiology 549, 617 & n.211, in Reference Manual on Scientific Evidence (3d ed. 2011)[RMSE].

2. Differential Etiology Assumes, and Cannot Establish, General Causation

The differential etiology process assumes that each disjunct – each putative specific cause – has itself been established as a known cause of the disease in general. Id. at 618 (“Although differential etiologies are a sound methodology in principle, this approach is only valid if general causation exists … .”). In the case of a novel putative cause, the case may give rise to a hypothesis that the putative cause can cause the outcome, in general, and did so in the specific case.  That hypothesis must, of course, then be tested and supported by appropriate analytical methods before it can be accepted for general causation and as a putative specific cause in a particular individual.

3. Differential Etiology Typically Fails When a Substantial Percentage of Cases Are Idiopathic in Origin

When one of the disjuncts is “no known cause,” then it will be virtually impossible to negate and remove from the disjunction. If very few cases have idiopathic causes, the error rate may be low, and tolerable. Take for example, asbestosis, a diffuse interstitial lung disease caused by chronic, excessive inhalation of asbestos.  Clinically asbestosis will look similar to idiopathic pulmonary fibrosis (IPF), a lung disease of unknown origin.  IPF may remain a differential diagnosis in every case because it cannot be ruled out, clinically.  The likelihood of IPF, however, will be relatively low in a cohort of asbestos miners, and thus not a serious source of error.  In a study of household exposure cases, in which the exposure resulted from a family member’s bringing home dust from work, IPF may be a much likelier alternative, and the failure to rule it out may invalidate conclusions about the asbestosis diagnosis in every case in the cohort.

With respect to differential etiology, the same principle applies: the iterative disjunctive syllogism requires ruling out “unknown,” or at least minimizing the number of cases in the unknown disjunct that are not ruled out. See RMSE at 618 (“Although differential etiologies are a sound methodology in principle, this approach is only valid if … a substantial proportion of competing causes are known. Thus, for diseases for which the causes are largely unknown, such as most birth defects, a differential etiology is of little benefit.”)(internal citations omitted). Accordingly, many courts reject proffered expert witness testimony on differential etiology when the witnesses fail to rule out idiopathic causes in the case at issue. What is a substantial proportion? Unfortunately, the RMSE does not attempt to quantify or define “substantial.” The inability to rule out unknown etiologies remains the fatal flaw in much expert witness opinion testimony on specific causation.
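
The iterative disjunctive syllogism, and the way an irreducible “unknown” disjunct defeats it, can be sketched as a toy model in Python. This is only an illustration of the logic described above, not a statement of any court’s test; the names `differential_etiology` and `IDIOPATHIC`, and the example causes, are hypothetical.

```python
IDIOPATHIC = "idiopathic (no known cause)"

def differential_etiology(candidates, established_causes, ruled_out):
    """Toy iterative disjunctive syllogism.

    candidates: putative causes proposed for this case
    established_causes: causes for which general causation is already shown
    ruled_out: causes excluded on the facts of this case
    Returns the single surviving cause, or None if no unique cause survives.
    """
    # Ruling in: only established general causes may enter the disjunction.
    ruled_in = [c for c in candidates if c in established_causes]
    # The "unknown" disjunct is always present unless it is itself ruled out.
    ruled_in.append(IDIOPATHIC)
    remaining = [c for c in ruled_in if c not in ruled_out]
    return remaining[0] if len(remaining) == 1 else None

established = {"smoking", "asbestos"}

# Idiopathic origin cannot be excluded, so no unique cause survives.
print(differential_etiology(["asbestos", "smoking"], established, {"smoking"}))  # None

# Only if idiopathic origin is also excluded does the syllogism conclude.
print(differential_etiology(["asbestos", "smoking"], established,
                            {"smoking", IDIOPATHIC}))  # asbestos
```

A putative cause that is not in `established_causes` never enters the disjunction at all, which mirrors the point that differential etiology assumes, and cannot establish, general causation.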

More Nonsense on Differential Diagnosis

The Supreme Court recently addressed differential etiology in Matrixx Initiatives, in stunningly irrelevant and errant dicta:

“We note that courts frequently permit expert testimony on causation based on evidence other than statistical significance. See, e.g., Best v. Lowe’s Home Centers, Inc., 563 F. 3d 171, 178 (6th Cir 2009); Westberry v. Gislaved Gummi AB, 178 F. 3d 257, 263–264 (4th Cir. 1999) (citing cases); Wells v. Ortho Pharmaceutical Corp., 788 F. 2d 741, 744–745 (11th Cir. 1986). We need not consider whether the expert testimony was properly admitted in those cases, and we do not attempt to define here what constitutes reliable evidence of causation.”

Matrixx Initiatives, Inc. v. Siracusano, 131 S. Ct. 1309, 1319 (2011).  The citation to Wells was clearly wrong in that the plaintiffs in that case had, in fact, relied upon studies that were nominally statistically significant, and so the Wells court could not have held that statistical significance was unnecessary.[1]

The two other cases cited by the Supreme Court, however, were both about “differential diagnosis,” and had nothing to do with statistical significance.  Both cases assumed that general causation was established, and inquired into whether expert witnesses could reasonably attribute the health outcome in the case to the exposures that were established causes of such outcomes.  The Court’s selection of these cases, quite irrelevant to its discussion, appears to have come from the Solicitor General’s amicus brief in Matrixx.[2]

Although cited for an irrelevant proposition, the Supreme Court’s selection of the Best case was puzzling because the Sixth Circuit’s discussion of the issue is particularly muddled. Here is the relevant language from Best:

“[A] doctor’s differential diagnosis is reliable and admissible where the doctor

(1) objectively ascertains, to the extent possible, the nature of the patient’s injury…,

(2) ‘rules in’ one or more causes of the injury using a valid methodology,

and

(3) engages in ‘standard diagnostic techniques by which doctors normally rule out alternative causes’ to reach a conclusion as to which cause is most likely.”

Best v. Lowe’s Home Centers, Inc., 563 F.3d 171, 179, 183-84 (6th Cir. 2009).

Of course, physicians rarely use this iterative process to arrive at causes of diseases in an individual; they use it to identify the disease or disease process that is responsible for the patient’s signs and symptoms. See generally Harold C. Sox, Michael C. Higgins, and Douglas K. Owens, Medical Decision Making (2d ed. 2014). The Best court’s description does not make sense in that it characterizes the process as ruling in “one or more” causes, and then ruling out alternative causes. If an expert had ruled in only one cause, then there would be no need or opportunity to rule out an alternative cause. If the one ruled-in cause was ruled out for other reasons, then the expert witness would be left with a case of idiopathic disease.[3]

We can take some solace in the Supreme Court’s disclaimer that it was not attempting to define what constitutes reliable evidence of causation. Differential etiology, however, is irrelevant to general causation, which is the context in which statistical significance arises. The issue of statistical significance was not addressed, nor could it have been addressed, in either Best or Westberry.

What follows is an incomplete selection of cases on differential etiology, good and bad.


Differential Etiology for Specific Causation

FIRST CIRCUIT

Baker v. Dalkon Shield Claimants Trust, 156 F.3d 248, 252-53 (1st Cir. 1998) (stating that “ ‘differential diagnosis’ is a standard medical technique”)

District Courts within 1st Circuit

Whiting v. Boston Edison Co., 891 F. Supp. 12, 21 n.41 (D. Mass. 1995) (noting that differential diagnosis cannot be used to support conclusion of specific causation when 90% disease cases are idiopathic)

Polaino v. Bayer Corp., 122 F. Supp. 2d 63, 70 & n.7 (D. Mass. 2000) (“differential diagnosis is a useful means of distinguishing one disease from another with similar symptoms, it is not a technique typically used to investigate the cause of an illness”)

Plourde v. Gladstone, 190 F. Supp. 2d 708, 722-723 (D. Vt. 2002) (excluding testimony where expert failed to rule out causes of plaintiff’s illness other than exposure to herbicides)

Allen v. Martin Surfacing, 263 F.R.D. 47, 56 (D. Mass. 2008) (admitting general and specific causation testimony of ALS, to be tested by adversary process, rather than excluded altogether, despite paucity of epidemiologic evidence)

Milward v. Acuity Specialty Products Group, Inc., Civil Action No. 07–11944–DPW, 2013 WL 4812425 (D. Mass. Sept. 6, 2013)


SECOND CIRCUIT

McCullock v. H.B. Fuller Co., 61 F.3d 1038, 1043–44 (2d Cir.1995) (defining differential etiology as an analysis “which requires listing possible causes, then eliminating all causes but one”) (affirming admission of a treating doctor’s testimony despite his inability to “point to a single piece of medical literature that says glue fumes cause throat polyps”) (upholding admission of treating physician who relied upon his “care and treatment of McCullock; her medical history (as she related it to him and as derived from a review of her medical and surgical reports); pathological studies; review of [Defendant] Fuller’s [Material Safety Data Sheet], his training and experience, use of a scientific analysis known as differential etiology (which requires listing possible causes, then eliminating all causes but one); and reference to various scientific and medical treatises”)

Zuchowicz v. United States, 140 F.3d 381, 385-87 (2d Cir. 1998) (“[d]isputes as to . . . faults in [the] use of differential etiology as a methodology, or lack of textual authority for [an] opinion, go to the weight, not the admissibility of [the] testimony”)

Wills v. Amerada Hess Corp., 379 F. 3d 32, 45-46 (2d Cir. 2004)(noting that expert witness failed to account for other possible causes), cert. denied, 126 S.Ct. 355 (2005)

Ruggiero v. Warner-Lambert Co., 424 F.3d 249, 254 (2d Cir. 2005) (“Where an expert employs differential diagnosis to ‘rule out other potential causes’ for the injury at issue, he must also ‘rule in the suspected cause’ and do so using ‘scientifically valid methodology’.”) (quoting Cavallo v. Star Enter., 892 F. Supp. 756, 771 (E.D. Va. 1995), aff’d on this ground, rev’d on other grounds, 100 F.3d 1150 (4th Cir. 1996))

District Courts within 2d Circuit

Becker v. National Health Products, 896 F.Supp. 100 (N.D.N.Y. 1995).

Mancuso v. Consolidated Edison Co. of New York, Inc., 967 F. Supp. 1437, 1450 (S.D.N.Y. 1997)(“it is improper for an expert to presume that the plaintiff ‘must have somehow been exposed to a high enough dose to exceed the threshold [necessary to cause the illness], thereby justifying his initial diagnosis.’ This is circular reasoning.”)

Zwillinger v. Garfield Slope Hous. Corp., 1998 WL 623589, at *20 (E.D.N.Y. Aug. 17, 1998) (excluding testimony and granting summary judgment where expert failed to rule out alternative causes of plaintiff’s immunotoxicity syndrome)

Prohaska v. Sofamor, S.N.C., 138 F. Supp. 2d 422, 439 (W.D.N.Y. 2001) (excluding expert’s opinion and granting summary judgment where expert “was unable to rule out, to a reasonable degree of medical certainty, [plaintiff’s] pre-existing condition, scoliosis, as a current cause of her pain”)

Martin v. Shell Oil Co., 180 F. Supp. 2d 313, 320 (D. Conn. 2002)

Figueroa v. Boston Scientific Corp., 254 F.Supp. 2d 361, 368 (S.D.N.Y. 2003)(“failure to rule out alternative causes is not determinative of admissibility of evidence but goes to weight, which is for a jury to decide”)

Perkins v. Origin Medsystems, Inc., 299 F. Supp. 2d 45, 57-61 (D. Conn. 2004)

In re Rezulin Prods. Liab. Litig., No. MDL 1348, 00 Civ. 2843(LAK), 2004 WL 2884327, at *3-4 (S.D.N.Y. Dec. 10, 2004) (holding that differential etiology may not be used to prove general causation) (“differential diagnosis does not ‘speak to the issue of general causation. [It] assumes that general causation has been proven for the list of possible causes’ that it rules in and out in coming to a conclusion.”)

In re Ephedra Prods. Liab. Litig., 393 F. Supp. 2d 181, 187 (S.D.N.Y. 2005) (Rakoff, J.)


THIRD CIRCUIT

In re Paoli R.R. Yard PCB Litig., 916 F.2d 829, 862 (3d Cir.1990)

In re Paoli R.R. Yard PCB Litig., 35 F.3d 717, 758 (3d Cir. 1994) (“[D]ifferential diagnosis generally is a technique that has widespread acceptance in the medical community, has been subject to peer review, and does not frequently lead to incorrect results ….”)

Kannankeril v. Terminix Int’l, Inc., 128 F.3d 802, 807 (3d Cir. 1997)

Heller v. Shaw Indus., Inc., 167 F.3d 146, 154 (3d Cir. 1999) (a medical expert need not “always cite published studies on general causation in order to reliably conclude that a particular object caused a particular illness” so long as there are good grounds, such as differential diagnosis, for the conclusion)

District Courts within 3d Circuit

Wade-Greaux v. Whitehall Labs., Inc., 874 F. Supp. 1441 (D.V.I.), aff’d, 46 F.3d 1120 (3d Cir. 1994) (excluding testimony of expert who failed to rule out alternative causes of plaintiff’s birth defects)

Diaz v. Matthey, Inc., 893 F. Supp. 358, 376-377 (D.N.J. 1995) (excluding testimony and granting summary judgment where expert failed to rule out alternative causes for plaintiff’s asthma)

Rutigliano v. Valley Bus. Forms, 929 F. Supp. 779, 787 (D.N.J. 1996) (excluding expert’s testimony and granting summary judgment where the “record is replete with evidence, including [the expert’s] own admissions, that [plaintiff’s] symptoms could be attributable to medical conditions other than formaldehyde sensitization”)

Reiff v. Convergent Technologies, 957 F. Supp. 573, 582-83 (D.N.J. 1997) (excluding expert’s testimony and granting summary judgment where expert failed to rule out alternative causes of plaintiff’s carpal tunnel syndrome)

O’Brien v. Sofamor, 1999 WL 239414, at *5 (E.D. Pa. Mar. 30, 1999) (excluding expert’s testimony and granting summary judgment where plaintiff “offer[ed] no evidence that [plaintiff’s experts] performed a differential diagnosis, or even considered other potential causes” of plaintiff’s back condition)

Kent v. Howell Elec. Motors, 1999 WL 517106, at * 5 (E.D. Pa. July 20, 1999) (excluding expert testimony and granting summary judgment because expert could “not rule out reasonable alternative theories of what caused the retaining ring to fail”)

Schmerling v. Danek Med., Inc., 1999 WL 712591, at *9 (E.D. Pa. Sept. 10, 1999) (excluding expert’s testimony and granting summary judgment on the grounds that expert’s failure to rule out alternative causes “alone warrants a determination that the expert’s methodology is unreliable”)

Turbe v. Lynch Trucking Inc., 1999 WL 1087026, at *6 (D.V.I. Oct. 7, 1999) (excluding expert’s testimony where expert “expressed awareness of obvious alternative causes” yet “did not investigate any other possible causes”)

In re Paoli R.R. Yard PCB Litig., 2000 WL 274262, at *5 (E.D. Pa. March 1, 2000) (expert’s opinion should be excluded “because she failed to rule out alternative causes” of plaintiff’s injuries)

Magistrini v. One Hour Martinizing Dry Cleaning, 180 F. Supp. 2d 584, 608-610 (D.N.J. 2002) (excluding testimony of expert who sought to testify that dry cleaning fluid caused leukemia, but failed to rule out smoking as an alternative cause) (holding expert witness’s differential methodology unreliable when objection to the opinion points to a plausible alternative cause, and the expert witness offers no explanation for his conclusion that the exposure was a substantial factor in causing plaintiff’s injury)

Yarchak v. Trek Bicycle Corp., 208 F. Supp. 2d 470, 498 (D.N.J. 2002)

Soldo v. Sandoz Pharms. Corp., 244 F.Supp. 2d 434, 554-56, 567 (W.D. Pa. 2003) (excluding experts’ specific causation testimony based on a differential diagnosis because the witnesses “did not demonstrate any valid diagnostic methodology–any ‘sufficient diagnostic technique’–for excluding” other plausible causes as the sole cause of the plaintiff’s injury) (holding that the differential “diagnostic” process is not reliable, and not admissible, unless it reliably rules out reasonable alternative causes or idiopathic causes of the alleged harm); see id. at 524 (differential diagnosis cannot establish general causation)

Perry v. Novartis, 564 F. Supp. 2d 452, 469 (E.D. Pa. 2008) (Dalzell, J.) (“Standing alone, the presence of a known risk factor is not a sufficient basis for ruling out idiopathic origin in a particular case, particularly where most of the cases of the disease have no known cause.”)


FOURTH CIRCUIT

Benedi v. McNeil-P.P.C. Inc., 66 F.3d 1378, 1384 (4th Cir. 1995) (upholding admission of differential diagnosis, reasoning circularly that diagnosing physicians use it)

Cavallo v. Star Enter., 892 F. Supp. 756, 771 (E.D. Va. 1995) (noting that it is not sufficient for an expert to rule out other possible causes if he has no sound evidence that allows him to “rule in” the purported cause), aff’d in relevant part, rev’d in part on other grounds, 100 F.3d 1150 (4th Cir. 1996)

Oglesby v. General Motors Corp., 190 F.3d 244, 250 (4th Cir. 1999) (affirming exclusion of testimony where “as a matter of logic, [the expert] could not eliminate other equally plausible causes” of cracked plastic inlet)

Westberry v. Gislaved Gummi AB, 178 F.3d 257, 262-263 (4th Cir. 1999) (“Differential diagnosis, or differential etiology, is a standard scientific technique of identifying the cause of a medical problem by eliminating the likely causes until the most probable one is isolated”)

Cooper v. Smith & Nephew, Inc., 259 F.3d 194, 202 (4th Cir.2001) (holding that an expert’s opinion based on a differential diagnosis is generally admissible but that there must be adequate evidence that the differential is a cause of the disease)

District Courts within 4th Circuit

Higgins v. Diversey Corp., 998 F. Supp. 598, 603 (D. Md. 1997), aff’d, 135 F.3d 769 (4th Cir. 1998) (excluding expert’s testimony that the accidental inhalation of a bleach caused plaintiff’s injuries, where expert “admit[ted] that he [could] not rule out several other possible causes”)

Driggers v. Sofamor, S.N.C., 44 F. Supp. 2d 760, 765 (M.D.N.C. 1998) (excluding expert’s testimony and granting summary judgment where “expert failed to rule out other possible causes of [plaintiff’s back] pain”)

Aldridge v. Goodyear Tire & Rubber Co., 34 F. Supp. 2d 1010, 1024 (D. Md. 1999), vacated on other grounds, 223 F.3d 263 (4th Cir. 2000) (excluding testimony of plaintiffs’ experts where they “failed to adequately address possible alternative causes of plaintiffs’ illnesses”)

Fitzgerald v. Smith & Nephew Richards, Inc., 1999 WL 1489199 (D. Md. Dec. 30, 1999) (excluding expert’s testimony and granting summary judgment where expert “failed to rule out what could have been another cause of [plaintiff’s] condition”)

Shreve v. Sears, Roebuck & Co., 166 F. Supp. 2d 378, 397-98 (D. Md. 2001) (excluding testimony where expert failed to rule out causes of plaintiff’s injury other than an alleged defect in snow thrower)

Smith v. Wyeth-Ayerst Laboratories Co., 278 F.Supp. 2d 684, 692 (W.D.N.C. 2003)(inexplicably rejecting argument that idiopathic causes prevent the use of “differential etiology” method to ascertain specific causation)

Roche v. Lincoln Property Co., 278 F.Supp. 2d 744 (E.D. Va. 2003) (excluding in part expert witness’s testimony that mold caused the plaintiffs’ allergy-like symptoms because he failed “to rule out the Roches’ significant allergies to cats, dust mites, grasses, weeds, and trees as potential causes for the Roches’ symptoms,” which pre-existed moving to the defendant’s apartment)

Doe v. Ortho-Clinical Diagnostics, Inc., 440 F.Supp. 2d 465, 476-78 (M.D.N.C. 2006) (excluding improperly conducted differential diagnosis in thimerosal vaccine autism case)

Hines v. Wyeth, Inc., 2011 WL 2792436, at *3 (S.D. W. Va. July 14, 2011) (excluding expert witness who failed properly to rule out alternative causes of breast cancer in hormone therapy case)


FIFTH CIRCUIT

Moore v. Ashland Chem. Inc., 151 F.3d 269, 278-79 (5th Cir. 1998)(en banc), cert. denied, 526 U.S. 1064 (1999)(holding that trial court has discretion to conclude that an expert’s differential diagnosis was insufficiently reliable to be submitted to the jury)

Curtis v. M&S Petroleum, Inc., 174 F.3d 661, 670 (5th Cir. 1999)

Michaels v. Avitech, Inc., 202 F.3d 746, 753 (5th Cir. 2000) (excluding testimony when “plaintiff’s experts wholly fail[ed] to address and rule out the numerous other potential causes” of an aircraft disaster)

Black v. Food Lion, Inc., 171 F.3d 308 (5th Cir. 1999) (expert witness, purporting to use a differential diagnosis, testified that plaintiff’s slip in the supermarket caused fibromyalgia, which is largely idiopathic) (“This analysis amounts to saying that because [the physician] thought she had eliminated other possible causes of fibromyalgia, even though she does not know the real ‘cause,’ it had to be the fall at Food Lion. This is not an exercise in scientific logic but in the fallacy of post-hoc propter-hoc reasoning, which is as unacceptable in science as in law.”)

Johnson v. Arkema, Inc., 685 F.3d 452, 467–68 (5th Cir. 2012) (suggesting that a proper differential diagnosis may be admissible)

District Courts within 5th Circuit

Bennett v. PRC Public Sector, 931 F. Supp. 484, 492 (S.D. Tex. 1996) (excluding testimony of expert who failed to consider and rule out alternative causes of plaintiff’s repetitive motion disorders)

Conger v. Danek Med., Inc., 1998 WL 1041331, at *5-6 (N.D. Tex. Dec. 14, 1998) (excluding expert’s testimony and granting summary judgment when expert “had not attempted to rule out [other potential sources] as causes for [plaintiff’s back] pain”)

Nobles v. Sofamor, 1999 WL 1129661 (S.D. Tex. June 30, 1999) (Rosenthal, J.)

Leigh v. Danek Med., Inc., 1998 WL 1041329, at *4-5 (N.D. Tex. Dec. 14, 1998) (excluding expert’s testimony and granting summary judgment where expert failed to rule out alternative causes of plaintiff’s back pain)

In re Propulsid Products Liability Litigation, 261 F. Supp. 2d 603, 618 (E.D. La. 2003) (“They also cannot rule out other explanations for the measurements that form the predicate of the QTc, the heart rate, or heart rate variability”)

Cano v. Everest Minerals Corp., 362 F. Supp. 2d 814, 844-46 (W.D. Tex. 2005) (addressing specific causation in context of known carcinogen (radiation), and holding that expert witness’s methodology of concluding that any cause that could have been a cause was in fact a cause and a substantial factor was invalid)

Ridgeway v. Pfizer Inc., No. 2:09-cv-02794, 2010 WL 1729187, *4 (E.D. La. April 27, 2010) (using “differential diagnosis,” or res ipsa loquitur, the proponent bears the burden of “excluding reasonable explanations for the accident other than defendant’s negligence”)


SIXTH CIRCUIT

Glaser v. Thompson Med. Co., 32 F.3d 969, 978 (6th Cir. 1994) (differential diagnosis defined as “standard diagnostic tool used by medical professionals to diagnose the most likely cause or causes of illness, injury and disease”)

Hardyman v. Norfolk & W. Ry. Co., 243 F.3d 255, 260 (6th Cir.2001) (“Differential diagnosis … is a standard scientific technique of identifying the cause of a medical problem”)

Downs v. Perstorp Components, Inc., 26 F. Appx. 472, 476–77 (6th Cir. 2002) (holding that exclusion of expert’s opinion was appropriate when arrived at by a “methodology primarily [that] involved reasoning backwards from Downs’ condition and, through a process of elimination, concluding that [defendant’s product] must have caused it”)

Best v. Lowe’s Home Centers, Inc., 563 F. 3d 171, 178-80 (6th Cir. 2009)

Gass v. Marriott Hotel Servs., 558 F.3d 419, 426 (6th Cir. 2009) (“the ability to diagnose medical conditions is not remotely the same as the ability to deduce … in a scientifically reliable manner the causes of those medical conditions”)(internal citations omitted)

Tamraz v. BOC Group Inc., No. 1:04-CV-18948, 2008 WL 2796726 (N.D. Ohio July 18, 2008) (denying Rule 702 challenge to treating physician’s causation opinion), rev’d sub nom., Tamraz v. Lincoln Elec. Co., 620 F.3d 665, 673 (6th Cir. 2010) (carefully reviewing record of trial testimony of plaintiffs’ treating physician; reversing judgment for plaintiff based in substantial part upon treating physician’s speculative causal assessment created by plaintiffs’ counsel; “Getting the diagnosis right matters greatly to a treating physician, as a bungled diagnosis can lead to unnecessary procedures at best and death at worst. But with etiology, the same physician may often follow a precautionary principle: If a particular factor might cause a disease, and the factor is readily avoidable, why not advise the patient to avoid it? Such advice—telling a welder, say, to use a respirator—can do little harm, and might do a lot of good. This low threshold for making a decision serves well in the clinic but not in the courtroom, where decision requires not just an educated hunch but at least a preponderance of the evidence.”) (internal citations omitted), cert. denied, ___ U.S. ___ , 131 S. Ct. 2454, 2011 WL 863879 (2011)

Thomas v. Novartis Pharm. Corp., 443 Fed. App’x 58, 61-62 (6th Cir. 2011) (excluding expert witnesses in cases involving osteonecrosis of the jaw, allegedly caused by bisphosphonate medication, for failing to conduct proper differential analysis; emphasizing “the importance of correctly determining the cause of the osteonecrosis … does nothing to establish that [the doctor] can in fact, reliably determine the cause of a patient’s [osteonecrosis]”)

District Courts within 6th Circuit

Nelson v. Tennessee Gas Pipeline Co., 1998 WL 1297690, at *6 (W.D. Tenn. Aug. 1, 1998) (excluding testimony of expert who “failed to engage in adequate techniques to rule out alternative causes and offers no good explanation as to why his opinion is nevertheless reliable in light of other potential causes of the alleged injuries”)

Downs v. Perstorp Components, 126 F. Supp. 2d 1090, 1127 (E.D. Tenn. 1999) (excluding expert testimony as to whether exposure to chemicals caused plaintiff’s injuries where expert failed to rule out alternative causes)

Huffman v. SmithKline Beecham Clinical Lab., Inc., 111 F. Supp. 2d 921, 930 (N.D. Ohio 2000)

Asad v. Continental Airlines, Inc., 314 F. Supp. 2d 726 (N.D. Ohio 2004)


SEVENTH CIRCUIT

O’Connor v. Commonwealth Edison, 13 F.3d 1090, 1106 (7th Cir. 1994) (holding that physician’s testimony that cataracts were caused by radiation exposure, based upon visual examination of the plaintiff, was not reliably supported by clinical examination), cert. denied, 114 S.Ct. 2711 (1994).

Ervin v. Johnson & Johnson, Inc., 492 F.3d 901, 904 (7th Cir. 2007) (noting that “[a] differential diagnosis satisfies a Daubert analysis if the expert uses reliable methods”) (excluding differential etiological testimony that was based upon ruling in a particular potential specific cause based on temporal proximity)

District Courts within 7th Circuit

Schmaltz v. Norfolk & Western Ry., 878 F.Supp. 1122 (N.D. Ill. 1995)

Lennon v. Norfolk & Western Ry., 123 F.Supp.2d 1143, 1153 (N.D.Ind. 2000) (excluding neurologist’s unreliable causal attribution of multiple sclerosis to fall)

Eve v. Sandoz Pharm. Corp., No. IP 98-1429, 2001 U.S. Dist. LEXIS 4531 (S.D. Ind. 2001)

Caraker v. Sandoz Pharms., 188 F. Supp. 2d 1026, 1030 (S.D. Ill. 2001) (when a differential diagnosis is employed “in the practice of science (as opposed to its use by treating physicians in the practice of medicine out of necessity) it must reliably ‘rule in’ a potential cause”)

Bickel v. Pfizer, Inc., 431 F.Supp. 2d 918, 923 (N.D. Ind. 2006) (“the Plaintiff cannot rely on [differential] diagnosis to establish general causation”)


EIGHTH CIRCUIT

National Bank of Commerce v. Assoc. Milk Producers, 22 F. Supp. 2d 942, 963 (E.D. Ark. 1998), aff’d, 191 F.3d 858 (8th Cir.1999) (excluding testimony and granting summary judgment where expert did “not successfully rule out other possible alternative causes” for cancer)

Turner v. Iowa Fire Equip. Co., 229 F.3d 1202, 1208-09 (8th Cir. 2000) (“[A] medical opinion about causation, based upon a proper differential diagnosis, is sufficiently reliable to satisfy Daubert.”)(“If a properly qualified medical expert performs a reliable differential diagnosis through which, to a reasonable degree of medical certainty, all other possible causes of the victims’ condition can be eliminated, leaving only the toxic substance as the cause, a causation opinion based on that differential diagnosis should be admitted.”)

Bonner v. ISP Technologies, Inc., 259 F.3d 924 (8th Cir. 2001)

Glastetter v. Novartis Pharms. Corp., 252 F.3d 986, 989 (8th Cir. 2001) (per curiam) (“[T]he district court excluded the differential diagnoses performed by Glastetter’s expert physicians because they lacked a proper basis for ‘ruling in’ Parlodel as a potential cause of [an intracerebral hemorrhage] in the first place. . . . We agree with the district court’s conclusion.”)

Jazairi v. Royal Oaks Apts., 217 Fed. Appx. 895 (8th Cir. 2007) (excluding differential etiological testimony that was based upon ruling in a particular potential specific cause based on temporal proximity)

Bland v. Verizon Wireless, L.L.C., 538 F.3d 893, 897 (8th Cir. 2008) (affirming exclusion of treating physician’s differential diagnosis)

District Courts within 8th Circuit

Bruzer v. Danek Med., Inc., 1999 WL 613329, at *8 (D. Minn. Mar. 8, 1999) (excluding expert’s testimony and granting summary judgment where expert did “not attempt to rule out any alternative potential causes for [plaintiff’s] continuing and increasing [back] pain”)

Thurman v. Missouri Gas Energy, 107 F. Supp. 2d 1046, 1058 (W.D. Mo. 2000) (expert’s opinion “that the pipeline failed because of corrosion” was excluded and summary judgment granted where expert reached the conclusion “without eliminating other causes”)

Jisa Farms, Inc. v. Farmland Indus., No. 4:99CV3294, 2001 U.S. Dist. LEXIS 26084 (D. Neb. 2001) (excluding expert testimony for failing to rule out alternative causes)

In re Viagra Prod. Liab. Litig., 658 F. Supp. 2d 950, 957 (D. Minn. 2009)


NINTH CIRCUIT

Kennedy v. Collagen Corp., 161 F.3d 1226, 1228-30 (9th Cir. 1998)

Clausen v. M/V NEW CARISSA, 339 F.3d 1049, 1057 (9th Cir. 2003)

Messick v. Novartis Pharms., ___ F.3d ___, 2014 WL 1328182 (9th Cir. 2014)

District Courts within 9th Circuit

Hall v. Baxter Healthcare Corp., 947 F.Supp. 1387, 1413 (D.Ore. 1996) (explaining that differential diagnosis assumes general causation has been established) (“differential diagnosis does not by itself prove the cause, even for the particular patient. Nor can the technique speak to the issue of general causation.”)


TENTH CIRCUIT

Hollander v. Sandoz Pharms. Corp., 289 F.3d 1193, 1211 (10th Cir. 2002) (stating that “experts would need to present reliable evidence that the drug can cause strokes” before differential diagnosis could be admissible)

Goebel v. Denver & Rio Grande W. RR., 346 F.3d 987, 999 (10th Cir. 2003)

Tingey v. Radionics, 193 Fed. Appx. 747, 763 (10th Cir. 2006)

District Courts within 10th Circuit

Stover v. Eagle Products, 1996 WL 172972, at *11 (D. Kan. Mar. 19, 1996) (excluding testimony of expert who “[did] not explain in any meaningful detail how he [was] able to exclude the numerous multiple alternative causes” of injury to plaintiff’s dogs)

In re Breast Implant Lit., 11 F. Supp. 2d 1217, 1230, 1234 (D. Colo. 1998) (excluding expert testimony where expert failed to “explain what alternative causes he considered, or how he ruled out other possible causes” of plaintiffs’ autoimmune disease) (“Differential diagnosis may be utilized by a clinician to determine what recognized disease or symptom the patient has, but it is incapable of determining whether exposure to a substance caused disease in the legal sense.”)


ELEVENTH CIRCUIT

McClain v. Metabolife Int’l, Inc., 401 F.3d 1233, 1252-53 (11th Cir.2005) (detailing a reliable differential diagnostic process)(“A valid differential diagnosis, however, only satisfies a Daubert analysis if the expert can show the general toxicity of the drug by reliable methods.”)

Rink v. Cheminova, Inc., 400 F.3d 1286, 1295 (11th Cir. 2005) (holding that a differential diagnosis alone does not support a finding of causation where no expert testimony from a treating physician or toxicologist is presented, or any toxicological evidence produced; specifically rejecting Westberry) (“[I]n the context of summary judgment . . . differential diagnosis evidence by itself does not suffice for proof of causation.”)

Guinn v. AstraZeneca Pharms. LP, 602 F.3d 1245 (11th Cir. 2010), aff’g 598 F. Supp. 2d 1239, 1243 (M.D. Fla. 2009) (excluding expert witness’s specific causation opinion for failing “to articulate any scientific methodology for assessing whether, and to what extent, Seroquel contributed to Guinn’s weight gain and diabetes”)

Hendrix v. Evenflo Co., 609 F.3d 1183, 1194-95 (11th Cir. 2010), aff’g 255 F.R.D. 568, 596 (N.D. Fla. 2009) (differential etiology not diagnosis)

Kilpatrick v. Breg, Inc., 613 F.3d 1329, 1342 (11th Cir. 2010) (noting that differential diagnosis “assumes the existence of general causation”)

District Courts within 11th Circuit

Coleman v. Danek Med., Inc., 43 F. Supp. 2d 637, 650 n. 23 (S.D. Miss. 1999) (stating that “in reaching his conclusion that these plaintiffs were injured by Danek’s product, Dr. Aldreti did not rule out other causes of their alleged injuries. Thus, his conclusion that their injuries were caused by Danek’s product is based on pure speculation – and is not a valid differential diagnosis.”)

Siharath v. Sandoz Pharms. Corp., 131 F. Supp. 2d 1347, 1356-71 (N.D. Ga. 2001) (holding that differential diagnosis cannot rule in a general causal factor, and noting in Parlodel case that “[e]xperts must do something more than just ‘rule out’ other possible causes. They must explain how they were able to ‘rule in’ the product in question”), aff’d sub nom., Rider v. Sandoz Pharm. Corp., 295 F.3d 1194 (11th Cir. 2002).


D.C. CIRCUIT

Ambrosini v. Labarraque, 101 F.3d 129, 140 (D.C.Cir.1996) (describing the appropriate use of differential diagnosis to prove specific causation)

Meister v. Med. Eng’g Corp., 267 F.3d 1123, 1129, 347 U.S. App. D.C. 361 (D.C. Cir. 2001)(“whatever factors remain after other alternative causes have been eliminated [must be] at least capable of causing the disease in question”)


STATE COURT CASES

ALASKA

John’s Heating Service v. Lamb, 46 P.3d 1024 (Alaska 2002) (“[a] differential diagnosis that fails to take serious account of other potential causes may be so lacking that it cannot provide a reliable basis for an opinion on causation,” but not in this case involving carbon monoxide poisoning)

ARIZONA

Lofgren v. Motorola, No. CV 93-05521, 1998 WL 299925, at *24 (Ariz. Super. Ct. June 1, 1998) (differential diagnosis as a method of determining the cause of disease has been “unequivocally rejected by the scientific community”)

IOWA

Ranes v. Adams Labs., Inc., 778 N.W.2d 677, 690 (Iowa 2010)(general causation for each differential should be established by adequate evidence)

KANSAS

Kuhn v. Sandoz Pharms., 14 P.3d 1170, 1173-78 (Kan. 2000) (Frye test not applicable to “pure opinion” testimony such as differential diagnosis)

LOUISIANA

Keener v. Mid-Continent Cas., 817 So. 2d 347 (La. Ct. App. 5th Cir. 2002), writ denied, 825 So. 2d 1175 (La. 2002)

MINNESOTA

Zandi v. Wyeth, 2009 Minn. App. Unpub. LEXIS 785, at *17-18 (Minn. Ct. App. July 21, 2009), petition denied, 2009 Minn. LEXIS 648 (Minn. Sept. 29, 2009)

NEW JERSEY

Creanga v. Jardal, 185 N.J. 345, 886 A.2d 633 (2005) (holding that properly conducted differential diagnosis was admissible; reversing exclusion of physician testimony in case)

OHIO

Terry v. Ottawa Cty. Bd. of Mental Retardation & Developmental Delay, 658, 847 N.E.2d 1246 (Ohio Ct. App. 2006) (“We agree with the trial court: Dr. Bernstein did not conduct a scientifically valid differential diagnosis, because his method relied primarily upon temporal relationships and because he did not rule out other possible causes. He was properly barred from testifying to specific causation.”)

TEXAS

Mitchell Energy Corp. v. Bartlett, 958 S.W.2d 430, 448 (Tex. App.–Fort Worth 1997, pet. denied) (“Dr. Basset’s failure to rule out other causes of the presence of hydrogen sulfide in appellees’ water renders his opinion ‘little more than speculation.’”)

Weiss v. Mechanical Associated Services, Inc., 989 S.W.2d 120, 126 (Tex. App.– San Antonio 1999, pet. denied) (affirming summary judgment for the defendants in a case involving injuries allegedly caused by exposure to a chemical, because “none of Weiss’ experts were able to rule out other potential causes of Weiss’ illness with reasonable certainty”)

Williams v. NGF, Inc., 994 S.W.2d 255, 257 (Tex. App.–Texarkana 1999, no pet. h.) (affirming summary judgment for defendant because plaintiffs “failed to produce evidence which excluded the possibility that . . . other flowers or chemical agents used on them were the cause of her injuries”)

Austin v. Kerr-McGee Refining Corp., 25 S.W.3d 280, 293 (Tex. App.-Texarkana 2000, no pet.) (affirming summary judgment for defendants; trial court properly excluded plaintiffs’ scientific evidence because, among other reasons, plaintiffs “failed to exclude other plausible causes with reasonable certainty”)

Martinez v. City of San Antonio, 40 S.W.3d 587, 595 (Tex. App.–San Antonio 2001, no pet.) (“The opinions of Matson and Baynes, when offered to prove Alamodome site lead caused appellants’ injuries, constitute no evidence because Matson, in arriving at his lead calculation, failed to rule out alternative sources of the lead contamination.”)

Neal v. Dow Agrosciences L.L.C., 74 S.W.3d 468, 473 n. 3 (Tex. App. – Dallas 2002, no pet.) (describing “differential diagnosis” as a patient-specific process of elimination) (citing Minnesota Min. And Mfg. Co. v. Atterbury, 978 S.W.2d 183, 194 n. 9 (Tex. App. – Texarkana 1998, pet. denied))

Coastal Tankships, USA, Inc. v. Anderson, 87 S.W.3d 591, at 609-10 (2002)(“In the toxic-tort context, a plaintiff must establish general causation for a differential diagnosis to be relevant to show specific causation.”)

UTAH

Alder v. Bayer Corp., AGFA Div., 61 P.3d 1068, 1084–85 (Utah 2002)

VERMONT

Blanchard v. Goodyear Tire & Rubber Co.,  2011 Vt. 85, 30 A.3d 1271 (2011)(holding that plaintiff’s claim that his NHL was caused by benzene was not reliably supported by differential diagnosis when a large percentage of NHL cases have no known cause)

WYOMING

Easum v. Miller, 92 P.3d 794, 802 (Wyo. 2004) (“Most circuits have held that a reliable differential diagnosis satisfies Daubert and provides a valid foundation for admitting an expert opinion. The circuits reason that a differential diagnosis is a tested methodology, has been subjected to peer review/publication, does not frequently lead to incorrect results, and is generally accepted in the medical community.”) (quoting Turner v. Iowa Fire Equip. Co., 229 F.3d 1202, 1208 (8th Cir. 2000))


COMMENTATORS

Conley & Garver,  “William C. Keady and the Law of Scientific Evidence,” 68 Miss. L.J. 39, 51 (1998) (differential diagnosis is “a mixture of science and art, far too complicated for its accuracy to be assessed quantitatively or for a meaningful error rate to be calculated”)

Wendy Michelle Ertmer, “Just What the Doctor Ordered: The Admissibility of Differential Diagnosis in Pharmaceutical Product Litigation,” 56 Vand. L. Rev. 1227 (2003)

Joe G. Hollingsworth & Eric G. Lasker, “The Case Against Differential Diagnosis: Daubert, Medical Causation Testimony, and the Scientific Method,” 37 J. Health Law 85, 98 (2004)

Edward J. Imwinkelried, “The Admissibility and Legal Sufficiency of Testimony about Differential Diagnosis (Etiology): Of Under‑ and Over‑Estimations,” 56 Baylor L. Rev. 391, 406 (2004)

Michael B. Kent Jr., “Daubert, Doctors and Differential Diagnosis: Treating Medical Causation Testimony as Evidence,” 66 Def. Couns. J. 525 (1999)

Joseph Sanders, “Applying Daubert Inconsistently? Proof of Individual Causation in Toxic Tort and Forensic Cases,” 75 Brooklyn L. Rev. 1367 (2010)

Joseph Sanders & Julie Machal-Fulks, “The Admissibility of Differential Diagnosis Testimony to Prove Causation in Toxic Tort Cases: The Interplay of Adjective and Substantive Law,” 64 Law & Contemp. Prob. 107 (2001)

Ian S. Spechler, “Physicians at the Gates of Daubert: A Look at the Admissibility of Differential Diagnosis Testimony to Show External Causation in Toxic Tort Litigation,” 26 Rev. Litig. 739 (2007)

Teratology Society, Public Affairs Committee, “Teratology Society Public Affairs Committee Position Paper: Causation in Teratology-Related Litigation,” 73 Birth Defects Research (Part A) 421, 423 (2005) (“7. Biologic plausibility is an essential element in establishing causation. *** The consideration of alternative explanations is sometimes misused by expert witnesses to mean that failure to find an alternative explanation for an outcome is proof that the exposure at issue must have caused the outcome. A conclusion that an exposure caused an outcome is, however, based on positive evidence rather than on lack of an alternative explanation.”)


[1] Wells involved a claim of birth defects caused by the use of spermicidal jelly contraceptive, which had been the subject of several studies, at least one of which yielded a statistically significant increase in detected birth defects over what was expected.  Wells v. Ortho Pharmaceutical Corp., 615 F. Supp. 262 (N.D. Ga. 1985), aff’d and rev’d in part on other grounds, 788 F.2d 741 (11th Cir.), cert. denied, 479 U.S. 950 (1986). The problematic aspect of the evidence in Wells lay in its involving spermicidal compounds different from the one at issue in the litigation, and the multiple testing that eroded the usual interpretation of the significance probability.

[2] Brief for the United States as Amicus Curiae Supporting Respondents, in Matrixx Initiatives, Inc. v. Siracusano, 2010 WL 4624148, at *16 (“Best v. Lowe’s Home Centers, Inc., 563 F.3d 171, 178 (6th Cir. 2009) (“an ‘overwhelming majority of the courts of appeals’ agree” that differential diagnosis, a process for medical diagnosis that does not entail statistical significance tests, informs causation) (quoting Westberry v. Gislaved Gummi AB, 178 F.3d 257, 263 (4th Cir. 1999)).”

[3] In the Rule 702 hearings before Judge Jones in Hall v. Baxter Healthcare, Dr. Eric Gershwin defined idiopathic disease as what a pathetic patient suffers from when she has an idiot for a physician.

Substituting Risk for Specific Causation

June 15th, 2014

Specious, Speculative, Spurious, and Sophistical

Some legal writers assert that all evidence is ultimately “probable,” but that assertion appears to be true only to the extent that the evidentiary support for any claim can be mapped on a scale from 0 to 1, much as probability is.  Probability thus finds its way into discussions of burdens of persuasion as requiring the claim to be shown more probably than not, and of expert witness certitude as requiring a “reasonable degree of scientific probability.”

There is a contrary emphasis in the law on “actual truth,” which is different from “mere probability.”  The rejection of probabilism can be seen in some civil cases, in which courts have emphasized the need for individualistic data and conclusions, beyond generalizations that might be made about groups that clearly encompass the individual at issue. For example, the Supreme Court has held that charging more for funding a woman’s pension than a man’s is discriminatory because not all women will outlive all men, or the men’s average life expectancy. City of Los Angeles Dep’t of Water and Power v. Manhart, 435 U.S. 702, 708 (1978) (“Even a true generalization about a class is an  insufficient reason for disqualifying an individual to whom the generalization does not apply.”). See also El v. Southeastern Pennsylvania Transportation Authority, 479 F.3d 232, 237 n.6 (3d Cir. 2007) (“The burden of persuasion … is the obligation to convince the factfinder at trial that a litigant’s necessary propositions of fact are indeed true.”).

Specific causation is the soft underbelly of the toxic tort world, in large measure because courts know that risk is not specific causation. In the context of risk of disease, which is usually based upon a probabilistic group assessment, courts occasionally distinguish between risk and specific causation. See “Probabilism Case Law” (Jan. 28, 2013) (collecting cases for and against probabilism).

In In re Fibreboard Corp., 893 F.2d 706, 711-12 (5th Cir. 1990), the court rejected a class action approach to litigating asbestos personal injury claims because risk could not substitute for findings of individual causation:

“That procedure cannot focus upon such issues as individual causation, but ultimately must accept general causation as sufficient, contrary to Texas law. It is evident that these statistical estimates deal only with general causation, for ‘population-based probability estimates do not speak to a probability of causation in any one case; the estimate of relative risk is a property of the studied population, not of an individual’s case.’ This type of procedure does not allow proof that a particular defendant’s asbestos ‘really’ caused a particular plaintiff’s disease; the only ‘fact’ that can be proved is that in most cases the defendant’s asbestos would have been the cause.”

Id. at 711-12 (citing Steven Gold, “Causation in Toxic Torts: Burdens of Proof, Standards of Persuasion, and Statistical Evidence,” 96 Yale L.J. 376, 384, 390 (1986)). See also Guinn v. AstraZeneca Pharms., 602 F.3d 1245, 1255 (11th Cir. 2010) (“An expert, however, cannot merely conclude that all risk factors for a disease are substantial contributing factors in its development. ‘The fact that exposure to [a substance] may be a risk factor for [a disease] does not make it an actual cause simply because [the disease] developed.’”) (internal citation omitted).

The analytical care of the Guinn case and others is often abandoned when it will stand in the way of compensation. The conflation of risk and (specific) causation is prevalent precisely because in many cases there is no scientific or medical way to discern what antecedent risks actually played a role in causing an individual’s disease.  Opinions about specific causation are thus frequently devoid of factual or logical support, and are propped up solely by hand waving about differential etiology and inference to the best explanation.

In the scientific world, most authors recognize that risk, even if real and above baseline, regardless of magnitude, does not support causal attribution in a specific case.[1]  Sir Richard Doll, who did so much to advance the world’s understanding of asbestos as a cause of lung cancer, issued a caveat about the limits of specific causation inference. Richard Doll, “Proof of Causality: Deduction from Epidemiological Observation,” 45 Perspectives in Biology & Medicine 499, 500 (2002) (“That asbestos is a cause of lung cancer in this practical sense is incontrovertible, but we can never say that asbestos was responsible for the production of the disease in a particular patient, as there are many other etiologically significant agents to which the individual may have been exposed, and we can speak only of the extent to which the risk of the disease was increased by the extent of his or her exposure.”)

Similarly, Kenneth Rothman, a leading voice among epidemiologists, cautioned against conflating epidemiologic inferences about groups with inferences about causes in individuals. Kenneth Rothman, Epidemiology: An Introduction 44 (Oxford 2002) (“An elementary but essential principle that epidemiologists must keep in mind is that a person may be exposed to an agent and then develop disease without there being any causal connection between exposure and disease.”  … “In a courtroom, experts are asked to opine whether the disease of a given patient has been caused by a specific exposure.  This approach of assigning causation in a single person is radically different from the epidemiologic approach, which does not attempt to attribute causation in any individual instance.  Rather, the epidemiologic approach is to evaluate the proposition that the exposure is a cause of the disease in a theoretical sense, rather than in a specific person.”) (emphasis added).

The late David Freedman, who was the co-author of the chapters on statistics in all three editions of the Reference Manual on Scientific Evidence, was also a naysayer when it came to transmuting risk into cause:

“The scientific connection between specific causation and a relative risk of two is doubtful. *** Epidemiologic data cannot determine the probability of causation in any meaningful way because of individual differences.”

David Freedman & Philip Stark, “The Swine Flu Vaccine and Guillain-Barré Syndrome:  A Case Study in Relative Risk and Specific Causation,” 64 Law & Contemporary Problems 49, 61 (2001) (arguing that proof of causation in a specific case, even starting with a relative risk of four, was “unconvincing”; citing Manko v. United States, 636 F. Supp. 1419, 1437 (W.D. Mo. 1986) (noting relative risk of 3.89–3.92 for GBS from swine-flu vaccine), aff’d in part, 830 F.2d 831 (8th Cir. 1987)).

Graham Colditz, who testified for plaintiffs in the hormone therapy litigation, similarly has taught that an increased risk of disease cannot be translated into the “but-for” standard of causation.  Graham A. Colditz, “From epidemiology to cancer prevention: implications for the 21st Century,” 18 Cancer Causes Control 117, 118 (2007) (“Knowledge that a factor is associated with increased risk of disease does not translate into the premise that a case of disease will be prevented if a specific individual eliminates exposure to that risk factor. Disease pathogenesis at the individual level is extremely complex.”)

Another epidemiologist, who wrote the chapter in the Federal Judicial Center’s Reference Manual on Scientific Evidence, on epidemiology, put the matter thus:

“However, the use of data from epidemiologic studies is not without its problems. Epidemiology answers questions about groups, whereas the court often requires information about individuals.”

Leon Gordis, Epidemiology 362 (5th ed. 2014) (emphasis in original).

=========================================================

In New Jersey, an expert witness’s opinion that lacks a factual foundation is termed a “net opinion.” Polzo v. County of Essex, 196 N.J. 569, 583 (2008) (explaining New Jersey law’s prohibition against “net opinions” and “speculative testimony”). Under federal law, Rule 702, such an opinion is simply called inadmissible.

Here is an interesting example of a “net opinion” from an expert witness, in the field of epidemiology, who has testified in many judicial proceedings:

 

                                                                                          November 12, 2008

George T. Brugess, Esq.
Hoey & Farina, Attorneys at Law
542 South Dearborn Street, Suite 200
Chicago, IL 60605

Ref: Oscar Brooks v. Ingram Barge and Jantran Inc.

* * * *

Because [the claimant] was employed 28 years, he falls into the greater than 20 years railroad employment category (see Table 3 of Garshick’s 2004 paper) which shows a significant risk for lung cancer that ranges from 1.24 to 1.50. This means that his diesel exposure was a significant factor in his contracting lung cancer. His extensive smoking was also a factor in his lung cancer, and diesel exposure combined with smoking is an explanation for the relatively early age, 61 years old, of his diagnosis.

Now assuming that diesel exposure truly causes lung cancer, what was the basis for this witness (David F. Goldsmith, PhD) to opine that diesel exposure was a “significant factor” in the claimant’s developing lung cancer?  None really.  There was no basis in the report, or in the scientific data, to transmute an exposure that yielded a risk ratio of 1.24 to 1.50 for lung cancer, in a similarly exposed population to diesel emissions, into a “significant factor.” The claimant’s cancer may have arisen from background, baseline risk.  The cancer may have arisen from the risk due to smoking, which would have been on the order of a 2,000% increase, or so.  The cancer may have arisen from the claimed carcinogenicity of diesel emissions, on the order of 25 to 50%, which was rather insubstantial compared with his smoking risk.  Potentially, the cancer arose from a combination of the risk from both diesel emissions and tobacco smoking. In the population of men who looked like Mr. Oscar Brooks, by far, the biggest reduction in incidence would be achieved by removing tobacco smoking.
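The comparison in the paragraph above can be made concrete with a little arithmetic. The following is a minimal sketch, assuming the conventional excess-fraction formula (RR - 1)/RR and reading a “2,000% increase” as a relative risk of about 21; the function names are illustrative, not drawn from any report in the case. The excess fraction is a property of an exposed group, which is the very reason the sources quoted above decline to treat it as a finding about an individual.

```python
def excess_fraction(relative_risk):
    """Excess (attributable) fraction among the exposed: (RR - 1) / RR."""
    if relative_risk <= 0:
        raise ValueError("relative risk must be positive")
    return (relative_risk - 1.0) / relative_risk


def rr_from_percent_increase(percent_increase):
    """Convert a percent increase in risk into a relative risk (assumption:
    a 2,000% increase means the risk is 21 times baseline)."""
    return 1.0 + percent_increase / 100.0


# Risk ratios quoted in the report for more than 20 years of employment.
diesel_low = excess_fraction(1.24)
diesel_high = excess_fraction(1.50)

# Smoking, taken as roughly a 2,000% increase in risk.
smoking = excess_fraction(rr_from_percent_increase(2000))

for label, value in [
    ("diesel, RR 1.24", diesel_low),
    ("diesel, RR 1.50", diesel_high),
    ("smoking, RR about 21", smoking),
]:
    print(f"{label}: excess fraction {value:.1%}")
```

On these figures, the diesel excess fraction stays well below one half, while the smoking excess fraction is above 95 percent.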

There were no biomarkers that identified the claimant’s lung cancer as having been caused by diesel emissions.  The expert witness’s opinion was nothing more than an ipse dixit that equated a risk, and a rather small risk, with specific causation.  Notice how a 24% increased risk from diesel emissions was a “significant factor,” but the claimant’s smoking history was merely “a factor.”

Goldsmith’s report on specific causation was a net opinion that exemplifies what is wrong with a legal system that encourages and condones baseless expert witness testimony. In Agent Orange, Judge Weinstein pointed out that the traditional judicial antipathy to probabilism would mean no recovery in many chemical and medicinal exposure cases.  If the courts lowered their scruples to permit recovery on a naked statistical inference of greater than 50%, from relative risks greater than two, some cases might remain viable (but alas not the Agent Orange case itself). Judge Weinstein was, no doubt, put off by the ability of defendants, such as tobacco companies, to avoid liability because plaintiffs would never have more than evidence of risk.  In the face of relative risks often in excess of 30, with attributable risks in excess of 95%, this outcome was disturbing.

Judge Weinstein’s compromise was a pragmatic solution to the problem of adjudicating specific causation on the basis of risk evidence. Although as noted above, many scientists rejected any use of risk to support specific causation inferences, some scientists agreed with this practical solution.  Ironically, David Goldsmith, the author of the report in the Oscar Brooks case, supra, was one such writer who had embraced the relative risk cut off:

“A relative risk greater than 2.0 produces an attributable risk (sometimes called attributable risk percent) or an attributable fraction that exceeds 50%.  An attributable risk greater than 50% also means that ‘it is more likely than not’, or, in other words, there is a greater than 50% probability that the exposure to the risk factor is associated with disease.”

David F. Goldsmith & Susan G. Rose, “Establishing Causation with Epidemiology,” in Tee L. Guidotti & Susan G. Rose, eds., Science on the Witness Stand:  Evaluating Scientific Evidence in Law, Adjudication, and Policy 57, 60 (OEM Press 2001).

In the Brooks case, Goldsmith did not have an increased risk even close to 2.0. The litigation industry ultimately would not accept anything other than full compensation for attributable risks greater than 0%.
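The relative risk cut off discussed above is simply the excess-fraction formula run in reverse. Here is a brief sketch, assuming the same (RR - 1)/RR relationship; the function name is hypothetical. It shows why a 50% attributable fraction corresponds to a relative risk of 2.0, and why relative risks in the neighborhood of 30 imply attributable fractions above 95%.

```python
def relative_risk_required(attributable_fraction):
    """Relative risk needed to reach a given attributable fraction among
    the exposed, from AF = (RR - 1) / RR, so RR = 1 / (1 - AF)."""
    if not 0.0 <= attributable_fraction < 1.0:
        raise ValueError("attributable fraction must be in [0, 1)")
    return 1.0 / (1.0 - attributable_fraction)


# The "more likely than not" threshold.
rr_for_half = relative_risk_required(0.50)

# An attributable fraction of 95% requires a much larger relative risk.
rr_for_95 = relative_risk_required(0.95)

# Going the other way: a relative risk of 30.
af_at_rr_30 = (30.0 - 1.0) / 30.0

print(f"RR needed for AF of 50%: {rr_for_half:.2f}")
print(f"RR needed for AF of 95%: {rr_for_95:.2f}")
print(f"AF at RR of 30: {af_at_rr_30:.1%}")
```

The risk ratios of 1.24 to 1.50 in the Brooks report fall far short of the first of these thresholds.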


[1] See, e.g., Sander Greenland, “Relation of the Probability of Causation to Relative Risk and Doubling Dose:  A Methodologic Error that Has Become a Social Problem,” 89 Am. J. Pub. Health 1166, 1168 (1999)(“[a]ll epidemiologic measures (such as rate ratios and rate fractions) reflect only the net impact of exposure on a population”); Joseph V. Rodricks & Susan H. Rieth, “Toxicological Risk Assessment in the Courtroom:  Are Available Methodologies Suitable for Evaluating Toxic Tort and Product Liability Claims?” 27 Regulatory Toxicol. & Pharmacol. 21, 24-25 (1998)(noting that a population risk applies to individuals only if all persons within the population are the same with respect to the influence of the risk on outcome); G. Friedman, Primer of Epidemiology 2 (2d ed. 1980)(epidemiologic studies address causes of disease in populations, not causation in individuals)

 

Goodman v Viljoen – Meeting the Bayesian Challenge Head On

June 11th, 2014

Putting Science On Its Posterior

Plaintiffs’ and Defendants’ counsel both want the scientific and legal standard to be framed as a very high posterior probability of the truth of a claim. Plaintiffs want the scientific posterior probability to be high because they want to push the legal system in the direction of allowing weak or specious claims that are not supported by sufficient scientific evidence to support a causal conclusion.  By asserting that the scientific posterior probability for a causal claim is high, and that the legal and scientific standards are different, they seek to empower courts and juries to support judgments of causality that are deemed inconclusive, speculative, or worse, by scientists themselves.

Defendants want the scientific posterior probability to be high, and claim that the legal standard should be at least as high as the scientific standard.

Both Plaintiffs and Defendants thus find common cause in committing the transposition fallacy by transmuting the coefficient of confidence, typically 95%, into a minimally necessary posterior probability for scientific causal judgments.  “One wanders to the left, another to the right; both are equally in error, but are seduced by different delusions.”[1]

In the Goodman v. Viljoen[2] case, both sides, plaintiffs and defendants, embraced the claim that science requires a high posterior probability, and that the p-value provided evidence of the posterior probability of the causal claim at issue.  The error came mostly from the parties’ clinical expert witnesses and from the lawyers themselves; the parties’ statistical expert witnesses appeared to try to avoid the transposition fallacy. Clearly, no text would support the conflation of confidence with certainty. No scientific text, treatise, or authority was cited for the notion that scientific “proof” required 95% certainty. This notion was simply an opinion of testifying witnesses.

The principal evidence that antenatal corticosteroid (ACS) therapy can prevent cerebral palsy (CP) came from a Cochrane review and meta-analysis[3] of clinical trials.  The review examined a wide range of outcomes, only one of which was CP.  The trials were apparently not designed to assess CP risk, and they varied significantly in case definition, diagnostic criteria, and length of follow up for case ascertainment. Of the five included studies, four ascertained CP at follow up from two to six years, and the length of follow up was unknown in the fifth study.

Data were sparse in the Cochrane review, as expected for a relatively rare outcome.  The five studies encompassed 904 children, with 490 in the treatment group, and 414 in the control group. There was a total of 48 CP cases, with 20 in the treatment, and 28 in the control, groups. Blinding was apparently not maintained over the extended reporting period.
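How sparse these data are can be seen by computing a crude risk ratio from the counts just given. This is only a sketch: it lumps the five trials into a single two-by-two table and uses the standard large-sample interval on the log scale, which is an assumption of convenience and not the stratified method a Cochrane meta-analysis would use. The function name is illustrative.

```python
import math


def crude_risk_ratio(cases_treated, n_treated, cases_control, n_control, z=1.96):
    """Crude risk ratio with a large-sample 95% interval on the log scale.

    Assumption: all studies are pooled into one 2x2 table, ignoring
    between-study differences.
    """
    rr = (cases_treated / n_treated) / (cases_control / n_control)
    se_log_rr = math.sqrt(
        1.0 / cases_treated - 1.0 / n_treated
        + 1.0 / cases_control - 1.0 / n_control
    )
    low = math.exp(math.log(rr) - z * se_log_rr)
    high = math.exp(math.log(rr) + z * se_log_rr)
    return rr, low, high


# Counts from the text: 20 of 490 treated, 28 of 414 controls.
rr, low, high = crude_risk_ratio(20, 490, 28, 414)
print(f"crude RR {rr:.2f}, 95% interval {low:.2f} to {high:.2f}")
```

With only 48 events, the crude interval is wide and extends past 1.0, so the counts alone are compatible with no effect as well as with a substantial reduction.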

Professor Andrew Willan, plaintiffs’ testifying expert witness on statistics, sponsored a Bayesian statistical analysis, with which he concluded that there was between a 91 and 97% probability that there was an increased risk of CP from not providing ACS in pre-term labor (or, a decreased risk of CP from administering ACS).[4] Willan’s posterior probabilities were for any increased risk, based upon the Cochrane data.  Willan’s calculations were not provided in his testimony, and no information about his prior probability was given. The data came from clinical trials, but the nature of the observations and the analyses made these trials little more than observational studies conducted within the context of clinical trials designed to look at other outcomes. The Bayesian analysis did not account for the uncertainty in the case definitions, variations in internal validity and follow up, and biases in the clinical trials. Willan’s posterior probabilities thus described a maximal probability for general causation, which surely needed to be discounted for validity and bias issues.
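Because Willan’s prior was not disclosed, his calculation cannot be reproduced here. What can be sketched is how strongly a posterior probability of “any benefit” depends upon the prior. The example below assumes a normal approximation on the log risk ratio, a normal prior centered on no effect, and the Cochrane summary figures (RR 0.60, 95% CI 0.34 to 1.03); the two prior standard deviations are hypothetical choices made purely for illustration, and none of this is offered as Willan’s model.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def prob_any_benefit(rr, ci_low, ci_high, prior_sd, z=1.96):
    """Posterior probability that the true RR is below 1.

    Assumptions: normal likelihood for log(RR) with a standard error
    backed out of the reported interval, and a normal prior on log(RR)
    with mean 0 (no effect) and standard deviation prior_sd.
    """
    log_rr = math.log(rr)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z)
    data_precision = 1.0 / se ** 2
    prior_precision = 1.0 / prior_sd ** 2
    post_var = 1.0 / (data_precision + prior_precision)
    post_mean = post_var * data_precision * log_rr  # prior mean is zero
    return normal_cdf((0.0 - post_mean) / math.sqrt(post_var))


# A very diffuse prior versus a hypothetical skeptical prior.
vague = prob_any_benefit(0.60, 0.34, 1.03, prior_sd=10.0)
skeptical = prob_any_benefit(0.60, 0.34, 1.03, prior_sd=0.2)
print(f"diffuse prior:   {vague:.3f}")
print(f"skeptical prior: {skeptical:.3f}")
```

A nearly flat prior yields a figure inside the 91 to 97 percent range reported in testimony, and a skeptical prior pulls it down considerably. Neither number makes any allowance for bias, loss to follow up, or varying case definitions.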

There was a further issue of external validity. The Goodman twins developed CP from having sustained periventricular leukomalacia (PVL), which is one among several mechanistic pathways by which CP can develop in pre-term infants.  The Cochrane data did not address PVL, and the included trials were silent as to whether any of the CP cases involved PVL mechanisms.  There was no basis for assuming that ACS reduced risk of CP from all mechanisms equally, or even at all.[5] The Willan posterior probabilities did not address the external validity issues as they pertained to the Goodman case itself.

Although Dr. Viljoen abandoned the challenge to the Bayesian analysis at trial, his statistical expert witness, Dr. Robert Platt went further to opine that he agreed with Willan’s calculations.  To agree with his calculations, and the posterior probabilities that came out of those calculations, Platt had to have agreed with the analyses themselves. This agreement seems ill considered given that elsewhere in his testimony, Platt appears to advance important criticisms of the Cochrane data in the form of validity and bias issues.

Certainly, Platt’s concession about the correctness of Willan’s calculations greatly undermined Dr. Viljoen’s position with the trial and appellate courts. Dr. Viljoen maintained the validity and bias criticisms throughout the trial, and on appeal.  See, e.g., Defendant (Appellant) Factum, 2012 CCLTFactum 20936, at ¶14(a) (“(a) antenatal corticosteroids have never been shown to reduce the incidence or effect of PVL”); id. at ¶14(d) (“at best, even taking the Bayesian approach at face value, the use of antenatal corticosteroids showed only a 40% reduction in the incidence of cerebral palsy, but not PVL”).

How might things have gone better for Dr. Viljoen? For one thing, Platt’s concession about the correctness of Willan’s calculations had to be explained and qualified as conceding only the posterior probability on the doubtful and unproven assumptions made by Willan. Willan’s posterior, as big as it was, represented only an idealized maximal posterior probability, which in reality had to be deeply discounted by important uncertainties, biases, and validity concerns.  The inconclusiveness of the data was “provable” on either a frequentist or a Bayesian analysis.


[1] Horace, in Wood, Dictionary of Quotations 182 (1893).

[2] Goodman v. Viljoen, 2011 ONSC 821 (CanLII), aff’d, 2012 ONCA 896 (CanLII), leave appeal den’d, Supreme Court of Canada No. 35230 (July 11, 2013).

[3] Devender Roberts & Stuart R Dalziel “Antenatal corticosteroids for accelerating fetal lung maturation for women at risk of preterm birth,” Cochrane Database of Systematic Reviews, at 8, Issue 3. Art. No. CD004454 (2006)

[4] Notes of Testimony of Andrew Willan at 34 (April 9, 2010) (concluding that ACS reduces risk of CP, with a probability of 91 to 97 percent, depending upon whether random effects or fixed effect models are used).

[5] See, e.g., Olivier Baud, Laurence Foix l’Hélias, et al., “Antenatal Glucocorticoid Treatment and Cystic Periventricular Leukomalacia in Very Premature Infants,” 341 New Engl. J. Med. 1190, 1194 (1999) (“Our results suggest that exposure to betamethasone but not dexamethasone is associated with a decreased risk of cystic periventricular leukomalacia.”).

 

Goodman v Viljoen – Statistical Fallacies from Both Sides

June 8th, 2014

There was a deep irony to the Goodman[1] case.  If a drug company, in 1995, had marketed antenatal corticosteroid (ACS) for the prevention of cerebral palsy (CP) in the United States, the government might well have prosecuted the company for misbranding.  The company might also have been subject to a False Claims Act case. No clinical trial had found ACS efficacious for the prevention of CP at the significance level typically required by the FDA; no meta-analysis had found ACS statistically significantly better than placebo for this purpose.  In the Goodman case, however, failure to order a full course of ACS was malpractice with respect to the claimed causation of CP in the Goodman twins.

The Goodman case also occasioned a well-worn debate over the difference between scientific and legal evidence, inference, and standards of “proof.” The plaintiffs’ case rested upon a Cochrane review of ACS with respect to various outcomes. For CP, the Cochrane meta-analyzed only clinical trial data, and reported:

“a trend towards fewer children having cerebral palsy (RR 0.60, 95% CI 0.34 to 1.03, five studies, 904 children, age at follow up two to six years in four studies, and unknown in one study).”[2]

The defendant, Dr. Viljoen, appeared to argue that the Cochrane meta-analysis must be disregarded because it did not provide a showing of efficacy for ACS in preventing CP, at a significance probability less than 5 percent.  Here is the trial court’s characterization of Dr. Viljoen’s argument:

“[192] The argument that the Cochrane data concerning the effects of ACS on CP must be ignored because it fails to reach statistical significance rests on the flawed premise that legal causation requires the same standard of proof as medical/scientific causation. This is of course not the case; the two standards are in fact quite different. The law is clear that scientific certainty is not required to prove causation to the legal standard of proof on a balance of probabilities (See: Snell v. Farrell, [1990] 2 S.C.R. 311, at para. 34). Accordingly, the defendant’s argument in this regard must fail and for the purposes of this court, I accept the finding of the Cochrane analysis that ACS reduces the instance [sic] of CP by 40%.”

“Disregard” seems extreme for a meta-analysis that showed a 40% reduction in risk of a serious central nervous system disorder, with p = 0.065.  Perhaps Dr. Viljoen might have tempered his challenge some by arguing that the Cochrane analysis was insufficient.  One problem with Dr. Viljoen’s strident argument about statistical significance was that it overshadowed the more difficult, qualitative arguments about threats to validity in the Cochrane finding from loss to follow up in the aggregated trial data. These threats were probably stronger arguments against accepting the Cochrane “trend” as a causal conclusion. Indeed, the validity of the individual studies and of the meta-analysis, along with questions about the accuracy of the data, were not reflected in the Bayesian analysis.

Another problem is that Dr. Viljoen’s strident assertion that p < 0.05 was absolutely necessary fed plaintiffs’ argument that the defendant was attempting to change the burden of proof for plaintiffs from greater than 50% to 95% or greater.  Given the defendant’s position, great care was required to prevent the trial court from committing the transposition fallacy.

Justice Walters rejected the suggestion that a meta-analysis with a p-value of 6.5% should be disregarded, but the court’s discussion skirts the question whether and how the Cochrane data can be sufficient to support a conclusion of ACS efficacy. Aside from citing a legal case, however, Justice Walters provided no basis for suggesting that the scientific standard of proof was different from the legal standard. From the trial court’s opinion, the parties or their expert witnesses appeared to conflate “confidence,” a technical term when used to describe intervals or random error around sample statistics, with “level of certainty” in the obtained result.

Justice Walters is certainly not the first judge to fall prey to the fallacious argument that the scientific burden of proof is 95%.[3]  The 95% is, of course, the coefficient of confidence for the confidence interval that is based upon a p-value of 5%. No other explanation for why 95% is a “scientific” standard of proof was offered in Goodman; nor is it likely that anyone could point to an authoritative source for the claim that scientists actually adjudge facts and theories by this 95 percent probability level.

Justice Walters’ confusion stemmed from the transposition fallacy, which confuses posterior and significance probabilities.  Here is a sampling from Her Honor’s opinion, first from Dr. Jon Barrett, one of the plaintiffs’ expert witnesses, an obstetrician and fetal maternal medicine specialist at Sunnybrook Hospital, in Toronto, Ontario:

“[85] Dr. Barrett’s opinion was not undermined during his lengthy cross-examination. He acknowledged that the scientific standard demands 95% certainty. He is, however, prepared to accept a lower degree of certainty. To him, 85 % is not merely a chance outcome.

                                                                                        * * *

[87] He acknowledged that scientific evidence in support of the use of corticosteroids has never shown statistical significance with respect to CP. However, he explained it is very close at 93.5%. He cautioned that if you use a black and white outlook and ignore the obvious trends, you will falsely come to the conclusion that there is no effect.”

Dr. Jon (Yoseph) Barrett is a well-respected physician, who specializes in high-risk pregnancies, but his characterization of a black-white outlook on significance testing as leading to a false conclusion of no effect was statistically doubtful.[4]  Dr. Barrett may have to make divinely inspired choices in surgery, but in a courtroom, expert witnesses are permitted to say that they just do not know. Failure to achieve statistical significance, with p < 0.05, does not support a conclusion that there is no effect.

Professor Andrew Willan was plaintiffs’ testifying expert witness on statistics.  Here is how Justice Walters summarized Willan’s testimony:

“[125] Dr. Willan described different statistical approaches and in particular, the frequentist or classical approach and the Bayesian approach which differ in their respective definitions of probability. Simply, the classical approach allows you to test the hypothesis that there is no difference between the treatment and a placebo. Assuming that there is no difference, allows one to make statements about the probability that the results are not due to chance alone.

To reach statistical significance, a standard of 95% is required. A new treatment will not be adopted into practice unless there is less than a 5% chance that the results are due to chance alone (rather than due to true treatment effect).

[127] * * * The P value represents the frequentist term of probability. For the CP analysis [from the Cochrane meta-analysis], the P value is 0.065. From a statistical perspective, that means that there is a 6.5% chance that the differences that are being observed between the treatment arm versus the non-treatment arm are due to chance rather than the treatment, or conversely, a 93.5% chance that they are not.”

Justice Walters did not provide transcript references for these statements, but they are clear examples of the transposition fallacy. The court’s summary may have been unfair to Professor Willan, who seems to have taken care to avoid the transposition fallacy in his testimony:

“And I just want to draw your attention to the thing in parenthesis where it says, “P = 0.065.” So, basically that is the probability of observing data this extremely, this much in favor of ACS given, if, if in fact the no [sic, null] hypothesis was true. So, if, if the no hypothesis was true, that is there was no difference, then the probability of observing this data is only 6.5 percent.”

Notes of Testimony of Andrew Willan at 26 (April , 2010). In this quote, Professor Willan might have been more careful to point out that the significance probability of 6.5%  is a cumulative probability by describing the data observed “this extremely” and more. Nevertheless, Willan certainly made clear that the probability measure was based upon assuming the correctness of the null hypothesis. The trial court, alas, erred in stating the relevant statistical concepts.

And then there was Justice Walters’ bizarre description of the Cochrane data as embodying a near-uniform distribution:

“[190] * * * The Cochrane analysis found that ACS reduced the risk of CP (in its entirety) by 40%, 93.5% of the time.”

The trial court did not give the basis for this erroneous description of the Cochrane ACS/CP data.[5] To be sure, if the Cochrane result were true, then 40% reduction might be the expected value for all trials, but it would be a remarkable occurrence for 93.5% of the trials to obtain the same risk ratio as the one observed in the meta-analysis.

The defendant’s expert witness on statistical issues, Prof. Robert Platt, similarly testified that the significance probability reported by the Cochrane was dependent upon an assumption of the null hypothesis of no association:

“What statistical significance tells us, and I mentioned at the beginning that it refers to the probability of a chance finding could occur under the null-hypothesis of no effect. Essentially, it provides evidence in favour of there being an effect.  It doesn’t tell us anything about the magnitude of that effect.”

Notes of Testimony of Robert Platt at 11 (April 19, 2010)

Perhaps part of the confusion resulted from Prof. Willan’s sponsored Bayesian analysis, which led him to opine that the Cochrane data permitted him to state that there was a 91 to 97 percent probability of an effect, which might have appeared to the trial court to be saying the same thing as interpretation of the Cochrane’s p-value of 6.5%.  Indeed, Justice Walters may have had some assistance in this confusion from the defense statistical expert witness, Prof. Platt, who testified:

“From the inference perspective the p-value of 0.065 that we observe in the Cochrane review versus a 91 to 97 percent probability that there is an effect, those amount to the same thing.”

Notes of Testimony of Robert Platt at 50 (April 19, 2010).  Now the complement of the p-value, 93.5%, may have fallen within the range of posterior probabilities asserted by Professor Willan, but these probabilities are decidedly not the same thing.
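
The gap between a significance probability and a posterior probability can be made concrete with a short calculation. The sketch below is mine; it is not a reconstruction of Professor Willan’s analysis, whose prior was never developed in the testimony. It assumes a normal approximation on the log risk-ratio scale, backs the standard error out of the Cochrane interval (0.34 to 1.03), and compares a flat prior with a hypothetical skeptical prior, centered on no effect, with a standard deviation of 0.2 chosen purely for illustration.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Figures reported in the Cochrane review for cerebral palsy.
RR, CI_LOW, CI_HIGH = 0.60, 0.34, 1.03
Z_95 = 1.959964  # two-sided 95% critical value

log_rr = math.log(RR)
# Assumption: the reported interval is symmetric on the log scale.
se = (math.log(CI_HIGH) - math.log(CI_LOW)) / (2.0 * Z_95)

def two_sided_p(estimate, std_err):
    """Significance probability, computed assuming the null (log RR = 0)."""
    return 2.0 * (1.0 - normal_cdf(abs(estimate / std_err)))

def posterior_prob_benefit(estimate, std_err, prior_sd=None, prior_mean=0.0):
    """P(log RR < 0 | data) under a normal prior; prior_sd=None means flat."""
    if prior_sd is None:
        post_mean, post_sd = estimate, std_err
    else:
        # Conjugate normal update: precision-weighted average of prior and data.
        w_data, w_prior = 1.0 / std_err**2, 1.0 / prior_sd**2
        post_var = 1.0 / (w_data + w_prior)
        post_mean = post_var * (w_data * estimate + w_prior * prior_mean)
        post_sd = math.sqrt(post_var)
    return normal_cdf((0.0 - post_mean) / post_sd)

p = two_sided_p(log_rr, se)
flat = posterior_prob_benefit(log_rr, se)
skeptical = posterior_prob_benefit(log_rr, se, prior_sd=0.2)  # hypothetical prior

print(f"two-sided p-value:            {p:.3f}")
print(f"posterior, flat prior:        {flat:.3f}")
print(f"posterior, skeptical prior:   {skeptical:.3f}")
```

Under these assumptions, even the flat-prior posterior equals one minus half the two-sided p-value, not one minus the p-value, and the posterior falls as the prior grows more skeptical. The p-value does not move at all when the prior changes, which is why the two quantities cannot “amount to the same thing.”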

Perhaps Prof. Platt was referring only to the numerical equivalence, but his language, “the same thing,” certainly could have bred misunderstanding.  The defense apparently attacked the reliability of the Bayesian analysis before trial, only to abandon the challenge by the time of trial.  At trial, defense expert witness Prof. Platt testified that he did not challenge Willan’s Bayesian analysis, or the computation of posterior probabilities.  Platt’s acquiescence in Willan’s Bayesian analysis is unfortunate because the parties never developed testimony exactly as to how Willan arrived at his posterior probabilities, and especially as to what prior probability he employed.

Professor Platt went on to qualify his understanding of Willan’s Bayesian analysis as providing a posterior probability that there is an effect, or in other words, that the “effect size” is greater than 1.0.  At trial, the parties spent a good deal of time showing that the Cochrane risk ratio of 0.6 represented the decreased risk for CP of administering a full course of ACS, and that this statistic could be presented as an increased CP risk ratio of 1.7, for not having administered a full course of ACS.  Platt and Willan appeared to agree that the posterior probability described the cumulative posterior probabilities for increased risks above 1.0.

“[T]he 91% is a probability that the effect is greater than 1.0, not that it is 1.7 relative risk.”

Notes of Testimony of Robert Platt at 51 (April 19, 2010); see also Notes of Testimony of Andrew Willan at 34 (April 9, 2010) (concluding that ACS reduces risk of CP, with a probability of 91 to 97 percent, depending upon whether random effects or fixed effect models are used).[6]

One point on which the parties’ expert witnesses did not agree was whether the failure of the Cochrane’s meta-analysis to achieve statistical significance was due solely to the sparse data aggregated from the randomized trials. Plaintiffs’ witnesses appeared to have testified that had the Cochrane been able to aggregate additional clinical trial data, the “effect size” would have remained constant, and the p-value would have shrunk, ultimately to below the level of 5 percent.  Prof. Platt, testifying for the defense, appropriately criticized this hand-waving excuse:

“Q. and the probability factor, the P value, was 0.065, which the previous witness had suggested is an increase in probability of our reliability on the underlying data.  Is it reasonable to assume that this data that a further increase in the sample size will achieve statistical significance?

A. No, that’s not a reasonable assumption….”

Notes of Testimony of Robert Platt at 29 (April 19, 2010).

Positions on Appeal

Dr. Viljoen continued to assert the need for significance on appeal. As appellant, he challenged the trial court’s finding that the Cochrane review concluded that there was a 40% risk reduction. See Goodman v. Viljoen, 2011 ONSC 821, at ¶192 (CanLII) (“I accept the finding of the Cochrane analysis that ACS reduces the instance of CP by 40%”). Dr. Viljoen correctly pointed out that the Cochrane review never reached such a conclusion. Appellant’s Factum, 2012 CCLTFactum 20936, ¶64.  It was the plaintiffs’ expert witnesses, not the Cochrane reviewers, who reached the conclusion of causality from the Cochrane data.

On appeal, Dr. Viljoen pressed the point that his expert witnesses had described statistical significance in the Cochrane analysis as “a basic and universally accepted standard” for showing that ACS was efficacious in preventing CP or PVL. Id. at ¶40. The appellant’s brief then commits the very error that Dr. Barrett complained would follow from a finding that did not have statistical significance; Dr. Viljoen maintained that the “trend” of reduced CP rates from ACS administration “is the same as a chance occurrence.” Defendant (Appellant), 2012 CCLTFactum 20936, at ¶40; see also id. at ¶14(e) (arguing that the Cochrane result for ACS/CP “should be treated as pure chance given it was not a statistically significant difference”).

Relying upon the Daubert decision from the United States, as well as Canadian cases, Dr. Viljoen framed one of his appellate issues as whether the trial court had “erred in relying upon scientific evidence that had not satisfied the benchmark of statistical significance”:

“101. Where a scientific effect is not shown to a level of statistical significance, it is not proven. No study has demonstrated a reduction in cerebral palsy with antenatal corticosteroids at a level of statistical significance.

102. The Trial Judge erred in law in accepting that antenatal corticosteroids reduce the risk of cerebral palsy based on Dr. Willan’s unpublished Bayesian probability analysis of the 48 cases of cerebral palsy reviewed by Cochrane—an analysis prepared for the specific purpose of overcoming the statistical limitations faced by the Plaintiffs on causation.”

Defendant (Appellant), 2012 CCLTFactum 20936. The use of the verb “proven” is problematic because it suggests a mathematical demonstration, which is never available for empirical propositions about the world, and especially not for the biological world.  The use of a mathematical standard begs the question whether the Cochrane data were sufficient to establish a scientific conclusion of the efficacy of ACS in preventing CP.

In opposing Dr. Viljoen’s appeal, the plaintiffs capitalized upon his assertion that science requires a very high level of posterior probability for establishing a causal claim, by simply agreeing with it. See Plaintiffs’ (Respondents’) Factum,  2012 CCLTFactum 20937, at ¶31 (“The scientific method requires statistical significance at a 95% level.”).  By accepting the idealized notion that science somehow requires 95% certainty (as opposed to 95% confidence levels as a test for assessing random error), the plaintiffs made the defendant’s legal position untenable.

In order to keep the appellate court thinking that the defendant was imposing an extra-legal, higher burden of proof upon plaintiffs, the plaintiffs went so far as to misrepresent the testimony of their own expert witness, Professor Willan, as having committed the transposition fallacy:

“49. Dr. Willan provided the frequentist explanation of the Cochrane analysis on CP:

a. The risk ratio (RR) is .060 which means that there is a 40% risk reduction in cerebral palsy where there has been administration of antenatal corticosteroids;

b. The upper limit of the confidence interval (CI) barely crosses 1 so it just barely fails to meet the rigid test of statistical significance;

c. The p value represents the frequentist term of probability;

d. In this case the p value is .065;

e. From a statistical perspective that means that there is a 6.5% chance that the difference observed in CP rates is due to chance alone;

f. Conversely there is a 93.5% chance that the result (the 40% reduction in CP) is due to a true treatment effect of ACS.”

2012 CCLTFactum 20937, at ¶49 (citing Evidence of Dr. Willan, Respondents’ Compendium, Tab 4, pgs. 43-52).

Although Justice Doherty dissented from the affirmance of the trial court’s judgment, he succumbed to the parties’ misrepresentations about scientific certainty, and their prevalent commission of the transposition fallacy. Goodman v. Viljoen, 2012 ONCA 896 (CanLII) at ¶36 (“Scientists will draw a cause and effect relationship only when a result follows at least 95 per cent of the time. The results reported in the Cochrane analysis fell just below that standard.”), leave appeal den’d, Supreme Court of Canada No. 35230 (July 11, 2013).

The statistical errors on both sides redounded to the benefit of the plaintiffs.


[1] Goodman v. Viljoen, 2011 ONSC 821 (CanLII), aff’d, 2012 ONCA 896 (CanLII), leave appeal den’d, Supreme Court of Canada No. 35230 (July 11, 2013).

[2] Devender Roberts & Stuart R Dalziel “Antenatal corticosteroids for accelerating fetal lung maturation for women at risk of preterm birth,” Cochrane Database of Systematic Reviews, at 8, Issue 3. Art. No. CD004454 (2006).

[3] See, e.g., In re Ephedra Prods. Liab. Litig., 393 F.Supp. 2d 181, 191, 193 (S.D.N.Y. 2005) (fallaciously arguing that the use of a critical value of less than 5% of significance probability increased the “more likely than not” burden of proof upon a civil litigant).  Id. at 188, 193.  See also Michael O. Finkelstein, Basic Concepts of Probability and Statistics in the Law 65 (2009) (criticizing the Ephedra decision for confusing posterior probability with significance probability).

[4] I do not have the complete transcript of Dr. Barrett’s testimony, but the following excerpt from April 9, 2010, at page 100, suggests that he helped lead Justice Walters into error: “When you say statistical significance, if you say that something is statistically significance, it means you’re, for the scientific notation, 95 percent sure. That’s the standard we use, 95 percent sure that that result could not have happened by chance. There’s still a 5 percent chance it could. It doesn’t mean for sure, but 95 percent you’re sure that the result you’ve got didn’t happen by chance.”

[5] On appeal, the dissenting judge erroneously accepted Justice Walters’ description of the Cochrane review as having supposedly reported a 40% reduction in CP incidence, 93.5% of the time, from use of ACS. Goodman v. Viljoen, 2012 ONCA 896 (CanLII) at ¶36, leave appeal den’d, Supreme Court of Canada No. 35230 (July 11, 2013).

[6] The Bayesian analysis did not cure the attributability problem with respect to specific causation.

 

Goodman v Viljoen – Subterfuge to Circumvent Relative Risks Less Than 2

June 6th, 2014

Back in March, I wrote about a “Black Swan” case, in which litigants advanced a Bayesian analysis to support their claims. Goodman v. Viljoen, 2011 ONSC 821 (CanLII), aff’d, 2012 ONCA 896 (CanLII), leave appeal den’d, Supreme Court of Canada No. 35230 (July 11, 2013).

Goodman was a complex medical malpractice case in which Mrs. Goodman alleged that her obstetrician, Dr. Johan Viljoen, deviated from the standard of care by failing to prescribe antenatal corticosteroids (ACS) sufficiently in advance of delivery to reduce the risks attendant upon early delivery of her twin boys. Both boys developed cerebral palsy (CP). The parties and their experts agreed that the administration of ACS reduced the risks of respiratory distress and other complications of pre-term birth, but they disputed the efficacy of ACS to avoid or diminish the risk of CP.

According to the plaintiffs, ACS would have, more probably than not, prevented the twins from developing cerebral palsy, or would have diminished the severity of their condition.  Dr. Viljoen disputed both general and specific causation. Evidence of general causation came from both randomized clinical trials (RCTs) and observational studies.

Limitations Issue

There were many peculiar aspects to the Goodman case, not the least of which was that the twins sued Dr. Viljoen over a decade after they were born.  Dr. Viljoen had moved his practice in the passage of time, and he was unable to produce crucial records that supported his account of how his staff responded to Mrs. Goodman’s telephone call about signs and symptoms of labor. The prejudice to Dr. Viljoen illustrates the harshness of broad tolling statutes, the unfairness of which could be reduced by requiring infant plaintiffs to give notice of their intent to sue, even if they wait until the age of majority before filing their complaints.

State of the Art Issue

Dr. Viljoen suffered perhaps a more serious prejudice in the form of hindsight bias that resulted from the evaluation of his professional conduct by evidence that was unavailable when the twins were born in 1995. The following roughly contemporaneous statement from the New England Journal of Medicine is typical of serious thinking at the time of the alleged malpractice:

“Antenatal glucocorticoid therapy decreases the incidence of several complications among very premature infants. However, its effect on the occurrence of cystic periventricular leukomalacia, a major cause of cerebral palsy, remains unknown.”

Olivier Baud, Laurence Foix l’Hélias, et al., “Antenatal Glucocorticoid Treatment and Cystic Periventricular Leukomalacia in Very Premature Infants,” 341 New Engl. J. Med. 1190, 1190 (1999) (emphasis added). The findings of this observational study illustrate some of the difficulties with the claim that Dr. Viljoen failed to prevent an avoidable consequence of pre-term delivery:

“Our results suggest that exposure to betamethasone but not dexamethasone is associated with a decreased risk of cystic periventricular leukomalacia.”

Id. at 1194. Results varied among various corticosteroids, among doses, among timing regimens.  There hardly seemed enough data in 1995 to dictate a standard of care.

Meta-Analysis Issues

Over ten years after the Goodman twins were born, the Cochrane collaboration published a meta-analysis that was primarily concerned with the efficacy of ACS for lung maturation. Devender Roberts & Stuart R Dalziel “Antenatal corticosteroids for accelerating fetal lung maturation for women at risk of preterm birth,” Cochrane Database of Systematic Reviews Issue 3. Art. No. CD004454 (2006). The trials included mostly post-dated the birth of the twins, and the alleged malpractice. The relevance of the trials to address the causation of CP in infants who experienced periventricular leukomalacia (PVL) was hotly disputed, but for now, I will gloss over the external validity problem of the Cochrane meta-analysis.

The Cochrane Collaboration usually limits its meta-analyses to the highest quality evidence, or RCTs, but in this instance, the RCTs did not include CP among their primary pre-specified outcomes. Furthermore, the trials were generally designed to ascertain short-term benefits from ACS, and the data in the trials were uncertain with respect to longer-term outcomes, which may have been ascertained differentially. The trials were also generally small and were plagued by sparse data.  None of the individual trials was itself statistically significant at the 5 percent level.  The meta-analysis did not show a statistically significant decrease in CP from ACS treatment.  The authors reported:

“a trend towards fewer children having cerebral palsy (RR 0.60, 95% CI 0.34 to 1.03, five studies, 904 children, age at follow up two to six years in four studies, and unknown in one study).”

 Id. at 8 (emphasis added).

The Cochrane authors were appropriately cautious in interpreting the sparse data:

“Results suggest that antenatal corticosteroids result in less neurodevelopmental delay and possibly less cerebral palsy in childhood.”

Id. at 13-14 (emphasis added).

The quality of the trials included in the Cochrane meta-analysis varied, as did the trial methodologies.  Despite the strong clinical heterogeneity, the Cochrane authors performed their meta-analysis with a fixed-effect model. The confidence interval, which included 1.0, reflected a p-value of 0.065, but that p-value would have certainly increased if a more appropriate random-effects model had been used.
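
The link between the reported interval and the reported p-value can be checked by hand. The sketch below assumes a normal approximation on the log risk-ratio scale and an interval symmetric on that scale; the function name is mine, and the Cochrane reviewers may well have computed their figure differently.

```python
import math

def p_value_from_ratio_ci(ratio, lower, upper, z_crit=1.959964):
    """Approximate two-sided p-value for a ratio measure from its 95% CI.

    Assumes the interval was built as exp(log(ratio) +/- z_crit * SE).
    """
    std_err = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(ratio) / std_err
    # Two-sided tail area of the standard normal beyond |z|.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return std_err, z, p

# Cochrane figures for cerebral palsy: RR 0.60, 95% CI 0.34 to 1.03.
std_err, z, p = p_value_from_ratio_ci(0.60, 0.34, 1.03)
print(f"SE of log RR = {std_err:.3f}, z = {z:.2f}, two-sided p = {p:.3f}")
```

The result lands in the neighborhood of the reported 0.065. Small differences are to be expected from rounding in the published interval and from whatever method the reviewers actually used. Either way, an interval that reaches past 1.0 goes hand in hand with a p-value above 0.05.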

Furthermore, the RCTs were often no better than observational studies on the CP outcome. The RCTs here perhaps should not have been relied upon to the apparent exclusion of observational epidemiology.

Relative Risk Less Than Two

There is much to be said about the handling of statistical significance, the Bayesian analysis, the arguments about causal inference, but for now, let us look at one of the clearest errors in the case:  the inference of specific causation from a relative risk less than two.  To be sure, the Cochrane meta-analysis reported a non-statistically significant 40% decrease, but if we were to look at this outcome in terms of the increase in risk of CP from the physician’s failure to administer ACS timely, then the risk ratio would be 1.67, or a 67% increase.  On either interpretation, fewer than half the cases of CP can be attributed to the failure to administer ACS fully and timely in the case.
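
The arithmetic behind this point can be laid out in a few lines. The sketch below assumes the conventional attributable-fraction formula, (RR - 1)/RR, which is not stated in the testimony quoted here, and it takes the Cochrane point estimate at face value, ignoring its imprecision and any bias. The function names are mine.

```python
def invert_risk_ratio(rr_treated_vs_untreated):
    """Re-express a protective risk ratio as the risk ratio for non-treatment."""
    return 1.0 / rr_treated_vs_untreated

def attributable_fraction(rr_exposed_vs_unexposed):
    """Share of cases among the exposed attributable to exposure: (RR - 1) / RR.

    Assumption: the risk ratio is unbiased and applies uniformly to everyone.
    """
    if rr_exposed_vs_unexposed <= 1.0:
        return 0.0
    return (rr_exposed_vs_unexposed - 1.0) / rr_exposed_vs_unexposed

rr_acs = 0.60                            # CP risk, ACS vs. no ACS (Cochrane)
rr_no_acs = invert_risk_ratio(rr_acs)    # CP risk, no ACS vs. ACS
af = attributable_fraction(rr_no_acs)

print(f"risk ratio for withholding ACS: {rr_no_acs:.2f}")
print(f"attributable fraction:          {af:.2f}")
print(f"attributable fraction at RR=2:  {attributable_fraction(2.0):.2f}")
```

On the point estimate, roughly 40 percent of CP cases among untreated infants would be attributable to the absence of ACS, short of the 50 percent that a risk ratio of 2 would deliver. The two framings, a 40% decrease and a 67% increase, describe the same data and yield the same attributable fraction.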

The parties tried their case before Justice Walters, in St. Catharines, Ontario. Goodman v. Viljoen, 2011 ONSC 821 (CanLII).  Justice Walters recognized that specific causation was essential and at the heart of the parties’ disagreement:

“[47] In order to succeed, the plaintiffs must establish that the failure to receive a full course of ACS materially affected the twins’ outcome. That is, they must establish that “but for” the failure to receive a full course of ACS, the twins would not have suffered from the conditions they now do, or that the severity of these afflictions would have been materially reduced.

[48] Not surprisingly, this was the most contentious issue at trial and the court heard a good deal of evidence with respect to the issue of causation.”

One of the defendant’s expert witnesses, Robert Platt, a professor of statistics at McGill University School of Medicine, testified, according to Justice Walters:

“[144] Dr. Platt also stated that the absolute risk in and of itself does not tell us anything about what might have happened in a specific case absent clinical and mechanistic explanations for that specific case.”

The plaintiffs’ expert witnesses apparently conceded the point.  Professor Andrew Willan, a statistician, testifying for the plaintiffs, attempted to brush Platt’s point aside by suggesting it would render clinical research useless, but that was hardly the point.  Platt embraced clinical research for what it could show about the “averages” in a sample of the population, even if we cannot discern causal efficacy retrospectively in a specific patient:

“[133] Dr. Willan also responded to Dr. Platt’s criticism that it was impossible to determine the distribution of the effect across the population. Professor Willan felt this issue was a red herring, and if it were valid, it would render most clinical research useless. There is really no way of knowing who will benefit from a treatment and who will not. Unless there are reasons to believe otherwise, it is best to apply the population average effect to each person.”

Although Willan labeled Platt’s point as cold-blooded and fishy, he ultimately concurred that the population average effect should be applied to each person in the absence of evidence of risk being sequestered in a subgroup.

A closer look at Willan’s testimony at trial is instructive. Willan acknowledged, on direct examination, that the plaintiffs were at increased risk, even if their mother had received a full course of ACS.  All he would commit to, on behalf of the plaintiffs, was that their risk would have been less had the ACS been given earlier:

“All we can say is that there’s a high probability that that risk would be reduced and that this is probably the best estimate of the excess risk for not being treated and I would say that puts that in the 70 percent range of excess risk and I would say the probability that the risk would have been reduced is into the 90 percentage points.”

Notes of Testimony of Andrew Willan at 62 (April 6, 2010).  The 90 percentage points reference here was Willan’s posterior probability that the claimed effect was real.

On cross-examination, the defense pressed the point:

Q. What you did not do in this, in this report, is provide any quantification for the reduction in the risk, true?

A. That’s correct.

Notes of Testimony of Andrew Willan at 35 (April 9, 2010)

Q. And you stated that there is no evidence that the benefits of steroids is restricted to any particular subgroup of patients?

A. I wasn’t given any. I haven’t seen any evidence of that.

Id. at 43.

Q. And what you’re suggesting with that statement, is that the statistics should be generally, should be considered by the court to be generally applicable, true?

A. That’s correct.

Id. at 44.

Q. But given your report, you can’t offer assistance on the clinical application to the statistics, true?

A. That’s true.

Id. at 46.

With these concessions in hand, defense counsel elicited the ultimate concession relevant to the “but for” standard of causation:

Q. And to do that by looking at an increase in risk, the risk ratio from the data must achieve 2 in order for there to be a 50 percent change in the underlying data, true?

A. Yeah, to double the risk, the risk ratio would have to be 2, to double the risk.

Id. at 63.

* * *

Q. So, none of this data achieves the threshold of a 50 percent change in the underlying data, whether you look at it as an increase in risk or …

A. Sure.

Q …. a decrease in risk …

A. Yeah.

Id. at 66.

Leaping Inferences

The legal standard for causation in Canada is the same counterfactual requirement that applies in most jurisdictions in the United States.  Goodman v. Viljoen, 2011 ONSC 821 (CanLII), at ¶14, 47. The trial court well understood that the plaintiffs’ evidence left them short of showing that their CP would not have occurred but for the delay in administering ACS. Remarkably, the court permitted the plaintiffs to use non-existing evidence to bridge the gap.

According to Dr. Max Perlman, plaintiffs’ expert witness on neonatology and pediatrics, CP is not a dichotomous condition, but a spectrum that is manifested on a continuum of signs and symptoms.  The RCTs relied upon had criteria for ascertaining CP and including it as an outcome.  The result of these criteria was that CP was analyzed as a binary outcome.  Dr. Perlman, however, held forth that “common sense and clinical experience” told him that CP is not a condition that is either present or not, but rather presented on a continuum. Id. at [74].

Without any evidence, Perlman testified that when CP is not avoided by ACS, “it is likely that it is less severe for those who do go on to develop it.” Id. at [75].  Indeed, Perlman made the absence of evidence a claimed virtue; with all his experience and common sense, he “could not think of a single treatment which affects a basic biological process that has a yes or no effect; they are all on a continuum.” Id. From here, Perlman soared to his pre-specified conclusion “that it is more likely than not that the twins would have seen a material advantage had they received the optimal course of steroids.” Id. at [76].

Perlman’s testimony is remarkable for inventing a non-existing feature of biological evidence:  everything is a continuum. Justice Walters could not resist this seductive testimony:

“[195] The statistical information is but one piece of the puzzle; one way of assessing the impact of ACS on CP. Notably, the 40% reduction in CP attributable to ACS represents an all or nothing proposal. In other words, 93.5% of the time, CP is reduced in its entirety by 40%. It was the evidence of Dr. Perlman, which I accept, that CP is not a black and white condition, and, like all biological processes, it can be scaled on a continuum of severity. It therefore follows that in those cases where CP is not reduced in its entirety, it is likely to be less severe for those who go on to develop it. Such cases are not reflected in the Cochrane figure.

[196] Since the figure of 40% represents an all or nothing proposal, it does not accurately reflect the total impact of ACS on CP. Based on this evidence, it is a logical  conclusion that if one were able to measure the total effect of ACS on CP, the statistical measure of that effect would be inflated beyond 40%.

[197] Unfortunately, this common sense conclusion has never and can never be tested by science. As Dr. Perlman testified, such a study would be impossible to conduct because it would require pre-identification of those persons who go on to develop CP.  Furthermore, because the short term benefits of ACS are now widely accepted, it would be unethical to withhold steroids to conduct further studies on long term outcomes.”

Doubly unfortunate, because Perlman’s argument was premised on a counterfactual assumption.  Many biological phenomena are dichotomous.  Pregnancy, for instance, does not admit of degrees.  Disease states are frequently dichotomous, and no evidence was presented that CP was not dichotomous. Threshold effects abound in living organisms. Perlman’s argument further falls apart when we consider that the non-experimental arm of the RCTs would also have had additional “less-severe” CP cases, with no evidence that they occurred disproportionately in the control arms of these RCTs. Furthermore, high-quality observational studies might have greater validity than post-hoc RCTs in this area, and there have been, and likely will continue to be, such studies to attempt better understanding of the efficacy of ACS, as well as differing effects among the various corticosteroids, doses, and patterns of administration.

On appeal, Justice Walters’ verdict for plaintiffs was affirmed, but over a careful, thoughtful dissent. Goodman v. Viljoen, 2012 ONCA 896 (CanLII) (Doherty, J., dissenting). Justice Doherty caught the ultimate futility of Dr. Perlman’s opinion based upon non-existent evidence: even if there were additional sub-CP cases in the treatment arms of the RCTs, and if they occurred disproportionately more often in the treatment than in the placebo arms, we are still left guessing about the quantitative adjustment to make to the 40% decrease, doubtful as it was, which came from the Cochrane review.