TORTINI

For your delectation and delight, desultory dicta on the law of delicts.

Demonstration of Frye Gatekeeping in Pennsylvania Birth Defects Case

October 6th, 2015

Michael D. Freeman is a chiropractor and self-styled “forensic epidemiologist,” affiliated with Departments of Public Health & Preventive Medicine and Psychiatry, Oregon Health & Science University School of Medicine, in Portland, Oregon. His C.V. can be found here. Freeman has an interesting publication in press on his views of forensic epidemiology. Michael D. Freeman & Maurice Zeegers, “Principles and applications of forensic epidemiology in the medico-legal setting,” Law, Probability and Risk (2015); doi:10.1093/lpr/mgv010. Freeman’s views on epidemiology did not, however, pass muster in the courtroom. Porter v. Smithkline Beecham Corp., Phila. Cty. Ct. C.P., Sept. Term 2007, No. 03275. Slip op. (Oct. 5, 2015).

In Porter, plaintiffs sued Pfizer, the manufacturer of the SSRI antidepressant Zoloft. Plaintiffs claimed the mother plaintiff’s use of Zoloft during pregnancy caused her child to be born with omphalocele, a serious defect that occurs when the child’s intestines develop outside his body. Pfizer moved to exclude plaintiffs’ medical causation expert witnesses, Dr. Cabrera and Dr. Freeman. The trial judge was the Hon. Mark I. Bernstein, who has written and presented frequently on expert witness evidence.[1] Judge Bernstein held a two-day hearing in September 2015, and last week, His Honor ruled that the plaintiffs’ expert witnesses failed to meet Pennsylvania’s Frye standard for admissibility. Judge Bernstein’s opinion reads a bit like a Berenstain Bear book on how not to use epidemiology.

GENERAL CAUSATION SCREW UPS

Proper Epidemiologic Method

First, Find An Association

Dr. Freeman had a methodologic map that included the Bradford Hill criteria at the back end of the procedure. Dr. Freeman, however, impetuously forgot that before you get to the back end, you must traverse the front end:

“Dr. Freemen agrees that he must, and claims he has, applied the Bradford Hill Criteria to support his opinion. However, the starting procedure of any Bradford-Hill analysis is ‘an association between two variables’ that is ‘perfectly clear-cut and beyond what we would care to attribute to the play of chance’.35 Dr. Freeman testified that generally accepted methodology requires a determination, first, that there’s evidence of an association and, second, whether chance, bias and confounding have been accounted for, before application of the Bradford-Hill criteria.36 Because no such association has been properly demonstrated, the Bradford Hill criteria could not have been properly applied.”

Slip op. at 12-13. In other words, don’t go rushing to the Bradford Hill factors until and unless you have first shown an association; second, you have shown that it is “clear cut,” and not likely the result of bias or confounding; and third, you have ruled out the play of chance or random variability in explaining the difference between the observed and expected rates of disease.

Proper epidemiologic method requires surveying the pertinent published studies that investigate whether there is an association between the medication use and the claimed harm. The expert witnesses must, however, do more than write a bibliography; they must assess any putative associations for “chance, confounding or bias”:

“Proper epidemiological methodology begins with published study results which demonstrate an association between a drug and an unfortunate effect. Once an association has been found, a judgment as whether a real causal relationship between exposure to a drug and a particular birth defect really exists must be made. This judgment requires a critical analysis of the relevant literature applying proper epidemiologic principles and methods. It must be determined whether the observed results are due to a real association or merely the result of chance. Appropriate scientific studies must be analyzed for the possibility that the apparent associations were the result of chance, confounding or bias. It must also be considered whether the results have been replicated.”

Slip op. at 7.

Then Rule Out Chance

So if there is something that appears to be an association in a study, the expert epidemiologist must assess whether it is likely consistent with a chance association. If we flip a fair coin 10 times, we “expect” 5 heads and 5 tails, but actually the probability of not getting the expected result is about three times greater than that of obtaining the expected result. If on one series of 10 tosses we obtain 6 heads and 4 tails, we would certainly not reject a starting assumption that the expected outcome was 5 heads / 5 tails. Indeed, the probability of obtaining 6 heads / 4 tails or 4 heads / 6 tails (about 41%) is roughly 1.7 times the probability of obtaining the expected outcome of an equal number of heads and tails (about 25%).
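The coin-toss arithmetic can be checked directly from the binomial distribution. The following is a minimal sketch, assuming a fair coin and independent tosses; the function and variable names are illustrative only.

```python
from math import comb

def prob_heads(k, n=10):
    """Probability of exactly k heads in n tosses of a fair coin."""
    return comb(n, k) / 2 ** n

# The "expected" outcome: exactly 5 heads and 5 tails.
p_expected = prob_heads(5)

# Any outcome other than the 5/5 split.
p_not_expected = 1 - p_expected

# A 6/4 split in either direction (6 heads or 4 heads).
p_six_four = prob_heads(6) + prob_heads(4)

print(f"P(5 heads / 5 tails)      = {p_expected:.4f}")
print(f"P(anything else)          = {p_not_expected:.4f}")
print(f"P(6/4 split, either way)  = {p_six_four:.4f}")
print(f"ratio, not-expected : expected = {p_not_expected / p_expected:.2f}")
print(f"ratio, 6/4 split : expected    = {p_six_four / p_expected:.2f}")
```

The point of the exercise is only that modest departures from the "expected" result are the norm, not the exception, in small samples.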

As it turned out in the Porter case, Dr. Freeman relied rather heavily upon one study, the Louik study, for his claim that Zoloft causes the birth defect in question. See Carol Louik, Angela E. Lin, Martha M. Werler, Sonia Hernández-Díaz, and Allen A. Mitchell, “First-Trimester Use of Selective Serotonin-Reuptake Inhibitors and the Risk of Birth Defects,” 356 New Engl. J. Med. 2675 (2007). The authors of the Louik study were quite clear that they were not able to rule out chance as a sufficient explanation for the observed data in their study:

“The previously unreported associations we identified warrant particularly cautious interpretation. In the absence of preexisting hypotheses and the presence of multiple comparisons, distinguishing random variation from true elevations in risk is difficult. Despite the large size of our study overall, we had limited numbers to evaluate associations between rare outcomes and rare exposures. We included results based on small numbers of exposed subjects in order to allow other researchers to compare their observations with ours, but we caution that these estimates should not be interpreted as strong evidence of increased risks.24”

Slip op. at 10 (quoting from Louik study).

Judge Bernstein thus criticized Freeman for failing to account for chance in explaining his putative association between maternal Zoloft use and infant omphalocele. The appropriate and generally accepted methodology for accomplishing this step of evaluating a putative association is to consider whether the association is statistically significant at the conventional level.

In relying heavily upon the Louik study, Dr. Freeman opened himself up to serious methodological criticism. Judge Bernstein’s opinion stands for the important proposition that courts should not be unduly impressed with nominal statistical significance in the presence of multiple comparisons and very broad confidence intervals:

“The Louik study is the only study to report a statistically significant association between Zoloft and omphalocele. Louik’s confidence interval which ranges between 1.6 and 20.7 is exceptionally broad. … The Louik study had only 3 exposed subjects who developed omphalocele thus limiting its statistical power. Studies that rely on a very small number of cases can present a random statistically unstable clustering pattern that may not replicate the reality of a larger population. The Louik authors were unable to rule out confounding or chance. The results have never been replicated concerning omphalocele. Dr. Freeman’s testimony does not explain, or seemingly even consider these serious limitations.”

Slip op. at 8. Statistical precision in the point estimate of risk, which includes assessing the outcome in the context of whether the authors conducted multiple comparisons, and whether the observed confidence intervals were very broad, is part of the generally accepted epidemiologic methodology, which Freeman flouted:

“Generally accepted methodology considers statistically significant replication of study results in different populations because apparent associations may reflect flaws in methodology.”

Slip op. at 9. The studies that Freeman cited and apparently relied upon failed to report statistically significant associations between sertraline (Zoloft) and omphalocele. Judge Bernstein found this lack to be a serious problem for Freeman and his epidemiologic opinion:

“While non-significant results can be of some use, despite a multitude of subsequent studies which isolated omphalocele, there is no study which replicates or supports Dr. Freeman’s conclusions.”

Slip op. at 10. The lack of statistical significance, in the context of repeated attempts to find it, helped sink Freeman’s proffered testimony.
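The multiple-comparisons concern that the Louik authors themselves raised can be illustrated with a short calculation. This is a rough sketch that assumes every comparison is independent and that no true association exists in any of them; the numbers of comparisons shown are hypothetical and are not taken from the Louik study.

```python
def familywise_error_rate(num_tests, alpha=0.05):
    """Chance of at least one nominally 'significant' result among
    num_tests independent comparisons when every null hypothesis is true."""
    return 1 - (1 - alpha) ** num_tests

# Hypothetical numbers of comparisons, chosen only for illustration.
for num_tests in (1, 10, 20, 100):
    rate = familywise_error_rate(num_tests)
    print(f"{num_tests:>3} comparisons -> P(at least one false positive) = {rate:.2f}")
```

Under these simplifying assumptions, a single nominally significant finding among many comparisons is weak evidence on its own, which is why replication matters so much.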

Then Rule Out Bias and Confounding

As noted, Freeman relied heavily upon the Louik study, which was the only study to report a nominally statistically significant risk ratio for maternal Zoloft use and infant omphalocele. The Louik study, by its design, however, could not exclude chance or confounding as full explanation for the apparent association, and Judge Bernstein chastised Dr. Freeman for overselling the study as support for the plaintiffs’ causal claim:

“The Louik authors were unable to rule out confounding or chance. The results have never been replicated concerning omphalocele. Dr. Freeman’s testimony does not explain, or seemingly even consider these serious limitations.”

Slip op. at 8.

And Only Then Consider the Bradford Hill Factors

Even when an association is clear cut, and beyond what we can likely attribute to chance, generally accepted methodology requires the epidemiologist to consider the Bradford Hill factors. As Judge Bernstein explains, generally accepted methodology for assessing causality in this area requires a proper consideration of Hill’s factors before a conclusion of causation is reached:

“As the Bradford-Hill factors are properly considered, causality becomes a matter of the epidemiologist’s professional judgment.”

Slip op. at 7.

Consistency or Replication

The nine Hill factors are well known to lawyers because they have been stated and discussed extensively in Hill’s original article, and in references such as the Reference Manual on Scientific Evidence. Not all the Hill factors are equally important, or important at all, but one that is important is consistency or concordance of results among the available epidemiologic studies. Stated alternatively, a clear cut association unlikely to be explained by chance is certainly interesting and probative, but it raises an important methodological question: can the result be replicated? Judge Bernstein restated this important Hill factor as an important determinant of whether a challenged expert witness employed a generally accepted method:

“Generally accepted methodology considers statistically significant replication of study results in different populations because apparent associations may reflect flaws in methodology.”

Slip op. at 10.

“More significantly neither Reefhuis nor Alwan reported statistically significant associations between Zoloft and omphalocele. While non-significant results can be of some use, despite a multitude of subsequent studies which isolated omphalocele, there is no study which replicates or supports Dr. Freeman’s conclusions.”

Slip op. at 10.

Replication But Without Double Dipping the Data

Epidemiologic studies are sometimes updated and extended with additional follow up. An expert witness who wished to skate over the replication and consistency requirement might be tempted, as was Dr. Freeman, to treat the earlier and later iterations of the same basic study as “replication.” The Louik study was indeed updated and extended this year in a published paper by Jennita Reefhuis and colleagues.[2] Proper methodology, however, prohibits double dipping data to count the later study that subsumes the early one as a “replication”:

“Generally accepted methodology considers statistically significant replication of study results in different populations because apparent associations may reflect flaws in methodology. Dr. Freeman claims the Alwan and Reefhuis studies demonstrate replication. However, the population Alwan studied is only a subset of the Reefhuis population and therefore they are effectively the same.”

Slip op. at 10.

The Lumping Fallacy

Analyzing the health outcome of interest at the right level of specificity can sometimes be a puzzle and a challenge, but Freeman generally got it wrong by opportunistically “lumping” disparate outcomes together when it helped him get a result that he liked. Judge Bernstein admonishes:

“Proper methodology further requires that one not fall victim to the … the ‘Lumping Fallacy’. … Different birth defects should not be grouped together unless they a part of the same body system, share a common pathogenesis or there is a specific valid justification or necessity for an association20 and chance, bias, and confounding have been eliminated.”

Slip op. at 7. Dr. Freeman lumped a lot, but Judge Bernstein saw through the methodological ruse. As Judge Bernstein pointed out:

“Dr. Freeman’s analysis improperly conflates three types of data: Zoloft and omphalocele, SSRI’s generally and omphalocele, and SSRI’s and gastrointestinal and abdominal malformations.”

Slip op. at 8. Freeman’s approach, which sadly is seen frequently in pharmaceutical and other products liability cases, is methodologically improper:

“Generally accepted causation criteria must be based on the data applicable to the specific birth defect at issue. Dr. Freeman improperly lumps together disparate birth defects.”

Slip op. at 11.

Class Effect Fallacy

Another kind of improper lumping results from treating all SSRI antidepressants the same to either lump them together, or to pick and choose from among all the SSRIs, the data points that are supportive of the plaintiffs’ claims (while ignoring those SSRI data points not supportive of the claims). To be sure, the SSRI antidepressants do form a “class,” in that they all have a similar pharmacologic effect. The SSRIs, however, do not all achieve their effect in the serotonergic neurons the same way; nor do they all have the same “off-target” effects. Treating all the SSRIs as interchangeable for a claimed adverse effect, without independent support for this treatment, is known as the class effect fallacy. In Judge Bernstein’s words:

“Proper methodology further requires that one not fall victim to the ‘Class Effect Fallacy’ … . A class effect cannot be assumed. The causation conclusion must be drug specific.”

Slip op. at 7. Dr. Freeman’s analysis improperly conflated Zoloft data with SSRI data generally. Slip op. at 8. Assuming what you set out to demonstrate is, of course, a fine way to go methodologically into the ditch:

“Without significant independent scientific justification it is contrary to generally accepted methodology to assume the existence of a class effect. Dr. Freeman lumps all SSRI drug results together and assumes a class effect.”

Slip op. at 10.

SPECIFIC CAUSATION SCREW UPS

Dr. Freeman was also offered by plaintiffs to provide a specific causation opinion – that Mrs. Porter’s use of Zoloft in pregnancy caused her child’s omphalocele. Freeman claimed to have performed a differential diagnosis or etiology or something to rule out alternative causes.

Genetics

In the field of birth defects, one possible cause looming in any given case is an inherited or spontaneous genetic mutation. Freeman purported to have considered and ruled out genetic causes, which he acknowledged to make up a substantial percentage of all omphalocele cases. Bo Porter, Mrs. Porter’s son, was tested for known genetic causes, and Freeman argued that this allowed him to “rule out” genetic causes. But the current state of the art in genetic testing allowed only for identifying a small number of possible genetic causes, and Freeman failed to explain how he might have ruled out the as-of-yet unidentified genetic causes of birth defects:

“Dr. Freeman fails to properly rule out genetic causes. Dr. Freeman opines that 45-49% of omphalocele cases are due to genetic factors and that the remaining 50-55% of cases are due to non-genetic factors. Dr. Freeman relies on Bo Porter’s genetic testing which did not identify a specific genetic cause for his injury. However, minor plaintiff has not been tested for all known genetic causes. Unknown genetic causes of course cannot yet be tested. Dr. Freeman has made no analysis at all, only unwarranted assumptions.”

Slip op. at 15-16. Judge Bernstein reviewed Freeman’s attempted analysis and ruling out of potential causes, and found that it departed from the generally accepted methodology in conducting differential etiology. Slip op. at 17.

Timing Errors

One feature of putative teratogenicity is that an embryonic exposure must take place at a specific gestational developmental time in order to have its claimed deleterious effect. As Judge Bernstein pointed out, omphalocele results from an incomplete folding of the abdominal wall during the third to fifth weeks of gestation. Mrs. Porter, however, did not begin taking Zoloft until her seventh week of pregnancy, which left Dr. Freeman opinion-less as to how Zoloft contributed to the claimed causation of the minor plaintiff’s birth defect. Slip op. at 14. This aspect of Freeman’s specific causation analysis was glaringly defective, and clearly not the sort of generally accepted methodology of attributing a birth defect to a teratogen.

******************************************************

All in all, Judge Bernstein’s opinion is a tour de force demonstration of how a state court judge, in a so-called Frye jurisdiction, can show that failure to employ generally accepted methods renders an expert witness’s opinions inadmissible. There is one small problem in statistical terminology.

Statistical Power

Judge Bernstein states, at different places, that the Louik study was and was not statistically significant for Zoloft and omphalocele. The court’s opinion ultimately does explain that the nominal statistical significance was vitiated by multiple comparisons and an extremely broad confidence interval, which more than justified its statement that the study was not truly statistically significant. For some reason, however, Judge Bernstein at another point chose to explain the problem with the Louik study as a lack of statistical power:

“Equally significant is the lack of power concerning the omphalocele results. The Louik study had only 3 exposed subjects who developed omphalocele thus limiting its statistical power.”

Slip op. at 8. The adjusted odds ratio for Zoloft and omphalocele was 5.7, with a 95% confidence interval of 1.6 – 20.7. Power was not the issue because if the odds ratio were otherwise credible, free from bias, confounding, and chance, the study had the power to observe an increased risk of close to 500%, which met the pre-stated level of significance. The problem, however, was multiple testing, fragile and imprecise results, and inability to evaluate the odds ratio fully for bias and confounding.
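The imprecision of the reported estimate can be made concrete by working backward from the confidence interval. The sketch below assumes the interval is a conventional Wald-type interval that is symmetric on the log-odds scale; that is an assumption made for illustration, not a description of how the Louik authors actually computed it, and the resulting figures are approximations only.

```python
from math import erfc, log, sqrt

# Values as reported in the text above.
odds_ratio, lower, upper = 5.7, 1.6, 20.7

# 97.5th percentile of the standard normal, for a two-sided 95% interval.
z_crit = 1.959964

# Assumption: the interval is log(OR) +/- z_crit * SE on the log scale.
se_log_or = (log(upper) - log(lower)) / (2 * z_crit)

# Implied test statistic and two-sided p-value for a single comparison.
z = log(odds_ratio) / se_log_or
p_two_sided = erfc(abs(z) / sqrt(2))

# How many times larger the upper bound is than the lower bound.
ci_ratio = upper / lower

print(f"approximate SE of log(OR): {se_log_or:.2f}")
print(f"approximate z statistic:   {z:.2f}")
print(f"approximate p-value:       {p_two_sided:.4f}")
print(f"upper bound / lower bound: {ci_ratio:.1f}")
```

On these assumptions the single comparison is nominally significant, which is consistent with the point that power was not the problem; the breadth of the interval and the multiplicity of comparisons were.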


 

[1] Mark I. Bernstein, “Expert Testimony in Pennsylvania,” 68 Temple L. Rev. 699 (1995); Mark I. Bernstein, “Jury Evaluation of Expert Testimony under the Federal Rules,” 7 Drexel L. Rev. 239 (2014-2015).

[2] Jennita Reefhuis, Owen Devine, Jan M Friedman, Carol Louik, Margaret A Honein, “Specific SSRIs and birth defects: bayesian analysis to interpret new data in the context of previous reports,” 351 Brit. Med. J. (2015).

The C-8 (Perfluorooctanoic Acid) Litigation Against DuPont, part 1

September 27th, 2015

The first plaintiff has begun her trial against E.I. Du Pont De Nemours & Company (DuPont), for alleged harm from environmental exposure to perfluorooctanoic acid or its salts (PFOA). Ms. Carla Bartlett is claiming that she developed kidney cancer as a result of drinking water allegedly contaminated with PFOA by DuPont. Nicole Hong, “Chemical-Discharge Case Against DuPont Goes to Trial: Outcome could affect thousands of claims filed by other U.S. residents,” Wall St. J. (Sept. 13, 2015). The case is pending before Chief Judge Edmund A. Sargus, Jr., in the Southern District of Ohio.

PFOA is not classified as a carcinogen in the Integrated Risk Information System (IRIS), of the U.S. Environmental Protection Agency (EPA). In 2005, the EPA Office of Pollution Prevention and Toxics submitted a “Draft Risk Assessment of the Potential Human Health Effects Associated With Exposure to Perfluorooctanoic Acid and Its Salts (PFOA),” which is available at the EPA’s website. The draft report, which is based upon some epidemiology and mostly animal toxicology studies, stated that there was “suggestive evidence of carcinogenicity, but not sufficient to assess human carcinogenic potential.”

In 2013, The Health Council of the Netherlands evaluated the PFOA cancer issue, and found the data unsupportive of a causal conclusion. The Health Council of the Netherlands, “Perfluorooctanoic acid and its salts: Evaluation of the carcinogenicity and genotoxicity” (2013) (“The Committee is of the opinion that the available data on perfluorooctanoic acid and its salts are insufficient to evaluate the carcinogenic properties (category 3)”).

Last year, the World Health Organization (WHO) through its International Agency for Research on Cancer (IARC) reviewed the evidence on the alleged carcinogenicity of PFOA. The IARC, which has fostered much inflation with respect to carcinogenicity evaluations, classified PFOA as only possibly carcinogenic. See News, “Carcinogenicity of perfluorooctanoic acid, tetrafluoroethylene, dichloromethane, 1,2-dichloropropane, and 1,3-propane sultone,” 15 The Lancet Oncology 924 (2014).

Most independent reviews also find the animal and epidemiologic evidence unsupportive of a causal conclusion between PFOA and any human cancer. See, e.g., Thorsten Stahl, Daniela Mattern, and Hubertus Brunn, “Toxicology of perfluorinated compounds,” 23 Environmental Sciences Europe 38 (2011).

So you might wonder how DuPont lost its Rule 702 challenges in such a case, which it surely did. In re E. I. du Pont de Nemours & Co. C-8 Pers. Injury Litig., Civil Action 2:13-md-2433, 2015 U.S. Dist. LEXIS 98788 (S.D. Ohio July 21, 2015). That is a story for another day.

David Faigman’s Critique of G2i Inferences at Weinstein Symposium

September 25th, 2015

The DePaul Law Review’s 20th Annual Clifford Symposium on Tort Law and Social Policy is an 800-plus page tribute in honor of Judge Jack Weinstein. 64 DePaul L. Rev. (Winter 2015). There are many notable, thought-provoking articles, but my attention was commanded by the contribution on Judge Weinstein’s approach to expert witness opinion evidence. David L. Faigman & Claire Lesikar, “Organized Common Sense: Some Lessons from Judge Jack Weinstein’s Uncommonly Sensible Approach to Expert Evidence,” 64 DePaul L. Rev. 421 (2015) [cited as Faigman].

Professor Faigman praises Judge Jack Weinstein for his substantial contributions to expert witness jurisprudence, while acknowledging that Judge Weinstein has been a sometimes reluctant participant and supporter of judicial gatekeeping of expert witness testimony. Professor Faigman also uses the occasion to restate his own views about the so-called “G2i” problem, the problem of translating general knowledge that pertains to groups to individual cases. In the law of torts, the G2i problem arises from the law’s requirement that plaintiffs show that they were harmed by defendants’ products or environmental exposures. In the context of modern biological “sufficient” causal set principles, this “proof” requirement entails that the product or exposure can cause the specified harms in human beings generally (“general causation”) and that the product or exposure actually played a causal role in bringing about plaintiffs’ specific harms.

Faigman makes the helpful point that courts initially and incorrectly invoked “differential diagnosis,” as the generally accepted methodology for attributing causation. In doing so, the courts extrapolated from the general acceptance of differential diagnosis in the medical community to the courtroom testimony about etiology. The extrapolation often glossed over the methodological weaknesses of the differential approach to etiology. Not until 1995 did a court wake to the realization that what was being proffered was a “differential etiology,” and not a differential diagnosis. McCullock v. H.B. Fuller Co., 61 F.3d 1038, 1043 (2d Cir. 1995). This realization, however, did not necessarily stimulate the courts’ analytical faculties, and for the most part, they treated the methodology of specific causal attribution as generally accepted and uncontroversial. Faigman’s point that the courts need to pay attention to the methodological challenges to differential etiological analysis is well taken.

Faigman also claims, however, that in advancing “differential etiologies,” expert witnesses were inventing wholesale an approach that had no foundation or acceptance in their scientific disciplines:

 “Differential etiology is ostensibly a scientific methodology, but one not developed by, or even recognized by, physicians or scientists. As described, it is entirely logical, but has no scientific methods or principles underlying it. It is a legal invention and, as such, has analytical heft, but it is entirely bereft of empirical grounding. Courts and commentators have so far merely described the logic of differential etiology; they have yet to define what that methodology is.”

Faigman at 444.[1] Faigman is correct that courts often have left unarticulated exactly what the methodology is, but he does not quite make sense when he writes that the method of differential etiology is “entirely logical,” but has no “scientific methods or principles underlying it.” After all, Faigman starts off his essay with a quotation from Thomas Huxley that “science is nothing but trained and organized common sense.”[2] As I have written elsewhere, the form of reasoning involved in differential diagnosis is nothing other than the iterative disjunctive syllogism.[3] Either-or reasoning occurs throughout the physical and biological sciences; it is not clear why Faigman declares it un- or extra-scientific.
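The logical form of the iterative disjunctive syllogism can be sketched in a few lines. This is an illustration only, with placeholder cause names; it assumes that the starting list of candidate causes is exhaustive, and the conclusion is only as good as that assumption.

```python
def iterative_disjunctive_syllogism(candidates, ruled_out):
    """Premise: one of the candidates is the cause (A or B or C ...).
    Strike each candidate that has been ruled out, one at a time,
    and return whatever remains."""
    remaining = list(candidates)
    for cause in ruled_out:
        if cause in remaining:
            remaining.remove(cause)
    return remaining

# Placeholder candidate causes, purely hypothetical.
candidates = ["cause A", "cause B", "cause C"]

# Ruling out A and B leaves C as the conclusion.
remaining = iterative_disjunctive_syllogism(candidates, ["cause A", "cause B"])
print(remaining)

# If a candidate cannot be ruled out, no single cause is identified.
unresolved = iterative_disjunctive_syllogism(candidates, ["cause A"])
print(unresolved)
```

The reasoning is valid in form; the hard empirical work lies in justifying the initial list and each elimination.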

The strength of Faigman’s claim about the made-up nature of differential etiology appears to be undermined and contradicted by an example that he provides from clinical allergy and immunology:

“Allergists, for example, attempt to identify the etiology of allergic reactions in order to treat them (or to advise the patient to avoid what caused them), though it might still be possible to treat the allergic reactions without knowing their etiology.”

Faigman at 437. Of course, not only allergists try to determine the cause of an individual patient’s disease. Psychiatrists, in the psychoanalytic tradition, certainly do so as well. Physicians who use predictive regression models use group data, in multivariate analyses, to predict outcomes, risk, and mortality in individual patients. Faigman’s claim is similarly undermined by the existence of a few diseases (other than infectious diseases) that are defined by the causative exposure. Silicosis and manganism have played a large role in often bogus litigation, but they represent instances in which a differential diagnosis and puzzle may also be an etiological diagnosis and puzzle. Of course, to the extent that a disease is defined in terms of causative exposures, there may be serious and even intractable problems caused by the lack of specificity and accuracy in the diagnostic criteria for the supposedly pathognomonic disease.

As for whether the concept of “differential etiology” is ever used in the sciences themselves, a few citations for consideration follow.

Kløve & D. Doehring, “MMPI in epileptic groups with differential etiology,” 18 J. Clin. Psychol. 149 (1962)

Kløve & C. Matthews, “Psychometric and adaptive abilities in epilepsy with differential etiology,” 7 Epilepsia 330 (1966)

Teuber & K. Usadel, “Immunosuppression in juvenile diabetes mellitus? Critical viewpoint on the treatment with cyclosporin A with consideration of the differential etiology,” 103 Fortschr. Med. 707 (1985)

G. May & W. May, “Detection of serum IgA antibodies to varicella zoster virus (VZV)–differential etiology of peripheral facial paralysis. A case report,” 74 Laryngorhinootologie 553 (1995)

Alan Roberts, “Psychiatric Comorbidity in White and African-American Illicit Substance Abusers: Evidence for Differential Etiology,” 20 Clinical Psych. Rev. 667 (2000)

Mark E. Mullins, Michael H. Lev, Dawid Schellingerhout, Gilberto Gonzalez, and Pamela W. Schaefer, “Intracranial Hemorrhage Complicating Acute Stroke: How Common Is Hemorrhagic Stroke on Initial Head CT Scan and How Often Is Initial Clinical Diagnosis of Acute Stroke Eventually Confirmed?” 26 Am. J. Neuroradiology 2207 (2005)

Qiang Fu, et al., “Differential Etiology of Posttraumatic Stress Disorder with Conduct Disorder and Major Depression in Male Veterans,” 62 Biological Psychiatry 1088 (2007)

Jesse L. Hawke, et al., “Etiology of reading difficulties as a function of gender and severity,” 20 Reading and Writing 13 (2007)

Mastrangelo, “A rare occupation causing mesothelioma: mechanisms and differential etiology,” 105 Med. Lav. 337 (2014)


[1] See also Faigman at 448 (“courts have invented a methodology – differential etiology – that purports to resolve the G2i problem. Unfortunately, this method has only so far been described; it has not been defined with any precision. For now, it remains a highly ambiguous idea, sound in principle, but profoundly underdefined.”).

[2] Thomas H. Huxley, “On the Educational Value of the Natural History Sciences” (1854), in Lay Sermons, Addresses and Reviews 77 (1915).

[3] See, e.g., “Differential Etiology and Other Courtroom Magic” (June 23, 2014) (collecting cases); “Differential Diagnosis in Milward v. Acuity Specialty Products Group” (Sept. 26, 2013).

Beecher-Monas Proposes to Abandon Common Sense, Science, and Expert Witnesses for Specific Causation

September 11th, 2015

Law reviews are not peer reviewed, not that peer review is a strong guarantor of credibility, accuracy, and truth. Most law reviews have no regular provision for letters to the editor; nor is there a PubPeer that permits readers to point out errors for the benefit of the legal community. Nonetheless, law review articles are cited by lawyers and judges, often at face value, for claims and statements made by article authors. Law review articles are thus a potent source of misleading, erroneous, and mischievous ideas and claims.

Erica Beecher-Monas is a law professor at Wayne State University Law School, or Wayne Law, which considers itself “the premier public-interest law school in the Midwest.” Beware of anyone or any institution that describes itself as working for the public interest. That claim alone should put us on our guard against whose interests are being included and excluded as legitimate “public” interest.

Back in 2006, Professor Beecher-Monas published a book on evaluating scientific evidence in court, which had a few good points in a sea of error and nonsense. See Erica Beecher-Monas, Evaluating Scientific Evidence: An Interdisciplinary Framework for Intellectual Due Process (2006)[1]. More recently, Beecher-Monas has published a law review article, which from its abstract suggests that she might have something to say about this difficult area of the law:

“Scientists and jurists may appear to speak the same language, but they often mean very different things. The use of statistics is basic to scientific endeavors. But judges frequently misunderstand the terminology and reasoning of the statistics used in scientific testimony. The way scientists understand causal inference in their writings and practice, for example, differs radically from the testimony jurists require to prove causation in court. The result is a disconnect between science as it is practiced and understood by scientists, and its legal use in the courtroom. Nowhere is this more evident than in the language of statistical reasoning.

Unacknowledged difficulties in reasoning from group data to the individual case (in civil cases) and the absence of group data in making assertions about the individual (in criminal cases) beset the courts. Although nominally speaking the same language, scientists and jurists often appear to be in dire need of translators. Since expert testimony has become a mainstay of both civil and criminal litigation, this failure to communicate creates a conundrum in which jurists insist on testimony that experts are not capable of giving, and scientists attempt to conform their testimony to what the courts demand, often well beyond the limits of their expertise.”

Beecher-Monas, “Lost in Translation: Statistical Inference in Court,” 46 Arizona St. L.J. 1057, 1057 (2014) [cited as BM].

A close read of the article shows, however, that Beecher-Monas continues to promulgate misunderstanding, error, and misdirection on statistical and scientific evidence.

Individual or Specific Causation

The key thesis of this law review is that expert witnesses have no scientific or epistemic warrant upon which to opine about individual or specific causation.

“But what statistics cannot do—nor can the fields employing statistics, like epidemiology and toxicology, and DNA identification, to name a few—is to ascribe individual causation.”

BM at 1057-58.

Beecher-Monas tells us that expert witnesses are quite willing to opine on specific causation, but that they have no scientific or statistical warrant for doing so:

“Statistics is the law of large numbers. It can tell us much about populations. It can tell us, for example, that so-and-so is a member of a group that has a particular chance of developing cancer. It can tell us that exposure to a chemical or drug increases the risk to that group by a certain percentage. What statistics cannot do is tell which exposed person with cancer developed it because of exposure. This creates a conundrum for the courts, because nearly always the legal question is about the individual rather than the group to which the individual belongs.”

BM at 1057. Clinical medicine and science come in for particular chastisement by Beecher-Monas, who acknowledges the medical profession’s legitimate role in diagnosing and treating disease. Physicians use a process of differential diagnosis to arrive at the most likely diagnosis of disease, but the etiology of the disease is not part of their normal practice. Beecher-Monas leaps beyond the generalization that physicians infrequently ascertain specific causation to the sweeping claim that ascertaining the cause of a patient’s disease is beyond the clinician’s competence and scientific justification. Beecher-Monas thus tells us, in apodictic terms, that science has nothing to say about individual or specific causation. BM at 1064, 1075.

In a variety of contexts, but especially in the toxic tort arena, expert witness testimony is not reliable with respect to the inference of specific causation, which, Beecher-Monas writes, usually without qualification, is “unsupported by science.” BM at 1061. The solution for Beecher-Monas is clear. Admitting baseless expert witness testimony is “pernicious” because the whole purpose of having expert witnesses is to help the fact finder, jury or judge, who lack the background understanding and knowledge to assess the data, interpret all the evidence, and evaluate the epistemic warrant for the claims in the case. BM at 1061-62. Beecher-Monas would thus allow the expert witnesses to testify about what they legitimately know, and let the jury draw the inference about which expert witnesses in the field cannot and should not opine. BM at 1101. In other words, Beecher-Monas is perfectly fine with juries and judges guessing their way to a verdict on an issue that science cannot answer. If her book danced around this recommendation, now her law review article has come out into the open, declaring an open season to permit juries and judges to be unfettered in their specific causation judgments. What is touching is that Beecher-Monas is sufficiently committed to gatekeeping of expert witness opinion testimony that she proposes a solution to take a complex area away from expert witnesses altogether rather than confront the reality that there is often simply no good way to connect general and specific causation in a given person.

Causal Pies

Beecher-Monas relies heavily upon Professor Rothman’s notion of causal pies or sets to describe the factors that may combine to bring about a particular outcome. In doing so, she commits a non-sequitur:

“Indeed, epidemiologists speak in terms of causal pies rather than a single cause. It is simply not possible to infer logically whether a specific factor caused a particular illness.”[2]

BM at 1063. But the question on her adopted model of causation is not whether any specific factor was the cause, but whether it was one of the multiple slices in the pie. The statement she cites from Rothman, that “it is not possible to infer logically whether a specific factor was the cause of an observed event,” does not describe the problem that faces factfinders in court cases.

With respect to differential etiology, Beecher-Monas claims that “‘ruling in’ all potential causes cannot be done.” BM at 1075. But why not? While it is true that disease diagnosis is often made upon signs and symptoms, BM at 1076, sometimes physicians are involved in trying to identify causes in individuals. Psychiatrists of course are frequently involved in trying to identify sources of anxiety and depression in their patients. It is not all about putting a DSM-V diagnosis on the chart, and prescribing medication. And there are times, when physicians can say quite confidently that a disease has a particular genetic cause, as in a man with BrCa1, or BrCa2, and breast cancer, or certain forms of neurodegenerative diseases, or an infant with a clearly genetically determined birth defect.

Beecher-Monas confuses “the” cause with “a” cause, and wanders away from both law and science into her own twilight zone. Here is an example of how Beecher-Monas’ confusion plays out. She asserts that:

“For any individual case of lung cancer, however, smoking is no more important than any of the other component causes, some of which may be unknown.”

BM at 1078. This ignores the magnitude of the risk factor and its likely contribution to a given case. Putting aside synergistic co-exposures, for most lung cancers, smoking is the “but for” cause of individual smokers’ lung cancers. Beecher-Monas sets up a strawman argument by telling us that it is logically impossible to infer “whether a specific factor in a causal pie was the cause of an observed event.” BM at 1079. But we are usually interested in whether a specific factor was “a substantial contributing factor,” without which the disease would not have occurred. This is hardly illogical or impracticable for a given case of mesothelioma in a patient who worked for years in a crocidolite asbestos factory, or for a case of lung cancer in a patient who smoked heavily for many years right up to the time of his lung cancer diagnosis. I doubt that many people would hesitate, on either logical or scientific grounds, to attribute a child’s phocomelia birth defects to his mother’s ingestion of thalidomide during an appropriate gestational window in her pregnancy.

Unhelpfully, Beecher-Monas insists upon playing this word game by telling us that:

“Looking backward from an individual case of lung cancer, in a person exposed to both asbestos and smoking, to try to determine the cause, we cannot separate which factor was primarily responsible.”

BM at 1080. And yet that issue, of “primary responsibility” is not in any jury instruction for causation in any state of the Union, to my knowledge.

From her extreme skepticism, Beecher-Monas swings to the other extreme that asserts that anything that could have been in the causal set or pie was in the causal set:

“Nothing in relative risk analysis, in statistical analysis, nor anything in medical training, permits an inference of specific causation in the individual case. No expert can tell whether a particular exposed individual’s cancer was caused by unknown factors (was idiopathic), linked to a particular gene, or caused by the individual’s chemical exposure. If all three are present, and general causation has been established for the chemical exposure, one can only infer that they all caused the disease.115 Courts demanding that experts make a contrary inference, that one of the factors was the primary cause, are asking to be misled. Experts who have tried to point that out, however, have had a difficult time getting their testimony admitted.”

BM at 1080. There is no support for Beecher-Monas’ extreme statement. She cites, in footnote 115, to Kenneth Rothman’s introductory book on epidemiology, but what he says at the cited page is quite different. Rothman explains that “every component cause that played a role was necessary to the occurrence of that case.” In other words, for every component cause that actually participated in bringing about this case, its presence was necessary to the occurrence of the case. What Rothman clearly does not say is that for a given individual’s case, the fact that a factor can cause a person’s disease means that it must have caused it. In Beecher-Monas’ hypothetical of three factors – idiopathic, particular gene, and chemical exposure, all three, or any two, or only one of the three may have made a given individual’s causal set. Beecher-Monas has carelessly or intentionally misrepresented Rothman’s actual discussion.

Physicians and epidemiologists do apply group risk figures to individuals, through the lens of predictive regression equations. The Gail Model for 5 Year Risk of Breast Cancer, for instance, is a predictive equation that comes up with a prediction for an individual patient by refining the subgroup within which the patient fits. Similarly, there are prediction models for heart attack, such as the Risk Assessment Tool for Estimating Your 10-year Risk of Having a Heart Attack. Beecher-Monas might complain that these regression equations still turn on subgroup average risk, but the point is that they can be made increasingly precise as knowledge accumulates. And the regression equations can generate confidence intervals and prediction intervals for the individual’s constellation of risk factors.

Significance Probability and Statistical Significance

The discussion of significance probability and significance testing in Beecher-Monas’ book was frequently in error,[3] and this new law review article is not much improved. Beecher-Monas tells us that “judges frequently misunderstand the terminology and reasoning of the statistics used in scientific testimony,” BM at 1057, which is true enough, but this article does little to ameliorate the situation. Beecher-Monas offers the following definition of the p-value:

“The P- value is the probability, assuming the null hypothesis (of no effect) is true (and the study is free of bias) of observing as strong an association as was observed.”

BM at 1064-65. This definition misses that the p-value is a cumulative tail probability, and can be one-sided or two-sided. More seriously in error, however, is the suggestion that the null hypothesis is one of no effect, when it is merely a pre-specified expected value that is the subject of the test. Of course, the null hypothesis is often one of no disparity between the observed and the expected, but the definition should not mislead on this crucial point.
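The tail-probability point can be made concrete with a small calculation. The sketch below is only an illustration, assuming an exact binomial test with made-up numbers (20 trials, 15 observed events); the function names and the null values of 0.5 and 0.3 are hypothetical and do not come from Beecher-Monas’ article or from any case discussed here.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k events in n trials when the event probability is p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def upper_tail_p(k_obs, n, p0):
    """One-sided p-value: P(X >= k_obs), computed under the null value p0."""
    return sum(binom_pmf(k, n, p0) for k in range(k_obs, n + 1))

def two_sided_p(k_obs, n, p0):
    """Two-sided p-value: total probability of all outcomes that are
    no more likely than the observed one under the null value p0."""
    p_obs = binom_pmf(k_obs, n, p0)
    return sum(
        binom_pmf(k, n, p0)
        for k in range(n + 1)
        if binom_pmf(k, n, p0) <= p_obs * (1 + 1e-9)  # small tolerance for float ties
    )

# Hypothetical data: 15 events in 20 trials.
n, k_obs = 20, 15

point_prob = binom_pmf(k_obs, n, 0.5)    # probability of exactly 15: not a p-value
one_sided = upper_tail_p(k_obs, n, 0.5)  # cumulative tail: 15 or more
two_sided = two_sided_p(k_obs, n, 0.5)   # both tails

# The null value is whatever was pre-specified; it need not mean "no effect".
one_sided_other_null = upper_tail_p(k_obs, n, 0.3)

print(point_prob, one_sided, two_sided, one_sided_other_null)
```

The point probability, the one-sided tail, and the two-sided tail are three different numbers for the same data, and changing the pre-specified null value changes all of them.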

For some reason, Beecher-Monas persists in describing the conventional level of statistical significance as 95%, which substitutes the coefficient of confidence for the complement of the frequently pre-specified p-value for significance. Annoying but decipherable. See, e.g., BM at 1062, 1064, 1065. She misleadingly states that:

“The investigator will thus choose the significance level based on the size of the study, the size of the effect, and the trade-off between Type I (incorrect rejection of the null hypothesis) and Type II (incorrect failure to reject the null hypothesis) errors.”

BM at 1066. While this statement is occasionally true, it mostly is not. A quick review of the last several years of the New England Journal of Medicine will document the error. Invariably, researchers use the conventional level of alpha, at 5%, unless there is multiple testing, such as in a genetic association study.

Beecher-Monas admonishes us that “[u]sing statistical significance as a screening device is thus mistaken on many levels,” citing cases that do not provide support for this proposition.[4] BM at 1066. The Food and Drug Administration’s scientists, who review clinical trials for efficacy and safety, will no doubt be astonished to hear this admonition.

Beecher-Monas argues that courts should not factor statistical significance or confidence intervals into their gatekeeping of expert witnesses, but that they should “admit studies,” and leave it to the lawyers and expert witnesses to explain the strengths and weaknesses of the studies relied upon. BM at 1071. Of course, studies themselves are rarely admitted because they represent many levels of hearsay by unknown declarants. Given Beecher-Monas’ acknowledgment of how poorly judges and lawyers understand statistical significance, this argument is cynical indeed.

Remarkably, Beecher-Monas declares, without citation, that

“the purpose of epidemiologists’ use of statistical concepts like relative risk, confidence intervals, and statistical significance are intended to describe studies, not to weed out the invalid from the valid.”

BM at 1095. She thus excludes by ipse dixit any inferential purposes these statistical tools have. She goes further and gives us a concrete example:

“If the methodology is otherwise sound, small studies that fail to meet a P-level of 5 [sic], say, or have a relative risk of 1.3 for example, or a confidence level that includes 1 at 95% confidence, but relative risk greater than 1 at 90% confidence ought to be admissible. And understanding that statistics in context means that data from many sources need to be considered in the causation assessment means courts should not dismiss non-epidemiological evidence out of hand.”

BM at 1095. Well, again, studies are not admissible; the issue is whether they may be reasonably relied upon, and whether reliance upon them may support an opinion claiming causality. And a “P-level” of 5 is, well, let us hope a serious typographical error. Beecher-Monas’ advice is especially misleading when there is only one study, or only one study in a constellation of exonerative studies. See, e.g., In re Accutane, No. 271(MCL), 2015 WL 753674, 2015 BL 59277 (N.J. Super. Law Div. Atlantic Cty. Feb. 20, 2015) (excluding Professor David Madigan for cherry picking studies to rely upon).

Confidence Intervals

Beecher-Monas’ book provided a good deal of erroneous information on confidence intervals.[5] The current article improves on the definitions, but still manages to go astray:

“The rationale courts often give for the categorical exclusion of studies with confidence intervals including the relative risk of one is that such studies lack statistical significance.62 Well, yes and no. The problem here is the courts’ use of a dichotomous meaning for statistical significance (significant or not).63 This is not a correct understanding of statistical significance.”

BM at 1069. Well yes and no; this interpretation of a confidence interval, say with a coefficient of confidence of 95%, is a reasonable interpretation of whether the point estimate is statistically significant at an alpha of 5%. If Beecher-Monas does not like strict significance testing, that is fine, but she cannot mandate its abandonment by scientists or the courts. Certainly the cited interpretation is one proper interpretation among several.
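The correspondence between a 95% confidence interval that includes 1.0 and a two-sided p-value above 5% can be checked numerically. What follows is a minimal sketch, assuming a simple cohort 2x2 table and the usual large-sample (Wald) interval on the log relative risk; the counts are invented for illustration, and the function name is hypothetical.

```python
from math import exp, log, sqrt
from statistics import NormalDist

def rr_summary(a, n1, c, n0, conf=0.95):
    """Relative risk with a Wald interval and a two-sided p-value for RR = 1.

    a / n1 = cases / total among the exposed,
    c / n0 = cases / total among the unexposed.
    Assumes counts large enough for the normal approximation on the log scale.
    """
    nd = NormalDist()
    rr = (a / n1) / (c / n0)
    se_log = sqrt(1 / a - 1 / n1 + 1 / c - 1 / n0)
    z_crit = nd.inv_cdf(1 - (1 - conf) / 2)
    lower = exp(log(rr) - z_crit * se_log)
    upper = exp(log(rr) + z_crit * se_log)
    z = log(rr) / se_log
    p_two_sided = 2 * (1 - nd.cdf(abs(z)))
    return rr, lower, upper, p_two_sided

# Two hypothetical cohorts of 1,000 exposed and 1,000 unexposed persons.
for counts in [(30, 1000, 20, 1000), (60, 1000, 30, 1000)]:
    rr, lower, upper, p = rr_summary(*counts)
    interval_excludes_one = lower > 1 or upper < 1
    # With the same Wald statistic, these two answers always agree.
    print(round(rr, 2), round(lower, 2), round(upper, 2), round(p, 4),
          interval_excludes_one, p < 0.05)
```

In this sketch the interval and the test are built from the same statistic, so reading a 95% interval for whether it includes 1.0 is the same exercise as testing at an alpha of 5%. Other interval methods can disagree with a given test at the margins.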

Power

There were several misleading references to statistical power in Beecher-Monas’ book, but the new law review tops them by giving a new, bogus definition:

“Power, the probability that the study in which the hypothesis is being tested will reject the alterative [sic] hypothesis when it is false, increases with the size of the study.”

BM at 1065. For this definition, Beecher-Monas cites to the Reference Manual on Scientific Evidence, but butchers the correct definition given by the late David Freedman and David Kaye.[6] All of which is very disturbing.
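For contrast, power as conventionally defined is the probability that a study will reject the null hypothesis when a specified alternative is true. The sketch below is an illustration only, assuming a two-sided z-test comparing two proportions with equal group sizes and a normal approximation; the risks of 2% and 3% and the function name are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def power_two_proportions(p0, p1, n_per_group, alpha=0.05):
    """Approximate power of a two-sided z-test for a difference in proportions.

    p0, p1: assumed true risks in the unexposed and exposed groups.
    Returns the probability of rejecting the null (equal risks) when
    the true risks are p0 and p1.
    """
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha / 2)
    p_bar = (p0 + p1) / 2
    # Standard error used by the test (pooled, under the null).
    se_null = sqrt(2 * p_bar * (1 - p_bar) / n_per_group)
    # Standard error of the observed difference under the alternative.
    se_alt = sqrt(p0 * (1 - p0) / n_per_group + p1 * (1 - p1) / n_per_group)
    delta = abs(p1 - p0)
    upper = 1 - nd.cdf((z_crit * se_null - delta) / se_alt)
    lower = nd.cdf((-z_crit * se_null - delta) / se_alt)
    return upper + lower

# Hypothetical risks: 2% among the unexposed, 3% among the exposed.
for n in (1000, 5000, 20000):
    print(n, round(power_two_proportions(0.02, 0.03, n), 3))
```

Under these assumptions, power rises with the size of the study, which is the one part of the quoted definition that is right. When the two risks are equal, the function returns alpha, the probability of a false rejection.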

Relative Risks and Other Risk Measures

Beecher-Monas begins badly by misdefining the concept of relative risk:

“as the percentage of risk in the exposed population attributable to the agent under investigation.”

BM at 1068. Perhaps this percentage can be derived from the relative risk, if we know it to be the true measure with some certainty, through a calculation of attributable risk, but a law review article that takes the entire medical profession to task, and most of the judiciary to boot, should not confuse and conflate attributable and relative risk.
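The two measures are related but distinct. A minimal sketch of the standard conversion, the attributable fraction among the exposed, (RR − 1)/RR, is given below. It assumes the relative risk is a valid, causal measure free of bias and confounding, which is exactly the assumption that cannot be taken for granted; the function name is hypothetical.

```python
def attributable_fraction_exposed(rr):
    """Attributable fraction among the exposed, (RR - 1) / RR.

    Assumption: rr is the true causal relative risk. For rr below 1
    the result is negative and has no 'attributable' reading.
    """
    if rr <= 0:
        raise ValueError("relative risk must be positive")
    return (rr - 1) / rr

for rr in (1.3, 2.0, 3.0):
    # The relative risk is a ratio of risks; the attributable fraction
    # is a proportion of exposed cases. They are different numbers.
    print(rr, round(attributable_fraction_exposed(rr), 3))
```

A relative risk of 2.0 corresponds to an attributable fraction of one half, which is where the familiar “relative risk greater than two” argument for specific causation comes from.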

Then Beecher-Monas tells us that the “[r]elative risk is a statistical test that (like statistical significance) depends on the size of the population being tested.” BM at 1068. Well, actually not; the calculation of the RR is unaffected by the sample size. The variance of course will vary with the sample size, but Beecher-Monas seems intent on ignoring random variability.
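A short calculation shows the distinction. The sketch below assumes two hypothetical cohorts with identical risks but a tenfold difference in size, and uses the usual large-sample standard error of the log relative risk; the counts and the function name are invented for illustration.

```python
from math import sqrt

def rr_and_se(a, n1, c, n0):
    """Relative risk and the large-sample standard error of its logarithm.

    a / n1 = cases / total among the exposed,
    c / n0 = cases / total among the unexposed.
    """
    rr = (a / n1) / (c / n0)
    se_log = sqrt(1 / a - 1 / n1 + 1 / c - 1 / n0)
    return rr, se_log

# Same risks (3% versus 2%), ten times the sample size.
rr_small, se_small = rr_and_se(15, 500, 10, 500)
rr_large, se_large = rr_and_se(150, 5000, 100, 5000)

print(rr_small, rr_large)  # the point estimate does not move
print(se_small, se_large)  # the standard error shrinks
```

The relative risk is the same in both cohorts. Only its precision changes, with the standard error falling by a factor of the square root of ten.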

Perhaps most egregious is Beecher-Monas’ assertion that:

“Any increase above a relative risk of one indicates that there is some effect.”

BM at 1067. So much for ruling out chance, bias, and confounding! Or looking at an entire body of epidemiologic research for strength, consistency, coherence, exposure-response, etc. Beecher-Monas has thus moved beyond a liberal, to a libertine, position. In case the reader has any doubts of the idiosyncrasy of her views, she repeats herself:

“As long as there is a relative risk greater than 1.0, there is some association, and experts should be permitted to base their causal explanations on such studies.”

BM at 1067-68. This is evidentiary nihilism in full glory. Beecher-Monas has endorsed relying upon studies irrespective of their study design or validity, their individual confidence intervals, their aggregate summary point estimates and confidence intervals, or the absence of important Bradford Hill considerations, such as consistency, strength, and dose-response. So an expert witness may opine about general causation from reliance upon a single study with a relative risk of 1.05, say with a 95% confidence interval of 0.8 – 1.4?[7] For this startling proposition, Beecher-Monas cites the work of Sander Greenland, a wild and wooly plaintiffs’ expert witness in various toxic tort litigations, including vaccine autism and silicone autoimmune cases.
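How little such a study shows can be gauged by backing out an approximate p-value from the reported interval. The sketch below is a rough illustration, assuming the interval was computed on the log scale with a normal approximation; the figures are the hypothetical ones in the paragraph above, and the function name is made up.

```python
from math import log
from statistics import NormalDist

def p_value_from_interval(rr, lower, upper, conf=0.95):
    """Approximate two-sided p-value for RR = 1, recovered from a reported
    point estimate and confidence interval.

    Assumption: the interval is symmetric on the log scale (Wald-type).
    """
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - (1 - conf) / 2)
    se_log = (log(upper) - log(lower)) / (2 * z_crit)
    z = log(rr) / se_log
    return 2 * (1 - nd.cdf(abs(z)))

# Hypothetical study: RR = 1.05, 95% confidence interval 0.8 to 1.4.
p = p_value_from_interval(1.05, 0.8, 1.4)
print(round(p, 2))  # about 0.73 under these assumptions
```

A result of this kind is what random variation alone would be expected to produce most of the time, before any question of bias or confounding is reached.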

RR > 2

Beecher-Monas’ discussion of inferring specific causation from relative risks greater than two devolves into a muddle by her failure to distinguish general from specific causation. BM at 1067. There are different relevancies for general and specific causation, depending upon context, such as clinical trials or epidemiologic studies for general causation, number of studies available, and the like. Ultimately, she adds little to the discussion and debate about this issue, or any other.


[1] See previous comments on the book at “Beecher-Monas and the Attempt to Eviscerate Daubert from Within”; “Friendly Fire Takes Aim at Daubert – Beecher-Monas And The Undue Attack on Expert Witness Gatekeeping”; and “Confidence in Intervals and Diffidence in the Courts.”

[2] Kenneth J. Rothman, Epidemiology: An Introduction 250 (2d ed. 2012).

[3] Erica Beecher-Monas, Evaluating Scientific Evidence: An Interdisciplinary Framework for Intellectual Due Process 42 n. 30, 61 (2007) (“Another way of explaining this is that it describes the probability that the procedure produced the observed effect by chance.”) (“Statistical significance is a statement about the frequency with which a particular finding is likely to arise by chance.”).

[4] See BM at 1066 & n. 44, citing “See, e.g., In re Breast Implant Litig., 11 F. Supp. 2d 1217, 1226–27 (D. Colo. 1998); Haggerty v. Upjohn Co., 950 F. Supp. 1160, 1164 (S.D. Fla. 1996), aff’d, 158 F.3d 588 (11th Cir. 1998) (“[S]cientifically valid cause and effect determinations depend on controlled clinical trials and epidemiological studies.”).”

 

[5] See, e.g., Erica Beecher-Monas, Evaluating Scientific Evidence 58, 67 (N.Y. 2007) (“No matter how persuasive epidemiological or toxicological studies may be, they could not show individual causation, although they might enable a (probabilistic) judgment about the association of a particular chemical exposure to human disease in general.”) (“While significance testing characterizes the probability that the relative risk would be the same as found in the study as if the results were due to chance, a relative risk of 2 is the threshold for a greater than 50 percent chance that the effect was caused by the agent in question.”)(incorrectly describing significance probability as a point probability as opposed to tail probabilities).

[6] David H. Kaye & David A. Freedman, Reference Guide on Statistics, in Federal Jud. Ctr., Reference Manual on Scientific Evidence 211, 253–54 (3d ed. 2011) (discussing the statistical concept of power).

[7] BM at 1070 (pointing to a passage in the FJC’s Reference Manual on Scientific Evidence that provides an example of one 95% confidence interval that includes 1.0, but which shrinks when calculated as a 90% interval to 1.1 to 2.2, which values “demonstrate some effect with confidence interval set at 90%”). This is nonsense in the context of observational studies.

Seventh Circuit Affirms Exclusion of Expert Witnesses in Vinyl Chloride Case

August 30th, 2015

Last week, the Seventh Circuit affirmed a federal district court’s exclusion of plaintiffs’ expert witnesses in an environmental vinyl chloride exposure case. Wood v. Textron, Inc., No. 3:10 CV 87, 2014 U.S. Dist. LEXIS 34938 (N.D. Ind. Mar. 17, 2014); 2014 U.S. Dist. LEXIS 141593, at *11 (N.D. Ind. Oct. 3, 2014), aff’d, Slip op., No. 14-3448, 2015 U.S. App. LEXIS 15076 (7th Cir. Aug. 26, 2015). Plaintiffs, children C.W. and E.W., claimed exposure from Textron’s manufacturing facility in Rochester, Indiana, which released vinyl chloride as a gas that seeped into ground water, and into neighborhood residential water wells. Slip op. at 2-3. Plaintiffs claimed present injuries in the form of “gastrointestinal issues (vomiting, bloody stools), immunological issues, and neurological issues,” as well as future increased risk of cancer. Importantly, the appellate court explicitly approved the trial court’s careful reading of relied upon studies to determine whether they really did support the scientific causal claims made by the expert witnesses. Given the reluctance of some federal district judges to engage with the studies actually cited, this holding is noteworthy.

To support their claims, plaintiffs offered testimony from three familiar expert witnesses:

(1) Dr. James G. Dahlgren;

(2) Dr. Vera S. Byers; and

(3) Dr. Jill E. Ryer-Powder.

Slip op. at 5. This gaggle offered well-rehearsed but scientifically unsound arguments in place of actual evidence that the children were hurt, or would be afflicted, as a result of their claimed exposures:

(a) extrapolation from high dose animal and human studies;

(b) assertions of children’s heightened vulnerability;

(c) differential etiology;

(d) temporality; and

(e) regulatory exposure limits.

On appeal, a panel of the Seventh Circuit held that the district court had properly conducted “an in-depth review of the relevant studies that the experts relied upon to generate their differential etiology,” and their general causation opinions. Slip op. at 13-14 (distinguishing other Seventh Circuit decisions that reversed district court Rule 702 rulings, and noting that the court below followed Joiner’s lead by analyzing the relied-upon studies to assess analytical gaps and extrapolations). The plaintiffs’ expert witnesses simply failed in analytical gap bridging, and dot connecting.

Extrapolation

The Circuit agreed with the district court that the extrapolations asserted were extreme, and that they represented “analytical gaps” too wide to be permitted in a courtroom. Slip op. at 15. The challenged expert witnesses extrapolated between species, between exposure levels, between exposure duration, between exposure circumstances, and between disease outcomes.

The district court faulted Dahlgren for relying upon articles that “fail to establish that [vinyl chloride] at the dose and duration present in this case could cause the problems that the [p]laintiffs have experienced or claim that they are likely to experience.” C.W. v. Textron, 2014 U.S. Dist. LEXIS 34938, at *53, *45 (N.D. Ind. Mar. 17, 2014) (finding that the analytical gap between the cited studies and Dahlgren’s purpose in citing the studies was an unbridged gap, which Dahlgren had failed to explain). Slip op. at 8.

Byers, for instance, cited one study[1] that involved exposure for five years, at an average level that was over 1,000 times higher than the children’s alleged exposure levels, which lasted less than 17 and 7 months, respectively. Perhaps even more extreme were the plaintiffs’ expert witnesses’ attempted extrapolations from animal studies, which the district court recognized as “too attenuated” from plaintiffs’ case. Slip op. at 14. The Seventh Circuit rejected plaintiffs’ claim that the district court had erred by imposing a requirement of “absolute precision,” when it held that the plaintiffs’ expert witnesses’ analytical gaps (and slips) were too wide to be bridged. The Circuit provided a colorful example of a study on laboratory rodents, pressed into service for a long-term carcinogenicity assay, which found no statistically significant increase in tumors among rodents fed 0.03 milligrams of vinyl chloride per kilogram of bodyweight (0.03 mg/kg), for 4 to 5 days each week, for 59 weeks, compared with control rodents fed olive oil.[2] Slip op. at 14-15. The exposure level in this study, 0.03 mg/kg, was over 10 times the children’s exposure, as estimated by Ryer-Powder. The 59 weeks of study exposure represent the great majority of the rodents’ adult years, and thus greatly exceed the children’s exposure, which took place over several months of their lives. Slip op. at 15.

The Circuit held that the district court was within its discretion in evaluating the analytical gaps, and that the district court was correct to look at the study details to exercise its role as a gatekeeper under Rule 702. Slip op. at 15-17. The plaintiffs’ expert witnesses failed to explain their extrapolations, which made their opinions suspect. As the Circuit court noted, there is a methodology by which scientists sometimes attempt to model human risks from animal evidence. Slip op. at 16-17, citing Bernard D. Goldstein & Mary Sue Henifin, “Reference Guide on Toxicology,” in Reference Manual on Scientific Evidence 646 (3d ed. 2011) (“The mathematical depiction of the process by which an external dose moves through various compartments in the body until it reaches the target organ is often called physiologically based pharmokinetics or toxicokinetics.”). Given the abject failures of plaintiffs’ expert witnesses to explain their leaps of faith, the appellate court had no occasion to explore the limits of risk assessment outside regulatory contexts.

Children’s Vulnerability

Plaintiffs’ expert witnesses asserted that children are much more susceptible than adult workers, and even laboratory rats. As is typical in such cases, these expert witnesses had no evidence to support their assertions, and they made no effort even to invoke models that attempted reasonable risk assessments of children’s risk.

Differential Etiology

Dahlgren and Byers both claimed that they reached individual or specific causation conclusions based upon their conduct of a “differential etiology.” The trial and appellate courts both faulted them for failing to “rule in” vinyl chloride for plaintiffs’ specific ailments before going about the business of ruling out competing or alternative causes. Slip op. at 6-7; 9-10; 20-21.

The courts also rejected Dahlgren’s claim that he could rule out all potential alternative causes by noting that the children’s treating physicians had failed to identify any cause for their ailments. So after postulating a limited universe of alternative causes of “inheritance, allergy, infection or another poison,” Dahlgren ruled all of them out of the case, because these putative causes “would have been detected by [the appellants’] doctors and treated accordingly.” Slip op. at 7, 18. As the Circuit court saw the matter:

“[T]his approach is not the stuff of science. It is based on faith in his fellow physicians—nothing more. The district court did not abuse its discretion in rejecting it.”

Slip op. at 18. Of course, the court might well have noted that physicians are often concerned exclusively with identifying effective therapy, and have little or nothing to offer on actual causation.

The Seventh Circuit panel did fuss with dicta in the trial court’s opinion that suggested differential etiology “cannot be used to support general causation.” C.W. v. Textron, 2014 U.S. Dist. LEXIS 141593, at *11 (N.D. Ind. Oct. 3, 2014). Elsewhere, the trial court wrote, in a footnote, that “[d]ifferential [etiology] is admissible only insofar as it supports specific causation, which is secondary to general causation … .” Id. at *12 n.3. Curiously the appellate court characterized these statements as “holdings” of the trial court, but disproved its own characterization by affirming the judgment below. The Circuit court countered with its own dicta that

“there may be a case where a rigorous differential etiology is sufficient to help prove, if not prove altogether, both general and specific causation.”

Slip op. at 20 (citing, in turn, improvident dicta from the Second Circuit, in Ruggiero v. Warner-Lambert Co., 424 F.3d 249, 254 (2d Cir. 2005) (“There may be instances where, because of the rigor of differential diagnosis performed, the expert’s training and experience, the type of illness or injury at issue, or some other … circumstance, a differential diagnosis is sufficient to support an expert’s opinion in support of both general and specific causation.”)).

Regulatory Pronouncements

Dahlgren based his opinions upon the children’s water supply containing vinyl chloride in excess of regulatory levels set by state and federal agencies, including the U.S. Environmental Protection Agency (E.P.A.). Slip op. at 6. Similarly, Ryer-Powder relied upon exposure levels’ exceeding regulatory permissible limits for her causation opinions. Slip op. at 10.

The district court, with the approval now of the Seventh Circuit, would have none of this nonsense. Exceeding governmental regulatory exposure limits does not prove causation. The non-compliance does not help the fact finder without knowing “the specific dangers” that led the agency to set the permissible level, and thus the regulations are not relevant at all without this information. Even with respect to specific causation, the regulatory infraction may be weak or null evidence for causation. Slip op. at 18-19 (citing Cunningham v. Masterwear Corp., 569 F.3d 673, 674–75 (7th Cir. 2009)).

Temporality

Byers and Dahlgren also emphasized that the children’s symptoms began after exposure and abated after removal from exposure. Slip op. at 9, 6-7. Both the trial and appellate courts were duly unimpressed by the post hoc ergo propter hoc argument. Slip op. at 19, citing Ervin v. Johnson & Johnson, 492 F.3d 901, 904-05 (7th Cir. 2007) (“The mere existence of a temporal relationship between taking a medication and the onset of symptoms does not show a sufficient causal relationship.”).

Increased Risk of Cancer

The plaintiffs’ expert witnesses offered opinions about the children’s future risk of cancer that were truly over the top. Dahlgren testified that the children were “highly likely” to develop cancer in the future. Slip op. at 6. Ryer-Powder claimed that the children’s exposures were “sufficient to present an unacceptable risk of cancer in the future.” Slip op. at 10. With no competent evidence to support their claims of present or past injury, these opinions about future cancer were no longer relevant. The Circuit thus missed an opportunity to comment on how meaningless these opinions were. Most people will develop a cancer at some point in their lifetime, and we might all agree that any risk is unacceptable, which is why medical research continues into the causes, prevention, and cure of cancer. An unquantified risk of cancer, however, cannot support an award of damages even if it were a proper item of damages. See, e.g., Sutcliffe v. G.A.F. Corp., 15 Phila. 339, 1986 Phila. Cty. Rptr. LEXIS 22, 1986 WL 501554 (1986). See also “Back to Baselines – Litigating Increased Risks” (Dec. 21, 2010).


[1] Steven J. Smith, et al., “Molecular Epidemiology of p53 Protein Mutations in Workers Exposed to Vinyl Chloride,” 147 Am. J. Epidemiology 302 (1998) (average level of workers’ exposure was 3,735 parts per million; children were supposedly exposed at 3 ppb). This study looked only at a putative biomarker for angiosarcoma of the liver, not at cancer risk.

[2] Cesare Maltoni, et al., “Carcinogenicity Bioassays of Vinyl Chloride Monomer: A Model of Risk Assessment on an Experimental Basis,” 41 Envt’l Health Persp. 3 (1981).

Events, Outcomes, and Effects – Media Responsibility to Be Accurate

July 29th, 2015

Thanks to Dr. David Schwartz for the pointer to a story, by a Bloomberg, Reuters health reporter, on a JAMA online-first article on drug “side effects.” See David Schwartz, “Lack of compliance on ADR Reporting: Some serious drug side effects not told to FDA within 15 days” (July 29, 2015).

The reporter, Lisa Rapaport, wrote about an in-press article in JAMA Internal Medicine, about delays in drug company mandatory reporting. Lisa Rapaport, “Some serious drug side effects not told to FDA within 15 days,” (July 27, 2015). The article that gave rise to this media coverage, however, was not about side effects, or direct effects, for that matter; it was about adverse events. See Paul Ma, Iván Marinovic, and Pinar Karaca-Mandic, “Drug Manufacturers’ Delayed Disclosure of Serious and Unexpected Adverse Events to the US Food and Drug Administration,” JAMA Intern. Med. (published online July 27, 2015) (doi:10.1001/jamainternmed.2015.3565).

The word “effect[s]” occurs 10 times in Rapaport’s news item; and yet, that word does not appear at all in the JAMA article, except in a footnote that points to a popular media article. And Reuters is the source of the footnoted popular media article.[1] Apparently, Reuters’ reporters are unaware of the difference between an event and an effect. The companies’ delay in reporting apparently made up 10% of all adverse event reports, but spinning the story as though it were about adverse effects makes the story seem more important and the delays more nefarious.

Why would a reporter covering a medical journal article not be familiar with the basic terminology and concepts at issue? The FDA’s description of its adverse event system makes clear that adverse events have nothing to do with “effects.” The governing regulations for post-marketing reporting of adverse drug experiences are even more clear that adverse events or experiences are not admissions or conclusions of causality. 21 C.F.R. 314.80(a), (k). See also ICH Harmonised Tripartite Guideline for Good Clinical Practice E6(R1) (10 June 1996).

Perhaps this is an issue with which Sense about Science USA can help? Located in the brain basket of America – Brooklyn, NY – Sense about Science is:

“a non-profit, non-partisan American branch of the British charitable trust, Sense About Science, which was founded in 2003 and which grew to play a pivotal role in promoting scientific understanding and defending scientific integrity in the UK and Europe.”

One of the organization’s activities is offering media help in understanding scientific and statistical issues. Let’s hope that they take the help being offered.


[1] S. Heavey, “FDA warns Pfizer for not reporting side effects” (June 10, 2010).

California Actos Decision Embraces Relative-Risk-Greater-Than-Two Argument

July 28th, 2015

A recent decision of the California Court of Appeal, Second District, Division Three, continues the dubious state and federal practice of deciding important issues under cover of unpublished opinions. Cooper v. Takeda Pharms. America, Inc., No. B250163, 2015 Cal. App. Unpub. LEXIS 4965 (Calif. App., 2nd Dist., Div. 3; July 16, 2015). In Cooper, plaintiff claimed that her late husband’s bladder cancer was caused by defendant’s anti-diabetic medication, Actos (pioglitazone). The defendant moved to strike the expert witness testimony in support of specific causation. The trial judge expressed serious concerns about the admissibility of plaintiff’s expert witnesses on specific causation, but permitted the trial to go forward. After a jury returned its verdict in favor of plaintiff, the trial court entered judgment for the defendants, on grounds that the plaintiff lacked admissible expert witness testimony.

Although a recent, large, well-conducted study[1] failed to find any meaningful association between pioglitazone and bladder cancer, there were, at the time of trial, several studies that suggested an association. Plaintiff’s expert witnesses, epidemiologist Dr. Alfred Neugut and bladder oncologist Dr. Norm Smith interpreted the evidence to claim a causal association, but both conceded that there were no biomarkers that allowed them to attribute Cooper’s cancer to pioglitazone. The plaintiff also properly conceded that identifying a cause of the bladder cancer was irrelevant to treating the disease. Cooper, 2015 Cal. App. Unpub. LEXIS 4965, at *13. Specific causation was thus determined by the so-called process of differential etiology, with the ex ante existence of risk substituting for cause, and using risk exposure in the differential analysis.

The trial court was apparently soured on Dr. Smith’s specific causation assessment because of his poor performance at deposition, in which he demonstrated a lack of understanding of Cooper’s other potential exposures. Smith’s spotty understanding of Cooper’s actual and potential exposures and other risks made any specific causation assessment less than guesswork. By the time of trial, Dr. Smith and plaintiff’s counsel had backfilled the gaps, and Smith presented a more confident analysis of Cooper’s exposures and potentially competing risks.

Cooper had no family history of bladder cancer, no alcohol consumption, and no obvious exposure to occupational bladder carcinogens. His smoking history would account for exposure to a known bladder carcinogen, cigarette smoke, but Cooper’s documented history was of minor tobacco use, and remote in time. Factually, Cooper’s history was suspect and at odds with his known emphysema. Based upon this history, along with their causal interpretation of the Actos bladder cancer association, and their quantitative assessment that the risk ratio for bladder cancer from Actos was 7.0 or higher for Mr. Cooper (controlled for covariates and potential confounders), the plaintiff’s expert witnesses opined that Actos was probably a substantial factor in causing Mr. Cooper’s bladder cancer. The court did not examine the reasonableness of Dr. Smith’s risk ratios, which seem exorbitant in view of several available meta-analyses.[2]

The court stated that under the applicable California law of “substantial factor,” the plaintiff’s expert witness, in conducting a differential diagnosis, need not exclude every other possible cause of plaintiff’s disease “with absolute certainty.” Cooper, at *41-42. This statement leaves unclear and ambiguous whether the plaintiff’s expert witness must (and did in this case) rule out other possible causes with some level of certitude less than “absolute certainty,” such as reasonable medical certainty, or perhaps reasonable probability. Dr. Smith’s testimony, as described, did not attempt to go so far as to rule out smoking as “a cause” of Cooper’s bladder cancer; only that the risk from smoking was a lower order of magnitude than that for Actos. In Dr. Smith’s opinion, the discrepancy in magnitude between the risk ratios for smoking and Actos allowed him to state confidently that Actos was the most substantial risk.

Having estimated the smoking-related increased risk to somewhere between 0 and 100%, with the Actos increased risk at 600% or greater, Dr. Smith was able to present an admissible opinion that Actos was a substantial factor. Of course, this all turns on the appellate court’s acceptance of risk, of some sufficiently large magnitude, as evidence of specific causation. In the Cooper court’s words:

“The epidemiological studies relied on by Dr. Smith indicated exposure to Actos® resulted in hazard ratios for developing bladder cancer ranging from 2.54 to 6.97.18 By demonstrating a relative risk greater than 2.0 that a product causes a disease, epidemiological studies thereby become admissible to prove that the product at issue was more likely than not responsible for causing a particular person’s disease. “When statistical analyses or probabilistic results of epidemiological studies are offered to prove specific causation . . . under California law those analyses must show a relative risk greater than 2.0 to be ‘useful’ to the jury. Daubert v. Merrell Dow Pharmaceuticals Inc., 43 F.3d 1311, 1320 (9th Cir.), cert. denied 516 U.S. 869 (1995) [Daubert II]. This is so, because a relative risk greater than 2.0 is needed to extrapolate from generic population-based studies to conclusions about what caused a specific person’s disease. When the relative risk is 2.0, the alleged cause is responsible for an equal number of cases of the disease as all other background causes present in the control group. Thus, a relative risk of 2.0 implies a 50% probability that the agent at issue was responsible for a particular individual’s disease. This means that a relative risk that is greater than 2.0 permits the conclusion that the agent was more likely than not responsible for a particular individuals disease. [Reference Manual on Scientific Evidence (Federal Judicial Center 2d ed. 2000) (“Ref. Manual”),] Ref. Manual at 384, n. 140 (citing Daubert II).” (In re Silicone Gel Breast Implant Prod. Liab. Lit. (C.D. Cal. 2004) 318 F.Supp.2d 879, 893; italics added.) Thus, having considered and ruled out other background causes of bladder cancer based on his medical records, Dr. Smith could conclude based on the studies that it was more likely than not that Cooper’s exposure to Actos® caused his bladder cancer. 
In other words, because the studies, to varying degrees, adjusted for race, age, sex, and smoking, as well as other known causes of bladder cancer, Dr. Smith could rely upon those studies to make his differential diagnosis ruling in Actos®—as well as smoking—and concluding that Actos® was the most probable cause of Cooper’s disease.”

Cooper, at *78-80 (emphasis in the original).

Of course, the epidemiologic studies themselves are not admissible, regardless of the size of the relative risk, but the court was, no doubt, speaking loosely about the expert witness opinion testimony that was based upon the studies with risk ratios greater than two. Although the Cooper case does not change California law’s facile acceptance of risk as a substitute for cause, the case does base its approval of plaintiff’s expert witness’s attribution as turning on the magnitude of the risk ratio, adjusted for confounders, as having exceeded two. The Cooper case leaves open what happens when the risk that is being substituted for cause is a ratio ≤ 2.0. Some critics of the risk ratio > 2.0 inference have suggested that risk ratios greater than two would lead to directed verdicts for plaintiffs in all cases, but this suggestion requires demonstrations of both the internal and external validity of the studies that measure the risk ratio, which in many cases is in doubt. In Cooper, the plaintiff’s expert witnesses’ embrace of a high, outlier risk ratio for Actos, while simultaneously downplaying competing risks, allowed them to make out their specific causation case.
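The arithmetic behind the court’s inference can be sketched in a few lines. This is only an illustration, under the simplifying assumptions that the risk ratio is valid, unconfounded, and applies uniformly to the individual plaintiff, and that hazard ratios and odds ratios can be read loosely as relative risks. The function names and the labels in the table are hypothetical shorthand, not anything used by the court or the study authors; the point estimates are those quoted in this post.

```python
def attributable_fraction(relative_risk: float) -> float:
    """Attributable fraction among the exposed: (RR - 1) / RR.

    Assumes the risk ratio is valid and applies uniformly to the individual.
    Returns 0.0 when RR <= 1, because there is no excess risk to attribute.
    """
    if relative_risk <= 0:
        raise ValueError("relative risk must be positive")
    return max(0.0, (relative_risk - 1.0) / relative_risk)


def exceeds_even_odds(relative_risk: float) -> bool:
    """True only when the attributable fraction is above 50%, i.e. RR > 2.0."""
    return attributable_fraction(relative_risk) > 0.5


# Point estimates quoted in the post; HRs and ORs are treated as RRs here,
# which is itself an assumption made only for illustration.
estimates = {
    "Lewis 2015 (adjusted HR)": 1.06,
    "Ferwana 2013 (RR)": 1.23,
    "Turner 2014 (OR, longest duration)": 1.51,
    "Bosetti 2013 (RR, longest exposure)": 1.64,
    "He 2014 (pooled HR)": 1.67,
    "Threshold in the court's reasoning": 2.0,
    "Low end cited by the court (HR)": 2.54,
    "High end cited by the court (HR)": 6.97,
}

for label, rr in estimates.items():
    share = attributable_fraction(rr)
    print(f"{label:38s} RR={rr:4.2f}  fraction={share:6.1%}  >50%: {exceeds_even_odds(rr)}")
```

On these assumptions, a ratio of exactly 2.0 yields an attributable fraction of 50%, which is not “more likely than not,” and none of the meta-analytic estimates in the footnotes crosses that line. Only the higher ratios adopted by plaintiff’s expert witnesses do.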


[1] James D. Lewis, Laurel A. Habel, Charles P. Quesenberry, Brian L. Strom, Tiffany Peng, Monique M. Hedderson, Samantha F. Ehrlich, Ronac Mamtani, Warren Bilker, David J. Vaughn, Lisa Nessel, Stephen K. Van Den Eeden, and Assiamira Ferrara, “Pioglitazone Use and Risk of Bladder Cancer and Other Common Cancers in Persons With Diabetes,” 314 J. Am. Med. Ass’n 265 (2015) (adjusted hazard ratio 1.06, 95% CI, 0.89-1.26).

[2] See, e.g., R.M. Turner, et al., “Thiazolidinediones and associated risk of bladder cancer: a systematic review and meta-analysis,” 78 Brit. J. Clin. Pharmacol. 258 (2014) (OR = 1.51, 95% CI 1.26-1.81, for longest cumulative duration of pioglitazone use); M. Ferwana, et al., “Pioglitazone and risk of bladder cancer: a meta-analysis of controlled studies,” 30 Diabet. Med. 1026 (2013) (based upon 6 studies, with median follow-up of 44 months, risk ratio = 1.23; 95% CI 1.09-1.39); Cristina Bosetti, “Cancer Risk for Patients Using Thiazolidinediones for Type 2 Diabetes: A Meta-Analysis,” 18 The Oncologist 148 (2013) (RR = 1.64 for longest exposure); Shiyao He, et al., “Pioglitazone prescription increases risk of bladder cancer in patients with type 2 diabetes: an updated meta-analysis,” 35 Tumor Biology 2095 (2014) (pooled hazard ratio = 1.67; 95% C.I., 1.31 – 2.12).

Crayons Help Divert California from Real Risks – Tales from the Fearmonger’s Shop

July 8th, 2015

Living in a state, California, beset by the actuality of drought and the real, imminent threat of earthquake, must be scary. And still, Californians seem to relish increasing the appearance of risks everywhere. The state has astonishing epistemic insights, knowing risks not known to anyone else, through its Proposition 65. And then there are legislative fiats that posit risks, again unknown outside California. David Lazarus, “Berkeley’s warning about cellphone radiation may go too far,” Los Angeles Times (June 26, 2015). And now there are killer crayons from China. Victoria Colliver, “Asbestos fibers found in some crayons, toys from China,” SFGate (July 8, 2015).

Ms. Colliver is largely the uncritical conduit for an advocacy group, which speaks through her, without any scientific filter:

“Environmental health advocates said there is no safe level of exposure to asbestos, a group of naturally occurring minerals with microscopic fibers. The fibers can accumulate in the lungs and have been linked to cancer and other health problems.”

Id. At best, some scientists, mostly of the zealot brand, say that there is no known safe level of exposure to asbestos, but this is quite different from saying there is known to be no safe level. Honest scientists will acknowledge a dispute about whether low-level exposures are innocuous, but uncertainty about safety at low doses does not translate into certainty about unsafety at low doses. And the suggestion that fibers can accumulate in the lungs may be true for occupational and paraoccupational exposures, but human beings have defense mechanisms that block entry by, and rid the lungs of, asbestos fibers. At the cellular and subcellular level, humans have robust defenses to low levels of carcinogens in the form of DNA repair mechanisms. Of course, as wild and unpredictable as little children can be, they rarely inhale crayons. If they do, asbestos won’t be their problem. (To be fair, one of the products tested was a powder, which could be aerosolized, but there is no quantitative assessment of the extent of asbestos in this powder product.)

Ms. Colliver’s source is a report put out by Environmental Working Group Action Fund, the website for which does not acknowledge any scientific oversight or membership. Colliver’s “hot quotes” are from Richard Lemen, who is a regular testifier for the asbestos litigation industry.

What you will not see in Colliver’s “science” coverage is that there is no mineral asbestos; rather it is a commercial term for six different fibrous naturally occurring minerals. Five of the asbestos minerals are amphiboles – crocidolite (blue), amosite (brown), tremolite, anthophyllite, and actinolite. The remaining mineral fiber is chrysotile. There are, to be sure, many other fibrous minerals, but none with any suggested carcinogenicity, other than the non-asbestos zeolite mineral erionite. The most serious health effect of some kinds of asbestos is mesothelioma, a malignancy of the serosal tissues around the lung, heart, and gut. Crocidolite and amosite are by far the major causes of mesothelioma.

Although Colliver does not link to the EWG’s report, it is easy enough to find the report on the group’s website. See Bill Walker and Sonya Lunder, “Tests Find Asbestos in Kids’ Crayons, Crime Scene Kits” (2015). Most of the products tested had no detectable asbestos fiber of any kind. The EWG report provides no quantification of the findings so it is hard to assess the extent of the asbestos present. The report does provide the identity of the fibrous asbestos minerals present: tremolite, anthophyllite, actinolite, which suggests that the fibers were present in low levels in the talcs used as binding agents or mold release for the crayons. The EWG report fails to provide quantitative information on the distribution of morphology of the so-called fibers. The biologically dangerous fibers have a high-aspect ratio. Importantly, crocidolite and amosite, which collectively are the major causes of mesothelioma, were not found.

Assuming the report is correct, the hazard to children is remote and incredibly speculative. Fibers in the crayons would not be readily aerosolized, and the fibers could not represent even a theoretical hazard unless they were inhaled. The only disease for which low exposure is even a theoretical concern is mesothelioma. Back in 2000, a similar scare erupted in the media. At that time the Consumer Product Safety Commission tested the crayons, and concluded that the risk of a child’s inhaling asbestos fiber was “extremely low.” No airborne fibers could be detected after a simulation of a child’s “vigorously coloring” with a crayon. U.S. Consumer Product Safety Commission, CPSC Staff Report on Asbestos Fibers in Children’s Crayons (2000). The business of defining what counts as an asbestos fiber, as opposed to a non-carcinogenic particle, is complicated and sometimes controversial. See Bruce W. Case, Jerrold L. Abraham, G. Meeker, Fred D. Pooley & K. E. Pinkerton, “Applying definitions of “asbestos” to environmental and “low-dose” exposure levels and health effects, particularly malignant mesothelioma,” 14 J. Toxicol. Envt’l Health B Crit. Rev. 3 (2011) (noting lack of consensus about the specific definitions for asbestos fibers).

As for low-exposure alleged risks, the evidence varies by disease outcome. For lung cancer, there is actually rather strong evidence of a threshold. And lack of conclusive evidence of a threshold below which mesothelioma will not occur is hardly evidence that mesothelioma could result from any theoretical exposure postulated from children’s use of the crayons. The business of attributing a case of mesothelioma to a low-level previous exposure is, of course, very different from predicting that a very low level exposure will have a public health effect in a large population. Asbestos minerals occur naturally, and rural and urban residents, even those without occupational exposure, have a level of asbestos that can be found in their lung tissue. There is, however, a business of attributing mesotheliomas to low-level exposures, that has become a big business indeed, in courtrooms all around the United States. The business is run by expert witnesses who regularly conflate “no known safe level” with “known no safe level,” just as Ms. Colliver did in her article. What a coincidence! See Mark A. Behrens & William L. Anderson, “The ‘any exposure’ theory: an unsound basis for asbestos causation and expert testimony,” 37 Southwestern Univ. L. Rev. 479 (2008); Nicholas P. Vari and Michael J. Ross, “State Courts Move to Dismiss Every Exposure Liability Theory in Asbestos Lawsuits,” 29 Legal Backgrounder (Feb. 28, 2014).

The One Percent Non-solution – Infante Fuels His Own Exclusion in Gasoline Leukemia Case

June 25th, 2015

Most epidemiologic studies are not admissible. Such studies involve many layers of hearsay evidence, measurements of exposures, diagnoses, records, and the like, which cannot be “cross-examined.” Our legal system allows expert witnesses to rely upon such studies, although clearly inadmissible, when “experts in the particular field would reasonably rely on those kinds of facts or data in forming an opinion on the subject.” Federal Rule of Evidence 703. One of the problems that judges face in carrying out their gatekeeping duties is to evaluate whether challenged expert witnesses have reasonably relied upon particular studies and data. Judges, unlike juries, have an obligation to explain their decisions, and many expert witness gatekeeping decisions by judges fall short by failing to provide citations to the contested studies at issue in the challenge. Sometimes the parties may be able to discern what is being referenced, but the judicial decision has a public function that goes beyond speaking to the litigants before the court. Without full citations to the studies that underlie an expert witness’s opinion, the communities of judges, lawyers, scientists, and others cannot evaluate the judge’s gatekeeping. Imagine a judicial opinion that vaguely referred to a decision by another judge, but failed to provide a citation? We would think such an opinion to be a miserable failure of the judge’s obligation to explain and justify the resolution of the matter, as well as a case of poor legal scholarship. The same considerations should apply to the scientific studies relied upon by an expert witness, whose opinion is being discussed in a judicial opinion.

Judge Sarah Vance’s opinion in Burst v. Shell Oil Co., C. A. No. 14–109, 2015 WL 3755953 (E.D. La. June 16, 2015) [cited as Burst], is a good example of judicial opinion writing, in the context of deciding an evidentiary challenge to an expert witness’s opinion, which satisfies the requirements of judicial opinion writing, as well as basic scholarship. The key studies relied upon by the challenged expert witness are identified, and cited, in a way that permits both litigants and non-litigants to review Her Honor’s opinion, and evaluate both the challenged expert witness’s opinion, and the trial judge’s gatekeeping performance. Citations to the underlying studies create the delicious possibility that the trial judge might actually have read the papers to decide the admissibility question. On the merits, Judge Vance’s opinion in Burst also serves as a good example of judicial scrutiny that cuts through an expert witness’s hand waving and misdirection in the face of inadequate, inconsistent, and insufficient evidence for a causal conclusion.

Burst is yet another case in which plaintiff claimed that exposure to gasoline caused acute myeloid leukemia (AML), one of several different types of leukemia[1]. The claim is fraught with uncertainty and speculation in the form of extrapolations between substances, from high to low exposures, and between diseases.

Everyone has a background exposure to benzene from both natural and anthropogenic sources. Smoking results in approximately a ten-fold elevation of benzene exposure. Agency for Toxic Substances and Disease Registry (ATSDR) Public Health Statement – Benzene CAS#: 71-43-2 (August 2007). Gasoline contains small amounts of benzene, on the order of 1 percent or less. U.S. Environmental Protection Agency (EPA), Summary and Analysis of the 2011 Gasoline Benzene Pre-Compliance Report (2012).

Although gasoline has always contained benzene, the quantitative difference in levels of benzene exposure involved in working with concentrated benzene and with gasoline has led virtually all scientists and regulatory agencies to treat the two exposures differently. Benzene exposure is a known cause of AML; gasoline exposure, even in occupational contexts, is not taken to be a known cause of AML. Dose matters.

Although the reviews of the International Agency for Research on Cancer (IARC) are sometimes partisan, incomplete, and biased towards finding carcinogenicity, the IARC categorizes benzene as a known human carcinogen, in large part because of its known ability to cause AML, but regards the evidence for gasoline as inadequate for making causal conclusions. IARC, Monographs on the Evaluation of Carcinogenic Risks to Humans, Vol. 45, Occupational Exposures in Petroleum Refining; Crude Oil and Major Petroleum Fuels (1989) (“There is inadequate evidence for the carcinogenicity in humans of gasoline.”) (emphasis in original)[2].

To transmogrify a gasoline case into a benzene case, plaintiff called upon Peter F. Infante, a fellow of the white-hat conspiracy, Collegium Ramazzini, and an adjunct professor at George Washington University School of Public Health and Health Services. Previously, Dr. Infante was Director of the Office of Standards Review at the Occupational Safety and Health Administration (OSHA). More recently, Infante is known as the president and registered agent of Peter F. Infante Consulting, LLC, in Falls Church, Virginia, and a go-to expert witness for plaintiffs in toxic tort litigation[3].

In the Burst case, Infante started out in trouble, by claiming that he had “followed the methodology of the International Agency for Research on Cancer (IARC) and of the Occupational Safety and Health Administration (OSHA) in evaluating epidemiological studies, case reports and toxicological studies of benzene exposure and its effect on the hematopoietic system.” Burst at *4. Relying upon the IARC’s methodology might satisfy some uncritical courts, but here the IARC itself sharply distinguished its characterizations of benzene and gasoline in separate reviews. Infante’s opinion ignored this divide, although it ultimately had to connect gasoline exposure to the claimed injury[4].

Judge Vance found that Infante’s proffered opinions ransacked the catalogue of expert witness errors. Infante:

  • relied upon studies of benzene exposure and diseases other than the outcome of interest, AML. Burst at *4, *10, *13.
  • relied upon studies of benzene exposure rather than gasoline exposure. Burst at *9.
  • relied upon studies that assessed outcomes in groups with multiple exposures, which studies were hopelessly confounded. Burst at *7.
  • failed to acknowledge the inconsistency of outcomes in the studies of the relevant exposure, gasoline. Burst at *9.
  • relied upon studies that lacked adequate exposure measurements and characterizations, which lack was among the reasons that the ATSDR declined to label gasoline a carcinogen. Burst at *12.
  • relied upon studies that did not report statistically significant associations between gasoline exposure and AML. Burst at *10, *12.
  • cherry picked studies and failed to explain contrary results. Burst at *10.
  • cherry picked data from within studies that did not otherwise support his conclusion. Burst at *10.
  • interpreted studies at odds with how the authors of published papers interpreted their own studies. Burst at *10.
  • failed to reconcile conflicting studies. Burst at *10.
  • manipulated data without sufficient explanation or justification. Burst at *14.
  • failed to conduct an appropriate analysis of the entire dataset, along the lines of Sir Austin Bradford Hill’s nine factors. Burst at *10.

The manipulation charge is worth further discussion because it reflects upon the trial court’s acumen and the challenged witness’s deviousness. Infante combined the data from two exposure subgroups from one study[5] to claim that the study actually had a statistically significant association. The trial court found that Dr. Infante failed to explain or justify the recalculation. Burst at *14. At the pre-trial hearing, Dr. Infante offered that he performed the re-calculation on a “sticky note,” but failed to provide his calculations. The court might also have been concerned about the misuse of claiming statistical significance in a post-hoc, non-prespecified analysis that would have clearly raised a multiple comparisons issue. Infante also combined two separate datasets from an unpublished study (the Spivey study for Union Oil), which the court found problematic for his failure to explain and justify the aggregation of data. Id. This recalculation raises the issue whether the two separate datasets could be appropriately combined.
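The multiple comparisons concern can be made concrete with a rough sketch. Nothing in the opinion says how many alternative regroupings of the data were available to the witness, so the counts below are purely hypothetical, and the calculation assumes independent tests, which regroupings of the same data would not be. The function names are illustrative only.

```python
def familywise_error_rate(alpha: float, comparisons: int) -> float:
    """Chance of at least one false positive across k independent tests.

    Uses 1 - (1 - alpha)^k; independence is an assumption, not a given.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    if comparisons < 1:
        raise ValueError("need at least one comparison")
    return 1.0 - (1.0 - alpha) ** comparisons


def bonferroni_alpha(alpha: float, comparisons: int) -> float:
    """Per-test threshold that keeps the familywise rate at or below alpha."""
    if comparisons < 1:
        raise ValueError("need at least one comparison")
    return alpha / comparisons


# Hypothetical numbers of post-hoc analyses a reviewer might have tried.
for k in (1, 2, 5, 10):
    fwer = familywise_error_rate(0.05, k)
    per_test = bonferroni_alpha(0.05, k)
    print(f"k={k:2d}  familywise error={fwer:.3f}  Bonferroni per-test alpha={per_test:.4f}")
```

With a nominal 0.05 threshold, ten independent looks at the data would produce at least one “significant” result by chance roughly 40 percent of the time, which is why a single unplanned recalculation, offered without explanation, deserves skepticism.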

For another study[6], Infante adjusted the results based upon his assessment that the study was biased by a “healthy worker effect[7].” Burst at *15. Infante failed to provide any explanation of how he adjusted for the healthy worker effect, thus giving the court no basis for evaluating the reliability of his methodology. Perhaps more telling, the authors of this study acknowledged the hypothetical potential for healthy worker bias, but chose not to adjust for it because their primary analyses were conducted internally within the working study population, which fully accounted for the potential bias[8].

The court emphasized that it did not question whether combining datasets or adjusting for bias was accepted or proper methodology; rather it focused its critical scrutiny on Infante’s refusal or failure to explain and justify his post-hoc “manipulations of published data.” Burst at *15. Without a showing that AML is more common among non-working, disabled men, the healthy worker adjustment could well be questioned.

In the final analysis, Infante’s sloppy narrative review could not stand in the face of obviously inconsistent epidemiologic data. Burst at *16. The trial court found that Dr. Infante’s methodology of claiming reliance upon multiple studies, which did not reliably (validly) support his claims or “fit” his conclusions, failed to satisfy the requirements of Federal Rule of Evidence 702. The analytical gap between the data and the opinion was too great. Id. at *8. Infante’s opinion fell into the abyss[9].


[1] See, e.g., Castellow v. Chevron USA, 97 F. Supp. 2d 780, 796 (S.D.Tex.2000) (“Plaintiffs here have not shown that the relevant scientific or medical literature supports the conclusion that workers exposed to benzene, as a component of gasoline, face a statistically significant risk of an increase in the rate of AML.”); Henricksen v. Conoco Phillips Co., 605 F.Supp.2d 1142, 1175 (E.D.Wa. 2009) (“None of the studies relied upon have concluded that gasoline has the same toxic effect as benzene, and none have concluded that the benzene component of gasoline is capable of causing AML.”); Parker v. Mobil Oil Corp., 7 N.Y.3d 434, 450 (N.Y.2006) (“[N]o significant association has been found between gasoline exposure and AML. Plaintiff’s experts were unable to identify a single epidemiologic study finding an increased risk of AML as a result of exposure to gasoline.”).

[2] See also ATSDR Toxicological Profile for Gasoline (1995) (concluding “there is no conclusive evidence to support or refute the carcinogenic potential of gasoline in humans or animals based on the carcinogenicity of one of its components, benzene”); ATSDR, Public Health Statement for Automotive Gasoline (June 1995) (“However, there is no evidence that exposure to gasoline causes cancer in humans. There is not enough information available to determine if gasoline causes birth defects or affects reproduction.”).

[3] See, e.g., Harris v. CSX Transp., Inc., 753 SE 2d 275, 232 W. Va. 617 (2013); Henricksen v. ConocoPhillips Co., 605 F. Supp. 2d 1142 (E.D. Wash. 2009); Roney v. GENCORP, Civil Action No. 3: 05-0788 (S.D.W. Va. Sept. 18, 2009); Chambers v. Exxon Corp., 81 F. Supp. 2d 661 (M.D. La. 2000).

[4] Judge Vance did acknowledge that benzene studies were relevant to Infante’s causation opinion, but emphasized that such studies could not suffice to show that all gasoline exposures could cause AML. Burst at *10 (citing Dickson v. Nat’l Maint. & Repair of Ky., Inc., No. 5:08–CV–00008, 2011 WL 12538613, at *6 (W.D. Ky. April 28, 2011) (“Benzene may be considered a causative agent despite only being a component of the alleged harm.”)).

[5] L. Rushton & H. Romaniuk, “A Case-Control Study to Investigate the Risk of Leukaemia Associated with Exposure to Benzene in Petroleum Marketing and Distribution Workers in the United Kingdom,” 54 Occup. & Envt’l Med. 152 (1997).

[6] Otto Wong, et al., “Health Effects of Gasoline Exposure. II. Mortality Patterns of Distribution Workers in the United States,” 101 Envt’l Health Persp. 6 (1993).

[7] Burst at *15, citing and quoting from John Last, A Dictionary of Epidemiology (3d ed.1995) (“Workers usually exhibit lower overall death rates than the general population because the severely ill and chronically disabled are ordinarily excluded from employment.”).

[8] Wong, supra.

[9] In a separate opinion, Judge Vance excluded a physician, Dr. Robert Harrison, who similarly opined that gasoline causes AML, and Mr. Burst’s AML, without the benefit of sound science to support his opinion. Burst v. Shell Oil Co., C. A. No. 14–109, 2015 WL 3620111 (E.D. La. June 9, 2015).

Science as Adversarial Process versus Group Think

May 7th, 2015

Climate scientists, at least those scientists who believe that climate change is both real and an existential threat to human civilization, have invoked their consensus as an evidentiary ground for political action. These same scientists have also used their claim of a consensus to shame opposing points of view (climate change skeptics) as coming from “climate change deniers.”

Consensus, or “general acceptance” as it is sometimes cast in legal discussions, is rarely more than nose counting. At best, consensus is a proxy for data quality and inferential validity. At worst, consensus is a manifestation of group think and herd mentality. Debates about climate change, as well as most scientific issues, would progress more dependably if there were more data, and less harrumphing about consensus.

Olah’s Nobel Speech

One Nobel laureate, Professor George Olah, explicitly rejected the kumbaya view of science and its misplaced emphasis on consensus and collaboration. In accepting his Nobel Prize in Chemistry, Olah emphasized the value of adversarial challenges in refining and establishing scientific discovery:

“Intensive, critical studies of a controversial topic always help to eliminate the possibility of any errors. One of my favorite quotation is that by George von Bekessy (Nobel Prize in Medicine, 1961).

‘[One] way of dealing with errors is to have friends who are willing to spend the time necessary to carry out a critical examination of the experimental design beforehand and the results after the experiments have been completed. An even better way is to have an enemy. An enemy is willing to devote a vast amount of time and brain power to ferreting out errors both large and small, and this without any compensation. The trouble is that really capable enemies are scarce; most of them are only ordinary. Another trouble with enemies is that they sometimes develop into friends and lose a good deal of their zeal. It was in this way the writer lost his three best enemies. Everyone, not just scientists, needs a few good enemies!’”

George A. Olah, “My Search for Carbocations and Their Role in Chemistry,” Nobel Lecture (Dec. 8, 1994), quoting George von Békésy, Experiments in Hearing 8 (N.Y. 1960); see also McMillan v. Togus Reg’l Office, Dep’t of Veterans Affairs, 294 F. Supp. 2d 305, 317 (E.D.N.Y. 2003) (“As in political controversy, ‘science is, above all, an adversary process.’”) (internal citation omitted).

Carl Sagan expressed similar views about the importance of skepticism in science:

“At the heart of science is an essential balance between two seemingly contradictory attitudes — an openness to new ideas, no matter how bizarre or counterintuitive they may be, and the most ruthless skeptical scrutiny of all ideas, old and new. This is how deep truths are winnowed from deep nonsense.”

Carl Sagan, The Demon-Haunted World: Science as a Candle in the Dark (1995); see also Cary Coglianese, “The Limits of Consensus,” 41 Environment 28 (April 1999).

Michael Crichton, no fan of Sagan, agreed at least on the principle:

“I want to . . . talk about this notion of consensus, and the rise of what has been called consensus science. I regard consensus science as an extremely pernicious development that ought to be stopped cold in its tracks. Historically, the claim of consensus has been the first refuge of scoundrels; it is a way to avoid debate by claiming that the matter is already settled. Whenever you hear the consensus of scientists agrees on something or other, reach for your wallet, because you’re being had.

Let’s be clear: the work of science has nothing whatever to do with consensus. Consensus is the business of politics. Science, on the contrary, requires only one investigator who happens to be right, which means that he or she has results that are verifiable by reference to the real world. In science consensus is irrelevant. What is relevant is [sic] reproducible results. The greatest scientists in history are great precisely because they broke with the consensus. There is no such thing as consensus science. If it’s consensus, it isn’t science. If it’s science, it isn’t consensus. Period.”

Michael Crichton, “Lecture at California Institute of Technology: Aliens Cause Global Warming” (Jan. 17, 2003) (describing many examples of how “consensus” science historically has frustrated scientific progress).

Crystalline Silica, Carcinogenesis, and Faux Consensus

Clearly, there are times when consensus in science works against knowledge and data-driven inferences. Consider the saga of crystalline silica and lung cancer. Suggestions that silica causes lung cancer date back to the 1930s, but the suggestions were dispelled by data. The available data were evaluated by the likes of Wilhelm Hueper[1], Cuyler Hammond[2] (Selikoff’s go-to-epidemiologist), Gerrit Schepers[3], and Hans Weill[4]. Even etiologic fabulists, such as Kaye Kilburn, disclaimed any connection between silica or silicosis and lung cancer[5]. As recently as 1988, august international committees, writing for the National Institute for Occupational Safety and Health, acknowledged the evidentiary insufficiency of any claim that silica caused lung cancer[6].

IARC (1987)

So what happened to the “consensus”? A group of activist scientists, who disagreed with the consensus, sought to establish their own, new consensus. Working through the International Agency for Research on Cancer (IARC), these scientists were able to inject themselves into the IARC working group process, and gradually raise the IARC ranking of crystalline silica. In 1987, the advocate scientists were able to move the IARC to adopt a “limited evidence” classification for crystalline silica.

The term “limited evidence” is defined incoherently by the IARC as evidence that provides for a “credible” causal explanation, even though chance, bias, and confounding have not been adequately excluded. Despite the incoherent definition that giveth and taketh away, the 1987 IARC reclassification[7] into Group 2A had regulatory consequences that saw silica classified as a “regulatory carcinogen,” or a substance that was “reasonably anticipated to be a carcinogen.”

The advocates’ prophecy was self-fulfilling. In 1996, another working group of the IARC met in Lyon, France, to deliberate on the classification of crystalline silica. The 1996 working group agreed, by a close vote, to reclassify crystalline silica as a “known human carcinogen,” or a Group 1 carcinogen. The decision was accepted and reported officially in volume 68 of the IARC monographs, in 1997.

According to participants, the debate was intense and the vote close. Here is the description from one of the combatants:

“When the IARC Working Group met in Lyon in October 1996 to assess the carcinogenicity of crystalline silica, a seemingly interminable debate ensued, only curtailed by a reminder from the Secretariat that the IARC was concerned with the identification of carcinogenic hazards and not the evaluation of risks. The important distinction between the potential to cause disease in certain circumstances, and in what circumstances, is not always appreciated.

*   *   *   *   *

Even so, the debate in Lyon continued for some time, finally ending in a narrow vote, reflecting the majority view of the experts present at that particular time.”

See Corbett McDonald, “Silica and Lung Cancer: Hazard or Risk,” 44 Ann. Occup. Hyg. 1, 1 (2000); see also Corbett McDonald & Nicola Cherry, “Crystalline Silica and Lung Cancer: The Problem of Conflicting Evidence,” 8 Indoor Built Env’t 8 (1999).

Although the IARC reclassification hardly put the silica lung cancer debate to rest, it did push the regulatory agencies to walk in lockstep with the IARC and declare crystalline silica to be a “known human carcinogen.” More important, it gave regulators and scientists an excuse to avoid the hard business of evaluating complicated data, and of thinking for themselves.

Post IARC

From a sociology of science perspective, the aftermath of the 1997 IARC monograph is a fascinating natural experiment to view the creation of a sudden, thinly supported, new orthodoxy. To be sure, there were scientists who looked carefully at the IARC’s stated bases and found them inadequate, inconsistent, and incoherent[8]. One well-regarded pulmonary text in particular gives the IARC and regulatory agencies little deference:

“Silica-induced lung cancer

A series of studies suggesting that there might be a link between silica inhalation and lung cancer was reviewed by the International Agency for Research on Cancer in 1987, leading to the conclusion that the evidence for carcinogenicity of crystalline silica in experimental animals was sufficient, while in humans it was limited.112 Subsequent epidemiological publications were reviewed in 1996, when it was concluded that the epidemiological evidence linking exposure to silica to the risk of lung cancer had become somewhat stronger.113 but that in the absence of lung fibrosis remained scanty.113 The pathological evidence in humans is also weak in that premalignant changes around silicotic nodules are seldom evident.114 Nevertheless, on this rather insubstantial evidence, lung cancer in the presence of silicosis (but not coal or mixed-dust pneumoconiosis) has been accepted as a prescribed industrial disease in the UK since 1992.115 Some subsequent studies have provided support for this decision.116 In contrast to the sparse data on classic silicosis, the evidence linking carcinoma of the lung to the rare diffuse pattern of fibrosis attributed to silica and mixed dusts is much stronger and appears incontrovertible.33,92”

Bryan Corrin[9] & Andrew Nicholson, Pathology of the Lungs (3d ed. 2011).

=======================================================

Cognitive biases cause some people to see a glass half full, while others see it half empty. Add a “scientific consensus” to the mix, and many people will see a glass filled 5% as 95% full.

Consider a paper by Centers for Disease Control and NIOSH authors on silica exposure and mortality from various diseases. Geoffrey M. Calvert, Faye L. Rice, James M. Boiano, J. W. Sheehy, and Wayne T. Sanderson, “Occupational silica exposure and risk of various diseases: an analysis using death certificates from 27 states of the United States,” 60 Occup. Envt’l Med. 122 (2003). The paper was nominated for the Charles Shepard Award for Best Scientific Publication by a CDC employee, and was published in the British Medical Journal’s publication on occupational medicine. The study analyzed death certificate data from the U.S. National Occupational Mortality Surveillance (NOMS) system, which is based upon the collaboration of NIOSH, the National Center for Health Statistics, the National Cancer Institute, and some state health departments. Id. at 122.

From about 4.8 million death certificates included in their analysis, the authors found a statistically significantly decreased mortality odds ratio (MOR) for lung cancer among those who had silicosis (MOR = 0.70, 95% C.I., 0.55 to 0.89). Of course, with silicosis on the death certificates along with lung cancer, the investigators could be reasonably certain about silica exposure. Given the group-think in occupational medicine about silica and lung cancer, the authors struggled to explain away their finding:

“Although many studies observed that silicotics have an increased risk for lung cancer, a few studies, including ours, found evidence suggesting the lack of such an association. Although this lack of consistency across studies may be related to differences in study design, it suggests that silicosis is not necessary for an increased risk of lung cancer among silica exposed workers.”

Well, this statement is at best disingenuous. The authors did not merely find a lack of an association; they found a statistically significant inverse or “negative” association between silicosis and lung cancer. So it is not the case that silicosis is not necessary for an increased risk; silicosis is antithetical to an increased risk.

Looking at only death certificate information, without any data on known or suspected confounders (“diet, hobbies, tobacco use, alcohol use, or medication,” id. at 126, or comorbid diseases or pulmonary impairment, or other occupational or environmental exposures), the authors inferred low, medium, high, and “super high” silica exposure from job categories. Comparing the ever-exposed categories with low exposure yielded absolutely no association between exposure and lung cancer, and subgroup analyses (without any correction for multiple comparisons) found little association, although two subgroups were nominally statistically significantly increased, and one was nominally statistically significantly decreased, at very small deviations from expected:

Lung Cancer Mortality Odds Ratios (p-value for trend < 0.001)

ever vs. low/no exposure:                0.99 (0.98 to 1.00)

medium vs. low/no exposure:         0.88 (0.87 to 0.90)

high vs. low/no exposure:                 1.13 (1.11 to 1.15)

super high vs. low/no exposure:      1.13 (1.06 to 1.21)

Id. at Table 4, and 124.

On this weak evidentiary display, the authors declare that their “study corroborates the association between crystalline silica exposure and silicosis, lung cancer.” Id. at 123. In their conclusions, they elaborate:

“Our findings support an association between high level crystalline silica exposure and lung cancer. The statistically significant MORs for high and super high exposures compared with low/no exposure (MORs = 1.13) are consistent with the relative risk of 1.3 reported in a meta-analysis of 16 cohort and case-control studies of lung cancer in crystalline silica exposed workers without silicosis”

Id. at 126. Actually not; Calvert’s reported MORs exclude an OR of 1.3.
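
The point can be checked mechanically against the figures quoted from Table 4. What follows is only a rough sketch: the variable and function names are invented for illustration, and asking whether a confidence interval contains a given value is a crude consistency check, not a formal test of heterogeneity against the meta-analysis (which would have its own interval, not reported here).

```python
# Lung cancer MORs and 95% confidence intervals as quoted above from
# Calvert (2003), Table 4. Names below are illustrative, not from the paper.
REPORTED_MORS = {
    "ever vs. low/no": (0.99, 0.98, 1.00),
    "medium vs. low/no": (0.88, 0.87, 0.90),
    "high vs. low/no": (1.13, 1.11, 1.15),
    "super high vs. low/no": (1.13, 1.06, 1.21),
}

META_ANALYSIS_RR = 1.3  # the relative risk the authors call "consistent"


def interval_contains(lower, upper, value):
    """Return True if value lies within the closed interval [lower, upper]."""
    return lower <= value <= upper


def intervals_containing(value, table=REPORTED_MORS):
    """List the exposure contrasts whose 95% interval includes value."""
    return [
        label
        for label, (_mor, lower, upper) in table.items()
        if interval_contains(lower, upper, value)
    ]


if __name__ == "__main__":
    for label, (mor, lower, upper) in REPORTED_MORS.items():
        verdict = (
            "includes"
            if interval_contains(lower, upper, META_ANALYSIS_RR)
            else "excludes"
        )
        print(f"{label}: {mor} ({lower} to {upper}) {verdict} {META_ANALYSIS_RR}")
    print("Contrasts compatible with 1.3:", intervals_containing(META_ANALYSIS_RR))
```

On the quoted numbers, no interval reaches 1.3; the highest upper bound is 1.21, for the super high exposure category.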

The Calvert study thus is a stunning example of authors, prominent in the field of public health, looking at largely exculpatory data and declaring that they have confirmed an important finding of silica carcinogenesis. And to think that United States taxpayers paid for this paper, and that the authors almost received an honorific award for this thing!


[1] Wilhelm Hueper, “Environmental Lung Cancer,” 20 Industrial Medicine & Surgery 49, 55-56 (1951) (“However, the great majority of investigators have come to the conclusion that there does not exist any causal relation between silicosis and pulmonary or laryngeal malignancy”).

[2] Cuyler Hammond & W. Machle, “Environmental and Occupational Factors in the Development of Lung Cancer,” Ch. 3, pp. 41, 50, in E. Mayer & H. Maier, Pulmonary Carcinoma: Pathogenesis, Diagnosis, and Treatment (N.Y. 1956) (“Studies by Vorwald (41) and others agree in the conclusion that pneumoconiosis in general, and silicosis in particular, do not involve any predisposition of lung cancer.”).

[3] Gerrit Schepers, “Occupational Chest Diseases,” Chap. 33, in A. Fleming, et al., eds., Modern Occupational Medicine at 455 (Philadelphia 2d ed. 1960) (“Lung cancer, of course, occurs in silicotics and is on the increase. Thus far, however, statistical studies have failed to reveal a relatively enhanced incidence of pulmonary neoplasia in silicotic subjects.”).

[4] Ziskind, Jones, and Weill, “State of the Art: Silicosis” 113 Am. Rev. Respir. Dis. 643, 653 (1976) (“There is no indication that silicosis is associated with increased risk for the development of cancer of the respiratory or other systems.”); Weill, Jones, and Parkes, “Silicosis and Related Diseases,” Chap. 12, in Occupational Lung Disorders (3d ed. 1994) (“It may be reasonably concluded that the evidence to date that occupational exposure to silica results in excess lung cancer risk is not yet persuasive.”).

[5] Kaye Kilburn, Ruth Lilis, Edwin Holstein, “Silicosis,” in Maxcy-Rosenau, Public Health and Preventive Medicine, 11th ed., at 606 (N.Y. 1980) (“Lung cancer is apparently not a complication of silicosis”).

[6] NIOSH Silicosis and Silicate Disease Committee, “Diseases Associated With Exposure to Silica and Nonfibrous Silicate Minerals,” 112 Arch. Path. & Lab. Med. 673, 711b, ¶ 2 (1988) (“The epidemiological evidence at present is insufficient to permit conclusions regarding the role of silica in the pathogenesis of bronchogenic carcinoma.”).

[7] 42 IARC Monographs on the Evaluation of the Carcinogenic Risk of Chemicals to Humans at 22, 111, § 4.4 (1987).

[8] See, e.g., Patrick A. Hessel, John F. Gamble, J. Bernard L. Gee, Graham Gibbs, Francis H. Y. Green, W. Keith C. Morgan, and Brooke T. Mossman, “Silica, Silicosis, and Lung Cancer: A Response to A Recent Working Group,” 42 J. Occup. Envt’l Med. 704, 718 (2000) (“The data demonstrate a lack of association between lung cancer and exposure to crystalline silica in human studies. Furthermore, silica is not directly genotoxic and has been [shown] to be a pulmonary carcinogen in only one animal species, the rat, which seems to be an inappropriate [model for] carcinogenesis in humans.”).

[9] Professor of Thoracic Pathology, National Heart and Lung Institute, Imperial College School of Medicine; Honorary Consultant Pathologist, Brompton Hospital, London, UK.