TORTINI

For your delectation and delight, desultory dicta on the law of delicts.

The C-8 (Perfluorooctanoic Acid) Litigation Against DuPont, part 1

September 27th, 2015

The first plaintiff has begun her trial against E.I. Du Pont De Nemours & Company (DuPont), for alleged harm from environmental exposure to perfluorooctanoic acid or its salts (PFOA). Ms. Carla Bartlett is claiming that she developed kidney cancer as a result of drinking water allegedly contaminated with PFOA by DuPont. Nicole Hong, “Chemical-Discharge Case Against DuPont Goes to Trial: Outcome could affect thousands of claims filed by other U.S. residents,” Wall St. J. (Sept. 13, 2015). The case is pending before Chief Judge Edmund A. Sargus, Jr., in the Southern District of Ohio.

PFOA is not classified as a carcinogen in the Integrated Risk Information System (IRIS) of the U.S. Environmental Protection Agency (EPA). In 2005, the EPA Office of Pollution Prevention and Toxics submitted a “Draft Risk Assessment of the Potential Human Health Effects Associated With Exposure to Perfluorooctanoic Acid and Its Salts (PFOA),” which is available at the EPA’s website. The draft report, which rests mostly upon animal toxicology studies along with some epidemiology, stated that there was “suggestive evidence of carcinogenicity, but not sufficient to assess human carcinogenic potential.”

In 2013, the Health Council of the Netherlands evaluated the PFOA cancer issue, and found the data unsupportive of a causal conclusion. The Health Council of the Netherlands, “Perfluorooctanoic acid and its salts: Evaluation of the carcinogenicity and genotoxicity” (2013) (“The Committee is of the opinion that the available data on perfluorooctanoic acid and its salts are insufficient to evaluate the carcinogenic properties (category 3)”).

Last year, the World Health Organization (WHO) through its International Agency for Research on Cancer (IARC) reviewed the evidence on the alleged carcinogenicity of PFOA. The IARC, which has fostered much inflation with respect to carcinogenicity evaluations, classified PFOA as only possibly carcinogenic. See News, “Carcinogenicity of perfluorooctanoic acid, tetrafluoroethylene, dichloromethane, 1,2-dichloropropane, and 1,3-propane sultone,” 15 The Lancet Oncology 924 (2014).

Most independent reviews also find the animal and epidemiologic evidence unsupportive of a causal connection between PFOA and any human cancer. See, e.g., Thorsten Stahl, Daniela Mattern, and Hubertus Brunn, “Toxicology of perfluorinated compounds,” 23 Environmental Sciences Europe 38 (2011).

So you might wonder how DuPont lost its Rule 702 challenges in such a case, which it surely did. In re E. I. du Pont de Nemours & Co. C-8 Pers. Injury Litig., Civil Action 2:13-md-2433, 2015 U.S. Dist. LEXIS 98788 (S.D. Ohio July 21, 2015). That is a story for another day.

Hagiography of Selikoff

September 26th, 2015

The October 2015 issue of the American Journal of Industrial Medicine (Volume 58, Issue 10) is a “Special Issue” dedicated to “Historical Perspectives” on Selikoff. No serious historian need have applied; the collection consists of short articles by adulatory former students, and from the voice of Big Labor’s heavy hand on medical research, Sheldon Samuels. Still, students of the Selikoff phenomenon might find the ramblings of “The Lobby” revealing of its preoccupations and prejudices.

—————————————————————-

Henry A. Anderson, “Reflections on the legacy of Irving J. Selikoff, MD, on the 100th anniversary of his birth,” 58 Am. J. Indus. Med. 1013 (2015)

Philip J. Landrigan, “Irving J. Selikoff, MD January 15, 1915–May 20, 1992,” 58 Am. J. Indus. Med. 1015 (2015)

Albert Miller, “From the clinic to the field: Joint pulmonary medicine—environmental sciences laboratory investigations, 1973–1992 and beyond,” 58 Am. J. Indus. Med. 1017 (2015)

Morris Greenberg, “In commemoration of Irving J. Selikoff,” 58 Am. J. Indus. Med. 1025 (2015)

Sheldon Samuels, “The rise of a Titan: Irving J. Selikoff and his campaign for independent science,” 58 Am. J. Indus. Med. 1028 (2015)

Irving J. Selikoff MD, photographs (pages 1021–1024)


David Faigman’s Critique of G2i Inferences at Weinstein Symposium

September 25th, 2015

The DePaul Law Review’s 20th Annual Clifford Symposium on Tort Law and Social Policy is an 800-plus page tribute in honor of Judge Jack Weinstein. 64 DePaul L. Rev. (Winter 2015). There are many notable, thought-provoking articles, but my attention was commanded by the contribution on Judge Weinstein’s approach to expert witness opinion evidence. David L. Faigman & Claire Lesikar, “Organized Common Sense: Some Lessons from Judge Jack Weinstein’s Uncommonly Sensible Approach to Expert Evidence,” 64 DePaul L. Rev. 421 (2015) [cited as Faigman].

Professor Faigman praises Judge Jack Weinstein for his substantial contributions to expert witness jurisprudence, while acknowledging that Judge Weinstein has been a sometimes reluctant participant in, and supporter of, judicial gatekeeping of expert witness testimony. Professor Faigman also uses the occasion to restate his own views about the so-called “G2i” problem, the problem of translating general knowledge that pertains to groups to individual cases. In the law of torts, the G2i problem arises from the law’s requirement that plaintiffs show that they were harmed by defendants’ products or environmental exposures. In the context of modern biological “sufficient” causal set principles, this “proof” requirement entails showing both that the product or exposure can cause the specified harm in human beings generally (“general causation”), and that the product or exposure actually played a causal role in bringing about the plaintiff’s specific harm.

Faigman makes the helpful point that courts initially and incorrectly invoked “differential diagnosis” as the generally accepted methodology for attributing causation. In doing so, the courts extrapolated from the general acceptance of differential diagnosis in the medical community to courtroom testimony about etiology. The extrapolation often glossed over the methodological weaknesses of the differential approach to etiology. Not until 1995 did a court wake to the realization that what was being proffered was a “differential etiology,” and not a differential diagnosis. McCullock v. H.B. Fuller Co., 61 F.3d 1038, 1043 (2d Cir. 1995). This realization, however, did not necessarily stimulate the courts’ analytical faculties, and for the most part, they treated the methodology of specific causal attribution as generally accepted and uncontroversial. Faigman’s point that the courts need to pay attention to the methodological challenges to differential etiological analysis is well taken.

Faigman also claims, however, that in advancing “differential etiologies,” expert witnesses were inventing wholesale an approach that had no foundation or acceptance in their scientific disciplines:

 “Differential etiology is ostensibly a scientific methodology, but one not developed by, or even recognized by, physicians or scientists. As described, it is entirely logical, but has no scientific methods or principles underlying it. It is a legal invention and, as such, has analytical heft, but it is entirely bereft of empirical grounding. Courts and commentators have so far merely described the logic of differential etiology; they have yet to define what that methodology is.”

Faigman at 444.[1] Faigman is correct that courts often have left unarticulated exactly what the methodology is, but he does not quite make sense when he writes that the method of differential etiology is “entirely logical,” but has no “scientific methods or principles underlying it.” After all, Faigman starts off his essay with a quotation from Thomas Huxley that “science is nothing but trained and organized common sense.”[2] As I have written elsewhere, the form of reasoning involved in differential diagnosis is nothing other than the iterative disjunctive syllogism.[3] Either-or reasoning occurs throughout the physical and biological sciences; it is not clear why Faigman declares it un- or extra-scientific.

The strength of Faigman’s claim about the made-up nature of differential etiology appears to be undermined and contradicted by an example that he provides from clinical allergy and immunology:

“Allergists, for example, attempt to identify the etiology of allergic reactions in order to treat them (or to advise the patient to avoid what caused them), though it might still be possible to treat the allergic reactions without knowing their etiology.”

Faigman at 437. Of course, not only allergists try to determine the cause of an individual patient’s disease. Psychiatrists, in the psychoanalytic tradition, certainly do so as well. Physicians who use predictive regression models use group data, in multivariate analyses, to predict outcomes, risk, and mortality in individual patients. Faigman’s claim is similarly undermined by the existence of a few diseases (other than infectious diseases) that are defined by the causative exposure. Silicosis and manganism have played a large role in often bogus litigation, but they represent instances in which the diagnostic puzzle may also be an etiological puzzle. Of course, to the extent that a disease is defined in terms of causative exposures, there may be serious and even intractable problems caused by the lack of specificity and accuracy in the diagnostic criteria for the supposedly pathognomonic disease.

As for whether the concept of “differential etiology” is ever used in the sciences themselves, a few citations for consideration follow.

Kløve & D. Doehring, “MMPI in epileptic groups with differential etiology,” 18 J. Clin. Psychol. 149 (1962)

Kløve & C. Matthews, “Psychometric and adaptive abilities in epilepsy with differential etiology,” 7 Epilepsia 330 (1966)

Teuber & K. Usadel, “Immunosuppression in juvenile diabetes mellitus? Critical viewpoint on the treatment with cyclosporin A with consideration of the differential etiology,” 103 Fortschr. Med. 707 (1985)

G. May & W. May, “Detection of serum IgA antibodies to varicella zoster virus (VZV)–differential etiology of peripheral facial paralysis. A case report,” 74 Laryngorhinootologie 553 (1995)

Alan Roberts, “Psychiatric Comorbidity in White and African-American Illicit Substance Abusers: Evidence for Differential Etiology,” 20 Clinical Psych. Rev. 667 (2000)

Mark E. Mullins, Michael H. Lev, Dawid Schellingerhout, Gilberto Gonzalez, and Pamela W. Schaefer, “Intracranial Hemorrhage Complicating Acute Stroke: How Common Is Hemorrhagic Stroke on Initial Head CT Scan and How Often Is Initial Clinical Diagnosis of Acute Stroke Eventually Confirmed?” 26 Am. J. Neuroradiology 2207 (2005)

Qiang Fu, et al., “Differential Etiology of Posttraumatic Stress Disorder with Conduct Disorder and Major Depression in Male Veterans,” 62 Biological Psychiatry 1088 (2007)

Jesse L. Hawke, et al., “Etiology of reading difficulties as a function of gender and severity,” 20 Reading and Writing 13 (2007)

Mastrangelo, “A rare occupation causing mesothelioma: mechanisms and differential etiology,” 105 Med. Lav. 337 (2014)


[1] See also Faigman at 448 (“courts have invented a methodology – differential etiology – that purports to resolve the G2i problem. Unfortunately, this method has only so far been described; it has not been defined with any precision. For now, it remains a highly ambiguous idea, sound in principle, but profoundly underdefined.”).

[2] Thomas H. Huxley, “On the Education Value of the Natural History Sciences” (1854), in Lay Sermons, Addresses and Reviews 77 (1915).

[3] See, e.g., “Differential Etiology and Other Courtroom Magic” (June 23, 2014) (collecting cases); “Differential Diagnosis in Milward v. Acuity Specialty Products Group” (Sept. 26, 2013).

Beecher-Monas Proposes to Abandon Common Sense, Science, and Expert Witnesses for Specific Causation

September 11th, 2015

Law reviews are not peer reviewed, not that peer review is a strong guarantor of credibility, accuracy, and truth. Most law reviews have no regular provision for letters to the editor; nor is there a PubPeer that permits readers to point out errors for the benefit of the legal community. Nonetheless, law review articles are cited by lawyers and judges, often at face value, for claims and statements made by article authors. Law review articles are thus a potent source of misleading, erroneous, and mischievous ideas and claims.

Erica Beecher-Monas is a law professor at Wayne State University Law School, or Wayne Law, which considers itself “the premier public-interest law school in the Midwest.” Beware of anyone or any institution that describes itself as working for the public interest. That claim alone should put us on our guard about whose interests are being included in, and excluded from, the legitimate “public” interest.

Back in 2006, Professor Beecher-Monas published a book on evaluating scientific evidence in court, which had a few good points in a sea of error and nonsense. See Erica Beecher-Monas, Evaluating Scientific Evidence: An Interdisciplinary Framework for Intellectual Due Process (2006)[1]. More recently, Beecher-Monas has published a law review article, which from its abstract suggests that she might have something to say about this difficult area of the law:

“Scientists and jurists may appear to speak the same language, but they often mean very different things. The use of statistics is basic to scientific endeavors. But judges frequently misunderstand the terminology and reasoning of the statistics used in scientific testimony. The way scientists understand causal inference in their writings and practice, for example, differs radically from the testimony jurists require to prove causation in court. The result is a disconnect between science as it is practiced and understood by scientists, and its legal use in the courtroom. Nowhere is this more evident than in the language of statistical reasoning.

Unacknowledged difficulties in reasoning from group data to the individual case (in civil cases) and the absence of group data in making assertions about the individual (in criminal cases) beset the courts. Although nominally speaking the same language, scientists and jurists often appear to be in dire need of translators. Since expert testimony has become a mainstay of both civil and criminal litigation, this failure to communicate creates a conundrum in which jurists insist on testimony that experts are not capable of giving, and scientists attempt to conform their testimony to what the courts demand, often well beyond the limits of their expertise.”

Beecher-Monas, “Lost in Translation: Statistical Inference in Court,” 46 Arizona St. L.J. 1057, 1057 (2014) [cited as BM].

A close read of the article shows, however, that Beecher-Monas continues to promulgate misunderstanding, error, and misdirection on statistical and scientific evidence.

Individual or Specific Causation

The key thesis of this law review is that expert witnesses have no scientific or epistemic warrant upon which to opine about individual or specific causation.

“But what statistics cannot do—nor can the fields employing statistics, like epidemiology and toxicology, and DNA identification, to name a few—is to ascribe individual causation.”

BM at 1057-58.

Beecher-Monas tells us that expert witnesses are quite willing to opine on specific causation, but that they have no scientific or statistical warrant for doing so:

“Statistics is the law of large numbers. It can tell us much about populations. It can tell us, for example, that so-and-so is a member of a group that has a particular chance of developing cancer. It can tell us that exposure to a chemical or drug increases the risk to that group by a certain percentage. What statistics cannot do is tell which exposed person with cancer developed it because of exposure. This creates a conundrum for the courts, because nearly always the legal question is about the individual rather than the group to which the individual belongs.”

BM at 1057. Clinical medicine and science come in for particular chastisement by Beecher-Monas, who acknowledges the medical profession’s legitimate role in diagnosing and treating disease. Physicians use a process of differential diagnosis to arrive at the most likely diagnosis of disease, but the etiology of the disease is not part of their normal practice. Beecher-Monas leaps beyond the generalization that physicians infrequently ascertain specific causation to the sweeping claim that ascertaining the cause of a patient’s disease is beyond the clinician’s competence and scientific justification. Beecher-Monas thus tells us, in apodictic terms, that science has nothing to say about individual or specific causation. BM at 1064, 1075.

In a variety of contexts, but especially in the toxic tort arena, expert witness testimony is not reliable with respect to the inference of specific causation, which, Beecher-Monas writes, usually without qualification, is “unsupported by science.” BM at 1061. The solution for Beecher-Monas is clear. Admitting baseless expert witness testimony is “pernicious” because the whole purpose of having expert witnesses is to help the fact finder, jury or judge, who lack the background understanding and knowledge to assess the data, interpret all the evidence, and evaluate the epistemic warrant for the claims in the case. BM at 1061-62. Beecher-Monas would thus allow the expert witnesses to testify about what they legitimately know, and let the jury draw the inference about which expert witnesses in the field cannot and should not opine. BM at 1101. In other words, Beecher-Monas is perfectly fine with juries and judges guessing their way to a verdict on an issue that science cannot answer. If her book danced around this recommendation, now her law review article has come out into the open, declaring an open season to permit juries and judges to be unfettered in their specific causation judgments. What is touching is that Beecher-Monas is sufficiently committed to gatekeeping of expert witness opinion testimony that she proposes a solution to take a complex area away from expert witnesses altogether rather than confront the reality that there is often simply no good way to connect general and specific causation in a given person.

Causal Pies

Beecher-Monas relies heavily upon Professor Rothman’s notion of causal pies or sets to describe the factors that may combine to bring about a particular outcome. In doing so, she commits a non-sequitur:

“Indeed, epidemiologists speak in terms of causal pies rather than a single cause. It is simply not possible to infer logically whether a specific factor caused a particular illness.”[2]

BM at 1063. But the question on her adopted model of causation is not whether any specific factor was the cause, but whether it was one of the multiple slices in the pie. Her citation to Rothman’s statement that “it is not possible to infer logically whether a specific factor was the cause of an observed event” addresses a question different from the one that faces factfinders in court cases.

With respect to differential etiology, Beecher-Monas claims that “‘ruling in’ all potential causes cannot be done.” BM at 1075. But why not? While it is true that disease diagnosis is often made upon signs and symptoms, BM at 1076, sometimes physicians are involved in trying to identify causes in individuals. Psychiatrists of course are frequently involved in trying to identify sources of anxiety and depression in their patients. It is not all about putting a DSM-V diagnosis on the chart, and prescribing medication. And there are times when physicians can say quite confidently that a disease has a particular genetic cause, as in a man with a BRCA1 or BRCA2 mutation and breast cancer, or certain forms of neurodegenerative diseases, or an infant with a clearly genetically determined birth defect.

Beecher-Monas confuses “the” cause with “a” cause, and wanders away from both law and science into her own twilight zone. Here is an example of how Beecher-Monas’ confusion plays out. She asserts that:

“For any individual case of lung cancer, however, smoking is no more important than any of the other component causes, some of which may be unknown.”

BM at 1078. This ignores the magnitude of the risk factor and its likely contribution to a given case. Putting aside synergistic co-exposures, for most lung cancers, smoking is the “but for” cause of individual smokers’ lung cancers. Beecher-Monas sets up a strawman argument by telling us that it is logically impossible to infer “whether a specific factor in a causal pie was the cause of an observed event.” BM at 1079. But we are usually interested in whether a specific factor was “a substantial contributing factor,” without which the disease would not have occurred. This is hardly illogical or impracticable for a given case of mesothelioma in a patient who worked for years in a crocidolite asbestos factory, or for a case of lung cancer in a patient who smoked heavily for many years right up to the time of his lung cancer diagnosis. I doubt that many people would hesitate, on either logical or scientific grounds, to attribute a child’s phocomelia birth defects to his mother’s ingestion of thalidomide during an appropriate gestational window in her pregnancy.

Unhelpfully, Beecher-Monas insists upon playing this word game by telling us that:

“Looking backward from an individual case of lung cancer, in a person exposed to both asbestos and smoking, to try to determine the cause, we cannot separate which factor was primarily responsible.”

BM at 1080. And yet that issue, of “primary responsibility” is not in any jury instruction for causation in any state of the Union, to my knowledge.

From her extreme skepticism, Beecher-Monas swings to the other extreme that asserts that anything that could have been in the causal set or pie was in the causal set:

“Nothing in relative risk analysis, in statistical analysis, nor anything in medical training, permits an inference of specific causation in the individual case. No expert can tell whether a particular exposed individual’s cancer was caused by unknown factors (was idiopathic), linked to a particular gene, or caused by the individual’s chemical exposure. If all three are present, and general causation has been established for the chemical exposure, one can only infer that they all caused the disease.115 Courts demanding that experts make a contrary inference, that one of the factors was the primary cause, are asking to be misled. Experts who have tried to point that out, however, have had a difficult time getting their testimony admitted.”

BM at 1080. There is no support for Beecher-Monas’ extreme statement. She cites, in footnote 115, to Kenneth Rothman’s introductory book on epidemiology, but what he says at the cited page is quite different. Rothman explains that “every component cause that played a role was necessary to the occurrence of that case.” In other words, for every component cause that actually participated in bringing about this case, its presence was necessary to the occurrence of the case. What Rothman clearly does not say is that for a given individual’s case, the fact that a factor can cause a person’s disease means that it must have caused it. In Beecher-Monas’ hypothetical of three factors – idiopathic, particular gene, and chemical exposure, all three, or any two, or only one of the three may have made a given individual’s causal set. Beecher-Monas has carelessly or intentionally misrepresented Rothman’s actual discussion.

Physicians and epidemiologists do apply group risk figures to individuals, through the lens of predictive regression equations. The Gail Model for 5 Year Risk of Breast Cancer, for instance, is a predictive equation that comes up with a prediction for an individual patient by refining the subgroup within which the patient fits. Similarly, there are prediction models for heart attack, such as the Risk Assessment Tool for Estimating Your 10-year Risk of Having a Heart Attack. Beecher-Monas might complain that these regression equations still turn on subgroup average risk, but the point is that they can be made increasingly precise as knowledge accumulates. And the regression equations can generate confidence intervals and prediction intervals for the individual’s constellation of risk factors.
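The mechanics of such models can be sketched with made-up numbers: a multivariable logistic model combines group-derived coefficients with an individual patient’s covariates to yield an individualized risk estimate. The coefficients and covariates below are hypothetical illustrations, not the Gail model’s actual terms.

```python
import math

def predicted_risk(intercept: float, coefs: dict, patient: dict) -> float:
    """Individualized risk from a logistic model: group-level coefficients
    applied to one patient's covariate values."""
    logit = intercept + sum(coefs[k] * patient[k] for k in coefs)
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical coefficients and patient values, for illustration only
coefs = {"age_decades": 0.4, "smoker": 0.9}
patient = {"age_decades": 6.0, "smoker": 1.0}
risk = predicted_risk(-5.0, coefs, patient)  # a probability between 0 and 1
```

The point is structural: the model’s inputs are group averages, but its output is a probability tailored to the individual’s own constellation of risk factors, and it grows more precise as the underlying data accumulate.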

Significance Probability and Statistical Significance

The discussion of significance probability and significance testing in Beecher-Monas’ book was frequently in error,[3] and this new law review article is not much improved. Beecher-Monas tells us that “judges frequently misunderstand the terminology and reasoning of the statistics used in scientific testimony,” BM at 1057, which is true enough, but this article does little to ameliorate the situation. Beecher-Monas offers the following definition of the p-value:

“The P- value is the probability, assuming the null hypothesis (of no effect) is true (and the study is free of bias) of observing as strong an association as was observed.”

BM at 1064-65. This definition misses that the p-value is a cumulative tail probability, and can be one-sided or two-sided. More seriously in error, however, is the suggestion that the null hypothesis is one of no effect, when it is merely a pre-specified expected value that is the subject of the test. Of course, the null hypothesis is often one of no disparity between the observed and the expected, but the definition should not mislead on this crucial point.
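The cumulative-tail point can be made concrete. For a standard normal test statistic, the one-sided p-value is the upper-tail area beyond the observed z, and the two-sided p-value doubles it. A minimal sketch, using only the normal approximation:

```python
import math

def normal_tail_p(z: float, two_sided: bool = True) -> float:
    """Cumulative tail probability for a standard-normal test statistic."""
    one_sided = 0.5 * math.erfc(z / math.sqrt(2.0))  # P(Z >= z)
    return 2.0 * one_sided if two_sided else one_sided

# A z-statistic of 1.96 gives roughly 0.025 one-sided, 0.05 two-sided
p_one = normal_tail_p(1.96, two_sided=False)
p_two = normal_tail_p(1.96)
```

The sidedness choice matters in practice: a result reported as “significant” one-sided at 2.5% is the same result as “significant” two-sided at 5%.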

For some reason, Beecher-Monas persists in describing the conventional level of statistical significance as 95%, which substitutes the coefficient of confidence for the complement of the frequently pre-specified p-value for significance. Annoying but decipherable. See, e.g., BM at 1062, 1064, 1065. She misleadingly states that:

“The investigator will thus choose the significance level based on the size of the study, the size of the effect, and the trade-off between Type I (incorrect rejection of the null hypothesis) and Type II (incorrect failure to reject the null hypothesis) errors.”

BM at 1066. While this statement is occasionally true, it mostly is not. A quick review of the last several years of the New England Journal of Medicine will document the error. Invariably, researchers use the conventional level of alpha, at 5%, unless there is multiple testing, such as in a genetic association study.

Beecher-Monas admonishes us that “[u]sing statistical significance as a screening device is thus mistaken on many levels,” citing cases that do not provide support for this proposition.[4] BM at 1066. The Food and Drug Administration’s scientists, who review clinical trials for efficacy and safety, will no doubt be astonished to hear this admonition.

Beecher-Monas argues that courts should not factor statistical significance or confidence intervals into their gatekeeping of expert witnesses, but that they should “admit studies,” and leave it to the lawyers and expert witnesses to explain the strengths and weaknesses of the studies relied upon. BM at 1071. Of course, studies themselves are rarely admitted because they represent many levels of hearsay by unknown declarants. Given Beecher-Monas’ acknowledgment of how poorly judges and lawyers understand statistical significance, this argument is cynical indeed.

Remarkably, Beecher-Monas declares, without citation, that

“the purpose of epidemiologists’ use of statistical concepts like relative risk, confidence intervals, and statistical significance are intended to describe studies, not to weed out the invalid from the valid.”

BM at 1095. She thus excludes by ipse dixit any inferential purposes these statistical tools have. She goes further and gives us a concrete example:

“If the methodology is otherwise sound, small studies that fail to meet a P-level of 5 [sic], say, or have a relative risk of 1.3 for example, or a confidence level that includes 1 at 95% confidence, but relative risk greater than 1 at 90% confidence ought to be admissible. And understanding that statistics in context means that data from many sources need to be considered in the causation assessment means courts should not dismiss non-epidemiological evidence out of hand.”

BM at 1095. Well, again, studies are not admissible; the issue is whether they may be reasonably relied upon, and whether reliance upon them may support an opinion claiming causality. And a “P-level” of 5 is, well, let us hope a serious typographical error. Beecher-Monas’ advice is especially misleading when there is only one study, or only one study in a constellation of exonerative studies. See, e.g., In re Accutane, No. 271(MCL), 2015 WL 753674, 2015 BL 59277 (N.J. Super. Law Div. Atlantic Cty. Feb. 20, 2015) (excluding Professor David Madigan for cherry picking studies to rely upon).

Confidence Intervals

Beecher-Monas’ book provided a good deal of erroneous information on confidence intervals.[5] The current article improves on the definitions, but still manages to go astray:

“The rationale courts often give for the categorical exclusion of studies with confidence intervals including the relative risk of one is that such studies lack statistical significance.62 Well, yes and no. The problem here is the courts’ use of a dichotomous meaning for statistical significance (significant or not).63 This is not a correct understanding of statistical significance.”

BM at 1069. Well, yes and no; this interpretation of a confidence interval, say with a coefficient of confidence of 95%, is a reasonable interpretation of whether the point estimate is statistically significant at an alpha of 5%. If Beecher-Monas does not like strict significance testing, that is fine, but she cannot mandate its abandonment by scientists or the courts. Certainly the cited interpretation is one proper interpretation among several.
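The duality the courts rely upon can be stated precisely for a ratio measure estimated on the log scale: the 95% confidence interval excludes 1.0 exactly when the two-sided p-value falls below 0.05. A minimal sketch, with an assumed relative risk and standard error:

```python
import math

def rr_ci_and_z(rr: float, se_log_rr: float):
    """95% Wald interval for a relative risk (computed on the log scale),
    together with the z-statistic for the null hypothesis RR = 1."""
    lo = math.exp(math.log(rr) - 1.96 * se_log_rr)
    hi = math.exp(math.log(rr) + 1.96 * se_log_rr)
    z = math.log(rr) / se_log_rr
    return (lo, hi), z

# Assumed values: RR = 1.5 with SE(log RR) = 0.18.
# The interval excludes 1.0 precisely because |z| exceeds 1.96.
(lo, hi), z = rr_ci_and_z(1.5, 0.18)
```

So the “dichotomous” reading the courts use is not a misunderstanding of the confidence interval; it is the significance test restated in interval form.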

Power

There were several misleading references to statistical power in Beecher-Monas’ book, but the new law review tops them by giving a new, bogus definition:

“Power, the probability that the study in which the hypothesis is being tested will reject the alterative [sic] hypothesis when it is false, increases with the size of the study.”

BM at 1065. For this definition, Beecher-Monas cites to the Reference Manual on Scientific Evidence, but butchers the correct definition given by the late David Freedman and David Kaye.[6] All of which is very disturbing.
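The correct definition, the probability of rejecting the null hypothesis when a specified alternative is true, can be illustrated with a simple two-sided z-test; power rises as the standard error shrinks with increasing study size. A sketch with assumed effect sizes:

```python
import math

def phi(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def power_z_test(effect: float, se: float) -> float:
    """Probability of rejecting the null (no effect) at two-sided alpha = 0.05,
    when the true effect size is 'effect' with standard error 'se'."""
    z_crit = 1.96
    z = effect / se
    # Chance the observed statistic falls beyond either critical boundary
    return (1.0 - phi(z_crit - z)) + phi(-z_crit - z)

small_study = power_z_test(0.5, 0.25)    # z = 2.0 under the alternative
large_study = power_z_test(0.5, 0.125)   # z = 4.0: larger study, more power
```

Note that power is defined against the null hypothesis, not the alternative: the test rejects the null when the alternative is true, which is precisely what the quoted passage garbles.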

Relative Risks and Other Risk Measures

Beecher-Monas begins badly by misdefining the concept of relative risk:

“as the percentage of risk in the exposed population attributable to the agent under investigation.”

BM at 1068. Perhaps this percentage can be derived from the relative risk, if we know it to be the true measure with some certainty, through a calculation of attributable risk; but a law review article that conflates attributable risk with relative risk, while taking the entire medical profession to task, and most of the judiciary to boot, should have been written more carefully.
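The two measures answer different questions, and the derivation runs only one way. If a relative risk is accepted as the true causal measure, the attributable fraction among the exposed follows by the standard formula AF = (RR - 1)/RR; the two numbers are not interchangeable:

```python
def attributable_fraction(rr: float) -> float:
    """Fraction of risk among the exposed attributable to the exposure,
    assuming the relative risk reflects a true causal effect."""
    return (rr - 1.0) / rr

# A relative risk of 2.0 corresponds to an attributable fraction of 50%,
# which is not the same thing as a "relative risk of 50%"
af_doubled = attributable_fraction(2.0)   # 0.5
af_modest = attributable_fraction(1.3)    # about 0.23
```

The 50% threshold familiar from the relative-risk-greater-than-two debate is just this formula evaluated at RR = 2.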

Then Beecher-Monas tells us that the “[r]elative risk is a statistical test that (like statistical significance) depends on the size of the population being tested.” BM at 1068. Well, actually not; the calculation of the RR is unaffected by the sample size. The variance of course will vary with the sample size, but Beecher-Monas seems intent on ignoring random variability.
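The point can be verified directly: scaling a cohort up tenfold leaves the relative risk unchanged while shrinking its variance, and hence narrowing the confidence interval. A sketch using the standard log-scale Wald interval, with invented counts:

```python
import math

def rr_with_ci(a: int, n1: int, b: int, n0: int):
    """Relative risk for exposed (a cases of n1) versus unexposed
    (b cases of n0), with an approximate 95% CI (log-scale Wald)."""
    rr = (a / n1) / (b / n0)
    se = math.sqrt(1 / a - 1 / n1 + 1 / b - 1 / n0)
    lo = math.exp(math.log(rr) - 1.96 * se)
    hi = math.exp(math.log(rr) + 1.96 * se)
    return rr, lo, hi

small = rr_with_ci(15, 100, 10, 100)        # RR = 1.5
large = rr_with_ci(150, 1000, 100, 1000)    # same RR = 1.5, narrower CI
```

The point estimate is identical in both cohorts; only the random variability around it changes with sample size.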

Perhaps most egregious is Beecher-Monas’ assertion that:

“Any increase above a relative risk of one indicates that there is some effect.”

BM at 1067. So much for ruling out chance, bias, and confounding! Or looking at an entire body of epidemiologic research for strength, consistency, coherence, exposure-response, etc. Beecher-Monas has thus moved beyond a liberal, to a libertine, position. In case the reader has any doubts of the idiosyncrasy of her views, she repeats herself:

“As long as there is a relative risk greater than 1.0, there is some association, and experts should be permitted to base their causal explanations on such studies.”

BM at 1067-68. This is evidentiary nihilism in full glory. Beecher-Monas has endorsed relying upon studies irrespective of their study design or validity, their individual confidence intervals, their aggregate summary point estimates and confidence intervals, or the absence of important Bradford Hill considerations, such as consistency, strength, and dose-response. So an expert witness may opine about general causation from reliance upon a single study with a relative risk of 1.05, say with a 95% confidence interval of 0.8 – 1.4?[7] For this startling proposition, Beecher-Monas cites the work of Sander Greenland, a wild and wooly plaintiffs’ expert witness in various toxic tort litigations, including vaccine autism and silicone autoimmune cases.

RR > 2

Beecher-Monas’ discussion of inferring specific causation from relative risks greater than two devolves into a muddle because she fails to distinguish general from specific causation. BM at 1067. A relative risk greater than two has different relevance for general and for specific causation, depending upon context, such as whether the evidence comes from clinical trials or epidemiologic studies, how many studies are available, and the like. Ultimately, she adds little to the discussion and debate about this issue, or any other.


[1] See previous comments on the book at “Beecher-Monas and the Attempt to Eviscerate Daubert from Within”; “Friendly Fire Takes Aim at Daubert – Beecher-Monas And The Undue Attack on Expert Witness Gatekeeping”; and “Confidence in Intervals and Diffidence in the Courts.”

[2] Kenneth J. Rothman, Epidemiology: An Introduction 250 (2d ed. 2012).

[3] Erica Beecher-Monas, Evaluating Scientific Evidence: An Interdisciplinary Framework for Intellectual Due Process 42 n. 30, 61 (2007) (“Another way of explaining this is that it describes the probability that the procedure produced the observed effect by chance.”) (“Statistical significance is a statement about the frequency with which a particular finding is likely to arise by chance.”).

[4] See BM at 1066 & n. 44, citing “See, e.g., In re Breast Implant Litig., 11 F. Supp. 2d 1217, 1226–27 (D. Colo. 1998); Haggerty v. Upjohn Co., 950 F. Supp. 1160, 1164 (S.D. Fla. 1996), aff’d, 158 F.3d 588 (11th Cir. 1998) (“[S]cientifically valid cause and effect determinations depend on controlled clinical trials and epidemiological studies.”).”


[5] See, e.g., Erica Beecher-Monas, Evaluating Scientific Evidence 58, 67 (N.Y. 2007) (“No matter how persuasive epidemiological or toxicological studies may be, they could not show individual causation, although they might enable a (probabilistic) judgment about the association of a particular chemical exposure to human disease in general.”) (“While significance testing characterizes the probability that the relative risk would be the same as found in the study as if the results were due to chance, a relative risk of 2 is the threshold for a greater than 50 percent chance that the effect was caused by the agent in question.”)(incorrectly describing significance probability as a point probability as opposed to tail probabilities).

[6] David H. Kaye & David A. Freedman, Reference Guide on Statistics, in Federal Jud. Ctr., Reference Manual on Scientific Evidence 211, 253–54 (3d ed. 2011) (discussing the statistical concept of power).

[7] BM at 1070 (pointing to a passage in the FJC’s Reference Manual on Scientific Evidence that provides an example of one 95% confidence interval that includes 1.0, but which shrinks when calculated as a 90% interval to 1.1 to 2.2, which values “demonstrate some effect with confidence interval set at 90%”). This is nonsense in the context of observational studies.

Seventh Circuit Affirms Exclusion of Expert Witnesses in Vinyl Chloride Case

August 30th, 2015

Last week, the Seventh Circuit affirmed a federal district court’s exclusion of plaintiffs’ expert witnesses in an environmental vinyl chloride exposure case. Wood v. Textron, Inc., No. 3:10 CV 87, 2014 U.S. Dist. LEXIS 34938 (N.D. Ind. Mar. 17, 2014); 2014 U.S. Dist. LEXIS 141593, at *11 (N.D. Ind. Oct. 3, 2014), aff’d, Slip op., No. 14-3448, 2015 U.S. App. LEXIS 15076 (7th Cir. Aug. 26, 2015). Plaintiffs, children C.W. and E.W., claimed exposure from Textron’s manufacturing facility in Rochester, Indiana, which released vinyl chloride as a gas that seeped into ground water, and into neighborhood residential water wells. Slip op. at 2-3. Plaintiffs claimed present injuries in the form of “gastrointestinal issues (vomiting, bloody stools), immunological issues, and neurological issues,” as well as future increased risk of cancer. Importantly, the appellate court explicitly approved the trial court’s careful reading of the relied-upon studies to determine whether they really did support the scientific causal claims made by the expert witnesses. Given the reluctance of some federal district judges to engage with the studies actually cited, this holding is noteworthy.

To support their claims, plaintiffs offered the testimony from three familiar expert witnesses:

(1) Dr. James G. Dahlgren;

(2) Dr. Vera S. Byers; and

(3) Dr. Jill E. Ryer-Powder.

Slip op. at 5. This gaggle offered well-rehearsed but scientifically unsound arguments in place of actual evidence that the children were hurt, or would be afflicted, as a result of their claimed exposures:

(a) extrapolation from high dose animal and human studies;

(b) assertions of children’s heightened vulnerability;

(c) differential etiology;

(d) temporality; and

(e) regulatory exposure limits.

On appeal, a panel of the Seventh Circuit held that the district court had properly conducted “an in-depth review of the relevant studies that the experts relied upon to generate their differential etiology,” and their general causation opinions. Slip op. at 13-14 (distinguishing other Seventh Circuit decisions that reversed district court Rule 702 rulings, and noting that the court below followed Joiner’s lead by analyzing the relied-upon studies to assess analytical gaps and extrapolations). The plaintiffs’ expert witnesses simply failed in analytical gap bridging, and dot connecting.

Extrapolation

The Circuit agreed with the district court that the extrapolations asserted were extreme, and that they represented “analytical gaps” too wide to be permitted in a courtroom. Slip op. at 15. The challenged expert witnesses extrapolated between species, between exposure levels, between exposure duration, between exposure circumstances, and between disease outcomes.

The district court faulted Dahlgren for relying upon articles that “fail to establish that [vinyl chloride] at the dose and duration present in this case could cause the problems that the [p]laintiffs have experienced or claim that they are likely to experience.” C.W. v. Textron, 2014 U.S. Dist. LEXIS 34938, at *53, *45 (N.D. Ind. Mar. 17, 2014) (finding that the analytical gap between the cited studies and Dahlgren’s purpose in citing the studies was an unbridged gap, which Dahlgren had failed to explain). Slip op. at 8.

Byers, for instance, cited one study[1] that involved exposure for five years, at an average level over 1,000 times higher than the children’s alleged exposure levels, which lasted less than 17 months and 7 months, respectively. Perhaps even more extreme were the plaintiffs’ expert witnesses’ attempted extrapolations from animal studies, which the district court recognized as “too attenuated” from plaintiffs’ case. Slip op. at 14. The Seventh Circuit rejected plaintiffs’ claim of error that the district court had imposed a requirement of “absolute precision,” in holding that the plaintiffs’ expert witnesses’ analytical gaps (and slips) were too wide to be bridged. The Circuit provided a colorful example of a study of laboratory rodents, pressed into service for a long-term carcinogenicity assay, which found no statistically significant increase in tumors in rodents fed 0.03 milligrams of vinyl chloride per kilogram of body weight (0.03 mg/kg), for 4 to 5 days each week, for 59 weeks, compared with control rodents fed olive oil.[2] Slip op. at 14-15. The exposure level in this study, 0.03 mg/kg, was over 10 times the children’s exposure, as estimated by Ryer-Powder. And the 59 weeks of study exposure represented the great majority of the rodents’ adult lives, greatly exceeding the children’s exposure, which took place over several months of their lives. Slip op. at 15.

The Circuit held that the district court was within its discretion in evaluating the analytical gaps, and that the district court was correct to look at the study details in exercising its role as gatekeeper under Rule 702. Slip op. at 15-17. The plaintiffs’ expert witnesses failed to explain their extrapolations, which made their opinions suspect. As the Circuit court noted, there is a methodology by which scientists sometimes attempt to model human risks from animal evidence. Slip op. at 16-17, citing Bernard D. Goldstein & Mary Sue Henifin, “Reference Guide on Toxicology,” in Federal Jud. Ctr., Reference Manual on Scientific Evidence 646 (3d ed. 2011) (“The mathematical depiction of the process by which an external dose moves through various compartments in the body until it reaches the target organ is often called physiologically based pharmacokinetics or toxicokinetics.”). Given the abject failures of plaintiffs’ expert witnesses to explain their leaps of faith, the appellate court had no occasion to explore the limits of risk assessment outside regulatory contexts.

Children’s Vulnerability

Plaintiffs’ expert witnesses asserted that children are much more susceptible than adult workers, and even than laboratory rats. As is typical in such cases, these expert witnesses had no evidence to support their assertions, and they made no effort even to invoke models that attempt reasonable assessments of children’s risk.

Differential Etiology

Dahlgren and Byers both claimed that they reached individual or specific causation conclusions based upon their conduct of a “differential etiology.” The trial and appellate court both faulted them for failing to “rule in” vinyl chloride for plaintiffs’ specific ailments before going about the business of ruling out competing or alternative causes. Slip op. at 6-7; 9-10; 20-21.

The courts also rejected Dahlgren’s claim that he could rule out all potential alternative causes by noting that the children’s treating physicians had failed to identify any cause for their ailments. So after postulating a limited universe of alternative causes of “inheritance, allergy, infection or another poison,” Dahlgren ruled all of them out of the case, because these putative causes “would have been detected by [the appellants’] doctors and treated accordingly.” Slip op. at 7, 18. As the Circuit court saw the matter:

“[T]his approach is not the stuff of science. It is based on faith in his fellow physicians—nothing more. The district court did not abuse its discretion in rejecting it.”

Slip op. at 18. Of course, the court might well have noted that physicians are often concerned exclusively with identifying effective therapy, and have little or nothing to offer on actual causation.

The Seventh Circuit panel did fuss with dicta in the trial court’s opinion suggesting that differential etiology “cannot be used to support general causation.” C.W. v. Textron, 2014 U.S. Dist. LEXIS 141593, at *11 (N.D. Ind. Oct. 3, 2014). Elsewhere, the trial court wrote, in a footnote, that “[d]ifferential [etiology] is admissible only insofar as it supports specific causation, which is secondary to general causation … .” Id. at *12 n.3. Curiously, the appellate court characterized these statements as “holdings” of the trial court, but disproved its own characterization by affirming the judgment below. The Circuit court countered with its own dicta that

“there may be a case where a rigorous differential etiology is sufficient to help prove, if not prove altogether, both general and specific causation.”

Slip op. at 20 (citing, in turn, improvident dicta from the Second Circuit, in Ruggiero v. Warner-Lambert Co., 424 F.3d 249, 254 (2d Cir. 2005) (“There may be instances where, because of the rigor of differential diagnosis performed, the expert’s training and experience, the type of illness or injury at issue, or some other … circumstance, a differential diagnosis is sufficient to support an expert’s opinion in support of both general and specific causation.”).

Regulatory Pronouncements

Dahlgren based his opinions upon the children’s water supply containing vinyl chloride in excess of regulatory levels set by state and federal agencies, including the U.S. Environmental Protection Agency (E.P.A.). Slip op. at 6. Similarly, Ryer-Powder relied upon exposure levels’ exceeding regulatory permissible limits for her causation opinions. Slip op. at 10.

The district court, now with the approval of the Seventh Circuit, would have none of this nonsense. Exceeding governmental regulatory exposure limits does not prove causation. The non-compliance does not help the fact finder without knowing “the specific dangers” that led the agency to set the permissible level, and thus the regulations are not relevant at all without this information. Even with respect to specific causation, the regulatory infraction may be weak or null evidence for causation. Slip op. at 18-19 (citing Cunningham v. Masterwear Corp., 569 F.3d 673, 674–75 (7th Cir. 2009)).

Temporality

Byers and Dahlgren also emphasized that the children’s symptoms began after exposure and abated after removal from exposure. Slip op. at 9, 6-7. Both the trial and appellate courts were duly unimpressed by the post hoc ergo propter hoc argument. Slip op. at 19, citing Ervin v. Johnson & Johnson, 492 F.3d 901, 904-05 (7th Cir. 2007) (“The mere existence of a temporal relationship between taking a medication and the onset of symptoms does not show a sufficient causal relationship.”).

Increased Risk of Cancer

The plaintiffs’ expert witnesses offered opinions about the children’s future risk of cancer that were truly over the top. Dahlgren testified that the children were “highly likely” to develop cancer in the future. Slip op. at 6. Ryer-Powder claimed that the children’s exposures were “sufficient to present an unacceptable risk of cancer in the future.” Slip op. at 10. With no competent evidence to support their claims of present or past injury, these opinions about future cancer were no longer relevant. The Circuit thus missed an opportunity to comment on how meaningless these opinions were. Most people will develop a cancer at some point in their lifetime, and we might all agree that any risk is unacceptable, which is why medical research continues into the causes, prevention, and cure of cancer. An unquantified risk of cancer, however, cannot support an award of damages even if it were a proper item of damages. See, e.g., Sutcliffe v. G.A.F. Corp., 15 Phila. 339, 1986 Phila. Cty. Rptr. LEXIS 22, 1986 WL 501554 (1986). See also “Back to Baselines – Litigating Increased Risks” (Dec. 21, 2010).


[1] Steven J. Smith, et al., “Molecular Epidemiology of p53 Protein Mutations in Workers Exposed to Vinyl Chloride,” 147 Am. J. Epidemiology 302 (1998) (average level of workers’ exposure was 3,735 parts per million; children were supposedly exposed at 3 ppb). This study looked only at a putative biomarker for angiosarcoma of the liver, not at cancer risk.

[2] Cesare Maltoni, et al., “Carcinogenicity Bioassays of Vinyl Chloride Monomer: A Model of Risk Assessment on an Experimental Basis,” 41 Envt’l Health Persp. 3 (1981).

District Court Denies Writ of Coram Nobis to Dr. Harkonen

August 27th, 2015

Courts are generally suspicious of convicted defendants who challenge the competency of their trial counsel on any grounds that might reflect strategic trial decisions. A convicted defendant can always speculate about how his trial might have gone better had some witnesses, who did not fare well at trial, not been called. Similarly, a convicted defendant might well speculate that his trial counsel could and should have called other or better witnesses. Still, sometimes, trial counsel really do screw up, especially when it comes to technical, scientific, or statistical issues.

The Harkonen case is a true comedy of errors – statistical, legal, regulatory, and practical. Indeed, some would say it is truly criminal to convict someone for an interpretation of a clinical trial result.[1] As discussed in several previous posts, Dr. W. Scott Harkonen was convicted under the wire fraud statute, 18 U.S.C. § 1343, for having distributed a faxed press release about InterMune’s clinical trial, in which he described the study as having “demonstrated” Actimmune’s survival benefit in patients with mild to moderate idiopathic pulmonary fibrosis (cryptogenic fibrosing alveolitis). The trial had not shown a statistically significant result on its primary outcome, and the significance probability on the secondary outcome of survival benefit was 0.08. Dr. Harkonen reported on a non-prespecified subgroup of patients with mild to moderate disease at randomization, in which subgroup, the trial showed better survival in the experimental therapy group, p-value of 0.004, compared with the placebo group.

Having exhausted his direct appeal, Dr. Harkonen petitioned for post-conviction relief in the form of a writ of coram nobis, on grounds of ineffective assistance of counsel. Last week, federal District Judge Richard Seeborg, in San Francisco, denied Dr. Harkonen’s petition. United States v. Harkonen, Case No. 08-cr-00164-RS-1, Slip op. (N.D. Cal. Aug. 21, 2015). See Dani Kass, “Ex-InterMune CEO’s Complaints Against Trial Counsel Nixed,” Law360 (Aug. 24, 2015). Judge Seeborg held that Dr. Harkonen had failed to explain why he had not raised the claim of ineffective assistance earlier, and that trial counsel’s tactical and strategic decisions, with respect to not calling statistical expert witnesses, were “not so beyond the pale of reasonable conduct as to warrant the finding of ineffective assistance.” Slip op. at 1.

To meet its burden at trial, the government presented Dr. Thomas Fleming, a statistician and “trialist,” who had served on the data safety and monitoring board of the clinical trial at issue.[2] Fleming took the rather extreme view that a clinical trial that “fails” to meet its primary pre-stated end point at the conventional p-value of less than 5 percent is an abject failure and provides no demonstration of any claim of efficacy. (Other experts might well say that the only failed clinical trial is one that was not done.) Judge Seeborg correctly discerned that Fleming’s testimony was in the form of an opinion, and that the law of wire fraud prohibits prosecution of scientific opinions about which reasonable scientists may differ. The government’s burden was thus to show, beyond a reasonable doubt, that no reasonable scientist could have reported the Actimmune clinical trial as having “demonstrated” a survival benefit in the mild to moderate disease subgroup. Slip op. at 2.

Remarkably, at trial, the government presented no expert witnesses, and Fleming testified as a fact witness. While acknowledging that the contested issue, whether anyone could fairly say that the Actimmune clinical trial had demonstrated efficacy in a non-prespecified subgroup, called for an opinion, Judge Seeborg gave the government a pass for not presenting expert witnesses to make out its case. Indeed, Judge Seeborg noted that the government had “stressed testimony from its experts touting the view that study results without sufficiently low p-values are inherently unreliable and meaningless.” Slip op. at 3 (emphasis added). Judge Seeborg’s description of Fleming as an expert witness is remarkable because the government never sought to qualify Dr. Fleming as an expert witness, and the trial judge never gave the jury an instruction on how to evaluate the testimony of an expert witness, including an explanation that the jury was free to accept some, all, or none of Fleming’s opinion testimony. After the jury returned its guilty verdict, Harkonen’s counsel filed a motion for judgment of acquittal, based in part upon the government’s failure to qualify Fleming as an expert witness in the field of biostatistics. The trial judge refused this motion on grounds that

(1) at one point Fleming had been listed as an expert witness;

(2) Fleming’s curriculum vitae had been marked and admitted into evidence; and

(3) “[m]ost damningly,” according to the trial judge, Harkonen’s lawyers had failed to object to Fleming’s holding forth on opinions about statistical theory and practice.

Slip op. at 7. Damning indeed as evidence of a potentially serious deviation from a reasonable standard of care and competence for trial practice! On the petition for coram nobis, Judge Seeborg curiously refers to Dr. Harkonen as not objecting, when the very issue before the court, on the petition for coram nobis, is the competency of his counsel’s failing to object. Allowing a well-credentialed statistician, such as Fleming, to testify, without requesting a limiting instruction on expert witness opinion testimony certainly seems “beyond the pale.” If there were some potential tactic involved in this default, Judge Seeborg does not identify it, and none comes to mind. And even if this charade, of calling Fleming as a fact witness, were some sort of tactical cat-and-mouse litigation game between government and defendant, certainly the trial judge should have taken control of the matter by disallowing a witness, not tendered as an expert witness, from offering opinion testimony on arcane statistical issues.

Having not objected to Fleming’s opinions, Dr. Harkonen’s counsel decided not to call its own defense expert witnesses. The post-conviction court makes much of the lesser credentials of the defense witnesses, and a decision not to call expert witnesses based upon defense counsel’s apparent belief that it had undermined Fleming’s opinion on cross-examination. There is little in the cross-examination of Fleming to support the coram nobis court’s assessment. Fleming’s opinions were vulnerable in ways that trial counsel failed to exploit, and in ways that even a lesser credentialed expert witness could have made clear to a lay jury or the court. Even a journeyman statistician would have realized that Fleming had overstated the statistical orthodoxy that p-values are “magical numbers,” by noting that many statisticians and epidemiologists disagreed with invoking statistical hypothesis testing as a rigid decision procedure, based upon p-values less than 0.05. Indeed, the idea of statistical testing as driven by a rigid, pre-selected level of acceptable Type 1 error rate was rejected by the very statistician who developed and advanced computations of the p-value. See Sir Ronald Fisher, Statistical Methods and Scientific Inference 42 (Hafner 1956) (ridiculing rigid hypothesis testing as “absurdly academic, for in fact no scientific worker has a fixed level of significance at which from year to year, and in all circumstances, he rejects hypotheses; he rather gives his mind to each particular case in the light of his evidence and his ideas.”).

After the jury convicted on the wire fraud count, Dr. Harkonen changed counsel from Kasowitz Benson Torres & Friedman LLP, to Mark Haddad at Sidley Austin LLP. Mr. Haddad was able, in relatively short order, to line up two outstanding statisticians, Professor Steven Goodman, of Stanford University’s Medical School, and Professor Donald Rubin, of Harvard University. Both Professors Goodman and Rubin robustly rejected Fleming’s orthodox positions in post-trial declarations, which were too late to affect the litigation of the merits, although their contributions may well have made it difficult for the trial judge to side with the government on its request for a Draconian ten-year prison sentence. From my own perspective, I can say it was not difficult to recruit two leading, capable epidemiologists, Professors Kenneth Rothman and Timothy Lash, to join in an amicus brief that criticized Fleming’s testimony in a way that would have been devastating had it been done at trial.

The entire Harkonen affair is marked by extraordinary governmental hypocrisy. As Judge Seeborg reports:

“[t]hroughout its case in chief, the government stressed testimony from Fleming and Crager who offered that, in the world of biostatistical analysis, a 0.05 p-value threshold is ‘somewhat of a magic number’; that the only meaningful p-value from a study is the one for its primary endpoint; and that data from post-hoc subgroup analyses cannot be reported upon accurately without information about the rest of the sampling context.”[3]

Slip op. at 4. And yet, in another case, when it was politically convenient to take the opposite position, the government proclaimed, through its Solicitor General, on behalf of the FDA, that statistical significance at any level is not necessary at all for demonstrating causation:

“[w]hile statistical significance provides some indication about the validity of a correlation between a product and a harm, a determination that certain data are not statistically significant … does not refute an inference of causation.”

Brief for the United States as Amicus Curiae Supporting Respondents, in Matrixx Initiatives, Inc. v. Siracusano, 2010 WL 4624148, at *14 (Nov. 12, 2010). The methods of epidemiology and data analysis are not, however, so amenable to political expedience. The government managed both to overstate the interpretation of p-values in Harkonen, and to understate them in Matrixx Initiatives.

Like many of the judges who previously have ruled on one or another issue in the Harkonen case, Judge Seeborg struggled with statistical concepts and gave a rather bizarre, erroneous definition of what exactly was at issue with the p-values in the Actimmune trial:

“In clinical trials, a p-value is a number between one and zero which represents the probability that the results establish a cause-and-effect relationship, rather than a random effect, between the drug and a positive health benefit. Because a p-value indicates the degree to which the tested drug does not explain observed benefits, the smaller the p-value, the larger a study’s significance.”

Slip op. at 2-3. A p-value is, of course, nothing of the sort: it is the probability, computed on the assumption that the null hypothesis of no effect is true, of obtaining data at least as extreme as those observed; it is not the probability that the results establish a cause-and-effect relationship. Ultimately, this error was greatly overshadowed by the simpler error of overlooking, and condoning, trial counsel’s default in challenging the government’s failure to present credible expert witness opinion testimony on the crucial issue in the case.
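The orthodox meaning of a p-value can be made concrete with a worked example (a purely hypothetical coin-tossing experiment): the p-value is a tail probability computed on the assumption that the null hypothesis is true, not the probability that the results establish a cause-and-effect relationship.

```python
from math import comb

def binom_p_value(successes, n, p0=0.5):
    """Two-sided exact p-value: the probability, computed on the
    assumption that the null hypothesis (success probability p0)
    is true, of a count at least as far from expectation as the
    one observed."""
    expected = n * p0
    dev = abs(successes - expected)
    return sum(comb(n, k) * p0**k * (1 - p0)**(n - k)
               for k in range(n + 1)
               if abs(k - expected) >= dev)

# 60 heads in 100 tosses of a presumptively fair coin: a tail
# probability under the null, not a probability of causation
print(round(binom_p_value(60, 100), 4))
```

Notice that the calculation never assigns any probability to the alternative hypothesis; that inversion is precisely the error in the quoted definition.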

At the heart of the government’s complaint is that Dr. Harkonen’s press release does not explicitly disclose that the subgroup of mild and moderate disease patients was not pre-specified for analysis in the trial protocol and statistical analysis plan. Dr. Harkonen’s failure to disclose the ad hoc nature of the subgroup, while not laudable, hardly rose to the level of criminal fraud, especially when considered in the light of the available prior clinical trials on the same medication, and prevalent practice in not making the appropriate disclosure in press releases, and even in full, peer-reviewed publications of clinical trials and epidemiologic studies.

For better or worse, the practice of presenting unplanned subgroup analyses is quite common in the scientific community. Several years ago, the New England Journal of Medicine published a survey of publication practice in its own pages, and documented the widespread failure to limit “demonstrated” findings to pre-specified analyses.[4] In general, the survey authors were unable to determine the total number of subgroup analyses performed; and in the majority (68%) of trials discussed, the authors could not determine whether the subgroup analyses were pre-specified.[5] Although the authors of this article proposed guidelines for identifying subgroup analyses as pre-specified or post hoc, they emphasized that the proposals were not “rules” that could be rigidly prescribed.[6]

Of course, what was at issue in Dr. Harkonen’s case was not a peer-reviewed article in a prestigious journal, but a much more informal, less rigorous communication that is typical of press releases. Lack of rigor in this context is not limited to academic and industry press releases. Consider the press release recently issued by the National Institutes of Health (NIH) in connection with an NIH-funded clinical trial on age-related macular degeneration (AMD). NIH Press Release, “NIH Study Provides Clarity on Supplements for Protection against Blinding Eye Disease,” NIH News & Events Website (May 5, 2013) [last visited August 27, 2015]. The clinical trial studied a modified dietary supplement in common use to prevent or delay AMD. The NIH’s press release claimed that the study “provides clarity on supplements,” and announced a “finding” of “some benefits” when looking at just two of the subgroups. The press release does not use the words “post hoc” or “ad hoc” in connection with the subgroup analysis used to support the “finding” of benefit.

The clinical trial results were published the same day in a journal article that labeled the subgroup findings as post hoc subgroup findings.[7] The published paper also reported that the pre-specified endpoints of the clinical trial did not show statistically significant differences between therapies and placebo.

None of the p-values for the post-hoc subgroup analyses was adjusted for multiple comparisons. The NIH webpages with Questions and Answers for the public and the media both fail to report the post-hoc nature of the subgroup findings.[8] By the standards imposed upon Dr. Harkonen in this case through Dr. Fleming’s testimony, and contrary to the NIH’s public representations, the NIH trial had “failed,” and no inferences could be drawn with respect to any endpoint because the primary endpoint did not yield a statistically significant result.
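The simplest such adjustment is the Bonferroni correction, which multiplies each nominal p-value by the number of comparisons made. The sketch below, with purely hypothetical p-values for ten subgroup analyses, shows how a nominally “significant” subgroup finding can evaporate once the multiplicity is taken into account:

```python
def bonferroni(p_values):
    """Bonferroni adjustment: multiply each nominal p-value by the
    number of comparisons, capping the result at 1.0."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

# ten hypothetical subgroup analyses; the nominally 'significant'
# p = 0.02 no longer clears 0.05 once multiplicity is accounted for
nominal = [0.02, 0.20, 0.40, 0.65, 0.07, 0.33, 0.51, 0.80, 0.12, 0.95]
adjusted = bonferroni(nominal)
print(adjusted[0])  # 0.02 * 10 = 0.2, well above 0.05
```

Bonferroni is deliberately conservative; other methods (Holm, Benjamini-Hochberg) are less so, but all of them reflect the same principle that a p-value selected from many analyses cannot be read at face value.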

There are, to be sure, hopeful signs that the prevalent practice is changing. A recent article documented an increasing number of “null” effect clinical trials that have been reported, perhaps as the result of better reporting of trials without dramatic successes, increasing willingness to publish such trial results, and greater availability of trial protocols in advance of, or with, peer-review publication of trial results.[9] Transparency in clinical and other areas of research is welcome and should be the norm, descriptively and prescriptively, but we should be wary of criminalizing lapses with indictments of wire fraud for conduct that can be found in most scientific journals and press releases.


[1] See, e.g.,Who Jumped the Shark in United States v. Harkonen”; “Multiplicity versus Duplicity – The Harkonen Conviction”; “The (Clinical) Trial by Franz Kafka”; “Further Musings on U.S. v. Harkonen”; and “Subgroups — Subpar Statistical Practice versus Fraud.” In the Supreme Court, two epidemiologists and a law school lecturer filed an Amicus Brief that criticized the government’s statistical orthodoxy. Brief by Scientists And Academics as Amici Curiae, in Harkonen v. United States, 2013 WL 5915131, 2013 WL 6174902 (Supreme Court Sept. 9, 2013).

[2] The government also presented the testimony of Michael Crager, an InterMune biostatistician. Reading between the lines, we may infer that Dr. Crager was induced to testify in exchange for not being prosecuted, and that his credibility was compromised.

[3] This testimony was particularly egregious because mortality or survival is often the most important outcome measure, but frequently not made the primary trial end point because of concern over whether there would be a sufficient number of deaths over the course of the trial to assess efficacy in this outcome. In the context of the Actimmune trial, this concern was in full display, but as it turned out, when the data were collected, there was a survival benefit (p = 0.08, which shrank to 0.055 when the analysis was limited to patients who met entrance criteria, and shrank further to 0.004, when the analysis was limited plausibly to patients with only mild or moderate disease at randomization).

[4] Rui Wang, et al., “Statistics in Medicine – Reporting of Subgroup Analyses in Clinical Trials,” 357 New Eng. J. Med. 2189 (2007).

[5] Id. at 2192.

[6] Id. at 2194.

[7] Emily Chew, et al., Lutein + Zeaxanthin and Omega-3 Fatty Acids for Age-Related Macular Degeneration, 309 J. Am. Med. Ass’n 2005 (2013).

[8] See "For the Public: What the Age-Related Eye Disease Studies Mean for You" (May 2013) [last visited August 27, 2015]; "For the Media: Questions and Answers about AREDS2" (May 2013) [last visited August 27, 2015].

[9] See Robert M. Kaplan & Veronica L. Irvin, “Likelihood of Null Effects of Large NHLBI Clinical Trials Has Increased over Time,” 10 PLoS ONE e0132382 (2015); see also Editorial, “Trials register sees null results rise,” 524 Nature 269 (Aug. 20, 2015); Paul Basken, “When Researchers State Goals for Clinical Trials in Advance, Success Rates Plunge,” The Chronicle of Higher Education (Aug. 5, 2015).

Time to Retire Ancient Documents As Hearsay Exception

August 23rd, 2015

The Committee on Rules of Practice and Procedure of the Judicial Conference of the United States has prepared a Preliminary Draft of Proposed Amendments to the Federal Rules of Bankruptcy Procedure and the Federal Rules of Evidence (Aug. 2015). The Committee seeks approval of proposed amendments to Bankruptcy Rules 1001 and 1006, and to Federal Rules of Evidence 803(16) and 902. See Debra Cassens Weiss, "Federal judiciary considers dumping 'ancient documents' rule," ABA Journal Online (Aug. 19, 2015).

Rule 803(16) of the Federal Rules of Evidence is the so-called ancient document exception to the rule against hearsay. The proposed amendment would abolish this hearsay exception.

The Federal Rules of Evidence, as well as most state rules and the common law, allow for the authentication of an ancient document by showing just three things, namely that the document:

(A) is in a condition that creates no suspicion about its authenticity;

(B) was in a place where, if authentic, it would likely be; and

(C) is at least 20 years old when offered.

Federal Rule of Evidence 901(b)(8) ("Evidence About Ancient Documents or Data Compilations"). Rule 803(16) goes beyond authentication to permit the so-called ancient document, more than 20 years old and appearing to be authentic, to be admitted for its truth. The Committee is seeking the abrogation of Rule 803(16), the ancient documents exception to the hearsay rule. The proposal is based upon an earlier report of the Advisory Committee on Evidence Rules. See Hon. William K. Sessions, III, Chair, Report of the Advisory Committee on Evidence Rules (May 7, 2015).

The requested change is based upon the Committee's understanding that the exception is rarely used, and upon the rise of electronic documents, which makes the exception unnecessary because so-called ancient documents would usually be admissible under the business records or residual hearsay exceptions. Comments can be submitted online or in writing by February 16, 2016.

The fact that a document is old may perhaps add to its authenticity, but in many technical, scientific, and medical contexts, the "ancient" provenance actually makes the content unlikely to be true. The pace of change in technical and scientific opinion and understanding is too fast to indulge an exception that permits false statements of doubtful validity to confuse the finder of fact. The rule as currently in effect is thus capable of a good deal of mischief. With respect to statements or claims to scientific knowledge, the Federal Rules of Evidence have evolved towards a system of evidence-based opinion, and away from naked opinion based upon the apparent authority or prestige of the speaker. Similarly, the age of the speaker or of the document provides no warrant for the truth of the document's content.

Of course, the statements in authenticated ancient documents remain relevant to the declarant's state of mind, and nothing in the proposed amendment would affect this use of the document. As for the contested truth of the document's content, there will usually be better, more recent, and sounder scientific evidence to support the ancient document's statements if those statements are indeed correct. In the unlikely instance that more recent, more exacting evidence is unavailable, and the trustworthiness of the ancient document's statements can be otherwise established, then the statements would probably be admissible pursuant to other exceptions to the rule against hearsay, as noted by the Committee.

Let Me Not Be Frank With You – Frank Subpoena Quashed

August 19th, 2015

In June 2015, Honeywell International Inc. subpoenaed non-party witness Dr. Arthur Frank, to produce documents and to testify, in Yates v. Ford Motor Co., et al., No. 5:12-cv-752-FL (E.D.N.C.). Although Dr. Frank is a “prolific plaintiffs’ expert” witness, he was not retained in Yates. Dr. Frank thus moved to quash the subpoena in the district where he was served, and the matter ended up on the docket of Judge Gerald J. Pappert. Frank v. Honeywell Int’l, Inc., No. 15-mc-00172, 2015 U.S. Dist. LEXIS 106453, 2015 BL 260668 (E.D. Pa. Aug. 12, 2015) [cited below as Yates]. See also Steven M. Sellers, “Asbestos Expert Tops Honeywell in Subpoena Battle,” BNA Bloomberg Law (Aug. 18, 2015).

Back in 2009, Dr. Frank lobbied the National Cancer Institute ("NCI"), and succeeded in having the NCI change its website and "Fact Sheets" about the supposed cancer risks among auto mechanics from exposure to asbestos in repairing brakes. The NCI had proposed describing any increased risk of mesothelioma or lung cancer among brake repairmen as "controversial," and not supported by the available evidence. Dr. Frank, who routinely testifies for the litigation industry that the risk is certain, known, and substantial, believed the NCI statement would be "misleading, erroneous, and contrary to the public health." Frank believed that the NCI was basing its evaluation upon studies that were "unreliable," and so set out to lobby the NCI. As a result of his telephoning and letter-writing campaign, the NCI eliminated citations to two studies deemed unreliable (or inconvenient) by Dr. Frank, and adopted the following Frank-approved language:

“Studies into the cancer risk experienced by automobile mechanics exposed to asbestos through brake repair are limited, but the overall evidence suggests that there is no safe level for asbestos exposure.”

Yates at *4.

Operating in cahoots with, and under the guidance of, asbestos plaintiffs' counsel, Frank wrote to the NCI, of course mindful to run a draft of his correspondence past his litigation industry colleagues. Plaintiffs' counsel made various suggestions, which Frank adopted. Yates at *5-7.

Frank objected to the subpoena on grounds that it:

(1) was too broad and unduly burdensome, as well as intended to harass;

(2) sought communications protected by attorney-client privilege; and

(3) sought the opinion of an unretained expert witness, contrary to Federal Rule of Civil Procedure 45(d)(3)(B)(ii).

The court quashed Honeywell’s subpoena only on grounds of burden, Rule 45(d)(3)(A), and did not reach Frank’s other arguments. Yates at *8.

Citing local Eastern District of Pennsylvania precedent, Judge Pappert noted that a claim of undue burden is resolved by considering several factors:

“(1) relevance of the requested materials,

(2) the party’s need for the documents,

(3) the breadth of the request,

(4) the time period covered by the request,

(5) the particularity with which the documents are described,

(6) the burden imposed, and

(7) the recipient’s status as a non-party.”

Yates at *12.

Honeywell was easily able to show the relevance of Frank's lobbying shenanigans. Plaintiffs' counsel have used the Frank-approved NCI website language to cross-examine defense expert witnesses in asbestos personal injury cases.

Judge Pappert was not persuaded that Honeywell needed the requested discovery because Frank had already produced much of the material, and he had previously acknowledged working in concert with plaintiffs' lawyers to change the NCI statement.

Honeywell thus had the evidence it needed to rehabilitate defense expert witnesses challenged with the Frank-approved NCI language. The court left discovery into Frank's ex parte lobbying activities for a case in which Frank is actually a retained expert witness, which surely will come soon. Judge Pappert exercised restraint by not addressing Frank's improvident claims of attorney-client privilege and involuntary servitude as an expert witness.

Frank's lawyer, John O'Riordan, was quoted by the BNA as chastising Honeywell:

“What the auto industry, Honeywell and others are trying to do is attack Dr. Frank personally, and what they tried to do was improper. … If they think he was wrong as a matter of science, the answer is to come back with good science.”

Steven M. Sellers, “Asbestos Expert Tops Honeywell in Subpoena Battle,” BNA Bloomberg Law (Aug. 18, 2015).

O'Riordan's response is rather disingenuous, given that plaintiffs' counsel in asbestos cases exploit the imprimatur of the NCI in its Frank-approved statement to challenge defense expert witnesses. This game is not about science; it is about name dropping and authority-based decision making, the antithesis of science.

Publication of Two Symposia on Scientific Evidence in the Law

August 2nd, 2015

The Journal of Philosophy, Science & Law bills itself as an on-line journal for the interdisciplinary exploration of philosophy, science, and law. This journal has just made its "Daubert Special" issue available at its website:

Jason Borenstein and Carol Henderson, “Reflections on Daubert: A Look Back at the Supreme Court’s Decision,” 15 J. Philos., Sci. & Law 1 (2015)

Mark Amadeus Notturno, “Falsifiability Revisited: Popper, Daubert, and Kuhn,” 15 J. Philos., Sci. & Law 5 (2015)

Tony Ward, “An English Daubert? Law, Forensic Science and Epistemic Deference,” 15 J. Philos., Sci. & Law 26 (2015)

Daniella McCahey & Simon A. Cole, "Human(e) Science? Demarcation, Law, and 'Scientific Whaling' in Whaling in the Antarctic," 15 J. Philos., Sci. & Law 37 (2015)

Back on January 30 – 31, 2015, the Texas Law Review convened a Conference on Science Challenges for Law and Policy, to focus on issues arising at the intersection of science and law, with particular attention to criminal justice, bioethics, and the environment. The Conference schedule is still available here. Conference papers addressed the nature of scientific disputes, the role of expertise in resolving such disputes, and the legal implementation and management of scientific knowledge. Some of the Conference papers are now available in the symposium issue of the 2015 Texas Law Review:

Rebecca Dresser, "The 'Right to Try' Investigational Drugs: Science and Stories in the Access Debate," 93 Tex. L. Rev. 1631 (2015)

David L. Faigman, “Where Law and Science (and Religion?) Meet,” 93 Tex. L. Rev. 1659 (2015)

Jennifer E. Laurin, “Criminal Law’s Science Lag: How Criminal Justice Meets Changed Scientific Understanding,” 93 Tex. L. Rev. 1751 (2015)

Elizabeth Fisher, Pasky Pascual & Wendy Wagner, “Rethinking Judicial Review of Expert Agencies,” 93 Tex. L. Rev. 1681 (2015)

Sheila Jasanoff, “Serviceable Truths: Science for Action in Law and Policy,” 93 Tex. L. Rev. 1723 (2015)

Thomas O. McGarity, “Science and Policy in Setting National Ambient Air Quality Standards: Resolving the Ozone Enigma,” 93 Tex. L. Rev. 1783 (2015)

Jennifer L. Mnookin, “Constructing Evidence and Educating Juries: The Case for Modular, Made-In-Advance Expert Evidence About Eyewitness Identifications and False Confessions,” 93 Tex. L. Rev. 1811 (2015)

Papantonio on Fire — Slander & Slime

August 1st, 2015

Michael Mann's lawsuit against the Competitive Enterprise Institute (CEI) for defamation is an interesting case. See "Climategate on Appeal" (Aug. 17, 2014). Whatever you think of Mann's research, the CEI's charges calling Mann's work fraudulent were outrageous. Mann may have a political agenda, and his scientific work may be flawed and invalid, but that does not make it fraudulent. If the CEI had evidence that Mann fabricated or falsified data, then the charge would be appropriate, but so far, nothing to support the charge has emerged. In its pleadings, the CEI averred that it used "fraudulent" as a metaphor, or something like that.

The excesses of the CEI are not unique to the climate change debate. One website features an interview with Mike Papantonio, an attorney for the litigation industry, about claims that the Weinberg Group spreads scientific disinformation. "Scientists for Sale," RT Question More (Sept. 17, 2014). The Weinberg Group describes itself as providing

“biotech, medical device and pharmaceutical consulting services to companies of every size on every continent, supplying them with viable and efficient drug development pathways and compliance solutions.”

Weinberg Group Website. According to Papantonio and his media facilitator, Thom Hartmann, the Weinberg Group is a group of "professional liars and hucksters," who will "cook the books" to show that chemicals or tobacco do not cause cancer. Papantonio, however, never delivers any evidence that the Weinberg Group has falsified or fabricated evidence. He simply does not like the Weinberg Group's interpretation of scientific evidence in his legal cases, and its persistence in revealing the weaknesses of the litigation industry's litigation and regulatory claims.

A shortened version of Papantonio’s irresponsible name calling can be found on YouTube. Hartmann & Papantonio, “C-8 and the Business of Misinformation” (Sept. 16, 2014). Papantonio appears to have used his media appearances to advance the litigation industry’s cause in MDL 2433, In re E. I. du Pont de Nemours and Company C-8 Personal Injury Litigation. This MDL aggregates cases of claimed health effects from exposure to perfluorooctanoic acid (PFOA), also known as C8, which is used in making du Pont’s Teflon.

Papantonio’s rants and defamatory screeds illustrate some of the litigation industry’s rhetorical strategies:

  1. dichotomize the world into safe and harmful;
  2. by semantic fiat, declare anything not proven safe to be harmful;
  3. assert that the defense of any substance, exposure, drug, etc., which is not proven absolutely safe, is deliberate infliction of harm upon the public; and
  4. reclassify as "fraudulent" any statement that a substance, known to cause harm under some circumstance, does not cause harm under every other circumstance.

Like the CEI, Papantonio stretches the English language and common decency beyond their ultimate tensile stress. Certainly, scientists should participate in litigation and regulatory proceedings, and their views should be given close scrutiny. Papantonio's interview statements, however, exemplify a pathology of thought and expression that exceeds our tolerance for discourse in a free society; it is slime and slander.

The opinions, statements, and asseverations expressed on Tortini are my own, or those of invited guests, and these writings do not necessarily represent the views of clients, friends, or family, even when supported by good and sufficient reason.