For your delectation and delight, desultory dicta on the law of delicts.

Railroading Scientific Evidence of Causation in Court

August 31st, 2014

Harold Tanfield spent 40 years or so working for Consolidated Rail Corporation (and its predecessors), from 1952 to 1992.  Mr. Tanfield’s widow sued Conrail, under the Federal Employers’ Liability Act (“FELA”), 45 U.S.C.A. §§ 51-60, for negligently overexposing her late husband to diesel fumes, which allegedly caused him to develop lung cancer. Tanfield v. Leigh RR, No. A-4170-12T2, New Jersey Superior Court, App. Div. (Aug. 11, 2014) Slip op. at 3. [cited below as Tanfield].

The trial court granted Conrail summary judgment on grounds that plaintiff failed to show that Conrail had breached a duty of care.  The appellate court reversed and remanded for trial. The Appellate Division’s decision is “per curiam,” and franked “not for publication without the approval of the Appellate Division.” Only two of the usual three appellate judges participated.  The panel decided the case one week after it was submitted.

The plaintiff relied upon two witnesses, a co-worker of her husband, and an expert witness, Steven R. Tahan, M.D.  Dr. Tahan is a pathologist, an Associate Professor, Department of Pathology, Harvard Medical School, and the Director of Dermatopathology, Beth Israel Deaconess Medical Center.  Dr. Tahan’s website lists melanoma as his principal research interest. A PubMed search reveals no publications on diesel fumes, occupational disease, or lung cancer.  Dr. Tahan’s principal research interest, skin pathology, was decidedly not at issue in the Tanfield case.

The panel of the Appellate Division quoted from the relevant paragraphs of Tahan’s report:

“Mr. Tanfield was a railroad worker for 35 years, where he was exposed to a large number of carcinogenic chemicals and fumes, including asbestos, antimony, arsenic, benzene, beryllium, cadmium, carbon disulfide, cyanide, DDT, diesel fumes, diesel fuel, dioxins, ethylbenzene, lead, methylene chloride, mercury, naphthalene, petroleum hydrocarbon, polychlorinated biphenyls, polynuclear aromatic hydrocarbons, toluene, vinyl acetate, and other volatile organics.

I have reviewed the cytology and biopsy slides from the right lung and confirm that he had a poorly differentiated malignant non-small cell carcinoma with both adenocarcinomatous and squamous features.  I have reached the following conclusions to a reasonable degree of medical certainty based on review of the above materials, my education, training, and experience, and review of published studies.

Mr. Tanfield’s more than 35 year substantial occupational exposure to an extensive array of carcinogens and diesel fumes without provision of protective equipment such as masks, respirators, and other filters created a long-term hazard that substantially multiplied his risk for developing lung cancer over the baseline he had as a former smoker.  It is more likely than not that his occupational exposure to diesel fumes and other carcinogenic toxins present in his workplace was a significant causative factor for his development of lung cancer and death from his cancer.”

Tanfield at 6-7.

Mr. Tanfield’s co-worker testified to what appeared to him to be excessive diesel fumes in the workplace, but there is no mention of any quantitative or qualitative evidence of exposure to any other lung carcinogen.  The Appellate Division states that the above three paragraphs represent the substance of Dr. Tahan’s report, and so it appears that there is no quantification of Tanfield’s smoking, or of the length of time between his quitting smoking and the diagnosis of his lung cancer.  There is no discussion of any support for the alleged interaction between risks, or of any quantification of the extent of his increased risk from his lifestyle choices as opposed to his workplace exposure(s). There is no discussion of what Dr. Tahan visualized in his review of cytology and pathology slides that permitted him to draw inferences about the actual causes of Mr. Tanfield’s lung cancer.

The trial judge proceeded on the assumption that there was an adequate proffer of expert opinion on causation, but that Dr. Tahan’s opinion on the failure to provide masks or respirators was a “net opinion,” a bit out of Tahan’s area of expertise.  Tanfield at 8. The Appellate Division apparently thought having a skin pathologist opine about the duty of care for a railroad was good enough for government work.  The appellate court gave the widow the benefit of the lower evidentiary threshold for negligence under FELA, which supposedly excuses the lack of an industrial hygiene opinion.  Tanfield at 10.  According to the two-judge panel, “[t]he doctor’s [Tahan’s] opinions are backed by professional literature and by his own considerable years of research and experience.” Tanfield at 11.  The panel’s statement is all the more remarkable given that Tahan had never published on lung cancer, exposure assessments, or industrial hygiene measures; the vaunted experience of this witness was irrelevant to the issues in the case. Perhaps even more disturbing are the gaps in the proofs: the lack of evidence of a causal connection between many of the alleged exposures and lung cancer generally, and the absence of any discussion of whether the level of exposure to diesel fumes, from 1952 to 1992, was such that the railroads knew or should have known that that level of diesel fume caused lung cancer in workers.  And then there is the lurking probability that Mr. Tanfield’s smoking was the sole cause of his lung cancer.

Over 50 years ago, the New York Court of Appeals rejected a claim for leukemia, based upon allegations of benzene exposure, without any quantification of risk from the alleged exposure.  Miller v. National Cabinet Co., 8 N.Y.2d 277, 283-84, 168 N.E.2d 811, 813-15, 204 N.Y.S.2d 129, 132-34, modified on other grounds, 8 N.Y.2d 1025, 70 N.E.2d 214, 206 N.Y.S.2d 795 (1960). It is time to raise the standard for New Jersey courts’ consideration of epidemiologic evidence.

Peer Review, PubPeer, PubChase, and Rule 702 – Candles in the Ear

August 28th, 2014

In deciding the Daubert case, the Supreme Court identified several factors to assess whether “the reasoning or methodology underlying the testimony is scientifically valid and of whether that reasoning or methodology properly can be applied to the facts in issue.” One of those factors was whether the proffered opinion had been “peer reviewed” and published. Daubert v. Merrell Dow Pharms., Inc., 509 U.S. 579, 593-94 (1993). The Court explained the publication factor:

“Another pertinent consideration is whether the theory or technique has been subjected to peer review and publication. Publication (which is but one element of peer review) is not a sine qua non of admissibility; it does not necessarily correlate with reliability, and in some instances well-grounded but innovative theories will not have been published. Some propositions, moreover, are too particular, too new, or of too limited interest to be published. But submission to the scrutiny of the scientific community is a component of ‘good science,’ in part because it increases the likelihood that substantive flaws in methodology will be detected. The fact of publication (or lack thereof) in a peer reviewed journal thus will be a relevant, though not dispositive, consideration in assessing the scientific validity of a particular technique or methodology on which an opinion is premised.”

Daubert, 509 U.S. at 593-94 (internal citations omitted). See, e.g., Lust v. Merrell Dow Pharms., Inc., 89 F. 3d 594, 597 (9th Cir. 1996) (affirming exclusion of Dr. Alan Done, plaintiffs’ expert witness in Clomid birth defects case, in part because of the lack of peer review and publication of his litigation-driven opinions); Hall v. Baxter Healthcare Corp., 947 F. Supp. 1387, 1406 (D. Or. 1996) (noting that “the lack of peer review for [epidemiologist] Dr. Swan’s theories weighs heavily against the admissibility of Dr. Swan’s testimony”).

Case law since Daubert has made clear that peer review is neither necessary nor sufficient for the admissibility of an opinion. United States v. Mikos, 539 F.3d 706, 711 (7th Cir. 2008) (noting that the absence of peer-reviewed studies on subject of bullet grooving did not render opinion, based upon FBI database, inadmissible); In re Zoloft Prods. Liab. Litig. MDL No. 2342; 12-md-2342,  2014 U.S. Dist. LEXIS 87592; 2014 WL 2921648 (E.D. Pa. June 27, 2014) (excluding proffered testimony of epidemiologist Anick Bérard for arbitrarily selecting some point estimates and ignoring others in published studies).

As Susan Haack has noted, “peer review” has taken on mythic proportions in the adjudication of expert witness opinion admissibility.  Susan Haack, “Peer Review and Publication: Lessons for Lawyers,” 36 Stetson L. Rev. 789 (2007), republished in Susan Haack, Evidence Matters: Science, Proof, and Truth in the Law 156 (2014). Peer review, at best, is a weak proxy for study validity, which is what is really needed in judicial proceedings. Proxies avoid the labor of independent, original thought, and so they are much favored by many judges.

In the past, some litigants oversold peer review as a touchstone of reliable, admissible expert witness testimony only to find that some very shoddy opinions show up in ostensibly peer-reviewed journals. See “Misplaced Reliance On Peer Review to Separate Valid Science From Nonsense” (Aug. 14, 2011). Scientists often claim that science is “self-correcting,” but in some areas of research, there are few severe tests and little critical review, and mostly glib confirmations from acolytes.

Letters to the editor are sometimes held out as a remedy to peer-review screw ups, but such letters, which are not themselves peer reviewed, are subject to the whims of imperious editors who might wish to silence the views of those who would be critical of their judgment in publishing the article under discussion. Most journals have space only for a few letters, and unpopular but salient points of view can go unreported. Many scientists will not write letters to the editors, even when the published article is terribly wrong in its methods, data analyses, conclusions, or discussion.  Letters to the editor are often frowned upon in academic circles as not advancing an affirmative research and scholarship agenda.

Letters to the editor often must be sent within a short time window of initial publication, often too short for busy academics to analyze a paper carefully and comment.  Furthermore, letters are often limited to a few hundred words, a length that is often inadequate to develop a careful critique or exposition of the issues in the paper.  Moreover, such letters suffer from an additional procedural problem:  authors are permitted a response, and the letter writers are not permitted a reply. Authors thus get the last word, which they can often use to deflect or defuse important criticisms.  The authors’ response can be sufficiently self-serving and misleading, with immunity from further criticism, that many would-be correspondents abandon the project altogether. See, e.g., PubPeer – “Example case showing why letters to the editor can be a waste of time” (Oct. 8, 2013).

Websites and blogs provide for dynamic content, with the potential for critical reviews that can be identified by search engines. See, e.g., Paul S. Brookes, “Our broken academic journal corrections system,” PSBLAB: Cardiac Mitochondrial Research in the Lab (Jan. 14, 2014). Mostly, the internet holds untapped potential for analysis, discussion, and debate on published studies.  To be sure, some journals provide “comment fields,” on their websites, with an opportunity for open discussion.  Often, full critiques must be developed and presented elsewhere. See, e.g., Androgen Study Group, “Letter to JAMA Asking for Retraction of Misleading Article on Testosterone Therapy” (Mar. 25, 2014).


Kate Yandell, in The Scientist, reports on the creation of PubPeer a few years ago, as a forum for post-publication review and discussion of published scientific papers. Kate Yandell, “Concerns Raised Online Linger” (Aug. 25, 2014).  Billing itself as an “online journal club,” PubPeer has pointed out potentially serious problems, some of which have led to retractions and corrections. Another internet site of interest is PubChase, which monitors discussion of particular articles, as well as generating email alerts and recommendations for related articles.

One journal editor has taken notice and given notice that he will not pay attention to post-publication peer review.  Eric J. Murphy, the editor in chief of Lipids, posting a comment at PubPeer, illustrates that there will be a good deal of resistance to post-publication open peer review, out of the control of journal editors:

“As an Editor-in-Chief of a society journal, I have never examined PubPeer nor will I do so. First, there is the crowd or group mentality that may over emphasize some point in an irrational manner.  Just as using the marble theory of officiating is bad, one should never base a decision on the quantity of negative or positive comments. Second, if the concerned individual sent an e-mail or letter to me, then I would be duty bound to examine the issue.  It is not my duty to monitor PubPeer or any other such site, but rather to respond to queries sent to me.  So, with regards to Hugh’s point, I don’t support that position at all.

Mistakes happen, although frankly we try to limit these mistakes and do take steps to prevent publishing papers with FFP, it does happen.  Also, honest mistakes happen in science all the time, so[me] of these result in an erratum, while others go unnoticed by editors and reviewers.  In such a case, someone who does notice should contact the editor to put them on notice regarding the issue so that it may be resolved.  Resolution does not necessarily mean correction, but rather the editor taking a close look at the situation, discussing the situation with the original authors, and then reaching a decision.  Most of the time a correction will be made, but not always.”

Murphy’s comments are remarkable.  PubPeer provides a forum for post-publication comment, but it hardly requires editors, investigators, and consumers of scientific studies to evaluate published works by “nose counts” of favorable and unfavorable comments.  This is not, and never has been, a democratic enterprise.  Somehow, we might expect Murphy and others to evaluate the comments, on the merits, not on their prevalence.  Murphy’s declaration that he is duty-bound to investigate and evaluate letters or emails sent to him about published articles is encouraging, but the editors’ ability to ratify publication, in the face of a private communication, without comment to the scientific community, strips the community of the ability to make a principled decision on its own.  Murphy’s way, which seems largely the way of contemporary scientific publishing, ignores the important social dimension of scientific debate and resolution of issues.  Leaving control of the discussion in the hands of the editors who approved and published studies may be asking too much of editors. Nemo iudex in causa sua.

PubPeer has already tested the limits of free speech. Kate Yandell, “PubPeer Threatened with Legal Action” (Aug. 19, 2014). A scientist whose works were receiving unfavorable attention on PubPeer threatened a lawsuit.  Let’s hope that scientists can learn to be sufficiently thick skinned that there can be open discourse on the merits of their research, their data, and their conclusions.

Pritchard v. Dow Agro – Gatekeeping Exemplified

August 25th, 2014

Robert T. Pritchard was diagnosed with Non-Hodgkin’s Lymphoma (NHL) in August 2005; by fall 2005, his cancer was in remission. Mr. Pritchard had been a pesticide applicator, and so, of course, he and his wife sued the deepest pockets around, including Dow Agro Sciences, the manufacturer of Dursban. Pritchard v. Dow Agro Sciences, 705 F.Supp. 2d 471 (W.D.Pa. 2010).

The principal active ingredient of Dursban is chlorpyrifos, along with some solvents, such as xylene, cumene, and ethyltoluene. Id. at 474.  Dursban was licensed for household insecticide use until 2000, when the EPA phased out certain residential applications.  The EPA’s concern, however, was not carcinogenicity:  the EPA categorizes chlorpyrifos as “Group E,” non-carcinogenic in humans. Id. at 474-75.

According to the American Cancer Society (ACS), the cause or causes of NHL cases are unknown.  Over 60,000 new cases are diagnosed annually, in people from all walks of life, occupations, and lifestyles. The ACS identifies some risk factors, such as age, gender, race, and ethnicity, but the ACS emphasizes that chemical exposures are not proven risk factors or causes of NHL.  See Pritchard, 705 F.Supp. 2d at 474.

The litigation industry does not need scientific conclusions of causal connections; their business is manufacturing certainty in courtrooms. Or at least, the appearance of certainty. The Pritchards found their way to the litigation industry in Pittsburgh, Pennsylvania, in the form of Goldberg, Persky & White, P.C. The Goldberg Persky firm sued Dow Agro, and then put the Pritchards in touch with Dr. Bennet Omalu, to serve as their expert witness.  A lawsuit ensued.

Alas, the Pritchards’ lawsuit ran into a wall, or at least a gate, in the form of Federal Rule of Evidence 702. In the capable hands of Judge Nora Barry Fischer, Rule 702 became an effective barrier against weak and poorly considered expert witness opinion testimony.

Dr. Omalu, no stranger to lost causes, was the medical examiner of San Joaquin County, California, at the time of his engagement in the Pritchard case. After careful consideration of the Pritchards’ claims, Omalu prepared a four page report, with a single citation, to Harrison’s Principles of Internal Medicine.  Id. at 477 & n.6.  This research, however, sufficed for Omalu to conclude that Dursban caused Mr. Pritchard to develop NHL, as well as a host of ailments he had never even sued Dow Agro for, including “neuropathy, fatigue, bipolar disorder, tremors, difficulty concentrating and liver disorder.” Id. at 478. Dr. Omalu did not cite or reference any studies, in his report, to support his opinion that Dursban caused Mr. Pritchard’s ailments.  Id. at 480.

After defense counsel objected to Omalu’s report, plaintiffs’ counsel supplemented the report with some published articles, including the “Lee” study.  See Won Jin Lee, Aaron Blair, Jane A. Hoppin, Jay H. Lubin, Jennifer A. Rusiecki, Dale P. Sandler, Mustafa Dosemeci, and Michael C. R. Alavanja, “Cancer Incidence Among Pesticide Applicators Exposed to Chlorpyrifos in the Agricultural Health Study,” 96 J. Nat’l Cancer Inst. 1781 (2004) [cited as Lee].  At his deposition, and in opposition to defendants’ 702 motion, Omalu became more forthcoming with actual data and argument.  According to Omalu, “the 2004 Lee Study strongly supports a conclusion that high-level exposure to chlorpyrifos is associated with an increased risk of NHL.” Id. at 480.

This opinion put forward by Omalu bordered on scientific malpractice.  No; it was malpractice.  The Lee study looked at many different cancer end points, without adjustment for multiple comparisons.  The lack of adjustment means at the very least that any interpretation of p-values or confidence intervals would have to be modified to acknowledge the higher rate of random error.  Now for NHL, the overall relative risk (RR) for chlorpyrifos exposure was 1.03, with a 95% confidence interval, 0.62 to 1.70.  Lee at 1783.  In other words, the study that Omalu claimed supported his opinion was about as null a study as can be, with a reasonably tight confidence interval that made a doubling of the risk rather unlikely given the sample RR.
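The cost of unadjusted multiple comparisons can be illustrated with a little arithmetic. The sketch below is only a hedged illustration: it assumes independent tests at the conventional 5% level, and the endpoint counts are hypothetical, not the number of endpoints actually examined in Lee. The function name is likewise an illustrative choice.

```python
def familywise_error_rate(num_tests: int, alpha: float = 0.05) -> float:
    """Chance of at least one false-positive result among num_tests
    independent tests, each run at significance level alpha."""
    if num_tests < 1:
        raise ValueError("num_tests must be at least 1")
    return 1.0 - (1.0 - alpha) ** num_tests


if __name__ == "__main__":
    # Hypothetical endpoint counts, chosen only for illustration.
    for k in (1, 5, 10, 20):
        print(f"{k:2d} endpoints: {familywise_error_rate(k):.1%}")
```

On these assumptions, twenty independent endpoints carry a better-than-even chance of at least one nominally “significant” finding from random error alone, which is why unadjusted p-values and confidence intervals overstate the evidence.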

If the multiple endpoint testing were not sufficient to dissuade a scientist, intent on supporting the Pritchards’ claims, then the exposure subgroup analysis would have scared any prudent scientist away from supporting the plaintiffs’ claims.  The Lee study authors provided two different exposure-response analyses, one with lifetime exposure and the other with an intensity-weighted exposure, both in quartiles.  Neither analysis revealed an exposure-response trend.  For the lifetime exposure-response trend, the Lee study reported an NHL RR of 1.01, for the highest quartile of chlorpyrifos exposure. For the intensity-weighted analysis, for the highest quartile, the authors reported RR = 1.61, with a 95% confidence interval, 0.74 to 3.53.

Although the defense and the district court did not call out Omalu on his fantasy statistical inference, the district judge certainly appreciated that Omalu had no statistically significant associations between chlorpyrifos and NHL, to support his opinion. Given the weakness of relying upon a single epidemiologic study (and torturing the data therein), the district court believed that a showing of statistical significance was important to give some credibility to Omalu’s claims.  705 F.Supp. 2d at 486 (citing General Elec. Co. v. Joiner, 522 U.S. 136, 144-46 (1997);  Soldo v. Sandoz Pharm. Corp., 244 F.Supp. 2d 434, 449-50 (W.D. Pa. 2003)).

Figure 3 adapted from Lee

What to do when there is really no evidence supporting a claim?  Make up stuff.  Here is how the trial court describes Omalu’s declaration opposing exclusion:

 “Dr. Omalu interprets and recalculates the findings in the 2004 Lee Study, finding that ‘an 80% confidence interval for the highly-exposed applicators in the 2004 Lee Study spans a relative risk range for NHL from slightly above 1.0 to slightly above 2.5.’ Dr. Omalu concludes that ‘this means that there is a 90% probability that the relative risk within the population studied is greater than 1.0’.”

705 F.Supp. 2d at 481 (internal citations omitted); see also id. at 488. The calculations and the rationale for an 80% confidence interval were not provided, but plaintiffs’ counsel assured Judge Fischer at oral argument that the calculation was done using high school math. Id. at 481 n.12. Judge Fischer seemed unimpressed, especially given that there was no record of the calculation.  Id. at 481, 488.
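Because the recalculation was never put in the record, what follows is only a hedged sketch of one way an 80% interval could be backed out of published figures. It assumes a normal approximation on the log scale, and it assumes that the “highly-exposed applicators” correspond to the highest intensity-weighted quartile in Lee (RR = 1.61, 95% confidence interval 0.74 to 3.53). Neither assumption is confirmed by the opinion, and the function name is an illustrative choice.

```python
import math
from statistics import NormalDist


def rescale_interval(rr, lower95, upper95, level):
    """Re-express a 95% interval for a relative risk at another confidence
    level, assuming a normal approximation on the log scale."""
    z95 = NormalDist().inv_cdf(0.975)
    z_new = NormalDist().inv_cdf(0.5 + level / 2.0)
    # Standard error of log(RR), backed out of the width of the 95% interval.
    se = (math.log(upper95) - math.log(lower95)) / (2.0 * z95)
    half_width = z_new * se
    return (math.exp(math.log(rr) - half_width),
            math.exp(math.log(rr) + half_width))


if __name__ == "__main__":
    # Assumed inputs: highest intensity-weighted quartile reported in Lee.
    low, high = rescale_interval(1.61, 0.74, 3.53, level=0.80)
    print(f"80% interval: {low:.2f} to {high:.2f}")
```

On these assumptions the 80% interval runs from roughly 0.97 to 2.68; different inputs or a different method would give different bounds.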

The larger offense, however, was that Omalu’s interpretation of the 80% confidence interval as a probability statement of the true relative risk’s exceeding 1.0, was bogus. Dr. Omalu further displayed his lack of statistical competence when he attempted to defend his posterior probability derived from his 80% confidence interval by referring to a power calculation of a different disease in the Lee study:

“He [Omalu] further declares that ‘the authors of the 2004 Lee Study themselves endorse the probative value of a finding of elevated risk with less than a 95% confidence level when they point out that “this analysis had a 90% statistical power to detect a 1.5–fold increase in lung cancer incidence.”’”

Id. at 488 (court’s quoting of Omalu’s quoting from the Lee study). To quote Wolfgang Pauli, Omalu is so far off that he is “not even wrong.” Lee and colleagues were offering a pre-study power calculation, which they used to justify their looking at the cohort for lung cancer, not NHL, outcomes.  Lee at 1787. The power calculation does not apply to the data observed for lung cancer; and the calculation has absolutely nothing to do with NHL. The power calculation certainly has nothing to do with Omalu’s misguided attempt to offer a calculation of a posterior probability for NHL based upon a subgroup confidence interval.

Given that there were epidemiologic studies available, Judge Fischer noted that expert witnesses were obligated to factor such studies into their opinions. See 705 F.Supp. 2d at 483 (citing Soldo, 244 F.Supp. 2d at 532).  Omalu’s sins against Rule 702 included his failure to consider any studies other than the Lee study, regardless of how unsupportive the Lee study was of his opinion.  The defense experts pointed to several studies that found lower NHL rates among exposed workers than among controls, and Omalu completely failed to consider and to explain his opinion in the face of the contradictory evidence.  See 705 F.Supp. 2d at 485 (citing Perry v. Novartis Pharm. Corp. 564 F.Supp. 2d 452, 465 (E.D. Pa. 2008)). In other words, Omalu was shown to have been a cherry picker. Id. at 489.

In addition to the abridged epidemiology, Omalu relied upon an analogy between benzene itself and ethyltoluene and other solvents that contain benzene rings, to argue that these chemicals, supposedly like benzene, cause NHL.  Id. at 487. The analogy was never supported by any citations to published studies, and, of course, the analogy is seriously flawed. Many chemicals, including chemicals made and used by the human body, have benzene rings, without the slightest propensity to cause NHL.  Indeed, the evidence that benzene itself causes NHL is weak and inconsistent.  See, e.g., Knight v. Kirby Inland Marine Inc., 482 F.3d 347 (5th Cir. 2007) (affirming the exclusion of Dr. B.S. Levy in a case involving benzene exposure and NHL).

Looking at all the evidence, Judge Fischer found Omalu’s general causation opinions unreliable.  Relying upon a single, statistically non-significant epidemiologic study (Lee), while ignoring contrary studies, was not sound science.  It was not even science; it was courtroom rhetoric.

Omalu’s approach to specific causation, the identification of what caused Mr. Pritchard’s NHL, was equally spurious. Omalu purportedly conducted a “differential diagnosis” or a “differential etiology,” but he never examined Mr. Pritchard; nor did he conduct a thorough evaluation of Mr. Pritchard’s medical records. 705 F.Supp. 2d at 491. Judge Fischer found that Omalu had not conducted a thorough differential diagnosis, and that he had made no attempt to rule out idiopathic or unknown causes of NHL, despite the general absence of known causes of NHL. Id. at 492. The one study identified by Omalu reported a non-statistically significant 60% increase in NHL risk, for a subgroup in one of two different exposure-response analyses.  Although Judge Fischer treated the relative risk less than two as a non-dispositive factor in her decision, she recognized that

“The threshold for concluding that an agent was more likely than not the cause of an individual’s disease is a relative risk greater than 2.0… . When the relative risk reaches 2.0, the agent is responsible for an equal number of cases of disease as all other background causes. Thus, a relative risk of 2.0 … implies a 50% likelihood that an exposed individual’s disease was caused by the agent. A relative risk greater than 2.0 would permit an inference that an individual plaintiff’s disease was more likely than not caused by the implicated agent.”

Id. at 485-86 (quoting from Reference Manual on Scientific Evidence at 384 (2d ed. 2000)).
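The arithmetic behind the quoted passage can be sketched in a few lines. This is a simplified illustration, not the court’s or the Reference Manual’s own calculation: it assumes that the relative risk is a valid, unbiased estimate that applies uniformly to the individual plaintiff, and the function name is an illustrative choice.

```python
def probability_of_causation(relative_risk: float) -> float:
    """Share of disease among the exposed attributable to the exposure,
    (RR - 1) / RR, floored at zero when RR does not exceed 1."""
    if relative_risk <= 0:
        raise ValueError("relative risk must be positive")
    return max(0.0, (relative_risk - 1.0) / relative_risk)


if __name__ == "__main__":
    # 2.0 is the threshold in the quoted passage; 1.61 and 1.03 are the
    # point estimates from Lee discussed above.
    for rr in (2.0, 1.61, 1.03):
        print(f"RR = {rr:.2f}: {probability_of_causation(rr):.0%}")
```

Even taking the statistically non-significant 1.61 at face value, the attributable share comes to about 38%, short of the more-likely-than-not threshold.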

Left with nowhere to run, plaintiffs’ counsel swung for the bleachers by arguing that the federal court, sitting in diversity, was required to apply Pennsylvania law of evidence because the standards of Rule 702 constitute “substantive,” not procedural law. The argument, which had been previously rejected within the Third Circuit, was as legally persuasive as Omalu’s scientific opinions.  Judge Fischer excluded Omalu’s proffered opinions and granted summary judgment to the defendants. The Third Circuit affirmed in a per curiam decision. 430 Fed. Appx. 102, 2011 WL 2160456 (3d Cir. 2011).

Practical Evaluation of Scientific Claims

The evaluative process that took place in the Pritchard case missed some important details and some howlers committed by Dr. Omalu, but it was more than good enough for government work. The gatekeeping decision in Pritchard was nonetheless the target of criticism in a recent book.

Kristin Shrader-Frechette (S-F) is a professor of science who wants to teach us how to expose bad science. S-F has published, or will soon publish, a book that suggests that philosophy of science can help us expose “bad science.”  See Kristin Shrader-Frechette, Tainted: How Philosophy of Science Can Expose Bad Science (Oxford U.P. 2014) [cited below as Tainted; selections available on Google books]. S-F’s claim is intriguing, as is her move away from the demarcation problem to the difficult business of evaluation and synthesis of scientific claims.

In her introduction, S-F tells us that her book shows “how practical philosophy of science” can counteract biased studies done to promote special interests and PROFITS.  Tainted at 8. Refreshingly, S-F identifies special-interest science, done for profit, as including “individuals, industries, environmentalists, labor unions, or universities.” Id. The remainder of the book, however, appears to be a jeremiad against industry, with a blind eye towards the litigation industry (plaintiffs’ bar) and environmental zealots.

The book promises to address “public concerns” in practical, jargon-free prose. Id. at 9-10. Some of the aims of the book are to provide support for “rejecting demands for only human evidence to support hypotheses about human biology (chapter 3), avoiding using statistical-significance tests with observational data (chapter 12), and challenging use of pure-science default rules for scientific uncertainty when one is doing welfare-affecting science (chapter 14).”

Id. at 10. Hmmm.  Avoiding statistical significance tests for observational data?!?  If avoided, what does S-F hope to use to assess random error?

And then S-F refers to plaintiffs’ hired expert witness (from the Milward case), Carl Cranor, as providing “groundbreaking evaluations of causal inferences [that] have helped to improve courtroom verdicts about legal liability that otherwise put victims at risk.” Id. at 7. Whether someone is a “victim” and has been “at risk” turns on assessing causality. Cranor is not a scientist, and his philosophy of science turns on “weight of the evidence” (WOE), a subjective, speculative approach that is deaf, dumb, and blind to scientific validity.

There are other “teasers,” in the introduction to Tainted.  S-F advertises that her Chapter 5 will teach us that “[c]ontrary to popular belief, animal and not human data often provide superior evidence for human-biological hypotheses.”  Tainted at 11. Chapter 6 will show that“[c]ontrary to many physicists’ claims, there is no threshold for harm from exposure to ionizing radiation.” Id.  S-F tells us that her Chapter 7 will criticize “a common but questionable way of discovering hypotheses in epidemiology and medicine—looking at the magnitude of some effect in order to discover causes. The chapter shows instead that the likelihood, not the magnitude, of an effect is the better key to causal discovery.” Id. at 13. Discovering hypotheses — what is that about? You might have thought that hypotheses were framed from observations and then tested.

Which brings us to the trailer for Chapter 8, in which S-F promises to show that “[c]ontrary to standard statistical and medical practice, statistical-significance tests are not causally necessary to show medical and legal evidence of some effect.” Tainted at 11. Again, the teaser raises lots of questions, such as what S-F could possibly mean when she says statistical tests are not causally necessary to show an effect.  Later in the introduction, S-F says that her chapter on statistics “evaluates the well-known statistical-significance rule for discovering hypotheses and shows that because scientists routinely misuse this rule, they can miss discovering important causal hypotheses.” Id. at 13. Discovering causal hypotheses is not what courts and regulators must worry about; their task is to establish such hypotheses with sufficient, valid evidence.

Paging through the book reveals a rhetoric that is thick and unremitting, with little philosophy of science or meaningful advice on how to evaluate scientific studies.  The statistics chapter calls out, and lo, it features a discussion of the Pritchard case. See Tainted, Chapter 8, “Why Statistics Is Slippery: Easy Algorithms Fail in Biology.”

The chapter opens with an account of German scientist Fritz Haber’s development of organophosphate pesticides, and the Nazis’ use of related compounds as chemical weapons.  Tainted at 99. Then, in a fevered non-sequitur and rhetorical flourish, S-F states, with righteous indignation, that although the Nazi researchers “clearly understood the causal-neurotoxic effects of organophosphate pesticides and nerve gas,” chemical companies today “claim that the causal-carcinogenic effects of these pesticides are controversial.” Is S-F saying that a chemical that is neurotoxic must be carcinogenic for every kind of human cancer?  So it seems.

Consider the Pritchard case.  Really, the Pritchard case?  Yup; S-F holds up the Pritchard case as her exemplar of what is wrong with civil adjudication of scientific claims.  Despite the promise of jargon-free language, S-F launches into a discussion of how the judges in Pritchard assumed that statistical significance was necessary “to hypothesize causal harm.”  Tainted at 100. In this vein, S-F tells us that she will show that:

“the statistical-significance rule is not a legitimate requirement for discovering causal hypotheses.”

Id. Again, the reader is left to puzzle why statistical significance is discussed in the context of hypothesis discovery, whatever that may be, as opposed to hypothesis testing or confirmation. And whatever it may be, we are warned that “unless the [statistical significance] rule is rejected as necessary for hypothesis-discovery, it will likely lead to false causal claims, questionable scientific theories, and massive harm to innocent victims like Robert Pritchard.”

Id. S-F is decidedly not adverting to Mr. Pritchard’s victimization by the litigation industry and the likes of Dr. Omalu, although she should. S-F not only believes that the judges in Pritchard bungled their gatekeeping, she knows that Dr. Omalu was correct, and the defense experts wrong, and that Pritchard was a victim of Dursban and of questionable scientific theories that were used to embarrass Omalu and his opinions.

S-F promised to teach her readers how to evaluate scientific claims and detect “tainted” science, but all she delivers here is an ipse dixit.  There is no discussion of the actual measurements, extent of random error, or threats to validity, for studies cited either by the plaintiffs or the defendants in Pritchard.  To be sure, S-F cites the Lee study in her endnotes, but she never provides any meaningful discussion of that study or any other that has any bearing on chlorpyrifos and NHL.  S-F also cited two review articles, the first of which provides no support for her ipse dixit:

“Although mutagenicity and chronic animal bioassays for carcinogenicity of chlorpyrifos were largely negative, a recent epidemiological study of pesticide applicators reported a significant exposure response trend between chlorpyrifos use and lung and rectal cancer. However, the positive association was based on small numbers of cases, i.e., for rectal cancer an excess of less than 10 cases in the 2 highest exposure groups. The lack of precision due to the small number of observations and uncertainty about actual levels of exposure warrants caution in concluding that the observed statistical association is consistent with a causal association. This association would need to be observed in more than one study before concluding that the association between lung or rectal cancer and chlorpyrifos was consistent with a causal relationship.

There is no evidence that chlorpyrifos is hepatotoxic, nephrotoxic, or immunotoxic at doses less than those that cause frank cholinesterase poisoning.”

David L. Eaton, Robert B. Daroff, Herman Autrup, James Bridges, Patricia Buffler, Lucio G. Costa, Joseph Coyle, Guy McKhann, William C. Mobley, Lynn Nadel, Diether Neubert, Rolf Schulte-Hermann, and Peter S. Spencer, “Review of the Toxicology of Chlorpyrifos With an Emphasis on Human Exposure and Neurodevelopment,” 38 Critical Reviews in Toxicology 1, 5-6 (2008).

The second cited review article was written by clinical ecology zealot[1], William J. Rea. William J. Rea, “Pesticides,” 6 Journal of Nutritional and Environmental Medicine 55 (1996). Rea’s article does not appear in PubMed.

Shrader-Frechette’s Criticisms of Statistical Significance Testing

What is the statistical significance against which S-F rails? She offers several definitions, none of which is correct or consistent with the others.

“The statistical-significance level p is defined as the probability of the observed data, given that the null hypothesis is true.8”

Tainted at 101 (citing D. H. Johnson, “What Hypothesis Tests Are Not,” 16 Behavioral Ecology 325 (2004)). Well, not quite; attained significance probability is the probability of data observed or those more extreme, given the null hypothesis.  A Tainted definition.
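
The gap between the two definitions can be seen in a small calculation. The sketch below assumes a hypothetical experiment (20 tosses of a coin that is fair under the null hypothesis, with 15 heads observed); the numbers are invented for illustration and come from neither Tainted nor any study discussed here. It compares the probability of exactly the observed data with the one-sided attained significance probability, which also counts the more extreme outcomes.

```python
from math import comb

def prob_exactly(k, n, p=0.5):
    """Probability of exactly k successes in n trials under the null value p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def p_value_upper(k, n, p=0.5):
    """One-sided p-value: probability of k or more successes under the null."""
    return sum(prob_exactly(i, n, p) for i in range(k, n + 1))

# Hypothetical data (an assumption for illustration): 15 heads in 20 tosses.
n_tosses, heads = 20, 15
exact = prob_exactly(heads, n_tosses)
tail = p_value_upper(heads, n_tosses)

print(f"P(exactly {heads} heads | null)  = {exact:.4f}")
print(f"P({heads} or more heads | null) = {tail:.4f}")
```

In this made-up example the probability of the observed data alone is smaller than the attained significance probability, because the latter adds in the probability of every result at least as extreme.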

Later in Chapter 8, S-F discusses significance probability in a way that overtly commits the transposition fallacy, not a good thing to do in a book that sets out to teach how to evaluate scientific evidence:

“However, typically scientists view statistical significance as a measure of how confidently one might reject the null hypothesis. Traditionally they have used a 0.05 statistical-significance level, p < or = 0.05, and have viewed the probability of a false-positive (incorrectly rejecting a true null hypothesis), or type-1, error as 5 percent. Thus they assume that some finding is statistically significant and provides grounds for rejecting the null if it has at least a 95-percent probability of not being due to chance.”

Tainted at 101. Not only does the last sentence ignore the extent of error due to bias or confounding, it erroneously assigns a posterior probability that is the complement of the significance probability.  This error is not an isolated occurrence; here is another example:

“Thus, when scientists used the rule to examine the effectiveness of St. John’s Wort in relieving depression,14 or when they employed it to examine the efficacy of flutamide to treat prostate cancer,15 they concluded the treatments were ineffective because they were not statistically significant at the 0.05 level. Only at p < or = 0.14 were the results statistically significant. They had an 86-percent chance of not being due to chance.16”

Tainted at 101-02 (citing papers by Shelton (endnote 14)[2], by Eisenberger (endnote 15) [3], and Rothman’s text (endnote 16)[4]). Although Ken Rothman has criticized the use of statistical significance tests, his book surely does not interpret a p-value of 0.14 as an 86% chance that the results were not due to chance.
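
Why the complement of a significance level is not the probability that a result is “not due to chance” can be illustrated with Bayes’ theorem. The sketch below is only an illustration: the prior probability that the null hypothesis is true and the power of the test are assumed numbers, not values taken from the St. John’s Wort or flutamide trials, and the function name is hypothetical. It computes the probability that the null is true given that a test at level alpha has rejected it.

```python
def prob_null_given_rejection(prior_null, alpha, power):
    """P(null true | test rejects), by Bayes' theorem.

    prior_null: assumed prior probability that the null hypothesis is true
    alpha:      probability of rejecting when the null is true
    power:      probability of rejecting when the null is false
    """
    false_positives = prior_null * alpha
    true_positives = (1 - prior_null) * power
    return false_positives / (false_positives + true_positives)

# All scenario values below are assumptions chosen for illustration.
for prior_null in (0.5, 0.9):
    for power in (0.8, 0.2):
        posterior = prob_null_given_rejection(prior_null, 0.05, power)
        print(f"prior(null)={prior_null:.1f}  power={power:.1f}  "
              f"P(null | rejected)={posterior:.3f}")
```

The answer moves with the assumed prior and power, neither of which the significance level supplies. Reading 1 − p as the chance that a finding is real therefore treats a probability of data given the null as if it were a probability of the null given the data.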

Although S-F previously stated that statistical significance is interpreted as the probability that the null is true, she actually goes on to correct the mistake, sort of:

“Requiring the statistical-significance rule for hypothesis-development also is arbitrary in presupposing a nonsensical distinction between a significant finding if p = 0.049, but a nonsignificant finding if p = 0.051.26 Besides, even when one uses a 90-percent (p < or = 0.10), an 85-percent (p < or = 0.15), or some other confidence level, it still may not include the null point. If not, these other p values also show the data are consistent with an effect. Statistical-significance proponents thus forget that both confidence levels and p values are measures of consistency between the data and the null hypothesis, not measures of the probability that the null is true. When results do not satisfy the rule, this means merely that the null cannot be rejected, not that the null is true.”

Tainted at 103.

S-F repeats some criticisms of significance testing, most of which involve their own misunderstandings of the concept.  It hardly suffices to argue that evaluating the magnitude of random error is worthless because it does not measure the extent of bias and confounding.  The flaw lies in those who would interpret the p-value as the sole measure of error involved in a measurement.

S-F takes the criticisms of significance probability to be sufficient to justify an alternative approach: evaluating causal hypotheses “on a preponderance of evidence,47 whether effects are more likely than not.”[5] Her citations, however, do not support the notion that an overall assessment of the causal hypothesis is a true alternative to statistical testing, but rather only a later step in the causal assessment, which presupposes the previous elimination of random variability in the observed associations.

S-F compounds her confusion by claiming that this purported alternative is superior to significance testing or any evaluation of random variability, and by noting that juries in civil cases must decide causal claims on the preponderance of the evidence, not on attained significance probabilities:

“In welfare-affecting areas of science, a preponderance-of-evidence rule often is better than a statistical-significance rule because it could take account of evidence based on underlying mechanisms and theoretical support, even if evidence did not satisfy statistical significance. After all, even in US civil law, juries need not be 95 percent certain of a verdict, but only sure that a verdict is more likely than not. Another reason for requiring the preponderance-of-evidence rule, for welfare-related hypothesis development, is that statistical data often are difficult or expensive to obtain, for example, because of large sample-size requirements. Such difficulties limit statistical-significance applicability. ”

Tainted at 105-06. S-F’s assertion that juries need not have 95% certainty in their verdict is either a misunderstanding or a misrepresentation of the meaning of a confidence interval, and a conflation of two very different kinds of probability or certainty.  S-F invites a reading that commits the transposition fallacy by confusing the probability involved in a confidence interval with that involved in a posterior probability.  S-F’s claim that sample size requirements often limit the ability to use statistical significance evaluations is obviously highly contingent upon the facts of the case, but in civil cases, such as Pritchard, this limitation is rarely at play.  Of course, if the sample size is too small to evaluate the role of chance, then a scientist should probably declare the evidence too fragile to support a causal conclusion.

S-F also postulates that a posterior probability rather than a significance probability approach would “better counteract conflicts of interest that sometimes cause scientists to pay inadequate attention to public-welfare consequences of their work.” Tainted at 106. This claim is a remarkable assertion, which is not supported by any empirical evidence.  The varieties of evidence that go into an overall assessment of a causal hypothesis are often quantitatively incommensurate.  The so-called preponderance-of-the-evidence described by S-F is often little more than a subjective overall assessment of weight of the evidence (WOE).  The approving citations to the work of Carl Cranor support interpreting S-F to endorse this subjective, anything-goes approach to weight of the evidence.  As for WOE eliminating inadequate attention to “public welfare,” S-F’s citations actually suggest the opposite. S-F’s citations to the 1961 reviews by Wynder and by Little illustrate how subjective narrative reviews can be, with diametrically opposed results.  Rather than curbing conflicts of interest, these subjective, narrative reviews illustrate how contrary results may be obtained by the failure to pre-specify criteria of validity, and inclusion and exclusion of admissible evidence. Still, S-F asserts that “up to 80 percent of welfare-related statistical studies have false-negative or type-II errors, failing to reject a false null.” Tainted at 106. The support for this assertion is a citation to a review article by David Resnik. See David Resnik, “Statistics, Ethics, and Research: An Agenda for Education and Reform,” 8 Accountability in Research 163, 183 (2000). Resnik’s paper is a review article, not an empirical study, but at the page cited by S-F, Resnik in turn cites to well-known papers that present actual data:

“There is also evidence that many of the errors and biases in research are related to the misuses of statistics. For example, Williams et al. (1997) found that 80% of articles surveyed that used t-tests contained at least one test with a type II error. Freiman et al. (1978)  * * *  However, empirical research on statistical errors in science is scarce, and more work needs to be done in this area.”

Id. The papers cited by Resnik, Williams (1997)[6] and Freiman (1978)[7] did identify previously published studies that over-interpreted statistically non-significant results, but the identified type-II errors were potential errors, not ascertained errors, because the authors made no claim that every non-statistically significant result actually represented a missed true association. In other words, S-F is not entitled to say that these empirical reviews actually identified failures to reject false null hypotheses. Furthermore, the empirical analyses in the studies cited by Resnik, who was in turn cited by S-F, did not look at correlations between alleged conflicts of interest and statistical errors. The cited research calls for greater attention to proper interpretation of statistical tests, not for their abandonment.

In the end, at least in the chapter on statistics, S-F fails to deliver much if anything on her promise to show how to evaluate science from a philosophic perspective.  Her discussion of the Pritchard case is not an analysis; it is a harangue. There are certainly more readable, accessible, scholarly, and accurate treatments of the scientific and statistical issues addressed in this book.  See, e.g., Michael B. Bracken, Risk, Chance, and Causation: Investigating the Origins and Treatment of Disease (2013).

[1] Not to be confused with the deceased federal judge by the same name, William J. Rea. William J. Rea, 1 Chemical Sensitivity – Principles and Mechanisms (1992); 2 Chemical Sensitivity – Sources of Total Body Load (1994); 3 Chemical Sensitivity – Clinical Manifestation of Pollutant Overload (1996); 4 Chemical Sensitivity – Tools of Diagnosis and Methods of Treatment (1998).

[2] R. C. Shelton, M. B. Keller, et al., “Effectiveness of St. John’s Wort in Major Depression,” 285 Journal of the American Medical Association 1978 (2001).

[3] M. A. Eisenberger, B. A. Blumenstein, et al., “Bilateral Orchiectomy With or Without Flutamide for Metastic [sic] Prostate Cancer,” 339 New England Journal of Medicine 1036 (1998).

[4] Kenneth J. Rothman, Epidemiology 123–127 (NY 2002).

[5] Endnote 47 references the following papers: E. Hammond, “Cause and Effect,” in E. Wynder, ed., The Biologic Effects of Tobacco 193–194 (Boston 1955); E. L. Wynder, “An Appraisal of the Smoking-Lung-Cancer Issue,” 264 New England Journal of Medicine 1235 (1961); see C. Little, “Some Phases of the Problem of Smoking and Lung Cancer,” 264 New England Journal of Medicine 1241 (1961); J. R. Stutzman, C. A. Luongo, and S. A. McLuckey, “Covalent and Non-Covalent Binding in the Ion/Ion Charge Inversion of Peptide Cations with Benzene-Disulfonic Acid Anions,” 47 Journal of Mass Spectrometry 669 (2012). Although the paper on ionic charges of peptide cations is unfamiliar, the other papers do not eschew traditional statistical significance testing techniques. By the time these early (1961) reviews were written, the association that was reported between smoking and lung cancer was clearly accepted as not likely explained by chance.  Discussion focused upon bias and potential confounding in the available studies, and the lack of animal evidence for the causal claim.

[6] J. L. Williams, C. A. Hathaway, K. L. Kloster, and B. H. Layne, “Low power, type II errors, and other statistical problems in recent cardiovascular research,” 42 Am. J. Physiology Heart & Circulation Physiology H487 (1997).

[7] Jennie A. Freiman, Thomas C. Chalmers, Harry Smith and Roy R. Kuebler, “The importance of beta, the type II error and sample size in the design and interpretation of the randomized control trial: survey of 71 ‛negative’ trials,” 299 New Engl. J. Med. 690 (1978).

Contra Parascandola’s Reduction of Specific Causation to Risk

August 22nd, 2014

Mark Parascandola is a photographer who splits his time between Washington DC, and Almeria, Spain.  Before his career in photography, Parascandola studied philosophy (Cambridge), and did graduate work in epidemiology (Johns Hopkins, MPH). In 1997 to 1998, he studied the National Cancer Institute’s role in determining that smoking causes some kinds of cancer.  He went on to serve as a staff epidemiologist at NCI, at its Tobacco Control Research Branch, in the Division of Cancer Control and Population Sciences (DCCPS).

Back in the 1990s, Parascandola wrote an article, which is a snapshot and embellishment of arguments given by Sander Greenland, on the use and alleged abuse of relative risks to derive a “probability of causation.” See Mark Parascandola, “What’s Wrong with the Probability of Causation?” 39 Jurimetrics J. 29 (1998) [cited here as Parascandola]. Parascandola’s article is a locus of arguments that have recurred from time to time, and is worth revisiting.

Parascandola offers an interesting historical factoid, which is a useful reminder to those who suggest that the RR > 2 argument was the brainchild of lawyers:  The argument was first suggested in 1959, by Dr. Victor P. Bond, a physician with expertise in medical physics at the Brookhaven National Laboratory.  See Parascandola at 31 n. 6 (citing Victor P. Bond, The Medical Effects of Radiation (1960), reprinted in NACCA 13th Annual Convention 1959, at 126 (1960)).

Unfortunately, Parascandola is a less reliable reporter when it comes to the judicial use of the relative risk greater than two (RR > 2) argument.  He argues that Judge Jack Weinstein opposed the RR > 2 argument on policy grounds, when in fact, Judge Weinstein rejected the anti-probabilistic argument that probabilistic inference could never establish specific causation, and embraced the RR > 2 argument as a logical policy compromise that would allow evidence of risk to substitute for specific causation in a limited fashion. Parascandola at 33-34 & n.20. Given Judge Weinstein’s many important contributions to tort and procedural law, and the importance of the Agent Orange litigation, it is worth describing Judge Weinstein’s views accurately. See In re Agent Orange Product Liab. Litig., 597 F. Supp. 740, 785, 817, 836 (E.D.N.Y. 1984) (“A government administrative agency may regulate or prohibit the use of toxic substances through rulemaking, despite a very low probability of any causal relationship.  A court, in contrast, must observe the tort law requirement that a plaintiff establish a probability of more than 50% that the defendant’s action injured him. … This means that at least a two-fold increase in incidence of the disease attributable to Agent Orange exposure is required to permit recovery if epidemiological studies alone are relied upon.”), aff’d 818 F.2d 145, 150-51 (2d Cir. 1987)(approving district court’s analysis), cert. denied sub nom. Pinkney v. Dow Chemical Co., 487 U.S. 1234 (1988); see also In re “Agent Orange” Prod. Liab. Litig., 611 F. Supp. 1223, 1240, 1262 (E.D.N.Y. 1985)(excluding plaintiffs’ expert witnesses), aff’d, 818 F.2d 187 (2d Cir. 1987), cert. denied, 487 U.S. 1234 (1988).[1]

Parascandola’s failure to cite and describe Judge Weinstein’s views raises some question of the credibility of his analyses, and his assertion that “[he] will demonstrate that the PC formula is invalid in many situations and cannot fill the role it is given.” Parascandola at 30 (emphasis added).

Parascandola describes the basic arithmetic of probability of causation (PC) in terms of a disease for which we “expect cases” and for which we have “excess cases.” The rate of observed cases in an exposed population divided by the rate of expected cases in an unexposed population provides an estimate of the population relative risk (RR). The excess cases can be obtained simply from the difference between observed cases in the exposed group and the expected cases in the unexposed group.  The attributable fraction is the ratio of excess cases to total cases.

The probability of causation “PC” = 1 – (1/RR).
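
The arithmetic can be sketched in a few lines. The rates below are hypothetical (cases per 1,000 persons, invented for illustration), and the function name is not drawn from Parascandola’s article. The sketch computes the relative risk and the attributable fraction from observed and expected rates, checks that the fraction equals 1 – (1/RR), and shows that the result exceeds one half only when RR is greater than two.

```python
from fractions import Fraction

def attributable_fraction(observed_rate, expected_rate):
    """Return (RR, excess / observed) for the given rates, using exact arithmetic."""
    observed = Fraction(observed_rate)
    expected = Fraction(expected_rate)
    relative_risk = observed / expected
    excess = observed - expected
    fraction = excess / observed
    # The ratio of excess to total cases is algebraically 1 - 1/RR.
    assert fraction == 1 - 1 / relative_risk
    return relative_risk, fraction

# Hypothetical rates per 1,000 persons (assumptions for illustration only).
for observed, expected in [(15, 10), (20, 10), (30, 10)]:
    rr, pc = attributable_fraction(observed, expected)
    print(f"RR = {float(rr):.1f}  PC = {float(pc):.3f}  "
          f"more likely than not: {pc > Fraction(1, 2)}")
```

At RR = 2 the fraction is exactly one half, which is why the argument is framed as requiring a relative risk greater than two, rather than two or more.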

Heterogeneity Yields Uncertainty Argument

The RR describes a group statistic, and an individual’s altered risk will almost certainly not be exactly equal to the group’s average risk. Parascandola notes that sometimes this level of uncertainty can be remedied by risk measurements for subgroups that better fit an individual plaintiff’s characteristics.  All true, but this is hardly an argument against RR > 2.  At best, the heterogeneity argument is an expression of inference skepticism of the sort that led Judge Weinstein to accept RR > 2 as a reasonable compromise. The presence of heterogeneity of this sort simply increases the burden upon plaintiff to provide RR statistics from studies that very tightly resemble plaintiff in terms of exposure and other characteristics.

Urning for Probabilistic Certainty

Parascandola describes how the PC formula arises from a consideration of the “urn model” of disease causation.  Suppose in a group of sufficient size there were expected 200 stomach cancer cases within a certain time, but 300 were observed. We can model the situation with an urn of 300 marbles, 200 of which are green, and 100 are red. Blindfolded or colorblind, we pull a single marble from the urn, and we have only a 1/3 chance of obtaining a red, “excess” marble case. Parascandola at 36-37 (borrowing from David Kaye, “The Limits of the Preponderance of the Evidence Standard: Justifiably Naked Statistical Evidence and Multiple Causation,” 7 Am. Bar Fdtn. Res. J. 487, 501 (1982)).
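
The urn in this example can be checked directly. The sketch below uses the figures from the text (200 expected cases, 300 observed) and adds a simple simulation of blind draws; the number of draws, the random seed, and the variable names are arbitrary choices made for illustration. It rests on the urn model’s own assumption that every case is either a background case or an excess case.

```python
import random
from fractions import Fraction

EXPECTED_CASES = 200  # green marbles: cases expected without the exposure
OBSERVED_CASES = 300  # all marbles in the urn

excess_cases = OBSERVED_CASES - EXPECTED_CASES  # red marbles
exact_chance = Fraction(excess_cases, OBSERVED_CASES)

# The same figure obtained from the relative risk, as 1 - 1/RR.
relative_risk = Fraction(OBSERVED_CASES, EXPECTED_CASES)
pc_from_rr = 1 - 1 / relative_risk

def simulate_draws(n_draws, seed=0):
    """Draw one marble at a time, with replacement; return the share of reds."""
    rng = random.Random(seed)
    urn = ["green"] * EXPECTED_CASES + ["red"] * excess_cases
    reds = sum(rng.choice(urn) == "red" for _ in range(n_draws))
    return reds / n_draws

simulated_chance = simulate_draws(100_000)
print(f"exact chance of a red marble: {exact_chance}")
print(f"1 - 1/RR with RR = {relative_risk}: {pc_from_rr}")
print(f"simulated share of red draws: {simulated_chance:.3f}")
```

With a relative risk of 1.5, the chance of drawing an “excess” case falls short of one half, which is the point of the example.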

Parascandola argues that the urn model is not necessarily correct.  Causation cannot always be reduced to a single cause. Complex etiologic mechanisms and pathways are common.  Interactions between and among causes frequently occur.  Biological phenomena are sometimes “over-determined.” Parascandola asks us to assume that some of the non-excess cases are also “etiologic cases,” which were caused by the exposure but which would have occurred even without the exposure.  Id. at 37. Borrowing from Greenland, Parascandola asserts that “[a]ll excess cases are etiologic cases, but not vice versa.” Id. at 38 & n.37 (quoting from Sander Greenland & James M. Robins, “Conceptual Problems in the Definition and Interpretation of Attributable Fractions,” 128 Am. J. Epidem. 1185, 1185 (1988)).

Parascandola’s argument, if accepted, proves too much to help plaintiffs who hope to establish specific causation with evidence of increased risk. His argument posits a different, more complex model of causation, for which plaintiffs usually have no evidence.  (If they did have such evidence, then they would have nothing to fear in the assumptions of the simplistic urn model; they could rebut those assumptions.) Parascandola’s argument pushes the speculation envelope by asking us to believe that some “non-excess” cases are etiologic cases, but providing no basis for identifying which ones they are.  Unless and until such evidence is forthcoming, Parascandola’s argument is simply uncontrolled multi-leveled conjecture.

Again borrowing from Sander Greenland’s speculation, Parascandola advances a variant of the argument above by suggesting that an exposure may not increase the overall number of excess cases, but that it may accelerate the onset of the harm in question. While it is true that the element of time is important, both in law and in life, the invoked speculation can be, and usually is, tested by time windows or time series analyses in observational epidemiology and clinical trials.  The urn model is “flat” with respect to the temporal dimension, but if plaintiffs want to claim acceleration, then they should adduce Kaplan-Meier curves and the like.  But again, with the modification of the time dimension, plaintiffs will still need hazard ratios or other risk ratios greater than two to make out their case, unless there is a biomarker/fingerprint of individual causation. The introduction of the temporal element is important to an understanding of risk, but Parascandola’s argument does not help transmute evidence of risk in a group to causation in an individual.

Joint Chancy Causation

In line with his other speculative arguments, Parascandola asks:  what if a given cancer in the exposed group is the product of two causes rather than due to one or another of the two causes? Parascandola at 40. This question restates the speculative argument in only slightly different terms.  We could multiply the possible causal sets by suggesting that the observed effect resulted from one or the other or both or none of the causes.  Parascandola calls this “joint chancy causation,” but he manages to show only that the inference of causation from prior chance or risk is a very chancy (or dicey) step in his argument.  Parascandola argues that we should not assume that the urn model is true, when multiple causation models are “plausible and consistent” with other causal theories.

Interestingly, this line of argument would raise the burden upon plaintiffs by requiring them to specify the applicable causal model in ways that (1) they often cannot, and (2) they now, under current law, are not required to do.


In the end, Parascandola realizes that he has raised, not lowered, the burden for plaintiffs.  His counter is to suggest, contrary to law and science, that “the existence of alternative hypotheses should not prevent the plaintiff’s case from proceeding.” Parascandola at 41 n.50.  Because he says so. In other words, Parascandola is telling us that irrespective of how poorly established a hypothesis is, or of how speculative an inference is, or of the existence and strength of alternative hypotheses,

“This trial must be tried.”

W.S. Gilbert, Trial by Jury (1875).

With bias of every kind, no doubt.

That is not science, law, or justice.

[1] An interesting failure or lack of peer review in a legal journal.