For your delectation and delight, desultory dicta on the law of delicts.

Stranger to the Contract and to the World

March 10th, 2018

It Was a Dark and Stormy Night

All around the country, first-year law students are staring at the prospect of their final examination in contracts, one of the required courses in the law school curriculum in the United States. So here is a practice question.

A lawyer for David Dennison drafts a memorandum of agreement between Mr Dennison and Stephanie Clifford. The memorandum calls for Ms Clifford to remain silent about a sexual liaison between Mr Dennison and her, in return for payment of $130,000, in “hush money.”

Mr Dennison never signed the putative contract, and he never provided the consideration for Ms Clifford’s silence. The lawyer for Mr Dennison, however, wired Ms Clifford the money, although he apparently was never given the money by his client, or reimbursed for the payment, later1.

Mr Dennison’s lawyer also represents the President of the United States (POTUS). POTUS may well be Mr Dennison, but he has never acknowledged that Dennison was a name he used. Mr Dennison’s lawyer has publicly acknowledged that he provided the money to Ms Clifford, and that his client, Mr Dennison, or whoever Mr Dennison is, did not reimburse him2.

The putative contract calls for arbitration and penalties. A company, EC, LLC, obtained a temporary restraining order (TRO) from the designated alternative dispute resolution company. EC, LLC v Peterson, ADR Services TRO (Feb. 27, 2018)3. A week later, Stephanie Clifford sued POTUS (a.k.a. David Dennison) for declaratory relief, in California Superior Court, Los Angeles County, after the TRO was entered. Clifford v. Trump, Calif. Super. Ct., Los Angeles Cty. Complaint (Mar. 06, 2018)4.

Prepare a bench memorandum for the trial court judge who has been assigned the declaratory judgment action. Make sure you address all issues of contract formation and enforcement, affirmative defenses such as the statute of frauds5, as well as professional ethics of the lawyers involved. Address the ethical propriety of POTUS’s lawyer’s paying consideration for a hush contract out of his own pocket and then claiming the benefit of the bargain for his client, as well as the legal consequences of his public disclosure on the enforceability of the putative contract. If you come up with a negotiation strategy for the wife of POTUS to vitiate her pre-nuptial agreements with POTUS, you will receive extra credit.

Watch the upcoming issues of the New York Times for the answer to this practice question.

1 Amy Davidson Sorkin, “Does Stormy Daniels Have a Case Against Donald Trump?” New Yorker (Mar. 7, 2018).

2 Debra Cassens Weiss, “Stormy Daniels sues Trump, says confidentiality deal is void because he didn’t sign it,” Am. Bar. Ass’n J. (Mar. 7, 2018).

3 Jim Rutenberg & Peter Baker, “Trump Lawyer Obtained Restraining Order to Silence Stormy Daniels,” N.Y. Times (Mar. 7, 2018).

4 Rebecca R. Ruiz & Matt Stevens, “Stormy Daniels Sues, Saying Trump Never Signed ‘Hush Agreement’,” N.Y. Times (Mar. 6, 2018).

Statistical Deontology

March 2nd, 2018

In courtrooms across America, there has been a lot of buzzing and palavering about the American Statistical Association’s Statement on Statistical Significance Testing,1 but very little discussion of the Association’s Ethical Guidelines, which were updated and promulgated in the same year, 2016. Statisticians and statistics, like lawyers and the law, receive their fair share of calumny over their professional activities, but the statisticians’ principal North American professional organization is trying to do something about members’ transgressions.

The American Statistical Association (ASA) has promulgated ethical guidelines for statisticians, as has the Royal Statistical Society,2 even if these organizations lack the means and procedures to enforce their codes. The ASA’s guidelines3 are rich with implications for statistical analyses put forward in all contexts, including in litigation and regulatory rule making. As such, the guidelines are well worth studying by lawyers.

The ASA Guidelines were prepared by the Committee on Professional Ethics, and approved by the ASA’s Board in April 2016. There are lots of “thou shalts” and “thou shalt nots,” but I will focus on the issues that are more likely to arise in litigation. What is remarkable about the Guidelines is that, if followed, they probably are more likely to eliminate unsound statistical practices in the courtroom than the ASA Statement on P-values.

Defining Good Statistical Practice

“Good statistical practice is fundamentally based on transparent assumptions, reproducible results, and valid interpretations.” Guidelines at 1. The Guidelines thus incorporate something akin to the Kumho Tire standard that an expert witness “employs in the courtroom the same level of intellectual rigor that characterizes the practice of an expert in the relevant field.” Kumho Tire Co. v. Carmichael, 526 U.S. 137, 152 (1999).

A statistician engaged in expert witness testimony should provide “only expert testimony, written work, and oral presentations that he/she would be willing to have peer reviewed.” Guidelines at 2. “The ethical statistician uses methodology and data that are relevant and appropriate, without favoritism or prejudice, and in a manner intended to produce valid, interpretable, and reproducible results.” Id. Similarly, the statistician, if ethical, will identify and mitigate biases, and use analyses “appropriate and valid for the specific question to be addressed, so that results extend beyond the sample to a population relevant to the objectives with minimal error under reasonable assumptions.” Id. If the Guidelines were followed, a lot of spurious analyses would drop off the litigation landscape, regardless of whether they used p-values, confidence intervals, or a Bayesian approach.

Integrity of Data and Methods

The ASA’s Guidelines also have a good deal to say about data integrity and statistical methods. In particular, the Guidelines call for candor about limitations in the statistical methods or the integrity of the underlying data:

“The ethical statistician is candid about any known or suspected limitations, defects, or biases in the data that may impact the integrity or reliability of the statistical analysis. Objective and valid interpretation of the results requires that the underlying analysis recognizes and acknowledges the degree of reliability and integrity of the data.”

Guidelines at 3.

“The statistical analyst openly acknowledges the limits of statistical inference, the potential sources of error, as well as the statistical and substantive assumptions made in the execution and interpretation of any analysis,” including data editing and imputation. Id. The Guidelines urge analysts to address potential confounding not assessed by the study design. Id. at 3, 10. How often do we see these acknowledgments in litigation-driven analyses, or in peer-reviewed papers, for that matter?

Affirmative Actions Prescribed

In aid of promoting data and methodological integrity, the Guidelines also urge analysts to share data when appropriate without revealing the identities of study participants. Statistical analysts should publicly correct any errors in their own disseminated data and analyses, and work to “expose incompetent or corrupt statistical practice.” Of course, the Lawsuit Industry will call this ethical duty “attacking the messenger,” but maybe that’s a rhetorical strategy based upon an assessment of risks versus benefits to the Lawsuit Industry.


The ASA Guidelines address the impropriety of substantive statistical errors, such as:

“[r]unning multiple tests on the same data set at the same stage of an analysis increases the chance of obtaining at least one invalid result. Selecting the one “significant” result from a multiplicity of parallel tests poses a grave risk of an incorrect conclusion. Failure to disclose the full extent of tests and their results in such a case would be highly misleading.”

Guidelines at 9.
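The arithmetic behind the Guidelines’ warning is easy to sketch. The snippet below is only an illustration, on the simplifying assumptions that the parallel tests are independent and that every null hypothesis tested is true; the function name is my own and does not come from the Guidelines.

```python
def familywise_error_rate(alpha: float, k: int) -> float:
    """Chance of at least one false positive among k independent tests,
    each run at significance level alpha, when every null hypothesis is true.
    (Independence is an assumption made for illustration only.)"""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie between 0 and 1")
    if k < 0:
        raise ValueError("k must be non-negative")
    # Probability that all k tests stay quiet is (1 - alpha)^k.
    return 1.0 - (1.0 - alpha) ** k


if __name__ == "__main__":
    for k in (1, 5, 10, 20):
        rate = familywise_error_rate(0.05, k)
        print(f"{k:2d} tests at alpha = 0.05: P(at least one false positive) = {rate:.3f}")
```

On these assumptions, twenty undisclosed comparisons at the conventional 5% level carry a better than even chance of producing at least one “significant” result by chance alone, which is why the failure to disclose the full extent of testing is so misleading.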

There are some Lawsuit Industrialists who have taken comfort in the pronouncements of Kenneth Rothman on corrections for multiple comparisons. Rothman’s views on multiple comparisons are, however, much broader and more nuanced than the Industry’s sound bites.4 Given that Rothman opposes anything like strict statistical significance testing, it follows that he is relatively unmoved by the need for adjustments to alpha or the coefficient of confidence. Rothman, however, has never deprecated the need to consider the multiplicity of testing, and the need for researchers to be forthright in disclosing the scope of comparisons originally planned and actually done.

2 Royal Statistical Society – Code of Conduct (2014); Steven Piantadosi, Clinical Trials: A Methodologic Perspective 609 (2d ed. 2005).

3 Shelley Hurwitz & John S. Gardenier, “Ethical Guidelines for Statistical Practice: The First 60 Years and Beyond,” 66 Am. Statistician 99 (2012) (describing the history and evolution of the Guidelines).

4 Kenneth J. Rothman, “Six Persistent Research Misconceptions,” 29 J. Gen. Intern. Med. 1060, 1063 (2014).

The 5% Solution at the FDA

February 24th, 2018

The statistics wars rage on1, with Bayesians attempting to take advantage of the so-called replication crisis to argue it is all the fault of frequentist significance testing. In 2016, there was an attempted coup at the American Statistical Association, but the Bayesians did not get what they wanted, with little more than a consensus that p-values and confidence intervals should be properly interpreted. Patient advocacy groups have lobbied for the availability of unapproved and incompletely tested medications, and rent-seeking litigation has argued and lobbied for the elimination of statistical tests and methods in the assessment of causal claims. The battle continues.

Against this backdrop, a young Harvard graduate student has published a paper with a brief history of significance testing, and the role that significance testing has taken on at the United States Food and Drug Administration (FDA). Lee Kennedy-Shaffer, “When the Alpha is the Omega: P-Values, ‘Substantial Evidence’, and the 0.05 Standard at FDA,” 72 Food & Drug L.J. 595 (2017) [cited below as K-S]. The paper presents a short but entertaining history of the evolution of the p-value from its early invocation in 1710, by John Arbuthnott, a Scottish physician and mathematician, who calculated the probability that male births would exceed female births for 82 consecutive years if their true proportions were equal. K-S at 603. Kennedy-Shaffer notes the role of the two great French mathematicians, Pierre-Simon Laplace and Siméon-Denis Poisson, who used p-values (or their complements) to evaluate empirical propositions. As Kennedy-Shaffer notes, Poisson observed that the equivalent of what would be a modern p-value of about 0.005 was sufficient, in his view, back in 1830, to believe that the French Revolution of 1830 had caused the pattern of jury verdicts to be changed. K-S at 604.
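Arbuthnott’s calculation, as described above, can be reproduced in a few lines. This is a minimal sketch that treats each year as an independent, fair coin flip between “more boys” and “more girls,” and ignores the possibility of ties; the function name is hypothetical.

```python
from fractions import Fraction


def prob_male_excess_every_year(years: int) -> Fraction:
    """Probability that male births exceed female births in every one of
    `years` consecutive years, if each year is an independent 50/50 event."""
    if years < 0:
        raise ValueError("years must be non-negative")
    return Fraction(1, 2) ** years


if __name__ == "__main__":
    p = prob_male_excess_every_year(82)
    # Exact value is 1 / 2**82, roughly 2.07e-25.
    print(f"P(82 consecutive male-excess years | equal proportions) = {float(p):.3e}")
```

A probability this small, on the hypothesis of equal proportions, is the kind of result that led Arbuthnott to reject chance as the explanation.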

Kennedy-Shaffer traces the p-value, or its equivalent, through its treatment by the great early 20th century statisticians, Karl Pearson and Ronald A. Fisher, through its modification by Jerzy Neyman and Egon Pearson, into the bowels of the FDA in Rockville, Maryland. It is a history well worth recounting, if for no other reason, to remind us that the p-value or its equivalent has been remarkably durable and reasonably effective in protecting the public against false claims of safety and efficacy. Kennedy-Shaffer provides several good examples in which the FDA’s use of significance testing was outcome dispositive of approval or non-approval of medications and devices.

There is enough substance and history here that everyone will have something to pick at in this paper. Let me volunteer the first shot. Kennedy-Shaffer describes the co-evolution of the controlled clinical trial and statistical tests, and points to the landmark study by the Medical Research Council on streptomycin for tuberculosis. Geoffrey Marshall (chairman), “Streptomycin Treatment of Pulmonary Tuberculosis: A Medical Research Council Investigation,” 2 Brit. Med. J. 769, 769–71 (1948). This clinical trial was historically important, not only for its results and for Sir Austin Bradford Hill’s role in its design, but for the care with which it described randomization, double blinding, and multiple study sites. Kennedy-Shaffer suggests that “[w]hile results were presented in detail, few formal statistical tests were incorporated into this analysis.” K-S at 597-98. And yet, a few pages later, he tells us that “both chi-squared tests and t-tests were used to evaluate the responses to the drug and compare the control and treated groups,” and that “[t]he difference in mortality between the two groups is statistically significant.” K-S at 611. Although it is true that the authors did not report their calculated p-values for any test, the difference in mortality between the streptomycin and control groups was very large, and the standards for describing the results of such a clinical trial were in their infancy in 1948.
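To see what “very large” and “statistically significant” mean here, one can run the chi-squared test that the authors left unreported. The sketch below is an illustration only. The six-month mortality counts in it (4 deaths among 55 streptomycin patients, 14 deaths among 52 controls) are the figures commonly reported for the trial; they are not taken from Kennedy-Shaffer’s paper or from the discussion above, and should be treated as an assumption. The function names are mine.

```python
import math


def chi_squared_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-squared statistic (no continuity correction) for the table
        [[a, b],
         [c, d]]"""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


def p_value_one_df(chi2: float) -> float:
    """Upper-tail probability of a chi-squared variable with 1 degree of freedom.
    With 1 df the statistic is a squared standard normal, so
    P(X > x) = erfc(sqrt(x / 2))."""
    return math.erfc(math.sqrt(chi2 / 2.0))


if __name__ == "__main__":
    # ASSUMED counts: (died, survived) at six months in each arm.
    strep_died, strep_survived = 4, 51
    control_died, control_survived = 14, 38
    chi2 = chi_squared_2x2(strep_died, strep_survived, control_died, control_survived)
    print(f"chi-squared = {chi2:.2f}, p = {p_value_one_df(chi2):.4f}")
```

On those assumed counts, the statistic comes out above 7 on one degree of freedom, with a p-value below 0.01. That is consistent with a difference the authors could call significant without belaboring the arithmetic, though it is not a result that makes statistical methods superfluous.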

Kennedy-Shaffer’s characterization of Sir Austin Bradford Hill’s use of statistical tests and methods takes on outsize importance because of the mischaracterizations, and even misrepresentations, made by some representatives of the Lawsuit Industry, who contend that Sir Austin dismissed statistical methods as unnecessary. In the United States, some judges have been seriously misled by those misrepresentations, which have found their way into published judicial decisions.

The operative document, of course, is the publication of Sir Austin’s famous after-dinner speech, in 1965, on the occasion of his election to the Presidency of the Royal Society of Medicine. Although the speech is casual and free of scholarly footnotes, Sir Austin’s message was precise, balanced, and nuanced. The speech is a classic in the history of medicine, which remains important even if rather dated in terms of its primary message about how science and medicine move from beliefs about associations to knowledge of causal associations. As everyone knows, Sir Austin articulated nine factors or viewpoints through which to assess any putative causal association, but he emphasized that before these nine factors are assessed, our starting point itself has prerequisites:

“Disregarding then any such problem in semantics we have this situation. Our observations reveal an association between two variables, perfectly clear-cut and beyond what we would care to attribute to the play of chance. What aspects of that association should we especially consider before deciding that the most likely interpretation of it is causation?”

Austin Bradford Hill, “The Environment and Disease: Association or Causation?” 58 Proc. Royal Soc’y Med. 295, 295 (1965) [cited below as Hill]. The starting point, therefore, before the Bradford Hill nine factors come into play, is a “clear-cut” association, which is “beyond what we would care to attribute to the play of chance.”

In other words, consideration of random error is necessary.

Now for the nuance and the balance. Sir Austin acknowledged that there were some situations in which we simply do not need to calculate standard errors because the disparity between treatment and control groups is so large and meaningful. He goes on to wonder out loud:

“whether the pendulum has not swung too far – not only with the attentive pupils but even with the statisticians themselves. To decline to draw conclusions without standard errors can surely be just as silly? Fortunately I believe we have not yet gone so far as our friends in the USA where, I am told, some editors of journals will return an article because tests of significance have not been applied. Yet there are innumerable situations in which they are totally unnecessary – because the difference is grotesquely obvious, because it is negligible, or because, whether it be formally significant or not, it is too small to be of any practical importance. What is worse the glitter of the t table diverts attention from the inadequacies of the fare.”

Hill at 299. Now this is all true, but hardly the repudiation of statistical testing claimed by those who want to suppress the consideration of random error from science and judicial gatekeeping. There are very few litigation cases in which the difference between exposed and unexposed is “grotesquely obvious,” such that we can leave statistical methods at the door. Importantly, the very large differences between the streptomycin and placebo control groups in the Medical Research Council’s 1948 clinical trial were not so “grotesquely obvious” that statistical methods were obviated. To be fair, the differences were sufficiently great that statistical discussion could be kept to a minimum. Sir Austin gave extensive tables in the 1948 paper to let the reader appreciate the actual data themselves.

In his after-dinner speech, Hill also gives examples of studies that are so biased and confounded that no statistical method will likely ever save them. Certainly, the technology of regression and propensity-score analyses has progressed tremendously since Hill’s 1965 speech, but his point still remains. This point hardly excuses the lack of statistical apparatus in highly confounded or biased observations.

In addressing the nine factors he identified, which presumed a “clear-cut” association, with random error ruled out, Sir Austin did opine that the factors raised questions, and that:

“No formal tests of significance can answer those questions. Such tests can, and should, remind us of the effects that the play of chance can create, and they will instruct us in the likely magnitude of those effects. Beyond that they contribute nothing to the ‘proof’ of our hypothesis.”

Hill at 299. Again, the date and the context are important. Hill is addressing consideration of the nine factors, not the required predicate association beyond the play of chance or random error. The date is important as well, because it would be foolish to suggest that statistical methods have not grown in the last half century to address some of the nine factors. The existence and the nature of dose-response are the subject of extensive statistical methods, and meta-analysis and meta-regression are used to assess and measure consistency between studies.

Kennedy-Shaffer might well have pointed out the great influence Sir Austin’s textbook on medical statistics has had on medical research and practice. This textbook, which went through numerous editions, makes clear the importance of statistical testing and methods:

“Are simple methods of the interpretation of figures only a synonym for common sense or do they involve an art or knowledge which can be imparted? Familiarity with medical statistics leads inevitably to the conclusion that common sense is not enough. Mistakes which when pointed out look extremely foolish are quite frequently made by intelligent persons, and the same mistakes, or types of mistakes, crop up again and again. There is often lacking what has been called a ‘statistical tact, which is rather more than simple good sense’. That tact the majority of persons must acquire (with a minority it is undoubtedly innate) by a study of the basic principles of statistical method.”

Austin Bradford Hill, Principles of Medical Statistics at 2 (4th ed. 1948) (emphasis in original). And later in his text, Sir Austin notes that:

“The statistical method is required in the interpretation of figures which are at the mercy of numerous influences, and its object is to determine whether individual influences can be isolated and their effects measured.”

Id. at 10 (emphasis added).

Sir Austin’s work, taken as a whole, demonstrates the acceptance of the necessity of statistical methods in medicine and in causal inference. Kennedy-Shaffer’s paper covers much ground, but it shortchanges this important line of influence, which lies directly in the historical path between Sir Ronald Fisher and the medical regulatory community.

Kennedy-Shaffer gives a nod to Bayesian methods, and even suggests that Bayesian results are “more intuitive,” but he does not explain the supposed intuitiveness of how a parameter has a probability distribution. This might make sense at the level of quantum physics, but does not seem to describe the reality of a biomedical phenomenon such as relative risk. Kennedy-Shaffer notes the FDA’s expression of willingness to entertain Bayesian analyses of clinical trials, and the rare instances in which such analyses have actually been deployed. K-S at 629 (“e.g., Pravigard Pac for prevention of myocardial infarction”). He concedes, however, that Bayesian designs are still the exception to the rule, and he notes the caution of Robert Temple, a former FDA Director of Medical Policy, in 2005, that Bayesian proposals for drug clinical trials were at that time “very rare.2” K-S at 630.

2 Robert Temple, “How FDA Currently Makes Decisions on Clinical Studies,” 2 Clinical Trials 276, 281 (2005).

Scientific Evidence in Canadian Courts

February 20th, 2018

A couple of years ago, Deborah Mayo called my attention to the Canadian version of the Reference Manual on Scientific Evidence.1 In the course of discussion of mistaken definitions and uses of p-values, confidence intervals, and significance testing, Sander Greenland pointed to some dubious pronouncements in the Science Manual for Canadian Judges [Manual].

Unlike the United States federal court Reference Manual, which is published through a joint effort of the Federal Judicial Center and the National Academies of Sciences, Engineering, and Medicine, the Canadian version is the product of the Canadian National Judicial Institute (NJI, or the Institut National de la Magistrature, if you live in Quebec), which claims to be an independent, not-for-profit group that is committed to educating Canadian judges. In addition to the Manual, the Institute publishes Model Jury Instructions and a guide, Problem Solving in Canada’s Courtrooms: A Guide to Therapeutic Justice (2d ed.), as well as conducting educational courses.

The NJI’s website describes the Institute’s Manual as follows:

“Without the proper tools, the justice system can be vulnerable to unreliable expert scientific evidence.

         * * *

The goal of the Science Manual is to provide judges with tools to better understand expert evidence and to assess the validity of purportedly scientific evidence presented to them. …”

The Chief Justice of Canada, Hon. Beverley M. McLachlin, contributed an introduction to the Manual, which was notable for its frank admission that:

“[w]ithout the proper tools, the justice system is vulnerable to unreliable expert scientific evidence.


Within the increasingly science-rich culture of the courtroom, the judiciary needs to discern ‘good’ science from ‘bad’ science, in order to assess expert evidence effectively and establish a proper threshold for admissibility. Judicial education in science, the scientific method, and technology is essential to ensure that judges are capable of dealing with scientific evidence, and to counterbalance the discomfort of jurists confronted with this specific subject matter.”

Manual at 14. These are laudable goals, indeed, but did the National Judicial Institute live up to its stated goals, or did it leave Canadian judges vulnerable to the Institute’s own “bad science”?

In his comments on Deborah Mayo’s blog, Greenland noted some rather cavalier statements in Chapter two that suggest that the conventional alpha of 5% corresponds to a “scientific attitude that unless we are 95% sure the null hypothesis is false, we provisionally accept it.” And he pointed to another passage, in which the chapter seems to suggest that the coefficient of confidence that corresponds to an alpha of 5% “constitutes a rather high standard of proof,” thus confusing and conflating probability of random error with posterior probabilities. Greenland is absolutely correct that the Manual does a rather miserable job of educating Canadian judges if our standard for its work product is accuracy and truth.

Some of the most egregious errors are within what is perhaps the most important chapter of the Manual, Chapter 2, “Science and the Scientific Method.” The chapter has two authors, a scientist, Scott Findlay, and a lawyer, Nathalie Chalifour. Findlay is an Associate Professor, in the Department of Biology, of the University of Ottawa. Nathalie Chalifour is an Associate Professor on the Faculty of Law, also in the University of Ottawa. Together, they produced some dubious pronouncements, such as:

Weight of the Evidence (WOE)

“First, the concept of weight of evidence in science is similar in many respects to its legal counterpart. In both settings, the outcome of a weight-of-evidence assessment by the trier of fact is a binary decision.”

Manual at 40. Findlay and Chalifour cite no support for their characterization of WOE in science. Most attempts to invoke WOE are woefully vague and amorphous, with no meaningful guidance or content.2 Sixty-five pages later, if anyone is noticing, the authors let us in on a dirty little secret:

“at present, there exists no established prescriptive methodology for weight of evidence assessment in science.”

Manual at 105. The authors omit, however, that there are prescriptive methods for inferring causation in science; you just will not see them in discussions of weight of the evidence. The authors then compound the semantic and conceptual problems by stating that “in a civil proceeding, if the evidence adduced by the plaintiff is weightier than that brought forth by the defendant, a judge is obliged to find in favour of the plaintiff.” Manual at 41. This is a remarkable suggestion, which implies that if the plaintiff adduces the crummiest crumb of evidence, a mere peppercorn on the scales of justice, but the defendant has none to offer, the plaintiff must win. The plaintiff wins notwithstanding that no reasonable person could believe that the plaintiff’s claims are more likely than not true. Even if this were the law of Canada, it is certainly not how scientists think about establishing the truth of empirical propositions.

Confusion of Hypothesis Testing with “Beyond a Reasonable Doubt”

The authors’ next assault comes in conflating significance probability with the probability connected with the burden of proof, a posterior probability. Legal proceedings have a defined burden of proof, with criminal cases requiring the state to prove guilt “beyond a reasonable doubt.” Findlay and Chalifour’s discussion then runs off the rails by likening hypothesis testing, with an alpha of 5% or its complement, 95%, as a coefficient of confidence, to a “very high” burden of proof:

“In statistical hypothesis-testing – one of the tools commonly employed by scientists – the predisposition is that there is a particular hypothesis (the null hypothesis) that is assumed to be true unless sufficient evidence is adduced to overturn it. But in statistical hypothesis-testing, the standard of proof has traditionally been set very high such that, in general, scientists will only (provisionally) reject the null hypothesis if they are at least 95% sure it is false. Third, in both scientific and legal proceedings, the setting of the predisposition and the associated standard of proof are purely normative decisions, based ultimately on the perceived consequences of an error in inference.”

Manual at 41. This is, as Greenland and many others have pointed out, a totally bogus conception of hypothesis testing, and an utterly false description of the probabilities involved.

Later in the chapter, Findlay and Chalifour flirt with the truth, but then lapse into an unrecognizable parody of it:

“Inferential statistics adopt the frequentist view of probability whereby a proposition is either true or false, and the task at hand is to estimate the probability of getting results as discrepant or more discrepant than those observed, given the null hypothesis. Thus, in statistical hypothesis testing, the usual inferred conclusion is either that the null is true (or rather, that we have insufficient evidence to reject it) or it is false (in which case we reject it). The decision to reject or not is based on the value of p: if the estimated value of p is below some threshold value α, we reject the null; otherwise we accept it.”

Manual at 74. OK; so far so good, but here comes the train wreck:

“By convention (and by convention only), scientists tend to set α = 0.05; this corresponds to the collective – and, one assumes, consensual – scientific attitude that unless we are 95% sure the null hypothesis is false, we provisionally accept it. It is partly because of this that scientists have the reputation of being a notoriously conservative lot, given that a 95% threshold constitutes a rather high standard of proof.”

Manual at 75. Uggh; so we are back to significance probability’s being a posterior probability. As if to atone for their sins, in the very next paragraph, the authors then remind the judicial readers that:

“As noted above, p is the probability of obtaining results at least as discrepant as those observed if the null is true. This is not the same as the probability of the null hypothesis being true, given the results.”

Manual at 75. True, true, and completely at odds with what the authors have stated previously. And to add to the reader’s now fully justified confusion, the authors describe the standard for rejecting the null hypothesis as “very high indeed.” Manual at 102, 109. Any reader who is following the discussion might wonder how and why there is such a problem of replication and reproducibility in contemporary science.
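The gap between a 5% significance level and being “95% sure the null hypothesis is false” can be shown with simple arithmetic. The sketch below grants, purely for the sake of illustration, that a prior probability can be assigned to the null hypothesis, and it treats “the test rejected at level alpha” as the only evidence. The function name and the input values are hypothetical, chosen only to show how the answer moves.

```python
def prob_null_given_rejection(alpha: float, power: float, prior_null: float) -> float:
    """P(null is true | the test rejected the null), by Bayes' theorem.

    alpha      -- probability of rejecting when the null is true
    power      -- probability of rejecting when the null is false
    prior_null -- ASSUMED prior probability that the null is true
    """
    for name, value in (("alpha", alpha), ("power", power), ("prior_null", prior_null)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie between 0 and 1")
    false_alarms = alpha * prior_null        # null true, test rejects anyway
    true_hits = power * (1.0 - prior_null)   # null false, test rejects
    if false_alarms + true_hits == 0.0:
        raise ValueError("rejection has zero probability under these inputs")
    return false_alarms / (false_alarms + true_hits)


if __name__ == "__main__":
    # Illustrative inputs only; none of these values comes from the Manual.
    for prior_null, power in [(0.5, 0.8), (0.9, 0.8), (0.9, 0.2)]:
        post = prob_null_given_rejection(0.05, power, prior_null)
        print(f"prior P(null) = {prior_null}, power = {power}: "
              f"P(null | rejected) = {post:.2f}")
```

With the same alpha of 0.05, the probability that the null is true after a “significant” result ranges from about 6% to about 69% across these inputs. The significance level alone fixes none of these numbers, which is why equating it with a standard of proof is a mistake, and why a literature full of low-powered tests of unlikely hypotheses can be full of “significant” results that do not replicate.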

Conflating Bayesianism with Frequentist Modes of Inference

We have seen how Findlay and Chalifour conflate significance and posterior probabilities, some of the time. In a section of their chapter that deals explicitly with probability, the authors tell us that before any study is conducted the prior probability of the truth of the tested hypothesis is 50%, sans evidence. This is an astonishing creation of certainty out of nothingness, and perhaps it explains the authors’ implied claim that the crummiest morsel of evidence on one side is sufficient to compel a verdict, if the other side has no morsels at all. Here is how the authors put their claim to the Canadian judges:

“Before each study is conducted (that is, a priori), the hypothesis is as likely to be true as it is to be false. Once the results are in, we can ask: How likely is it now that the hypothesis is true? In the first study, the low a priori inferential strength of the study design means that this probability will not be much different from the a priori value of 0.5 because any result will be rather equivocal owing to limitations in the experimental design.”

Manual at 64. This implied Bayesian slant, with 50% priors, in the world of science would lead anyone to believe “as many as six impossible things before breakfast,” and many more throughout the day.

Lest you think that the Manual is all rubbish, there are occasional gems of advice to the Canadian judges. The authors admonish the judges to

“be wary of individual ‘statistically significant’ results that are mined from comparatively large numbers of trials or experiments, as the results may be ‘cherry picked’ from a larger set of experiments or studies that yielded mostly negative results. The court might ask the expert how many other trials or experiments testing the same hypothesis he or she is aware of, and to describe the outcome of those studies.”

Manual at 87. Good advice, but at odds with the authors’ characterization of statistical significance as establishing the rejection of the null hypothesis well-nigh beyond a reasonable doubt.

When Greenland first called attention to this Manual, I reached out to some people who had been involved in its peer review. One reviewer told me that it was a “living document,” and would likely be revised after he had the chance to call the NJI’s attention to the errors. But two years later, the errors remain, and so we have to infer that the authors meant to say all the contradictory and false statements that are still present in the downloadable version of the Manual.

2 See “WOE-fully Inadequate Methodology – An Ipse Dixit By Another Name” (May 1, 2012); “Weight of the Evidence in Science and in Law” (July 29, 2017); see also David E. Bernstein, “The Misbegotten Judicial Resistance to the Daubert Revolution,” 89 Notre Dame L. Rev. 27 (2013).

The opinions, statements, and asseverations expressed on Tortini are my own, or those of invited guests, and these writings do not necessarily represent the views of clients, friends, or family, even when supported by good and sufficient reason.