For your delectation and delight, desultory dicta on the law of delicts.

American Statistical Association – Consensus versus Personal Opinion

December 13th, 2019

Lawyers and judges pay close attention to standards, guidances, and consensus statements from respected and recognized professional organizations. Deviations from these standards may be presumptive evidence of malpractice or malfeasance in civil and criminal litigation, in regulatory matters, and in other contexts. One important, recurring situation arises when trial judges must act as gatekeepers of the admissibility of expert witness opinion testimony. In making this crucial judicial determination, judges will want to know whether a challenged expert witness has deviated from an accepted professional standard of care or practice.

In 2016, the American Statistical Association (ASA) published a consensus statement on p-values. The ASA statement grew out of a lengthy process that involved assembling experts of diverse viewpoints. In October 2015, the ASA convened a two-day meeting for 20 experts to meet and discuss areas of core agreement. Over the following three months, the participating experts and the ASA Board members continued their discussions, which led to the ASA Executive Committee’s approval of the statement that was published in March 2016.[1]

The ASA 2016 Statement spelled out six relatively uncontroversial principles of basic statistical practice.[2] Far from rejecting statistical significance, the six principles embraced statistical tests as an important but insufficient basis for scientific conclusions:

“3. Scientific conclusions and business or policy decisions should not be based only on whether a p-value passes a specific threshold.”

Despite the fairly clear and careful statement of principles, legal actors did not take long to misrepresent the ASA principles.[3] What had been a prescription about the insufficiency of p-value thresholds was distorted into strident assertions that statistical significance was unnecessary for scientific conclusions.

Three years after the ASA published its p-value consensus document, ASA Executive Director Ronald Wasserstein and two other statisticians published an editorial in a supplemental issue of The American Statistician, in which they called for the abandonment of significance testing.[4] Although Wasserstein’s editorial was clearly labeled as such, his essay introduced the special journal issue, and it appeared without disclaimer over his name and his official status as the ASA Executive Director.

Sowing further confusion, the editorial made the following pronouncement:[5]

“The [2016] ASA Statement on P-Values and Statistical Significance stopped just short of recommending that declarations of ‘statistical significance’ be abandoned. We take that step here. We conclude, based on our review of the articles in this special issue and the broader literature, that it is time to stop using the term “statistically significant” entirely. Nor should variants such as ‘significantly different’, ‘p < 0.05’, and ‘nonsignificant’ survive, whether expressed in words, by asterisks in a table, or in some other way.”

The ASA is a collective body, and its ASA Statement 2016 was a statement from that body, which spoke after lengthy deliberation and debate. The language, quoted above, moves within one paragraph, from the ASA Statement to the royal “We,” who are taking the step of abandoning the term “statistically significant.” Given the unqualified use of the collective first person pronoun in the same paragraph that refers to the ASA, combined with Ronald Wasserstein’s official capacity, and the complete absence of a disclaimer that this pronouncement was simply a personal opinion, a reasonable reader could hardly avoid concluding that this pronouncement reflected ASA policy.

Your humble blogger, and others, read Wasserstein’s 2019 editorial as an ASA statement.[6] Although it is true that the 2019 paper is labeled “editorial,” and that the editorial does not describe a consensus process, there is no disclaimer such as is customary when someone in an official capacity publishes a personal opinion. Indeed, rather than the usual disclaimer, the Wasserstein editorial thanks the ASA Board of Directors “for generously and enthusiastically supporting the ‘p-values project’ since its inception in 2014.” This acknowledgement strongly suggests that the editorial is itself part of the “p-values project,” which is “enthusiastically” supported by the ASA Board of Directors.

If the editorial were not itself confusing enough, an unsigned email from “ASA <>” was sent out in July 2019, in which the anonymous ASA author(s) takes credit for changing statistical guidelines at the New England Journal of Medicine:[7]

From: ASA <>
Date: Thu, Jul 18, 2019 at 1:38 PM
Subject: Major Medical Journal Updates Statistical Policy in Response to ASA Statement
To: <XXXX>

The email is itself an ambiguous piece of evidence as to what the ASA is claiming. The email says that the New England Journal of Medicine changed its guidelines “in response to the ASA Statement on P-values and Statistical Significance and the subsequent The American Statistician special issue on statistical inference.” Of course, the “special issue” was not just Wasserstein’s editorial, but also the 42 other papers. So this claim leaves open to doubt exactly what in the 2019 special issue the NEJM editors were responding to. Given that the 42 articles that followed Wasserstein’s editorial did not all agree with Wasserstein’s “steps taken,” or with each other, the only landmark in the special issue was the editorial over the name of the ASA’s Executive Director.

Moreover, a reading of the NEJM revised guidelines does not suggest that the journal’s editors were unduly influenced by the Wasserstein editorial or the 42 accompanying papers. The journal mostly responded to the ASA 2016 consensus paper by putting some teeth into its Principle 4, which dealt with multiplicity concerns in submitted manuscripts. The newly adopted (2019) NEJM author guidelines do not step out with Wasserstein and colleagues; there is no general prohibition on p-values or statements of “statistical significance.”

The confusion propagated by the Wasserstein 2019 editorial has not escaped the attention of other ASA officials. An editorial in the June 2019 issue of AmStat News, by ASA President Karen Kafadar, noted the prevalent confusion and uneasiness over the 2019 The American Statistician special issue, the lack of consensus, and the need for healthy debate.[8]

In this month’s issue of AmStat News, President Kafadar returned to the issue of the confusion over the 2019 ASA special issue of The American Statistician, in her “President’s Corner.” Because Executive Director Wasserstein’s editorial language about “we now take this step” is almost certain to find its way into opportunistic legal briefs, Kafadar’s comments are worth noting in some detail:[9]

“One final challenge, which I hope to address in my final month as ASA president, concerns issues of significance, multiplicity, and reproducibility. In 2016, the ASA published a statement that simply reiterated what p-values are and are not. It did not recommend specific approaches, other than ‘good statistical practice … principles of good study design and conduct, a variety of numerical and graphical summaries of data, understanding of the phenomenon under study, interpretation of results in context, complete reporting and proper logical and quantitative understanding of what data summaries mean’.

The guest editors of the March 2019 supplement to The American Statistician went further, writing: ‘The ASA Statement on P-Values and Statistical Significance stopped just short of recommending that declarations of “statistical significance” be abandoned. We take that step here. … [I]t is time to stop using the term “statistically significant” entirely’.

Many of you have written of instances in which authors and journal editors – and even some ASA members – have mistakenly assumed this editorial represented ASA policy. The mistake is understandable: The editorial was coauthored by an official of the ASA. In fact, the ASA does not endorse any article, by any author, in any journal – even an article written by a member of its own staff in a journal the ASA publishes.”

Kafadar’s caveat should quash incorrect assertions about the ASA’s position on statistical significance testing. It is a safe bet, however, that such assertions will appear in trial and appellate briefs.

Statistical reasoning is difficult enough for most people, but the hermeneutics of American Statistical Association publications on statistical significance may require a doctorate of divinity. In a cleverly titled post, Professor Deborah Mayo argues that there is no other way to interpret the Wasserstein 2019 editorial except as laying down an ASA prescription. Deborah G. Mayo, “Les stats, c’est moi,” Error Statistics Philosophy (Dec. 13, 2019). I accept President Kafadar’s correction at face value, and accept that I, like many other readers, misinterpreted the Wasserstein editorial as having the imprimatur of the ASA. Mayo points out, however, that Kafadar’s correction in a newsletter may be insufficient at this point, and that a stronger disclaimer is required. Officers of the ASA are certainly entitled to their opinions and the opportunity to present them, but disclaimers would bring clarity and transparency to published work of these officials.

Wasserstein’s 2019 editorial goes further to make a claim about how his “step” will ameliorate the replication crisis:

“In this world, where studies with ‘p < 0.05’ and studies with ‘p > 0.05’ are not automatically in conflict, researchers will see their results more easily replicated – and, even when not, they will better understand why.”

The editorial here seems to be attempting to define replication failure out of existence. This claim, as stated, is problematic. A sophisticated practitioner may think of the situation in which two studies, one with p = 0.048, and another with p = 0.052, might be said not to be in conflict. In real world litigation, however, advocates will take Wasserstein’s statement about studies not in conflict (despite p-values above and below a threshold, say 5%) to the extremes. We can anticipate claims that two similar studies with p-values above and below 5%, say with one p-value at 0.04, and the other at 0.40, will be described as not in conflict, with the second a replication of the first test. It is hard to see how this possible interpretation of Wasserstein’s editorial, although consistent with its language, will advance sound, replicable science.[10]
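A rough sketch may help show why the two pairs of p-values are not comparable. The code below assumes, purely for illustration, that each p-value came from a two-sided z-test, and converts each back to the absolute z-score that would have produced it. The function names are hypothetical, and the gap between z-scores is not a formal test of heterogeneity; a proper comparison would look at the estimates and their standard errors.

```python
from statistics import NormalDist


def two_sided_p_to_z(p):
    """Absolute z-score that yields two-sided p-value p (assumes a z-test)."""
    return NormalDist().inv_cdf(1.0 - p / 2.0)


def z_gap(p_a, p_b):
    """Absolute difference between the z-scores implied by two p-values."""
    return abs(two_sided_p_to_z(p_a) - two_sided_p_to_z(p_b))


if __name__ == "__main__":
    # The near-threshold pair versus the pair an advocate might try to equate.
    for pair in [(0.048, 0.052), (0.04, 0.40)]:
        print(pair, round(z_gap(*pair), 3))
```

On these assumptions, the pair straddling 5% by a hair differs trivially in its implied test statistics, whereas the 0.04 and 0.40 pair differs by more than a full standard error.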

[1] Ronald L. Wasserstein & Nicole A. Lazar, “The ASA’s Statement on p-Values: Context, Process, and Purpose,” 70 The Am. Statistician 129 (2016).

[2] “The American Statistical Association’s Statement on and of Significance” (Mar. 17, 2016).

[3] See, e.g., “The Education of Judge Rufe – The Zoloft MDL” (April 9, 2016) (Zoloft litigation); “The ASA’s Statement on Statistical Significance – Buzzing from the Huckabees” (Mar. 19, 2016); “The American Statistical Association Statement on Significance Testing Goes to Court – Part I” (Nov. 13, 2018).

[4] Ronald L. Wasserstein, Allen L. Schirm, and Nicole A. Lazar, “Editorial: Moving to a World Beyond ‘p < 0.05’,” 73 Am. Statistician S1, S2 (2019).

[5] Id. at S2.

[6] See “Has the American Statistical Association Gone Post-Modern?” (Mar. 24, 2019); Deborah G. Mayo, “The 2019 ASA Guide to P-values and Statistical Significance: Don’t Say What You Don’t Mean,” Error Statistics Philosophy (June 17, 2019); B. Haig, “The ASA’s 2019 update on P-values and significance,” Error Statistics Philosophy (July 12, 2019).

[7] See “Statistical Significance at the New England Journal of Medicine” (July 19, 2019); see also Deborah G. Mayo, “The NEJM Issues New Guidelines on Statistical Reporting: Is the ASA P-Value Project Backfiring?” Error Statistics Philosophy (July 19, 2019).

[8] See Karen Kafadar, “Statistics & Unintended Consequences,” AmStat News 3, 4 (June 2019).

[9] Karen Kafadar, “The Year in Review … And More to Come,” AmStat News 3 (Dec. 2019).

[10]  See also Deborah G. Mayo, “P‐value thresholds: Forfeit at your peril,” 49 Eur. J. Clin. Invest. e13170 (2019).


Is the IARC Lost in the Weeds?

November 30th, 2019

A couple of years ago, I met David Zaruk at a Society for Risk Analysis meeting, where we were both presenting. I was aware of David’s blogging and investigative journalism, but meeting him gave me a greater appreciation for the breadth and depth of his work. For those of you who do not know David, he is present in cyberspace as the Risk-Monger who blogs about risk and science communications issues. His blog has featured cutting-edge exposés about the distortions in risk communications perpetuated by the advocacy of non-governmental organizations (NGOs). Previously, I have recorded my objections to the intellectual arrogance of some such organizations that purport to speak on behalf of the public interest, when often they act in cahoots with the lawsuit industry in the manufacturing of tort and environmental litigation.

David’s writing on the lobbying and control of NGOs by plaintiffs’ lawyers from the United States should be required reading for everyone who wants to understand how litigation sausage is made. His series, “SlimeGate,” details the interplay among NGO lobbying, lawsuit industry maneuvering, and carcinogen determinations at the International Agency for Research on Cancer (IARC). The IARC, a branch of the World Health Organization, is headquartered in Lyon, France. The IARC convenes “working groups” to review the scientific studies of the carcinogenicity of various substances and processes. The IARC working groups produce “monographs” of their reviews, and the IARC publishes these monographs, in print and on-line. The United States is in the top tier of participating countries for funding the IARC.

The IARC was founded in 1965, when observational epidemiology was still very much an emerging science, with expertise concentrated in only a few countries. For its first few decades, the IARC enjoyed a good reputation, and its monographs were considered definitive reviews, especially under its first director, Dr. John Higginson, from 1966 to 1981.[1] By the end of the 20th century, the need for the IARC and its reviews had waned, as the methods of systematic review and meta-analyses had evolved significantly, and had become more widely standardized and practiced.

Understandably, the IARC has been concerned that the members of its working groups should be viewed as disinterested scientists. Unfortunately, this concern has been translated into an asymmetrical standard that excludes anyone with a hint of manufacturing connection, but keeps the door open for those scientists with deep lawsuit industry connections. Speaking on behalf of the plaintiffs’ bar, Michael Papantonio, a plaintiffs’ lawyer who founded Mass Torts Made Perfect, noted that “We [the lawsuit industry] operate just like any other industry.”[2]

David Zaruk has shown how this asymmetry has been exploited mercilessly by the lawsuit industry and its agents in connection with the IARC’s review of glyphosate.[3] The resulting IARC classification of glyphosate has led to a litigation firestorm and an all-out assault on agricultural sustainability and productivity.[4]

The anomaly of the IARC’s glyphosate classification has been noted by scientists as well. Dr. Geoffrey Kabat is a cancer epidemiologist, who has written perceptively on the misunderstandings and distortions of cancer risk assessments in various settings.[5] He has previously written about glyphosate in Forbes and elsewhere, but recently he has written an important essay on glyphosate in Issues in Science and Technology, which is published by the National Academies of Sciences, Engineering, and Medicine and Arizona State University. In his essay, Dr. Kabat details how the IARC’s evaluation of glyphosate is an outlier in the scientific and regulatory world, and is not well supported by the available evidence.[6]

The problems with the IARC are both substantive and procedural.[7] One of the key problems that face IARC evaluations is an incoherent classification scheme. IARC evaluations classify putative human carcinogenic risks into five categories: Group 1 (known), Group 2A (probably), Group 2B (possibly), Group 3 (unclassifiable), and Group 4 (probably not). Group 4 is virtually an empty set with only one substance, caprolactam ((CH2)5C(O)NH), an organic compound used in the manufacture of nylon.

In the IARC evaluation at issue, glyphosate was placed into Group 2A, which would seem to satisfy the legal system’s requirement that an exposure more likely than not causes the harm in question. Appearances and word usage, however, can be deceiving. Probability is a continuous scale from zero to one. In Bayesian decision making, zero and one are unavailable because if either were our starting point, no amount of evidence could ever change our judgment of the probability of causation (Cromwell’s Rule). The IARC informs us that its use of “probably” is quite idiosyncratic; the probability that a Group 2A agent causes cancer has “no quantitative” meaning. All the IARC intends is that a Group 2A classification “signifies a greater strength of evidence than possibly carcinogenic.”[8]

In other words, Group 2A classifications are consistent with having posterior probabilities of less than 0.5 (or 50 percent). A working group could judge the probability of a substance or a process to be carcinogenic to humans to be greater than zero, but no more than five or ten percent, and still vote for a 2A classification, in keeping with the IARC Preamble. This low probability threshold for a 2A classification converts the judgment of “probably carcinogenic” into a precautionary prescription, rendered when the most probable assessment is either ignorance or lack of causality. There is thus a practical certainty, close to 100%, that a 2A classification will confuse judges and juries, as well as the scientific community.
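Both points, Cromwell’s Rule and the gap between a small probability and the legal more-likely-than-not standard, can be sketched with Bayes’ rule. This is a minimal illustration, not anything the IARC computes; the function name, the priors, and the likelihood ratios are all hypothetical numbers chosen for the example.

```python
def posterior(prior, likelihood_ratio):
    """Bayes' rule: update a prior probability of causation.

    likelihood_ratio = P(evidence | causal) / P(evidence | not causal).
    """
    numerator = prior * likelihood_ratio
    return numerator / (numerator + (1.0 - prior))


if __name__ == "__main__":
    # Cromwell's Rule: priors of exactly 0 or 1 never move, whatever the evidence.
    print(posterior(0.0, 1000.0))   # stays at 0.0
    print(posterior(1.0, 0.001))    # stays at 1.0
    # A hypothetical 5% prior with moderately supportive evidence (LR = 4)
    # still ends up well short of the 0.5 more-likely-than-not threshold.
    print(round(posterior(0.05, 4.0), 3))
```

A working group member holding such a posterior could, on the IARC’s own account of “probably,” still vote for a 2A classification.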

In IARC-speak, a 2A “probability” connotes “sufficient evidence” in experimental animals, and “limited evidence” in humans. A substance can receive a 2A classification even when the sufficient evidence of carcinogenicity occurs in one non-human animal species, even though other animal species fail to show carcinogenicity. A 2A classification can raise the thorny question in court whether a claimant is more like a rat or a mouse.

Similarly, “limited evidence” in humans can be based upon inconsistent observational studies that fail to measure and adjust for known and potential confounding risk factors and systematic biases. The 2A classification requires little substantively or semantically, and many 2A classifications leave juries and judges to determine whether a chemical or medication caused a human being’s cancer, when the basic predicates for Sir Austin Bradford Hill’s factors for causal judgment have not been met.[9]

In courtrooms, IARC 2A classifications should be excluded as legally irrelevant, under Rule 403. Even if a 2A IARC classification were a credible judgment of causation, the probative value of the classification would be “substantially outweighed by a danger of … unfair prejudice, confusing the issues, [and] misleading the jury….”[10]

The IARC may be lost in the weeds, but there is no need to fret. A little Round Up™ will help.

[1]  See John Higginson, “The International Agency for Research on Cancer: A Brief Review of Its History, Mission, and Program,” 43 Toxicological Sci. 79 (1998).

[2]  Sara Randazzo & Jacob Bunge, “Inside the Mass-Tort Machine That Powers Thousands of Roundup Lawsuits,” Wall St. J. (Nov. 25, 2019).

[3]  David Zaruk, “The Corruption of IARC,” Risk Monger (Aug. 24, 2019); David Zaruk, “Greed, Lies and Glyphosate: The Portier Papers,” Risk Monger (Oct. 13, 2017).

[4]  Ted Williams, “Roundup Hysteria,” Slate Magazine (Oct. 14, 2019).

[5]  See, e.g., Geoffrey Kabat, Hyping Health Risks: Environmental Hazards in Everyday Life and the Science of Epidemiology (2008); Geoffrey Kabat, Getting Risk Right: Understanding the Science of Elusive Health Risks (2016).

[6]  Geoffrey Kabat, “Who’s Afraid of Roundup?” 36 Issues in Science and Technology (Fall 2019).

[7]  See Schachtman, “Infante-lizing the IARC” (May 13, 2018); “The IARC Process is Broken” (May 4, 2016). See also Eric Lasker and John Kalas, “Engaging with International Carcinogen Evaluations,” Law360 (Nov. 14, 2019).

[8]  “IARC Preamble to the IARC Monographs on the Identification of Carcinogenic Hazards to Humans,” at Sec. B.5., p.31 (Jan. 2019); see also “IARC Advisory Group Report on Preamble” (Sept. 2019).

[9]  See Austin Bradford Hill, “The Environment and Disease: Association or Causation?” 58 Proc. Royal Soc’y Med. 295 (1965) (noting that only when “[o]ur observations reveal an association between two variables, perfectly clear-cut and beyond what we would care to attribute to the play of chance,” do we move on to consider the nine articulated factors for determining whether an association is causal).

[10]  Fed. R. Evid. 403.


Does the California State Bar Discriminate Unlawfully?

November 24th, 2019

Earlier this month, various news outlets announced a finding in a California study that black male attorneys are three times more likely to be disciplined by the State Bar than their white male counterparts.[1] Some of the news accounts treated the study findings as conclusions that the Bar had engaged in race discrimination. One particularly irresponsible website proclaimed that “bar discipline is totally racist.”[2] Indeed, the California State Bar itself apparently plans to hire consulting experts to help it achieve “bias-free decision-making and processes,” to eliminate “unintended bias,” and to consider how, if at all, to weigh prior complaints in the disciplinary procedure.[3]

The California Bar’s report was prepared by a social scientist, George Farkas, of the School of Education at University of California, Irvine. Based upon data from attorneys admitted to the California bar between 1990 and 2008, Professor Farkas reported crude prevalence rates of discipline, probation, disbarment, or resignation, by race.[4] The disbarment/resignation rate for black male lawyers was 3.9%, whereas the rate for white male lawyers was 1%. Disparities, however, are not unlawful discriminations.

The disbarment/resignation rate for black female lawyers was 0.9%, but no one has suggested that there is implicit bias in favor of black women over both black and white male lawyers. White women were twice as likely as Asian women to resign, or be placed on probation or be disbarred (0.4% versus 0.2%).

The ABA’s coverage sheepishly admitted that “[d]ifferences could be explained by the number of complaints received about an attorney, the number of investigations opened, the percentage of investigations in which a lawyer was not represented by counsel, and previous discipline history.”[5]

Farkas’s report of October 31, 2019, was transmitted to the Bar’s Board of Trustees on November 14th.[6] As anyone familiar with discrimination law would have expected, Professor Farkas conducted multiple regression analyses that adjusted for the number of previous complaints filed against the errant lawyer, and whether the lawyer was represented by counsel before the Bar. The full analyses showed that these other important variables, and not race, did in fact explain (and not merely could explain) the variability in discipline rates:

“Statistically, these variables explained all of the differences in probation and disbarment rates by race/ethnicity. Among all variables included in the final analysis, prior discipline history was found to have the strongest effects [sic] on discipline outcomes, followed by the proportion of investigations in which the attorney under investigation was represented by counsel, and the number of investigations.”[7]

The number of previous complaints against a particular lawyer surely has a role in considering whether a miscreant lawyer should be placed on probation, or subjected to disbarment. And without further refinement of the analysis, and irrespective of race or ethnicity, failure to retain counsel for disciplinary hearings may correlate strongly with futility of any defense.
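How a crude disparity can vanish upon adjustment is easy to sketch. The counts below are invented for illustration only; they are not the State Bar’s data, and the groups “A” and “B” are hypothetical. The Farkas report used multiple regression; the sketch swaps in a simpler stratified (Mantel-Haenszel) rate ratio, which makes the same point with arithmetic that can be checked by hand.

```python
# Hypothetical counts, NOT the California State Bar data.
# Each stratum: (disciplined_b, total_b, disciplined_a, total_a).
STRATA = {
    "no prior discipline": (6, 600, 9, 900),
    "prior discipline": (40, 400, 10, 100),
}


def crude_rate_ratio(strata):
    """Rate ratio (group B over group A), ignoring the stratifying variable."""
    cases_b = sum(s[0] for s in strata.values())
    total_b = sum(s[1] for s in strata.values())
    cases_a = sum(s[2] for s in strata.values())
    total_a = sum(s[3] for s in strata.values())
    return (cases_b / total_b) / (cases_a / total_a)


def mantel_haenszel_rate_ratio(strata):
    """Mantel-Haenszel summary rate ratio, adjusted for the strata."""
    numerator = 0.0
    denominator = 0.0
    for cases_b, total_b, cases_a, total_a in strata.values():
        n = total_b + total_a
        numerator += cases_b * total_a / n
        denominator += cases_a * total_b / n
    return numerator / denominator


if __name__ == "__main__":
    print(round(crude_rate_ratio(STRATA), 2))
    print(round(mantel_haenszel_rate_ratio(STRATA), 2))
```

In this made-up example, the discipline rate within each stratum is identical for the two groups (1% without prior discipline, 10% with), yet the crude comparison shows group B disciplined more than twice as often, solely because group B has more members with prior discipline.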

Curiously, the Farkas report did not take into account the race or ethnicity of the complainants before the Bar’s disciplinary committee. The Farkas report seems reasonable as far as it goes, but the wild conclusions drawn in the media would not pass Rule 702 gatekeeping.

[1]  See, e.g., Emma Cueto, “Black Male Attorneys Disciplined More Often, California Study Finds,” Law360 (Nov. 18, 2019); Debra Cassens Weiss, “New California bar study finds racial disparities in lawyer discipline,” Am. Bar Ass’n J. (Nov. 18, 2019).

[2]  Joe Patrice, “Study Finds That Bar Discipline Is Totally Racist Shocking Absolutely No One: Black male attorneys are more likely to be disciplined than white attorneys,” Above the Law (Nov. 19, 2019).

[3]  Debra Cassens Weiss, “New California bar study finds racial disparities in lawyer discipline,” Am. Bar Ass’n J. (Nov. 18, 2019).

[4]  George Farkas, “Discrepancies by Race and Gender in Attorney Discipline by the State Bar of California: An Empirical Analysis” (Oct. 31, 2019).

[5]  Debra Cassens Weiss, supra at note 3.

[6]  Dag MacLeod (Chief of Mission Advancement & Accountability Division) & Ron Pi (Principal Analyst, Office of Research & Institutional Accountability), Report on Disparities in the Discipline System (Nov. 14, 2019).

[7] Dag MacLeod & Ron Pi, Report on Disparities in the Discipline System at 4 (Nov. 14, 2019) (emphasis added).

Palavering About P-Values

August 17th, 2019

The American Statistical Association’s most recent confused and confusing communication about statistical significance testing has given rise to great mischief in the world of science and science publishing.[1] Take, for instance, last week’s opinion piece, “Is It Time to Ban the P Value?” Please.

Helena Chmura Kraemer is an accomplished professor of statistics at Stanford University. This week the Journal of the American Medical Association network flagged Professor Kraemer’s opinion piece on p-values as one of its most read articles. Kraemer’s eye-catching title creates the impression that the p-value is unnecessary and inimical to valid inference.[2]

Remarkably, Kraemer’s article commits the very mistake that the ASA set out to correct back in 2016,[3] by conflating the probability of the data under a hypothesis of no association with the probability of a hypothesis given the data:

“If P value is less than .05, that indicates that the study evidence was good enough to support that hypothesis beyond reasonable doubt, in cases in which the P value .05 reflects the current consensus standard for what is reasonable.”

The ASA tried to break the bad habit of scientists’ interpreting p-values as allowing us to assign posterior probabilities, such as beyond a reasonable doubt, to hypotheses, but obviously to no avail.
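The distance between the two probabilities can be sketched with a small calculation. The inputs are assumptions, not facts about any real body of research: a hypothetical share of tested hypotheses that are true, a hypothetical power, and the conventional 5% threshold. The function name is likewise invented for the illustration, and it conditions only on “significant at alpha,” not on the exact p-value observed.

```python
def prob_null_given_significant(prior_true_effect, power, alpha):
    """P(null is true | result significant at alpha), by Bayes' rule.

    prior_true_effect: assumed share of tested hypotheses that are real effects.
    power: assumed probability of a significant result when the effect is real.
    alpha: probability of a significant result when the null is true.
    """
    false_positives = alpha * (1.0 - prior_true_effect)
    true_positives = power * prior_true_effect
    return false_positives / (false_positives + true_positives)


if __name__ == "__main__":
    # Hypothetical: 10% of tested hypotheses are true, 80% power, alpha = 0.05.
    print(round(prob_null_given_significant(0.10, 0.80, 0.05), 2))
```

Under these assumed inputs, more than a third of “significant” findings would be false positives, which is a long way from “beyond reasonable doubt,” even though every one of them has p less than .05.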

Kraemer also ignores the ASA 2016 Statement’s teaching of what the p-value is not and cannot do, by claiming that p-values are determined by non-random error probabilities such as:

“the reliability and sensitivity of the measures used, the quality of the design and analytic procedures, the fidelity to the research protocol, and in general, the quality of the research.”

Kraemer provides errant advice and counsel by insisting that “[a] non-significant result indicates that the study has failed, not that the hypothesis has failed.” If the p-value is the measure of the probability of observing an association at least as large as obtained given an assumed null hypothesis, then of course a large p-value cannot speak to the failure of the hypothesis, but why declare that the study has failed? The study was perhaps indeterminate, but it still yielded information that perhaps can be combined with other data, or help guide future studies.

Perhaps in her most misleading advice, Kraemer asserts that:

“[w]hether P values are banned matters little. All readers (reviewers, patients, clinicians, policy makers, and researchers) can just ignore P values and focus on the quality of research studies and effect sizes to guide decision-making.”

Really? If a high quality study finds an “effect size” of interest, we can now ignore random error?
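A short sketch shows what ignoring random error would throw away. The two studies below are hypothetical, with counts invented so that both yield the same risk ratio of 2.0; the interval uses the standard large-sample approximation on the log scale, and the function name is an assumption of this example.

```python
import math


def risk_ratio_ci(cases_exposed, n_exposed, cases_unexposed, n_unexposed, z=1.96):
    """Risk ratio with an approximate 95% confidence interval (log method)."""
    rr = (cases_exposed / n_exposed) / (cases_unexposed / n_unexposed)
    se_log_rr = math.sqrt(
        1.0 / cases_exposed - 1.0 / n_exposed
        + 1.0 / cases_unexposed - 1.0 / n_unexposed
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


if __name__ == "__main__":
    # Same "effect size" in both hypothetical studies; only the sample size differs.
    print([round(x, 2) for x in risk_ratio_ci(6, 20, 3, 20)])
    print([round(x, 2) for x in risk_ratio_ci(600, 2000, 300, 2000)])
```

The small study’s interval comfortably includes a risk ratio of 1.0, while the large study’s interval excludes it. A reader who looked only at the effect size could not tell the two apart.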

The ASA 2016 Statement, with its “six principles,” has provoked some deliberate or ill-informed distortions in American judicial proceedings, but Kraemer’s editorial creates idiosyncratic meanings for p-values. Even the 2019 ASA “post-modernism” does not advocate ignoring random error and p-values, as opposed to proscribing dichotomous characterization of results as “statistically significant,” or not.[4] The current author guidelines for articles submitted to the Journals of the American Medical Association clearly reject this new-fangled rejection of the need to assess the role of random error.[5]

[1]  See Ronald L. Wasserstein, Allen L. Schirm, and Nicole A. Lazar, “Editorial: Moving to a World Beyond ‘p < 0.05’,” 73 Am. Statistician S1, S2 (2019).

[2]  Helena Chmura Kraemer, “Is It Time to Ban the P Value?” J. Am. Med. Ass’n Psych. (August 7, 2019), in-press at doi:10.1001/jamapsychiatry.2019.1965.

[3]  Ronald L. Wasserstein & Nicole A. Lazar, “The ASA’s Statement on p-Values: Context, Process, and Purpose,” 70 The American Statistician 129 (2016).

[4]  “Has the American Statistical Association Gone Post-Modern?” (May 24, 2019).

[5]  See instructions for authors at