For your delectation and delight, desultory dicta on the law of delicts.

Scientific Evidence in Canadian Courts

February 20th, 2018

A couple of years ago, Deborah Mayo called my attention to the Canadian version of the Reference Manual on Scientific Evidence.1 In the course of discussion of mistaken definitions and uses of p-values, confidence intervals, and significance testing, Sander Greenland pointed to some dubious pronouncements in the Science Manual for Canadian Judges [Manual].

Unlike the United States federal court Reference Manual, which is published through a joint effort of the National Academies of Sciences, Engineering, and Medicine, the Canadian version is the product of the Canadian National Judicial Institute (NJI, or the Institut National de la Magistrature, if you live in Quebec), which claims to be an independent, not-for-profit group committed to educating Canadian judges. In addition to the Manual, the Institute publishes Model Jury Instructions and a guide, Problem Solving in Canada’s Courtrooms: A Guide to Therapeutic Justice (2d ed.), and conducts educational courses.

The NJI’s website describes the Institute’s Manual as follows:

Without the proper tools, the justice system can be vulnerable to unreliable expert scientific evidence.

         * * *

The goal of the Science Manual is to provide judges with tools to better understand expert evidence and to assess the validity of purportedly scientific evidence presented to them. …”

The Chief Justice of Canada, Hon. Beverley M. McLachlin, contributed an introduction to the Manual, which was notable for its frank admission that:

[w]ithout the proper tools, the justice system is vulnerable to unreliable expert scientific evidence.


Within the increasingly science-rich culture of the courtroom, the judiciary needs to discern ‘good’ science from ‘bad’ science, in order to assess expert evidence effectively and establish a proper threshold for admissibility. Judicial education in science, the scientific method, and technology is essential to ensure that judges are capable of dealing with scientific evidence, and to counterbalance the discomfort of jurists confronted with this specific subject matter.”

Manual at 14. These are laudable goals, indeed, but did the National Judicial Institute live up to them, or did it leave Canadian judges vulnerable to the Institute’s own “bad science”?

In his comments on Deborah Mayo’s blog, Greenland noted some rather cavalier statements in Chapter two that suggest that the conventional alpha of 5% corresponds to a “scientific attitude that unless we are 95% sure the null hypothesis is false, we provisionally accept it.” And he pointed to other passages where the chapter seems to suggest that the coefficient of confidence that corresponds to an alpha of 5% “constitutes a rather high standard of proof,” thus confusing and conflating the probability of random error with posterior probabilities. Greenland is absolutely correct that the Manual does a rather miserable job of educating Canadian judges if our standard for its work product is accuracy and truth.

Some of the most egregious errors are within what is perhaps the most important chapter of the Manual, Chapter 2, “Science and the Scientific Method.” The chapter has two authors, a scientist, Scott Findlay, and a lawyer, Nathalie Chalifour. Findlay is an Associate Professor in the Department of Biology at the University of Ottawa. Chalifour is an Associate Professor in the Faculty of Law, also at the University of Ottawa. Together, they produced some dubious pronouncements, such as:

Weight of the Evidence (WOE)

First, the concept of weight of evidence in science is similar in many respects to its legal counterpart. In both settings, the outcome of a weight-of-evidence assessment by the trier of fact is a binary decision.”

Manual at 40. Findlay and Chalifour cite no support for their characterization of WOE in science. Most attempts to invoke WOE are woefully vague and amorphous, with no meaningful guidance or content.2 Sixty-five pages later, if anyone is still paying attention, the authors let us in on a dirty little secret:

at present, there exists no established prescriptive methodology for weight of evidence assessment in science.”

Manual at 105. The authors fail to mention, however, that there are prescriptive methods for inferring causation in science; you just will not see them in discussions of weight of the evidence. The authors then compound the semantic and conceptual problems by stating that “in a civil proceeding, if the evidence adduced by the plaintiff is weightier than that brought forth by the defendant, a judge is obliged to find in favour of the plaintiff.” Manual at 41. This is a remarkable suggestion, which implies that if the plaintiff adduces the crummiest crumb of evidence, a mere peppercorn on the scales of justice, and the defendant has none to offer, then the plaintiff must win. The plaintiff wins notwithstanding that no reasonable person could believe that the plaintiff’s claims are more likely than not true. Even if this were the law of Canada, it is certainly not how scientists think about establishing the truth of empirical propositions.

Confusion of Hypothesis Testing with “Beyond a Reasonable Doubt”

The authors’ next assault comes in conflating significance probability with the probability connected with the burden of proof, a posterior probability. Legal proceedings have a defined burden of proof, with criminal cases requiring the state to prove guilt “beyond a reasonable doubt.” Findlay and Chalifour’s discussion then runs off the rails by likening hypothesis testing, with an alpha of 5% or its complement, 95%, as a coefficient of confidence, to a “very high” burden of proof:

In statistical hypothesis-testing – one of the tools commonly employed by scientists – the predisposition is that there is a particular hypothesis (the null hypothesis) that is assumed to be true unless sufficient evidence is adduced to overturn it. But in statistical hypothesis-testing, the standard of proof has traditionally been set very high such that, in general, scientists will only (provisionally) reject the null hypothesis if they are at least 95% sure it is false. Third, in both scientific and legal proceedings, the setting of the predisposition and the associated standard of proof are purely normative decisions, based ultimately on the perceived consequences of an error in inference.”

Manual at 41. This is, as Greenland and many others have pointed out, a totally bogus conception of hypothesis testing, and an utterly false description of the probabilities involved.

Later in the chapter, Findlay and Chalifour flirt with the truth, but then lapse into an unrecognizable parody of it:

Inferential statistics adopt the frequentist view of probability whereby a proposition is either true or false, and the task at hand is to estimate the probability of getting results as discrepant or more discrepant than those observed, given the null hypothesis. Thus, in statistical hypothesis testing, the usual inferred conclusion is either that the null is true (or rather, that we have insufficient evidence to reject it) or it is false (in which case we reject it). The decision to reject or not is based on the value of p: if the estimated value of p is below some threshold value α, we reject the null; otherwise we accept it.”

Manual at 74. OK; so far so good, but here comes the train wreck:

By convention (and by convention only), scientists tend to set α = 0.05; this corresponds to the collective – and, one assumes, consensual – scientific attitude that unless we are 95% sure the null hypothesis is false, we provisionally accept it. It is partly because of this that scientists have the reputation of being a notoriously conservative lot, given that a 95% threshold constitutes a rather high standard of proof.”

Manual at 75. Uggh; so we are back to significance probability’s being a posterior probability. As if to atone for their sins, in the very next paragraph, the authors then remind the judicial readers that:

As noted above, p is the probability of obtaining results at least as discrepant as those observed if the null is true. This is not the same as the probability of the null hypothesis being true, given the results.”

Manual at 75. True, true, and completely at odds with what the authors have stated previously. And to add to the reader’s now fully justified confusion, the authors describe the standard for rejecting the null hypothesis as “very high indeed.” Manual at 102, 109. Any reader who is following the discussion might wonder how and why there is such a problem of replication and reproducibility in contemporary science.
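
The distinction that the authors blur is easy to make concrete. A p-value below 0.05 reports how surprising the data would be if the null hypothesis were true; how probable the null hypothesis is after seeing the data depends as well on the prior probability of the alternative and on the power of the study. Here is a minimal sketch in Python, with hypothetical priors and a hypothetical power of 80%, chosen only for illustration:

```python
# Sketch: why "p < 0.05" is not "95% sure the null hypothesis is false."
# The prior probabilities and power below are hypothetical, for illustration only.

def posterior_prob_alternative(prior, power, alpha=0.05):
    """P(alternative is true | a 'significant' result), by Bayes' theorem."""
    true_positives = power * prior          # significant results when the alternative is true
    false_positives = alpha * (1 - prior)   # significant results when the null is true
    return true_positives / (true_positives + false_positives)

for prior in (0.5, 0.1, 0.01):
    post = posterior_prob_alternative(prior, power=0.8)
    print(f"prior = {prior:4.2f} -> P(alternative | p < 0.05) = {post:.2f}")

# prior = 0.50 -> P(alternative | p < 0.05) = 0.94
# prior = 0.10 -> P(alternative | p < 0.05) = 0.64
# prior = 0.01 -> P(alternative | p < 0.05) = 0.14
```

Only when the tested hypothesis already enjoys substantial prior support does a conventionally “significant” result approach 95% confidence that the null is false; the significance level by itself never supplies that number.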

Conflating Bayesianism with Frequentist Modes of Inference

We have seen how Findlay and Chalifour conflate significance and posterior probabilities, some of the time. In a section of their chapter that deals explicitly with probability, the authors tell us that before any study is conducted, the prior probability of the truth of the tested hypothesis is 50%, sans evidence. This is an astonishing creation of certainty out of nothingness, and perhaps it explains the authors’ implied claim that the crummiest morsel of evidence on one side is sufficient to compel a verdict, if the other side has no morsels at all. Here is how the authors put their claim to the Canadian judges:

Before each study is conducted (that is, a priori), the hypothesis is as likely to be true as it is to be false. Once the results are in, we can ask: How likely is it now that the hypothesis is true? In the first study, the low a priori inferential strength of the study design means that this probability will not be much different from the a priori value of 0.5 because any result will be rather equivocal owing to limitations in the experimental design.”

Manual at 64. This implied Bayesian slant, with 50% priors, in the world of science would lead anyone to believe “as many as six impossible things before breakfast,” and many more throughout the day.
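
The arithmetic shows why the assumed 50% prior does so much work. A minimal sketch, again in Python, using Bayes’ theorem in odds form with hypothetical likelihood ratios standing in for study results of varying strength:

```python
# Sketch of how an assumed 50% prior drives the posterior probability.
# The likelihood ratios below are hypothetical, for illustration only.

def posterior_probability(prior, likelihood_ratio):
    """Bayes' theorem in odds form: posterior odds = prior odds * likelihood ratio."""
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)

for prior in (0.5, 0.05):
    for lr in (2, 10):   # a weak and a moderately strong study result
        post = posterior_probability(prior, lr)
        print(f"prior = {prior:4.2f}, likelihood ratio = {lr:2d} -> posterior = {post:.2f}")

# prior = 0.50, likelihood ratio =  2 -> posterior = 0.67
# prior = 0.50, likelihood ratio = 10 -> posterior = 0.91
# prior = 0.05, likelihood ratio =  2 -> posterior = 0.10
# prior = 0.05, likelihood ratio = 10 -> posterior = 0.34
```

Starting every untested hypothesis at even odds guarantees that even modest evidence produces a seemingly respectable posterior probability; a more skeptical prior, which most untested hypotheses deserve, leaves the same evidence looking far weaker.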

Lest you think that the Manual is all rubbish, there are occasional gems of advice to the Canadian judges. The authors admonish the judges to

be wary of individual ‘statistically significant’ results that are mined from comparatively large numbers of trials or experiments, as the results may be ‘cherry picked’ from a larger set of experiments or studies that yielded mostly negative results. The court might ask the expert how many other trials or experiments testing the same hypothesis he or she is aware of, and to describe the outcome of those studies.”

Manual at 87. Good advice, but at odds with the authors’ characterization of statistical significance as establishing the rejection of the null hypothesis well-nigh beyond a reasonable doubt.

When Greenland first called attention to this Manual, I reached out to some people who had been involved in its peer review. One reviewer told me that it was a “living document,” and would likely be revised after he had the chance to call the NJI’s attention to the errors. But two years later, the errors remain, and so we have to infer that the authors meant to say all the contradictory and false statements that are still present in the downloadable version of the Manual.

2 See “WOE-fully Inadequate Methodology – An Ipse Dixit By Another Name” (May 1, 2012); “Weight of the Evidence in Science and in Law” (July 29, 2017); see also David E. Bernstein, “The Misbegotten Judicial Resistance to the Daubert Revolution,” 89 Notre Dame L. Rev. 27 (2013).

High, Low and Right-Sided Colonics – Ridding the Courts of Junk Science

July 16th, 2016

Not surprisingly, many of Selikoff’s litigation- and regulatory-driven opinions have not fared well, such as the notions that asbestos causes gastrointestinal cancers and that all asbestos minerals have equal potential and strength to cause mesothelioma. Forty years after Selikoff testified in litigation that occupational asbestos exposure caused an insulator’s colorectal cancer, the Institute of Medicine reviewed the extant evidence and announced that the evidence was “suggestive but not sufficient to infer a causal relationship between asbestos exposure and pharyngeal, stomach, and colorectal cancers.” Jonathan Samet, et al., eds., Institute of Medicine Review of Asbestos: Selected Cancers (2006).[1] The Institute of Medicine’s monograph has fostered a more circumspect approach in some of the federal agencies. The National Cancer Institute’s website now proclaims that the evidence is insufficient to permit a conclusion that asbestos causes non-pulmonary cancers of the gastrointestinal tract and throat.[2]

As discussed elsewhere, Selikoff testified as early as 1966 that asbestos causes colorectal cancer, in advance of any meaningful evidence to support such an opinion, and then he, and his protégées, worked hard to lace the scientific literature with their pronouncements on the subject, without disclosing their financial, political, and positional conflicts of interest.[3]

With the plaintiffs’ firm’s (Lanier’s) zealous pursuit of bias information from the University of Idaho in the LoGuidice case, what are we to make of Selikoff’s and his minions’ dubious ethics of failed disclosure? Do Selikoff and Mount Sinai receive a pass because their asbestos research predated the discovery of ethics? The “Lobby” (as the late Douglas Liddell called Selikoff and his associates)[4] has seriously distorted truth-finding in any number of litigations, but nowhere are the Lobby’s distortions more at work than in lawsuits for claimed asbestos injuries. Here the conflicts of interest truly have had a deleterious effect on the quality of civil justice. As we saw with the Selikoff exceptionalism displayed by the New York Supreme Court in reviewing third-party subpoenas,[5] some courts seem bent on ignoring evidence-based analyses in favor of Mount Sinai faith-based initiatives.

Current Asbestos Litigation Claims Involving Colorectal Cancer

Although Selikoff has passed from the litigation scene, his trainees and followers have lined up at the courthouse door to propagate his opinions. Even before the IOM’s 2006 monograph, more sophisticated epidemiologists consistently rejected the Selikoff conclusion on asbestos and colon cancer, which grew out of Selikoff’s litigation activities.[6] And yet, the minions keep coming.

In the pre-Daubert era, defendants lacked an evidentiary challenge to Selikoff’s opinion that asbestos caused colorectal cancer. Instead of contesting the legal validity or sufficiency of the plaintiffs’ general causation claims, defendants often focused on the unreliability of the causal attribution for the specific claimant’s disease. These early cases are often misunderstood to be challenges to expert witnesses’ opinions about whether asbestos causes colorectal cancer; they were not.[7]

Of course, after the IOM’s 2006 monograph, active expert witness gatekeeping should eliminate asbestos gastrointestinal cancer claims, but sadly they persist. Perhaps courts simply considered the issue “grandfathered” in from the era in which judicial scrutiny of expert witness opinion testimony was restricted. Perhaps defense counsel are failing to frame and support their challenges properly. Perhaps both.

Arthur Frank Jumps the Gate

Although ostensibly a “Frye” state, Pennsylvania judges have, when moved by the occasion, applied a fairly thorough analysis of proffered expert witness opinion.[8] On occasion, Pennsylvania judges have excluded unreliably or invalidly supported causation opinions, under the Pennsylvania version of the Frye standard. A recent case, however, tried before a Workers’ Compensation Judge (WCJ), and appealed to the Commonwealth Court, shows how inconsistent the application of the standard can be, especially when Selikoff’s legacy views are at issue.

Michael Piatetsky, an architect, died of colorectal cancer. Before his death, he and his wife filed a worker’s compensation claim, in which they alleged that his disease was caused by his workplace exposure to asbestos. Garrison Architects v. Workers’ Comp. Appeal Bd. (Piatetsky), No. 1095 C.D. 2015, Pa. Cmwlth. Ct., 2016 Pa. Commw. Unpub. LEXIS 72 (Jan. 22, 2016) [cited as Piatetsky]. Mr. Piatetsky was an architect, almost certainly knowledgeable about asbestos hazards generally.  Despite his knowledge, Piatetsky eschewed personal protective equipment even when working at dusty work sites well marked with warnings. Although he had engaged in culpable conduct, the employer in worker compensation proceedings does not have ordinary negligence defenses, such as contributory negligence or assumption of risk.

In litigating the Piatetskys’ claim, the employer dragged its feet and failed to name an expert witness. Eventually, after many requests for continuances, the Workers’ Compensation Judge barred the employer from presenting an expert witness. With the record closed, and without an expert witness, the Judge understandably ruled in favor of the claimant.

The employer, sans expert witness, had to confront claimant’s expert witness, Arthur L. Frank, a minion of Selikoff and a frequent testifier in asbestos and many other litigations. Frank, of course, opined that asbestos causes colon cancer and that it caused Mr. Piatetsky’s cancer. Mr. Piatetsky’s colon cancer originated on the right side of his colon. Dr. Frank thus emphasized that asbestos causes colon cancer in all locations, but especially on the right side in view of one study’s having concluded “that colon cancer caused by asbestos is more likely to begin on the right side.” Piatetsky at *6.

On appeal, the employer sought relief on several issues, but the only one of interest here is the employer’s argument “that Claimant’s medical expert based his opinion on flimsy medical studies.” Piatetsky at *10. The employer’s appeal seemed to go off the rails with the insistence that the Claimant’s medical opinion was invalid because Dr. Frank relied upon studies not involving architects. Piatetsky at *14. The Commonwealth Court was able to point to testimony, although probably exaggerated, which suggested that Mr. Piatetsky had been heavily exposed, at least at times, and thus his exposure was similar to that in the studies cited by Frank.

With respect to Frank’s right-sided (non-sinister) opinion, the Commonwealth Court framed the employer’s issue as a contention that Dr. Frank’s opinion on the asbestos-relatedness of right-sided colon cancer was “not universally accepted.” But universal acceptance has never been the test or standard for the rejection or acceptance of expert witness opinion testimony in any state.  Either the employer badly framed its appeal, or the appellate court badly misstated the employer’s ground for relief. In any event, the Commonwealth Court never addressed the relevant legal standard in its discussion.

The Claimant argued that the hearing Judge had found that Frank’s opinion was based on “numerous studies.” Piatetsky at *15. None of these studies is cited to permit the public to assess the argument and the Court’s acceptance of it. The appellate court made inappropriately short work of this appellate issue by confusing general and specific causation, and invoking Mr. Piatetsky’s age, his lack of family history of colon cancer, Frank’s review of medical records, testimony, and work records, as warranting Frank’s causal inference. None of these factors is relevant to general causation, and none is probative of the specific causation claim.  Many if not most colon cancers have no identifiable risk factor, and Dr. Frank had no way to rule out baseline risk, even if there were an increased risk from asbestos exposure. Piatetsky at *16. With no defense expert witness, the employer certainly had a difficult appellate journey. It is hard for the reader of the Commonwealth Court’s opinion to determine whether the case was poorly defended, poorly briefed on appeal, or poorly described by the appellate judges.

In any event, the right-sided ruse of Arthur Frank went unreprimanded. Intellectual due process might have led the appellate court to cite the article at issue, but it failed to do so. It is interesting and curious to see how the appellate court gave a detailed recitation of the controverted facts of asbestos exposure, and yet how glib the court was when describing the scientific issues and evidence. Nonetheless, the article vaguely referenced, and left uncited by the appellate court, was no doubt the paper: K. Jakobsson, M. Albin & L. Hagmar, “Asbestos, cement, and cancer in the right part of the colon,” 51 Occup. & Envt’l Med. 95 (1994).

These authors observed 24 right-sided colon cancers versus 9.63 expected, and they concluded that there was an increased rate of right-sided colon cancer in the asbestos cement plant workers. Notably, the authors’ reference population had a curiously low rate of right-sided colon cancer. For left-sided colon cancer, the authors expected 9.3 cases but observed only 5 cases in the asbestos-cement cohort. Contrary to Frank’s suggestion, the authors did not conclude that right-sided colon cancers had been caused by asbestos; indeed, the authors never reached any conclusion whether asbestos causes colorectal cancer under any circumstances. In their discussion, these authors noted that “[d]espite numerous epidemiological and experimental studies, there is no consensus concerning exposure to asbestos and risks of gastrointestinal cancer.” Jakobsson at 99; see also Dorsett D. Smith, “Does Asbestos Cause Additional Malignancies Other than Lung Cancer,” chap. 11, in Dorsett D. Smith, The Health Effects of Asbestos: An Evidence-based Approach 143, 154 (2015). Even this casual description of the Jakobsson study will alert the learned reader to the multiple comparisons that went on in this cohort study, with outcomes reported for left, right, rectum, and multiple sites, without any adjustment to the level of significance. Risk of right-sided colon cancer was not a pre-specified outcome of the study, and the results of subsequent studies have never corroborated this small cohort study.
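
For readers who want to see the arithmetic behind the reported excess, here is a minimal sketch in Python, using only the observed and expected counts quoted above. The exact Poisson test is one conventional way to evaluate an observed-versus-expected comparison; nothing in this sketch reflects any analysis actually reported by the study authors or considered by the court:

```python
from scipy.stats import poisson

# Observed and expected colon cancer counts as reported in Jakobsson (1994).
observed_right, expected_right = 24, 9.63
observed_left, expected_left = 5, 9.3

smr_right = observed_right / expected_right   # standardized morbidity ratio, about 2.49
smr_left = observed_left / expected_left      # about 0.54

# One-sided exact Poisson p-value for the right-sided excess: P(X >= 24 | mean = 9.63)
p_right = poisson.sf(observed_right - 1, expected_right)

print(f"Right-sided SMR = {smr_right:.2f}, exact one-sided p = {p_right:.2g}")
print(f"Left-sided SMR = {smr_left:.2f}")
```

The isolated right-sided excess looks impressive on its own, which is precisely why the unadjusted, multiple, site-specific comparisons and the anomalously low reference rate for right-sided cancer matter so much.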

A sane understanding of subgroup analyses is important to judicial gatekeeping. See “Sub-group Analyses in Epidemiologic Studies — Dangers of Statistical Significance as a Bright-Line Test” (May 17, 2011). The chapter on statistics in the Reference Manual on Scientific Evidence (3d ed. 2011) has some prudent caveats for multiple comparisons and testing, but neither the chapter on epidemiology, nor the chapter on clinical medicine[9], provides any sense of the dangers of over-interpreting subgroup analyses.

Some commentators have argued that we must not dissuade scientists from doing subgroup analyses, but the issue is not whether they should be done, but how they should be interpreted.[10] Certainly many authors have called for caution in how subgroup analyses are interpreted[11], but apparently expert witness Arthur Frank did not receive the memo before testifying in the Piatetsky case, and the Commonwealth Court did not receive it before deciding the case.
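
The multiplicity problem that the memo would have flagged is easy to quantify. A minimal sketch, assuming for illustration a set of independent subgroup comparisons, each tested at the conventional 5% level, with no true effect in any of them:

```python
# Chance of at least one nominally "significant" subgroup finding when the null
# hypothesis is true in every subgroup. Assumes independent comparisons, each
# tested at alpha = 0.05 -- an illustration, not a model of any particular study.

alpha = 0.05
for k in (1, 4, 10, 20):
    at_least_one = 1 - (1 - alpha) ** k
    print(f"{k:2d} subgroup tests -> P(at least one false positive) = {at_least_one:.2f}")

#  1 subgroup tests -> P(at least one false positive) = 0.05
#  4 subgroup tests -> P(at least one false positive) = 0.19
# 10 subgroup tests -> P(at least one false positive) = 0.40
# 20 subgroup tests -> P(at least one false positive) = 0.64
```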

[1] As good as the IOM process can be on occasion, even its reviews are sometimes less than thorough. The asbestos monograph gave no consideration to alcohol in the causation of laryngeal cancer, and no consideration to smoking in its analysis of asbestos and colorectal cancer. See, e.g., Peter S. Liang, Ting-Yi Chen & Edward Giovannucci, “Cigarette smoking and colorectal cancer incidence and mortality: Systematic review and meta-analysis,” 124 Internat’l J. Cancer 2406, 2410 (2009) (“Our results indicate that both past and current smokers have an increased risk of [colorectal cancer] incidence and mortality. Significantly increased risk was found for current smokers in terms of mortality (RR = 1.40), former smokers in terms of incidence (RR = 1.25)”); Lindsay M. Hannan, Eric J. Jacobs and Michael J. Thun, “The Association between Cigarette Smoking and Risk of Colorectal Cancer in a Large Prospective Cohort from the United States,” 18 Cancer Epidemiol., Biomarkers & Prevention 3362 (2009).

[2] National Cancer Institute, “Asbestos Exposure and Cancer Risk” (last visited July 10, 2016) (“In addition to lung cancer and mesothelioma, some studies have suggested an association between asbestos exposure and gastrointestinal and colorectal cancers, as well as an elevated risk for cancers of the throat, kidney, esophagus, and gallbladder (3, 4). However, the evidence is inconclusive.”).

[3] Compare “Health Hazard Progress Notes: Compensation Advance Made in New York State,” 16(5) Asbestos Worker 13 (May 1966) (thanking Selikoff for testifying in a colon cancer case) with Irving J. Selikoff, “Epidemiology of gastrointestinal cancer,” 9 Envt’l Health Persp. 299 (1974) (arguing for his causal conclusion between asbestos and all gastrointestinal cancers, with no acknowledgment of his role in litigation or his funding from the asbestos insulators’ union).

[4] F.D.K. Liddell, “Magic, Menace, Myth and Malice,” 41 Ann. Occup. Hyg. 3, 3 (1997); see also “The Lobby Lives – Lobbyists Attack IARC for Conducting Scientific Research” (Feb. 19, 2013).


[5] See “The LoGiudice Inquisitorial Subpoena & Its Antecedents in N.Y. Law” (July 14, 2016).

[6] See, e.g., Richard Doll & Julian Peto, Asbestos: Effects on health of exposure to asbestos 8 (1985) (“In particular, there are no grounds for believing that gastrointestinal cancers in general are peculiarly likely to be caused by asbestos exposure.”).

[7] See “Landrigan v. The Celotex Corporation, Revisited” (June 4, 2013); Landrigan v. The Celotex Corp., 127 N.J. 404, 605 A.2d 1079 (1992); Caterinicchio v. Pittsburgh Corning Corp., 127 N.J. 428, 605 A.2d 1092 (1992). In both Landrigan and Caterinicchio, there had been no challenge to the reliability or validity of the plaintiffs’ expert witnesses’ general causation opinions. Instead, the trial courts entered judgments, assuming arguendo that asbestos can cause colorectal cancer (a dubious proposition), on the ground that the low relative risk cited by plaintiffs’ expert witnesses (about 1.5) was factually insufficient to support a verdict for plaintiffs on specific causation. Indeed, the relative risk suggested that the odds were about 2 to 1 in defendants’ favor that the plaintiffs’ colorectal cancers were not caused by asbestos.

[8] See, e.g., Porter v. Smithkline Beecham Corp., Sept. Term 2007, No. 03275, 2016 WL 614572 (Phila. Cty. Com. Pleas, Oct. 5, 2015); “Demonstration of Frye Gatekeeping in Pennsylvania Birth Defects Case” (Oct. 6, 2015).

[9] John B. Wong, Lawrence O. Gostin & Oscar A. Cabrera, “Reference Guide on Medical Testimony,” in Reference Manual on Scientific Evidence 687 (3d ed. 2011).

[10] See, e.g., Phillip I. Good & James W. Hardin, Common Errors in Statistics (and How to Avoid Them) 13 (2003) (proclaiming a scientists’ Bill of Rights under which they should be allowed to conduct subgroup analyses); Ralph I. Horwitz, Burton H. Singer, Robert W. Makuch, Catherine M. Viscoli, “Clinical versus statistical considerations in the design and analysis of clinical research,” 51 J. Clin. Epidemiol. 305 (1998) (arguing for the value of subgroup analyses). In United States v. Harkonen, the federal government prosecuted a scientist for fraud in sending a telecopy that described a clinical trial as “demonstrating” a benefit in a subgroup of a secondary trial outcome.  Remarkably, in the Harkonen case, the author, and criminal defendant, was describing a result in a pre-specified outcome, in a plausible but post-hoc subgroup, which result accorded with prior clinical trials and experimental evidence. United States v. Harkonen (D. Calif. 2009); United States v. Harkonen (D. Calif. 2010) (post-trial motions), aff’d, 510 F. App’x 633 (9th Cir. 2013) (unpublished), cert. denied, 134 S. Ct. 824, ___ U.S. ___ (2014); Brief by Scientists And Academics as Amici Curiae In Support Of Petitioner, On Petition For Writ Of Certiorari in the Supreme Court of the United States, W. Scott Harkonen v. United States, No. 13-180 (filed Sept. 4, 2013).

[11] See “Sub-group Analyses in Epidemiologic Studies — Dangers of Statistical Significance as a Bright-Line Test” (May 17, 2011) (collecting commentary); see also Lemuel A. Moyé, Statistical Reasoning in Medicine: The Intuitive P-Value Primer 206, 225 (2d ed. 2006) (noting that subgroup analyses are often misleading: “Fishing expeditions for significance commonly catch only the junk of sampling error”); Victor M. Montori, Roman Jaeschke, Holger J. Schünemann, Mohit Bhandari, Jan L Brozek, P. J. Devereaux & Gordon H Guyatt, “Users’ guide to detecting misleading claims in clinical research reports,” 329 Brit. Med. J. 1093 (2004) (“Beware subgroup analysis”); Susan F. Assmann, Stuart J. Pocock, Laura E. Enos, Linda E. Kasten, “Subgroup analysis and other (mis)uses of baseline data in clinical trials,” 355 Lancet 1064 (2000); George Davey Smith & Mathias Egger, “Commentary: Incommunicable knowledge? Interpreting and applying the results of clinical trials and meta-analyses,” 51 J. Clin. Epidemiol. 289 (1998) (arguing against post-hoc hypothesis testing); Douglas G. Altman, “Statistical reviewing for medical journals,” 17 Stat. Med. 2662 (1998); Douglas G. Altman, “Commentary: Within trial variation – A false trail?” 51 J. Clin. Epidemiol. 301 (1998) (noting that observed associations are expected to vary across subgroups because of random variability); Christopher Bulpitt, “Subgroup Analysis,” 2 Lancet 31 (1988).

Judicial Control of the Rate of Error in Expert Witness Testimony

May 28th, 2015

In Daubert, the Supreme Court set out several criteria or factors for evaluating the “reliability” of expert witness opinion testimony. The third factor in the Court’s enumeration was whether the trial court had considered “the known or potential rate of error” in assessing the scientific reliability of the proffered expert witness’s opinion. Daubert v. Merrell Dow Pharmaceuticals, Inc., 509 U.S. 579, 593 (1993). The Court, speaking through Justice Blackmun, failed to provide much guidance on the nature of the errors subject to gatekeeping, on how to quantify the errors, and on how much error was too much. Rather than provide a taxonomy of error, the Court lumped “accuracy, validity, and reliability” together with a grand pronouncement that these measures were distinguished by no more than a “hen’s kick.” Id. at 590 n.9 (1993) (citing and quoting James E. Starrs, “Frye v. United States Restructured and Revitalized: A Proposal to Amend Federal Evidence Rule 702,” 26 Jurimetrics J. 249, 256 (1986)).

The Supreme Court’s failure to elucidate its “rate of error” factor has caused a great deal of mischief in the lower courts. In practice, trial courts have rejected engineering opinions on stated grounds of their lacking an error rate as a way of noting that the opinions were bereft of experimental and empirical evidential support[1]. For polygraph evidence, courts have used the error rate factor to obscure their policy prejudices against polygraphs, and to exclude test data even when the error rate is known, and rather low compared to what passes for expert witness opinion testimony in many other fields[2]. In the context of forensic evidence, the courts have rebuffed objections to random-match probabilities that would require that such probabilities be modified by the probability of laboratory or other error[3].

When it comes to epidemiologic and other studies that require statistical analyses, lawyers on both sides of the “v” frequently misunderstand p-values or confidence intervals to provide complete measures of error, and ignore the larger errors that result from bias, confounding, study validity (internal and external), inappropriate data synthesis, and the like[4]. Not surprisingly, parties fallaciously argue that the Daubert criterion of “rate of error” is satisfied by expert witness’s reliance upon studies that in turn use conventional 95% confidence intervals and measures of statistical significance in p-values below 0.05[5].

The lawyers who embrace confidence intervals and p-values as their sole measure of error rate fail to recognize that confidence intervals and p-values are means of assessing only one kind of error: random sampling error. Given the carelessness of the Supreme Court’s use of technical terms in Daubert, and its failure to engage with the actual evidence at issue in the case, it is difficult to know whether the Court intended to suggest that random error was the error rate it had in mind[6]. The statistics chapter in the Reference Manual on Scientific Evidence helpfully points out that the inferences that can be drawn from data turn on p-values and confidence intervals, as well as on study design, data quality, and the presence or absence of systematic errors, such as bias or confounding. Reference Manual on Scientific Evidence at 240 (3d ed. 2011) [Manual]. Random errors are reflected in the size of p-values or the width of confidence intervals, but these measures of random sampling error ignore systematic errors such as confounding and study biases. Id. at 249 & n.96.

The Manual’s chapter on epidemiology takes an even stronger stance: the p-value for a given study does not provide a rate of error or even a probability of error for an epidemiologic study:

“Epidemiology, however, unlike some other methodologies—fingerprint identification, for example—does not permit an assessment of its accuracy by testing with a known reference standard. A p-value provides information only about the plausibility of random error given the study result, but the true relationship between agent and outcome remains unknown. Moreover, a p-value provides no information about whether other sources of error – bias and confounding – exist and, if so, their magnitude. In short, for epidemiology, there is no way to determine a rate of error.”

Manual at 575. This stance seems not entirely justified given that there are Bayesian approaches that would produce credibility intervals accounting for both sampling error and systematic biases. To be sure, such approaches have their own problems and they have received little to no attention in courtroom proceedings to date.

The authors of the Manual’s epidemiology chapter, who are usually forgiving of judicial error in interpreting epidemiologic studies, point to one United States Court of Appeals case that fallaciously interpreted confidence intervals magically to quantify bias and confounding in a Bendectin birth defects case. Id. at 575 n. 96[7]. The Manual could have gone further to point out that, in the context of multiple studies, of different designs and analyses, cognitive biases involved in evaluating, assessing, and synthesizing the studies are also ignored by statistical measures such as p-values and confidence intervals. Although the Manual notes that assessing the role of chance in producing a particular set of sample data is “often viewed as essential when making inferences from data,” the Manual never suggests that random sampling error is the only kind of error that must be assessed when interpreting data. The Daubert criterion would appear to encompass all varieties of error, not just random error.

The Manual’s suggestion that epidemiology does not permit an assessment of the accuracy of epidemiologic findings misrepresents the capabilities of modern epidemiologic methods. Courts can, and do, invoke gatekeeping approaches to weed out confounded study findings. See “Sorting Out Confounded Research – Required by Rule 702” (June 10, 2012). The “reverse Cornfield inequality” was an important analysis that helped establish the causal connection between tobacco smoke and lung cancer[8]. Olav Axelson studied and quantified the role of smoking as a confounder in epidemiologic analyses of other putative lung carcinogens.[9] Quantitative methods for identifying confounders have been widely deployed[10].
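
Axelson’s indirect method can be stated in a few lines. The sketch below uses hypothetical smoking prevalences and a hypothetical smoking–lung cancer risk ratio, chosen only to illustrate the form of the calculation; the numbers are not drawn from Axelson’s papers:

```python
# Sketch of an Axelson-style indirect adjustment for confounding by smoking.
# All prevalences and risk ratios are hypothetical, for illustration only.

def confounding_rr(prev_exposed, prev_reference, rr_smoking):
    """Risk ratio produced by a smoking imbalance alone, absent any effect of the study exposure."""
    risk_exposed = prev_exposed * rr_smoking + (1 - prev_exposed)
    risk_reference = prev_reference * rr_smoking + (1 - prev_reference)
    return risk_exposed / risk_reference

# Suppose 60% of the exposed workers smoke versus 40% of the comparison population,
# and smoking carries a tenfold lung cancer risk:
bias_rr = confounding_rr(prev_exposed=0.60, prev_reference=0.40, rr_smoking=10.0)
print(f"Risk ratio expected from the smoking imbalance alone: {bias_rr:.2f}")   # about 1.39

# An observed risk ratio can then be deflated by this factor to gauge how much of it
# plausibly reflects smoking rather than the occupational exposure under study:
observed_rr = 1.5
print(f"Observed RR {observed_rr} -> roughly {observed_rr / bias_rr:.2f} after indirect adjustment")
```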

A recent study in birth defects epidemiology demonstrates the power of sibling cohorts in addressing the problem of residual confounding in observational population studies with limited information about confounding variables. Researchers looking at various birth defect outcomes among offspring of women who used certain antidepressants in early pregnancy generally found no associations in pooled data from Iceland, Norway, Sweden, Finland, and Denmark. A putative association between maternal antidepressant use and a specific kind of cardiac defect (right ventricular outflow tract obstruction or RVOTO) did appear in the overall analysis, but was reversed when the analysis was limited to the sibling subcohort. The study found an apparent association between RVOTO defects and first trimester maternal exposure to selective serotonin reuptake inhibitors, with an adjusted odds ratio of 1.48 (95% C.I., 1.15, 1.89), but the adjusted sibling analysis yielded an odds ratio of 0.56 (95% C.I., 0.21, 1.49)[11]. This study and many others show how creative analyses can elucidate and quantify the direction and magnitude of confounding effects in observational epidemiology.

Systematic bias has also begun to succumb to more quantitative approaches. A recent guidance paper by well-known authors encourages the use of quantitative bias analysis to provide estimates of uncertainty due to systematic errors[12].
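
To give a flavor of what such an analysis looks like in its simplest form, the sketch below corrects a case-control table for imperfect exposure classification using assumed sensitivity and specificity. Every number is hypothetical, and misclassification is only one of the several bias models the guidance paper addresses:

```python
# Simple quantitative bias analysis: correct nondifferential exposure misclassification
# in a case-control study. All counts, sensitivities, and specificities are hypothetical;
# in practice they would come from validation data or be varied over a plausible range.

def corrected_exposed(observed_exposed, total, sensitivity, specificity):
    """Back-calculate the true number exposed from the observed (misclassified) count."""
    return (observed_exposed - (1 - specificity) * total) / (sensitivity + specificity - 1)

def odds_ratio(exp_cases, total_cases, exp_controls, total_controls):
    return (exp_cases * (total_controls - exp_controls)) / (exp_controls * (total_cases - exp_cases))

cases_exposed, cases_total = 100, 200
controls_exposed, controls_total = 60, 200
sens, spec = 0.85, 0.95   # assumed classification accuracy, same for cases and controls

true_cases_exposed = corrected_exposed(cases_exposed, cases_total, sens, spec)
true_controls_exposed = corrected_exposed(controls_exposed, controls_total, sens, spec)

print(f"Observed OR:  {odds_ratio(cases_exposed, cases_total, controls_exposed, controls_total):.2f}")
print(f"Corrected OR: {odds_ratio(true_cases_exposed, cases_total, true_controls_exposed, controls_total):.2f}")
# Observed OR:  2.33   Corrected OR: 2.83 -- here the nondifferential misclassification
# biased the observed estimate toward the null.
```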

Although the courts have failed to articulate the nature and consequences of erroneous inference, some authors would reduce all of Rule 702 (and perhaps 704, 403 as well) to a requirement that proffered expert witnesses “account” for the known and potential errors in their opinions:

“If an expert can account for the measurement error, the random error, and the systematic error in his evidence, then he ought to be permitted to testify. On the other hand, if he should fail to account for any one or more of these three types of error, then his testimony ought not be admitted.”

Mark Haug & Emily Baird, “Finding the Error in Daubert,” 62 Hastings L.J. 737, 739 (2011).

Like most antic proposals to revise Rule 702, this reform vision shuts out the full range of Rule 702’s remedial scope. Scientists certainly try to identify potential sources of error, but they are not necessarily very good at it. See Richard Horton, “Offline: What is medicine’s 5 sigma?” 385 Lancet 1380 (2015) (“much of the scientific literature, perhaps half, may simply be untrue”). And as Holmes pointed out[13], certitude is not certainty, and expert witnesses are not likely to be good judges of their own inferential errors[14]. Courts continue to say and do wildly inconsistent things in the course of gatekeeping. Compare In re Zoloft (Sertraline Hydrochloride) Products, 26 F. Supp. 3d 449, 452 (E.D. Pa. 2014) (excluding expert witness) (“The experts must use good grounds to reach their conclusions, but not necessarily the best grounds or unflawed methods.”), with Gutierrez v. Johnson & Johnson, 2006 WL 3246605, at *2 (D.N.J. November 6, 2006) (denying motions to exclude expert witnesses) (“The Daubert inquiry was designed to shield the fact finder from flawed evidence.”).

[1] See, e.g., Rabozzi v. Bombardier, Inc., No. 5:03-CV-1397 (NAM/DEP), 2007 U.S. Dist. LEXIS 21724, at *7, *8, *20 (N.D.N.Y. Mar. 27, 2007) (excluding testimony from civil engineer about boat design, in part because witness failed to provide rate of error); Sorto-Romero v. Delta Int’l Mach. Corp., No. 05-CV-5172 (SJF) (AKT), 2007 U.S. Dist. LEXIS 71588, at *22–23 (E.D.N.Y. Sept. 24, 2007) (excluding engineering opinion that defective wood-carving tool caused injury because of lack of error rate); Phillips v. Raymond Corp., 364 F. Supp. 2d 730, 732–33 (N.D. Ill. 2005) (excluding biomechanics expert witness who had not reliably tested his claims in a way to produce an accurate rate of error); Roane v. Greenwich Swim Comm., 330 F. Supp. 2d 306, 309, 319 (S.D.N.Y. 2004) (excluding mechanical engineer, in part because witness failed to provide rate of error); Nook v. Long Island R.R., 190 F. Supp. 2d 639, 641–42 (S.D.N.Y. 2002) (excluding industrial hygienist’s opinion in part because witness was unable to provide a known rate of error).

[2] See, e.g., United States v. Microtek Int’l Dev. Sys. Div., Inc., No. 99-298-KI, 2000 U.S. Dist. LEXIS 2771, at *2, *10–13, *15 (D. Or. Mar. 10, 2000) (excluding polygraph data based upon showing that claimed error rate came from highly controlled situations, and that “real world” situations led to much higher (10%) false-positive error rates); Meyers v. Arcudi, 947 F. Supp. 581 (D. Conn. 1996) (excluding polygraph in civil action).

[3] See, e.g., United States v. Ewell, 252 F. Supp. 2d 104, 113–14 (D.N.J. 2003) (rejecting defendant’s objection to government’s failure to quantify laboratory error rate); United States v. Shea, 957 F. Supp. 331, 334–45 (D.N.H. 1997) (rejecting objection to government witness’s providing separate match and error probability rates).

[4] For a typical judicial misstatement, see In re Zoloft Products, 26 F. Supp.3d 449, 454 (E.D. Pa. 2014) (“A 95% confidence interval means that there is a 95% chance that the ‘‘true’’ ratio value falls within the confidence interval range.”).

[5] From my experience, this fallacious argument is advanced by both plaintiffs’ and defendants’ counsel and expert witnesses. See also Mark Haug & Emily Baird, “Finding the Error in Daubert,” 62 Hastings L.J. 737, 751 & n.72 (2011).

[6] See David L. Faigman, et al. eds., Modern Scientific Evidence: The Law and Science of Expert Testimony § 6:36, at 359 (2007–08) (“it is easy to mistake the p-value for the probability that there is no difference”)

[7] Brock v. Merrell Dow Pharmaceuticals, Inc., 874 F.2d 307, 311-12 (5th Cir. 1989), modified, 884 F.2d 166 (5th Cir. 1989), cert. denied, 494 U.S. 1046 (1990). As with any error of this sort, there is always the question whether the judges were entrapped by the parties or their expert witnesses, or whether the judges came up with the fallacy on their own.

[8] See Joel B Greenhouse, “Commentary: Cornfield, Epidemiology and Causality,” 38 Internat’l J. Epidem. 1199 (2009).

[9] Olav Axelson & Kyle Steenland, “Indirect methods of assessing the effects of tobacco use in occupational studies,” 13 Am. J. Indus. Med. 105 (1988); Olav Axelson, “Confounding from smoking in occupational epidemiology,” 46 Brit. J. Indus. Med. 505 (1989); Olav Axelson, “Aspects on confounding in occupational health epidemiology,” 4 Scand. J. Work Envt’l Health 85 (1978).

[10] See, e.g., David Kriebel, Ariana Zeka, Ellen A. Eisen, and David H. Wegman, “Quantitative evaluation of the effects of uncontrolled confounding by alcohol and tobacco in occupational cancer studies,” 33 Internat’l J. Epidem. 1040 (2004).

[11] Kari Furu, Helle Kieler, Bengt Haglund, Anders Engeland, Randi Selmer, Olof Stephansson, Unnur Anna Valdimarsdottir, Helga Zoega, Miia Artama, Mika Gissler, Heli Malm, and Mette Nørgaard, “Selective serotonin reuptake inhibitors and venlafaxine in early pregnancy and risk of birth defects: population based cohort study and sibling design,” 350 Brit. Med. J. 1798 (2015).

[12] Timothy L. Lash, Matthew P. Fox, Richard F. MacLehose, George Maldonado, Lawrence C. McCandless, and Sander Greenland, “Good practices for quantitative bias analysis,” 43 Internat’l J. Epidem. 1969 (2014).

[13] Oliver Wendell Holmes, Jr., Collected Legal Papers at 311 (1920) (“Certitude is not the test of certainty. We have been cock-sure of many things that were not so.”).

[14] See, e.g., Daniel Kahneman & Amos Tversky, “Judgment under Uncertainty:  Heuristics and Biases,” 185 Science 1124 (1974).

ALI Reporters Are Snookered by Racette Fallacy

April 27th, 2015

In the Reference Manual on Scientific Evidence, the authors of the epidemiology chapter advance instances of acceleration of onset of disease as an example of a situation in which reliance upon doubling of risk will not provide a reliable probability of causation calculation[1]. In a previous post, I suggested that the authors’ assertion may be unfounded. See “Reference Manual on Scientific Evidence on Relative Risk Greater Than Two For Specific Causation Inference” (April 25, 2014). Several epidemiologic methods would permit the calculation of relative risk within specific time windows from first exposure.
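
The doubling-of-risk argument rests on a one-line calculation, sketched below with hypothetical relative risks. On the conventional (and contested) assumption that the attributable fraction among the exposed can stand in for the probability of causation, a relative risk above two is what pushes that probability above 50%; a relative risk estimated within a specific time window from first exposure would simply be plugged into the same formula for the relevant window:

```python
# Probability of causation as the attributable fraction among the exposed: (RR - 1) / RR.
# The relative risks below are hypothetical, for illustration only.

def probability_of_causation(rr):
    return (rr - 1) / rr

for rr in (1.5, 2.0, 3.0):
    print(f"RR = {rr:3.1f} -> probability of causation = {probability_of_causation(rr):.0%}")

# RR = 1.5 -> probability of causation = 33%
# RR = 2.0 -> probability of causation = 50%
# RR = 3.0 -> probability of causation = 67%
```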

The American Law Institute (ALI) Reporters, for the Restatement of Torts, make similar claims.[2] First, the Reporters, citing the Manual’s second edition, repeat the Manual’s claim that:

“Epidemiologists, however, do not seek to understand causation at the individual level and do not use incidence rates in group studies to determine the cause of an individual’s disease.”

American Law Institute, Restatement (Third) of Torts: Liability for Physical and Emotional Harm § 28(a) cmt. c(4) & rptrs. notes (2010) [Comment c(4)]. In making this claim, the Reporters ignore an extensive body of epidemiologic studies on genetic associations and on biomarkers, which do address causation implicitly or explicitly, on an individual level.

The Reporters also repeat the Manual’s doubtful claim that acceleration of onset of disease prevents an assessment of attributable risk, although they acknowledge that an average earlier age of onset would form the basis of damages calculations rather than calculations for damages for an injury that would not have occurred but for the tortious exposure. Comment c(4). The Reporters go a step further than the Manual, however, and provide an example of the acceleration-of-onset studies that they have in mind:

“For studies whose results suggest acceleration, see Brad A. Racette, “Welding-Related Parkinsonism: Clinical Features, Treatments, and Pathophysiology,” 56 Neurology 8, 12 (2001) (stating that authors “believe that welding acts as an accelerant to cause [Parkinson’s Disease]” … ).”

The citation to Racette’s 2001 paper[3] is curious, interesting, disturbing, and perhaps revealing. In this 2001 paper, Racette misrepresented the type of study he claimed to have done, and the inferences he drew from his case series are invalid. Anyone experienced in the field of epidemiology would have dismissed this study, its conclusions, and its suggested relation between welding and parkinsonism.

Dr. Brad A. Racette teaches and practices neurology at Washington University in St. Louis, across the river from a hotbed of mass tort litigation, Madison County, Illinois. In the 1990s, Racette received referrals from plaintiffs’ attorneys to evaluate their clients in litigation over exposure to welding fumes. Plaintiffs were claiming that their occupational exposures caused them to develop manganism, a distinctive parkinsonism that differs from Parkinson’s disease [PD], but has signs and symptoms that might be confused with PD by unsophisticated physicians unfamiliar with both manganism and PD.

After the publication of his 2001 paper, Racette became the darling of felon Dicky Scruggs and other plaintiffs’ lawyers. The litigation industrialists invited Racette and his team down to Alabama and Mississippi, to conduct screenings of welding tradesmen, recruited by Scruggs and his team, for potential lawsuits for PD and parkinsonism. The result was a paper that helped Scruggs propel a litigation assault against the welding industry.[4]

Racette’s 2001 paper was accompanied by a press release, as many of his papers have been, in which he was quoted as stating that “[m]anganism is a very different disease” from PD. Gila Reckess, “Welding, Parkinson’s link suspected” (Feb. 9, 2001)[5].

Racette’s 2001 paper provoked a strongly worded letter that called Racette and his colleagues out for misrepresenting the nature of their work:

“The authors describe their work as a case–control study. Racette et al. ascertained welders with parkinsonism and compared their concurrent clinical features to those of subjects with PD. This is more consistent with a cross-sectional design, as the disease state and factors of interest were ascertained simultaneously. Cross-sectional studies are descriptive and therefore cannot be used to infer causation.”


“The data reported by Racette et al. do not necessarily support any inference about welding as a risk factor in PD. A cohort study would be the best way to evaluate the role of welding in PD.”

Bernard Ravina, Andrew Siderowf, John Farrar, Howard Hurtig, “Welding-related parkinsonism: Clinical features, treatment, and pathophysiology,” 57 Neurology 936, 936 (2001).

As we will see, Dr. Ravina and his colleagues were charitable to suggest that the study was more compatible with a cross-sectional study. Racette had set out to determine “whether welding-related parkinsonism differs from idiopathic PD.” He claimed that he had “performed a case-control study,” with a case group of welders and two control groups. His inferences drawn from his “data” are, however, fallacious because he employed an invalid study design.

In reality, Racette’s paper was nothing more than a chart review, a case series of 15 “welders” in the context of a movement disorder clinic. After his clinical and radiographic evaluation, Racette found that these 15 cases were clinically indistinguishable from PD, and thus unlike manganism. Racette did not reveal whether any of these 15 welders had been referred by plaintiffs’ counsel; nor did he suggest that these welding tradesmen made up a disproportionate number of his patient base in St. Louis, Missouri.

Racette compared his selected 15 career welders with PD to his general movement disorders clinic patient population. From the patient population, Racette deployed two “control” groups, one matched for age and sex with the 15 welders, and the other group not matched. The American Law Institute Reporters are indeed correct that Racette suggested that the average age of onset for these 15 welders was lower than that for his non-welder patients, but their uncritical embrace overlooked the fact that Racette’s observation does not support his claimed inference that, in welders, “welding exposure acts as an accelerant to cause PD.”

Racette’s claimed inference is remarkable because he did not perform an analytical epidemiologic study that was capable of generating causal inferences. His paper incongruously presents odds ratios, although the controls have PD, the disease of interest, which invalidates any analytical inference from his case series. Given the referral and selection biases inherent in tertiary-care specialty practices, this paper can provide no reliable inferences about associations or differences in ages of onset. Even within the confines of a case series misrepresented to be a case-control study, Racette acknowledged that “[s]ubsequent comparisons of the welders with age-matched controls showed no significant differences.”


That Racette wrongly identified his paper as a case-control study is beyond debate. How the journal Neurology accepted the paper for publication is a mystery. The acceptance of the inference by the ALI Reporters, lawyers, and judges is regrettable.

Structurally, Racette’s paper could never qualify as a case-control study, or any other analytical epidemiologic study. Here is how a leading textbook on case-control studies defines a case-control study:

“In a case-control study, individuals with a particular condition or disease (the cases) are selected for comparison with a series of individuals in whom the condition or disease is absent (the controls).”

James J. Schlesselman, Case-control Studies. Design, Conduct, Analysis at 14 (N.Y. 1982)[6].

Every patient in Racette’s paper, welders and non-welders alike, has the outcome of interest, PD. There is no epidemiologic study design that corresponds to what Racette did, and there is no way to draw any useful inference from Racette’s comparisons. Racette’s paper violates the key principle for a proper case-control study; namely, all subjects must be selected independently of the study exposure that is under investigation. Schlesselman stressed that identifying an eligible case or control must not depend upon that person’s exposure status for any factor under consideration. Id. Racette’s 2001 paper deliberately violated this basic principle.

Racette’s study design, with only cases with the outcome of interest appearing in the analysis, recklessly obscures the underlying association between the exposure (welding) and age in the population. We would, of course, expect self-identified welders to be younger than the average Parkinson’s disease patient because welding is physical work that requires good health. An equally fallacious study could be cobbled together to “show” that the age-of-onset of Parkinson’s disease for sitcom actors (such as Michael J. Fox) is lower than the age-of-onset of Parkinson’s disease for Popes (such as John Paul II). Sitcom actors are generally younger as a group than Popes. Comparing age of onset between disparate groups that have different age distributions generates a biased comparison and an erroneous inference.

The invalidity and fallaciousness of Racette’s approach to studying the age-of-onset issue of PD in welders, and his uncritical inferences, have been extensively commented upon in the general epidemiologic literature. For instance, studies that compared the age at death for left-handed versus right-handed persons reported that left handers died, on average, nine years earlier, leading to (unfounded) speculation that the earlier mortality resulted from birth and life stressors and accidents for left handers living in a world designed to accommodate right-handed persons[7]. The inference has been shown to be fallacious and the result of social pressure in the early twentieth century to push left handers to use their right hands, a prejudicial practice that abated over the decades of the last century. Left handers born later in the century were less likely to be “switched,” whereas persons born earlier, who are now dying, were less likely to be classified as left-handed, as a result of a birth-cohort effect[8]. When proper prospective cohort studies were conducted, valid data showed that left-handers and right-handers have equivalent mortality rates[9].
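
The fallacy can be reproduced with a few lines of simulation. In the sketch below, both groups have identical age-specific disease risk; the only difference is the hypothetical age structure of the two groups, so any difference in mean age at onset among the cases is an artifact of the groups’ ages, not of any exposure effect:

```python
import random

# Simulate two groups with identical age-specific disease risk but different age
# structures. All numbers are hypothetical, for illustration only.
random.seed(0)

def mean_age_at_onset(ages, risk_slope=0.001):
    """Mean age among those who develop the disease; risk rises with age, identically in both groups."""
    cases = [age for age in ages if random.random() < risk_slope * age]
    return sum(cases) / len(cases)

younger_group = [random.gauss(40, 8) for _ in range(100_000)]   # a young working population
older_group = [random.gauss(65, 10) for _ in range(100_000)]    # an older clinic population

print(f"Mean age at onset, younger group: {mean_age_at_onset(younger_group):.1f}")
print(f"Mean age at onset, older group:   {mean_age_at_onset(older_group):.1f}")
# The younger group's cases are younger only because the group itself is younger; the
# age-specific risk is identical. Comparing mean ages at onset says nothing about causation.
```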

Epidemiologist Ken Rothman addressed the fallacy of Racette’s paper at some length in one of his books:

“Suppose we study two groups of people and look at the average age at death among those who die. In group A, the average age of death is 4 years; in group B, it is 28 years. Can we say that being a member of group A is riskier than being a member of group B? We cannot… . Suppose that group A comprises nursery school students and group B comprises military commandos. It would be no surprise that the average age at death of people who are currently military commandos is 28 years or that the average age of people who are currently nursery students is 4 years. …

In a study of factory workers, an investigator inferred that the factory work was dangerous because the average age of onset of a particular kind of cancer was lower in these workers than among the general population. But just as for the nursery school students and military commandos, if these workers were young, the cancers that occurred among them would have to be occurring in young people. Furthermore, the age of onset of a disease does not take into account what proportion of people get the disease.

These examples reflect the fallacy of comparing the average age at which death or disease strikes rather than comparing the risk of death between groups of the same age.”

Kenneth J. Rothman, “Introduction to Epidemiologic Thinking,” in Epidemiology: An Introduction at 5-6 (N.Y. 2002).

And here is how another author of Modern Epidemiology[10] addressed the Racette fallacy in a different context involving PD:

“Valid studies of age-at-onset require no underlying association between the risk factor and aging or birth cohort in the source population. They must also consider whether a sufficient induction time has passed for the risk factor to have an effect. When these criteria and others cannot be satisfied, age-specific or standardized risks or rates, or a population-based case-control design, must be used to study the association between the risk factor and outcome. These designs allow the investigator to disaggregate the relation between aging and the prevalence of the risk factor, using familiar methods to control confounding in the design or analysis. When prior knowledge strongly suggests that the prevalence of the risk factor changes with age in the source population, case-only studies may support a relation between the risk factor and age-at-onset, regardless of whether the inference is justified.”

Jemma B. Wilk & Timothy L. Lash, “Risk factor studies of age-at-onset in a sample ascertained for Parkinson disease affected sibling pairs: a cautionary tale,” 4 Emerging Themes in Epidemiology 1 (2007) (internal citations omitted) (emphasis added).

A properly designed epidemiologic study would have avoided Racette’s fallacy. A relevant cohort study would have enrolled welders in the study at the outset of their careers, and would have continued to follow them even if they changed occupations. A case-control study would have enrolled cases with PD and controls without PD (or more broadly, parkinsonism), with cases and controls selected independently of their exposure to welding fumes. Either method would have determined the rate of PD in both groups, absolutely or relatively. Racette’s paper, which completely lacked subjects without PD, could not possibly have accomplished his stated objectives, and it did not support his claims.

Racette’s questionable work provoked a mass tort litigation and, ultimately, federal Multi-District Litigation 1535.[11] Analytical epidemiologic studies consistently showed no association between welding and PD. A meta-analysis published in 2012 ended the debate[12] as a practical matter, and MDL 1535 is no more. How strange that the ALI Reporters chose the Racette work as an example of their claims about acceleration of onset!

[1] Michael D. Green, D. Michal Freedman, and Leon Gordis, “Reference Guide on Epidemiology,” in Federal Judicial Center, Reference Manual on Scientific Evidence 549, 614 (Wash., DC 3d ed. 2011).

[2] Michael D. Green was an ALI Reporter, and of course, an author of the chapter in the Reference Manual.

[3] Brad A. Racette, L. McGee-Minnich, S. M. Moerlein, J. W. Mink, T. O. Videen, and Joel S. Perlmutter, “Welding-related parkinsonism: clinical features, treatment, and pathophysiology,” 56 Neurology 8 (2001).

[4] See Brad A. Racette, S.D. Tabbal, D. Jennings, L. Good, Joel S. Perlmutter, and Brad Evanoff, “Prevalence of parkinsonism and relationship to exposure in a large sample of Alabama welders,” 64 Neurology 230 (2005); Brad A. Racette, et al., “A rapid method for mass screening for parkinsonism,” 27 Neurotoxicology 357 (2006) (duplicate publication of the earlier, 2005, paper).

[5] Previously available at <>, last visited on June 27, 2005.

[6] See also Brian MacMahon & Dimitrios Trichopoulos, Epidemiology. Principles and Methods at 229 (2d ed. 1996) (“A case-control study is an inquiry in which groups of individuals are selected based on whether they do (the cases) or do not (the controls) have the disease of which the etiology is to be studied.”); Jennifer L. Kelsey, W.D. Thompson, A.S. Evans, Methods in Observational Epidemiology at 148 (N.Y. 1986) (“In a case-control study, persons with a given disease (the cases) and persons without the disease (the controls) are selected … .”).

[7] See, e.g., Diane F. Halpern & Stanley Coren, “Do right-handers live longer?” 333 Nature 213 (1988); Diane F. Halpern & Stanley Coren, “Handedness and life span,” 324 New Engl. J. Med. 998 (1991).

[8] Kenneth J. Rothman, “Left-handedness and life expectancy,” 325 New Engl. J. Med. 1041 (1991) (pointing out that, by the age-of-onset comparison method, nursery education would be found more dangerous than paratrooper training, given that the age at death of pre-schoolers who died would be much lower than that of paratroopers who died); see also Martin Bland & Doug Altman, “Do the left-handed die young?” Significance 166 (Dec. 2005).

[9] See Philip A. Wolf, Ralph B. D’Agostino, Janet L. Cobb, “Left-handedness and life expectancy,” 325 New Engl. J. Med. 1042 (1991); Marcel E. Salive, Jack M. Guralnik & Robert J. Glynn, “Left-handedness and mortality,” 83 Am. J. Public Health 265 (1993); Olga Basso, Jørn Olsen, Niels Holm, Axel Skytthe, James W. Vaupel, and Kaare Christensen, “Handedness and mortality: A follow-up study of Danish twins born between 1900 and 1910,” 11 Epidemiology 576 (2000). See also Martin Wolkewitz, Arthur Allignol, Martin Schumacher, and Jan Beyersmann, “Two Pitfalls in Survival Analyses of Time-Dependent Exposure: A Case Study in a Cohort of Oscar Nominees,” 64 Am. Statistician 205 (2010); Michael F. Picco, Steven Goodman, James Reed, and Theodore M. Bayless, “Methodologic pitfalls in the determination of genetic anticipation: the case of Crohn’s disease,” 134 Ann. Intern. Med. 1124 (2001).

[10] Kenneth J. Rothman, Sander Greenland, Timothy L. Lash, eds., Modern Epidemiology (3d ed. 2008).

[11] Dicky Scruggs served on the Plaintiffs’ Steering Committee until his conviction on criminal charges.

[12] James Mortimer, Amy Borenstein, and Lorene Nelson, “Associations of welding and manganese exposure with Parkinson disease: Review and meta-analysis,” 79 Neurology 1174 (2012).