TORTINI

For your delectation and delight, desultory dicta on the law of delicts.

The 5% Solution at the FDA

February 24th, 2018

The statistics wars rage on1, with Bayesians attempting to take advantage of the so-called replication crisis to argue that it is all the fault of frequentist significance testing. In 2016, there was an attempted coup at the American Statistical Association, but the Bayesians did not get what they wanted, achieving little more than a consensus that p-values and confidence intervals should be properly interpreted. Patient advocacy groups have lobbied for the availability of unapproved and incompletely tested medications, and the rent-seeking Lawsuit Industry has argued and lobbied for the elimination of statistical tests and methods in the assessment of causal claims. The battle continues.

Against this backdrop, a young Harvard graduate student has published a paper with a brief history of significance testing, and the role that significance testing has taken on at the United States Food and Drug Administration (FDA). Lee Kennedy-Shaffer, “When the Alpha is the Omega: P-Values, ‘Substantial Evidence’, and the 0.05 Standard at FDA,” 72 Food & Drug L.J. 595 (2017) [cited below as K-S]. The paper presents a short but entertaining history of the evolution of the p-value from its early invocation in 1710, by John Arbuthnott, a Scottish physician and mathematician, who calculated the probability that male births would exceed female births for 82 consecutive years if their true proportions were equal. K-S at 603. Kennedy-Shaffer notes the role of the two great French mathematicians, Pierre-Simon Laplace and Siméon-Denis Poisson, who used p-values (or their complements) to evaluate empirical propositions. As Kennedy-Shaffer notes, Poisson observed that the equivalent of what would be a modern p-value of about 0.005 was sufficient, in his view, back in 1830, to believe that the French Revolution of 1830 had caused the pattern of jury verdicts to be changed. K-S at 604.
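Arbuthnott’s computation is simple enough to reproduce. A minimal sketch (in Python, purely illustrative): under the null hypothesis of equal proportions, the chance of 82 consecutive male-majority years is (1/2)^82.

```python
from fractions import Fraction

# Under the null hypothesis that male and female births are equally
# likely, each year is an independent "coin flip"; the probability of
# 82 consecutive male-majority years is (1/2)^82.
p_value = Fraction(1, 2) ** 82

print(float(p_value))  # ~2.07e-25, an astronomically small p-value
```

Arbuthnott took this vanishingly small probability as evidence against the chance hypothesis, which is the same logical move a modern significance test makes.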

Kennedy-Shaffer traces the p-value, or its equivalent, through its treatment by the great early 20th century statisticians, Karl Pearson and Ronald A. Fisher, through its modification by Jerzy Neyman and Egon Pearson, into the bowels of the FDA in Rockville, Maryland. It is a history well worth recounting, if for no other reason, to remind us that the p-value or its equivalent has been remarkably durable and reasonably effective in protecting the public against false claims of safety and efficacy. Kennedy-Shaffer provides several good examples in which the FDA’s use of significance testing was outcome dispositive of approval or non-approval of medications and devices.

There is enough substance and history here that everyone will have something to pick at in this paper. Let me volunteer the first shot. Kennedy-Shaffer describes the co-evolution of the controlled clinical trial and statistical tests, and points to the landmark study by the Medical Research Council on streptomycin for tuberculosis. Geoffrey Marshall (chairman), “Streptomycin Treatment of Pulmonary Tuberculosis: A Medical Research Council Investigation,” 2 Brit. Med. J. 769, 769–71 (1948). This clinical trial was historically important, not only for its results and for Sir Austin Bradford Hill’s role in its design, but for the care with which it described randomization, double blinding, and multiple study sites. Kennedy-Shaffer suggests that “[w]hile results were presented in detail, few formal statistical tests were incorporated into this analysis.” K-S at 597-98. And yet, a few pages later, he tells us that “both chi-squared tests and t-tests were used to evaluate the responses to the drug and compare the control and treated groups,” and that “[t]he difference in mortality between the two groups is statistically significant.” K-S at 611. Although it is true that the authors did not report their calculated p-values for any test, the difference in mortality between the streptomycin and control groups was very large, and the standards for describing the results of such a clinical trial were in their infancy in 1948.
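For readers unfamiliar with the chi-squared test that Kennedy-Shaffer mentions, a 2×2 test of a mortality table can be sketched in a few lines. The counts below are hypothetical stand-ins chosen for illustration, not the trial’s reported figures:

```python
from math import erfc, sqrt

def chi2_2x2(a, b, c, d):
    """Pearson chi-squared statistic (1 df) for the 2x2 table
    [[a, b], [c, d]], with its p-value from the chi-squared
    survival function (for 1 df, sf(x) = erfc(sqrt(x/2)))."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return stat, erfc(sqrt(stat / 2))

# Hypothetical mortality counts (NOT the trial's actual data):
# treated: 4 deaths, 51 survivors; control: 14 deaths, 38 survivors
stat, p = chi2_2x2(4, 51, 14, 38)
print(round(stat, 2), p < 0.05)  # → 7.38 True
```

A difference of this size clears the conventional 0.05 threshold comfortably, which is consistent with the observation that the 1948 results needed little statistical elaboration.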

Kennedy-Shaffer’s characterization of Sir Austin Bradford Hill’s use of statistical tests and methods takes on outsize importance because of the mischaracterizations, and even misrepresentations, made by some representatives of the Lawsuit Industry, who contend that Sir Austin dismissed statistical methods as unnecessary. In the United States, some judges have been seriously misled by those misrepresentations, which have made their way into published judicial decisions.

The operative document, of course, is the publication of Sir Austin’s famous after-dinner speech, in 1965, on the occasion of his election to the Presidency of the Royal Society of Medicine. Although the speech is casual and free of scholarly footnotes, Sir Austin’s message was precise, balanced, and nuanced. The speech is a classic in the history of medicine, which remains important even if rather dated in terms of its primary message about how science and medicine move from beliefs about associations to knowledge of causal associations. As everyone knows, Sir Austin articulated nine factors or viewpoints through which to assess any putative causal association, but he emphasized that before these nine factors are assessed, our starting point itself has prerequisites:

Disregarding then any such problem in semantics we have this situation. Our observations reveal an association between two variables, perfectly clear-cut and beyond what we would care to attribute to the play of chance. What aspects of that association should we especially consider before deciding that the most likely interpretation of it is causation?”

Austin Bradford Hill, “The Environment and Disease: Association or Causation?” 58 Proc. Royal Soc’y Med. 295, 295 (1965) [cited below as Hill]. The starting point, therefore, before the Bradford Hill nine factors come into play, is a “clear-cut” association, which is “beyond what we would care to attribute to the play of chance.”

In other words, consideration of random error is necessary.

Now for the nuance and the balance. Sir Austin acknowledged that there were some situations in which we simply do not need to calculate standard errors because the disparity between treatment and control groups is so large and meaningful. He goes on to wonder out loud:

whether the pendulum has not swung too far – not only with the attentive pupils but even with the statisticians themselves. To decline to draw conclusions without standard errors can surely be just as silly? Fortunately I believe we have not yet gone so far as our friends in the USA where, I am told, some editors of journals will return an article because tests of significance have not been applied. Yet there are innumerable situations in which they are totally unnecessary – because the difference is grotesquely obvious, because it is negligible, or because, whether it be formally significant or not, it is too small to be of any practical importance. What is worse the glitter of the t table diverts attention from the inadequacies of the fare.”

Hill at 299. Now this is all true, but hardly the repudiation of statistical testing claimed by those who want to suppress the consideration of random error from science and judicial gatekeeping. There are very few litigation cases in which the difference between exposed and unexposed is “grotesquely obvious,” such that we can leave statistical methods at the door. Importantly, the very large differences between the streptomycin and placebo control groups in the Medical Research Council’s 1948 clinical trial were not so “grotesquely obvious” that statistical methods were obviated. To be fair, the differences were sufficiently great that statistical discussion could be kept to a minimum. Sir Austin gave extensive tables in the 1948 paper to let the reader appreciate the actual data themselves.

In his after-dinner speech, Hill also gives examples of studies that are so biased and confounded that no statistical method will likely ever save them. Certainly, the technology of regression and propensity-score analyses has progressed tremendously since Hill’s 1965 speech, but his point remains. That point, however, hardly excuses the lack of statistical apparatus in highly confounded or biased observational studies.

In addressing the nine factors he identified, which presumed a “clear-cut” association, with random error ruled out, Sir Austin did opine that the factors raised questions, and that:

No formal tests of significance can answer those questions. Such tests can, and should, remind us of the effects that the play of chance can create, and they will instruct us in the likely magnitude of those effects. Beyond that they contribute nothing to the ‘proof’ of our hypothesis.”

Hill at 299. Again, the date and the context are important. Hill is addressing consideration of the nine factors, not the required predicate association beyond the play of chance or random error. The date is important as well, because it would be foolish to suggest that statistical methods have not grown in the last half century to address some of the nine factors. The existence and the nature of dose-response are the subject of extensive statistical methods, and meta-analysis and meta-regression are used to assess and measure consistency between studies.

Kennedy-Shaffer might well have pointed out the great influence Sir Austin’s textbook on medical statistics had had on medical research and practice. This textbook, which went through numerous editions, makes clear the importance of statistical testing and methods:

Are simple methods of the interpretation of figures only a synonym for common sense or do they involve an art or knowledge which can be imparted? Familiarity with medical statistics leads inevitably to the conclusion that common sense is not enough. Mistakes which when pointed out look extremely foolish are quite frequently made by intelligent persons, and the same mistakes, or types of mistakes, crop up again and again. There is often lacking what has been called a ‘statistical tact, which is rather more than simple good sense’. That tact the majority of persons must acquire (with a minority it is undoubtedly innate) by a study of the basic principles of statistical method.”

Austin Bradford Hill, Principles of Medical Statistics at 2 (4th ed. 1948) (emphasis in original). And later in his text, Sir Austin notes that:

The statistical method is required in the interpretation of figures which are at the mercy of numerous influences, and its object is to determine whether individual influences can be isolated and their effects measured.”

Id. at 10 (emphasis added).

Sir Austin’s work taken as a whole demonstrates the acceptance of the necessity of statistical methods in medicine and in causal inference. Kennedy-Shaffer’s paper covers much ground, but it shortchanges this important line of influence, which lies directly in the historical path between Sir Ronald Fisher and the medical regulatory community.

Kennedy-Shaffer gives a nod to Bayesian methods, and even suggests that Bayesian results are “more intuitive,” but he does not explain the supposed intuitiveness of how a parameter has a probability distribution. This might make sense at the level of quantum physics, but does not seem to describe the reality of a biomedical phenomenon such as relative risk. Kennedy-Shaffer notes the FDA’s expression of willingness to entertain Bayesian analyses of clinical trials, and the rare instances in which such analyses have actually been deployed. K-S at 629 (“e.g., Pravigard Pac for prevention of myocardial infarction”). He concedes, however, that Bayesian designs are still the exception to the rule, and he notes the caution of Robert Temple, a former FDA Director of Medical Policy, who observed in 2005 that Bayesian proposals for drug clinical trials were at that time “very rare.”2 K-S at 630.


2 Robert Temple, “How FDA Currently Makes Decisions on Clinical Studies,” 2 Clinical Trials 276, 281 (2005).

Scientific Evidence in Canadian Courts

February 20th, 2018

A couple of years ago, Deborah Mayo called my attention to the Canadian version of the Reference Manual on Scientific Evidence.1 In the course of discussion of mistaken definitions and uses of p-values, confidence intervals, and significance testing, Sander Greenland pointed to some dubious pronouncements in the Science Manual for Canadian Judges [Manual].

Unlike the United States federal court Reference Manual, which is published through a joint effort of the National Academies of Sciences, Engineering, and Medicine, the Canadian version is the product of the Canadian National Judicial Institute (NJI, or the Institut National de la Magistrature, if you live in Quebec), which claims to be an independent, not-for-profit group that is committed to educating Canadian judges. In addition to the Manual, the Institute publishes Model Jury Instructions and a guide, Problem Solving in Canada’s Courtrooms: A Guide to Therapeutic Justice (2d ed.), as well as conducting educational courses.

The NJI’s website describes the Institute’s Manual as follows:

Without the proper tools, the justice system can be vulnerable to unreliable expert scientific evidence.

         * * *

The goal of the Science Manual is to provide judges with tools to better understand expert evidence and to assess the validity of purportedly scientific evidence presented to them. …”

The Chief Justice of Canada, Hon. Beverley M. McLachlin, contributed an introduction to the Manual, which was notable for its frank admission that:

[w]ithout the proper tools, the justice system is vulnerable to unreliable expert scientific evidence.

* * *

Within the increasingly science-rich culture of the courtroom, the judiciary needs to discern ‘good’ science from ‘bad’ science, in order to assess expert evidence effectively and establish a proper threshold for admissibility. Judicial education in science, the scientific method, and technology is essential to ensure that judges are capable of dealing with scientific evidence, and to counterbalance the discomfort of jurists confronted with this specific subject matter.”

Manual at 14. These are laudable goals, indeed, but did the National Judicial Institute live up to its stated goals, or did it leave Canadian judges vulnerable to the Institute’s own “bad science”?

In his comments on Deborah Mayo’s blog, Greenland noted some rather cavalier statements in Chapter Two, which suggest that the conventional alpha of 5% corresponds to a “scientific attitude that unless we are 95% sure the null hypothesis is false, we provisionally accept it.” And he pointed to other passages in which the chapter seems to suggest that the coefficient of confidence corresponding to an alpha of 5% “constitutes a rather high standard of proof,” thus confusing and conflating the probability of random error with posterior probabilities. Greenland is absolutely correct that the Manual does a rather miserable job of educating Canadian judges if our standard for its work product is accuracy and truth.

Some of the most egregious errors are within what is perhaps the most important chapter of the Manual, Chapter 2, “Science and the Scientific Method.” The chapter has two authors, a scientist, Scott Findlay, and a lawyer, Nathalie Chalifour. Findlay is an Associate Professor in the Department of Biology at the University of Ottawa. Chalifour is an Associate Professor in the Faculty of Law, also at the University of Ottawa. Together, they produced some dubious pronouncements, such as:

Weight of the Evidence (WOE)

First, the concept of weight of evidence in science is similar in many respects to its legal counterpart. In both settings, the outcome of a weight-of-evidence assessment by the trier of fact is a binary decision.”

Manual at 40. Findlay and Chalifour cite no support for their characterization of WOE in science. Most attempts to invoke WOE are woefully vague and amorphous, with no meaningful guidance or content.2  Sixty-five pages later, if anyone is noticing, the authors let us in on a dirty little secret:

at present, there exists no established prescriptive methodology for weight of evidence assessment in science.”

Manual at 105. The authors omit, however, that there are prescriptive methods for inferring causation in science; you just will not see them in discussions of weight of the evidence. The authors then compound the semantic and conceptual problems by stating that “in a civil proceeding, if the evidence adduced by the plaintiff is weightier than that brought forth by the defendant, a judge is obliged to find in favour of the plaintiff.” Manual at 41. This is a remarkable suggestion, which implies that if the plaintiff adduces the crummiest crumb of evidence, a mere peppercorn on the scales of justice, and the defendant has none to offer, then the plaintiff must win. The plaintiff wins notwithstanding that no reasonable person could believe that the plaintiff’s claims are more likely than not true. Even if that were the law of Canada, it is certainly not how scientists think about establishing the truth of empirical propositions.

Confusion of Hypothesis Testing with “Beyond a Reasonable Doubt”

The authors’ next assault comes in conflating significance probability with the probability connected with the burden of proof, a posterior probability. Legal proceedings have a defined burden of proof, with criminal cases requiring the state to prove guilt “beyond a reasonable doubt.” Findlay and Chalifour’s discussion then runs off the rails by likening hypothesis testing, with an alpha of 5% (or its complement, a 95% coefficient of confidence), to a “very high” burden of proof:

In statistical hypothesis-testing – one of the tools commonly employed by scientists – the predisposition is that there is a particular hypothesis (the null hypothesis) that is assumed to be true unless sufficient evidence is adduced to overturn it. But in statistical hypothesis-testing, the standard of proof has traditionally been set very high such that, in general, scientists will only (provisionally) reject the null hypothesis if they are at least 95% sure it is false. Third, in both scientific and legal proceedings, the setting of the predisposition and the associated standard of proof are purely normative decisions, based ultimately on the perceived consequences of an error in inference.”

Manual at 41. This is, as Greenland and many others have pointed out, a totally bogus conception of hypothesis testing, and an utterly false description of the probabilities involved.

Later in the chapter, Findlay and Chalifour flirt with the truth, but then lapse into an unrecognizable parody of it:

Inferential statistics adopt the frequentist view of probability whereby a proposition is either true or false, and the task at hand is to estimate the probability of getting results as discrepant or more discrepant than those observed, given the null hypothesis. Thus, in statistical hypothesis testing, the usual inferred conclusion is either that the null is true (or rather, that we have insufficient evidence to reject it) or it is false (in which case we reject it). The decision to reject or not is based on the value of p: if the estimated value of p is below some threshold value α, we reject the null; otherwise we accept it.”

Manual at 74. OK; so far so good, but here comes the train wreck:

By convention (and by convention only), scientists tend to set α = 0.05; this corresponds to the collective – and, one assumes, consensual – scientific attitude that unless we are 95% sure the null hypothesis is false, we provisionally accept it. It is partly because of this that scientists have the reputation of being a notoriously conservative lot, given that a 95% threshold constitutes a rather high standard of proof.”

Manual at 75. Uggh; so we are back to significance probability’s being a posterior probability. As if to atone for their sins, in the very next paragraph, the authors then remind the judicial readers that:

As noted above, p is the probability of obtaining results at least as discrepant as those observed if the null is true. This is not the same as the probability of the null hypothesis being true, given the results.”

Manual at 75. True, true, and completely at odds with what the authors have stated previously. And to add to the reader’s now fully justified confusion, the authors describe the standard for rejecting the null hypothesis as “very high indeed.” Manual at 102, 109. Any reader who is following the discussion might wonder how and why there is such a problem of replication and reproducibility in contemporary science.
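The conflation can be made concrete with a toy simulation, in which the proportion of truly null hypotheses, the effect size, and the study size are all illustrative assumptions. Even among results reaching p < 0.05, the probability that the null is true can be several times 5%, so a significant result is nothing like 95% assurance that the null is false:

```python
import random
from math import erfc, sqrt

random.seed(1)

def two_sided_p(heads, n, p0=0.5):
    """Two-sided p-value for n coin flips, by normal approximation."""
    z = (heads - n * p0) / sqrt(n * p0 * (1 - p0))
    return erfc(abs(z) / sqrt(2))

# Toy world: half of all tested hypotheses are truly null (fair coin),
# half are modest real effects (heads probability 0.55). Each "study"
# flips 100 coins and tests the null of a fair coin.
n, trials = 100, 20_000
sig_null = sig_total = 0
for _ in range(trials):
    null_true = random.random() < 0.5
    p_heads = 0.5 if null_true else 0.55
    heads = sum(random.random() < p_heads for _ in range(n))
    if two_sided_p(heads, n) < 0.05:
        sig_total += 1
        sig_null += null_true

# Among "significant" studies, the share in which the null was in fact
# true runs well above 5%.
print(f"P(null true | p < 0.05) ≈ {sig_null / sig_total:.2f}")
```

The exact figure depends on the assumed mix of true and false hypotheses and on study power, which is precisely the Bayesian machinery that a p-value, standing alone, does not supply.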

Conflating Bayesianism with Frequentist Modes of Inference

We have seen how Findlay and Chalifour conflate significance and posterior probabilities, some of the time. In a section of their chapter that deals explicitly with probability, the authors tell us that before any study is conducted the prior probability of the truth of the tested hypothesis is 50%, sans evidence. This is an astonishing creation of certainty out of nothingness, and perhaps it explains the authors’ implied claim that the crummiest morsel of evidence on one side is sufficient to compel a verdict, if the other side has no morsels at all. Here is how the authors put their claim to the Canadian judges:

Before each study is conducted (that is, a priori), the hypothesis is as likely to be true as it is to be false. Once the results are in, we can ask: How likely is it now that the hypothesis is true? In the first study, the low a priori inferential strength of the study design means that this probability will not be much different from the a priori value of 0.5 because any result will be rather equivocal owing to limitations in the experimental design.”

Manual at 64. This implied Bayesian slant, with 50% priors, in the world of science would lead anyone to believe “as many as six impossible things before breakfast,” and many more throughout the day.
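The arithmetic the Manual gestures at is easy to lay out with Bayes’ theorem; the power and alpha values below are illustrative assumptions, not figures from the Manual. Starting from the Manual’s 50% prior, the posterior probability of the hypothesis, given a “significant” result, turns entirely on the study’s power and its false-positive rate:

```python
def posterior_given_significant(prior, power, alpha):
    """P(hypothesis true | p < alpha), by Bayes' theorem:
    true positives / (true positives + false positives)."""
    true_pos = power * prior
    false_pos = alpha * (1 - prior)
    return true_pos / (true_pos + false_pos)

# Illustrative values: a weak study (power 0.2) vs. a strong one (0.9),
# both with the conventional alpha of 0.05 and the Manual's 50% prior.
weak = posterior_given_significant(prior=0.5, power=0.2, alpha=0.05)
strong = posterior_given_significant(prior=0.5, power=0.9, alpha=0.05)
print(round(weak, 2), round(strong, 2))  # → 0.8 0.95
```

Everything here hinges on the assumed 50% prior; change that assumption and the posteriors move with it, which is exactly why conjuring a prior out of nothingness is no small sin.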

Lest you think that the Manual is all rubbish, there are occasional gems of advice to the Canadian judges. The authors admonish the judges to

be wary of individual ‘statistically significant’ results that are mined from comparatively large numbers of trials or experiments, as the results may be ‘cherry picked’ from a larger set of experiments or studies that yielded mostly negative results. The court might ask the expert how many other trials or experiments testing the same hypothesis he or she is aware of, and to describe the outcome of those studies.”

Manual at 87. Good advice, but at odds with the authors’ characterization of statistical significance as establishing the rejection of the null hypothesis well-nigh beyond a reasonable doubt.

When Greenland first called attention to this Manual, I reached out to some people who had been involved in its peer review. One reviewer told me that it was a “living document,” and would likely be revised after he had the chance to call the NJI’s attention to the errors. But two years later, the errors remain, and so we have to infer that the authors meant to say all the contradictory and false statements that are still present in the downloadable version of the Manual.


2 See “WOE-fully Inadequate Methodology – An Ipse Dixit By Another Name” (May 1, 2012); “Weight of the Evidence in Science and in Law” (July 29, 2017); see also David E. Bernstein, “The Misbegotten Judicial Resistance to the Daubert Revolution,” 89 Notre Dame L. Rev. 27 (2013).

Wrong Words Beget Causal Confusion

February 12th, 2018

In clinical medical and epidemiologic journals, most articles that report about associations will conclude with a discussion section in which the authors hold forth about

(1) how they have found that exposure to X “increases the risk” of Y, and

(2) how their finding makes sense because of some plausible (even if unproven) mechanism.

In an opinion piece in Significance,1 Dalmeet Singh Chawla cites to a study that suggests the “because” language frequently confuses readers into believing that a causal claim is being made. The study abstract explains:

Most researchers do not deliberately claim causal results in an observational study. But do we lead our readers to draw a causal conclusion unintentionally by explaining why significant correlations and relationships may exist? Here we perform a randomized study in a data analysis massive online open course to test the hypothesis that explaining an analysis will lead readers to interpret an inferential analysis as causal. We show that adding an explanation to the description of an inferential analysis leads to a 15.2% increase in readers interpreting the analysis as causal (95% CI 12.8% – 17.5%). We then replicate this finding in a second large scale massive online open course. Nearly every scientific study, regardless of the study design, includes explanation for observed effects. Our results suggest that these explanations may be misleading to the audience of these data analyses.”

Leslie Myint, Jeffrey T. Leek, and Leah R. Jager, “Explanation implies causation?” (Nov. 2017) (on line manuscript).

Invoking the principle of charity, these authors suggest that most researchers are not deliberately claiming causal results. Indeed, the language of biomedical science itself is biased in favor of causal interpretation. The term “statistical significance” suggests causality to naive readers, as does stats talk about “effect size” and “fixed effect models,” for data sets that come nowhere near establishing causality.

Common epidemiologic publication practice tolerates if not encourages authors to state that their study shows (finds, demonstrates, etc.) that exposure to X “increases the risk” of Y in the studies’ samples. This language is deliberately causal, even if the study cannot support a causal conclusion alone or even with other studies. After all, a risk is the antecedent of a cause, and in the stochastic model of causation involved in much of biomedical research, causation will manifest in a change of a base rate to a higher or lower post-exposure rate. Given that mechanism is often unknown and not required, showing an increased risk becomes the whole point. The need to eliminate chance, bias, and confounding, and to evaluate study design, often is lost in the irrational exuberance of declaring the “increased risk.”

Tighter editorial control might have researchers qualify their findings by explaining that they found a higher rate in association with exposure, under the circumstances of the study, followed by an explanation that much more is needed to establish causation. But where is the fun and profit in that?

Journalists, lawyers, and advocacy scientists often use the word “link” to avoid having to endorse associations that they know, or should know, have not been shown to be causal.2 Using “link” as a noun or a verb clearly invokes a causal chain metaphor, which probably is often deliberate. Perhaps publishers would defend the use of “link” by noting that it is so much shorter than “association,” and thus saves typesetting costs.

More attention is needed to word choice, even and especially when statisticians and scientists are using their technical terms and jargon.3 If, for the sake of argument, we accept the sincerity of scientists who work as expert witnesses in litigation in which causal claims are overstated, we can see that poor word choices confuse scientists as well as lay people. Or you can just read the materials and methods and the results of published study papers; skip the introduction and discussion sections, as well as the newspaper headlines.


1 Dalmeet Singh Chawla, “Mind your language,” Significance 6 (Feb. 2018).

2 See, e.g., Perri Klass, M.D., “Does an ADHD Link Mean Tylenol Is Unsafe in Pregnancy?,” N.Y. Times (Dec. 4, 2017), https://www.nytimes.com/2017/12/04/well/family/does-an-adhd-link-mean-tylenol-is-unsafe-in-pregnancy.html; Nicholas Bakalar, “Body Chemistry: Lower Testosterone Linked to Higher Death Risk,” N.Y. Times (Aug. 15, 2006).

3 Fang Xuelan & Graeme Kennedy, “Expressing Causation in Written English,” 23 RELC J. 62 (1992); Bengt Altenberg, “Causal Linking in Spoken and Written English,” 38 Studia Linguistica 20 (1984).

PubMed Refutes Courtroom Historians

February 11th, 2018

Professors Rosner and Markowitz, labor historians, or historians laboring in courtrooms, have made a second career out of testifying about other people’s motivations. Consider their pronouncement:

In the postwar era, professionals, industry, government, and a conservative labor movement tried to bury silicosis as an issue.”

David Rosner & Gerald Markowitz, Deadly Dust: Silicosis and the Politics of Occupational Disease in the Twentieth Century America 213 (Princeton 1991); Gerald Markowitz & David Rosner, “Why Is Silicosis So Important?” Chap. 1, at 27, in Paul-André Rosental, ed., Silicosis: A World History (2017). Their accusation is remarkable for any number of reasons,1 but the most remarkable is that their claim is not merely unverified; it is readily falsified.2

Previously, I have pointed to searches in Google’s Ngram Book viewer as well as in the National Library of Medicine’s database (PubMed) on silicosis. The PubMed website has now started to provide a csv file, with article counts by year, which can be opened in programs such as LibreOffice Calc, Excel, etc., and then used to generate charts of the publication counts over time.
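For readers who wish to replicate the exercise, here is a minimal sketch of the aggregation. The inline sample rows merely mimic the shape of PubMed’s year/count download, and are not actual PubMed counts; the header names and interval start year are illustrative assumptions:

```python
import csv
import io
from collections import Counter

# Inline sample in the same (year, count) shape as PubMed's download;
# in practice you would open the downloaded csv file instead.
sample_csv = """Year,Count
1944,12
1948,30
1952,41
1957,38
1961,22
"""

def counts_by_interval(rows, width=11, start=1944):
    """Sum article counts over consecutive `width`-year intervals."""
    buckets = Counter()
    for row in rows:
        year = int(row["Year"])
        lo = start + ((year - start) // width) * width
        buckets[f"{lo}-{lo + width - 1}"] += int(row["Count"])
    return dict(buckets)

rows = csv.DictReader(io.StringIO(sample_csv))
print(counts_by_interval(rows))  # → {'1944-1954': 83, '1955-1965': 60}
```

The same few lines, pointed at the real PubMed csv, reproduce the interval totals charted below.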

Here is a chart generated from a simple search on <silicosis> in PubMed, with years aggregated over roughly 11 year intervals:

The chart shows that the “professionals,” presumably physicians and scientists, were most busy publishing on, not burying, the silicosis issue exactly when Rosner and Markowitz claimed they were actively suppressing it. Many of the included publications grew out of industry, labor, and government interests and concerns. In their book and in their courtroom performances, Rosner and Markowitz provide mostly innuendo without evidence, but their claim is falsifiable and false.

To be sure, the low count in the 1940s may well result from the relatively fewer journals included in the PubMed database, as well as the growth in the number of biomedical journals after the 1940s. The Post-War era certainly presented distractions in the form of other issues, including the development of antibiotics, chemotherapies for tuberculosis, the spread of poliomyelitis and the development of vaccines for this and other viral diseases, radiation exposure and illnesses, tobacco-related cancers, and other chronic diseases. Given the exponential expansion in scope of public health, the continued interest in silicosis after World War II, documented in the PubMed statistics, is remarkable for its intensity, pace Rosner and Markowitz.


1 “Conspiracy Theories: Historians, In and Out of Court” (April 17, 2013). Not the least of the reasons the group calumny is pertinent is the extent to which it keeps the authors gainfully employed as expert witnesses in litigation.

2 See also CDC, “Ten Great Public Health Achievements – United States, 1900 – 1999,” 48(12) CDC Morbidity and Mortality Weekly Report 241 (April 2, 1999) (“Work-related health problems, such as coal workers’ pneumoconiosis (black lung), and silicosis — common at the beginning of the century — have come under better control.”).

ToxicHistorians Sponsor ToxicDocs

February 1st, 2018

A special issue of the Journal of Public Health Policy waxes euphoric over a website, ToxicDocs, created by two labor historians, David Rosner and Gerald Markowitz (also known as the “Pink Panthers”). The Panthers have gotten their universities, Columbia University and the City University of New York, to host the ToxicDocs website with whole-text searchable documents of what they advertise as “secret internal memoranda, emails, slides, board minutes, unpublished scientific studies, and expert witness reports — among other kinds of documents — that emerged in recent toxic tort litigation.” According to Rosner and Markowitz, they are “constantly adding material from lawsuits involving lead, asbestos, silica, and PCBs, among other dangerous substances.” Rosner and Markowitz are well-positioned to obtain and add such materials because of their long-term consulting and testifying work for the Lawsuit Industry, which has obtained many of these documents in routine litigation discovery proceedings.

Despite the hoopla, the ToxicDocs website is nothing new or novel. Tobacco litigation has spawned several such on-line repositories: the “Truth Tobacco Industry Documents Library,” the “Tobacco Archives,” and the “Tobacco Litigation Documents.” And the Pink Panthers’ efforts to create a public library of the documents upon which they rely in litigation go back several years to earlier websites. See David Heath & Jim Morris, “Exposed: Decades of denial on poisons. Internal documents reveal industry ‘pattern of behavior’ on toxic chemicals,” Center for Public Integrity (Dec. 4, 2014).

The present effort, however, is marked by shameless self-promotion and support from other ancillary members of the Lawsuit Industry. The Special Issue of the Journal of Public Health Policy is introduced by Journal editor Anthony Robbins,1 who was a mover and shaker in the SKAPP enterprise and its efforts to subvert judicial assessments of proffered opinions for validity and methodological propriety. In addition, Robbins, along with the Pink Panthers as guest editors, has recruited additional “high fives” and self-congratulatory cheerleading from other members of, and expert witnesses for, the Lawsuit Industry, as well as zealots of the type who can be counted upon to advocate for weak science and harsh treatment for manufacturing industry.2

Rosner and Markowitz, joined by Merlin Chowkwanyun, add to the happening with their own spin on ToxicDocs.3 As historians, it is understandable that they are out of touch with current technologies, even those decades old. They wax on about the wonders of optical character recognition and whole text search, as though it were quantum computing.

The Pink Panthers liken their “trove” of documents to “Big Data,” but there is nothing quantitative about their collection, and their mistaken analogy ignores their own “Big Bias,” which vitiates much of their collection. These historians have been trash picking in the dustbin of history, and quite selectively at that. You will not likely find documents here that reveal the efforts of manufacturing industry to improve the workplace and the safety and health of their workers.

Rosner and Markowitz disparage their critics as hired guns for industry, but it is hard for them to avoid the label of hired guns for the Lawsuit Industry, an industry with which they have worked in close association for several decades, and from which they have reaped thousands of dollars in fees for consulting and testifying. Ironically, neither David Rosner nor Gerald Markowitz discloses his conflicts of interest, or his income from the Lawsuit Industry. David Wegman, in his contribution to the love fest, notes that ToxicDocs may lead to more accurate reporting of conflicts of interest. And yet, Wegman does not report his testimonial adventures for the Lawsuit Industry; nor does Robert Proctor; nor do Rosner and Markowitz.

It is a safe bet that ToxicDocs does not contain any emails, memoranda, letters, and the like about the many frauds and frivolities of the Lawsuit Industry, such as the silica litigation, where fraud has been rampant.4 I looked for but did not find the infamous Baron & Budd asbestos memorandum, or any of the documentary evidence from fraud cases arising from false claiming in the asbestos, silicone, welding, Fen-Phen, and other litigations.5

The hawking of ToxicDocs in the pages of the Journal of Public Health Policy is only the beginning. You will find many people and organizations promoting ToxicDocs on Facebook, Twitter, and LinkedIn. Proving there is no limit to the mercenary nature of the enterprise, you can even buy branded T-shirts and stationery online. Ah, America, where even Marxists have the entrepreneurial spirit!


1 Anthony Robbins & Phyllis Freeman, “ToxicDocs (www.ToxicDocs.org) goes live: A giant step toward leveling the playing field for efforts to combat toxic exposures,” 39 J. Public Health Pol’y 1 (2018). See “More Antic Proposals for Expert Witness Testimony – Including My Own Antic Proposals” (Dec. 30, 2014).

2 Robert N. Proctor, “God is watching: history in the age of near-infinite digital archives,” 39 J. Public Health Pol’y 24 (2018); Stéphane Horel, “Browsing a corporation’s mind,” 39 J. Public Health Pol’y 12 (2018); Christer Hogstedt & David H. Wegman, “ToxicDocs and the fight against biased public health science worldwide,” 39 J. Public Health Pol’y 15 (2018); Jock McCulloch, “Archival sources on asbestos and silicosis in Southern Africa and Australia,” 39 J. Public Health Pol’y 18 (2018); Sheldon Whitehouse, “ToxicDocs: using the US legal system to confront industries’ systematic counterattacks against public health,” 39 J. Public Health Pol’y 22 (2018); Elena N. Naumova, “The value of not being lost in our digital world,” 39 J. Public Health Pol’y 27 (2018); Nicholas Freudenberg, “ToxicDocs: a new resource for assessing the impact of corporate practices on health,” 39 J. Public Health Pol’y 30 (2018). These articles are free, open-access, but in this case, you may get what you have paid for.

3 David Rosner, Gerald Markowitz, and Merlin Chowkwanyun, “ToxicDocs (www.ToxicDocs.org): from history buried in stacks of paper to open, searchable archives online,” 39 J. Public Health Pol’y 4 (2018).

4 See, e.g., In re Silica Products Liab. Litig., MDL No. 1553, 398 F.Supp. 2d 563 (S.D.Tex. 2005).

5 See Lester Brickman, “Fraud and Abuse in Mesothelioma Litigation,” 88 Tulane L. Rev. 1071 (2014); Peggy Ableman, “The Garlock Decision Should be Required Reading for All Trial Court Judges in Asbestos Cases,” 37 Am. J. Trial Advocacy 479, 488 (2014).

Fake Friends and Fake Followers

January 28th, 2018

In the Black Mirror production of Nosedive, based upon a short story by Charlie Brooker, a young woman named Lacie lives in a world in which social media approval metrics determine social, economic, and political opportunities. Every interaction is graded on a scale from one to five. Lacie’s approval rating is slipping, thus jeopardizing her participation in her friend’s wedding, and she is determined to improve her rating. She tries her best to be “nice,” and then enlists a ratings coach, but her efforts cannot stop her approval rating from its nosedive. Perhaps if Lacie had greater financial resources, she could have improved her ratings by paying people to like her on social media.

Would people really pay for the appearance of social approval? “Celebrities, athletes, pundits and politicians have millions of fake followers,” and they paid for them. Thus announces the New York Times in an exposé of the practice of paying for followers on social media.1 Perhaps even the President has paid for fake followers who are mere bots. Maybe bots are the only friends he has.

Although I am skeptical of the utility of Facebook and Twitter, I have come reluctantly to admit that these and other social media – even blogs – have some utility if properly used. The business of buying followers, however, is just plain sick.

Finally, Eric Schneiderman has announced an investigation into an issue of some importance. He is investigating Devumi, a company that he claims sells fake followers on social media. The company is alleged to have created over 55,000 bots based upon living people and their identifying features.2

Stealing identities and selling fake followers are deplorable, and Schneiderman’s crusade is a laudable exercise of prosecutorial discretion. But buying fake followers to lard up one’s social media metrics is deplorable as well. The fraud involves two separate criminal acts, and we should not lose sight of the fraudulent nature of the representations about the inflated numbers of followers. It takes two parties to enter the contract to defraud the public. Devumi’s clients may well be in pari delicto.

Let us hope that when Schneiderman opens the books at Devumi, he will have the fortitude to tell us which “celebrities, athletes, pundits, and politicians” have been juking their stats. Schneiderman’s investigation has the promise of making Eliot Spitzer’s commercial transactions look like child’s play. Inquiring minds want to know who would buy a friend or a follower.


1 Nicholas Confessore, Gabriel J.X. Dance, Richard Harris, and Mark Hansen, “The Follower Factory: Everyone wants to be popular online. Some even pay for it. Inside social media’s black market,” N.Y. Times (Jan. 27, 2018).

2 Nicholas Confessore, “New York Attorney General to Investigate Firm That Sells Fake Followers,” N.Y. Times (Jan. 27, 2018).

Divine Intervention in Litigation

January 27th, 2018

The Supreme Being, or Beings, or their Earthly Agents (Angels) rarely intervene in mundane matters such as litigation. Earlier this month, however, there may have been an unsuccessful divine intervention in the workings of a Comal County, Texas, jury, which was deliberating whether or not to convict Gloria Romero Perez of human trafficking.

After the jury reached a verdict, and rang the bell to signal that it had done so, the trial judge, the Hon. Jack Robison, waltzed in and proclaimed that God had told him that Perez was not guilty. According to jury foreperson Mark A. House, Judge Robison told them that he had prayed on the case and that God had told him what he had to tell the jury. The state’s attorney was not present to object to the hearsay. House reported that the jury signaled again that it had reached a verdict, and again Judge Robison appeared to proclaim the defendant’s innocence.

Judge Robison’s pronouncements apparently anguished the jurors, some of whom were “physically sick, crying and distraught” from the appearance of a putative prophet in the courthouse. Nonetheless, guilty is guilty, and the jury returned its verdict unmoved by Judge Robison. According to news reports, Judge Robison later apologized to the jury, but added something like “if God tells me to do something, I have to do it.” Zeke MacCormack, “Judge facing complaints over trying to sway jury,” San Antonio Express-News (Jan. 20, 2018); Ryan Autullo, “Texas judge interrupts jury, says God told him defendant is not guilty,” American-Statesman (Jan. 19, 2018). Foreperson House filed a complaint against Judge Robison with the judicial conduct commission, but told a local newspaper that “You’ve got to respect him for what he did. He went with his conscience.” Debra Cassens Weiss, “Judge informs jurors that God told him accused sex trafficker isn’t guilty,” A.B.A.J. online (Jan. 22, 2018). Or he was having a stroke. Somewhere, Henry Mencken is laughing and crying uncontrollably.

* * * * * * * * * * * *

For better or worse, I have not experienced divine intervention in my cases. At least, I think not. In one of my cases, the jury foreman and several jurors were in the elevator with my adversary and me, at the end of the trial. The situation was awkward, and punctuated by the foreman’s simple statement that God had directed them to their verdict. No one questioned the gentleman. I thanked the jurors for their service, but I have never been able to verify the source of the direction or inspiration given to the jury. To this day, I prefer to believe the verdict resulted from my advocacy and marshaling of the evidence.

The case was Edward and Carmelita O’Donnell v. Celotex Corp., et al., Philadelphia County Court of Common Pleas, July Term 1982, No. 1619. My adversary was a very capable African American lawyer, Sandy L.V. Byrd, then of the Brookman, Rosenberg, Brown & Sandler firm in Philadelphia, now a sitting judge in Philadelphia County. As you will see, race was an important ingredient in this case, and perhaps the reason it was tried.

Sandy and I had pulled Judge Levan Gordon1 for the trial, which was noteworthy because Judge Gordon was one of the few trial judges who stood up to the wishes of the coordinating judge (Hon. Sandra Mazer Moss) that all cases be tried “reverse bifurcated,” that is, with medical causation and damages in a first phase, and liability in the second phase.

This unnatural way of trying asbestos personal injury cases had been first advocated by counsel for Johns Manville, which had a huge market share, a distinctive lack of liability defenses, and a susceptibility to punitive damages. In May 1989, when Sandy and defense counsel announced “ready” before Judge Gordon, Johns Manville was in bankruptcy. Reverse bifurcation had long outlasted its usefulness, and had become a way of abridging defendants’ due process rights to a trial on liability. If a jury returned a verdict with damages in Phase One, plaintiffs would argue (illegitimately, but often with court approval) that it was bad enough that the defendants had caused their illness; how much worse was it now that the defendants were trying to take away their compensation.

Worse yet, in trying cases backwards, with reverse bifurcation, plaintiffs quickly learned that they could, in Phase One, sneak evidence of liability, or hint that the defendants were as liable as sin, and thus suggest that the odd procedure of skipping over liability was desirable because liability was well-nigh conceded. The plaintiffs’ direct examination typically went something like:

Q. How did you feel emotionally when you received your diagnosis of asbestos-related _[fill in the blank]____?

A. I was devastated; I cried; I was depressed. I had never heard that asbestos could cause this disease.…

So clearly there was a failure to warn, at least on that colloquy, and that was all juries needed to hear on the matter, from the plaintiffs’ perspective. If the defendants lost in the first phase, and refused to settle, juries were annoyed that they were being kept from their lives by recalcitrant, liable defendants. Liability was a done deal.

At the time, most of the asbestos case trials in Philadelphia were brought by government employees at the Philadelphia Naval Shipyard. The government was an extremely knowledgeable purchaser of asbestos-containing insulation products, and was as, or more, aware of the hazards of asbestos use than any vendor. At the time, 1989, the sophisticated intermediary defense was disallowed under Pennsylvania strict liability law, and so defendants rarely got a chance to deploy it.

In a case that went “all issues,” with negligence and even potential punitive damages, however, the sophisticated intermediary defense was valid under Pennsylvania law. Judge Gordon’s practice of trying all cases, all issues, opened the door to defending the case by showing that there was no failure to warn at all, because the Navy, at its shipyards, was knowledgeable about asbestos hazards. If plaintiff’s testimony were true about lack of protections, then the Navy itself had been grossly negligent in its maintenance and supervision of the shipyard workplace.

Before trial began, on May 8, 1989, the Brookman firm had signaled that the O’Donnell case was on track to settle in a dollar range that was typical for cases involving the age, medical condition, and work history of the plaintiff, Mr. O’Donnell. The settlement posture of the case changed abruptly, however, after jury selection. When the jury was sworn, we had 12 Philadelphians, 11 of whom were African American, and one of whom was Latina. When I asked Sandy whether we were settled at the number we had discussed the previous day, he looked at me and asked why he would want to settle now, with the jury we had. He now insisted that this case must be tried. Racism works in curious ways and directions.

So we tried the O’Donnell case, the old-fashioned way, from front to back. Both sides called “state of the art” expert witnesses, to address the history of medical knowledge about asbestos-related diseases. We called product identification lay witnesses, as well as several physicians to testify about Mr. O’Donnell’s disputed asbestosis. The lovely thing about the O’Donnell trial, however, was that I had the opportunity to present testimony from the Philadelphia Navy Yard’s industrial hygienist, Dr. Victor Kindsvatter, who had given a deposition many years before. Kindsvatter, who had a Ph.D. in industrial hygiene, was extraordinarily knowledgeable about asbestos, permissible exposure limits, asbestos hazards, and methods of asbestos control on board ships and in the shops.

The result of Judge Gordon’s all issue trial was a fuller, fairer presentation of the case. Plaintiffs could argue that the defendants were horribly negligent given what experts knew in the medical community. Defendants could present evidence that experts at the relevant time believed that asbestos-containing insulation products could be used safely, and that the U.S. Navy was especially eager to use asbestos products on board ships, and had extensive regulations and procedures for doing so. The testimony that probably tipped the balance came from a former shipyard worker, George Rabuck. Mr. Rabuck had been a client of the Brookman firm, and he was their go-to guy to testify on product identification. In the O’Donnell case, as in many others, Rabuck dutifully and cheerfully identified the products of the non-settling defendants, and less cheerfully, the products of the settled and bankrupt defendants. In O’Donnell, I was able to elicit additional testimony from Mr. Rabuck about a shakedown cruise of a new Navy ship, in which someone had failed to insulate a hot line in the boiler room. When an oil valve broke, diesel fuel sprayed the room, and ignited upon hitting the uninsulated pipe. A ship fire ensued, in which several sailors were seriously injured and one died. In my closing argument, I was able to remind the jury of the sailor who died because asbestos insulation was not used on the Navy ship.

On May 18, 1989, the jury came back with a general verdict for the defense in O’Donnell. Judge Gordon entered judgment, from which there was no appeal. Ignoring the plaintiffs’ lawyers’ intransigence on settlement, Judge Moss was angry at the defense lawyers, as she typically was, for tying up one of her courtrooms for Judge Gordon’s rotation in her trial program. Judge Moss stopped asking Judge Gordon to help with the asbestos docket after the O’Donnell case. Without all-issues trials that included negligence claims, the sophisticated intermediary defense went pretty much unexercised in asbestos personal injury cases for the next 25 years.

My real question though, in view of Texas Judge Robison’s epiphany, is whether the defense won in O’Donnell because of the equities and the evidence, or whether an angel had put her finger on the scales of justice. It’s a mystery.


1 Ryanne Persinger, “Levan Gordon, retired judge,” Tribune Staff (Oct. 6, 2016). Judge Gordon was one of the most respected judges in Philadelphia County. He had graduated from Lincoln University in 1958, and from Howard University School of Law in 1961. Gordon was elected to Philadelphia Municipal Court in 1974, and to the Court of Common Pleas in 1979. He died on October 4, 2016.

Ninth Circuit Quashes Harkonen’s Last Chance

January 8th, 2018

With the benefit of hindsight, even the biggest whopper can be characterized as a strategic choice by trial counsel. As a result of this sort of thinking, the convicted have a very difficult time in pressing claims of ineffective assistance of counsel. After the fact, a reviewing or appellate court can always imagine a strategic reason for trial counsel’s decisions, even if they contributed to the client’s conviction.

In the Harkonen case, a pharmaceutical executive was indicted and tried for wire fraud and misbranding. His crime was to send out a fax with a preliminary assessment of a recently unblinded clinical trial. In his fax, Dr Harkonen described the trial’s results as “demonstrating” a survival benefit in study participants with mild and moderate disease. Survival (or mortality) was not a primary outcome of the trial, but it was a secondary outcome, and arguably the most important one of all. The subgroup of “mild and moderate” was not pre-specified, but it was highly plausible.

Clearly, Harkonen’s post hoc analysis would not normally be sufficient to persuade the FDA to approve a medication, but Harkonen did not assert or predict that the company would obtain FDA approval. He simply claimed that the trial “demonstrated” a benefit. A charitable interpretation of his statement, which was several pages long, would include the prior successful clinical trial, as important context for Harkonen’s statement.

The United States government, however, was not interested in the principle of charity, the context, or even its own pronouncements on the issue of statistical significance. Instead, the United States Attorney pushed for draconian sentences under the Wire Fraud Act, and the misbranding sections of the Food, Drug, and Cosmetics Act. A jury acquitted on the misbranding charge, but convicted on wire fraud. The government’s request for an extreme prison term and fines was rebuffed by the trial court, which imposed a term of six months of house arrest, and a small fine.1 The conviction, however, effectively keeps Dr Harkonen from working again in the pharmaceutical industry.

In post-verdict challenges to the conviction, Harkonen’s lawyers were able to marshal support from several renowned statisticians and epidemiologists, but the trial court was reluctant to consider these post-verdict opinions when the defense had called no expert witness at trial. The trial situation, however, was complicated and confused by the government’s pre-trial position that it would not call expert witnesses on the statistical and clinical trial interpretation issues. Contrary to these representations, the government called Dr Thomas Fleming, a statistician, who testified at some length, and without objection, to strict criteria for assessing statistical significance and causation in clinical trials.

Having read Fleming’s testimony, I can say that the government got away with introducing a great deal of expert witness opinion testimony, without effective contradiction or impeachment. With the benefit of hindsight, the defense decision not to call an expert witness looks like a serious deviation from the standard of care. Fleming’s “facts” about how the FDA would evaluate the success or failure of the clinical trial were not relevant to whether Harkonen’s claim of a demonstrated benefit was true or false. More importantly, Harkonen’s claim involved an inference, which is not a fact, but an opinion. Fleming’s contrary opinion really did not turn Harkonen’s claim into a falsehood. A contrary rule would put many expert witnesses in civil and criminal litigation behind bars on similar charges of wire or mail fraud.

After Harkonen exhausted his direct appeals,2 he petitioned for a writ of coram nobis. The trial court denied the petition,3 and in a non-precedential opinion [sic], the Ninth Circuit affirmed the denial of coram nobis.4 United States v. Harkonen, slip op., No. 15-16844 (9th Cir., Dec. 4, 2017) [cited below as Harkonen].

The Circuit rejected Harkonen’s contention that the Supreme Court had announced a new rule with respect to statistical significance, in Matrixx Initiatives, Inc. v. Siracusano, 563 U.S. 27 (2011), which change in law required that his conviction be vacated. Harkonen’s lawyer, like much of the plaintiffs’ tort bar, oversold the Supreme Court’s comments about statistical significance, which were at best dicta, and not very well considered or supported dicta, at that. Still, there was an obvious tension, and duplicity, between positions that the government, through the Solicitor General’s office, had taken in Siracusano, and positions the government took in the Harkonen case.5 Given the government’s opportunistic double-faced arguments about statistical significance, the Ninth Circuit held that Harkonen’s proffered evidence was “compelling, especially in light of Matrixx,” but the panel concluded that his conviction was not the result of a “manifest injustice” that requires the issuance of the writ of coram nobis. Harkonen at 2 (emphasis added). Apparently, Harkonen had suffered an injustice of a less obvious and blatant variety, which did not rise to the level of manifest injustice.

The Ninth Circuit gave similarly short shrift to Harkonen’s challenge to the competency of his counsel. His trial lawyers had averred that they thought they were doing well enough not to risk putting on an expert witness, especially given that the defense’s view of the evidence came out in the testimony of the government’s witnesses. The Circuit thus acquiesced in the view that both sides had chosen to forgo expert witness testimony, and overlooked the competency issue raised by the defense’s failure to object to Fleming’s opinion testimony at trial. Harkonen at 2-4. Remarkably, the appellate court did not look at how Fleming was allowed to testify on statistical issues, without being challenged on cross-examination.


2 United States v. Harkonen, 510 F. App’x 633, 638 (9th Cir. 2013), cert. denied, 134 S. Ct. 824 (2013).

4 Dave Simpson, “9th Circuit Refuses To Rethink Ex-InterMune CEO’s Conviction,” Law360 (Dec. 5, 2017).

The Amicus Curious Brief

January 4th, 2018

Friends – Are They Boxers or Briefers?*

Amicus briefs help appellate courts by bringing important views to bear on the facts and the law in disputes. Amicus briefs ameliorate the problem of the common law system, in which litigation takes place between specific parties, with many interested parties looking on, without the ability to participate in the discussion or shape the outcome.

There are dangers, however, of hidden advocacy in the amicus brief. Even the most unsophisticated court is not likely to be misled by the interests and potential conflicts of interest of groups such as the American Association for Justice or the Defense Research Institute. If the description of the group is not as fully forthcoming as one might like, a quick trip to its website will quickly clarify the group’s mission on Earth. No one is fooled, and the amicus briefs can be judged on their merits.

What happens when the amici are identified only by their individual names and institutional affiliations? A court might be misled into thinking that the signatories are merely disinterested academics, who believe that important information or argument is missing from the appellate discussion.

The Pennsylvania Supreme Court has offered itself up as an example of a court snookered by “58 physicians and scientists.”1 Rost v. Ford Motor Co., 151 A.3d 1032, 1052 (Pa. 2016). Without paying any attention to the provenance of the amicus brief or the authors’ deep ties with the lawsuit industry, the court cited the brief’s description of:

“the fundamental notion that each exposure to asbestos contributes to the total dose and increases the person’s probability of developing mesothelioma or other cancers as an ‘irrefutable scientific fact’. According to these physicians and scientists, cumulative exposure is merely an extension of the ancient concept of dose-response, which is the ‘oldest maxim in the field’.”

Id. (citing amicus brief at 2).

Well, irrefutable in the minds of the 58 amici curious perhaps, who failed to tell the court that not every exposure contributes materially to cumulative exposure such that it must be considered a “substantial contributing factor.” These would-be friends also failed to tell the court that the human body has defense mechanisms to carcinogenic exposures, which gives rise to a limit on, and qualification of, the concept of dose-response in the form of biological thresholds, below which exposures do not translate into causative doses. Even if these putative “friends” believed there was no evidence for a threshold, they certainly presented no evidence against one. Nonetheless, a confused and misguided Pennsylvania Supreme Court affirmed the judgment below in favor of the plaintiffs.

The 58 amici also misled the Pennsylvania Supreme Court on several other issues. By their failure to disclose important information about themselves, and holding themselves out (falsely but successfully) as “disinterested” physicians and scientists, these so-called friends misled the court by failing to disclose the following facts:

1. Some of them were personal friends, colleagues, and fellow-party expert witnesses of the expert witness (Arthur Frank), whose opinion was challenged in the lower courts;

2. Some of the amici had no reasonable claim to expertise on the issues addressed in the brief;

3. Some of the amici have earned substantial fees in other asbestos cases, involving the same issues raised in the Rost case;

4. Some of the amici have been excluded from testifying in similar cases, to the detriment of their financial balance sheets;

5. Some of the amici are zealous advocates, who not only have testified for plaintiffs, but have participated in highly politicized advocacy groups such as the Collegium Ramazzini.

Two of the amici are historians (Rosner and Markowitz), who have never conducted scientific research on asbestos-related disease. Their work as labor historians added no support to the scientific concepts that were put over on the Pennsylvania Supreme Court. Both of these historians have testified in multiple asbestos cases, and one of them (Markowitz) has been excluded in a state court case, under a Daubert-like standard. They have never been qualified to give expert witness testimony on medical causation issues. Margaret Keith, an adjunct assistant professor of sociology, appears never to have written about the causal connection between asbestos and cancer, but she at least is married to another amicus, James Brophy, who has.

Barry Castleman,2 David F. Goldsmith, John M. Dement, Richard A. Lemen, and David Ozonoff have all testified in asbestos or other alleged dust-induced disease cases, with Castleman having the distinction of having made virtually his entire livelihood in connection with plaintiffs-side asbestos litigation testifying and consulting. Castleman, Goldsmith, and Ozonoff have all been excluded from, or severely limited in, testifying for plaintiffs in chemical exposure cases.

(Rabbi) Daniel Thau Teitelbaum has the distinction of having been excluded in a case that went to the United States Supreme Court (Joiner), but Shira Kramer,3 Richard Clapp, and Peter F. Infante probably make up for the lack of distinction with the number of their testimonial adventures and misadventures. L. Christine Oliver and Colin L. Soskolne have also testified for the lawsuit industry in the United States, and, for Soskolne, in Canada as well.

Lennart Hardell has testified in cellular telephone brain cancer cases,4 for plaintiffs of course, which apparently qualified him as an expert for the IARC on electromagnetic frequency and carcinogenesis.5

Celeste Monforton has earned credentials serving with fellow skapper David Michaels in the notorious Project on Scientific Knowledge and Public Policy (SKAPP) organization.6 Laura S. Welch, like Monforton, another George Washington lecturer, has served the lawsuit industry in asbestos personal injury and other cases.

Exhibit A to the Amicus brief lists the institutional affiliations of each amicus. Although some of the amici described themselves as “consultants,” only one amicus (Massimiliano Bugiani) listed his consultancy as specifically litigation related, with an identification of the party that engaged him: “Consultant of the Plaintiff in the Turin and Milan Courts.” Despite Bugiani’s honorable example, none of the other amici followed suit.

* * * * * * * *

Although many judges and lawyers agree that amicus briefs often bring important factual expertise to appellate courts, there are clearly some abuses. I, for one, am proud to have been associated with a few amicus briefs in various courts. One law professor, Allison Orr Larsen, in a trenchant law review article, has identified some problems and has suggested some reforms.7 Regardless of what readers think of Larsen’s proposed reforms, briefs should not be submitted by testifying and consulting expert witnesses for one side in a particular category of litigation, without disclosing fully and accurately their involvement in the underlying cases, and their financial enrichment from perpetuating the litigation in question.

* Thanks to Ramses Delafontaine for having alerted me to other aspects of the lack of transparency in connection with amicus briefs filed by professional historian organizations.


1 Brief of Muge Akpinar-Elci, Xaver Bauer, Carlos Bedrossian, Eula Bingham, Yv Bonnier-Viger, James Brophy, Massimiliano Buggiani, Barry Castleman, Richard Clapp, Dario Consonni, Emilie Counil, Mohamed Aquiel Dalvie, John M. Dement, Tony Fletcher, Bice Fubini, Thomas H. Gassert, David F. Goldsmith, Michael Gochfeld, Lennart Hadell [sic, Hardell], James Huff, Peter F. Infante, Moham F. Jeebhay, T. K. Joshi, Margaret Keith, John R. Keyserlingk, Kapil Khatter, Shira Kramer, Philip J. Landrigan, Bruce Lanphear, Richard A. Lemen, Charles Levenstein, Abby Lippman, Gerald Markowitz, Dario Mirabelli, Sigurd Mikkelsen, Celeste Monforton, Rama C. Nair, L. Christine Oliver, David Ozonoff, Domyung Paek, Smita Pakhale, Rolf Petersen, Beth Rosenberg, Kenneth Rosenman, David Rosner, Craig Slatin, Michael Silverstein, Colin L. Soskolne, Leslie Thomas Stayner, Ken Takahashi, Daniel Thau Teitelbaum, Benedetto Terracini, Annie Thebaud-Mony, Fernand Turcotte, Andrew Watterson, David H. Wegman, Laura S. Welch, Hans-Joachim Woitowitz as Amici Curiae in Support of Appellee, 2015 WL 3385332, filed in Rost v. Ford Motor Co., 151 A.3d 1032 (Pa. 2016).

2 See “The Selikoff – Castleman Conspiracy” (Mar. 13, 2011).

4 Newman v. Motorola, Inc., 218 F. Supp. 2d 769 (D. Md. 2002) (excluding Hardell’s proposed testimony), aff’d, 78 Fed. Appx. 292 (4th Cir. 2003) (affirming exclusion of Hardell).

6 See, e.g., “SKAPP A LOT” (April 30, 2010); “Manufacturing Certainty” (Oct. 25, 2011); “David Michaels’ Public Relations Problem” (Dec. 2, 2011); “Conflicted Public Interest Groups” (Nov. 3, 2013).

7 See Allison Orr Larsen, “The Trouble with Amicus Facts,” 100 Virginia L. Rev. 1757 (2014). See also Caitlin E. Borgmann, “Appellate Review of Social Facts in Constitutional Rights Cases,” 101 Calif. L. Rev. 1185, 1216 (2013) (“Amicus briefs, in particular, are often submitted by advocates and may be replete with dubious factual assertions that would never be admitted at trial.”).

Failed Gatekeeping in Ambrosini v. Labarraque (1996)

December 28th, 2017

The Ambrosini case straddled the Supreme Court’s 1993 Daubert decision. The case began before the Supreme Court clarified the federal standard for expert witness gatekeeping, and ended in the Court of Appeals for the District of Columbia, after the high court adopted the curious notion that scientific claims should be based upon reliable evidence and valid inferences. That notion has only slowly and inconsistently trickled down to the lower courts.

Given that Ambrosini was litigated in the District of Columbia, where the docket is dominated by regulatory controversies, frequently involving dubious scientific claims, no one should be surprised that the D.C. Court of Appeals did not see that the Supreme Court had read “an exacting standard” into Federal Rule of Evidence 702. And so we see, in Ambrosini, this Court of Appeals citing and purportedly applying its own pre-Daubert decision in Ferebee v. Chevron Chem. Co., 552 F. Supp. 1297 (D.D.C. 1982), aff’d, 736 F.2d 1529 (D.C. Cir.), cert. denied, 469 U.S. 1062 (1984).1 In 2000, Federal Rule of Evidence 702 was revised in a way that extinguishes the precedential value of Ambrosini and the broad dicta of Ferebee, but some courts and commentators have failed to stay abreast of the law.

Escolastica Ambrosini was using a synthetic progestin birth control, Depo-Provera, as well as an anti-nausea medication, Bendectin, when she became pregnant. The child that resulted from this pregnancy, Teresa Ambrosini, was born with malformations of her face, eyes, and ears, cleft lip and palate, and vertebral malformations. About three percent of all live births in the United States have a major malformation. Perhaps because the Divine Being has sovereign immunity, Escolastica sued the manufacturers of Bendectin and Depo-Provera, as well as the prescribing physician.

The causal claims were controversial when made, and they still are. The progestin at issue, medroxyprogesterone acetate (MPA), was embryotoxic in the cynomolgus monkey2, but not in the baboon3. The evidence in humans was equivocal at best, and involved mostly genital malformations4; the epidemiologic evidence for the MPA causal claim to this day remains unconvincing5.

At the close of discovery in Ambrosini, Upjohn (the manufacturer of the progestin) moved for summary judgment, with a supporting affidavit of a physician and geneticist, Dr. Joe Leigh Simpson. In his affidavit, Simpson discussed three epidemiologic studies, as well as other published papers, in support of his opinion that the progestin at issue did not cause the types of birth defects manifested by Teresa Ambrosini.

Ambrosini had disclosed two expert witnesses, Dr. Allen S. Goldman and Dr. Brian Strom. Neither Goldman nor Strom bothered to identify the papers, studies, data, or methodology used in arriving at an opinion on causation. Not surprisingly, the district judge was unimpressed with their opposition, and granted summary judgment for the defendant. Ambrosini v. Labarraque, 966 F.2d 1462, 1466 (D.C. Cir. 1992).

The plaintiffs appealed on the remarkable ground that Goldman’s and Strom’s crypto-evidence satisfied Federal Rule of Evidence 703. Even more remarkably, the Circuit, in a strikingly unscholarly opinion by Judge Mikva, opined that disclosure of relied-upon studies was not required for expert witnesses under Rules 703 and 705. Judge Mikva seemed to forget that the opinions being challenged were not given in testimony, but in (late-filed) affidavits that had to satisfy the requirements of Federal Rule of Civil Procedure 26. Id. at 1468-69. At trial, an expert witness may express an opinion without identifying its bases, but of course the adverse party may compel disclosure of those bases. In discovery, however, the proffered expert witness must supply all opinions and the evidence relied upon in reaching those opinions. In any event, the Circuit remanded the case for a hearing and further proceedings, at which the two challenged expert witnesses, Goldman and Strom, would have to identify the bases of their opinions. Id. at 1471.

Not long after the case landed back in the district court, the Supreme Court decided Daubert v. Merrell Dow Pharmaceuticals, Inc., 509 U.S. 579 (1993). With an order to produce entered, plaintiffs’ counsel could no longer hide Goldman and Strom’s evidentiary bases, and their scientific inferences came under judicial scrutiny.

Upjohn moved again to exclude Goldman and Strom’s opinions. The district court upheld Upjohn’s challenges, and granted summary judgment in favor of Upjohn for the second time. The Ambrosinis appealed again, but the second case in the D.C. Circuit resulted in a split decision, with the majority holding that the exclusion of Goldman and Strom’s opinions under Rule 702 was erroneous. Ambrosini v. Labarraque, 101 F.3d 129 (D.C. Cir. 1996).

Although issued two decades ago, the majority’s opinion remains noteworthy as an example of judicial resistance to the existence and meaning of the Supreme Court’s Daubert opinion. The majority opinion uncritically cited the notorious Ferebee6 and other pre-Daubert decisions. The court embraced the Daubert dictum about gatekeeping being limited to methodologic considerations, and then proceeded to interpret methodology as superficially as necessary to sustain admissibility. If an expert witness claimed to have looked at epidemiologic studies, and epidemiology was an accepted methodology, then the opinion of the expert witness satisfied the legal requirements of Daubert, or so it would seem from the opinion of the U.S. Court of Appeals for the District of Columbia.

Despite the majority’s hand waving, a careful reader will discern that there must have been substantial gaps and omissions in the explanations and evidence cited by plaintiffs’ expert witnesses. Seeing anything clearly in the Circuit’s opinion is made difficult, however, by careless and imprecise language, such as its descriptions of studies as showing, or not showing “causation,” when it could have meant only that such studies showed associations, with more or less random and systematic error.

Dr. Strom’s report addressed only general causation, and even so, he apparently did not address general causation of the specific malformations manifested by the plaintiffs’ child. Strom claimed to have relied upon the “totality of the data,” but his methodologic approach seems to have required him to dismiss studies that failed to show an association.

“Dr. Strom first set forth the reasoning he employed that led him to disagree with those studies finding no causal relationship [sic] between progestins and birth defects like Teresa’s. He explained that an epidemiologist evaluates studies based on their ‘statistical power’. Statistical power, he continued, represents the ability of a study, based on its sample size, to detect a causal relationship. Conventionally, in order to be considered meaningful, negative studies, that is, those which allege the absence of a causal relationship, must have at least an 80 to 90 percent chance of detecting a causal link if such a link exists; otherwise, the studies cannot be considered conclusive. Based on sample sizes too small to be reliable, the negative studies at issue, Dr. Strom explained, lacked sufficient statistical power to be considered conclusive.”

Id. at 1367.

Putting aside the problem of suggesting that an observational study detects a “causal relationship,” as opposed to an association in need of further causal evaluation, the Court’s précis of Strom’s testimony on power is troublesome, and typical of how other courts have misunderstood and misapplied the concept of statistical power. Statistical power is the probability of observing an association of at least a specified size, at a specified level of statistical significance. The calculation of statistical power turns on sample size, the significance level preselected for “statistical significance,” an assumed probability distribution of the sample, and, critically, an alternative hypothesis. Without a specified alternative hypothesis, the notion of statistical power is meaningless, regardless of what probability (80%, 90%, or some other percentage) is sought for detecting the alternative hypothesis. Furthermore, the notion that the defense must adduce studies with “sufficient statistical power to be considered conclusive” creates an unscientific standard that can never be met, while subverting the law’s requirement that the claimant establish causation.
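The dependence of power on a posited alternative can be made concrete. The sketch below is my own illustration, with hypothetical numbers; nothing in it comes from the Ambrosini record. It computes approximate power for a two-sample comparison of proportions by the usual normal approximation, at the conventional two-sided 0.05 significance level, with the alternative hypothesis expressed as an assumed relative risk against a baseline risk:

```python
import math

def normal_cdf(x: float) -> float:
    """Standard normal cumulative distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def power_two_proportions(p0: float, rr: float, n_per_group: int) -> float:
    """Approximate power of a two-sided, two-sample proportion test
    (normal approximation, alpha = 0.05) to detect an assumed relative
    risk `rr` against baseline risk `p0`, with `n_per_group` subjects
    in each arm. Hypothetical illustration only."""
    p1 = p0 * rr                      # risk in the exposed group under the alternative
    z_crit = 1.959964                 # critical z for two-sided alpha = 0.05
    se = math.sqrt(p0 * (1 - p0) / n_per_group + p1 * (1 - p1) / n_per_group)
    z = abs(p1 - p0) / se             # expected test statistic under the alternative
    return normal_cdf(z - z_crit)     # ignores the negligible far tail
```

With a baseline risk of 1 per 1,000 and a posited doubling of risk, 1,000 subjects per arm yields power below 10 percent, while 100,000 per arm yields power above 99 percent. Neither figure says anything about a study being “conclusive” in the abstract; each is meaningful only relative to the assumed alternative.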

The suggestion that the studies that failed to find an association cannot be considered conclusive because they “lacked sufficient statistical power” is troublesome because it distorts and misapplies the very notion of statistical power. No attempt was made to describe the confidence intervals surrounding the point estimates of the null studies; nor was there any discussion whether the studies could be aggregated to increase their power to rule out meaningful associations.
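The confidence-interval point can also be illustrated with hypothetical counts (not the data of any study in the record), using the standard log-transform (Katz) interval for a risk ratio from a fourfold table. A small “null” study typically yields an interval running from well below to well above 1.0, which is a statement about imprecision, not a license to dismiss the study as inconclusive:

```python
import math

def risk_ratio_ci(cases_exp: int, n_exp: int,
                  cases_unexp: int, n_unexp: int):
    """Point estimate and 95% confidence interval for a risk ratio,
    by the Katz log-transform method. Hypothetical illustration only."""
    rr = (cases_exp / n_exp) / (cases_unexp / n_unexp)
    # standard error of log(RR) from the fourfold table
    se_log = math.sqrt(1 / cases_exp - 1 / n_exp
                       + 1 / cases_unexp - 1 / n_unexp)
    z = 1.959964  # two-sided 95% interval
    lo = math.exp(math.log(rr) - z * se_log)
    hi = math.exp(math.log(rr) + z * se_log)
    return rr, lo, hi
```

With, say, 5 cases among 1,000 exposed and 3 among 1,000 unexposed, the point estimate is about 1.67, with a 95% interval of roughly 0.4 to 7: an interval that conveys the study’s imprecision far more informatively than the bare assertion that it “lacked power.”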

The Circuit court’s scientific jurisprudence was thus seriously flawed. Without a discussion of the end points observed, the relevant point estimates of risk ratios, and the confidence intervals, the reader cannot assess the strength of the claims made by Goldman and Strom, or by defense expert Simpson, in their reports. Without identifying the study endpoints, the reader cannot evaluate whether the plaintiffs’ expert witnesses relied upon relevant outcomes in formulating their opinions. The court viewed the subject matter from 30,000 feet, passing over at 600 mph, without engagement or care. A strong dissent, however, suggested serious mischaracterizations of the plaintiffs’ evidence by the majority.

The only specific causation testimony to support plaintiff’s claims came from Goldman, in what appears to have been a “differential etiology.” Goldman purported to rule out a genetic cause, even though he had not conducted a critical family history or ordered a state-of-the-art chromosomal study. Id. at 140. Of course, nothing in a differential etiology approach would allow a physician to rule out “unknown” causes, which, for birth defects, make up the most prevalent and likely causes to explain any particular case. The majority acknowledged that these were shortcomings, but rhetorically characterized them as substantive, not methodologic, and therefore as issues for cross-examination, not for judicial gatekeeping. All this is magical thinking, but it continues to infect judicial approaches to specific causation. See, e.g., Green Mountain Chrysler Plymouth Dodge Jeep v. Crombie, 508 F. Supp. 2d 295, 311 (D.Vt. 2007) (citing Ambrosini for the proposition that “the possibility of uneliminated causes goes to weight rather than admissibility, provided that the expert has considered and reasonably ruled out the most obvious”). In Ambrosini, however, Dr. Goldman had not ruled out much of anything.

Circuit Judge Karen LeCraft Henderson dissented in a short, but pointed opinion that carefully marshaled the record evidence. Drs. Goldman and Strom had relied upon a study by Greenberg and Matsunaga, whose data failed to show a statistically significant association between MPA and cleft lip and palate, when the crucial issue of timing of exposure was taken into consideration. Ambrosini, 101 F.3d at 142.

Beyond the specific claims and evidence, Judge Henderson anticipated the subsequent Supreme Court decisions in Joiner, Kumho Tire, and Weisgram, and the year 2000 revision of Rule 702, in noting that the majority’s acceptance of glib claims to have used a “traditional methodology” would render Daubert nugatory. Id. at 143-45 (characterizing Strom and Goldman’s methodologies as “wispish”). Even more importantly, Judge Henderson refused to indulge the assumption that somehow the length of Goldman’s C.V. substituted for evidence that his methods satisfied the legal (or scientific) standard of reliability. Id.

The good news is that little or nothing in Ambrosini survives the 2000 amendment to Rule 702. The bad news is that not all federal judges seem to have noticed, and that some commentators continue to cite the case approvingly.

Probably no commentator has embraced Ambrosini as promiscuously or as warmly as Carl Cranor, a philosopher and occasional expert witness for the lawsuit industry, in several publications and presentations.8 Cranor has been particularly enthusiastic about Ambrosini’s approval of expert witness testimony that failed to address “the relative risk between exposed and unexposed populations of cleft lip and palate, or any other of the birth defects from which [the child] suffers,” as well as differential etiologies that exclude nothing.9 Somehow Cranor, as did the majority in Ambrosini, believes that testimony that fails to identify the magnitude of the point estimate of relative risk can “assist the trier of fact to understand the evidence or to determine a fact in issue.”10 Of course, without that magnitude given, the trier of fact could not evaluate the strength of the alleged association; nor could the trier assess the probability of individual causation for the plaintiff. Cranor also has written approvingly of lumping unrelated end points, which defeats the assessment of biological plausibility and coherence by the trier of fact. When the defense expert witness in Ambrosini adverted to the point estimates for relevant end points, the majority, with Cranor’s approval, rejected the null findings as “too small to be significant.”11 If the null studies were, in fact, too small to be useful tests of the plaintiffs’ claims, intellectual and scientific honesty required an acknowledgment that the evidentiary display was not one from which a reasonable scientist would draw a causal conclusion.


1 Ambrosini v. Labarraque, 101 F.3d 129, 138-39 (D.C. Cir. 1996) (citing and applying Ferebee), cert. dismissed sub nom. Upjohn Co. v. Ambrosini, 117 S.Ct. 1572 (1997). See also David E. Bernstein, “The Misbegotten Judicial Resistance to the Daubert Revolution,” 89 Notre Dame L. Rev. 27, 31 (2013).

2 S. Prahalada, E. Carroad, M. Cukierski, and A.G. Hendrickx, “Embryotoxicity of a single dose of medroxyprogesterone acetate (MPA) and maternal serum MPA concentrations in cynomolgus monkey (Macaca fascicularis),” 32 Teratology 421 (1985).

3 S. Prahalada, E. Carroad, and A.G. Hendrick, “Embryotoxicity and maternal serum concentrations of medroxyprogesterone acetate (MPA) in baboons (Papio cynocephalus),” 32 Contraception 497 (1985).

4 See, e.g., Z. Katz, M. Lancet, J. Skornik, J. Chemke, B.M. Mogilner, and M. Klinberg, “Teratogenicity of progestogens given during the first trimester of pregnancy,” 65 Obstet Gynecol. 775 (1985); J.L. Yovich, S.R. Turner, and R. Draper, “Medroxyprogesterone acetate therapy in early pregnancy has no apparent fetal effects,” 38 Teratology 135 (1988).

5 G. Saccone, C. Schoen, J.M. Franasiak, R.T. Scott, and V. Berghella, “Supplementation with progestogens in the first trimester of pregnancy to prevent miscarriage in women with unexplained recurrent miscarriage: a systematic review and meta-analysis of randomized, controlled trials,” 107 Fertil. Steril. 430 (2017).

6 Ferebee v. Chevron Chemical Co., 736 F.2d 1529, 1535 (D.C. Cir.), cert. denied, 469 U.S. 1062 (1984).

7 Dr. Strom was also quoted as having provided a misleading definition of statistical significance: “whether there is a statistically significant finding at greater than 95 percent chance that it’s not due to random error.” Ambrosini, 101 F.3d at 136. Given the majority’s inadequate description of the record, the description of witness testimony may not be accurate, and error cannot properly be allocated.

8 Carl F. Cranor, Toxic Torts: Science, Law, and the Possibility of Justice 320, 327-28 (2006); see also Carl F. Cranor, Toxic Torts: Science, Law, and the Possibility of Justice 238 (2d ed. 2016).

9 Carl F. Cranor, Toxic Torts: Science, Law, and the Possibility of Justice 320 (2006).

10 Id.

11 Id.; see also Carl F. Cranor, Toxic Torts: Science, Law, and the Possibility of Justice 238 (2d ed. 2016).

The opinions, statements, and asseverations expressed on Tortini are my own, or those of invited guests, and these writings do not necessarily represent the views of clients, friends, or family, even when supported by good and sufficient reason.