TORTINI

For your delectation and delight, desultory dicta on the law of delicts.

The 5% Solution at the FDA

February 24th, 2018

The statistics wars rage on,1 with Bayesians attempting to take advantage of the so-called replication crisis to argue that it is all the fault of frequentist significance testing. In 2016, there was an attempted coup at the American Statistical Association, but the Bayesians did not get what they wanted; the effort produced little more than a consensus statement that p-values and confidence intervals should be properly interpreted. Patient advocacy groups have lobbied for the availability of unapproved and incompletely tested medications, and rent-seeking litigants have argued and lobbied for the elimination of statistical tests and methods in the assessment of causal claims. The battle continues.

Against this backdrop, a young Harvard graduate student has published a paper with a brief history of significance testing, and of the role that significance testing has taken on at the United States Food and Drug Administration (FDA). Lee Kennedy-Shaffer, “When the Alpha is the Omega: P-Values, ‘Substantial Evidence’, and the 0.05 Standard at FDA,” 72 Food & Drug L.J. 595 (2017) [cited below as K-S]. The paper presents a short but entertaining history of the evolution of the p-value from its early invocation in 1710, by John Arbuthnott, a Scottish physician and mathematician, who calculated the probability that male births would exceed female births in 82 consecutive years if the true proportions of the sexes were equal. K-S at 603. Kennedy-Shaffer notes the role of the two great French mathematicians, Pierre-Simon Laplace and Siméon-Denis Poisson, who used p-values (or their complements) to evaluate empirical propositions. As Kennedy-Shaffer notes, Poisson observed, writing in the 1830s, that the equivalent of what would now be a p-value of about 0.005 was sufficient, in his view, to conclude that the French Revolution of 1830 had changed the pattern of jury verdicts. K-S at 604.
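The arithmetic behind Arbuthnott’s argument is simple enough to reproduce. Here is a minimal sketch in Python (mine, not Kennedy-Shaffer’s): under the null hypothesis that male and female births are equally likely to predominate in any given year, the chance that male births would exceed female births in all 82 years is one half raised to the 82nd power.

```python
from fractions import Fraction

# Probability, under a "fair coin" null hypothesis, that male births
# exceed female births in each of 82 consecutive years.
p_each_year = Fraction(1, 2)   # null: either sex equally likely to predominate in a year
n_years = 82                   # years of London christening records, 1629-1710

p_value = p_each_year ** n_years
print(f"P(82 straight male-excess years | chance alone) = {float(p_value):.3e}")
# roughly 2.1e-25 -- Arbuthnott took this as evidence of providence, not chance
```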

Kennedy-Shaffer traces the p-value, or its equivalent, through its treatment by the great early 20th century statisticians, Karl Pearson and Ronald A. Fisher, through its modification by Jerzy Neyman and Egon Pearson, into the bowels of the FDA in Rockville, Maryland. It is a history well worth recounting, if for no other reason than to remind us that the p-value, or its equivalent, has been remarkably durable and reasonably effective in protecting the public against false claims of safety and efficacy. Kennedy-Shaffer provides several good examples in which the FDA’s use of significance testing was dispositive of approval or non-approval of medications and devices.

There is enough substance and history here that everyone will have something to pick at in this paper. Let me volunteer the first shot. Kennedy-Shaffer describes the co-evolution of the controlled clinical trial and statistical tests, and points to the landmark study by the Medical Research Council on streptomycin for tuberculosis. Geoffrey Marshall (chairman), “Streptomycin Treatment of Pulmonary Tuberculosis: A Medical Research Council Investigation,” 2 Brit. Med. J. 769, 769–71 (1948). This clinical trial was historically important, not only for its results and for Sir Austin Bradford Hill’s role in its design, but for the care with which it described randomization, double blinding, and multiple study sites. Kennedy-Shaffer suggests that “[w]hile results were presented in detail, few formal statistical tests were incorporated into this analysis.” K-S at 597-98. And yet, a few pages later, he tells us that “both chi-squared tests and t-tests were used to evaluate the responses to the drug and compare the control and treated groups,” and that “[t]he difference in mortality between the two groups is statistically significant.” K-S at 611. Although it is true that the authors did not report their calculated p-values for any test, the difference in mortality between the streptomycin and control groups was very large, and the standards for describing the results of such a clinical trial were in their infancy in 1948.
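To see why the mortality difference required so little formal statistical apparatus, here is a brief sketch (mine, not Kennedy-Shaffer’s) of the kind of chi-squared test the trial reported, using the six-month mortality figures commonly quoted for the 1948 study, four deaths among 55 streptomycin patients against 14 among 52 bed-rest controls; treat the counts as illustrative and check them against the original paper.

```python
import numpy as np
from scipy.stats import chi2_contingency

# 2x2 table: rows = treatment arm, columns = (died, survived) at six months.
# Counts are those commonly quoted for the 1948 MRC trial; verify against the paper.
table = np.array([[ 4, 51],   # streptomycin plus bed rest
                  [14, 38]])  # bed rest alone (control)

chi2, p, dof, expected = chi2_contingency(table, correction=False)
print(f"chi-squared = {chi2:.2f}, df = {dof}, p = {p:.4f}")
# The p-value falls well below the conventional 0.05 threshold,
# consistent with the authors' report of a statistically significant difference.
```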

Kennedy-Shaffer’s characterization of Sir Austin Bradford Hill’s use of statistical tests and methods takes on outsize importance because of the mischaracterizations, and even misrepresentations, made by some representatives of the Lawsuit Industry, who contend that Sir Austin dismissed statistical methods as unnecessary. In the United States, some judges have been seriously misled by those misrepresentations, which have found their way into published judicial decisions.

The operative document, of course, is the publication of Sir Austin’s famous after-dinner speech, given in 1965, on the occasion of his election to the Presidency of the Royal Society of Medicine’s Section of Occupational Medicine. Although the speech is casual and free of scholarly footnotes, Sir Austin’s message was precise, balanced, and nuanced. The speech is a classic in the history of medicine, which remains important even if rather dated in terms of its primary message about how science and medicine move from beliefs about associations to knowledge of causal associations. As everyone knows, Sir Austin articulated nine factors or viewpoints through which to assess any putative causal association, but he emphasized that before these nine factors are assessed, our starting point itself has prerequisites:

Disregarding then any such problem in semantics we have this situation. Our observations reveal an association between two variables, perfectly clear-cut and beyond what we would care to attribute to the play of chance. What aspects of that association should we especially consider before deciding that the most likely interpretation of it is causation?”

Austin Bradford Hill, “The Environment and Disease: Association or Causation?” 58 Proc. Royal Soc’y Med. 295, 295 (1965) [cited below as Hill]. The starting point, therefore, before the Bradford Hill nine factors come into play, is a “clear-cut” association, which is “beyond what we would care to attribute to the play of chance.”

In other words, consideration of random error is necessary.

Now for the nuance and the balance. Sir Austin acknowledged that there were some situations in which we simply do not need to calculate standard errors because the disparity between treatment and control groups is so large and meaningful. He goes on to wonder out loud:

whether the pendulum has not swung too far – not only with the attentive pupils but even with the statisticians themselves. To decline to draw conclusions without standard errors can surely be just as silly? Fortunately I believe we have not yet gone so far as our friends in the USA where, I am told, some editors of journals will return an article because tests of significance have not been applied. Yet there are innumerable situations in which they are totally unnecessary – because the difference is grotesquely obvious, because it is negligible, or because, whether it be formally significant or not, it is too small to be of any practical importance. What is worse the glitter of the t table diverts attention from the inadequacies of the fare.”

Hill at 299. Now this is all true, but hardly the repudiation of statistical testing claimed by those who want to suppress the consideration of random error from science and judicial gatekeeping. There are very few litigation cases in which the difference between exposed and unexposed is “grotesquely obvious,” such that we can leave statistical methods at the door. Importantly, the very large differences between the streptomycin and control groups in the Medical Research Council’s 1948 clinical trial were not so “grotesquely obvious” that statistical methods were obviated. To be fair, the differences were sufficiently great that statistical discussion could be kept to a minimum. Sir Austin gave extensive tables in the 1948 paper to let the reader appreciate the actual data themselves.

In his after-dinner speech, Hill also gives examples of studies that are so biased and confounded that no statistical method will likely ever save them. Certainly, the technologies of regression and propensity-score analysis have progressed tremendously since Hill’s 1965 speech, but his point still remains. That point, however, hardly excuses the omission of statistical analysis in highly confounded or biased observational studies.

In addressing the nine factors he identified, which presumed a “clear-cut” association, with random error ruled out, Sir Austin did opine that the factors raised questions of their own, and that:

No formal tests of significance can answer those questions. Such tests can, and should, remind us of the effects that the play of chance can create, and they will instruct us in the likely magnitude of those effects. Beyond that they contribute nothing to the ‘proof’ of our hypothesis.”

Hill at 299. Again, the date and the context are important. Hill is addressing consideration of the nine factors, not the required predicate association beyond the play of chance or random error. The date is important as well, because it would be foolish to suggest that statistical methods have not grown in the last half century to address some of the nine factors. The existence and the nature of dose-response are the subject of extensive statistical methods, and meta-analysis and meta-regression are used to assess and measure consistency between studies.

Kennedy-Shaffer might well have pointed out the great influence Sir Austin’s textbook on medical statistics had had on medical research and practice. This textbook, which went through numerous editions, makes clear the importance of statistical testing and methods:

Are simple methods of the interpretation of figures only a synonym for common sense or do they involve an art or knowledge which can be imparted? Familiarity with medical statistics leads inevitably to the conclusion that common sense is not enough. Mistakes which when pointed out look extremely foolish are quite frequently made by intelligent persons, and the same mistakes, or types of mistakes, crop up again and again. There is often lacking what has been called a ‘statistical tact, which is rather more than simple good sense’. That tact the majority of persons must acquire (with a minority it is undoubtedly innate) by a study of the basic principles of statistical method.”

Austin Bradford Hill, Principles of Medical Statistics at 2 (4th ed. 1948) (emphasis in original). And later in his text, Sir Austin notes that:

The statistical method is required in the interpretation of figures which are at the mercy of numerous influences, and its object is to determine whether individual influences can be isolated and their effects measured.”

Id. at 10 (emphasis added).

Sir Austin’s work, taken as a whole, demonstrates his acceptance of the necessity of statistical methods in medicine and in causal inference. Kennedy-Shaffer’s paper covers much ground, but it shortchanges this important line of influence, which lies directly in the historical path between Sir Ronald Fisher and the medical regulatory community.

Kennedy-Shaffer gives a nod to Bayesian methods, and even suggests that Bayesian results are “more intuitive,” but he does not explain the supposed intuitiveness of treating a parameter as having a probability distribution. This might make sense at the level of quantum physics, but it does not seem to describe the reality of a biomedical phenomenon such as relative risk. Kennedy-Shaffer notes the FDA’s expression of willingness to entertain Bayesian analyses of clinical trials, and the rare instances in which such analyses have actually been deployed. K-S at 629 (“e.g., Pravigard Pac for prevention of myocardial infarction”). He concedes, however, that Bayesian designs are still the exception to the rule, and he notes the caution of Robert Temple, a former FDA Director of Medical Policy, who observed in 2005 that Bayesian proposals for drug clinical trials were at that time “very rare.”2 K-S at 630.


2 Robert Temple, “How FDA Currently Makes Decisions on Clinical Studies,” 2 Clinical Trials 276, 281 (2005).

Scientific Evidence in Canadian Courts

February 20th, 2018

A couple of years ago, Deborah Mayo called my attention to the Canadian version of the Reference Manual on Scientific Evidence.1 In the course of discussion of mistaken definitions and uses of p-values, confidence intervals, and significance testing, Sander Greenland pointed to some dubious pronouncements in the Science Manual for Canadian Judges [Manual].

Unlike the United States federal court Reference Manual, which is published through a joint effort of the National Academies of Sciences, Engineering, and Medicine, the Canadian version is the product of the Canadian National Judicial Institute (NJI, or the Institut National de la Magistrature, if you live in Quebec), which claims to be an independent, not-for-profit group committed to educating Canadian judges. In addition to the Manual, the Institute publishes Model Jury Instructions and a guide, Problem Solving in Canada’s Courtrooms: A Guide to Therapeutic Justice (2d ed.), as well as conducting educational courses.

The NJI’s website describes the Institute’s Manual as follows:

Without the proper tools, the justice system can be vulnerable to unreliable expert scientific evidence.

         * * *

The goal of the Science Manual is to provide judges with tools to better understand expert evidence and to assess the validity of purportedly scientific evidence presented to them. …”

The Chief Justice of Canada, Hon. Beverley M. McLachlin, contributed an introduction to the Manual, which was notable for its frank admission that:

[w]ithout the proper tools, the justice system is vulnerable to unreliable expert scientific evidence.

****

Within the increasingly science-rich culture of the courtroom, the judiciary needs to discern ‘good’ science from ‘bad’ science, in order to assess expert evidence effectively and establish a proper threshold for admissibility. Judicial education in science, the scientific method, and technology is essential to ensure that judges are capable of dealing with scientific evidence, and to counterbalance the discomfort of jurists confronted with this specific subject matter.”

Manual at 14. These are laudable goals, indeed, but did the National Judicial Institute live up to its stated goals, or did it leave Canadian judges vulnerable to the Institute’s own “bad science”?

In his comments on Deborah Mayo’s blog, Greenland noted some rather cavalier statements in Chapter Two that suggest that the conventional alpha of 5% corresponds to a “scientific attitude that unless we are 95% sure the null hypothesis is false, we provisionally accept it.” And he pointed to other passages where the chapter seems to suggest that the coefficient of confidence that corresponds to an alpha of 5% “constitutes a rather high standard of proof,” thus confusing and conflating the probability of random error with posterior probabilities. Greenland is absolutely correct that the Manual does a rather miserable job of educating Canadian judges, if our standard for its work product is accuracy and truth.

Some of the most egregious errors appear in what is perhaps the most important chapter of the Manual, Chapter 2, “Science and the Scientific Method.” The chapter has two authors, a scientist, Scott Findlay, and a lawyer, Nathalie Chalifour. Findlay is an Associate Professor in the Department of Biology at the University of Ottawa. Chalifour is an Associate Professor in the Faculty of Law, also at the University of Ottawa. Together, they produced some dubious pronouncements, such as:

Weight of the Evidence (WOE)

First, the concept of weight of evidence in science is similar in many respects to its legal counterpart. In both settings, the outcome of a weight-of-evidence assessment by the trier of fact is a binary decision.”

Manual at 40. Findlay and Chalifour cite no support for their characterization of WOE in science. Most attempts to invoke WOE are woefully vague and amorphous, with no meaningful guidance or content.2 Sixty-five pages later, if anyone is still paying attention, the authors let us in on a dirty little secret:

at present, there exists no established prescriptive methodology for weight of evidence assessment in science.”

Manual at 105. The authors omit, however, that there are prescriptive methods for inferring causation in science; you just will not see them in discussions of weight of the evidence. The authors then compound the semantic and conceptual problems by stating that “in a civil proceeding, if the evidence adduced by the plaintiff is weightier than that brought forth by the defendant, a judge is obliged to find in favour of the plaintiff.” Manual at 41. This is a remarkable suggestion, which implies that if the plaintiff adduces the crummiest crumb of evidence, a mere peppercorn on the scales of justice, and the defendant has none to offer, then the plaintiff must win. The plaintiff wins notwithstanding that no reasonable person could believe that the plaintiff’s claims are more likely than not true. Even if this were the law of Canada, it is certainly not how scientists think about establishing the truth of empirical propositions.

Confusion of Hypothesis Testing with “Beyond a Reasonable Doubt”

The authors’ next assault comes in conflating significance probability with the probability connected with the burden of proof, a posterior probability. Legal proceedings have a defined burden of proof, with criminal cases requiring the state to prove guilt “beyond a reasonable doubt.” Findlay and Chalifour’s discussion then runs off the rails by likening hypothesis testing, with an alpha of 5% or its complement, 95%, as a coefficient of confidence, to a “very high” burden of proof:

In statistical hypothesis-testing – one of the tools commonly employed by scientists – the predisposition is that there is a particular hypothesis (the null hypothesis) that is assumed to be true unless sufficient evidence is adduced to overturn it. But in statistical hypothesis-testing, the standard of proof has traditionally been set very high such that, in general, scientists will only (provisionally) reject the null hypothesis if they are at least 95% sure it is false. Third, in both scientific and legal proceedings, the setting of the predisposition and the associated standard of proof are purely normative decisions, based ultimately on the perceived consequences of an error in inference.”

Manual at 41. This is, as Greenland and many others have pointed out, a totally bogus conception of hypothesis testing, and an utterly false description of the probabilities involved.

Later in the chapter, Findlay and Chalifour flirt with the truth, but then lapse into an unrecognizable parody of it:

Inferential statistics adopt the frequentist view of probability whereby a proposition is either true or false, and the task at hand is to estimate the probability of getting results as discrepant or more discrepant than those observed, given the null hypothesis. Thus, in statistical hypothesis testing, the usual inferred conclusion is either that the null is true (or rather, that we have insufficient evidence to reject it) or it is false (in which case we reject it). The decision to reject or not is based on the value of p: if the estimated value of p is below some threshold value α, we reject the null; otherwise we accept it.”

Manual at 74. OK; so far so good, but here comes the train wreck:

By convention (and by convention only), scientists tend to set α = 0.05; this corresponds to the collective – and, one assumes, consensual – scientific attitude that unless we are 95% sure the null hypothesis is false, we provisionally accept it. It is partly because of this that scientists have the reputation of being a notoriously conservative lot, given that a 95% threshold constitutes a rather high standard of proof.”

Manual at 75. Uggh; so we are back to significance probability’s being a posterior probability. As if to atone for their sins, in the very next paragraph, the authors then remind the judicial readers that:

As noted above, p is the probability of obtaining results at least as discrepant as those observed if the null is true. This is not the same as the probability of the null hypothesis being true, given the results.”

Manual at 75. True, true, and completely at odds with what the authors have stated previously. And to add to the reader’s now fully justified confusion, the authors describe the standard for rejecting the null hypothesis as “very high indeed.” Manual at 102, 109. Any reader who is following the discussion might well wonder how and why there is such a problem of replication and reproducibility in contemporary science.
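The difference the Manual elides can be made concrete with a back-of-the-envelope calculation (mine, and purely illustrative): even when a test rejects the null at alpha = 0.05, the probability that the null hypothesis is actually false depends on the prior plausibility of the hypotheses and the test’s power, not on the 95% coefficient of confidence.

```python
def prob_null_given_rejection(prior_null: float, alpha: float, power: float) -> float:
    """Bayes' theorem: P(null is true | test rejected the null).

    prior_null -- prior probability that the null hypothesis is true
    alpha      -- probability of rejecting when the null is true (type I error)
    power      -- probability of rejecting when the null is false
    """
    p_reject = alpha * prior_null + power * (1.0 - prior_null)
    return (alpha * prior_null) / p_reject

# Illustrative numbers only: with half of all tested nulls true and 80% power,
# a "significant" result still leaves about a 6% chance that the null is true;
# if 90% of tested nulls are true, that chance rises to roughly 36%.
for prior in (0.5, 0.9):
    print(prior, round(prob_null_given_rejection(prior, alpha=0.05, power=0.80), 3))
```

Nothing in the choice of alpha = 0.05, by itself, makes anyone “95% sure the null hypothesis is false.”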

Conflating Bayesianism with Frequentist Modes of Inference

We have seen how Findlay and Chalifour conflate significance and posterior probabilities, at least some of the time. In a section of their chapter that deals explicitly with probability, the authors tell us that before any study is conducted, the prior probability of the truth of the tested hypothesis is 50%, sans evidence. This is an astonishing creation of certainty out of nothingness, and perhaps it explains the authors’ implied claim that the crummiest morsel of evidence on one side is sufficient to compel a verdict, if the other side has no morsels at all. Here is how the authors put their claim to the Canadian judges:

Before each study is conducted (that is, a priori), the hypothesis is as likely to be true as it is to be false. Once the results are in, we can ask: How likely is it now that the hypothesis is true? In the first study, the low a priori inferential strength of the study design means that this probability will not be much different from the a priori value of 0.5 because any result will be rather equivocal owing to limitations in the experimental design.”

Manual at 64. This implied Bayesian slant, with 50% priors, in the world of science would lead anyone to believe “as many as six impossible things before breakfast,” and many more throughout the day.
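A short sketch (mine, not the Manual’s) of Bayes’ rule in odds form shows what is doing the work here: with a weakly informative study, the posterior probability simply tracks whatever prior one assumes, so the Manual’s 50% starting point is an assumption that supplies its own conclusion.

```python
def posterior_prob(prior: float, likelihood_ratio: float) -> float:
    """Posterior probability of a hypothesis, given a prior probability and a
    likelihood ratio (Bayes' rule in odds form: posterior odds = LR x prior odds)."""
    prior_odds = prior / (1.0 - prior)
    post_odds = likelihood_ratio * prior_odds
    return post_odds / (1.0 + post_odds)

# A weakly informative study (likelihood ratio near 1) moves a 50% prior very little,
# just as the Manual says -- but the 50% starting point is itself an assumption, not
# something the evidence supplies.  With a skeptical 10% prior, the same weak study
# leaves the hypothesis improbable.
for prior in (0.5, 0.1):
    print(prior, round(posterior_prob(prior, likelihood_ratio=1.5), 3))
```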

Lest you think that the Manual is all rubbish, there are occasional gems of advice to the Canadian judges. The authors admonish the judges to

be wary of individual ‘statistically significant’ results that are mined from comparatively large numbers of trials or experiments, as the results may be ‘cherry picked’ from a larger set of experiments or studies that yielded mostly negative results. The court might ask the expert how many other trials or experiments testing the same hypothesis he or she is aware of, and to describe the outcome of those studies.”

Manual at 87. Good advice, but at odds with the authors’ characterization of statistical significance as establishing the rejection of the null hypothesis well-nigh beyond a reasonable doubt.

When Greenland first called attention to this Manual, I reached out to some people who had been involved in its peer review. One reviewer told me that it was a “living document,” and would likely be revised after he had the chance to call the NJI’s attention to the errors. But two years later, the errors remain, and so we have to infer that the authors meant to say all the contradictory and false statements that are still present in the downloadable version of the Manual.


2 See “WOE-fully Inadequate Methodology – An Ipse Dixit By Another Name” (May 1, 2012); “Weight of the Evidence in Science and in Law” (July 29, 2017); see also David E. Bernstein, “The Misbegotten Judicial Resistance to the Daubert Revolution,” 89 Notre Dame L. Rev. 27 (2013).

Wrong Words Beget Causal Confusion

February 12th, 2018

In clinical medical and epidemiologic journals, most articles that report about associations will conclude with a discussion section in which the authors hold forth about

(1) how they have found that exposure to X “increases the risk” of Y, and

(2) how their finding makes sense because of some plausible (even if unproven) mechanism.

In an opinion piece in Significance,1 Dalmeet Singh Chawla cites to a study that suggests the “because” language frequently confuses readers into believing that a causal claim is being made. The study abstract explains:

Most researchers do not deliberately claim causal results in an observational study. But do we lead our readers to draw a causal conclusion unintentionally by explaining why significant correlations and relationships may exist? Here we perform a randomized study in a data analysis massive online open course to test the hypothesis that explaining an analysis will lead readers to interpret an inferential analysis as causal. We show that adding an explanation to the description of an inferential analysis leads to a 15.2% increase in readers interpreting the analysis as causal (95% CI 12.8% – 17.5%). We then replicate this finding in a second large scale massive online open course. Nearly every scientific study, regardless of the study design, includes explanation for observed effects. Our results suggest that these explanations may be misleading to the audience of these data analyses.”

Leslie Myint, Jeffrey T. Leek, and Leah R. Jager, “Explanation implies causation?” (Nov. 2017) (online manuscript).

Invoking the principle of charity, these authors suggest that most researchers are not deliberately claiming causal results. Indeed, the language of biomedical science itself is biased in favor of causal interpretation. The term “statistical significance” suggests causality to naive readers, as does statistical talk about “effect size” and “fixed-effect models” for data sets that come nowhere near establishing causality.

Common epidemiologic publication practice tolerates, if not encourages, authors to state that their study shows (finds, demonstrates, etc.) that exposure to X “increases the risk” of Y in the studies’ samples. This language is deliberately causal, even if the study cannot support a causal conclusion alone or even with other studies. After all, a risk is the antecedent of a cause, and in the stochastic model of causation involved in much of biomedical research, causation will manifest in a change of a base rate to a higher or lower post-exposure rate. Given that mechanism is often unknown and not required, showing an increased risk is the whole point. The need to rule out chance, bias, and confounding, and to evaluate study design, often is lost in the irrational exuberance of declaring the “increased risk.”

Tighter editorial control might have researchers qualify their findings by explaining that they found a higher rate in association with exposure, under the circumstances of the study, followed by an explanation that much more is needed to establish causation. But where is the fun and profit in that?

Journalists, lawyers, and advocacy scientists often use the word “link” to avoid having to endorse associations that they know, or should know, have not been shown to be causal.2 Using “link” as a noun or a verb invokes a causal-chain metaphor, which probably is often deliberate. Perhaps publishers would defend the use of “link” by noting that it is so much shorter than “association,” and thus saves typesetting costs.

More attention is needed to word choice, even and especially when statisticians and scientists are using their technical terms and jargon.3 If, for the sake of argument, we accept the sincerity of scientists who work as expert witnesses in litigation in which causal claims are overstated, we can see that poor word choices confuse scientists as well as lay people. Or you can just read the materials and methods and the results of published study papers; skip the introduction and discussion sections, as well as the newspaper headlines.


1 Dalmeet Singh Chawla, “Mind your language,” Significance 6 (Feb. 2018).

2 See, e.g., Perri Klass, M.D., “Does an A.D.H.D. Link Mean Tylenol Is Unsafe in Pregnancy?” N.Y. Times (Dec. 4, 2017), available at https://www.nytimes.com/2017/12/04/well/family/does-an-adhd-link-mean-tylenol-is-unsafe-in-pregnancy.html; Nicholas Bakalar, “Body Chemistry: Lower Testosterone Linked to Higher Death Risk,” N.Y. Times (Aug. 15, 2006).

3 Fang Xuelan & Graeme Kennedy, “Expressing Causation in Written English,” 23 RELC J. 62 (1992); Bengt Altenberg, “Causal Linking in Spoken and Written English,” 38 Studia Linguistica 20 (1984).

PubMed Refutes Courtroom Historians

February 11th, 2018

Professors Rosner and Markowitz, labor historians, or historians laboring in courtrooms, have made a second career out of testifying about other people’s motivations. Consider their pronouncement:

In the postwar era, professionals, industry, government, and a conservative labor movement tried to bury silicosis as an issue.”

David Rosner & Gerald Markowitz, Deadly Dust: Silicosis and the Politics of Occupational Disease in Twentieth-Century America 213 (Princeton 1991); Gerald Markowitz & David Rosner, “Why Is Silicosis So Important?” Chap. 1, at 27, in Paul-André Rosental, ed., Silicosis: A World History (2017). Their accusation is remarkable for any number of reasons,1 but the most remarkable is that their claim is not merely unverified; it is readily falsified.2

Previously, I have pointed to searches in the Google Books Ngram Viewer as well as in the National Library of Medicine’s database (PubMed) on silicosis. The PubMed website has now started to provide a csv file, with article counts by year, which can be opened in programs such as LibreOffice Calc, Excel, etc., and then used to generate charts of the publication counts over time.
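For readers who want to reproduce the exercise, here is a minimal sketch in Python (assuming a simple export of one row per year with year and article-count columns; the filename and column names below are placeholders, and the file actually downloaded from PubMed should be checked for its header layout and any summary rows):

```python
import pandas as pd
import matplotlib.pyplot as plt

# Assumed layout: one row per year with columns "Year" and "Count";
# adjust the column names, and skip any header or total rows, to match
# the file PubMed actually provides.
df = pd.read_csv("silicosis_pubmed_counts.csv")   # hypothetical filename

# Aggregate the yearly counts into roughly 11-year intervals and plot them.
df["interval"] = (df["Year"] // 11) * 11
counts = df.groupby("interval")["Count"].sum()

counts.plot(kind="bar",
            xlabel="Start of 11-year interval",
            ylabel="PubMed articles on silicosis")
plt.tight_layout()
plt.show()
```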

Here is a chart generated from a simple search on <silicosis> in PubMed, with years aggregated over roughly 11-year intervals:

The chart shows that the “professionals,” presumably physicians and scientists, were most busy publishing on, not burying, the silicosis issue exactly when Rosner and Markowitz claim they were actively suppressing it. Many of the included publications grew out of industry, labor, and government interests and concerns. In their book and in their courtroom performances, Rosner and Markowitz provide mostly innuendo without evidence, but their claim is falsifiable and false.

To be sure, the low count in the 1940s may well result from the relatively small number of journals included in the PubMed database for that era, as well as the growth in the number of biomedical journals after the 1940s. The Post-War era certainly presented distractions in the form of other issues, including the development of antibiotics, chemotherapies for tuberculosis, the spread of poliomyelitis and the development of vaccines for this and other viral diseases, radiation exposure and illnesses, tobacco-related cancers, and other chronic diseases. Given the exponential expansion in the scope of public health, the continued interest in silicosis after World War II, documented in the PubMed statistics, is remarkable for its intensity, pace Rosner and Markowitz.


1 “Conspiracy Theories: Historians, In and Out of Court” (April 17, 2013). Not the least of the reasons the group calumny is pertinent is the extent to which it keeps the authors gainfully employed as expert witnesses in litigation.

2 See also CDC, “Ten Great Public Health Achievements – United States, 1900 – 1999,” 48(12) CDC Morbidity and Mortality Weekly Report 241 (April 2, 1999) (“Work-related health problems, such as coal workers’ pneumoconiosis (black lung), and silicosis — common at the beginning of the century — have come under better control.”).

ToxicHistorians Sponsor ToxicDocs

February 1st, 2018

A special issue of the Journal of Public Health Policy waxes euphoric over a website, ToxicDocs, created by two labor historians, David Rosner and Gerald Markowitz (also known as the “Pink Panthers”). The Panthers have gotten their universities, Columbia University and the City University of New York, to host the ToxicDocs website with whole-text searchable documents of what they advertise as “secret internal memoranda, emails, slides, board minutes, unpublished scientific studies, and expert witness reports — among other kinds of documents — that emerged in recent toxic tort litigation.” According to Rosner and Markowitz, they are “constantly adding material from lawsuits involving lead, asbestos, silica, and PCBs, among other dangerous substances.” Rosner and Markowitz are well-positioned to obtain and add such materials because of their long-term consulting and testifying work for the Lawsuit Industry, which has obtained many of these documents in routine litigation discovery proceedings.

Despite the hoopla, the ToxicDocs website is nothing new or novel. Tobacco litigation has spawned several such on-line repositories: the “Truth Tobacco Industry Documents Library,” “Tobacco Archives,” and “Tobacco Litigation Documents.” And the Pink Panthers’ efforts to create a public library of the documents upon which they rely in litigation go back several years to earlier websites. See David Heath & Jim Morris, “Exposed: Decades of denial on poisons. Internal documents reveal industry ‘pattern of behavior’ on toxic chemicals,” Center for Public Integrity (Dec. 4, 2014).

The present effort, however, is marked by shameless self promotion and support from other ancillary members of the Lawsuit Industry. The Special Issue of Journal of Public Health Policy is introduced by Journal editor Anthony Robbins,1 who was a mover and shaker in the SKAPP enterprise and its efforts to subvert judicial assessments of proffered opinions for validity and methodological propriety. In addition, Robbins, along with the Pink Panthers as guest editors, have recruited additional “high fives” and self-congratulatory cheerleading from other members of, and expert witnesses for, the Lawsuit Industry, as well as zealots of the type who can be counted upon to advocate for weak science and harsh treatment for manufacturing industry.2

Rosner and Markowitz, joined by Merlin Chowkwanyun, add to the happening with their own spin on ToxicDocs.3 As historians, they are perhaps understandably out of touch with current technologies, even those decades old. They wax on about the wonders of optical character recognition and whole-text search, as though they were quantum computing.

The Pink Panthers liken their “trove” of documents to “Big Data,” but there is nothing quantitative about their collection, and their mistaken analogy ignores their own “Big Bias,” which vitiates much of their collection. These historians have been trash picking in the dustbin of history, and quite selectively at that. You will not likely find documents here that reveal the efforts of manufacturing industry to improve the workplace and the safety and health of their workers.

Rosner and Markowitz disparage their critics as hired guns for industry, but it is hard for them to avoid the label of hired guns for the Lawsuit Industry, an industry with which they have worked in close association for several decades, and from which they have reaped thousands of dollars in fees for consulting and testifying. Ironically, neither David Rosner nor Gerald Markowitz discloses his conflicts of interest, or his income from the Lawsuit Industry. David Wegman, in his contribution to the love fest, notes that ToxicDocs may lead to more accurate reporting of conflicts of interest. And yet, Wegman does not report his own testimonial adventures for the Lawsuit Industry; nor does Robert Proctor; nor do Rosner and Markowitz.

It is a safe bet that ToxicDocs does not contain any emails, memoranda, letters, and the like about the many frauds and frivolities of the Lawsuit Industry, such as the silica litigation, where fraud has been rampant.4 I looked for but did not find the infamous Baron & Budd asbestos memorandum, or any of the documentary evidence from fraud cases arising from false claiming in the asbestos, silicone, welding, Fen-Phen, and other litigations.5

The hawking of ToxicDocs in the pages of the Journal of Public Health Policy is only the beginning. You will find many people and organizations promoting ToxicDocs on Facebook, Twitter, and LinkedIn. Proving there is no limit to the mercenary nature of the enterprise, you can even buy branded T-shirts and stationery online. Ah, America, where even Marxists have the entrepreneurial spirit!


1 Anthony Robbins & Phyllis Freeman, “ToxicDocs (www.ToxicDocs.org) goes live: A giant step toward leveling the playing field for efforts to combat toxic exposures,” 39 J. Public Health Pol’y 1 (2018). See “More Antic Proposals for Expert Witness Testimony – Including My Own Antic Proposals” (Dec. 30, 2014).

2 Robert N. Proctor, “God is watching: history in the age of near-infinite digital archives,” 39 J. Public Health Pol’y 24 (2018); Stéphane Horel, “Browsing a corporation’s mind,” 39 J. Public Health Pol’y 12 (2018); Christer Hogstedt & David H. Wegman, “ToxicDocs and the fight against biased public health science worldwide,” 39 J. Public Health Pol’y 15 (2018); Jock McCulloch, “Archival sources on asbestos and silicosis in Southern Africa and Australia,” 39 J. Public Health Pol’y 18 (2018); Sheldon Whitehouse, “ToxicDocs: using the US legal system to confront industries’ systematic counterattacks against public health,” 39 J. Public Health Pol’y 22 (2018); Elena N. Naumova, “The value of not being lost in our digital world,” 39 J. Public Health Pol’y 27 (2018); Nicholas Freudenberg, “ToxicDocs: a new resource for assessing the impact of corporate practices on health,” 39 J. Public Health Pol’y 30 (2018). These articles are free, open-access, but in this case, you may get what you have paid for.

3 David Rosner, Gerald Markowitz, and Merlin Chowkwanyun, “ToxicDocs (www.ToxicDocs.org): from history buried in stacks of paper to open, searchable archives online,” 39 J. Public Health Pol’y 4 (2018).

4 See, e.g., In re Silica Products Liab. Litig., MDL No. 1553, 398 F.Supp. 2d 563 (S.D.Tex. 2005).

5 See Lester Brickman, “Fraud and Abuse in Mesothelioma Litigation,” 88 Tulane L. Rev. 1071 (2014); Peggy Ableman, “The Garlock Decision Should be Required Reading for All Trial Court Judges in Asbestos Cases,” 37 Am. J. Trial Advocacy 479, 488 (2014).