TORTINI

For your delectation and delight, desultory dicta on the law of delicts.

ACGIH TLVs Lack Scientific Integrity & Transparency – The Mica NIC

June 2nd, 2020

The American Conference of Governmental Industrial Hygienists (ACGIH®) is a non-profit corporation established in 1938 to advance occupational and environmental health.  The corporation’s motto, included in its logo, hubristically announces:  “Defining the Science of Occupational and Environmental Health.”

Philosophers of science may demur from “the” in “the Science,” as well as from the intellectual arrogance in suggesting that this private organization has any such ability to commandeer the complex social nature of scientific knowledge. And yet, in the small area of setting permissible exposure limits for potential environmental or occupational toxic substances, the ACGIH is in the business of “defining” safety. In 1941, the group began to review and recommend “exposure limits.” In 1956, the group coined (literally and figuratively) the term “threshold limit values” (TLVs®), and started to publish documentation for its recommended values.

From the beginning, the ACGIH has asserted that TLVs® are not standards; rather they are guidelines for use, with other information, in determining safe levels of workplace and environmental exposure. The ACGIH maintains that its TLVs are based upon published, peer-reviewed scientific studies in industrial hygiene, toxicology, occupational medicine, and epidemiology, without consideration for economic or technical feasibility.

Beginning in the 1980s, “the Lobby”[1] started to throw brushback pitches at the ACGIH to bully the organization out of positions that the Lobby thought were too comforting to manufacturing industry.[2]  The result was a dramatic shift in the ACGIH’s perspective. The bullying created a “white-hat” bias that operates as a one-way ratchet, pushing the ACGIH always to lower TLVs, regardless of whether there is a scientific warrant for doing so. Efforts to curb ACGIH overreach by litigation have generally failed. The TLVs have become increasingly controversial and non-evidence-based.[3]

What follows is what a hypothetical stakeholder might submit in response to a recent ACGIH Notice of Intended Change for its TLV for mica dust. Like other mineral dusts, mica, when inhaled in large quantities over long time periods, causes a pneumoconiosis. Documenting a “reasonably safe” level requires studies with adequate quantification of exposure. I will leave the reader to decide whether the ACGIH has that evidence in hand, based upon the following.

[1]  F.D.K. Liddell, “Magic, Menace, Myth and Malice,” 41 Ann. Occup. Hyg. 3, 3 (1997); see “The Lobby Lives – Lobbyists Attack IARC for Conducting Scientific Research” (Feb. 19, 2013).

[2]  Barry I. Castleman & Grace E. Ziem, “Corporate influence on threshold limit values,” 13 Am. J. Indus.  Med. 531 (1988); Grace Ziem & Barry I. Castleman, “Threshold limit values: historical perspectives and current practice,” 31 J. Occup. Med. 910 (1989); S.A. Roach & S.M. Rappaport, “But they are not thresholds:  a critical analysis of the documentation of Threshold Limit Values,” 17 Am. J. Indus. Med. 727 (1990).

[3]  Philip E. Karmel, “The Threshold Limit Values Controversy,” N.Y. L. J. (Jan. 3, 2008).

**********************************************

These comments are in response to the proposed change in the ACGIH® TLV® for Mica, as explained in the ACGIH “Mica: TLV® Chemical Substances Draft Documentation, Notice of Intended Change” (“NIC”).  For the reasons stated below, the change in the Mica TLV-TWA (time-weighted average) in the NIC is not warranted and the existing Mica TLV should not be changed.
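For readers unfamiliar with the convention, the “TWA” in TLV-TWA is the exposure-duration-weighted average concentration over an eight-hour workday. What follows is a minimal sketch of the standard industrial-hygiene computation; the symbols (C_i for the concentration measured during the i-th sampling interval, t_i for that interval’s duration) and the worked numbers are illustrative only:

\[
\mathrm{TWA} \;=\; \frac{\sum_{i=1}^{n} C_i\, t_i}{8\ \text{hours}}, \qquad \sum_{i=1}^{n} t_i \;=\; 8\ \text{hours}.
\]

For example, a hypothetical worker exposed to respirable dust at 0.2 mg/m3 for six hours and 0.7 mg/m3 for two hours would have a TWA of (0.2 × 6 + 0.7 × 2)/8 ≈ 0.33 mg/m3.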

The ACGIH® TLVs® are important non-governmental standards, largely because a number of government entities incorporate TLVs by reference into regulations and thus give TLVs the force of law.[4]  For example, some states and Canadian provinces simply adopt TLVs as state or provincial occupational exposure levels, and some states have established “maximum allowable ambient concentrations” or similar limits on “toxic air contaminants” based entirely or in part on TLVs.  The U.S. Mine Safety and Health Administration (MSHA) uses the 1973 ACGIH TLV for crystalline silica (quartz) as a legally enforceable permissible exposure level.[5]  The U.S. Occupational Safety and Health Administration (OSHA) Hazard Communication Standard requires that ACGIH TLVs be disclosed in required Safety Data Sheets.  The process by which the ACGIH develops TLVs is therefore critically important, and given the regulatory and legal significance of the ACGIH TLVs, the ACGIH has the burden of supporting proposed changes in TLVs through an adequate process, which includes transparency and evidence sufficient to support any proposed change.

The flaws in the ACGIH TLV-setting process are well known and the subject of several publications, most recently a paper titled “142 ACGIH Threshold Limit Values® established from 2008-2018 lack consistency and transparency.”

The following specific comments address the ACGIH’s process as disclosed in the NIC in support of its proposed change to the TLV-TWA for mica, and they illustrate the process problem: the NIC does not support the change the ACGIH proposes.  Again, the ACGIH TLVs have regulatory and legal significance; therefore, the ACGIH should not make TLV changes arbitrarily and capriciously.  Changes should be made pursuant to a transparent process; the ACGIH should support proposed changes with the weight of the available evidence; and the evidence in support of, and the reasons for, a proposed change should be publicly disclosed.  The ACGIH has not done that in this case, and its own NIC makes that failure clear:

  1. There is no evidence that “mica is an important cause of disabling occupational pneumoconiosis” as stated in the NIC. 

The NIC provides no citation or other supporting evidence for this conclusion; it merely states the conclusion as “fact” and a premise for the proposed change.[6] The NIC fails to address the number of workers currently potentially exposed to mica in the U.S. (or elsewhere), the industries in which these workers work, the forms of mica to which these workers may be exposed, the levels of respirable mica to which these workers may be exposed, and the extent to which pneumoconiosis caused by the inhalation of respirable mica exists.[7]

  2. The NIC proposes to materially lower the TLV for mica, but, other than noting that there are “nine different major species,” does not adequately address the mineralogical differences between the different species of mica, makes no attempt to assess the potential adverse health effects for the different species of mica, does not examine the “dose-response” data (as inadequate as it is) for the different species of mica, and so on. 

The “Chemical and Physical Properties” section of the NIC suggests the wide variety of materials that fall within the general term “mica.”  In spite of this, the ACGIH appears to have ignored these differences and concluded that the TLV for “mica” as a general category of substances should be applicable to all forms of mica, with no support for this conclusion in the NIC.[8]

  3. The human studies (sic) cited in the NIC are inadequate to support a decision to change the mica TLV and do not support the mica TLV proposed.

The first cited study involved four employees in a muscovite milling plant, with an alleged exposure to respirable mica (as muscovite) dust between 1.86 and 5.77 mg/m3.

The second cited study involved a (one) South African man who worked in a mica milling factory.  As noted in the NIC, “[q]uantitative exposure data were not reported.”

The third cited study involved a (one) 65-year-old man who worked in the rubber industry for 40 years, where he was exposed to numerous dusts, including mica.  No exposure data were reported.

The fourth study involved a (one) 62-year-old woman allegedly exposed to “pure mica” for seven years; no quantitative exposure data were available.

The NIC cites the case of a worker who bagged mica flake for 36 years.  In this case, there were, apparently, two dust samples taken: one at the time of a medical exam of the worker at age 54, showing total dust of 0.2 mg/m3, and one taken 17 years earlier, showing 0.7 mg/m3.  The bulk mica samples disclosed 7.1% to 8.4% silica (presumably, respirable crystalline silica as quartz).  The reference to the silica content of the “bulk samples” suggests that there was no analysis of the material collected in the two air samples taken.

The NIC cites the case of two British men who worked as “grinders of imported muscovite,” one starting in 1957. The “workplace dust concentrations were not quantified.”

The next paper cited in the NIC was from 1940 and involved employees who were exposed to the dust caused by “mica-scrap” grinding.  There was actually an attempt to quantify mica exposures (with data from before 1940), but it was done by particle count.  The NIC notes that the available information regarding mica health effects may be “limited by potential uncertainty converting from mppcf (million particles per cubic foot) to mg/m3 (which might not apply to all dust exposure scenarios).”  The difficulties associated with converting from mppcf to mg/m3 are well known in the cases of minerals far more extensively studied than mica (e.g., crystalline silica as quartz).  In addition to the conversion-factor issue, there are other concerns raised by relying upon a paper published in 1940 to support a TLV today, such as the quality of the sampling, the quality of the chest x-rays, and issues with the classification of the chest x-rays.  With that said, the NIC noted that “[n]one of the workers exposed at less than 10 mppcf (1.8 mg/m3), irrespective of employment duration, developed pneumoconiosis.”
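To make the conversion problem concrete: taking the NIC’s own parenthetical equivalence of 10 mppcf to 1.8 mg/m3 at face value implies a single fixed conversion factor,

\[
\frac{1.8\ \text{mg/m}^3}{10\ \text{mppcf}} \;=\; 0.18\ \text{mg/m}^3\ \text{per mppcf},
\]

but any such factor depends on the particle size distribution and density of the particular dust sampled, which is why a one-size-fits-all conversion “might not apply to all dust exposure scenarios.”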

The remaining studies cited in the NIC are similar.  But, to close this section of the comments, I will refer to the last study, a study of 71 South African workers employed in mica milling.  Twelve personal and static samples were taken during the course of the study.  The results of the personal samples indicated a range of respirable dust (or was it mica?) from 0.4 to 1.68 mg/m3.  The radiologic examination disclosed that 19 of the 71 workers had changes consistent with exposure to one or more of asbestos, silica, and/or mica.  “The specific dust concentrations to which the individuals presenting with lung changes were exposed were not reported.”

The ACGIH is proposing a reduction in the mica TLV based on the studies as described in the NIC.  We submit that this is a process and transparency problem – there is simply no way to conclude that a reduction in the mica TLV is warranted based on the Human Studies (the “evidence”) cited in the NIC.  In most cases, the Human Studies are simply case reports involving one, two, or a few people, with no quantitative exposure data.  The studies with exposure data are inadequate: they date from before 1940, measured exposure in mppcf, and, in one case, literally comprised two samples.  Given the legal and regulatory significance of ACGIH TLVs, the evidence cited in support of the change in the mica TLV, and the conclusions that can reasonably be drawn from that evidence, should exceed some threshold and meet some burden.  The evidence in the NIC is grossly insufficient to support the proposed change.

  4. The NIC for mica does not disclose any evidence to support the proposed TLV-TWA of 0.1 mg/m3.

The comparison to OSHA’s notice of proposed rulemaking for occupational exposure to respirable crystalline silica (RCS) is instructive.  The supporting documentation sets forth a preliminary quantitative risk assessment outlining lifetime risks for various disease end points associated with occupational exposure to RCS at various levels.  The preliminary quantitative risk assessment disclosed all of the underlying studies and methodology, sufficient to allow a reader to understand the basis for the risk assessment conclusions and agree or disagree with them.  Based on the risks (and other factors not considered by the ACGIH) set forth in the documentation, OSHA proposed a PEL (permissible exposure level) for RCS.

By contrast, the ACGIH simply contends for its proposed mica TLV: “Consequently, a TLV-TWA of 0.1 mg/m3 measured as respirable fraction (containing no asbestos and <1% crystalline silica) is recommended.”  The materials preceding “[c]onsequently,” which in normal reading would be expected to support the conclusion that follows, are not a risk assessment or anything similar to one, and in no way even superficially support the conclusion – the recommendation – stated.[9]  And yet the ACGIH proposed a TLV-TWA of 0.1 mg/m3 anyway. The NIC materials do not support a TLV of 0.1 mg/m3, any more than they support a TLV of 0.0001 or 10 mg/m3.  It cannot be stated too emphatically that the NIC is devoid of any evidence to support any TLV, including the recommended TLV-TWA of 0.1 mg/m3.
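Indeed, the NIC’s own figures (see note 9 below) invite a simple back-of-the-envelope comparison: the literature the NIC summarizes reports pneumoconiosis in association with exposures of roughly 1-6 mg/m3, and no cases below 1.8 mg/m3, so the proposed value sits well below any exposure level at which effects were actually observed:

\[
\frac{1.8\ \text{mg/m}^3}{0.1\ \text{mg/m}^3} = 18, \qquad \frac{1\ \text{mg/m}^3}{0.1\ \text{mg/m}^3} = 10, \qquad \frac{6\ \text{mg/m}^3}{0.1\ \text{mg/m}^3} = 60.
\]

Nothing in the NIC explains why a margin of 10- to 60-fold below the levels at which disease was reported is the right margin, as opposed to any other number.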

Of course, the ACGIH TLV process is not a federal rulemaking.  And readers should be aware of the ACGIH Position Statement regarding TLVs (“TLVs … are not quantitative levels of risk at different exposure levels…”). But the same Position Statement notes that “TLVs®…represent conditions under which ACGIH® believes that nearly all workers may be repeatedly exposed without adverse health effects.”  So, presumably, the ACGIH concluded that its proposed mica TLV was that level, and yet there is simply no evidence in the NIC to support that conclusion.  Given the regulatory and legal significance of TLVs, the process of establishing TLVs should have some basis in science and evidence.

References

ACGIH® TLV/BEI® Position Statement, available at: https://www.acgih.org/tlv-bei-guidelines/policies-procedures-presentations/tlv-bei-position-statement

ACGIH® TLV/BEI® Policy Statement, available at: https://www.acgih.org/tlv-bei-guidelines/policies-procedures-presentations/tlv-bei-policy-statement

30 C.F.R. § 56.5001 (MSHA exposure to airborne contaminants)

29 C.F.R. § 1910.1200 (OSHA Hazard Communications)

D. Davies & R. Cotton, “Mica pneumoconiosis,” 40 Br. J. Indus. Med. 22 (1983)

Subhabrata Moitra, “Mica pneumoconiosis: a neglected occupational lung disease – letter,” 6 The Lancet Respir. Med. e39 (2018), available at: https://www.thelancet.com/journals/lanres/article/PIIS2213-2600(18)30178-4/fulltext

Notice of Proposed Rule Making (NPRM) for Occupational Exposure to Respirable Crystalline Silica, 78 Fed. Reg. 56274 (Sept. 12, 2013)

Knut R. Skulberg, “Mica pneumoconiosis – a literature review,” 11 Scand. J. Work & Envt’l Health 65 (1985)

Carl J. Smith & Thomas A. Perfetti, “142 ACGIH Threshold Limit Values® (TLV®s) established from 2008-2018 lack consistency and transparency,” 3 Toxicol. Research & Application 1 (2019)


[4] ACGIH states that TLVs are not intended to be legal standards, but the ACGIH recognizes the broad use of TLVs and should reasonably anticipate that TLVs will be used in ways beyond the scope of the legal disclaimers that the ACGIH publishes.

[5] 30 C.F.R. 56.5001

[6] The problems cited by Moitra — illegally operated mica mines exploiting vulnerable populations to work without protections in India and some African countries — speak to the need to eliminate illegal mining and protect vulnerable populations from exploitation, not the adequacy or inadequacy of any TLV.

[7] By comparison, see the OSHA documentation for the NPRM for Occupational Exposure to Respirable Crystalline Silica published September 12, 2013.  The citation to the 1985 Skulberg article (in “Major Sources of Exposure”) is inadequate on its face in 2020; the article summarizes world mica use from 1905-1981, and provides no information regarding use post-1981.  Likewise, the citation to the “Campaign for Safe Cosmetics” does not provide information on occupational exposure.

[8] The case of crystalline silica again provides a useful contrast.  There is no ACGIH TLV for “crystalline silica,” which as a general term includes many polymorphs.

[9] Actually, the materials that precede “consequently” explicitly refute the conclusion stated by the ACGIH (the TLV-TWA of 0.1 mg/m3) — “the published literature has established an association between mica exposure and pneumoconiosis typically at concentrations in the range of 1-6 mg/m3,”  and “no cases were observed among workers exposed to mica dusts at concentrations of 1.8 mg/m3 or less….”

Judicial Dodgers – Rule 702 Tie Does Not Go to Proponent

June 2nd, 2020

The Advisory Committee notes to the year 2000 amendment to Federal Rule of Evidence 702 included a comment:

“A review of the case law after Daubert shows that the rejection of expert testimony is the exception rather than the rule. Daubert did not work a ‘seachange over federal evidence law’, and ‘the trial court’s role as gatekeeper is not intended to serve as a replacement for the adversary system’.” [internal citation omitted]

In describing its review of the case law, perhaps the Committee was attempting to allay the anxiety of technophobic judges. But was the Committee also attempting to derive an “ought” from an “is”?  Before the Supreme Court decided Daubert in 1993, virtually every admissibility challenge to expert witness opinion testimony failed. The trial courts were slow to adapt and to adopt the reframed admissibility standard. As the Joiner case illustrated, some Circuits were even slower to permit trial judges the discretion to assess the validity vel non of expert witnesses’ opinions.

The Committee’s observation about the “exceptional” nature of exclusions was thus unexceptional as a description of the case law before and shortly after Daubert was decided. And even if the Committee were describing a normative view, it is not at all clear how that view should translate into a ruling in a given case, without a very close analysis of the opinions at issue, under the Rule 702 criteria. In baseball, most hitters are thrown out at first base, but that fact does not help an umpire one whit in calling a specific runner “safe” or “out.”  Nonetheless, courts have repeatedly offered the observation about the exceptional nature of exclusion as both an explanation and a justification of their decisions to admit testimony.[1] The Advisory Committee note has thus mutated into a mandate to err on the side of admissibility, as though deliberately committing error were a good thing for any judge to do.[2] First rule: courts shall not err, whether intentionally, recklessly, or negligently.

Close Calls and Resolving Doubts

Another mutant offspring of the “exception, not the rule” mantra is that “[a]ny doubts regarding the admissibility of an expert’s testimony should be resolved in favor of admissibility.”[3] Why not resolve the doubts and rule in accordance with the law? Or, if doubts remain, then charge them against the proponent, who has the burden of showing admissibility? Unlike baseball, in which a tie goes to the runner, in expert witness law a tie goes to the challenger, because the proponent of the testimony has failed to show a preponderance in favor of admissibility. A better mantra: “exclusion when it is the Rule.”

Some courts re-imagine the Advisory Committee’s observation about exceptional exclusions as a recommendation for admitting Rule 702 expert witness opinion testimony as a preferred outcome. Again, that interpretation reverses the burden of proof and makes a mockery of equal justice and scientific due process.

Yet another similar judicial mutation is the notion that courts should refuse Rule 702 motions when they are “close calls.”[4] Telling the litigants that the call was close might help assuage the loser and temper the litigation enthusiasms of the winner, but it does not answer the key question: Did the proponent carry the burden of showing admissibility? Residual doubts would seem to weigh against the proponent.

Not all is lost. In one case, decided by a trial court within the Ninth Circuit, the trial judge explicitly pointed to the proponent’s failure to identify his findings and methodology as part of the basis for exclusion, not admission, of the challenged witness’s opinion testimony.[5] Difficulty in resolving whether the Rule 702 predicates were satisfied worked against, not for, the proponent, whose burden it was to show those predicates.

In another case, Judge David G. Campbell, of the District of Arizona, who has participated in the Rules Committee’s deliberations, showed the way by clearly stating that the exclusion of opinion testimony was required when the Rule 702 conditions were not met:

“Plaintiffs have not shown by a preponderance of the evidence that [the expert witness’s] causation opinions are based on sufficient facts or data to which reliable principles and methods have been applied reliably… .”[6]

Exclusion followed because the absent showings were “conditions for admissibility,” and not “mere” credibility considerations.

Trust Me, I’m a Liberal

One of the reasons that the Daubert Court rejected incorporating the Frye standard into Rule 702 was its view that a rigid “general acceptance” standard “would be at odds with the ‘liberal thrust’ of the Federal Rules.”[7] Some courts have cited this “liberal thrust” as though it explained or justified a particular decision to admit expert witness opinion testimony.[8]

The word “liberal” does not appear in the Federal Rules of Evidence.  Instead, the Rules contain an explicit statement of how judges must construe and apply the evidentiary provisions:

“These rules shall be construed to secure fairness in administration, elimination of unjustifiable expense and delay, and promotion of growth and development of the law of evidence to the end that the truth may be ascertained and proceedings justly determined.”[9]

A “liberal” approach, construed as a “let it all in” approach, would be ill-designed to secure fairness, eliminate unjustifiable expense and time of trial, or lead to just and correct outcomes.  The “liberal” approach of letting in opinion testimony and letting the jury guess at questions of scientific validity would be a most illiberal result.  The truth will not be readily ascertained if expert witnesses are permitted to pass off hypotheses and ill-founded conclusions as scientific knowledge.

Avoiding the rigidity of the Frye standard, which was so rigid that it was virtually never applied, certainly seems like a worthwhile judicial goal. But how do courts get from Justice Blackmun’s “liberal thrust” to a libertine “anything goes”? And why does liberal not connote seeking the truth, free of superstitions? Can it be liberal to permit opinions that are based upon fallacious or flawed inferences, invalid studies, or cherry-picked data sets?

In reviewing the many judicial dodges that are used to avoid engaging in meaningful Rule 702 gatekeeping, I am mindful of Reporter Daniel Capra’s caveat that the ill-advised locutions used by judges do not necessarily mean that their decisions could not be justified by a carefully worded and reasoned opinion showing that Rule 702 and all its subparts were met. Of course, we could infer that the conditions for admissibility were met whenever an expert witness’s opinions were admitted, and ditch the whole process of having judges offer reasoned explanations. Due process, however, requires more. Judges need to specify why they denied Rule 702 challenges in terms of the statutory requirements for admissibility, so that other courts and the Bar can develop a principled jurisprudence of expert witness opinion testimony.


[1]  See, e.g., In re Scrap Metal Antitrust Litig., 527 F.3d 517, 530 (6th Cir. 2008) (“‘[R]ejection of expert testimony is the exception, rather than the rule,’ and we will generally permit testimony based on allegedly erroneous facts when there is some support for those facts in the record.”) (quoting Advisory Committee Note to 2000 Amendments to Rule 702); Citizens State Bank v. Leslie, No. 6-18-CV-00237-ADA, 2020 WL 1065723, at *4 (W.D. Tex. Mar. 5, 2020) (rejecting challenge to expert witness opinion “not based on sufficient facts”; excusing failure to assess factual basis with statement that “the rejection of expert testimony is the exception rather than the rule.”); In re E. I. du Pont de Nemours & Co. C-8 Pers. Injury Litig., No. 2:18-CV-00136, 2019 WL 6894069, at *2 (S.D. Ohio Dec. 18, 2019) (committing naturalistic fallacy; “[A] review of the case law … shows that rejection of the expert testimony is the exception rather than the rule.”); Frankenmuth Mutual Insur. Co. v. Ohio Edison Co., No. 5:17CV2013, 2018 WL 9870044, at *2 (N.D. Ohio Oct. 9, 2018) (quoting Advisory Committee Note “exception”); Wright v. Stern, 450 F. Supp. 2d 335, 359–60 (S.D.N.Y. 2006) (“Rejection of expert testimony, however, is still ‘the exception rather than the rule,’ Fed. R. Evid. 702 advisory committee’s note (2000 Amendments)[.] . . . Thus, in a close case the testimony should be allowed for the jury’s consideration.”) (internal quotation omitted).

[2]  Lombardo v. Saint Louis, No. 4:16-CV-01637-NCC, 2019 WL 414773, at *12 (E.D. Mo. Feb. 1, 2019) (“[T]he Court will err on the side of admissibility.”).

[3]  Mason v. CVS Health, 384 F. Supp. 3d 882, 891 (S.D. Ohio 2019).

[4]  Frankenmuth Mutual Insur. Co. v. Ohio Edison Co., No. 5:17CV2013, 2018 WL 9870044, at *2 (N.D. Ohio Oct. 9, 2018) (concluding “[a]lthough it is a very close call, the Court declines to exclude Churchwell’s expert opinions under Rule 702.”); In re E. I. du Pont de Nemours & Co. C-8 Pers. Injury Litig., No. 2:18-CV-00136, 2019 WL 6894069, at *2 (S.D. Ohio Dec. 18, 2019) (suggesting doubts should be resolved in favor of admissibility).

[5]  Rovid v. Graco Children’s Prod. Inc., No. 17-CV-01506-PJH, 2018 WL 5906075, at *13 (N.D. Cal. Nov. 9, 2018), app. dism’d, No. 19-15033, 2019 WL 1522786 (9th Cir. Mar. 7, 2019).

[6]  Alsadi v. Intel Corp., No. CV-16-03738-PHX-DGC, 2019 WL 4849482, at *4 -*5 (D. Ariz. Sept. 30, 2019).

[7]  Daubert v. Merrell Dow Pharms., Inc. 509 U.S. 579, 588 (1993).

[8]  In re ResCap Liquidating Trust Litig., No. 13-CV-3451 (SRN/HB), 2020 WL 209790, at *3 (D. Minn. Jan. 14, 2020) (“Courts generally support an attempt to liberalize the rules governing the admission of expert testimony, and favor admissibility over exclusion.”)(internal quotation omitted); Collie v. Wal-Mart Stores East, L.P., No. 1:16-CV-227, 2017 WL 2264351, at *1 (M.D. Pa. May 24, 2017) (“Rule 702 embraces a ‘liberal policy of admissibility’, under which it is preferable to admit any evidence that may assist the factfinder[.]”); In re Zyprexa Prod. Liab. Litig., 489 F. Supp. 2d 230, 282 (E.D.N.Y. 2007); Billone v. Sulzer Orthopedics, Inc., No. 99-CV-6132, 2005 WL 2044554, at *3 (W.D.N.Y. Aug. 25, 2005) (“[T]he Supreme Court has emphasized the ‘liberal thrust’ of Rule 702, favoring the admissibility of expert testimony.”).

[9]  Federal Rule of Evidence 102 (“Purpose and Construction”) (emphasis added).

Practical Solutions for the Irreproducibility Crisis

March 3rd, 2020

I have previously praised the National Association of Scholars (NAS) for its efforts to sponsor a conference on “Fixing Science: Practical Solutions for the Irreproducibility Crisis.” The conference was a remarkable event, with a good deal of diverse viewpoints, civil discussion and debate, and collegiality.

The NAS has now posted a follow-up to its conference, with a link to slide presentations, and to a YouTube page with videos of the presentations. The NAS, along with The Independent Institute, should be commended for their organizational efforts, and for their transparency in making the conference contents available to a wider audience.

The conference took place on February 7th and 8th, and I had the privilege of starting the event with my presentation, “Not Just an Academic Dispute: Irreproducible Scientific Evidence Renders Legal Judgments Unsafe”.

Some, but not all, of the interesting presentations that followed:

Tim Edgell, “Stylistic Bias, Selective Reporting, and Climate Science” (Feb. 7, 2020)

Patrick J. Michaels, “Biased Climate Science” (Feb. 7, 2020)

Daniele Fanelli, “Reproducibility Reforms if there is no Irreproducibility Crisis” (Feb. 8, 2020)

On Saturday, I had the additional privilege of moderating a panel on “Group Think” in science, and its potential for skewing research focus and publication:

Lee Jussim, “Intellectual Diversity Limits Groupthink in Scientific Psychology” (Feb. 8, 2020)

Mark Regnerus, “Groupthink in Sociology” (Feb. 8, 2020)

Michael Shermer, “Giving the Devil His Due” (Feb. 8, 2020)

Later on Saturday, the presenters turned to methodological issues, many of which are key to understanding ongoing scientific and legal controversies:

Stanley Young, “Prevention and Management of Acute and Late Toxicities in Radiation Oncology” (Feb. 8, 2020)

James E. Enstrom, “Reproducibility is Essential to Combating Environmental Lysenkoism” (Feb. 8, 2020)

Deborah Mayo, “P-Value ‘Reforms’: Fixing Science or Threats to Replication and Falsification?” (Feb. 8, 2020)

Ronald L. Wasserstein, “What Professional Organizations Can Do To Fix The Irreproducibility Crisis” (Feb. 8, 2020)

Louis Anthony Cox, Jr., “Causality, Reproducibility, and Scientific Generalization in Public Health” (Feb. 8, 2020)

David Trafimow, “What Journals Can Do To Fix The Irreproducibility Crisis” (Feb. 8, 2020)

David Randall, “Regulatory Science and the Irreproducibility Crisis” (Feb. 8, 2020)

Science Journalism – UnDark Noir

February 23rd, 2020

Critics of the National Association of Scholars’ conference on Fixing Science pointed readers to an article in Undark, an online popular science site for lay audiences, and they touted the site for its science journalism. My review of the particular article left me unimpressed and suspicious of Undark’s darker side. When I saw that the site featured an article on the history of the Supreme Court’s Daubert decision, I decided to give the site another try. For one thing, I am sympathetic to the task science journalists take on: it is important and difficult. In many ways, lawyers must take on the same task. Sadly, most journalists and lawyers, with some notable exceptions, lack the scientific acumen and English communication skills the task requires.

The Undark article that caught my attention was a history of the Daubert decision and the Bendectin litigation that gave rise to the Supreme Court case.[1] The author, Peter Andrey Smith, is a freelance reporter who often covers science issues. In his Undark piece, Smith covered some of the oft-told history of the Daubert case, a history told better and in more detail in many legal sources. Smith gets some credit for giving the correct pronunciation of the plaintiff’s name – “DAW-burt” – and for recounting how both sides declared victory after the Supreme Court’s ruling. The explanation Smith gives of the opinion by Associate Justice Harry Blackmun is reasonably accurate, and he correctly notes that a partial dissenting opinion by Chief Justice Rehnquist complained that the majority’s decision would have trial judges become “amateur scientists.” Nowhere in the article, however, will you find the counter to the dissent: an honest assessment of the institutional and individual competence of juries to decide complex scientific issues.

The author’s biases, however, eventually become obvious. He recounts his interviews with Jason Daubert and his mother, Joyce Daubert. He earnestly reports how Joyce Daubert remembered having taken Bendectin during her pregnancy with Jason, and how, in the moment of that recall, “she felt she’d finally identified the teratogen that harmed Jason.” Really? Is that how teratogens are identified? Might it have been useful and relevant for a science journalist to explain that there are four million live births every year in the United States, and that 3% of children born each year have major congenital malformations? And that most malformations have no known cause? Smith ingenuously relays that Jason Daubert had genetic testing, but omits that genetic testing in the early 1990s was fairly primitive and limited. In any event, how were any expert witnesses supposed to rule out the base-line risk of birth defects, especially given weak to non-existent epidemiologic support for the Dauberts’ claims? Smith does not answer these questions; he does not even acknowledge them.
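The base-rate arithmetic that Smith never confronts is straightforward; using the figures just mentioned:

\[
4{,}000{,}000\ \text{live births/year} \times 0.03 \;\approx\; 120{,}000\ \text{children/year with major congenital malformations},
\]

most with no known cause, and all without regard to Bendectin.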

Smith later quotes Joyce Daubert as describing the litigation she signed up for as “the hill I’ll die on. You only go to war when you think you can win.” Without comment or analysis, Smith gives Joyce Daubert an opportunity to rant against the “injustice” of how her lawsuit turned out. Smith tells us that the Dauberts found the “legal system remains profoundly disillusioning.” Joyce Daubert told Smith that “it makes me feel stupid that I was so naïve to think that, after we’d invested so much in the case, that we would get justice.”  When called for jury duty, she introduces herself as

“I’m Daubert of Daubert versus Merrell Dow … ; I don’t want to sit on this jury and pretend that I can pass judgment on somebody when there is no justice. Please allow me to be excused.”

But didn’t she really get all the justice she deserved? Given her zealotry, doesn’t she deserve to have her name on the decision that serves to rein in expert witnesses who outrun their scientific headlights? Smith is coy and does not say, but in presenting Mrs. Daubert’s rant, without presenting the other side, he is using his journalistic tools in a fairly blatant attempt to mislead. At this point, I begin to get the feeling that Smith is preaching to a like-minded choir over there at Undark.

The reader is not treated to any interviews with anyone from the company that made Bendectin, any of its scientists, or any of the scientists who published actual studies on whether Bendectin was associated with the particular birth defects Jason Daubert had, or for that matter, with any birth defects at all. The plaintiffs’ expert witnesses quoted and cited never published anything at all on the subject. The readers are left to their imagination about how the people who developed Bendectin felt about the litigation strategies and tactics of the lawsuit industry.

The journalistic ruse is continued with Smith’s treatment of the other actors in the Daubert passion play. Smith describes the Bendectin plaintiffs’ lawyer Barry Nace in hagiographic terms, but omits his bar disciplinary proceedings.[2] Smith tells us that Nace had an impressive background in chemistry, and quotes him in an interview in which he described the evidentiary rules on scientific witness testimony as “scientific evidence crap.”

Smith never describes the Dauberts’ actual affirmative evidence in any detail, which one might expect in a sophisticated journalistic outlet. Instead, he described some of their expert witnesses, Shanna Swan, a reproductive epidemiologist, and Alan K. Done, “a former pediatrician from Wayne State University.” Smith is secretive about why Done was done in at Wayne State; and we learn nothing about the serious accusations that Done perjured himself about his credentials. Instead, Smith regales us with Done’s tsumish theory, which takes inconclusive bits of evidence, throws them together, and then declares causation that somehow eludes the rest of the scientific establishment.

Smith tells us that Swan was a rebuttal witness, who gave an opinion that the data did not rule out “the possibility Bendectin caused defects.” Legally and scientifically, Smith is derelict in failing to explain that the burden was on the party claiming causation, and that Swan’s efforts to manufacture doubt were beside the point. Merrell Dow did not have to rule out any possibility of causation; the plaintiffs had to establish causation. Nor does Smith delve into how Swan sought to reprise her performance in the silicone gel breast implant litigation, only to be booted by several judges as an expert witness. And then for a convincer, Smith sympathetically repeats plaintiffs’ lawyer Barry Nace’s hyperbolic claim that Bendectin manufacturer, Merrell Dow had been “financing scientific articles to get their way,” adding by way of emphasis, in his own voice:

“In some ways, here was the fake news of its time: If you lacked any compelling scientific support for your case, one way to undermine the credibility of your opponents was by calling their evidence ‘junk science’.”

Against Nace’s scatological Jackson Pollock approach, Smith is silent about another plaintiffs’ expert witness, William McBride, who was found guilty of scientific fraud.[3] Smith reports interviews of several well-known, well-respected evidence scholars. He dutifully reports Professor Edward Cheng’s view that “the courts were right to dismiss the [Bendectin] plaintiffs’ claims.” Smith quotes Professor D. Michael Risinger as saying that claims from both sides in the Bendectin cases were exaggerated, and that the 1970s and 1980s saw an “unbridled expansion of self-anointed experts,” with “causation in toxic torts” having “been allowed to become extremely lax.” So a critical reader might wonder why someone like Professor Cheng, who has a doctorate in statistics, a law degree from Harvard, and teaches at Vanderbilt Law School, would vindicate the manufacturers’ position in the Bendectin litigation. Smith never attempts to reconcile his interviews of the law professors with the emotive comments of Barry Nace and Joyce Daubert.

Smith acknowledges that a reformulated version of Bendectin, known as Diclegis, was approved by the Food and Drug Administration in the United States, in 2013, for treatment of nausea and vomiting during pregnancy. Smith tells us that Joyce is “not convinced the drug should be back on the market,” but really, why would any reasonable person care about her view of the matter? The challenge by Nav Persaud, a Toronto physician, is cited, but Persaud’s challenge is to the claim of efficacy, not to the safety of the medication. Smith tells us that Jason Daubert “briefly mulled reopening his case when Diclegis, the updated version of Bendectin, was re-approved.” But how would the approval of Diclegis, on the strength of a full new drug application, somehow support his claim anew? And how would he “reopen” a claim that had been fully litigated in the 1990s, and was well past any statute of limitations?

Is this straight reporting? I think not. It is manipulative and misleading.

Smith notes, without attribution, that some scholars condemn litigation, such as the cases involving Bendectin, as an illegitimate form of regulation of medications. In opposition, he appears to rely upon Elizabeth Chamblee Burch, a professor at the University of Georgia School of Law for the view that because the initial pivotal clinical trials for regulatory approvals take place in limited populations, litigation “serves as a stopgap for identifying rare adverse outcomes that could crop up when several hundreds of millions of people are exposed to those products over longer periods of time.” The problem with this view is that Smith ignores the whole process of pharmacovigilance, post-registration trials, and pharmaco-epidemiologic studies conducted after the licensing of a new medication. The suggested necessity of reliance upon the litigation system as an adjunct to regulatory approval is at best misplaced and tenuous.

Smith correctly explains that the Daubert standard is still resisted in criminal cases, where it could much improve the gatekeeping of forensic expert witness opinion. But while the author gets his knickers in a knot over wrongful convictions, he seems quite indifferent to wrongful judgments in civil actions.

Perhaps the one positive aspect of this journalistic account of the Daubert case was that Jason Daubert, unlike his mother, was open-minded about his role in transforming the law of scientific evidence. According to Smith, Jason Daubert did not see the case as having ruined his life. Indeed, Jason seemed to approve the basic principle of the Daubert case, and the subsequent legislation that refined the admissibility standard: “Good science should be all that gets into the courts.”


[1] Peter Andrey Smith, “Where Science Enters the Courtroom, the Daubert Name Looms Large: Decades ago, two parents sued a drug company over their newborn’s deformity – and changed courtroom science forever,” Undark (Feb. 17, 2020).

[2]  Lawyer Disciplinary Board v. Nace, 753 S.E.2d 618, 621–22 (W. Va.) (per curiam), cert. denied, 134 S. Ct. 474 (2013).

[3] Neil Genzlinger, “William McBride, Who Warned About Thalidomide, Dies at 91,” N.Y. Times (July 15, 2018); Leigh Dayton, “Thalidomide hero found guilty of scientific fraud,” New Scientist (Feb. 27, 1993); G.F. Humphrey, “Scientific fraud: the McBride case,” 32 Med. Sci. Law 199 (1992); Andrew Skolnick, “Key Witness Against Morning Sickness Drug Faces Scientific Fraud Charges,” 263 J. Am. Med. Ass’n 1468 (1990).

Counter Cancel Culture – Part II: The Fixing Science Conference

February 12th, 2020

So this is what it is like to be denounced? My ancestors fled the Czar’s lands before they could be tyrannized by denunciations of Stalin’s Soviets. The work of contemporary denunciators is surely much milder, but no more principled than the Soviet versions of yesteryear.

Now that I am back from the Fixing Science conference, sponsored by the Independent Institute and the National Association of Scholars (NAS), I can catch up with the media coverage of the event. I have already addressed Dr. Lenny Teytelman’s issues in an open letter to him. John Mashey is a computer scientist who has written critical essays on climate science denial. On the opening day of the conference, he published online his take on it.[1] Mashey acknowledges that the Fixing Science conference included “credible speakers who want to improve some areas of science hurt by the use of poor statistical methods or making irreproducible claims,” but his post devolves into scurrilous characterizations of several presenters. Alas, some of the ad hominems are tossed at me, and here is what I have to say about them.

Mashey misspells my name, “Schactman,” but that is a minor flaw of scholarship. He writes that I have “published much on evidence,” which is probably too laudatory. I am hardly a recognized scholar on the law of evidence, although I know something about this area, and have published in it.

Mashey tautologically declares that I “may or may not be a ‘product defense lawyer’ (akin to Louis Anthony Cox) defending companies against legitimate complaints.” Mashey seems unaware of how the rule of law works in our country. Plaintiffs file complaints, but the standard for the legitimacy of these complaints is VERY low. Courts require the parties to engage in discovery of their claims and defenses, and then courts address dispositive motions to dismiss either the claims or the defenses. So, sometimes after years of work, legitimate complaints are revealed to be bogus complaints, and then the courts will dismiss bogus complaints, and thus legitimate complaints become illegitimate complaints. In my 36 years at the bar, I am proud to have been able to show that a great many apparently legitimate complaints were anything but what they seemed.

Mashey finds me “worrying” and “concerning.” My children are sometimes concerned about me, and even worry about me, but I do not think that Mashey was trying to express solicitude for me.

Why worry? Well, David Michaels in his most recent book, Triumph of Doubt (2020), has an entire chapter on silica dust. And I, worrisomely, have written and spoken about silica and silicosis litigation, sometimes in a way critical of the plaintiffs’ litigation claims. Apparently, Mashey does not worry that David Michaels may be an unreliable protagonist who worked as a paid witness for the lawsuit industry on many occasions before becoming the OSHA Administrator, in which position he ignored enforcement of existing silica regulations in order to devote a great deal of time, energy, and money to revising them. The evidentiary warrant for Michaels’ new silica rule struck me then, and now, as slim, but the real victims, workers, suffered because Michaels was so intent on changing a rule in the face of decades of declining silicosis mortality that he failed, in my view, to attend to specific instances of over-exposure.

Mashey finds me concerning because two radical labor historians do not like me. (I think I am going to eat a worm ….) Mashey quotes at length from an article by these historians, criticizing me for having had the audacity to criticize them.[2] Oh my.

What Mashey does not tell his readers was that, as co-chair of a conference on silicosis litigation (along with a co-chair who was a plaintiffs’ lawyer), I invited historian Gerald Markowitz to speak and air his views on the history of silica regulation and litigation. In response, I delivered a paper that criticized, and I would dare say, rebutted many of Markowitz’s historical conclusions and his inferences from an incomplete, selectively assembled, and sometimes incorrect, set of historical facts. I later published my paper.

Mashey tells his readers that my criticisms, based not upon what I wrote, but upon the partisan cries of Rosner and Markowitz, “seems akin to Wood’s style of attack.” Well, if so, nicely done, Wood.

But does Mashey believe that his readers deserve to know that Rosner and Markowitz have testified repeatedly on behalf of the lawsuit industry, that is, those entrepreneurs who make lawsuits?[3] And that Rosner and Markowitz have been amply remunerated for their labors as partisan witnesses in these lawsuits?

And is Mashey worried or concerned that in the United States, silicosis litigation has been infused with fraud and deception, not by the defendants, but by the litigation industry that creates the lawsuits? Absent from Rosner and Markowitz’s historical narratives is any mention of the frauds that have led to dismissals of thousands of cases, and the professional defrocking of any number of physician witnesses.  In re Silica Products Liab. Litig., MDL No. 1553, 398 F. Supp. 2d 563 (S.D.Tex. 2005). Even the redoubtable expert witness for the plaintiffs’ bar, David S. Egilman, has published articles that point out the unethical and unlawful nature of the medico-legal screenings that gave rise to the silicosis litigation, which Michaels, Rosner, and Markowitz seem to support, or at the very least suppress any criticism of.[4]

So this is what it means to be denounced! Mashey’s piece is hardly an advertisement for the intellectual honesty of those who would de-platform the NAS conference. He has selectively and inaccurately addressed my credentials. As just one example, and in an effort to diminish the NAS, he has omitted that I received a grant from the NASEM to develop a teaching module on scientific causation. My finished paper is published online at the NASEM website.[5]

I do not know Mashey, but I leave it to you to judge him by his sour fruits.


[1]  John Mashey, “Dark-Moneyed Denialists Are Running ‘Fixing Science’ Symposium of Doubt,” Desmog Blog (Feb. 7, 2020).

[2]  David Rosner & Gerald Markowitz, “The Trials and Tribulations of Two Historians:  Adjudicating Responsibility for Pollution and Personal Harm,” 53 Medical History 271, 280-81 (2009) (criticizing me for expressing the view that historians should not be permitted to testify and thereby circumvent the rules of evidence). See also David Rosner & Gerald Markowitz, “L’histoire au prétoire.  Deux historiens dans les procès des maladies professionnelles et environnementales,” 56 Revue D’Histoire Moderne & Contemporaine 227, 238-39 (2009) (same); D. Rosner, “Trials and Tribulations:  What Happens When Historians Enter the Courtroom,” 72 Law & Contemporary Problems 137, 152 (2009) (same). I once thought there was an academic standard that prohibited duplicative publication!

[3] I have been critical of Rosner and Markowitz on many occasions; they have never really responded to the substance of my criticisms. See, e.g., “How Testifying Historians Are Like Lawn-Mowing Dogs,” (May 15, 2010).

[4]  See David Egilman and Susanna Rankin Bohme, “Attorney-directed screenings can be hazardous,” 45 Am. J. Indus. Med. 305 (2004); David Egilman, “Asbestos screenings,” 42 Am. J. Indus. Med. 163 (2002).

[5]  “Drug-Induced Birth Defects: Exploring the Intersection of Regulation, Medicine, Science, and Law – An Educational Module” (2016) (A teaching module designed to help professional school students and others evaluate the role of science in decision-making, developed for the National Academies of Science, Engineering, and Medicine, and its Committee on Preparing the Next Generation of Policy Makers for Science-Based Decisions).

Counter Cancel Culture – The NAS Conference on Irreproducibility

February 9th, 2020

“The meaning of the world is the separation of wish and fact.”  Kurt Gödel

Back in October 2019, David Randall, the Director of Research of the National Association of Scholars, contacted me to ask whether I would be interested in presenting at a conference, to be titled “Fixing Science: Practical Solutions for the Irreproducibility Crisis.” David explained that the conference would be aimed at a high-level consideration of whether such a crisis existed, and if so, what salutary reforms might be implemented.

As for the character and commitments of the sponsoring organizations, David was candid and forthcoming. I will quote him, without his permission, and ask his forgiveness later:

“The National Association of Scholars is taken to be conservative by many scholars; the Independent Institute is (broadly speaking) in the libertarian camp. The NAS is open to but currently agnostic about the degree of human involvement in climate change. The Independent Institute I take to be institutionally skeptical of consensus climate change theory–e.g., they recently hosted Willie Soon for lecture. A certain number of speakers prefer not to participate in events hosted by institutions with these commitments.”

To me, the ask was for a presentation on how the so-called replication crisis, or the irreproducibility crisis, affected the law. This issue was certainly one I have had much occasion to consider. Although I am aware of the “adjacency” arguments made by some that people should be mindful of whom they align with, I felt that nothing in my participation would compromise my own views or unduly accredit institutional positions of the sponsors.

I was flattered by the invitation, but I did some due diligence on the sponsoring organizations. I vaguely recalled the Independent Institute from my more libertarian days, but the National Association of Scholars (NAS, not to be confused with Nathan A. Schachtman) was relatively unknown to me. A little bit of research showed that the NAS had a legitimate interest in the irreproducibility crisis. David Randall had written a monograph for the organization, which was a nice summary of some of the key problems. The Irreproducibility Crisis of Modern Science: Causes, Consequences, and the Road to Reform (2018).

On other issues, the NAS seemed to live up to its description as “an organization of scholars committed to higher education as the catalyst of American freedom.” I listened to some of the group’s podcasts, Curriculum Vitae, and browsed through its publications. I found myself agreeing with many positions articulated by or through the NAS, and disagreeing with a few positions very strongly.

In looking over the list of other invited speakers, I saw great diversity of viewpoints and approaches. One distinguished speaker, Daniele Fanelli, had criticized the very notion that there was a reproducibility crisis. In the world of statistics, there were strong defenders of statistical tests, and vociferous critics. I decided to accept the invitation, not because I was flattered, but because the replication issue was important, and I believed that I could add something to the discussion before an audience of professional scientists, statisticians, and educated lay persons. In writing to David Randall to accept the invitation, I told him that with respect to the climate change issues, I was not at all put off by healthy skepticism in the face of all dogmas. Every dogma will have its day.

I did not give any further consideration to the political aspect of the conference until early January, when I received an email from a scientist, Lenny Teytelman, Ph.D., the C.E.O. of a company, protocols.io, which addresses reproducibility issues. Dr. Teytelman’s interest in improving reproducibility seemed quite genuine, but he wrote to express his deep concern about the conference and the organizations that were sponsoring it.

Perhaps a bit pedantically, he cautioned me that the NAS was not the National Academy of Sciences, a confusion that never occurred to me because the National Academies has been known as the National Academies of Sciences, Engineering, and Medicine for several years now. Dr. Teytelman’s real concern seemed to be that the NAS is a “‘politically conservative advocacy group’.” (The internal scare quotes were Teytelman’s, but I was not afraid.) According to Dr. Teytelman, the NAS sought to undermine climate science and environmental protection by advancing a call for more reproducible science. He pointed me to what he characterized as an exposé on NAS, in Undark,1 and he cautioned me that the National Association of Scholars’ work is “dangerous.” Finally, Dr. Teytelman urged me to reconsider my decision to participate in the conference.

I did reconsider my decision, but reaffirmed it in an email I sent back to Dr. Teytelman. I realized that I could be wrong, in which case, I would eat my words, confident that they would be most digestible:

Dear Dr Teytelman,

Thank you for your note. I was aware of the piece on Undark’s website, as well as the difference between the NAS and the NASEM. I don’t believe anyone involved in science education would likely to be confused between the two organizations. A couple of years ago, I wrote a teaching module on biomedical causation for the National Academies. This is my first presentation at the request of the NAS, and frankly I am honored by the organization’s request that I present at its conference.

I have read other materials that have been critical of the NAS and its publications on climate change and other issues. I know that there are views of the organization from which I would dissent, but I do not see my disagreement on some issues as a reason not to attend, and present at a conference on an issue of great importance to the legal system.

I am hardly an expert on climate change issues, and that is my failing. Most of my professional work involves health effects regulation and litigation. If the NAS has advanced sophistical arguments against a scientific claim, then the proper antidote will be to demonstrate its fallacious reasoning and misleading marshaling of evidence. I should think, however, as someone interested in improving the reproducibility of scientific research, you will agree that there is much common ground for discussion and reform of scientific practice, on a broader arrange [sic] of issues than climate change.

As for the political ‘conservatism’, of the organization, I am not sure why that is a reason to eschew participation in a conference that should be of great importance to people of all political views. My own politics probably owe much to the influence of Michael Oakeshott, which puts me in perhaps the smallest political tribe of any in the United States. If conservatism means antipathy to post-modernism, identity politics, political orthodoxies, and assaults on Enlightenment values and the Rule of Law, then count me in.

In any event, thanks for your solicitude. I think I can participate and return with my soul intact.

All the best.

Nathan

To his credit, Dr. Teytelman tenaciously continued. He acknowledged that the political leanings of the organizers were not a reason to boycott, but he politely pressed his case. We were now on a first name basis:

Dear Nathan,

I very much applaud all efforts to improve the rigour of our science. The problem here is that this NAS organization has a specific goal – undermining the environmental protection and denying climate change. This is why 7 out of the 21 speakers at the event are climate change deniers. [https://docs.google.com/spreadsheets/d/136FNLtJzACc6_JbbOxjy2urbkDK7GefRZ/edit?usp=sharing] And this isn’t some small fringe effort to be ignored. Efforts of this organization and others like them have now gotten us to the brink of a regulatory change at the United States Environmental Protection Agency which can gut the entire EPA (see a recent editorial against this I co-authored). This conference is not a genuine effort to talk about reproducibility. The reproducibility part is a clever disguise for pushing a climate change denialism agenda.

Best,

Lenny

I looked more carefully at Lenny’s spreadsheet, and considered the issue afresh. We were both pretty stubborn:

Dear Lenny,

Thank you for this information. I will review with interest.

I do not see that the conference is primarily or even secondarily about climate change vel non. There are two scientists, Trafimow and Wasserstein, with whom I have some disagreements about statistical methodology. Tony Cox and Stan Young, whatever their political commitments or views on climate change may be, are both very capable statisticians, from whom I have learned a great deal. The conference should be a lively conversation about reproducibility, not about climate change. Given your interests and background, you should go.

I believe that your efforts here are really quite illiberal, although they are in line with the ‘cancel culture’, so popular on campuses these days.

Forty-three years ago, I entered a Roman Catholic Church to marry the woman I love. There were no lightning bolts or temblors, even though I was then and I am now an atheist. Yes, I am still married to my first wife. Although I share the late Christopher Hitchens’ low view of the Catholic Church, somehow I managed to overcome my antipathy to being married in what some would call a house of ill repute. I even manage to agree with some Papist opinions, although not for the superstitious reasons Papists embrace.

If I could tolerate the RC Church’s dogma for a morning, perhaps you could put aside the dichotomous ‘us and them’ view of the world and participate in what promises to be an interesting conference on reproducibility?

All the best.

Nathan

Lenny kindly acknowledged my having considered his issues, and wrote back a nice note, which I will quote again in full without permission, but with the hope that he will forgive me and even acknowledge that I have given his views an airing in this forum.

Hi Nathan,

We’ll have to agree to disagree. I don’t want to give a veneer of legitimacy to an organization whose goal is not improving reproducibility but derailing EPA and climate science.

Warmly,

Lenny

The business of psychoanalyzing motives and disparaging speakers and conference organizers is dangerous for several reasons. First, motives can be inscrutable. Second, they can be misinterpreted. And third, they can be mixed. When speaking of organizations, there is the further complication of discerning a corporate motive among the constituent members.

The conference was an exciting, intellectually challenging event, which took place in Oakland, California, on February 7 and 8. I can report back to Lenny that his characterizations of and fears about the conference were unwarranted. While there were some assertions of climate change skepticism made with little or no evidence, the evidence-based presentations essentially affirmed climate change and sought to understand its causes and future course in a scientific way. But climate change was not why I went to this conference. On the more general issue of reform of scientific procedures and methods, we had open debates, some agreement on important principles, and robust and reasoned disagreement.

Lenny, you were correct that the NAS should not be ignored, but you should have gone to the meeting and participated in the conversation.


1 Michael Schulson, “A Remedy for Broken Science, Or an Attempt to Undercut It?” Undark (April 18, 2018).

Judicial Gatekeeping Cures Claims That Viagra Can Cause Melanoma

January 24th, 2020

The phosphodiesterase type 5 inhibitor medications (PDE5i) seem to arouse the litigation propensities of the lawsuit industry. The PDE5i medications (sildenafil, tadalafil, etc.) have multiple indications, but they are perhaps best known for their ability to induce penile erections, which in some situations can be a very useful outcome.

The launch of Viagra in 1998 was followed by litigation that claimed the drug caused heart attacks, and not the romantic kind. The only broken hearts, however, were those of the plaintiffs’ lawyers and their expert witnesses who saw their litigation claims excluded and dismissed.[1]

Then came claims that the PDE5i medications caused non-arteritic anterior ischemic optic neuropathy (“NAION”), based upon a dubious epidemiologic study by Dr. Gerald McGwin. This litigation demonstrated, if anything, that while love may be blind, erections need not be.[2] The NAION cases were consolidated in a multi-district litigation (MDL) in front of Judge Paul Magnuson, in the District of Minnesota. After considerable back and forth, Judge Magnuson ultimately concluded that the McGwin study was untrustworthy, and the NAION claims were dismissed.[3]

In 2014, the American Medical Association’s internal medicine journal published an observational epidemiologic study of sildenafil (Viagra) use and melanoma.[4] The authors of the study interpreted their study modestly, concluding:

“[s]ildenafil use may be associated with an increased risk of developing melanoma. Although this study is insufficient to alter clinical recommendations, we support a need for continued investigation of this association.”

Although the Li study eschewed causal conclusions and new clinical recommendations in view of the need for more research into the issue, the litigation industry filed lawsuits, claiming causality.[5]

In the new natural order of things, as soon as the litigation industry cranks out more than a few complaints, an MDL results, and the PDE5i – melanoma claims were no exception. By spring 2016, plaintiffs’ counsel had collected ten cases, a minyan, sufficient for an MDL.[6] The MDL plaintiffs named the manufacturers of sildenafil and tadalafil, two of the more widely prescribed PDE5i medications, on behalf of putative victims.

While the MDL cases were winding their way through discovery and possible trials, additional studies and meta-analyses were published. None of the subsequent studies, including the systematic reviews and meta-analyses, concluded that there was a causal association. Most scientists who were publishing on the issue opined that systematic error (generally confounding) prevented a causal interpretation of the data.[7]

Many of the observational studies found statistically significantly increased relative risks of about 1.1 to 1.2 (10 to 20%), typically with upper bounds of 95% confidence intervals less than 2.0. The only scientists who inferred general causation from the available evidence were those who had been recruited and retained by plaintiffs’ counsel. As plaintiffs’ expert witnesses, they contended that the Li study, and the several studies that became available afterwards, collectively showed that PDE5i drugs cause melanoma in humans.
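
For readers who want the arithmetic made concrete, here is a minimal sketch, with wholly invented cohort counts (not data from any PDE5i study), of how a relative risk of 1.2 can be statistically significant and yet weak, with the upper bound of its 95% confidence interval well below 2.0:

```python
import math

# Invented counts, chosen only to illustrate the arithmetic;
# they are not data from any PDE5i study.
a, n1 = 480, 40_000   # melanoma cases and cohort size, exposed
b, n0 = 400, 40_000   # melanoma cases and cohort size, unexposed

rr = (a / n1) / (b / n0)                   # relative risk = 1.20
se = math.sqrt(1/a - 1/n1 + 1/b - 1/n0)    # SE of log(RR), Wald method
lo = math.exp(math.log(rr) - 1.96 * se)
hi = math.exp(math.log(rr) + 1.96 * se)

print(f"RR = {rr:.2f}, 95% CI ({lo:.2f}, {hi:.2f})")
# RR = 1.20, 95% CI (1.05, 1.37): "significant" because the lower bound
# exceeds 1.0, but far short of a strong association.
```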

Not surprisingly, given the absence of any non-litigation experts endorsing the causal conclusion, the defendants challenged plaintiffs’ proffered expert witnesses under Federal Rule of Evidence 702. Plaintiffs’ counsel also embraced judicial gatekeeping and challenged the defense experts. The MDL trial judge, the Hon. Richard Seeborg, held hearings with four days of viva voce testimony from four of plaintiffs’ expert witnesses (two on biological plausibility, and two on epidemiology), and three of the defense’s experts. Last week, Judge Seeborg ruled by granting in part, and denying in part, the parties’ motions.[8]

The Decision

The MDL trial judge’s opinion is noteworthy in many respects. First, Judge Richard Seeborg cited and applied Rule 702, a statute, and not dicta from case law that predates the most recent statutory version of the rule. As a legal process matter, this respect for judicial process and the difference in legal authority between statutory and common law was refreshing. Second, the judge framed the Rule 702 issue, in line with the statute, and Ninth Circuit precedent, as an inquiry whether expert witnesses deviated from the standard of care of how scientists “conduct their research and reach their conclusions.”[9]

Biological Plausibility

Plaintiffs proffered three expert witnesses on biological plausibility, Drs. Rizwan Haq, Anand Ganesan, and Gary Piazza. All were subject to motions to exclude under Rule 702. Judge Seeborg denied the defense motions against all three of plaintiffs’ plausibility witnesses.[10]

The MDL judge determined that biological plausibility is neither necessary nor sufficient for inferring causation in science or in the law. The defense argued that the plausibility witnesses relied upon animal and cell culture studies that were unrealistic models of the human experience.[11] The MDL court, however, found that the standard for opinions on biological plausibility is relatively forgiving, and that the testimony of all three of plaintiffs’ proffered witnesses was admissible.

The subjective nature of opinions about biological plausibility is widely recognized in medical science.[12] Plausibility determinations are typically “Just So” stories, offered in the absence of hard evidence that postulated mechanisms are actually involved in a real causal pathway in human beings.

Causal Association

The real issue in the MDL hearings was the conclusion reached by plaintiffs’ expert witnesses that the PDE5i medications cause melanoma. The MDL court did not have to determine whether epidemiologic studies were necessary for such a causal conclusion. Plaintiffs’ counsel had proffered three expert witnesses with more or less expertise in epidemiology: Drs. Rehana Ahmed-Saucedo, Sonal Singh, and Feng Liu-Smith. All of plaintiffs’ epidemiology witnesses, and certainly all of defendants’ experts, implicitly if not explicitly embraced the proposition that analytical epidemiology was necessary to determine whether PDE5i medications can cause melanoma.

In their motions to exclude Ahmed-Saucedo, Singh, and Liu-Smith, the defense pointed out that, although many of the studies yielded statistically significant estimates of melanoma risk, none of the available studies adequately accounted for systematic bias in the form of confounding. Although the plaintiffs’ plausibility expert witnesses advanced “Just-So” stories about PDE5i and melanoma, the available studies showed an almost identical increased risk of basal cell carcinoma of the skin, which would be explained by confounding, but not by plaintiffs’ postulated mechanisms.[13]
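
The logic of that argument can be made concrete with a toy calculation. In this hypothetical (the numbers are invented, not drawn from the litigation record), a confounder such as more frequent dermatologic surveillance among PDE5i users inflates the observed risk of both outcomes equally, even though the drug itself does nothing:

```python
# Invented inputs for a confounding illustration; not litigation data.
p_c_exposed, p_c_unexposed = 0.30, 0.10  # assumed prevalence of confounder C
rr_c = 2.0                               # assumed effect of C on diagnosis
base_melanoma, base_bcc = 0.001, 0.003   # assumed baseline diagnosis risks

def marginal_risk(base, p_c):
    # Average risk over subjects with and without the confounder
    return base * rr_c * p_c + base * (1 - p_c)

for name, base in [("melanoma", base_melanoma),
                   ("basal cell carcinoma", base_bcc)]:
    rr = marginal_risk(base, p_c_exposed) / marginal_risk(base, p_c_unexposed)
    print(f"{name}: crude RR = {rr:.2f}")
# Both outcomes show the same spurious RR of about 1.18; the baseline risk
# cancels out, which is why confounding, unlike a melanoma-specific
# mechanism, elevates melanoma and basal cell carcinoma alike.
```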

The MDL court acknowledged that whether epidemiologic studies “adequately considered” confounding was “central” to the Rule 702 inquiry. Without any substantial analysis, however, the court gave its own ipse dixit that the existence vel non of confounding was an issue for cross-examination and the jury’s resolution.[14] Whether there was a reasonably valid association between PDE5i and melanoma was a jury question. This judicial refusal to engage with the issue of confounding was one of the disappointing aspects of the decision.

The MDL court was less forgiving when it came to the plaintiffs’ epidemiology expert witnesses’ assessment of the association as causal. All the parties’ epidemiology witnesses invoked Sir Austin Bradford Hill’s viewpoints or factors for judging whether associations were causal.[15] Although they embraced Hill’s viewpoints on causation, the plaintiffs’ epidemiologic expert witnesses had a much more difficult time faithfully applying them to the evidence at hand. The MDL court concluded that the plaintiffs’ witnesses deviated from their own professional standard of care in their analysis of the data.[16]

Hill’s first enumerated factor was “strength of association,” which is typically expressed epidemiologically as a risk ratio or a risk difference. The MDL court noted that the extant epidemiologic studies generally showed relative risks around 1.2 for PDE5i and melanoma, which was “undeniably” not a strong association.[17]

The plaintiffs’ epidemiology witnesses were at sea on how to explain away the lack of strength in the putative association. Dr. Ahmed-Saucedo retreated into an emphasis on how all or most of the studies found some increased risk, but the MDL court correctly found that this ruse was merely a conflation of strength with consistency of the observed associations. Dr. Ahmed-Saucedo’s dismissal of a dose-response relationship, another Hill factor, as unimportant sealed her fate. The MDL court found that her Bradford Hill analysis was “unduly results-driven,” and that her proffered testimony was not admissible.[18] The MDL court likewise found that Dr. Feng Liu-Smith conflated strength of association with consistency, an error that was too great a deviation from the professional standard of care.[19]

Dr. Sonal Singh fared no better after he contradicted his own prior testimony that there is an order of importance to the Hill factors, with “strength of association” at or near the top. In the face of a set of studies, none of which showed a strong association, Dr. Singh abandoned his own interpretative principle to suit the litigation needs of the case. His analysis placed the greatest weight on the Li study, which had the highest risk ratio, but he failed to advance any persuasive reason for his emphasis on one of the smallest studies available. The MDL court found Dr. Singh’s claim to have weighed strength of association heavily, despite the obvious absence of strong associations, puzzling, and too great an analytical gap to abide.[20]

Judge Seeborg thus concluded that while the plaintiffs’ expert witnesses could opine that there was an association, which was arguably plausible, they could not, under Rule 702, contend that the association was causal. In attempting to advance an argument that the association met Bradford Hill’s factors for causality, the plaintiffs’ witnesses had ignored, misrepresented, or confused one of the most important factors, strength of the association, in a way that revealed their analyses to be results-driven and unfaithful to the methodology they claimed to have followed. Judge Seeborg emphasized a feature of the revised Rule 702, which often is ignored by his fellow federal judges:[21]

“Under the amendment, as under Daubert, when an expert purports to apply principles and methods in accordance with professional standards, and yet reaches a conclusion that other experts in the field would not reach, the trial court may fairly suspect that the principles and methods have not been faithfully applied. See Lust v. Merrell Dow Pharmaceuticals, Inc., 89 F.3d 594, 598 (9th Cir. 1996). The amendment specifically provides that the trial court must scrutinize not only the principles and methods used by the expert, but also whether those principles and methods have been properly applied to the facts of the case.”

Given that the plaintiffs’ witnesses purported to apply a generally accepted methodology, Judge Seeborg was left to question why they would conclude causality when no one else in their field had done so.[22] The epidemiologic issue had been around for several years, and addressed not just in observational studies, but systematically reviewed and meta-analyzed. The absence of published causal conclusions was not just an absence of evidence, but evidence of absence of expert support for how plaintiffs’ expert witnesses applied the Bradford Hill factors.

Reliance Upon Studies That Did Not Conclude Causation Existed

Parties challenging causal claims will sometimes point to the absence of a causal conclusion in the publication of individual epidemiologic studies that are the main basis for the causal claim. In the PDE5i-melanoma cases, the defense advanced this argument unsuccessfully. The MDL court rejected the defense argument, based upon the absence of any comprehensive review of all the pertinent evidence for or against causality in an individual study; the study authors are mostly concerned with conveying the results of their own study.[23] The authors may have a short discussion of other study results as the rationale for their own study, but such discussions are often limited in scope and purpose. Judge Seeborg, in this latest round of PDE5i litigation, thus did not fault plaintiffs’ witnesses’ reliance upon epidemiologic or mechanistic studies, which individually did not assert causal conclusions; rather it was the absence of causal conclusions in systematic reviews, meta-analyses, narrative reviews, regulatory agency pronouncements, or clinical guidelines that ultimately raised the fatal inference that the plaintiffs’ witnesses were not faithfully deploying a generally accepted methodology.

The defense argument that pointed to the individual epidemiologic studies themselves derives some legal credibility from the Supreme Court’s opinion in General Electric Co. v. Joiner, 522 U.S. 136 (1997). In Joiner, the SCOTUS took plaintiffs’ expert witnesses to task for drawing stronger conclusions than were offered in the papers upon which they relied. Chief Justice Rehnquist gave considerable weight to the consideration that the plaintiffs’ expert witnesses relied upon studies, the authors of which explicitly refused to interpret as supporting a conclusion of human disease causation.[24]

Joiner’s criticisms of the reliance upon studies that do not themselves reach causal conclusions have gained a foothold in the case law interpreting Rule 702. The Fifth Circuit, for example, has declared:[25]

“It is axiomatic that causation testimony is inadmissible if an expert relies upon studies or publications, the authors of which were themselves unwilling to conclude that causation had been proven.”

This aspect of Joiner may properly limit the over-interpretation or misinterpretation of an individual study, which seems fine.[26] The Joiner case may, however, perpetuate an authority-based view of science to the detriment of requiring good and sufficient reasons to support the testifying expert witnesses’ opinions.  The problem with Joiner’s suggestion that expert witness opinion should not be admissible if it disagrees with the study authors’ discussion section is that sometimes study authors grossly over-interpret their data.  When it comes to scientific studies written by “political scientists” (scientists who see their work as advancing a political cause or agenda), then the discussion section often becomes a fertile source of unreliable, speculative opinions that should not be given credence in Rule 104(a) contexts, and certainly should not be admissible in trials. In other words, the misuse of non-rigorous comments in published articles can cut both ways.

There have been, and will continue to be, occasions in which published studies contain data, relevant and important to the causation issue, but which studies also contain speculative, personal opinions expressed in the Introduction and Discussion sections.  The parties’ expert witnesses may disagree with those opinions, but such disagreements hardly reflect poorly upon the testifying witnesses.  Neither side’s expert witnesses should be judged by those out-of-court opinions.  Perhaps the hearsay discussion section may be considered under Rule 104(a), which suspends the application of the Rules of Evidence, but it should hardly be a dispositive factor, other than raising questions for the reviewing court.

In exercising their gatekeeping function, trial judges should exercise care in how they assess expert witnesses’ reliance upon study data and analyses, when they disagree with the hearsay authors’ conclusions or discussions.  Given how many journals cater to advocacy scientists, and how variable the quality of peer review is, testifying expert witnesses should, in some instances,  have the expertise to interpret the data without substantial reliance upon, or reference to, the interpretative comments in the published literature.

Judge Seeborg sensibly seems to have distinguished between the absence of causal conclusions in individual epidemiologic studies and the absence of causal conclusions in any reputable medical literature.[27] He refused to be ensnared in the Joiner argument because:[28]

“Epidemiology studies typically only expressly address whether an association exists between agents such as sildenafil and tadalafil and outcomes like melanoma progression. As explained in In re Roundup Prod. Liab. Litig., 390 F. Supp. 3d 1102, 1116 (N.D. Cal. 2018), ‘[w]hether the agents cause the outcomes, however, ordinarily cannot be proven by epidemiological studies alone; an evaluation of causation requires epidemiologists to exercise judgment about the import of those studies and to consider them in context’.”

This new MDL opinion, relying upon the Advisory Committee Notes to Rule 702, is thus a more felicitous statement of the goals of gatekeeping.

Confidence Intervals

As welcome as some aspects of Judge Seeborg’s opinion are, the decision is not without mistakes. The district judge, like so many of his judicial colleagues, trips over the proper interpretation of a confidence interval:[29]

“When reviewing the results of a study it is important to consider the confidence interval, which, in simple terms, is the ‘margin of error’. For example, a given study could calculate a relative risk of 1.4 (a 40 percent increased risk of adverse events), but show a 95 percent ‘confidence interval’ of .8 to 1.9. That confidence interval means there is 95 percent chance that the true value—the actual relative risk—is between .8 and 1.9.”

This statement is inescapably wrong. The 95 percent probability attaches to the capturing of the true parameter – the actual relative risk – in the long run of repeated confidence intervals that result from repeated sampling of the same sample size, in the same manner, from the same population. In Judge Seeborg’s example, the next sample might give a relative risk point estimate of 1.9, and that new estimate will have a confidence interval that may run from just below 1.0 to over 3. A third sample might turn up a relative risk estimate of 0.8, with a confidence interval that runs from, say, 0.3 to 1.4. Neither the second nor the third sample would be reasonably incompatible with the first. A more accurate assessment is that the true parameter may lie anywhere between 0.3 and 3, a considerably broader range than any single 95 percent interval conveys.
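
The frequentist interpretation is easy to demonstrate by simulation. In the sketch below (a binomial proportion stands in for the risk parameter), roughly 95 percent of the intervals generated by repeated sampling cover the true value; no single interval enjoys a 95 percent probability of containing it:

```python
import math
import random

random.seed(1)

def wald_ci(n, p_true):
    """Wald 95% confidence interval for a proportion from one sample."""
    x = sum(random.random() < p_true for _ in range(n))
    p_hat = x / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - 1.96 * se, p_hat + 1.96 * se

p_true, n, trials = 0.10, 500, 10_000
covered = sum(lo <= p_true <= hi
              for lo, hi in (wald_ci(n, p_true) for _ in range(trials)))
print(f"{covered / trials:.1%} of intervals captured the true value")
# About 95% of the procedure's intervals cover the truth in the long run;
# any one interval either contains the parameter or it does not.
```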

Judge Seeborg’s error is sadly all too common. Whenever I see the error, I wonder whence it came. Often the error is in briefs of both plaintiffs’ and defense counsel. In this case, I did not see the erroneous assertion about confidence intervals made in plaintiffs’ or defendants’ briefs.


[1]  Brumley  v. Pfizer, Inc., 200 F.R.D. 596 (S.D. Tex. 2001) (excluding plaintiffs’ expert witness who claimed that Viagra caused heart attack); Selig v. Pfizer, Inc., 185 Misc. 2d 600 (N.Y. Cty. S. Ct. 2000) (excluding plaintiff’s expert witness), aff’d, 290 A.D. 2d 319, 735 N.Y.S. 2d 549 (2002).

[2]  “Love is Blind but What About Judicial Gatekeeping of Expert Witnesses? – Viagra Part I” (July 7, 2012); “Viagra, Part II — MDL Court Sees The Light – Bad Data Trump Nuances of Statistical Inference” (July 8, 2012).

[3]  In re Viagra Prods. Liab. Litig., 572 F.Supp. 2d 1071 (D. Minn. 2008), 658 F. Supp. 2d 936 (D. Minn. 2009), and 658 F. Supp. 2d 950 (D. Minn. 2009).

[4]  Wen-Qing Li, Abrar A. Qureshi, Kathleen C. Robinson, and Jiali Han, “Sildenafil use and increased risk of incident melanoma in US men: a prospective cohort study,” 174 J. Am. Med. Ass’n Intern. Med. 964 (2014).

[5]  See, e.g., Herrara v. Pfizer Inc., Complaint in 3:15-cv-04888 (N.D. Calif. Oct. 23, 2015); Diana Novak Jones, “Viagra Increases Risk Of Developing Melanoma, Suit Says,” Law360 (Oct. 26, 2015).

[6]  See In re Viagra (Sildenafil Citrate) Prods. Liab. Litig., 176 F. Supp. 3d 1377, 1378 (J.P.M.L. 2016).

[7]  See, e.g., Jenny Z. Wang, Stephanie Le , Claire Alexanian, Sucharita Boddu, Alexander Merleev, Alina Marusina, and Emanual Maverakis, “No Causal Link between Phosphodiesterase Type 5 Inhibition and Melanoma,” 37 World J. Men’s Health 313 (2019) (“There is currently no evidence to suggest that PDE5 inhibition in patients causes increased risk for melanoma. The few observational studies that demonstrated a positive association between PDE5 inhibitor use and melanoma often failed to account for major confounders. Nonetheless, the substantial evidence implicating PDE5 inhibition in the cyclic guanosine monophosphate (cGMP)-mediated melanoma pathway warrants further investigation in the clinical setting.”); Xinming Han, Yan Han, Yongsheng Zheng, Qiang Sun, Tao Ma, Li Dai, Junyi Zhang, and Lianji Xu, “Use of phosphodiesterase type 5 inhibitors and risk of melanoma: a meta-analysis of observational studies,” 11 OncoTargets & Therapy 711 (2018).

[8]  In re Viagra (Sildenafil Citrate) and Cialis (Tadalafil) Prods. Liab. Litig., Case No. 16-md-02691-RS, Order Granting in Part and Denying in Part Motions to Exclude Expert Testimony (N.D. Calif. Jan. 13, 2020) [cited as Opinion].

[9]  Opinion at 8 (“determin[ing] whether the analysis undergirding the experts’ testimony falls within the range of accepted standards governing how scientists conduct their research and reach their conclusions”), citing Daubert v. Merrell Dow Pharm., Inc. (Daubert II), 43 F.3d 1311, 1317 (9th Cir. 1995).

[10]  Opinion at 11.

[11]  Opinion at 11-13.

[12]  See Kenneth J. Rothman, Sander Greenland, and Timothy L. Lash, “Introduction,” chap. 1, in Kenneth J. Rothman, et al., eds., Modern Epidemiology at 29 (3d ed. 2008) (“no approach can transform plausibility into an objective causal criterion”).

[13]  Opinion at 15-16.

[14]  Opinion at 16-17.

[15]  See Austin Bradford Hill, “The Environment and Disease: Association or Causation?” 58 Proc. Royal Soc’y Med. 295 (1965); see also “Woodside & Davis on the Bradford Hill Considerations” (April 23, 2013).

[16]  Opinion at 17 – 21.

[17]  Opinion at 18. The MDL court cited In re Silicone Gel Breast Implants Prod. Liab. Litig., 318 F. Supp. 2d 879, 893 (C.D. Cal. 2004), for the proposition that relative risks greater than 2.0 permit the inference that the agent under study “was more likely than not responsible for a particular individual’s disease.”

[18]  Opinion at 18.

[19]  Opinion at 20.

[20]  Opinion at 19.

[21]  Opinion at 21, quoting from Rule 702, Advisory Committee Notes (emphasis in Judge Seeborg’s opinion).

[22]  Opinion at 21.

[23]  SeeFollow the Data, Not the Discussion” (May 2, 2010).

[24]  Joiner, 522 U.S. at 145-46 (noting that the PCB studies at issue did not support expert witnesses’ conclusion that PCB exposure caused cancer because the study authors, who conducted the research, were not willing to endorse a conclusion of causation).

[25]  Huss v. Gayden, 571 F.3d 442  (5th Cir. 2009) (citing Vargas v. Lee, 317 F.3d 498, 501-01 (5th Cir. 2003) (noting that studies that did not themselves embrace causal conclusions undermined the reliability of the plaintiffs’ expert witness’s testimony that trauma caused fibromyalgia); see also McClain v. Metabolife Internat’l, Inc., 401 F.3d 1233, 1247-48 (11th Cir. 2005) (expert witnesses’ reliance upon studies that did not reach causal conclusions about ephedrine supported the challenge to the reliability of their proffered opinions); Happel v. Walmart, 602 F.3d 820, 826 (7th Cir. 2010) (observing that “is axiomatic that causation testimony is inadmissible if an expert relies upon studies or publications, the authors of which were themselves unwilling to conclude that causation had been proven”).

[26]  In re Accutane Prods. Liab. Litig., 511 F. Supp. 2d 1288, 1291 (M.D. Fla. 2007) (“When an expert relies on the studies of others, he must not exceed the limitations the authors themselves place on the study. That is, he must not draw overreaching conclusions.”) (internal citations omitted).

[27]  See Rutigliano v. Valley Bus. Forms, 929 F. Supp. 779, 785 (D.N.J. 1996), aff’d, 118 F.3d 1577 (3d Cir. 1997) (“law warns against use of medical literature to draw conclusions not drawn in the literature itself …. Reliance upon medical literature for conclusions not drawn therein is not an accepted scientific methodology.”).

[28]  Opinion at 14.

[29]  Opinion at 4 – 5.

Statistical Significance at the New England Journal of Medicine

July 19th, 2019

Some wild stuff has been going on in the world of statistics, at the American Statistical Association, and elsewhere. A very few obscure journals have declared p-values to be verboten, and presumably confidence intervals as well. The world of biomedical research has generally reacted more sanely, with authors defending the existing frequentist approaches and standards.[1]

This week, the editors of the New England Journal of Medicine have issued new statistical guidelines for authors. The Journal’s approach seems appropriately careful and conservative for the world of biomedical research. In an editorial introducing the new guidelines,[2] the Journal editors remind their potential authors that statistical significance and p-values are here to stay:

“Despite the difficulties they pose, P values continue to have an important role in medical research, and we do not believe that P values and significance tests should be eliminated altogether. A well-designed randomized or observational study will have a primary hypothesis and a prespecified method of analysis, and the significance level from that analysis is a reliable indicator of the extent to which the observed data contradict a null hypothesis of no association between an intervention or an exposure and a response. Clinicians and regulatory agencies must make decisions about which treatment to use or to allow to be marketed, and P values interpreted by reliably calculated thresholds subjected to appropriate adjustments have a role in those decisions.”[3]

The Journal’s editors described their revamped statistical policy as being based upon three premises:

(1) adhering to prespecified analysis plans if they exist;

(2) declaring associations or effects only for statistical analyses that have pre-specified “a method for controlling type I error”; and

(3) presenting evidence about clinical benefits or harms requires “both point estimates and their margins of error.”

With a hat tip to the ASA’s recent pronouncements on statistical significance,[4] the editors suggest that their new guidelines have moved away from using statistical significance “as a bright-line marker for a conclusion or a claim”[5]:

“[T]he notion that a treatment is effective for a particular outcome if P < 0.05 and ineffective if that threshold is not reached is a reductionist view of medicine that does not always reflect reality.”[6]

The editors’ language intimates greater latitude for authors in claiming associations or effects from their studies, but this latitude may well be circumscribed by tighter control over such claims in the inevitable context of multiple testing within a dataset.

The editors’ introduction of the new guidelines is not entirely coherent. The introductory editorial notes that the use of p-values for reporting multiple outcomes, without adjustments for multiplicity, inflates the number of findings with p-values less than 5%. The editors thus caution against “uncritical interpretation of multiple inferences,” which can be particularly threatening to valid inference when not all the comparisons conducted by the study investigators have been reported in their manuscript.[7] They reassuringly tell prospective authors that many methods are available to adjust for multiple comparisons, and can be used to control Type I error probability “when specified in the design of a study.”[8]
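
The editors’ multiplicity concern is easy to quantify. A brief simulation (illustrative only) shows that testing 20 true null hypotheses at the 5 percent level produces at least one “significant” result almost two-thirds of the time, and that a simple pre-specified Bonferroni adjustment restores the intended error rate:

```python
import random

random.seed(2)

# Under a true null hypothesis, a p-value is uniformly distributed on (0, 1).
def null_p_value():
    return random.random()

k, trials = 20, 10_000
naive = sum(any(null_p_value() < 0.05 for _ in range(k))
            for _ in range(trials))
print(f"At least one p < 0.05 among {k} null tests: {naive / trials:.0%}")
# About 64%, not 5%.

bonferroni = 0.05 / k   # one simple, pre-specifiable adjustment
adjusted = sum(any(null_p_value() < bonferroni for _ in range(k))
               for _ in range(trials))
print(f"With the Bonferroni threshold {bonferroni}: {adjusted / trials:.0%}")
# Back to about 5%.
```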

But what happens when such adjustment methods are not pre-specified in the study design? Failure to do so does not appear to be disqualifying for publication in the Journal. For one thing, when the statistical analysis plan of the study has not specified adjustment methods for controlling type I error probabilities, then authors must replace p-values with “estimates of effects or association and 95% confidence intervals.”[9] It is hard to understand how this edict helps when the specified coefficient of 95% is a continuation of the 5% alpha, which would have been used in any event. The editors seem to be saying that if authors fail to pre-specify or even post-specify methods for controlling error probabilities, then they cannot declare statistical significance, or use p-values, but they can use confidence intervals in the same way they have been using them, and with the same misleading interpretations supplied by their readers.
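
The point about the 95 percent coefficient can be shown directly. For the usual Wald-type analysis, a 95 percent confidence interval excludes the null value exactly when the two-sided p-value falls below 0.05. The inputs below are hypothetical, but the duality is general:

```python
import math

# Hypothetical summary statistics: a log relative risk and its standard
# error, as might be reported in an observational study.
log_rr, se = math.log(1.25), 0.10

z = abs(log_rr / se)
p = 2 * (1 - 0.5 * (1 + math.erf(z / math.sqrt(2))))   # two-sided p-value
lo = math.exp(log_rr - 1.96 * se)
hi = math.exp(log_rr + 1.96 * se)

print(f"p = {p:.3f}; 95% CI ({lo:.2f}, {hi:.2f})")
# p = 0.026 and the interval (1.03, 1.52) excludes 1.0: reporting the 95%
# interval instead of the p-value carries exactly the same 5% alpha.
```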

More important, another price authors will have to pay for multiple testing without pre-specified methods of adjustment is that they will affirmatively have to announce their failure to adjust for multiplicity and that their putative associations “may not be reproducible.” Tepid as this concession is, it is better than previous practice, and perhaps it will become a badge of shame. The crucial question is whether judges, in exercising their gatekeeping responsibilities, will see these acknowledgements as disabling valid inferences from studies that carry this mandatory warning label.

The editors have not issued guidelines for the use of Bayesian statistical analyses, because “the large majority” of author manuscripts use only frequentist analyses.[10] The editors inform us that “[w]hen appropriate,” they will expand their guidelines to address Bayesian and other designs. Perhaps this expansion will be appropriate when Bayesian analysts establish a track record of abuse in their claiming of associations and effects.

The new guidelines themselves are not easy to find. The Journal has not published these guidelines as an article in their published issues, but has relegated them to a subsection of their website’s instructions to authors for new manuscripts:

https://www.nejm.org/author-center/new-manuscripts

Presumably, the actual author instructions control in any perceived discrepancy between this week’s editorial and the guidelines themselves. Authors are told that p-values generally should be two-sided, and that:

“Significance tests should be accompanied by confidence intervals for estimated effect sizes, measures of association, or other parameters of interest. The confidence intervals should be adjusted to match any adjustment made to significance levels in the corresponding test.”

Similarly, the guidelines call for, but do not require, pre-specified methods of controlling family-wide error rates for multiple comparisons. For observational studies submitted without pre-specified methods of error control, the guidelines recommend the use of point estimates and 95% confidence intervals, with an explanation that the interval widths have not been adjusted for multiplicity, and a caveat that the inferences from these findings may not be reproducible. The guidelines recommend against using p-values for such results, but again, it is difficult to see why reporting the 95% confidence intervals is recommended when p-values are not recommended.


[1]  Jonathan A. Cook, Dean A. Fergusson, Ian Ford, Mithat Gonen, Jonathan Kimmelman, Edward L. Korn, and Colin B. Begg, “There is still a place for significance testing in clinical trials,” 16 Clin. Trials 223 (2019).

[2]  David Harrington, Ralph B. D’Agostino, Sr., Constantine Gatsonis, Joseph W. Hogan, David J. Hunter, Sharon-Lise T. Normand, Jeffrey M. Drazen, and Mary Beth Hamel, “New Guidelines for Statistical Reporting in the Journal,” 381 New Engl. J. Med. 285 (2019).

[3]  Id. at 286.

[4]  See id. (“Journal editors and statistical consultants have become increasingly concerned about the overuse and misinterpretation of significance testing and P values in the medical literature. Along with their strengths, P values are subject to inherent weaknesses, as summarized in recent publications from the American Statistical Association.”) (citing Ronald L. Wasserstein & Nicole A. Lazar, “The ASA’s statement on p-values: context, process, and purpose,” 70 Am. Stat. 129 (2016); Ronald L. Wasserstein, Allen L. Schirm, and Nicole A. Lazar, “Moving to a world beyond ‘p < 0.05’,” 73 Am. Stat. s1 (2019)).

[5]  Id. at 285.

[6]  Id. at 285-86.

[7]  Id. at 285.

[8]  Id., citing Alex Dmitrienko, Frank Bretz, Ajit C. Tamhane, Multiple testing problems in pharmaceutical statistics (2009); Alex Dmitrienko & Ralph B. D’Agostino, Sr., “Multiplicity considerations in clinical trials,” 378 New Engl. J. Med. 2115 (2018).

[9]  Id.

[10]  Id. at 286.

Science Bench Book for Judges

July 13th, 2019

On July 1st of this year, the National Judicial College and the Justice Speakers Institute, LLC released an online publication of the Science Bench Book for Judges [Bench Book]. The Bench Book sets out to cover much of the substantive material already covered by the Federal Judicial Center’s Reference Manual:

Acknowledgments

Table of Contents

  1. Introduction: Why This Bench Book?
  2. What is Science?
  3. Scientific Evidence
  4. Introduction to Research Terminology and Concepts
  5. Pre-Trial Civil
  6. Pre-trial Criminal
  7. Trial
  8. Juvenile Court
  9. The Expert Witness
  10. Evidence-Based Sentencing
  11. Post Sentencing Supervision
  12. Civil Post Trial Proceedings
  13. Conclusion: Judges—The Gatekeepers of Scientific Evidence

Appendix 1 – Frye/Daubert—State-by-State

Appendix 2 – Sample Orders for Criminal Discovery

Appendix 3 – Biographies

The Bench Book gives some good advice in very general terms about the need to consider study validity,[1] and to approach scientific evidence with care and “healthy skepticism.”[2] When the Bench Book attempts to instruct on what it represents as the scientific method of hypothesis testing, the good advice unravels:

“A scientific hypothesis simply cannot be proved. Statisticians attempt to solve this dilemma by adopting an alternate [sic] hypothesis – the null hypothesis. The null hypothesis is the opposite of the scientific hypothesis. It assumes that the scientific hypothesis is not true. The researcher conducts a statistical analysis of the study data to see if the null hypothesis can be rejected. If the null hypothesis is found to be untrue, the data support the scientific hypothesis as true.”[3]

Even in experimental settings, a statistical analysis of the data does not lead to a conclusion that the null hypothesis is untrue, as opposed to a conclusion that the data are not reasonably compatible with the null hypothesis. In observational studies, the statistical analysis must acknowledge whether and to what extent the study has excluded bias and confounding. When the Bench Book turns to speak of statistical significance, more trouble ensues:

“The goal of an experiment, or observational study, is to achieve results that are statistically significant; that is, not occurring by chance.”[4]

In the world of result-oriented science, and scientific advocacy, it is perhaps true that scientists seek to achieve statistically significant results. Still, it seems crass to come right out and say so, as opposed to saying that the scientists are querying the data to see whether they are compatible with the null hypothesis. This first pass at statistical significance is only mildly astray compared with the Bench Book’s more serious attempts to define statistical significance and confidence intervals:

4.10 Statistical Significance

“The research field agrees that study outcomes must demonstrate they are not the result of random chance. Leaving room for an error of .05, the study must achieve a 95% level of confidence that the results were the product of the study. This is denoted as p ≤ 05. (or .01 or .1).”[5]

and

“The confidence interval is also a way to gauge the reliability of an estimate. The confidence interval predicts the parameters within which a sample value will fall. It looks at the distance from the mean a value will fall, and is measured by using standard deviations. For example, if all values fall within 2 standard deviations from the mean, about 95% of the values will be within that range.”[6]

Of course, the interval speaks to the precision of the estimate, not its reliability, but that is a small point. These definitions are virtually guaranteed to confuse judges into conflating statistical significance and the coefficient of confidence with the legal burden of proof probability.
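
A toy Bayesian screening calculation (every input is an assumption, chosen only for illustration) shows how far a 95 percent coefficient of confidence can sit from a 95 percent probability that the claim is true:

```python
# Assumed inputs for illustration only: 1 in 10 tested hypotheses is true,
# tests have 80% power, and the significance level is 5%.
prior_true = 0.10
power, alpha = 0.80, 0.05

p_true_and_sig = prior_true * power          # true hypothesis, significant
p_false_and_sig = (1 - prior_true) * alpha   # false hypothesis, significant
posterior = p_true_and_sig / (p_true_and_sig + p_false_and_sig)

print(f"P(hypothesis true | p < 0.05) = {posterior:.0%}")
# About 64%: nothing close to 95%, and not a legal burden of proof either.
```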

The Bench Book runs into problems in interpreting legal decisions, which would seem softer grist for the judicial mill. The authors present dictum from the Daubert decision as though it were a holding:[7]

“As noted in Daubert, ‘[t]he focus, of course, must be solely on principles and methodology, not on the conclusions they generate’.”

The authors fail to mention that this dictum was abandoned in Joiner, and that it is specifically rejected by statute, in the 2000 revision to the Federal Rule of Evidence 702.

Early in the Bench Book, its authors present a subsection entitled “The Myth of Scientific Objectivity,” which they might have borrowed from Feyerabend or Derrida. The heading appears misleading because the text contradicts it:

“Scientists often develop emotional attachments to their work—it can be difficult to abandon an idea. Regardless of bias, the strongest intellectual argument, based on accepted scientific hypotheses, will always prevail, but the road to that conclusion may be fraught with scholarly cul-de-sacs.”[8]

In a similar vein, the authors misleadingly tell readers that “the forefront of science is rarely encountered in court,” and so “much of the science mentioned there shall be considered established….”[9] Of course, the reality is that many causal claims presented in court have already been rejected or held to be indeterminate by the scientific community. And just when readers may think themselves safe from the goblins of nihilism, the authors launch into a theory of naïve probabilism, in which science is just the placing of subjective probabilities upon data, based upon preconceived biases and beliefs:

“All of these biases and beliefs play into the process of weighing data, a critical aspect of science. Placing weight on a result is the process of assigning a probability to an outcome. Everything in the universe can be expressed in probabilities.”[10]

So help the expert witness who honestly (and correctly) testifies that the causal claim or its rejection cannot be expressed as a probability statement!

Although I have not read all of the Bench Book closely, there appears to be no meaningful discussion of Rule 703, or of the need to access underlying data to ensure that the proffered scientific opinion under scrutiny has used appropriate methodologies at every step in its development. Even a 412-page text cannot address every issue, but this one does little to help the judicial reader find more in-depth help on statistical and scientific methodological issues that arise in occupational and environmental disease claims, and in pharmaceutical products litigation.

The organizations involved in this Bench Book appear to be honest brokers of remedial education for judges. The writing of this Bench Book was funded by the State Justice Institute (SJI), which is a creation of federal legislation enacted with the laudatory goal of improving the quality of judging in state courts.[11] Despite its provenance in federal legislation, the SJI is a private, nonprofit corporation, governed by 11 directors appointed by the President, and confirmed by the Senate. A majority of the directors (six) are state court judges; the remainder are one state court administrator and four members of the public (no more than two from any one political party). The function of the SJI is to award grants to improve judging in state courts.

The National Judicial College (NJC) originated in the early 1960s, from the efforts of the American Bar Association, American Judicature Society and the Institute of Judicial Administration, to provide education for judges. In 1977, the NJC became a Nevada not-for-profit 501(c)(3) educational corporation, with its campus at the University of Nevada, Reno, where judges could go for training and recreational activities.

The Justice Speakers Institute appears to be a for-profit company that provides educational resources for judges. A Press Release touts the Bench Book and follow-on webinars. Caveat emptor.

The rationale for this Bench Book is open to question. Unlike the Reference Manual for Scientific Evidence, which was co-produced by the Federal Judicial Center and the National Academies of Sciences, the Bench Book’s authors are lawyers and judges, without any subject-matter expertise. Unlike the Reference Manual, the Bench Book’s chapters have no scientist or statistician authors, and it shows. Remarkably, the Bench Book does not appear to cite to the Reference Manual or the Manual on Complex Litigation, at any point in its discussion of the federal law of expert witnesses or of scientific or statistical method. Perhaps taxpayers would have been spared substantial expense if state judges were simply encouraged to read the Reference Manual.


[1]  Bench Book at 190.

[2]  Bench Book at 174 (“Given the large amount of statistical information contained in expert reports, as well as in the daily lives of the general society, the ability to be a competent consumer of scientific reports is challenging. Effective critical review of scientific information requires vigilance, and some healthy skepticism.”).

[3]  Bench Book at 137; see also id. at 162.

[4]  Bench Book at 148.

[5]  Bench Book at 160.

[6]  Bench Book at 152.

[7]  Bench Book at 233, quoting Daubert v. Merrell Dow Pharms., Inc., 509 U.S. 579, 595 (1993).

[8]  Bench Book at 10.

[9]  Id. at 10.

[10]  Id. at 10.

[11] See State Justice Institute Act of 1984 (42 U.S.C. ch. 113, 42 U.S.C. § 10701 et seq.).

The Shmeta-Analysis in Paoli

July 11th, 2019

In the Paoli Railroad yard litigation, plaintiffs claimed injuries and increased risk of future cancers from environmental exposure to polychlorinated biphenyls (PCBs). This massive litigation showed up before federal district judge Hon. Robert F. Kelly,[1] in the Eastern District of Pennsylvania, who may well have been the first judge to grapple with a litigation attempt to use meta-analysis to show a causal association.

One of the plaintiffs’ expert witnesses was the late William J. Nicholson, who was a professor at Mt. Sinai School of Medicine, and a colleague of Irving Selikoff. Nicholson was trained in physics, and had no professional training in epidemiology. Nonetheless, Nicholson was Selikoff’s go-to colleague for performing epidemiologic studies. After Selikoff withdrew from active testifying for plaintiffs in tort litigation, Nicholson was one of his colleagues who jumped into the fray as a surrogate advocate for Selikoff.[2]

For his opinion that PCBs were causally associated with liver cancer in humans,[3] Nicholson relied upon a report he wrote for the Ontario Ministry of Labor. [cited here as “Report”].[4] Nicholson described his report as a “study of the data of all the PCB worker epidemiological studies that had been published,” from which he concluded that there was “substantial evidence for a causal association between excess risk of death from cancer of the liver, biliary tract, and gall bladder and exposure to PCBs.”[5]

The defense challenged the admissibility of Nicholson’s meta-analysis, on several grounds. The trial court decided the challenge based upon the Downing case, which was the law in the Third Circuit, before the Supreme Court decided Daubert.[6] The Downing case allowed some opportunity for consideration of reliability and validity concerns; there is, however, disappointingly little discussion of any actual validity concerns in the courts’ opinions.

The defense challenge to Nicholson’s proffered testimony on liver cancer turned on its characterization of meta-analysis as a “novel” technique, which is generally unreliable, and its claim that Nicholson’s meta-analysis in particular was unreliable. None of the individual studies that contributed data showed any “connection” between PCBs and liver cancer; nor did any individual study conclude that there was a causal association.

Of course, the appropriate response to this situation, with no one study finding a statistically significant association, or concluding that there was a causal association, should have been “so what?” One of the reasons to do a meta-analysis is that no available study was sufficiently large to find a statistically significant association, if one were there. As for drawing conclusions of causal associations, it is not the role or place of an individual study to synthesize all the available evidence into a principled conclusion of causation.
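
A valid pooling would have combined study-level estimates, weighting each by the inverse of its variance, rather than adding counts. What follows is a minimal sketch of fixed-effect, inverse-variance pooling, with hypothetical relative risks and confidence intervals chosen only to show the mechanics; three individually non-significant studies can yield a significant pooled estimate:

```python
import math

# Hypothetical study-level relative risks with 95% CIs; invented numbers,
# used only to show the mechanics of fixed-effect pooling.
studies = [(1.4, 0.85, 2.3), (1.5, 0.90, 2.5), (1.6, 0.95, 2.7)]

total_w, total_wx = 0.0, 0.0
for rr, lo, hi in studies:
    se = (math.log(hi) - math.log(lo)) / (2 * 1.96)  # SE from the CI width
    w = 1 / se**2                                    # inverse-variance weight
    total_w += w
    total_wx += w * math.log(rr)

pooled_log = total_wx / total_w
se_pooled = math.sqrt(1 / total_w)
print(f"Pooled RR = {math.exp(pooled_log):.2f}, "
      f"95% CI ({math.exp(pooled_log - 1.96 * se_pooled):.2f}, "
      f"{math.exp(pooled_log + 1.96 * se_pooled):.2f})")
# Each study's own interval includes 1.0; the pooled estimate, about
# 1.49 (1.11, 2.01), does not. That narrowing is the legitimate point of
# meta-analysis, when it is done validly.
```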

In any event, the trial court concluded that the proffered novel technique lacked sufficient reliability, that the meta-analysis would “overwhelm, confuse, or mislead the jury,” and that the proffered meta-analysis on liver cancer was not sufficiently relevant to the facts of the case (in which no plaintiff had developed, or had died of, liver cancer). The trial court noted that the Report had not been peer-reviewed, and that it had not been accepted or relied upon by the Ontario government for any finding or policy decision. The trial court also expressed its concern that the proffered testimony along the lines of the Report would possibly confuse the jury because it appeared to be “scientific” and because Nicholson appeared to be qualified.

The Appeal

The Court of Appeals for the Third Circuit, in an opinion by Judge Becker that is still sometimes cited, reversed Judge Kelly’s exclusion of the Nicholson Report, even though Downing is no longer good law in the Circuit or anywhere else.[7] The Court was ultimately not persuaded that the trial court had handled the exclusion of Nicholson’s Report and its meta-analysis correctly, and it remanded the case for a do-over analysis.

Judge Becker described Nicholson’s Report as a “meta-analysis,” which pooled or “combined the results of numerous epidemiologic surveys in order to achieve a larger sample size, adjusted the results for differences in testing techniques, and drew his own scientific conclusions.”[8] Through this method, Nicholson claimed to have shown that “exposure to PCBs can cause liver, gall bladder and biliary tract disorders … even though none of the individual surveys supports such a conclusion when considered in isolation.”[9]

Validity

The appellate court gave no weight to the possibility that a meta-analysis would confuse a jury, or that its “scientific nature” or Nicholson’s credentials would lead a jury to give it more weight than it deserved.[10] The Court of Appeals conceded, however, that exclusion would have been appropriate if the methodology used itself was invalid. The appellate opinion further acknowledged that the defense had offered opposition to Nicholson’s Report in which it documented his failure to include data that were inconsistent with his conclusions, and that “Nicholson had produced a scientifically invalid study.”[11]

Judge Becker’s opinion for a panel of the Third Circuit provided no details about the cherry-picking. The opinion never analyzed why this charge of cherry-picking and manipulation of the dataset did not invalidate the meta-analytic method generally, or Nicholson’s method as applied. The opinion gave no suggestion that this counter-affidavit was ever answered by the plaintiffs.

Generally, Judge Becker’s opinion dodged engagement with the specific threats to validity in Nicholson’s Report, and took refuge in the indisputable fact that hundreds of meta-analyses were published annually, and that the defense expert witnesses did not question the general reliability of meta-analysis.[12] These facts undermined the defense claim that meta-analysis was novel.[13] The reality, however, was that meta-analysis was then in its infancy in biomedical research.

When it came to the specific meta-analysis at issue, the court did not discuss or analyze a single pertinent detail of the Report. Despite its lack of engagement with the specifics of the Report’s meta-analysis, the court astutely observed that prevalent errors and flaws do not mean that a particular meta-analysis is “necessarily in error.”[14] Of course, without bothering to look, the court would not know whether the proffered meta-analysis was “actually in error.”

The appellate court would have given Nicholson’s Report a “pass” if it were an application of an accepted methodology. The defense’s remedy under this condition would be to cross-examine the opinion in front of a jury. If, on the other hand, Nicholson had altered an accepted methodology to skew its results, then the court’s gatekeeping responsibility under Downing would be invoked.

The appellate court went on to fault the trial court for failing to make sufficiently explicit findings as to whether the questioned meta-analysis was unreliable. From its perspective, the Court of Appeals saw the trial court as resolving the reliability issue upon the greater credibility of defense expert witnesses in branding the disputed meta-analysis as unreliable. Credibility determinations are for the jury, but the court left room for a challenge on reliability itself:[15]

“Assuming that Dr. Nicholson’s meta-analysis is the proper subject of Downing scrutiny, the district court’s decision is wanting, because it did not make explicit enough findings on the reliability of Dr. Nicholson’s meta-analysis to satisfy Downing. We decline to define the exact level at which a district court can exclude a technique as sufficiently unreliable. Reliability indicia vary so much from case to case that any attempt to define such a level would most likely be pointless. Downing itself lays down a flexible rule. What is not flexible under Downing is the requirement that there be a developed record and specific findings on reliability issues. Those are absent here. Thus, even if it may be possible to exclude Dr. Nicholson’s testimony under Downing, as an unreliable, skewed meta-analysis, we cannot make such a determination on the record as it now stands. Not only was there no hearing, in limine or otherwise, at which the bases for the opinions of the contesting experts could be evaluated, but the experts were also not even deposed. All of the expert evidence was based on affidavits.”

Peer Review

Understandably, the defense attacked Nicholson’s Report as not having been peer reviewed. Without any scrutiny of the scientific bona fides of the workers’ compensation agency, the appellate court acquiesced in Nicholson’s self-serving characterization of his Report as having been reviewed by “cooperating researchers” and the Panel of the Ontario Workers’ Compensation agency. Another partisan expert witness characterized Nicholson’s Report as a “balanced assessment,” and this seemed to appease the Third Circuit, which was wary of requiring peer review in the first place.[16]

Relevancy Prong

The defense had argued that Nicholson’s Report was irrelevant because no individual plaintiff claimed liver cancer.[17] The trial court largely accepted this argument, but the appellate court disagreed because of conclusory language in Nicholson’s affidavit, in which he asserted that “proof of an increased risk of liver cancer is probative of an increased risk of other forms of cancer.” The court seemed unfazed by the ipse dixit, asserted without any support. Indeed, Nicholson’s assertion was contradicted by his own Report, in which he reported that there were fewer cancers among PCB-exposed male capacitor manufacturing workers than expected,[18] and that the rate for all cancers for both men and women was lower than expected, with 132 observed and 139.40 expected.[19]

The trial court had also agreed with the defense’s suggestion that Nicholson’s report, and its conclusion of causality between PCB exposure and liver cancer, were irrelevant because the Report “could not be the basis for anyone to say with reasonable degree of scientific certainty that some particular person’s disease, not cancer of the liver, biliary tract or gall bladder, was caused by PCBs.”[20]

Analysis

It would likely have been lost on Judge Becker and his colleagues, but Nicholson presented SMRs (standardized mortality ratios) throughout his Report, and for the all cancers statistic, he gave an SMR of 95. What Nicholson clearly did in this, and in all other instances, was simply divide the observed number by the expected, and multiply by 100. This crude, simplistic calculation fails to present a standardized mortality ratio, which requires taking into account the age distribution of the exposed and the unexposed groups, and a weighting of the contribution of cases within each age stratum. Nicholson’s presentation of data was nothing short of false and misleading. And in case anyone remembers General Electric v. Joiner, Nicholson’s summary estimate of risk for lung cancer in men was below the expected rate.[21]
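
A worked toy example of the correct procedure, indirect standardization with invented numbers, shows what the shortcut omits: the expected count must be built stratum by stratum from age-specific reference rates and the cohort’s person-years before the observed-to-expected ratio deserves the name SMR:

```python
# Invented strata for illustration: (cohort person-years, reference
# mortality rate per person-year, observed deaths) by age group.
strata = [
    (2000, 0.0005, 2),   # ages 40-49
    (1500, 0.0020, 5),   # ages 50-59
    ( 500, 0.0080, 6),   # ages 60-69
]

observed = sum(obs for _, _, obs in strata)
expected = sum(py * rate for py, rate, _ in strata)  # age-weighted expectation
smr = 100 * observed / expected
print(f"Observed = {observed}, Expected = {expected:.1f}, SMR = {smr:.0f}")
# The SMR is meaningful here only because each stratum's expectation
# reflects the cohort's own age structure; dividing grand totals lifted
# from studies with different age distributions skips the standardization.
```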

Nicholson’s Report was replete with other methodological sins. He used a composite of three organs (liver, gall bladder, bile duct) without any biological rationale. His analysis combined male and female results, and even then the composite outcome rested upon only seven cases. Of those seven, some were not confirmed as primary liver cancer, and at least one was affirmatively confirmed not to be a primary liver cancer.[22]

Nicholson failed to standardize the analysis for the age distribution of the observed and expected cases, and he failed to present any meaningful analysis of random or systematic error. When he did present p-values, he gave one-tailed values, and he made no correction for the many comparisons he ran on the same set of data.
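Both omissions inflate apparent significance. A short sketch, again with invented numbers, shows how a one-tailed exact Poisson p-value for an observed-versus-expected comparison doubles when made two-tailed, and erodes further under even a simple Bonferroni adjustment for multiple comparisons:

```python
# Hypothetical counts illustrating the two omissions described above.
from scipy.stats import poisson

observed, expected = 8, 3.5   # hypothetical observed vs. expected cases
num_comparisons = 20          # hypothetical number of outcomes tested

# One-tailed exact Poisson p-value: P(X >= observed), given mean = expected
p_one = poisson.sf(observed - 1, expected)      # ≈ 0.027

# A common, conservative two-tailed value doubles the one-tailed p
p_two = min(1.0, 2 * p_one)                     # ≈ 0.053

# Bonferroni adjustment for the many outcomes examined in the same data
p_adjusted = min(1.0, p_two * num_comparisons)  # capped at 1.0

print(f"one-tailed p = {p_one:.3f}; two-tailed p = {p_two:.3f}; "
      f"adjusted p = {p_adjusted:.3f}")
```

A result that looks “significant” when tested one-tailed and uncorrected can be entirely unremarkable once tested properly.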

Finally, and most egregiously, Nicholson’s meta-analysis was meta-analysis in name only. What he had done was simply to add “observed” and “expected” events across studies to arrive at totals, and to recalculate a bogus risk ratio, which he fraudulently called a standardized mortality ratio. Adding events across studies is not a valid meta-analysis; indeed, it is a well-known example of how to generate a Simpson’s Paradox, which can change the direction or magnitude of any association.[23]
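A toy example, with invented numbers throughout, shows how merely summing counts across two studies can reverse the direction of an association that each study separately reports:

```python
# Simpson's Paradox from naive pooling: each hypothetical study shows a
# risk ratio of 0.50, yet the pooled counts show a risk ratio above 1.0.

# (cases, persons) for exposed and unexposed groups in two invented studies
studies = [
    {"exposed": (10, 100),  "unexposed": (100, 500)},  # RR = 0.10/0.20 = 0.50
    {"exposed": (200, 500), "unexposed": (80, 100)},   # RR = 0.40/0.80 = 0.50
]

def risk_ratio(exposed, unexposed):
    (a, n1), (b, n0) = exposed, unexposed
    return (a / n1) / (b / n0)

for i, s in enumerate(studies, start=1):
    print(f"Study {i}: RR = {risk_ratio(s['exposed'], s['unexposed']):.2f}")

# Naive pooling: add cases and denominators across studies, just as
# Nicholson added "observed" and "expected" events.
pooled_exposed = tuple(map(sum, zip(*(s["exposed"] for s in studies))))      # (210, 600)
pooled_unexposed = tuple(map(sum, zip(*(s["unexposed"] for s in studies))))  # (180, 600)

print(f"Pooled:  RR = {risk_ratio(pooled_exposed, pooled_unexposed):.2f}")   # 1.17
```

Because the two invented studies differ in baseline risk and in the proportion of exposed subjects, the pooled table answers a question that neither study asked. A proper meta-analysis combines the study-specific estimates with appropriate weights; it does not merge the raw counts.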

Some may be tempted to criticize the defense for having focused its challenge on the “novelty” of Nicholson’s approach in Paoli. The problem, of course, was the invalidity of Nicholson’s work, but both the trial court’s exclusion of Nicholson and the Court of Appeals’ reversal and remand of that exclusion illustrate the difficulty of getting judges, even well-respected judges, to accept their responsibility to engage with questioned scientific evidence.

Even in Paoli, no amount of ketchup could conceal the unsavoriness of Nicholson’s scrapple analysis. When the Paoli case reached the Court of Appeals again in 1994, Nicholson’s analysis was absent.[24] Apparently, the plaintiffs’ counsel had second thoughts about the whole matter. Today, under the revised Rule 702, there can be little doubt that Nicholson’s so-called meta-analysis should have been excluded.


[1]  Not to be confused with the Judge Kelly of the same district, who was unceremoniously disqualified after attending an ex parte conference with plaintiffs’ lawyers and expert witnesses, at the invitation of Dr. Irving Selikoff.

[2]  Pace Philip J. Landrigan & Myron A. Mehlman, “In Memoriam – William J. Nicholson,” 40 Am. J. Indus. Med. 231 (2001). Landrigan and Mehlman assert, without any support, that Nicholson was an epidemiologist. Their own description of his career, with his undergraduate work at MIT, his doctorate in physics from the University of Washington, and his employment at the Watson Laboratory before he joined Irving Selikoff’s department in 1969, suggests that Nicholson brought little to no training or experience in epidemiology to his work on occupational and environmental epidemiology.

[3]  In re Paoli RR Yard PCB Litig., 706 F. Supp. 358, 372-73 (E.D. Pa. 1988).

[4]  William Nicholson, Report to the Workers’ Compensation Board on Occupational Exposure to PCBs and Various Cancers, for the Industrial Disease Standards Panel (IDSP); IDSP Report No. 2 (Toronto, Ontario Dec. 1987).

[5]  Id. at 373.

[6]  United States v. Downing, 753 F.2d 1224 (3d Cir. 1985).

[7]  In re Paoli RR Yard PCB Litig., 916 F.2d 829 (3d Cir. 1990), cert. denied sub nom. General Elec. Co. v. Knight, 111 S.Ct. 1584 (1991).

[8]  Id. at 845.

[9]  Id.

[10]  Id. at 841, 848.

[11]  Id. at 845.

[12]  Id. at 847-48.

[13]  See, e.g., Robert Rosenthal, Judgment Studies: Design, Analysis, and Meta-Analysis (1987); Richard J. Light & David B. Pillemer, Summing Up: The Science of Reviewing Research (1984); Thomas A. Louis, Harvey V. Fineberg & Frederick Mosteller, “Findings for Public Health from Meta-Analyses,” 6 Ann. Rev. Public Health 1 (1985); Kristan A. L’Abbé, Allan S. Detsky & Keith O’Rourke, “Meta-analysis in clinical research,” 107 Ann. Intern. Med. 224 (1987).

[14]  Id. at 857.

[15]  Id. at 858.

[16]  Id. at 858.

[17]  Id. at 845.

[18]  Report, Table 16.

[19]  Report, Table 18.

[20]  In re Paoli, 916 F.2d at 847.

[21]  See General Electric v. Joiner, 522 U.S. 136 (1997); NAS, “How Have Important Rule 702 Holdings Held Up With Time?” (March 20, 2015).

[22]  Report, Table 22.

[23]  James A. Hanley, Gilles Thériault, Ralf Reintjes & Annette de Boer, “Simpson’s Paradox in Meta-Analysis,” 11 Epidemiology 613 (2000); H. James Norton & George Divine, “Simpson’s paradox and how to avoid it,” Significance 40 (Aug. 2015); George Udny Yule, “Notes on the Theory of Association of Attributes in Statistics,” 2 Biometrika 121 (1903).

[24]  In re Paoli RR Yard PCB Litig., 35 F.3d 717 (3d Cir. 1994).