Peer Review, Protocols, and QRPs

In Daubert, the Supreme Court decided a legal question about the proper interpretation of a statute, Rule 702, and then remanded the case to the Court of Appeals for the Ninth Circuit for further proceedings. The Court did, however, weigh in with dicta about several considerations in admissibility decisions. In particular, the Court identified four non-dispositive factors: whether the challenged opinion has been empirically tested, published and peer reviewed, and whether the underlying scientific technique or method supporting the opinion has an acceptable rate of error and has gained general acceptance.[1]

The context in which Daubert discussed peer review is important to understanding why the Court held peer review out as a consideration. One of the bases for the defense challenges to some of the plaintiffs’ expert witnesses’ opinions in Daubert was their reliance upon re-analyses of published studies, offered to suggest that the studies would indeed have shown an increased risk of birth defects, if only the publication authors had used some other control group or taken some other analytical approach. Re-analyses can be important, but these re-analyses of published Bendectin studies were post hoc, litigation-driven, and obviously result-oriented. The Court’s discussion of peer review reveals that it was not simply creating a box to be checked before a trial court could admit an expert witness’s opinions. Peer review was suggested as a consideration because:

“submission to the scrutiny of the scientific community is a component of “good science,” in part because it increases the likelihood that substantive flaws in methodology will be detected. The fact of publication (or lack thereof) in a peer reviewed journal thus will be a relevant, though not dispositive, consideration in assessing the scientific validity of a particular technique or methodology on which an opinion is premised.”[2]

Peer review, or the lack thereof, for the challenged expert witnesses’ re-analyses was called out because it raised suspicions of lack of validity. Nothing in Daubert, or in later decisions, or more importantly in Rule 702 itself, supports admitting expert witness testimony just because the witness relied upon peer-reviewed studies, especially when the studies are invalid or are based upon questionable research practices. The Court was careful to point out that peer-reviewed publication was “not a sine qua non of admissibility; it does not necessarily correlate with reliability, … .”[3] The Court thus showed that it was well aware that well-grounded (and thus admissible) opinions may not have been previously published, and that the existence of peer review was simply a potential aid in answering the essential question, whether the proponent of a proffered opinion has shown “the scientific validity of a particular technique or methodology on which an opinion is premised.”[4]

Since 1993, much has changed in the world of bio-science publishing. The wild proliferation of journals, including predatory and “pay-to-play” journals, has disabused most observers of the notion that peer review provides evidence of the validity of methods. Along with the exponential growth in publications has come an exponential growth in expressions of concern and outright retractions of articles, as chronicled and detailed at Retraction Watch.[5] Some journals encourage authors to nominate the peer reviewers for their manuscripts; some journals let authors block some scientists as peer reviewers of their submitted manuscripts. If the Supreme Court were writing today, it might well note that peer review is often a feature of bad science, advanced by scientists who know that peer-reviewed publication is the price of admission to the advocacy arena.

Since the Supreme Court decided Daubert, the Federal Judicial Center and National Academies of Science have provided a Reference Manual on Scientific Evidence, now in its third edition, and with a fourth edition on the horizon, to assist judges and lawyers involved in the litigation of scientific issues. Professor Goodstein, in his chapter “How Science Works,” in the third edition, provides the most extensive discussion of peer review in the Manual, and emphasizes that peer review “works very poorly in catching cheating or fraud.”[6] Goodstein invokes his own experience as a peer reviewer to note that “peer review referees and editors limit their assessment of submitted articles to such matters as style, plausibility, and defensibility; they do not duplicate experiments from scratch or plow through reams of computer-generated data in order to guarantee accuracy or veracity or certainty.”[7] Indeed, Goodstein’s essay in the Reference Manual characterizes the ability of peer review to warrant study validity as a “myth”:

Myth: The institution of peer review assures that all published papers are sound and dependable.

Fact: Peer review generally will catch something that is completely out of step with majority thinking at the time, but it is practically useless for catching outright fraud, and it is not very good at dealing with truly novel ideas. … It certainly does not ensure that the work has been fully vetted in terms of the data analysis and the proper application of research methods.[8]

Goodstein’s experience as a peer reviewer is hardly idiosyncratic. One standard text on the ethical conduct of research reports that peer review is often ineffective or incompetent, and that it may not even catch simple statistical or methodological errors.[9] According to the authors, Shamoo and Resnik:

“[p]eer review is not good at detecting data fabrication or falsification partly because reviewers usually do not have access to the material they would need to detect fraud, such as the original data, protocols, and standard operating procedures.”[10]

Indeed, without access to protocols, statistical analysis plans, and original data, peer review often cannot identify good faith or negligent deviations from the standard of scientific care. There is some evidence to support this negative assessment of peer review from testing of the counter-factual. Reviewers were able to detect questionable, selective reporting when they had access to the study authors’ research protocols.[11]

Study Protocol

The study protocol provides the scientific rationale for a study, clearly defines the research question, the data collection process, and the key exposures and outcomes, and describes the methods to be applied, all before data collection commences.[12] The protocol also typically pre-specifies the statistical data analysis. The epidemiology chapter of the current edition of the Reference Manual on Scientific Evidence blandly offers only that epidemiologists attempt to minimize bias in observational studies with “data collection protocols.”[13] Epidemiologists and statisticians are much clearer in emphasizing the importance, indeed the necessity, of having a study protocol before commencing data collection. Back in 1988, John Bailar and Frederick Mosteller explained that it was critical in reporting statistical analyses to inform readers about how and when the authors devised the study design, and whether they set the design criteria out in writing before they began to collect data.[14]

The necessity of a study protocol is “self-evident,”[15] and essential to research integrity.[16] The International Society for Pharmacoepidemiology has issued Guidelines for “Good Pharmacoepidemiology Practices,”[17] which call for every study to have a written protocol. Among the requirements set out in these Guidelines are descriptions of the research method, study design, operational definitions of exposure and outcome variables, and projected study sample size. The Guidelines provide that a detailed statistical analysis plan may be specified after data collection begins, but before any analysis commences.

Expert witness opinions on health effects are built upon studies, and so it behooves legal counsel to identify the methodological strengths and weaknesses of key studies through questioning whether they have protocols, whether the protocols were methodologically appropriate, and whether the researchers faithfully followed their protocols and their statistical analysis plans. Determining the peer review status of a publication, on the other hand, will often not advance a challenge based upon improvident methodology.

In some instances, a published study will have sufficiently detailed descriptions of methods and data that readers, even lawyers, can evaluate their scientific validity or reliability (vel non). In some cases, however, readers will be no better off than the peer reviewers who were deprived of access to protocols, statistical analysis plans, and original data. When a particular study is crucial support for an adversary’s expert witness, a reasonable litigation goal may well be to obtain the protocol and statistical analysis plan, and if need be, the original underlying data. The decision to undertake such discovery is difficult. Discovery of non-party scientists can be expensive and protracted; it will almost certainly be contentious. When expert witnesses rely upon one or a few studies, which telegraph internal validity, this litigation strategy may provide the strongest evidence against the study’s being reasonably relied upon, or its providing “sufficient facts and data” to support an admissible expert witness opinion.


[1] Daubert v. Merrell Dow Pharms., Inc., 509 U.S. 579, 593-94 (1993).

[2] Id. at 594 (internal citations omitted) (emphasis added).

[3] Id.

[4] Id. at 593-94.

[5] Retraction Watch, at https://retractionwatch.com/.

[6] Reference Manual on Scientific Evidence at 37, 44-45 (3rd ed. 2011) [Manual].

[7] Id. at 44-45 n.11.

[8] Id. at 48 (emphasis added).

[9] Adil E. Shamoo and David B. Resnik, Responsible Conduct of Research 133 (4th ed. 2022).

[10] Id.

[11] An-Wen Chan, Asbjørn Hróbjartsson, Mette T. Haahr, Peter C. Gøtzsche, and David G. Altman, “Empirical evidence for selective reporting of outcomes in randomized trials: Comparison of protocols to published articles,” 291 J. Am. Med. Ass’n 2457 (2004).

[12] Wolfgang Ahrens & Iris Pigeot, eds., Handbook of Epidemiology 477 (2nd ed. 2014).

[13] Michael D. Green, D. Michal Freedman, and Leon Gordis, “Reference Guide on Epidemiology,” in Reference Manual on Scientific Evidence 573 (3rd ed. 2011) (“Study designs are developed before they begin gathering data.”).

[14] John Bailar & Frederick Mosteller, “Guidelines for Statistical Reporting in Articles for Medical Journals,” 108 Ann. Intern. Med. 266, 268 (1988).

[15] Wolfgang Ahrens & Iris Pigeot, eds., Handbook of Epidemiology 477 (2nd ed. 2014).

[16] Sandra Alba, et al., “Bridging research integrity and global health epidemiology statement: guidelines for good epidemiological practice,” 5 BMJ Global Health e003236, at p.3 & passim (2020).

[17] See “The ISPE Guidelines for Good Pharmacoepidemiology Practices (GPP),” available at <https://www.pharmacoepi.org/resources/policies/guidelines-08027/>.