Rule 702 is, or is not, a litmus test for expert witness opinion admissibility. Relative risk is, or is not, a litmus test for specific causation. Statistical significance is, or is not, a litmus test for reasonable reliance upon the results of a study. It is relatively easy to find judicial opinions on either side of the litmus divide. Compare National Judicial College, Resource Guide for Managing Complex Litigation at 57 (2010) (Daubert is not a litmus test) with Cryer v. Werner Enterprises, Inc., Civ. Action No. 05-S-696-NE, Mem. Op. & Order at 16 n. 63 (N.D. Ala. Dec. 28, 2007) (describing the Eleventh Circuit’s restatement of Rule 702’s “litmus test” for the methodological reliability of proffered expert witness opinion testimony).
The “litmus test” is one sorry, overworked metaphor. Perhaps its appeal has to do with a vague collective memory that litmus paper is one of those “things of science,” which we used in high school chemistry, and never had occasion to use again. Perhaps litmus tests have the appeal of “proofiness.”
The reality is different. The litmus test is a semi-quantitative test for acidity or alkalinity. Neutral litmus is purple. Under acidic conditions, litmus turns red; under basic conditions, it turns blue. For some time, scientists have used pH meters when they want a precise quantification of acidity or alkalinity. Litmus paper is a fairly crude test, which easily discriminates moderate acidity from alkalinity (say pH 4 from pH 11), but is relatively useless for detecting acidity at a pH of 6.95, or alkalinity at a pH of 7.05.
So what exactly are legal authors trying to say when they say that some feature of a test is, or is not, a “litmus test”? The litmus test is accurate, but not precise at the important boundary at neutrality. The litmus test color can be interpreted for degree of acidity or alkalinity, but it is not the preferred method to obtain a precise measurement. Saying that a judicial candidate’s views on abortion are a litmus test for the Senate’s evaluation of the candidate makes sense, given the relatively binary nature of the outcome of a litmus test, and the polarization of political views on abortion. Apparently, neutral views or views close to neutrality on abortion are not a desideratum for judicial candidates. A cruder, binary test is exactly what is desired by politicians.
The litmus test that is used for judicial candidates does not seem to work so well when used to describe scientific or statistical inference. The litmus test is well understood, but fairly obsolete in modern laboratory practice. When courts say that statistical significance is not a litmus test for acceptability of a study’s results, clearly they are correct, because a measure of random error is only one aspect of judging a body of evidence for, or against, an association. Yet courts seem to imply something else, at least at times:
statistical significance is not an important showing in making a case that an exposure is reliably associated with a particular outcome.
Here courts are trading in half truths. Statistical significance is quantitative, and the choice of a level of significance is not based upon immutable law. So like the slight difference between a pH of 6.95 and 7.05, statistical significance tests have a boundary issue. Nonetheless, a consideration of random error cannot be dismissed or overlooked on the theory that significance level is not a “litmus test.” This metaphor obscures and attempts to excuse sloppy thinking. It is time to move beyond this metaphor.
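The boundary issue described above can be made concrete with a little arithmetic. The following is a minimal sketch, not a recommended analysis: it assumes a test statistic that follows a standard normal distribution and the conventional 0.05 level, and the function names `two_sided_p` and `litmus_verdict` are hypothetical, chosen only for illustration.

```python
import math


def two_sided_p(z: float) -> float:
    """Two-sided p-value for a standard normal test statistic z (assumed model)."""
    # For a standard normal Z, P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


def litmus_verdict(p: float, alpha: float = 0.05) -> str:
    """Collapse a continuous p-value into a red-or-blue style verdict.

    The 0.05 default is a convention, not an immutable law.
    """
    return "significant" if p < alpha else "not significant"


# Two nearly identical results that straddle the conventional boundary,
# much like pH 6.95 and pH 7.05 straddle neutrality.
for z in (1.95, 1.97):
    p = two_sided_p(z)
    print(f"z = {z:.2f}  p = {p:.4f}  verdict: {litmus_verdict(p)}")
```

The two p-values differ only in the third decimal place, yet the binary verdicts differ. The continuous p-value plays the role of the pH meter reading and the verdict plays the role of the litmus color; neither reading gives a reason to ignore random error.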