Almost 28 years ago, the Occupational Safety and Health Administration (OSHA), an agency of the United States Department of Labor, promulgated the Hazard Communication Standard. 29 C.F.R. § 1910.1200 (November 1983; effective date November 25, 1985) (HazCom standard). Initially the HazCom standard applied to importers and manufacturers of chemicals. Starting one year later, on November 25, 1986, the standard covered manufacturing employers under OSHA jurisdiction, by defining their duties to protect and inform employees.
The HazCom standard applies to all chemical manufacturers and distributors and to
“any chemical which is known to be present in the workplace in such a manner that employees may be exposed under normal conditions of use or in a foreseeable emergency.”
29 C.F.R. § 1910.1200(b)(1), (b)(2). The standard requires manufacturers and distributors of hazardous chemicals to inform not only their own employees of the dangers posed by the chemicals, but downstream employers and employees as well. The standard implements this duty to warn downstream employers’ employees by requiring that containers of hazardous chemicals leaving the workplace be labeled with “appropriate hazard warnings.” See Martin v. American Cyanamid Co., 5 F.3d 140, 141-42 (6th Cir. 1993) (reviewing agency’s interpretation of the standard).
The HazCom standard attempts to provide some definition of the health hazards for which warnings are required:
“For health hazards, evidence which is statistically significant and which is based on at least one positive study conducted in accordance with established scientific principles is considered to be sufficient to establish a hazardous effect if the results of the study meet the definitions of health hazards in this section.”
29 C.F.R. § 1910.1200(d)(2).
This regulatory language is troubling. What does statistically significant mean? The concept remains important in health effects research, but several writers have criticized significance testing specifically, and frequentist statistics generally. See, e.g., Stephen T. Ziliak and Deirdre N. McCloskey, The Cult of Statistical Significance: How the Standard Error Costs Us Jobs, Justice, and Lives (Ann Arbor 2008) (example of one of the more fringe, and not particularly cogent, criticisms of frequentist statistics). And what are the “established scientific principles,” which would allow a single “positive study” to “establish” a hazardous “effect”?
The HazCom standard is important not only for purposes of regulatory compliance, but also for its potential implications for products liability law. With its importance in mind, what can be said about the definition of health hazard, provided in 29 C.F.R. § 1910.1200(d)(2)?
Perhaps a good place to start is with the guidance provided by OSHA on compliance with the HazCom standard. To be sure, like most agency guidance statements, this one is prefaced with caveats and cautions:
“This guidance is not a standard or regulation, and it creates no new legal obligations. It is advisory in nature, informational in content, and is intended to assist employers in providing a safe and healthful workplace. Pursuant to the Occupational Safety and Health Act, employers must comply with safety and health standards promulgated by OSHA or by a state with an OSHA-approved state plan. In addition, pursuant to Section 5(a)(1), the General Duty Clause of the Act, employers must provide their employees with a workplace free from recognized hazards likely to cause death or serious physical harm. Employers can be cited for violating the General Duty Clause if there is a recognized hazard and they do not take reasonable steps to prevent or abate the hazard. However, failure to implement any specific recommendations in this guidance is not, in itself, a violation of the General Duty Clause. Citations can only be based on standards, regulations, and the General Duty Clause.”
U.S. Dep’t of Labor, Guidance for Hazard Determination for Compliance with the OSHA Hazard Communication Standard (29 CFR § 1910.1200) (July 6, 2007).
Section II of the Guidance describes how manufacturers may assess whether their chemicals are “hazardous.” A health hazard is defined as a chemical
“for which there is statistically significant evidence based on at least one study conducted in accordance with established scientific principles that acute or chronic health effects may occur in exposed employees.”
A fair-minded person might object that this is no guidance at all. Statistically significant is not defined in the regulations. Study is not defined. The guidance specifies that the study or studies must be conducted in accordance with “established scientific principles,” but must the interpretation or judgment of causality be made similarly in accordance with such principles? One would hope so, but the Guidance does not really specify. The use of “may” seems to inject a level of conjecture or speculation into the hazard assessment.
Section V of the Guidance addresses data analysis, and here the agency attempts to provide some meaning to statistical significance and other terms in the regulation, but in doing so, the Guidance offers incoherent, incredible advice.
The Guidance notes that the regulation specifies one “positive study,” which presumably is a study that is some evidence in favor of an “effect.” Because we are dealing with chemical exposures in occupational settings, the studies at issue will be, at best, observational studies. Randomized clinical trials are out. The one study (at least) at issue must be sufficient to establish a hazardous effect if that effect is considered a “health hazard” within the meaning of the regulations. This is problematic on many levels. What sort of study are we discussing? An experimental study in planaria worms, a case study of a single human, an ecological study, or an analytical epidemiologic (case-control or cohort) study? Whatever the study is, it would be a most remarkable study if it alone were “sufficient” to “establish” an “effect.”
A reasonable manufacturer or disinterested administrator surely would interpret the sufficiency requirement to mean that the entire evidentiary display must be considered rather than whether one study, taken in isolation, ripped from its scientific context, should be used to suggest a duty to warn. The Guidance, and the regulations, however, never address the real-world complexity of hazard assessment.
Section V of the Guidance offers a failed attempt to illuminate the meaning of statistical significance:
“Statistical significance is a mathematical determination of the confidence in the outcome of a test. The usual criterion for establishing statistical significance is the p-value (probability value). A statistically significant difference in results is generally indicated by p < 0.05, meaning there is less than a 5% probability that the toxic effects observed were due to chance and were not caused by the chemical. Another way of looking at it is that there is a 95% probability that the effect is real, i.e., the effect seen was the result of the chemical exposure.”
Few statisticians or scientists would accept the proffered definition. The Guidance’s statement that a p-value is equivalent to the probability of the “toxic effect” occurring by chance is unacceptable for several reasons.
First, it is a notoriously incorrect, fallacious statement of the meaning of a p-value:
“Since p is calculated by assuming the null hypothesis is correct (that there is no difference [between observed and expected] in the full population), the p-value cannot give the chance that this hypothesis is true. The p-value merely gives the chance of getting evidence against the null hypothesis as strong or stronger than the evidence at hand — assuming that the null hypothesis … is correct.”
David H. Kaye, David E. Bernstein, and Jennifer L. Mnookin, The New Wigmore: Expert Evidence § 12.8.2, at 559 (2d ed. 2010) (discussing the transpositional fallacy).
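The gap between a p-value and the "probability that the effect is real" can be illustrated with a short calculation. The following is only a sketch under stated assumptions: the function name and the three inputs (the share of tested chemicals that truly cause the effect, the power of the study, and the significance threshold) are hypothetical illustrations, not figures taken from the Guidance or the regulation.

```python
def prob_effect_is_real(prior_real, power, alpha):
    """Bayes' rule: probability that a statistically significant finding
    reflects a real effect, given assumed (hypothetical) inputs."""
    # Significant results arising from chemicals that truly cause the effect
    true_positives = power * prior_real
    # Significant results arising by chance when there is no effect
    false_positives = alpha * (1.0 - prior_real)
    return true_positives / (true_positives + false_positives)


# Hypothetical assumptions: 10% of tested chemicals are truly hazardous,
# studies have 80% power, and significance is declared at p < 0.05.
ppv = prob_effect_is_real(prior_real=0.10, power=0.80, alpha=0.05)
print(f"Probability the effect is real, given significance: {ppv:.2f}")  # 0.64
```

Under these assumed inputs, the answer is 64 percent, not 95 percent, and it moves with the assumptions. The p-value alone supplies none of them, which is why it cannot be transposed into the probability that the hypothesis of no effect is false.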
Second, even if we could ignore the statistical solecism, the Guidance’s use of a mechanical test for statistical significance is troubling. The p-value is not necessarily an appropriate protection against Type I error, or a “false alarm” that there is an association between the exposure and outcome of interest. Multiple testing and other aspects of a study may inflate the number of false alarms to the point that a study with a low p-value, even one much lower than 5%, will not rule out the likely role of chance as an explanation for the study’s result.
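The arithmetic behind the multiple-testing concern can be sketched briefly. The example assumes, for simplicity, that the comparisons within a study are statistically independent and that no real effect exists for any of them; the function name is hypothetical, and real studies will depart from the independence assumption.

```python
def chance_of_false_alarm(num_tests, alpha=0.05):
    """Probability of at least one 'significant' result among num_tests
    independent comparisons when no real effect exists for any of them."""
    return 1.0 - (1.0 - alpha) ** num_tests


# Illustrative numbers of outcomes or subgroups examined in one study
for k in (1, 5, 20, 100):
    print(f"{k:>3} comparisons: {chance_of_false_alarm(k):.3f}")
```

On these assumptions, a study that examines 20 outcomes has roughly a 64 percent chance of yielding at least one nominally significant "positive" finding from chance alone.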
Third, the Guidance’s suggestion that “statistical significance” boils down to a conclusion that the “effect is real” may be its greatest offense against scientific and statistical methodology. Section V of the Guidance emphasizes that the HazCom standard states that
“evidence that is statistically significant and which is based on at least one positive study conducted in accordance with established scientific principles is considered to be sufficient to establish a hazardous effect if the results of the study meet the [HCS] definitions of health hazards.”
This is nothing more than semantic fiat and legerdemain.
Statistical significance may, in some circumstances, permit an inference that the divergence from the expected was not likely due to chance, but it cannot, in the context of observational studies, allow for a conclusion that the divergence resulted from a cause-effect relationship between the exposure and the outcome. Statistical significance cannot rule out systematic bias or confounding in the study; nor can it help us reconcile inconsistencies across studies. The study may have identified an association, which must be assessed for its causal or non-causal nature, in the context of all relevant evidence. See Austin Bradford Hill, “The Environment and Disease: Association or Causation?” 58 Proc. Royal Soc’y Med. 295 (1965).
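A small, entirely hypothetical cohort shows how confounding can produce a statistically significant association where the exposure does nothing. The counts below are invented for illustration: the confounder (here labeled smoking) raises disease risk and is more common among exposed workers, while the exposure itself has no effect within either stratum. The p-value uses the standard Pearson chi-square test for a 2x2 table, computed with Python's standard library.

```python
import math

# Hypothetical counts: stratum -> group -> (cases, workers)
cohort = {
    "smokers":    {"exposed": (160, 800), "unexposed": (40, 200)},
    "nonsmokers": {"exposed": (10, 200),  "unexposed": (40, 800)},
}

def risk_ratio(exposed, unexposed):
    """Risk in the exposed group divided by risk in the unexposed group."""
    return (exposed[0] / exposed[1]) / (unexposed[0] / unexposed[1])

def chi_square_p_value(exposed, unexposed):
    """Pearson chi-square test (1 degree of freedom) for a 2x2 table."""
    a, b = exposed[0], exposed[1] - exposed[0]        # exposed: cases, non-cases
    c, d = unexposed[0], unexposed[1] - unexposed[0]  # unexposed: cases, non-cases
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper-tail probability of a chi-square variable with 1 degree of freedom
    return math.erfc(math.sqrt(chi2 / 2.0))

# Within each stratum of the confounder, exposure makes no difference
stratum_rr = {name: risk_ratio(g["exposed"], g["unexposed"])
              for name, g in cohort.items()}

# Ignoring the confounder: pool the strata into one crude table
crude_exposed = tuple(sum(g["exposed"][i] for g in cohort.values()) for i in (0, 1))
crude_unexposed = tuple(sum(g["unexposed"][i] for g in cohort.values()) for i in (0, 1))
crude_rr = risk_ratio(crude_exposed, crude_unexposed)
crude_p = chi_square_p_value(crude_exposed, crude_unexposed)

print("Stratum-specific risk ratios:", stratum_rr)
print(f"Crude risk ratio: {crude_rr:.3f}, p-value: {crude_p:.2g}")
```

In this invented data set, the crude risk ratio exceeds 2 with a p-value far below 0.05, yet the risk ratio is 1.0 among smokers and among nonsmokers alike. A mechanical significance test applied to the crude table would declare a "hazardous effect" that disappears once the confounder is taken into account.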
The OSHA Guidance is really no guidance at all. Ensuring worker health and safety by requiring employers to provide industrial hygiene protections for workers is an exceedingly important task, but this aspect of the HazCom standard is incoherent and incompetent. Workers and employers are in the dark, and product suppliers are vulnerable to arbitrary and capricious enforcement.