For six years, the Food and Drug Administration (FDA) has been pondering a proposed rule to abandon the current system of pregnancy warning categories for prescription drugs. Last week, the agency finally published its final rule for pregnancy and lactation labeling[1]. The rule, effective in June 2015, will require removal of the current category labeling, A, B, C, D, or X, in favor of risk statements and narrative summaries of the human, animal, and pharmacologic data for adverse maternal and embryo/fetal outcomes.

The category system, which will be phased out, discouraged or prohibited the inclusion of actual epidemiologic results on teratogenicity. With sponsors now required to present actual data, the agency voiced a concern that prescribing physicians, who are the intended readers of the labeling, might interpret a statistically non-significant result as showing a lack of association:

“We note that it is difficult to be certain that a lack of findings equates to a lack of risk because the failure of a study to detect an association between a drug exposure and an adverse outcome may be related to many factors, including a true lack of an association between exposure and outcome, a study of the wrong population, failure to collect or analyze the right data endpoints, and/or inadequate power. The intent of this final rule is to require accurate descriptions of available data and facilitate the determination of whether the data demonstrate potential associations between drug exposure and an increased risk for developmental toxicities.”[2]

When human epidemiologic data are available, the agency had proposed the following for inclusion in drug labeling[3]:

“Narrative description of risk(s) based on human data. FDA proposed that when there are human data, the risk conclusion must be followed by a brief description of the risks of developmental abnormalities as well as other relevant risks associated with the drug. To the extent possible, this description must include the specific developmental abnormality (e.g., neural tube defects); the incidence, seriousness, reversibility, and correctability of the abnormality; and the effect on the risk of dose, duration of exposure, and gestational timing of exposure. When appropriate, the description must include the risk above the background risk attributed to drug exposure and confidence limits and power calculations to establish the statistical power of the study to identify or rule out a specified level of risk (proposed [21 C.F.R.] § 201.57(c)(9)(i)(C)(4)).”

The agency rebuffed comments that physicians would be unable to interpret confidence intervals and would be confused by actual data and by the need to interpret study results. The agency’s responses to comments on the proposed rule note that the final rule requires a description of the data, and of its limitations, in approved labeling[4]:

“Confidence intervals and power calculations are important for the review and interpretation of the data. As noted in the draft guidance on pregnancy and lactation labeling, which is being published concurrently with the final rule, the confidence intervals and power calculation, when available, should be part of that description of limitations.”

The agency’s insistence upon power calculations is surprising. The proposed rule talked about requiring “confidence limits and power calculations to establish the statistical power of the study to identify or rule out a *specified* *level of risk* (proposed § 201.57(c)(9)(i)(C)(*4*)).” The agency’s response, by failing to retain the qualification that power be stated for some specified level of risk, makes the requirement meaningless. A study with ample power to find a doubling of risk may have low power to find a 20% increase in risk. Power depends upon the specified alternative to the null hypothesis, as well as upon the sample size and the level of alpha, the pre-specified threshold for statistical significance.
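To make that dependence concrete, here is a minimal sketch in Python of an approximate power calculation for comparing the risk of a birth defect between exposed and unexposed pregnancies. The 3% background risk, the 1,000 pregnancies per group, and the function name `approx_power` are illustrative assumptions, not figures from the rule or from any study. The formula is the usual normal approximation for a two-sided test of two proportions.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(baseline_risk, relative_risk, n_per_group, alpha=0.05):
    """Approximate power of a two-sided test comparing two proportions.

    Uses the normal approximation and ignores the negligible chance of
    rejecting in the wrong direction. All inputs are hypothetical.
    """
    std_normal = NormalDist()
    p0 = baseline_risk                  # risk among the unexposed
    p1 = baseline_risk * relative_risk  # risk among the exposed under the alternative
    if not 0 < p0 < 1 or not 0 < p1 < 1:
        raise ValueError("risks must lie strictly between 0 and 1")
    p_bar = (p0 + p1) / 2
    # Standard error of the risk difference under the null (pooled) and the alternative
    se_null = sqrt(2 * p_bar * (1 - p_bar) / n_per_group)
    se_alt = sqrt((p0 * (1 - p0) + p1 * (1 - p1)) / n_per_group)
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    return std_normal.cdf((abs(p1 - p0) - z_crit * se_null) / se_alt)


if __name__ == "__main__":
    # Same hypothetical study, two different specified alternatives
    for rr in (2.0, 1.2):
        print(f"RR {rr}: power = {approx_power(0.03, rr, 1000):.2f}")
```

Under these assumptions, the same hypothetical study has roughly 90% power against a doubling of risk but only about 11% power against a 20% increase. A power figure reported without its specified alternative therefore conveys nothing.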

The final rule omits all references to power and power calculations, with or without the qualifier of some specified level of risk, from the revised sections of part 201. Indeed, the statistical concepts of power and confidence interval do not appear at all; what remains is a vague requirement that the limitations of data from epidemiologic studies be described[5]:

“(3) Description of human data. For human data, the labeling must describe adverse developmental outcomes, adverse reactions, and other adverse effects. To the extent applicable, the labeling must describe the types of studies or reports, number of subjects and the duration of each study, exposure information, and limitations of the data. Both positive and negative study findings must be included.”

Presumably, the proposed rule’s requirement of power calculations and confidence intervals survives as part of the final rule’s requirement to describe data limitations. The agency, however, omitted this level of detail from the revised regulation.

The same day that the FDA issued the final rule, it also issued a draft guidance on pregnancy and lactation labeling, for public comment[6].

The guidance recommends what the regulation, in its final form, does not specifically require. First, the guidance recommends omission of individual case reports from the human data section, because:

“Individual case reports are rarely sufficient to characterize risk and therefore ordinarily should not be included in this section.”[7]

And for actual controlled epidemiologic studies, the guidance suggests that:

“If available, data from the comparator or control group, and data confidence intervals and power calculations should also be included.”[8]

Statistically, this guidance is no guidance at all. Power calculations cannot meaningfully be presented without a specified alternative hypothesis to the null hypothesis of no increased risk of birth defects. Furthermore, virtually no study provides power calculations for data already acquired and analyzed for point estimates and confidence intervals. The guidance is unclear as to whether sponsors should attempt to calculate power from the data in a study, and whether they should try to anticipate what level of specified risk is of interest to the agency and to prescribing physicians. More disturbing yet is the agency’s failure to explain why it is recommending both confidence intervals and power calculations, in the face of many leading groups’ recommendations to abandon power calculations when confidence intervals are available for the analyzed data.[9]
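The reporting groups’ point can be illustrated with a short Python sketch that computes a risk ratio and its 95% confidence interval on the log scale (the standard Wald, or Katz, interval). The counts below are invented for illustration, and `risk_ratio_ci` is a hypothetical helper, not anything prescribed by the FDA guidance.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def risk_ratio_ci(cases_exposed, n_exposed, cases_unexposed, n_unexposed, level=0.95):
    """Return (risk ratio, lower limit, upper limit) using the log-scale Wald interval."""
    if min(cases_exposed, cases_unexposed) <= 0:
        raise ValueError("the log method needs at least one case in each group")
    rr = (cases_exposed / n_exposed) / (cases_unexposed / n_unexposed)
    # Standard error of log(RR) for two independent binomial samples
    se_log_rr = sqrt(
        1 / cases_exposed - 1 / n_exposed + 1 / cases_unexposed - 1 / n_unexposed
    )
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return rr, exp(log(rr) - z * se_log_rr), exp(log(rr) + z * se_log_rr)


# Invented counts: 36 affected among 1,000 exposed, 30 among 1,000 unexposed
rr, lower, upper = risk_ratio_ci(36, 1000, 30, 1000)
print(f"RR = {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

With these invented counts, the interval runs from roughly 0.75 to 1.93. The result is not statistically significant, yet the data cannot rule out a near doubling of risk. The interval shows directly which levels of risk the study can and cannot exclude, which is the information a retrospective power calculation would only approximate.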

[1] Dep’t of Health & Human Services, Food & Drug Admin., 21 CFR Part 201, Content and Format of Labeling for Human Prescription Drug and Biological Products; Requirements for Pregnancy and Lactation Labeling; Pregnancy, Lactation, and Reproductive Potential: Labeling for Human Prescription Drug and Biological Products—Content and Format; Draft Guidance for Industry; Availability; Final Rule and Notice, 79 *Fed. Reg*. 72064 (Dec. 4, 2014) [Docket No. FDA–2006–N–0515 (formerly Docket No. 2006N–0467)]

[2] *Id*. at 72082a.

[3] *Id*. at 72082c-083a.

[4] *Id*. at 72083c.

[5] *Id*. at 72102a (§ 201.57(c)(9)(i)(D)(3)).

[6] U.S. Department of Health and Human Services, Food and Drug Administration, Pregnancy, Lactation, and Reproductive Potential: Labeling for Human Prescription Drug and Biological Products — Content and Format *DRAFT GUIDANCE* (Dec. 2014).

[7] *Id*. at 12.

[8] *Id*.

[9] *See, e.g.,* Vandenbroucke, *et al.,* “Strengthening the reporting of observational studies in epidemiology (STROBE): Explanation and elaboration,” 18 *Epidemiology* 805, 815 (2007) (Section 10, sample size) (“Do not bother readers with post hoc justifications for study size or retrospective power calculations. From the point of view of the reader, confidence intervals indicate the statistical precision that was ultimately obtained. It should be realized that confidence intervals reflect statistical uncertainty only, and not all uncertainty that may be present in a study (see item 20).”); Douglas Altman, *et al*., “The Revised CONSORT Statement for Reporting Randomized Trials: Explanation and Elaboration,” 134 *Ann. Intern. Med*. 663, 670 (2001) (“There is little merit in calculating the statistical power once the results of the trial are known, the power is then appropriately indicated by confidence intervals.”).