TORTINI

For your delectation and delight, desultory dicta on the law of delicts.

New Release of PLI’s Treatise on Product Liability Litigation

January 19th, 2013

The Practising Law Institute (PLI) has released a new edition of its treatise on product liability litigation. Stephanie A. Scharf, Lise T. Spacapan, Traci M. Braun, and Sarah R. Marmor, eds., Product Liability Litigation: Current Law, Strategies and Best Practices (PLI Dec. 2012).

The new edition, the third release of the treatise, has several new chapters, including my contribution, Chapter 30A, “Statistical Evidence in Products Liability Litigation,” which discusses the use of, and recent developments in, statistical and scientific evidence in the law, including judicial mishandling of “significance probability,” statistical significance, statistical power, and meta-analysis. Here is the table of contents for this new chapter on statistical evidence:

  • § 30A:1 : Overview 30A-2
  • § 30A:2 : Litigation Context of Statistical Issues 30A-2
  • § 30A:3 : Qualification of Expert Witnesses Who Give Testimony on Statistical Issues 30A-3
  • § 30A:4 : Admissibility of Statistical Evidence—Rules 702 and 703 30A-3
  • § 30A:5 : Significance Probability 30A-5
    • § 30A:5.1 : Definition of Significance Probability (The “p-value”) 30A-5
    • § 30A:5.2 : The Transpositional Fallacy 30A-5
    • § 30A:5.3 : Confusion Between Significance Probability and The Burden of Proof 30A-6
    • § 30A:5.4 : Hypothesis Testing 30A-7
    • § 30A:5.5 : Confidence Intervals 30A-8
    • § 30A:5.6 : Inappropriate Use of Statistical Significance—Matrixx Initiatives, Inc. v. Siracusano 30A-9
      • [A] : Sequelae of Matrixx Initiatives 30A-12
      • [B] : Is Statistical Significance Necessary? 30A-13
  • § 30A:6 : Statistical Power 30A-14
    • § 30A:6.1 : Definition of Statistical Power 30A-14
    • § 30A:6.2 : Cases Involving Statistical Power 30A-15
  • § 30A:7 : Meta-Analysis 30A-17
    • § 30A:7.1 : Definition and History of Meta-Analysis 30A-17
    • § 30A:7.2 : Consensus Statements 30A-18
    • § 30A:7.3 : Use of Meta-Analysis in Litigation 30A-18
    • § 30A:7.4 : Competing Models for Meta-Analysis 30A-20
    • § 30A:7.5 : Recent Cases Involving Meta-Analyses 30A-21
  • § 30A:8 : Conclusion 30A-23

The treatise weighs in with over 40 chapters, and over 1,000 pages.  The table of contents and table of authorities are available online at the PLI’s website.

The PLI is a non-profit educational organization, chartered by the Regents of the University of the State of New York.  The PLI provides continuing legal education, and publishes treatises and handbooks geared for the practitioner.

New Superhero?

December 31st, 2012

The Verdict. A Civil Action. Class Action. My Cousin Vinny.

Wonder Woman, Superman, Batman, Iron Man.

America loves movies, and superheroes.

So 2013 should be an exciting year with a new superhero movie coming to a theater, or a courthouse, near you: Egilman.

Actor-producer-director Patrick Coppola has announced that he is developing a film, which has yet to be given a catchy name. Coppola calls the film in development: the DOCTOR DAVID EGILMAN PROJECT. According to Coppola, he was hired

“by world famous MD – Doctor David Egilman to create and write a Screenplay based on Doctor Egilman’s life and the many cases he has served on as an expert witness in various chemical poisoning trials. Doctor Egilman is a champion of the underdog and has several worldwide charities and medical clinics he funds and donates his time to.”

Patrick Coppola describes his screenplay for the “Doctor David Egilman Project” as a story of conspiracy among corporate suppliers of beryllium materials, the government, and the thought leaders in occupational medicine to suppress information about harm to workers. In this narrative, which is a familiar refrain for plaintiffs’ counsel in toxic tort litigation, profits always take precedence over safety, and unions mysteriously are silently complicit in the carnage.

Can’t wait!

 

The (Clinical) Trial by Franz Kafka

December 9th, 2012

United States of America v. W. Scott Harkonen, MD — Part I

Last week, Mark Haddad, of Sidley Austin, argued Dr. W. Scott Harkonen’s appeal in the Ninth Circuit.   In 2009, Dr. Harkonen was convicted by a jury, before the Hon. Marilyn Hall Patel, on a single count of wire fraud, under 18 U.S.C. § 1343. The jury acquitted Dr. Harkonen of felony misbranding, 21 U.S.C. §§ 331(k), 333(a)(2), 352(a).  Dr. Harkonen’s crime?  Bad statistical practice!

Dr. Harkonen, a physician, was the President and CEO of InterMune, Inc., a biotechnology company that researches and develops medications. InterMune developed interferon gamma-1b (Actimmune®), which was licensed by the FDA for the treatment of two rare diseases, chronic granulomatous disease and severe, malignant osteopetrosis. In 1999, Austrian researchers published the results of a small randomized clinical trial, which concluded that at 12 months, treatment with interferon gamma-1b (Actimmune®) plus prednisolone was associated with “substantial improvements in the conditions of patients with idiopathic pulmonary fibrosis [IPF] who had had no response to glucocorticoids alone.” Rolf Ziesche, Elisabeth Hofbauer, Karin Wittmann, Ventzislav Petkov, Lutz-Henning Block, 341 New Engl. J. Med. 1264 (1999). Based upon this 1999 clinical trial, InterMune conducted another clinical trial, with a primary end point of “progression-free survival,” measured by decrease in specific pulmonary function tests or death. InterMune’s trial specified nine secondary end points, including survival time from randomization until the end of the trial.

InterMune’s trial failed to show an overall improvement in progression-free survival. Patients on Actimmune did, however, experience improvements on the survival end point, which were not statistically significant at the pre-specified level of alpha (p < 0.05). Although the difference was not statistically significant as defined, 28 of 168 patients on placebo died, while only 16 of 162 patients on Actimmune died, a relative reduction in mortality of about 40% on therapy (p = 0.084). The relative survival benefit was greater (70%) for a non-prespecified subgroup that had mild-to-moderate IPF (by pulmonary function criteria) at the outset of the trial.

For the combined subgroup of all mild-to-moderate IPF patients (FVC > 55%), who made up 77% of all trial participants, only 6 patients died on Actimmune (n = 126), compared with 21 on placebo (n = 128). For this non-prespecified subgroup, the relative reduction in mortality was about 70%, p = 0.004.
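The percentages above follow directly from the death counts. The sketch below recomputes the relative reduction in mortality for the whole trial population and for the mild-to-moderate subgroup, and adds an approximate 95% confidence interval for the risk ratio using the large-sample log method. The function name and the choice of interval method are illustrative assumptions; the reported p-values came from the trial's own analysis, which this sketch does not attempt to reproduce.

```python
import math

def risk_ratio_summary(deaths_treated, n_treated, deaths_control, n_control, z=1.96):
    """Risk ratio, relative risk reduction, and approximate 95% CI (log method)."""
    risk_treated = deaths_treated / n_treated
    risk_control = deaths_control / n_control
    rr = risk_treated / risk_control
    # Large-sample standard error of log(RR) for two independent groups.
    se_log_rr = math.sqrt(
        1 / deaths_treated - 1 / n_treated + 1 / deaths_control - 1 / n_control
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return {"rr": rr, "rrr": 1 - rr, "ci": (lower, upper)}

# Death counts as reported in the text (Actimmune vs. placebo).
overall = risk_ratio_summary(16, 162, 28, 168)
subgroup = risk_ratio_summary(6, 126, 21, 128)

for label, res in (("all patients", overall), ("mild-to-moderate", subgroup)):
    lower, upper = res["ci"]
    print(f"{label}: RR={res['rr']:.2f}, reduction={res['rrr']:.0%}, "
          f"95% CI=({lower:.2f}, {upper:.2f})")
```

On these counts, the approximate interval for the whole population spans 1.0, while the subgroup interval lies entirely below it, which matches the pattern of the reported p-values (0.084 and 0.004).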

In August 2002, Dr. Harkonen approved a press release, which carried a headline, “phase III data demonstrating survival benefit of Actimmune in IPF.” A subtitle announced the 70% relative reduction in patients with mild to moderate disease.  The text of the press release stated that the company’s view was based upon “preliminary,” clinical trial data, which “demonstrate a significant survival benefit in patients with mild to moderate disease randomly assigned to Actimmune versus control treatment (p=0.004).” The press release also stated the results and associated p-value for the survival endpoint for the whole study population, as well as the results of the long-term follow-up study of the patients from the original study by Ziesche, et al. (which also showed a survival benefit for those randomized to Actimmune).  The remainder of the four-page press release acknowledged that the results of the primary end point did not reach statistical significance, and identified two upcoming medical conferences, as well as a conference call with the investment community that would be recorded and posted on the company’s website for two days, at which further details would be provided.

Dr. Harkonen was acquitted of misbranding, but convicted of wire fraud for having issued this press release.  The gravamen of his crime was stating that the clinical trial “demonstrated” prolonged survival for IPF patients.  The prosecution asserted that Dr. Harkonen engaged in data dredging, grasping for the right non-prespecified end point that had a low p-value attached. Such data dredging implicates the problem of multiple comparisons or tests, with the result of increasing the risk of a false-positive finding, notwithstanding the p-value below 0.05.
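The multiplicity concern can be made concrete with a rough calculation. Assuming, purely for illustration, that each of k tests is independent, is run at alpha = 0.05, and that no true effect exists anywhere, the chance of at least one nominally significant result is 1 − (1 − alpha)^k. The independence assumption and the larger values of k are hypothetical; end points in a real trial are correlated, and the number of subgroup analyses actually run is not stated here.

```python
def familywise_error_rate(k, alpha=0.05):
    """Probability of at least one false positive across k independent tests."""
    return 1 - (1 - alpha) ** k

# k = 10 corresponds to one primary plus nine secondary end points;
# the larger values are hypothetical counts of additional subgroup looks.
for k in (1, 10, 20, 50):
    print(f"{k:>2} tests: {familywise_error_rate(k):.1%}")
```

With ten independent tests, the chance of at least one spurious "significant" finding is roughly 40%, which is why a single p-value below 0.05 from a non-prespecified analysis carries less weight than the same p-value on a prespecified primary end point.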

Supported by the testimony of Professor Thomas Fleming, who chaired the Data Safety Monitoring Board for the clinical trial in question, the government claimed that the trial results were “negative” because the p-values for all the pre-specified endpoints exceeded 0.05.  Shortly after the press release, Fleming sent InterMune a letter that strongly dissented from the language of the press release, which he characterized as misleading.  Because the primary and secondary end points were not statistically significant, and because the reported mortality benefit was found in a non-prespecified subgroup, the interpretation of the trial data required “greater caution,” and the press release was a “serious misrepresentation of results obtained from exploratory data subgroup analyses.”

The district court sentenced Harkonen to six months of home confinement, three years of probation, 200 hours of community service, and a fine of $20,000. Dr. Harkonen appealed on grounds that the federal fraud statutes do not permit the government to prosecute persons for expressing scientific opinions about which reasonable minds can differ. If any reasonable person could find the defendant’s statement to be true, the trial court should dismiss the prosecution. Statements that have support even from a minority of the scientific community should not be the basis for a fraud charge. In Dr. Harkonen’s case, the government did not allege any misstatement of an objectively verifiable fact, but alleged falsity in his characterization of the data’s “demonstration” of an efficacy effect. The government cross-appealed to complain about the leniency of the sentence.

Dr. Harkonen’s trial counsel did not present any expert witnesses, but he did elicit testimony from some of the government witnesses about the proper interpretation of the trial data and about controversy concerning the reliance upon a precise p-value for interpreting causality.  On appeal, for instance, Dr. Harkonen’s counsel quoted government witness, Dr. Wayne Hockmeyer:

“Many times people have the impression that—that when you look at data, it’s immediately clear what conclusions you ought to draw from those data. . . . And sometimes that’s true. And sometimes there are gray areas. And it is not true all the time. And there’s a lot of vigorous debate that goes on amongst members of the scientific and medical community about the conclusions that one ought to draw from those data.” ER1085.

A panel of three judges, Judges Nelson, Tashima, and Murguia, heard Dr. Harkonen’s appeal. The case presents obvious First Amendment issues, but the more curious issues involve whether the government can impose a statistical orthodoxy on pain of punishment under the wire fraud statutes. There is much that can be said of Dr. Harkonen’s interpretation of the data. Clearly, multiplicity was a problem that diluted the meaning of the reported p-value, but the government never presented evidence of what the p-value, corrected for multiple testing, might be. If Dr. Harkonen committed a crime, then so have many biomedical journal editors, article authors, and government scientists for having over-interpreted evidence in communications that travel through the U.S. mails and over the internet.
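One simple, conservative way to correct a p-value for multiple testing is the Bonferroni adjustment, which multiplies the nominal p-value by the number of comparisons and caps the result at 1. The sketch below applies it to the reported p = 0.004 for several hypothetical numbers of comparisons. The actual number of comparisons was never put in evidence, so none of these outputs should be read as the "true" adjusted value, and Bonferroni is only one of several possible corrections.

```python
def bonferroni_adjust(p_value, n_comparisons):
    """Bonferroni-adjusted p-value: nominal p times number of comparisons, capped at 1."""
    return min(1.0, p_value * n_comparisons)

reported_p = 0.004  # subgroup p-value quoted in the press release

# Hypothetical numbers of comparisons; the real count is not in the record.
for m in (1, 5, 10, 20, 50):
    adjusted = bonferroni_adjust(reported_p, m)
    verdict = "below" if adjusted < 0.05 else "at or above"
    print(f"{m:>2} comparisons: adjusted p = {adjusted:.3f} ({verdict} 0.05)")
```

Whether the adjusted value stays below 0.05 depends entirely on how many comparisons are counted, which is exactly the question the government's evidence left unanswered.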

General Causation and Epidemiologic Measures of Risk Size

November 24th, 2012

The gatekeeper’s door really must swing both ways on causal analysis. For decades, the courts allowed anything as long as the speaker was “an expert witness,” who uttered the magic words “reasonable medical certainty.” For the most part, this willingness to tolerate all sorts of nonsense favored plaintiffs. In the backlash against this libertine judicial approach, some courts, such as those in Texas, have embraced a principle that unfairly favors defendants. Abridgment of scientific method and reasoning is offensive regardless of who is being favored.

The Texas courts have adopted a rule that plaintiffs must offer a statistically significant study, with a risk ratio (RR) greater than two, to show general causation. A RR ≤ 2 can be a strong practical argument against specific causation in many cases. See Courts and Commentators on Relative Risks to Infer Specific Causation; Relative Risks and Individual Causal Attribution; and Risk and Causation in the Law. But a RR > 2 threshold has little in theory to do with general causation. There are any number of well-established causal relationships in which the risk ratio in an exposed population is > 1, but ≤ 2. The association between smoking and cardiovascular disease is one such well-known example. As I noted in “Confusion Over Causation in Texas” (Aug. 27, 2011), the Texas Supreme Court managed to confuse general and specific causation concepts in its decision in Merck & Co. v. Garza, 347 S.W.3d 256 (Tex. 2011).
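The arithmetic behind the RR > 2 argument for specific causation is the attributable fraction among the exposed, (RR − 1)/RR, which exceeds 50% only when the RR exceeds 2. The sketch below assumes that the RR is unbiased and applies uniformly to every exposed person, assumptions that rarely hold exactly; the function name is illustrative.

```python
def attributable_fraction_exposed(rr):
    """Share of cases among the exposed attributable to exposure: (RR - 1) / RR."""
    if rr <= 1:
        return 0.0  # no excess risk, so nothing to attribute
    return (rr - 1) / rr

for rr in (1.2, 1.5, 2.0, 3.0, 10.0):
    fraction = attributable_fraction_exposed(rr)
    print(f"RR = {rr:>4}: attributable fraction = {fraction:.0%}")
```

The calculation takes the causal relationship as given and asks only how likely it is that a particular exposed case arose from the exposure. It says nothing about whether the association is causal in the first place, which is the general causation question.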

Still, the search for a RR threshold for general causation does have some basis in the practice of epidemiology. When assessing general causation from only observational epidemiologic studies, where residual confounding and bias may be lurking, it is prudent to look for a RR > 2, as a measure of strength of the association that can help us rule out the role of systematic error. As the cardiovascular disease/smoking example illustrates, however, there is clearly no scientific requirement that the RR be greater than 2 to establish general causation. Courts should recognize that there are spurious associations with RR >> 2, and true, causal associations with RR < 2. Much will depend upon the number of studies, and the potential for bias or confounding in the body of evidence. If the other important Bradford Hill factors are present – dose-response, consistency, coherence, etc. – then risk ratios ≤ 2, from observational studies, may suffice to show general causation. So a requirement of RR > 2 does not make sense as a criterion for general causation; at best, RR > 2 is a much weaker consideration for general causation than it is for specific causation.
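A small numerical example shows how confounding alone can manufacture an association of this size. In the hypothetical cohort below, all numbers are invented for illustration: disease risk depends only on a confounder, not on the exposure, but the confounder is more common among the exposed.

```python
# Hypothetical risks: identical for exposed and unexposed within each stratum.
RISK_WITH_CONFOUNDER = 0.10
RISK_WITHOUT_CONFOUNDER = 0.02

# group -> (people with the confounder, people without it); invented counts
cohort = {"exposed": (600, 400), "unexposed": (200, 800)}

def crude_risk(with_confounder, without_confounder):
    """Overall disease risk in a group, ignoring the confounder."""
    expected_cases = (with_confounder * RISK_WITH_CONFOUNDER
                      + without_confounder * RISK_WITHOUT_CONFOUNDER)
    return expected_cases / (with_confounder + without_confounder)

crude_rr = crude_risk(*cohort["exposed"]) / crude_risk(*cohort["unexposed"])

# Within each stratum the exposed and unexposed share the same risk.
stratum_rr = RISK_WITH_CONFOUNDER / RISK_WITH_CONFOUNDER

print(f"crude RR = {crude_rr:.2f}, stratum-specific RR = {stratum_rr:.2f}")
```

The crude risk ratio comes out at about 1.9 even though the exposure has no effect within either stratum. A study that failed to measure the confounder would report a strong-looking association, and no significance test would reveal the problem.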

Randomization and double blinding are major steps in controlling confounding and bias, but they are not guarantees that systematic bias has been eliminated.  Similarly, despite the confusion and errors of lawyers and judges, statistical significance does not address bias or confounding.  See, e.g., Zach Hughes, “The Legal Significance of Statistical Significance,” 28 Westlaw Journal: Pharmaceutical 1, 2 (Mar. 2012) (erroneously describing the meaning and function of significance testing; “Stated simply, a statistically significant confidence interval helps ensure that the findings of a particular study are not due to chance or some other confounding factors.”).

A double-blinded, placebo-controlled, randomized clinical trial (RCT) will usually have less opportunity for bias and confounding to play a role.  Imposing a RR > 2 requirement for general causation thus makes less sense in the context of trying to infer general causation from the results of RCTs. The Garza Court, however, went a dictum too far by describing RR > 2 as a requirement that applied to general causation:

Havner holds, and we reiterate, that when parties attempt to prove general causation using epidemiological evidence, a threshold requirement of reliability is that the evidence demonstrate a statistically significant doubling of the risk. In addition, Havner requires that a plaintiff show ‘that he or she is similar to [the subjects] in the studies’ and that ‘other plausible causes of the injury or condition that could be negated [are excluded] with reasonable certainty’.

347 S.W.3d at 265 (quoting from Merrell Dow Pharmaceuticals, Inc. v. Havner, 953 S.W.2d 706, 720 (Tex. 1997)). See Merck’s Appellant’s Brief to the Texas Court of Appeals at 16, 17 (July 16, 2007) (citing the Havner case as providing a “rational basis for inferring causation”; “To prove general causation, the Garzas were required to introduce at least two statistically significant scientific studies showing that Vioxx at the same dose and duration as taken by Mr. Garza more than doubled the risk of heart attack. Havner, 953 S.W.2d at 718-23, 727.”).

Imposing RR > 2 as a requirement for general causation, in the context of risk ratios from clinical trials, was particularly unwarranted. If general causation were the issue, it would be difficult to make out a reason for why the dose and duration used in the study had to be the same as that used by the specific plaintiff. General causation was not the dispositive issue in Garza, and so this language should be treated as dictum.  The confusion between general and specific causation is unfortunate.

What is the source of the Garza court’s notion about RR and general causation?  One popular article from Science, in the 1990’s, gave some credence to the notion of a minimal RR for general causation. Gary Taubes, “Epidemiology Faces Its Limits,” 269 Science 164 (July 14, 1995) [cited as Taubes]. Taubes collected quotes (or sound bites) from various authors, about the relevance of the magnitude of observed associations.  For instance, Taubes quoted Marcia Angell, a former editor of the New England Journal of Medicine, as articulating a general rule:

“As a general rule of thumb, we are looking for a relative risk of 3 or more [before accepting a paper for publication], particularly if it is biologically implausible or if it’s a brand new finding.”

Taubes at 168.  John Bailar, a professor emeritus at the University of Chicago, was quoted by Taubes as rejecting any reliable dividing line, thus taking a more nuanced approach:

“If you see a 10-fold relative risk and it’s replicated and it’s a good study with biological backup, like we have with cigarettes and lung cancer, you can draw a strong inference. * *  * If it’s a 1.5 relative risk, and it’s only one study and even a very good one, you scratch your chin and say maybe.”

Taubes at 168. Taubes described Harvard epidemiologist Dimitrios Trichopoulos as suggesting that a study should show a four-fold increased risk, and the late Sir Richard Doll of Oxford University as suggesting that a single epidemiologic study would not be persuasive unless the lower limit of its 95% confidence interval excluded 3.0. Id.

Even if Taubes’ quotes are accurate, there is a risk that they were stripped of important nuance provided by the scientists he interviewed.  There are other, more credible sources, however, for scientists who have insisted on a need to use the size of a RR as a consideration in evaluating the causality of an association, especially for observational studies.  For example, Breslow and Day, two respected cancer researchers, noted in a publication of the World Health Organization, that

“[r]elative risks of less than 2.0 may readily reflect some unperceived bias or confounding factor, those over 5.0 are unlikely to do so.”

Norman E. Breslow & Nicholas E. Day, Statistical Methods in Cancer Research. Volume I The Analysis of Case-Control Studies at 36 (Lyon, International Agency for Research on Cancer Scientific Publications No. 32, 1980).  The caveat makes sense, but it clearly was never intended to be some sort of bright-line rule for people too lazy to look at the actual studies and data.  Unfortunately, not all epidemiologists are as capable as Breslow and Day, and there are plenty of examples of spurious RR > 5, arising from biased or confounded studies.

Sir Richard Doll, and Sir Richard Peto, expressed a similarly skeptical view about RR < 2, in assessing the causality of associations:

“when relative risk lies between 1 and 2 … problems of interpretation may become acute, and it may be extremely difficult to disentangle the various contributions of biased information, confounding of two or more factors, and cause and effect.”

Richard Doll & Richard Peto, The Causes of Cancer 1219 (Oxford Univ. Press 1981).

More recently, plaintiffs’ testifying expert witness, David Goldsmith, expressed the view that a RR > 2 is a minimal indication of a strong RR, which is a likely candidate for causality. David F. Goldsmith & Susan G. Rose, “Establishing Causation with Epidemiology,” in Tee L. Guidotti & Susan G. Rose, eds., Science on the Witness Stand: Evaluating Scientific Evidence in Law, Adjudication, and Policy 57, 60 (OEM Press 2001) (“There is no clear consensus in the epidemiology community regarding what constitutes a ‘strong’ relative risk, although, at a minimum, it is likely to be one where the RR is greater than two; i.e., one in which the risk among the exposed is at least twice as great as among the unexposed.”); Ernst L. Wynder & Geoffrey C. Kabat, “Environmental Tobacco Smoke and Lung Cancer: A Critical Assessment,” in H. Kasuga, ed., Indoor Air Quality 5, 6 (Berlin Springer Verlag, 1990) (“An association is generally considered weak if the odds ratio is under 3.0 and particularly when it is under 2.0, as is the case in the relationship of ETS and lung cancer. If the observed relative risk is small, it is important to determine whether the effect could be due to biased selection of subjects, confounding, biased reporting, or anomalies of particular subgroups.”).

In the 1990’s, Dr. Janet Daling and her colleagues published an observational epidemiologic study on whether abortion was related to later breast cancer. Janet R. Daling, K.E. Malone, L.F. Voigt, E. White, Noel S. Weiss, “Risk of breast cancer among young women: relationship to induced abortion,” 86 J. Nat’l Cancer Instit. 1584 (1994). Several scientists, concerned that Dr. Daling’s findings would be distorted by religious propagandists, wrote that the small RRs in the Daling study could not support a causal interpretation of the data.  In an editorial that accompanied the article, Dr. Lynn Rosenberg, of the Boston University School of Medicine, wrote:

“A typical difference in risk (50%) is small in epidemiologic terms and severely challenges our ability to distinguish if it reflects cause and effect or if it simply reflects bias.”

Lynn Rosenberg, “Induced Abortion and Breast Cancer: More Scientific Data Are Needed,” 86 J. Nat’l Cancer Instit. 1569, 1569 (1994).  Rosenberg’s caution was picked up and repeated by an official statement of the National Cancer Institute (NCI).  Linda Anderson, of the NCI Press Office (NIH) issued a press release to stifle fears raised by Dr. Daling’s abortion research:

“In epidemiologic research, relative risks of less than 2 are considered small and are usually difficult to interpret. Such increases may be due to chance, statistical bias, or effects of confounding factors that are sometimes not evident.”

Linda Anderson, “Abortion and possible risk for breast cancer: analysis and inconsistencies,” (Wash. DC, NCI Oct. 26, 1994). In the lay media, an American Cancer Society epidemiologist was quoted in reference to the Daling study:

“Epidemiological studies, in general are probably not able, realistically, to identify with any confidence any relative risks lower than 1.3 (that is a 30% increase in risk) in that context, the 1.5 [reported relative risk of developing breast cancer after abortion] is a modest elevation compared to some other risk factors that we know cause disease.”

Washington Post (Oct. 27, 1994) (Dr. Eugenia Calle, Director of Analytic Epidemiology for the ACS).

Not surprisingly, tobacco companies, embattled by claims of cancer from environmental tobacco smoke (ETS), cried political correctness when the NCI and the ACS announced a skeptical view of whether RRs between 1 and 2 could show a causal relationship between abortion and breast cancer, while endorsing a low RR as real in the case of ETS and lung cancer.

What the tobacconists, however, missed was that Daling’s association was a relatively novel finding.  Subsequent studies failed to corroborate the association, which now lives on only because of the efforts of theocratic regimes in some of the United States.  The NCI’s reaction to the Daling study was in line with the quotes from Taubes’ article, above.

Recently, two epidemiologists reviewed the issue of minimal reliable risk, and concluded:

“There is no single number for a minimal reliable risk that pertains to all studies.”

Mark J. Nicolich and John F. Gamble, “What is the Minimum Risk that can be Estimated from an Epidemiology Study?,” in Anca Moldoveanu, ed., Advanced Topics in Environmental Health and Air Pollution Case Studies, at 4.1.1 Point 1 (2011). Of course, this pronouncement by Nicolich and Gamble is precisely the sort of call for sound judgment that lawyers fear because it involves engagement with the studies, their methods, and their data. The potential for bias and confounding is not constant across all studies. The potential for such errors varies with the nature of the exposure and the outcome under investigation, the design of the study, and myriad particulars and details of the studies involved. As Nicolich and Gamble explained:

“Theoretically, there is no relative risk that is too small to be estimated. The relative risk is a construct or a concept, not a physical reality. Since it is a mathematically defined concept it can be mathematically estimated to any degree of precision. However, we have shown in this paper that (1) there are many assumptions that must be met to make certain that the RR estimate is accurate and precise; and (2) the significance level or uncertainty associated with the RR estimate has its own set of assumptions that must be met. So, while there may be no theoretical minimum RR that can be estimated, in practice there is a minimum risk and varies depending on uncertainties present in the context of each study.

An analogy in the physical world of estimating a RR is to measure the length of an object. A meterstick is precise enough to determine the width of a table to see if it will fit through a doorway, but a meterstick is not precise enough to measure the diameter of a shaft in an automobile engine with a tolerance of ±1.0 mm. To measure the shaft diameter one would use a micrometer. The micrometer while sufficiently precise to measure the shaft is not adequate to determine the size of a dust mite, usually in the range of 200 to 300 μm. The analogy can be carried through to the size of molecules, to the wavelength of visible light, and to the diameter of an electron. The conclusion is that while all the tasks involve measuring length and there is no practical ‘minimum length’, different tools and considerations are needed depending on the object to be measured and the precision required.”

Id. at 21.

“We agree with Wynder (1987) that epidemiology is able to correctly interpret relatively small relative risks, but only if the best epidemiological methodology is applied and only if the data are fully evaluated by examining all judgment criteria, especially those of biological plausibility. As RRs become smaller, the need for close adherence to these basic principles becomes greater. If these ideas are applied, a conclusion of no risk should reassure society. And when a risk is reported as positive, appropriate preventive measures to reduce avoidable illness can be used to successfully reach the ultimate goal of epidemiology and preventive medicine.”

Id. at 22.

Nicolich and Gamble probably provide more nuance than most courts want, but it is what scientists, policy makers, and lawyers need to hear. Simplistic rules, such as a requirement of two statistically significant studies with RR > 2, do not enhance the credibility of judicial judgments. The requirement is over- and under-inclusive; it screens out real causal associations while allowing spurious associations, almost certainly the product of bias or confounding, to stand.

Wells v. Ortho Pharmaceutical Corp. Reconsidered – Part 4

November 19th, 2012

Associations

As noted in part three of these notes on Wells, the court regarded the epidemiologic studies as “inconclusive” on whether spermicides cause birth defects.  Somehow, the inconclusiveness of some studies undermined the defense’s expert witnesses, but not the plaintiffs’ witnesses:

“In finding the studies offered by defendant inconclusive on the issue of causation, the Court did not need to consider as substantive evidence the studies offered by plaintiff to suggest causation. Of course, the Oechsli study, the Smith study, the Buttar study on teratogenicity, and the Jick study all support this finding. As discussed below, the Court did consider the Oechsli and Smith studies in making its decision on the issue of failure to warn.”

Wells v. Ortho Pharmaceutical Corp., 615 F. Supp. 262, 292 & n.38 (N.D. Ga. 1985), aff’d and rev’d in part on other grounds, 788 F.2d 741 (11th Cir.), cert. denied, 479 U.S. 950 (1986). The trial court’s “of course” hardly made its statement correct; of course.

Additional FDA Reviews and Opinions

In reaching its verdict on state of the art knowledge in 1980, the court ignored the FDA monograph that addressed safety, efficacy, and labeling of non-ionic spermicidal products.  The court continued to ignore this monograph with respect to medical causation.

In 1983, the FDA’s Fertility and Maternal Health Drugs Advisory Committee (FMHDAC) concluded that non-ionic surfactant spermicides needed no additional warning. 615 F. Supp. at 278. One of Ortho’s witnesses, Mr. Armond Welch, worked for the FDA from 1946 through 1980, and was involved in regulation of the product. Mr. Welch pointed out that the FDA approved a contraceptive sponge device, designed to be used with non-ionic spermicides, after review of a New Drug Application, in the 1970’s. Id. at 280.

Defense expert witness, Dr. Stolley, reviewed the transcript of the December 1983 meeting of the FMHDAC, and described the Committee’s assessment of spermicides and congenital abnormalities as “comprehensive and thoughtful.” Dr. Stolley explained that there was no indication that the Committee had been unduly influenced by Ortho. This factual assertion was never challenged. Given the number of members of the Committee, their diverse backgrounds and political views, it is difficult to see how Ortho could have biased the Committee’s consideration of the issues; yet the plaintiffs’ witnesses made sweeping, subjective, and defamatory statements, which the trial court credited. One of plaintiffs’ expert witnesses, for instance, Dr. Mitchell, testified that the FMHDAC conclusion was irrelevant because, in his opinion:

“Those people are readily swayed, I think, by other interests, including pharmaceutical manufacturers, other people to whom they must report and who[m] they represent. I’m not sure what the weight of one of these advisory committees should ultimately be.”

Id. at 278.  FDA Committee members are “readily swayed,” but trial judges (and jurors) have scientific insight to avoid being misled!

Dr. Mitchell offered no factual basis for his slanderous subjective personal opinion; yet Judge Shoob thought he had a credible demeanor!  The contrast between what Judge Shoob found to discredit and accredit witnesses showed a remarkable bias in favor of plaintiffs’ expert witnesses.  In evaluating one of the defense experts, Judge Shoob found that Dr. Robert Brent was “biased” because he had published disparaging remarks about plaintiffs’ attorneys and expert witnesses who testify for plaintiffs in birth defects litigation.  Id. at 291.  Similarly, Ortho’s company witnesses were not taken seriously because they were employees of the defendant.  Id. at 278. Plaintiffs’ witnesses were not discredited for making disparaging remarks against industry or the FDA.  Dr. Mitchell’s wholesale dismissal of an entire FDA committee and its work product, however, was accepted at face value, without reflecting upon Dr. Mitchell’s obvious bias and overzealous advocacy.

Indeed, Judge Shoob corroborated Dr. Brent’s view of plaintiffs’ counsel when he dismissed several counts of the complaint for complete lack of evidence.  One dismissed count contained a claim for negligent failure to warn that the product does not always prevent conception. The uncontradicted evidence at trial showed that Ortho’s product did carry such a warning, that the plaintiff Ms. Maihafer had read the warning before use, and that she knew that the spermicide when used with a diaphragm would not be 100% effective.  Id. at 291.

Dr. Robert Brent, a leader in the field of the study of birth defects, served as a voting member in the FMHDAC’s inquiry into teratogenicity, without disclosing his litigation work.  Id. at 291.  Rather than accrediting Dr. Brent’s testimony, this credential was turned against him by the trial court.  Apparently, Dr. Brent assisted the defense prior to the meeting, but did not consider himself to have been “retained” until March 1984.  Id. at 290.  The trial court used Dr. Brent’s apparent equivocation between the date of his retention and the date of his first involvement to discredit both Dr. Brent’s testimony and the work of the FMHDAC.  Id. at 294.  Regrettably, the court allowed trial atmospherics about Dr. Brent to obscure the contributions of the many experts, from various perspectives, who participated in the several FDA findings and reports.

Animal Evidence

Plaintiffs’ witnesses took the animal evidence, which failed to support teratogenicity, and by personal opinion alone interpreted such evidence to support the claim that Ortho’s product was teratogenic in humans. One study of rats concluded:

“The results suggest that single vaginal application of [nonoxynol-9] is embryolethal and fetocidal but nonteratogenic in the rat at a dose approximately ten times higher than that recommended for controlling conception in women.”

Buttar, “Assessment of the Embryotoxic and Teratogenic Potential of Nonoxynol-9 in Rats Upon Vaginal Administration,” 2 The Toxicologist 39, 40 (1982). Other animal studies suggested absorption of the non-ionic spermicides. See, e.g., Buttar, “Transvaginal Absorption of Spermicides,” 13 Toxicology Letters 211 (1982); Chvapil, “Studies on Nonoxynol-9. II. Intravaginal Absorption, Distribution, Metabolism and Excretion in Rats and Rabbits,” 22 Contraception 325 (1980). The plaintiffs’ witnesses took absorption and lethality to mean teratogenicity, in the face of the studies’ failure to find malformations.  Hand waving and puffery became scientific evidence, somehow, to satisfy a burden of proof.  615 F. Supp. at 273, 276.

The Jick Study

The plaintiffs relied heavily upon a study conducted and published by Hershel Jick, and colleagues.  Hershel Jick, Alexander M. Walker, Kenneth J. Rothman, Judith R. Hunter, Lewis B. Holmes, Richard N. Watkins, Diane C. D’Ewart, Anne Danford, and Sue Madsen, “Vaginal Spermicides and Congenital Disorders,” 245 J. Am. Med. Ass’n 1329 (1981).  This study makes Justice Sotomayor’s Matrixx Initiatives dictum clearly erroneous; the Jick study found a statistically significant difference between the rate of a composite of several birth defects in women who did, and those who did not, use spermicides.

But the Jick study had serious threats to its internal and external validity, which the trial court chose to ignore.  For one thing, the statistically significant result came only for a composite of malformations, which included birth defects that had no biologically plausible relationship to one another in terms of their potential relationship to a fetal exposure to a teratogen.  Defense expert, Dr. Stolley, for instance, described the diverse birth defects selected for inclusion in the Jick study to be “biologically implausible.” 615 F. Supp. at 284.

The Jick study did not report a statistically significant outcome with respect to the limb reduction deficit, which was the outcome of interest for the Wells plaintiffs.  Further undermining the meaning of any confidence intervals or p-values from the Jick study was the multiple testing and comparisons that took place within the Jick cohort.  By looking for many malformation outcomes, without a pre-specified aim of the study, the Jick paper was an exploratory, hypothesis-generating study, which did not support scientific, causal conclusions.
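The multiple-comparisons point can be made concrete with a rough sketch. Assume, purely for illustration, that a study runs k independent tests, each at a nominal significance level of 0.05, and that every null hypothesis is true. The number of comparisons actually made in the Jick cohort, and the dependence among them, are not given here, so the values of k below are hypothetical. Under those assumptions, the chance of at least one nominally “significant” result is 1 − (1 − α)^k.

```python
def familywise_error_rate(num_tests: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive among `num_tests` independent
    tests, each run at level `alpha`, when every null hypothesis is true."""
    if num_tests < 0:
        raise ValueError("num_tests must be non-negative")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie between 0 and 1")
    return 1.0 - (1.0 - alpha) ** num_tests


if __name__ == "__main__":
    # Hypothetical numbers of comparisons; not taken from the Jick paper.
    for k in (1, 5, 10, 20):
        print(f"{k:2d} tests -> {familywise_error_rate(k):.3f}")
```

On these assumptions, the chance of a spurious finding grows quickly with the number of outcomes examined, which is why a nominal p-value from an exploratory study does not carry the meaning it would have for a single, pre-specified hypothesis.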

Perhaps the biggest problem with the Jick study was its measure of “exposure” to spermicides, defined as a prescription by a woman at any time up to 11 months before conception.  Therefore, women and their malformed children were counted as cases without regard to whether the mothers had actually used spermicides at the time of conception or at any time during their pregnancies.  As Dr. Stolley explained, Jick had defined “exposure” so inaccurately that the data from the Jick study was “almost uninterpretable.” Id. at 284. The Jick study may have raised a hypothesis about spermicides, but it could not test that hypothesis in a meaningful way with the prescription data it used.  Studies after the Jick paper failed to show an association between spermicides and birth defects, or limb reduction deficits specifically.

The defense called Dr. Richard Watkins, one of the authors of the Jick paper, to testify narrowly on the validity problems in the study.  Dr. Watkins emphasized that the study was too small to yield statistically stable results, that presumed exposure in the study did not mean actual exposure, and that the lumping of multiple outcomes together was scientifically inappropriate.  Id. at 281.  According to Dr. Watkins, he made these criticisms to Dr. Jick, who directed him to have nothing more to do with the paper.

For better or worse, however, the Jick study was published with Dr. Watkins as an author. The trial court did not explain whether Dr. Watkins ever requested his name to be removed from the manuscript, but the trial court clearly formed a jaundiced view of Dr. Watkins, based upon the contradiction between his testimony and the paper that bore his name.

The Watkins problem was compounded by Ortho’s request to have him investigate exposure, and Dr. Watkins’ (unsurprising) finding that some of the women classified as users had not, in fact, used spermicides at the time of conception or afterwards.  Id. at 282.  The court viewed Dr. Watkins’ taking publication credit, and then challenging the validity of the study for the first time in a courtroom, rather than in the scientific community or its publications, as “severely” eroding his credibility. Id. Judge Shoob explained that:

“[b]ecause he had participated in the 1981 Jick study, Dr. Watkins might have been an ideal witness to comment on the validity of its conclusions. Dr. Watkins’ testimony on cross-examination, however, severely eroded his credibility.  It is perplexing that a physician would risk his professional reputation by signing his name to a study about which he had serious reservations, especially when he knew the article would be published in a widely-read journal. Moreover, the dispute between Doctors Jick and Watkins, when Dr. Jick requested that Dr. Watkins discontinue work on the study, creates some question about Dr. Watkins’ impartiality.

Finally, that Dr. Watkins chose this courtroom as the first public forum in which to repudiate a study that he had helped conduct nearly four years earlier creates further doubt about his credibility. For these reasons, the Court found Dr. Watkins’ testimony not credible.”

Id. (internal citations omitted). Judge Shoob’s harsh condemnation of Dr. Watkins stands in stark contrast with his glowing approval of the credibility of plaintiffs’ witnesses, who had chosen the courtroom for their first declaration that spermicides cause birth defects.  More to the point, however, is that whatever the merits of Judge Shoob’s credibility determinations, they did nothing to make the Jick study more persuasive or to erase the serious validity concerns raised by the defense.

Plaintiffs’ witnesses appeared not to consider all the evidence in reaching conclusions about causation; they relied heavily upon the Oechsli unpublished study, and Jick, but they seemed to ignore Cordero (1983), Warburton, Shapiro (1982), Mills (1985), Harlap (1980), as well as another unpublished study by Harlap.[i]  615 F. Supp. at 284 & n.26.  The Court nonetheless found that plaintiffs had carried their burden of proving Katie Wells’ limb reduction deficits.  Id. at 292.

Judge Shoob’s verdict has long since been reversed by the court of scientific review, which of course was small solace for Ortho.  See James L. Mills, “Spermicides and Birth Defects,” in Kenneth R. Foster, David E. Bernstein, and Peter W. Huber, eds., Phantom Risk:  Scientific Inference and the Law 87 (MIT Press 1993) (by time of the FMHDAC meeting in 1983, “the vast majority of evidence found no association between spermicide use and birth defects”); David F. Goldsmith & Susan G. Rose, “Establishing Causation with Epidemiology,” in Tee L. Guidotti & Susan G. Rose, eds., Science on the Witness Stand:  Evaluating Scientific Evidence in Law, Adjudication, and Policy 57, 70 (OEM Press 2001) (“Weak science and inappropriate verdicts – Spermicide and birth defects”); see also “Wells v. Ortho Pharmaceutical Corp. Reconsidered – Part 1,” at footnote 1 (Nov. 12, 2012).

After Judge Shoob’s verdict, the FDA issued a notice, “Data Do Not Support Association Between Spermicides, Birth Defects,” in the FDA Drug Bulletin (1986). Dr. Watkins published his re-appraisal and criticism of the Jick study, which had been a lightning rod for Judge Shoob’s scorn.  Dr. Watkins reported that many of the women who had had malformed children and who were counted as users of spermicides actually planned their pregnancies and were not using spermicides at all. Richard N. Watkins, “Vaginal spermicides and congenital disorders:  the validity of a study,” 256 J. Am. Med. Ass’n 3095, 3096 (1986) (noting that the Jick study’s “definition of exposure to spermicide near the time of conception was grossly inaccurate”).  Watkins wrote further that Jick (1981) was “unsupported by more complete evidence from its subjects.” Id.; see also Lewis B. Holmes, “Vaginal spermicides and congenital disorders:  the validity of a study – A Reply,” 256 J. Am. Med. Ass’n 3096 (1986) (noting that the Jick article should never have been published).

Dicta

“Somewhat like statements in a law review article written by a judge, or a judge’s comments in a lecture, dicta can be used as a vehicle for offering to the bench and bar that judge’s views on an issue, for whatever those views are worth.” McDonald’s Corp. v. Robertson, 147 F.3d 1301, 1315-1316 (11th Cir. 1998).  Dicta can also be used to manipulate the path of the law by encouraging lower courts to treat the dicta as holding.  The appellate court that issues the dicta retains “plausible deniability” if the dicta proves improvident in future cases. The importance of dicta increases with the hierarchical level of the court.  Dicta in Supreme Court cases is given great attention because of that Court’s role in setting policy and declaring the law beyond the narrow confines of the dispute between the parties.  See, e.g., Judith M. Stinson, “Why Dicta Becomes Holding and Why it Matters,” 76 Brook. L. Rev. 219 (2010).  For all these reasons, Justice Sotomayor’s dicta about statistical significance and the Wells case should be taken seriously, and should be seriously rejected.

(to be continued)


[i] Harlap, “Congenital Abnormalities in the Offspring of Women Who Used Oral and Other Contraceptives around the Time of Conception” (unpublished study supported by the Center for Population Research, National Institute of Child Health and Development).

Summary Judgment in Gushue – Attempted Differential Diagnosis for Idiopathic Diseases Rebuffed

October 10th, 2012

Parkinson’s disease (PD) in young women is a rare disease.  Exposure to manganese fumes from a pottery kiln is a rare exposure.  Plaintiff Kathleen Gushue, with the help of her expert witnesses, Drs. Paul Nausieda and Elan Louis, argued that the coincidence of both rare exposure and rare outcome must be probative of a causal relationship between the two.  Supreme Court Justice Jeffrey K. Oing, realizing that one in a million happens eight times a day here in New York City, excluded the proffered testimony of Drs. Nausieda and Louis, and granted defendants summary judgment.  Gushue v. Estate of Norman Levy, et al., Supreme Court of New York, New York County, Index No.: 106645/05, Decision & Order  (Sept. 28, 2012).
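The “one in a million” quip is simple arithmetic, sketched below. The figures are assumptions made only for illustration: a city population of roughly eight million (implied by the quip, not stated in the decision), and an event with a one-in-a-million chance of happening to any given person on a given day, independently of everyone else.

```python
import math


def expected_events(population: int, probability: float) -> float:
    """Expected number of people to whom the rare event happens."""
    return population * probability


def prob_at_least_one(population: int, probability: float) -> float:
    """Chance the event happens to at least one person, assuming independence.

    Computes 1 - (1 - p)^n with log1p/expm1 for numerical stability.
    """
    return -math.expm1(population * math.log1p(-probability))


if __name__ == "__main__":
    # Hypothetical inputs: about eight million residents, one-in-a-million odds.
    residents = 8_000_000
    p = 1e-6
    print(f"expected events per day: {expected_events(residents, p):.1f}")
    print(f"chance of at least one:  {prob_at_least_one(residents, p):.4f}")
```

On these assumptions, a rare coincidence turning up somewhere in a large population is close to certain, which is why the coincidence alone says little about causation.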

Manganese in very high doses can cause a parkinsonism, but Justice Oing avoided the semantic traps set for him by the plaintiff.  Just because PD requires parkinsonism does not mean that manganese-induced parkinsonism can be equated with PD.  A dog is a carnivorous mammal, with fur, four legs, and a tail.  So is a cat, but a dog is not a cat.  Similarly, PD and the specific features of manganese-induced parkinsonism are different.  See Agency for Toxic Substances and Disease Registry, Draft Toxicological Profile for Manganese 16 (Draft 2008) (“While manganese neurotoxicity has clinical similarities to Parkinson’s disease, it can be clinically distinguished from Parkinson’s.”); id. at 66-67 (“Manganism and Parkinson’s disease also differ pathologically. * * *  It is likely that the terms Parkinson-like-disease and manganese-induced-Parkinsonism will continue to be used by those less knowledgeable about the significant differences between the two.”).

Plaintiff and her expert witnesses also attempted the differential diagnosis ploy, but Justice Oing followed prior New York law that requires a claimant, who is alleging toxic cause, to “reliably rule out reasonable alternative causes of [the alleged harm] or idiopathic causes.” Id., citing Barbaro v Eastman Kodak Co., 26 Misc. 3d 1124 (A) (Sup. Ct., Nassau Cty. 2010).

Logically and legally, plaintiff could not rule out idiopathic causes that are responsible for the great majority of PD cases. Parkinson’s disease has no known causes other than a few uncommon genetic variants.  See John Hardy, “No Definitive Evidence for a Role for the Environment in the Etiology of Parkinson’s Disease,” 21 Movement Disorders 1790 (2006).  See also J. Mortimer, A. Borenstein, and L. Nelson, “Associations of welding and manganese exposure with Parkinson disease: Review and meta-analysis,” 79 Neurology 1174 (2012) (reporting a statistically significant decreased risk of Parkinson’s disease among welding tradesmen).

American Taliban and the Attack on Science

October 9th, 2012

Mostly I care about whether governmental policy is based upon facts, but discerning the facts requires intelligence.  In some areas of human endeavor, it involves something we call science.  Generally smart people are better at doing science than stupid people, but there may be the occasional idiot savant.

Political pundits focus on the dualism of America – rich and poor, but this is not the important divide.  The crucial distinction is between the smart and the stupid.

Rick Santorum says that smart people have no place in the Republican party.  Colleges and universities are the adversaries of the stupid.  Stupid people are the base.  See Kristen A. Lee, “Santorum complains to social conservatives about ‘smart people’” (Sept. 17, 2012).  Santorum accuses President Obama of being a snob:  “he wants everybody in America to go to college.” Santorum later backed away from his “What a snob,” remark, when he acknowledged that his comment was “probably not the smartest thing.”  Of course, Santorum was really complimenting himself, and reaffirming his core values.

Shifting gears, just slightly.

Science flourished in the Islamic world until it didn’t.  Most historians appear to accept that the rise of clerics and superstition killed a rich tradition of science in Islam, about the same time that the Reformation and other social changes in Europe allowed science to emerge from the shadows of the Church. The American Taliban would have us align ourselves with the current Islamic hostility to science.

Who are the American Taliban?

Meet Congressman Paul Broun.  Broun serves on the House Science Committee.  According to Wikipedia, the font of all knowledge, Broun has a bachelor’s degree in chemistry from the University of Georgia, and an M.D. degree from the Medical College of Georgia in Augusta.  Broun calls himself a scientist.

Last month, at a church-sponsored event in Georgia, Broun declared that “all that stuff I was taught about evolution and embryology and the Big Bang theory” are “lies straight from the pit of hell.” And these lies are no casual fibs; according to Broun, the lies are part of a conspiracy “to try to keep me and all the folks who were taught that from understanding that they need a savior.” And Broun really needs a savior.

Broun is also an accomplished geologist:

“You see, there are a lot of scientific data that I’ve found out as a scientist that actually show that this is really a young Earth. I don’t believe that the earth’s but about 9,000 years old. I believe it was created in six days as we know them. That’s what the Bible says.”

Broun made his comments to constituents at the Sportsman’s Banquet at Liberty Baptist Church in Hartwell, Georgia.  In keeping with the Sportsman theme, members of the Bridge Project have been bird-dogging Broun.  Instead of shooting big game, they shot video of Broun’s speech, which they proudly distributed by YouTube, which of course is their right for now.

The House Science Committee apparently has become a safe haven for the American Taliban.  Fellow scientist and Congressman, Todd Akin, also serves on the Committee.  Akin gained fame for his definitive study, which showed that women who experience “legitimate rape” cannot become pregnant because their tubes shut down.

Not all bad science is practiced in the courts.

Unraveling the “Master Historical Narrative” of Asbestos

October 6th, 2012

Sheila Scheuerman at the TortsProf Blog has posted a note about a forthcoming article by Rachel Maines, of the Cornell School of Electrical and Computer Engineering, entitled “The Asbestos Litigation Master Narrative: Building Codes, Engineering Standards, and ‘Retroactive Inculpation’.”  The article was published “in press,” in August, and is slated to appear in an upcoming issue of Enterprise & Society.

Prof. Scheuerman has kindly provided a link to the in-press version of Professor Maines’ article:  Download Maines Asbestos Litigation Master Narrative 2012. Several years ago, Professor Maines published a book that challenged the asbestos dogmas created in the occupational health community, and by plaintiffs’ counsel and their expert witnesses. R. Maines, Asbestos and Fire: Technological Trade-offs and the Body at Risk (Rutgers Univ. Press 2005).  In her forthcoming article, Maines extends the thesis of her book, to explore how plaintiffs’ counsel conspired with their expert witnesses, such as Barry Castleman, to create what she calls “The Asbestos Litigation Master Narrative,” which involves the “retroactive inculpation” of industry for manufacturing asbestos-containing products.  Her article explores how building codes, engineering standards, and federal regulations specified the use of asbestos in various products, for health and safety reasons.  These codes, standards, and regulations represent a broad and deep consensus that asbestos could and should be used safely because of its important physical properties.

Maines notes that her search of LexisNexis revealed only two asbestos cases in which courts referenced building codes as standards that weighed against the plaintiffs’ constructed narrative of conspiracy tales and supposedly established historical knowledge of asbestos hazards.  She seems to imply that defense counsel have not done enough to put the legal and regulatory insistence upon asbestos use before courts and juries, which must employ the retrospectoscope to assess past knowledge and exercise of due care.

While Maines presents a valuable and engaging counter-narrative, with careful historical scholarship, her implied criticism of the defense bar is unwarranted.  In several key states (NJ and PA), where many asbestos cases have been tried, a combination of hyper-strict liability and trial bifurcation has kept juries from hearing the kind of evidence that Maines outlines.  For many years, reverse bifurcation was mandated in Philadelphia County Court of Common Pleas.  Causation and damages were litigated in the first phase of trial; liability in the second.  Plaintiffs’ counsel sometimes played an ancient videotaped deposition of Dr. Katherine Sturgis, and the defense often did not respond, perhaps because Dr. Sturgis was so lackluster, and because most juries had a hard time in any event finding for the defense after they committed to a causation and damages verdict.

There were notable exceptions.  One judge who took cases from the Mass Tort Program was the Hon. Levan Gordon, who resisted the MTP prescription for reverse bifurcation, and who tried cases “all issues.” In one case Tom Hanna and I tried against now Judge Sandy Byrd, back in May 1989, O’Donnell v. The Celotex Corp., Phila. Cty. Ct.C.P., July 1982 Term, Case No. 1619, Judge Gordon followed his practice of trying cases all-issues, and I was thus able to put on a “state-of-the-art” defense, along with evidence of U.S. Navy military specifications for asbestos in insulation products.  The plaintiffs’ product identification witness, Mr. George Rabuck, unexpectedly cooperated by offering a story of a shake-down cruise of a Navy vessel, in which the insulators had not covered a stretch of steam pipe with insulation.  When a nearby oil valve broke, spraying oil onto the uninsulated pipe, a fire erupted, and two sailors died before it could be extinguished.  I was able to have Mr. Rabuck agree that a fire on a ship was a terrible thing, and in my closing argument, I was able to paint the picture of the two dead sailors who were taken off the ship in body bags because someone forgot to use asbestos.  I felt that the risk-utility balance had been restored.  Perhaps the jury did as well; they returned a general verdict for the defense.

I tell the war story, not only because it was one of my favorite trials, but also because the defense used evidence of governmental insistence upon procurement and use of asbestos-containing insulation.  I am confident that many other defense lawyers have used similar mil-spec evidence as well, along with evidence of the U.S. government’s very deep knowledge of the potential hazards of asbestos.

It’s Alimentary, My Dear Watson

September 20th, 2012

Mr. Watson, who claimed to have consumed thousands of bags of popcorn with diacetyl, sued for bronchiolitis obliterans allegedly caused by the diacetyl.

Actually, with the help of frequent testifier David Egilman, Wayne Watson claimed his lung injury was inhalational.

The trial judges in Watson denied essentially the same challenges that were sustained in Newkirk v. ConAgra Foods, Inc., 727 F. Supp. 2d 1006 (E.D.Wash. 2010), aff’d, 438 Fed.Appx. 607 (9th Cir. 2011).

Yesterday, the jury returned a verdict for compensatory damages of $1.2 million, and punitive damages of $6 million, against the defendants, some of which had settled before trial.

For a predictably misleading, mainstream media account that fails to mention the interesting Daubert exclusions and defense verdicts in this litigation, see  Colorado man Wayne Watson wins $7 million in “popcorn lung” lawsuit; and ‘Popcorn Lung’ Lawsuit Nets $7.2M Award (Sept. 20, 2012).

The supermarket defendant at trial should certainly appeal.  It remains to be seen who gets the last pop in this case.

Watson Popcorn Case Pops Along

September 8th, 2012

Earlier today, I discussed the pending motion that would have limited, or eliminated, Dr. Egilman’s testimony in the Watson diacetyl case. See Good’s Expert Witness Opinion Not Good Enough in Tenth Circuit.  Apparently, Chief Judge Daniel denied the defendant’s renewed Rule 702 motion, and so “this trial must be tried.”  Whether the gatekeeping was sufficiently exact, time will tell.

Details to follow.