TORTINI

For your delectation and delight, desultory dicta on the law of delicts.

Maryland Puts the Brakes on Each and Every Asbestos Exposure

July 3rd, 2012

Last week, the Maryland Court of Special Appeals reversed a plaintiffs’ verdict in Dixon v. Ford Motor Company, 2012 WL 2483315 (Md. App. June 29, 2012).  Jane Dixon died of pleural mesothelioma.  The plaintiffs, her survivors, claimed that her last illness and death were caused by her household improvement projects, which involved exposure to spackling/joint compound, and by her husband’s work with car parts and brake linings, which involved “take home” exposure on his clothes.  Id. at *1.

All the expert witnesses appeared to agree that mesothelioma is a “dose-response disease,” meaning that the more the exposure, the greater the likelihood that a person exposed will develop the disease. Id. at *2.  Plaintiffs’ expert witness, Dr. Laura Welch, testified that “every exposure to asbestos is a substantial contributing cause and so brake exposure would be a substantial cause even if [Mrs. Dixon] had other exposures.” On cross-examination, Dr. Welch elaborated upon her opinion to explain that any “discrete” exposure would be a contributing factor. Id.

Welch, of course, criticized the entire body of epidemiology of car mechanics and brake repairmen, which generally finds no increased risk of mesothelioma above overall population rates.  With respect to the take-home exposure, Welch had to acknowledge that there were no epidemiologic studies that investigated the risk of mesothelioma among the wives of brake mechanics.  Welch argued that the studies of car mechanics did not involve exposure to brake shoes as would have been experienced by brake repairmen, but her argument only served to make her attribution based upon take-home exposure to brake linings seem more preposterous.  Id. at *3.  The court recognized that Dr. Welch’s opinion may have been trivially true, but still unhelpful.  Each discrete exposure, even one as attenuated as a take-home exposure from the repair of a single brake shoe, may have “contributed,” but that opinion did not help the jury assess whether the contribution was substantial.

The court sidestepped the issues of fiber type and threshold, and homed in on the agreement that mesothelioma risk showed a dose-response relationship with asbestos exposure.  (There is a sense that the court confused the dose-response concept with the absence of a threshold.)  The court credited hyperbolic risk assessment figures from the United States Environmental Protection Agency, which suggested that even ambient air exposure to asbestos leads to an increase in mesothelioma risk, but then realized that such claims made the legal need to characterize the risk from the defendant’s product all the more important before the jury could reasonably have concluded that any particular exposure experienced by Ms. Dixon was “a substantial contributing factor.”  Id. at *5.

Having recognized that the best the plaintiffs could offer was a claim of increased risk, and perhaps a crude quantification of the relative risks resulting from each product’s exposure, the court could not escape the conclusion that Dr. Welch’s rote recitation that “every exposure” is substantial was nothing more than an unscientific, empty assertion.  Welch’s claim was either tautologically true or empirical nonsense.  The court also recognized that substituting risk for causation opened the door to essentially probabilistic evidence:

“If risk is our measure of causation, and substantiality is a threshold for risk, then it follows—as intimated above—that ‘substantiality’ is essentially a burden of proof. Moreover, we can explicitly derive the probability of causation from the statistical measure known as ‘relative risk’ … .  For reasons we need not explore in detail, it is not prudent to set a singular minimum ‘relative risk’ value as a legal standard.12 But even if there were some legal threshold, Dr. Welch provided no information that could help the finder of fact to decide whether the elevated risk in this case was ‘substantial’.”

Id. at *7.  The court’s discussion here of “the elevated risk” seems wrong unless we understand it to mean the elevated risk attributable to the particular defendant’s product, in the context of an overall exposure that we accept as having been sufficient to cause the decedent’s mesothelioma.  Despite the lack of any quantification of relative risks in the case, overall or from particular products, and the court’s own admonition against setting a minimum relative risk as a legal standard, the court proceeded to discuss relative risks at length.  For instance, the court criticized Judge Kozinski’s opinion in Daubert, upon remand from the Supreme Court, for not going far enough:

“In other words, the Daubert court held that a plaintiff’s risk of injury must have at least doubled in order to hold that the defendant’s action was ‘more likely than not’ the actual cause of the plaintiff’s injury. The problem with this holding is that relative risk does not behave like a ‘binary’ hypothesis that can be deemed ‘true’ or ‘false’ with some degree of confidence; instead, the uncertainty inherent in any statistical measure means that relative risk does not resolve to a certain probability of specific causation. In order for a study of relative risk to truly fulfill the preponderance standard, it would have to result in 100% confidence that the relative risk exceeds two, which is a statistical impossibility. In short, the Daubert approach to relative risk fails to account for the twin statistical uncertainty inherent in any scientific estimation of causation.”

Id. at *7 n.12 (citing Daubert v. Merrell Dow Pharms., Inc., 43 F.3d 1311, 1320-21 (9th Cir. 1995) (holding that a preponderance standard requires causation to be shown by probabilistic evidence of relative risk greater than two) (opinion on remand from Daubert v. Merrell Dow Pharms., 509 U.S. 579 (1993))).  The statistical impossibility derives from the asymptotic nature of the normal distribution, but the court failed to explain why a relative risk of two must be excluded as statistically implausible based upon the sample statistic.  After all, a relative risk greater than two, with the lower bound of a 95% confidence interval above one, based upon unbiased sampling, suggests that our best evidence is that the population parameter is greater than two, as well.  The court, however, insisted upon stating the relative-risk-greater-than-two rule with a vengeance:

“All of this is not to say, however, that any and all attempts to establish a burden of proof of causation using relative risk will fail. Decisions can be – and in science or medicine are – premised on the lower limit of the relative risk ratio at a requisite confidence level. The point of this minor discussion is that one cannot apply the usual, singular ‘preponderance’ burden to the probability of causation when the only estimate of that probability is statistical relative risk. Instead, a statistical burden of proof of causation must consist of two interdependent parts: a requisite confidence of some minimum relative risk. As we explain in the body of our discussion, the flaws in Dr. Welch’s testimony mean we need not explore this issue any further.”

Id. (emphasis in original).
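
The footnote’s statistical worry is easy to make concrete.  Here is a minimal sketch, in Python, of the standard log-transform confidence interval for a relative risk; the counts are invented for illustration and come from no study discussed in Dixon:

```python
import math

def relative_risk_ci(a, n1, b, n2, z=1.96):
    """Relative risk and 95% CI by the usual log-transform method.

    a of n1 exposed subjects, and b of n2 unexposed subjects,
    developed the disease.
    """
    rr = (a / n1) / (b / n2)
    # standard error of log(RR)
    se_log_rr = math.sqrt(1/a - 1/n1 + 1/b - 1/n2)
    lo = math.exp(math.log(rr) - z * se_log_rr)
    hi = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lo, hi

# Hypothetical cohort: 30/1000 exposed vs. 12/1000 unexposed
rr, lo, hi = relative_risk_ci(30, 1000, 12, 1000)
print(f"RR = {rr:.2f}, 95% CI ({lo:.2f}, {hi:.2f})")
# Roughly RR = 2.50, 95% CI (1.3, 4.9): the best supported estimate of
# the population parameter exceeds two even though the lower bound of
# the interval does not.
```

On these invented numbers, the point estimate remains the single best supported value, which is why demanding that the entire interval sit above two, as the court’s footnote seems to contemplate, asks more of the data than a preponderance standard requires.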

And despite having declared the improvidence of addressing the relative risk issue, and then the lack of necessity for addressing the issue given Dr. Welch’s flawed testimony, the court nevertheless tackled the issue once more, a couple of pages later:

“It would be folly to require an expert to testify with absolute certainty that a plaintiff was exposed to a specific dose or suffered a specific risk. Dose and risk fall on a spectrum and are not ‘true or false’. As such, any scientific estimate of those values must be expressed as one or more possible intervals and, for each interval, a corresponding confidence that the true value is within that interval.”

Id. at *9 (emphasis in original; internal citations omitted).  The court captured the frequentist concept of the confidence interval as defined operationally by repeated samplings and their random variability.  But the confidence coefficient represents the percentage of all such intervals that would include the “true” value, not the probability that a particular interval, calculated from a given sample, contains the true value.  The true value either is, or is not, in the interval generated from a single sample statistic.  Again, it is unclear why the court was weighing in on this aspect of probabilistic evidence when plaintiffs’ expert witness, Welch, offered no quantification of the overall risk or of the risk attributable to a specific product exposure.
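
The distinction is easy to demonstrate by simulation.  In this sketch (all parameters arbitrary), many samples are drawn from a known “true” proportion, a 95% interval is computed from each, and the intervals are scored for coverage; the 95% figure describes the long-run performance of the procedure, not the probability that any single interval contains the truth:

```python
import math
import random

random.seed(1)
TRUE_P = 0.30    # the fixed, unknown-in-practice "true" value
N = 200          # subjects per sample
REPS = 10_000

covered = 0
for _ in range(REPS):
    x = sum(random.random() < TRUE_P for _ in range(N))
    p_hat = x / N
    se = math.sqrt(p_hat * (1 - p_hat) / N)
    lo, hi = p_hat - 1.96 * se, p_hat + 1.96 * se
    covered += lo <= TRUE_P <= hi

print(f"{covered / REPS:.1%} of the intervals contain the true value")
# Close to 95% of the *intervals* cover the truth; any one computed
# interval either contains the true value or it does not.
```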

The court indulged the plaintiffs’ no-threshold fantasy but recognized that the risks of low-level asbestos exposure were low, and likely below a doubling of risk, an issue that the court stressed it wanted to avoid.  The court cited one study that suggested a risk (odds) ratio of 1.1 for exposures less than 0.5 fiber/ml-years.  See id. at *5 (citing Y. Iwatsubo et al., “Pleural mesothelioma: dose-response relation at low levels of asbestos exposure in a French population-based case-control study,” 148 Am. J. Epidemiol. 133 (1998) (estimating an odds ratio of 1.1 for exposures less than 0.5 fibers/ml-years)).  But the court, which tried to be precise elsewhere, appears to have lost its way in citing Iwatsubo here.  After all, how can a single odds ratio of 1.1 describe all exposures from 0 all the way up to 0.5 f/ml-years?  How can a single odds ratio describe all exposures in this range, regardless of fiber type, when chrysotile asbestos carries little to no risk for mesothelioma, and certainly orders of magnitude less risk than amphibole fibers such as amosite and crocidolite?  And if a low-level exposure has a risk ratio of 1.1, how can plaintiffs’ hired expert witness, Welch, even make the attribution of Dixon’s mesothelioma to the entirety of her exposure, let alone the speculative take-home chrysotile exposure involved from Ford’s brake linings?  Obviously, had the court posed these questions, it would have realized that “it is not possible” to permit Welch’s testimony at all.

The court further lost its way in addressing the exculpatory epidemiology put forward by the defense expert witnesses:

“Furthermore, the leading epidemiological report cited by Ford and its amici that specifically studied ‘brake mechanics’, P.A. Hessel et al., ‘Mesothelioma Among Brake Mechanics: An Expanded Analysis of a Case-control Study’, 24 Risk Analysis 547 (2004), does not at all dispel the notion that this population faced an increased risk of mesothelioma due to their industrial asbestos exposure. … When calculated at the 95% confidence level, Hessel et al. estimated that the odds ratio of mesothelioma could have been as low as 0.01 or as high as 4.71, implying a nearly quintupled risk of mesothelioma among the population of brake mechanics. 24 Risk Analysis at 550–51.”

Id. at *8.  Again, the court is fixated on the confidence interval, to the exclusion of the estimated magnitude of the association!  This time, after earlier shouting that it was the lower bound of the interval that matters scientifically, the court emphasizes the upper bound.  The court here has strayed far from the actual data, and from any plausible interpretation of them:

“The odds ratio (OR) for employment in brake installation or repair was 0.71 (95% CI: 0.30-1.60) when controlled for insulation or shipbuilding. When a history of employment in any of the eight occupations with potential asbestos exposure was controlled, the OR was 0.82 (95% CI: 0.36-1.80). ORs did not increase with increasing duration of brake work. Exclusion of those with any of the eight exposures resulted in an OR of 0.62 (95% CI: 0.01-4.71) for occupational brake work.”

P.A. Hessel et al., “Mesothelioma Among Brake Mechanics: An Expanded Analysis of a Case-control Study,” 24 Risk Analysis 547, 547 (2004).  All of Dr. Hessel’s estimates of effect size were below 1.0, and he found no trend with duration of brake work.  Cherry-picking the upper bound of a single subgroup analysis for emphasis was unwarranted, and hardly did justice to the facts or the science.

Dr. Welch’s conclusion that the exposure and risk in this case were “substantial” simply was not a scientific conclusion, and without it her testimony did not provide information for the jury to use in reaching its conclusion as to substantial factor causation. Id. at *7.  The court noted that Welch, and the plaintiffs, may have lacked scientific data to provide estimates of Dixon’s exposure to asbestos or relative risk of mesothelioma, but ignorance or uncertainty was hardly the basis to warrant an expert witness’s belief that the relevant exposures and risks are “substantial.” Id. at *10.  The court was well justified in being discomforted by the conclusory, unscientific opinion rendered by Laura Welch.

In the final puzzle of the Dixon case, the court vacated the judgment, and remanded for a new trial, “either without her opinion on substantiality or else with some quantitative testimony that will help the jury fulfill its charge.”  Id. at *10.  The court thus seemed to imply that an expert witness need not utter the magic word, “substantial,” for the case to be submitted to the jury against a brake defendant in a take-home exposure case.  Given the state of the record, the court should have simply reversed and rendered judgment for Ford.

Ecological Fallacy Goes to Court

June 30th, 2012

In previous posts, I have bemoaned the judiciary’s tin ear for important qualitative differences between and among different research study designs.  The Reference Manual on Scientific Evidence (3d ed. 2011) (RMSE3d) offers inconsistent advice, ranging from Margaret Berger’s counsel to abandon any hierarchy of evidence, to other chapters’ emphasis on the importance of a hierarchy.

The Cook case is one of the more aberrant decisions, which elevated an ecological study, without a statistically significant result, into an acceptable basis for a causal conclusion under Rule 702.  Senior Judge Kane’s decision in the litigation over radioactive contamination from the Colorado Rocky Flats nuclear weapons plant is illustrative of a judicial refusal to engage with the substantive differences among studies, and of a willingness to ignore the inability of some study designs to support causality.  See Cook v. Rockwell Internat’l Corp., 580 F. Supp. 2d 1071, 1097-98 (D. Colo. 2006) (“Defendants assert that ecological studies are inherently unreliable and therefore inadmissible under Rule 702.  Ecological studies, however, are one of several methods of epidemiological study that are well-recognized and accepted in the scientific community.”), rev’d and remanded on other grounds, 618 F.3d 1127 (10th Cir. 2010), cert. denied, ___ U.S. ___ (May 24, 2012).  Senior Judge Kane’s point about the recognition and acceptance of ecological studies has nothing to do with their ability to support conclusions of causality.  This basic non sequitur led the trial judge into ruling that the challenge “goes to the weight, not the admissibility” of the challenged opinion testimony.  This is a bit like using an election-day exit poll, with 5% of returns in, as “reliable” evidence to support a prediction of the winner.  The poll may have been conducted most expertly, but it lacks the ability to predict the winner.

The issue is not whether ecological studies are “scientific”; they are part of the epidemiologists’ toolkit.  The issue is whether they warrant inferences of causation.  Some so-called scientific studies are merely hypothesis generating, preliminary, tentative, or data-dredging exercises.  Judge Kane opined that ecological studies are merely “less probative” than other studies, and that the relative weights of studies do not render them inadmissible.  Id.  This is a misunderstanding or an abdication of gatekeeping responsibility.  First, studies themselves are not admissible; it is the expert witness’s opinion testimony that is challenged.  Second, Rule 702 requires that the proffered opinion be “scientific knowledge,” and ecological studies simply lack the necessary epistemic warrant.

The legal sources cited by Senior Judge Kane provide only equivocal and minimal support, at best, for his decision.  The court pointed to RMSE2d at 344-45, for the proposition that ecological studies are useful for establishing associations, but are weak evidence for causality.  The other legal citations seem equally unhelpful.  In re Hanford Nuclear Reservation Litig., No. CY–91–3015–AAM, 1998 WL 775340 at *106 (E.D. Wash. Aug. 21, 1998) (citing RMSE2d and the National Academy of Science Committee on Radiation Dose Reconstruction for Epidemiological Uses, which states that “ecological studies are usually regarded as hypothesis generating at best, and their results must be regarded as questionable until confirmed with cohort or case‑control studies.” National Research Council, Radiation Dose Reconstruction for Epidemiologic Uses at 70 (1995)), rev’d on other grounds, 292 F.3d 1124 (9th Cir. 2002); Ruff v. Ensign-Bickford Indus., Inc., 168 F. Supp. 2d 1271, 1282 (D. Utah 2001) (reviewing evidence that consisted of a case-control study in addition to an ecological study; “It is well established in the scientific community that ecological studies are correlational studies and generally provide relatively weak evidence for establishing a conclusive cause and effect relationship.”); see also id. at 1274 n.3 (“Ecological studies tend to be less reliable than case–control studies and are given little evidentiary weight with respect to establishing causation.”)

 

ERROR COMPOUNDED

The new edition of the RMSE cites the Cook case in several places.  In an introductory chapter, the late Professor Margaret Berger cites the case incorrectly for having excluded expert witness testimony.  See Margaret A. Berger, “The Admissibility of Expert Testimony,” 11, 24 n.62, in RMSE3d (“See Cook v. Rockwell Int’l Corp., 580 F. Supp. 2d 1071 (D. Colo. 2006) (discussing why the court excluded expert’s testimony, even though his epidemiological study did not produce statistically significant results).”)  The chapter on epidemiology cites Cook correctly for having refused to exclude the plaintiffs’ expert witness, Dr. Richard Clapp, who relied upon an ecological study of two cancer outcomes in the area adjacent to the Rocky Flats Nuclear Weapons Plant.  See Michael D. Green, D. Michal Freedman, and Leon Gordis, “Reference Guide on Epidemiology,” 549, 561 n.34, in Reference Manual on Scientific Evidence (3d ed. 2011).  The authors, however, abstain from any judgmental comments about the Cook case, which is curious given their careful treatment of ecological studies and their limitations:

“4. Ecological studies

Up to now, we have discussed studies in which data on both exposure and health outcome are obtained for each individual included in the study.33 In contrast, studies that collect data only about the group as a whole are called ecological studies.34 In ecological studies, information about individuals is generally not gathered; instead, overall rates of disease or death for different groups are obtained and compared. The objective is to identify some difference between the two groups, such as diet, genetic makeup, or alcohol consumption, that might explain differences in the risk of disease observed in the two groups.35 Such studies may be useful for identifying associations, but they rarely provide definitive causal answers.36”

Id. at 561.  The epidemiology chapter proceeds to note that the lack of information about individual exposure and disease outcome in an ecological study “detracts from the usefulness of the study,” and renders it prone to erroneous inferences about the association between exposure and outcome, “a problem known as an ecological fallacy.”  Id. at 562.  The chapter authors define the ecological fallacy:

“Also, aggregation bias, ecological bias. An error that occurs from inferring that a relationship that exists for groups is also true for individuals.  For example, if a country with a higher proportion of fishermen also has a higher rate of suicides, then inferring that fishermen must be more likely to commit suicide is an ecological fallacy.”

Id. at 623.  Although the ecological study design is weak and generally unsuitable to support causal inferences, the authors note that such studies can be useful in generating hypotheses for future research using studies that gather data about individuals. Id. at 562.  See also David Kaye & David Freedman, “Reference Guide on Statistics,” 211, 266 n.130 (citing the epidemiology chapter “for suggesting that ecological studies of exposure and disease are ‘far from conclusive’ because of the lack of data on confounding variables (a much more general problem) as well as the possible aggregation bias”); Leon Gordis, Epidemiology 205-06 (3d ed. 2004) (ecologic studies can be of value to suggest future research, but “[i]n and of themselves, however, they do not demonstrate conclusively that a causal association exists”).
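
The glossary’s fishermen example is easily made concrete.  In the following sketch, with wholly invented numbers, each region’s overall suicide rate rises with its proportion of fishermen, even though within every region fishermen have the lower rate; the ecological inference runs exactly backwards:

```python
# Hypothetical regions: (fishermen, their suicides, others, their suicides).
# Some region-level factor drives both the share of fishermen and the
# overall rate; fishermen themselves always fare better than their neighbors.
regions = [
    (1000, 10, 9000, 180),   # 10% fishermen
    (3000, 45, 7000, 210),   # 30% fishermen
    (6000, 120, 4000, 160),  # 60% fishermen
]

for n_f, d_f, n_o, d_o in regions:
    print(f"fishermen {n_f/(n_f + n_o):4.0%}  "
          f"overall rate {(d_f + d_o)/(n_f + n_o):.2%}  "
          f"fishermen {d_f/n_f:.2%} vs. others {d_o/n_o:.2%}")
# Overall rates climb (1.90%, 2.55%, 2.80%) with the share of fishermen,
# yet in every region the fishermen's rate is half the others' rate.
```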

The views expressed in the Reference Manual for Scientific Evidence, about ecological studies, are hardly unique.  The following quotes show how ecological studies are typically evaluated in epidemiology texts:

“Ecological fallacy

An ecological fallacy or bias results if inappropriate conclusions are drawn on the basis of ecological data. The bias occurs because the association observed between variables at the group level does not necessarily represent the association that exists at the individual level (see Chapter 2).

***

Such ecological inferences, however limited, can provide a fruitful start for more detailed epidemiological work.”

R. Bonita, R. Beaglehole, and T. Kjellström, Basic Epidemiology 43 (2d ed., WHO 2006).

“A first observation of a presumed relationship between exposure and disease is often done at the group level by correlating one group characteristic with an outcome, i.e. in an attempt to relate differences in morbidity or mortality of population groups to differences in their local environment, living habits or other factors. Such correlational studies that are usually based on existing data are prone to the so-called ‘ecological fallacy’ since the compared populations may also differ in many other uncontrolled factors that are related to the disease. Nevertheless, ecological studies can provide clues to etiological hypotheses and may serve as a gateway towards more detailed investigations.”

Wolfgang Ahrens & Iris Pigeot, eds., Handbook of Epidemiology 17-18 (2005).

The Cook case is a wonderful illustration of the judicial mindset that avoids and evades gatekeeping by resorting to the conclusory reasoning that a challenge “goes to the weight, not the admissibility” of an expert witness’s opinion.

Let’s Require Health Claims to Be Evidence Based

June 28th, 2012

Litigation arising from the FDA’s refusal to approve “health claims” for foods and dietary supplements is a fertile area for disputes over the interpretation of statistical evidence.  A “health claim” is “any claim made on the label or in labeling of a food, including a dietary supplement, that expressly or by implication … characterizes the relationship of any substance to a disease or health-related condition.”  21 C.F.R. § 101.14(a)(1); see also 21 U.S.C. § 343(r)(1)(A)-(B).

Unlike the federal courts exercising their gatekeeping responsibility, the FDA has committed to pre-specified principles of interpretation and evaluation. By regulation, the FDA gives notice of standards for evaluating complex evidentiary displays for the “significant scientific agreement” required for approving a food or dietary supplement health claim.  21 C.F.R. § 101.14.  See FDA – Guidance for Industry: Evidence-Based Review System for the Scientific Evaluation of Health Claims – Final (2009).

If the FDA’s refusal to approve a health claim requires pre-specified criteria of evaluation, then we should be asking ourselves why the federal courts have failed to develop a set of criteria for evaluating health-effects claims as part of their Rule 702 (“Daubert“) gatekeeping responsibilities.  Why, close to 20 years after the Supreme Court decided Daubert, can lawyers make “health claims” without having to satisfy evidence-based criteria?

Although the FDA’s guidance is not always as precise as might be hoped, it is far better than the suggestion of the new Reference Manual on Scientific Evidence (3d ed. 2011) that there is no hierarchy of evidence.  See RMSE3d at 564 & n.48 (citing and quoting an idiosyncratic symposium paper for the proposition that “[t]here should be no hierarchy [among different types of scientific methods to determine cancer causation]”); see “Late Professor Berger’s Introduction to the Reference Manual on Scientific Evidence” (Oct. 23, 2011).

The FDA’s attempt to articulate an evidence-based hierarchy is noteworthy because the agency must evaluate a wide range of evidence, from in vitro, to animal studies, to observational studies of varying kinds, to clinical trials, to meta-analyses and reviews.  The FDA’s criteria are a good start, and I imagine that they will develop and improve over time.  Although imperfect, the criteria are light years ahead of the situation in federal and state court gatekeeping.  Unlike gatekeeping in civil actions, the FDA criteria are pre-stated and not devised post hoc.  The FDA’s attempt to implement evidence-based principles in the evaluation of health claims is a model that would much improve the Reference Manual on Scientific Evidence.  See Christopher Guzelian & Philip Guzelian, “Prevention of false scientific speech: a new role for an evidence-based approach,” 27 Human & Experimental Toxicol. 733 (2008).

The FDA’s evidence-based criteria need work in some areas.  For instance, the FDA’s Guidance on meta-analysis is not particularly specific or helpful:

“Research Synthesis Studies

Reports that discuss a number of different studies, such as review articles, do not provide sufficient information on the individual studies reviewed for FDA to determine critical elements such as the study population characteristics and the composition of the products used. Similarly, the lack of detailed information on studies summarized in review articles prevents FDA from determining whether the studies are flawed in critical elements such as design, conduct of studies, and data analysis. FDA must be able to review the critical elements of a study to determine whether any scientific conclusions can be drawn from it. Therefore, FDA intends to use review articles and similar publications to identify reports of additional studies that may be useful to the health claim review and as background about the substance/disease relationship. If additional studies are identified, the agency intends to evaluate them individually. Most meta-analyses, because they lack detailed information on the studies summarized, will only be used to identify reports of additional studies that may be useful to the health claim review and as background about the substance-disease relationship.  FDA, however, intends to consider as part of its health claim review process a meta-analysis that reviews all the publicly available studies on the substance/disease relationship. The reviewed studies should be consistent with the critical elements, quality and other factors set out in this guidance and the statistical analyses adequately conducted.”

FDA – Guidance for Industry: Evidence-Based Review System for the Scientific Evaluation of Health Claims – Final at 10 (2009).

The dismissal of review articles as a secondary source is welcome, but meta-analyses are quantitative reviews that can add additional insights and evidence, if methodologically appropriate, by providing a summary estimate of association, sensitivity analyses, meta-regression, etc.  The FDA’s guidance was applied in connection with the agency’s refusal to approve a health claim for vitamin C and lung cancer.  Proponents claimed that a particular meta-analysis supported their health claim, but the FDA disagreed.  The proponents sought injunctive relief in federal district court, which upheld the FDA’s decision on vitamin C and lung cancer.  Alliance for Natural Health US v. Sebelius, 786 F.Supp. 2d 1, 21 (D.D.C. 2011).  The district court found that the FDA’s refusal to approve the health claim was neither arbitrary nor capricious with respect to its evaluation of the cited meta-analysis:

“The FDA discounted the Cho study because it was a ‘meta-analysis’ of studies reflected in a review article. FDA Decision at 2523. As explained in the 2009 Guidance Document, ‘research synthesis studies’, and ‘review articles’, including ‘most meta-analyses’, ‘do not provide sufficient information on the individual studies reviewed’ to determine critical elements of the studies and whether those elements were flawed. 2009 Guidance Document at A.R. 2432. The Guidance Document makes an exception for meta-analyses ‘that review[ ] all the publicly available studies on the substance/disease relationship’. Id. Based on the Court’s review of the Cho article, the FDA’s decision to exclude this article as a meta-analysis was not arbitrary and capricious.”

Id. at 19.

The FDA’s Guidance was adequate for its task in the vitamin C/lung cancer health claim, but notably absent from the Guidance are any criteria to evaluate competing meta-analyses that do include “all the publicly available studies on the substance/disease relationship.”  The model assumptions of meta-analyses, such as fixed effect versus random effects and the absence of heterogeneity, as well as other considerations, will need to be spelled out in advance.  Still not a bad start.  Implementing evidence-based criteria in Rule 702 gatekeeping has the potential to tame the gatekeeper’s discretion.
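
To see what pre-specifying those assumptions would involve, consider a minimal sketch of the two standard summary estimators, an inverse-variance fixed-effect model and the DerSimonian-Laird random-effects model; the study estimates and variances below are fabricated for illustration:

```python
import math

def meta_summary(log_ors, variances):
    """Fixed-effect and DerSimonian-Laird random-effects summary ORs."""
    w = [1 / v for v in variances]
    fixed = sum(wi * y for wi, y in zip(w, log_ors)) / sum(w)
    # Cochran's Q and the DerSimonian-Laird between-study variance
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, log_ors))
    k = len(log_ors)
    tau2 = max(0.0, (q - (k - 1)) /
               (sum(w) - sum(wi ** 2 for wi in w) / sum(w)))
    w_re = [1 / (v + tau2) for v in variances]
    rand = sum(wi * y for wi, y in zip(w_re, log_ors)) / sum(w_re)
    return math.exp(fixed), math.exp(rand), tau2

# Five hypothetical studies: log odds ratios and their variances
fixed_or, random_or, tau2 = meta_summary(
    [0.0, 0.8, -0.2, 1.0, 0.1],
    [0.04, 0.09, 0.06, 0.16, 0.05])
print(f"fixed OR = {fixed_or:.2f}, random OR = {random_or:.2f}, "
      f"tau-squared = {tau2:.3f}")
```

With homogeneous studies, the between-study variance (tau-squared) is zero and the two summaries coincide; with heterogeneous studies, as here, the random-effects summary shifts and widens.  Which model a proponent may use is precisely the sort of choice that pre-stated criteria would force the parties to defend in advance.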

Meta-Meta-Analysis – Celebrex Litigation – The Claims – Part 2

June 25th, 2012

IMPUTATION

As I noted in part one, the tables were turned on imputation, with plaintiffs making the same accusation that G.E. made in the gadolinium litigation:  imputation involves adding “phantom events” or “imaginary events to each arm of ‘zero event’ trials.”  See Plaintiffs’ Reply Mem. of Law in Further Support of Their Motion to Exclude Expert Testimony by Defendants’ Expert Dr. Lee-Jen Wei at 8, 9 (May 5, 2010), in Securities Litig.

The plaintiffs claimed that Wei “created” an artifact of a risk ratio of 1.0 by using imputation in each of the zero-event trials.  The reality, however, is that each of those trials had a zero risk difference, and the rates of events in the drug and placebo arms were both low and equal to one another.  The plaintiffs’ claim that Wei “diluted” the risk is little more than saying that he failed to inflate the risk by excluding zero-event trials.  But zero-event trials represent a test in which the observed risk of events in both arms is equal, and relatively low.

The plaintiffs seemed to make their point half-heartedly.  They admitted that “imputation in and of itself is a commonly used methodology,” id. at 10, but they claimed that “adding zero-event trials to a meta-analysis is debated among scientists.”  Id.  A debate over methodology in the realm of meta-analysis procedures hardly makes any one of the debated procedures “not generally accepted,” especially in the context of meta-analysis of uncommon adverse events arising in clinical trials designed for other outcomes.  After all, investigators do not design trials to assess a suspected causal association between a medication and an adverse outcome as their primary outcome.  The debate over the ethics of such a trial would be much greater than any gentle debate over whether to include zero-event trials by using either the risk difference or imputation procedures.

The gravamen of the plaintiffs’ complaint against Wei seems to be that he included too many zero-event trials, “skewing the numbers greatly, and notably cites to no publications in which the dominant portion of the meta-analysis was comprised of studies with no events.”  Id.  The plaintiffs further argued that Wei could have minimized the “distortion” created by imputation by using a fractional event, “a smaller number like .000000001 to each trial.”  Id.  The plaintiffs notably cited no texts or articles for this strategy.  In any event, if the zero-event trials are small, as they typically are, then they will have large study variances.  Because meta-analyses weight each trial by the inverse of its variance, studies with large variances have little weight in the summary estimate of association.  Including small studies with imputation methods will generally not affect the outcome very much, and their contribution may well reflect the reality of lower or non-differential risk from the medication.
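
The mechanics of the disputed “imputation” are easily sketched.  Assuming the conventional continuity correction of 0.5 per cell (one common version of the procedure; the counts below are hypothetical), a zero-event trial enters the meta-analysis with an odds ratio of one and a variance so large that its inverse-variance weight is negligible:

```python
import math

def or_with_correction(a, n1, b, n2, cc=0.5):
    """Odds ratio and var(log OR) for one 2x2 trial, adding a
    continuity correction of `cc` to every cell if any cell is zero."""
    cells = [a, n1 - a, b, n2 - b]
    if 0 in cells:
        cells = [c + cc for c in cells]
    ev_t, no_t, ev_c, no_c = cells
    odds_ratio = (ev_t * no_c) / (ev_c * no_t)
    var_log_or = 1/ev_t + 1/no_t + 1/ev_c + 1/no_c
    return odds_ratio, var_log_or

# A zero-event trial: no cardiovascular events in either arm
or_, var = or_with_correction(0, 150, 0, 150)
print(f"OR = {or_:.2f}, var(log OR) = {var:.2f}, weight = {1/var:.2f}")
# OR = 1.00, var(log OR) = 4.01, weight = 0.25: the corrected trial
# contributes a null estimate with very little inverse-variance weight.
```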

Eliminating trials on the ground that they had zero events has also been criticized for throwing away important data.  Charles H. Hennekens, David L. DeMets, C. Noel Bairey Merz, Steven L. Borzak, Jeffrey S. Borer, “Doing More Harm Than Good,” 122 Am. J. Med. 315 (2009) (criticizing Nissen’s meta-analysis of rosiglitazone, which excluded zero-event trials, as biased towards overestimating the magnitude of the summary estimate of association); George A. Diamond, L. Bax, S. Kaul, “Uncertain effects of rosiglitazone on the risk for myocardial infarction and cardiovascular death,” 147 Ann. Intern. Med. 578 (2007) (conducting sensitivity analyses on Nissen’s meta-analysis of rosiglitazone to show that Nissen’s findings lost statistical significance when continuity corrections were made for zero-event trials).

 

RISK DIFFERENCE

The plaintiffs are correct that the risk difference is not the predominant risk measure used in meta-analyses, or in clinical trials for that matter.  Researchers prefer risk ratios because they reflect base rates in the ratio.  As one textbook explains:

“the limitation of the [risk difference] statistic is its insensitivity to base rates. For example, a risk that increases from 50% to 52% may be less important than one that increases from 2% to 4%, although in both instances RD = 0.02.”

Julia Littell, Jacqueline Corcoran, and Vijayan Pillai, Systematic Reviews and Meta-Analysis 85 (Oxford 2008).  This feature of the risk difference hardly makes its use unreliable, however.
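
The textbook’s arithmetic reduces to two lines; both hypothetical pairs of risks below share the same risk difference while their risk ratios diverge:

```python
for p1, p0 in [(0.52, 0.50), (0.04, 0.02)]:
    print(f"risks {p0:.0%} -> {p1:.0%}: RD = {p1 - p0:.2f}, "
          f"RR = {p1 / p0:.2f}")
# risks 50% -> 52%: RD = 0.02, RR = 1.04
# risks 2% -> 4%:   RD = 0.02, RR = 2.00
```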

Pfizer pointed out that at least one other case addressed the circumstances in which the risk difference would be superior to risk ratios in meta-analyses:

“The risk difference method is often used in meta-analyses where many of the individual studies (which are all being pooled together in one, larger analysis) do not contain any individuals who developed the investigated side effect.FN17  Whereas such studies would have to be excluded from an odds ratio calculation, they can be included in a risk difference calculation.FN18

FN17. This scenario is more likely to occur when studying a particularly rare event, such as suicide.

FN18. Studies where no individuals experienced the effect must be excluded from an odds ratio calculation because their inclusion would necessitate dividing by zero, which, as perplexed middle school math students come to learn, is impossible. The risk difference’s reliance on subtraction, rather than division, enables studies with zero incidences to remain in a meta-analysis. (Hr’g Tr. 310-11, June 20, 2008 (Gibbons.)).”

In re Neurontin Marketing, Sales Practices, and Products Liab. Litig., 612 F.Supp. 2d 116, 126 (D. Mass. 2009) (MDL 1629).  See Pfizer Defendants’ Mem. of Law in Opp. to Plaintiffs’ Motion to Exclude Expert Testimony by Defendants’ Expert Dr. Lee-Jen Wei (Sept. 8, 2009), in Securities Litig. (citing In re Neurontin).
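
The Neurontin footnote’s arithmetic can be shown directly.  A sketch with hypothetical counts: the risk difference, being a subtraction, remains defined for a trial with no events in either arm, while the uncorrected odds ratio would require division by zero:

```python
def risk_difference(a, n1, b, n2):
    """Risk difference and its usual binomial variance estimate."""
    p1, p2 = a / n1, b / n2
    rd = p1 - p2
    var = p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2
    return rd, var

# A zero-event trial remains computable for the risk difference...
print(risk_difference(0, 150, 0, 150))   # (0.0, 0.0)
# ...whereas the odds ratio (a*d)/(b*c) would divide by zero, forcing
# the trial to be dropped or continuity-corrected.  (The degenerate
# zero variance means the weighting of such trials still needs a
# convention of its own.)
```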

Pfizer also pointed out that Wei had employed both the risk ratio and the risk difference in conducting his meta-analyses, and that none of his summary estimates of association were statistically significant.  Id. at 19, 24.


EXACT CONFIDENCE INTERVALS

The plaintiffs argued that the use of “exact” confidence intervals was not scientifically reliable and could not have been used by Pfizer during the time period covered by the securities class’s allegations.  See Plaintiffs’ Reply Mem. of Law in Further Support of Their Motion to Exclude Expert Testimony by Defendants’ Expert Dr. Lee-Jen Wei at 15 (May 5, 2010).  Exact intervals, however, are hardly a novelty, and there is often no single way to calculate a confidence interval.  See E. B. Wilson, “Probable inference, the law of succession, and statistical inference,” 22 J. Am. Stat. Ass’n 209 (1927); C. Clopper & E. S. Pearson, “The use of confidence or fiducial limits illustrated in the case of the binomial,” 26 Biometrika 404 (1934).  Approximation methods are often used, despite their lack of precision, because of their ease of calculation.
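
For the curious, here is a sketch of the oldest of the exact methods, the Clopper-Pearson interval for a single binomial proportion, computed from SciPy’s beta quantiles; the event counts are hypothetical, and exact procedures for a risk difference build on the same logic:

```python
from scipy.stats import beta

def clopper_pearson(x, n, alpha=0.05):
    """Exact (Clopper-Pearson) two-sided CI for a binomial proportion,
    using the standard beta-quantile formulation."""
    lo = 0.0 if x == 0 else beta.ppf(alpha / 2, x, n - x + 1)
    hi = 1.0 if x == n else beta.ppf(1 - alpha / 2, x + 1, n - x)
    return lo, hi

# Two adverse events among 40 trial subjects
lo, hi = clopper_pearson(2, 40)
print(f"95% exact CI for 2/40: ({lo:.3f}, {hi:.3f})")
# The exact interval is wider than the normal approximation; that is
# the price of guaranteeing at least nominal coverage in small samples.
```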

Plaintiffs further claimed that the combination of risk difference and exact intervals is novel, not reliable, and not in existence during the class period.  Plaintiffs’ Reply Mem. at 15.  The plaintiffs’ argument traded on Wei’s having published on the use of exact intervals in conjunction with the risk difference for heart attacks in clinical trials of Avandia.  See L. Tian, T. Cai, M.A. Pfeffer, N. Piankov, P.Y. Cremieux, and L.J. Wei, “Exact and efficient inference procedure for meta-analysis and its application to the analysis of independent 2 x 2 tables with all available data but without artificial continuity correction,” 10 Biostatistics 275 (2009).  Their argument ignored that Wei combined two well-understood statistical techniques, in a transparent way, with empirical testing of the validity of his approach.  Contrary to plaintiffs’ innuendo, Wei did not develop his approach as an expert witness for GlaxoSmithKline; a version of the manuscript describing his approach was posted online well before he was ever contacted by GSK counsel.  (L.J. Wei, personal communication.)  Plaintiffs also claimed that Wei’s use of exact intervals for the risk difference showed no increased risk of heart attack for Avandia, contrary to a well-known meta-analysis by Dr. Steven Nissen.  See Steven E. Nissen, M.D., and Kathy Wolski, M.P.H., “Effect of Rosiglitazone on the Risk of Myocardial Infarction and Death from Cardiovascular Causes,” 356 New Engl. J. Med. 2457, 2457 (2007).  This claim, however, is a crude distortion of Wei’s paper, which showed that there was a positive risk difference for heart attacks in the same dataset used by Nissen, but that the confidence intervals included zero (no risk difference), and thus that chance could not be excluded as an explanation for Nissen’s result.

 

DURATION OF TRIALS

Pfizer was ultimately successful in defending the Celebrex litigation on the basis of a lack of risk associated with 200 mg/day use.  Pfizer also attempted to argue a duration effect on the ground that, in one large trial that saw a statistically significant hazard ratio associated with higher doses, the result occurred for the first time among trial participants on medication at 33 months into the trial.  Judge Breyer rejected this challenge, without explanation.  In re Bextra & Celebrex Marketing, Sales Practices & Prod. Liab. Litig., 524 F. Supp. 2d 1166, 1183 (N.D. Cal. 2007).  The reasonable inference, however, is that the meta-analyses showed statistically significant results across trials with less duration of use, for 400 mg and 800 mg/day use.

Clearly duration of use is a potential consideration unless the mechanism of causation is such that a causally related adverse event would occur from the first use or very short-term use of the medication.  See In re Vioxx Prods. Liab. Litig., MDL No. 1657, 414 F. Supp. 2d. 574, 579 (E.D. La. 2006) (“A trial court may consider additional factors in assessing the scientific reliability of expert testimony . . . includ[ing] whether the expert’s opinion is based on incomplete or inaccurate dosage or duration data.”).  In the Celebrex litigation, plaintiffs’ counsel appeared to want to have duration effects both ways; they did not want to disenfranchise plaintiffs whose claims turned on short-term use, but, at the same time, they criticized Professor Wei for including short-term trials of Celebrex.

One form that the plaintiffs’ criticism of Wei took was his failure to weight the trials included in his meta-analyses by duration.  In the plaintiffs’ words:

“Wei failed to utilize important information regarding the duration of the clinical trials that he analyzed, information that is critical to interpreting and understanding the Celebrex and Bextra safety information that is contained within those clinical trials.3 Because the types of cardiovascular events that are at issue in this case occur relatively rarely and are more likely to be observed after an extended period of exposure, the scientific community is in agreement that they would not be expected to appear in trials of very short duration.”

Plaintiffs’ Mem. of Law in Support of Their Motion to Exclude Expert Testimony by Defendants’ Expert Dr. Lee-Jen Wei at 2 (July 23, 2009), submitted in In re Pfizer, Inc. Securities Litig., Nos. 04 Civ. 9866(LTS)(JLC), 05 md 1688(LTS) (S.D.N.Y.) [hereafter Securities Litig.].  The plaintiffs maintained that Wei’s meta-analyses were “fatally flawed” because he ignored trial duration, such as would be factored in by performing the analyses in terms of patient-years.  Id. at 3.

Many of the sources cited by the plaintiffs do not support their argument.  For instance, the plaintiffs cited articles noting that weighted averages should be used, but virtually all methods, including Wei’s, weight studies by their variance, which takes into account sample size.  Id. at 9 n.3, citing Egger et al., “Meta-analysis: Principles and Procedures,” 315 Brit. Med. J. 1533 (1997) (an arithmetic average from all trials gives misleading results, as results from small studies are more subject to the play of chance and should be given less weight; meta-analyses use weighted results in which larger trials have more influence than smaller ones).  See also id. at 22.  True, true, and immaterial.  No one in the Celebrex cases was using an arithmetic average of risk across trials or studies.

Most of the short-term studies were small, and thus contributed little to the overall summary estimate of association.  Some of the plaintiffs’ citations actually supported using “individual patient data” in the form of time-to-event analyses, which was not possible with many of the clinical trials available.  Indeed, the article the plaintiffs cited, by Dahabreh, did not use time-to-event data for rosiglitazone, because such data were not generally available.  Id. at 9 n.3, citing Dahabreh, “Meta-analysis of rare events: an update and sensitivity analysis of cardiovascular events in randomized trials of rosiglitazone,” 5 Clinical Trials 116 (2008).

The plaintiffs’ claim was thus a fairly weak challenge to using simple 2 x 2 tables for the studies included in Wei’s meta-analysis.  Both sides failed to mention that many published meta-analyses eschew “patient years” in favor of a simple odds ratio for dichotomous count data from each included study.  See, e.g., Steven E. Nissen, M.D., and Kathy Wolski, M.P.H., “Effect of Rosiglitazone on the Risk of Myocardial Infarction and Death from Cardiovascular Causes,” 356 New Engl. J. Med. 2457, 2457 (2007) (using the Peto method with count data, for a fixed-effect model).  Patient-years would be a crude tool to modify the fairly common 2 x 2 table.  The analysis for large studies, with a high number of patient-years, would still not reveal whether the adverse events occurred early or late in the trials.  Only a time-to-event analysis could provide the missing information about “duration,” and neither side’s expert witnesses appeared to use a time-to-event analysis.
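
For reference, here is a minimal sketch of the Peto one-step method that Nissen and Wolski used for sparse count data; the three small trials below are invented.  Note that a trial with no events in either arm contributes O - E = 0 and V = 0, and so drops out of the sums on its own:

```python
import math

def peto_meta(trials, z=1.96):
    """Peto one-step summary odds ratio for sparse 2x2 count data.

    Each trial is (events_trt, n_trt, events_ctl, n_ctl).
    """
    sum_o_minus_e = sum_v = 0.0
    for a, n1, b, n2 in trials:
        n, m = n1 + n2, a + b          # total subjects, total events
        e = m * n1 / n                 # expected treatment-arm events
        v = m * (n - m) * n1 * n2 / (n**2 * (n - 1))  # hypergeometric var.
        sum_o_minus_e += a - e
        sum_v += v
    log_or = sum_o_minus_e / sum_v
    se = 1 / math.sqrt(sum_v)
    return (math.exp(log_or),
            math.exp(log_or - z * se), math.exp(log_or + z * se))

# Three hypothetical trials: (events_trt, n_trt, events_ctl, n_ctl)
or_, lo, hi = peto_meta([(3, 200, 1, 200), (2, 150, 2, 150),
                         (5, 300, 3, 300)])
print(f"Peto OR = {or_:.2f}, 95% CI ({lo:.2f}, {hi:.2f})")
```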

Interestingly, plaintiffs’ expert witness, Prof. Madigan, appears to have received the patient-level data from Pfizer’s clinical trials, but still did not conduct a time-to-event analysis.  Plaintiffs’ Mem. of Law in Support of Their Motion to Exclude Expert Testimony by Defendants’ Expert Dr. Lee-Jen Wei at 12 (July 23, 2009), in Securities Litig. (noting that Madigan had examined all SAS data files produced by Pfizer, and that “[t]hese files contained voluminous information on each subject in the trials, including information about duration of exposure to the drug (or placebo), any adverse events experienced and a wide variety of other information.”).  Of course, even with time-to-event data from the Pfizer clinical trials, Madigan had the problem of whether to limit himself to just the Pfizer trials or to use all the data, including non-Pfizer trials.  If he opted for completeness, he would have been forced to include trials for which he did not have underlying data.  In all likelihood, Madigan used patient-years in his analyses because he could not conduct a complete analysis with time-to-event data for all trials.

The plaintiffs’ point would appear well taken if the court were to assume that there really was a duration issue, but the plaintiffs’ theories were to the contrary, and Pfizer lost its attempt to limit claims to those events that appeared 33 months (or some other fixed time) after first ingestion.  It is certainly correct that patient-year analyses, in the absence of time-to-event analyses, are generally preferred.  Pfizer had used patient-year information to analyze combined trials in its submission to the FDA’s Advisory Committee.  See Pfizer’s Submission of Advisory Committee Briefing Document at 15 (January 12, 2005).  See also FDA Reviewer Guidance: Conducting a Clinical Safety Review of a New Product Application and Preparing a Report on the Review at 22 (2005); id. at 15 (“If there is a substantial difference in exposure across treatment groups, incidence rates should be calculated using person-time exposure in the denominator, rather than number of patients in the denominator.”); R. H. Friis & T. A. Sellers, Epidemiology for Public Health Practice at 105 (2008) (“To allow for varying periods of observation of the subjects, one uses a modification of the formula for incidence in which the denominator becomes person-time of observation”).
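
The person-time computation that the FDA guidance calls for is straightforward; the numbers below are hypothetical:

```python
# Incidence rates with person-time denominators, per the FDA guidance
# quoted above; all figures invented for illustration.
events_trt, py_trt = 12, 4800.0   # events, person-years in the drug arm
events_ctl, py_ctl = 5, 5200.0    # events, person-years in the comparator

rate_trt = events_trt / py_trt
rate_ctl = events_ctl / py_ctl
print(f"treatment {1000 * rate_trt:.2f} vs. control {1000 * rate_ctl:.2f} "
      f"per 1,000 person-years; rate ratio = {rate_trt / rate_ctl:.2f}")
# Person-time adjusts for unequal follow-up, but it still cannot show
# *when* events occurred; only a time-to-event analysis can do that.
```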

Professor Wei chose not to do a “patient-year” analysis because such a methodological commitment would have required him to drop over a dozen Celebrex clinical trials, involving thousands of patients and dozens of heart attack and stroke events of interest.  Madigan’s approach led him to disregard a large amount of data.  Wei could, of course, have stratified the summary estimates for clinical trials of different lengths, and analyzed whether there were differences as a function of trial duration.  Pfizer claimed that Wei conducted a variety of sensitivity analyses, but it is unclear whether he ever used this technique.  Wei should, in any event, have been allowed to take the plaintiffs at their word that thrombotic events from Celebrex occurred shortly after first ingestion.  Pfizer Mem. of Law in Opp. to Plaintiffs’ Motion to Exclude Defendants’ Expert Dr. Lee-Jen Wei at 2 (Sept. 8, 2009), in Secur. Litig.

 

MADIGAN’S META-ANALYSIS

According to Pfizer, Professor Madigan reached different results from Wei’s largely because he had used different event counts and end points.  The defendants’ challenge to Madigan turned largely upon the unreliable way he went about counting events to include in his meta-analyses.

Data concerning unexpected adverse events in clinical trials are often collected as reports from treating physicians, whose descriptions may be incomplete, inaccurate, or inadequate.  When there is a suggestion that a particular adverse event – say, heart attack – occurred more frequently in the medication arm as opposed to the placebo or comparator arms, the usual course of action is to have a panel of clinical experts review all the adverse event reports, and supporting medical charts, to provide diagnoses that can be used in more complete statistical analyses.  Obviously, the reviewers should be blinded to the patients’ assignment to medication or placebo, and the reviewers should be experts in the clinical specialty implicated by the adverse event.  Cardiologists should be making the call for heart attacks.

In addition to event definition and adjudication, clinical trial interpretation sometimes leads to the use of “composite end points,” which consist of related diagnostic categories, aggregated in some way that makes biological sense.  For instance, if the concern is that a medication causes cardiovascular thrombotic events, a suitable cardiovascular composite end point might include heart attack and ischemic stroke.  Inclusion of hemorrhagic stroke, endocarditis, and valvular disease in the composite, however, would be inappropriate, given the concern over thrombosis.

Professor Madigan is a highly qualified statistician, but, as Pfizer argued, he had no clinical expertise to reassign diagnoses or determine appropriate composite end points.  The essence of the defendants’ challenges revolved around claims of flawed outcome and endpoint ascertainment and definitions.  According to Pfizer’s briefing, the event definition process was unblinded, and conducted by inexpert, partisan reviewers.  Madigan apparently relied upon the work of another plaintiffs’ witness, cardiologist Dr. Lawrence Baruch, as well as that of Dr. Curt Furberg.  Furberg was not a cardiologist; indeed, he had never been licensed to practice medicine in the United States, and he had not treated a patient in over 30 years.  Pfizer Mem. of Law in Opp. to Plaintiffs’ Motion to Exclude Defendants’ Expert Dr. Lee-Jen Wei at 29 (Sept. 8, 2009), in Secur. Litig.  Furthermore, Furberg was not familiar with current diagnostic criteria for heart attack.  Plaintiffs’ counsel asked Furberg to rework some, but not all, of Baruch’s classifications, but only for fatal events.  Baruch could not explain why Furberg made these reclassifications.  Furberg acknowledged that he had never before used “one-line descriptions to classify events,” as he did in the Celebrex litigation, when he received the assignment from plaintiffs’ counsel on the eve of the Court’s deadline for disclosures.  Id.  According to Pfizer, if the plaintiffs’ witnesses had used appropriate end points and event counts, their meta-analyses would not have differed from Professor Wei’s work.  Id.

Pfizer pointed to Madigan’s testimony to claim that he had admitted that, based upon the impropriety of Furberg’s changing end point definitions, and his own changes, made without the assistance of a clinician, he would not submit the earlier version of his meta-analysis for peer review.  Pfizer’s [Proposed] Findings of Fact and Conclusions of Law with Respect to Motion to Exclude Certain Plaintiffs’ Experts’ Opinions Regarding Celebrex and Bextra, and Plaintiffs’ Motion to Exclude Defendants’ Expert Dr. Lee-Jen Wei, Document 175, submitted in Securities Litig. (Dec. 4, 2009), at 33, 43.  The plaintiffs countered that Furberg’s reclassifications did not change Madigan’s reports, at least for certain years.  Plaintiffs’ Reply Mem. of Law in Further Support of Their Motion to Exclude Expert Testimony by Defendants’ Expert Dr. Lee-Jen Wei at 18 (May 5, 2010), in Securities Litig.

The trial court denied Pfizer’s challenges to Madigan’s meta-analysis in the securities fraud class action.  The court attributed any weakness in the classification of fatal adverse events by Baruch and Furberg to the limitations of the underlying data created and produced by Pfizer itself.  In re Pfizer Inc. Securities Litig., 2010 WL 1047618, *4 (S.D.N.Y. 2010).

 

Composites

Pfizer also argued that Madigan put together composite outcomes that did not make biological sense in view of the plaintiffs’ causal theories.  For instance, Madigan left out strokes in his composite, although he included both heart attack and stroke in his primary end point for his Vioxx litigation analysis, and he had no reason to distinguish Vioxx and Celebrex in terms of claimed thrombotic effects.  Pfizer’s [Proposed] Findings of Fact and Conclusions of Law with Respect to Motion to Exclude Certain Plaintiffs’ Experts’ Opinions Regarding Celebrex and Bextra, and Plaintiffs’ Motion to Exclude Defendants’ Expert Dr. Lee-Jen Wei, Document 175, submitted in Securities Litig. (Dec. 4, 2009), at 13-14, 18.  According to Pfizer, Madigan’s composite was novel and unvalidated by relevant, clinical opinion.  Id. at 29, 33.

The plaintiffs’ response is obscure.  The plaintiffs seemed to claim that Madigan was justified in excluding strokes because some kinds of stroke, hemorrhagic strokes, are unrelated to thrombosis.  Plaintiffs’ Reply Memorandum of Law in Further Support of Their Motion to Exclude Expert Testimony by Defendants’ Expert Dr. Lee-Jen Wei at 14 (May 5, 2010), in Securities Litig.  This argument is undermined by the facts:  better than 85% of strokes are ischemic in origin, and even some hemorrhagic strokes begin as ischemic events.

In any event, Pfizer’s argument about Madigan’s composite end points did not gain any traction with the trial judge in the securities fraud class action:

“Dr. Madigan’s written submissions and testimony described clearly and justified cogently his statistical methods, selection of endpoints, decisions regarding event classification, sources of data, as well as the conclusions he drew from his analysis. Indeed, Dr. Madigan’s meta-analysis was based largely on data and endpoints developed by Pfizer. All four of the endpoints that Dr. Madigan used in his analysis-Hard CHD, Myocardial Thromboembolic Events, Cardiovascular Thromboembolic Events, and CV Mortality-have been employed by Pfizer in its own research and analysis. The use of Hard CHD in the relevant literature combined with the use of the other three endpoints by Pfizer in its own 2005 meta-analysis will assist the trier of fact in determining Pfizer’s knowledge and understanding of the pre-December 17, 2004, cardiovascular safety profile of Celebrex.”

In re Pfizer Inc. Securities Litig., 2010 WL 1047618, *4 (S.D.N.Y. 2010).

The Role of the Material Safety Data Sheet in Expert Witness Gatekeeping

June 23rd, 2012

Several years ago, I had an unusual workman’s compensation case, pending in Salem County, New Jersey.  The petitioner claimed that after his employment, he had developed a rare disease as a result of his workplace exposures.  The exact disease really does not matter, other than to note that there was some suggestive epidemiologic evidence of an association, and some evidence against the association.  As a result of the scientific ambiguity, the respondent had started to warn, on its material safety data sheets and on its packaging, of the association.  So when I stepped into the judge’s chambers, the first thing the petitioner’s counsel said was:  “I win; the employer has admitted causation.”  The judge looked skeptical, and when I pointed out that association is not causation, and that warnings are not statements of scientific conclusions, His Honor agreed enthusiastically.  To the petitioner’s counsel’s shock and dismay, the judge directed us to brief the reliability issue.  Things like that do not often happen in New Jersey, and especially not in a worker’s compensation case.  (A statute of limitations issue ultimately turned out to be dispositive, and we never did get a ruling on the scientific claim.)

The use of the material safety data sheet (MSDS), or of warnings on product packaging, is a recurring theme in so-called toxic tort litigation.  The MSDS is a compilation of information about a hazardous material or substance.  Under Occupational Safety and Health Administration (OSHA) regulations, the seller is required to compile information about hazards and to provide it to purchasers.  See 29 C.F.R. § 1910.1200(g) (2006).  Much of the required information is regulatory classification, which is often based upon precautionary judgments about hazards and risks, without information specific to the actual exposures likely experienced by end users.  The existence of the MSDS often provides claimants a basis to argue that the seller has admitted causation, but in reality the argument is little more than the substitution of regulatory precautionary judgment for causal assessment.

The Fifth Circuit’s recent, sure-footed decision in Johnson v. Arkema, Inc., below, reminded me how misleading and how persistent are the expert witnesses who argue causal conclusions based upon MSDS.  The Fifth Circuit recognized that an MSDS cannot be more reliable than the evidence upon which it is based.  Courts need to be on guard against the seductive argument that warnings and MSDS should substitute for scientific evidence of causation.  Unfortunately, not all judges are as astute as the judge I drew in my Salem County worker’s compensation case, which means that there is work to be done.

 

FEDERAL CASES REJECTING RELIANCE UPON MSDS

Johnson v. Arkema Inc., Slip op. at 11-12, 2012 WL ___ (5th Cir. June 20, 2012) (per curiam) (affirming exclusion of expert witnesses who relied in part upon MSDS for two chemicals when the MSDS identified only a very general physical reaction of “respiratory tract irritation,” without identifying the underlying scientific support or specifying the relevant duration and exposure needed to induce any particular adverse outcome)

Pritchard v. Dow AgroSciences, LLC, 705 F. Supp. 2d 471 (W.D. Pa. 2010) (excluding expert witness who opined that Dursban caused NHL based in part upon MSDS), aff’d, 430 Fed. Appx. 102 (3d Cir. 2011), cert. denied, 132 S. Ct. 508 (2011)

Seaman v. Seacor Marine LLC, 564 F. Supp. 2d 598, 603 (E.D. La. 2008) (rejecting reliance on MSDS because, inter alia, “the MSDS … does not mention bladder cancer … as a potential effect”), aff’d, 326 F. App’x 721, 726 (5th Cir. 2009)

Turner v. Iowa Fire Equip. Co., 229 F.3d 1202, 1209 (8th Cir. 2000) (affirming exclusion of expert witness who relied upon MSDS among other things)

Mitchell v. Gencorp Inc., 165 F.3d 778, 781 (10th Cir. 1999) (reliance upon MSDS insufficient)

Moore v. Ashland Chem. Inc., 151 F.3d 269, 278 (5th Cir. 1998) (en banc) (holding that the district court did not abuse its discretion in rejecting an expert witness’s reliance upon an MSDS because the witness “did not know what tests Dow [Corning] had conducted in generating the MSDS”)

Leake v. United States, 2011 U.S. Dist. LEXIS 149634 (E.D. Pa. 2011) (excluding expert witness who concluded that painting exposure caused liver failure based upon MSDS, etc.)

Henricksen v. ConocoPhillips Co., 605 F. Supp. 2d 1142, 1159 (E.D. Wash. 2009)(excluding expert witness who relied upon MSDS, among other items)

Moore v. P&G-Clairol, Case No. 09 C 1723, Slip op. (N.D. Ill. March 18, 2011)(excluding expert witness who relied upon MSDS to support a claim of a severe allergic reaction to Clairol hair dye; the MSDS did not address human reactions or consumer exposure levels)

STATE DECISIONS REJECTING RELIANCE UPON MSDS

OHIO

Braglin v. Lempco Indus., Inc., 2007 Ohio 1964; 2007 Ohio App. LEXIS 1773 (2007) (affirming exclusion of expert witness who relied upon MSDS among other things, and opined that plaintiff’s pancreatic cancer was caused by chemicals in raw metal processing)

TEXAS

Brookshire Bros., Inc. v. Smith, 176 S.W.3d 37-38 & n.7, 2003 WL 21756411, *4 (Tex. App. – Houston [1st Dist.] 2003, no pet.) (MSDS or warning label cannot, alone, provide the specific, detailed, reliable showing of causation to support an expert witness’s opinion)

Coastal Tankships U.S.A. Inc. v. Anderson, 87 S.W.3d 591, 611 (Tex. App. – Houston [1st Dist.] 2002, pet. denied) (statement of health “effects” in MSDS was not reliable evidence of causation)

See also Exxon Corp. v. Makofski, 116 S.W.3d 176, 187-88 (Tex. App. 2003) (“standards used by OSHA [and] the EPA” inadequate for causal determinations)

 

UNTOWARD DECISIONS

There is, of course, some factual complexity to these decisions; some MSDS obviously will list well-established causal relationships; others will not.  The key point is that an MSDS is a tertiary source, compiled from primary sources at various levels in the hierarchy of evidence, as well as from secondary reviews.  To make matters really murky, most MSDS must also report regulatory classifications and determinations, which often are not evidence-based.  Most of the “untoward” decisions, below, share typical fallacious elements:

  • treating MSDS as an admission by the manufacturer or seller;
  • confusing precautionary regulatory assessments with scientific determinations;
  • ignoring considerations of dose, exposure, route of exposure, animal species; and
  • confusing disease outcome discussed in MSDS with that claimed by plaintiff.

FEDERAL

Best v. Lowe’s Home Centers, Inc., 563 F.3d 171 (6th Cir. 2009), rev’g No. 3:04-CV-294, 2008 WL 2359986 (E.D. Tenn. June 5, 2008) (excluding expert witness who relied extensively upon MSDS)

Curtis v. M&S Petroleum, Inc., 174 F.3d 661, 669-70 (5th Cir. 1999) (reliance upon a particular MSDS was reasonable when the sheet was consistent with a body of reliable information about the hazards of benzene exposure)

Westberry v. Gislaved Gummi AB, 178 F.3d 257, 264-66 (4th Cir. 1999) (expert witness’s causation opinion of plaintiff’s sinus condition was reasonably based upon facts and data, including MSDS on talc)

McCullock v. H.B. Fuller Co., 61 F.3d 1038, 1043-44 (2d Cir.1995) (affirming denial of Rule 702 motion when expert witness’s causation opinion was based upon wide array of materials, including product’s MSDS)

Allen v. Martin Surfacing, 263 F.R.D. 47 (D. Mass. 2009) (denying challenge to expert witness who concluded that plaintiff’s exposure to neurotoxic levels of floor-surfacing chemical caused his ALS)

In re Stand ‘n Seal Prod. Liab. Litig., 1:07 MD1804-TWT, MDL 1804, Order Sur Longo (N.D. Ga. June 15, 2009)(denying motion to exclude expert witnesses who relied in part upon MSDS)

In re Welding Fume Prod. Liab. Litig., 2006 WL 4507859, *35 (N.D. Ohio 2006) (O’Malley, J.); 2005 WL 1868046, *36 (N.D. Ohio) (denying challenge to expert witnesses who claimed that welding causes Parkinson’s disease on basis of general statements in MSDS that a component of welding fume (manganese) can be neurotoxic, without specification of dose or duration)

Westley v. Ecolab, Inc., 2004 WL 1068805 (E.D. Pa. 2004) (denying motion against expert witness who relied upon MSDS)

Blandin Paper Co. v. J&J Industrial Sales, Inc., No. Civ.02-4858 ADM/RLE, 2004 WL 1946388 (D. Minn. Sept. 2, 2004) (denying motion to exclude plaintiff’s expert witness who relied in part upon MSDS)

Lentz v. Mason, Case No. 1:96-cv-02319-SMO, Slip op. (D.N.J. Jan. 11, 1999)(denying motion to exclude expert witness who relied upon an MSDS as well as independent testing)

STATE

Langness v. Fencil Urethane Systems, Inc., 2003 ND 132, 667 N.W.2d 596 (2003) (reversing jury verdict on grounds of error in excluding expert witness who relied in part upon MSDS for physical properties of material)

Johnson v. Arkema Inc. – The Fifth Circuit Proves to Be Sophisticated Consumer of Science

June 21st, 2012

Yesterday, in celebration of the first day of summer, the Fifth Circuit handed down a decision in a case that looks like a laundry list of expert witness fallacies.  Fortunately, the district judge and two of the three appellate judges kept their analytical faculties intact.  Johnson v. Arkema Inc., Slip op., 2012 WL ___ (5th Cir. June 20, 2012) (per curiam) (affirming exclusion of expert witnesses).

The plaintiff had worked in a glass bottling plant, where, on two occasions in 2007, he was in close proximity to the defendant’s ventilation hood, designed to be used with a chemical, Certincoat, composed of monobutyltin trichloride (MBTC), an organometallic compound.  Plaintiff claimed that the ventilation was inadequate and that, as a result, he was exposed to MBTC as well as to hydrochloric acid.

The plaintiff sustained some acute symptoms and ultimately was diagnosed with a “chemical pneumonia” by his treating physician.  The plaintiff further claimed that his condition progressively worsened, and that he was ultimately diagnosed with “pulmonary fibrosis,” a “severe restrictive lung disease.” The plaintiff filed reports from two expert witnesses – Richard Schlesinger, a toxicologist, and Charles Grodzin, a pulmonary physician – in support of his claim that his pulmonary fibrosis was caused by overexposure to MBTC and hydrochloric acid (HCl).

Plaintiff’s claim led to defendant’s Rule 702 challenge, which the trial court sustained, and the appellate court affirmed.

A basic problem faced by plaintiff was that there was virtually no evidence that MBTC or HCl causes pulmonary fibrosis. Undaunted, the plaintiff and his expert witnesses pushed on, but the lack of epidemiologic evidence associating MBTC or HCl with pulmonary fibrosis proved reliably harmful to plaintiff’s case.

General Acceptance

Plaintiff could point to no evidence that MBTC or HCl causes pulmonary fibrosis.  Slip op. at 7. Given the delay in manifestation of the fibrosis after the plaintiff’s rather limited, discrete exposures, the court recognized that epidemiologic evidence was important, if not essential, to plaintiff’s case. Without epidemiology, the plaintiff retreated to generalities – the chemicals cause lung irritation, lung injury, etc.  One concurring judge was taken in, but the majority of the panel saw through the dodge.

Anecdotal Evidence

Without epidemiologic evidence, the plaintiff invoked anecdotal evidence that other employees sustained similar lung injuries. The problem, however, for even this low-level evidence was that other employees experienced only transitory symptoms, which quickly resolved.  Id. at 4-5, 27.

Post Hoc, Ergo Propter Hoc

Focusing only on himself as an anecdote with n = 1, the plaintiff and his expert witnesses argued that the temporal sequence of his exposure and his pulmonary fibrosis was itself evidence of causality.  Neither the trial court nor the appellate court found this much of an argument.  Id. at 16 n.13, 18.

Mechanism in Search of Data – Schlesinger’s irritant theory

Schlesinger argued that both MBTC and HCl are pulmonary irritants, which can cause inflammation, and that pulmonary fibrosis results from inflammation. Id. at 8.  True, but not all irritants cause pulmonary fibrosis.  Chronicity and dose are important considerations.  Whether these chemicals, under the exposure conditions experienced by plaintiff, were capable of causing pulmonary fibrosis was a question that cried out for evidence.

The Material Safety Data Sheets (MSDS)

The plaintiff argued that the MSDS for HCl established that this chemical was “severely corrosive to the respiratory system.” Id. at 11-12.  The defendant’s own MSDS for MBTC stated that MBTC “causes respiratory tract irritation.” Id. at 16.  The courts saw these arguments as transparent substitutes for absent evidence.  None of the MSDS identified pulmonary fibrosis; nor did they specify (1) the underlying scientific support, or (2) the relevant duration and exposure needed to induce any particular adverse outcome.

Animal Studies

For both MBTC and HCl, plaintiff adverted to animal studies, but the courts found that the animal studies failed to support the plaintiff’s expert witnesses’ opinions and the plaintiff’s claims.  The studies were readily distinguishable in terms of dose, duration, and disease outcome.  In particular, none of the studies showed that the chemicals caused pulmonary fibrosis. Id. at 7, 12 (baboon study of HCl showed impairment but not fibrosis at 10,000 ppm for one year, quite unlike plaintiff’s exposure), 16-17 (rat inhalation study of MBTC, six hrs/day, five days/wk, up to 30 mg/m3, with toxicity but no mention of lung fibrosis).

Regulatory Limits

Plaintiff argued that HCl levels were multiples of the OSHA limits, but the courts would not credit regulatory exposure limits as evidence of harmfulness, because of the precautionary nature of many regulations.  Id. at 14.  Furthermore, the disease outcome of regulatory concern did not appear to be pulmonary fibrosis for the chemicals involved.

Res Ipsa Loquitur

The plaintiff argued that causation was a matter of common sense and general experience.  Even if his expert witnesses did not have valid, reliable evidence, the jury could make the causal determination without scientific evidence. Id. at 26.  Rejected.

Chemical Analogies

The defendant’s expert witness acknowledged that tin oxide can cause pulmonary fibrosis.  Id. at 28.  This admission, however, came without any qualification about what exposure or duration data might be needed to support a conclusion about specific causation in the plaintiff.  Id.  Furthermore, tin pneumoconiosis, or stannosis, is known as a benign lung disease, unassociated with impairment or disability.  Like simple silicosis, stannosis is a picture change on chest radiograph, without diminution of performance on pulmonary function tests.  Agency for Toxic Substances and Disease Registry, A Toxicological Profile for Tin and Tin Compounds at 30 (2005).

Differential Diagnosis

Plaintiff’s pulmonary expert witness, Dr. Grodzin, tried to bootstrap specific causation by assuming general causation and placing it among the “differentials” for him to embrace.  Id. at 19.  This is a fallacious form of reasoning, but the courts here were on top of it.

* * * * *

The panel did reverse the trial court’s grant of summary judgment.  The gate had closed a little too quickly on plaintiff’s claim of acute injuries and symptoms, which was less dependent upon epidemiologic evidence.

 

Meta-Meta-Analysis – Celebrex Litigation – The Claims – Part One

June 21st, 2012

In the Celebrex/Bextra litigation, both sides acknowledged the general acceptance and validity of meta-analysis, for both observational studies and clinical trials, but attacked the other side’s witnesses’ meta-analyses on grounds specific to how they were conducted.  See, e.g., Pfizer Defendants’ Motion to Exclude Certain Plaintiffs’ Experts’ Causation Opinion Regarding Celebrex – Memorandum of Points and Authorities in Support Thereof at 14, 16 (describing meta-analysis as “appropriate” and a “useful way to evaluate the presence and consistency of an effect,” and “a valid technique for analyzing the results of both randomized clinical trials and observational studies”)(dated July 20, 2007), submitted in MDL 1699, In re Bextra and Celebrex Marketing Sales Practices & Prod. Liab. Litig., Case No. 05-CV-01699 CRB (N.D. Calif.) [hereafter MDL 1699]; Plaintiffs’ Memorandum of Law in Support of Their Motion to Exclude Expert Testimony by Defendants’ Expert Dr. Lee-Jen Wei at 2 (July 23, 2009) (“While use of a properly conducted meta-analysis is appropriate, there are underlying scientific principles and techniques to be used in meta-analysis that are widely accepted among biostatisticians and epidemiologists. Wei’s meta-analysis – which he acknowledges is based in part on an admittedly novel approach that is not generally recognized by the scientific community – fails to follow certain of these key principles.”), submitted in In re Pfizer, Inc. Securities Litig., Nos. 04 Civ. 9866(LTS)(JLC), 05 md 1688(LTS) (S.D.N.Y.)[hereafter Securities Litig.]

The plaintiffs and defendants expended a great deal of energy in attacking the other side’s meta-analyses as conducted.  With all the briefing in the federal MDL, the New York state cases, and the securities fraud class action, hundreds of pages were written on the suspected flaws in meta-analyses.  The courts, in both the products liability MDL cases and in the securities case, denied the challenges in a few sentences.  Indeed, it is difficult if not impossible to discern what the challenges were from reading the courts’ decisions. In re Pfizer Inc. Securities Litig., 2010 WL 1047618 (S.D.N.Y. 2010); In re Bextra and Celebrex, 2008 N.Y. Misc. LEXIS 720; 239 N.Y.L.J. 27 (2008); In re Bextra and Celebrex Marketing Sales Practices and Product Liability Litig., MDL No. 1699, 524 F. Supp. 2d 1166 (N.D. Calif. 2007).

Although the issues shifted somewhat over the course of these litigations, certain important themes recurred.  The plaintiffs focused their attack upon the meta-analyses conducted by defense expert witness Lee-Jen Wei, a professor of biostatistics at the Harvard School of Public Health.

The plaintiffs maintained that Professor Wei’s meta-analyses should be excluded under Rule 702, or under New York case law, because of:

  • inclusion of short-term clinical trials
  • failure to weight risk ratios by person years
  • inclusion of zero-event trials with use of imputation methods
  • use of risk difference instead of risk ratios
  • use of exact confidence intervals instead of estimated intervals

See generally Plaintiffs’ Memorandum of Law in Support of Their Motion to Exclude Expert Testimony by Defendants’ Expert Dr. Lee-Jen Wei (July 23, 2009), in Securities Litig.

The plaintiffs advanced meta-analyses conducted by Professor David Madigan, Professor and Chair in the Department of Statistics, Columbia University.  The essence of the defendants’ challenges revolved around claims of flawed outcome and endpoint ascertainment and definitions:

  • invalid clinical endpoints
  • flawed data collection procedures
  • ad hoc changes in procedure and methods
  • novel methodologies “never used in the history of clinical research”
  • lack of documentation for classifying events
  • absence of expert clinical judgment in classifying event for inclusion in meta-analysis
  • creation of composite endpoints that included events unrelated to plaintiffs’ theory of thrombotic mechanism
  • lack of blinding to medication use when categorizing events
  • failure to adjust for multiple comparisons in meta-analyses

See generally Pfizer Defendants’ Motion to Exclude Certain Plaintiffs’ Experts’ Causation Opinion Regarding Celebrex – Memorandum of Points and Authorities in Support Thereof (dated July 20, 2007), in MDL 1699; Pfizer defendants’ [Proposed] Findings of Fact and Conclusions of Law with Respect to Motion to Exclude Certain Plaintiffs’ Experts’ Opinions Regarding Celebrex and Bextra, and Plaintiffs’ Motion to Exclude Defendants’ Expert Dr. Lee-Jen Wei, Document 175, submitted in Securities Litig. (Dec. 4, 2009).

Why did the three judges involved (Judge Breyer in the federal MDL; Justice Kornreich in the New York state cases; and Judge Swain in the federal securities putative class action) give such cursory attention to these Rule 702/Frye challenges?  The complexity of the issues, the lack of clarity in the lawyers’ briefings, and the stridency of both sides perhaps contributed to shortened judicial attention spans.  Some of the claims were simply untenable, and may have obscured more telling critiques.

ZERO-EVENT TRIALS

Many of the Celebrex parties’ claims can be traced to a broader issue of what to include or exclude in a meta-analysis.  Consider for instance the plaintiffs’ challenge to Wei’s meta-analysis.  The plaintiffs faulted Wei for including short-term clinical trials in his meta-analysis, while sponsoring their own expert witness testimony that Celebrex could induce heart attack or stroke after first ingestion of the medication.  Having made that claim, the plaintiffs were hard pressed to exclude short-term trials, other than to argue that such trials frequently had zero adverse events in either the medication or placebo arms.  Many meta-analytic methods, which treat each included study as a 2 x 2 contingency table and calculate an odds ratio for each table, cannot accommodate zero-event data.

Whether or not hard pressed, the plaintiffs made the claim. The plaintiffs analogized to the unreliability of underpowered clinical trials as evidence of safety.  See Plaintiffs’ Reply Memorandum of Law in Further Support of Their Motion to Exclude Expert Testimony by Defendants’ Expert Dr. Lee-Jen Wei at 6 (May 5, 2010), in Securities Litig. (citing In re Neurontin Mktg., Sales Practices, and Prod. Liab. Litig., 612 F. Supp. 2d 116, 141 (D. Mass. 2009) (noting that many of Pfizer’s studies were “underpowered” to detect the alleged connection between Neurontin and suicide)).  The power argument, however, does not make sense in the context of a meta-analysis, which aggregates data across studies precisely to overcome the alleged lack of power in any single study.
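The point can be illustrated with a toy simulation (all event rates, trial sizes, and counts below are hypothetical, and pooling raw patient counts is a simplification of what a meta-analysis actually does): trials individually too small to detect a modest excess risk will usually miss it, while the aggregated data recover it.

```python
# Toy simulation: ten small trials, each underpowered for a doubled
# risk (4% vs. 2%), versus a pooled analysis of the same data.
import random
from math import sqrt

random.seed(2)

def run_trial(n, p_treat, p_control):
    """Simulate one trial with n patients per arm; return event counts."""
    events_t = sum(random.random() < p_treat for _ in range(n))
    events_c = sum(random.random() < p_control for _ in range(n))
    return events_t, events_c

def significant(events_t, events_c, n):
    """Two-proportion z-test (normal approximation), alpha = 0.05."""
    p = (events_t + events_c) / (2 * n)
    if p in (0.0, 1.0):
        return False
    se = sqrt(2 * p * (1 - p) / n)
    return abs(events_t / n - events_c / n) / se > 1.96

trials = [run_trial(200, 0.04, 0.02) for _ in range(10)]
print(sum(significant(et, ec, 200) for et, ec in trials))  # few, if any, "hits"
pooled_t, pooled_c = map(sum, zip(*trials))
print(significant(pooled_t, pooled_c, 2000))               # usually True
```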

Not surprisingly, clinical trials of a non-cardiac medication will often report no events of the outcome of interest, such as heart attack.  Such trials are referred to as “zero-event” trials; the zero can occur in one or both arms of a given trial.  Some researchers exclude these studies from a meta-analysis because of the impossibility of calculating an odds ratio without using imputation in the zero cells of the 2 x 2 tables. Although there are methods to address zero-event trials, some researchers believe that the existence of several zero-event trials essentially means that the sparse data from rare outcomes deprive statistical tests of their usual meaning.  Traditional statistical standards of significance (p < 0.05) are described as “tenuous,” and too high, in this situation. A.V. Hernandez, E. Walker, J.P. Ioannidis, M.W. Kattan, “Challenges in meta-analysis of randomized clinical trials for rare harmful cardiovascular events: the case of rosiglitazone,” 156 Am. Heart J. 23, 28 (2008).

The exclusion of zero-event trials from meta-analyses of rare outcomes can yield biased results. See generally M.J. Bradburn, J.J. Deeks, J.A. Berlin, and A. Russell Localio, “Much ado about nothing: a comparison of the performance of meta-analytical methods with rare events,” 26 Statistics in Med. 53 (2007); M.J. Sweeting, A.J. Sutton, and P.C. Lambert, “What to add to nothing? Use and avoidance of continuity corrections in meta-analysis of sparse data,” 23 Statistics in Med. 1351 (2004) (erratum at 25 Statistics in Med. 2700 (2006)) (“Many routinely used summary methods provide widely ranging estimates when applied to sparse data with high imbalance between the size of the studies’ arms. A sensitivity analysis using several methods and continuity correction factors is advocated for routine practice.”).

Other researchers include zero-event trials as providing helpful information about the absence of risk. Zero-event trials:

“provide relevant data by showing that event rates for both the intervention and control groups are low and relatively equal. Excluding such trial data potentially increases the risk of inflating the magnitude of the pooled treatment effect.”

J.O. Friedrich, N.K. Adhikari, J. Beyene, “Inclusion of zero total event trials in meta-analyses maintains analytic consistency and incorporates all available data,” 5 BMC Med. Res. Methodol. 2 (2007) [cited as Friedrich].  Zero-event trials can be included in meta-analyses by using a standard “continuity correction,” which involves imputing events, or fractional events, in the cells of the 2 x 2 table. In one common approach, each zero is replaced with 0.5, and all other cell counts are increased by 0.5. Friedrich at 7.
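A minimal sketch of the arithmetic, with hypothetical counts, shows both the problem and the textbook fix described by Friedrich:

```python
# Why a zero cell breaks the odds ratio, and what the 0.5 continuity
# correction does about it.  Table layout: a, b = events and non-events
# in the treated arm; c, d = events and non-events in the control arm.

def odds_ratio(a, b, c, d):
    if 0 in (a, b, c, d):
        # the standard continuity correction: add 0.5 to every cell
        a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
    return (a * d) / (b * c)

# hypothetical trial: 1 event among 200 treated, 0 among 200 controls
print(odds_ratio(1, 199, 0, 200))   # ~3.02 after correction
# without the correction, (1 * 200) / (199 * 0) would divide by zero
```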

After examining the bias in several meta-analyses from excluding zero-event trials, Friedrich and colleagues recommended:

“We believe these trials [with zero events] should also be included if RR [relative risks] or OR [odds ratios] are the effect measures to provide a more conservative estimate of effect size (even if this change in effect size is very small for RR and OR), and to provide analytic consistency and include the same number of trials in the meta-analysis, regardless of the summary effect measure used. Inclusion of zero total event trials would enable the inclusion of all available randomized controlled data in a meta-analysis, thereby providing the most generalizable estimate of treatment effect.”

Friedrich at 5-6.

Wei addressed the problem of zero-event trials by using common imputation methods, not so different from what plaintiffs’ expert witness Dr. Ix used in the gadolinium litigation. See Meta-Meta-Analysis — The Gadolinium MDL — More Than Ix’se Dixit.  Given that plaintiffs advanced a mechanistic theory, which would explain cardiovascular thrombotic events almost immediately upon first ingestion of Celebrex, Professor Wei’s attempt to save the data inherent in zero-event trials by “continuity correction” or imputation methods seems reasonable and well within meta-analytic procedures.

 

RISK DIFFERENCE

Professor Wei did not limit himself to a single method or approach.  In addition to using imputation methods, Wei used the risk difference, rather than risk ratios, as the parameter of interest.  The risk difference is simply the difference between two risks: the risk or probability of an event in one group less the risk or probability of that event in another group.  Contrary to the plaintiffs’ claims, there is nothing novel or subversive about conducting a meta-analysis with the risk difference as the parameter of interest, rather than a risk ratio.  In the context of randomized clinical trials, the risk difference is a familiar measure of absolute effect.  See generally Michael Borenstein, L.V. Hedges, J.P.T. Higgins, and H.R. Rothstein, Introduction to Meta-Analysis (2009); Julian P.T. Higgins and Sally Green, eds., Cochrane Handbook for Systematic Reviews of Interventions (2008).
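The contrast is simple arithmetic, as the sketch below shows with hypothetical counts; note that the risk difference remains defined even when the control arm has zero events, which is precisely the sparse-data situation at issue:

```python
# Risk difference (absolute measure) versus risk ratio (relative measure).

def risk_measures(events_t, n_t, events_c, n_c):
    r_t, r_c = events_t / n_t, events_c / n_c
    risk_difference = r_t - r_c
    risk_ratio = r_t / r_c if r_c > 0 else None   # undefined without imputation
    return risk_difference, risk_ratio

print(risk_measures(2, 500, 1, 500))   # (0.002, 2.0)
print(risk_measures(1, 500, 0, 500))   # (0.002, None) -- RD survives; RR does not
```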

Like the risk ratio, the risk difference yields a calculated confidence interval at any desired coefficient of confidence.  Confidence intervals for dichotomous events are often based upon approximate methods that build upon the normal approximation to the binomial distribution.  These approximate methods require sample-size assumptions that may not be met in cases involving sparse data.  With modern computers, calculating exact confidence intervals is not particularly difficult, and Professor Wei has published a methods paper in which he explains the desirability of using the risk difference with exact intervals in addressing meta-analyses of sparse data, such as was involved in the Celebrex litigation.  See L. Tian, T. Cai, M.A. Pfeffer, N. Piankov, P.Y. Cremieux, and L.J. Wei, “Exact and efficient inference procedure for meta-analysis and its application to the analysis of independent 2 x 2 tables with all available data but without artificial continuity correction,” 10 Biostatistics 275 (2009).
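To see why exact methods matter for sparse data, consider the simplest case of a single proportion (a simplification; the exact procedure for risk differences in the Tian-Wei paper is considerably more involved). The sketch assumes scipy 1.7 or later, and the counts are hypothetical:

```python
# Wald (normal-approximation) versus exact (Clopper-Pearson) intervals
# for one event observed in 500 patients.
from math import sqrt
from scipy.stats import binomtest

k, n = 1, 500
p = k / n
half_width = 1.96 * sqrt(p * (1 - p) / n)
print(p - half_width, p + half_width)   # lower bound is negative: impossible
ci = binomtest(k, n).proportion_ci(confidence_level=0.95, method="exact")
print(ci.low, ci.high)                  # stays within [0, 1]
```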

Plaintiffs attacked Wei’s approach as “novel” and not generally accepted.  Judge Swain appropriately dismissed this attack:

“Dr. Wei’s methodology, the validity of which Plaintiffs contest and the novelty of which Plaintiffs seek to highlight, appears to have survived the rigors of peer review at least once, and is subject to critique by virtue of its transparency. Dr. Wei’s report, supplemented by his declaration, is sufficient to meet Defendants’ burden of demonstrating that his testimony is the product of reliable principles and methods. He has explained his methods, which can be tested. Plaintiffs’ critiques of Dr. Wei’s choices regarding which trials to include in his own meta-analysis, the origins of the data he used, the date at which he undertook his meta-analysis, and at whose behest he performed his analysis all go to the weight of Dr. Wei’s testimony.”

In re Pfizer Inc. Securities Litig., 2010 WL 1047618, *7 (S.D.N.Y. 2010).  The approach taken by Wei is novel only in the sense that researchers have not previously tried to push the methodological envelope of meta-analysis to deploy the technique for rare outcomes and sparse data, with many zero-event trials.  The risk difference approach is well suited to the situation, and the use of exact confidence intervals is hardly novel or dubious.

The Cherry-Picking Fallacy in Synthesizing Evidence

June 15th, 2012

What could be wrong with picking cherries?  At the end of the process you have cherries, and if you do it right, you have all ripe, and no rotten, cherries.  Your collection of ripe cherries, however, will be unrepresentative of the universe of cherries, but at least we understand how and why your cherries were selected.

Elite colleges pick the best high school students; leading law schools pick the top college students; and top law firms and federal judges cherry pick the best students of the best law schools.  Lawyers are all-too-comfortable with “cherry picking.”  Of course, the cherry-picking process here has at least some objective criteria, which can be stated in advance of the selection.

In litigation, each side is expected to “cherry pick” the favorable evidence, and ignore or flyblow the contrary evidence.  Judges are thus often complacent about selectivity in the presentation of evidence by parties and their witnesses.  In science, this kind of adversarial selectivity is a sure way to inject bias and subjectivity into claims of knowledge.  The development of the systematic review, in large measure, has been supported by the widespread recognition that studies cannot be evaluated with post hoc, subjective evaluative criteria. Cynthia D. Mulrow, Deborah J. Cook, Frank Davidoff, “Systematic Reviews: Critical Links in the Great Chain of Evidence,” 126 Ann. Intern. Med. 389 (1997).

The Internet Encyclopedia of Philosophy describes “cherry picking” as a fallacy, “a kind of error in reasoning.”  Cherry-picking the evidence, also known as “suppressed evidence,” is:

“[i]ntentionally failing to use information suspected of being relevant and significant is committing the fallacy of suppressed evidence. This fallacy usually occurs when the information counts against one’s own conclusion. * * * If the relevant information is not intentionally suppressed but rather inadvertently overlooked, the fallacy of suppressed evidence also is said to occur, although the fallacy’s name is misleading in this case.”

Bradley Dowden, “Suppressed Evidence,” Internet Encyclopedia of Philosophy (last updated Dec. 31, 2010).

Cherry picking is a main rhetorical device for the litigator, and many judges simply do not understand what is so wrong with each side’s selection of the studies that it wishes to emphasize.  Whatever the acceptability of lawyers’ cherry picking in the presentation of evidence, it is antithetical to scientific methodology.  See “Cherry picking (fallacy),” Wikipedia (describing cherry picking as pointing to data that appear to confirm one’s opinion, while ignoring contradictory data) [last visited June 14, 2012].

Given the pejorative connotations of “cherry picking,” no one should be very surprised that lawyers and judges couch their Rule 702 arguments and opinions in terms of whether expert witnesses engaged in this fruitful behavior.  Although I had heard plaintiffs’ and defendants’ counsel use the phrase, I only recently came across it in a judicial opinion.  Since the phrase nicely describes a fallacious form of reasoning, I thought it would be helpful to collect pertinent cases that describe the fallaciousness of fruit-pickin’ expert witness testimony.

United States Court of Appeals

Barber v. United Airlines, Inc., 17 Fed.Appx. 433, 437 (7th Cir. 2001) (affirming exclusion of “cherry-picking” expert witness who failed to explain why he ignored certain data while accepting others)

District Courts

Dwyer v. Sec’y of Health & Human Servs., No. 03-1202V, 2010 WL 892250 (Fed. Cl. Spec. Mstr. Mar. 12, 2010)(recommending rejection of thimerosal autism claim)(“In general, respondent’s experts provided more responsive answers to such questions.  Respondent’s experts were generally more careful and nuanced in their expert reports and testimony. In contrast, petitioners’ experts were more likely to offer opinions that exceeded their areas of expertise, to “cherry-pick” data from articles that were otherwise unsupportive of their position, or to draw conclusions unsupported by the data cited… .”)

In re Bausch & Lomb, Inc., 2009 WL 2750462, at *13 (D.S.C. 2009) (“Dr. Cohen did not address [four contradictory] studies in her expert reports or affidavit, and did not include them on her literature reviewed list [. . .] This failure to address this contrary data renders plaintiffs’ theory inherently unreliable.”)

Rimbert v. Eli Lilly & Co., No. 06-0874, 2009 WL 2208570, *19 (D.N.M. July 21, 2009) (“Even more damaging . . . is her failure to grapple with any of the myriad epidemiological studies that refute her conclusion.”), aff’d, 647 F.3d 1247 (10th Cir. 2011) (affirming exclusion but remanding to permit plaintiff to find a new expert witness)

In re Bextra & Celebrex Prod. Liab. Litig., 524 F. Supp.2d 1166, 1176, 1179, 1181, 1184 (N.D. Cal. 2007) (criticizing plaintiffs’ expert witnesses for “cherry-picking studies”); id. at 1181 (“these experts ignore the great weight of the observational studies that contradict their conclusion and rely on the handful that appear to support their litigation-created opinion.”)

LeClerq v. Lockformer Co., No. 00 C 7164, 2005 U.S. Dist. LEXIS 7602, at *15 (N.D. Ill. Apr. 28, 2005) (holding that expert witness’s “cherry-pick[ing] the facts he considered to render his opinion, and such selective use of facts fail[s] to satisfy the scientific method and Daubert.”)(internal citations and quotations omitted)

Holden Metal & Aluminum Works v. Wismarq Corp., No. 00 C 0191, 2003 WL 1797844, at *2 (N.D. Ill. Apr. 2, 2003).

State Courts

Betz v. Pneumo Abex LLC, 2012 WL 1860853, *16 (Pa. May 23, 2012) (“According to Appellants, moreover, the pathologist’s self-admitted selectivity in his approach to the literature is decidedly inconsistent with the scientific method. Accord Brief for Amici Scientists at 17 n.2 (“‘Cherry picking’ the literature is also a departure from ‘accepted procedure’.”)).

George v. Vermont League of Cities and Towns, 2010 Vt. 1, 993 A.2d 367, 398 (Vt. 2010)(expressing concern about how and why plaintiff’s expert witnesses selected some studies to include in their “weight of evidence” methodology.  Without an adequate explanation of selection and weighting criteria, the choices seemed arbitrary)

Scaife v. AstraZeneca LP, 2009 WL 1610575 at 8 (Del. Super. 2009) (“Simply stated, the expert cannot accept some but reject other data from the medical literature without explaining the bases for her acceptance or rejection.”)

In re Bextra & Celebrex, 2008 N.Y. Misc. LEXIS 720, *20, 239 N.Y.L.J. 27 (2008) (holding that New York’s Frye rule requires proponent to show that its expert witness had “look[ed] at the totality of the evidence and [did] not ignore contrary data.”); see also id. at *36 (“Moreover, out of 32 studies (29 published) cited by defendants, plaintiffs chose only 8 to plead their case.  This smacks of ‘cherry-picking,’ skewing their analysis by only looking at the helpful studies. Such practice contradicts the accepted method for an expert’s analysis of epidemiological data.”)

Bowen v. E.I. DuPont de Nemours & Co., 906 A.2d 787, 797 (Del. 2006) (noting that expert witnesses cannot ignore studies contrary to their opinions)

Selig v. Pfizer, Inc., 185 Misc. 2d 600, 607, 713 N.Y.S.2d 898 (Sup. Ct. N.Y. Cty. 2000) (holding that expert witness failed to satisfy Frye test’s requirement of following an accepted methodology when he ignored studies contrary to his opinion), aff’d, 290 A.D.2d 319, 735 N.Y.S.2d 549 (1st Dep’t 2002)

******************

Most, but not all, of the case law recognizes the fallacy of an expert witness’s engaging in ad hoc selectivity in addressing the studies upon which to rely.  In the following cases, the cherry picking was identified, but acquiesced in, by the judges.

McClellan v. I-Flow Corp., 710 F. Supp. 2d 1092, 1114 (D. Ore. 2010) (discussing cherry picking but rejecting “document by document” review) (“Finally, defendants contend that plaintiffs’ experts employ unreliable methodologies by ‘cherry-picking’ facts from certain studies and asserting reliance on the ‘totality’ or ‘global gestalt of medical evidence’. Defendants argue that in doing so, plaintiffs’ experts fail to ‘painstakingly’ link each piece of data to their conclusions or explain how the evidence supports their opinions.”)

United States v. Paracha, 2006 WL 12768 (S.D.N.Y. Jan. 3, 2006) (rejecting challenge to terrorism expert on grounds that he cherry picked evidence in conspiracy prosecution involving al Qaeda)

King v. Burlington No. Santa Fe Ry., ___ N.W.2d ___, 277 Neb. Reports 203, 234 (2009) (noting that the law does “not preclude a trial court from considering as part of its reliability inquiry whether an expert has cherry-picked a couple of supporting studies from an overwhelming contrary body of literature,” but ignoring the force of the fallacious expert witness testimony by noting that the questionable expert witness (Frank) had some studies that showed associations between exposure to diesel exhaust or benzene and multiple myeloma).

Another Confounder in Lung Cancer Occupational Epidemiology — Diesel Engine Fumes

June 13th, 2012

Researchers obviously need to be aware of, and control for, potential and known confounders.  In the context of investigating the etiologies of lung cancer, there is a long list of potential confounding exposures, often ignored in peer-reviewed papers, which focus on one particular outcome of interest.  Just last week, I wrote to emphasize the need to account for potential and known confounding agents, and how this need was particularly strong in studies of weak alleged carcinogens such as crystalline silica.  See Sorting Out Confounded Research – Required by Rule 702.  Yesterday, the World Health Organization (WHO) added another “known” confounder for lung cancer epidemiology: diesel engine fumes.

According to the International Agency for Research on Cancer (IARC), a division of the WHO, a working group of international experts voted to reclassify diesel engine exhaust as a “Group 1” carcinogen.  IARC: Diesel engines exhaust carcinogenic (2012).  This classification means, in IARC parlance, that “there is sufficient evidence of carcinogenicity in humans. Exceptionally, an agent may be placed in this category when evidence of carcinogenicity in humans is less than sufficient but there is sufficient evidence of carcinogenicity in experimental animals and strong evidence in exposed humans that the agent acts through a relevant mechanism of carcinogenicity.”  The Working Group was headed by Dr. Christopher Portier, the director of the National Center for Environmental Health and the Agency for Toxic Substances and Disease Registry at the Centers for Disease Control and Prevention.  Id.

The reclassification removes diesel exhaust from its previous categorization as a Group 2A carcinogen, interpreted as “probably carcinogenic to humans.”  Diesel exhaust had been on a high-priority list for re-evaluation since 1998, as a result of epidemiologic research from many countries.  The Working Group specifically found that there was sufficient evidence to conclude that diesel exhaust is a cause of lung cancer in humans, and limited evidence to support an association with bladder cancer.  The Group rejected any change in the classification of gasoline engine exhaust from its current IARC rating as “possibly carcinogenic to humans” (Group 2B).

Unlike other IARC Working Group decisions (such as that on crystalline silica), which were weakened by close votes and significant dissents, the diesel Group’s conclusion was unanimous.  The diesel Group appeared to be impressed by two recent studies of lung cancer in underground miners, released in March 2012.  One was a large cohort study, conducted by NIOSH; the other was a nested case-control study, conducted by the National Cancer Institute (NCI).  See Debra T. Silverman, Claudine M. Samanic, Jay H. Lubin, Aaron E. Blair, Patricia A. Stewart, Roel Vermeulen, Joseph B. Coble, Nathaniel Rothman, Patricia L. Schleiff, William D. Travis, Regina G. Ziegler, Sholom Wacholder, Michael D. Attfield, “The Diesel Exhaust in Miners Study: A Nested Case-Control Study of Lung Cancer and Diesel Exhaust,” J. Nat’l Cancer Instit. (2012) (in press and open access); and Michael D. Attfield, Patricia L. Schleiff, Jay H. Lubin, Aaron Blair, Patricia A. Stewart, Roel Vermeulen, Joseph B. Coble, and Debra T. Silverman, “The Diesel Exhaust in Miners Study: A Cohort Mortality Study With Emphasis on Lung Cancer,” J. Nat’l Cancer Instit. (2012) (in press).

According to a story in the New York Times, the IARC Working Group described diesel engine exhaust as “more carcinogenic than secondhand cigarette smoke.”  Donald McNeil, “W.H.O. Declares Diesel Fumes Cause Lung Cancer,” N.Y. Times (June 12, 2012).  The Times also quoted Dr. Debra Silverman, NCI chief of environmental epidemiology, at length.  Dr. Silverman, the lead author of the nested case-control study cited in the IARC press release, noted that her large study showed that long-term heavy exposure to diesel fumes increased lung cancer risk sevenfold.  Dr. Silverman described this risk as much greater than that thought to be created by passive smoking, but much smaller than that from smoking two packs of cigarettes a day.  She stated that she “totally” supported the IARC reclassification, and that she believed that governmental agencies would use the IARC analysis as the basis for changing the regulatory classification of diesel exhaust.

Silverman’s nested case-control study appears to have been based upon careful diesel exhaust exposure information, as well as smoking histories.  The study also searched for, and analyzed, other potential confounders that might be expected to be involved in underground mining:

“Other potential confounders [ie, duration of cigar smoking; frequency of pipe smoking; environmental tobacco smoke; family history of lung cancer in a first-degree relative; education; body mass index based on usual adult weight and height; leisure time physical activity; diet; estimated cumulative exposure to radon, asbestos, silica, polycyclic aromatic hydrocarbons (PAHs) from non-diesel sources, and respirable dust in the study facility based on air measurement and other data (14)] were evaluated but not included in the final models because they had little or no impact on odds ratios (ie, inclusion of these factors in the final models changed point estimates for diesel exposure by ≤ 10%).”

Silverman, et al., at 4.  The absence of an association between lung cancer and silica exposure is noteworthy in such a large study of underground miners.
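The bracketed “≤ 10%” remark refers to the familiar change-in-estimate screen for confounders. Here is a minimal sketch of how that screen operates, using a Mantel-Haenszel adjusted odds ratio and wholly hypothetical counts:

```python
# Compare the crude odds ratio with a Mantel-Haenszel odds ratio
# adjusted for a stratifying factor; retain the factor as a confounder
# only if the estimate moves by more than 10%.
# Table layout per stratum: a, b = cases and controls among the exposed;
# c, d = cases and controls among the unexposed.

def mh_odds_ratio(strata):
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den

strata = [(40, 60, 20, 80), (10, 90, 5, 95)]            # hypothetical
crude = mh_odds_ratio([tuple(map(sum, zip(*strata)))])  # collapse strata
adjusted = mh_odds_ratio(strata)
change = abs(crude - adjusted) / crude
print(crude, adjusted, change)   # change < 0.10 here, so the factor is dropped
```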

Meta-Meta-Analysis — The Gadolinium MDL — More Than Ix’se Dixit

June 8th, 2012

There is a tendency, for better or worse, for legal bloggers to be partisan cheerleaders over litigation outcomes.  I admit that most often I am dismayed by judicial failures or refusals to exclude dubious plaintiffs’ expert witnesses’ opinion testimony, and I have been known to criticize such decisions.  Indeed, I wouldn’t mind seeing courts exclude dubious defendants’ expert witnesses.  I have written approvingly about cases in which judges have courageously engaged with difficult scientific issues, seen through the smoke screen, and properly assessed the validity of the opinions expressed.  The Gadolinium MDL (No. 1909) Daubert motions and decision offer a fascinating case study of a challenge to an expert witness’s meta-analysis, an effective defense of the meta-analysis, and a judicial decision to admit the testimony, based upon the meta-analysis.  In re Gadolinium-Based Contrast Agents Prods. Liab. Litig., 2010 WL 1796334 (N.D. Ohio May 4, 2010) [hereafter Gadolinium], reconsideration denied, 2010 WL 5173568 (June 18, 2010).

Plaintiffs proffered general causation opinions on the association between gadolinium contrast media and Nephrogenic Systemic Fibrosis (“NSF”), by a nephrologist, Joachim H. Ix, M.D., with training in epidemiology.  Dr. Ix’s opinions were based in large part upon a meta-analysis he conducted on data in published observational studies.  Judge Dan Aaron Polster, the MDL judge, itemized the defendant’s challenges to Dr. Ix’s proposed testimony:

“The previously-used procedures GEHC takes issue with are:

(1) the failure to consult with experts about which studies to include;

(2) the failure to independently verify which studies to select for the meta-analysis;

(3) using retrospective and non-randomized studies;

(4) relying on studies with wide confidence intervals; and

(5) using a “more likely than not” standard for causation that would not pass scientific scrutiny.”

Gadolinium at *23.  Judge Polster confidently dispatched these challenges.  Dr. Ix, as a nephrologist, had subject-matter expertise with which to develop inclusionary and exclusionary criteria on his own.  The defendant never articulated what, if any, studies were inappropriately included or excluded.  The complaint that Dr. Ix had used retrospective and non-randomized studies also rang hollow in the absence of any showing that there were randomized clinical trials with pertinent data at hand.  Once a serious concern of nephrotoxicity arose, clinical trials were unethical, and the defendant never explained why observational studies were somehow inappropriate for inclusion in a meta-analysis.

Relying upon studies with wide confidence intervals can be problematic, but that is one of the reasons to conduct a meta-analysis, assuming the model assumptions for the meta-analysis can be verified.  The plaintiffs effectively relied upon a published meta-analysis, which pre-dated their expert witness’s litigation effort, in which the authors used less conservative inclusionary criteria, and reported a statistically significant summary estimate of risk, with an even wider confidence interval.  R. Agarwal, et al., “Gadolinium-based contrast agents and nephrogenic systemic fibrosis: a systematic review and meta-analysis,” 24 Nephrol. Dialysis & Transplantation 856 (2009).  As the plaintiffs noted in their opposition to the challenge to Dr. Ix:

“Furthermore, while GEHC criticizes Dr. Ix’s CI from his meta-analysis as being “wide” at (5.18864 and 25.326) it fails to share with the court that the peer-reviewed Agarwal meta-analysis, reported a wider CI of (10.27–69.44)… .”

Plaintiff’s Opposition to GE Healthcare’s Motion to Exclude the Opinion Testimony of Joachim Ix at 28 (Mar. 12, 2010)[hereafter Opposition].

Wider confidence intervals certainly suggest greater levels of random error, but Dr. Ix’s intervals suggested statistical significance, and he had carefully considered statistical heterogeneity.  Opposition at 19.  (Heterogeneity was never advanced by the defense as an attack on Dr. Ix’s meta-analysis.)  Remarkably, the defendant never advanced a sensitivity analysis to suggest or to show that reasonable changes to the evidentiary dataset could result in loss of statistical significance, as might be expected from the large intervals.  Rather, the defendant relied upon the fact that Dr. Ix had published other meta-analyses in which the confidence intervals were much narrower, and then claimed that he had “required” these narrower confidence intervals for his professional, published research.  Memorandum of Law of GE Healthcare’s Motion to Exclude Certain Testimony of Plaintiffs’ Generic Expert, Joachim H. Ix, MD, MAS, In re Gadolinium MDL No. 1909, Case: 1:08-gd-50000-DAP, Doc #: 668 (Filed Feb. 12, 2010) [hereafter Challenge].  There never was, however, a showing that narrower intervals were required for publication, and the existence of the published Agarwal meta-analysis contradicted the suggestion.

Interestingly, the defense did not call attention to Dr. Ix’s providing an incorrect definition of the confidence interval!  Here is how Dr. Ix described the confidence interval, in language quoted by plaintiffs in their Opposition:

“The horizontal lines display the “95% confidence interval” around this estimate. This 95% confidence interval reflects the range of odds ratios that would be observed 95 times if the study was repeated 100 times, thus the narrower these confidence intervals, the more precise the estimate.”

Opposition at 20.  The confidence interval does not provide a probability distribution for the parameter of interest; rather, the procedure that generates confidence intervals has a stated probability of producing intervals that cover the true value of the parameter, over repeated sampling.
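A short simulation makes the correct interpretation concrete (the true proportion, sample size, and number of repetitions below are arbitrary choices): across repeated studies, roughly 95% of the computed intervals cover the fixed true value; it is the intervals that vary, not the parameter.

```python
# Coverage simulation for a 95% Wald interval on a proportion.
import random
from math import sqrt

random.seed(1)
TRUE_P, N, REPS = 0.3, 200, 10_000
covered = 0
for _ in range(REPS):
    events = sum(random.random() < TRUE_P for _ in range(N))
    p_hat = events / N
    half = 1.96 * sqrt(p_hat * (1 - p_hat) / N)
    covered += (p_hat - half) <= TRUE_P <= (p_hat + half)
print(covered / REPS)   # ~0.95
```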

Finally, the defendant never showed any basis for suggesting that a scientific opinion on causation requires something more than a “more likely than not” basis.

Judge Polster also addressed some more serious challenges:

“Defendants contend that Dr. Ix’s testimony should also be excluded because the methodology he utilized for his generic expert report, along with varying from his normal practice, was unreliable. Specifically, Defendants assert that:

(1) Dr. Ix could not identify a source he relied upon to conduct his meta-analysis;

(2) Dr. Ix imputed data into the study;

(3) Dr. Ix failed to consider studies not reporting an association between GBCAs and NSF; and

(4) Dr. Ix ignored confounding factors.”

Gadolinium at *24

IMPUTATION

The first point, above – the alleged failure to identify a source for conducting the meta-analysis – rings fairly hollow, and Judge Polster easily deflected it.  The second point raised a more interesting challenge.  In the words of defense counsel:

“However, in arriving at this estimate, Dr. Ix imputed, i.e., added, data into four of the five studies.  (See Sept. 22 Ix Dep. Tr. (Ex. 20), at 149:10-151:4.)  Specifically, Dr. Ix added a single case of NSF without antecedent GBCA exposure to the patient data in the underlying studies.

* * *

During his deposition, Dr. Ix could not provide any authority for his decision to impute the additional data into his litigation meta-analysis.  (See Sept. 22 Ix Dep. Tr. (Ex. 20), at 149:10-151:4.)  When pressed for any authority supporting his decision, Dr. Ix quipped that ‘this may be a good question to ask a Ph.D level biostatistician about whether there are methods to [calculate an odds ratio] without imputing a case [of NSF without antecedent GBCA exposure]’.”

Challenge at 12-13.

The deposition reference suggests that the examiner had scored a debating point by catching Dr. Ix unprepared, but by the time the parties briefed the challenge, the plaintiffs had the issue well in hand, citing A. W. F. Edwards, “The Measure of Association in a 2 × 2 Table,” 126 J. Royal Stat. Soc. Series A 109 (1963); R.L. Plackett, “The Continuity Correction in 2 x 2 Tables,” 51 Biometrika 327 (1964).  Opposition at 36 (describing the process of imputation in the event of zero counts in the cells of a 2 x 2 table for odds ratios).  There are qualms to be stated about imputation, but the defense failed to make them.  As a result, the challenge overall lost momentum and credibility.  As the trial court stated the matter:

“Next, there is no dispute that Dr. Ix imputed data into his meta-analysis. However, as Defendants acknowledge, there are valid scientific reasons to impute data into a study. Here, Dr. Ix had a valid basis for imputing data. As explained by Plaintiffs, Dr. Ix’s imputed data is an acceptable technique for avoiding the calculation of an infinite odds ratio that does not accurately measure association.7 Moreover, Dr. Ix chose the most conservative of the widely accepted approaches for imputing data.8 Therefore, Dr. Ix’s decision to impute data does not call into question the reliability of his meta-analysis.”

Gadolinium at *24.

FAILURE TO CONSIDER NULL STUDIES

The defense’s challenge included a claim that Dr. Ix had arbitrarily excluded studies in which there was no reported incidence of NSF.  The defense brief unfortunately does not describe the studies excluded, or what, if any, effect their inclusion in the meta-analysis would have had.  This was, after all, the crucial issue.  The abstract nature of the defense claim left the matter ripe for misrepresentation by the plaintiffs:

“GEHC continues to misunderstand the role of a meta-analysis and the need for studies that included patients both that did or did not receive GBCAs and reported on the incidence of NSF, despite Dr. Ix’s clear elucidation during his deposition. (Ix Depo. TR [Exh.1] at 97-98).  Meta-analyses such as performed by Dr. Ix and Dr. Agarwal search for whether or not there is a statistically valid association between exposure and disease event. In order to ascertain the relationship between the exposure and event one must have an event to evaluate. In other words, if you have a study in which the exposed group consists of 10,000 people that are exposed to GBCAs and none develop NSF, compared to a non-exposed group of 10,000 who were not exposed to GBCAs and did not develop NSF, the study provides no information about the association between GBCAs and NSF or the relative risk of developing NSF.”

Opposition at 37-38 (emphasis in original).  What is fascinating about this particular challenge, and the plaintiffs’ response, is the methodological hypocrisy exhibited.  In essence, the plaintiffs argued that imputation was appropriate in a case-control study, in which one cell contained a zero, but they would ignore a great deal of data from a cohort study in which both groups had zero events.  To be sure, case-control studies are more efficient than cohort studies for identifying and assessing risk ratios for rare outcomes.  Nevertheless, the plaintiffs could easily have been hoisted with their own hypothetical petard.  No one among 10,000 gadolinium-exposed patients developed NSF; and no one in the control group did either.  The hypothetical study suggests that the rate of NSF is low and no different in the exposed and in the unexposed patients.  The risk ratio could be obtained by imputing a value, such as 0.5, for the cells containing zero, and a confidence interval calculated.  The risk ratio, of course, would be 1.0.
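Worked through with the same 0.5 imputation used for the case-control tables (the counts are the hypothetical’s own), the supposedly uninformative cohort study yields a perfectly informative risk ratio of 1.0:

```python
# Risk ratio for the 10,000 vs. 10,000 zero-event hypothetical, using a
# 0.5 continuity correction in the zero cells.

def risk_ratio_imputed(events_t, n_t, events_c, n_c, corr=0.5):
    if events_t == 0 or events_c == 0:
        events_t, events_c = events_t + corr, events_c + corr
        n_t, n_c = n_t + corr, n_c + corr
    return (events_t / n_t) / (events_c / n_c)

print(risk_ratio_imputed(0, 10_000, 0, 10_000))   # 1.0 -- informative, not empty
```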

Unfortunately, the defense did not make this argument; nor did it explore where the meta-analysis might have come out had Dr. Ix applied a more even-handed methodology.  The gap allowed the trial court to brush the challenge aside:

“The failure to consider studies not reporting an association between GBCAs and NSF also does not render Dr. Ix’s meta-analysis unreliable. The purpose of Dr. Ix’s meta-analysis was to study the strength of the association between an exposure (receiving GBCA) and an outcome (development of NSF). In order to properly do this, Dr. Ix necessarily needed to examine studies where the exposed group developed NSF.”

Gadolinium at *24.  Judge Polster, with no help from the defense brief, missed the irony of Dr. Ix’s willingness to impute data in the case-control 2 x 2 contingency tables, but not in the relative risk tables.

CONFOUNDING

Defendants complained that Dr. Ix had ignored the possibility that confounding factors had contributed to the development of NSF.  Challenge at 13.  Defendants went so far as to charge Dr. Ix with misleading the court by failing to consider other possible causative exposures or conditions.  Id.

Defendants never identified the existence, source, and likely magnitude of confounding factors.  As a result, the plaintiffs’ argument, based in the Reference Manual, that confounding was an unlikely explanation for a very large risk ratio was enthusiastically embraced by the trial court, virtually verbatim from the plaintiffs’ Opposition (at 14):

“Finally, the Court rejects Defendants’ argument that Dr. Ix failed to consider confounding factors. Plaintiffs argued and Defendants did not dispute that, applying the Bradford Hill criteria, Dr. Ix calculated a pooled odds ratio of 11.46 for the five studies examined, which is higher than the 10 to 1 odds ratio of smoking and lung cancer that the Reference Manual on Scientific Evidence deemed to be “so high that it is extremely difficult to imagine any bias or confounding factor that may account for it.” Id. at 376.  Thus, from Dr. Ix’s perspective, the odds ratio was so high that a confounding factor was improbable. Additionally, in his deposition, Dr. Ix acknowledged that the cofactors that have been suggested are difficult to confirm and therefore he did not try to specifically quantify them. (Doc # : 772-20, at 27.) This acknowledgement of cofactors is essentially equivalent to the Agarwal article’s representation that “[t]here may have been unmeasured variables in the studies confounding the relationship between GBCAs and NSF,” cited by Defendants as a representative model for properly considering confounding factors. (See Doc # : 772, at 4-5.)”

Gadolinium at *24.

The real problem is that the defendant’s challenge pointed only to possible, unidentified causal agents.  The smoking/lung cancer analogy, provided by the Reference Manual, was inapposite.  Smoking is indeed a large risk factor for lung cancer, with relative risks over 20.  Although there are other human lung carcinogens, none is consistently of the same order of magnitude (not even asbestos), and as a result, confounding can generally be excluded as an explanation for the large risk ratios seen in smoking studies.  It is easy to imagine that there are confounders for NSF, especially given that the disease has been identified only relatively recently, and that they might be of the same or greater magnitude as the risk suggested for the gadolinium contrast media.  The defense, however, failed to identify confounders that actually threatened the validity of any of the individual studies, or of the meta-analysis.
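The point can be made quantitative with the standard external-adjustment (sensitivity-analysis) formula associated with Bross and Schlesselman, which gives the apparent risk ratio that a binary confounder could generate on its own. The prevalences and confounder risk ratios below are hypothetical, chosen to show that even an implausibly strong confounder falls far short of an eleven-fold association:

```python
# Apparent (spurious) risk ratio produced solely by confounding:
# bias = (p1 * (rr_c - 1) + 1) / (p0 * (rr_c - 1) + 1), where rr_c is
# the confounder-disease risk ratio, and p1, p0 are the confounder's
# prevalences among the exposed and unexposed.

def confounding_bias(rr_c, p1, p0):
    return (p1 * (rr_c - 1) + 1) / (p0 * (rr_c - 1) + 1)

print(confounding_bias(5.0, 0.5, 0.1))    # ~2.1
print(confounding_bias(10.0, 0.8, 0.05))  # ~5.7
```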

CONCLUSION

The defense hinted at the general unreliability of meta-analysis, with references to the Reference Manual on Scientific Evidence at 381 (2d ed. 2000) (noting problems with meta-analysis), and other, relatively dated papers.  See, e.g., John Bailar, “Assessing Assessments,” 277 Science 529 (1997) (arguing that “problems have been so frequent and so deep, and overstatements of the strength of conclusions so extreme, that one might well conclude there is something seriously and fundamentally wrong with [meta-analysis].”).  The Reference Manual language, carried over into the third edition, is out of date, and represents a failing of the new edition.  See “The Treatment of Meta-Analysis in the Third Edition of the Reference Manual on Scientific Evidence” (Nov. 14, 2011).

The plaintiffs came forward with some descriptive statistics on the prevalence of meta-analysis in contemporary biomedical literature.  The defendants gave mostly argument; there was a dearth of citation to defense expert witnesses, affidavits, consensus papers on meta-analysis, textbooks, papers by leading authors, and the like.  The defense challenge suffered from being diffuse and unfocused; it lost persuasiveness by including weak, collateral issues, such as claiming that Dr. Ix was opining “only” on a “more likely than not” basis, that he had not consulted with other experts, and that he had failed to use randomized trial data.  The defense was quick to attack perceived deficiencies, but it did not illustrate how or why the alleged deficiencies threatened the validity of Dr. Ix’s meta-analysis.  Indeed, even when the defense made strong points, such as the exclusion of zero-event cohort studies, it failed to document that such studies existed, and that their inclusion might have made a difference.