Approaches to Integrating Evidence From Animal and Human Studies in Chemical Assessments

Kris Thayer, National Center for Environmental Assessment (NCEA), Integrated Risk Information System (IRIS) Division Director
Advancing Disease Modeling in Animal-Based Research in Support of Precision Medicine, October 5-6, 2017
Office of Research and Development, NCEA, IRIS

- Created in 1985 to foster consistency in the evaluation of chemical toxicity across the Agency.
- IRIS assessments contribute to decisions across EPA and other health agencies
  - Clean Air Act (CAA), Safe Drinking Water Act (SDWA), Food Quality Protection Act (FQPA), Comprehensive Environmental Response, Compensation, and Liability Act (CERCLA), Resource Conservation and Recovery Act (RCRA), Toxic Substances Control Act (TSCA)
- Comprehensive assessment of hazard and toxicity values
  - Noncancer: Reference Doses (RfDs) and Reference Concentrations (RfCs).
  - Cancer: Oral Slope Factors (OSFs) and Inhalation Unit Risks (IURs).
- IRIS is the only federal program to provide toxicity values for both cancer and noncancer effects.
- IRIS assessments undergo a rigorous, multi-step review process with opportunities for public input
- IRIS assessments have no direct regulatory impact until they are combined with
  - Extent of exposure to people, cost of cleanup, available technology, etc.
  - Regulatory options, which are the purview of EPA's program offices.

Toxicity Value Definitions
- Reference Dose/Concentration (non-cancer): An estimate (with uncertainty spanning perhaps an order of magnitude) of a daily oral/inhalation exposure to the human population (including sensitive subgroups) that is likely to be without an appreciable risk of deleterious effects during a lifetime. It can be derived from a NOAEL, LOAEL, or benchmark dose, with uncertainty factors generally applied to reflect limitations of the data used. [Durations include acute, short-term, subchronic, and chronic]
- Oral Slope Factor: An estimate of the increased cancer risk from oral exposure to a dose of 1 mg/kg-day for a lifetime. The OSF can be multiplied by an estimate of lifetime exposure (in mg/kg-day) to estimate the lifetime cancer risk.
- Inhalation Unit Risk: An estimate of the increased cancer risk from inhalation exposure to a concentration of 1 µg/m³ for a lifetime. The interpretation of inhalation unit risk would be as follows: if unit risk = 2 × 10⁻⁶ per µg/m³, 2 excess cancer cases (upper bound estimate) are expected to develop per 1,000,000 people if exposed daily for a lifetime to 1 µg of the chemical per m³ of air. The IUR can be multiplied by an estimate of lifetime exposure (in µg/m³) to estimate the lifetime cancer risk.
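
The arithmetic behind these definitions can be sketched in Python. This is an illustrative sketch, not IRIS code: the function names are hypothetical, and dividing a point of departure (NOAEL, LOAEL, or benchmark dose) by the product of the uncertainty factors is the conventional derivation, assumed here because the slide does not spell it out.

```python
import math
from typing import Iterable


def lifetime_cancer_risk(unit_risk: float, lifetime_exposure: float) -> float:
    """Upper-bound excess lifetime cancer risk = unit risk x lifetime exposure.

    Pair an OSF (per mg/kg-day) with a dose in mg/kg-day, or an IUR
    (per ug/m3) with an air concentration in ug/m3. Units must match.
    """
    if unit_risk < 0 or lifetime_exposure < 0:
        raise ValueError("unit risk and exposure must be non-negative")
    return unit_risk * lifetime_exposure


def expected_excess_cases(risk: float, population: int) -> float:
    """Expected number of excess cases in a population exposed for a lifetime."""
    return risk * population


def reference_value(point_of_departure: float,
                    uncertainty_factors: Iterable[float]) -> float:
    """Reference dose/concentration sketch (assumed convention, see lead-in):
    point of departure divided by the product of the uncertainty factors."""
    composite = math.prod(uncertainty_factors)
    if composite <= 0:
        raise ValueError("uncertainty factors must be positive")
    return point_of_departure / composite


if __name__ == "__main__":
    # Slide example: IUR of 2e-6 per ug/m3, lifetime exposure to 1 ug/m3.
    risk = lifetime_cancer_risk(2e-6, 1.0)
    print(expected_excess_cases(risk, 1_000_000))
    # Hypothetical numbers: NOAEL of 10 mg/kg-day with two 10-fold factors.
    print(reference_value(10.0, [10, 10]))
```

With the slide's own numbers, an IUR of 2 × 10⁻⁶ per µg/m³ and a lifetime exposure of 1 µg/m³ give approximately 2 excess cases per 1,000,000 people. The NOAEL and uncertainty factors in the last call are made-up values for illustration only.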

Assessments Conducted Using Systematic Review Methods
[Figure: systematic review workflow. Stages shown: Scoping, Systematic Review Protocol, Literature Inventory, Study Evaluation, Data Extraction, Evidence Integration, Derive Toxicity Values, Initial Problem Formulation, Literature Search, Refined Analysis Plan, Organize Hazard Review, Evidence Analysis and Synthesis, Select and Model Studies. Milestones: Assessment Initiated, Assessment Developed.]
- Most pertinent to today's presentation on approach to integrating evidence
- Evidence integration is qualitative in IRIS assessments, expressed in context of confidence (scale: Strongest evidence to Weakest evidence)
- Quantitative methods are being explored in NCEA, e.g., Bayesian methods of combining data from human and other species

Individual Study Evaluation
- General approach same for human and animal studies
- Evaluation process focused on:
  - Internal validity/bias
  - Sensitivity
  - Applicability (relevance to the question)
  - Reporting quality

Overview of Study Evaluation in IRIS

Individual study level domains
- Animal: Reporting Quality; Selection or Performance Bias; Confounding/Variable Control; Reporting or Attrition Bias; Exposure Methods Sensitivity; Outcome Measures and Results Display; Other
- Epidemiological: Exposure measurement; Outcome ascertainment; Population Selection; Confounding; Analysis; Sensitivity

Domain judgements (Judgement: Interpretation)
- Good: Appropriate study conduct relating to the domain & minor deficiencies not expected to influence results.
- Adequate: A study that may have some limitations, but not likely to be severe or to have a substantive impact on results.
- Poor: Identified biases or deficiencies interpreted as likely to have had a substantial impact on the results or prevent reliable interpretation of study findings.
- Critically Deficient: A flaw that is so serious that the study could not be used.

Overall study rating (Rating: Interpretation)
- High: No notable deficiencies or concerns identified; potential for bias unlikely or minimal and sensitive methodology.
- Medium: Possible deficiencies or concerns noted, but resulting bias or lack of sensitivity would be unlikely to be of a substantive degree.
- Low: Deficiencies or concerns were noted, and the potential for substantive bias or inadequate sensitivity could have a significant impact on the study results or their interpretation.
- Uninformative: Serious flaw(s) makes study results unusable for hazard identification

Individual Epidemiological Study Examples
[Figure: two example study evaluations, one rated Medium confidence and one rated Uninformative]

Across Study Evaluations
[Figure: evaluations shown for Study 1 through Study 6]

Within Evidence Stream Synthesis
- Synthesis of evidence is more than counting the number of positive and negative studies
- Consider the influence of bias and sensitivity when describing study results and synthesizing evidence
- Synthesis should primarily be based on studies of medium and high confidence (when available)
- Use structured framework to aid in transparency
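
The rule that synthesis should rest primarily on medium and high confidence studies can be sketched as a simple filter. This is a minimal illustration, not an IRIS procedure: the `Study` class and `select_for_synthesis` function are hypothetical names, and falling back to low confidence studies when no medium or high confidence studies exist is an assumed reading of "when available". Uninformative studies are always excluded, consistent with their description as unusable for hazard identification.

```python
from dataclasses import dataclass
from typing import List

# Overall study ratings used in the study evaluation step.
CONFIDENCE_LEVELS = ("high", "medium", "low", "uninformative")


@dataclass(frozen=True)
class Study:
    name: str
    confidence: str  # one of CONFIDENCE_LEVELS


def select_for_synthesis(studies: List[Study]) -> List[Study]:
    """Return the studies on which a synthesis would primarily be based."""
    for study in studies:
        if study.confidence not in CONFIDENCE_LEVELS:
            raise ValueError(f"unknown confidence rating: {study.confidence!r}")

    preferred = [s for s in studies if s.confidence in ("high", "medium")]
    if preferred:
        return preferred

    # Assumed fallback: with no medium/high confidence studies available,
    # use low confidence studies; uninformative studies are never used.
    return [s for s in studies if s.confidence == "low"]


if __name__ == "__main__":
    example = [
        Study("Study 1", "medium"),
        Study("Study 2", "low"),
        Study("Study 3", "uninformative"),
        Study("Study 4", "high"),
    ]
    print([s.name for s in select_for_synthesis(example)])
```

A filter like this only decides which studies anchor the synthesis; the slide's other points (weighing bias and sensitivity, not merely counting positive and negative studies) still require expert judgement.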

Within Evidence Stream Considerations
Evidence streams: Epidemiology evidence; Animal toxicology evidence
Considerations:
- Risk of Bias
- Sensitivity
- Directness/Applicability
- Consistency
- Effect magnitude/precision
- Biological gradient/dose-response
- Coherence
Informative human and animal health effect evidence is analyzed and synthesized separately.
Mechanistic evidence that informs the conclusions regarding the human and animal health effect evidence is also synthesized.

Synthesizing Evidence on Health Effects
Organization and Structure
- What outcomes are relevant to each health hazard domain and at what level (e.g., health effect or subgroupings) should synthesis occur?
- What populations were studied (e.g., general population, occupations, life stages, species, etc.)?
- Can study results be described across varying exposure patterns, levels, duration or intensity?
- Are there differences in the confidence in study results for different outcomes, populations, or exposure?
- Does toxicokinetic information influence differences in responses across route of exposure, other aspects of exposure, or life stages?

Moving from Synthesis to Integration
Step 1: Within Evidence Stream Judgements
- Results of Human Health Effect Study Synthesis
- Results of Animal Health Effect Study Synthesis
Step 2: Across Evidence Stream Integration
Results of Synthesis of Mechanistic Evidence Informing the Human and Animal Syntheses

Evidence Profile Tables to Summarize Evidence Synthesis and Integration Judgements

[Health Effect or Outcome Grouping]

Evidence from Human Studies (Route)
- Studies and confidence (risk of bias, sensitivity): References; Study confidence and explanation; Study design description. Could be multiple rows (e.g., grouped by study confidence or population) if this informs results heterogeneity
- Factors that increase confidence: Consistency; Dose response gradient; Coherence of observed effects (apical studies); Effect size (magnitude, severity); Biological plausibility; Low risk of bias/high quality; Insensitivity of null/negative studies
- Factors that decrease confidence: Unexplained inconsistency; Imprecision; Indirectness/applicability; Poor study quality/high risk of bias; Other (e.g., Single/Few Studies; small sample size); Evidence demonstrating implausibility
- Summary of findings: Results information (general endpoints affected/unaffected) across studies. Human evidence informing biological plausibility: discuss how mechanistic data influenced the within stream judgement (e.g., evidence of precursors in exposed humans).
- Within stream evidence strength judgements: Describe confidence in evidence from human studies, and primary basis (scale: Strongest evidence to Weakest evidence)

Evidence for an Effect in Animals (Route)
- Studies and confidence (risk of bias, sensitivity): References; Study confidence and explanation; Study design description. Could be multiple rows (e.g., by study confidence, species, or exposure duration) if this informs results heterogeneity
- Factors that increase confidence: Consistency and Replication; Dose response gradient; Coherence of observed effects (apical studies); Effect size (magnitude, severity); Biological plausibility; Low risk of bias/high quality; Insensitivity of null/negative studies
- Factors that decrease confidence: Unexplained inconsistency; Imprecision; Indirectness/applicability; Poor study quality/high risk of bias; Other (e.g., Single/Few Studies; small sample size); Evidence demonstrating implausibility
- Summary of findings: Results information (general endpoints affected/unaffected) across studies. Evidence informing biological plausibility for effects in animals: discuss how mechanistic data influenced the within stream judgement (e.g., evidence of coherent molecular changes in animal studies)
- Within stream evidence strength judgements: Describe confidence in evidence for an effect in animals, and primary basis (scale: Strongest evidence to Weakest evidence)

Inference across evidence streams: Describe assumptions and degree of support from mechanistic evidence

Final evidence integration conclusion: Describe conclusion(s) and primary basis for the integration of all available evidence (e.g., across human, animal, and mechanistic) (scale: Strongest Conclusion to Weakest Conclusion)

Example Evidence Profile Table

Columns: Studies; Factors that increase confidence; Factors that decrease confidence; Summary of findings; Within stream confidence judgement; Inferences across streams; Hazard assessment conclusion

Chemical X (Health Outcome Y)

Human (oral)
- Studies: Case Series: Study 1. Cross sectional: Study 2. Risk of bias and sensitivity: High risk of bias.
- Factors that decrease confidence: Few studies; Low number of exposed cases (insensitivity); Lack of dose response
- Summary of findings: Studies found no significant correlations with chemical X exposure and health outcome Y
- Within stream confidence judgement: INDETERMINATE

Animal (oral)
- Studies: Short term: Study 1 (rat), Study 2 (rat). Subchronic: Study 3 (rat), Study 4 (mouse). Developmental/Reproductive: Study 5 (rat), Study 6 (rat), Study 7 (rat), Study 8 (mouse). Risk of bias and sensitivity.
- Factors that increase confidence: Coherence among related endpoints; Low risk of bias; Dose response gradient; Biological plausibility
- Factors that decrease confidence: Small sample sizes in some studies; Some unexplained inconsistency
- Summary of findings: Similar pattern of changes in hormone A and hormone B were observed in study 1 and study 2. Effects on serum hormone levels are supported by histopathological changes in tissue A (study 1, study 3, study 4, study 5, study 6) and increased tissue A weight (study 1, study 5, study 6, study 8). Evidence of dose response gradient in most studies reporting effects. Biological plausibility of the observed effects is supported by mechanistic studies in mammalian and nonmammalian models (see Section 1.2.1 Mechanistic Evidence).
- Within stream confidence judgement: MODERATE

Inferences across streams: Findings in animals presumed relevant to humans (no evidence to the contrary); coherent evidence from mechanistic studies in mammalian and nonmammalian models.

Parting Thoughts
- The state of the science for evidence integration across animal and human studies is largely qualitative; structured frameworks are becoming more common
- In environmental chemical assessments, animal models are typically assumed relevant to human health unless data to the contrary exist
- Considerations for the importance of similarity in animal and human response may differ based on goals (therapeutic predictivity versus identifying potential hazard)
- Exact concordance of findings in animals and humans is not necessarily required in chemical assessments (e.g., tumor site concordance)
- Worth considering minimal data standards as journals develop guidance on the amount/type of information shared via supplemental information

THANK YOU FOR YOUR ATTENTION