SUPPLEMENTARY INFORMATION

Similar documents
Mohammad Amin Asadi Zarch et al.,2015

Investigating the robustness of the nonparametric Levene test with more than two groups

Overview of Non-Parametric Statistics

Revised Cochrane risk of bias tool for randomized trials (RoB 2.0) Additional considerations for cross-over trials

The Context of Detection and Attribution

Performance of Median and Least Squares Regression for Slightly Skewed Data

Lecture 15. There is a strong scientific consensus that the Earth is getting warmer over time.

MEA DISCUSSION PAPERS

DRAFT (Final) Concept Paper On choosing appropriate estimands and defining sensitivity analyses in confirmatory clinical trials

Sum of Neurally Distinct Stimulus- and Task-Related Components.

Inference with Difference-in-Differences Revisited

SUPPLEMENTARY INFORMATION

Quantitative Methods in Computing Education Research (A brief overview tips and techniques)

Citation for published version (APA): Ebbes, P. (2004). Latent instrumental variables: a new approach to solve for endogeneity s.n.

Stat Wk 9: Hypothesis Tests and Analysis

ON THE MAGNITUDE AND PERSISTENCE OF THE HAWTHORNE EFFECT EVIDENCE FROM FOUR FIELD STUDIES

Mantel-Haenszel Procedures for Detecting Differential Item Functioning

The Impact of Continuity Violation on ANOVA and Alternative Methods

Nepal - Unmet Need for Family Planning,

CHAPTER 3 A null-model for significance testing of presence-only species distribution models

Short Communication Response to comment on Candidate Distributions for Climatological Drought Indices (SPI and SPEI)

Lec 02: Estimation & Hypothesis Testing in Animal Ecology

Annual boom bust cycles of polar phytoplankton biomass revealed by space-based lidar

1D Verification Freezing and Thawing

Examining differences between two sets of scores

Theme 14 Ranking tests

Psychology 2019 v1.3. IA2 high-level annotated sample response. Student experiment (20%) August Assessment objectives

Food Production and Violent Conflict in Sub-Saharan Africa. - Supplementary Information -

CHAPTER 3 RESEARCH METHODOLOGY

Is Knowing Half the Battle? The Case of Health Screenings

Comparison of Actual and Calculated Lag Times in Humidity Cell Tests

CHAPTER III RESEARCH METHODOLOGY

Caveats for the Spatial Arrangement Method: Supplementary Materials

FOOD SUPPLY, DISTRIBUTION, CONSUMPTION AND NUTRITIONAL STATUS IN BANGLADESH

Discussion. Ralf T. Münnich Variance Estimation in the Presence of Nonresponse

Chapter 5: Field experimental designs in agriculture

Russian Journal of Agricultural and Socio-Economic Sciences, 3(15)

Title: A robustness study of parametric and non-parametric tests in Model-Based Multifactor Dimensionality Reduction for epistasis detection

BOOTSTRAPPING CONFIDENCE LEVELS FOR HYPOTHESES ABOUT QUADRATIC (U-SHAPED) REGRESSION MODELS

Developing and Testing Hypotheses Kuba Glazek, Ph.D. Methodology Expert National Center for Academic and Dissertation Excellence Los Angeles

Basic Biostatistics. Chapter 1. Content

CHAPTER VI RESEARCH METHODOLOGY

MS&E 226: Small Data

Measuring the path toward malaria elimination

Bayes Linear Statistics. Theory and Methods

Statistical analysis DIANA SAPLACAN 2017 * SLIDES ADAPTED BASED ON LECTURE NOTES BY ALMA LEORA CULEN

Simple Linear Regression the model, estimation and testing

Essentials in Bioassay Design and Relative Potency Determination

How many speakers? How many tokens?:

Type of intervention Treatment. Economic study type Cost-effectiveness analysis.

SUPPLEMENTARY INFORMATION

MULTIPLE LINEAR REGRESSION 24.1 INTRODUCTION AND OBJECTIVES OBJECTIVES

Stepwise method Modern Model Selection Methods Quantile-Quantile plot and tests for normality

COAL COMBUSTION RESIDUALS RULE STATISTICAL METHODS CERTIFICATION SOUTHERN ILLINOIS POWER COOPERATIVE (SIPC)

STATISTICS AND RESEARCH DESIGN

By the end of the activities described in Section 5, consensus has

ANOVA in SPSS (Practical)

A GUIDE TO ROBUST STATISTICAL METHODS IN NEUROSCIENCE. Keywords: Non-normality, heteroscedasticity, skewed distributions, outliers, curvature.

Mapping Population & Climate Change: Malawi. Malawi - Unmet Need for Family Planning, 2010

Unit 1 Exploring and Understanding Data

Assignment #6. Chapter 10: 14, 15 Chapter 11: 14, 18. Due tomorrow Nov. 6 th by 2pm in your TA s homework box

Statistics 2. RCBD Review. Agriculture Innovation Program

Initial Evidence for Self-Organized Criticality in Electric Power System Blackouts

MMI 409 Spring 2009 Final Examination Gordon Bleil. 1. Is there a difference in depression as a function of group and drug?

Data and Statistics 101: Key Concepts in the Collection, Analysis, and Application of Child Welfare Data

Changing expectations about speed alters perceived motion direction

HS Exam 1 -- March 9, 2006

Absolute Humidity as a Deterministic Factor Affecting Seasonal Influenza Epidemics in Japan

KARUN ADUSUMILLI OFFICE ADDRESS, TELEPHONE & Department of Economics

Abstract Title Page Not included in page count. Title: Analyzing Empirical Evaluations of Non-experimental Methods in Field Settings

CHAPTER 4 RESULTS. In this chapter the results of the empirical research are reported and discussed in the following order:

Student Performance Q&A:

Evaluation of a Scenario in Which Estimates of Bioequivalence Are Biased and a Proposed Solution: t last (Common)

Research Methods in Forest Sciences: Learning Diary. Yoko Lu December Research process

Various Approaches to Szroeter s Test for Regression Quantiles

List of Figures. List of Tables. Preface to the Second Edition. Preface to the First Edition

Institutional information. Concepts and definitions

Impact and adjustment of selection bias. in the assessment of measurement equivalence

How are Journal Impact, Prestige and Article Influence Related? An Application to Neuroscience*

In the Matter of Portland General Electric Company 2016 Integrated Resource Plan Docket LC 66

Data Mining. Outlier detection. Hamid Beigy. Sharif University of Technology. Fall 1395

Title: Intention-to-treat and transparency of related practices in randomized, controlled trials of anti-infectives

Chapter 1: Introduction to Statistics

STAT445 Midterm Project1

SUPPLEMENTARY INFORMATION

A Comparison of Variance Estimates for Schools and Students Using Taylor Series and Replicate Weighting

Testing the Predictability of Consumption Growth: Evidence from China

Example 1: How would you evaluate this transducer? What is the most common problem found in ultrasound QC testing? Uniformity is Subjective

Doing Quantitative Research 26E02900, 6 ECTS Lecture 6: Structural Equations Modeling. Olli-Pekka Kauppila Daria Kautto

Functional Elements and Networks in fmri

PLS 506 Mark T. Imperial, Ph.D. Lecture Notes: Reliability & Validity

CHAPTER - 6 STATISTICAL ANALYSIS. This chapter discusses inferential statistics, which use sample data to

Climate and Plague. Kenneth L. Gage, PhD Bacterial Diseases Branch Division of Vector-Borne Diseases NCEZID/CDC

Stafff Paper Prepared by: Corey Freije. December 2018

n Outline final paper, add to outline as research progresses n Update literature review periodically (check citeseer)

Introduction & Basics

National Child Measurement Programme Changes in children s body mass index between 2006/07 and 2010/11

MRL setting and intakes for cereals. Annette Petersen

Transcription:

doi:10.1038/nature16467 Supplementary Discussion Relative influence of larger and smaller EWD impacts. To examine the relative influence of varying disaster impact severity on the overall disaster signal in the composites, we constructed histograms of eventa year differences from control for extreme heat and drought. The distribution of eventa year differences from control spans positive and negative values, with more values falling in the negative (reda shaded areas, Extended Data Fig. 1). In all cases, upwards of 70% of points fall within 20% of respective resampled control means, and fewer than 10% of disasters reflect a deficit of more than 30%. The disaster response signals are thus driven by moderate deficits in production, yield, and area, with a smaller influence of more severe impacts. This finding emphasizes the global agricultural importance of moderately severe disasters. The influence of sample size. The SEA method as applied here functions under the assumption that if the number of time series composited is sufficiently large, then the signal due to all variables other than the disaster (for instance policy changes or economic shocks) should be distributed evenly positively and negatively relative to control, and therefore disappear after averaging. The composite should thus exhibit only the impact of the disaster. While no definitive guidelines exist for what qualifies as a sufficient sample size for effective compositing, previous applications of SEA have employed samples of about 20 31,32 and as low as 9 constituent time series 33. WWW.NATURE.COM/NATURE 1

Sample sizes by year are tabulated in Extended Data Tables 2 and 4. It should be noted that sample sizes differ across years in a given composite due to varying numbers of coinciding disasters per year removed during compositing to isolate the impacts of specific disaster types (see Methods). At some points in the main letter, we present sample sizes averaged across the years for clarity. Respective control composites were resampled using corresponding year- specific sample sizes. While most of our analyses involved much larger samples than those assessed in previous applications, in some cases our sample sizes seemed potentially too small for effective compositing (less than 30, and as small as 14). For cases with small sample sizes, we questioned whether our results may have included type I (persistent noise due to insufficient sample misconstrued as disaster signal) and type II (insufficient sample to isolate disaster signal) errors. Regarding type I errors, there are two effects of sample size on statistical significance in our methodology, which must be considered together. First, with lower sample size, each disaster composite exhibits more noise relative to signal compared to larger samples, because the probability of non- disaster signals in each time series contributing to the composite canceling out (i.e. being equally positive and negative relative to control) increases with sample size. This effect is visible in the greater variability in the extreme heat and cold composites (n~50) compared to those for flood and drought (n>200) in Figures 1 and 2. Second, at lower sample size, each control composite is more variable, so the distribution of sets controls is wider (visible in the difference in widths of control boxplots between disaster types in Figure 1). If a deficit signal in the disaster composite is significant, it should dip below most of the 1000 controls (by our criterion, all but 5 of 1000 for two- tailed 99% significance). Otherwise, the signal is not statistically significant (i.e., is within the variability of false- WWW.NATURE.COM/NATURE 2

disaster controls). Since the distribution of controls depends on number of samples in each replicate, the greater variability in disaster composites based on fewer samples corresponds to equivalently wider control distributions (i.e., equivalently more conservative thresholds for significance). We therefore consider this method of significance testing robust to sample size, and are confident that our significant findings in cases with fewer samples do not reflect type I errors. Type II errors could arise if the sample size is insufficient to isolate a strong mean disaster signal that is differentiable from variability in controls. This may arise due to a saturation effect in which the mean disaster signal becomes increasingly different from control with additional samples as it asymptotically approaches the final estimate. The existence of a saturation effect would call into question, for instance, whether wheat and rice genuinely respond less to extreme heat, or if the sample is simply too small for SEA to isolate an effect in rice and wheat (Figure 4). To examine whether such a saturation effect exists, we performed an illustrative resampling experiment in which we computed 200 pseudo- mean disaster impacts on 16- cereal aggregate production using random sub- samples of the full disaster sets of size (1, 2,, n). This procedure enabled us to visualize the convergence of and variability in 200 possible paths to our actual mean disaster impact at the full sample size (Extended Data Figure 2). We used the set of 200 estimates at n- 1 to estimate the 95% confidence intervals of our full- sample impact estimates 34. At low sample sizes, the impact of extreme values on the mean is greater, resulting in high variability among the different estimates. With increasing samples, the variability of estimates reaches an inflection point after which incremental samples result in only small incremental decreases in the variability of estimates. With increasing samples below this point, the WWW.NATURE.COM/NATURE 3

majority of noise is reduced by compositing. The fact that our actual sample sizes are far beyond this zone lends us confidence that our sample sizes are sufficient. If failures to reject the null hypothesis in cases with small samples were due to insufficient sample size, then there would exist a bias towards under- estimating the disaster signal relative to final estimate when resampling with smaller sample size (N<n). In other words, if a saturation effect exists, then the pseudo- estimates at N<n should approach the final estimate at N=n from the positive side. In that event, the disaster signal may become significant with additional samples. Since the pseudo- estimates are roughly evenly distributed on either side of the final estimate (grey dotted line, Extended Data Figure 2), we deduce that no saturation effect is responsible for the lack of significant findings for these cases, and that our lack of findings in these cases are not reflective of a type II error. The fact that significant signal was detected in some cases with equivalently small samples provides us further confidence in our findings. Comparative statistics and assumptions. For the individual crop, regional, and earlier- versus- later droughts analyses, we assessed the significance of differences in disaster impacts between groups by applying Kruskal- Wallis one- way non- parametric analysis of variance to the sets of individual disaster responses in each grouping. We made this choice because, after applying the Anderson- Darling test for normality to the data, we rejected the null hypothesis of normality at the 5% significance level for all groups, and therefore could not use parametric statistics which assume a normal distribution. Kruskal- Wallis analysis of variance tests the significance of differences between group distributions (without the assumption of normality), which includes differences in both mean and variance. In comparing disaster impacts among regions, crops, and times, differences in mean were more salient for our study than differences in variance. WWW.NATURE.COM/NATURE 4

An initial application Levene s Absolute test for equality of variances on the data revealed unequal variance for many groupings that we wished to compare (at 5% significance). Although inequality of variance does not violate the assumptions of the Kruskal- Wallis test, it complicates the attribution of statistically significant differences to mean versus variance. Applying a quadratic transformation to the entire set of disaster- year responses resolved the issue of unequal variance in most cases (Extended Data Table 6), and did not in any way affect the Kruskal- Wallis results. A re- application of the Anderson- Darling test revealed that the transformed data still deviated substantially from normality. We therefore proceeded with quadratically- transformed data and the Kruskal- Wallis test in assessing our comparisons for statistical significance. Where differences between groups were significant but variances unequal, we have stated that we cannot conclude whether the significance is due to differing mean or variance. This situation arises most notably for individual crop yield for extreme heat (Figure 4d). In this case, while the Kruskal- Wallis results fail to conclusively differentiate the mean signals among crops, maize is still the only crop with significant yield impacts. Comparison to previous studies. We cannot compare our estimated yield deficits from extreme weather events to previous studies of the impact of seasonal mean climate trends over the same period 20 because we quantify mean per- disaster sensitivities and not impacts due to climate trends. To compare our yield loss estimates based on reported disasters to previous studies based growing season weather anomalies, it was necessary to establish whether the reported disasters correspond to significant seasonal weather anomalies. We therefore repeated the compositing procedure using nationally- averaged JJA (DJF in Southern Hemisphere countries) mean temperatures WWW.NATURE.COM/NATURE 5

and total precipitation from CRU TS 3.2 35 for our samples of extreme heat and drought disaster over 1961-2010, and multiplied the percent anomalies by the global averages. Drought did not correspond to any precipitation anomalies and showed only slight temperature anomalies of 0.15ºC (Extended Data Figure 3b- c). We conclude that the drought impacts observed here are therefore independent from those estimated in previous studies based on seasonal mean anomalies. Meanwhile, the extreme heat disasters corresponded to a significant mean temperature anomaly of +1.2ºC, which implies a yield sensitivity of 6-7% per 1ºC (Extended Data Figure 3a). This value falls within the range of sensitivity estimates in Lobell and Field (2007) 36, which found 5-9% yield reductions per 1ºC increase in seasonal temperature for a number of crops. However, we cannot draw any deeper conclusions by comparing our numbers due to the following methodological differences: 1) Comparing the two estimates for yield impacts at ~1ºC assumes that the years in each sample feature comparable extremes. But since the relative contribution of moderate versus extreme warm days to the mean anomaly is unknown, the two samples cannot be assumed to reflect similarly extreme weather perturbations to the crops. 2) Estimated sensitivities in Lobell and Field (2007) are based on linear regressions covering a temperature domain of about - 1 to 1ºC in which the vast majority of points fall below 1ºC. The sensitivities based on these linear regressions cannot be assumed to be powerful at and above the maximum temperature in their domain, and therefore may not reflect the level of extremes that we examine. 3) Our estimates are based on globally- averaged national weather and crop data, while those in Lobell and Field (2007) are based on data averaged across cropping regions that transcend borders 36. Because the estimates are made over differing spatial WWW.NATURE.COM/NATURE 6

scales, they may not reflect variation in yield and weather similarly. The disasters in our study do not include sub- national spatial data, so we cannot at present solve this discrepancy by spatially disaggregating. Number of disasters per year. The EM- DAT database is based on a compilation of disaster reports gathered from various organizations including United Nations agencies, governments, and the International Federation of Red Cross and Red Crescent Societies. A time- series of reported disasters per year (Extended Data Figure 4) exhibits an upward trend. This is likely a result primarily of an improvement in the completeness of disaster reporting over time, but the trend may also partially reflect greater incidence of disasters in recent decades. Assuming that incomplete disaster reporting in earlier decades was independent of disaster impact severity, our estimates of cereal crop sensitivities to EWDs are robust to this trend because we quantify them on a per- disaster basis. But we likely underestimate the total global cereal production lost to these disasters because we aggregated losses over an incomplete set of disasters. WWW.NATURE.COM/NATURE 7

References 31. Adams J. B., M. E. Mann, and C. M. Ammann. 2003. Proxy evidence for an El Niño- like response to volcanic forcing. Nature 426: 274-278. 32. Lühr, H., M. Rother, T. Iyemori, T. L. Hansen, and R. P. Lepping. 1998. Superposed epoch analysis applied to large- amplitude travelling convection vortices. Annales Geophysicae 16(7): 743-753. 33. Mass, C. F., and D. A. Portman. 1989. Major volcanic eruptions and climate: A critical evaluation. J. Climate 2: 566-593. 34. Efron, B., and C. Stein. 1981. The jackknife estimate of variance. The Annals of Statistics: 586-596. 35. Harris, I., P. D. Jones, T. J. Osborn, and D. H. Lister. 2014. Updated high- resolution grids of monthly climatic observations the CRU TS3.10 Dataset. Int. J. Climatol. 34: 623 642. 36. Lobell, D.B, and C. B. Field. 2007. Global scale climate- crop yield relationships and the impact of recent warming. Environmental Research Letters 2(1). WWW.NATURE.COM/NATURE 8