Comparison of cross-validation and bagging for building a seasonal runoff forecast model
Simon Schick, Ole Rössler, Rolf Weingartner
University of Bern, Switzerland
Institute of Geography
Oeschger Centre for Climate Change Research
Why resampling?
- training, selection, testing (Hastie et al., 2009)
- small sample sizes
- weak relationships and a large pool of candidate predictors
- Can we benefit from the models out of resampling?
Comparison of 2 approaches
- best guess (BGS): use all data points; testing by leave-one-out cross-validation (LOO)
- bagged (BAG): bagging, i.e. bootstrap aggregating (Breiman, 1996); testing by out-of-bag predictions (OOB)
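The two testing schemes can be illustrated with a minimal sketch. It assumes a single predictor and ordinary least squares instead of the screened PLS regression used in the talk; the function names (`fit_line`, `loo_predictions`, `oob_predictions`, `mse`) are hypothetical and chosen only for this illustration.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0.0:  # degenerate resample: fall back to the mean
        return my, 0.0
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def loo_predictions(xs, ys):
    """LOO testing: refit without point k, then predict point k."""
    preds = []
    for k in range(len(xs)):
        a, b = fit_line(xs[:k] + xs[k + 1:], ys[:k] + ys[k + 1:])
        preds.append(a + b * xs[k])
    return preds


def oob_predictions(xs, ys, n_boot=200, seed=0):
    """OOB testing: for each point, average the predictions of all
    bootstrap models whose resample did not contain that point."""
    rng = random.Random(seed)
    n = len(xs)
    sums, counts = [0.0] * n, [0] * n
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # draw with replacement
        a, b = fit_line([xs[i] for i in idx], [ys[i] for i in idx])
        in_bag = set(idx)
        for k in range(n):
            if k not in in_bag:
                sums[k] += a + b * xs[k]
                counts[k] += 1
    # None marks a point that was never out of bag
    return [s / c if c else None for s, c in zip(sums, counts)]


def mse(ys, preds):
    """Mean squared error over the points that received a prediction."""
    pairs = [(y, p) for y, p in zip(ys, preds) if p is not None]
    return sum((y - p) ** 2 for y, p in pairs) / len(pairs)
```

Both functions return one held-out prediction per data point, so the resulting error estimates can be compared catchment by catchment.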
Regression
$$Y = \beta_0 + \beta_1 x_Q + \beta_2 x_P + \beta_3 x_T + \varepsilon$$
- $Y_{i,j}$: mean runoff for $i = 30, 60, 90$ days, starting with the 1st and 16th of every month ($j = 1, \dots, 24$); centered and scaled
- $x$: initial conditions, parametrized by runoff Q, precipitation P, and temperature T; univariate screening for 10, 20, ..., 720 days
- $\beta$: estimation by partial least squares; at least the first PLS direction is selected
- $\varepsilon$: residuals
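The slide states that at least the first PLS direction is selected. As a rough sketch of what a one-component PLS fit does for a single response (PLS1), assuming predictors and response are already centered and scaled: the weight vector is proportional to X^T y, the score is t = Xw, and y is regressed on t. The helper names `standardize` and `pls1_first_component` are hypothetical; the talk itself cites the R pls package for the actual computation.

```python
def standardize(col):
    """Center a column and scale it to unit sample standard deviation."""
    n = len(col)
    m = sum(col) / n
    sd = (sum((v - m) ** 2 for v in col) / (n - 1)) ** 0.5
    return [(v - m) / sd for v in col]


def pls1_first_component(x_cols, y):
    """One-component PLS1 fit.

    x_cols: list of predictor columns (e.g. x_Q, x_P, x_T), standardized.
    y:      response, standardized.
    Returns one regression coefficient per predictor column.
    """
    # Weight vector w proportional to X^T y, normalized to unit length
    w = [sum(x * yi for x, yi in zip(col, y)) for col in x_cols]
    norm = sum(v * v for v in w) ** 0.5
    w = [v / norm for v in w]
    # Scores t = X w
    n = len(y)
    t = [sum(w[j] * x_cols[j][i] for j in range(len(x_cols))) for i in range(n)]
    # Regress y on the score t, then map back to the predictors
    q = sum(ti * yi for ti, yi in zip(t, y)) / sum(ti * ti for ti in t)
    return [q * wj for wj in w]
```

With a single predictor this reduces to ordinary least squares; with two identical predictors the coefficient is split evenly between them, which shows how PLS handles collinear inputs.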
- 32 years
- 66 catchments (no nesting)
- P and T from the E-OBS gridded data set (Haylock et al., 2008)
Hindcast experiment
Steps: training, selection, testing (Hastie et al., 2009)
- best guess (BGS): leave one out, predictor screening, cross-validation, partial least squares; testing: LOO
- bagged (BAG): bootstrap, predictor screening, cross-validation, partial least squares; testing: OOB
- seasonal regime (SRG): leave one out, empirical mean; testing: LOO
- outer cross-validation (8 folds)
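A minimal sketch of the outer cross-validation, shown for the SRG benchmark only because its prediction is simply the empirical mean of the training years. The contiguous fold layout and the names `k_folds` and `srg_outer_cv_mse` are assumptions made for this illustration; the slides do not say how the 8 folds were formed.

```python
def k_folds(n, k):
    """Split indices 0..n-1 into k contiguous folds of near-equal size."""
    base, extra = divmod(n, k)
    folds, start = [], 0
    for f in range(k):
        size = base + (1 if f < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def srg_outer_cv_mse(y, k=8):
    """Outer k-fold estimate of the mean squared prediction error when
    each held-out year is predicted by the mean of the training years."""
    n = len(y)
    squared_errors = []
    for test in k_folds(n, k):
        held_out = set(test)
        train = [y[i] for i in range(n) if i not in held_out]
        pred = sum(train) / len(train)  # empirical mean of training years
        squared_errors.extend((y[i] - pred) ** 2 for i in test)
    return sum(squared_errors) / n
```

With 32 years and 8 folds, each fold holds out 4 years. BGS and BAG would plug into the same loop, with their own training and selection steps run on the training years only.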
Figure: mean squared error of prediction (n=66)
Mean squared error of prediction
H0: The less complex model performs equal or even better than the more complex model.
H1: The more complex model performs better.
Table: p-values of right-sided t-test using paired differences of Ê_MSP (only outer cross-validation)
             Y30      Y60      Y90
SRG - BGS    0.21     >0.99    >0.99
SRG - BAG    <0.01    0.03     0.67
BGS - BAG    <0.01    <0.01    <0.01
Mean squared error of prediction
How much does bagging reduce Ê_MSP?
Table: Ê_MSP and reduction (only outer cross-validation); columns Y30, Y60, Y90; rows BGS, BAG, reduction in %
- LOO and OOB provide, on average, accurate estimates of prediction error.
- SRG outperforms BGS and BAG in many catchments.
- Most likely, BAG outperforms BGS on average.
- Error reduction is strongest when BGS has low or no skill.
- Suboptimal: the comparison rests on resampling.
Data sources
- runoff series and catchment boundaries: Landesanstalt für Umwelt, Messungen und Naturschutz Baden-Württemberg; Bayerisches Landesamt für Umwelt; Land Vorarlberg (data.vorarlberg.gv.at); Bundesministerium für Land- und Forstwirtschaft, Umwelt und Wasserwirtschaft Österreich; Schweizerisches Bundesamt für Umwelt
- precipitation and temperature series: E-OBS data set (EU-FP6 project ENSEMBLES, ensembles-eu.metoffice.com, and ECA&D project, ecad.eu)
- digital elevation model: EU-DEM, produced using Copernicus data and information funded by the European Union
- Bernhard Wehren made available additional runoff data for the river Kander at Hondrich
References
- Breiman, L.: Bagging Predictors. Machine Learning, 24.2, 1996.
- Garen, D. C.: Improved techniques in regression-based streamflow volume forecasting. Journal of Water Resources Planning and Management, 118.6.
- Hastie, T., Tibshirani, R., and Friedman, J.: The Elements of Statistical Learning. Data Mining, Inference, and Prediction. Second Edition. Springer New York Inc., 2009.
- Haylock, M. R., Hofstra, N., Klein Tank, A. M. G., Klok, E. J., Jones, P. D., and New, M.: A European daily high-resolution gridded data set of surface temperature and precipitation for …. Journal of Geophysical Research: Atmospheres, 113, D20119, 2008.
- Mevik, B.-H., and Wehrens, R.: The pls Package: Principal Component and Partial Least Squares Regression in R. Journal of Statistical Software, 18, 2.
experimental forecasts:
Ê_MSP: mean squared error of prediction
H0: The less complex model performs equal or even better than the more complex model.
H1: The more complex model performs better.
Table: p-values of right-sided t-test (nonparametric bootstrap in parentheses) using paired differences of Ê_MSP (only outer cross-validation)
             Y30              Y60              Y90
SRG - BGS    0.21 (0.21)      >0.99 (>0.99)    >0.99 (>0.99)
SRG - BAG    <0.01 (<0.01)    0.03 (0.02)      0.67 (0.69)
BGS - BAG    <0.01 (<0.01)    <0.01 (<0.01)    <0.01 (<0.01)
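The slides do not say how the bootstrap p-values were computed. One common choice, sketched here as an assumption, is a bootstrap t-test on the paired differences: center the differences so that H0 holds, resample them with replacement, and count how often the resampled t statistic reaches the observed one. The names `t_stat` and `bootstrap_p_right` are hypothetical.

```python
import random
from statistics import mean, stdev


def t_stat(d):
    """One-sample t statistic of paired differences d against zero."""
    return mean(d) / (stdev(d) / len(d) ** 0.5)


def bootstrap_p_right(d, n_boot=2000, seed=0):
    """Right-sided bootstrap p-value for H1: mean(d) > 0.

    d holds one difference per catchment, e.g. the error of the less
    complex model minus the error of the more complex model.
    """
    rng = random.Random(seed)
    t_obs = t_stat(d)
    m = mean(d)
    centered = [v - m for v in d]  # impose H0: mean difference is zero
    n = len(d)
    hits = valid = 0
    for _ in range(n_boot):
        sample = [centered[rng.randrange(n)] for _ in range(n)]
        if stdev(sample) == 0:  # skip degenerate resamples
            continue
        valid += 1
        if t_stat(sample) >= t_obs:
            hits += 1
    return hits / valid
```

A small p-value supports H1, meaning the more complex model has the lower prediction error on average across catchments.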
Figure: Ê_MAP, mean absolute error of prediction (n=66)
Ê_MAP: mean absolute error of prediction
H0: The less complex model performs equal or even better than the more complex model.
H1: The more complex model performs better.
Table: p-values of right-sided t-test (nonparametric bootstrap in parentheses) using paired differences of Ê_MAP (only outer cross-validation)
             Y30              Y60              Y90
SRG - BGS    0.02 (0.01)      0.92 (0.92)      >0.99 (>0.99)
SRG - BAG    <0.01 (<0.01)    <0.01 (<0.01)    0.25 (0.25)
BGS - BAG    <0.01 (<0.01)    <0.01 (<0.01)    <0.01 (<0.01)
Figure: Nash-Sutcliffe efficiency (n=66; six outliers in [-3.7, -1.8] are not shown for readability)
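For reference, a short sketch of the Nash-Sutcliffe efficiency in its standard form, which uses the observed mean as the reference forecast. Whether the talk used exactly this reference is an assumption; the function name `nash_sutcliffe` is hypothetical.

```python
def nash_sutcliffe(obs, pred):
    """Nash-Sutcliffe efficiency: 1 - SSE / sum of squares around the
    observed mean. 1 is a perfect forecast, 0 matches the observed mean,
    and negative values are worse than the observed mean."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst
```

Negative values such as the outliers mentioned in the caption arise when the squared forecast error exceeds the variance of the observations.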
Figures: example catchments (Y60)
- alpine: Landwasser, Davos
- lake: Aabach, Hitzkirch
- lowland: Töss, Neftenbach
- regulated: Julia, Tiefencastel
More informationSwiss Brown Swiss in different environments: Does GxE play an important role? Beat Bapst Qualitas AG, Switzerland
Swiss Brown Swiss in different environments: Does GxE play an important role? Beat Bapst Qualitas AG, Switzerland 07.04.2016 World Brown Swiss Congress, Mende Introduction/Background Brown Swiss Dairy
More informationExtraversion. The Extraversion factor reliability is 0.90 and the trait scale reliabilities range from 0.70 to 0.81.
MSP RESEARCH NOTE B5PQ Reliability and Validity This research note describes the reliability and validity of the B5PQ. Evidence for the reliability and validity of is presented against some of the key
More informationEnsemble based probabilistic forecasting of meteorology and air quality in Oslo, Norway
Ensemble based probabilistic forecasting of meteorology and air quality in Oslo, Norway Sam Erik Walker, Bruce Rolstad Denby, Núria Castell NILU Norwegian Institute for Air Research 21 August 2014 World
More informationUNCERTAINTY, HEURISTICS AND INJURY PREDICTION
UNCERTAINTY, HEURISTICS AND INJURY PREDICTION Written by Mladen Jovanovic, Serbia Predicting injuries in high-performance sports is of great importance for both players and clubs, but also for fans Having
More informationImportance of factors contributing to work-related stress: comparison of four metrics
Importance of factors contributing to work-related stress: comparison of four metrics Mounia N. Hocine, Natalia Feropontova, Ndèye Niang, Karim Aït-Bouziad, Gilbert Saporta Conservatoire national des arts
More informationNORTH DAKOTA 2011 FLOOD EVENT
NORTH DAKOTA 2011 FLOOD EVENT OUTLINE Missouri River Basin Geography Generic Reservoir Operations Weather and Climate 2011 Basin-wide Hydrology Missouri River Timeline Missouri River Damages Mouse River
More informationStatistics 571: Statistical Methods Summer 2003 Final Exam Ramón V. León
Name: Statistics 571: Statistical Methods Summer 2003 Final Exam Ramón V. León This exam is closed-book and closed-notes. However, you can use up to twenty pages of personal notes as an aid in answering
More informationQUANTIFYING CEREBRAL CONTRIBUTIONS TO PAIN 1
QUANTIFYING CEREBRAL CONTRIBUTIONS TO PAIN 1 Supplementary Figure 1. Overview of the SIIPS1 development. The development of the SIIPS1 consisted of individual- and group-level analysis steps. 1) Individual-person
More informationSurvival Prediction Models for Estimating the Benefit of Post-Operative Radiation Therapy for Gallbladder Cancer and Lung Cancer
Survival Prediction Models for Estimating the Benefit of Post-Operative Radiation Therapy for Gallbladder Cancer and Lung Cancer Jayashree Kalpathy-Cramer PhD 1, William Hersh, MD 1, Jong Song Kim, PhD
More informationMigratory Bird classification and analysis Aparna Pal
Migratory Bird classification and analysis Aparna Pal apal4@wisc.edu Abstract The use of classification vectors to classify land and seabirds act as a first step to pattern classification of migratory
More informationNPTEL Project. Econometric Modelling. Module 14: Heteroscedasticity Problem. Module 16: Heteroscedasticity Problem. Vinod Gupta School of Management
1 P age NPTEL Project Econometric Modelling Vinod Gupta School of Management Module 14: Heteroscedasticity Problem Module 16: Heteroscedasticity Problem Rudra P. Pradhan Vinod Gupta School of Management
More information