Prediction and Inference under Competing Risks in High Dimension - An EHR Demonstration Project for Prostate Cancer

Size: px
Start display at page:

Download "Prediction and Inference under Competing Risks in High Dimension - An EHR Demonstration Project for Prostate Cancer"

Transcription

1 Prediction and Inference under Competing Risks in High Dimension - An EHR Demonstration Project for Prostate Cancer Ronghui (Lily) Xu Division of Biostatistics and Bioinformatics Department of Family Medicine and Public Health and Department of Mathematics University of California, San Diego rxu@ucsd.edu This is joint work with: Jiayi Hou, Ph.D., Anthony Paravati, M.D., James Murphy, M.D., MPH; and Jue (Marquis) Hou, Jelena Bradic, Ph.D. R. Xu (UCSD) Competing Risks in High Dimension 1 / 30

2 Linked SEER-Medicare database As an illustration project of how information contained in patients electronic medical records (EMR) can be harvested for the purposes of precision medicine, we consider the large data set linking: the Surveillance, Epidemiology and End Results (SEER) Program database of the National Cancer Institute with the federal health insurance program Medicare database for prostate cancer patients of age 65 or older. R. Xu (UCSD) Competing Risks in High Dimension 2 / 30

3 Cancer and non-cancer mortality Clinical guidelines for the management of prostate cancer based on two factors: an estimation of the aggressiveness of a patient s tumor; and estimation of a patient s overall life expectancy. Currently no tool exists to predict non-cancer survival for this patient population. Meanwhile non-cancer and cancer survival are not independent, i.e. competing risks, a comprehensive prediction tool should consider both types of risks simultaneously. R. Xu (UCSD) Competing Risks in High Dimension 3 / 30

4 Data We restrict our analysis to patients diagnosed between 2004 and 2009 in the SEER-Medicare database. After excluding additional patients with missing clinical records, we have a total of 57,011 patients 7 relevant clinical variables (age, PSA, Gleason score, AJCC stage, and AJCC stage T, N, M, respectively) 5 demographical variables (race, marital status, metro, registry and year of diagnosis) 8971 binary insurance claim codes, capturing medical diagnoses, procedures, hospitalization and outpatient activities, etc. 1 if the procedure/activity present, and 0 if absent. The clinical and demographic information are at the time of diagnosis, and insurance claims data during the year prior to diagnosis, so that survival prediction would occur at the time of diagnosis R. Xu (UCSD) Competing Risks in High Dimension 4 / 30

5 Data (cont d) Until December 2013 (end of follow-up) there were a total of 1,247 deaths due to cancer 5,221 deaths unrelated to cancer. As the number of events dictates the effective sample size, we have more predictors than the effective sample size. R. Xu (UCSD) Competing Risks in High Dimension 5 / 30

6 High dimensional problem Different approaches have been proposed to analyze survival data with high dimensional covariates: LASSO under the Cox model (Tibshirani 1997), adaptive LASSO under the Cox model (Zhang and Lu 2007), random forest for right-censoring data (Hothorn et al. 2006), non-concave penalized methods under ultra high dimension (Bradic et al. 2011). But very few high-dimensional methods have been developed in the presence of competing risks: Binder et al. (2009) first proposed a boosting approach for fitting the proportional subdistribution hazards model, Fu et al. (2017) considered penalized approaches under the same model (requires p n). R. Xu (UCSD) Competing Risks in High Dimension 6 / 30

7 The models Denote the observed data X i = min(t i, C i ), δ i = I(T i C i ), ɛ i = 1, 2 is the cause or type, Z i is a vector of covariates, i = 1,..., n. We are interested in the cumulative incidence function (CIF) F j (t Z) = P (T t, ɛ = j Z) for failure type j. The proportional cause-specific hazards (PCSH) model λ j (t Z) = λ 0j (t) exp(β jz), (1) for j = 1, 2, where 1 λ j (t) = lim t 0+ tp (t T < t + t, J = j T t) is the cause-specific hazard function. Then F j (t Z) = t 0 λ j(u Z)S(u Z)du, where S(t Z) = P (T > t Z) is the overall survival function. R. Xu (UCSD) Competing Risks in High Dimension 7 / 30

8 The models (cont d) The proportional subdistribution hazards (PSDH, Fine and Gray 1999) model λ 1 (t Z) = λ 0 (t) exp(β Z). (2) where λ 1 (t Z) = d dt log{1 F 1(t Z)}. Obviously F 1 (t Z) is readily estimated after fitting the PSDH model for cause 1 only. R. Xu (UCSD) Competing Risks in High Dimension 8 / 30

9 High-d algorithm simulation: PCSH model with LASSO (1) Figure: Continuous covariates: n = 500, the three columns correspond to p = 20, 500, and The rows are: 1) CV10, 2) CV+1SE, 3) minimum AIC, 4) minimum BIC, 5) elbow AIC and 6) elbow BIC. The true CIF 1 (2 z 0 ) = Ind Exch AR1 Oracle R. Xu (UCSD) Competing Risks in High Dimension 9 / 30

10 High-d algorithm simulation: PCSH model with LASSO (2) Figure: Balanced binary covariates: n = 500, the three columns correspond to p = 20, 500, and The rows are: 1) CV10, 2) CV+1SE, 3) minimum AIC, 4) minimum BIC, 5) elbow AIC and 6) elbow BIC. The true CIF 1 (2 z 0 ) = See Supplement See Supplement Ind Exch AR1 Oracle See Supplement See Supplement R. Xu (UCSD) Competing Risks in High Dimension 10 / 30

11 High-d algorithm simulation: PCSH model with LASSO (3) Figure: Sparse binary covariates: n = 500, the three columns correspond to p = 20, 500, and The rows are: 1) CV10, 2) CV+1SE, 3) minimum AIC, 4) minimum BIC, 5) elbow AIC and 6) elbow BIC. The true CIF 1 (2 z 0 ) = See Supplement See Supplement Ind Exch AR1 Oracle See Supplement See Supplement See Supplement See Supplement R. Xu (UCSD) Competing Risks in High Dimension 11 / 30

12 High-d algorithm simulation: PSDH with boosting (1) Figure: Continuous covariates: n = 500, the three columns correspond to p = 20, 500, and The rows are: 1) CV10, 2) minimum AIC, 3) minimum BIC, 4) elbow AIC and 5) elbow BIC. The true CIF 1 (2 z 0 ) = Ind Exch AR1 Oracle R. Xu (UCSD) Competing Risks in High Dimension 12 / 30

13 High-d algorithm simulation: PSDH with boosting (2) Figure: Balanced binary covariates: n = 500, the three columns correspond to p = 20, 500, and The rows are: 1) CV10, 2) minimum AIC, 3) minimum BIC, 4) elbow AIC and 5) elbow BIC. The true CIF 1 (2 z 0 ) = Ind Exch AR1 Oracle R. Xu (UCSD) Competing Risks in High Dimension 13 / 30

14 High-d algorithm simulation: PSDH with boosting (3) Figure: Sparse binary covariates: n = 500, the three columns correspond to p = 20, 500, and The rows are: 1) CV10, 2) minimum AIC, 3) minimum BIC, 4) elbow AIC and 5) elbow BIC. The true CIF 1 (2 z 0 ) = Ind Exch AR1 Oracle R. Xu (UCSD) Competing Risks in High Dimension 14 / 30

15 Apply to SEER-Medicare data: PCSH model with LASSO Random split into training (n = 28, 505) and test (n = 28, 506) datasets. On training data: p 1 = 2188 and p 2 = 1079 claim codes for non-cancer and cancer mortality after univariate screening; CV10 was used to choose the penalty parameter; final model contained 143 predictors for non-cancer mortality, and 9 predictors for cancer mortality. On test data: For each mortality type j = 1, 2, we divided into 4 risk strata: low (L), median low (ML), median high (MH) and high (H) by quartiles; this gives 16 strata for the combined two types of mortalities; after plotting the predicted CIF s for the 16 strata, 5 distinct risk groups emerged for both cancer and non-cancer mortalities. R. Xu (UCSD) Competing Risks in High Dimension 15 / 30

16 SEER-Medicare results: PCSH model with LASSO R. Xu (UCSD) Competing Risks in High Dimension 16 / 30

17 SEER-Medicare results: PCSH model with LASSO Risk Groups n # events Cause-Specific Non-cancer mortality: Non-cancer: Cancer: L L L, ML, MH, H ML ML L, ML, MH, H M MH L, ML, MH, H MH H L, ML, MH H H H Cancer mortality: Non-cancer: Cancer: L L, ML, MH, H L ML L, ML, MH, H ML M L, ML, MH, H MH MH L, ML, MH H H H H R. Xu (UCSD) Competing Risks in High Dimension 17 / 30

18 SEER-Medicare results: PCSH model with LASSO R. Xu (UCSD) Competing Risks in High Dimension 18 / 30

19 Apply to SEER-Medicare data: PSDH model with boosting Same training (n = 28, 505) and test (n = 28, 506) datasets. On training data: top 2000 claim codes (ranked by p-value) for each type of mortality (p 1 = 4634 and p 2 = 6088 after univariate screening); AIC was used, because it was a close second to CV10 and less computationally intensive; final model contained 53 predictors for non-cancer mortality, and 13 predictors for cancer mortality. On test data: For each mortality type j = 1, 2, we divided into 5 risk strata: low (L), median low (ML), median (M), median high (MH) and high (H) by quintiles. R. Xu (UCSD) Competing Risks in High Dimension 19 / 30

20 SEER-Medicare results: PSDH model with boosting R. Xu (UCSD) Competing Risks in High Dimension 20 / 30

21 Prediction, variable selection, and inference The field of high dimensional statistical theory has been advancing (Bühlmann, Peter and van de Geer, 2011). Consistent prediction can be relatively easily achieved (eg. LASSO can be consistent for estimating the underlying regression function); Consistent variable selection needs a beta-min condition (i.e. all non-zero coefficients are sufficiently large), in addition to the restrictive irrepresentable condition (cannot have too strongly correlated design); Recent developments by Zhang and Zhang (2014), van de Geer et al. (2014) allows confidence intervals, in the presence of many small non-zero coefficients. R. Xu (UCSD) Competing Risks in High Dimension 21 / 30

22 One-step estimator Consider the high-dimensional PSDH model where Z(t) may be time-dependent. λ 1 (t Z(t)) = λ 0 (t) exp{β Z(t)}. (3) The modified log partial likelihood that gives rise to the estimating equation for β (in low dim.) is n t n m(β) = n 1 β Z i (t) log ω j (t)y j (t)e β Z j (t) dn i o (t), i=1 0 where Ni o (t) = I(X i t, δ iɛ i = 1), ω i(t)y i(t) = r i(t)y i(t) Ĝ(t)/Ĝ(t Xi), G(t) = P (C > t) and r i(t)y i(t) = I(t X i) + I(t > X i)i(δ iɛ i > 1). j=1 (4) R. Xu (UCSD) Competing Risks in High Dimension 22 / 30

23 Initial LASSO estimator Following Zhang and Zhang (2014), van de Geer et al. (2014), let the initial estimator be } β(λ) = argmin { m(β) + λ β 1. (5) β Oracle inequality = for the regularization parameter λ choosen to be of the order log(n) log(p)/n, we have β β o 1 = O p (s o log(n) ) log(p)/n. R. Xu (UCSD) Competing Risks in High Dimension 23 / 30

24 One-step estimator (cont d) The one-step bias-corrected estimator is b := β + Θṁ( β), (6) where Θ is such that m(β o ) Θ I p, and is obtained by a version of nodewise LASSO based on the score vectors instead of the design matrix (Zhang and Zhang, 2014). R. Xu (UCSD) Competing Risks in High Dimension 24 / 30

25 Confidence intervals We can show that b β o = Θṁ(β o ) + o p(1), and that for any c R p such that c 1 = 1, c ṁ(β o ) d N(0, c Vc). An 1 α confidence interval for c β o is (c b Z1 α/2 c Θ V Θ c/n, c b ) + Z1 α/2 c Θ V Θ c/n. (7) R. Xu (UCSD) Competing Risks in High Dimension 25 / 30

26 Simulation results For cause 1, s o = 2. True Mean Est SD SE SE corrected Coverage Level/Power n=200, p=1000 β 1, β 1, β 1, n=500, p=1000 β 1, β 1, β 1, R. Xu (UCSD) Competing Risks in High Dimension 26 / 30

27 References 1 Hou J, Paravati A, Xu R, Murphy J. High-Dimensional Variable Selection and Prediction under Competing Risks with Application to SEER-Medicare Linked Data. 2 Hou J, Bradic J, Xu R, Fine-Gray Competing Risks Model with High Dimensional Covariates: Estimation and Inference. R. Xu (UCSD) Competing Risks in High Dimension 27 / 30

28 High dimensional algorithms: LASSO Tibshirani s (1997) LASSO can be applied to the PCSH model, as standard Cox regression software is used in low dimensions. Selection of penalty parameter: CV10: minimum 10-fold cross-validated (CV) negative predictive log partial likelihood (referred to as error in the following); CV+1SE: minimum 10-fold CV error plus one standard error of the CV estimated errors; min AIC / BIC; elbow AIC / BIC (Tibshirani et al. 2001). Under the Cox model AIC = 2 log(l) + 2s, where L is the partial likelihood and s = S( β) is the size of the active set (Verweij and van Houwelingen, 1993; Xu et al. 2009); BIC = 2 log(l) + 2s log(k), where k is the number of observed events (Volinsky and Raftery, 2000) from the cause of interest. R. Xu (UCSD) Competing Risks in High Dimension 28 / 30

29 LASSO (cont d) Fu et al. (R package crrp ) contains LASSO for the PSDH model, unfortunately was not able to apply to the linked SEER-Medicare data as it ran out of memory. (This becomes a challenge for the initial estimator in our 2nd paper.) R. Xu (UCSD) Competing Risks in High Dimension 29 / 30

30 High dimensional algorithms: boosting Binder et al. (2009) applied a likelihood based boosting approach to the weighted partial likelihood under the PSDH model. Selection of optimal number of boosting steps: CV10; min AIC / BIC; elbow AIC / BIC. Definitions of the criteria are similar to those under the PCSH model above, with the likelihood returned by the crr() function in R package cmprsk. R. Xu (UCSD) Competing Risks in High Dimension 30 / 30

Machine Learning to Inform Breast Cancer Post-Recovery Surveillance

Machine Learning to Inform Breast Cancer Post-Recovery Surveillance Machine Learning to Inform Breast Cancer Post-Recovery Surveillance Final Project Report CS 229 Autumn 2017 Category: Life Sciences Maxwell Allman (mallman) Lin Fan (linfan) Jamie Kang (kangjh) 1 Introduction

More information

Modelling Spatially Correlated Survival Data for Individuals with Multiple Cancers

Modelling Spatially Correlated Survival Data for Individuals with Multiple Cancers Modelling Spatially Correlated Survival Data for Individuals with Multiple Cancers Dipak K. Dey, Ulysses Diva and Sudipto Banerjee Department of Statistics University of Connecticut, Storrs. March 16,

More information

Statistical Methods for the Evaluation of Treatment Effectiveness in the Presence of Competing Risks

Statistical Methods for the Evaluation of Treatment Effectiveness in the Presence of Competing Risks Statistical Methods for the Evaluation of Treatment Effectiveness in the Presence of Competing Risks Ravi Varadhan Assistant Professor (On Behalf of the Johns Hopkins DEcIDE Center) Center on Aging and

More information

Analysis of Rheumatoid Arthritis Data using Logistic Regression and Penalized Approach

Analysis of Rheumatoid Arthritis Data using Logistic Regression and Penalized Approach University of South Florida Scholar Commons Graduate Theses and Dissertations Graduate School November 2015 Analysis of Rheumatoid Arthritis Data using Logistic Regression and Penalized Approach Wei Chen

More information

Part [2.1]: Evaluation of Markers for Treatment Selection Linking Clinical and Statistical Goals

Part [2.1]: Evaluation of Markers for Treatment Selection Linking Clinical and Statistical Goals Part [2.1]: Evaluation of Markers for Treatment Selection Linking Clinical and Statistical Goals Patrick J. Heagerty Department of Biostatistics University of Washington 174 Biomarkers Session Outline

More information

Computer Age Statistical Inference. Algorithms, Evidence, and Data Science. BRADLEY EFRON Stanford University, California

Computer Age Statistical Inference. Algorithms, Evidence, and Data Science. BRADLEY EFRON Stanford University, California Computer Age Statistical Inference Algorithms, Evidence, and Data Science BRADLEY EFRON Stanford University, California TREVOR HASTIE Stanford University, California ggf CAMBRIDGE UNIVERSITY PRESS Preface

More information

Selection and Combination of Markers for Prediction

Selection and Combination of Markers for Prediction Selection and Combination of Markers for Prediction NACC Data and Methods Meeting September, 2010 Baojiang Chen, PhD Sarah Monsell, MS Xiao-Hua Andrew Zhou, PhD Overview 1. Research motivation 2. Describe

More information

UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Midterm, 2016

UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Midterm, 2016 UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Midterm, 2016 Exam policy: This exam allows one one-page, two-sided cheat sheet; No other materials. Time: 80 minutes. Be sure to write your name and

More information

Sarwar Islam Mozumder 1, Mark J Rutherford 1 & Paul C Lambert 1, Stata London Users Group Meeting

Sarwar Islam Mozumder 1, Mark J Rutherford 1 & Paul C Lambert 1, Stata London Users Group Meeting 2016 Stata London Users Group Meeting stpm2cr: A Stata module for direct likelihood inference on the cause-specific cumulative incidence function within the flexible parametric modelling framework Sarwar

More information

Dynamic prediction using joint models for recurrent and terminal events: Evolution after a breast cancer

Dynamic prediction using joint models for recurrent and terminal events: Evolution after a breast cancer Dynamic prediction using joint models for recurrent and terminal events: Evolution after a breast cancer A. Mauguen, B. Rachet, S. Mathoulin-Pélissier, S. Siesling, G. MacGrogan, A. Laurent, V. Rondeau

More information

University of California, Berkeley

University of California, Berkeley University of California, Berkeley U.C. Berkeley Division of Biostatistics Working Paper Series Year 2007 Paper 221 Biomarker Discovery Using Targeted Maximum Likelihood Estimation: Application to the

More information

Application of EM Algorithm to Mixture Cure Model for Grouped Relative Survival Data

Application of EM Algorithm to Mixture Cure Model for Grouped Relative Survival Data Journal of Data Science 5(2007), 41-51 Application of EM Algorithm to Mixture Cure Model for Grouped Relative Survival Data Binbing Yu 1 and Ram C. Tiwari 2 1 Information Management Services, Inc. and

More information

Chapter 11: Advanced Remedial Measures. Weighted Least Squares (WLS)

Chapter 11: Advanced Remedial Measures. Weighted Least Squares (WLS) Chapter : Advanced Remedial Measures Weighted Least Squares (WLS) When the error variance appears nonconstant, a transformation (of Y and/or X) is a quick remedy. But it may not solve the problem, or it

More information

Using dynamic prediction to inform the optimal intervention time for an abdominal aortic aneurysm screening programme

Using dynamic prediction to inform the optimal intervention time for an abdominal aortic aneurysm screening programme Using dynamic prediction to inform the optimal intervention time for an abdominal aortic aneurysm screening programme Michael Sweeting Cardiovascular Epidemiology Unit, University of Cambridge Friday 15th

More information

BIOINFORMATICS ORIGINAL PAPER

BIOINFORMATICS ORIGINAL PAPER BIOINFORMATICS ORIGINAL PAPER Vol. 21 no. 13 2005, pages 3001 3008 doi:10.1093/bioinformatics/bti422 Gene expression Penalized Cox regression analysis in the high-dimensional and low-sample size settings,

More information

What is Statistics? Ronghui (Lily) Xu (UCSD) An Introduction to Statistics 1 / 23

What is Statistics? Ronghui (Lily) Xu (UCSD) An Introduction to Statistics 1 / 23 What is Statistics? Some definitions of Statistics: The science of Statistics is essentially a branch of Applied Mathematics, and maybe regarded as mathematics applied to observational data. Fisher The

More information

Landmarking, immortal time bias and. Dynamic prediction

Landmarking, immortal time bias and. Dynamic prediction Landmarking and immortal time bias Landmarking and dynamic prediction Discussion Landmarking, immortal time bias and dynamic prediction Department of Medical Statistics and Bioinformatics Leiden University

More information

Score Tests of Normality in Bivariate Probit Models

Score Tests of Normality in Bivariate Probit Models Score Tests of Normality in Bivariate Probit Models Anthony Murphy Nuffield College, Oxford OX1 1NF, UK Abstract: A relatively simple and convenient score test of normality in the bivariate probit model

More information

Statistical Models for Censored Point Processes with Cure Rates

Statistical Models for Censored Point Processes with Cure Rates Statistical Models for Censored Point Processes with Cure Rates Jennifer Rogers MSD Seminar 2 November 2011 Outline Background and MESS Epilepsy MESS Exploratory Analysis Summary Statistics and Kaplan-Meier

More information

Model Selection Methods for Cancer Staging and Other Disease Stratification Problems. Yunzhi Lin

Model Selection Methods for Cancer Staging and Other Disease Stratification Problems. Yunzhi Lin Model Selection Methods for Cancer Staging and Other Disease Stratification Problems by Yunzhi Lin A dissertation submitted in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY

More information

Catherine A. Welch 1*, Séverine Sabia 1,2, Eric Brunner 1, Mika Kivimäki 1 and Martin J. Shipley 1

Catherine A. Welch 1*, Séverine Sabia 1,2, Eric Brunner 1, Mika Kivimäki 1 and Martin J. Shipley 1 Welch et al. BMC Medical Research Methodology (2018) 18:89 https://doi.org/10.1186/s12874-018-0548-0 RESEARCH ARTICLE Open Access Does pattern mixture modelling reduce bias due to informative attrition

More information

VARIABLE SELECTION WHEN CONFRONTED WITH MISSING DATA

VARIABLE SELECTION WHEN CONFRONTED WITH MISSING DATA VARIABLE SELECTION WHEN CONFRONTED WITH MISSING DATA by Melissa L. Ziegler B.S. Mathematics, Elizabethtown College, 2000 M.A. Statistics, University of Pittsburgh, 2002 Submitted to the Graduate Faculty

More information

Assessment of a disease screener by hierarchical all-subset selection using area under the receiver operating characteristic curves

Assessment of a disease screener by hierarchical all-subset selection using area under the receiver operating characteristic curves Research Article Received 8 June 2010, Accepted 15 February 2011 Published online 15 April 2011 in Wiley Online Library (wileyonlinelibrary.com) DOI: 10.1002/sim.4246 Assessment of a disease screener by

More information

A Bayesian Perspective on Unmeasured Confounding in Large Administrative Databases

A Bayesian Perspective on Unmeasured Confounding in Large Administrative Databases A Bayesian Perspective on Unmeasured Confounding in Large Administrative Databases Lawrence McCandless lmccandl@sfu.ca Faculty of Health Sciences, Simon Fraser University, Vancouver Canada Summer 2014

More information

Modelling prognostic capabilities of tumor size: application to colorectal cancer

Modelling prognostic capabilities of tumor size: application to colorectal cancer Session 3: Epidemiology and public health Modelling prognostic capabilities of tumor size: application to colorectal cancer Virginie Rondeau, INSERM Modelling prognostic capabilities of tumor size : application

More information

The Statistical Analysis of Failure Time Data

The Statistical Analysis of Failure Time Data The Statistical Analysis of Failure Time Data Second Edition JOHN D. KALBFLEISCH ROSS L. PRENTICE iwiley- 'INTERSCIENCE A JOHN WILEY & SONS, INC., PUBLICATION Contents Preface xi 1. Introduction 1 1.1

More information

Multivariate Multilevel Models

Multivariate Multilevel Models Multivariate Multilevel Models Getachew A. Dagne George W. Howe C. Hendricks Brown Funded by NIMH/NIDA 11/20/2014 (ISSG Seminar) 1 Outline What is Behavioral Social Interaction? Importance of studying

More information

Statistics 202: Data Mining. c Jonathan Taylor. Final review Based in part on slides from textbook, slides of Susan Holmes.

Statistics 202: Data Mining. c Jonathan Taylor. Final review Based in part on slides from textbook, slides of Susan Holmes. Final review Based in part on slides from textbook, slides of Susan Holmes December 5, 2012 1 / 1 Final review Overview Before Midterm General goals of data mining. Datatypes. Preprocessing & dimension

More information

Analyzing diastolic and systolic blood pressure individually or jointly?

Analyzing diastolic and systolic blood pressure individually or jointly? Analyzing diastolic and systolic blood pressure individually or jointly? Chenglin Ye a, Gary Foster a, Lisa Dolovich b, Lehana Thabane a,c a. Department of Clinical Epidemiology and Biostatistics, McMaster

More information

Ridge regression for risk prediction

Ridge regression for risk prediction Ridge regression for risk prediction with applications to genetic data Erika Cule and Maria De Iorio Imperial College London Department of Epidemiology and Biostatistics School of Public Health May 2012

More information

Treatment effect estimates adjusted for small-study effects via a limit meta-analysis

Treatment effect estimates adjusted for small-study effects via a limit meta-analysis Treatment effect estimates adjusted for small-study effects via a limit meta-analysis Gerta Rücker 1, James Carpenter 12, Guido Schwarzer 1 1 Institute of Medical Biometry and Medical Informatics, University

More information

Determinants and Status of Vaccination in Bangladesh

Determinants and Status of Vaccination in Bangladesh Dhaka Univ. J. Sci. 60(): 47-5, 0 (January) Determinants and Status of Vaccination in Bangladesh Institute of Statistical Research and Training (ISRT), University of Dhaka, Dhaka-000, Bangladesh Received

More information

Lecture Outline. Biost 590: Statistical Consulting. Stages of Scientific Studies. Scientific Method

Lecture Outline. Biost 590: Statistical Consulting. Stages of Scientific Studies. Scientific Method Biost 590: Statistical Consulting Statistical Classification of Scientific Studies; Approach to Consulting Lecture Outline Statistical Classification of Scientific Studies Statistical Tasks Approach to

More information

Estimating and comparing cancer progression risks under varying surveillance protocols: moving beyond the Tower of Babel

Estimating and comparing cancer progression risks under varying surveillance protocols: moving beyond the Tower of Babel Estimating and comparing cancer progression risks under varying surveillance protocols: moving beyond the Tower of Babel Jane Lange March 22, 2017 1 Acknowledgements Many thanks to the multiple project

More information

Part [1.0] Introduction to Development and Evaluation of Dynamic Predictions

Part [1.0] Introduction to Development and Evaluation of Dynamic Predictions Part [1.0] Introduction to Development and Evaluation of Dynamic Predictions A Bansal & PJ Heagerty Department of Biostatistics University of Washington 1 Biomarkers The Instructor(s) Patrick Heagerty

More information

BIOSTATISTICAL METHODS

BIOSTATISTICAL METHODS BIOSTATISTICAL METHODS FOR TRANSLATIONAL & CLINICAL RESEARCH PROPENSITY SCORE Confounding Definition: A situation in which the effect or association between an exposure (a predictor or risk factor) and

More information

Multivariate Regression with Small Samples: A Comparison of Estimation Methods W. Holmes Finch Maria E. Hernández Finch Ball State University

Multivariate Regression with Small Samples: A Comparison of Estimation Methods W. Holmes Finch Maria E. Hernández Finch Ball State University Multivariate Regression with Small Samples: A Comparison of Estimation Methods W. Holmes Finch Maria E. Hernández Finch Ball State University High dimensional multivariate data, where the number of variables

More information

EPI 200C Final, June 4 th, 2009 This exam includes 24 questions.

EPI 200C Final, June 4 th, 2009 This exam includes 24 questions. Greenland/Arah, Epi 200C Sp 2000 1 of 6 EPI 200C Final, June 4 th, 2009 This exam includes 24 questions. INSTRUCTIONS: Write all answers on the answer sheets supplied; PRINT YOUR NAME and STUDENT ID NUMBER

More information

Propensity Score Analysis to compare effects of radiation and surgery on survival time of lung cancer patients from National Cancer Registry (SEER)

Propensity Score Analysis to compare effects of radiation and surgery on survival time of lung cancer patients from National Cancer Registry (SEER) Propensity Score Analysis to compare effects of radiation and surgery on survival time of lung cancer patients from National Cancer Registry (SEER) Yan Wu Advisor: Robert Pruzek Epidemiology and Biostatistics

More information

CSE 255 Assignment 9

CSE 255 Assignment 9 CSE 255 Assignment 9 Alexander Asplund, William Fedus September 25, 2015 1 Introduction In this paper we train a logistic regression function for two forms of link prediction among a set of 244 suspected

More information

Spatiotemporal models for disease incidence data: a case study

Spatiotemporal models for disease incidence data: a case study Spatiotemporal models for disease incidence data: a case study Erik A. Sauleau 1,2, Monica Musio 3, Nicole Augustin 4 1 Medicine Faculty, University of Strasbourg, France 2 Haut-Rhin Cancer Registry 3

More information

MODEL SELECTION STRATEGIES. Tony Panzarella

MODEL SELECTION STRATEGIES. Tony Panzarella MODEL SELECTION STRATEGIES Tony Panzarella Lab Course March 20, 2014 2 Preamble Although focus will be on time-to-event data the same principles apply to other outcome data Lab Course March 20, 2014 3

More information

A Predictive Chronological Model of Multiple Clinical Observations T R A V I S G O O D W I N A N D S A N D A M. H A R A B A G I U

A Predictive Chronological Model of Multiple Clinical Observations T R A V I S G O O D W I N A N D S A N D A M. H A R A B A G I U A Predictive Chronological Model of Multiple Clinical Observations T R A V I S G O O D W I N A N D S A N D A M. H A R A B A G I U T H E U N I V E R S I T Y O F T E X A S A T D A L L A S H U M A N L A N

More information

Bayesian Model Averaging for Propensity Score Analysis

Bayesian Model Averaging for Propensity Score Analysis Multivariate Behavioral Research, 49:505 517, 2014 Copyright C Taylor & Francis Group, LLC ISSN: 0027-3171 print / 1532-7906 online DOI: 10.1080/00273171.2014.928492 Bayesian Model Averaging for Propensity

More information

Heritability. The Extended Liability-Threshold Model. Polygenic model for continuous trait. Polygenic model for continuous trait.

Heritability. The Extended Liability-Threshold Model. Polygenic model for continuous trait. Polygenic model for continuous trait. The Extended Liability-Threshold Model Klaus K. Holst Thomas Scheike, Jacob Hjelmborg 014-05-1 Heritability Twin studies Include both monozygotic (MZ) and dizygotic (DZ) twin

More information

Visualizing Data for Hypothesis Generation Using Large-Volume Health care Claims Data

Visualizing Data for Hypothesis Generation Using Large-Volume Health care Claims Data Visualizing Data for Hypothesis Generation Using Large-Volume Health care Claims Data Eberechukwu Onukwugha PhD, School of Pharmacy, UMB Margret Bjarnadottir PhD, Smith School of Business, UMCP Shujia

More information

Center for Bioinformatics & Molecular Biostatistics

Center for Bioinformatics & Molecular Biostatistics Center for Bioinformatics & Molecular Biostatistics (University of California, San Francisco) Year 2004 Paper L1Cox Penalized Cox Regression Analysis in the High-Dimensional and Low-sample Size Settings,

More information

Outlier Analysis. Lijun Zhang

Outlier Analysis. Lijun Zhang Outlier Analysis Lijun Zhang zlj@nju.edu.cn http://cs.nju.edu.cn/zlj Outline Introduction Extreme Value Analysis Probabilistic Models Clustering for Outlier Detection Distance-Based Outlier Detection Density-Based

More information

SubLasso:a feature selection and classification R package with a. fixed feature subset

SubLasso:a feature selection and classification R package with a. fixed feature subset SubLasso:a feature selection and classification R package with a fixed feature subset Youxi Luo,3,*, Qinghan Meng,2,*, Ruiquan Ge,2, Guoqin Mai, Jikui Liu, Fengfeng Zhou,#. Shenzhen Institutes of Advanced

More information

Bayesian Dose Escalation Study Design with Consideration of Late Onset Toxicity. Li Liu, Glen Laird, Lei Gao Biostatistics Sanofi

Bayesian Dose Escalation Study Design with Consideration of Late Onset Toxicity. Li Liu, Glen Laird, Lei Gao Biostatistics Sanofi Bayesian Dose Escalation Study Design with Consideration of Late Onset Toxicity Li Liu, Glen Laird, Lei Gao Biostatistics Sanofi 1 Outline Introduction Methods EWOC EWOC-PH Modifications to account for

More information

MEA DISCUSSION PAPERS

MEA DISCUSSION PAPERS Inference Problems under a Special Form of Heteroskedasticity Helmut Farbmacher, Heinrich Kögel 03-2015 MEA DISCUSSION PAPERS mea Amalienstr. 33_D-80799 Munich_Phone+49 89 38602-355_Fax +49 89 38602-390_www.mea.mpisoc.mpg.de

More information

UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2014

UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2014 UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2014 Exam policy: This exam allows two one-page, two-sided cheat sheets (i.e. 4 sides); No other materials. Time: 2 hours. Be sure to write

More information

TRIPODS Workshop: Models & Machine Learning for Causal I. & Decision Making

TRIPODS Workshop: Models & Machine Learning for Causal I. & Decision Making TRIPODS Workshop: Models & Machine Learning for Causal Inference & Decision Making in Medical Decision Making : and Predictive Accuracy text Stavroula Chrysanthopoulou, PhD Department of Biostatistics

More information

Application of Cox Regression in Modeling Survival Rate of Drug Abuse

Application of Cox Regression in Modeling Survival Rate of Drug Abuse American Journal of Theoretical and Applied Statistics 2018; 7(1): 1-7 http://www.sciencepublishinggroup.com/j/ajtas doi: 10.11648/j.ajtas.20180701.11 ISSN: 2326-8999 (Print); ISSN: 2326-9006 (Online)

More information

Accommodating informative dropout and death: a joint modelling approach for longitudinal and semicompeting risks data

Accommodating informative dropout and death: a joint modelling approach for longitudinal and semicompeting risks data Appl. Statist. (2018) 67, Part 1, pp. 145 163 Accommodating informative dropout and death: a joint modelling approach for longitudinal and semicompeting risks data Qiuju Li and Li Su Medical Research Council

More information

Bringing machine learning to the point of care to inform suicide prevention

Bringing machine learning to the point of care to inform suicide prevention Bringing machine learning to the point of care to inform suicide prevention Gregory Simon and Susan Shortreed Kaiser Permanente Washington Health Research Institute Don Mordecai The Permanente Medical

More information

Fundamental Clinical Trial Design

Fundamental Clinical Trial Design Design, Monitoring, and Analysis of Clinical Trials Session 1 Overview and Introduction Overview Scott S. Emerson, M.D., Ph.D. Professor of Biostatistics, University of Washington February 17-19, 2003

More information

What is Regularization? Example by Sean Owen

What is Regularization? Example by Sean Owen What is Regularization? Example by Sean Owen What is Regularization? Name3 Species Size Threat Bo snake small friendly Miley dog small friendly Fifi cat small enemy Muffy cat small friendly Rufus dog large

More information

TITLE: A Data-Driven Approach to Patient Risk Stratification for Acute Respiratory Distress Syndrome (ARDS)

TITLE: A Data-Driven Approach to Patient Risk Stratification for Acute Respiratory Distress Syndrome (ARDS) TITLE: A Data-Driven Approach to Patient Risk Stratification for Acute Respiratory Distress Syndrome (ARDS) AUTHORS: Tejas Prahlad INTRODUCTION Acute Respiratory Distress Syndrome (ARDS) is a condition

More information

Mediation Analysis With Principal Stratification

Mediation Analysis With Principal Stratification University of Pennsylvania ScholarlyCommons Statistics Papers Wharton Faculty Research 3-30-009 Mediation Analysis With Principal Stratification Robert Gallop Dylan S. Small University of Pennsylvania

More information

arxiv: v1 [cs.lg] 29 Nov 2017

arxiv: v1 [cs.lg] 29 Nov 2017 A Novel Data-Driven Framework for Risk Characterization and Prediction from Electronic Medical Records: A Case Study of Renal Failure arxiv:1711.11022v1 [cs.lg] 29 Nov 2017 Prithwish Chakraborty 1, Vishrawas

More information

Logistic regression. Department of Statistics, University of South Carolina. Stat 205: Elementary Statistics for the Biological and Life Sciences

Logistic regression. Department of Statistics, University of South Carolina. Stat 205: Elementary Statistics for the Biological and Life Sciences Logistic regression Department of Statistics, University of South Carolina Stat 205: Elementary Statistics for the Biological and Life Sciences 1 / 1 Logistic regression: pp. 538 542 Consider Y to be binary

More information

Lecture Outline Biost 517 Applied Biostatistics I

Lecture Outline Biost 517 Applied Biostatistics I Lecture Outline Biost 517 Applied Biostatistics I Scott S. Emerson, M.D., Ph.D. Professor of Biostatistics University of Washington Lecture 2: Statistical Classification of Scientific Questions Types of

More information

Cancer survivorship and labor market attachments: Evidence from MEPS data

Cancer survivorship and labor market attachments: Evidence from MEPS data Cancer survivorship and labor market attachments: Evidence from 2008-2014 MEPS data University of Memphis, Department of Economics January 7, 2018 Presentation outline Motivation and previous literature

More information

Supplement for: CD4 cell dynamics in untreated HIV-1 infection: overall rates, and effects of age, viral load, gender and calendar time.

Supplement for: CD4 cell dynamics in untreated HIV-1 infection: overall rates, and effects of age, viral load, gender and calendar time. Supplement for: CD4 cell dynamics in untreated HIV-1 infection: overall rates, and effects of age, viral load, gender and calendar time. Anne Cori* 1, Michael Pickles* 1, Ard van Sighem 2, Luuk Gras 2,

More information

3. Model evaluation & selection

3. Model evaluation & selection Foundations of Machine Learning CentraleSupélec Fall 2016 3. Model evaluation & selection Chloé-Agathe Azencot Centre for Computational Biology, Mines ParisTech chloe-agathe.azencott@mines-paristech.fr

More information

Case Studies of Signed Networks

Case Studies of Signed Networks Case Studies of Signed Networks Christopher Wang December 10, 2014 Abstract Many studies on signed social networks focus on predicting the different relationships between users. However this prediction

More information

Package ORIClust. February 19, 2015

Package ORIClust. February 19, 2015 Type Package Package ORIClust February 19, 2015 Title Order-restricted Information Criterion-based Clustering Algorithm Version 1.0-1 Date 2009-09-10 Author Maintainer Tianqing Liu

More information

Adjusting for mode of administration effect in surveys using mailed questionnaire and telephone interview data

Adjusting for mode of administration effect in surveys using mailed questionnaire and telephone interview data Adjusting for mode of administration effect in surveys using mailed questionnaire and telephone interview data Karl Bang Christensen National Institute of Occupational Health, Denmark Helene Feveille National

More information

Unit 1 Exploring and Understanding Data

Unit 1 Exploring and Understanding Data Unit 1 Exploring and Understanding Data Area Principle Bar Chart Boxplot Conditional Distribution Dotplot Empirical Rule Five Number Summary Frequency Distribution Frequency Polygon Histogram Interquartile

More information

Article from. Forecasting and Futurism. Month Year July 2015 Issue Number 11

Article from. Forecasting and Futurism. Month Year July 2015 Issue Number 11 Article from Forecasting and Futurism Month Year July 2015 Issue Number 11 Calibrating Risk Score Model with Partial Credibility By Shea Parkes and Brad Armstrong Risk adjustment models are commonly used

More information

Modelling heterogeneity variances in multiple treatment comparison meta-analysis Are informative priors the better solution?

Modelling heterogeneity variances in multiple treatment comparison meta-analysis Are informative priors the better solution? Thorlund et al. BMC Medical Research Methodology 2013, 13:2 RESEARCH ARTICLE Open Access Modelling heterogeneity variances in multiple treatment comparison meta-analysis Are informative priors the better

More information

Structured Association Advanced Topics in Computa8onal Genomics

Structured Association Advanced Topics in Computa8onal Genomics Structured Association 02-715 Advanced Topics in Computa8onal Genomics Structured Association Lasso ACGTTTTACTGTACAATT Gflasso (Kim & Xing, 2009) ACGTTTTACTGTACAATT Greater power Fewer false posi2ves Phenome

More information

Biostatistics II

Biostatistics II Biostatistics II 514-5509 Course Description: Modern multivariable statistical analysis based on the concept of generalized linear models. Includes linear, logistic, and Poisson regression, survival analysis,

More information

Identification of Tissue Independent Cancer Driver Genes

Identification of Tissue Independent Cancer Driver Genes Identification of Tissue Independent Cancer Driver Genes Alexandros Manolakos, Idoia Ochoa, Kartik Venkat Supervisor: Olivier Gevaert Abstract Identification of genomic patterns in tumors is an important

More information

Package speff2trial. February 20, 2015

Package speff2trial. February 20, 2015 Version 1.0.4 Date 2012-10-30 Package speff2trial February 20, 2015 Title Semiparametric efficient estimation for a two-sample treatment effect Author Michal Juraska , with contributions

More information

Mathematisches Forschungsinstitut Oberwolfach. Statistische und Probabilistische Methoden der Modellwahl

Mathematisches Forschungsinstitut Oberwolfach. Statistische und Probabilistische Methoden der Modellwahl Mathematisches Forschungsinstitut Oberwolfach Report No. 47/2005 Statistische und Probabilistische Methoden der Modellwahl Organised by James O. Berger (Durham) Holger Dette (Bochum) Gabor Lugosi (Barcelona)

More information

Tutorial #7A: Latent Class Growth Model (# seizures)

Tutorial #7A: Latent Class Growth Model (# seizures) Tutorial #7A: Latent Class Growth Model (# seizures) 2.50 Class 3: Unstable (N = 6) Cluster modal 1 2 3 Mean proportional change from baseline 2.00 1.50 1.00 Class 1: No change (N = 36) 0.50 Class 2: Improved

More information

PubH 7405: REGRESSION ANALYSIS

PubH 7405: REGRESSION ANALYSIS PubH 7405: REGRESSION ANALYSIS APLLICATIONS B: MORE BIOMEDICAL APPLICATIONS #. SMOKING & CANCERS There is strong association between lung cancer and smoking; this has been thoroughly investigated. A study

More information

RISK PREDICTION MODEL: PENALIZED REGRESSIONS

RISK PREDICTION MODEL: PENALIZED REGRESSIONS RISK PREDICTION MODEL: PENALIZED REGRESSIONS Inspired from: How to develop a more accurate risk prediction model when there are few events Menelaos Pavlou, Gareth Ambler, Shaun R Seaman, Oliver Guttmann,

More information

Supplementary Appendix

Supplementary Appendix Supplementary Appendix This appendix has been provided by the authors to give readers additional information about their work. Supplement to: Rawshani Aidin, Rawshani Araz, Franzén S, et al. Risk factors,

More information

Weight Adjustment Methods using Multilevel Propensity Models and Random Forests

Weight Adjustment Methods using Multilevel Propensity Models and Random Forests Weight Adjustment Methods using Multilevel Propensity Models and Random Forests Ronaldo Iachan 1, Maria Prosviryakova 1, Kurt Peters 2, Lauren Restivo 1 1 ICF International, 530 Gaither Road Suite 500,

More information

Survival Prediction Models for Estimating the Benefit of Post-Operative Radiation Therapy for Gallbladder Cancer and Lung Cancer

Survival Prediction Models for Estimating the Benefit of Post-Operative Radiation Therapy for Gallbladder Cancer and Lung Cancer Survival Prediction Models for Estimating the Benefit of Post-Operative Radiation Therapy for Gallbladder Cancer and Lung Cancer Jayashree Kalpathy-Cramer PhD 1, William Hersh, MD 1, Jong Song Kim, PhD

More information

Ethnic Disparities in the Treatment of Stage I Non-small Cell Lung Cancer. Juan P. Wisnivesky, MD, MPH, Thomas McGinn, MD, MPH, Claudia Henschke, PhD,

Ethnic Disparities in the Treatment of Stage I Non-small Cell Lung Cancer. Juan P. Wisnivesky, MD, MPH, Thomas McGinn, MD, MPH, Claudia Henschke, PhD, Ethnic Disparities in the Treatment of Stage I Non-small Cell Lung Cancer Juan P. Wisnivesky, MD, MPH, Thomas McGinn, MD, MPH, Claudia Henschke, PhD, MD, Paul Hebert, PhD, Michael C. Iannuzzi, MD, and

More information

Personalized Colorectal Cancer Survivability Prediction with Machine Learning Methods*

Personalized Colorectal Cancer Survivability Prediction with Machine Learning Methods* Personalized Colorectal Cancer Survivability Prediction with Machine Learning Methods* 1 st Samuel Li Princeton University Princeton, NJ seli@princeton.edu 2 nd Talayeh Razzaghi New Mexico State University

More information

Detecting Anomalous Patterns of Care Using Health Insurance Claims

Detecting Anomalous Patterns of Care Using Health Insurance Claims Partially funded by National Science Foundation grants IIS-0916345, IIS-0911032, and IIS-0953330, and funding from Disruptive Health Technology Institute. We are also grateful to Highmark Health for providing

More information

Cross-validation. Miguel Angel Luque Fernandez Faculty of Epidemiology and Population Health Department of Non-communicable Disease.

Cross-validation. Miguel Angel Luque Fernandez Faculty of Epidemiology and Population Health Department of Non-communicable Disease. Cross-validation Miguel Angel Luque Fernandez Faculty of Epidemiology and Population Health Department of Non-communicable Disease. August 25, 2015 Cancer Survival Group (LSH&TM) Cross-validation August

More information

Non-homogenous Poisson Process for Evaluating Stage I & II Ductal Breast Cancer Treatment

Non-homogenous Poisson Process for Evaluating Stage I & II Ductal Breast Cancer Treatment Journal of Modern Applied Statistical Methods Volume 10 Issue 2 Article 23 11-1-2011 Non-homogenous Poisson Process for Evaluating Stage I & II Ductal Breast Cancer Treatment Chris P Tsokos University

More information

Lecture Outline. Biost 517 Applied Biostatistics I. Purpose of Descriptive Statistics. Purpose of Descriptive Statistics

Lecture Outline. Biost 517 Applied Biostatistics I. Purpose of Descriptive Statistics. Purpose of Descriptive Statistics Biost 517 Applied Biostatistics I Scott S. Emerson, M.D., Ph.D. Professor of Biostatistics University of Washington Lecture 3: Overview of Descriptive Statistics October 3, 2005 Lecture Outline Purpose

More information

Logistic Regression and Bayesian Approaches in Modeling Acceptance of Male Circumcision in Pune, India

Logistic Regression and Bayesian Approaches in Modeling Acceptance of Male Circumcision in Pune, India 20th International Congress on Modelling and Simulation, Adelaide, Australia, 1 6 December 2013 www.mssanz.org.au/modsim2013 Logistic Regression and Bayesian Approaches in Modeling Acceptance of Male Circumcision

More information

Predictive Analytics for Retention in HIV Care

Predictive Analytics for Retention in HIV Care Predictive Analytics for Retention in HIV Care Jessica Ridgway, MD, MS; Arthi Ramachandran, PhD; Hannes Koenig, MS; Avishek Kumar, PhD; Joseph Walsh, PhD; Christina Sung; Rayid Ghani, MS; John A. Schneider,

More information

County-Level Small Area Estimation using the National Health Interview Survey (NHIS) and the Behavioral Risk Factor Surveillance System (BRFSS)

County-Level Small Area Estimation using the National Health Interview Survey (NHIS) and the Behavioral Risk Factor Surveillance System (BRFSS) County-Level Small Area Estimation using the National Health Interview Survey (NHIS) and the Behavioral Risk Factor Surveillance System (BRFSS) Van L. Parsons, Nathaniel Schenker Office of Research and

More information

Multiple Mediation Analysis For General Models -with Application to Explore Racial Disparity in Breast Cancer Survival Analysis

Multiple Mediation Analysis For General Models -with Application to Explore Racial Disparity in Breast Cancer Survival Analysis Multiple For General Models -with Application to Explore Racial Disparity in Breast Cancer Survival Qingzhao Yu Joint Work with Ms. Ying Fan and Dr. Xiaocheng Wu Louisiana Tumor Registry, LSUHSC June 5th,

More information

Bivariate Density Estimation With an Application to Survival Analysis

Bivariate Density Estimation With an Application to Survival Analysis Bivariate Density Estimation With an Application to Survival Analysis Charles KOOPERBERG A procedure for estimating a bivariate density based on data that may be censored is described After the data are

More information

Can Family Strengths Reduce Risk of Substance Abuse among Youth with SED?

Can Family Strengths Reduce Risk of Substance Abuse among Youth with SED? Can Family Strengths Reduce Risk of Substance Abuse among Youth with SED? Michael D. Pullmann Portland State University Ana María Brannan Vanderbilt University Robert Stephens ORC Macro, Inc. Relationship

More information

Package xmeta. August 16, 2017

Package xmeta. August 16, 2017 Type Package Title A Toolbox for Multivariate Meta-Analysis Version 1.1-4 Date 2017-5-12 Package xmeta August 16, 2017 Imports aod, glmmml, numderiv, metafor, mvmeta, stats A toolbox for meta-analysis.

More information

Individualized Treatment Effects Using a Non-parametric Bayesian Approach

Individualized Treatment Effects Using a Non-parametric Bayesian Approach Individualized Treatment Effects Using a Non-parametric Bayesian Approach Ravi Varadhan Nicholas C. Henderson Division of Biostatistics & Bioinformatics Department of Oncology Johns Hopkins University

More information

arxiv: v2 [stat.ap] 7 Dec 2016

arxiv: v2 [stat.ap] 7 Dec 2016 A Bayesian Approach to Predicting Disengaged Youth arxiv:62.52v2 [stat.ap] 7 Dec 26 David Kohn New South Wales 26 david.kohn@sydney.edu.au Nick Glozier Brain Mind Centre New South Wales 26 Sally Cripps

More information

Review: Logistic regression, Gaussian naïve Bayes, linear regression, and their connections

Review: Logistic regression, Gaussian naïve Bayes, linear regression, and their connections Review: Logistic regression, Gaussian naïve Bayes, linear regression, and their connections New: Bias-variance decomposition, biasvariance tradeoff, overfitting, regularization, and feature selection Yi

More information

A Comparison of Methods for Determining HIV Viral Set Point

A Comparison of Methods for Determining HIV Viral Set Point STATISTICS IN MEDICINE Statist. Med. 2006; 00:1 6 [Version: 2002/09/18 v1.11] A Comparison of Methods for Determining HIV Viral Set Point Y. Mei 1, L. Wang 2, S. E. Holte 2 1 School of Industrial and Systems

More information