Regression Tree Methods for Precision Medicine


1 Regression Tree Methods for Precision Medicine
Wei-Yin Loh, Department of Statistics, University of Wisconsin-Madison
W-Y Loh July 12,

2 Subgroup identification: breast cancer trial
Randomized trial of 686 subjects with primary node-positive breast cancer (Schumacher et al., 1994)
Response is recurrence-free survival time (days; 299 uncensored, 387 censored)
Eight predictor variables with no missing values:
1. horth (hormone therapy, yes/no)
2. age (21–80 years)
3. tsize (tumor size, mm)
4. pnodes (number of positive lymph nodes, 1–51)
5. progrec (progesterone receptor status, fmol)
6. estrec (estrogen receptor status, fmol)
7. menostat (menopausal status, pre/post)
8. tgrade (tumor grade, 1, 2, 3)

3 [Figure: survival probability versus days for horth = no and horth = yes]
[Table: coefficient and p-value for horth=yes, tsize, age, pnodes, meno=pre, progrec, tgrade and estrec; only the p-value exponents e-03 (horth=yes) and e-11 (pnodes) survive]

4 cor(estrec, progrec) = 0.39
cor(ln(estrec+1), ln(progrec+1)) = 0.64
[Figure: estrec plotted against progrec, and estrec+1 against progrec+1]

5 GUIDE model (2nd best variable is estrec)
[Figure: tree with one split on progrec; survival probability curves for horth = yes and horth = no in Node 2 and Node 3]

6 Earlier subgroup identification methods
Interaction trees (Su et al., 2008, 2009): For each X and split set S (e.g., $\{X < c\}$ or $\{X \in A\}$), fit $E(Y) = \beta_0 + \beta_1 Z + \beta_2 I(S) + \beta_3 Z\,I(S)$ to the data, where Z is the treatment variable. Find the split (X, S) with the most significant interaction ($\beta_3$).
SIDES (Lipkovich et al., 2011; Lipkovich and Dmitrienko, 2014): Find the split (X, S) with the most significant between-node difference in treatment effects.
QUINT, qualitative interaction tree (Dusseldorp and Van Mechelen, 2014): Find the split (X, S) that optimizes a function of effect size and subgroup size.
VT, virtual twins (Foster et al., 2011):
1. Fit a random forest model (Breiman, 2001) to the observed outcomes $y_{obs}$
2. Use the model to predict counterfactual outcomes $y_{unobs}$ (other treatment)
3. Fit a CART model to $(y_{obs} - y_{unobs})$ to find subgroups
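The interaction-tree search described above can be sketched in a few lines. This is a minimal illustration only, not the implementation of Su et al.: it assumes a binary treatment Z coded 0/1, a single ordered X, ordinary least squares, and the absolute t-statistic of $\beta_3$ as the measure of significance. The function names and the simulated data are hypothetical.

```python
import numpy as np

def interaction_t_stat(y, z, in_s):
    """t-statistic of b3 in E(Y) = b0 + b1*Z + b2*I(S) + b3*Z*I(S)."""
    s = in_s.astype(float)
    design = np.column_stack([np.ones_like(y), z, s, z * s])
    coef, _, rank, _ = np.linalg.lstsq(design, y, rcond=None)
    if rank < design.shape[1]:
        return 0.0  # degenerate split: interaction not estimable
    resid = y - design @ coef
    dof = len(y) - design.shape[1]
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(design.T @ design)
    return coef[3] / np.sqrt(cov[3, 3])

def best_interaction_split(y, z, x, min_size=10):
    """Greedy search over cut points c for the split S = {x < c}."""
    best_c, best_t = None, 0.0
    for c in np.unique(x)[1:]:
        in_s = x < c
        if in_s.sum() < min_size or (~in_s).sum() < min_size:
            continue
        t = abs(interaction_t_stat(y, z, in_s))
        if t > best_t:
            best_c, best_t = c, t
    return best_c, best_t

# Simulated example (assumption): treatment helps only when x < 5.
rng = np.random.default_rng(0)
n = 400
x = rng.uniform(0, 10, n)
z = rng.integers(0, 2, n).astype(float)
y = 1.0 * z * (x < 5) + rng.normal(0, 0.5, n)
cut, t_abs = best_interaction_split(y, z, x)
print(cut, t_abs)
```

Because every cut point of every variable is tried, a search of this kind favors variables that offer more candidate splits.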

7 Limitations
1. Most methods follow the CART approach of greedy search over all (X, S); the result is bias in variable selection
2. Many are applicable to only 2 treatment levels
3. Most require imputation to deal with missing covariate values, but imputation is possibly the hardest problem in statistics!
4. All are designed for univariate response only; extension to multivariate or longitudinal, time-dependent response is not straightforward

8 Selection bias of CART and Random forest
Ordinal X with n distinct values allows $(n-1)$ splits of the form $\{X \le c\}$
Categorical X with m levels has $(2^{m-1} - 1)$ splits of the form $\{X \in A\}$
Bias: variables with large n and m have more chance to split a node
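To get a feel for how quickly the number of candidate splits grows, the two counts above can be evaluated directly. The helper names are hypothetical, and the level counts in the loop are chosen only for illustration.

```python
def ordinal_splits(n_values: int) -> int:
    """Number of splits {X <= c} for an ordinal X with n distinct values."""
    return max(n_values - 1, 0)

def categorical_splits(m_levels: int) -> int:
    """Number of splits {X in A} for a categorical X with m levels."""
    return 2 ** (m_levels - 1) - 1 if m_levels >= 1 else 0

for k in (2, 5, 12, 31):
    print(k, ordinal_splits(k), categorical_splits(k))
```

A variable with 31 values has 30 splits if treated as ordinal but over a billion if treated as categorical, which is why an exhaustive search is drawn toward such variables.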

9 Example of selection bias: predicting heart disease
617 observations, no missing values
Response is diagnosis of heart disease (5 levels)
52 predictor variables (29 ordinal, 23 categorical), including
1. ekgmo: month of electrocardiogram (12 values, splits)
2. ekgday: day of electrocardiogram (31 values, splits)

10 RPART tree (Breiman et al., 1984) (3.6 hrs)

11 GUIDE tree (Loh, 2002, 2009) (3 sec.)
[Figure: classification tree with splits on lmt, rcaprox, ladprox, rcadist, cxmain, laddist, ramus and om; node counts not recoverable]

12 Many missing values: a retrospective candidate gene study
1504 subjects randomized to treatment or placebo
Response is survival time in days, with 63% censored
23 baseline (17 ordered, 6 categorical) and 282 genetic (categorical) variables
95% of subjects have missing values; only 7 variables are complete
[Figure: survival probability versus days for Treatment and Placebo]

13 GUIDE model with 95% bootstrap intervals for relative risk (treatment vs placebo)
[Figure: tree with one split, a2 ≤ 0.1 or NA versus a2 > 0.1; intervals (0.73, 1.54) and (0.45, 0.81); survival probability versus days for Treatment and Placebo in each node]
At each node, a case goes to the left child node if the stated condition is satisfied. Sample sizes are beside terminal nodes.

14 GUIDE method for subgroup identification (Loh, 2014; Loh et al., 2015)
1. Let Z = 1, 2, ..., be the treatment variable and X a split variable
2. Do for each X at each node:
(a) If X is a categorical variable, add a category to X for missing values and test lack of fit of the additive model:
$EY = \eta + \sum_j \beta_j I(X = j) + \sum_k \gamma_k I(Z = k)$
(b) If X is ordinal, convert it to categorical by discretization at quartiles
Compare with:
$EY = \eta + \sum_j \beta_j I(X_j < c_j) + \sum_k \gamma_k I(Z = k) + \sum_j \sum_k \omega_{jk} I(X_j < c_j, Z = k)$
3. Let X* be the variable with the most significant chi-squared
4. Find the split on X* that minimizes the sum of squared residuals of the model $EY = \eta + \sum_k \gamma_k I(Z = k)$ fitted to each subnode
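Step 4 above can be sketched as follows. This is an illustration under simplifying assumptions, not the GUIDE software: the split variable is ordered, the model $EY = \eta + \sum_k \gamma_k I(Z = k)$ is fitted as one mean per treatment in each subnode, and missing values of X are always sent to the left node (echoing splits of the form "a2 ≤ 0.1 or NA"). Function names and simulated data are hypothetical.

```python
import numpy as np

def treatment_sse(y, z):
    """SSE of EY = eta + sum_k gamma_k I(Z = k): one mean per treatment."""
    return sum(((y[z == k] - y[z == k].mean()) ** 2).sum() for k in np.unique(z))

def best_split(y, z, x, min_size=5):
    """Search cut points c on x; missing x values (NaN) go to the left node."""
    missing = np.isnan(x)
    x_filled = np.where(missing, -np.inf, x)  # assumption: NA always goes left
    best_c, best_sse = None, np.inf
    for c in np.unique(x[~missing])[:-1]:
        left = x_filled <= c
        if left.sum() < min_size or (~left).sum() < min_size:
            continue
        sse = treatment_sse(y[left], z[left]) + treatment_sse(y[~left], z[~left])
        if sse < best_sse:
            best_c, best_sse = c, sse
    return best_c, best_sse

# Simulated example (assumption): treatment 2 helps only when x > 5.
rng = np.random.default_rng(1)
n = 300
x_full = rng.uniform(0, 10, n)
z = rng.integers(1, 3, n)                            # treatments 1 and 2
y = 1.0 * (z == 2) * (x_full > 5) + rng.normal(0, 0.3, n)
x = np.where(rng.random(n) < 0.05, np.nan, x_full)   # about 5% missing
cut, sse = best_split(y, z, x)
print(cut, sse)
```

Note that the exhaustive search here is applied to one variable only, after that variable has been chosen by the significance tests in steps 2 and 3.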

15 Type 2 diabetes longitudinal study with missing values in responses and covariates (Loh et al., 2016)
1249 subjects from a multi-center, randomized double-blind trial (Charbonnel et al., 2004)
Subjects randomized to a 52-week treatment period of drug G (Gliclazide) or P (Pioglitazone)
24 baseline (time 0) variables measured for each subject, as well as their HbA1c at 10 time points (-2, 0, 4, 8, 12, 16, 24, 32, 42, and 52 weeks)
Gliclazide increases the amount of insulin produced by the pancreas
Pioglitazone improves how the body uses insulin ("insulin sensitizer")

16 HbA1c means for 747 subjects
[Figure: mean A1C versus weeks for Pioglitazone and Gliclazide]

17 Baseline variables and their missing values
Variable                  #Missing    Variable                  #Missing
HDL                       7           Age                       0
LDL                       77          Weight                    1
Total cholesterol         6           BMI                       0
Triglycerides             6           Waist                     4
Creatinine                0           A1CBase                   0
Fasting insulin           46          HomaS                     62
ALT                       0           HomaIR                    62
AST                       0           HomaB                     62
GGT                       0           Diastolic blood pressure  0
C-peptide                 593         Systolic blood pressure   0
Diabetes duration         0           Pulse                     0
Fasting blood glucose     0

18 GUIDE tree with 95% bootstrap CIs (Loh et al., 2016)
[Figure: tree with splits on HOMAB and fasting blood glucose; each terminal node shows curves versus weeks for Gliclazide and Pioglitazone]

19 Frequently (and not so frequently) asked questions
1. P(Type I error) controlled?
2. Subgroup correct?
3. Split points statistically significant?
4. Estimated subgroup treatment effects unbiased?
5. Estimated subgroup treatment effects statistically significant?
6. Estimated subgroup treatment effects confounded with covariates?

20 Q1. Does GUIDE control P(Type I error)?
As $n \to \infty$, the estimated regression function is asymptotically consistent (Chaudhuri et al., 1994, 1995; Chaudhuri and Loh, 2002).
Hence P(Type I error) $\to 0$

21 Q2. Is subgroup correctly identified?
Surprise! There is no correct subgroup
[Figure: tree with one split on progrec; survival probability curves for horth = yes and horth = no in Node 2 and Node 3]

22 Model without progrec
[Figure: tree with one split on estrec; survival probability curves for horth = yes and horth = no in each terminal node]

23 Where is the correct subgroup?
[Figure: estrec plotted against progrec, and estrec+1 against progrec+1]

24 Q3. Are split points statistically significant?
Consider these two simulation models
[Figure: Y versus X for the Jump model and the Broken line model]

25 IT and SIDES vs GUIDE
Jump model (true subgroup marked by dotted line)
[Figure: three panels (Interaction Trees, SIDES, GUIDE), each showing response versus biomarker for Drug and Placebo]
Interaction Trees maximizes significance of treatment-biomarker interaction
SIDES minimizes p-value of difference between treatment effects
GUIDE minimizes sum of squared residuals

26 Mean of split point for two models
Model          Interaction Trees    SIDES      GUIDE
Jump           (0.003)              (0.005)    (0.002)
Broken line    (0.006)              (0.013)    (0.006)
based on iterations; simulation SEs in parentheses
For the Jump model, the true split point is 5.0
For the Broken line model, the true split point is undefined

27 Q4. Are estimated treatment effects unbiased?
Ans: Usually not, but some methods are better
Subgroup treatment effect bias
Model          Interaction Trees    SIDES      GUIDE
Jump           (0.001)              (0.001)    (0.001)
Broken line    (0.001)              (0.002)    (0.001)
based on iterations; simulation SEs in parentheses

28 Q5. Are treatment effects statistically significant?
1. Subgroups are random because they are results of search algorithms
2. Hence, unlike in classical theory, the true subgroup effects $\theta$ are also random
3. Statistical significance of the estimates $\hat{\theta}$ must account for the search
4. A p-value requires a null hypothesis $H_0$, but what is $H_0$?

29 Bootstrap calibration (Loh, 1987, 1991)
Naïve intervals are too short: they do not account for the subgroup search
Need to increase the nominal confidence level
Use the bootstrap to estimate true confidence levels
Increase the nominal level of the intervals to reach the desired level

30 Tree from a bootstrap sample
[Figure: two panels, Real data and Bootstrap sample, each plotted against X]

31 Bootstrap calibrated intervals (Loh et al., 2016)
1. Let $F$ be the true (unknown) distribution of the data
2. Given a sample of data, construct a tree model
3. Given $\gamma$, construct a nominal $100\gamma\%$ interval at each terminal node
4. Let $C(F, \gamma)$ be the true average coverage of nominal $100\gamma\%$ intervals
5. Let $\gamma_F$ be such that $C(F, \gamma_F)$ equals the desired coverage
6. If we know $F$, construct nominal $100\gamma_F\%$ intervals and we are finished
7. Because $F$ is unknown, let $\hat{F}$ be its bootstrap estimate
8. Use simulation to find the calibrated level $\gamma_{\hat{F}}$ such that $C(\hat{F}, \gamma_{\hat{F}})$ equals the desired coverage
9. Construct the desired intervals at nominal level $\gamma_{\hat{F}}$
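The calibration steps above can be illustrated with a toy example. The sketch below is not the procedure of Loh et al. (2016): as a stand-in for the tree search it simply selects, among several groups, the one with the largest sample mean, and it uses a normal-quantile interval for that group's mean. The empirical distribution plays the role of $\hat{F}$, so the "true" parameter in the bootstrap world is the original sample mean of the selected group. All names, the grid of nominal levels, and the simulated data are assumptions.

```python
import numpy as np
from statistics import NormalDist

def selected_interval(data, gamma):
    """Pick the group with the largest mean (the 'search') and return
    (group index, lower, upper) of a naive nominal 100*gamma% interval."""
    means = data.mean(axis=1)
    g = int(np.argmax(means))
    se = data[g].std(ddof=1) / np.sqrt(data.shape[1])
    q = NormalDist().inv_cdf(0.5 + gamma / 2)
    return g, means[g] - q * se, means[g] + q * se

def bootstrap_coverage(data, gamma, n_boot, rng):
    """Estimate C(F_hat, gamma): coverage when samples are drawn from F_hat."""
    truth = data.mean(axis=1)            # parameter values under F_hat
    k, m = data.shape
    hits = 0
    for _ in range(n_boot):
        idx = rng.integers(0, m, size=(k, m))
        boot = np.take_along_axis(data, idx, axis=1)
        g, lo, hi = selected_interval(boot, gamma)
        hits += int(lo <= truth[g] <= hi)
    return hits / n_boot

def calibrate(data, target, grid, n_boot=500, seed=0):
    """Smallest nominal level in grid whose bootstrap coverage reaches target."""
    rng = np.random.default_rng(seed)
    for gamma in grid:
        if bootstrap_coverage(data, gamma, n_boot, rng) >= target:
            return gamma
    return grid[-1]                      # fall back to the largest level tried

# Simulated example (assumption): 8 groups of 30 observations, no real effect.
data = np.random.default_rng(42).normal(0.0, 1.0, size=(8, 30))
grid = [0.95, 0.97, 0.99, 0.995, 0.999]
naive_cov = bootstrap_coverage(data, 0.95, 500, np.random.default_rng(0))
calibrated = calibrate(data, target=0.95, grid=grid)
print(naive_cov, calibrated)
```

The printed naive coverage shows how far a nominal 95% interval falls short once the selection step is taken into account, and the calibrated level is the nominal level to use instead.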

32 Bootstrap calibrated alpha for 95% confidence intervals
[Figure: bootstrap coverage versus nominal alpha]

33 95% bootstrap intervals for RR (therapy vs none)
[Figure: tree with one split on progrec; intervals (0.56, 1.42) and (0.30, 0.89); survival probability curves for horth = yes and horth = no in Node 2 and Node 3; the value of the bootstrap calibrated $\hat{\alpha}$ is not recoverable]

34 Coverage of 95% CIs for treatment effect for breast cancer data
[Table: coverage of the naïve t interval and of the bootstrap calibrated interval, each ± 2 simulation SEs; numeric values not recoverable]
Based on simulation trials with 25 bootstraps each

35 Q6. How to ensure treatment effects are unconfounded within subgroups?
Many studies include prognostic variables (e.g., age, tumor size)
Treatment randomization balances the overall effects of these variables
But balance may be upset within subgroups

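36 95% bootstrap intervals for RR due to horth with linear control of prognostic variables
[Figure: tree with one split on progrec (extracted label: 24 1); intervals (0.56, 1.18) and (0.34, 0.82); the value of the bootstrap calibrated $\hat{\alpha}$ is not recoverable]
[Table: for Node 2 and Node 3, coefficient and unadjusted p-value for constant, pnodes and horth=yes; numeric values not recoverable]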

37 Coverage (± 2 SEs) of 95% CIs for treatment effect with local linear prognostic control for breast cancer data
[Table: coverage of the naïve t interval and of the bootstrap calibrated interval; numeric values not recoverable]
Based on 1200 simulation trials with 25 bootstraps per trial

38 Conclusions
1. Asking for the correct subgroup is naïve: often there is no unique subgroup
2. GUIDE handles missing values without imputation
3. GUIDE has no selection bias: it does not favor variables that have more splits
4. GUIDE seems to give less biased estimates of subgroup treatment effects
5. GUIDE can control for prognostic effects within subgroups
6. A simple way to assess statistical significance is bootstrap calibrated intervals
Some outstanding problems
1. Given a tree model, how to tell which node defines the subgroup?
2. How to remove the bias in the estimated treatment effect in the subgroup?
3. How to deal with longitudinal (time-dependent) covariates?

39 Acknowledgments
Xu He, Michael Man and Lei Shen
Probal Chaudhuri
Yu-Shan Shih, Wei Zheng and 18 other PhD students
US Army Research Office, US National Science Foundation, US Bureau of Labor Statistics, US National Institutes of Health
AbbVie, Eli Lilly, Gilead Sciences, Pfizer and Takeda

40–43 References
Breiman, L. (2001). Random forests. Machine Learning, 45:5–32.
Breiman, L., Friedman, J. H., Olshen, R. A., and Stone, C. J. (1984). Classification and Regression Trees. Chapman & Hall/CRC.
Charbonnel, B. H., Matthews, D. R., Schernthaner, G., Hanefeld, M., and Brunetti, P. (2004). A long-term comparison of Pioglitazone and Gliclazide in patients with Type 2 diabetes mellitus: a randomized, double-blind, parallel-group comparison trial. Diabetic Medicine, 22.
Chaudhuri, P., Huang, M.-C., Loh, W.-Y., and Yao, R. (1994). Piecewise-polynomial regression trees. Statistica Sinica, 4.
Chaudhuri, P., Lo, W.-D., Loh, W.-Y., and Yang, C.-C. (1995). Generalized regression trees. Statistica Sinica, 5.
Chaudhuri, P. and Loh, W.-Y. (2002). Nonparametric estimation of conditional quantiles using quantile regression trees. Bernoulli, 8.
Dusseldorp, E. and Van Mechelen, I. (2014). Qualitative interaction trees: a tool to identify qualitative treatment-subgroup interactions. Statistics in Medicine, 33.
Foster, J. C., Taylor, J. M. G., and Ruberg, S. J. (2011). Subgroup identification from randomized clinical trial data. Statistics in Medicine, 30.
Lipkovich, I. and Dmitrienko, A. (2014). Strategies for identifying predictive biomarkers and subgroups with enhanced treatment effect in clinical trials using SIDES. Journal of Biopharmaceutical Statistics, 24.
Lipkovich, I., Dmitrienko, A., Denne, J., and Enas, G. (2011). Subgroup identification based on differential effect search: a recursive partitioning method for establishing response to treatment in patient subpopulations. Statistics in Medicine, 30.
Loh, W.-Y. (1987). Calibrating confidence coefficients. Journal of the American Statistical Association, 82.
Loh, W.-Y. (1991). Bootstrap calibration for confidence interval construction and selection. Statistica Sinica, 1.
Loh, W.-Y. (2002). Regression trees with unbiased variable selection and interaction detection. Statistica Sinica, 12.
Loh, W.-Y. (2009). Improving the precision of classification trees. Annals of Applied Statistics, 3.
Loh, W.-Y. (2014). Fifty years of classification and regression trees (with discussion). International Statistical Review, 34.
Loh, W.-Y., Fu, H., Man, M., Champion, V., and Yu, M. (2016). Identification of subgroups with differential treatment effects for longitudinal and multiresponse variables. Statistics in Medicine, 35.
Loh, W.-Y., He, X., and Man, M. (2015). A regression tree approach to identifying subgroups with differential treatment effects. Statistics in Medicine, 34.
Schumacher, M., Bastert, G., Bojar, H., Hübner, K., Olschewski, M., Sauerbrei, W., Schmoor, C., Beyerle, C., Neumann, R. L. A., and Rauschecker, H. F. (1994). Randomized 2 × 2 trial evaluating hormonal treatment and the duration of chemotherapy in node-positive breast cancer patients. Journal of Clinical Oncology, 12.
Su, X., Tsai, C. L., Wang, H., Nickerson, D. M., and Li, B. (2009). Subgroup analysis via recursive partitioning. Journal of Machine Learning Research, 10.
Su, X., Zhou, T., Yan, X., Fan, J., and Yang, S. (2008). Interaction trees with censored survival data. International Journal of Biostatistics, 4. Article 2.

44 Model for 10-week A1C without linear control
[Figure: tree with splits on InsulinFastpmolLBase and A1CBase]
Sample sizes below nodes; treatment means for G and P beside nodes. Symbol ≤* stands for "≤ or missing". Red nodes indicate significant treatment effects.

45 [GUIDE output for terminal nodes 2, 6 and 7: coefficient, t-stat and p-val of regressor Thera.P, and mean of A1C10, in each node; numeric values not recoverable]

46 Model for 10-week A1C with linear control
[Figure: tree with splits on InsulinFastpmolLBase and A1CBase; linear covariates A1CBase and FastBGBase in the terminal nodes]
Sample size, mean A1C10 and linear covariate below node. Red nodes indicate significant treatment effects.

47 [GUIDE output for terminal nodes 2, 6 and 7: coefficient, t-stat and p-val of the linear covariate (A1CBase in nodes 2 and 6, FastBGBase in node 7) and of Thera.P, and mean of A1C10, in each node; numeric values not recoverable]

48 Extension to censored response data via Poisson regression
1. Let $U_i$ and $C_i$ be the survival and censoring times of subject $i$
2. Let $Y_i = \min(U_i, C_i)$ and let $\delta_i = I(U_i < C_i)$ be the event indicator
3. Let $\Lambda_0(\cdot)$ be the baseline cumulative hazard function of the PH model
4. Estimate the coefficients of the PH model by iteratively fitting a Poisson regression model with $\delta_i$ as response and $\log \Lambda_0(y_i)$ as offset:
(a) Use the Nelson-Aalen method to get an initial estimate of $\Lambda_0(\cdot)$
(b) Use GUIDE to construct a Poisson regression tree
(c) Update $\Lambda_0(\cdot)$ with the tree
(d) Repeat steps (b) and (c) four more times
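Step (a) above, the initial Nelson-Aalen estimate and the resulting Poisson offset, can be sketched as follows. This covers only the initialization, not the tree fitting or the update of $\Lambda_0(\cdot)$; the function name, the toy data and the small floor applied before taking logs (needed when a subject is censored before the first event) are assumptions.

```python
import numpy as np

def nelson_aalen(y, delta):
    """Return the Nelson-Aalen estimate of Lambda0 at each y_i: the sum over
    event times t <= y_i of (events at t) / (number at risk at t)."""
    y = np.asarray(y, dtype=float)
    delta = np.asarray(delta, dtype=int)
    event_times = np.unique(y[delta == 1])
    if event_times.size == 0:
        return np.zeros_like(y)          # no events observed
    increments = np.array([
        ((y == t) & (delta == 1)).sum() / (y >= t).sum() for t in event_times
    ])
    cum = np.cumsum(increments)
    pos = np.searchsorted(event_times, y, side="right")  # event times <= y_i
    return np.where(pos > 0, cum[np.maximum(pos - 1, 0)], 0.0)

# Toy data (assumption): observed times and event indicators for 5 subjects.
y = np.array([2.0, 3.0, 3.0, 5.0, 8.0])
delta = np.array([1, 1, 0, 1, 0])
cum_hazard = nelson_aalen(y, delta)

# Offset for the Poisson model with delta as response; the floor is an
# assumption to avoid log(0).
offset = np.log(np.maximum(cum_hazard, 1e-8))
print(cum_hazard, offset)
```

With these offsets, a Poisson regression of $\delta_i$ on the covariates has the same likelihood form as the proportional hazards model for a fixed $\Lambda_0(\cdot)$, which is what makes the iteration in steps (b) to (d) possible.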

49 Extension to multiple responses
Do at each node:
1. For each response variable $Y_j$, find the chi-squared of each X variable
2. Choose the variable X with the largest sum of chi-squared values over j
3. Choose the split on X that yields the smallest sum of squared residuals over all response variables
Extension to correlated response variables
Apply principal components of the Y variables computed locally at each node


More information

Statistics is the science of collecting, organizing, presenting, analyzing, and interpreting data to assist in making effective decisions

Statistics is the science of collecting, organizing, presenting, analyzing, and interpreting data to assist in making effective decisions Readings: OpenStax Textbook - Chapters 1 5 (online) Appendix D & E (online) Plous - Chapters 1, 5, 6, 13 (online) Introductory comments Describe how familiarity with statistical methods can - be associated

More information

Different styles of modeling

Different styles of modeling Different styles of modeling Marieke Timmerman m.e.timmerman@rug.nl 19 February 2015 Different styles of modeling (19/02/2015) What is psychometrics? 1/40 Overview 1 Breiman (2001). Statistical modeling:

More information

Describe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo

Describe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo Please note the page numbers listed for the Lind book may vary by a page or two depending on which version of the textbook you have. Readings: Lind 1 11 (with emphasis on chapters 10, 11) Please note chapter

More information

STATISTICAL METHODS FOR DIAGNOSTIC TESTING: AN ILLUSTRATION USING A NEW METHOD FOR CANCER DETECTION XIN SUN. PhD, Kansas State University, 2012

STATISTICAL METHODS FOR DIAGNOSTIC TESTING: AN ILLUSTRATION USING A NEW METHOD FOR CANCER DETECTION XIN SUN. PhD, Kansas State University, 2012 STATISTICAL METHODS FOR DIAGNOSTIC TESTING: AN ILLUSTRATION USING A NEW METHOD FOR CANCER DETECTION by XIN SUN PhD, Kansas State University, 2012 A THESIS Submitted in partial fulfillment of the requirements

More information

Analyzing diastolic and systolic blood pressure individually or jointly?

Analyzing diastolic and systolic blood pressure individually or jointly? Analyzing diastolic and systolic blood pressure individually or jointly? Chenglin Ye a, Gary Foster a, Lisa Dolovich b, Lehana Thabane a,c a. Department of Clinical Epidemiology and Biostatistics, McMaster

More information
