Applied Medical Statistics Using SAS. Geoff Der. Brian S. Everitt. CRC Press. Taylor & Francis Group, an informa business

Transcription:

Applied Medical Statistics Using SAS
Geoff Der
Brian S. Everitt
CRC Press, Taylor & Francis Group
Boca Raton London New York
CRC Press is an imprint of the Taylor & Francis Group, an informa business
A CHAPMAN & HALL BOOK

Contents

Preface
The Authors

1. An Introduction to SAS
1.1 Introduction
1.2 User Interface
1.2.1 Editor Window
1.2.2 Log Window
1.2.3 Output Window
1.2.4 Results Window
1.2.5 Explorer Window
1.2.6 Results Viewer Window
1.2.7 Options for Displaying Procedure Results
1.2.8 Help and Documentation
1.3 SAS Programs
1.3.1 Program Steps
1.3.2 Variable Names and Data Set Names
1.3.3 Variable Lists
1.4 Reading Data: The Data Step
1.4.1 Creating SAS Data Sets from Raw Data
1.4.2 Data Statement
1.4.3 Infile Statement
1.4.4 Input Statement
1.4.4.1 List Input
1.4.4.2 Column Input
1.4.4.3 Formatted Input
1.4.4.4 Multiple Lines per Observation
1.4.4.5 Multiple Observations per Line
1.4.4.6 Delimited Data
1.4.5 Reading Data: Proc Import
1.4.6 Reading and Writing Excel Files
1.4.7 Temporary and Permanent SAS Data Sets: SAS Libraries
1.4.8 Reading Data from an Existing SAS Data Set
1.5 Modifying SAS Data
1.5.1 Creating and Modifying Variables
1.5.1.1 Missing Values in Arithmetic Expressions
1.5.2 Deleting Variables
1.5.3 Deleting Observations
1.5.4 Subsetting Data Sets
1.5.5 Concatenating and Merging Data Sets
1.5.6 Merging Data Sets: Adding Variables
1.5.7 Operation of the Data Step
1.6 Proc Step
1.6.1 Proc Statement
1.6.2 Var Statement
1.6.3 Where Statement
1.6.4 By Statement
1.6.5 Class Statement
1.7 Global Statements
1.7.1 Options
1.8 SAS Graphics
1.8.1 xy Plots: proc sgplot
1.8.2 Summary Plots
1.8.3 Panel Plots
1.9 ODS: Output Delivery System
1.9.1 ODS Procedure Output
1.9.1.1 ODS Styles
1.10 Saving Output in SAS Data Sets: ods output
1.10.1 ODS Graphics
1.11 Enhancing Output
1.11.1 Variable Labels
1.11.2 Value Labels: SAS Formats
1.12 SAS Macros
1.13 Some Tips for Preventing and Correcting Errors

2. Statistics and Measurement in Medicine
2.1 Introduction
2.2 A Brief History of Medical Statistics
2.3 Measurement in Medicine
2.3.1 Scales of Measurement
2.3.1.1 Nominal or Categorical Measurements
2.3.1.2 Ordinal Scale Measurements
2.3.1.3 Interval Scales
2.3.1.4 Ratio Scales
2.4 Assessing Bias and Reliability of Measurements
2.4.1 Assessing Reliability and Bias for Binary and Other Categorical Observations
2.4.2 Assessing the Reliability of Quantitative Measurements
2.5 Diagnostic Tests
2.6 Summary

3. Clinical Trials
3.1 Introduction
3.2 Clinical Trials
3.2.1 Types of Randomisation
3.2.1.1 Blocked Randomisation
3.2.1.2 Stratified Randomisation
3.2.1.3 Minimisation Method
3.3 How Many Participants Do I Need in My Trial?
3.4 Analysis of Data from Clinical Trials
3.4.1 p-values and Confidence Intervals
3.4.2 Some Examples of Analysis of Data from Clinical Trials Using Familiar Statistical Methods
3.5 Summary

4. Epidemiology
4.1 Introduction
4.2 Types of Epidemiological Study
4.2.1 Surveys
4.2.2 Case-Control Studies
4.2.3 Cohort Studies
4.3 Relative Risk and Odds Ratios
4.4 Sample Size Estimation for Epidemiologic Studies
4.4.1 Sample Size Estimation for Case-Control Studies
4.4.2 Sample Size Estimation for Cohort Studies
4.5 Simple Analyses for Data from Observational Studies
4.5.1 Chi-Squared Test for Association
4.5.2 Finding a Confidence Interval for the Relative Risk and the Odds Ratio
4.5.3 Applying SAS to Analyse Examples of Epidemiological Data
4.5.4 Fisher's Test
4.5.5 Matched Case-Control Data
4.5.6 Stratified 2x2 Tables
4.6 Summary

5. Meta-Analysis
5.1 Introduction
5.2 Study Selection
5.3 Publication Bias
5.4 Statistics of Meta-Analysis
5.4.1 Fixed-Effects Model
5.4.2 Random-Effects Model
5.5 An Example of the Application of Meta-Analysis
5.6 Meta-Analysis on Sparse Data
5.7 Meta-Regression
5.8 Summary

6. Analysis of Variance and Covariance
6.1 Introduction
6.2 A Simple Example of One-Way Analysis of Variance
6.2.1 One-Way Analysis of Variance Model
6.2.2 Applying the One-Way Analysis of Variance Model to Sickle Cell Disease Data
6.3 Multiple Comparison Procedures
6.3.1 Planned Comparisons
6.3.2 Post Hoc Comparisons
6.4 A Factorial Experiment
6.4.1 Model for Three-Factor Design
6.5 Unbalanced Designs
6.5.1 Type I Sums of Squares
6.5.2 Type II Sums of Squares
6.5.3 Type III Sums of Squares
6.5.4 Analysis of Antipyrine Data
6.6 Nonparametric Analysis of Variance
6.6.1 Kruskal-Wallis Distribution-Free Test for One-Way Analysis of Variance
6.6.2 Applying the Kruskal-Wallis Test
6.7 Analysis of Covariance
6.8 Summary

7. Scatter Plots, Correlation, Simple Regression, and Smoothing
7.1 Introduction
7.2 Scatter Plot and Correlation Coefficient
7.3 Simple Linear Regression and Locally Weighted Regression
7.4 Locally Weighted Regression
7.5 Aspect Ratio of a Scatter Plot
7.6 Estimating Bivariate Densities
7.7 Scatter Plot Matrices
7.8 Summary

8. Multiple Linear Regression
8.1 Introduction
8.2 Multiple Linear Regression Model
8.3 Some Examples of the Application of the Multiple Linear Regression Model
8.3.1 Effect of the Amount of Anaesthetic Agent Administered during an Operation
8.3.2 Mortality and Water Hardness
8.3.3 Weight and Physical Measurements in Men
8.4 Identifying a Parsimonious Model
8.4.1 All Possible Subsets Regression
8.4.2 Stepwise Methods
8.5 Checking Model Assumptions: Residuals and Other Regression Diagnostics
8.6 General Linear Model
8.7 Summary

9. Logistic Regression
9.1 Introduction
9.2 Logistic Regression
9.3 Two Examples of the Application of Logistic Regression
9.3.1 Psychiatric 'Caseness'
9.3.2 Birth Weight of Babies
9.4 Diagnosing a Logistic Regression Model
9.5 Logistic Regression for 1:1 Matched Studies
9.6 Propensity Scores
9.7 Summary

10. Generalised Linear Model
10.1 Introduction
10.2 Generalised Linear Models
10.3 Applying the Generalised Linear Model
10.3.1 Poisson Regression
10.3.2 Regression with Gamma Errors
10.4 Residuals for GLMs
10.5 Overdispersion
10.6 Summary

11. Generalised Additive Models
11.1 Introduction
11.2 Scatter Plot Smoothers
11.3 Additive and Generalised Additive Models
11.4 Examples of the Application of GAMs
11.5 Summary

12. Analysis of Longitudinal Data I
12.1 Introduction
12.2 Graphical Displays of Longitudinal Data
12.3 Summary Measure Analysis of Longitudinal Data
12.3.1 Choosing Summary Measures
12.3.2 Applying the Summary Measure Approach
12.3.3 Incorporating Pretreatment Outcome Values into the Summary Measure Approach
12.3.4 Dealing with Missing Values When Using the Summary Measure Approach
12.4 Summary Measure Approach for Binary Responses
12.5 Summary

13. Analysis of Longitudinal Data II: Linear Mixed-Effects Models for Normal Response Variables
13.1 Introduction
13.2 Linear Mixed-Effects Models for Repeated Measures Data
13.2.1 Random Intercept and Random Intercept and Slope Models
13.2.2 Applying the Random Intercept and Random Intercept and Slope Models
13.3 Dropouts in Longitudinal Data
13.4 Summary

14. Analysis of Longitudinal Data III: Non-Normal Responses
14.1 Introduction
14.2 Marginal Models and Conditional Models
14.2.1 Marginal Models
14.2.2 Conditional Models
14.3 Analysis of the Respiratory Data
14.3.1 Marginal Models
14.3.2 Generalised Linear Mixed-Effects Models
14.4 Analysis of Epilepsy Data
14.4.1 Marginal Models
14.4.2 Generalised Linear Mixed-Effects Models
14.5 Summary

15. Survival Analysis
15.1 Introduction
15.2 Survivor Function and the Hazard Function
15.2.1 Survivor Function
15.2.2 Hazard Function
15.3 Comparing Groups of Survival Times
15.3.1 Log-Rank Test
15.3.2 Stratified Tests
15.4 Sample Size Estimation
15.5 Summary

16. Cox's Proportional Hazards Models for Survival Data
16.1 Introduction
16.2 Modelling the Hazard Function: Cox's Regression
16.2.1 Examples of Cox's Regression
16.2.2 Estimating the Baseline Hazard Function
16.2.3 Checking Assumptions in Cox's Regression
16.2.4 Stratified Cox's Regression
16.3 Time-Varying Covariates
16.4 Random-Effects Models for Survival Data
16.5 Summary

17. Bayesian Methods
17.1 Introduction
17.2 Bayesian Estimation
17.3 Markov Chain Monte Carlo
17.4 Prior Distributions
17.5 Model Selection When Using a Bayesian Approach
17.6 Some Examples of the Application of Bayesian Statistics
17.6.1 Psychiatric 'Caseness'
17.6.2 Cardiac Surgery in Babies
17.7 Summary

18. Missing Values
18.1 Introduction
18.2 Patterns of Missing Data
18.3 Missing Data Mechanisms
18.4 Exploring Missingness
18.5 Dealing with Missing Values
18.6 Imputing Missing Values
18.7 Analysing Multiply Imputed Data
18.8 Some Examples of the Application of Multiple Imputation
18.8.1 Air Pollution in US Cities
18.8.2 Growth of Danish Boys
18.9 Summary

References
Index