PRACTICAL STATISTICS FOR MEDICAL RESEARCH Douglas G. Altman Head of Medical Statistical Laboratory Imperial Cancer Research Fund London CHAPMAN & HALL/CRC Boca Raton London New York Washington, D.C.
Contents Preface xi 1 1.1 1.2 1.3 1.4 1.5 2 2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 3 3.1 3.2 3.3 3.4 3.5 3.6 3.7 4 4.1 4.2 4.3 4.4 4.5 Statistics in medical research Statistics at large Statistics in medicine Statistics in medical research What does statistics cover? The scope of this book Types of data Categorical data Numerical data Other types of data Censored data Variability Importance of the type of data Dealing with numbers Describing data Averages Describing variability Quantifying variability Two variables The effect of transforming the data Data presentation Theoretical distributions Probability Samples and populations Probability distributions The Normal distribution 1 1 3 4 5 8 11 13 16 17 17 17 19 19 21 22 31 38 41 42 45 48 48 49 50 50 51
4.6 4.7 4.8 4.9 4. 4.11 The Lognormal distributions The Binomial distribution The Poisson distribution Mathematical calculations The Uniform distribution Concluding remarks Contents vii 60 63 66 68 71 71 71 5 5.1 5.2 5.3 5.4 5.5 5.6 5.7 5.8 5.9 5. 5.11 5.12 5.13 5.14 Designing research 74 74 Categories of research design 75 Sources of variation 78 An experiment: is the blood pressure the same in both arms? 79 The design of experiments 80 The structure of an experiment 83 Random allocation 85 Minimization 91 Observational studies 91 The case-control study 93 The cohort study 96 The cross-sectional study 99 Studies of change over time 1 Choosing a study design 2 3 6 6.1 6.2 6.3 6.4 6.5 6.6 6.7 6.8 6.9 6. 6.11 7 7.1 7.2 7.3 7-4 7.5 Using a computer Advantages of using a computer Disadvantages of using a computer Types of statistical program Evaluating a statistical package Strategy for computer-aided analysis Forms for data collection Plotting Other uses of computers Misuses of the computer Concluding remarks Preparing to analyse data Data checking Outliers Missing data Data screening 7 7 7 8 1 111 112 114 119 120 120 121 122 122 122 126 130 132
vni Contents 7.6 7.7 7.8 Why transform data? Other features of the data Concluding remarks 143 146 149 149 8 8.1 8.2 8.3 8.4 8.5 8.6 8.7 8.8 8.9 8. 8.11 9 9.1 9.2 9.3 9.4 9.5 9.6 9.7 9.8 9.9 9. 9.11.1.2.3.4.5.6.7.8.9 Principles of statistical analysis Sampling distributions A demonstration of the distribution of sample means Estimation Hypothesis testing Non-parametric methods Statistical modelling Estimation or hypothesis testing? Strategy for analysing data Presentation of results Summary Comparing groups - continuous data Choosing an appropriate method of analysis The t distribution One group of observations Two groups of paired observations Two independent groups of observations Analysis of skewed data Three or more independent groups of observations One way analysis of variance - mathematics and worked example Presentation of results Summary Comparing groups - categorical data One proportion Proportions in two independent groups Two paired proportions Comparing several proportions The analysis of frequency tables 2 x 2 frequency tables - comparison of two proportions 2 x k tables - comparison of several proportions Large tables with ordered categories 152 152 153 155 160 165 171 173 174 175 176 177 177 179 179 179 181 183 189 191 199 205 218 220 222 223 229 229 230 232 235 241 241 250 259 265
..11.12.13 k x k tables - analysis of matched variables Comparing risks Presentation of results Summary Contents ix 266 266 271 271 272 11 11.1 11.2 11.3 11.4 11.5 11.6 11.7 11.8 11.9 11. 11.11 11.12 11.13 11.14 11.15 11.16 11.17 Relation between two continuous variables 277 Association, prediction and agreement 277 Correlation 278 Use and misuse of correlation 282 Rank correlation 285 Adjusting a correlation for another variable 288 Use of the correlation coefficient in assessing non-normality 291 Correlation - mathematics and worked examples Interpretation of correlation Presentation of correlation Regression Use of regression Extensions Regression - mathematics and worked example Interpretation of regression Relation to other analyses Presentation of regression Regression or correlation? 293 297 300 300 306 309 311 316 318 319 320 321 12 12.1 12.2 12.3 12.4 12.5 12.6 12.7 13 13.1 13.2 13.3 13.4 13.5 13.6 13.7 Relation between several variables Analysis of variance and multiple regression Two way analysis of variance Multiple regression Logistic regression Discriminant analysis Other methods Analysis of survival times Survival probabilities Comparing survival curves in two groups Mathematical calculations and worked examples Incorrect analyses Modelling survival - the Cox regression model Desien of survival studies 325 325 325 326 336 351 358 360 361 365 365 367 371 377 385 387 393
x Contents 13.8 Presentation of results 393 394 14 Some common problems in medical research 396 14.1 396 14.2 Method comparison studies 396 14.3 Inter-rater agreement 403 14.4 Diagnostic tests 409 14.5 Reference intervals 419 14.6 Serial measurements 426 14.7 Cyclic variation 433 435 15 Clinical trials 440 15.1 440 15.2 Design of clinical trials 441 15.3 Sample size 455 15.4 Analysis 461 15.5 Interpretation of results 471 15.6 Writing up and assessing clinical trials 473 474 16 The medical literature 477 16.1 477 16.2 The growth of statistics in medical research 478 16.3 Statistics in published papers 481 16.4 Reading a scientific paper 493 16.5 Writing a scientific paper 498 499 Appendix A Mathematical notation 505 Al.l 505 A1.2 Basic ideas 505 A1.3 Mathematical symbols 509 A1.4 Functions 5 A1.5 Glossary of notation 5 Appendix B Statistical tables 514 Answers to exercises 546 References 575 Index 589