PhD Course in Biostatistics Univ.-Prof. DI Dr. Andrea Berghold Institute for Medical Informatics, Statistics and Documentation Medical University of Graz andrea.berghold@medunigraz.at
Content Introduction to Medical Statistics Study designs in medical research Exploring and summarizing data Populations and samples Statements of probability and confidence intervals Drawing inferences from data - Hypothesis testing Estimating and comparing means Proportions and chi-square tests Correlation and regression Diagnostic tools Methods for analysing survival data
Literature Martin Bland: An Introduction to Medical Statistics. 3rd ed. Oxford University Press, 2000. Douglas Altman: Practical Statistics for Medical Research. Chapman & Hall. Aviva Petrie and Caroline Sabin: Medical Statistics at a Glance. Blackwell Science, 2000
Statistical Methods - medical literature
NEJM June 2001: Methods Section of Full-Length Original Articles (by article, in column inches)

Statistical Methods   All methods   Percentage
 4.6                   35.7         12.9 %
 7.9                   53.6         14.7 %
12.2                   51.6         23.6 %
 7.3                   36.8         19.8 %
32.0                  177.7         18.0 %   (total)
Statistical Methods - medical literature In the same issue the following statistical methods were mentioned: Bonferroni method Chi-square test for independence Chi-square test for goodness-of-fit Confidence intervals Cox proportional hazards models Cumulative mortality Fisher's exact test Intention-to-treat analysis Interim analysis Kaplan-Meier survival curves Logistic regression Logrank test Mantel-Haenszel adjusted relative risks Noninferiority testing Odds ratio Power Analysis P-values Randomization Relative risk reduction Repeated measures ANOVA Sample size estimation Spearman correlation t-tests Wilcoxon test
Statistics Is it worth struggling with statistics? "Bad statistics leads to bad research, and bad research is unethical." Altman (1982)
Biostatistics - Medical Statistics Design of studies - How do I get adequate data? Data analysis using statistical methods - What do I do with the data? Critical appraisal - How do I interpret study results?
Study plan study interpret collect data analyse data
Study 1. Stating the problem Major objective of the study - determine relevant variables and factors Search the literature, discussion with experts 2. Designing the study Study design, sample size calculation etc. Statistical analysis plan Study protocol
Study 3. Collecting data Collecting data and plausibility checks 4. Data analysis Graphs and summary statistics Statistical inference 5. Interpretation of results and conclusions Discussion of new information
Stating the problem Some questions which should be answered in advance: What is the major objective of the study? Is the question clearly defined? Is it also relevant?
Examples 1. Does the one-year rate of restenosis differ between stenting and PTA for stenosis of the arteria iliaca? 2. Does a beta-blocker decrease all-cause mortality in patients with chronic heart failure? 3. Do cancer patients with anemia have a worse prognosis than patients without anemia? 4. Which method should be used for training in laparoscopic surgery? 5.
Variables Primary variable, endpoint 1. rate of restenosis; 2. all-cause mortality; 3. 5 year disease-specific survival; 4. number of stitches per minute; Factors 1. none 2. stage (NYHA class); 3. anemia, size of tumour, lymph nodes; 4. method, playing an instrument;... Other factors Age, sex, smoking...
Errors Random error inter- and intraindividual variability Systematic error - Bias Selection bias Assessment bias Information bias Try to avoid bias and reduce random error!
Types of studies Observational studies Experimental studies
Types of studies Main types of studies in medical research Observational studies: cross-sectional study, case-control study, cohort study Experimental studies: clinical trial, laboratory experiments
Observational Studies
Cross Sectional Study no direction of inquiry Population subjects selected for study with outcome without outcome Onset of study Time
Example

Exposure   Asthma: yes   Asthma: no     Total          Prevalence
Boys       344 = a       4885 = c       5229 = (a+c)   344 / 5229 = 0.066
Girls      221 = b       4787 = d       5008 = (b+d)   221 / 5008 = 0.044
Total      565 = (a+b)   9672 = (c+d)   10237          565 / 10237 = 0.055

$OR = \frac{a/b}{c/d} = \frac{344/221}{4885/4787} = 1.53$
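The arithmetic of the asthma example can be checked with a few lines of Python. This is a minimal sketch; the variable and function names are illustrative and not part of the course material.

```python
# Counts from the asthma example (cross-sectional survey).
boys_asthma, boys_no_asthma = 344, 4885
girls_asthma, girls_no_asthma = 221, 4787


def prevalence(cases, non_cases):
    """Proportion of subjects who have the disease at the time of the survey."""
    return cases / (cases + non_cases)


prev_boys = prevalence(boys_asthma, boys_no_asthma)
prev_girls = prevalence(girls_asthma, girls_no_asthma)
prev_total = prevalence(boys_asthma + girls_asthma,
                        boys_no_asthma + girls_no_asthma)

# Odds ratio: odds of asthma in boys divided by odds of asthma in girls.
odds_ratio = (boys_asthma / boys_no_asthma) / (girls_asthma / girls_no_asthma)

print(f"prevalence boys  = {prev_boys:.3f}")
print(f"prevalence girls = {prev_girls:.3f}")
print(f"prevalence total = {prev_total:.3f}")
print(f"odds ratio       = {odds_ratio:.2f}")
```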
Case-Control study Direction of inquiry exposed unexposed exposed unexposed cases controls Onset of study Time
Example: melanoma and sun protection during childhood

Exposure (during childhood)   Cases     Controls   Total
No sun protection             303 = a   290 = b    593
Sun protection                 99 = c   132 = d    231
Total                         402       422        824

$OR = \frac{a/c}{b/d} = \frac{303/99}{290/132} = 1.39$
95% confidence interval: [1.02; 1.89]
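The slide does not say how the confidence interval was obtained. The sketch below assumes the common log-based (Woolf) approximation, which gives roughly the same limits; the small difference in the lower limit may come from rounding or from a different method.

```python
import math

# Counts from the melanoma example (case-control study).
a, b = 303, 290   # no sun protection: cases, controls
c, d = 99, 132    # sun protection:    cases, controls

# Odds of exposure among cases divided by odds of exposure among controls.
odds_ratio = (a / c) / (b / d)

# Assumption: Woolf's method, i.e. a normal approximation on the log scale.
se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
z = 1.96  # standard normal quantile for a 95% interval
lower = math.exp(math.log(odds_ratio) - z * se_log_or)
upper = math.exp(math.log(odds_ratio) + z * se_log_or)

print(f"OR = {odds_ratio:.2f}, 95% CI [{lower:.2f}; {upper:.2f}]")
```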
Odds
The odds of a probability P are defined by
$Odds(P) = \frac{P}{1-P}$
The odds express the chance that an event happens relative to the chance that it does not.
Example:
P = 0.5: an event will happen with a probability of 50%; Odds(P) = 0.5/0.5 = 1 (chance of 1:1)
P = 0.8: Odds(P) = 0.8/0.2 = 4 (chance of 4:1)
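The conversion between probability and odds, in both directions, can be sketched as follows. The function names are illustrative.

```python
def odds(p):
    """Convert a probability p (0 <= p < 1) into odds p / (1 - p)."""
    if not 0 <= p < 1:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1 - p)


def probability(odds_value):
    """Convert odds back into a probability."""
    return odds_value / (1 + odds_value)


for p in (0.5, 0.8):
    print(f"P = {p}: odds = {odds(p):.2f}")
```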
Odds Ratio

Exposure      Disease: yes (cases)   Disease: no (controls)
Exposed       a                      b
Not exposed   c                      d

$OR = \frac{a/c}{b/d} = \frac{ad}{bc}$

OR = (chance that a case was exposed) / (chance that a control was exposed)
Cohort Study Direction of inquiry Population Cohort selected for study exposed (subjects) unexposed (controls) with outcome without outcome with outcome without outcome Onset of study Time
Examples of cohort studies Onset of study Exposure Prognostic study e.g. influence of anemia on survival Epidemiological study e.g. Framingham study time of diagnosis or start of therapy "Start" of observation prognostic factors risk factors
Example
Association between cigarette smoking and incidence of stroke in a cohort of 118 539 women (age 30-55), follow-up 8 years

Exposure       No. of cases of stroke   Person-years   Incidence (per 100 000 person-years)
Smoker         139                      280141         49.6
Ex-smoker       65                      232712         27.9
Never-smoked    70                      395594         17.7

$RR = \frac{139/280141}{70/395594} = 2.8$
95% confidence interval for RR: [2.1; 3.7]
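The incidence rates and the relative risk in the stroke example can be reproduced as below. The confidence interval uses a log-scale normal approximation for a ratio of two rates. This is an assumption, since the slide does not name the method, but it gives approximately the reported limits.

```python
import math

# Data from the stroke example (cohort study).
cases = {"Smoker": 139, "Ex-smoker": 65, "Never-smoked": 70}
person_years = {"Smoker": 280141, "Ex-smoker": 232712, "Never-smoked": 395594}


def incidence_per_100k(group):
    """Incidence rate per 100 000 person-years for one exposure group."""
    return cases[group] / person_years[group] * 100_000


# Relative risk of smokers versus never-smokers (ratio of incidence rates).
rate_ratio = incidence_per_100k("Smoker") / incidence_per_100k("Never-smoked")

# Assumption: standard error of log(rate ratio) = sqrt(1/cases_1 + 1/cases_0).
se_log_rr = math.sqrt(1 / cases["Smoker"] + 1 / cases["Never-smoked"])
lower = math.exp(math.log(rate_ratio) - 1.96 * se_log_rr)
upper = math.exp(math.log(rate_ratio) + 1.96 * se_log_rr)

for group in cases:
    print(f"{group:13s} {incidence_per_100k(group):5.1f}")
print(f"RR = {rate_ratio:.1f}, 95% CI [{lower:.1f}; {upper:.1f}]")
```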
Relative Risk

Exposure      Disease: yes   Disease: no   Total
Exposed       a              b             a+b
Not exposed   c              d             c+d

$RR = \frac{a/(a+b)}{c/(c+d)}$

RR = (incidence rate of exposed) / (incidence rate of not exposed)
Experimental Studies
Clinical trial Comparison of the efficacy of different drugs, therapies, vaccines etc. after controlling for confounders (e.g. age, sex, stage of disease). Aim: Observed differences in success rates between treatment groups can be attributed exclusively to differences in the efficacy of the treatments.
Statistical issues The efficacy and safety of treatments have to be judged against a background of biological variability In designing studies, two main points have to be kept in mind: the effect of bias the effect of chance
Focus Comparative trials: Interested in treatment effect and treatment comparisons Concurrent control group Investigate a new experimental intervention versus placebo or a standard intervention compare two alternative commonly-used interventions with each other Study the result of adding an additional agent to a standard regimen Compare different doses or intensities of an intervention Pre-defined study objective
Design techniques to avoid bias Randomization Blinding The most important design techniques for avoiding bias in clinical trials are blinding and randomisation. (ICH E9: Statistical Principles in Clinical Trials)
Randomization To allocate treatments to subjects in a trial at random (using coins, dice, random number tables or generators) Allocation concealment Neither the subject nor the investigator knows ahead of time what treatment the subject will receive Benefits: Eliminates assignment bias, avoids selection bias Tends to produce comparable groups Statistical basis for a valid treatment comparison
Randomization
20 patients will be allocated at random to two groups
Patients: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20
We throw the die once for each patient. Result: 2, 5, 3, ...
odd number: group A
even number: group B
Group A: 2, 3, 6, 9, 12, 13, 16, 17, 19
Group B: 1, 4, 5, 7, 8, 10, 11, 14, 15, 18, 20
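The die-throwing procedure can be imitated with a random number generator. This is a minimal sketch: the function name and the seed are illustrative, and because the generator's output differs from a physical die, the resulting groups will not match the ones on the slide.

```python
import random


def simple_randomization(patients, seed=None):
    """Allocate each patient by one simulated die throw: odd -> A, even -> B."""
    rng = random.Random(seed)  # a fixed seed makes the list reproducible
    groups = {"A": [], "B": []}
    for patient in patients:
        roll = rng.randint(1, 6)  # one throw of a fair die
        groups["A" if roll % 2 == 1 else "B"].append(patient)
    return groups


allocation = simple_randomization(range(1, 21), seed=2024)
print("Group A:", allocation["A"])
print("Group B:", allocation["B"])
```

As with the die, nothing forces the two groups to have the same size.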
[Figure: allocation of patients 1 to 10 to groups A and B]
Restricted randomization Disadvantages of simple randomization: No guarantee of equal or approximately equal sample size in each treatment group at any stage of the trial (with n = 20 on two treatments A and B, the probability of a 12:8 split or worse, in either direction, is approximately 0.50); no protection against long runs of one treatment; subject characteristics may change over time. Restricted randomization: permuted blocks (Matts & Lachin), biased coin (Efron), urn design (Wei), Big Stick (Soares & Wu)
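The chance of an unbalanced split under simple randomization can be computed exactly from the binomial distribution. The sketch below assumes each patient is allocated independently with probability 1/2; the function name is illustrative. Counting imbalance in either direction, a 12:8 split or worse with n = 20 has probability of about 0.50; the probability that one specific arm receives 12 or more patients is about 0.25.

```python
from math import comb


def prob_split_or_worse(n, larger_group):
    """P(either arm gets at least `larger_group` of n patients), fair allocation."""
    smaller_group = n - larger_group
    favourable = sum(
        comb(n, k)
        for k in range(n + 1)
        if k >= larger_group or k <= smaller_group
    )
    return favourable / 2 ** n


print(f"n = 20,  12:8 or worse:  {prob_split_or_worse(20, 12):.3f}")
print(f"n = 100, 60:40 or worse: {prob_split_or_worse(100, 60):.3f}")
```

The same relative imbalance becomes much less likely as the sample size grows, which is why restricted randomization matters most in small trials.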
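Randomization
Block randomization with blocks of size 4; possible blocks:
1: AABB   2: ABAB   3: ABBA   4: BABA   5: BAAB   6: BBAA
Number of possible blocks: $\frac{n!}{n_1!\, n_2!} = \frac{4!}{2!\, 2!} = 6$
Randomization list (kept only at the study coordinating centre and not available to the researcher)

Pat.   Allocation   Therapy
1      A            Radiotherapy
2      A            Radiotherapy
3      B            Radiotherapy + chemotherapy
4      B            Radiotherapy + chemotherapy
5      A            Radiotherapy
6      B            Radiotherapy + chemotherapy
7      A            Radiotherapy
8      B            Radiotherapy + chemotherapy
9      B            Radiotherapy + chemotherapy
10     A            Radiotherapy
11     A            Radiotherapy
12     B            Radiotherapy + chemotherapy
13     B            Radiotherapy + chemotherapy
14     B            Radiotherapy + chemotherapy
15     A            Radiotherapy
16     A            Radiotherapy
...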
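A permuted-block list of this kind could be generated as follows. This is a sketch under the assumption of two treatments and a fixed block size of 4; the function names and the seed are illustrative.

```python
import random
from itertools import permutations


def balanced_blocks(block="AABB"):
    """All distinct orderings of a block with two A and two B allocations."""
    return sorted({"".join(p) for p in permutations(block)})


def block_randomization(n_patients, seed=None):
    """Build a randomization list by drawing whole blocks at random."""
    rng = random.Random(seed)
    blocks = balanced_blocks()
    allocation = []
    while len(allocation) < n_patients:
        allocation.extend(rng.choice(blocks))  # append one complete block
    return allocation[:n_patients]


print(balanced_blocks())  # the 6 possible blocks of size 4
for patient, arm in enumerate(block_randomization(16, seed=1), start=1):
    print(patient, arm)
```

After every completed block the two arms are exactly balanced, so the group sizes can never differ by more than 2.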
Stratified randomization Balance treatment groups with respect to prognostic factors For large studies, randomization tends to give balance For smaller studies a better guarantee may be needed Common factors used for stratification - e.g. clinical centre, age, sex, disease severity Define strata e.g. Age: < 40, 40-60, > 60; Sex: M, F (3 x 2 = 6 strata) Randomization is performed within each stratum and is usually blocked Rule of thumb: use as few stratification factors as possible
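Stratified blocked randomization amounts to keeping one independent blocked list per stratum. The sketch below uses the age and sex strata from the example; the block size of 4, the number of blocks per stratum, and all names are assumptions made for illustration.

```python
import random
from itertools import product

AGE_GROUPS = ["<40", "40-60", ">60"]
SEXES = ["M", "F"]


def stratified_lists(blocks_per_stratum=2, seed=None):
    """Return one blocked randomization list for each (age group, sex) stratum."""
    rng = random.Random(seed)
    lists = {}
    for stratum in product(AGE_GROUPS, SEXES):
        allocation = []
        for _ in range(blocks_per_stratum):
            block = list("AABB")   # balanced block of size 4
            rng.shuffle(block)     # random order within the block
            allocation.extend(block)
        lists[stratum] = allocation
    return lists


for stratum, allocation in stratified_lists(seed=3).items():
    print(stratum, "".join(allocation))
```

Each additional stratification factor multiplies the number of lists, which is one reason to keep the number of factors small.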
Randomized controlled trials The trial carried out by the Medical Research Council (MRC, 1948) to test the efficacy of streptomycin for the treatment of pulmonary tuberculosis is generally considered to be the first randomized experiment in medicine. target population: patients with progressive bilateral pulmonary tuberculosis (bacterially proven), aged 15-30 years 107 patients in 3 centers were allocated by a series of random numbers drawn up for each sex at each centre.
Implementation Sequenced sealed envelopes Phone call / fax to trial coordination centre Interactive Voice Response Systems Internet-based Systems (e.g. Randomizer for Clinical Trials)
Randomizer for Clinical Trials
Online Randomization www.randomizer.at
Blinding - Masking To limit the occurrence of bias in the conduct and interpretation of the trial (in the care, the assessment of endpoints, the attitude of subjects to treatments etc.) Double-blind: neither subject nor investigator/staff are aware of the treatment received placebo, double dummy, masked vials blinding may not be possible surgical versus medical intervention one intervention has obvious side-effect Outcome assessed by masked observer Single-blind Open-label trial
Randomized controlled trials Choice of target population Selection of patients: Definition of target population using inclusion and exclusion criteria Trial Design Parallel Design Cross-Over Design
Parallel - Design Population Screening Eligible and willing subjects Randomization Test Control Assessment
Cross Over - Design Population Screening Eligible and willing subjects Randomization Control Test Assessment Test Control Assessment
Statistical analysis The statistical analysis has to be defined before the study is carried out Statistical analysis plan (SAP) Population used for analysis: All-Randomized patients Intention-to-treat analysis On-treatment patients Per-protocol analysis Safety population
Intention-to-Treat all randomized patients must be included in the analysis - they have to be included in the group they were randomised to, independent of what happened after randomization.
Intention-to-Treat (ITT) Analysis
Randomization
Treatment A: (1) Treatment A per protocol, (2) Treatment withdrawal
Treatment B: (3) Treatment B per protocol, (4) Treatment withdrawal
Intention-to-Treat: 1+2 vs 3+4
Per-Protocol (PP): 1 vs 3
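The difference between the two analysis populations can be illustrated with a small sketch. The patient records below are hypothetical and invented purely for illustration; they are not taken from any study.

```python
# Hypothetical records, invented for illustration only:
# (randomized arm, treated per protocol?, died?)
records = [
    ("A", True, False), ("A", True, False), ("A", True, True),
    ("A", True, False), ("A", False, True), ("A", False, False),
    ("B", True, True), ("B", True, False), ("B", True, True),
    ("B", True, False), ("B", True, False), ("B", False, False),
]


def death_rate(arm, per_protocol_only=False):
    """Death rate in one arm: ITT by default, PP if per_protocol_only is True."""
    selected = [
        died
        for group, per_protocol, died in records
        if group == arm and (per_protocol or not per_protocol_only)
    ]
    return sum(selected) / len(selected)


for arm in ("A", "B"):
    print(f"arm {arm}: ITT = {death_rate(arm):.2f}, "
          f"PP = {death_rate(arm, per_protocol_only=True):.2f}")
```

In the ITT analysis every randomized patient counts in the arm assigned at randomization; the PP analysis drops the withdrawals, which can change the comparison.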
Illustration
Percentage of patients who died within 6 weeks after myocardial infarction (Wilcox et al.)

               Propranolol   Atenolol   Placebo
ITT analysis    7.6%          8.7%      11.6%
PP analysis     3.4%          2.6%      11.2%
Withdrawal     15.9%         17.6%      12.5%
Efficacy and Effectiveness
Efficacy: effect under optimal conditions. Only patients who were treated per protocol are included in the analysis. Per-protocol analysis
Effectiveness: effect under real conditions. All patients who were included in the study are analysed, including those who withdrew or changed treatment. Intention-to-treat analysis
Main points RCTs Randomization: concealed allocation Blinding: double-blind study Minimal loss to follow-up Intention-to-treat analysis Carry out the specified analysis
Laboratory Experiments Exactly the same principles apply to laboratory experiments on animals or on biological specimens as for clinical trials Stricter control of extraneous factors is possible Effect of uncertainties is minimized: use of control group, randomization, replication The principle of randomization is often not well understood Using genetically similar animals: little biological variability