Learning Objectives 9/9/2013. Hypothesis Testing. Conflicts of Interest. Descriptive statistics: Numerical methods Measures of Central Tendency

Biostatistics
Kevin M. Sowinski, Pharm.D., FCCP
Last-Chance Ambulatory Care Webinar
Thursday, September 5, 2013

Conflicts of Interest
I have no conflict of interest to disclose.

Learning Objectives
For statistical tests commonly encountered in the pharmacotherapy literature, specify their appropriate application and interpretation.
Differentiate observational and controlled trial designs based upon their strengths and weaknesses.
Detect common hazards in presenting and interpreting the statistical results of various trial types.

Descriptive Statistics: Numerical Methods

Measures of Central Tendency
Mean
  Used only for continuous and normally distributed data.
Median (a.k.a. 50th percentile)
  Midpoint of the values when placed in order from highest to lowest; half of the values lie above it and half below.
  Used for ordinal or continuous data (especially for skewed populations).
Mode
  Most common value in a distribution.
  Used for nominal, ordinal, or continuous data.
  Data may have more than one mode (bimodal, trimodal).

Measures of Data Spread and Variability
SD: measure of the variability about the mean, applied to normally distributed continuous data.
Empirical rule: about 68% of values fall within ±1 SD, 95% within ±2 SD, and 99.7% within ±3 SD of the mean.
CV relates the SD to the mean: CV = (SD/mean) × 100%.
Variance = SD²
Others: range and percentiles (IQR).

Hypothesis Testing
Null hypothesis (H0): no difference between comparator groups (Tx A = Tx B).
Alternative hypothesis (Ha): states that there is a difference (Tx A ≠ Tx B).
Results of hypothesis testing indicate whether there is enough evidence to reject H0.
  H0 is rejected = statistically significant (SS) difference.
  H0 is not rejected = no SS difference. We are not concluding that the treatments are equal.
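The descriptive measures listed above (mean, median, mode, SD, variance, CV, range, and quartiles) can be computed with Python's standard library. The sketch below is illustrative only: the function name `describe` and the sample values are hypothetical, the SD is the sample SD (n - 1 denominator), and the quartiles use the library's default method, which is one of several conventions.

```python
import statistics


def describe(values):
    """Return the descriptive measures named in the handout for a list of numbers."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1 denominator)
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    return {
        "mean": mean,
        "median": statistics.median(values),
        "modes": statistics.multimode(values),  # a list, since data may be bimodal
        "sd": sd,
        "variance": statistics.variance(values),  # equals SD squared
        "cv_percent": sd / mean * 100,  # CV = (SD/mean) x 100%
        "range": (min(values), max(values)),
        "iqr_bounds": (q1, q3),  # 25th and 75th percentiles
    }


# Hypothetical right-skewed sample: one large value pulls the mean above the median.
sample = [2, 4, 4, 5, 7, 8, 40]
summary = describe(sample)
print(summary)
```

In this made-up sample the mean is twice the median, which illustrates why the median and IQR are preferred for skewed data.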

Decision Errors
Type I error (α)
  Probability of making a Type I error = significance level (α).
  Usually α = 0.05: 5% of the time, we will conclude there is a SS difference when one does not actually exist.
  The p-value is the probability of obtaining a result at least as extreme as the one observed if H0 is true; it is compared with α to decide whether to reject H0.
  A lower p-value does not suggest more importance, only that the result is SS and less likely attributable to chance.
Type II error (β)
  Usually β = 0.10-0.20.
  Concluding that no difference exists when one truly does (not rejecting H0 when it should be rejected).

Statistical Significance
Interpreting p-values
  Size of the p-value is not related to the importance of the result.
  Statistically significant does not necessarily mean clinically significant.
  Lack of statistical significance does not mean results are unimportant.
Post hoc power calculations
  For negative results
  For positive results (reject H0)

Confidence Intervals
Can be reported for many different types of analysis.
A CI is a range of values that is likely to cover the true population parameter.
Difference between two means: a range of values reported along with a point estimate of the difference between the two groups. A 95% CI that includes a value consistent with no difference (i.e., 0) can be interpreted as p > 0.05.
OR, RR, hazard ratio: statistical significance is assessed based on whether the bounds of the CI include 1.

Choosing a Test
Type of variable: 2 samples (independent) | 2 samples (related) | >2 samples (independent) | >2 samples (related)
Nominal: χ² or Fisher exact test | McNemar test | χ² | Cochran Q
Ordinal: Wilcoxon rank sum, Mann-Whitney U-test | Wilcoxon signed rank, Sign test | Kruskal-Wallis (MCP) | Friedman ANOVA
Continuous, no factors: Equal variance t-test, Unequal variance t-test | Paired t-test | 1-way ANOVA (MCP) | Repeated-measures ANOVA
Continuous, 1 factor: ANCOVA | 2-way repeated-measures ANOVA | 2-way ANOVA (MCP) | 2-way repeated-measures ANOVA

Vignette 1 (Choosing Appropriate Measures of Central Tendency)
A pharmacy practice resident is planning a project that compares patient satisfaction with the services of a recently established medication therapy management program. She plans to evaluate several outcomes of the program's impact (e.g., adherence, goal attainment), including patients' satisfaction with the services provided. Satisfaction will be assessed using a Likert scale-based instrument.
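As a side note on the Confidence Intervals slide above, the rule "a 95% CI that includes the no-difference value is not statistically significant" can be written as a small check. This is a minimal sketch, not a statistical test in itself: the function names are hypothetical, the interval bounds in the examples are made up, and the no-difference value is taken to be 0 for differences and 1 for ratios (OR, RR, hazard ratio), as stated on the slide.

```python
def ci_excludes_null(lower, upper, null_value):
    """Return True when the interval [lower, upper] does not contain null_value."""
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    return not (lower <= null_value <= upper)


def significant_difference(lower, upper):
    """Difference measures (e.g., difference between two means): null value is 0."""
    return ci_excludes_null(lower, upper, 0.0)


def significant_ratio(lower, upper):
    """Ratio measures (OR, RR, hazard ratio): null value is 1."""
    return ci_excludes_null(lower, upper, 1.0)


# Hypothetical intervals, chosen only to illustrate the rule.
print(significant_difference(-1.2, 3.4))  # interval spans 0
print(significant_ratio(1.2, 2.5))        # interval lies entirely above 1
print(significant_ratio(0.8, 1.3))        # interval spans 1
```

The check says nothing about clinical significance; that judgment depends on the size of the effect, not on whether the interval crosses the null value.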

Question 1 (Choosing Appropriate Measures of Central Tendency)
Which measures of central tendency and data dispersion should be used to compare patient satisfaction between the intervention group and a control group?
A. Mean and SD
B. Median and SD
C. Median and IQR
D. Mode and SEM

Question 1a (Choosing Appropriate Statistical Tests)
Which statistical test should be used to compare patient satisfaction between the intervention group and a control group?
A. Fisher's exact test
B. Mann-Whitney U
C. ANOVA
D. Wilcoxon signed-rank

Vignette 2 (Decision Errors)
A recent study comparing two medications on their ability to lower blood pressure concludes no difference between them (p = 0.08, i.e., p > 0.05). In reviewing the methods section you read that the investigators calculated their sample size by setting alpha at 0.05 and beta at 0.40.

Question 2 (Decision Errors)
Which of the following is the best interpretation of the results of this study?
A. A Type I error may have occurred, alpha should be set at < 0.10
B. A Type II error may have occurred, beta should be set at < 0.20
C. A Type I error may have occurred, alpha should be set at < 0.01
D. A Type II error may have occurred, beta should be set at > 0.40

Vignette 3 (Confidence Intervals)
A randomized controlled trial reports that the difference in LDL concentrations between a new HMG-CoA reductase inhibitor and an existing one is 0.5%. The 95% confidence interval for this difference is (-0.1% to 0.9%).

Question 3 (Confidence Intervals)
Which of the following would be an appropriate interpretation of these results?
A. There is no statistically significant difference between this drug and placebo (p > 0.05)
B. There is a statistically significant difference between this drug and placebo (p < 0.05)
C. There is a statistically significant difference between this drug and placebo (p < 0.01)
D. The authors need to report a p-value to indicate if the observed difference is statistically significant

Correlation and Regression
Correlation examines the strength/degree of association between 2 variables.
  Correlation coefficient (r) ranges from -1 to +1.
Regression analysis examines the ability of one or more variables to predict another variable.
  e.g., simple/multiple linear regression, simple/multiple logistic regression, nonlinear, polynomial, etc.
  Y = mx + b; risk of MI in patients with chest pain; risk of GI bleeding in critically ill patients; CrCl from SeCr.
Coefficient of determination (r²) can range from 0 to 1.
  An r² of 0.80: 80% of the variability in Y is explained by the variability in X.

Survival Analysis
Studies the time between entry in a study and some event (e.g., death, MI).
Kaplan-Meier method: uses survival times to estimate the proportion of people who would survive a length of time.
Log-rank test: compares the survival distributions of ≥ 2 groups.
Cox proportional hazards model
  Impact of covariates on survival in ≥ 2 groups.
  Allows calculation of a hazard ratio (and CI).

Observational Study Design: Summary of Characteristics
Case report/case series
  Major advantages: generate new information about the natural history of disease; identify a new disease/condition
  Major disadvantages: usually can't measure rates of association
Case-control
  Measure of association: OR
  Major advantages: study relatively rare outcomes; low cost and short duration
  Major disadvantages: not practical for studying rare exposures; inability to study multiple outcomes in one study
Cohort
  Measure of association: RR
  Major advantages: study relatively rare exposures; study temporal associations; estimate direct risk estimates
  Major disadvantages: not practical for studying rare outcomes; increased cost and longer duration (prospective)
Cross-sectional
  Measure of association: prevalence
  Major advantages: low cost and short duration
  Major disadvantages: temporal associations can't be established
Adapted from Pharmacotherapy 2010;30:973-984
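The two ratio measures in the summary above (RR for cohort studies, OR for case-control studies) both come from a 2×2 table of counts. The following is a minimal sketch under stated assumptions: the function names and all counts are hypothetical and chosen only for illustration, and no confidence interval is computed.

```python
def relative_risk(exposed_events, exposed_total, unexposed_events, unexposed_total):
    """RR = risk in the exposed group divided by risk in the unexposed group (cohort data)."""
    risk_exposed = exposed_events / exposed_total
    risk_unexposed = unexposed_events / unexposed_total
    return risk_exposed / risk_unexposed


def odds_ratio(case_exposed, case_unexposed, control_exposed, control_unexposed):
    """OR = odds of exposure among cases divided by odds of exposure among controls."""
    return (case_exposed * control_unexposed) / (case_unexposed * control_exposed)


# Hypothetical cohort: 20 of 100 exposed and 10 of 100 unexposed develop the outcome.
rr = relative_risk(20, 100, 10, 100)

# Hypothetical case-control study: 40 of 100 cases and 20 of 100 controls were exposed.
odds = odds_ratio(40, 60, 20, 80)

print(f"RR = {rr:.2f}, OR = {odds:.2f}")
```

A case-control study samples on outcome rather than exposure, so risks cannot be estimated directly from it; that is why the OR, not the RR, is reported for that design.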

Relative vs. Absolute Differences
Medication group | Placebo group | Absolute difference | Relative difference
40% (4/10) | 20% (2/10) | 20% | 50%
4% (4/100) | 2% (2/100) | 2% | 50%
0.4% (4/1000) | 0.2% (2/1000) | 0.2% | 50%

NNT: the reciprocal of the ARR
  NNT = 1/ARR
  Rounded to the next highest whole number.

Vignette 4 (Correlation/Regression/Survival Analysis)
In working with some data concerning the occurrence of adverse effects when taking a new drug, you are deciding whether to carry out a correlation analysis or a regression analysis.

Question 4 (Correlation/Regression/Survival Analysis)
Which of the following would be an appropriate consideration in making this decision?
A. Correlation analyses require more data than regression analyses
B. Regression analyses are useful for detecting only linear relationships between variables
C. Correlation analyses work best for data that have been collected in retrospective trials
D. Regression analyses are useful in predicting the value of one variable based upon the value of another variable

Vignette 5 (Relative Risk/Odds Ratios)
A case-control trial (n = 102 cases, n = 650 controls) reports that the odds ratio (OR) for myocardial infarction (MI) is 0.7 (95% CI: 0.5, 0.9) in persons who have a history of receiving a certain vaccine in childhood.
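Returning to the Relative vs. Absolute Differences table and the NNT definition earlier on this page, the arithmetic can be reproduced in a few lines. This is a sketch under assumptions: the function name `compare_rates` is hypothetical, the first rate in each row is treated as the reference for the relative difference (which is what yields the table's 50%), and exact fractions are used so that rounding up is not affected by floating-point error.

```python
import math
from fractions import Fraction


def compare_rates(ref_events, ref_n, cmp_events, cmp_n):
    """Return (absolute difference, relative difference, NNT) for two event rates."""
    ref_rate = Fraction(ref_events, ref_n)
    cmp_rate = Fraction(cmp_events, cmp_n)
    absolute = abs(ref_rate - cmp_rate)
    if absolute == 0:
        raise ValueError("NNT is undefined when the two rates are equal")
    relative = absolute / ref_rate  # assumption: relative to the first (reference) rate
    nnt = math.ceil(1 / absolute)   # NNT = 1/ARR, rounded up to a whole number
    return absolute, relative, nnt


# The three rows of the table: same relative difference, very different absolute ones.
rows = [(4, 10, 2, 10), (4, 100, 2, 100), (4, 1000, 2, 1000)]
results = [compare_rates(*row) for row in rows]
for absolute, relative, nnt in results:
    print(f"absolute = {float(absolute):.1%}, relative = {float(relative):.0%}, NNT = {nnt}")
```

The relative difference stays at 50% in every row while the NNT grows tenfold each time, which is the point of the table: a relative difference alone does not convey how many patients must be treated.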

Question 5 (Relative Risk/Odds Ratios)
Which of the following represents the best interpretation of this result?
A. This is a statistically significant difference; the vaccine is associated with a lower risk of MI
B. This is not a statistically significant difference; the vaccine is not associated with a lower risk of MI
C. This is a statistically significant difference; the vaccine decreases the risk of MI
D. This is not a statistically significant difference; the vaccine does not increase the risk of MI

Vignette 6 (Clinical Trials: Randomization Issues)
A randomized, double-blind trial is conducted to determine if a new vasodilator improves symptoms in HF. Patients are randomized to either the new drug or placebo, in addition to continuing existing standard medications. The intention-to-treat analysis shows no statistical difference in treatments. An actual treatment analysis showed that the new drug was more effective than placebo. Patients who did not take at least 70% of the study drug were reclassified as placebo for the actual treatment analysis.

Question 6 (Clinical Trials: Analysis Issues)
Which one of the following is the best course of action based on these results?
A. Recommend the new drug for all patients with HF
B. Recommend the new drug for patients who adhere to their current therapies
C. Recommend the new drug only for early-stage HF
D. Do not recommend the new drug, wait for further studies

Vignette 7 (Meta-analysis)
The results of a meta-analysis of 7 randomized, placebo-controlled trials investigating the effects of a drug on stroke are shown in the figure below. The results are expressed as an odds ratio (OR) comparing the new drug to the conventional therapy.
[Figure not reproduced: meta-analysis results plotted on an OR axis marked at 0.5, 1, and 2]

Question 7 (Meta-analysis)
Which one of the following is the best interpretation of these results, as shown in the figure?
A. The drug does not have an effect on the frequency of stroke
B. The drug decreases the frequency of stroke
C. The drug increases the frequency of stroke
D. The drug's effect on stroke cannot be determined from the figure

Vignette 8 (Relative and Absolute Differences, NNT)
A recent trial comparing a new antibiotic to ciprofloxacin in the treatment of uncomplicated urinary tract infections shows that 89% of patients receiving the new drug were symptom-free at 3 days, while 85% of the ciprofloxacin patients were symptom-free at 3 days (p < 0.05).

Question 8 (Relative and Absolute Differences, NNT)
How many patients would need to be treated with the new drug to have one additional patient symptom-free at 3 days?
A. 2
B. 4
C. 25
D. 95

Vignette R1 (Choosing Appropriate Statistical Tests)
A study compares the plasma concentrations of a new antiretroviral medication both with and without the concomitant administration of ganciclovir. Patients will take a fixed dose of the new medication and have steady-state plasma concentrations evaluated before receiving ganciclovir and again 2 weeks later after having received ganciclovir for 48 hours.

Question R1 (Choosing Appropriate Statistical Tests)
To determine whether there are differences in steady-state plasma concentrations, which test should be used to compare the results of this investigation?
A. Independent samples t-test
B. Chi-squared with Bonferroni correction
C. Paired t-test
D. Kruskal-Wallis ANOVA

Vignette R2 (Decision Errors)
You are planning a trial to compare the incidence of depression recurrence with 9 months of antidepressant treatment compared with 18 months of antidepressant treatment.

Question R2 (Decision Errors)
Which of the following strategies could be used to increase the power to detect any differences?
A. Increase the sample size
B. Decrease alpha
C. Increase beta
D. Decrease the size of the difference you wish to detect

Vignette R3 (Confidence Intervals)
In a large (n = 4218) randomized controlled trial, a new drug is compared to placebo to assess its effect on raising HDL cholesterol. Compared to placebo, the new drug is demonstrated to increase HDL cholesterol by 2 mg/dL, 95% CI (0.6, 3.4).

Question R3 (Confidence Intervals)
Which of the following represents the best interpretation of such results?
A. The results are neither statistically nor clinically significant (i.e., don't use the drug)
B. The results are not statistically significant, but are clinically significant (i.e., use the drug)
C. The results are statistically significant, but may not be clinically significant
D. The results are statistically significant, and clinically significant

Vignette R4 (Correlation/Regression/Survival Analysis)
A trial investigating the time to relapse for patients receiving a new drug to treat alcoholism reports its results using a Kaplan-Meier analysis.

Question R4 (Correlation/Regression/Survival Analysis)
Which of the following characteristics of such a model would be important to be aware of when interpreting these results?
A. The statistical test used in this type of analysis depends on several assumptions (e.g., independent and random samples)
B. A Kaplan-Meier curve can be used to investigate several variables at one time
C. This type of analysis requires that all subjects experience the event during the study
D. The results of such an analysis are represented graphically as smooth, bell-shaped curves

Vignette R5 (Relative Risk/Odds Ratios)
A cohort study reports a relative risk (RR) of breast cancer of 1.4 (1.1, 1.7) in women who reported having used an over-the-counter herbal antidepressant in the past 5 years.

Question R5 (Relative Risk/Odds Ratios)
Which of the following is the best way to interpret these results?
A. The herbal product should not be used; it increases the risk of breast cancer substantially
B. The herbal product may be associated with increased risk of breast cancer, but should be studied prospectively
C. The herbal product is not associated with any increase in breast cancer risk
D. The herbal product appears safe and effective

Vignette R6 (Regression Analysis)
A study was conducted investigating the relationship between the dose of a beta-agonist and FEV1. The results shown in the figure were: r = -0.46, p < 0.05.

Question R6 (Regression Analysis)
Which of the following represents the percentage of the variability in FEV1 that is explained by the variability in the dose of the beta-agonist?
A. 70 percent
B. 21 percent
C. 46 percent
D. 92 percent

Vignette R7 (Meta-analysis)
In reporting the results of a meta-analysis designed to assess the impact of pharmacist interventions to improve medication adherence, the authors report the results of their analyses testing their data for heterogeneity.

Question R7 (Meta-analysis)
Which of the following would be appropriately associated with such an analysis?
A. The use of ANOVA
B. The reporting of a χ² or Cochran's Q test of greater than 0.1
C. A summary odds ratio
D. A regression analysis using multivariate techniques

Vignette R8 (Risk Reduction and NNT)
The results of a prospective, randomized, double-blind, placebo-controlled trial show that over a 6-month period 27/1232 patients receiving a medication required hospitalization for symptoms of asthma, while 42/1230 patients receiving the gold standard therapy required hospitalization for symptoms of asthma (p < 0.05).

Question R8 (Risk Reduction and NNT)
Which of the following statements best represents these results?
A. Relative reduction in events = 35%, NNT = 83
B. Relative reduction in events = 55%, NNT = 8
C. Relative reduction in events = 15%, NNT = 83
D. Relative reduction in events = 83%, NNT = 35

Vignette E1 (Approaches to Analyses)
In deciding an approach to statistical analysis, investigators wish to focus on demonstrating the maximum effectiveness of a new medication when it is used correctly.

Question E1 (Approaches to Analyses)
In pursuing such a goal, which of the following types of analysis would be most appropriate?
A. Intention-to-treat
B. Intention-to-randomize
C. Per-protocol
D. As-treated

Vignette E2 (Composite End-Points)
During your residency program you are presenting a journal club article regarding a new drug for hypertension. The primary outcome of the study is the occurrence of the combination of stroke, myocardial infarction (MI), and cardiovascular (CV) death.

Question E2 (Composite End-Points)
In discussing this trial during your departmental journal club, which of the following would be appropriate comments regarding this type of end-point?
A. Composite end-points provide increased power, without any risks to trial validity or interpretability
B. Composite end-points create additional issues around multiple (statistical) testing
C. Composite end-points that combine non-fatal outcomes with death are the most robust
D. Composite end-points that include soft (i.e., subjective) end-points can be unduly influenced by one or more component end-points

Vignette E3 (Subgroup Analysis)
In a study of a new drug used to treat heart failure, the authors describe several subgroups that were analyzed separately after the trial concluded, based upon age, sex, smoking history, history of stroke, and history of diabetes. They report that while the drug did not show effectiveness in the overall sample, patients who were greater than 70 years of age did benefit from decreased hospitalizations due to HF (p < 0.05).

Question E3 (Subgroup Analysis)
Which of the following statements represents the best interpretation of these results?
A. Patients over 70 years old with HF should receive the medication
B. Any patient with HF should receive the medication
C. Post hoc subgroup analyses like this one should be considered hypothesis generating
D. Any patient with HF and additional risk factors should receive the medication

Vignette E4 (Descriptive Statistics)
A drug interaction study is conducted to assess the effects on the time to maximum concentration (Tmax) of one drug when a P-glycoprotein inhibitor is taken concomitantly.

Question E4 (Descriptive Statistics)
Which of the following would be the best way to report the Tmax results of this study?
A. Mode and SEM
B. Mean and median
C. Median and range
D. Mean and standard deviation

Vignette E5 (Observational Study Designs)
You wish to investigate whether or not patients being treated pharmacologically for hypertension have an increased occurrence of memory loss. This hypothesis was suggested by a recent first-ever case series published in the medical/pharmacy literature. You also hope to ascertain if certain medications or medication classes have more or less of this potential effect on memory.

Question E5 (Observational Designs)
Which of the following would be an appropriate strategy for investigating this hypothesis initially?
A. Cohort trial
B. Case-control trial
C. Randomized, controlled trial
D. Meta-analysis