Research Designs and Potential Interpretation of Data: Introduction to Statistics
Karen H. Hagglund, M.S.
Medical Education
St. John Hospital & Medical Center
Karen.Hagglund@ascension.org

Let's Take it Step by Step...
  Identify topic
  Literature review
  Variables of interest
  Research hypothesis
  Design study
  Power analysis
  Write proposal
  Design data tools
  Committees
  Collect data
  Set up spreadsheet
  Enter data
  Statistical analysis
  Graphs
  Slides / poster
  Write paper / manuscript

Confused by Statistics?

Goals
  To understand why a particular statistical test was used for your research project
  To interpret your results
  To understand & evaluate the medical literature
  Not to perform your own calculations

Key Statistical Tests
  Level of statistical expertise to read the literature in NEJM (Emerson & Colditz 1983)
    58%: descriptive statistics only
    67%: plus t-tests
    84%: plus contingency tables, epidemiologic statistics, Pearson correlation, simple linear regression, ANOVA
  No need to be a statistical expert!

Before choosing a statistical test
  Figure out the variable type
    Scales of measurement (qualitative or quantitative)
  Figure out your goal
    Compare groups
    Measure relationship or association of variables

Scales of Measurement
  Nominal, Ordinal } Qualitative
  Interval, Ratio } Quantitative

Nominal Scale (discrete)
  Simplest scale of measurement
  Variables which have no numerical value
  Variables which have categories
  Count number in each category, calculate percentage
  Examples:
    Gender
    Race
    Marital status
    Whether or not tumor recurred
    Alive or dead

Ordinal Scale
  Variables are in categories, but with an underlying order to their values
  Rank-order categories from highest to lowest
  Intervals may not be equal
  Count number in each category, calculate percentage
  Examples:
    Cancer stages
    Apgar scores
    Pain ratings
    Likert scale

Interval Scale
  Quantitative data
  Can add & subtract values
  Cannot multiply & divide values
  No true zero point
  Example: Temperature on a Celsius scale
    0 °C indicates the point at which water freezes, not an absence of warmth

Ratio Scale (continuous)
  Quantitative data with true zero
  Can add, subtract, multiply & divide
  Examples:
    Age
    Body weight
    Blood pressure
    Length of stay (LOS)
    OR time

Scales of Measurement
  Nominal, Ordinal } Lead to nonparametric statistics
  Interval, Ratio } Lead to parametric statistics
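
The four scales and the operations each one permits can be summarized in a small lookup table. This is a minimal sketch of the rule of thumb on the slides only; the names SCALES and statistics_family are hypothetical and chosen for illustration.

```python
# Hypothetical lookup table: which operations each measurement scale permits,
# following the slide descriptions (not an exhaustive taxonomy).
SCALES = {
    "nominal":  {"ordered": False, "add_subtract": False, "multiply_divide": False},
    "ordinal":  {"ordered": True,  "add_subtract": False, "multiply_divide": False},
    "interval": {"ordered": True,  "add_subtract": True,  "multiply_divide": False},
    "ratio":    {"ordered": True,  "add_subtract": True,  "multiply_divide": True},
}


def statistics_family(scale: str) -> str:
    """Return the family of statistics the slides associate with a scale.

    Qualitative scales (nominal, ordinal) lead to nonparametric statistics;
    quantitative scales (interval, ratio) lead to parametric statistics.
    """
    props = SCALES[scale.lower()]
    return "parametric" if props["add_subtract"] else "nonparametric"


# Examples taken from the slides: cancer stage is ordinal, body weight is ratio.
print("Cancer stage ->", statistics_family("ordinal"))
print("Body weight  ->", statistics_family("ratio"))
```

The lookup encodes only the scale-to-family mapping stated above; it says nothing about which specific test to choose.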

Let's Poll the Audience!
  Which of the following does NOT characterize nominal scales?
    a. A score in one category must be counted the same as a score in another category
    b. You can count the number of categories and scores
    c. You can add and subtract scores
    d. It is the most appropriate scale for variables that vary in nonquantitative ways

Identify the measurement scale
  The species of bacteria within a culture
    a. Nominal
    b. Ordinal
    c. Interval
    d. Ratio

Identify the measurement scale
  The observed stage of a pressure ulcer
    a. Nominal
    b. Ordinal
    c. Interval
    d. Ratio

Identify the measurement scale
  Hospital Length of Stay
    a. Nominal
    b. Ordinal
    c. Interval
    d. Ratio

Two Branches of Statistics
  Descriptive
    Frequencies & percents
    Measures of the middle
    Measures of variation
  Inferential
    Nonparametric statistics
    Parametric statistics

Descriptive Statistics
  First step in analyzing data
  Goal is to communicate results, without generalizing beyond sample to a larger group

Frequencies and Percents
  Number of times a specific value of an observation occurs (counts)
  For each category, calculate percent of sample

SMOKING
                        Frequency   Percent   Valid Percent   Cumulative Percent
  Valid    smoker            26       20.5        24.8              24.8
           non-smoker        79       62.2        75.2             100.0
           Total            105       82.7       100.0
  Missing  unknown           22       17.3
  Total                     127      100.0

nocigs_b
                        Frequency   Percent   Valid Percent   Cumulative Percent
  Valid    1                  2        1.6         7.7               7.7
           2                  1        0.8         3.8              11.5
           3                  1        0.8         3.8              15.4
           5                  3        2.4        11.5              26.9
           6                  1        0.8         3.8              30.8
           12                 1        0.8         3.8              34.6
           13                 1        0.8         3.8              38.5
           14                 1        0.8         3.8              42.3
           15                 2        1.6         7.7              50.0
           17                 1        0.8         3.8              53.8
           18                 1        0.8         3.8              57.7
           19                 2        1.6         7.7              65.4
           20                 2        1.6         7.7              73.1
           22                 1        0.8         3.8              76.9
           24                 1        0.8         3.8              80.8
           30                 1        0.8         3.8              84.6
           39                 1        0.8         3.8              88.5
           40                 1        0.8         3.8              92.3
           45                 1        0.8         3.8              96.2
           100                1        0.8         3.8             100.0
           Total             26       20.5       100.0
  Missing  System           101       79.5
  Total                     127      100.0

Statistics (nocigs_b)
  N Valid                  26
  N Missing               101
  Mean                     19.62
  Std. Error of Mean        3.985
  Median                   16.00
  Mode                      5
  Std. Deviation           20.320
  Variance                412.886
  Range                    99
  Minimum                   1
  Maximum                 100

Let's Poll the Audience!
  Which measure of central tendency is important to people who want to think they are in the better half of the population?
    a. Mode
    b. Median
    c. Mean
    d. Standard deviation
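
The summary statistics reported for nocigs_b can be reproduced from its frequency table. The sketch below uses only the Python standard library; the variable names are illustrative, and the counts are transcribed from the 26 valid cases in the output above.

```python
import math
import statistics

# Frequency table for nocigs_b (value -> number of valid cases),
# transcribed from the output above.
FREQ = {1: 2, 2: 1, 3: 1, 5: 3, 6: 1, 12: 1, 13: 1, 14: 1, 15: 2, 17: 1,
        18: 1, 19: 2, 20: 2, 22: 1, 24: 1, 30: 1, 39: 1, 40: 1, 45: 1, 100: 1}

# Expand the table back into one entry per valid case.
values = [value for value, count in FREQ.items() for _ in range(count)]

n = len(values)
mean = statistics.mean(values)
median = statistics.median(values)
mode = statistics.mode(values)
variance = statistics.variance(values)   # sample variance (n - 1 denominator)
sd = statistics.stdev(values)            # sample standard deviation
se = sd / math.sqrt(n)                   # standard error of the mean
value_range = max(values) - min(values)

# Valid percent: share of the valid (non-missing) cases at each value.
valid_percent = {value: 100 * count / n for value, count in FREQ.items()}

print(f"N = {n}, mean = {mean:.2f}, median = {median}, mode = {mode}")
print(f"SD = {sd:.3f}, variance = {variance:.3f}, SE = {se:.3f}, range = {value_range}")
```

The mean (19.62) sits above the median (16) because the single value of 100 pulls it upward.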

Confidence Intervals (CI)
  Provides a range in which we are reasonably likely to find the true value
  Moves away from the traditional testing for statistical significance
  Journal editors are increasingly demanding that CIs be reported

Confidence Intervals (CI)
  p value
    Presence or absence of an effect
    Whether or not the treatment had any impact
  CI
    Effect size
    How much impact

Which Statement Gives More Information?
  Example:
    1. The difference in one-year survival rate between the experimental and control groups was significant (p < 0.01).
    2. The one-year survival rate increased by 20% in the experimental group, with a 95% CI of 14-24%.

Confidence Intervals (CI)
  A 95% CI means that, if the study were replicated many times, about 95% of the intervals computed would contain the true value of the population parameter
  Use for means, percentages, and relative risks (RRs)

Confidence Intervals (CI)
  To calculate:
    Effect size ± (critical value × standard error)
  For 95% CI, critical value = 1.96
    (95% of the area of a normal distribution is within 1.96 standard deviations of the mean)
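
The calculation above (effect size plus or minus critical value times standard error) can be sketched in a few lines. The function name confidence_interval is hypothetical; the inputs are the mean (19.62) and standard error (3.985) reported for nocigs_b in the descriptive statistics output. The sketch uses the normal critical value, as the slide does; with only 26 cases a t-based critical value would give a slightly wider interval.

```python
from statistics import NormalDist


def confidence_interval(estimate, standard_error, level=0.95):
    """Normal-approximation CI: estimate +/- critical value * standard error."""
    # Two-sided critical value from the standard normal (about 1.96 for 95%).
    critical = NormalDist().inv_cdf(0.5 + level / 2)
    margin = critical * standard_error
    return estimate - margin, estimate + margin


# Mean and standard error of nocigs_b, taken from the output shown earlier.
lower, upper = confidence_interval(19.62, 3.985)
print(f"95% CI for the mean: {lower:.2f} to {upper:.2f}")
```

For these inputs the interval runs from roughly 11.8 to 27.4 cigarettes.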

Width of Confidence Interval
  Narrow
    Less variability in effect estimate
    Larger sample size
    More informative about true effect
  Wide
    Greater variability in effect estimate
    Smaller sample size

Width of CI, Cont'd
  If a finding is non-significant (p > 0.05)
    Narrow CI
      Implies no real effect exists
    Wide CI
      Implies that not enough patients were used in study
      Lack of power

Examples

Smoking & Bladder Cancer
  Smokers had an increased risk (RR = 1.9) of bladder cancer compared to non-smokers
  The 95% CI: (1.3 to 2.8)
  If the CI includes the null value (RR = 1.0) then the p value will also be non-significant
  Here the CI excludes 1.0, so the increased risk is statistically significant

Amoxicillin & UTI
  Is a single dose of amoxicillin more effective than 2 weeks of therapy for treatment of UTI?
  Single dose: 61% cure rate
  2 weeks: 74% cure rate
  p = 0.07; 95% CI = -1.5% to 27.5%
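
The link between a CI and significance in the two examples can be checked directly. In the sketch below, includes_null and approx_p_from_ci are hypothetical helper names. The back-calculated p value assumes a normal approximation and an interval symmetric about the estimate, so it lands near, but not exactly on, the reported p = 0.07 (the slides do not say which test produced that figure).

```python
from statistics import NormalDist


def includes_null(lower, upper, null_value):
    """True if the confidence interval contains the no-effect value."""
    return lower <= null_value <= upper


def approx_p_from_ci(estimate, lower, upper, null_value=0.0, level=0.95):
    """Approximate two-sided p value recovered from a symmetric CI.

    Assumption: the interval was built as estimate +/- z * SE on this scale.
    Do not apply this to a ratio such as RR without log-transforming first.
    """
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    standard_error = (upper - lower) / (2 * z_crit)
    z = abs(estimate - null_value) / standard_error
    return 2 * (1 - NormalDist().cdf(z))


# Smoking & bladder cancer: RR = 1.9, 95% CI 1.3 to 2.8, null value RR = 1.0.
smoking_significant = not includes_null(1.3, 2.8, null_value=1.0)

# Amoxicillin & UTI: difference in cure rates in percentage points,
# 95% CI -1.5 to 27.5, null value 0.
amox_diff = 74 - 61
amox_significant = not includes_null(-1.5, 27.5, null_value=0.0)
amox_p = approx_p_from_ci(amox_diff, -1.5, 27.5)

print("Smoking result significant:", smoking_significant)
print("Amoxicillin result significant:", amox_significant)
print(f"Approximate p for amoxicillin difference: {amox_p:.2f}")
```

The amoxicillin interval is wide (29 percentage points) and crosses zero, which is the pattern the slides associate with too few patients rather than with evidence of no effect.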