Data Analysis in Practice-Based Research. Stephen Zyzanski, PhD Department of Family Medicine Case Western Reserve University School of Medicine

Multilevel Data: Statistical analyses that fail to recognize the hierarchical structure of the data, or the dependence among observations within the same clinician, yield inflated Type I error rates when testing the effects of interventions.

Multilevel Data: Inflation of the Type I error rate implies that intervention effects are more likely to be claimed than actually exist. Unless the intraclass correlation (ICC) is accounted for in the analysis, the Type I error rate will be inflated, often substantially.

Multilevel Data: When ICC > 0, the assumption of independence is violated. The usual analysis methods are not appropriate for group-randomized trials. Applying them yields a standard error that is too small and a p-value that overstates the significance of the results.
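The inflation described in these slides can be illustrated with a small simulation. The sketch below is not from the slides: the number of clinicians, the patients per clinician, the ICC of 0.1, and the function name are all hypothetical choices made for illustration. It generates data with no true intervention effect, applies a patient-level two-sample test that ignores clustering, and counts how often the test rejects at the nominal 5% level.

```python
import math
import random
import statistics


def naive_rejection_rate(n_clusters=20, cluster_size=30, icc=0.1,
                         n_sims=300, seed=1):
    """Share of null simulations in which a test that ignores clustering rejects.

    All defaults are hypothetical values chosen for illustration.
    """
    rng = random.Random(seed)
    sd_between = math.sqrt(icc)       # clinician-level SD (total variance = 1)
    sd_within = math.sqrt(1 - icc)    # patient-level SD
    rejections = 0
    for _sim in range(n_sims):
        arms = ([], [])
        for cluster in range(n_clusters):
            clinician_effect = rng.gauss(0, sd_between)
            arm = arms[cluster % 2]   # alternate clinicians between the two arms
            for _patient in range(cluster_size):
                arm.append(clinician_effect + rng.gauss(0, sd_within))
        a, b = arms
        pooled_var = ((len(a) - 1) * statistics.variance(a)
                      + (len(b) - 1) * statistics.variance(b)) / (len(a) + len(b) - 2)
        se = math.sqrt(pooled_var * (1 / len(a) + 1 / len(b)))
        t = (statistics.fmean(a) - statistics.fmean(b)) / se
        if abs(t) > 1.96:             # nominal two-sided 5% critical value
            rejections += 1
    return rejections / n_sims


rate = naive_rejection_rate()
print(f"Rejection rate under the null, ignoring clustering: {rate:.2f}")
```

Under these assumed settings the rejection rate should land well above the nominal 0.05, even though no intervention effect exists in the simulated data.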

Traditional Response to Nesting: Ignore nesting or groups. Conduct the analysis with aggregated data: use the clinician as the unit of analysis. Spread group data across lower units: patients of a given clinician get the same value for clinician-level variables.

Analysis of Aggregated Data: Analyses of aggregated data at higher levels of a hierarchy can produce different results from analyses at the individual level. The sample size becomes very small and statistical power is substantially reduced. Aggregation bias can also occur (the meaning of a variable changes after aggregation).

Miscalculation of Standard Errors: Nested data violate assumptions about independence of observations. Degrees of freedom for group data (e.g., clinicians) are exaggerated when the data are spread across lower units (patients). The likelihood of Type I error increases because confidence intervals are unrealistically small.

Reduction in Standard Error: The basic formula for the standard error of a mean is $\mathrm{SE} = \mathrm{SD} / \sqrt{n}$, where $n$ is the sample size. If data for 100 clinicians are spread across 1000 patients, the standard error for clinician variables will be too small (roughly 1/3 its actual size in this example).
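The "roughly 1/3" figure follows directly from the formula. The short sketch below checks it, assuming a standard deviation of 1.0 (a hypothetical value; it cancels out of the ratio).

```python
import math


def standard_error(sd: float, n: int) -> float:
    """Standard error of a mean: SD divided by the square root of n."""
    return sd / math.sqrt(n)


sd = 1.0  # hypothetical SD; it cancels in the ratio below
se_clinicians = standard_error(sd, 100)    # correct n for clinician-level variables
se_patients = standard_error(sd, 1000)     # n wrongly taken as the number of patients
ratio = se_patients / se_clinicians

print(f"SE with n = 100:  {se_clinicians:.4f}")
print(f"SE with n = 1000: {se_patients:.4f}")
print(f"Ratio: {ratio:.3f}")  # sqrt(100 / 1000), about 0.316
```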

Example of Two-Group Analysis: The primary aim of many trials is to compare two groups of patients with respect to their mean values on a quantitative outcome variable.

Example of Two-Group Analysis: Testing mean differences for statistical significance in group trials requires the computation of standard errors that take into account randomization by groups.

Analysis example: Assume we have 32 clinicians, 16 randomized to the Intervention (I) condition and 16 to the Control (C) condition. The intervention is a weight loss program and the outcome is BMI at 2 years. Mean (I) = 25.62; Mean (C) = 25.98. Sample (I) = 1929; Sample (C) = 2205 (total 4134).

Standard t-test: $t = \dfrac{M_1 - M_2}{\sqrt{\mathrm{Var}\,(1/N_1 + 1/N_2)}} = \dfrac{25.62 - 25.98}{0.152} = \dfrac{-0.36}{0.152} = -2.37$ (df = 4132, p = 0.02). P = 0.02 is too small when ICC > 0.
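The arithmetic on this slide can be reproduced from the reported summary values. In the sketch below, the means, sample sizes, and the standard error of 0.152 come from the slides; the pooled variance is not reported, so it is backed out from the standard error, and the p-value uses a normal approximation (an assumption that is reasonable here because df = 4132 is large).

```python
import math

# Summary values reported on the slides
mean_i, mean_c = 25.62, 25.98
n_i, n_c = 1929, 2205
se_naive = 0.152  # standard error that ignores clustering

t_stat = (mean_i - mean_c) / se_naive
df = n_i + n_c - 2

# Pooled variance implied by the reported SE (not stated on the slide)
implied_var = se_naive ** 2 / (1 / n_i + 1 / n_c)

# Two-sided p-value, normal approximation to the t distribution
p_value = math.erfc(abs(t_stat) / math.sqrt(2))

print(f"t = {t_stat:.2f}, df = {df}, p = {p_value:.3f}")
print(f"Implied pooled variance = {implied_var:.2f}")
```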

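Adjusted two-sample t-test: $t = \dfrac{M_1 - M_2}{\sqrt{\mathrm{Var}\,(C_1/N_1 + C_2/N_2)}} = \dfrac{25.62 - 25.98}{0.28} = \dfrac{-0.36}{0.28} \approx -1.27$ (df = 30, p = 0.21). Here ICC $\rho = 0.02$ and $C_i$ is the variance inflation factor (VIF) for group $i$, $C_i = 1 + (M_i - 1)\rho$, where $M_i$ is the number of patients per clinician in that group.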
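A sketch of the adjusted calculation follows. It assumes equal cluster sizes within each arm (patients divided evenly over 16 clinicians), which the slides do not state, and it backs the pooled variance out of the unadjusted standard error of 0.152. Because of these assumptions and of rounding, the result is close to, but not identical with, the 0.28 and -1.27 reported on the slide.

```python
import math

mean_i, mean_c = 25.62, 25.98
n_i, n_c = 1929, 2205
clinicians_per_arm = 16
icc = 0.02
se_naive = 0.152  # unadjusted standard error from the slides

# Pooled variance implied by the unadjusted SE (not reported directly)
pooled_var = se_naive ** 2 / (1 / n_i + 1 / n_c)

# Assumption: every clinician in an arm has the same number of patients
m_i = n_i / clinicians_per_arm
m_c = n_c / clinicians_per_arm

# Variance inflation factor for each arm: 1 + (M - 1) * ICC
c_i = 1 + (m_i - 1) * icc
c_c = 1 + (m_c - 1) * icc

se_adj = math.sqrt(pooled_var * (c_i / n_i + c_c / n_c))
t_adj = (mean_i - mean_c) / se_adj
df_adj = 2 * clinicians_per_arm - 2  # degrees of freedom based on clinicians

print(f"VIF (I) = {c_i:.2f}, VIF (C) = {c_c:.2f}")
print(f"Adjusted SE = {se_adj:.3f}, t = {t_adj:.2f}, df = {df_adj}")
```

The degrees of freedom drop from 4132 to 30 because the clinician, not the patient, is the unit of randomization.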

Post Hoc Correction for Analyses that Ignore the Group Effect: The VIF can be used to correct the inflation in the test statistic generated by the observation-level analysis. Test statistics such as F and chi-square tests are corrected by dividing the test statistic by the VIF. Test statistics such as t or z tests are corrected by dividing the test statistic by the square root of the VIF.

Post Hoc Correction: Corrected $t = t / \sqrt{\mathrm{VIF}}$, where $t = 2.37$ and $\mathrm{VIF} = 1 + (M - 1)\rho = 1 + (129 - 1)(0.02) = 3.56$. The square root of 3.56 is 1.89. Correction: 2.37/1.89 = 1.25 (versus 1.27 computed with the adjusted t-test).
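The correction rules can be written as two small helper functions. The function names are hypothetical; the inputs (t = 2.37, M = 129 patients per clinician, ICC = 0.02) are the values used on the slide. Note that the code divides by the unrounded square root, so it gives about 1.26 rather than the 1.25 obtained from the rounded 1.89.

```python
import math


def vif(cluster_size: float, icc: float) -> float:
    """Variance inflation factor: 1 + (M - 1) * ICC."""
    return 1 + (cluster_size - 1) * icc


def correct_t_or_z(stat: float, cluster_size: float, icc: float) -> float:
    """Correct a t or z statistic by dividing by the square root of the VIF."""
    return stat / math.sqrt(vif(cluster_size, icc))


def correct_f_or_chi2(stat: float, cluster_size: float, icc: float) -> float:
    """Correct an F or chi-square statistic by dividing by the VIF."""
    return stat / vif(cluster_size, icc)


inflation = vif(129, 0.02)
corrected_t = correct_t_or_z(2.37, 129, 0.02)

print(f"VIF = {inflation:.2f}")            # 1 + 128 * 0.02
print(f"Corrected t = {corrected_t:.2f}")
```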

Multilevel Models: This example illustrates a method for adjusting individual-level analyses for clustering, based on a simple extension of the standard two-sample t-test. We now move to a more comprehensive, but computationally more extensive, approach called multilevel modeling.

What is Multilevel Modeling? A general framework for investigating nested data with complex error structures. Multilevel models incorporate higher-level (clinician) predictors into the analysis. They provide a methodology for connecting the levels together, i.e., for analyzing variables from different levels simultaneously while adjusting for the various dependencies.

Multilevel Models: Combining variables from different levels into a single statistical model is a more complicated problem than estimating and correcting for design effects.

Multilevel Models: Multilevel models are also known as random-effects models, mixed-effects models, variance-components models, contextual models, or hierarchical linear models.

Multilevel Models: Use of information across multiple units of analysis to improve estimation of effects. Statistical partitioning of variance and covariance components across levels. Tests for cross-level effects (moderators).

A Multilevel Approach: Specifies a patient-level model within clinicians (the Level 1 model). Treats regression coefficients as random variables at the clinician level. Models the mean effect and the variance in effects as a function of a clinician-level model.
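One common way to write such a model is shown below. The notation is a conventional two-level formulation and is an assumption, not taken from the slides: $Y_{ij}$ is the outcome for patient $i$ of clinician $j$, $X_{ij}$ a patient-level predictor, $W_j$ a clinician-level predictor, and the $u$ terms are clinician-level random effects.

```latex
% Level 1 (patients within clinicians)
Y_{ij} = \beta_{0j} + \beta_{1j} X_{ij} + e_{ij}, \qquad e_{ij} \sim N(0, \sigma^2)

% Level 2 (clinicians): the Level 1 coefficients become outcomes
\beta_{0j} = \gamma_{00} + \gamma_{01} W_j + u_{0j}
\beta_{1j} = \gamma_{10} + \gamma_{11} W_j + u_{1j}
```

In this formulation $\gamma_{10}$ is the mean effect of $X$, the variance of $u_{1j}$ is the variance in that effect across clinicians, and $\gamma_{11}$ is a cross-level interaction.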

Correlates of Alcohol Consumption (Scribner, 2000)

                                  β        S.E.     P value
Intercept                         2.06     0.46     <.001

Individual Coefficients
  Distance to Outlet              .0001    .035     .997
  Age                            -.008     .001     <.001
  Female                         -.678     .053     <.001
  Education                       .145     .034     .001
  Black                          -.527     .069     <.001

Census Tract Coefficients
  Mean Distance to Outlets       -.477     .194     .024
  Mean Age                        .014     .017     .435
  Percent Female                  .292     .957     .763
  Mean Education                  .345     .408     .410
  Percent Black                  -.407     .334     .238

Percent Variance Explained
  Within Census Tracts            8.9
  Between Census Tracts           80.3

ICC = 11.5%

Software Packages: MBDP-V (www.ssicentral.com); VARCL (www.assess.com.varcl); SAS PROC MIXED (www.sas.com); MLwiN (www.ioe.ac.uk/mlwin); HLM (www.ssicentral.com)

Take Home Messages: Ignoring clustering yields standard errors and p-values that are too small, which inflates the Type I error rate. Standard statistical analyses are invalid for clustered data. Post hoc corrections for clustering are available. Multilevel data require multilevel analyses. Multilevel models (MM) are designed to analyze variables from different levels simultaneously, including cross-level interactions. They are computationally extensive and require experience. The number of parameters to be estimated increases rapidly. Missing data at Level 2 are more problematic.