Logistic Regression Predicting the Chances of Coronary Heart Disease. Multivariate Solutions


What is Logistic Regression?

Logistic regression in a nutshell: logistic regression predicts the probability that an event occurs by fitting data to a logistic curve. It uses several predictor variables, which may be either numerical or categorical. For example, the probability that a person has a heart attack within a specified time period might be predicted from the person's age, sex and body mass index. Logistic regression is used extensively in the medical and social sciences, as well as in marketing applications such as predicting a customer's propensity to purchase a product or cancel a subscription.

Example: Calculating the Risk of Coronary Heart Disease

In this example, what are the risk factors associated with coronary heart disease, and how do they contribute to the chances of contracting the disease?

Let us define a variable, Outcome (Death from Coronary Heart Disease):

Outcome = 1 if the individual will contract a form of coronary heart disease
Outcome = 0 if the individual will not contract a form of coronary heart disease

The outcome takes only two possible values.

Hypothesis: To Develop a Model to Determine the Risk of Contracting Coronary Heart Disease

Logistic Regression: Risk Factors Contained in the Model

- Smoking
- Total Cholesterol Level (TCL - 200)
- Body Mass Index (BMI - 25)
- Gender (1 = male, 0 = female)
- Age (in years, less 50)
- Hours of physical activity (weekly)

Logistic Regression Output

Risk of Coronary Heart Disease - Ten Years

Regression Output                        Beta     Sig.    Odds Ratio (Exp(Beta))
Smoking                                  0.898    0.029   2.455
Total Cholesterol Level (TCL - 200)      0.166    0.015   1.181
Body Mass Index (BMI - 25)               0.058    0.120   1.060
Gender (1 = male, 0 = female)            0.028    0.038   1.028
Age (in years, less 50)                  0.024    0.024   1.024
Hours of Physical Activity (weekly)     -1.013    0.006   0.363
Constant                                -4.123

This slide is descriptive and shows which risk factors are most influential for coronary heart disease. For example, smoking and a total cholesterol level above 200 are the strongest risk factors. The odds ratio, which is the exponential of the coefficient, is often used to interpret the results: smokers' odds of developing coronary heart disease are about 2.5 times those of nonsmokers. High cholesterol is also a risk factor, as is age. Men are slightly more likely than women to get coronary heart disease, and physical activity sharply reduces the chances of coronary heart disease (negative coefficient).
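The odds ratio column can be reproduced by exponentiating each coefficient. The following is a minimal sketch in Python, assuming only the beta values printed in the output above; the dictionary name and the shortened labels are illustrative and not part of the original analysis.

```python
import math

# Coefficients (betas) copied from the regression output above.
betas = {
    "Smoking": 0.898,
    "Total cholesterol (TCL - 200)": 0.166,
    "Body mass index (BMI - 25)": 0.058,
    "Gender (1 = male)": 0.028,
    "Age (years less 50)": 0.024,
    "Hours of physical activity": -1.013,
}

# An odds ratio is the exponential of the coefficient: OR = exp(beta).
odds_ratios = {name: math.exp(beta) for name, beta in betas.items()}

for name, ratio in odds_ratios.items():
    print(f"{name:<32s} {ratio:.3f}")
```

A ratio above 1 means the factor raises the odds of coronary heart disease per unit increase; a ratio below 1 (physical activity here) means it lowers them.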

Odds-Ratio

When a respondent's choices are entered into the regression model, a predicted probability for that respondent (labeled "Odds Ratio" in the examples that follow) is computed with the formula 1/(1 + e^(-z)), where z is the value of the regression equation once all the answers are input. A simulator can be used to classify individuals based on demographic data or a survey screen. Two examples follow:
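The formula 1/(1 + e^(-z)) is the logistic function, which maps any regression score z to a value between 0 and 1. A small Python sketch, with a function name chosen here purely for illustration, shows how the output behaves as z changes:

```python
import math


def logistic(z: float) -> float:
    """Map a regression score z to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


# Strongly negative scores give probabilities near 0, z = 0 gives exactly 0.5,
# and positive scores give probabilities above 0.5.
for z in (-4.0, -1.5, 0.0, 1.5):
    print(f"z = {z:+.1f} -> probability = {logistic(z):.3f}")
```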

Example One: Inactive, Smoking, 55-Year-Old Woman

Risk of Coronary Heart Disease

Regression Output                        Answer   Beta     Product (b*d)
Smoking                                  1         0.098    0.098
Total Cholesterol Level (TCL - 200)      230       0.066    1.980
Body Mass Index (BMI - 25)               32        0.058    0.406
Gender (1 = male, 0 = female)            0         0.028    0.000
Age (in years, less 50)                  55        0.024    0.119
Hours of Physical Activity (weekly)      0        -1.013    0.000
Equation Constant                                          -4.123
Sum                                                        -1.520
Odds Ratio (1/(1 + e^(-z)))                                 0.18
Risk of Coronary Heart Disease - Ten Years                  18%

The Answer column holds the raw values, while each product uses the centered value named in the row label; for example, total cholesterol contributes (230 - 200) × 0.066 = 1.980.

A slightly obese, physically inactive 55-year-old woman who smokes and has somewhat high total cholesterol has an 18% chance of contracting coronary heart disease within the next ten years.
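The arithmetic in Example One can be checked term by term. The sketch below assumes the betas exactly as printed in the Example One table (note that the smoking and cholesterol betas there, 0.098 and 0.066, differ from the 0.898 and 0.166 shown in the regression output slide) and assumes the centering offsets implied by the row labels. Because the printed betas are rounded, the sum may differ from the slide in the third decimal place.

```python
import math

# (risk factor, raw answer, centering offset, beta) as listed in Example One.
# Offsets of 200, 25 and 50 are taken from the row labels (TCL - 200, etc.).
rows = [
    ("Smoking", 1, 0, 0.098),
    ("Total cholesterol", 230, 200, 0.066),
    ("Body mass index", 32, 25, 0.058),
    ("Gender (1 = male)", 0, 0, 0.028),
    ("Age", 55, 50, 0.024),
    ("Hours of physical activity", 0, 0, -1.013),
]
CONSTANT = -4.123

z = CONSTANT
for name, answer, offset, beta in rows:
    product = (answer - offset) * beta  # centered value times coefficient
    z += product
    print(f"{name:<28s} {product:+.3f}")

probability = 1.0 / (1.0 + math.exp(-z))
print(f"Sum (z): {z:.3f}")
print(f"Ten-year risk: {probability:.0%}")
```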

Example Two: Health-Conscious 65-Year-Old Man

Risk of Coronary Heart Disease

Regression Output                        Answer   Beta     Product (b*d)
Smoking                                  0         0.098    0.000
Total Cholesterol Level (TCL - 200)      180       0.066   -1.320
Body Mass Index (BMI - 25)               25        0.058    0.000
Gender (1 = male, 0 = female)            1         0.028    0.028
Age (in years, less 50)                  65        0.024    0.358
Hours of Physical Activity (weekly)      4        -1.013   -4.052
Equation Constant                                          -4.123
Sum                                                        -9.109
Odds Ratio (1/(1 + e^(-z)))                                 0.00
Risk of Coronary Heart Disease - Ten Years                  0%

According to the logistic output, a non-smoking, physically active 65-year-old man with a good cholesterol level has practically no chance of contracting coronary heart disease in the next ten years.
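The simulator idea can be sketched as a small reusable function that scores any respondent profile. This is an illustration only: the `Profile` class and `chd_risk` function are hypothetical names introduced here, the coefficients are the rounded betas printed in the two example tables, and the centering offsets (200, 25, 50) are those implied by the row labels.

```python
import math
from dataclasses import dataclass


@dataclass
class Profile:
    """Answers for one respondent (hypothetical structure for illustration)."""
    smoker: int            # 1 = smoker, 0 = nonsmoker
    cholesterol: float     # total cholesterol level
    bmi: float             # body mass index
    male: int              # 1 = male, 0 = female
    age: float             # age in years
    activity_hours: float  # weekly hours of physical activity


def chd_risk(p: Profile) -> float:
    """Ten-year risk from the betas printed in the example tables."""
    z = (
        -4.123
        + 0.098 * p.smoker
        + 0.066 * (p.cholesterol - 200)
        + 0.058 * (p.bmi - 25)
        + 0.028 * p.male
        + 0.024 * (p.age - 50)
        - 1.013 * p.activity_hours
    )
    return 1.0 / (1.0 + math.exp(-z))


woman = Profile(smoker=1, cholesterol=230, bmi=32, male=0, age=55, activity_hours=0)
man = Profile(smoker=0, cholesterol=180, bmi=25, male=1, age=65, activity_hours=4)

print(f"Example One: {chd_risk(woman):.0%}")
print(f"Example Two: {chd_risk(man):.0%}")
```

Keeping the coefficients in one function makes it easy to score new profiles without repeating the table arithmetic by hand.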