Logistic Regression: Predicting the Chances of Coronary Heart Disease
Multivariate Solutions
What is Logistic Regression?
Logistic regression in a nutshell: logistic regression predicts the probability that an event occurs by fitting data to a logistic curve. It can use several predictor variables, which may be either numerical or categorical. For example, the probability that a person has a heart attack within a specified time period might be predicted from the person's age, sex, and body mass index. Logistic regression is used extensively in the medical and social sciences, as well as in marketing applications such as predicting a customer's propensity to purchase a product or cancel a subscription.
Example: Calculating the Risk of Coronary Heart Disease
In this example, what are the risk factors associated with Coronary Heart Disease, and how do they contribute to the chances of contracting the disease?
Let us define a variable, Outcome ("Death from Coronary Heart Disease"):
Outcome = 1 if the individual will contract a form of Coronary Heart Disease
Outcome = 0 if the individual will not contract a form of Coronary Heart Disease
The outcome takes only two possible values.
Hypothesis: To Develop a Model to Determine the Risk of Contracting Coronary Heart Disease
Logistic Regression: Risk Factors Contained in the Model
- Smoking
- Total Cholesterol Level (TCL - 200)
- Body Mass Index (BMI - 25)
- Gender (1 = male, 0 = female)
- Age (in years, less 50)
- Hours of physical activity (weekly)
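As a sketch of how these risk factors enter the model, the regression equation and the resulting probability can be written as below. The symbols $\beta_0, \dots, \beta_6$ and the shorthand variable names are notation assumed here for illustration; the centering (TCL - 200, BMI - 25, Age - 50) follows the labels on this slide.

```latex
\[
\begin{aligned}
z &= \beta_0 + \beta_1\,\mathrm{Smoking} + \beta_2(\mathrm{TCL}-200) + \beta_3(\mathrm{BMI}-25) \\
  &\quad + \beta_4\,\mathrm{Gender} + \beta_5(\mathrm{Age}-50) + \beta_6\,\mathrm{Activity}, \\
P(\mathrm{Outcome}=1) &= \frac{1}{1+e^{-z}}
\end{aligned}
\]
```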
Logistic Regression Output
Risk of Coronary Heart Disease - Ten Years

| Risk Factor | Regression Beta | Sig. | Odds Ratio (Exp(Beta)) |
|---|---|---|---|
| Smoking | 0.898 | 0.029 | 2.455 |
| Total Cholesterol Level (TCL - 200) | 0.166 | 0.015 | 1.181 |
| Body Mass Index (BMI - 25) | 0.058 | 0.120 | 1.060 |
| Gender (1 = male, 0 = female) | 0.028 | 0.038 | 1.028 |
| Age (in years less 50) | 0.024 | 0.024 | 1.024 |
| Hours of Physical Activity (weekly) | -1.013 | 0.006 | 0.363 |
| Constant | -4.123 | | |

This slide is descriptive: it shows which risk factors are most influential when considering Coronary Heart Disease. For example, smoking and a total cholesterol level above 200 are the strongest risk factors. The odds ratio is often used to interpret the results: a smoker's odds of developing Coronary Heart Disease are about 2.5 times those of a nonsmoker. High cholesterol is also a risk factor, as is age. Men are slightly more likely than women to get Coronary Heart Disease, and physical activity sharply reduces the chances of Coronary Heart Disease (negative coefficient).
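The odds ratio column can be checked by exponentiating each beta. The following is a minimal sketch using only the betas printed in the output table; the dictionary name and keys are illustrative choices, not part of the original analysis.

```python
import math

# Regression betas as printed in the output table (constant omitted).
betas = {
    "Smoking": 0.898,
    "Total Cholesterol Level (TCL - 200)": 0.166,
    "Body Mass Index (BMI - 25)": 0.058,
    "Gender (1 = male, 0 = female)": 0.028,
    "Age (in years less 50)": 0.024,
    "Hours of Physical Activity (weekly)": -1.013,
}

# The odds ratio for a one-unit increase in a predictor is exp(beta).
odds_ratios = {name: math.exp(beta) for name, beta in betas.items()}

for name, ratio in odds_ratios.items():
    print(f"{name:<38} {ratio:.3f}")
```

An odds ratio above 1 means the factor raises the odds of the outcome per unit increase, and a value below 1 (as for physical activity) means it lowers them.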
Odds-Ratio
Once a respondent's choices are entered into the regression model, a value for that respondent is calculated with the formula $1/(1+e^{-z})$, where $z$ is the result of the regression equation once all the answers are input. Although labeled an odds ratio here, this value is strictly a predicted probability between 0 and 1; the odds ratio of a risk factor is the exponential of its beta. A simulator can be used to classify individuals based on demographic data or a survey screen. Two examples follow:
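To make the formula concrete, here is a minimal sketch of the logistic function, together with the conversion from probability to odds. The function names `logistic` and `odds` are illustrative, and the value of z is the sum shown in Example One.

```python
import math


def logistic(z: float) -> float:
    """Map a regression score z to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def odds(p: float) -> float:
    """Convert a probability p to odds, p / (1 - p)."""
    return p / (1.0 - p)


z = -1.520  # sum of the regression equation in Example One
p = logistic(z)
print(f"probability = {p:.2f}, odds = {odds(p):.3f}")
```

The odds computed from the probability equal exp(z), which is why exponentiated betas are read as odds ratios while the logistic formula itself returns a probability.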
Example One
Inactive, Smoking, 55-year-old Woman
Risk of Coronary Heart Disease

| Risk Factor | Answer | Regression Beta | Product (b*d) |
|---|---|---|---|
| Smoking | 1 | 0.098 | 0.098 |
| Total Cholesterol Level (TCL - 200) | 230 | 0.066 | 1.980 |
| Body Mass Index (BMI - 25) | 32 | 0.058 | 0.406 |
| Gender (1 = male, 0 = female) | 0 | 0.028 | 0.000 |
| Age (in years less 50) | 55 | 0.024 | 0.119 |
| Hours of Physical Activity (weekly) | 0 | -1.013 | 0.000 |
| Equation Constant | | | -4.123 |
| Sum | | | -1.520 |
| Odds Ratio ($1/(1+e^{-z})$) | | | 0.18 |
| Risk of Coronary Heart Disease - Ten Years | | | 18% |

The products use the centered values from the model, for example (230 - 200) x 0.066 = 1.980 for cholesterol and (32 - 25) x 0.058 = 0.406 for BMI.
A slightly obese, physically inactive 55-year-old woman who smokes and has somewhat high total cholesterol has an 18% chance of contracting Coronary Heart Disease within the next ten years.
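Written out, the arithmetic for this example adds the products from the table and the constant, then applies the logistic formula. All numbers are taken from the table as printed.

```latex
\[
\begin{aligned}
z &= 0.098 + 1.980 + 0.406 + 0.000 + 0.119 + 0.000 - 4.123 = -1.520 \\
\frac{1}{1+e^{-z}} &= \frac{1}{1+e^{1.520}} \approx 0.18
\end{aligned}
\]
```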
Example Two
Health-Conscious 65-Year-Old Man
Risk of Coronary Heart Disease

| Risk Factor | Answer | Regression Beta | Product (b*d) |
|---|---|---|---|
| Smoking | 0 | 0.098 | 0.000 |
| Total Cholesterol Level (TCL - 200) | 180 | 0.066 | -1.320 |
| Body Mass Index (BMI - 25) | 25 | 0.058 | 0.000 |
| Gender (1 = male, 0 = female) | 1 | 0.028 | 0.028 |
| Age (in years less 50) | 65 | 0.024 | 0.358 |
| Hours of Physical Activity (weekly) | 4 | -1.013 | -4.052 |
| Equation Constant | | | -4.123 |
| Sum | | | -9.109 |
| Odds Ratio ($1/(1+e^{-z})$) | | | 0.00 |
| Risk of Coronary Heart Disease - Ten Years | | | 0% |

The products use the centered values from the model, for example (180 - 200) x 0.066 = -1.320 for cholesterol.
According to the logistic output, a non-smoking, physically active 65-year-old man with a good cholesterol level has practically no chance of contracting Coronary Heart Disease in the next ten years.
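The simulator idea can be sketched as a small function that centers the answers, multiplies by the betas, adds the constant, and applies the logistic formula. This is an illustrative sketch, not the original simulator: the function and key names are hypothetical, and the coefficients are the ones printed in the two example tables (0.098 for smoking and 0.066 for cholesterol, which differ from the 0.898 and 0.166 on the output slide). Because the printed betas are rounded, the sums can differ from the tables in the third decimal place.

```python
import math

# Betas and constant as printed in the example tables (rounded values).
BETAS = {
    "smoking": 0.098,
    "tcl": 0.066,
    "bmi": 0.058,
    "male": 0.028,
    "age": 0.024,
    "activity": -1.013,
}
CONSTANT = -4.123


def chd_score(smoking, tcl, bmi, male, age, activity_hours):
    """Return z, the regression equation evaluated on centered inputs."""
    inputs = {
        "smoking": smoking,          # 1 = smoker, 0 = non-smoker
        "tcl": tcl - 200,            # total cholesterol, centered at 200
        "bmi": bmi - 25,             # body mass index, centered at 25
        "male": male,                # 1 = male, 0 = female
        "age": age - 50,             # age in years, centered at 50
        "activity": activity_hours,  # weekly hours of physical activity
    }
    return CONSTANT + sum(BETAS[name] * value for name, value in inputs.items())


def chd_risk(**answers):
    """Return the predicted ten-year probability, 1 / (1 + e^-z)."""
    return 1.0 / (1.0 + math.exp(-chd_score(**answers)))


example_one = dict(smoking=1, tcl=230, bmi=32, male=0, age=55, activity_hours=0)
example_two = dict(smoking=0, tcl=180, bmi=25, male=1, age=65, activity_hours=4)

for label, answers in [("Example One", example_one), ("Example Two", example_two)]:
    print(f"{label}: z = {chd_score(**answers):.3f}, risk = {chd_risk(**answers):.0%}")
```

Keeping the score and the probability in separate functions makes it easy to compare the intermediate sum against the tables before looking at the final percentage.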