USE OF MACHINE LEARNING TO PREDICT THE ONSET OF DIABETES

Vinaytosh Mishra 1, Dr. Cherian Samuel 2, Prof. S.K. Sharma 3
Department of Mechanical Engineering, IIT (BHU), Varanasi

ABSTRACT

Diabetes is growing like an epidemic in India. Its prevalence is seen not only in urban areas but also in rural parts of the country. The direct and indirect cost of therapy is a major concern, and the cost rises as the disease progresses. Prediction at an early stage can not only reduce the cost of therapy but also prevent multiple organ dysfunction and casualties. This paper uses a classification technique, logistic regression, to predict the disease in its early stages. The centres used for the study are diabetes speciality clinics in Varanasi.

KEYWORDS: Diabetes Prevalence, Risk Score, Logistic Regression, Healthcare Burden

1. INTRODUCTION

Diabetes is becoming a pandemic worldwide, and India, with 62 million diabetic patients, is one of the significant contributors [1]. Over the past 30 years, the prevalence of diabetes has increased to 12-18% in urban India and 3-6% in rural India. This rate of increase is 50-80% higher than that of China (10%) [2]. According to the International Diabetes Federation (IDF), India is home to the largest number of diabetic patients and is hence termed the diabetes capital of the world [19]. The estimated burden of properly treating diabetes in India is USD 2.2 billion, while the government was spending only USD 61 per capita on healthcare in 2012 [20, 21].

Table 1 shows all major studies done on the prevalence of diabetes in India. Most of the studies are regional, and none of them was done in Eastern U.P. or Bihar. The literature underlines a rising trend in the prevalence of Type 2 diabetes in urban India [18].
1 Research Scholar, Industrial Management, Department of Mechanical Engineering, IIT-BHU. Corresponding Author: vinaytosh@gmail.com, Mobile: +91-8795832849/51
2 Assistant Professor, Industrial Management, Department of Mechanical Engineering, IIT-BHU
3 Professor, Industrial Management, Department of Mechanical Engineering, IIT-BHU
DOI: 10.14810/ijmech.2015.4202

Year  Authors              Place        Area     Prevalence % (Urban)
1971  Thripathy et al.     Cuttack      Central  1.2
1972  Ahuja et al.         New Delhi    North    1.3
1979  Gupta et al.         Multicentre           3
1984  Murthy et al.        Tenali       South    4.7
1986  Patel                Bhadran      West     3.8
1988  Ramchandran et al.   Kudremukh    South    5
1991  Ahuja et al.         New Delhi    North    6.7
1992  Ramchandran et al.   Chennai      South    8.2
1997  Ramchandran et al.   Chennai      South    11.6
2000  Zarger et al.        Kashmir      North    6.1
2001  Ramchandran et al.   National              12.1
2001  Misra et al.         New Delhi    North    10.3
2001  Mohan et al.         Chennai      South    12.1
2001  Kutty et al.         Kerala       South    12.4

Table 1: Rising Prevalence of Diabetes in Urban India

Diabetes is a doorway to multiple diseases, and the cost of therapy increases with time. According to a study of 3010 subjects by Ramchandran et al. [18], the prevalence of vascular complications in diabetes is alarming.

Microvascular Complication   %      Macrovascular Complication    %
Retinopathy                  23.7   Cardiovascular disease        11.4
  Background                 20     Peripheral vascular disease   4
  Proliferative              3.7    Cerebrovascular accidents     0.9
Nephropathy                  5.5    Hypertension                  38
Peri-neuropathy              27.5

Table 2: Prevalence of Vascular Complications in Diabetes

Another study, by Bhansali [3], sheds further light on the cost of therapy with various complications. Patients with diabetic foot complications spent 19020 INR; those with two complications spent about four times as much (17633 INR), and patients with renal disease (12690 INR), cardiovascular (13135 INR) or retinal complications (13922 INR) spent about three times as much as patients without any complications (4493 INR) [3].

Diabetes accounts for a sizeable proportion of healthcare resources worldwide. Several studies have shown that timely intervention can prevent or postpone the onset of the disease. Appropriate identification of individuals at high risk is therefore important [4, 5]. Several risk scores to predict type 2 diabetes have been developed [6-9].
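The cost comparison quoted from [3] can be checked with simple arithmetic. The sketch below divides each reported expenditure by the expenditure of patients without complications; the dictionary keys are illustrative labels chosen here, not terms taken from the cited study.

```python
# Expenditure figures (INR) as quoted in the text from reference [3].
# The key names are illustrative labels, not the study's own categories.
costs_inr = {
    "no complications": 4493,
    "foot complications": 19020,
    "two complications": 17633,
    "renal disease": 12690,
    "cardiovascular": 13135,
    "retinal": 13922,
}

baseline = costs_inr["no complications"]

# Ratio of each group's spending to the no-complication baseline.
cost_ratio = {group: cost / baseline for group, cost in costs_inr.items()}

for group, ratio in cost_ratio.items():
    print(f"{group:20s} {ratio:4.1f}x baseline")
```

The ratios come out close to 4 for two complications and close to 3 for the renal, cardiovascular and retinal groups, which matches the multiples stated in the text.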

Table 3: Various Diabetes Risk Models Worldwide [10]

It is ironic that, in a country known as the diabetes capital of the world, there is a lack of studies on the prediction of diabetes. Moreover, lifestyle, rituals and eating habits vary significantly among the states of India. The models discussed above are based on studies done in specific geographies. Existing diabetes prediction models are in use, but the performance of each model varies with country, age, sex, and adiposity [10]. Despite the cultural specificity of Western medical practice, that is, its origin in a particular cultural tradition, it has been extraordinarily widely diffused throughout the world [11].

2. SCOPE

Thus, there is a need for regional studies of diabetes prediction in India. Early intervention can reduce the prevalence of diabetes and hence the economic burden due to it.

OBJECTIVE

To develop an easy-to-administer diabetes prediction model for Eastern India.

SELECTION OF VARIABLE

1. Twenty-five variables were selected through a literature review of earlier studies.
2. Twelve diabetes prediction models were revisited to find out whether those variables were included in each model.
3. Variables with more than six inclusions were selected for the study.
4. In addition to the variables mentioned above, one additional variable, HBA1C, was used. HBA1C is a popular diabetes prediction tool and has been recommended by the International Expert Committee [17].

3. SAMPLING PLAN

The participants were selected from Eastern U.P. and Bihar, with age 23 years and education above the fifth standard. Only two speciality centres in Varanasi were selected for the study. The definition of the incidence of diabetes is derived from the diagnosis of specialist doctors. Out of 200 participants, 106 were found to be diabetic, while 94 were adjudged non-diabetic.
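The inclusion-count rule in SELECTION OF VARIABLE (keep a candidate variable if it appears in more than six of the twelve reviewed models) can be sketched as follows. The model contents below are hypothetical placeholders, since the paper does not list which variables each reviewed model uses; only the threshold of six comes from the text.

```python
from collections import Counter

def select_variables(models, min_inclusions=6):
    """Return variables that appear in MORE than `min_inclusions` models."""
    # set() guards against a variable being listed twice in one model.
    counts = Counter(var for model in models for var in set(model))
    return sorted(var for var, n in counts.items() if n > min_inclusions)

# Hypothetical stand-in for the twelve reviewed prediction models.
hypothetical_models = (
    [["age", "bmi", "waist", "hypertension"]] * 7
    + [["age", "parental_dm", "diet"]] * 5
)

selected = select_variables(hypothetical_models)
print(selected)
```

With this placeholder data, "age" appears in all twelve models and "bmi", "waist" and "hypertension" in seven, so they pass the threshold, while "parental_dm" and "diet" (five inclusions each) do not.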

4. METHODOLOGY

Logistic regression is the model of choice in many medical data classification tasks [12]. The classification tool used for the study is binary logistic regression, and the software used for the data analysis is IBM SPSS 20.0. For logistic regression with 200 samples, the power of the study is between 0.80 and 0.85 [13-16]. The purpose of the study is not only to find whether the available data can separate the two classes, diabetic and non-diabetic, but also to provide a mechanism for removing superfluous variables to obtain more accurate models. This approach saves money, time and effort by dropping unnecessary tests and considering only relevant questions [12].

The logistic regression model calculates the class membership probability for one of the two categories in the data set:

$$P(1 \mid x, \alpha) = \frac{1}{1 + e^{-\alpha \cdot x}}$$

and

$$P(0 \mid x, \alpha) = 1 - P(1 \mid x, \alpha).$$

Here, we write $P(1 \mid x, \alpha)$ to make the dependence of the posterior distribution on the parameters $\alpha$ explicit. The hyperplane of all points $x$ satisfying the equation $\alpha \cdot x = 0$ forms the decision boundary between the two classes; these are the points for which $P(1 \mid x, \alpha) = P(0 \mid x, \alpha) = 0.5$ [22].

6. RESULTS

The SPSS output for the Model Summary and the variables used in the model is listed below.

Step  -2 Log likelihood  Cox & Snell R Square  Nagelkerke R Square
1     78.556 (a)         .628                  .839

a. Estimation terminated at iteration number 8 because parameter estimates changed by less than .001.

Table 4: Model Summary

The pseudo R Square values for the model, Cox & Snell R Square and Nagelkerke R Square, are 0.628 and 0.839 respectively. This shows that the model is appropriate using these variables.
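The pseudo R Square values in Table 4 can be reproduced from the reported -2 log likelihood, under one assumption that the paper does not state explicitly: that the null model is an intercept-only logistic regression fitted to the same 200 participants (106 diabetic, 94 non-diabetic). The formulas are the standard Cox & Snell and Nagelkerke definitions; the function names are chosen here for illustration.

```python
import math

def null_deviance(n_pos, n_neg):
    """-2 log likelihood of an intercept-only logistic model."""
    n = n_pos + n_neg
    p = n_pos / n  # the intercept-only model predicts the base rate
    return -2.0 * (n_pos * math.log(p) + n_neg * math.log(1.0 - p))

def pseudo_r2(dev_model, dev_null, n):
    """Return (Cox & Snell R^2, Nagelkerke R^2) from two deviances."""
    cox_snell = 1.0 - math.exp(-(dev_null - dev_model) / n)
    # Nagelkerke rescales Cox & Snell by its maximum attainable value.
    max_cox_snell = 1.0 - math.exp(-dev_null / n)
    return cox_snell, cox_snell / max_cox_snell

# Assumed null model: 106 diabetic and 94 non-diabetic participants.
dev_null = null_deviance(106, 94)
# 78.556 is the -2 log likelihood reported in Table 4.
cox_snell, nagelkerke = pseudo_r2(78.556, dev_null, 200)

print(f"Cox & Snell R^2: {cox_snell:.3f}")
print(f"Nagelkerke R^2:  {nagelkerke:.3f}")
```

Under that assumption the computed values agree with the reported 0.628 and 0.839 to three decimals, which suggests the model summary is internally consistent with the sample split given in the sampling plan.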

Step 1 (a)     B        S.E.   Wald    df  Sig.  Exp(B)
AGE            .325     .083   15.454  1   .000  1.384
SEX            -.931    .926   1.011   1   .315  .394
SMOKING        -2.865   1.352  4.488   1   .034  .057
PARENTAL_DM    4.407    1.329  10.989  1   .001  81.998
HYPERTENSION   4.948    1.346  13.525  1   .000  140.940
BMI            .127     .142   .808    1   .369  1.136
WAIST_CRCM     .153     .069   4.834   1   .028  1.165
HBA1C          .844     .683   1.526   1   .217  2.326
Constant       -40.591  7.866  26.631  1   .000  .000

a. Variable(s) entered on step 1: AGE, SEX, SMOKING, PARENTAL_DM, HYPERTENSION, BMI, WAIST_CRCM, and HBA1C.

Table 5: Variables in the Equation

Age, Smoking, Parental Diabetes Mellitus, Hypertension and Waist Circumference are significant at the 95% confidence level (Sig. < .05), while Sex, BMI and HBA1C are not.

7. CONCLUSIONS

The HBA1C reading was not found to be significant in the study. This raises a question about the over-glorification of this tool as a predictor of diabetes. Are simple anthropometric variables like Waist Circumference better predictors than the three-month average Blood Glucose Level?

LIMITATIONS OF STUDY

Only two centres were used for the study, and participants were selected through convenience sampling. More lifestyle and location-specific variables could be included as predictors.

FUTURE SCOPE OF STUDY

A multicentre study with more variables may give different results. More variables can be included using the Delphi Method. The present methodology, Logistic Regression, can be compared with more advanced tools such as ANN (Artificial Neural Network).
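The Exp(B) and Sig. columns of Table 5 follow from the B and Wald columns and can be recomputed as a consistency check. The sketch below assumes the usual reading of SPSS output: Exp(B) is the odds ratio e^B, and Sig. is the upper-tail probability of a chi-square distribution with one degree of freedom at the Wald statistic. The variable and function names are chosen here for illustration.

```python
import math

# (name, B, S.E., Wald) copied from Table 5, excluding the constant.
table5 = [
    ("AGE", 0.325, 0.083, 15.454),
    ("SEX", -0.931, 0.926, 1.011),
    ("SMOKING", -2.865, 1.352, 4.488),
    ("PARENTAL_DM", 4.407, 1.329, 10.989),
    ("HYPERTENSION", 4.948, 1.346, 13.525),
    ("BMI", 0.127, 0.142, 0.808),
    ("WAIST_CRCM", 0.153, 0.069, 4.834),
    ("HBA1C", 0.844, 0.683, 1.526),
]

def wald_p_value(wald):
    """Upper-tail probability of chi-square(df=1) at the Wald statistic."""
    # For df=1, P(chi2 > w) = P(|Z| > sqrt(w)) = erfc(sqrt(w / 2)).
    return math.erfc(math.sqrt(wald / 2.0))

odds_ratios = {name: math.exp(b) for name, b, _se, _wald in table5}
p_values = {name: wald_p_value(wald) for name, _b, _se, wald in table5}
significant = [name for name, *_ in table5 if p_values[name] < 0.05]

for name, *_ in table5:
    print(f"{name:13s} OR={odds_ratios[name]:9.3f}  p={p_values[name]:.3f}")
print("Significant at 0.05:", significant)
```

Small differences from the printed Exp(B) values are expected because B is reported to only three decimals. An odds ratio below 1, as for SMOKING, means the fitted odds of the diabetic class fall as that variable increases, given the coding used in the study, which the paper does not describe.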

REFERENCES

[1] Seema Abhijeet Kaveeshwar, Jon Cornwall, The current state of diabetes mellitus in India, Australas Med J. 2014; 7(1): 45-48.
[2] Mohan V, Sandeep S, Deepa R, Shah B, Varghese C. Epidemiology of type 2 diabetes: Indian scenario. Indian J Med Res. Mar 2007; 125(3): 217-230.
[3] Anil Bhansali, Cost of Diabetes Care: Prevent Diabetes or Face Catastrophe, JAPI, February 2013, Vol. 61.
[4] International Diabetes Federation. IDF Diabetes Atlas, 5th edn. Brussels: International Diabetes Federation, 2011.
[5] Li G, Zhang P, Wang J, et al. The long-term effect of lifestyle interventions to prevent diabetes in the China Da Qing Diabetes Prevention Study: a 20-year follow-up study. Lancet 2008; 371: 1783-89.
[6] Noble D, Mathur R, Dent T, Meads C, Greenhalgh T. Risk models and scores for type 2 diabetes: systematic review. BMJ 2011; 343: d7163.
[7] Buijsse B, Simmons RK, Griffin SJ, Schulze MB. Risk assessment tools for identifying individuals at risk of developing type 2 diabetes. Epidemiol Rev 2011; 33: 46-62.
[8] Lindstrom J, Tuomilehto J. The diabetes risk score: a practical tool to predict type 2 diabetes risk. Diabetes Care 2003; 26: 725-31.
[9] Collins G S, Mallett S, Omar O, Yu LM. Developing risk prediction models for type 2 diabetes: a systematic review of methodology and reporting. BMC Med 2011; 9: 103.
[10] Andre P K et al. Non-invasive risk scores for prediction of type 2 diabetes (EPIC-InterAct): a validation of existing models.
[11] Nicholas A. Ethics are local: engaging cross-cultural variation in the ethics for clinical research, Soc. Sci. Med. Vol. 35, No. 9, pp. 1079-1091, 1992.
[12] Stephan, Lucia. Logistic regression and artificial neural network classification models: a methodology review, Volume 35, Issues 5-6, Pages 352-359.
[13] Agresti, A. 2002. Categorical Data Analysis, Second Edition. Hoboken, NJ: Wiley.
[14] Agresti, A. 1996. An Introduction to Categorical Data Analysis. New York: Wiley.
[15] Cohen, J. 1988. Statistical Power Analysis for the Behavioural Sciences, Second Edition. Mahwah, NJ: Lawrence Erlbaum Associates.
[16] Long, J.S. 1997. Regression Models for Categorical and Limited Dependent Variables. Thousand Oaks, CA: SAGE Publications, Inc.
[17] David M Nathan. The International Expert Committee, International Expert Committee Report on the Role of the A1C Assay in the Diagnosis of Diabetes.
[18] A. Ramchandran, Socio-Economic Burden of Diabetes in India, July 2007, Vol. 55.
[19] Guariguata L, Whiting DR, Hambleton I, Beagley J, Linnenkamp U, Shaw JE: Global estimates of diabetes prevalence for 2013 and projections for 2035 for the IDF Diabetes Atlas. Diabetes Res Clin Pract 2013, 49.
[20] Ramchandran A: Socio-economic burden of diabetes in India. Assoc Physicians India 2007, 55(L): 9.
[21] World Health Organization. Global Health Expenditure Database. Total expenditure on health/capita at exchange rate. 2012 [16.10.2014].
[22] B. Ripley, Pattern recognition and neural networks, Cambridge University Press, Cambridge (1996).