Tropical Medicine and International Health
doi:10.1111/j.1365-3156.2009.02337.x
volume 14 no 9 pp 1154–1159 September 2009

Decision tree algorithm in deciding hospitalization for adult patients with dengue haemorrhagic fever in Singapore

V. J. Lee 1, D. C. Lye 2, Y. Sun 3 and Y. S. Leo 2
1 Department of Clinical Epidemiology, Tan Tock Seng Hospital, Singapore
2 Department of Infectious Disease, Tan Tock Seng Hospital, Singapore
3 Clinical Project Management and Planning, National Healthcare Group, Singapore

Summary

Objectives: To develop a simple decision tree for clinicians to decide between hospitalization and outpatient monitoring of adult dengue patients.

Method: Retrospective cohort study on all laboratory-diagnosed dengue patients admitted in 2004 to Tan Tock Seng Hospital, Singapore. Demographic, clinical, laboratory and radiological data were collected, and cases classified as dengue fever (DF) or dengue haemorrhagic fever (DHF) using World Health Organization criteria. To develop the decision tree, we used the chi-squared automatic interaction detector (CHAID) with bi-way and multi-way splitting. The resulting trees were pruned to achieve the highest sensitivity with the shortest tree.

Results: In 2004, 1973 probable and confirmed adult dengue patients were admitted; DF comprised 1855 (94.0%) and DHF 118 (6.0%) of the cases. The best decision tree prediction had three branches, consisting of a history of clinical bleeding, serum urea, and serum total protein. This decision tree had a sensitivity of 1.00, specificity of 0.46, positive predictive value of 7.5%, and negative predictive value of 100%. The overall accuracy of the decision tree was 48.1%. The test sensitivity and specificity compared favourably with other predictive probability equations and sophisticated laboratory tests, and would prevent 43.9% of mild DF cases from hospitalization.

Conclusions: A simple decision tree is effective in predicting DHF in the clinical setting for adult dengue infection.
Keywords: algorithm, dengue, dengue haemorrhagic fever, decision tree, hospitalization, prediction

Introduction

Differentiating dengue fever (DF) from more severe forms of dengue haemorrhagic fever (DHF) and dengue shock syndrome (DSS) in the early phases of illness is clinically challenging. Three studies of children (Teeraratkul et al. 1990; Kalayanarooj et al. 1997; Carlos et al. 2005) and one study of adults (Lee et al. 2006) attempted to compare DF and DHF patients to derive predictors of DHF, but none resulted in a clinically useful tool to assist clinicians in recommending hospitalization for patients with dengue in the early febrile phase of their illness. In Singapore, about 80% of notified adult dengue cases were hospitalized from 2000 to 2005 despite low rates of DHF (1.8–2.8%) (Lye et al. 2008). An easy-to-use tool for clinicians is therefore necessary to reduce unnecessary dengue admissions and to focus on the management of potentially complicated cases.

In a previous report, we developed a predictive probability equation to determine the probability of subsequent development of DHF (including DSS) based on clinical and laboratory data upon first presentation to hospital (World Health Organization 1997), with a sensitivity of 98%, specificity of 60% and negative predictive value of >99%. However, the equation requires computational equipment and may not be easily used in all clinical settings. Compared with other predictive analytic methods, the strength of a decision tree lies in converting simple branching heuristics into if-then rules that are easy to generate and understand. This reduces the amount of calculation necessary when numerous cases need to be classified and scored.

Objectives

This study aims to develop a simple decision tree that will assist clinicians in deciding between hospitalization and outpatient monitoring of adult dengue patients, and to validate the decision tree with existing data.

© 2009 Blackwell Publishing Ltd
Study design

Similar to our previous study, we conducted a retrospective cohort study of all probable and confirmed adult dengue patients according to WHO criteria (World Health Organization 1997) admitted from 1 January to 31 December 2004 to the Department of Infectious Diseases at Tan Tock Seng Hospital (TTSH), Singapore (Lee et al. 2008). Patients had to have fever and at least two of the following: headache, eye pain, myalgia, arthralgia, rash, bleeding or leukopaenia. A probable case had positive acute dengue serology, as measured by the Dengue Duo IgM & IgG Rapid Strip Test, Panbio, Qld, Australia (Cuzzubbo et al. 2001; Blacksell et al. 2006), and a confirmed case had a positive dengue polymerase chain reaction (Barkham et al. 2006). We extracted detailed demographic, clinical, laboratory and radiological data from chart review, and all cases were classified as DF, DHF (requiring all four criteria of fever, thrombocytopenia <100 × 10^3/μl, bleeding and plasma leakage) or DSS using WHO criteria, based on data from the patient's entire illness.

Decision tree learning is a standard machine learning technique for approximating discrete-valued functions (Quinlan 1986). To develop the decision tree, we used the chi-squared automatic interaction detector (CHAID) method, a common method for building classification trees that uses the chi-square test for contingency tables to decide which variables are of maximal importance for classification. This method is appropriate because the target variables are a mix of categorical and continuous variables. We applied the CHAID method on two groups in succession based on the available data set. Assuming a test power of 0.8, an effect size of 0.2, and 3 degrees of freedom, the total sample size required for each group is 273. We therefore built the model on a training set comprising 60% of the entire data set, and then applied the developed model to a testing set comprising the remaining 40%.
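As a rough illustration of the chi-square comparison that CHAID relies on when ranking candidate splits, the Python sketch below computes the Pearson chi-square statistic and a Bonferroni-adjusted P value for two binary splits. The counts are those reported in Figure 1 for the whole cohort (DHF diagnosed after admission), not the 60% training set; the function names and the number of candidate predictors are hypothetical. It is not the Clementine implementation used in the study, which also merges categories and handles multi-way splits.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


def p_value_df1(chi2):
    """Upper-tail P value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


def bonferroni(p, n_tests):
    """Bonferroni adjustment: multiply by the number of tests, capped at 1."""
    return min(1.0, p * n_tests)


# Each tuple is (DF on one side, DHF on one side, DF on the other, DHF on the other).
# Root node, split on history of bleeding (yes vs. no), counts from Figure 1.
bleeding_split = (123, 61, 1732, 21)
# Node of patients without bleeding, split on urea > 4.0 vs. <= 4.0 mmol/l.
urea_split = (383, 12, 499 + 850, 9)

N_CANDIDATES = 4  # hypothetical number of candidate predictors compared at a node

bleeding_chi2 = chi_square_2x2(*bleeding_split)
urea_chi2 = chi_square_2x2(*urea_split)
bleeding_p_adj = bonferroni(p_value_df1(bleeding_chi2), N_CANDIDATES)
urea_p_adj = bonferroni(p_value_df1(urea_chi2), N_CANDIDATES)

print(f"bleeding: chi2 = {bleeding_chi2:.1f}, adjusted P = {bleeding_p_adj:.3g}")
print(f"urea:     chi2 = {urea_chi2:.1f}, adjusted P = {urea_p_adj:.3g}")
```

At each node, the candidate with the smallest adjusted P value below the significance level would be chosen for splitting; the sketch only evaluates the two splits whose counts can be read from the published figure.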
This ensured independence of the test results and internal validity. The final model was then used on the entire dataset for analysis. We set a stopping criterion that a terminal node must not contain fewer than 2% of the total training set. In addition, we allowed for bi-way and multi-way splitting, using the chi-square test to implement the splitting based on the best fit. As splitting of merged categories occurred, the Bonferroni adjustment was used in the calculation of P values to determine the most appropriate branches.

We explored three different models using CHAID: the first utilizing all the variables collected in the dataset as input variables; the second using only the nine variables found to be significant on univariate analysis (bleeding, headache, rash, serum urea, serum total protein, alanine transaminase, aspartate transaminase and gamma glutamyl transpeptidase); and the third using only the four variables found to be significant on multivariate analysis (bleeding, lymphocyte proportion, serum urea, and serum total protein). The resulting trees were pruned to achieve the highest sensitivity with the shortest possible tree; sensitivity was preferred over specificity because we wanted all DHF cases (including DSS cases) to be considered for hospitalization. The analyses were performed using Clementine 10.0 (SPSS Inc., Chicago, IL, USA, 2005), and all tests were conducted at the 5% level of significance.

To show the predictive tool's cost-effectiveness, we also estimated the cost savings if the tool was used to reduce admissions (Lee et al. 2008). All costs were adjusted to 2004 Singapore dollars (USD1 = SGD1.6908).

Results

In 2004, 1973 adult dengue patients were admitted to the Department of Infectious Diseases, TTSH, Singapore; confirmed dengue infections comprised 46% (917) of the cohort. Dengue fever comprised 1855 (94.0%) of the cases, and DHF 118 (6.0%) of the cases.
There were 10 DSS patients, seven patients required intensive care unit admission and one patient died; all of these severe cases had DHF. The median age for DF cases was 32 years (5th–95th percentile, 17–58 years) and for DHF 34 years (5th–95th percentile, 18–54 years) (P = 0.89). For the entire cohort, the median duration of illness from onset to presentation was 5 days (5th–95th percentile, 3–7 days) and the median length of hospitalization was 4 days (5th–95th percentile, 2–7 days). At presentation to hospital, 96 (5.2%) DF patients had bleeding vs. 78 (66.1%) DHF patients (P < 0.01). Median platelet nadir for DF patients was 51 × 10^3/μl (5th–95th percentile, 11–97 × 10^3/μl) vs. 49 × 10^3/μl (5th–95th percentile, 9–90 × 10^3/μl) for DHF patients (P > 0.05). Table 1 details the clinical and laboratory features of DF and DHF patients in our study cohort.

Among the 118 DHF cases, 36 (30.5%) were identified at presentation to our hospital, while 82 cases (69.5%) developed DHF subsequently during hospitalization. These latter 82 cases were used in the subsequent analysis to develop the decision tree, since these were the cases that were not detected at presentation. From the analysis, the best prediction was achieved equally by the second model (based on the 10 significant variables on univariate analysis) and the third model (based on the 4 significant variables on multivariate analysis) (Figure 1). This decision tree had only three branches, comprising a history of any clinical bleeding, followed by
Table 1 Selected demographic, clinical and laboratory features for DF and DHF patients*

| Variable | DF (n = 1855) | DHF (n = 82) | P value |
|---|---|---|---|
| Demographic data | | | |
| Male | 1183 (64%) | 53 (65%) | 0.87 |
| Age, years | 32 (17–58) | 34 (18–53) | 0.62 |
| Co-morbid medical conditions‡ | 137 (7%) | 8 (10%) | 0.42 |
| History | | | |
| Symptoms | | | |
| Fever | 1825 (98%) | 82 (100%) | 0.25 |
| Headache | 606 (33%) | 16 (20%) | 0.01 |
| Eye pain | 5 (0.3%) | 0 | 1.00 |
| Myalgia/arthralgia | 1243 (67%) | 52 (63%) | 0.50 |
| Vomiting | 604 (33%) | 29 (35%) | 0.59 |
| Abdominal pain | 252 (14%) | 14 (17%) | 0.37 |
| Bleeding gums | 106 (6%) | 53 (65%) | <0.01 |
| Menorrhagia | 7 (0.4%) | 0 | 1.00 |
| Bleeding gastrointestinal tract | 2 (0.1%) | 1 (1.2%) | 0.12 |
| Bleeding nose | 16 (0.9%) | 10 (12%) | <0.01 |
| Clinical parameters | | | |
| Rash | 887 (48%) | 49 (60%) | 0.03 |
| Temperature, °C | 37.6 (36.4–39.3) | 37.2 (36.4–39.3) | 0.05 |
| Low blood pressure, <90/60 mmHg | 73 (4%) | 5 (6%) | 0.38 |
| Pulse pressure, mmHg | 45 (30–65) | 42 (33–56) | 0.06 |
| Pulse, /min | 85 (64–109) | 84 (64–101) | 0.78 |
| Laboratory parameters | | | |
| Haemoglobin, g/dl | 14.8 (12.2–17.1) | 14.6 (11.5–17.8) | 0.77 |
| Haematocrit, % | 43.5 (36.0–50.1) | 43.3 (34.8–50.5) | 0.69 |
| White cell count, × 10^3/μl | 3.1 (1.5–7.4) | 3.4 (1.6–7.6) | 0.54 |
| Polymorph proportion, % | 56.0 (26.0–80.8) | 58.9 (35.2–81.6) | 0.11 |
| Lymphocyte proportion, % | 34.0 (14.8–55.2) | 32.9 (14.5–51.4) | 0.08 |
| Platelets, × 10^3/μl | 79 (28–139) | 72 (18–133) | 0.12 |
| Prothrombin time, s | 13.2 (11.9–15.0) | 13.2 (11.7–15.9) | 0.59 |
| Activated thromboplastin time, s | 39.3 (31.4–49.7) | 39.6 (28.6–60.2) | 0.40 |
| Urea, mmol/l | 3.8 (2–6) | 4.1 (2–7) | 0.04 |
| Creatinine, μmol/l | 74 (49–107) | 69 (47–111) | 0.09 |
| Total protein, g/l | 70 (60–79) | 65 (60–74) | <0.01 |
| Alanine transaminase, IU/l | 72 (20–390) | 98 (23–564) | 0.01 |
| Aspartate transaminase, IU/l | 106 (34–536) | 136 (31–727) | <0.01 |
| Gamma glutamyl transpeptidase, IU/l | 41 (12–269) | 62 (13–380) | 0.01 |

*Variables shown are all at first presentation to hospital.
For dichotomous variables, number of positive cases is shown with percentages in parentheses; for continuous variables, median values are shown with 5th–95th percentiles in parentheses.
‡Co-morbid conditions include chronic obstructive pulmonary disease, diabetes mellitus, hypertension, hyperlipidaemia, cerebrovascular disease, cardiac disease, vascular disease.

serum urea level and finally by serum total protein level. This tree was sufficient to achieve a sensitivity of 100%. The tree had a specificity of 46%, positive predictive value (PPV) of 7.5%, and negative predictive value (NPV) of 100% (Table 2). The overall accuracy of the decision tree was 48.1%. With the first model, which included all variables, the best sensitivity that could be achieved was 96%; this model was therefore excluded because the highest possible sensitivity is required. Although further branches were available for all three models, they could not maintain the highest sensitivity level required, and the pruning remained at the levels shown.

Overall, the 1973 dengue hospitalizations cost SGD 6.77 million including losses due to work absenteeism (Lee et al. 2008). Using the decision tree would maximize classification of DHF cases while reducing overall dengue hospitalizations by 43.9%, resulting in savings of approximately SGD 1.00 million in addition to freeing up invaluable inpatient beds and healthcare staff.

Conclusions

The advantage of the decision tree described in this study, compared with novel laboratory markers (Green et al.
Figure 1 Decision tree for selection of adult DHF cases for admission.

Presentation to emergency department (1855 DF, 82 DHF)
  History of bleeding?
    Yes: high risk, admit (123 DF, 61 DHF)
    No: urea level (mmol/l)
      > 4.0: high risk, admit (383 DF, 12 DHF)
      ≤ 4.0: total protein level (g/l)
        ≤ 67.0: high risk, admit (499 DF, 9 DHF)
        > 67.0: low risk, discharge (850 DF, 0 DHF)

Table 2 2 × 2 table for the comparison of the accuracy of the decision tree with actual diagnosis

| Decision tree | DF | DHF* | Total | |
|---|---|---|---|---|
| Low risk | 850 | 0 | 850 | NPV = 100% |
| High risk | 1005 | 82 | 1087 | PPV = 8% |
| Total | 1855 | 82 | 1937 | |
| | Specificity = 46% | Sensitivity = 100% | | |

*DHF includes only those diagnosed post-admission.

1999; Libraty et al. 2002; Saito et al. 2004; Wang et al. 2006) predictive of DHF or our predictive probability equation (Lee et al. 2008), is its ease of use by clinicians in the emergency room or clinic setting. The novel laboratory markers used in other studies to predict DHF are costly, unlikely to be widely available in developing countries where dengue is endemic, and unlikely to have a turnaround time rapid enough to help clinicians in primary care or the emergency room decide on the need for hospitalization (Lee et al. 2008). Soluble tumour necrosis factor receptor 80 had a sensitivity of 67% and NPV of 69% (Green et al. 1999), free secreted non-structural protein 1 had a sensitivity of 72% and NPV of 69% (Libraty et al. 2002), platelet-associated IgM had a sensitivity of 49% and a specificity of 92% (Saito et al. 2004), and dengue viral load had a NPV of 95% and PPV of 88% (Wang et al. 2006).

There have been several other studies exploring clinical and laboratory predictors for DHF, but few have been performed in adults. An early Thai paediatric study found flushed face, absence of coryza and a positive tourniquet test to be useful early predictors of DHF (Teeraratkul et al. 1990).
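The if-then rules of Figure 1 and the accuracy figures of Table 2 can be reproduced with a short Python sketch. The function and variable names are hypothetical, and the branch directions (urea > 4.0 mmol/l and total protein ≤ 67.0 g/l leading to admission) are read from the figure and its leaf counts; this illustrates the published tree and is not the authors' software.

```python
def triage(bleeding: bool, urea_mmol_l: float, total_protein_g_l: float) -> str:
    """Apply the three branches of Figure 1 in order and return 'admit' or 'discharge'."""
    if bleeding:                    # any history of clinical bleeding
        return "admit"
    if urea_mmol_l > 4.0:           # serum urea above 4.0 mmol/l
        return "admit"
    if total_protein_g_l <= 67.0:   # serum total protein at or below 67.0 g/l
        return "admit"
    return "discharge"


# Leaf counts from Figure 1: (decision, DF cases, DHF cases diagnosed after admission).
LEAVES = [
    ("admit", 123, 61),      # history of bleeding
    ("admit", 383, 12),      # no bleeding, urea > 4.0
    ("admit", 499, 9),       # no bleeding, urea <= 4.0, total protein <= 67.0
    ("discharge", 850, 0),   # no bleeding, urea <= 4.0, total protein > 67.0
]


def accuracy_metrics(leaves):
    """Build the 2 x 2 table of Table 2 from the leaf counts and derive its metrics."""
    tp = sum(dhf for decision, _, dhf in leaves if decision == "admit")
    fp = sum(df for decision, df, _ in leaves if decision == "admit")
    fn = sum(dhf for decision, _, dhf in leaves if decision == "discharge")
    tn = sum(df for decision, df, _ in leaves if decision == "discharge")
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / total,
        "discharged": (tn + fn) / total,  # share of the cohort not admitted
    }


metrics = accuracy_metrics(LEAVES)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.1f}%")
```

Deriving the metrics from the leaf counts rather than hard-coding them makes the link between the figure and the table explicit: the 850 patients in the single low-risk leaf account for both the specificity and the 43.9% reduction in admissions.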
Another Thai paediatric study reported that, on admission, aspartate transaminase was higher and platelet count lower in DHF patients than in DF patients. Two days before defervescence, haematocrit was higher in DHF patients, and 1 day before defervescence, platelet count was lower in DHF patients (Kalayanarooj et al. 1997). A more recent paediatric study in the Philippines found DHF patients to suffer more from abdominal pain, restlessness and epistaxis, and to have higher leukocyte count, lymphocyte proportion and haematocrit, and lower platelet count, compared with DF patients (Carlos et al. 2005). One Taiwanese adult study found that multivariate risk factors for DHF included age >65 years, prior dengue, co-morbidity of diabetes mellitus, hypertension and renal impairment, alanine transaminase >100 IU/l, and gallbladder wall thickening (Lee et al. 2006). However, none of these studies was able to report sensitivity, specificity, PPV and NPV for DHF.

We did not find platelet count and liver transaminases to be useful in differentiating DHF from DF on multivariate analysis, even though DHF patients had significantly lower platelet count and higher liver transaminases than DF patients on univariate analysis (Lee et al. 2008). The early dengue infection and outcome study (EDEN) from Singapore, of which TTSH is a collaborating centre, found decision trees to be potentially useful in predicting severe dengue as defined by platelet count <50 × 10^3/μl (instead of DHF used in this study) (Tanner et al. 2008). The reported decision tree utilized platelet count on first 3 days of illness, quantitative dengue viral load and dengue
Table 3 Comparison of the decision tree with the predictive probability equation at different cut-off levels

| Variable | Sensitivity (%) | Specificity (%) | PPV (%) | NPV (%) | Overall accuracy (%) |
|---|---|---|---|---|---|
| Decision tree | 100 | 46 | 8 | 100 | 48 |
| Outcome of predictive probability equation > −2.9 (Lee et al. 2008) | 83 | 84 | 18 | 99 | 84 |
| Outcome of predictive probability equation > −5.1 (Lee et al. 2008) | 98 | 60 | 10 | >99 | 62 |

IgG to predict subsequent thrombocytopenia <50 × 10^3/μl, of which only platelet count is easily available in the setting of emergency departments or primary healthcare. We have not found thrombocytopenia to be a predictor of DHF or other severe outcomes in our study (Lee et al. 2008).

Our decision tree has only three easily obtained variables to consider: clinical bleeding, serum urea as a proxy measure of dehydration, and serum protein as a proxy measure of plasma leakage. The cut-off values are clear and easy to obtain and utilize in a busy emergency room or primary care clinic. Compared with the predictive probability equation in our previous study (Lee et al. 2008), the decision tree had perfect sensitivity and NPV, but lower specificity and PPV (Table 3). Sensitivity and specificity involve trade-offs, and the decision tree trades off specificity for sensitivity to increase the confidence of clinicians in admitting all DHF cases including those requiring intensive care. Even though the overall accuracy was only 48.1%, it was sufficient to prevent 43.9% of uncomplicated DF cases from being hospitalized. These cases may be followed up on a daily basis as outpatients (Chin et al. 1993; Ingram et al. 2008).

The limitations of our study are the relatively small number of DHF patients in our cohort and the fact that our patients presented at a median of 5 days of illness (5th–95th percentile, 3–7 days).
This means that the results may not apply if patients present much earlier or later; however, patients are rarely identified as having dengue earlier in their illness, and most are identified because of persistent fever, petechiae, and low platelet counts. In addition, our study population in Singapore is unique, being a primarily adult population with low paediatric dengue incidence nationally. More studies should be conducted in different populations and in the early part of acute dengue infection to further validate this decision tree as an aid to clinicians in hospitalizing patients with potentially severe dengue for further monitoring and treatment.

In conclusion, we have shown that a simple decision tree can be as effective as more complex laboratory markers or statistical equations in predicting DHF in the clinical setting for the adult population. This can reduce unnecessary hospitalization while optimizing clinical outcome for hospitalized DHF patients.

Acknowledgements

This study was funded by a National Healthcare Group Small Innovative Grant and a National Medical Research Council Individual Research Grant.

References

Barkham TM, Chung YK, Tang KF & Ooi EE (2006) The performance of RT-PCR compared with a rapid serological assay for acute dengue fever in a diagnostic laboratory. Transactions of the Royal Society of Tropical Medicine and Hygiene 100, 142–148.
Blacksell SD, Doust JA, Newton PN, Peacock SJ, Day NPJ & Dondorp AM (2006) A systematic review and meta-analysis of the diagnostic accuracy of rapid immunochromatographic assays for the detection of dengue virus IgM antibodies during acute infection. Transactions of the Royal Society of Tropical Medicine and Hygiene 100, 775–784.
Carlos CC, Oishi K, Cinco MTDD et al. (2005) Comparison of clinical features and hematologic abnormalities between dengue fever and dengue hemorrhagic fever among children in the Philippines. The American Journal of Tropical Medicine and Hygiene 73, 435–440.
Chin CK, Kang BH, Liew BK, Cheah PC, Nair R & Lam SK (1993) Protocol for outpatient management of dengue illness in young adults. Journal of Tropical Medicine and Hygiene 96, 259–263.
Cuzzubbo AJ, Endy TP, Nisalak A et al. (2001) Use of recombinant envelope proteins for serological diagnosis of dengue virus infection in an immunochromatographic assay. Clinical and Diagnostic Laboratory Immunology 8, 1150–1155.
Green S, Vaughn DW, Kalayanarooj S et al. (1999) Early immune activation in acute dengue illness is related to development of plasma leakage and disease severity. Journal of Infectious Diseases 179, 755–762.
Ingram PR, Mahadevan M & Fisher DA (2008) Dengue management: practical and safe hospital-based outpatient care. Transactions of the Royal Society of Tropical Medicine and Hygiene 103, 203–205.
Kalayanarooj S, Vaughn DW, Nimmannitya S et al. (1997) Early clinical and laboratory indicators of acute dengue illness. Journal of Infectious Diseases 176, 313–321.
Lee MS, Hwang KP, Chen TC, Lu PL & Chen TP (2006) Clinical characteristics of dengue and dengue haemorrhagic fever in a medical center of southern Taiwan during the 2002 epidemic. Journal of Microbiology, Immunology, and Infection 39, 121–129.
Lee VJ, Lye DC, Sun Y, Fernandez G, Ong A & Leo YS (2008) Predictive value of simple clinical and laboratory variables for dengue hemorrhagic fever in adults. Journal of Clinical Virology 42, 34–39.
Libraty DH, Young PR, Pickering D et al. (2002) High circulating levels of the dengue virus nonstructural protein NS1 early in dengue illness correlate with the development of dengue hemorrhagic fever. Journal of Infectious Diseases 186, 1165–1168.
Lye DC, Chan M, Lee VJ & Leo YS (2008) Did young adults with uncomplicated dengue fever need hospitalisation? A retrospective analysis of clinical and laboratory features. Singapore Medical Journal 49, 476–479.
Quinlan JR (1986) Induction of decision trees. Machine Learning 1, 81–106.
Saito M, Oishi K, Inoue S et al. (2004) Association of increased platelet-associated immunoglobulins with thrombocytopenia and the severity of disease in secondary dengue virus infections. Clinical and Experimental Immunology 138, 299–303.
Tanner L, Schreiber M, Low JG et al. (2008) Decision tree algorithms predict the diagnosis and outcome of dengue fever in the early phase of illness. PLoS Neglected Tropical Diseases 2, e196.
Teeraratkul A, Limpakarnjanara K, Nisalak A & Nimmannitya S (1990) Predictive value of clinical and laboratory findings for early diagnosis of dengue and dengue hemorrhagic fever.
Southeast Asian Journal of Tropical Medicine and Public Health 21, 696–697.
Wang WK, Chen HL, Yang CF et al. (2006) Slower rates of clearance of viral load and virus-containing immune complexes in patients with dengue hemorrhagic fever. Clinical Infectious Diseases 43, 1023–1030.
World Health Organization (1997) Dengue Haemorrhagic Fever: Diagnosis, Treatment, Prevention and Control, 2nd edn. WHO, Geneva.

Corresponding Author Vernon Lee, Department of Clinical Epidemiology, Tan Tock Seng Hospital, 11 Jalan Tan Tock Seng, Singapore 308433. E-mail: vernonljm@hotmail.com