An Algorithm to Stratify Sleep Apnea Risk in a Sleep Disorders Clinic Population


INDIRA GURUBHAGAVATULA, GREG MAISLIN, and ALLAN I. PACK

Center for Sleep and Respiratory Neurobiology, Pulmonary and Critical Care Division, Department of Medicine, University of Pennsylvania Medical Center, Philadelphia, Pennsylvania; and Pulmonary and Critical Care Section, Philadelphia VA Medical Center, Philadelphia, Pennsylvania

Obstructive sleep apnea may lead to complications if not identified and treated. Polysomnography is the diagnostic standard, but is often inaccessible due to bed shortages. A system that facilitates prioritization of patients requiring sleep studies would thus be useful. We retrospectively compared the accuracy of a two-stage risk-stratification algorithm for sleep apnea, using a questionnaire plus nocturnal pulse oximetry, against polysomnography to identify patients without apnea (Objective 1) and those with severe apnea (Objective 2). Patients were those referred to a university-based sleep disorders clinic because of suspected sleep apnea. Subjects completed a sleep apnea symptom questionnaire and underwent oximetry and two-night polysomnography. We used bootstrap methodology to maximize the sensitivity of our model for Objective 1 and its specificity for Objective 2. We calculated sensitivity, specificity, positive and negative predictive values, and rate of misclassification error of the two-stage risk-stratification algorithm for each of our two objectives. The model identified cases of sleep apnea with 95% sensitivity and severe apnea with 97% specificity. It excluded only 8% of patients from sleep studies, but prioritized up to 23% of subjects to receive in-laboratory studies. Among sleep disorders clinic referrals, a two-stage risk-stratification algorithm using questionnaire and nocturnal pulse oximetry excluded few patients from sleep studies, but identified a larger proportion of patients who should receive early testing because of their likelihood of having severe disease.

Keywords: obstructive sleep apnea; risk stratification; polysomnography; nocturnal pulse oximetry; questionnaire

(Received in original form March 8, 2001; accepted in final form September 10, 2001)
This work was supported by NIH Grants HL-42236, HL-07713, and CRC-RR-00040.
Correspondence and requests for reprints should be addressed to Indira Gurubhagavatula, M.D., Center for Sleep and Respiratory Neurobiology, Hospital of the University of Pennsylvania, 9th Floor, Maloney Building, 3600 Spruce Street, Philadelphia, PA 19104-4283. E-mail: gurubhag@mail.med.upenn.edu
This article has an online data supplement, which is accessible from this issue's table of contents online at www.atsjournals.org
Am J Respir Crit Care Med Vol 164. pp 1904–1909, 2001
DOI: 10.1164/rccm2103039
Internet address: www.atsjournals.org

Identifying obstructive sleep apnea (OSA) syndrome is important because the syndrome leads to adverse outcomes if left untreated (1–4) and improves with treatment (5). Although in-laboratory polysomnography (PSG) is an accurate diagnostic tool, it requires significant expertise and expense (6) and is often associated with long waiting periods (7). The use of PSG in a sleep clinic population would be optimized by a risk stratification strategy that achieves two goals. First, it identifies patients whose risk for sleep apnea is so low that PSG is not warranted. Second, it identifies the subset of patients who are at risk for severe OSA (8) and who should be prioritized for early testing.

To fulfill these two objectives, and to maintain accuracy while containing cost compared with gold standard tests, we developed a model that relies on a two-stage method of identifying cases. We based our model on existing prototypes for other conditions (9–15). The two component tests that we chose for our algorithm were the Multivariable Apnea Prediction (MAP) questionnaire (16) and nocturnal pulse oximetry (NPO) (5, 17–20). The objectives of this two-stage algorithm are to avoid performing sleep studies on patients at low risk for sleep apnea, and to prioritize scheduling of patients who are likely to have severe disease. We tested the hypothesis that our strategy could meet these two objectives in a sleep center population.

METHODS

Subjects

We selected subjects from patients presenting to our sleep center for evaluation of possible sleep apnea. All completed the MAP questionnaire (16) and a full-night sleep study (PSG) with concurrent oximetry. We excluded subjects who had already been diagnosed with OSA or obesity-hypoventilation syndrome, or who used supplemental oxygen.

Polysomnography

Sleep studies were completed and scored by polysomnographic technologists at our sleep center according to the method of Rechtschaffen and Kales (21). Twelve-channel recordings were obtained for all patients and included monitoring of the electroencephalogram, eye movements, tibialis anterior and chin electromyography, respiratory effort, snoring, airflow, and oximetry. The American Sleep Disorders Association (ASDA) definition was used to score arousals during sleep (22). The respiratory disturbance or apnea–hypopnea index (RDI, AHI) was defined as the number of apneas plus hypopneas divided by the total sleep time in hours. An apnea was defined as complete cessation of airflow for at least 10 s, and a hypopnea was defined by a 50% reduction in airflow for at least 10 s, associated with a 4% fall in oxyhemoglobin saturation or an arousal.
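The index defined above reduces to a single division. The following is a minimal sketch, not the scoring software used in the study: the function names are hypothetical, and the cut points of 5 and 30 events/h are the ones this study applies to define OSA and severe OSA.

```python
def rdi(apneas: int, hypopneas: int, total_sleep_time_h: float) -> float:
    """Respiratory disturbance index: (apneas + hypopneas) per hour of sleep."""
    if total_sleep_time_h <= 0:
        raise ValueError("total sleep time must be positive")
    return (apneas + hypopneas) / total_sleep_time_h


def osa_category(rdi_value: float) -> str:
    """Label a study using the cut points applied in this paper."""
    if rdi_value >= 30:   # severe OSA
        return "severe OSA"
    if rdi_value >= 5:    # OSA present
        return "OSA"
    return "no OSA"


# Illustrative numbers only: 40 apneas and 50 hypopneas over 6 h of sleep.
example = rdi(40, 50, 6.0)
print(example, osa_category(example))  # 15.0 OSA
```

Note that the denominator is total sleep time, whereas the oximetry desaturation index used later in the paper divides by total test time.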
We judged OSA to be present if the RDI was ≥ 5 events/h, and severe OSA to be present if the RDI was ≥ 30 events/h (23).

MAP, Oximetry, and PSG Scoring

As described previously (16), we determined each subject's MAP score, which predicts apnea risk using a score between 0 and 1, with 0 representing low risk and 1 representing high risk. A single observer then scored the oximetry strips without knowledge of the PSG results. The oximetry desaturation index (ODI) was the number of desaturations divided by the total test time. We obtained desaturation indices using both 3% and 4% drops, that is, ODI3 and ODI4, respectively. On PSG, we judged OSA to be present if the respiratory disturbance index (RDI) was ≥ 5 events/h (case 1) and severe OSA to be present if the RDI was ≥ 30 events/h (case 2) (23).

Discriminatory Power of MAP and NPO

Using ROCKIT software (24), we performed area under the curve (AUC) analysis for receiver operating characteristic (ROC) curves (25) to determine the relative discriminatory power of the MAP, ODI3, and ODI4 (26) and to determine their sensitivity and specificity (27).

Description of the Two-stage Algorithm

We applied our algorithm retrospectively to all included subjects. We used a predefined upper bound (UB) value and lower bound (LB) value for the MAP score to separate the subjects into three groups. Those who had high MAP scores (MAP ≥ UB) were predicted to have OSA, and their PSG results were then reviewed to check the prediction. Those with low MAP scores (MAP ≤ LB) were predicted to be free of OSA and would not be further studied. Those with MAP scores between UB and LB underwent scoring of nocturnal pulse oximetry, and the ODI was compared against a predefined threshold desaturation index (ODI threshold). Those with ODI ≥ ODI threshold were predicted to have OSA and hence would undergo PSG evaluation, whereas those with ODI < ODI threshold were predicted to be free of OSA.

Sensitivity and Specificity

Using SAS software (Cary, NC), we computed our algorithm's sensitivity and specificity against PSG for each of 125 combinations of UB, LB, and ODI threshold (i.e., 125 "parameter sets") for case 1 and for case 2 separately (23).

Determination of the Optimal Parameter Sets (28, 29)

We first selected a random sample of 80% of the clinic population, the estimation sample. We generated 200 bootstrap resamples from this estimation sample. To the 125 possible parameter sets associated with each bootstrap resample, we applied our criterion function to select the optimum parameter set (or sets). We then averaged all the optima from the 200 resamples to find what we called the unbiased estimate of the optimum parameter set. We did this for case 1 and for case 2 separately.

Model Validation

We applied our algorithm, using the two unbiased optimal parameter sets, to our reserved validation sample of 20% of our original clinic population. We computed sensitivity, specificity, and positive and negative predictive values of our two models separately, that is, for predicting RDI ≥ 5/h and RDI ≥ 30/h.

RESULTS

Figure 2. Receiver operating characteristic curves using MAP questionnaire (thin solid line) or oximetry (heavy solid line). To score oximetry, we used ODI3, the number of falls in saturation of ≥ 3% per hour of test time. RDI ≥ 5/h (upper panel) or RDI ≥ 30/h (lower panel) defined OSA. Compared with reference (straight dashed line), MAP shows good discriminatory power, and ODI3 better. The curve for ODI4, which used falls in saturation of ≥ 4% per hour of test time, was superimposed on that for ODI3 and is not displayed separately.

Descriptive Results

Of 421 patients considered for the study, 359 met the inclusion criteria. We excluded 62 subjects: 46 because their referring physician had ordered split-night studies, 12 because they had previously been diagnosed with OSA, 3 because they were already receiving supplemental oxygen, and 1 because of an incomplete MAP questionnaire. The final group consisted of 243 (67.7%) men and 116 (32.3%) women. Mean age ± SD was 47.2 ± 13.2 yr. The racial composition was 72.2% white, 24.2% black, and 3.6% other. Mean body mass index (BMI) ± SD was 32.4 ± 8.8 kg/m². The MAP frequency distribution (Figure 1) shows a high prevalence of clinic patients with MAP scores exceeding 0.6. The mean ± SD of the RDI scores was 25.9 ± 29.5 events/h, with a median of 12 events/h and a range of 0 to 110 events/h. Using RDI ≥ 5/h, ≥ 15/h, and ≥ 30/h as the diagnostic criterion, the prevalence of OSA in this sleep center population was 69.4%, 47.1%, and 33%, respectively.

Determination of the Discriminatory Power of MAP and NPO

We generated receiver operating characteristic curves using MAP values against RDI ≥ 5/h (Figure 2, upper panel) and RDI ≥ 30/h (Figure 2, lower panel). For RDI ≥ 5/h, the respective values of AUC (± SE) for MAP, ODI3, and ODI4 were 0.834 (± 0.024), 0.947 (± 0.011), and 0.936 (± 0.012). For RDI ≥ 30/h, AUC for ODI4 was 0.98 (± 0.008), showing slightly improved discrimination of oximetry when using more stringent criteria for defining the presence of sleep apnea. For ODI4, sensitivity and specificity were 85% and 90%, respectively, using RDI ≥ 5/h, and 90% and 95% using RDI ≥ 30/h. Similar values were obtained for ODI3 (see Figure 2 and Table 1).
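To make the AUC and the cutoff-based sensitivity and specificity concrete, here is a small sketch. It is not ROCKIT and does not reproduce its curve fitting; it substitutes the simpler empirical, rank-based AUC (the probability that a randomly chosen case scores higher than a randomly chosen non-case). The function names and the six data points are hypothetical.

```python
def empirical_auc(scores, labels):
    """Rank-based AUC: P(score of a case > score of a non-case), ties count 1/2."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sens_spec_at_cutoff(scores, labels, cutoff):
    """Sensitivity and specificity when score >= cutoff is called positive."""
    tp = sum(1 for s, y in zip(scores, labels) if y and s >= cutoff)
    fn = sum(1 for s, y in zip(scores, labels) if y and s < cutoff)
    tn = sum(1 for s, y in zip(scores, labels) if not y and s < cutoff)
    fp = sum(1 for s, y in zip(scores, labels) if not y and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


# Made-up ODI values (events/h) and PSG outcomes (True = RDI >= 5/h).
odi = [2, 4, 6, 12, 20, 35]
has_osa = [False, False, True, False, True, True]

print(empirical_auc(odi, has_osa))
print(sens_spec_at_cutoff(odi, has_osa, cutoff=5))
```

An AUC of 0.5 corresponds to the reference diagonal in Figure 2, and 1.0 to perfect separation of cases from non-cases.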
Effect of Varying the Parameters (UB, LB, and ODI threshold) on Sensitivity and Specificity of the Risk Stratification Algorithm

We computed the sensitivity and specificity of our algorithm as a function of the value we chose for UB, LB, and ODI threshold.

TABLE 1. RELATIVE DISCRIMINATORY POWERS OF MAP, ODI3, AND ODI4

                                              AUC      Sensitivity   Specificity
Discriminatory power using RDI ≥ 5 events/h
  MAP                                         0.834    0.819         0.700
  ODI3                                        0.947    0.860         0.909
  ODI4                                        0.936    0.852         0.900
  Reference                                   0.500    1.000         1.000
Discriminatory power using RDI ≥ 30 events/h
  MAP                                         0.786    0.805         0.631
  ODI3                                        0.981    0.908         0.942
  ODI4                                        0.979    0.899         0.946
  Reference                                   0.500    1.000         1.000

Definition of abbreviations: AUC = area under the curve; MAP = Multivariable Apnea Prediction; ODI3, ODI4 = oximetry desaturation index using 3% and 4% drops; RDI = respiratory disturbance index.

Figure 1. Interval-specific MAP frequencies of sleep center population in each of the 10 MAP deciles. The distribution is right skewed, with a heavy proportion of the sleep clinic population having MAP scores exceeding a value of 0.6.
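The two-stage rule, the 125-point parameter grid, and the bootstrap averaging of optima described in the Methods can be sketched as follows. This is an illustration under stated assumptions rather than the authors' SAS code: the inclusive comparisons at UB, LB, and ODI threshold, the criterion function (highest sensitivity, ties broken by specificity), the scoring of undefined ratios as 0.0, and the synthetic patients are all assumptions made here. The paper's own criterion functions are not given in this text.

```python
import itertools
import random

# Grid from the text: 5 x 5 x 5 = 125 parameter sets.
UBS = [0.5, 0.6, 0.7, 0.8, 0.9]
LBS = [0.1, 0.2, 0.3, 0.4, 0.5]
ODI_THRESHOLDS = [5, 10, 15, 20, 25]
GRID = list(itertools.product(UBS, LBS, ODI_THRESHOLDS))


def two_stage_predict(map_score, odi, ub, lb, odi_threshold):
    """True if the two-stage rule predicts OSA (inclusive bounds are an assumption)."""
    if map_score >= ub:   # stage 1: high questionnaire score
        return True
    if map_score <= lb:   # stage 1: low questionnaire score
        return False
    return odi >= odi_threshold  # stage 2: oximetry decides


def sens_spec(patients, params, rdi_cut):
    """Sensitivity and specificity of one parameter set against PSG (RDI >= rdi_cut)."""
    ub, lb, thr = params
    tp = fp = tn = fn = 0
    for map_score, odi, rdi in patients:
        predicted = two_stage_predict(map_score, odi, ub, lb, thr)
        actual = rdi >= rdi_cut
        if predicted and actual:
            tp += 1
        elif predicted:
            fp += 1
        elif actual:
            fn += 1
        else:
            tn += 1
    # An undefined ratio (no cases or no non-cases in a resample) is scored 0.0.
    sens = tp / (tp + fn) if tp + fn else 0.0
    spec = tn / (tn + fp) if tn + fp else 0.0
    return sens, spec


def bootstrap_optimum(patients, rdi_cut, n_resamples=200, seed=0):
    """Average the best parameter sets found across bootstrap resamples."""
    rng = random.Random(seed)
    optima = []
    for _ in range(n_resamples):
        resample = rng.choices(patients, k=len(patients))  # sample with replacement
        scores = {p: sens_spec(resample, p, rdi_cut) for p in GRID}
        # Hypothetical criterion: highest sensitivity, ties broken by specificity.
        best = max(scores.values())
        optima.extend(p for p, s in scores.items() if s == best)
    return tuple(sum(p[i] for p in optima) / len(optima) for i in range(3))


# Synthetic (map_score, odi, rdi) triples, for illustration only.
demo_rng = random.Random(1)
patients = []
for _ in range(60):
    rdi = demo_rng.uniform(0, 60)
    map_score = min(1.0, max(0.0, rdi / 60 + demo_rng.uniform(-0.3, 0.3)))
    odi = max(0.0, rdi + demo_rng.uniform(-5, 5))
    patients.append((map_score, odi, rdi))

print(len(GRID))
print(bootstrap_optimum(patients, rdi_cut=5, n_resamples=20))
```

Because several parameter sets can tie for the optimum within one resample, the number of sets averaged can exceed the number of resamples, which matches the sample counts reported in Table 4.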

TABLE 2. EFFECT OF LOWER BOUND ON SENSITIVITY AND SPECIFICITY, USING RDI ≥ 5/h OR ≥ 30/h TO DEFINE OSA (Upper bound = 0.7)

               RDI ≥ 5/h, ODI threshold = 15/h    RDI ≥ 30/h, ODI threshold = 15/h
Lower Bound    Sensitivity    Specificity         Sensitivity    Specificity
0.1            0.724          0.836               0.958          0.647
0.2            0.720          0.836               0.958          0.651
0.3            0.716          0.836               0.950          0.651
0.4            0.716          0.845               0.950          0.656
0.5            0.696          0.845               0.916          0.660

Definition of abbreviations: ODI = oximetry desaturation index; OSA = obstructive sleep apnea; RDI = respiratory disturbance index.

Figure 4. Interaction between upper bound and ODI threshold in predicting RDI ≥ 30 events/h for a representative lower bound value of 0.3. As upper bound is increased from 0.5 to 0.9 (x-axis), we see greater changes in sensitivity (upper panel) and specificity (lower panel) when ODI threshold is high at 25/h (solid lines), compared with when ODI threshold is low, at 5/h (dashed lines).

We varied UB from 0.5 to 0.9 and LB from 0.1 to 0.5 in increments of 0.1, and ODI threshold from 5 to 25 events/h in increments of 5. We computed sensitivity and specificity for these 125 combinations of UB, LB, and ODI threshold.

Effect of LB on sensitivity and specificity. As Table 2 shows, for a given ODI threshold (15/h), if UB is kept constant at a specific value (0.7), varying LB has a negligible effect on either the sensitivity or the specificity of the algorithm compared with PSG for identifying patients with RDI ≥ 5 events/h (left two columns). The same holds for RDI ≥ 30 events/h (right two columns), and for all levels of ODI threshold and every value of UB that we studied (data not shown).

Effect of UB and ODI threshold on sensitivity and specificity. The effect of varying UB is less straightforward, but still consistent and predictable (Figures 3 and 4). The effect of UB interacts with the effect of ODI threshold. First, we discuss the case when RDI ≥ 5/h is used to define apnea.
UB has a negligible effect on sensitivity when ODI threshold is low. As an example, for ODI threshold = 5/h and LB = 0.3, as UB is increased from 0.5 to 0.9, the sensitivity drops very little (Figure 3, upper panel, dashed black line). At larger values of ODI threshold, however, the effect of raising UB on sensitivity increases considerably. For example, at ODI threshold = 25 events/h, increasing UB from 0.5 to 0.9 while LB = 0.3 results in a dramatic decline in sensitivity (Figure 3, lower panel, solid black line) and a dramatic increase in specificity (Figure 3, lower panel, solid gray line). Moreover, the effect of varying UB on sensitivity is less pronounced than its effect on specificity for any value of ODI threshold. An interaction between ODI threshold and LB was not observed (data not shown). Table 3 (left four columns) provides a numerical synopsis of these data.

Figure 3. Interaction between upper bound and ODI threshold in predicting RDI ≥ 5 events/h for a representative lower bound value of 0.3. Specificity (lower panel) and particularly sensitivity (upper panel) change more prominently with upper bound when ODI threshold is 25 events/h (solid lines), compared with the lower value of 5 events/h (dashed lines).

When RDI ≥ 30/h is used to define apnea (Figure 4), the interaction between UB and ODI threshold was still present. The effect of upper bound on sensitivity (upper panel) and specificity (lower panel) was more pronounced at the higher ODI threshold value of 25/h than at the lower value of 5/h (compare solid lines with dashed lines). The comparative results of these two definitions of OSA (RDI ≥ 5/h and ≥ 30/h) are shown numerically in Table 3.

Prevalence of OSA by MAP Decile

As Figure 5 (white bars) shows, apnea prevalence (RDI ≥ 5/h) increases with MAP score, ranging from 0% (for MAP 0.0–0.1) to 93% (for MAP 0.8–0.9). The overall prevalence of OSA in the population as a whole was 69%. The prevalence of severe apnea (RDI ≥ 30 events/h, solid bars) also increased with MAP score.
For those with MAP ≥ 0.4, the prevalence of severe apnea ranged from 12% to 71%, while only one patient with an MAP score below 0.4 had severe OSA.

TABLE 3. INTERACTION BETWEEN UPPER BOUND AND ODI THRESHOLD, USING RDI ≥ 5/h OR ≥ 30/h TO DEFINE OSA (Lower bound = 0.3)

               RDI ≥ 5/h                            RDI ≥ 30/h
               ODI thr. 5/h     ODI thr. 25/h       ODI thr. 5/h     ODI thr. 25/h
Upper Bound    Sens.   Spec.    Sens.   Spec.       Sens.   Spec.    Sens.   Spec.
0.5            0.932   0.600    0.880   0.627       0.983   0.336    0.966   0.394
0.6            0.916   0.682    0.796   0.709       0.983   0.390    0.958   0.515
0.7            0.876   0.800    0.672   0.855       0.975   0.481    0.899   0.680
0.8            0.840   0.873    0.512   0.936       0.975   0.552    0.832   0.851
0.9            0.816   0.909    0.400   0.982       0.975   0.593    0.790   0.967

Definition of abbreviations: ODI = oximetry desaturation index; ODI thr. = ODI threshold; OSA = obstructive sleep apnea; RDI = respiratory disturbance index; Sens. = sensitivity; Spec. = specificity.

Optimal Model Parameters Obtained by the Bootstrapping Technique

Using our criterion functions for each of our two objectives, we obtained the unbiased optimal parameter set. For Objective 1, this set was UB = 0.58 (range 0.5–0.8), LB = 0.14 (range 0.1–0.2), and ODI threshold = 5.02 desaturations/h (range 5–10). For Objective 2, that is, to detect those with RDI ≥ 30 events/h who would be suitable for split-night studies, the unbiased optimal parameter set was UB = 0.9 (range 0.9–0.9), LB = 0.38 (range 0.2–0.5), and ODI threshold = 21/h (range 10–25). Interestingly, the entire bootstrap distribution of UB values was fixed at 0.9. The results are summarized in Table 4. Table 4 also shows the total number of parameter sets resulting from the 200 bootstrap resamples, which were averaged to obtain the unbiased optimal parameter set for each of our two objectives.

Model Validation in a Subset of Clinic Patients Using Parameters Obtained by Bootstrapping

Using our unbiased estimates of each of the two optimal parameter sets, we computed model accuracy in our validation sample, which consisted of 75 patients (20%) from the original clinic population (Table 5). For Objective 1, identifying all potential cases of apnea, the model shows a sensitivity of 95%, with a negative predictive value of 87%. The specificity and positive predictive value are 68% and 86%, respectively. These values are given in Table 5 (top), along with their 2.5% lower and 97.5% upper confidence limits. The misclassification rate, defined as the number of false positive plus false negative results divided by the total sample size (n = 75), was 17%. The false negative rate was very low at 7.9%, and the false positive rate was 9.2%.

For Objective 2, identifying patients with severe apnea (RDI ≥ 30/h) for split-night studies, we found that the model had a specificity of 97% and a positive predictive value of 94%. The sensitivity was 85%, and the negative predictive value was 92%. These values are also given in Table 5 (bottom). The false negative and false positive rates were 5.3% and 3.9%, respectively. The 20% validation sample yielded estimates of accuracy comparable to those obtained for the 80% estimation sample for both objectives. (Compare the first and fourth columns of Table 5.)

Comparison of Prevalence of OSA in Estimation and Validation Samples

The prevalence of OSA (RDI ≥ 5/h) in the estimation, validation, and total sample was 70%, 68%, and 69%, respectively.
Using RDI ≥ 30/h, these respective values were 35%, 24%, and 33%. Thus the three groups had similar prevalences of apnea and severe apnea.

TABLE 4. ESTIMATES OF MODEL PARAMETERS (UB, LB, ODI) OBTAINED USING BOOTSTRAPPING METHODOLOGY ON AN 80% RANDOM SAMPLE OF SUBJECTS

Parameter    Bootstrap Mean*    Min*     Max*     Number of Samples
Objective 1: Selection of Subjects for In-lab Sleep Study
  UB         0.58               0.50     0.80     402
  LB         0.14               0.10     0.20     402
  ODI        5.02               5.00     10.00    402
Objective 2: Selection of Subjects for Split-night Testing
  UB         0.90               0.90     0.90     237
  LB         0.38               0.20     0.50     237
  ODI        21.14              10.00    25.00    237

Definition of abbreviations: LB = lower bound; ODI = threshold value on overnight oximetry used to define a positive study; UB = upper bound.
* Bootstrap mean, min, and max are calculated from each of 200 bootstrap resamples.
Number of samples: number of parameter combinations obtained after 200 bootstrap iterations. Because some iterations produced more than one optimal parameter combination, the number of samples used to compute each of the three bootstrap means exceeds the number of iterations, 200.

TABLE 5. ACCURACY OF ALGORITHMS USING OPTIMAL PARAMETER SETS TO PREDICT PRESENCE OF OSA OR SEVERE OSA

               Estimation Sample*    2.5% LCL–97.5% UCL    Validation Sample
Objective 1: Selection of Subjects for In-lab Sleep Study
  Sensitivity  0.948                 (0.913–0.975)         0.941
  Specificity  0.684                 (0.575–0.806)         0.667
  PPV          0.861                 (0.810–0.919)         0.857
  NPV          0.865                 (0.790–0.935)         0.842
Objective 2: Selection of Subjects for Split-night Testing
  Sensitivity  0.845                 (0.769–0.925)         0.833
  Specificity  0.967                 (0.938–0.989)         0.947
  PPV          0.935                 (0.880–0.977)         0.833
  NPV          0.920                 (0.879–0.958)         0.947

Definition of abbreviations: LCL = lower confidence limit; NPV = negative predictive value; OSA = obstructive sleep apnea; PPV = positive predictive value; UCL = upper confidence limit.
* Values obtained by averaging results of 200 resamples from the 80% estimation sample.
Validation sample: the bootstrap mean values of UB, LB, and ODI threshold shown in Table 4 were applied to the 20% of the clinic population reserved for the validation procedure to obtain sensitivity, specificity, and positive and negative predictive values.

Figure 5. Prevalence of OSA by MAP interval in clinic population. OSA prevalence is high in nearly all intervals in the sleep center population, and particularly so in those with MAP scores above 0.6. This was true regardless of the cutpoint for OSA: RDI ≥ 5 events/h or ≥ 30 events/h.

DISCUSSION

Our first objective was to identify subjects who do not need a sleep study, which our model accomplished with 95% sensitivity. We reviewed the five cases of apnea that were missed by our algorithm, out of a total of 51 patients with apnea, in the model validation population (n = 75). The individual RDI scores of these five patients were 5, 5, 6, 10, and 11 events/h. Thus, all missed cases had mild disease. The number of such missed cases is a function of the parameters we chose. Our specific parameter selection criteria resulted in an optimal LB value of 0.1. However, review of Figure 5 shows that most cases of OSA are associated with MAP ≥ 0.4. If we use LB = 0.4 as the optimal parameter, we exclude 73/359 = 20% of subjects from unneeded sleep studies, and if we use LB = 0.1, we exclude only 25/359 = 8% of subjects, none of whom had apnea. Excluding this additional 12% of subjects by using LB = 0.4 instead of 0.1 comes at the cost of missing 13 additional cases of apnea, with RDI values of 7–23/h for all but one, who had an RDI of 73/h. Thus, most patients who are missed when we use LB = 0.4 still have mild to moderate apnea (30). However, missing these additional cases might prove to be expensive in the long run, as evidence suggests that patients who have even mild OSA may be at risk for adverse effects, including hypertension (31, 32) and vehicular crashes (33), and may also benefit from therapy (34).

These results suggest that although our risk stratification algorithm works well, it does not remove the need for sleep studies in many patients. This is hardly surprising, as the patients being evaluated had come with the clinical suspicion of sleep apnea, and the prevalence of the disorder in this sample is high (69%). This particular aspect of our algorithm might therefore have more utility if applied as a screening tool in a more general population. This is an area for future study. Importantly, this finding highlights the heterogeneity of clinical presentation among those with milder forms of sleep apnea, wherein subjects with similar disease severity may present with quite variable symptoms.

Identification of severe cases of OSA was our second objective, and our specificity was 97%. We correctly identified 15 of the 18 patients in the validation sample with RDI ≥ 30 events/h, and 54 of the 57 patients with RDI < 30 events/h. We reviewed the RDI scores of patients incorrectly predicted to have severe apnea by our algorithm. Only three patients were incorrectly identified as having severe apnea, and these had RDI scores of 1, 20, and 25 events/h.
Thus, two of these three subjects had at least moderate apnea, characterized by an RDI between 15 and 30 events/h. In addition, three cases of severe apnea were not prioritized by the model to undergo PSG, and these had RDI values of 34/h, 42/h, and 66/h. However, they would have been selected for PSG when the first model is applied, as they had MAP scores of 0.67, 0.80, and 0.71, respectively, all in the high MAP group for the first model. Hence, the diagnosis would not have been missed in any of them.

Thus, the risk stratification algorithm is more useful for prioritizing patients to receive sleep studies than for excluding apnea. Given its value, it is reasonable to consider whether we could select the range of MAP values that is critical. In our model, we found that patients who have an MAP ≥ 0.9 should have sleep studies, independent of oximetry results. In the clinic population as a whole, 13 (4%) of 359 patients had MAP ≥ 0.9, and 8 of these (62%) had RDI ≥ 30/h. In addition, 11 of the 13 (85%) had RDI ≥ 5/h. If we instead used MAP ≥ 0.8, we would find that 82 of 359, or 23%, had MAP ≥ 0.8, of whom 47 of 82 (57%) had RDI ≥ 30/h, and 75 of 82 (91%) had RDI ≥ 5/h. This is a substantial number of subjects. Thus, we believe that it is reasonable to propose that the sleep center use the optimal model parameters outlined here, or alternatively send the larger number of patients with MAP ≥ 0.8 for full sleep studies. The latter represents a slightly adjusted algorithm, but could be used if overnight oximetry were not available, and moreover would remove the costs of this additional test. Although these diagnostic rates are reasonably accurate, techniques other than MAP and oximetry may yield even better results, including other questionnaires (16, 35–40), relative proportions of intraoral measurements (41), or other simpler systems to monitor for respiratory disturbances that include readings of nasal pressure (42).
In interpreting these results, we note that oximetry was conducted concurrently with PSG, raising the concern that the two tests could not be scored independently. However, a second observer who had no knowledge of the PSG results scored the oximetry records. Additionally, random rescoring of npo tracings by a second as well as by the original interpreter showed no significant differences in scores. Test–retest and interrater reliabilities (TRT and IRR) (43), measured using intraclass correlation coefficients, were high: 0.997 and 0.981, respectively. Performing both npo and PSG together had benefits, such as ensuring that the individual was sleeping in an identical position and was in an identical stage of sleep for both studies. Moreover, a technician was present to correct signal errors immediately. Future investigations should assess the predictive value of unattended oximetry conducted in the patient's home.

In our sensitivity analysis of oximetry, we observed the trends in sensitivity and specificity when we used a 3% drop in saturation instead of 4% to define events. We found negligible differences in sensitivity and specificity trends using this alternative method of interpreting oximetry data (data not shown).

Another strength of our study was that the prevalence of apnea was similar in the validation sample, in the estimation sample, and in the total sample, for either apnea definition (RDI ≥ 5/h or RDI ≥ 30/h). This suggests that the validation sample was representative of both the estimation sample and the total sample.

In our algorithm, we explored the value of questionnaire and oximetry, but other techniques may yield even better results, including other questionnaires (16, 35–40). Relative proportions of intraoral measurements combined with BMI (41) have yielded an AUC of 0.996, with BMI having an exceptionally strong predictive value in that study (AUC of 0.938), compared with the value of 0.734 found by us previously (16).
Other simpler systems to monitor for respiratory disturbances include readings of nasal pressure (42), which yielded a sensitivity of 100%, specificity of 92%, positive predictive value of 92%, and negative predictive value of 100%, as compared with PSG. These values were better than those of npo, which produced sensitivity of 75%, specificity of 85%, positive predictive value of 75%, and negative predictive value of 89%. An approach that combines both nasal pressure measurement and oximetry may therefore be the optimal one for the second stage of our model.

In conclusion, our algorithm offers simplicity, feasibility, and sufficient accuracy. The model can omit unnecessary sleep studies in subjects who are unlikely to have OSA, and prioritize high-risk patients to receive early testing. The former reduces the number of sleep studies by only 8–12%, given the high prevalence of apnea in a sleep center population. It is likely to have greater utility in more general populations. The second objective addresses a larger percentage of the population (up to 23%), and gives sleep specialists a tool to identify high-risk patients who should be studied quickly. The optimal tests to be used in this two-stage method have yet to be determined. As diagnostic technology develops, we anticipate that this algorithm can be further refined.

References

1. Flemons W, Tsai W. Quality of life consequences of sleep-disordered breathing. J Allergy Clin Immunol 1997;99:S750–S756.
2. Olson L, King M, Hensley M, Saunders N. A community study of snoring and sleep-disordered breathing. Health outcomes. Am J Respir Crit Care Med 1995;152:717–720.

3. Phillipson E. Sleep apnea: a major public health problem. N Engl J Med 1993;328:1271–1273.
4. Shepard JJ. Hypertension, cardiac arrhythmias, myocardial infarction, and stroke in relation to obstructive sleep apnea. Clin Chest Med 1992;13:437–458.
5. Baumel M, Maislin G, Pack A. Population and occupational screening for obstructive sleep apnea: are we there yet? Am J Respir Crit Care Med 1997;155:9–14.
6. Pack A. Obstructive sleep apnea. Adv Intern Med 1994;39:517–567.
7. Pouliot Z, Peters M, Neufeld H, Kryger M. Using self-reported questionnaire data to prioritize OSA patients for polysomnography. Sleep 1997;20:232–236.
8. Recommendations of criteria for measurements, definitions and severity ratings of sleep related breathing disorders in adults. Report of an American Sleep Disorders Association Task Force. Chicago, 1998.
9. Hochberg M, Schmidt M, Funk K, Sutton J, Stevens M. Validity and yield of a two-stage screening procedure for systemic lupus erythematosus. Clin Exp Rheumatol 1983;1:67–71.
10. Martin D, Blackburn N, O'Connell K, Brant E, Goetsch E. Evaluation of the World Health Organisation antibody-testing strategy for the individual patient diagnosis of HIV infection (strategy III). S Afr Med J 1995;85:877–880.
11. Allen M, Embry B. Biopsy of the prostate guided by transrectal ultrasonography: early experience in a teaching community hospital. S Med J 1991;84:579–586.
12. Winawer S, Sherlock P. Surveillance for colorectal cancer in average-risk patients, familial high-risk groups, and patients with adenomas. Cancer 1982;50:2609–2614.
13. Zavertnik J, McCoy C, Robinson D, Love N. Cost-effective management of breast cancer. Cancer 1992;69:1979–1984.
14. Bower S, Bewley S, Campbell S. Improved prediction of preeclampsia by two-stage screening of uterine arteries using the early diastolic notch and color Doppler imaging. Obstet Gynecol 1993;82:78–83.
15. Sen B, Wilkinson G, Mari J. Psychiatric morbidity in primary health care. A two-stage screening procedure in developing countries: choice of instruments and cost-effectiveness. Br J Psychol 1987;151:33–38.
16. Maislin G, Pack A, Kribbs N, Smith P, Schwartz A, Kline L, Schwab R, Dinges D. A survey screen for prediction of apnea. Sleep 1995;18:158–166.
17. Series F, Marc I, Cormier Y, La Forge J. Utility of nocturnal home oximetry for case finding in patients with suspected sleep apnea hypopnea syndrome. Ann Intern Med 1993;119:449–453.
18. Yamashiro Y, Kryger M. Nocturnal oximetry: is it a screening tool for sleep disorders? Sleep 1995;18:167–171.
19. Levy P, Pepin J, Deschaux-Blanc C, Paramelle B, Brambilla C. Accuracy of oximetry for detection of respiratory disturbances in sleep apnea syndrome. Chest 1996;109:395–399.
20. Lacassagne L, Didier A, Murris-Espin M, Charlet J, Chollet P, Leophonte-Domairon M, Tiberge M, Pessey J, Leophonte P. Role of nocturnal oximetry in screening for sleep apnea syndrome in pulmonary medicine. Study of 329 patients. Rev Mal Respir 1997;14:201–207.
21. Rechtschaffen A, Kales A. A manual of standardized terminology techniques and scoring system for sleep stages of human subjects. Washington, DC: US Government Printing Office; 1968. Publication No. 204.
22. Sleep-Related Breathing Disorders in Adults: Recommendations for Syndrome Definition and Measurement Techniques in Clinical Research. The Report of an American Academy of Sleep Medicine Task Force. Sleep 1999;22:667–689.
23. Department of Radiology. Biological Sciences Division. University of Chicago Hospitals and Clinics. ROCKIT Software. Chicago, IL: Available ftp at random.bsd.uchicago.edu; 1998.
24. Hanley J, McNeil B. The meaning and use of area under a receiver operating characteristic (ROC) curve. Radiology 1982;143:29–36.
25. Harrell F, Lee K, Mark D. Tutorial in biostatistics: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Stat Med 1996;15:361–387.
26. Beck J, Shultz E. The use of relative operating characteristic (ROC) curves in test performance evaluation. Arch Pathol Lab Med 1986;110:13–20.
27. Efron B, Tibshirani R. Bootstrap methods for standard errors, confidence intervals, and other measures of statistical accuracy. Stat Sci 1986;1:54–77.
28. Efron B, Tibshirani R. An introduction to the bootstrap. Monographs on statistics and applied probability 57. New York: Chapman & Hall; 1993.
29. Yamashiro Y, Kryger M. CPAP titration for sleep apnea using a split-night protocol. Chest 1995;107:62–66.
30. Quan S, Howard B, Iber C, Kiley J, Nieto F, O'Connor G, Rapoport D, Redline S, Robbins J, Samet J, Wahl P. The Sleep Heart Health Study: design, rationale, and methods. Sleep 1997;20:1077–1085.
31. Nieto F, Young T, Lind B, Shahar E, Samet J, Redline S, D'Agostino R, Newman A, Lebowitz M, Pickering T. Association of sleep-disordered breathing, sleep apnea, and hypertension in a large community-based study. Sleep Heart Health Study. JAMA 2000;283:1829–1836.
32. Young T, Blustein J, Finn L, Palta M. Sleep-disordered breathing and motor vehicle accidents in a population-based sample of employed adults. Sleep 1997;20:608–613.
33. Engleman H, Kingshott R, Wraith P, Mackay T, Deary I, Douglas N. Randomized placebo-controlled crossover trial of continuous positive airway pressure for mild sleep apnea/hypopnea syndrome. Am J Respir Crit Care Med 1999;159:461–467.
34. Flemons W, Remmers J. The diagnosis of sleep apnea: questionnaires and home studies. Sleep 1996;19:S243–S247.
35. Kump K, Whalen C, Tishler P, Browner I, Ferrette V, Strohl K, Rosenberg C, Redline S. Assessment of the validity and utility of a sleep-symptom questionnaire. Am J Respir Crit Care Med 1994;150:735–741.
36. Kapuniai L, Andrew D, Crowell D, Pearce J. Identifying sleep apnea from self-reports. Sleep 1988;11:430–436.
37. Haraldsson P, Carenfelt C, Knutsson E, Persson H, Rinder J. Preliminary report: validity of symptom analysis and daytime polysomnography in diagnosis of sleep apnea. Sleep 1992;15:261–263.
38. Viner S, Szalai J, Hoffstein V. Are history and physical examination a good screening test for sleep apnea? Ann Intern Med 1991;115:356–359.
39. Hoffstein V, Szalai J. Predictive value of clinical features in diagnosing obstructive sleep apnea. Sleep 1993;16:118–122.
40. Kushida C, Efron B, Guilleminault C. A predictive morphometric model for the obstructive sleep apnea syndrome. Ann Intern Med 1997;127:581–587.
41. Bradley P, Mortimore I, Douglas N. Comparison of polysomnography with ResCare Autoset in the diagnosis of the sleep apnoea/hypopnoea syndrome. Thorax 1995;50:1201–1203.
42. Fleiss J. The design and analysis of clinical experiments. New York: John Wiley & Sons; 1986.