Estimating the Prevalence of Drug Use from Self-Reports in a Cohort for which Biologic Data are Available for a Subsample


American Journal of Epidemiology, Vol. 144, No. 4. Copyright © 1996 by The Johns Hopkins University School of Hygiene and Public Health. All rights reserved. Printed in U.S.A.

W. Kenneth Poole,¹ Patrick M. Flynn,² A. Vijaya Rao,¹ and Philip C. Cooley³

Diagnostic procedures, used singly or in combination, are crucial in the determination of the presence and prevalence of medical and other conditions. In the absence of a "gold standard," two or more measures or diagnostic tests are often available that may be used to estimate true prevalence. The authors have developed a statistical method with which to calculate more precise estimates of a condition in the presence of two diagnostic measures, one measurement being performed on the entire study sample and a second, more precise one being made in a random sample of the study sample. This method uses the well-known equations that express the probabilities of the four possible outcomes of the two measures in terms of the sensitivities and specificities of the measures and the prevalence of the condition, together with some properties of maximum likelihood estimates, to obtain an expression for the estimated true prevalence and its precision. The method is illustrated by applying it to data collected by urinalysis and self-report in 1992-1993 in a national multisite study, the Cocaine Treatment Outcome Study. Through application of this methodology, a more precise estimate of the true prevalence of substance use can be obtained from two measures, one biologic and the other self-reported. Detailed equations and expressions are provided so that the method can be applied in other situations where diagnostic data from two different sources or procedures are available. Am J Epidemiol 1996;144:413-20.
epidemiologic methods; prevalence; sensitivity and specificity; statistics; substance abuse; urinalysis

Received for publication July 17, 1995, and in final form December 20, 1995.
Abbreviations: EMIT, enzyme-multiplied immunoassay technique; GC/MS, gas chromatography/mass spectrometry; MLE, maximum likelihood estimate.
¹ Statistics Research Division, Research Triangle Institute, Research Triangle Park, NC.
² Substance Abuse Treatment Research Program, Research Triangle Institute, Research Triangle Park, NC.
³ Computer Applications and Design Center, Research Triangle Institute, Research Triangle Park, NC.
Reprint requests to Dr. W. Kenneth Poole, Statistics Research Division, Research Triangle Institute, 3040 Cornwallis Road, Research Triangle Park, NC 27709-2194.

In the study of substance-abusing populations, biologic measures have long been argued to be the "gold standard" for measurement of illicit substance use (1). Despite recent cautions about the validity of self-reports of substance use (2), which are often more practical and less costly measures of use, self-reports have become virtually the sole measures reported in the literature in determining the outcomes of treatment efforts, as was the case for the recent California outcome study (3). Some treatment outcome studies (4, 5) use both biologic and self-report measures and improved validity techniques, and use the biologic measure (urinalysis) as a criterion for validating self-reports of substance use. In these situations, an issue then arises as to how to approach discrepancies and disagreement between the two measures and provide a better estimate of use. Because neither measure (biologic or self-report) is an exact and absolute standard, a rational approach would be to consider both and develop an estimate using both sources of data.
As in the case of psychodiagnostic tests, the sensitivity and specificity of the gold standard criterion (in this case, the biologic measure or urinalysis results) are important factors if one wishes to estimate the true prevalence of a characteristic such as illicit substance use. In this paper, we use two diagnostic procedures in estimating the true prevalence of cocaine use in a cohort of participants in an outcome study of cocaine users who received treatment (5). Self-reports were collected from the entire study sample, and data from urine testing were collected for a random subsample of the study sample. The urine test data are used to "correct" the self-reported estimates of posttreatment cocaine use for the larger group (or total sample) of follow-up participants.

These data were obtained from the Cocaine Treatment Outcome Study (5) (funded by the National Institute on Drug Abuse), a retrospective study of persons with a primary diagnosis of cocaine dependence. The purpose of the study was to determine relapse outcomes for the cocaine users during the first year after discharge from treatment. Subjects included 772 clients discharged from 23 community-based treatment programs (nine long-term residential, five short-term inpatient, and nine outpatient programs) in seven US cities. The study used a retrospective design to investigate 12-month posttreatment outcomes. Baseline data were obtained on a sample of all primary cocaine patients who were discharged from participating treatment programs between March and November of 1992. Both planned and actual time spent in treatment, or length of stay in treatment prior to discharge, varied by treatment modality, clinic, and patient. The subjects were traced, located, and interviewed only once, approximately 1 year after their discharge from treatment. From this field data collection effort, there were 772 completed interviews.

Upon completion of each follow-up interview, field staff determined whether the subject was in the 50 percent subsample that had been selected for urine testing. Informed consent was then obtained, and urine specimens were collected from those who had voluntarily given their consent. There were 382 subjects selected for urine testing within the group of 772 who had completed 1-year follow-up interviews. Thus, data points for each of the 772 subjects consisted of baseline information obtained from clinic records, a posttreatment 12-month follow-up interview, and urinalysis data for those who provided a urine sample.

Initial descriptive and univariate analyses of treatment outcomes based on the self-reports were promising and provided evidence of positive individual and societal outcomes for cocaine users treated in these programs. Data from urinalysis testing in the random sample of follow-up subjects showed approximately 70 percent agreement with self-reports of cocaine use during the 72 hours prior to the follow-up interview.
This paper provides an overview of a statistical method for estimating the true prevalence of cocaine use from urine-test and self-report data, where the urine testing is carried out in a random sample of clients for whom the self-report data are available. Detailed information on the study, as well as demographic and other information on the research subjects, is available from the authors upon request.

METHODS

We use the following notation and terminology in this paper:

$\pi$ = the true prevalence of cocaine use;
$S_{es}$ = the sensitivity of self-reports;
$S_{ps}$ = the specificity of self-reports;
$S_{eu}$ = the sensitivity of urine tests; and
$S_{pu}$ = the specificity of urine tests,

where sensitivity is the probability that a user will be classified as a user by the method and specificity is the probability that a nonuser will be classified as a nonuser. A period in a subscript denotes summation over that subscript (e.g., $P_{.j} = P_{1j} + P_{2j}$). The key parameters (data) for estimating the sensitivities, specificities, and prevalences are presented in table 1, where $P_{ij}$ is the expected proportion of participants in the $(ij)$th cell. For example, $P_{11}$ is the proportion who test positive on the urine test and also report drug use on the self-report. The estimating equations that were important in this study are as follows:

$$
\begin{aligned}
P_{11} &= \pi S_{es} S_{eu} + (1 - \pi)(1 - S_{ps})(1 - S_{pu}); \\
P_{12} &= \pi (1 - S_{es}) S_{eu} + (1 - \pi) S_{ps} (1 - S_{pu}); \\
P_{21} &= \pi S_{es} (1 - S_{eu}) + (1 - \pi)(1 - S_{ps}) S_{pu}; \\
P_{22} &= \pi (1 - S_{es})(1 - S_{eu}) + (1 - \pi) S_{ps} S_{pu}.
\end{aligned}
\tag{1}
$$

These equations are derived by relating the probability that a participant will fall into a particular cell of table 1 to the sensitivity, specificity, and prevalence. For example, the probability that an individual will fall into cell (1, 1) is the probability that the individual is a cocaine user ($\pi$) and is correctly classified by both the urine test and the self-report ($S_{es} \times S_{eu}$), plus the probability that the individual is a nonuser ($1 - \pi$) and is incorrectly classified by both procedures ($[1 - S_{ps}] \times [1 - S_{pu}]$).
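As a quick numerical check of equation system 1, the following minimal Python sketch evaluates the four cell probabilities for one parameter set and confirms that they sum to 1. The function name `cell_probabilities` and the input values are illustrative assumptions, not part of the original analysis.

```python
def cell_probabilities(pi, s_es, s_ps, s_eu, s_pu):
    """Return (P11, P12, P21, P22) from equation system 1.

    First index: urine test (1 = positive, 2 = negative).
    Second index: self-report (1 = yes, 2 = no).
    """
    p11 = pi * s_es * s_eu + (1 - pi) * (1 - s_ps) * (1 - s_pu)
    p12 = pi * (1 - s_es) * s_eu + (1 - pi) * s_ps * (1 - s_pu)
    p21 = pi * s_es * (1 - s_eu) + (1 - pi) * (1 - s_ps) * s_pu
    p22 = pi * (1 - s_es) * (1 - s_eu) + (1 - pi) * s_ps * s_pu
    return p11, p12, p21, p22


# Hypothetical parameter values, chosen only for illustration.
probs = cell_probabilities(pi=0.40, s_es=0.25, s_ps=0.99, s_eu=0.77, s_pu=0.95)
total = sum(probs)  # the four cells partition the sample, so this should be 1
```

Summing the first two cells gives the urine-positive margin, which depends only on the prevalence and the urine-test parameters; this is the property the method relies on when the urine-test sensitivity and specificity are treated as known.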
The other equations are similarly derived. It is immediately seen that this general formulation presents a problem. First, there are five parameters and only four equations. Furthermore, because the $P_{ij}$'s must total 1, there are only three independent equations. Hence, we really have five parameters and only three equations. This means that constraints must be placed on some of the parameters in order to estimate the others. Our choice is to assume that the sensitivity and specificity of the urine test are known. This will permit the unique estimation of the prevalence of use, as well as the sensitivity and specificity of the self-reported data.

TABLE 1. Joint distribution of results from self-reports of cocaine use and urine testing, Cocaine Treatment Outcome Study, 1992-1993

                     Urine test
Self-report    Positive   Negative   Total
Yes            P11        P21        P.1
No             P12        P22        P.2
Total          P1.        P2.        1

One further point is that if the left-hand sides of the expressions in the system of equations 1 are replaced by maximum likelihood estimates (MLEs) of the $P_{ij}$'s, then the solutions for $\pi$, $S_{es}$, and $S_{ps}$ (assuming that $S_{eu}$ and $S_{pu}$ are known) will be their MLEs, because functions of MLEs are MLEs. This permits large sample variances to be computed for the estimates by calculation of the information matrix (which we discuss elsewhere in this paper).

If the first and second expressions or the first and third expressions in equation system 1 are added, we obtain

$$
P_+ = \pi S_e + (1 - \pi)(1 - S_p), \tag{2}
$$

where $P_+$ = the proportion reported positive by the method, $S_e$ = the sensitivity of the method, and $S_p$ = the specificity of the method. Whence,

$$
\pi = \frac{P_+ - (1 - S_p)}{S_e + S_p - 1}, \tag{3}
$$

thus giving us the relation between the prevalence and the proportion reporting positive, the sensitivity, and the specificity. Therefore, if estimates of $P_+$, $S_p$, and $S_e$ are available, the prevalence may be estimated. If $S_p$ and $S_e$ are known fixed quantities and $P_+$ is estimated by the observed proportion positive in the sample for the method, then the estimate of prevalence is

$$
\hat{\pi} = \frac{\hat{P}_+ - (1 - S_p)}{S_e + S_p - 1} \tag{4}
$$

and the variance estimate of this estimate is

$$
\widehat{\mathrm{Var}}(\hat{\pi}) = \frac{\hat{P}_+ (1 - \hat{P}_+)}{N (S_e + S_p - 1)^2}, \tag{5}
$$

where the circumflex denotes an estimate and $N$ is the sample size on which $\hat{P}_+$ is based.

We note, as have others (e.g., Gart and Buck (6)), that prevalence estimates of this form are not range-preserving (i.e., are not necessarily between 0 and 1). In extreme cases, negative estimates or estimates greater than 1 may result. However, in our area of application, we feel that this will not generally be a problem, since both of our tests (i.e., self-reports and biologic data) will normally have high specificities; the Youden index, $S_e + S_p - 1$, will be positive; and the proportion scoring positive on the test will not be greater than the sensitivity of the test.
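Equations 4 and 5 are simple enough to sketch directly. The Python fragment below is a minimal illustration, assuming that sensitivity and specificity are known constants; the function name `prevalence_estimate` and the guard against a non-positive Youden index are our own additions. The inputs are the urine-test figures quoted in the Results section of this paper (109 positives among 296 specimens, sensitivity 0.77, specificity 0.95).

```python
import math


def prevalence_estimate(p_pos, n, sens, spec):
    """Equations 4 and 5: corrected prevalence and its large-sample variance.

    sens and spec are treated as known constants (an assumption of the method).
    """
    youden = sens + spec - 1
    if youden <= 0:
        # Added guard: the correction is not meaningful for an uninformative test.
        raise ValueError("requires sens + spec > 1")
    pi_hat = (p_pos - (1 - spec)) / youden
    var_hat = p_pos * (1 - p_pos) / (n * youden ** 2)
    return pi_hat, var_hat


pi_hat, var_hat = prevalence_estimate(109 / 296, 296, sens=0.77, spec=0.95)
half_width = 1.96 * math.sqrt(var_hat)  # normal-approximation 95 percent interval
lower, upper = pi_hat - half_width, pi_hat + half_width
print(f"prevalence = {pi_hat:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

With these inputs the sketch gives a prevalence of about 44.2 percent with an interval of roughly 36.6 to 51.8 percent, in line with the urine-only figures reported in the Results section.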
In the unlikely event that the (0, 1) boundaries were violated, we would accept the boundary value as the estimate and either forgo an estimate of variance or develop other suitable formulae, since the methods presented in the Appendix might no longer apply.

The relation shown in equation 3 may also be used in equation system 1 to estimate the sensitivity and specificity of the self-reported data in terms of the data and the assumed known parameter values for the urine test. These turn out to be

$$
S_{es} = \frac{P_{11} - P_{.1}(1 - S_{pu})}{P_{1.} - (1 - S_{pu})} \tag{6}
$$

and

$$
S_{ps} = \frac{P_{11} - P_{1.} + S_{eu}(1 - P_{.1})}{S_{eu} - P_{1.}}. \tag{7}
$$

We use these relations in dealing with the problem addressed in this paper.

As we noted in the Introduction, our interest is in using the information obtained from urinalysis to adjust or correct the data from self-reports on a larger sample. In our case, the study design required the sample in which the urine tests were conducted to be a random subsample of the group for which self-reports were available. The data appear as in table 2, where the $n_{ij}$ are data from the subsample on which urine-test and self-report data are available and the $m_j$ are data from the subsample on which only self-reports are available ($N$ is the total sample size).

TABLE 2. Data from a self-report survey on cocaine use where urine testing was carried out in a random sample of participants, Cocaine Treatment Outcome Study, 1992-1993

                     Urine test
Self-report    Positive   Negative   Not tested
Yes            n11        n21        m1
No             n12        n22        m2

If we assume that the probabilities associated with the $m_j$ are the same as for the $n_{.j} = n_{1j} + n_{2j}$ (a reasonable assumption if the urine testing was done on a random sample of those providing self-reports), the likelihood of the data in table 2 would be

$$
L = C \prod_{i,j} P_{ij}^{\,n_{ij}} \prod_{j} P_{.j}^{\,m_j}, \tag{8}
$$

where $C$ is a constant. Setting the derivatives of $\log L$ equal to zero and solving the resulting equations under the restriction that the $P_{ij}$'s add up to 1 gives the MLEs for the $P_{ij}$'s as

$$
\hat{P}_{ij} = \frac{n_{ij}}{n_{.j}} \cdot \frac{n_{.j} + m_j}{N}. \tag{9}
$$

These estimates of $P_{ij}$ are also the "intuitive" estimates (i.e., ($n_{ij}$ + estimated $n_{ij}$)/$N$, where the estimated $n_{ij}$ is the number of untested participants expected to fall in cell $(i, j)$, namely $m_j n_{ij}/n_{.j}$). Substituting these estimates of $P_{ij}$ in equations 6 and 7 along with the assumed known values of $S_{eu}$ and $S_{pu}$ gives estimates of $S_{es}$ and $S_{ps}$, $\hat{S}_{es}$ and $\hat{S}_{ps}$, and inserting these quantities into equation 4 gives the estimate for the prevalence, $\hat{\pi}$. To obtain estimates of the variances of $\hat{S}_{es}$, $\hat{S}_{ps}$, and $\hat{\pi}$, we use formulae for the large sample variances and covariances of MLEs. The derivations of these formulae are shown in the Appendix.

RESULTS

In the Cocaine Treatment Outcome Study (5), self-reports of cocaine use were obtained from face-to-face interviews conducted by trained professional interviewers. Upon completion of each interview, the subject was paid $15.00 for participation. After the interview was completed, the interviewer then checked the sampling schedule, notified the subject if he or she had been selected to provide a urine sample, and proceeded to follow standard informed consent procedures. Volunteers were given $10.00 for providing a urine sample.

The random sampling schedule selected 382 respondents, from whom 307 urine samples were obtained (296 samples were usable). Of the 75 persons from whom no specimen was obtained, 22 refused to participate, 23 were interviewed in settings where a specimen could not be obtained, and 30 signed the consent form but no specimen arrived at the laboratory or a urine kit was not brought to the interview. Approximately 37 percent (109/296 from table 3) of these specimens tested positive for cocaine use in the previous 72 hours.

The basic data needed for our analysis came from the urine testing and from the portion of the face-to-face interview that asked about drug use during the previous 72 hours. The time period of 72 hours was chosen because it was felt that urine testing for the detection of drug use is valid for this time frame but may be less reliable for longer periods of time. Table 3 shows the self-report data for the entire sample of 772 and the urine test results for the random subsample of 296.

TABLE 3.
Self-reports of cocaine use within 72 hours prior to interview, Cocaine Treatment Outcome Study, 1992-1993

                         Urine test
Self-report    Positive   Negative   Subtotal   Urine test not done   Total
Yes            28         10         38         37                    75
No             81         177        258        439                   697
Total          109        187        296        476                   772

Despite the fact that there was only a 5.8 percent (22/382) refusal rate for the urine test, the data in table 3 indicate that the subsample on which the urine test results were ultimately available may not, strictly speaking, have been a truly random subsample. This doubt arises from the fact that while 12.8 percent (38/296) of participants with a urine test result reported cocaine use, only 7.8 percent (37/476) of those without a urine test result reported use. Nevertheless, the data are adequate for illustration of our methods, and it can be shown that if there is a shift of $\alpha$ percent of the individuals from cell $n_{12}$ in table 2 to cell $m_2$ because of specimen refusal, our estimates of prevalence are underestimates and the estimated bias is [expression not recoverable from the source]. For example, if the 22 refusals should have been in cell $n_{12}$, the estimated prevalence would be 49.1 percent rather than the 42.4 percent which we present below.

As we noted above, we assume that sensitivity and specificity for the urine testing are known, and we use the values of 77 percent for sensitivity (7) and 95 percent for specificity (8). The sensitivity and specificity were estimated by applying the enzyme-multiplied immunoassay technique (EMIT) to specimens that had been analyzed by gas chromatography/mass spectrometry (GC/MS). The specimens used to calculate sensitivity were collected from 2,668 parolees and arrestees, and those used for specificity were collected from 159,129 employees. EMIT identified 77 percent of the GC/MS-positive samples and 95 percent of the GC/MS-negative ones. The method used for urinalysis in the Cocaine Treatment Outcome Study was very similar to the EMIT methodology.
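Returning to the refusal scenario described above, the effect of moving the 22 refusals can be checked numerically. The sketch below is a minimal illustration, not the authors' code: it applies equation 9 to the two urine-positive cells and then equation 4 to the urine-positive margin, which is algebraically equivalent to the self-report version of equation 4 once equations 6 and 7 are substituted. The function name `prevalence_from_table` is hypothetical.

```python
def prevalence_from_table(n11, n12, n21, n22, m1, m2, s_eu=0.77, s_pu=0.95):
    """Prevalence implied by tested cells n_ij and untested counts m_j.

    n_ij: urine result i (1 = positive, 2 = negative),
          self-report j (1 = yes, 2 = no).
    m_j:  participants with self-report j but no urine result.
    """
    total = n11 + n12 + n21 + n22 + m1 + m2
    # Equation 9 for the two urine-positive cells.
    p11 = n11 / (n11 + n21) * (n11 + n21 + m1) / total
    p12 = n12 / (n12 + n22) * (n12 + n22 + m2) / total
    p_urine_pos = p11 + p12
    # Equation 4 applied to the urine-positive margin.
    return (p_urine_pos - (1 - s_pu)) / (s_eu + s_pu - 1)


# Counts as observed in table 3.
observed = prevalence_from_table(28, 81, 10, 177, 37, 439)
# Scenario from the text: the 22 refusals are moved from the untested
# "No" group (m2) into cell n12.
shifted = prevalence_from_table(28, 81 + 22, 10, 177, 37, 439 - 22)
print(f"observed: {observed:.3f}, shifted: {shifted:.3f}")
```

Under these assumptions the two values come out near 0.424 and 0.491, matching the 42.4 and 49.1 percent quoted in the text.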
In using these numbers for our purposes, we must assume that GC/MS identifies all cases of cocaine use during the previous 72 hours, and this is clearly an approximation. If it does not, then the 77 percent is probably an overestimate, and again our methods would produce underestimates of prevalence.

Using equation 9, we find the estimates for the $P_{ij}$ to be as follows: $\hat{P}_{11} = 0.0716$, $\hat{P}_{12} = 0.2834$, $\hat{P}_{21} = 0.0256$, and $\hat{P}_{22} = 0.6194$. Using the version of equation 4 corresponding to the sensitivity and specificity of self-reports (i.e., $P_+ = P_{.1}$, $S_p = S_{ps}$, $S_e = S_{es}$) and equations 6 and 7, we have estimates of the prevalence of use and of the sensitivity and specificity of self-reports as follows: $\hat{\pi} = 0.4237$ = 42.4 percent; $\hat{S}_{es} = 0.2187$ = 21.9 percent; and $\hat{S}_{ps} = 0.9922$ = 99.2 percent.
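The point estimates above can be reproduced from table 3 in a few lines. The following Python sketch is a minimal illustration under the paper's assumptions (urine-test sensitivity 0.77 and specificity 0.95 taken as known); the variable names are our own, and the variance calculation from the Appendix is not attempted here.

```python
# Counts from table 3: n[i][j] with i = urine result (0 = positive, 1 = negative)
# and j = self-report (0 = yes, 1 = no); m[j] = participants not urine-tested.
n = [[28, 81], [10, 177]]
m = [37, 439]
S_EU, S_PU = 0.77, 0.95  # assumed known urine-test sensitivity and specificity

N = sum(map(sum, n)) + sum(m)

# Equation 9: P_ij = (n_ij / n_.j) * (n_.j + m_j) / N
col = [n[0][j] + n[1][j] for j in range(2)]
p = [[n[i][j] / col[j] * (col[j] + m[j]) / N for j in range(2)] for i in range(2)]

p_urine_pos = p[0][0] + p[0][1]   # P_1.
p_report_yes = p[0][0] + p[1][0]  # P_.1

# Equations 6 and 7: sensitivity and specificity of self-reports.
s_es = (p[0][0] - p_report_yes * (1 - S_PU)) / (p_urine_pos - (1 - S_PU))
s_ps = (p[0][0] - p_urine_pos + S_EU * (1 - p_report_yes)) / (S_EU - p_urine_pos)

# Equation 4 with P_+ = P_.1, S_e = S_es, S_p = S_ps.
prevalence = (p_report_yes - (1 - s_ps)) / (s_es + s_ps - 1)

print(f"prevalence = {100 * prevalence:.1f}%")
print(f"self-report sensitivity = {100 * s_es:.1f}%")
print(f"self-report specificity = {100 * s_ps:.1f}%")
```

The cell proportions agree with the reported values to within rounding in the last decimal place, and the three percentages agree with those given in the text.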

Using equation A1 in the Appendix, we derive the covariance matrix for the estimated parameters as

$$
\widehat{\mathrm{Cov}} =
\begin{pmatrix}
0.000219 & 0.000257 & 0.000153 \\
0.000257 & 0.000936 & 0.000489 \\
0.000153 & 0.000489 & 0.002389
\end{pmatrix}.
$$

Hence, an estimated 95 percent confidence interval for the prevalence, $\pi$, is 39.4-45.3 percent.

This analysis reveals that whereas the reported use of cocaine within the previous 72 hours was only 9.7 percent (75/772 from table 3), the estimate derived from a combination of urinalysis data and self-reports puts the figure at 42.4 percent, over four times that which was reported. We further see that while the self-reports have very good specificity, as expected (i.e., nonusers are almost certain to report nonuse), the sensitivity is poor.

We also note that the above estimate of prevalence agrees very well with that which would have been estimated from the urine data alone (i.e., using equation 4 with the percent positive and the sensitivity and specificity of the urine test). This turns out to be $\hat{\pi}_u = [109/296 - (1 - 0.95)]/(0.77 + 0.95 - 1)$ = 44.2 percent, as compared with the 42.4 percent obtained above. On the other hand, the confidence interval is somewhat wider for the estimate obtained from the urine data (i.e., 36.6-51.8), despite the fact that the sensitivity and specificity are assumed to be known. This increase in precision for the self-reports results from the increased sample size.

DISCUSSION

One of the first discussions in the literature of the sensitivity and specificity of diagnostic tests appeared in Public Health Reports in 1947. It was written by Jacob Yerushalmy (9) and was followed by comments by Jerzy Neyman (10). Yerushalmy discussed a study of four radiographic techniques used in the detection of tuberculosis and noted the high variability not only between radiograph readers but also within each reader.
Since this early work, several authors have contributed to the literature on special studies involving one and two diagnostic tests. For the case of one diagnostic test, both Quade et al. (11) and Rogan and Gladen (12) assumed that the sensitivity and specificity of the test were known in order to estimate the prevalence of the condition under study. In the case of two diagnostic procedures, the estimation procedure uses three independent equations with five unknown quantities; hence, constraints must be placed on the unknown parameters to obtain unique solutions. These added constraints have included assuming that the sensitivity and specificity of one procedure are known (6) and that both of the specificities are known (13).

Recently, a Bayesian approach has been advocated (14) in which prior distributions of the parameters of the diagnostic procedures are postulated and estimates are obtained from the posterior distributions. This is an interesting approach, but like previous methods and the method presented in this paper, it still involves assumptions about unknown parameters (i.e., the functional form and parameter values of the prior distributions). In addition, the estimates of interest must be derived from iterative methods.

Although we make the same assumption of a known sensitivity and specificity of one method as do Gart and Buck (6), our methodology differs from theirs in that the validation sample (i.e., the urine test participants) is only a subset of the entire study sample. This will normally be the case in drug abuse studies, because of the expense involved in the collection and analysis of such data.
Our methods are analogous to the double sampling techniques discussed by Ekholm and Palmgren (15), Baker (16), and Tenenbein (17), but they differ in that Ekholm and Palmgren (15) and Tenenbein (17) assumed that one of the methods was perfect (i.e., had a sensitivity and specificity equal to 1) and Baker (16) considered the case of three separate tests: a new test, a reference test, and a perfect "gold standard" test. In addition, none of these authors have sought to use their methodologies to correct for underreporting in drug use surveys.

In the psychological literature, as evidenced by Gibertini et al. (18), sensitivity and specificity for psychodiagnostic tests have been reported and calculated using diagnostic assignments as the gold standard upon which the operating characteristics of the tests are determined. Little attention has been paid to issues of the sensitivity and specificity associated with the procedure underlying the gold standard or the assignment of patients to the diagnostic categories being assessed by the psychodiagnostic instruments. Often, what is purported to be the gold standard criterion measure has associated sensitivity and specificity parameters of less than 1. Where this problem exists, it is possible to calculate a more precise estimate of the prevalence of the condition (in our case, the presence or absence of illicit substance use) by considering the sensitivities and specificities of both measures.

In this paper, we have presented a method that should prove useful in adjusting self-reported data on substance abuse when other information with known sensitivity and specificity (such as that obtained by urinalysis) is available for a random subset of the study group. We used data from urine testing to adjust the self-reported data, but other tests of biologic specimens (e.g., hair) may be used as well, provided that the sensitivity and specificity parameters can be assumed to be known. Results of these tests may be applicable for a longer period of time than urinalysis data. It is important to remember, in applying these methods, that the data used to adjust the self-reported estimates must be applicable over the same period of time as is covered by the self-reports. This is why we chose the 72-hour self-report data to correspond to the urine tests. On the other hand, if drug use behavior during the 72 hours preceding the interview is typical of behavior over a longer period of time, the urine tests may be appropriate for a longer time interval (e.g., the previous month).

Our analysis generally confirms the popular belief that the specificity of self-reports of drug use is very good (99 percent) because nonusers will usually tell the truth, while sensitivity is very poor (22 percent) because some proportion of users tend to deny use. Once sensitivity and specificity estimates are derived for self-reports in some general population of interest, they may be used in equation 4 to adjust estimates from future similar surveys carried out in that population or similar populations. In fact, because we expect the specificity to be very close to 1, particularly for short recall periods, the estimate for the prevalence can be simplified to $\hat{\pi} = \hat{P}_+/\hat{S}_{es}$, where $\hat{P}_+$ = the proportion of positive self-reports in the sample and $\hat{S}_{es}$ = the estimated sensitivity of the self-reports from the previous independent study. The variance of this estimate can be approximated using a Taylor series linearization of the estimate and noting that $\hat{P}_+$ and $\hat{S}_{es}$ are statistically independent, having arisen from different samples.
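The Taylor series approximation mentioned above is not written out in the paper. A minimal sketch of the standard first-order result for a ratio of two independent estimates, $\mathrm{Var}(\hat{P}_+/\hat{S}_{es}) \approx \mathrm{Var}(\hat{P}_+)/S_{es}^2 + P_+^2\,\mathrm{Var}(\hat{S}_{es})/S_{es}^4$, is given below. The function name, the binomial variance for $\hat{P}_+$ (which assumes simple random sampling), and all numerical inputs other than the sensitivity estimate of 0.2187 are hypothetical.

```python
import math


def simplified_prevalence(p_pos, n, s_es, var_s_es):
    """Prevalence P+/S_es with a first-order (Taylor series) variance.

    Assumes specificity equal to 1 and that p_pos and s_es come from
    independent samples, so no covariance term appears.
    """
    pi_hat = p_pos / s_es
    var_p = p_pos * (1 - p_pos) / n  # binomial variance of P+ (assumed SRS)
    var_pi = var_p / s_es ** 2 + p_pos ** 2 * var_s_es / s_es ** 4
    return pi_hat, var_pi


# Hypothetical future survey: 10 percent of 1,000 respondents report use.
# s_es is the estimate reported in this paper; var_s_es is illustrative only.
pi_hat, var_pi = simplified_prevalence(p_pos=0.10, n=1000, s_es=0.2187, var_s_es=0.0009)
standard_error = math.sqrt(var_pi)
```

In this illustration the second term, which carries the uncertainty in the sensitivity estimate, is the larger of the two; when the sensitivity is low, its precision can matter more than the size of the new survey.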
Finally, since there will usually be a substantial cost differential between obtaining biologic specimens and gathering self-report data, this method offers researchers an opportunity to investigate the relative mix of biologic specimens and self-reports necessary to achieve maximum precision for a fixed cost. We plan to make this line of research the goal of future analyses.

ACKNOWLEDGMENTS

This research was supported by National Institute on Drug Abuse (NIDA) contract N01DA-2-4310. Dr. Bennett W. Fletcher served as the NIDA project officer.

REFERENCES

1. Hubbard RL, Eckerman WC, Rachal JV. Methods of validating self-reports of drug use: a critical review. In: Proceedings of the American Statistical Association. Washington, DC: American Statistical Association, 1976:406-9.
2. Rouse BA, Kozel NJ, Richards LG, eds. Self-report methods of estimating drug use: meeting current challenges to validity. Rockville, MD: National Institute on Drug Abuse, 1985. (NIDA Research Monograph no. 57) (DHHS publication no. (ADM) 88-1402).
3. Gerstein DR, Johnson RA, Harwood HJ, et al. Evaluating recovery services: the California Drug and Alcohol Treatment Assessment (CALDATA) general report. Sacramento, CA: California Department of Alcohol and Drug Programs, 1994. (Publication no. ADP 94-629).
4. Hubbard RL, Marsden ME, Rachal JV, et al. Drug abuse treatment: a national study of effectiveness. Chapel Hill, NC: University of North Carolina Press, 1989.
5. Luckey JW, Dunteman GH, Fletcher BW, et al. Assessing the accuracy of self-reported cocaine use: methodological issues with underreporting of use. In: Problems of drug dependence 1995: proceedings of the 57th annual scientific meeting. Rockville, MD: National Institute on Drug Abuse, 1996:310. (NIDA Research Monograph no. 162) (NIH publication no. 96-4116).
6. Gart JJ, Buck AA. Comparison of a screening test and a reference test in epidemiologic studies. II. A probabilistic model for the comparison of diagnostic tests. Am J Epidemiol 1966;83:593-602.
7. Visher G, McFadden K. A comparison of urinalysis technologies for drug testing in criminal justice. Washington, DC: National Institute of Justice, US Department of Justice, 1991.
8. Stephenson RL. The potential for collision: drug testing in the workplace and in the criminal justice system. Presented at the NIJ-BJA Third Annual Evaluating Drug Control Initiatives Conference, July 1992. [Cited in: Harrison LD. The validity of self-reported data on drug use. J Drug Issues 1995;25:51-71.]
9. Yerushalmy J. Statistical problems in assessing methods of medical diagnosis, with special reference to x-ray techniques. Public Health Rep 1947;62:1432-49.
10. Neyman J. Outline of statistical treatment of the problem of diagnosis. Public Health Rep 1947;62:1449-56.
11. Quade D, Lachenbruch PA, Whaley FS, et al. Effects of misclassification on statistical inferences in epidemiology. Am J Epidemiol 1980;111:503-15.
12. Rogan WJ, Gladen B. Estimating prevalence from the results of a screening test. Am J Epidemiol 1978;107:71-6.
13. Goldberg JD, Wittes JT. The estimation of false negatives in medical screening. Biometrics 1978;34:77-86.
14. Joseph L, Gyorkos TW, Coupal L. Bayesian estimation of disease prevalence and the parameters of diagnostic tests in the absence of a gold standard. Am J Epidemiol 1995;141:263-72.
15. Ekholm A, Palmgren J. Correction for misclassification using doubly sampled data. J Offic Stat 1987;3:419-29.
16. Baker S. Evaluating a new test using a reference test with estimated sensitivity and specificity. Commun Stat Theory Meth 1991;20:2739-52.
17. Tenenbein A. A double sampling scheme for estimating from binomial data with misclassification. J Am Stat Assoc 1970;65:1350-61.
18. Gibertini M, Brandenburg NA, Retzlaff PD. The operating characteristics of the Millon Clinical Multiaxial Inventory. J Pers Assess 1986;50:554-67.

APPENDIX

The large sample variances and covariances for maximum likelihood estimates (MLEs) involve the second-order derivatives of the log-likelihood with respect to the parameters of interest. Our interest is in the parameters $\pi$, $S_{es}$, and $S_{ps}$, but the discussion will be easier if we initially adopt the generic notation $\theta_1$, $\theta_2$, and $\theta_3$. If $\hat{\theta}_1$, $\hat{\theta}_2$, and $\hat{\theta}_3$ are MLEs, then the covariance matrix of $(\hat{\theta}_1, \hat{\theta}_2, \hat{\theta}_3)$ for large samples is given by

$$\Sigma = \begin{pmatrix} \sigma_1^2 & \sigma_{12} & \sigma_{13} \\ \sigma_{21} & \sigma_2^2 & \sigma_{23} \\ \sigma_{31} & \sigma_{32} & \sigma_3^2 \end{pmatrix},$$

where the $\sigma_i^2$ are the variances and the $\sigma_{ij}$ are the covariances. $\Sigma$ is given by the inverse of the matrix $D$ whose $(k, l)$th element is

$$-E\left(\frac{\partial^2 \log L}{\partial\theta_k\,\partial\theta_l}\right) = [\ldots] \qquad \text{(A1)}$$

where $E(\cdot)$ is the expectation, $n$ and $m$ are the data in table 2, the $\hat{P}$s are the MLEs for the $P_{ij}$'s given in equation 9, and the derivatives are evaluated at the MLEs for $\theta_1$, $\theta_2$, and $\theta_3$ (i.e., $\hat{\pi}$, $\hat{S}_{es}$, and $\hat{S}_{ps}$). The first-order derivatives in equation A1 are

$$\frac{\partial P_{11}}{\partial S_{ps}} = -(1 - \pi)(1 - S_{pu}), \qquad \frac{\partial P_{12}}{\partial S_{ps}} = -(1 - \pi)S_{pu}, \qquad [\ldots]$$

Am J Epidemiol Vol. 144, No. 4, 1996

The second-order derivatives in equation A1 are

$$\frac{\partial^2 P_{ij}}{\partial\pi\,\partial S_{es}} = [\ldots], \qquad \frac{\partial^2 P_{ij}}{\partial\pi\,\partial S_{ps}} = [\ldots], \qquad i, j = 1, 2,$$

and all other second-order derivatives are zero. Substituting the estimates of these quantities into equation A1 and inverting matrix D results in the estimated precision of interest. A computer program for calculating the parameter estimates and their covariance matrix is available from the authors.
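The appendix calculation can be illustrated numerically. The Python below is not the authors' program. It is a sketch that assumes self-report and urine results are conditionally independent given true use status, which yields the cell probabilities coded in `cell_probs`. Equation 9 is not reproduced in this section, so that form is an assumption. The sketch also uses the expected-information form, summing (1/P)(dP/d theta_k)(dP/d theta_l) over cells, rather than a literal transcription of equation A1. The index order (self-report first, urine second), the function names, and every numeric input are hypothetical.

```python
def cell_probs(pi, s_es, s_ps, s_eu, s_pu):
    """P[i][j]: i = self-report, j = urine test; index 0 = positive, 1 = negative."""
    return [
        [pi * s_es * s_eu + (1 - pi) * (1 - s_ps) * (1 - s_pu),
         pi * s_es * (1 - s_eu) + (1 - pi) * (1 - s_ps) * s_pu],
        [pi * (1 - s_es) * s_eu + (1 - pi) * s_ps * (1 - s_pu),
         pi * (1 - s_es) * (1 - s_eu) + (1 - pi) * s_ps * s_pu],
    ]


def cell_grads(pi, s_es, s_ps, s_eu, s_pu):
    """First-order derivatives of each P[i][j] with respect to (pi, s_es, s_ps)."""
    return [
        [(s_es * s_eu - (1 - s_ps) * (1 - s_pu), pi * s_eu, -(1 - pi) * (1 - s_pu)),
         (s_es * (1 - s_eu) - (1 - s_ps) * s_pu, pi * (1 - s_eu), -(1 - pi) * s_pu)],
        [((1 - s_es) * s_eu - s_ps * (1 - s_pu), -pi * s_eu, (1 - pi) * (1 - s_pu)),
         ((1 - s_es) * (1 - s_eu) - s_ps * s_pu, -pi * (1 - s_eu), (1 - pi) * s_pu)],
    ]


def expected_information(theta, s_eu, s_pu, n_both, m_self_only):
    """3x3 expected information for theta = (pi, s_es, s_ps).

    n_both      -- subjects with both a self-report and a urine result
    m_self_only -- subjects with a self-report only
    """
    p = cell_probs(*theta, s_eu, s_pu)
    g = cell_grads(*theta, s_eu, s_pu)
    info = [[0.0] * 3 for _ in range(3)]
    for i in range(2):
        # Marginal self-report probability and its gradient (urine result unobserved).
        p_marg = p[i][0] + p[i][1]
        g_marg = [g[i][0][k] + g[i][1][k] for k in range(3)]
        for k in range(3):
            for l in range(3):
                for j in range(2):
                    info[k][l] += n_both * g[i][j][k] * g[i][j][l] / p[i][j]
                info[k][l] += m_self_only * g_marg[k] * g_marg[l] / p_marg
    return info


def invert_3x3(a):
    """Invert a 3x3 matrix via cofactors (cyclic indexing carries the signs)."""
    cof = [[a[(r + 1) % 3][(c + 1) % 3] * a[(r + 2) % 3][(c + 2) % 3]
            - a[(r + 1) % 3][(c + 2) % 3] * a[(r + 2) % 3][(c + 1) % 3]
            for c in range(3)] for r in range(3)]
    det = sum(a[0][c] * cof[0][c] for c in range(3))
    if det == 0.0:
        raise ValueError("information matrix is singular")
    return [[cof[c][r] / det for c in range(3)] for r in range(3)]


# Hypothetical parameter values and sample sizes, for illustration only.
theta_hat = (0.30, 0.22, 0.99)          # (pi, S_es, S_ps)
info = expected_information(theta_hat, s_eu=0.95, s_pu=0.98,
                            n_both=500, m_self_only=2000)
cov = invert_3x3(info)
for name, row in zip(("pi", "S_es", "S_ps"), range(3)):
    print(f"approx. SE({name}) = {cov[row][row] ** 0.5:.4f}")
```

Because the cell probabilities are linear in each parameter separately, the analytic gradients in `cell_grads` can be checked against finite differences of `cell_probs`, which is a useful guard before trusting the inverted matrix.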