
American Journal of Epidemiology, Vol. 154, No. 6
Copyright 2001 by the Johns Hopkins University Bloomberg School of Public Health. All rights reserved. Printed in U.S.A.

Comparison of Self-reported Initial Treatment with Medical Records: Results from the Prostate Cancer Outcomes Study

Limin X. Clegg,1 Arnold L. Potosky,1 Linda C. Harlan,1 Benjamin F. Hankey,1 Richard M. Hoffman,2,3 Janet L. Stanford,4 and Ann S. Hamilton5

Medical records are generally accepted as the most accurate source of information documenting cancer treatments. However, as the health care system becomes more decentralized and more cancer care is delivered in outpatient settings, it is increasingly difficult and expensive to review records from the many surgeons and medical/radiation oncologists who administer cancer therapies in the community setting. Using 1994-1995 data, the authors compared initial treatment for prostate cancer self-reported (from a mailed questionnaire or telephone/in-person interview) by 3,196 US men in the population-based Prostate Cancer Outcomes Study with information obtained from medical records. Agreement between self-reports and medical records varied by type of treatment. Generally, agreement was excellent for more invasive procedures such as prostatectomy or radiation (kappa values > 0.8), with decreasing agreement for hormone shots and pills (kappa values < 0.7). If the medical record abstract is assumed to be the gold standard, the estimated sensitivity was generally high (>80%) for prostatectomy and radiation but low (68%) for hormone pills, although the estimated specificity was % or greater for all treatments. These results can serve as a useful guide to researchers contemplating the use of surveys as an alternative to medical record abstraction to ascertain treatment in studies of patient outcomes. Am J Epidemiol 2001;154:582-7.
medical records; prostatic neoplasms; questionnaires; recall; therapeutics

Received for publication December 20, 2000, and accepted for publication April 20, 2001.
Abbreviations: PCOS, Prostate Cancer Outcomes Study; SEER, Surveillance, Epidemiology, and End Results.
1 Division of Cancer Control and Population Sciences, National Cancer Institute, Bethesda, MD.
2 New Mexico Tumor Registry, University of New Mexico Health Sciences Center, Albuquerque, NM.
3 Medical Service, Department of Veterans Affairs Medical Center, Albuquerque, NM.
4 Fred Hutchinson Cancer Research Center, Seattle, WA.
5 Department of Preventive Medicine, University of Southern California Keck School of Medicine, Los Angeles, CA.
Reprint requests to Dr. Limin X. Clegg, National Cancer Institute, NIH, 6116 Executive Boulevard, Suite 504, MSC 8316, Bethesda, MD 20892-8316 (e-mail: lin_clegg@nih.gov).

Medical records provide the basis for monitoring cancer incidence and survival. They are also generally accepted as the most accurate source of information documenting cancer treatments. In addition, medical records are abstracted to study patterns of cancer care and associated health outcomes. However, as the health care system becomes more decentralized and more cancer care is delivered in outpatient settings, it is increasingly difficult and expensive to review records from the many surgeons, medical oncologists, and radiation oncologists who administer cancer therapies in the community setting. Given the prolonged survival of men with early-stage prostate cancers, it can also be expensive to obtain longitudinal treatment information by reabstracting medical records at multiple time points.

In 1994, the National Cancer Institute (Bethesda, Maryland) initiated the Prostate Cancer Outcomes Study (PCOS) to investigate the patterns of cancer care and the effects of initial treatments on health-related quality-of-life outcomes in a large population-based cohort of newly diagnosed prostate cancer patients.
For the PCOS, information about initial treatments was collected primarily from medical records, including both inpatient and outpatient sources. Treatment information was also collected in a self-administered patient survey designed primarily to ascertain health-related quality of life. In this study, we compare surgery, radiation, and hormonal therapies self-reported by men newly diagnosed with prostate cancer with treatment information obtained from medical records.

MATERIALS AND METHODS

Study subjects

The PCOS is a 5-year, population-based, longitudinal cohort study conducted in six US geographic regions covered by the population-based Surveillance, Epidemiology, and End Results (SEER) cancer registries: the states of Connecticut, Utah, and New Mexico and the metropolitan areas of Atlanta, Georgia; Los Angeles, California; and Seattle-Puget Sound, Washington. Details of the study methods have been reported elsewhere (1). Briefly, eligible cases were all men aged 39-89 years with biopsy-proven, primary invasive carcinoma of the prostate diagnosed between October 1, 1994, and October 31, 1995; in Seattle, however, men younger than age 60 years were excluded because they were eligible for another ongoing study. Eligible cases were identified from all pathology facilities serving the registries' catchment areas within 4-6 months of diagnosis. Non-Hispanic Whites aged 60 years or more at diagnosis were randomly drawn from among eligible cases. Men younger than age 60 years were oversampled, as were Hispanic men and Black men, to obtain a sufficient number of minority men and younger men.

Among 11,137 eligible cases, 5,672 were randomly sampled from the six registries according to defined age and race/ethnicity strata. Of the sampled cases, 4,736 (83.5 percent) were contacted and were invited to participate, and 3,196 (56.3 percent of sampled cases and 67.5 percent of invited participants) returned a 6-month survey. Eligible sampled patients were contacted by mail (90.2 percent) or telephone/in person (9.8 percent) 6 months after the diagnosis date and were asked to complete a self-administered questionnaire and provide consent for access to medical records from all providers of cancer care (both institutions and specific physicians). Medical records were abstracted for 3,173 (99.3 percent) of the participants who finished the 6-month survey. This study included all 3,196 patients who completed the 6-month survey. For the 23 patients whose medical records were not abstracted for the PCOS, we used data on initial treatment routinely abstracted from medical records by the SEER registries.

Variables of initial treatment

TABLE 1.
Questionnaire items (self-reported) used in the patient survey of initial treatment, Prostate Cancer Outcomes Study, United States, 1994-1995

Treatment        Wording of question
Prostatectomy    Have you had surgery to remove your prostate gland?
Radiation        Have you had radiation treatment for prostate cancer?
Orchiectomy      Have you had surgery to remove the testicles?
Hormone shots    Have you had hormone shots for prostate cancer?
Hormone pills    Have you taken hormone pills for prostate cancer?

The 6-month PCOS patient survey collected information on demographics, treatment of prostate cancer, health-related quality of life, and quality-of-life outcomes. Participants were asked whether they had initially received a radical prostatectomy, radiation, orchiectomy, hormone shots, or hormone pills for prostate cancer (table 1). Because chemotherapy is rarely used for initial treatment of prostate cancer (it is used typically to treat hormone-refractory disease, which tends to occur in a fraction of patients after initial hormonal therapy fails), the 6-month PCOS survey did not ask about chemotherapy. Response options for each question were "yes" and "no." Responses from participants who did not give an answer were categorized as "unknown."

Centrally trained, certified, experienced abstractors from each registry abstracted medical records from hospitals, outpatient clinics, health maintenance organizations, and private physicians' offices that were identified by the patients. In addition, a 5 percent sample of PCOS records was reabstracted by a quality control supervisor. A standardized form was used to record treatment details, including types of therapy and dates received, as well as other information about the symptoms of disease, diagnostic procedures performed, clinical and pathologic stage of disease, and acute complications. Abstraction was conducted at least 1 year after diagnosis in physicians' offices to ensure complete collection of treatments given in the first 6 months after diagnosis. Because some patients received certain treatments after returning their 6-month survey, a treatment was not counted as received if the treatment date was later than the corresponding 6-month survey date.

From medical record abstracts, leuprolide acetate and goserelin were considered hormone shots for comparison with self-reports, while finasteride, flutamide, estrogens, bicalutamide, prednisone/steroids, aminoglutethimide, and ketoconazole were considered hormone pills. For comparison of self-reports with medical records, the medical record abstract for hormone shots or pills was categorized "yes" if answers to any of the component items were positive, "no" if all responses were negative, and "unknown" otherwise.

Statistical methods

Kappa statistics were used to obtain chance-corrected agreement when neither self-report nor medical record could be assumed to be the gold standard. In general, values of kappa greater than 0.8 represent excellent agreement beyond chance, values between 0.61 and 0.8 substantial agreement, and values higher than 0.4 but 0.60 or lower moderate agreement (2). Depending on the method of classifying unknown answers, we calculated three types of kappa statistics. Kappa1 used "yes," "no," and "unknown." Kappa2 considered only "yes" and "no" by treating "unknown" from medical record abstraction as "no" and excluding participants who did not answer the survey item. Kappa3 also considered only "yes" and "no" but counted "unknown" from both medical record abstraction and the survey items as "no." Sensitivity, specificity, positive predictive value, and negative predictive value were also estimated (based on categorical levels of "yes," "no," and "unknown") by assuming the medical record to be the gold standard.

Adjusted agreements were produced from the logistic regression models (3) to investigate the association between various patient characteristics and agreement between self-reported and medical record information.
The dependent variable for these analyses was agreement between survey self-reports and medical record abstracts based on categorical levels of "yes," "no," and "unknown." For example, the dependent (indicator) variable was coded 0 (disagreement) if self-report was "unknown" and the medical record indicated no treatment. The independent variables included age (<60, 60-69, 70-79, ≥80 years), race (Black, White, Hispanic), education (less than high school, high school or some college, college or advanced level), income (≤$10,000, $10,001-$20,000, $20,001-$30,000, $30,001-$50,000, ≥$50,001, unknown), and registry area.

All analyses were performed by using SUDAAN statistical software (4) except when we calculated adjusted agreement, for which we used a customized computer program written in SAS (SAS Institute, Inc., Cary, North Carolina). Horvitz-Thompson sampling weights, which are the reciprocal of sampling probabilities, were used to take into account the study design. The sum of the weights, 6,031, was about double the number of subjects in the analysis, 3,196 (table 2).

TABLE 2. Sample size and weighted distribution of the study population, by demographic variables, Prostate Cancer Outcomes Study, United States, 1994-1995

Variable                           No. in sample (n = 3,196)   Weighted %* (n = 6,031)
Age (years)
  <60                                734                         16.5
  60-69                            1,203                         40.3
  70-79                            1,041                         35.7
  ≥80                                218                          7.6
Race
  White                            2,202                         75.5
  Hispanic                           547                         13.9
  Black                              447                         10.4
Education†
  College or advanced level        1,022                         34.5
  High school or some college      1,430                         44.3
  Less than high school              699                         21.2
Income ($)
  ≤10,000                            278                          7.5
  10,001-20,000                      526                         15.9
  20,001-30,000                      502                         15.3
  30,001-50,000                      730                         23.3
  >50,000                            826                         26.3
  Unknown                            334                         11.6
Registry
  Seattle-Puget Sound, Washington    338                          6.2
  Connecticut                        669                         22.3
  New Mexico                         344                         11.6
  Utah                               584                         10.7
  Metropolitan Atlanta, Georgia      321                         13.5
  Los Angeles, California            940                         35.6
Current marital status‡
  Married                          2,511                         79.5
  Not married                        648                         20.5

* Some percentages do not total 100 because of rounding.
† Excludes 45 subjects for whom values were missing.
‡ Excludes 37 subjects for whom values were missing.
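The three kappa variants and the use of sampling weights described under Statistical methods can be illustrated with a short Python sketch. It is not the authors' SUDAAN or SAS code. The function names, the record layout of (self-report, medical record, weight) triples, and the example data are all hypothetical, and the sketch computes point estimates only, with no design-based variance estimation.

```python
from collections import defaultdict

CATEGORIES_3 = ("yes", "no", "unknown")
CATEGORIES_2 = ("yes", "no")


def weighted_kappa(records, categories):
    """Cohen's kappa from (self_report, medical_record, weight) triples."""
    total = 0.0
    cell = defaultdict(float)  # weighted joint counts
    row = defaultdict(float)   # weighted self-report margins
    col = defaultdict(float)   # weighted medical-record margins
    for self_report, record, weight in records:
        total += weight
        cell[(self_report, record)] += weight
        row[self_report] += weight
        col[record] += weight
    p_observed = sum(cell[(c, c)] for c in categories) / total
    p_expected = sum(row[c] * col[c] for c in categories) / total ** 2
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when expected agreement is 1")
    return (p_observed - p_expected) / (1.0 - p_expected)


def recode(records, variant):
    """Apply the handling of 'unknown' described for kappa1, kappa2, kappa3."""
    recoded = []
    for self_report, record, weight in records:
        if variant == 2:
            # Drop unanswered survey items; medical-record unknown becomes no.
            if self_report == "unknown":
                continue
            record = "no" if record == "unknown" else record
        elif variant == 3:
            # Unknown from either source is counted as no.
            self_report = "no" if self_report == "unknown" else self_report
            record = "no" if record == "unknown" else record
        recoded.append((self_report, record, weight))
    return recoded


def kappa_variant(records, variant):
    categories = CATEGORIES_3 if variant == 1 else CATEGORIES_2
    return weighted_kappa(recode(records, variant), categories)


def describe(kappa):
    """Verbal labels following the cut points quoted in the text."""
    if kappa > 0.8:
        return "excellent"
    if kappa > 0.60:
        return "substantial"
    if kappa > 0.4:
        return "moderate"
    return "below moderate"


# Hypothetical data: every weight is 1.0 here; in a real analysis each weight
# would be the reciprocal of that subject's sampling probability.
example = (
    [("yes", "yes", 1.0)] * 40
    + [("no", "no", 1.0)] * 40
    + [("yes", "no", 1.0)] * 10
    + [("no", "yes", 1.0)] * 10
    + [("unknown", "no", 1.0)] * 5
)
```

Calling `kappa_variant(example, 1)`, `kappa_variant(example, 2)`, and `kappa_variant(example, 3)` shows how the same responses yield different kappa values depending only on how "unknown" is classified.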
RESULTS

The number of subjects in the study was 3,196, who represented 6,031 (of 11,137) eligible cases based on our sampling weights (table 2). We did not use weights adjusted for attrition because our original study participation rate was not 100 percent. Study subjects were predominantly White (76 percent), were married (80 percent), were in their early sixties to late seventies (76 percent), had some college or more advanced education (79 percent), and had annual family incomes of more than $20,000 (65 percent) (table 2).

On the basis of medical records, 1,58 patients received prostatectomy, 810 radiation, 188 orchiectomy, 538 hormone shots, and 299 hormone pills (table 3). Kappa1 (which included "unknown" as a separate category) had the lowest values, while kappa2 (which classified "unknown" from medical records as "no" and excluded respondents who did not complete the survey item) had similar or slightly larger values than kappa3 (which counted "unknown" from both sources as "no"). All kappa statistics for prostatectomy and radiation were greater than 0.8, indicating excellent agreement between self-reports and medical records. The chance-corrected agreement for hormone pills was moderate (kappa statistics ranged from 0.47 to 0.57), while for hormone shots it was substantial (from kappa1 = 0.69 to kappa2 = 0.78). Kappa1 (0.61) indicated somewhat moderate agreement between self-reports and medical records for orchiectomy, whereas kappa2 (0.82) and kappa3 (0.81) showed substantial agreement.

The estimated sensitivity, assuming the medical record as the gold standard, was more than 80 percent for prostatectomy, radiation, and hormone shots and was 74 and 69 percent for orchiectomy and hormone pills, respectively (table 3). The estimated specificity of self-reports was percent or greater for all treatments.
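The four validity measures can be written out from a two-by-two cross-classification of self-report against the medical record, taken as the gold standard. The Python sketch below is illustrative only. The function name and the counts are hypothetical and are not taken from table 3, and it assumes "unknown" responses have already been folded into "no," which may differ from how the study handled them.

```python
def validity_measures(true_pos, false_neg, false_pos, true_neg):
    """Validity of self-report, treating the medical record as gold standard.

    true_pos:  record yes, self-report yes
    false_neg: record yes, self-report no
    false_pos: record no,  self-report yes
    true_neg:  record no,  self-report no
    """
    return {
        "sensitivity": true_pos / (true_pos + false_neg),
        "specificity": true_neg / (true_neg + false_pos),
        "ppv": true_pos / (true_pos + false_pos),
        "npv": true_neg / (true_neg + false_neg),
    }


# Hypothetical counts chosen only to show that a treatment can combine
# moderate sensitivity with a much lower positive predictive value.
example = validity_measures(true_pos=69, false_neg=31, false_pos=59, true_neg=841)
```

With these made-up counts, sensitivity is 69/100 while the positive predictive value is 69/128. When a treatment is uncommon, a modest number of self-reported "yes" answers with no matching record is enough to pull the positive predictive value well below the sensitivity.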
The positive predictive value and negative predictive value of self-reports corresponded to the sensitivity and specificity of the medical records, assuming the survey self-reports were the gold standard. The estimated positive predictive value was greater than percent for prostatectomy, radiation, and orchiectomy and was 77 percent for hormone shots. However, the estimated positive predictive value was only 54 percent for hormone pills, meaning that among the patients who reported taking hormone pills, 46 percent did not have any such evidence in their abstracted medical records. The estimated negative predictive value was high ( percent or more) for all treatments.

Logistic regression analysis was used to examine whether the respondent's age, race/ethnicity, educational level, annual family income, and cancer registry affected the likelihood of agreement about initial treatment. Table 4 displays information on adjusted agreement calculated from the regression models for the covariates and the p values from testing for the overall effect of each covariate by using the Wald test statistic. The overall test (with more than one degree of freedom) was not a trend test (with one degree of freedom), because we did not assume that the effect of a covariate is linear. Because income was not statistically significant (p > 0.05) for any of these treatments, it is not shown in this table. Cancer registry was also excluded, since its effect was not of particular interest in its own right but should be adjusted for as a study design variable.

Age, race, and educational level did not affect agreement on radiation treatment. Age was statistically significantly associated with agreement for reporting prostatectomy, hormone shots, and hormone pills, with better agreement for younger compared with older patients. Race/ethnicity also significantly affected the agreement for prostatectomy, orchiectomy, and hormone pills, with the highest agreement for White patients followed by Hispanic patients. For orchiectomy and hormone shots, self-reports from patients with a higher educational level versus less than a high school education showed significantly better agreement with the medical record.

TABLE 3. Agreement between self-reports and medical records, and validity of self-reports for initial type of treatment received, Prostate Cancer Outcomes Study, United States, 1994-1995*

Treatment        Kappa1   Kappa2   Kappa3   PPV (%)   Sen (%)
Prostatectomy    0.86     0.89     0.88     97
Radiation        0.85     0.       0.       99        89
Orchiectomy      0.61     0.82     0.81     98        74
Hormone shots    0.69     0.78     0.76     77        84
Hormone pills    0.47     0.57     0.55     54        69

* Counts were based on an unweighted number of 3,196, and kappa statistics were based on a weighted number of 6,031 (refer to the Materials and Methods section of the text). Unk, unknown; Sen, sensitivity; Spe, specificity; PPV, positive predictive value; NPV, negative predictive value.

DISCUSSION

We examined agreement between self-reports and medical records regarding initial treatment for prostate cancer.
Previous studies have reported agreement between self-reports and medical records on demographic and anthropometric data (5, 6), disease history and medical conditions (5-11), medication usage (6), screening procedures (12, 13), and risk factors (5, 7). We found that agreement between self-reports and medical records on initial treatment for prostate cancer varied somewhat by treatment type. In general, the agreement beyond chance was excellent for more aggressive procedures such as surgery or radiation, with decreasing agreement for hormone shots and hormone pills. When we assumed that the medical record abstract was the gold standard, the estimated sensitivity was generally high for surgical therapies and radiation but low for hormone pills.

The disagreement between self-reports and medical records might have been due to several different factors, such as recall error, biased self-reports, wording of questionnaire items, incomplete medical records abstraction, or patient noncompliance with a prescription regimen. The following paragraphs explore each of these factors in turn.

When a treatment was identified from medical records but not reported by patients, the disagreement might have been attributable to recall error. Inaccuracy in recalling medical history and medication history has been documented by test-retest evaluation (14). Results from test-retest evaluation indicate that a patient is more likely to recall the presence of a disease requiring a surgical or intensive pathologic/laboratory diagnostic procedure than a disease without such a procedure. We also observed that the extent of inaccuracy varied by treatment; that is, patients were more likely to recall a treatment that involved a surgical procedure but less likely to recall one that involved only taking pills. Not surprisingly, we found that age can affect recall.
For example, agreement between self-report and medical records regarding hormone pills was less likely for men aged 80 years or more, compared with younger men, and the adjusted agreement was 63 percent versus 80 percent or more, respectively.

TABLE 4. Relation between selected demographic characteristics and agreement on prostate cancer treatment from self-reports and medical records, Prostate Cancer Outcomes Study, United States, 1994-1995

p values for the overall effect of each characteristic:

Characteristic   Prostatectomy   Radiation   Orchiectomy   Hormone shots   Hormone pills
Age (years)      <0.0001         0.          0.18          0.015           <0.0001
Race             0.001           0.6         0.005         0.067           0.014
Education        0.16            0.35        0.01          0.00            0.30

% adjusted agreement* for hormone pills:

Age (years)   <60: 86    60-69: 85    70-79: 80    ≥80: 63
Race          White: 83  Hispanic: 80  Black: 80

* Adjusted for tumor registry, income, and other variables in the table; income was not a statistically significant predictor.

The disagreement between medical records and surveys might also have reflected biased self-reports or unclear wording of questionnaire items. The estimated 74 percent sensitivity of the self-reports on orchiectomy indicates that 26 percent of men who had the procedure did not respond positively to the question about that procedure. Because it is unlikely that men would forget an orchiectomy received within the past 6 months, some of the underreporting might have been due to the respondents' reluctance to acknowledge a potentially embarrassing treatment that negatively affected their body image. Another possibility is that the wording of the question may have confused some men. The question specifically asked about testicles, which may have been an unfamiliar term for some participants. For orchiectomy, educational level was a statistically significant predictor of agreement in our regression model: for self-reports from men with less than a high school education, adjusted agreement with medical records was lower. Therefore, future studies should design questions appropriate for subjects with lower educational levels.

Poor agreement could have arisen from incomplete medical records abstraction. Although the PCOS attempted to abstract records from all treating physicians, we might have missed some, particularly the office-based physicians. This omission could certainly account for much of the poor agreement on hormone pills, which are prescribed almost exclusively in physicians' offices.
Our study suggests that medical record review may not necessarily be the gold standard for hormone pills and shots, unless investigators are certain that all physicians treating the patient were contacted and consented to record reviews. In our study, more than 40 percent of the men who reported taking hormone pills had no such evidence in their medical record abstracts (estimated positive predictive value given in table 3). Men might also have inaccurately reported hormone pill use because they confused hormone pills with other medications. However, a Swedish prospective cohort study of more than 16,000 women aged 45-73 years confirmed the validity of self-reporting current use of hormone therapy (estrogens, progestogens, or their combination) when compared with a 7-day personal diary (15). Although the Swedish study might not be generalizable to prostate cancer, patient surveys may be more accurate than medical record abstracts regarding self-administered oral medications, especially when investigators cannot be certain they have accessed all relevant providers' records.

A fifth reason for disagreement could have been poor patient compliance with prescription regimens. For example, patients might not have taken their hormone pills or might not even have filled their prescriptions. This noncompliance might have led to lower sensitivity when the medical records were considered the gold standard. Poor patient compliance might partly explain why more than 30 percent of men did not acknowledge taking hormone pills despite medical records indicating that they had been given a prescription (estimated sensitivity shown in table 3). This finding suggests that patients should be asked whether hormone pills were prescribed for prostate cancer in addition to being asked the current question, "Have you taken hormone pills for prostate cancer?"
Our results showed that agreement between self-reported and medical-record-based initial treatment (except for radiation) was lower for Hispanics and Blacks than for Whites after we controlled for tumor registry, income, age, and educational level. The adjusted agreement was also poorer for older men than for younger men.

We also performed logistic regression analyses that included indicator variables for clinical disease stage. The adjusted proportions of agreement generally did not change much between levels of clinical stages, although agreement tended to worsen somewhat with increasing stage for each type of therapy. However, the adjusted agreement for hormone pills was much worse for patients with stage T3 or T4 disease. One possible reason for this finding is that men with more advanced disease may be more likely than men with early-stage disease to be prescribed hormone pills by medical oncologists to manage their condition, because hormonal therapy is usually given alone for advanced disease but as adjuvant therapy to other local treatments (surgery or radiotherapy) for early-stage disease. In the PCOS, abstraction of records from specialists was not as comprehensive as abstraction from urologist or radiation oncologist records.

One limitation of our study is that we did not obtain the date of surgery, radiation, or orchiectomy on the patient survey, although this information was collected in medical record abstracts. This variable may be important for some quality-of-life studies. It is conceivable that agreement on timing of treatment will not be as high as agreement on treatment itself. Also, because PCOS nonresponders were slightly older and from areas with lower levels of educational attainment (1), the missing responses might have led us to slightly overestimate agreement between self-report and medical record, except for radiation therapy, for which agreement was not associated with age and education.

In summary, this study provides the first known population-based comparison of self-report with medical record abstractions regarding treatment of prostate cancer, and the results are possibly generalizable to other diseases and treatments. These results can serve as a useful guide to outcomes researchers contemplating using surveys instead of medical record abstraction to ascertain treatments.

REFERENCES

1. Potosky AL, Harlan LC, Stanford JL, et al. Prostate cancer practice patterns and quality of life: the Prostate Cancer Outcomes Study. J Natl Cancer Inst 1999;91:1719-24.
2. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics 1977;33:159-74.
3. Graubard BI, Korn EL. Predictive margins with survey data. Biometrics 1999;55:652-9.
4. Shah BV, Barnwell BG, Bieler GS. SUDAAN. Research Triangle Park, NC: Research Triangle Institute, 1997.
5. Zhu K, McKnight B, Stergachis A, et al. Comparison of self-report data and medical records data: results from a case-control study on prostate cancer. Int J Epidemiol 1999;28:409-17.
6. Paganini-Hill A, Ross RK. Reliability of recall of drug usage and other health-related information. Am J Epidemiol 1982;116:114-22.
7. Colditz GA, Martin P, Stampfer MJ, et al. Validation of questionnaire information on risk factors and disease outcomes in a prospective cohort study of women. Am J Epidemiol 1986;123:894-900.
8. Coulter A, McPherson K, Elliott S, et al. Accuracy of recall of surgical histories: a comparison of postal survey data and general practice records. Community Med 1985;7:186-9.
9. Bush TL, Miller SR, Golden AL, et al. Self-report and medical record report agreement of selected medical conditions in the elderly. Am J Public Health 1989;79:1554-6.
10. Karlson EW, Lee IM, Cook NR, et al. Comparison of self-reported diagnosis of connective tissue disease with medical records in female health professionals. The Women's Health Cohort Study. Am J Epidemiol 1999;150:652-60.
11. Haapanen N, Miilunpalo S, Pasanen M, et al. Agreement between questionnaire data and medical records of chronic diseases in middle-aged and elderly Finnish men and women. Am J Epidemiol 1997;145:762-9.
12. Gordon NP, Hiatt RA, Lampert DI. Concordance of self-reported data and medical record audit for six cancer screening procedures. J Natl Cancer Inst 1993;85:566-70.
13. Mandelson MT, LaCroix AZ, Anderson LA, et al. Comparison of self-reported fecal occult blood testing with automated laboratory records among older women in a health maintenance organization. Am J Epidemiol 1999;150:617-21.
14. Kelly JP, Rosenberg L, Kaufman DW, et al. Reliability of personal interview data in a hospital-based case-control study. Am J Epidemiol 1990;131:79-90.
15. Merlo J, Berglund G, Wirfält E, et al. Self-administered questionnaire compared with a personal diary for assessment of current use of hormone therapy: an analysis of 16,060 women. Am J Epidemiol 2000;152:788-92.