American Journal of Epidemiology
Copyright 2000 by The Johns Hopkins University School of Hygiene and Public Health
All rights reserved
Vol. 152, No. 8
Printed in U.S.A.

Questionnaire Assessment of Hormone Use
Merlo et al.

Self-administered Questionnaire Compared with a Personal Diary for Assessment of Current Use of Hormone Therapy: An Analysis of 16,060 Women

Juan Merlo,1,2 Göran Berglund,3 Elisabet Wirfält,3 Bo Gullberg,1 Bo Hedblad,1,3 Jonas Manjer,1 Birgita Hovelius,1 Lars Janzon,1 Bertil S. Hanson,1 and Per Olof Östergren1

A personal diary may be more appropriate than a questionnaire for assessing self-reported current use of hormone therapy (estrogens, progestagens, or their combination); however, use of a questionnaire is more feasible and less expensive. The authors compared both methods for 16,060 Swedish women aged 45–73 years from the Malmö Diet and Cancer Study (baseline, 1991–1996). In a reliability analysis, the authors investigated the agreement (kappa value) between the questionnaire and the diary regarding current hormone therapy use (yes vs. no), studying the ability to replicate results whether or not they were correct. They also explored associations between discrepancy and individual characteristics. A validity analysis was conducted to determine whether use of the questionnaire achieved an outcome without systematic error (i.e., high specificity and sensitivity); the personal diary was considered the gold standard. Agreement between both methods was high: 95.5% (kappa = 0.840). The sensitivity was 84.9% and the specificity 97.7%. Higher body mass index and being a widow were associated with agreement, whereas age (50–59 years), use of anxiolytics/hypnotics or opiates, high alcohol consumption, past smoking, and higher educational level were associated with discrepancy. Compared with a personal diary, a simple self-administered questionnaire is a valid method for assessing current use of hormone therapy. Am J Epidemiol 2000;152:788–92.

bias (epidemiology); estrogens; pharmacoepidemiology; progestational hormones; questionnaires; women

An appropriate gold standard for assessing current drug use would be a personal diary in which persons logged the specifics of their medications on a daily basis (1). However, in large-scale studies, this method is too costly. Therefore, there is a need for studies evaluating the measurement quality of more accessible methods such as self-administered questionnaires.

The Malmö Diet and Cancer Study (MDCS) (2) included a questionnaire and a 7-day personal diary in the baseline examination. Both of these instruments assessed current use of hormone therapy (estrogens, progestagens, or their combination) by respondents.

One aim of the present study was to perform a reliability analysis, that is, to study the ability to replicate results whether or not they were correct, by assessing the agreement (kappa value) between the questionnaire and the diary. We also explored the association between discrepancy and eight individual characteristics. By conducting a validity analysis, we aimed to determine whether the questionnaire achieved an accurate result, that is, one that lacked systematic error; the personal diary was considered the gold standard. We calculated specificity, sensitivity, and positive and negative predictive values.

Received for publication June 1, 1999, and accepted for publication January 12, 2000.
Abbreviations: BMI, body mass index; MDCS, Malmö Diet and Cancer Study.
1 Department of Community Medicine, Malmö University Hospital, Malmö, Sweden.
2 The NEPI Foundation, Malmö, Sweden.
3 Department of Medicine, Surgery and Orthopedics, Malmö University Hospital, Malmö, Sweden.
Reprint requests to Dr. Juan Merlo, Department of Community Medicine, Malmö University Hospital, S-205 02 Malmö, Sweden (email: Juan.Merlo@smi.mas.lu.se).

MATERIALS AND METHODS

The MDCS is a prospective cohort study performed in the city of Malmö, Sweden (approximately 240,000 inhabitants). The 17,388 women in the MDCS cohort represented 41 percent of all women born in 1923–1950 who were living in Malmö during the 1991–1996 baseline period. Of these participants, approximately 7.7 percent (1,328/17,388) were excluded from our study because of incomplete information: 0.4 percent (63/17,388) completed the diary only, 1.7 percent (290/17,388) completed the questionnaire only, and 5.6 percent (975/17,388) had missing values for drug use. A detailed description of the design and aims of the cohort study is given elsewhere (2).

Information on current use of hormone therapy was obtained from 1) a self-administered questionnaire that included the open-ended question, "Which medicines do you use on a regular basis?" and 2) a structured, 7-day personal diary that included a specific, open-ended item for recording drug use. Each participant completed both information sources at home within a 1–2-week period between the first and second visits to the study center at baseline.

All pharmacologic agents reported in the personal diary or the questionnaire were classified according to the 1997 version of the Anatomical Therapeutic Chemical classification system (3). Hormone therapy was defined by the classification codes G03C (estrogens), G03D (progestagens), and G03F (progestagens and estrogens in combination); use of anxiolytics/hypnotics or opiates was defined by codes N05B (anxiolytics), N05C (hypnotics and sedatives), N03AE (benzodiazepines), N02A (opioids), N02B51C (in this study, paracetamol in combination with codeine), and N02AA59 (in this study, aspirin in combination with codeine).

The population was divided into four groups by quartile of body mass index (BMI), expressed as weight (kg) divided by height (m) squared (kg/m^2). Height and weight were measured at the study center at baseline. High alcohol consumption was defined as an intake of 25 g or more of alcohol per day as reported in the personal diary. The rest of the information was obtained from the self-administered questionnaire. Smoking status was classified as regular, infrequent, former, or never. Age at entry into the study was aggregated into the four categories of 45–49, 50–59, 60–69, and 70–73 years. Marital status was defined as married, unmarried, divorced, or widowed. Country of origin was dichotomized into Sweden and foreign. Finally, educational level was categorized as less than 9 years, 9–11 years, or more than 11 years of education.

Reliability analysis

We conducted a reliability analysis to assess to what extent the questionnaire and the personal diary agreed (i.e., the ability to replicate results whether or not the information was correct). We calculated the percentage of agreement and the related kappa coefficient (4) for dichotomous (yes vs. no) self-reported current use of hormone therapy.
The kappa coefficient is a measure of the degree of nonrandom agreement between two measurements of the same categorical variable (5). A p value of <0.05 was required for rejection of the null hypothesis of no agreement other than by chance.

Prevalence of current use of hormone therapy (i.e., the percentage of women reporting current use) was computed according to information from the questionnaire, from the diary, from the questionnaire or the diary, and from the questionnaire and the diary. Discrepancy was considered present if a woman reported in the questionnaire but not in the diary, or vice versa, that she currently used hormone therapy. We observed whether women tended to report hormone therapy use more in the questionnaire than in the diary (or vice versa) by computing the difference between the percentage of women reporting in the questionnaire only and the percentage reporting in the diary only. A positive difference indicated use reported more often in the questionnaire.

Prevalence of hormone therapy use in the general population

We compared the prevalence of hormone therapy use in the general population (in the southwest part of the county of Skåne, where the city of Malmö is located) according to the National Prescription Survey (6) with the prevalence according to the questionnaire, according to the diary, and according to the questionnaire or the diary. We restricted our analysis to women aged 56–60 years (mean age in the MDCS was 58 years) and to the years 1992–1995 (when the MDCS was being conducted).

Characteristics associated with discrepancy

The odds ratios of discrepancy (i.e., reporting in the questionnaire but not in the diary, or vice versa, of current use of hormone therapy) were calculated as a function of age, smoking status, alcohol consumption, use of anxiolytics/hypnotics or opiates, BMI, marital status, country of origin, and educational level (refer to Materials and Methods) by using logistic regression analysis.
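
For orientation, a logistic model of this kind can be written in its usual textbook form. The notation below is introduced here for illustration and is not taken from the study: D indicates discrepancy, and x_1, ..., x_p are indicator variables for the non-reference categories of the characteristics. The Wald-type confidence interval is an assumption, since the text does not state how the intervals were computed.

```latex
\[
\operatorname{logit}\,\Pr(D = 1 \mid x_1,\dots,x_p)
  = \beta_0 + \sum_{k=1}^{p} \beta_k x_k ,
\qquad
\mathrm{OR}_k = e^{\beta_k},
\qquad
95\%\ \mathrm{CI}_k = \exp\!\bigl(\hat{\beta}_k \pm 1.96\,\mathrm{SE}(\hat{\beta}_k)\bigr).
\]
```

In an unadjusted analysis only the indicators of a single characteristic enter the sum; in an adjusted analysis the indicators of all characteristics enter together.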
The analyses were performed unadjusted for each of the characteristics and were adjusted by entering all eight characteristics as covariates in the same model.

Validity analysis

In the validity analysis, we studied whether the questionnaire produced an accurate result (one that lacked systematic error) by calculating its sensitivity, specificity, and positive and negative predictive values; the personal diary was considered the gold standard. Computations were performed by using the on-line calculator at the Canadian Centres for Health Evidence (Internet site: http://www.cche.net/).

RESULTS

Prevalence of current hormone therapy use

The prevalence of current hormone therapy use among the 16,060 participating women was 18.9 percent when the questionnaire or the diary was considered, 16.3 percent according to the diary only, 17.0 percent according to the questionnaire only, and 14.4 percent when only those instances in which the questionnaire and the diary agreed on use were considered. For women aged 56–60 years, according to the National Prescription Survey (6), the prevalence of current hormone therapy use in the general population was 22.1 percent compared with 24.2 percent according to the questionnaire or the diary, 21.2 percent according to the diary only, and 21.5 percent according to the questionnaire only.

Reliability analysis

Agreement between the two methods was high at 95.5 percent (kappa = 0.840, p < 0.001), and similar values were obtained for specific groups of hormones (table 1). The lowest discrepancy was found for women aged 70–73 years (2.8 percent) and the highest (6.3 percent) for women with high alcohol consumption (table 2). The women tended to report more current hormone therapy use according to the questionnaire, especially those with high alcohol consumption (difference, 2.33 percentage points). However, users of anxiolytics/hypnotics or opiates reported more use according to the diary (difference, -1.23 percentage points).
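
As a check on the reliability figures, the percentage of agreement and the kappa coefficient can be recomputed from the four cell counts given for all hormone therapy in table 1. The following is a minimal sketch under stated assumptions: the function and argument names are hypothetical and introduced here for illustration, and nothing is implied about the software actually used for the kappa calculation. Kappa is symmetric in the two discordant cells, so it does not matter which of them is the questionnaire-only count.

```python
def agreement_and_kappa(both, only_a, only_b, neither):
    """Return (observed agreement, Cohen's kappa) for a 2x2 cross-classification.

    both    -- use reported by both sources
    only_a  -- use reported by source A only
    only_b  -- use reported by source B only
    neither -- use reported by neither source
    """
    n = both + only_a + only_b + neither
    p_obs = (both + neither) / n              # observed agreement
    p_a = (both + only_a) / n                 # proportion "yes" in source A
    p_b = (both + only_b) / n                 # proportion "yes" in source B
    # Agreement expected by chance if the two sources were independent
    p_exp = p_a * p_b + (1 - p_a) * (1 - p_b)
    kappa = (p_obs - p_exp) / (1 - p_exp)
    return p_obs, kappa


# All hormone therapy, counts from table 1 (n = 16,060)
p_obs, kappa = agreement_and_kappa(both=2318, only_a=302, only_b=412, neither=13028)
print(f"agreement = {100 * p_obs:.2f}%, kappa = {kappa:.3f}")
```

With these counts the sketch returns values that match the reported agreement and kappa to within rounding.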

TABLE 1. Reliability and validity of current hormone therapy use, according to the questionnaire and the personal diary, for 16,060 Swedish women from the Malmö Diet and Cancer Study (baseline examination, 1991–1996)

| | Estrogens, No. (%†) | Progestagens, No. (%†) | Progestagens and estrogens, No. (%†) | All hormone therapy, No. (%†) |
|---|---|---|---|---|
| Reliability analysis: use reported in the | | | | |
| Questionnaire only | 231 (1.4) | 108 (0.7) | 176 (1.1) | 302 (1.9) |
| Diary only | 164 (1.0) | 56 (0.3) | 147 (0.9) | 412 (2.6) |
| Both | 969 (6.0) | 227 (1.4) | 1,266 (7.9) | 2,318 (14.4) |
| Neither | 14,696 (91.5) | 15,669 (97.6) | 14,471 (90.1) | 13,028 (81.1) |
| Agreement (kappa) | 0.817* | 0.729* | 0.876* | 0.840* |
| Validity analysis‡ (%) | | | | |
| Sensitivity | 85.5 | 80.2 | 89.6 | 84.9 |
| Specificity | 98.5 | 99.3 | 98.8 | 97.7 |
| Positive predictive value | 80.1 | 67.8 | 87.8 | 88.5 |
| Negative predictive value | 98.9 | 99.6 | 99.0 | 96.9 |

* p < 0.001.
† Percentages are based on the total study sample of 16,060 women.
‡ Personal diary method was considered the gold standard.

Characteristics associated with discrepancy

Our study found a higher discrepancy for women aged 50–59 years compared with those aged 45–49 years; however, the discrepancy was similar or even lower for older women. The unadjusted odds ratio of discrepancy (table 2) decreased at higher BMI values almost in a dose-response type of relation. Compared with married women, widows had a lower and divorced women a higher odds ratio of discrepancy. In the unadjusted analysis, use of anxiolytics/hypnotics or opiates, high alcohol consumption, former smoking, and 9 or more years of education were associated with a higher discrepancy.

The adjusted odds ratios (table 2) were similar to the unadjusted odds ratios. However, the adjusted point estimates moved slightly toward the null value except those for use of anxiolytics/hypnotics or opiates and for age 50–59 years, for which the estimates moved slightly away from the null value.

Validity analysis

The outcome of the validity analysis is presented in table 1. The specificity and sensitivity of the questionnaire were 97.7 and 84.9 percent, respectively.
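
The four validity measures follow directly from the same cross-classification once the diary is treated as the gold standard. The sketch below is illustrative only: the function name and dictionary keys are hypothetical, and the assignment of the two discordant counts for all hormone therapy (302 as reported in the questionnaire only, 412 as reported in the diary only) is an assumption, chosen because it reproduces the validity values tabulated in table 1.

```python
def validity_measures(tp, fp, fn, tn):
    """Validity of a test against a gold standard, from 2x2 cell counts.

    tp -- test positive, gold standard positive
    fp -- test positive, gold standard negative
    fn -- test negative, gold standard positive
    tn -- test negative, gold standard negative
    """
    return {
        "sensitivity": tp / (tp + fn),  # gold-standard users detected by the test
        "specificity": tn / (tn + fp),  # gold-standard non-users classified as such
        "ppv": tp / (tp + fp),          # test-positive women who are true users
        "npv": tn / (tn + fn),          # test-negative women who are true non-users
    }


# Questionnaire as the test, diary as the gold standard (all hormone therapy).
# Assumed cell assignment: fp = questionnaire only, fn = diary only.
all_ht = validity_measures(tp=2318, fp=302, fn=412, tn=13028)
for name, value in all_ht.items():
    print(f"{name}: {100 * value:.1f}%")
```

Swapping the two discordant counts would exchange sensitivity with the positive predictive value and specificity with the negative predictive value, which is why the cell assignment has to be stated explicitly.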
Sensitivity was lowest (80.2 percent) for progestagens and highest (89.6 percent) for progestagens and estrogens in combination (table 1).

DISCUSSION

Compared with a costly 7-day personal diary, a simple, self-administered questionnaire that included a single open-ended question on drug use had a high reliability and a high validity for assessing current hormone therapy use. It has been suggested that an appropriate gold standard would be a personal diary in which persons log the specifics of their medications on a daily basis (1). Compared with such a gold standard, both the sensitivity and the specificity of the self-administered questionnaire were high.

Note that our study analyzed only current hormone therapy use. Questionnaires may provide information about past use of medication, but diaries are suitable for current use only. Recall of exposure in the past may be less accurate and less complete (7).

Women who participated in the MDCS, especially those included in the present study who provided complete drug use information in both the questionnaire and the diary, may be a select, more motivated population (e.g., because of family or personal history of disease) for whom agreement is higher. Compared with the women who provided complete information, those whose information was incomplete were older (2 vs. 10 percent in the group aged 70–73 years), more of them used anxiolytics/hypnotics or codeine (4 vs. 9 percent), and more were born in Sweden (83 vs. 88 percent). In addition, compared with the women who completed both the questionnaire and the diary, those who completed only the questionnaire used more anxiolytics/hypnotics or codeine (8 vs. 12 percent), more were within the highest BMI quartile (25 vs. 42 percent), and fewer were married (61 vs. 49 percent).
Therefore, there is a risk of selection or response bias (e.g., if results for the excluded women were more discrepant, the lower discrepancy found for the highest BMI quartile might be an overestimation, as these women were found more often in the excluded group with incomplete information).

TABLE 2. Odds ratios of discrepancy* in self-assessed current hormone therapy use, by characteristics of the 16,060 women from the Swedish cohort of the Malmö Diet and Cancer Study (baseline examination, 1991–1996)

| Characteristic | Unadjusted OR§ | Unadjusted 95% CI§ | Adjusted† OR | Adjusted† 95% CI | Discrepancy (%) | Difference‡ |
|---|---|---|---|---|---|---|
| Age (years) | | | | | | |
| 45–49 | Reference | | | | 3.7 | 0.74 |
| 50–59 | 1.68 | 1.36, 2.05 | 1.77 | 1.43, 2.18 | 6.2 | 0.76 |
| 60–69 | 0.86 | 0.68, 1.10 | [...] | 0.79, 1.30 | 3.3 | 0.44 |
| 70–73 | 0.76 | 0.54, 1.07 | 0.90 | 0.63, 1.29 | 2.8 | 1.03 |
| Smoking status | | | | | | |
| Never | Reference | | | | 3.8 | 0.63 |
| Former | 1.45 | 1.21, 1.74 | 1.36 | 1.13, 1.63 | 5.5 | 1.06 |
| Current | 1.15 | 0.95, 1.38 | 0.99 | 0.81, 1.20 | [...] | 0.43 |
| High alcohol consumption | | | | | | |
| No | Reference | | | | [...] | 0.60 |
| Yes | 1.47 | 1.09, 1.97 | 1.28 | 0.95, 1.72 | 6.3 | 2.33 |
| Use of anxiolytics/hypnotics or opiates | | | | | | |
| No | Reference | | | | 4.3 | 0.87 |
| Yes | 1.38 | 1.09, 1.75 | 1.51 | 1.19, 1.93 | 5.9 | -1.23 |
| Body mass index (kg/m^2) | | | | | | |
| <22.6 | Reference | | | | 5.2 | 1.00 |
| >22.6 to <24.8 | [...] | 0.84, 1.24 | [...] | 0.84, 1.25 | 5.3 | 0.92 |
| >24.8 to <27.8 | 0.75 | 0.61, 0.93 | 0.79 | 0.64, 0.98 | 4.0 | 0.75 |
| >27.8 | 0.62 | 0.50, 0.78 | 0.65 | 0.52, 0.82 | 3.3 | 0.07 |
| Marital status | | | | | | |
| Married | Reference | | | | [...] | 0.73 |
| Unmarried | 0.96 | 0.74, 1.27 | 0.93 | 0.71, 1.23 | 4.3 | 0.20 |
| Divorced | 1.20 | 1.00, 1.44 | 1.15 | 0.96, 1.39 | 5.3 | 0.70 |
| Widowed | 0.70 | 0.52, 0.94 | 0.83 | 0.61, 1.12 | 3.1 | 0.80 |
| Country of origin | | | | | | |
| Foreign | Reference | | | | 4.0 | 0.62 |
| Sweden | 1.14 | 0.91, 1.43 | 1.11 | 0.87, 1.41 | 4.5 | 0.69 |
| Educational level (years) | | | | | | |
| <9 | Reference | | | | 3.7 | 0.27 |
| 9–11 | 1.35 | 1.13, 1.61 | 1.21 | 1.01, 1.46 | 4.9 | 0.90 |
| >11 | 1.40 | 1.15, 1.70 | 1.18 | 0.95, 1.45 | 5.1 | 1.08 |

* Reported in the questionnaire but not in the personal diary, or vice versa.
† Adjusted for all other variables.
‡ Difference between the percentage of women reporting use according to the questionnaire only and the percentage reporting use according to the diary only; a positive difference indicates more reported use according to the questionnaire, and a negative difference indicates more reported use according to the personal diary.
§ OR, odds ratio; CI, confidence interval.

Furthermore, the MDCS may not be representative of the general population. The MDCS questionnaire was also used in a mailed health survey carried out in Malmö in 1994 (8). The response rate for the mailed survey was 74.6 percent, nearly twice the 41 percent MDCS participation rate. Compared with the mailed survey, the MDCS included a higher proportion of employed persons and subjects born in Sweden and a lower proportion of retired subjects.
However, in both studies, participants were similar with regard to educational level, type of employment, marital status, smoking habits, and weight distribution. Therefore, MDCS participants could be regarded as reasonably representative of the general population. Moreover, data from the National Prescription Survey (6) suggest that the MDCS rather reliably estimated the prevalence of hormone therapy use in the general population.

Differences in payment or prescription practices might have influenced hormone therapy use. However, payment practices probably did not influence the results; in Sweden, the government reimburses a major part of the cost of prescription drugs. Therefore, during the baseline period, the total cost per year for hormone therapy could not exceed SEK 450 (US $56). On the other hand, different local therapeutic traditions might have led to differences in hormone therapy use between women from different areas.

Nondifferential misclassification of covariates may bias the estimate either toward or away from the null value (9) and may lead to a partial loss of ability to control confounding (10). Therefore, it is necessary to consider the degree of misclassification when deciding whether to control a covariate (11). Of the covariates studied, high alcohol consumption may be more prone to misclassification (12). Also, continuous age and BMI variables were categorized in our
analysis, which may have introduced misclassification. Excluding the alcohol variable or using continuous age and BMI did not affect point estimates in multivariate analyses.

As the questionnaire and the personal diary were completed during the same 2-week period, it is possible that recording of information in the diary may have influenced memory when the questionnaire was being completed. This bias would have overestimated the agreement. However, as neither the questionnaire nor the personal diary focused on drug use, this potential source of bias probably had a limited effect on the results.

Compared with the women aged 45–49 years, the discrepancy for the group aged 50–59 years was higher; for older women in our study, the discrepancy was similar or even lower. Other studies (13) have observed a higher agreement for younger subjects, but these studies compared self-reported questionnaire data with pharmacy databases. Older women are prescribed more drugs; therefore, according to database information, they appear to use more drugs. However, underuse is particularly common in the elderly, and pharmacy databases may overreport current hormone therapy use (14).

The higher discrepancy for users of anxiolytics/hypnotics or opiates could be due to impaired memory recall (15), which also could explain why these women reported in the personal diary more current use of hormone therapy (table 2).

Higher educational level also was associated with a higher discrepancy. In the literature, results conflict when interviews are compared with physicians' records (16) or with prescription databases (17). Nevertheless, our study was based on two self-reported information sources, and highly educated women may have reported in the questionnaire the drugs they purchased rather than the drugs they actually used.

There could be a reasonable explanation for the observed discrepancy.
Those women who reported in the questionnaire but not in the diary that they currently used hormone therapy might have been reporting during a period of low compliance.

In summary, compared with a 7-day personal diary, a more convenient, self-administered questionnaire that includes an open-ended question on drug use seems to be a valid method for assessing current hormone therapy use in large epidemiologic studies.

ACKNOWLEDGMENTS

This study was supported by the MDCS project, the NEPI Foundation, an ALF-medel government grant (Dnr M: E 39 390/98 (Dr. Merlo)), and the Swedish Institute for Public Health.

REFERENCES

1. West SL, Savitz DA, Koch G, et al. Recall accuracy for prescription medications: self-report compared with database information. Am J Epidemiol 1995;142:1103–12.
2. Berglund G, Elmstahl S, Janzon L, et al. The Malmö Diet and Cancer Study. Design and feasibility. J Intern Med 1993;233:45–51.
3. Capellà D. Descriptive tools and analysis. In: Dukes MNG, ed. Drug utilisation studies. Methods and uses. Copenhagen, Denmark: World Health Organization Regional Office for Europe, 1993:55–78. (WHO regional publication; European series, no. 45).
4. Maclure M, Willett WC. Misinterpretation and misuse of the kappa statistic. Am J Epidemiol 1987;126:161–9.
5. Last JM, ed. A dictionary of epidemiology. 3rd ed. New York, NY: Oxford University Press, 1995.
6. Wessling A. The National Prescription Survey. A database for drug utilization studies in Sweden: results and experiences from the 1970s and 1980s. Doctoral thesis. Stockholm University, Stockholm, Sweden, 1990.
7. West SL, Strom BL. Validity of pharmacoepidemiology drug and diagnosis data. In: Strom BL, ed. Pharmacoepidemiology. 2nd ed. Chichester, England: John Wiley & Sons Ltd, 1994:549–80.
8. Lindström M, Bexell A, Hanson BS, et al. Hälsoläget i Malmö. Rapport från postenkätundersökningen våren 1994. (The health situation in Malmö 1994). (In Swedish). Malmö, Sweden: Socialmedicinska enheten, Universitetssjukhuset MAS, 1995.
9. Fung KY, Howe GR. Methodological issues in case-control studies. III: the effect of joint misclassification of risk factors and confounding factors upon estimation and power. Int J Epidemiol 1984;13:366–70.
10. Greenland S. The effect of misclassification in the presence of covariates. Am J Epidemiol 1980;112:564–9.
11. Greenland S, Robins JM. Confounding and misclassification. Am J Epidemiol 1985;122:495–506.
12. Brownson RC, Davis JR, Chang JC, et al. A study of the accuracy of cancer risk factor information reported to a central registry compared with that obtained by interview. Am J Epidemiol 1989;129:616–24.
13. Van den Brandt PA, Petri H, Dorant E, et al. Comparison of questionnaire information and pharmacy data on drug use. Pharm Weekbl [Sci] 1991;13:91–6.
14. Hartzema AG, Perfetto EM. Sources and effects of drug exposure and unintended effect misclassification in pharmacoepidemiologic studies. In: Hartzema AG, Porta MS, Tilson HH, eds. Pharmacoepidemiology, an introduction. 2nd ed. Cincinnati, OH: Harvey Whitney Books Co, 1991:176–206.
15. Hanlon JT, Horner RD, Schmader KE, et al. Benzodiazepine use and cognitive function among community-dwelling elderly. Clin Pharmacol Ther 1998;64:684–92.
16. Goodman MT, Nomura AM, Wilkens LR, et al. Agreement between interview information and physician records on history of menopausal estrogen use. Am J Epidemiol 1990;131:815–25.
17. West SL, Savitz DA, Koch G, et al. Demographics, health behaviors, and past drug use as predictors of recall accuracy for previous prescription medication use. J Clin Epidemiol 1997;50:975–80.