
An Examination of the Quality and Utility of Interviewer Estimates of Household Characteristics in the National Survey of Family Growth

Brady T. West
Michigan Program in Survey Methodology
Institute for Social Research
University of Michigan - Ann Arbor
bwest@umich.edu

NSFG Survey Methodology Working Papers, April 2010

ABSTRACT

Effective methods for repairing nonresponse error are of primary interest to the field of survey methodology, given declining response rates in household surveys of nearly all formats. Post-survey methods for repairing nonresponse error rely on the presence of auxiliary variables on a sampling frame for both respondents and non-respondents, and much methodological work has shown that the best auxiliary variables for repairing nonresponse errors are related to both the survey variables of interest and response propensity. Unfortunately, auxiliary variables having these optimal properties are rare in survey research practice. In Cycle 7 of the National Survey of Family Growth (NSFG), female interviewers performing household screening operations were asked to record their best guesses as to whether there were children under the age of 15 in the household (35,258 guesses by 96 interviewers, prior to the screening questions), and whether the selected respondent was in a sexually active relationship with a member of the opposite sex (13,495 guesses by 94 interviewers, after the completed screening questions). Given that correct values on these two indicators can be derived from completed household listings and responses to the main NSFG interview, this study sought to examine the amount of error in the two interviewer estimates, and the associations of the interviewer estimates with both key NSFG variables and the propensity to respond to the main NSFG interview, given a completed screening interview. Several significant associations were found, suggesting that these interviewer estimates may be useful for repairing nonresponse errors. However, a small simulation study shows that the level of estimation error in the NSFG may have a negative impact on potential reductions in nonresponse error. The study concludes with a discussion of estimation techniques used by highly accurate interviewers and future research in this area aimed at improving the quality of the interviewer estimates.

INTRODUCTION

This paper presents an initial examination of the error properties associated with interviewer judgments of household characteristics in the National Survey of Family Growth (NSFG), and evaluates the utility of these paradata (Couper, 1998) for constructing post-survey nonresponse adjustments to NSFG estimates. The paper also simulates the implications of these errors for the bias properties of nonresponse adjustments, and reports the observational techniques used by interviewers who tend to be more accurate in their judgments.

Effective and inexpensive methods for repairing nonresponse error are of primary interest to the field of survey methodology, given declining response rates in large household surveys of nearly all formats (De Leeuw and De Heer, 2002). One relatively inexpensive method for repairing unit nonresponse errors is to make post-survey adjustments to base sampling weights (if applicable) by grouping respondents and nonrespondents into weighting classes, and adjusting the weights based on inverses of estimated response rates (or response propensities when using logistic regression modeling) within the weighting classes. This adjustment method, however, relies on auxiliary variables (or covariates) measured for both respondents and nonrespondents, and methodological work has shown that the best auxiliary variables for repairing nonresponse errors are related to both the survey variables of interest and response propensity (Groves, 2006; Little and Vartivarian, 2005). Further, gains in the precision of survey estimates are also possible when using auxiliary variables that are correlates of survey variables of interest (Kreuter et al., 2010; Little and Vartivarian, 2005). Unfortunately, auxiliary variables having these optimal properties are rare in survey research practice (Kreuter et al., 2010). As a result, large survey research programs have turned to the collection of paradata (Couper, 1998), or variables describing interviewer observations and other measurements about the survey data collection process, from both respondents and nonrespondents.
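To make the weighting-class idea concrete, the following minimal R sketch adjusts base weights by the inverse of the weighted response rate within each class. The toy data frame and the variable names (base_wt, resp, wt_class) are hypothetical and are not taken from the NSFG files; the sketch only illustrates the general form of the adjustment described above.

```r
# Minimal sketch of a weighting-class nonresponse adjustment (toy data).
dat <- data.frame(
  base_wt  = c(100, 100, 150, 150, 200, 200, 200, 250),
  resp     = c(1, 0, 1, 1, 0, 1, 1, 0),      # 1 = responded to the main interview
  wt_class = c("A", "A", "A", "B", "B", "B", "C", "C")
)

# Weighted response rate within each class: respondent base weight total
# divided by the total base weight in the class.
num <- tapply(dat$base_wt * dat$resp, dat$wt_class, sum)
den <- tapply(dat$base_wt, dat$wt_class, sum)
rr  <- num / den

# Nonresponse-adjusted weight: respondents carry the weight of the
# nonrespondents in their class; nonrespondents get weight zero.
dat$nr_adj_wt <- ifelse(dat$resp == 1, dat$base_wt / rr[dat$wt_class], 0)
dat
```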

For example, working with National Health Interview Survey (NHIS) data collected in 2006 and 2007 and NHIS sample families contacted at least once, Maitland et al. (2009) describe an analysis of such variables from the U.S. Census Bureau's Contact History Instrument (CHI), which are available on the NHIS paradata files. Analyzing both individual variables from the paradata files (measuring cooperation and contactability of the families) and factor scores derived from the variables, these authors showed that the variables tended to have much stronger correlations with survey participation than with the NHIS variables of interest. Given that the variables measured in the CHI are not specifically designed to measure health status but rather to manage field operations, variables having a stronger theoretical relationship with the survey variables of interest would seem to be of more importance when collecting paradata. Indeed, Maitland et al. presented evidence of a health-related variable in the available NHIS paradata (breaking off an interview for health reasons) having a stronger correlation with both survey participation and selected health variables measured in the survey than other theoretically unrelated CHI variables. This work suggested that collecting paradata on variables having a theoretical relationship with survey variables of interest would certainly be recommended for making nonresponse adjustments based on the paradata. Kreuter et al. (2010) present examples of other large survey research programs attempting to collect paradata from both respondents and nonrespondents on other theoretical correlates of key survey variables, including the European Social Survey (ESS) and the American National Election Study (ANES).

The Continuous National Survey of Family Growth (NSFG) is another example of a large survey research operation that has used paradata extensively for production and estimation work (Groves et al., 2009). Beginning in Cycle 7 of the NSFG (2006 to present), interviewers were requested during screening operations to estimate whether children under the age of 15 were present in the household and whether or not selected respondents were in sexually active relationships. These two variables were thus collected from both respondents and nonrespondents to the eventual main NSFG interview during household screening visits, as a part of a larger paradata-driven responsive survey design

(Groves and Heeringa, 2006). The two variables have the important property of being theoretically (assuming no measurement error) correlated with a variety of key variables in the NSFG. In fact, a recent study (Kreuter et al., 2010) has demonstrated that these variables have better correlations with NSFG survey variables of interest than similar paradata collected in four other large-scale personal interview surveys, and that these correlations can lead to moderate changes in NSFG estimates after applying nonresponse adjustments based on the auxiliary variables. The present study was motivated in part by previous attempts to use similar forms of paradata for making nonresponse adjustments in a national transportation survey (Yan and Raghunathan, 2007), the National Election Study (Peytchev and Olson, 2007), and the European Social Survey (Kreuter, Lemay and Casas-Cordero, 2007).

Unfortunately, the correlations of these interviewer observations with both response propensity and survey variables, along with the corresponding nonresponse adjustments, may be attenuated by measurement error in the auxiliary variables. Potential reductions in nonresponse error may not be realized if these variables are measured with too much error. The impact of measurement error in auxiliary variables on the bias of estimated regression coefficients in linear regression models has been well established (e.g., Fuller, 1987), and this bias carries over to logistic regression models used for response propensity modeling when making nonresponse adjustments based on predicted response propensities (Stefanski and Carroll, 1985). To date, only one study has directly examined the error associated with the measurement of these types of auxiliary variables (Groves et al., 2007), based on survey self-reports. These authors analyzed data from the first four quarters of Cycle 7 of the NSFG, and found only 70-80% accuracy on the sexual activity judgment, with evidence of over-estimation of sexual activity. One can therefore assume that the measurement error on these observations will not be negligible, especially given that the data are based on interviewer judgments.
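The attenuation that such measurement error can induce is easy to illustrate with simulated data. The short R sketch below (not based on NSFG data) generates a binary outcome related to a true binary auxiliary variable, misclassifies roughly a quarter of the auxiliary values to mimic interviewer estimation error, and compares the two fitted logistic regression coefficients; all variable names here are hypothetical.

```r
set.seed(42)
n <- 50000
x_true <- rbinom(n, 1, 0.7)                           # true binary auxiliary variable
y      <- rbinom(n, 1, plogis(-0.5 + 1.5 * x_true))   # outcome related to the true value

# Misclassify about 25% of the auxiliary values (error-prone observation).
flip  <- rbinom(n, 1, 0.25)
x_obs <- ifelse(flip == 1, 1 - x_true, x_true)

coef(glm(y ~ x_true, family = binomial))["x_true"]    # near the true coefficient of 1.5
coef(glm(y ~ x_obs,  family = binomial))["x_obs"]     # attenuated toward zero
```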

Two other, more indirect studies of measurement error in these types of observations, focusing on item-missing data and inter-rater reliability, also provide empirical support for the presence of errors (Casas-Cordero and Kreuter, 2008; Kreuter et al., 2007). No studies to date, however, have considered the impacts of these errors on the bias properties of subsequent nonresponse adjustments, and this study aims to extend the initial work of Groves et al. (2007) in this manner. The NSFG affords the opportunity to study this error and its implications for nonresponse adjustments in more detail, given that measures on these two variables are collected from respondents as a part of the main NSFG interview. The objectives of this study are to examine the amount of error associated with these types of observations, consider the implications of this error for nonresponse adjustments based on the paradata, present observation techniques used by the interviewers with the most accurate observations, and discuss methods that may be useful for reducing the measurement error in future investigations.

METHODS

Data

Data collected during the first 10 completed quarters of the NSFG (July 2006 through December 2008) were analyzed in this study, building on the work of Groves et al. (2007). Screening interviews are necessary in the NSFG to determine the eligibility of individuals in randomly selected households, given that the target population is non-institutionalized U.S. males and females aged 15-44. Additional details on the design of the NSFG, which has a primary goal of collecting nationally representative data on factors affecting birth and pregnancy rates, family formation, and the risks of HIV and other STDs, can be found elsewhere (Groves et al., 2009). Prior to the first face-to-face contact attempt with a randomly selected household for screening purposes, female interviewers [1] were instructed to first locate the household and then estimate whether the selected household contained any children under the age of 15 (yes / no).

[1] The NSFG does not employ male interviewers for data collection.

In the data set constructed for analyzing the amount of error in these observations for this study, there were a total of 35,258 observations on the presence of young children reported by 96 interviewers. For each of these observations, completed household roster information was available to determine whether children under the age of 15 were actually present in the household (observations with missing household roster information were deleted). There was certainly a possibility of error in the household enumeration process, but for the purposes of this study, completed household rosters were assumed to be correct. Immediately after the successful completion of the full screening questionnaire and the selection of a respondent from a household for the main interview, interviewers were asked to estimate whether the selected respondent was in a sexually active relationship with an opposite-sex partner (yes / no). There were a total of 13,490 judgments of sexual activity reported by 94 interviewers for which actual survey information on sexual activity was also available from the CAPI portion of a completed main interview. These numbers were reduced relative to the observations on young children because the true value for the variable indicating the presence of young children under the age of 15 could be measured after the household roster was completed, and did not require information from the main interview. Measurement error on the self-report of sexual activity collected in the main NSFG interview was also a real possibility, but was not considered further in this study.

These two interviewer judgments (the presence of children under age 15 and whether the selected respondent was sexually active) were therefore collected for both respondents and nonrespondents to the main NSFG interview request. For the purposes of making nonresponse adjustments for the main interview in practice, one would certainly prefer to use the correct household roster indicator for young children; in this study, error properties of the interviewer observation on this indicator and their implications for nonresponse adjustments were considered. In total, there were 15,044 completed screening interviews where these two observations were available for potential respondents and nonrespondents, and a main interview was either completed or not

completed (more than 4,600 cases with completed screening interviews that did not respond in the first 10 weeks of the quarter and were not randomly selected for the NSFG second phase were dropped). A small subset of the more accurate interviewers (based on the household rosters and the responses to the main survey) was later approached by the NSFG field supervisor and asked to describe, in an open-ended manner, the observational techniques that they tended to use for making their judgments.

In Quarters 1-10 of the NSFG, main interviews were completed by a total of 13,495 respondents. Five of these respondents had missing data on the variables necessary to determine a reported value of current sexual activity, resulting in the 13,490 sample persons with sufficient data for studying measurement error. Given the complex multistage sample design of the NSFG, base weights were necessary to offset unequal probabilities of selection, and these were included on the data file for each of the respondents. In addition, the NSFG includes a second phase, or double sample, operation in which a subsample of initial nonrespondents from the first 10 weeks of a quarter receives an alternative and more intensive data collection protocol aimed at boosting response rates for the quarter (Lepkowski et al., 2010). The base sampling weights for second phase respondents therefore required an adjustment for this subsampling. Sampling error codes produced by NSFG staff enabling complex sample variance estimation (Lepkowski et al., 2010) were also used for the design-based analyses presented in this study.

Survey variables collected in the main NSFG interview that were analyzed in this study included: 1) a binary indicator of whether the respondent had never been married; 2) a binary indicator of whether the respondent had ever had sex; 3) a binary indicator of whether the respondent had ever cohabitated with a partner; 4) the number of sexual partners in the past year; 5) for males, a count of biological children; and 6) for females, parity, or the number of live births. Male and female respondents to the main interview were coded as being sexually active if reporting one or more opposite-sex partners in the past 12 months. Female respondents were also asked about having a current opposite-sex partner, and this measure was used to indicate being sexually active if no information was

available on the number of partners in the past 12 months. This information was used to determine the amount of error in the interviewer judgments regarding the sexual activity of selected respondents.

Data Analysis

Simple two-way cross-tabulations and unweighted Kappa statistics were used to examine overall agreement of the binary interviewer judgments with actual binary measures collected from the household roster information and the survey data. Two gross difference rates (GDRs), measuring the proportion of observations by an interviewer on each variable that were discordant with actual values (e.g., Biemer, 2004, p. 229), were computed for each interviewer. One example of a discordant observation would be a sampled male whom the interviewer estimates to be sexually active but who does not report having a female sex partner in the past 12 months in the main NSFG interview. To examine associations of the two interviewer observations with propensity to respond to the main NSFG interview, two logistic regression models were fitted. These models used response to the main interview conditional on a completed screening interview as a binary dependent variable, the 15,044 successful screening interviews with interviewer observations available as the case base, the second phase sampling weight as a weight for each case (equal to 1 for respondents from the first phase), and a series of predictors identified as important in previous response propensity models for the main NSFG interview (Lepkowski et al., 2006) [2]. The first model considered all of the predictors of response propensity from previous work, including a variety of paradata. The second model added the interviewer judgments on sexual activity and young children as predictors, to analyze their independent ability to predict response to the main interview.

[2] In the final response propensity models used for computing nonresponse adjustments for the main NSFG interview in Cycle 7 (conditional on a completed screening interview), separate weighted logistic regression models were fitted in five strata defined by age group and the interviewer estimate of sexual activity (two age-group strata for cases judged not sexually active and three for cases judged sexually active) (Lepkowski et al., 2010). This stratification was performed after initial exploratory analyses found variance in response propensity by both age and the estimate of sexual activity, in addition to variance in the relationships of other predictors with response propensity across the strata. The main effect models considered in this study are for illustrative purposes only.
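A minimal R sketch of these agreement measures is shown below. It assumes a case-level data frame obs with hypothetical columns iwer_id, judgment (the 0/1 interviewer estimate), and truth (the 0/1 value from the roster or main interview); the kappa statistic is computed directly from the agreement table rather than with a dedicated package.

```r
# Unweighted Cohen's kappa from a 2 x 2 table of judgments vs. true values.
kappa_stat <- function(judgment, truth) {
  tab <- table(judgment, truth) / length(judgment)
  p_obs    <- sum(diag(tab))                     # observed agreement
  p_chance <- sum(rowSums(tab) * colSums(tab))   # agreement expected by chance
  (p_obs - p_chance) / (1 - p_chance)
}

# Overall agreement for one judgment (e.g., presence of children under 15).
kappa_stat(obs$judgment, obs$truth)

# Gross difference rate (GDR) for each interviewer: the proportion of that
# interviewer's judgments that are discordant with the true values.
gdr <- tapply(obs$judgment != obs$truth, obs$iwer_id, mean)
summary(gdr)   # spread of accuracy across interviewers
```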

Given that the relationships of both binary and count survey variables from the main NSFG interview with the two interviewer observations were of interest in this study, design-based logistic and Poisson regression models were fitted to the six different survey variables under consideration (e.g., Heeringa et al., 2010). Predictor variables in these models included the two interviewer observations, along with the same control variables used in the response propensity models (again to determine whether these two variables have independent predictive power, only now for the survey variables). These analyses represent an important first step in analyzing nonresponse bias (Maitland et al., 2009; Peytcheva and Groves, 2009), and assume that the associations of the survey variables with the interviewer observations are the same for both respondents and nonrespondents. The available NSFG data did not permit testing this assumption.

Finally, to examine the impact of including the interviewer judgments in the nonresponse adjustments on the estimated means and estimated variances for the key survey variables, design-based estimates of means and percentages on the six survey variables (and their standard errors) were then computed using the base weights, nonresponse-adjusted base weights with adjustments excluding the interviewer judgments, and nonresponse-adjusted base weights with adjustments including the interviewer judgments.
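A sketch of this kind of design-based comparison using the R survey package (which the paper notes was used for other analyses) appears below. The respondent file resp, the sampling error codes sest and secu, and the three weight variables are all hypothetical names standing in for the NSFG file structure.

```r
library(survey)

# One design object per weight variable; strata and cluster (PSU) identifiers
# come from the NSFG sampling error codes.
des_base <- svydesign(ids = ~secu, strata = ~sest, weights = ~base_wt,
                      data = resp, nest = TRUE)
des_nr   <- svydesign(ids = ~secu, strata = ~sest, weights = ~nr_wt_no_iw,
                      data = resp, nest = TRUE)
des_nriw <- svydesign(ids = ~secu, strata = ~sest, weights = ~nr_wt_iw,
                      data = resp, nest = TRUE)

# Design-based estimates (and standard errors) of one key variable under
# the three alternative sets of weights described above.
svymean(~never_married, des_base)
svymean(~never_married, des_nr)
svymean(~never_married, des_nriw)
```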

RESULTS

Overall Quality of the Interviewer Judgments

Considering first a comparison of the overall quality of the two interviewer judgments, interviewers appeared to be less accurate when judging whether children under the age of 15 were present in selected households (Table 1). Roughly 73% (i.e., 59.63% + 12.90%) of these judgments were correct based on the household roster information collected in the screening interviews (Kappa = 0.30), and among the errors, there was a slight tendency for more false negatives (15.00%) than false positives (12.47%).

Table 1: Case counts and overall percentages indicating the measurement error properties of interviewer judgments regarding the presence of children under the age of 15 in selected households (Quarters 1-10, Continuous NSFG)*

                                         Household Roster Indicator: Kids Age < 15
Interviewer Judgment: Kids Age < 15      No                 Yes                Totals
No                                       21,025 (59.63%)    5,289 (15.00%)     26,314 (74.63%)
Yes                                      4,395 (12.47%)     4,549 (12.90%)     8,944 (25.37%)
Totals                                   25,420 (72.10%)    9,838 (27.90%)     35,258 (100.00%)

* Note: Kappa Statistic = 0.30, 95% CI = (0.29, 0.31).

Interviewers had an easier time estimating sexual activity (Table 2), with overall accuracy approaching 78% (Kappa = 0.34). Results are presented both overall and by gender, to see if the accuracy of this judgment varied depending on the gender of the selected respondent. Roughly 79% of these judgments were accurate when considering selected female respondents (Kappa = 0.35), consistent with findings reported by Groves et al. (2007) based on the first four quarters of data collection from the Continuous NSFG. Accuracy of these judgments was slightly lower when considering male respondents (roughly 76%; Kappa = 0.32), and the Kappa statistics for males and females were not found to be significantly different. Errors on the sexual activity observations had a slightly higher tendency to be false positives, where interviewers guessed that selected respondents were sexually active when in fact they were not. These analyses assume that self-reports of sexual activity in the main NSFG interview (collected using CAPI) are accurate.

Table 2: Case counts and overall percentages indicating measurement error in interviewer judgments of whether selected respondents were sexually active, both overall and by gender of selected respondent (Quarters 1-10, Continuous NSFG)*

All Respondents
                                              Main NSFG Interview: Selected R Sexually Active
Interviewer Judgment: R Sexually Active       No                 Yes                 Totals
No                                            1,358 (10.07%)     1,290 (9.56%)       2,648 (19.63%)
Yes                                           1,703 (12.62%)     9,139 (67.75%)      10,842 (80.37%)
Totals                                        3,061 (22.69%)     10,429 (77.31%)     13,490 (100.00%)

Female Respondents
No                                            689 (9.37%)        668 (9.08%)         1,357 (18.45%)
Yes                                           858 (11.67%)       5,140 (69.88%)      5,998 (81.55%)
Totals                                        1,547 (21.03%)     5,808 (78.97%)      7,355 (100.00%)

Male Respondents
No                                            669 (10.90%)       622 (10.14%)        1,291 (21.04%)
Yes                                           845 (13.77%)       3,999 (65.18%)      4,844 (78.96%)
Totals                                        1,514 (24.68%)     4,621 (75.32%)      6,135 (100.00%)

* Notes: Overall Kappa Statistic = 0.34, 95% CI = (0.32, 0.35). Male Kappa Statistic = 0.32, 95% CI = (0.30, 0.35). Female Kappa Statistic = 0.35, 95% CI = (0.32, 0.37). Test for Equal Kappa Statistics: Chi-square(1) = 1.38, p = 0.24. Overall n = 13,490 (5 cases, or 1 female and 4 males, had missing data on the survey variable in the main NSFG interview).

Interviewer-Specific Measures of Quality

Figure 1 presents a scatter plot showing the association of the two GDRs computed for each interviewer. Each point corresponds to a single interviewer, and the gross difference rates (GDRs) for both interviewer observations define the horizontal and vertical axes. This plot allows for an examination of whether the same interviewer tends to do well on both estimation tasks. A weighted scatter plot smoother [3] was fitted to the GDRs, where the weights (and the sizes of the points in Figure 1) were proportional to the number of judgments on the presence of young children (i.e., a proxy for the number of screening interviews attempted by each interviewer). Figure 1 shows that there is not consistent evidence of interviewers doing poorly or doing well on both observations, as would be indicated by a linear association between the GDRs, and the Pearson correlation (r) of the two GDRs was not significant at the 5% level. Accuracy tended to vary depending on the judgment and the interviewer, suggesting that the same interviewer may be using different strategies to make the two observations.

[3] The survey package in the R software was used for this analysis.
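A weighted smoother of this kind can be sketched in R as follows. The paper used the survey package for the smoother; the sketch substitutes base loess with case weights as a simpler stand-in, and the data frame iwer with columns gdr_kids, gdr_sex, and n_kids_obs (per-interviewer GDRs and judgment counts) is hypothetical.

```r
# Scatter plot of per-interviewer GDRs, point sizes proportional to the
# number of judgments on children under 15 (a proxy for attempted screeners).
plot(iwer$gdr_kids, iwer$gdr_sex,
     cex = 3 * sqrt(iwer$n_kids_obs / max(iwer$n_kids_obs)),
     xlab = "GDR: children under 15 judgment",
     ylab = "GDR: sexual activity judgment")

# Weighted scatter plot smoother (loess with the judgment counts as weights).
fit <- loess(gdr_sex ~ gdr_kids, data = iwer, weights = n_kids_obs)
ord <- order(iwer$gdr_kids)
lines(iwer$gdr_kids[ord], fitted(fit)[ord])

# Pearson correlation of the two GDRs across interviewers.
cor.test(iwer$gdr_kids, iwer$gdr_sex)
```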

Figure 1: Scatter plot examining the association of interviewer gross difference rates (GDRs) on the judgment of sexual activity and the presence of children under age 15 in the household. Two interviewers with relatively low GDRs on both observations are highlighted with arrows, and a weighted smoother is fitted to the points. Sizes of points (weights) are based on the number of initial observations on children under 15 (representing attempted screening interviews).

Two interviewers with relatively low GDRs on both measures (indicating relatively high accuracy) are highlighted with arrows in Figure 1. One interviewer had 441 housing unit observations on the presence of children under 15, and was incorrect on only 65 of the observations (85% accuracy); this same interviewer also had 155 observations on sexual activity (after a completed screener) and was incorrect on only 27 of them (83% accuracy). The second interviewer had 1,323 housing unit observations on the presence of children under 15, and was incorrect on 307 of them (77% accuracy); this same interviewer also had 487 observations on sexual activity and was incorrect on only 71 of them (85% accuracy). In contrast, one of the more poorly performing interviewers on

both measures was incorrect on 113 out of 261 young children observations (57% accuracy), and 32 out of 96 sexual activity observations (67% accuracy). Figure 1 shows evidence of interviewer variance in the accuracy of these two observations, which suggests that interviewers may vary in terms of their observational strategies. Variance in accuracy between interviewers may also arise as a function of the difficulty of the PSU being worked by an interviewer (e.g., urban areas without yards might make it harder to see children's toys). If the errors in these observations are in fact having a negative impact on subsequent nonresponse adjustments based on the judgments, then methods for reducing the discrepancies in accuracy between interviewers certainly require additional research. A later section will consider some of the observational techniques used by the more accurate interviewers.

Associations of Interviewer Judgments with Response Propensity

For an auxiliary variable to be effective in constructing nonresponse adjustments, it must first have a significant association with response propensity. Table 3 presents estimates of the odds ratios (along with 95% confidence intervals for the odds ratios) in the two logistic regression models predicting propensity to respond to the main NSFG interview, conditional on a completed screening interview. The first set of estimates (Model 1) is from the fitted model excluding the two interviewer judgments, while the second set of estimates (Model 2) is from the second model including the two interviewer judgments.

Table 3: Main interview response propensity modeling results, showing significant predictors of response propensity in models excluding (Model 1) and including (Model 2) the interviewer judgments (Continuous NSFG, Quarters 1-10)

Predictor                                              Model 1 95% CI      Model 2 95% CI
Physical Impediments to HH                             (1.076, 1.488)      (1.086, 1.502)
Call Number                                            (0.875, 0.894)      (0.874, 0.893)
Number of Contacts                                     (1.499, 1.684)      (1.480, 1.663)
Black Respondent                                       (0.921, 1.282)      (0.911, 1.268)
Quarter 1                                              (0.416, 0.665)      (0.429, 0.689)
Quarter 2                                              (0.667, 1.108)      (0.675, 1.123)
Quarter 3                                              (0.969, 1.613)      (0.985, 1.645)
Quarter 4                                              (0.980, 1.638)      (1.014, 1.699)
Quarter 5                                              (1.112, 1.855)      (1.167, 1.953)
Quarter 6                                              (0.858, 1.410)      (0.886, 1.453)
Quarter 7                                              (0.732, 1.171)      (0.796, 1.276)
Quarter 8                                              (0.863, 1.414)      (0.886, 1.453)
Quarter 9                                              (0.574, 0.904)      (0.574, 0.905)
High Estimated Main Interview Probability              (0.693, 1.052)      (0.690, 1.048)
Medium Estimated Main Interview Probability            (0.350, 0.481)      (0.353, 0.485)
Low Estimated Main Interview Probability               (0.154, 0.214)      (0.155, 0.216)
Age (youngest category)                                (1.350, 1.730)      (1.516, 1.965)
Age (middle category)                                  (1.242, 1.691)      (1.230, 1.676)
Urban Neighborhood                                     (0.924, 1.246)      (0.933, 1.260)
Single Household                                       (1.090, 1.529)      (1.210, 1.708)
Interviewer Non-White                                  (0.862, 1.201)      (0.810, 1.131)
Bilingual Interviewer                                  (1.016, 1.312)      (0.984, 1.274)
East Region                                            (0.499, 0.701)      (0.487, 0.686)
Midwest Region                                         (0.911, 1.280)      (0.882, 1.242)
West Region                                            (1.028, 1.420)      (1.084, 1.503)
<10% Black, <10% Hispanic Population in Segment        (0.673, 0.992)      (0.681, 1.004)
>10% Black, <10% Hispanic Population in Segment        (0.697, 1.054)      (0.731, 1.107)
<10% Black, >10% Hispanic Population in Segment        (0.522, 0.761)      (0.516, 0.752)
All Housing Units in Segment Residential               (1.041, 1.299)      (1.035, 1.292)
Interviewer Has Safety Concerns                        (1.038, 1.358)      (1.004, 1.315)
Case Part of Second Phase Sample                       (0.241, 0.306)      (0.239, 0.304)
Interviewer Estimates R Sexually Active                --                  (1.405, 1.837)
Interviewer Estimates Children Under 15 in HH          --                  (1.113, 1.415)

Sample Size                                            15,044              15,044

Reference Categories: Quarter 10; Estimated Main Interview Probability Missing; Age 30-44; Region South; Domain >10% Black, >10% Hispanic Population in Segment.

The estimated odds ratios in these two models show that there are several strong predictors of propensity to respond to the main NSFG interview, with most reflecting theoretical expectations. For example, respondents having received more calls have significantly lower odds of responding to the main interview. One type of paradata that the interviewers are asked to collect is an estimate of the probability that a selected respondent will complete the main interview, and relative to missing observations on this variable (treated as a discrete category), a sample line with a low estimated probability assigned by an interviewer has more than 80% lower odds of completing the main interview.

Of particular interest are the estimates in the second model, where the two interviewer judgment variables are added to the initial model. Both of these variables are significant independent predictors of response propensity: controlling for the other predictors, respondents estimated to be sexually active have 61% higher odds of completing the main interview, while respondents in households estimated to have children under 15 have 26% higher odds of completing the main interview. The overall fit of the model improves when these two predictors are added, with an increase in the area under the curve (AUC). However, it is important to note that predicted response propensities for the 13,495 respondents based on the two models have a very high correlation (0.989), suggesting that impacts on estimates from using one set of response propensities or another for nonresponse adjustments may be relatively minor. We note that the estimated odds ratios that change the most from Model 1 to Model 2 are those for the youngest age category and single households. This suggests that the impacts of these predictors on response propensity may become stronger when adjusting for the interviewer judgments.

Given the possibility of differential measurement error in these two judgments across interviewers, sensitivity of the relationships of these two judgments with response

propensity to the presence of particular interviewers was also examined. Specifically, the second response propensity model in Table 3 was re-fitted multiple times, each time excluding one of the interviewers. This jackknife-type approach was used to examine the range of estimated odds ratios for the two interviewer judgments. The resulting range of estimated odds ratios for sexual activity was (1.532, 1.669), and the resulting range of estimated odds ratios for the presence of young children was (1.222, 1.306). Further, in all cases, the design-based 95% confidence intervals did not include a value of 1. The results indicate that the relationships of these two judgments with response propensity were not heavily influenced by particular interviewers. Collectively, these results suggest that these two variables may be independently useful for the construction of nonresponse adjustments. It remains important to determine whether the two interviewer judgments are also predictive of key survey variables when controlling for the same predictors.
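A leave-one-interviewer-out check of this kind can be sketched in R as follows. The screener-level data frame scr, its column names (resp_main, phase2_wt, judge_sex, judge_kids, iwer_id, and a few illustrative control variables), and the reduced predictor set are all hypothetical; the paper's actual models were design-based and included the full set of Table 3 predictors.

```r
# Illustrative right-hand side; the production model used many more predictors.
f <- resp_main ~ judge_sex + judge_kids + call_number + n_contacts

# Refit the weighted propensity model once per interviewer, dropping that
# interviewer's cases, and keep the odds ratio for the sexual activity judgment.
or_sex <- sapply(unique(scr$iwer_id), function(id) {
  sub <- scr[scr$iwer_id != id, ]
  fit <- glm(f, family = quasibinomial(), data = sub, weights = phase2_wt)
  exp(coef(fit)["judge_sex"])          # judge_sex assumed to be a 0/1 indicator
})

range(or_sex)   # compare with the reported range of (1.532, 1.669)
```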

Associations of Interviewer Judgments with Key NSFG Variables

Table 4 indicates the results of design-based tests of significance for the two interviewer judgments as predictors of the six NSFG variables measured in the main interview. The tests indicate whether or not the regression parameters associated with the two predictors in the six regression models are significantly different from zero when controlling for the other predictors in the response propensity models (the same control variables from Table 3). The estimates of the parameters indicate the directions of the relationships of the two interviewer judgments with either the log-odds of a binary outcome being equal to 1 or the log-mean of the count outcomes (from a Poisson regression model).

Table 4: Design-based estimates of regression parameters (with standard errors) for the two interviewer judgments as predictors of six NSFG variables of interest, contrasted with estimates when the true values are used as predictors*

NSFG Survey Variable (Outcome)                              Judgment: Children Under 15    Judgment: Sexual Activity
                                                            Estimate (SE) [true value]     Estimate (SE) [true value]
Never Been Married (binary, n = 13,495)                     (0.10) [-0.94 (0.13)]          (0.20) [-2.56 (0.21)]
Ever Had Sex (binary, n = 13,495)                           0.18 (0.11) [0.39 (0.17)]      1.70 (0.14) [23.07 (0.24)]
Ever Cohabitated (binary, n = 13,495)                       0.01 (0.10) [0.12 (0.09)]      0.81 (0.13) [2.18 (0.12)]
Number of Biological Children (count, males, n = 6,139)     0.20 (0.05) [0.59 (0.06)]      0.44 (0.11) [1.37 (0.18)]
Number of Sexual Partners in Past Year (count, n = 12,468)  (0.03) [-0.09 (0.04)]          0.21 (0.06) [5.09 (0.82)]
Parity (count, females, n = 7,356)                          0.31 (0.04) [1.16 (0.11)]      0.49 (0.12) [1.09 (0.10)]

* Note: Parameter estimates for the other control variables listed in Table 3 are not shown for each dependent variable. Parameter estimates for the variables containing the true values of sexual activity and presence of young children are displayed in [brackets].

The results in Table 4 clearly show that the interviewer judgment of sexual activity is a strong correlate of these six NSFG survey variables when controlling for the other predictors included in the response propensity models (Table 3). Respondents estimated to be sexually active have significantly lower odds of never having been married, significantly higher odds of ever having had sex, significantly higher odds of ever having cohabitated, significantly more biological children on average (males), significantly more sexual partners in the past year, and significantly more live births (females), all when controlling for the other predictors from Table 3. The interviewer judgment of whether there are children under 15 in the household is not as strong a correlate of these six outcomes, suggesting that its utility may be more limited when constructing nonresponse adjustments. Particularly striking in the Table 4 results are the estimates of these relationships had the variables measuring true values for sexual activity and presence of children under 15 in the household been used in the models instead of the interviewer judgments. The severe attenuation of these relationships due to the measurement error in

the interviewer judgments is clearly evident, which will more than likely impact the effectiveness of nonresponse adjustments based in part on the judgments.

Impacts of Nonresponse Adjustments on Survey Estimates

Collectively, the results in Tables 3 and 4 suggest that the interviewer judgment of sexual activity might be a candidate for an auxiliary variable to be used in constructing nonresponse adjustments for the main NSFG interview, whether using response propensity models or developing weighting classes. At present, this variable is used in computing the final nonresponse adjustments for the initial Continuous NSFG data release (Quarters 1-10), and other work has demonstrated the changes in estimates that are possible given strong correlations of these kinds of auxiliary variables with both response propensity and several key survey variables (Kreuter et al., 2010). Table 5 presents estimates of percentages or means (including design-based estimates of standard errors) on the six key NSFG variables, using three alternative weights: the base weights without nonresponse adjustments, the base weights with nonresponse adjustments based on predicted response propensities excluding the interviewer judgments ("No IW"), and the base weights with nonresponse adjustments based on predicted response propensities including the interviewer judgments ("IW").

Table 5: Impacts of alternative nonresponse adjustments on NSFG estimates (design-based standard errors reported in parentheses)*

Variable (Estimate)                 Base Weights Only    Nonresponse-Adjusted       Nonresponse-Adjusted
                                                         Base Weights, No IW        Base Weights, IW
Never Married (%)                   49.74% (1.31)        49.60% (2.51)              49.34% (2.70)
Had Sex (%)                         84.89% (0.99)        87.01% (1.08)              86.87% (1.12)
Ever Cohabitated (%)                48.92% (1.57)        49.95% (2.65)              49.88% (2.83)
Males: # Biological Kids (Mean)     1.30 (0.07)          1.31 (0.16)                1.30 (0.17)
# Partners in Past Year (Mean)      1.17 (0.02)          1.16 (0.02)                1.16 (0.02)
Females: Parity (Mean)              1.27 (0.05)          1.27 (0.05)                1.26 (0.05)

* Notes: these estimates do not incorporate post-stratification factors and do not represent final estimates based on Quarters 1-10 of NSFG Cycle 7. n = 13,495 for all analyses.
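The response-propensity form of the adjustment behind the "IW" column can be sketched as follows. The sketch reuses the hypothetical screener file scr and variable names introduced earlier and fits a single unstratified model; the production NSFG adjustments were computed from stratified, design-based models (see footnote [2]), so this only illustrates the mechanics.

```r
# Fit a response propensity model that includes the two interviewer judgments.
prop_fit <- glm(resp_main ~ judge_sex + judge_kids + call_number + n_contacts,
                family = quasibinomial(), data = scr, weights = phase2_wt)

# Predicted propensity of completing the main interview for every screened case.
scr$p_hat <- predict(prop_fit, type = "response")

# Nonresponse-adjusted weight: respondents' base weights are inflated by the
# inverse of the predicted propensity; nonrespondents leave the analysis file.
scr$nr_wt_iw <- ifelse(scr$resp_main == 1, scr$base_wt / scr$p_hat, NA)
```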

The three estimates presented in Table 5 for each of the six NSFG variables suggest that nonresponse adjustments including the two interviewer observations are not having a substantial impact on the estimates. Adjustments based on the additional interviewer judgments move the estimated percentage that have never been married slightly farther down, and the nonresponse adjustments appear to increase the estimate of the percentage that have ever had sex, with the estimate based on the interviewer judgments slightly lower than the estimate without. Given the results in Table 4, these findings are not terribly surprising, as the measurement error in the interviewer judgments appears to be attenuating the true relationships of these auxiliary variables with the key survey variables (and thus reducing the potential reductions in bias and variance from using the auxiliary variables to make nonresponse adjustments). It is worth noting the increases in the variance of the estimates when applying the nonresponse adjustments, relative to the changes in the estimates: there does not appear to be a favorable bias-variance tradeoff, with the changes in variance being much larger than the changes in the estimates (with the exception of the estimate of the percentage that have ever had sex). In theory, stronger correlations of the interviewer judgments with the survey variables should help to reduce variance (Little and Vartivarian, 2005), but the measurement error in the judgments appears to be preventing this. The overall lack of difference in the estimates with and without nonresponse adjustments could be due to the relatively high main interview response rate in the NSFG (81%, conditional on a completed screener).

Implications of Measurement Error for Nonresponse Adjustments

The results presented thus far indicate that nonresponse adjustments incorporating the two interviewer judgments are having only a minimal impact on NSFG estimates, despite apparent associations of the judgments with both response propensities and the survey variables of interest. An important open question remains, especially given the results in Table 4: what impact are the errors in the interviewer judgments having on the effectiveness of these nonresponse adjustments?

To initially examine possible theoretical implications of measurement error in the interviewer judgments of sexual activity on the bias and variance properties of subsequent nonresponse adjustments, a small simulation study was performed using real NSFG data. A hypothetical population was defined by the N = 7,355 female respondents to the main NSFG interview in Quarters 1-10, and the data set for this population included both the interviewer judgments of sexual activity and the actual reports of sexual activity from the main NSFG interview. The base sampling weights and the NSFG survey variables measuring parity and number of partners in the past year were also included in the population data file for the simulations, allowing for an examination of how measurement error in the interviewer judgments can attenuate relationships between sexual activity and these two survey variables.

In each of six simulations (three for each survey variable), one hundred (100) probability proportionate to size (PPS) samples of size n = 500, with size measures for females representing inverses of the NSFG base weights, were selected from this artificial population. This size variable has no relationship with either survey variable in the population, meaning that the sampling can be considered non-informative (for these variables). A small number of the largest NSFG base weights were trimmed to the 95th percentile of the weights to enable the PPS selection. New base weights were then computed for each simulated sample based on the probabilities of selection. Unit nonresponse was simulated for each of the 100 samples based on the following logistic regression model, motivated by actual NSFG outcomes [4]:

Pr(response_i) = exp(report.sexually.active_i) / (1 + exp(report.sexually.active_i))

[4] See Table 3, where the estimated coefficient for the sexual activity judgment in the response propensity model was ln(1.607) = 0.47, and was likely attenuated toward zero by the measurement errors in these judgments.

A sampled case denoted by i had values on the two survey variables deleted if a random draw from a UNIFORM(0,1) distribution was not less than or equal to the probability

computed above. The simulated probability of response was thus a function of the reported sexual activity for case i (1 = yes, 0 = no), and not the interviewer judgment. For each simulated sample, a logistic regression model was fitted to a response indicator, with a given sexual activity measure (reported or judged) and the sampling weight as predictors, and the inverse of the predicted probability based on this model was used to adjust the base sampling weight for nonresponse. Given known means on the two variables for the artificial population, the empirical bias, root mean squared error (RMSE), and 95% confidence interval coverage of simulated design-based estimates using a) the base weights only, b) nonresponse-adjusted base weights using the reported sexual activity values, and c) nonresponse-adjusted base weights using the interviewer judgments were computed.

Based on the artificial population of N = 7,355 females with available values on both the interviewer judgments of sexual activity and self-reported sexual activity, Table 6 presents simple differences in population means between sexually active and sexually inactive females (according to each potential auxiliary variable) on the NSFG variables measuring parity and number of partners in the past year. For example, the population mean on number of partners in the past year for sexually active women (according to survey reports) is 1.29, compared to a population mean on number of partners for sexually inactive women of 0.00 (by definition). The corresponding mean for women judged by the interviewers to be sexually active is lower, at 1.09.

Table 6: Differences in population means on parity and number of partners in the past year as a function of sexual activity for the artificial NSFG population, by measure of sexual activity

                               Self-Reported Sexual Activity        Interviewer Judgment of Sexual Activity
NSFG Variable                  Active        Inactive               Active        Inactive
Parity
Partners in the Past Year      1.29          0.00                   1.09
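One replicate of the simulation just described can be sketched in R as below. The population frame pop and its column names (base_wt, report_active, judge_active, parity) are hypothetical stand-ins for the artificial population file, PPS selection is done with replacement for simplicity, and the propensity model is an ordinary logistic regression, so the sketch illustrates the logic rather than reproducing the paper's exact procedure.

```r
one_rep <- function(pop, n = 500) {
  # PPS size measure: inverse of the base weight, after trimming the largest
  # weights to the 95th percentile.
  wt   <- pmin(pop$base_wt, quantile(pop$base_wt, 0.95))
  size <- 1 / wt
  idx  <- sample(nrow(pop), n, replace = TRUE, prob = size / sum(size))
  samp <- pop[idx, ]
  samp$samp_wt <- sum(size) / (n * size[idx])   # new base weight from the selection probability

  # Simulated nonresponse: Pr(response) = exp(report)/(1 + exp(report)),
  # so reported sexual activity (not the judgment) drives response.
  samp$resp <- as.integer(runif(n) <= plogis(samp$report_active))

  # Propensity adjustment using either the reported or the judged measure.
  adj_mean <- function(x_var) {
    f   <- reformulate(c(x_var, "samp_wt"), response = "resp")
    fit <- glm(f, family = binomial, data = samp)
    w   <- samp$samp_wt / fitted(fit)
    weighted.mean(samp$parity[samp$resp == 1], w[samp$resp == 1])
  }

  c(base_only  = weighted.mean(samp$parity[samp$resp == 1],
                               samp$samp_wt[samp$resp == 1]),
    adj_report = adj_mean("report_active"),
    adj_judge  = adj_mean("judge_active"))
}

set.seed(2010)
# results <- t(replicate(100, one_rep(pop)))   # 100 simulated samples
```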

The results in Table 6 clearly show the attenuation in the relationship between sexual activity and these two survey variables that is introduced by the measurement error in the interviewer judgments. Much stronger associations are evident when using the self-reported values of sexual activity, which suggests that nonresponse adjustments based on the interviewer judgments eliminate some of the within-group homogeneity that would be possible if the judgments were closer to the reported values. This would have important implications for nonresponse adjustments based on weighting classes.

Table 7 summarizes the results of this small simulation study using real NSFG data, showing the empirical performance (across 100 simulated samples) of the three potential estimators of the mean on each NSFG variable: the estimator using base weights only, the estimator with a nonresponse adjustment to the base weights based on self-reported sexual activity, and the estimator with a nonresponse adjustment to the base weights based on the interviewer judgment of sexual activity.

Table 7: Results of the simulation study, showing the empirical relative bias of estimators with and without nonresponse adjustments based on the interviewer judgments

NSFG Variable                 Nonresponse Adjustment Method    Auxiliary Variable for Nonresponse Adjustment    Empirical Bias (Rel. %)
Parity                        None                             --                                               3.36%
Parity                        Response Propensity              Self-reported Sexual Activity                    0.09%
Parity                        Response Propensity              Interviewer Judgment of Sexual Activity          2.20%
Partners in Past 12 Months    None                             --                                               5.28%
Partners in Past 12 Months    Response Propensity              Self-reported Sexual Activity                    0.17%
Partners in Past 12 Months    Response Propensity              Interviewer Judgment of Sexual Activity          3.92%
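Given a matrix of replicate estimates like the one produced by one_rep() above, bias and RMSE summaries of the kind reported in Table 7 can be computed along the following lines (coverage would additionally require a standard error for each replicate, which the sketch above does not produce); true_mean here is simply the known mean of the artificial population.

```r
# `results` is assumed to be the 100 x 3 matrix from t(replicate(100, one_rep(pop))).
true_mean <- mean(pop$parity)                      # known population mean

bias     <- colMeans(results) - true_mean          # empirical bias of each estimator
rel_bias <- 100 * abs(bias) / true_mean            # relative bias, in percent
rmse     <- sqrt(colMeans((results - true_mean)^2))

round(cbind(rel_bias, rmse), 3)
```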

The results in Table 7 suggest that use of the interviewer judgment on sexual activity as an auxiliary variable when constructing the nonresponse adjustments (rows 3 and 6) attenuates potential reductions in both bias and variance under the response propensity weighting method, relative to adjustments using the true self-reported values of sexual activity (rows 2 and 5). The bias of the resulting estimates (when using the interviewer judgments) is similar to (and slightly lower than) that found when analyzing the cases without any adjustments to the base weights for nonresponse under the defined nonresponse mechanism (rows 1 and 4). For example, the relative bias of the complete-case estimate of the mean on parity is 3.36%. When using the true sexual activity to make a nonresponse adjustment, the relative bias is 0.09%, but when using the interviewer judgment of sexual activity, the relative bias is 2.20%. There is also evidence of higher empirical RMSEs in the estimates (compared to use of the base weights only) when using the interviewer judgments, in contrast to the lower empirical RMSEs found when using the true measures. This suggests that potential reductions in the variance of estimates are being attenuated as well by using the error-prone judgments; in fact, the estimates based on nonresponse adjustments using the judgments have the highest RMSE for both variables. Coverage and confidence interval width do not appear to be affected by the use of the interviewer judgments (measured with error) for making the nonresponse adjustments.

Observational Techniques Used by Accurate Interviewers

Given the negative implications of error in the sexual activity judgments for nonresponse adjustments found in the small simulation study above, methods for minimizing error in the observations certainly warrant investigation. Five of the active NSFG interviewers who appeared to be performing well on both observations after the first 10 quarters of data collection (see Figure 1) were contacted by the NSFG field supervisor and asked about the techniques that they used in making their observations (given their relatively high accuracy rates). In some cases, follow-up conversations were had over the telephone to clarify messages. The following techniques and visual cues were

identified by these interviewers regarding their observations on the presence of young children:

- Examining the furniture outside the house
- Presence of baby strollers, outdoor toys / shoes in the yard or on the porch, evidence of stickers / crayons / miscellaneous kids' decorations
- Looking inside any open curtains / blinds for baby blankets and baby furniture
- Firefighter stickers indicating the presence of children
- Bikes on the porch or in the yard / swing sets or trampolines in the yard
- Boxes for baby wipes or diapers
- Looking inside cars in the driveway for booster seats or toys in the backseats
- Looking inside open garages for toys or other child equipment
- Basketball hoop in the driveway
- Candy wrappers around the doorway
- Listening for sounds of children

The following techniques and visual cues were identified by these same five interviewers regarding the estimation of sexual activity:

- Considering the physical appearance of the selected respondent and others in the household (conservatively dressed?)
- Teenage respondents that are said by parents to not come home after school, hang out with friends, or are frequently involved in sports activities
- Teenage respondents that are more reserved and shy (not sexually active) vs. teenage respondents that are more interested in the content of the survey
- Gut feelings about the selected respondent
- Presence of children, even if only a single adult lives in the household
- Two cars parked at the household
- A nicely manicured lawn and yard, indicating the presence of a male in the household

- Upscale neighborhood, which indicates two incomes and an increased likelihood of being sexually active
- Well-maintained residences in lower income neighborhoods usually indicate older couples who are not likely to be sexually active
- Beer cans and garbage on the porch in a lower income area tend to indicate single males with no females living in the household (not sexually active)

Collectively, these essentially anecdotal techniques and visual cues provide interesting insights into the observation methods used in the field by the more accurate interviewers. Although all of the NSFG interviewers are trained to record these judgments based on their initial impressions and best guesses, these interviewers tended to be very aware of the area surrounding each household and were able to pick up on a variety of evidence and visual cues that lead to more accurate observations on these two variables. Unfortunately, similar conversations were not attempted with poorly performing interviewers at the same time. Future study of these observational techniques among both accurate and inaccurate interviewers is certainly warranted in an effort to reduce errors in these observations.

DISCUSSION

This study represents one of the first detailed examinations of the measurement error properties of interviewer observations in a national survey using in-person interviewing, and of the implications of those errors for post-survey nonresponse adjustments. Analyses of data collected in the first ten quarters of the Continuous NSFG indicated that female interviewers tended to be 70-80% accurate overall when estimating household characteristics, namely whether children under the age of 15 were present in a household and whether selected respondents were sexually active. Female interviewers did not appear to be consistently accurate or inaccurate on both observations, and a large amount of interviewer variance in accuracy was found, suggesting that different interviewers may be using different observational techniques that vary in their effectiveness (e.g., consistent guessing of "Yes" for sexual activity may overestimate the

proportion of sample units that are sexually active). However, we cannot rule out the possibility that accuracy may have depended on the difficulty of the primary sampling unit being worked by a given interviewer, which would likely impact the housing unit observations on the presence of children (e.g., urban areas may not have yards where toys could be noticed). In the NSFG design, the majority of the interviewers are assigned to one PSU, introducing confounding between interviewer and PSU. More detailed examinations of factors that influence the accuracy of these types of judgments are certainly needed.

The two interviewer judgments were found to have strong relationships with both main interview response propensity and NSFG survey variables of interest when controlling for the relationships of other paradata and information collected in a screening interview, even though these relationships appeared to be attenuated due to the measurement error in the judgments. This makes the two judgments attractive candidates for auxiliary variables to be used in post-survey nonresponse adjustments. However, nonresponse adjustments to base sampling weights incorporating the two interviewer judgments were not found to significantly impact key estimates relative to nonresponse adjustments excluding the two observations. There could be a number of reasons for this finding, including the measurement error in the judgments and the relatively high response rate for the main NSFG interview conditional on a completed screening interview (81%). The NSFG is also unique in that it collects a large amount of paradata on sample units during the screening process while operating under a responsive survey design framework (Groves and Heeringa, 2006). Some of these variables may have been correlated with the two interviewer judgments (e.g., safety concerns, physical impediments, primarily residential neighborhood, single-person household, age, etc.), and the use of too much paradata when constructing nonresponse adjustments may introduce multicollinearity concerns in the models used to compute the adjustments. Alternative methods of using the interviewer judgments to repair nonresponse errors were also certainly possible (e.g., multiple imputation analysis for individual survey items with the interviewer judgments as predictors in the imputation models, selection of only predictors having a significant

Most importantly, the possibility exists that errors in the judgments may have limited the effectiveness of the nonresponse adjustments based on the judgments. Results from a small simulation study indicated that the amount of error in the NSFG interviewer judgments of sexual activity (where 20-30% of observations are inaccurate) may be attenuating the effectiveness of nonresponse adjustments based on the judgments, relative to nonresponse adjustments based on the sexual activity reported by respondents to the main NSFG interview ("true values"). This initial simulation study may provide one explanation for the lack of impact that nonresponse adjustments based on the two observations were having on NSFG estimates: measurement error at these levels or higher in similar auxiliary variables could eliminate a large portion of the reductions in nonresponse bias that would be possible if interviewers had access to the true values of the variables they are trying to observe. The possible impact of measurement error on alternative nonresponse adjustment techniques needs more research focus as well.

Given the potentially detrimental effects of measurement error in these observations on the effectiveness of nonresponse adjustments, future research in this area needs to examine predictors of accuracy among the interviewers and study observational techniques associated with reduced error. In this study, five of the interviewers found to have the highest accuracy on the two judgments were queried about the techniques that they employ in the field when making the observations. A wide variety of observational techniques and important visual cues were provided by these interviewers, and regular assessment of the techniques used by the more accurate interviewers could lead to improved training programs aimed at reducing error in these judgments. The NSFG has started to request that interviewers provide open-ended justifications for all of their judgments, and these data will be analyzed qualitatively and linked with interviewer accuracy in future work. In addition, NSFG interviewers in the most recent quarter were provided with information regarding observable predictors of sexual activity that would be available to them at the time of making a judgment for a selected respondent, and the effectiveness of this modification to the NSFG protocol is currently being evaluated.
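The attenuation mechanism suggested by the simulation study described above can be illustrated with a minimal sketch: a binary auxiliary variable related to both a survey outcome and response propensity is used to form weighting classes, first without error ("true values") and then with 20-30% of its values misclassified. The population parameters, error rates, and variable roles below are illustrative assumptions, not the actual NSFG simulation design.

```python
# Minimal sketch of measurement error attenuating a weighting-class
# nonresponse adjustment; all parameter values are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(42)
n = 100_000

def adjusted_mean(y, r, x_aux):
    """Weighting-class adjustment: respondents weighted by the inverse of the
    response rate within classes defined by the (possibly error-prone) auxiliary."""
    w = np.zeros(len(y))
    for cls in (0, 1):
        in_cls = x_aux == cls
        w[in_cls] = 1.0 / r[in_cls].mean()
    resp = r == 1
    return np.average(y[resp], weights=w[resp])

x_true = rng.random(n) < 0.55                                     # true auxiliary status
y = (rng.random(n) < np.where(x_true, 0.70, 0.30)).astype(float)  # outcome related to x
r = (rng.random(n) < np.where(x_true, 0.85, 0.70)).astype(int)    # response related to x

full_mean = y.mean()
unadjusted_bias = y[r == 1].mean() - full_mean

for error_rate in (0.0, 0.2, 0.3):
    flip = rng.random(n) < error_rate                 # misclassify this share of cases
    x_obs = np.where(flip, ~x_true, x_true).astype(int)
    remaining_bias = adjusted_mean(y, r, x_obs) - full_mean
    print(f"error rate {error_rate:.0%}: remaining bias {remaining_bias:+.4f} "
          f"(unadjusted bias {unadjusted_bias:+.4f})")
```

Under illustrative assumptions like these, the adjustment based on the error-free auxiliary removes nearly all of the nonresponse bias, while 20-30% misclassification leaves a noticeably larger share of it in place, consistent with the attenuation pattern reported above.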

The ongoing work in this area being conducted by NSFG staff aims to identify effective observational techniques that are associated with reduced error in these judgments.

The collection of interviewer judgments on auxiliary variables having theoretical relationships with key survey variables provides a potentially useful tool for making post-survey nonresponse adjustments. After factoring in the costs of incorporating training on effective techniques for making accurate judgments of selected characteristics into general interviewer training sessions, the collection of interviewer judgments in the field is a relatively inexpensive method of data collection with many potential benefits. These ideas extend beyond the NSFG; for example, the American National Election Study (ANES) has interviewers in the field note whether political signs were present for selected households. Other surveys may also benefit from having interviewers collect information on features of households that are relevant to a particular survey's content. However, survey researchers need to consider the potential measurement error involved in the collection of these observations and judgments, and the impact of that measurement error on the effectiveness of post-survey nonresponse adjustments. The additional time and cost required to train interviewers in these techniques and to have the interviewers record observations may not be warranted if the level of error in the collected data has detrimental effects on nonresponse adjustments.

REFERENCES

Biemer, P. (2004). Chapter 12: Modeling Measurement Error to Identify Flawed Questions. In Methods for Testing and Evaluating Survey Questionnaires, edited by Presser et al. Wiley.

Casas-Cordero, C. and Kreuter, F. (2008). Assessing Interviewer Observation of Neighborhood Characteristics for Nonresponse Adjustments. Paper presented at the International Conference on Survey Methods in Multinational, Multiregional, and Multicultural Contexts (3MC), Berlin, Germany, June 28.

Couper, M.P. (1998). Measuring Survey Quality in a CASIC Environment. Paper presented at the Joint Statistical Meetings of the American Statistical Association, Dallas, TX.

de Leeuw, E., and de Heer, W. (2002). Trends in Household Survey Nonresponse: A Longitudinal and International Comparison. Chapter 3 in Groves, R.M. et al. (Eds.), Survey Nonresponse. Wiley.

Fuller, W. (1987). Chapter 1: A Single Explanatory Variable. In Measurement Error Models. Wiley.

Groves, R.M., and Heeringa, S.G. (2006). Responsive Design for Household Surveys: Tools for Actively Controlling Survey Errors and Costs. Journal of the Royal Statistical Society, Series A, 169, Part 3.

Groves, R.M., Mosher, W.D., Lepkowski, J., and Kirgis, N.G. (2009). Planning and Development of the Continuous National Survey of Family Growth. National Center for Health Statistics. Vital and Health Statistics, 1(48).

Groves, R., Wagner, J., and Peytcheva, E. (2007). Use of Interviewer Judgments About Attributes of Selected Respondents in Post-Survey Adjustments for Unit Nonresponse: An Illustration with the National Survey of Family Growth. Proceedings of the Section on Survey Research Methods, Joint Statistical Meetings, Salt Lake City, UT.

Heeringa, S.G., West, B.T., and Berglund, P.A. (2010). Applied Survey Data Analysis. Chapman and Hall / CRC Press.

Kreuter, F., Lemay, M., and Casas-Cordero, C. (2007). Using Proxy Measures of Survey Outcomes in Post-Survey Adjustments: Examples from the European Social Survey (ESS). Proceedings of the Section on Survey Research Methods, Joint Statistical Meetings, Salt Lake City, UT.

Kreuter, F., Olson, K., Wagner, J., Yan, T., Ezzati-Rice, T.M., Casas-Cordero, C., Lemay, M., Peytchev, A., Groves, R.M., and Raghunathan, T.E. (2010). Using Proxy Measures and Other Correlates of Survey Outcomes to Adjust for Nonresponse: Examples from Multiple Surveys. Journal of the Royal Statistical Society, Series A, 173, Part 2.

Lepkowski, J.M. et al. (2010). The National Survey of Family Growth: Sample Design and Analysis of a Continuous Survey. Vital and Health Statistics, Series 2, No. 150, forthcoming (see also Lepkowski et al., NSFG Series 2 report from Cycle 6).

Little, R.J., and Vartivarian, S. (2005). Does Weighting for Nonresponse Increase the Variance of Survey Means? Survey Methodology, 31(2).

Maitland, A., Casas-Cordero, C., and Kreuter, F. (2009). An Evaluation of Nonresponse Bias Using Paradata from a Health Survey. Proceedings of the Section on Government Statistics, Joint Statistical Meetings, Washington, D.C.

Peytchev, A. and Olson, K. (2007). Using Interviewer Observations to Improve Nonresponse Adjustments: NES. Proceedings of the Section on Survey Research Methods, Joint Statistical Meetings, Salt Lake City, UT.

Peytcheva, E. and Groves, R.M. (2009). Using Variation in Response Rates of Demographic Subgroups as Evidence of Nonresponse Bias in Survey Estimates. Journal of Official Statistics, 25.

Stefanski, L.A., and Carroll, R.J. (1985). Covariate Measurement Error in Logistic Regression. The Annals of Statistics, 13(4).

Yan, T. and Raghunathan, T. (2007). Using Proxy Measures of the Survey Variables in Post-Survey Adjustments in a Transportation Survey. Proceedings of the Section on Survey Research Methods, Joint Statistical Meetings, Salt Lake City, UT.
