INTERVIEWER RATINGS OF RESPONDENT POLITICAL KNOWLEDGE: CALIBRATING A USEFUL MEASUREMENT INSTRUMENT


William G. Jacoby
Michigan State University
jacoby@msu.edu

Prepared for presentation at the 2015 Annual Meetings of the Midwest Political Science Association. Chicago, IL: April 17, 2015.

ABSTRACT

The American National Election Studies (ANES) interviewer ratings of respondents' levels of political information are widely used in the field of mass political behavior. But recent research calls their measurement characteristics into question. In this paper I use data from the 2004 ANES to create a set of optimally-scaled (OS) scores for the interviewer ratings. These OS scores simultaneously maximize the relationship with objective measures of respondent political information and strictly respect a set of explicit measurement assumptions regarding the ratings. I argue that the OS scores overcome the problems associated with standard usage of the interviewer ratings and comprise better measurement of political knowledge within the mass public.

Political knowledge is widely regarded as an important variable in research on public opinion and political behavior. However, there is little consensus among scholars regarding the best way to measure the knowledge possessed by individual citizens. One strategy relies upon survey interviewers' assessments of respondents' political information and awareness. Despite numerous advantages, there are potentially serious problems with this approach resulting from systematic differences in judgments across interviewers. In this paper, I draw from measurement theory to propose a simple approach for taking these interviewer biases into account, thereby effectively calibrating the measurement instrument for political knowledge. This approach is tested using the interviewer assessments and other items from the 2004 ANES. The measurement calibration strategy produces more detailed and cleaner measures of political knowledge that exhibit theoretically reasonable relationships with other variables. Overall, the empirical results provide an optimistic perspective on the measurement characteristics and theoretical utility of interviewer assessments of individual political knowledge.

BACKGROUND

Political knowledge, or information, is one of the central variables in the field of mass political behavior. In fact, one of the most consistent findings to emerge from more than seventy-five years of empirical research is the low level of political information possessed throughout most of the mass public (e.g., Neuman 1986). This finding is troublesome because information and knowledge are widely believed to be preconditions for individuals to fulfill the requirements of democratic citizenship (Delli Carpini and Keeter 1996). Information is necessary for the public to evaluate elements of the political world, and informed citizens have been found to behave and think about politics differently from their less-informed counterparts (e.g., Bartels 1996; Althaus 2003). At the same time, knowledge and information are strongly related to political engagement and overt participation (Verba and Nie 1972; Verba, Schlozman, and Brady 1995). Information also helps people connect general orientations to specific policy stimuli and candidate choices (e.g., Campbell, Converse, Miller, and Stokes 1960; Lewis-Beck, Jacoby, Norpoth, and Weisberg 2008). Thus, political knowledge is a key factor in understanding citizens' political attitudes and behavior.

In some contexts, knowledge and information could refer to conceptually distinct phenomena. For example, information could be regarded as an incoming stream of stimuli, while knowledge could represent that subset of the information that a person actually retains. Here, however, I will not make such distinctions. The two terms are used interchangeably throughout the study.

It probably is an understatement to say that the concept of political information or knowledge has received an enormous amount of scholarly attention (e.g., Barabas, Jerit, Pollock, and Rainey 2014). This is particularly interesting because, as an empirical variable, political information is a relative newcomer to the field. Up through the early 1990s, researchers used other variables to stratify the mass public, including political conceptualization (Converse 1964), cognitive ability (Stimson 1975), sophistication (Luskin 1987), and education (Sniderman, Brody, and Tetlock 1991). While the specific operationalizations varied from one investigator to the next, the general objective in each case was to tap individual differences in the comprehension of, and engagement with, the political world.

Fact-Based Political Knowledge

Attention shifted quickly, starting in the late 1980s. The Center for Political Studies (CPS) American National Election Studies (ANES) began including batteries of factual items about politics on their regular interview schedules in 1988, after initial testing on a mid-1980s NES Pilot Study. Starting that year, and continuing to the present day, respondents in the post-election wave of the ANES have been presented with the following question: "Now we have a set of questions concerning various public figures. We want to see how much information about them gets out to the public from television, newspapers and the like. The first name is ____. What job or political office does he NOW hold?" Of course, the blank in the question is replaced with the name of an actual political figure, and the question is repeated for several more political figures. The specific political figures vary from one year to the next, corresponding to reasonable differences in their public visibility. For example, the 2004 ANES respondents were asked about the offices held by Dennis Hastert, Dick Cheney, Tony Blair, and William Rehnquist. Typically, these items are not employed as conceptually distinct empirical variables. Instead, researchers usually assign each survey respondent a score corresponding to the number of correct identifications. And, that score is interpreted as the respondent's level of political knowledge. Shortly after the introduction of the ANES factual item battery, political knowledge became a very popular concept. For example, Delli Carpini and Keeter (1993, page 1180) state:

"A common conclusion... is that factual knowledge is the single best indicator of sophistication and its related concepts of expertise, awareness, political engagement, and even media exposure..."

Certainly the scale based upon the office identification items has very attractive properties: It is easily administered (Zaller 1986), highly reliable (Delli Carpini and Keeter 1993), and has strong predictive validity (Delli Carpini and Keeter 1996). So, the popularity of the fact-based knowledge scale and its simultaneous widespread use in empirical research are readily understandable.

Despite its apparent strengths, researchers have also identified several serious weaknesses in the factual knowledge scale. Mondak (2001) has shown that the ANES respondents' varying propensities to guess the correct answers to the knowledge questions introduce systematic biases into the measure. Gibson and Caldeira (2009) examined the verbatim responses to the office identification items and found highly arbitrary coding decisions about what constitutes correct or incorrect answers. DeBell (2013) identifies issues associated with the differing sets of public figures used from one year to the next, and the potential for systematically varying difficulties in identifying figures within a given year due to such factors as differing levels of visibility across the figures. From a somewhat different perspective, the factual item scale may be an incomplete measure of knowledge. Gilens (2001) points out that it does not tap domain-specific knowledge regarding substantive political matters (e.g., specific issue controversies). Similarly, the fact-based scale does not capture operative knowledge, as opposed to textbook knowledge about politics (Lupia 2006; Abrajano 2015). Barabas et al. (2014) identify two dimensions of political knowledge and point out that the office-identification items only tap a subset of the resultant classification system for different types of information.

Using Interviewer Ratings to Measure Knowledge

The various issues with factual knowledge batteries highlight the potential utility of an alternative approach to measuring political knowledge: interviewer assessments. At the completion of each face-to-face session with the respondent, the ANES interviewer assesses the respondent's general level of information about politics and public affairs relative to five ordered categories. The most common practice is to assign the categories successive integer scores as follows:

1. Very low
2. Fairly low
3. Average
4. Fairly high
5. Very high

This variable can be regarded as a quantification of the interviewer assessments of political knowledge, with larger scores indicating more informed respondents.

Using interviewer assessments to measure political knowledge has several advantages. For one thing, the interviewers seem to comprise a high quality measurement instrument. They are carefully trained at the Survey Research Center, in the University of Michigan's Institute for Social Research. And, most of the interviewers have quite a bit of experience administering surveys to respondents. The ANES interview schedules have included interviewer ratings in every survey since 1968, enabling comparisons of public information levels over time. The rating scores the interviewers assign appear to be highly reliable. For example, the test-retest correlation of interviewers' information ratings (using the one-to-five quantification) from the pre- and post-election waves of the 2004 ANES is substantial, which in itself suggests that the characteristic under observation is quite stable. And, if we are willing to assume that an individual's level of political knowledge does not change across a relatively short period of time (like the interval between the two waves of the 2004 survey), then combining the separate rating scores into a single scale produces an even higher reliability coefficient. According to the standard interpretation of reliability (e.g., Carmines and Zeller 1980), this means that the bulk of the observed variance in the two-item scale is shared with the unobservable variance in true political knowledge. Furthermore, the interviewer assessments show high levels of convergent validity, in the form of strong correlations with a sizable number of other variables that, from a theoretical perspective, should be closely related to political knowledge (Zaller 1986; Delli Carpini and Keeter 1993). For all of these reasons, it is not surprising that the ANES interviewer ratings have been widely accepted as excellent measures of individual political information levels and used in many studies of American political behavior (e.g., Zaller 1992; Bartels 1996; Baum and Kernell 1999; Goren 2000; Althaus 2003; Jacoby 2009).

In fact, the ANES Codebook scores the five categories in the opposite direction from what is shown here. That is, the one-to-five scale runs from the highest to the lowest levels of knowledge. The reflected scoring scheme shown here makes more sense from a substantive perspective (again, larger values indicate more knowledge), so it is used throughout this analysis.
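The paper does not spell out exactly how the two-wave reliability figure was computed; a standard way to obtain the reliability of a two-item scale from the pre/post test-retest correlation is the Spearman-Brown step (equivalent to Cronbach's alpha for two items). A minimal sketch in R, with hypothetical variable names:

    ## Hedged sketch: two-wave reliability of the interviewer ratings.
    ## pre_rating and post_rating are hypothetical vectors of the reflected
    ## one-to-five ratings from the pre- and post-election waves.
    two_wave_reliability <- function(pre_rating, post_rating) {
      r  <- cor(pre_rating, post_rating, use = "complete.obs")  # test-retest correlation
      sb <- (2 * r) / (1 + r)                                   # Spearman-Brown: reliability of the summed scale
      c(test_retest = r, scale_reliability = sb)
    }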

Interviewer ratings of political knowledge definitely are not a panacea for the problems of fact-based measures; instead, they have some serious problems of their own. One troubling issue is that various interviewers may not be assessing knowledge the same way, and they may not even be measuring political information or knowledge at all. Lupia (2015) points out that, even though the SRC interviewers are highly trained in the various details of administering a survey, they are not given specific guidance about evaluating respondents' political knowledge. He writes: "After years of searching and dozens of interviews with staff, no one has been able to produce any written instructions given to any ANES interviewers about how to assess any respondent's information level other than a single line of text that reads 'use your best judgment.'" Thus, interviewers are required to evaluate respondents according to a criterion that they apparently are supposed to define for themselves. Given the complicated and multifaceted nature of political knowledge (e.g., Delli Carpini and Keeter 1996; Barabas et al. 2014), it seems very likely that different interviewers will focus on varying elements of the overall concept when they try to apply it to individual survey respondents. Lupia shows that this can have pernicious consequences when knowledge is used as a variable that affects political orientations, attitudes, and behavior: "These assessments are not the product of practices that are likely to provide accurate and consistent evidence from interviewer to interviewer or from year to year. Existing interviewer assessments are unreliable" (Lupia 2015).

At least part of Lupia's pessimistic conclusion about the qualities of the interviewer ratings is based upon earlier work by Levendusky and Jackman (2008). The latter authors use data from the 1998 ANES to construct an item response theory model of individual political knowledge, and then examine its relationship with the interviewer assessments. Levendusky and Jackman find huge interviewer effects in the rating measures of political knowledge, saying that "a respondent with the same level of political knowledge will most likely be assigned different interviewer ratings when scored by two different interviewers" (2008, abstract). As a summary demonstration of the severity of the problem, they perform a simple ANOVA and show that a substantial percentage of the variance in interviewer ratings of political knowledge is due to the differences across the interviewers alone. This cross-interviewer variability is not unique to that particular dataset. A similar ANOVA carried out on the interviewer ratings from the 2004 ANES (the dataset used in the empirical analysis below) shows that the 81 interviewers account for a sizable percentage of the variance in the rating scores.
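The interviewer-effects check just described amounts to a one-way ANOVA of the five-point ratings on interviewer identifiers, with the between-interviewer share of variance summarized by eta-squared. A minimal sketch, assuming a data frame anes with hypothetical columns rating and interviewer_id:

    ## Hedged sketch: how much of the rating variance lies between interviewers?
    interviewer_variance_share <- function(anes) {
      fit <- aov(rating ~ factor(interviewer_id), data = anes)
      tab <- summary(fit)[[1]]                  # ANOVA table: between- and within-interviewer sums of squares
      tab[1, "Sum Sq"] / sum(tab[, "Sum Sq"])   # eta-squared: between-interviewer share of total variance
    }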

Thus, the ANES interviewer ratings do not appear to provide an objective measure of political knowledge, with scores that are comparable across both respondents and interviewers.

So, what is to be done about the shortcomings of the interviewer ratings? Levendusky and Jackman (2008) say that "First, scholars should accept once and for all that the interviewer rating is a flawed measure of political knowledge." Nevertheless, discarding or ignoring this variable does not seem to be a wise course of action. The interviewers bring insights based upon a fairly lengthy social interaction with the respondents (Tourangeau, Rips, and Rasinski 2000) in which the focus of discussion is politics. The interviewer records answers to a sizable number of questions, including the factual items. He or she also is exposed to the additional verbal and nonverbal cues that respondents provide during the session, many of which reflect the person's level of political understanding (e.g., Converse 1974). The readiness and ease with which respondents converse about political phenomena certainly seems like a reasonable expression of their awareness of, and facility in dealing with, the elements of the political world; in other words, precisely what social scientists generally mean when they refer to political knowledge. And, it seems very likely that the interviewers take these factors into account when generating their own rating of a respondent's political information. As we have already seen, the resultant scores are reliable and correlated with a variety of other variables. So, it seems more reasonable to fix the problems in the interviewer ratings than to abandon them.

Of course, the strategy we follow depends upon the source of the problems in the measure. Here, there do not appear to be systematic biases due to the race, gender, or social class of the interviewer relative to the respondent (Zaller 1986; Levendusky and Jackman 2008). Instead, there are pronounced individual differences in the ways the interviewers use the five rating categories to represent political knowledge. This generates a problem known as inter-personally incomparable scores (Brady 1985) or differential item functioning (King, Murray, Salomon, and Tandon 2004). The quantifications of political knowledge obtained by assigning successive integers to the five response categories are not comparable across the interviewers. And, while serious, this problem definitely is not insurmountable. Lupia (2015) argues that it should be handled through better training of the interviewers. Levendusky and Jackman recommend that "... the interviewer rating should ... be used ... in a measurement model that can explicitly correct for this cross-interviewer heterogeneity" (2009).

That is precisely what I will do in the remainder of this study, although I use a different approach from the anchoring vignettes that they mention. The potential remedies to differential item functioning proposed by Lupia and by Levendusky and Jackman can only be used with new data. The strategy I will employ has an important advantage in that it can be applied to any existing ANES data that contain both the interviewer ratings of political knowledge and the factual questions for the same respondents. In effect, I will calibrate the former, using their relationship to the latter.

MEASUREMENT THEORY

Let us begin by considering the basic nature of measurement (e.g., Hand 2004), and how it pertains to political knowledge. This discussion is based upon a theory of measurement originally laid out by Young (1981) and developed in the political science context by Jacoby (1991; 1999). Regardless of the specific context, measurement begins by sorting a set of objects into observationally distinct categories, based upon some specified characteristic of those objects. Numbers are assigned to the objects based upon their category membership. This classification process can be considered measurement if and only if the differences between the numbers assigned to the objects correspond to the substantive differences between the observational categories, according to some specified rule. Here, the objects are the ANES survey respondents. The observational categories are the five levels of political information that the interviewers use to classify the respondents ("very high," "fairly high," "average," and so on). The assigned numbers are the integers one through five, and the rule is to assign larger numbers to categories corresponding to higher information levels.

Measurement Properties

There are three important properties of measurement that are relevant in the present context. First, measurement level refers to the nature of the function mapping from the observational categories to the assigned numbers. Measurement at the nominal level implies that an identity-preserving function is used, but no further restrictions are placed on the specific numbers assigned to the objects in the respective categories. Ordinal measurement implies a monotonic function in which observational asymmetries across the objects correspond to a non-decreasing array of numeric values.

Interval measurement implies that there is some parametric function, often linear, mapping from the objects to the numbers. And, ratio-level measurement indicates a parametric function with a non-alterable intercept of zero (usually signaling the absence of the property being measured). In any case, measurement level refers to variability in the assigned numbers across the observational categories.

The second property, measurement process, involves the variability of the numbers assigned within the observational categories. Specifically, measurement process can either be discrete or continuous. A discrete process implies a measurement scheme in which all of the objects within a common category are assigned the same number. With a continuous measurement process, objects within a single category can be assigned different numbers, as long as the values within each category correspond to a closed interval of real numbers. In other words, adjacent intervals of values representing two observational categories can share a common end point but, beyond that, they cannot overlap.

The third property, measurement conditionality, pertains to differences across the objects, regardless of their category membership. Specifically, measurement conditionality determines which kinds of comparisons are legitimate and meaningful within the given measurement scheme. The immediate consequence of measurement is that the numbers assigned to two distinct objects can be compared to each other to determine whether one object possesses more, less, or an equal amount of the property being measured, relative to the other object. If such comparisons are possible for all distinct pairs of objects being measured, then the measurement is unconditional. But, if such comparisons are only meaningful within distinct subsets of observations (i.e., each object's numerical value can only be compared to the numerical values of other objects within the same subset), then the measurement is conditional.

Now, the one-to-five scale usually constructed from the interviewer ratings is a quantification of political knowledge. But, what are its measurement properties? In the research literature, the interviewer ratings are usually treated as interval-level, discrete-process, and unconditional measures. This quantification implies that the differences between adjacent categories are identical across the entire range of the scale. In other words, the scores are regarded as a linear function of observed differences in political knowledge, making the ratings an interval-level measure.

Similarly, the survey respondents placed in a given category by the interviewer are all assigned the same number, again, a value ranging from one to five. This means that the ratings are a discrete measure. Finally, the scoring scheme used for the interviewer ratings is not affected by which interviewer rated which respondent. The scores assigned to the respondents who were categorized by any given interviewer are treated as exactly equivalent to the scores assigned to respondents by any other interviewer. If two NES respondents are assigned a score of, say, 2, then they are both treated as if they have an identical, fairly low, level of political knowledge, even if two different interviewers assigned the respective scores. Thus, the usual measurement of political knowledge is treated as if it is unconditional.

It is almost certainly unrealistic to treat the interviewer ratings of political knowledge as interval-level, discrete-process, and unconditional. Instead, it probably is more appropriate to assume that the ratings are ordinal-level, continuous-process, and interviewer-conditional. First, the ordinal level of measurement means that the numbers assigned to the objects within the categories comprise a monotonic function of observed political knowledge (i.e., as evaluated by the interviewer). That is, as respondents' levels of political knowledge increase, the numbers assigned to the respondents should never decrease. This implies that, say, respondents in the "fairly high" category possess more knowledge than those in the "average" category, but we cannot say a priori how much more. Further, the difference between these two categories need not be the same as the difference between the "very high" category and the "fairly high" category (or any other pair of adjacent categories). Second, continuous process means that each of the observational categories corresponds to a closed interval of numbers, rather than to a single value. In substantive terms, this implies that each category contains individuals with varying degrees of political knowledge, although it still is the case that respondents in a category with lower assigned numbers possess less knowledge than those in a category with higher assigned numbers. This should be very reasonable if interviewers employ several observational criteria (e.g., answers to various questions along with other cues) to rate the respondents. Third, interviewer-conditional measurement means that the scores assigned by one interviewer are not necessarily comparable to those assigned by any other interviewer. It is reasonable to assume that each interviewer has his or her own idea of what constitutes each level of political knowledge, that is, what a respondent must say or do in order to get placed into the "average" category, or any of the other four categories.

But, one interviewer's idea of "average" may not be equivalent to another interviewer's (and similar non-equivalence probably exists for the other four categories). Taken together, these assumptions imply that the scores assigned to the five knowledge categories can be meaningfully compared across the respondents interviewed by any given interviewer. But, the numeric scores for respondents from one interviewer cannot be compared legitimately to the scores for respondents interviewed by someone else.

Thus, the usual quantification generated by the interviewer ratings of respondent political knowledge is based upon a set of unrealistically stringent assumptions about its measurement characteristics. Therefore, it is important to develop a different quantification of the interviewer ratings which corresponds to less stringent, but more realistic, measurement assumptions. A strategy for doing so is presented in the next section.

ESTIMATION STRATEGY: ALSOS

The interviewer makes his or her rating of the respondent's knowledge at the end of the interview, and it presumably is based (in some way) on the full interaction that the interviewer has had with the respondent. As such, it should incorporate more than just factual knowledge, hopefully including such things as the domain-specific and operative forms of knowledge that are missing from responses to factual questions about officeholders and other general textbook aspects of American government. Nevertheless, there is no reason to expect that the interviewer ratings of political knowledge would be inconsistent with the kinds of factual knowledge that are tapped by the latter survey questions. All of the ANES interviewers pose the factual questions from the interview schedule to all of their respondents. The correct answers to the factual questions (obviously) do not vary across respondents or interviewers. Therefore, they can be regarded as a fixed standard from which to evaluate variability across these two sets of actors. In effect, we will generate a new quantification of the interviewer ratings of respondent political knowledge that is calibrated according to the responses on the factual items. In order to do so, we will assign a set of scores to the interviewer ratings that optimize two criteria simultaneously:

1. The assigned scores will maximize the squared multiple correlation between the interviewer ratings and the objective measures of respondent information.

2. The assigned scores will conform strictly to the pre-specified assumptions about the measurement properties of the interviewer ratings (i.e., ordinal level, continuous process, and interviewer-conditional).

Numeric values that possess the preceding two characteristics are called optimally-scaled values, or OS scores. For present purposes, the OS scores represent a quantification of the interviewer ratings of political knowledge that conforms to a realistic set of measurement assumptions. Hopefully, that will make them more useful tools for research than the traditional one-to-five quantification that is beset with the problems discussed earlier. The specific procedure for obtaining the OS scores is called multiple optimal regression via alternating least squares (MORALS; see Young, de Leeuw, and Takane 1976), and it is a manifestation of a more general strategy for the quantitative analysis of qualitative data called alternating least squares, optimal scaling, or ALSOS (Young 1981; Jacoby 1999).

MORALS is an iterative procedure. Each iteration begins by regressing the current quantification of the interviewer ratings on the objective measures of factual knowledge from the ANES. On the first iteration, the current quantification is just the usual (reflected) one-to-five scale assigned in the ANES codebook. On all subsequent iterations, it is an updated set of scores that provide the optimal quantification based upon the estimates so far. On each iteration, the procedure has two separate phases. In the first phase, the procedure obtains ordinary least squares estimates of the regression coefficients relating the factual knowledge measures to the current quantification of the knowledge ratings. These coefficients and the independent variable values are used to produce predicted values of political knowledge for the respondents in the usual manner. By definition, these predicted values are the linear combination of the independent variables that is maximally correlated with the dependent variable (again, the current quantification of the knowledge ratings). In the second phase of each iteration, Kruskal's (1964) monotonic transformation is applied to the predicted values. This produces a set of scores that are as similar as possible (in the least-squares sense) to the predicted values from the regression, but are still perfectly monotonic with the ordered categories of the original interviewer ratings. Kruskal's primary treatment of ties is used in the transformation, which means that respondents in the same rating category need not be assigned the same scores.

Furthermore, the transformation is carried out separately for each subset of respondents interviewed by the respective NES interviewers; therefore, the ways the numbers are assigned to the respondents will vary across the interviewers.

A complete iteration of the MORALS procedure consists of these two phases. Each time the first phase is carried out, the latest R-squared value is compared to the R-squared from the previous iteration (on the first iteration, the previous R-squared is initialized at zero). If the goodness of fit has increased over the previous iteration, the MORALS algorithm goes on to the second phase of the iteration and calculates new optimal scores based upon the latest set of regression coefficients and predicted values. If the goodness of fit has stabilized (i.e., R-squared does not increase over that from the previous iteration), the procedure terminates and the most recent transformations of the predicted values (those calculated in the second phase of the previous iteration) are taken as the OS scores.

Again, the OS scores from the MORALS algorithm represent a quantification of the interviewer ratings that is maximally correlated with the factual items on the ANES, and consistent with the pre-specified measurement assumptions about the ratings. It is important to emphasize that this quantification is an interval-level representation; any changes in the relative sizes of the OS scores would degrade the goodness of fit between the scores and the predictor variables. Hence, the OS scores can be used in statistical models that require interval-level variables. Nevertheless, the OS scores are completely legitimate representations of the interviewer ratings because they always comprise a weakly monotonic function of the original ordered categories assigned by the respective interviewers. Beyond the monotonicity requirement, there are no restrictions on the differences between the numeric values assigned to respondents in different categories. In this manner, the quantification reflects the ordinal nature of the characteristic being measured. Different monotonic transformations are used for the respondents rated by different interviewers. This explicitly allows for the differential item functioning that Levendusky and Jackman identified in the traditional five-point measure. But, unlike the original one-to-five integer values, the OS scores can be compared across interviewers; they are the monotonic transformations of the categories for each interviewer that, when combined into a single set of values, produce the highest squared multiple correlation with the fact-based knowledge measures. For example, two respondents that an interviewer rated as showing "average" levels of political information would not have to be assigned the same optimal scores, as long as their scores are greater than or equal to the scores assigned to all respondents that the interviewer rated as showing "fairly low" political information, and less than or equal to the scores assigned to all respondents that the interviewer judged as possessing "fairly high" levels of political information.
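The two-phase iteration just described can be sketched schematically in R. This is only an illustration of the MORALS logic under stated assumptions, not the optiscale routine used for the actual estimates; all object names are hypothetical, and Kruskal's primary treatment of ties is approximated here with base R's isotonic regression:

    ## Hedged sketch of a MORALS-style iteration: alternate (1) OLS regression of the
    ## current quantification on the factual-knowledge regressors with (2) a separate
    ## weakly monotone rescaling of the predictions within each interviewer.
    morals_sketch <- function(rating, interviewer, X, tol = 1e-8, max_iter = 100) {
      dat     <- as.data.frame(X)        # factual-knowledge items plus education dummies
      os      <- as.numeric(rating)      # start from the reflected one-to-five codebook scores
      r2_prev <- 0
      for (it in seq_len(max_iter)) {
        fit <- lm(os ~ ., data = dat)    # phase 1: OLS on the current quantification
        r2  <- summary(fit)$r.squared
        if (r2 - r2_prev < tol) break    # fit has stabilized; keep the scores from the last phase 2
        r2_prev <- r2
        pred <- fitted(fit)
        for (j in unique(interviewer)) { # phase 2: per-interviewer monotone transformation
          idx <- which(interviewer == j)
          ord <- idx[order(rating[idx])] # that interviewer's respondents, ordered by rating category
          os[ord] <- isoreg(seq_along(ord), pred[ord])$yf  # weakly monotone fit to the predictions
        }
      }
      1 + 4 * (os - min(os)) / (max(os) - min(os))  # rescale the OS scores to the one-to-five range
    }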

Finally, each of the five original observational categories of political knowledge is represented by a closed interval of OS scores for each interviewer. This is consistent with the likely possibility that each interviewer grouped together respondents with somewhat differing levels of political knowledge into single categories, although we still assume that the categories for each interviewer are ordered properly.

In summary, the OS scores possess exactly the characteristics that most researchers assume to exist in the interviewer ratings of political knowledge. From a measurement theory perspective, the OS scores comprise a better quantification of the political knowledge ratings than the quantification provided by the five-point successive-integer scale.

DATA AND ALSOS ANALYSIS

The empirical analysis uses data from the 2004 ANES. In that year, there were 81 interviewers with sufficient information to include in the MORALS estimation routine. For present purposes, interviewers have to interview more than one respondent, because the transformation to the OS scores cannot be carried out with a single observation. Fortunately, only two interviewers had to be dropped for this reason (along with their respective respondents). At the same time, respondents must answer all of the factual questions and be rated by the interviewers in order to be included in the regression equation. There were 909 respondents with usable information.

The independent variables comprise all of the factual questions included on the 2004 ANES. First, there is the number of political figures whose offices the respondent identified correctly; this variable ranges from zero to four. Second, respondents were asked which party currently held the majority in the U.S. House and the Senate; a variable identifying how many of these they got correct ranges from zero to two. Third, a dummy variable is scored one if the respondent correctly stated that the rich-poor gap has been increasing, and zero otherwise. Fourth, another dummy variable is scored one if the respondent correctly identifies the national Republican party as more conservative, and zero otherwise. Two additional dummy variables are included in the regression equation for high school graduates and college graduates.
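For concreteness, the regressors just described could be assembled along the following lines; every raw variable name here is a hypothetical placeholder for the corresponding 2004 ANES item, not an actual ANES mnemonic:

    ## Hedged sketch: building the six regressors described in the text.
    build_predictors <- function(raw) {
      data.frame(
        offices_correct    = rowSums(raw[, c("hastert_ok", "cheney_ok", "blair_ok", "rehnquist_ok")]),  # 0-4
        majorities_correct = raw$house_majority_ok + raw$senate_majority_ok,                            # 0-2
        gap_increasing     = as.numeric(raw$gap_answer == "increasing"),       # 1 if correct, 0 otherwise
        gop_conservative   = as.numeric(raw$ideology_answer == "republican"),  # 1 if correct, 0 otherwise
        hs_grad            = as.numeric(raw$years_school >= 12),               # high school graduate dummy
        college_grad       = as.numeric(raw$years_school >= 16)                # college graduate dummy
      )
    }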

OLS and ALSOS Estimates

Let us begin by examining the relationship between the traditional, but problematic, version of the interviewer rating of political knowledge and the other variables. Table 1 shows the ordinary least squares estimates obtained when the original five-point scale is regressed on the factual knowledge items and the two education dummy variables. Even with a knowledge measure that is based upon extremely unrealistic assumptions, these independent variables still account for a sizable share of the observed variance. Four of the six independent variables have statistically significant effects. The only exceptions are the dummy variables for the increasing gap between rich and poor, and the ideology of the national parties. Thus, despite attributing properties to the interviewer ratings that they almost certainly do not possess (i.e., equal differences between all adjacent categories, homogeneity within categories, and comparability across interviewers), the resultant quantification appears to be moderately related to more objective measures of political knowledge. The entries in Table 1 are also the estimates obtained in the first phase of the first iteration of the MORALS routine.

Table 2 presents the ALSOS estimates for the same equation; the MORALS routine was carried out using Jacoby's optiscale package in R, and it took three iterations to produce the final estimates. Here, the values of the dependent variable conform to much more realistic assumptions about the interviewer ratings. That is, the scores assigned to the respondents are monotonic to the five ordered categories that the interviewers used to classify their political knowledge, the scores can vary within each of those categories, and the scores are assigned separately for each of the 81 interviewers. And, the scores represent the best fit to the statistical model, subject to the preceding measurement constraints. These OS scores for the interviewer ratings are set to range from one to five, so their values are at least roughly comparable to the original five-point variable.

The most obvious feature in Table 2 is the very high level of variance explained. When we assign scores via the MORALS routine, the quantification of the interviewer ratings is a nearly perfect linear function of the factual knowledge measures; the latter account for about 90% of the variance in the former, about half again more than with the traditional variable. It is remarkable that this improvement in fit is achieved solely by using more realistic assumptions about the measurement properties of the information ratings.

The individual coefficient estimates in Table 2 are not of particular interest for present purposes. In effect, the independent variables function as the calibration instrument for adjusting the information in the interviewer ratings to their optimal scores (i.e., making the latter the best-fitting linear function of the former). But, it still is gratifying to see that any substantive interpretations about the impact of objective knowledge measures on the interviewer ratings would remain largely unchanged. While the specific numerical values differ, the relative sizes of the respective coefficients in Table 2 are very similar to those in Table 1. Once again, college graduation shows the largest effect, followed by the ability to identify officeholders, high school graduation, the ability to identify congressional majority parties, correct ideological placement of the Republican party, and finally, correctly saying that the gap between rich and poor is increasing. Note that it is difficult to assess the statistical significance of the individual variables' effects because the standard errors from the MORALS routine are almost certainly too small (Young et al. 1976).

Interviewer Effects and Measurement Assumptions

The large difference in fit between the OLS and ALSOS results confirms the presence of interviewer effects in the knowledge ratings. We can be more specific about the nature of these effects by looking at the relationship between the original five-point scale and the OS scores for the individual ANES interviewers. Figure 1 contains a trellis display with 81 separate panels, one for each interviewer, in three subsets of 27 panels each. The panels are arrayed by the sequential numbers assigned to the respective interviewers; therefore, the specific ordering of the panels has no substantive meaning. In each panel, the horizontal axis represents the original five categories used in the rating. They are spaced at equal intervals to correspond to the successive-integer scoring scheme usually applied to the categories. The vertical axis in each panel plots the mean OS score for the respondents that the interviewer placed into each of the five rating categories. Note that many interviewers did not use the full range of categories, so there are fewer than five points plotted in those cases. The curve formed by using line segments to connect the adjacent points in each panel can be regarded as the estimated measurement function for that interviewer, in the sense that it summarizes the mapping from the observational categories to the optimally scaled scores assigned to the respondents.

The OS scores are not unique representations of the interviewer ratings. For each interviewer, any set of numbers that is weakly monotonic to the five rating categories would provide equally valid scores for the respondents in the categories. But, these scores are uniquely optimal in the sense that they provide the best possible least-squares fit to the factual items on the ANES interview schedule. There are, in fact, an infinite number of such optimal scores, but they all would be linear transformations of one another, so any other set of values would provide exactly the same information. This specific set of values is obtained by transforming the OS scores to range across the interval from one to five.
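A display along the lines of Figure 1 could be reconstructed roughly as follows, given vectors of ratings, OS scores, and interviewer identifiers. This is a sketch using the lattice package (a natural fit for the trellis layout described above); the object names are hypothetical:

    ## Hedged sketch: per-interviewer measurement functions -- the mean OS score
    ## within each original rating category, one panel per interviewer.
    library(lattice)

    plot_measurement_functions <- function(rating, os_score, interviewer_id) {
      means <- aggregate(os_score,
                         by = list(rating = rating, interviewer = interviewer_id),
                         FUN = mean)
      xyplot(x ~ rating | factor(interviewer), data = means,
             type = c("p", "l"),             # plot the category means and connect them with line segments
             xlim = c(0.5, 5.5), ylim = c(0.5, 5.5),
             xlab = "Original rating category (1 = very low, 5 = very high)",
             ylab = "Mean optimally scaled (OS) score",
             as.table = TRUE)                # fill the panels from the top row down
    }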

First, consider the shapes of the measurement functions. They are all constrained to be (weakly) monotonically increasing with respect to the political information categories. But, most of them are nearly linear, such as the function for interviewer 1 (lower left corner of the first subset of panels) and the function shown fourth from the left in the same row of panels. For the many interviewers for which this is the case, the assumption of interval measurement level is not problematic because the differences in the mean OS values between adjacent observational categories are identical (or nearly so) across the range of values that they used to rate the respondents' political information. Of course, there are also a non-trivial number of panels that exhibit non-linear measurement functions, so, in these cases, the average differences in the optimal scores are not equal across all adjacent pairs of categories. For those interviewers, the interval-level assumption clearly is violated. As a specific example, consider the plot for interviewer 79, the third from the right in the top row of the third subset of panels. This interviewer only used three of the rating categories, and the difference between the mean OS scores assigned to one pair of adjacent categories is much larger than the difference between the mean OS scores assigned to the other pair.

More severe violations of the interval-level assumption occur when segments of the measurement function are flat. This means that the mean OS scores are identical across the two categories which, in turn, implies that the interviewer did not differentiate the respondents in those categories effectively with respect to their observed levels of factual knowledge. For example, consider interviewer 81, in the upper right corner of the third subset of panels, who shows one of the more pronounced examples of this problem. Even though this interviewer used four of the five rating categories (from "fairly low" through "very high"), he or she did not distinguish the three lower categories according to their factual knowledge. Therefore, it is more effective to treat the ratings from this interviewer as representing only two distinct levels of political information, corresponding to "very high" and everything else below that.

Second, consider the differences in the specific measurement functions across the 81 panels of Figure 1. The varying shapes of the curves connecting the adjacent mean OS scores for the five rating categories confirm that the assumption of unconditional measurement across interviewers is violated severely.

This implies that the level of political knowledge represented within each of the observational categories varies markedly from one interviewer to the next. As one example, consider the two adjacent panels that fall fourth and fifth from the left in the bottom row of the first subset of panels. Both of these interviewers placed respondents into the "very high" information category. But, the mean OS score assigned to this category in the right-hand panel is close to, but slightly lower than, the mean score for the "fairly high" category in the left-hand panel. This shows that one interviewer's perception of very high political knowledge corresponds, on average, to what the other interviewer means by fairly high political knowledge.

It is important to emphasize that the conditionality problems in the interviewer ratings are distinct from the property of measurement level. As already discussed, many of the curves in Figure 1 are nearly linear in shape. And, as explained, this is evidence for interval-level measurement of the distinctions between the observational categories. But, the precise linear functions, that is, the slopes and intercepts for the nearly-linear curves, do vary from one panel to the next. And, that confirms that interviewers map the rating categories onto actual levels of political knowledge differently from each other, even if they each do so in ways that imply equal differences between categories.

While Figure 1 enables an assessment of measurement level and conditionality, it cannot be used to evaluate measurement process. The panels in the figure only plot the central tendency of the OS scores for each of the five rating categories, for each interviewer. But, the ALSOS analysis allowed the OS scores to vary within the categories, so that each of the latter is quantified by a range of scores, rather than a single value. In principle, this could be shown in a trellis display like Figure 1, but using the actual OS values for the individual respondents, rather than the category means. This display is shown in the Appendix. But, it is difficult to interpret, due to the small sizes of the individual panels. And, there probably is not much to be gained from close inspection of that display, beyond the general observation that there is, indeed, variability in the OS scores within each of the observational categories used by the respective interviewers.

Figure 2 shows the more detailed measurement functions for two interviewers, taken from the overall trellis display in the Appendix. In each panel, the points and line segments plotted in blue represent the OS scores for the individual respondents rated by that interviewer. The green points and line segments plot the means of the OS scores for the observational categories (i.e., the same information that was shown in Figure 1). Within each panel, the amount of variability within a rating category is represented by the length of the vertical line segment connecting the OS scores within that category.

So, for interviewer 6, the range of OS values is greater within the three middle categories than in either of the two end categories (note that one of the end categories actually contains two respondents with identical OS scores, so their points are overplotted). In contrast, the other interviewer shown in Figure 2 displays much more variation in the amount of political knowledge subsumed within one rating category than within the other categories that were used; this interviewer placed only one respondent in one of those categories, and none at all in another. The general conclusion to be drawn from these plots, and the full trellis display in the Appendix, is that there clearly is variability in the levels of political knowledge subsumed within each of the rating categories. Of course, this is ignored entirely if the usual five-point variable is used in its standard form.

To summarize the results so far, the ALSOS analysis of the ANES interviewer ratings of respondent political knowledge indicates that the typical usage of this variable is highly problematic. The five-point version of this variable, based on the successive-integer scores in the ANES codebook, is severely affected by measurement error. The ALSOS analysis provides more insight into the nature of that error. Specifically, it stems from three distinct sources. Measurement conditionality (differences in the ways the respective interviewers sort observed levels of political knowledge into the five categories) and continuous measurement process (the existence of nontrivial variability in the actual levels of political knowledge within each rating category) probably represent the most troublesome issues. Measurement level is problematic for some interviewers, who either fail to differentiate observational categories by actual levels of knowledge or vary in the extent to which they differentiate the amount of political knowledge in adjacent categories across the range of the scale. But, for many of the interviewers, their ratings do suggest that their average distinctions about political knowledge between categories are fairly uniform.

OS SCORES AND MEASUREMENT QUALITY

The optimally-scaled scores examined in the previous section produce the best quantification of the interviewer ratings of political knowledge, in the sense that they are maximally correlated with the available objective information while still conforming strictly to a set of explicit measurement assumptions about the nature of the variable. So far, we have used these OS scores to gain insights about the types of problems that arise in the typical quantification of the interviewer ratings (i.e., the usual five-point equal-interval scale).

Hopefully, we can go significantly further than that, and use the OS scores to improve our measurement of respondents' political information. We will use two criteria for evaluating our success in achieving this objective. The OS scores can be regarded as better measurement from a substantive perspective if they (1) provide more detailed resolution of the property being measured; and (2) exhibit clearer relationships with other theoretically relevant variables. As we will see, the OS scores meet both of these criteria quite easily.

Measurement Resolution

The interviewer ratings of political knowledge sort the ANES respondents into five ordered categories. If we conceptualize knowledge as a continuum, then the standard variable only allows us to empirically distinguish five positions along that continuum. In contrast, there are far more distinct values in the set of OS scores. This enables much more fine-grained placement of individual respondents according to their political knowledge. Again, these latter placements are fully consistent with the information in the original ratings. If we focus on the OS scores for respondents rated by any given interviewer, then the locations along the continuum will be monotonic with respect to the categorical ratings.

Figure 3 shows the histograms for the interviewer ratings, separately for the original five-point scale and for the OS scores. The two displays tell somewhat different stories about the distribution of political knowledge among the ANES respondents. The histogram for the five-point scale (i.e., the original rating scores from the ANES codebook) shows a distribution with a negative skew. The modal score is 4, corresponding to a "fairly high" level of political information, although there are almost as many respondents in the middle category (scored as 3), or "average" political knowledge. The proportion of respondents in the highest knowledge category is much larger than the proportion that fall into the two lowest knowledge categories combined. On the five-point scale, the mean and the median both fall at or above the scale midpoint, and the standard deviation is about 1.0. Overall, the traditional five-point scale suggests a fairly optimistic view of citizens' political knowledge, with the typical individual placed somewhere from the middle to the more knowledgeable end of the continuum.
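The resolution comparison and the two histograms described above could be generated along these lines (again with hypothetical object names):

    ## Hedged sketch: compare the resolution of the two quantifications and draw
    ## side-by-side histograms of the original ratings and the OS scores.
    compare_resolution <- function(rating, os_score) {
      cat("Distinct values, five-point scale:", length(unique(rating)), "\n")
      cat("Distinct values, OS scores:", length(unique(os_score)), "\n")
      op <- par(mfrow = c(1, 2))
      hist(rating, breaks = seq(0.5, 5.5, by = 1), main = "Original five-point ratings",
           xlab = "Interviewer rating")
      hist(os_score, breaks = 20, main = "Optimally scaled scores",
           xlab = "OS score (one-to-five range)")
      par(op)
    }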


More information

SUPPLEMENTAL MATERIAL

SUPPLEMENTAL MATERIAL 1 SUPPLEMENTAL MATERIAL Response time and signal detection time distributions SM Fig. 1. Correct response time (thick solid green curve) and error response time densities (dashed red curve), averaged across

More information

Political Science 15, Winter 2014 Final Review

Political Science 15, Winter 2014 Final Review Political Science 15, Winter 2014 Final Review The major topics covered in class are listed below. You should also take a look at the readings listed on the class website. Studying Politics Scientifically

More information

Measuring the User Experience

Measuring the User Experience Measuring the User Experience Collecting, Analyzing, and Presenting Usability Metrics Chapter 2 Background Tom Tullis and Bill Albert Morgan Kaufmann, 2008 ISBN 978-0123735584 Introduction Purpose Provide

More information

Reveal Relationships in Categorical Data

Reveal Relationships in Categorical Data SPSS Categories 15.0 Specifications Reveal Relationships in Categorical Data Unleash the full potential of your data through perceptual mapping, optimal scaling, preference scaling, and dimension reduction

More information

CHAPTER 3 DATA ANALYSIS: DESCRIBING DATA

CHAPTER 3 DATA ANALYSIS: DESCRIBING DATA Data Analysis: Describing Data CHAPTER 3 DATA ANALYSIS: DESCRIBING DATA In the analysis process, the researcher tries to evaluate the data collected both from written documents and from other sources such

More information

12/31/2016. PSY 512: Advanced Statistics for Psychological and Behavioral Research 2

12/31/2016. PSY 512: Advanced Statistics for Psychological and Behavioral Research 2 PSY 512: Advanced Statistics for Psychological and Behavioral Research 2 Introduce moderated multiple regression Continuous predictor continuous predictor Continuous predictor categorical predictor Understand

More information

alternate-form reliability The degree to which two or more versions of the same test correlate with one another. In clinical studies in which a given function is going to be tested more than once over

More information

The Impact of Relative Standards on the Propensity to Disclose. Alessandro Acquisti, Leslie K. John, George Loewenstein WEB APPENDIX

The Impact of Relative Standards on the Propensity to Disclose. Alessandro Acquisti, Leslie K. John, George Loewenstein WEB APPENDIX The Impact of Relative Standards on the Propensity to Disclose Alessandro Acquisti, Leslie K. John, George Loewenstein WEB APPENDIX 2 Web Appendix A: Panel data estimation approach As noted in the main

More information

Still important ideas

Still important ideas Readings: OpenStax - Chapters 1 13 & Appendix D & E (online) Plous Chapters 17 & 18 - Chapter 17: Social Influences - Chapter 18: Group Judgments and Decisions Still important ideas Contrast the measurement

More information

Answers to end of chapter questions

Answers to end of chapter questions Answers to end of chapter questions Chapter 1 What are the three most important characteristics of QCA as a method of data analysis? QCA is (1) systematic, (2) flexible, and (3) it reduces data. What are

More information

Analysis and Interpretation of Data Part 1

Analysis and Interpretation of Data Part 1 Analysis and Interpretation of Data Part 1 DATA ANALYSIS: PRELIMINARY STEPS 1. Editing Field Edit Completeness Legibility Comprehensibility Consistency Uniformity Central Office Edit 2. Coding Specifying

More information

Chapter 2 Norms and Basic Statistics for Testing MULTIPLE CHOICE

Chapter 2 Norms and Basic Statistics for Testing MULTIPLE CHOICE Chapter 2 Norms and Basic Statistics for Testing MULTIPLE CHOICE 1. When you assert that it is improbable that the mean intelligence test score of a particular group is 100, you are using. a. descriptive

More information

DEPARTMENT OF POLITICAL SCIENCE AND INTERNATIONAL RELATIONS Research Methods Posc 302 ANALYSIS OF SURVEY DATA

DEPARTMENT OF POLITICAL SCIENCE AND INTERNATIONAL RELATIONS Research Methods Posc 302 ANALYSIS OF SURVEY DATA DEPARTMENT OF POLITICAL SCIENCE AND INTERNATIONAL RELATIONS Research Methods Posc 302 ANALYSIS OF SURVEY DATA I. TODAY S SESSION: A. Second steps in data analysis and interpretation 1. Examples and explanation

More information

MCAS Equating Research Report: An Investigation of FCIP-1, FCIP-2, and Stocking and. Lord Equating Methods 1,2

MCAS Equating Research Report: An Investigation of FCIP-1, FCIP-2, and Stocking and. Lord Equating Methods 1,2 MCAS Equating Research Report: An Investigation of FCIP-1, FCIP-2, and Stocking and Lord Equating Methods 1,2 Lisa A. Keller, Ronald K. Hambleton, Pauline Parker, Jenna Copella University of Massachusetts

More information

Sum of Neurally Distinct Stimulus- and Task-Related Components.

Sum of Neurally Distinct Stimulus- and Task-Related Components. SUPPLEMENTARY MATERIAL for Cardoso et al. 22 The Neuroimaging Signal is a Linear Sum of Neurally Distinct Stimulus- and Task-Related Components. : Appendix: Homogeneous Linear ( Null ) and Modified Linear

More information

bivariate analysis: The statistical analysis of the relationship between two variables.

bivariate analysis: The statistical analysis of the relationship between two variables. bivariate analysis: The statistical analysis of the relationship between two variables. cell frequency: The number of cases in a cell of a cross-tabulation (contingency table). chi-square (χ 2 ) test for

More information

Student name: SOCI 420 Advanced Methods of Social Research Fall 2017

Student name: SOCI 420 Advanced Methods of Social Research Fall 2017 SOCI 420 Advanced Methods of Social Research Fall 2017 EXAM 1 RUBRIC Instructor: Ernesto F. L. Amaral, Assistant Professor, Department of Sociology Date: October 12, 2017 (Thursday) Section 903: 9:35 10:50am

More information

Business Statistics Probability

Business Statistics Probability Business Statistics The following was provided by Dr. Suzanne Delaney, and is a comprehensive review of Business Statistics. The workshop instructor will provide relevant examples during the Skills Assessment

More information

Group Assignment #1: Concept Explication. For each concept, ask and answer the questions before your literature search.

Group Assignment #1: Concept Explication. For each concept, ask and answer the questions before your literature search. Group Assignment #1: Concept Explication 1. Preliminary identification of the concept. Identify and name each concept your group is interested in examining. Questions to asked and answered: Is each concept

More information

Readings: Textbook readings: OpenStax - Chapters 1 4 Online readings: Appendix D, E & F Online readings: Plous - Chapters 1, 5, 6, 13

Readings: Textbook readings: OpenStax - Chapters 1 4 Online readings: Appendix D, E & F Online readings: Plous - Chapters 1, 5, 6, 13 Readings: Textbook readings: OpenStax - Chapters 1 4 Online readings: Appendix D, E & F Online readings: Plous - Chapters 1, 5, 6, 13 Introductory comments Describe how familiarity with statistical methods

More information

Empowered by Psychometrics The Fundamentals of Psychometrics. Jim Wollack University of Wisconsin Madison

Empowered by Psychometrics The Fundamentals of Psychometrics. Jim Wollack University of Wisconsin Madison Empowered by Psychometrics The Fundamentals of Psychometrics Jim Wollack University of Wisconsin Madison Psycho-what? Psychometrics is the field of study concerned with the measurement of mental and psychological

More information

Data and Statistics 101: Key Concepts in the Collection, Analysis, and Application of Child Welfare Data

Data and Statistics 101: Key Concepts in the Collection, Analysis, and Application of Child Welfare Data TECHNICAL REPORT Data and Statistics 101: Key Concepts in the Collection, Analysis, and Application of Child Welfare Data CONTENTS Executive Summary...1 Introduction...2 Overview of Data Analysis Concepts...2

More information

A Comparison of Several Goodness-of-Fit Statistics

A Comparison of Several Goodness-of-Fit Statistics A Comparison of Several Goodness-of-Fit Statistics Robert L. McKinley The University of Toledo Craig N. Mills Educational Testing Service A study was conducted to evaluate four goodnessof-fit procedures

More information

Statistical Techniques. Masoud Mansoury and Anas Abulfaraj

Statistical Techniques. Masoud Mansoury and Anas Abulfaraj Statistical Techniques Masoud Mansoury and Anas Abulfaraj What is Statistics? https://www.youtube.com/watch?v=lmmzj7599pw The definition of Statistics The practice or science of collecting and analyzing

More information

Writing Reaction Papers Using the QuALMRI Framework

Writing Reaction Papers Using the QuALMRI Framework Writing Reaction Papers Using the QuALMRI Framework Modified from Organizing Scientific Thinking Using the QuALMRI Framework Written by Kevin Ochsner and modified by others. Based on a scheme devised by

More information

Readings: Textbook readings: OpenStax - Chapters 1 13 (emphasis on Chapter 12) Online readings: Appendix D, E & F

Readings: Textbook readings: OpenStax - Chapters 1 13 (emphasis on Chapter 12) Online readings: Appendix D, E & F Readings: Textbook readings: OpenStax - Chapters 1 13 (emphasis on Chapter 12) Online readings: Appendix D, E & F Plous Chapters 17 & 18 Chapter 17: Social Influences Chapter 18: Group Judgments and Decisions

More information

Section 3.2 Least-Squares Regression

Section 3.2 Least-Squares Regression Section 3.2 Least-Squares Regression Linear relationships between two quantitative variables are pretty common and easy to understand. Correlation measures the direction and strength of these relationships.

More information

11/24/2017. Do not imply a cause-and-effect relationship

11/24/2017. Do not imply a cause-and-effect relationship Correlational research is used to describe the relationship between two or more naturally occurring variables. Is age related to political conservativism? Are highly extraverted people less afraid of rejection

More information

Chapter 2--Norms and Basic Statistics for Testing

Chapter 2--Norms and Basic Statistics for Testing Chapter 2--Norms and Basic Statistics for Testing Student: 1. Statistical procedures that summarize and describe a series of observations are called A. inferential statistics. B. descriptive statistics.

More information

Color naming and color matching: A reply to Kuehni and Hardin

Color naming and color matching: A reply to Kuehni and Hardin 1 Color naming and color matching: A reply to Kuehni and Hardin Pendaran Roberts & Kelly Schmidtke Forthcoming in Review of Philosophy and Psychology. The final publication is available at Springer via

More information

Does momentary accessibility influence metacomprehension judgments? The influence of study judgment lags on accessibility effects

Does momentary accessibility influence metacomprehension judgments? The influence of study judgment lags on accessibility effects Psychonomic Bulletin & Review 26, 13 (1), 6-65 Does momentary accessibility influence metacomprehension judgments? The influence of study judgment lags on accessibility effects JULIE M. C. BAKER and JOHN

More information

3 CONCEPTUAL FOUNDATIONS OF STATISTICS

3 CONCEPTUAL FOUNDATIONS OF STATISTICS 3 CONCEPTUAL FOUNDATIONS OF STATISTICS In this chapter, we examine the conceptual foundations of statistics. The goal is to give you an appreciation and conceptual understanding of some basic statistical

More information

How to interpret results of metaanalysis

How to interpret results of metaanalysis How to interpret results of metaanalysis Tony Hak, Henk van Rhee, & Robert Suurmond Version 1.0, March 2016 Version 1.3, Updated June 2018 Meta-analysis is a systematic method for synthesizing quantitative

More information

Describe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo

Describe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo Business Statistics The following was provided by Dr. Suzanne Delaney, and is a comprehensive review of Business Statistics. The workshop instructor will provide relevant examples during the Skills Assessment

More information

A review of statistical methods in the analysis of data arising from observer reliability studies (Part 11) *

A review of statistical methods in the analysis of data arising from observer reliability studies (Part 11) * A review of statistical methods in the analysis of data arising from observer reliability studies (Part 11) * by J. RICHARD LANDIS** and GARY G. KOCH** 4 Methods proposed for nominal and ordinal data Many

More information

Analysis of Environmental Data Conceptual Foundations: En viro n m e n tal Data

Analysis of Environmental Data Conceptual Foundations: En viro n m e n tal Data Analysis of Environmental Data Conceptual Foundations: En viro n m e n tal Data 1. Purpose of data collection...................................................... 2 2. Samples and populations.......................................................

More information

ADMS Sampling Technique and Survey Studies

ADMS Sampling Technique and Survey Studies Principles of Measurement Measurement As a way of understanding, evaluating, and differentiating characteristics Provides a mechanism to achieve precision in this understanding, the extent or quality As

More information

multilevel modeling for social and personality psychology

multilevel modeling for social and personality psychology 1 Introduction Once you know that hierarchies exist, you see them everywhere. I have used this quote by Kreft and de Leeuw (1998) frequently when writing about why, when, and how to use multilevel models

More information

Exploring the Impact of Missing Data in Multiple Regression

Exploring the Impact of Missing Data in Multiple Regression Exploring the Impact of Missing Data in Multiple Regression Michael G Kenward London School of Hygiene and Tropical Medicine 28th May 2015 1. Introduction In this note we are concerned with the conduct

More information

CHAPTER III RESEARCH METHODOLOGY

CHAPTER III RESEARCH METHODOLOGY CHAPTER III RESEARCH METHODOLOGY Research methodology explains the activity of research that pursuit, how it progress, estimate process and represents the success. The methodological decision covers the

More information

Basic Concepts in Research and DATA Analysis

Basic Concepts in Research and DATA Analysis Basic Concepts in Research and DATA Analysis 1 Introduction: A Common Language for Researchers...2 Steps to Follow When Conducting Research...2 The Research Question...3 The Hypothesis...3 Defining the

More information

Agents with Attitude: Exploring Coombs Unfolding Technique with Agent-Based Models

Agents with Attitude: Exploring Coombs Unfolding Technique with Agent-Based Models Int J Comput Math Learning (2009) 14:51 60 DOI 10.1007/s10758-008-9142-6 COMPUTER MATH SNAPHSHOTS - COLUMN EDITOR: URI WILENSKY* Agents with Attitude: Exploring Coombs Unfolding Technique with Agent-Based

More information

Empirical Knowledge: based on observations. Answer questions why, whom, how, and when.

Empirical Knowledge: based on observations. Answer questions why, whom, how, and when. INTRO TO RESEARCH METHODS: Empirical Knowledge: based on observations. Answer questions why, whom, how, and when. Experimental research: treatments are given for the purpose of research. Experimental group

More information

The Regression-Discontinuity Design

The Regression-Discontinuity Design Page 1 of 10 Home» Design» Quasi-Experimental Design» The Regression-Discontinuity Design The regression-discontinuity design. What a terrible name! In everyday language both parts of the term have connotations

More information

INDIVIDUAL VALUE CHOICES: HIERARCHICAL STRUCTURE VERSUS AMBIVALENCE AND INDIFFERENCE

INDIVIDUAL VALUE CHOICES: HIERARCHICAL STRUCTURE VERSUS AMBIVALENCE AND INDIFFERENCE INDIVIDUAL VALUE CHOICES: HIERARCHICAL STRUCTURE VERSUS AMBIVALENCE AND INDIFFERENCE William G. Jacoby Michigan State University David J. Ciuk Franklin and Marshall College January 2015 We would like to

More information

On the purpose of testing:

On the purpose of testing: Why Evaluation & Assessment is Important Feedback to students Feedback to teachers Information to parents Information for selection and certification Information for accountability Incentives to increase

More information

George B. Ploubidis. The role of sensitivity analysis in the estimation of causal pathways from observational data. Improving health worldwide

George B. Ploubidis. The role of sensitivity analysis in the estimation of causal pathways from observational data. Improving health worldwide George B. Ploubidis The role of sensitivity analysis in the estimation of causal pathways from observational data Improving health worldwide www.lshtm.ac.uk Outline Sensitivity analysis Causal Mediation

More information

Statistics Mathematics 243

Statistics Mathematics 243 Statistics Mathematics 243 Michael Stob February 2, 2005 These notes are supplementary material for Mathematics 243 and are not intended to stand alone. They should be used in conjunction with the textbook

More information

DATA GATHERING. Define : Is a process of collecting data from sample, so as for testing & analyzing before reporting research findings.

DATA GATHERING. Define : Is a process of collecting data from sample, so as for testing & analyzing before reporting research findings. DATA GATHERING Define : Is a process of collecting data from sample, so as for testing & analyzing before reporting research findings. 2012 John Wiley & Sons Ltd. Measurement Measurement: the assignment

More information

Child Mental Health: A Review of the Scientific Discourse

Child Mental Health: A Review of the Scientific Discourse Child Mental Health: A Review of the Scientific Discourse Executive Summary and Excerpts from A FrameWorks Research Report Prepared for the FrameWorks Institute by Nat Kendall-Taylor and Anna Mikulak February

More information

Comparing Direct and Indirect Measures of Just Rewards: What Have We Learned?

Comparing Direct and Indirect Measures of Just Rewards: What Have We Learned? Comparing Direct and Indirect Measures of Just Rewards: What Have We Learned? BARRY MARKOVSKY University of South Carolina KIMMO ERIKSSON Mälardalen University We appreciate the opportunity to comment

More information

o^ &&cvi AL Perceptual and Motor Skills, 1965, 20, Southern Universities Press 1965

o^ &&cvi AL Perceptual and Motor Skills, 1965, 20, Southern Universities Press 1965 Ml 3 Hi o^ &&cvi AL 44755 Perceptual and Motor Skills, 1965, 20, 311-316. Southern Universities Press 1965 m CONFIDENCE RATINGS AND LEVEL OF PERFORMANCE ON A JUDGMENTAL TASK 1 RAYMOND S. NICKERSON AND

More information

Chapter 4: Defining and Measuring Variables

Chapter 4: Defining and Measuring Variables Chapter 4: Defining and Measuring Variables A. LEARNING OUTCOMES. After studying this chapter students should be able to: Distinguish between qualitative and quantitative, discrete and continuous, and

More information

VARIABLES AND MEASUREMENT

VARIABLES AND MEASUREMENT ARTHUR SYC 204 (EXERIMENTAL SYCHOLOGY) 16A LECTURE NOTES [01/29/16] VARIABLES AND MEASUREMENT AGE 1 Topic #3 VARIABLES AND MEASUREMENT VARIABLES Some definitions of variables include the following: 1.

More information

SUPPLEMENTARY INFORMATION. Table 1 Patient characteristics Preoperative. language testing

SUPPLEMENTARY INFORMATION. Table 1 Patient characteristics Preoperative. language testing Categorical Speech Representation in the Human Superior Temporal Gyrus Edward F. Chang, Jochem W. Rieger, Keith D. Johnson, Mitchel S. Berger, Nicholas M. Barbaro, Robert T. Knight SUPPLEMENTARY INFORMATION

More information

Appendix III Individual-level analysis

Appendix III Individual-level analysis Appendix III Individual-level analysis Our user-friendly experimental interface makes it possible to present each subject with many choices in the course of a single experiment, yielding a rich individual-level

More information

Online Appendix. According to a recent survey, most economists expect the economic downturn in the United

Online Appendix. According to a recent survey, most economists expect the economic downturn in the United Online Appendix Part I: Text of Experimental Manipulations and Other Survey Items a. Macroeconomic Anxiety Prime According to a recent survey, most economists expect the economic downturn in the United

More information

STAT 503X Case Study 1: Restaurant Tipping

STAT 503X Case Study 1: Restaurant Tipping STAT 503X Case Study 1: Restaurant Tipping 1 Description Food server s tips in restaurants may be influenced by many factors including the nature of the restaurant, size of the party, table locations in

More information

A Comparison of Three Measures of the Association Between a Feature and a Concept

A Comparison of Three Measures of the Association Between a Feature and a Concept A Comparison of Three Measures of the Association Between a Feature and a Concept Matthew D. Zeigenfuse (mzeigenf@msu.edu) Department of Psychology, Michigan State University East Lansing, MI 48823 USA

More information

25. EXPLAINING VALIDITYAND RELIABILITY

25. EXPLAINING VALIDITYAND RELIABILITY 25. EXPLAINING VALIDITYAND RELIABILITY "Validity" and "reliability" are ubiquitous terms in social science measurement. They are prominent in the APA "Standards" (1985) and earn chapters in test theory

More information

Preliminary Report on Simple Statistical Tests (t-tests and bivariate correlations)

Preliminary Report on Simple Statistical Tests (t-tests and bivariate correlations) Preliminary Report on Simple Statistical Tests (t-tests and bivariate correlations) After receiving my comments on the preliminary reports of your datasets, the next step for the groups is to complete

More information

Chapter 3: Examining Relationships

Chapter 3: Examining Relationships Name Date Per Key Vocabulary: response variable explanatory variable independent variable dependent variable scatterplot positive association negative association linear correlation r-value regression

More information

The Pretest! Pretest! Pretest! Assignment (Example 2)

The Pretest! Pretest! Pretest! Assignment (Example 2) The Pretest! Pretest! Pretest! Assignment (Example 2) May 19, 2003 1 Statement of Purpose and Description of Pretest Procedure When one designs a Math 10 exam one hopes to measure whether a student s ability

More information

A Case Study: Two-sample categorical data

A Case Study: Two-sample categorical data A Case Study: Two-sample categorical data Patrick Breheny January 31 Patrick Breheny BST 701: Bayesian Modeling in Biostatistics 1/43 Introduction Model specification Continuous vs. mixture priors Choice

More information

Appendix D: Statistical Modeling

Appendix D: Statistical Modeling Appendix D: Statistical Modeling Cluster analysis Cluster analysis is a method of grouping people based on specific sets of characteristics. Often used in marketing and communication, its goal is to identify

More information

Sawtooth Software. The Number of Levels Effect in Conjoint: Where Does It Come From and Can It Be Eliminated? RESEARCH PAPER SERIES

Sawtooth Software. The Number of Levels Effect in Conjoint: Where Does It Come From and Can It Be Eliminated? RESEARCH PAPER SERIES Sawtooth Software RESEARCH PAPER SERIES The Number of Levels Effect in Conjoint: Where Does It Come From and Can It Be Eliminated? Dick Wittink, Yale University Joel Huber, Duke University Peter Zandan,

More information

Announcement. Homework #2 due next Friday at 5pm. Midterm is in 2 weeks. It will cover everything through the end of next week (week 5).

Announcement. Homework #2 due next Friday at 5pm. Midterm is in 2 weeks. It will cover everything through the end of next week (week 5). Announcement Homework #2 due next Friday at 5pm. Midterm is in 2 weeks. It will cover everything through the end of next week (week 5). Political Science 15 Lecture 8: Descriptive Statistics (Part 1) Data

More information

Speaker Notes: Qualitative Comparative Analysis (QCA) in Implementation Studies

Speaker Notes: Qualitative Comparative Analysis (QCA) in Implementation Studies Speaker Notes: Qualitative Comparative Analysis (QCA) in Implementation Studies PART 1: OVERVIEW Slide 1: Overview Welcome to Qualitative Comparative Analysis in Implementation Studies. This narrated powerpoint

More information

Chapter Eight: Multivariate Analysis

Chapter Eight: Multivariate Analysis Chapter Eight: Multivariate Analysis Up until now, we have covered univariate ( one variable ) analysis and bivariate ( two variables ) analysis. We can also measure the simultaneous effects of two or

More information

Lecture Slides. Elementary Statistics Eleventh Edition. by Mario F. Triola. and the Triola Statistics Series 1.1-1

Lecture Slides. Elementary Statistics Eleventh Edition. by Mario F. Triola. and the Triola Statistics Series 1.1-1 Lecture Slides Elementary Statistics Eleventh Edition and the Triola Statistics Series by Mario F. Triola 1.1-1 Chapter 1 Introduction to Statistics 1-1 Review and Preview 1-2 Statistical Thinking 1-3

More information

Statistical Methods and Reasoning for the Clinical Sciences

Statistical Methods and Reasoning for the Clinical Sciences Statistical Methods and Reasoning for the Clinical Sciences Evidence-Based Practice Eiki B. Satake, PhD Contents Preface Introduction to Evidence-Based Statistics: Philosophical Foundation and Preliminaries

More information

Chapter Eight: Multivariate Analysis

Chapter Eight: Multivariate Analysis Chapter Eight: Multivariate Analysis Up until now, we have covered univariate ( one variable ) analysis and bivariate ( two variables ) analysis. We can also measure the simultaneous effects of two or

More information

Likert Scaling: A how to do it guide As quoted from

Likert Scaling: A how to do it guide As quoted from Likert Scaling: A how to do it guide As quoted from www.drweedman.com/likert.doc Likert scaling is a process which relies heavily on computer processing of results and as a consequence is my favorite method

More information

Lecture (chapter 1): Introduction

Lecture (chapter 1): Introduction Lecture (chapter 1): Introduction Ernesto F. L. Amaral January 17, 2018 Advanced Methods of Social Research (SOCI 420) Source: Healey, Joseph F. 2015. Statistics: A Tool for Social Research. Stamford:

More information

Types of Variables. Chapter Introduction. 3.2 Measurement

Types of Variables. Chapter Introduction. 3.2 Measurement Contents 3 Types of Variables 61 3.1 Introduction............................ 61 3.2 Measurement........................... 61 3.2.1 Nominal Scale of Measurement.............. 62 3.2.2 Ordinal Scale of

More information