Evaluating EDI* Participant Reactions via Different Response Scales: A Technical Review. Keshav Gaur William A. Eckert


Public Disclosure Authorized

Evaluating EDI* Participant Reactions via Different Response Scales: A Technical Review

Keshav Gaur
William A. Eckert

WBI Evaluation Studies Number ES99-17
World Bank Institute
The World Bank
Washington, D.C.

*The World Bank Institute (WBI) was formerly the Economic Development Institute (EDI), as reflected in the text of this publication.

Copyright 1999 The International Bank for Reconstruction and Development/The World Bank, 1818 H Street, N.W., Washington, D.C., U.S.A.

The World Bank enjoys copyright under protocol 2 of the Universal Copyright Convention. This material may nonetheless be copied for research, educational, or scholarly purposes only in the member countries of The World Bank. Material in this series is subject to revision. The findings, interpretations, and conclusions expressed in this document are entirely those of the author(s) and should not be attributed in any manner to The World Bank, to its affiliated organizations, or the members of its Board of Directors or the countries they represent. If this is reproduced or translated, WBI would appreciate a copy.

Executive Summary

Donald L. Kirkpatrick¹ defines four levels of evaluation: evaluating reaction (level 1), evaluating learning (level 2), evaluating behavior (level 3) and evaluating results (level 4). The Evaluation Unit (EU) of the Economic Development Institute evaluates EDI's activities (seminars, conferences, workshops, courses, etc.) on these four levels. All activities are evaluated at level 1 and a selected few at higher levels. Over the past five years the EU has evaluated, on average, about 200 activities at level 1 annually. This paper deals with level 1 evaluations, which measure how participants react to the programs they attend.

For level 1 evaluations, the Evaluation Unit has shifted to a 5-point Likert-type scale in place of the 6-point scale that was the norm until November 1997. This study addresses two main issues arising from this change. The first is the appropriateness of using a 5-point scale rather than a 6-point scale. The second is the comparison and conversion of ratings obtained on the two scales. In particular, we look for a suitable method of converting scores from one scale to the other, so that scores now being obtained on a 5-point scale can be compared with past scores obtained on a 6-point scale.

The methodology involved drawing on past research on the issue and analyzing the results of an experiment conducted by the Evaluation Unit. The results of the experiment, together with further analysis of past EU data, indicate that a 5-point scale is more appropriate for evaluating EDI's activities and that there is a suitable method to convert and compare scores between the two scales.

An odd rather than an even number of response alternatives is preferable under circumstances in which the respondent can legitimately adopt a neutral position. A 6-point scale does not provide the alternative to respond neutrally. In EDI's past evaluations, which were based on 6-point scales, this constraint biased the overall scores. The results show that the overall opinion a participant forms about a seminar can influence his or her response to questions in the neutral range. Since there is no neutral point or midpoint on a 6-point scale, the participant is forced to choose either 3 or 4 for questions on which he or she is neutral. A 4 is chosen more often when overall opinions are strongly favorable and a 3 is chosen more often when overall opinions are not so strongly favorable, thus making good performance look better and bad performance look worse. Participants therefore need a neutral response alternative, which is precisely what a 5-point scale provides.

When the same activity is measured on two different scales, the mean score on a 6-point scale is greater than the mean score on a 5-point scale. However, the variance does not differ significantly when the scale is decreased or increased by one point. This may not be true if the scale changes by more than one point. This finding provides a basis for converting and comparing ratings on a 5-point scale with those on a 6-point scale by adjusting for the difference in means.

We have attempted to find the best point estimate of the mean difference between scores on the two scales, for activities of the same quality and for questions falling in four categories, viz., relevancy, course content/usefulness, objectivity and worthwhileness. This difference, which is 0.78, can be used as a benchmark to convert and compare scores obtained on a 5-point scale with previous scores obtained on a 6-point scale. For example, if the mean score for a question in a future activity is 4.5 on a 5-point scale, then the comparable figure on a 6-point scale will be 4.5 + 0.78 = 5.28.

1 Donald L. Kirkpatrick. Evaluating Training Programs, The Four Levels.

Evaluating EDI Participant Reactions via Different Response Scales: A Technical Review

Introduction

Background: Shift from a six to a five point scale

The Economic Development Institute (EDI) conducts over 400 training activities annually. These activities, which include seminars, conferences, workshops, and courses, are evaluated by the Evaluation Unit (EU), a part of EDI. Evaluation activities conducted by the EU generally correspond to the four levels of evaluation defined by Kirkpatrick². These levels consist of evaluating reaction (level 1), evaluating learning (level 2), evaluating behavior (level 3) and evaluating results (level 4). All EDI activities are evaluated at level 1 and a selected few at higher levels. The Evaluation Unit has been evaluating, on average, about 200 activities at level 1 annually for the past five years. This paper uses data from these level 1 evaluations, which measure how participants react to EDI sponsored training.

One important feature of these level 1 evaluations has been the use of a Likert³ type scale to measure participants' responses at the end of an activity. For most of its evaluations, the Evaluation Unit used a 6-point scale, where point 1 corresponded to "Not At All" and point 6 corresponded to "Exceeded Expectations". Questions were framed so that a higher point corresponded to a higher assessed quality of the event. The mean scores from these responses are used as quantitative indicators of the quality of EDI activities, and the results are given to the course organizers, the Task Managers (TMs). Although the questions differed from event to event, the scale used was the same. A sample questionnaire using this 6-point scale can be seen in Appendix A. Such basic questionnaires and the 6-point scale were used for more than 5 years to evaluate various EDI activities.

In November 1997, the Evaluation Unit decided to shift to a 5-point scale and to phase out the 6-point scale by the end of August. This decision was made by Evaluation Unit staff, who believed that a 5-point scale would produce more valid responses, a view consistent with current research and practice. Two examples of questionnaires using the 5-point scale are shown in Appendix B.

The shift from a 6-point to a 5-point scale may be an issue of concern within EDI, particularly among Task Managers. Task Managers are the principal users of evaluation findings within the various divisions of EDI. Reported results not only provide them with a quantitative assessment of their activities but also form a benchmark for comparisons, especially when tracking the performance of an activity over time. An important concern is that results for activities evaluated on a 5-point scale will not be easily comparable with results from past activities evaluated on a 6-point scale. This concern would be a strong argument against changing the scale unless there were an equally compelling reason for the change.

Our paper addresses this concern. In the following sections, we provide evidence for making the change to a 5-point scale and a method for comparing results from the previously used 6-point scale with those obtained using the new 5-point scale.

2 Donald L. Kirkpatrick. Evaluating Training Programs, The Four Levels.
3 Likert, R. A Technique for the Measurement of Attitudes. Archives of Psychology, no. 140.

Why use a five point scale?

Past research and theoretical framework

Past research has addressed how many points a scale should have and whether or not a midpoint for neutral/average responses makes a difference. The debate can be dated back to 1915, when Boyce (1915) reviewed the number of alternatives employed in scales used to evaluate the efficiency of teachers, and it has continued for over 80 years. A seminal work by Eli P. Cox III (1980) summarized the research literature on the optimal number of response alternatives for a scale. According to Cox:

"... as the number of response alternatives is increased beyond some minimum, the demands placed upon a respondent become sufficiently burdensome that an increasing number of discrepancies occur between the true scale value and the value reported by the respondent. Thus, although the information transmission capacity of a scale is improved by increasing the number of response alternatives, response error seems to increase concomitantly. Accordingly, one can hypothesize that the relationship between the amount of information actually transmitted by a scale and the number of response alternatives it provides is similar to that shown in Figure 1⁴. It can be argued that the optimal number of response alternatives is found at the point where the amount of transmitted information is maximized..."

[Figure 1. Relationship between the amount of information transmitted by a scale (vertical axis: transmitted information) and the number of its response alternatives (horizontal axis: response alternatives). Adapted from Eli P. Cox III, 1980.]

4 Likert's original monograph describing his technique, an important landmark in attitude measurement.

There is no definitive criterion or formula for deciding the optimal number of response alternatives under all circumstances. Past research does, however, suggest broad guidelines. Scales with two or three responses are generally inadequate for transmitting full information and may not provide sufficient alternatives to respondents. Marginal returns from using more than nine responses are minimal. Fowler (1995), for example, shows that five to seven points appear sufficient for meaningful responses in most rating tasks. In the case of subject-centered scales, Cox (1980) found that five alternatives appear adequate for the individual items and suggests that energy is best spent on increasing the number of quality items constituting the composite scale. However, this same body of research also shows that changing scales and labels may change the distribution of responses significantly (Schwarz et al., 1991).

The theoretical justification for using an odd-point scale is that it has a specific midpoint which denotes an average response, or a neutral response when moving from a negative to a positive range. On an even-point scale, by contrast, a participant who wants to respond "average" or "neutral" is forced to choose in a specific direction (for example, 3 or 4 on a 6-point scale), neither of which is exactly the midpoint.

Furthermore, the explicit midpoint plays a crucial role. An odd rather than an even number of response alternatives is preferable under circumstances in which the respondent can legitimately adopt a neutral position (Cox, 1980). Offering an explicit middle alternative in a forced-choice attitude item increases the proportion of respondents in that category. On most issues the increase is in the neighborhood of 10 to 20%, but it may be considerably larger (Schuman & Presser, 1981). This means that some participants prefer to remain neutral on various issues, and if they are given the choice of a neutral response then about 10 to 20% of participants will select it.

Alternatively, an even-point scale may introduce bias, especially when overall responses tend to be very high or very low. Under these conditions, participants who have no alternative for responding neutrally tend to select a response in the direction of their overall response level. For example, on a 6-point scale, a response of 4 is more likely to be selected when overall responses are high and a response of 3 when overall responses are low. This can bias results by making good results better and bad results worse.

Test for Bias in the EDI 6-point scale

A justification for moving from a 6-point to a 5-point scale in EDI would be the presence of a pattern of bias, as explained above. To determine whether such bias exists in the use of a 6-point scale, and in which direction, we analyzed data for a five year period (1993 to 1997), using 5,902 participant observations from all 214 Senior Policy Seminars organized by EDI during that time. All seminars used the same basic questionnaire shown in Appendix A. We identified 8 common questions and examined whether the absence of a midpoint on the 6-point scale introduced any bias when a participant could not select a neutral opinion but was forced to choose either 3 or 4.

To test for the presence of bias, we identified participants who made "very strong favorable" or "not very strong favorable" responses to an activity, and then studied how they used points 3 and 4 on the 6-point scale. If a participant responded 5 or 6 (on the 6-point scale) to more than 5 of the 8 common questions (see Appendix A), we classified that respondent as having a "strong favorable" opinion about that particular activity.
Alternatively, if the participant responded 5 or 6 to only 2 or fewer questions, we classified that respondent as having a "not very strong favorable" opinion about that particular activity. After identifying and classifying these participants, we studied their responses to the remaining questions and identified how frequently they chose a response of 3 or 4 on the 6-point scale.

Table 1 shows the results from this analysis using the 214 activities with 5,902 participants. Column 2 shows the number of times a participant selected either a 5 or 6 response out of the 8 questions. Thus, a figure of 0 in Column 2 indicates that participants did not reply favorably to any of the 8 questions, while a figure of 7 indicates that participants replied very favorably to 7 out of the 8 questions asked on the questionnaire. Arranging the data in this manner allows us to see the responses to alternatives 3 and 4 for participants who have formed "strong favorable" or "not so strong favorable" opinions about an activity. Column 3 shows the number of times a response of 3 was chosen and Column 4 shows the number of times a response of 4 was chosen.

A trend clearly visible in these data is that, as a participant's opinion about an activity becomes more favorable, he or she is more and more likely to select response 4 over 3, even when the total number of responses of 3 and 4 remains approximately the same. The percentage of times 4 is chosen increases from 67% to 97% as we move from the "not so strong favorable" category to the "strong favorable" category. Conversely, as the opinion about an activity becomes more negative, a respondent is more and more likely to choose 3. The percentage selecting 3 increases steadily from 11% to 33% as the participant's opinion becomes less favorable.

Table 1. Analyzed responses for 5,902 participants over 5 years (1993 to 1997) to the eight questions of the questionnaire in Appendix A.
Columns: Number of participants | Number of times response 5 or 6 is chosen out of 8 | Number of times 3 is chosen (A) | Number of times 4 is chosen (B) | Total (A+B) | % choosing 3 | % choosing 4
Legible values (% choosing 4, for 0 to 7 responses of 5 or 6):

  Times 5 or 6 chosen (out of 8)    % choosing 4
  0                                 67%
  1                                 78%
  2                                 82%
  3                                 86%
  4                                 89%
  5                                 92%
  6                                 95%
  7                                 97%
These results show that the overall opinion formed about a seminar influences a participant's response to EDI evaluation questions in the neutral range. Since there was no neutral point or midpoint, the participant was forced to choose either 3 or 4. A 3 was chosen more often when overall opinion was not strongly favorable and a 4 was chosen more often when overall opinion was strongly favorable, thus making good performance look better and bad performance look worse. Participants therefore need a neutral response alternative, which is precisely what a 5-point scale provides.

This was supported further when we examined responses on the new 5-point scale. In one of the activities using the 5-point scale⁵, about 31% of the 29 participants responded "neutral" to one question, and more than 20% of the responses were neutral for 5 of the 20 questions. This reinforces the need for a midpoint option in EDI evaluation questionnaires.

5 Activity code 1F98FS4C
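The classification procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `neutral_range_split`, its parameters, and the three example answer lists are hypothetical and are not EDI data or the Evaluation Unit's actual code. The sketch groups respondents by how many of their eight answers were 5 or 6, then tallies how often each group chose 3 versus 4.

```python
from collections import defaultdict

def neutral_range_split(respondents, favorable=(5, 6), low=3, high=4):
    """Tally 3s and 4s, grouped by how many answers were 5 or 6.

    respondents: list of answer lists, one per participant, each answer an
    integer from 1 to 6. Returns {count_of_5_or_6: (n3, n4, pct_choosing_4)}.
    """
    tallies = defaultdict(lambda: [0, 0])
    for answers in respondents:
        strong = sum(1 for a in answers if a in favorable)
        tallies[strong][0] += answers.count(low)
        tallies[strong][1] += answers.count(high)
    table = {}
    for strong, (n3, n4) in sorted(tallies.items()):
        total = n3 + n4
        # Share of neutral-range answers that landed on 4 rather than 3.
        pct4 = 100.0 * n4 / total if total else None
        table[strong] = (n3, n4, pct4)
    return table

# Invented answers for three participants (8 questions each), not EDI data.
example = [
    [6, 5, 6, 5, 6, 5, 4, 4],
    [3, 3, 4, 2, 5, 1, 3, 4],
    [5, 6, 5, 5, 6, 6, 4, 3],
]
split = neutral_range_split(example)
for strong, (n3, n4, pct4) in split.items():
    print(strong, n3, n4, pct4)
```

Each row of the returned dictionary corresponds to one row of Table 1: the key plays the role of Column 2, and the tuple holds the counts of 3s and 4s and the percentage choosing 4.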

Issues of comparison and conversion

Results of our analysis show clear evidence that bias existed as a result of EDI's use of the 6-point scale. Furthermore, the form of that bias appears consistent with the direction specified by past research, in that it tends to exaggerate both positive and negative responses. With this established, the issues remain of how to compare and convert results from the 5-point and 6-point scales.

Experiment with different scales

The EDI Evaluation Unit conducted an experiment to observe directly how changing scales affects responses, and how this information could be used to convert scores between the two scales. Two EDI sponsored workshops on Partnership for Poverty Reduction were used to conduct the experiment. At the end of each workshop, participants were randomly and unobtrusively divided into two groups. One group was asked to fill out a questionnaire using a 6-point scale and the other a questionnaire using a 5-point scale. The workshops were held in San Salvador, El Salvador, in November 1997 and in Kingston, Jamaica, in January. The objective of the experiment was to produce a set of results whose differences were caused only by the use of different scales.

Results from this experiment are shown in Appendix C. These results were used to make inferences about the underlying population of mean scores on 5-point and 6-point scales for activities of the same quality. Both scales start from 1, where 1 denotes the worst performance. The scales increase with the performance level, and the last number (5 or 6) on the scale represents the best performance.

Two important facts have to be considered. The first is that, whereas a 5-point scale has a specific midpoint of 3 (denoting a neutral/average response), the 6-point scale has no such provision. On a 6-point scale, a participant who wants to respond "neutral" is forced to choose either 3 or 4, neither of which is exactly the midpoint of the scale (the true midpoint is 3.5). The second is that both scales try to capture a continuum in which the lowest position shows the worst performance, which gradually improves as the scale increases. This is shown graphically in Figure 2.

[Figure 2. Two scales measuring the same activity: the same performance measured on two different scales.]

Results from the experiment were used to establish two points with regard to the 5-point and 6-point scales. The first was to determine whether the mean scores on the two scales were normally distributed. If they were, the second was to determine whether the variances and mean values of the scales differed. If it could be established that (i) the mean scores were normally distributed with equal variances, and (ii) the only difference between mean scores from the two scales was due to the difference between their population means, then the basis for a conversion would be to adjust for the simple difference between these population means. The results from this experiment are presented below.

Test for Normality

Appendix C gives the mean values of responses to 8 questions asked in both seminars using both the 5-point and 6-point scales. It can be inferred from the Central Limit Theorem (CLT) that the mean scores are approximately normally distributed irrespective of the distribution of the individual responses. We nevertheless checked this using the Kolmogorov-Smirnov test for normality, which gave a Lilliefors significance of 20% (a p-value of 0.2) for mean scores on both scales, so the null hypothesis of normality cannot be rejected. The fact that the distributions of mean scores are approximately normal gives us a way to compare the scores on the two scales. This result is also useful for performing tests that compare means and variances.

Test for variances and means

Once we had established that the data from both the 5-point and 6-point scales were approximately normally distributed, we tested whether there was a significant difference between the variances and means of the two groups. It is also desirable to test for equality of variances before running tests that compare means, such as the Student's t-test. Results are shown in Box 1 and in Tables 2 and 3. The null hypothesis that the variances of responses on the two scales are the same cannot be rejected. The high significance value (p = 52.1%) gives no evidence that the variances differ. Thus, merely increasing or decreasing the scale by one point does not change the variance of mean scores significantly.

After finding equal variances of mean scores on both scales, the next step was to compare overall means. If overall means differ but variances are the same, we can derive a correction factor to compare results obtained using different scales. The results are shown in Box 2 and Tables 2 and 3.

The distributions of the mean scores on the two scales are normal but not identical. The tests for distribution parameters (see Box 1 and Box 2) show that, statistically, the variances on the two scales are the same but the population means are different. In fact, the overall mean on a 6-point scale is statistically higher than the overall mean on a 5-point scale. Note that in this experiment, variables such as quality and the background of participants and trainers have been controlled. The data are from the same program, conducted by the same task manager, so quality is not a variable here. Also, participants were randomly divided into two groups, so there should be no systematic bias in either group. The only variable in the experiment is the scale used in the questionnaires.

[Table 2. SPSS output for Group Statistics for comparing means and variances. Variable: MEAN, grouped by SCALE. Columns: N, Mean, Std. Deviation, Std. Error Mean.]

[Table 3. SPSS output for Independent Samples Test. Rows: equal variances assumed; equal variances not assumed. Columns: Levene's Test for Equality of Variances (F, Sig.); t-test for Equality of Means (t, df, Sig. (2-tailed), Mean Difference, Std. Error Difference, 95% Confidence Interval of the Difference: Lower, Upper).]

Results from the test for equal variances give no evidence that the population variances differ: the significance of Levene's test, 0.521, is far above any conventional threshold for rejecting equality. The test for equal means yields a p-value of approximately zero, indicating a significant difference between the means of the two populations. Thus, even though the population variances are statistically the same, the population means are not. The mean on the 6-point scale is higher than the mean on the 5-point scale by 0.61. The population distributions are shown in Figure 3.

[Figure 3. Population distribution of mean scores on the 5-point scale (mean µ5) and the 6-point scale (mean µ6); 0.61 is the difference found in the experiment.]

Box 1: Test for equality of variances

TEST FOR VARIANCES OF TWO DISTRIBUTIONS

Let
$\sigma_5$ = standard deviation of mean on 5-point scale
$\sigma_6$ = standard deviation of mean on 6-point scale
$n_5$ = number of observations on scale of 5
$n_6$ = number of observations on scale of 6
$\alpha$ = level of significance

In our case $n_5 = 16$ and $n_6 = 16$, and the standard deviations $\sigma_5$ and $\sigma_6$ are approximated by the sample standard deviations.

For a level of significance of 10% ($\alpha = 0.10$), $F_{15,15,.05} = 2.40$.

The null hypothesis is that both population variances are the same, that is,

$$H_0 : \sigma_5^2 = \sigma_6^2$$

against the two sided alternative

$$H_1 : \sigma_5^2 \neq \sigma_6^2$$

The decision rule is to reject $H_0$ if

$$\frac{\sigma_6^2}{\sigma_5^2} > F_{15,15,.05}$$

The observed ratio $\sigma_6^2 / \sigma_5^2$ is clearly below 2.40. Therefore, the null hypothesis cannot be rejected at a significance level of 10%. In fact the significance level in Levene's Test for equality of variances is 52.1%, showing that the variance does not change significantly when the scale is merely increased or decreased by one point.

Box 2: Test for means of two distributions

TEST FOR POPULATION MEANS OF TWO DISTRIBUTIONS

Let
$\bar{X}_5$ = overall sample mean of mean scores on 5-point scale
$\bar{X}_6$ = overall sample mean of mean scores on 6-point scale

The result of the above test enables us to test the (in)equality of the two population means. From the observed sample variances, an estimate of the common population variance is

$$s^2 = \frac{(n_5 - 1)\,\sigma_5^2 + (n_6 - 1)\,\sigma_6^2}{n_5 + n_6 - 2}$$

Now let us test the null hypothesis that, all else equal, the mean score on a 6-point scale is equal to the mean score on a 5-point scale, that is,

$$H_0 : \mu_6 = \mu_5$$

where
$\mu_5$ = population mean score on a 5-point scale
$\mu_6$ = population mean score on a 6-point scale

against the two sided alternative

$$H_1 : \mu_6 \neq \mu_5$$

This test uses the Student's t distribution, as the number of observations in each group is less than 30. The decision rule (for a significance level of 10%) is to reject the null hypothesis if

$$\frac{\left|\bar{X}_6 - \bar{X}_5\right|}{s\,\sqrt{(n_5 + n_6)/(n_5\,n_6)}} > t_{30,\,.05}$$

In our case the computed statistic exceeds the critical value. Thus the null hypothesis that the two means are equal is clearly rejected at a significance level of 10%. In fact the p-value of the test is almost zero. See the SPSS output for the Independent Samples Test in Table 2 and Table 3.
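The two decision rules in Box 1 and Box 2 can be illustrated with a short Python sketch. It is a hedged illustration, not a reproduction of the SPSS analysis: the function names are hypothetical, the means and standard deviations passed in the usage lines are invented placeholders (the report's own sample values are not reproduced here), and only the group sizes of 16 and the critical value F(15, 15, .05) = 2.40 come from the text. The critical t value of 1.697 for 30 degrees of freedom is taken from standard t tables and is an addition to the source.

```python
import math

def variance_ratio(sd_a, sd_b):
    """Ratio of the larger sample variance to the smaller one (always >= 1)."""
    var_a, var_b = sd_a ** 2, sd_b ** 2
    return max(var_a, var_b) / min(var_a, var_b)

def pooled_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Two-sample t statistic using a pooled variance estimate.

    Returns (t, degrees_of_freedom).
    """
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / df
    std_err = math.sqrt(pooled_var * (n_a + n_b) / (n_a * n_b))
    return (mean_a - mean_b) / std_err, df

F_CRITICAL = 2.40   # F(15, 15, .05), as quoted in Box 1
T_CRITICAL = 1.697  # t(30, .05) from standard tables (not quoted in the source)

# Placeholder inputs for illustration only; these are not the study's values.
ratio = variance_ratio(0.4, 0.5)
t_stat, df = pooled_t(4.6, 0.4, 16, 4.0, 0.4, 16)

print("reject equal variances:", ratio > F_CRITICAL)
print("reject equal means:", abs(t_stat) > T_CRITICAL, "with df =", df)
```

Placing the larger variance in the numerator keeps the ratio at or above 1, so a single upper critical value suffices for the two sided test at the 10% level.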

After establishing these conditions, we can derive a fairly accurate method of converting mean scores from a 6-point scale to a 5-point scale and vice versa. In fact, the problem reduces to arriving at the best point estimates of the two population means since, as Figure 3 shows, the mean scores are distributed with the same variance but different means. So we can simply compare the means of the two populations. Say that, for the same activity, µ5 = population mean score on a 5-point scale and µ6 = population mean score on a 6-point scale. We have shown that µ6 > µ5. If we have best point estimates of µ5 and µ6, then we can convert (for the purpose of comparison) a mean score on a 5-point scale to a mean score on a 6-point scale by adding (µ6 - µ5) to it. For example, if the best estimate of this difference is 0.61, then a score on a 5-point scale should be increased by 0.61 before comparing it with a score on a 6-point scale.

Verifying results beyond the experiment

Results from the experiment show that, on average, the mean score of responses on a 6-point scale is 0.61 higher than the score on a 5-point scale. But this result is based on only one activity and may not be the best estimate of the difference between the two population means. It only indicates that the actual mean difference lies somewhere near 0.61. We attempted to verify the results of this experiment and obtain a better estimate of this difference with data from more activities. The ideal situation would be to use data from multiple activities measured on the two scales simultaneously. This was not practical: with the exception of this study, no such experiments had been done, nor had activities been compared using different scales on the same set of questions.

To overcome this constraint, we randomly selected activities that had been evaluated on the different scales. Random selection was used to ensure that the overall, average quality of the seminars in both categories was the same. We also selected similar questions from these activities to compare. As the questions were not identical, we divided them into four broad categories (see Appendix D). Questions on administration and logistics were excluded, as were questions asked before the start of the activity. Only data from the end-of-course evaluation questionnaires were used. Every end-of-activity questionnaire contained questions about the relevancy of the activity, the quality of its contents, its usefulness in improving knowledge, its ability to meet its objectives, and whether it had been a worthwhile use of participants' time.

We randomly selected 20 responses to these questions from different activities, which were themselves selected randomly. A sample size of 20 was chosen to (i) ensure a constant sample size and (ii) ensure that the sample distribution remained normal. Also, in most EDI activities, the number of participants was 20 or more. Using these similar questions and 20 responses, we calculated scores for various activities evaluated on 5-point and 6-point scales (see Appendix D for details). The normality tests showed that sample means on both scales were normally distributed, with a Lilliefors significance of 20% (see Appendix E). We then compared the variances and means of these responses. Results from these comparisons are shown in Table 4 and Table 5.
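The sampling step described above, drawing a fixed number of 20 responses to a question and averaging them, might look like the following in Python. This is a minimal sketch under assumptions: the function name `sampled_mean_score` and its parameters are hypothetical, the response lists are invented, and the report does not say what software or random number procedure was actually used.

```python
import random
from statistics import mean

def sampled_mean_score(responses, k=20, rng=None):
    """Mean of k responses drawn at random, without replacement.

    responses: list of integer ratings given to one question in one activity.
    rng: optional random.Random instance, so a draw can be reproduced.
    """
    rng = rng or random.Random()
    if len(responses) < k:
        raise ValueError("activity has fewer responses than the sample size")
    return mean(rng.sample(responses, k))

# Invented ratings on a 5-point scale for one question, not EDI data.
ratings = [1, 2, 3, 4, 5] * 10
score = sampled_mean_score(ratings, rng=random.Random(1))
print(score)
```

Fixing the sample size at 20 means every activity contributes a mean based on the same number of responses, which is the first of the two reasons the text gives for that choice.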

[Table 4. SPSS output for Group Statistics for comparing means and variances. Variable: MEAN, grouped by SCALE. Columns: N, Mean, Std. Deviation, Std. Error Mean.]

[Table 5. SPSS output for Independent Samples Test. Rows: equal variances assumed; equal variances not assumed. Columns: Levene's Test for Equality of Variances (F, Sig.); t-test for Equality of Means (t, df, Sig. (2-tailed), Mean Difference, Std. Error Difference, 95% Confidence Interval of the Difference: Lower, Upper).]

These results are consistent with our earlier findings using the experimental data. The hypothesis of equal variances cannot be rejected, as shown by the high significance level of Levene's Test (p-value of 17.8%). The means, however, are clearly different. These results show that the mean response to questions falling in the four categories of relevancy, course content/usefulness, objectivity and worthwhileness is 0.78 higher on a 6-point scale than on a 5-point scale measuring the same activity. For example, if for a particular activity in the future the mean score on a 5-point scale is 4.5, then the comparable figure on a 6-point scale will be 4.5 + 0.78 = 5.28.

Conclusions

Conversion to a 5-point scale seems appropriate for EDI. This scale allows participants to select a valid response of neutral/average when rating an activity and removes the response bias associated with the missing midpoint of an even-point scale. A 6-point scale has no neutral/average midpoint, which forces participants to choose either 3 or 4 and can distort the overall rating of an activity by making good results better and bad results worse. This appeared to be the case in EDI when the 6-point scale was in use. The midpoint or neutral response also appears to be a valid choice when rating EDI activities. When participants are given the choice of a neutral/average midpoint, a considerable number (up to 30% in some activities) do select this response category.
Our study also found that when the same activity is measured on two different scales, the mean scores on a 6-point scale will be greater than the mean scores on a 5-point scale, but the variance will not differ significantly when the scale is decreased or increased by one point. This may not be true if the scale changes by more than one point. These results provide the basis for converting and comparing the ratings on a 5-point scale to those of a 6-point scale by adjusting the means accordingly. We have attempted to find the best point estimate of the mean difference between scores on different scales, for the same quality of activities, and for questions falling in four categories, viz., relevancy, course content/usefulness, objectivity and worthwhileness. This difference, 0.78, can be used as a benchmark to convert/compare scores obtained on a 5-point scale with previous scores obtained on a 6-point scale. For example, if for a particular activity in the future the mean score on a 5-point scale to a question is 4.5, then a comparable figure on a 6-point scale will be 4.5 + 0.78 = 5.28. Task Managers now have a method for comparing scores from past activities, which used the 6-point scale, to current activities, which use the 5-point scale.
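The conversion described above amounts to adding a fixed offset to a mean score. The sketch below illustrates this; the function names and the range checks are assumptions made for this example, and only the 0.78 benchmark and the 4.5 example come from the text. The reverse direction (6-point to 5-point) is included on the assumption that the same offset applies both ways.

```python
MEAN_SHIFT = 0.78  # benchmark difference between 6-point and 5-point means


def to_six_point(mean_five_point, shift=MEAN_SHIFT):
    """Convert a mean score on the 5-point scale to a comparable 6-point figure."""
    if not 1.0 <= mean_five_point <= 5.0:
        raise ValueError("a 5-point mean must lie between 1 and 5")
    return round(mean_five_point + shift, 2)


def to_five_point(mean_six_point, shift=MEAN_SHIFT):
    """Convert a mean score on the 6-point scale to a comparable 5-point figure."""
    if not 1.0 <= mean_six_point <= 6.0:
        raise ValueError("a 6-point mean must lie between 1 and 6")
    return round(mean_six_point - shift, 2)


print(to_six_point(4.5))    # the example from the text: 4.5 + 0.78
print(to_five_point(5.28))  # the same example in reverse
```

Because the offset is a single point estimate from activities that mostly scored well, very low 6-point means would map below 1 on the 5-point scale; the sketch does not clamp such values.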

References

Kirkpatrick, Donald L. (1994). Evaluating Training Programs: The Four Levels. San Francisco: Berrett-Koehler Publishers.

Cox III, Eli P. (1980). The Optimal Number of Response Alternatives for a Scale: A Review. Journal of Marketing Research, Vol. XVII (November 1980).

Schuman, Howard & Presser, Stanley (1981). Questions and Answers in Attitude Surveys: Experiments on Question Form, Wording, and Context. New York: Academic Press.

Schwarz, Norbert, Knauper, Barbel, Hippler, Hans-J., Noelle-Neumann, Elisabeth & Clark, Leslie (1991). Rating Scales: Numeric Values May Change the Meaning of Scale Labels. Public Opinion Quarterly, 1991.

DeVellis, Robert F. (1991). Scale Development: Theory and Applications. Applied Social Research Methods Series, Volume 26. SAGE Publications.

Newbold, Paul (1994). Statistics for Business & Economics. NJ: Prentice-Hall, Inc.

Fowler, Floyd J. (1995). Improving Survey Questions: Design and Evaluation. Applied Social Research Methods Series, Volume 38. SAGE Publications.

Oxenham, John (1997). End-of-Activity Evaluations - A Three Year Retrospective. Office Memorandum of The World Bank dated June 9,


APPENDIX C

Results of the experiment

[Table. Columns: Question Number; Activity 4N97CA5C (Five-Point Scale Mean Responses, Six-Point Scale Mean Responses); Activity 4J98CA5C (Five-Point Scale Mean Responses, Six-Point Scale Mean Responses).]

The following questions were asked:

1. To what degree do you feel we achieved our objective?
2a. Was the workshop personally useful in providing better information?
2b. Was the workshop personally useful in providing new or expanded concepts?
2c. Was the workshop personally useful in revealing a wider range of policy options?
2d. Was the workshop personally useful in enabling you better to assess policy alternatives?
3. To what degree has this seminar been relevant to your current official functions?
4. To what degree has this seminar been a worthwhile use of your time?
5. To what extent did the seminar materials contribute to the effectiveness of the seminar?


APPENDIX D

Mean scores for different categories of questions asked on two different scales

Categories: Relevancy; Quality of contents/Usefulness in improving knowledge; Meeting of objectives; Worthwhile use of time.
Each entry gives a question followed by its mean score in parentheses. The number in parentheses after an activity code is the scale used.

Activities 4F98PF1C (5) and 1F98FS4C (5)
To what degree are the materials useful to you? (4.6)
To what degree did the workshop focus on issues important to you? (4.65)
What I learned is relevant to my daily work. (4.65)
To what degree were the presentations useful to you? (4.25)
The workshop content was well prepared. (4)
To what degree do you feel the workshop achieved its objectives? (4.55)
The workshop achieved its objectives. (4.2)
To what degree was the workshop a worthwhile use of your time? (4.7)
The workshop was a worthwhile use of my time. (4.55)

Activities 1R98JA3C (5) and 1R98KE5F (5)
To what extent was the course relevant to your current work or functions? (4.4)
Was the consultation relevant to your country's needs? (4.45)
I know much more about investigative journalism now. (4.05)
To what extent were the materials constructive? (4.2)
To what extent did the course help you better assess the consequences of different policy alternatives? (3.35)
Did the consultation treat the issues in sufficient depth? (3.55)
Did the consultation include relevant and useful presentations? (4.25)
To what extent did the course achieve the stated objectives? (3.5)
Did the consultation achieve the objectives you had in mind? (4)
To what extent was the course a worthwhile use of your time? (3.95)
Was the consultation a worthwhile use of your time? (4.45)

Mean scores for different categories of questions asked on two different scales (continued)

Activity 1R98PE3C (6)
Has the seminar been relevant to your official function? (5)
Did the seminar focus on the most important issues? (5)
Did we achieve our objectives? (4.75)
Has the seminar been a worthwhile use of your time? (4.95)
Did the seminar enable you to be better informed? (4.75)

Activity 4R98EB5C (6)
Was the seminar relevant to your current work or functions? (4.75)
Did it focus on what you most needed to address to improve project design and implementation? (4.4)
Did the seminar achieve its stated objectives? (4.8)
Was the seminar a worthwhile use of your time? (5.25)

Activity 7J98AE4C (6)
To what extent was the first week of this course relevant to your current work or functions? (4.95)
Do you feel better equipped to design and implement health sector projects? (4.6)
To what extent was the first week of this course relevant to your country's needs? (5.1)
To what extent did the first week of this course help you improve your regulatory skills? (4.95)
To what extent was the first week of this course a worthwhile use of your time? (5.35)

Activity 298RFI3C (6)
To what degree has this seminar been relevant to your official functions? (5.35)
To what extent did the first week of this course help you clarify the next steps to undertake? (4.8)
To what extent did the seminar materials contribute to the effectiveness of the seminar? (5.5)
To what degree do you feel we achieved our objectives? (5.1)
To what degree has the seminar been a worthwhile use of your time? (5.2)
Was the seminar personally useful in providing better information? (5.35)
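The comparison of means can be retraced from the scores listed in this appendix. The sketch below assumes that the 20 scores per scale shown above are the inputs to the group comparison, which is an inference from the listing and not something the text states. It uses a hand-written pooled-variance t statistic from the Python standard library, not SPSS, and it does not reproduce Levene's Test or any p values. The names `five_point`, `six_point`, `standard_error` and `pooled_t` are illustrative.

```python
from math import sqrt
from statistics import mean, variance

# Mean scores as listed in Appendix D, in the order they appear.
five_point = [4.6, 4.65, 4.65, 4.25, 4.0, 4.55, 4.2, 4.7, 4.55, 4.4,
              4.45, 4.05, 4.2, 3.35, 3.55, 4.25, 3.5, 4.0, 3.95, 4.45]
six_point = [5.0, 5.0, 4.75, 4.95, 4.75, 4.75, 4.4, 4.8, 5.25, 4.95,
             4.6, 5.1, 4.95, 5.35, 5.35, 4.8, 5.5, 5.1, 5.2, 5.35]


def standard_error(values):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return sqrt(variance(values) / len(values))


def pooled_t(a, b):
    """Equal-variance two-sample t statistic for mean(b) - mean(a)."""
    df = len(a) + len(b) - 2
    pooled_var = ((len(a) - 1) * variance(a) + (len(b) - 1) * variance(b)) / df
    se_diff = sqrt(pooled_var * (1 / len(a) + 1 / len(b)))
    diff = mean(b) - mean(a)
    return diff, diff / se_diff, df


diff, t_stat, df = pooled_t(five_point, six_point)
print(f"mean difference = {diff:.2f}, t = {t_stat:.2f}, df = {df}")
print(f"std. error of mean (5-point) = {standard_error(five_point):.5f}")
```

Under that assumption, the mean difference comes out at the 0.78 reported in the text, which suggests the listed scores are indeed the data behind the conversion benchmark.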

APPENDIX E

[Table 1. Test of normality for mean scores using a five-point scale. SPSS Tests of Normality for MEAN1. Columns: Kolmogorov-Smirnov (a) (Statistic, df, Sig.); Shapiro-Wilk (Statistic, df, Sig.).]
*. This is a lower bound of the true significance.
a. Lilliefors Significance Correction

[Figure 1. Test of normality for mean scores using a five-point scale. Normal Q-Q plot of MEAN; axes: Observed Value, Expected Normal.]

[Table 2. Test of normality for mean scores using a six-point scale. SPSS Tests of Normality for MEAN2. Columns: Kolmogorov-Smirnov (a) (Statistic, df, Sig.); Shapiro-Wilk (Statistic, df, Sig.).]
*. This is a lower bound of the true significance.
a. Lilliefors Significance Correction

[Figure 2. Test of normality for mean scores using a six-point scale. Normal Q-Q plot of MEAN; axes: Observed Value, Expected Normal.]


More information

The Regression-Discontinuity Design

The Regression-Discontinuity Design Page 1 of 10 Home» Design» Quasi-Experimental Design» The Regression-Discontinuity Design The regression-discontinuity design. What a terrible name! In everyday language both parts of the term have connotations

More information

TACKLING WITH REVIEWER S COMMENTS:

TACKLING WITH REVIEWER S COMMENTS: TACKLING WITH REVIEWER S COMMENTS: Comment (a): The abstract of the research paper does not provide a bird s eye view (snapshot view) of what is being discussed throughout the paper. The reader is likely

More information

SINGLE-CASE RESEARCH. Relevant History. Relevant History 1/9/2018

SINGLE-CASE RESEARCH. Relevant History. Relevant History 1/9/2018 SINGLE-CASE RESEARCH And Small N Designs Relevant History In last half of nineteenth century, researchers more often looked at individual behavior (idiographic approach) Founders of psychological research

More information

Monte Carlo Analysis of Univariate Statistical Outlier Techniques Mark W. Lukens

Monte Carlo Analysis of Univariate Statistical Outlier Techniques Mark W. Lukens Monte Carlo Analysis of Univariate Statistical Outlier Techniques Mark W. Lukens This paper examines three techniques for univariate outlier identification: Extreme Studentized Deviate ESD), the Hampel

More information

DISCPP (DISC Personality Profile) Psychometric Report

DISCPP (DISC Personality Profile) Psychometric Report Psychometric Report Table of Contents Test Description... 5 Reference... 5 Vitals... 5 Question Type... 5 Test Development Procedures... 5 Test History... 8 Operational Definitions... 9 Test Research and

More information

ABOUT PHYSICAL ACTIVITY

ABOUT PHYSICAL ACTIVITY PHYSICAL ACTIVITY A brief guide to the PROMIS Physical Activity instruments: PEDIATRIC PROMIS Pediatric Item Bank v1.0 Physical Activity PROMIS Pediatric Short Form v1.0 Physical Activity 4a PROMIS Pediatric

More information

Intro to SPSS. Using SPSS through WebFAS

Intro to SPSS. Using SPSS through WebFAS Intro to SPSS Using SPSS through WebFAS http://www.yorku.ca/computing/students/labs/webfas/ Try it early (make sure it works from your computer) If you need help contact UIT Client Services Voice: 416-736-5800

More information

Readings: Textbook readings: OpenStax - Chapters 1 11 Online readings: Appendix D, E & F Plous Chapters 10, 11, 12 and 14

Readings: Textbook readings: OpenStax - Chapters 1 11 Online readings: Appendix D, E & F Plous Chapters 10, 11, 12 and 14 Readings: Textbook readings: OpenStax - Chapters 1 11 Online readings: Appendix D, E & F Plous Chapters 10, 11, 12 and 14 Still important ideas Contrast the measurement of observable actions (and/or characteristics)

More information

Funnelling Used to describe a process of narrowing down of focus within a literature review. So, the writer begins with a broad discussion providing b

Funnelling Used to describe a process of narrowing down of focus within a literature review. So, the writer begins with a broad discussion providing b Accidental sampling A lesser-used term for convenience sampling. Action research An approach that challenges the traditional conception of the researcher as separate from the real world. It is associated

More information

ANXIETY A brief guide to the PROMIS Anxiety instruments:

ANXIETY A brief guide to the PROMIS Anxiety instruments: ANXIETY A brief guide to the PROMIS Anxiety instruments: ADULT PEDIATRIC PARENT PROXY PROMIS Pediatric Bank v1.0 Anxiety PROMIS Pediatric Short Form v1.0 - Anxiety 8a PROMIS Item Bank v1.0 Anxiety PROMIS

More information

Discussion Meeting for MCP-Mod Qualification Opinion Request. Novartis 10 July 2013 EMA, London, UK

Discussion Meeting for MCP-Mod Qualification Opinion Request. Novartis 10 July 2013 EMA, London, UK Discussion Meeting for MCP-Mod Qualification Opinion Request Novartis 10 July 2013 EMA, London, UK Attendees Face to face: Dr. Frank Bretz Global Statistical Methodology Head, Novartis Dr. Björn Bornkamp

More information

1 The conceptual underpinnings of statistical power

1 The conceptual underpinnings of statistical power 1 The conceptual underpinnings of statistical power The importance of statistical power As currently practiced in the social and health sciences, inferential statistics rest solidly upon two pillars: statistical

More information

Is There any Difference Between the Results of the Survey Marked by the Interviewer and the Respondent?

Is There any Difference Between the Results of the Survey Marked by the Interviewer and the Respondent? Is There any Difference Between the Results of the Survey Marked by the Interviewer and the Respondent? Cetin Kalburan, PhD Cand. Tugce Aksoy, Graduate Student Institute of Social Sciences, Pamukkale University,

More information

Comparing multiple proportions

Comparing multiple proportions Comparing multiple proportions February 24, 2017 psych10.stanford.edu Announcements / Action Items Practice and assessment problem sets will be posted today, might be after 5 PM Reminder of OH switch today

More information

International Standard on Auditing (UK) 530

International Standard on Auditing (UK) 530 Standard Audit and Assurance Financial Reporting Council June 2016 International Standard on Auditing (UK) 530 Audit Sampling The FRC s mission is to promote transparency and integrity in business. The

More information

Improving Individual and Team Decisions Using Iconic Abstractions of Subjective Knowledge

Improving Individual and Team Decisions Using Iconic Abstractions of Subjective Knowledge 2004 Command and Control Research and Technology Symposium Improving Individual and Team Decisions Using Iconic Abstractions of Subjective Knowledge Robert A. Fleming SPAWAR Systems Center Code 24402 53560

More information

ABOUT SMOKING NEGATIVE PSYCHOSOCIAL EXPECTANCIES

ABOUT SMOKING NEGATIVE PSYCHOSOCIAL EXPECTANCIES Smoking Negative Psychosocial Expectancies A brief guide to the PROMIS Smoking Negative Psychosocial Expectancies instruments: ADULT PROMIS Item Bank v1.0 Smoking Negative Psychosocial Expectancies for

More information

Describe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo

Describe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo Please note the page numbers listed for the Lind book may vary by a page or two depending on which version of the textbook you have. Readings: Lind 1 11 (with emphasis on chapters 5, 6, 7, 8, 9 10 & 11)

More information

An Experimental Investigation of Self-Serving Biases in an Auditing Trust Game: The Effect of Group Affiliation: Discussion

An Experimental Investigation of Self-Serving Biases in an Auditing Trust Game: The Effect of Group Affiliation: Discussion 1 An Experimental Investigation of Self-Serving Biases in an Auditing Trust Game: The Effect of Group Affiliation: Discussion Shyam Sunder, Yale School of Management P rofessor King has written an interesting

More information

Midterm Exam MMI 409 Spring 2009 Gordon Bleil

Midterm Exam MMI 409 Spring 2009 Gordon Bleil Midterm Exam MMI 409 Spring 2009 Gordon Bleil Table of contents: (Hyperlinked to problem sections) Problem 1 Hypothesis Tests Results Inferences Problem 2 Hypothesis Tests Results Inferences Problem 3

More information

Comparability Study of Online and Paper and Pencil Tests Using Modified Internally and Externally Matched Criteria

Comparability Study of Online and Paper and Pencil Tests Using Modified Internally and Externally Matched Criteria Comparability Study of Online and Paper and Pencil Tests Using Modified Internally and Externally Matched Criteria Thakur Karkee Measurement Incorporated Dong-In Kim CTB/McGraw-Hill Kevin Fatica CTB/McGraw-Hill

More information

The Confidence Interval. Finally, we can start making decisions!

The Confidence Interval. Finally, we can start making decisions! The Confidence Interval Finally, we can start making decisions! Reminder The Central Limit Theorem (CLT) The mean of a random sample is a random variable whose sampling distribution can be approximated

More information

Basic Biostatistics. Dr. Kiran Chaudhary Dr. Mina Chandra

Basic Biostatistics. Dr. Kiran Chaudhary Dr. Mina Chandra Basic Biostatistics Dr. Kiran Chaudhary Dr. Mina Chandra Overview 1.Importance of Biostatistics 2.Biological Variations, Uncertainties and Sources of uncertainties 3.Terms- Population/Sample, Validity/

More information

Applied Statistical Analysis EDUC 6050 Week 4

Applied Statistical Analysis EDUC 6050 Week 4 Applied Statistical Analysis EDUC 6050 Week 4 Finding clarity using data Today 1. Hypothesis Testing with Z Scores (continued) 2. Chapters 6 and 7 in Book 2 Review! = $ & '! = $ & ' * ) 1. Which formula

More information

One-Way ANOVAs t-test two statistically significant Type I error alpha null hypothesis dependant variable Independent variable three levels;

One-Way ANOVAs t-test two statistically significant Type I error alpha null hypothesis dependant variable Independent variable three levels; 1 One-Way ANOVAs We have already discussed the t-test. The t-test is used for comparing the means of two groups to determine if there is a statistically significant difference between them. The t-test

More information

Module 28 - Estimating a Population Mean (1 of 3)

Module 28 - Estimating a Population Mean (1 of 3) Module 28 - Estimating a Population Mean (1 of 3) In "Estimating a Population Mean," we focus on how to use a sample mean to estimate a population mean. This is the type of thinking we did in Modules 7

More information

The t-test: Answers the question: is the difference between the two conditions in my experiment "real" or due to chance?

The t-test: Answers the question: is the difference between the two conditions in my experiment real or due to chance? The t-test: Answers the question: is the difference between the two conditions in my experiment "real" or due to chance? Two versions: (a) Dependent-means t-test: ( Matched-pairs" or "one-sample" t-test).

More information