measuring performance


Further challenges in measuring communication skills: accounting for actor effects in standardised patient assessments

Stephen J Lurie, Christopher J Mooney, Anne C Nofziger, Sean C Meldrum & Ronald M Epstein

CONTEXT Subjective rating scales for communication skills may yield more personally meaningful responses than more standardised rating schemes. It is unclear, however, whether such evaluations may be overly biased by respondents' rating styles, which may lead to unreliable measurement of examinees' communication skills.

METHODS Our study involved 212 students from the classes of 2005 and 2006 at the University of Rochester School of Medicine and Dentistry. All students were rated by actors depicting standardised patients (SPs) on the same seven cases using the 19-item Rochester Communication Rating Scale (RCRS). Different students were assigned to different actors playing the same SP. We assessed the extent to which actors' personal rating styles influenced the scores they assigned to students. Main outcome measures were: between-actor variability in responses; the degree to which actors' response styles contribute to overall scores, and improvements in reliability achieved by standardising actors' ratings.

RESULTS There were statistically significant differences between actors in their mean assigned scores. Scores aggregated over 18 separate SP cases have an expected generalisability coefficient of [...]. If raw RCRS scores are used, a total of 27 replications of the RCRS are required to achieve a Cronbach's alpha of 0.8; standardisation reduces this number to 18.

Office of Educational Evaluation and Research, University of Rochester School of Medicine and Dentistry, Rochester, New York, USA

Correspondence: Stephen J Lurie, 601 Elmwood Avenue, Box 601, Rochester, New York 14624, USA.
Email: stephen_lurie@urmc.rochester.edu

CONCLUSIONS Although actors are variable in their use of a standardised subjective scale of communication, such differences contribute to an acceptably small proportion of the total variance if scores are combined across a large number of cases. Reliability can be markedly improved by standardising scores across raters.

KEYWORDS *communication; *patient standardisation; physician–patient relations; New York; education, medical, undergraduate *methods; education, dental *methods; clinical competence *standards; patient satisfaction; humans.

Medical Education 2008: 42: doi: /j x
© Blackwell Publishing Ltd. MEDICAL EDUCATION 2008; 42:

INTRODUCTION
Nearly all medical schools now have formal curricula for teaching and assessing communication skills. Communication is a major component of the Medical School Outcome Project of the Association of American Medical Colleges [1], and its assessment is now a requirement of the National Board of Medical Examiners [2].

A number of formal tools for assessing communication skills have appeared in the last decade. Most of these tools involve theory-based coding systems that require trained raters to assess videotaped encounters between students and standardised patients (SPs) [3–5], whereas others [6] describe specific communication-related behaviours and tasks that can be objectively measured. Although such systems appear to be reliable, they are also quite labour-intensive. Thus, students may not have the opportunity to be scored on more than a small number of encounters. This limitation may be especially problematic because communication skills may be case-specific: a student who appears to be a good communicator in one set of clinical circumstances may perform poorly in another [7]. Because assessment of overall

communication skills appears to require a relatively large number of cases, its reliability may be limited by insufficient time or resources to provide a sufficient number of stations to assess these skills.

Overview

What is already known on this subject
Patients have idiosyncratic reactions to their doctors' communication skills. This may also be true for standardised patients.

What this study adds
Raters are highly variable in their use of a standardised communication rating scale. Such differences between raters substantially decrease the reliability of measurement. Using rater-standardised scores can increase reliability without additional cost.

Suggestions for further research
It remains unknown whether raters are similarly variable in other aspects of measurement, or whether it would be beneficial to train raters to produce more standardised responses.

This problem may be exacerbated by the use of checklists that ask SPs to describe their personal reactions to the interviewer's communication skills, rather than checklists of behaviours. Such subjective judgements may represent a different domain than that assessed with behavioural checklists [3,8]. In practice, however, their use is complicated by the fact that different SPs have their own systematic individual biases. Thus, it is possible that an interviewer who received a relatively high score from one SP may receive a relatively low score from another for exactly the same set of communications behaviours. This is an important source of extraneous variability in scores and it may make the overall score less reliable. Traditionally, the only way to improve the reliability of such a test is to make the test longer, in the hope that such sources of variability will cancel themselves out over multiple questions.

To address this question, Colliver et al. [9] compared average scores on a subjective checklist completed by different actors who portrayed the same case.
Although they found many significant differences between actors in their rating styles, these authors concluded that such differences tended to even out over a sufficiently large number of cases [9]. Unfortunately, because students did not receive repeated scores from different actors on the same cases, Colliver et al. were unable to quantify the amount of variability that such differences may introduce into individual scores, or how many cases would be required to negate effects of rating styles [9].

Other studies have used a more powerful generalisability study (G study) framework to assess these concerns. Colliver et al. [10,11] found that variability resulting from individual actors was relatively small compared with that of students and student–case interactions. Nonetheless, these authors found that 18 stations on an objective structured clinical examination (OSCE) were required to achieve a reliability coefficient [...]. It is possible, however, that reliability could be increased by standardising scores within actors, assuming that each actor evaluates a sufficiently large number of examinees to generate an actor-specific distribution of scores. Each actor's scores would then be normalised to the same mean and variance, and these recomputed scores would then be added to yield an overall composite score.

In order to assess this possibility, we performed three separate analyses of data acquired using the Rochester Communication Rating Scale (RCRS) [12], which SPs complete immediately after an encounter with a student. The RCRS was developed to assess the patient-centred communication skills of trainees at all levels, as well as those of practising doctors. Components of patient-centred communication that are assessed include: eliciting the patient's perspective; understanding the psychosocial context; developing a collaborative healing relationship, and actively involving the patient in decisions about his or her health.
Items were derived from a number of existing scales, the results of which have been correlated with patient satisfaction, adherence and well-being. The item pool was piloted using an ethnically and educationally diverse focus group of SPs and real patients in order to ascertain that the items would be easily understood. The scale was then iteratively revised. Items with low variability were eliminated, yielding the current 19-item version, which we have used for SP examinations since the year [...]. An SP requires an average of 2–3 minutes to complete

the instrument. Items on the RCRS are displayed in Table 1.

Table 1 Rochester Communication Rating Scale
1 Interviewer attended to my physical comfort during interview and physical examination
2 Interviewer's body language and tone of voice communicated caring and concern
3 Interviewer did not seem distracted
4 Interviewer made an effort to understand my feelings and emotions
5 Interviewer made me feel I could tell him or her anything, even something personal
6 Interviewer took interest in even my smallest problems and concerns
7 Interviewer allowed me to tell my story in my own words
8 Interviewer asked about all of my concerns early in the interview
9 Interviewer first asked about my general concerns, then asked about specific details
10 Interviewer greeted me warmly
11 Interviewer let me explain my problem without interruption
12 Interviewer asked about issues that affect my health, like family, culture, finances, work environment, access to care, alternative medicine, or spiritual beliefs
13 Interviewer asked how the illness affects my life at home or at work
14 Interviewer asked me if I had any questions
15 Interviewer checked to see if I was willing and able to follow through with the treatment plan
16 Interviewer clearly explained my problem and its treatment using language I could understand
17 Interviewer encouraged me to participate in treatment decisions to the extent I wished
18 Interviewer summed up and made sure he or she understood what I said
19 Interviewer tried to understand how I see my illness or problem
All questions were scored on a 6-point scale ranging from 1 = strongly disagree to 6 = strongly agree

Firstly, within seven different cases in which students were assessed by the RCRS, we compared raters' assigned scores against one another. In theory, if students are randomly assigned to raters, there should be no difference between raters in the scores they assign to students.
If significant differences between raters were to be found, however, this would suggest that raters introduce a significant source of bias into the scores that students receive.

Secondly, we performed a G study to more accurately quantify the amount of variance that raters introduce into scores. This analysis led to a decision study (D study) to predict how many different raters would be required to achieve a suitably reliable overall score. These results allowed us to compare the reliability of the RCRS against a similar scale used by Colliver et al. [9]

Finally, we standardised each rater's assigned scores, so that all the raters' distributions of assigned scores had identical means and standard deviations (SDs). We then assessed the reliability of these new standardised scores. If a significant improvement in reliability were to be found, this would suggest that standardising scores can provide a cost-effective alternative to increasing the number of SP stations as a way of improving reliability in assessing communication skills.

METHODS
Subjects were 212 students from the classes of 2005 and 2006 at the University of Rochester School of Medicine and Dentistry. All students undergo a comprehensive assessment at the end of both Years 2 and 3 of medical school. As part of this comprehensive assessment, students are exposed to eight SP cases in Year 2 and four in Year 3. After the encounter, the SP rates the student on a checklist of observed behaviours, as well as on the RCRS. Most actors return to portray the same case in multiple years. Some of the cases are changed each year in order to maintain the integrity of the cases. For the purposes of the current analysis, we analysed data only for the five cases that were common to both Year 2 assessments and the two cases common to both Year 3 assessments. Thus, we examined a total of seven cases that were used in both years.
We then performed three related analyses to assess how actor effects influence the reliability of the test.

Comparison of actors within cases
For each of the seven cases, we first produced box and whisker plots that depicted the range of scores assigned by each actor, in order to provide a visual representation of the degree of variability in actors'

use of the RCRS. Within each case, differences between actors were assessed by one-way analysis of variance.

Generalisability and decision analyses
Shavelson and Webb [19] point out that standard procedures for estimating variance components cannot be used when the objects of measurement (e.g. students) are nested within a facet of measurement (e.g. actors). They recommend the procedure followed by Shavelson et al. [20], who reported a study in which examinees rotated through several stations at which they were evaluated by one of several possible pairs of examiners. To address this situation in which examinees were nested within raters, these authors performed a separate G study for each subset of examinees, who were assessed by the same pair of examiners. This procedure was then repeated across each of the possible pairs of examiners. Each of these analyses thus represents a completely factorial design, as all examinees within each replication of the analysis were rated by the same pair of raters. (Raters are viewed as a random factor, with levels that change across each of the replications.) The results of these individual studies are then combined to yield pooled estimates and SDs of the variance terms. This procedure yields only three variance terms: students, raters and students × raters. Note that there are no nested terms.

To identify all possible pairs of actors, we first produced two-way frequency distributions of all 21 possible two-way combinations of the seven cases. For instance, if case A used six actors and case B used four actors, there would have been 24 possible combinations of evaluators in these two cases. We then examined each of these pairs of actors within this table, and retained those who had evaluated at least eight of the same students. Across the pairwise combinations of cases, a total of 452 actor pairs met this criterion.
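Each retained actor pair gives a fully crossed students × raters table with one rating per cell. The following is a minimal Python sketch of the textbook expected-mean-squares estimates for such a table. The function name and the toy table are hypothetical, and the article does not say which estimator or software the authors used, so this illustrates the general technique rather than their procedure.

```python
def crossed_variance_components(scores):
    """Variance components for a fully crossed students x raters table.

    scores[i][j] is the rating given to student i by rater j (one per cell).
    Returns (students, raters, residual); the residual is the students x raters
    interaction, which cannot be separated from error in this design.
    """
    n_s, n_r = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n_s * n_r)
    row_means = [sum(row) / n_r for row in scores]
    col_means = [sum(row[j] for row in scores) / n_s for j in range(n_r)]

    # Mean squares from a two-way ANOVA without replication.
    ms_students = n_r * sum((m - grand) ** 2 for m in row_means) / (n_s - 1)
    ms_raters = n_s * sum((m - grand) ** 2 for m in col_means) / (n_r - 1)
    ss_resid = sum(
        (scores[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(n_s)
        for j in range(n_r)
    )
    ms_resid = ss_resid / ((n_s - 1) * (n_r - 1))

    # Solve the expected-mean-square equations; estimates can come out
    # negative in small samples and are left untruncated here.
    var_students = (ms_students - ms_resid) / n_r
    var_raters = (ms_raters - ms_resid) / n_s
    return var_students, var_raters, ms_resid


# Hypothetical toy table: 3 students, each rated by the same 2 actors.
toy = [[1, 2], [2, 3], [3, 4]]
var_s, var_r, var_res = crossed_variance_components(toy)
```

Pooling would then amount to taking the mean and SD of each component across all retained pairs.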
For each of these actor pairs, we computed variance components for the factorial student × actor design, and then pooled the 452 resulting variance terms to yield final estimates for each of the three model parameters (students, actors and students × actors/error). Because actors are completely confounded with cases, it is not possible to determine a separate variance component reflecting the cases in this design. Similarly to a jack-knife test, this procedure also allowed us to derive empirical estimates of the variability of each of the variance components. Finally, we used these estimates to compute a generalisability coefficient for an 18-station test, which allowed us to compare our results with those of Colliver and colleagues [10,11].

Effect of standardising ratings
We standardised each student's communication rating scores for actor effects. Unlike the previous analyses, we used data from the RCRS scores the students received across all 12 cases in their particular year. We did this by first computing each actor's mean and SD across all the students he or she rated. Because we knew which actors generated each student's RCRS scores, we were able to generate actor-specific Z scores for each student. These Z scores essentially place all the scores on the same metric, thereby largely eliminating actor effects. We then used the Spearman–Brown prophecy formula to estimate the improvement in internal consistency across an entire OSCE examination using raw versus actor-standardised scores.

RESULTS
Comparison of actors within cases
Figure 1 displays actors' distributions of scores for one of the seven cases; very similar results were found for the other six cases. As the figure indicates, actors differ widely in both their mean scores and the range of scores they assign. In many cases, some actors' scores do not overlap at all; in these cases, one actor's highest-rated student may receive a lower score than another actor's lowest-rated student.
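The actor-specific Z scores described under 'Effect of standardising ratings' can be sketched in a few lines of Python. This is an illustration under stated assumptions: the record layout, the function names and the sample ratings are hypothetical, and the use of the sample SD (rather than the population SD) is an assumption, since the article does not specify it.

```python
from collections import defaultdict
from statistics import mean, stdev


def standardise_within_actor(ratings):
    """Convert raw scores to Z scores relative to each actor's own ratings.

    ratings: list of (student, actor, score) tuples.
    Returns {(student, actor): z}. Each actor needs at least two ratings,
    otherwise statistics.stdev raises StatisticsError.
    """
    by_actor = defaultdict(list)
    for _student, actor, score in ratings:
        by_actor[actor].append(score)
    stats = {actor: (mean(s), stdev(s)) for actor, s in by_actor.items()}

    z_scores = {}
    for student, actor, score in ratings:
        m, sd = stats[actor]
        # An actor who gives every student the same score carries no
        # information about rank order, so those Z scores are set to 0.
        z_scores[(student, actor)] = (score - m) / sd if sd > 0 else 0.0
    return z_scores


def composite(z_scores):
    """Sum each student's Z scores across the actors who rated them."""
    totals = defaultdict(float)
    for (student, _actor), z in z_scores.items():
        totals[student] += z
    return dict(totals)


# Hypothetical ratings: actor A is lenient, actor B is strict.
ratings = [("s1", "A", 5), ("s2", "A", 6), ("s3", "B", 1), ("s4", "B", 2)]
z = standardise_within_actor(ratings)
totals = composite(z)
```

In this toy data, students s1 and s3 receive the same Z score even though their raw scores are 5 and 1, because each was the lower-rated of the two students seen by their actor.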
There were significant differences between raters in all seven cases (Case 1, F(12, 197) = 8.91; Case 2, F(6, 191) = 19.77; Case 3, F(10, 185) = 9.70; Case 4, F(4, 190) = 45.20; Case 5, F(8, 188) = 4.35; Case 6, F(7, 191) = 9.40; Case 7, F(8, 193) = 5.48; all significant at P < 0.001).

Generalisability and decision analyses
Over the 432 replications, we found mean (SD) variance components of 0.04 (0.10) for students, 0.12 (0.22) for actors within cases, and 0.15 (0.12) for students × actors. A hypothetical 18-station SP examination would have a projected reliability of [...]. These estimates are in fact quite similar to those reported by Colliver and colleagues [10,11]. We note, however, that the SDs of these estimates are large relative to the means, suggesting that the distribution

of these estimates is skewed to the right. This is particularly true of the estimate of the variance component for students. Thus these pooled estimates, which are weighted averages, may tend to overstate the actual reliability of the scale over several replications.

Figure 1 Individual standardised patient rating styles for Case 1

Effect of standardising ratings
If the 12 raw RCRS scores are combined, the composite total would have a Cronbach's alpha of [...]. The reliability of the 12-item composite increases to 0.73 if the scores are first standardised within actors. Applying the Spearman–Brown formula, a total of 27 raw RCRS scores would be needed to achieve a Cronbach's alpha of 0.8, but only 18 such cases would be required to achieve that degree of reliability if the scores were first standardised within actors.

DISCUSSION
The RCRS was designed as a survey that could be completed by either SPs or real patients without any prior training in its use. Ideally such a scale, when utilised by an SP, would approximate the subjective impressions of real patients because these impressions are known to correlate with important outcomes such as adherence to treatment, switching doctors and health-related quality of life.

Unfortunately, SPs may substantially differ from one another in the average ratings they assign to trainees, as well as in the variability of the scores they assign. Within the same case, there may be actors whose scores do not overlap at all; the highest score assigned by one actor may be lower than the lowest score assigned by another. Furthermore, some actors essentially assign every examinee the same score, whereas others use nearly the entire range of the scale. Thus, when comparing students who were rated by different SPs, scores may be more reflective of different SPs' biases in using the scale, rather than the students' 'true' performances.
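Two of the projections reported in the Results can be checked with short Python functions. This is a sketch under stated assumptions: the function names are hypothetical, the Spearman–Brown calculation uses only the standardised alpha of 0.73 that survives in the text, and the generalisability formula is the standard relative coefficient for a crossed design, which the article does not confirm as the exact formula the authors applied.

```python
import math


def stations_needed(alpha_now, n_now, alpha_target):
    """Spearman-Brown prophecy: test length needed to reach a target reliability.

    alpha_now is the reliability of a composite of n_now components.
    """
    factor = alpha_target * (1 - alpha_now) / (alpha_now * (1 - alpha_target))
    # Round before taking the ceiling to guard against floating-point noise.
    return math.ceil(round(factor * n_now, 9))


def g_coefficient(var_students, var_residual, n_stations):
    """Relative generalisability coefficient for a mean over n_stations.

    Assumes the only error term is the students x actors residual.
    """
    return var_students / (var_students + var_residual / n_stations)


# Standardised scores: alpha of 0.73 over 12 cases, target alpha of 0.8.
n_standardised = stations_needed(0.73, 12, 0.8)

# Reported variance components: 0.04 (students), 0.15 (students x actors).
g_18 = g_coefficient(0.04, 0.15, 18)
```

With these inputs the first function returns 18, matching the number of standardised cases reported, and the second gives a coefficient of roughly 0.83 for an 18-station examination.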
Despite this effect, we found that when such scores are combined over a relatively large number of cases, students' 'true' scores appear to emerge in a more generalisable way. Even when accounting for the possible nesting of actors within cases, we found that an 18-station OSCE would be expected to yield generalisability coefficients of 0.8 for students' subjectively measured communication skills. Nonetheless, 27 such stations would be required to achieve an acceptable Cronbach's alpha. This may

explain why brief OSCE examinations based on SP satisfaction tend not to have high correlations with one another [8]. By the simple measure of standardising such scores within actors, only 18 such stations would be required to achieve a similar level of internal consistency. Such standardisation corrects for irrelevant sources of bias in rating style, while maintaining SPs' relative judgements about the communication skills of various trainees.

Although many institutions may lack the ability to provide a full 18 testing stations, our results underscore the fact that standardisation of raters' scores is a low-cost way to improve the reliability of any test of communication skills, irrespective of the length of the examination. Thus, we recommend that such standardisation be performed irrespective of the number of stations. By removing the contaminating effect of actors' rating styles, students can be provided with a more accurate estimate of their communication skills. Indeed, such a procedure would be concordant with the position of Swanson et al., of the National Board of Medical Examiners, who state: 'Despite the obvious importance of adjusting scores on alternate test forms to reduce measurement error and ensure equitable treatment of examinees, equating procedures are not typically employed.' [21] (p 67)

An alternative to such standardisation of scores would be to train SPs to a reasonable level of calibration, such that SPs can agree on students' levels of communication skills [22]. Such calibration, however, may also make the SPs less genuine and immediate in their responses to students' interpersonal skills, and may thus bias SPs in ways that could diminish the predictive validity of the scale [23,24]. Thus, the use of rater-specific standardised scores allows each rater to express his or her own subjective reactions while maintaining fairness in grading across large groups of examinees.
Of course, this procedure would also represent a highly cost-effective method of reducing the amount of resources required to obtain reliable scores of examinees' communication skills.

Our study has a number of limitations. Firstly, this analysis was performed on one SP programme within a single institution. It is possible that SPs elsewhere may be less variable in their responses. Nonetheless, we point out that SPs at our programme undergo a lengthy and rigorous training programme in order to depict these cases in a highly standardised way. Furthermore, because actors within each case are chosen to represent a specific patient, they are similar in terms of age, sex and racial or ethnic background. Another limitation of our study relates to the fact that we examined only seven SP cases, which represented a range of patient presentations. It is possible that an examination comprising more similar cases would find a higher reliability across cases, and thus less effect of individual rating styles. Secondly, we did not ask faculty observers to rate these interactions and thus it is impossible to assess the degree to which faculty may be more or less reliable observers than SPs.

Finally, we must point out that the generalisability of the RCRS appears to be at least as high as that of the scale used by Colliver et al. [10,11] The RCRS requires an average of 2–3 minutes to complete, and has now been used several thousand times by our SPs in assessing communication skills of medical students. Thus, the RCRS appears to be a feasible, reliable and low-cost way of assessing these important skills, particularly if scores are normalised within each assessor.

Contributors: SJL contributed to study design and the drafting of the article. CM, SM and RME contributed to the analysis and interpretation of data, and the critical revision of the manuscript. ACN contributed to the collection, analysis and interpretation of data, and the critical revision of the manuscript.
Acknowledgements: none.
Funding: none.
Conflicts of interest: none.
Ethical approval: this study was approved by the University of Rochester Institutional Review Board.

REFERENCES
1 Association of American Medical Colleges. Medical School Outcome Project. msop. [Accessed 16 April 2008.]
2 National Board of Medical Examiners. Step 2 Clinical Skills Examination. steps2/step2.html. [Accessed 17 May 2007.]
3 Krupat E, Frankel R, Stein T, Irish J. The Four Habits Coding Scheme: validation of an instrument to assess clinicians' communication behaviour. Patient Educ Couns 2006;62 (1):
4 van Thiel J, Kraan HF, van der Vleuten CP. Reliability and feasibility of measuring medical interviewing skills: the revised Maastricht History-taking and Advice Checklist. Med Educ 1991;25 (3):
5 Makoul G. The SEGUE Framework for teaching and assessing communication skills. Patient Educ Couns 2001;45 (1):

6 Roter D, Larson S. The Roter Interaction Analysis System (RIAS): utility and flexibility for analysis of medical interactions. Patient Educ Couns 2002;46 (4):
7 Guiton G, Hodgson CS, Delandshere G, Wilkerson L. Communication skills in standardised-patient assessment of final-year medical students: a psychometric study. Adv Health Sci Educ Theory Pract 2004;9 (3):
8 Chessman AW, Blue AV, Gilbert GE, Carey M, Mainous AG III. Assessing students' communication and interpersonal skills across evaluation settings. Fam Med 2003;35 (9):
9 Colliver JA, Robbs RS, Vu NV. Effects of using two or more standardised patients to simulate the same case on case means and case failure rates. Acad Med 1991;66 (10):
10 Colliver JA, Morrison LJ, Markwell SJ, Verhulst SJ, Steward DE, Dawson Saunders E, Barrows HS. Three studies of the effect of multiple standardised patients on intercase reliability of five standardised-patient examinations. Teach Learn Med 1990;2 (4):
11 Colliver JA, Marcy ML, Vu NV, Steward DE, Robbs RS. Effect of using multiple standardised patients to rate interpersonal and communication skills on intercase reliability. Teach Learn Med 1994;6 (1):
12 Epstein RM, Dannefer EF, Nofziger AC, Hansen JT, Schultz SH, Jospe N, Connard LW, Meldrum SC, Henson LC. Comprehensive assessment of professional competence: the Rochester experiment. Teach Learn Med 2004;16 (2):
13 Williams GC, Freedman Z, Deci EL. Promoting motivation for diabetics' self-regulation of HgbA1c. Diabetes 1997;45:
14 Greenfield S, Kaplan SH, Ware JE Jr, Yano EM, Frank HJL. Patients' participation in medical care: effects on blood sugar control and quality of life in diabetes. J Gen Intern Med 1995;3:
15 Kaplan SH, Greenfield S, Ware JE Jr. Assessing the effects of physician–patient interactions on the outcomes of chronic disease. Med Care 1989;27 (Suppl):
16 Safran DG, Taira DA, Rogers WH, Kosinski M, Ware JE, Tarlov AR. Linking primary care performance to outcomes of care. J Fam Pract 1998;47:
17 Safran DG, Kosinski M, Tarlov AR, Rogers WH, Taira DH, Lieberman N, Ware JE. The Primary Care Assessment Survey: tests of data quality and measurement performance. Med Care 1998;36:
18 Safran DG, Montgomery JE, Chang H, Murphy J, Rogers WH. Switching doctors: predictors of voluntary disenrolment from a primary physician's practice. J Fam Pract 2001;50:
19 Shavelson RJ, Webb NM. Generalizability Theory. Newbury Park, CA: Sage Publications 1991;
20 Shavelson RJ, Mayberry PW, Li W, Webb NM. Generalisability of job performance measurements: marine corps riflemen. Mil Psychol 1990;2 (3):
21 Swanson DB, Clauser BE, Case SM. Clinical skills assessment with standardised patients in high-stakes tests: a framework for thinking about score precision, equating, and security. Adv Health Sci Educ 1999;4:
22 Srinivasan M, Franks P, Meredith LS, Epstein RM, Kravitz RL. Connoisseurs of care? Unannounced standardised patients' ratings of physicians. Med Care 2006;44 (12):
23 Fiscella K, Franks P, Srinivasan M, Kravitz RL, Epstein R. Ratings of physician communication by real and standardised patients. Ann Fam Med 2007;5 (2):
24 Filipetto FA, Weiss LB, Switala CA, Bertagnolli JF Jr. The effectiveness of a first-year clinical preceptorship on the data collection and communication skills of second-year medical students. Teach Learn Med 2006;18 (2):

Received 4 June 2007; editorial comments to authors 19 October 2007, 20 December 2007; accepted for publication 3 January [...]

Communication Skills in Standardized-Patient Assessment of Final-Year Medical Students: A Psychometric Study

Communication Skills in Standardized-Patient Assessment of Final-Year Medical Students: A Psychometric Study Advances in Health Sciences Education 9: 179 187, 2004. 2004 Kluwer Academic Publishers. Printed in the Netherlands. 179 Communication Skills in Standardized-Patient Assessment of Final-Year Medical Students:

More information

COMPUTING READER AGREEMENT FOR THE GRE

COMPUTING READER AGREEMENT FOR THE GRE RM-00-8 R E S E A R C H M E M O R A N D U M COMPUTING READER AGREEMENT FOR THE GRE WRITING ASSESSMENT Donald E. Powers Princeton, New Jersey 08541 October 2000 Computing Reader Agreement for the GRE Writing

More information

the metric of medical education

the metric of medical education the metric of medical education Validity threats: overcoming interference with proposed interpretations of assessment data Steven M Downing 1 & Thomas M Haladyna 2 CONTEXT Factors that interfere with the

More information

Validity, Reliability, and Defensibility of Assessments in Veterinary Education

Validity, Reliability, and Defensibility of Assessments in Veterinary Education Instructional Methods Validity, Reliability, and Defensibility of Assessments in Veterinary Education Kent Hecker g Claudio Violato ABSTRACT In this article, we provide an introduction to and overview

More information

GENERALIZABILITY AND RELIABILITY: APPROACHES FOR THROUGH-COURSE ASSESSMENTS

GENERALIZABILITY AND RELIABILITY: APPROACHES FOR THROUGH-COURSE ASSESSMENTS GENERALIZABILITY AND RELIABILITY: APPROACHES FOR THROUGH-COURSE ASSESSMENTS Michael J. Kolen The University of Iowa March 2011 Commissioned by the Center for K 12 Assessment & Performance Management at

More information

Comparison of a rational and an empirical standard setting procedure for an OSCE

Comparison of a rational and an empirical standard setting procedure for an OSCE Examination for general practice Comparison of a rational and an empirical standard setting procedure for an OSCE Anneke Kramer, 1 Arno Muijtjens, 2 Koos Jansen, 1 Herman Düsman 1, Lisa Tan 1 & Cees van

More information

Analysis of Confidence Rating Pilot Data: Executive Summary for the UKCAT Board

Analysis of Confidence Rating Pilot Data: Executive Summary for the UKCAT Board Analysis of Confidence Rating Pilot Data: Executive Summary for the UKCAT Board Paul Tiffin & Lewis Paton University of York Background Self-confidence may be the best non-cognitive predictor of future

More information

Investigating the Reliability of Classroom Observation Protocols: The Case of PLATO. M. Ken Cor Stanford University School of Education.

Investigating the Reliability of Classroom Observation Protocols: The Case of PLATO. M. Ken Cor Stanford University School of Education. The Reliability of PLATO Running Head: THE RELIABILTY OF PLATO Investigating the Reliability of Classroom Observation Protocols: The Case of PLATO M. Ken Cor Stanford University School of Education April,

More information

The Impact of Statistically Adjusting for Rater Effects on Conditional Standard Errors for Performance Ratings

The Impact of Statistically Adjusting for Rater Effects on Conditional Standard Errors for Performance Ratings 0 The Impact of Statistically Adjusting for Rater Effects on Conditional Standard Errors for Performance Ratings Mark R. Raymond, Polina Harik and Brain E. Clauser National Board of Medical Examiners 1

More information

Reliability, validity, and all that jazz

Reliability, validity, and all that jazz Reliability, validity, and all that jazz Dylan Wiliam King s College London Introduction No measuring instrument is perfect. The most obvious problems relate to reliability. If we use a thermometer to

More information

Reliability, validity, and all that jazz

Dylan Wiliam King's College London Published in Education 3-13, 29 (3) pp. 17-21 (2001) Introduction No measuring instrument is perfect. If we use a thermometer

CHAPTER 3 METHOD AND PROCEDURE

Previous chapter namely Review of the Literature was concerned with the review of the research studies conducted in the field of teacher education, with special reference

Vocabulary. Bias. Blinding. Block. Cluster sample

Bias Blinding Block Census Cluster sample Confounding Control group Convenience sample Designs Experiment Experimental units Factor Level Any systematic failure of a sampling method to represent its population

Necessity of introducing postencounter note describing history and physical examination at clinical performance examination in Korea

ORIGINAL ARTICLE Jonghoon Kim Office of Medical Education, Inha University

A laboratory study on the reliability estimations of the mini-cex

Adv in Health Sci Educ (2013) 18:5-13 DOI 10.1007/s10459-011-9343-y Alberto Alves de Lima Diego Conde Juan Costabel Juan Corso Cees Van

Importance of Good Measurement

Technical Adequacy of Assessments: Validity and Reliability Dr. K. A. Korb University of Jos The conclusions in a study are only as good as the data that is collected. The

ISC- GRADE XI HUMANITIES (2018-19) PSYCHOLOGY. Chapter 2- Methods of Psychology

OUTLINE OF THE CHAPTER (i) Scientific Methods in Psychology - observation, case study, surveys, psychological tests, experimentation

Temporal stability of objective structured clinical exams: a longitudinal study employing item response theory

Baig and Violato BMC Medical Education 2012, 12:121 RESEARCH ARTICLE Open Access Lubna A Baig

MEANING AND PURPOSE. ADULT PEDIATRIC PARENT PROXY PROMIS Item Bank v1.0 Meaning and Purpose PROMIS Short Form v1.0 Meaning and Purpose 4a

A brief guide to the PROMIS Meaning and Purpose instruments: ADULT PEDIATRIC PARENT PROXY PROMIS Item Bank v1.0 Meaning and Purpose PROMIS Short Form v1.0 Meaning and Purpose 4a PROMIS

The Four Habits Coding Scheme: Validation of an instrument to assess clinicians' communication behavior

Patient Education and Counseling 62 (2006) 38-45 www.elsevier.com/locate/pateducou Edward Krupat a,

Author's response to reviews

Title: Effect of a multidisciplinary stress treatment programme on the return to work rate for persons with work-related stress. A non-randomized controlled study from a stress

PLS 506 Mark T. Imperial, Ph.D. Lecture Notes: Reliability & Validity

Measurement & Variables - Initial step is to conceptualize and clarify the concepts embedded in a hypothesis or research question with

UNIVERSITY OF CALGARY. Reliability & Validity of the. Objective Structured Clinical Examination (OSCE): A Meta-Analysis. Ibrahim Al Ghaithi A THESIS

A THESIS SUBMITTED TO THE FACULTY OF GRADUATE STUDIES IN PARTIAL

ORIGINS AND DISCUSSION OF EMERGENETICS RESEARCH

The following document provides background information on the research and development of the Emergenetics Profile instrument. Emergenetics Defined 1. Emergenetics

10 Intraclass Correlations under the Mixed Factorial Design

OBJECTIVE This chapter aims at presenting methods for analyzing intraclass correlation coefficients for reliability studies based on a

Author's response to reviews

Title: The validity of a professional competence tool for physiotherapy students in simulation-based clinical education: a Rasch analysis Authors: Belinda Judd (belinda.judd@sydney.edu.au)

Scaling the quality of clinical audit projects: a pilot study

International Journal for Quality in Health Care 1999; Volume 11, Number 3: pp. 241-249 ANDREW D. MILLARD Scottish Clinical Audit Resource

INTRODUCTION TO ASSESSMENT OPTIONS

DEPRESSION A brief guide to the PROMIS Depression instruments: ADULT ADULT CANCER PEDIATRIC PARENT PROXY PROMIS-Ca Bank v1.0 Depression PROMIS Pediatric Item Bank v2.0 Depressive Symptoms PROMIS Pediatric

CHAPTER 8. Reliability of the assessment of consultation skills of general practice trainees with the MAAS-Global instrument

Marcel E. Reinders Annette H. Blankenstein Harm W.J. van Marwijk Dirk L. Knol

Assessing Agreement Between Methods Of Clinical Measurement

University of York Department of Health Sciences Measuring Health and Disease Based on Bland JM, Altman DG. (1986). Statistical methods for assessing

Professional Development: proposals for assuring the continuing fitness to practise of osteopaths. draft Peer Discussion Review Guidelines

Continuing Professional Development: proposals for assuring the continuing fitness to practise of osteopaths draft Peer Discussion Review Guidelines February January 2015

PAIN INTERFERENCE. ADULT ADULT CANCER PEDIATRIC PARENT PROXY PROMIS-Ca Bank v1.1 Pain Interference PROMIS-Ca Bank v1.0 Pain Interference*

PROMIS Item Bank v1.1 Pain Interference PROMIS Item Bank v1.0 Pain Interference* PROMIS Short Form v1.0 Pain Interference 4a PROMIS Short Form v1.0 Pain Interference 6a PROMIS Short Form v1.0 Pain Interference

Original Article. (This manuscript was submitted on 9 February Following blind peer review, it was accepted for publication on 6 June 2012)

DOI 10.1177/1757975913483331 D. Trouilloud and J. Regnier Therapeutic education among adults with type diabetes: effects of a three-day intervention on perceived competence, self-management

The Use of Standardized Patients to Evaluate Family Medicine Resident Decision Making

Richard Terry, DO; Erik Hiester, DO; Gary D. James, PhD Background and Objectives: This study was intended to assess

ANXIETY A brief guide to the PROMIS Anxiety instruments:

ADULT PEDIATRIC PARENT PROXY PROMIS Pediatric Bank v1.0 Anxiety PROMIS Pediatric Short Form v1.0 - Anxiety 8a PROMIS Item Bank v1.0 Anxiety PROMIS

1 The conceptual underpinnings of statistical power

The importance of statistical power As currently practiced in the social and health sciences, inferential statistics rest solidly upon two pillars: statistical

PsychTests.com advancing psychology and technology

tel 514.745.8272 fax 514.745.6242 CP Normandie PO Box 26067 l Montreal, Quebec l H3M 3E8 contact@psychtests.com Psychometric Report Emotional Intelligence

Examining Factors Affecting Language Performance: A Comparison of Three Measurement Approaches

Pertanika J. Soc. Sci. & Hum. 21 (3): 1149-1162 (2013) SOCIAL SCIENCES & HUMANITIES Journal homepage: http://www.pertanika.upm.edu.my/

IDEA Technical Report No. 20. Updated Technical Manual for the IDEA Feedback System for Administrators. Stephen L. Benton Dan Li

July 2018 Table of Contents: Introduction, Sample Description, Response

QUASI EXPERIMENTAL DESIGN

UNIT 3 QUASI EXPERIMENTAL DESIGN Factorial Design Structure 3. Introduction 3.1 Objectives 3.2 Meaning of Quasi Experimental Design 3.3 Difference Between Quasi Experimental Design and True Experimental

Mantel-Haenszel Procedures for Detecting Differential Item Functioning

A Comparison of Logistic Regression and Mantel-Haenszel Procedures for Detecting Differential Item Functioning H. Jane Rogers, Teachers College, Columbia University Hariharan Swaminathan, University of

Higher Psychology RESEARCH REVISION

The biggest change from the old Higher course (up to 2014) is the possibility of an analysis and evaluation question (8-10 marks) asking you to comment on aspects

Answers to end of chapter questions

Chapter 1 What are the three most important characteristics of QCA as a method of data analysis? QCA is (1) systematic, (2) flexible, and (3) it reduces data. What are

Peer assessment of competence

The Metric of Medical Education John J Norcini Objective This instalment in the series on professional assessment summarises how peers are used in the evaluation process and

Title:Problematic computer gaming, console-gaming, and internet use among adolescents: new measurement tool and association with time use

Author's response to reviews Authors: Mette Rasmussen (mera@niph.dk)

Adaptive Testing With the Multi-Unidimensional Pairwise Preference Model Stephen Stark University of South Florida

and Oleksandr S. Chernyshenko University of Canterbury Presented at the New CAT Models

Patient and professional accuracy of recalled treatment decisions in out-patient consultations

University of Wollongong Research Online Faculty of Health and Behavioural Sciences - Papers (Archive) Faculty of Science, Medicine and Health 2007

Smoking Social Motivations

A brief guide to the PROMIS Smoking Social Motivations instruments: ADULT PROMIS Item Bank v1.0 Smoking Social Motivations for All Smokers PROMIS Item Bank v1.0 Smoking Social

VERDIN MANUSCRIPT REVIEW HISTORY REVISION NOTES FROM AUTHORS (ROUND 2)

Thank you for providing us with the opportunity to revise our paper. We have revised the manuscript according to the editors and

Issues Surrounding the Normalization and Standardisation of Skin Conductance Responses (SCRs).

Jason J. Braithwaite {Behavioural Brain Sciences Centre, School of Psychology, University of Birmingham, UK}

Basic concepts and principles of classical test theory

Jan-Eric Gustafsson What is measurement? Assignment of numbers to aspects of individuals according to some rule. The aspect which is measured must

Assessing the reliability of the borderline regression method as a standard setting procedure for objective structured clinical examination

Educational Research Article Sara Mortaz Hejri 1, Mohammad Jalili

Best Practice Model Communication/Relational Skills in Soliciting the Patient/Family Story Stuart Farber

Once you have set a safe context for the palliative care discussion soliciting the patient's and

Clinical psychology trainees' experiences of supervision

Item Type Article Authors Waldron, Michelle; Byrne, Michael Citation Waldron, M, & Byrne, M. (2014). Clinical psychology trainees' experiences of

PHYSICAL FUNCTION A brief guide to the PROMIS Physical Function instruments:

PROMIS Bank v1.0 - Physical Function* PROMIS Short Form v1.0 Physical Function 4a* PROMIS Short Form v1.0 Physical Function 6a* PROMIS Short Form v1.0 Physical Function 8a* PROMIS Short Form v1.0 Physical

Thriving in College: The Role of Spirituality. Laurie A. Schreiner, Ph.D. Azusa Pacific University

WHAT DESCRIBES COLLEGE STUDENTS ON EACH END OF THIS CONTINUUM? What are they FEELING, DOING, and THINKING?

Family Assessment Device (FAD)

Outcome Measure: Family Assessment Device (FAD); Sensitivity to Change: No; Population: Paediatric and adult; Domain: Family Environment; Type of Measure: Self-report; ICF-Code/s: d7, d9; Description: The Family Assessment

Empowered by Psychometrics The Fundamentals of Psychometrics. Jim Wollack University of Wisconsin Madison

Psycho-what? Psychometrics is the field of study concerned with the measurement of mental and psychological

MCAS Equating Research Report: An Investigation of FCIP-1, FCIP-2, and Stocking and Lord Equating Methods

Lisa A. Keller, Ronald K. Hambleton, Pauline Parker, Jenna Copella University of Massachusetts

FATIGUE. A brief guide to the PROMIS Fatigue instruments:

ADULT ADULT CANCER PEDIATRIC PARENT PROXY PROMIS Ca Bank v1.0 Fatigue PROMIS Pediatric Bank v2.0 Fatigue PROMIS Pediatric Bank v1.0 Fatigue* PROMIS

How Do We Assess Students in the Interpreting Examinations?

Fred S. Wu 1 Newcastle University, United Kingdom The field of assessment in interpreter training is under-researched, though trainers and researchers

Medical student interviewing: a randomized trial of patient-centredness and clinical competence

Family Practice Vol. 20, No. 2 Oxford University Press 2003, all rights reserved. Doi: 10.1093/fampra/20.2.213, available online at www.fampra.oupjournals.org Printed in Great Britain

Test Validity. What is validity? Types of validity IOP 301-T. Content validity. Content-description Criterion-description Construct-identification

What is validity? IOP 301-T Test Validity It is the accuracy of the measure in reflecting the concept it is supposed to measure. In simple English, the validity of a test concerns what the test measures and how well it

ANXIETY. A brief guide to the PROMIS Anxiety instruments:

ADULT ADULT CANCER PEDIATRIC PARENT PROXY PROMIS Bank v1.0 Anxiety PROMIS Short Form v1.0 Anxiety 4a PROMIS Short Form v1.0 Anxiety 6a PROMIS Short

ACTIVE LISTENING. Allison Bickett, MS Carolinas HealthCare System

PB&J experiment What can we learn from this? Role of the listener in communication EXPECTATIONS ASSUMPTIONS Communication in Health Care

Ambiguous Data Result in Ambiguous Conclusions: A Reply to Charles T. Tart

Other Methodology Articles J. E. KENNEDY 1 (Original publication and copyright: Journal of the American Society for Psychical

Technical Specifications

In order to provide summary information across a set of exercises, all tests must employ some form of scoring models. The most familiar of these scoring models is the one typically

Mixed Methods Study Design

Kurt C. Stange, MD, PhD Professor of Family Medicine, Epidemiology & Biostatistics, Oncology and Sociology Case Western Reserve University 1. Approaches 1, 2 a. Qualitative

Exploring rare patient behaviour with sequential analysis: an illustration

HILDE EIDE 1,2, VICENÇ QUERA 3, ARNSTEIN FINSET 2 1 Faculty of Nursing, Oslo University College, Oslo, Norway 2 Department of Behavioural

Catching the Hawks and Doves: A Method for Identifying Extreme Examiners on Objective Structured Clinical Examinations

July 20, 2011 Abstract Performance-based assessments are powerful methods for assessing

Understanding Uncertainty in School League Tables*

FISCAL STUDIES, vol. 32, no. 2, pp. 207-224 (2011) 0143-5671 GEORGE LECKIE and HARVEY GOLDSTEIN Centre for Multilevel Modelling, University of Bristol

Process of a neuropsychological assessment

Test selection Gather information Review of information provided by referrer and if possible review of medical records Interview with client and his/her relative

PSYCHOLOGICAL STRESS EXPERIENCES

A brief guide to the PROMIS Pediatric and Parent Proxy Report Psychological Stress Experiences instruments: PEDIATRIC PROMIS Pediatric Item Bank v1.0 Psychological Stress

SLEEP DISTURBANCE ABOUT SLEEP DISTURBANCE INTRODUCTION TO ASSESSMENT OPTIONS. 6/27/2018 PROMIS Sleep Disturbance Page 1

A brief guide to the PROMIS Sleep Disturbance instruments: ADULT PROMIS Item Bank v1.0 Sleep Disturbance PROMIS Short Form v1.0 Sleep Disturbance 4a PROMIS Short Form v1.0 Sleep Disturbance

Abstract. Authors. Shauna Whiteford, Ksusha Blacklock, Adrienne Perry. Correspondence. Keywords

Brief report: Reliability of the York Measure of Quality of IBI (YMQI) Abstract Volume 18, Number 3, 2012 Authors Shauna Whiteford, Ksusha Blacklock, Adrienne Perry Department of Psychology, York University,

Interviewing, or MI. Bear in mind that this is an introductory training. As

Motivational Interviewing Module 2 Slide Transcript Slide 1 In this module, you will be introduced to the basics of Motivational Interviewing, or MI. Bear in mind that this is an introductory training.

A Coding System to Measure Elements of Shared Decision Making During Psychiatric Visits

Michelle P. Salyers, Ph.D. 1481 W. 10th Street Indianapolis, IN 46202 mpsalyer@iupui.edu

(77, 72, 74, 75, and 81).

CHAPTER 3 METHODOLOGY 3.1 RESEARCH DESIGN A descriptive study using a cross-sectional design was used to establish norms on the JHFT for an ethnically diverse South African population between the ages

Observer OPTION 5 Manual

Measuring shared decision making by assessing recordings or transcripts of encounters from clinical settings. Glyn Elwyn, Stuart W Grande, Paul Barr The Dartmouth Institute for

DISCLOSURE INFORMATION

Disclosure of Relevant Financial Relationships: Presenter has no financial relationships to disclose. Disclosure of Off-Label and/or Investigative Uses: Presenter will not discuss

College Student Self-Assessment Survey (CSSAS)

Development of College Student Self Assessment Survey (CSSAS) The collection and analysis of student achievement indicator data are of primary importance

A short appraisal of recent studies on fluoridation cessation in Alberta, Canada

Two recent papers by McLaren et al 1, 2 report on a set of linked studies conducted in Alberta, to assess the impact of

FUNCTIONAL CONSISTENCY IN THE FACE OF TOPOGRAPHICAL CHANGE IN ARTICULATED THOUGHTS Kennon Kashima

Journal of Rational-Emotive & Cognitive-Behavior Therapy Volume 7, Number 3, Fall 1989 Goddard College

HPS301 Exam Notes- Contents

Week 1 Research Design: What characterises different approaches; Experimental Design; Key Features; Criteria for establishing causality; Validity; Internal Validity; Threats

1. Draft checklist for judging on quality of animal studies (Van der Worp et al., 2010)

Appendix C Quality appraisal tools (QATs) 1. Draft checklist for judging on quality of animal studies (Van der Worp et al., 2010) 2. NICE Checklist for qualitative studies (2012) 3. The Critical Appraisal

Psychological testing

Lecture 11 Mikołaj Winiewski, PhD Marcin Zajenkowski, PhD Strategies for test development and test item considerations The procedures involved in item generation, item selection,

Measuring and Assessing Study Quality

Jeff Valentine, PhD Co-Chair, Campbell Collaboration Training Group & Associate Professor, College of Education and Human Development, University of Louisville Why

Teacher stress: A comparison between casual and permanent primary school teachers with a special focus on coping

Amanda Palmer, Ken Sinclair and Michael Bailey University of Sydney Paper prepared for presentation

Client Care Counseling Critique Assignment Osteoporosis

1. Describe the counselling approach or aspects of different approaches used by the counsellor. Would a different approach have been more appropriate

Week 17 and 21 Comparing two assays and Measurement of Uncertainty Explain tools used to compare the performance of two assays, including

2.4.1.4. Explain tools used to compare the performance of two assays, including 2.4.1.4.1. Linear regression 2.4.1.4.2. Bland-Altman plots

Stewart W Mercer a, Margaret Maxwell b, David Heaney c and Graham CM Watt a. Introduction

Family Practice Vol. 21, No. 6 Oxford University Press 2004, all rights reserved. Printed in Great Britain Doi: 10.1093/fampra/cmh621 Family Practice Advance Access originally published online on 1 November

AN ANALYSIS ON VALIDITY AND RELIABILITY OF TEST ITEMS IN PRE-NATIONAL EXAMINATION TEST SMPN 14 PONTIANAK

Hanny Pradana, Gatot Sutapa, Luwandi Suhartono Sarjana Degree of English Language Education, Teacher

Introduction. 1.1 Facets of Measurement

1 Introduction This chapter introduces the basic idea of many-facet Rasch measurement. Three examples of assessment procedures taken from the field of language testing illustrate its context of application.

Small Group Facilitator's Guide Doctoring 101 The ETHNICS Mnemonic

Schedule and Brief Agenda: I. Briefly introduce the agenda and specific learning objectives (10 min) II. Discussion of Health Beliefs

The degree to which a measure is free from error. (See page 65) Accuracy

Accuracy: The degree to which a measure is free from error. (See page 65) Case studies: A descriptive research method that involves the intensive examination of unusual people or organizations. (See page

The Effect of Review on Student Ability and Test Efficiency for Computerized Adaptive Tests

Mary E. Lunz and Betty A. Bergstrom, American Society of Clinical Pathologists Benjamin D. Wright, University

Chapter 6 Topic 6B Test Bias and Other Controversies. The Question of Test Bias

Test bias is an objective, empirical question, not a matter of personal judgment. Test bias is a technical concept of amenable

The Leeds Reliable Change Indicator

Simple Excel (tm) applications for the analysis of individual patient and group data Stephen Morley and Clare Dowzer University of Leeds Cite as: Morley, S., Dowzer,

The UK FAM items Self-service Training Course

Course originator: Prof Lynne Turner-Stokes DM FRCP Regional Rehabilitation Unit Northwick Park Hospital Watford Road, Harrow, Middlesex. HA1 3UJ Background

EMOTIONAL INTELLIGENCE skills assessment: technical report

OnlineAssessments EISA [ Abridged Derek Mann ] To accompany the Emotional Intelligence Skills Assessment (EISA) by Steven J. Stein, Derek Mann,