4 Diagnostic Tests and Measures of Agreement
- Margaret McCoy
Diagnostic tests may be used for diagnosis of disease or for screening purposes. Some tests are more effective than others, so we need to be able to measure how useful a test is in a given set of circumstances. In practice, of course, we rarely know the true state of the individual, and hence we evaluate the test in comparison with some other, more accurate classification. To simplify terminology here, we shall assume that the reference procedure (often called the "gold standard") indicates the true status of the subject.

4.1 Sensitivity and specificity

To measure the effectiveness of a test, we need to consider two measures:

sensitivity (Se): the probability that the test is positive if the disease is present
specificity (Sp): the probability that the test is negative if the disease is absent

Sensitivity is a measure of how good a test is at correctly identifying those who have the condition. If the test is not sensitive to the condition of interest then we would observe many false negatives. Specificity is a measure of how good a test is at correctly identifying those who do not have the condition. If the test is not specific to the condition of interest then we would observe many false positives. Sometimes the false negative rate ($1 - Se$) and the false positive rate ($1 - Sp$) are given instead.

Setting aside wider issues, we can look at a simple measure of the efficiency of a screening test by comparing the prevalence in the whole population with the prevalence in the screen positive group. If the costs of the gold standard are high, it may not be economically viable to apply the gold standard to the entire population, but it might be cost effective to apply it to a screen positive group.

4.1.1 Confidence Intervals

Sensitivity and specificity are both estimates, so we can find confidence intervals for them. They are both Binomial proportions; however, since they are often close to 1, the normal approximation may not be appropriate. Instead we should use exact methods.

4.2 Tests on a continuous scale

When a test result is expressed on a continuous scale, as in most haematological and biochemical tests, it is often convenient to think in terms of a cut-off point (or frequently both upper and lower cut-off points) beyond which the result will be regarded as being abnormal. This simplifies the test to a binary (positive/negative) result. We need to define a critical value C as the cut-off point beyond which an individual would be referred for further investigation.

Clearly the position of C is crucial. For example, suppose the cut-off point is such that most healthy individuals have values less than C and most diseased individuals have values greater than C; diseased individuals with values less than C are false negatives, missed by the test. Reducing C moves it closer to the mean of the healthy individuals and will reduce the number of false negatives. This improves the sensitivity of the test, but at the expense of its specificity (and hence increases the number of false positives). The converse happens if the cut-off point is moved the other way.

[Table 4: Radiation of pain and diagnosis of gallstones. Columns: Gallstones, Not gallstones, Total. Rows: Pain radiates to shoulder; Pain radiates to other site; Pain does not radiate; Total.]

A receiver operating characteristic (ROC) curve can be useful to examine the trade-off between sensitivity and specificity. We choose a number of different cut-off points, calculate the sensitivity and specificity for each cut-off point and then plot sensitivity against the false positive rate ($1 - Sp$). Tests with ROC curves which go furthest into the top left corner are usually best. The area under the ROC curve estimates the probability that a randomly chosen member of one population will have a value greater than a randomly chosen member of the other population (similar to the Mann-Whitney U test). It can be useful in comparing different tests.

4.3 Positive and negative predictive value

Sensitivity and specificity give only part of the picture. In evaluating a test that might be used for screening purposes, we need a measure of the predictive power of either a positive or a negative test result. The predictive power of a positive test result, the positive predictive value (PPV), is the proportion of those with positive test results who turn out eventually to have the condition. The predictive value of a negative test result, the negative predictive value (NPV), is the proportion of those with negative test results who eventually turn out not to have the condition.

$$PPV = \frac{Se \, p}{Se \, p + (1 - Sp)(1 - p)}$$

$$NPV = \frac{Sp \, (1 - p)}{Sp \, (1 - p) + (1 - Se) \, p}$$

where $p$ is the prevalence.

4.3.1 Positive and negative evidence

When deciding between alternative diagnoses, different items of information contribute more or less weight of evidence for or against particular diagnoses. The presence of guarding in a patient with acute abdominal pain, for example, carries considerable weight in favour of the diagnosis of acute appendicitis.
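The predictive value formulas of Section 4.3 can be checked numerically. The following is a minimal Python sketch, not part of the original notes: the function name `predictive_values`, the values Se = 0.90 and Sp = 0.95, and the three prevalences are all illustrative assumptions.

```python
import math


def predictive_values(se, sp, p):
    """Return (PPV, NPV) from sensitivity se, specificity sp and prevalence p.

    Implements the formulas of Section 4.3:
        PPV = Se*p / (Se*p + (1 - Sp)*(1 - p))
        NPV = Sp*(1 - p) / (Sp*(1 - p) + (1 - Se)*p)
    """
    for name, value in (("se", se), ("sp", sp), ("p", p)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")

    ppv_denominator = se * p + (1 - sp) * (1 - p)   # P(test positive)
    npv_denominator = sp * (1 - p) + (1 - se) * p   # P(test negative)

    # If no positive (or negative) results can occur, the value is undefined.
    ppv = se * p / ppv_denominator if ppv_denominator > 0 else math.nan
    npv = sp * (1 - p) / npv_denominator if npv_denominator > 0 else math.nan
    return ppv, npv


# Hypothetical test (Se = 0.90, Sp = 0.95) at three assumed prevalences.
for prevalence in (0.01, 0.10, 0.50):
    ppv, npv = predictive_values(0.90, 0.95, prevalence)
    print(f"p = {prevalence:.2f}: PPV = {ppv:.3f}, NPV = {npv:.3f}")
```

With Se and Sp held fixed, the PPV falls sharply as the prevalence falls, which is why a test with good sensitivity and specificity can still produce mostly false positives when it is used to screen for a rare condition.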
The items of information that help to exclude a diagnosis may, however, be different from those that help to establish it. These considerations may be important when deciding which of several possible subsequent investigations are likely to be helpful.

4.4 Comparing Two Methods

There is a general class of problems relating to how one device which measures some continuous variable compares with a second device. The particular problem which occurs most frequently in medicine, and which will be discussed here, is whether a (usually cheaper) device can satisfactorily substitute for a device which measures with no appreciable error; this is method comparison. An apparently slightly different problem is whether one method which measures with error can be substituted for another method which also measures with error. This has been dubbed the method conversion problem; it is quite different from the method comparison problem and will not be discussed here.

A common mistake is to calculate Pearson's correlation coefficient on the data, get a result which is very highly statistically significant, and hence declare good agreement. However, this is not suitable because the null hypothesis, that the two measurement scales are unrelated, is not plausible; so showing that the results were unlikely to occur by chance under the null hypothesis of no agreement is not useful. Rather, we need a method which shows how much the results deviate from total agreement.

Plot the data as a scatterplot and add the line of equality (y = x). This will give a quick visualisation of the association between the data.

Perform a paired sample t-test on the data against the null hypothesis of no difference in the pairs of results. The mean difference is an estimate of the bias; a confidence interval will quantify the extent of the plausible bias, while the p-value will show the weight of evidence in favour of a true difference existing.

Plot the difference between the methods against the average. This shows how the size of the bias varies with the true value, as estimated by the average of the two methods. A 95% range, based on the mean and standard deviation of the difference (assuming normality), is often added to the plot; these lines are sometimes called limits of agreement. Pearson's correlation coefficient can be calculated on these data to test the null hypothesis that the difference and the mean are unrelated; that is, that the size of the bias is unrelated to the true value.
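The numerical side of the steps above (bias, paired t statistic, limits of agreement, and the correlation between difference and average) can be sketched in Python as follows. This is an illustration under stated assumptions, not the notes' own code: the function names `pearson` and `limits_of_agreement`, the multiplier 1.96 for the 95% range, and the two short series of readings are all hypothetical. Plotting is omitted, and the p-value and confidence interval are left out because they need t-distribution quantiles that the standard library does not provide.

```python
import math
import statistics


def pearson(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_u, mean_v = statistics.mean(u), statistics.mean(v)
    sxy = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    sxx = sum((a - mean_u) ** 2 for a in u)
    syy = sum((b - mean_v) ** 2 for b in v)
    if sxx == 0 or syy == 0:
        return math.nan  # undefined when either sequence is constant
    return sxy / math.sqrt(sxx * syy)


def limits_of_agreement(x, y, z=1.96):
    """Summarise agreement between paired measurements x and y.

    Differences are taken as y - x; z = 1.96 gives the usual 95% range
    under the normality assumption mentioned in the text.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired, with at least two pairs")
    diffs = [b - a for a, b in zip(x, y)]
    averages = [(a + b) / 2 for a, b in zip(x, y)]
    bias = statistics.mean(diffs)      # estimate of the bias
    sd = statistics.stdev(diffs)       # sample SD of the differences
    t_stat = bias / (sd / math.sqrt(len(diffs))) if sd > 0 else math.nan
    return {
        "bias": bias,
        "sd": sd,
        "lower": bias - z * sd,        # lower limit of agreement
        "upper": bias + z * sd,        # upper limit of agreement
        "t": t_stat,                   # paired t statistic, n - 1 d.f.
        "r_diff_vs_average": pearson(diffs, averages),
    }


# Hypothetical paired readings from a reference device and a cheaper device.
reference = [10.0, 12.0, 14.0, 16.0, 18.0]
new_device = [10.5, 12.4, 14.6, 16.3, 18.7]
result = limits_of_agreement(reference, new_device)
for key, value in result.items():
    print(f"{key}: {value:.4f}")
```

A correlation between difference and average that is far from zero would suggest that the bias changes with the size of the measurement, in which case a single pair of limits of agreement may be misleading.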
4.5 Measures of Agreement

Suppose two observers are asked to rate the same subjects for the presence or absence of a disease. Cohen's kappa coefficient can be used to assess the agreement between the two raters.

                      Rater 2
Rater 1      Present     Absent      Total
Present      $n_{11}$    $n_{10}$    $n_{1+}$
Absent       $n_{01}$    $n_{00}$    $n_{0+}$
Total        $n_{+1}$    $n_{+0}$    $n_{++}$

Define $I_o$ as the observed proportion of agreement and $I_e$ as the proportion of agreement expected due to chance:

$$I_o = \frac{n_{11} + n_{00}}{n_{++}}$$

$$I_e = \frac{n_{+1} n_{1+} + n_{+0} n_{0+}}{n_{++}^2}$$

Then kappa, $\kappa$, is the excess agreement expressed as a fraction of the maximum possible excess:

$$\kappa = \frac{I_o - I_e}{1 - I_e}$$
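The definitions of $I_o$, $I_e$ and $\kappa$ above translate directly into a few lines of Python. This is a minimal sketch: the function name `cohens_kappa` and the counts in the example are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def cohens_kappa(n11, n10, n01, n00):
    """Cohen's kappa for a 2x2 agreement table.

    Cell n_ij counts subjects rated i by Rater 1 and j by Rater 2
    (1 = present, 0 = absent), matching the table in Section 4.5.
    """
    total = n11 + n10 + n01 + n00            # n_{++}
    if total == 0:
        raise ValueError("the table must contain at least one subject")

    rater1_present = n11 + n10               # n_{1+}
    rater1_absent = n01 + n00                # n_{0+}
    rater2_present = n11 + n01               # n_{+1}
    rater2_absent = n10 + n00                # n_{+0}

    i_o = (n11 + n00) / total                # observed agreement
    i_e = (rater2_present * rater1_present
           + rater2_absent * rater1_absent) / total ** 2  # chance agreement
    if i_e == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (i_o - i_e) / (1 - i_e)


# Hypothetical counts: the raters agree on 85 of 100 subjects.
# I_o = 0.85, I_e = (45*50 + 55*50) / 100**2 = 0.50
kappa = cohens_kappa(n11=40, n10=10, n01=5, n00=45)
print(f"kappa = {kappa:.2f}")
```

In this example the raters agree on 85% of subjects, but half of all subjects would be expected to agree by chance alone, so only the remaining excess counts towards kappa.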
If there is complete agreement, $\kappa = 1$; if observed agreement is equal to that expected by chance, $\kappa = 0$; if observed agreement is greater than expected by chance, $\kappa > 0$.

An important assumption underlying the use of the kappa coefficient is that errors associated with the two sets of ratings are independent. This requires the subjects to be independent and Rater 1's ratings to be independent of Rater 2's. The kappa coefficient, therefore, is not appropriate for a situation in which one observer is required to either confirm or disconfirm a known previous rating from another observer.

When the margin totals are not the same we may use $I_{max} - I_e$ as the denominator, where $I_{max}$ is the maximum possible agreement, keeping the margins fixed. Another alternative is to use weighted observations (weighted kappa), so that greater emphasis is attached to large differences between ratings than to small differences.

4.6 Measurement Scales

4.6.1 Validity

A valid scale measures what it intends to measure. Validity can be judged in several ways: the scale should look as if it makes sense (face validity); all the items should be relevant and all aspects of the concept being measured should be included (content validity); the scale should be able to predict outcome (predictive validity); the scale should produce similar results to an established scale measuring the same concept, and different results from scales measuring a different concept (convergent and divergent validity); finally, a scale should be able to distinguish groups of patients who, a priori, are deemed to be different (discriminant validity).

4.6.2 Sensitivity and specificity

When a scale is used to categorise people, it should be capable of categorising them accurately. For example, it would be most useful to detect patients with previously unrecognised problems or those with problems that are amenable to intervention. When screening, sensitivity may be more important than specificity; opportunities for clarifying the status of false positive patients will arise, but the false negative patient is lost to further scrutiny.

4.6.3 Reliability

A reliable scale produces results which can be replicated with different observers (inter-observer reliability), when repeated (test-retest reliability), when using different sources of information and when administered by different means. Simple correlation between repeat tests is not adequate for the assessment of reliability; it is more appropriate to analyse the differences between scores to see if they are larger than might be expected by chance.

4.6.4 Responsiveness to change

A scale should be capable of detecting change due to interventions or over time at all levels of the scale. Floor and ceiling effects present particular difficulties; a scale may not be able to detect meaningful differences between subjects who score at the bottom or the top of the scale, respectively.

4.6.5 Format and language

A scale should be well-designed, and in an appropriate format and language for the subjects and users of the scale, who may have differing knowledge and skills.
More information2 Critical thinking guidelines
What makes psychological research scientific? Precision How psychologists do research? Skepticism Reliance on empirical evidence Willingness to make risky predictions Openness Precision Begin with a Theory
More informationAppendix G: Methodology checklist: the QUADAS tool for studies of diagnostic test accuracy 1
Appendix G: Methodology checklist: the QUADAS tool for studies of diagnostic test accuracy 1 Study identification Including author, title, reference, year of publication Guideline topic: Checklist completed
More informationGastric ulcers at endoscopy: brush, biopsy, or both Sadowski D C, Rabeneck L
Gastric ulcers at endoscopy: brush, biopsy, or both Sadowski D C, Rabeneck L Record Status This is a critical abstract of an economic evaluation that meets the criteria for inclusion on NHS EED. Each abstract
More informationLEVEL ONE MODULE EXAM PART TWO [Reliability Coefficients CAPs & CATs Patient Reported Outcomes Assessments Disablement Model]
1. Which Model for intraclass correlation coefficients is used when the raters represent the only raters of interest for the reliability study? A. 1 B. 2 C. 3 D. 4 2. The form for intraclass correlation
More informationSupporting Information: Cognitive capacities for cooking in chimpanzees Felix Warneken & Alexandra G. Rosati
Supporting Information: Cognitive capacities for cooking in chimpanzees Felix Warneken & Alexandra G. Rosati Subject Information Name Sex Age Testing Year 1 Testing Year 2 1 2 3 4 5a 5b 5c 6a 6b 7 8 9
More informationA Cross-sectional, Randomized, Non-interventional Methods Study to Compare Three Methods of Assessing Suicidality in Psychiatric Inpatients
A Cross-sectional, Randomized, Non-interventional Methods Study to Compare Three Methods of Assessing Suicidality in Psychiatric Inpatients Eric A. Youngstrom, Ph.D., Ahmad Hameed, M.D., Michael Mitchell,
More informationReview: Conditional Probability. Using tests to improve decisions: Cutting scores & base rates
Review: Conditional Probability Using tests to improve decisions: & base rates Conditional probabilities arise when the probability of one thing [A] depends on the probability of something else [B] In
More informationProbability Models for Sampling
Probability Models for Sampling Chapter 18 May 24, 2013 Sampling Variability in One Act Probability Histogram for ˆp Act 1 A health study is based on a representative cross section of 6,672 Americans age
More informationResearch Questions and Survey Development
Research Questions and Survey Development R. Eric Heidel, PhD Associate Professor of Biostatistics Department of Surgery University of Tennessee Graduate School of Medicine Research Questions 1 Research
More informationBiostatistics 2 - Correlation and Risk
BROUGHT TO YOU BY Biostatistics 2 - Correlation and Risk Developed by Pfizer January 2018 This learning module is intended for UK healthcare professionals only. PP-GEP-GBR-0957 Date of preparation Jan
More informationVariables in Research. What We Will Cover in This Section. What Does Variable Mean?
Variables in Research 9/20/2005 P767 Variables in Research 1 What We Will Cover in This Section Nature of variables. Measuring variables. Reliability. Validity. Measurement Modes. Issues. 9/20/2005 P767
More informationTitle:Validity and Reliability of Arm Abduction Angle Measured on Smartphone: a cross-sectional study
Author's response to reviews Authors: Antonio I Cuesta-Vargas (acuesta.var@gmail.com) Cristina Roldan-Jimenez (CRISTINA.ROLDAN005@gmail.com) Version:3Date:27 January 2016 Author's response to reviews:
More informationEvaluation of CBT for increasing threat detection performance in X-ray screening
Evaluation of CBT for increasing threat detection performance in X-ray screening A. Schwaninger & F. Hofer Department of Psychology, University of Zurich, Switzerland Abstract The relevance of aviation
More informationA Spreadsheet for Deriving a Confidence Interval, Mechanistic Inference and Clinical Inference from a P Value
SPORTSCIENCE Perspectives / Research Resources A Spreadsheet for Deriving a Confidence Interval, Mechanistic Inference and Clinical Inference from a P Value Will G Hopkins sportsci.org Sportscience 11,
More informationLecture Outline Biost 517 Applied Biostatistics I. Statistical Goals of Studies Role of Statistical Inference
Lecture Outline Biost 517 Applied Biostatistics I Scott S. Emerson, M.D., Ph.D. Professor of Biostatistics University of Washington Statistical Inference Role of Statistical Inference Hierarchy of Experimental
More informationGATE CAT Diagnostic Test Accuracy Studies
GATE: a Graphic Approach To Evidence based practice updates from previous version in red Critically Appraised Topic (CAT): Applying the 5 steps of Evidence Based Practice Using evidence from Assessed by:
More informationChapter 11: Experiments and Observational Studies p 318
Chapter 11: Experiments and Observational Studies p 318 Observation vs Experiment An observational study observes individuals and measures variables of interest but does not attempt to influence the response.
More information2.75: 84% 2.5: 80% 2.25: 78% 2: 74% 1.75: 70% 1.5: 66% 1.25: 64% 1.0: 60% 0.5: 50% 0.25: 25% 0: 0%
Capstone Test (will consist of FOUR quizzes and the FINAL test grade will be an average of the four quizzes). Capstone #1: Review of Chapters 1-3 Capstone #2: Review of Chapter 4 Capstone #3: Review of
More informationCollecting & Making Sense of
Collecting & Making Sense of Quantitative Data Deborah Eldredge, PhD, RN Director, Quality, Research & Magnet Recognition i Oregon Health & Science University Margo A. Halm, RN, PhD, ACNS-BC, FAHA Director,
More informationMantel-Haenszel Procedures for Detecting Differential Item Functioning
A Comparison of Logistic Regression and Mantel-Haenszel Procedures for Detecting Differential Item Functioning H. Jane Rogers, Teachers College, Columbia University Hariharan Swaminathan, University of
More informationWeek 17 and 21 Comparing two assays and Measurement of Uncertainty Explain tools used to compare the performance of two assays, including
Week 17 and 21 Comparing two assays and Measurement of Uncertainty 2.4.1.4. Explain tools used to compare the performance of two assays, including 2.4.1.4.1. Linear regression 2.4.1.4.2. Bland-Altman plots
More information