Sample Size Considerations. Todd Alonzo, PhD


1 Sample Size Considerations
Todd Alonzo, PhD

2 Thanks to Nancy Obuchowski for the original version of this presentation.

3 Why do Sample Size Calculations?
1. To minimize the risk of drawing the wrong conclusion from your study
2. To plan for your study's needs
3. To determine if a study is feasible

4 Outline
- Four steps in the process of determining sample size
- Two example studies
  - Measurement
  - Test hypothesis
- Four steps for each example

5 The process of sample size calculation takes place after you have specified your study's primary objective.
e.g., To estimate and compare the breast-level sensitivity and specificity of board-certified mammographers interpreting breast MRI vs. mammograms of high-risk women.

6 State statistical hypothesis and/or what you want to estimate

7 State statistical hypothesis and/or what you want to estimate
Plan the statistical analysis

8 State statistical hypothesis and/or what you want to estimate
Plan the statistical analysis
Make assumptions about your study data

9 State statistical hypothesis and/or what you want to estimate
Plan the statistical analysis
Make assumptions about your study data
Calculate sample size

10 Two Examples
1. Study to estimate sensitivity of new modality.
   - Simple measurement with confidence interval
2. Study to compare ROC areas of 2 modalities.
   - Test hypothesis

11 Example 1: Study to estimate sensitivity of new modality
- Suppose you're planning a one-reader study
- Reader will score each patient using a 5-point scale
- Define positive test result as score >3
- Retrospective design

12 Estimate sensitivity and construct 95% CI
1. Estimate sensitivity as the proportion of diseased patients scored >3.
2. Retrospective design
Calculate N
Assumptions??

13 Conversation with statistician
Radiologist: I want to estimate sensitivity of my new modality.
Statistician: How sensitive do you expect it to be?
Radiologist: Maybe between 0.7 and 0.9.
Statistician: Ok. We should construct a CI. How precisely do you want to estimate sensitivity?
Radiologist: To within 0.10 would be ok.

14 Estimate sensitivity and construct 95% CI
1. Estimate sensitivity as the proportion of diseased patients scored >3.
2. Retrospective design
Calculate N
Assumptions:
1. Sensitivity: between 0.7 and 0.9
2. CI precision of 0.10

15 Sample Size Formula for Constructing CI
$$N = \frac{(z_{\alpha/2})^2 \,\mathrm{Var}}{L^2}$$
Var = variance
L = precision

16 Some Useful z-values
- For 95% confidence level: $z_{\alpha/2} = 1.96$
- For 5% type I error rate: $z_{\alpha/2} = 1.96$
- For 80% power: $z_{\beta} = 0.84$
- For 90% power: $z_{\beta} = 1.28$
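
The z-values on this slide are quantiles of the standard normal distribution. As a minimal sketch (the helper name `z_value` is illustrative and not part of the slides), they can be reproduced with Python's standard library:

```python
from statistics import NormalDist


def z_value(prob: float) -> float:
    """Return the standard normal quantile for a cumulative probability."""
    return NormalDist().inv_cdf(prob)


# Two-sided 95% confidence level (equivalently a 5% type I error rate):
# alpha/2 = 0.025 sits in each tail, so look up the 0.975 quantile.
z_alpha_2 = z_value(1 - 0.05 / 2)

# Power = 1 - beta, so z_beta is the quantile at the desired power.
z_beta_80 = z_value(0.80)
z_beta_90 = z_value(0.90)

print(round(z_alpha_2, 2), round(z_beta_80, 2), round(z_beta_90, 2))
```

Rounded to two decimals these agree with the tabulated 1.96 and 0.84, and give 1.28 for 90% power.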

17 Example 1
Variability for a proportion: $p(1-p)$
Table: p | Variability (values not recovered)
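
Because the table values did not survive, the following sketch tabulates the variability term $p(1-p)$ for a few values of p. The chosen values of p are illustrative assumptions, not the ones from the original slide.

```python
def proportion_variability(p: float) -> float:
    """Variability term p(1 - p) for a proportion such as sensitivity."""
    return p * (1 - p)


# Illustrative values of p; the slide's own table entries are unknown.
for p in (0.5, 0.6, 0.7, 0.8, 0.9):
    print(f"p = {p:.1f}  variability = {proportion_variability(p):.2f}")
```

The term is largest at p = 0.5 and shrinks as p moves toward 1, so within an expected range of 0.7 to 0.9 the value p = 0.7 is the cautious choice for planning.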

18 Example 1
$$N = \frac{(1.96)^2\,(0.21)}{(0.10)^2} = 80.7$$
So we need 81 patients with disease.
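
A short sketch of this calculation follows. It assumes a planning sensitivity of 0.7 (the low end of the expected 0.7 to 0.9 range, which reproduces the slide's 80.7); the function name `ci_sample_size` is hypothetical.

```python
import math


def ci_sample_size(var: float, precision: float, z: float = 1.96) -> float:
    """N = z^2 * Var / L^2 for a confidence interval of half-width L."""
    return z ** 2 * var / precision ** 2


p = 0.7  # assumed sensitivity used for planning
n_exact = ci_sample_size(var=p * (1 - p), precision=0.10)
n_dis = math.ceil(n_exact)  # always round up to whole patients

print(round(n_exact, 1), n_dis)
```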

19 Impact of Study Design on N
Retrospective Study: Identify the number of diseased subjects needed for the study based on the sample size calculation. Call this $N_{DIS} = 81$.

20 Impact of Study Design on N
Retrospective Study: Identify the number of diseased subjects needed for the study based on the sample size calculation. Call this $N_{DIS} = 81$.
Prospective Study: Must consider the prevalence rate. Say the prevalence rate is 10%.
Total N = $N_{DIS}$ / Prevalence = 80.7 / 0.10 = 807

21 Number of Subjects Required
Table: Half-Width of CI | Retrospective ($N_{dis}$) | Prospective ($N_{total}$) (values not recovered)
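
The relationship between CI half-width and the number of subjects can be sketched as below. This assumes a planning sensitivity of 0.7 and a prevalence of 10%, as in Example 1; the half-widths listed are illustrative and are not the entries of the original table.

```python
import math

Z_95 = 1.96
SENSITIVITY = 0.7   # assumed planning value, as in Example 1
PREVALENCE = 0.10   # assumed, as in the prospective-design slide


def n_dis_exact(half_width: float) -> float:
    """Unrounded number of diseased subjects for a given CI half-width."""
    return Z_95 ** 2 * SENSITIVITY * (1 - SENSITIVITY) / half_width ** 2


def n_retrospective(half_width: float) -> int:
    """Diseased subjects needed in a retrospective design."""
    return math.ceil(n_dis_exact(half_width))


def n_prospective(half_width: float) -> int:
    """Total subjects needed in a prospective design (N_dis / prevalence)."""
    return math.ceil(n_dis_exact(half_width) / PREVALENCE)


for half_width in (0.05, 0.10, 0.15):  # illustrative half-widths
    print(half_width, n_retrospective(half_width), n_prospective(half_width))
```

Halving the half-width roughly quadruples the required sample size, and the prospective total scales by 1/prevalence on top of that.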

22 Example 2: Study to Compare ROC Areas of Two Modalities
- Suppose you want to compare a new modality with a standard modality
- Readers will assign a confidence score (0-100)
- Retrospective design

23 Testing Hypotheses
- Superiority
- Non-Inferiority
- Equivalence

24 Alternative Hypothesis
- Superiority: One modality is better than the other
- Non-Inferiority: New modality is at least as good as the standard modality
- Equivalence: New modality has the same performance as the standard modality

25 Statistical Hypotheses
$H_0$ is the null hypothesis and $H_A$ is the alternative hypothesis, stated separately for:
- Superiority
- Non-Inferiority
- Equivalence

26 Statistical Hypotheses
- Superiority: $H_0$: $\mathrm{AUC}_{new} = \mathrm{AUC}_{old}$; $H_A$: $\mathrm{AUC}_{new} \neq \mathrm{AUC}_{old}$
- Non-Inferiority
- Equivalence

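27 Statistical Hypotheses
- Superiority: $H_0$: $\mathrm{AUC}_{new} = \mathrm{AUC}_{old}$; $H_A$: $\mathrm{AUC}_{new} \neq \mathrm{AUC}_{old}$
- Non-Inferiority: $H_0$: $\mathrm{AUC}_{new} + \delta < \mathrm{AUC}_{old}$; $H_A$: $\mathrm{AUC}_{new} + \delta > \mathrm{AUC}_{old}$
- Equivalence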

28 Non-inferiority criterion
If the new test is easier to use than the old test (e.g., fewer complications, less expensive, less radiation, less time, fewer personnel), it doesn't have to be quite as accurate.
Figure: non-inferiority margin $\delta = 0.05$ marked on the AUC axis, with 0.75 and $\mathrm{AUC}_{old}$ labeled.

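29 Statistical Hypotheses
- Superiority
- Non-Inferiority
- Equivalence: $H_0$: $(\mathrm{AUC}_{new} - \mathrm{AUC}_{old}) < \delta_L$ or $(\mathrm{AUC}_{new} - \mathrm{AUC}_{old}) > \delta_U$; $H_A$: $\delta_L < (\mathrm{AUC}_{new} - \mathrm{AUC}_{old}) < \delta_U$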

30 Equivalence criteria
The difference between the new and old test must be between the lower and the upper equivalence bounds.
Figure: bounds $\delta_L$ and $\delta_U$ marked on the axis of $(\mathrm{AUC}_{new} - \mathrm{AUC}_{old})$.

31 $H_0$: $\mathrm{AUC}_{new} + \delta < \mathrm{AUC}_{old}$
$H_A$: $\mathrm{AUC}_{new} + \delta > \mathrm{AUC}_{old}$
1. Retrospective
2. Equal # diseased & nondiseased subjects
3. Paired subjects
Calculate sample size for 1-reader study
Assumptions??

32 Conversation with statistician
Radiologist: I want to test if my new modality is non-inferior to the old modality.
Statistician: How accurate is the old test?
Radiologist: Maybe about AUC=0.8.
Statistician: How accurate is the new test?
Radiologist: I think it is about the same.
Statistician: Will the same subjects undergo the new and old test?
Radiologist: Yes. A paired subject design.
Statistician: Ok. I'll assume moderate correlation.

33 Accuracy of New Test
- If $\mathrm{AUC}_{new} \ggg \mathrm{AUC}_{old}$: we don't need a large N
- If $\mathrm{AUC}_{new} < \mathrm{AUC}_{old}$: we need a much larger N

34 $H_0$: $\mathrm{AUC}_{new} + \delta < \mathrm{AUC}_{old}$
$H_A$: $\mathrm{AUC}_{new} + \delta > \mathrm{AUC}_{old}$
1. Retrospective
2. Equal # diseased & nondiseased subjects
3. Paired subjects
Calculate sample size for 1-reader study
Assumptions:
1. Old test average AUC = 0.8
2. New test average AUC = 0.8
3. $\delta = 0.05$
4. $r_{subject} = 0.5$

35 Sample Size Formula for One-Reader Study
$$N_{dis} = \frac{(z_{\alpha/2} + z_{\beta})^2 \left[\,2\left\{\mathrm{Var}_{subject}\,(1 - r_{subject})\right\}\right]}{(\theta_{new} - \theta_{old} + \delta)^2}$$

36 Type I and II errors
                     | Truth: Null is true (inferior) | Truth: Alternative is true (non-inferior)
Fail to reject Null  | Correct conclusion             | Incorrect conclusion
Reject Null          | Incorrect conclusion           | Correct conclusion

37 Type I and II errors
                     | Truth: Null is true | Truth: Alternative is true
Fail to reject Null  | Correct conclusion  | Type II (β) error
Reject Null          | Type I (α) error    | POWER

38 Some Useful z-values
- For 95% confidence level: $z_{\alpha/2} = 1.96$
- For 5% type I error rate: $z_{\alpha/2} = 1.96$
- For 80% power: $z_{\beta} = 0.84$
- For 90% power: $z_{\beta} = 1.28$

39 What is the variability for AUC?
- Obtain an estimate from a pilot study, or
- Use $\mathrm{Var}_{subject} = \mathrm{AUC}\,(1 - \mathrm{AUC})$ (conservative)

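40 Sample Size Formula for One-Reader Study
$$N_{dis} = \frac{(z_{\alpha/2} + z_{\beta})^2 \left[\,2\left\{\mathrm{Var}_{subject}\,(1 - r_{subject})\right\}\right]}{(\theta_{new} - \theta_{old} + \delta)^2}$$
With $r_{subject} = 0.5$ and $\theta_{new} - \theta_{old} + \delta = 0.05$:
$$N_{dis} = \frac{(z_{\alpha/2} + z_{\beta})^2 \left[\,2\left\{\mathrm{Var}_{subject}\,(1 - 0.5)\right\}\right]}{(0.05)^2}$$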
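
The numeric inputs on this slide were lost in transcription, so the sketch below plugs in assumed values: old and new AUC both 0.8 (as in the conversation with the statistician), δ = 0.05, r_subject = 0.5, the conservative Var_subject = AUC(1 - AUC), 5% type I error, and 80% power. The function name is hypothetical. The second part illustrates how the required N changes with the assumed accuracy of the new test.

```python
import math


def n_dis_one_reader(auc_new: float, auc_old: float, delta: float,
                     var_subject: float, r_subject: float,
                     z_alpha: float = 1.96, z_beta: float = 0.84) -> float:
    """Unrounded N_dis from the one-reader formula on the slide."""
    numerator = (z_alpha + z_beta) ** 2 * 2 * var_subject * (1 - r_subject)
    return numerator / (auc_new - auc_old + delta) ** 2


AUC_OLD = 0.8                          # assumed accuracy of the old test
VAR_SUBJECT = AUC_OLD * (1 - AUC_OLD)  # conservative estimate, 0.16

# Base case: new test assumed as accurate as the old one.
n_same = math.ceil(n_dis_one_reader(0.8, AUC_OLD, 0.05, VAR_SUBJECT, 0.5))

# If the new test is better, the required N drops sharply;
# if it is slightly worse (but within the margin), N grows.
n_better = math.ceil(n_dis_one_reader(0.85, AUC_OLD, 0.05, VAR_SUBJECT, 0.5))
n_worse = math.ceil(n_dis_one_reader(0.78, AUC_OLD, 0.05, VAR_SUBJECT, 0.5))

print(n_same, n_better, n_worse)
```

Under these assumptions the base case needs about 500 diseased subjects, and with an equal number of nondiseased subjects roughly 1000 in total.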

41 Now Consider Multi-reader Design
Will the same readers interpret images from both modalities?
- paired-reader design: $r_{reader} > 0$
- unpaired-reader design: $r_{reader} = 0$

42 $H_0$: $\mathrm{AUC}_{new} + \delta < \mathrm{AUC}_{old}$
$H_A$: $\mathrm{AUC}_{new} + \delta > \mathrm{AUC}_{old}$
1. Retrospective
2. Equal # diseased & nondiseased subjects
3. Paired subjects
4. **Paired readers
Calculate sample size for MRMC study
Assumptions:
1. Old test average AUC = 0.8
2. New test average AUC = 0.8
3. $\delta = 0.05$
4. $r_{subject} = 0.5$
**Additional assumptions

43 Conversation with statistician
Radiologist: N=1000 is too large; I want to perform a MRMC study.
Statistician: Ok. What might be the range of the readers' AUCs?
Radiologist: Their AUCs will vary quite a bit. Maybe between 0.7 and 0.9.
Statistician: Will the readers be paired?
Radiologist: Yes, the same readers will interpret images from both modalities.
Statistician: Do you think readers will perform similarly on the two tests?
Radiologist: Yes, the readers who perform best on the old test are likely to perform best on the new test.
Statistician: Ok. I'll assume high correlation.

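44 $H_0$: $\mathrm{AUC}_{new} + \delta < \mathrm{AUC}_{old}$
$H_A$: $\mathrm{AUC}_{new} + \delta > \mathrm{AUC}_{old}$
1. Retrospective
2. Equal # diseased & nondiseased subjects
3. Paired subjects
4. Paired readers
Calculate sample size for MRMC study
Assumptions:
1. Old test average AUC = 0.8
2. New test average AUC = 0.8
3. $\delta = 0.05$
4. Range of AUC across readers: 0.7 to 0.9
5. $r_{subject} = 0.5$
6. $r_{reader}$: high correlation assumed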

45 Sample Size Formula for MRMC Study
$$\lambda = \frac{R\,(\theta_{new} - \theta_{old} + \delta)^2}{2\left\{\mathrm{Var}_{reader}\,(1 - r_{reader}) + \dfrac{\mathrm{Var}_{subject}}{N_{DIS}}\,(1 - r_{subject})\right\}}$$
R = # readers

46 Sample Size for MRMC Study
λ is the noncentrality parameter of an F distribution. You can use a math package (e.g., MATLAB) to calculate power if you know λ, R, and the type I error rate.

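47 Sources of Variability in MRMC Studies
Subject variability:
- Obtain an estimate from a pilot study, or
- Use $\mathrm{Var}_{subject} = \mathrm{AUC}\,(1 - \mathrm{AUC})$ (conservative)
Reader variability:
- Obtain an estimate from a pilot study, or
- Estimate the range of AUCs across readers; then $\mathrm{Var}_{reader} = \left[\mathrm{range}/4\right]^2$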
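
The sketch below evaluates the noncentrality parameter λ from the MRMC formula using the variance rules on this slide. Several inputs are assumptions chosen only for illustration: 6 readers, 100 diseased subjects, and r_reader = 0.8 as a stand-in for "high correlation". The sketch stops at λ, because converting λ to power requires a noncentral F distribution whose degrees of freedom are not given in the slides.

```python
def mrmc_noncentrality(n_readers: int, n_dis: int,
                       auc_new: float, auc_old: float, delta: float,
                       var_reader: float, r_reader: float,
                       var_subject: float, r_subject: float) -> float:
    """Noncentrality parameter lambda from the MRMC formula."""
    effect = (auc_new - auc_old + delta) ** 2
    variability = 2 * (var_reader * (1 - r_reader)
                       + var_subject / n_dis * (1 - r_subject))
    return n_readers * effect / variability


auc_range = 0.9 - 0.7                # readers' AUCs expected between 0.7 and 0.9
var_reader = (auc_range / 4) ** 2    # range/4 rule from the slide
var_subject = 0.8 * (1 - 0.8)        # conservative AUC(1 - AUC), AUC assumed 0.8

lam = mrmc_noncentrality(
    n_readers=6, n_dis=100,          # hypothetical study size
    auc_new=0.8, auc_old=0.8, delta=0.05,
    var_reader=var_reader, r_reader=0.8,   # 0.8 is an assumed "high" correlation
    var_subject=var_subject, r_subject=0.5,
)
print(round(lam, 2))
```

Increasing either the number of readers or the number of diseased subjects raises λ, and therefore power, which is why an MRMC design can get by with far fewer subjects than the one-reader design.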

49 Unpaired Designs
If you plan an unpaired design, then set correlations to zero:
- Unpaired readers: $r_{reader} = 0$
- Unpaired subjects: $r_{subject} = 0$
N = # subjects per test
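
To illustrate the cost of an unpaired-subject design, the sketch below applies the one-reader formula with r_subject = 0.5 (paired) and r_subject = 0 (unpaired). The inputs are assumptions carried over from Example 2: Var_subject = 0.16, an effect term of 0.05, 5% type I error, and 80% power.

```python
import math


def n_dis_per_test(var_subject: float, r_subject: float, effect: float,
                   z_alpha: float = 1.96, z_beta: float = 0.84) -> float:
    """One-reader formula; `effect` is theta_new - theta_old + delta."""
    return (z_alpha + z_beta) ** 2 * 2 * var_subject * (1 - r_subject) / effect ** 2


paired = math.ceil(n_dis_per_test(0.16, r_subject=0.5, effect=0.05))
unpaired = math.ceil(n_dis_per_test(0.16, r_subject=0.0, effect=0.05))

print(paired, unpaired)
```

With these assumed inputs, dropping the pairing doubles the number of diseased subjects, and that number is then needed for each test separately.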

50 Unpaired Designs
If subjects are unpaired and not randomized to the two tests, then you must account for differences in patient mix between the two tests. Use statistical modeling to do this. This increases N.

51 Clustered data (common in imaging studies)
Clustered data = multiple observations/subject
Examples:
- left and right breast / subject
- six colon segments / subject
- unknown # lesions / subject

52 Clustered data
In data analysis and in sample size estimation, do not assume independence!

53 Sample size estimation with clustered data
A conservative approach is to assume only one observation / subject.

54 Sample size estimation with clustered data
Alternatively, take clustered data into account:
$$\text{design effect} = 1 + (s - 1)\, r_{cluster}$$
s = # observations / subject
$r_{cluster}$ = correlation between observations in cluster

55 Example with clustered data
Suppose readers will score each breast in a mammography study. Suppose that with one observation / subject you calculated that you need 100 subjects.

56 Example with clustered data
Calculate the design effect:
$$\text{design effect} = 1 + (s - 1)\, r_{cluster} = 1 + (2 - 1) \times 0.5 = 1.5$$

57 Example with clustered data
design effect = 1.5
If we needed 100 subjects (with only one observation per subject), then 100 × 1.5 = 150 breasts are needed, so 75 subjects are needed (150 breasts).
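
The clustered-data example can be reproduced with a few lines. The function name `design_effect` is illustrative; the inputs (two breasts per subject, r_cluster = 0.5, 100 subjects under one observation per subject) are those of the example.

```python
import math


def design_effect(obs_per_subject: int, r_cluster: float) -> float:
    """Design effect = 1 + (s - 1) * r_cluster."""
    return 1 + (obs_per_subject - 1) * r_cluster


n_one_obs = 100   # subjects needed with one observation per subject
s = 2             # two breasts per subject
deff = design_effect(s, r_cluster=0.5)

observations_needed = n_one_obs * deff                 # breasts
subjects_needed = math.ceil(observations_needed / s)   # subjects

print(deff, observations_needed, subjects_needed)
```

If the two breasts were perfectly correlated (r_cluster = 1), the design effect would be 2 and nothing would be gained from the second observation; with r_cluster = 0 each breast would count as an independent observation.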

58 Estimate and construct CI, or state statistical hypothesis:
- superiority
- non-inferiority
- equivalence
Plan Statistical Analysis:
- prospective or retrospective
- paired or unpaired
- clustered or 1 obs/subject
Assumptions:
- performance estimates
- variability
- correlations
- set type I and II error rates
Calculate sample size


More information

European Federation of Statisticians in the Pharmaceutical Industry (EFSPI)

European Federation of Statisticians in the Pharmaceutical Industry (EFSPI) Page 1 of 14 European Federation of Statisticians in the Pharmaceutical Industry (EFSPI) COMMENTS ON DRAFT FDA Guidance for Industry - Non-Inferiority Clinical Trials Rapporteur: Bernhard Huitfeldt (bernhard.huitfeldt@astrazeneca.com)

More information

Analysis of Variance (ANOVA)

Analysis of Variance (ANOVA) Research Methods and Ethics in Psychology Week 4 Analysis of Variance (ANOVA) One Way Independent Groups ANOVA Brief revision of some important concepts To introduce the concept of familywise error rate.

More information

Comparison of two means

Comparison of two means 1 Comparison of two means Most studies are comparative in that they compare outcomes from one group with outcomes from another, for example the mean blood pressure in reponse to two different treatments.

More information

Fixed Effect Combining

Fixed Effect Combining Meta-Analysis Workshop (part 2) Michael LaValley December 12 th 2014 Villanova University Fixed Effect Combining Each study i provides an effect size estimate d i of the population value For the inverse

More information

Patrick Breheny. January 28

Patrick Breheny. January 28 Confidence intervals Patrick Breheny January 28 Patrick Breheny Introduction to Biostatistics (171:161) 1/19 Recap Introduction In our last lecture, we discussed at some length the Public Health Service

More information

Adaptive designs: When are they sensible?

Adaptive designs: When are they sensible? Adaptive designs: When are they sensible? Janet Wittes Statistics Collaborative, Inc. Sensible Guidelines May 2012 The pitch Clinical trials are too rigid There are new classes of designs More ethical

More information

Sanjay P. Zodpey Clinical Epidemiology Unit, Department of Preventive and Social Medicine, Government Medical College, Nagpur, Maharashtra, India.

Sanjay P. Zodpey Clinical Epidemiology Unit, Department of Preventive and Social Medicine, Government Medical College, Nagpur, Maharashtra, India. Research Methodology Sample size and power analysis in medical research Sanjay P. Zodpey Clinical Epidemiology Unit, Department of Preventive and Social Medicine, Government Medical College, Nagpur, Maharashtra,

More information

Breast tomosynthesis reduces radiologist performance variability compared to digital mammography

Breast tomosynthesis reduces radiologist performance variability compared to digital mammography Breast tomosynthesis reduces radiologist performance variability compared to digital mammography Andrew Smith 1, Elizabeth Rafferty 2, Loren Niklason 1 1 Hologic, Inc., Bedford MA, USA 2 Massachusetts

More information

Statistics for Clinical Trials: Basics of Phase III Trial Design

Statistics for Clinical Trials: Basics of Phase III Trial Design Statistics for Clinical Trials: Basics of Phase III Trial Design Gary M. Clark, Ph.D. Vice President Biostatistics & Data Management Array BioPharma Inc. Boulder, Colorado USA NCIC Clinical Trials Group

More information

Simple Sensitivity Analyses for Matched Samples Thomas E. Love, Ph.D. ASA Course Atlanta Georgia https://goo.

Simple Sensitivity Analyses for Matched Samples Thomas E. Love, Ph.D. ASA Course Atlanta Georgia https://goo. Goal of a Formal Sensitivity Analysis To replace a general qualitative statement that applies in all observational studies the association we observe between treatment and outcome does not imply causation

More information

Meta-Analysis and Subgroups

Meta-Analysis and Subgroups Prev Sci (2013) 14:134 143 DOI 10.1007/s11121-013-0377-7 Meta-Analysis and Subgroups Michael Borenstein & Julian P. T. Higgins Published online: 13 March 2013 # Society for Prevention Research 2013 Abstract

More information

Hypothesis Testing. Richard S. Balkin, Ph.D., LPC-S, NCC

Hypothesis Testing. Richard S. Balkin, Ph.D., LPC-S, NCC Hypothesis Testing Richard S. Balkin, Ph.D., LPC-S, NCC Overview When we have questions about the effect of a treatment or intervention or wish to compare groups, we use hypothesis testing Parametric statistics

More information

EC352 Econometric Methods: Week 07

EC352 Econometric Methods: Week 07 EC352 Econometric Methods: Week 07 Gordon Kemp Department of Economics, University of Essex 1 / 25 Outline Panel Data (continued) Random Eects Estimation and Clustering Dynamic Models Validity & Threats

More information

Practical Bayesian Design and Analysis for Drug and Device Clinical Trials

Practical Bayesian Design and Analysis for Drug and Device Clinical Trials Practical Bayesian Design and Analysis for Drug and Device Clinical Trials p. 1/2 Practical Bayesian Design and Analysis for Drug and Device Clinical Trials Brian P. Hobbs Plan B Advisor: Bradley P. Carlin

More information

Making comparisons. Previous sessions looked at how to describe a single group of subjects However, we are often interested in comparing two groups

Making comparisons. Previous sessions looked at how to describe a single group of subjects However, we are often interested in comparing two groups Making comparisons Previous sessions looked at how to describe a single group of subjects However, we are often interested in comparing two groups Data can be interpreted using the following fundamental

More information

Power of a Clinical Study

Power of a Clinical Study Power of a Clinical Study M.Yusuf Celik 1, Editor-in-Chief 1 Prof.Dr. Biruni University, Medical Faculty, Dept of Biostatistics, Topkapi, Istanbul. Abstract The probability of not committing a Type II

More information

Sampling for Impact Evaluation. Maria Jones 24 June 2015 ieconnect Impact Evaluation Workshop Rio de Janeiro, Brazil June 22-25, 2015

Sampling for Impact Evaluation. Maria Jones 24 June 2015 ieconnect Impact Evaluation Workshop Rio de Janeiro, Brazil June 22-25, 2015 Sampling for Impact Evaluation Maria Jones 24 June 2015 ieconnect Impact Evaluation Workshop Rio de Janeiro, Brazil June 22-25, 2015 How many hours do you expect to sleep tonight? A. 2 or less B. 3 C.

More information

Helmut Schütz. Satellite Short Course Budapest, 5 October

Helmut Schütz. Satellite Short Course Budapest, 5 October Multi-Group and Multi-Site Studies. To pool or not to pool? Helmut Schütz Satellite Short Course Budapest, 5 October 2017 1 Group Effect Sometimes subjects are split into two or more groups Reasons Lacking

More information

Comments on Significance of candidate cancer genes as assessed by the CaMP score by Parmigiani et al.

Comments on Significance of candidate cancer genes as assessed by the CaMP score by Parmigiani et al. Comments on Significance of candidate cancer genes as assessed by the CaMP score by Parmigiani et al. Holger Höfling Gad Getz Robert Tibshirani June 26, 2007 1 Introduction Identifying genes that are involved

More information

Empowered by Psychometrics The Fundamentals of Psychometrics. Jim Wollack University of Wisconsin Madison

Empowered by Psychometrics The Fundamentals of Psychometrics. Jim Wollack University of Wisconsin Madison Empowered by Psychometrics The Fundamentals of Psychometrics Jim Wollack University of Wisconsin Madison Psycho-what? Psychometrics is the field of study concerned with the measurement of mental and psychological

More information

Biostatistics 3. Developed by Pfizer. March 2018

Biostatistics 3. Developed by Pfizer. March 2018 BROUGHT TO YOU BY Biostatistics 3 Developed by Pfizer March 2018 This learning module is intended for UK healthcare professionals only. Job bag: PP-GEP-GBR-0986 Date of preparation March 2018. Agenda I.

More information

Electrical impedance scanning of the breast is considered investigational and is not covered.

Electrical impedance scanning of the breast is considered investigational and is not covered. ARBenefits Approval: 09/28/2011 Effective Date: 01/01/2012 Revision Date: Code(s): Medical Policy Title: Electrical Impedance Scanning of the Breast Document: ARB0127 Administered by: Public Statement:

More information

Welcome to our e-course on research methods for dietitians

Welcome to our e-course on research methods for dietitians Welcome to our e-course on research methods for dietitians This unit is on Sample size: Also reviews confidence intervals and simple statistics This project has been funded with support from the European

More information

Mammographic Breast Density Classification by a Deep Learning Approach

Mammographic Breast Density Classification by a Deep Learning Approach Mammographic Breast Density Classification by a Deep Learning Approach Aly Mohamed, PhD Robert Nishikawa, PhD Wendie A. Berg, MD, PhD David Gur, ScD Shandong Wu, PhD Department of Radiology, University

More information

EPS 625 INTERMEDIATE STATISTICS TWO-WAY ANOVA IN-CLASS EXAMPLE (FLEXIBILITY)

EPS 625 INTERMEDIATE STATISTICS TWO-WAY ANOVA IN-CLASS EXAMPLE (FLEXIBILITY) EPS 625 INTERMEDIATE STATISTICS TO-AY ANOVA IN-CLASS EXAMPLE (FLEXIBILITY) A researcher conducts a study to evaluate the effects of the length of an exercise program on the flexibility of female and male

More information

SCHOOL OF MATHEMATICS AND STATISTICS

SCHOOL OF MATHEMATICS AND STATISTICS Data provided: Tables of distributions MAS603 SCHOOL OF MATHEMATICS AND STATISTICS Further Clinical Trials Spring Semester 014 015 hours Candidates may bring to the examination a calculator which conforms

More information

EXERCISE: HOW TO DO POWER CALCULATIONS IN OPTIMAL DESIGN SOFTWARE

EXERCISE: HOW TO DO POWER CALCULATIONS IN OPTIMAL DESIGN SOFTWARE ...... EXERCISE: HOW TO DO POWER CALCULATIONS IN OPTIMAL DESIGN SOFTWARE TABLE OF CONTENTS 73TKey Vocabulary37T... 1 73TIntroduction37T... 73TUsing the Optimal Design Software37T... 73TEstimating Sample

More information

Bios 6648: Design & conduct of clinical research

Bios 6648: Design & conduct of clinical research Bios 6648: Design & conduct of clinical research Section 1 - Specifying the study setting and objectives 1. Specifying the study setting and objectives Where will we end up?: (a) The treatment indication

More information

Two-Way Independent ANOVA

Two-Way Independent ANOVA Two-Way Independent ANOVA Analysis of Variance (ANOVA) a common and robust statistical test that you can use to compare the mean scores collected from different conditions or groups in an experiment. There

More information

The t-test: Answers the question: is the difference between the two conditions in my experiment "real" or due to chance?

The t-test: Answers the question: is the difference between the two conditions in my experiment real or due to chance? The t-test: Answers the question: is the difference between the two conditions in my experiment "real" or due to chance? Two versions: (a) Dependent-means t-test: ( Matched-pairs" or "one-sample" t-test).

More information

Midterm Exam MMI 409 Spring 2009 Gordon Bleil

Midterm Exam MMI 409 Spring 2009 Gordon Bleil Midterm Exam MMI 409 Spring 2009 Gordon Bleil Table of contents: (Hyperlinked to problem sections) Problem 1 Hypothesis Tests Results Inferences Problem 2 Hypothesis Tests Results Inferences Problem 3

More information

Stat Wk 9: Hypothesis Tests and Analysis

Stat Wk 9: Hypothesis Tests and Analysis Stat 342 - Wk 9: Hypothesis Tests and Analysis Crash course on ANOVA, proc glm Stat 342 Notes. Week 9 Page 1 / 57 Crash Course: ANOVA AnOVa stands for Analysis Of Variance. Sometimes it s called ANOVA,

More information