Diagnostic Tests

Introduction

Suppose we have a quantitative measurement $X_i$ on experimental or observed units $i = 1, \ldots, n$, and a characteristic $Y_i = 0$ or $Y_i = 1$ (e.g. case/control status). The measurement $X_i$ is thought to be related to the characteristic $Y_i$ in the sense that units with higher $X_i$ values are more likely to have $Y_i = 1$.

We can make a prediction about $Y_i$ based on $X_i$ by setting a threshold value $T$, and predicting $Y_i = 1$ when $X_i > T$. This is called a diagnostic test.

Applications of diagnostic testing

Cancer detection: The amount or concentration of a protein $X_i$ in serum obtained from person $i$ may be used to predict whether the person has a particular form of cancer.

Credit scoring: A person's credit score at the time that he or she receives a loan may be used to predict whether the loan is repaid on time.

Labeling conventions

The labeling of outcome categories as 1 or 0 is arbitrary in principle. For example, we could label cancer as 1 and non-cancer as 0, or vice versa. In practice, label 1 is typically used for the rarer category, or for the category that would require some action or intervention. Label 0 usually denotes a default category that requires no action.

Depending on the situation, either larger values of $X$ or smaller values of $X$ may be associated with higher probabilities that $Y_i = 1$. In the latter case we can work with $-X_i$, or use prediction rules of the form $X_i < T$ rather than $X_i > T$.

Diagnostic testing terminology

A diagnostic test is a balance between two types of successful predictions and two types of errors.

Successful predictions:

True positive: a situation in which $X_i > T$ and $Y_i = 1$, for example when a person with cancer is predicted to have cancer.

True negative: a situation in which $X_i < T$ and $Y_i = 0$, for example when a cancer-free person is predicted to be cancer-free.

Errors:

False positive: a situation in which $X_i > T$ but $Y_i = 0$, for example when a person is predicted to have cancer but actually does not.

False negative: a situation in which $X_i < T$ but $Y_i = 1$, for example when a person is predicted to be cancer-free but actually has cancer.

Marginal categories

The actual status of a unit is positive or negative:

Positive: everyone with $Y_i = 1$ (all true positives and false negatives). The proportion of positives is often called the prevalence.

Negative: everyone with $Y_i = 0$ (all false positives and true negatives).

The predicted status of a unit is called positive or called negative:

Called positive: everyone with $X_i > T$ (all true positives and false positives).

Called negative: everyone with $X_i < T$ (all true negatives and false negatives).

The relationships among all these terms are summarized as follows:

                Called positive    Called negative
    Positive    True positive      False negative
    Negative    False positive     True negative

Sensitivity and specificity

A common way to evaluate a diagnostic test is in terms of sensitivity and specificity.

Sensitivity: the proportion of positive units that are called positive. The population value is $P(X_i > T \mid Y_i = 1)$.

Specificity: the proportion of negative units that are called negative. The population value is $P(X_i < T \mid Y_i = 0)$.

Since sensitivity and specificity are calculated conditionally on case/control status ($Y_i$), they can be estimated using either a population sample or a case/control sample.

1 - specificity is called the false positive rate (FPR).

1 - sensitivity is called the false negative rate (FNR).

Example: Suppose we have a biomarker $X_i$ for colon cancer such that 75% of people with colon cancer have $X_i > T$ and 5% of people without colon cancer have $X_i > T$. Thus the sensitivity is 75% and the specificity is 100% - 5% = 95%.

We then screen 1000 people from a population with 15% colon cancer prevalence, so we expect 150 positives and 850 negatives. We should expect the following results:

                Called positive        Called negative
    Positive    150 × 0.75 = 112.5     150 × 0.25 = 37.5
    Negative    850 × 0.05 = 42.5      850 × 0.95 = 807.5

The overall error rate is 80/1000 = 8%, and there is a rough balance between false positives and false negatives. Most of the people who have colon cancer are detected.
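The expected counts in the colon cancer example can be reproduced with a few lines of R. This is only a sketch: `expected_counts` is a hypothetical helper name introduced here for illustration, and the inputs are the made-up sensitivity, specificity, and prevalence from the example.

```r
## Expected screening counts from sensitivity, specificity, and prevalence.
## expected_counts is a hypothetical helper, not part of the original slides.
expected_counts = function(sens, spec, prev, n) {
    pos = n * prev        ## Expected number of positive units.
    neg = n * (1 - prev)  ## Expected number of negative units.
    matrix(c(pos * sens, pos * (1 - sens),
             neg * (1 - spec), neg * spec),
           nrow=2, byrow=TRUE,
           dimnames=list(c("Positive", "Negative"),
                         c("Called positive", "Called negative")))
}

colon = expected_counts(sens=0.75, spec=0.95, prev=0.15, n=1000)

## Overall error rate: false negatives plus false positives, over n.
err = (colon["Positive", "Called negative"] +
       colon["Negative", "Called positive"]) / 1000
```

Changing `prev` shows how the same test behaves when the condition is much rarer.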

Example: Now suppose we are screening for pancreatic cancer with a prevalence of 0.5% using tests with the same sensitivity and specificity. Among 1000 people we expect 5 positives and 995 negatives, and we expect to get:

                Called positive        Called negative
    Positive    5 × 0.75 = 3.75        5 × 0.25 = 1.25
    Negative    995 × 0.05 = 49.75     995 × 0.95 = 945.25

The overall error rate improves to 51/1000, or about 5%. The errors overwhelmingly consist of cancer-free false positives. Note that we could get an error rate of 0.5% by predicting everybody to be cancer-free.

Sensitivity and specificity for normal populations

Suppose that $X \mid Y = 0$ is normal with mean $\mu_0$ and standard deviation $\sigma_0$, and $X \mid Y = 1$ is normal with mean $\mu_1$ and standard deviation $\sigma_1$. Then

$$
\begin{aligned}
\text{Sensitivity} &= P(X > T \mid Y = 1) \\
&= P\big((X - \mu_1)/\sigma_1 > (T - \mu_1)/\sigma_1 \mid Y = 1\big) \\
&= P\big(Z > (T - \mu_1)/\sigma_1\big) \\
&= 1 - P\big(Z \le (T - \mu_1)/\sigma_1\big),
\end{aligned}
$$

where $Z$ is standard normal. $P(Z \le \cdot)$ can be obtained from a normal probability table.

Exercise: Derive a similar formula for specificity.
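In R, `pnorm` plays the role of the normal probability table. The sketch below evaluates the sensitivity formula and checks it against a simulation. The helper name `normal_sensitivity` and the values $\mu_1 = 1$, $\sigma_1 = 1$, $T = 0.5$ are assumptions chosen only for illustration.

```r
## Sensitivity of the rule X > thresh when X | Y = 1 is normal(mu1, sigma1).
## normal_sensitivity is a hypothetical helper name.
normal_sensitivity = function(thresh, mu1, sigma1) {
    1 - pnorm((thresh - mu1) / sigma1)
}

## Illustrative (assumed) values: mu1 = 1, sigma1 = 1, threshold 0.5.
sens = normal_sensitivity(0.5, mu1=1, sigma1=1)

## Monte Carlo check: the proportion of simulated positives above the
## threshold should be close to the formula value.
set.seed(1)
x1 = rnorm(100000, mean=1, sd=1)
sens_mc = mean(x1 > 0.5)
```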

Positive and negative predictive values

Another way to evaluate a diagnostic test is based on the positive and negative predictive values.

Positive predictive value (PPV): the proportion of units called positive that are positive. The population value is $P(Y_i = 1 \mid X_i > T)$.

Negative predictive value (NPV): the proportion of units called negative that are negative. The population value is $P(Y_i = 0 \mid X_i < T)$.

1 - PPV is called the false discovery rate: the proportion of called positives that are negative.

Relationships between sensitivity, specificity, positive predictive value, and negative predictive value

If we know the prevalence, we can use Bayes' theorem to convert between sensitivity/specificity and positive/negative predictive values. For example:

$$P(Y_i = 1 \mid X_i > T) = P(X_i > T \mid Y_i = 1)\, P(Y_i = 1) / P(X_i > T)$$

PPV = sensitivity × prevalence / P(positive call)

Exercise: Derive a similar relationship for NPV.

Note: If prevalence / P(positive call) is approximately 1, then the PPV and sensitivity are similar.

Note: PPV depends on prevalence, so it cannot be estimated from a case/control sample unless we have an independent estimate of the prevalence.

Example: The probability of being a called positive in the colon cancer example above is $0.15 \times 0.75 + 0.85 \times 0.05 = 0.155$. Thus the positive predictive value is $0.1125/0.155 \approx 0.73$.

Exercise: show that the negative predictive value for the colon cancer example is approximately 0.96.

Example: For the pancreatic cancer example the probability of being a called positive is $0.005 \times 0.75 + 0.995 \times 0.05 = 0.0535 \approx 0.05$, so the positive predictive value is $0.00375/0.0535 \approx 0.07$.

Exercise: show that the negative predictive value for the pancreatic cancer example is approximately 0.999.

Note that pancreatic cancer screening looks easier than colon cancer screening based on overall error rate (5% versus 8%), but the PPV reveals that the pancreatic cancer test produces a high fraction of false positives.
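The Bayes' theorem conversion from sensitivity, specificity, and prevalence to PPV can be sketched in R as follows. The function name `ppv` is a hypothetical helper introduced here, and the inputs are the made-up values from the two cancer examples.

```r
## PPV from sensitivity, specificity, and prevalence via Bayes' theorem.
## ppv is a hypothetical helper name, not part of the original slides.
ppv = function(sens, spec, prev) {
    ## P(positive call) = P(call | positive) P(positive)
    ##                  + P(call | negative) P(negative).
    p_call = sens * prev + (1 - spec) * (1 - prev)
    sens * prev / p_call
}

ppv_colon = ppv(sens=0.75, spec=0.95, prev=0.15)
ppv_pancreatic = ppv(sens=0.75, spec=0.95, prev=0.005)
```

With identical sensitivity and specificity, the only input that differs between the two calls is the prevalence, which is what drives the PPV down in the rarer disease.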

Which cancer is truly easier to detect? It depends on the follow-up:

Suppose that for colon cancer there is a secondary test that can quickly and safely differentiate the 113 true positives from the 43 false positives, and there is a treatment that substantially helps 50% of people whose colon cancer is detected at screening. Then the 43 false positives only need to go through the inconvenience and stress of a secondary test, and half of the 113 true positives have substantially improved outcomes.

Suppose that for pancreatic cancer the only way to confirm the disease is by an invasive procedure that has a 10% rate of serious complications, and therapy only improves the outcome for 20% of people with the disease. Then about 5 healthy people (10% of the roughly 50 false positives) can be expected to suffer serious complications in order to identify about 4 people with pancreatic cancer, of whom fewer than one on average will benefit from treatment.

Note: the numbers used for the colon and pancreatic cancer examples are made up, but are roughly realistic.

ROC curves

Suppose we want to evaluate how much information a measurement $X_i$ contains about a characteristic $Y_i$, but we don't yet want to fix a specific threshold value $T$. A graphical approach is to plot sensitivity on the vertical axis against 1 - specificity on the horizontal axis for all possible values of $T$.

[Figure: three example curves (red, blue, green) of sensitivity against 1 - specificity.]

The following facts constrain a plot of sensitivity against 1 - specificity:

As $T$ decreases, the sensitivity $P(X > T \mid Y = 1)$ is non-decreasing.

As $T$ decreases, the specificity is non-increasing, so 1 - specificity is non-decreasing.

When $T$ is $+\infty$, nobody is called positive, so the sensitivity and 1 - specificity are both 0.

When $T$ is $-\infty$, everybody is called positive, so the sensitivity and 1 - specificity are both 1.

ROC curves

A plot of sensitivity against 1 - specificity is called a receiver operating characteristic curve, or ROC curve. Due to the constraints discussed above, an ROC curve is a non-decreasing path from (0, 0) to (1, 1).
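As a rough illustration, a sample ROC curve can be traced by evaluating the sample sensitivity and 1 - specificity at every observed threshold. The helper `roc_points` and the simulated data below are assumptions made for this sketch, not part of the original slides.

```r
## Sample ROC curve: sensitivity and 1 - specificity at every threshold.
## roc_points is a hypothetical helper; x is the measurement, y is 0/1.
roc_points = function(x, y) {
    ## Thresholds run from +Inf (nobody called positive) down to -Inf
    ## (everybody called positive).
    thresholds = c(Inf, sort(unique(x), decreasing=TRUE), -Inf)
    sens = sapply(thresholds, function(thr) mean(x[y == 1] > thr))
    fpr = sapply(thresholds, function(thr) mean(x[y == 0] > thr))
    data.frame(threshold=thresholds, fpr=fpr, sens=sens)
}

## Simulated data (assumed for illustration): X | Y = 0 is standard
## normal and X | Y = 1 is normal with mean 1 and variance 1.
set.seed(1)
y = rep(c(0, 1), each=50)
x = rnorm(100, mean=y, sd=1)
roc = roc_points(x, y)

## plot(roc$fpr, roc$sens, type="s") would draw the curve.
```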

Reading and interpreting ROC curves

If $X$ contains no information about $Y$, the sensitivity is $P(X > T \mid Y = 1) = P(X > T)$, and the specificity is $P(X < T \mid Y = 0) = P(X < T)$. Therefore 1 - specificity is $P(X \ge T)$, which equals $P(X > T)$ when $X$ is continuous, so the ROC curve is a plot of $P(X > T)$ against $P(X > T)$: a diagonal line from (0, 0) to (1, 1). Note that in this case sensitivity = 1 - specificity, or sensitivity + specificity = 1.

If $X$ is perfectly informative about $Y$, then there exists a point $T$ such that $P(X > T \mid Y = 1) = 1$ and $P(X < T \mid Y = 0) = 1$. We can always determine the value of $Y$ based on whether $X$ is greater than or less than $T$. In this case the ROC curve is a path from (0, 0) to (0, 1) to (1, 1).

If $X$ is partially informative about $Y$, then for at least some values of $T$, sensitivity + specificity > 1, so the ROC curve is sometimes above the diagonal. The more it lies above the diagonal, the better.

If the ROC curve is usually or always below the diagonal, the relationship between $X$ and $Y$ is inverted, and we should be using $-X$ rather than $X$ to form our predictions.

Graphs of population ROC curves

The following plots show population ROC curves (right side) together with the population densities (left side) of $X$ values in the $Y = 0$ group (orange) and in the $Y = 1$ group (blue).

[Figures: five pairs of plots, each showing the population densities of X (left) and the corresponding population ROC curve (right). The ROC panels are labeled with their AUC values, including 0.56 and 0.92.]

Area under the curve (AUC)

The ROC curve always lies in the unit box $[0, 1] \times [0, 1]$.

In the most favorable situation for prediction, the ROC curve consists of the left and top edges of the box, so the area under the ROC curve is 1 (the area of the whole box).

In the least favorable situation for prediction, $X$ and $Y$ are independent, the ROC curve follows the diagonal from (0, 0) to (1, 1), and the area under the ROC curve is 1/2.

In general, the area under the ROC curve (AUC) can be used as an overall measure of the information in $X$ about $Y$. The AUC can fall anywhere between 0 and 1, but if the correct orientation of $X$ is known the AUC will fall between 1/2 and 1. Higher AUC values correspond to a greater amount of information in $X$ about $Y$.

Sampling interpretation of the AUC

Suppose a positive unit ($Y_i = 1$) and a negative unit ($Y_j = 0$) are selected at random. The population AUC is the probability that $X_i > X_j$.

The sample AUC is the Mann-Whitney statistic divided by the number of (positive, negative) pairs, and can be equivalently calculated as

$$\frac{\sum_{i,j} I(X_i > X_j,\ Y_i = 1,\ Y_j = 0)}{\sum_{i,j} I(Y_i = 1,\ Y_j = 0)}.$$

The AUC can be calculated in R as follows, where x1 and x0 hold the $X$ values for the $Y = 1$ and $Y = 0$ units:

    wilcox.test(x1,x0)$statistic/(length(x1)*length(x0))
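The equivalence between the pair-counting formula and the `wilcox.test` calculation can be checked numerically. The simulated `x0` and `x1` below are assumed data for illustration only; with continuous data there are no ties, so the two calculations should agree.

```r
## Sample AUC by direct pair counting, compared with wilcox.test.
set.seed(1)
x0 = rnorm(40, mean=0, sd=1)  ## Assumed negative units (Y = 0).
x1 = rnorm(30, mean=1, sd=1)  ## Assumed positive units (Y = 1).

## outer() forms all (positive, negative) pairs; mean() then gives the
## proportion of pairs in which the positive unit has the larger value.
auc_pairs = mean(outer(x1, x0, ">"))

## Mann-Whitney statistic divided by the number of pairs.
auc_wilcox = as.numeric(wilcox.test(x1, x0)$statistic) /
    (length(x1) * length(x0))
```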

Inference

Sensitivity, specificity, PPV, and NPV are proportions. For example, the population sensitivity is

$$p = P(X > T \mid Y = 1),$$

which we estimate as

$$\hat{p} = \sum_i I(X_i > T)\, Y_i \Big/ \sum_i Y_i.$$

If $\sum_i Y_i$ is fixed (as in a case/control study), $\hat{p}$ is a simple average. In this case, it is unbiased and has variance

$$\mathrm{var}\,\hat{p} = p(1 - p)/n_1,$$

where $n_1 = \sum_i Y_i$.

If the data are from a random sample of size $N$, then $n_1$ is random. In this case, the estimate of sensitivity is still unbiased, but the variance is larger than in a case/control study. The conditional variance is:

$$\mathrm{var}(\hat{p} \mid n_1) = p(1 - p)/n_1.$$

Using the law of total variance, we get

$$\mathrm{var}\,\hat{p} = \mathrm{var}\,E(\hat{p} \mid n_1) + E\,\mathrm{var}(\hat{p} \mid n_1) = 0 + p(1 - p) \sum_{n=1}^{N} P(n_1 = n)/n.$$

Since $n_1$ is the number of cases out of a total sample size of $N$, and each sampled unit has a fixed probability $q$ of being a case, $n_1$ has a binomial distribution:

$$P(n_1 = n) = \binom{N}{n} q^n (1 - q)^{N - n}.$$

How does $\sum_{n=1}^{N} P(n_1 = n)/n$ relate to $1/n_1$? This tells us about the efficiency of a case/control study compared to a random population sample for estimating the sensitivity.

These plots compare standard errors for the two types of sampling when the total sample size is 50.

[Figure: left panel, standard error against P(Y = 1) for case/control and population sampling; right panel, difference in standard errors against P(Y = 1).]

The difference could be important if $P(Y = 1) < 0.2$ or so.
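The comparison of the two sampling designs can be approximated numerically with `dbinom`. This is a sketch under stated assumptions: the sensitivity value `p = 0.75` is illustrative, and the case/control design is assumed to fix the number of cases at its expected value $Nq$, which may not match the setup behind the original plots.

```r
## Standard error of the estimated sensitivity under two sampling designs.
N = 50    ## Total sample size.
p = 0.75  ## Assumed population sensitivity (illustrative value only).

## se_compare is a hypothetical helper; q is P(Y = 1).
se_compare = function(q) {
    n = 1:N
    ## Population sample: n1 is Binomial(N, q), and the variance involves
    ## the sum of P(n1 = n)/n over n = 1, ..., N.
    e_inv = sum(dbinom(n, size=N, prob=q) / n)
    se_pop = sqrt(p * (1 - p) * e_inv)
    ## Case/control sample: number of cases fixed at N*q (an assumption).
    se_cc = sqrt(p * (1 - p) / (N * q))
    c(q=q, se_cc=se_cc, se_pop=se_pop)
}

res = t(sapply(c(0.1, 0.2, 0.5), se_compare))
```

Each row of `res` gives the two standard errors for one value of $P(Y = 1)$.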

Sample ROC curves for various sample sizes

ROC curves based on data fluctuate around their mean value and become more accurate as the sample size increases. The following plots show sample ROC curves when $X \mid Y = 0$ is standard normal and $X \mid Y = 1$ is normal with mean 1 and variance 1.

[Figure: sample ROC curves (sensitivity against 1 - specificity) in four panels, for sample sizes of 25, 50, 100, and 200 per group.]

Inference for the AUC

For most statistics, the standard error of the statistic based on a sample of size $N$ approximately has the form $\mathrm{SE} \approx c/\sqrt{N}$. When this holds, we can form a log/log plot of SE against sample size and the slope will be $-1/2$:

$$\log \mathrm{SE} \approx \log(c) - \log(N)/2.$$

Here is the plot for AUC:

[Figure: log2 SE(AUC) against log2 total sample size, showing the standard errors and a linear fit.]

The slope of the grey line is -1.16, so this is not a typical statistic. It appears that $\mathrm{SE} \approx c/N$ for the AUC.

There are complicated analytic expressions for the standard error of the AUC, but the bootstrap is also a good approach.

Generalization of the threshold value

A common way to set the threshold value $T$ is to specify a lower bound $\kappa$ on specificity, and set $T$ to the lowest value such that the sample specificity in the training set is greater than $\kappa$.

What is the distribution of threshold values associated with this procedure?

What is the distribution of population specificity values associated with this procedure?
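One way to sketch the threshold rule in R is shown below. The helper `choose_threshold` is hypothetical, candidate thresholds are restricted to the observed values in the negative group, and the sample specificity at a candidate is taken as the proportion of negatives at or below it (an assumption about how ties with the threshold are handled). The training data are simulated from a standard normal distribution for illustration.

```r
## Lowest observed threshold whose training-set specificity exceeds kappa.
## choose_threshold is a hypothetical helper name.
choose_threshold = function(x0, kappa) {
    cand = sort(x0)
    ## Sample specificity at each candidate: negatives not called positive.
    spec = sapply(cand, function(thr) mean(x0 <= thr))
    cand[which(spec > kappa)[1]]
}

set.seed(1)
x0 = rnorm(50)  ## Assumed training values for the Y = 0 group.
T_hat = choose_threshold(x0, kappa=0.9)

## Population specificity at the chosen threshold, known here only
## because the simulated negatives are standard normal.
pop_spec = pnorm(T_hat)
```

Repeating this over many simulated training sets would give the distributions of `T_hat` and `pop_spec` that the two questions above ask about.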

Distributions of threshold values $T$ when $\kappa = 0.9$, $X \mid Y = 0$ is standard normal and $X \mid Y = 1$ is normal with mean $\mu_1$ and variance 1. Both groups have sample size $n$.

[Figure: four density plots of the threshold, for n = 25, 50, 50, and 100, each with its own value of µ1.]

Population specificities corresponding to the distributions of threshold values on the previous slide.

[Figure: four density plots of the population specificity, for n = 25, 50, 50, and 100, each with its own value of µ1.]

Parametric bootstrap for ROC analysis

Suppose a diagnostic test is available that has an AUC of 0.8. Someone has developed a new test, and wants to show that it is superior to the gold standard.

The following code outlines the parametric bootstrap using normal models for the $X \mid Y = 0$ and $X \mid Y = 1$ populations.

    ## Estimate means and standard deviations for the X|Y=0 and X|Y=1
    ## populations.
    m0 = mean(x0); s0 = sd(x0)
    m1 = mean(x1); s1 = sd(x1)

    nboot = 1000  ## The number of bootstrap samples to use.
    auc = rep(0, nboot)
    for (k in 1:nboot) {
        ## Generate a bootstrap data set (kept separate from the real data).
        xb0 = rnorm(length(x0), mean=m0, sd=s0)
        xb1 = rnorm(length(x1), mean=m1, sd=s1)

        ## The AUC for the bootstrap data set.
        auc[k] = wilcox.test(xb1,xb0)$statistic/(length(x1)*length(x0))
    }

    auc = sort(auc)
    lb = auc[round(0.025*nboot)]  ## The lower bound of the CI.
    ub = auc[round(0.975*nboot)]  ## The upper bound of the CI.

The following plots show the observed ROC curve in red, along with 10 ROC curves from parametric bootstrap samples in grey.

[Figure: two ROC panels (sensitivity against 1 - specificity), for sample sizes of 25 (left) and 50 (right) per group.]

           $\mu_1(\sigma_1)$  $\mu_0(\sigma_0)$  $n$  $\hat\mu_1(\hat\sigma_1)$  $\hat\mu_0(\hat\sigma_0)$  AUC   95% CI
    Left   1(1)               0(1)               25   (0.97)                     -0.09(0.85)                0.80  (0.66, 0.92)
    Right  1(1)               0(1)               50   (0.95)                     -0.08(0.99)                0.82  (0.75, 0.90)

The bootstrap CI is based on 1000 samples. Based on the CIs, we cannot be confident that these tests are better than the existing test with an AUC of 0.8.

Power and sample size analysis for ROC curves

Suppose we intend to evaluate a diagnostic test based on its AUC, using the parametric bootstrap to construct a 95% confidence interval for the population AUC.

For power analysis purposes, suppose we wish to identify the smallest sample size that gives us 80% power to conclude that the population AUC is greater than 0.6, when the data follow a standard normal population when $Y = 0$, and a normal population with mean 1 and standard deviation 1 when $Y = 1$.

The R code on the following slide uses simulation to estimate the power for a range of sample sizes.

    nrep = 100    ## Simulated data sets per sample size (small, to save time).
    nboot = 1000  ## Bootstrap samples per data set.
    R = NULL      ## Storage for the results.

    ## Loop over possible sample sizes.
    for (n in c(30,40,50)) {
        m = 0
        for (r in 1:nrep) {
            x1 = rnorm(n, mean=1, sd=1)  ## Actual data.
            x0 = rnorm(n, mean=0, sd=1)  ## Actual data.
            m1 = mean(x1); s1 = sd(x1)   ## Calculate parameters to use for
            m0 = mean(x0); s0 = sd(x0)   ## the parametric bootstrap.

            ## Calculate AUC values for bootstrap sets.
            auc = rep(0, nboot)
            for (k in 1:nboot) {
                xb1 = rnorm(n, mean=m1, sd=s1)  ## Bootstrap data.
                xb0 = rnorm(n, mean=m0, sd=s0)  ## Bootstrap data.
                auc[k] = wilcox.test(xb1,xb0)$statistic/(n*n)
            }

            ## Check if the lower bound of the interval exceeds 0.6.
            auc = sort(auc)
            if (auc[round(0.025*nboot)] > 0.6) { m = m+1 }
        }
        R = rbind(R, c(n, m/nrep))  ## Record the result.
    }

Using nrep=100 and nboot=1000 I got the following power results:

[Table: estimated power for n = 30, 40, and 50 per group.]

Based on these results it seems that a sample size between 40 and 50 per group should be used. Further simulation can pin this down to a single number.

Note: The population AUC for this simulation study is approximately 0.76.

Note: We have $\sigma_0 = \sigma_1$ for this analysis, but if there is reason to believe that $\sigma_0 \ne \sigma_1$, additional simulations should be run.

Note: These results are from a simulation with a small nrep value. For a real power analysis a larger nrep should be used.
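For two normal populations, the population AUC is $P(X_1 > X_0)$ for independent draws $X_1$ and $X_0$ from the $Y = 1$ and $Y = 0$ populations. Since $X_1 - X_0$ is normal with mean $\mu_1 - \mu_0$ and variance $\sigma_0^2 + \sigma_1^2$, this probability is $\Phi\big((\mu_1 - \mu_0)/\sqrt{\sigma_0^2 + \sigma_1^2}\big)$. The sketch below evaluates it for the simulation setting and checks it by Monte Carlo; the variable names are chosen here for illustration.

```r
## Population AUC for two normal populations.
mu0 = 0; s0 = 1  ## X | Y = 0 is standard normal.
mu1 = 1; s1 = 1  ## X | Y = 1 is normal with mean 1 and sd 1.

## P(X1 > X0), where X1 - X0 is normal(mu1 - mu0, s0^2 + s1^2).
auc_pop = pnorm((mu1 - mu0) / sqrt(s0^2 + s1^2))

## Monte Carlo check using independent (positive, negative) pairs.
set.seed(1)
auc_mc = mean(rnorm(100000, mu1, s1) > rnorm(100000, mu0, s0))
```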

Example: PSA and prostate cancer


10.1 Estimating with Confidence. Chapter 10 Introduction to Inference 10.1 Estimating with Confidence Chapter 10 Introduction to Inference Statistical Inference Statistical inference provides methods for drawing conclusions about a population from sample data. Two most common

More information

MOST: detecting cancer differential gene expression

MOST: detecting cancer differential gene expression Biostatistics (2008), 9, 3, pp. 411 418 doi:10.1093/biostatistics/kxm042 Advance Access publication on November 29, 2007 MOST: detecting cancer differential gene expression HENG LIAN Division of Mathematical

More information

Name: Biostatistics 1 st year Comprehensive Examination: Applied Take Home exam. Due May 29 th, 2015 by 5pm. Late exams will not be accepted.

Name: Biostatistics 1 st year Comprehensive Examination: Applied Take Home exam. Due May 29 th, 2015 by 5pm. Late exams will not be accepted. Name: Biostatistics 1 st year Comprehensive Examination: Applied Take Home exam Due May 29 th, 2015 by 5pm. Late exams will not be accepted. Instructions: 1. There are 2 questions and 4 pages. Answer each

More information

Diagnostic screening. Department of Statistics, University of South Carolina. Stat 506: Introduction to Experimental Design

Diagnostic screening. Department of Statistics, University of South Carolina. Stat 506: Introduction to Experimental Design Diagnostic screening Department of Statistics, University of South Carolina Stat 506: Introduction to Experimental Design 1 / 27 Ties together several things we ve discussed already... The consideration

More information

Chapter 23. Inference About Means. Copyright 2010 Pearson Education, Inc.

Chapter 23. Inference About Means. Copyright 2010 Pearson Education, Inc. Chapter 23 Inference About Means Copyright 2010 Pearson Education, Inc. Getting Started Now that we know how to create confidence intervals and test hypotheses about proportions, it d be nice to be able

More information

Chapter 19. Confidence Intervals for Proportions. Copyright 2010 Pearson Education, Inc.

Chapter 19. Confidence Intervals for Proportions. Copyright 2010 Pearson Education, Inc. Chapter 19 Confidence Intervals for Proportions Copyright 2010 Pearson Education, Inc. Standard Error Both of the sampling distributions we ve looked at are Normal. For proportions For means SD pˆ pq n

More information

Introduction to ROC analysis

Introduction to ROC analysis Introduction to ROC analysis Andriy I. Bandos Department of Biostatistics University of Pittsburgh Acknowledgements Many thanks to Sam Wieand, Nancy Obuchowski, Brenda Kurland, and Todd Alonzo for previous

More information

1 Introduction. st0020. The Stata Journal (2002) 2, Number 3, pp

1 Introduction. st0020. The Stata Journal (2002) 2, Number 3, pp The Stata Journal (22) 2, Number 3, pp. 28 289 Comparative assessment of three common algorithms for estimating the variance of the area under the nonparametric receiver operating characteristic curve

More information

ROC (Receiver Operating Characteristic) Curve Analysis

ROC (Receiver Operating Characteristic) Curve Analysis ROC (Receiver Operating Characteristic) Curve Analysis Julie Xu 17 th November 2017 Agenda Introduction Definition Accuracy Application Conclusion Reference 2017 All Rights Reserved Confidential for INC

More information

EVALUATION AND COMPUTATION OF DIAGNOSTIC TESTS: A SIMPLE ALTERNATIVE

EVALUATION AND COMPUTATION OF DIAGNOSTIC TESTS: A SIMPLE ALTERNATIVE EVALUATION AND COMPUTATION OF DIAGNOSTIC TESTS: A SIMPLE ALTERNATIVE NAHID SULTANA SUMI, M. ATAHARUL ISLAM, AND MD. AKHTAR HOSSAIN Abstract. Methods of evaluating and comparing the performance of diagnostic

More information

Behavioral Data Mining. Lecture 4 Measurement

Behavioral Data Mining. Lecture 4 Measurement Behavioral Data Mining Lecture 4 Measurement Outline Hypothesis testing Parametric statistical tests Non-parametric tests Precision-Recall plots ROC plots Hardware update Icluster machines are ready for

More information

STAT 200. Guided Exercise 4

STAT 200. Guided Exercise 4 STAT 200 Guided Exercise 4 1. Let s Revisit this Problem. Fill in the table again. Diagnostic tests are not infallible. We often express a fale positive and a false negative with any test. There are further

More information

Previously, when making inferences about the population mean,, we were assuming the following simple conditions:

Previously, when making inferences about the population mean,, we were assuming the following simple conditions: Chapter 17 Inference about a Population Mean Conditions for inference Previously, when making inferences about the population mean,, we were assuming the following simple conditions: (1) Our data (observations)

More information

A Bayesian approach to sample size determination for studies designed to evaluate continuous medical tests

A Bayesian approach to sample size determination for studies designed to evaluate continuous medical tests Baylor Health Care System From the SelectedWorks of unlei Cheng 1 A Bayesian approach to sample size determination for studies designed to evaluate continuous medical tests unlei Cheng, Baylor Health Care

More information

6. Unusual and Influential Data

6. Unusual and Influential Data Sociology 740 John ox Lecture Notes 6. Unusual and Influential Data Copyright 2014 by John ox Unusual and Influential Data 1 1. Introduction I Linear statistical models make strong assumptions about the

More information

T-Statistic-based Up&Down Design for Dose-Finding Competes Favorably with Bayesian 4-parameter Logistic Design

T-Statistic-based Up&Down Design for Dose-Finding Competes Favorably with Bayesian 4-parameter Logistic Design T-Statistic-based Up&Down Design for Dose-Finding Competes Favorably with Bayesian 4-parameter Logistic Design James A. Bolognese, Cytel Nitin Patel, Cytel Yevgen Tymofyeyef, Merck Inna Perevozskaya, Wyeth

More information

A novel approach to estimation of the time to biomarker threshold: Applications to HIV

A novel approach to estimation of the time to biomarker threshold: Applications to HIV A novel approach to estimation of the time to biomarker threshold: Applications to HIV Pharmaceutical Statistics, Volume 15, Issue 6, Pages 541-549, November/December 2016 PSI Journal Club 22 March 2017

More information

INTRODUCTION TO MACHINE LEARNING. Decision tree learning

INTRODUCTION TO MACHINE LEARNING. Decision tree learning INTRODUCTION TO MACHINE LEARNING Decision tree learning Task of classification Automatically assign class to observations with features Observation: vector of features, with a class Automatically assign

More information

Search settings MaxQuant

Search settings MaxQuant Search settings MaxQuant Briefly, we used MaxQuant version 1.5.0.0 with the following settings. As variable modifications we allowed Acetyl (Protein N-terminus), methionine oxidation and glutamine to pyroglutamate

More information

False Discovery Rates and Copy Number Variation. Bradley Efron and Nancy Zhang Stanford University

False Discovery Rates and Copy Number Variation. Bradley Efron and Nancy Zhang Stanford University False Discovery Rates and Copy Number Variation Bradley Efron and Nancy Zhang Stanford University Three Statistical Centuries 19th (Quetelet) Huge data sets, simple questions 20th (Fisher, Neyman, Hotelling,...

More information

Confidence Intervals On Subsets May Be Misleading

Confidence Intervals On Subsets May Be Misleading Journal of Modern Applied Statistical Methods Volume 3 Issue 2 Article 2 11-1-2004 Confidence Intervals On Subsets May Be Misleading Juliet Popper Shaffer University of California, Berkeley, shaffer@stat.berkeley.edu

More information

A Brief Introduction to Bayesian Statistics

A Brief Introduction to Bayesian Statistics A Brief Introduction to Statistics David Kaplan Department of Educational Psychology Methods for Social Policy Research and, Washington, DC 2017 1 / 37 The Reverend Thomas Bayes, 1701 1761 2 / 37 Pierre-Simon

More information

Sheila Barron Statistics Outreach Center 2/8/2011

Sheila Barron Statistics Outreach Center 2/8/2011 Sheila Barron Statistics Outreach Center 2/8/2011 What is Power? When conducting a research study using a statistical hypothesis test, power is the probability of getting statistical significance when

More information

AP Statistics TOPIC A - Unit 2 MULTIPLE CHOICE

AP Statistics TOPIC A - Unit 2 MULTIPLE CHOICE AP Statistics TOPIC A - Unit 2 MULTIPLE CHOICE Name Date 1) True or False: In a normal distribution, the mean, median and mode all have the same value and the graph of the distribution is symmetric. 2)

More information

Various performance measures in Binary classification An Overview of ROC study

Various performance measures in Binary classification An Overview of ROC study Various performance measures in Binary classification An Overview of ROC study Suresh Babu. Nellore Department of Statistics, S.V. University, Tirupati, India E-mail: sureshbabu.nellore@gmail.com Abstract

More information

Assignment #6. Chapter 10: 14, 15 Chapter 11: 14, 18. Due tomorrow Nov. 6 th by 2pm in your TA s homework box

Assignment #6. Chapter 10: 14, 15 Chapter 11: 14, 18. Due tomorrow Nov. 6 th by 2pm in your TA s homework box Assignment #6 Chapter 10: 14, 15 Chapter 11: 14, 18 Due tomorrow Nov. 6 th by 2pm in your TA s homework box Assignment #7 Chapter 12: 18, 24 Chapter 13: 28 Due next Friday Nov. 13 th by 2pm in your TA

More information

Sanjay P. Zodpey Clinical Epidemiology Unit, Department of Preventive and Social Medicine, Government Medical College, Nagpur, Maharashtra, India.

Sanjay P. Zodpey Clinical Epidemiology Unit, Department of Preventive and Social Medicine, Government Medical College, Nagpur, Maharashtra, India. Research Methodology Sample size and power analysis in medical research Sanjay P. Zodpey Clinical Epidemiology Unit, Department of Preventive and Social Medicine, Government Medical College, Nagpur, Maharashtra,

More information

Chapter 19. Confidence Intervals for Proportions. Copyright 2010, 2007, 2004 Pearson Education, Inc.

Chapter 19. Confidence Intervals for Proportions. Copyright 2010, 2007, 2004 Pearson Education, Inc. Chapter 19 Confidence Intervals for Proportions Copyright 2010, 2007, 2004 Pearson Education, Inc. Standard Error Both of the sampling distributions we ve looked at are Normal. For proportions For means

More information

Statistics: Interpreting Data and Making Predictions. Interpreting Data 1/50

Statistics: Interpreting Data and Making Predictions. Interpreting Data 1/50 Statistics: Interpreting Data and Making Predictions Interpreting Data 1/50 Last Time Last time we discussed central tendency; that is, notions of the middle of data. More specifically we discussed the

More information

Student Performance Q&A:

Student Performance Q&A: Student Performance Q&A: 2009 AP Statistics Free-Response Questions The following comments on the 2009 free-response questions for AP Statistics were written by the Chief Reader, Christine Franklin of

More information

The results of the clinical exam first have to be turned into a numeric variable.

The results of the clinical exam first have to be turned into a numeric variable. Worked examples of decision curve analysis using Stata Basic set up This example assumes that the user has installed the decision curve ado file and has saved the example data sets. use dca_example_dataset1.dta,

More information

Sample Illustration of the "CompuSyn Report" of Two-Drug Combinations in Vitro

Sample Illustration of the CompuSyn Report of Two-Drug Combinations in Vitro Sample Illustration of the "CompuSyn Report" of Two-Drug Combinations in Vitro A: Fludelone (FD) in nm (IC 50 was about 2.7 nm), 8 Data Points B: Panaxytriol (PTX) in um (IC 50 was about 3.2 um), 6 Data

More information

Statistics for Psychology

Statistics for Psychology Statistics for Psychology SIXTH EDITION CHAPTER 3 Some Key Ingredients for Inferential Statistics Some Key Ingredients for Inferential Statistics Psychologists conduct research to test a theoretical principle

More information

2.75: 84% 2.5: 80% 2.25: 78% 2: 74% 1.75: 70% 1.5: 66% 1.25: 64% 1.0: 60% 0.5: 50% 0.25: 25% 0: 0%

2.75: 84% 2.5: 80% 2.25: 78% 2: 74% 1.75: 70% 1.5: 66% 1.25: 64% 1.0: 60% 0.5: 50% 0.25: 25% 0: 0% Capstone Test (will consist of FOUR quizzes and the FINAL test grade will be an average of the four quizzes). Capstone #1: Review of Chapters 1-3 Capstone #2: Review of Chapter 4 Capstone #3: Review of

More information

Chapter 2--Norms and Basic Statistics for Testing

Chapter 2--Norms and Basic Statistics for Testing Chapter 2--Norms and Basic Statistics for Testing Student: 1. Statistical procedures that summarize and describe a series of observations are called A. inferential statistics. B. descriptive statistics.

More information

MEA DISCUSSION PAPERS

MEA DISCUSSION PAPERS Inference Problems under a Special Form of Heteroskedasticity Helmut Farbmacher, Heinrich Kögel 03-2015 MEA DISCUSSION PAPERS mea Amalienstr. 33_D-80799 Munich_Phone+49 89 38602-355_Fax +49 89 38602-390_www.mea.mpisoc.mpg.de

More information

Comparing Two Means using SPSS (T-Test)

Comparing Two Means using SPSS (T-Test) Indira Gandhi Institute of Development Research From the SelectedWorks of Durgesh Chandra Pathak Winter January 23, 2009 Comparing Two Means using SPSS (T-Test) Durgesh Chandra Pathak Available at: https://works.bepress.com/durgesh_chandra_pathak/12/

More information

Analysis of Vaccine Effects on Post-Infection Endpoints Biostat 578A Lecture 3

Analysis of Vaccine Effects on Post-Infection Endpoints Biostat 578A Lecture 3 Analysis of Vaccine Effects on Post-Infection Endpoints Biostat 578A Lecture 3 Analysis of Vaccine Effects on Post-Infection Endpoints p.1/40 Data Collected in Phase IIb/III Vaccine Trial Longitudinal

More information

3. Model evaluation & selection

3. Model evaluation & selection Foundations of Machine Learning CentraleSupélec Fall 2016 3. Model evaluation & selection Chloé-Agathe Azencot Centre for Computational Biology, Mines ParisTech chloe-agathe.azencott@mines-paristech.fr

More information

Screening (Diagnostic Tests) Shaker Salarilak

Screening (Diagnostic Tests) Shaker Salarilak Screening (Diagnostic Tests) Shaker Salarilak Outline Screening basics Evaluation of screening programs Where we are? Definition of screening? Whether it is always beneficial? Types of bias in screening?

More information

MATH 183 Test 2 Review Problems

MATH 183 Test 2 Review Problems MATH 183 Test 2 Review Problems 1. A Nationwide Math Assessment Test has a maximum score of 100 points. To be proficient, a student needs a minimum score of 40 points. Suppose you want a 98% confidence

More information

MBios 478: Systems Biology and Bayesian Networks, 27 [Dr. Wyrick] Slide #1. Lecture 27: Systems Biology and Bayesian Networks

MBios 478: Systems Biology and Bayesian Networks, 27 [Dr. Wyrick] Slide #1. Lecture 27: Systems Biology and Bayesian Networks MBios 478: Systems Biology and Bayesian Networks, 27 [Dr. Wyrick] Slide #1 Lecture 27: Systems Biology and Bayesian Networks Systems Biology and Regulatory Networks o Definitions o Network motifs o Examples

More information

Power & Sample Size. Dr. Andrea Benedetti

Power & Sample Size. Dr. Andrea Benedetti Power & Sample Size Dr. Andrea Benedetti Plan Review of hypothesis testing Power and sample size Basic concepts Formulae for common study designs Using the software When should you think about power &

More information

Stroke-free duration and stroke risk in patients with atrial fibrillation: simulation using a Bayesian inference

Stroke-free duration and stroke risk in patients with atrial fibrillation: simulation using a Bayesian inference Asian Biomedicine Vol. 3 No. 4 August 2009; 445-450 Brief Communication (Original) Stroke-free duration and stroke risk in patients with atrial fibrillation: simulation using a Bayesian inference Tomoki

More information

Introduction to Meta-analysis of Accuracy Data

Introduction to Meta-analysis of Accuracy Data Introduction to Meta-analysis of Accuracy Data Hans Reitsma MD, PhD Dept. of Clinical Epidemiology, Biostatistics & Bioinformatics Academic Medical Center - Amsterdam Continental European Support Unit

More information

Exam 4 Review Exercises

Exam 4 Review Exercises Math 160: Statistics Spring, 2014 Toews Exam 4 Review Exercises Instructions: Working in groups of 2-4, first review the goals and objectives for this exam (listed below) and then work the following problems.

More information

Statistical inference provides methods for drawing conclusions about a population from sample data.

Statistical inference provides methods for drawing conclusions about a population from sample data. Chapter 14 Tests of Significance Statistical inference provides methods for drawing conclusions about a population from sample data. Two of the most common types of statistical inference: 1) Confidence

More information

WDHS Curriculum Map Probability and Statistics. What is Statistics and how does it relate to you?

WDHS Curriculum Map Probability and Statistics. What is Statistics and how does it relate to you? WDHS Curriculum Map Probability and Statistics Time Interval/ Unit 1: Introduction to Statistics 1.1-1.3 2 weeks S-IC-1: Understand statistics as a process for making inferences about population parameters

More information

ROC Curves (Old Version)

ROC Curves (Old Version) Chapter 545 ROC Curves (Old Version) Introduction This procedure generates both binormal and empirical (nonparametric) ROC curves. It computes comparative measures such as the whole, and partial, area

More information

dataset1 <- read.delim("c:\dca_example_dataset1.txt", header = TRUE, sep = "\t") attach(dataset1)

dataset1 <- read.delim(c:\dca_example_dataset1.txt, header = TRUE, sep = \t) attach(dataset1) Worked examples of decision curve analysis using R A note about R versions The R script files to implement decision curve analysis were developed using R version 2.3.1, and were tested last using R version

More information

Module 28 - Estimating a Population Mean (1 of 3)

Module 28 - Estimating a Population Mean (1 of 3) Module 28 - Estimating a Population Mean (1 of 3) In "Estimating a Population Mean," we focus on how to use a sample mean to estimate a population mean. This is the type of thinking we did in Modules 7

More information

PROFILE SIMILARITY IN BIOEQUIVALENCE TRIALS

PROFILE SIMILARITY IN BIOEQUIVALENCE TRIALS Sankhyā : The Indian Journal of Statistics Special Issue on Biostatistics 2000, Volume 62, Series B, Pt. 1, pp. 149 161 PROFILE SIMILARITY IN BIOEQUIVALENCE TRIALS By DAVID T. MAUGER and VERNON M. CHINCHILLI

More information

Outline of Part III. SISCR 2016, Module 7, Part III. SISCR Module 7 Part III: Comparing Two Risk Models

Outline of Part III. SISCR 2016, Module 7, Part III. SISCR Module 7 Part III: Comparing Two Risk Models SISCR Module 7 Part III: Comparing Two Risk Models Kathleen Kerr, Ph.D. Associate Professor Department of Biostatistics University of Washington Outline of Part III 1. How to compare two risk models 2.

More information

Question Sheet. Prospective Validation of the Pediatric Appendicitis Score in a Canadian Pediatric Emergency Department

Question Sheet. Prospective Validation of the Pediatric Appendicitis Score in a Canadian Pediatric Emergency Department Question Sheet Prospective Validation of the Pediatric Appendicitis Score in a Canadian Pediatric Emergency Department Bhatt M, Joseph L, Ducharme FM et al. Acad Emerg Med 2009;16(7):591-596 1. Provide

More information

Creative Commons Attribution-NonCommercial-Share Alike License

Creative Commons Attribution-NonCommercial-Share Alike License Author: Brenda Gunderson, Ph.D., 05 License: Unless otherwise noted, this material is made available under the terms of the Creative Commons Attribution- NonCommercial-Share Alike 3.0 Unported License:

More information

Bayesian hierarchical modelling

Bayesian hierarchical modelling Bayesian hierarchical modelling Matthew Schofield Department of Mathematics and Statistics, University of Otago Bayesian hierarchical modelling Slide 1 What is a statistical model? A statistical model:

More information