Classification in Karyometry: Performance Testing and Prediction Error
Analytical and Quantitative Cytopathology and Histopathology, Tutorial Article


ANALYTICAL AND QUANTITATIVE CYTOPATHOLOGY AND HISTOPATHOLOGY
An Official Periodical of The International Academy of Cytology and the Italian Group of Uropathology

TUTORIAL ARTICLE

Classification in Karyometry: Performance Testing and Prediction Error

Peter H. Bartels, Ph.D., and Hubert G. Bartels, M.S.I.E.

Classification plays a central role in quantitative histopathology. Success is expressed in terms of the accuracy of prediction for the classification of future data points and an estimate of the prediction error. The prediction error is affected by the chosen procedure, e.g., the use of a training set of data points, a validation set, an independent test set, the sample size and the learning curve of the classification algorithm. For small samples, procedures such as the jackknife, the leave-one-out and the bootstrap are recommended in order to arrive at an unbiased estimate of the true prediction error. All of these procedures rest on the assumption that the data set used to derive a classification rule is representative of the diagnostic categories involved. It is this assumption that, in quantitative histopathology, has to be carefully verified before a clinically generally valid classification procedure can be claimed. (Anal Quant Cytopathol Histopathol 2013;35:181-188)

Keywords: classification, histopathology, karyometry.

From the College of Optical Sciences and Arizona Cancer Center, University of Arizona, Tucson, Arizona, U.S.A. Dr. P. Bartels is Professor Emeritus. Mr. H. Bartels is Applications Programmer, Senior. This work was supported in part by grant PO1 CA from the National Institutes of Health, Bethesda, Maryland, and a gift from Michael Lewis, Los Angeles, California. Address correspondence to: Peter H. Bartels, Ph.D., Arizona Cancer Center, University of Arizona, 1515 North Campbell Avenue, P.O. Box, Tucson, Arizona, U.S.A. (hubertbartels@msn.com). Financial Disclosure: The authors have no connection to any companies or products mentioned in this article.

Classification of nuclei, lesions, and patients plays a central role in quantitative histopathologic studies. There is a rich literature on classification procedures, on the training of classification algorithms, and on the testing of their performance. There is the comprehensive collection of seminal studies in the engineering field edited by Agrawala.1 There are the classical texts by Fukunaga2 and by Duda and Hart.3 There have been extensive studies of the behavior of classification algorithms. Computer simulations and Monte Carlo studies have led to a thorough understanding of the underlying processes. Most authors in the field agree, though, that there is no general theory guiding method development. The existing procedures and recommendations are essentially based on heuristics. Much of the literature, particularly on error estimation, requires a rather advanced background in mathematics and statistics for a reader to appreciate the recommendations for practical applications.4-6 However, the correct practical use of classification algorithms offered in software packages is fairly straightforward.7,8

Karyometry presents some particular challenges to the development, evaluation and application of classification procedures. In many instances the validity of certain assumptions underlying classification procedures is questionable: clinical materials by their very nature rarely offer entirely homogeneous populations. The idea of a cohort of patients, even when matched for certain anamnestic variables, as offering samples from a single stochastic source, in the jargon of the pertinent literature, is at best an approximation. It is difficult to state with confidence a priori at what size a truly representative sample for a diagnostic category has been attained. An analysis of nuclear populations usually involves thousands of nuclei. Here again, though, the presence of subpopulations of different phenotypes, with often subtle differences in karyometric characteristics, raises questions concerning the homogeneity of the clinical samples.

There are 3 assumptions underlying any discussion of classifier development and performance: first, that the sample used for the training of a classification algorithm is truly representative of its class. Second, the data points, i.e., nuclei, lesions or patients, are assumed to be true random samples. The researcher assembling the data sets must under no circumstances exert any judgment, or preselect nuclei or patients. Third, it is assumed that the data for each category are drawn from a single distribution, as mentioned above, from a single stochastic source. This assumption implies that the population from which the objects are taken is homogeneous.

For the elements of the training or test sets to be random samples from a homogeneous population, though, is by itself not yet enough. The elements in a training set should also be fully representative of the population in general. The literature calls this the requirement that they fully cover the problem space. This is a condition that in karyometry is often hard to attain or even to verify a priori. Given the notable variability in any biologic entity, one might have to assemble a data set of considerable size in order to make it fairly representative. Unless one has a fully representative data set for the classifier, the system is not really trained to finality and the true prediction error is hard to approximate. Exactly what sample size is required to achieve full representation depends on the task at hand. Practical experience suggests that equality of the apparent error from the training set and the prediction error from the test set indicates that full representation has been reached. It is not unusual in karyometry to find this to be true at sample sizes of a few hundred nuclei.

It is the objective of this article to provide guidance for the practical use of classification procedures and to explain the underlying rationale.

Basic Concepts

The basic process begins with the assembly of 2 data sets, representing the diagnostic categories to be distinguished by a decision rule. A search is conducted for characteristics, or features, which differ in value between the 2 diagnostic categories. The set of selected features is called a feature vector. Each feature vector in karyometry represents a nucleus, a lesion, or a patient. In the following these shall be referred to as objects or as data points. The feature vectors for the 2 samples to be discriminated are submitted to a classification algorithm.
The algorithm derives a decision rule. That rule typically is a linear combination of feature values. Computed for a single data point, it results in a score. The score value is compared to a threshold: if it exceeds the threshold, the data point is assigned to one diagnostic category; if it is less than the threshold value, it is assigned to the other diagnostic category. The decision rule is applied to the data sets, and the proportion of correctly assigned objects is determined. The result is presented as a classification matrix. The rows represent the true diagnostic category for data points of known label. The columns present the assignments made by the classification algorithm. Table I shows an example. There are objects assigned to their correct category and there are objects that have been misclassified, i.e., assigned to the incorrect category. The correct recognition rate, or overall accuracy (the number of correctly assigned objects divided by the total sample size), here would be 81.4%. The estimated classification error would be 18.6%. In this example the distinguishing features did not completely separate objects from the two diagnostic categories.

Table I  Classification Matrix for Nuclei

True diagnostic category    Assigned Class A    Assigned Class B    Sample size
Class A                     (84%)               62 (16%)
Class B                     (23%)               185 (77%)
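As a minimal illustration of the procedure just described, the following Python sketch computes a linear score for each object, assigns each object to a category by comparing the score to a threshold, and tabulates the resulting classification matrix and overall accuracy. The data, feature weights and threshold are made-up placeholders, not values from this article:

    import numpy as np

    # Hypothetical data: two samples of feature vectors (rows = objects, columns =
    # karyometric features); values, weights and threshold are illustrative only.
    rng = np.random.default_rng(0)
    class_a = rng.normal(0.0, 1.0, size=(50, 3))   # objects of known Class A
    class_b = rng.normal(1.0, 1.0, size=(50, 3))   # objects of known Class B

    weights = np.array([0.8, 0.5, 0.3])            # linear combination of feature values
    threshold = 1.0                                # assumed decision threshold

    def assign(objects):
        # Compute the score for each object and assign Class B if it exceeds the threshold.
        return np.where(objects @ weights > threshold, "B", "A")

    # Classification matrix: rows = true category, columns = assignment by the rule.
    matrix = {(true, pred): int(np.sum(assign(objs) == pred))
              for true, objs in (("A", class_a), ("B", class_b))
              for pred in ("A", "B")}
    correct = matrix[("A", "A")] + matrix[("B", "B")]
    total = len(class_a) + len(class_b)
    print(matrix)
    print("overall accuracy: %.1f%%" % (100.0 * correct / total))
    print("estimated classification error: %.1f%%" % (100.0 - 100.0 * correct / total))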

The misclassified objects are referred to as classification errors. At this point one might add a feature, or delete a feature that does not carry a notable weight. Then one would run the classification algorithm again to see whether a better distinction, with a lower error rate, could be attained. This brings us to an important concept in classification methodology: the estimation of an error rate.

Classifier Performance: Estimation of Error Rates

In the basic procedure shown above, objects used in the development of the decision rule were also involved in the estimation of the decision rule's performance. However, the rule may have been fitted specifically to the data set from which it was derived. The result might be optimistic. It could not be expected to be as favorable when the rule is applied to new, independent objects not involved in the rule's derivation. The result of the procedure may, in principle, be biased. This error rate is known as the apparent error rate (E_app). The bias resulting in an optimistic outcome causes the error rate to be lower than the rate that would be expected in the application of the rule to unknown, new objects from the same diagnostic categories.8 The rate at which the decision rule would classify any new, independent objects is called the generalization error rate, or the true prediction error rate (E_true). The reason for the bias in the apparent error rate is that the samples used to derive the result may not have been fully representative of their categories. If they had been, the application to new, independent objects would yield the same misclassification rate as for the original data sets. This, however, is rarely the case for biologic materials. The conventional wisdom, according to which the procedure leads to bias, is generally accepted. Using the data sets from the formulation of the decision rule to estimate the classifier error rate is known as resubstitution, and the classification error E_app as the resubstitution or reclassification error.

The Training Set/Test Set Procedure

A common method to avoid the resubstitution error and to obtain an unbiased estimate of the true prediction error is the training set/test set procedure. The original data sets are partitioned. Often, 50% of the objects from category A and 50% of the objects from category B are used to derive a decision rule, as a training set. The other 50% of each category are used as an independent test set. The decision rule is then applied to the test set. The classification result from the test set is free of bias. The recommendation is to report only the result from the test set.

In karyometry the clinical materials representing diagnostic categories usually are a set of nuclei from each case and a set of cases from each diagnostic category. Typical values would be 100 nuclei per case, and from 10 to 50 cases per diagnostic category. This would allow training sets of a minimum of 500 nuclei from 5 cases or 2,500 nuclei from 25 cases. The partition into training and test sets should be made at the case level. One should not randomly select, e.g., every other nucleus to be assigned to the training or test set, as this would not result in an independent sample for the test set. The results are now presented by two classification matrices: one for the training set, and one for the test set. It is to be expected that the overall accuracy 1 - E_test is somewhat lower than that from the training set.
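A brief sketch of the case-level partitioning recommended above, assuming hypothetical case identifiers and a 50/50 split; the identifiers and the split fraction are illustrative only:

    import random

    rng = random.Random(1)

    # Hypothetical study: 20 cases per diagnostic category, 100 nuclei per case.
    cases_a = ["A%02d" % i for i in range(20)]
    cases_b = ["B%02d" % i for i in range(20)]

    def split_cases(case_ids, train_fraction=0.5):
        # Assign whole cases (never individual nuclei) to the training or the test set,
        # so that the test set remains independent of the training set.
        ids = list(case_ids)
        rng.shuffle(ids)
        n_train = int(round(train_fraction * len(ids)))
        return ids[:n_train], ids[n_train:]

    train_a, test_a = split_cases(cases_a)
    train_b, test_b = split_cases(cases_b)
    # All 100 nuclei of a case follow their case into the training or the test set.
    print("training cases:", sorted(train_a + train_b))
    print("test cases:", sorted(test_a + test_b))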
Practical experience from the pattern recognition literature suggests a decrease of < 15%. If the classification error increases by more than that, one might want to reexamine the selected feature set.

It has been customary, finally, to apply the decision rule to the combined training set and test set, thus getting an estimate of the classification error on a larger sample size. This, of course, also involves resubstitution and may introduce bias and provide too optimistic a result. Any resubstitution has been criticized and its use discouraged. The categorical rejection of a classification procedure involving resubstitution is, though, not always justified. The resubstitution bias decreases with increasing sample size. The apparent error induced by the training increases and asymptotically approximates the true prediction error. The test set error decreases in a similar manner, approximating the true prediction error with increasing sample size, as reflected in the classifier's learning curve. This is seen in Figure 1. The relationship between the apparent prediction error and the generalized, true prediction error as a function of sample size is demonstrated by the learning curve of a classifier. The learning curve of a classifier usually takes the form of a power law function.9 The estimated apparent prediction error becomes monotonically less optimistic with increasing sample size. For large samples the training error becomes equal to the true prediction error because the samples for both diagnostic categories have become fully representative of their populations.

Figure 1  Apparent error and test set error of a classifier as a function of sample size.

For the test set error the opposite trend is true. It decreases with increasing sample size and asymptotically approximates the true prediction error. For large samples both the apparent error and the test set error leave only a negligible bias. The distinction between apparent error and true prediction error is dropped altogether for large samples in the so-called one-shot approach.10

Sample size thus plays an important role in assessing classification errors. The literature on classification methodology considers samples of 10,000 as very large, and samples in the range of 1,000s as intermediate. Samples of < 500 in size are generally considered small in the engineering literature. The question then becomes, what is a large sample in karyometry? The heuristic rule here, for a multivariate analysis, is that a sample comprising 10 times as many objects as there are variables is accepted as a large sample, for which resubstitution would not be optimistically biased.

The asymptotic approximation of the test set error to the true prediction error as a function of sample size is closely related to the learning curve of the classifier. The learning curve follows a power law and has the form

    E_test = E_true + C / n^x

The constant C and the exponent x are task specific. They reflect the dispersion of the test set data and how many data points would be needed to have a representative sample. With increasing sample size n the second term in the sum goes to zero and the true prediction error remains. The exponent x affects the sample size at which a certain difference from the true prediction error, say 1% or so, is reached. The value of x is slightly larger than 1.00, but it has a notable influence on the effective sample size: for n = 200 and x = 1, n^x = 200, but for x = 1.05, n^x = 260, and for x = 1.10, n^x = 339.

In karyometry the classification of nuclear populations practically always involves several hundred, and often thousands of, nuclei. The remaining resubstitution error then is very small. Assessing the overall efficacy of a chemopreventive agent on a treated and a control cohort, even in an exploratory study, may involve about 20 patients per diagnostic category, i.e., there would be 2,000 nuclei per diagnostic category, and typically from 4 to 8 variables. In the development of a criterion indicating risk for the development of an aggressive type of lesion, one could expect a larger number of patients, i.e., up to 10,000 nuclei recorded and evaluated. The decision rule typically involves from 3 to 8 variables at most. In both instances, the resubstitution error may be negligible.
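To make the influence of the exponent x concrete, the following short sketch evaluates the power-law learning curve for a few sample sizes; the values of E_true, C and x are assumed for illustration and are not fitted values from any study:

    # Power-law learning curve E_test(n) = E_true + C / n**x, with illustrative constants.
    E_true = 0.15    # assumed asymptotic (true) prediction error
    C = 4.0          # assumed task-specific constant
    for x in (1.00, 1.05, 1.10):
        for n in (50, 200, 1000, 5000):
            e_test = E_true + C / n**x
            print("x=%.2f  n=%5d  n**x=%7.0f  E_test=%.3f" % (x, n, n**x, e_test))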

The assessment of nuclei from a single case or a small number of cases invariably provides only small samples. This is certainly also so when classification involves nuclear subpopulations of different phenotype, as they occur in single cases. Attention then needs to be paid to possible bias. There is always some uncertainty as to what sample size would be representative of a diagnostic category. Weiss and Kulikowski7 point out that the sample size ensuring full representation may not be unreasonably high and may, in fact, sometimes be surprisingly small.

One knows the size of the test set. For any classifier the quality of an error estimate depends directly on the number of objects in the test set, and the accuracy of the estimate, on randomly drawn, independent test objects, follows a binomial distribution. This means we know not only the error rate estimated from the test set but also how far off it can be: the highest error rate to be expected is given by the confidence limit of the binomial distribution, and there is only a low percentage chance that the error rate is higher. Thus, for example, in a situation where an error rate of 32% had been estimated on a nuclear population of 2,000 nuclei, the true error rate is likely not higher than 32% + 1.04% = 33.04%. The standard error is defined as

    Standard Error = {E * (1 - E) / n}^1/2 = {0.32 * (1 - 0.32) / 2,000}^1/2 = {1.088 * 10^-4}^1/2 = 0.0104, or 1.04%

i.e., the sample size is quite adequate for an estimate of the true prediction error. For an estimate of the percentage of nuclei of a certain phenotype in a single case, with n = 100 and the same estimated error rate of 32%, the result would be 32% + 4.7% = 36.7%.

Even for samples of intermediate size the difference between apparent error and true prediction error may not be substantial, though. If the classification error from the training set matches the classification error from the test set, it is an indication that the decision rule has not been bent to fit the training set. It indicates that both the training set and the test set are fully representative of the diagnostic categories at hand and that the apparent error has become practically equal to the true prediction error.

In the classification of cases, sample sizes tend to be small. To obtain an unbiased estimate of the true prediction error, the partitioning of the data sets into 50% training and 50% test sets is common. The recommendations in the literature tend toward partitionings of 2/3 training set objects and 1/3 test set objects, or even 90% versus 10%. The reasoning is that this makes more information available for the definition of the decision rule.

Use of a Validation Set

The concern with optimistic bias in the apparent error is justified in the classification of cases. It has been extended to the test set used in the training set/test set procedure described above, where the result from the test set is generally accepted as unbiased. The argument here is, though, that the training of the system may involve observing the result obtained from the test set. Making adjustments to the decision rule therefore actually involves the test set in the process and impairs complete independence. In response, a procedure is recommended in which the data sets are partitioned into 3 components: a training set and a validation set for the development of the decision rule, and then application of that rule to a truly independent test set (Figure 2).4,6

Figure 2  Partitioning of data into a training set, a validation set and a test set.
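Before turning to small-sample methods, here is a short sketch of the standard-error calculation worked above, showing how the binomial standard error of a test set error estimate shrinks with the number of independent test objects; the 32% error rate and the two sample sizes are taken from the example in the text:

    import math

    def error_rate_standard_error(error_rate, n):
        # Binomial standard error of an error rate estimated from n independent,
        # randomly drawn test objects.
        return math.sqrt(error_rate * (1.0 - error_rate) / n)

    estimated_error = 0.32                      # error rate estimated from the test set
    for n in (2000, 100):                       # nuclear population vs. a single case
        se = error_rate_standard_error(estimated_error, n)
        print("n=%4d  standard error=%.4f  estimate + 1 SE = %.1f%%"
              % (n, se, 100.0 * (estimated_error + se)))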
Classification of Intermediate and/or Small Samples

The estimate of the generalization, or true prediction, error is a function of the sample size of the training set. In many studies one might expect that the size of a fully representative sample would have to be prohibitively large or might just not be available. The classifier, therefore, would have to be tested on a sample of smaller size for an estimate of the true prediction error. For the classification of medium and small size data sets, a number of methods are recommended. The training set/test set sequence, with a partitioning into just 2 data sets, is expanded. In the jackknife procedure the preferred choice is a 5-fold to 10-fold cross-validation.11 In the leave-one-out procedure a partitioning of a sample of n data points into training sets of size n - 1 is set up. The bootstrap method is recommended especially for small samples, which are resampled with replacement up to several hundred times, followed by the same number of training set/test set procedures.

The Jackknife Procedure

In this procedure the data sets are divided into a number of subsets. All but 1 are used to derive decision rules versus the left-out subset. Since there are several subsets, there are several decision rules and several estimates of an error rate. The true error rate is estimated as the average of them. Its reliability is ascertained by the standard deviation of this set of estimates. The partitioning is shown in Figure 3. For a sample of 300 objects and a partitioning into 5 subsets, the training set thus has 240 entries.

Figure 3  Jackknife procedure with a 5-fold cross-validation.

The risk that one encounters with any decrease in the size of the training set is that one may end up at a portion of the learning curve of the classifier well below the asymptotic approach to the true prediction error. This would result in an overestimate of the true prediction error, as shown in Figure 4. One needs to consider the trade-off between the number of partitions and the effect of working with a smaller sample size for the training set. For the example above, the 5-fold partition would result in a training set of 240 objects. This would provide an acceptable approximation to the true prediction error. But if for the same task one had only 80 samples to begin with, the training set would have only 64 objects. This may very well place the problem in a range of the learning curve where the slope still keeps it well below the accuracy given by the true prediction error.4 This can be seen when drawing a line vertically from the abscissa at the sample size of 64 to the learning curve. If one chose to employ a 10-fold cross-validation, this would further reduce the size of the effective training set, and it might lead to an overestimate of the true prediction error. Just how much the true prediction error would be overestimated depends not only on the available sample size but also on the slope of the learning curve. This situation may not become a problem in the processing of nuclear populations. However, when the data points represent cases, it is a very relevant consideration.

Figure 4  Overestimation of the true prediction error resulting from a sample size so small that the learning curve of the classifier is not yet approximating the true prediction error.

The Leave-One-Out Procedure

In this procedure, with a sample size of n, training is done on n - 1 data points versus the 1 data point left out. This process is then repeated until every data point has been left out once, i.e., one develops n classification rules. This is a very labor-intensive procedure. Its single advantage is that for a small sample the leave-one-out method is the only one to provide an unbiased estimate of the true prediction error. For this estimate one uses the average error over the n rules. There are n such estimates, so one obtains an estimate of the variance of the true prediction error as well. The procedure has a number of disadvantages. There is the need to develop n decision rules, and the finally resulting estimate of the true prediction error is based on an average over the n classifiers, so which rule does it refer to?
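A compact sketch of the k-fold (jackknife-style) partitioning described above, with leave-one-out obtained as the special case k = n. The one-dimensional data and the simple threshold rule are placeholders chosen only to make the example self-contained; they are not the authors' classifier:

    import random

    rng = random.Random(2)

    # Placeholder data: one feature value and a class label per object, 150 per category.
    data = [(rng.gauss(0.0, 1.0), "A") for _ in range(150)] + \
           [(rng.gauss(1.5, 1.0), "B") for _ in range(150)]

    def train_and_test(train, test):
        # Toy decision rule: threshold halfway between the two class means of the
        # training subset; returns the error rate on the left-out test subset.
        mean_a = sum(x for x, y in train if y == "A") / sum(1 for _, y in train if y == "A")
        mean_b = sum(x for x, y in train if y == "B") / sum(1 for _, y in train if y == "B")
        threshold = 0.5 * (mean_a + mean_b)
        return sum(1 for x, y in test if ("B" if x > threshold else "A") != y) / len(test)

    def cross_validated_error(objects, k):
        # Partition the shuffled data into k subsets; each subset is left out once
        # as the test set while the other k - 1 subsets form the training set.
        shuffled = list(objects)
        rng.shuffle(shuffled)
        folds = [shuffled[i::k] for i in range(k)]
        errors = [train_and_test([obj for j, fold in enumerate(folds) if j != i for obj in fold],
                                 folds[i])
                  for i in range(k)]
        mean = sum(errors) / k
        sd = (sum((e - mean) ** 2 for e in errors) / (k - 1)) ** 0.5
        return mean, sd       # averaged error estimate and its standard deviation

    print("5-fold estimate (mean, sd):", cross_validated_error(data, k=5))
    print("leave-one-out estimate (mean, sd):", cross_validated_error(data, k=len(data)))

In an actual study the toy threshold rule would be replaced by the training of the classification algorithm on the k - 1 retained subsets and the application of its decision rule to the left-out subset.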
The Bootstrap Procedure

For very small samples, say of 30 objects or so, finding the best estimate of the prediction error may be difficult. Traditionally for such samples the leave-one-out method has been used. It is unbiased, but the variance of the prediction error estimate is quite high for small samples. In such small samples the variance has a dominating influence on the result. Thus, if one had a low variance procedure, for small samples even some bias might be accepted. The bootstrap method offers such a procedure. It was introduced in 1983 by Efron.12 Bootstrapping is a resampling method. If one has a sample of n cases, one resamples it by drawing n objects, with replacement. In sampling with replacement, an object may be drawn twice or even multiple times for a resample, while other objects are not drawn at all. Sampling theory shows that in such a procedure, on average, 63.2% of the original objects are drawn for a resample and 36.8% are not drawn; the objects not drawn are used as the test set. The resampling may be done a very large number of times, such as 200 times. The resamples are treated as independent data sets in the subsequent (200 or so) training set/test set procedures. The so-called .632 procedure results in a low variance estimate for the prediction error, but it has an optimistic bias.
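A sketch of the resampling step just described, again with placeholder data and a toy decision rule. It draws bootstrap resamples with replacement, tests each derived rule on the objects not drawn, and averages the resulting error estimates; it illustrates the mechanics only and is not the specific .632 estimator referred to above:

    import random

    rng = random.Random(3)

    # Placeholder small sample: 30 objects with one feature value and a class label each.
    sample = [(rng.gauss(0.0, 1.0), "A") for _ in range(15)] + \
             [(rng.gauss(1.5, 1.0), "B") for _ in range(15)]

    def error_rate(train, test):
        # Toy decision rule: threshold halfway between the training class means
        # (the max(1, ...) guards against a resample that misses one class entirely).
        n_a = max(1, sum(1 for _, y in train if y == "A"))
        n_b = max(1, sum(1 for _, y in train if y == "B"))
        mean_a = sum(x for x, y in train if y == "A") / n_a
        mean_b = sum(x for x, y in train if y == "B") / n_b
        threshold = 0.5 * (mean_a + mean_b)
        return sum(1 for x, y in test if ("B" if x > threshold else "A") != y) / len(test)

    n = len(sample)
    estimates = []
    for _ in range(200):                                    # 200 bootstrap resamples
        drawn = [rng.randrange(n) for _ in range(n)]        # draw n indices with replacement
        resample = [sample[i] for i in drawn]               # ~63.2% of distinct objects
        held_out = [sample[i] for i in range(n) if i not in set(drawn)]   # ~36.8% not drawn
        if held_out:                                        # objects not drawn form the test set
            estimates.append(error_rate(resample, held_out))

    print("bootstrap estimate of the prediction error: %.3f" % (sum(estimates) / len(estimates)))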

Conclusions

The engineering literature emphasizes that it is useful to have rules discriminating between objects from different classes, but that the real challenge is to have rules that allow an accurate prediction for new objects in the future. This is certainly true; generally valid prediction rules are more difficult to derive. In karyometry, though, even the ability to distinguish accurately between objects in 2 data sets plays an important role, e.g., in the assessment of grade or, in general, in an accurate quantitative assessment of a lesion. And it is by no means evident that even such a classification rule is simple and straightforward to derive. In karyometry, the resubstitution error might be the least of the problems to be worried about, but sample inhomogeneity and inadequate representation can pose big problems.

The literature on automated pattern recognition, machine vision and classification lists the representative sample as a prime requirement for the development of a classification rule. In technology applications this requirement is readily satisfied, but in karyometry it remains a major problem. In prospective karyometric studies in which material from one and the same institution is used, careful control of processing, i.e., of fixation, sectioning, and staining, is possible. When materials collected at different institutions are used, or even when prepared from archival material at the same institution, the assumption of having a representative sample may need to be examined. Karyometric characteristics generally do not have a clearly perceived visual appearance. Histopathologic preparations looking convincingly the same as others may, in their digital representation, be distinctly different. Consequently, one may well find agreement between the classification success from a training set and a test set for the clinical materials in a given study. But one may also find that the classification rule fails when applied to a set of histopathologic slides from a different institution, even when those were prepared according to a well-defined protocol. The differences may be subtle. Training on the new material may show the same karyometric features as effective, but the coefficients in the discriminant function may be a little different.

A representative sample for a classification algorithm therefore may not evolve until material subject to all small differences in preparation has been included in the training. A clinically generally valid classification rule must be expected to emerge from an iterative process.

The problem of a representative sample becomes particularly relevant when the original set of cases is small. One has to remember here that nuclei from a given diagnostic category, and especially from a given grade of a lesion, form not crisp but fuzzy sets. What one has as a representative sample is a small number of members of a fuzzy set. Classification methodologies for small samples have been developed to allow estimates of the prediction error that balance the variance of the estimate against its bias. Bootstrapping is a good example of this. A small sample is resampled with replacement, possibly hundreds of times, until a large number of these small data sets has been generated. They allow a precise estimate of the prediction error. The generated bootstrapped data sets have the exact stochastic properties of the original small sample. The derived classification rule is generally valid, but only for additional materials with the same stochastic properties as the original data set. Since the original material is a small sample of a fuzzy set, it is unlikely that the originally included cases represent the diagnostic category in its entirety. The problem of representation, therefore, is doubly relevant when the original material is but a small sample. Again, this is rarely a problem in technology applications, but in histopathologic materials it is to be taken into serious consideration.

References

1. Agrawala AK (editor): Machine Recognition of Patterns. New York, IEEE Press
2. Fukunaga K: Introduction to Statistical Pattern Recognition. New York, Academic Press
3. Duda R, Hart P: Pattern Classification and Scene Analysis. New York, Wiley
4. Hastie T, Tibshirani R, Friedman J: The Elements of Statistical Learning. New York, Springer, 2001
5. Michie D, Spiegelhalter DJ, Taylor CC: Machine Learning, Neural and Statistical Classification. New York, Ellis Horwood, 1994
6. Schuermann J: Pattern Classification. New York, John Wiley, 1996
7. Weiss SM, Kulikowski CA: Computer Systems That Learn: Classification and Prediction Methods. San Mateo, California, Morgan Kaufmann, 1990
8. James M: Classification Algorithms. New York, Wiley, 1985
9. Duda RO, Hart PE, Stork DG: Pattern Classification. Second edition. New York, John Wiley, 2000
10. Henery RJ: Methods for comparison: Train and test. In Machine Learning, Neural and Statistical Classification. Edited by D Michie, DJ Spiegelhalter, CC Taylor. New York, Ellis Horwood, 1994
11. McLachlan GJ: Discriminant Analysis and Statistical Pattern Recognition. New York, John Wiley
12. Efron B: Estimating the error rate of a prediction rule: Some improvements on cross-validation. J Amer Statist Assoc 1983;78


More information

Chapter 21 Multilevel Propensity Score Methods for Estimating Causal Effects: A Latent Class Modeling Strategy

Chapter 21 Multilevel Propensity Score Methods for Estimating Causal Effects: A Latent Class Modeling Strategy Chapter 21 Multilevel Propensity Score Methods for Estimating Causal Effects: A Latent Class Modeling Strategy Jee-Seon Kim and Peter M. Steiner Abstract Despite their appeal, randomized experiments cannot

More information

Chapter 17 Sensitivity Analysis and Model Validation

Chapter 17 Sensitivity Analysis and Model Validation Chapter 17 Sensitivity Analysis and Model Validation Justin D. Salciccioli, Yves Crutain, Matthieu Komorowski and Dominic C. Marshall Learning Objectives Appreciate that all models possess inherent limitations

More information

Supplementary materials for: Executive control processes underlying multi- item working memory

Supplementary materials for: Executive control processes underlying multi- item working memory Supplementary materials for: Executive control processes underlying multi- item working memory Antonio H. Lara & Jonathan D. Wallis Supplementary Figure 1 Supplementary Figure 1. Behavioral measures of

More information

Asignificant amount of information systems (IS) research involves hypothesizing and testing for interaction

Asignificant amount of information systems (IS) research involves hypothesizing and testing for interaction Information Systems Research Vol. 18, No. 2, June 2007, pp. 211 227 issn 1047-7047 eissn 1526-5536 07 1802 0211 informs doi 10.1287/isre.1070.0123 2007 INFORMS Research Note Statistical Power in Analyzing

More information

Results & Statistics: Description and Correlation. I. Scales of Measurement A Review

Results & Statistics: Description and Correlation. I. Scales of Measurement A Review Results & Statistics: Description and Correlation The description and presentation of results involves a number of topics. These include scales of measurement, descriptive statistics used to summarize

More information

The Classification Accuracy of Measurement Decision Theory. Lawrence Rudner University of Maryland

The Classification Accuracy of Measurement Decision Theory. Lawrence Rudner University of Maryland Paper presented at the annual meeting of the National Council on Measurement in Education, Chicago, April 23-25, 2003 The Classification Accuracy of Measurement Decision Theory Lawrence Rudner University

More information

Contents. What is item analysis in general? Psy 427 Cal State Northridge Andrew Ainsworth, PhD

Contents. What is item analysis in general? Psy 427 Cal State Northridge Andrew Ainsworth, PhD Psy 427 Cal State Northridge Andrew Ainsworth, PhD Contents Item Analysis in General Classical Test Theory Item Response Theory Basics Item Response Functions Item Information Functions Invariance IRT

More information

Application of Artificial Neural Networks in Classification of Autism Diagnosis Based on Gene Expression Signatures

Application of Artificial Neural Networks in Classification of Autism Diagnosis Based on Gene Expression Signatures Application of Artificial Neural Networks in Classification of Autism Diagnosis Based on Gene Expression Signatures 1 2 3 4 5 Kathleen T Quach Department of Neuroscience University of California, San Diego

More information

SLAUGHTER PIG MARKETING MANAGEMENT: UTILIZATION OF HIGHLY BIASED HERD SPECIFIC DATA. Henrik Kure

SLAUGHTER PIG MARKETING MANAGEMENT: UTILIZATION OF HIGHLY BIASED HERD SPECIFIC DATA. Henrik Kure SLAUGHTER PIG MARKETING MANAGEMENT: UTILIZATION OF HIGHLY BIASED HERD SPECIFIC DATA Henrik Kure Dina, The Royal Veterinary and Agricuural University Bülowsvej 48 DK 1870 Frederiksberg C. kure@dina.kvl.dk

More information

Neuropsychology, in press. (Neuropsychology journal home page) American Psychological Association

Neuropsychology, in press. (Neuropsychology journal home page) American Psychological Association Abnormality of test scores 1 Running head: Abnormality of Differences Neuropsychology, in press (Neuropsychology journal home page) American Psychological Association This article may not exactly replicate

More information

The Myers Briggs Type Inventory

The Myers Briggs Type Inventory The Myers Briggs Type Inventory Charles C. Healy Professor of Education, UCLA In press with Kapes, J.T. et. al. (2001) A counselor s guide to Career Assessment Instruments. (4th Ed.) Alexandria, VA: National

More information

CHAPTER - 6 STATISTICAL ANALYSIS. This chapter discusses inferential statistics, which use sample data to

CHAPTER - 6 STATISTICAL ANALYSIS. This chapter discusses inferential statistics, which use sample data to CHAPTER - 6 STATISTICAL ANALYSIS 6.1 Introduction This chapter discusses inferential statistics, which use sample data to make decisions or inferences about population. Populations are group of interest

More information

MODEL-BASED CLUSTERING IN GENE EXPRESSION MICROARRAYS: AN APPLICATION TO BREAST CANCER DATA

MODEL-BASED CLUSTERING IN GENE EXPRESSION MICROARRAYS: AN APPLICATION TO BREAST CANCER DATA International Journal of Software Engineering and Knowledge Engineering Vol. 13, No. 6 (2003) 579 592 c World Scientific Publishing Company MODEL-BASED CLUSTERING IN GENE EXPRESSION MICROARRAYS: AN APPLICATION

More information

Goodness of Pattern and Pattern Uncertainty 1

Goodness of Pattern and Pattern Uncertainty 1 J'OURNAL OF VERBAL LEARNING AND VERBAL BEHAVIOR 2, 446-452 (1963) Goodness of Pattern and Pattern Uncertainty 1 A visual configuration, or pattern, has qualities over and above those which can be specified

More information

COMPARING PLS TO REGRESSION AND LISREL: A RESPONSE TO MARCOULIDES, CHIN, AND SAUNDERS 1

COMPARING PLS TO REGRESSION AND LISREL: A RESPONSE TO MARCOULIDES, CHIN, AND SAUNDERS 1 ISSUES AND OPINIONS COMPARING PLS TO REGRESSION AND LISREL: A RESPONSE TO MARCOULIDES, CHIN, AND SAUNDERS 1 Dale L. Goodhue Terry College of Business, MIS Department, University of Georgia, Athens, GA

More information

Regression Discontinuity Analysis

Regression Discontinuity Analysis Regression Discontinuity Analysis A researcher wants to determine whether tutoring underachieving middle school students improves their math grades. Another wonders whether providing financial aid to low-income

More information