Diagnostic Measurement: Theory, Methods, and Applications
Jonathan Templin
The University of Georgia

Workshop Overview
Workshop Sessions:
- Session 1: Conceptual Foundations of Diagnostic Measurement
- Session 2: Diagnostic Modeling: Psychometric Models
- Session 3: Diagnostic Modeling in Educational and Psychological Settings
- Session 4: Advanced Concepts
- Session 5: Estimation of Diagnostic Classification Models with Mplus

Session 1: Conceptual Foundations of Diagnostic Measurement

Session Overview
- Key definitions
- Conceptual example
- Example uses of diagnostic models in education
  - Classroom use (formative assessment)
  - Large scale testing use (summative assessment)
- Why diagnostic models should be used instead of traditional classification methods
- Concluding remarks
Session 1: Conceptual Foundations of Diagnostic Measurement
DEFINITIONS

What are Diagnoses?
- The word and meaning of diagnosis are common in language
- The meaning of diagnosis is deeply ingrained in our society
- Seldom merits a second thought

Definitions
American Heritage Dictionary definition of diagnosis (p. 500):
- Generally
  (a) A critical analysis of the nature of something
  (b) The conclusion reached by such analysis
- Medicine
  (a) The act or process of identifying or determining the nature and cause of a disease or injury through evaluation of a patient's history, examination, and review of laboratory data
  (b) The opinion derived from such an evaluation
- Biology
  (a) A brief description of the distinguishing characteristics of an organism, as for taxonomic classification

Diagnosis: Defined
- A diagnosis is the decision that is being made based on information
- Within psychological testing, providing a test score gives the information that is used for a diagnosis
  - BUT, the score is not the diagnosis
- For this workshop, a diagnosis is by its nature discrete
  - Classification
Day to Day Diagnosis
- Decisions happen every day:
  - Decide to wear a coat or bring an umbrella
  - Decide to study
  - Decide what to watch on TV tonight
- In all cases:
  - Information (or data) is collected
  - Inferences are made from data based on what is likely to be the true state of reality

Diagnosis (Formalized)
- In diagnostic measurement, the procedures of diagnosis are formalized:
  - We make a set of observations
    - Usually through a set of test questions
  - Based on these questions we make a decision as to the underlying state (or states) of a person
  - The decision is the diagnosis

Diagnosis (Formalized)
- Diagnoses featured in this workshop:
  - Educational Measurement
    - The competencies (skills) that a person has or has not mastered
    - Leads to possible tailored instruction and remediation
  - Psychiatric Assessment
    - The DSM criteria that a person meets
    - Leads to a broader diagnosis of a disorder

Workshop Terminology
- Respondents: The people from whom behavioral data are collected
  - Behavioral data are considered test item responses for this workshop
  - Not limited to only item responses
- Items: Test items used to classify/diagnose respondents
- Diagnostic Assessment: The method used to elicit behavioral data
- Attributes: Unobserved dichotomous characteristics underlying the behaviors (i.e., diagnostic status)
  - Latent variables linked to behaviors through diagnostic classification models
- Psychometric Models: Models used to analyze item response data
  - Diagnostic Classification Models (DCMs) is the name of the models used to obtain classifications/diagnoses
Diagnostic Classification Model Names
- Diagnostic classification models (DCMs) have been called many different things:
  - Skills assessment models
  - Cognitive diagnosis models
  - Cognitive psychometric models
  - Latent response models
  - Restricted (constrained) latent class models
  - Multiple classification models
  - Structured located latent class models
  - Structured item response theory

Psychometric Soapbox
- DCMs are but a small set of tools that must be adapted for a common purpose
  - Part of a methodological toolbox that is used to classify respondents
  - Should also include content experts and end users of the diagnoses
- DCMs link empirical observations and respondents' characteristics
  - The models are only as good as underlying theories

Session 1: Conceptual Foundations of Diagnostic Measurement
CONCEPTUAL EXAMPLE

Diagnostic Modeling Concepts
- Imagine that an elementary teacher wants to test basic math ability
- Using traditional psychometric approaches, the teacher could estimate an ability or test score for each respondent
  - Classical Test Theory: Assign respondents a test score
  - Item Response Theory: Assign respondents a latent (scaled) score
- By knowing each respondent's score, the students are ordered along a continuum
Traditional Psychometrics
[Figure: continuum labeled "Mathematics Ability at UGA" running from Low to High, with Jonathan Templin, Allan Cohen, and Seock Ho Kim placed along it]

Traditional Psychometrics
- What results is a (weak) ordering of respondents
  - Ordering is called weak because of error in estimates
  - Seock Ho Kim > Allan Cohen > Jonathan Templin
- Questions that traditional psychometrics cannot answer:
  - Why is Jonathan so low? How can we get him some help?
  - How much ability is enough to pass?
  - How much is enough to be proficient?
  - What math skills have the students mastered?

Multiple Dimensions of Ability: Ability from a Diagnostic Perspective
- As an alternative, we could have expressed math ability as a set of basic skills
[Figure: two columns, "Has Mastered" and "Has Not Mastered", each listing Addition, Subtraction, Multiplication, and Division]
Multiple Dimensions of Ability
- The set of skills represents the multiple dimensions of elementary mathematics ability
- Other psychometric approaches have been developed for multiple dimensions:
  - Classical Test Theory: Scale Subscores
  - Multidimensional Item Response Theory (MIRT)
- Yet, issues in application have remained:
  - Reliability of estimates is often poor for most practical test lengths
  - Dimensions are often very highly correlated
  - Large samples are needed to calibrate item parameters in MIRT

DCMs as an Alternative
- DCMs do not assign a single score
  - Instead, a profile of mastered attributes is given to respondents
  - Multidimensional models
- DCMs provide respondents valuable information with fewer data demands
  - Higher reliability than comparable IRT/MIRT models
  - Complex item structures possible

Path Diagram of Traditional Psychometrics
[Figure: path diagram linking Basic Math Ability and the attributes Addition, Subtraction, Multiplication, and Division to test items such as (4x2)+3]

Psychometric Model Comparison
- Using Traditional Models:
  - Has a score of 20
  - Has a 75%, a grade of C
  - Is in the 60th percentile of math
  - Scored above the cut off, passes math
- Using Diagnostic Models:
  - Is proficient using addition
  - Is proficient using subtraction
  - Should work on Multiplication
  - Should work on Division
DCM Specifics
- Let's expand on the idea of the basic math test
- Possible items may be: 2 + 3 - 1, .../2, (4 x 2) + 3
- Not all items measure all attributes
  - A Q matrix is used to indicate the attributes measured by each item
  - This is the factor pattern matrix that assigns the loadings in confirmatory factor analysis

The Q Matrix
- An example of a Q matrix using our math test
[Table: Q matrix with one row per item and the columns Add, Sub, Mult, Div; the 0/1 entries are not recoverable]

Respondent Profiles
- Respondents are characterized by profiles specifying which attributes have been mastered
- Numeric values are arbitrary, but for our purposes:
  - Mastery given a 1
  - Non mastery given a 0
- For example:
[Table: profile of Respondent A on Add, Sub, Mult, Div; entries not recoverable]
- Respondent profile estimates are in the form of probabilities of mastery

Expected Responses to Items
[Tables: the Q matrix (items by Add, Sub, Mult, Div), a respondent mastery table for four respondents (Add, Sub, Mult, Div), and the resulting probabilities of answering items #1, #2, and #3; entries not recoverable]
- By knowing which attributes are measured by each item and which attributes have been mastered by each respondent, we can determine the items that will likely be answered correctly by each respondent
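The idea of combining a Q matrix with respondent profiles to anticipate responses can be sketched in a few lines of Python. Everything below is illustrative: the Q matrix entries, the item labels, the respondent profiles, and the all-or-nothing (conjunctive) rule are assumptions made for the sketch, not values taken from the slides.

```python
# Hypothetical Q matrix: one row per item, one column per attribute.
ATTRIBUTES = ["Add", "Sub", "Mult", "Div"]
Q_MATRIX = {
    "2+3-1": [1, 1, 0, 0],          # assumed to measure addition and subtraction
    "division item": [0, 0, 0, 1],  # assumed to measure division only
    "(4x2)+3": [1, 0, 1, 0],        # assumed to measure addition and multiplication
}

# Hypothetical respondent profiles: 1 = mastered, 0 = not mastered.
PROFILES = {
    "Respondent A": [1, 1, 0, 0],
    "Respondent B": [1, 1, 1, 1],
    "Respondent C": [0, 0, 0, 0],
}


def expected_correct(profile, q_row):
    """Conjunctive rule (an assumption): all measured attributes must be mastered."""
    return all(alpha >= q for alpha, q in zip(profile, q_row))


def expected_pattern(profile):
    """Map each item to whether a correct response is expected for this profile."""
    return {item: expected_correct(profile, q_row) for item, q_row in Q_MATRIX.items()}


if __name__ == "__main__":
    for name, profile in PROFILES.items():
        print(name, expected_pattern(profile))
```

A real DCM replaces the True/False expectation with a probability of a correct response for each profile, which is what the item parameters introduced in Session 2 provide.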
DCM Scoring and Score Reporting
[Figure from Templin (2007)]

DCM Conceptual Summary
- DCMs focus on WHY a respondent is not performing well as compared to only focusing on WHO
- The models define the chances of a correct response based on the respondent's attribute profile
- Many models have been created, ranging in complexity
  - In Session #2 we discuss a general DCM
  - The general model subsumes all other latent variable DCMs
- The model predicts how respondents will answer each item
  - Also allows for classification/diagnoses based on item responses

How do DCMs Produce Diagnoses?
Diagnostic decisions come from comparing observed behaviors to two parts of the psychometric model:
1. Item/variable information (item parameters): the Measurement Model
   - How respondents with different diagnostic profiles perform on a set of test items
   - Helps determine which items are better at discriminating between respondents with differing diagnostic profiles
2. Respondent information pertaining to the base rate or proportion of respondents with diagnoses in the population: the Structural Model
   - Provides frequency of diagnosis (or diagnostic profile)
   - Helps validate the plausibility of the observed diagnostic profiles

Conceptual Model Mapping in DCMs
[Figure: mapping between the Measurement Model and the Structural Model]
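One way to picture how item information and base rates combine into a diagnosis is Bayes' rule over the possible attribute profiles. The sketch below is a minimal illustration under that assumption; the function name `class_posteriors`, the single-attribute setup, and every probability in it are hypothetical.

```python
def class_posteriors(responses, item_probs, base_rates):
    """Posterior probability of each attribute profile given scored item responses.

    item_probs[profile][i] is P(correct on item i | profile)   (measurement part)
    base_rates[profile]    is P(profile) in the population     (structural part)
    """
    joint = {}
    for profile, prior in base_rates.items():
        likelihood = 1.0
        for x, p in zip(responses, item_probs[profile]):
            # Multiply in p for a correct answer, 1 - p for an incorrect one.
            likelihood *= p if x == 1 else (1.0 - p)
        joint[profile] = prior * likelihood
    total = sum(joint.values())
    return {profile: value / total for profile, value in joint.items()}


# Hypothetical one-attribute test with three items.
ITEM_PROBS = {(0,): [0.20, 0.30, 0.25], (1,): [0.90, 0.80, 0.85]}
BASE_RATES = {(0,): 0.6, (1,): 0.4}

posterior = class_posteriors([1, 1, 0], ITEM_PROBS, BASE_RATES)

if __name__ == "__main__":
    for profile, prob in posterior.items():
        print(profile, round(prob, 3))
```

The respondent would be classified into the profile with the largest posterior probability; changing the base rates changes the posterior even when the responses stay the same.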
Session 1: Conceptual Foundations of Diagnostic Measurement
USES OF DIAGNOSTIC MODEL RESPONDENT ESTIMATES

DCMs In Practice
- To demonstrate the potential benefits of using DCMs, I present a brief example of their use
  - From Henson & Templin (2008); Templin & Henson (2008)
- An urban county in a southern state wanted to improve students' End Of Course (EOC) scores on the state's 10th grade Algebra 2 exam
- A benchmark test was given in the middle of a semester
  - Formative test designed to help teachers focus instruction
- Respondents and their teachers received DCM estimates
  - Used these to characterize student proficiency levels with respect to 5 state specified goals for Algebra 2 (standards)

DCM Study
- The benchmark test was developed for use with a DCM
  - Characteristics of the test were fixed via standard setting
- Five attributes were measured
  - Mastery was defined as meeting the proficient level for each attribute
  - Attributes were those most heavily represented in the EOC exam
- Respondents then took the EOC exam
  - 50 item test: Score of 33+ considered proficient
- Benchmark estimates linked to EOC estimates

Descriptive Statistics of Attribute Patterns
- First, the basic descriptive statistics for each possible pattern
  - What we expect a respondent with a given attribute pattern to score on the EOC test
- Next slides describe how DCMs can help guide instruction
Gain by Mastery of Each Attribute
- The difference in test score between masters and non masters of an attribute can be quantified
- Correlation between attribute and EOC score indicates amount of gain in EOC score by mastery of attribute
- Note: 50 item test

Pathways to Proficiency
- DCMs can be used to form a learning path a respondent can follow that would most quickly lead to proficiency on the EOC test
- The pathway tells the respondent and the teacher the sequence of attributes to learn next that will provide the biggest increase in test score
- This mechanism may help teachers decide what to focus on when teaching a course
  - Balances time spent on instruction with impact on test score
- Provides a practical implementation of DCMs in today's classroom testing environment

Proficiency Road Map
[Figure: proficiency road map]

Fast Path to Proficiency
[Figure: fast path to proficiency]
Harder Paths to Proficiency
- Some paths are less efficient at increasing EOC test scores
[Figure: harder paths to proficiency]

Session 1: Conceptual Foundations of Diagnostic Measurement
IMPLICATIONS FOR LARGE SCALE TESTING PROGRAMS

DCM Characteristics
- As mentioned previously, DCMs provide a higher level of reliability for their estimates than comparable IRT or CTT models (Templin & Bradshaw, in press)
  - It is easier to place a respondent into one of two groups (mastery or non mastery) than to locate them on a scale
- Such characteristics allow DCMs to potentially change how large scale testing is conducted
  - Most EOC type tests are for classification
    - Proficiency standards
  - DCMs provide direct link to classification
    - And direct access to standards

Theoretical Reliability Comparison
[Figure: reliability by number of items for DCM and IRT. Accompanying table: number of items needed by DCM and by IRT at each reliability level; the surviving IRT entries are 34, 48, and 77 items]
Uni and Multidimensional Comparison
[Figure: reliability of DCM and IRT by dimensional model (1-Dimension, 2-Dimension, BiFactor)]

DCMs for an EOC Test
[Figure: reliability by number of items for 2 to 5 category DCMs, with the IRT (PL) reliability ρ_θ = .87 as reference. 2 Category: 24 items; 3 Category: 42 items; 4 Category: 50 items; 5 Category: 54 items]

Ramifications for Use of DCMs
- Reliable measurement of multiple dimensions is possible
  - Two attribute DCM application to empirical data: reliabilities of 0.95 and 0.90 (compared to 0.72 and 0.70 for IRT)
- Multidimensional proficiency standards
  - Respondents must demonstrate proficiency on multiple areas to be considered proficient for an overall content domain
  - Teaching to the test would therefore represent covering more curricular content to best prepare respondents
- Shorter unidimensional tests
  - Two category unidimensional DCM application to empirical data: test needed only 24 items to have the same reliability as IRT with 73 items

The Paradox of DCMs
- DCMs are often pitched as models that allow for measurement of fine grained skills (e.g., Rupp & Templin, 2008)
- Paradox of DCMs:
  - Sacrifice fine grained measurement of a latent trait for only several categories
  - Increased capacity to measure ability multidimensionally
When Are DCMs Appropriate?
- Which situations lend themselves more naturally to such diagnosis?
  - The purpose of the diagnostic assessment matters most
- DCMs provide classifications directly
  - Optimally used when tests are used for classification:
    - EOC Tests
    - Licensure/certification
    - Clinical screening
    - College entrance
    - Placement tests
- DCMs can be used as coarse approximations to continuous latent variable models
  - i.e., EOG example (2 to 5 category levels shown)

Session 1: Conceptual Foundations of Diagnostic Measurement
BENEFITS OF DCMS OVER TRADITIONAL CLASSIFICATION METHODS

Previous Methods for Classification
- Making diagnoses on the basis of test responses is not a new concept
  - Classical test theory
  - Item response theory
  - Factor analysis
- Process is a two stage procedure:
  1. Scale respondents
  2. Find appropriate cut scores
- Classify respondents based on cut scores

Problems with the Two Stage Approach
- The two stage procedure allows for multiple sources of error to affect the results
1. The latent variable scores themselves: estimation error
   - Uncertainty (i.e., standard errors) is typically not accounted for in the subsequent classification of respondents
   - The classification of respondents at different locations on the score continuum with multiple cut scores is differentially precise
   - Uncertainty of the latent variable scores varies as a function of the location of the score
Problems with the Two Stage Approach
2. Latent variable assumptions: that latent variable scores follow a continuous, typically normal, distribution
   - Estimates reflect the assumed distribution
   - Can introduce errors if the assumption is incorrect
3. Cut score determination
   - Standard setting is imprecise when used with general abilities
   - Standard setting methods can be directed to item performance
   - Some theoretical justification needs to be provided for such a cut off

Why are DCMs Better for Classification?
- The need for a two stage procedure to set cut scores for classification is eliminated when DCMs are used
  - Reduces classification error
- Quantifies and models the measurement error of the observable variables
  - Controlling for measurement error when producing the diagnosis
- DCMs have a natural and direct mechanism for incorporating base rate information into the analysis
  - No direct way to do so objectively in two stage procedures
- Item parameters provide information as to the diagnostic quality of each item
  - Not directly estimable in two stage approaches
  - Can be used to build tests that optimally separate respondents

Session 1: Conceptual Foundations of Diagnostic Measurement
CONCLUDING REMARKS

Session 1 Take home Points
- DCMs provide direct link between diagnosis and behavior
  - Provide diagnostic classifications directly
  - Diagnoses set by psychometric model parameters
- DCMs are effective if classification is the ultimate purpose
  - Reduce error by removing judgments necessary in two stage approach
- DCMs can be used in many contexts
  - Can be used to create highly informative tests
  - Can be used to measure multiple dimensions
- DCMs are in their infancy
  - Time will tell their effectiveness
Diagnostic Modeling: Psychometric Models
Session 2

Development of Psychometric Models
- Over the past several years, numerous DCMs have been developed
  - We will focus on DCMs that use latent variables for attributes
- Each DCM makes assumptions about how mastered attributes combine/interact to produce an item response
  - Compensatory/disjunctive/additive models
  - Non compensatory/conjunctive/non additive models
- With so many models, analysts have been unsure which model would best fit their purpose
  - Difficult to imagine all items following the same assumptions

General Models for Diagnosis
- Recent developments have produced very general diagnostic models
  - General Diagnostic Model (GDM; von Davier, 2005)
  - Loglinear Cognitive Diagnosis Model (LCDM; Henson, Templin, & Willse, 2009)
    - Focus of this session
- The general DCMs (GDM; LCDM) provide great flexibility
  - Subsume all other latent variable DCMs
  - Allow for both additive and non additive relationships between attributes and items
  - Sync with other psychometric models, allowing for greater understanding of the modeling process

Session Overview
- Background information
  - ANOVA models and the LCDM
  - Logits explained
- The LCDM
  - Parameter structure
  - One item demonstration
  - LCDM general form
- Linking the LCDM to other earlier developed DCMs
Notation Used Throughout Session
- Attributes: a = 1, ..., A
- Respondents: r = 1, ..., R
- Attribute Profiles: α_r = [α_r1, α_r2, ..., α_rA]
  - α_ra is 0 or 1
- Latent Classes: c = 1, ..., C
  - We have C = 2^A latent classes, one for each possible attribute profile
- Items: i = 1, ..., I
  - Restricted to dichotomous item responses (X_ri is 0 or 1)
- Q matrix: Elements q_ia for an item i and attribute a
  - q_ia is 0 or 1

Session 2: Diagnostic Modeling Psychometric Models
BACKGROUND INFORMATION: ANOVA MODELS

Background Information: ANOVA
- The LCDM models the probability of a correct response to an item as a function of the latent attributes of a respondent
- The latent attributes are categorical, meaning a respondent can have only a few possible statuses
  - Each status corresponds to a predicted probability of a correct response
[Figure: P(X = 1) plotted for α = 0 and α = 1]
- As such, the LCDM is very similar to an ANOVA model
  - Predicting a dependent variable as a function of the experimental group of a respondent

ANOVA Refresher
- As a refresher on ANOVA, let's imagine that we are interested in the factors that have an effect on work output (denoted by Y)
- We design a two factor study where work output may be affected by:
  - Lighting of the workplace: High or Low
  - Temperature: Cold or Warm
- This experimental design is known as a 2 Way ANOVA
ANOVA Model
- Here is the 2 x 2 Factorial design:

                      Low Lighting    High Lighting
  Cold Temperature
  Warm Temperature

ANOVA Model
- The ANOVA model allows us to test for the presence of:
  - A main effect associated with Temperature ($A_t$)
  - A main effect associated with Lighting ($B_l$)
  - An interaction effect associated with Temperature and Lighting ($(AB)_{tl}$)
- The ANOVA model for a respondent's work output is:
  $Y_{rtl} = \mu + A_t + B_l + (AB)_{tl} + e_{rtl}$

ANOVA with Dummy Coded Variables
- The ANOVA model can also be re written using two dummy coded variables $D_{rt}$ and $D_{rl}$
  - Becomes a linear model (i.e., regression model)
- Temperature: $D_{rt}$
  - $D_{rt} = 0$ for respondents in the cold temperature condition
  - $D_{rt} = 1$ for respondents in the warm temperature condition
- Lighting: $D_{rl}$
  - $D_{rl} = 0$ for respondents in the low lighting condition
  - $D_{rl} = 1$ for respondents in the high lighting condition

ANOVA with Dummy Coded Variables
- The ANOVA model then becomes:
  $Y_r = \beta_0 + \beta_t D_{rt} + \beta_l D_{rl} + \beta_{t*l} D_{rt} D_{rl} + e_r$

                                  D_rl = 0 (Low Lighting)    D_rl = 1 (High Lighting)
  D_rt = 0 (Cold Temperature)
  D_rt = 1 (Warm Temperature)
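The dummy-coded version of the 2 x 2 design can be made concrete with a short sketch. It assumes the usual regression form with an intercept, two main effects, and an interaction; the coefficient names (`b0`, `b_t`, `b_l`, `b_tl`) and their values are hypothetical and chosen only to show how each cell mean is built up.

```python
def predicted_output(d_temp, d_light, b0, b_t, b_l, b_tl):
    """Predicted work output for one cell of the 2 x 2 design.

    d_temp:  0 = cold, 1 = warm
    d_light: 0 = low lighting, 1 = high lighting
    """
    return b0 + b_t * d_temp + b_l * d_light + b_tl * d_temp * d_light


# Hypothetical coefficients (not from the slides).
BETAS = dict(b0=10.0, b_t=2.0, b_l=3.0, b_tl=1.5)

# One predicted mean per (temperature, lighting) cell.
cell_means = {
    (d_temp, d_light): predicted_output(d_temp, d_light, **BETAS)
    for d_temp in (0, 1)
    for d_light in (0, 1)
}

if __name__ == "__main__":
    for cell, mean in cell_means.items():
        print(cell, mean)
```

With these made-up numbers the reference cell (cold, low light) has mean 10, and the warm, high-light cell adds both main effects plus the interaction.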
ANOVA Effects Explained
- β_0 is the mean for the cold and low light condition (reference group)
  - The intercept
- β_t is the change of the mean when comparing cold to warm temperature for a business with low lights (Simple Main Effect)
- β_l is the change of the mean when comparing low to high lights for a business with a cold temperature (Simple Main Effect)
- β_t*l is the additional mean change that is not explained by the shift in temperature and the shift in lights, when both occur (2 Way Interaction)
- Respondents in the same condition have the same predicted value

ANOVA and the LCDM
- The ANOVA model and the LCDM take the same modeling approach
  - Predict a response using dummy coded variables
    - In the LCDM, the dummy coded variables are latent attributes
  - Using a set of main effects and interactions
    - Links attributes to item response
  - Where possible, we may look for ways to reduce the model
    - Removing non significant interactions and/or main effects

Differences Between LCDM and ANOVA
The LCDM and the ANOVA model differ in two ways:
- Instead of a continuous outcome such as work output, the LCDM models a function of the probability of a correct response
  - The logit of a correct response (defined next)
- Instead of observed factors as predictors, the LCDM uses discrete latent variables (the attributes being measured)
  - Attributes are given dummy codes (act as latent factors)
  - α_ra = 1 if respondent r has mastered attribute a
  - α_ra = 0 if respondent r has not mastered attribute a

Session 2: Diagnostic Modeling Psychometric Models
LOGITS EXPLAINED
Model Background
- Just as in IRT models, the LCDM models the log odds of a correct response conditional on a respondent's attribute pattern α_r
  - The log odds is called a logit:
    $\text{Logit}(X_{ri} = 1 \mid \alpha_r) = \log\left(\dfrac{P(X_{ri} = 1 \mid \alpha_r)}{1 - P(X_{ri} = 1 \mid \alpha_r)}\right)$
- The logit is used because the responses are binary
  - Items are either answered correctly (1) or incorrectly (0)
- The linear model is inappropriate for categorical data
  - Can lead to impossible predictions (i.e., probabilities greater than 1 or less than 0)

More on Logits
[Figure and table: probabilities and their corresponding logits]

From Logits to Probabilities
- Whereas logits are useful as they are unbounded continuous variables, categorical data analyses rely on estimated probabilities
- The inverse logit function converts the unbounded logit to a probability:
  $P(X_{ri} = 1 \mid \alpha_r) = \dfrac{\exp(\text{Logit}(X_{ri} = 1 \mid \alpha_r))}{1 + \exp(\text{Logit}(X_{ri} = 1 \mid \alpha_r))}$
  - This is also the form of an IRT model (and logistic regression)

Session 2: Diagnostic Modeling Psychometric Models
THE LCDM
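The logit and its inverse are simple enough to check numerically. The sketch below uses only the Python standard library; the function names `logit` and `inv_logit` are chosen for this illustration.

```python
import math


def logit(p):
    """Log odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def inv_logit(x):
    """Convert an unbounded logit back to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


if __name__ == "__main__":
    for p in (0.1, 0.25, 0.5, 0.75, 0.9):
        x = logit(p)
        print(f"p = {p:.2f}  logit = {x:+.3f}  back to p = {inv_logit(x):.2f}")
```

A probability of 0.5 maps to a logit of 0, probabilities above 0.5 give positive logits, and probabilities below 0.5 give negative logits, so any real-valued linear predictor maps back to a valid probability.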
Building the LCDM
- To demonstrate the LCDM, consider the item 2 + 3 - 1 = ? from our basic math example
  - The item measured addition (attribute 1) and subtraction (attribute 2)
  - Only attributes defined by the Q matrix are modeled for an item
- The LCDM provides the logit of a correct response as a function of the latent attributes mastered by a respondent:
  $\text{logit}(X_{ri} = 1 \mid \alpha_r) = \lambda_{i,0} + \lambda_{i,1,(1)} \alpha_{r1} + \lambda_{i,1,(2)} \alpha_{r2} + \lambda_{i,2,(1,2)} \alpha_{r1} \alpha_{r2}$

LCDM Explained
- logit(X_ri = 1) is the logit of a correct response to item i by respondent r
- λ_i,0 is the intercept
  - The logit for non masters of addition and subtraction
  - The reference group is respondents who have not mastered either attribute (α_r1 = 0 and α_r2 = 0)

LCDM Explained
- λ_i,1,(1) = main effect for addition (attribute 1)
  - The increase in the logit for mastering addition (in someone who has not also mastered subtraction)
- λ_i,1,(2) = main effect for subtraction (attribute 2)
  - The increase in the logit for mastering subtraction (in someone who has not also mastered addition)
- λ_i,2,(1,2) is the interaction between addition and subtraction (attributes 1 and 2)
  - Change in the logit for mastering both addition and subtraction

Understanding LCDM Notation
The LCDM item parameters have several subscripts:
- Subscript #1, i: the item to which parameters belong
- Subscript #2, e: the level of the effect
  - 0 is the intercept
  - 1 is the main effect
  - 2 is the two way interaction
  - 3 is the three way interaction
- Subscript #3, (a_1, ...): the attributes to which the effect applies
  - Same number of attributes listed as the number in Subscript #2
Session 2: Diagnostic Modeling Psychometric Models
LCDM: A NUMERICAL EXAMPLE

LCDM with Example Numbers
Imagine we obtained the following estimates for the simple math item:

  Parameter      Estimate   Effect Name
  λ_i,0          -2         Intercept
  λ_i,1,(1)       2         Addition Simple Main Effect
  λ_i,1,(2)       1         Subtraction Simple Main Effect
  λ_i,2,(1,2)     0         Addition/Subtraction Interaction

LCDM Predicted Logits and Probabilities

  α_1  α_2   LCDM Logit Function                                              Logit   Probability
  0    0     λ_i,0 + λ_i,1,(1)*(0) + λ_i,1,(2)*(0) + λ_i,2,(1,2)*(0)*(0)      -2      0.12
  0    1     λ_i,0 + λ_i,1,(1)*(0) + λ_i,1,(2)*(1) + λ_i,2,(1,2)*(0)*(1)      -1      0.27
  1    0     λ_i,0 + λ_i,1,(1)*(1) + λ_i,1,(2)*(0) + λ_i,2,(1,2)*(1)*(0)       0      0.50
  1    1     λ_i,0 + λ_i,1,(1)*(1) + λ_i,1,(2)*(1) + λ_i,2,(1,2)*(1)*(1)       1      0.73

[Figure: Logit Response Function and Probability Response Function for the example item, plotted over the possible attribute patterns (α1=0, α2=0), (α1=0, α2=1), (α1=1, α2=0), (α1=1, α2=1)]

LCDM Interaction Plots
- The LCDM interaction term can be investigated via plots
- No interaction: parallel lines for the logit
  - Compensatory RUM (Hartz, 2002)
[Figure: Logit Response Function, Logit(X=1|α) by α1 with separate lines for α2=0 and α2=1, and Probability Response Function, P(X=1|α) over the possible attribute patterns]
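The predicted logits and probabilities for the example item follow directly from the four estimates (intercept -2, addition main effect 2, subtraction main effect 1, interaction 0). The sketch below recomputes them for every attribute pattern; the constant and function names are chosen for this illustration.

```python
import math
from itertools import product

# Estimates for the simple math item, as given in the example.
LAMBDA_0 = -2.0    # intercept
LAMBDA_ADD = 2.0   # addition simple main effect
LAMBDA_SUB = 1.0   # subtraction simple main effect
LAMBDA_INT = 0.0   # addition/subtraction interaction


def lcdm_logit(a1, a2):
    """Logit of a correct response for mastery statuses a1 (addition) and a2 (subtraction)."""
    return LAMBDA_0 + LAMBDA_ADD * a1 + LAMBDA_SUB * a2 + LAMBDA_INT * a1 * a2


def probability(logit_value):
    """Inverse logit: convert a logit to a probability."""
    return math.exp(logit_value) / (1.0 + math.exp(logit_value))


# (a1, a2) -> (logit, probability) for all 2^2 = 4 attribute patterns.
table = {
    (a1, a2): (lcdm_logit(a1, a2), probability(lcdm_logit(a1, a2)))
    for a1, a2 in product((0, 1), repeat=2)
}

if __name__ == "__main__":
    for (a1, a2), (lg, pr) in table.items():
        print(f"alpha1={a1} alpha2={a2}  logit={lg:+.0f}  P(X=1)={pr:.2f}")
```

Because the interaction is zero, mastering addition raises the logit by the same amount (2) whether or not subtraction is mastered, which is what produces parallel lines in the logit plot.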
Strong Positive Interactions
- Positive interaction: over additive logit model
  - Conjunctive model (i.e., all or none)
  - DINA model (Haertel, 1989; Junker & Sijtsma, 1999)
[Figure: Logit Response Function, Logit(X=1|α) by α1 with separate lines for α2=0 and α2=1, and Probability Response Function, P(X=1|α) over the possible attribute patterns]

Strong Negative Interactions
- Negative interaction: under additive logit model
  - Disjunctive model (i.e., one or more)
  - DINO model (Templin & Henson, 2006)
[Figure: Logit Response Function and Probability Response Function, same layout as above]

Less Extreme Interactions
- Extreme interactions are unlikely in practice
- Below: positive interaction with positive main effects
[Figure: Logit Response Function and Probability Response Function, same layout as above]

Session 2: Diagnostic Modeling Psychometric Models
GENERAL FORM OF THE LCDM
More General Versions of the LCDM
- The LCDM is based on the General Diagnostic Model by von Davier (GDM; 2005)
  - The GDM allows for both categorical and continuous latent variables
- For items measuring more than two attributes, higher level interactions are possible
  - Difficult to estimate in practice
- The LCDM appears in the psychometric literature in a more general form
  - See Henson, Templin, & Willse (2009)

General Form of the LCDM
- The LCDM specifies the probability of a correct response as a function of a set of attributes and a Q matrix:
  $P(X_{ri} = 1 \mid \alpha_r) = \dfrac{\exp(\text{Logit}(X_{ri} = 1 \mid \alpha_r))}{1 + \exp(\text{Logit}(X_{ri} = 1 \mid \alpha_r))}$
- The term in the exponent is the logit we have been using all along, Logit(X_ri = 1 | α_r):
  $\text{Logit}(X_{ri} = 1 \mid \alpha_r) = \lambda_{i,0} + \sum_{a=1}^{A} \lambda_{i,1,(a)} \alpha_{ra} q_{ia} + \sum_{a=1}^{A-1} \sum_{b=a+1}^{A} \lambda_{i,2,(a,b)} \alpha_{ra} \alpha_{rb} q_{ia} q_{ib} + \cdots$
  - The terms are, in order: intercept, main effects, two-way interactions, higher interactions

Session 2: Diagnostic Modeling Psychometric Models
SUBSUMED MODELS

Previously Popular DCMs
- Because the advent of the GDM and LCDM has been fairly recent, other earlier DCMs are still in use
- Such DCMs are much more restrictive than the LCDM
  - Not discussed at length here
  - It is anticipated that the field will adapt to more general forms
- Each of these models can be fit using the LCDM
  - Fixing certain model parameters
- Shown for reference purposes
  - See Henson, Templin, & Willse (2009) for more detail
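To illustrate how the Q matrix restricts which effects enter an item's logit, here is a minimal sketch of an LCDM-style response function for an arbitrary number of attributes. The function name `lcdm_probability`, the use of a dictionary keyed by attribute-index tuples, and the example values are assumptions made for this illustration; they are not the estimation routine of any software package.

```python
import math
from itertools import combinations


def lcdm_probability(alpha, q_row, intercept, effects):
    """P(correct) for one item given an attribute profile.

    alpha:     list of 0/1 mastery indicators, one per attribute
    q_row:     list of 0/1 Q matrix entries for this item
    intercept: logit for respondents who mastered none of the measured attributes
    effects:   dict mapping a tuple of attribute indices, e.g. (0,) or (0, 1),
               to the main effect or interaction for that set of attributes
    """
    measured = [a for a, q in enumerate(q_row) if q == 1]
    logit_value = intercept
    for order in range(1, len(measured) + 1):
        for combo in combinations(measured, order):
            # An effect applies only if every attribute in it is mastered.
            if all(alpha[a] == 1 for a in combo):
                logit_value += effects.get(combo, 0.0)
    return 1.0 / (1.0 + math.exp(-logit_value))


# The simple math item: measures attributes 0 (addition) and 1 (subtraction).
Q_ROW = [1, 1, 0, 0]
EFFECTS = {(0,): 2.0, (1,): 1.0, (0, 1): 0.0}

if __name__ == "__main__":
    print(lcdm_probability([1, 1, 0, 0], Q_ROW, -2.0, EFFECTS))
    print(lcdm_probability([0, 0, 1, 1], Q_ROW, -2.0, EFFECTS))
```

Attributes with a 0 in the item's Q matrix row never contribute, so a respondent who has mastered only multiplication and division is treated like a non master on this item.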
Other DCMs with the LCDM
The Big 6 DCMs with latent variables:
- DINA (Deterministic Inputs, Noisy AND Gate): Haertel (1989); Junker and Sijtsma (1999)
- NIDA (Noisy Inputs, Deterministic AND Gate): Maris (1995)
- RUM (Reparameterized Unified Model): Hartz (2002)
- DINO (Deterministic Inputs, Noisy OR Gate): Templin & Henson (2006)
- NIDO (Noisy Inputs, Deterministic OR Gate): Templin (2006)
- C RUM (Compensatory Reparameterized Unified Model): Hartz (2002)

Other DCMs with the LCDM

                           Non compensatory Models                  Compensatory Models
  LCDM Parameters          DINA               NIDA          NC RUM    DINO               NIDO          C RUM
  Main Effects             Zero               Positive      Positive  Positive           Positive      Positive
  Interactions             Positive           Positive      Positive  Negative           Zero          Zero
  Parameter Restrictions   Across Attributes  Across Items  (none)    Across Attributes  Across Items  (none)

Adapted from: Rupp, Templin, and Henson (forthcoming, 2010)

Compensatory RUM (Hartz, 2002)
- No interactions in model
- No interaction: parallel lines for the logit
[Figure: Logit Response Function, Logit(X=1|α) by α1 with separate lines for α2=0 and α2=1, and Probability Response Function, P(X=1|α) over the possible attribute patterns]

DINA Model (Haertel, 1989; Junker & Sijtsma, 1999)
- Positive interaction: over additive logit model
- Highest interaction parameter is non zero
- All main effects (and lower interactions) zero
[Figure: Logit Response Function and Probability Response Function, same layout as above]
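The DINA and DINO restrictions are easiest to see for an item measuring two attributes. The sketch below is illustrative only: the parameter values are made up, and the DINO constraint is written for the two-attribute case, where setting the interaction to the negative of the common main effect gives every respondent with at least one mastered attribute the same logit.

```python
import math


def probability(logit_value):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-logit_value))


def two_attribute_probs(intercept, main_1, main_2, interaction):
    """P(correct) for each (alpha1, alpha2) pattern of a two-attribute item."""
    return {
        (a1, a2): probability(intercept + main_1 * a1 + main_2 * a2 + interaction * a1 * a2)
        for a1 in (0, 1)
        for a2 in (0, 1)
    }


# DINA-like item: main effects fixed to zero, positive interaction (hypothetical values).
dina = two_attribute_probs(intercept=-2.0, main_1=0.0, main_2=0.0, interaction=4.0)

# DINO-like item: equal main effects, interaction equal to minus the main effect.
dino = two_attribute_probs(intercept=-2.0, main_1=4.0, main_2=4.0, interaction=-4.0)

if __name__ == "__main__":
    for pattern in dina:
        print(pattern, f"DINA={dina[pattern]:.2f}", f"DINO={dino[pattern]:.2f}")
```

Under the DINA-like settings only respondents who have mastered both attributes have a high probability of a correct response; under the DINO-like settings mastering either attribute is enough.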
DINO Model (Templin & Henson, 2006)
- Negative interaction: under additive logit model
- All main effects equal
- Interaction terms are 1 sum of corresponding lower effects
[Figure: Logit Response Function, Logit(X=1|α) by α1 with separate lines for α2=0 and α2=1, and Probability Response Function, P(X=1|α) over the possible attribute patterns]

Session 2: Diagnostic Modeling Psychometric Models
CONCLUDING REMARKS

Session 2 Take Home Points
- The LCDM uses an ANOVA like approach to map latent attributes onto item responses
  - Uses main effects and interactions for each attribute
  - Uses a logit link function
- Multiple diagnostic models are subsumed by the LCDM

Diagnostic Modeling in Educational and Psychological Settings
Session 3
Session 3 Overview
- Examples of DCMs through applications
- Educational measurement: English proficiency
  - LCDM demonstration in practice
  - Sample results
  - Potential problems in analysis
- Psychological measurement: pathological gambling
  - Simplified LCDM (DINO model)
  - Demonstration of what is possible with DCMs

Session 3: Diagnostic Modeling in Educational and Psychological Settings
LARGE SCALE LANGUAGE ASSESSMENT USING THE LCDM

Introduction
- With the emphasis of today's academic environment on testing, the focus on formative assessment is growing
- Among possible formative settings, language assessment has received some attention (e.g., Buck & Tatsuoka, 1998; Jang, 2004; von Davier, 2005)
- The purpose of this study is to explore the possibility of using the LCDM for the evaluation of the Grammar section of the Examination for the Certificate of Proficiency in English (ECPE)
  - Also provides an example analysis using the LCDM

Examination for the Certificate of Proficiency in English (ECPE)
- The ECPE is a test developed and scored by the English Language Institute of the University of Michigan
- The ECPE was developed to measure advanced English ability in respondents for whom English is not their first language
- Analysis is for the grammar section of the test
  - 40 multiple choice items (28 items used in analysis)
  - 10 were non operational
  - 2 had difficulties greater than ...
Example Item from Grammar Section
An example written to resemble an item in the Grammar section of the ECPE is:
  I have always ______ snow.
  - to enjoy
  - enjoyed
  - enjoying
  - to enjoyed

Session 3: Diagnostic Modeling in Educational and Psychological Settings
ECPE ANALYSIS METHODS

Examinees and Data
- A total of 2922 examinees are used to analyze the ECPE Grammar section
- The average age of examinees was approximately 23 years old
- Approximately 50% of the examinees spoke Portuguese and an additional 31% spoke Spanish as a first language

Attributes Measured by Test
- Three attributes measured, representing knowledge of:
  - Morphosyntactic rules
  - Cohesive rules
  - Lexical rules
- The full LCDM was estimated using Mplus
  - Marginal maximum likelihood estimation
- Q matrix characteristics:
  - 19 items measuring only one attribute (simple structure)
  - 9 items measuring two attributes
  - 0 items measuring all three attributes
ECPE Q matrix
- Here are the entries for several items from the ECPE Q matrix
[Table: Q matrix entries by item for Morphosyntactic Rules, Cohesive Rules, and Lexical Rules; entries not recoverable]

Session 3: Diagnostic Modeling in Educational and Psychological Settings
LCDM RESULTS

LCDM Results
To further describe the parameters of the LCDM, several types of results will be presented:
- Model fit results
- Item parameter results
  - Inspection of interactions
  - Interpretation
- Structural parameter results
  - Implied attribute hierarchy
- Respondent estimates/classifications

Model Fit Results
- Overall model fit
  - Chi square not computable
  - AIC: ...; BIC: ...
    - Used to reduce model
- Bivariate model fit (Session 4)
  - Compares model predicted and observed frequencies of responses for all pairs of items
  - Of 378 item pairs, 45 had p values less than 0.01
  - Items most indicated:
    - Item 13 (9 of 45 pairs)
    - Item 4 (6 of 45 pairs)
    - Item 5 (6 of 45 pairs)
  - Indicates some items are not fit well by the model
    - We will ignore this and continue with the analysis as an example
29 Example Item: To demonstrate parameter interpretation, results from item 7 will be shown. Attributes measured: morphosyntactic rules (Attribute 1) and lexical rules (Attribute 3). Parameter estimates: [Table: Estimate, SE, and p-value for λ_7,0, λ_7,1,(1), λ_7,1,(3), and λ_7,2,(1,3)]
LCDM Intercepts: Estimated intercept: -0.106 (0.095). Indicates the logit of a correct response for a non-master of all attributes. Here, non-masters have an average probability of a correct response of exp(-0.106)/(1 + exp(-0.106)) = 0.47. The hypothesis test is not important: it tests whether non-masters have a probability of a correct response of 0.5. An intercept is problematic when very high: it becomes difficult to identify other parameters, and it indicates issues with the test, Q matrix, or attributes.
Higher Order Model Parameters: Interpretation of main effects and interactions proceeds sequentially. If interactions are present: examine the highest level of interaction; if significantly different from zero, leave in model; if not, the term can be omitted. If interactions are not present: examine how far the main effect is from zero.
Examining Interaction Parameters: 2-way interaction parameter: (0.144). P-value for the parameter was small (0.000): indicates the parameter is significantly different from zero; candidate to leave in model. Value indicates that there is an under-additive effect of mastering both attributes: mastery of one attribute is sufficient to have a high chance of getting the item correct.
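The intercept-to-probability conversion above can be checked with a few lines of Python. This is a minimal sketch: the function name `inv_logit` is ours, and the negative sign on 0.106 is inferred from the reported probability of 0.47 (a positive logit would give a probability above 0.5).

```python
import math

def inv_logit(x: float) -> float:
    """Convert a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

# Intercept for item 7; the sign is an inference from the reported 0.47.
intercept = -0.106
p_nonmaster = inv_logit(intercept)
print(round(p_nonmaster, 2))  # 0.47
```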
30 More on Interactions: The interaction pattern for this item indicates that mastery of morphosyntactic rules is key to answering correctly. Mastery of lexical rules helps, but not above that of mastery of morphosyntactic rules. For why this is the case, stay tuned. [Figures: Logit(X=1|α) for α1 = 0, 1 by α3 = 0, 1; P(X=1|α) by possible attribute pattern (α1, α3)]
Other Interactions: Of 9 interaction parameters, 3 were significantly different from zero. The 6 non-significant interactions are candidates to be removed from the model; of these, 4 had small main effects on one attribute, meaning the attribute is not highly related to the item response. This indicates that the Q matrix may be incorrect. Have to re-fit with a new Q matrix and look at information criteria (Session 4).
Interpreting Main Effects: When significant interactions are present, main effects cannot be easily interpreted. Sometimes called conditional main effects. Need to know the combination of attributes mastered to fully describe the item response function. Main effects in the LCDM have an added concern: the lower bound is zero (for monotonicity); p-values are inaccurate as they approach zero.
ECPE Item 7 Lexical Main Effect: Because of the significant interaction, interpretation is conditional. When Morphosyntactic Rules have not been mastered, the lexical main effect λ_7,1,(3) is the increase in logit for respondents who have mastered Lexical Rules over respondents who are non-masters. [Figure: P(X=1|α) by possible attribute pattern (α1, α3)]
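To make the conditional interpretation concrete, here is a small sketch of an LCDM item response function for an item measuring attributes 1 and 3. All parameter values are hypothetical, chosen only to show an under-additive (negative) interaction; they are not the ECPE estimates, and the function names are ours.

```python
import math

def inv_logit(x):
    """Convert a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def lcdm_logit(alpha1, alpha3, intercept, main1, main3, interaction):
    """Logit of a correct response for an item measuring attributes 1 and 3."""
    return (intercept + main1 * alpha1 + main3 * alpha3
            + interaction * alpha1 * alpha3)

# Hypothetical values (not the ECPE estimates): a large main effect for
# attribute 1, a smaller one for attribute 3, and a negative interaction.
params = dict(intercept=-0.1, main1=2.0, main3=1.0, interaction=-0.9)

table = {}
for a1 in (0, 1):
    for a3 in (0, 1):
        logit = lcdm_logit(a1, a3, **params)
        table[(a1, a3)] = inv_logit(logit)
        print(f"alpha1={a1}, alpha3={a3}: logit={logit:.2f}, "
              f"P(X=1)={table[(a1, a3)]:.2f}")
```

With these values, mastering attribute 3 adds 1.0 to the logit when attribute 1 is not mastered, but only 0.1 when it is, which is the kind of pattern the slides describe for item 7.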
31 ECPE Item 7 Morphosyntactic Main Effect: Because of the significant interaction, interpretation is conditional. When Lexical Rules have not been mastered, the morphosyntactic main effect λ_7,1,(1) is the increase in logit for respondents who have mastered Morphosyntactic Rules over respondents who are non-masters. [Figure: P(X=1|α) by possible attribute pattern (α1, α3)]
General Modeling Tips: High-level interactions are difficult to estimate in most samples. More than 2-way interactions may not be possible. Modeling strategy: try all interactions; if the model does not converge, limit to only 2-way interactions; remove non-significant interactions from the model. If all interactions and main effects for an attribute are close to zero, the entry for the attribute in the Q matrix can be removed. Double-check with AIC/BIC, as the hypothesis test is approximate.
Attribute Pattern Probabilities: Base rate pattern of profiles mastered in sample indicates an attribute hierarchy: Lexical → Cohesive → Morphosyntactic. Implications for Item 7: cannot have morphosyntactic without lexical. Suggests information about second language acquisition.
Example Respondent Estimates: Respondent estimates are probabilities of mastery for each attribute. Shown for 5 example respondents. Test score given to provide comparison. [Table: Respondent by Total Score, Morphosyntactic, Cohesive, Lexical]
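As a rough illustration of where per-attribute mastery probabilities can come from, the sketch below applies Bayes' rule to a latent class model: each profile's base rate is multiplied by the likelihood of the observed responses, normalized, and then summed over the profiles in which an attribute is mastered. This is the generic latent class calculation, not necessarily the exact procedure behind the slide's table; the two-attribute setup, base rates, and item probabilities are all hypothetical.

```python
def classify(responses, base_rates, item_probs):
    """Return posterior profile probabilities and per-attribute mastery."""
    joint = {}
    for profile, eta in base_rates.items():
        likelihood = 1.0
        for x, p in zip(responses, item_probs[profile]):
            likelihood *= p if x == 1 else 1.0 - p
        joint[profile] = eta * likelihood
    total = sum(joint.values())
    posterior = {profile: value / total for profile, value in joint.items()}
    n_attributes = len(next(iter(base_rates)))
    mastery = [
        sum(prob for profile, prob in posterior.items() if profile[a] == 1)
        for a in range(n_attributes)
    ]
    return posterior, mastery

# Hypothetical base rates and P(X_i = 1 | profile) for three items.
base_rates = {(0, 0): 0.4, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.3}
item_probs = {
    (0, 0): [0.2, 0.2, 0.2],
    (0, 1): [0.2, 0.8, 0.8],
    (1, 0): [0.8, 0.2, 0.8],
    (1, 1): [0.8, 0.8, 0.9],
}

posterior, mastery = classify([1, 1, 0], base_rates, item_probs)
print([round(m, 3) for m in mastery])
```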
32 Session 3 Diagnostic Modeling in Educational and Psychological Settings: CONCLUDING REMARKS EDUCATIONAL MEASUREMENT
Educational Measurement Wrap Up: Demonstrated results from the LCDM when applied to English language assessment. Investigated model fit: very important, as of yet not well understood in DCMs. Described item parameter estimates: interpreting interactions/main effects; modeling strategy. Described structural parameter estimates: useful for understanding latent variables measured by test. Described respondent parameter estimates: normally these help understand the knowledge state of a respondent; the attribute hierarchy here limits utility of information.
Session 3 Diagnostic Modeling in Educational and Psychological Settings: UNDERSTANDING PATHOLOGICAL GAMBLING
Gambling Application Overview: Study of pathological gambling: DSM criteria for pathological gambling; common methods for assessment; how diagnostic models could help. Psychometric model: formulating the LCDM for Likert data (and smaller samples); adapting structural (or hierarchical) models to evaluate the DSM definition of pathological gambling. Pathological gambling application: model development; estimation/results.
33 The Gambling Explosion: Exponential increase in accessibility of gambling opportunities: state lotteries; Native American tribal casinos; riverboat gambling; Internet gambling. Incidences of pathological gambling have increased (Volberg, 2002). In order to limit the detrimental effects of gambling on a community: easily identify potential pathological gamblers and provide treatment interventions; understand the underlying causes of the disorder.
DSM Definition of Pathological Gambling: The DSM-IV-TR defines pathological gambling as an impulse control disorder (not elsewhere classified). To be classified as a pathological gambler, an individual must meet 5 or more of 10 defined criteria. All are dichotomous: meets/does not meet.
DSM CRITERIA:
C1 Is preoccupied with gambling
C2 Needs to gamble with increasing amounts of money in order to achieve the desired excitement
C3 Has repeated unsuccessful efforts to control, cut back, or stop gambling
C4 Is restless or irritable when attempting to cut down or stop gambling
C5 Gambles as a way of escaping from problems or of relieving a dysphoric mood
C6 After losing money gambling, often returns another day to get even
C7 Lies to family members, therapist, or others to conceal the extent of involvement with gambling
C8 Has committed illegal acts such as forgery, fraud, theft, or embezzlement to finance gambling
C9 Has jeopardized or lost a significant relationship, job, or educational or career opportunity because of gambling
C10 Relies on others to provide money to relieve a desperate financial situation caused by gambling
Studying Pathological Gambling: The DSM definition has several characteristics which make it seem somewhat implausible. All criteria are treated equally, in that the sum of any five will result in the diagnosis of pathological gambling. It seems odd to have the following given equal weight: C8 (has committed illegal acts such as forgery, fraud, theft, or embezzlement to finance gambling) and C1 (is preoccupied with gambling). If all criteria are treated equally, does the diagnostic criterion of five or more seem realistic? DCMs can help to answer both questions.
Session 3 Diagnostic Modeling in Educational and Psychological Settings: METHODS
34 Take each of the 10 criteria to be the dichotomous latent attributes. Applying a DCM would simultaneously provide: diagnostic information for each individual; underlying structural model parameters; evaluation of the DSM rule of five or more criteria for a pathological gambling diagnosis; evaluation of whether all criteria should be treated equivalently.
Gambling Instruments: Study included 112 experienced gamblers. Participants provided responses to two instruments: the Gambling Research Instrument (GRI; Henson, Feasel, & Jones, 2000), 41 items on a 6-point Likert scale; and the South Oaks Gambling Screen (SOGS; Lesieur & Blume, 1987), 20 binary items, used to validate results.
Psychometric Model: The full LCDM was not able to be estimated: small sample size; Likert response data. Instead, the DINO model was used: a one-or-more-attribute model with two parameters per item (regardless of entries in Q matrix). [Equation: shown for a dichotomous item measuring two attributes, annotated "All set to be equal"] A binomial link function was used to model Likert responses: a polytomous model assuming a binomial distribution conditional on attribute profile. [Figures: Conditional Response Distributions; Marginal Response Distributions]
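The "one or more attribute" idea can be sketched in code. Below, an item's response probability depends only on whether the respondent has at least one of the attributes the item measures, so two parameters per item suffice. The extension to Likert data assumes a 6-category response is binomial with 5 trials, which is our reading of the "binomial link" bullet rather than a formula taken from the slides; function names and parameter values are hypothetical.

```python
import math

def dino_gate(alpha, q_row):
    """1 if at least one attribute measured by the item is present, else 0."""
    return int(any(a == 1 and q == 1 for a, q in zip(alpha, q_row)))

def dino_probability(alpha, q_row, intercept, effect):
    """Two-parameter item response probability on the logit scale."""
    logit = intercept + effect * dino_gate(alpha, q_row)
    return 1.0 / (1.0 + math.exp(-logit))

def likert_pmf(k, p, n_categories=6):
    """Assumed binomial model: P(response = k), k = 0..n_categories-1."""
    n = n_categories - 1
    return math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)

# Hypothetical item measuring criteria 1 and 3 out of three attributes.
q_row = (1, 0, 1)
p_low = dino_probability((0, 1, 0), q_row, intercept=-1.5, effect=3.0)
p_high = dino_probability((0, 0, 1), q_row, intercept=-1.5, effect=3.0)
print(round(p_low, 3), round(p_high, 3))
print([round(likert_pmf(k, p_high), 3) for k in range(6)])
```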
35 GRI Structural Model: The structural model provides a model for the correlational structure of the attributes (Session 4). A two-class mixture was used as the structural model. Classes were meant to represent pathological gamblers (PG) and non-pathological gamblers (NPG). Helps determine how the latent criteria map onto pathological gamblers. The mixture structural model allows us to: calculate the probability that each criterion is met given an individual is a PG or an NPG; determine the criteria that best discriminate between PG and NPG; calculate the probability of being a PG based on the number of criteria met; evaluate the DSM-stated criterion of 5 or more to be diagnosed PG.
Model Estimation: Created a Markov chain Monte Carlo estimation algorithm in Fortran. Uniform prior for all item parameters. Latent traits (α) modeled with empirical prior defined by structural model. Uniform prior for all structural model parameters. Chain length of 50,000 (burn-in of 40,000). Convergence check: Geweke test; visual inspection of time series plots. [Figure: Algorithm Convergence]
Session 3 Diagnostic Modeling in Educational and Psychological Settings: MODEL RESULTS
36 Results To Be Presented: Fit check: model fit evaluation. Usability: diagnostic estimates of gamblers' DSM criteria profiles. Validation: how GRI/DCM diagnoses correspond to SOGS diagnoses. Interpretation: item parameter estimates. Structural model estimates: criteria with differential discrimination between PG and NPG; how many criteria are indicative of PG.
Checking Goodness of Fit: Typical measures of goodness of fit were unreasonable due to a sparse contingency table of responses (6^41 possible response patterns). A Monte Carlo fit index was constructed (based on Langeheine et al., 1996) for bivariate item statistics (Maydeu-Olivares & Joe, 2005). Root Mean Squared Residual (RMSR) of the Pearson correlation was used as a criterion. Correlation RMSR: p = 0.486, which indicates adequate fit.
[Figures: Respondent Diagnoses]
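The sparseness argument and the RMSR criterion can be illustrated briefly. The sketch below counts the possible response patterns for 41 six-category items, computes an RMSR over unique item pairs, and computes a Monte Carlo p-value under the assumption that it is the share of simulated data sets whose RMSR is at least as large as the observed one. The matrices and simulated values are hypothetical and the helper names are ours; the study's actual procedure may differ in detail.

```python
import math

n_patterns = 6 ** 41      # 41 items, 6 response categories each
n_respondents = 112
print(f"{n_patterns:.2e} possible patterns for {n_respondents} respondents")

def correlation_rmsr(observed, expected):
    """Root mean squared residual over unique item pairs (i < j)."""
    n_items = len(observed)
    squared = [
        (observed[i][j] - expected[i][j]) ** 2
        for i in range(n_items)
        for j in range(i + 1, n_items)
    ]
    return math.sqrt(sum(squared) / len(squared))

def monte_carlo_p(observed_rmsr, simulated_rmsrs):
    """Assumed definition: share of simulated RMSRs >= the observed RMSR."""
    hits = sum(1 for value in simulated_rmsrs if value >= observed_rmsr)
    return hits / len(simulated_rmsrs)

# Hypothetical 3-item correlation matrices.
observed = [[1.0, 0.5, 0.3], [0.5, 1.0, 0.4], [0.3, 0.4, 1.0]]
expected = [[1.0, 0.4, 0.3], [0.4, 1.0, 0.6], [0.3, 0.6, 1.0]]
rmsr = correlation_rmsr(observed, expected)
p_value = monte_carlo_p(rmsr, [0.10, 0.12, 0.14, 0.16])
print(round(rmsr, 4), p_value)
```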
37 Criterion Validity: Compared GRI/DCM classification with SOGS. 89.3% matching classifications. Cohen's kappa: 0.69. [Table: DCM classification (NPG, PG, Total) by SOGS classification (NPG, PG, Total)]
Item Parameter Interpretation: Bar graph: red bar is the average response for PG; blue bar is the average response for NPG. Item 5 [C2]: "I find it necessary to gamble with larger amounts of money (than when I first gambled) for gambling to be exciting." Item 13 [C3 or C4]: "I find it difficult to stop gambling."
[Figures: Structural Model Estimates; Evaluating the DSM 5 or More Rule]
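Percent agreement and Cohen's kappa both come from the 2 x 2 cross-classification. The cell counts were not preserved in this transcription, so the counts below are hypothetical: they were chosen only so that 112 respondents give roughly the reported agreement and kappa, and should not be read as the study's actual table.

```python
def agreement_and_kappa(table):
    """table[i][j] = count with DCM class i and SOGS class j."""
    total = sum(sum(row) for row in table)
    observed = sum(table[i][i] for i in range(len(table))) / total
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa

# Hypothetical counts (rows: DCM NPG, PG; columns: SOGS NPG, PG).
table = [[81, 6],
         [6, 19]]
observed, kappa = agreement_and_kappa(table)
print(f"agreement={observed:.3f}, kappa={kappa:.2f}")
```

Kappa is lower than raw agreement because it discounts the matches expected by chance from the marginal totals alone.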
38 Session 3 Diagnostic Modeling in Educational and Psychological Settings: CONCLUDING REMARKS PSYCHOLOGICAL MEASUREMENT
Concluding Remarks: Gambling Talk: DCM respondent estimates give rich information about the pattern of satisfied criteria; could be used to tailor treatment strategies. A better definition of PG would be one who meets FOUR or more criteria. Results suggest that Criteria 1, 3, and 10 are more discriminating of PG than other criteria. Criteria such as 2, 5, and 7 have relatively high probability of being met by NPG (more than 20% chance): weaker indicators of pathological gambling.
Session 3 Diagnostic Modeling in Educational and Psychological Settings: CONCLUDING REMARKS SESSION
Wrap Up and Take Home Points: Session 3 demonstrated some potential uses of DCMs. Applications of DCMs are rare: tests haven't been built to measure categorical attributes; item information is different in DCMs; users haven't had access to software. To date, most applications use software built by researchers: MCMC in Fortran or WinBUGS; MML in Fortran. This is about to change.
39 Notes on Usefulness of DCMs: Full utility of DCMs cannot be understood unless applications become more frequent. For now, have to use sub-optimal data and problems. Future applications coming soon: mathematical reasoning test under development (NSF funded); assessment of readiness for first grade in kindergartners. Funding opportunities exist and seem to review well: Educational Measurement: NSF (DR K-12); IES (Goals 2 and 5). Psychological Measurement: NIH (NIMH; NIDA; NIA; ...). Industry seems interested: ETS/College Board/ACT/Measurement Inc.; typically proprietary, dangerous for academics.
Advanced Topics: Structural Models, Model Fit, and Respondent Estimates. Session 4
Session Overview: Session 4 will provide the advanced topics needed to apply DCMs. Understanding structural models: what they are; how to summarize them; differing types. Assessment of model fit. How respondent diagnoses are made. WARNING: Content can be very technical. But fun, though.
Notation Used Throughout Session: Attributes: a = 1, ..., A. Respondents: r = 1, ..., R. Attribute profiles: α_r = [α_r1, α_r2, ..., α_rA]; α_ra is 0 or 1. Latent classes: c = 1, ..., C; we have C = 2^A latent classes, one for each possible attribute profile. Items: i = 1, ..., I; restricted to dichotomous item responses (X_ri is 0 or 1). Q matrix: elements q_ia for an item i and attribute a; q_ia is 0 or 1.
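The one-to-one link between latent classes and attribute profiles is easy to see by enumeration. In this sketch the ordering (last attribute changes fastest) is chosen to match the profile numbering used later in the session, where profile 2 is [0, 0, 1] and profile 6 is [1, 0, 1]; the function name is ours.

```python
from itertools import product

def attribute_profiles(n_attributes):
    """All 2^A binary attribute profiles, last attribute changing fastest."""
    return list(product((0, 1), repeat=n_attributes))

profiles = attribute_profiles(3)
for c, profile in enumerate(profiles, start=1):
    print(c, profile)

print(len(profiles))  # 8
```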
40 Session 4: Advanced Topics (Structural Models, Model Fit, and Respondent Estimates): STRUCTURAL MODELS
DCM Structural Models: Throughout the workshop, attribute profile base rates have been mentioned as being influential in DCMs: part of respondent diagnoses (to be shown); describe the nature of attribute profiles. ECPE discovered an apparent attribute hierarchy. Gambling study provided feedback on DSM criteria rules. The base rates represent the probability any respondent has a given attribute profile. For a test measuring A attributes, 2^A profiles are possible. The structural model provides the probability for each profile.
DCM Structural Models Defined: The parameter for the structural model is η_c: each attribute profile c has one; η_c is the base rate probability of attribute profile c. The ECPE estimates of η_c: [Table: c, η_c, α_1, α_2, α_3]
Interpreting the Structural Model: Because there are numerous η_c parameters, interpretation is difficult. Useful for detecting attribute hierarchies. Often, the η_c parameters are re-expressed as: the marginal probability an attribute is mastered in the population; the correlation between any two attributes. Both can be computed using a frequency analysis weighted by η_c.
41 SAS Structural Model Summary: SAS can be used to compute summaries of the structural model parameters. For each attribute, marginally: proportion of masters. For each pair of attributes: tetrachoric correlation.
Attribute Summary: For the ECPE data, we have the following attribute summary information. [Table: Attribute (Morphosyntactic, Cohesive, Lexical) by Proportion of Masters and Tetrachoric Correlation] Such information is helpful in determining the nature of attributes in a population of interest. Analogous to information about latent variables in CFA/MIRT.
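The same kind of weighted frequency summary can be sketched outside SAS. The example below computes the proportion of masters for each attribute from a set of base rates. For pairs of attributes it computes the phi (Pearson) correlation of the two binary attributes, which is a simpler stand-in and not the tetrachoric correlation reported on the slides. The base rates are hypothetical, not the ECPE estimates.

```python
import math
from itertools import product

profiles = list(product((0, 1), repeat=3))
# Hypothetical base rates, one per profile (they sum to 1).
eta = [0.30, 0.13, 0.01, 0.18, 0.01, 0.02, 0.01, 0.34]

def marginal(a):
    """Proportion of masters of attribute a (0-based index)."""
    return sum(e for e, p in zip(eta, profiles) if p[a] == 1)

def joint(a, b):
    """Proportion mastering both attribute a and attribute b."""
    return sum(e for e, p in zip(eta, profiles) if p[a] == 1 and p[b] == 1)

def phi(a, b):
    """Phi correlation between two binary attributes (not tetrachoric)."""
    pa, pb = marginal(a), marginal(b)
    return (joint(a, b) - pa * pb) / math.sqrt(pa * (1 - pa) * pb * (1 - pb))

for a in range(3):
    print(f"attribute {a + 1}: proportion of masters = {marginal(a):.2f}")
for a, b in ((0, 1), (0, 2), (1, 2)):
    print(f"attributes {a + 1} and {b + 1}: phi = {phi(a, b):.2f}")
```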
42 Differing Structural Models: The structural model of a DCM has the potential to have an overwhelming number of parameters. For A attributes, total estimated: 2^A - 1 (all must sum to 1). This is the saturated model. Multiple structural models exist: all reduce the number of parameters; all use categorical data analysis techniques to model η_c. Analogous to latent variable covariance structure in structural equation modeling, except that the distribution of attributes is categorical, not continuous.
Types of Structural Models: Log-linear model: predicts the natural logarithm of η_c by the attributes in each profile; allows for varying levels of complexity (most: saturated model; least: independent attributes model); implemented in Mplus (see Session 5) and main focus of discussion today. Tetrachoric correlation model: provides an item factor model for latent attributes; uses only bivariate information for pairs of attributes; allows for covariance structures to be estimated; not available in any software packages (but also shown briefly today). Hierarchical factors model: special case of tetrachoric correlation model. Mixture models: shown in gambling example; also given by von Davier (2008).
Log Linear Structural Models / Log Linear Model for μ_c: The log-linear structural model is the easiest to implement, due to its availability in Mplus. μ_c is the natural logarithm of η_c. [Table: c, η_c, μ_c, α_1, α_2, α_3] The structural model then uses an ANOVA-like model to predict the value of μ_c as a function of the attributes that are defined in attribute profile c. Shown for 3-attribute model: includes main effects, 2-way, and 3-way interactions. All parameters must sum to zero for identifiability. [Equation: intercept and main effects; 2-way and 3-way interactions]
43 Log Linear Structural Model Notation: The log-linear structural model parameters have several subscripts. Subscript #1 (e): the level of the effect: 0 is the intercept; 1 is a main effect; 2 is a two-way interaction; 3 is the three-way interaction. Subscript #2 (a_1, ...): the attributes the effect applies to; the number of attributes listed equals the level given in Subscript #1.
Log Linear Model Explained: Because not all attribute profiles include all attributes, only some terms get used to predict each value of μ_c.
For attribute profile 1, α_1 = [α_11 = 0; α_12 = 0; α_13 = 0]: only the intercept applies.
For attribute profile 2, α_2 = [α_21 = 0; α_22 = 0; α_23 = 1]: the intercept and main effect of attribute 3 apply.
For attribute profile 6, α_6 = [α_61 = 1; α_62 = 0; α_63 = 1]: the intercept, main effects of attribute 1 and attribute 3, and interaction between attributes 1 and 3 apply.
44 Log Linear Model Explained: For attribute profile 8, α_8 = [α_81 = 1; α_82 = 1; α_83 = 1]: all parameters apply.
Interpretations of Model Parameters: The log-linear model with ALL main effects and interactions is statistically equivalent to the saturated structural model. Two-way interactions are analogous to bivariate correlations in categorical models. Higher-level interactions represent higher-level characteristics of the attribute distribution (e.g., skewness, kurtosis). Models without interactions imply uncorrelated attributes. Main effects are essentially attribute base rates. Models without main effects or interactions assume all attribute profiles are equally likely. Higher-order interactions can be removed if not significantly different from zero.
Log Linear Model for ECPE: To demonstrate the log-linear model, we again present our ECPE data. Full model (all parameters): [Table: Estimate, SE, and p-value for γ_0, γ_1,(1), γ_1,(2), γ_1,(3), γ_2,(1,2), γ_2,(1,3), γ_2,(2,3), γ_3,(1,2,3)]
Reductions in the Structural Model: Because the three-way interaction was not significant, we can remove that parameter from the model without greatly affecting model fit. New results: [Table: Estimate, SE, and p-value for γ_0, γ_1,(1), γ_1,(2), γ_1,(3), γ_2,(1,2), γ_2,(1,3), γ_2,(2,3)]
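The rule for which terms apply to each profile can be written down directly: a term applies when every attribute it names is mastered in the profile. The sketch below uses hypothetical γ values (not the ECPE estimates) keyed by the attributes they involve, sums the applicable terms to get μ_c, and converts to base rates. The explicit normalization step is our assumption; it is equivalent to shifting the intercept so that the η_c sum to 1.

```python
import math
from itertools import combinations, product

# Hypothetical parameters keyed by the attributes involved; () is the intercept.
gamma = {
    (): 0.0,
    (1,): -1.0, (2,): -0.5, (3,): 0.5,
    (1, 2): 1.0, (1, 3): 1.5, (2, 3): 1.0,
    (1, 2, 3): 0.0,
}

def active_terms(profile):
    """Terms whose attributes are all mastered in this profile."""
    mastered = [a for a, value in enumerate(profile, start=1) if value == 1]
    return [combo
            for size in range(len(mastered) + 1)
            for combo in combinations(mastered, size)]

def mu(profile):
    """Log-linear predictor: sum of the applicable gamma terms."""
    return sum(gamma[term] for term in active_terms(profile))

profiles = list(product((0, 1), repeat=3))
mus = [mu(p) for p in profiles]
normalizer = sum(math.exp(m) for m in mus)
eta = [math.exp(m) / normalizer for m in mus]

print(active_terms((1, 0, 1)))  # [(), (1,), (3,), (1, 3)]
for profile, value in zip(profiles, eta):
    print(profile, round(value, 3))
```

Setting the three-way term to zero, as here, corresponds to the reduced model described above.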
45 New Results for Attribute Probabilities: The reduced model only slightly modifies the attribute probabilities. [Table: c, original η_c, new η_c]
Session 4: Advanced Topics (Structural Models, Model Fit, and Respondent Estimates): TETRACHORIC STRUCTURAL MODELS
Tetrachoric Structural Models: Because most summary information is given about attributes and pairs of attributes, tetrachoric models have been developed. Available in Arpeggio software (Assessment Systems Corporation). Such models use the tetrachoric correlation between attributes as a model for the probability for each attribute pattern.
Defining Tetrachoric Correlations: The tetrachoric correlation is a measure of the association between two binary variables. The correlation comes from mapping the binary variables onto two underlying continuous variables. Each of the continuous variables is bisected by a threshold which transforms the continuous response into a categorical outcome. The distribution of the underlying continuous variables is bivariate normal [Equation: density of the underlying continuous variables], where ρ is the tetrachoric correlation coefficient.
46 Tetrachoric Correlation Explained [Figure]
Technical Specifics: Multivariate Attributes: The tetrachoric models use the following function to model the probability of an attribute profile: [Equation: labeled terms are the tetrachoric correlation matrix and the multivariate normal density]
Structured Matrices: Placing a structure on the tetrachoric correlation matrix Ξ expands the model to mimic SEM (Templin & Henson, 2006).
Session 4: Advanced Topics (Structural Models, Model Fit, and Respondent Estimates): ASSESSMENT OF MODEL FIT
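The slide's formula did not survive transcription. One common way to write a threshold model of this kind is given below as an assumption, not as a reconstruction of the original: each attribute a has a threshold κ_a on an underlying standard normal variable, and the probability of a profile is the multivariate normal mass of the region on the matching side of each threshold. The symbols κ_a, l_a, and u_a are ours; Ξ is the tetrachoric correlation matrix named on the slide.

```latex
P(\boldsymbol{\alpha}_c) =
  \int_{l_1}^{u_1} \cdots \int_{l_A}^{u_A}
  \phi_A(\mathbf{z};\, \mathbf{0},\, \boldsymbol{\Xi}) \, d\mathbf{z},
\qquad
(l_a, u_a) =
\begin{cases}
(\kappa_a, \infty)  & \text{if } \alpha_{ca} = 1, \\
(-\infty, \kappa_a) & \text{if } \alpha_{ca} = 0.
\end{cases}
```

Here φ_A is the A-variate normal density with zero means and correlation matrix Ξ.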
47 Assessing Model Fit: There is no one best way to assess fit in DCMs. Techniques typically used can be put into several general categories: absolute fit (model-based hypothesis tests, if available; entropy); relative fit (information criteria); item fit. Topics discussed here will focus on fit statistics available in Mplus (also discussed in Session 5).
Overall Model Fit: Chi-Squared Test: For small numbers of items (10-15), the traditional chi-squared test of model fit can be used. Test is invalid for too many items (sparse data), as shown for the 28-item ECPE. Mplus gives this automatically: omits when data are sparse; can omit extreme cells from an analysis, which is misleading.
Overall Model Fit: (Relative) Entropy: The entropy of a model is a measure of classification uncertainty. It is an absolute fit statistic. Mplus reports relative entropy: a value of 1.00 means all respondents are classified with complete certainty (good fit); a value of 0.00 means all respondents are classified with equal probabilities for all classes (poor fit). ECPE (relative) entropy is hard to interpret by itself.
Relative Model Fit: Information Criteria: Used when comparing between two models: two DCMs (LCDM v. DINA); two Q matrices (4 v. 5 attributes); two different models (IRT v. DCM). Mplus reports AIC, BIC, and sample-size-adjusted BIC. All can be used. Smallest value is best.
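For readers who want to reproduce these statistics by hand, here is a sketch using the standard definitions: AIC = -2LL + 2p, BIC = -2LL + p ln(N), the sample-size-adjusted BIC with N replaced by (N + 2)/24, and relative entropy as one minus the average classification uncertainty scaled by ln(C). The log-likelihood, parameter count, and posterior probabilities below are hypothetical; only the sample size of 2922 comes from the ECPE example.

```python
import math

def information_criteria(log_likelihood, n_parameters, n_respondents):
    """AIC, BIC, and sample-size-adjusted BIC; smaller is better."""
    deviance = -2.0 * log_likelihood
    adjusted_n = (n_respondents + 2) / 24
    return {
        "AIC": deviance + 2 * n_parameters,
        "BIC": deviance + n_parameters * math.log(n_respondents),
        "SABIC": deviance + n_parameters * math.log(adjusted_n),
    }

def relative_entropy(posteriors):
    """1 = classified with certainty, 0 = equal probabilities for all classes."""
    n_respondents = len(posteriors)
    n_classes = len(posteriors[0])
    uncertainty = -sum(p * math.log(p)
                       for row in posteriors for p in row if p > 0)
    return 1.0 - uncertainty / (n_respondents * math.log(n_classes))

# Hypothetical log-likelihood and parameter count.
print(information_criteria(log_likelihood=-42000.0, n_parameters=81,
                           n_respondents=2922))
# Hypothetical posterior class probabilities for three respondents.
print(round(relative_entropy([[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]), 3))
```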
48 Item Fit Statistics: The TECH10 option reports a degree of misfit for each item individually (univariate) and each pair of items (bivariate). Uses chi-squared test for misfit: values for each item are distributed as chi-square with 1 df (for binary items). Misfitting items can be investigated: Q matrix can be changed; items can be removed.
Item Fit Statistics: Univariate Fit: Univariate fit attempts to determine if the model fits each item marginally. A limited-information statistic. Not useful in DCMs: the model is for the probability, so it will always fit perfectly.
Item Fit Statistics: Bivariate Fit: Bivariate fit is an index of fit for a pair of items. Compares observed data with frequency expected under DCM. Produces a 1 df chi-squared test. Can help identify items that do not fit the model. Rough approximation.
Concluding Remarks: Model Fit: Assessment of model fit in DCMs is currently a difficult task. Easily accessible options are limited. Can quickly find options that take longer to assess fit than to estimate model. Mplus options are adequate for initial screening. DCMs share this problem with IRT models and general categorical data analyses. Other model fit options are available and forthcoming: based on limited information (e.g., Templin, 2007); need further testing.
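A bivariate fit check of this kind can be sketched as a Pearson chi-square over the four response combinations of an item pair, referred to a chi-square distribution with 1 df as the slides describe. The counts are hypothetical, and this is a generic illustration rather than the exact TECH10 computation. The pair count uses the 28 items from the ECPE analysis.

```python
import math

def bivariate_chi_square(observed, expected):
    """Pearson chi-square over the four cells of a 2 x 2 item-pair table."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def p_value_1df(x2):
    """Upper-tail probability of a chi-square variable with 1 df."""
    return math.erfc(math.sqrt(x2 / 2.0))

n_pairs = math.comb(28, 2)
print(n_pairs)  # 378

# Hypothetical counts for response pairs (0,0), (0,1), (1,0), (1,1).
observed = [120, 80, 60, 140]
expected = [105.0, 95.0, 75.0, 125.0]
x2 = bivariate_chi_square(observed, expected)
print(round(x2, 2), round(p_value_1df(x2), 4))
```

With 378 pairs tested at the 0.01 level, about 4 flagged pairs would be expected by chance alone, which puts the 45 flagged ECPE pairs in context.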
More informationWhat is Gambling? Gambling or ludomania is an urge to continuously gamble despite harmful negative consequences or a desire to stop.
By Benjamin Bunker What is Gambling? Gambling or ludomania is an urge to continuously gamble despite harmful negative consequences or a desire to stop. What is Gambling? Pt. 2 Gambling is an Impulse Control
More informationItem Response Theory. Steven P. Reise University of California, U.S.A. Unidimensional IRT Models for Dichotomous Item Responses
Item Response Theory Steven P. Reise University of California, U.S.A. Item response theory (IRT), or modern measurement theory, provides alternatives to classical test theory (CTT) methods for the construction,
More informationItem Analysis: Classical and Beyond
Item Analysis: Classical and Beyond SCROLLA Symposium Measurement Theory and Item Analysis Modified for EPE/EDP 711 by Kelly Bradley on January 8, 2013 Why is item analysis relevant? Item analysis provides
More informationUsing the Rasch Modeling for psychometrics examination of food security and acculturation surveys
Using the Rasch Modeling for psychometrics examination of food security and acculturation surveys Jill F. Kilanowski, PhD, APRN,CPNP Associate Professor Alpha Zeta & Mu Chi Acknowledgements Dr. Li Lin,
More informationBayesian and Frequentist Approaches
Bayesian and Frequentist Approaches G. Jogesh Babu Penn State University http://sites.stat.psu.edu/ babu http://astrostatistics.psu.edu All models are wrong But some are useful George E. P. Box (son-in-law
More informationDoes factor indeterminacy matter in multi-dimensional item response theory?
ABSTRACT Paper 957-2017 Does factor indeterminacy matter in multi-dimensional item response theory? Chong Ho Yu, Ph.D., Azusa Pacific University This paper aims to illustrate proper applications of multi-dimensional
More informationYou must answer question 1.
Research Methods and Statistics Specialty Area Exam October 28, 2015 Part I: Statistics Committee: Richard Williams (Chair), Elizabeth McClintock, Sarah Mustillo You must answer question 1. 1. Suppose
More informationDescribe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo
Business Statistics The following was provided by Dr. Suzanne Delaney, and is a comprehensive review of Business Statistics. The workshop instructor will provide relevant examples during the Skills Assessment
More informationMultilevel IRT for group-level diagnosis. Chanho Park Daniel M. Bolt. University of Wisconsin-Madison
Group-Level Diagnosis 1 N.B. Please do not cite or distribute. Multilevel IRT for group-level diagnosis Chanho Park Daniel M. Bolt University of Wisconsin-Madison Paper presented at the annual meeting
More information3 CONCEPTUAL FOUNDATIONS OF STATISTICS
3 CONCEPTUAL FOUNDATIONS OF STATISTICS In this chapter, we examine the conceptual foundations of statistics. The goal is to give you an appreciation and conceptual understanding of some basic statistical
More informationQuantifying Problem Gambling: Explorations in measurement. Nigel E. Turner, Ph.D. Centre for Addiction and Mental Health
Quantifying Problem Gambling: Explorations in measurement Nigel E. Turner, Ph.D. Centre for Addiction and Mental Health Original abstract Abstract: Over the past few years I had conducted several studies
More informationUsing Differential Item Functioning to Test for Inter-rater Reliability in Constructed Response Items
University of Wisconsin Milwaukee UWM Digital Commons Theses and Dissertations May 215 Using Differential Item Functioning to Test for Inter-rater Reliability in Constructed Response Items Tamara Beth
More informationA Case Study: Two-sample categorical data
A Case Study: Two-sample categorical data Patrick Breheny January 31 Patrick Breheny BST 701: Bayesian Modeling in Biostatistics 1/43 Introduction Model specification Continuous vs. mixture priors Choice
More informationDetecting Suspect Examinees: An Application of Differential Person Functioning Analysis. Russell W. Smith Susan L. Davis-Becker
Detecting Suspect Examinees: An Application of Differential Person Functioning Analysis Russell W. Smith Susan L. Davis-Becker Alpine Testing Solutions Paper presented at the annual conference of the National
More informationBayesian Logistic Regression Modelling via Markov Chain Monte Carlo Algorithm
Journal of Social and Development Sciences Vol. 4, No. 4, pp. 93-97, Apr 203 (ISSN 222-52) Bayesian Logistic Regression Modelling via Markov Chain Monte Carlo Algorithm Henry De-Graft Acquah University
More informationLikelihood Ratio Based Computerized Classification Testing. Nathan A. Thompson. Assessment Systems Corporation & University of Cincinnati.
Likelihood Ratio Based Computerized Classification Testing Nathan A. Thompson Assessment Systems Corporation & University of Cincinnati Shungwon Ro Kenexa Abstract An efficient method for making decisions
More informationSheila Barron Statistics Outreach Center 2/8/2011
Sheila Barron Statistics Outreach Center 2/8/2011 What is Power? When conducting a research study using a statistical hypothesis test, power is the probability of getting statistical significance when
More informationEndorsement of Criminal Behavior Amongst Offenders: Implications for DSM-5 Gambling Disorder
J Gambl Stud (2016) 32:35 45 DOI 10.1007/s10899-015-9540-3 ORIGINAL PAPER Endorsement of Criminal Behavior Amongst Offenders: Implications for DSM-5 Gambling Disorder Nigel E. Turner 1,2 Randy Stinchfield
More informationFundamental Clinical Trial Design
Design, Monitoring, and Analysis of Clinical Trials Session 1 Overview and Introduction Overview Scott S. Emerson, M.D., Ph.D. Professor of Biostatistics, University of Washington February 17-19, 2003
More informationRunning head: ATTRIBUTE CODING FOR RETROFITTING MODELS. Comparison of Attribute Coding Procedures for Retrofitting Cognitive Diagnostic Models
Running head: ATTRIBUTE CODING FOR RETROFITTING MODELS Comparison of Attribute Coding Procedures for Retrofitting Cognitive Diagnostic Models Amy Clark Neal Kingston University of Kansas Corresponding
More informationITEM RESPONSE THEORY ANALYSIS OF THE TOP LEADERSHIP DIRECTION SCALE
California State University, San Bernardino CSUSB ScholarWorks Electronic Theses, Projects, and Dissertations Office of Graduate Studies 6-2016 ITEM RESPONSE THEORY ANALYSIS OF THE TOP LEADERSHIP DIRECTION
More informationDetection of Differential Test Functioning (DTF) and Differential Item Functioning (DIF) in MCCQE Part II Using Logistic Models
Detection of Differential Test Functioning (DTF) and Differential Item Functioning (DIF) in MCCQE Part II Using Logistic Models Jin Gong University of Iowa June, 2012 1 Background The Medical Council of
More informationMichael Hallquist, Thomas M. Olino, Paul A. Pilkonis University of Pittsburgh
Comparing the evidence for categorical versus dimensional representations of psychiatric disorders in the presence of noisy observations: a Monte Carlo study of the Bayesian Information Criterion and Akaike
More informationComputerized Mastery Testing
Computerized Mastery Testing With Nonequivalent Testlets Kathleen Sheehan and Charles Lewis Educational Testing Service A procedure for determining the effect of testlet nonequivalence on the operating
More informationEPI 200C Final, June 4 th, 2009 This exam includes 24 questions.
Greenland/Arah, Epi 200C Sp 2000 1 of 6 EPI 200C Final, June 4 th, 2009 This exam includes 24 questions. INSTRUCTIONS: Write all answers on the answer sheets supplied; PRINT YOUR NAME and STUDENT ID NUMBER
More informationBasic concepts and principles of classical test theory
Basic concepts and principles of classical test theory Jan-Eric Gustafsson What is measurement? Assignment of numbers to aspects of individuals according to some rule. The aspect which is measured must
More informationThe Evolving Definition of Pathological Gambling in the DSM-5
The Evolving Definition of Pathological Gambling in the DSM-5 By Christine Reilly and Nathan Smith National Center for Responsible Gaming One of the most anticipated events in the mental health field is
More informationREPORT. Technical Report: Item Characteristics. Jessica Masters
August 2010 REPORT Diagnostic Geometry Assessment Project Technical Report: Item Characteristics Jessica Masters Technology and Assessment Study Collaborative Lynch School of Education Boston College Chestnut
More information12/30/2017. PSY 5102: Advanced Statistics for Psychological and Behavioral Research 2
PSY 5102: Advanced Statistics for Psychological and Behavioral Research 2 Selecting a statistical test Relationships among major statistical methods General Linear Model and multiple regression Special
More informationCentre for Education Research and Policy
THE EFFECT OF SAMPLE SIZE ON ITEM PARAMETER ESTIMATION FOR THE PARTIAL CREDIT MODEL ABSTRACT Item Response Theory (IRT) models have been widely used to analyse test data and develop IRT-based tests. An
More informationSelection of Linking Items
Selection of Linking Items Subset of items that maximally reflect the scale information function Denote the scale information as Linear programming solver (in R, lp_solve 5.5) min(y) Subject to θ, θs,
More informationUSE OF DIFFERENTIAL ITEM FUNCTIONING (DIF) ANALYSIS FOR BIAS ANALYSIS IN TEST CONSTRUCTION
USE OF DIFFERENTIAL ITEM FUNCTIONING (DIF) ANALYSIS FOR BIAS ANALYSIS IN TEST CONSTRUCTION Iweka Fidelis (Ph.D) Department of Educational Psychology, Guidance and Counselling, University of Port Harcourt,
More informationIntroduction to Multilevel Models for Longitudinal and Repeated Measures Data
Introduction to Multilevel Models for Longitudinal and Repeated Measures Data Today s Class: Features of longitudinal data Features of longitudinal models What can MLM do for you? What to expect in this
More informationStatistical analysis DIANA SAPLACAN 2017 * SLIDES ADAPTED BASED ON LECTURE NOTES BY ALMA LEORA CULEN
Statistical analysis DIANA SAPLACAN 2017 * SLIDES ADAPTED BASED ON LECTURE NOTES BY ALMA LEORA CULEN Vs. 2 Background 3 There are different types of research methods to study behaviour: Descriptive: observations,
More informationTurning Output of Item Response Theory Data Analysis into Graphs with R
Overview Turning Output of Item Response Theory Data Analysis into Graphs with R Motivation Importance of graphing data Graphical methods for item response theory Why R? Two examples Ching-Fan Sheu, Cheng-Te
More informationmultilevel modeling for social and personality psychology
1 Introduction Once you know that hierarchies exist, you see them everywhere. I have used this quote by Kreft and de Leeuw (1998) frequently when writing about why, when, and how to use multilevel models
More informationPathological Gambling Report by Sean Quinn
Pathological Gambling Report by Sean Quinn Signs of pathological gambling A persistent and recurrent maladaptive gambling behavior is indicated by five or more of the following: Is preoccupied with gambling
More informationSUPPLEMENTAL MATERIAL
1 SUPPLEMENTAL MATERIAL Response time and signal detection time distributions SM Fig. 1. Correct response time (thick solid green curve) and error response time densities (dashed red curve), averaged across
More informationRe-Examining the Role of Individual Differences in Educational Assessment
Re-Examining the Role of Individual Differences in Educational Assesent Rebecca Kopriva David Wiley Phoebe Winter University of Maryland College Park Paper presented at the Annual Conference of the National
More informationIDENTIFYING DATA CONDITIONS TO ENHANCE SUBSCALE SCORE ACCURACY BASED ON VARIOUS PSYCHOMETRIC MODELS
IDENTIFYING DATA CONDITIONS TO ENHANCE SUBSCALE SCORE ACCURACY BASED ON VARIOUS PSYCHOMETRIC MODELS A Dissertation Presented to The Academic Faculty by HeaWon Jun In Partial Fulfillment of the Requirements
More informationUsing Analytical and Psychometric Tools in Medium- and High-Stakes Environments
Using Analytical and Psychometric Tools in Medium- and High-Stakes Environments Greg Pope, Analytics and Psychometrics Manager 2008 Users Conference San Antonio Introduction and purpose of this session
More informationHaving your cake and eating it too: multiple dimensions and a composite
Having your cake and eating it too: multiple dimensions and a composite Perman Gochyyev and Mark Wilson UC Berkeley BEAR Seminar October, 2018 outline Motivating example Different modeling approaches Composite
More informationContents. What is item analysis in general? Psy 427 Cal State Northridge Andrew Ainsworth, PhD
Psy 427 Cal State Northridge Andrew Ainsworth, PhD Contents Item Analysis in General Classical Test Theory Item Response Theory Basics Item Response Functions Item Information Functions Invariance IRT
More informationIssues That Should Not Be Overlooked in the Dominance Versus Ideal Point Controversy
Industrial and Organizational Psychology, 3 (2010), 489 493. Copyright 2010 Society for Industrial and Organizational Psychology. 1754-9426/10 Issues That Should Not Be Overlooked in the Dominance Versus
More informationPolitical Science 15, Winter 2014 Final Review
Political Science 15, Winter 2014 Final Review The major topics covered in class are listed below. You should also take a look at the readings listed on the class website. Studying Politics Scientifically
More informationA Comparison of Methods of Estimating Subscale Scores for Mixed-Format Tests
A Comparison of Methods of Estimating Subscale Scores for Mixed-Format Tests David Shin Pearson Educational Measurement May 007 rr0701 Using assessment and research to promote learning Pearson Educational
More informationIntroduction to Multilevel Models for Longitudinal and Repeated Measures Data
Introduction to Multilevel Models for Longitudinal and Repeated Measures Data Today s Class: Features of longitudinal data Features of longitudinal models What can MLM do for you? What to expect in this
More informationUnderstandable Statistics
Understandable Statistics correlated to the Advanced Placement Program Course Description for Statistics Prepared for Alabama CC2 6/2003 2003 Understandable Statistics 2003 correlated to the Advanced Placement
More informationABERRANT RESPONSE PATTERNS AS A MULTIDIMENSIONAL PHENOMENON: USING FACTOR-ANALYTIC MODEL COMPARISON TO DETECT CHEATING. John Michael Clark III
ABERRANT RESPONSE PATTERNS AS A MULTIDIMENSIONAL PHENOMENON: USING FACTOR-ANALYTIC MODEL COMPARISON TO DETECT CHEATING BY John Michael Clark III Submitted to the graduate degree program in Psychology and
More informationAnalyzing Teacher Professional Standards as Latent Factors of Assessment Data: The Case of Teacher Test-English in Saudi Arabia
Analyzing Teacher Professional Standards as Latent Factors of Assessment Data: The Case of Teacher Test-English in Saudi Arabia 1 Introduction The Teacher Test-English (TT-E) is administered by the NCA
More informationVR for pathological gambling
CYBERTHERAPY 2006 VIRTUAL REALITY IN THE TREATMENT OF PATHOLOGICAL GAMBLING A. Garcia-Palacios, N. Lasso de la Vega, C. Botella,, R.M. Baños & S. Quero Universitat Jaume I. Universidad de Valencia. Universidad
More informationDescribe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo
Please note the page numbers listed for the Lind book may vary by a page or two depending on which version of the textbook you have. Readings: Lind 1 11 (with emphasis on chapters 10, 11) Please note chapter
More informationMidterm project due next Wednesday at 2 PM
Course Business Midterm project due next Wednesday at 2 PM Please submit on CourseWeb Next week s class: Discuss current use of mixed-effects models in the literature Short lecture on effect size & statistical
More informationHierarchical Bayesian Modeling of Individual Differences in Texture Discrimination
Hierarchical Bayesian Modeling of Individual Differences in Texture Discrimination Timothy N. Rubin (trubin@uci.edu) Michael D. Lee (mdlee@uci.edu) Charles F. Chubb (cchubb@uci.edu) Department of Cognitive
More informationA MONTE CARLO STUDY OF MODEL SELECTION PROCEDURES FOR THE ANALYSIS OF CATEGORICAL DATA
A MONTE CARLO STUDY OF MODEL SELECTION PROCEDURES FOR THE ANALYSIS OF CATEGORICAL DATA Elizabeth Martin Fischer, University of North Carolina Introduction Researchers and social scientists frequently confront
More informationDoing Quantitative Research 26E02900, 6 ECTS Lecture 6: Structural Equations Modeling. Olli-Pekka Kauppila Daria Kautto
Doing Quantitative Research 26E02900, 6 ECTS Lecture 6: Structural Equations Modeling Olli-Pekka Kauppila Daria Kautto Session VI, September 20 2017 Learning objectives 1. Get familiar with the basic idea
More informationIAPT: Regression. Regression analyses
Regression analyses IAPT: Regression Regression is the rather strange name given to a set of methods for predicting one variable from another. The data shown in Table 1 and come from a student project
More informationSmall-area estimation of mental illness prevalence for schools
Small-area estimation of mental illness prevalence for schools Fan Li 1 Alan Zaslavsky 2 1 Department of Statistical Science Duke University 2 Department of Health Care Policy Harvard Medical School March
More informationProblem #1 Neurological signs and symptoms of ciguatera poisoning as the start of treatment and 2.5 hours after treatment with mannitol.
Ho (null hypothesis) Ha (alternative hypothesis) Problem #1 Neurological signs and symptoms of ciguatera poisoning as the start of treatment and 2.5 hours after treatment with mannitol. Hypothesis: Ho:
More informationMeasurement Models for Behavioral Frequencies: A Comparison Between Numerically and Vaguely Quantified Reports. September 2012 WORKING PAPER 10
WORKING PAPER 10 BY JAMIE LYNN MARINCIC Measurement Models for Behavioral Frequencies: A Comparison Between Numerically and Vaguely Quantified Reports September 2012 Abstract Surveys collecting behavioral
More informationThe University of North Carolina at Chapel Hill School of Social Work
The University of North Carolina at Chapel Hill School of Social Work SOWO 918: Applied Regression Analysis and Generalized Linear Models Spring Semester, 2014 Instructor Shenyang Guo, Ph.D., Room 524j,
More informationInfluences of IRT Item Attributes on Angoff Rater Judgments
Influences of IRT Item Attributes on Angoff Rater Judgments Christian Jones, M.A. CPS Human Resource Services Greg Hurt!, Ph.D. CSUS, Sacramento Angoff Method Assemble a panel of subject matter experts
More informationImpact and adjustment of selection bias. in the assessment of measurement equivalence
Impact and adjustment of selection bias in the assessment of measurement equivalence Thomas Klausch, Joop Hox,& Barry Schouten Working Paper, Utrecht, December 2012 Corresponding author: Thomas Klausch,
More information