Farzad Eskandanian
Measurement is the process of observing and recording observations. Two important issues:
1. Understanding the fundamental ideas:
   - Levels of measurement: nominal, ordinal, interval, and ratio
   - Reliability
2. Types of measures:
   - Survey research: design of interviews and questionnaires
   - Scaling: methods for developing a scale
   - Qualitative research: non-numerical measurement approaches
Construct validity concerns generalizing from your program or measures to the concept behind your program or measures. It is a labeling issue. Examples: for a Head Start program, is the label accurate? For a measure that you term "self esteem," is that what you were really measuring? In short, it is the degree to which a test measures what it claims to be measuring.
We really want to talk about the validity of any operationalization. What is an operationalization? Any time you translate a concept or construct into a functioning, operating reality, you should be concerned about how well you did the translation. Construct validity is the approximate truth of the conclusion that your operationalization accurately reflects its construct.
There are different ways to demonstrate different aspects of construct validity. Translation validity is the degree to which you accurately translated your construct into the operationalization:
- Face validity: look at the operationalization and see whether "on its face" it seems like a good translation of the construct.
- Content validity: check the operationalization against the relevant content domain for the construct. It is not always easy to decide on the criteria that constitute the content domain.
Criterion-related validity checks the performance of your operationalization against some criterion: we make a prediction about how the operationalization will perform based on our theory of the construct. Types:
- Predictive validity: assess the operationalization's ability to predict something it should theoretically be able to predict.
- Concurrent validity: assess the operationalization's ability to distinguish between groups that it should theoretically be able to distinguish between.
- Convergent validity: examine the degree to which the operationalization is similar to other operationalizations that it theoretically should be similar to.
- Discriminant validity: examine the degree to which the operationalization is not similar to other operationalizations that it theoretically should not be similar to.
The idea of construct validity. Definitionalist: precise, absolute definitions. Rationalist: meanings differ relatively, not absolutely. In court you swear to tell "the truth, the whole truth, and nothing but the truth." In our context, our measure should reflect "the construct, the whole construct, and nothing but the construct." Example: "self esteem, all of self esteem, and nothing but self esteem."
To establish construct validity, the following conditions are required:
- Operationalize within a semantic net.
- Control the operationalization of the construct so that it looks similar to what you theoretically mean.
- Provide evidence that your data support your theoretical view of the relations among constructs.
Show both correspondence (convergence) between similar constructs and discrimination between dissimilar constructs. Correlations between theoretically similar measures should be "high," while correlations between theoretically dissimilar measures should be "low." Note that convergent correlations should always be higher than the discriminant ones.
We theorize that all four items reflect the idea of self esteem. Observations give the intercorrelations of the four items. A pattern of high correlations indicates that the four items are converging on the same idea (construct).
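A minimal sketch of this convergence check, with simulated data (the latent "self esteem" variable, item count, and noise levels are all illustrative assumptions, not the source's data):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Hypothetical: a latent "self esteem" score drives all four items,
# each observed with its own random error.
self_esteem = rng.normal(0, 1, n)
items = np.column_stack(
    [self_esteem + rng.normal(0, 0.5, n) for _ in range(4)]
)

# Intercorrelations of the four items: consistently high off-diagonal
# values suggest the items converge on the same construct.
corr = np.corrcoef(items, rowvar=False)
print(corr.round(2))
```

With these illustrative noise levels the off-diagonal correlations come out around 0.8, which is the kind of pattern that supports convergence.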
For discrimination, show that measures that should not be related in reality are in fact not related.
Threats to construct validity:
- Inadequate preoperational explication of constructs: you didn't do a good enough job of defining (operationally) what you mean by the construct. Think more about the concepts; use concept mapping and expert opinions.
- Mono-operation bias: if you only use a single version of a program in a single place at a single point in time, you are not capturing the whole picture. Solution: try to implement multiple versions of your program.
- Mono-method bias: refers to your measures or observations, not to your programs or causes. Solution: try to implement multiple measures of key constructs.
Further threats to construct validity:
- Interaction of different treatments: can you really label the program effect as a consequence of your program alone?
- Interaction of testing and treatment.
- Restricted generalizability across constructs: unintended consequences of the program.
- Confounding constructs and levels of constructs: your label is not a good description of what you implemented.
- Social threats: hypothesis guessing, evaluation apprehension, experimenter expectancies.
A nomological network provides evidence that your measure has construct validity: the model links the conceptual/theoretical realm with the observable one, and this link is the central concern of construct validity.
The multitrait-multimethod (MTMM) matrix is a correlation matrix with several distinct regions:
- The reliability diagonal (monotrait-monomethod)
- The validity diagonals (monotrait-heteromethod)
- The heterotrait-monomethod triangles
- The heterotrait-heteromethod triangles
- The monomethod blocks
- The heteromethod blocks
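A small simulation can illustrate the key MTMM pattern: the validity diagonal (same trait, different methods) should show higher correlations than the heterotrait cells. The two traits, two methods, and all numeric values below are illustrative assumptions, not the source's example:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 300

# Hypothetical MTMM setup: two traits (self esteem, anxiety), each
# measured by two methods (survey, observer rating).
esteem = rng.normal(0, 1, n)
anxiety = rng.normal(0, 1, n)

measures = {
    "esteem_survey":    esteem + rng.normal(0, 0.6, n),
    "esteem_observer":  esteem + rng.normal(0, 0.6, n),
    "anxiety_survey":   anxiety + rng.normal(0, 0.6, n),
    "anxiety_observer": anxiety + rng.normal(0, 0.6, n),
}
names = list(measures)
corr = np.corrcoef(np.column_stack(list(measures.values())), rowvar=False)

# Validity diagonal (monotrait-heteromethod): same trait, different methods.
r_validity = corr[names.index("esteem_survey"), names.index("esteem_observer")]
# Heterotrait-heteromethod: different trait, different methods.
r_hetero = corr[names.index("esteem_survey"), names.index("anxiety_observer")]

print(round(r_validity, 2), round(r_hetero, 2))
```

Convergent evidence shows up as a validity correlation that clearly exceeds the heterotrait correlations, which here sit near zero.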
We link two patterns: a theoretical one and an observational one. A test of significance is usually required, e.g., a t-test or ANOVA.
Reliability is about the quality of measurement: the "consistency" or "repeatability" of your measures. True score theory: an observed score is true ability (the true score) plus random error. A measure that has no random error (is all true score) is perfectly reliable.
Two kinds of measurement error: random error (noise) and systematic error (bias).
Usually we don't know the true score; we only have the observation X. Using two observations we can see that the true score is shared between them, but we can't calculate the variance of the true score directly, so we estimate reliability from the correlation between the two observations:

corr(X1, X2) = cov(X1, X2) / sqrt(var(X1) * var(X2))
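A short simulation illustrates true score theory and why the correlation between two parallel observations estimates reliability, i.e. var(T)/var(X). All numbers (means, variances, sample size) are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 10_000

# True score theory: observed X = T + e
T = rng.normal(50, 10, n)    # true scores, var(T) = 100
e1 = rng.normal(0, 5, n)     # random error, var(e) = 25
e2 = rng.normal(0, 5, n)
X1, X2 = T + e1, T + e2      # two parallel observations

# Theoretical reliability: var(T) / var(X) = 100 / 125 = 0.8
reliability_true = T.var() / X1.var()
# Estimated reliability: correlation between the parallel measures
reliability_est = np.corrcoef(X1, X2)[0, 1]
print(round(reliability_true, 2), round(reliability_est, 2))
```

Both quantities come out near 0.8 here, showing that the test-retest (parallel-forms) correlation recovers the share of observed variance due to the true score.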
Four types of reliability estimates:
- Inter-rater or inter-observer reliability: used to assess the degree to which different raters/observers give consistent estimates of the same phenomenon.
- Test-retest reliability: used to assess the consistency of a measure from one time to another.
- Parallel-forms reliability: used to assess the consistency of the results of two tests constructed in the same way from the same content domain.
- Internal consistency reliability: used to assess the consistency of results across items within a test.
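Internal consistency is commonly summarized with Cronbach's alpha. A minimal sketch, using simulated data (the five-item test and noise levels are illustrative assumptions, not from the source):

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents, k_items) array."""
    k = items.shape[1]
    sum_item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - sum_item_vars / total_var)

rng = np.random.default_rng(7)
n = 500
trait = rng.normal(0, 1, n)
# Five hypothetical test items, each the trait plus independent noise.
items = np.column_stack([trait + rng.normal(0, 0.7, n) for _ in range(5)])
print(round(cronbach_alpha(items), 2))
```

With these illustrative numbers alpha lands around 0.9; highly intercorrelated items yield high internal consistency, while unrelated items drive alpha toward zero.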
Think of the center of the target as the concept that you are trying to measure.