Running head: CONSTRUCT VALIDITY STUDIES OF THE TLAP-R 1 DRAFT. Construct Validity Studies of the TLAP-R. Xavier Jouve. Cogn-IQ.

Similar documents
COPYRIGHTED MATERIAL. One OVERVIEW OF THE SB5 AND ITS HISTORY

Running head: CPPS REVIEW 1

Testing and Individual Differences

Intelligence. Exam 3. iclicker. My Brilliant Brain. What is Intelligence? Conceptual Difficulties. Chapter 10

Intelligence. Intelligence Assessment Individual Differences

Myers Psychology for AP, 2e

Examining the Psychometric Properties of The McQuaig Occupational Test

Psychological testing

Empowered by Psychometrics The Fundamentals of Psychometrics. Jim Wollack University of Wisconsin Madison

Intelligence. Exam 3. Conceptual Difficulties. What is Intelligence? Chapter 11. Intelligence: Ability or Abilities? Controversies About Intelligence


JENSEN'S THEORY OF INTELLIGENCE: A REPLY

UNDERSTANDING INDIVIDUAL DIFFERNCES: THE CASE OF INTELLIGNCE

Intelligence What is intelligence? Intelligence Tests and Testing

PSYCHOLOGY 1002 NOTES. Mental Abilities MENTAL ABILITIES

AP PSYCH Unit 11.2 Assessing Intelligence

A concept that refers to individual differences in abilities to: Acquire knowledge Think and reason effectively Deal adaptively with the environment

Intelligence. PSYCHOLOGY (8th Edition) David Myers. Intelligence. Chapter 11. What is Intelligence?

Intelligence, Thinking & Language

The Normal Curve. You ll need Barron s book, partner, and notes

Intelligence, Aptitude, and Cognitive Abilities 01/08/2014

2. Which pioneer in intelligence testing first introduced performance scales in addition to verbal scales? David Wechsler

Confirmation of the structural dimensionality of the Stanford-Binet Intelligence Scale (Fourth Edition)

Rapidly-administered short forms of the Wechsler Adult Intelligence Scale 3rd edition

VALIDITY STUDIES WITH A TEST OF HIGH-LEVEL REASONING

How do we construct Intelligence tests? Tests must be: Standardized Reliable Valid

The merits of mental age as an additional measure of intellectual ability in the low ability range. Simon Whitaker

The secular increase in test scores is a ``Jensen e ect''

person has learned a test designed to predict a person's future performance; the capacity to learn Aptitude Test

TESTING AND INDIVIDUAL DIFFERENCES. AP Psychology

Introduction to Psychology. Lecture 34

ESTABLISHING VALIDITY AND RELIABILITY OF ACHIEVEMENT TEST IN BIOLOGY FOR STD. IX STUDENTS

Change in Plans. Monday. Wednesday. Finish intelligence Grade notebooks FRQ Work on Personality Project. Multiple Choice Work on Personality Project

On the purpose of testing:

Testing and Intelligence. What We Will Cover in This Section. Psychological Testing. Intelligence. Reliability Validity Types of tests.

Running head: THE DEVELOPMENT AND PILOTING OF AN ONLINE IQ TEST. The Development and Piloting of an Online IQ Test. Examination number:

The Context and Purpose of Executive Function Assessments: Beyond the Tripartite Model

AN ABSTRACT OF THE THESIS OF. Title: Relationships Between Scores on the American College Test and the

Process of a neuropsychological assessment

Using contextual analysis to investigate the nature of spatial memory

Construct Reliability and Validity Update Report

Empirical Knowledge: based on observations. Answer questions why, whom, how, and when.

Validation of Scales

Sex differences on the progressive matrices among adolescents: some data from Estonia

Introduction to the HBDI and the Whole Brain Model. Technical Overview & Validity Evidence

Proactive interference and practice effects in visuospatial working memory span task performance

Extending the Measurement of

NEUROPSYCHOLOGICAL ASSESSMENT S A R A H R A S K I N, P H D, A B P P S A R A H B U L L A R D, P H D, A B P P

The Construct Validity of Memory Span as a Measure of Intelligence

INTELLIGENCE AND CREATIVITY

Cognitive Design Principles and the Successful Performer: A Study on Spatial Ability

Testing the Multiple Intelligences Theory in Oman

The Asian Conference on Education & International Development 2015 Official Conference Proceedings. iafor

Test Validity. What is validity? Types of validity IOP 301-T. Content validity. Content-description Criterion-description Construct-identification

MindmetriQ. Technical Fact Sheet. v1.0 MindmetriQ

Wisc Iv Wechsler Intelligence Scale For Children Iv

Clustering Autism Cases on Social Functioning

Chapter Fairness. Test Theory and Fairness

What to do if you score low on an IQ test?

CHAPTER VI RESEARCH METHODOLOGY

Measurement and Descriptive Statistics. Katie Rommel-Esham Education 604

PTHP 7101 Research 1 Chapter Assignments

The Predictive Validity of Eye Movement Indices for Technical School Qualifying Test Performance

INTRODUCTION. History of Intelligence

The influence of sensory modality on temporal information processing: Thomas Rammsayer

Making a psychometric. Dr Benjamin Cowan- Lecture 9

Interpreting change on the WAIS-III/WMS-III in clinical samples

Appendix B Statistical Methods

Journal of Pediatric Psychology Advance Access published May 4, 2006

Progressive Matrices

IMPORTANT: Upcoming Test

Development of self efficacy and attitude toward analytic geometry scale (SAAG-S)

Work Personality Index Factorial Similarity Across 4 Countries


DAT Next Generation. FAQs

3-86 Psychological Tests and Evaluation Procedures ^ General Ability Measures

Gender Differences in Adolescent Ego. Development and Ego Functioning Level

Testing and Individual Differences UNIT 11

Associate Prof. Dr Anne Yee. Dr Mahmoud Danaee

The Personal Profile System 2800 Series Research Report

The Youth Experience Survey 2.0: Instrument Revisions and Validity Testing* David M. Hansen 1 University of Illinois, Urbana-Champaign

CHAPTER ONE CORRELATION

A COMPARISON OF THE WISC-R. McCARTHY SCALES OF CHILDREN'S ABILITIES

Lecture No: 33. MMPI (Minnesota Multiphasic Personality Inventory):

DATA GATHERING. Define : Is a process of collecting data from sample, so as for testing & analyzing before reporting research findings.

Inspection time and IQ Fluid or perceptual aspects of intelligence?

Development and Psychometric Properties of the Relational Mobility Scale for the Indonesian Population

Concurrent validity of WAIS-III short forms in a geriatric sample with suspected dementia: Verbal, performance and full scale IQ scores

found to have some potential as predictors of the professional activities and accomplishments of first-year graduate students in psychology; the

Cattell Culture Fair Intelligence Test Manual

Abstract Reasoning Test 1

Reliability Evidence Validity Evidence. Criterion: Other (e.g.,

The Assessment of General Knowledge Online: A cross-cultural study using two platforms.

Introduction to Test Theory & Historical Perspectives

Psychology in Your Life

FACTOR ANALYSIS Factor Analysis 2006

Spatial Visualization Measurement: A Modification of the Purdue Spatial Visualization Test - Visualization of Rotations

Teaching Social Skills to Youth with Mental Health

Transcription:

Running head: CONSTRUCT VALIDITY STUDIES OF THE TLAP-R 1 DRAFT Construct Validity Studies of the TLAP-R Xavier Jouve Cogn-IQ.org

CONSTRUCT VALIDITY STUDIES OF THE TLAP-R 2 Abstract The TLAP-R is presumed to yield with the measurement of a second-stratum latent trait chiefly composed of fluid intelligence. Based on rationale of the author, it has been hypothesized that loadings on the fluid intelligence factor and on other cognitive ability factors could vary according to the three-stratum theory. Three samples of participants who took criterion measures were analyzed using factor analysis. Investigations were conducted on groups of 67, 66 and 22 participants who took the following criterion tests: the Advanced Progressive Matrices (APM), the Scholastic Aptitude Test (SAT), the Slosson Intelligence Test-Revised (SIT-R) and a Short Term Memory Retention Test (STMRT). Age of participants has also been used. Results suggested that the measurement yielded by the TLAP- R could be interpreted as a representation of fluid intelligence, with a good predictive validity of crystallized intelligence without much significant aging effect. Keywords: TLAP-R, SAT, APM, SIT-R, fluid intelligence, crystallized intelligence, short term memory, factor analysis, aging effect

CONSTRUCT VALIDITY STUDIES OF THE TLAP-R 3 Construct Validity Studies of the TLAP-R The TLAP-R (Jouve, 2013a) is an untimed, non-verbal reasoning test prepared with perceptual material. It consists in 8 matrices, each one including 6 lines of 15 patterns of which 4 have been deleted. The examinee is asked to find the 4 missing patterns. The first version (limited to 5 matrices) of this test has been prepared by Xavier Jouve in mid 2000 for a research project. The chiefly involved cognitive process mostly depends on fluid intelligence and the test is minimal-knowledge based. The material used to prepare the matrices is numerical but not heavily mathematically based so that the TLAP-R is suitable to assess any person with only basic arithmetic knowledge. In order to complete the matrices, to find which patterns are missing, the TLAP-R requires inductive reasoning. The task requires figuring out a specific logical and governing rule from a chaotic situation. Spearman (1927) described this process and considered it as the requisite for the eductive part of g. However, the TLAP-R showed a strong relationship with crystallized intelligence even without measuring it directly (Jouve, 2013b). Construct validity is the examination of the theoretical background of the test (Pennington, 2003) by the rationale of the author or through analysis of the empirical data collected. A test is always the empirical representation of the theoretical idea presented by the test designer. If the theoretical ideas of the person who created the test are not correct, construct validity will not be demonstrated. However, if the experiment has been prepared according to a proper rationale, it will demonstrate construct validity. There are two ways to give an indication of a test s construct validity. On the one hand, the first method that is issued form differential psychology is an analytic method, which tends to

CONSTRUCT VALIDITY STUDIES OF THE TLAP-R 4 give an approach based on dimensional measures. In this manner, the construct validity of the assessment is checked if the given measure is similar to the one from a test that is supposed to be tapping the same latent construct. Correlations with equivalent and non-equivalent known, usually published, psychometric tests and factor analytic studies are required to give support for the concrete validity. On the other hand, the cognitive psychology method for establishing construct validity does not require empirical validity. In fact, the cognitive branch of psychology locates its main interest in the understanding of the functions and processes of the intellect. The specific procedure used by a cognitive psychologist is directed to establishing functional correspondence between the experimental test and another, which is supposed to be an equivalency. If both tests have sufficient equivalent parameters, one can conclude that the experimental test has good theoretical validity. The rationale of the TLAP-R is above of all a measure of culture-fair reasoning and thus taps a non-verbal type of intellectual ability. We know that most of the non-verbal reasoning abilities are parts of the fluid part of g. In correspondence with this hypothesis, the TLAP-R therefore must be highly loaded on the fluid intelligence factor (2F). However, according to Gustaffson (1984) this factor is quite similar to the general intelligence factor (3G), consequently the TLAP-R must not be totally independent from the crystallized intelligence factor (2C) and other group factors (Carroll, 1993). Moreover, as the group factors are ranked in the second stratum in the order of their correlation with 3G, the TLAP-R must have higher correlations with tests that are closer to it than tests that are more distant from it. For example, if we suppose that the TLAP-R is ranked in the second stratum close to the 2F factor, its correlations must be higher with a test measuring 2C

CONSTRUCT VALIDITY STUDIES OF THE TLAP-R 5 than with a test of general memory (2M). Thus its loadings on factors such as fluid and crystallized intelligence, or on a corresponding first stratum vector, must be higher than its loading on any memory related latent trait. Method Participants Table 1 displays background information for the samples used in this study. All samples were collected as a part of the reliability analyses and the validation of the TLAP-R. The information shows that the first two samples were of equivalent sizes including 67 and 66 persons each, while the third one has been of a smaller size with 22 participants. In the first cohort of participants, 56.25% attended college. Among them, 10.94% had completed 5 to 8 years at university (Ph.D. s represented 1.56% of them). As can been seen in Table 1, a slight majority of these people were females (56.71%). This sample was almost equally divided into students (49.21%) and persons having ended their formal education. The majority of second sample subjects reported having completed or studying towards a college degree: 66.67%, among which, respectively 19.05% and 17.46% mentioned their highest educational level as being a first year, or a second year at university. Post-graduates were 7.94% Masters and 3.18% Ph.D. s. An important part of this sample was composed of students from a variety of academic levels (70.79%). The third and last group was mainly composed of males (66.67%). To the same extend and fortuitously, the number of students (66.67%) was the largest among occupations given by the participants. 61.91% of subjects were pursuing their college degree or had ended studying

CONSTRUCT VALIDITY STUDIES OF THE TLAP-R 6 after having completed a college programme. The highest education reported was a Bachelor degree (9.54%). The range of people used for this study was 16 24 years of age. Measure The criterions that have been chosen are widely known and are frequently used in differential and experimental psychology. They have a proper amount of academic literature to show evidence in their behalf. Advanced Progressive Matrices (APM). The APM (Raven, Raven & Court, 1998a) is a nonverbal figurative reasoning test divided into two sets of items. Only the second set has been used for the purposes of this study. It consists in 36 matrices of 3 lines and 3 columns each. The last pattern of the third line is missing, and 6 choices are given to the subject. Only one of these can logically be chosen to complete the matrix. The APM compose the test of hardest difficulty among the Raven s Progressive Matrices: the Colored Progressive Matrices (CPM) (Raven et al. 1998b) are designed to assess children, the Standard Progressive Matrices (SPM) (Raven et al. 2004) aim at appraising average adults and the Advanced Progressive Matrices are dedicated to measure ability in above average adults. Raven s tests of matrices have provided with strong evidence over years of being very appropriate measures of fluid intelligence and are usually used as a criterion when the validity of an experimental questionnaire needs to be checked. According to the test manual, the APM scores Cronbach Alpha reliability was.87 in a cohort of 1,015 15 years old German students. Additional data from diverse researches reported split-half coefficients between.83 and.87 (Jensen, Larson & Paul, 1988; Lapsley & Enright, 1979; Paul, 1985; Poortinga, 1972). These are

CONSTRUCT VALIDITY STUDIES OF THE TLAP-R 7 satisfactory values and indicate a good level of reliability in test scores (Aiken, 2000; Nunnally & Bernstein, 1994). Evidence for close relationships between the APM and other cognitive ability tests are numerous: as a relevant example, we can cite Paul (1985) who reported a correlation of.84 (corrected for restriction of range, N = 300) with the Performance IQ of the revised Wechsler Adult Intelligence Scale (WAIS-R) (Wechsler, 1981). Scholastic Aptitude Test (SAT). The SAT (College Board, 2012) is a standardized, threehour test that measures verbal and mathematical reasoning abilities that students develop over time, both in and out of school. Many colleges and universities use the SAT for admission purposes in order to help predicting successful performance in college. Moreover, the SAT, which was initially developed after an IQ test (Lemann, 1999) and despite of successive revisions, is still strongly correlated to traditional intelligence measures (Jouve, 2010, 2011). Frey and Detterman (2003) reported a correlation of.72 between the recentered SAT and the APM. Additionally, these authors indicated that the pre-recentered SAT correlated from.56 to.82 with intelligence measures, with most correlations being higher than.70 in magnitude. Likewise, a study conducted by Raz, Willerman, Ingmundson, and Hanlon (1983) resulted in a correlation between the SAT and the Culture Fair Intelligence Test (CFIT) (Cattell, Krug, & Barton, 1973) of.81. The post recentering SAT, in use between 1995 and 2005 and from which the scores have been utilized in this study was divided into two sections, (i) a verbal part with emphasis on critical reading in which vocabulary was tested in the context of reading passages and in analogy and sentence-completion questions and (ii) a mathematical part with emphasis on data interpretation and applied math questions in which calculators were permitted but not required.

CONSTRUCT VALIDITY STUDIES OF THE TLAP-R 8 Short Term Memory Retention Test (STMRT). The STMRT (Jouve, 2000) was a test developed after Peterson and Peterson paradigm (1959) with the aim of measuring the retention of verbal items in short-term memory. The examinee needed to look at ten lines of four unrelated letters each during two and half minutes, with the instruction of memorizing those quadrigrams. As a matter of interest, the original paradigm used trigrams instead. Once the time was over, and during an equivalent period of time the subject was asked to take a paper-and-pencil task of symbol search as a manner of distracting memorization before having to recall the letters as best possible. The symbol search part of the test was prepared with three given symbols to be encircled each time they were repeated into a 40 40 matrix of look-alike items. Slosson Intelligence Test Revised (SIT-R). The SIT-R (Slosson, Nicholson & Hibpsham, 1991; Slosson, 1998) is a test prepared for evaluating crystallized verbal intelligence in native English (children and adults). The 187 SIT-R items are derived from the following cognitive domains: Information, Comprehension, Arithmetic, Similarities and Differences, Vocabulary and Auditory Memory. Standardized on 2,000 individuals, approximating the contemporary U.S. census, the SIT-R uses a deviation IQ (SD = 16). The SIT-R provides a complement to other educational assessments that look at learning ability, readiness or achievement. It has a noticeable scores reliability (Spearman-Brown =.97 in the entire sample) and gives an indication of a wide range of cognitive abilities. The technical manual for the test reports strong correlations (between.80 and.90) with other IQ scales such as the Weschler Intelligence Scale for Children (WISC) (Weschler, 1991) or the Stanford-Binet Intelligence Scale (SBIS) (Thorndike, Hagen & Sattler 1986). Kunen, Overstreet & Salles (1996) reported a correlation of.92 with the recommended abbreviated battery of the SBIS in 191 mentally challenged patients. Furthermore, the pre-

CONSTRUCT VALIDITY STUDIES OF THE TLAP-R 9 revised SIT was seen to very highly correlates with the SBIS Form LM (Terman & Merrill, 1960) with a Pearson product moment formula of.92 in 724 students (Armstrong, 1971) and showed a range-restriction corrected value of.73 (raw r =.61, N = 98; Baum, 1979) with the Wechsler Preschool and Primary Scale of Intelligence (WPPSI) (Weschler, 1967). Results Factor analysis is a common method of examining the patterns of relationships among a set of variables and is a widely used analytic approach in order to investigate the existence and the structure of any latent constructs among a set of items, or tests (Cronbach, 1990; Kamphaus, 2001). For exploring the presence of latent traits among the TLAP-R and the criterion measures, the method of principal components was chosen, with an orthogonal rotation for interpreting the results. The method employed for rotating factors was the varimax method (Kaiser, 1958), which is unquestionably the most frequently and most popular rotation method by far. Varimax rotation simplifies interpretation because each one of the variables is associated with a small number of large loadings and a large number of null to small loadings. As a matter of consequence, extracted factors only represent few variables while they have been only matching few factors. The first unrotated factor is usually interpreted as g by researchers. However, the nature of any latent trait is determined by the nature of the clusters involved in the study: for example, a principal factor among a set of verbal tasks could be reasonably interpreted as general verbal ability. Study 1. Table 2 shows the results of the principal components analysis (PCA) of the first sample. This sample included 67 scores from the SIT-R and the STMRT along with the TLAP-R. As can be seen, the loadings on the first factor of the TLAP-R and the SIT-R were

CONSTRUCT VALIDITY STUDIES OF THE TLAP-R 10 somehow equivalent. The figures (.90 and.89 respectively) were very high and indicated that both measures were strongly loaded. At a lower level, the working memory task showed a moderate loading on the principal factor (.44). This factor explained 59.91 percent of the protocol variance. Figure 1 graphically illustrates these findings. A second PCA has been performed with the same variables, but including also the Age of examinees. Factors eigenvalues were 1.82, 1.19, 0.74 and.25. The use of varimax rotation of the vectors helped exploring the results. These are plotted in Figure 2. In ascending order, loadings of the four variables on the first rotated factor were.94 for the TLAP-R,.88 for the SIT- R,.12 for the STMRT and.08 for the age of participants. The second factor of this analysis was a clear reflect of Age, in which it loaded at.98. Both the TLAP-R and the STMRT were not significantly loaded (-.10 each) by this trait while the SIT-R has been shown a low positive loading (.27). The third latent construct was correlated at.99 with the STMRT and from the fact was identified as a representation of short term memory retention. Furthermore, the use of a two-dimensional factor structure made sense, and we observed a first factor on which loadings were as follows:.93 for the SIT-R,.87 for the TLAP-R,.35 for the STMRT and.27 for the age. The second factor was characterized by a strong positive loading of the age (.82) while the STMRT has been strongly loaded as well, but negatively (-.70). Study 2. Plots from a PCA conducted on both the TLAP-R and the 40 minutes timed APM raw scores along with the age of 66 subjects are shown in Figure 3. Values resulted from a varimax rotation. Factors 1 and 3 portrayed the APM and the TLAP-R respectively. Loadings of both tests on their matching factor were.89, and each one loaded.45 on the factor that sketched the other one. The second factor was undoubtedly the age of persons gathered in this sample.

CONSTRUCT VALIDITY STUDIES OF THE TLAP-R 11 Age loading on this one was perfect (1.00). However, it did not show any significant loading on the two other latent traits (.01 on factor 1 and.03 on factor 3). Prior to apply varimax rotation, the APM and the TLAP-R appeared equivalently loaded on the principal factor with the same value of.95. Loading of the age of test-takers was.11. Eigenvalues suggested the use of an exploratory two-factorial solution as the value for the first factor was 1.81 (60.28% of the variability), and that of the second factor was 1.00 (33.15% of the variability). Study 3. Another factor analysis has been performed on the scores of 22 individuals, aged 16-24, who took the SAT (recentered) prior to the TLAP-R. Results of confirmatory and exploratory analyses are displayed in Table 3. As shown in Figure 5, unrotated principal factor loadings yield a similar level of proximity for the APM (.92) and the TLAP-R (.90). Although a little less, the verbal reasoning scale of the SAT was also highly loaded (.66) on the principal factor. Exploratory investigation of the latent constructs among this set of variables revealed a first factor on which the TLAP-R loaded the most (.90) and the mathematical reasoning scale of the SAT loaded significantly (.45). Factor 2 of this solution modeled the verbal reasoning scale of the SAT. In fact, the SAT V accounted for a value of.97. After adding the age of persons whose scores were collected, a new exploratory factor analysis has been carried out. Eigenvalues were 2.09, 1.15,.56 and.20. Findings for a trifactorial solution are exposed in Figure 4. The TLAP-R loaded at.93 and the SAT M loaded at.92 on the first factor. The SAT V showed only small positive correlation on it (.27). Nevertheless, the SAT V loaded at.96 on the third factor. The second factor matched almost completely the age of testees (.99).

CONSTRUCT VALIDITY STUDIES OF THE TLAP-R 12 The two-factorial solution gave enlightenment: the first factor represented almost equivalently the SAT M and the TLAP-R which the loading was.93 and.92 respectively. The second factor mimicked very well the age of participants, as age loading was.92. Concerning the SAT V, it showed comparable loadings on both, with.60 and.56. Discussion The purposes of this article were to examine the relationships between the TLAP-R and other measures of cognitive ability along with age of participants, with the aim of investigating the construct validity of the TLAP-R. In order to perform this duty, three independent studies using three different samples were conducted. Specifically, two questions were addressed. The main question was about the evidence of convergent validity between the TLAP-R and fluid intelligence factor measured by the APM. The results presented in this article strongly suggest a very close linkage between both, as the TLAP-R and the APM were seen to tap on very closely related domains. In addition to the Pearson product moment correlation between the TLAP-R and the 40 minutes timed APM given by Jouve (2013b), this study provided with even more evidence of the TLAP-R being a measure of 2F. The second question addressed in this research was the following: is there sufficient evidence to consider the TLAP-R as an indication of a wide range of mental abilities? The results did show evidence of a close relationship with crystallized intelligence as measured by the SIT-R and also with reasoning in scholastic domains, both verbally and non-verbally, as assessed by the college entrance SAT. The TLAP-R showed clear proximity with the mathematical reasoning scale of the SAT, and moderate proximity with verbal reasoning part of this battery. Moreover, findings suggested that the TLAP-R is a relatively fair instrument in regard to the age of

CONSTRUCT VALIDITY STUDIES OF THE TLAP-R 13 examinees and to a negative effect induced by aging such the decline in short term memory. In other words, where the ability to retained information in short term memory and to recall it efficiently was inversely proportional to the age (both variables being at opposite extrema of the same factor), the TLAP-R remained unimpacted.

CONSTRUCT VALIDITY STUDIES OF THE TLAP-R 14 References Aiken, L. R. (2000). Psychological testing and assessment (10th ed.). Needham Heights, MA: Allyn & Bacon. Armstrong, R. (1971). Can Scores Obtained from the Slosson Intelligence Test be Used with as much Confidence as Scores Obtained from the Stanford-Binet Intelligence Scale? Paper presented at the Annual Meeting of the American Educational Research Association, New York, NY. Baum, D. D. (1979). An Investigation of the Predictive Validity of the Slosson Intelligence Test with Learning Disabled Kindergarten Children. Educational an Psychological Measurement, 79, 1067-1072. Carroll, J. B. (1993). Human Cognitive Abilities: A survey of factor-analytic studies. New York, NY: Cambridge University Press. Cattell, R. B., Krug, S. E., & Barton, K. (1973). Technical Supplement for the Culture Fair Intelligence Tests, Scales 2 and 3. Champaign, IL: Institute for Personality and Ability Testing. College Board (2012). SAT College Board The Most Widely Used College Admission Exam. Retrieved from http://sat.collegeboard.org/home. Cronbach, L. J. (1990). Essentials of psychological testing (5th ed.). Boston, MA: Addison- Wesley.

CONSTRUCT VALIDITY STUDIES OF THE TLAP-R 15 Frey, M. C., & Detterman, D. K. (2003). Scholastic Assessment Test and g: The Relationship Between the Scholastic Assessment Test and General Cognitive Ability. Psychological Science, 15(6), 333-378. Gustaffson, J. E. (1984). A Unifying Model for the Structure of Intellectual Abilities. Intelligence, 8, 179-203. Jensen, A. R., Larson, G. E. & Paul, S. M. (1988). Psychometric g and mental processing speed on a semantic verification test. Personality and Individual Differences, 9(2), 243-255. Jouve, X. (2000). Test de Rétention en Mémoire à Court Terme (Short Term Memory Retention Test, STMRT). Unpublished manuscript. Jouve, X. (2010). Principal components factor analysis for the JCTI and SAT relationship. Retrieved from http://www.cogn-iq.org/archives/28 Jouve, X. (2011). Correlations between the JCCES and other measures. (2nd ed.). Retrieved from http://www.cogn-iq.org/archives/633 Jouve, X. (2013a). TLAP: Revised. Retrieved from http://www.cerebrals.org/tlap/ Jouve, X. (2013b). Correlations between the TLAP-R and other measures. Retrieved from http://www.cogn-iq.org/archives/770 Kaiser, H. F. (1958). The varimax criterion for analytic rotation in factor analysis. Psychometrika, 23 (3), 187-200. Kamphaus, R. W. (2001). Clinical assessment of child and adolescent intelligence (2nd ed.). Needham Heights, MA: Allyn & Bacon.

CONSTRUCT VALIDITY STUDIES OF THE TLAP-R 16 Kunen, S., Overstreet, S., & Salles, C. (1996). Concurrent validity study of the Slosson Intelligence Test-Revised in mental retardation testing. Mental Retardation, 34(6), 380-386. Lapsley, D. K., & Enright, R. D. (1979). The effects of social desirability, intelligence, and milieu on an American validation of the Conservatism scale. Journal of Social Psychology, 107, 9-14. Lemann, N. (1999). The big test: The secret history of the American meritocracy. New York, NY: Farrar, Straus, & Giroux. Nunnally, J. C., & Bernstein, I. H. (1994). Psychometric Theory (3rd ed.). New York, NY: McGraw-Hill. Paul, S. M. (1985). The Advanced Raven s Progressive Matrices: Normative data for an American University population and an examination of the relationship with Spearman s g. Journal of Experimental Education, 54, 95 100. Pennington, D. (2003). Essential Personality. New York, NY: Oxford University Press. Peterson, L. R., & Peterson, M. J. (1959). Short Term Retention of Individual Verbal Items. Journal of Experimental Psychology, 58, 193-198. Poortinga, Y. (1972). A comparison of African and European students in simple auditory and visual tasks. In L. J. Cronbach and P. D. Drenth (Eds.). Mental Tests and Cultural Adaptation. The Hague: Mouton.

CONSTRUCT VALIDITY STUDIES OF THE TLAP-R 17 Raven, J., Raven, J. C., & Court, J. H. (1998a). Manual for Raven s Progressive Matrices and Vocabulary Scales. Section 4: The Advanced Progressive Matrices. San Antonio, TX: Harcourt Assessment. Raven, J., Raven, J. C., & Court, J. H. (1998b). Manual for Raven s Progressive Matrices and Vocabulary Scales. Section 2: The Coloured Progressive Matrices. San Antonio, TX: Harcourt Assessment. Raven, J., Raven, J. C., & Court, J. H. (2004). Manual for Raven s Progressive Matrices and Vocabulary Scales. Section 3: The Standard Progressive Matrices. San Antonio, TX: Harcourt Assessment. Raz, N., Willerman, L., Ingmundson, P., & Hanlon, M. (1983). Aptitude-related differences in auditory recognition masking. Intelligence, 7(2), 71-90. Slosson, R. L., Nicholson, C. L., & Hibpsham, T. H. (1991). Slosson Intelligence Test for Children and Adults.(Rev. ed. by C. L. Nicholson & T. H. Hibpsham). East Aurora, NY: Slosson Educational Publications, Inc. Slosson, R. L. (1998). Slosson Intelligence Test Revised (SIT-R) For Children and Adults, Technical Manual, Calibrated Norms Tables. East Aurora, NY: Slosson Educational Publications, Inc. Terman, L. M., & Merrill, M. A. (1960). Stanford Binet Intelligence Scale: Manual for the Third Revision Form L M with Revised IQ Tables by Samuel R. Pinneau. Boston, MA: Houghton Mifflin.

CONSTRUCT VALIDITY STUDIES OF THE TLAP-R 18 Thorndike, R. L., Hagen, E. P., & Sattler, J. M. (1986). Stanford-Binet Intelligence Scale (4th ed.). Itasca, IL: Riverside Publishing. Wechsler, D. (1967). Manual for the Wechsler Preschool and Primary Scale of Intelligence. San Antonio, TX: The Psychological Corporation. Wechsler, D. (1981). Wechsler Adult Intelligence Scale-Revised. San Antonio, TX: The Psychological Corporation. Wechsler, D. (1991). The Wechsler Intelligence Scale for Children (3rd ed.). San Antonio, TX: The Psychological Corporation.

CONSTRUCT VALIDITY STUDIES OF THE TLAP-R 19 Table 1 Descriptive Statistics for Study Samples Sample N Age Range Age Median Percentage Female Sample 1: SIT-R, STMRT & TLAP-R Sample 2: APM & TLAP-R Sample 3: SAT M, SAT V & TLAP-R 67 13-57 23.40 56.71 66 14-45 22.11 46.97 22 16-24 20.00 33.33 Note. APM = Advanced Progressive Matrices; SAT M = Scholastic Aptitude Test-Recentered Mathematical reasoning scale; SAT V = Scholastic Aptitude Test-Recentered Verbal reasoning scale; SIT-R = Slosson Intelligence Test-Revised; STMRT = Short Term Memory Retention Test.

CONSTRUCT VALIDITY STUDIES OF THE TLAP-R 20 Table 2 Principal Components Factor Loadings of the SIT-R, the STMRT and the TLAP-R Measure Principal Factor Factor 1 (GR) Factor 2 (STMR) SIT-R.90.38.10 STRMT.44.07.97 TLAP-R.89.92.09 Note. SIT-R = Slosson Intelligence Test-Revised; STMRT = Short Term Memory Retention Test. Principal Factor values are the loadings on the first unrotated factor. Loadings for both Factor 1 and Factor 2 are those following varimax rotation and identified as General Reasoning (GR) and Short Term Memory Retention (STMR).

CONSTRUCT VALIDITY STUDIES OF THE TLAP-R 21 Table 3 Principal Components Factor Loadings of the SAT Reasoning Scales and the TLAP-R Measure Principal Factor Factor 1 (NV) Factor 2 (V) SAT M.92.45.22 SAT V.66.15.97 TLAP-R.90.90.17 Note. SAT M = Scholastic Aptitude Test-Recentered Mathematical reasoning scale; SAT V = Scholastic Aptitude Test-Recentered Verbal reasoning scale. Principal Factor values are the loadings on the first unrotated factor. Loadings for both Factor 1 and Factor 2 are those following varimax rotation and identified as Nonverbal ability (NV) and Verbal ability (V).

CONSTRUCT VALIDITY STUDIES OF THE TLAP-R 22 Figure 1. Two-dimensional projection of TLAP-R, STRMT and SIT-R unrotated factor patterns 1.00 0.75 STMRT 0.50 F2 (30.10 %) 0.25 0.00-0.25 SIT-R TLAP-R -0.50-0.75-1.00-1.00-0.75-0.50-0.25 0.00 0.25 0.50 0.75 1.00 F1 (59.91 %) Note. F1 = Factor 1, i.e. principal factor; F2 = Factor 2, i.e. short term memory; SIT-R = Slosson Intelligence Test-Revised.

CONSTRUCT VALIDITY STUDIES OF THE TLAP-R 23 Figure 2. Three-dimensional projection of the TLAP-R, SIT-R, STMRT & Age after varimax rotation 1.00 STMRT 0.80 Dimension 3 0.60 0.40 SIT-R 0.20 TLAP-R 1.00 0.00-0.20 Age 0.80 0.60 0.40 Dimension 2 0.20 0.00-0.20 0.20 0.00 0.40 0.60 0.80 Dimension 1 Note. SIT-R = Slosson Intelligence Test-Revised; STMRT = Short Term Memory Retention Test; Dimension 1 = Problem solving potency; Dimension 2 = Aging effect; Dimension 3 = Short term memory retention.

CONSTRUCT VALIDITY STUDIES OF THE TLAP-R 24 Figure 3. Three-dimensional projection of TLAP-R, APM & Age after varimax rotation 1.00 TLAP-R 0.80 APM Dimension 3 0.60 0.40 1.00 0.80 0.20 0.00 Age 0.80 0.60 0.40 Dimension 2 0.20 0.00-0.20 0.20 0.00 0.60 0.40 Dimension 1 Note. APM = Advanced Progressive Matrices; Dimension 1 = Speeded figurative problem solving; Dimension 2 = Aging effect; Dimension 3 = Unspeeded figurative problem solving.

CONSTRUCT VALIDITY STUDIES OF THE TLAP-R 25 Figure 4. Three-dimensional projection of the TLAP-R, SAT M, SAT V & Age after varimax rotation SAT V 1.00 0.80 0.60 Age Dimension 3 0.40 0.20 0.00 1.00 0.80 0.60 SAT M 0.40 TLAP-R 0.20 0.00 0.00 0.20 0.40 0.60 0.80 1.00-0.20 Dimension 2 Dimension 1 Note. SAT M = Scholastic Aptitude Test-Recentered Mathematical reasoning scale; SAT V = Scholastic Aptitude Test-Recentered Verbal reasoning scale; Dimension 1 = Nonverbal reasoning; Dimension 2 = Aging effect; Dimension 3 = Verbal reasoning.

CONSTRUCT VALIDITY STUDIES OF THE TLAP-R 26 Figure 5. Two-dimensional projection of TLAP-R, SAT M and SAT V unrotated factor patterns 1.00 0.75 SAT V 0.50 F2 (23.76 %) 0.25 0.00-0.25-0.50 SAT M TLAP-R -0.75-1.00-1.00-0.75-0.50-0.25 0.00 0.25 0.50 0.75 1.00 F1 (69.54 %) Note. F1 = Factor 1, i.e. principal factor; F2 = Factor 2, i.e. verbal reasoning; SAT M = Scholastic Aptitude Test-Recentered Mathematical reasoning scale; SAT V = Scholastic Aptitude Test-Recentered Verbal reasoning scale.