Chapter 6: Reliability and Validity of the Test


Curtis Welch
5 years ago

6.1 Introduction
6.2 Reliability of the Test
    Test-Retest Method
    Rational Equivalence Method
    Split-Half Method
6.3 Validity of the Test
6.4 Conclusion

6.1 Introduction

Standardized tests provide uniform procedures for administering and scoring the instrument. The same questions are asked each time the test is used, with a set of directions that specifies how the test should be administered. Psychological tests are among the most useful tools of educational research. They have been devised to evaluate or measure behaviour in a standardized way for the purposes of selection, classification, prediction and guidance, as well as for the evaluation of educational programmes.

For the purpose of educational research, standardized tests are the most commonly used because they are considered more objective than non-standardized ones. A standardized test is one that has specific directions for administration and scoring, a fixed set of test items, and has been administered to representative samples taken from the population for whom the test is intended, for the purpose of establishing norms. Thus, the chief value of standardized tests for research purposes in education lies in their use as tools of comparison. It may be noted that standardized tests are as objective as possible: their scoring is unambiguous and simple.

A researcher must evaluate the validity and reliability of the test. In the present chapter, the investigator describes at length the procedure used to find out the validity and reliability of the self-made vocabulary test.

6.2 Reliability of the Test

A standardized test is expected to be reliable, so that there is no doubt about the results of the test itself. A data collection test must be reliable; that is, it must have the ability to consistently yield the same results when repeated

measurements are taken of the same individuals under the same conditions. Reliability refers to the consistency of measurement, the extent to which the results are similar over different forms of the same instrument or occasions of data collecting.

In the words of Freeman (1965), "The term reliability has two closely related but somewhat different connotations in psychological testing. First, it refers to the extent to which a test is internally consistent, that is, consistency of results obtained throughout the test when administered once. In other words, how accurately is the test measuring at a particular time? Second, reliability refers to the extent to which a measuring device yields consistent results upon testing and retesting. That is, how dependable is it for predictive purposes?" [1]

A test is reliable to the extent that it measures whatever it is measuring consistently. In tests that have a high coefficient of reliability, errors of measurement have been reduced to a minimum. Reliable tests are stable in whatever they measure and yield comparable scores on repeated administration.

According to F.H. Brown (1983), reliability is "How consistently a test measures over time, occasions or samples of items; the degree to which test scores are influenced by measurement errors. Indices include the reliability coefficient and the standard error of measurement." [2]
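The two indices named in Brown's definition are linked by the usual textbook relation SEM = SD × sqrt(1 − r), where SD is the standard deviation of the total scores and r is the reliability coefficient. The following is a minimal Python sketch of that relation; the function name and the numbers are hypothetical and are not taken from the present study.

```python
from math import sqrt


def standard_error_of_measurement(sd_total, reliability):
    """Return SEM = SD * sqrt(1 - r) for a test.

    sd_total    -- standard deviation of the total scores
    reliability -- reliability coefficient r, expected in [0, 1]
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return sd_total * sqrt(1.0 - reliability)


# Hypothetical values for illustration only (not the study's data).
sem = standard_error_of_measurement(sd_total=10.0, reliability=0.91)
print(round(sem, 2))
```

With these illustrative values the SEM is about 3 score points; the higher the reliability coefficient, the smaller the SEM for a given SD.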

In the words of Charles Jackson (1982), "Reliability is the degree to which test scores are free from errors of measurement." [3]

According to John W. Best and James V. Kahn (2005), "The reliability of a test may be raised by increasing the number of items of equal quality to the other items. Carefully designed directions for the administration of the test with no variation from group to group, providing an atmosphere free from distractions and one that minimizes boredom and fatigue, will also improve the reliability of the testing instrument." [4]

From the above statements, we may sum up as follows. A test may be reliable even though it is not valid. However, for a test to be valid, it must be reliable. That is, a test can consistently measure (reliability) nothing of interest (be invalid), but if a test measures what it is designed to measure (validity), it must do so consistently (reliability).

There are four procedures in common use for assessing the reliability of a test: (1) the test-retest method, (2) the alternate or parallel forms method, (3) the split-half method and (4) the rational equivalence method. Of these, three are used to find the reliability of the test in the present study: (1) the test-retest method, (2) the rational equivalence method and (3) the split-half method.
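Best and Kahn's remark that adding items of equal quality raises reliability is commonly quantified with the general Spearman-Brown prophecy formula. The worked equation below is an illustrative sketch only: the symbols (k for the factor by which the test is lengthened, r_11 for the reliability of the original test) and the numbers are assumptions chosen for the example, not values from the present study.

```latex
r_{kk} = \frac{k\, r_{11}}{1 + (k - 1)\, r_{11}},
\qquad
\text{e.g. } k = 2,\ r_{11} = 0.60:\quad
r_{22} = \frac{2(0.60)}{1 + 0.60} = \frac{1.20}{1.60} = 0.75
```

Doubling a test whose reliability is 0.60 with comparable items would thus be expected to raise the reliability to about 0.75.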

6.2.1 The Test-Retest Method

In this method the same test is re-administered shortly after the first administration, and the two sets of scores are correlated to obtain the reliability of the test. It is known as a very easy method of establishing the reliability of a test.

In the present study, the investigator selected the students of std. 8, 9 and group sample for the test-retest administration. The test was given to the same group of students after one week. The frequency distribution of the students' scores on the test is presented in Table 6.1.

Table 6.1 Reliability Value of Test-Retest (One Week)
Columns: Scores of the test (Class), Fy, Fx

According to Table 6.1, the test-retest correlation was calculated, and so the test-retest reliability of the

vocabulary test is very high and shows a positive correlation. So, it can be said that the test is reliable.

To establish the reliability value of the test further, it was given to the students of std. 9 after two weeks. The frequency distribution of the scores on the test is presented in Table 6.2.

Table 6.2 Reliability Value of Test-Retest (Two Weeks)
Columns: Scores of the test (Class), Fy, Fx

According to Table 6.2, the test-retest correlation was calculated, and the test-retest reliability is high. The correlation is significant at the 0.01 level, although its value is lower than that of the one-week test-retest. It can therefore be said that the reliability of the test is high. So, the efficiency of the students in

vocabulary test is stable, and the test has successfully measured the vocabulary of the students.

Rational Equivalence Method

This method of reliability was evolved to obtain an estimate of the reliability of a test that is free from the objections raised against the methods discussed above. Two forms of a test are defined as equivalent when corresponding items are interchangeable and when the inter-item correlations are the same for both forms. Two internal-consistency formulas developed by Kuder and Richardson are often used to obtain coefficients of equivalence for tests where one point is given for every correct answer and zero for a wrong answer.

Reliability by KR-21

Table 6.3 Frequency Distribution of the Difference between Odd and Even Items
Columns: Difference, Frequency; N =
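The tables in this part of the chapter report reliability under three names: Rulon, Flanagan and Kuder-Richardson (KR-21). The chapter does not print the formulas, so the Python sketch below assumes their standard textbook forms: Rulon, r = 1 − var(d)/var(t); Flanagan, r = 2(1 − (var(odd) + var(even))/var(t)); and KR-21, r = (k/(k − 1))(1 − M(k − M)/(k·var(t))), where d is the odd-minus-even difference, t the total score, M the mean total and k the number of items. Function names and data are hypothetical.

```python
from statistics import mean, pvariance


def rulon(odd, even):
    """Rulon: 1 - var(odd - even) / var(odd + even)."""
    diff = [a - b for a, b in zip(odd, even)]
    total = [a + b for a, b in zip(odd, even)]
    return 1 - pvariance(diff) / pvariance(total)


def flanagan(odd, even):
    """Flanagan: 2 * (1 - (var(odd) + var(even)) / var(odd + even))."""
    total = [a + b for a, b in zip(odd, even)]
    return 2 * (1 - (pvariance(odd) + pvariance(even)) / pvariance(total))


def kr21(total, n_items):
    """KR-21 from the mean and variance of total scores on n_items
    items scored 1 (correct) or 0 (wrong)."""
    m = mean(total)
    var = pvariance(total)
    return (n_items / (n_items - 1)) * (1 - m * (n_items - m) / (n_items * var))


# Hypothetical half-test scores for five students (not the study's data).
odd_scores = [5, 7, 3, 8, 6]
even_scores = [4, 8, 3, 7, 5]
totals = [a + b for a, b in zip(odd_scores, even_scores)]

print(round(rulon(odd_scores, even_scores), 3))
print(round(flanagan(odd_scores, even_scores), 3))
print(round(kr21(totals, n_items=20), 3))
```

Rulon's and Flanagan's formulas are algebraically equivalent, so they give the same coefficient for the same data; KR-21 uses only the mean and variance of the total scores and is generally a more conservative estimate.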

Table 6.4 Reliability and SE (Rulon)
Columns: SD of difference, SD of total score, rtt, SEr

Table 6.5 Reliability and SE (Flanagan)
Columns: N = 90, SD of odd numbers, SD of even numbers, SD of total scores, rtt, SEr

Table 6.6 Reliability and SE (Kuder-Richardson, KR-21)
Columns: Mean, SD of total score, rtt, SEr; N = 90

According to Table 6.5, the mean and the SD were calculated. From these values, the value of r

was calculated, which indicates a positive correlation. So, it can be said that the test has achieved good reliability.

Split-Half Method

In this method, the test is divided into two equivalent halves and the scores on one half of the items are correlated with the scores on the other half. From the reliability of the half-test, the self-correlation of the whole test is then estimated by the Spearman-Brown prophecy formula. The items of the test can be divided into two sets in a variety of ways. This method measures the internal reliability of the test: if the two halves do not correlate highly, it suggests that they are not measuring the same thing. Moreover, the method has the advantage of controlling fatigue and practice effects.

According to B.K. Tuckman (1975), split-half reliability is "A coefficient of reliability obtained by correlating scores on one half of a test with scores on the other half and applying the Spearman-Brown formula to adjust for the doubled length of the total test. Generally, but not necessarily, the two halves consist of the odd-numbered and the even-numbered items." [5]

For the present test, the odd-numbered and even-numbered questions were divided into two parts. A sample of 90 students was taken for the present calculation. The frequency distribution of the scores obtained is described in Table 6.7.
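The split-half procedure described above (score the odd and even items separately, correlate the two half-scores, then step the correlation up to full length) can be sketched in Python as follows. This is an illustration under stated assumptions, not the investigator's actual computation: the function names are hypothetical, items are assumed to be scored 1 or 0, and the response matrix is invented.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def step_up_to_full_length(r_half):
    """Spearman-Brown correction for a test twice the half-test length."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(item_scores):
    """item_scores: one row per student, one 0/1 entry per item."""
    odd = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    return step_up_to_full_length(pearson_r(odd, even))


# Hypothetical responses of four students to four items.
responses = [
    [1, 1, 1, 1],
    [1, 0, 1, 0],
    [0, 0, 0, 0],
    [1, 1, 0, 0],
]
print(split_half_reliability(responses))
```

The odd-even split is only one of the possible ways of halving the test; a different split will usually give a somewhat different coefficient.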

Table 6.7 Frequency Distribution of Odd and Even Numbers
Columns: Odd (Class Intervals, Frequency; Total 90), Even (Class Intervals, Frequency; Total 90)

According to Table 6.7, the split-half reliability was calculated, and from it the overall reliability value. By the split-half method, the reliability of the present test is good.

6.3 Validity of the Test

Validity refers to the degree to which evidence and theory support the interpretation of the test scores entailed by proposed uses of the test. That is, validity has to do with both the attributes of the test and the uses to which it is put. When tests are used for more than one purpose, there

needs to be evidence of validity for each of these uses. Considering the number and variety of tests and their uses, the type of validity evidence required varies. The test, as a data collection tool, must produce information that is not only relevant but also free from systematic errors; that is, it must produce valid information. In general, a test is valid if it measures what it claims to measure. A test, however, does not possess universal and eternal validity. It may be valid for use in one situation but invalid if used in another.

Lee J. Cronbach (1964) says that "A test which helps in making one decision in a particular research situation may have no value at all for another. This means that a researcher should not ask the general question, 'Is this a valid test?' The pertinent question to ask is 'How valid is this test for the decision I wish to make?' or, more generally, 'For what decision is this test valid?'" [6]

In the words of W.J. Bramble (1984), validity is "The degree to which a test measures what it is supposed to measure, or to which the results of experiments are attributable to the treatments." [7]

There are different types of validity, including (1) content validity, (2) functional validity, (3) factorial validity and (4) concurrent validity. For the present study, the investigator used the following types of validity to establish the validity of the vocabulary test.

Functional Validity

In the present study, the investigator checked the validity of the test by comparing the scores on the MIQ test of Dr. J.H. Shah with the scores on the vocabulary test. The correlation between the scores on the two tests was calculated, and the r value is appropriately high for the present test. The r value is significant at the 0.01 level. This shows that the test is valid.

The investigator also checked the validity of the present test with the help of the marks in the English subject in the first term exam of the students of std. 10. The correlation between the scores on the vocabulary test and the marks in English was calculated, and the r value is appropriately high. So, it can be said that the test is valid according to the correlation with these marks. The r value is significant at the 0.01 level and indicates a significant positive correlation. This shows that the test is valid.

Concurrent Validity

A teacher's rating scale was prepared according to the scores on the vocabulary test and the efficiency of the students. The rating scale was given to the teachers who taught the English subject in the school. It was prepared for 50 students of std. 10. The χ² was calculated for the rating scale and is presented in Table 6.8.

Table 6.8 Efficiency of the Students in the Vocabulary Test and the English Subject Teacher's Rating Scale
Columns: Class intervals; Very Good, Good, Medium, Poor, Very Poor; Total

According to Table 6.8, the value of χ² comes to 90. From it, the correlation was calculated with the help of the following formula:

$C = \sqrt{\dfrac{\chi^2}{N + \chi^2}}$

Here, χ² = chi-square, N = number of students, C = correlation (the contingency coefficient).

From the above frequencies, the C value was calculated and is significant at the 0.01 level, which means that the present test is valid.
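The formula above is the contingency coefficient computed from a chi-square statistic. As a minimal Python sketch of how it could be obtained from a table of frequencies (rows for score class intervals, columns for rating categories), the code below first computes χ² from observed and expected frequencies and then C. The function names and the small table are hypothetical and do not reproduce Table 6.8.

```python
from math import sqrt


def chi_square(table):
    """Return (chi2, n) for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:  # skip cells in empty rows or columns
                chi2 += (observed - expected) ** 2 / expected
    return chi2, n


def contingency_coefficient(chi2, n):
    """C = sqrt(chi2 / (n + chi2))."""
    return sqrt(chi2 / (n + chi2))


# Hypothetical 2 x 2 table of frequencies (not the study's data).
observed = [
    [10, 0],
    [0, 10],
]
chi2, n = chi_square(observed)
print(chi2, n, round(contingency_coefficient(chi2, n), 4))
```

Because χ² is added to N in the denominator, C always stays below 1, and its attainable maximum depends on the number of rows and columns in the table.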

6.4 Conclusion

In the present study, the raw scores of the test were analyzed using different statistical methods so that the usability of the test could be established at length. The reliability and validity of the test were found with the help of different methods, as important steps in the standardization of the test. It is a notable achievement for the present test that both its validity and its reliability values are high. It is therefore acceptable to say that the standardization process has established the reliability and validity of the present vocabulary test scientifically, and on that basis the test can be said to be standardized.

References

1. Frank S. Freeman (1965); Theory and Practice of Psychological Testing, New Delhi: Oxford and IBH Publishing Co., Indian edition, p.
2. F.H. Brown (1983); Principles of Educational and Psychological Testing (Third Edition), New York: Holt, Rinehart and Winston, p.
3. Charles Jackson (1982); Understanding Psychological Testing, Bombay: Jaico Publishing, p.
4. John W. Best & James V. Kahn (2005); Research in Education (Ninth Edition), New Delhi: Pearson Education (Singapore) Pvt. Ltd., Indian Branch, p.
5. B.K. Tuckman (1975); Measuring Educational Outcomes: Fundamentals of Testing, New York: Harcourt Brace Jovanovich, International Edition, p.
6. Lee J. Cronbach (1964); Essentials of Psychological Testing, New York: Harper and Row, International Edition, p.
7. W.J. Bramble (1979); Understanding and Conducting Research (Second Edition), Singapore: McGraw-Hill Book Co., p.
