Chapter 20: Test Administration and Interpretation

Thought Questions Why should a needs analysis consider both the individual and the demands of the sport? Should test scores be shared with a team, or should they be kept confidential? What is more important, reliability or validity? Under what general conditions does a medical evaluation become more important? What is a bell curve? Will the athletes you test on a team follow the bell curve?

Testing and Measurement Testing is where initial decisions are made about the exercise prescription, including frequency, intensity, and volume. Results are used to evaluate performance and make decisions regarding the future of a program or individual. Test scores may be used in a research environment as part of an in-depth analysis of a sport or activity. Testing determines where an individual currently stands regarding his or her training status and, more importantly, where he or she needs to be headed.

Testing and Measurement (cont.) The final outcome of any training program is to arrive at a peak level of performance or to achieve some predetermined goal. Using the results of a properly designed and implemented testing and measurement protocol will enable the tester/coach to make objective decisions regarding an athlete's program.

Terms Used in Testing and Measurement Population: an entire group of individuals sharing some common characteristic. Subpopulation: a smaller sample drawn from the population, containing a manageable number of people from whom to obtain performance measures. Results are then used to extrapolate data for the total population. Test: a tool used to measure performance. Measurement: the quantitative score derived from the test.

Terms Evaluation: placing a value on the measurement derived from the test. Assessment: putting all three of the aforementioned events together. Choose a test, measure the score, and then make an evaluation based on a scale comparison. Normative scale: a post-measurement scale derived from the scores of a peer group. Generally determined by placing the highest score at the top and listing the remaining scores in descending order of magnitude. Criterion scale: an a priori scale whereby the break points are known prior to testing and each person must meet an established level of performance to achieve that value level.

Test Selection Tests chosen should be specific to the sport and to the population being tested. Physical tests include measurements of cardiovascular and respiratory function, strength, power, endurance, and anthropometry. Making the wrong decisions about an individual's physical state can cause severe consequences for both the evaluator and client. Utilization of spurious data will ultimately lead to erroneous conclusions.

Validity Validity is the most important aspect of any assessment procedure For a test to be valid, it must test what it is made to test Five major types of validity: Face Concurrent Content Construct Predictive

Five Types of Validity 1. Face validity: states that the test is logical on the surface 2. Content validity: states that the test includes material that has been taught or covered 3. Predictive validity: states that test scores can accurately predict future performance 4. Concurrent validity: states that the test is a measure of the individual's current performance level 5. Construct validity: states that the test measures some part of the whole skill

Reliability Reliability is the ability of a test to arrive at or near the same score upon repeated measurements in the absence of any intervention strategy A reliable test should result in consistent scores If reliability is high, then people scoring well on a first test should also score well on repeated tests Pearson r: establishes reliability on a continuum ranging between 1.0 and -1.0 Reliability is not an either/or proposition but rather a floating scale of different levels A negative correlation still represents reliability but simply states that as one score increases, another score decreases, thereby inverting the rank order of participants
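
The Pearson r described above can be computed from two rounds of testing on the same athletes. The following is a minimal sketch, not a definitive implementation: the function name `pearson_r` and the vertical jump scores are hypothetical, chosen only to illustrate a test-retest comparison.

```python
from math import sqrt


def pearson_r(first, second):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(first)
    if n != len(second) or n < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(first) / n
    mean_y = sum(second) / n
    # Deviation of each score from its own trial mean
    dx = [x - mean_x for x in first]
    dy = [y - mean_y for y in second]
    spread_x = sqrt(sum(d * d for d in dx))
    spread_y = sqrt(sum(d * d for d in dy))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("r is undefined when all scores in a trial are equal")
    return sum(a * b for a, b in zip(dx, dy)) / (spread_x * spread_y)


# Hypothetical vertical jump scores (cm) for five athletes, tested twice
trial_1 = [50, 55, 60, 65, 70]
trial_2 = [52, 54, 61, 66, 69]
r = pearson_r(trial_1, trial_2)
print(f"test-retest r = {r:.2f}")
```

A value near 1.0 means athletes who scored well on the first trial also scored well on the second, which is the pattern a reliable test should show.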

Key Point A test may be reliable but not valid, while a valid test must always be reliable.

Assessment Should include medical and exercise history, obtain a physician release if necessary, perform physiological testing, determine baseline nutritional status, and determine program goals Objective of assessment is to provide a comprehensive view of the athlete, what he/she is capable of, what his/her limitations are, and any special needs that may need to be addressed

Assessment Should Include Medical history & PAR-Q: ask a few basic questions; use PAR-Q form. Physician release: medical release form describing each of the exercise components. Nutrition information: basic understanding of the person's nutritional status. Needs analysis: evaluate the needs of the athlete and the demands of the sport.

Needs Analysis Assessment of an individual's body fat, cardiovascular fitness, muscular strength/power, and flexibility. Needs of a particular sport or activity are unique and require different physiological performance levels of the athlete or client. Needs should include energy systems, duration of each repetition, duration of the entire event, muscles used, muscle actions used, range of motion, and speed of movement.

Tests for Measuring Strength & Power Wingate anaerobic cycle test Margaria-Kalamen stair climb test Isokinetic velocity spectrum Countermovement vertical jump One-repetition-maximum power clean One-repetition-maximum squat One-repetition-maximum bench press Bench-press body weight for total reps 40-yard sprint Standing long jump

Wingate Anaerobic Cycle Test Designed as a means to measure overall anaerobic lower extremity power. The athlete pedals as fast as possible for 30 seconds against a predetermined resistance. Resistance is derived by multiplying the athlete's body weight (kg) by the constant 0.075. Specialized cycle ergometers will record the athlete's peak and mean power during the 30-second test. A good overall assessment of lower extremity power, as it requires contributions from all major muscle groups of the lower extremities.
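
The resistance calculation for the Wingate test is a single multiplication, sketched below. The function name and the 80 kg example are hypothetical. The slide does not state the unit of the resulting load, so the sketch returns it in the same kilogram-based unit as the body weight, which is an assumption.

```python
WINGATE_LOAD_FACTOR = 0.075  # constant given in the text, applied per kg of body weight


def wingate_resistance(body_weight_kg):
    """Return the flywheel resistance for a Wingate test.

    Unit assumption: the load is expressed in the same kilogram-based
    unit as the body weight; the source does not state the unit.
    """
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    return body_weight_kg * WINGATE_LOAD_FACTOR


# Hypothetical 80 kg athlete
resistance = wingate_resistance(80.0)
print(f"resistance = {resistance:.2f}")
```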

Margaria-Kalamen Stair Climb Test A test of lower extremity power that is easy and quick to perform. Performed on a staircase, it is a good measure of lower-body power and explosiveness, which is important to athletes requiring speed, agility, and quickness. Measure the subject's body weight and calculate the height of each step (17.5 cm). Timing mats are placed on the third and ninth steps; they are connected to a timing device and are activated by the subject's body weight. The subject begins 6 m from the first step and then runs toward and up the steps, taking them three at a time. The result (power) is the product of body weight, the vertical distance between the third and ninth steps, and gravity, divided by the total time from the third to the ninth step.
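
The power calculation for the stair climb can be sketched as follows. This is an illustration under stated assumptions: body weight is entered as mass in kilograms, gravity is taken as 9.81 m/s² (the slide names gravity but gives no value), and the vertical distance is six step rises of 17.5 cm between the third and ninth steps. The function name and the example athlete are hypothetical.

```python
GRAVITY_M_S2 = 9.81      # assumed value; the text only says "gravity"
STEP_HEIGHT_M = 0.175    # 17.5 cm per step, from the text
STEPS_TIMED = 6          # step rises between the third and ninth steps


def margaria_kalamen_power(body_mass_kg, time_s,
                           step_height_m=STEP_HEIGHT_M,
                           steps_timed=STEPS_TIMED):
    """Power in watts: mass x gravity x vertical distance / time."""
    if body_mass_kg <= 0 or time_s <= 0:
        raise ValueError("mass and time must be positive")
    vertical_distance_m = steps_timed * step_height_m
    return body_mass_kg * GRAVITY_M_S2 * vertical_distance_m / time_s


# Hypothetical athlete: 75 kg, 0.50 s between the two timing mats
power_w = margaria_kalamen_power(75.0, 0.50)
print(f"power = {power_w:.0f} W")
```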

Isokinetic Velocity Spectrum Isokinetic testing involves controlling the velocity of a given movement dynamometer is set to a specified velocity at which the lever arm of the machine can be moved Torque (rotational force) and power produced during the movement are then recorded by the dynamometer. used to test a variety of joints in either seated or reclining positions Limited in their application to the movement patterns specific to an individual sport. They do, however, provide precise, highly reliable data that can easily be compared between athletes, injured and uninjured limbs, or agonist and antagonist muscle groups

Countermovement Vertical Jump The vertical jump is performed primarily utilizing the hip extensor and ankle plantarflexor muscle groups. The athlete is instructed to jump as high as he or she can following a quick downward squatting movement. The athlete's jump height can be recorded simply by measuring the difference between chalk markings placed first on a wall while standing on the ground and reaching and then at the highest point that the athlete can reach while performing the countermovement jump One limitation of the countermovement vertical jump test is the fact that technique often differs among athletes. Examiner must be careful that differences in scores on subsequent tests are due to training and not variations in the athlete's jumping technique

1RM Power Clean The power clean is an exercise performed using several upper extremity, lower extremity, and core muscle groups Because the exercise is designed for the athlete to lift the weight as quickly as possible, it is a good indication of ballistic strength The 1-RM power clean is considered to be the greatest load with which the athlete can perform one repetition of the movement with proper form. This test is useful for power athletes who are required to move explosively for short durations Protocol similar for bench and squat Not suitable for untrained persons

Standing Long Jump The standing long jump is also a good functional test that measures total lower-body power Although the vertical jump is used to measure vertical power, the long jump is used to measure horizontal power. As another test of lower-body power, it too is a good choice for athletes requiring speed, agility, and quickness. The test is very easy to conduct, since a tape measure is the only equipment needed.

Test Interpretation Final and most complicated portion is making an evaluation of the scores This involves placing a value on the recorded score so it may be used to help design a program of training specific to the individual Once test data are collected, they should be accurately interpreted to coach and athlete in terms of norms and expected improvement

Order Scales Main categories of test scores Nominal Interval Ratio Ordinal (sometimes)

Nominal Scores Assign a name to a category. Those that are coded for entry into a spreadsheet or statistical computer program. They represent the real item through the use of numbers. An example is coding gender as 1 for male and 2 for female or vice versa.

Interval Scores Those that do not possess an absolute zero. This means that zero does not represent the absence of that variable, and the scale may contain negative scores. The units of difference between scores are equal, but ratios between scores are not meaningful. One common form of interval scoring is temperature expressed in Fahrenheit degrees.

Ratio Scores Possess all traits of interval scoring. Have an absolute zero. This means a zero score is an absence of that variable, as in an elderly person's vertical jump score of zero inches. Contain no negative numbers, and there is an absolute scale of measurement between scores, such that lifting 50 lb is scored at half as much as lifting 100 lb.

Ordinal Scores Sometimes a fourth form of scoring by placing individuals in rank order More accurately described as a ranking system rather than a scoring scale Ordinal system ranks individual scores from top to bottom, highest to lowest, or greatest to least All other scales of measurement can be listed in ordinal rank

Mathematical Measures A few variables are associated with central tendency. Enable the investigator to draw conclusions from the collected data. Provide the tester with valuable information regarding the data as a whole and each individual's relationship to the overall group.

Variables Minimum and maximum: refer to the least score and the greatest score. Range: difference between minimum and maximum; represents the overall spread of scores as a single number; max - min. Sum: total of all scores added together; $X_1 + X_2 + X_3 + X_4$. N: symbol used to denote the number of people in the test group; used in conjunction with the sum to arrive at other important variables. Mean: the arithmetic average, derived from the sum divided by N; used most often; sum/N.

Variables (cont.) Median: the exact middle of the total number of scores; not affected by outliers, shows a position only, and is rarely used for statistical calculations Mode: the score that occurs most frequently; may be multiple modes or no mode at all; not affected by extreme scores and is not often used in statistical calculations
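
The variables defined on these two slides can all be computed in a few lines. The sketch below is illustrative only: the function name `describe` and the scores are hypothetical, and reporting an empty list when no score repeats is one way to represent "no mode at all".

```python
from collections import Counter
from statistics import median


def describe(scores):
    """Return the basic descriptive variables for a list of test scores."""
    if not scores:
        raise ValueError("need at least one score")
    n = len(scores)
    total = sum(scores)
    counts = Counter(scores)
    top_count = max(counts.values())
    # A score only counts as a mode if it occurs more than once
    modes = sorted(v for v, c in counts.items() if c == top_count) if top_count > 1 else []
    return {
        "min": min(scores),
        "max": max(scores),
        "range": max(scores) - min(scores),
        "sum": total,
        "n": n,
        "mean": total / n,
        "median": median(scores),
        "modes": modes,
    }


# Hypothetical vertical jump scores (inches) for five athletes
summary = describe([20, 22, 22, 25, 31])
for name, value in summary.items():
    print(f"{name}: {value}")
```

In this made-up group the single high score of 31 pulls the mean (24.0) above the median (22), which shows why the median is described as unaffected by outliers.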

Distribution of Scores Central tendency: the measure of the middle of a distribution of scores When the scores are presented graphically, they can then be analyzed using basic statistical assumptions Normal bell curve: the basic element and starting point of many statistical techniques It is a frequency histogram that plots the number of times a score occurs on the vertical, or Y axis, against the raw number itself on the horizontal, or X axis Scores increase as they move from left to right

Distribution of Scores (cont.) Although many statistical calculations are based on the bell curve, it almost never occurs in real-life data. Positively skewed: curve with a long tail on the right and a hump near the left side of the graph. Negatively skewed: curve with a long tail on the left and a hump near the right side of the graph. Bimodal curve: curve with two humps, where the frequency of scores is greatest in two different portions of the distribution.

Standard Deviation Most commonly used measure of variability in scores. Describes the scatter of scores about the mean, is used to show inclusion of scores for different percentages of the population, and can be used as a measure of homogeneity when compared to the mean. 1. Subtract the mean from each raw score to get a deviation score 2. Square each deviation score by multiplying it by itself 3. Add all the squared deviations together to arrive at a sum 4. Divide the sum of the squared deviations by N-1; the result is the variance 5. Take the square root of the variance; the answer is the SD
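
The five steps above translate directly into code. This is a minimal sketch with a hypothetical function name and made-up scores; each commented step matches the numbered step on the slide.

```python
from math import sqrt


def sample_sd(scores):
    """Standard deviation using N-1 in the denominator."""
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two scores")
    mean = sum(scores) / n
    deviations = [x - mean for x in scores]   # step 1: subtract the mean
    squared = [d * d for d in deviations]     # step 2: square each deviation
    sum_of_squares = sum(squared)             # step 3: add the squared deviations
    variance = sum_of_squares / (n - 1)       # step 4: divide by N-1
    return sqrt(variance)                     # step 5: square root of the variance


# Hypothetical scores for eight athletes
sd = sample_sd([2, 4, 4, 4, 5, 5, 7, 9])
print(f"SD = {sd:.2f}")
```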

Standardized Scores Scores that express each individual's score in SD units, making it easier to determine each score's standard distance from the mean. Two types: Z scores: typically range between -3 and +3; are expressed out to two decimal places. T scores: Z scores rescaled to a mean of 50 and an SD of 10 (T = 50 + 10Z), so they typically range from 20 to 80 and contain no negative values.
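
Both standardized scores can be derived from a raw score once the group mean and SD are known. The sketch below assumes the common convention that a T score has a mean of 50 and an SD of 10, which the slide does not spell out. The function names and the example numbers are hypothetical.

```python
def z_score(raw, mean, sd):
    """Distance of a raw score from the mean, in SD units."""
    if sd <= 0:
        raise ValueError("SD must be positive")
    return (raw - mean) / sd


def t_score(z):
    """Rescale a Z score to mean 50, SD 10 (assumed convention)."""
    return 50 + 10 * z


# Hypothetical athlete: raw score 30 in a group with mean 24 and SD 4
z = z_score(30, 24, 4)
t = t_score(z)
print(f"Z = {z:.2f}, T = {t:.0f}")
```

An athlete scoring exactly at the group mean has Z = 0 and T = 50, so a T score above 50 always indicates an above-average result.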

Summary A valid and reliable evaluation and assessment program begins with a careful needs analysis of both the client/athlete and the activity/sport. Appropriate tests are then chosen based on the individual needs of the situation, and the results are carefully analysed using the basic mathematical calculations of central tendency and variability. By performing these simple calculations, assumptions can be made regarding the value of each score for each participant. Scores can then be compared to population-specific normative values or a criterion scale developed from those norms. In the final analysis, the well-prepared and knowledgeable tester will be able to evaluate each individual's needs and subsequently prescribe an appropriate training program to meet the derived client-specific goals.