Survey data quality: Does verbal ability and text readability matter?

AIR Annual Forum, Long Beach, CA, May 2013. James Cole, Ph.D., Associate Scientist

Introduction Satisficing Krosnick, Narayan, and Smith (1996) identify three regulators of satisficing (a source of survey data error), one of which is task difficulty. Task difficulty depends in part on how familiar the survey's language is to the respondent. Survey creators try to reduce task difficulty and accommodate low verbal ability by writing items in plain, simple language that the target audience easily understands (Fowler, 2008). Focus groups, cognitive interviews, and pilot testing are important tools for gauging how difficult a survey is for respondents. However, respondents' own verbal ability also helps determine task difficulty, and it is often overlooked (Krosnick, 1999).

Introduction Satisficing One tool for matching text difficulty to readers' skills is the grade-level calculator. Though not widely used in survey research, grade-level calculators are used in fields such as health to develop promotional and educational materials as well as surveys (e.g., NCI, NIH, CDC, various state departments of health). In addition, some health science IRBs are now starting to require grade-level information for informed consent documents. There is some research on grade-level estimates in survey research (e.g., Calderon, Morales, Liu, & Hays, 2006), but much is still not known.

Introduction Research Questions
1. Is low verbal ability, as compared to very high verbal ability, associated with suspicious data quality?
2. Are grade-level reading estimates from commonly used readability programs associated with indicators of suspicious data quality?

Method Data Source Data for this study come from the 2011 web administration of the National Survey of Student Engagement (NSSE) and include only full-time, first-year students with institution-reported SAT verbal scores. More than 81,000 respondents enrolled at 441 institutions across the United States were included. The average 2011 institutional response rate was 33%.

Respondent characteristics
Male                       36.2%
Female                     63.8%
African American            9.0%
Asian American              7.8%
Caucasian                  61.8%
Hispanic                   10.5%
Other                      10.9%
Non-First Generation       37.4%
First Generation           62.6%
Total count               81,164

Method Data Source NSSE annually collects information at hundreds of four-year colleges and universities about student participation in programs and activities that institutions provide for their learning and personal development. The results provide an estimate of how undergraduates spend their time and what they gain from attending college.

Method Key variables used in this study to identify suspicious data include:
- verbal ability
- survey duration
- straight-lining (the respondent checks the same response for the entire set of items on a screen)
- item skipping
- break-offs (non-completers)
- skip through (respondents who submit a screen in a matter of seconds)
- hanging (respondents who take an exceedingly long time to submit a screen)
- grade-level calculations of survey text

Method
Verbal ability: For this study, the SAT verbal test score measured verbal ability. The lowest 20% (scores from 200 to 460) formed the low verbal ability group, and the top 20% (scores from 640 to 800) formed the high verbal ability group.
Survey duration: Measured by screen and per item (average screen duration divided by the number of items on the screen).
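As a rough illustration of the grouping and the per-item duration described above, the cut points can be applied as in the following Python sketch. The function names, the group labels, and the 10-item screen in the usage lines are assumptions made for illustration; they are not taken from the NSSE data files.

```python
def verbal_group(sat_verbal):
    """Assign a respondent to a verbal ability group from an SAT verbal score.

    Cut points follow the slide: 200-460 is the low group, 640-800 the high
    group, and everything in between the middle group.
    """
    if not 200 <= sat_verbal <= 800:
        raise ValueError("SAT verbal scores range from 200 to 800")
    if sat_verbal <= 460:
        return "low"
    if sat_verbal >= 640:
        return "high"
    return "middle"


def duration_per_item(screen_duration, n_items):
    """Average screen duration divided by the number of items on the screen."""
    if n_items <= 0:
        raise ValueError("a screen must contain at least one item")
    return screen_duration / n_items


# Hypothetical usage: one respondent and a made-up 10-item screen.
print(verbal_group(450))            # low
print(duration_per_item(60.6, 10))  # duration per item on that screen
```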

Method
Straight-lining: Indicated by the respondent checking the same response for the entire set of items on a screen. Straight-lining was possible on seven screens, and those seven screens were used for all analyses.
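The straight-lining definition above could be checked per respondent and per screen roughly as follows. This is a minimal Python sketch, not the study's actual procedure: the function name is hypothetical, and two details are assumptions, namely that a skipped item (None) rules out straight-lining and that a single-item screen cannot be straight-lined.

```python
def is_straight_lined(responses):
    """Return True if every item on the screen received the same response.

    `responses` holds one respondent's answers for one screen, with None
    marking a skipped item. Assumptions: any skipped item means the set was
    not straight-lined, and at least two items are required.
    """
    if len(responses) < 2:
        return False
    if any(r is None for r in responses):
        return False
    return len(set(responses)) == 1


# Hypothetical response sets for a four-item screen.
print(is_straight_lined([3, 3, 3, 3]))  # True
print(is_straight_lined([3, 3, 2, 3]))  # False
```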

Method
Item skipping: Items skipped per screen. Mutually exclusive of break-offs.
Break-offs (non-completers): Respondents who stopped submitting screens and did not return.
Skip through: Respondents who submitted a screen in a matter of seconds. Defined as the bottom 5% in duration, calculated separately for each screen.
Hanging: Respondents who took an exceedingly long time to submit a screen. Defined as the top 5% in duration, calculated separately for each screen.
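The skip-through and hanging flags above are percentile cutoffs applied screen by screen. One way to sketch this in Python is shown below. The function name is hypothetical, and the cutoff convention (the "inclusive" method of `statistics.quantiles`, with values at or beyond the cut point flagged) is an assumption, since the slides do not say how ties or interpolation were handled.

```python
from statistics import quantiles


def flag_extremes(durations, pct=5):
    """Flag skip-through (bottom pct%) and hanging (top pct%) for one screen.

    Returns two lists of booleans aligned with `durations`.
    """
    # 99 cut points; index pct - 1 is the pct-th percentile,
    # index 99 - pct is the (100 - pct)-th percentile.
    cuts = quantiles(durations, n=100, method="inclusive")
    low_cut, high_cut = cuts[pct - 1], cuts[99 - pct]
    skip_through = [d <= low_cut for d in durations]
    hanging = [d >= high_cut for d in durations]
    return skip_through, hanging


# Hypothetical durations for two screens; each screen is handled separately.
screens = {1: list(range(1, 101)), 2: list(range(10, 210, 2))}
flags = {screen: flag_extremes(d) for screen, d in screens.items()}
skip, hang = flags[1]
print(sum(skip), sum(hang))
```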

Method Grade-Level Calculations Four procedures were used to calculate grade-level reading estimates:
1. Flesch-Kincaid (MS Word)
2. Flesch-Kincaid (online calculator)
3. Gunning Fog Index
4. Automated Readability Index (ARI)
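For reference, the three grade-level formulas have standard published forms, sketched below in Python from pre-computed text counts. How a given tool counts sentences, syllables, and complex words is tool-specific, so this sketch takes the counts as inputs rather than guessing at what MS Word or any online calculator does internally. The counts in the usage lines are made up.

```python
def flesch_kincaid_grade(words, sentences, syllables):
    """Flesch-Kincaid grade level from word, sentence, and syllable counts."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def gunning_fog(words, sentences, complex_words):
    """Gunning Fog index; complex words are those with three or more syllables."""
    return 0.4 * ((words / sentences) + 100 * (complex_words / words))


def automated_readability_index(characters, words, sentences):
    """Automated Readability Index from character, word, and sentence counts."""
    return 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43


# Hypothetical counts for one screen of survey text.
counts = {"words": 100, "sentences": 5, "syllables": 150,
          "complex_words": 10, "characters": 450}

print(round(flesch_kincaid_grade(counts["words"], counts["sentences"],
                                 counts["syllables"]), 2))
print(round(gunning_fog(counts["words"], counts["sentences"],
                        counts["complex_words"]), 2))
print(round(automated_readability_index(counts["characters"], counts["words"],
                                        counts["sentences"]), 2))
```

Because the formulas weight sentence length and word length differently, some disagreement between calculators on the same text is to be expected.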

Method Grade-Level Calculation Procedures
1. Text from each of the seven screens was copied into the grade-level calculator.
2. The text included the instructions at the top of the screen and the item wording.
3. Response category text was not included in the text analysis.

Method Groups Three groups of respondents were created based on verbal test scores.

Respondent characteristics
SAT Verbal            ≤460    461 to 639    ≥640
Total                20.6%         59.0%   20.4%
Male                 18.5%         58.3%   23.2%
Female               21.7%         59.3%   18.9%
African American     46.8%         48.0%    5.2%
Asian American       23.6%         59.7%   16.7%
Caucasian            27.8%         53.1%   19.1%
Hispanic             37.5%         53.3%    9.2%
Other                25.4%         56.5%   18.1%
Non-First Gen        12.2%         59.4%   28.4%
First Gen            31.9%         58.5%    9.6%

Institution
SAT Verbal            ≤460    461 to 639    ≥640
Baccalaureate        19.9%         52.5%   27.6%
Masters              29.3%         60.0%   10.7%
Doctoral             12.1%         61.2%   26.8%
Private              22.4%         59.3%   18.3%
Public               17.1%         58.3%   24.6%

Results Is low verbal ability, as compared to very high verbal ability, associated with suspicious data quality?

Duration (mid 90% only)
Screen                         1       2       3       4       5      14      15  Survey
Bottom 20%                  60.6    60.6    52.8    37.2    52.2    34.2    33.0   900.6
Middle 60%                  55.2    57.0    49.8    34.2    46.2    33.0    32.4   857.4
Upper 20%                   52.8    53.4    48.0    33.6    45.6    31.8    31.8   823.2
Difference Bottom/Upper      7.8     7.2     4.8     3.6     6.6     2.4     1.2    77.4
Significance [1]          p<.001  p<.001  p<.001  p<.001  p<.001  p<.001  p<.001  p<.001
[1] Scheffé's post hoc test

Results Is low verbal ability, as compared to very high verbal ability, associated with suspicious data quality?

Item Missing
Screen                         1       2       3       4       5      14      15
Bottom 20%                  5.8%   10.7%    6.2%    5.0%    3.5%    7.9%    9.1%
Middle 60%                  4.5%    8.7%    4.8%    3.6%    2.5%    5.9%    6.0%
Upper 20%                   4.0%    7.1%    4.4%    2.8%    2.0%    4.1%    4.6%
Difference Bottom/Upper     1.8%    3.6%    1.8%    2.2%    1.5%    3.8%    4.5%
Significance               p<.05   p<.05   p<.05   p<.05   p<.05   p<.05   p<.05

Results Is low verbal ability, as compared to very high verbal ability, associated with suspicious data quality?

Straight-Lining
Screen                         1       2       3       4       5      14      15
Bottom 20%                  2.6%    3.3%    6.8%   24.1%    1.6%   22.6%   12.4%
Middle 60%                  1.5%    1.8%    4.2%   18.8%    0.8%   17.4%    8.6%
Upper 20%                   0.9%    1.1%    2.3%   11.1%    0.4%   10.3%    5.1%
Difference Bottom/Upper     1.7%    2.2%    4.5%   13.0%    1.2%   12.3%    7.3%
Significance               p<.05   p<.05   p<.05   p<.05   p<.05   p<.05   p<.05

Results Is low verbal ability, as compared to very high verbal ability, associated with suspicious data quality?

Break-Offs
Screen                         1       2       3       4       5      14      15
Bottom 20%                     -    5.3%    2.2%    1.0%    0.6%    1.4%    1.3%
Middle 60%                     -    5.0%    1.7%    0.8%    0.5%    1.0%    0.8%
Upper 20%                      -    3.9%    1.6%    0.6%    0.3%    0.5%    0.5%
Difference Bottom/Upper        -    1.4%    0.6%    0.4%    0.3%    0.9%    0.8%
Significance                   -   p<.05   p<.05   p<.05   p<.05   p<.05   p<.05

Results Item missing, straight-lining, and break-offs are mutually exclusive activities. What percentage of respondents display one of these activities?

[Bar chart: percent of respondents (y-axis, 0 to 35) by screen (x-axis) for the Bottom 20%, Middle 60%, and Upper 20% groups. Data labels appear for the bottom and upper groups only.]

Screen                         1       2       3       4       5      14      15
Bottom 20%                   8.4    19.3    15.2    30.1     5.7    31.9    22.8
Upper 20%                    4.9    12.1     8.3    14.5     2.7    14.9    10.2
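Because the three activities are mutually exclusive, the combined percentage for a screen is simply the sum of its item-missing, straight-lining, and break-off rates. The Python sketch below reproduces the chart's labeled values from the three preceding results tables. The variable and function names are illustrative, and treating the unreported screen 1 break-off rate ("-") as 0.0 is an assumption.

```python
SCREENS = [1, 2, 3, 4, 5, 14, 15]

# Percentages copied from the item-missing, straight-lining, and break-off
# tables. Screen 1 break-offs are reported as "-" and set to 0.0 here
# (an assumption).
RATES = {
    "bottom": {
        "item_missing": [5.8, 10.7, 6.2, 5.0, 3.5, 7.9, 9.1],
        "straight_lining": [2.6, 3.3, 6.8, 24.1, 1.6, 22.6, 12.4],
        "break_off": [0.0, 5.3, 2.2, 1.0, 0.6, 1.4, 1.3],
    },
    "upper": {
        "item_missing": [4.0, 7.1, 4.4, 2.8, 2.0, 4.1, 4.6],
        "straight_lining": [0.9, 1.1, 2.3, 11.1, 0.4, 10.3, 5.1],
        "break_off": [0.0, 3.9, 1.6, 0.6, 0.3, 0.5, 0.5],
    },
}


def combined_rates(group):
    """Sum the three mutually exclusive rates for each screen."""
    per_screen = zip(*RATES[group].values())
    return {s: round(sum(vals), 1) for s, vals in zip(SCREENS, per_screen)}


for group in ("bottom", "upper"):
    print(group, combined_rates(group))
```

On screens 4 and 14, roughly three in ten respondents in the bottom group show one of the three behaviors, about double the rate in the upper group.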

Results Is low verbal ability, as compared to very high verbal ability, associated with suspicious data quality?

Skip Through
Screen                         1       2       3       4       5      14      15
Bottom 20%                  4.3%    6.7%    7.1%    6.0%    4.8%    7.0%    7.3%
Middle 60%                  4.1%    4.5%    4.5%    4.6%    4.8%    4.5%    4.3%
Upper 20%                   4.8%    3.1%    2.5%    3.0%    4.9%    2.7%    2.6%
Difference Bottom/Upper    -0.5%    3.6%    4.6%    3.0%   -0.1%    4.3%    4.7%
Significance                 n/s   p<.05   p<.05   p<.05     n/s   p<.05   p<.05

Results Is low verbal ability, as compared to very high verbal ability, associated with suspicious data quality?

Hanging
Screen                         1       2       3       4       5      14      15
Bottom 20%                  5.7%    5.2%    5.2%    5.6%    5.2%    5.3%    5.4%
Middle 60%                  4.9%    5.0%    5.1%    4.8%    4.9%    5.2%    5.2%
Upper 20%                   4.7%    5.2%    4.8%    5.0%    5.5%    4.7%    4.8%
Difference Bottom/Upper     1.0%    0.0%    0.4%    0.6%   -0.3%    0.6%    0.6%
Significance               p<.05     n/s     n/s     n/s     n/s   p<.05     n/s

Results Are commonly used readability programs that provide grade-level reading estimates associated with indicators of suspicious data quality?

Grade Level Calculations
Screen    FK (MS Word)   FK (online)           ARI           Fog
1                  8.8          10.0          10.0          10.7
2                 10.0           9.7           8.5          10.8
3                 11.9          11.8          11.2          13.4
4                 13.9          14.6          14.4          16.8
5                  7.4           7.7           6.7           7.1
14                13.3          13.9          12.5          13.6
15                10.6          11.3          10.0          12.9

Results Are commonly used readability programs that provide grade-level reading estimates associated with indicators of suspicious data quality?

Grade Level Calculations
Rank            Screen  Mean grade level
1 (easiest)          5               7.2
2                    2               9.8
3                    1               9.9
4                   15              11.2
5                    3              12.1
6                   14              13.3
7 (hardest)          4              14.9
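The mean grade levels and the easiest-to-hardest ordering can be reproduced by averaging the four calculator estimates for each screen. The Python sketch below does this with the per-screen values reported on the previous slide. The names are illustrative, and a simple unweighted mean is assumed because it matches the reported figures to one decimal place.

```python
# Per-screen estimates in the order: FK (MS Word), FK (online), ARI, Fog.
GRADE_LEVELS = {
    1: [8.8, 10.0, 10.0, 10.7],
    2: [10.0, 9.7, 8.5, 10.8],
    3: [11.9, 11.8, 11.2, 13.4],
    4: [13.9, 14.6, 14.4, 16.8],
    5: [7.4, 7.7, 6.7, 7.1],
    14: [13.3, 13.9, 12.5, 13.6],
    15: [10.6, 11.3, 10.0, 12.9],
}


def mean_grade_levels(levels):
    """Unweighted mean of the calculator estimates for each screen."""
    return {screen: sum(vals) / len(vals) for screen, vals in levels.items()}


def rank_screens(levels):
    """Screens ordered from easiest (lowest mean) to hardest (highest mean)."""
    means = mean_grade_levels(levels)
    return sorted(means, key=means.get)


means = mean_grade_levels(GRADE_LEVELS)
for rank, screen in enumerate(rank_screens(GRADE_LEVELS), start=1):
    print(rank, screen, f"{means[screen]:.2f}")
```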

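Results Are commonly used readability programs that provide grade-level reading estimates associated with indicators of poor data quality?

Grade Level Calculations
Each row gives the screen that holds that rank on grade level and on each indicator.
Grade level    Rank  D/item      IM      SL      BO      ST      Ha
          5       1      15       1       5       5       1       2
          2       2      14       2       1       4       5       3
          1       3       1       3       2      15       4       5
         15       4       2       5       3      14       2      14
          3       5       3       4      15       3      14      15
         14       6       4      14      14       2       3       4
          4       7       5      15       4       -      15       1
D/item = duration per item, IM = item missing, SL = straight-lining, BO = break-offs, ST = skip through, Ha = hanging.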
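One way to put a number on how closely two rank orderings agree is a Spearman rank correlation. The slides do not say which statistic, if any, was used to judge association, so the Python sketch below is an illustration under that assumption. It ranks screens by mean grade level and by the bottom-20% straight-lining and skip-through rates from the results tables. Using the bottom-20% rates as the basis for ranking is also an assumption, and the shortcut formula applies only because none of these series contains ties.

```python
# Mean grade level per screen, and bottom-20% rates from the results tables.
MEAN_GRADE = {1: 9.9, 2: 9.8, 3: 12.1, 4: 14.9, 5: 7.2, 14: 13.3, 15: 11.2}
STRAIGHT_LINING = {1: 2.6, 2: 3.3, 3: 6.8, 4: 24.1, 5: 1.6, 14: 22.6, 15: 12.4}
SKIP_THROUGH = {1: 4.3, 2: 6.7, 3: 7.1, 4: 6.0, 5: 4.8, 14: 7.0, 15: 7.3}


def ranks(values_by_screen):
    """Rank screens from lowest (1) to highest value; assumes no ties."""
    ordered = sorted(values_by_screen, key=values_by_screen.get)
    return {screen: i + 1 for i, screen in enumerate(ordered)}


def spearman_rho(x, y):
    """Spearman rank correlation via the no-ties shortcut formula."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    d_squared = sum((rx[s] - ry[s]) ** 2 for s in rx)
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


print(round(spearman_rho(MEAN_GRADE, STRAIGHT_LINING), 3))  # 0.929
print(round(spearman_rho(MEAN_GRADE, SKIP_THROUGH), 3))     # 0.357
```

With only seven screens these coefficients are descriptive rather than conclusive, but under these assumptions the straight-lining ordering tracks the grade-level ordering far more closely than the skip-through ordering does.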

Summary This study provides additional evidence that survey respondents with lower verbal ability are more likely to provide suspicious data. In particular, low verbal scores on the SAT were associated with:
- Significantly longer time to complete the survey.
- Significantly greater likelihood of skipping items and of straight-lining item sets on a screen.
- Significantly greater likelihood of not completing the survey or of skipping through some survey screens quickly.
In addition, this study found:
- Moderate agreement between grade-level calculations.
- Rank ordering the screens by grade-level calculation was associated with straight-lining, but not with other suspicious data.
Overall, college students with low verbal ability are more likely to submit suspicious data on surveys. Grade-level calculators did not prove very useful in identifying screens that may be problematic.

Thank you!
Copies of this and past presentations can be found at: nsse.iub.edu/html/pubs.cfm
Additional NSSE information can be found at: nsse.iub.edu/
Feel free to contact me with any questions regarding NSSE.
Jim Cole, colejs@indiana.edu