Interface Validity
Investigating the potential role of face validity in content validation
Gábor Szabó, Robert Märcz
ECL Examinations

Outline
- Questions of face validity
- New approach
- Context, participants and instruments
- Results
- Conclusions

Educational context: Post mortem?
It is important to seem to be testing as well as to be actually doing it.
Test takers' acceptance of the test:
- contributes to its validity
- is a source of motivation
Lay opinion taken seriously?

New approach: Interface validity
Test takers are asked to
- give their opinion on the test (face validity)
- give their opinion on the content (content validity)

Context and participants
- ECL International Language Examination System
- Level B2 reading comprehension test
- Two tasks: sentence completion and short answer
- Online questionnaire
- 903 answers within the first week (ca. 50%)

The instrument
- Questionnaire of 17 items
- Four-point Likert scale (4: completely true; 1: not true at all)
- 6 items on face validity: general statements concerning difficulty, layout, etc.
- 11 items on content validity: paraphrased CEFR descriptors
- Two negative items (halo effect)
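The scoring convention of the instrument can be illustrated with a minimal Python sketch. It is not the authors' procedure: the function name `reverse_score` and the idea of mirroring the two negatively worded items before analysis are assumptions, since the slides only state the four-point scale and the presence of two negative items. The sample answers are invented.

```python
SCALE_MIN = 1  # "not true at all"
SCALE_MAX = 4  # "completely true"

def reverse_score(response: int) -> int:
    """Mirror a response on the 4-point scale (4 -> 1, 3 -> 2, 2 -> 3, 1 -> 4)."""
    if not SCALE_MIN <= response <= SCALE_MAX:
        raise ValueError(f"response {response} is outside the 1-4 scale")
    return SCALE_MIN + SCALE_MAX - response

# Hypothetical answers to a negatively worded item (invented data).
answers_negative_item = [1, 2, 4, 3]
mirrored = [reverse_score(a) for a in answers_negative_item]
print(mirrored)
```

After mirroring, a high score on a negative item points in the same direction as a high score on its positively worded counterpart, which makes the two directly comparable.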

The questionnaire: examples
Face validity:
- Item 3: I had enough time to complete the tasks.
Content validity:
- Original CEFR descriptor: Can understand articles and reports concerned with contemporary problems in which the writers adopt particular stances or viewpoints.
- Item 9: I could understand the viewpoints of the writer.
- Item 16: It was difficult to understand the viewpoints of the writer.

Procedure
- Halo effect: analysing the parallel opposite items, we found significant negative correlations (-0.630 / -0.670)
- Responses with inconsistent response patterns were deleted
- 791 candidates' responses were found valid and consistent
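The consistency check on parallel opposite items might look roughly like the sketch below. The slides do not state which rule was used to identify inconsistent response patterns, so the rule in `is_consistent` (agreeing with both statements, or disagreeing with both, counts as inconsistent) is an assumption, and the six response pairs are invented for illustration.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant list")
    return cov / sqrt(var_x * var_y)

def is_consistent(positive: int, negative: int) -> bool:
    """Assumed rule: agreeing (3-4) or disagreeing (1-2) with BOTH opposite
    statements is treated as an inconsistent pattern."""
    return (positive >= 3) != (negative >= 3)

# Hypothetical (positive item, negative item) answers from six respondents.
responses = [(4, 1), (3, 2), (4, 4), (2, 3), (1, 1), (3, 1)]

kept = [pair for pair in responses if is_consistent(*pair)]
r_all = pearson([p for p, _ in responses], [n for _, n in responses])
r_kept = pearson([p for p, _ in kept], [n for _, n in kept])
print(len(kept), round(r_all, 3), round(r_kept, 3))
```

In this toy data the correlation between the two opposite items only turns negative once the inconsistent pairs are removed, which is the kind of pattern the check is meant to expose.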

Results and analysis
Descriptive statistics

Results and analysis
Item correlations
Expectation:
- Significant, probably moderate correlations
- Descriptors tap into different aspects of the B2 construct
Actual results:
- Strong, significant correlation (0.807) in one case:
  - Though the text was long I was able to scan it quickly
  - Though the text was complex I was able to scan it quickly

Results and analysis
Actual results
- Moderate, significant correlations (0.405-0.654) between:
  - I could quickly identify the content of the text / I could understand the viewpoints of the writer
  - I could understand the stance of the writer / I could quickly identify the content of the text
  - I could quickly identify the content of the text / Though the text was complex I was able to scan it quickly
- Most consistent pattern of correlations in the case of item 8: I could quickly identify the content of the text

Results and analysis
Actual results
- Low, sometimes not significant, occasionally negative correlations (<0.4) for:
  - I could rarely find idioms in the text
  - A broad active vocabulary was needed to complete the tasks
  - The text was concerned with contemporary problems

Results and analysis
Batch correlations
- Correlating face validity items with content validity items
- Significant, moderate correlation (0.536) found
- Indication of relationship between constructs?

Conclusions
- Using candidate feedback in content validation is potentially useful
- Further analyses of data in progress
  - Checking for significant differences between sets of responses to different items
- Refinement of reworded descriptors needed
- Further research necessary
  - Relationship between candidate performance and opinion

Thank you for your attention!