Strategic Sampling for Better Research Practice Survey Peer Network November 18, 2011 Thomas Lindsay, CLA Survey Services John Kellogg, Office of Institutional Research Tom Dohm, Office for Measurement Services
Sampling vs. Census Census Entire population is surveyed Results are not statistic they are the population Results can be heavily skewed by non-response by certain segments of population Generally very costly and difficult Sampling (Small) subset of population is surveyed Results may or may not be generalizable to the full population Smaller numbers should reduce challenges of nonresponse bias Generally less costly than census
Sampling Types Probability Every member of the population has a known, non-zero chance of selection Sample s representativeness of population (sample error) can be precisely calculated Population must be well defined Non-Probability Some members of the population are selected by non-random means Sample s representativeness cannot be calculated (is unknowable) Likely to introduce systematic bias Understanding of population is not required
Sampling Methods Sampling Methods Probability Sampling Non-Probability Sampling Simple Random Stratified Convenience Quota (This chart is also on the back of your pink handout)
Simple Random Sampling Most common method of probability sampling One sample is selected from entire population All members of population must have exactly the same odds of selection
Stratified Sampling Entire population is divided systematically into two or more groups by some criterion Different groups must be weighted if not directly proportional to population Simple random samples are drawn separately from each group Stratification criterion must be significant factor in research expectations
Convenience Sampling Easiest sampling method Participants are chosen based on availability or ease of recruitment High risk of non-response bias Sample error cannot be calculated representativeness is unknowable
Quota Sampling Researcher uses judgment to choose participants based on some population characteristic (usually demographic) Participants are chosen within each quota group by convenience, until quota is reached Researcher can ensure that important, harderto-reach groups are present in sample Sample error cannot be calculated representativeness is unknowable
Quota Sampling is NOT Stratified Sampling Quota sampling may reduce response and coverage bias of convenience sampling but sampling error remains unknowable Researcher s judgment, not random selection, determines makeup of sample All members of population do not have a known, nonzero chance of participation
Response Rates Adjusting for projected low responses Increase Sample Size? Risks: Survey Fatigue Does not solve the response RATE problem Can simply give you more non-representative responses Solution: Increase your response rates Collaboration
Other Considerations Numbers Needed for Analysis Descriptive statistics no real minimum Multiple Regression / Analysis of Covariance 200-500 depending on number of variables Comparative Analysis of Subgroups
Sample Size for 95% Confidence Interval Population Size ±5% ±3% 100 80 92 200 132 169 400 196 291 1,000 278 517 10,000 370 965 100,000 383 1,056 1,000,000 384 1,066 10,000,000 384 1,067 100,000,000 384 1,067
Types of Survey Errors Sampling Error inadequate sample size/nonrandom sample Measurement Error imperfect survey questionnaire (e.g., unreliable) Coverage Error inability to contact some types of people in the population Nonresponse Error the condition where people of a particular type are systematically not represented in the sample because such people are alike in their tendency not to respond
Types of Validity External Validity examines whether or not observed findings/results should be generalized across different measures, persons, settings and times. Generalized to a well-specified population (i.e., we can safely conclude that the results in our sample reflect how the population would respond) Generalized across subpopulations (i.e., conceptual replicability/robustness)- can results from one subpopulation be generalized to another subpopulation? Statistical Conclusion Validity concerns the power to detect relationships that exist and determine the magnitude of these relationships.
Types of Validity (continued) Internal Validity refers to the adequacy of the study design (not the survey instrument) and the degree of control we have exercised in data gathering. By control we mean that all variables except the dependent variable are controlled by the experimenter. Test Validity does the survey instrument measure what it purports to measure?
Type of Error and Impact on Validity Types of Validity\ Types of Errors External Statistical Conclusion Internal Sampling Measurement Coverage Nonresponse Major Impact if sample is non-probabilistic /non-random or sample size too small Inadequate sample size has major impact Major Impact if sample is nonprobabilistic /non-random or sample size too small If reliability low, can t trust results, can t generalize If reliability low, can t trust results, can t trust conclusions If Reliability low, results from survey instruments will be invalid Test No real impact If Reliability low, results from survey instruments will be invalid Results can t be generalized to noncovered individuals Conclusions can t be applied to noncovered individuals Bias can be introduced into design Doesn t impact validity, per se, but does impact norms and interpretation of results Results can t be generalized to nonparticipating groups Conclusions can t be generalized to nonparticipating groups Bias can be introduced into design Doesn t impact validity, per se, but does impact norms and interpretation of results
Small Group Exercise 1 How do you put together respondent pools?
Small Group Exercise 2 Scenario A: You have a community organization of 7,500 (and email addresses for 5,000 of them) and need feedback about X. How do you go about getting your information? What would your sampling plan look like? Scenario B: You are conducting a survey and want to ensure representativeness of an underrepresented population. How will this affect your sampling plan and subsequent data analysis? Scenario C: You want to conduct a survey around a sensitive issue with an unknown or hardto-identify population (examples: those who identify as GLBTA, or a population who has experienced sexual harassment/violence). How do you go about identifying your sample?
References Aaker, David A., V. Kumar, George S. Day. Marketing Research. 7 th ed. John Wiley & Sons, 2001. Dillman, Don A., Jolene D. Smyth, and Leah Melani Christian. Internet, Mail, and Mixed- Mode Surveys: The Tailored Design Method. 3 rd ed. John Wiley & Sons, 2009. Litwin, M.. How to Measure Survey Reliability and Vality. Sage Publications, 1995.
References McDaniel, Carl Jr. and Roger Gates. Marketing Research. 8 th ed. John Wiley & Sons, 2010. Sivo, S.A., Saunders, C., Chang, Q., and Jiang, J.J. How low should you go? Low Respo0nse Rates and the Validity of Inference in IS Questionnaire Research, Journal of the Association of Information Systems, 7:6, 2006.
Thank you for attending! To join the Survey Peer Network email list, email opa@umn.edu with Join SPN in the subject line. Learn more about the Survey Peer Network at http://surveys.umn.edu/spn/index.html, part of The Survey Connection website (http://surveys.umn.edu). Our next SPN session will be in February check your inbox in January for more information!
Types of Test (Survey) Validity Face Do items appear to measure what they are intended to measure If so, compliance and response rates will be higher Content Subject matter experts agree items adequately measure the topic for which the content of the survey is intended to measure Criterion measures how well your survey instrument compares to similar instruments measuring the same content OR how well your survey predicts criterion
Summary of Internal and External Validity Population Sample Good External Validity Reduced External Validity Good Internal Validity Reduced Internal Validity Random Sample Good External Validity Non-Random Sample Reduced External Validity Random Assignment Good Internal Validity Subjects Matched Good Internal Validity Non-Random Assignment Reduced Internal Validity Results will Generalize Results will NOT Generalize Groups are equal at the start, and you can attribute changes to the independent variable Groups are unequal at the start, and thus changes could be due to initialk differences, NOT the independent variables