Reducing Social Judgment Biases May Require Identifying the Potential Source of Bias

Similar documents
Journal of Experimental Social Psychology

Article in press at Social Cognition. An unintentional, robust, and replicable pro-black bias in social judgment. Jordan R. Axt

Implicit Attitude. Brian A. Nosek. University of Virginia. Mahzarin R. Banaji. Harvard University

Understanding and Using the Brief Implicit Association Test: Recommended Scoring Procedures

Malleability in Implicit Stereotypes and Attitudes. Siri J. Carpenter, American Psychological Association Mahzarin R. Banaji, Yale University

Supplementary Study A: Do the exemplars that represent a category influence IAT effects?

Implicit Bias: What Is It? And How Do We Mitigate its Effects on Policing? Presentation by Carmen M. Culotta, Ph.D.

Supplemental Materials: Facing One s Implicit Biases: From Awareness to Acknowledgment

Colin Tucker Smith Ghent University. Kate A. Ratliff Tilburg University. Brian A. Nosek University of Virginia

2008 Ohio State University. Campus Climate Study. Prepared by. Student Life Research and Assessment

Black 1 White 5 Black

Race Equity Project Debiasing Techniques

Understanding and Using the Brief Implicit Association Test: I. Recommended Scoring Procedures. Brian A. Nosek. University of Virginia.

Framing Discrimination: Effects of Inclusion Versus Exclusion Mind-Sets on Stereotypic Judgments

Testing the Persuasiveness of the Oklahoma Academy of Science Statement on Science, Religion, and Teaching Evolution

Supporting Information

My Notebook. A space for your private thoughts.

Facing One s Implicit Biases: From Awareness to Acknowledgment

Defining Psychology Behaviorism: Social Psychology: Milgram s Obedience Studies Bystander Non-intervention Cognitive Psychology:

Weekly Paper Topics psy230 / Bizer / Fall xxarticles are available at idol.union.edu/bizerg/readings230xx

Supplement to Nosek and Smyth (2005): Evidence for the generalizability of distinct implicit

Perceptual and Motor Skills, 2010, 111, 3, Perceptual and Motor Skills 2010 KAZUO MORI HIDEKO MORI

Introduction. The current project is derived from a study being conducted as my Honors Thesis in

Hidden Bias Implicit Bias, Prejudice and Stereotypes

Strategies for Reducing Adverse Impact. Robert E. Ployhart George Mason University

Breaking the Bias Habit. Jennifer Sheridan, Ph.D. Executive & Research Director Women in Science & Engineering Leadership Institute

Appendix D: Statistical Modeling

Implicit Bias: A New Plaintiff Class Action Strategy First Wal-Mart and Iowa What is next?

CAN PROMOTING CULTURAL DIVERSITY BACKFIRE? A LOOK AT THE IMPLICIT EFFECTS OF PSYCHOLOGICAL REACTANCE. Jeffrey Adam Gruen

Reviewing Applicants. Research on Bias and Assumptions

Implicit Bias: How Our Unconscious Minds Lead Us Astray

Unconscious Bias in Evaluations. Jennifer Sheridan, PhD July 2, 2013

Bridging the Gap: Predictors of Willingness to Engage in an Intercultural Interaction

PLS 506 Mark T. Imperial, Ph.D. Lecture Notes: Reliability & Validity

Forging links with the self to combat implicit bias

SHORT REPORT Facial features influence the categorization of female sexual orientation

Implicit Bias. Primer & Refresher The Kirwan Institute. Kelly Capatosto, Senior Research Associate & Jillian Olinger, Senior Research Associate

Reviewing Applicants

Using Groups to Measure Intergroup. Prejudice. Erin Cooley 1 and B. Keith Payne 2. Prejudice. Article

overview of presentation

Awareness of implicit bias : what motivates behavior change?

The Relationship between Fraternity Recruitment Experiences, Perceptions of Fraternity Life, and Self-Esteem

Completing a Race IAT increases implicit racial bias

STRIDE. STRATEGIES and TACTICS for RECRUITING to IMPROVE DIVERSITY and EXCELLENCE. at The University of Tennessee

Implicit Bias. Gurjeet Chahal Meiyi He Yuezhou Sun

The influence of (in)congruence of communicator expertise and trustworthiness on acceptance of CCS technologies

RUNNING HEAD: RAPID EVALUATION OF BIG FIVE 1

11/18/2013. Correlational Research. Correlational Designs. Why Use a Correlational Design? CORRELATIONAL RESEARCH STUDIES

Revised Top Ten List of Things Wrong with the IAT

The Impact of Unconscious Bias on our Work

Chapter 5: Field experimental designs in agriculture

Running head: AFFECTIVE FORECASTING AND OBJECTIFICATION 1

Embodied Cognition. Jason Guerrero Alex LaBranche Ritaj Mangush Jeewon Nahm

Diversity Messages: The Short and Long-Term Effects of Legal vs. Value Messages

Evolutionary Psychology. The Inescapable Mental Residue of Homo Categoricus. Book Review

Topics for today Ethics Bias

COWLEY COLLEGE & Area Vocational Technical School

Implicit Bias: Making the Unconscious Conscious

Summary Report: The Effectiveness of Online Ads: A Field Experiment

VIRGINIA LAW REVIEW IN BRIEF

Exploring Reflections and Conversations of Breaking Unconscious Racial Bias. Sydney Spears Ph.D., LSCSW

SUPPLEMENTARY INFORMATION

Egocentrism, Event Frequency, and Comparative Optimism: When What Happens Frequently Is More Likely to Happen to Me

The Regression-Discontinuity Design

The Impact of Relative Standards on the Propensity to Disclose. Alessandro Acquisti, Leslie K. John, George Loewenstein WEB APPENDIX

Assessing the influence of recollection and familiarity in memory for own- vs. other-race faces

Several studies have researched the effects of framing on opinion formation and decision

Risky Choice Decisions from a Tri-Reference Point Perspective

System Justifying Motives Can Lead to Both the Acceptance and Rejection of Innate. Explanations for Group Differences

Strategies for Reducing Racial Bias and Anxiety in Schools. Johanna Wald and Linda R. Tropp November 9, 2013

Fostering Opportunities in STEMM Occupations by Reducing Implicit Bias. Jennifer Sheridan, PhD July 23, 2013

Knowledge Building Part I Common Language LIVING GLOSSARY

Validity of a happiness Implicit Association Test as a measure of subjective well-being q

Final Exam Practice Test

Internal Consistency and Reliability of the Networked Minds Measure of Social Presence

Modeling Unconscious Gender Bias in Fame Judgments: Finding the Proper Branch of the Correct (Multinomial) Tree

It s brief but is it better? An evaluation of the Brief Implicit Association Test (BIAT) Klaus Rothermund 1 & Dirk Wentura 2

Application of the Implicit Association Test (IAT) to a Study of Deception

THE EFFECTS OF IMPLICIT BIAS ON THE PROSECUTION, DEFENSE, AND COURTS IN CRIMINAL CASES

ORGANISATIONAL BEHAVIOUR

11/24/2017. Do not imply a cause-and-effect relationship

CHAPTER 2. MEASURING AND DESCRIBING VARIABLES

CHAPTER VI RESEARCH METHODOLOGY

Lab 2: The Scientific Method. Summary

Self-Handicapping Variables and Students' Performance

THE FIRST SESSION CHECKLIST

WARNING, DISTRACTION, AND RESISTANCE TO INFLUENCE 1

Recognizing discrimination explicitly while denying it implicitly: Implicit social identity. protection. Jennifer M. Peach.

REPLICATION: GOING GREEN TO BE SEEN 1. Replication of: Griskevicius, Tybur, & Van den Bergh, (2010), Going Green to be seen: Status,

1 The conceptual underpinnings of statistical power

The Role of Modeling and Feedback in. Task Performance and the Development of Self-Efficacy. Skidmore College

GENDER DIFFERENCES IN IMPLICIT MORAL ORIENTATION ASSOCIATIONS: THE JUSTICE AND CARE DEBATE REVISITED

What Drives Priming Effects in the Affect Misattribution Procedure?

Statistics is the science of collecting, organizing, presenting, analyzing, and interpreting data to assist in making effective decisions

CURRENT RESEARCH IN SOCIAL PSYCHOLOGY

The associations in our heads belong to us: Searching for attitudes and knowledge in implicit evaluation

Thinking and Intelligence

UNCONSCIOUS BIAS. It Matters. Presented by Training Evolution, Inc.

A Modified CATSIB Procedure for Detecting Differential Item Function. on Computer-Based Tests. Johnson Ching-hong Li 1. Mark J. Gierl 1.

Unconscious Bias and the Hiring Decision

Transcription:

814003PSPXXX10.1177/0146167218814003Personality and Social Psychology BulletinAxt et al. research-article2018 Empirical Research Paper Reducing Social Judgment Biases May Require Identifying the Potential Source of Bias Personality and Social Psychology Bulletin 1 20 2018 by the Society for Personality and Social Psychology, Inc Article reuse guidelines: sagepub.com/journals-permissions DOI: https://doi.org/10.1177/0146167218814003 journals.sagepub.com/home/pspb Jordan R. Axt 1,2, Grace Casola 3, and Brian A. Nosek 3,4 Abstract Social judgment is shaped by multiple biases operating simultaneously, but most bias-reduction interventions target only a single social category. In seven preregistered studies (total N > 7,000), we investigated whether asking participants to avoid one social bias affected that and other social biases. Participants selected honor society applicants based on academic credentials. Applicants also differed on social categories irrelevant for selection: attractiveness and ingroup status. Participants asked to avoid potential bias in one social category showed small but reliable reductions in bias for that category (r =.095), but showed near-zero bias reduction on the unmentioned social category (r =.006). Asking participants to avoid many possible social biases or alerting them to bias without specifically identifying a category did not consistently reduce bias. The effectiveness of interventions for reducing social biases may be highly specific, perhaps even contingent on explicitly and narrowly identifying the potential source of bias. Keywords bias, social judgment, discrimination, prejudice Received December 18, 2017; revision accepted October 12, 2018 Intentional or not, people use ostensibly irrelevant social information to evaluate others. Social judgment biases may arise from reliance on physical features, such as when overweight candidates are less likely to be selected for a position than nonoverweight candidates (Pingitore, Dugoni, Tindale, & Spring, 1994). Biases may also emerge when people use group identity to inform evaluation, such as when foreign applicants are rated as less hirable than native applicants (Hosoda & Stone-Romero, 2010). There are many interventions that could reduce such biases (e.g., Gollwitzer, 1999; Lerner & Tetlock, 1999). One of the most direct ways is to alert people to a potential bias in judgment, so they can adjust their decision making to avoid it. Models of biased judgment highlight awareness of the source of bias as necessary for reducing biased behavior (Wegener & Petty, 1997; Wilson & Brekke, 1994). Interventions that alert people to biases in their judgment have worked in some cases (Golding, Fowler, Long, & Latta, 1990; Schul, 1993) but not others (Wegner, Coulton, & Wenzlaff, 1985; Wetzel, Wilson, & Kort, 1981). In the present research, we use a paradigm where asking participants to avoid a specific bias in judgment reliably reduces social judgment biases (Axt & Nosek, 2018). Most research on social judgment biases examines a single category at a time, such as race, age, or weight. But, in reality, people have identities on all those things at once, and social judgment may be influenced by an interplay of evaluations on a variety of social categories. An obvious question is whether interventions are effective at reducing bias across multiple social categories simultaneously, or are specific to individual categories, leaving other social biases unchanged. The answer has practical and theoretical implications. If an intervention is effective for only a single category, then application of that intervention will have limited scope. There are different theoretical implications if bias-reduction interventions affect single or multiple categories. If bias reduction is limited to a single category, it implies that attention and correction processes are responding to an identified social category. If bias reduction occurs across multiple categories, it implies that decision processes are attending to the relevant information for judgment and putting aside 1 Duke University, Durham, NC, USA 2 Project Implicit, Seattle, WA, USA 3 University of Virginia, Charlottesville, USA 4 Center for Open Science, Charlottesville, VA, USA Corresponding Author: Jordan R. Axt, Duke University, 334 Blackwell St., Suite 320, Durham, 27701, NC, USA. Email: jordan.axt@duke.edu

2 Personality and Social Psychology Bulletin 00(0) information from other irrelevant categories. In the present research, we investigated whether an intervention asking people to avoid showing bias had a constrained impact on the social categories identified in the intervention itself or a general impact on social categories whether or not they were explicitly identified in the intervention. Existing Evidence Research on intersectionality has examined how multiple social categories independently or interactively influence evaluation (Cole, 2009; Kang & Bodenhausen, 2015). For example, an analysis of over 500,000 employees revealed that those belonging to multiple minority groups (e.g., having both a disability and belonging to a racial minority) received lower pay than employees belonging to a single minority group (Woodhams, Lupton, & Cowling, 2015). Similar results emerged in an analysis of payment toward employees who were both racial and gender minorities (Greenman & Xie, 2008), and a field experiment found that applicants with multiple stigmatized identities were rated as less employable than applicants with a single threat-related identity (Derous, Ryan, & Serlie, 2015). These examples show a healthy literature concerning the presence of multiple social biases toward single targets. However, we have not found literature investigating how interventions to reduce such biases influence single or multiple social categories simultaneously. The closest are studies investigating malleability of implicit attitudes (Joy-Gaba & Nosek, 2010; Lai et al., 2014). For example, in Lai et al. (2014), priming the concept of multiculturalism was moderately effective at reducing implicit preferences for White versus Black people, but did not alter implicit preferences for White versus Hispanic people or White versus Asian people. To expand this literature, we investigated whether interventions to reduce social judgment biases had effectiveness limited to the social category explicitly identified in the intervention or whether they reduced bias in general toward the targets of judgment. Theoretical Expectations Wilson and Brekke s (1994) mental contamination model identifies the origins of biased judgment and processes necessary to counteract the expression of bias. Mental contamination occurs when behaviors are influenced by factors that exist outside of conscious intention or awareness. The model presumes four necessary features to avoid mental contamination: (a) awareness of unwanted processing related to the judgment, (b) motivation to correct bias, (c) awareness of direction and magnitude of bias, and (d) ability to adjust responses. Failure to meet any condition results in biased judgment. If decision makers are not alerted to the unwanted bias and able to adjust their responses effectively, then there is no opportunity for motivation and corrective processes to engage. For situations in which two social categories simultaneously contaminate evaluation of individual targets, the model would most directly predict that alerting people to a bias toward one social category could reduce bias toward that category but not another social category. However, Wilson and Brekke argue that awareness of bias need not come from an external source. For example, alerting people to a bias concerning one social category might lead people to spontaneously notice other potential biases concerning other social categories. Moreover, the model is not definitive on specificity of awareness. For instance, for reducing political orientation bias, would asking people to avoid showing a bias toward people from different categories be as effective as asking to people to avoid showing a bias toward people from different political parties? And, would warning about another category be sufficient to initiate corrective processes that would reduce political biases too? The model does not directly anticipate or exclude such possibilities, but empirical evidence would help refine the theory. A second theoretical perspective focuses on activation of motivational processes to reduce reliance on stereotypes and potentially decrease biased behavior (Moskowitz, Gollwitzer, Wasel, & Schaal, 1999). Similar to the Wilson and Brekke (1994) model, this perspective argues that activation of the social category is necessary, but it may not require awareness of bias in the conventional sense. For example, one study suggests that activating a social category by brief presentation of an outgroup face is sufficient to invoke correction (Moskowitz, Salomon, & Taylor, 2000). The emphasis of this motivational account is that the activation or awareness of potential bias initiates motivated corrective processes to avoid expression of bias. For example, White participants who thought of a time they failed to act in a racially egalitarian manner later showed increased inhibition of racial stereotypes (Moskowitz & Li, 2011), and White participants told they held racial biases in implicit evaluations displayed increased inhibition when processing race-related information (Monteith, Ashburn-Nardo, Voils, & Czopp, 2002). Relatedly, interventions warning of the possibility of racial bias increased concern about racial discrimination (Devine, Forscher, Austin, & Cox, 2012; Forscher, Mitamura, Dix, Cox, & Devine, 2017). These motivational accounts are also nonspecific as to whether alerting people to one bias would create reductions in other simultaneous biases. On one hand, the empirical studies virtually always activate the same social category that is assessed for change in behavior. This implies that activation of the motivation to be unbiased is restricted to the activated social category. Yet, the theoretical models leave open the possibility of more general impact of motivational processes. The interventions may highlight a particular social category of potential bias but activate general motivation to be unbiased, leading to corrective processes that reduce

Axt et al. 3 reliance on all irrelevant social information in judgment. Some preliminary support for this possibility comes from findings that priming a general mind-set to think different was sufficient for reducing stereotype activation on a lexical decision task (Sassenberg & Moskowitz, 2005). The reviewed models of bias correction have different emphases but share an expectation that awareness of potential bias (coupled with an ability and motivation to address it) is important for changing biased behavior, and they share a lack of specificity for whether interventions targeting biases for one social category are sufficient to reduce bias toward another social category simultaneously. As such, both models will be improved by evaluating whether bias-reduction interventions must alert people to the particular social category to reduce bias. The Present Work In all studies, participants completed a Judgment Bias Task (JBT; Axt, Nguyen, & Nosek, 2018). In the JBT, participants evaluate profiles for an outcome, here admission to an honor society. Each profile includes quantified criteria relevant for evaluation (grade point average [GPA], interview scores) and social information that is ostensibly irrelevant. Criteria are to be weighed equally in judgments, and profiles are constructed so that some are more qualified than others. Participants are assessed on their ability to identify the more over the less qualified applicants and their criterion (c) for accepting an applicant to the honor society using signal detection analysis. The JBT assesses how social information affects evaluation by comparing the criterion value for profiles belonging to each social category. A lower criterion value means more leniency specifically, a greater proportion of errors falsely admitting less qualified applicants relative to errors falsely rejecting more qualified applicants. In the academic JBT, a lower criterion means both qualified and unqualified applicants from a social group are more likely to be admitted to the honor society, and bias is evident when criterion is lower for applicants from certain social groups over others. In these studies, applicants were presented with two forms of social information known to create favoritism: faces varying in physical attractiveness (Feingold, 1992) and an image indicating ingroup or outgroup status (Mullen, Brown, & Smith, 1992). Each applicant was evaluated in the presence of both forms of social information, and the influence of each bias could be assessed independently due to a fully crossed design. In Studies 1a and 1b, some participants received an intervention that mentioned the potential for ingroup favoritism in judgment and asked them to avoid displaying such favoritism. We investigated whether that intervention reduced favoritism for ingroup members and favoritism for more physically attractive people. In Studies 2a and 2b, some participants received an intervention that asked them to avoid displaying either one or both of the physical attractiveness or ingroup biases. We then investigated whether the bias warning interventions reduced bias for the specific category only or whether it also reduced bias for the unmentioned category. The evidence indicates that the interventions are effective at reducing biases for the identified category but not for others. In Study 3, we tested warning about multiple potential biases (e.g., race, sexual orientation) in addition to biases concerning physical attractiveness and ingroup membership mirroring standard language in Equal Employment Opportunity Commission (EEOC) policies. We found that providing a long list of potential biases was less effective than only warning about the attractiveness and ingroup biases. Finally, in Study 4, we tested whether warning about a potential bias for social categories in general would reduce bias toward all categories because of its breadth or toward no categories because of its lack of specificity. The evidence suggests the latter. Studies 1a and 1b In Study 1a, profiles were presented with more or less physically attractive faces and images indicating applicants came from one s own or a rival university. Study 1b used the same faces, but replaced university with political affiliation. In both studies, participants in control conditions completed the JBT without additional instruction, and participants in experimental conditions received an intervention alerting them to an ingroup bias in judgment. Method Participants. Participants in Study 1a were University of Virginia (UVA) undergraduates who completed the study for a gift card or course credit. We originally targeted a sample of 652. This sample would provide greater than 80% power at detecting a between-subjects effect of d = 0.22, which was the size of the smaller criterion bias found in a pilot study (see supplemental material). Results from this initial sample were inconclusive, so we collected as much additional data as possible during the subsequent semester. To account for inflated Type I errors following multiple rounds of data analysis, we report p augmented (Sagarin, Ambler, & Lee, 2014). The final sample had 929 participants (M Age = 18.9, SD = 1.2, 62.5% female, 59.4% White). See https://osf.io/bm5yk/ for preregistration of materials, https://osf.io/r4xvk/ for analysis plan, and https://osf.io/dpfyv/ for final data collection strategy. Participants in Study 1b were recruited through an online survey company. We planned for 1,200 participants, which would provide greater than 93% power at detecting at least suggestive evidence (p <.05; Benjamin et al., 2018) for a between-subjects effect of d = 0.20. In the final sample, 1,223 participants provided data and passed the attention check (M Age = 42.4, SD = 13.0, 72.6% female, 63.3% White), and 1,186 of those reported being Democrat or

4 Personality and Social Psychology Bulletin 00(0) Republican. See https://osf.io/2dpmx/ for the study s preregistration. We report all measures, manipulations, and exclusion criteria. Materials, data, analysis syntax, and the supplemental material are available at https://osf.io/mqdga/. Procedure. Participants in Study 1b completed four study components in the following order: Participants first received the bias-reduction intervention, then completed the academic JBT, followed by items measuring perceptions of JBT performance and explicit attitudes, and finally a demographics survey. Participants in Study 1a completed those same components and a survey about differences among applicants, which followed the JBT, and measures of implicit attitudes, which followed the explicit attitude items. Experimental conditions. Before completing the JBT, participants were randomly assigned to the Control or Bias Warning condition. Participants in the Control condition received no additional instructions. In Study 1a, participants in the Bias Warning condition were alerted to a bias favoring applicants from one s own university and were asked to avoid showing that bias. In Study 1b, participants in the Bias Warning condition received the same manipulation but about favoring applicants from one s own political party. In Study 1b, participants read the following: In addition to differing on their qualifications, candidates will differ in political affiliation. Decision makers are frequently too easy on some applicants and too tough on others. Prior research suggests that decision makers are easier on candidates from their own political party and tougher on candidates from other political parties. Can you be fair toward all applicants and not be biased by applicants political party? When you make your accept and reject decisions, be as fair as possible. Please tell yourself quietly that you will be fair and avoid favoring applicants from your own political party over applicants from another political party. When you are done, please type this strategy in the box below. Participants then wrote that they would be fair in a text box. The Bias Warning manipulation was the same for Study 1a, except university replaced political party. Prior evidence found that this manipulation reduced bias in criterion on a JBT (Axt & Nosek, 2018). For simplicity, we refer to the manipulation as a bias warning, but the intervention is multifaceted in that it (a) names a specific social dimension known to bias judgment, (b) identifies the direction of the bias, and (c) actively asks participants to avoid showing the bias. We included each of these components in an attempt to maximize the intervention s effectiveness, although we found evidence in other research that removing the text asking participants to avoid bias does not reduce the effect of the manipulation (Axt, 2017). Academic decision-making task. Participants made accept or reject decisions for an academic honor society. Each applicant had four pieces of academic information: Science GPA (scale of 1-4), Humanities GPA (1-4), letter of recommendation quality (poor, fair, good, excellent), and interview score (1-100). Participants were instructed to accept approximately half of the applicants. Half of the applications were relatively more qualified and half were less qualified. To determine qualification, each piece of academic information was converted to a 1 to 4 scale. The GPAs already had a maximum score of 4. Recommendation letters were scored poor = 1, fair = 2, good = 3, and excellent = 4, and interview scores were divided by 25. The four scores were summed to determine each applicant s level of qualification. Less qualified applicants summed to 13 and more qualified applicants to 14. In all studies, applicants were paired with equal numbers of male and female faces depicting White, smiling targets. These faces were pretested and divided into two groups differing in physical attractiveness (d = 2.64; Axt et al., 2018). In Study 1a, applicants were also depicted with a logo from UVA or a rival school, the University of North Carolina (UNC). Instructions stated that UVA and UNC are equally rigorous, so academic qualifications from both schools should be weighed equally. In all other studies, applicants varied in both physical attractiveness and political identity. Political identity was depicted by a logo of the Democratic or Republican parties. Participants were reminded of the affiliations for each logo. In all studies, the JBT contained 64 unique applications, with eight trials (four males, four females) for each combination of qualification, attractiveness, and ingroup membership (eight more physically attractive and more qualified Democrats, eight less physically attractive and more qualified Democrats, etc.). Before evaluating applicants, participants completed an encoding phase where each applicant was shown for 1 s in a random order, although this was removed from Study 1b to save time. Evaluations were made with no time limit. Participants in Study 1a were assigned to one of two JBT orders. In each, the faces paired with either UVA or UNC were predetermined, but the face-school pairings were randomly assigned to applications during each study session. Across orders, each application was equally likely to be assigned to either a more or less physically attractive face and to be depicted as from UVA versus UNC. The online participants in all other studies were assigned to one of 12 study orders, with each application being equally likely to be assigned to a more versus less physically attractive face or depicted as a Democrat or Republican. Perceived differences among applicants. Following the JBT, participants in Study 1a rated how different (1 = not different at all, 5 = extremely different ) applicants were on five dimensions: university affiliation, gender, race, physical attractiveness, and facial expression. These items were not in our analysis plan but were added for exploratory purposes.

Axt et al. 5 Perceptions of performance, explicit attitudes and implicit attitudes. Participants completed four items measuring perceived and desired performance for each social category in the JBT. Each item used a 7-point scale (e.g., 3 = I was extremely easier on UNC applicants and extremely tougher on UVA applicants; +3 = I was extremely easier on UVA applicants and extremely tougher on UNC applicants ), with a neutral response indicating equal treatment. Participants also completed two 7-point explicit preference measures, one for each social category (e.g., 3 = I strongly prefer less physically attractive people to more physically attractive people, +3 = I strongly prefer more physically attractive people to less physically attractive people ), with a neutral response indicating no preference. Participants in Study 1a completed two 7-block evaluative Implicit Association Tests (IAT; Greenwald, McGhee, & Schwartz, 1998; Nosek, Greenwald, & Banaji, 2007) measuring evaluations toward more versus less physically attractive people and identification with UVA versus UNC. See the supplemental material for information procedures for implicit association measures. IATs were completed in a random order and scored with the D algorithm (Greenwald, Nosek, & Banaji, 2003). Demographics. Participants in Study 1a completed a five-item demographics survey. Participants in Study 1b completed a demographics survey reporting political identification, age, gender, and race. For political identification, participants first reported their political party (Democrat, Republican, Independent, Libertarian, Green, Other, Do not know). If participants selected something other than Democrat or Republican, they completed an item asking, if they had to choose, whether they identified more with Democrats or Republicans (participants also had the option to not answer). In Studies 1b to 4 and Study S1, we defined Democrats and Republicans as either those selecting that party identification immediately or those selecting that party in the forced choice. Other evidence suggests these groups are similar in political judgment (Hawkins & Nosek, 2012), and combining them maximized statistical power. Data were analyzed by whether applicants were from the same or opposing political party. Results In all studies, participants were excluded from analysis for accepting less than 20% or more than 80% of applicants or for accepting or rejecting every applicant from any social group (Axt, Ebersole, & Nosek, 2016; Axt et al., 2018). Fifteen participants (1.6%) were excluded based on these criteria in Study 1a and 286 participants (24.1%) in Study 1b. 1 In both studies, accuracy on the JBT (accepting more qualified and rejecting less qualified applicants) was above chance (Study 1a: M = 70.1%, SD = 7.0; Study 1b: M = 62.2%, SD = 9.7) and the average acceptance rate was close to 50% (Study 1a: M = 52.9%, SD = 10.1; Study 1b: M = 53.4%, SD = 13.6). Criterion bias in decision making. For Study 1a, the primary analysis was a 2 (Attractiveness: More vs. Less physically attractive) by 2 (School: UVA vs. UNC) by 2 (Condition: Bias Warning vs. Control) mixed-measures analysis of variance (ANOVA) on criterion for applicants from each combination of school and physical attractiveness. This analysis revealed main effects of physical attractiveness, F(1, 912) = 130.58, p <.001, η p 2 =.125, 95% confidence interval [CI] = [.09,.17], and school, F(1, 912) = 8.71, p =.003, η p 2 =.009, 95% CI = [.001,.03], p augmented = [.05,.0502], with lower criterion for more versus less physically attractive applicants and for applicants from UVA versus UNC. There was no evidence of a main effect of condition, F(1, 912) = 0.42, p =.516, η p 2 <.001, 95% CI = [0,.01]. See Figure 1 for criterion values in each condition for each combination of school affiliation and attractiveness, and see Table 1 for means and standard deviations in each condition for all studies. A school by condition interaction could indicate that being warned of a school-affiliation bias reduced the bias favoring applicants from one s own university. The school by condition interaction was consistent with this prediction but did not provide strong evidence, F(1, 912) = 3.03, p =.082, η p 2 =.003, 95% CI = [0,.02], p augmented = [.082,.119]. The main effect of school was larger in the Control, F(1, 424) = 9.25, p =.002, η p 2 =.021, 95% CI = [.003,.06], p augmented = [.05,.0501], than in the Bias Warning, F(1, 488) =.87, p =.351, η p 2 =.002, condition. A reliable attractiveness by condition interaction could indicate that being warned of a school-affiliation bias reduced the criterion bias favoring more physically attractive applicants, which would suggest that being warned of a specific bias reduced other biases. The attractiveness by condition interaction was small but suggestive, F(1, 912) = 5.13, p =.024, η p 2 =.006, 95% CI = [0,.02], p augmented = [.051,.058]. The main effect of attractiveness was larger in the Control, F(1, 424) = 80.37, p <.001, η p 2 =.159, 95% CI = [.10,.22], than in the Bias Warning, F(1, 488) = 48.96, p <.001, η p 2 =.091, 95% CI= [.05,.14], condition. Unrelated to key hypotheses, neither the school by attractiveness interaction, F(1, 912) = 3.72, p =.054, η p 2 =.004, 95% CI = [0,.016], nor the school by attractiveness by condition interaction, F(1, 912) = 2.23, p =.136, η p 2 =.002, 95% CI = [0,.013], produced strong effects. Study 1a results were consistent with the hypothesis that being warned of one bias could reduce a second bias, but evidence was generally weak. In Study 1b, the primary analysis was a 2 (Attractiveness: More vs. Less physically attractive) by 2 (Political Party: Ingroup vs. Outgroup) by 2 (Condition: Bias Warning vs. Control) mixed-measures ANOVA on criterion for applicants from each combination of political party and physical attractiveness. This analysis revealed main effects of physical

6 Personality and Social Psychology Bulletin 00(0) Figure 1. Box plots of criterion values in Study 1a for each experimental condition. Note. Diamond ( ) denotes mean. Own group were applicants from the decision maker s university, and other group were applicants from another university. In the bias warning condition, participants were asked to avoid favoring applicants from their own university and encouraged to be fair toward own and other group applicants. attractiveness, F(1, 898) = 105.77, p <.001, η p 2 =.105, 95% CI= [.07,.14], and political ingroup, F(1, 898) = 65.82 p <.001, η p 2 =.068, 95% CI = [.04,.10], indicating lower criterion for more physically attractive applicants and applicants from one s own political party. There was no evidence of a main effect of condition, F(1, 898) = 0.59, p =.441, η p 2 =.001, 95% CI = [0,.008]. See Figure 2 for criterion values in each condition. In Study 1b, the political group by condition interaction was reliable, F(1, 898) = 12.19, p =.001, η p 2 =.013, 95% CI = [.003,.03]. The main effect of school was larger in the Control, F(1, 466) = 55.78, p <.001, η p 2 =.107, 95% CI = [.06,.16], than in the Bias Warning, F(1, 432) = 14.18, p <.001, η p 2 =.032, 95% CI = [.01, 07], condition. Warning participants of a potential political bias reduced the expression of that bias. However, there was no evidence of an attractiveness by condition interaction, F(1, 898) = 0.93, p =.336, η p 2 =.001, 95% CI = [0,.009]. The main effect of attractiveness in the Control condition, F(1, 466) = 44.65, p <.001, η p 2 =.087, 95% CI = [.04,.14], was, if anything, slightly smaller than in the Bias Warning condition, F(1, 432) = 61.70, p <.001, η p 2 =.125, 95% CI = [.07,.18]. Warning participants of a potential political ingroup bias did not reduce expression of the attractiveness bias. There was no strong evidence for either a political group by attractiveness interaction, F(1, 898) = 0.10, p =.755, η p 2 <.001, or a political group by attractiveness by condition interaction, F(1, 898) =.97, p =.325, η p 2 =.001. Study 1b suggested that being warned of a potential bias for one social category reduced bias for that category but not another social category. Attitudes, perceived performance, and desired performance. Given similarity with past work (Axt et al., 2018), we review attitude and performance analyses in the aggregate and provide details for individual studies in the supplemental material. Means and standard deviations for implicit and explicit attitudes for all studies are presented in Table 2. The one exception is for implicit attractiveness attitudes, which were only completed in Study 1a (Control IAT M = 0.74, SD = 0.34; Awareness IAT M = 0.74, SD = 0.33). As in previous work (Axt et al., 2018), participants showed robust explicit and implicit preferences for more over less physically attractive people (implicit d = 2.23 in Study 1a; explicit aggregate d = 0.72), explicit preferences for members of their ingroup (aggregate d = 1.17), and greater implicit associations between the self and their political party (aggregate d = 0.88). Across studies, we found no effects that warning participants of a specific bias changed implicit associations compared with the control condition (Hedges s g =.02, 95% CI = [.08,.05]), but we did observe small effects on explicit preferences (Hedges s g =.05, 95% CI = [.02,.09]). Participants expressed slightly weaker ingroup and attractiveness preferences when they were warned of a potential bias in judgment for that social dimension.

Axt et al. 7 Table 1. Sample Sizes, Criterion Means, and Standard Deviation for All Studies. More attractive Less attractive Own group, M (SD) Other group, M (SD) Own group, M (SD) Other group, M (SD) Study 1a condition Control (N = 425) 0.23 (0.43) 0.14 (0.45) 0.01 (0.43) 0.003 (0.45) Bias Warning (N = 489) 0.15 (0.42) 0.13 (0.42) 0.03 (0.44) 0.02 (0.43) Study 1b condition Control (N = 467) 0.30 (0.59) 0.02 (0.62) 0.16 (0.62) 0.11 (0.66) Bias Warning (N = 433) 0.25 (0.57) 0.15 (0.60) 0.09 (0.58) 0.04 (0.61) Study 2a condition Control (N = 462) 0.20 (0.49) 0.04 (0.50) 0.05 (0.53) 0.05 (0.51) Politics Bias Warning (N = 461) 0.12 (0.47) 0.08 (0.45) 0.002 (0.46) 0.03 (0.45) Attractiveness Bias Warning (N = 469) 0.10 (0.51) 0.04 (0.48) 0.10 (0.48) 0.04 (0.49) Dual Bias Warning (N = 474) 0.07 (0.47) 0.01 (0.47) 0.04 (0.49) 0.01 (0.49) Study 2b condition Control (N = 362) 0.17 (0.52) 0.03 (0.51) 0.03 (0.50) 0.07 (0.53) Politics Bias Warning (N = 383) 0.17 (0.48) 0.12 (0.48) 0.05 (0.50) 0.02 (0.46) Attractiveness Bias Warning (N = 379) 0.11 (.47) 0.01 (.51) 0.04 (0.50) 0.05 (0.51) Dual Bias Warning (N = 395) 0.12 (0.49) 0.06 (0.49) 0.07 (0.50) 0.01 (0.49) Study 3 condition Control (N = 586) 0.17 (0.51) 0.03 (0.51) 0.04 (0.49) 0.12 (0.50) Dual Bias Warning (N = 604) 0.09 (0.50 0.01 (0.49) 0.06 (0.48) 0.02 (0.49) EEOC Bias Warning (N = 577) 0.14 (0.48) 0.02 (0.48) 0.05 (0.48) 0.05 (0.52) Study 4 condition Control (N = 546) 0.16 (0.52) 0.04 (0.51) 0.02 (0.52) 0.06 (0.51) General Bias Warning (N = 618) 0.17 (0.50) 0.06 (0.50) 0.01 (0.50) 0.08 (0.50) Study S1 condition Control (N = 416) 0.16 (0.49) 0.03 (0.49) 0.04 (0.47) 0.09 (0.52) Politics Prejudice Warning (N = 334) 0.12 (0.47) 0.08 (0.51) 0.02 (0.47) 0.001 (0.49) Attractiveness Prejudice Warning (N = 432) 0.09 (0.51) 0.005 (0.49) 0.07 (0.47) 0.01 (0.48) Dual Prejudice Warning (N = 358) 0.06 (0.47) 0.02 (0.45) 0.01 (0.49) 0.005 (0.48) Note. EEOC = Equal Employment Opportunity Commission. Means and standard deviations for perceived and desired performance are presented in Table 2; Table 3 presents the percentage of participants who reported having shown no bias on the task or wanting to show no bias for both physical attractiveness and ingroup status. Across studies, participants perceived having favored more physically attractive applicants (aggregate d = 0.18) and members of their own political or university group (aggregate d = 0.35). On average, participants reported a moderate desire to favor ingroup members (aggregate d = 0.32) and a very small desire to favor more physically attractive applicants (aggregate d = 0.04). Nevertheless, a large majority of participants reported a desire to be unbiased on the JBT (attractiveness: 91.8%; ingroup 86.3%) and a perception of having been unbiased (attractiveness: 79.3%; ingroup 82.0%). This suggests that most participants viewed these biases as socially unacceptable in the context of an academic evaluation, but a few felt free to express ingroup favoritism. In line with actual JBT behavior, bias warnings were associated with small changes in desired performance (Hedges s g =.10, 95% CI = [.05,.15]) and perceived performance (Hedges s g =.18, 95% CI = [.14,.23]) such that perceived and desired performance indicated more equal treatment after being warned about the potential for a specific bias. Finally, across studies, correlations with JBT performance replicated prior work (Axt et al., 2016; Axt et al., 2018): Criterion bias was weakly but positively associated with explicit attitudes (r =.16, 95% CI = [.12,.21]), perceived performance (r =.29, 95% CI = [.22,.36]), desired performance (r =.22, 95% CI = [.15,.29]), and implicit associations (r =.12, 95% CI = [.10,.14]). Discussion In control conditions, participants displayed two simultaneous biases: favoritism toward more physically attractive people and toward ingroup members. Warning participants of a potential university or political bias, and asking them to avoid displaying such biases, reduced that bias, though very weakly in Study 1a. Warning participants of a potential

8 Personality and Social Psychology Bulletin 00(0) Figure 2. Box plots of criterion values in Study 1b for each experimental condition. Note. Diamond ( ) denotes mean. Own group were applicants from the decision maker s political party, and other group were applicants from the other political party. In the bias warning condition, participants were asked to avoid favoring applicants from their own political party and encouraged to be fair toward own and other group applicants. university or political bias reduced attractiveness bias weakly in Study 1a and not at all in Study 1b. Despite relatively high-powered tests, these results do not definitively support or reject the possibility that an intervention warning about bias in one category influences a second, unmentioned social category bias in judgment. We sought more evidence by replicating and extending these findings. In Studies 2a and 2b, we replicated Study 1b and extended it into a 2 2 design, making warning participants of none, one, or both of the social judgment biases. Studies 2a and 2b Method Participants. Participants in Studies 2a to 4 came from the Project Implicit research pool. In Study 2a, we selected Americans who reported being at least slightly liberal or conservative and targeted a sample of 400 per condition. This sample would provide over 90% power at detecting a between-subjects effect of d = 0.24, which was the size of the impact of the warning for politics bias in Study 1b. Exclusion rates were overestimated, leading to a final sample larger than anticipated. In total, 2,106 participants (M Age = 37.5, SD = 14.9, 61.9% female, 72.8% White) provided data, and 1,993 reported being a Democrat or Republican. See https://osf.io/c6vjr/ for the study s preregistration. In Study 2b, we selected Americans who reported being at least slightly liberal or conservative and targeted a sample of 357 for each condition. This sample provided over 80% power at detecting a between-subjects effect of d = 0.21, which was the average effect for the impact of warning about a specific bias for reducing that bias on the JBT in Studies 1b to 2a. In total, 1,744 participants (M Age = 32.8, SD = 14.6, 69.5% female, 73.4% White) provided data, and 1,623 reported being a Democrat or Republican. See https://osf.io/ ynhup/ for the study s preregistration. Procedure. In both studies, participants completed five components in the following order: bias warning intervention, academic JBT, self-report items of JBT performance, explicit attitudes and political identity, and a measure of implicit political identification. In Study 2b, immediately after the JBT, participants also completed five items assessing perceptions of differences among applicants attractiveness, gender, race, political party, and GPA. Our preregistered analysis plan noted that we would only analyze these items if we replicated a three-way interaction found in Study 2a, which we did not. As a result, we did not analyze these items and do not discuss them further. Experimental conditions. In both studies, participants were randomly assigned to one of four conditions: Control, Politics Bias Warning, Attractiveness Bias Warning, or Dual Bias Warning. In the Control condition, participants received no additional instructions. In the Politics Bias Warning condition, participants read the same manipulation as Study 1b. In the Attractiveness Bias Warning condition, participants read

Axt et al. 9 Table 2. Means (and Standard Deviations) for Attitude and Performance Measures. Attractiveness Ingroup Exp. preference Perc. performance Des. performance Exp. preference Imp. association Perc. performance Des. performance Study 1a Control 1.01 (0.82) 0.39 (0.69) 0.07 (0.47) 0.72 (0.91) 0.41 (0.32) 0.19 (0.49) 0.17 (0.54) Bias Warning 1.07 (0.85) 0.31 (0.60) 0.08 (.43) 0.66 (0.88) 0.39 (0.29) 0.10 (0.40) 0.06 (0.36) Study 1b Control 0.44 (1.03) 0.05 (0.83) 0.04 (0.76) 1.22 (1.28) 0.32 (0.87) 0.32 (0.81) Bias Warning 0.38 (0.95) 0.09 (0.77) 0.09 (0.61) 1.03 (1.16) 0.20 (0.73) 0.24 (0.74) Study 2a Control 0.71 (0.98) 0.20 (0.66) 0.05 (0.43) 1.38 (1.17) 0.35 (0.44) 0.31 (0.75) 0.22 (0.65) Politics Bias Warning 0.62 (0.96) 0.18 (0.67) 0.04 (0.44) 1.41 (1.21) 0.42 (0.47) 0.10 (0.48) 0.12 (0.49) Attractiveness Bias Warning 0.62 (0.97) 0.02 (0.61) 0.02 (0.48) 1.37 (1.15) 0.35 (0.46) 0.18 (0.58) 0.16 (0.56) Dual Bias Warning 0.51 (0.90) 0.002 (0.57) 0.03 (0.37) 1.19 (1.21) 0.38 (0.46) 0.14 (0.52) 0.06 (0.40) Study 2b Control 0.70 (1.01) 0.15 (0.75) 0.02 (0.49) 1.34 (1.17) 0.38 (0.44) 0.29 (0.68) 0.22 (0.65) Politics Bias Warning 0.65 (1.00) 0.10 (0.63) 0.02 (0.32) 1.29 (1.21) 0.34 (0.49) 0.19 (0.64) 0.15 (0.58) Attractiveness Bias Warning 0.60 (0.95) 0.05 (0.62) 0.02 (0.48) 1.29 (1.15) 0.36 (.46) 0.27 (0.68) 0.20 (0.63) Dual Bias Warning 0.67 (0.90) 0.07 (0.59) 0.04 (0.48) 1.27 (1.14) 0.35 (0.44) 0.17 (0.56) 0.19 (0.56) Study 3 Control 0.65 (0.89) 0.15 (0.67) 0.002 (0.39) 1.67 (1.14) 0.38 (0.46) 0.36 (0.77) 0.29 (0.73) Dual Bias Warning 0.67 (0.94) 0.07 (0.59) 0.003 (0.39) 1.67 (1.15) 0.41 (0.43) 0.19 (0.63) 0.20 (0.62) EEOC Bias Warning 0.67 (0.92) 0.11 (0.61) 0.03 (0.36) 1.61 (1.13) 0.42 (0.46) 0.27 (0.66) 0.24 (0.66) Study 4 Control 0.77 (0.97) 0.12 (0.72) 0.004 (0.41) 1.60 (1.11) 0.40 (0.46) 0.29 (0.72) 0.22 (0.67) General Bias Warning 0.75 (0.93) 0.14 (0.70) 0.03 (0.40) 1.54 (1.13) 0.38 (0.45) 0.25 (0.64) 0.23 (0.65) Study S1 Control 0.62 (0.84) 0.05 (0.55) 0.01 (0.39) 1.44 (1.19) 0.38 (0.45) 0.28 (0.69) 0.21 (0.62) Politics Prejudice Warning 0.62 (0.94) 0.14 (0.58) 0.04 (0.51) 1.48 (1.15) 0.37 (0.46) 0.17 (0.55) 0.14 (0.52) Attractiveness Prejudice 0.56 (0.83) 0.03 (0.48) 0.02 (0.39) 1.43 (1.20) 0.37 (0.46) 0.20 (0.55) 0.23 (0.63) Warning Dual Prejudice Warning 0.54 (0.88) 0.02 (0.39) 0.01 (0.39) 1.34 (1.15) 0.40 (0.47) 0.15 (0.57) 0.16 (0.56) Note. For ingroup items, higher values mean more preference, stronger implicit identification, or more perceived/desired favoritism for members of one s own group (university in Study 1a, political party in all other studies). Exp. Preference = explicit preference item; Perc. Performance = perceived performance item; Des. Performance = desired performance item; Imp. Association = implicit association measures (IAT in Study 1a, BIAT in all other studies); IAT = Implicit Association Tests; BIAT = Brief Implicit Association Test; EEOC = Equal Employment Opportunity Commission. an updated version of that manipulation replacing any mention of political ingroup bias with a bias favoring more physically attractive people. In the Dual Bias Warning condition, participants read the relevant text from both the Attractiveness and Politics Bias Warning conditions. See supplemental material for full text. Academic decision-making task. Participants completed the same JBT as Study 1b. Perceptions of performance, explicit attitudes, and political identity. Participants completed the same explicit attitude and performance measures, as well as the same measure of self-reported political identification as Study 1b. Implicit identification. Participants completed a four-block, self-focal Brief Implicit Association Test (Sriram & Greenwald, 2009) measuring identification with Democrats versus Republicans. Results In all, 127 (6.4%) participants in Study 2a and 104 (6.4%) participants in Study 2b were excluded from analysis. In both studies, overall JBT accuracy was above chance (Study 2a: M = 68.4%, SD = 8.4; Study 2b: M = 66.7%, SD = 9.0) and average acceptance rate was close to 50% (Study 1a: M = 51.2%, SD = 12.0; Study 1b: M = 51.8%, SD = 12.1). Criterion bias in decision making. For Studies 2a and 2b, primary analyses focused on a 2 (Applicant Attractiveness: More vs. Less physically attractive) by 2 (Applicant Politics: Own party vs. Other party) by 2 (Attractiveness Condition: Bias Warning vs. None) by 2 (Politics Condition: Bias

10 Personality and Social Psychology Bulletin 00(0) Table 3. Percentage of Participants Reporting a Desire or Perception of Having Behaved Fairly. Attractiveness Ingroup Perceived performance (%) Desired performance (%) Perceived performance (%) Desired performance (%) Study 1a Control 63.3 88.5 81.9 88.0 Bias Warning 69.5 92.0 86.1 92.6 Study 1b Control 78.0 84.0 75.9 78.5 Bias Warning 83.1 87.3 86.7 84.1 Study 2a Control 75.4 91.1 79.6 83.9 Politics Bias Warning 79.4 92.2 87.1 91.9 Attractiveness bias Warning 82.5 91.9 85.1 87.9 Dual Bias Warning 83.9 93.5 88.5 93.9 Study 2b Control 78.3 90.3 77.9 83.7 Politics Bias Warning 78.3 95.0 82.8 89.9 Attractiveness Bias Warning 82.0 92.0 81.5 87.5 Dual Bias Warning 79.9 90.6 87.4 86.3 Study 3 Control 76.2 94.9 74.5 80.7 Dual Bias Warning 84.5 93.7 81.9 85.6 EEOC Bias Warning 80.8 93.2 78.0 83.8 Study 4 condition Control 77.1 92.0 77.8 85.0 General Bias Warning 77.9 91.9 80.4 84.1 Study S1 Control 84.5 93.3 79.8 84.5 Politics Prejudice Warning 83.1 90.5 84.6 88.6 Attractiveness Prejudice Warning 85.4 92.7 84.1 85.5 Dual Prejudice Warning 85.0 93.4 86.5 89.7 Note. EEOC = Equal Employment Opportunity Commission. Warning vs. None) mixed-measures ANOVA on criterion for each combination of applicant attractiveness and political identity. Both studies showed main effects of physical attractiveness such that more physically attractive applicants received lower criterion than less physically attractive applicants Study 2a: F(1, 1862) = 47.58, p <.001, η p 2 =.025, 95% CI = [.013,.041]; Study 2b: F(1, 1515) = 92.90, p <.001, η p 2 =.058, 95% CI = [.037,.082]. There were also main effects of applicant political party such that applicants from one s own party received lower criterion than from the other party Study 2a: F(1, 1862) = 57.61, p <.001, η p 2 =.030, 95% CI = [.017,.047]; Study 2b: F(1, 1515) = 56.10, p <.001, η p 2 =.036, 95% CI = [.020,.056]. Main effects of bias warnings for attractiveness or politics were not reliable (η p 2 <.002). See Figure 3 for criterion values in Study 2a and Figure 4 for Study 2b. If warning about a specific bias reduced that bias, there would be evidence of interactions between applicant attractiveness and attractiveness bias warning condition as well as between applicant political ingroup and politics bias warning condition. Both studies showed an applicant attractiveness by attractiveness bias warning interaction Study 2a: F(1, 1862) = 35.60, p <.001, η p 2 =.019, 95% CI = [.009,.033]; Study 2b: F(1, 1515) = 9.40, p =.002, η p 2 =.006, 95% CI= [.001,.016]; the attractiveness bias was smaller when participants were warned about the potential attractiveness bias (Study 2a: η p 2 =.0004 ; Study 2b: η p 2 =.027) than when not (Study 2a: η p 2 =.086 ; Study 2b: η p 2 =.100). There was weak evidence of political ingroup by politics bias warning interactions in Study 2a, F(1, 1862) = 5.19, p =.023, η p 2 =.003, 95% CI = [.00002,.010], and Study 2b, F(1, 1515) = 3.34, p =.068, η p 2 =.002, 95% CI = [0,.009]. In both cases, the main effect of political ingroup was smaller when participants had been warned about the bias (Study 2a: η p 2 =.018 ; Study 2b: η p 2 =.024) than when not (Study 2a: η p 2 =.043 ; Study 2b: η p 2 =.047). The evidence for warnings

Axt et al. 11 Figure 3. Box plots of criterion values in Study 2a for each experimental condition. Note. Diamond ( ) denotes mean. Own group were applicants from the decision maker s political party, and other group were applicants from the other political party. reducing biased judgment was consistent for attractiveness and politics, but stronger for attractiveness. A central question was whether warning about bias for one category influenced the bias expressed on the other category. If so, we would observe interactions between applicant attractiveness and politics bias warning conditions and between applicant political ingroup status and attractiveness bias warning conditions. In both studies, there was no evidence of interactions between applicant attractiveness and politics bias warning condition Study 2a: F(1, 1862) = 0.03, p =.868, η p 2 <.001 ; Study 2b: F(1, 1515) = 0.03, p =.865, η p 2 <.001 or between applicant political ingroup by attractiveness bias warning condition Study 2a: F(1, 1862) = 0.94, p =.333, η p 2 =.001; Study 2b: F(1, 1515) =.11, p =.744, η p 2 <.001. It is possible that the effectiveness of the bias warning interventions was moderated by whether participants were also warned about the other potential source of bias. This would be indicated by three-way interactions of both bias warning manipulations and the category manipulation (attractiveness or politics). There was no evidence of an interaction between applicant attractiveness and the two warning conditions Study 2a: F(1, 1862) = 0.29, p =.593, η p 2 <.001; Study 2b: F(1, 1515) =.13, p =.720, η p 2 <.001. Also, there was no evidence of an interaction between applicant politics, attractiveness bias warning, and politics bias warning in Study 2b, F(1, 1515) = 0.46, p =.500, η p 2 <.001. However, there was suggestive evidence of such an interaction in Study 2a, F(1, 1862) = 7.37, p =.007, η p 2 =.004, 95% CI = [.0003,.012]. The impact of the politics bias warning manipulation on reducing political bias was stronger when participants were only warned about the politics bias (η p 2 =.012) compared with when they were also warned about an attractiveness bias (η p 2 <.001). While intriguing, as

12 Personality and Social Psychology Bulletin 00(0) Figure 4. Box plots of criterion values in Study 2b for each experimental condition. Note. Diamond ( ) denotes mean. Own group were applicants from the decision maker s political party, and other group were applicants from the other political party. a single instance among multiple tests, this is weak evidence that should be replicated before taken seriously. No other terms in the ANOVA in either study reached suggestive (p <.05) or stronger (p <.005) statistical significance (Benjamin et al., 2018). See supplemental material for full reporting. Discussion Warning participants about a potential bias in social judgment was effective at reducing the bias, but only if the social category was explicitly identified. Unlike the mixed evidence in Studies 1a and 1b, we observed no evidence in Studies 2a and 2b that warning of a potential bias toward one social category reduced bias toward another social category. Study 3 The general conclusion of Studies 1a to 2b is that warning about potential biases in judgment decreases biased behavior for the specific social categories mentioned but does not transfer to reduced judgment biases for other, unmentioned social categories. One logical extension of this result is to examine what happens when participants are warned about many potential biases. This question has both theoretical relevance in establishing whether there is an upper bound in the number of categories that can be listed in an effective intervention and practical consequences given that existing antidiscriminatory warnings for evaluation (e.g., the EEOC language for protected classes) list many social categories as potentially influencing judgment. It is possible that raising warning about many potential biases will increase the effectiveness of the intervention, as