How Bad Data Analyses Can Sabotage Discrimination Cases

By Mike Nguyen; Analysis Group, Inc.

Law360, New York (December 15, 2016)

When Mark Twain popularized the phrase "there are three kinds of lies: lies, damned lies, and statistics," he was throwing out the baby with the bathwater. When utilized properly, statistics can be a highly effective tool in supporting a legal argument. Conversely, statistics that are misunderstood or misinterpreted can quickly sink a case.

In discrimination cases, as in so many other matters, it is critical to understand both the benefits and limitations of using data to support an argument. We're going to ask our friends Serena Williams and Derek Jeter to help us make the case. But first, some background, and then some recommendations for attorneys involved in these types of cases.

Over the years, statistical data and analyses have been used in courtrooms as evidence of race-, gender- or age-based discrimination. As documentary evidence of discriminatory practices is rarely available, courts often rely on additional evidence, including statistical analyses, to evaluate whether the process or practice in question may have been discriminatory. However, the courts have been cautious in using statistical evidence to evaluate whether discrimination has in fact occurred, realizing that the statistical evidence itself may contain bias or yield misleading results. This article discusses a common flaw with statistical analyses that can have a material effect on the outcomes.

Inclusive Communities Opens the Door

In the Texas Department of Housing and Community Affairs v. Inclusive Communities Project Inc. Fair Housing Act case (Inclusive Communities), the importance of sound statistical analysis in race-, gender- or age-based discrimination cases was amplified. In its 2015 decision, the U.S. Supreme Court addressed whether the Fair Housing Act recognizes disparate impact claims,[1] in which a plaintiff can establish liability, without proof of intentional discrimination, if a particular business practice has a disproportionate effect on specific groups of individuals and if the practice is not based on sound business considerations. The court found that, based on its language and purpose, the Fair Housing Act provides for disparate impact claims. However, the court also imposed important limitations to protect potential defendants against abusive disparate-impact claims.[2] Here's where statistical analysis comes in.

Limitations of Statistical Analyses and Simpson's Paradox

It is important to recognize that statistical analyses, including regression analyses, cannot perfectly determine whether differences in the outcomes for certain groups of individuals are due to discrimination. For example, the large body of academic literature on race-based discrimination in mortgage lending shows that one cannot look at outcomes in a regression model and infer race-based discrimination without having controlled for all factors that explain differences in mortgage credit decisions and without reviewing individual loan files.[3] There are many variables that affect mortgage interest rates paid by borrowers that typically are not available for the purposes of a regression analysis. These include factors that may be nonquantifiable, such as a borrower's experience.

When using statistical analyses to test for discrimination in mortgage lending and other practices, experts often aggregate data in order to draw conclusions about particular groups as a whole. However, experts need to be cautious not to offer misleading conclusions that can result from such aggregation. A classic pitfall involves something known as Simpson's Paradox.

Simpson's Paradox is a statistical concept that refers to the reversal of statistical relationships when two or more groups of data are combined. In a Simpson's Paradox situation, an analysis of aggregated data can produce a statistically significant result[4] even though the individual elements included in the aggregation may not be statistically significant.

A famous example of Simpson's Paradox involves claims of gender bias in graduate admissions at the University of California at Berkeley. Researchers analyzing aggregate graduate school admissions data across all graduate departments at the university found that male applicants were more likely to be admitted than female applicants, and that the difference was so large that it was unlikely to be due to chance. However, each of Berkeley's departments maintained its own policies as to graduate admissions and made decisions independent of the other departments. When the analysis was conducted for each individual decision maker (i.e., the individual department), only four departments had a statistically significant bias in favor of males, while six departments actually had a statistically significant bias in favor of females, even though the aggregate analysis had suggested a statistically significant bias in favor of males across the entire university. The underlying cause of this aggregate result was that more women applied to the most competitive departments with the highest demand and the lowest admissions rates. In other words, women rejected from the highly competitive departments were overrepresented in the aggregate population.

To understand how the math in a Simpson's Paradox situation works, consider a simple example from Serena Williams. Imagine she is playing in the final of the Law360 Open against her sister Venus. Venus wins a total of 14 games and Serena only 12. Venus wins, right? Not necessarily, because of the structure of the game. In women's tennis, the winner must take two of three sets. So while Venus won more games, did she win two sets? In our example, Venus wins the first set 6-0. But she loses the next two sets, each by the score of 6-4. So while Serena won fewer games (12 to her sister's 14), she wins the match (two sets to her sister's one). (See Table 1.)

Another example of Simpson's Paradox can be seen in comparing the batting averages of players in professional baseball. Consider this question: Is it possible for one player to hit for a higher batting average than another player during a given year, and to do so again during the next year, but to have a lower batting average when the two years are combined? Surprisingly, the answer is yes. This can occur when there are large differences in the number of at-bats between the years.
Ken Ross, a professor emeritus at the University of Oregon and baseball enthusiast, notes in a 2004 book on baseball statistics that in both 1995 and 1996, Derek Jeter of the New York Yankees had a lower batting average for each season than David Justice, then of the Atlanta Braves. (See Table 2.)
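
The arithmetic behind this comparison can be checked with a short sketch. The hit and at-bat counts below are the figures commonly cited for this example; they are an assumption here, since the article's tables did not survive in this text, though they agree with the 0.314 and 0.253 averages the article quotes. The helper names `average` and `combined` are illustrative only.

```python
# (hits, at-bats) per season. Commonly cited figures for this example,
# assumed here because the article's own tables are not reproduced.
seasons = {
    "Derek Jeter": {1995: (12, 48), 1996: (183, 582)},
    "David Justice": {1995: (104, 411), 1996: (45, 140)},
}


def average(hits, at_bats):
    """Batting average: hits divided by at-bats."""
    return hits / at_bats


def combined(player):
    """Two-year average: total hits over total at-bats, not the mean of the yearly averages."""
    hits = sum(h for h, _ in seasons[player].values())
    at_bats = sum(ab for _, ab in seasons[player].values())
    return average(hits, at_bats)


for player, by_year in seasons.items():
    per_year = ", ".join(
        f"{year}: {average(*stats):.3f}" for year, stats in by_year.items()
    )
    print(f"{player}: {per_year}, combined: {combined(player):.3f}")
```

Because the combined average weights each season by its at-bats, a player whose better season carries most of his at-bats can come out ahead overall while trailing in each individual year.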

Combining the two years, however, Derek Jeter had a higher batting average overall. The paradox here results from the fact that the majority of Derek Jeter's at-bats came in 1996, with a higher batting average (0.314), while the majority of David Justice's at-bats came in 1995, with a lower batting average (0.253). The combined figures push the two-year average in Derek Jeter's favor. (See Table 3.)

Application to a Discrimination Case

If not examined carefully, statistics in the courtroom can be equally misleading. In a case in which Analysis Group provided expert analysis, a mortgage lender was accused of implementing a discretionary pricing policy under which mortgage brokers were given discretion in pricing mortgage loans. Plaintiffs' expert, relying on regression analysis of aggregated data, alleged that this policy caused African-American borrowers to receive higher annual percentage rates (APRs) for mortgage loans than similarly situated white borrowers.

However, if in fact there was a discretionary pricing policy, and that policy had a disparate impact on African-American borrowers, then the disparate impact caused by the policy should be observed consistently across the various mortgage brokers who applied it. If such disparate impact was not observed consistently across mortgage brokers, that would suggest that loan pricing was the result of individualized decision-making by the brokers rather than the result of a common policy applied across the entire borrower population.

When Analysis Group's expert (working on behalf of the defendant) disaggregated the mortgage data and analyzed it at the individual broker level, it became clear that the statistical evidence did not reflect a commonly applied discretionary pricing policy that resulted in higher loan prices for African-American borrowers as a group. In fact, when the top mortgage brokers were studied individually, the plaintiffs' expert's own regression model revealed that, for certain brokers, African-American borrowers received APRs that were lower than the APRs received by similarly situated white borrowers.
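
How a pooled comparison and a broker-level comparison can point in opposite directions is easier to see with numbers. The sketch below uses invented summary data for two hypothetical brokers and compares simple mean APRs; it is a simplification, since the analyses described above relied on regression models with additional controls, and none of these figures come from the case.

```python
# Hypothetical summary data: (broker, borrower group, number of loans, mean APR in percent).
# All values are invented for illustration and do not come from any real case.
loans = [
    ("Broker X", "African-American", 80, 7.0),
    ("Broker X", "white", 20, 7.1),
    ("Broker Y", "African-American", 20, 5.0),
    ("Broker Y", "white", 80, 5.1),
]


def mean_apr(rows):
    """Loan-weighted mean APR over the given rows."""
    total_loans = sum(n for _, _, n, _ in rows)
    return sum(n * apr for _, _, n, apr in rows) / total_loans


def gap(rows):
    """Mean APR of African-American borrowers minus mean APR of white borrowers."""
    african_american = [r for r in rows if r[1] == "African-American"]
    white = [r for r in rows if r[1] == "white"]
    return mean_apr(african_american) - mean_apr(white)


# Aggregated view: all loans pooled, ignoring which broker priced them.
pooled_gap = gap(loans)

# Disaggregated view: the same comparison within each broker.
brokers = sorted({r[0] for r in loans})
broker_gaps = {b: gap([r for r in loans if r[0] == b]) for b in brokers}

print(f"Pooled gap: {pooled_gap:+.2f} percentage points")
for broker, g in broker_gaps.items():
    print(f"{broker} gap: {g:+.2f} percentage points")
```

In this invented data the pooled gap is positive only because most of the African-American borrowers' loans come from the higher-priced broker; within each broker the gap is slightly negative.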

Key Takeaways: Understand the Strengths and Limitations of Data in Discrimination Cases

The Supreme Court's decision in Inclusive Communities suggests that statistical analyses may play a larger role in discrimination cases because disparate impact alone may be sufficient to demonstrate that discrimination has occurred. For attorneys involved in these types of cases, there are a few important takeaways:

Statistical analysis can be a highly effective tool in supporting a legal argument.
It is critical to understand the assumptions upon which statistical tests and conclusions are based.
Statistical tests that are not done properly can produce misleading results.

And finally, if you're having a good year, try to get more at-bats.

Mike Nguyen is a Vice President in the Los Angeles office of Analysis Group, and a former wide receiver for the UCLA Bruins (1991-1995).

The opinions expressed are those of the author(s) and do not necessarily reflect the views of the firm, its clients, or Portfolio Media Inc., or any of its or their respective affiliates. This article is for general information purposes and is not intended to be and should not be taken as legal advice.

Endnotes

[1] The theory of disparate impact holds that practices in employment, housing, or other areas may be considered discriminatory and illegal if they have a disproportionate adverse impact on persons in a protected class. Disparate impact is different from disparate treatment, which involves discriminatory intent or motive. Both disparate impact and disparate treatment are considered forms of discrimination.

[2] For example, it emphasized that the plaintiff has the burden to establish a robust causal connection between the challenged practice and the alleged disparities. http://www.law360.com/articles/794007/us-supremecourt-impact-on-mass-disparate-impact-claims.

[3] See, e.g., Paul Calem and Stanley Longhofer, Anatomy of a Fair Lending Exam: The Uses and Limitations of Statistics, 24 J. REAL ESTATE FIN. 207 (2002).

[4] A statistically significant result is a result that is unlikely to have occurred purely by chance.

All Content © 2003-2017, Portfolio Media, Inc.