Summer Institute in Statistical Genetics University of Washington, Seattle Forensic Genetics Module

Similar documents
DNA evidence. Attack the DNA proof. Cybergenetics Distorting DNA evidence: methods of math distraction

Statistical Evaluation of Sibling Relationship

Statistical Weights of Mixed DNA Profiles

Why the effect of prior odds should accompany. the likelihood ratio when reporting DNA evidence

CHAPTER 9: Producing Data: Experiments

GENETIC DRIFT & EFFECTIVE POPULATION SIZE

Analysis of complex patterns of evidence in legal cases: Wigmore charts vs. Bayesian networks

THE RELIABILITY OF EYEWITNESS CONFIDENCE 1. Time to Exonerate Eyewitness Memory. John T. Wixted 1. Author Note

Evaluating DNA Profile Evidence When the Suspect Is Identified Through a Database Search

Improving statistical estimates used in the courtroom. Precis. Bayes Theorem. Professor Norman Fenton. Queen Mary University of London and Agena Ltd

Book Review of Witness Testimony in Sexual Cases by Radcliffe et al by Catarina Sjölin

ENQUIRING MINDS EQM EP 5 SEG 1

CHAPTER 4 Designing Studies

PHILOSOPHY OF ECONOMICS & POLITICS

DNA Fingerprinting & Forensic Analysis

(b) What is the allele frequency of the b allele in the new merged population on the island?

Myths and Facts of TV. Wardisiani s Forensic Science

Worksheet. Gene Jury. Dear DNA Detectives,

Analyze and synthesize the information in this lesson to write a fiveparagraph essay explaining the problems with DNA testing.

Genetics. by their offspring. The study of the inheritance of traits is called.

Understanding Probability. From Randomness to Probability/ Probability Rules!

Political Science 15, Winter 2014 Final Review

An Introduction to Huntington s Disease

Chapter 19. Confidence Intervals for Proportions. Copyright 2010 Pearson Education, Inc.

The Wellbeing Course. Resource: Mental Skills. The Wellbeing Course was written by Professor Nick Titov and Dr Blake Dear

High School is Over: Should You Go to College?

DNA for Dummies. Examples of when DNA testing will generally not be appropriate:

Kern Regional Crime Laboratory Laboratory Director: Dr. Kevin W. P. Miller

Genetics Unit Outcomes

Animal Breeding & Genetics

Nonparametric Linkage Analysis. Nonparametric Linkage Analysis

DNA Short Tandem Repeats. Organism

Geoffrey Stewart Morrison

So You Want to do a Survey?

Lecture 12: Normal Probability Distribution or Normal Curve

Lecture 2: Learning and Equilibrium Extensive-Form Games

Objectives. Quantifying the quality of hypothesis tests. Type I and II errors. Power of a test. Cautions about significance tests

Building Friendships: Avoid Discounting

Needles in Haystacks

Power & Sample Size. Dr. Andrea Benedetti

15.301/310, Managerial Psychology Prof. Dan Ariely Recitation 8: T test and ANOVA

Transcript of Dr. Mark Perlin's talk on "Preserving DNA Information" delivered on

Inferring identity from DNA profile evidence

Human Molecular Genetics Prof. S. Ganesh Department of Biological Sciences and Bioengineering Indian Institute of Technology, Kanpur

IN THE DISTRICT COURT OF DOUGLAS COUNTY, NEBRASKA

Chapter 19. Confidence Intervals for Proportions. Copyright 2010, 2007, 2004 Pearson Education, Inc.

Measures of validity. The first positive rapid influenza test in the middle FROM DATA TO DECISIONS

A. Indicate the best answer to each the following multiple-choice questions (20 points)

Taming Uncertainty in Forensic DNA Evidence. Uncertainty. Genotype. Cybergenetics Prior probability

How Math (and Vaccines) Keep You Safe From the Flu

Mendelian Genetics. Activity. Part I: Introduction. Instructions

Pros & Cons of Testimonial Evidence. Presentation developed by T. Trimpe 2006

THRIVING ON CHALLENGE NEGATIVE VS. POSITIVE AUTHENTICITY & ABUNDANCE ONLINE COACHING

Step Five. Admitted to ourselves and another human being the exact nature of our wrongs.

Reasoning with Uncertainty. Reasoning with Uncertainty. Bayes Rule. Often, we want to reason from observable information to unobservable information

Transcript of Dr. Mark Perlin s talk on Explaining the likelihood ratio in DNA

Pros & Cons of Testimonial Evidence. Presentation developed by T. Trimpe

Hybrid models of rational legal proof. Bart Verheij Institute of Artificial Intelligence and Cognitive Engineering

Bio 10- Fundamentals of Biology

On limiting the use of Bayes in presenting forensic evidence

Pooling Subjective Confidence Intervals

Your goal in studying for the GED science test is scientific

Sheila Barron Statistics Outreach Center 2/8/2011

Sleepy Suspects Are Way More Likely to Falsely Confess to a Crime By Adam Hoffman 2016

BIAS WHY MUST FORENSIC PRACTICE BE. Itiel Dror. For more information, see:

Stay Married with the FIT Technique Go from Pissed off to Peaceful in Three Simple Steps!

Genetics All somatic cells contain 23 pairs of chromosomes 22 pairs of autosomes 1 pair of sex chromosomes Genes contained in each pair of chromosomes

Oct. 21. Rank the following causes of death in the US from most common to least common:

Scientific Investigation

Chapter 8: Two Dichotomous Variables

Living well today...32 Hope for tomorrow...32

1. UTILITARIANISM 2. CONSEQUENTIALISM 3. TELEOLOGICAL 4. DEONTOLOGICAL 5. MOTIVATION 6. HEDONISM 7. MORAL FACT 8. UTILITY 9.

Why Is It That Men Can t Say What They Mean, Or Do What They Say? - An In Depth Explanation

Enumerative and Analytic Studies. Description versus prediction

ABSOLUTE AND RELATIVE JUDGMENTS IN RELATION TO STRENGTH OF BELIEF IN GOOD LUCK

EVIDENCE AND INVESTIGATION: Booklet 1

OCW Epidemiology and Biostatistics, 2010 David Tybor, MS, MPH and Kenneth Chui, PhD Tufts University School of Medicine October 27, 2010

Notice that the word is pros-tate, although you will occasionally hear pros-trate! Bullet #1: other than skin cancer

Design an Experiment. Like a Real Scientist!!

English *P48984A0112* E202/01. Pearson Edexcel Functional Skills. P48984A 2015 Pearson Education Ltd. Level 2 Component 2: Reading

Understanding Molecular Mechanisms of Cancers

Inductive reasoning in the courtroom: Judging guilt based on uncertain evidence

How much can you trust your memory?

Interview with Prof. Dr. Mohamed Abou-Donia, Duke University Durham, North Carolina, USA at London on

Problem Situation Form for Parents

Just use the link above to register. Then start with the next slide.

Modelling crime linkage with Bayesian Networks

DNA Analysis Techniques for Molecular Genealogy. Luke Hutchison Project Supervisor: Scott R. Woodward

From the scenario below please identify the situation, thoughts, and emotions/feelings.

Research Methods II, Spring Term Logic of repeated measures designs

appstats26.notebook April 17, 2015

UNIT. Experiments and the Common Cold. Biology. Unit Description. Unit Requirements

Genetics. the of an organism. The traits of that organism can then be passed on to, on

AFSP SURVIVOR OUTREACH PROGRAM VOLUNTEER TRAINING HANDOUT

Statistical approaches for the interpretation of DNA evidence

How to Work with the Patterns That Sustain Depression

Testing for. Prostate Cancer

Chapter 1 Observation Skills

CS2220 Introduction to Computational Biology

Transcription:

Summer Institute in Statistical Genetics University of Washington, Seattle Forensic Genetics Module An introduction to the evaluation of evidential weight David Balding Professor of Statistical Genetics University of Melbourne, and University College London July 20/21, 2016

The notes for my lectures today and tomorrow are in the form of a 51-page document extracted from the book.

First we ll do some warm-up exercises. 1 The famous Collins case. 2 Rare disease testing. 3 Coloured taxis. You have seen this sort of example before but the base rate fallacy seems such a natural error for human brains that frequent refreshers are useful. For 2 and 3 above, the problem is set up so that there is a right answer. That s not so for 1, we won t propose a solution here and indeed there is no full solution, but I hope by the end of this course you will have ideas about how to approach evidence evaluation in such cases. One mantra that I would suggest you recite for an hour every evening is Always focus on the right question. Much confusion is caused by straying from this path.

Yesterday s news headline Melania Trump: Astrophysicist calculates there was one in 87 billion chance speech was not plagiarised What s wrong with the headline? What s needed to make it right?

In the UK case of Sally Clark, the medical expert Roy Meadows testified something like The probability for two babies in a family to die of SIDS is 1 in 73 million. The number is wrong, but let s ignore that. What is the right question? For DNA profile evidence the question is usually Did the DNA in the crime sample come from the alleged contributor(s)? We ll approach answering this question slowly, by way of a remote island where crime is rare, but did once happen. The island problem represents a simplification of a forensic identification problem. Don t be tricked into thinking it isn t important in practical problems many of the key ideas are present.

The island problem: facts summary All 101 islanders are initially equally under suspicion; The culprit has Υ; The suspect has Υ; The Υ-states of the other islanders are unknown; We expect on average about 1 person in 100 to have Υ. What is the probability that the suspect is guilty? SISG 2016 Forensic Genetics

Lessons from the island problem 1 The fact that Υ is rare (i.e. p is small) does not, taken alone, imply that Q is likely to be guilty. 2 Uncertainty about p does not cancel out. Ignoring uncertainty is unfavourable to defendants. 3 The overall weight of evidence against Q involves adding together the probability of a chance match and the probability of a match due to a typing error. 4 In the case of a search of possible culprits to find a match with crime scene evidence, the longer the search (i.e. the more individuals found not to match) the stronger the case against the one who is found to match.

Section 2.1.3: Application of the formula There are a lot of ideas in this section so a summary may be useful: The order in which different items of evidence are analysed ultimately doesn t matter, but we need to be clear about order to avoid misunderstandings: If the DNA evidence is considered first there may be a large set of alternative possible contributors, whereas the other evidence may, if accepted by the finder of fact, narrow it down to a small number of individuals. The weight-of-evidence formula doesn t solve all the problems, but it can guide thinking in the right direction. It can help delineate the roles of juror and expert witness. the expert can advise on values for the LR for one or more hypothesis pairs (e.g. Q vs unrelated, Q vs sibling). it s generally not the job of the expert to assess the other evidence (base rate or prior information) but illustrative calculations can be helpful (see next slides)

Illustrative probability calculation. Reported LR: 3 million (unrelated)... consider the hypothetical scenario in which, on the basis of the non-dna evidence, a juror considered that there were 1 million men who could be the questioned DNA source: X and 999,999 men unrelated to him. If each is initially considered equally likely to be the source, the effect of the DNA evidence would be to change the probability that X is indeed the correct source from 1 in 1 million (or 0.0001%) up to 75%. If initially there were 10,000 men each equally likely to be the source of the DNA, the effect of the DNA evidence would be to change the probability that X is the true source from 1 in 10,000 (or 0.01%) up to 99.7%. Finally, if the other evidence were such that a juror considered that only 100 equally likely men could be the source of the DNA, the effect of the DNA evidence would be to change the probability that X is the true source from 1 in 100 (or 1%) up to over 99.99%. SISG 2016 Forensic Genetics

Illustrative probability calculation. Reported LR: 61 (brother) Again hypothetically, if a juror judged that the DNA must have come from either X or a specified brother of X, both initially equally likely, then the effect of the DNA evidence would be to change the probability that X is the true source from 1 in 2 (or 50%) up to over 98%.

Section 2.2.4: Laboratory and handling errors The strength of DNA evidence requires the probabilities of 1 a chance match with an alternative possible contributor, and 2 a false match due to an error. The probability of any error occurring in the handling and analysis of a DNA profile is typically much higher than 1 above. Not relevant: only the probability of an error causing a false match. Example of newspaper report of winning lottery ticket. 2 above is essentially limited to the suspect s DNA being present in the crime sample for reasons other than committing the crime, such as: Cross contamination in the lab of evidence samples from different crime scenes (UK 2012 case of Adam Scott) Deliberate planting of DNA to frame a suspect.

Even if considered very unlikely (no evidence to suggest either), the possibility of cross contamination or of planting of evidence may be considered by a reasonable juror to be more likely than a chance match. If so, a DNA match probability may be essentially irrelevant. However that s an assessment to be made by jurors. An expert can give jurors some information about laboratory errors but very little about evidence tampering. It s still useful to report match probabilities even if a juror later assess this to be unimportant.

Why not declare uniqueness of the profile? How to decide the threshold for uniqueness? Low-template and mixed samples often generate modest LRs. What to do about other evidence? All evidence in favour of Q is evidence against uniqueness. What s wrong with RMNE (inclusion probability)? It doesn t answer the right question! How to incorporate other evidence? All evidence counts as evidence against the defendant - not always realistic. Problematic to compute RMNE for low-template profiles and multiple contributors of interest. Weight of evidence doesn t depend on the profile of the alleged contributor. e.g. two co-defendants both alleged to be contributors.

Calculating LRs allowing for coancestry The match probability for X given the profile of Q depends on relatedness: both close relatedness and more distant coancestry. There exists some very nice population genetics theory that allows us to compute match probabilities allowing for relatedness or coancestry, the latter measured by the coefficient F ST. F ST has several interpretations, one convenient interpretation is: The probability that two alleles sampled in a subpopulation are identical through inheritance from an ancestor within the subpopulation. Genotype matches at distinct loci are generally not independent, but relatedness is the cause of non-independence and after adjusting for relatives/coancestry it is reasonable to proceed assuming independence (multiply across loci).

Computing match probabilities The match probabilities depend on: population allele fractions (the p). These are unknown but can be estimated from a population database. Sampling variation should be taken into account: several ways to do this; it usually doesn t make much difference. The coancestry coefficient F ST. Also unknown, and cannot be directly estimated for a particular crime scenario. Appropriate value can vary over different X for the given Q: simplest to proceed using an upper bound. Fortunately high mutation rate at STR loci seems to keep F ST relatively low: we have found that 3% is generally a safe value to use even if the wrong database is specified for X.

Match probabilities for mixture profiles How many and which contributors to include in the competing hypotheses can be difficult. If the amount of DNA from each contributor is estimated, then overstating the number of contributors presents only a computational problem if not really a contributor, the estimated amount of DNA will be low. Low-level contributors that can be distinguished from the contributor(s) of interest can be treated like dropin (so the exact number of contributors need not be known). For multiple contributors of interest, we need to take care about specifying hypotheses. It is usually not appropriate to compare both are contributors with neither is a contributor. Instead we have to proceed one contributor at a time, with and without the other contributor being assumed present. Thus 4 LRs for two queried contributors.