Intro to Probability Instructor: Alexandre Bouchard


Course website: www.stat.ubc.ca/~bouchard/courses/stat302-sp2017-18/

Plan for today: Bayesian inference 101; decision diagrams for non-equally-weighted problems; Bayes rule; the law of total probability.

NSERC USRA: a paid internship program to do research in a research group. Deadline: Jan 18. I will hire one student for a project on probabilistic inference. Looking for someone with programming experience, especially on the JVM, plus a strong interest in probability/machine learning. If interested, drop me an email with your CV and GitHub. Many other projects are available from other faculty members.

Logistics. What's new/recent on the website (www.stat.ubc.ca/~bouchard/courses/stat302-sp2017-18/): Webwork: the first set is released, due on the 18th; problems are being fixed.

Review

Def. 7 (review) Conditional probability: conditioning on an observation E, for a query event A: P(A|E) = P(A ∩ E) / P(E)

Prop. 5 Properties. Check these: 0 ≤ P(A|E) ≤ 1; P(S|E) = 1; if A1, A2, ... are disjoint, then P(A1 ∪ A2 ∪ ... | E) = P(A1|E) + P(A2|E) + ... In other words, P(·|E) is a probability.

Bonus: a more intuitive view of independent events. Recall: events A and B are independent if P(A ∩ B) = P(A) · P(B). Equivalent definition (when P(B) > 0): events A and B are independent if P(A|B) = P(A).
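
The equivalence of the two definitions can be checked numerically. A minimal Python sketch (not part of the slides; the die-roll events are my own illustration), using exact rational arithmetic:

```python
from fractions import Fraction

# One roll of a fair six-sided die.
# A = "roll is even", B = "roll is at most 4".
sample_space = range(1, 7)
A = {2, 4, 6}
B = {1, 2, 3, 4}

# Probability of an event = (# favorable outcomes) / (# outcomes).
P = lambda event: Fraction(len(event), len(sample_space))

# Product definition: P(A ∩ B) = P(A) * P(B)
assert P(A & B) == P(A) * P(B)
# Conditional definition: P(A | B) = P(A ∩ B) / P(B) = P(A)
assert P(A & B) / P(B) == P(A)
```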

Decision trees and conditional probabilities

Ex. 23 Two bags (a bag of red shapes and a bag of blue shapes). 1) Flip a coin to pick one of the two bags. 2) Pick one shape from the selected bag. Question: what is the probability of picking the star? A. 1/6 B. 1/5 C. 1/4 D. 1/2

Decision tree: not equally weighted outcomes. [Tree: from the start S, branch to the blue bag B or the red bag R, with probability 1/2 each; B splits into its two shapes with probability 1/2 each, R into its three shapes with probability 1/3 each. Edge annotations: P(R|S) = P(R) = 1/2, and a leaf L under R has edge probability P(L|R) = 1/3.] Question: what is the probability of picking the star? A. 1/6 B. 1/5 C. 1/4 D. 1/2

Decision tree: not equally weighted outcomes. Multiplying edge probabilities along each path: each leaf under B has probability (1/2) · (1/2) = 1/4; each leaf under R has probability (1/2) · (1/3) = 1/6. For example, the leaf L has P(L) = P(R) · P(L|R) = (1/2) · (1/3) = 1/6.
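
The path-product rule above can be sketched in a couple of lines of Python (my own illustration, assuming the star is one of the three shapes in the red bag, as the slide's leaf computation suggests):

```python
from fractions import Fraction

# A leaf's probability is the product of the edge probabilities on its path:
# P(star) = P(red bag) * P(star | red bag)
p_red_bag = Fraction(1, 2)       # coin flip picks the red bag
p_star_given_red = Fraction(1, 3)  # three shapes in the red bag

p_star = p_red_bag * p_star_given_red
print(p_star)  # 1/6, answer A
```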


Prop. 6 Chain rule. For any events A, B with P(A) > 0: P(A, B) = P(A) · P(B|A). Useful trick: it can be done in the other order: P(A, B) = P(B) · P(A|B).

Medical tests and the law of total probability

Ex. 24 HIV tests: background. Ways of detecting HIV infections: Direct: try to match conserved regions of HIV's genome. Indirect: try to detect the immune system's reaction to the virus. Modern tests often use combinations of these methods and are highly accurate. Today, let's assume a hypothetical, less-than-perfect test. NOTE: the probabilities we will compute today are NOT representative of modern HIV test accuracies, but the ideas apply to rare or harder-to-diagnose diseases.

Ex. 24 HIV testing. An HIV test has the following two modes of failure: When a patient has the disease, the test will still be negative 2% of the time (false negative). When a patient does not have the disease, the test will turn positive 1% of the time (false positive).

Ex. 24 HIV prevalence Source: Wikipedia, http://tinyurl.com/n7kakso

Ex. 24 HIV testing. An HIV test has the following two modes of failure: When a patient has the disease (event H), the test will still be negative (T−) 2% of the time (false negative). When a patient does not have the disease (H^c), the test will turn positive (T+) 1% of the time (false positive). Question: if 0.5% of the population has HIV, what % of the tests will turn positive? A. 2.0% B. 2.198% C. 1.485% D. 0.02%

Ex. 24 Translation of the problem into probabilities and conditional probabilities. An HIV test has the following two modes of failure: When a patient has the disease (event H), the test will still be negative (T−) 2% of the time (false negative). When a patient does not have the disease (H^c), the test will turn positive (T+) 1% of the time (false positive). Question: if 0.5% of the population has HIV, what % of the tests will turn positive? In symbols: P(T−|H) = 0.02, P(T+|H^c) = 0.01, P(H) = 0.005, P(T+) = ?

Decision tree. From S: branch to H with probability 0.005, and to H^c with probability 0.995. From H: T+ with probability 0.98 (leaf H ∩ T+, probability 0.005 · 0.98 = 0.0049) and T− with probability 0.02 (leaf H ∩ T−, probability 0.0001). From H^c: T− with probability 0.99 (leaf H^c ∩ T−, probability 0.98505) and T+ with probability 0.01 (leaf H^c ∩ T+, probability 0.00995). So P(T+) = P(H ∩ T+) + P(H^c ∩ T+) = 0.0049 + 0.00995 = 0.01485. Why?
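
The tree computation above can be sketched in Python (variable names are my own):

```python
# P(T+) via the two branches of the decision tree:
# P(T+) = P(H) P(T+|H) + P(H^c) P(T+|H^c)
p_H = 0.005              # prevalence: P(H)
p_Tpos_given_H = 0.98    # 1 - false-negative rate
p_Tpos_given_Hc = 0.01   # false-positive rate

p_Tpos = p_H * p_Tpos_given_H + (1 - p_H) * p_Tpos_given_Hc
print(round(p_Tpos, 5))  # 0.01485, i.e. 1.485% (answer C)
```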

Justification for P(T+) = P(H ∩ T+) + P(H^c ∩ T+). First, recall Def. 4 (partition of an event): the events Ei form a partition of the event F if: 1. the union of the Ei's is equal to F: ∪i Ei = F; 2. the Ei's are disjoint: if i ≠ j, then Ei ∩ Ej = ∅. Why is this useful? As a consequence of 1, 2 and the axioms of probability, P(F) = P(E1) + P(E2) + ... + P(En). Second, check that F = T+, E1 = H ∩ T+, E2 = H^c ∩ T+ satisfy: 1. E1 ∪ E2 = F; 2. E1 ∩ E2 = ∅.

Prop. 7 Law of total probability. If A is any event of interest (for example, A = T+ in the previous example), then we can decompose its probability as: P(A) = P(H ∩ A) + P(H^c ∩ A). This is true for any event H. Why? P(A) = P(S ∩ A) = P((H ∪ H^c) ∩ A) = P((H ∩ A) ∪ (H^c ∩ A)) = P(H ∩ A) + P(H^c ∩ A).

Ex. 24 HIV testing (revisited). Question: if 0.5% of the population has HIV, what % of the tests will turn positive? From the decision tree, P(T+) = 0.01485, so the answer is C. 1.485%.

Prop. 8 Bayes rule. Notation: H, H^c: a partition of the space into hypotheses (example: whether the patient has HIV or not). E: an observation (example: a positive test, E = T+). We want: P(H|E). We have: P(E|H), P(E|H^c), P(H). P(H|E) = P(E, H) / P(E) [definition of conditional probability] = P(H) P(E|H) / (P(E, H) + P(E, H^c)) [chain rule in the numerator, law of total probability in the denominator] = P(H) P(E|H) / (P(H) P(E|H) + P(H^c) P(E|H^c)) [chain rule].
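
The formula just derived can be sketched as a small Python function (the helper name is mine; the 0.5%-prevalence posterior below is an illustration, not a question posed on the slides):

```python
def posterior(prior, like_H, like_Hc):
    """Bayes rule for a two-block partition {H, H^c}:
    P(H|E) = P(H)P(E|H) / (P(H)P(E|H) + P(H^c)P(E|H^c))."""
    evidence = prior * like_H + (1 - prior) * like_Hc  # law of total probability
    return prior * like_H / evidence

# With the Ex. 24 numbers (prevalence 0.5%): a positive test raises P(H)
# from 0.5% to only about 33%, because the disease is rare.
print(round(posterior(0.005, 0.98, 0.01), 3))
```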

Ex. 26 Applying Bayes rule. An HIV test has the following two modes of failure: when a patient has the disease, the test will still be negative 2% of the time (false negative); when a patient does not have the disease, the test will turn positive 1% of the time (false positive). Suppose now 25% of the population has HIV. Question: given that the test is positive, what is the updated (posterior) probability that the patient is indeed affected by HIV? P(H|E) = P(H) P(E|H) / (P(H) P(E|H) + P(H^c) P(E|H^c)), where H, H^c is a partition of the space into hypotheses and E is an observation. A. ~34% B. ~55% C. ~93% D. ~97%

Ex. 26 Applying Bayes rule. An HIV test has the following two modes of failure: when a patient has the disease, the test will still be negative 2% of the time (false negative); when a patient does not have the disease, the test will turn positive 1% of the time (false positive). Questions: (a) If 25% of the population has HIV, what % of the tests will turn positive? (b) Given that the test is positive, what is the updated (posterior) probability that the patient is indeed affected by HIV? P(H|E) = P(H) P(E|H) / (P(H) P(E|H) + P(H^c) P(E|H^c)), where H, H^c is a partition of the space into hypotheses and E is an observation. A. ~34% B. ~55% C. ~93% D. ~97%
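
Plugging the Ex. 26 numbers into the formula on the slide gives both answers (a quick Python check; variable names are mine):

```python
# Ex. 26 numbers: prevalence 25%, P(E|H) = 0.98, P(E|H^c) = 0.01.
p_H, p_E_H, p_E_Hc = 0.25, 0.98, 0.01

# (a) P(T+) by the law of total probability.
p_E = p_H * p_E_H + (1 - p_H) * p_E_Hc
# (b) posterior P(H|T+) by Bayes rule.
p_H_E = p_H * p_E_H / p_E

print(round(p_E, 4), round(p_H_E, 2))  # 0.2525 0.97 -> answer D (~97%)
```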

Law of total probability. If A is any event of interest (for example, A = T+ in the previous example), then we can decompose it as: P(A) = P(H ∩ A) + P(H^c ∩ A). This is true for any events H, H^c that partition S. Extended version? The feature of H, H^c we used is that they form a partition of S into 2 blocks. The law of total probability still works with partitions of more than 2 blocks.

Prop. 7 (v2) Law of total probability. If A is any event of interest, and H1, ..., Hn is any partition of S, then we can decompose A as: P(A) = P(H1 ∩ A) + ... + P(Hn ∩ A) = Σ_{i=1}^{n} P(Hi ∩ A).

Prop. 8 (v2) Bayes rule (extended version). Notation: H1, ..., Hn: a partition of the space into hypotheses; E: an observation. We want: P(Hi|E). We have: P(E|Hi), P(Hi). P(Hi|E) = P(Hi) P(E|Hi) / (P(H1) P(E|H1) + ... + P(Hn) P(E|Hn)).
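
The extended rule is a direct generalization of the two-hypothesis code: normalize prior × likelihood over the whole partition. A Python sketch (the function and the three-hypothesis numbers are my own illustration):

```python
def bayes(priors, likelihoods):
    """Extended Bayes rule: return the posteriors P(Hi|E)
    given P(Hi) and P(E|Hi) for a partition H1, ..., Hn."""
    joint = [p * l for p, l in zip(priors, likelihoods)]  # P(Hi) P(E|Hi)
    evidence = sum(joint)  # law of total probability: P(E)
    return [j / evidence for j in joint]

# Hypothetical three-block partition:
post = bayes([0.2, 0.3, 0.5], [0.9, 0.5, 0.1])
print([round(p, 3) for p in post])  # the posteriors sum to 1
```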

Prop. 6 (v2) Chain rule. For any events A, B with P(A) > 0: P(A, B) = P(A) P(B|A). Extended version: for any events A, B, C with P(A, B) > 0: P(A, B, C) = P(A) P(B|A) P(C|A, B).
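
The extended chain rule can be illustrated with a hypothetical example not from the slides: drawing three aces in a row from a standard 52-card deck, where each factor conditions on the previous draws.

```python
from fractions import Fraction

# P(A1, A2, A3) = P(A1) * P(A2|A1) * P(A3|A1, A2):
# after each ace drawn, one fewer ace and one fewer card remain.
p = Fraction(4, 52) * Fraction(3, 51) * Fraction(2, 50)
print(p)  # 1/5525
```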