Conditional Distributions and the Bivariate Normal Distribution. James H. Steiger


Overview
In this module, we have several goals. The first is to introduce several technical terms: bivariate frequency distribution, marginal distribution, and conditional distribution. The second is, in connection with the bivariate normal distribution, to discuss conditional distribution calculations and regression toward the mean.

Bivariate Frequency Distributions
So far, we have discussed univariate frequency distributions, which table or plot a set of values and their frequencies.
X  4  3  1
f  5  4  6

Bivariate Frequency Distributions
We need two dimensions to table or plot a univariate (1-variable) frequency distribution: one dimension for the value and one dimension for its frequency.
[Figure: plot of a univariate frequency distribution, with Value on the horizontal axis and Frequency on the vertical axis.]

Bivariate Frequency Distributions
A bivariate frequency distribution presents, in a table or plot, pairs of values on two variables together with their frequencies.

Bivariate Frequency Distributions
For example, suppose you throw two coins, X and Y, simultaneously and record the outcome as an ordered pair of values. Imagine that you threw the coins 8 times and observed the following (1 = Head, 0 = Tail):
(X,Y)   f
(1,1)   2
(1,0)   2
(0,1)   2
(0,0)   2

Bivariate Frequency Distributions
To graph the bivariate distribution, you need a 3-dimensional plot, although this can be drawn in perspective in 2 dimensions.

Bivariate Frequency Distributions
[Figure: bivariate histogram.]

Marginal Distributions
The bivariate distribution of X and Y shows how they behave together. We may also be interested in how X and Y behave separately. We can obtain this information for either X or Y by collapsing (summing) over the other variable. For example:

[Figure: marginal distribution of Y, with Value on the horizontal axis and Frequency on the vertical axis.]

Conditional Distributions
The conditional distribution of Y given X = a is the distribution of Y for only those occasions when X takes on the value a. Example: the conditional distribution of Y given X = 1 is obtained by extracting from the bivariate distribution only those pairs of scores where X = 1, then tabulating the frequency distribution of Y on those occasions.

Conditional Distributions
The conditional distribution of Y given X = 1 is:
Y | X = 1   f
1           2
0           2

Conditional Distributions
While marginal distributions are obtained from the bivariate distribution by summing, conditional distributions are obtained by making a cut through the bivariate distribution.

Marginal, Conditional, and Bivariate Relative Frequencies
The notion of relative frequency generalizes easily to bivariate, marginal, and conditional probability distributions. In all cases, the frequencies are rescaled by dividing by the total number of observations in the current distribution table.
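The three operations described so far (summing for a marginal, cutting for a conditional, rescaling for relative frequencies) can be sketched in a few lines of Python. This is only an illustration: the frequency table and the function names below are hypothetical and are not taken from the slides.

```python
from collections import Counter

# Hypothetical bivariate frequency table: (x, y) -> frequency.
# These counts are made up for illustration only.
bivariate = {(1, 1): 3, (1, 0): 1, (0, 1): 2, (0, 0): 2}

def marginal(table, axis):
    """Marginal frequencies: sum over the other variable (axis 0 = X, 1 = Y)."""
    out = Counter()
    for pair, f in table.items():
        out[pair[axis]] += f
    return dict(out)

def conditional_y_given_x(table, a):
    """Conditional frequencies of Y: keep only the pairs where X == a."""
    return {y: f for (x, y), f in table.items() if x == a}

def relative(freqs):
    """Relative frequencies: divide by the total count of this table."""
    total = sum(freqs.values())
    return {k: f / total for k, f in freqs.items()}

marginal_x = marginal(bivariate, 0)           # frequencies of X alone
marginal_y = marginal(bivariate, 1)           # frequencies of Y alone
cond_y = conditional_y_given_x(bivariate, 1)  # Y on occasions when X = 1
rel_cond = relative(cond_y)                   # rescaled by the count with X = 1
```

Note that the conditional relative frequencies are divided by the number of observations with X = 1, not by the overall total, which is what "the current distribution table" means in the text above.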

Bivariate Continuous Probability Distributions
With continuous distributions, we plot probability density. In this case, the resulting plot looks like a mountainous terrain, as probability density is registered on a third axis. The most famous bivariate continuous probability distribution is the bivariate normal. Glass and Hopkins discuss the properties of this distribution in some detail.

Bivariate Continuous Probability Distributions
Characteristics of the Bivariate Normal Distribution
Marginal distributions are normal. Conditional distributions are normal, with constant variance for any conditional value. Let b and c be the slope and intercept of the linear regression line for predicting Y from X. Then
$$\mu_{Y \mid X=a} = ba + c$$
$$\sigma^2_{Y \mid X=a} = (1 - \rho^2_{YX})\,\sigma^2_Y = \sigma^2_E$$

The Bivariate Normal Distribution
[Figure: surface plot of the bivariate normal density.]

Computing Conditional Distributions
We simply use the formulas
$$\mu_{Y \mid X=a} = ba + c$$
$$\sigma^2_{Y \mid X=a} = (1 - \rho^2_{YX})\,\sigma^2_Y = \sigma^2_E$$
We have two choices: process the entire problem in the original metric, or process it in Z-score form and then convert to the original metric.
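As a minimal sketch of the two routes, the following Python functions compute the conditional mean and standard deviation of Y given X = a, once in the original metric and once via Z scores. The function names, argument order, and the numbers in the usage lines are hypothetical choices made for illustration; only the formulas come from the slides.

```python
import math

def conditional_normal(a, mu_x, sd_x, mu_y, sd_y, rho):
    """Original-metric route: regression line plus residual standard deviation."""
    b = rho * sd_y / sd_x                  # slope for predicting Y from X
    c = mu_y - b * mu_x                    # intercept
    mean = b * a + c                       # conditional mean of Y given X = a
    sd = math.sqrt(1 - rho ** 2) * sd_y    # conditional standard deviation
    return mean, sd

def conditional_normal_z(a, mu_x, sd_x, mu_y, sd_y, rho):
    """Z-score route: work in standard units, then convert back."""
    z_x = (a - mu_x) / sd_x                # standardize the conditioning value
    mean_z = rho * z_x                     # conditional mean in Z-score units
    sd_z = math.sqrt(1 - rho ** 2)         # conditional sd in Z-score units
    return mu_y + mean_z * sd_y, sd_z * sd_y

# Hypothetical parameter values, chosen only to show that both routes agree.
mean_raw, sd_raw = conditional_normal(70, 60, 10, 50, 20, 0.5)
mean_z, sd_z = conditional_normal_z(70, 60, 10, 50, 20, 0.5)
```

Both routes are algebraically identical, so the choice between them is a matter of convenience.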

Conditional Distribution Problems
The distribution of IQ for women (X) and their daughters (Y) is bivariate normal, with the following characteristics: both X and Y have means of 100 and standard deviations of 15, and the correlation between X and Y is 0.60.

Conditional Distribution Problems
First we will process in the original metric. Here are some standard questions. What are the linear regression coefficients for predicting Y from X? They are
$$b = \rho_{YX}\,\frac{\sigma_Y}{\sigma_X} = 0.60$$
$$c = \mu_Y - b\,\mu_X = 100 - (0.6)(100) = 40$$

Conditional Distribution Problems
What is the distribution of IQ scores for women whose mothers had an IQ of 145? The mean follows the linear regression rule:
$$\mu_{Y \mid X=145} = 0.6(145) + 40 = 127$$
The standard deviation is
$$\sigma_{Y \mid X=145} = \sqrt{1 - \rho^2_{YX}}\;\sigma_Y = \sqrt{1 - 0.36}\,(15) = (0.8)(15) = 12$$
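The slides work this problem only in the original metric. As a check, the Z-score route mentioned earlier can be worked out as follows; the notation $z_X$ and $z_Y$ for the standardized scores is introduced here for illustration and is not taken from the slides.

$$z_X = \frac{145 - 100}{15} = 3$$
$$\mu_{z_Y \mid z_X = 3} = \rho_{YX}\, z_X = (0.60)(3) = 1.8, \qquad \sigma_{z_Y \mid z_X = 3} = \sqrt{1 - 0.36} = 0.8$$
$$\mu_{Y \mid X=145} = 100 + (1.8)(15) = 127, \qquad \sigma_{Y \mid X=145} = (0.8)(15) = 12$$

Both routes give a conditional mean of 127 and a conditional standard deviation of 12.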

Conditional Distribution Problems
What percentage of daughters of mothers with IQ scores of 145 will have an IQ at least as high as their mother?

Conditional Distribution Problems
We simply compute the probability of obtaining a score of 145 or higher in a normal distribution with a mean of 127 and a standard deviation of 12. We have:
$$Z_{145} = \frac{145 - 127}{12} = 1.5$$
The area above 1.5 in the standard normal curve is 6.68%.
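The tail area can be checked numerically. The sketch below uses `NormalDist` from Python's standard `statistics` module (available from Python 3.8) with the conditional mean of 127 and standard deviation of 12 for daughters of mothers with an IQ of 145; the variable names are illustrative.

```python
from statistics import NormalDist

# Conditional distribution of daughters' IQ given a mother's IQ of 145.
cond = NormalDist(mu=127, sigma=12)

# Z score of 145 within the conditional distribution.
z = (145 - cond.mean) / cond.stdev

# Upper-tail probability: proportion of daughters scoring 145 or higher.
p_at_least_145 = 1 - cond.cdf(145)

print(f"Z = {z:.2f}, P(Y >= 145 | X = 145) = {p_at_least_145:.4f}")
```

The printed probability should agree with the table value of about 6.68% to the precision shown.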

Conditional Distribution Problems
What are the social implications of this result?