Plotting multivariate data

Size: px
Start display at page:

Download "Plotting multivariate data"

Transcription

1 Plotting multivariate data Statistics 7 ISU Multivariate data and graphics With multivariate data we want to understand the associations between multiple s, which might be considered to be understanding the shape of the data in high-dimensional space. Graphics are used to explore the data, and also to diagnose models. Typically start with plots of 1, then, then more s.

2 Count Count 3 1 One - categorical Eg Examining the number in each level of one of the label s, prior to doing classification or MANOVA Barchart Variable 3 Variable What do you think the counts are for each of the categories? Naomi Robbins () Creating More Effective Graphs Wiley Counts are,, 6, 8 PLEASE DON T USE 3D UNNECESSARILY One - real Histogram: bin the s, and represent counts as bars. Use different bin sizes. Density: Like a smoothed histogram, estimate the density, represent as a curve. Dotplot: 1D scatterplot, s represented by dots. Boxplot: Representation of the 5 number summary, min/max, Q1/Q3, median. Good for comparing distrbutions of multiple categories.

3 One - real 7 count What do you learn about tips? tip 35 count What else do you learn about tips? tip One real + one categorical tars How do the species differ in the tars1 s? Concinna Heikert. Heptapot. species tars Concinna Heikert. Heptapot. species

4 One real + one categorical LAve How does the type of music differ? Classical Type Rock LAve Do you learn anything different from the dotplot than the boxplot? Classical Type Rock Plotting Means 5 mean(tars1) mean(tars1) Means Concinna Heikert. Heptapot. species What s wrong with this plot? are points estimates, use points to represent them. What do we learn about differences between species? Concinna Heikert. Heptapot. species

5 Two s - real Scatterplot How is tip related to bill? How would you expect tip to be related to bill? tip total_bill Does aspect ratio matter? Which is the response and which is the explanatory? Two s - real Scatterplot How is aede1 related to tars1? aede tars1 Does aspect ratio matter? Which is the response and which is the explanatory? For scatterplots of multivariate data use a SQUARE aspect ratio!

6 Two real + one categorical Scatterplot 15 How is aede1 related to tars1, by species? aede species Concinna Heikert. Heptapot tars1 Two real + one categorical linoleic Scatterplot oleic Area Calabria Coastal Sardinia East Liguria Inland Sardinia North Apulia Sicily South Apulia Umbria West Liguria How is linoleic acid related to oleic acid, by area? How many colors is too many? Generally only use 3- colors in a plot, only 3- categories.

7 Two s - real D density How is aede1 related to tars1? aede aede tars tars1 Two real + one categorical Scatterplot + contour, facetted by species Concinna Heikert. Heptapot. 15 aede tars How is aede1 related to tars1, by species?

8 Two s - real D histogram, scatterplot + linear model 1 How is tip related to bill? 8 1 count tip tip total_bill total_bill Two real + one categorical Scatterplot + linear model, facetted by sex of the bill payer, and smoking status. 1 Female Male tip No Yes How is tip related to total bill, by sex and smoker? total_bill

9 BEYOND D... Scatterplot matrix Correlation matrix. How do correlations differ by type of chocolate?

10 y Calories Scatterplot matrix TotFat Chol Na Fiber Sugars x Calories TotFat Chol Na Fiber Sugars Red=milk Black=dark All s plotted pairwise, like correlation matrix. What do you learn? Scatterplot matrix V1 density V1 vs V V vs V1 V3 vs V1 V density V1 vs V3 V vs V3 V3 vs V V3 density GENERAL Pairwise dependence only. A cross-check on the correlation matrix, whether correlation is a good summary or NOT.

11 Parallel coordinates V V1 Orthogonal V V1 V1 V Parallel V1 V GENERAL Orthogonal axes become parallel. Can lay out many s on a page. Need to learn a new set of patterns of structure. Re-ordering of axes is important. Parallel coordinates Calories TotFat Chol Na Fiber Sugars Calories TotFat Fiber Chol Na Sugars Red=milk Black=dark Re-ordered so that s with high s for dark chocolate are first. What s the difference between milk and dark chocolate?

12 Tennis data 1 Australian Open tennis tournament statistics for women % 1st serves in, Aces, Double faults, Unforced errors, Win on 1st serve %, Win on nd serve %, Winners, Receiving points won, Break point conversions, Fastest serve speed, Return games won, Server points won (1 s) Scatterplot matrix 7 6 Pct.1st.S Corr: -.3 Aces Corr: Corr:.51 Corr: Corr:.888 Corr: -.1 Corr:.8 Weak associations Outliers 8 DF Corr:.75 UE 6 Corr:.17 Corr:.37 Can t plot many s 8 6 W.1st.S 6 8

13 Parallel coordinates 6 - Pct.1st.S Aces DF UE W.1st.S W.nd.S Winners RPW BPC FSS RGW SPW Plot a lot more s Some outliers Some weak associations can be seen, both positive and negative 6 - Pct.1st.S Aces DF UE W.1st.S W.nd.S Winners RPW BPC FSS RGW SPW Scale matters: mean/sd, /1, individual/ global Enables correlation to be seen better, and outliers

14 Pct.1st.S Aces DF UE W.1st.S W.nd.S Winners RPW BPC FSS RGW SPW Scale matters: mean/sd, /1, individual/ global Emphasizes the univariate distributions Pct.1st.S Aces DF UE W.1st.S W.nd.S Winners RPW BPC FSS RGW SPW Scale matters: mean/sd, /1, individual/ global

15 6 - Pct.1st.S Aces DF UE W.1st.S W.nd.S Winners RPW BPC FSS RGW SPW Order matters: place s that are highly correlated close to each other 6 - Pct.1st.S RPW BPC RGW Aces W.1st.S FSS DF UE W.nd.S Winners SPW Order matters: place s that are highly correlated close to each other Less line crossing, easier to digest positive correlation, and then negative correlation

16 Normal data - X1 X X3 X X5 X6 Nothing interesting! All a little moderate correlation. Modelling is going to be easy! Clustered data - X1 X X3 X X5 X6 See the criss-crossing, gaps between lines. Will need to extract the clusters before doing any other modeling, otherwise pretty regular data

17 (Less) Clustered data - X1 X X3 X X5 X6 Still see the criss-crossing, gaps between lines, but less prominent. Will need to deal with the multi-modality Outliers in the data X1 X X3 X X5 X6 Small group of observations that are outliers on X1-X3. Need to do something with these cases, remove with justification, or fix

18 Strong association - - X1 X X3 X X5 X6 Very flat lines indicate strong positive association between all s. Strong negative association - X1 X X3 X X5 X6 Crossed lines in first three s indicate X is strongly negatively correlated with other vars.

19 Strong negative association - fixed - X1 X X3 X X5 X6 X is multiplied by -1, then it is positively associated with other s Pct.1st.S Aces DF UE W.1st.S W.nd.S Winners RPW BPC FSS RGW SPW Tennis data again: Aces, DF, UE, W.1st.S, Winners, FSS reversed (multiplied by -1) Mostly now the outliers are visible, but not so much multivariate outliers, mostly only in one or few s

20 Another data set: olive oils - - palmitic palmitoleic stearic oleic linoleic linolenic arachidic eicosenoic Outliers, clustering, some positive and some negative associations, some discreteness Another data set: olive oils - - palmitic palmitoleic stearic oleic linoleic linolenic arachidic eicosenoic Outliers, clustering, some positive and some negative associations, some discreteness

21 Another data set: olive oils - - palmitic palmitoleic stearic oleic linoleic linolenic arachidic eicosenoic Outliers, clustering, some positive and some negative associations, some discreteness Another data set: olive oils - - palmitic palmitoleic stearic oleic linoleic linolenic arachidic eicosenoic Outliers, clustering, some positive and some negative associations, some discreteness

22 Another data set: olive oils - - palmitic palmitoleic stearic oleic linoleic linolenic arachidic eicosenoic Outliers, clustering, some positive and some negative associations, some discreteness Another data set: olive oils - - palmitic palmitoleic stearic oleic linoleic linolenic arachidic eicosenoic Outliers, clustering, some positive and some negative associations, some discreteness

23 Tours Red=milk Grey=dark Are the two clusters different? How? Are there any outliers? And points from one group that seem to belong to the other group? Tours A tour is a movie of low-dimensional projections of high-dimensional space. Let A =[a 1 a ]= a 11 a 1 be a -dimensional projection matrix, where a 1 and a are both of length 1, and orthogonal to each other, then for data matrix X n p XA = a 11 X 11 + a 1 X a p1 X 1p. a 11 X n1 + a 1 X n a p1 X np is a -dimensional projection of the data.. a p1 a p a 1 X 11 + a X a p X 1p. a 1 X n1 + a X n a p X np The s (call them coefficients) in the projection matrix are varied producing the views in the movie. n

24 Visual representation of coefficients (axes) Tours Projection coefficients and scaling A = scaling for each.58/1.15/1.15/39.75/39.185/15.363/15.31/1.63/1.793/8.116/8.11/68./68 Multiply the data matrix by this to get the particular view of the data. Types of Tours Grand tour: The coefficients are randomly varied, giving a random walk over the space of all projections. Guided tour: The coefficients are chosen to maximize some criterion of interestingness, eg clusters, outliers, separation of classes. Manual tour: The user controls the coefficients of one of the s. Good for exploring the importance of a single on the structure visible in the plot.

25 Clusters Tours Non-linear dependence Non-linear dependence + noise s Class differences, outliers, nonlinear dependence Tours Something hidden

26 Icons aede1 head tars tars1 aede aede3 1 Concinna Concinna Each case is plotted separately. Each is mapped a feature of the icon 3 Heptapot. 33 Heptapot. 67 Heikert. 68 Heikert. Icons Example: Monthly ozone for 6 years over central America. What do you see?

27 Multiple linked plots Interactive graphics (direct manipulation graphics) allow the user to brush plot elements and convey this selection to other plots. Te rminology Brushing (one-to-one) Brushing (categorical ) Brushing (one point to one line) Transient brushing Persistent painting Identification Scaling

28 This 1-1 Linking 18 RW species.sex Blue Female Blue Male Orange Female Orange Male OR This? 1 15 FL What do you learn differently? This 1-1 Linking 1 Area Calabria 1 Coastal Sardinia East Liguria linoleic 1 8 Inland Sardinia North Apulia Sicily South Apulia Umbria West Liguria OR This? oleic What do you learn differently?

29 This Categorical Variable Linking Longitudinal measurements on wages with workforce experience: 888 subjects. lnw OR This? exper Example Exploring dark chocolates that look more like milk chocolates.

30 One more DON t 57 Heikert. 73 Heikert. 7 Heikert. 61 Heikert. 7 Heikert. 65 Heikert. 7 Heikert. Heikert. 8 Heikert. 66 Heikert. 63 Heikert. 67 Heikert. 56 Heikert. 53 Heikert. 6 Heikert. 58 Heikert. 6 Heikert. 69 Heikert. 68 Heikert. 5 Heikert. 5 Heikert. 55 Heikert. 5 Heikert. 9 Heikert. 51 Heikert. 6 Heikert. 6 Heikert. 5 Heikert. 59 Heikert. 71 Heikert. 7 Heikert. Concinna 15 Concinna 8 Concinna 1 Concinna 1 Concinna 1 Concinna 7 Concinna 11 Concinna 16 Concinna 17 Concinna 18 Concinna Concinna 9 Concinna 13 Concinna 3 Concinna 19 Concinna 38 Heptapot. 33 Heptapot. 37 Heptapot. 35 Heptapot. Heptapot. Heptapot. 6 Heptapot. 3 Heptapot. 36 Heptapot. Heptapot. Heptapot. 1 Heptapot. 1 Concinna Concinna 5 Concinna 6 Concinna 1 Concinna 3 Heptapot. 8 Heptapot. 3 Heptapot. 9 Heptapot. 3 Heptapot. 3 Heptapot. 31 Heptapot. 39 Heptapot. 7 Heptapot. 5 Heptapot. Heatmaps: each cell of the data matrix is colored according to. Rows and columns sorted to get most similar together. How many clusters? Describe the clusters? aede head tars1 aede3 tars aede1 What s wrong with this plot? This work is licensed under the Creative Commons Attribution-Noncommercial 3. United States License. To view a copy of this license, visit creativecommons.org/licenses/by-nc/3./us/ or send a letter to Creative Commons, 171 Second Street, Suite 3, San Francisco, California, 915, USA.

CHAPTER 3 Describing Relationships

CHAPTER 3 Describing Relationships CHAPTER 3 Describing Relationships 3.1 Scatterplots and Correlation The Practice of Statistics, 5th Edition Starnes, Tabor, Yates, Moore Bedford Freeman Worth Publishers Reading Quiz 3.1 True/False 1.

More information

Graphical Methods for High-dimensional Data

Graphical Methods for High-dimensional Data Chapter 1 Graphical Methods for High-dimensional Data 1.1 Objectives This chapter explains interactive and dynamic graphical methods for exploring data, and for diagnosing models. the use of linked brushing,

More information

Understandable Statistics

Understandable Statistics Understandable Statistics correlated to the Advanced Placement Program Course Description for Statistics Prepared for Alabama CC2 6/2003 2003 Understandable Statistics 2003 correlated to the Advanced Placement

More information

How to interpret scientific & statistical graphs

How to interpret scientific & statistical graphs How to interpret scientific & statistical graphs Theresa A Scott, MS Department of Biostatistics theresa.scott@vanderbilt.edu http://biostat.mc.vanderbilt.edu/theresascott 1 A brief introduction Graphics:

More information

Further Mathematics 2018 CORE: Data analysis Chapter 3 Investigating associations between two variables

Further Mathematics 2018 CORE: Data analysis Chapter 3 Investigating associations between two variables Chapter 3: Investigating associations between two variables Further Mathematics 2018 CORE: Data analysis Chapter 3 Investigating associations between two variables Extract from Study Design Key knowledge

More information

Unit 1 Exploring and Understanding Data

Unit 1 Exploring and Understanding Data Unit 1 Exploring and Understanding Data Area Principle Bar Chart Boxplot Conditional Distribution Dotplot Empirical Rule Five Number Summary Frequency Distribution Frequency Polygon Histogram Interquartile

More information

Lab 4 (M13) Objective: This lab will give you more practice exploring the shape of data, and in particular in breaking the data into two groups.

Lab 4 (M13) Objective: This lab will give you more practice exploring the shape of data, and in particular in breaking the data into two groups. Lab 4 (M13) Objective: This lab will give you more practice exploring the shape of data, and in particular in breaking the data into two groups. Activity 1 Examining Data From Class Background Download

More information

STAT 503X Case Study 1: Restaurant Tipping

STAT 503X Case Study 1: Restaurant Tipping STAT 503X Case Study 1: Restaurant Tipping 1 Description Food server s tips in restaurants may be influenced by many factors including the nature of the restaurant, size of the party, table locations in

More information

bivariate analysis: The statistical analysis of the relationship between two variables.

bivariate analysis: The statistical analysis of the relationship between two variables. bivariate analysis: The statistical analysis of the relationship between two variables. cell frequency: The number of cases in a cell of a cross-tabulation (contingency table). chi-square (χ 2 ) test for

More information

AP Stats Review for Midterm

AP Stats Review for Midterm AP Stats Review for Midterm NAME: Format: 10% of final grade. There will be 20 multiple-choice questions and 3 free response questions. The multiple-choice questions will be worth 2 points each and the

More information

CCM6+7+ Unit 12 Data Collection and Analysis

CCM6+7+ Unit 12 Data Collection and Analysis Page 1 CCM6+7+ Unit 12 Packet: Statistics and Data Analysis CCM6+7+ Unit 12 Data Collection and Analysis Big Ideas Page(s) What is data/statistics? 2-4 Measures of Reliability and Variability: Sampling,

More information

CS Information Visualization Sep. 10, 2012 John Stasko. General representation techniques for multivariate (>3) variables per data case

CS Information Visualization Sep. 10, 2012 John Stasko. General representation techniques for multivariate (>3) variables per data case Topic Notes Multivariate Visual Representations 1 CS 7450 - Information Visualization Sep. 10, 2012 John Stasko Agenda General representation techniques for multivariate (>3) variables per data case But

More information

From Biostatistics Using JMP: A Practical Guide. Full book available for purchase here. Chapter 1: Introduction... 1

From Biostatistics Using JMP: A Practical Guide. Full book available for purchase here. Chapter 1: Introduction... 1 From Biostatistics Using JMP: A Practical Guide. Full book available for purchase here. Contents Dedication... iii Acknowledgments... xi About This Book... xiii About the Author... xvii Chapter 1: Introduction...

More information

(a) 50% of the shows have a rating greater than: impossible to tell

(a) 50% of the shows have a rating greater than: impossible to tell KEY 1. Here is a histogram of the Distribution of grades on a quiz. How many students took the quiz? 15 What percentage of students scored below a 60 on the quiz? (Assume left-hand endpoints are included

More information

STAT 201 Chapter 3. Association and Regression

STAT 201 Chapter 3. Association and Regression STAT 201 Chapter 3 Association and Regression 1 Association of Variables Two Categorical Variables Response Variable (dependent variable): the outcome variable whose variation is being studied Explanatory

More information

Graphical Exploration of Statistical Interactions. Nick Jackson University of Southern California Department of Psychology 10/25/2013

Graphical Exploration of Statistical Interactions. Nick Jackson University of Southern California Department of Psychology 10/25/2013 Graphical Exploration of Statistical Interactions Nick Jackson University of Southern California Department of Psychology 10/25/2013 1 Overview What is Interaction? 2-Way Interactions Categorical X Categorical

More information

Chapter 3: Examining Relationships

Chapter 3: Examining Relationships Name Date Per Key Vocabulary: response variable explanatory variable independent variable dependent variable scatterplot positive association negative association linear correlation r-value regression

More information

Unit 7 Comparisons and Relationships

Unit 7 Comparisons and Relationships Unit 7 Comparisons and Relationships Objectives: To understand the distinction between making a comparison and describing a relationship To select appropriate graphical displays for making comparisons

More information

Quantitative Methods in Computing Education Research (A brief overview tips and techniques)

Quantitative Methods in Computing Education Research (A brief overview tips and techniques) Quantitative Methods in Computing Education Research (A brief overview tips and techniques) Dr Judy Sheard Senior Lecturer Co-Director, Computing Education Research Group Monash University judy.sheard@monash.edu

More information

Statistics and Probability

Statistics and Probability Statistics and a single count or measurement variable. S.ID.1: Represent data with plots on the real number line (dot plots, histograms, and box plots). S.ID.2: Use statistics appropriate to the shape

More information

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Statistics Final Review Semeter I Name MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Provide an appropriate response. 1) The Centers for Disease

More information

UF#Stats#Club#STA#2023#Exam#1#Review#Packet# #Fall#2013#

UF#Stats#Club#STA#2023#Exam#1#Review#Packet# #Fall#2013# UF#Stats#Club#STA##Exam##Review#Packet# #Fall## The following data consists of the scores the Gators basketball team scored during the 8 games played in the - season. 84 74 66 58 79 8 7 64 8 6 78 79 77

More information

Measuring the User Experience

Measuring the User Experience Measuring the User Experience Collecting, Analyzing, and Presenting Usability Metrics Chapter 2 Background Tom Tullis and Bill Albert Morgan Kaufmann, 2008 ISBN 978-0123735584 Introduction Purpose Provide

More information

Chapter 4. Navigating. Analysis. Data. through. Exploring Bivariate Data. Navigations Series. Grades 6 8. Important Mathematical Ideas.

Chapter 4. Navigating. Analysis. Data. through. Exploring Bivariate Data. Navigations Series. Grades 6 8. Important Mathematical Ideas. Navigations Series Navigating through Analysis Data Grades 6 8 Chapter 4 Exploring Bivariate Data Important Mathematical Ideas Copyright 2009 by the National Council of Teachers of Mathematics, Inc. www.nctm.org.

More information

Chapter 1: Exploring Data

Chapter 1: Exploring Data Chapter 1: Exploring Data Key Vocabulary:! individual! variable! frequency table! relative frequency table! distribution! pie chart! bar graph! two-way table! marginal distributions! conditional distributions!

More information

CHAPTER ONE CORRELATION

CHAPTER ONE CORRELATION CHAPTER ONE CORRELATION 1.0 Introduction The first chapter focuses on the nature of statistical data of correlation. The aim of the series of exercises is to ensure the students are able to use SPSS to

More information

(a) 50% of the shows have a rating greater than: impossible to tell

(a) 50% of the shows have a rating greater than: impossible to tell q 1. Here is a histogram of the Distribution of grades on a quiz. How many students took the quiz? What percentage of students scored below a 60 on the quiz? (Assume left-hand endpoints are included in

More information

List of Figures. List of Tables. Preface to the Second Edition. Preface to the First Edition

List of Figures. List of Tables. Preface to the Second Edition. Preface to the First Edition List of Figures List of Tables Preface to the Second Edition Preface to the First Edition xv xxv xxix xxxi 1 What Is R? 1 1.1 Introduction to R................................ 1 1.2 Downloading and Installing

More information

IAT 355 Visual Analytics. Encoding Information: Design. Lyn Bartram

IAT 355 Visual Analytics. Encoding Information: Design. Lyn Bartram IAT 355 Visual Analytics Encoding Information: Design Lyn Bartram 4 stages of visualization design 2 Recall: Data Abstraction Tables Data item (row) with attributes (columns) : row=key, cells = values

More information

AP STATISTICS 2010 SCORING GUIDELINES

AP STATISTICS 2010 SCORING GUIDELINES AP STATISTICS 2010 SCORING GUIDELINES Question 1 Intent of Question The primary goals of this question were to assess students ability to (1) apply terminology related to designing experiments; (2) construct

More information

AP Statistics. Semester One Review Part 1 Chapters 1-5

AP Statistics. Semester One Review Part 1 Chapters 1-5 AP Statistics Semester One Review Part 1 Chapters 1-5 AP Statistics Topics Describing Data Producing Data Probability Statistical Inference Describing Data Ch 1: Describing Data: Graphically and Numerically

More information

MA 151: Using Minitab to Visualize and Explore Data The Low Fat vs. Low Carb Debate

MA 151: Using Minitab to Visualize and Explore Data The Low Fat vs. Low Carb Debate MA 151: Using Minitab to Visualize and Explore Data The Low Fat vs. Low Carb Debate September 5, 2018 1 Introduction to the Data We will be working with a synthetic data set meant to resemble the weight

More information

Stepwise method Modern Model Selection Methods Quantile-Quantile plot and tests for normality

Stepwise method Modern Model Selection Methods Quantile-Quantile plot and tests for normality Week 9 Hour 3 Stepwise method Modern Model Selection Methods Quantile-Quantile plot and tests for normality Stat 302 Notes. Week 9, Hour 3, Page 1 / 39 Stepwise Now that we've introduced interactions,

More information

MATH 2560 C F03 Elementary Statistics I LECTURE 6: Scatterplots (Continuation).

MATH 2560 C F03 Elementary Statistics I LECTURE 6: Scatterplots (Continuation). MATH 2560 C F03 Elementary Statistics I LECTURE 6: Scatterplots (Continuation). 1 Outline. adding categorical variables to scatterplots; more examples of scatterplots; categorical explanatory variables;

More information

Making charts in Excel

Making charts in Excel Making charts in Excel Use Excel file MakingChartsInExcel_data We ll start with the worksheet called treatment This shows the number of admissions (not necessarily people) to Twin Cities treatment programs

More information

Section 6: Analysing Relationships Between Variables

Section 6: Analysing Relationships Between Variables 6. 1 Analysing Relationships Between Variables Section 6: Analysing Relationships Between Variables Choosing a Technique The Crosstabs Procedure The Chi Square Test The Means Procedure The Correlations

More information

Nature Neuroscience: doi: /nn Supplementary Figure 1. Behavioral training.

Nature Neuroscience: doi: /nn Supplementary Figure 1. Behavioral training. Supplementary Figure 1 Behavioral training. a, Mazes used for behavioral training. Asterisks indicate reward location. Only some example mazes are shown (for example, right choice and not left choice maze

More information

CP Statistics Sem 1 Final Exam Review

CP Statistics Sem 1 Final Exam Review Name: _ Period: ID: A CP Statistics Sem 1 Final Exam Review Multiple Choice Identify the choice that best completes the statement or answers the question. 1. A particularly common question in the study

More information

STATISTICS & PROBABILITY

STATISTICS & PROBABILITY STATISTICS & PROBABILITY LAWRENCE HIGH SCHOOL STATISTICS & PROBABILITY CURRICULUM MAP 2015-2016 Quarter 1 Unit 1 Collecting Data and Drawing Conclusions Unit 2 Summarizing Data Quarter 2 Unit 3 Randomness

More information

10. LINEAR REGRESSION AND CORRELATION

10. LINEAR REGRESSION AND CORRELATION 1 10. LINEAR REGRESSION AND CORRELATION The contingency table describes an association between two nominal (categorical) variables (e.g., use of supplemental oxygen and mountaineer survival ). We have

More information

STATISTICS 8 CHAPTERS 1 TO 6, SAMPLE MULTIPLE CHOICE QUESTIONS

STATISTICS 8 CHAPTERS 1 TO 6, SAMPLE MULTIPLE CHOICE QUESTIONS STATISTICS 8 CHAPTERS 1 TO 6, SAMPLE MULTIPLE CHOICE QUESTIONS Circle the best answer. This scenario applies to Questions 1 and 2: A study was done to compare the lung capacity of coal miners to the lung

More information

Math 075 Activities and Worksheets Book 2:

Math 075 Activities and Worksheets Book 2: Math 075 Activities and Worksheets Book 2: Linear Regression Name: 1 Scatterplots Intro to Correlation Represent two numerical variables on a scatterplot and informally describe how the data points are

More information

Lesson 9 Presentation and Display of Quantitative Data

Lesson 9 Presentation and Display of Quantitative Data Lesson 9 Presentation and Display of Quantitative Data Learning Objectives All students will identify and present data using appropriate graphs, charts and tables. All students should be able to justify

More information

Applications. DSC 410/510 Multivariate Statistical Methods. Discriminating Two Groups. What is Discriminant Analysis

Applications. DSC 410/510 Multivariate Statistical Methods. Discriminating Two Groups. What is Discriminant Analysis DSC 4/5 Multivariate Statistical Methods Applications DSC 4/5 Multivariate Statistical Methods Discriminant Analysis Identify the group to which an object or case (e.g. person, firm, product) belongs:

More information

Outline. Practice. Confounding Variables. Discuss. Observational Studies vs Experiments. Observational Studies vs Experiments

Outline. Practice. Confounding Variables. Discuss. Observational Studies vs Experiments. Observational Studies vs Experiments 1 2 Outline Finish sampling slides from Tuesday. Study design what do you do with the subjects/units once you select them? (OI Sections 1.4-1.5) Observational studies vs. experiments Descriptive statistics

More information

M 140 Test 1 A Name SHOW YOUR WORK FOR FULL CREDIT! Problem Max. Points Your Points Total 60

M 140 Test 1 A Name SHOW YOUR WORK FOR FULL CREDIT! Problem Max. Points Your Points Total 60 M 140 Test 1 A Name SHOW YOUR WORK FOR FULL CREDIT! Problem Max. Points Your Points 1-10 10 11 3 12 4 13 3 14 10 15 14 16 10 17 7 18 4 19 4 Total 60 Multiple choice questions (1 point each) For questions

More information

Introduction to Statistical Data Analysis I

Introduction to Statistical Data Analysis I Introduction to Statistical Data Analysis I JULY 2011 Afsaneh Yazdani Preface What is Statistics? Preface What is Statistics? Science of: designing studies or experiments, collecting data Summarizing/modeling/analyzing

More information

Reveal Relationships in Categorical Data

Reveal Relationships in Categorical Data SPSS Categories 15.0 Specifications Reveal Relationships in Categorical Data Unleash the full potential of your data through perceptual mapping, optimal scaling, preference scaling, and dimension reduction

More information

Rachael E. Jack, Caroline Blais, Christoph Scheepers, Philippe G. Schyns, and Roberto Caldara

Rachael E. Jack, Caroline Blais, Christoph Scheepers, Philippe G. Schyns, and Roberto Caldara Current Biology, Volume 19 Supplemental Data Cultural Confusions Show that Facial Expressions Are Not Universal Rachael E. Jack, Caroline Blais, Christoph Scheepers, Philippe G. Schyns, and Roberto Caldara

More information

8.SP.1 Hand span and height

8.SP.1 Hand span and height 8.SP.1 Hand span and height Task Do taller people tend to have bigger hands? To investigate this question, each student in your class should measure his or her hand span (in cm) and height (in inches).

More information

Statistical Techniques. Masoud Mansoury and Anas Abulfaraj

Statistical Techniques. Masoud Mansoury and Anas Abulfaraj Statistical Techniques Masoud Mansoury and Anas Abulfaraj What is Statistics? https://www.youtube.com/watch?v=lmmzj7599pw The definition of Statistics The practice or science of collecting and analyzing

More information

You can use this app to build a causal Bayesian network and experiment with inferences. We hope you ll find it interesting and helpful.

You can use this app to build a causal Bayesian network and experiment with inferences. We hope you ll find it interesting and helpful. icausalbayes USER MANUAL INTRODUCTION You can use this app to build a causal Bayesian network and experiment with inferences. We hope you ll find it interesting and helpful. We expect most of our users

More information

Creative Commons Attribution-NonCommercial-Share Alike License

Creative Commons Attribution-NonCommercial-Share Alike License Author: Brenda Gunderson, Ph.D., 05 License: Unless otherwise noted, this material is made available under the terms of the Creative Commons Attribution- NonCommercial-Share Alike 3.0 Unported License:

More information

Creative Commons Attribution-NonCommercial-Share Alike License

Creative Commons Attribution-NonCommercial-Share Alike License Author: Brenda Gunderson, Ph.D., 2015 License: Unless otherwise noted, this material is made available under the terms of the Creative Commons Attribution- NonCommercial-Share Alike 3.0 Unported License:

More information

7. Bivariate Graphing

7. Bivariate Graphing 1 7. Bivariate Graphing Video Link: https://www.youtube.com/watch?v=shzvkwwyguk&index=7&list=pl2fqhgedk7yyl1w9tgio8w pyftdumgc_j Section 7.1: Converting a Quantitative Explanatory Variable to Categorical

More information

Lesson 1: Distributions and Their Shapes

Lesson 1: Distributions and Their Shapes Lesson 1 Lesson 1: Distributions and Their Shapes Classwork Statistics is all about data Without data to talk about or to analyze or to question, statistics would not exist There is a story to be uncovered

More information

5 To Invest or not to Invest? That is the Question.

5 To Invest or not to Invest? That is the Question. 5 To Invest or not to Invest? That is the Question. Before starting this lab, you should be familiar with these terms: response y (or dependent) and explanatory x (or independent) variables; slope and

More information

Randomized Block Designs 1

Randomized Block Designs 1 Randomized Block Designs 1 STA305 Winter 2014 1 See last slide for copyright information. 1 / 1 Background Reading Optional Photocopy 2 from an old textbook; see course website. It s only four pages. The

More information

WDHS Curriculum Map Probability and Statistics. What is Statistics and how does it relate to you?

WDHS Curriculum Map Probability and Statistics. What is Statistics and how does it relate to you? WDHS Curriculum Map Probability and Statistics Time Interval/ Unit 1: Introduction to Statistics 1.1-1.3 2 weeks S-IC-1: Understand statistics as a process for making inferences about population parameters

More information

M 140 Test 1 A Name (1 point) SHOW YOUR WORK FOR FULL CREDIT! Problem Max. Points Your Points Total 75

M 140 Test 1 A Name (1 point) SHOW YOUR WORK FOR FULL CREDIT! Problem Max. Points Your Points Total 75 M 140 est 1 A Name (1 point) SHOW YOUR WORK FOR FULL CREDI! Problem Max. Points Your Points 1-10 10 11 10 12 3 13 4 14 18 15 8 16 7 17 14 otal 75 Multiple choice questions (1 point each) For questions

More information

Statistics is the science of collecting, organizing, presenting, analyzing, and interpreting data to assist in making effective decisions

Statistics is the science of collecting, organizing, presenting, analyzing, and interpreting data to assist in making effective decisions Readings: OpenStax Textbook - Chapters 1 5 (online) Appendix D & E (online) Plous - Chapters 1, 5, 6, 13 (online) Introductory comments Describe how familiarity with statistical methods can - be associated

More information

Introduction to Machine Learning. Katherine Heller Deep Learning Summer School 2018

Introduction to Machine Learning. Katherine Heller Deep Learning Summer School 2018 Introduction to Machine Learning Katherine Heller Deep Learning Summer School 2018 Outline Kinds of machine learning Linear regression Regularization Bayesian methods Logistic Regression Why we do this

More information

Introduction. Chapter Before you start Formulation

Introduction. Chapter Before you start Formulation Chapter 1 Introduction 1.1 Before you start Statistics starts with a problem, continues with the collection of data, proceeds with the data analysis and finishes with conclusions. It is a common mistake

More information

Part 1. For each of the following questions fill-in the blanks. Each question is worth 2 points.

Part 1. For each of the following questions fill-in the blanks. Each question is worth 2 points. Part 1. For each of the following questions fill-in the blanks. Each question is worth 2 points. 1. The bell-shaped frequency curve is so common that if a population has this shape, the measurements are

More information

isc ove ring i Statistics sing SPSS

isc ove ring i Statistics sing SPSS isc ove ring i Statistics sing SPSS S E C O N D! E D I T I O N (and sex, drugs and rock V roll) A N D Y F I E L D Publications London o Thousand Oaks New Delhi CONTENTS Preface How To Use This Book Acknowledgements

More information

Section 1.2 Displaying Quantitative Data with Graphs. Dotplots

Section 1.2 Displaying Quantitative Data with Graphs. Dotplots Section 1.2 Displaying Quantitative Data with Graphs Dotplots One of the simplest graphs to construct and interpret is a dotplot. Each data value is shown as a dot above its location on a number line.

More information

Measurement and Descriptive Statistics. Katie Rommel-Esham Education 604

Measurement and Descriptive Statistics. Katie Rommel-Esham Education 604 Measurement and Descriptive Statistics Katie Rommel-Esham Education 604 Frequency Distributions Frequency table # grad courses taken f 3 or fewer 5 4-6 3 7-9 2 10 or more 4 Pictorial Representations Frequency

More information

Michigan Nutrition Network Outcomes: Balance caloric intake from food and beverages with caloric expenditure.

Michigan Nutrition Network Outcomes: Balance caloric intake from food and beverages with caloric expenditure. DRAFT 1 Obesity and Heart Disease: Fact or Government Conspiracy? Grade Level: High School Grades 11 12 Subject Area: Mathematics (Statistics) Setting: Classroom and/or Computer Lab Instructional Time:

More information

General recognition theory of categorization: A MATLAB toolbox

General recognition theory of categorization: A MATLAB toolbox ehavior Research Methods 26, 38 (4), 579-583 General recognition theory of categorization: MTL toolbox LEOL. LFONSO-REESE San Diego State University, San Diego, California General recognition theory (GRT)

More information

Identify two variables. Classify them as explanatory or response and quantitative or explanatory.

Identify two variables. Classify them as explanatory or response and quantitative or explanatory. OLI Module 2 - Examining Relationships Objective Summarize and describe the distribution of a categorical variable in context. Generate and interpret several different graphical displays of the distribution

More information

UNIVERSITY OF TORONTO SCARBOROUGH Department of Computer and Mathematical Sciences Midterm Test February 2016

UNIVERSITY OF TORONTO SCARBOROUGH Department of Computer and Mathematical Sciences Midterm Test February 2016 UNIVERSITY OF TORONTO SCARBOROUGH Department of Computer and Mathematical Sciences Midterm Test February 2016 STAB22H3 Statistics I, LEC 01 and LEC 02 Duration: 1 hour and 45 minutes Last Name: First Name:

More information

Organizing Data. Types of Distributions. Uniform distribution All ranges or categories have nearly the same value a.k.a. rectangular distribution

Organizing Data. Types of Distributions. Uniform distribution All ranges or categories have nearly the same value a.k.a. rectangular distribution Organizing Data Frequency How many of the data are in a category or range Just count up how many there are Notation x = number in one category n = total number in sample (all categories combined) Relative

More information

Statistics Assignment 11 - Solutions

Statistics Assignment 11 - Solutions Statistics 44.3 Assignment 11 - Solutions 1. Samples were taken of individuals with each blood type to see if the average white blood cell count differed among types. Eleven individuals in each group were

More information

Undertaking statistical analysis of

Undertaking statistical analysis of Descriptive statistics: Simply telling a story Laura Delaney introduces the principles of descriptive statistical analysis and presents an overview of the various ways in which data can be presented by

More information

Meaning-based guidance of attention in scenes as revealed by meaning maps

Meaning-based guidance of attention in scenes as revealed by meaning maps SUPPLEMENTARY INFORMATION Letters DOI: 1.138/s41562-17-28- In the format provided by the authors and unedited. -based guidance of attention in scenes as revealed by meaning maps John M. Henderson 1,2 *

More information

Choosing a Significance Test. Student Resource Sheet

Choosing a Significance Test. Student Resource Sheet Choosing a Significance Test Student Resource Sheet Choosing Your Test Choosing an appropriate type of significance test is a very important consideration in analyzing data. If an inappropriate test is

More information

SPRING GROVE AREA SCHOOL DISTRICT. Course Description. Instructional Strategies, Learning Practices, Activities, and Experiences.

SPRING GROVE AREA SCHOOL DISTRICT. Course Description. Instructional Strategies, Learning Practices, Activities, and Experiences. SPRING GROVE AREA SCHOOL DISTRICT PLANNED COURSE OVERVIEW Course Title: Basic Introductory Statistics Grade Level(s): 11-12 Units of Credit: 1 Classification: Elective Length of Course: 30 cycles Periods

More information

STATISTICS 201. Survey: Provide this Info. How familiar are you with these? Survey, continued IMPORTANT NOTE. Regression and ANOVA 9/29/2013

STATISTICS 201. Survey: Provide this Info. How familiar are you with these? Survey, continued IMPORTANT NOTE. Regression and ANOVA 9/29/2013 STATISTICS 201 Survey: Provide this Info Outline for today: Go over syllabus Provide requested information on survey (handed out in class) Brief introduction and hands-on activity Name Major/Program Year

More information

Stat 13, Lab 11-12, Correlation and Regression Analysis

Stat 13, Lab 11-12, Correlation and Regression Analysis Stat 13, Lab 11-12, Correlation and Regression Analysis Part I: Before Class Objective: This lab will give you practice exploring the relationship between two variables by using correlation, linear regression

More information

Visualization and interpretation of cancer data using linked micromap plots

Visualization and interpretation of cancer data using linked micromap plots Journal of the Korean Data & Information Science Society 2014, 25(6), 1531 1538 http://dx.doi.org/10.7465/jkdi.2014.25.6.1531 한국데이터정보과학회지 Visualization and interpretation of cancer data using linked micromap

More information

BIOL 458 BIOMETRY Lab 7 Multi-Factor ANOVA

BIOL 458 BIOMETRY Lab 7 Multi-Factor ANOVA BIOL 458 BIOMETRY Lab 7 Multi-Factor ANOVA PART 1: Introduction to Factorial ANOVA ingle factor or One - Way Analysis of Variance can be used to test the null hypothesis that k or more treatment or group

More information

STATISTICS INFORMED DECISIONS USING DATA

STATISTICS INFORMED DECISIONS USING DATA STATISTICS INFORMED DECISIONS USING DATA Fifth Edition Chapter 4 Describing the Relation between Two Variables 4.1 Scatter Diagrams and Correlation Learning Objectives 1. Draw and interpret scatter diagrams

More information

Nutritional Risk Factors for Peripheral Vascular Disease: Does Diet Play a Role?

Nutritional Risk Factors for Peripheral Vascular Disease: Does Diet Play a Role? Nutritional Risk Factors for Peripheral Vascular Disease: Does Diet Play a Role? John S. Lane MD, Cheryl P. Magno MPH, Karen T. Lane MD, Tyler Chan BS, Sheldon Greenfield MD University of California, Irvine

More information

Section 3.2 Least-Squares Regression

Section 3.2 Least-Squares Regression Section 3.2 Least-Squares Regression Linear relationships between two quantitative variables are pretty common and easy to understand. Correlation measures the direction and strength of these relationships.

More information

Preliminary Report on Simple Statistical Tests (t-tests and bivariate correlations)

Preliminary Report on Simple Statistical Tests (t-tests and bivariate correlations) Preliminary Report on Simple Statistical Tests (t-tests and bivariate correlations) After receiving my comments on the preliminary reports of your datasets, the next step for the groups is to complete

More information

Reliability of Ordination Analyses

Reliability of Ordination Analyses Reliability of Ordination Analyses Objectives: Discuss Reliability Define Consistency and Accuracy Discuss Validation Methods Opening Thoughts Inference Space: What is it? Inference space can be defined

More information

How Faithful is the Old Faithful? The Practice of Statistics, 5 th Edition 1

How Faithful is the Old Faithful? The Practice of Statistics, 5 th Edition 1 How Faithful is the Old Faithful? The Practice of Statistics, 5 th Edition 1 Who Has Been Eating My Cookies????????? Someone has been steeling the cookie I bought for your class A teacher from the highschool

More information

Molecules. Background

Molecules. Background Background A molecule is a group of two or more atoms. Compounds are also two or more atoms. Compounds are made from different types of atoms. can be made of different types of atoms or can also be made

More information

Lecture Outline. Biost 517 Applied Biostatistics I. Purpose of Descriptive Statistics. Purpose of Descriptive Statistics

Lecture Outline. Biost 517 Applied Biostatistics I. Purpose of Descriptive Statistics. Purpose of Descriptive Statistics Biost 517 Applied Biostatistics I Scott S. Emerson, M.D., Ph.D. Professor of Biostatistics University of Washington Lecture 3: Overview of Descriptive Statistics October 3, 2005 Lecture Outline Purpose

More information

Chapter 4: Scatterplots and Correlation

Chapter 4: Scatterplots and Correlation Chapter 4: Scatterplots and Correlation http://www.yorku.ca/nuri/econ2500/bps6e/ch4-links.pdf Correlation text exr 4.10 pg 108 Ch4-image Ch4 exercises: 4.1, 4.29, 4.39 Most interesting statistical data

More information

This report was generated by the EnQuireR package

This report was generated by the EnQuireR package This report was generated by the package Cadoret M., Fournier O., Fournier G., Le Poder F., Bouche J., Lê S. July 28, 2010 : Multivariate Exploratory Analysis of Questionnaires Multivariate exploration

More information

How to describe bivariate data

How to describe bivariate data Statistics Corner How to describe bivariate data Alessandro Bertani 1, Gioacchino Di Paola 2, Emanuele Russo 1, Fabio Tuzzolino 2 1 Department for the Treatment and Study of Cardiothoracic Diseases and

More information

Class 7 Everything is Related

Class 7 Everything is Related Class 7 Everything is Related Correlational Designs l 1 Topics Types of Correlational Designs Understanding Correlation Reporting Correlational Statistics Quantitative Designs l 2 Types of Correlational

More information

Outlier Analysis. Lijun Zhang

Outlier Analysis. Lijun Zhang Outlier Analysis Lijun Zhang zlj@nju.edu.cn http://cs.nju.edu.cn/zlj Outline Introduction Extreme Value Analysis Probabilistic Models Clustering for Outlier Detection Distance-Based Outlier Detection Density-Based

More information

Test 1C AP Statistics Name:

Test 1C AP Statistics Name: Test 1C AP Statistics Name: Part 1: Multiple Choice. Circle the letter corresponding to the best answer. 1. At the beginning of the school year, a high-school teacher asks every student in her classes

More information

Announcement. Homework #2 due next Friday at 5pm. Midterm is in 2 weeks. It will cover everything through the end of next week (week 5).

Announcement. Homework #2 due next Friday at 5pm. Midterm is in 2 weeks. It will cover everything through the end of next week (week 5). Announcement Homework #2 due next Friday at 5pm. Midterm is in 2 weeks. It will cover everything through the end of next week (week 5). Political Science 15 Lecture 8: Descriptive Statistics (Part 1) Data

More information

Caffeine & Calories in Soda. Statistics. Anthony W Dick

Caffeine & Calories in Soda. Statistics. Anthony W Dick 1 Caffeine & Calories in Soda Statistics Anthony W Dick 2 Caffeine & Calories in Soda Description of Experiment Does the caffeine content in soda have anything to do with the calories? This is the question

More information

Homework Exercises for PSYC 3330: Statistics for the Behavioral Sciences

Homework Exercises for PSYC 3330: Statistics for the Behavioral Sciences Homework Exercises for PSYC 3330: Statistics for the Behavioral Sciences compiled and edited by Thomas J. Faulkenberry, Ph.D. Department of Psychological Sciences Tarleton State University Version: July

More information

Statistics for Psychosocial Research Session 1: September 1 Bill

Statistics for Psychosocial Research Session 1: September 1 Bill Statistics for Psychosocial Research Session 1: September 1 Bill Introduction to Staff Purpose of the Course Administration Introduction to Test Theory Statistics for Psychosocial Research Overview: a)

More information

The North Carolina Health Data Explorer

The North Carolina Health Data Explorer The North Carolina Health Data Explorer The Health Data Explorer provides access to health data for North Carolina counties in an interactive, user-friendly atlas of maps, tables, and charts. It allows

More information