Machine Learning Statistical Learning. Prof. Matteo Matteucci
- Edward Howard
- 6 years ago
Transcription
1 Machine Learning Statistical Learning Prof. Matteo Matteucci
2 Statistical Learning Outline
o What Is Statistical Learning?
  - Why estimate f?
  - How do we estimate f?
  - The trade-off between prediction accuracy & model interpretability
o Some important taxonomies (I expect you'll know this by heart!)
  - Prediction vs. Inference
  - Parametric vs. Non-Parametric models
  - Regression vs. Classification problems
  - Supervised vs. Unsupervised learning
Prof. Matteo Matteucci - Machine Learning
3 Example: Increasing Sales by Advertising
4 What is Statistical Learning?
o Suppose we observe $Y_i$ and $X_i = (X_{i1}, \ldots, X_{ip})$ for $i = 1, \ldots, n$
  - Assume a relationship exists between Y and at least one of the observed X's
  - Assume we can model this relationship as $Y_i = f(X_i) + \varepsilon_i$
    - $f$: unknown function (systematic)
    - $\varepsilon_i$: zero-mean random error
o The term statistical learning refers to using the data to learn $f$
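The model on this slide can be made concrete with a small simulation. This is only a sketch: the slides leave f unspecified, so the function `true_f`, the range of X, and the Gaussian noise are assumptions chosen for illustration.

```python
import numpy as np

def true_f(x):
    # Hypothetical systematic component; the slides do not specify f.
    return np.sin(2.0 * x)

def simulate(n, sd, seed=0):
    """Draw n observations from Y = f(X) + eps, with eps ~ N(0, sd^2) (assumed Gaussian)."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 3.0, size=n)
    eps = rng.normal(0.0, sd, size=n)
    return x, true_f(x) + eps

x, y = simulate(n=1000, sd=0.03)
residual = y - true_f(x)          # recovers eps, since f is known in a simulation
print(residual.mean(), residual.std())
```

In a real problem only `x` and `y` are available; `true_f` is exactly what statistical learning tries to recover.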
5 Reducible vs Irreducible Error
o The error our estimate will have has two components, given $Y_i = f(X_i) + \varepsilon_i$
  - Reducible error due to the choice of model complexity
  - Irreducible error due to the presence of $\varepsilon_i$ in the training set
6 Because noise matters
[Figure: four scatter plots of y against x, with noise levels sd=0.001, sd=0.005, sd=0.01 and sd=0.03]
7 Reducible vs Irreducible Error Part 2
o The error our estimate will have has two components, given $Y_i = f(X_i) + \varepsilon_i$
  - Reducible error due to the choice of model complexity
  - Irreducible error due to the presence of $\varepsilon_i$ in the training set
o Let's assume $\hat{f}$ and $X$ fixed for the time being
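8 Reducible vs Irreducible Error Part 3
o Can you derive this?
$$E[(Y - \hat{Y})^2] = E[(f(X) + \varepsilon - \hat{f}(X))^2] = [f(X) - \hat{f}(X)]^2 + 2\,[f(X) - \hat{f}(X)]\,E[\varepsilon] + E[\varepsilon^2] = [f(X) - \hat{f}(X)]^2 + \mathrm{Var}(\varepsilon)$$
The cross term vanishes because $E[\varepsilon] = 0$. The first term is the reducible error and the second is the irreducible error.
[Figure: scatter plot of y against x, sd=0.03]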
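A quick numerical check of the decomposition of the expected squared error into reducible and irreducible parts, with the estimate and X held fixed. This is a sketch under assumed values: the numbers `f_x`, `f_hat_x` and `sd` are made up, and the noise is assumed Gaussian.

```python
import numpy as np

rng = np.random.default_rng(1)

f_x = 2.0        # hypothetical true value f(x) at a fixed x
f_hat_x = 2.3    # hypothetical fixed estimate of f(x)
sd = 0.5         # assumed standard deviation of eps

eps = rng.normal(0.0, sd, size=200_000)
y = f_x + eps                                   # many draws of Y at the same x

mse_empirical = np.mean((y - f_hat_x) ** 2)     # Monte Carlo estimate of E[(Y - Y_hat)^2]
mse_theory = (f_x - f_hat_x) ** 2 + sd ** 2     # reducible + irreducible
print(mse_empirical, mse_theory)
```

However good the estimate becomes, the expected squared error cannot drop below `sd ** 2`.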
9 Example: Income vs. Education and Seniority
o Function $f$ might also involve multiple variables
10 Why do we estimate f?
o There are two reasons for estimating f
  - Prediction
  - Inference
o Prediction
  - If we can produce a good estimate for f (and the variance of ε is not too large) we can make accurate predictions for the response, Y/G, based on a new value of X.
o Inference
  - We may be interested in the type of relationship between Y/G and the X's to control/influence Y/G.
    - Which particular predictors actually affect the response?
    - Is the relationship positive or negative?
    - Is the relationship a simple linear one or is it more complicated etc.?
11 Examples for Prediction & Inference
o Direct Mail Prediction
  - Interested in predicting how much money an individual will donate, based on observations from 90,000 people on which we have recorded over 400 different characteristics.
  - Don't care too much about each individual characteristic.
  - Just want to know: for a given individual, should I send out a mailing?
o Median House Price
  - Which factors have the biggest effect on the response?
  - How big the effect is.
  - Want to know: how much impact does a river view have on the house value?
12 How Do We Estimate f?
o We have observed a set of training data $\{(X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n)\}$
o Use a statistical method/model to estimate $f$ so that, for any $(X, Y)$, $Y \approx \hat{f}(X)$
o Statistical methods/models are usually divided into
  - Parametric Methods/Models
  - Non-parametric Methods/Models
13 Parametric Methods Part 1
o Parametric methods leverage an assumption about the model underlying $f$
  - They reduce the problem of estimating $f$ down to the one of estimating a set of parameters
  - They involve a two-step model-based approach
o STEP 1: Make some assumption about the functional form of $f$, i.e., come up with a model, e.g., a linear model
  $f(X_i) = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \cdots + \beta_p X_{ip}$
o STEP 2: Use the training data to fit the model, i.e., estimate $f$ through the unknown parameters $\beta_0, \beta_1, \ldots, \beta_p$
14 Parametric Methods Part 2
o Parametric methods leverage an assumption about the model underlying $f$
  - They reduce the problem of estimating $f$ down to the one of estimating a set of parameters
  - They involve a two-step model-based approach
o STEP 1: In this course we will examine models for $f$ that are far more complicated, and flexible, than linear ones. In a sense, the more flexible the model the more realistic it is.
o STEP 2: The most common approach for estimating the parameters in a linear model is Ordinary Least Squares (OLS), but there are often superior approaches.
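The two-step parametric approach can be sketched with an ordinary least squares fit in NumPy. The data here are synthetic and the coefficient values in `beta_true` are invented for illustration; only `np.linalg.lstsq` is a real library call.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500
X = rng.normal(size=(n, 2))                    # two hypothetical predictors
beta_true = np.array([1.0, 2.0, -0.5])         # beta_0, beta_1, beta_2 (made-up values)
y = beta_true[0] + X @ beta_true[1:] + rng.normal(0.0, 0.1, size=n)

# STEP 1: assume a linear functional form for f.
# STEP 2: estimate its coefficients from the training data by least squares.
design = np.column_stack([np.ones(n), X])      # column of ones carries the intercept
beta_hat, *_ = np.linalg.lstsq(design, y, rcond=None)
print(beta_hat)
```

Estimating f has been reduced to estimating three numbers, which is why the fit works well here: the assumed form matches the process that generated the data.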
15 Example: A Linear Regression Estimate
o Even if the standard deviation is low, we will still get a bad answer if we use the wrong model.
$f = b_0 + b_1 \times \text{Education} + b_2 \times \text{Seniority}$
16 Non-parametric Methods
o Sometimes referred to as sample-based or instance-based methods, they do not make explicit assumptions about the functional form of $f$; they exploit the training data directly
o Advantages:
  - They accurately fit a wider range of possible shapes of $f$
  - They do not require a training phase
o Disadvantages:
  - A very large number of observations is required to obtain an accurate estimate of $f$
  - Higher computational cost at testing time
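One simple instance-based method is k-nearest-neighbour regression, used here as an illustration; the slides do not name a specific method at this point. The helper `knn_predict`, the sine-shaped f, and the sample sizes are all assumptions.

```python
import numpy as np

def knn_predict(x_train, y_train, x_new, k=5):
    """Predict at each x_new as the mean response of the k nearest training points."""
    x_new = np.atleast_1d(x_new)
    dist = np.abs(x_new[:, None] - x_train[None, :])   # pairwise distances (1-D inputs)
    nearest = np.argsort(dist, axis=1)[:, :k]          # indices of the k closest points
    return y_train[nearest].mean(axis=1)

rng = np.random.default_rng(3)
x_train = rng.uniform(0.0, 3.0, size=2000)
y_train = np.sin(2.0 * x_train) + rng.normal(0.0, 0.05, size=2000)  # hypothetical f

pred = knn_predict(x_train, y_train, np.array([0.5, 1.5, 2.5]), k=25)
print(pred)
```

There is no training phase: all the work (distance computation and sorting over the whole training set) happens at prediction time, which matches the advantages and disadvantages listed above.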
17 Example: A Thin-Plate Spline Estimate
[Figure: smooth thin-plate spline fit]
o Non-parametric regression methods are more flexible, thus they can potentially provide more accurate estimates
18 Prediction Accuracy vs Model Interpretability
o Why not just use a more flexible method if it is more realistic?
  - Reason 1: A simple method such as linear regression produces a model which is much easier to interpret (the Inference part is better). E.g., in a linear model, $\beta_j$ is the average increase in $Y$ for a one-unit increase in $X_j$, holding all other variables constant.
  - Reason 2: Even if you are only interested in prediction, it is often possible to get more accurate predictions with a simple, instead of a complicated, model. This seems counterintuitive but has to do with the fact that it is harder to properly fit a more flexible model.
19 A Poor Estimate
o Non-parametric regression methods can also be too flexible and produce poor estimates for $f$
[Figure: thin-plate spline fit with zero training error]
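The effect of an overly flexible fit can be reproduced in a few lines. This sketch does not use a thin-plate spline; it swaps in 1-nearest-neighbour regression, which also reaches zero training error, and compares it with a straight-line fit. The linear f, the noise level, and the sample sizes are assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)

def make_data(n):
    x = rng.uniform(0.0, 1.0, size=n)
    y = 1.0 + 2.0 * x + rng.normal(0.0, 0.5, size=n)   # hypothetical linear f plus noise
    return x, y

def one_nn_predict(x_train, y_train, x_new):
    # Copy the response of the single closest training point.
    idx = np.abs(x_new[:, None] - x_train[None, :]).argmin(axis=1)
    return y_train[idx]

def mse(y_true, y_pred):
    return float(np.mean((y_true - y_pred) ** 2))

x_tr, y_tr = make_data(200)
x_te, y_te = make_data(2000)

slope, intercept = np.polyfit(x_tr, y_tr, deg=1)       # simple parametric fit

train_mse_1nn = mse(y_tr, one_nn_predict(x_tr, y_tr, x_tr))
test_mse_1nn = mse(y_te, one_nn_predict(x_tr, y_tr, x_te))
test_mse_linear = mse(y_te, slope * x_te + intercept)
print(train_mse_1nn, test_mse_1nn, test_mse_linear)
```

The flexible fit reproduces every training point, noise included, so its zero training error says nothing about how it behaves on new data.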
20 Flexibility vs Model Interpretability
21 Supervised vs. Unsupervised Learning
o Machine Learning usually makes a clear distinction between
  - Supervised Models
  - Unsupervised Models
o Supervised Learning: Supervised Learning is where both the predictors, $X_i$, and the response, $Y_i$, are observed.
22 Supervised vs. Unsupervised Learning
o Machine Learning usually makes a clear distinction between
  - Supervised Models
  - Unsupervised Models
o Unsupervised Learning: Only the $X_i$'s are observed, and we use them to build a high-level representation, possibly for modeling some $Y$.
23 Regression vs. Classification
o Supervised learning problems can be further divided into
  - Regression problems cover situations where Y is continuous/numerical
    - Predicting the value of the Dow in 6 months
    - Predicting the value of a given house based on various inputs
  - Classification problems cover situations where Y is categorical
    - Will the Dow be up (U) or down (D) in 6 months?
    - Is this SPAM or not?
24 A Simple Clustering Example
25 What about higher dimensions?
26 Wrap up!
o What Is Statistical Learning?
  - Why estimate f?
  - How do we estimate f?
  - The trade-off between prediction accuracy & model interpretability
o Some important taxonomies (I expect you'll know this by heart!)
  - Prediction vs. Inference
  - Parametric vs. Non-Parametric models
  - Regression vs. Classification problems
  - Supervised vs. Unsupervised learning
More informationBOOTSTRAPPING CONFIDENCE LEVELS FOR HYPOTHESES ABOUT REGRESSION MODELS
BOOTSTRAPPING CONFIDENCE LEVELS FOR HYPOTHESES ABOUT REGRESSION MODELS 17 December 2009 Michael Wood University of Portsmouth Business School SBS Department, Richmond Building Portland Street, Portsmouth
More informationPerformance of Median and Least Squares Regression for Slightly Skewed Data
World Academy of Science, Engineering and Technology 9 Performance of Median and Least Squares Regression for Slightly Skewed Data Carolina Bancayrin - Baguio Abstract This paper presents the concept of
More informationlateral organization: maps
lateral organization Lateral organization & computation cont d Why the organization? The level of abstraction? Keep similar features together for feedforward integration. Lateral computations to group
More informationChapter 8: Estimating with Confidence
Chapter 8: Estimating with Confidence Section 8.1 The Practice of Statistics, 4 th edition For AP* STARNES, YATES, MOORE Introduction Our goal in many statistical settings is to use a sample statistic
More informationQuasi-experimental analysis Notes for "Structural modelling".
Quasi-experimental analysis Notes for "Structural modelling". Martin Browning Department of Economics, University of Oxford Revised, February 3 2012 1 Quasi-experimental analysis. 1.1 Modelling using quasi-experiments.
More informationMultiple Regression. James H. Steiger. Department of Psychology and Human Development Vanderbilt University
Multiple Regression James H. Steiger Department of Psychology and Human Development Vanderbilt University James H. Steiger (Vanderbilt University) Multiple Regression 1 / 19 Multiple Regression 1 The Multiple
More informationUnderstanding the Hypothesis
Understanding the Hypothesis Course developed by Deborah H. Glueck and Keith E. Muller Slides developed by Jessica R. Shaw, Keith E. Muller, Albert D. Ritzhaupt and Deborah H. Glueck Copyright by the Regents
More information