Social Effects in Blau Space:
1 Social Effects in Blau Space: Miller McPherson and Jeffrey A. Smith, Duke University
2 Abstract We develop a method of imputing characteristics of the network alters of respondents in probability samples of individuals, using the homophily principle to estimate the properties of a respondent's core discussion network. These properties include a measure of the potential exposure to the attitudes, values, beliefs, and other characteristics of the respondent's network alters. Data from the General Social Survey demonstrate that the imputed network characteristics are strongly related to individual-level measures such as attitudes, beliefs, and other variables typical of survey analysis. In some cases, the imputed network variable drastically alters and even eliminates the effects of standard sociodemographic variables such as age and education. We follow with examples of health-related behavior from the Panel Study of Income Dynamics.
3 The Relational Approach: N cases become N(N-1)/2 cases. The metric is now distance. Distance is created by homophily.
4 The Homophily Principle Applies to almost all social distinctions: Age, race, gender, beauty, height, weight, skin color. Has been found in almost all studies of social contact. Governs the large-scale organization of most kinds of interaction. Shapes the most elementary form of social structure: the probability of contact between two individuals. Produces localization of social entities.
5 Activities are localized in social space: Points are a representative sample of individuals. Boxes are a representative sample of associations.
6 What ties large scale systems together? The homophily principle simplifies transactions in Blau space, turning sociodemographic distance into network distance.
7 As the scale of the system grows larger, the homophily principle becomes more and more powerful.
8
9
10
11 Focus on Industrial Society: Sizes in millions or billions. High-dimensional Blau space. Differentiation of types of relationships. Social entities operate at multiple levels.
12 The number of potential connections in systems of varying size [Figure: connections, N(N-1)/2, plotted against system size, N]
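To make the growth of N(N-1)/2 concrete, here is a minimal Python sketch. The system sizes in the loop are chosen for illustration and are not the values plotted on the slide.

```python
def potential_ties(n: int) -> int:
    """Number of unordered pairs among n individuals: n(n-1)/2."""
    return n * (n - 1) // 2

# Illustrative system sizes (an assumption of this sketch, not slide data).
for n in (10, 1_000, 1_000_000, 1_000_000_000):
    print(f"N = {n:>13,}  potential ties = {potential_ties(n):,}")
```

The count grows roughly as the square of N, which is why a relational dataset built from a modest survey sample is already very large.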
13 Estimated number of Confiding relationships on Earth 7,280,000,000,000
14 Ratio of Actual to Potential
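The ratio of actual to potential ties can be sketched as follows. The number of confiding relationships is the figure quoted on slide 13; the world population is an assumed round number used only for illustration, since the slides do not state the N behind the estimate.

```python
ESTIMATED_CONFIDING_TIES = 7_280_000_000_000  # figure quoted on slide 13
ASSUMED_WORLD_POPULATION = 7_000_000_000      # assumption, not from the slides

def tie_density(actual_ties: int, n: int) -> float:
    """Share of the n(n-1)/2 possible pairs that are realized ties."""
    potential = n * (n - 1) // 2
    return actual_ties / potential

ratio = tie_density(ESTIMATED_CONFIDING_TIES, ASSUMED_WORLD_POPULATION)
print(f"actual / potential = {ratio:.2e}")
```

Under this assumption almost all possible pairs are unrealized, which is the sparsity that the case-control argument later in the talk relies on.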
15 Assumptions in the theory: People have a finite capacity for processing information and finite time and energy. The homophily principle organizes the flow of information across the system.
16 Some implications of the theory: The localization of information through homophily leads to the formation of niches for social entities. These social niches evolve and interact with each other dynamically. The location of an individual inside a niche conditions the probability that the individual will be affiliated with the social entity.
17 Core question for the present: Is the presence of a socially transmissible characteristic in an individual the result of contagion in Blau space, or is it a result of human capital, material resources, or some atomistic process?
18 Activities are localized in social space: Points are a probability sample of individuals. Boxes are a probability sample of activities.
19 For example: Are highly educated people more tolerant because they have been trained by the educational institution to appreciate diversity [education causes tolerance]? Or, is it that educated people tend to be surrounded by other educated people who express tolerant views, and become more tolerant as a result [birds of a feather not only flock together, but they fly in parallel]?
20 Another example: Do highly educated people desire fewer children because there is something inherent in the educational process that inhibits childbearing [education causes a desire for smaller families]? Or, do highly educated people get their views on desired family size from their friends, who are also highly educated [social context affects desired family size]?
21 Today's approach to the problem: 1. Parameterize Blau space in multiple dimensions. 2. Model the dependencies among social entities produced by proximity in Blau space. 3. Use the estimated dependencies to impute local social context. 4. Compare the predictive power of social context and the conventional model.
22 Parameterizing Blau space The General Social Survey asked respondents in 1985 and 2004 to name the persons with whom they discussed important matters, and to report the following social characteristics of those persons: Age, Race, Sex, Religion, Education.
23 Parameterizing, cont. The survey also collected information on those characteristics for the respondents, as well as a rich variety of attitudinal and belief variables.
24
25
26 Parameterizing, cont. We create a dataset of all pairs of the GSS respondents in a given year, and assume that these pairs constitute a representative sample of pairs who do not discuss important matters with each other. Each of these pairs has a vector of Blau distances in Education, Age, Sex, Race, and Religion (e.g., $D_{ij} = \mathrm{Age}_i - \mathrm{Age}_j$). This dataset is the set of Controls in our case-control analysis.
27 Parameterizing, cont. We create a parallel dataset consisting of the pairs generated by the reports of each respondent on one of their core discussion partners. Again, each pair is characterized by the Blau distances in Education, Age, Sex, Race, and Religion. This dataset is the set of Cases.
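The construction of controls and cases described on the last two slides can be sketched in Python. The records, field names, and distance coding below (absolute gaps for age and education, 0/1 mismatch for sex, race, and religion) are assumptions made for illustration. The slides do not specify the coding and these are not GSS variable names.

```python
from itertools import combinations

# Hypothetical respondents; field names are illustrative only.
respondents = [
    {"id": 1, "age": 34, "educ": 16, "sex": "F", "race": "white", "relig": "protestant"},
    {"id": 2, "age": 61, "educ": 12, "sex": "M", "race": "black", "relig": "protestant"},
    {"id": 3, "age": 29, "educ": 18, "sex": "F", "race": "white", "relig": "none"},
]
# Hypothetical alter reports: (ego id, alter characteristics as reported by ego).
alter_reports = [
    (1, {"age": 36, "educ": 16, "sex": "M", "race": "white", "relig": "protestant"}),
    (2, {"age": 58, "educ": 12, "sex": "M", "race": "black", "relig": "catholic"}),
]

def blau_distance(a, b):
    """Assumed coding: absolute gaps for age/education, 0/1 mismatch otherwise."""
    return {
        "d_age": abs(a["age"] - b["age"]),
        "d_educ": abs(a["educ"] - b["educ"]),
        "d_sex": int(a["sex"] != b["sex"]),
        "d_race": int(a["race"] != b["race"]),
        "d_relig": int(a["relig"] != b["relig"]),
    }

# Controls: every unordered pair of respondents, treated as a non-tie.
controls = [dict(blau_distance(a, b), tie=0) for a, b in combinations(respondents, 2)]

# Cases: each ego paired with one reported discussion partner.
by_id = {r["id"]: r for r in respondents}
cases = [dict(blau_distance(by_id[ego], alter), tie=1) for ego, alter in alter_reports]

# Stacked dataset with the outcome "tie" and the Blau distance vector.
dataset = cases + controls
```

Each row of `dataset` is one pair, so the unit of analysis has shifted from the individual to the dyad.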
28 Parameterizing, cont. We combine the cases and controls into a single dataset, and estimate the parameters in the following case-control logistic regression model: $\ln\left[\frac{P(\mathrm{tie}_{ij})}{1 - P(\mathrm{tie}_{ij})}\right] = \alpha + \beta(D_{ij})$, where $\beta(D_{ij})$ represents the set of Blau distances and the associated parameters.
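As a rough illustration of fitting this model, the sketch below estimates a logistic regression by plain gradient ascent on a toy set of pairs. The data, the two distance features, and the optimizer settings are all invented for the example. The slides do not say which software or estimation routine was used, and a real analysis would use a standard statistical package.

```python
import math

def fit_logit(X, y, lr=0.1, steps=5000):
    """Fit ln[p/(1-p)] = alpha + sum_k beta_k * x_k by gradient ascent
    on the mean log-likelihood."""
    n, k = len(X), len(X[0])
    alpha, beta = 0.0, [0.0] * k
    for _ in range(steps):
        g_alpha, g_beta = 0.0, [0.0] * k
        for row, tie in zip(X, y):
            z = alpha + sum(b * x for b, x in zip(beta, row))
            p = 1.0 / (1.0 + math.exp(-z))
            err = tie - p                      # score contribution of this pair
            g_alpha += err
            for j in range(k):
                g_beta[j] += err * row[j]
        alpha += lr * g_alpha / n
        beta = [b + lr * g / n for b, g in zip(beta, g_beta)]
    return alpha, beta

# Toy pairs: [education distance in years, race mismatch 0/1] (hypothetical).
X = [[0, 0], [1, 0], [2, 0], [1, 1], [4, 0],                  # cases (tie = 1)
     [3, 1], [5, 1], [2, 1], [6, 0], [1, 0], [4, 1], [7, 1]]  # controls (tie = 0)
y = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0]

alpha, beta = fit_logit(X, y)
print("intercept:", round(alpha, 3))
print("distance coefficients:", [round(b, 3) for b in beta])
```

On data like this, where ties are concentrated among pairs that are close together, the distance coefficients come out negative: greater Blau distance lowers the odds of a tie.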
29 Case Control Analysis It is well known in the biometric literature that logistic regression provides consistent estimates, up to a constant of proportionality, for the parameters of models fitted to data sampled on the dependent variable (cf. Hosmer and Lemeshow 1978, Pregibon 1974, Allison 2007). Our application fits into this model in the following way:
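The standard argument behind this claim can be written out in a few lines. The sampling fractions $\tau_1$ (for ties) and $\tau_0$ (for non-ties) are symbols introduced here for the derivation and do not appear on the slides. Write $p = P(\mathrm{tie}_{ij} \mid D_{ij})$ for the population probability.

```latex
% Probability of a tie among sampled pairs, by Bayes' rule
P(\mathrm{tie}_{ij} \mid D_{ij}, \text{sampled})
  = \frac{\tau_1\, p}{\tau_1\, p + \tau_0\,(1 - p)}

% Odds in the sample are the population odds times a constant
\frac{P(\mathrm{tie}_{ij} \mid D_{ij}, \text{sampled})}
     {1 - P(\mathrm{tie}_{ij} \mid D_{ij}, \text{sampled})}
  = \frac{\tau_1}{\tau_0} \cdot \frac{p}{1 - p}

% Taking logs: only the intercept shifts
\ln\!\left[\frac{P(\mathrm{tie}_{ij} \mid D_{ij}, \text{sampled})}
                {1 - P(\mathrm{tie}_{ij} \mid D_{ij}, \text{sampled})}\right]
  = \left[\alpha + \ln\frac{\tau_1}{\tau_0}\right] + \beta(D_{ij})
```

The distance coefficients are therefore unaffected by sampling on the outcome, while the intercept absorbs the log ratio of the two sampling fractions.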
30 Case Control Analysis, cont. The sample of relationships produced by the GSS study of core discussion networks is a representative sample of core confidant ties, since the respondents are a probability sample of the U.S. population.
31 Case Control Analysis, cont. The respondents in the GSS generate a sample of potential ties representative of the universe of core confidant ties that are unrealized, since the individuals are sampled independently (approximately). This result follows because the probability of obtaining a true network alter of any particular respondent in the actually measured core discussion networks is very small (p<.00001, with heroic assumptions about sampling design).
32 Case Control Analysis, cont. Thus, we have a representative sample of observed core discussion network ties, and a representative sample of non-ties, which may be combined in the case-control study to allow us to estimate the parameters of our model of homophily in Blau space.
33 Estimated effects of Blau distances [Table of coefficients for: Intercept, Race Distance, Religion Distance, Education Distance, Age Distance, Gender Distance.] (All coefficients significant beyond .001.)
34 Modeling Dependencies From the estimated logistic regression equation, we recover fitted probabilities of contact between persons i and j in our sample, given their distance in Blau space: $P(\mathrm{contact}_{ij} \mid D_{ij}) = \frac{1}{1 + e^{-(\alpha + \beta(D_{ij}))}}$
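A minimal Python sketch of this step follows. The intercept and coefficient values are hypothetical placeholders, not the estimates reported in the talk.

```python
import math

def contact_probability(distances, alpha, betas):
    """Inverse logit of alpha + sum of beta_k * distance_k for one pair."""
    z = alpha + sum(betas[name] * d for name, d in distances.items())
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients, for illustration only.
alpha = -2.0
betas = {"d_age": -0.05, "d_educ": -0.2, "d_race": -1.5}

near = {"d_age": 2, "d_educ": 0, "d_race": 0}    # similar pair
far = {"d_age": 30, "d_educ": 8, "d_race": 1}    # dissimilar pair

print("near pair:", contact_probability(near, alpha, betas))
print("far pair: ", contact_probability(far, alpha, betas))
```

With negative distance coefficients, the fitted probability falls as the pair moves apart in Blau space.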
35
36
37
38
39
40 Modeling Dependencies, cont. These probabilities are assembled into a row-stochastic distance matrix P, with which we can form the term PY. Its ith element, $(PY)_i$, is proportional to the expectation of Y for the potential network alters of respondent i, located in that respondent's position in Blau space.
41 Modeling Dependencies, cont. In concrete terms, the product $(PY)_i$ imputes the social context of ego's locale in Blau space. If Y is binary, then $(PY)_i$ estimates the proportion of potential alters who have attribute Y. If Y is continuous, $(PY)_i$ estimates the mean value of Y among ego's potential network alters. Since the intercepts from the case-control analysis are biased, these estimates are correct up to a constant of proportionality, which is all that is required for the next stage of analysis.
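The imputation can be sketched in a few lines of Python. The fitted probabilities below are invented, and setting the diagonal to zero (so that ego is not counted among ego's own alters) is an assumption of this sketch rather than something stated on the slides.

```python
def row_normalize(weights):
    """Scale each row to sum to 1, giving a row-stochastic matrix."""
    normalized = []
    for row in weights:
        total = sum(row)
        normalized.append([w / total for w in row])
    return normalized

def impute_context(P, y):
    """Matrix-vector product PY: element i is sum_j P[i][j] * y[j]."""
    return [sum(p_ij * y_j for p_ij, y_j in zip(row, y)) for row in P]

# Hypothetical fitted contact probabilities among three respondents.
fitted = [
    [0.0, 0.6, 0.1],
    [0.6, 0.0, 0.3],
    [0.1, 0.3, 0.0],
]
y = [1, 1, 0]  # binary attribute, e.g. holds a given attitude

P = row_normalize(fitted)
context = impute_context(P, y)
print(context)  # share of each ego's potential alters with the attribute
```

Because each row is rescaled to sum to one, a constant multiplier on all fitted probabilities cancels, so the multiplicative distortion from the biased intercept does not matter here.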
42 Modeling Dependencies, cont. With our imputed network PY in hand, we then form the spatial/network regression model: $Y_i = \eta x_i + \Delta (PY)_i + \Theta_i$, where $(PY)_i = \sum_j P_{ij} Y_j$ is the imputed social context of respondent i, $\eta x_i$ represents the conventional survey analytic effects of X variables, $\Delta$ parameterizes the homophily effects in Blau space, and $\Theta_i$ is stochastic.
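A minimal sketch of fitting such a model follows, using ordinary least squares on made-up, noise-free data. The slides do not name the estimator, so plain OLS is an assumption chosen for simplicity. Because Y appears on both sides of the equation through PY, a careful analysis would need to address that dependence.

```python
def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X)b = X'y,
    solved by Gauss-Jordan elimination with partial pivoting."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(r[i] * r[j] for r in X) for j in range(k)]
        + [sum(r[i] * t for r, t in zip(X, y))]
        for i in range(k)
    ]
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Hypothetical data: x is an individual covariate, context is (PY)_i.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
context = [0.5, 0.1, 0.9, 0.3, 0.7]
# Outcome built from known coefficients so the fit can be checked.
y = [1.0 + 2.0 * xi + 3.0 * ci for xi, ci in zip(x, context)]

design = [[1.0, xi, ci] for xi, ci in zip(x, context)]
intercept, eta, delta = ols(design, y)
print(intercept, eta, delta)
```

Comparing the estimate of eta with and without the context column in the design matrix is the comparison the talk describes between the conventional model and the social context model.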
43
44 Return to our Tolerance Example: For our tolerance example cited earlier, the network regression model will be: $\mathrm{Tolerance}_i = a + b_1\,\mathrm{Education}_i + b_2\,(P\,\mathrm{Tolerance})_i + \mathrm{error}_i$
45
46
47 Summary The social context variable is not only a substantial predictor of these survey measures of attitudes; it also removes much of the effect of the conventional major predictors of those attitudes.
48 To emphasize, the model: Is applicable to any social survey variables, from any sample, as long as sociodemographic information is measured in that survey. Does not require any information on networks in that survey.
49 A Modest Claim If these results hold in general, then past studies of attributes in surveys like the GSS are almost certainly producing biased and inconsistent estimates due to the omission of social context. All prior studies of these survey variables may be producing illusory results due to the operation of social contagion in Blau space.
50 Some Implications: The effects of sociodemographic variables may occur through social distance in Blau space, rather than through the traditionally conceived causal mechanisms. Contagion in Blau space is a baseline phenomenon that should be accounted for before other effects are posited.
51
52
53 A Health Application to a Dataset with No Direct Network Variables: Panel Study of Income Dynamics. N = 1112. Years: 1999, 2001, 2003, 2005. Growth Curves: Health, BMI, Ailments.
54 Methods We compute the social context of a position in Blau space by constructing Ego's view of the health terrain. We argue that Ego will tend to regress to the expected value of their social context over time, rather than to the grand mean over time.
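The two competing predictors named on slide 58, "Deviation from Alters" and "Deviation from Overall Mean", can be sketched as follows. The baseline values, the imputed contexts, and the sign convention (target minus ego's own value) are assumptions made for illustration.

```python
def deviations(y, context):
    """Return (deviation from alters, deviation from grand mean) per ego.

    Assumed sign convention: target value minus ego's own baseline value,
    so a positive deviation means ego sits below the target.
    """
    grand_mean = sum(y) / len(y)
    dev_alters = [c - yi for yi, c in zip(y, context)]
    dev_mean = [grand_mean - yi for yi in y]
    return dev_alters, dev_mean

# Hypothetical baseline ailment counts and imputed social context (PY)_i.
baseline = [2, 5, 3]
context = [3.0, 4.0, 3.5]

dev_alters, dev_mean = deviations(baseline, context)
print(dev_alters)
print(dev_mean)
```

Regressing the later change in the outcome on both deviations then asks which target, the local social context or the grand mean, ego drifts toward.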
55
56
57
58 Regression to Social Context or Regression to the Grand Mean? [Three regression tables with columns Estimate, Std. Error, t value, Pr(>|t|); most numeric entries are missing from this transcription.] Change in Count of Ailments: (Intercept), < 2e-16 ***; Deviation from Alters, e-16 ***; Deviation from Overall Mean, e-12 ***. Change in Reported Health: (Intercept), < 2e-16 ***; Deviation from Alters, e-16 ***; Deviation from Overall Mean, e-10 ***. Change in BMI: (Intercept), e-11 ***; Deviation from Alters, no entries; Deviation from Overall Mean, *.
59 Interpretations: Network Neighborhood Contagion: The Christakis Effect: Strong and Weak Ties. Local Social Context: Family, Neighborhood, Workplace: Lifestyle. Generalized Social Context: Position in Stratification System, Urban-Rural Location, Geography. Location in the Cultural System: Cultural Niches: Habitat and Habitus.
60 Some limitations: Statistical properties not well understood (the P term in $Y_i = \eta x_i + \Delta (PY)_i + \Theta_i$ is estimated, not fixed). Assumed unity of network content (all relations assumed to be from the same distributional family). P estimates based on very strong ties (strong ties are embedded in dense neighborhoods). Assumes transmissible Y (genetically coded Y operates on a different time scale). Limited to entities with niches (will not explain uniformly distributed Y).
61 Miller's Final Rant for the Day: Homophily is not choice. Social change is not rewiring, but is microevolution. The action in Blau space is relational, not attributional. All surveys of individuals are samples of the residue of networks: each observation is a highly fallible information dump from a node in a high-dimensional space organized by the homophily principle. The above implies that 1) survey observations are serially correlated in Blau space and 2) survey datasets actually have some number in the factorials of n and k dependent observations, rather than n independent observations, where n is the number of individuals and k is the number of variables. Blau space is a lens that enables us to view the connections implicit in survey data with the high-dimensional web of human networks. Micro-level networks, the bipartite (multipartite) networks of connections between individuals and higher-level social entities, and the connections among the higher-level entities coevolve in Blau space. Human networks are instantiations of high-dimensional objects that are mostly unobservable, not the zeros and ones in our models.
62
63
More informationMLE #8. Econ 674. Purdue University. Justin L. Tobias (Purdue) MLE #8 1 / 20
MLE #8 Econ 674 Purdue University Justin L. Tobias (Purdue) MLE #8 1 / 20 We begin our lecture today by illustrating how the Wald, Score and Likelihood ratio tests are implemented within the context of
More informationThis exam consists of three parts. Provide answers to ALL THREE sections.
Empirical Analysis and Research Methodology Examination Yale University Department of Political Science January 2008 This exam consists of three parts. Provide answers to ALL THREE sections. Your answers
More informationApplied Quantitative Methods II
Applied Quantitative Methods II Lecture 7: Endogeneity and IVs Klára Kaĺıšková Klára Kaĺıšková AQM II - Lecture 7 VŠE, SS 2016/17 1 / 36 Outline 1 OLS and the treatment effect 2 OLS and endogeneity 3 Dealing
More informationConfluence: Conformity Influence in Large Social Networks
Confluence: Conformity Influence in Large Social Networks Jie Tang *, Sen Wu *, and Jimeng Sun + * Tsinghua University + IBM TJ Watson Research Center 1 Conformity Conformity is the act of matching attitudes,
More informationTWO-DAY DYADIC DATA ANALYSIS WORKSHOP Randi L. Garcia Smith College UCSF January 9 th and 10 th
TWO-DAY DYADIC DATA ANALYSIS WORKSHOP Randi L. Garcia Smith College UCSF January 9 th and 10 th @RandiLGarcia RandiLGarcia Mediation in the APIM Moderation in the APIM Dyadic Growth Curve Modeling Other
More informationAN INFORMATION VISUALIZATION APPROACH TO CLASSIFICATION AND ASSESSMENT OF DIABETES RISK IN PRIMARY CARE
Proceedings of the 3rd INFORMS Workshop on Data Mining and Health Informatics (DM-HI 2008) J. Li, D. Aleman, R. Sikora, eds. AN INFORMATION VISUALIZATION APPROACH TO CLASSIFICATION AND ASSESSMENT OF DIABETES
More informationApplication of Cox Regression in Modeling Survival Rate of Drug Abuse
American Journal of Theoretical and Applied Statistics 2018; 7(1): 1-7 http://www.sciencepublishinggroup.com/j/ajtas doi: 10.11648/j.ajtas.20180701.11 ISSN: 2326-8999 (Print); ISSN: 2326-9006 (Online)
More information9 research designs likely for PSYC 2100
9 research designs likely for PSYC 2100 1) 1 factor, 2 levels, 1 group (one group gets both treatment levels) related samples t-test (compare means of 2 levels only) 2) 1 factor, 2 levels, 2 groups (one
More informationIntroduction to Machine Learning. Katherine Heller Deep Learning Summer School 2018
Introduction to Machine Learning Katherine Heller Deep Learning Summer School 2018 Outline Kinds of machine learning Linear regression Regularization Bayesian methods Logistic Regression Why we do this
More information11/24/2017. Do not imply a cause-and-effect relationship
Correlational research is used to describe the relationship between two or more naturally occurring variables. Is age related to political conservativism? Are highly extraverted people less afraid of rejection
More informationCorrelation and regression
PG Dip in High Intensity Psychological Interventions Correlation and regression Martin Bland Professor of Health Statistics University of York http://martinbland.co.uk/ Correlation Example: Muscle strength
More informationFollowing in Your Father s Footsteps: A Note on the Intergenerational Transmission of Income between Twin Fathers and their Sons
D I S C U S S I O N P A P E R S E R I E S IZA DP No. 5990 Following in Your Father s Footsteps: A Note on the Intergenerational Transmission of Income between Twin Fathers and their Sons Vikesh Amin Petter
More informationMultiple Regression. James H. Steiger. Department of Psychology and Human Development Vanderbilt University
Multiple Regression James H. Steiger Department of Psychology and Human Development Vanderbilt University James H. Steiger (Vanderbilt University) Multiple Regression 1 / 19 Multiple Regression 1 The Multiple
More informationCRITERIA FOR USE. A GRAPHICAL EXPLANATION OF BI-VARIATE (2 VARIABLE) REGRESSION ANALYSISSys
Multiple Regression Analysis 1 CRITERIA FOR USE Multiple regression analysis is used to test the effects of n independent (predictor) variables on a single dependent (criterion) variable. Regression tests
More informationThose Who Tan and Those Who Don t: A Natural Experiment of Employment Discrimination
Those Who Tan and Those Who Don t: A Natural Experiment of Employment Discrimination Ronen Avraham, Tamar Kricheli Katz, Shay Lavie, Haggai Porat, Tali Regev Abstract: Are Black workers discriminated against
More informationUnderweight Children in Ghana: Evidence of Policy Effects. Samuel Kobina Annim
Underweight Children in Ghana: Evidence of Policy Effects Samuel Kobina Annim Correspondence: Economics Discipline Area School of Social Sciences University of Manchester Oxford Road, M13 9PL Manchester,
More informationEPI 200C Final, June 4 th, 2009 This exam includes 24 questions.
Greenland/Arah, Epi 200C Sp 2000 1 of 6 EPI 200C Final, June 4 th, 2009 This exam includes 24 questions. INSTRUCTIONS: Write all answers on the answer sheets supplied; PRINT YOUR NAME and STUDENT ID NUMBER
More informationMicronutrients intake and cancer: Protective or promoters? A challenge for risk estimation.
Micronutrients intake and cancer: Protective or promoters? A challenge for risk estimation. Muñoz SE 1,2 ; Roman D 1 ; Roque F 1 ; Navarro A 1 ; Díaz MP 1. 1.Facultad de Ciencias Médicas, Universidad Nacional
More informationCross-Lagged Panel Analysis
Cross-Lagged Panel Analysis Michael W. Kearney Cross-lagged panel analysis is an analytical strategy used to describe reciprocal relationships, or directional influences, between variables over time. Cross-lagged
More informationExamining Relationships Least-squares regression. Sections 2.3
Examining Relationships Least-squares regression Sections 2.3 The regression line A regression line describes a one-way linear relationship between variables. An explanatory variable, x, explains variability
More informationIntroduction to Social Network Analysis for Dissemination and Implementation Research
Introduction to Social Network Analysis for Dissemination and Implementation Research Miruna Petrescu-Prahova, PhD mirunapp@uw.edu Health Promotion Research Center Department of Health Services University
More informationcloglog link function to transform the (population) hazard probability into a continuous
Supplementary material. Discrete time event history analysis Hazard model details. In our discrete time event history analysis, we used the asymmetric cloglog link function to transform the (population)
More informationPrediction and Inference under Competing Risks in High Dimension - An EHR Demonstration Project for Prostate Cancer
Prediction and Inference under Competing Risks in High Dimension - An EHR Demonstration Project for Prostate Cancer Ronghui (Lily) Xu Division of Biostatistics and Bioinformatics Department of Family Medicine
More informationCh. 11 Measurement. Measurement
TECH 646 Analysis of Research in Industry and Technology PART III The Sources and Collection of data: Measurement, Measurement Scales, Questionnaires & Instruments, Sampling Ch. 11 Measurement Lecture
More informationDonna L. Coffman Joint Prevention Methodology Seminar
Donna L. Coffman Joint Prevention Methodology Seminar The purpose of this talk is to illustrate how to obtain propensity scores in multilevel data and use these to strengthen causal inferences about mediation.
More informationSTATISTICAL METHODS FOR DIAGNOSTIC TESTING: AN ILLUSTRATION USING A NEW METHOD FOR CANCER DETECTION XIN SUN. PhD, Kansas State University, 2012
STATISTICAL METHODS FOR DIAGNOSTIC TESTING: AN ILLUSTRATION USING A NEW METHOD FOR CANCER DETECTION by XIN SUN PhD, Kansas State University, 2012 A THESIS Submitted in partial fulfillment of the requirements
More informationDoing Quantitative Research 26E02900, 6 ECTS Lecture 6: Structural Equations Modeling. Olli-Pekka Kauppila Daria Kautto
Doing Quantitative Research 26E02900, 6 ECTS Lecture 6: Structural Equations Modeling Olli-Pekka Kauppila Daria Kautto Session VI, September 20 2017 Learning objectives 1. Get familiar with the basic idea
More information