STATA: The Red tutorial

This tutorial presentation is prepared by Mohammad Ehsanul Karim (ehsan.karim@gmail.com).

Contents

Linear Regression Analysis
1. Introduction to Linear Regression
2. Tests for Normality of Residuals
3. Tests for Heteroscedasticity
4. Tests for Multicollinearity
5. Tests for Autocorrelation
6. Detecting Unusual and Influential Data
7. Tests for Model Specification

1. Introduction to Linear Regression

The regress command performs linear regression. The first variable after regress is always the dependent (left-hand-side) variable; the independent (right-hand-side) variables included in the model follow it.

. clear
. use hs1, clear
. regress write read female

      Source |       SS       df       MS              Number of obs =     200
-------------+------------------------------           F(  2,   197) =   77.21
       Model |  7856.32118     2  3928.16059           Prob > F      =  0.0000
    Residual |  10022.5538   197  50.8759077           R-squared     =  0.4394
-------------+------------------------------           Adj R-squared =  0.4337
       Total |   17878.875   199   89.843593           Root MSE      =  7.1327

------------------------------------------------------------------------------
       write |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        read |   .5658869   .0493849    11.46   0.000      .468496    .6632778
      female |   5.486894   1.014261     5.41   0.000      3.48669    7.487098
       _cons |   20.22837   2.713756     7.45   0.000     14.87663    25.58011
------------------------------------------------------------------------------
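The summary statistics in the upper right of the regress output follow directly from the sums of squares and degrees of freedom in the upper left. The lines below are a minimal do-file sketch, assuming hs1 is in memory, that recomputes them from the results regress leaves in e():

```stata
* Assumes hs1 is in memory; refit the model so e() holds its results
regress write read female

* R-squared = Model SS / Total SS
display e(mss) / (e(mss) + e(rss))

* F = Model MS / Residual MS
display (e(mss)/e(df_m)) / (e(rss)/e(df_r))

* Root MSE = square root of the Residual MS
display sqrt(e(rss)/e(df_r))
```

With the values shown in the table, these should reproduce the reported R-squared (0.4394), F (77.21), and Root MSE (7.1327) up to rounding.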

2. Tests for Normality of Residuals

We use the predict command with the resid option to generate the residuals, and we name them r.

. predict r, resid

Shapiro-Wilk W test for normality

To check that the residuals are normally distributed, an important assumption for inference in regression, we use the Shapiro-Wilk W test for normal data.

. swilk r

                   Shapiro-Wilk W test for normal data

    Variable |    Obs       W          V         z     Prob>z
-------------+-------------------------------------------------
           r |    200    0.98714      1.919     1.499  0.06692

The p-value of 0.067 is above 0.05, so the test does not reject normality of the residuals at the 5% level.

The kdensity command with the normal option displays a kernel density graph of the residuals with a normal distribution superimposed.

. kdensity r, normal

The pnorm command produces a standardized normal probability (P-P) plot, another way to check whether the residuals from the regression are normally distributed.

. pnorm r

The qnorm command produces a normal quantile plot, yet another way to check whether the residuals are normally distributed.

. qnorm r

Summary of Tests for Normality of Residuals

swilk performs the Shapiro-Wilk W test for normality.
kdensity produces a kernel density plot with a normal distribution overlaid.
pnorm graphs a standardized normal probability (P-P) plot.
qnorm plots the quantiles of varname against the quantiles of a normal distribution.
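The four checks can be run together in a do-file. This is only a sketch: it assumes hs1 is in memory, and res_chk is a hypothetical variable name chosen so that it does not clash with r.

```stata
* Assumes hs1 is in memory; res_chk is a hypothetical name
regress write read female
predict double res_chk, resid

swilk res_chk                 // Shapiro-Wilk W test
display "p-value = " r(p)     // swilk leaves the p-value in r(p)

kdensity res_chk, normal      // kernel density with normal overlay
pnorm res_chk                 // P-P plot: sensitive to the middle of the distribution
qnorm res_chk                 // Q-Q plot: sensitive to the tails
```

Looking at both plots is useful because pnorm highlights departures from normality in the middle of the distribution, whereas qnorm highlights departures in the tails.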

3. Tests for Heteroscedasticity

One of the basic assumptions of ordinary least squares regression is homogeneity of variance of the residuals. There are graphical and non-graphical methods for detecting heteroscedasticity.

Cook-Weisberg test for heteroskedasticity

. hettest

Cook-Weisberg test for heteroskedasticity using fitted values of write
       Ho: Constant variance
           chi2(1)      =      5.79
           Prob > chi2  =    0.0161

The p-value of 0.0161 is below 0.05, so the null hypothesis of constant variance is rejected at the 5% level.

For a graphical check, we use the rvfplot command (residuals versus fitted values) with the yline(0) option to put a reference line at y=0.

. rvfplot, yline(0)

Summary of Tests for Heteroscedasticity

hettest performs the Cook and Weisberg test.
rvfplot graphs a residual-versus-fitted plot.
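In more recent Stata releases the same test is documented as a postestimation command, estat hettest; the older name hettest is kept for compatibility. A minimal do-file sketch, assuming hs1 is in memory:

```stata
* Assumes hs1 is in memory; refit the model so the tests apply to it
regress write read female

estat hettest          // Breusch-Pagan / Cook-Weisberg test using fitted values
estat hettest, rhs     // same test using the right-hand-side variables instead
rvfplot, yline(0)      // graphical check: look for a fan or funnel shape
```

The rhs variant is included only as an illustration of a second way to specify the test; the slides use the default based on fitted values.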

4. Tests for Multicollinearity

Multicollinearity is a concern in multiple regression not because it exists but because of its degree. When multicollinearity is severe, the coefficient estimates become unstable and their standard errors can be wildly inflated.

We can use the vif command after the regression to check for multicollinearity. vif stands for variance inflation factor.

. vif

    Variable |       VIF       1/VIF
-------------+----------------------
      female |      1.00    0.997182
        read |      1.00    0.997182
-------------+----------------------
    Mean VIF |      1.00

A variable whose VIF is greater than 10 may merit further investigation. Tolerance, defined as 1/VIF, is also used to check the degree of collinearity; a tolerance below 0.1 corresponds to a VIF above 10. Here both VIFs are 1.00, so collinearity between read and female is not a concern.

Summary of Tests for Multicollinearity

vif calculates the variance inflation factor for the independent variables in the linear model.
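The VIF of a predictor equals 1/(1 - R2), where R2 comes from regressing that predictor on the other predictors. The do-file sketch below reproduces the value for read by hand; it assumes hs1 is in memory, and main_vif is a hypothetical name for the stored results.

```stata
* Assumes hs1 is in memory
regress write read female
estimates store main_vif          // hypothetical name; keeps the main results

* Auxiliary regression of one predictor on the other predictor(s)
quietly regress read female
display "VIF(read)       = " 1/(1 - e(r2))
display "Tolerance(read) = " 1 - e(r2)

estimates restore main_vif        // make the main model active again
estat vif                         // current documented name for vif
```

Storing and restoring the estimates matters because the auxiliary regression would otherwise replace the main model as the active estimation result.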

5. Tests for Autocorrelation

. tsset id
        time variable:  id, 1 to 200

. dwstat

Durbin-Watson d-statistic(  3,   200) =  1.93992
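The Durbin-Watson statistic is the sum of squared differences between successive residuals divided by the sum of squared residuals. Values near 2 suggest little first-order autocorrelation, values toward 0 suggest positive autocorrelation, and values toward 4 suggest negative autocorrelation, so 1.94 gives little cause for concern. A minimal do-file sketch of the calculation, assuming hs1 is in memory and using id as the ordering variable as the slide does (dw_e, dw_num, dw_den and dw_top are hypothetical names):

```stata
* Assumes hs1 is in memory; dw_* names are hypothetical
tsset id                                     // declare id as the ordering variable
regress write read female
predict double dw_e, resid

generate double dw_num = (dw_e - L.dw_e)^2   // missing for the first observation
generate double dw_den = dw_e^2

quietly summarize dw_num
scalar dw_top = r(sum)
quietly summarize dw_den
display "Durbin-Watson d = " dw_top / r(sum)

estat dwatson                                // current documented name for dwstat
```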

6. Detecting Unusual and Influential Data

Outliers: In linear regression, an outlier is an observation with a large residual. In other words, it is an observation whose dependent-variable value is unusual given its values on the predictor variables. An outlier may indicate a sample peculiarity, a data entry error, or some other problem.

Leverage: An observation with an extreme value on a predictor variable is called a point with high leverage. Leverage is a measure of how far an independent variable deviates from its mean. High-leverage points can affect the estimates of the regression coefficients.

Influence: An observation is said to be influential if removing it substantially changes the coefficient estimates. Influence can be thought of as the product of leverage and outlierness.

Here we summarize the general rules of thumb we use for these measures to identify observations worthy of further investigation (where k is the number of predictors and n is the number of observations).

Measure         Value
leverage        > (2k+2)/n
abs(rstu)       > 2
Cook's D        > 4/n
abs(dfits)      > 2*sqrt(k/n)
abs(dfbeta)     > 2/sqrt(n)
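For the model in this tutorial (k = 2 predictors, read and female, and n = 200 observations), the cutoffs can be worked out with display. The scalar names k_pred and n_obs are hypothetical.

```stata
* Cutoffs for the model above: k = 2 predictors (read, female), n = 200
scalar k_pred = 2
scalar n_obs  = 200

display "leverage    > " (2*k_pred + 2)/n_obs
display "Cook's D    > " 4/n_obs
display "abs(dfits)  > " 2*sqrt(k_pred/n_obs)
display "abs(dfbeta) > " 2/sqrt(n_obs)
```

These evaluate to 0.03, 0.02, 0.2, and about 0.141 respectively.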

We use the predict command with the rstudent option to generate studentized residuals, and we name them r. Studentized residuals are a type of standardized residual that can be used to identify outliers. (If r still holds the raw residuals created in Section 2, drop that variable first, because predict will not overwrite an existing variable.)

. predict r, rstudent

. stem r

Stem-and-leaf plot for r (Studentized residuals)

r rounded to nearest multiple of .01
plot in units of .01

  -2** | 50,42
  -2** | 26,21
  -2** | 18
  -1** | 92,85,84,83
  -1** | 75,72,69,61,61,60
  -1** | 50,48,46,46,42
  -1** | 33,32,22,20,20,20
  -1** | 17,16,13,12,10,01
  -0** | 97,97,96,96,93,93,92,92,90,89,89,89,86,86,84,82,82,80,80
  -0** | 74,74,71,70,67
  -0** | 59,59,58,53,49,49,47,42,42,40
  -0** | 35,35,33,31,31,31,30,28,28,28,28,27,25,23,23,22
  -0** | 19,17,16,16,16,16,14,13,13,09,09,07,04,03,03,02
   0** | 00,02,02,04,04,04,04,07,09,11,14,16,16,19
   0** | 21,23,23,24,24,26,28,29,30,33,33,35,35
   0** | 40,44,44,51,51,54,54,54,54,56,56,57,57,57
   0** | 61,63,64,64,64,64,64,66,70,70,71,73,73,73,74,78
   0** | 88,88,89,93,94,94,97,98,99
   1** | 01,06,06,08,08,13,13,13,13,15,19
   1** | 23,29,32,36,36,37,37,39
   1** | 42,43,44,48,51,52,53,55
   1** | 60,68,73,73,75,77
   1** | 80,84
   2** | 16

. sort r
. list r in 1/10

              r
  1.  -2.503566
  2.  -2.421219
  3.  -2.255832
  4.  -2.210221
  5.  -2.178212
  6.  -1.916192
  7.  -1.848524
  8.  -1.843611
  9.  -1.831068
 10.  -1.750652

. list r in -10/l

              r
191.   1.551833
192.   1.602682
193.   1.677923
194.   1.726393
195.   1.730591
196.   1.749522
197.   1.774811
198.   1.798141
199.   1.840841
200.   2.160904

We should pay attention to studentized residuals that exceed +2 or -2, be more concerned about residuals that exceed +2.5 or -2.5, and be more concerned still about residuals that exceed +3 or -3.

. list r if r<-2 | r>2

              r
  1.  -2.503566
  2.  -2.421219
  3.  -2.255832
  4.  -2.210221
  5.  -2.178212
200.   2.160904

. list r if r<-2.5 | r>2.5

              r
  1.  -2.503566

To obtain leverage values, we use the predict command with the leverage option, and we name the new variable lev.

. predict lev, leverage

Cook's D and DFITS both combine information on the residual and leverage. The two measures are scaled differently but generally give similar answers.

. predict d, cooksd
. list female read d if d>4/_N

       female   read          d
 13.     male     50   .0234054
 39.     male     47   .0212312
123.   female     57   .0202435
142.     male     76   .0327483

Here _N is the number of observations, so 4/_N is the 4/n cutoff (0.02 for n = 200).

. predict dfit, dfits
. list dfit if abs(dfit)>2*sqrt(2/200)

The cutoff uses k = 2 predictors and n = 200 observations, following the 2*sqrt(k/n) rule of thumb.

The above measures are general measures of influence.
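The rules of thumb can also be applied in one pass. The do-file sketch below assumes hs1 is in memory and takes n and k from the results regress stores, rather than typing them in; every inf_* variable name is hypothetical.

```stata
* Assumes hs1 is in memory; all inf_* variable names are hypothetical
regress write read female
local n = e(N)          // number of observations used
local k = e(df_m)       // number of predictors

predict double inf_lev,   leverage
predict double inf_rstu,  rstudent
predict double inf_cook,  cooksd
predict double inf_dfits, dfits

* Flag observations that exceed any rule-of-thumb cutoff
* (!missing() guards against missing values, which Stata treats as very large)
generate byte inf_flag = (inf_lev > (2*`k' + 2)/`n' & !missing(inf_lev)) | ///
    (abs(inf_rstu) > 2 & !missing(inf_rstu)) | ///
    (inf_cook > 4/`n' & !missing(inf_cook)) | ///
    (abs(inf_dfits) > 2*sqrt(`k'/`n') & !missing(inf_dfits))

list id inf_lev inf_rstu inf_cook inf_dfits if inf_flag
```

A flagged observation is a candidate for closer inspection, not automatically one to delete.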

We can also consider more specific measures of influence that assess how each coefficient is changed by deleting the observation. This measure is called DFBETA and is created for each of the predictors. It is more computationally intensive than summary statistics such as Cook's D.

In Stata, the dfbeta command produces the DFBETAs for each of the predictors.

. dfbeta
                       DFread:  DFbeta(read)
                     DFfemale:  DFbeta(female)

. list DFread DFfemale in 1/5

          DFread    DFfemale
  1.    .0492348    .1971976
  2.   -.0887463   -.1617497
  3.    .0915453    .1802994
  4.    .0434659    .1740918
  5.    .0717626   -.1374498

There are also several graphs that can be used to search for unusual and influential observations. The avplot command graphs an added-variable plot.

The avplot command works not only for variables in the model but also for variables that are not in the model, which is why it is called an added-variable plot. We can do an avplot on the variable grade.

. avplot grade

[Figure: Added-Variable plot]

rvpplot is another convenience command; it plots the residuals against a specified predictor and is also used after regress or anova.

. rvpplot read

lvr2plot stands for leverage-versus-residual-squared plot.

. lvr2plot

Summary of Detecting Unusual and Influential Data

predict creates predicted values, residuals, and measures of influence.
dfbeta calculates DFBETAs for all the independent variables.
avplot graphs an added-variable plot.
lvr2plot graphs a leverage-versus-squared-residual plot.
rvpplot graphs a residual-versus-predictor plot.
rvfplot graphs a residual-versus-fitted plot.

7. Tests for Model Specification

A model specification error can occur when one or more relevant variables are omitted from the model or one or more irrelevant variables are included in the model.

There are several methods to detect specification errors. The linktest command performs a model specification link test for single-equation models.

. linktest

      Source |       SS       df       MS              Number of obs =     200
-------------+------------------------------           F(  2,   197) =   79.86
       Model |  8005.11739     2  4002.55869           Prob > F      =  0.0000
    Residual |  9873.75761   197   50.120597           R-squared     =  0.4477
-------------+------------------------------           Adj R-squared =  0.4421
       Total |   17878.875   199   89.843593           Root MSE      =  7.0796

------------------------------------------------------------------------------
       write |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        _hat |   2.807497   1.052071     2.67   0.008     .7327302    4.882264
      _hatsq |  -.0170281   .0098827    -1.72   0.086    -.0365176    .0024615
       _cons |  -47.29516   27.77544    -1.70   0.090    -102.0705    7.480201
------------------------------------------------------------------------------
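The link test regresses the dependent variable on the linear prediction (_hat) and its square (_hatsq). If the model is correctly specified, the squared term should add no explanatory power; in the output above its p-value is 0.086, which is not significant at the 5% level. The do-file sketch below builds the same regression by hand. It assumes hs1 is in memory, and lt_hat, lt_hatsq and main_link are hypothetical names.

```stata
* Assumes hs1 is in memory; lt_* and main_link are hypothetical names
regress write read female
estimates store main_link

predict double lt_hat, xb              // linear prediction
generate double lt_hatsq = lt_hat^2    // its square

regress write lt_hat lt_hatsq          // same regression that linktest reports
test lt_hatsq                          // significant => possible misspecification

estimates restore main_link            // make the main model active again
```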

The ovtest command performs a regression specification error test (RESET) for omitted variables.

. ovtest

Ramsey RESET test using powers of the fitted values of write
       Ho:  model has no omitted variables
                 F(3, 194) =      1.95
                  Prob > F =      0.1233

The p-value of 0.1233 is above 0.05, so the null hypothesis of no omitted variables is not rejected at the 5% level.
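The F(3, 194) in the RESET output reflects three added terms: the second, third, and fourth powers of the fitted values. The do-file sketch below is one way to build that test by hand. It assumes hs1 is in memory, standardizes the fitted values first for numerical stability (a choice made here, not necessarily how Stata does it internally), and uses hypothetical rs_* and main_reset names. The joint F test should agree with ovtest up to numerical precision.

```stata
* Assumes hs1 is in memory; rs_* and main_reset are hypothetical names
regress write read female
estat ovtest                            // current documented name for ovtest
estimates store main_reset

predict double rs_hat, xb
quietly summarize rs_hat
generate double rs_z  = (rs_hat - r(mean)) / r(sd)   // standardized fitted values
generate double rs_z2 = rs_z^2
generate double rs_z3 = rs_z^3
generate double rs_z4 = rs_z^4

regress write read female rs_z2 rs_z3 rs_z4
test rs_z2 rs_z3 rs_z4                  // joint F test with 3 numerator df

estimates restore main_reset            // make the main model active again
```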

Summary of Tests for Model Specification

linktest performs a link test for model specification.
ovtest performs a regression specification error test (RESET) for omitted variables.
