August 29, 2018 Introduction and Overview
Why are we here? Haavelmo (1944): to become master of the happenings of real life. Theoretical models are necessary tools in our attempts to understand and explain events in real life.... Whatever be the explanations we prefer, it is not to be forgotten that they are all our own artificial inventions in a search for an understanding of real life; they are not hidden truths to be discovered.
The point of empirical work is to tell stories that matter to us: understanding concrete developments in the world around us, providing useful guides for action, persuading others. Not technique! Every economic story we might want to tell has some reflection in data we can observe.
Our goal is to answer questions posed to us by the world: to advance a political project, to design a policy, to make money. Judgements of relevance or importance cannot be made from within economics, but only from some outside vantage point.
The map is not the territory
Economic data is the product of some social process. Concrete social activity comes, through a series of specific activities, to be represented in quantitative form. Heterogeneous human activity and its products are classified into discrete categories. Market transactions are recorded, then grouped into accounts. Private accounts are reported to public authorities in standardized form. Surveys are conducted, and results compiled. Quantities like output or profit or the price level are not preexisting objects in the world, but come into being through particular accounting practices.
Products of all this activity come to us as data, which we select from, assemble, and transform through various statistical techniques. One important set of techniques summarizes the variation in the data in terms of a limited set of parameters. Standard statistical distributions are a useful tool for this. Always keep in mind that distributions are tools for describing data, not objects existing independently in the world. We use these summaries to describe historical developments, tell stories, and make and evaluate causal claims. This activity then gets generalized as economic theory.
Econometrics
Econometrics is one set of tools for organizing data and bringing it into relation with economic theory. It is just one set of tools among others!
Econometrics starts from premises
The data we see is a reflection of a genuine economic process. This data is drawn from an underlying population. Variables in that population have their own well-defined statistical distributions. The variables are causally linked in such a way that we can regard one of them (the dependent variable) as a function of a set of others (the independent variables) plus a random error term. We can draw repeated samples from the same population, in the ideal case with identical values of the independent variables. So there is a data generating process with stable properties which we can describe statistically.
What does a regression do?
It seeks to establish the effect of a ceteris paribus change in an independent variable on a dependent variable, by looking at the variation shared by those variables that is not shared with other (independent) variables. It expresses this as an equation in which the dependent variable is a (linear) function of the independent variables plus an error term. It assesses the relationship between the dependent and independent variables by comparing it to a hypothetical null of no relationship, on the basis of an assumed distribution of the error term. Strictly speaking, in econometrics we don't test our own claims: we test the null and treat rejection of the null as evidence for our preferred hypothesis.
Often the focus is on the qualitative question of whether the observed distribution is far enough from the null hypothesis. Sometimes the main finding is the estimated parameter value; often it is simply that a variable was significant or not significant. Only qualitative results are possible in the common case where the independent variable used is supposed to be a proxy or instrument for the true variable of interest, or where the independent variable is an indicator rather than numerical. A regression treats the link between independent and dependent variable as a black box: we are only observing variation at the beginning and end of the causal chain.
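The idea that a regression coefficient reflects only "the variation shared by those variables that is not shared with other variables" can be checked numerically. The sketch below is an illustration, not part of the original slides: the data generating process, the variable names (x1, x2, y) and the coefficient values are all made up. It compares the coefficient on x1 from a full regression with the coefficient obtained by first removing from both y and x1 everything that x2 can account for, and then regressing what is left of y on what is left of x1.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Hypothetical data generating process (an assumption for illustration):
# x1 and x2 are correlated, and y depends on both plus a random error.
x2 = rng.normal(size=n)
x1 = 0.6 * x2 + rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 1.5 * x2 + rng.normal(size=n)


def ols(X, y):
    """Return least-squares coefficients for y = X b + e."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


const = np.ones(n)

# Full regression: y on a constant, x1 and x2.
b_full = ols(np.column_stack([const, x1, x2]), y)

# Remove from y and from x1 the part explained by x2 (and the constant).
Z = np.column_stack([const, x2])
y_resid = y - Z @ ols(Z, y)
x1_resid = x1 - Z @ ols(Z, x1)

# Regress residual on residual. No constant is needed because both
# residual series have mean zero by construction.
b_partial = ols(x1_resid[:, None], y_resid)

print("coefficient on x1, full regression:    ", b_full[1])
print("coefficient on x1, residual regression:", b_partial[0])
```

The two printed coefficients should agree up to floating-point error. This is the sense in which the other independent variables are "held constant": the estimate uses only the variation in x1 that x2 does not share.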
People doing regressions worry about: specification; identification; endogeneity; the distribution of the error term (heteroskedasticity, kurtosis, independence from the independent variables and from each other); external validity.
Econometrics
Sometimes a regression is the best approach. We are confident about the relationship between the data we observe and the underlying relationship we are interested in. There's a well-defined population that our sample is drawn from. The question we are interested in is causal, and is about the effect of a cause rather than the cause of an effect. Accepting or rejecting the null hypothesis of no relationship answers an interesting question (negative answers, accepting the null, are more often interesting than positive ones), and/or we can interpret the coefficients quantitatively. The ceteris paribus ("all else equal") assumption makes sense. There is a reasonably direct link between the variables.
For example: Synthetic control studies of minimum wage laws ask what happened to employment in a jurisdiction that raised the minimum wage compared with employment in comparable jurisdictions, with similar employment trends, that did not raise the minimum. We are comparing one place and time to a well-defined population of other places. We are interested in the effect of the minimum wage change, not the cause of the employment change. The qualitative question "were there fewer jobs?" is an interesting one. It makes sense to think of the minimum wage increase as a distinct event that could be different without anything else changing.
Herndon (2017) compares default losses on low-documentation mortgages ("liar's loans") with losses on otherwise similar conventional loans. There is a well-defined population of mortgage loans issued in the given period. Knowing whether default rates were significantly higher in the absence of documentation is a little interesting, and the study also tells us how much additional default loss was incurred on low-documentation loans. Comparing the experience of a low-documentation loan to the counterfactual of a conventional loan issued to an apparently similar borrower in the period makes sense. There is no ambiguity about the mechanism linking absent documentation to higher default rates.
The limits of econometrics
Often a regression is not sensible, for one or more of the following reasons. We can't directly observe what we are interested in and have to depend on some more or less indirect proxy. The question of interest is not causal: in many cases we are interested in the joint distribution of the variables and not a cause-and-effect relationship. E.g. the question "how much do women earn relative to men?" is different from the question "how does being a woman cause your earnings to change, all else equal?" We already know the effect of the cause (typically because of accounting). There is not a clear or sensible counterfactual.
There is no meaningful sense in which the case of interest is drawn from a larger population. "Was the Great Depression drawn from the distribution of 19th century downturns?" is a sensible question to an econometrician. But probably not to anyone else! Our goal is to explain the observed outcome ("cause of effect") rather than isolate the influence of one variable ("effect of cause"). Important case: our variables are linked by accounting identities, so we already know the effects of causes, and want to know what part of the variation in the dependent variable is attributed to each. Rejecting the null doesn't tell us anything interesting, and it's not possible to interpret coefficients quantitatively. The mechanism linking variables is unclear or ambiguous, or can be studied directly.
Alternatives to econometrics
What else is there?
Descriptive statistics: characterize the distribution of data in the sample, with no reliance on an underlying population or data generating process. Examples: principal component analysis, decompositions.
Accounting.
Focus on mechanism.
Qualitative approaches: narrative, surveys, interviews, case studies.
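A decomposition based on an accounting identity can be sketched in a few lines. The example below is illustrative only: the identity (wage bill = employment x average wage) and all figures are made up, and are not taken from the slides. Because the identity holds exactly, the log change in the wage bill splits exactly into a part due to employment and a part due to the average wage, with no population, error term or null hypothesis involved.

```python
import math

# Hypothetical figures, invented for illustration only.
employment = {"start": 100.0, "end": 104.0}  # millions of jobs
avg_wage = {"start": 50.0, "end": 51.5}      # thousands of dollars per job

# Accounting identity: wage bill = employment * average wage.
wage_bill = {k: employment[k] * avg_wage[k] for k in ("start", "end")}


def log_change(series):
    """Log change between the start and end values of a series."""
    return math.log(series["end"] / series["start"])


d_bill = log_change(wage_bill)
d_emp = log_change(employment)
d_wage = log_change(avg_wage)

# The log changes add up exactly, so shares of the total are well defined.
share_emp = d_emp / d_bill
share_wage = d_wage / d_bill

print(f"log change in wage bill: {d_bill:.4f}")
print(f"  due to employment:     {d_emp:.4f} ({share_emp:.0%})")
print(f"  due to average wage:   {d_wage:.4f} ({share_wage:.0%})")
```

Using log changes is a design choice: it makes the product identity additive, so there is no interaction term left over to allocate.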
Summers, The Scientific Illusion in Empirical Macroeconomics (1991): Formal empirical work which... tries to take models seriously econometrically has had almost no influence... The only empirical research that has influenced thinking about substantive questions has been... attempts to gauge the strength of associations rather than to estimate structural parameters, verbal characterizations of how causal relations might operate rather than explicit mathematical models, and the skillful use of carefully chosen natural experiments rather than sophisticated statistical technique to achieve identification.... Formal econometric work has had little impact on the growth of economic knowledge.
Successful pieces of pragmatic empirical work have three elements in common... First and foremost, in each case, the bottom line was a stylized fact or collection of stylized facts characterizing an aspect of how the world worked rather than parameter estimates or formal tests of a point hypothesis.... Second, pragmatic pieces of empirical work produce regularities of a kind that theory can seek to explain.... Third, successful pieces of pragmatic empirical work have no scientific pretense... No single test is held out as decisive. Many different types of data are examined.
Empirical work in the real world
In practice, successful empirical work in most contexts depends on: a compelling story; a clear relationship between observable variables that is consistent with our story and inconsistent with alternatives (scatterplots do a lot of work in real-world debates); attention to the mechanism linking cause and effect; accounting, that is, dividing the outcome you are interested in into meaningful components; explaining specific facts or events that people are aware of and care about; natural experiments.
At the same time, we do need statistical training. Apparent patterns may be spurious, or may have an alternative explanation. Time series data is particularly susceptible to spurious patterns. Data may need to be transformed appropriately before it is meaningful. Regressions are a good way to answer some questions. Other people are doing econometrics even if you aren't. Joan Robinson: the point of studying economics is not to be fooled by economists.
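The claim about spurious patterns in time series can be illustrated with a small simulation. This is a sketch under stated assumptions, not something from the slides: the series are simulated, and the helper names (t_stat_slope, rejection_rate, white_noise, random_walk), the series length and the number of simulations are all arbitrary choices. It regresses one series on another, completely unrelated series many times, and counts how often the slope looks "significant" at the conventional 5% threshold.

```python
import numpy as np


def t_stat_slope(x, y):
    """t-statistic on the slope from an OLS regression of y on a constant and x."""
    n = len(x)
    xd = x - x.mean()
    yd = y - y.mean()
    slope = (xd @ yd) / (xd @ xd)
    resid = yd - slope * xd
    s2 = (resid @ resid) / (n - 2)   # residual variance
    se = np.sqrt(s2 / (xd @ xd))     # conventional standard error of the slope
    return slope / se


def rejection_rate(make_series, n_sims=500, length=200, seed=0):
    """Share of simulations in which two independent series look 'related'."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_sims):
        x = make_series(rng, length)
        y = make_series(rng, length)  # drawn independently of x
        if abs(t_stat_slope(x, y)) > 1.96:
            hits += 1
    return hits / n_sims


def white_noise(rng, length):
    return rng.normal(size=length)


def random_walk(rng, length):
    # Cumulated shocks: a simple stand-in for a trending time series.
    return np.cumsum(rng.normal(size=length))


rate_noise = rejection_rate(white_noise)
rate_walks = rejection_rate(random_walk)

print(f"independent white-noise series 'significant': {rate_noise:.0%}")
print(f"independent random walks 'significant':       {rate_walks:.0%}")
```

For independent white noise the rejection rate should be close to the nominal 5%. For independent random walks it is typically far higher, even though there is no relationship at all between the two series.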
Good empirical work
Asks a clear, interesting, well-posed question. The question comes from the world outside of economics: there is some practical reason to know the answer, whether political, business, or a story we want to tell to the wider world. The question is answerable. Many interesting-sounding questions (especially causal questions) do not have answers! "How much" questions are more interesting than "whether" questions.
Looks for things that will be different about the world in the case where one answer to our question is right, versus if another answer is. Even when we are doing exploratory analysis we should have some set of alternatives in mind.
Good empirical work
Spends a lot of time with the data.
Internal questions: how big is it, what kind of spread does it have, what is it indexed over, and how many observations are there by index value? What are the maximum and minimum values? How many values are missing?
External questions: how was it collected, and by whom? How are the variables defined? How are they measured? Are some values imputed, topcoded, etc.?
Cleaning data: there will be missing values, outliers that seem spurious (but be careful!), fields we don't want, numbers encoded as text, etc.
Transforming data: we may need to convert units, take logs, or normalize values by something. It is often not obvious what transformation is appropriate.
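A first pass at the "internal questions" and at basic cleaning might look like the sketch below. Everything in it is hypothetical: the tiny dataset, the column names and the assumed topcode value of 999,999 are invented for illustration, and pandas is only one of several reasonable tools for the job.

```python
import numpy as np
import pandas as pd

# Hypothetical survey extract, invented for illustration only.
raw = pd.DataFrame({
    "year": [2015, 2015, 2016, 2016, 2017, 2017],
    "state": ["NY", "NJ", "NY", "NJ", "NY", "NJ"],
    # Numbers encoded as text, one missing value, one suspiciously round value.
    "income": ["52,000", "48,500", None, "51,000", "999,999", "53,250"],
    "hours": [40.0, 38.0, 41.0, np.nan, 40.0, 39.0],
})

# Internal questions: how big, indexed over what, how many values missing?
print(raw.shape)
print(raw.groupby("year").size())   # observations by index value
print(raw.isna().sum())             # missing values per column

# Cleaning: convert numbers stored as text into actual numbers.
clean = raw.copy()
clean["income"] = pd.to_numeric(clean["income"].str.replace(",", "", regex=False))
print(clean[["income", "hours"]].describe())  # spread, minimum, maximum

# Flag a possible topcode rather than silently dropping the row.
TOPCODE = 999_999  # assumed value; check the data documentation
clean["income_topcoded"] = clean["income"] == TOPCODE

# Transforming: take logs, leaving flagged and missing values as missing.
clean["log_income"] = np.log(clean["income"].where(~clean["income_topcoded"]))
print(clean)
```

Flagging the suspicious value instead of deleting it keeps the decision visible: whether 999,999 is a topcode or a real observation is an "external question" that only the data documentation can answer.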
Good empirical work
Makes use of broader knowledge of the world: thinks about causal links, does not treat the relationship as a black box, and makes use of accounting relationships.
Develops a clear, substantive claim about the world: describes concrete historical developments or, if it posits a general relationship, is very clear about the domain over which it applies.
Presents results in a small number of clear, self-explanatory tables or charts.