Case studies, causal mechanisms, and selecting cases. Version 5

Gary Goertz
Kroc Institute for International Peace Studies
University of Notre Dame
Notre Dame, IN 46556-5677
ggoertz@nd.edu
August 18, 2012

Introduction

If one is thinking about doing case studies, say 1-10 cases, the problem arises: which cases to choose? Quantitative scholars typically have well-developed procedures for case selection: they randomly sample for surveys; they use populations, e.g., all advanced industrial societies; they analyze datasets collected by others, e.g., the COW datasets. In contrast, the case study researcher is going to commit significant resources to each case, so she wants to choose the right 1-10 cases. The problem is that there may be dozens, if not thousands, of cases to choose from.

In addition to the embarrassment of riches of potential cases, she also faces a wide variety of rationales, techniques, and advice for choosing her few cases. For example, should she choose (1) at random, (2) representative cases, (3) cases with variation on X or Y, (4) on-line or off-line cases, (5) extreme cases, and so forth? To add to her confusion, there are many reasons why she might want to do a case study in the first place. Is she (1) testing a theory, (2) generating hypotheses, (3) trying to explain an individual case of interest, etc.?

Gerring's central chapters 3 and 5 (2007) illustrate the variety of reasons for doing a case study (chapter 3, "What is a case study good for?") and of techniques for choosing cases (chapter 5). If there are at least four research goals (table 3.1) and nine ways to choose cases (table 5.1), that gives 36 combinations of goals and techniques. While Gerring does not go through this exercise, Van Evera (1997) does exactly this. His table 1 has five goals for doing case studies (columns) and 11 case selection criteria (rows), which produces a table of 55 cells.

This paper takes exactly the opposite approach. I argue that there are two dominant logics for case study research. The first is what I call the statistical approach.
The literature on case studies often works from the statistical, cross-case paradigm. Already in Campbell's classic article (1975) one reads of the "degrees of freedom" problem in case studies. Much multimethod work implicitly or explicitly takes the statistical approach to case studies (e.g., Gerring 2007; Lieberman 2005). However, there exists no short, clear exposition of that logic. In the next section I give what I think is the dominant logic of case study selection from the statistical perspective.

The main goal of the paper is to provide what I call the qualitative logic of case studies. I think a good place to start is by looking at case studies in their own right. Once one stops thinking about case studies in the light of statistical procedures, many things make sense and one can develop an independent logic of case study research. I suggest that this qualitative logic explains what scholars naturally do when selecting case studies. For example, I shall suggest that it makes sense of the debates around selecting on the dependent variable. Case study practice viewed through a statistical lens is described as selecting on the dependent variable; viewed via the qualitative logic it has a very different description.

The technique for illuminating these two logics is to ask which cases a researcher would choose when faced with budget constraints. If the person can only do one case study, what kind of case would be chosen? What if the budget allocation is 2, 4, or 10 cases? I think that this device forces one to prioritize and make clear what is most important about the case study. I argue that the statistical logic makes quite different choices when faced with these budget constraints than does the qualitative logic.

The statistical approach to case studies

The technique for illuminating the logic of case studies starts with a scenario where the researcher has a budget constraint. She only has resources to do one case study: what kind of case should she choose and what is the rationale for that choice? I suggest that given this constraint the researcher following the statistical logic will choose a representative case for intensive analysis.

So what makes this the obvious choice when faced with such a tight budget constraint? First, if one examines the rationales scholars give for why they chose a given case, representativeness is one of the two most common (the second is diversity, which I discuss below). Gerring is more explicit about the importance of representativeness, since he defines the case study in a way that makes it central: "the case study approach to research is most usefully defined as an intensive study of a single unit or a small number of units (the cases), for the purpose of understanding a larger class of similar units (a population of cases)" (Gerring 2007, 37). The goal is understanding a population. If that is the goal then one naturally would choose a case which is representative of that population.
Gerring is quite explicit that the case study stands in for the population: "In all other circumstances [than deviant case and influential case] cases must be representative of the population of interest in whatever ways might be relevant to the proposition in question. ... there is no dispensing with the question. Case studies (with the two exceptions already noted) rest upon an assumed synecdoche: the case should stand for the population. If this is not true, or if there is reason to doubt this assumption, then the utility of the case study is brought severely into question." (Gerring 2007, 146, 147)

In his core table 5.3, where he lists the techniques of case selection, each is evaluated on its degree of representativeness. As a rule, those techniques which are not representative can only be used in conjunction with other case studies. Hence they would not fit our budget constraint of one case.

If one examines methodological works on multimethod research and/or case studies that rely on a statistical logic, they almost always make representativeness a core issue:

"In order for a focused case study to provide insight into a broader phenomenon it must be representative of a broader set of cases... By construction, the typical case is also a representative case." (Gerring 2007, 91; emphasis is mine)

"From a purely statistical sampling perspective, focusing attention on cases that are not representative of the population as a whole is a huge waste of resources. While such cases may be useful for exploratory analysis and/or theory construction, the amount of information they can provide about population-level average causal effects is, by definition, limited." (Herron and Quinn, n.d., 13)

"Once again, though, we run into a problem of representativeness. If one is selecting a few cases from a larger set, why this one and not another? Why shouldn't the reader be suspicious about selection of good cases if no explanation is given for the choice? If an explanation is given and it amounts to convenience sampling, don't we still need to worry about representativeness?" (Fearon and Laitin 2008, 762-63)

These methodological works make explicit a very common rationale. If one examines the justifications for case selection, they very often rely on the argument that the case is a representative one.

It turns out that operationalizing "representative" can be quite problematic. Kruskal and Mosteller (two of the most famous statisticians of their day), in a very readable series of articles (1979a, 1979b, 1979c, 1980), give a wide-ranging analysis of what "representative" might mean and the meanings that have been given to it. However, if we limit ourselves to two variables, independent and dependent (I treat control or confounding variables later in the paper), and if we assume that X and Y are continuous variables, then I think there is one obvious interpretation of "representative."
Again, Gerring provides a clear suggestion: the mean of X and the mean of Y. The first section of his chapter 5 discusses exactly the mean value of X. On the first page he states: "A demonstration of the fact that random sampling is likely to produce a representative sample is shown in figure 5.1" (Gerring 2007, 86). Gerring is just making explicit a natural interpretation of "representative": the mean or average student is taken as representative of the population of students. If the distribution of the data is unimodal (and ideally symmetric) then the mean can reasonably be taken as representative. Of course, this view of "representative" becomes problematic if the data are skewed, bimodal, etc. [1] But at the same time it is a natural move from a statistical point of view to assume normality.

[1] Taken from this perspective one needs to ask what kind of case X̄ or Ȳ is likely to be in terms of the variables and concepts X and Y. By definition, mean or average cases are in the middle of the conceptual continuum. In Goertz's (2005) terms they lie in the gray zone. This means that they are often neither good cases of X nor good cases of not-X. For example, take the polity dataset, used by Gerring in chapter 5. The polity democracy scale runs from -10 to 10. The average value, depending on years and countries, is often around 2-3. This is in the middle of the anocracy zone. These are countries that, depending on one's view, are either not democratic at all or are hybrid cases. So if the theory or causal mechanism involves democracy, taking the mean might be way off base. The polity example illustrates a second potential problem with using the mean. These data are very bimodal in distribution (see Goertz 2008 for a histogram and discussion): the vast majority of cases are either clearly democratic or clearly authoritarian. So in this case the mean is not really representative of the population at all. The hidden assumption behind the idea of the mean being typical or representative is that the data are relatively symmetric (i.e., not heavily skewed) and unimodal. If these hold then the mean is not a bad choice, but much social science data do not fit these requirements.

Figure 1: Case study selection: continuous X and Y. [Scatterplot of Y against X, both axes running from 0.0 to 1.0, with the point (X̄, Ȳ) marked.]

In short, I propose that the statistical logic and a budget constraint of one case lead to the basic case selection rule:

One case budget constraint: choose a case of (X̄, Ȳ), i.e., a case at the mean of X and the mean of Y.

We now increase the budget to two cases and ask the same question: if only two cases can be chosen, which kind of cases and why? One way to think of this is that we have already chosen a case when the budget constraint was one case, so that is a good candidate for one of the two cases. What about the other one?

It is useful when asking these questions to have a basic bivariate scatterplot in mind, i.e., figure 1, as well as a corresponding 2×2 table, i.e., table 1. One can then locate various case selection options in the figure or the table. The 2×2 table provides a convenient notation whereby (1,0) means choose a case where X = 1 and Y = 0.
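The one-case rule can be illustrated with a minimal sketch. It assumes, as my own operationalization rather than anything in Gerring, that "a case at the mean" means the observed case with the smallest standardized distance to the means of X and Y. The case names, the data, and the function name are all hypothetical.

```python
# Illustrative sketch only: case names, data, and the function name are hypothetical.
from math import hypot
from statistics import mean, pstdev


def most_representative_case(cases):
    """Return the name of the case closest to (mean of X, mean of Y).

    `cases` maps a case name to an (x, y) pair. Distances use standardized
    values so that neither variable dominates because of its scale.
    """
    xs = [x for x, _ in cases.values()]
    ys = [y for _, y in cases.values()]
    mx, my = mean(xs), mean(ys)
    # Guard against a zero standard deviation (all values identical).
    sx = pstdev(xs) or 1.0
    sy = pstdev(ys) or 1.0
    return min(
        cases,
        key=lambda name: hypot((cases[name][0] - mx) / sx,
                               (cases[name][1] - my) / sy),
    )


cases = {
    "A": (0.1, 0.2),
    "B": (0.5, 0.45),
    "C": (0.9, 0.8),
    "D": (0.4, 0.9),
    "E": (0.6, 0.15),
}
print(most_representative_case(cases))
```

With these made-up values both means are 0.5, so the sketch selects case B. As the footnote on the polity data warns, the same procedure would pick an unrepresentative gray-zone case if the data were bimodal.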

Table 1: X-Y configurations

         X = 0    X = 1
Y = 1    (0,1)    (1,1)
Y = 0    (0,0)    (1,0)

In figure 1 I have indicated the (X̄, Ȳ) choice. By definition this case is on the regression line. So in addition to being the obvious candidate for a single case study on grounds of representativeness, it is also an on-line case. Once one has described this case in that way, the natural choice for a second case study would be an off-line case. This fits naturally with a nested or multimethod approach. The two main categories are on-line and off-line, and so one does a case study of each.

With dichotomous variables things look a little different, as set out in table 1. Here the on-line cases lie in the cells (0,0) and (1,1). We assume a positive relationship between X and Y, so that X is a positive cause of Y, i.e., we expect the occurrence of X to be associated with the occurrence of Y. The off-line cases are then to be found in the cells (0,1) and (1,0).

However, we have a budget constraint of two cases. Do we have reason to prefer a case study from a particular on-line or off-line cell? Should we choose (0,0) over (1,1) or vice versa for the on-line choice? I propose that the statistical approach has no clear preference in these decisions. I think this comes out most clearly in the continuous version given in figure 1. Do we have a strong preference for a case in the upper right (i.e., (1,1)) or the lower left (i.e., (0,0))? From the regression point of view they are both equally on-line. Similarly, do we prefer an off-line case above the regression line or below it? Again there is no obvious reason to prefer one over the other. So while one still needs a procedure for deciding upon the on-line and off-line cases, in the 2×2 table we can lump the two on-line cells together, and likewise the two off-line ones. To be practical one needs additional decision rules to end up with two cases (I discuss this more below). However, for the purposes of this paper the key rule with a two-case budget constraint is:

Two case budget constraint: choose one on-line case and one off-line case.

My analysis also gives a dichotomous version of the one-case budget constraint, since we can reframe that rule as "choose an on-line case" when the variables are dichotomous.
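For continuous variables, the two-case rule can be sketched as follows. The sketch assumes an ordinary least squares line and, as my own tie-breaking choices, takes the case with the smallest absolute residual as the on-line case and the case with the largest absolute residual as the off-line case. The text itself states no preference among on-line cases or among off-line cases, so these choices, the data, and the function names are hypothetical.

```python
# Illustrative sketch only: case names, data, and function names are hypothetical.
from statistics import mean


def fit_line(points):
    """Ordinary least squares fit of y on x; returns (slope, intercept)."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    slope = sxy / sxx
    return slope, my - slope * mx


def on_line_and_off_line(cases):
    """Return (on_line_case, off_line_case) based on absolute residuals."""
    slope, intercept = fit_line(list(cases.values()))
    residuals = {
        name: y - (intercept + slope * x) for name, (x, y) in cases.items()
    }
    # Assumption: smallest |residual| is "on-line", largest is "off-line".
    on_line = min(residuals, key=lambda name: abs(residuals[name]))
    off_line = max(residuals, key=lambda name: abs(residuals[name]))
    return on_line, off_line


cases = {
    "A": (0.0, 0.1),
    "B": (0.25, 0.3),
    "C": (0.5, 0.56),
    "D": (0.75, 0.7),
    "E": (1.0, 0.9),
    "F": (0.5, 0.95),
}
print(on_line_and_off_line(cases))
```

With these made-up values the fitted slope is 0.8, case C sits closest to the line, and case F lies farthest above it.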

We now increase the budget to four cases and ask the same question: if only four cases can be chosen, which kind of cases and why? If one looks at the 2×2 table in table 1 the answer seems obvious: choose one case study from each cell. From the point of view of statistical measures of association each cell has its own role to play. For example, all 2×2 measures of association require information from all four cells.

I mentioned that probably the most important criterion of case selection from the statistical perspective is representativeness. The other most popular justification is almost certainly diversity or variation on X and Y. Once one has a budget of four or more cases one can begin to implement the variation or diversity notions: "If all variables are deemed relevant to the analysis, the selection of diverse cases mandates the selection of one case drawn within each cell" (Gerring 2007, 98).

Following the publication of King, Keohane, and Verba, qualitative researchers became extremely conscious of the issue of selecting on the dependent variable. So for many, variation on Y became a critical factor in choosing case studies. Hence, one might choose at least one case of Y = 0 and one case of Y = 1. For example, Kupchan does about 20 case studies in a study devoted to the causes of stable peace. While this is a large budget in the context of this paper, his justification for case selection is that the cases are diverse: "The successful and failed instances of a stable peace examined in this book thus represent a diverse subset of a broader universe of cases" (Kupchan 2010, 9). As a qualitative researcher he is keenly aware of the charge of selecting on the dependent variable, hence he emphasizes variation on Y.

At the same time, the counterfactual, potential outcomes, Neyman-Rubin approach to causation focuses on the contrast between X = 1 and X = 0. Ideally, one assesses the causal influence by counterfactually changing X from present to absent. This naturally leads to a focus on variation on X: comparing an X = 0 case with an X = 1 case. Putting these together we get the case selection rule when the budget constraint is four cases:

Four case budget constraint: choose one case in each cell of a 2×2 table.

This idea can naturally be extended to the continuous analogue of the 2×2 table in figure 1. Choosing one case from each of the two on-line cells means choosing an extreme case on the line in the upper right and another extreme case on the line in the lower left. For the off-line choices, one chooses one case above the line and one case below it.
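A minimal sketch of the four-case rule for dichotomous variables follows. The rule says only "one case in each cell," so the within-cell choice here (the alphabetically first case) is a placeholder assumption, as are the case names, the data, and the function name.

```python
# Illustrative sketch only: case names, data, and the function name are hypothetical.

def one_case_per_cell(cases):
    """Pick one case from each cell of the 2x2 table.

    `cases` maps a case name to an (x, y) pair with x, y in {0, 1}.
    Returns a dict keyed by cell; an empty cell maps to None.
    Within a cell the alphabetically first case is taken, which is only
    a placeholder for a substantive decision rule.
    """
    selection = {(x, y): None for x in (0, 1) for y in (0, 1)}
    for name in sorted(cases):
        cell = cases[name]
        if selection[cell] is None:
            selection[cell] = name
    return selection


cases = {
    "A": (1, 1),
    "B": (0, 0),
    "C": (1, 0),
    "D": (1, 1),
    "E": (0, 1),
    "F": (0, 0),
}
for cell, name in sorted(one_case_per_cell(cases).items()):
    print(cell, name)
```

An empty cell shows up as None, which signals that the diversity criterion cannot be fully met with the available cases.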

Control and matching variables: most similar systems

The suggestions above regarding case selection are only the first stage in getting to a final choice of 1-5 cases. One might decide to do a case study from the (1,1) cell, but there might be dozens or hundreds of cases in that cell. As such one needs additional rules or guidelines to choose among these cases. More generally, if we move from the bivariate setup, e.g., the 2×2 table, to the inclusion of a second independent variable, Z, does that influence our decisions about which cases would be best for a case study? Can Z be useful in deciding among cases in a given cell of the bivariate 2×2 table?

Once the question of control variables, aka Z, is raised, one's thoughts naturally turn to matching techniques. Matching fits quite naturally with a most similar systems logic. Gerring treats them as basically the same thing: matching is a more rigorous and well-defined specification of the most-similar idea that qualitative case study researchers have used for years.

Among statistical methodologists matching has certainly been one of the hottest topics of the last few years. If one is taking a statistical approach to case studies then a matching logic could prove quite compelling. As such I propose that matching is probably the dominant approach to thinking about control variables for the statistical approach to case studies and multimethod work.

If one follows the general statistical logic of matching then we might start with two case studies, one where X = 1 and the other where X = 0. The idea of matching is that we control for confounding variables Z by having the same value on Z for these two cases, e.g., Z = 1. Gerring illustrates this nicely in his table 5.4 (p. 132), where X2 is his control or matching variable. [2] In Gerring's example he assigns Z (in my terminology) the value of zero. However, in general one must decide whether to match with Z = 1 or Z = 0.
In the normal course of a large-n study one would have matches for both Z = 1 and Z = 0. However, for case studies one needs to make a conscious decision and justify matching on Z = 1 versus Z = 0. One way that might often work in practice is that one is committed for various reasons to a certain case in the (1,1) cell which has value zero on Z. Then a reasonable thing would be to look for a comparison case of X = 0 that also has zero on Z. The key methodological point here is: one must choose which value of Z to match on, and defend that choice.

[2] He differentiates between hypothesis-generating and hypothesis-testing matching, but I use his hypothesis-testing version since that is clearly more connected to the statistical paradigm than hypothesis generating.

While matching in statistical studies is about X and Z, with case studies one has to decide about Y as well. Given that we are choosing X = 1 and Z = 0, one question is whether we want both cases on-line, both off-line, or one of each. Gerring does not give advice: the entries in the Y column are "?". This is not surprising given the origins of the most similar design in the statistical matching literature. Normally there will be variation on Y, and the key thing is eliminating cases from the dataset that have no matches based on X and Z. However, when doing a case study one must make decisions. [3]

I suspect that the dominant choice would be two on-line cases. If we have (1,1) and (0,0) then X appears to have the theorized causal effect, and then one could worry about confounding and control variables. Then matching would come into play. Off-line cases mean there are problems; one is not really in a position to worry about confounding variables. It is not obvious to me how the logic of matching would work if one uses off-line cases.

In short, I think the most similar systems and matching logic work most naturally with two or more on-line cases, normally (0,0) and (1,1). Off-line cases imply that there are problems and work needs to be done. It is useful to recall that scholars do not suggest the inclusion of control variables when the main variable is not significant. Questions arise when the results are statistically significant; then one has to face questions about the inclusion of control variables. This common behavior would lead to a general recommendation to use matching in conjunction with on-line cases.

Summary

In summary, there are two major criteria for selecting cases from the statistical point of view: representativeness and diversity-variation. When one can only choose one or two cases it is representativeness which determines the choice. Once there are a few more cases available then the researcher can implement the diversity and variation criteria.
Depending on the kinds of variables one has (e.g., continuous or dichotomous) and their distribution (e.g., normal or symmetric versus skewed or bimodal), the procedures vary slightly. However, I think that these two criteria completely dominate thinking about case selection for many scholars, and certainly do so for multimethod scholars who work in a regression framework. As I have argued, however, representativeness generally takes priority over diversity.

As Kruskal and Mosteller describe in their wonderful survey of representative case selection, the notion of a representative case has a very strong grasp on researchers' thinking about case selection. As they stress, with large-n statistical studies one does not talk about representative samples but random samples. Of course, a random sample will on average be representative in some sense of the population; however, there are problems with this, as Kruskal and Mosteller discuss.

[3] Of course, randomization is always an option. One could choose 1-4 cases at random and let the chips fall where they may on the Y values.
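The matching step discussed before the summary can be made concrete with a small sketch. It assumes dichotomous X, Y, and Z, and it encodes my suspicion that the dominant choice is a pair of on-line cases, one from the (1,1) cell and one from the (0,0) cell, that share the same value of Z. The case names, the data, and the function name are hypothetical, and the value of Z to match on is passed in explicitly because that choice has to be made and defended by the researcher.

```python
# Illustrative sketch only: case names, data, and the function name are hypothetical.
from itertools import product


def matched_on_line_pairs(cases, z_value):
    """List (x1_case, x0_case) pairs that are both on-line and share Z.

    `cases` maps a case name to a dict with binary keys "x", "y", "z".
    The X = 1 case must come from the (1,1) cell and the X = 0 case
    from the (0,0) cell; both must have z == z_value.
    """
    x1_cases = [
        name for name, c in cases.items()
        if (c["x"], c["y"], c["z"]) == (1, 1, z_value)
    ]
    x0_cases = [
        name for name, c in cases.items()
        if (c["x"], c["y"], c["z"]) == (0, 0, z_value)
    ]
    return sorted(product(x1_cases, x0_cases))


cases = {
    "A": {"x": 1, "y": 1, "z": 0},
    "B": {"x": 0, "y": 0, "z": 0},
    "C": {"x": 0, "y": 0, "z": 1},
    "D": {"x": 1, "y": 0, "z": 0},  # off-line, so never paired here
    "E": {"x": 0, "y": 1, "z": 0},  # off-line, so never paired here
}
print(matched_on_line_pairs(cases, z_value=0))
print(matched_on_line_pairs(cases, z_value=1))
```

With these made-up cases, matching on Z = 0 yields the single pair (A, B), while matching on Z = 1 yields no pair because no (1,1) case has Z = 1.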

Almost all sampling techniques used in case studies rely on purposive sampling, which is defined as non-random sampling. This fits nicely with the classic tradition of types of case studies started by Eckstein and continued in the work of George and Bennett, Gerring, and others. The types are defined by their purpose. This brings us full circle back to why one does a case study, which is "for the purpose of understanding a larger class of similar units (a population of cases)" (Gerring 2007, 37).

The qualitative logic of case studies

In this second part of the paper I present an alternative, qualitative, logic for case studies. It is qualitative because central to the methodology are causal mechanisms and within-case causal inference. These features, which are intimately related, are notable features of qualitative work and methodology. They contrast quite decisively with statistical approaches, which involve cross-case analysis and typically have a weak ability to investigate causal mechanisms.

I suspect that many approaches to case studies in fact combine features of the qualitative and quantitative approaches. While that is potentially a good thing (to use a culinary metaphor, methodological fusion cuisine), I think that in many instances it makes more for confusion, as scholars implicitly move back and forth between paradigms. If one is to combine the logics it is best to first have a clear vision of each as a pure type.

In this part I use the same setup as in the previous section. I use mostly table 1 but also figure 1. I also use the same notion of case budget constraints and rationales for decisions. Finally, I contrast the qualitative logic with the quantitative one for each budget constraint.

Case study goals

I believe that the main reason for the rise of mixed and multimethod research has been the desire to combine the advantages of cross-case large-n research with the advantages of intense within-case causal analysis.
For example, Brady and Collier (2010) contrast dataset observations, i.e., cross-case, with causal process observations, i.e., within-case. Often hypotheses of the form "X has an effect on Y" are quite vague on the details of the causal mechanism. It is not uncommon for a researcher to provide several causal mechanisms to explain the effect of X. For example, there are various mechanisms that attempt to explain the strong effect of democracy on war. Hyde (2007) gives 3-4 different causal mechanisms that might explain the impact of United Nations election observers on the quality of the election. Finally, a variety of mechanisms have been proposed to explicate the relationship between GDP/capita and democracy.

Table 2: What is a case study good for?

Consideration           Case study        Cross-case/Statistical
Hypothesis              Generation        Testing
Validity                Internal          External
Causal insight          Mechanisms        Effects
Scope of proposition    Deep              Broad
Population of cases     Heterogeneous     Homogeneous
Causal strength         Strong            Weak
Useful variation        Rare              Common
Data availability       Concentrated      Dispersed
Causal complexity       Indeterminate     Indeterminate
State of the field      Indeterminate     Indeterminate

Source: Gerring 2007, table 3.1.

I propose: The central goal of a case study is to investigate causal mechanisms and make causal inferences within individual cases.

I think that this goal underlies the vast majority of case studies and informs many discussions of case study methodology. This core goal should be brought up and placed front and center in the discussion of case study methodology. The rest of this paper looks at case selection with this as the central goal of case studies.

It is worth briefly seeing how this goal figures in many discussions of case study methodology. Gerring's analysis, summarized in table 2, supports this perspective. Under the rubric "causal insight" he argues that case studies are about mechanisms while large-n cross-case studies are about effects. I stress that case studies are about causal inference within cases; Gerring argues that within-case analysis has internal validity while statistical analyses have external validity. Similarly, causal strength is strong within-case, while it is weak in cross-case, statistical analysis. One can in fact combine these three rows of table 2 into the core proposition that the purpose of a case study is to conduct an intensive, valid, causal analysis of the case. The center of the causal analysis is the causal mechanism by which, the researcher proposes, X produces Y.

If one looks at Eckstein's (1975) and George and Bennett's lists of case study types, table 3, this same central focus also appears.
Table 3: Case study types: Eckstein, George and Bennett

Goal-Purpose                 Rationale
Configurative-idiographic    Focus on individual case
Disciplined configurative    Testing general theories
Heuristic case studies       Theory-building
Crucial case studies         Theory-testing, must-fit cases
Building block               Part of larger contingent generalizations or typological theories

Source: Eckstein 1975; George and Bennett 2005, chapter 4.

Van Evera (1997, 8) gives the following uses for case studies: (1) testing theories, (2) inferring theories, (3) inferring antecedent, context conditions, (4) testing antecedent, context conditions, (5) cases of intrinsic importance. Finally, Sambanis, in a large multimethod research project funded by the World Bank, argues for these goals:

1. They [case studies] help us identify a number of causal mechanisms through which independent variables in the CH [Collier and Hoeffler] and FL [Fearon and Laitin] models influence the dependent variable, i.e., the risk of civil war onset. It quickly becomes clear that the CH model's distinction between greed and grievance as competing motives for civil war is illusory, because greed and grievance are usually shades of the same problem.

2. They question assumptions and premises of the quantitative studies and make clear that CH and FL are often right for the wrong reasons yet also wrong for the wrong reasons. (In other words, the cases identify mechanisms that are different from those underlying their theories, both where the statistical models make good predictions and where they make bad predictions).

3. They sometimes point to a poor fit between the empirical proxies and the theoretically significant variables, i.e., they identify measurement problems in the statistical studies. Later in this article, I offer some examples and suggest ways to improve the connection between theory and data.

4. They help us identify new variables that might explain civil war but are omitted from CH and FL (e.g., external intervention, or diffusion and contagion from violence in the neighborhood). Adding these variables to quantitative models might reduce the risk of omitted-variable bias and facilitate inductive theory building.

5. They highlight interactive effects between variables in the statistical models and help us identify exogenous and endogenous variables by presenting narratives of the series of events and the processes that led to civil war.

6. They suggest substantial unit heterogeneity in the data, as the mechanisms that lead to civil war seem to differ substantially across different sets of countries and types of civil war. (Sambanis 2004, 259-60)

At issue in large part is what these authors mean by "theory." Statistical methodology tests hypotheses, which are almost always of the form "X has a causal effect on Y." I suggest that for case study research the theory that is tested is a causal mechanism. So when Eckstein says "theory-building" I interpret that to mean developing a causal mechanism. When he says "testing" he means verifying that the causal mechanism in the theory is present in the case.

This within-case analysis can come at various points in the research process. If undertaken early in the program it is often called a plausibility probe (Eckstein 1975). Almost everyone agrees that case studies are useful in generating hypotheses. For quantitative researchers the main thing is the statistical analysis which follows up on the hypotheses generated by the case study. The case study could also come after the statistical analysis, in an attempt to verify that the causal mechanism(s) linking X and Y are really there in at least one or a few cases. This can be important because often the measures of X in the statistical analysis are only weakly related to the causal mechanisms.

The key difference with the quantitative approach is that there one is thinking in terms of representativeness of a population, a cross-case notion. In the qualitative logic there is no reference to a population at all; rather it is about the nature of a causal mechanism and whether it is present in a given case. The goal is a within-case one.

The way most scholars think about multimethod research is really a combination of these two goals. The statistical analysis provides generalization to a population, while the case study investigates causal mechanisms and does within-case causal inference. I think if the goal is generalization to a population, then some sort of large-n strategy is the way to go. The question here is rather: if the goal is within-case causal inference, what should one do?
While I, like everyone else, talk of variables X and Y, it is crucial to keep in mind that X is a causal mechanism. So in fact X is really a series of causal process or causal mechanism factors. For example, the Brady analysis of the 2000 presidential election in Florida involves causal process observations which are in fact variables such as time zone, media audience data, etc. He puts these variables into a causal sequence and hence typifies the within-case causal analysis nature of case studies.

Thus, be it early or late in the research program, the case study is about looking at causal mechanisms in individual cases and confirming that the proposed causal linkage between X and Y can be found in individual cases. Hence, it is about causal inference within cases. The researcher needs to convince her audience that X was a cause of Y in the individual case: this is what I mean by within-case causal inference. Most of the differences between the two logics that I discuss in the next subsections flow from this fundamental difference in goals.

The qualitative logic of case selection with budget constraints

As with the statistical logic section, I start with a budget constraint of one case. Here I think the answer is clear: choose a key case that illustrates the nature of the causal mechanism. If the main goal of a case study is to investigate causal mechanisms, then if only one case study is possible one should look for a good example of that causal mechanism.

One case budget constraint: A focus on causal mechanisms or within-case causal inference leads to choosing a case from the (1,1) cell.

Recall that qualitative researchers have been frequently accused of selecting on the dependent variable. I think that this accusation is somewhat misplaced. What they have really been doing is selecting cases from the (1,1) cell. So it is not selection on the dependent variable per se, but rather selecting from the cell where one is most likely to see the causal mechanism in action. This makes eminent sense if one is really interested in the causal mechanism: one should first examine cases where one expects to see it in action, and this is the (1,1) cell. I suspect that if one goes back to look at the classic studies where the scholar has been accused of selecting on the dependent variable, one will find that it is in fact selection on the (1,1) cell.

Above, the statistical approach took (X, Y) as the representative case under the one-case budget constraint. Faced with continuous variables as in figure 1, would the qualitative logic make the same choice? The answer is almost certainly no. If the goal is a clear case of the causal mechanism then typically this means you want a good case of Y = 1 and a good case of X = 1. These then would be cases on the upper right part of the line. So while the statistical approach had no real preference among the on-line cases, the qualitative logic clearly does. This is a nice example of how the same thing can be perceived in quite different ways depending on the fundamental perspective.
Table 4: Case study selection: rationales and roles of X-Y configurations

          X = 0               X = 1
Y = 1     Importance = 3      Importance = 1
          Equifinality        Causal mechanism
Y = 0     Importance = 4      Importance = 2
          Counterfactual      Falsification/Scope

Note: the importance scores and the labels will be discussed in the next section.

In the statistical logic (e.g., Gerring 2007; Herron and Quinn n.d.) these are not good cases of X and Y but extreme cases. We are talking about exactly the same points in Cartesian space, but they are given radically different interpretations. So while both the statistical and qualitative logic agree that with a budget constraint of one the choice is an on-line case, they disagree significantly on which on-line case.

So I propose that among the four cells of table 4 the case study researcher has a very clear preference for the (1,1) cell. This contrasts with the statistical point of view, where there is no real preference for one on-line cell over the other (i.e., (0,0)). As I argue below, the (0,0) cell is in fact probably the least important of the four cells of the 2×2 table in the qualitative logic.

As we shall see in this section, the qualitative logic has a strong set of preferences among these cells in terms of doing case studies. I shall present these preferences as ordinal, but the distance between the preferences is certainly not the same. In short, the preference order is radically different from the statistical approach and the distance between some items will be much greater than in the statistical approach.

What if we have a budget constraint of two cases? What about the other cells in table 4? Is there a sense of priority among them? In the statistical approach we had two clear categories, on-line and off-line. However, as we have just seen, the qualitative approach does not make that division because it has a strong preference for one on-line cell as opposed to the other. In addition, within the qualitative approach there is in a real sense no line. In the statistical approach the line is basically the theory relating X to Y. However, in the qualitative approach we do not have a line but a causal mechanism.

What about the off-line cells, (1,0) and (0,1)? Are these equally disconfirming cases? In the statistical approach they are symmetric.
The basic measure of disconfirmation is distance to the regression line. The sign of the difference is almost never used as useful information; it is absolute distance which matters. In contrast, I argue that these cells are not of equal value and in fact serve different empirical and theoretical purposes within the qualitative logic.

The (1,0) cell is of particular importance. These are what I call disconfirming or falsifying cases. Recall that the causal mechanism approach suggests that when X is present the mechanism works to produce Y. Cases in the (1,0) cell suggest that the mechanism is not working. Hence they offer potential evidence against the causal mechanism hypothesis, and the presence of cases in this cell is of great concern to the researcher. For example, throughout the 1990s many scholars would look at cases where neorealism should apply to see if the outcomes conformed to Waltz's theory (1979).

This is naturally the category of crucial or theory-testing case studies. More specifically, they are falsifying cases. While most users of case studies avoid, for obvious reasons, cases which tend to falsify their theories, their opponents equally obviously focus their attention on this cell. Gerring refers to these kinds of case studies as deviant or crucial. In summary:

Two case budget constraint: A focus on causal mechanisms or within-case causal inference leads to choosing a case from the (1,1) cell, and the second case should be a falsifying one (if possible).

Obviously scholars are not likely to present cases that falsify their theory. However, those who have objections to the theory or causal mechanism are usually glad to help. So I think it is a good idea to face these cases head on. In good comparative work, scholars take the disconfirming cases very seriously. For example, Ertman (1997) has a couple of cases where his theory does not work and he spends more time discussing these than the confirming cases.

The (1,0) cell has a very important positive use: it can be useful in finding the scope limits of the causal mechanism. Realistically, causal mechanisms do not work everywhere, in all time periods, and in widely different political, economic or cultural situations. The positive use of falsifying cases is the construction of the population or scope. As Ragin (2000) emphasizes, populations in much social science are created, not given. In contrast, for the statistical approach to case studies the population is a given. In Gerring's definition of a case study (above) the population is given; constructing it is not one of the goals of the comparative case study. Hence the cases in the (1,0) cell play two complementary roles: they disconfirm the causal mechanism, but at the same time they can begin to lay out the scope where the causal mechanism works.
Since the budget constraint is two cases, the researcher can compare the two and hypothesize about what in the second case is critically different from the case where the mechanism works.

What if we have a budget constraint of four cases? If we take the 2×2 table, the idea of choosing one case from each cell seems quite compelling. As I noted above, that satisfies the diversity and variation requirement. However, if the causal mechanism goal is front and center it is not so obvious that this is a good idea. To determine this we need to investigate the role that the (0,1) and (0,0) cells can play in this enterprise.

Take first the (0,1) cell. While it is off the line, in the causal mechanism approach we do not really have a line. Is it metaphorically off-line from a causal mechanism perspective? I think that in general the answer is no. Causal mechanisms usually have the form that when we see X there is a mechanism that produces Y.[4] If X is absent then there is no reason why we should or should not see Y. Because qualitative scholars work with the notion of equifinality they expect that there are alternative paths to Y = 1. This is why in table 4 I call this cell the equifinality cell. Lijphart's analysis of pluralism is a classic case of the (0,1) cell. Lijphart's key insight was that there were other paths to political stability beyond the classic Anglo-American one. Hence his analysis did not necessarily invalidate the relationship between pluralism and political stability, but rather showed that there are other ways to achieve it.

Unlike the (1,0) cell, which is absolutely central because it falsifies the causal mechanism, equifinality is not per se a serious threat. It certainly could be part of a research agenda to investigate alternative causal mechanisms, but not doing so is certainly acceptable. So the choice of the (0,1) cell is completely optional.

What about the (0,0) cell? The same logic applies in general. If X is absent we might think Y is less likely to happen, but this depends tremendously on the existence and importance of alternative causal mechanisms. As I have discussed elsewhere in detail (Mahoney and Goertz 2004), the (X = 0, Y = 0) cases are very often conceptually problematic. For example, we have a relatively clear idea about the occurrence of a social revolution; however, there are millions of cases of non-social-revolution. If we have to choose one or two among these it could be quite difficult.

In short, there are no compelling reasons to choose cases from the X = 0 column. In specific circumstances they may be quite useful, but there is no strong reason to choose from them in general. If one wants to do more than two case studies and use up the four-case budget, then it might make more sense to go back to the cells that are clearly more important: (1,1) or (1,0).
Choosing more cases of (1,1) can help one understand the causal mechanism more clearly. There may be variation in the details: there are things that seem important in one success case which are not in another. Similarly with the scope cases. Working out the scope limits of a causal mechanism might be more important than dealing with equifinality. Here one might apply the diversity criterion within the qualitative logic. Diverse cases of success, (1,1), can be critical in evaluating the causal mechanism. Diverse cases in the (1,0) cell can really help delimit scope. In contrast to the statistical notion, which is variation across X and/or Y, the qualitative logic stresses diversity within the key cells of the table.

[4] See the conclusions for more discussion of this point.
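
The budget-constraint rules of this section can be summarized in a minimal sketch. This is only an illustration under stated assumptions, not a procedure given in the paper: the function name select_cases, the (name, x, y) input format, and the simple alternation between the two key cells are all hypothetical choices. The importance ranks are those of table 4. The sketch deliberately never draws from the X = 0 column, which the argument above treats as optional, and it ignores the diversity criterion by taking cases in the order listed.

```python
from collections import defaultdict

# Importance ranks from Table 4, keyed by cell (X, Y); 1 = most important.
IMPORTANCE = {(1, 1): 1, (1, 0): 2, (0, 1): 3, (0, 0): 4}


def select_cases(candidates, budget):
    """Pick up to `budget` cases from the two key cells, (1,1) and (1,0).

    candidates: list of (name, x, y) tuples with binary x and y
                (a hypothetical input format chosen for illustration).
    Returns a list of (name, cell) pairs in order of selection.
    """
    by_cell = defaultdict(list)
    for name, x, y in candidates:
        by_cell[(x, y)].append(name)

    # Key cells in order of importance: (1,1) first, then (1,0).
    key_cells = [c for c in sorted(IMPORTANCE, key=IMPORTANCE.get)
                 if IMPORTANCE[c] <= 2]

    chosen = []
    # Assumption: alternate between the key cells until the budget
    # is spent or both cells are exhausted.
    while len(chosen) < budget and any(by_cell[c] for c in key_cells):
        for cell in key_cells:
            if by_cell[cell] and len(chosen) < budget:
                chosen.append((by_cell[cell].pop(0), cell))
    return chosen
```

With a budget of one this returns a single (1,1) case; with a budget of two it adds a falsifying (1,0) case if one exists; with larger budgets it keeps returning to those two cells and may return fewer cases than the budget allows.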

Counterfactuals and within-case analysis

One of the most central differences between a qualitative approach to research and a quantitative, statistical one is their philosophy about within-case causal inference. This has tremendous importance for case study methodology. The potential outcomes approach as a philosophical position states that within-case causal inference is basically impossible:

Fundamental Problem of Causal Inference. It is impossible to observe the value of Y_t(i) and Y_c(i) on the same unit and, therefore, it is impossible to observe the effect of t on i. (Holland 1986, 947)

Because of this Fundamental Problem one must use cross-case evidence, ideally an experiment. Qualitative scholars, among others, believe that within-case causal inference is possible. It is not without its own problems, but no method of causal inference, including experiments, is without its own set of issues.

This is central to case study methodology because variation in X or Y is typically defined in a cross-case manner. We must have separate cases of X = 0 and X = 1. However, if one thinks that within-case causal inference is possible then one also believes that within-case counterfactuals are doable and have value. I need not go into the methodology of individual case counterfactuals (which is a huge issue in the philosophy of causation). For our purposes I assume that one can do this. The key methodological point is:

For all cells of the 2×2 table, e.g., table 4, one can do counterfactual analysis which generates observations in other cells.

The obvious place to start is with the key (1,1) cell. The key counterfactual question (and everyone agrees on this point) is: if X had been zero, then what would have happened to Y? Notice that doing a within-case counterfactual moves one around in the 2×2 table. I start with (1,1) and then ask about (0,1) versus (0,0). If the causal mechanism argument is correct we end up in cell (0,0).
We do not have to choose a separate case of (0,0) because counterfactual analysis generates these cases from the (1,1) cell. In short, the (0,0) cell is critical to case study methodology, but not as a separate cross-case case study. It is critical because of the counterfactual analysis of the (1,1) cases.

In summary, in the ideal scenario the within-case counterfactual analysis of the (1,1) cell would lead to the conclusion that Y does not occur. Thus the end point of the analysis, ideally, is in fact in the (0,0) cell. This is why I have called the (0,0) cell in table 4 the counterfactual cell. If the hypotheses about the causal mechanisms are correct, the counterfactual analysis of cell (1,1) produces cases in cell (0,0).

In the statistical logic there was no particular reason to prefer the (0,0) to the (1,1) cell or vice versa. However, from the within-case perspective the order of priorities is clear. One goes for the (1,1) case, and conducts a counterfactual analysis to get the (0,0) cases.

If the counterfactual analysis fails then we have a situation of (0,1): Y would have occurred in spite of the absence of X. As we have seen, this means equifinality. Hence critical to these counterfactuals is the existence of alternative paths to Y. More specifically, the counterfactual explicitly raises the question of overdetermination. It is quite common in QCA analyses for a case to be on multiple paths. This leads to an important criterion for selecting cases:

Overdetermination avoidance: avoid cases that exemplify multiple causal mechanisms.

One might not know this from the start, but if there are preceding statistical or QCA results then these cases-to-avoid can be identified based on these results. This criterion flows naturally from the emphasis on a specific causal mechanism. We want, at least initially, cases that are clear examples of the causal mechanism. Overdetermined cases are muddied waters because of the existence of multiple causal mechanisms.

One can think about counterfactuals from all cells of the table. I have focused our attention on the central (1,1) cell, which, if the causal mechanism is correct, produces a case in (0,0). But we can work the counterfactual from the other direction: take a case from the (0,0) cell and make X = 1. Does the counterfactual analysis lead to the cell (1,1)?

What about the falsification-scope cell (1,0)? Changing X = 1 to X = 0 seems less problematic since we have no reason to think Y would occur, so we would probably end up in the (0,0) cell.

What about the equifinality cell (0,1)? This seems like a bit of an odd counterfactual because if the other path is still present we would still expect to have Y = 1. If we concluded that Y would be zero then we must have some interaction between the two causal mechanisms whereby they cancel each other out, producing Y = 0.
This would certainly be an interesting counterfactual if it were the case.

The key point here is that one needs to think about how within-case counterfactual analysis substitutes for cross-case case studies. There is a huge literature on the methodology and validity of within-case counterfactuals. However, this needs to be balanced against the problems of causal inference when doing comparative case studies. It is not at all clear that cross-case analyses lead to better causal inference about the (1,1) cell than within-case counterfactual analysis. Both are potentially useful. However, the qualitative logic suggests that one should seriously consider within-case counterfactual analysis as an alternative to comparative cross-case case studies.
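
The way a within-case counterfactual moves a case around the 2×2 table can be illustrated with a small sketch. It rests on a strong simplifying assumption that is not part of the argument above: Y occurs if the mechanism attached to X fires or if some alternative path, here called z, is present, with no interaction between the two. The function names and the mechanism_works flag are hypothetical and serve only the illustration; in particular, the sketch cannot represent the case where two mechanisms cancel each other out.

```python
def outcome(x, z, mechanism_works=True):
    """Y under a simple two-path model (an assumption of this sketch):
    Y = 1 if X is present and its mechanism works, or if the
    alternative path z is present."""
    return int((x and mechanism_works) or z)


def counterfactual_cell(x, z, mechanism_works=True):
    """Flip X within a single case while holding z fixed.

    Returns (observed_cell, counterfactual_cell), each written (X, Y).
    """
    observed = (x, outcome(x, z, mechanism_works))
    flipped_x = 1 - x
    counterfactual = (flipped_x, outcome(flipped_x, z, mechanism_works))
    return observed, counterfactual
```

Under these assumptions a clean (1,1) case with no alternative path moves to (0,0), while an overdetermined (1,1) case with z = 1 moves to (0,1); a (1,0) case, where the mechanism is not working, moves to (0,0); and a (0,1) case moves to (1,1) because the other path is still present.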

Alternative explanations and scope

The question naturally arises regarding the role that a second independent variable, i.e., Z, might play in a (comparative) case study project. Above I argued that within the statistical framework the natural impulse would be to think about control variables, matching, and most similar systems design. I also think it is clear that the most similar systems notion has guided most qualitative, comparative case study research. This is not surprising because it has the logic of control variables. In this section, I propose to revisit these issues from a qualitative logic perspective. As we shall see, the issues of concern will be quite different from the matching and control variable notions that dominate the statistical paradigm.

Huge differences arise from the very start. The language of control variables means that one is thinking of Z in a certain way. Once we move to a qualitative perspective, one where the causal model is fundamentally an INUS model, the way one thinks of Z changes quite radically. One speaks of Z not as a control or matching variable, but rather as an alternative causal mechanism: Y = X + Z is now interpreted to mean that there are two causal mechanisms that can produce Y. Normally, INUS models have complex terms, e.g., X w T. For the purposes of this paper one can think of X and Z as a shorthand for such complex, configurational causes.

In the discussion above, Z = 1 and Z = 0 were just values on the (control) variable Z. Now the interpretation changes significantly to the presence or absence of an alternative causal mechanism Z. With this interpretation of Z, the question left open above about whether to choose Z = 1 or Z = 0 is often much easier: choose Z = 0 cases. Cases where X = 1 AND Z = 1 are overdetermined. If the goal of the case study analysis is to investigate the causal mechanism X, then one clearly wants to avoid the overdetermined ones.
We have already seen this principle above (see also Schneider and Rohlfing 2011; Gerring 2006, chapter 5, in the discussion of the pathway technique).

If one goes back to the discussion of matching, I assumed that one needed two cases to get going; fundamentally one contrasts an X = 1 case with an X = 0 one. Matching then determines which value on Z one should use. Gerring implicitly states this as a requirement for the most similar or matching technique, since the discussion assumes minimally a pair of cases. One can ask whether such a requirement exists in the equifinality, INUS view of Z. The answer is clearly no. One can apply the logic of alternative pathways even when the budget constraint is one case. The guideline is: do not choose a case where Z = 1.[5]

[5] This makes the logic of this section much closer to Gerring's pathway case, where he in fact conducts some of his analysis using X as a sufficient condition.

One of the main differences between a qualitative and a statistical view of case studies lies in the role of within-case causal analysis. Suppose that one is unaware of the alternative causal mechanism Z until the case study is well under way, i.e., one has in