Predictions Without Models? Using Natural Experiments to Test the Performance of Machine Learning Algorithms


Predictions Without Models? Using Natural Experiments to Test the Performance of Machine Learning Algorithms

Devesh Raval (Federal Trade Commission), Ted Rosenbaum (Federal Trade Commission), and Nathan E. Wilson (Federal Trade Commission)

January 12, 2017. Preliminary and Incomplete.

Abstract: In recent years, scholars and practitioners in many areas have become interested in machine learning algorithms. Typically, these emphasize predictive accuracy and are agnostic about what theoretical models suggest should determine behavior. However, in changing environments, predictions based upon historical choice patterns may be difficult, or even impossible, without a model of behavior. In this paper, we test several standard machine learning models' performance in a context where the relevant policy question often concerns how consumers would respond to a significant change in their choice environment. Specifically, we assess how well machine learning algorithms perform in predicting where patients receive hospital-based treatment when their first-choice hospital is no longer available. To do this, we exploit natural disasters that abruptly closed one or more general acute care hospitals but left the surrounding area relatively unaffected. Our results suggest that, relative to commonly used econometric techniques, common machine learning algorithms often do perform better at counterfactual prediction of individual patients' choices, although their performance degrades for patients facing a larger change in their choice environment post-disaster.

JEL Codes: C18, I11, L1, L41. Keywords: machine learning, hospitals, natural experiment, patient choice, prediction.

The views expressed in this article are those of the authors. They do not necessarily represent those of the Federal Trade Commission or any of its Commissioners. We are grateful to Jonathan Byars, Gregory Dowd, Aaron Keller, Laura Kmitch, and Peter Nguon for their excellent research assistance. We also thank Dave Schmidt for his comments on this draft. The usual caveat applies.

1 Introduction

The increasing ability of firms to observe and collect large amounts of information about individual consumers and their decisions has led to the rise of what Breiman (2001b) calls the algorithmic modeling culture. In this approach, often labeled machine learning, analysts predict choices using algorithms that are not directly derived from formal models of behavior. 1 For example, a regularization procedure such as LASSO might be used to select the variables most relevant for predicting a purchase from a large set of possible regressors. This model-agnostic approach towards prediction stands in sharp contrast to the traditional Cowles Commission approach, in which economists use economic theory to identify the relevant variables and functional form. For example, McFadden (1981) shows how a model of utility-maximizing consumers leads to the multinomial logit econometric model of consumer choice.

Recently, economists have begun to use machine learning methods to answer economic questions (Kalouptsidi, 2014; Gilchrist and Sands, forthcoming; Goel et al., 2016a,b; Athey and Imbens, 2016). This rising interest may be traced to the fact that for many empirical questions, all that is required is prediction (Kleinberg et al., 2015). Therefore, the algorithmic approach may be appropriate to answer the question at hand. Moreover, comparisons of machine learning algorithms to canonical econometric models have suggested the former often outperform the latter (Bajari et al., 2015a,b).

To date, however, economists' applications of algorithmic prediction have focused on settings where the support of out-of-sample outcomes is effectively observed in the historical data. Such settings include classic problems like the response of demand to a tax change or the treatment effect of a job training program. However, there are many economic problems

1 Broad details on the methodological underpinnings of machine learning models can be found in various texts, including Hastie et al. (2005) and James et al. (2013). Varian (2014) provides a readable introduction for economists.

where this is not true. In these types of cases, there is no way to train an algorithm to predict an outcome, since there is no data to train on (Nevo and Whinston, 2010). For example, antitrust policymakers cannot rely on historical data to predict the effects of a proposed merger, nor can marketers leverage brands' past performance to predict market shares in new product categories. For such problems, it is less obvious whether machine learning methods can usefully be applied. As Marschak (1974) notes, "It follows that a theory may appear unnecessary for policy decisions until a certain structural change is expected or intended. It becomes necessary then."

In this paper, we demonstrate how algorithmic prediction models can be straightforwardly integrated into the canonical random utility model of rational decision-making in order to address these types of questions. We then use a set of natural experiments to compare the relative performance of machine learning algorithms to standard econometric models after a major structural change in the choice environment. The setting for our experiments is local hospital markets that were shocked by natural disasters that severely damaged or destroyed hospitals but left the majority of the surrounding area undisturbed. These natural disasters exogenously altered consumers' choice sets, creating a benchmark against which to assess the performance of different predictive models. As in our prior work focused on the relative performance of different commonly used econometric specifications (Raval et al., 2015a), we use the pre-disaster data to estimate consumers' preferences for different hospital characteristics. Then, we predict consumer decisions after the disaster has changed consumers' choice sets. By comparing the different models' predictions to actual post-disaster choices, we are able to evaluate their performance in both absolute and relative terms, and whether any differences are likely to matter for policy decisions.

We make no claims to exhaustively consider how all possible machine learning models perform, as the set of different algorithms is already large and continues to expand rapidly. Instead, we focus on a set of approaches that are considered highly accurate, are already implemented

within existing software packages, and can be applied straightforwardly to multinomial choice problems. In particular, we examine decision trees, random forests, gradient boosted trees, and elastic net regularized conditional logit models.

Across all of our natural experiments, we find that the gradient boosted tree model does particularly well, and is usually one of the best models at predicting aggregate shares, aggregate diversion ratios, and individual choices. When we perform an explicit model combination approach to compare the performance of all the models, the gradient boosting model and random forest model together receive about three-quarters of the model weight on average.

We do find, however, that the performance of machine learning models does not always dominate that of parametric models. For example, the performance of the machine learning models worsens for patients who were more likely to have gone to the destroyed hospital, and so were more likely to have to change their preferred hospital post-disaster. In addition, parametric logit models perform better at individual prediction for the service area with the largest share of the destroyed hospital. These results indicate that parametric logit models may still have an important role to play in counterfactual prediction when there are large changes in the environment or comparatively little data on which to train the model.

We also find that, while the random forest model does very well at predicting aggregate diversion ratios, it does very badly at predicting aggregate shares, is often worse than a single tree at individual prediction, and overfits the data much more than the other models. This likely reflects the fact that the random forest's predictions are biased, as it estimates probabilities by averaging each tree's predicted hospital rather than each tree's predicted probabilities.

This highlights the need to adapt machine learning models to the questions that economists pose. Econometricians have already started on this project; for example, Wager and Athey (2015) and Belloni et al. (2012) examine how to develop unbiased probability estimates for random forests and regularized models, respectively.

Overall, our work contributes to the emerging literature in economics on the application

of machine learning techniques (Varian, 2014; Athey, 2015; Kleinberg et al., 2015). Within this literature, the work most similar to our own is by Bajari et al. (2015a,b), who consider the relative out-of-sample performance of several machine learning models compared to simple econometric aggregate demand models. They find that many of the machine learning models outperform simple linear and logit demand models. A major difference between their work and ours is that the choice environment faced by consumers changes dramatically in our test, which is precisely the environment for which a model-based approach could be more fruitful.

The paper proceeds as follows. Section 2 describes our data and experimental settings. In Section 3, we lay out the theoretical framework underpinning the models considered in this paper. Section 4 describes the models we compare. In Section 5, we present our results on model performance. Section 6 concludes.

2 Natural Experiments

2.1 Disasters

We exploit the unexpected closures of six hospitals in four different markets following a natural disaster. Table I below describes the disasters. The Americus tornado struck a community hospital in rural Georgia, while the Moore tornado hit a small local hospital in the suburbs of Oklahoma City. Hurricane Sandy flooded portions of New York City, leading three hospitals to close in Manhattan and Brooklyn. These hospitals included NYU Hospital, one of the highest-ranked hospitals in the country, and Bellevue Hospital Center, a flagship hospital of the NYC public system. The Northridge earthquake hit Los Angeles, causing the closure of one hospital in Santa Monica. Because there is considerable heterogeneity in the treated groups, we expect any results that appear consistent across our experimental settings to have a high degree of external validity.

Table I: Natural Disasters

Location        Month/Year  Severe Weather    Hospital(s) Closed        Share Destroyed
Northridge, CA  Jan-94      Earthquake        St. John's Hospital       17.4%
Americus, GA    Mar-07      Tornado           Sumter Regional Hospital  50.4%
New York, NY    Oct-12      Superstorm Sandy  NYU Langone               8.9%
                                              Bellevue Hospital Center  10.8%
                                              Coney Island Hospital     18.2%
Moore, OK       May-13      Tornado           Moore Medical Center      11.0%

For a natural disaster to provide a good natural experiment to assess choice models, it must satisfy several criteria. First, the service area must be large enough, and the post-disaster period for which the hospital is closed long enough, that we have enough power to compare different demand models. Second, the destroyed hospital must have had a large enough market share in its service area, because the experiment is informative about model performance only when the choice environment undergoes a substantial change. Finally, the damage from the disaster must be narrow enough that the change in patient decision-making is limited to the change in the choice set. As described in detail in Raval et al. (2015a), our set of disasters meets these criteria.

For each experiment, our primary data come from the inpatient hospital discharge records collected by state departments of health. Such patient-hospital data have been used previously by researchers (Capps et al., 2003; Ciliberto and Dranove, 2006), and provide a host of characteristics describing the patient receiving care as well as the type of clinical care being provided. The details on the construction of our estimation samples are provided in Appendix B of Raval et al. (2015a). The set of affected patients are those living within the zip codes making up the destroyed hospitals' 90% service areas. We identify the choice set of affected consumers as those hospitals that have a share above 1% for patients in the 90% service area in at least one month (quarter for the smaller Sumter and Moore experiments).
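As a concrete sketch of this choice-set rule, the hypothetical Python fragment below flags hospitals whose monthly admission share among service-area patients exceeds 1% in at least one month. The input format and names are our own illustration, not the paper's actual data-construction code.

```python
from collections import defaultdict

def choice_set(records, threshold=0.01):
    """Hospitals with an admission share above `threshold` in at least one month.

    `records` is an iterable of (month, hospital) pairs for patients living in
    the destroyed hospital's 90% service area (hypothetical input format).
    """
    counts = defaultdict(lambda: defaultdict(int))  # month -> hospital -> admissions
    for month, hospital in records:
        counts[month][hospital] += 1
    chosen = set()
    for by_hosp in counts.values():
        total = sum(by_hosp.values())
        for hosp, n in by_hosp.items():
            if n / total > threshold:  # strictly above the 1% cutoff
                chosen.add(hosp)
    return chosen
```

For the smaller Sumter and Moore experiments, one would pass (quarter, hospital) pairs instead, matching the quarterly aggregation described above.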
The last column in Table I contains the share of the destroyed hospital in the choice set for each experiment. While all markets were significantly affected, the choice environment

changed differentially across experiments. Sumter Regional, the hospital hit by a tornado in rural Georgia, had about a 50% share of admissions in its service area. For the other hospitals, the share of the destroyed hospital ranges from 9% for NYU to 18% for Coney Island Hospital. We leverage this variation to try to better understand the relative performance of different models.

3 Predicting Choices in a Changing Environment

3.1 The ARUM Framework

To predict multinomial choices, economists have typically turned to additive random utility models (ARUM). Such models presume that decision-makers choose from a defined set of options so as to maximize their expected utility. Utility, in turn, is assumed to be a linearly separable combination of a deterministic component based on observable elements and an idiosyncratic shock, i.e., u_ij = δ_ij + ε_ij, where u is utility, δ is the deterministic component of utility, and ε is the random component, while i and j index decision-maker and choice, respectively.

This framework applies straightforwardly to a patient's choice of hospitals. Consider a patient i who becomes ill with condition c. Needing care, the patient chooses the specific hospital h from the set of available hospitals H (h = 1, ..., N) based on the expected utility of going and receiving treatment there. The utility patient i with condition c receives from care at hospital h can be represented as:

u_ihc = δ_ihc + ε_ihc = f(X_ic, Y_h; θ) + ε_ihc,    (1)

where X_ic are observable characteristics of the patient and their condition, Y_h are observable characteristics of the hospital's ability to treat the condition, f(·) is a function of

X and Y with parameters θ, and ε_ihc is a random shock affecting the relative likelihood that patient i chooses hospital h.

To make predictions, the economist fully specifies f(·) and the distribution of ε ex ante. She then uses historical data to estimate the parameters of f(·). To make out-of-sample predictions, the recovered θ̂ are applied to the observable characteristics to generate a predicted value of δ_ihc. When combined with knowledge of the distribution of the unobserved information, the estimates pin down the likelihood of observing different choices out-of-sample. This approach holds irrespective of whether the choice environment is changing or not. Predictive models' performance will vary depending on the reasonableness of the assumptions the economist made about f(·), including what elements should be in X and Y, and the distribution of ε.

3.2 Integrating Machine Learning into the ARUM Framework

Machine learning models have also been applied to multinomial choice problems. The different algorithms recover predictions about the likelihood of different outcomes conditional on the choice set and observables. For the most part, assumptions about the distribution of ε are not made, and are not required in order to make out-of-sample predictions so long as the choice set has not changed. However, absent modification, these models cannot be used to make predictions when a consumer's choice set differs from those observed in the training data. 2

To circumvent this problem, an econometrician may impose a distributional assumption. With this assumption, counterfactual predictions are straightforward, and the machine learning approach converges to the ARUM framework. Whereas the economist is solely responsible for identifying the elements of f(·) in the canonical framework, when using machine learning algorithms, this structure is endogenously recovered from the data.

2 Regularized versions of standard maximum likelihood models are a counterexample to this, because they involve distributional assumptions.

Once it has been recovered using historical data, the econometrician generates estimates of δ̂ for the out-of-sample data using the observable information. These are transformed into predicted probabilities for the changed environment using the assumed distribution of the error, just as in canonical implementations of the ARUM framework.

While it is clear that machine learning can easily be integrated into the ARUM framework, one still must answer the question of what distributional assumption to make in any particular context. Although ARUM frameworks have used a multiplicity of distributional assumptions on ε, the economic literature on hospital choice has almost invariably assumed that the ε are independent and identically distributed draws from the type-I extreme value distribution. While this implies that consumers with identical δs will on average exhibit similar preference patterns that are independent of irrelevant alternatives (IIA), the availability of highly detailed micro data is assumed to enable the analyst to account efficiently for heterogeneity in patients' choices. 3 Our prior assessment of many commonly applied logit models suggests this assumption is not obviously a bad one. Therefore, in our assessment of the relative performance of different machine learning models, we make the logit assumption. This implies that the probability that patient i with condition c receives care at hospital h takes the familiar form:

s_ihc = exp(δ_ihc) / Σ_{j ∈ H} exp(δ_ijc).    (2)

3 See, e.g., discussion in Ackerberg et al. (2007, p. 4185).
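Under the logit assumption, the mapping from deterministic utilities to choice probabilities, and hence to counterfactual predictions after a hospital leaves the choice set, can be sketched in a few lines of Python. The hospital names and δ values here are purely hypothetical.

```python
import math

def logit_shares(delta, choice_set):
    """Choice probabilities from equation (2): s_h = exp(delta_h) / sum_j exp(delta_j)."""
    denom = sum(math.exp(delta[j]) for j in choice_set)
    return {j: math.exp(delta[j]) / denom for j in choice_set}

# Hypothetical deterministic utilities for three hospitals (illustrative values only).
delta = {"A": 1.0, "B": 0.5, "C": 0.0}

pre = logit_shares(delta, ["A", "B", "C"])
# Counterfactual: hospital A is destroyed, so it drops out of the choice set.
post = logit_shares(delta, ["B", "C"])
```

Because of IIA, the relative shares of the surviving hospitals are unchanged: post["B"] / post["C"] equals pre["B"] / pre["C"], and the destroyed hospital's share is reallocated proportionally.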

4 Models Compared

4.1 Econometric Models of Patient Choice

In this paper, we compare the predictive performance of machine learning algorithms to three standard econometric models. One is a very rich parametric model (Inter) that performed better than the other parametric logit models in our previous work (Raval et al., 2015a). It includes interactions of hospital indicators with acuity, major diagnostic category, and time, as well as many interactions between patient characteristics and travel time.

The second is a grouping model from our previous paper: a semiparametric bin estimator (Semipar), similar to that outlined in Raval et al. (2015b), which we found to be the most accurate in most disaster settings. This model assumes one can flexibly account for consumer heterogeneity across choices by constructing small and homogeneous groups based upon a small set of patient characteristics, including zip code, age, disease acuity, and diagnosis category. It then leverages the assumption that IIA holds within groups, so that hospital choice probabilities change proportionally to the observed shares of the group when the choice set changes. In our implementation of this approach, we allow for group sizes as small as twenty, such that for some groups very few patients are used to predict substitution patterns. As discussed in Carlson et al. (2013) and Raval et al. (2015b), this flexible approach is computationally efficient despite being equivalent to including a fixed effect for each group-hospital interaction in a multinomial logit model.

The third econometric model (Indic) assumes there is no patient-level heterogeneity. In other words, everyone within the relevant area has, on average, the same preferences for each hospital. As a result, patient choices can be modeled as being proportional to aggregate market shares, and δ can be estimated using only hospital indicators as covariates. In other

words, this model could be estimated with aggregate data. As in our prior work, we use it as a reference point.

4.2 Machine Learning Models of Patient Choice

Decision Tree Models

We now examine several machine learning models that are also grouping models, like Semipar, in that they partition patients based on characteristics and estimate the same probabilities for all patients in the same group. The main difference is that Semipar defines the set of groups ex ante, while the decision tree models we examine use information on patients' choices to create the groups via sophisticated algorithms.

Estimating a decision tree model requires one to partition patients into groups. While there are many possible approaches, we examine perhaps the most popular, CART (Breiman et al., 1984). CART is a greedy algorithm: at each node, it splits the data into two groups using the split that minimizes the error criterion, and it recursively partitions the data by growing the tree through successive splits. The major advantage of the decision tree model is that it allows complex interactions between the variables considered; as mentioned above, the parametric and semiparametric logit models have to pre-specify a set of interactions.

This greedy algorithm runs the risk of overfitting the data by creating too many splits. Typically, the tree model is then pruned by removing excessive splits that likely contribute little to the out-of-sample performance of the tree; the model we estimate does so by limiting the tree to a pre-specified depth and removing splits beyond that depth. For example, we might set the depth to five and remove splits that result in more than five levels.

While decision trees are simple to understand and interpret, they are known not to

provide the best predictive power. As Breiman (2001b) notes, "While trees rate an A+ on interpretability, they are good, but not great, predictors. Give them, say, a B on prediction." Models that improve on the decision tree by averaging the predictions of many trees are known to provide much better predictive power. In this paper, we examine two such models: random forests (which Breiman (2001b) assigns an A+ for prediction) and gradient boosted trees.

A random forest model, originally due to Breiman (2001a), provides two major improvements over the basic decision tree by injecting two sources of stochasticity into the formation of trees. First, a whole forest of trees is built by estimating different tree models on bootstrap samples of the original dataset; this procedure is known as bagging. Second, the set of variables considered for splitting is random for each tree. Random forests are clearly difficult to interpret, as the model is a collection of hundreds or thousands of base decision tree models. However, they are known to perform very well for prediction. Breiman (2001b) cites considerable evidence in the machine learning literature that random forests outperform other machine learning models. In the economics literature, Bajari et al. (2015a) find that random forest models were the most accurate at predicting aggregate demand out of sample in their study.

The second approach to improving the performance of decision trees is gradient boosting (Freund and Schapire, 1995; Friedman et al., 2000; Friedman, 2001). Gradient boosting estimates the decision tree model repeatedly, with each iteration overweighting observations that were classified incorrectly in the previous iteration. For example, with linear regression, a boosting procedure would overweight observations with large residuals. The final prediction is then a weighted average across all of the different models produced.
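To fix ideas, here is a minimal sketch of squared-loss gradient boosting on a toy one-dimensional regression problem, using depth-one trees (stumps) as base learners. This illustrates the general mechanism of fitting each new tree to the current residuals, not the paper's actual GBM implementation; all data and parameter values are hypothetical.

```python
def fit_stump(x, r):
    """Best single split (depth-1 tree) minimizing squared error against residuals r."""
    best = None
    for t in sorted(set(x)):
        left = [ri for xi, ri in zip(x, r) if xi <= t]
        right = [ri for xi, ri in zip(x, r) if xi > t]
        if not left or not right:
            continue
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((ri - lm) ** 2 for ri in left) + sum((ri - rm) ** 2 for ri in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    _, t, lm, rm = best
    return lambda xi: lm if xi <= t else rm

def boost(x, y, n_trees=50, shrinkage=0.1):
    """L2 boosting: each stump fits the residuals; shrinkage scales its contribution."""
    pred = [0.0] * len(y)
    stumps = []
    for _ in range(n_trees):
        r = [yi - pi for yi, pi in zip(y, pred)]  # residuals = negative gradient of squared loss
        s = fit_stump(x, r)
        stumps.append(s)
        pred = [pi + shrinkage * s(xi) for xi, pi in zip(x, pred)]
    return lambda xi: sum(shrinkage * s(xi) for s in stumps)
```

With a step-function target, the boosted predictions approach the true values geometrically at a rate governed by the shrinkage parameter.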
Boosting can be thought of as an additive expansion in a set of elementary basis functions (in our case, trees). A shrinkage parameter scales how much each new tree adds to the overall prediction, and acts as a form of regularization. Boosting is also an extremely good prediction algorithm,

and has been called the best off-the-shelf classifier in the world (Hastie et al., 2005).

Regularization

In contrast to the grouping models, we also apply a machine learning framework to the parametric models that have been used in the literature. In particular, we use an elastic net penalized regression framework to select the most relevant variables from the set of all variables used in the parametric models described in Raval et al. (2015a). The elastic net penalty can be viewed as a weighted average of the LASSO and ridge penalties. LASSO regression shrinks coefficients towards zero and is helpful in choosing among many variables to avoid overfitting; ridge regression helps to choose between potentially highly collinear variables. Since the set of variables we are considering is both large and highly correlated, we utilize this penalized framework (Hastie et al., 2005). We use the version of elastic net regularization that was developed for the conditional logit by Reid and Tibshirani (2014). 4 In the case of the conditional logit model, the penalized objective is:

log L(β) − λ ( α Σ_{k=1}^{K} |β_k| + (1/2)(1 − α) Σ_{k=1}^{K} β_k² ),

where the first sum is the LASSO penalty, the second is the ridge penalty, and λ and α are tuning parameters. We use the clogitL1 package in R to estimate this model and cross-validate to select the tuning parameter λ. We use a value of α = 0.95.

As outlined in Belloni et al. (2011) and Belloni et al. (2012), the coefficients of a LASSO estimator can be biased towards zero. Therefore, we apply the two-step approach outlined in that paper to estimate unbiased coefficients. While that paper suggests this approach for standard LASSO penalized regression, we apply it here in the case of elastic net penalized

4 While the specific framework in that paper is different than ours, the McFadden logit model can be viewed as a special case of the one described in that article.

regression on the grounds that similar concerns likely apply.

Inputs to Machine Learning Models

Estimating any of the machine learning models requires us to set the variables used for estimation and the values of the algorithm's hyperparameters. We use nine variables from the patient characteristics available in the discharge data: the patient's zip code, disease acuity (DRG weight), the Major Diagnosis Category (MDC) of the patient's diagnosis, the patient's age, an indicator for medical vs. surgical admission, an indicator for emergency admission, an indicator for whether the patient was black, and an indicator for whether the patient was female. The first four of these variables were used in the semiparametric bin model Semipar.

For the decision tree model we estimate (Tree), we use the R package rpart (Therneau et al., 2010). The two main hyperparameters are the minimum size of any node in the tree and the number of levels of the tree. We set the minimum size of the node to 20, the same value we use for Semipar and all other tree models, and use 5-fold cross validation to set the number of levels of the tree. For the random forest model we estimate (RF), we also use 5-fold cross validation to set the number of levels of the tree. We use the R package randomForest (Liaw and Wiener, 2002), based on the original work of Breiman (2001a), to estimate the model, and set the number of trees to 500. For gradient boosting (GBM), we use the R package gbm (Ridgeway, 2006). We have not been able to cross-validate the hyperparameters of this algorithm yet; we set the number of trees to 1500 and the maximum tree depth to 10, along with the shrinkage rate. We implement all of these models using the R package caret (Kuhn, 2008).
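The elastic net penalty described in the Regularization subsection can be written as a short function; the sketch below is a direct transcription of the penalty term, with λ and β values chosen purely for illustration.

```python
def elastic_net_penalty(beta, lam, alpha=0.95):
    """Penalty added to the negative log-likelihood:
    lam * (alpha * sum_k |beta_k| + 0.5 * (1 - alpha) * sum_k beta_k^2)."""
    lasso = sum(abs(b) for b in beta)          # L1 (LASSO) part
    ridge = 0.5 * sum(b * b for b in beta)     # L2 (ridge) part
    return lam * (alpha * lasso + (1 - alpha) * ridge)
```

Setting alpha = 1 recovers a pure LASSO penalty and alpha = 0 a pure ridge penalty; the paper's choice of alpha = 0.95 puts almost all of the weight on the LASSO part while retaining a small ridge component to stabilize selection among correlated variables.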

5 Prediction

We estimate all of the models in Section 4 on data from the period before the disaster, and assess each model's predictive performance on data from the period after the disaster. Each model is thus out of sample along two dimensions: first, it is estimated on an earlier time period, and second, the choice set available to patients has changed with the disaster. The change in the choice set is crucial for seeing how well each model predicts patients' choices after a major change in market structure.

5.1 Relative Performance

We compare the relative performance of the models on their predictions of aggregate market shares, aggregate diversion ratios post-disaster, and individual hospital choices for each destroyed hospital's service area.

Aggregate Shares

A simple way to assess performance on aggregate shares is to plot the time series of predictions against observed shares. In Figure 1, we do this for the Sumter disaster for four models (Semipar, Inter, GBM, and RF) and six hospitals. The observed shares are the dotted red line. The grey dot-dash vertical line depicts the quarter of the disaster. With the disaster, Sumter Regional's market share falls from about 50 percent to zero. The Semipar, Inter, and GBM models closely track one another, as well as the actual changes in market shares for most of the remaining hospitals. For example, they all get the observed market shares for Flint River approximately correct. RF, on the other hand, makes fairly different predictions. Often it performs poorly, with predictions in the pre-period that are far from observed shares. But it does correctly predict the post-disaster share for Phoebe Putney, which the other models underpredict, as well as the market share for

Palmyra Medical at the end of the period. All of the models overpredict the share going to the outside option.

Figure 1: Aggregate Market Shares, Predicted and Observed, for Sumter. Note: The red dotted line is the observed series of market shares. The grey vertical dot-dash line depicts the quarter of the disaster.

To see if these broad patterns are general, we examine the performance of all of the models across all of the destroyed hospitals using the criterion of root mean squared error (RMSE). At the aggregate level, the RMSE is defined as:

RMSE = sqrt( (1/N_J) Σ_j (y_j − ŷ_j)² ).

Here y_j is the share of alternative j, ŷ_j the model prediction, and N_J the total number of alternatives. To look at relative differences across models, we examine the percent improvement in

Figure 2: Relative Improvement in RMSE of Aggregate Predictions. (a) Aggregate Share; (b) Aggregate Diversion Ratio. Note: Improvement is the percentage improvement in RMSE for each model over the Indic model. Parametric models are circles, semiparametric models are triangles, and machine learning models are diamonds.

RMSE for each model over the baseline of the Indic model. The Indic model provides a useful baseline, as it is a simple model that only requires data on market shares. We define the percent improvement as:

1 − RMSE_Model / RMSE_Indic.

Our results are shown in Figure 2a, which depicts the relative improvement in RMSE for each destroyed hospital's service area and model in the period after the disaster. Each row is a different destroyed hospital's service area. The models are distinguished both by color and by shape, with the parametric models as circles, the semiparametric model as a triangle, and the three machine learning tree models as diamonds.

As expected, while the decision tree model Tree does not perform badly, it is never one of the best models at prediction. It always predicts worse than Semipar, with Semipar between 0.5 and 10 percentage points better across service areas. Surprisingly, the random forest model performs much worse than Tree. It has much higher RMSE than the other models in all cases except Sumter, performing between 50 and 300 percent worse than Indic. For Sumter, however, it outperforms all of the models. This poor performance is most likely due to the way RF predicts: the random forest model does not average the probabilities across trees; instead, it constructs probabilities from each tree's class prediction. This procedure likely underweights probabilities for small hospitals, which may have a small probability of choice but never be any tree's class prediction.

Across all of the hospitals, the differences between the parametric model Inter, the semiparametric model Semipar, and the gradient boosting model GBM are relatively small. For example, Semipar ranges from 9 percentage points worse than GBM to 12 percentage points better across the hospital service areas. However, GBM is usually the best of these models.
Of these models, GBM is the best model in four cases, Semipar is the best in one case, and RF in one case, although all models underperform our baseline model, Indic, for Bellevue.
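The text attributes RF's poor share predictions to its vote-based probability aggregation. A minimal numerical sketch (the per-tree probabilities below are hypothetical) contrasts vote-share aggregation with averaging each tree's probabilities:

```python
import numpy as np

# Hypothetical per-tree probabilities for one patient over three
# hospitals (columns); each row is one tree. Hospital 2 always gets
# some probability but is never any tree's top choice.
tree_probs = np.array([
    [0.60, 0.25, 0.15],
    [0.55, 0.30, 0.15],
    [0.30, 0.55, 0.15],
    [0.50, 0.35, 0.15],
])

# Averaging the trees' probabilities keeps mass on the small hospital.
avg_probs = tree_probs.mean(axis=0)

# Vote-based aggregation, as described in the text: each tree votes for
# its highest-probability class, and a hospital's "probability" is the
# fraction of trees voting for it.
votes = tree_probs.argmax(axis=1)
vote_probs = np.bincount(votes, minlength=3) / len(tree_probs)

print(avg_probs)   # hospital 2 keeps probability 0.15
print(vote_probs)  # hospital 2 gets probability 0 under vote aggregation
```

The small hospital's choice probability is driven to exactly zero under vote aggregation, consistent with the underweighting described above.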

These results illustrate the extent to which models match the levels of consumer choice probabilities before and after a choice was eliminated. However, in many applications, the change in consumers' choice probabilities for different options after removing an object from the choice set is itself the object of interest, such as the diversion ratio referred to in Garmon (2016) and the Horizontal Merger Guidelines. Therefore, we examine the RMSE of the aggregate diversion ratio following the disaster. We define the aggregate diversion ratio for hospital j as (y_{j,1} − y_{j,0}) / y_{dest,0}, where y_{j,1} is the share of hospital j in the period after the disaster, y_{j,0} the share of hospital j before the disaster, and y_{dest,0} the share of the destroyed hospital. Assuming that all changes in market shares after the disaster are due to the closure of the destroyed hospital, the diversion ratio tells us the fraction of the destroyed hospital's patients that went to hospital j. For the New York hospitals, the denominator of the diversion ratio includes all destroyed hospitals in the choice set.

Figure 2b depicts the relative improvement in RMSE for each model over Indic. We see a very different picture for aggregate diversions. While the Tree model again slightly underperforms Semipar, the random forest model does much better on diversion ratios than it did on shares; its earlier poor performance stems primarily from missing the levels of shares. RF is the best model for aggregate diversions in three cases, and is substantially better than the next best model in all three of those cases. It only performs badly for Bellevue, where it is the worst model, performing about 50% worse than Indic. In the other three cases, GBM is the best model. Thus, the decision tree based machine learning models conclusively beat the parametric and semiparametric logit models for aggregate diversions.
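The aggregate diversion ratio defined above is straightforward to compute from pre- and post-period shares. A minimal sketch, using hypothetical share values:

```python
# Hypothetical pre- and post-disaster market shares; "destroyed" is the
# closed hospital's share.
shares_pre  = {"A": 0.50, "B": 0.30, "destroyed": 0.20}
shares_post = {"A": 0.62, "B": 0.38, "destroyed": 0.00}

def diversion_ratio(j, pre, post, destroyed="destroyed"):
    """(y_{j,1} - y_{j,0}) / y_{dest,0}: the fraction of the destroyed
    hospital's patients inferred to have gone to hospital j."""
    return (post[j] - pre[j]) / pre[destroyed]

print(round(diversion_ratio("A", shares_pre, shares_post), 6))  # 0.6
print(round(diversion_ratio("B", shares_pre, shares_post), 6))  # 0.4
```

Under the maintained assumption that all post-disaster share changes are due to the closure, the diversions across surviving hospitals sum to one.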

5.1.2 Individual Predictions

Since the shape of demand is determined by individual heterogeneous consumers, predictions of individual choice are key for assessing welfare in differentiated product markets. Figure 3 depicts the percent improvement over Indic for all models and across all of the hospitals for individual choices. We again measure model performance using RMSE, although we found in our previous paper general agreement across alternative performance metrics.5 The gradient boosting model almost always performs the best, on average, across all of our experimental settings: GBM performs the best for all of the destroyed hospitals except Sumter, for which it is second best after Inter. Across hospitals, GBM is 2 to 8 percentage points better than Semipar, and between 4 and 14 percentage points better than Indic. The other tree models Tree and RF are always clearly worse than GBM; Semipar is better than RF in five cases and better than Tree in four. Again, the Tree model is better than RF in half of the cases, indicating that the probability estimation procedure of RF may understate probabilities for hospitals that patients are relatively unlikely to visit.

In most situations, researchers will not have access to natural experiments like ours in order to assess models, but could use in-sample model performance to evaluate them. We examine whether in-sample performance can provide a good guide to out-of-sample performance in Figure 4. For each of the destroyed hospitals, we compare each model's performance for individual predictions in the period before the disaster to its performance after the disaster. The blue line is the linear best-fit line across the models. Unlike our results with parametric and semiparametric logit models, the performance of a model before the disaster is not necessarily a good guide to its performance afterwards; two of the linear relationships are not even upward sloping. The main culprit is RF, as it tends to be in the lower right corner of each figure, performing much better in the pre-period than it does in the post-period. This is consistent with the hypothesis that RF overfits the data. The GBM model, on the other hand, does not appear to be overfitting. However, all of the models do tend to do worse compared to Indic in the post-period than they did in the pre-period.

5 These alternative metrics are Mean Absolute Error, zero-one loss based on whether the patient went to the choice with the highest probability, and relative entropy (a log-likelihood based statistic).

Figure 3: Relative Improvement in RMSE of Individual Predictions. Note: Improvement is the percentage improvement in RMSE for each model over the Indic model.

Figure 4: Relative Improvement in RMSE of Individual Predictions, Post-Period vs. Pre-Period. Note: Improvement is the percentage improvement in RMSE for each model over the Indic model.

5.2 Prediction Under a Changing Environment

The above results demonstrate that, on average, the flexible machine learning models tend to predict very well after the disaster-induced change in the choice set. However, it is possible that this is because we include in our estimates many patients whose preferred hospital was unaffected by the disaster, so that the destruction of a non-preferred hospital had no impact on their choices. The greater the number of patients in our calculations that prefer a non-destroyed hospital, the more our out-of-sample validation resembles more traditional split-the-sample validation. In that environment, the models' flexibility may reflect a type of overfitting that delivers good predictions in the existing choice environment, but fails at extrapolations out of that environment.

We focus on the patients who were more likely to experience the elimination of their preferred hospital following the natural disaster by examining patients whose characteristics place them in bins with a greater share of discharges from the destroyed hospital in the pre-disaster period. We first calculate the RMSE for each bin produced by Semipar and examine how bin-level performance varies with the bin's share of the destroyed hospital; Figure 5 depicts this relationship for the Semipar model for Sumter and NYU. The size of each point is proportional to the number of patients in its bin.

Figure 5: Bin Level RMSE by Destroyed Hospital Share for the Semipar Model. (a) Sumter; (b) NYU. Note: Each point is the RMSE for a particular bin, with its size proportional to the number of patients in the bin; the blue solid line is the loess trend, weighting each bin by its number of patients.

For Sumter, the average RMSE increases as the share of the destroyed hospital increases, flattens out, and then increases again. For NYU, the average RMSE about doubles when going from the lowest pre-disaster share to the highest pre-disaster share. This pattern is intuitive: when there is a change in the choice environment, the models generally do not predict as well.

While we should not expect the models to perform as well when there is a change in the choice environment, some models may perform relatively better than others. Therefore, we examine relative performance by plotting the loess trend for each model across the bins. Figure 6 depicts these graphs for all of the hospitals. Except for Sumter, where Inter is the best model for very low shares of the destroyed hospital, GBM is always the best model for low shares of the destroyed hospital. For Moore and St. John's, GBM is the best model throughout; for the other disasters, the parametric logit model performs better at high shares of the destroyed hospital, although the share cutoff above which Inter is better is typically above 40 percent. This may explain why Inter outperforms GBM at Sumter; Sumter is the only disaster where the overall pre-disaster share of the destroyed hospital is about 50%. Except for Moore, Inter always outperforms RF and Tree.

Figure 6: Bin Level RMSE by Destroyed Hospital Share for All Models. Note: Each line is the loess trend for a different model, weighting each bin by its number of patients in the pre-period.

5.3 Model Combination

So far, we have examined the performance of each model separately. However, one major finding of machine learning is that ensembles of models can perform better than any individual model (Van der Laan et al., 2007). In this study, both GBM and RF are already combinations of hundreds of base learners and perform very well. These findings suggest that combining the predictions from multiple models may lead to better predictions of behavior than using any single model.

While there are several ways to combine models, we apply a simple regression-based approach developed in the literature on optimally combining macroeconomic forecasts (Timmermann, 2006). To apply the method to our context, we treat each patient as an observation and regress observed patient behavior on the predictions from all of the models. We constrain the coefficients on the models' predictions to be non-negative and to sum to one. Thus, each coefficient in the regression can be interpreted as a model weight, and many models will be given zero weight. We perform this analysis separately for each disaster, which enables us to see the variation in our findings across the different settings. The regression framework implicitly deals with the correlations in predictions across models: if two models are very highly correlated but one is a better predictor than the other, only the better of the two models might receive weight in the optimal model combination.

Formally, we regress each patient's choice of hospital on the predicted probabilities from all of the models in the period after the disaster, without including a constant, as below:

y_ih = β_Semipar ŷ_ih^Semipar + ... + β_RF ŷ_ih^RF + ε_ih

where y_ih is the observed choice for patient i and hospital h and ŷ_ih^Semipar is the predicted probability for patient i and hospital h from Semipar. We include all of the parametric logit models tested in Raval et al. (2015a) as well as the machine learning decision tree models tested in this paper. Table II displays the model weights from these regressions for all models with positive weight in some experiment.

We highlight three major findings. First, there is no one preferred model. Within a given disaster, no single model receives all of the weight; the largest weight any model receives is 61%.

Table II: Model Weights for Optimal Model Combination (rows: All Parametric Logit Models, RF, GBM, Tree, Semipar; columns: Sumter, Moore, NYU, Coney, Bellevue, St. John's, Average). Note: The second through seventh columns provide the model weights for the optimal model combination for each experiment's service area in the period after the disaster. The last column provides the average weight for each model across the different experiments.

On average, two of the machine learning models receive about three-quarters of the weight, with GBM receiving almost half, at 48%, and RF receiving 23%. All of the parametric logit models combined receive about one-fourth of the weight. Thus, there appears to be a role for grading models like the parametric logits as well as grouping models like GBM and RF in optimal prediction. Second, the Semipar model appears to be dominated by the decision tree models; it receives zero weight on average, and its highest share is 2% for one service area. Since Semipar is a grouping model like the decision tree models, it appears not to provide extra information for prediction once the machine learning models are included. Third, the one disaster where the parametric logit models receive a majority of the weight (54%) is Sumter. For Sumter, the destroyed hospital had about half of the market share in the service area. For groups in which the destroyed hospital has a large share, choice probabilities will be based upon data from only a few individuals and so will have high variance. Grouping models like RF and GBM may not perform as well as grading models, which are more global and so use data on all patients in the market. Thus, machine learning models may underperform traditional parametric logit models in cases where the change in the choice environment is very large.
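The constrained combination regression described above can be illustrated in the two-model special case, where the non-negative, sum-to-one constraint reduces to a single weight w in [0, 1] with a closed-form solution. This is a minimal sketch with simulated, entirely hypothetical choice data, not the paper's actual estimation:

```python
import numpy as np

# Simulated data: y stacks 0/1 choice indicators; p1 and p2 are two
# models' predicted choice probabilities (all values hypothetical).
rng = np.random.default_rng(0)
p_true = rng.uniform(0.1, 0.9, size=500)
y = (rng.uniform(size=500) < p_true).astype(float)
p1 = np.clip(p_true + rng.normal(0.0, 0.05, size=500), 0.0, 1.0)  # accurate model
p2 = np.clip(p_true + rng.normal(0.0, 0.25, size=500), 0.0, 1.0)  # noisy model

# Minimizing ||y - (w*p1 + (1-w)*p2)||^2 subject to 0 <= w <= 1 has a
# closed form: project the unconstrained least-squares optimum onto [0, 1].
d = p1 - p2
w = float(np.clip(np.dot(y - p2, d) / np.dot(d, d), 0.0, 1.0))

# Because w = 0 and w = 1 are feasible, the in-sample fit of the
# combination is never worse than either model alone.
def sse(p):
    return float(np.sum((y - p) ** 2))

print(w)  # estimated weight on the first (more accurate) model
```

With more than two models, the same idea requires a constrained quadratic program over the simplex, but the interpretation of the coefficients as model weights is unchanged.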

6 Conclusion

TBD

References

Ackerberg, Daniel, C. Lanier Benkard, Steven Berry, and Ariel Pakes, "Econometric Tools for Analyzing Market Outcomes," Handbook of Econometrics, 2007, 6.
Athey, Susan, "Machine Learning and Causal Inference for Policy Evaluation," in Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, 2015.
Athey, Susan and Guido Imbens, "The State of Applied Econometrics: Causality and Policy Evaluation," arXiv preprint.
Bajari, Patrick, Denis Nekipelov, Stephen P. Ryan, and Miaoyu Yang, "Demand Estimation with Machine Learning and Model Combination," Technical Report, National Bureau of Economic Research, 2015.
Bajari, Patrick, Denis Nekipelov, Stephen P. Ryan, and Miaoyu Yang, "Machine Learning Methods for Demand Estimation," The American Economic Review, 2015, 105 (5).
Belloni, Alexandre, Daniel Chen, Victor Chernozhukov, and Christian Hansen, "Sparse Models and Methods for Optimal Instruments with an Application to Eminent Domain," Econometrica, 2012, 80 (6).
Belloni, Alexandre, Victor Chernozhukov, and Christian Hansen, "LASSO Methods for Gaussian Instrumental Variables Models," MIT Department of Economics Working Paper.
Breiman, Leo, "Random Forests," Machine Learning, 2001, 45 (1), 5-32.
Breiman, Leo, "Statistical Modeling: The Two Cultures," Statistical Science, 2001, 16 (3).
Breiman, Leo, Jerome Friedman, Charles J. Stone, and R.A. Olshen, Classification and Regression Trees, Chapman and Hall.
Capps, Cory, David Dranove, and Mark Satterthwaite, "Competition and Market Power in Option Demand Markets," RAND Journal of Economics, 2003, 34 (4).
Carlson, Julie A., Leemore S. Dafny, Beth A. Freeborn, Pauline M. Ippolito, and Brett W. Wendling, "Economics at the FTC: Physician Acquisitions, Standard Essential Patents, and Accuracy of Credit Reporting," Review of Industrial Organization, 2013, 43 (4).
Ciliberto, Federico and David Dranove, "The Effect of Physician-Hospital Affiliations on Hospital Prices in California," Journal of Health Economics, 2006, 25 (1).
Van der Laan, Mark J., Eric C. Polley, and Alan E. Hubbard, "Super Learner," Statistical Applications in Genetics and Molecular Biology, 2007, 6 (1).
Freund, Yoav and Robert E. Schapire, "A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting," in European Conference on Computational Learning Theory, Springer, 1995.
Friedman, Jerome H., "Greedy Function Approximation: A Gradient Boosting Machine," Annals of Statistics, 2001.
Friedman, Jerome, Trevor Hastie, and Robert Tibshirani, "Additive Logistic Regression: A Statistical View of Boosting," The Annals of Statistics, 2000, 28 (2).
Garmon, Christopher, "The Accuracy of Hospital Merger Screening Methods," mimeo, 2016.
Gilchrist, Duncan Sheppard and Emily Glassberg Sands, "Something to Talk About: Social Spillovers in Movie Consumption," Journal of Political Economy, forthcoming.
Goel, Sharad, Justin M. Rao, and Ravi Shroff, "Personalized Risk Assessments in the Criminal Justice System," American Economic Review, May 2016, 106 (5).
Goel, Sharad, Justin M. Rao, and Ravi Shroff, "Precinct or Prejudice? Understanding Racial Disparities in New York City's Stop-and-Frisk Policy," Annals of Applied Statistics, 2016, 10 (1).
Hastie, Trevor, Robert Tibshirani, Jerome Friedman, and James Franklin, "The Elements of Statistical Learning: Data Mining, Inference and Prediction," The Mathematical Intelligencer, 2005, 27 (2).
James, Gareth, Daniela Witten, Trevor Hastie, and Robert Tibshirani, An Introduction to Statistical Learning, Vol. 112, Springer.
Kalouptsidi, Myrto, "Time to Build and Fluctuations in Bulk Shipping," The American Economic Review, 2014, 104 (2).
Kleinberg, Jon, Jens Ludwig, Sendhil Mullainathan, and Ziad Obermeyer, "Prediction Policy Problems," The American Economic Review, 2015, 105 (5).
Kuhn, Max, "Caret Package," Journal of Statistical Software, 2008, 28 (5).
Liaw, Andy and Matthew Wiener, "Classification and Regression by randomForest," R News, 2002, 2 (3).
Marschak, Jacob, "Economic Measurements for Policy and Prediction," in Economic Information, Decision, and Prediction, Springer, 1974.
McFadden, Daniel, "Econometric Models of Probabilistic Choice," in Daniel McFadden and Charles F. Manski, eds., Structural Analysis of Discrete Data and Econometric Applications, Cambridge: The MIT Press.
Nevo, Aviv and Michael D. Whinston, "Taking the Dogma out of Econometrics: Structural Modeling and Credible Inference," The Journal of Economic Perspectives, 2010.
Raval, Devesh, Ted Rosenbaum, and Nathan E. Wilson, "Industrial Reorganization: Learning about Patient Substitution Patterns from Natural Experiments," mimeo, 2015.
Raval, Devesh, Ted Rosenbaum, and Steven A. Tenn, "A Semiparametric Discrete Choice Model: An Application to Hospital Mergers," mimeo.
Reid, Stephen and Rob Tibshirani, "Regularization Paths for Conditional Logistic Regression: The clogitL1 Package," Journal of Statistical Software, 2014, 58 (12).
Ridgeway, Greg, "gbm: Generalized Boosted Regression Models," R package version, 2006, 1 (3).
Therneau, Terry M., Beth Atkinson, and Brian Ripley, "rpart: Recursive Partitioning," R package version, 2010, 3.
Timmermann, Allan, "Forecast Combinations," Handbook of Economic Forecasting, 2006, 1.
Varian, Hal R., "Big Data: New Tricks for Econometrics," The Journal of Economic Perspectives, 2014, 28 (2).
Wager, Stefan and Susan Athey, "Estimation and Inference of Heterogeneous Treatment Effects Using Random Forests," arXiv preprint.

A Disaster Timelines

In this section, we give brief narrative descriptions of the destruction in the areas surrounding the destroyed hospitals.

A.1 St. John's (Northridge Earthquake)

On January 17th, 1994, an earthquake rated 6.7 on the Richter scale hit the Los Angeles metropolitan area, 32 km northwest of Los Angeles. The earthquake killed 61 people, injured 9,000, and seriously damaged 30,000 homes. According to the USGS, the neighborhoods worst affected by the earthquake were the San Fernando Valley, Northridge, and Sherman Oaks, while Fillmore, Glendale, Santa Clarita, Santa Monica, Simi Valley, and western and central Los Angeles also suffered significant damage. Over 1,600 housing units in Santa Monica alone were damaged, at a total cost of $70 million. The earthquake damaged a number of the area's major highways; in our service area, the most important was I-10 (the Santa Monica Freeway), which passes through Santa Monica. It reopened on April 11, 1994, and by the same time many of those with damaged houses had found new housing. Santa Monica Hospital, located close to St. John's, remained open but at a reduced capacity of 178 beds, compared to 298 beds before the disaster. In July 1995, Santa Monica Hospital merged with UCLA Medical Center. St. John's reopened for inpatient services on October 3, 1994, although with only about half of its employees and inpatient beds and without its North Wing (which was razed).

Figure 7: Damage Map in Los Angeles, CA. Note: Darker green areas indicate greater earthquake intensity as measured by the Modified Mercalli Intensity (MMI); an MMI value of 7 reflects non-structural damage and a value of 8 moderate structural damage. Areas with an MMI below 7 are not colored. The zip codes included in the service area are outlined in pink. Sources: USGS ShakeMap, OSHPD Discharge Data.

A.2 Sumter (Americus Tornado)

On March 1, 2007, a tornado went through the center of the town of Americus, GA, damaging 993 houses and 217 businesses. The tornado also completely destroyed Sumter Regional Hospital. An inspection of the damage map in the text and GIS maps of destroyed structures suggests that the damage was relatively localized: the northwest part of the city was not damaged, and very few people in the service area outside of the town of Americus were affected. Despite the tornado, employment remained roughly constant in the Americus Micropolitan Statistical Area after the disaster, at 15,628 in February 2007 before the disaster and 15,551 in February 2008 one year later. While Sumter Regional slowly reintroduced some services such as urgent care, it did not reopen for inpatient admissions until April 1, 2008, in a temporary facility with 76 beds and 71,000 square feet of space. Sumter Regional subsequently merged with Phoebe Putney Hospital in October 2008, with the full merger completed on July 1. In December 2011, a new facility was built with 76 beds and 183,000 square feet of space.

A.3 NYU, Bellevue, and Coney Island (Superstorm Sandy)

Superstorm Sandy hit the New York metropolitan area on October 28th-29th, 2012. The storm caused severe localized damage and flooding, shut down the New York City Subway system, and caused many people in the area to lose electrical power. By November 5th, normal service had been restored on the subways (with minor exceptions). Major bridges reopened on October 30th

and NYC schools reopened on November 5th. By November 5th, power had been restored to 70 percent of New Yorkers, and to all New Yorkers by November 15th. FEMA damage inspection data reveal that most of the damage from Sandy occurred in areas adjacent to water. Manhattan was relatively unaffected, with even areas next to the water suffering little damage. In the Coney Island area, the tip of the island suffered more damage, but even there, most block groups suffered less than 50 percent damage. Areas on the Long Island Sound farther east of Coney Island, such as Long Beach, were much more affected.

NYU Langone Medical Center suffered about $1 billion in damage due to Sandy, with its main generators flooded. While some outpatient services reopened in early November, it only partially reopened inpatient services on December 27, 2012, including some surgical services and medical and surgical intensive care. The maternity unit and pediatrics reopened on January 14th, 2013. While NYU Langone opened an urgent care center on January 17, 2013, a true emergency room did not open until April 24, 2014, more than a year later. Bellevue Hospital Center reopened limited outpatient services on November 19th, 2012. However, Bellevue did not fully reopen inpatient services until February 7th, 2013. Coney Island Hospital opened an urgent care center by December 3, 2012, but patients were not admitted as inpatients. It had reopened ambulance service and most of its inpatient beds by February 20th, 2013, although at that time trauma care and labor and delivery remained closed. The labor and delivery unit did not reopen until June 13th, 2013.

A.4 Moore (Moore Tornado)

A tornado went through the Oklahoma City suburb of Moore on May 20, 2013. The tornado destroyed two schools and more than 1,000 buildings (damaging more than 1,200 more) in the area of Moore and killed 24 people. Interstate 35 was briefly closed for a few hours due to the storm. Maps of the tornado's path demonstrate that while some areas were severely damaged, nearby areas were relatively unaffected.

Figure 8: Damage Map in Manhattan, NY. Note: Green shading indicates flood-affected areas. The zip codes included in the service area for Bellevue are outlined in pink. Sources: FEMA, NY Discharge Data.

Figure 9: Damage Map in Coney Island, NY. Note: Green shading indicates flood-affected areas. The zip codes included in the service area are outlined in pink. Sources: FEMA, NY Discharge Data.

Figure 10: Damage Map in Moore, OK. Note: The green area indicates the damage path of the tornado. The zip codes included in the service area are outlined in pink. Sources: NOAA, OK Discharge Data.

Emergency services, but not inpatient admissions, temporarily reopened at Moore Medical Center on December 2, 2013. Groundbreaking for a new hospital took place on May 20, 2014, with a tentative opening in the fall.

B Dataset Construction

For each dataset, we drop newborns, transfers, and court-ordered admissions. Newborns do not decide which hospital to be born in (admissions of their mothers, who do, are included in the dataset); similarly, government officials or physicians, rather than patients, may choose the hospital for court-ordered admissions and transfers. We drop diseases of the eye, psychological diseases, and rehabilitation based on Major Diagnostic Category (MDC) codes, as patients with these diseases may have other options for treatment beyond general hospitals. We also drop patients whose MDC code is uncategorized (0), and neo-natal patients above age one. We also exclude patients who are missing gender or an indicator for whether the admission is for a Medical Diagnosis Related Group (DRG). Finally, we remove patients not going to General Acute Care hospitals.

For each disaster, we estimate models on the period prior to the disaster and then validate them on the period after the disaster. We omit the month of the disaster from either period.
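The exclusion rules described in this appendix can be sketched as a simple record filter. The field names and MDC codes below are hypothetical stand-ins for the actual discharge-data variables:

```python
# Illustrative sketch of the sample-exclusion rules; field names and
# the specific MDC codes are assumptions, not the paper's actual codes.
EXCLUDED_ADMISSION_TYPES = {"newborn", "transfer", "court_ordered"}
EXCLUDED_MDC = {0, 2, 19, 23}  # uncategorized, eye, psychological, rehab (codes assumed)
NEONATAL_MDC = 15              # code assumed

def keep_patient(p):
    """Return True if a discharge record stays in the estimation sample."""
    if p["admission_type"] in EXCLUDED_ADMISSION_TYPES:
        return False
    if p["mdc"] in EXCLUDED_MDC:
        return False
    if p["mdc"] == NEONATAL_MDC and p["age"] > 1:  # neo-natal patients above age one
        return False
    if p["gender"] is None or p["medical_drg"] is None:  # missing fields
        return False
    if not p["general_acute_care"]:  # not a General Acute Care hospital
        return False
    return True

patients = [
    {"admission_type": "routine", "mdc": 5, "age": 40, "gender": "F",
     "medical_drg": True, "general_acute_care": True},
    {"admission_type": "transfer", "mdc": 5, "age": 40, "gender": "M",
     "medical_drg": True, "general_acute_care": True},
]
print([keep_patient(p) for p in patients])  # [True, False]
```

The order of the checks does not matter here; each rule independently removes a record from the sample.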


More information

Comparison of discrimination methods for the classification of tumors using gene expression data

Comparison of discrimination methods for the classification of tumors using gene expression data Comparison of discrimination methods for the classification of tumors using gene expression data Sandrine Dudoit, Jane Fridlyand 2 and Terry Speed 2,. Mathematical Sciences Research Institute, Berkeley

More information

THE USE OF MULTIVARIATE ANALYSIS IN DEVELOPMENT THEORY: A CRITIQUE OF THE APPROACH ADOPTED BY ADELMAN AND MORRIS A. C. RAYNER

THE USE OF MULTIVARIATE ANALYSIS IN DEVELOPMENT THEORY: A CRITIQUE OF THE APPROACH ADOPTED BY ADELMAN AND MORRIS A. C. RAYNER THE USE OF MULTIVARIATE ANALYSIS IN DEVELOPMENT THEORY: A CRITIQUE OF THE APPROACH ADOPTED BY ADELMAN AND MORRIS A. C. RAYNER Introduction, 639. Factor analysis, 639. Discriminant analysis, 644. INTRODUCTION

More information

Analysis of Rheumatoid Arthritis Data using Logistic Regression and Penalized Approach

Analysis of Rheumatoid Arthritis Data using Logistic Regression and Penalized Approach University of South Florida Scholar Commons Graduate Theses and Dissertations Graduate School November 2015 Analysis of Rheumatoid Arthritis Data using Logistic Regression and Penalized Approach Wei Chen

More information

Russian Journal of Agricultural and Socio-Economic Sciences, 3(15)

Russian Journal of Agricultural and Socio-Economic Sciences, 3(15) ON THE COMPARISON OF BAYESIAN INFORMATION CRITERION AND DRAPER S INFORMATION CRITERION IN SELECTION OF AN ASYMMETRIC PRICE RELATIONSHIP: BOOTSTRAP SIMULATION RESULTS Henry de-graft Acquah, Senior Lecturer

More information

KARUN ADUSUMILLI OFFICE ADDRESS, TELEPHONE & Department of Economics

KARUN ADUSUMILLI OFFICE ADDRESS, TELEPHONE &   Department of Economics LONDON SCHOOL OF ECONOMICS & POLITICAL SCIENCE Placement Officer: Professor Wouter Den Haan +44 (0)20 7955 7669 w.denhaan@lse.ac.uk Placement Assistant: Mr John Curtis +44 (0)20 7955 7545 j.curtis@lse.ac.uk

More information

Citation for published version (APA): Ebbes, P. (2004). Latent instrumental variables: a new approach to solve for endogeneity s.n.

Citation for published version (APA): Ebbes, P. (2004). Latent instrumental variables: a new approach to solve for endogeneity s.n. University of Groningen Latent instrumental variables Ebbes, P. IMPORTANT NOTE: You are advised to consult the publisher's version (publisher's PDF) if you wish to cite from it. Please check the document

More information

Sawtooth Software. MaxDiff Analysis: Simple Counting, Individual-Level Logit, and HB RESEARCH PAPER SERIES. Bryan Orme, Sawtooth Software, Inc.

Sawtooth Software. MaxDiff Analysis: Simple Counting, Individual-Level Logit, and HB RESEARCH PAPER SERIES. Bryan Orme, Sawtooth Software, Inc. Sawtooth Software RESEARCH PAPER SERIES MaxDiff Analysis: Simple Counting, Individual-Level Logit, and HB Bryan Orme, Sawtooth Software, Inc. Copyright 009, Sawtooth Software, Inc. 530 W. Fir St. Sequim,

More information

The Logic of Data Analysis Using Statistical Techniques M. E. Swisher, 2016

The Logic of Data Analysis Using Statistical Techniques M. E. Swisher, 2016 The Logic of Data Analysis Using Statistical Techniques M. E. Swisher, 2016 This course does not cover how to perform statistical tests on SPSS or any other computer program. There are several courses

More information

A Comparison of Collaborative Filtering Methods for Medication Reconciliation

A Comparison of Collaborative Filtering Methods for Medication Reconciliation A Comparison of Collaborative Filtering Methods for Medication Reconciliation Huanian Zheng, Rema Padman, Daniel B. Neill The H. John Heinz III College, Carnegie Mellon University, Pittsburgh, PA, 15213,

More information

Some Thoughts on the Principle of Revealed Preference 1

Some Thoughts on the Principle of Revealed Preference 1 Some Thoughts on the Principle of Revealed Preference 1 Ariel Rubinstein School of Economics, Tel Aviv University and Department of Economics, New York University and Yuval Salant Graduate School of Business,

More information

RISK PREDICTION MODEL: PENALIZED REGRESSIONS

RISK PREDICTION MODEL: PENALIZED REGRESSIONS RISK PREDICTION MODEL: PENALIZED REGRESSIONS Inspired from: How to develop a more accurate risk prediction model when there are few events Menelaos Pavlou, Gareth Ambler, Shaun R Seaman, Oliver Guttmann,

More information

arxiv: v3 [stat.ml] 27 Mar 2018

arxiv: v3 [stat.ml] 27 Mar 2018 ATTACKING THE MADRY DEFENSE MODEL WITH L 1 -BASED ADVERSARIAL EXAMPLES Yash Sharma 1 and Pin-Yu Chen 2 1 The Cooper Union, New York, NY 10003, USA 2 IBM Research, Yorktown Heights, NY 10598, USA sharma2@cooper.edu,

More information

Part [2.1]: Evaluation of Markers for Treatment Selection Linking Clinical and Statistical Goals

Part [2.1]: Evaluation of Markers for Treatment Selection Linking Clinical and Statistical Goals Part [2.1]: Evaluation of Markers for Treatment Selection Linking Clinical and Statistical Goals Patrick J. Heagerty Department of Biostatistics University of Washington 174 Biomarkers Session Outline

More information

An Introduction to Bayesian Statistics

An Introduction to Bayesian Statistics An Introduction to Bayesian Statistics Robert Weiss Department of Biostatistics UCLA Fielding School of Public Health robweiss@ucla.edu Sept 2015 Robert Weiss (UCLA) An Introduction to Bayesian Statistics

More information

Estimating population average treatment effects from experiments with noncompliance

Estimating population average treatment effects from experiments with noncompliance Estimating population average treatment effects from experiments with noncompliance Kellie Ottoboni Jason Poulos arxiv:1901.02991v1 [stat.me] 10 Jan 2019 January 11, 2019 Abstract This paper extends a

More information

UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2014

UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2014 UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2014 Exam policy: This exam allows two one-page, two-sided cheat sheets (i.e. 4 sides); No other materials. Time: 2 hours. Be sure to write

More information

The Prevalence of HIV in Botswana

The Prevalence of HIV in Botswana The Prevalence of HIV in Botswana James Levinsohn Yale University and NBER Justin McCrary University of California, Berkeley and NBER January 6, 2010 Abstract This paper implements five methods to correct

More information

Progress in Risk Science and Causality

Progress in Risk Science and Causality Progress in Risk Science and Causality Tony Cox, tcoxdenver@aol.com AAPCA March 27, 2017 1 Vision for causal analytics Represent understanding of how the world works by an explicit causal model. Learn,

More information

An Improved Algorithm To Predict Recurrence Of Breast Cancer

An Improved Algorithm To Predict Recurrence Of Breast Cancer An Improved Algorithm To Predict Recurrence Of Breast Cancer Umang Agrawal 1, Ass. Prof. Ishan K Rajani 2 1 M.E Computer Engineer, Silver Oak College of Engineering & Technology, Gujarat, India. 2 Assistant

More information

Positive and Unlabeled Relational Classification through Label Frequency Estimation

Positive and Unlabeled Relational Classification through Label Frequency Estimation Positive and Unlabeled Relational Classification through Label Frequency Estimation Jessa Bekker and Jesse Davis Computer Science Department, KU Leuven, Belgium firstname.lastname@cs.kuleuven.be Abstract.

More information

Positive and Unlabeled Relational Classification through Label Frequency Estimation

Positive and Unlabeled Relational Classification through Label Frequency Estimation Positive and Unlabeled Relational Classification through Label Frequency Estimation Jessa Bekker and Jesse Davis Computer Science Department, KU Leuven, Belgium firstname.lastname@cs.kuleuven.be Abstract.

More information

Write your identification number on each paper and cover sheet (the number stated in the upper right hand corner on your exam cover).

Write your identification number on each paper and cover sheet (the number stated in the upper right hand corner on your exam cover). STOCKHOLM UNIVERSITY Department of Economics Course name: Empirical methods 2 Course code: EC2402 Examiner: Per Pettersson-Lidbom Number of credits: 7,5 credits Date of exam: Sunday 21 February 2010 Examination

More information

White Paper Estimating Complex Phenotype Prevalence Using Predictive Models

White Paper Estimating Complex Phenotype Prevalence Using Predictive Models White Paper 23-12 Estimating Complex Phenotype Prevalence Using Predictive Models Authors: Nicholas A. Furlotte Aaron Kleinman Robin Smith David Hinds Created: September 25 th, 2015 September 25th, 2015

More information

3. Model evaluation & selection

3. Model evaluation & selection Foundations of Machine Learning CentraleSupélec Fall 2016 3. Model evaluation & selection Chloé-Agathe Azencot Centre for Computational Biology, Mines ParisTech chloe-agathe.azencott@mines-paristech.fr

More information

Observational Category Learning as a Path to More Robust Generative Knowledge

Observational Category Learning as a Path to More Robust Generative Knowledge Observational Category Learning as a Path to More Robust Generative Knowledge Kimery R. Levering (kleveri1@binghamton.edu) Kenneth J. Kurtz (kkurtz@binghamton.edu) Department of Psychology, Binghamton

More information

Statistical Techniques. Masoud Mansoury and Anas Abulfaraj

Statistical Techniques. Masoud Mansoury and Anas Abulfaraj Statistical Techniques Masoud Mansoury and Anas Abulfaraj What is Statistics? https://www.youtube.com/watch?v=lmmzj7599pw The definition of Statistics The practice or science of collecting and analyzing

More information

Score Tests of Normality in Bivariate Probit Models

Score Tests of Normality in Bivariate Probit Models Score Tests of Normality in Bivariate Probit Models Anthony Murphy Nuffield College, Oxford OX1 1NF, UK Abstract: A relatively simple and convenient score test of normality in the bivariate probit model

More information

Chapter 11: Advanced Remedial Measures. Weighted Least Squares (WLS)

Chapter 11: Advanced Remedial Measures. Weighted Least Squares (WLS) Chapter : Advanced Remedial Measures Weighted Least Squares (WLS) When the error variance appears nonconstant, a transformation (of Y and/or X) is a quick remedy. But it may not solve the problem, or it

More information

Obesity and health care costs: Some overweight considerations

Obesity and health care costs: Some overweight considerations Obesity and health care costs: Some overweight considerations Albert Kuo, Ted Lee, Querida Qiu, Geoffrey Wang May 14, 2015 Abstract This paper investigates obesity s impact on annual medical expenditures

More information

Instrumental Variables Estimation: An Introduction

Instrumental Variables Estimation: An Introduction Instrumental Variables Estimation: An Introduction Susan L. Ettner, Ph.D. Professor Division of General Internal Medicine and Health Services Research, UCLA The Problem The Problem Suppose you wish to

More information

Detecting Anomalous Patterns of Care Using Health Insurance Claims

Detecting Anomalous Patterns of Care Using Health Insurance Claims Partially funded by National Science Foundation grants IIS-0916345, IIS-0911032, and IIS-0953330, and funding from Disruptive Health Technology Institute. We are also grateful to Highmark Health for providing

More information

Sawtooth Software. The Number of Levels Effect in Conjoint: Where Does It Come From and Can It Be Eliminated? RESEARCH PAPER SERIES

Sawtooth Software. The Number of Levels Effect in Conjoint: Where Does It Come From and Can It Be Eliminated? RESEARCH PAPER SERIES Sawtooth Software RESEARCH PAPER SERIES The Number of Levels Effect in Conjoint: Where Does It Come From and Can It Be Eliminated? Dick Wittink, Yale University Joel Huber, Duke University Peter Zandan,

More information

Glossary From Running Randomized Evaluations: A Practical Guide, by Rachel Glennerster and Kudzai Takavarasha

Glossary From Running Randomized Evaluations: A Practical Guide, by Rachel Glennerster and Kudzai Takavarasha Glossary From Running Randomized Evaluations: A Practical Guide, by Rachel Glennerster and Kudzai Takavarasha attrition: When data are missing because we are unable to measure the outcomes of some of the

More information

Testing the Predictability of Consumption Growth: Evidence from China

Testing the Predictability of Consumption Growth: Evidence from China Auburn University Department of Economics Working Paper Series Testing the Predictability of Consumption Growth: Evidence from China Liping Gao and Hyeongwoo Kim Georgia Southern University and Auburn

More information

Model reconnaissance: discretization, naive Bayes and maximum-entropy. Sanne de Roever/ spdrnl

Model reconnaissance: discretization, naive Bayes and maximum-entropy. Sanne de Roever/ spdrnl Model reconnaissance: discretization, naive Bayes and maximum-entropy Sanne de Roever/ spdrnl December, 2013 Description of the dataset There are two datasets: a training and a test dataset of respectively

More information

Appendix III Individual-level analysis

Appendix III Individual-level analysis Appendix III Individual-level analysis Our user-friendly experimental interface makes it possible to present each subject with many choices in the course of a single experiment, yielding a rich individual-level

More information

Hospital Readmission Ratio

Hospital Readmission Ratio Methodological paper Hospital Readmission Ratio Methodological report of 2015 model 2017 Jan van der Laan Corine Penning Agnes de Bruin CBS Methodological paper 2017 1 Index 1. Introduction 3 1.1 Indicators

More information

Quasi-experimental analysis Notes for "Structural modelling".

Quasi-experimental analysis Notes for Structural modelling. Quasi-experimental analysis Notes for "Structural modelling". Martin Browning Department of Economics, University of Oxford Revised, February 3 2012 1 Quasi-experimental analysis. 1.1 Modelling using quasi-experiments.

More information

Computer Age Statistical Inference. Algorithms, Evidence, and Data Science. BRADLEY EFRON Stanford University, California

Computer Age Statistical Inference. Algorithms, Evidence, and Data Science. BRADLEY EFRON Stanford University, California Computer Age Statistical Inference Algorithms, Evidence, and Data Science BRADLEY EFRON Stanford University, California TREVOR HASTIE Stanford University, California ggf CAMBRIDGE UNIVERSITY PRESS Preface

More information

Advanced Bayesian Models for the Social Sciences. TA: Elizabeth Menninga (University of North Carolina, Chapel Hill)

Advanced Bayesian Models for the Social Sciences. TA: Elizabeth Menninga (University of North Carolina, Chapel Hill) Advanced Bayesian Models for the Social Sciences Instructors: Week 1&2: Skyler J. Cranmer Department of Political Science University of North Carolina, Chapel Hill skyler@unc.edu Week 3&4: Daniel Stegmueller

More information

Following in Your Father s Footsteps: A Note on the Intergenerational Transmission of Income between Twin Fathers and their Sons

Following in Your Father s Footsteps: A Note on the Intergenerational Transmission of Income between Twin Fathers and their Sons D I S C U S S I O N P A P E R S E R I E S IZA DP No. 5990 Following in Your Father s Footsteps: A Note on the Intergenerational Transmission of Income between Twin Fathers and their Sons Vikesh Amin Petter

More information

PARTIAL IDENTIFICATION OF PROBABILITY DISTRIBUTIONS. Charles F. Manski. Springer-Verlag, 2003

PARTIAL IDENTIFICATION OF PROBABILITY DISTRIBUTIONS. Charles F. Manski. Springer-Verlag, 2003 PARTIAL IDENTIFICATION OF PROBABILITY DISTRIBUTIONS Charles F. Manski Springer-Verlag, 2003 Contents Preface vii Introduction: Partial Identification and Credible Inference 1 1 Missing Outcomes 6 1.1.

More information

Rise of the Machines

Rise of the Machines Rise of the Machines Statistical machine learning for observational studies: confounding adjustment and subgroup identification Armand Chouzy, ETH (summer intern) Jason Wang, Celgene PSI conference 2018

More information

Challenges of Automated Machine Learning on Causal Impact Analytics for Policy Evaluation

Challenges of Automated Machine Learning on Causal Impact Analytics for Policy Evaluation Challenges of Automated Machine Learning on Causal Impact Analytics for Policy Evaluation Prof. (Dr.) Yuh-Jong Hu and Shu-Wei Huang hu@cs.nccu.edu.tw, wei.90211@gmail.com Emerging Network Technology (ENT)

More information

Exploring the Influence of Particle Filter Parameters on Order Effects in Causal Learning

Exploring the Influence of Particle Filter Parameters on Order Effects in Causal Learning Exploring the Influence of Particle Filter Parameters on Order Effects in Causal Learning Joshua T. Abbott (joshua.abbott@berkeley.edu) Thomas L. Griffiths (tom griffiths@berkeley.edu) Department of Psychology,

More information

Hierarchical Bayesian Modeling of Individual Differences in Texture Discrimination

Hierarchical Bayesian Modeling of Individual Differences in Texture Discrimination Hierarchical Bayesian Modeling of Individual Differences in Texture Discrimination Timothy N. Rubin (trubin@uci.edu) Michael D. Lee (mdlee@uci.edu) Charles F. Chubb (cchubb@uci.edu) Department of Cognitive

More information

Study of cigarette sales in the United States Ge Cheng1, a,

Study of cigarette sales in the United States Ge Cheng1, a, 2nd International Conference on Economics, Management Engineering and Education Technology (ICEMEET 2016) 1Department Study of cigarette sales in the United States Ge Cheng1, a, of pure mathematics and

More information

NEW METHODS FOR SENSITIVITY TESTS OF EXPLOSIVE DEVICES

NEW METHODS FOR SENSITIVITY TESTS OF EXPLOSIVE DEVICES NEW METHODS FOR SENSITIVITY TESTS OF EXPLOSIVE DEVICES Amit Teller 1, David M. Steinberg 2, Lina Teper 1, Rotem Rozenblum 2, Liran Mendel 2, and Mordechai Jaeger 2 1 RAFAEL, POB 2250, Haifa, 3102102, Israel

More information

PRINCIPLES OF EFFECTIVE MACHINE LEARNING APPLICATIONS IN REAL-WORLD EVIDENCE

PRINCIPLES OF EFFECTIVE MACHINE LEARNING APPLICATIONS IN REAL-WORLD EVIDENCE PRINCIPLES OF EFFECTIVE MACHINE LEARNING APPLICATIONS IN REAL-WORLD EVIDENCE Prepared and Presented by: Gorana Capkun-Niggli, PhD, Global Head of Innovation, Health Economics and Outcomes Research, Novartis,

More information

Chapter 1. Introduction

Chapter 1. Introduction Chapter 1 Introduction 1.1 Motivation and Goals The increasing availability and decreasing cost of high-throughput (HT) technologies coupled with the availability of computational tools and data form a

More information

Supplementary Materials

Supplementary Materials Supplementary Materials July 2, 2015 1 EEG-measures of consciousness Table 1 makes explicit the abbreviations of the EEG-measures. Their computation closely follows Sitt et al. (2014) (supplement). PE

More information

Measurement and meaningfulness in Decision Modeling

Measurement and meaningfulness in Decision Modeling Measurement and meaningfulness in Decision Modeling Brice Mayag University Paris Dauphine LAMSADE FRANCE Chapter 2 Brice Mayag (LAMSADE) Measurement theory and meaningfulness Chapter 2 1 / 47 Outline 1

More information

Practical propensity score matching: a reply to Smith and Todd

Practical propensity score matching: a reply to Smith and Todd Journal of Econometrics 125 (2005) 355 364 www.elsevier.com/locate/econbase Practical propensity score matching: a reply to Smith and Todd Rajeev Dehejia a,b, * a Department of Economics and SIPA, Columbia

More information

International Journal of Pharma and Bio Sciences A NOVEL SUBSET SELECTION FOR CLASSIFICATION OF DIABETES DATASET BY ITERATIVE METHODS ABSTRACT

International Journal of Pharma and Bio Sciences A NOVEL SUBSET SELECTION FOR CLASSIFICATION OF DIABETES DATASET BY ITERATIVE METHODS ABSTRACT Research Article Bioinformatics International Journal of Pharma and Bio Sciences ISSN 0975-6299 A NOVEL SUBSET SELECTION FOR CLASSIFICATION OF DIABETES DATASET BY ITERATIVE METHODS D.UDHAYAKUMARAPANDIAN

More information

Empirical Validation in Agent-Based Models

Empirical Validation in Agent-Based Models Empirical Validation in Agent-Based Models Giorgio Fagiolo Sant Anna School of Advanced Studies, Pisa (Italy) giorgio.fagiolo@sssup.it https://mail.sssup.it/~fagiolo Max-Planck-Institute of Economics Jena,

More information

Advanced Bayesian Models for the Social Sciences

Advanced Bayesian Models for the Social Sciences Advanced Bayesian Models for the Social Sciences Jeff Harden Department of Political Science, University of Colorado Boulder jeffrey.harden@colorado.edu Daniel Stegmueller Department of Government, University

More information

On Algorithms and Fairness

On Algorithms and Fairness On Algorithms and Fairness Jon Kleinberg Cornell University Includes joint work with Sendhil Mullainathan, Manish Raghavan, and Maithra Raghu Forming Estimates of Future Performance Estimating probability

More information

EPSE 594: Meta-Analysis: Quantitative Research Synthesis

EPSE 594: Meta-Analysis: Quantitative Research Synthesis EPSE 594: Meta-Analysis: Quantitative Research Synthesis Ed Kroc University of British Columbia ed.kroc@ubc.ca March 28, 2019 Ed Kroc (UBC) EPSE 594 March 28, 2019 1 / 32 Last Time Publication bias Funnel

More information

Syllabus.

Syllabus. Business 41903 Applied Econometrics - Spring 2018 Instructor: Christian Hansen Office: HPC 329 Phone: 773 834 1702 E-mail: chansen1@chicagobooth.edu TA: Jianfei Cao E-mail: jcao0@chicagobooth.edu Syllabus

More information

Applied Quantitative Methods II

Applied Quantitative Methods II Applied Quantitative Methods II Lecture 7: Endogeneity and IVs Klára Kaĺıšková Klára Kaĺıšková AQM II - Lecture 7 VŠE, SS 2016/17 1 / 36 Outline 1 OLS and the treatment effect 2 OLS and endogeneity 3 Dealing

More information

Modeling Sentiment with Ridge Regression

Modeling Sentiment with Ridge Regression Modeling Sentiment with Ridge Regression Luke Segars 2/20/2012 The goal of this project was to generate a linear sentiment model for classifying Amazon book reviews according to their star rank. More generally,

More information

Data and Statistics 101: Key Concepts in the Collection, Analysis, and Application of Child Welfare Data

Data and Statistics 101: Key Concepts in the Collection, Analysis, and Application of Child Welfare Data TECHNICAL REPORT Data and Statistics 101: Key Concepts in the Collection, Analysis, and Application of Child Welfare Data CONTENTS Executive Summary...1 Introduction...2 Overview of Data Analysis Concepts...2

More information

Macroeconometric Analysis. Chapter 1. Introduction

Macroeconometric Analysis. Chapter 1. Introduction Macroeconometric Analysis Chapter 1. Introduction Chetan Dave David N. DeJong 1 Background The seminal contribution of Kydland and Prescott (1982) marked the crest of a sea change in the way macroeconomists

More information

11/18/2013. Correlational Research. Correlational Designs. Why Use a Correlational Design? CORRELATIONAL RESEARCH STUDIES

11/18/2013. Correlational Research. Correlational Designs. Why Use a Correlational Design? CORRELATIONAL RESEARCH STUDIES Correlational Research Correlational Designs Correlational research is used to describe the relationship between two or more naturally occurring variables. Is age related to political conservativism? Are

More information

Human and Optimal Exploration and Exploitation in Bandit Problems

Human and Optimal Exploration and Exploitation in Bandit Problems Human and Optimal Exploration and ation in Bandit Problems Shunan Zhang (szhang@uci.edu) Michael D. Lee (mdlee@uci.edu) Miles Munro (mmunro@uci.edu) Department of Cognitive Sciences, 35 Social Sciences

More information

Applying Machine Learning Methods in Medical Research Studies

Applying Machine Learning Methods in Medical Research Studies Applying Machine Learning Methods in Medical Research Studies Daniel Stahl Department of Biostatistics and Health Informatics Psychiatry, Psychology & Neuroscience (IoPPN), King s College London daniel.r.stahl@kcl.ac.uk

More information

Supplementary appendix

Supplementary appendix Supplementary appendix This appendix formed part of the original submission and has been peer reviewed. We post it as supplied by the authors. Supplement to: Callegaro D, Miceli R, Bonvalot S, et al. Development

More information

Supplementary materials for: Executive control processes underlying multi- item working memory

Supplementary materials for: Executive control processes underlying multi- item working memory Supplementary materials for: Executive control processes underlying multi- item working memory Antonio H. Lara & Jonathan D. Wallis Supplementary Figure 1 Supplementary Figure 1. Behavioral measures of

More information

Prediction and Inference under Competing Risks in High Dimension - An EHR Demonstration Project for Prostate Cancer

Prediction and Inference under Competing Risks in High Dimension - An EHR Demonstration Project for Prostate Cancer Prediction and Inference under Competing Risks in High Dimension - An EHR Demonstration Project for Prostate Cancer Ronghui (Lily) Xu Division of Biostatistics and Bioinformatics Department of Family Medicine

More information

Analysis of Environmental Data Conceptual Foundations: En viro n m e n tal Data

Analysis of Environmental Data Conceptual Foundations: En viro n m e n tal Data Analysis of Environmental Data Conceptual Foundations: En viro n m e n tal Data 1. Purpose of data collection...................................................... 2 2. Samples and populations.......................................................

More information

G5)H/C8-)72)78)2I-,8/52& ()*+,-./,-0))12-345)6/3/782 9:-8;<;4.= J-3/ J-3/ "#&' "#% "#"% "#%$

G5)H/C8-)72)78)2I-,8/52& ()*+,-./,-0))12-345)6/3/782 9:-8;<;4.= J-3/ J-3/ #&' #% #% #%$ # G5)H/C8-)72)78)2I-,8/52& #% #$ # # &# G5)H/C8-)72)78)2I-,8/52' @5/AB/7CD J-3/ /,?8-6/2@5/AB/7CD #&' #% #$ # # '#E ()*+,-./,-0))12-345)6/3/782 9:-8;;4. @5/AB/7CD J-3/ #' /,?8-6/2@5/AB/7CD #&F #&' #% #$

More information

Propensity Score Methods for Estimating Causality in the Absence of Random Assignment: Applications for Child Care Policy Research

Propensity Score Methods for Estimating Causality in the Absence of Random Assignment: Applications for Child Care Policy Research 2012 CCPRC Meeting Methodology Presession Workshop October 23, 2012, 2:00-5:00 p.m. Propensity Score Methods for Estimating Causality in the Absence of Random Assignment: Applications for Child Care Policy

More information

EMPIRICAL STRATEGIES IN LABOUR ECONOMICS

EMPIRICAL STRATEGIES IN LABOUR ECONOMICS EMPIRICAL STRATEGIES IN LABOUR ECONOMICS University of Minho J. Angrist NIPE Summer School June 2009 This course covers core econometric ideas and widely used empirical modeling strategies. The main theoretical

More information

Assignment 4: True or Quasi-Experiment

Assignment 4: True or Quasi-Experiment Assignment 4: True or Quasi-Experiment Objectives: After completing this assignment, you will be able to Evaluate when you must use an experiment to answer a research question Develop statistical hypotheses

More information

Minimizing Uncertainty in Property Casualty Loss Reserve Estimates Chris G. Gross, ACAS, MAAA

Minimizing Uncertainty in Property Casualty Loss Reserve Estimates Chris G. Gross, ACAS, MAAA Minimizing Uncertainty in Property Casualty Loss Reserve Estimates Chris G. Gross, ACAS, MAAA The uncertain nature of property casualty loss reserves Property Casualty loss reserves are inherently uncertain.

More information

Module 14: Missing Data Concepts

Module 14: Missing Data Concepts Module 14: Missing Data Concepts Jonathan Bartlett & James Carpenter London School of Hygiene & Tropical Medicine Supported by ESRC grant RES 189-25-0103 and MRC grant G0900724 Pre-requisites Module 3

More information

DRAFT (Final) Concept Paper On choosing appropriate estimands and defining sensitivity analyses in confirmatory clinical trials

DRAFT (Final) Concept Paper On choosing appropriate estimands and defining sensitivity analyses in confirmatory clinical trials DRAFT (Final) Concept Paper On choosing appropriate estimands and defining sensitivity analyses in confirmatory clinical trials EFSPI Comments Page General Priority (H/M/L) Comment The concept to develop

More information