Cannabis Use and Mental Health Problems

Department of Economics
Working Paper Series

Cannabis Use and Mental Health Problems
Jan C. van Ours & Jenny Williams
July 2009
Research Paper Number 1073
ISSN: 0819-2642
ISBN: 978-0-7340-4037-4

Department of Economics, The University of Melbourne, Parkville VIC 3010
www.economics.unimelb.edu.au

Cannabis Use and Mental Health Problems
Jan C. van Ours, Jenny Williams
July 22, 2009

Abstract
This paper investigates whether cannabis use leads to worse mental health. To do so, we account for common unobserved factors affecting mental health and cannabis consumption by modeling mental health jointly with the dynamics of cannabis use. Our main finding is that using cannabis increases the likelihood of mental health problems, with current use having a larger effect than past use. The estimates suggest a dose-response relationship between the frequency of recent cannabis use and the probability of currently experiencing a mental health problem.

Keywords: cannabis use; mental health; duration models; discrete factor models
JEL codes: C41, D12, I19

Jan C. van Ours: Department of Economics, CentER, Tilburg University, The Netherlands; Department of Economics, University of Melbourne, Parkville, Australia and CEPR; vanours@uvt.nl.
Jenny Williams: Department of Economics, University of Melbourne, Parkville, Australia; email: jenny.williams@unimelb.edu.au

The authors wish to thank Carol Propper, Tue Gørgens, Chikako Yamauchi and participants of seminars at the Australian National University's Research School of Social Sciences Economics Program, the Health Economics Workshop at the Melbourne Institute of Applied Economic and Social Research, the health-labor seminar at Tilburg University and the annual conference of the European Society for Population Economics for useful comments and suggestions. We gratefully acknowledge support from the Australian Research Council; grant number DP0770580. Finally, the authors thank three anonymous referees for helpful comments when revising the paper.

1 Introduction

Cannabis is the most widely used illicit drug. Over the last thirty years, the age at which it is first used has fallen and lifetime prevalence has risen in most developed countries (Hall, 2006). The popularity of cannabis derives from the mild euphoria associated with its consumption and from the generally held belief that its health consequences are rather benign. However, there is growing evidence of an association between mental health problems and cannabis use. What remains unclear is whether this evidence should be interpreted as showing that cannabis use causes mental health problems. A plausible alternative explanation is that unobserved personal characteristics or circumstances cause both mental illness and cannabis use. The purpose of this paper is to investigate the nature of the relationship between cannabis consumption and mental health and, in so doing, to determine the extent to which cannabis use leads to worse mental health.

Establishing whether cannabis use is a cause of mental illness is of particular interest from a policy perspective. Uptake of cannabis typically occurs during the mid to late teens, while individuals are still attending school. For example, 42% of 12th graders in the US and 32% of 12th graders in Australia have used cannabis in their lifetime (Johnston et al., 2006; White and Hayman, 2006). If cannabis use is a cause of mental illness, then educating adolescents about this risk may deter its uptake and thereby reduce population levels of mental illness. While a reduction in the prevalence of mental illness is desirable in itself, it is also likely to lead to significant economic benefits. The World Health Organization estimates the economic cost of mental illness to be between 3 and 4% of GNP per year for developed countries, with around half of the cost attributed to lost productivity (WHO, 2003).
Knowledge of the mental health consequences of cannabis consumption is also useful for informing the debate over its legal status, because it would allow a more accurate accounting of the costs and benefits of maintaining its status as a criminal offense. Given the recent moves to legalize cannabis in England and Portugal, this issue is clearly of ongoing policy interest.

There exists a substantial literature in economics documenting the consequences of cannabis use in terms of educational attainment, physical health and labor market success (for overviews see Van Ours and Williams, 2009; Williams and Skeels, 2006). Previous research on the relationship between mental health and illicit substance use, however, comes almost entirely from epidemiology.[1] The earliest attempt to identify the causal impact of cannabis use on mental illness is by Andreasson et al. (1987), who study a cohort of more than 50,000 Swedish conscripts aged 18 to 20. The authors find that the post-conscription risk of developing schizophrenia is increasing in the number of times cannabis is used prior to conscription. Giving a causal interpretation to this finding is complicated, however, by the fact that while the prevalence of cannabis use has increased over the last 30 years in most countries, the prevalence of schizophrenia has not (Hall, 2006; Kalant, 2004). Nonetheless, Andreasson et al.'s (1987) research has prompted a raft of epidemiological studies on the topic. These subsequent studies tend to consider more general measures of mental health, including depression and anxiety. The results from this research are mixed, with some papers reporting a positive association between cannabis use and mental health problems (Fergusson et al., 2005; Fergusson et al., 2002; Patton et al., 2002; Rey et al., 2002; Boys et al., 2003) and others reporting no association (McGee et al., 2000; Fergusson et al., 1997). In a meta-analysis, Degenhardt et al. (2003) found a modest but significant association between heavy use of cannabis and later depression. In their overview study, Arseneault et al. (2004) conclude that rates of cannabis use are approximately twice as high among people with schizophrenia as among the general population. In examining the relationship between mental health and cannabis use, the literature cited above has attempted to identify the causal effect of cannabis use by controlling for observed factors that may be a source of confounding.[2]

[1] For interesting reviews see Hall (2006); Kalant (2004); or Macleod et al. (2004).
[2] Fergusson et al. (2002, 2007) are exceptions in that they also control for unobserved heterogeneity using fixed effects models. The later paper addresses the issue of causality using models estimated with LISREL.

However, as noted by Pudney (2010), the potential for unobserved common confounding factors makes inference regarding the causal impact of cannabis use difficult. The purpose of this paper is to address this issue. To do so, we use a discrete factor approach. Our methodology marries Heckman and Singer's (1984) use of discrete factors in addressing unobserved heterogeneity in hazard rate analysis with their use by Mroz (1999) to account for endogenous variables in regression models. More specifically, we estimate a trivariate system of equations consisting of hazard functions for the decision to start using cannabis and the decision to quit, and a Tobit model for the production of mental health. By allowing the distributions of the discrete factors determining cannabis use dynamics and continuous mental health to be correlated, we account for common unobserved factors and hence obtain reliable estimates of the impact of cannabis use on mental health.

Our main finding is that frequent use of cannabis increases the likelihood of mental health problems. Infrequent and past cannabis use also increase the likelihood of mental health problems, but the effects are substantially smaller. To give a sense of the magnitude of the effects, our estimates suggest that 2.4% of males who use cannabis weekly or more often will experience severe mental health problems, compared with 1.5% of males who use monthly, 1.4% of males who are past users and 0.9% of males who have never used cannabis.

The rest of this paper is laid out as follows. Section 2 describes the data used in this study and discusses its strengths and weaknesses. Section 3 presents the econometric methodology and results from estimation. Section 4 reports on an examination of the robustness of the results by way of an extensive sensitivity analysis. Section 5 summarizes our findings.

2 Data

2.1 Australian National Drug Strategy Household Survey

This research draws on information collected in the 2004 Australian National Drug Strategy Household Survey (NDSHS). The NDSHS is managed by the Australian Institute of Health and Welfare on behalf of the Commonwealth Department of Health and Ageing. It is designed to provide data on awareness, attitudes and behavior relating to licit and illicit drug use by the non-institutionalized civilian population in Australia.
The sampling framework is a multistage stratified sample design, where stratification is based on geographic region. In each sampled household, the respondent is the person with the next birthday who is at least 12 years of age. Self-completion questionnaires and computer assisted telephone interviewing methodologies were used to survey respondents, with the bulk of data (82%) collected by self-completion questionnaires. Of the households contacted for the Drop and Collect Survey who fell within the scope of the study, 68% agreed to participate and accepted a questionnaire. Seventy-nine percent of these households returned the questionnaire. However, only 66% of the returned surveys were deemed usable, with the balance of questionnaires returned blank (8%) or with missing essential information or otherwise unreliable (5%). An analysis of the profile of non-response based on those who returned blank Drop and Collect questionnaires indicates that this form of non-response was most prevalent among the over-sixty age group, especially females over sixty (AIHW, 2005).

In addition to asking individuals whether they have ever used or currently use various licit and illicit drugs, the NDSHS also asks those who report having ever used each substance the age at which it was first used. This, along with an objective measure of mental health in the form of the Kessler-10 (K10) scale of psychological distress, makes these data useful for examining the impact of cannabis use on mental health outcomes.

2.2 Cannabis Use, Mental Health and Data Issues

Several measures related to cannabis consumption are used in our analysis. In modeling cannabis use dynamics, the outcomes of interest are the age at which cannabis was first used and the duration of use. The age of first use is constructed from responses to the question, "About what age were you when you first used marijuana/cannabis?". This question was asked of all those who reported ever using cannabis.
While we do not have information on the age at which respondents last used cannabis, and hence on the duration of use, we do know whether or not they have used in the year prior to survey. Uncertainty surrounding the duration of cannabis use is addressed using econometric techniques which are described in section 3.

In modeling the production function for mental health, current and past use of cannabis are the focus. These measures are directly related to the outcomes in the dynamics of cannabis use. We define current cannabis users as those who have initiated into cannabis and have used at least once in the twelve months prior to survey. Past users are defined as those who have initiated use but have not consumed cannabis in the twelve months before being surveyed. For current users, we also investigate whether the mental health impacts differ by the frequency with which cannabis is used. To do this, current cannabis users are categorized as using once or twice a year, every few months, about once a month, once a week or more, or every day. These mutually exclusive categories form the set of potential responses to the question "In the last 12 months, how often did you use marijuana/cannabis?".

The measure of mental health we use is the K10 scale of psychological distress. The K10 was developed as a screening tool for non-specific psychological distress for the US National Health Interview Survey (Kessler et al., 2002 and 2003). The K10 is widely used, both as a measure of mental health status in general population surveys and as an outcome measure in primary care settings (Pirkis et al., 2005). It has been shown to be significantly correlated with other instruments including the General Health Questionnaire, the Short Form 12, the Composite International Diagnostic Interview Short Form, and the World Health Organization Disability Assessment Schedule (Kessler et al., 2003; Andrews and Slade, 2001). The K10 is a self-report measure of psychological distress consisting of 10 items which ask respondents about symptoms of depression and anxiety in the past four weeks. There is a five-level response scale that ranges from "none of the time" (1) to "all of the time" (5). The specific items asked are as follows:

In the last four weeks, about how often did you feel...
1. Tired out for no good reason?
2. Nervous?
3. So nervous that nothing could calm you down?
4. Hopeless?
5. Restless or fidgety?
6. So restless you could not sit still?
7. Depressed?
8. That everything was an effort?
9. So sad that nothing could cheer you up?
10. Worthless?

The sum of scored responses to the 10 questions is used to generate a single score of psychological distress ranging from 10 to 50. A score under 20 indicates that the respondent is likely to be well, a score of 20-24 indicates a mild mental disorder, a score of 25-29 indicates a moderate disorder and a score of 30 or greater indicates a severe mental disorder.[3]

[3] These cut-offs are recommended for Australian General Practitioners; see: http://www.gpcare.org/outcome%20measures/outcomemeasures.html

As with previous research studying transitions in substance use with cross-sectional data, this study is subject to potential measurement error problems associated with recall error. As discussed below, we find some evidence of recall error in the reported age of initiation into cannabis use for those who report initiating after the age of 25. To the extent that respondents make errors in the age they report first using cannabis, our parameter estimates are likely to be biased towards zero. Since initiation into use after the age of 25 is fairly rare, we do not anticipate large effects from this source of measurement error. Nonetheless, we investigate the impact of this issue on our results in a sensitivity analysis contained in section 4.

A further potential measurement issue relates to the use of the K10 score as the measure of mental health. The K10 score items relate to symptoms experienced in the 4 weeks prior to survey. If symptoms of mental illness are not experienced in the relevant 4 week window, then they may go undetected by this measure. A more serious shortcoming of the data used in our study is that, while retrospective information is collected about the age when cannabis was first used, no such information is available for the age at which symptoms of mental illness were first experienced or the age at which mental illness was first diagnosed.[4] Therefore, we are unable to account for the potential for mental illness preceding, or causing, cannabis use. Several studies from epidemiology have, however, sought to determine whether there is a causal pathway running from mental illness to cannabis use. For example, Fergusson et al. (2005) investigate the relationship between cannabis use and psychotic episodes measured at ages 18, 21 and 25. They found no evidence that psychotic episodes lead to cannabis use and some evidence that increasing psychotic symptoms were associated with a decline in the use of cannabis. These findings are in line with Van Os et al. (2002) and Henquet et al. (2005), who find no evidence that early psychotic symptoms predict an increased risk of cannabis use, and Patton et al. (2002), who find that anxiety in the teenage years does not predict later cannabis use. McGee et al. (2000), however, find that although cannabis use at age 18 predicts mental disorders at age 21, mental disorders at age 15 predict a small but significantly elevated risk of cannabis use at age 18. So, while we cannot rule out the possibility of reverse causality, the evidence suggests that it is unlikely to have a large effect on our estimates.

[4] As far as we have been able to determine, this weakness is not overcome in any other data. For example, the National Longitudinal Survey of Youth 97 cohort (NLSY97) uses a five-item short version of the Mental Health Inventory (MHI) to first assess mental health (in the past month) when respondents are aged 15-20, and the Christchurch Health and Development Study (CHDS) first asks about anxiety and depression at age 15. Thus neither study identifies the age at which mental health problems first occur.

2.3 Descriptive Statistics

Our sample is composed of 4771 males and 6719 females aged 26-50 years for whom we have complete data on mental health, cannabis use, and the other control variables. Summary statistics for the outcomes of interest and other explanatory variables are reported in Table 1. Table 1 shows that 58% of males and 49% of females in the sample have used cannabis in their lifetime. Twenty percent of males and 11% of females have used cannabis in the past year. Therefore, according to our definitions, 38% of both males and females are past users of cannabis, while 20% of males and 11% of females are current users of cannabis. Amongst those who have ever used cannabis, the average age of initiation is 18.4 years for males and 18.7 years for females, with 12% of males and 9% of females initiating before the age of 16.

The frequency of past year use is also reported in Table 1. For comparability with other variables, these rates are reported as percentages of the full sample. As shown in Table 1, 5.2% of males and 3.7% of females in our sample use cannabis once or twice a year, 2.9% of males and 1.5% of females use every few months, 2.2% of males and 1.3% of females use every month, 5.3% of males and 2.4% of females use once a week, while 4.6% of males and 1.7% of females use cannabis every day. In terms of mental health, the average K10 score for males is 14.8 and for females it is 15.5. On the basis of their K10 score, 14% of males and 17% of females have a mental health disorder.[5]
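The K10 scoring rule and the use-status definitions from section 2.2, on which the prevalence figures above rest, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the authors' code: the function names and the boolean inputs are hypothetical, and only the 1-5 item coding, the 10-50 range, the cut-offs and the twelve-month definitions are taken from the text.

```python
from typing import Sequence


def k10_score(items: Sequence[int]) -> int:
    """Sum the ten K10 responses, each coded 1 (none of the time) to 5 (all of the time)."""
    if len(items) != 10:
        raise ValueError("the K10 has exactly 10 items")
    if any(r not in (1, 2, 3, 4, 5) for r in items):
        raise ValueError("each response must be an integer from 1 to 5")
    return sum(items)


def k10_category(score: int) -> str:
    """Map a K10 total (10-50) to the cut-offs quoted in section 2.2."""
    if not 10 <= score <= 50:
        raise ValueError("a K10 total lies between 10 and 50")
    if score < 20:
        return "likely well"
    if score <= 24:
        return "mild"
    if score <= 29:
        return "moderate"
    return "severe"


def use_status(ever_used: bool, used_past_12_months: bool) -> str:
    """Classify a respondent as a never, past or current cannabis user."""
    if not ever_used:
        return "never"
    # Current users have initiated and used at least once in the last twelve months.
    return "current" if used_past_12_months else "past"


# Hypothetical respondent: mostly low responses, a few elevated ones.
responses = [1, 2, 1, 1, 3, 1, 2, 4, 1, 1]
total = k10_score(responses)
print(total, k10_category(total), use_status(True, False))
```

Under these definitions the share of past users is the share who have ever used minus the share of current users, which is how the Table 1 figures give 58% - 20% = 38% for males and 49% - 11% = 38% for females.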
[5] The Australian 2007 National Survey of Mental Health and Wellbeing collected information for the K10 as well as diagnosing mental disorders based on the International Classification of Disease, 10th Revision, Classification of Mental Health and Behavioral Disorders. It finds a high correlation between the two instruments, with 80% of those with a K10 of 30 or more being diagnosed as having a mental disorder in the previous 12 months, compared with only 11% of those with a score of less than 15 (Australian Bureau of Statistics, 2008).

In terms of demographic characteristics, the average age of the sample is about 38, and close to 80% of the sample were born in Australia, with 2% identifying themselves as Aboriginal or Torres Strait Islanders. More than two-thirds of individuals in the sample are currently married, and a further 10% of males and 14% of females are divorced. In terms of education, 16% of males and 24% of females report their highest level of educational attainment as a 10th grade education or less.

Figure 1 provides a graphical illustration of the probability of starting cannabis use at each age, conditional on not having been a user up to that age. The figure shows that initiation into cannabis use begins at an early age. The first peak in the probability of uptake is at age 16, when 10% of males and 8% of females who had not previously used cannabis initiate use. The main peak is at age 18, with 14% of males and 11% of females initiating use, but there is also a peak of 10% for males at age 20, and subsequent peaks are at ages 25 and 30. The peaks at ages 25 and 30 in the age-specific starting probabilities may point to bundling in the recollection of the starting age, although, as Figure 1 shows, initiation into cannabis use rarely occurs beyond age 25. At age 16, 12% of males and 9% of females in the sample have started cannabis use. This increases to 54% of males and 45% of females at age 25. By the age of 50, 58% of males and 49% of females have used cannabis at some point in their life.

Figure 2 shows the distribution of K10 scores for males and females. By construction, the K10 score has a lower bound of 10 and an upper bound of 50, with higher scores indicating greater levels of psychological distress. The K10 score has a right-skewed distribution, with an average value around 15 for both males and females. A large proportion of observations (17% of males and 11% of females) occur at the lower bound value of 10. Figure 2 also shows that the sample proportion with each score falls as the score increases. As mentioned above, scores below 20 indicate no mental health problems, and 86% of males and 83% of females fall within this category. Scores in the range of 20-24 indicate mild psychological distress, and 9% of males and 10% of females fall within this category. Moderate psychological distress is indicated by a score between 25 and 29, and severe psychological distress is indicated by a score of 30 or greater. In this sample, 3% of males and 4% of females suffer moderate psychological distress and 2% of males and 3% of females suffer severe psychological distress according to their K10 score.

The distribution of mental health status conditional on cannabis use status (never used, past user, current user) is shown for males and females in Table 2. Three main points emerge from examining this table. First, there is a higher prevalence of mental illness among current and past users of cannabis compared to those who have never used. For example, we see that 11% of males and 14% of females who have never used cannabis have a mental health disorder, compared to 21% of males and 29% of females who are current users. Second, past users have a lower prevalence of mental illness than current users. For example, 13% of males and 19% of females who are past users of cannabis are classified as having a mental health disorder, compared to 21% of males and 29% of females who are current users. The third stylized fact that emerges from Table 2 is that females are more likely than males to suffer mental illness.

The above discussion demonstrates an association between cannabis use and mental health problems that appears to differ between past and current use. The analysis that follows attempts to discern the degree to which this relationship is causal.

3 Empirical Strategy

The decision to start and stop using cannabis, as well as the likelihood of experiencing mental illness, may be affected by many personal characteristics in addition to circumstances faced in childhood and early adulthood. The most significant challenge posed in investigating potential links between cannabis use and mental health is the impossibility of observing all the personal characteristics and circumstances that might be relevant.
According to Pudney (2010), even the most comprehensive longitudinal survey cannot hope to measure every relevant aspect of the individual and his or her environment. Nevertheless, in order to assess the potential causal link between cannabis use and mental health, common unobserved confounding factors that may be a source of spurious association must be taken into account.
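To make the concern about spurious association concrete, the following sketch simulates a population in which an unobserved factor raises both the propensity to use cannabis and the level of psychological distress, while cannabis use itself has no effect on distress at all. It is purely illustrative: the data-generating process, the parameter values and the function name are assumptions made here, not anything estimated in the paper or drawn from the NDSHS.

```python
import random


def simulate_naive_gap(n: int = 20000, seed: int = 0) -> float:
    """Return mean distress of users minus mean distress of non-users.

    Assumed data-generating process (hypothetical): an unobserved factor u
    raises both the propensity to use and the distress score, and the true
    causal effect of use on distress is exactly zero.
    """
    rng = random.Random(seed)
    users, non_users = [], []
    for _ in range(n):
        u = rng.gauss(0.0, 1.0)                            # unobserved common factor
        uses = (u + rng.gauss(0.0, 1.0)) > 0.5             # u makes use more likely
        distress = 15.0 + 3.0 * u + rng.gauss(0.0, 2.0)    # u raises distress; use does not enter
        (users if uses else non_users).append(distress)
    return sum(users) / len(users) - sum(non_users) / len(non_users)


if __name__ == "__main__":
    gap = simulate_naive_gap()
    print(f"naive user/non-user gap in distress: {gap:.2f}")
```

The naive comparison of means comes out clearly positive even though use has no causal effect in the simulated data, which is the pattern a joint model with correlated unobserved heterogeneity is meant to guard against.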

It is difficult, if not impossible, to identify characteristics or circumstances that affect cannabis use but not mental health, rendering the use of instrumental variable techniques infeasible.[6] Instead we exploit the discrete factor approach. This method for allowing for correlation in unobservables across multiple equations without imposing distributional assumptions has been used in a wide variety of applications in health and labor economics (see, for example, Cutler 1995; Bray 2005; Van Ours 2006, 2007; Yang, Gilleskie and Norton 2009). The discrete factor approach was proposed by Heckman and Singer (1984) to address unobserved heterogeneity in hazard rate analysis. It has been further developed by Mroz (1999) for application to regression models with endogenous dummy variables.[7] Mroz demonstrates that when the idiosyncratic error terms for the latent endogenous variable and the outcome of interest have a bivariate normal disturbance, the discrete factor method compares favorably to the usual Maximum Likelihood Estimator (MLE) in terms of precision and bias. Furthermore, the discrete factor approximation outperforms both the MLE and the Two Stage Estimator (TSE) when the disturbances are non-normal.[8] In our application, we use the discrete factor approach to account for the unobserved factors affecting the production of mental health, cannabis uptake and quitting, in order to obtain reliable estimates of the mental health effects of cannabis use.

[6] Note that policy variables relating to cannabis use, such as its legal status and the price of cannabis, are potential candidates for instruments. Given our approach, this would require data on prices and policies from 1966 to the present, since our oldest sample members are 50 in 2004 and are assumed to be at risk of uptake from the age of 12. This information is simply not available for cannabis prices. While it may be possible to construct this type of historical series for the legal status of cannabis, it would vary insufficiently for it to be useful, since there are only eight states and territories in Australia and the four that have decriminalized the consumption of cannabis have done so rather recently.
[7] In his set-up, Mroz interprets the discrete factors as an unobserved covariate that impacts on the outcome of interest as well as the latent process generating the dummy variable, thereby inducing its endogeneity.
[8] The discrete factor model also outperforms MLE and TSE in the presence of weak instruments in models with non-normal errors. This is of particular salience given that state level policy variables are often relied upon to identify the effects of substance use, and these policy variables tend to be only weakly predictive of substance use.

Identification of this trivariate model with correlated errors comes from functional form assumptions. In the case of cannabis uptake and quitting, we follow Heckman and Singer (1984) and assume mixed proportional hazard functions. Due to its censored nature, the equation for mental health is based on the Tobit model. Similar to Mroz (1999), identification of unobserved heterogeneity in this equation relies on the linearity of the model for the latent variable and normality of its idiosyncratic error. As with any attempt to discern causal effects of endogenous variables, identification of the parameters of interest is ultimately based on untestable assumptions. We have, however, attempted to explore issues related to identification and model specification in an extensive sensitivity analysis that is reported in section 4.

Our estimation strategy is implemented in three steps. First, we jointly model the cannabis uptake and quitting decisions. In doing so we pay particular attention to modeling the potentially correlated unobserved heterogeneity driving these processes, which is assumed to come from a discrete distribution representing latent proclivities towards cannabis use. In the second step we model the censored continuous measure for mental health treating cannabis use as exogenous, but accounting for unobserved heterogeneity with respect to susceptibility to mental illness using the discrete factor approach. In the third step, we marry the bivariate hazard model for cannabis uptake and quitting with the model for mental health, accounting for common unobserved confounding factors by allowing for correlation in the unobserved discrete factors determining the uptake and quitting of cannabis and the production of mental health.

3.1 Dynamics in cannabis use

The first part of our econometric strategy focusses on modeling the transitions into and out of cannabis use.
We model the rate of uptake of cannabis and the quit rate from cannabis use with a bivariate mixed proportional hazard framework. Concerning the uptake of cannabis, we assume that potential exposure to cannabis occurs from the age of 12. The starting rate for cannabis use at time $t$ (measured from age 12), conditional on observed characteristics $x$ and unobserved characteristics $u$, is specified as

$$\theta_s(t \mid x, u) = \lambda_s(t) \exp(x'\beta_s + u) \qquad (1)$$

where $\lambda_s(t)$ represents individual duration dependence and $\beta_s$ represents a vector of parameters to be estimated. Unobserved heterogeneity accounts for differences in susceptibility to cannabis. We model duration (age) dependence in a flexible way by using a step function $\lambda_s(t) = \exp(\sum_k \lambda_k I_k(t))$, where $k$ (= 1, ..., 15) is a subscript for age categories and $I_k(t)$ are time-varying dummy variables that are one in subsequent categories. We specify 15 age dummies, 14 of which are for individual ages (age 12, 13, ..., 25) and the last interval is open: 26 years and older. Because we also estimate a constant term, we normalize $\lambda_1 = 0$. The conditional density function for the completed durations until first use can be written as

$$f_s(t \mid x, u) = \theta_s(t \mid x, u) \exp\left(-\int_0^t \theta_s(s \mid x, u)\,ds\right) \qquad (2)$$

Individuals who initiate cannabis use have a completed duration until first use equal to the age at first use minus 12; individuals who have not used cannabis at the time of the survey have a duration until first use that is right-censored at their current age minus 12.

The quit rate from cannabis use at duration of use $\tau$, conditional on observed characteristics $x$, the age of first use $a_f$, and unobserved characteristics $v$, is specified as

$$\theta_q(\tau \mid x, a_f, v) = \exp(x'\beta_q + \delta_f a_f + v) \qquad (3)$$

where $\delta_f$ and $\beta_q$ represent parameters to be estimated. The conditional density function for the completed durations of cannabis use can be written as

$$f_q(\tau \mid x, a_f, v) = \theta_q(\tau \mid x, a_f, v) \exp\left(-\int_0^\tau \theta_q(r \mid x, a_f, v)\,dr\right) \qquad (4)$$

As we do not know the actual age at which individuals quit cannabis use, we cannot calculate the exact duration of use and hence cannot estimate the conditional density for completed durations. However, we do know whether or not individuals used cannabis in the 12 months before the survey. For those who had not, we know that their duration of use, $\tau_s$, lies in the interval $[0, a_s - 1 - a_f]$, where $a_s$ is the respondent's age at the time of the survey and $a_f$ is their age of first use. We can therefore account for the uncertainty over the exact age at quitting by integrating the conditional density for durations of use over this range: $\int_0^{\tau_s} f_q(q \mid x, a_f, v)\,dq = F_q(\tau_s \mid x, a_f, v)$, where $F_q$ is the distribution function of $f_q$. Individuals who are still using cannabis have a duration of use that is right-censored, and for these observations we use the survival function: $1 - F_q(\tau_s \mid x, a_f, v)$.

Modeling the dynamics of cannabis use requires information about characteristics and circumstances faced by individuals at each point in time in which they are confronted with the choice to initiate or quit cannabis use. Information likely to be relevant includes family situation, experiences at school, cannabis supply conditions, the price of cannabis, and the price of other drugs (substitutes and complements). Unfortunately, this type of information is not available in the NDSHS. We note, however, that many of these factors are likely to be endogenous and it is therefore not clear how one should proceed if they were available.[9] The observable characteristics that we are able to control for are nationality (an indicator for Australian born), whether the respondent identifies themselves as an Aboriginal or Torres Strait Islander, whether they live in a capital city, birth year (to account for birth cohort size effects), the respondent's state of residence at the time of survey and the respondent's education (an indicator for dropping out of school with a 10th grade education or less), which is used as a proxy for ability.[10] These characteristics are assumed to be known at the time an individual first faces the decision of whether to initiate cannabis use.
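The likelihood contributions implied by equations (1) to (4) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the estimation code used for the paper: the function names, the 0-based indexing of the age intervals, and the reading of the step function as one log-baseline level per one-year age interval (with the last interval open) are all assumptions made for the example.

```python
import math

def start_hazard(t, xb, u, lam):
    """Equation (1): starting rate at duration t (years since age 12).

    lam is a hypothetical list of 15 log-baseline levels, one per age
    interval (ages 12..25, then 26 and older), with lam[0] = 0 as the
    normalization; xb stands for x'beta_s.
    """
    k = min(int(t), len(lam) - 1)  # last interval is open
    return math.exp(lam[k] + xb + u)

def start_cum_hazard(t, xb, u, lam):
    """Integrated starting rate from 0 to t for the piecewise-constant baseline."""
    total = 0.0
    k = 0
    while k < t:
        width = min(1.0, t - k)  # each age interval is one year wide
        total += width * start_hazard(k, xb, u, lam)
        k += 1
    return total

def uptake_contribution(t, started, xb, u, lam):
    """Density (2) for observed starters, survival probability if right-censored."""
    survival = math.exp(-start_cum_hazard(t, xb, u, lam))
    return start_hazard(t, xb, u, lam) * survival if started else survival

def quit_contribution(tau_s, has_quit, xb, a_f, delta_f, v):
    """F_q(tau_s) for past users (interval-censored), 1 - F_q(tau_s) for current users.

    The quit rate in equation (3) does not depend on tau, so F_q has the
    closed form 1 - exp(-theta_q * tau_s).
    """
    theta_q = math.exp(xb + delta_f * a_f + v)
    F_q = 1.0 - math.exp(-theta_q * tau_s)
    return F_q if has_quit else 1.0 - F_q
```

A type with a zero starting rate can be represented in this sketch by `u = float("-inf")`, in which case the hazard is zero and a never-user contributes a survival probability of one.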
In the case of the education variable, this requires the assumption that education represents ability and that this ability is known to the individual from the time he first faces the decision to use cannabis. The education variable will not fulfill this requirement if, at the time an individual decides to start using cannabis, he is uncertain as to whether he will drop out of school before completing 10th grade, or if there exist unobserved characteristics that impact both educational attainment and cannabis use (see, for example, Heckman, Stixrud and Urzua (2006), who allow latent cognitive and noncognitive skills to determine education and cannabis use). There is also the possibility of reverse causality, in which case cannabis use may result in dropping out of school. The impact of these issues is examined in section 4.

[9] We do attempt to address the issue of greater vulnerability early in life in one of our sensitivity analyses in section 4. Note that variables reflecting the respondent's current circumstance, such as marital status, that are collected as part of the NDSHS are not useful for modeling cannabis uptake because they represent events that may have taken place long after the individual started to use cannabis.

[10] Jacobson (2004) finds a positive correlation between youth cohort size and the prevalence of cannabis use, which she largely attributes to decreased costs of supply due to reduced risk of arrest for selling and informational economies associated with a larger market.

The potential correlation between the unobserved components in the hazard rates for cannabis uptake and quitting is taken into account by specifying the joint density function for the duration until first use $t$ and the duration of time until quitting use $\tau$, conditional on $x$, as

$$h(t, \tau \mid x, a_f) = \int_u \int_v f_s(t \mid x, u)\, f_q(\tau \mid x, a_f, v)\, dG(u, v) \qquad (5)$$

where $G(u, v)$ is a bivariate discrete distribution with $n$ points of support. The probabilities associated with each type are assumed to have a multinomial logit specification: $p_j = \exp(\alpha_j) / \sum_j \exp(\alpha_j)$, $j = 1, ..., n$, with normalization $\alpha_n = 0$.

The parameters of the model are estimated separately for males and females using maximum likelihood and presented in Table 3. To select the number of points of support in the joint distribution of discrete factors, an upward-testing approach is used, starting with one point of support and adding points of support in a stepwise manner. Beyond four points of support the locations of additional mass points converged to each other and no improvement of the loglikelihood function was found.[11] The points of support reflect the assumption that, conditional on observed characteristics, there exist 4 distinct types of individuals who are differentiated by their susceptibility to starting and stopping cannabis use.[12] Type 1, represented by $(u_1, v_1)$, has a low starting rate and a positive quit rate; type 2, represented by $(u_2, v_2)$, has a high starting rate and a zero quit rate; type 3, represented by $(u_3)$, has a zero starting rate (and hence no quit rate); while type 4, represented by $(u_1, v_2)$, has a low starting rate and a zero quit rate.

[11] The bottom part of Table 3 shows the loglikelihood values for the model with 1, 2, and 3 points of support.

[12] According to Gaure et al. (2007) it may not be meaningful to interpret the mass-point distribution in terms of representing a corresponding number of distinct types of individuals, as the underlying true heterogeneity distribution may be continuous. However, in the case of cannabis use the interpretation in terms of types is more natural, as starting to use cannabis is a discrete choice, as is quitting conditional on use.

As shown in Table 3, the distribution of unobserved heterogeneity implies that 40.4% of the male sample and 48.4% of the female sample fall into the group with a low starting rate and a positive quit rate, 7.9% of males and 3.5% of females belong to the group with a high starting rate and a zero quit rate, while 41.6% of males and 43.7% of females have a zero starting rate. Finally, 10.1% of the males and 4.4% of the females have a low starting rate and a zero quit rate.

The results from estimation indicate that Australian born males and females with a low level of education have a higher uptake of cannabis than foreign born individuals or those with greater than a 10th grade education. Those born in more recent years have a higher starting rate than those born earlier, and Aboriginal females are less likely to initiate into cannabis use than other Australian females. In terms of quit rates, we find that females with a low level of education are less likely to quit compared to those females with more than a 10th grade education. In line with Van Ours and Williams (2007), we also find that age of initiation has a large effect on the quit rate, with those initiating into cannabis use early in life having a lower quit rate.

3.2 Determinants of mental health assuming exogenous cannabis use

Figure 2 shows that, in our sample, the distribution of K10 scores has a significant proportion of observations at the lower limit value of 10. It also shows that the distribution of K10 scores is skewed to the right. To account for these features, we model the production of mental health using a Tobit model for log(K10). The natural log of latent mental health of individual $i$, $m_i^*$, is assumed to depend upon a vector of observed characteristics $x_m$, past and current cannabis use, and unobserved

characteristics $\mu$:

$$m_i^* = x_m'\beta_m + \delta_1 cc_i + \delta_2 cp_i + \mu_i + \epsilon_i \qquad (6)$$

where $cc_i$ is a dummy variable indicating whether or not an individual is a current user of cannabis (has used in the past 12 months), $cp_i$ is a dummy variable indicating whether an individual is a past user of cannabis (has used in lifetime but not in the past 12 months) and $\beta_m$, $\delta_1$ and $\delta_2$ are parameters to be estimated. Observed mental health, $m_i$, is measured by the natural log of the K10 score.[13] Given the censoring point of 10 for the K10, the relationship between latent mental health $m_i^*$ and observed mental health $m_i$ is given by $m_i = \log(10)$ if $m_i^* \le \log(10)$ and $m_i = m_i^*$ if $m_i^* > \log(10)$. The main parameters of interest are $\delta_1$ and $\delta_2$, which measure the effect of current and past cannabis use on mental health, respectively.

The observed characteristics, $x_m$, include the respondent's age, their nationality (an indicator for Australian born), whether the respondent identifies themselves as an Aboriginal or Torres Strait Islander, whether they live in a capital city, the respondent's state of residence at the time of survey, the respondent's education (an indicator for dropping out of school with a 10th grade education or less) and their marital status (indicators for married and divorced, with single or widowed as the comparison category). As with the dynamics of cannabis use, there exists the potential for unobserved characteristics that determine mental health to also determine education and marital status. The sensitivity of our results to this issue is explored below.

The error term for individual $i$ is assumed to be composed of two components. The first is a discrete factor $\mu_i$, which is intended to capture unobserved susceptibility to mental illness. The second component of the error term is drawn from a normal distribution, $\epsilon_i \sim N(0, \sigma^2)$. Since we do not have panel data, the identification of the discrete factor relies on the assumption that $\epsilon_i$ is normally distributed.

[13] A priori it is not clear whether to use a linear or loglinear specification for the K10 score. We used the pseudo $R^2$ as guidance. The pseudo $R^2$ is calculated as $1 - L_u / L_0$, where $L_u$ is the value of the loglikelihood of the full model and $L_0$ is the value of the loglikelihood of the model with only an intercept. For the logarithmic specification we found a pseudo $R^2$ of 0.062 both for males and females; for the linear specification we found a pseudo $R^2$ of 0.007 for males and 0.008 for females.
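As an illustration of how equation (6), the censoring rule, and the discrete factor combine into a likelihood, the following Python sketch computes one individual's log-likelihood contribution by mixing a Tobit likelihood over the mass points, with type probabilities from the logit specification used for the cannabis dynamics. It also computes the pseudo $R^2$ described in footnote 13. All function and variable names are hypothetical, `index` stands for $x_m'\beta_m + \delta_1 cc_i + \delta_2 cp_i$, and the sketch is not the authors' estimation code.

```python
import math

LOG_K10_FLOOR = math.log(10.0)  # censoring point for log(K10)

def type_probabilities(free_alphas):
    """Logit probabilities p_j with the last alpha normalized to zero."""
    alphas = list(free_alphas) + [0.0]
    top = max(alphas)  # subtract the maximum for numerical stability
    weights = [math.exp(a - top) for a in alphas]
    total = sum(weights)
    return [w / total for w in weights]

def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_cdf(z):
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def tobit_mixture_loglik(m_obs, index, mass_points, probs, sigma):
    """Log-likelihood of one observation, mixing over the discrete factor mu."""
    lik = 0.0
    for mu, p in zip(mass_points, probs):
        mean = index + mu
        if m_obs <= LOG_K10_FLOOR:
            # censored at the floor: P(m* <= log(10))
            lik += p * normal_cdf((LOG_K10_FLOOR - mean) / sigma)
        else:
            # uncensored: normal density of the observed log(K10)
            lik += p * normal_pdf((m_obs - mean) / sigma) / sigma
    return math.log(lik)

def pseudo_r2(loglik_full, loglik_null):
    """Footnote 13: 1 - L_u / L_0."""
    return 1.0 - loglik_full / loglik_null
```

Summing `tobit_mixture_loglik` over individuals gives the sample log-likelihood that would be maximized over the coefficients, the mass points, the free `alphas` and `sigma`.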

As before, we assume that the probabilities associated with the discrete factors follow a logit specification. Table 4 contains the maximum likelihood estimates of the relevant parameters. We found no improvement in the likelihood function beyond two mass points, implying two distinct types in our sample.[14] The first mass point is estimated to be 2.512 for males and 2.632 for females. This implies K10 scores of 12.3 (exp(2.512)) and 13.9 (exp(2.632)) respectively. Since a K10 score of less than 20 is indicative of an absence of mental illness, this group is considered to have a low susceptibility to mental health problems. The second mass point is 3.089 for males and 3.203 for females, which translates into K10 scores of 22.0 and 24.6. Given that a K10 score between 20 and 25 identifies a person as having a mild mental health problem, those belonging to this group are considered to be susceptible to mental health problems.

Table 4 shows that, all else being equal, married respondents are in better mental health than those who are single or widowed and that divorcees are in worse mental health than those who are single or widowed. Younger respondents are, on average, in better mental health than their older counterparts, and Aboriginals are in worse mental health than non-Aboriginals. We find no evidence that a low level of education is associated with worse mental health.

The main parameters of interest are those measuring the effects of current and past cannabis use on mental health. As shown in Table 4, we find that both men and women who currently use cannabis have a higher K10 score than those who have never used cannabis, where the increase in K10 attributable to current cannabis use is similar in magnitude to the effect of a married person becoming divorced. We also find that past users of cannabis have a higher K10 score compared to their counterparts who have never used. The increase in the K10 score attributable to past use is around half of that associated with current use.

[14] Note that the explanatory variables are specified as deviations from the mean. Also note that the bottom part of Table 4 shows that the introduction of 2 mass points has a big effect on the value of the log-likelihood while the estimated effects of cannabis use on mental health are very similar to a specification without mass points.
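The conversion from estimated mass points on the log scale to K10 scores can be checked directly. The short Python sketch below uses the four mass points reported in the text; the function name, the dictionary layout and the band labels are illustrative assumptions, and only the "below 20" and "20 to 25" thresholds are taken from the text.

```python
import math

# Mass points on the log(K10) scale, as reported for Table 4 in the text
mass_points = {
    "males": (2.512, 3.089),
    "females": (2.632, 3.203),
}

def k10_band(score):
    """Classify a K10 score using only the thresholds stated in the text."""
    if score < 20:
        return "absence of mental illness"
    if score <= 25:
        return "mild mental health problem"
    return "above the mild range"

# Exponentiate each mass point and round to one decimal place
k10_scores = {
    group: tuple(round(math.exp(m), 1) for m in points)
    for group, points in mass_points.items()
}
```

With these inputs, `k10_scores` reproduces the values quoted above (12.3 and 22.0 for males, 13.9 and 24.6 for females), so the first type falls below 20 and the second type falls in the 20 to 25 range for both sexes.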

3.3 A joint model of cannabis use and mental health

In estimating the joint model for cannabis dynamics and the production of mental health, we start by assuming that the distribution of unobserved heterogeneity has eight points of support, reflecting two types in terms of susceptibility to mental illness combined with four types in terms of susceptibility to cannabis use. As before, the associated probabilities are assumed to have a multinomial logit specification: $p_j = \exp(\alpha_j) / \sum_j \exp(\alpha_j)$, $j = 1, ..., 8$, with the normalization $\alpha_8 = 0$. The three-equation system is estimated using maximum likelihood and the relevant estimates are presented in the first panel of Table 5.

We do indeed find that the joint distribution of unobserved heterogeneity for the cannabis starting and quit rates and the production of mental health has 8 points of support. The vast majority of the sample belong to two groups, each of which has a low susceptibility to mental health problems. For example, 36.9% of males and 41.4% of females belong to the group characterized by a low positive cannabis starting rate, a positive cannabis quit rate and low susceptibility to mental health problems (type 1), and 36.9% of males and 38.1% of females have a zero cannabis starting rate and a low susceptibility to mental illness (type 3). While this distribution demonstrates a correlation between cannabis use and susceptibility to mental illness, the correlation does not appear to be very strong. At the 10% level of significance, a Likelihood Ratio (LR) test of the null hypothesis of no correlation is rejected for men, confirming that common unobserved factors are an issue for this sample. For women this is not the case.
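The mechanics of a Likelihood Ratio test such as the one just described can be sketched as follows. This is a generic illustration, not the authors' computation: the log-likelihood values and degrees of freedom passed in are hypothetical inputs supplied by the reader, and the sketch assumes the standard chi-square reference distribution for the test statistic. The chi-square upper tail is computed with closed-form expressions for integer degrees of freedom so that only the standard library is needed.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # even df: exp(-x/2) * sum_{k < df/2} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # odd df: erfc(sqrt(x/2)) plus a finite correction series
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)
    for k in range((df - 1) // 2):
        total += term
        term *= x / (2 * k + 3)
    return total

def lr_test(loglik_restricted, loglik_unrestricted, df, level=0.10):
    """Return the LR statistic, its p-value, and whether the null is rejected."""
    stat = 2.0 * (loglik_unrestricted - loglik_restricted)
    p_value = chi2_sf(stat, df)
    return stat, p_value, p_value < level
```

For example, `lr_test(-100.0, -98.0, df=1)` gives a statistic of 4.0 and rejects the restricted model at the 10% level; these numbers are made up purely to show the call.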
A comparison of the results from estimating models with and without accounting for common unobserved factors (comparing the first panel of Table 5 and the parameter estimates in Table 4) reveals that the causal effect of cannabis use on mental health is overestimated if one does not account for these unobserved common factors. This implies that those who are more likely to start using cannabis are also more likely to have mental health problems. However, as shown, the parameter estimates representing the effect of current and past cannabis use on mental health are not much affected by whether or not correlation in unobserved characteristics is accounted for. From this we conclude that after accounting for the potentially confounding effect of unobserved characteristics, cannabis use has an adverse impact on mental health. Moreover, while this effect is greater for current use, it persists well after use ceases.

4 Sensitivity analysis and simulations

We next investigate the sensitivity of our findings to assumptions made in modeling the relationship between cannabis use dynamics and mental health. The results from doing so are summarized in Table 5. First, we investigate the impact of recall error by estimating our model over the subsample of 26-35 year olds. To the extent that respondents make mistakes in the age they report first using cannabis, our parameter estimates of the impact of cannabis use on mental health based on the full sample and reported in Table 5 panel 1 are likely to be biased towards zero. We expect that recall error is less of a problem amongst the 26-35 age group (as less calendar time has elapsed since they first used and perhaps quit cannabis) and hence that results based on these data are more reliable. The results for this younger subsample are reported in Table 5 panel 2. A comparison of results based on estimation over the full sample with those based on the younger sample reveals no significant differences, suggesting that recall error is not an important source of bias in the data used in this analysis.

Second, we examine whether the impact of cannabis use on mental health varies by age of first use. Specifically, we allow the effects of past and current use on mental health to differ for those who first used cannabis before the age of 16 and those who were 16 or older when they first tried cannabis. A Likelihood Ratio test is used to examine the empirical support for effects that differ by age. As shown in the third panel of Table 5, we find significant differences in the impact of cannabis use on mental health by the age of uptake for females but not males.
Specifically, we find that current and past use of cannabis produces significantly greater increases in the K10 score of females who first used cannabis before the age of 16 compared to those who first used at 16 years or older.

We also investigate whether the mental health effects of current cannabis use depend upon the frequency of use. To do so, we measured frequency of use in the past year with a set of indicators for the categories "use every day", "use weekly", "use monthly", "use every few months", and "use once or twice a year", in addition to the categories "past use" and "never use". As shown, the mental health effects of cannabis use increase with the frequency of use. For example, on average males who use once or twice a year have a K10 score which is similar to that of males who stopped using cannabis, whereas males who use daily have a K10 score that is approximately 15.1 percent larger than that of a comparable male who has never used. For females, we find similar results.

In addition to the sensitivity analyses reported in Table 5, we also investigated the robustness of our results to several other aspects of model specification. First, we examined the impact of the potential endogeneity of marital status and education on the estimated effect of cannabis use on mental health. We did this by comparing results from our baseline model (Table 5 panel 1) with estimates from a model which omitted education and marital status from all three equations. The estimated effects of cannabis use (past and current) are quite robust, suggesting this potential source of model misspecification is not driving our findings. Second, we allowed for more flexible cohort effects by replacing the continuous year of birth variable with indicators for birth cohort (born in the 1950s, the 1960s and the 1970s). The results from this model are almost identical to those from the original specification. We also explored the potential for heterogeneity in treatment effects by allowing all parameter estimates in the mental health equation to differ by whether the individual was a never user, current user or past user of cannabis. On the basis of an LR test we are unable to reject the null hypothesis of homogeneity in the effects of cannabis use on mental health. Finally, we considered the sensitivity of our results to specifying mental health as a continuous (censored) variable.
As an alternative approach, we constructed an ordinal categorical variable for mental health (no mental health problems, mild mental health problems, moderate mental health problems and severe problems). The estimated effects from ordered probit models of mental health are very similar to our baseline estimates in Table 5 panel 1.

In order to illustrate the impact of cannabis use on mental health as measured by the K10 score, we use the parameter estimates contained in Table 5, panel 4 to perform simulations. We present scenarios in which individuals vary in their