BAYESIAN HYPOTHESIS TESTING WITH SPSS AMOS

Sara Garofalo
Department of Psychiatry, University of Cambridge

Overview
- Bayesian vs classical (NHST or frequentist) statistical approaches
  - Theoretical issues
  - Examples
- SPSS AMOS
  - What it is and what it can be used for
  - Example of a regression model in SPSS AMOS (Bayesian vs frequentist)

Bayesian approach
Bayesian inference is a method of statistical inference in which Bayes' theorem is used to update the probability of a hypothesis, given a set of evidence.

FREQUENTIST APPROACH, OR NHST (Null Hypothesis Significance Testing), vs BAYESIAN APPROACH
A Frequentist is a person whose long-run interest is to be wrong 5% of the time.
A Bayesian is one who, vaguely expecting a horse and catching a glimpse of a donkey, strongly concludes he has seen a mule (Senn, 1997)

Frequentist vs Bayesian
FREQUENTIST APPROACH
- Parameters are fixed values
- LIKELIHOOD: P(R | H_0), the probability of getting evidence R when the null hypothesis is true
BAYESIAN APPROACH
- Parameters are random values
- POSTERIOR PROBABILITY: P(H | R), the probability that a hypothesis is true, given the observed evidence R
- posterior ∝ prior × likelihood
Hypotheses: H_0: µ_a = µ_b; H_1: µ_a ≠ µ_b

Bayes' theorem: an example
A new HIV test is claimed to have 95% sensitivity (true positive rate) and 97% specificity (true negative rate).
R = test is positive
H_0 = subject is truly HIV negative
H_1 = subject is truly HIV positive
LIKELIHOOD: P(R | H_0) = .03
PRIOR: HIV prevalence in the population = 2%, so P(H_0) = .98 and P(H_1) = .02
BAYES' THEOREM: P(H | R) = P(R | H) P(H) / P(R)
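The numbers on this slide can be pushed through Bayes' theorem directly. The following is a minimal sketch, assuming P(R | H_1) equals the stated sensitivity (.95) and P(R | H_0) equals one minus the stated specificity (.03); the function name is only illustrative.

```python
def posterior_positive(sensitivity, specificity, prevalence):
    """P(H_1 | R): probability of being truly positive given a positive test."""
    p_r_given_h1 = sensitivity          # likelihood of R under H_1
    p_r_given_h0 = 1.0 - specificity    # likelihood of R under H_0
    p_h1 = prevalence                   # prior for H_1
    p_h0 = 1.0 - prevalence             # prior for H_0
    # P(R): total probability of a positive test
    p_r = p_r_given_h1 * p_h1 + p_r_given_h0 * p_h0
    return p_r_given_h1 * p_h1 / p_r


p_h1_given_r = posterior_positive(0.95, 0.97, 0.02)
print(round(p_h1_given_r, 3))        # 0.393
print(round(1.0 - p_h1_given_r, 3))  # 0.607, i.e. P(H_0 | R)
```

Even with a positive result, the low prior prevalence keeps the posterior probability of being truly HIV positive below 50%.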

Applications of Bayesian methods
COMPARE MODELS
- No interest in significance
- Compare two models (i.e., H_0 and H_1) to find the better one
- Bayesian Information Criterion (BIC)
- Bayes Factor (BF): ratio of the likelihoods of a result under two models/hypotheses
PREDICT FUTURE RESULTS
- Estimate unknown results, given a set of known evidence
- e.g., how many heads will I get by flipping a fair coin 10000 times? And what if it's an unfair coin?
ESTIMATION OF PARAMETERS
- Evaluate how probable each parameter value is, given the observed data
- Posterior distribution
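The coin example can illustrate both model comparison and prediction. This is a rough sketch only: the observed result (7 heads in 10 flips) and the alternative bias of 0.7 are hypothetical values chosen for illustration, and both hypotheses are treated as simple point hypotheses so that the Bayes factor reduces to a likelihood ratio.

```python
import math


def binomial_likelihood(heads, flips, p):
    """Probability of observing `heads` in `flips` tosses when P(head) = p."""
    return math.comb(flips, heads) * p**heads * (1.0 - p) ** (flips - heads)


def bayes_factor(heads, flips, p_h1, p_h0=0.5):
    """Likelihood of the result under H_1 divided by its likelihood under H_0."""
    return binomial_likelihood(heads, flips, p_h1) / binomial_likelihood(
        heads, flips, p_h0
    )


def predicted_heads(flips, p):
    """Expected number of heads and its standard deviation."""
    return flips * p, math.sqrt(flips * p * (1.0 - p))


# Hypothetical data: 7 heads in 10 flips; H_0: p = 0.5, H_1: p = 0.7
print(bayes_factor(7, 10, p_h1=0.7))
# Fair coin flipped 10000 times: mean and standard deviation of the head count
print(predicted_heads(10000, 0.5))
# Unfair coin with an assumed bias of 0.7
print(predicted_heads(10000, 0.7))
```

A Bayes factor above 1 favors H_1 and a value below 1 favors H_0; how far from 1 it has to be before it counts as convincing is a separate judgment.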

SPSS AMOS
AMOS (Analysis of Moment Structures): visual Structural Equation Modeling
Structural Equation Modeling (SEM)
- Statistical technique used to establish relationships between variables
- Correspondence between the model specified and the data collected
With AMOS it is possible to quickly specify, view, and modify your model graphically using simple drawing tools.

Example of a regression model
Hamilton (1990)
- Average SAT score (Scholastic Assessment Test)
- Income expressed in $1,000 units
- Median education for residents 25 years of age or older

Example of a regression model
SAT ~ income + education + e

Regression model: Bayesian approach
- Select "Estimate means and intercepts"
- Choose Analyze > Bayesian Estimation
- The MCMC (Markov Chain Monte Carlo) algorithm begins sampling immediately and continues until you click the Pause Sampling button to halt the process
- The MCMC algorithm samples random values of the parameters from a probability distribution
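To give a feel for what "sampling parameter values from a probability distribution" means, here is a minimal random-walk Metropolis sampler. It is a sketch under stated assumptions, not the algorithm AMOS runs internally: the model (a coin's probability of heads with a uniform prior), the data (7 heads in 10 flips), the burn-in length, and the step size are all hypothetical choices made for illustration.

```python
import math
import random


def log_posterior(theta, heads, flips):
    """Log of (uniform prior x binomial likelihood), up to a constant."""
    if not 0.0 < theta < 1.0:
        return float("-inf")  # zero prior probability outside (0, 1)
    return heads * math.log(theta) + (flips - heads) * math.log(1.0 - theta)


def metropolis(heads, flips, n_samples, burn_in=500, step=0.2, seed=1):
    """Draw samples of theta from its posterior with random-walk Metropolis."""
    rng = random.Random(seed)
    theta = 0.5
    current = log_posterior(theta, heads, flips)
    samples = []
    for i in range(burn_in + n_samples):
        proposal = theta + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, heads, flips)
        # Accept the proposal with probability min(1, posterior ratio)
        if rng.random() < math.exp(min(0.0, candidate - current)):
            theta, current = proposal, candidate
        if i >= burn_in:  # keep only draws made after the burn-in phase
            samples.append(theta)
    return samples


samples = metropolis(heads=7, flips=10, n_samples=20000)
print(sum(samples) / len(samples))  # close to the exact posterior mean, 8/12
```

With a uniform prior the exact posterior here is Beta(8, 4), so the sample mean can be checked against 8/12; in a real SEM no such closed form is available, which is why sampling is used.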

Regression model: Bayesian approach (90,500 analysis samples)

Regression model: Bayesian approach
For each parameter:
- Mean = estimated posterior mean (averaging across the MCMC samples)
- S.E. = likely distance between the estimated posterior mean and the true posterior mean
- S.D. = likely distance between the posterior mean and the unknown true parameter
- C.S. = Convergence Statistic
- Median value
- Lower and upper 95% boundaries of the distribution (credible interval)
- Skewness and kurtosis
- Minimum and maximum value
- Name
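Most of these columns are plain summaries of the MCMC draws for one parameter. The sketch below shows how such a summary could be computed; it is not AMOS's own code. The draws are simulated stand-ins, the S.E. uses the naive formula S.D. / sqrt(n), which assumes independent draws, and the convergence statistic, skewness, and kurtosis are left out.

```python
import math
import random
import statistics


def summarize(samples, level=0.95):
    """Summarize the posterior draws of a single parameter."""
    ordered = sorted(samples)
    n = len(ordered)
    tail = (1.0 - level) / 2.0
    sd = statistics.stdev(ordered)
    return {
        "mean": statistics.fmean(ordered),
        # Naive Monte Carlo error: assumes independent draws (an assumption;
        # autocorrelated MCMC draws make the real error larger).
        "se": sd / math.sqrt(n),
        "sd": sd,
        "median": statistics.median(ordered),
        # Equal-tailed interval read off the sorted draws
        "lower": ordered[math.floor(tail * (n - 1))],
        "upper": ordered[math.ceil((1.0 - tail) * (n - 1))],
        "min": ordered[0],
        "max": ordered[-1],
    }


# Hypothetical draws standing in for the MCMC output of one parameter
rng = random.Random(0)
draws = [rng.gauss(0.3, 0.1) for _ in range(5000)]
for name, value in summarize(draws).items():
    print(f"{name:>6}: {value:.3f}")
```

The lower and upper bounds are taken as quantiles of the draws, so they need not be symmetric about the mean when the posterior is skewed.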

Regression model: Bayesian approach
CREDIBLE INTERVAL
- Is interpreted as a probability statement about the parameter itself
- 95% sure that the true value lies between -4.840 and 9.292
- Thus, it can be equal to 0: accept H_0

Regression model: Bayesian approach
CREDIBLE INTERVAL
- 95% sure that the true value lies between 67.033 and 203.38
- Thus, > 0: accept H_1

Regression model: Bayesian approach
CREDIBLE INTERVAL
- 95% sure that the true value lies between 0.117 and 0.479
- Thus, > 0
- But still quite small
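The decision rule applied to the three credible intervals in this example (does the 95% interval contain 0?) can be written out explicitly. This is a small sketch of that rule only; the helper name is illustrative, and the intervals are the ones quoted on the slides.

```python
def interval_excludes_zero(lower, upper):
    """True when 0 lies outside the credible interval [lower, upper]."""
    return lower > 0.0 or upper < 0.0


# 95% credible intervals reported in the example
intervals = [(-4.840, 9.292), (67.033, 203.38), (0.117, 0.479)]

for lower, upper in intervals:
    verdict = "accept H_1" if interval_excludes_zero(lower, upper) else "accept H_0"
    print(f"[{lower}, {upper}] -> {verdict}")
```

The rule says nothing about magnitude: the third interval excludes 0, yet its values are small, which is why the interval itself is worth reporting alongside the verdict.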

Frequentist vs Bayesian
FREQUENTIST APPROACH
- Can only falsify H_0, but can't say much about H_1 (which is my real interest)
- With large sample sizes always favors H_1
- P-value is sensitive to N
- Assumptions are often neglected
- p only indicates whether a value is significantly different from 0, not by how much
BAYESIAN APPROACH
- Direct test of the hypothesis I'm interested in
- More powerful with both small and large sample sizes
- With large sample sizes tends to favor the hypothesis which is more likely
- If the posterior distribution is not normal, the credible interval will not be symmetric about the posterior mean
- Avoids misleading interpretations of the p-value and gives a measure of how large the effect is

Statistical significance is not a scientific test. It is a philosophical, qualitative test. It does not ask how much. It asks whether. Existence, the question of whether, is interesting. But it is not scientific. (Ziliak & McCloskey, 2008)

Further reading
MANY ISSUES COULD NOT BE COVERED! (Seeds, convergence, priors, other applications in SPSS AMOS, ...)
- Gelman et al. Bayesian Data Analysis (recent 3rd edition)
- Berry (1996). Introductory text on Bayesian methods
- Lee (2004). Good intro to Bayesian inference
- Bernardo and Smith (1994). Advanced text on Bayesian theory
- Hoff, P. D. (2009). A First Course in Bayesian Statistical Methods. Springer Texts in Statistics
- Kruschke, J. K. (2010). Doing Bayesian Data Analysis: A Tutorial with R and BUGS. Academic Press/Elsevier Science