Bayesian hierarchical modelling


Slide 1: Bayesian hierarchical modelling
Matthew Schofield
Department of Mathematics and Statistics, University of Otago

Slide 2: What is a statistical model?
A statistical model is a data generating process $f(y \mid \theta)$
The model can be used to simulate data
  Fix the parameters $\theta$ and consider realizations $y$
The model can be used for statistical inference
  Observe data $y$
  Estimate parameters $\hat{\theta}$
  Infer to the population of interest
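The two uses of a model on this slide can be sketched in a few lines of Python. This is only an illustration: the choice $f(y \mid \theta) = N(\theta, 1)$, the value `theta_true`, and the function names are assumptions made here, not part of the talk.

```python
import random
import statistics

def simulate(theta, n, rng):
    """Draw n realizations y from f(y | theta), here assumed to be N(theta, 1)."""
    return [rng.gauss(theta, 1.0) for _ in range(n)]

def estimate(y):
    """Maximum likelihood estimate of theta under N(theta, 1): the sample mean."""
    return statistics.fmean(y)

rng = random.Random(1)      # fixed seed so the run is reproducible
theta_true = 2.0            # hypothetical "true" parameter value

y = simulate(theta_true, n=1000, rng=rng)   # simulation: theta fixed, y random
theta_hat = estimate(y)                     # inference: y observed, theta estimated
print(f"theta_hat = {theta_hat:.3f}")
```

Simulation runs the model forwards from $\theta$ to $y$; inference runs it backwards from $y$ to $\hat{\theta}$.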

Slide 3: What is a hierarchical model?
The parameters $\theta$ are described by a probability model $f(\theta \mid \psi)$
  $\theta$ is considered a random variable
Special cases include:
  Mixed models
  Latent variable models
  Missing data models
  Various forms of overdispersion
  Penalized regression
  ...

Slide 4: What is Bayesian statistics?
An alternative approach to statistical inference
  Probability is used to express uncertainty
Update our knowledge (with data):
  $\underbrace{f(\theta)}_{\text{prior distribution}} \;\rightarrow\; \underbrace{f(y \mid \theta)}_{\text{collect data}} \;\rightarrow\; \underbrace{f(\theta \mid y)}_{\text{posterior distribution}}$
The posterior distribution $f(\theta \mid y)$ reflects our updated knowledge
  Used for inference
Often the prior is chosen to reflect ignorance
  Reference or default prior
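The update from prior to posterior is Bayes' theorem. Written in the notation of this slide (the dummy integration variable $\theta'$ is introduced here only for the normalising constant):

```latex
f(\theta \mid y)
  = \frac{f(y \mid \theta)\, f(\theta)}{\int f(y \mid \theta')\, f(\theta')\, d\theta'}
  \;\propto\; f(y \mid \theta)\, f(\theta)
```

The denominator does not depend on $\theta$, which is why the posterior is often described as "likelihood times prior" up to a constant.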

Slide 5: Bayesian hierarchical modelling
Combine the previous two slides
  Use Bayesian statistics for inference from hierarchical models
The two are often combined
  Hierarchical modelling is natural within a Bayesian context
  Relatively simple to specify and fit hierarchical models

Slide 6: Example I: muscle fibres
Observe fibre-level data across a muscle cross-section
  Binary: slow-twitch or fast-twitch fibre

Slide 6 (continued): Example I: muscle fibres
Fibres are grouped within fascicles
  Multiple fascicles make up a muscle
Goal: understand how fibre composition depends on location
  Conjecture: function declines near fascicle and muscle edge
Model occurs at two levels:
  Fibre level
    Parameters describing how fibres vary within a fascicle
  Fascicle level
    Model the fibre-level parameters based on fascicle location
Complexity: allow for additional spatial covariation

Slide 7: Example II: genetic mapping
SNP data from high-throughput sequencing
Full-sibling family population
  Outcrossing of two individuals
Output: a genetic map
  Locating the (SNP) markers on the genome
  Estimating the genetic distance between markers

Slide 7 (continued): Example II: genetic mapping
Statistical model includes:
  A parameter that accounts for genotyping error
    Nuisance parameter
  A collection of parameters that describe crossover
    Functions of these parameters determine genetic distance
    One parameter for each marker: data hungry
    Consider as a realization from a hierarchical model
    Borrow strength and improve estimation?
Other advantages:
  Prior specification
  Potential for model extension describing relationship
  Consider map uncertainty

Slide 8: Example III: animal abundance (a cautionary tale)
Avoid tagging animals (difficult)
  Use repeated counts to estimate abundance
  Assume the distribution (binomial) is the same each visit
Both the index ($N$) and the probability ($p$) are unknown
  If we have 2 replicates both parameters can be estimated
Properties have long been studied
  Peter Hall (1992): On the Erratic Behavior of Estimators of N in the Binomial N, p distribution
Use repeated trials (across space)
  Consider the abundances ($N$s) as a realization from a hierarchical model
  Borrow strength and improve estimation?
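A small sketch of why estimating the binomial index is delicate. It uses the method-of-moments estimator, obtained by matching mean $= Np$ and variance $= Np(1-p)$, as a stand-in for the "standard estimators" mentioned above. The counts are illustrative and not from the talk; the function name is hypothetical.

```python
import statistics

def moment_estimates(counts):
    """Method-of-moments estimates (N_hat, p_hat) for Binomial(N, p) counts.

    Matches mean = N p and variance = N p (1 - p). Returns None when the
    sample variance is at least the sample mean, where the estimator
    breaks down (p_hat would be zero or negative).
    """
    m = statistics.fmean(counts)
    s2 = statistics.pvariance(counts)
    if s2 >= m:
        return None
    p_hat = 1.0 - s2 / m
    return m / p_hat, p_hat

# Illustrative counts (not from the talk): the two sets differ by one animal
a = moment_estimates([16, 18, 22, 25, 27])
b = moment_estimates([16, 18, 22, 25, 28])
print(f"N_hat = {a[0]:.0f} versus N_hat = {b[0]:.0f}")
```

Changing a single count from 27 to 28 moves the estimate of $N$ from roughly 102 to roughly 195, which is the kind of erratic behaviour the slide warns about.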

Slide 9: Other examples
Climate reconstruction
Missing data in earthquake records
Density dependence from mark-recapture data
...

Slide 10: Some advantages
Model latent variables
  Describe a model for a hidden or partially observed process
  Separate data collection (nuisance) and process modelling
Specify a complex marginal model for the data
  A series of simple conditional models
  Return to this point later
Improved estimation
  Specifying hierarchical models can improve estimation
  Broadly applicable
  Ideas go back to work by James and Stein
  Look at some simulation results

Slide 11: Simulation: ANOVA type model
Five groups, each with 10 observations
  Variance is known: 1
Look at two scenarios:
  1. Five means are similar: $\mu = (0, 0.1, 0.1, 0.2, 0.2)$
  2. Five means are unrelated: $\mu = (0, 100, 100, 200, 200)$
Look at the mean square error of the $\mu$s:
  Standard ANOVA model
    $y_{ij} \overset{\text{iid}}{\sim} N(\mu_j, 1)$
  Hierarchical model
    $y_{ij} \overset{\text{iid}}{\sim} N(\mu_j, 1)$
    $\mu_j \overset{\text{iid}}{\sim} N(\alpha, \kappa^2)$
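A rough version of this simulation can be run without MCMC. The sketch below swaps the full Bayesian fit for an empirical-Bayes plug-in: $\alpha$ is estimated by the grand mean and $\kappa^2$ by a truncated moment estimate. That is an assumption made here for brevity, not the method used in the talk, and the values of $\mu$ are taken as they appear in the transcript.

```python
import random
import statistics

def compare_mse(mu, n=10, reps=2000, seed=1):
    """Average total squared error of the no-pooling and shrinkage estimates."""
    rng = random.Random(seed)
    se2 = 1.0 / n                      # variance of a group mean (known variance 1)
    sse_anova = sse_hier = 0.0
    for _ in range(reps):
        # Group means of n observations each, y_ij ~ N(mu_j, 1)
        ybar = [statistics.fmean([rng.gauss(m, 1.0) for _ in range(n)]) for m in mu]
        alpha_hat = statistics.fmean(ybar)
        # Moment estimate of kappa^2, truncated at zero
        kappa2_hat = max(0.0, statistics.variance(ybar) - se2)
        shrink = se2 / (se2 + kappa2_hat)   # 1 = complete pooling, 0 = no pooling
        hier = [alpha_hat + (1.0 - shrink) * (yb - alpha_hat) for yb in ybar]
        sse_anova += sum((yb - m) ** 2 for yb, m in zip(ybar, mu))
        sse_hier += sum((h - m) ** 2 for h, m in zip(hier, mu))
    return sse_anova / reps, sse_hier / reps

similar = compare_mse([0, 0.1, 0.1, 0.2, 0.2])
unrelated = compare_mse([0, 100, 100, 200, 200])
print("similar means:   ANOVA %.3f, shrinkage %.3f" % similar)
print("unrelated means: ANOVA %.3f, shrinkage %.3f" % unrelated)
```

With similar means the shrinkage estimate should show a clearly lower total squared error; with unrelated means the estimated $\kappa^2$ is large, almost no shrinkage is applied, and the two errors are nearly identical.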

Slide 12: Simulation: similar values of $\mu$
[Figure: difference in squared errors (positive values: hierarchical model preferred)]
The hierarchical model has lower MSE than the standard ANOVA model

Slide 13: Simulation: unrelated values of $\mu$
[Figure: difference in squared errors (positive values: hierarchical model preferred)]
When the $\mu_j$s are unrelated, the hierarchical model has done no harm
  Return to this later

Slide 14: Relatively straightforward to fit
The model above is straightforward to specify and fit in freely available software, e.g. JAGS.

    model{
      for(j in 1:G){
        for(i in 1:n[j]){
          # dnorm takes a mean and a precision (1/variance)
          y[i,j] ~ dnorm(mu[j],1)
        }
        mu[j] ~ dnorm(alpha,tau)
      }
      ### Prior distributions -- their specification is for another talk
      tau <- 1/kappa^2
      # half-t prior on kappa: t distribution truncated to positive values
      kappa ~ dt(0,0.04,3)T(0,)
      alpha ~ dnorm(0,0.0001)
    }

Slide 15: Relatively straightforward to fit
When fitting hierarchical models using MCMC
  Computational issues can and do arise
  Generally easier than finding MLEs
Extending the hierarchical model is relatively easy
  E.g. we could allow the variance of $y$ to be:
    Unknown
    Vary by group
    Hierarchical distribution

Slide 16: Model extensions: JAGS

    model{
      for(j in 1:G){
        for(i in 1:n[j]){
          y[i,j] ~ dnorm(mu[j],tauy[j])
        }
        mu[j] ~ dnorm(alpha[1],tau[1])
        # precision is 1/sd^2, matching tau[h] <- 1/kappa[h]^2 below
        tauy[j] <- 1/sdy[j]^2
        sdy[j] ~ dlnorm(alpha[2],tau[2])
      }
      ### Prior distributions
      for(h in 1:2){
        tau[h] <- 1/kappa[h]^2
        kappa[h] ~ dt(0,0.04,3)T(0,)
        alpha[h] ~ dnorm(0,0.0001)
      }
    }

Slide 17: What are we doing? Specifying a marginal model
We specify the model conditionally
  $f(y \mid \theta)\, f(\theta \mid \psi)$
The model is fitted marginally
  $f(y \mid \psi) = \int f(y \mid \theta)\, f(\theta \mid \psi)\, d\theta$
  MCMC performs the (numerical) integration for us
With simple conditional models
  Results in complex marginal models
  Some care is required
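For the ANOVA-type hierarchical model of slide 11 the integral can be done by hand, which makes the point concrete. Writing $y_{ij} = \mu_j + \varepsilon_{ij}$ with $\varepsilon_{ij} \sim N(0, 1)$ independent of $\mu_j$ (the symbol $\varepsilon_{ij}$ is introduced here for the derivation), integrating out $\mu_j$ gives:

```latex
y_{ij} \mid \mu_j \sim N(\mu_j, 1), \qquad \mu_j \mid \alpha, \kappa^2 \sim N(\alpha, \kappa^2)
\;\;\Longrightarrow\;\;
y_{ij} \mid \alpha, \kappa^2 \sim N(\alpha,\, 1 + \kappa^2),
\qquad
\operatorname{Cov}(y_{ij}, y_{i'j} \mid \alpha, \kappa^2) = \kappa^2 \quad (i \neq i')
```

Two simple conditional statements therefore imply a marginal model in which observations from the same group are correlated, with correlation $\kappa^2 / (1 + \kappa^2)$.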

Slide 18: Marginal and conditional models
Many common distributions are marginal hierarchical models
  Negative binomial:
    Conditional: $y \sim \text{Pois}(\theta)$, $\theta \sim \text{Gamma}(\alpha, \beta)$
    Marginal: $y \sim \text{NB}(y; \alpha, \beta)$
  t distribution:
    Conditional: $y \sim N(\mu, \theta)$, $\theta \sim \text{Gamma}^{-1}\left(\frac{\nu}{2}, \frac{\nu}{2}\sigma^2\right)$
    Marginal: $y \sim t_\nu(\mu, \sigma^2)$
  Beta-binomial
  Mixture models
  Probit regression
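The negative binomial case can be checked by simulation. The sketch below assumes $\beta$ is a rate parameter, so that the marginal mean is $\alpha/\beta$ and the marginal variance is $\alpha/\beta + \alpha/\beta^2$. The values of `alpha` and `beta` and the helper `poisson` are illustrative choices, not from the talk.

```python
import math
import random
import statistics

def poisson(lam, rng):
    """Draw one Poisson(lam) variate using Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

rng = random.Random(42)
alpha, beta = 3.0, 0.5          # hypothetical shape and rate for the gamma layer

draws = []
for _ in range(50_000):
    theta = rng.gammavariate(alpha, 1.0 / beta)   # gammavariate takes shape and scale
    draws.append(poisson(theta, rng))             # y | theta ~ Pois(theta)

sample_mean = statistics.fmean(draws)
sample_var = statistics.variance(draws)
mean_theory = alpha / beta                        # marginal mean
var_theory = alpha / beta + alpha / beta ** 2     # marginal variance, larger than the mean
print(f"mean {sample_mean:.2f} (theory {mean_theory:.2f})")
print(f"variance {sample_var:.2f} (theory {var_theory:.2f})")
```

The variance exceeds the mean, which is the overdispersion that a plain Poisson model cannot represent.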

Slide 19: What are we doing? Partial pooling
Consider the ANOVA model
  Two choices:
    1. Means are different (no pooling)
    2. Means are the same (complete pooling)
Hierarchical modelling gives an intermediate option
  Means are different but related (partial pooling)

Slide 19 (continued): What are we doing? Partial pooling
[Figure: batting averages, comparing James-Stein (JS) and maximum likelihood (MLE) estimates]
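For the ANOVA-type model with unit variance, partial pooling has a closed form once $\alpha$ and $\kappa^2$ are treated as known. The group sample mean $\bar{y}_j$, the group size $n_j$ and the weight $w_j$ are notation introduced here, and in a full Bayesian fit the expression is also averaged over the posterior of $\alpha$ and $\kappa^2$:

```latex
E[\mu_j \mid y, \alpha, \kappa^2] = w_j\, \bar{y}_j + (1 - w_j)\, \alpha,
\qquad
w_j = \frac{n_j \kappa^2}{n_j \kappa^2 + 1}
```

As $\kappa^2 \to \infty$ the weight $w_j \to 1$ and each mean is estimated separately (no pooling); as $\kappa^2 \to 0$ the weight $w_j \to 0$ and every group is estimated by the common mean $\alpha$ (complete pooling).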

Slide 20: What are we doing? Biased estimation
Consider the ANOVA model
  Gauss-Markov theorem: BLUE
The hierarchical model is introducing bias
  Simulation 1: $E[\tilde{\mu}_5] \approx 0.1$ with $\mu_5 = 0.2$ ($\tilde{\mu}_5$: hierarchical estimate)
Increased bias is associated with decreased variance
  Simulation 1: $\text{Var}(\tilde{\mu}_5) \approx 0.05$ compared to $\text{Var}(\hat{\mu}_5) \approx 0.1$ ($\hat{\mu}_5$: ANOVA estimate)
Introduce bias to improve (decrease) the mean square error
  Goes back to work by James, Stein, Efron, Morris, ...
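The trade-off can be made explicit with the numbers quoted for simulation 1, using the decomposition of mean square error into squared bias plus variance. Here $\tilde{\mu}_5$ denotes the hierarchical estimate and $\hat{\mu}_5$ the unbiased ANOVA estimate; this labelling is an assumption about notation lost from the transcript.

```latex
\text{MSE}(\tilde{\mu}_5) = \left(E[\tilde{\mu}_5] - \mu_5\right)^2 + \text{Var}(\tilde{\mu}_5)
  \approx (0.1 - 0.2)^2 + 0.05 = 0.06,
\qquad
\text{MSE}(\hat{\mu}_5) = 0 + \text{Var}(\hat{\mu}_5) \approx 0.1
```

The squared bias of about 0.01 is more than paid for by the drop in variance from about 0.1 to about 0.05.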

Slide 21: Example 1
Goal: describe the spatial distribution of fibres on the muscle
Probit regression at the fibre level (for each fascicle)
  Predictor is the distance from the edge of the fascicle
  Spatial model on the error structure
Both intercept and slope are modelled at the fascicle level
  Intercept describes the relative abundance of fast/slow twitch fibres
    Modelled as a function of distance from the muscle edge
  Slope describes how the fast/slow twitch composition changes within a fascicle as a function of distance
    Modelled as a function of distance from the muscle edge
  Common spatial process at the fascicle level

Slide 22: Example 1
[Figure: two panels, $\beta_0$ and $\beta_1$]

Slide 23: Example 1
Distance explained the spatial variability of type at the fibre level
Distance only partially explained the variability of the parameters in the fascicle-level model
  Considerable spatial clustering after accounting for distance
Distance from edge appears to be an important predictor within fascicles
  Assess the importance at multiple levels within the muscle
Future: embed this within a larger hierarchical model to assess demographic changes in muscle composition

Slide 24: Example 2
Work in progress showing considerable promise

Slide 25: Cautions and limitations
Hierarchical modelling has the potential for abuse
  Replace data with model assumption
Several examples
  Example 3 (N-mixture model)
  Latent class analysis for diagnostic testing
  Estimating abundance from occupancy data
  Including heterogeneity in mark-recapture models
  Factor analysis
  ...
Simplicity of model fitting can lead to pushing the boundaries.

Slide 26: A continuum of models
[Diagram: models placed along a continuum in the order $M_1$, $M_5$, $M_3$, $M_4$, $M_2$]
  Left end: data model estimable without hierarchy; stage 2: partial pooling (Examples 1 and 2)
  Right end: data model overspecified; estimable with hierarchy
As we move from left to right:
  Sensitivity to the hierarchical model increases
  Increasing reliance on the specification of the hierarchical model

Slide 27: Model checking
Model fitting is done marginally
  Suggests we need to assess fit marginally
RHS of continuum: hierarchical model essential
  Important part of model adequacy
Should we check model fit conditionally?
  Trade-off between data and process models
How do we assess fit?

Slide 28: What do the latent variables represent?
When the latent variable is a first moment
  Assess directly against data
  We may need to pool
If the latent variable is not a first moment
  Cannot directly assess variables against data
  Estimation can be sensitive to minor changes in data
  Process variables need not reflect any physical quantity
RHS of continuum & not related to a moment
  Good marginal fit with latent variable not reflecting reality

Slide 29: Example 3
Challenges to fitting model to one site
  Erratic behaviour of standard estimators
  Sensitive to model assumptions (cf. Poisson)
Marginal model is multivariate Poisson
  Mean $= \lambda p = \mu$
  Variance $= \lambda p = \mu$
  Correlation $= p$
Latent abundances do not relate to first moment
  Good information regarding $\mu$
  Information about $\lambda$ (and the site-specific $N$s)
    Depends on $p$ (estimated from second moment)
    Ratio
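The marginal moments listed above can be checked by simulating an N-mixture model with two visits per site. The sketch assumes $N \sim \text{Poisson}(\lambda)$ at each site and independent $\text{Binomial}(N, p)$ counts on each visit. The two $(\lambda, p)$ settings, the number of sites and the function names are illustrative choices, not from the talk.

```python
import math
import random
import statistics

def poisson(lam, rng):
    """Draw one Poisson(lam) variate using Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def simulate_counts(lam, p, sites, rng):
    """Two counts per site: N ~ Poisson(lam), each count ~ Binomial(N, p)."""
    y1, y2 = [], []
    for _ in range(sites):
        n_animals = poisson(lam, rng)      # latent abundance, never observed
        y1.append(sum(rng.random() < p for _ in range(n_animals)))
        y2.append(sum(rng.random() < p for _ in range(n_animals)))
    return y1, y2

def summarise(y1, y2):
    """Mean and variance of the first count, and correlation between the counts."""
    m1, m2 = statistics.fmean(y1), statistics.fmean(y2)
    cov = sum((a - m1) * (b - m2) for a, b in zip(y1, y2)) / (len(y1) - 1)
    corr = cov / math.sqrt(statistics.variance(y1) * statistics.variance(y2))
    return m1, statistics.variance(y1), corr

rng = random.Random(7)
results = {}
# Two hypothetical settings with the same mu = lam * p = 6
for lam, p in [(20.0, 0.3), (10.0, 0.6)]:
    results[(lam, p)] = summarise(*simulate_counts(lam, p, 20_000, rng))
    mean, var, corr = results[(lam, p)]
    print(f"lam={lam:4.0f} p={p:.1f}: mean={mean:.2f} var={var:.2f} corr={corr:.2f}")
```

Both settings give the same mean and variance, so the first moment says nothing about which is right. Only the correlation between repeat visits separates $\lambda = 20$ from $\lambda = 10$, which is why inference about abundance leans so heavily on the second moment.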

Slide 30: Summary and discussion
Hierarchical models have considerable appeal
  Degree of flexibility in model specification
  Separate data model and process models
Hierarchical models can offer improved estimation
  Partial pooling
  Borrowing strength
  Regularization
  Bayesian approach offers advantages
Hierarchical modelling cannot absolve all statistical sins
  Potential for a poor model to attain a veneer of respectability
  Need improved understanding of model adequacy

Slide 31: Acknowledgements
Collaborators on various examples
  Tilman Davies, Phil Sheard, Jon Cornwall
  Timothy Bilton et al.
  Richard Barker, Bill Link, John Sauer