Motivating Simulation / HARK Model / Results for the Sleep Data / Conclusions

HARK: A New Approach for Regression with Functional Predictors
Dawn Woodard
Operations Research and Information Engineering, Cornell University
Ciprian Crainiceanu (Johns Hopkins), David Ruppert (Cornell)
JSM, August 2010

Regression with Functional Predictors
Regression with functional predictors:
- Our application: relating sleep patterns, as measured using electroencephalograms (EEG), to health outcomes such as cardiovascular health indicators.
- Other applications: estimating chemical variables from spectroscopic data; relating diffusion tensor images to multiple sclerosis.
Most existing methods:
- Make strong linearity and additivity assumptions.
- Fail to account for events that occur at variable times, such as sleep transitions in the EEG data.

Regression with Functional Predictors
Our method (HARK: Hierarchical Adaptive Regression Kernels):
- Represent the functional predictor with a nonparametric kernel mixture model.
  - Parsimonious, interpretable.
  - Captures features such as spikes, bumps, and dips, whose frequency, location, and size vary across subjects.
- Regress the outcome on summaries of this representation, e.g. the frequency of bumps, or their average height or width.
- Joint inference on functional representations and regression parameters.

Regression with Functional Predictors
Advantages:
- Does not require alignment of functions or observation locations, or a common domain
- Naturally handles missing, co-located data

Existing Work
Most existing methods relate the outcome to a finite set of coefficients from a basis function representation of the predictor:
- Principal component scores: Cardot et al. (2003); Müller and Stadtmüller (2005)
- Spline coefficients: James (2002)
- Fourier coefficients: Ramsay and Silverman (2005)
- Partial least squares coefficients: Goutis and Fearn (1996); Reiss and Ogden (2007)

Existing Work
These methods assume that the expected response $Y_i$ is linear and additive in the functional predictor $f_i(x)$ at each location $x$:
$E(Y_i) = \int f_i(x)\,\beta(x)\,dx$
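
As a rough numerical illustration of the functional linear model above, the integral can be approximated on a grid with the trapezoid rule. This is only a sketch: the function name `functional_linear_mean` and the example grid are illustrative assumptions, not something taken from the talk.

```python
def functional_linear_mean(xs, f_vals, beta_vals):
    """Trapezoid-rule approximation of the integral of f(x) * beta(x) dx.

    xs        : increasing grid of locations x
    f_vals    : values of the functional predictor f_i at xs
    beta_vals : values of the coefficient function beta at xs
    """
    total = 0.0
    for k in range(len(xs) - 1):
        left = f_vals[k] * beta_vals[k]
        right = f_vals[k + 1] * beta_vals[k + 1]
        total += 0.5 * (left + right) * (xs[k + 1] - xs[k])
    return total


# Example: f(x) = 1 and beta(x) = x on [0, 1]; the exact integral is 0.5.
grid = [k / 10 for k in range(11)]
approx = functional_linear_mean(grid, [1.0] * 11, grid)
```

Because $\beta(x)$ is tied to fixed locations $x$, a feature of $f_i$ contributes to $E(Y_i)$ only through where it happens to fall on the grid, which is the limitation the talk goes on to illustrate.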

Motivating Simulation
For subjects $i$, generate noisy observations from a function $f_i(x)$ having a single blip at random time $\mu_i \in [0, 100]$, with random amplitude $\gamma_i \in [5, 20]$:
[Figure: noisy observations $W$ and true function $f_i(x)$ versus time $x$, for subjects $i = 1$ and $i = 2$.]
- Take the outcome to be $\gamma_i$.
- Try to detect this relationship between predictor $f_i(x)$ and outcome $\gamma_i$ using (a) HARK; (b) principal component regression.
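
A minimal sketch of how data of this kind could be generated is given below. The slide only states the ranges for $\mu_i$ and $\gamma_i$; the uniform draws, the Gaussian shape of the blip, its width `blip_sd`, the noise level `noise_sd`, and the evenly spaced grid are all assumptions made for illustration.

```python
import math
import random


def simulate_subject(rng, n_obs=101, blip_sd=3.0, noise_sd=1.0):
    """Simulate one subject: a single blip plus Gaussian noise.

    Assumptions (not from the talk): mu and gamma are uniform on their
    ranges, the blip is Gaussian-shaped with width blip_sd, and the
    observation noise is N(0, noise_sd^2) on an even grid over [0, 100].
    """
    mu = rng.uniform(0.0, 100.0)      # random blip time
    gamma = rng.uniform(5.0, 20.0)    # random blip amplitude (the outcome)
    xs = [100.0 * k / (n_obs - 1) for k in range(n_obs)]
    f = [gamma * math.exp(-0.5 * ((x - mu) / blip_sd) ** 2) for x in xs]
    w = [fx + rng.gauss(0.0, noise_sd) for fx in f]
    return {"mu": mu, "gamma": gamma, "x": xs, "f": f, "w": w}


rng = random.Random(0)
subjects = [simulate_subject(rng) for _ in range(20)]
outcomes = [s["gamma"] for s in subjects]  # outcome Y_i = gamma_i
```

Each simulated subject carries its own blip time, so the informative feature sits at a different location in every curve.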

Motivating Simulation
HARK effectively captures the relationship between predictor and outcome, even for the smallest sample sizes:
- Represents the predictor using a Gaussian kernel mixture
- Finds that the average magnitude of the mixture components is positively correlated with the outcome

Motivating Simulation
Principal component regression: PC functions are difficult to interpret:
[Figure: principal component functions PC 1 and PC 10 versus time $x$.]

Motivating Simulation
Principal component regression: the estimated regression coefficient function $\beta(x)$ is hard to interpret:
[Figure: estimated $\beta(x)$ versus time $x$.]
(Recall $E(Y_i) = \int f_i(x)\,\beta(x)\,dx$.)

HARK Model
Function Representation / Regression Model
Nonparametric functional data model for subject $i$:
- Noisy observations $W_i(x_{ik})$ of a functional predictor $f_i(x)$ at locations $x_{ik} \in \mathcal{X}_i$:
  $W_i(x_{ik}) \overset{ind.}{\sim} N(f_i(x_{ik}), \tau_i^2)$
- Kernel mixture model for $f_i(\cdot)$:
  $f_i(x) = \beta_{0i} + \sum_{m=1}^{M_i} \gamma_{im} K(x, s_{im})$,
  where $K(x, s)$ is a specified kernel function on $\mathcal{X}_i \times \mathcal{S}$ and the parameters of the kernel are defined on $\mathcal{S}$.

HARK Model
$f_i(x) = \beta_{0i} + \sum_{m=1}^{M_i} \gamma_{im} K(x, s_{im})$
- $M_i$: number of mixture components
- $\gamma_{im} \in \mathbb{R}$ and $s_{im} \in \mathcal{S}$: magnitudes and parameter vectors of the mixture components
- E.g. for $K$ a Gaussian kernel, $s_{im} = (\mu_{im}, \sigma^2_{im})$.
- Scaling and other parameters can vary between components, adapting to the local features of $f_i(\cdot)$
- Sparsity is induced through the priors on $M_i$, $\gamma_{im}$, and $s_{im}$.
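
The kernel mixture representation above can be evaluated in a few lines. The sketch below assumes an unnormalized Gaussian kernel $K(x, (\mu, \sigma^2)) = \exp\{-(x - \mu)^2 / (2\sigma^2)\}$; the talk does not state the normalization, and the function names are hypothetical.

```python
import math


def gaussian_kernel(x, mu, sigma2):
    """Unnormalized Gaussian kernel (assumed form; peak value 1 at x = mu)."""
    return math.exp(-0.5 * (x - mu) ** 2 / sigma2)


def kernel_mixture(x, beta0, components):
    """Evaluate f_i(x) = beta_0i + sum_m gamma_im * K(x, s_im).

    components is a list of (gamma, mu, sigma2) tuples, one per mixture
    component; an empty list gives the constant function beta0.
    """
    return beta0 + sum(
        gamma * gaussian_kernel(x, mu, sigma2)
        for gamma, mu, sigma2 in components
    )


# One bump of height 3 at x = 2 and one dip of depth 1 at x = 8.
comps = [(3.0, 2.0, 0.5), (-1.0, 8.0, 0.25)]
value_at_bump = kernel_mixture(2.0, 1.0, comps)
```

Because each component has its own location and scale, the number of tuples in `components` plays the role of $M_i$ and can differ from subject to subject.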

HARK Model
Illustration for a test function:
[Figure: three panels plotted against $x$: test data; estimated and true function; mixture representation (one posterior sample).]

HARK Model
Regression model:
- Function representation is $\omega_i = (\beta_{0i}, \tau_i^2, \{(\gamma_{im}, s_{im})\}_{m=1}^{M_i})$
- Define a vector $\theta(\omega_i)$ of summaries of $\omega_i$. E.g. when $s_{im} = (\mu_{im}, \sigma^2_{im})$:
  $\theta(\omega_i) = (1, \beta_{0i}, \tau_i^2, M_i, \bar{\gamma}_i, \bar{\sigma}^2_i)$
  where $\bar{\gamma}_i = \sum_{m=1}^{M_i} \gamma_{im} / M_i$ and $\bar{\sigma}^2_i = \sum_{m=1}^{M_i} \sigma^2_{im} / M_i$.
- Linear regression model for the outcome $Y_i$ given $\theta_i = \theta(\omega_i)$:
  $Y_i \overset{ind.}{\sim} N(\theta_i^\top \eta, \psi^2)$
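
To make the summary vector concrete, the following sketch computes $\theta(\omega_i)$ for Gaussian-kernel components and the regression mean $\theta_i^\top \eta$. The function names, the tuple layout, and the convention of returning zero averages when $M_i = 0$ are assumptions for illustration; the talk does not say how that case is handled.

```python
def summary_vector(beta0, tau2, components):
    """Return theta = (1, beta0, tau2, M, mean gamma, mean sigma^2).

    components is a list of (gamma, mu, sigma2) tuples for one subject.
    """
    m = len(components)
    if m == 0:
        # Assumption: averages over zero components are set to 0.0.
        mean_gamma, mean_sigma2 = 0.0, 0.0
    else:
        mean_gamma = sum(g for g, _, _ in components) / m
        mean_sigma2 = sum(s2 for _, _, s2 in components) / m
    return [1.0, beta0, tau2, float(m), mean_gamma, mean_sigma2]


def regression_mean(theta, eta):
    """Mean of Y_i under the linear model: the inner product of theta and eta."""
    return sum(t * e for t, e in zip(theta, eta))


theta = summary_vector(0.5, 0.1, [(2.0, 10.0, 1.0), (4.0, 30.0, 3.0)])
eta = [1.0, 0.0, 0.0, 0.0, 2.0, 0.0]  # hypothetical coefficients
mean_y = regression_mean(theta, eta)
```

Note that the component locations $\mu_{im}$ do not enter this particular $\theta$, so the regression is unaffected by where along $x$ the features occur.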

HARK Model
- Joint estimation of function representations $\omega_i$ and regression parameters $\eta$, $\psi^2$.
- Computation is via reversible jump Markov chain Monte Carlo (Green 1995), using an approximation of the posterior distribution obtained via modularization (Liu, Bayarri, & Berger 2009).
- Computation increases linearly in the number of subjects and is parallelizable.

Results for the Sleep Data
Relate EEG time series obtained during sleep to respiratory distress index (RDI) and body mass index (BMI); 6,000+ subjects.
EEG series:
[Figure: delta power versus time (hours) for Subjects 1 to 6, with penalized spline estimates.]
- RDI is a measure of sleep apnea.
- The number and timing of fluctuations vary across subjects $i$.

Results for the Sleep Data
Posterior mean estimates of $f_i(\cdot)$ from HARK (solid curve) are similar to penalized spline estimates (dashed curve):
[Figure: logit(delta power) versus time (hours) for Subject A and Subject B.]

Results for the Sleep Data
Kernel mixture representation (solid curves) of $f_i(\cdot)$ from a single HARK posterior sample:
[Figure: mixture components from MCMC iteration #9607; logit(δ-power) versus time (hours) for Subject A and Subject B.]
Horizontal line is $\beta_{0i}$; mixture components deviate from this line. Dashed curve: $f_i$.

Results for the Sleep Data
Regression coefficient estimates from HARK:

Outcome          Predictor                 Coef. Est.   95% Posterior Int.
log(RDI + 0.5)   $\beta_{0i}$              -0.210       (-0.304, -0.117)
                 $M_i$                     -0.058       (-0.096, -0.020)
                 $\bar{\gamma}_i^{1/2}$    -0.835       (-1.279, -0.401)
log BMI          $\beta_{0i}$              -0.026       (-0.039, -0.012)
                 $\log \tau_i^2$           -0.041       (-0.073, -0.009)

Results for the Sleep Data
- RDI and BMI are negatively associated with average δ-power.
- Subjects with higher RDI tend to have fewer and less pronounced fluctuations in δ-power, a measure of slow neuronal firing (RDI is negatively associated with $M_i$ and $\bar{\gamma}_i$).
- Subjects with higher BMI have less measurement error in δ-power (reasonable, since EEG measurement error is affected by skin properties and perspiration).

Conclusions
- Introduced a method for regression with functional predictors, using a parsimonious, interpretable function representation.
- More effective and efficient than existing methods for data that include features occurring at varying locations.
- Applied HARK to find important relationships between sleep characteristics and health outcomes. Large and complex dataset!
Copies of this paper and seminar are available at: http://people.orie.cornell.edu/woodard

Supplementary Material
The following slides contain supplementary material.

Motivating Simulation
Principal component regression: smoothed $\beta(x)$ function:
[Figure: smoothed $\beta(x)$ versus time.]

HARK Model
Typical prior for the functional data model:
$M_i \sim \text{Pois}(\lambda)$
$\gamma_{im} \mid M_i \overset{ind.}{\sim} \text{Symmetric Gamma}(\alpha, \rho)$
$\mu_{im} \mid M_i \overset{ind.}{\sim} \text{Unif}(\mathcal{X}_i)$
$\sigma^2_{im} \mid M_i \overset{ind.}{\sim} \text{IG}(\alpha_\sigma, \rho_\sigma)$
See e.g. Wolpert, Clyde, & Tu (2010)
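
One way to get a feel for this prior is to draw from it. The sketch below is illustrative only and rests on several assumptions the slide does not spell out: "Symmetric Gamma" is read as a Gamma magnitude with a random sign, $\rho$ and $\rho_\sigma$ are treated as a rate and a scale respectively, and $\mathcal{X}_i$ is an interval. All function names are hypothetical.

```python
import math
import random


def sample_poisson(rng, lam):
    """Draw from Poisson(lam) with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_prior(rng, lam, alpha, rho, domain, alpha_sigma, rho_sigma):
    """Draw one set of mixture components (gamma, mu, sigma2) from the prior.

    Assumed parameterizations: |gamma| ~ Gamma(shape=alpha, rate=rho) with a
    random sign; sigma2 ~ inverse gamma(shape=alpha_sigma, scale=rho_sigma).
    """
    lo, hi = domain
    m = sample_poisson(rng, lam)  # number of components M_i
    components = []
    for _ in range(m):
        sign = 1.0 if rng.random() < 0.5 else -1.0
        # random.gammavariate takes (shape, scale), so scale = 1 / rate.
        gamma = sign * rng.gammavariate(alpha, 1.0 / rho)
        mu = rng.uniform(lo, hi)
        # If G ~ Gamma(shape, 1) then scale / G is inverse gamma.
        sigma2 = rho_sigma / rng.gammavariate(alpha_sigma, 1.0)
        components.append((gamma, mu, sigma2))
    return components


rng = random.Random(42)
draw = sample_prior(rng, lam=3.0, alpha=2.0, rho=1.0,
                    domain=(0.0, 100.0), alpha_sigma=3.0, rho_sigma=2.0)
```

Smaller values of `lam` favor fewer components, which is one of the routes by which a prior of this form can encourage sparse representations.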