Diagnostic accuracy in the presence of an imperfect reference standard: challenges in evaluating latent class models specifications (a Campylobacter infection case)
J Asselineau (a), P Perez (a), A Paye (a), E Bessède (b), C Proust-Lima (a,c)
(a) Bordeaux University Hospital, Public Health Department, Clinical Epidemiology Unit and CIC 1401 EC, Bordeaux, France
(b) French National Reference Center for Campylobacter and Helicobacter, Bordeaux, France
(c) INSERM U1219, Bordeaux Population Health Research Center, Bordeaux, France
2 Latent class models (LCM) in diagnostic studies
Evaluation of diagnostic tests when the reference standard is imperfect
Latent disease status D (D=1: disease class; D=0: disease-free class), measured by Test 1, Test 2, ..., Test k, ..., Test K
Disease prevalence: $p = \Pr(D=1)$
Probability of the result of test k given the latent class:
  $\Pr(T_k=1 \mid D=1) = Se_k$ and $\Pr(T_k=0 \mid D=1) = 1 - Se_k$
  $\Pr(T_k=0 \mid D=0) = Sp_k$ and $\Pr(T_k=1 \mid D=0) = 1 - Sp_k$
3 Latent class models (LCM) in diagnostic studies
Individual contribution to the likelihood:
$$\Pr(T_1=t_1, T_2=t_2, \dots, T_K=t_K) = \underbrace{p \prod_{k=1}^{K} Se_k^{t_k}(1-Se_k)^{1-t_k}}_{\text{disease class}} + \underbrace{(1-p) \prod_{k=1}^{K} Sp_k^{1-t_k}(1-Sp_k)^{t_k}}_{\text{disease-free class}}$$
Model identifiability: at least 3 diagnostic tests
The model assumes that the tests are independent conditional on the 2 latent classes
Introduction of residual dependences between the tests into the model
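The likelihood contribution above can be sketched in a few lines of Python. This is only an illustration under the conditional independence assumption: the function name `profile_probability` and the numerical values (3 tests, prevalence, sensitivities, specificities) are hypothetical and are not estimates from the study.

```python
import itertools
import numpy as np

def profile_probability(t, prevalence, se, sp):
    """Pr(T_1=t_1, ..., T_K=t_K) for a 2-class LCM with conditional independence."""
    t = np.asarray(t)
    se = np.asarray(se, dtype=float)
    sp = np.asarray(sp, dtype=float)
    # Disease class: p * prod_k Se_k^t_k * (1 - Se_k)^(1 - t_k)
    diseased = prevalence * np.prod(se**t * (1 - se)**(1 - t))
    # Disease-free class: (1 - p) * prod_k Sp_k^(1 - t_k) * (1 - Sp_k)^t_k
    disease_free = (1 - prevalence) * np.prod(sp**(1 - t) * (1 - sp)**t)
    return diseased + disease_free

# Hypothetical values for K = 3 tests (illustration only).
prevalence = 0.10
se = [0.80, 0.90, 0.85]
sp = [0.99, 0.97, 0.98]

# Probability of each of the 2^K response profiles; they should sum to 1.
profiles = list(itertools.product([0, 1], repeat=3))
probs = {s: profile_probability(s, prevalence, se, sp) for s in profiles}
total = sum(probs.values())
```

With K = 3 tests there are 7 free profile probabilities for 7 parameters (3 sensitivities, 3 specificities, prevalence), which is why at least 3 tests are needed for identifiability.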
4 Latent class models (LCM) in diagnostic studies
Systematic review (van Smeden, 2014)
  LCM applied in medical applications
  independence between the tests conditional on the latent classes
  no model assessment in 55% of the studies
  different criteria to assess the LCM hypotheses
Objective: to implement and assess different LCM specifications for estimating the diagnostic accuracy of Campylobacter infection tests
5 Clinical application
Campylobacter infection: leading cause of bacterial gastroenteritis worldwide
Diagnostic reference standard: bacteriological culture of stools
  excellent specificity but moderate sensitivity (bacterial growth in a microaerobic atmosphere)
Development of new tests on stools: biomolecular and immunoenzymatic techniques
6 Study population: n=623
Result profile (Culture Karmali, Real-time PCR, Ridascreen, 1st Campy, Immunocard)   N (%)
- - - - -     522 (83.8)
- - - -        15 (2.4)
- - -           4 (0.6)
- - - -         7 (1.1)
- - -           3 (0.5)
- -             2 (0.3)
- - - -         3 (0.5)
- - -           1 (0.2)
- -             1 (0.2)
-               9 (1.4)
- - - -         2 (0.3)
- - -           1 (0.2)
- -             1 (0.2)
-               5 (0.8)
- -             1 (0.2)
-               5 (0.8)
               41 (6.6)
7 LCMs specifications
Latent class (infection / infection-free), measured by Karmali, Real-time PCR, Ridascreen, 1st Campy, Immunocard
LCM assuming conditional independence (LCM CI) (Qu, 1996)
  $\Pr(T_{ik}=1 \mid D=d) = \Phi(a_{kd})$ with $d = 0,1$ and $\Phi$ the CDF of $N(0,1)$
  # parameters: 2 × 5 + 1 (prevalence)
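Under this probit parameterization, sensitivity and specificity follow directly from the intercepts: $Se_k = \Phi(a_{k1})$ and $Sp_k = 1 - \Phi(a_{k0})$. A minimal sketch of the conversion, using only the Python standard library; the helper names and the example values 0.85 and 0.98 are hypothetical.

```python
from statistics import NormalDist

_std_normal = NormalDist()  # N(0, 1)

def to_accuracy(a_k1, a_k0):
    """Map probit intercepts to (sensitivity, specificity) for one test."""
    se = _std_normal.cdf(a_k1)        # Pr(T=1 | D=1)
    sp = 1.0 - _std_normal.cdf(a_k0)  # Pr(T=0 | D=0)
    return se, sp

def to_probit(se, sp):
    """Inverse mapping: (sensitivity, specificity) to probit intercepts."""
    return _std_normal.inv_cdf(se), _std_normal.inv_cdf(1.0 - sp)

def n_parameters_ci(n_tests):
    """2 intercepts per test plus the prevalence."""
    return 2 * n_tests + 1

# Round trip on hypothetical values (illustration only).
a_k1, a_k0 = to_probit(0.85, 0.98)
se, sp = to_accuracy(a_k1, a_k0)
q = n_parameters_ci(5)  # the 5 tests of the application
```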
8 LCMs specifications
Latent class (infection / infection-free), measured by Karmali, Real-time PCR, Ridascreen, 1st Campy, Immunocard, with a residual dependence between the tests
LCM assuming a common residual dependence (LCM CD)
  $\Pr(T_{ik}=1 \mid D=d, U_i=u_i) = \Phi(a_{kd} + \sigma u_i)$ with $U_i \sim N(0,1)$
  # parameters: 2 × K + 2 (prevalence + σ)
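With a random effect, the probability of a response profile requires integrating over $U_i$. The sketch below does this with plain (non-adaptive) Gauss-Hermite quadrature from NumPy, which is a simplification of the adaptive quadrature used for estimation in the study. The function names, the number of nodes and all parameter values are assumptions made for illustration.

```python
import math
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

_erf = np.vectorize(math.erf)

def std_normal_cdf(x):
    """CDF of N(0, 1), applied elementwise."""
    return 0.5 * (1.0 + _erf(np.asarray(x, dtype=float) / math.sqrt(2.0)))

def profile_probability_cd(t, prevalence, a1, a0, sigma, n_nodes=30):
    """Marginal Pr(T_1=t_1, ..., T_K=t_K) with a common random effect U ~ N(0, 1)."""
    t = np.asarray(t)
    nodes, weights = hermegauss(n_nodes)          # weight function exp(-u^2 / 2)
    weights = weights / math.sqrt(2.0 * math.pi)  # rescale so the weights sum to 1
    total = 0.0
    for class_prob, a in ((prevalence, a1), (1.0 - prevalence, a0)):
        # Rows: quadrature nodes u; columns: tests. Entry = Phi(a_kd + sigma * u).
        p_pos = std_normal_cdf(np.asarray(a, dtype=float)[None, :] + sigma * nodes[:, None])
        # Tests are independent given the class AND the random effect.
        conditional = np.prod(np.where(t == 1, p_pos, 1.0 - p_pos), axis=1)
        total += class_prob * np.sum(weights * conditional)
    return total

# Hypothetical intercepts for K = 3 tests (illustration only).
a1 = [1.0, 1.3, 0.8]     # infected class
a0 = [-2.3, -1.9, -2.0]  # infection-free class
prob_all_negative = profile_probability_cd([0, 0, 0], 0.10, a1, a0, sigma=0.7)
```

Setting `sigma=0` recovers the conditional independence model. With `sigma > 0`, the marginal sensitivity of test k is $\Phi(a_{k1} / \sqrt{1 + \sigma^2})$, so the intercepts alone no longer give the accuracy estimates.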
9 LCMs specifications
Latent class (infection / infection-free), measured by Karmali, Real-time PCR, Ridascreen, 1st Campy, Immunocard, with a residual dependence restricted to the immunoenzymatic tests
LCM assuming a specific residual dependence within immunoenzymatic tests (LCM SD)
  $\Pr(T_{ik}=1 \mid D=d) = \Phi(a_{kd})$ for culture and PCR
  $\Pr(T_{ik}=1 \mid D=d, U_i=u_i) = \Phi(a_{kd} + \sigma u_i)$ for immunological tests
  # parameters: (2 × K) + 2 (prevalence + σ)
10 LCMs estimation / implementation
Estimation
  maximum likelihood estimation
  numerical integration for the random effect (adaptive Gaussian quadrature)
  local maxima problems: 100 sets of random initial values within clinically plausible ranges (sensitivity, specificity and prevalence)
Implementation
  NLMIXED procedure in SAS
  randomLCA package in R
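The multi-start strategy can be illustrated on the conditional independence model. The sketch below is not the SAS or R implementation used in the study: it swaps in a simple EM algorithm, runs it from several random starting points and keeps the fit with the highest log-likelihood. The data are simulated, and the starting ranges and all names are assumptions.

```python
import numpy as np

def log_likelihood(y, prev, se, sp):
    """Log-likelihood of the LCM CI and posterior Pr(D=1 | profile) per subject."""
    p1 = prev * np.prod(se**y * (1 - se)**(1 - y), axis=1)
    p0 = (1 - prev) * np.prod(sp**(1 - y) * (1 - sp)**y, axis=1)
    return np.sum(np.log(p1 + p0)), p1 / (p1 + p0)

def fit_em(y, prev, se, sp, n_iter=500, tol=1e-8, eps=1e-6):
    """EM iterations from one set of initial values."""
    previous = -np.inf
    for _ in range(n_iter):
        ll, post = log_likelihood(y, prev, se, sp)  # E-step
        if ll - previous < tol:
            break
        previous = ll
        # M-step: weighted proportions, clipped away from 0 and 1.
        prev = post.mean()
        se = np.clip((post[:, None] * y).sum(axis=0) / post.sum(), eps, 1 - eps)
        sp = np.clip(((1 - post)[:, None] * (1 - y)).sum(axis=0) / (1 - post).sum(), eps, 1 - eps)
    return log_likelihood(y, prev, se, sp)[0], prev, se, sp

def multistart(y, n_starts=100, seed=0):
    """Keep the best fit over random initial values drawn in plausible ranges."""
    rng = np.random.default_rng(seed)
    k = y.shape[1]
    best = None
    for _ in range(n_starts):
        fit = fit_em(y,
                     rng.uniform(0.02, 0.30),           # prevalence (assumed range)
                     rng.uniform(0.50, 0.99, size=k),   # sensitivities (assumed range)
                     rng.uniform(0.80, 0.999, size=k))  # specificities (assumed range)
        if best is None or fit[0] > best[0]:
            best = fit
    return best

# Simulated data with hypothetical true values (illustration only).
rng = np.random.default_rng(42)
n, true_prev = 2000, 0.10
true_se = np.array([0.80, 0.90, 0.85, 0.75])
true_sp = np.array([0.99, 0.97, 0.98, 0.96])
d = rng.random(n) < true_prev
y = (rng.random((n, 4)) < np.where(d[:, None], true_se, 1 - true_sp)).astype(int)

best_ll, prev_hat, se_hat, sp_hat = multistart(y, n_starts=20)
```

Drawing the starting values in ranges where sensitivities and specificities exceed 0.5 also keeps the class labelled D=1 as the infected class, which avoids label switching between runs.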
12 LCMs assessment
Akaike information criterion: $AIC = 2q - 2\,\mathrm{loglik}$, with q the number of parameters
Goodness-of-fit statistics (Formann, 2003; van Smeden, 2016)
  observed ($n_s$) versus predicted ($m_s$) frequencies for profile s
  Pearson statistic: $X^2 = \sum_s \frac{(n_s - m_s)^2}{m_s}$
  Likelihood ratio statistic: $G^2 = 2 \sum_s n_s \ln \frac{n_s}{m_s}$
  asymptotic distribution (chi-square)
  sparse data: empirical distribution (parametric bootstrap procedure)
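The two goodness-of-fit statistics and a parametric bootstrap of their distribution can be sketched as follows. To keep the example short, the fitted model is a toy stand-in (two independent binary tests, 4 profiles) and not a latent class model; in practice `fit_predict` would refit the LCM on each bootstrap sample. All names and counts are hypothetical.

```python
import numpy as np

def pearson_x2(n_obs, m_pred):
    """X^2 = sum_s (n_s - m_s)^2 / m_s."""
    n_obs, m_pred = np.asarray(n_obs, dtype=float), np.asarray(m_pred, dtype=float)
    return np.sum((n_obs - m_pred) ** 2 / m_pred)

def likelihood_ratio_g2(n_obs, m_pred):
    """G^2 = 2 sum_s n_s ln(n_s / m_s), with empty profiles contributing 0."""
    n_obs, m_pred = np.asarray(n_obs, dtype=float), np.asarray(m_pred, dtype=float)
    nonzero = n_obs > 0
    return 2.0 * np.sum(n_obs[nonzero] * np.log(n_obs[nonzero] / m_pred[nonzero]))

def bootstrap_p_value(n_obs, fit_predict, statistic, n_boot=500, seed=0):
    """Empirical p-value: simulate from the fitted model, refit, recompute the statistic."""
    rng = np.random.default_rng(seed)
    n_obs = np.asarray(n_obs)
    m_hat = fit_predict(n_obs)
    observed = statistic(n_obs, m_hat)
    n = int(n_obs.sum())
    exceed = 0
    for _ in range(n_boot):
        sample = rng.multinomial(n, m_hat / m_hat.sum())
        exceed += int(statistic(sample, fit_predict(sample)) >= observed)
    return exceed / n_boot

def independence_fit(counts):
    """Toy stand-in model: cells ordered (0,0), (0,1), (1,0), (1,1)."""
    counts = np.asarray(counts, dtype=float)
    n = counts.sum()
    p1 = (counts[2] + counts[3]) / n
    p2 = (counts[1] + counts[3]) / n
    return n * np.array([(1 - p1) * (1 - p2), (1 - p1) * p2, p1 * (1 - p2), p1 * p2])

observed_counts = np.array([50, 20, 20, 10])  # hypothetical profile frequencies
predicted = independence_fit(observed_counts)
x2 = pearson_x2(observed_counts, predicted)
g2 = likelihood_ratio_g2(observed_counts, predicted)
p_boot = bootstrap_p_value(observed_counts, independence_fit, pearson_x2, n_boot=200)
```

The bootstrap replaces the chi-square reference distribution, which becomes unreliable when many of the $2^K$ profiles have small predicted frequencies.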
14 LCMs assessment
Residual correlations between the tests (Qu, 1996)
  residual $corr_{kk'}$ = observed $corr_{kk'}$ - predicted $corr_{kk'}$
  bootstrapped 95% confidence interval
Local independence testing (van Kollenburg, 2015)
  bivariate residual statistics: pairwise Pearson statistics
  $BVR_{kk'} = \sum_{r_k=0}^{1} \sum_{r_{k'}=0}^{1} \frac{(n_{r_k r_{k'}} - m_{r_k r_{k'}})^2}{m_{r_k r_{k'}}}$
  robustness to the sparse data issue
  asymptotic distribution unknown: empirical distribution (parametric bootstrap procedure)
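Both pairwise diagnostics compare the observed 2 × 2 table of a pair of tests with the table predicted by the model. A minimal sketch, assuming the conditional independence model for the predicted table and the phi coefficient as the pairwise correlation; the function names and all numbers are hypothetical.

```python
import numpy as np

def observed_pair_table(y, k, kp):
    """2 x 2 table of counts n[r_k, r_k'] for tests k and k' (y is subjects x tests)."""
    table = np.zeros((2, 2))
    for a in (0, 1):
        for b in (0, 1):
            table[a, b] = np.sum((y[:, k] == a) & (y[:, kp] == b))
    return table

def predicted_pair_table_ci(n, prev, se, sp, k, kp):
    """Predicted frequencies m[r_k, r_k'] under conditional independence."""
    infected_k, infected_kp = np.array([1 - se[k], se[k]]), np.array([1 - se[kp], se[kp]])
    free_k, free_kp = np.array([sp[k], 1 - sp[k]]), np.array([sp[kp], 1 - sp[kp]])
    joint = prev * np.outer(infected_k, infected_kp) + (1 - prev) * np.outer(free_k, free_kp)
    return n * joint

def bvr(observed, predicted):
    """Bivariate residual: Pearson statistic over the 4 cells of the pair table."""
    return np.sum((observed - predicted) ** 2 / predicted)

def phi_correlation(table):
    """Correlation between two binary tests computed from their 2 x 2 table."""
    table = np.asarray(table, dtype=float)
    numerator = table[1, 1] * table[0, 0] - table[1, 0] * table[0, 1]
    denominator = np.sqrt(table[0].sum() * table[1].sum()
                          * table[:, 0].sum() * table[:, 1].sum())
    return numerator / denominator

def residual_correlation(observed, predicted):
    """Observed minus model-predicted pairwise correlation."""
    return phi_correlation(observed) - phi_correlation(predicted)

# Hypothetical observed table and model parameters (illustration only).
observed = np.array([[540.0, 12.0], [10.0, 61.0]])
predicted = predicted_pair_table_ci(623, 0.10, [0.80, 0.90], [0.99, 0.97], 0, 1)
pair_bvr = bvr(observed, predicted)
pair_residual = residual_correlation(observed, predicted)
```

Reference distributions for both quantities would come from the parametric bootstrap, by recomputing them on samples simulated from the fitted model.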
15 Observed versus predicted frequencies
Result profile (Culture Karmali, Real-time PCR, 1st Campy, Ridascreen, Immunocard), observed N, and predictions
Profile         N    LCM CI   LCM CD   LCM SD
- - - - -     522     519.6    521.2    522.1
- - - -        15      16.9     16.4     15.6
- - -           4       0.2      0.7      1.4
- - - -         7       9.6      7.6      5.2
- - -           3       0.4      1.8      4.1
- -             2       2.6      2.4      3.0
- - - -         3       3.4      2.4      3.0
- - -           1       0.1      0.5      0.2
- -             1       1.7      1.9      1.2
-               9      10.8      7.1      8.7
- - - -         2       1.9      1.5      2.0
- - -           1       0.1      0.2      0.0
- -             1       1.2      1.2      0.8
-               5       8.1      4.3      5.4
- -             1       0.2      0.2      0.8
-               5       5.2      3.5      5.7
               41      34.2     42.4     40.2
16 Akaike Information Criterion
        LCM CI   LCM CD   LCM SD
AIC     1041.5   1023.9   1011.9
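As a worked check of these values, the sketch below applies $AIC = 2q - 2\,\mathrm{loglik}$ with the parameter counts given in the model specifications (K = 5 tests). The log-likelihoods are back-calculated from the reported AIC values by arithmetic and are not figures reported in the slides; the variable names are hypothetical.

```python
def aic(log_lik, n_params):
    """AIC = 2q - 2 loglik."""
    return 2 * n_params - 2 * log_lik

K = 5  # number of tests in the application

# Parameter counts from the model specifications.
n_params = {"LCM CI": 2 * K + 1, "LCM CD": 2 * K + 2, "LCM SD": 2 * K + 2}

# AIC values reported in the table above.
reported_aic = {"LCM CI": 1041.5, "LCM CD": 1023.9, "LCM SD": 1011.9}

# Log-likelihoods implied by the reported AIC (derived, not reported).
implied_loglik = {m: (2 * n_params[m] - reported_aic[m]) / 2 for m in reported_aic}

# Lowest AIC and difference of each model to it.
best = min(reported_aic, key=reported_aic.get)
delta = {m: round(reported_aic[m] - reported_aic[best], 1) for m in reported_aic}
```

LCM CD and LCM SD have the same number of parameters, so their AIC difference is entirely a difference in log-likelihood.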
17 Goodness-of-fit statistics
                              LCM CI   LCM CD   LCM SD
AIC                           1041.5   1023.9   1011.9
Pearson statistics
  asymptotic distribution      0.001    0.012    0.021
  empirical distribution       0.001    0.022    0.052
Likelihood ratio statistics
  asymptotic distribution      0.001    0.052    0.523
  empirical distribution       0.001    0.004    0.086
19 Residual correlations
[Figure: residual correlations for each pair of tests, one panel per model (LCM CI, LCM CD, LCM SD). T1: Karmali; T2: Real-time PCR; T3: Ridascreen; T4: 1st Campy; T5: Immunocard]
20 Local independence testing (BVRs)
[Figure: bivariate residual statistics for each pair of tests, one panel per model (LCM CI, LCM CD, LCM SD). T1: Karmali; T2: Real-time PCR; T3: Ridascreen; T4: 1st Campy; T5: Immunocard]
21 LCMs assessment: summary
[Table: criteria (AIC, GOF statistics, residual correlations, local independence testing) by model (LCM CI, LCM CD, LCM SD)]
22 Diagnostic accuracy
Prevalence of Campylobacter infection
  LCM SD: p = 10.5% [8.4; 13.3]
  Ref Std: p = 9.0% [6.9; 11.5]
[Figure: sensitivity and specificity of each test (Culture Karmali, Real-time PCR, Ridascreen, 1st Campy, Immunocard), estimated with LCM SD and against the reference standard (Ref Std)]
23 Discussion
Appealing approach when the reference standard is imperfect
Classic LCM: independence conditional on the 2 latent classes
1. Investigation of different LCM specifications
  complexity of the models: possible problems in the estimation
  limitations of usual software: lack of flexibility / unreliability
2. Evaluation with several criteria as a body of evidence
  sparseness context: empirical distributions
  not implemented in usual software: specific programming
  external data? gold standard in a subset, clinical follow-up, etc.
24 Acknowledgements: thanks to CIC 1401 EC for the travel grant
References
Albert PS et al. A cautionary note on the robustness of latent class models for estimating diagnostic error without a gold standard. Biometrics 2004;60(2):427-35.
Formann AK. Latent class model diagnostics: a review and some proposals. Computational Statistics & Data Analysis 2003;41(3):549-59.
van Kollenburg GH et al. Assessing model fit in latent class analysis when asymptotics do not hold. Methodology: European Journal of Research Methods for the Behavioral and Social Sciences 2015;11(2):65.
Qu Y et al. Random effects models in latent class analysis for evaluating accuracy of diagnostic tests. Biometrics 1996;52:797-810.
Reitsma JB et al. A review of solutions for diagnostic accuracy studies with an imperfect or missing reference standard. Journal of Clinical Epidemiology 2009;62:797-806.
van Smeden M et al. Latent class models in diagnostic studies when there is no reference standard: a systematic review. American Journal of Epidemiology 2014;179(4):423-31.
van Smeden M et al. Problems in detecting misfit of latent class models in diagnostic research without a gold standard were shown. Journal of Clinical Epidemiology 2016;74:158-66.