Supplementary Material

Supplementary Table 1. Symptoms assessed, number of items assessed, scoring, and cut-off points for the psychiatric rating scales: Montgomery Åsberg Depression Rating Scale, Hamilton Anxiety Rating Scale, Suicidal Behaviors Questionnaire-Revised, and the Columbia-Suicide Severity Rating Scale

| | Montgomery Åsberg Depression Rating Scale | Hamilton Anxiety Rating Scale | Suicidal Behaviors Questionnaire-Revised | Columbia-Suicide Severity Rating Scale |
| --- | --- | --- | --- | --- |
| Symptoms assessed | Depression | Anxiety | Suicidal ideation/behavior | Suicidal ideation/behavior |
| No. of items | 10 | 14 | 4 | 4 |
| Scoring | Each item rated 0–6; total score range = 0–60 | Total score range = 0–56 | Each item rated 0–4; total score range = 0–16 | n/a |
| Cut-off points | ≤7 = recovered; ≥15 = mildly depressed; ≥25 = moderately depressed; ≥31 = severely depressed; ≥44 = very severely depressed | 0–13 = normal range; 14–17 = mild; 18–24 = moderate; ≥25 = severe | ≥7 = at risk (in general adult population) | Severity subscale: a positive answer to Question 4 or 5, indicating presence of ideation with at least some intent to die, suggests a clear need for further evaluation or clinical management (e.g., triggers immediate referral to mental health services). Behavior subscale: presence of ANY suicidal behavior (suicide attempt, interrupted attempt, aborted attempt, and preparatory behavior) in the past 3 months indicates a severe risk. Intensity subscale: the total score ranges from 2 to 25, with a higher number indicating more intense ideation and greater risk. Lethality subscale: greater lethality of the behavior (endorsed on the Behavior subscale) indicates increased risk. |
Supplementary Text 1

A priori rules applied to missing data

The a priori rules applied to the handling of missing data were as follows:

Participants who discontinued the study and were lost to follow-up for subsequent visits were assumed to be smokers for the remainder of the study. In binary responder assessments, participants who discontinued remained in the denominator but were excluded from the numerator, regardless of smoking status at the time of discontinuation. This is considered a worst-case-carried-forward analysis and represents a conservative approach to the imputation of missing data.

Specifically, participants were assessed as CAR responders using the weekly reports of cigarette and nicotine use since the last visit for specified periods. For example, CAR for weeks 9–12 was assessed from data collected from weeks 9 through 12 inclusive. Additionally, a participant was not considered a responder if the expired CO was >10 ppm at any time point during weeks 9 through 12.

In the case of a missed visit(s) during the evaluation period, a participant was considered a responder if they met the following criterion: at the visit after the missing visit(s), the participant reported that they had not smoked or used nicotine products since the last visit.

Missing CO was imputed as negative (i.e., not disqualifying the subject as a responder). No attempt was made to impute missing data from subject diaries (if collected) or other weekly interview questions.
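The rules above can be sketched as a small classifier. This is a minimal illustration only, not the study's analysis code: the `Visit` record, its field names, the `discontinued_before_week` argument, the treatment of an attended visit with no recorded self-report as non-response, and the treatment of a window with no scheduled visits as non-response are all assumptions introduced for the example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

CO_LIMIT_PPM = 10.0  # expired CO above this value disqualifies a responder


@dataclass
class Visit:
    """One scheduled visit (hypothetical record layout)."""
    week: int
    attended: bool
    abstinent_since_last_visit: Optional[bool] = None  # self-report; None if not recorded
    co_ppm: Optional[float] = None  # expired CO; None if not measured


def is_responder(visits: Sequence[Visit], start: int, end: int,
                 discontinued_before_week: Optional[int] = None) -> bool:
    """Classify one participant for the window [start, end] (weeks, inclusive)."""
    # Rule 1: participants who discontinued are counted as smokers from then on.
    if discontinued_before_week is not None and discontinued_before_week <= end:
        return False

    ordered = sorted(visits, key=lambda v: v.week)
    in_window = [i for i, v in enumerate(ordered) if start <= v.week <= end]
    if not in_window:
        return False  # assumption: no scheduled visit in the window, no evidence

    for i in in_window:
        visit = ordered[i]
        if visit.attended:
            # Rule 2: the self-report since the last visit must indicate abstinence.
            if visit.abstinent_since_last_visit is not True:
                return False
            # Rule 3: CO > 10 ppm disqualifies; missing CO is imputed as negative.
            if visit.co_ppm is not None and visit.co_ppm > CO_LIMIT_PPM:
                return False
        else:
            # Rule 4: a missed visit is covered only if the next attended visit
            # reports no smoking or nicotine use since the last visit.
            later = next((w for w in ordered[i + 1:] if w.attended), None)
            if later is None or later.abstinent_since_last_visit is not True:
                return False
    return True


weeks_9_to_12 = [
    Visit(9, True, True, 4.0),
    Visit(10, False),             # missed visit, covered by the week 11 report
    Visit(11, True, True, None),  # CO not measured, imputed as negative
    Visit(12, True, True, 6.0),
]
print(is_responder(weeks_9_to_12, 9, 12))  # True
```

Checking discontinuation first mirrors the worst-case rule: once a participant has left the study, their last observed smoking status no longer matters for the window.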
Supplementary Text 2

Sensitivity analyses conducted to address the potential impact of large dropout rates and treatment non-adherence on efficacy results

In smoking cessation studies, dropout is not believed to occur at random (i.e., dropout is believed to follow a non-random mechanism); the conjecture is that non-quitters are more likely to drop out than quitters due to loss of motivation. The imputation rule, under which subjects who cannot be contacted or are unavailable for follow-up (lost to follow-up) are considered to be smokers from that time point on, incorporates a non-random dropout mechanism. Thus, dropouts were treated as informative in our primary analysis. However, many commonly used methods assume data are missing at random (MAR), and there is no statistical test to distinguish between MAR and missing not at random (MNAR). Consequently, we conducted various sensitivity analyses to help evaluate the impact of missing data.

In order to utilize cessation data from earlier weeks in the presence of treatment non-adherence, we modeled since-last-visit cessation data from Week 2 through Week 12 using a longitudinal logistic regression analysis of these data with the a priori imputation method used in the primary analysis. In addition, to examine treatment non-adherence, we conducted two sensitivity analyses on these data: an MAR longitudinal logistic regression analysis and an MNAR pattern-mixture model using longitudinal logistic regression. The description of each model is as follows:

A longitudinal model was fit for since-last-visit smoking status for Weeks 2 through 12, including fixed effects for pooled study center, cohort, treatment group, week, and the interaction between treatment and week. This model used our a priori imputation approach and, as such, is an MNAR approach.
A second, similar longitudinal model was then fit to the data without the imputation of post-discontinuation missing data. This model is an MAR approach and can be viewed as a sensitivity analysis of our MNAR approach.

Also, to explore the impact of treatment discontinuation more thoroughly, a third model was fit using observed, non-imputed data that included additional effects for treatment discontinuation (dropout). This longitudinal pattern-mixture model initially included dropout, week-by-dropout, and treatment-by-dropout effects and was subsequently reduced to eliminate nonsignificant interaction effects with dropout. An unstructured covariance structure was used. This model represents an MNAR approach. Treatment odds ratios from this model were estimated separately by dropout status, as well as combined using the marginal treatment dropout rate as weights in the logistic regression analysis. While dropout status was itself statistically significant (p<0.0001), the estimated treatment effect from this model was similar to those from the models without dropout included.

Supplementary Figure 1 below shows the estimated odds ratios by week for each of these models.
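The text does not give the formula used to combine the dropout-specific odds ratios. One common convention in pattern-mixture analyses is to average the pattern-specific estimates on the log-odds scale, weighted by the proportion of participants in each pattern; the sketch below assumes that convention. The function name, the stratum odds ratios, and the dropout rate are hypothetical values chosen for illustration and are not results from the study.

```python
import math
from typing import Dict


def combine_odds_ratios(or_by_pattern: Dict[str, float],
                        weights: Dict[str, float]) -> float:
    """Weighted average of pattern-specific odds ratios on the log scale."""
    if set(or_by_pattern) != set(weights):
        raise ValueError("odds ratios and weights must cover the same patterns")
    if not math.isclose(sum(weights.values()), 1.0, abs_tol=1e-9):
        raise ValueError("weights must sum to 1")
    # Average the log odds ratios, then transform back to the odds-ratio scale.
    log_or = sum(weights[p] * math.log(or_by_pattern[p]) for p in or_by_pattern)
    return math.exp(log_or)


# Hypothetical inputs, NOT values reported in the study.
dropout_rate = 0.30
stratum_or = {"dropout": 3.0, "completer": 3.6}
pattern_weights = {"dropout": dropout_rate, "completer": 1.0 - dropout_rate}

combined_or = combine_odds_ratios(stratum_or, pattern_weights)
print(f"combined OR = {combined_or:.2f}")
```

Averaging on the log scale keeps the combined estimate between the two stratum estimates and treats odds ratios above and below 1 symmetrically.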
Supplementary Figure 1. Estimated odds ratios by week (full analysis set)

The estimated odds ratios at Week 12 from each of these approaches are shown in Supplementary Table 2 below.

Supplementary Table 2. Estimated Odds Ratio at Week 12

| Method | Odds Ratio | 95% Confidence Interval | p-value |
| --- | --- | --- | --- |
| MNAR: Primary Imputation | 3.5 | (2.1, 5.5) | <0.0001 |
| MAR: No Imputation | 3.7 | (2.1, 6.4) | <0.0001 |
| MNAR: Pattern Mixture Model | 3.5 | (2.0, 6.1) | <0.0001 |

In conclusion, these models confirm that while the treatment completion rate was higher among varenicline-treated subjects, the imputation of post-discontinuation missing data as nonresponse yielded a very similar odds ratio to those obtained by the MAR analysis and by the MNAR pattern-mixture model.
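As a rough plausibility check on Supplementary Table 2, the reported intervals can be converted back into approximate standard errors and Wald statistics. The sketch below assumes the intervals are Wald-type and symmetric on the log-odds scale, which the text does not state, and it works from the rounded published values, so its output is approximate. The function name `wald_from_ci` is introduced here for illustration.

```python
import math

Z_975 = 1.959964  # standard normal quantile for a two-sided 95% interval


def wald_from_ci(odds_ratio: float, lower: float, upper: float):
    """Approximate SE of log(OR), Wald z, and two-sided p from a 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2 * Z_975)
    z = math.log(odds_ratio) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return se, z, p


# Rounded values as reported in Supplementary Table 2.
table_2 = {
    "MNAR: Primary Imputation": (3.5, 2.1, 5.5),
    "MAR: No Imputation": (3.7, 2.1, 6.4),
    "MNAR: Pattern Mixture Model": (3.5, 2.0, 6.1),
}

for method, (or_, lower, upper) in table_2.items():
    se, z, p = wald_from_ci(or_, lower, upper)
    print(f"{method}: SE(log OR) = {se:.3f}, z = {z:.2f}, p = {p:.1e}")
```

Under these assumptions, all three rows give p-values below 0.0001, in line with the table.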
Supplementary Table 3. Summary of study discontinuations by responder status at the time of discontinuation

| | Varenicline (N=256) | Placebo (N=269) |
| --- | --- | --- |
| Total study discontinuations | 81 | 90 |
| Discontinued study prior to week 9 | 32 | 43 |
| Discontinued study after week 9, already non-responders at the time of discontinuation: | | |
| Between weeks 9 and 12 | 9 | 16 |
| Between weeks 9 and 24 | 24 | 29 |
| Between weeks 9 and 52 | 46 | 41 |
| Discontinued study after week 9, responders at the time of discontinuation, who have been imputed as non-responders in primary analyses: | | |
| CAR weeks 9–12 | 0 | 0 |
| CAR weeks 9–24 | 1 (0.4%) | 3 (1.1%) |
| CAR weeks 9–52 | 3 (1.2%) | 6 (2.2%) |

Data for weeks 9–12, 9–24 and 9–52 is cumulative
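Because the week 9–52 rows are cumulative, one reading of Supplementary Table 3 is that discontinuations before week 9, plus non-responders and responders who discontinued between weeks 9 and 52, should add up to the total discontinuations in each group. The sketch below checks that reading and recomputes the percentages against the group sizes; the class and field names are assumptions made for the example, and the accounting identity is an interpretation of the table rather than a rule stated in the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArmDiscontinuations:
    """Counts for one treatment group, taken from Supplementary Table 3."""
    n_randomized: int
    total: int
    before_week_9: int
    nonresponders_weeks_9_52: int
    responders_weeks_9_52: int  # imputed as non-responders in the primary analysis

    def is_consistent(self) -> bool:
        # The three categories should account for every discontinuation.
        parts = (self.before_week_9
                 + self.nonresponders_weeks_9_52
                 + self.responders_weeks_9_52)
        return parts == self.total

    def imputed_responder_pct(self) -> float:
        # Percentage of the whole group whose responder status was overridden.
        return round(100 * self.responders_weeks_9_52 / self.n_randomized, 1)


varenicline = ArmDiscontinuations(256, 81, 32, 46, 3)
placebo = ArmDiscontinuations(269, 90, 43, 41, 6)

for name, arm in (("Varenicline", varenicline), ("Placebo", placebo)):
    print(name, arm.is_consistent(), arm.imputed_responder_pct())
```

Both groups satisfy the identity, and the recomputed percentages (1.2 and 2.2) match the CAR weeks 9–52 row.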
Supplementary Table 4. Number and percentage of participants at each visit with recorded measurements for exhaled carbon monoxide, Nicotine Use Inventory, Montgomery Åsberg Depression Rating Scale, and Hamilton Anxiety Rating Scale

| Visit | Varenicline (N = 256): Exhaled CO, NUI, MADRS, HAM-A | Placebo (N = 269): Exhaled CO, NUI, MADRS, HAM-A |
| --- | --- | --- |
| Baseline | 256 (100.0), 256 (100.0) | 269 (100.0), 269 (100.0) |
| Week 0 | 256 (100.0), 256 (100.0) | 268 (99.6), 269 (100.0) |
| Week 1 | 250 (97.7), 250 (97.7), 250 (97.7), 250 (97.7) | 265 (98.5), 266 (98.9), 267 (99.3), 267 (99.3) |
| Week 2 | 245 (95.7), 245 (95.7), 246 (96.1), 246 (96.1) | 252 (93.7), 255 (94.8), 257 (95.5), 257 (95.5) |
| Week 3 | 240 (93.8), 242 (94.5), 242 (94.5), 242 (94.5) | 248 (92.2), 249 (92.6), 256 (95.2), 254 (94.4) |
| Week 4 | 230 (89.8), 231 (90.2), 232 (90.6), 231 (90.2) | 244 (90.7), 246 (91.4), 248 (92.2), 247 (91.8) |
| Week 5 | 227 (88.7), 229 (89.5), 231 (90.2), 231 (90.2) | 234 (87.0), 236 (87.7), 241 (89.6), 241 (89.6) |
| Week 6 | 224 (87.5), 224 (87.5), 224 (87.5), 224 (87.5) | 227 (84.4), 227 (84.4), 229 (85.1), 229 (85.1) |
| Week 7 | 225 (87.9), 226 (88.3), 226 (88.3), 226 (88.3) | 226 (84.0), 227 (84.4), 230 (85.5), 229 (85.1) |
| Week 8 | 219 (85.5), 221 (86.3), 221 (86.3), 221 (86.3) | 214 (79.6), 218 (81.0), 221 (82.2), 221 (82.2) |
| Week 9 | 217 (84.8), 219 (85.5), 219 (85.5), 219 (85.5) | 214 (79.6), 216 (80.3), 218 (81.0), 218 (81.0) |
| Week 10 | 214 (83.6), 214 (83.6), 214 (83.6), 214 (83.6) | 211 (78.4), 213 (79.2), 215 (79.9), 215 (79.9) |
| Week 11 | 209 (81.6), 211 (82.4), 211 (82.4), 211 (82.4) | 204 (75.8), 206 (76.6), 206 (76.6), 206 (76.6) |
| Week 12 | 210 (82.0), 210 (82.0), 210 (82.0), 210 (82.0) | 205 (76.2), 205 (76.2), 206 (76.6), 206 (76.6) |
| Week 13 | 211 (82.4), 211 (82.4), 211 (82.4), 211 (82.4) | 200 (74.3), 201 (74.7), 203 (75.5), 202 (75.1) |
| Week 14 | 208 (81.3) | 202 (75.1) |
| Week 16 | 201 (78.5), 203 (79.3), 204 (79.7), 204 (79.7) | 200 (74.3), 202 (75.1), 203 (75.5), 203 (75.5) |
| Week 20 | 199 (77.8) | 197 (73.2) |
| Week 24 | 195 (76.2), 195 (76.2) | 191 (71.0), 191 (71.0) |
| Week 28 | 197 (80.0) | 188 (69.9) |
| Week 32 | 183 (71.5), 186 (72.7) | 185 (68.8), 186 (69.1) |
| Week 36 | 186 (72.7) | 177 (65.8) |
| Week 40 | 178 (69.5), 179 (69.9) | 182 (67.7), 182 (67.7) |
| Week 44 | 177 (69.1) | 178 (66.2) |
| Week 48 | 175 (63.4) | 178 (66.2) |
| Week 52 | 178 (69.5), 178 (69.5) | 179 (66.5), 180 (66.9) |

Values are n (%), listed in column order within each treatment group. Only measurements at scheduled visits are shown. CO, carbon monoxide; NUI, Nicotine Use Inventory; MADRS, Montgomery Åsberg Depression Rating Scale; HAM-A, Hamilton Anxiety Rating Scale