Cross-over trials. Martin Bland. Cross-over trials. Cross-over trials. Professor of Health Statistics University of York

Similar documents
Name: emergency please discuss this with the exam proctor. 6. Vanderbilt s academic honor code applies.

Revised Cochrane risk of bias tool for randomized trials (RoB 2.0) Additional considerations for cross-over trials

Notes for laboratory session 2

NEUROBLASTOMA DATA -- TWO GROUPS -- QUANTITATIVE MEASURES 38 15:37 Saturday, January 25, 2003

Age (continuous) Gender (0=Male, 1=Female) SES (1=Low, 2=Medium, 3=High) Prior Victimization (0= Not Victimized, 1=Victimized)

Teaching A Way of Implementing Statistical Methods for Ordinal Data to Researchers

Application of Local Control Strategy in analyses of the effects of Radon on Lung Cancer Mortality for 2,881 US Counties

Research Methods 1 Handouts, Graham Hole,COGS - version 1.0, September 2000: Page 1:

One-Way ANOVAs t-test two statistically significant Type I error alpha null hypothesis dependant variable Independent variable three levels;

The Logic of Data Analysis Using Statistical Techniques M. E. Swisher, 2016

MODEL I: DRINK REGRESSED ON GPA & MALE, WITHOUT CENTERING

CREDO Finale to Hoxby s Revised Memorandum

Exam 3 PS 306, Spring 2005

ANOVA. Thomas Elliott. January 29, 2013

Chapter 5: Field experimental designs in agriculture

Basic statistics for public health and policy

Lessons in biostatistics

Measures of Dispersion. Range. Variance. Standard deviation. Measures of Relationship. Range. Variance. Standard deviation.

Business Research Methods. Introduction to Data Analysis

STA 3024 Spring 2013 EXAM 3 Test Form Code A UF ID #

ABSTRACT THE INDEPENDENT MEANS T-TEST AND ALTERNATIVES SESUG Paper PO-10

appstats26.notebook April 17, 2015

Chapter 9. Factorial ANOVA with Two Between-Group Factors 10/22/ Factorial ANOVA with Two Between-Group Factors

Statistical analysis DIANA SAPLACAN 2017 * SLIDES ADAPTED BASED ON LECTURE NOTES BY ALMA LEORA CULEN

Title: A new statistical test for trends: establishing the properties of a test for repeated binomial observations on a set of items

Multiple Treatments on the Same Experimental Unit. Lukas Meier (most material based on lecture notes and slides from H.R. Roth)

Multiple Linear Regression Analysis

Online Introduction to Statistics

ECON Introductory Econometrics. Seminar 1. Stock and Watson Chapter 2 & 3

Hungry Mice. NP: Mice in this group ate as much as they pleased of a non-purified, standard diet for laboratory mice.

Quantitative Methods in Computing Education Research (A brief overview tips and techniques)

Statistics as a Tool. A set of tools for collecting, organizing, presenting and analyzing numerical facts or observations.

Learning Objectives 9/9/2013. Hypothesis Testing. Conflicts of Interest. Descriptive statistics: Numerical methods Measures of Central Tendency

9/4/2013. Decision Errors. Hypothesis Testing. Conflicts of Interest. Descriptive statistics: Numerical methods Measures of Central Tendency

Comparisons within randomised groups

Analysis of Variance (ANOVA)

ANOVA in SPSS (Practical)

Effect of Source and Level of Protein on Weight Gain of Rats

Dynamic Allocation Methods: Why the Controversy?

2016 Chronic Wound Healing Case Series: Statistical Results:

Research paper. One-way Analysis of Variance (ANOVA) Research paper. SPSS output. Learning objectives. Alcohol and driving ability

Clincial Biostatistics. Regression

Use the above variables and any you might need to construct to specify the MODEL A/C comparisons you would use to ask the following questions.

Supplementary Appendix

CHAPTER ONE CORRELATION

Completely Random Design and Least Significant Differences for Breast Cancer in Al-Najaf City ( )

Chapter 12: Analysis of covariance, ANCOVA

Single-Factor Experimental Designs. Chapter 8


Making comparisons. Previous sessions looked at how to describe a single group of subjects However, we are often interested in comparing two groups

Statistical Essentials in Interpreting Clinical Trials Stuart J. Pocock, PhD

CHAPTER - 6 STATISTICAL ANALYSIS. This chapter discusses inferential statistics, which use sample data to

Sensitivity, Specificity and Predictive Value [adapted from Altman and Bland BMJ.com]

8/28/2017. If the experiment is successful, then the model will explain more variance than it can t SS M will be greater than SS R

This tutorial presentation is prepared by. Mohammad Ehsanul Karim

Sociology 63993, Exam1 February 12, 2015 Richard Williams, University of Notre Dame,

To open a CMA file > Download and Save file Start CMA Open file from within CMA

Reading Time [min.] Group

Chapter 23. Inference About Means. Copyright 2010 Pearson Education, Inc.

HZAU MULTIVARIATE HOMEWORK #2 MULTIPLE AND STEPWISE LINEAR REGRESSION

Overview of Non-Parametric Statistics

MODEL SELECTION STRATEGIES. Tony Panzarella

AMSc Research Methods Research approach IV: Experimental [2]

Chapter 13: Factorial ANOVA

HOW STATISTICS IMPACT PHARMACY PRACTICE?

Introduction to regression

Two-Way Independent Samples ANOVA with SPSS

One-Way Independent ANOVA

Statistics for EES Factorial analysis of variance

SUMMER 2011 RE-EXAM PSYF11STAT - STATISTIK

MULTIPLE REGRESSION OF CPS DATA

SCHOOL OF MATHEMATICS AND STATISTICS

Title: A robustness study of parametric and non-parametric tests in Model-Based Multifactor Dimensionality Reduction for epistasis detection

Sample size for clinical trials

Explaining heterogeneity

Lesson 9: Two Factor ANOVAS

Analysis of Variance (ANOVA) Program Transcript

Things you need to know about the Normal Distribution. How to use your statistical calculator to calculate The mean The SD of a set of data points.

Investigating the robustness of the nonparametric Levene test with more than two groups

Still important ideas

Psych 5741/5751: Data Analysis University of Boulder Gary McClelland & Charles Judd. Exam #2, Spring 1992

1. Objective: analyzing CD4 counts data using GEE marginal model and random effects model. Demonstrate the analysis using SAS and STATA.

Analysis of data in within subjects designs. Analysis of data in between-subjects designs

Types of Statistics. Censored data. Files for today (June 27) Lecture and Homework INTRODUCTION TO BIOSTATISTICS. Today s Outline

Helmut Schütz. Satellite Short Course Budapest, 5 October

Choosing the Correct Statistical Test

Answer Key to Problem Set #1

REVIEW ARTICLE. A Review of Inferential Statistical Methods Commonly Used in Medicine

Dr. Kelly Bradley Final Exam Summer {2 points} Name

Table of Contents. Plots. Essential Statistics for Nursing Research 1/12/2017

Fixed-Effect Versus Random-Effects Models

CHL 5225 H Advanced Statistical Methods for Clinical Trials. CHL 5225 H The Language of Clinical Trials

1) What is the independent variable? What is our Dependent Variable?

Analysis of Variance: repeated measures

The t-test: Answers the question: is the difference between the two conditions in my experiment "real" or due to chance?

Biology 345: Biometry Fall 2005 SONOMA STATE UNIVERSITY Lab Exercise 8 One Way ANOVA and comparisons among means Introduction

Interactions between predictors and two-factor ANOVA

Student name: SOCI 420 Advanced Methods of Social Research Fall 2017

Small Group Presentations

Careful with Causal Inference

Transcription:

Cross-over trials Martin Bland Professor of Health Statistics University of York http://martinbland.co.uk Cross-over trials Use the participant as their own control. Each participant gets more than one treatment. Also known as change-over trials. Cross-over trials Example: a two treatment cross-over trial: pronethalol vs placebo for the treatment of angina. Patients received placebo for two periods of two weeks and pronethalol for two periods of two weeks, in random order. Completed diaries of attacks of angina. Pritchard BNC, Dickinson CJ, Alleyne GAO, Hurst P, Hill ID, Rosenheim ML, Laurence DR. Report of a clinical trial from Medical Unit and MRC Statistical Unit, University College Hospital Medical School, London} BMJ 1963; 2: 1226-. 1

Cross-over trials Example: a two treatment cross-over trial: pronethalol vs placebo for the treatment of angina. Attacks of angina recorded over four weeks: Placebo: 2 3 8 14 1 23 34 60 9 1 323 Pronethalol: 0 0 1 2 15 16 25 29 41 65 348 Mann Whitney U test: P = 0.4. But this ignores the data structure. These observations should be paired. Results of a trial of pronethalol for the treatment of angina pectoris (Pritchard et al., 1963) Patient 1 2 3 Placebo 1 323 8 Pronethalol 29 348 1 Placebo- Pronethalol 42-25 Paired analysis (sign test): P = 0.006. 4 14 5 6 8 9 10 11 23 34 9 60 2 3 1 16 25 65 41 0 0 15 9 14 19 2 3 2 Two sample analysis, ignoring pairing, (Mann Whitney U test): 12 2 5 P = 0.4. Advantages of cross-over designs each participant acts as their own control removes variability between participants, fewer subjects needed. Disadvantages of cross-over designs short term treatment, no follow-up. 2

Cross-over trials are suitable for: chronic diseases (angina, asthma, arthritis), symptomatic treatment, quick, quantitative outcome (attack frequency, lung function, pain scores), early stage in treatment development. Cross-over trials are not suitable for: acute conditions (myocardial infarction, pneumonia), treatment to cure (clot-busters, antibiotics), slow or qualitative outcomes (time to recurrence, death), later stages in development (side effects of long term treatment). Estimation and significance tests Trialists are encouraged to present results as estimates with confidence intervals rather than use significance tests, i.e. give P values. Cross-over trials are typically small, so t methods are required for this. In the pronethalol example, only P values were given. Does this matter? Not so much as in a larger trial, as cross-over trials are usually at an early stage in treatment development. P values are often more important than estimates. A trial where there are two treatments, each given once, in random order, is called a simple two period two treatment cross-over trial or AB/BA design. Analysis will be illustrated using a cross-over trial of a homeopathic preparation intended to reduce mental fatigue. This was a trial in healthy volunteers. On different occasions, paid student and staff volunteers received either the homeopathic preparation or a placebo. They underwent a psychological test to measure their resistance to mental fatigue. 3

There were two treatments labelled A and B, one is a homeopathic dose of potassium phosphate and the other a control. This is a triple blind trial, in that I do not know which is which. Subjects took A or B, in random order, on different occasions, and carried out a test where accuracy was the outcome measurement. There were 86 subjects, 43 for each order. +------------+------------+------------+------------+ A first B first acc1 acc2 acc1 acc2 acc1 acc2 acc1 acc2 ------------ ------------ ------------ ------------ 84 108 106 104 50 101 105 10 85 108 106 10 86 99 105 108 88 82 106 10 89 106 106 96 88 89 106 10 91 102 106 108 88 10 106 108 92 100 106 108 91 104 106 108 93 106 106 108 92 10 106 108 93. 106. 93 89 10 100 9 106 10 105 98 89 10 104 99 106 10 106 98 10 10 105 101 103 10 106 101 80 10 10 102 95 10 106 101 90 10 10 102 99 10 10 101 99 10 108 102 101 10 10 103 98 10 108 102 101 10 108 103 106 108 94 102 106 108 10 103 10 108 104 102 108 108 10 104 10 108 106 102 108 108 108 104 108 108 108 103 105 108 108 105 106 108 108 103 108 108 108 105 10 108 108 104 90 108 108 105 108 108 108 105 104 108 108 106 100 105 10 +------------+------------+------------+------------+ Sorted by first observation. Clear ceiling effect. Two students did not come back for the second measurement. Accuracy score 120 100 80 60 40 Treatment A Treatment B Group Period 1 Period 2 Ceiling effect and negative skewness. Period 2 may have greater accuracy than period 1. 4

We can do a simple test of the treatment effect, by estimating the mean difference, A minus B.. ttest diffamb=0 One-sample t test Variable Obs Mean Std. Err. Std. Dev. [95% Conf. Interval] diffamb 84 1.03514 1.0045 9.20639 -.9621963 3.033625 Degrees of freedom: 83 Ho: mean(diffamb) = 0 Ha: mean < 0 Ha: mean!= 0 Ha: mean > 0 t = 1.0311 t = 1.0311 t = 1.0311 P < t = 0.842 P > t = 0.3055 P > t = 0.1528 Estimated treatment effect = 1.0 (95% CI 1.0 to 3.0, P=0.3). Are the assumptions of this analysis met? Plot of difference against average of the two scores: Accuracy score A minus B 20 0-20 -40-60 0 80 90 100 110 Accuracy score, average A and B A first B first Frequency 30 20 10 0-40 -20 0 20 40 60 Accuracy score, A minus B Differences are related to the magnitude of the measurement, cannot assume SD is well estimated. Assumptions for t method not met. Plot of difference against average of the two scores: Accuracy score A minus B 20 0-20 -40-60 0 80 90 100 110 Accuracy score, average A and B A first B first Frequency 30 20 10 0-40 -20 0 20 40 60 Accuracy score, A minus B Differences have symmetrical distribution. Could use Wilcoxon matched-pairs (signed rank) test. 5

Non-parametric analysis, Wilcoxon paired test :. signrank diffamb=0 Wilcoxon signed-rank test sign obs sum ranks expected -------------+--------------------------------- positive 36 1991.5 139.5 negative 35 148.5 139.5 zero 13 91 91 -------------+--------------------------------- all 84 350 350 unadjusted variance 502.50 adjustment for ties -180.00 adjustment for zeros -204.5 ---------- adjusted variance 49892.5 Ho: diffamb = 0 z = 1.128 Prob > z = 0.2592 P = 0.3, as before. Conclusion: no evidence for a treatment effect. Non-parametric analysis, Wilcoxon paired test:. signrank diffamb=0 Wilcoxon signed-rank test sign obs sum ranks expected -------------+--------------------------------- positive 36 1991.5 139.5 negative 35 148.5 139.5 zero 13 91 91 -------------+--------------------------------- all 84 350 350 unadjusted variance 502.50 adjustment for ties -180.00 adjustment for zeros -204.5 ---------- adjusted variance 49892.5 Ho: diffamb = 0 z = 1.128 Prob > z = 0.2592 Conclusion: no evidence for a treatment effect. But we can do better. The difference between periods has gone into the error. Using the period effect, step by step method using t tests (Armitage and Hills, 1982). To see how the analysis works, we will use the following notation: A1 = the mean for A in the first period A2 = the mean for A in the second period B1 = the mean for B in the first period B2 = the mean for B in the second period Armitage P and Hills M. (1982) The two period cross-over trial. The Statistician 31, 119-131. 6

First we ask whether there is evidence for a period effect, i.e. are scores in the first period the same as in the second? For example, there might be a learning effect, with accuracy increasing with repetition of the test. If there is no period effect, we expect the differences between the treatment to be the same in the two periods. The period effect, first period mean minus second period mean, will be estimated by (A1 A2 + B1 B2)/2. We can rearrange this as (A1 B2 A2 + B1)/2 = (A1 B2)/2 (A2 B1)/2 (A1 B2) is the mean treatment difference for the group with A first, (A2 B1) is the mean treatment difference for the group with A first. We can test the null hypothesis that the difference between these two mean differences is zero. Compare difference A minus B between orders. Compare difference A minus B between orders. 60 Accuracy score A minus B 40 20 0-20 A first Order B first

Compare difference A minus B between orders.. ttest diffamb, by(order) Two-sample t test with equal variances Group Obs Mean Std. Err. Std. Dev. [95% Conf. Interval] A first 43 -.8604651 1.295952 8.49812-3.45803 1.5482 B first 41 3.02439 1.49898 9.598145 -.0051582 6.053939 combined 84 1.03514 1.0045 9.20639 -.9621963 3.033625 diff -3.884855 1.9546 -.815243.0455321 Degrees of freedom: 82 Ho: mean(a first) - mean(b first) = diff = 0 Ha: diff < 0 Ha: diff!= 0 Ha: diff > 0 t = -1.9663 t = -1.9663 t = -1.9663 P < t = 0.0263 P > t = 0.052 P > t = 0.93 Weak evidence of a period effect, P=0.05. Compare difference A minus B between orders. Non-parametric analysis using Mann-Whitney U test:. ranksum diffamb, by(order) Two-sample Wilcoxon rank-sum (Mann-Whitney) test order obs rank sum expected -------------+--------------------------------- A first 43 1610 182.5 B first 41 1960 142.5 -------------+--------------------------------- combined 84 350 350 unadjusted variance 1248.92 adjustment for ties -151.09 ---------- adjusted variance 12336.83 Ho: diffamb(order==a first) = diffamb(order==b first) z = -1.958 Prob > z = 0.0502 Weak evidence of a period effect, P=0.05. We can allow for a possible period effect. We estimate the average of the differences between A and B for each period: (A1 B1)/2 + (A2 B2)/2 (A1 B1)/2 + (A2 B2)/2 = (A1 B2)/2 (B1 A2)/2 (A1 B2) and (B1 A2) are the differences between periods 1 and 2, for those starting with A and for those starting with B. Test the difference between difference between periods 1 and 2 for the two orders. This is called the CROS analysis. 8

Test the difference between difference between periods 1 and 2 for the two orders.. ttest diff1m2, by(order) Two-sample t test with equal variances Group Obs Mean Std. Err. Std. Dev. [95% Conf. Interval] A first 43 -.8604651 1.295952 8.49812-3.45803 1.5482 B first 41-3.02439 1.49898 9.598145-6.053939.0051582 combined 84-1.91666.98893 9.062312-3.883309.049956 diff 2.163925 1.9546-1.66462 6.094313 diff = mean(a first) - mean(b first) t = 1.0952 Ho: diff = 0 degrees of freedom = 82 Ha: diff < 0 Ha: diff!= 0 Ha: diff > 0 Pr(T < t) = 0.861 Pr( T > t ) = 0.266 Pr(T > t) = 0.1383 No evidence for a treatment effect. Estimate 2.163925/2 = 1.1 (95% CI 0.9 to 3.0, P=0.3). Test the difference between difference between periods 1 and 2 for the two orders.. ranksum diff1m2, by(order) Two-sample Wilcoxon rank-sum (Mann-Whitney) test order obs rank sum expected -------------+--------------------------------- A first 43 1949 182.5 B first 41 1621 142.5 -------------+--------------------------------- combined 84 350 350 unadjusted variance 1248.92 adjustment for ties -100. ---------- adjusted variance 1238.15 Ho: diff1m2(order==a first) = diff1m2(order==b first) z = 1.092 Prob > z = 0.250. No evidence for a treatment effect, P=0.3. Same analysis by analysis of variance:. anova score sub treat period Number of obs = 10 R-squared = 0.6490 Root MSE = 6.40033 Adj R-squared = 0.265 Source Partial SS df MS F Prob > F -----------+---------------------------------------------------- Model 6210.0834 8 1.380229 1.4 0.0059 sub 5990.61699 85 0.48469 1.2 0.001 treat 49.1391331 1 49.1391331 1.20 0.266 period 158.3228 1 158.3228 3.8 0.052 Residual 3359.0692 82 40.9642585 -----------+---------------------------------------------------- Total 9569.15294 169 56.6222068 9

Is the treatment difference the same whatever order of treatments is given? I.e., is there an interaction between period and treatment? Is there an order effect? If not, the participant s average response should be the same whichever order treatments were given: is A1 + B2 = A2 + B1? To test for period treatment interaction, we compare the sum or the average of the scores between orders. To test for period treatment interaction, we compare the sum or the average of the scores between orders. 110 Accuracy score average A and B 100 90 80 0 A first Order B first To test for period treatment interaction, we compare the sum or the average of the scores between orders.. ttest av1and2, by(order) Two-sample t test with equal variances Group Obs Mean Std. Err. Std. Dev. [95% Conf. Interval] A first 43 102.593.9138191 5.992312 100.489 104.432 B first 41 103.2439.9346191 5.984482 101.355 105.1328 combined 84 102.910.6504313 5.961301 101.61 104.2044 diff -.650892 1.3016-3.251251 1.949493 Degrees of freedom: 82 Ho: mean(a first) - mean(b first) = diff = 0 Ha: diff < 0 Ha: diff!= 0 Ha: diff > 0 t = -0.499 t = -0.499 t = -0.499 P < t = 0.3099 P > t = 0.6199 P > t = 0.6901. No evidence of an interaction, P=0.6. 10

Period treatment interaction In the mental fatigue trial, there could be an interaction because of the ceiling effect and practice. One treatment could raise scores to the ceiling in the first period and all get near the ceiling in the second period. Another possibility in cross-over trials is a carry-over effect, where the first treatment continues to have an effect in the second period. Note that if the interaction is significant, there is a significant treatment effect. Period treatment interaction Should we test it? Two views: Grizzle (1965) says yes. If significant, use period 1 data only. Difference, A B = 0.5 (95% CI 3.2 to 4.2, P=0.8). Called the two-stage analysis. Grizzle JE. (1965) The two-period change-over design and its use in clinical trials. Biometrics, 21, 46-480. Period treatment interaction Should we test it? Two views: Grizzle (1965) says yes. If significant, use period 1 data only. Difference, A B = 0.5 (95% CI 3.2 to 4.2, P=0.8). Called the two-stage analysis. Compare the full data estimate: Difference, A B = 1.1 (95% CI 0.9 to 3.0, P=0.3). We lose power and precision. 11

Period treatment interaction Should we test it? Two views: Senn (1989) says no. The average of the first and second periods is highly correlated with the first period. Senn S. Cross-Over Trials in Clinical Research. Chichester: Wiley, 1989. Period treatment interaction Should we test it? Two views: Senn (1989) says no. The average of the first and second periods is highly correlated with the first period. Accuracy score Period 1 120 100 80 60 40 40 60 80 100 120 Accuracy score, average A and B Hence the correlated treatment test using first period only is highly correlated with the interaction test. The alpha value conditional on the interaction test is >0.05. Period treatment interaction The power of the test is low and alpha = 0.10 is often recommended as a decision point. Example: Nicardipine against placebo in patients with Raynaud s phenomenon (Kahan et al., 198, given by Altman, 1991). Kahan A, Amor B, Menkes CJ, et al., Nicardipine in the treatment of Raynaud s phenomenon: a randomised doubleblind trial. Angiology 198; 38: 333-. Altman DG. Practical Statistics for Medical Research Chapman and Hall, London, 1991. 12

Attacks of Raynaud s phenomenon in two week periods. Period 1 Nicardipine 16 26 8 3 9 41 52 10 11 30 Mean Period 2 Placebo 12 19 20 44 25 36 36 11 20 2 Placebo - Nicardipine 4 12 16 5 16 1 9 3 1.0 Period 1 Placebo 18 12 46 51 28 29 51 46 18 44 Period 2 Nicardipine 12 4 3 58 2 18 44 14 30 4 Placebo - Nicardipine 6 8 9 26 11 32 12 40 12.0 Looks like carry-over to me! Period treatment interaction Nicardipine data, testing for interaction:. ttest av, by(order) Two-sample t test with equal variances Group Obs Mean Std. Err. Std. Dev. [95% Conf. Interval] 1st peri 10 24.5 3.952496 12.49889 15.55883 33.4411 2nd peri 10 28.3 4.8202 15.1221 1.4823 39.11 combined 20 26.4 3.050582 13.64262 20.01506 32.8494 diff -3.8 6.204031-16.83419 9.234185 diff = mean(1st peri) - mean(2nd peri) t = -0.6125 Ho: diff = 0 degrees of freedom = 18 Ha: diff < 0 Ha: diff!= 0 Ha: diff > 0 Pr(T < t) = 0.239 Pr( T > t ) = 0.549 Pr(T > t) = 0.261 No evidence for an interaction, P = 0.5. Period treatment interaction Nicardipine data, testing for interaction: No evidence for an interaction, P = 0.5. But there appears to be one! Difference in attacks, placebo - nicardipine -20 0 20 40 1st period 2nd period Period of nicardipine treatment 13

Period treatment interaction Compare treatments:. ttest diff1m2, by(order) Two-sample t test with equal variances Group Obs Mean Std. Err. Std. Dev. [95% Conf. Interval] 1st peri 10-1 3.119829 9.86566-8.05544 6.05544 2nd peri 10 12 5.16829 16.34353.3085399 23.69146 combined 20 5.5 3.29433 14.3449-1.395955 12.39595 diff -13 6.036923-25.68311 -.3168945 diff = mean(1st peri) - mean(2nd peri) t = -2.1534 Ho: diff = 0 degrees of freedom = 18 Ha: diff < 0 Ha: diff!= 0 Ha: diff > 0 Pr(T < t) = 0.0225 Pr( T > t ) = 0.0451 Pr(T > t) = 0.95 Evidence for a treatment effect, P = 0.045. But the estimate must be in doubt, due to the interaction. An aside, CROS or simple paired t test? CROS: Evidence for a treatment effect, P = 0.045. Simple paired t test:. ttest diffamb=0 One-sample t test Variable Obs Mean Std. Err. Std. Dev. [95% Conf. Interval] diffamb 20 6.5 3.1945 14.29943 -.192339 13.19234 mean = mean(diffamb) t = 2.0329 Ho: mean = 0 degrees of freedom = 19 Ha: mean < 0 Ha: mean!= 0 Ha: mean > 0 Pr(T < t) = 0.919 Pr( T > t ) = 0.0563 Pr(T > t) = 0.0281 Evidence for a treatment effect, P = 0.056. CROS adjusts for the period effect so reduces the effect of the (non-significant) period difference a bit. It is more powerful. Period treatment interaction Should we test for a period treatment interaction? Grizzle (1965) proposed this. Senn (1989) claims this is an error. Jones and Kenward (1989) review the question but do not make a strong recommendation. Jones B and Kenward MG. Design and Analysis of Cross-Over Trials. London: Chapman and Hall, 1989. 14

Period treatment interaction Should we test for a period treatment interaction? Grizzle (1965) proposed this. Senn (1989) claims this is an error. Jones and Kenward (1989) review the question but do not make a strong recommendation. I find Senn s argument convincing. I would say do not test or do the two-stage analysis. However, I think it is worth inspecting the data to see whether the assumption of no interaction required for the CROS estimate is plausible. If it is not, rely on the P value. Period treatment interaction What should we do instead of Grizzle s approach using the first period only? Senn (1989) suggests that if the estimate is needed, we should repeat the trial and design the carryover out of it, using washout periods described next. Washout periods A washout period is a time when the participants do not receive any active trial treatment. Intended to prevent continuation of the effects of the trial treatment from one period to another. Typical cross-over trial with washout periods: washout / run-in treatment 1 removes effects of pre-trial treatments washout removes effects of treatment 1 treatment 2 washout removes effects of treatment 2 usual care 15

Washout periods A washout period is necessary if treatments might interact in an adverse way. In a placebo controlled trial, we could simply make the treatment periods longer. In drug trials, washout periods should be at least 3 half life of drug in body (FDA). If no washout periods are used, the treatment periods should be longer than would be required for washout and no measurements made in the time that would be needed for washout. Two texts: Senn S. Cross-Over Trials in Clinical Research, 2 nd ed. Chichester: Wiley, 2002. Jones B and Kenward MG. Design and Analysis of Cross- Over Trials, 2 nd ed. London: Chapman and Hall, 2003. For a brief introduction: Altman DG. Practical Statistics for Medical Research Chapman and Hall, London, 1991. 16