Effectiveness of non-steroidal anti-inflammatory drugs for the treatment of pain in knee and hip osteoarthritis: a network meta-analysis RETRACTED

Similar documents
Database of Abstracts of Reviews of Effects (DARE) Produced by the Centre for Reviews and Dissemination Copyright 2017 University of York.

Received: 11 Sep 2006 Revisions requested: 6 Nov 2006 Revisions received: 3 Jan 2007 Accepted: 31 Jan 2007 Published: 31 Jan 2007

Efficacy and safety of paracetamol for spinal pain and osteoarthritis: systematic review and meta-analysis of randomised placebo controlled trials

CARDIOVASCULAR RISK and NSAIDs

ACR OA Guideline Development Process Knee and Hip

Anneloes van Walsem 1, Shaloo Pandhi 2, Richard M Nixon 2, Patricia Guyot 1, Andreas Karabis 1* and R Andrew Moore 3

Oral or transdermal opioids for osteoarthritis of the knee or hip (Review)

Cochrane Pregnancy and Childbirth Group Methodological Guidelines

Glucosamine May Reduce Pain in Individuals with Knee Osteoarthritis

TRANSPARENCY COMMITTEE OPINION. 26 November 2008

CTU Bern, Bern University Hospital & Institute of Social and Preventive Medicine (ISPM)

Systematic Reviews and Meta- Analysis in Kidney Transplantation

NATIONAL INSTITUTE FOR HEALTH AND CARE EXCELLENCE

Systematic review with multiple treatment comparison metaanalysis. on interventions for hepatic encephalopathy

What is indirect comparison?

Background: Traditional rehabilitation after total joint replacement aims to improve the muscle strength of lower limbs,

Systematic reviews and meta-analyses of observational studies (MOOSE): Checklist.

Uses and misuses of the STROBE statement: bibliographic study

(i) Is there a registered protocol for this IPD meta-analysis? Please clarify.

Supplementary appendix

Determinants of quality: Factors that lower or increase the quality of evidence

Papers. Abstract. Introduction. Methods. Jonathan J Deeks, Lesley A Smith, Matthew D Bradley

Oral or transdermal opioids for osteoarthritis of the knee or hip (Review)

ARCHE Risk of Bias (ROB) Guidelines

the cardiovascular safety of non-steroidal anti-inflammatory

The QUOROM Statement: revised recommendations for improving the quality of reports of systematic reviews

Bayesian meta-analysis of Papanicolaou smear accuracy

Cochrane Breast Cancer Group

Controlled Trials. Spyros Kitsiou, PhD

Systematic Reviews and Meta-analysis in Aquatic therapy

Knee osteoarthritis (OA) is a common and progressive REVIEW

Copyright GRADE ING THE QUALITY OF EVIDENCE AND STRENGTH OF RECOMMENDATIONS NANCY SANTESSO, RD, PHD

British Medical Journal. June 3, 2006;332: Patricia M Kearney, Colin Baigent, Jon Godwin, Heather Halls, Jonathan R Emberson, Carlo Patrono

Traumatic brain injury

GRADE: Applicazione alle network meta-analisi

PROSPERO International prospective register of systematic reviews

Safinamide (Addendum to Commission A15-18) 1

Meta-analyses: analyses:

PRESCRIBING SUPPORT TEAM AUDIT: Etoricoxib hypertension safety evaluation

Single dose oral analgesics for acute postoperative pain in adults (Review)

Ibuprofen versus other non-steroidal anti-in ammatory drugs: use in general practice and patient perception

PDP 406 CLINICAL TOXICOLOGY

Drug Class Review on Cyclo-oxygenase (COX)-2 Inhibitors and Non-steroidal Anti-inflammatory Drugs (NSAIDs)

Effective Health Care Program

PROSPERO International prospective register of systematic reviews

Results. NeuRA Worldwide incidence April 2016

PROSPERO International prospective register of systematic reviews

Efficacy and tolerability of celecoxib in osteoarthritis patients who previously failed naproxen and ibuprofen: results from two trials

Capture-recapture method for assessing publication bias

Problem solving therapy

School of Dentistry. What is a systematic review?

National Institute for Health and Clinical Excellence. CG 59 Review of Osteoarthritis Guideline Review Consultation Comments Table

Characteristics of selective and non-selective NSAID use in Scotland

Empirical evidence on sources of bias in randomised controlled trials: methods of and results from the BRANDO study

Treatment effect estimates adjusted for small-study effects via a limit meta-analysis

The legally binding text is the original French version

Is Tanezumab More Effective than a Placebo in Reducing Pain in Patients with Osteoarthritis?

Chemoprevention of colorectal cancer in individuals with previous colorectal neoplasia: systematic review and network meta-analysis

Month/Year of Review: January 2012 Date of Last Review: February 2007

Setting The setting was primary and secondary care. The economic study was carried out in the UK.

Drug Class Review Nonsteroidal Antiinflammatory Drugs (NSAIDs)

PROSPERO International prospective register of systematic reviews

Abwägung zwischen Schaden und Nutzen medizinischer Interventionen

Osteoarthritis (OA), the most common joint

Workshop: Cochrane Rehabilitation 05th May Trusted evidence. Informed decisions. Better health.

The RoB 2.0 tool (individually randomized, cross-over trials)

Alcohol interventions in secondary and further education

Results. NeuRA Mindfulness and acceptance therapies August 2018

SUPPLEMENTARY DATA. Supplementary Figure S1. Search terms*

Effective Health Care Program

Evidence Based Medicine

SYSTEMATIC REVIEW: AN APPROACH FOR TRANSPARENT RESEARCH SYNTHESIS

Canadian Chiropractic Guideline Initiative (CCGI) Guideline Summary

Alectinib Versus Crizotinib for Previously Untreated Alk-positive Advanced Non-small Cell Lung Cancer : A Meta-Analysis

research methods & reporting

Learning from Systematic Review and Meta analysis

The moderating impact of temporal separation on the association between intention and physical activity: a meta-analysis

NSAIDs: Side Effects and Guidelines

Study protocol v. 1.0 Systematic review of the Sequential Organ Failure Assessment score as a surrogate endpoint in randomized controlled trials

Does the use of NSAIDs amongst patients with long-bone fractures increase the risk of non-union: A structured review protocol for a systematic review.

Quality and Reporting Characteristics of Network Meta-analyses: A Scoping Review

Choice of axis, tests for funnel plot asymmetry, and methods to adjust for publication bias

Systematic reviews: From evidence to recommendation. Marcel Dijkers, PhD, FACRM Icahn School of Medicine at Mount Sinai

COMPOUNDING PHARMACY SOLUTIONS PRESCRIPTION COMPOUNDING FOR PAIN MANAGEMENT

Results. NeuRA Hypnosis June 2016

NON STEROIDEAL ANTI-INFLAMMATORY DRUGS AND CARDIOVASCULAR RISK. Advances in Cardiac Arrhythmias and Great Innovations in Cardiology

Results from a 52-Week, Phase 2A Study of an Intra-Articular, Wnt Pathway Inhibitor, SM04690, for Knee Osteoarthritis

Meta-Analysis. Zifei Liu. Biological and Agricultural Engineering

Downloaded from:

Meta-analysis of diagnostic research. Karen R Steingart, MD, MPH Chennai, 15 December Overview

Models for potentially biased evidence in meta-analysis using empirically based priors

Suffolk PCT Drug & Therapeutics Committee New Medicine Report (Adopted by the CCG until review and further notice)

18/11/2013. An Introduction to Meta-analysis. In this session: What is meta-analysis? Some Background Clinical Trials. What questions are addressed?

Gastrointestinal Safety of Coxibs and Outcomes Studies: What s the Verdict?

IMMEDIATE DICLOFENAC NEW CONTRAINDICATIONS AND WARNINGS AFTER A EUROPE-WIDE REVIEW OF CARDIOVASCULAR SAFETY

This document has not been circulated to either the industry or Consultants within the Suffolk system.

Annual Rheumatology & Therapeutics Review for Organizations & Societies

Critical appraisal: Systematic Review & Meta-analysis

Cardiovascular and Gastrointestinal Effects of Etoricoxib in the Treatment of Osteoarthritis: A Systematic Review and Network Meta-analysis

SM04690: Potential first-in-class disease modifying treatment for knee osteoarthritis. Nancy Lane, MD

Transcription:

Effectiveness of non-steroidal anti-inflammatory drugs for the treatment of pain in knee and osteoarthritis: a network meta-analysis Bruno R da Costa*, Stephan Reichenbach*, Noah Keller, Linda Nartey, Simon Wandel, Peter Jüni, Sven Trelle Summary Background Non-steroidal anti-inflammatory drugs (NSAIDs) are the backbone of osteoarthritis pain management. We aimed to assess the effectiveness of different preparations and doses of NSAIDs on osteoarthritis pain in a network meta-analysis. Methods For this network meta-analysis, we considered randomised trials comparing any of the following interventions: NSAIDs, paracetamol, or placebo, for the treatment of osteoarthritis pain. We searched the Cochrane Central Register of Controlled Trials (CENTRAL) and the reference lists of relevant articles for trials published between Jan 1, 1980, and Feb 24, 2015, with at least 100 patients per group. The prespecified primary and secondary outcomes were pain and physical function, and were extracted in duplicate for up to seven timepoints after the start of treatment. We used an extension of multivariable Bayesian random effects models for mixed multiple treatment comparisons with a random effect at the level of trials. For the primary analysis, a random walk of first order was used to account for multiple follow-up outcome data within a trial. Preparations that used different total daily dose were considered separately in the analysis. To assess a potential dose response relation, we used preparation-specific covariates assuming linearity on log relative dose. Findings We identified 8973 manuscripts from our search, of which 74 randomised trials with a total of 58 556 patients were included in this analysis. 23 nodes concerning seven different NSAIDs or paracetamol with specific daily dose of administration or placebo were considered. All preparations, irrespective of dose, improved point estimates of pain symptoms when compared with placebo. For six interventions (diclofenac 150 mg/day, etoricoxib 30 mg/day, 60 mg/day, and 90 mg/day, and rofecoxib 25 mg/day and 50 mg/day), the probability that the difference to placebo is at or below a prespecified minimum clinically important effect for pain reduction (effect size [ES] 0 37) was at least 95%. Among maximally approved daily doses, diclofenac 150 mg/day (ES 0 57, 95% credibility interval [CrI] 0 69 to 0 46) and etoricoxib 60 mg/day (ES 0 58, 0 73 to 0 43) had the highest probability to be the best intervention, both with 100% probability to reach the minimum clinically important difference. Treatment effects increased as drug dose increased, but corresponding tests for a linear dose effect were significant only for celecoxib (p=0 030), diclofenac (p=0 031), and naproxen (p=0 026). We found no evidence that treatment effects varied over the duration of treatment. Model fit was good, and between-trial heterogeneity and inconsistency were low in all analyses. All trials were deemed to have a low risk of bias for blinding of patients. Effect estimates did not change in sensitivity analyses with two additional statistical models and accounting for methodological quality criteria in meta-regression analysis. Interpretation On the basis of the available data, we see no role for single-agent paracetamol for the treatment of patients with osteoarthritis irrespective of dose. We provide sound evidence that diclofenac 150 mg/day is the most effective NSAID available at present, in terms of improving both pain and function. Nevertheless, in view of the safety profile of these drugs, physicians need to consider our results together with all known safety information when selecting the preparation and dose for individual patients. Funding Swiss National Science Foundation (grant number 405340-104762) and Arco Foundation, Switzerland. Introduction Osteoarthritis is the most common form of joint disease and the leading cause of pain in elderly people. 1 Pain symptoms associated with osteoarthritis result in increased physical and walking disability, which in turn increase the risk of all-cause mortality. 1 3 Management of osteoarthritis pain is based on a sequential hierarchical approach, with non-steroidal anti-inflammatory drugs (NSAIDs) being the main form of treatment. 4,5 In the USA, about 65% of patients with osteoarthritis are prescribed NSAIDs, making them one of the most widely used drugs in this patient population. 6 When prescribing NSAIDs, clinicians are faced with a myriad of different preparations and dosages, which poses a challenge to clinical decision making. Analyses of routine data suggest that initial treatment is characterised by switching between drugs or complete discontinuation. 7 Inadequate pain control is probably a major reason for this approach, which is usually what patients perceive as the main treatment target. 8,9 Several guidelines and Lancet 2016; 387: 2093 105 Published Online March 17, 2016 http://dx.doi.org/10.1016/ S0140-6736(16)30002-2 See Comment page 2065 *Both authors contributed equally to this work Institute of Primary Health Care (BIHAM) (B R da Costa PhD, N Keller MMed, Prof P Jüni MD), Institute of Social and Preventive Medicine (S Reichenbach MD, S Trelle MD), and CTU Bern, Department of Clinical Research (L Nartey MD, S Trelle), University of Bern, Bern, Switzerland; Department of Rheumatology, Immunology and Allergology, University Hospital and University of Bern (S Reichenbach); Cogitars GmbH (in Liq), Wangen an der Aare, Switzerland (S Wandel PhD); and Applied Health Research Centre, Li Ka Shing Knowledge Institute of St Michael s Hospital (Prof P Jüni) and Department of Medicine (Prof P Jüni), University of Toronto, Toronto, Canada Correspondence to: Dr Sven Trelle, CTU Bern, University of Bern, Finkenhubelweg 11, 3012 Bern, Switzerland sven.trelle@ctu.unibe.ch www.thelancet.com Vol 387 May 21, 2016 2093

See Online for appendix systematic reviews have investigated the effectiveness of NSAIDs for treatment of osteoarthritis pain. 10,11 However, these reviews report only the effect of NSAIDs on pain reduction as compared with placebo and therefore are only of restricted use for clinical practice. A few systematic reviews have looked at the comparative effectiveness of different NSAIDs, but considered only direct evidence and did not address different drug preparations or drug doses. 12,13 Network meta-analysis allows an integrated analysis of all randomised controlled trials that compare different doses of NSAIDs head to head or with placebo while fully respecting randomi sation. 14 We assessed the effectiveness of different preparations and doses of NSAIDs for osteoarthritis pain by integrating all available direct and indirect evidence in a network meta-analysis. Methods Selection criteria We considered large-scale randomised controlled trials of patients with knee or osteoarthritis, comparing any of the following interventions: NSAIDs, paracetamol (acetaminophen), or placebo, for the treatment of osteoarthritis pain. Trials of NSAIDs or treatment groups within trials of NSAIDs that did not have enough data to be included in our previously published safety assessment analysis were not considered (see appendix for protocol). 15 Trials that included patients with diseases other than osteoarthritis of the knee or must either have reported results separately for the different patient populations or have at least 80% of the included patients with knee or osteoarthritis. Trials must have randomly assigned, on average, at least 100 patients per group to minimise bias due to small study effects. 16 We excluded trials published only as abstracts (with no additional data available from other sources). No language restrictions were applied. Identification of trials We searched the Cochrane Central Register of Controlled Trials (CENTRAL) for eligible trials (appendix) from Jan 1, 1980, to Feb 24, 2015, using the search terms: osteoarthriti* OR osteoarthro* OR gonarthriti* OR gonarthro* OR coxarthriti* OR coxarthro* OR arthros* OR arthrot*. Although CENTRAL includes all randomised trials indexed in Embase and MEDLINE, indexing of trials from these databases in CENTRAL might be delayed. To avoid missing relevant trials because of this time lag, we also searched Embase and MEDLINE from Jan 1, 2009, to Feb 24, 2015. Furthermore, we screened for potential trials in an internal database of musculoskeletal trials consisting of 721 trials. We then screened reference lists of all obtained articles, including relevant reviews, and searched ClinicalTrials.gov for trials in progress. In the case of incomplete data, we searched for additional data in ClinicalTrials.gov, WHOapproved trial registries, company-specific trial registries, and documents available on the website of the US Food and Drug Admini stration). The search was last updated on Feb 24, 2015. Selection of studies and data extraction Two investigators independently assessed all trials for eligibility and extracted data with the help of a translator for non-english reports. In the case of disagreements, consensus was reached through discussion. We screened trials for eligibility, extracted data, and reached consensus using a standardised, piloted web-based data management tool for systematic reviews, accompanied by a codebook. We extracted trial design; trial size; details of intervention including dose and treatment duration; patient characteristics such as mean age, sex, and mean duration of symptoms; duration of follow-up; type and source of financial support; type of outcome (pain or function); and outcome data for each timepoint of interest. For crossover trials, we extracted data from the first period only, because of possible carryover effects. Whenever necessary, we approximated means and measures of dispersion from figures in the reports as previously described. 17 We extracted results from intention-to-treat analyses whenever possible. Endpoints Our prespecified primary outcome was pain. If data for more than one pain scale were provided for a trial, we referred to a previously described hierarchy of pain related outcomes and extracted data for the pain scale that was highest on this list: (1) global pain score; (2) pain on walking; (3) WOMAC osteoarthritis index pain subscore; (4) composite pain scores other than WOMAC; (5) pain on activities other than walking (such as stair climbing); (6) WOMAC global score; (7) Lequesne osteoarthritis index global score; (8) other algofunctional composite scores; (9) patient s global assessment; (10) physician s global assessment. 5,17 In this list, global pain takes precedence over pain on walking and the Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) pain subscores. We extracted pain outcome data at the following timepoints whenever available: 1 week (±2 days), 2 weeks (±2 days), 4 weeks (±1 week), 6 weeks (±1 week), 3 months (±1 month), 6 months (±1 month), 12 months (±1 month), and at the end of treatment if not covered by the specific timepoints. The secondary outcome was physical function. If data for more than one physical function scale were provided for a trial, we referred to a previously described hierarchy of physical function related outcomes, in which global measures take precedence over more complex scales, and extracted data on the physical function scale that was highest on this list: (1) global function score; (2) walking disability; (3) WOMAC osteoarthritis index physical function subscore; (4) composite physical function scores other than WOMAC; (5) physical function on activities other than walking (such as stair climbing); (6) WOMAC global score; (7) Lequesne osteoarthritis index global 2094 www.thelancet.com Vol 387 May 21, 2016

score; (8) other algofunctional composite scores; (9) patient s global assessment; (10) physician s global assessment. 5,17 We extracted physical function outcome data at the following timepoints whenever available: 1 week (±2 days), 2 weeks (±2 days), and at the end of treatment if not covered by the specific timepoints. Risk of bias assessment We assessed methodological quality of included trials using a slightly adapted version of the risk of bias approach of the Cochrane Collaboration (appendix). 18 Statistical methods For the analysis of effect sizes, we used an extension of multivariable Bayesian random effects models for mixed multiple treatment comparisons (appendix). 19,20 It fully preserves the direct randomised comparisons within each trial, but allows the comparison of all available treatments across trials, and accounts for multiple comparisons in trials with more than two treatment groups. 21 The model includes a random effect at the level of trials, and uses a random walk to account for correlation of outcome data reported at various timepoints within a trial. The model assumes that, for any trial, the outcome data recorded at a specified timepoint are more similar to the outcome data recorded at adjacent timepoints immediately before and after than at nonadjacent, more remote timepoints. In this sense, the model borrows strength across timepoints for an estimate. For all analyses reported herein, we fixed the timepoint at 6 weeks but did a sensitivity analysis with the timepoint fixed at week 1 (appendix). To assess the robustness of the results obtained by the primary model, we did two sensitivity analyses (appendix). These sensitivity analyses investigated different assumptions about the potential relation between time and treatment effect. Furthermore, we adjusted the results of the primary outcome for trial characteristics (ie, concealment of allocation, therapist blinding, completeness of outcome data, last-observationcarried-forward as imputation method, and whether patients with knee,, or knee and osteoarthritis were included in the analysis) by incorporating a regression coefficient in the model. Corresponding twosided p values for interaction between treatment effects and trial characteristic were estimated from the posterior distribution. To assess potential dose response relations, we introduced preparation-specific covariates, assuming linearity on log relative dose (appendix). To assess whether treatment effects varied over time, we did separate analyses per timepoint. For all variables, minimally informative prior distributions were chosen (appendix), and all estimates reported are posterior medians with corresponding 95% credibility intervals (CrIs), unless stated otherwise. Effect sizes were calculated by dividing the difference in mean values between treatment groups in a specific time window by the median pooled SD recorded across all timepoints in a trial. 22 If SDs were not provided, we calculated them from SEs or CIs as described elsewhere. 17,23 We assessed the goodness of fit of the model to the data by calculating the number of means of standardised node-based residuals within 1 96 of the standard normal distribution; visually inspecting the distribution of residuals on Q Q plots; calculating the heterogeneity of treatment effects estimated from the posterior median between trial variance τ²; 24 and calculating the consistency of the network (determined by the difference in effect sizes derived from direct and indirect comparisons). 25 To examine the data for small study effects, we constructed comparison-adjusted funnel plots. 26 To facilitate interpretation of estimated treatment effects, we calculated several metrics for each intervention. First, the median rank and 95% CrIs; an intervention with a median rank of 1 would be interpreted as having the most beneficial effect. Second, the probability for the effect of the experimental intervention to reach the minimum clinically important difference of 0 37 SD units, with high probabilities favouring the active 12 13 11 14 10 09 15 16 08 17 07 18 Figure 1: Network of comparisons included in the analyses The size of every circle is proportional to the number of randomly assigned patients and indicates the sample size. The width of the lines corresponds to the number of trials. 01=placebo. 02=paracetamol <2000 mg. 03=paracetamol 3000 mg. 04=paracetamol 3900 4000 mg. 05=rofecoxib 12 5 mg. 06=rofecoxib 25 mg. 07=rofecoxib 50 mg. 08=lumiracoxib 100 mg. 09=lumiracoxib 200 mg. 10=lumiracoxib 400 mg. 11=etoricoxib 30 mg. 12=etoricoxib 60 mg. 13=etoricoxib 90 mg. 14=diclofenac 70 mg. 15=diclofenac 100 mg. 16=diclofenac 150 mg. 17=celecoxib 100 mg. 18=celecoxib 200 mg. 19=celecoxib 400 mg. 20=naproxen 750 mg. 21=naproxen 1000 mg. 22=ibuprofen 1200 mg. 23=ibuprofen 2400 mg. 19 06 20 05 04 21 03 22 02 23 01 www.thelancet.com Vol 387 May 21, 2016 2095

Altman et al (2007) Baerwald et al (2010) Bensen et al (1999) Bin et al (2007) Bingham et al (2007) Bingham et al (2007a) Birbara et al (2006) Birbara et al (2006a) Bocanegra et al (1998) Boureau et al (2004) Cannon et al (2000) Caruso et al (1987) Conaghan et al (2013) Dahlberg et al (2009) Day et al (2000) DeLemos et al (2011) Doherty et al (2011) Ehrich et al (2001) Emery et al (2008) Essex et al (2012) Interventions Placebo vs paracetamol (650 mg/tid) vs paracetamol (1300 mg/tid) Placebo vs naproxen (50 mg/bid) vs celecoxib (100 mg/bid) vs celecoxib (200 mg/bid) vs naproxen Celecoxib vs lumiracoxib vs etoricoxib (30 mg/qid) vs etoricoxib (30 mg/qid) vs rofecoxib (12 5 mg/qid) vs rofecoxib (12 5 mg/qid) Placebo vs diclofenac (75 mg/bid) Ibuprofen (200 mg/q6d) vs paracetamol (500 mg/q6d) Diclofenac (50 mg/tid) vs rofecoxib (12 5 mg/qid) vs rofecoxib Placebo vs naproxen (125 mg/q6d) (100 mg/bid) Celecoxib vs diclofenac (50 mg/bid) Placebo vs ibuprofen (800 mg/tid) vs rofecoxib (12 5 mg/qid) vs rofecoxib Ibuprofen (400 mg/tid) vs paracetamol (1000 mg/tid) Placebo vs rofecoxib (12 5 mg/qid) vs rofecoxib vs rofecoxib (50 mg/qid) Celecoxib vs diclofenac (50 mg/tid) Celecoxib vs naproxen Intervention nodes (intervention node number)* Placebo (1) vs paracetamol <2000 mg (2) vs paracetamol 3900 4000 mg (4) Placebo (1) vs naproxen 100 mg (17) vs celecoxib vs celecoxib 400 mg (19) vs naproxen Celecoxib vs lumiracoxib 200 mg (9) vs etoricoxib 30 mg (11) vs etoricoxib 30 mg (11) vs rofecoxib 12 5 mg (5) vs rofecoxib 12 5 mg (5) Placebo (1) vs diclofenac 150 mg (16) Ibuprofen 1200 mg (22) vs paracetamol 3000 mg (3) Diclofenac 150 mg (16) vs rofecoxib 12 5 mg (5) vs rofecoxib Placebo (1) vs naproxen 750 mg (20) Celecoxib vs diclofenac 100 mg (15) Placebo (1) vs ibuprofen 2400 mg (23) vs rofecoxib 12 5 mg (5) vs rofecoxib Ibuprofen 1200 mg (22) vs paracetamol 3000 mg (3) Placebo (1) vs rofecoxib 12 5 mg (5) vs rofecoxib vs rofecoxib 50 mg (7) Celecoxib vs diclofenac 150 mg (16) Celecoxib vs naproxen Number of patients/ proportion of women (%) Follow-up (weeks) Mean age (years) Joint 483/66% 12 62 Knee and Lastobservationcarriedforward Risk of bias Concealed allocation Patient blinding/ investigator binding Yes Unclear Low/unclear Low 810/66% 15 63 Hip Yes Unclear Low/unclear Low 1004/72% 12 62 Knee Yes Unclear Low/low Low 703/85% 6 61 Knee Yes Unclear Low/low Low 599/67% 26 63 Knee and 608/66% 26 62 Knee and No Unclear Low/low High No Unclear Low/low High 395/72% 6 61 Knee Yes Unclear Low/low High 413/65% 6 61 Knee Yes Unclear Low/low High 572/69% 6 63 Knee and 222/73% 2 67 Knee and 784/67% 54 64 Knee and Yes Unclear Low/unclear Low Yes Unclear Low/unclear High Yes Low Low/unclear High Incomplete outcome data 734/74% 4 2 59 Knee and Yes Unclear Low/low High 1399/66% 12 61 Knee Yes Unclear Low/low High 925/70% 52 71 Knee and 809/80% 7 64 Knee and Yes Low Low/low High Yes Unclear Low/low High 1011/63% 13 60 Knee and Yes Unclear Low/low High 892/49% 13 61 Knee Yes Unclear Low/low High 672/71% 6 62 Knee and Yes Unclear Low/low High 249/ 12 64 Hip Yes Unclear Low/low High 589/66% 26 60 Knee Yes Unclear Low/low High (Table continues on next page) 2096 www.thelancet.com Vol 387 May 21, 2016

Interventions (Continued from previous page) Essex et al (2012a) vs naproxen Essex et al (2014) vs naproxen Fleischmann et al (2006) (400 mg/qid) GAIT (2006) Gibofsky et al (2003) vs rofecoxib Gibofsky et al (2014) Placebo vs diclofenac (35 mg/bid) vs diclofenac (35 mg/tid) Gottesdiener et al (2002) Placebo vs etoricoxib (30 mg/qid) vs etoricoxib (60 mg/qid) vs etoricoxib (90 mg/qid) Hawkey et al (2000) Placebo vs ibuprofen (800 mg/tid) vs rofecoxib vs rofecoxib (50 mg/qid) Herrero-Beaumont et al Placebo vs paracetamol (2007) (1000 mg/tid) Hochberg et al (2011) Hochberg et al (2011a) Karlsson et al (2009) Placebo vs rofecoxib Kivitz et al (2001) (100 mg/qid) vs celecoxib vs celecoxib (400 mg/qid) vs naproxen Kivitz et al (2002) Placebo vs naproxen Kivitz et al (2004) Placebo vs rofecoxib (12 5 mg/qid) Lehmann et al (2005) (100 mg/qid) vs lumiracoxib Leung et al (2002) Placebo vs etoricoxib (60 mg/qid) vs naproxen Lohmander et al (2005) Placebo vs naproxen Makarowski et al (2002) Placebo vs naproxen Intervention nodes (intervention node number)* Number of patients/ proportion of women (%) Follow-up (weeks) Mean age (years) Joint Lastobservationcarriedforward Risk of bias Concealed allocation Patient blinding/ investigator binding Incomplete outcome data vs naproxen 322/80% 6 58 Knee Yes Unclear Low/low High 318/66% 6 60 Knee Yes Unclear Low/low High vs naproxen 1608/66% 15 61 Knee Yes Unclear Low/low High 200 mg (9) vs lumiracoxib 400 mg (10) 1583/64% 24 59 Knee Yes Unclear Low/low Low 477/67% 6 63 Knee Yes Unclear Low/low High vs rofecoxib Placebo (1) vs diclofenac 305/67% 13 62 Knee and Yes Unclear Low/unclear High 70 mg (14) vs diclofenac 100 mg (15) Placebo (1) vs etoricoxib 617/72% 14 61 Knee Yes Low Low/low High 30 mg (11) vs etoricoxib 60 mg (12) vs etoricoxib 90 mg (13) Placebo (1) vs ibuprofen 775/75% 24 62 Knee and Yes Unclear Low/low High 2400 mg (23) vs rofecoxib vs rofecoxib 50 mg (7) Placebo (1) vs paracetamol 325/86% 25 8 64 Knee Yes Low Low/low High 3000 mg (3) 619/63% 12 62 Knee Yes Low Low/low High 615/63% 12 62 Knee Yes Low Low/low High Placebo (1) vs rofecoxib 543/65% 6 62 Knee and Yes Unclear Low/low High 1061/66% 2 63 Hip Yes Unclear Low/unclear High 100 mg (17) vs celecoxib vs celecoxib 400 mg (19) vs naproxen Placebo (1) vs naproxen 1019/65% 2 60 Knee Yes Unclear Low/unclear High Placebo (1) vs rofecoxib 1042/68% 6 63 Knee Yes Unclear Low/low High 12 5 mg (5) 1684/70% 13 62 Knee Yes Low Low/low High 100 mg (8) vs lumiracoxib 200 mg (9) Placebo (1) vs etoricoxib 501/78% 12 63 Knee and Yes Unclear Low/low High 60 mg (12) vs naproxen Placebo (1) vs naproxen 970/73% 7 59 Knee and Yes Unclear Low/unclear High Placebo (1) vs naproxen 467/68% 12 62 Hip Yes Unclear Low/unclear High (Table continues on next page) www.thelancet.com Vol 387 May 21, 2016 2097

Interventions Intervention nodes (intervention node number)* Number of patients/ proportion of women (%) Follow-up (weeks) Mean age (years) Joint Lastobservationcarriedforward Risk of bias Concealed allocation Patient blinding/ investigator binding Incomplete outcome data (Continued from previous page) McKenna et al (2001) 600/65% 6 62 Knee Yes Unclear Low/unclear High (100 mg/bid) vs diclofenac (50 mg/tid) vs diclofenac 150 mg (16) Miceli-Richard et al Placebo vs paracetamol Placebo (1) vs paracetamol 779/75% 6 70 Knee Yes Unclear Low/unclear Low (2004) (1000 mg/qid) 3900 4000 mg (4) Novartis (2005) 408/ 1 66 Knee Yes Unclear Low/low High (200 mg/bid) vs lumiracoxib (400 mg/qid) 400 mg (19) vs lumiracoxib 200 mg (9) vs lumiracoxib 400 mg (10) Novartis (2005a) 1551/ 13 61 Knee Yes Unclear Low/low High (100 mg/qid) vs lumiracoxib 100 mg (8) vs lumiracoxib 200 mg (9) Novartis (2006) 1684/ 13 62 Knee Yes Unclear Low/low Low (100 mg/qid) vs lumiracoxib 100 mg (8) vs lumiracoxib 200 mg (9) Novartis (2006a) Celecoxib vs Celecoxib vs 703/ 7 61 Knee Yes Unclear Low/low Low lumiracoxib lumiracoxib 200 mg (9) Novartis (2007) 1262/62% 13 62 Hip Yes Unclear Low/low Low (100 mg/qid) 100 mg (8) PACES (2004) 524/63% 14 64 Knee and Yes Unclear Low/unclear Low vs paracetamol (1000 mg/qid) vs paracetamol 3900 4000 mg (4) PACESa (2004) 556/64% 14 64 Knee and Yes Unclear Low/unclear Low vs paracetamol (1000 mg/qid) vs paracetamol 3900 4000 mg (4) Prior et al (2014) Placebo vs paracetamol Placebo (1) vs paracetamol 542/74% 12 62 Knee and Yes Low Low/low Low (1300 mg/tid) 3900 4000 mg (4) Puopolo et al (2007) Placebo vs etoricoxib Placebo (1) vs etoricoxib 548/76% 12 63 Knee and Yes Low Low/low High (30 mg/qid) vs ibuprofen (800 mg/tid) 30 mg (11) vs ibuprofen 2400 mg (23) Reginster et al (2007) Placebo vs etoricoxib Placebo (1) vs etoricoxib 997/72% 12 63 Knee and Yes Unclear Low/low High (60 mg/qid) vs naproxen 60 mg (12) vs naproxen Rother et al (2007) 397/60% 6 63 Knee Yes Low Low/unclear Low (100 mg/bid) Saag et al (2000) Placebo vs ibuprofen Placebo (1) vs ibuprofen 736/74% 6 62 Knee and Yes Unclear Low/low High (800 mg/tid) vs rofecoxib (12 5 mg/qid) vs rofecoxib 2400 mg (23) vs rofecoxib 12 5 mg (5) vs rofecoxib Saag et al (2000a) Diclofenac (50 mg/tid) vs Diclofenac 150 mg (16) vs 693/80% 52 62 Knee and Yes Unclear Low/unclear Low rofecoxib (12 5 mg/qid) vs rofecoxib rofecoxib 12 5 mg (5) vs rofecoxib Schnitzer et al (2004) Lumiracoxib (400 mg/qid) vs Lumiracoxib 400 mg (10) 9511/77% 56 64 Knee and Yes Low Low/low High naproxen vs naproxen Schnitzer et al (2005) Placebo vs naproxen vs rofecoxib Placebo (1) vs naproxen vs rofecoxib 672/62% 6 60 Knee Yes Unclear Low/low High (Table continues on next page) 2098 www.thelancet.com Vol 387 May 21, 2016

Interventions (Continued from previous page) Schnitzer et al (2005a) Celecoxib vs paracetamol (1000 mg/qid vs rofecoxib (12 5 mg/qid) vs rofecoxib Schnitzer et al (2009) Schnitzer et al (2010) Schnitzer et al (2011) Schnitzer et al (2011a) Sheldon et al (2005) Smugar et al (2006) Smugar et al (2006a) Sowers et al (2005) Tannenbaum et al (2004) Weaver et al (2006) Wiesenhutter et al (2005) Williams et al (2000) Paracetamol (1300 mg/tid) vs rofecoxib (12 5 mg/qid) vs rofecoxib Placebo vs naproxen (100 mg/qid) Placebo vs naproxen (100 mg/qid) vs lumiracoxib vs rofecoxib (12 5 mg/qid) vs rofecoxib vs rofecoxib Celecoxib vs naproxen vs rofecoxib (400 mg/qid) Placebo vs rofecoxib (12 5 mg/qid) Placebo vs etoricoxib (30 mg/qid) vs ibuprofen (800 mg/tid) (100 mg/bid) vs celecoxib Intervention nodes (intervention node number)* Celecoxib vs paracetamol 3900 4000 mg (4) vs rofecoxib 12 5 mg (5) vs rofecoxib Paracetamol 3900 4000 mg (4) vs rofecoxib 12 5 mg (5) vs rofecoxib Placebo (1) vs naproxen 100 mg (8) Placebo (1) vs naproxen 100 mg (8) vs lumiracoxib 200 mg (9) vs rofecoxib 12 5 mg (5) vs rofecoxib vs rofecoxib Celecoxib vs naproxen vs rofecoxib 200 mg (9) vs lumiracoxib 400 mg (10) Placebo (1) vs rofecoxib 12 5 mg (5) Placebo (1) vs etoricoxib 30 mg (11) vs ibuprofen 2400 mg (23) Number of patients/ proportion of women (%) Follow-up (weeks) Mean age (years) Joint Lastobservationcarriedforward Risk of bias Concealed allocation Patient blinding/ investigator binding 1578/67% 6 62 Knee Yes Unclear Low/low High 403/58% 4 60 Knee Yes Unclear Low/low High 918/70% 13 61 Knee Yes Unclear Low/unclear High 1262/62% 17 62 Hip Yes Unclear Low/low Low 1020/70% 53 60 Knee No Unclear Low/unclear High 1551/62% 13 60 Knee Yes Unclear Low/low Low 1521/68% 6 62 Knee and 1082/66% 6 62 Knee and 404/60% 12 63 Knee and Yes Unclear Low/low High Yes Unclear Low/low High Yes Unclear Low/unclear High 1702/69% 13 64 Knee Yes Low Low/low Low 978/70% 6 63 Knee Yes Unclear Low/low High 528/70% 12 62 Knee Yes Low Low/low High 686/66% 6 63 Knee Yes Low Low/low High Incomplete outcome data (Table continues on next page) treatment. This threshold of 0 37 SD units is based on the median minimum clinically important difference reported in studies in patients with osteoarthritis. 27 An effect size of 0 37 corresponds to a difference of 9 mm on a 100 mm visual analogue scale. Third, the surface under the cumulative ranking (SUCRA) line; an intervention with a SUCRA value of 100 is certain to be the best, whereas an intervention with 0 is certain to be the worst. 28 Analyses were done with Stata (StataCorp LP 2005. Stata Statistical Software: Release 12. College Station, TX, USA) and WinBUGS (MRC Biostatistics Unit 2007. Version 1.4.3 Cambridge, UK). Role of the funding source The funder of the study had no role in study design, data collection, data analysis, data interpretation, or writing of www.thelancet.com Vol 387 May 21, 2016 2099

Intervention Paracetamol <2000 mg Paracetamol 3000 mg Paracetamol 3900 4000 mg* Rofecoxib 12 5 mg Rofecoxib 25 mg* Rofecoxib 50 mg Lumiracoxib 100 mg Lumiracoxib 200 mg* Lumiracoxib 400 mg Etoricoxib 30 mg Etoricoxib 60 mg* Etoricoxib 90 mg Diclofenac 70 mg Diclofenac 100 mg Diclofenac 150 mg* Celecoxib 100 mg Celecoxib 200 mg Celecoxib 400 mg* Naproxen 750 mg Naproxen 1000 mg* Ibuprofen 1200 mg Ibuprofen 2400 mg* Interventions (Continued from previous page) Williams et al (2001) (100 mg/bid) vs celecoxib Wittenberg et al (2006) (200 mg/bid) vs lumiracoxib (400 mg/qid) Yoo et al (2014) Celecoxib vs etoricoxib (30 mg/qid) Zacher et al (2003) Diclofenac (50 mg/tid) vs etoricoxib (60 mg/qid) Zhao et al (1999) (50 mg/bid) vs celecoxib (100 mg/bid) vs celecoxib (200 mg/bid) vs naproxen 1 25 1 0 0 75 0 50 0 25 0 0 25 0 50 Favours active treatment Intervention nodes (intervention node number)* 400 mg (19) vs lumiracoxib 400 mg (10) Celecoxib vs etoricoxib 30 mg (11) Diclofenac 150 mg (16) vs etoricoxib 60 mg (12) 100 mg (17) vs celecoxib vs celecoxib 400 mg (19) vs naproxen Favours placebo Number of patients/ proportion of women (%) Effect size (95% CrI) 0 08 ( 0 41 to 0 27) 0 17 ( 0 66 to 0 32) 0 17 ( 0 27 to 0 06) 0 43 ( 0 50 to 0 35) 0 51 ( 0 58 to 0 43) 0 62 ( 0 84 to 0 39) 0 34 ( 0 44 to 0 24) 0 34 ( 0 43 to 0 25) 0 43 ( 0 57 to 0 28) 0 49 ( 0 61 to 0 37) 0 58 ( 0 73 to 0 43) 0 61 ( 0 90 to 0 33) 0 22 ( 0 61 to 0 16) 0 34 ( 0 58 to 0 10) 0 57 ( 0 69 to 0 46) 0 16 ( 0 29 to 0 03) 0 36 ( 0 41 to 0 32) 0 33 ( 0 45 to 0 21) 0 05 ( 0 42 to 0 32) 0 40 ( 0 47 to 0 33) 0 30 ( 0 84 to 0 25) 0 43 ( 0 55 to 0 31) Follow-up (weeks) p=0 83 p=0 46 p=0 24 p=0 77 p=0 031 p=0 030 p=0 026 p=0 50 Figure 2: Estimates of the treatment effects on pain for different daily doses of NSAIDs and paracetamol compared with placebo Analysis considers data from all timepoints as available. Area between dashed lines shows the treatment effect estimates below the minimum clinically important difference. Two-sided p values are derived from tests of linear dose effect. NSAID=non-steroidal anti-inflammatory drug. CrI=credibility interval. *Maximum approved daily dose. Mean age (years) Joint Lastobservationcarriedforward Risk of bias Concealed allocation Patient blinding/ investigator binding 718/70% 6 62 Knee Yes Unclear Low/low High 364/58% 3 65 Knee Yes Unclear Low/low Low 239/90% 12 63 Knee Yes Unclear Low/low High Incomplete outcome data 516/80% 8 63 Knee and Yes Unclear Low/low High 1004/72% 12 62 Knee Yes Unclear Low/unclear High Full references for all trials are given in the appendix. bid=twice a day. tid=three times a day. qid=four times a day. q6d=six times a day. *Intervention node number corresponds to those given in the legend of figure 1. Industry funded trial. Unclear whether industry funded. Table: Characteristics of included trials the report. All authors had full access to all the data in the study and the corresponding author had final responsibility for the decision to submit for publication. Results We identified 8973 reports, of which 74 randomised clinical trials investigating seven different NSAIDs and paracetamol were described and included in the analysis (appendix). 23 nodes were included in our network meta-analysis. Each of the nodes concerned different interventions with specific daily dose of administration, or placebo (figure 1). Celecoxib 200 mg/day was the most frequently investigated intervention (39 trials), whereas four interventions were investigated by only one trial (table). Across trials, the mean age of patients ranged from 58 to 71 years, the percentage of female patients ranged from 49% to 90%, and the median follow-up was 12 weeks (range 1 52 weeks). In total, 58 556 patients were included in our primary analysis of osteoarthritis pain. The interventions with the most randomly assigned patients were celecoxib 200 mg/day (11 411 patients) and naproxen 1000 mg/day (8195 patients), whereas the interventions with the fewest randomly assigned patients were diclofenac 70 mg/day (104 patients) and etoricoxib 90 mg/day (112 patients). All trials were judged to have a low risk of bias for blinding of patients, 73% for blinding of therapists, 26% for incomplete outcome data, and 19% for concealment of allocation. None of the trials was thought to have a 2100 www.thelancet.com Vol 387 May 21, 2016

Intervention Rofecoxib 50 mg Etoricoxib 90 mg Diclofenac 150 mg* Etoricoxib 60 mg* Rofecoxib 25 mg* Etoricoxib 30 mg Rofecoxib 12 5 mg Ibuprofen 2400 mg* Lumiracoxib 400 mg Naproxen 1000 mg* Celecoxib 200 mg Diclofenac 100 mg Lumiracoxib 100 mg Celecoxib 400 mg* Lumiracoxib 200 mg* Ibuprofen 1200 mg Diclofenac 70 mg Paracetamol 3000 mg Paracetamol 3900 4000 mg* Celecoxib 100 mg Paracetamol <2000 mg Naproxen 750 mg Placebo 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 Rank high risk of bias for any of the methodological quality items assessed, except for incomplete outcome data, since 55 (74%) of the 74 trials analysed had excluded at least one of the randomly assigned patients from their analysis. 50 (68%) of the 74 trials analysed reported the use of last-observation-carried-forward for imputation of missing data, four (5%) did not need to use imputation, and 20 (27%) did not do an intention-to-treat analysis. None of the trials reported the use of multiple imputation. 68 (92%) of the trials analysed received financial funding from a commercial body; source of funding was unclear for the other six (8%; table). Pooled effect sizes suggested that all interventions, irrespective of dose, improved osteoarthritic pain symptoms when compared with placebo (figure 2). For five interventions (paracetamol <2000 mg/day and 3000 mg/day, diclofenac 70 mg/day, naproxen 750 mg/day, and ibuprofen 1200 mg/day), not enough statistical evidence was available to support superiority when compared with placebo (ie, estimates were also compatible with a null effect). For ten interventions, effect sizes exceeded the prespecified minimum clinically important difference of 0 37. For six interventions (diclofenac 150 mg/day, etoricoxib 30 mg/day, 60 mg/day, and 90 mg/day, and rofecoxib 25 mg/day and 50 mg/day), sufficient statistical evidence was available to support a minimum clinically important effect (ie, the probability Median rank (95% CrI) 2 00 (1 00 11 00) 2 00 (1 00 14 00) 3 00 (1 00 8 00) 3 00 (1 00 8 00) 5 00 (2 00 9 00) 6 00 (2 00 12 00) 9 00 (5 00 14 00) 9 00 (4 00 16 00) 9 00 (3 00 16 00) 11 00 (7 00 15 00) 13 00 (9 00 16 00) 14 00 (3 00 20 00) 14 00 (8 00 18 00) 14 00 (8 00 18 00) 14 00 (9 00 18 00) 15 00 (1 00 23 00) 17 00 (3 00 23 00) 19 00 (3 00 23 00) 19 00 (16 00 21 00) 19 00 (15 00 22 00) 21 00 (10 00 23 00) 21 00 (10 00 23 00) 22 00 (19 00 23 00) Probability to reach MID 98 95 100 100 100 98 93 83 79 78 41 40 28 27 25 40 23 21 0 0 4 4 Reference Figure 3: Median rank, probability of reaching MID, and SUCRA values of competing interventions and daily doses MID=minumum clinically important difference. SUCRA=surface under the cumulative ranking curve. CrI=credibility interval. *Maximum daily dose SUCRA (%) that the difference from placebo is at or below the prespecified threshold of 0 37 was at least 95%; figure 3). Visual inspection of point estimates in figure 2 suggests that treatment effects increased as drug doses increased, but CrIs of different doses within one preparation overlapped widely and a corresponding test for a linear dose effect was only significant for celecoxib (p=0 030), diclofenac (p=0 031), and naproxen (p=0 026). Figure 3 shows the median rank of interventions according to their treatment effect, the probability for the treatment effect of a specific intervention to reach the minimum clinically important difference, and the SUCRA. The top four ranked interventions were etoricoxib 90 mg/day, rofecoxib 50 mg/day, diclofenac 150 mg/day, and etoricoxib 60 mg/day. Among these interventions, only diclofenac 150 mg/day and etoricoxib 60 mg/day had a 100% probability to reach the minimum clinically important difference. No evidence existed that treatment effects varied with the duration of treatment (appendix). Effect sizes suggest that for 20 (95%) of 21 interventions physical function was improved when compared with placebo (no data for physical function were available for etoricoxib 90 mg; figure 4). However, for six inter ventions (paracetamol <2000 mg/day or 3000 mg/day, rofecoxib 50 mg/day, diclofenac 70 mg/day, naproxen 750 mg/day, and ibuprofen 1200 mg/day), not enough evidence exists (p>0 05) to support superiority when compared with 92 89 91 91 82 78 65 65 65 58 50 48 45 44 44 46 34 29 22 21 17 16 8 www.thelancet.com Vol 387 May 21, 2016 2101

Intervention Paracetamol <2000 mg Paracetamol 3000 mg Paracetamol 3900 4000 mg* Rofecoxib 12 5 mg Rofecoxib 25 mg* Rofecoxib 50 mg Lumiracoxib 100 mg Lumiracoxib 200 mg* Lumiracoxib 400 mg Etoricoxib 30 mg Etoricoxib 60 mg* Diclofenac 70 mg Diclofenac 100 mg Diclofenac 150 mg* Celecoxib 100 mg Celecoxib 200 mg Celecoxib 400 mg* Naproxen 750 mg Naproxen 1000 mg* Ibuprofen 1200 mg Ibuprofen 2400 mg* 1 25 1 0 0 75 0 50 0 25 0 0 25 0 50 Favours active treatment Favours placebo Effect size (95% CrI) 0 04 ( 0 30 to 0 37) 0 30 ( 0 78 to 0 19) 0 14 ( 0 25 to 0 04) 0 38 ( 0 47 to 0 29) 0 48 ( 0 56 to 0 40) 0 30 ( 0 68 to 0 07) 0 35 ( 0 47 to 0 24) 0 37 ( 0 47 to 0 27) 0 41 ( 0 57 to 0 25) 0 45 ( 0 59 to 0 31) 0 52 ( 0 70 to 0 35) 0 22 ( 0 62 to 0 17) 0 39 ( 0 72 to 0 06) 0 51 ( 0 65 to 0 38) 0 21 ( 0 36 to 0 06) 0 35 ( 0 40 to 0 30) 0 32 ( 0 47 to 0 18) 0 03 ( 0 39 to 0 33) 0 41 ( 0 49 to 0 33) 0 51 ( 1 07 to 0 07) 0 38 ( 0 51 to 0 26) Figure 4: Estimates of the treatment effects on physical function for different daily doses of NSAIDs and paracetamol compared with placebo Analysis considers data from all timepoints as available. Area between dashed lines shows treatment effect estimates below the minimum clinically important difference. NSAID=non-steroidal anti-inflammatory drug. CrI=credibility interval. *Maximum approved daily dose. placebo. In ten interventions, effect sizes exceeded the prespecified minimum clinically important difference of 0 37. For two interventions (diclofenac 150 mg/day and rofecoxib 25 mg/day), enough evidence exists to support a minimum clinically important treatment effect. The model fit was good for both pain and physical function outcomes (appendix). τ² estimates suggest low statistical heterogeneity for both pain (0 011, 95% CrI 0 007 0 017) and physical function (0 010, 0 007 0 015). No relevant inconsistency was seen for both pain and physical function (appendix). Analyses with alternative models for network meta-analysis yielded much the same results (appendix). No evidence exists for an interaction between any of the trial characteristics assessed and the treatment effect (all p values 0 14; appendix). An analysis adjusted for patient blinding was not possible since all trials had low risk of bias for this item. Comparison-adjusted funnel plots show no evidence of asymmetry (appendix). Discussion In this network meta-analysis comparing the effectiveness of different treatment regimens of NSAIDs, paracetamol, or placebo, diclofenac 150 mg/day seemed to be the most effective in terms of pain and physical function. The magnitude of treatment effect estimates varied greatly across different NSAIDs and doses. Whereas paracetamol had nearly a null effect on pain symptoms at various doses (effect size of 0 17, corresponding to 4 mm difference on a 100 mm visual analogue scale), diclofenac 150 mg/day had a moderate to large effect size of 0 57, corresponding to difference on a 100 mm visual analogue scale of 14 mm. This is 1 5 times the minimum clinically important difference for chronic pain of 0 37. Our results also suggest that a typical patient with only osteoarthritis has 100% probability of having a minimum clinically important improvement when taking diclofenac 150 mg/day, etoricoxib 60 mg/day, or rofecoxib 25 mg/day. Finally, these findings are corroborated by similar treatment effects on pain and physical function within each of the interventions. Although our findings suggest that some NSAIDs have a clinically relevant treatment effect on osteoarthritis pain, their benefit has to be weighed against their potential harmful effects. 29 For example, previous analyses suggest that diclofenac increases the risk for cardiovascular events, especially cardiovascular death, but that the upper gastrointestinal complications are similar to cyclooxygenase-2 (COX-2) inhibitors. 15,30 By contrast, naproxen does not seem to increase cardio vascular risk but substantially increases the likelihood of upper gastrointestinal complications. Appropriate drug selection is a major challenge in patients with osteoarthritis, who are often elderly with polypharmacy. 31 Our study will help to put the available safety data into perspective. The length of follow-up in most of the included trials was short to intermediate, with durations of 3 months or less. We believe this is an accurate representation of clinical practice. Nowadays, NSAIDs, in view of their well established gastrointestinal and cardiovascular harm, 15,30 will typically be prescribed on an as-needed basis, with intermittent short-term to mid-term use in doses as required, rather than a long-term fixed dose. Our network meta-analysis provides data for the short-term to midterm analgesic effectiveness of different NSAIDs and paracetamol at different doses that are relevant to present practice. However, we acknowledge that some people might call for additional long-term trials that directly compare head-to-head, not only the most effective preparation-dose combinations identified in our network meta-analysis, but also NSAIDs on a continuous fixeddose versus NSAIDs on an as-needed basis, before coming to definite conclusions. These measures could be achieved with a two-by-two factorial design, in which diclofenac 150 mg/day is compared with etoricoxib 60 mg/day, either on a fixed or as-needed basis for at least 12 months. The available safety data do not allow the same extent of resolution according to different doses. The most pertinent analyses of safety data are by the Coxib and traditional NSAID Trialists Collaboration, 30 who reported on both gastrointestinal and cardiovascular safety, and by our group, who reported cardiovascular safety only. 15 The Coxib and traditional NSAID Trialists Collaboration showed that all NSAIDs are associated with increased gastrointestinal 2102 www.thelancet.com Vol 387 May 21, 2016

risk compared with placebo. Both groups recorded the same results for cardiovascular toxic effects, with naproxen being the only exception. Because of the low number of events per each specific dose, neither group could distinguish between different doses of a specific NSAID. The quality of our analysis is limited by the quality of the underlying data. Whether investigators in some trials were properly blinded was unclear, and most trials had a high risk of incomplete outcome data bias because they used the inappropriate approach of last-observation carriedforward to impute missing data. However, by disregarding small studies, we minimised the risk of biases due to small study effects. 16 The generally high quality of included trials could be viewed as reassuring for this approach. Moreover, results were much the same after we controlled for trial characteristics that could potentially confound our results. Although the overall number of included patients was fairly large in view of the continuous nature of our outcomes, the number of individual trials assessing individual doses was still low. Although not a limitation for statistical power, the small number of studies included could be of concern for the generalisability of our results. Confirmation of clinical trial results is required by regulatory authorities to substantiate claims of clinical effectiveness of drugs, and the importance of independent validation of research results is now also acknowledged in the scientific community. 32,33 Most of the included trials were multicentric and therefore replication might not be an issue. 34 Various doses of individual preparations were assessed in the available trials. We chose to separate the various doses by analysing them according to daily doses. This approach hampers statistical efficiency by introducing additional parameters in the statistical model but allows flexible modelling and exploration of any dose effects. However, data were too sparse to explore effects of different schedules, which might also affect treatment effective ness because of differences in the pharma cokinetics of various preparations. 35 We used a comprehensive search strategy and searched pertinent sources to retrieve potentially eligible randomised controlled trials. We therefore believe that it is unlikely that we missed any relevant trials. We used various statistical models to analyse the available data and to increase robustness of our results. All analytical approaches provided qualitatively the same results. Moreover, results were also consistent across the two outcomes of pain and function. The robustness and accuracy of the results are further corroborated by the low between-trial heterogeneity, absence of inconsistency, and good model fit. However, the main limitations of our primary model need to be acknowledged. First, since study-specific covariance estimates are rarely reported, the dependency of outcome data over time within participants is only approximately represented through the random walk. Second, if strong temporal patterns such as a linear trend can be identified in the data, the random walk model used in our analysis cannot account for it properly. Consequently, the primary model does not allow assessment of whether treatment effects varied over time. Study-specific covariance estimates would be needed but were not available. The CrIs of estimates per timepoint presented in the appendix are therefore probably too narrow. However, the analysis still allows the conclusion that variation over time was minimum. One of the most comprehensive systematic reviews 13 so far included 273 studies of NSAIDs, irrespective of the study design, and did not do a meta-analysis. On the basis of a qualitative assessment, the authors concluded that the different preparations of NSAIDs had no clear difference between them. Another large systematic review, 12 including 93 randomised clinical trials, compared non-selective and COX-2-selective NSAIDs and concluded that no difference existed between the two classes. 12 Machado and colleagues 36 focused on the effects of paracetamol, and included only placebo-controlled trials in a standard, pair-wise metaanalysis. On the basis of ten trials that used a dose similar to our highest dose group for paracetamol, they showed an effect that was similar to that in our analysis. 36 As opposed to these previous studies, our network meta-analysis integrates all available high-quality randomised evidence on the effectiveness of NSAIDs to treat osteoarthritis pain in one analysis while fully preserving randomisation. The integration of direct and indirect comparisons results in a gain of statistical precision compared with previous analyses and allows for formal comparisons of NSAIDs with paracetamol and placebo. This integration also facilitates interpretation because it makes comparisons between the different interventions explicit. We are aware of two studies 37,38 related to osteoarthritis that also integrate direct and indirect evidence in one network meta-analysis. Bannuru and colleagues 37 did not investigate all preparations considered in our analysis. Notably, the latest COX-2 inhibitors were missing in their analysis. Moreover, the network metaanalysis 37 combined oral and intra-articular treatments in one analysis, hampering clinical interpretation. Finally, the number of patients in their analysis was substantially lower than that in our analysis and no consideration was given to the longitudinal nature of the data. van Walsem and colleagues 38 also did a comprehensive network meta-analysis. However, their analysis included fewer preparations than did our analysis and only one dose per preparation. Their analysis did not integrate data over time, instead providing results only for 6 and 12 weeks separately. Nevertheless, their results are qualitatively similar to ours; diclofenac 150 mg/day is more effective than the other preparations in the analysis except for etoricoxib 60 mg/day, which showed much the same effectiveness. Thus, the effectiveness of NSAIDs varies substantially, with evidence of dose dependency. Our analysis suggests that paracetamol is clinically ineffective and should not be recommended for the symptomatic treatment of www.thelancet.com Vol 387 May 21, 2016 2103

For an up-to-date list of CTU Bern s conflicts of interest see http://www.ctu.unibe.ch/ research/conflicts_of_interest/ index_eng.html osteoarthritis, irrespective of the dose. Conversely, diclofenac at the maximum daily dose of 150 mg/day is most effective for the treatment of pain and physical disability in osteoarthritis, and superior to the maximum doses of frequently used NSAIDs, including ibuprofen, naproxen, and celecoxib. Etoricoxib at the maximum dose of 60 mg/day is as effective as diclofenac 150 mg/day for treatment of pain, but its effect estimates on physical disability are imprecise. Moreover, etoricoxib is not as accessible as diclofenac because it has marketing authorisation in fewer countries. Etoricoxib 90 mg/day and rofecoxib 50 mg/day, which are both above the approved maximum daily doses, might be more effective for pain treatment, but estimates are imprecise. No evidence exists that supramaximum doses of any of the other NSAIDs further increase effectiveness. Since all NSAIDs with available safety data were associated with clinically relevant gastrointestinal and cardiovascular harms, 15,30 type and dose of NSAID should be chosen on the basis of the analgesic effectiveness reported in this analysis. In view of the well established harms of NSAIDs and the treatment duration in almost all the trials included in this analysis, intermittent short-term use of NSAIDs in moderate to maximum doses as required should be given preference over long-term fixed doses. Contributors BRdC, SR, PJ, and ST were responsible for the conception and design of the study. BRdC, SW, and ST did the analysis and interpreted the analysis in collaboration with SR, NK, LN, and PJ. BRdC, NK, and LN were responsible for the acquisition of data. BRdC and ST wrote the first draft of the Article. All authors critically revised the Article for important intellectual content and approved the final version. SR, PJ, and ST obtained public funding. Declaration of interests LN and ST are affiliated with CTU Bern, University of Bern, which has a staff policy of not accepting honoraria or consultancy fees. However, CTU Bern is involved in design, conduct, or analysis of clinical studies funded by Abbott Vascular, Ablynx, Amgen, AstraZeneca, Biosensors, Biotronik, Boehringer Ingelheim, Eisai, Eli Lilly, Exelixis, Geron, Gilead Sciences, Nestlé, Novartis, Novo Nordisc, Padma, Roche, Schering- Plough, St Jude Medical, and Swiss Cardio Technologies. PJ has received research grants to the institution from AstraZeneca, Biotronik, Biosensors International, Eli Lilly, and the Medicines Company, and serves as an unpaid member of the steering group of trials funded by AstraZeneca, Biotronik, Biosensors, St Jude Medical, and The Medicines Company. SW is now an employee of Novartis Pharma AG, Biometrics and Data Management, Oncology. SW was previously an employee of and currently holds shares in Cogitars GmbH Switzerland (in liquidation). All other authors declare no competing interests. Acknowledgments This study was funded by the Swiss National Science Foundation (grant number 405340-104762) and by a grant from the Arco Foundation, Switzerland. We thank Kali Tal for proofreading the manuscript and Toshi Furukawa for support with the development of the search strategy. References 1 Altman R, Brandt K, Hochberg M, et al. Design and conduct of clinical trials in patients with osteoarthritis: recommendations from a task force of the Osteoarthritis Research Society. Results from a workshop. Osteoarthritis Cartilage 1996; 4: 217 43. 2 Rosemann T, Wensing M, Joest K, Backenstrass M, Mahler C, Szecsenyi J. Problems and needs for improving primary care of osteoarthritis patients: the views of patients, general practitioners and practice nurses. BMC Musculoskelet Disord 2006; 7: 48. 3 Nuesch E, Dieppe P, Reichenbach S, Williams S, Iff S, Juni P. All cause and disease specific mortality in patients with knee or osteoarthritis: population based cohort study. BMJ 2011; 342: d1165. 4 Porcheret M, Jordan K, Jinks C, Croft P. Primary care treatment of knee pain a survey in older adults. Rheumatology (Oxford) 2007; 46: 1694 700. 5 Juni P, Reichenbach S, Dieppe P. Osteoarthritis: rational approach to treating the individual. Best Pract Res Clin Rheumatol 2006; 20: 721 40. 6 Gore M, Tai KS, Sadosky A, Leslie D, Stacey BR. Use and costs of prescription medications and alternative treatments in patients with osteoarthritis and chronic low back pain in community-based settings. Pain Pract 2012; 12: 550 60. 7 Gore M, Sadosky A, Leslie D, Tai KS, Seleznick M. Patterns of therapy switching, augmentation, and discontinuation after initiation of treatment with select medications in patients with osteoarthritis. Clin Ther 2011; 33: 1914 31. 8 Arboleya LR, de la Figuera E, Soledad Garcia M, Aragon B. Management pattern for patients with osteoarthritis treated with traditional non-steroidal anti-inflammatory drugs in Spain prior to introduction of Coxibs. Curr Med Res Opin 2003; 19: 278 87. 9 Cordero-Ampuero J, Darder A, Santillana J, Caloto MT, Nocea G. Evaluation of patients and physicians expectations and attributes of osteoarthritis treatment using Kano methodology. Qual Life Res 2012; 21: 1391 404. 10 McAlindon TE, Bannuru RR, Sullivan MC, et al. OARSI guidelines for the non-surgical management of knee osteoarthritis. Osteoarthritis Cartilage 2014; 22: 363 88. 11 Bjordal JM, Klovning A, Ljunggren AE, Slordal L. Short-term efficacy of pharmacotherapeutic interventions in osteoarthritic knee pain: a meta-analysis of randomised placebo-controlled trials. Eur J Pain 2007; 11: 125 38. 12 Chen YF, Jobanputra P, Barton P, et al. Cyclooxygenase-2 selective non-steroidal anti-inflammatory drugs (etodolac, meloxicam, celecoxib, rofecoxib, etoricoxib, valdecoxib and lumiracoxib) for osteoarthritis and rheumatoid arthritis: a systematic review and economic evaluation. Health Technol Assess 2008; 12: 1 278. 13 Chou R, McDonagh MS, Nakamoto E, Griffin J. Analgesics for osteoarthritis: an update of the 2006 comparative effectiveness review. Rockville, MD: Agency for Healthcare Research and Quality (US), 2011. 14 Higgins JP, Whitehead A. Borrowing strength from external trials in a meta-analysis. Stat Med 1996; 15: 2733 49. 15 Trelle S, Reichenbach S, Wandel S, et al. Cardiovascular safety of non-steroidal anti-inflammatory drugs: network meta-analysis. BMJ 2011; 342: c7086. 16 Nuesch E, Trelle S, Reichenbach S, et al. Small study effects in meta-analyses of osteoarthritis trials: meta-epidemiological study. BMJ 2010; 341: c3515. 17 Reichenbach S, Sterchi R, Scherer M, et al. Meta-analysis: chondroitin for osteoarthritis of the knee or. Ann Intern Med 2007; 146: 580 90. 18 Higgins G, Green S, eds. Cochrane handbook for systematic reviews of interventions version 5.1.0. Updated March, 2011. London: The Cochrane Collaboration, 2011. 19 Smith TC, Spiegelhalter DJ, Thomas A. Bayesian approaches to random-effects meta-analysis: a comparative study. Stat Med 1995; 14: 2685 99. 20 Lu G, Ades AE. Combination of direct and indirect evidence in mixed treatment comparisons. Stat Med 2004; 23: 3105 24. 21 Cooper NJ, Sutton AJ, Lu G, Khunti K. Mixed comparison of stroke prevention treatments in individuals with nonrheumatic atrial fibrillation. Arch Intern Med 2006; 166: 1269 75. 22 Cohen J. Statistical power analysis for the behavioral sciences, 2nd edn. Hillsdale: Lawrence Earlbaum, 1988. 23 Follmann D, Elliott P, Suh I, Cutler J. Variance imputation for overviews of clinical trials with continuous response. J Clin Epidemiol 1992; 45: 769 73. 24 Spiegelhalter DJ, Abrams KR, Myles JP. Bayesian approaches to clinical trials and health care evaluation. Chichester: John Wiley & Sons, 2004. 25 Dias S, Welton NJ, Caldwell DM, Ades AE. Checking consistency in mixed treatment comparison meta-analysis. Stat Med 2010; 29: 932 44. 2104 www.thelancet.com Vol 387 May 21, 2016