Validation of a prediction model for the follicle-stimulating hormone response dose in women with polycystic ovary syndrome

Madelon van Wely, Ph.D.,a Bart C. J. M. Fauser, M.D., Ph.D.,b Joop S. E. Laven, M.D., Ph.D.,c Marinus J. Eijkemans, Ph.D.,b and Fulco van der Veen, M.D., Ph.D.a

a Center for Reproductive Medicine, Department of Obstetrics and Gynecology, Academic Medical Center, University of Amsterdam, Amsterdam, The Netherlands; b Department of Reproductive Medicine and Gynecology, University Medical Center, Utrecht, The Netherlands; and c Division of Reproductive Medicine, Department of Obstetrics and Gynecology, Erasmus University Medical Center Rotterdam, Rotterdam, The Netherlands

Objective: To validate a published model for the prediction of the individual FSH response dose for gonadotropin induction of ovulation in polycystic ovary syndrome (PCOS).
Design: Structured, complete, and carefully monitored patient-based data collection to test the external validity of the prediction model.
Setting: Twenty-nine hospitals in The Netherlands.
Patient(s): Eighty-five clomiphene citrate (CC)-resistant women with PCOS.
Intervention(s): Ovulation induction in a chronic low-dose step-up FSH regimen.
Main Outcome Measure(s): Predicted individual FSH response dose, with response defined as follicle growth beyond 10 mm in diameter on ultrasound.
Result(s): The model, which uses the woman's body mass index, CC response, initial serum FSH level, and initial serum insulin-to-glucose ratio, was studied in the validation sample. Overall, the FSH response dose predicted by the model was higher than the observed response dose. The predictive performance of the model was poor, with an R² of 0.11, and the average prediction error was 35 IU.
Conclusion(s): The external validity of the model predicting the individual FSH response dose was inadequate in women with CC-resistant PCOS undergoing ovulation induction with recombinant FSH in a low-dose step-up regimen. (Fertil Steril 2006;86:1710–5.
©2006 by American Society for Reproductive Medicine.)

Key Words: Validation, ovulation induction, PCOS, rFSH

Polycystic ovary syndrome (PCOS) is the most common cause of anovulatory infertility and is estimated to affect approximately 6%–7% of women in the general population (1–4). Clomiphene citrate (CC) remains the first-line treatment in women with PCOS, although approximately 20%–25% of women do not ovulate and over 50% do not conceive after CC (5, 6). Women who do not respond to CC represent a clinical challenge. In these CC-resistant women, ovulation induction with gonadotropins is standard treatment. Pregnancy rates between 35% and 70% after ovulation induction with FSH have been reported in the literature (7–9). Ovulation induction with FSH is also associated, especially in PCOS, with a high risk of multiple pregnancies caused by multiple follicular development (10–12). For this reason, adjusted dose regimens and intense monitoring of ovarian response have been implemented to prevent multiple follicular development.

Received January 26, 2006; revised and accepted May 8, 2006.
Reprint requests: Madelon van Wely, Center for Reproductive Medicine, Department of Obstetrics and Gynecology, Meibergdreef 9, 1109 AZ Amsterdam, The Netherlands (FAX: 31206963489; E-mail: m.vanwely@amc.uva.nl).

The amount of exogenous FSH required to induce ovulation varies greatly, because the FSH threshold differs among individual women (13). This variability is considered responsible for the high incidence of complications caused by the development of multiple dominant follicles, which may result in multiple pregnancies and ovarian hyperstimulation syndrome (14, 15). One method of fine-tuning FSH levels and reducing complications is to start with a low dose of FSH and gradually increase the dose until an ovarian response is observed. This approach is referred to as the chronic low-dose step-up regimen.
Another treatment strategy is the step-down regimen, which involves a relatively high initial dose of FSH followed by stepwise decreases (16–18). This approach more closely mimics physiologic conditions in normo-ovulatory women (19). However, the initial standard FSH dose may be too high for women with a low FSH threshold and could induce imminent ovarian hyperstimulation, requiring cancelation of stimulation. Adjustment of the starting dose in women with a low FSH threshold would reduce the incidence of multiple dominant follicle development and related complications during the step-down regimen.

Fertility and Sterility Vol. 86, No. 6, December 2006. 0015-0282/06/$32.00. Copyright ©2006 American Society for Reproductive Medicine, Published by Elsevier Inc. doi:10.1016/j.fertnstert.2006.05.046

The ability of clinicians to choose an appropriate starting dose of FSH for a given patient could considerably improve
treatment outcome in both step-down and low-dose step-up regimens and make this treatment safer and more efficient. Clinicians who prefer a low-dose step-up regimen may start with a higher FSH dose in patients with a predicted high FSH threshold. These patients may reach adequate ovarian stimulation in a shorter time, require less monitoring of ovarian response, and receive less gonadotropin at doses below the FSH threshold. In contrast, women with a low predicted FSH threshold may start with a lower initial dose of FSH to minimize complication or cancelation rates, especially when a step-down protocol is applied.

A model has previously been developed to predict the individual FSH response dose for gonadotropin induction of ovulation (20). This model includes body mass index, response to CC, baseline FSH level, and free insulin-like growth factor I (IGF-I). Free IGF-I, however, is not measured in daily clinical practice. The FSH response dose formula in which free IGF-I was replaced by the insulin-to-glucose ratio had a slightly lower predictive performance (apparent R² of 0.49 compared with 0.54; optimism-corrected R² of 0.34 compared with 0.39).

A model that performs well in an internal validation can produce poor predictions in other patients (21). Therefore, the aim of the present study was to externally validate the response dose estimation model. We assessed the model in women with PCOS who had been treated in centers other than the one in which the model had been developed.

METHODS

Between February 1998 and October 2000, 85 women underwent ovulation induction with recombinant FSH (rFSH) as part of a multicenter controlled trial (22). The Institutional Review Boards of all 29 participating hospitals approved the study, and written informed consent was obtained from each participant.
Polycystic ovary syndrome was diagnosed on the basis of chronic anovulation and the presence of polycystic ovaries on transvaginal ultrasonography, which is in line with a recent consensus on diagnostic criteria (23, 24). All women were CC resistant, i.e., persistent anovulation was observed despite 150 mg CC daily for 5 days during a given cycle. Anovulation was defined as absent follicle development assessed by ultrasonography for 35 days. The primary exclusion criteria were other causes of infertility, such as hyperprolactinemia, hypothalamic amenorrhea, premature ovarian failure, a negative post-coital test, ovarian tumor, previous treatment with gonadotropins, age above 40 years, and a partner with male subfertility. We defined male subfertility as a semen analysis that did not meet the World Health Organization criteria for concentration, motility, and/or morphology (25).

An initial clinical, ultrasonographic, and endocrinologic evaluation took place before ovulation induction with rFSH. Clinical variables studied were age, menarche, duration of infertility, cycle history (amenorrhea or oligomenorrhea), body mass index (BMI), and waist-to-hip ratio (WHR). Oligomenorrhea was defined as a cycle length of more than 35 days but less than 6 months, and amenorrhea was defined as intervals between periods of more than 6 months. Endocrine evaluation included serum assays for FSH, LH, E2, T, androstenedione, sex hormone-binding globulin (SHBG), fasting insulin, glucose, and leptin. The free androgen index (FAI) was calculated as T × 100/SHBG. Blood samples were centrifuged within 2 hours after withdrawal and stored at −20°C until assayed. All hormone assays were performed at a single laboratory (Endocrinology Department, Academic Medical Center Amsterdam) on cryopreserved blood.

The rFSH (follitropin alpha, Gonal-F; Serono Benelux, The Hague, The Netherlands) was administered in a chronic low-dose step-up protocol.
The rFSH starting dose was not based on the FSH response dose as predicted by the model. Treatment started on cycle day 3 of a progesterone-induced bleeding with daily SC injections of 75 IU in all women (26). Follicle development was assessed by transvaginal ultrasonography at weekly intervals, or more frequently if indicated by follicle growth. If no follicle exceeded 10 mm in diameter, the dose was increased by 37.5 IU FSH on cycle day 16. In the absence of follicle growth, the FSH dose was subsequently increased on cycle days 23 and 30. If there was still no follicle development (i.e., no follicle with a diameter beyond 10 mm) on cycle day 35 at the maximum FSH dose of 187.5 IU per day, the cycle was canceled because of poor response. The primary outcome was the FSH dose on the day of ovarian response (the FSH response dose), with response defined as follicle growth beyond 10 mm in diameter.

Development Sample

The population sample in which the prognostic model was developed consisted of 90 women attending the infertility unit at the Erasmus University Medical Center in Rotterdam, The Netherlands. Women who were treated between June 1997 and September 1999 were included. Women had oligo- or amenorrhea, normal serum FSH level (1–10 IU/L), normal serum PRL and TSH levels, spontaneous menses or a positive bleeding response to progestogen withdrawal, BMI of at least 18 kg/m², age of 19–40 years, and previously unsuccessful treatment with CC (failure to ovulate, or failure to conceive in six ovulatory CC cycles). The Erasmus University Medical Center did not participate in the study that provided the validation sample. Thus, the model was assessed in women who had been treated in centers other than the one where the model had been developed.

Data Analysis

Differences in clinical, ultrasonographic, and endocrinologic parameters between the validation sample and the development sample (17) were tested by t tests for continuous variables and chi-squared tests for categorical variables. The prognostic effects of the patient characteristics included in the model were studied in the validation sample using linear regression with FSH response dose as the dependent variable. The formula of the model was:

FSH response dose (75 to 187 IU/day) before initiation of therapy = [3.5 × BMI (in kg/m²)] + [35.6 × CC resistance (yes = 1 or no = 0)] + [2.6 × initial insulin-to-glucose ratio (IGR)] + [6.7 × initial serum FSH level (in IU/L)] − 32.5.

The predicted response dose was calculated for each woman in the validation sample. Calibration was assessed by graphically plotting the observed FSH response dose against the predicted FSH response dose. The overall predictive performance (goodness of fit) of the model in the validation sample was assessed by using the multiple R² value. This number expresses the fraction of the total variance of FSH response dose that is explained by the regression model. The accuracy, or average error, of the model when used to predict the FSH response dose of new patients was assessed by using the root mean squared error. Data were analyzed using SPSS 11.0.1 (SPSS, Chicago, IL) and S-PLUS 2000 (MathSoft, Cambridge, MA).

RESULTS

The cohort of 90 women in whom the prognostic model for response dose had been developed (development sample) consisted of 40 women (44%) who did not ovulate on 150 mg CC and 50 women (56%) who did ovulate on CC but did not conceive in six ovulatory CC cycles (20). The population of 85 women in whom the model was validated (validation sample) consisted only of women who did not ovulate on CC. Of the 85 women, 81 had an ovarian response, defined as follicle growth beyond 10 mm in diameter, during their first ovulation induction cycle. In 4 women the cycle was canceled owing to understimulation. In these women the next cycle was used to determine their response dose.
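The model formula and the two performance measures named under Data Analysis can be illustrated with a minimal Python sketch. The function names are hypothetical, and the R² shown is one common definition (1 minus residual over total sum of squares); the text does not state exactly how the multiple R² was computed in SPSS or S-PLUS, so this is an assumption rather than the authors' procedure. The small observed and predicted lists are invented for illustration and are not study data.

```python
import math

def predicted_response_dose(bmi, cc_resistant, igr, fsh):
    """Predicted FSH response dose (IU/day) from the formula quoted in the text.

    bmi in kg/m^2, cc_resistant as True/False, igr = initial insulin-to-glucose
    ratio, fsh = initial serum FSH in IU/L.
    """
    cc = 1 if cc_resistant else 0
    return 3.5 * bmi + 35.6 * cc + 2.6 * igr + 6.7 * fsh - 32.5

def r_squared(observed, predicted):
    """Assumed definition: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot

def rmse(observed, predicted):
    """Root mean squared error between observed and predicted doses."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

# A CC-resistant woman with the validation-sample mean values from Table 1
# (BMI 27.0, IGR 2.1, FSH 6.0 IU/L); the result is roughly 143 IU/day.
example_dose = predicted_response_dose(27.0, True, 2.1, 6.0)

# Hypothetical observed and predicted doses, for illustration only.
observed = [75.0, 112.5, 150.0]
predicted = [100.0, 130.0, 160.0]
example_r2 = r_squared(observed, predicted)
example_rmse = rmse(observed, predicted)
```

Evaluating the formula at the validation-sample means gives a dose well above the most common observed response doses in Table 2, which is in line with the overestimation reported for the model.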
Baseline characteristics in the 90 women who did not ovulate or conceive with CC (development sample) and in the 85 CC-resistant women with PCOS (validation sample) are presented in Table 1. The clinical characteristics appear comparable in both populations, except for hyperandrogenism. The validation population presented with higher LH, T, FAI, and leptin levels and lower SHBG levels.

TABLE 1
Women's characteristics in the development sample (n = 90)(a) and in the validation sample (n = 85).

Characteristics                     Development sample,   Validation sample,   P value (b)
                                    mean (SD)             mean (SD)
Clinical
  Age, yrs                          29 (4)                28.5 (4.1)           NS
  Infertility duration, yrs         ND                    2.8 (2.2)            NS
  Primary infertility, n (%)        66 (73%)              65 (76%)             NS
  Amenorrhea, n (%)                 37 (33%)              23 (27%)             NS
  Body mass index, kg/m²            26.5 (5.5)            27.0 (5.8)           NS
  Waist-to-hip ratio                0.81 (0.08)           0.83 (0.08)          NS
Ultrasound
  Mean ovarian volume, ml           10.7 (4.2)            11.6 (6.5)           NS
Endocrinology
  LH                                8.1 (5.7)             11 (5.1)             .01
  FSH                               5.1 (1.5)             6.0 (1.6)            NS
  T, nmol/l                         2.5 (1.0)             3.9 (1.3)            .01
  SHBG, nmol/l                      48 (25)               40 (21)              .01
  FAI                               7.1 (5.4)             13.3 (10.2)          .001
  Androstenedione, nmol/l           13.9 (6.7)            11.9 (4.3)           NS
  Insulin, mU/l                     12.3 (8.2)            10.4 (8.0)           NS
  Glucose, mmol/l                   4.3 (0.7)             5.1 (1.3)            NS
  Insulin-to-glucose ratio, ( 10 3  3.0 (2.1)             2.1 (1.7)            NS
  Leptin, ng/ml                     20 (15)               26 (18)              .01

(a) Reference 20.
NS = not significant.
van Wely. Validation of FSH response model in PCOS. Fertil Steril 2006.

The FSH response dose was the dose at which ongoing dominant follicle growth beyond 10 mm was visualized. The response doses in the development sample and the validation sample are listed in Table 2. In the development sample, more women received an FSH dose above 150 IU/day than in the validation sample; furthermore, the maximum dose was higher in the development sample.

TABLE 2
FSH response dose in the development sample and the validation sample.

Response dose     Development sample, n (%)   Validation sample, n (%)
75 IU/d           33 (37)                     26 (31)
112.5 IU/d        27 (30)                     26 (31)
150 IU/d          16 (18)                     27 (32)
187.5 IU/d        10 (11)                     6 (7)
>187.5 IU/d       4 (4)

The observed FSH response doses and those predicted by the model are visualized in a spline plot with 95% confidence intervals (Fig. 1). Overall, the response dose predicted by the model was higher than the observed response dose, with a mean difference of 25 IU (95% CI 18 to 33). This difference was not constant, however. The overestimation in predicted response dose increased with observed response dose, i.e., it was lowest at observed response doses of 75 to 100 IU and highest at the maximum observed response dose of 187 IU. The intercept of the spline was 43 IU and the slope was 0.52. The R² of the model was 0.11, and the average prediction error (root mean squared error) was 35. This means that in the validation sample the model could explain only 11% of the between-woman variability in FSH response dose and that the standard error of the predicted value was about 35 IU.

DISCUSSION

We assessed the validity of a model predicting the individual FSH response dose for gonadotropin induction of ovulation in a low-dose step-up regimen, using a comparable PCOS population different from the sample of patients used to develop the model. This study shows that the model overestimated the FSH response dose. The model predicted an FSH response dose that was on average 25 IU higher than the observed response dose. The overestimation increased with increasing predicted FSH response dose.
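The reported calibration intercept (43 IU) and slope (0.52) can be read as a line relating observed to predicted dose. The sketch below makes two assumptions that the text does not confirm: that the calibration curve is treated as a straight line over the dose range, and that the observed dose is the dependent variable (the text says the observed dose was plotted against the predicted dose). The function names are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares intercept and slope for y = a + b * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def expected_observed(predicted, intercept=43.0, slope=0.52):
    """Observed dose implied by the reported calibration line (assumed linear)."""
    return intercept + slope * predicted

def crossover_dose(intercept=43.0, slope=0.52):
    """Predicted dose at which prediction and observation coincide."""
    return intercept / (1.0 - slope)

# Under these assumptions a predicted dose of 150 IU corresponds to an
# observed dose of about 121 IU, i.e., an overestimation of about 29 IU.
gap_at_150 = 150.0 - expected_observed(150.0)

# Recovering intercept and slope from points lying exactly on the line.
xs = [80.0, 120.0, 160.0, 200.0]
ys = [expected_observed(x) for x in xs]
fitted_intercept, fitted_slope = fit_line(xs, ys)
```

With a slope below 1, the gap between predicted and observed dose widens as the predicted dose rises. Under the stated assumptions the two coincide at roughly 90 IU, and the model overestimates above that level.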
The difference between predicted and observed response dose could be due to a difference in the dosing scheme or to minor differences in the studied populations. Although the starting dose in the validation sample was similar to that in the development sample (75 IU/day), the first increase was made no earlier than after 14 days of treatment, whereas in the development sample the dose was increased after 7 days.

The predictive performance of the model was poor. Only 11% of the variance of FSH response dose could be explained by the regression model, which is much lower than the 34% explained variation in the development data. A reason for this difference may be that the present data included only women who did not ovulate with CC, whereas the development study also included women who did ovulate on CC but did not conceive. Although this difference was controlled for by including CC resistance in the model, the development data have a wider clinical scope with more variation to be explained.

Although FSH level was a predictor in the development sample, no association between FSH level and FSH response dose was found in the validation sample. Of the four predictors in the model (BMI, IGR, FSH level, and CC resistance), only BMI and IGR were found to be associated with the FSH response dose in the validation sample.

The development sample was less hyperandrogenemic than the validation sample, i.e., the validation population presented with higher LH, T, FAI, and leptin levels and lower SHBG levels. We therefore also checked whether these factors were associated with FSH response dose in the validation sample, using univariable and multivariable linear regression analysis. As in the development sample, neither FAI nor the LH, T, SHBG, or leptin levels emerged as a predictor in the validation sample. The endocrinologic differences between the two populations can therefore not explain the poor predictive performance of the model.
In the development sample, the starting dose was 75 IU urinary FSH in 76 women and 50 IU rFSH in 14 women. The lower dose of rFSH was used because, at the time the model was developed, it was thought that less rFSH was needed owing to its higher biopotency. Correction for this difference in starting dose did not affect the predictive performance of the model: the R² of the model remained 0.11, and the average prediction error (root mean squared error) was 32 instead of 35.

FIGURE 1
Calibration (spline) curve with confidence intervals for the regression line.

The aim of the chronic low-dose step-up regimen is to administer the lowest possible daily dose of exogenous gonadotropins to gradually surpass the individual FSH threshold and so achieve follicular maturation and subsequent ovulation. The concept behind this aim is that each woman has a given threshold requirement for FSH below which follicles do not develop (27). If we were able to predict individual FSH thresholds, this would increase the safety, efficiency, and convenience of low-dose regimens for gonadotropin induction of ovulation by determining the appropriate starting dose for a given patient. Time-consuming low-dose increases in dosage may be avoided by starting with a higher dose in women with a high FSH threshold. In addition, starting with an individualized dose in step-down regimens may reduce the chance of hyperresponse in women with a low FSH threshold. However, a model that predicts response doses that are higher than the real response doses, as we found in our validation sample, is not useful. Moreover, implementing this exact model could in theory lead to more cancelations, ovarian hyperstimulation syndrome, and multiple pregnancies, i.e., the opposite of what the model aimed for.

It is possible that the model performs better in a population that undergoes ovarian stimulation with a starting dose of 75 IU and a dose increase after 7 days. However, the state-of-the-art regimen uses a chronic low-dose step-up scheme with a dose increase no earlier than after 14 days (7, 22, 27). Furthermore, in highly responsive women with PCOS, the risk of multiple follicular development and multiple pregnancies can in theory be reduced when lower starting doses are used. Available data suggest that similar pregnancy rates can be obtained when starting with doses from 25 to 37.5 IU and using smaller incremental doses (26, 28).
We therefore challenge investigators to develop a new model in women with CC-resistant PCOS who are undergoing a chronic low-dose step-up protocol with a starting dose below 75 IU.

We conclude that the external validity of the model predicting the individual FSH response dose was inadequate in women with CC-resistant PCOS undergoing ovulation induction with rFSH in a low-dose step-up regimen.

REFERENCES

1. Hull MG. Epidemiology of infertility and polycystic ovarian disease: endocrinological and demographic studies. Gynecol Endocrinol 1987;1:235–45.
2. Knochenhauer ES, Key TJ, Kahsar-Miller M, Waggoner W, Boots LR, Azziz R. Prevalence of the polycystic ovary syndrome in unselected black and white women of the southeastern United States: a prospective study. J Clin Endocrinol Metab 1998;83:3078–82.
3. Diamanti-Kandarakis E, Kouli CR, Bergiele AT, et al. A survey of the polycystic ovary syndrome in the Greek island of Lesbos: hormonal and metabolic profile. J Clin Endocrinol Metab 1999;84:4006–11.
4. Asuncion M, Calvo RM, San Millan JL, Sancho J, Avila S, Escobar-Morreale HF. A prospective study of the prevalence of the polycystic ovary syndrome in unselected Caucasian women from Spain. J Clin Endocrinol Metab 2000;85:2434–8.
5. Imani B, Eijkemans MJ, te Velde ER, Habbema JD, Fauser BC. Predictors of patients remaining anovulatory during clomiphene citrate induction of ovulation in normogonadotropic oligoamenorrheic infertility. J Clin Endocrinol Metab 1998;83:2361–5.
6. Imani B, Eijkemans MJ, te Velde ER, Habbema JD, Fauser BC. A nomogram to predict the probability of live birth after clomiphene citrate induction of ovulation in normogonadotropic oligoamenorrheic infertility. Fertil Steril 2002;77:91–7.
7. Christin-Maitre S, Hugues JN. A comparative randomized multicentric study comparing the step-up versus step-down protocol in polycystic ovary syndrome. Hum Reprod 2003;18:1626–31.
8. Eijkemans MJ, Imani B, Mulders AG, Habbema JD, Fauser BC. High singleton live birth rate following classical ovulation induction in normogonadotrophic anovulatory infertility (WHO 2). Hum Reprod 2003;18:2357–62.
9. van Wely M, Yding AC, Bayram N, van der Veen F. Urofollitropin and ovulation induction. Treat Endocrinol 2005;4:155–65.
10. Homburg R. Management of infertility and prevention of ovarian hyperstimulation in women with polycystic ovary syndrome. Best Pract Res Clin Obstet Gynaecol 2004;18:773–88.
11. van Santbrink EJ, Eijkemans MJ, Laven JS, Fauser BC. Patient-tailored conventional ovulation induction algorithms in anovulatory infertility. Trends Endocrinol Metab 2005;16:381–9.
12. Fauser BC, Devroey P, Macklon NS. Multiple birth resulting from ovarian stimulation for subfertility treatment. Lancet 2005;365:1807–16.
13. Brown JB. Pituitary control of ovarian function: concepts derived from gonadotrophin therapy. Aust N Z J Obstet Gynaecol 1978;18:46–54.
14. Delvigne A, Rozenberg S. Epidemiology and prevention of ovarian hyperstimulation syndrome (OHSS): a review. Hum Reprod Update 2002;8:559–77.
15. Aboulghar MA, Mansour RT. Ovarian hyperstimulation syndrome: classifications and critical analysis of preventive measures. Hum Reprod Update 2003;9:275–89.
16. van Santbrink EJ, Donderwinkel PF, van Dessel TJ, Fauser BC. Gonadotrophin induction of ovulation using a step-down dose regimen: single-centre clinical experience in 82 patients. Hum Reprod 1995;10:1048–53.
17. van Santbrink EJ, Fauser BC. Urinary follicle-stimulating hormone for normogonadotropic clomiphene-resistant anovulatory infertility: prospective, randomized comparison between low dose step-up and step-down dose regimens. J Clin Endocrinol Metab 1997;82:3597–602.
18. Fauser BC, Van Heusden AM. Manipulation of human ovarian function: physiological concepts and clinical consequences. Endocr Rev 1997;18:71–106.
19. van Santbrink EJ, Hop WC, van Dessel TJ, de Jong FH, Fauser BC. Decremental follicle-stimulating hormone and dominant follicle development during the normal menstrual cycle. Fertil Steril 1995;64:37–43.
20. Imani B, Eijkemans MJ, Faessen GH, Bouchard P, Giudice LC, Fauser BC. Prediction of the individual follicle-stimulating hormone threshold for gonadotropin induction of ovulation in normogonadotropic anovulatory infertility: an approach to increase safety and efficiency. Fertil Steril 2002;77:83–90.
21. Justice AC, Covinsky KE, Berlin JA. Assessing the generalizability of prognostic information. Ann Intern Med 1999;130:515–24.
22. Bayram N, Van Wely M, Kaaijk EM, Bossuyt PM, van der Veen F. Using an electrocautery strategy or recombinant follicle stimulating hormone to induce ovulation in polycystic ovary syndrome: randomised controlled trial. BMJ 2004;328:192.
23. Rotterdam ESHRE/ASRM-Sponsored PCOS Consensus Workshop Group. Revised 2003 consensus on diagnostic criteria and long-term health risks related to polycystic ovary syndrome. Fertil Steril 2004;81:19–25.
24. Rotterdam ESHRE/ASRM-Sponsored PCOS Consensus Workshop Group. Revised 2003 consensus on diagnostic criteria and long-term health risks related to polycystic ovary syndrome (PCOS). Hum Reprod 2004;19:41–7.
25. WHO manual for the standardized investigation and diagnosis of the infertile couple. Cambridge: Cambridge University Press, 1993.
26. White DM, Polson DW, Kiddy D, Sagle P, Watson H, Gilling-Smith C, et al. Induction of ovulation with low-dose gonadotropins in polycystic ovary syndrome: an analysis of 109 pregnancies in 225 women. J Clin Endocrinol Metab 1996;81:3821–4.
27. Homburg R, Howles CM. Low-dose FSH therapy for anovulatory infertility associated with polycystic ovary syndrome: rationale, results, reflections and refinements. Hum Reprod Update 1999;5:493–9.
28. Balasch J, Fabregues F, Creus M, Casamitjana R, Puerto B, Vanrell JA. Recombinant human follicle-stimulating hormone for ovulation induction in polycystic ovary syndrome: a prospective, randomized trial of two starting doses in a chronic low-dose step-up protocol. J Assist Reprod Genet 2000;17:561–5.