Non-parametric Survival Analysis for Breast Cancer Using Non-medical Data


IOSR Journal of Humanities and Social Science (IOSR-JHSS)
Volume 21, Issue 5, Ver. 1 (May 2016), PP 02-34
e-ISSN: 2279-0837, p-ISSN: 2279-0845
www.iosrjournals.org
DOI: 10.9790/0837-2105010234

Non-parametric Survival Analysis for Breast Cancer Using Non-medical Data

Dr. Intisar Ali A. Khalil, et al. [1]
Faculty of Sciences, University of Tabuk, P.O. Box 471, Umluj College, Umluj Area, Saudi Arabia

Abstract: Breast cancer is a major health problem in many parts of the world, and it has the highest prevalence rate among women in Saudi Arabia. This study aims to review the validity and workability of semi- and non-parametric survival models when applied to non-medical data. Mortality data were collected with the sisterhood method, using a purpose-designed questionnaire. The non-medical variables age, residence, weight, family history, fertility and marital status were used to show differences in survival. A total of 32 female breast cancer cases were studied. The results indicate that premenopausal and fertile cases survived longer and were diagnosed more often, while obese cases and those living in Umluj survived less and were diagnosed more often. The median survival time was 8.64 months and the mean age at diagnosis was 45.59 years. The proportion surviving in the overall 10-year survival study decreased gradually over the time intervals, while the hazard rate increased slowly over the time intervals.

I. INTRODUCTION

At an individual level, a diagnosis of cancer is generally regarded as a human tragedy. At the level of society, cancer is one of the major chronic diseases and causes a notable amount of health administrative costs. Prognosis and possible cure from cancer are thus important measures of life span, which can be assessed by analyzing the survival of cancer patients. Different statistical approaches are used in the literature to analyze cancer survival data. The results of survival analysis for cancer patients have been widely presented and reported for different human sub-populations of the globe (Woolson, 1981 [2]; Kardaun, 1983 [3]; Beadle et al., 1984; Sedmak et al., 1989 [4]). McCarty (1974) noted that, in adopting any suitable statistical technique for analyzing survival data, one assumes that the statistical model embodies the evaluation of some natural process and is a useful approximation of that real process. Several approaches for analyzing survival data have been proposed in the literature by Leung et al. (1997) [5] and Little and Rubin (2002).

Investigators are often inclined to use conventional statistical methodology for the analysis of survival data. Logistic regression analysis could be applied to quantify the importance of certain covariates in classifying individuals into two groups: those who did or did not experience the event during the period of observation. This approach can result in a considerable loss of information because differences in the timing of event occurrence are not considered. Alternatively, one could use ordinary least squares regression analysis to identify covariates that influence survival times. The major drawback here lies in the fact that survival data are often censored, i.e., they contain observations for which one does not know when the event occurred. This is either because the corresponding individual is lost from the data set during the study period, or because the study ended before all individuals had experienced the event. Both causes of censoring occur commonly in epidemiological investigations. With conventional statistical methodology, censored observations would either have to be deleted, or one would have to make certain ad hoc assumptions. By contrast, the likelihood-based parameter estimation methods used in survival analysis can effectively extract relevant information from both censored and uncensored observations, thereby producing reliable parameter estimates (Allison, 1995 [6]; Le, 1997 [7]). Furthermore, survival analysis can readily accommodate time-dependent covariates, i.e., independent variables whose values change during the course of the study.

[1] Dr. Mara Zakaria Adam Hashim, Department of Mathematics, Umluj College, University of Tabuk.
[2] Woolson, R.F. (1981). Rank tests and a one-sample log rank test for comparing observed survival data to a standard population. Biometrics, 37: 687-696.
[3] Kardaun (1983). Statistical analysis of male larynx cancer patients: a case study. Statistica Neerlandica, 37: 13-16.
[4] Sedmak, D.D., Meineke, T.A., Knechtges, D.S. and Anderson, J. (1989). Prognostic significance of cytokeratin-positive breast cancer metastases. Modern Pathology, 2: 519-5.
[5] Leung, K.M., Elashoff, R.M. and Afifi, A.A. (1997). Censoring issues in survival analysis. Annual Review of Public Health, 18: 83-104.

II. RESEARCH PROBLEM AND OBJECTIVES

This study attempted to apply the non-parametric approach to breast cancer survival data in order to show its applicability and workability with non-medical data, and to present survival analysis results for breast cancer data. The problem of analyzing censored data is usually referred to as survival analysis, which models the time to failure or to an event. Survival analysis is unlike linear regression, and also unlike logistic regression, which has a binary outcome, because survival analysis analyses the time to an event.
However, survival analysis in oncology is complicated because not all patients can be observed for the same period of time (Kosko, 1992) [8]. In survival analysis terminology, patients who are observed until they reach the end point (e.g. death) are called uncensored cases, while those who survive beyond the end of the study or who are lost to follow-up at some point are called censored cases. Traditional methods of analysis rely on linear statistical models; methods such as the life table, the Kaplan-Meier method and regression models such as the Cox proportional hazards model are typically used to model and predict survival data because they can handle censored observations. Survival data can be represented statistically by the probability density function, the survival function or the hazard rate function. Survival statistics describe a cohort of patients with certain types and stages of cancer and are measured following treatment. Statistics alone may not be sufficient to predict the future outcome of a particular patient, as no two patients are exactly alike (Kosko, 1992). The main objective of this study is to review the methodology and workability of semi- and non-parametric survival models for non-medical data, and their application to breast cancer data in Saudi Arabia.

III. MATERIALS AND METHODS

Population and sampling: This study was planned to investigate the survival of breast cancer patients among Saudi Arabian women. The Umluj area was selected as a case study for this research. The only health information agency in this area is Al-Hawra Hospital, one of the government hospitals established in Tabuk state in the north-west region of Saudi Arabia. The hospital has five health units in addition to outpatient clinics and an emergency department. There is no oncology or radiotherapy unit, so the hospital registry system shows that all patients suspected of having cancer were referred to other hospitals in the area (Alwagh, Yanbu, and Tabouk).
Therefore, the study uses primary data to briefly review the methodological features of the semi- and non-parametric survival models. The data were derived from females diagnosed with breast cancer, using the sisterhood method. A questionnaire was designed to collect information about breast cancer patients in the family. A total of 1 questionnaires were forwarded to students at Umluj College to gather information about relatives diagnosed with breast cancer. The response rate was very low: only 36 questionnaires were returned, and 4 of them were incomplete and were excluded. The researchers therefore decided, considering that the approach is non-parametric, to conduct the study using only the 32 complete cases; see Annex (A), Table 1. The students enrolled at Umluj College came from Umluj and from different villages around it. The researcher used the students as population units because the population is closed and very sensitive about such information.

[6] Allison, P.D. (1984). Event history analysis: regression for longitudinal event data. Beverly Hills, CA: Sage Publications.
[7] Lee et al. (1992). Statistical methods for survival data analysis. 2nd edition. Wiley, New York. 482 pp. (comprehensive, not easy reading).
[8] Kosko, B. (1992). Neural networks and fuzzy systems: a dynamical systems approach to machine intelligence. 1st ed. Prentice-Hall International Editions.

The study models: semi- and non-parametric models

The probability of surviving beyond t is

$$S(t) = \Pr(T > t). \qquad (1.1)$$

Because T cannot be negative, $S(0) = 1$. $S(t)$ can be estimated by the Kaplan-Meier method (Kaplan and Meier, 1958) [9]:

$$\hat{S}(t) = \prod_{j:\, t_j \le t} \frac{N_j - d_j}{N_j}, \qquad (1.2)$$

where $N_j$ is the number of cases at risk of an event at time $t_j$ and $d_j$ is the number of events at time $t_j$. The instantaneous risk that an event occurs in the small interval between $t$ and $t + \Delta t$ is

$$h(t) = \lim_{\Delta t \to 0} \frac{\Pr(t \le T < t + \Delta t \mid T \ge t)}{\Delta t}. \qquad (1.3)$$

A hazard is a rate, not a probability.

One Sample Kaplan-Meier

If the data were not censored, the obvious estimate would be the empirical survival function

$$\hat{S}(t) = \frac{1}{n} \sum_{i=1}^{n} I\{t_i > t\}, \qquad (1.4)$$

where $I$ is the indicator function that takes the value 1 if the condition in braces is true and 0 otherwise. The estimator is simply the proportion alive at t.

Estimation with Censoring

Kaplan and Meier (1958) [10] extended the estimate to censored data. Let

$$t_{(1)} < t_{(2)} < \dots < t_{(m)} \qquad (1.5)$$

denote the distinct ordered times of death (not counting censoring times). Let $d_i$ be the number of deaths at $t_{(i)}$, and let $n_i$ be the number alive just before $t_{(i)}$. This is the number exposed to risk at time $t_{(i)}$. Then the Kaplan-Meier or product-limit estimate of the survivor function is

$$\hat{S}(t) = \prod_{i:\, t_{(i)} \le t} \left(1 - \frac{d_i}{n_i}\right). \qquad (1.6)$$

A heuristic justification of the estimate is as follows. To survive to time t you must first survive to $t_{(1)}$. You must then survive from $t_{(1)}$ to $t_{(2)}$ given that you have already survived to $t_{(1)}$, and so on. Because there are no deaths between $t_{(i-1)}$ and $t_{(i)}$, we take the probability of dying between these times to be zero. The conditional probability of dying at $t_{(i)}$, given that the subject was alive just before, can be estimated by $d_i/n_i$. The conditional probability of surviving time $t_{(i)}$ is the complement $1 - d_i/n_i$. The overall unconditional probability of surviving to t is obtained by multiplying the conditional probabilities for all relevant times up to t.
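The product-limit calculation described above can be illustrated with a minimal sketch. It is not the authors' implementation (they used SPSS); the function name `kaplan_meier` and the example follow-up times are hypothetical, and a case censored at a death time is assumed to be still at risk at that time.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t_i, S_hat(t_i))] at each distinct death time.

    times  : follow-up time of each subject
    events : 1 if a death was observed at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    survival = 1.0
    curve = []
    for t in sorted(deaths):
        # n_i: subjects still under observation just before t_(i)
        n_at_risk = sum(1 for u in times if u >= t)
        # multiply by the conditional survival probability 1 - d_i / n_i
        survival *= 1.0 - deaths[t] / n_at_risk
        curve.append((t, survival))
    return curve


# Hypothetical follow-up times in months (not the study data); 0 = censored.
example_times = [2, 3, 3, 5, 8, 8, 9, 12]
example_events = [1, 1, 0, 1, 1, 1, 0, 1]
km_curve = kaplan_meier(example_times, example_events)
```

Each entry of `km_curve` pairs a distinct death time with the estimated probability of surviving beyond it; the censored cases change only the risk-set sizes, never the number of steps.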
The Kaplan-Meier estimate is a step function with discontinuities or jumps at the observed death times. If there is no censoring, the K-M estimate coincides with the empirical survival function. If the last observation happens to be a censored case, the estimate is undefined beyond the last death.

Non-parametric Maximum Likelihood (NPML)

The K-M estimator has an interpretation as a non-parametric maximum likelihood estimator (NPML). Let $c_i$ denote the number of cases censored between $t_{(i)}$ and $t_{(i+1)}$, and let $d_i$ be the number of cases that die at $t_{(i)}$. Then the likelihood function takes the form

$$L = \prod_{i=1}^{m} \left[S(t_{(i-1)}) - S(t_{(i)})\right]^{d_i} S(t_{(i)})^{c_i}, \qquad (1.7)$$

where the product is over the m distinct times of death, and we take $t_{(0)} = 0$ with $S(t_{(0)}) = 1$. The problem now is to estimate m parameters representing the values of the survival function at the death times $t_{(1)}, t_{(2)}, \dots, t_{(m)}$. Write $\pi_i = S(t_{(i)})/S(t_{(i-1)})$ for the conditional probability of surviving from $t_{(i-1)}$ to $t_{(i)}$, so that $S(t_{(i)}) = \pi_1 \pi_2 \cdots \pi_i$. Then the likelihood becomes

$$L = \prod_{i=1}^{m} (1 - \pi_i)^{d_i}\, \pi_i^{c_i}\, (\pi_1 \pi_2 \cdots \pi_{i-1})^{d_i + c_i}. \qquad (1.8)$$

Note that all cases which die at $t_{(i)}$ or are censored between $t_{(i)}$ and $t_{(i+1)}$ contribute a term $\pi_j$ to each of the previous times of death from $t_{(1)}$ to $t_{(i-1)}$. In addition, those who die at $t_{(i)}$ contribute $1 - \pi_i$, and the censored cases contribute an additional $\pi_i$. Let $n_i = \sum_{j \ge i} (d_j + c_j)$ denote the total number exposed to risk at $t_{(i)}$. We can then collect terms on each $\pi_i$ and write the likelihood as

$$L = \prod_{i=1}^{m} (1 - \pi_i)^{d_i}\, \pi_i^{\,n_i - d_i}, \qquad (1.9)$$

a binomial likelihood. The m.l.e. of $\pi_i$ is then

$$\hat{\pi}_i = \frac{n_i - d_i}{n_i} = 1 - \frac{d_i}{n_i}. \qquad (1.10)$$

The K-M estimator follows from multiplying these conditional probabilities.

Expectation of Life (life tables)

If $\hat{S}(t_{(m)}) = 0$, then one can estimate $\mu = E(T)$ as the integral of the K-M estimate:

$$\hat{\mu} = \int_0^{\infty} \hat{S}(t)\, dt = \sum_{i=1}^{m} \left(t_{(i)} - t_{(i-1)}\right) \hat{S}(t_{(i-1)}). \qquad (1.11)$$

Can you figure out the variance of $\hat{\mu}$?

Regression: Cox's Model

Let us consider the more general problem where the researcher has a vector x of covariates. The k-sample problem can be viewed as the special case where the x are dummy variables denoting group membership. Recall the basic model

$$\lambda(t, x) = \lambda_0(t)\, e^{x'\beta}, \qquad (1.12)$$

and consider estimation of $\beta$ without making any assumptions about the baseline hazard $\lambda_0(t)$. Cox (1972) [11] proposed fitting the model by maximizing a special likelihood. Let

$$t_{(1)} < t_{(2)} < \dots < t_{(m)} \qquad (1.13)$$

denote the observed distinct times of death, as before, and consider what happens at $t_{(i)}$. Let $R_i$ denote the risk set at $t_{(i)}$, defined as the set of indices of the subjects that are alive just before $t_{(i)}$. Thus, $R_0 = \{1, 2, \dots, n\}$. Suppose first that there are no ties in the observation times, so that one and only one subject, say subject $j(i)$, failed at $t_{(i)}$. The conditional probability that this particular subject fails at $t_{(i)}$, given the risk set $R_i$ and the fact that exactly one subject fails at that time, is

$$L_i = \frac{\lambda(t_{(i)}, x_{j(i)})}{\sum_{j \in R_i} \lambda(t_{(i)}, x_j)} = \frac{e^{x_{j(i)}'\beta}}{\sum_{j \in R_i} e^{x_j'\beta}}, \qquad (1.14)$$

and does not depend on the baseline hazard $\lambda_0(t)$. Cox proposed multiplying these probabilities together over all distinct failure times and treating the resulting product

$$L = \prod_{i=1}^{m} \frac{e^{x_{j(i)}'\beta}}{\sum_{j \in R_i} e^{x_j'\beta}} \qquad (1.15)$$

as if it were an ordinary likelihood. Cox (1975) [12] calls this a conditional likelihood because it is a product of conditional probabilities. Kalbfleisch and Prentice (1973) [13] considered the case where the covariates are fixed over time and showed that L is the marginal likelihood of the ranks of the observations, obtained by considering just the order in which people die and not the actual times at which they die.

Tests of Hypotheses: As usual, there are three approaches to testing hypotheses about $\beta$:

- Likelihood ratio test: given two nested models, the researcher treats twice the difference in partial log-likelihoods as a $\chi^2$ statistic with degrees of freedom equal to the difference in the number of parameters.

- Wald test: this uses the fact that, approximately in large samples, $\hat{\beta}$ has a multivariate normal distribution with mean $\beta$ and variance-covariance matrix $\mathrm{var}(\hat{\beta}) = I^{-1}(\beta)$. Thus, under $H_0: \beta = \beta_0$, the quadratic form

$$(\hat{\beta} - \beta_0)'\, \mathrm{var}^{-1}(\hat{\beta})\, (\hat{\beta} - \beta_0) \sim \chi^2_p, \qquad (1.16)$$

where p is the dimension of $\beta$. This test is often used for a subset of $\beta$.

- Score test: this uses the fact that, approximately in large samples, the score $u(\beta)$ has a multivariate normal distribution with mean 0 and variance-covariance matrix equal to the information matrix. Thus, under $H_0: \beta = \beta_0$, the quadratic form

$$u(\beta_0)'\, I^{-1}(\beta_0)\, u(\beta_0) \sim \chi^2_p. \qquad (1.17)$$

Note that this test does not require calculating the m.l.e. $\hat{\beta}$. One reason for bringing up the score test is that, in the k-sample case, the score test of $H_0: \beta = 0$ based on Cox's model happens to be the same as the Mantel-Haenszel log-rank test. All three tests are asymptotically equivalent. The quality of the normal approximations depends on the sample size, the distribution of cases over the covariate space, and the extent of censoring.

[9] Kaplan, E.L. and Meier, P. (1958). Nonparametric estimation from incomplete observations. Journal of the American Statistical Association, 53: 457-481.
[10] Kaplan and Meier (1958), as in [9].
[11] Cox, D.R. (1972). Regression models and life tables. Journal of the Royal Statistical Society, Series B, 34: 187-220.

IV. RESULTS

This section reviews the results of applying the semi- and non-parametric survival models to the data.

3.1. Life table estimates of patient survival: A life table for the 32 patients, constructed for annual intervals and using the actuarial assumption, is shown in Table 6.
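How such a table is built under the actuarial assumption can be sketched as follows. This is only an illustration, not the SPSS procedure used in the study: the function name `actuarial_life_table` and the interval counts are hypothetical, `l`, `d` and `w` stand for the number alive at the start of an interval, the deaths and the censored withdrawals, and each withdrawal is assumed to be at risk for half of its interval.

```python
def actuarial_life_table(entering, deaths, withdrawn):
    """Build life-table rows for consecutive intervals.

    entering  : number alive at the start of the first interval
    deaths    : deaths d_i in each interval
    withdrawn : censored withdrawals w_i in each interval
    """
    rows = []
    alive = entering
    cumulative = 1.0
    for d, w in zip(deaths, withdrawn):
        effective = alive - w / 2.0  # actuarial assumption: half exposure
        if effective <= 0:
            break  # nobody left at risk, the table ends here
        p = 1.0 - d / effective  # interval-specific survival rate
        cumulative *= p  # cumulative survival from diagnosis
        rows.append({"l": alive, "d": d, "w": w,
                     "l_eff": effective, "p": p, "cum_p": cumulative})
        alive -= d + w
    return rows


# Hypothetical counts for three annual intervals (not the study data).
table = actuarial_life_table(20, deaths=[2, 3, 1], withdrawn=[2, 0, 4])
```

The `cum_p` value of the last row is the product of the interval-specific rates, which is how a cumulative survival rate is read off a life table.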
The first column in Table (6) gives the index (i) for the interval, followed by the number of patients alive at the start of the interval ($l_i$), the number of deaths (irrespective of cause) during the interval ($d_i$), the number of censorings during the interval ($w_i$), and the effective number of patients at risk during the interval ($l'_i$). The effective number of patients at risk is given by $l'_i = l_i - w_i/2$. The notation w is used for the number of censored observations since these are sometimes referred to as withdrawals or patients withdrawn alive. The column labeled $p_i$ contains the estimated conditional probabilities of surviving each interval among the patients alive at the start of the interval ($p_i = 1 - d_i/l'_i$). When estimated from a life table, the conditional survival rates are known as interval-specific survival rates. When the intervals are annual, as in Table 6, they are referred to as annual survival rates. The column labeled ${}_1p_i$ contains estimates of the cumulative survival rates from diagnosis to the end of the i-th interval, calculated as the product of the interval-specific survival rates for each interval from 1 to i. The "cumulative" prefix is often dropped from the term cumulative survival rate, and we will sometimes follow this practice in the text. The remaining two columns in the life table contain the cumulative expected (${}_1p^*_i$) and relative (${}_1r_i$) survival rates, which will be described later in this section.

Life tables are a descriptive procedure for examining the distribution of time-to-event variables. The researcher can also compare the distribution by levels of a factor variable. The basic idea of life tables is to subdivide the period of observation into smaller time intervals; the probabilities for each of the intervals are then estimated. IBM SPSS version 19.0 [14] was used to apply the life table method to the data of Table (1) in Annex (A1), as follows.

The approach is to divide the period of observation into a series of time intervals and estimate survival for each interval. The intervals need not be of equal length, although they frequently are. Cancer registries often record survival time only in completed years, rather than months or days, so it is common to construct life tables using annual intervals. If survival time is known only in completed years, the exact number of person-months at risk cannot be calculated. When constructing life tables from such data, it is assumed that censoring occurs uniformly throughout the interval, so each of the censored patients is assumed to be at risk for half of the interval (an assumption known as the actuarial assumption).

The median survival time from the life table in this study is estimated as 8.64 months, which falls in the time interval from 8 up to 9 months. The total number of females entering this interval is 12, and no patients were withdrawn from this interval. Therefore, the number exposed to the risk of death in this interval is 12, with 9 terminal events, so the proportion terminating is .75, the proportion surviving through this interval is 1 − .75 = .25, and the hazard rate is .6. The proportion surviving in the overall 10-year survival study decreased gradually over the time intervals, and the hazard rate increased slowly over the time intervals.

3.2. Estimating Survival Using the Kaplan-Meier Model

The 10-year survival analysis (1425-1435): The simplest measure of patient survival is the proportion of patients who survive at least t years following diagnosis. This is known as the observed survival rate. For example, the 1-year observed survival rate is estimated as the proportion of patients who are alive one year subsequent to diagnosis. Among the patients who survived the first year, we may be interested in the proportion who survive a further year.

[12] Cox, D.R. (1975). Partial likelihood. Biometrika, 62(2): 269-276.
[13] Kalbfleisch, J.D. and Prentice, R.L. (1973). Marginal likelihoods based on Cox's regression and life model. Biometrika, 60: 267-278.
[14] IBM SPSS version 19.0: IBM software, Statistical Package for the Social Sciences.

This estimate is known as a conditional survival rate, since it is conditional on surviving one year. Conditional survival rates are useful for showing how the mortality due to the disease varies. Survival rates calculated from diagnosis are known as cumulative survival rates, although the "cumulative" is often omitted.

3.3. Survival Analysis Using Cox's Regression Model

The likelihood-ratio, Wald, and score chi-square statistics are asymptotically equivalent tests of the omnibus null hypothesis that all of the β's are zero. In this instance, the test statistics are in close agreement, and the hypothesis is soundly rejected. The initial estimation based on the log likelihood is 114.568. The Cox regression results show that age, place of residence, marital status and weight have negative regression coefficients, which indicates that they reduce the hazard of breast cancer by 4.9%, 45.1%, 16.4% and 61.3% respectively. Family history, fertility and menopausal status have positive regression coefficients, which indicates that greater values of these variables increase the hazard of breast cancer by 36.6%, 13.4% and 155.4% respectively.

Table (3): Results from the Cox regression model

                                                            95% CI for Exp(B)
Variable     B        SE      Wald    df   Sig.   Exp(B)    Lower    Upper
Age          -.49     .6      .648    1    .41    .953      .846     1.7
Resident     -.451    .584    .598    1    .439   .637      .3       .
Marital      -1.639   97.54   .17     1    .897   .         .        3.5E77
Fertility    13.48    97.55   .18     1    .89    56711     .        6.1E88
Weight       -.613    .543    1.76    1    .59    .54       .187     1.569
History      .366     .559    .47     1    .513   1.441     .48      4.31
Menopause    1.554    1.46    1.134   1    .87    4.73      .71      8.69

The overall goodness of fit of this model was calculated as follows:

$$R_M^2 = 1 - \exp\left[\frac{9.651 - 9.46}{3}\right] = .194$$

This result shows that the above model is an adequate model to fit the breast cancer survival data, because it has a very small value of $R^2$. The relative hazard of this model is $h(t)/h_0(t) = 1.9114$, and the log relative hazard is $\ln[h(t)/h_0(t)] = -..547$.

3.3.2. Cox Regression Diagnostics

In order to check the Cox regression diagnostics, the researcher uses all the data as an interval-censoring study.
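As a reading aid for Table (3), the sketch below shows the usual relation between a Cox coefficient B, its standard error, and the Exp(B), confidence-interval and Wald columns. It is not the authors' calculation: the function name `hazard_ratio_summary` is hypothetical, the 1.96 multiplier assumes a normal-approximation 95% interval, and the History row (B = .366, SE = .559) is used as printed. Under this standard reading, the percentage change in the hazard per unit increase of a covariate is 100(e^B − 1), which differs from 100·B unless B is close to zero.

```python
import math


def hazard_ratio_summary(b, se, z=1.96):
    """Summarise one Cox coefficient on the hazard-ratio scale."""
    hazard_ratio = math.exp(b)
    return {
        "hazard_ratio": hazard_ratio,                    # Exp(B)
        "percent_change": 100.0 * (hazard_ratio - 1.0),  # change in hazard per unit
        "ci_lower": math.exp(b - z * se),                # lower 95% limit for Exp(B)
        "ci_upper": math.exp(b + z * se),                # upper 95% limit for Exp(B)
        "wald": (b / se) ** 2,                           # 1-df Wald chi-square
    }


# History row of Table (3), as printed: B = .366, SE = .559.
history = hazard_ratio_summary(0.366, 0.559)
```

For this row the sketch gives a hazard ratio of about 1.44, that is, a hazard roughly 44% higher, with a 95% interval from about 0.48 to 4.31 that includes 1.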
First, the proportional hazards assumption is checked using Schoenfeld (1982) [15] residuals; the estimation is then repeated by fitting a stratified model, with year at diagnosis as the strata variable. Influential observations are detected using dfbeta, and non-linearity is detected using martingale residuals. As for a linear or generalized linear model, it is important to determine whether a fitted Cox regression model adequately represents the data.

3.3.3. Checking for Non-Proportional Hazards: A departure from proportional hazards occurs when regression coefficients are dependent on time, that is, when time interacts with one or more covariates. Tests and graphical diagnostics for interactions between covariates and time may be based on the scaled Schoenfeld residuals from the Cox model. The formula and rationale for the scaled Schoenfeld residuals are complicated; the details are available in Hosmer and Lemeshow (1999) [16] or Therneau and Grambsch (2000) [17].

[15] Schoenfeld, D. (1982). Partial residuals for the proportional hazards model. Biometrika, 69: 239-241.
[16] Hosmer, D.W. and Lemeshow, S. (1999). Applied survival analysis: regression modeling of time to event data. New York: John Wiley & Sons.
[17] Therneau, T.M. and Grambsch, P.M. (2000). Modeling survival data: extending the Cox model. Springer.

The scaled Schoenfeld residuals comprise a matrix, with one row for each record in the data set to which the model was fit and one column for each covariate. Plotting scaled Schoenfeld residuals against time, or a suitable transformation of time, reveals un-modeled interactions between covariates and time. One choice is to use the Kaplan-Meier estimate of the survival function to transform time. A systematic tendency of the scaled Schoenfeld residuals to rise or fall more or less linearly with (transformed) time suggests entering a linear-by-linear interaction (i.e., the simple product) between the covariate and time into the model. A test for non-proportional hazards can be based on the estimated correlation between the scaled Schoenfeld residuals and (transformed) time. This test can be performed on a per-covariate basis and also cumulated across covariates.

In this study, the initial estimate of the Cox model for the entire data is as follows:

Table (4): Cox regression results for entire data

Covariate    β̂        SE       T        P>|z|   Lower    Upper
Age          -.49     .6       .648     .41     .846     1.7
Resident     -.451    .584     .598     .439    .3       .
Marital      -1.639   97.6     .17      .897    .        3.5
Fertility    13.48    97.6     .18      .89     .        6.169
Weight       -.613    .543     1.76     .59     .187     1.569
History      .366     .559     .47      .513    .48      4.31
Menopause    1.554    1.46     1.134    .87     .71      8.69

It is conceivable that a variable with a non-significant coefficient in the initial model nevertheless interacts significantly with time. Starting with the original model, we present in the following table a test for non-proportional hazards.

Table (5): Tests for non-proportional hazards in this model

Covariate    β̂        SE       T        P>|z|   Lower    Upper
Age          -.79     .84      .878     .349    .784     1.9
Resident     -.444    .614     .53      .47     .193     .137
Marital      -1.185   1.65     .877     .349    .6       3.649
Weight       -.91     .64      .3       .881    .79      .986
History      .59      .56      .894     .344    .567     5.84
Menopause    .751     1.846    .165     .684    .57      78.934

Here ρ̂ is the estimated correlation between the scaled Schoenfeld residuals and transformed time.
Under the null hypothesis of proportional hazards, each test statistic is distributed as χ² with the indicated degrees of freedom. Thus, the tests for all covariates in the original model and in the new test were statistically not significant, as is the global test for non-proportional hazards.
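As a rough illustration of the residual-versus-time check described above, the sketch below computes unscaled Schoenfeld residuals for a single covariate and correlates them with event time. It is a simplified stand-in and not the scaled-residual chi-square test reported in this study: the data, the coefficient value `beta`, and the function names are all hypothetical, tied event times are assumed absent, and raw event time is used in place of the Kaplan-Meier transformation.

```python
import math

def schoenfeld_residuals(times, events, x, beta):
    """Unscaled Schoenfeld residuals for one covariate (assumes no tied event times).

    For each observed event, the residual is the subject's covariate value minus
    the risk-set average of the covariate, weighted by exp(beta * x).
    """
    residuals = []
    for i in range(len(times)):
        if events[i] != 1:
            continue  # censored observations contribute no residual
        risk_set = [j for j in range(len(times)) if times[j] >= times[i]]
        weights = [math.exp(beta * x[j]) for j in risk_set]
        weighted_mean = sum(w * x[j] for w, j in zip(weights, risk_set)) / sum(weights)
        residuals.append((times[i], x[i] - weighted_mean))
    return sorted(residuals)

def pearson_correlation(a, b):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)

# Hypothetical data: follow-up times, event indicators (1 = event), one binary covariate.
times = [1, 2, 3, 4, 5, 6]
events = [1, 1, 0, 1, 1, 1]
x = [1, 0, 1, 0, 1, 0]

res = schoenfeld_residuals(times, events, x, beta=0.0)  # beta is an assumed value
event_times = [t for t, _ in res]
values = [r for _, r in res]
rho = pearson_correlation(event_times, values)
print("residuals:", values)
print("correlation with time:", rho)
```

A correlation near zero is what proportional hazards would lead one to expect; a clear trend in the residuals over time would point toward a covariate-by-time interaction.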

Figure (1): Plots of scaled Schoenfeld residuals against transformed time for the covariates Age, residence, marital status and Fertility.

Figure (1) shows plots of scaled Schoenfeld residuals vs. time for the covariates age, residence, marital status and fertility in the 1st study. The line on each plot is a smoothing spline (a method of nonparametric regression); the broken lines give a point-wise 95-percent confidence envelope around this fit. The tendency for the effect of all variables is to rise with time. The effect of age on the hazard is initially positive, but this effect decreases with time and eventually becomes negative (by 6 months). The effect of residence is initially negative, but eventually becomes positive (by 6 months). The effect of marital status is initially negative, but eventually becomes positive (by 7 months). The effect of fertility is initially negative, but this effect increases with time and becomes positive (by 8 months). The re-specified model shows evidence of non-proportional hazards; the global test and p-value =.. 39.7

3.3.4. Fitting by Strata

An alternative to incorporating interactions with time is to divide the data into strata based on the values of one or more covariates. Each stratum may have a different baseline hazard function, but the regression coefficients in the Cox model are assumed to be constant across strata. An advantage of this approach is that we do not have to assume a particular form of interaction between the stratifying covariates and time. There are a couple of disadvantages, however:

The stratifying covariates disappear from the linear predictor into the baseline hazard functions. Stratification is therefore most attractive when we are not really interested in the effects of the stratifying covariates, but wish simply to control for them.
When the stratifying covariates take on many different (combinations of) values, stratification, which divides the data into groups, is not practical. We can, however, recode a stratifying variable into a small number of relatively homogeneous categories.

In this study we divided age into two categories: those 54 years old or less, and those older than 54 years. Race and residence are statistically not significant with age, but stage at diagnosis and parity are statistically significant with age at diagnosis. The fit by strata, with the covariates stratified by age, is as follows.

Table (6): Fitting the stratified Cox model to the data

Covariate    B        SE       Wald     df   Sig.    Exp(B)      95% CI for Exp(B)
                                                                 Lower    Upper
Age          -.49     .6       .663     1    .415    .95         .846     1.71
Resident     -.46     .585     .64      1    .43     .63         .        1.983
Marital      -1.99    93.17    .17      1    .895    .           .        8.476E7
Fertility    1.898    93.13    .19      1    .89     399571.4    .        7.481E8
Weight       -.65     .543     1.41     1    .65     .546        .188     1.583
History      .378     .563     .451     1    .5      1.459       .484     4.397

The initial log likelihood for the estimation in this model is 17.17. The stratified model includes six covariates that affect the hazard of breast cancer significantly. In this model, age, residence, marital status and weight have negative regression coefficients, which indicates that they reduce the hazard of breast cancer by 4.9%, 46.%, 13% and 6.3% respectively, when the other covariates in the model are held constant. Fertility and family history are estimated to increase the hazard of breast cancer by 18.9% and 37.8% respectively. The new log relative hazard is (-.139).
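To make the idea of stratum-specific baseline hazards with shared coefficients concrete, the following minimal sketch evaluates a Cox log partial likelihood in which each stratum has its own risk sets but a common coefficient. The records, the stratum labels, and the function names are hypothetical and chosen only for illustration; a single covariate and no tied event times are assumed.

```python
import math
from collections import defaultdict

def log_partial_likelihood(records, beta):
    """Cox log partial likelihood for one covariate (assumes no tied event times).

    records: list of (time, event, x) tuples, with event = 1 for an observed event.
    """
    total = 0.0
    for time_i, event_i, x_i in records:
        if event_i != 1:
            continue  # censored subjects appear only in the risk sets of others
        denom = sum(math.exp(beta * x_j)
                    for time_j, _, x_j in records if time_j >= time_i)
        total += beta * x_i - math.log(denom)
    return total

def stratified_log_partial_likelihood(records, strata, beta):
    """Sum the per-stratum log partial likelihoods; beta is shared across strata."""
    groups = defaultdict(list)
    for record, stratum in zip(records, strata):
        groups[stratum].append(record)
    return sum(log_partial_likelihood(group, beta) for group in groups.values())

# Hypothetical records: (time, event, covariate), split into two age strata.
records = [(1, 1, 0.0), (4, 1, 1.0), (2, 1, 1.0), (3, 0, 0.0)]
strata = ["<=54", "<=54", ">54", ">54"]

pooled = log_partial_likelihood(records, beta=0.0)
stratified = stratified_log_partial_likelihood(records, strata, beta=0.0)
print("pooled:", pooled)
print("stratified:", stratified)
```

Because risk sets are formed within strata, the stratifying variable never enters the linear predictor, which is why its own effect cannot be estimated from a stratified fit.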

3.3.5. Detecting Influential Observations

As in linear and generalized linear models, we don't want the results in Cox regression to depend unduly on one or a small number of observations. Approximations to changes in the Cox regression coefficients attendant on deleting individual observations (dfbeta), and these changes standardized by coefficient standard errors (dfbetas), can be obtained for the Cox model.

Figure (2): dfbeta for Age, Residence, Marital and Fertility.

Figure (2) shows index plots of dfbeta for the four covariates age, residence, marital status and fertility in the stratified Cox model. All of the dfbeta are small relative to the sizes of the corresponding regression coefficients.

3.3.6. Detecting Nonlinearity

Another kind of Cox-model residuals, called martingale residuals, is useful for detecting nonlinearity in Cox regression. Plotting residuals against covariates, in a manner analogous to plotting residuals against covariates from a linear model, can reveal nonlinearity in the partial relationship between the log hazard and the covariates. The martingale residuals shown in Fig (3) are slightly skewed. This might be attributed to the single failure outcome feature of the Cox model. Presently, the estimated mode, median, and mean martingales are -.85,, and -.15, respectively. The estimated measure of skewness was approximately 3. We see an indication of a lack of fit of the model to individual observations.

Figure (3): The martingale residuals for the model (martingale residual plotted against time).
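For readers who want to see how martingale residuals arise, the sketch below computes them for a single covariate as the event indicator minus the expected number of events, using a Breslow-type estimate of the baseline cumulative hazard. The data, the coefficient `beta`, and the function name are hypothetical; this is an illustration under those assumptions and not the computation used to produce Figure (3).

```python
import math

def martingale_residuals(times, events, x, beta):
    """Martingale residuals: event indicator minus the expected number of events.

    The baseline cumulative hazard is estimated with Breslow-type increments,
    one per observed event, each equal to 1 / (sum of risk scores at risk).
    """
    n = len(times)
    risk_score = [math.exp(beta * xi) for xi in x]
    increments = []
    for k in range(n):
        if events[k] == 1:
            denom = sum(risk_score[j] for j in range(n) if times[j] >= times[k])
            increments.append((times[k], 1.0 / denom))
    residuals = []
    for i in range(n):
        cum_hazard = sum(inc for t, inc in increments if t <= times[i])
        residuals.append(events[i] - cum_hazard * risk_score[i])
    return residuals

# Hypothetical data and coefficient, for illustration only.
times = [2, 3, 5, 7, 8, 11]
events = [1, 0, 1, 1, 0, 1]
x = [0.5, 1.2, -0.3, 0.8, 0.0, -1.0]

mart = martingale_residuals(times, events, x, beta=0.4)
print("residuals:", mart)
print("sum:", sum(mart))
```

With this construction the residuals sum to zero and none can exceed 1, while there is no lower bound, which is one reason their distribution tends to be skewed.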

V. DISCUSSION

The median survival time from the life table in this study is estimated as 8.64 months, which falls in the time interval 8 up to 9 months. The total number of females entering this interval is 12; no patients were withdrawn from this interval. Therefore, the number exposed to the risk of death in this interval equals 12, with 9 terminal events, so the proportion terminating is .75 and the proportion surviving through this interval is 1 - .75 = .25, with a hazard rate equal to .6. In the general 1-year survival study, the proportion surviving decreased gradually across time intervals, while the hazard rate increased slightly across time intervals.

The Kaplan-Meier results from the 1-year survival analysis indicate that the median and mean survival times for patients in the sample are high, about 7 years out of 1 years, with a mean age at diagnosis of 45.59 (3, 7). The results also indicate that fertile women survived longer and were diagnosed less often than infertile women. High-weight cases survived less and were diagnosed more often than normal-weight cases. Considering the residence of cases, women living in Umluj survived longer and were diagnosed less often than those living outside Umluj. All these results are confirmed by the tests of equality.

In North America and Europe the incidence in women younger than 40 ranges from 8% to 15%, compared to 8.1% in our experience. It could be argued that our figures, being sample data, reflect a referral bias, but the National Cancer Registry figures for 1994 show that 3% of breast cancer is in those younger than 40 years. Some studies indicate different ages at menarche, weight at menopause, and varying estrogen levels as contributory factors to the difference in incidence between countries. In the U.S., localized breast cancer accounts for 58% of the cases. In our study, pre-menopausal women were found to survive longer than post-menopausal women.
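The life-table quantities quoted at the start of this discussion follow from simple interval arithmetic, which can be sketched as below. The counts (12 entering, 0 withdrawn, 9 events) are illustrative values consistent with a terminating proportion of .75, the interval width is an assumed parameter, and the function name is hypothetical. The hazard line uses the usual actuarial midpoint approximation, so its value depends on the assumed width.

```python
def life_table_interval(entering, withdrawn, events, width=1.0):
    """Actuarial life-table quantities for a single interval.

    exposed : number at risk, counting withdrawals as exposed for half the interval
    q       : proportion terminating (conditional probability of the event)
    p       : proportion surviving the interval
    hazard  : midpoint hazard estimate, 2q / (width * (1 + p))
    """
    exposed = entering - withdrawn / 2.0
    q = events / exposed
    p = 1.0 - q
    hazard = 2.0 * q / (width * (1.0 + p))
    return {"exposed": exposed, "q": q, "p": p, "hazard": hazard}

# Illustrative counts; width=1.0 is an assumption about the interval length.
row = life_table_interval(entering=12, withdrawn=0, events=9, width=1.0)
print(row)
```

Cumulative survival to the end of an interval is then the product of the interval-specific surviving proportions up to that point.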
The semi-parametric model results (the Cox regression model) indicate that age, place of residence, marital status and weight have negative regression coefficients, which indicates that they reduce the hazard of breast cancer by 4.9%, 45.1%, 16.4% and 61.3% respectively. Family history, fertility and menopausal status have positive regression coefficients, which indicates that greater values of these variables increase the hazard of breast cancer by 36.6%, 13.4% and 155.4% respectively. The results also show that the above model was a perfectly adequate model to fit the breast cancer survival data because it has a very small value of R (.194), with log relative hazard = .547. The workability of the Cox regression model for non-medical data was confirmed by checking for violation of the proportional hazards assumption, for influential data, and for nonlinearity in the relationship between the log hazard and the covariates; all of these checks were confirmed.

VI. OBSTACLES

The lack of clinical information on breast cancer in the area was the main obstacle facing the researcher. After some attempts to access clinical information from other sources, the researcher decided to use non-clinical data. The major obstacle encountered in this research study was the difficulty of persuading invited students to actually participate in the study; the researcher conducted in-depth interviews to encourage them to participate. Because of the small sample size (3), the life table was calculated for the whole sample only; the researcher could not calculate life tables for the rest of the covariates in the study.

VII. CONCLUSIONS

The main goal of this study is to review the survival models and examine their workability on non-medical data. To collect the non-medical data, the researcher used the sisterhood method of data collection; this approach is very useful in collecting maternal mortality data and has been used to calculate the maternal mortality ratio in many demographic studies. Information about age, residence, marital status, weight and family history of disease was collected using a questionnaire.
The life table, as an actuarial method, was applied. The Kaplan-Meier estimator was used to show the differences in survival time for each variable in the study. Results from the K-M model indicate that there is statistical evidence of a difference in survival for the fertility variable; for the other variables there was no evidence of a statistical difference. All these results are confirmed by the tests of equality. The Cox regression model was applied to test the effect of each covariate on the hazard rate. The results show that age, place of residence, marital status and weight have a negative effect, while family history, fertility and menopausal status have a positive effect on the hazard of breast cancer. The results also show that the above model was a perfectly adequate model to fit the breast cancer survival data. Finally, it is concluded that the survival models can be applied to non-medical data using primary data (questionnaires).

VIII. RECOMMENDATIONS

Establish breast cancer treatment centers in different states of Saudi Arabia, especially the peripheral ones, with trained staff who know how to train females in the early detection of breast cancer. Facilitate access to cancer registry data and make it available to researchers, through a partnership of universities and

research centers with the cancer registry center. Activate the roles of civil societies and research centers in raising awareness of breast cancer and early detection. Encourage future researchers to conduct a population-based survival analysis study using the available primary data. Further studies are needed to establish a survival analysis of breast cancer using a parametric approach, and to establish the actual prevalence and risk factors associated with breast cancer in Saudi Arabia. Develop indirect survival models to overcome the problem of age misreporting.

IX. ACKNOWLEDGEMENT

The authors would like to acknowledge financial support for this work from the Deanship of Scientific Research (DSR), University of Tabuk, Tabuk, Saudi Arabia, under grant no. s,35/139/1435.

X. REFERENCES

[1]. Allison, P. D. (1984). Event history analysis: regression for longitudinal event data. Beverly Hills, CA: Sage Publications.
[2]. Andersen, P. K. and Gill, R. D. (1982). Cox's regression model for counting processes: a large sample study. Ann. Statist. 10, 1100-1120.
[3]. Cox, D. R. (1972). Regression models and life tables. Journal of the Royal Statistical Society, Series B, 34:187-220.
[4]. Cox, D. R. (1975). Partial likelihood. Biometrika, 62(2):269-276.
[5]. Hosmer, D. W. & Lemeshow, S. (1999). Applied survival analysis: Regression modeling of time to event data. NY: John Wiley & Sons.
[6]. Kalbfleisch, J. D. & Prentice, R. L. (1973). Marginal likelihoods based on Cox's regression and life model. Biometrika 60, 267-278.
[7]. Kalbfleisch, J. D. & Prentice, R. L. (1980). The statistical analysis of failure time data. NY: John Wiley.
[8]. Kaplan, E. L. and Meier, P. (1958). Nonparametric estimation from incomplete observations. Journal of the American Statistical Association, 53:457-481.
[9]. Kardaun, O. (1983). Statistical analysis of male larynx cancer patients: A case study. Statistica Neerlandica, 37:103-126.
[10]. Kosko, B. (1992). Neural networks and fuzzy systems: A dynamical systems approach to machine intelligence. 1st ed. Prentice-Hall International Editions.
[11]. Lee et al. (1992). Statistical methods for survival data analysis. 2nd edition. Wiley, New York. 482 pp.
[12]. Leung, K. M., Elashoff, R. M. and Afifi, A. A. (1997). Censoring issues in survival analysis. Annual Review of Public Health, 18:83-104.
[13]. NCR (199). Saudi Arabia cancer registry report.
[14]. Prentice, R. L. & Farewell, B. T. (1986). Relative risk and odds ratio regression. Annual Review of Public Health 7: 335-338.
[15]. Prentice, R. L. and Cai, J. (1992). Covariance and survival function estimation using censored multivariate failure time data. Biometrika 79, 495-512.
[16]. Prentice, R. L., Williams, B. J. and Peterson, A. V. (1981). On the regression analysis of multivariate failure time data. Biometrika 68, 373-379.
[17]. Schoenfeld, D. (1982). Partial residuals for the proportional hazards model. Biometrika 69, 239-241.
[18]. Sedmak, D. D., Meineke, T. A., Knechtges, D. S. and Anderson, J. (1989). Prognostic significance of cytokeratin-positive breast cancer metastases. Modern Pathol, : 519-5.
[19]. Shane, S. & Foo, M. D. (1999). New firm survival: institutional explanations for new franchisor mortality. Management Science, 45(2), pp. 142-159.
[20]. Statistical program for social sciences (SPSS, 19). SPSS advanced models. Chicago, IL: Author (www.spss.com).
[21]. Therneau, T. M. and Grambsch, P. M. (2000). Modeling survival data: extending the Cox model. Springer.
[22]. Woolson, R. F. (1981). Rank tests and a one-sample log rank test for comparing observed survival data to a standard population. Biometrics, 37:687-696.

Annex (A1):