CHAPTER II EXPLORATORY FACTOR ANALYSIS


Virgil McDonald

Section I - Introduction:

In maximum likelihood estimation and hypothesis testing, the true values of the model parameters are viewed as fixed but unknown, and the estimates of those parameters from a given sample are viewed as random but known. An alternative kind of statistical inference, called the Bayesian approach, views any quantity that is unknown as a random variable and assigns it a probability distribution. From a Bayesian standpoint, true model parameters are unknown and therefore considered to be random, and they are assigned a joint probability distribution. This distribution is not meant to suggest that the parameters are varying or changing in some fashion. Rather, the distribution is intended to summarize the state of knowledge, or what is currently known about the parameters. The distribution of the parameters before the data are seen is called a prior distribution. Once the data are observed, the evidence provided by the data is combined with the prior distribution by a well-known formula called Bayes' theorem. The result is an updated distribution for the parameters, called a posterior distribution, which reflects a combination of prior belief and empirical evidence.

This chapter on exploratory factor analysis covers Markov chain Monte Carlo (MCMC), a new class of simulation techniques, together with high-dimensional joint posterior distributions, maximum likelihood analysis, regression weights, intercepts, covariances, and variances, leading to the discussion of the first model of this thesis, namely Bayesian estimation.

Section II - Bayesian Estimation:

2.2.1 Bayesian Analysis:

Human beings tend to have difficulty visualizing and interpreting the joint posterior distribution for the parameters of a model. Therefore, when performing a Bayesian analysis, one needs summaries of the posterior distribution that are easy to interpret.

A good way to start is to plot the marginal posterior density for each parameter, one at a time. Often, especially with large data samples, the marginal posterior distributions for parameters tend to resemble normal distributions. The mean of a marginal posterior distribution, called a posterior mean, can be reported as a parameter estimate. The posterior standard deviation, the standard deviation of the distribution, is a useful measure of uncertainty similar to a conventional standard error. The analogue of a confidence interval may be computed from the percentiles of the marginal posterior distribution; the interval that runs from the 2.5 percentile to the 97.5 percentile forms a Bayesian 95% credible interval. If the marginal posterior distribution is approximately normal, the 95% credible interval will be approximately equal to the posterior mean ± 1.96 posterior standard deviations. In that case, the credible interval becomes essentially identical to an ordinary confidence interval that assumes a normal sampling distribution for the parameter estimate. If the posterior distribution is not normal, the interval will not be symmetric about the posterior mean. In that case, the Bayesian version often has better properties than the conventional one.

Although the idea of Bayesian inference dates back to the late 18th century, its use by statisticians has been rare until recently. For some, reluctance to apply Bayesian methods stems from a philosophical distaste for viewing probability as a state of belief and from the inherent subjectivity in choosing prior distributions. But for the most part, Bayesian analyses have been rare because computational methods for summarizing joint posterior distributions have been difficult or unavailable. Using a new class of simulation techniques called Markov chain Monte Carlo (MCMC), however, it is now possible to draw random values of parameters from high-dimensional joint posterior distributions, even in complex problems.
With MCMC, obtaining posterior summaries becomes as simple as plotting histograms and computing sample means and percentiles.
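The posterior summaries described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the draws are simulated from a normal distribution as a stand-in for real MCMC output, and the variable names (`draws`, `post_mean`, `ci_lower`, and so on) are hypothetical rather than anything produced by Amos.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical stand-in for MCMC output: 20,000 draws of a single parameter.
draws = [random.gauss(0.5, 0.1) for _ in range(20000)]

post_mean = statistics.fmean(draws)   # posterior mean (point estimate)
post_sd = statistics.stdev(draws)     # posterior standard deviation

# With n=40, statistics.quantiles returns cut points at 2.5%, 5%, ..., 97.5%,
# so the first and last cut points bound an equal-tailed 95% credible interval.
cuts = statistics.quantiles(draws, n=40)
ci_lower, ci_upper = cuts[0], cuts[-1]

# Normal approximation: posterior mean +/- 1.96 posterior standard deviations.
approx_lower = post_mean - 1.96 * post_sd
approx_upper = post_mean + 1.96 * post_sd

print(f"mean={post_mean:.3f} sd={post_sd:.3f}")
print(f"percentile interval: ({ci_lower:.3f}, {ci_upper:.3f})")
print(f"normal approximation: ({approx_lower:.3f}, {approx_upper:.3f})")
```

Because the simulated draws are close to normal, the percentile interval and the normal approximation nearly coincide; with a skewed posterior the two would differ, and the percentile interval would no longer be symmetric about the mean.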

2.2.2 Selecting Priors:

A prior distribution quantifies the researcher's belief concerning where the unknown parameter may lie. Knowledge of how a variable is distributed in the population can sometimes be used to help researchers select reasonable priors for parameters of interest. For example, if a standardized test is given to participants in a study who are fairly representative of the general population, then it would be reasonable to center the prior distributions for the mean and standard deviation of the test score at 100 and 15, respectively. Knowing that an observed variable is bounded may help one to place bounds on the parameters. Prior distributions for the mean and variance of such an item can be specified to enforce these bounds.

In many cases, researchers would like to specify a prior distribution that introduces as little information as possible, so that the data may be allowed to speak for themselves. A prior distribution is said to be diffuse if it spreads its probability over a very wide range of parameter values. By default, Amos applies a uniform distribution over an extremely wide range to each parameter. Diffuse prior distributions are often said to be non-informative, and that term is used here as well. In a strict sense, however, no prior distribution is ever completely non-informative, not even a uniform distribution over the entire range of allowable values, because it would cease to be uniform if the parameter were transformed (if the variance of a variable is uniformly distributed over some range, then the standard deviation will not be uniformly distributed). Every prior distribution carries with it at least some information. As the size of a dataset grows, the evidence from the data eventually swamps this information, and the influence of the prior distribution diminishes.
Unless a sample is unusually small, or a model and/or prior distribution is strongly contradicted by the data, one will find that the answers from a Bayesian analysis tend to change very little if the prior is changed. Amos makes it easy to change the prior distribution for any parameter, so this kind of sensitivity check is easy to perform.
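The sensitivity check described above can be illustrated with a small worked example. The sketch below assumes a normal model with known data standard deviation and a normal prior on the mean, for which the posterior mean has a closed form (a precision-weighted average). The function name `posterior_mean` and all numerical values are hypothetical and chosen only for illustration; this is not how Amos computes its results.

```python
def posterior_mean(prior_mean, prior_sd, data_mean, data_sd, n):
    """Posterior mean of a normal mean with known data SD and a normal prior.

    The result is a precision-weighted average of the prior mean and the
    sample mean, so the prior's weight shrinks as the sample size n grows.
    """
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = n / data_sd ** 2
    weighted_sum = prior_precision * prior_mean + data_precision * data_mean
    return weighted_sum / (prior_precision + data_precision)


# Hypothetical example: sample mean 110, data SD 15, two different priors.
for n in (5, 50, 5000):
    informative = posterior_mean(100.0, 5.0, 110.0, 15.0, n)    # tight prior
    diffuse = posterior_mean(100.0, 1000.0, 110.0, 15.0, n)     # diffuse prior
    print(f"n={n:5d}  informative={informative:7.2f}  diffuse={diffuse:7.2f}")
```

With a small sample the two priors give visibly different answers; with a large sample the posterior means nearly coincide, which is the sense in which the data swamp the prior.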

2.2.3 Gaussian graphical model:

Past and present research is concerned with datasets d in which a large number p (for example, tens of hundreds) of variables is recorded and the sample size n is relatively small (for example, tens or possibly hundreds of observations). Through suitable transformations, d can sometimes be assumed to roughly follow a multivariate Gaussian distribution N_p(0, Σ). Directly attempting to fit this apparently simple model, with p(p+1)/2 parameters represented by the entries of Σ, raises challenging questions of structuring and dimensionality reduction in parameter space. Dempster (1972) introduced the idea of reducing the number of parameters that need to be estimated by setting to zero selected elements of the precision matrix Ω = Σ^(-1). This can, and generally will, lead to more robust estimates of Σ if Ω is required to have a substantial number of structural zeros. In addition, the dependency patterns among the variables in d can be visually summarized by means of an undirected independence graph G in which each variable is associated with a vertex and the edges that link the vertices correspond to the off-diagonal elements of Ω that are not constrained to be zero. The resulting Gaussian distribution satisfies a set of conditional independence relations encoded by G. These relations are called the pairwise, local and global Markov properties, while the pair M = (Σ, G) is called a Gaussian graphical model (Lauritzen, 1996). This model is undirected since the edges in G are lines that represent symmetric associations.

2.2.4 Directed acyclic graph for data:

When covariance selection (Dempster, 1972) is performed with the objective of identifying a number of Gaussian graphical models that are best supported by the data and, in a Bayesian framework, by the available prior information, inference on the parameters of Σ (equivalently, Ω) can then be done by Bayesian model averaging across the pool of models selected.
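To give a sense of the scale involved in the Gaussian graphical models discussed above, the sketch below counts the free parameters of an unrestricted covariance matrix, p(p+1)/2, and the number of distinct undirected graphs on p vertices, 2^(p(p-1)/2), since each pair of vertices either has an edge or does not. The function names are hypothetical and the values of p are chosen only for illustration.

```python
def covariance_parameters(p):
    """Number of free entries in a symmetric p x p covariance matrix."""
    return p * (p + 1) // 2


def undirected_graphs(p):
    """Number of undirected graphs on p labelled vertices.

    There are p(p-1)/2 possible edges, each either present or absent.
    """
    return 2 ** (p * (p - 1) // 2)


for p in (5, 10, 100):
    n_graphs = undirected_graphs(p)
    print(f"p={p:4d}  parameters={covariance_parameters(p):6d}  "
          f"graphs have {len(str(n_graphs))} decimal digits")
```

Even for a moderate number of variables the number of candidate graphs is far too large to enumerate, which is why structural learning has to rely on search rather than exhaustive comparison.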
Searching for graphical models with tens of thousands of nodes is an extremely difficult task, statistically and computationally, due to the vast space of possible graphs that needs to be explored. The majority of the structural learning methods developed so far involve exploring the target space by sequentially adding (deleting) one or more edges to (from) the current graph. In the special case when decomposable graphs G are the only graphs considered, the search space is considerably reduced and there exist conjugate prior distributions for the parameters of M = (Σ, G) (Lauritzen, 1996) that lead to exact formulas for the marginal likelihood of M, p(d | M). Unfortunately, there are two major shortcomings that make decomposable graphs less desirable: (i) the learning procedure is slowed down by the need to determine which edges can be changed so that the resulting graph is still decomposable; and, much more importantly, (ii) the decomposability constraint is simply too severe to yield models that are representative of the complex dependency patterns that exist among the variables in d in other than rather small dimensional problems. In the most general case, when all the possible graphs are considered, numerical or stochastic methods for approximately computing the posterior probability of M need to be employed (Roverato, 2002; Atay-Kayis and Massam, 2003; Dellaportas et al., 2004), which results in search procedures that cannot efficiently cover huge sets of graphs. For a comprehensive review of learning Gaussian graphical models in moderately large datasets, see Jones et al. (2004).

An alternative method of performing covariance selection is to exploit the connection between graphical models on undirected graphs and graphical models on directed acyclic graphs (DAGs, henceforth). The latter distributions follow the order Markov property relative to their underlying DAGs and further obey the Markov properties with respect to the moralized undirected versions of these DAGs (Lauritzen, 1996).
A DAG is a convenient graphical structure that induces a recursive factorization of the joint density as a product of univariate regressions associated with each variable.
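The recursive factorization just described can be written out explicitly. The notation below is an assumption introduced for illustration and does not appear in the source: x_i denotes variable i, pa(i) denotes the set of its parents in the DAG, and β_ij and ψ_i are hypothetical regression coefficients and residual variances for the zero-mean Gaussian case.

```latex
p(x_1, \dots, x_p) \;=\; \prod_{i=1}^{p} p\bigl(x_i \mid x_{\mathrm{pa}(i)}\bigr),
\qquad
x_i \mid x_{\mathrm{pa}(i)} \;\sim\; N\Bigl(\textstyle\sum_{j \in \mathrm{pa}(i)} \beta_{ij}\, x_j,\; \psi_i\Bigr).
```

A variable with no parents contributes its marginal density, so the ordinary chain rule is recovered when every variable has all of its predecessors in the ordering as parents.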

Variables are linked in a DAG with arrows instead of lines. An arrow points from an explanatory variable (the parent) to the response (the child). The decomposition of a multivariate joint distribution induced by a DAG is a straightforward generalization of the usual chain rule and yields exact formulas for computing the marginal likelihood of the corresponding model (Heckerman and Geiger, 1995; Geiger and Heckerman, 2002). Thus DAG models have properties similar to those of graphical models on decomposable graphs. In fact, any decomposable graph can be transformed into a DAG using an ordering generated by the maximum cardinality search algorithm (Lauritzen, 1996), which implies that the class of decomposable graphs is included in the class S of undirected graphs that can be obtained from DAGs by moralizing (ensuring an edge exists between the parents of each child) and replacing the arrows with undirected edges.

Searching the space of DAGs can be done using local moves involving the addition, deletion or reversal of arrows. Unfortunately, methods based on local moves can spend much time traversing DAGs that are statistically equivalent (Heckerman et al., 1994). Two equivalent DAGs describe the same joint distribution and consequently the same Markov relations. Chickering (1995) presented characterizations of equivalent DAGs and introduced search algorithms that jump between Markov equivalence classes. These search methods, although proven to be better than algorithms based on simple local moves, are unfortunately not efficient enough to scale to datasets with tens of thousands of variables. A novel framework for constructing high-dimensional Gaussian graphical models by searching for graphical models on DAGs was presented.
This approach builds on the methods introduced in Dobra and West (2004a) and is guaranteed to eventually converge to local optima in the space of undirected graphs S, in the sense of identifying local modes in posterior distributions over S based on dataset d.
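The moralization step mentioned above (joining the parents of each child and then dropping arrow directions) can be sketched as follows. The representation of a DAG as a dictionary of parent lists, the function name `moralize`, and the four-node example are assumptions made for illustration; they are not taken from the cited work.

```python
from itertools import combinations


def moralize(parents):
    """Return the undirected edges of the moral graph of a DAG.

    `parents` maps each node to the list of its parents. Every arrow
    parent -> child becomes an undirected edge, and every pair of parents
    sharing a child is joined ("married") by an extra edge.
    """
    edges = set()
    for child, node_parents in parents.items():
        for parent in node_parents:
            edges.add(frozenset((parent, child)))
        for first, second in combinations(node_parents, 2):
            edges.add(frozenset((first, second)))
    return edges


# Hypothetical DAG: a -> c <- b, c -> d.
dag = {"a": [], "b": [], "c": ["a", "b"], "d": ["c"]}
moral = moralize(dag)
print(sorted(sorted(edge) for edge in moral))
```

In this example the parents a and b share the child c, so moralization adds the edge a-b that was not present as an arrow in the DAG.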

2.2.5 HdBCS:

Related work presented a novel structural learning method called HdBCS that performs covariance selection in a Bayesian framework for datasets with tens of thousands of variables. HdBCS is based on the intrinsic connection between graphical models on undirected graphs and graphical models on directed acyclic graphs (Bayesian networks). That work also showed how to produce and explore the corresponding association networks by Bayesian model averaging across the models identified. Researchers have illustrated the use of HdBCS with an example from a large-scale gene expression study of breast cancer.

Section III - Bayesian Estimation for Blood Cancer:

2.3.1 Introduction:

As noted in previous work, including the Bayesian analysis of zero-inflated count data with applications to dental caries by Dipankar Bandyopadhyay, experimental and observational studies in high-throughput genomics often generate multiple gene expression signatures, each signature being a list of genes with associated numerical measures of change in gene expression relative to an experimental condition or outcome. A biological or environmental design factor in a controlled experiment generates a signature of response to that factor (Huang et al., 2003c,b; Bild et al., 2006; Chen et al., 2007), while evaluation of gene expression related to a specific clinical outcome or state may generate a signature as a biomarker of the outcome in disease studies (West et al., 2001; Huang et al., 2002, 2003a; Pittman et al., 2004; Seo et al., 2004; Rich et al., 2005; Seo et al., 2007). Interpretation and, often, follow-on biological studies rely on the comparison of such signatures with multiple, annotated biological pathway databases that contain lists of putatively pathway-specific genes based on cumulated biological research.
A core challenge is then to assess the candidate signature gene sets and numerical summaries against these databases to suggest potential pathway interpretations and connections. The focus here was a formal, novel statistical modeling approach to this problem.

2.3.2 Gene set enrichment analysis (GSEA):

The first statistical approach, and general identification of this problem area, led to the method of gene set enrichment analysis (GSEA) (Subramanian et al., 2005) and has generated some deeper statistical approaches more recently (Newton et al., 2007). GSEA aims to measure aggregate association between a full list of genes ranked by their association with an outcome, also referred to as a phenotype, and a set of genes in a predefined pathway gene set. The underlying idea is to assess whether or not the pathway gene set is enriched with genes that score highly in association with the experimental outcome, perhaps with a directional component that looks separately at genes positively versus negatively associated. GSEA was path-breaking and is now quite widely used.

Previous applied work was interested in broader questions and also in formal statistical inference on gene-pathway membership, and this has motivated a formal probabilistic framework that extends the basic thinking into a broader statistical approach. The resulting probabilistic pathway annotation (PROPA) methodology then also addresses a number of issues GSEA methods were not designed for, including the abilities to: (a) deliver formal probabilistic assessments of phenotype-pathway concordance, in terms of marginal likelihoods and posterior probabilities; (b) formally assess concordance of experimental results with several or many biological pathways simultaneously and in comparison with each other; (c) recognize that experimental inferences and established biological pathway databases are error prone, and allow for the identification and correction of errors of both kinds within the analysis; (d) utilize a range of direct numerical measures of association between genes and an experimental outcome as inputs; and (e) provide a more general framework that can be customized to apply to the outputs of gene expression, or other genomic studies of many forms.
In addition to within-analysis robustness, item (c) here also leads to an ability to suggest refinements to the pathway gene lists in established biological databases.
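The basic idea behind GSEA, checking whether the members of a pathway gene set cluster near the top or bottom of a ranked gene list, can be sketched with a simplified running-sum statistic. The sketch below is an unweighted, Kolmogorov-Smirnov-style variant written only for illustration: it omits the correlation weighting and the permutation-based significance testing of the full method, and the function name `enrichment_score` and the gene labels are hypothetical.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score for a ranked gene list.

    Walking down the ranking, the running sum steps up by 1/n_hit at each
    gene in the set and down by 1/n_miss at each gene outside it. The score
    is the largest deviation of the running sum from zero (sign retained).
    """
    hits = [gene in gene_set for gene in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must contain some but not all ranked genes")

    running = 0.0
    best = 0.0
    for is_hit in hits:
        running += 1.0 / n_hit if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical ranking of eight genes, strongest positive association first.
ranking = ["g1", "g2", "g3", "g4", "g5", "g6", "g7", "g8"]
print(enrichment_score(ranking, {"g1", "g2", "g3"}))   # set at the top
print(enrichment_score(ranking, {"g6", "g7", "g8"}))   # set at the bottom
print(enrichment_score(ranking, {"g1", "g5", "g8"}))   # set spread out
```

A score near +1 or -1 indicates that the gene set is concentrated at one end of the ranking, while a set spread evenly through the list gives a score closer to zero.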

The corresponding focus was on applications in cancer genomics. While the primary aim of the research is to highlight the area and applications, the statistical methodology has modeling and computational novelty. A core ingredient of biological pathway assessment is the evaluation of marginal likelihoods in Bayesian models fitted using MCMC methods. Marginal likelihood computations are common and often hard problems (see Raftery et al. (2007) for a recent approach with discussion and many references to other approaches), especially in cases, such as here, of high-dimensional parameter spaces. The favored approach involves a novel extension of variational methods that have been applied in other problems of marginal likelihood computation (for example, Jordan et al. (1999); Corduneanu and Bishop (2001); McGrory and Titterington (2007)); in addressing this problem in the specific applied context, an extension of existing variational methods was introduced that will apply in many other model contexts.

That research describes the overall MCMC strategy for posterior simulation in an analysis focused on a single biological pathway, and the development of computational methods for marginal likelihood computation to aid in comparisons of multiple biological pathways. This includes the innovations in variational methodology. It explores examples to highlight the specification and use of the model. The first cancer genomics application concerns a detailed study of two well-known hormonal pathways in blood cancer. The second application concerns novel experimental data arising in studies of micro-environmental influences on gene expression from in vitro experiments, and connects these experimental findings to in vivo observational blood cancer data. Among other things, this case study demonstrates an overall strategy for in vitro to in vivo projection of gene expression patterns within which PROPA analysis plays key roles.
Related to the above research, the final findings reported by researchers were as follows. The Gullah-speaking inhabitants of the Sea Islands of South Carolina are a unique population because of their minimal Caucasian genetic admixture and high propensity for diabetes. A clinical study was conducted to determine the dental health status of diabetic patients. Dental caries was assessed using the total number of decayed, missing and filled surfaces, an index known as DMFS in the dental literature. Data resulted from examining 4 (for canines and incisors) or 5 (for premolars and molars) surfaces per tooth, for all (up to 32) teeth, for over 260 individuals. Also recorded were covariates including age, gender, smoking and brushing/flossing habits, etc., which may influence caries development. The tooth-level contributions to DMFS, which range from 0 to 5, were then modeled and their associations with covariates evaluated. Histograms suggest a zero-inflated binomial model for the tooth-level counts. As in a hurdle model, the process determining a healthy tooth (with a count of 0) is treated as a structural zero and hence separated from the remaining counts (1 to 5), which are modeled using a zero-truncated binomial distribution. A multivariate model was developed in which covariates enter through a random effects logistic regression on the logit of the probability of a carious surface. To preserve marginal logit structure for interpretability, a bridge density (Wang and Louis, 2003) for the subject-specific random effects was used. The tooth-specific zero-inflation probability is modeled as arising from a beta distribution whose shape/scale parameters are linked to the odds of a healthy tooth (Song et al., 2006). The model was compared with alternatives to assess improvements in prediction and interpretability.

Probabilistic pathway annotation:

The next discussion concerned Bayesian models and computational methods for the problem of matching predictions from molecular studies with known biological pathway databases, that is, the problem of pathway annotation of summary results of an experiment or observational study.
In areas such as cancer genomics, linking quantified, experimentally defined gene expression signatures with known biological pathway gene sets is essential to improving the understanding of the complexity of molecular pathways related to outcome. The probabilistic pathway annotation (PROPA) analysis involves new models for formal assessment and rankings of pathways putatively linked to an experimental or observational phenotype. It integrates qualitative biological information into the analysis and generates coherent inferences on uncertainties about gene pathway membership that can inform the revision of pathway databases. The analysis in the final works mentioned above relied on simulation-based computation in high-dimensional models, and introduced a novel extension of variational methods for computation of model evidence, or marginal likelihood functions, that were central to the comparison of multiple biological pathways. Examples highlight the methodology using both simulated and real data, and detailed case studies in breast cancer genomics were developed, involving hormonal pathways and pathway activities underlying cellular responses to lactic acidosis in breast cancer. The second study demonstrated the application of the method in decomposing the complexity of gene expression-based predictions about interacting biological pathway activation from both experimental (in vitro) and observational (in vivo) human cancer data.

Introduction for first model:

The first statistical approach, and general identification of a problem, led to the method of Bayesian estimation, and has generated some deeper statistical approaches more recently in exploratory factor analysis in AMOS. It aims to measure the aggregate association between a full list of pairs of variables, ranked by their association with an outcome (also referred to by its effect), and a set of statistical measures that are analyzed.

The underlying idea is to assess whether or not the pathway is enriched with all components of a cancer among blood cancer, breast cancer, and primary tumor. The approach considers components that score highly in association with the experimental outcome, perhaps with a directional component that looks separately at each component of a cancer, and it is now quite widely used. Our first model is concerned with datasets d in which a large number p (tens of hundreds) of variables is recorded and the sample size n is relatively small (tens). Our first model is Bayesian estimation from exploratory factor analysis. To illustrate Bayesian estimation using Amos Graphics, an example is explained that shows how to test the null hypothesis that the covariance between two variables is 0 by fixing the value of the covariance between age and vocabulary to 0. This is the resulting path diagram:

[Path diagram: Age and Vocabulary, with the covariance between them fixed at 0. Caption: Chi-square = \cmin (\df df), P = \p]

Maximum Likelihood Analysis:

Before performing a Bayesian analysis of our model, a maximum likelihood analysis is performed for comparison purposes, using the AMOS software to calculate and display the following parameter estimates and standard errors. Our first model, Bayesian estimation, comprises Tables 1.1 to 1.9, including the F1-F2 diagram. Amos displays the Estimates, Scalar Estimates, Maximum Likelihood Estimates, and Regression Weights (Table 1.1).

Regression Weights: Table 1.1

Parameter                  Estimate   S.E.   C.R.   P
Class <--- F1              1.000
Age <--- F1
Lymphatics <--- F2         1.000
Affere <--- F2
Lymphc <--- F2
Lymphs <--- F2
Extravasates <--- F2
regeneration <--- F2
Earlyup <--- F2
ly.no.dim <--- F2
ly.no.en <--- F2
ch.in.lym <--- F2
Defect <--- F2
changeinnode <--- F2
changesinstru <--- F2
specialforms <--- F2
dislocationof <--- F2
exclusionofno <--- F2
no.ofnodesin <--- F2

Intercepts: Table 1.2

Parameter        Estimate   S.E.   C.R.   P
Class                                     ***
Age                                       ***
Lymphatics                                ***
Affere                                    ***
Lymphc                                    ***
Lymphs                                    ***
Extravasates                              ***
Regeneration                              ***
Earlyup                                   ***
ly.no.dim                                 ***
ly.no.en                                  ***
ch.in.lym                                 ***
Defect                                    ***
changeinnode                              ***
changesinstru                             ***
specialforms                              ***
dislocationof                             ***
exclusionofno                             ***
no.ofnodesin                              ***

Covariances: Table 1.3

Parameter        Estimate   S.E.   C.R.   P   Label
F2 <--> F1                                    C

Variances: Table 1.4

Parameter   Estimate   S.E.   C.R.   P
F1
F2
e1
e2                                   ***
e3                                   ***
e4                                   ***
e5                                   ***
e6                                   ***
e7                                   ***
e8                                   ***
e9                                   ***
e10                                  ***
e11                                  ***
e12                                  ***
e13                                  ***
e14                                  ***
e15                                  ***
e16                                  ***
e17                                  ***
e18                                  ***
e19                                  ***

Bayesian Analysis:

Bayesian analysis requires estimation of explicit means and intercepts. Before performing any Bayesian analysis in Amos, one must first tell Amos to estimate means and intercepts. Only then does the Bayesian SEM window appear, and the MCMC algorithm immediately begins generating samples. The F1-F2 diagram is then obtained after analyzing the tables in Amos. In it, F1 contains class and age, and F2 includes all other components for blood cancer. The diagram shows the regression weights for all components. For e1, 0.2 is the mean in the variance table (Table 1.9), and 2.45 is the regression coefficient (standard loading) with class in the intercepts table (Table 1.7). The regression coefficient between age and F1 is 0.3. Similarly, the mean in the variance table (Table 1.9) and the regression coefficient for every pair can be read from Table 1.5.

[Figure: F1-F2 diagram (Table 1.5). Path diagram with latent factors F1 (indicators: class, age) and F2 (indicators: lymphatics, affere, lymphc, lymphs, extravasates, regeneration, earlyup, ly.no.dim, ly.no.en, ch.in.lym, defect, changeinnode, changesinstru, specialforms, dislocationof, exclusionofno, no.ofnodesin), with error terms e1 to e19.]

Regression weights: Table 1.6

Columns: Mean, S.E., S.D., C.S., Median, 95% Lower bound, 95% Upper bound, Skewness, Kurtosis, Min, Max

age <--- F1
affere <--- F2
lymphc <--- F2
lymphs <--- F2
extravasates <--- F2
regeneration <--- F2
earlyup <--- F2
ly.no.dim <--- F2
ly.no.en <--- F2
ch.in.lym <--- F2
defect <--- F2
changeinnode <--- F2
changesinstru <--- F2
specialforms <--- F2
dislocationof <--- F2
exclusionofno <--- F2
no.ofnodesin <--- F2

Intercepts: Table 1.7

Class
Age
Lymphatics
Affere
Lymphc
Lymphs
Extravasates
Regeneration
Earlyup
ly.no.dim
ly.no.en
ch.in.lym
Defect
Changeinnode
Changesinstru
Specialforms
Dislocationof
Exclusionofno
no.ofnodesin

Covariances: Table 1.8

F2 <-> F1

Variances: Table 1.9

F1
F2
e1
e2
e3
e4
e5
e6
e7
e8
e9
e10
e11
e12
e13
e14
e15
e16
e17
e18
e19

F1-F2 diagram:

The Bayesian SEM window has a toolbar near the top of the window and a results summary table below. Each row of the summary table describes the marginal posterior distribution of a single model parameter. The first column, labeled Mean, contains the posterior mean, which is the center or average of the posterior distribution. This can be used as a Bayesian point estimate of the parameter, based on the data and the prior distribution. With a large dataset, the posterior mean will tend to be close to the maximum likelihood estimate. (In this case, the two are somewhat close, as can be seen by comparing the posterior mean for the age-vocabulary covariance with the maximum likelihood estimate.)

Bayesian SEM window:

When Analyze, Bayesian Estimation is chosen in Amos, the MCMC algorithm begins sampling immediately, and it continues until the Pause Sampling button is clicked to halt the process. Sampling was halted after 40 completed samples. Amos generated and discarded 500 burn-in samples prior to drawing the first sample that was retained for the analysis. Amos draws burn-in samples to allow the MCMC procedure to converge to the true joint posterior distribution. After Amos draws and discards the burn-in samples, it draws additional samples to give a clear picture of what this joint posterior distribution looks like. Amos has drawn 5,000 of these analysis samples, and it is upon these analysis samples that the results in the summary table are based. Actually, the displayed results are for analyzing 725 samples. Because the sampling algorithm Amos uses is very fast, updating the summary table after each sample would lead to a rapid, incomprehensible blur of changing results in the Bayesian SEM window. It would also slow the analysis down. To avoid both problems, Amos refreshes the results after every 250 samples. The above tables represent the first model, provided that the F1-F2 diagram and the other tables fit for diagnosing a patient with blood cancer.

Conclusion:

The maximum likelihood estimates for all types of blood cancer are near zero. Further, the posterior means, which are the Bayesian point estimates for each parameter based on the blood cancer dataset and the prior distribution, are also near zero. The disease can be diagnosed through this model, and the expected life of a patient is also calculated from the means of the patients in the data. Our model strongly diagnoses a patient with blood cancer who has the component affere or lymphs with mean 0-1, or a component such as lymphc, extravasates, regeneration, early uptake, lymph nodes diminished, change in lymph, number of nodes, dislocation, or exclusion of node with mean 2-4. It further diagnoses a patient with blood cancer who has a component such as lymph nodes enlarged, defect in node, changes in structure, or special forms with mean 4-8. It also diagnoses a patient with blood cancer who has any of the remaining components with mean above 8.

The posterior mean will tend to be close to the maximum likelihood estimate. In this case, the two are somewhat close, as can be seen by comparing the posterior mean for the age-vocabulary covariance with the maximum likelihood estimate. So the first model fits for diagnosing a patient with blood cancer. Only if the F1-F2 diagram, with the Bayesian estimate, standard error, and intercept mean for the independent and dependent variables, is obtained are the other models, from Chapter III to Chapter VII, of best fit for diagnosing blood cancer in a patient.

Convergence statistic for blood cancer:

The next two investigations of the first model for a patient with blood cancer are the standard error (S.E.) and the convergence statistic (C.S.). The second column of our first model in blood cancer, labeled S.E., reports an estimated standard error that suggests how far the Monte Carlo estimated posterior mean may lie from the true posterior mean. As the MCMC procedure continues to generate more samples, the estimate of the posterior mean becomes more precise, and the S.E. gradually drops. Note that this S.E. is not an estimate of how far the posterior mean may lie from the unknown true value of the parameter. One would not use ± 2 S.E. values as the width of a 95% interval for the parameter. Additional columns of the first model in blood cancer contain the convergence statistic (C.S.), the median value of each parameter, the lower and upper 50% boundaries of the distribution of each parameter, and the skewness, kurtosis, minimum value, and maximum value of each parameter. The lower and upper 50% boundaries are the endpoints of a 50% Bayesian credible set, which is the Bayesian analogue of a 50% confidence interval.

Conclusion:

For the information collected on blood cancer and from Tables 1.1 to 1.9, there is no significant difference between the posterior mean and the true posterior mean; the difference is near zero except for ly.no.en, defect in node, and no.ofnodesin.
The likely distance between the posterior mean and the unknown true parameter is reported in the third column, labeled S.D., and that number is analogous to the standard error in maximum likelihood estimation.
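The difference between the S.E. and S.D. columns, and the role of burn-in samples, can be illustrated with a small simulation. The sketch below is not Amos's algorithm: it uses a basic random-walk Metropolis sampler on a hypothetical one-parameter target (a normal density with mean 2 and standard deviation 0.5), and it estimates the Monte Carlo standard error by the batch-means method, which is an assumption chosen for simplicity. All function names are hypothetical.

```python
import math
import random
import statistics


def log_posterior(theta):
    # Hypothetical target: log density (up to a constant) of N(2, 0.5^2).
    return -0.5 * ((theta - 2.0) / 0.5) ** 2


def metropolis(n_burn_in, n_keep, step=0.5, start=0.0, seed=0):
    """Random-walk Metropolis sampler; the burn-in draws are discarded."""
    rng = random.Random(seed)
    theta = start
    kept = []
    for i in range(n_burn_in + n_keep):
        proposal = theta + rng.gauss(0.0, step)
        log_ratio = log_posterior(proposal) - log_posterior(theta)
        # Accept the proposal with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, log_ratio)):
            theta = proposal
        if i >= n_burn_in:
            kept.append(theta)
    return kept


def batch_means_se(draws, n_batches=20):
    """Monte Carlo standard error of the mean, estimated from batch means."""
    size = len(draws) // n_batches
    means = [statistics.fmean(draws[b * size:(b + 1) * size])
             for b in range(n_batches)]
    return statistics.stdev(means) / math.sqrt(n_batches)


draws = metropolis(n_burn_in=500, n_keep=20000)
post_mean = statistics.fmean(draws)
post_sd = statistics.stdev(draws)   # uncertainty about the parameter (S.D.)
mc_se = batch_means_se(draws)       # simulation error of post_mean (S.E.)
print(f"mean={post_mean:.3f}  S.D.={post_sd:.3f}  S.E.={mc_se:.4f}")
```

Drawing more samples shrinks the S.E. toward zero, because it measures only simulation error, whereas the S.D. settles at the spread of the posterior distribution itself and does not shrink with longer runs.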

Most of us are accustomed to using a confidence level of 95%, so it will shortly be shown how to change the level to 95% in the convergence analysis.

Section IV: Bayesian estimation for breast cancer

2.4. Introduction: In research, decision-analytical models are widely used in economic evaluations of health care interventions, with the objective of providing information that allows scarce health care resources to be allocated efficiently. Such models have a range of uses, including the synthesis of data from a variety of sources (often using meta-analysis) to produce the cost or cost-effectiveness results of interest. Such models are often used to evaluate the complex processes usually associated with the implementation of health care interventions. Decision modeling techniques may be of value, for example, in extrapolating primary data beyond the endpoint of a trial or in comparing treatments for which no head-to-head trials exist. Decision trees provide a simple way to structure problems of decision making under uncertainty while describing the major factors involved. More complicated decision trees can be represented in the form of Markov models. Such models provide a technique for analyzing events that are repeatable (example: relapses of a chronic disease such as multiple sclerosis, arthritis and asthma), or events that play out over an extended period of time (example: the progression of cancer). To evaluate decision-analytical models, estimates need to be acquired for the costs and health outcomes of the various pathways through the model, together with the probability of their occurrence. It should not be forgotten that the usefulness of the results obtained from such models depends on the source and quality of the estimates input into the model.
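How a Markov model follows a cohort through repeatable or long-running events can be sketched as below. The three states (stable, progressed, dead), the transition probabilities, and the per-cycle costs and utilities are hypothetical values chosen only for illustration. They are not taken from the thesis data, and no discounting is applied.

```python
def run_cohort(transition, start, n_cycles):
    """Return the cohort trace: the share of patients in each state per cycle."""
    n_states = len(start)
    trace = [list(start)]
    for _ in range(n_cycles):
        current = trace[-1]
        # Share arriving in state j is the sum over origins i of
        # (share in i) * (probability of moving from i to j).
        following = [
            sum(current[i] * transition[i][j] for i in range(n_states))
            for j in range(n_states)
        ]
        trace.append(following)
    return trace


def accumulate(trace, value_per_cycle):
    """Sum a per-cycle, per-state value (cost or utility) over cycles 1..n."""
    return sum(
        sum(share * value for share, value in zip(row, value_per_cycle))
        for row in trace[1:]
    )


# Hypothetical three-state model: stable, progressed, dead (absorbing).
TRANSITION = [
    [0.80, 0.15, 0.05],
    [0.00, 0.70, 0.30],
    [0.00, 0.00, 1.00],
]
START = [1.0, 0.0, 0.0]
COST_PER_CYCLE = [1000.0, 3000.0, 0.0]   # assumed cost per patient per cycle
UTILITY_PER_CYCLE = [0.8, 0.5, 0.0]      # assumed quality weight per cycle

trace = run_cohort(TRANSITION, START, n_cycles=10)
total_cost = accumulate(trace, COST_PER_CYCLE)
total_utility = accumulate(trace, UTILITY_PER_CYCLE)
```

The quality of `total_cost` and `total_utility` depends entirely on the transition probabilities, costs and utilities supplied, which is the point made above about the source and quality of the model inputs.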

2.4.2 Decision-analytical models: They are sometimes based on primary data collection, but more often rely on published or other secondary sources for cost and effectiveness information. Systematic review methods are a formal and replicable approach to identifying and summarizing existing evidence. When data permit, a quantitative synthesis of the evidence, often referred to as meta-analysis, can be conducted within a systematic review. Systematic methods for evidence synthesis are desirable if the evaluation of health care is to be truly evidence-based, and hence the information for decision models should be based on such rigorous methods. However, very little has been written on the methods of systematic review (including meta-analysis) to be used for the synthesis of evidence for an economic decision. It is currently unclear what sources of evidence should be included in systematic reviews informing decision models (example: should both RCTs and observational studies be included in the same analysis?). This is a particularly pertinent issue in economic decision modeling, since estimates of costs and probabilities are required in addition to clinical effectiveness.

Probabilistic decision models: Those used in economic evaluation are almost exclusively analyzed using classical statistical approaches (two of the rare exceptions are Parmigiani et al. (1997) and Fryback et al. (2001)). Such models place probability distributions on parameters where there is uncertainty in their true value. These can be derived from the results of individual studies or, more desirably, from the results of systematic reviews. Parametric distributional assumptions are necessary when specifying parameter uncertainty, but occasions do exist when such assumptions may be inappropriate, such as when events are rare. The Bayesian analyses described herein relax the need for some of these distributional assumptions.
Further, by combining the synthesis and decision process into one coherent model, the Bayesian approach described here incorporates uncertainty in incidental model parameters, that is, parameters that need estimating but are not of direct interest in the decision model. This uncertainty is often ignored in classical analyses (the between-study variance parameters in meta-analyses are examples, as will be shown later). An additional advantage of the Bayesian approach is that the correlation between parameters, induced when the same data source (example: a systematic review) is used to inform different parts of the model, is automatically accounted for.

Probabilistic decision analytical models: Fryback et al. (2001) outlined how these simple models may be evaluated using Bayesian methods. Earlier work extended this by describing a method whereby the whole process (systematic review incorporating meta-analyses, estimation of transition probabilities, and evaluation of the model and sensitivity analysis) may be combined into a single coherent Bayesian model. The ease of applying such a method is demonstrated through two illustrative examples: 1) the prophylactic use of antibiotics for caesarean section patients to reduce the incidence of wound infections; and 2) the use of taxanes for the second-line treatment of advanced breast cancer compared to conventional treatment. The first example illustrates the process of inputting the pooled estimates obtained from a systematic review (meta-analyses), together with their associated uncertainty, directly into a probabilistic decision analytical model. The second illustrates the situation in which the pooled estimates obtained from a systematic review, together with their associated uncertainty, are first converted to transition probabilities and then applied to a probabilistic Markov decision model. The incorporation of subjective or expert prior beliefs is also illustrated in the latter example.
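Feeding a pooled estimate and its uncertainty into a probabilistic decision model can be sketched with a simple Monte Carlo simulation. This is a simplified illustration, not the single coherent Bayesian model described above, which would be fitted by MCMC. Every number below (baseline counts, pooled log odds ratio and its standard error, costs) and the function name `simulate_risks` are hypothetical.

```python
import math
import random

# Hypothetical inputs, chosen only for illustration.
BASELINE_EVENTS = 20        # infections observed without prophylaxis
BASELINE_NON_EVENTS = 180   # patients without infection
POOLED_LOG_OR = -0.9        # pooled log odds ratio from a meta-analysis
POOLED_LOG_OR_SE = 0.2      # its standard error
COST_PROPHYLAXIS = 10.0     # cost of treatment per patient
COST_INFECTION = 1000.0     # cost of treating one infection


def simulate_risks(n_draws, seed=1):
    """Draw (baseline risk, treated risk) pairs reflecting parameter uncertainty."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_draws):
        # Beta distribution for the baseline risk (uniform prior plus counts).
        baseline = rng.betavariate(BASELINE_EVENTS + 1, BASELINE_NON_EVENTS + 1)
        # Normal distribution for the pooled log odds ratio.
        log_or = rng.gauss(POOLED_LOG_OR, POOLED_LOG_OR_SE)
        # Apply the odds ratio on the odds scale, then convert back to a risk.
        odds_treated = baseline / (1.0 - baseline) * math.exp(log_or)
        treated = odds_treated / (1.0 + odds_treated)
        pairs.append((baseline, treated))
    return pairs


pairs = simulate_risks(5000)
mean_baseline = sum(b for b, _ in pairs) / len(pairs)
mean_treated = sum(t for _, t in pairs) / len(pairs)

# Incremental cost per patient of prophylaxis versus no prophylaxis, per draw.
cost_differences = [
    COST_PROPHYLAXIS + (t - b) * COST_INFECTION for b, t in pairs
]
prob_cost_saving = sum(d < 0 for d in cost_differences) / len(cost_differences)
```

Because the same draw of the baseline risk enters both arms, the two risks are correlated within each draw, which is a small-scale version of the correlation between parameters that the Bayesian model accounts for automatically.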

2.4.5 Decision analytical economic modeling: For modeling within a Bayesian framework, the main points of the earlier work are as follows. Economic evaluation of health care interventions based on decision analytic modeling can generate valuable information for health policy decision-makers. However, the usefulness of the results obtained depends on the quality of the data input into the model, and on the accuracy of the estimates for the costs, effectiveness and transition probabilities between the different health states of the model. Few models demonstrate how the individual components required for decision analytical modeling (systematic review incorporating meta-analyses, estimation of transition probabilities, evaluation of the model and sensitivity analysis) may be addressed simultaneously in one coherent Bayesian model, evaluated using Markov chain Monte Carlo simulation implemented in the specialist Bayesian statistics software WinBUGS. The approach described is applied to two illustrative examples: 1) the prophylactic use of antibiotics for caesarean section patients; and 2) the use of taxanes for the second-line treatment of advanced breast cancer.
The advantages of the Bayesian statistical approach outlined, compared to the conventional classical approaches to decision analysis, include the ability to: (i) perform all necessary analyses, including all intermediate analyses (example: meta-analyses) required to derive model parameters, in a single coherent model; (ii) incorporate expert opinion, either directly or regarding the relative credibility of different data sources; (iii) use the actual posterior distributions for parameters of interest (as opposed to making the distributional assumptions necessary for the classical formulation); and (iv) incorporate uncertainty for all model parameters.

Bayesian estimation for breast cancer: Following the same procedure as for blood cancer, Amos displays the Estimates, Scalar Estimates, Maximum Likelihood Estimates, and Regression Weights in Tables .0 to .8.

Table .0
                        Estimate   S.E.   C.R.   P
Class <--- F            .000
Age <--- F
me2pause <--- F2        .000
Tumorsize <--- F
inv2des <--- F
Decaps <--- F
Degmalig <--- F
Breast <--- F
breastquad <--- F
Irradiat <--- F

Intercepts: Table .
                        Estimate   S.E.   C.R.   P
Class                                            ***
Age                                              ***
me2pause                                         ***
Tumorsize                                        ***
inv2des                                          ***
Decaps                                           ***
Degmalig                                         ***
Breast                                           ***
breastquad                                       ***
Irradiat                                         ***

Covariances: Table .2
                        Estimate   S.E.   C.R.   P
F2 <--> F

2.4.0 Variances: Table .3
                        Estimate   S.E.   C.R.   P
F
F
e
e                                                ***
e                                                ***
e                                                ***
e                                                ***
e                                                ***
e                                                ***
e                                                ***
e                                                ***
e                                                ***

2.4. Bayesian Analysis: Bayesian analysis requires estimation of explicit means and intercepts. Before performing any Bayesian analysis in Amos, one must first tell Amos to estimate means and intercepts. The F-F2 diagram is then obtained after analyzing the tables in Amos. In it, F contains class and age, and F2 includes all other components for breast cancer. The diagram gives the regression weight for every component. For e, 0.4 is the mean in the variance table (.8), and .30 is the regression coefficient (standard loading) with class in the intercepts table (.6); the regression coefficient between age and F is read in the same way. Similarly, the mean in the variance table (.8) and the regression coefficient for every pair can be read from table (.4).

F-F2 diagram (Table .4): path diagram with factor F (class, age) and factor F2 (me2pause, tumorsize, inv2des, decaps, degmalig, breast, breastquad, irradiat), each observed variable with its own error term e.

Table .5
Columns: Mean, S.E., S.D., C.S., Median, 95% Lower bound, 95% Upper bound, Skewness, Kurtosis, Min, Max

Regression weights
age<--f
tumorsize<--f
inv2des<--f
decaps<--f
degmalig<--f
breast<--f
breastquad<--f
irradiat<--f

Intercepts: Table .6
Class
Age
me2pause
Tumorsize
inv2des
Decaps
Degmalig
Breast
Breastquad
Irradiat

Covariances: Table .7
F2<->F

Variances: Table .8
F
F
e
e
e
e
e
e
e
e
e
e

2.4.2: The Bayesian SEM window appears in the same way for patients with breast cancer. The tables above belong to the first model, provided that the F-F2 diagram and the other tables are suitable for diagnosing a patient with breast cancer.

Conclusion: The maximum likelihood estimates for all types of breast cancer are near zero. Further, the posterior means, which are the Bayesian point estimates for each parameter based on the breast cancer dataset and the prior distribution, are also near zero. The disease can be diagnosed through this model, and the expected life of a patient is also calculated from the means of the patients in the data. The first model of the thesis gives a strong diagnosis for a patient with breast cancer whose components nodecaps and irradiat have negative means that are not near zero. It also gives a diagnosis for a patient whose components breast and breastquad have negative means near zero, and for a patient whose remaining components, tumorsize, inv-nodes and deg-malig, have positive means. The posterior mean will tend to be close to the maximum likelihood estimate. In this case the two are somewhat close: compare the posterior mean for the age-vocabulary covariance with its maximum likelihood estimate. The first model is therefore suitable for diagnosing a patient with breast cancer. Only if the F-F2 diagram is obtained, with the Bayesian estimates of the standard error and the intercept means for the independent and dependent variables, are the other models in Chapter III to Chapter VII a good fit for diagnosing breast cancer in a patient.

Standard error and convergence statistic: The same conclusions are obtained for breast cancer as for blood cancer, and the data are suitable for diagnosing a patient with breast cancer.

2.4.5 F-F2 diagram: The same conclusions are reached for diagnosing a patient with breast cancer as for blood cancer. They establish that (1) there is no significant difference between the posterior mean and the true posterior mean, and the difference is near zero except for tumor size, inv-nodes, decaps, and degmalig; (2) for these components, and owing to the prior assumptions in S.E., the expected life of patients can be calculated; and (3) only then are the other models in Chapter III to Chapter VII a good fit for diagnosing breast cancer in a patient.

Section V - Bayesian Estimation for Primary tumor cancer

2.5. Amos displays Estimates, Scalar Estimates, Maximum Likelihood Estimates, and Regression Weights: Tables .9 to .27

Table .9
                        Estimate   S.E.   C.R.   P
Class <--- F            .000
Age <--- F
Sex <--- F                                       ***
Type <--- F                                      ***
Bone <--- F2            .000
Difference <--- F
Bonemarrow <--- F
Lung <--- F
Pleura <--- F
Peritoneum <--- F
Liver <--- F
Brain <--- F
Skin <--- F
Neck <--- F
Supraclavicular <--- F
Axillar <--- F
Mediastinum <--- F
Abdominal <--- F
