Curve groups and breast cancer


Exploring carcinogenesis by gene expression in blood before diagnosis of breast cancer by curve group analysis - the prospective NOWAC postgenome cohort

Note no.: SAMBA/18/15
Authors: Eiliv Lund, Lars Holden, Hege Bøvelstad, Sandra Plancade, Nicolle Mode, Clara-Cecilie Günther, Gregory Nuel, Jean-Christophe Thalabard, Marit Holden
Date: 26 May 2015

Authors

Eiliv Lund (1)
Lars Holden (2)
Hege Bøvelstad (1)
Sandra Plancade (3)
Nicolle Mode (1, 4)
Clara-Cecilie Günther (2)
Gregory Nuel (5)
Jean-Christophe Thalabard (6)
Marit Holden (2)

1) UiT The Arctic University of Norway, Tromsø, Norway
2) Norsk Regnesentral, Oslo, Norway
3) INRA, UR1404 Unité Mathématiques et Informatique Appliquées du Génome à l'Environnement, F78352 Jouy-en-Josas, France
4) National Institute on Aging, National Institutes of Health, Baltimore, MD, USA
5) DR CNRS, INSMI Stochastics and Biology Group (PSB) LPMA, UPMC, Sorbonne University, Paris, France
6) MAP 5, Université Paris Descartes, Sorbonne Paris Cité, France

Norsk Regnesentral

Norsk Regnesentral (Norwegian Computing Center, NR) is a private, independent, non-profit foundation established in NR carries out contract research and development projects in information and communication technology and applied statistical-mathematical modelling. The clients include a broad range of industrial, commercial and public service organisations in the national as well as the international market. Our scientific and technical capabilities are further developed in co-operation with The Research Council of Norway and key customers. The results of our projects may take the form of reports, software, prototypes, and short courses. A proof of the confidence and appreciation our clients have in us is given by the fact that most of our new contracts are signed with previous customers.

Title: Curve groups and breast cancer
Authors: Eiliv Lund, Lars Holden, Hege Bøvelstad, Sandra Plancade, Nicolle Mode, Clara-Cecilie Günther, Gregory Nuel, Jean-Christophe Thalabard, Marit Holden
Date: 26 May
Year: 2015
Publication number: SAMBA/18/15

Abstract

The understanding of temporal mutational processes in cancer is limited. This analysis aimed at exploring the trajectories for the genes, i.e. the changes in gene expression in blood between breast cancer cases and controls as a function of time to cancer diagnosis. Between 2003 and 2006 almost women entered the Norwegian Women and Cancer (NOWAC) postgenome biobank by donating a blood sample preserved for transcriptomic analyses (PAX tube). A total of 637 invasive breast cancer cases were identified through 2009 by linkages to the Cancer Registry of Norway. For each case, a random control matched on birth year and time of blood sampling was selected. After exclusions, 441 case-control pairs were available for analyses. The trajectories consist of the differences over time in gene expression between each case and control pair. We present novel non-parametric statistical methods based on hypothesis testing that show whether there is development over time or not, and whether this development varies among the different strata. We introduced the concept of curve groups, where each curve group consists of genes that have a similar development through time. The gene expressions varied with time in the last years before diagnosis, and this development differed among clinical stages for women participating in the National Breast Cancer Screening Program. The differences among the strata appeared larger in the last year before diagnosis than in earlier years. The curve group analysis revealed significant gene expression differences in blood before diagnosis among strata of clinical stage and mode of detection.

Keywords: transcriptomics, gene expression, cohort, breast cancer, carcinogenesis, metastasis, mammographic screening, blood, systems epidemiology
Availability: Open
Project number:
Research field: Bioinformatics
Number of pages: 32
Copyright: Norsk Regnesentral


Table of Contents

1 Introduction
2 Material and methods
  2.1 Follow-up and register information
3 Laboratory procedures
  3.1 Microarray data
  3.2 Preprocessing of array data
  3.3 Statistical methods
  3.4 Hypothesis tests for development in time for each stratum
  3.5 Hypothesis test for comparing two strata
  3.6 An alternative statistic for comparing two strata
  3.7 Computing p-values: permutation tests
4 Results
  4.1 Hypothesis tests for development in time for each stratum
  4.2 Hypothesis tests for comparing two strata
5 Discussion
6 Conclusion
References
Supplementary
  Hypothesis tests for development in time for each stratum
  Supplementary figure for Table 2 in the paper
  Supplementary table for Table 2 in the paper
  Methods for separation of the strata
  Supplementary figure for Figure 3 in the paper


1 Introduction

The assumption of systems epidemiology [1] is that functional aspects of human carcinogenesis might be communicated through blood as gene expression patterns before diagnosis, either as active signals or as passive information. Recently an editorial in Nature Medicine [2] advocated the need to change from mouse models to a human model for the understanding of the carcinogenic processes. In observational studies of humans, the prospective design would be best to incorporate the time aspect of the carcinogenesis and the changing exposures. On the other hand, analyses of somatic mutations in cancer genome studies have revealed a huge diversity of the mutational processes as part of carcinogenesis [3]. One explanation for this observation could be that multiple mutational processes operate dependently on different biological processes in subgroups of cancers, thus giving a jumbled composite signature. Because of this jumbled composite signature, the functional analyses in observational studies should be stratified on important clinical knowledge such as node status, mode of detection and potential exposures.

One approach for prospective functional genomic studies is to compile trajectories from many independent case-control pair measurements in order to study the process of carcinogenesis [4]. The trajectory for a gene is a curve that shows the changes in gene expression in blood as a function of time to cancer diagnosis, and consists of the differences in gene expression between cases and controls in a nested case-control design. The controls establish the average (mean) level of gene expression in women not affected by cancer and provide exposure-adjusted analyses. The level of expression for a gene not involved in the carcinogenic process should be constant, on average, during the years before diagnosis. Genes related to the different stages of the carcinogenesis could be differentially expressed over time.

There is no prior knowledge about the form of the trajectories for any of the thousands of genes. This lack of a priori information demands an agnostic approach [5], putting all genes on an equal basis and adjusting for multiple testing using a false discovery rate [6]. We present a prospective analysis based on the Norwegian Women and Cancer postgenome study (NOWAC) [7]. The aim was to describe the time-dependent carcinogenesis process in blood through an agnostic approach and epidemiological design. The trajectories were analyzed stratified on important clinical factors like lymph

node status at time of diagnosis and the mode of detection, but without identifying single genes or conducting pathway analyses.

2 Material and methods

The Norwegian Women and Cancer (NOWAC) cohort study is a nation-wide population-based cancer study initiated in 1991; for detailed information see [8]. Random samples of women based on unique national birth number were drawn from the central person register by Statistics Norway. Name and address were printed on the letter of information and the birth number was replaced by a serial number on the questionnaires. The linkage file for the birth number and the serial number was kept at Statistics Norway. The questionnaires were returned to the Institute of Community Medicine, University of Tromsø. Non-responders were mailed one or two reminders.

Between 2003 and 2006 the postgenome biobank collected approximately samples nested in the NOWAC cohort; for more details see [7]. Women in the NOWAC study who completed an eight-page questionnaire with an information letter introducing blood sampling and who agreed to participate in blood sampling (97.2%) were eligible for blood donation. Each woman received equipment for blood collection and a two-page questionnaire. Blood sampling equipment was mailed in batches of 500 to randomly chosen women, with one reminder after 4-6 weeks. The blood sampling consisted of one PAXgene tube (PreAnalytiX GmbH, Hombrechtikon, Switzerland) with a buffer or stabilization agent for mRNA in order to improve the quality of the gene expression for genome-wide microarray analyses. Blood was primarily drawn at the family doctor's office and sent as biological material overnight to Tromsø. Upon arrival the PAX tubes were immediately frozen.
Altogether women were invited through 141 groups and (72.3%) of them returned a blood sample and questionnaire during May August In addition, 2569 women donated blood at the Mammographic Screening Unit, the University Hospital of Tromsø from 2004 until After removing duplicates, missing blood samples and excluding women who later would leave the study (n=4) a total of blood samples were available for follow-up.

2.1 Follow-up and register information

For the set of unique women belonging to the postgenome biobank, breast cancer cases through the end of 2009 were identified through linkages to the Cancer Registry of Norway, which provided information on incident cases of breast cancer and stage. Altogether 637 cases of invasive breast cancer were reported. For each case a control matched on time of blood sampling and year of birth was analyzed together with the case. After removing 16 cases with other cancer, 8 cases with previous breast cancer, 18 pairs where the control was diagnosed with cancer within two years after blood sampling and 44 pairs defined as outliers, of which 18 case-control pairs were marked as technical outliers, a total of 551 pairs remained. Of these, 83 had missing, incomplete or uncertain clinical information. The 27 cases with blood samples taken more than five years before diagnosis were not included. The eligible cohort consisted of 441 pairs.

Information on method of diagnosis, at screening unit or outside, was obtained from the Cancer Registry of Norway through linkage to the screening database kept by the National Breast Cancer Screening Program [9]. The cases were reclassified as node negative or node positive (without spread or with spread) based on the pTNM information from the Cancer Registry of Norway. Women participating in the screening program and diagnosed with breast cancer consisted of two groups: Screen-detected cancer, and Interval cancer detected within two years after their last mammogram screening. The stratum Clinical consists of women who did not attend screening prior to diagnosis. Cases with a negative mammogram and with a clinically detected breast cancer more than two years after screening were included in the Clinical group.
Based on this information we could classify our cases into six strata: «Screening with spread», «Screening without spread», «Interval with spread», «Interval without spread», «Clinical with spread», and «Clinical without spread». The distribution of the selected 441 case-control pairs over the six strata is shown in Table 1.

3 Laboratory procedures

3.1 Microarray data

To control for technical variability such as different batches of reagents and kits, day-to-day variations, microarray production batches and effects related to different laboratory operators, each case and its random control matched on birth year and month of blood

collection were kept together through all procedures such as extraction, amplification and hybridization. RNA extraction used the PAXgene Blood miRNA Isolation kit according to the manufacturer's manual at the NTNU Genomic Core Facility in Trondheim, Norway. RNA quality and purity were assessed using the NanoDrop ND 8000 spectrophotometer (ThermoFisher Scientific; Delaware, USA) and Agilent bioanalyzer (Palo Alto; CA, USA), respectively. RNA amplification was performed on 96 plates using 300 ng of total RNA and the Illumina TotalPrep-96 RNA Amplification Kit (Ambion Inc; Austin, Texas, USA). The amplification procedure consisted of reverse transcription with a T7 promoter and ArrayScript, followed by a second-strand synthesis. In vitro transcription with T7 RNA polymerase using a biotin-NTP mix produced biotinylated cRNA copies of each mRNA in the sample. All cases and controls were run on either the Illumina HumanWG-6 version 3 expression bead chip or the HumanHT-12 version 4. The microarray service was provided by the Genomics Core Facility, Norwegian University of Science and Technology. Outliers were excluded after visual examination of dendrograms, principal component analysis plots and density plots. Individuals that were considered borderline outliers were excluded if their laboratory quality measures were below given thresholds (RIN value < 7, 260/280 ratio < 2, 260/230 ratio < 1.7, and 50 < RNA < 500).

3.2 Preprocessing of array data

The dataset was preprocessed as previously described [10]. The dataset consisting of 441 case-control pairs and probes were background corrected using negative control probes and normalized on the original scale using quantile normalization. Data from the two Illumina chip types (HumanWG-6 v3 and HumanHT-12 v4) were combined on identical nucleotide universal identifiers (nuID) [11]. We retained probes present in at least 1 % of the individuals, i.e. in at least 9 of the 882 individuals.
If a gene was represented with more than one probe only one was selected resulting in a dataset with probes. The probes were translated to genes using the IlluminaHumanAll.db database [12]. Finally, the log2-differences of the expression values for each case-control pair were computed and used in the statistical analyses. Additional adjustments for possible batch effects were unnecessary due to the matched-pair processing and the focus on differences between pairs.
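The final preprocessing step, the log2-difference per case-control pair, can be illustrated with a minimal sketch. The function name `log2_differences` and the toy expression values are hypothetical and not taken from the report; the sketch assumes strictly positive, already normalized expression values on the original scale.

```python
import math


def log2_differences(case_expr, control_expr):
    """Return log2(case) - log2(control) per gene for one matched pair.

    Both inputs are lists of normalized expression values (original scale),
    ordered by the same genes. Hypothetical helper for illustration only.
    """
    if len(case_expr) != len(control_expr):
        raise ValueError("case and control must cover the same genes")
    return [math.log2(c) - math.log2(k) for c, k in zip(case_expr, control_expr)]


# Toy pair with three genes: up in the case, down in the case, unchanged.
case = [8.0, 2.0, 5.0]
control = [4.0, 4.0, 5.0]
diffs = log2_differences(case, control)
print(diffs)  # [1.0, -1.0, 0.0]
```

A value of 1.0 means the expression in the case is twice that of its matched control, and 0.0 means no difference within the pair.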

3.3 Statistical methods

An original statistical method based on hypothesis testing was developed in order to detect a functional dependence in gene expression over time, and whether this functional dependency differed among the different strata. We have developed methods that are able to identify small changes that vary slowly in time and/or among strata, by using a large number of genes in each test. For defining test statistics that measure development in time and differences among strata, we have introduced the concept of curve groups, where each curve group consists of genes that have a similar development in time, i.e., similar trajectories. Below we describe the methods in detail.

Let $X_{g,p}$ be the log2-expression difference for case-control pair $p$ and gene $g$. Each case-control pair belongs to a stratum $s$ and a time period $t$, $t = 1, 2, 3$, where $t=1$ is 0-1 year before diagnosis, $t=2$ is 1-2 years before diagnosis and $t=3$ is 3-5 years before diagnosis. We want to test whether $X_{g,p}$ is independent of the time period, and whether there is no difference among the strata, i.e., $X_{g,p}$ is independent of stratum.

3.4 Hypothesis tests for development in time for each stratum

For each stratum we will test whether $X_{g,p}$ is independent of the time period. To define a statistic that measures development in time we first introduce the concept of curve groups: For a given stratum $s$, a gene $g$ can belong to zero or one of six curve groups based on the order of the average of the data over all case-control pairs in the stratum in the three time periods. These averages are denoted $\bar{X}_{g,3,s}$, $\bar{X}_{g,2,s}$ and $\bar{X}_{g,1,s}$, respectively. Six curve groups, called «123», «132», «213», «231», «312» and «321», were defined. The three numbers in each name of a curve group represent the order of time period 3 (left number), the order of time period 2 (middle number) and the order of time period 1 (right number). If e.g. $\bar{X}_{g,3,s} < \bar{X}_{g,2,s} < \bar{X}_{g,1,s}$, gene $g$ may belong to curve group 123, indicating an increasing gene expression in time when approaching the time of diagnosis. See Figure 1 for an illustration of the concept of curve groups.

For each curve group we will only include genes with a significant change in gene expression over time. This is done by testing whether the smallest and largest values of $\bar{X}_{g,3,s}$, $\bar{X}_{g,2,s}$ and $\bar{X}_{g,1,s}$ are different using a two-sample t-test (assuming unequal variances). Let $p_{g,c}$ be the p-value of this test. Depending

on the statistical question at hand, we define two alternative criteria for concluding that a gene $g$ belongs to the curve group $c$:

- Inclusion criterion 1: Gene $g$ belongs to curve group $c$ if $p_{g,c}$ is below a predefined limit $\alpha$.
- Inclusion criterion 2: Gene $g$ belongs to curve group $c$ if gene $g$ is among the $M$ genes with the lowest $p_{g,c}$-value; see the next section.

To make a test for development in time, we count for each stratum the number of genes that belong to the curve group using inclusion criterion 1 defined above. For each stratum, we then perform seven hypothesis tests, one global test and one for each of the six curve groups. In the global test the test statistic is the total number of genes which belong to one of the six curve groups, while in the test for a curve group the test statistic is the number of genes that belong to this curve group. If the conclusion of the hypothesis test is that there are more genes in the curve groups than what is expected by chance, we conclude that there is a significant development in time for some of these genes.

3.5 Hypothesis test for comparing two strata

We want to test whether there are differences in gene expressions between two strata, with spread and without spread, using information from several genes. For each curve group $c$, stratum $s$ and case-control pair $p$, we define a curve group variable $Z_{c,s,p}$ as follows: We select the genes that belong to the curve group $c$ for stratum $s$ using inclusion criterion 2 defined above with $M=100$. Let $G_{c,s}$ denote this set of genes. The curve group variable $Z_{c,s,p}$ for case-control pair $p$ is then computed as the average value of the data $X_{g,p}$ over genes in $G_{c,s}$:

$$Z_{c,s,p} = \frac{1}{|G_{c,s}|} \sum_{g \in G_{c,s}} X_{g,p}.$$

We can test whether the variables $Z_{c,s,p}$ are different for case-control pairs $p$ between the two strata either for all time periods combined or for each time period separately.
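The curve group naming, inclusion criterion 2 and the curve group variable can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: the function names, the dictionary layout and the toy values are hypothetical, ties between period means are treated as "no curve group", and the p-values of the two-sample t-test are assumed to be computed elsewhere.

```python
from statistics import mean


def curve_group(mean_t3, mean_t2, mean_t1):
    """Name the curve group from the rank order of the three period means.

    The digits are the ranks (1 = smallest) of time period 3, time period 2
    and time period 1, in that order. Returns None if two means tie
    (an assumption of this sketch; the report does not discuss ties).
    """
    means = [mean_t3, mean_t2, mean_t1]
    if len(set(means)) < 3:
        return None
    ordered = sorted(means)
    return "".join(str(ordered.index(m) + 1) for m in means)


def select_top_genes(p_values, m):
    """Inclusion criterion 2: the m genes with the lowest p-values.

    p_values maps gene name -> p-value of the t-test (assumed precomputed).
    """
    return sorted(p_values, key=p_values.get)[:m]


def curve_group_variable(pair_diffs, gene_set):
    """Average log2 difference of one case-control pair over a gene set."""
    return mean(pair_diffs[g] for g in gene_set)


# Increasing expression towards diagnosis: period 3 < period 2 < period 1.
print(curve_group(-0.2, 0.0, 0.3))  # 123

# Toy data: keep the two genes with the lowest p-values, then average.
p_values = {"geneA": 0.001, "geneB": 0.004, "geneC": 0.30}
gene_set = select_top_genes(p_values, m=2)
pair_diffs = {"geneA": 0.5, "geneB": 0.25, "geneC": -1.0}
z = curve_group_variable(pair_diffs, gene_set)
print(gene_set, z)  # ['geneA', 'geneB'] 0.375
```

The report uses M=100 genes per curve group; the toy example uses m=2 only to keep the data readable.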
Note that the genes are selected based on data from stratum $s$, but the variable may be calculated for case-control pairs $p$ in any stratum. More specifically, assume that we want to test if there is a difference in gene expression between case-control pairs in stratum "with spread" versus stratum "without spread" for curve group 123. Assume that the set of 100 genes $G_{123,\text{spread}}$ is selected using criterion 2 in the "spread" stratum. We then

calculate $Z_{123,\text{spread},p}$ for all case-control pairs $p$ in stratum "spread" and $Z_{123,\text{spread},p'}$ for all case-control pairs $p'$ in stratum "without spread", and test if the difference is larger than expected by chance. Note that testing the "with spread" versus "without spread" strata may also be performed with the gene set $G_{123,\text{without spread}}$ selected from the "without spread" stratum, or with a gene set selected from any of the other defined strata.

3.6 An alternative statistic for comparing two strata

The test described above focuses on genes that belong to the same curve group. We have also constructed a hypothesis test to compare the difference in time development between two strata that does not depend on curve groups. The test statistic is constructed by first computing the two-sample t-statistic $T_{g,t}$, comparing the difference in gene expression between the two strata for each gene $g$ and time period $t$. We define

$$F_g = \sum_{t} w_t \, |T_{g,t}|$$

as the weighted sum of the absolute values of the t-statistics for gene $g$ with weights $w_t$. Further, the test statistic is defined as

$$L_k = \sum_{g \in G_k} F_g,$$

where $G_k$ is the set of genes with the $k$ largest $F_g$ values, i.e. $L_k$ is the sum of the $k$ largest $F_g$ values. We observe that $L_k$ is a weighted sum of absolute t-statistics. We used equal weights $w_t = 1/3$ for each time period. Alternatively, the weights could be selected either as proportional to the number of case-control pairs in each time period or with larger values for the time periods closer to the time of diagnosis. In addition to the global test including all three time periods, separate tests for each time period were also performed, in which only data corresponding to that time period were included. This test performed very well on several simulated datasets with a different time development or different gene expression level for some genes for two strata; for details see [13].
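The alternative statistic can be sketched as follows: a two-sample t-statistic with unequal variances per gene and time period, the per-gene weighted sum of absolute t-statistics, and the sum of the k largest per-gene values. The function names and toy numbers are hypothetical, and the sketch assumes at least two pairs per stratum and non-zero variance.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Two-sample t-statistic with unequal variances (Welch)."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


def f_scores(t_stats, weights):
    """Per-gene weighted sum of absolute t-statistics over time periods.

    t_stats maps gene -> list of t-statistics, one per time period.
    """
    return {
        gene: sum(w * abs(t) for w, t in zip(weights, ts))
        for gene, ts in t_stats.items()
    }


def l_statistic(scores, k):
    """Sum of the k largest per-gene scores."""
    return sum(sorted(scores.values(), reverse=True)[:k])


# Toy t-statistics for three genes and three time periods (made-up values).
t_stats = {
    "g1": [2.0, -1.0, 0.0],
    "g2": [-3.0, 3.0, 3.0],
    "g3": [0.0, 0.0, 1.5],
}
weights = [1 / 3, 1 / 3, 1 / 3]  # equal weights, as used in the report
scores = f_scores(t_stats, weights)
l2 = l_statistic(scores, k=2)  # uses g2 and g1, the two largest scores
```

Because absolute values are taken before summing, a gene whose difference between strata changes sign across time periods still contributes to the statistic.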
In the Supplementary we use these t-statistics to construct a variable that separates the case-control pairs in two strata.

3.7 Computing p-values: permutation tests

In all tests described above, we compute p-values by estimating the null distribution for the statistic of the hypothesis test by randomizing the data. In the hypothesis test for a given stratum where we test for development in time, the null model is estimated by randomizing case-control pairs for that stratum between time periods, while in the hypothesis tests where two strata are compared, the null model is estimated by randomizing case-control pairs between the two strata for each time period. Note that

these randomization algorithms maintain the correlation structure between the genes for each case-control pair. Also note that the curve groups are redefined before a sample of the null model is computed from a randomized dataset. The p-value of the test is set to $(K+1)/(N+1)$, where $N$ is the total number of randomizations and $K$ is the number of randomizations out of $N$ with a more extreme statistic than the statistic for the real data [14]. In the results presented we have used N =

4 Results

4.1 Hypothesis tests for development in time for each stratum

A time trend was considered present if there were more genes in the curve groups than expected by chance. Results for the different strata are presented in Table 2. In the first panel we compared all pairs with spread to all pairs without spread. The results were non-significant, indicating no changes in gene expression over time when not stratifying on mode of detection. Stratifying cases on participation in the screening program or not revealed significant time trends in cases with spread found either at screening or as interval cancers, as more p-values are below 0.05 than we would expect by chance. Further stratification on all modes of detection showed that the effect was mainly restricted to interval cancers with spread. In the tests we have used inclusion criterion 1 with $\alpha$ = In Figure 4 in the Supplementary we show how the results depend on the $\alpha$-values. We conclude that the results are not very sensitive to the choice of $\alpha$-values and that $\alpha = 0.01$ is a reasonable choice.

4.2 Hypothesis tests for comparing two strata

Based on the results from the previous section, we restrict our analysis to comparing the gene expression in the two strata «Screening or interval with spread» and «Screening or interval without spread» using the curve group variable $Z_{c,s,p}$ described in the method section. P-values obtained by testing whether the curve group variables $Z_{c,s,p}$ are different in the two strata are shown in Table 3.
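The p-values of this kind come from the permutation scheme of Section 3.7. A minimal sketch for a single time period is given below. It rests on several assumptions that are not from the report: the test statistic is taken to be the absolute difference in mean curve group variable between the two strata, the gene set is held fixed rather than reselected in each randomization, and the function name, the default `n_perm=999` and the toy values are hypothetical.

```python
import random
from statistics import mean


def permutation_p_value(z_a, z_b, n_perm=999, seed=0):
    """Permutation p-value (K + 1) / (N + 1) for two groups of values.

    z_a, z_b: curve group variables of the pairs in the two strata
    (one time period). The statistic, an absolute difference in means,
    is an assumption of this sketch.
    """
    rng = random.Random(seed)
    observed = abs(mean(z_a) - mean(z_b))
    pooled = list(z_a) + list(z_b)
    n_a = len(z_a)
    k = 0  # randomizations with a more extreme statistic than the real data
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign pairs to the two strata at random
        stat = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if stat > observed:
            k += 1
    return (k + 1) / (n_perm + 1)


# Completely separated toy groups: no reshuffling can exceed the observed
# difference, so K = 0 and the p-value is the smallest attainable one.
with_spread = [10.0, 11.0, 12.0, 13.0, 14.0]
without_spread = [0.0, 1.0, 2.0, 3.0, 4.0]
p = permutation_p_value(with_spread, without_spread)
print(p)  # 0.001
```

Adding one to both numerator and denominator keeps the estimate away from zero, so the smallest reportable p-value is 1/(N+1) and depends directly on the number of randomizations.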
Note that many of the p-values are below 0.05 and that some are smaller than In Figure 2 we illustrate how to use the gene expression data to separate the two strata by showing the curve group variable $Z_{c,s,p}$ for each case-control pair $p$ in the different strata. The plot showed that the difference between the two strata changes over time for the two most significant $Z_{c,s,p}$

variables. Black and red points are separated, which indicates that the differences between the strata (with spread and without spread) were larger in the last year before diagnosis than in earlier years, and that this difference holds for both screening and interval cases (red/black circles and red/black triangles, respectively). Nevertheless, the differences were not large and the ability to predict the clinical stage for individual cases remains limited. However, it may be possible to develop a procedure that can separate a subgroup of the case-control pairs without spread from the remaining case-control pairs, i.e. it should be possible to predict the clinical stage for some of the cases without spread, but not for all.

In the methods section we introduced the statistic $L_k$, a weighted sum of t-statistics, as an alternative to the curve group variables $Z_{c,s,p}$ for comparing the gene expression levels of two strata. In Figure 3 we plot the p-value in a hypothesis test with $L_k$ as test statistic against the number of genes $k$. The plot shows that the gene expression levels are different in the two strata. The p-values decrease with an increasing number of genes used in the calculation of $L_k$. With 50 genes the p-value is about 0.05, and it decreased to below 0.02 when we used the 1000 most significant genes. This indicates that the difference between the strata is present in a large number of genes, but is so weak that the strongest result was obtained when including a large number of genes. Also, notice that time period 1, the last year before diagnosis, contributed most to the low p-values. This is in accordance with the results shown in Figure 2 and Table 3. In Figure 5 in the Supplementary we illustrate how to separate the two strata for each case-control pair using a variable that corresponds to the test statistic $L_k$ used in Figure 3.
5 Discussion

This explorative analysis has shown that it is possible to significantly discriminate the time trend of gene expression patterns observed before diagnosis. The findings are based on an original approach for the statistical analysis of time-dependent curves of gene expression in the NOWAC postgenome cohort. The methods could also be used for other aspects of functional genomics like methylation. These findings deserve to be further interpreted in relation to the biology of both single genes and gene pathways. The prospective analyses of gene expression in the years preceding diagnosis as

assessed by the log-fold change between cases and controls showed significant differences in the curve groups according to stratification as defined by mode of detection and node status of the cases at time of diagnosis.

Studies of gene expression in peripheral blood are challenging as they are exposed to many difficulties and pitfalls. The ubiquitous degradation by RNase reduces the quality of mRNA for whole genome analyses in most biobanks except for those with a buffer or samples directly frozen in liquid nitrogen. The signals related to carcinogenesis are expected to be much weaker than in tumor tissue and can be confounded by signals from exposures to carcinogens or other lifestyle factors. The problem of noise due to the complicated study object of carcinogenesis, the need for adequate epidemiological design including exposure information and blood sampling, complicated technology and development of robust statistics could make the approach unsuccessful. The prospective design made it difficult to increase the statistical power of the study, so the results should be interpreted carefully.

To the best of our knowledge, the NOWAC postgenome cohort is the largest population-based prospective cancer study designed for transcriptomic studies based on buffered RNA. All parts of the analyses are done within the same cohort framework of NOWAC. In the NOWAC postgenome cohort a single laboratory processed all samples using the same technology, thus reducing analytical bias and batch effects. The cohort design reduced selection bias. A weakness of a prospective study could be the change of case-control status as controls became cases over time, thus reducing the differences in gene expression within a pair. We removed all pairs where controls were diagnosed with breast cancer or another cancer in a period of at least two years after blood sampling. Unfortunately no repeated sampling of blood and questionnaires was conducted.
Repeated measurements would allow better analyses by making intra-individual comparisons over time possible.

One stratification factor was based on the mode of detection. In Norway, the National Mammographic Screening Program for breast cancer started in 1996 with complete coverage of the population from 2005 [9]. It has been estimated that the introduction of population-based mammographic screening in Norway gave a mean sojourn time for invasive cancer of 4.0 years in women aged years and 6.6 years for those years [15]. Analyses of breast cancer carcinogenesis as a time-dependent process should therefore take into consideration that cases diagnosed at the

mammographic screening program are diagnosed at an earlier phase of carcinogenesis and thus not directly comparable to clinically detected cancers.

Secondly, node status has for a hundred years been the most important prognostic factor in breast cancer treatment. In one of the earliest publications, from Yale in 1920, the five-year survival of metastatic cancer was 15% [16]. Even in Norway after the Second World War, the observed five-year survival rate was 25% [17] in node positive cases. What can be observed before diagnosis or treatment are thus the signals from a deadly disease. The starting time of the metastatic growth is unknown. The time from initiation of metastases to their diagnosis has been estimated at 5.8 years [18]. At time of diagnosis we had a censored distribution of tumors where mode of detection determines the time of diagnosis irrespective of the underlying carcinogenic process. Differences in gene expression in blood at diagnosis between node positive and node negative tumors have been described in a small clinical study without controls [19].

The findings of pervasive but small changes in gene expression present in blood before diagnosis of breast cancer could have several explanations depending on the different views on carcinogenesis. One conclusion of the cancer genome project was that remarkably little is known about the process of carcinogenesis [1]. The interpretations of the gene expression trajectories should therefore be explorative. Human observational findings should not necessarily be related to existing models of carcinogenesis since these are based mainly on animal experiments. Among models currently debated are the driver-passenger model [20], the mathematical multistage model or the two-stage clonal model [21], the oncogene addiction hypothesis [22], hallmarks of cancer with a current focus on the immune system [23] and the exposure-driven model [24].
The findings of a strong effect of stratification by stage and mode of detection could have implications for the construction of predictive or prognostic clinical tests. Most studies with tumor tissues have not taken into account the different biological signals from cancers at different stages. Stratification could improve the sensitivity and specificity of such tests.

From a statistical point of view the Cox proportional hazard model and its extensions have been widely used by epidemiologists since the seminal work by Cox [25] for analyzing cohort studies with time-varying covariates. It has been adapted as well for case-control designs [26] and some extensions have been proposed for covariates

18 measured with noise [27] and time-changing coefficients [28]. More recently, the adjunction of covariates in high dimension like gene expression data added some challenging statistical issues [29]. While the characteristics and the basic assumptions of the Cox model are adapted to the dimensionality and the very specific paired design of the NOWAC study, the Cox model is not fully adapted to the estimation of changes in the gene expression curves and to the biological interpretations of gene pathways. An agnostic search for time trends depends on a sensitive statistical approach. We have presented two novel statistical methods that demonstrated that the gene expressions vary with time the last years before diagnosis and that this development in time differs between clinical stages for participants inside a screening program. One of the methods focuses on identifying genes with specific functional dependencies in time within a given clinical stage. The other method focuses on difference in gene expressions between clinical stages in the different time periods. Hence, the two methods focus on different aspects of functional time dependency relative to time of diagnosis of the gene expressions. Both methods give significant results when we use many genes and the data from the last year before diagnosis contributes the most to this result. As the gene expression data are very noisy, all methods use information from several genes simultaneously to increase the power of the hypothesis tests used. We found that the differences between the strata (with spread and without spread) are larger the last year before diagnosis, than in earlier years, but that the differences are small and the ability to predict the clinical stage for individual cases is limited. However, it is possible to separate a subgroup of the case-control pairs without spread from the remaining case-control pairs, Figure 2, and predict the clinical stage for some cases without spread, but not for all. 
A potential weakness of the curve group approach is that the number of curve groups grows as the number of time periods increases. With four time periods we would need 24 curve groups, and with five time periods even more.

6 Conclusion

The findings indicate that gene expression in blood before diagnosis might be used as a biomarker of disease extent. These findings could be viewed as a proof of concept of systems epidemiology, indicating the potential of including gene expression for functional analysis in prospective studies of cancer.
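The growth in the number of curve groups noted in the discussion can be made concrete with a small sketch. It assumes that each curve group corresponds to one ordering of the time periods, which is consistent with the labels 123 and 132 in Figure 1 and with the 24 groups stated for four periods. The function name `curve_group_labels` is hypothetical and is not part of the original analysis.

```python
from itertools import permutations
from math import factorial


def curve_group_labels(n_periods):
    """Return one label per ordering of the periods 1..n_periods.

    Assumption: a curve group is identified by a permutation of the
    period indices, written as a string such as "132".
    """
    periods = [str(i) for i in range(1, n_periods + 1)]
    return ["".join(order) for order in permutations(periods)]


if __name__ == "__main__":
    for n in (3, 4, 5):
        labels = curve_group_labels(n)
        # The number of orderings of n periods is n factorial.
        assert len(labels) == factorial(n)
        print(n, "periods:", len(labels), "curve groups")
```

Under this assumption, three periods give 6 curve groups, four give 24 and five give 120, so the number of genes available per group shrinks quickly as periods are added.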

References

1. Lund E, Dumeaux V. Systems Epidemiology in Cancer. Cancer Epidemiology Biomarkers & Prevention. 2008;17(11).
2. Of men, not mice. Nat Med. 2013;19(4).
3. Alexandrov LB, Nik-Zainal S, Wedge DC, Aparicio SAJR, Behjati S, Biankin AV et al. Signatures of mutational processes in human cancer. Nature. 2013;500(7463).
4. Lund E, Plancade S. Transcriptional output in a prospective design conditionally on follow-up and exposure: the multistage model of cancer. International Journal of Molecular Epidemiology and Genetics. 2012;3(2).
5. Spitz MR, Bondy ML. The evolving discipline of molecular epidemiology of cancer. Carcinogenesis. 2010;31(1).
6. Reiner A, Yekutieli D, Benjamini Y. Identifying differentially expressed genes using false discovery rate controlling procedures. Bioinformatics. 2003;19(3).
7. Dumeaux V, Borresen-Dale A-L, Frantzen J-O, Kumle M, Kristensen V, Lund E. Gene expression analyses in breast cancer epidemiology: the Norwegian Women and Cancer postgenome cohort study. Breast Cancer Research. 2008;10(1).
8. Lund E, Dumeaux V, Braaten T, Hjartåker A, Engeset D, Skeie G et al. Cohort Profile: The Norwegian Women and Cancer Study (NOWAC) Kvinner og kreft. Int J Epidemiol. 2008;37(1).
9. Hofvind S, Geller B, Vacek PM, Thoresen S, Skaane P. Using the European guidelines to evaluate the Norwegian Breast Cancer Screening Program. European Journal of Epidemiology. 2007;22(7).
10. Günther C, Holden M, Holden L. Preprocessing of gene-expression data related to breast cancer diagnosis: SAMBA/35/
11. Du P, Kibbe W, Lin S. nuID: a universal naming scheme of oligonucleotides for Illumina, Affymetrix, and other microarrays. Biology Direct. 2007;2(1).
12. Carlson M. lumiHumanAll.db: Illumina Human Illumina expression annotation data (chip lumiHumanAll). R package version
13. Holden L. Classify strata. NR note SAMBA/11/
14. Phipson B, Smyth GK. Permutation P-values Should Never Be Zero: Calculating Exact P-values When Permutations Are Randomly Drawn. Stat Appl Genet Mol Biol. 2010;31(9).
15. Weedon-Fekjær H, Lindqvist BH, Vatten LJ, Aalen OO, Tretli S. Estimating mean sojourn time and screening sensitivity using questionnaire data on time since previous screening. Journal of Medical Screening. 2008;15(2).
16. Todd M, Shoag M, Cadman E. Survival of women with metastatic breast cancer at Yale from 1920 to Journal of Clinical Oncology. 1983;1(6).
17. Survival of cancer patients: cases diagnosed in Norway. The Norwegian Cancer Society and The Cancer Registry in Norway. Oslo.
18. Engel J, Eckel R, Kerr J, Schmidt M, Fürstenberger G, Richter R et al. The process of metastasisation for breast cancer. European Journal of Cancer. 2003;39(12).
19. Zuckerman NS, Yu H, Simons DL, Bhattacharya N, Carcamo-Cavazos V, Yan N et al. Altered local and systemic immune profiles underlie lymph node metastasis in breast cancer patients. International Journal of Cancer. 2013;132(11).
20. Stratton MR, Campbell PJ, Futreal PA. The cancer genome. Nature. 2009;458(7239).
21. Vineis P, Schatzkin A, Potter JD. Models of carcinogenesis: an overview. Carcinogenesis. 2010;31(10).
22. Felsher DW. Oncogene Addiction versus Oncogene Amnesia: Perhaps More than Just a Bad Habit? Cancer Research. 2008;68(9).
23. Hanahan D, Weinberg RA. The Hallmarks of Cancer. Cell. 2000;100(1).
24. Lund E. An exposure driven functional model of carcinogenesis. Medical Hypotheses. 2011;77(2).
25. Cox DR. Regression Models and Life-Tables. Journal of the Royal Statistical Society Series B (Methodological). 1972;34(2).
26. Aalen OO, Borgan Ø, Gjessing HK. Survival and Event History Analysis. A Process Point of View. Statistics for Biology and Health. New York: Springer.
27. Hu P, Tsiatis AA, Davidian M. Estimating the Parameters in the Cox Model When Covariate Variables are Measured with Error. Biometrics. 1998;54(4).
28. O'Quigley J. Proportional Hazards Regression. Statistics for Biology and Health. Springer.
29. Benner A, Zucknick M, Hielscher T, Ittrich C, Mansmann U. High-Dimensional Cox Models: The Choice of Penalty as Part of the Model Building Process. Biometrical Journal. 2010;52(1).

Table 1 Number of case-control pairs in each stratum and time period for the dataset. The stratum Clinical consists of i) women who attended screening, but more than two years before diagnosis; and ii) women who did not attend screening prior to diagnosis.

Columns, year before diagnosis (time period): 5-3 (3); 2 (2); 1 (1)
Rows, by stratum:
Screening (diagnosed at a screening visit): Spread; Not spread
Interval (diagnosed within two years of a screening visit): Spread; Not spread
Clinical (outside the screening program): Spread; Not spread

Table 2 P-values obtained when testing whether there are more genes in the curve groups than expected by chance. We have used inclusion criterion 1 with α = . P-values below 0.05 are highlighted in yellow. The observed and expected numbers of genes in each curve group are shown in Table 4 in the Supplementary.

Upper panel, columns: Curve group; Screening, interval or clinical with spread; Screening, interval or clinical without spread; Screening or interval with spread; Screening or interval without spread. First row: Global.
Lower panel, columns: Curve group; Screening with spread; Screening without spread; Interval with spread; Interval without spread; Clinical with spread; Clinical without spread. First row: Global.

Table 3 P-values obtained when testing whether the curve group variables $Z_{c,s,p}$ differ between the two strata «Screening or interval with spread» and «Screening or interval without spread». P-values below 0.05 are highlighted in yellow.

Columns: Period $t$; $N_1$; $N_2$; Curve group $c$; p-value for $Z_{c,s_1,p}$ (genes selected based on stratum $s_1$ = «Screening or interval with spread»); p-value for $Z_{c,s_2,p}$ (genes selected based on stratum $s_2$ = «Screening or interval without spread»).

$N_1$ is the number of case-control pairs in the stratum «Screening or interval with spread» in time period $t$, while $N_2$ is the number of case-control pairs in the stratum «Screening or interval without spread» in time period $t$.

Figure 1 Example of two different curve groups: 123 (upper) and 132 (lower). In the left panels, curves for 20 genes from the given curve group are plotted. For illustration, the curves have been estimated from the data using splines. In the middle panels, the data for one of the 20 genes are shown with the corresponding spline-estimated curve. The points represent the differences in gene expression $X_{g,p}$ for each case-control pair. The mean values in each time period, $X_{g,3,s}$, $X_{g,2,s}$ and $X_{g,1,s}$, are shown in red. The right panels are similar to the middle panels, except that the plotted data are the mean values computed over the 20 genes in the left panel.

Figure 2 Plot of two of the most significant curve group variables $Z_{c,s,p}$ for the screening and interval strata. «With spread 132» on the x-axis denotes that $s$ in $Z_{c,s,p}$ is the stratum «Screening or interval with spread» and $c$ is curve group 132, while «Without spread 312» on the y-axis denotes that $s$ in $Z_{c,s,p}$ is the stratum «Screening or interval without spread» and $c$ is curve group 312.

Figure 3 The p-value in a hypothesis test with test statistic $L_k$, a weighted sum of t-statistics, plotted against the number of genes $k$ used in the calculation of $L_k$. The two strata compared using $L_k$ are «Screening or interval with spread» and «Screening or interval without spread».

7 Supplementary

7.1 Hypothesis tests for development in time for each stratum

Supplementary figure for Table 2 in the paper

In Table 2 in the paper we presented p-values obtained when testing whether there are more genes in the curve groups than expected by chance. In these tests we used inclusion criterion 1 with α = . A small α-value implies that we only include genes with a strong trend. Figure 4 shows how the p-values depend on the value of α. We observe that the p-values are not sensitive to the choice of α, except that α should be closer to 0 than to 1.

Figure 4 P-values obtained when testing whether there are more genes in a curve group than expected, plotted against α, the parameter of inclusion criterion 1. The horizontal dotted line indicates a p-value equal to . The data for the stratum «With spread» consist of the data for «Screening with spread», «Interval with spread» and «Clinical with spread», and similarly for the stratum «Without spread».

Supplementary table for Table 2 in the paper

In Table 2 in the paper we presented p-values obtained when testing whether there are more genes in the curve groups than expected by chance. In these tests we used inclusion criterion 1 with α = . Table 4 shows the observed and the expected number of genes in each curve group. It is important that the number of genes in each curve group is not too small; if it were, this would indicate that too small an α-value had been chosen, weakening the power of the test. The table shows that this is not the case.

Table 4 The observed number of genes in each curve group, with the expected number of genes in parentheses. The cases with a p-value below 0.05 in Table 2 are highlighted in yellow.
Observed number of genes (expected number of genes)

Curve group | Screening, interval or clinical with spread | Screening, interval or clinical without spread | Screening or interval with spread | Screening or interval without spread
Global | 305 (513) | 609 (535) | 1360 (482) | 708 (547)
 | (76) | 97 (82) | 259 (70) | 69 (86)
 | (100) | 171 (103) | 518 (99) | 205 (107)
 | (102) | 145 (105) | 171 (105) | 203 (108)
 | (82) | 40 (82) | 314 (77) | 46 (82)
 | (77) | 44 (81) | 48 (66) | 51 (82)
 | (76) | 112 (82) | 50 (65) | 134 (83)

Observed number of genes (expected number of genes)

Curve group | Screening with spread | Screening without spread | Interval with spread | Interval without spread | Clinical with spread | Clinical without spread
Global | 475 (464) | 490 (547) | 1233 (485) | 471 (525) | 448 (491) | 302 (502)
 | (75) | 78 (85) | 101 (81) | 33 (90) | 233 (84) | 83 (83)
 | (91) | 141 (106) | 515 (92) | 96 (97) | 52 (84) | 54 (90)
 | (96) | 107 (109) | 237 (89) | 123 (96) | 18 (82) | 40 (92)
 | (82) | 29 (82) | 213 (81) | 71 (83) | 101 (83) | 45 (77)
 | (63) | 46 (82) | 92 (70) | 31 (78) | 21 (77) | 27 (77)
 | (58) | 89 (83) | 75 (73) | 117 (81) | 23 (81) | 53 (83)

7.2 Methods for separation of the strata

Supplementary figure for Figure 3 in the paper

In Figure 3 in the paper we illustrated our ability to separate two strata based on the t-statistics $T_{g,t}$ for each gene $g$ and time period $t$. We can also illustrate the separation of the two strata by calculating a variable $Y_{p,k}$ for each case-control pair $p$, using information from several genes simultaneously, to show that there are differences in gene expression between the two strata. We define $Y_{p,k}$ such that it is low for case-control pairs from one stratum and high for case-control pairs from the other stratum:

$$Y_{p,k} = \frac{1}{k} \sum_{g \in G_k} X_{g,p} \, \mathrm{sign}(T_{g,t}),$$

where $t$ is the time period for case-control pair $p$, and $\mathrm{sign}(T_{g,t})$ is 1 if $T_{g,t}$ is positive and -1 otherwise. Here, $G_k$ is the set of genes with the $k$ largest $F_g$-values (defined in the paper). Figure 5 shows the $Y_{p,1000}$ values for the case-control pairs $p$ in the different strata. Notice that there is a separation between spread and not spread in period 1, and between interval with spread and interval without spread in period 2. The separation is such that some, but not all, of the pairs without spread have smaller values that seem to lie outside the range of the values with spread.
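The definition of $Y_{p,k}$ can be sketched in a few lines of code. This is a minimal illustration, not the authors' implementation: the function name `separation_score`, the dictionary-based inputs and the toy numbers are assumptions, and the $F_g$-values are taken as given because they are defined in the paper rather than here.

```python
def sign(value):
    """Return 1 if value is positive, -1 otherwise (as in the text)."""
    return 1 if value > 0 else -1


def separation_score(x_pair, t_stats, f_values, k):
    """Compute Y_{p,k} for one case-control pair p.

    x_pair:   gene -> X_{g,p}, the case-control expression difference
    t_stats:  gene -> T_{g,t} for the time period t of this pair
    f_values: gene -> F_g, used only to rank genes
    k:        number of top-ranked genes to average over
    """
    if k < 1 or k > len(f_values):
        raise ValueError("k must be between 1 and the number of genes")
    # G_k: the k genes with the largest F_g-values.
    top_genes = sorted(f_values, key=f_values.get, reverse=True)[:k]
    total = sum(x_pair[g] * sign(t_stats[g]) for g in top_genes)
    return total / k


if __name__ == "__main__":
    # Toy data for three hypothetical genes.
    f_values = {"g1": 5.0, "g2": 3.0, "g3": 1.0}
    t_stats = {"g1": 2.0, "g2": -1.5, "g3": 0.4}
    x_pair = {"g1": 0.5, "g2": -0.25, "g3": 10.0}
    print(separation_score(x_pair, t_stats, f_values, k=2))
```

Multiplying each difference by the sign of its t-statistic aligns the genes so that they all push $Y_{p,k}$ in the same direction for a given stratum, which is why averaging over many noisy genes can still separate the strata.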

Figure 5 Plot of the variable $Y_{p,k}$, where $k = 1000$ and where genes have been selected based on data in the period of case-control pair $p$ (upper) or on data in all three periods (lower).

Analysis of gene expression in blood before diagnosis of ovarian cancer

Analysis of gene expression in blood before diagnosis of ovarian cancer Analysis of gene expression in blood before diagnosis of ovarian cancer Different statistical methods Note no. Authors SAMBA/10/16 Marit Holden and Lars Holden Date March 2016 Norsk Regnesentral Norsk

More information

Nature Methods: doi: /nmeth.3115

Nature Methods: doi: /nmeth.3115 Supplementary Figure 1 Analysis of DNA methylation in a cancer cohort based on Infinium 450K data. RnBeads was used to rediscover a clinically distinct subgroup of glioblastoma patients characterized by

More information

Case Studies on High Throughput Gene Expression Data Kun Huang, PhD Raghu Machiraju, PhD

Case Studies on High Throughput Gene Expression Data Kun Huang, PhD Raghu Machiraju, PhD Case Studies on High Throughput Gene Expression Data Kun Huang, PhD Raghu Machiraju, PhD Department of Biomedical Informatics Department of Computer Science and Engineering The Ohio State University Review

More information

Abstract. Optimization strategy of Copy Number Variant calling using Multiplicom solutions APPLICATION NOTE. Introduction

Abstract. Optimization strategy of Copy Number Variant calling using Multiplicom solutions APPLICATION NOTE. Introduction Optimization strategy of Copy Number Variant calling using Multiplicom solutions Michael Vyverman, PhD; Laura Standaert, PhD and Wouter Bossuyt, PhD Abstract Copy number variations (CNVs) represent a significant

More information

National Surgical Adjuvant Breast and Bowel Project (NSABP) Foundation Annual Progress Report: 2009 Formula Grant

National Surgical Adjuvant Breast and Bowel Project (NSABP) Foundation Annual Progress Report: 2009 Formula Grant National Surgical Adjuvant Breast and Bowel Project (NSABP) Foundation Annual Progress Report: 2009 Formula Grant Reporting Period July 1, 2011 June 30, 2012 Formula Grant Overview The National Surgical

More information

RASA: Robust Alternative Splicing Analysis for Human Transcriptome Arrays

RASA: Robust Alternative Splicing Analysis for Human Transcriptome Arrays Supplementary Materials RASA: Robust Alternative Splicing Analysis for Human Transcriptome Arrays Junhee Seok 1*, Weihong Xu 2, Ronald W. Davis 2, Wenzhong Xiao 2,3* 1 School of Electrical Engineering,

More information

The Loss of Heterozygosity (LOH) Algorithm in Genotyping Console 2.0

The Loss of Heterozygosity (LOH) Algorithm in Genotyping Console 2.0 The Loss of Heterozygosity (LOH) Algorithm in Genotyping Console 2.0 Introduction Loss of erozygosity (LOH) represents the loss of allelic differences. The SNP markers on the SNP Array 6.0 can be used

More information

Supplementary information for: Human micrornas co-silence in well-separated groups and have different essentialities

Supplementary information for: Human micrornas co-silence in well-separated groups and have different essentialities Supplementary information for: Human micrornas co-silence in well-separated groups and have different essentialities Gábor Boross,2, Katalin Orosz,2 and Illés J. Farkas 2, Department of Biological Physics,

More information

Cancer outlier differential gene expression detection

Cancer outlier differential gene expression detection Biostatistics (2007), 8, 3, pp. 566 575 doi:10.1093/biostatistics/kxl029 Advance Access publication on October 4, 2006 Cancer outlier differential gene expression detection BAOLIN WU Division of Biostatistics,

More information

Unit 1 Exploring and Understanding Data

Unit 1 Exploring and Understanding Data Unit 1 Exploring and Understanding Data Area Principle Bar Chart Boxplot Conditional Distribution Dotplot Empirical Rule Five Number Summary Frequency Distribution Frequency Polygon Histogram Interquartile

More information

Analysis of small RNAs from Drosophila Schneider cells using the Small RNA assay on the Agilent 2100 bioanalyzer. Application Note

Analysis of small RNAs from Drosophila Schneider cells using the Small RNA assay on the Agilent 2100 bioanalyzer. Application Note Analysis of small RNAs from Drosophila Schneider cells using the Small RNA assay on the Agilent 2100 bioanalyzer Application Note Odile Sismeiro, Jean-Yves Coppée, Christophe Antoniewski, and Hélène Thomassin

More information

Gene expression profiling predicts clinical outcome of prostate cancer. Gennadi V. Glinsky, Anna B. Glinskii, Andrew J. Stephenson, Robert M.

Gene expression profiling predicts clinical outcome of prostate cancer. Gennadi V. Glinsky, Anna B. Glinskii, Andrew J. Stephenson, Robert M. SUPPLEMENTARY DATA Gene expression profiling predicts clinical outcome of prostate cancer Gennadi V. Glinsky, Anna B. Glinskii, Andrew J. Stephenson, Robert M. Hoffman, William L. Gerald Table of Contents

More information

National Surgical Adjuvant Breast and Bowel Project (NSABP) Foundation Annual Progress Report: 2008 Formula Grant

National Surgical Adjuvant Breast and Bowel Project (NSABP) Foundation Annual Progress Report: 2008 Formula Grant National Surgical Adjuvant Breast and Bowel Project (NSABP) Foundation Annual Progress Report: 2008 Formula Grant Reporting Period July 1, 2011 December 31, 2011 Formula Grant Overview The National Surgical

More information

Statistical Analysis of Biomarker Data

Statistical Analysis of Biomarker Data Statistical Analysis of Biomarker Data Gary M. Clark, Ph.D. Vice President Biostatistics & Data Management Array BioPharma Inc. Boulder, CO NCIC Clinical Trials Group New Investigator Clinical Trials Course

More information

Roadmap for Developing and Validating Therapeutically Relevant Genomic Classifiers. Richard Simon, J Clin Oncol 23:

Roadmap for Developing and Validating Therapeutically Relevant Genomic Classifiers. Richard Simon, J Clin Oncol 23: Roadmap for Developing and Validating Therapeutically Relevant Genomic Classifiers. Richard Simon, J Clin Oncol 23:7332-7341 Presented by Deming Mi 7/25/2006 Major reasons for few prognostic factors to

More information

Computer Science, Biology, and Biomedical Informatics (CoSBBI) Outline. Molecular Biology of Cancer AND. Goals/Expectations. David Boone 7/1/2015

Computer Science, Biology, and Biomedical Informatics (CoSBBI) Outline. Molecular Biology of Cancer AND. Goals/Expectations. David Boone 7/1/2015 Goals/Expectations Computer Science, Biology, and Biomedical (CoSBBI) We want to excite you about the world of computer science, biology, and biomedical informatics. Experience what it is like to be a

More information

T. R. Golub, D. K. Slonim & Others 1999

T. R. Golub, D. K. Slonim & Others 1999 T. R. Golub, D. K. Slonim & Others 1999 Big Picture in 1999 The Need for Cancer Classification Cancer classification very important for advances in cancer treatment. Cancers of Identical grade can have

More information

Estimating mean sojourn time and screening sensitivity using questionnaire data on time since previous screening

Estimating mean sojourn time and screening sensitivity using questionnaire data on time since previous screening Estimating mean sojourn time and screening sensitivity using questionnaire data on time since previous screening Harald Weedon-Fekjær, Bo H. Lindqvist, Odd O. Aalen, Lars J. Vatten and Steinar Tretli Short

More information

AD (Leave blank) TITLE: Genomic Characterization of Brain Metastasis in Non-Small Cell Lung Cancer Patients

AD (Leave blank) TITLE: Genomic Characterization of Brain Metastasis in Non-Small Cell Lung Cancer Patients AD (Leave blank) Award Number: W81XWH-12-1-0444 TITLE: Genomic Characterization of Brain Metastasis in Non-Small Cell Lung Cancer Patients PRINCIPAL INVESTIGATOR: Mark A. Watson, MD PhD CONTRACTING ORGANIZATION:

More information

micrornas (mirna) and Biomarkers

micrornas (mirna) and Biomarkers micrornas (mirna) and Biomarkers Small RNAs Make Big Splash mirnas & Genome Function Biomarkers in Cancer Future Prospects Javed Khan M.D. National Cancer Institute EORTC-NCI-ASCO November 2007 The Human

More information

Supplementary Information Titles Journal: Nature Medicine

Supplementary Information Titles Journal: Nature Medicine Supplementary Information Titles Journal: Nature Medicine Article Title: Corresponding Author: Supplementary Item & Number Supplementary Fig.1 Fig.2 Fig.3 Fig.4 Fig.5 Fig.6 Fig.7 Fig.8 Fig.9 Fig. Fig.11

More information

DOES THE BRCAX GENE EXIST? FUTURE OUTLOOK

DOES THE BRCAX GENE EXIST? FUTURE OUTLOOK CHAPTER 6 DOES THE BRCAX GENE EXIST? FUTURE OUTLOOK Genetic research aimed at the identification of new breast cancer susceptibility genes is at an interesting crossroad. On the one hand, the existence

More information

New Enhancements: GWAS Workflows with SVS

New Enhancements: GWAS Workflows with SVS New Enhancements: GWAS Workflows with SVS August 9 th, 2017 Gabe Rudy VP Product & Engineering 20 most promising Biotech Technology Providers Top 10 Analytics Solution Providers Hype Cycle for Life sciences

More information

Supplementary Figures

Supplementary Figures Supplementary Figures Supplementary Figure 1. Pan-cancer analysis of global and local DNA methylation variation a) Variations in global DNA methylation are shown as measured by averaging the genome-wide

More information

CHL 5225 H Advanced Statistical Methods for Clinical Trials. CHL 5225 H The Language of Clinical Trials

CHL 5225 H Advanced Statistical Methods for Clinical Trials. CHL 5225 H The Language of Clinical Trials CHL 5225 H Advanced Statistical Methods for Clinical Trials Two sources for course material 1. Electronic blackboard required readings 2. www.andywillan.com/chl5225h code of conduct course outline schedule

More information

BIOSTATISTICAL METHODS AND RESEARCH DESIGNS. Xihong Lin Department of Biostatistics, University of Michigan, Ann Arbor, MI, USA

BIOSTATISTICAL METHODS AND RESEARCH DESIGNS. Xihong Lin Department of Biostatistics, University of Michigan, Ann Arbor, MI, USA BIOSTATISTICAL METHODS AND RESEARCH DESIGNS Xihong Lin Department of Biostatistics, University of Michigan, Ann Arbor, MI, USA Keywords: Case-control study, Cohort study, Cross-Sectional Study, Generalized

More information

ncounter TM Analysis System

ncounter TM Analysis System ncounter TM Analysis System Molecules That Count TM www.nanostring.com Agenda NanoString Technologies History Introduction to the ncounter Analysis System CodeSet Design and Assay Principals System Performance

More information

chapter 1 - fig. 2 Mechanism of transcriptional control by ppar agonists.

chapter 1 - fig. 2 Mechanism of transcriptional control by ppar agonists. chapter 1 - fig. 1 The -omics subdisciplines. chapter 1 - fig. 2 Mechanism of transcriptional control by ppar agonists. 201 figures chapter 1 chapter 2 - fig. 1 Schematic overview of the different steps

More information

SUPPLEMENTARY INFORMATION. Confidence matching in group decision-making. *Correspondence to:

SUPPLEMENTARY INFORMATION. Confidence matching in group decision-making. *Correspondence to: In the format provided by the authors and unedited. SUPPLEMENTARY INFORMATION VOLUME: 1 ARTICLE NUMBER: 0117 Confidence matching in group decision-making an Bang*, Laurence Aitchison, Rani Moran, Santiago

More information

CHAPTER 6. Conclusions and Perspectives

CHAPTER 6. Conclusions and Perspectives CHAPTER 6 Conclusions and Perspectives In Chapter 2 of this thesis, similarities and differences among members of (mainly MZ) twin families in their blood plasma lipidomics profiles were investigated.

More information

Introduction to Discrimination in Microarray Data Analysis

Introduction to Discrimination in Microarray Data Analysis Introduction to Discrimination in Microarray Data Analysis Jane Fridlyand CBMB University of California, San Francisco Genentech Hall Auditorium, Mission Bay, UCSF October 23, 2004 1 Case Study: Van t

More information

Journal: Nature Methods

Journal: Nature Methods Journal: Nature Methods Article Title: Network-based stratification of tumor mutations Corresponding Author: Trey Ideker Supplementary Item Supplementary Figure 1 Supplementary Figure 2 Supplementary Figure

More information

Corporate Medical Policy

Corporate Medical Policy Corporate Medical Policy Microarray-based Gene Expression Testing for Cancers of Unknown File Name: Origination: Last CAP Review: Next CAP Review: Last Review: microarray-based_gene_expression_testing_for_cancers_of_unknown_primary

More information

Non-Profit Startup Paradigm Launches Cancer Panel Based on DNA, RNA Sequencing

Non-Profit Startup Paradigm Launches Cancer Panel Based on DNA, RNA Sequencing Non-Profit Startup Paradigm Launches Cancer Panel Based on DNA, RNA Sequencing April 11, 2014 By Tony Fong Non-profit diagnostics outfit Paradigm last month joined a growing list of entrants in the clinical

More information

Supplement for: CD4 cell dynamics in untreated HIV-1 infection: overall rates, and effects of age, viral load, gender and calendar time.

Supplement for: CD4 cell dynamics in untreated HIV-1 infection: overall rates, and effects of age, viral load, gender and calendar time. Supplement for: CD4 cell dynamics in untreated HIV-1 infection: overall rates, and effects of age, viral load, gender and calendar time. Anne Cori* 1, Michael Pickles* 1, Ard van Sighem 2, Luuk Gras 2,

More information

On the Reproducibility of TCGA Ovarian Cancer MicroRNA Profiles

On the Reproducibility of TCGA Ovarian Cancer MicroRNA Profiles On the Reproducibility of TCGA Ovarian Cancer MicroRNA Profiles Ying-Wooi Wan 1,2,4, Claire M. Mach 2,3, Genevera I. Allen 1,7,8, Matthew L. Anderson 2,4,5 *, Zhandong Liu 1,5,6,7 * 1 Departments of Pediatrics

More information

Sample Size Estimation for Microarray Experiments

Sample Size Estimation for Microarray Experiments Sample Size Estimation for Microarray Experiments Gregory R. Warnes Department of Biostatistics and Computational Biology Univeristy of Rochester Rochester, NY 14620 and Peng Liu Department of Biological

More information

Prediction Model For Risk Of Breast Cancer Considering Interaction Between The Risk Factors

Prediction Model For Risk Of Breast Cancer Considering Interaction Between The Risk Factors INTERNATIONAL JOURNAL OF SCIENTIFIC & TECHNOLOGY RESEARCH VOLUME, ISSUE 0, SEPTEMBER 01 ISSN 81 Prediction Model For Risk Of Breast Cancer Considering Interaction Between The Risk Factors Nabila Al Balushi

More information

Risk-prediction modelling in cancer with multiple genomic data sets: a Bayesian variable selection approach

Risk-prediction modelling in cancer with multiple genomic data sets: a Bayesian variable selection approach Risk-prediction modelling in cancer with multiple genomic data sets: a Bayesian variable selection approach Manuela Zucknick Division of Biostatistics, German Cancer Research Center Biometry Workshop,

More information

AVENIO family of NGS oncology assays ctdna and Tumor Tissue Analysis Kits
