Title: Human breast cancer associated fibroblasts exhibit subtype specific gene expression profiles

Author's response to reviews

Authors:
Julia Tchou (julia.tchou@uphs.upenn.edu)
Andrew V Kossenkov (akossenkov@wistar.org)
Lisa Chang (lchang@wistar.org)
Celine Satija (celine.satija@gmail.com)
Meenhard Herlyn (herlynm@wistar.org)
Louise C Showe (lshowe@wistar.org)
Ellen Pure (pure@wistar.org)

Version: 3
Date: 26 May 2012

Dear Editor-in-chief,

We hereby submit our revised manuscript entitled "Human breast cancer associated fibroblasts exhibit subtype specific gene expression profiles" for consideration for publication as an original article. We appreciate the reviewers' comments, and our point-by-point response is as follows:

REVIEWER1

Major Compulsory Revisions:

1. The manuscript does not adhere to the standards for reporting and data deposition.

Response: The manuscript is now formatted according to the guidelines of BMC medical genetics.

2. There is no reference to deposit of microarray data in GEO (Gene Expression Omnibus), or equivalent, and thus may not be MIAME compliant (see http://www.ncbi.nlm.nih.gov/geo/info/miame.html)

Response: The data have been uploaded to the GEO database and are available under accession GSE37614 (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?token=ptopjamaoomwilm&acc=gse37614). Corrections were made in the Methods section to note that: The data was submitted to the GEO database (http://www.ncbi.nlm.nih.gov/geo/) and is available by using accession number GSE37614.

Minor Essential Revisions:

1. In the materials and methods section, RNA purification and microarrays, the authors state 28S/16S ratio (16S is bacterial), so I believe they mean '18S'. Also, a ratio of >0.75 is rather low when '2' is optimal, particularly for RNA isolated from cell culture, and may be reflective of RNA degradation. Do the authors have RIN from analysis? If so please report.

Response: 16S is indeed correct. The 0.75 is actually the minimal acceptable RIN number and not the 260/280! This is now corrected.

Discretionary Revisions:

1. Intro. first sentence; suggest changing "which" to "that"
2. Intro. Authors define Her2-neu as (Her2) in first sentence, and then use 'Her2-neu' in second sentence. In third sentence they define 'Her2-neu' as (Her2+)...this needs to be consistent and defined one time.
3. Intro. page 2, line 9. Word "within" should be changed to "among" or "between".
4. Intro. page 2, first paragraph, last sentence needs a "."
5. Page 4, second line; "P" should be lowercase in platelet.
6. Page 4, first sentence of 2nd paragraph...the way the sentence is written, it reads as if "RNA" was assigned to one of 2 sample sets, when it is the gene expression analyses that were assigned to sample sets.
7. Page 4, third sentence of 3rd paragraph has 2 ".."
8. Materials and methods, patient characteristics section; TNBC description should be "lacks expression of ER, PR AND Her2-neu"
9. Figure 4 legend..."most" should be changed to "greatest differences"
10. Figure 5 legend, insert "microarray" in front of results.

Response: The authors appreciate the reviewer's comments, and all of the above recommended changes have been inserted into the manuscript as indicated.

REVIEWER2

Major points

1. Tumor specimens of different size were processed differently in order to obtain the CAF cultures. What was the rationale behind using different strategies, especially since both procedures involved an initial mincing step and since this conceivably may yield distinct subpopulations of CAFs? The procedure used for each tumor specimen needs to be specified.

Response: We appreciate the reviewer's comments. The rationale for the difference in tissue dissociation protocol is to retrieve as many cells as possible with the least amount of processing time. We have added the 5 tissues (TB160-165) that were processed differently on p. 4 under the Tissue dissociation and cell culture section. There were minimal batch effects due to sample processing and array analyses, and we applied batch normalization as detailed on p. 5, section RNA purification and Microarrays.

2. The in vitro culture step may introduce abnormalities, in particular since fibroblasts/CAFs are known to be highly plastic cells. The authors should provide validation of the gene expression differences in CAFs of different subtype using qPCR on freshly isolated materials.

Response: We appreciate the reviewer's comments. We did not separate stromal cells from non-stromal cells during the tissue dissociation protocol and therefore we did not have the cells to do the recommended assays. Instead, we use cell culture to select for stromal cells and use stromal cells with low passage numbers (between 2-4) to minimize culture artifacts. We have also validated the differential expression of PTGIS in Her2+ cancer and TNBC as shown in the figure below.

Figure 1. Validation of differential expression of PTGIS in tumor stroma of TB117 (Her2+) (left panel) and TB125 (TNBC) (right panel). Immunofluorescence staining using anti-PTGIS antibodies was used to confirm the relatively lower expression in Her2+ as seen in the heatmap. The stroma-rich regions (white arrows) are highlighted by an increased level of expression (middle panel). PTGIS appears to be expressed in both epithelial cells and stromal cells, but its expression within the stroma is less robust in Her2+ cancer (white arrows, bottom panel).

3. The authors need to specify the stage of each tumor, and whether this parameter is different between tumor subtypes to rule out that the observed difference in gene expression is related to the invasive behavior of the tumor.

Response: We did compare the tumor stage and other tumor characteristics. Results are summarized in Table 1 and stated on page 7 under Results, Isolation of CAFs from fresh human breast cancer samples. Briefly, we did not see any differences in age, tumor size, or nodal status; the only difference was in tumor grade (both triple negative and Her2+ breast cancer have significantly higher tumor grade than ER+ tumors).

4. Flow cytometry using only one marker of each cell type is a blunt instrument to determine the purity of the isolated cell cultures. The authors should utilize qPCR to control for contaminating tumor cells that may have acquired mesenchymal-like properties through EMT.

Response: We appreciate the reviewer's comment and acknowledge that our results may be contaminated by tumor cells that may have acquired mesenchymal-like properties through EMT. This was indeed a limitation of our study. However, when we compared expression of several biomarkers of EMT[1] such as N-cadherin, syndecan, and several integrins, there were no significant differences in the level of expression among all cultured fibroblasts. Therefore, we felt that the possible contamination of tumor cells that had gone through EMT did not contribute significantly to our results.

5. The main finding of the paper that CAFs from different subtypes of breast cancer have a distinct transcriptional profile is very interesting. How much of the whole tumor classification of tumor subtypes can be explained by the variation in CAF gene expression?
And can the authors distinguish the tumor subtype by assessing expression of the CAF signature on whole tumor material?

Response: We checked 3 gene expression datasets for breast tumor samples present in the GEO database that had Her2 status reported and included both positive and negative classes: GSE3744, GSE5764 and GSE16873. None showed significant differences, in the whole tumors, in expression of the genes identified from the CAF comparison. We decided not to include those results in the manuscript because of the few samples in GEO and since they did not add to our results.

6. While it is recognized that this is mainly a genomics study, it is important even within this field to provide functional insight. Based on the gene expression profile, the authors speculate about the importance of CAFs from the HER2+ subtype in assisting tumor cell invasion. Several questions are raised by this speculation, in particular since the CAFs are derived from the tumor center, presumably far from the invasion front of the tumor: Were the HER2+ tumors used in this study of higher stage and/or more locally invasive than TNBC and ER+ tumors? Are CAFs derived from HER2+ tumors superior at stimulating tumor cell invasiveness in coculture assays?

Response: Following the reviewer's recommendation, we have compared the migration properties of CAFs from the three breast cancer subtypes. Results are shown in the adjacent figure, confirming that fibroblasts derived from Her2+ tumors have increased mobility. These increased migration properties are likely due to the differential gene expression described in our discussion. We have added this result to our Results section.

REVIEWER3

Major Compulsory Revisions

1. Page 4, paragraph 2, line 3: Explicitly indicate how many training samples were used and how many validation samples were used.

Response: This is now more clearly indicated.

2. Page 4, paragraph 2, line 4: p-values from training set need to be adjusted for multiple testing such as the Benjamini-Hochberg corrected p-values reported in Table 3.

Response: We appreciate the reviewer's comment. We feel that adjusting for multiple testing in this case is unnecessary, as we used a more rigorous approach, a training set followed by independent testing on new samples, to address the problem of false positive results. The repeat performance of the genes selected on the training group on the completely independent testing set samples shows that the genes (as a group) identified on the training set identify true differences between Her2+ CAFs and CAFs from the other 2 classes.

3. Page 4, paragraph 2, line 8: The variation accounted for by the first principal component (49%) is the overall variation, not the variation between subtypes. Likewise for the variation accounted for by the second PC. In fact, sample type is not used in the PC analysis at all. As such, it is an unsupervised method.

Response: In this paragraph we tried to emphasize that even with unsupervised PCA, the 1st principal component, which accounts for 49% of the overall variation, coincides with the separation of subtypes (Her2+ vs others).
We have rephrased the statement to be more precise.

4. Page 5, paragraph 1, line 4: A false discovery rate of 28% is unacceptably high. The p-value threshold needs to be adjusted to bring the FDR down to a more respectable 5 to 10%.

Response: We appreciate the reviewer's concern. We want to clarify that we used the p<0.05 list of 1829 genes, with its 28% false discovery rate, only for enrichment analysis. Enrichment analysis does not depend heavily on any particular FDR cutoff used for gene selection, but is based mainly on gene ranks. For example, the widely popular Gene Set Enrichment Analysis (GSEA) software does not use any significance (FDR) cutoffs at all, but rather bases the results on ranks of all the genes, whether significant or not. We felt that using the gene list with an FDR of 28% was acceptable for the purpose described.

5. Page 8, paragraph 3: A statistical sample size justification needs to be provided to ensure readers that a sufficient number of samples were included to yield robust results.

Response: Because the training sample size is small, we used a rigorous assessment of the original observation in an independent validation set to confirm that the results found in the training set were not specific to the training samples. We felt this addresses the robustness of the results.

6. Figure 2B: The failure of the ER+ and TNBC validation samples to properly segregate apart indicates a failure to validate the results from the training set. The only part that was validated was the ability to separate out the HER2+ samples.

Response: We agree with the reviewer's comment, and we have emphasized the reviewer's observation in both the results section and the discussion, illustrating that it is only the distinction of HER2+ CAFs from the other 2 subtypes which is significant in our studies.

7. Table 1: Due to the small numbers of samples, these comparisons have low statistical power and thus a high false negative rate. Thus it is impossible to interpret the lack of significance.

Response: We feel that providing test results for all available clinical characteristics is still useful to interested investigators. The table indicates differences in the assayed parameters that reach significance in this particular study. While we agree that absence of significance does not definitively mean that there is no difference, it is still useful to know the results of the comparison.

Minor Essential Revisions

1. Figure 5: What measure of uncertainty is represented by the whiskers on the bar graphs?

Response: Error bars represent the standard error of the mean for each tumor type group. We corrected the figure legend to reflect that.

2. Page 4, paragraph 3, line 6: TNBC+ should just be TNBC.

Response: Corrected, thank you.

3. Table 2: Omit the Full name column since it adds nothing to the table.

Response: Corrected.
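
Note on the multiple-testing points (Reviewer 3, Major Compulsory Revisions 2 and 4): the two quantities under discussion can be illustrated with a short sketch. The Python below is a minimal, hypothetical illustration and not the analysis code used for the manuscript. `benjamini_hochberg` computes Benjamini-Hochberg adjusted p-values, and `estimated_fdr` gives the common rough estimate of the FDR at a fixed p-value cutoff (cutoff times number of tests, divided by number of genes selected), which assumes that nearly all tested genes are null. The function names, the toy p-values, and the count of 10,000 tests are assumptions made for illustration; only the p<0.05 cutoff and the 1829-gene list come from the response above.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def estimated_fdr(alpha, n_tests, n_selected):
    """Rough FDR at a fixed cutoff: expected false positives / genes selected."""
    if n_selected == 0:
        return 0.0
    return min(1.0, alpha * n_tests / n_selected)


if __name__ == "__main__":
    toy_p = [0.01, 0.04, 0.03, 0.005]  # made-up p-values, illustration only
    print(benjamini_hochberg(toy_p))
    # n_tests=10000 is a hypothetical count; 0.05 and 1829 are from the letter.
    print(round(estimated_fdr(0.05, 10000, 1829), 3))
```

Under these assumptions, a reader could apply `benjamini_hochberg` to the training-set p-values and keep the genes whose adjusted value falls below the 5 to 10% range the reviewer suggests.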