Supplementary Figure S Characterization of unannotated transcripts. (A) CPAT coding probability scores and (B) cumulative distribution of ORF length are plotted for each category: protein-coding genes, known RNA genes, known lncRNAs, and novel lncRNAs. (C) Expression level distribution of novel lncRNA genes, known RNA genes, recently discovered, and protein-coding genes. (D) Repetitive content analysis. Stacked bars represent the percentage of nucleotides covered by various transposable element families for known and novel lncRNAs.
[Figure: (A) y-axis: Coding probability. (B) x-axis: Size; y-axis: Ratio of sequences with ORF <= Size. (C) Expression Levels by Class (Known_RNA, LncRNA, Novel, Protein); axes: Expression (FPKM), Quantity of Transcripts. (D) Transcript coverage by Family (LINE/L, SINE/Alu, SINE/MIR, LTR/ER, LTR/ERV, LTR/ERVL, DNA/hAT-Charlie, Others) for Known and Novel.]
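The quantity plotted in panel B, the ratio of sequences with ORF <= Size, is an empirical cumulative distribution. The following is a minimal sketch of that calculation, not the authors' code; the function name `orf_length_ecdf` and the example lengths are hypothetical.

```python
from bisect import bisect_right


def orf_length_ecdf(orf_lengths):
    """Return a function giving the fraction of sequences with ORF length <= size."""
    ordered = sorted(orf_lengths)
    total = len(ordered)

    def ratio_at(size):
        # bisect_right counts how many sorted lengths are <= size.
        return bisect_right(ordered, size) / total

    return ratio_at


# Hypothetical ORF lengths for one transcript category (illustrative values only).
example_lengths = [30, 60, 60, 120]
ratio_at = orf_length_ecdf(example_lengths)
```

Evaluating `ratio_at` over a grid of sizes, once per transcript category, would give one curve per category.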
Supplementary Figure S Differentially expressed lncRNAs in lung cancer. (A) Schematic showing the filtering steps used in our lncRNA differential expression pipeline. Heatmaps show the differentially expressed lncRNAs in (B) the Seo cohort and (C, D) the TCGA cohorts. Although only the paired tumors and normal tissues were used in the differential expression analysis, the expression of the unpaired tumors is also shown for the TCGA cohorts.
[Figure, panel A, pipeline steps: all RefSeq noncoding RNAs; merge ncRNAs from GENCODE, Ensembl, UCSC, Human Body Map, and novel; remove single-exon transcripts; merge and remove transcripts < nt; remove transcripts overlapping a RefSeq protein-coding gene or Ensembl pseudogene; FPKM matrix from Cufflinks and read count matrix from BedTools; filter out lowly expressed transcripts (at least 7% of samples with FPKM <. or count < ); merge overlapping transcripts, keeping the transcript with the largest average FPKM; differential expression analysis using edgeR. Panels B-D: heatmaps with samples on the x-axis, grouped as Normal, Tumor, and (C, D) Unpaired Tumor.]
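The filtering steps in panel A can be sketched in a few lines. This is an illustrative sketch only, not the pipeline used in the study: the `Transcript` record, the function names, and every threshold value are hypothetical, because the actual cutoffs are not legible in the schematic. It ignores strand and the read-count matrix and covers only the single-exon, length, low-expression, and overlap-merging steps.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Transcript:
    name: str
    chrom: str
    start: int      # genomic start (half-open interval)
    end: int        # genomic end
    n_exons: int
    length: int     # spliced transcript length in nt
    fpkm: list      # one FPKM value per sample


def is_lowly_expressed(t, min_fpkm, low_fraction):
    """True if at least `low_fraction` of samples fall below `min_fpkm`."""
    n_low = sum(1 for v in t.fpkm if v < min_fpkm)
    return n_low / len(t.fpkm) >= low_fraction


def filter_transcripts(transcripts, min_length, min_fpkm, low_fraction):
    """Drop single-exon, short, and lowly expressed transcripts."""
    return [
        t for t in transcripts
        if t.n_exons > 1
        and t.length >= min_length
        and not is_lowly_expressed(t, min_fpkm, low_fraction)
    ]


def merge_overlapping(transcripts):
    """Per cluster of overlapping transcripts, keep the highest mean FPKM."""
    kept = []
    cluster_chrom, cluster_end = None, -1
    for t in sorted(transcripts, key=lambda x: (x.chrom, x.start)):
        if t.chrom == cluster_chrom and t.start < cluster_end:
            # Same cluster: extend it and keep the better-expressed transcript.
            cluster_end = max(cluster_end, t.end)
            if mean(t.fpkm) > mean(kept[-1].fpkm):
                kept[-1] = t
        else:
            # New cluster starts here.
            kept.append(t)
            cluster_chrom, cluster_end = t.chrom, t.end
    return kept


# Hypothetical input; all values are made up for illustration.
candidates = [
    Transcript("a", "chr1", 100, 500, 2, 300, [5, 6, 7, 8]),
    Transcript("b", "chr1", 400, 900, 3, 400, [1, 1, 1, 9]),   # lowly expressed
    Transcript("c", "chr1", 450, 800, 2, 350, [9, 9, 9, 9]),   # overlaps "a"
    Transcript("d", "chr1", 1000, 1500, 1, 500, [9, 9, 9, 9]), # single exon
    Transcript("e", "chr1", 2000, 2500, 2, 100, [9, 9, 9, 9]), # too short
    Transcript("f", "chr2", 100, 300, 2, 250, [3, 3, 3, 3]),
]
filtered = filter_transcripts(candidates, min_length=200, min_fpkm=2, low_fraction=0.5)
final = merge_overlapping(filtered)
final_names = [t.name for t in final]
```

The surviving transcripts would then be passed to a differential expression tool such as edgeR, which is outside the scope of this sketch.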
Supplementary Figure S3 Subtype-specific lncRNAs. Heatmap of lncRNAs differentially expressed between TCGA lung adenocarcinoma and lung squamous cell carcinoma tumors.
[Figure: heatmap; x-axis: Samples; group labels (n=97) and (n=96).]
Supplementary Figure S4 Lung cancer cell line validation. qPCR validation of six LCALs across a panel of lung cancer cell lines (n = 8) relative to a control cell line, BEAS-B, and normalized to the housekeeping gene RPL3. All error bars show the mean +/- standard error across n = 3 biological replicates.
[Figure: six bar-chart panels, one per LCAL; y-axis: Log fold change (LCAL/RPL3); x-axis: cell lines (BEASB, A49, HOP6, H, HOP9, H46).]
Supplementary Figure S LCAL expression in lung cancer. Coverage maps showing the average expression levels of tumor and normal samples across all three lung cancer cohorts for (A) LCAL8 (FENDRR), (B) LCAL8 (ESCCAL-; known as CASC9 in RefSeq), and (C) LCAL8 (CCAT). Annotated RefSeq (dark blue), UCSC (light blue), and full-length transcripts as determined by 5' and 3' RACE in cell line (black) are shown below each plot. qPCR validation in an independent cohort of human adenocarcinoma and matched controls and squamous cell carcinoma and matched controls is shown for (D) LCAL8, (E) LCAL8, and (F) LCAL8. Inset tables distinguish high and low expression of LCALs using the cutoff value denoted by the dotted line. Because LCAL8 and LCAL8 are differentially expressed in the cohort only, the inset tables were calculated separately for the two subtypes. These results further demonstrate that LCAL8 and LCAL8 are broadly over-expressed in lung squamous cell carcinomas but not adenocarcinomas.
[Figure: (A-C) coverage maps; y-axis: Average Read Depth; tracks for normal (N) and tumor (T) samples per cohort, including Seo N and Seo T; scale bar in kb. (D-F) y-axis: Relative Expression (LCAL8/RPL3); groups: AD Normal, AD Tumor, SQ Normal, SQ Tumor; inset tables of +/- by Normal/Tumor with p-values.]
Supplementary Figure S6 Association between LCAL expression and mutation status. Expression levels of (A) LCAL, (B) LCAL8, (C) LCAL4, (D) LCAL38, (E) LCAL74, and (F) LCAL84, measured by log FPKM, for wild type (black) and mutant (colored) samples. Data points are ordered by expression levels and symbols designate cohort (squares for, circles for ). Thick colored lines represent the median expression level for each group. P-values for each mutational association are also reported (*: FDR <., **: FDR <.).
[Figure: panels A-F; y-axis: LCAL Expression; mutation groups as labeled: (A) NFEL, KEAP, NFEL & KEAP; (B-D) TP3; (E) HGF; (F) CDKNA; P-values annotated per panel.]
Supplementary Figure S7 Conservation of LCAL. The UCSC schematic shows a lack of Pfam domains or conserved RNA secondary structures predicted by EvoFold. ENCODE data show DNaseI hypersensitivity and transcription factor binding in the promoter of LCAL. Evolutionary conservation, using PhyloP, does not show any strong base-pair conservation within LCAL. Multiz alignments across vertebrates reveal LCAL sequence similarity restricted to the majority of primates.
[Figure: UCSC Genome Browser view of the LCAL locus on chr6 with tracks: RefSeq Genes; Pfam Domains in UCSC Genes; EvoFold Predictions of RNA Secondary Structure; Digital DNaseI Hypersensitivity Clusters from ENCODE; Transcription Factor ChIP-seq from ENCODE with Factorbook Motifs; Basewise Conservation by PhyloP; Multiz Alignments of Vertebrates.]
Supplementary Figure S8 Nuclear localization of LCAL. Nuclear and cytosolic fractionation of lysates indicates high expression of LCAL in the nucleus in cells. GAPDH and MT-RNR were used as positive controls for cytosolic gene expression and U6 was used as a positive control for nuclear gene expression. qPCR results are relative to total RNA and normalized to the housekeeping gene RPL3. All error bars show the mean +/- standard error across three biological replicates in two independent experiments.
[Figure: bar chart; y-axis: Relative Expression (gene/RPL3), in percent; cytoplasmic and nuclear fractions for GAPDH, U6, MT-RNR, and LCAL.]
Supplementary Figure S9 Lung cancer cell line validation. qPCR validation of LCAL across a panel of squamous carcinoma cell lines (n=) relative to a control cell line, BEAS-B, and normalized to the housekeeping gene RPL3.
[Figure: bar chart; y-axis: Log Fold Change (LCAL/RPL3); x-axis: cell lines (Beas, H73, Calu-, SK-MES-, SW9, HCC9).]
Supplementary Figure S LCAL expression affects cellular proliferation. After 7h transfection, cells were seeded in a 96-well plate at 3, cells/well. On the indicated days, Alamar Blue reduction was measured by fluorescence. Fluorescence was normalized to the mean of the scrambled control. All error bars show the mean +/- standard error across n=4 biological replicates in two independent experiments. * P <., ** P <., P <. by a two-tailed Student's t-test.
[Figure: two sets of panels, one labeled HCC9. Proliferation panels: y-axis: Percent Normalized Alamar Blue (fluorescence); x-axis: Day, Day 4, Day 6; series: control and two LCAL siRNAs. Knockdown panels: y-axis: Relative Expression (LCAL/RPL3); bars: control and two LCAL siRNAs.]
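The normalization and error bars described in this legend can be illustrated with a short calculation. This is a hedged sketch under assumed inputs, not the authors' analysis script: the function names and the fluorescence readings are hypothetical, and the t-test itself is omitted.

```python
from math import sqrt
from statistics import mean, stdev


def percent_of_control(values, control_values):
    """Express each reading as a percentage of the mean control reading."""
    control_mean = mean(control_values)
    return [100.0 * v / control_mean for v in values]


def mean_and_sem(values):
    """Return the mean and the standard error of the mean (sample SD / sqrt(n))."""
    return mean(values), stdev(values) / sqrt(len(values))


# Hypothetical fluorescence readings for n=4 replicates (illustrative values only).
scrambled_control = [100.0, 110.0, 90.0, 100.0]
sirna_treated = [50.0, 60.0, 40.0, 50.0]

normalized = percent_of_control(sirna_treated, scrambled_control)
avg, sem = mean_and_sem(normalized)
```

Here `avg` and `sem` correspond to the bar height and error bar for one siRNA condition on one day; the same calculation would be repeated per condition and time point.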