Next Generation Sequencing as a tool for breakpoint analysis in rearrangements of the globin-gene clusters
XXXth International Symposium on Technical Innovations in Laboratory Hematology
Honolulu, Hawaii - Hawaii Convention Center, May 4-6, 2017
Barnaby E. Clark 1,2, Claire Shooter 2, Frances Smith 1, David Brawand 1 and Swee Lay Thein 2,*
1 Viapath at King's College Hospital NHS Foundation Trust, London, UK
2 King's College London, Faculty of Life Sciences and Medicine, London, UK
* Present Address: NHLBI / NIH, Sickle Cell Branch, Bethesda, USA
Disclosures No Relevant Financial Relationships with Commercial Interests
Objectives
- What do we need to detect for hemoglobinopathies: genotypic and phenotypic diversity of hemoglobin disorders
- Overview of the sequential process followed historically for DNA diagnostics of the hemoglobin disorders
- What is Next Generation Sequencing: creating and analysing the data
- Evaluation of next generation sequencing (NGS) as a comprehensive single methodology in hemoglobin DNA diagnostics
- Case examples
Hemoglobinopathy Cases Requiring Molecular Diagnosis
- Antenatal assessment: part of UK national screening programme
- Prenatal diagnosis: part of UK national screening programme
- Newborn screening: part of UK national screening programme
- Specialist workup of individual cases
  - Unusual phenotype detected
  - Unusual presentation
  - Phenotype more / less severe than expected
Genomic structure of the clusters of α-like and β-like globin genes on chromosomes 16 and 11.
Schechter AN. Blood 2008;112:3927-3938. © 2008 by American Society of Hematology
Mutations downregulating beta globin gene
[Figure: map of the β-globin gene cluster, 5' to 3': β-LCR (sites 4, 3, 2, 1), ε, Gγ, Aγ, ψβ, δ, β; scale from -30 kb to 80 kb]
- Point mutations
- Deletions restricted to β gene
- Large deletions involving β-LCR with and without β gene
- Trans-acting mutations identified in: GATA-1, TFIIH
α+ thalassemia caused by deletions of one gene
Harteveld CL & Higgs DR. Orphanet J of Rare Dis 2012; 5: 13
α0 thalassemia caused by deletions of both α genes
Harteveld CL & Higgs DR. Orphanet J of Rare Dis 2012; 5: 13
Genetic Testing for Hemoglobinopathies
Phenotype:
- Clinical features
- Family history
- FBC
- Hemoglobin separation, HbA2, HbF
- Osmotic fragility
- EMA labelling
Genotype:
- Sequence HBB gene
- If negative: Gap-PCR HBA cluster, sequence HBA genes
- If negative: MLPA HBB, HBA cluster
- CGH array
Diagnosis:
- Genotype / phenotype consistent
- Or: found mutation but not consistent with phenotype
- Or: no genetic cause identified
Library preparation
Bait Capture (Agilent SureSelect)
- Library preparation: genomic DNA, DNA fragmentation by ultrasonic shearing (fragmented DNA ~400 bp), end repair and A overhang, adaptor ligation, low cycle PCR amplification
- Capture: 16 hour hybridisation with biotinylated RNA oligos; untargeted regions are not retained, leaving the target library
- Sequencing: 300 bp paired end sequencing
Illumina method
Illumina Sequencing
- 1 read from 1 cluster
- Each cluster derived from 1 molecule
- Reads alleles separately
- Indexing identifies origin
- Massive multiplexing: targets, samples
Target region
[Figure: baits designed across the HBB globin cluster]
MiSeq Sequencer
- 2.5-4.5 M reads / sample
[Figure: DNA fragment sequenced from both ends as Read 1 and Read 2]
Reference sequence: AGCTGCTCTAGATAGCTCGATAAAAGCTCCGATATAGTGCATCAGCCAGCGCGCGCAGATAGAAAGAGC
Example aligned read: GCTGCTCTAGATAGCTCGATAAA
SNP Statistics
- Coverage = 10x
- G = 5, T = 5: 50:50, heterozygous
- Deletion = all homozygous
- Duplication = 66:33
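The allele ratios on this slide can be turned into a simple rule of thumb. The sketch below is illustrative only and is not the pipeline used in the talk: the function name `interpret_allele_balance` and the `tolerance` value are assumptions chosen for the example.

```python
def interpret_allele_balance(ref_reads: int, alt_reads: int,
                             tolerance: float = 0.08) -> str:
    """Label a SNP site from its read counts.

    tolerance is a hypothetical cut-off, not a value from the slides.
    """
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no reads cover this position")
    # Fraction of reads carrying the less frequent allele (0.0 to 0.5).
    minor = min(ref_reads, alt_reads) / total
    if minor <= tolerance:
        # Only one allele seen: homozygous, or a single copy left by a deletion.
        return "homozygous"
    if abs(minor - 0.5) <= tolerance:
        # About 50:50, as in the G = 5, T = 5 example.
        return "heterozygous"
    if abs(minor - 1 / 3) <= tolerance:
        # About 66:33, the ratio the slide associates with a duplication.
        return "skewed 66:33"
    return "unclassified"


print(interpret_allele_balance(5, 5))    # G = 5, T = 5 at 10x coverage
print(interpret_allele_balance(10, 0))   # all reads show one allele
print(interpret_allele_balance(20, 10))  # two copies of one allele, one of the other
```

At 10x coverage a single read shifts the ratio by 10 percentage points, so in practice a 66:33 call would need far deeper coverage than this toy example implies.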
Normal control vs patient sample
- Normalisation of coverage within sample by reads per kb per million mapped reads (RPKM)
- Then compare patient samples to diploid controls
- Relative coverage = (coverage in patient, normalised to total reads, RPKM) / (coverage in control, normalised to total reads, RPKM)
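As a worked illustration of the normalisation and ratio described here, the following minimal sketch computes RPKM for one bait region and the log2 relative coverage against a set of diploid controls. The function names and the example numbers are assumptions made for the illustration, not values from the study.

```python
import math
from statistics import mean


def rpkm(read_count: int, region_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of region per million mapped reads."""
    return read_count * 1e9 / (region_length_bp * total_mapped_reads)


def log2_relative_coverage(patient_rpkm: float, control_rpkms: list[float]) -> float:
    """log2 of patient RPKM over the average RPKM of the diploid controls."""
    return math.log2(patient_rpkm / mean(control_rpkms))


# Hypothetical numbers: a 1 kb region in samples with 1 million mapped reads each.
patient = rpkm(50, 1000, 1_000_000)
controls = [rpkm(100, 1000, 1_000_000), rpkm(100, 1000, 1_000_000)]
print(log2_relative_coverage(patient, controls))
```

With two copies in the controls, a heterozygous deletion (one copy) is expected near log2(1/2) = -1 and a single-copy gain (three copies) near log2(3/2), about 0.58.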
Variation in RPKM values across bait tiled region on chromosome 16
[Plot: log2 deviation from control average (y axis, -1.5 to 1.5) against chromosomal position (x axis, 200000 to 250000)]
Plot annotations:
- Bait performance variability: majority of assay variation
- Mutation / CNV event: variation due to insertions, variation due to deletions
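One way to separate isolated bait noise from a copy number event is to require several consecutive baits to deviate in the same direction. The sketch below assumes that approach; the thresholds (`loss`, `gain`) and the minimum run length (`min_baits`) are hypothetical values for illustration, not those used by the authors.

```python
from itertools import groupby


def label_bait(log2_dev: float, loss: float = -0.5, gain: float = 0.3) -> str:
    """Label one bait by its log2 deviation from the control average."""
    if log2_dev <= loss:
        return "loss"
    if log2_dev >= gain:
        return "gain"
    return "normal"


def call_segments(positions, log2_devs, min_baits=3, loss=-0.5, gain=0.3):
    """Return (label, first_position, last_position) for each run of at least
    min_baits consecutive baits that share a non-normal label."""
    labelled = [(pos, label_bait(dev, loss, gain))
                for pos, dev in zip(positions, log2_devs)]
    segments = []
    for label, run in groupby(labelled, key=lambda item: item[1]):
        run = list(run)
        if label != "normal" and len(run) >= min_baits:
            segments.append((label, run[0][0], run[-1][0]))
    return segments


# Hypothetical bait positions and deviations, not data from the talk.
positions = [200000, 210000, 220000, 230000, 240000, 250000]
deviations = [0.05, -0.9, -1.1, -1.0, 0.1, -0.7]
print(call_segments(positions, deviations))
```

In this toy input the three adjacent low baits are reported as one loss, while the single low bait at the end is treated as bait performance variability and ignored.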
Case Example 1
PROBAND - 39-year-old woman (English Anglo-Saxon)
Hypochromic microcytic anemia since infancy, unresponsive to iron supplements

Individual  Age      Hb (g/dl)  RBC (x10^12/L)  MCV (fl)  MCH (pg)  HbA2 (%)  HbF (%)  β/α
Proband     14 yrs   9.9        5.37            58.0      18.4      2.7       0.9      0.26
Proband     25 yrs   7.8        4.02            61.9      19.4      3.0       <1.0     0.59

- Normal HbA2, thalassemic RBC indices; α thalassemia variant excluded
- Globin chain synthesis consistent with phenotype of εγδβ thalassemia
- Southern blotting, MLPA, qPCR and CGH array confirmed deletion from HBG2 exon 2 to β-LCR, BUT it was not possible to characterise the breakpoints
PROBAND - 39-year-old woman (English Anglo-Saxon): family members and further cases

Individual  Age      Hb (g/dl)  RBC (x10^12/L)  MCV (fl)  MCH (pg)  HbA2 (%)  HbF (%)  β/α
Proband     14 yrs   9.9        5.37            58.0      18.4      2.7       0.9      0.26
Proband     25 yrs   7.8        4.02            61.9      19.4      3.0       <1.0     0.59
Daughter    3½ yrs   9.3        5.05            56.6      18.4      3.2       1.0
Mother      56 yrs   8.0        4.96            53.5      16.2      2.6       2.0      0.54
Case 2      52 yrs   6.8        4.53            50.0      15.0      2.5       2.1
Case 3      24 yrs   9.7        5.01            62.7      19.4      2.6       1.0

The daughter of the proband received two intra-uterine blood transfusions, followed by another blood transfusion at birth when she developed neonatal jaundice and anemia
RPKM plot across chromosome 11 in patient
[Plot: coverage difference, control/patient (log2 scale, -2 to 2) against position on chromosome 11 (5150000 to 5450000); gene labels: β, δ, ψ, γ1, γ2, ε, LCR; legend: RPKM, normal variation, deletion]
Problem
- Coverage data indicated a deletion
- Data were consistent with MLPA and CGH array
- SNP data showed only homozygous SNPs in the deletion and heterozygous SNPs in the region outside of the deletion
- BUT we could not amplify across the breakpoint with primers to confirm the deletion
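The SNP evidence described on this slide can be expressed as a small check: inside a heterozygous deletion only one allele remains, so no SNP should look heterozygous there, while heterozygous SNPs can still occur outside. The sketch below is an assumption-laden illustration; the function name, the `het_band` limits and the example coordinates are hypothetical.

```python
def snps_support_deletion(snps, start, end, het_band=(0.3, 0.7)):
    """Check SNP zygosity against a candidate deletion interval.

    snps: list of (position, alternate_allele_fraction) tuples.
    het_band: hypothetical range of allele fractions treated as heterozygous.
    """
    def is_het(fraction):
        return het_band[0] <= fraction <= het_band[1]

    inside = [f for pos, f in snps if start <= pos <= end]
    outside = [f for pos, f in snps if not (start <= pos <= end)]
    # Supportive pattern: SNPs exist inside, none of them heterozygous,
    # and at least one heterozygous SNP is seen outside the interval.
    return (bool(inside)
            and not any(is_het(f) for f in inside)
            and any(is_het(f) for f in outside))


# Hypothetical SNP calls around a candidate interval 2000-3000.
example_snps = [(1500, 0.48), (2200, 1.0), (2600, 0.0), (2900, 0.97), (3500, 0.52)]
print(snps_support_deletion(example_snps, 2000, 3000))
```

This only supports a copy number call; as the slide notes, it does not locate the breakpoints.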
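Alignment information can be used to identify deletions or translocations
[Figure: paired reads aligned to the reference sequence, with the observed fragment size compared against the expected size]
- Opposite direction reads with a fragment size larger than expected: indicates a deletion?
- These opposite direction reads did not align in our patient sample near the predicted breakpoints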
Alignment information can be used to identify inversions
[Figure: paired reads aligned to the reference sequence, showing fragment size]
- Same direction reads? Inversion
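The read pair signatures on these two slides can be summarised in a short classifier. This is a sketch under stated assumptions rather than the authors' pipeline: the `ReadPair` structure, the `expected_fragment` of 400 bp (taken from the library slide) and the `tolerance` are illustrative choices, and the span between leftmost positions is used as a rough stand-in for fragment size.

```python
from dataclasses import dataclass


@dataclass
class ReadPair:
    chrom1: str
    pos1: int        # leftmost aligned position of read 1
    forward1: bool   # True if read 1 aligns to the forward strand
    chrom2: str
    pos2: int        # leftmost aligned position of read 2
    forward2: bool   # True if read 2 aligns to the forward strand


def classify_pair(pair: ReadPair, expected_fragment: int = 400,
                  tolerance: int = 200) -> str:
    """Label a read pair by mate chromosome, orientation and spacing."""
    if pair.chrom1 != pair.chrom2:
        return "possible translocation"
    if pair.forward1 == pair.forward2:
        # Both mates on the same strand: same direction reads.
        return "possible inversion"
    left, right = sorted([(pair.pos1, pair.forward1), (pair.pos2, pair.forward2)])
    if not left[1]:
        # Mates point away from each other instead of towards each other.
        return "discordant orientation"
    span = right[0] - left[0]  # approximate fragment size on the reference
    if span > expected_fragment + tolerance:
        return "possible deletion"
    return "concordant"


# Hypothetical pairs on chromosome 11.
print(classify_pair(ReadPair("11", 5215000, True, "11", 5215300, False)))
print(classify_pair(ReadPair("11", 5215500, True, "11", 5274900, False)))
print(classify_pair(ReadPair("11", 5215500, True, "11", 5274900, True)))
```

A real implementation would read strand and mate information from the alignment file and would require several independent pairs before suggesting a breakpoint.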
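[Figure, panel A: reference arrangement of the region on chromosome 11 (scale marks 5,250,000, 5,300,000 and 5,400,000). Features shown: P1, LTR, LTR, LINE 1, P2, OR51V1, HBB, HBD, HBP1, HBG1, P3, HBG2, HBE, LCR, LINE 2, P4, OR51M1]
- Breakpoint coordinates: 5,215,690; 5,274,684; 5,397,195
- Segment labels and sizes as extracted from the figure: Deletion, 59 kb, Inversion, 122,511 bp
[Figure, panel B: rearranged allele. Features shown: LTR, HBG2, HBG1, HBP1, HBD, inverted sequence, HBB, OR51V1, LINE 1, LINE 2, OR51M1, with primers P1, P3, P2, P4]
- Gap PCR products: 1031 bp and 4499 bp
Note: not drawn to scale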
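The segment sizes quoted in the figure follow from the three breakpoint coordinates by simple subtraction. The short check below assumes plain coordinate differences (no inclusive-end adjustment) and does not assume which segment is the deletion and which the inversion.

```python
def segment_lengths(coordinates):
    """Distances between consecutive breakpoint coordinates, in bp."""
    ordered = sorted(coordinates)
    return [right - left for left, right in zip(ordered, ordered[1:])]


breakpoints = [5_215_690, 5_274_684, 5_397_195]
lengths = segment_lengths(breakpoints)
print(lengths)                                     # sizes in bp
print([round(length / 1000) for length in lengths])  # sizes in kb, rounded
```

The first difference is 58,994 bp, matching the rounded 59 kb label, and the second is 122,511 bp, matching the figure exactly.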
Gel A: Inversion Gap PCR, P1 / P3, 1031 bp
Lanes: 1) 1kb plus ladder 2) Blank 3) Normal Control 1 4) Normal Control 2 5) Case 2 6) Case 3 7) Proband, Case 1 8) Daughter of Proband in Case 1

Gel B: Deletion Gap PCR, P2 / P4, 4499 bp
Lanes: 1) 1kb plus ladder 2) Proband, Case 1 3) Normal Control 4) Daughter of Proband in Case 1 5) Case 2 6) Case 3 7) Blank
3 Families: 3 unique duplications of the α-globin cluster
All β thal carriers who have co-inherited the α-globin duplication have a thalassemia intermedia phenotype
Clark et al, BJH 2016; doi: 10.1111/bjh.14294.

Family 2
Individual  Hb (g/l)  MCV (fl)  MCH (pg)  HbA2 (%)  HbF (%)  HBB                            HBA
I1          159       80.1      28.2      2.7       0.2      sequence normal                αα/αα,αα
I2          110       58.9      19.4      4.8       0.7      Het c.135delC (Cd 44 -C del)   αα/αα
II1         59        65.6      19.6      3.1       13.6     Het c.135delC (Cd 44 -C del)   αα/αα,αα
II2         85        68.9      20.7      4.2       15.2     Het c.135delC (Cd 44 -C del)   αα/αα,αα
NGS coverage data: Family 2 (proband)
[Plot: deviation of RPKM from control average (log2, -2 to 2) against position on chromosome 16 (50000 to 500000); annotations: Gap PCR Primer 1, Gap PCR Primer 2, Duplication]
[Gel: marker sizes (bp) 10,000, 5000, 2,000, 1,500, 1,000, 750, 500, 250; ladder (L) and lanes 1 to 5]
[Sequence trace: sequence at end of balanced region, ambiguous bases, sequence at start of duplicated region]
Family 2: 120,500 bp duplication
[Figure: gene map of the duplicated region of the α-globin cluster]
Gene order shown: WASIR2, MIR6859, POLR3K, SNRNP25, RHBDF1, MPG, NPRL3, HBZ, HBM, HBA2, HBA1, HBQ1, LUC7L, NPRL3, HBZ, HBM, HBA2, HBA1, HBQ1, LUC7L, FAM234A, RGS11, ARHGDIG, PDIA2, AXIN1
Case example 3: Shooter et al, Brit. J. Haemat. 2015; 70: 123-138
Summary
- We have developed a comprehensive single NGS methodology that can fully characterize all types of variants by analysis of a single data set: SNPs, deletions, insertions, inversions and rearrangements of any size
- Exception: the 3.7 kb α-globin deletion cannot be identified accurately using NGS, as it is currently not possible to uniquely map the reads back to the reference sequence
- Using this NGS analysis pipeline, we have characterized a novel rearrangement of the HBB cluster, responsible for εγδβ thalassemia in an English family, as well as α-globin cluster and β-globin cluster duplications
- We have automated the library preparation to reduce noise in the assay, and have also automated the bioinformatics pipeline analyses
- With time, it should be possible to apply NGS to routine diagnostics, including newborn and antenatal screening