Deep-Sequencing of HIV-1: The quest for true variants. Alexander Thielen, Martin Däumer, 09.05.2015
Limitations of drug resistance testing by standard sequencing
Workflow: blood plasma -> RNA extraction -> reverse transcription / polymerase chain reaction (cDNA) -> nested PCR (optional) -> PCR products -> purification (optional) -> sequencing reaction -> purification -> sequencer
Sensitivity for minor variants: >20%
Ultra-deep sequencing
Standard Sanger sequencing:
...PQIYMDDHTRE...
Ultra-deep sequencing:
...PQIYMDDHTRE...
...PQIYMDDHTRE...
...PQIYVDDHTRE...
...PQIYMDDHTRE...
...PQIYMDDHTRE...
...PQIYMDDHTRE...
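The read stack on this slide can be turned into per-position variant frequencies. The following is a minimal sketch, not the pipeline used in the talk: it assumes the reads are already aligned and of equal length, and the function names `variant_frequencies` and `minor_variants` are made up for illustration.

```python
from collections import Counter

# Toy reads copied from the slide: six reads covering the same region.
reads = [
    "PQIYMDDHTRE",
    "PQIYMDDHTRE",
    "PQIYVDDHTRE",
    "PQIYMDDHTRE",
    "PQIYMDDHTRE",
    "PQIYMDDHTRE",
]


def variant_frequencies(aligned_reads):
    """Return, per position, a dict mapping each residue to its read fraction.

    Assumption: reads are pre-aligned and have equal length.
    """
    depth = len(aligned_reads)
    return [
        {residue: count / depth for residue, count in Counter(column).items()}
        for column in zip(*aligned_reads)
    ]


def minor_variants(aligned_reads):
    """List (position, residue, fraction) for every non-majority residue."""
    found = []
    for pos, freqs in enumerate(variant_frequencies(aligned_reads), start=1):
        majority = max(freqs, key=freqs.get)
        for residue, fraction in freqs.items():
            if residue != majority:
                found.append((pos, residue, fraction))
    return found


for pos, residue, fraction in minor_variants(reads):
    print(f"position {pos}: {residue} at {fraction:.1%}")
```

In this toy stack the V variant is carried by 1 of 6 reads (about 17%), which is below the >20% sensitivity quoted for standard sequencing.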
Next generation sequencing (amplified single-molecule sequencing)
- 454 Sequencing / Roche: GS Junior System, GS FLX+ System
- Illumina: HiSeq systems, Genome Analyzer IIx, MiSeq, NextSeq
- Life Technologies: SOLiD 5500 System, SOLiD 5500xl System, Ion Torrent PGM, Proton
Third generation sequencing (single-molecule sequencing)
- Helicos: Helicos Genetic Analysis System
- Pacific Biosciences: PacBio RS
- Oxford Nanopore Technologies: GridION System, MinION
The players
Most important... iPhone docking station
Specifications
MinION Nanopore technology
MinION Nanopore technology: HIV-1 pol amplicon, 1.35 kb
Outlook
Illumina systems
MiSeq sequencing instrument
- Illumina's benchtop sequencer
- easy sample preparation
- no homopolymer problems
- current specification: up to ~15 Gb output (10 x FLX+, >100 x 454 GS Junior)
- 2 x 20 million reads of up to 2 x 300 nt length
- 37 h run time (2 x 250 nt)
MiSeq Personal Sequencing System
Library preparation using Nextera(TM) XT tagmentation
- easy library preparation
- fast: less than 20 minutes hands-on time
- only 1 ng DNA per sample needed
- indices for up to 96 samples
- normalization step included
Drug resistance testing using Illumina's MiSeq
Experimental setup: HIV genome (PR/RT, IN, ENV, whole genome)
Fragmentation
Putting things together: mapping to PR/RT, IN, and ENV reference sequences
Coverage: PR/RT and ENV; ~7500 full V3 loops
Reproducibility
PR/RT: number of resistance mutations found (chart values: 236, 230, 239, 252, 275, 324, 499)
PR/RT resistance interpretation: ATV/r
[Chart: number of samples per category (limited susceptibility, intermediate, resistance) at cutoffs 1%, 2%, 5%, 10%, 15%, 20% and Sanger; bar labels as extracted: 18 9 8 4 6 6 1 1 5 6 5 4 4 4 4 2 2 2 2 2 3]
Interpretation according to
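The cutoff axis of the chart can be illustrated with a small sketch that calls mutations at each frequency threshold. The cutoffs are the ones on the slide; the sample frequencies and the helper name `mutations_above` are invented for illustration and are not data from the talk.

```python
# Cutoffs taken from the slide (1% to 20%).
CUTOFFS = [0.01, 0.02, 0.05, 0.10, 0.15, 0.20]


def mutations_above(frequencies, cutoff):
    """Return the mutations whose read frequency is at or above the cutoff."""
    return sorted(m for m, f in frequencies.items() if f >= cutoff)


# Hypothetical sample: mutation -> fraction of reads carrying it (invented values).
hypothetical_sample = {"M184V": 0.35, "K103N": 0.08, "M46L": 0.015}

for cutoff in CUTOFFS:
    called = mutations_above(hypothetical_sample, cutoff)
    print(f"{cutoff:.0%}: {len(called)} mutation(s) {called}")
```

Lowering the cutoff can only add mutations to the call set, so the resistance interpretation of a sample may change with the chosen threshold.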
Number of sequencing errors, substitutions, deletions, and insertions Archer J et al., 2012, PLoS ONE 7: e49602. doi:10.1371/journal.pone.0049602
Workflow
Shared steps: RNA/DNA -> total NA extraction -> RT-PCR / nested PCR
NGS: library preparation (fragmentation & indexing) -> sequencing reaction -> analysis
Sanger: Taq-cycle reaction, sequencing -> analysis
Workflow: potential error sources
- RNA vs. DNA: viable vs. non-viable
- RT-PCR / nested PCR: PCR errors, recombination
- NGS library preparation (fragmentation & indexing): PCR errors
- NGS sequencing reaction: sequencing errors
- Sanger (Taq-cycle reaction, sequencing, analysis): editing
PCR errors in clones
[Chart: frequency per mutation, y-axis 0.00% to 7.00%; series A, B, C, D, E1, E2]
Mutations shown: I47V, I50V, N83D, I84V, D30N, G73D, L76V, G48V, K20M, M46L, I54M, T74P, V82L, L89V, K20V, G73T, V82C, M230V, D67G, K219E, K219R, K70E, K101E, V118I, V179I, V189I, H221Y, K238T, T215S, T215N, V179D, K219N, A98G, Y115F, K101Q, F227C, K101H, P225H, T215F, K103S, T69D, T215D, V179T, Y181I
PCR errors in clones
Effect of high-fidelity enzymes
[Chart: frequency per mutation, y-axis 0.00% to 3.50%; series: Qiagen One step RT-PCR kit / HotStarTaq, and Invitrogen One step RT-PCR Superscript III / HiFi Platinum Taq]
Mutations shown: G48V, N83D, I50V, F53L, I54V, G73D, L10I, K20M, M46L, I50L, K43T, I54L, L10V, G48A, G73T, V82M, D67G, M230V, E138G, P225H, Y188H, V75A, V106A, V90I, G190E, E138K, D67N, V179E, M41L, L74V, L74I, Y115F, T215N, K101Q, V179I, K101P, T215F, G190Q, V75M, T215Y, Q151M, K103S, V179F, Y181I
Ultra-deep sequencing of proviral DNA
Patient C., J., born 1980. 1st line ART: TDF/FTC/EFV. Resistance test from proviral DNA.
[Chart axes: viral load (copies/ml), CD4+ cells/µl]
Proviral DNA
Plasma virus RNA
Patient C., J., born 1980: proviral DNA, standard Sanger
... but things may turn out differently...
Resistance testing from proviral DNA and low-abundance variants: sample 14-140800D
Deep-Sequencing of HIV-1: The quest for true proviral variants. Alexander Thielen, 09.05.2015
Resistance testing from proviral DNA
- viral archive: interesting
- what about defective viruses?
=> not preferred but sometimes required
Resistance testing from proviral DNA
- What does defective mean?
  - stop codons => definitely
  - hypermutated => probably
  - other suggestions?
- Do we see these viruses in our data?
- Do we have a big problem?
Stop codons in the routine samples
Samples: RNA: 528, DNA: 169
Samples with stop codons, by cutoff:

cutoff   DNA (n)   DNA (%)   RNA (n)   RNA (%)
1%       92        54.44%    220       42.39%
2%       48        28.40%    79        15.22%
5%       24        14.20%    35        6.74%
10%      14        8.28%     17        3.28%
15%      9         5.33%     11        2.12%
20%      7         4.14%     10        1.93%
30%      6         3.55%     7         1.35%
Where are the stop codons coming from?
PR: W42*
RT: W88*, W153*, W229*, W266*, W212*, W71*, W212*, W252*, W153*, W239*, W24*
HXB2 reference codon: TGG -> TAG, TGA, TAA
Stop codons mostly: TAG
APOBEC?
- APOBEC3F: GA -> AA
- APOBEC3G: GG -> AG, further preference for TGG, TGGG motifs!
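The link between the tryptophan codon TGG and the observed stop codons can be checked by enumerating every G -> A substitution. This is a minimal sketch: the function names are made up for illustration, and the context labels simply restate the dinucleotide preferences given on the slide (GG -> AG for APOBEC3G, GA -> AA for APOBEC3F).

```python
from itertools import combinations

STOP_CODONS = {"TAA", "TAG", "TGA"}


def g_to_a_variants(codon):
    """Return every codon reachable by changing one or more G's to A."""
    g_positions = [i for i, base in enumerate(codon) if base == "G"]
    variants = set()
    for n in range(1, len(g_positions) + 1):
        for chosen in combinations(g_positions, n):
            bases = list(codon)
            for i in chosen:
                bases[i] = "A"
            variants.add("".join(bases))
    return variants


def dinucleotide_context(seq, i):
    """Classify the G at index i by its 3' neighbour, or return None."""
    if seq[i] != "G" or i + 1 >= len(seq):
        return None
    return {
        "G": "GG -> AG (APOBEC3G-like)",
        "A": "GA -> AA (APOBEC3F-like)",
    }.get(seq[i + 1])


tgg_stops = sorted(v for v in g_to_a_variants("TGG") if v in STOP_CODONS)
print(tgg_stops)  # ['TAA', 'TAG', 'TGA']
print(dinucleotide_context("TGG", 1))  # GG -> AG (APOBEC3G-like)
```

Every G -> A change in TGG yields a stop codon, and the first G sits in a GG context, which fits the slide's observation that TAG is the most common stop codon.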
Are there further motifs found in DNA?
Other reading frames, e.g. ATGG -> ATAG (M -> I)
Odds ratio DNA / RNA > 2 at 2% cutoff, N at least 5:
PR: 42*, 90M, 73S,...
RT: 135V, 184V, 88*, 51R, 41L, 153*, 36A*, 212*, 230I, 70R, 196R, 35M, 266*, 16I, 210W, 41I, 45E,...
Can these be explained by APOBEC? Not all, e.g. M184V: ATG -> GTG
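The enrichment filter on this slide (odds ratio DNA / RNA > 2, N at least 5) can be sketched as follows. This is an illustration under assumptions, not the analysis code from the talk: all counts are invented, and "N at least 5" is read here as at least five samples carrying the mutation at the 2% cutoff.

```python
import math


def odds_ratio(dna_with, dna_total, rna_with, rna_total):
    """Odds of carrying a mutation in DNA samples relative to RNA samples."""
    dna_without = dna_total - dna_with
    rna_without = rna_total - rna_with
    numerator = dna_with * rna_without
    denominator = dna_without * rna_with
    if denominator == 0:
        # Undefined when both terms vanish, otherwise unbounded.
        return math.inf if numerator > 0 else math.nan
    return numerator / denominator


def enriched_in_dna(counts, dna_total, rna_total, min_or=2.0, min_n=5):
    """Return mutations with odds ratio above min_or and enough carriers."""
    enriched = []
    for mutation, (dna_with, rna_with) in counts.items():
        if dna_with + rna_with < min_n:  # assumed meaning of "N at least 5"
            continue
        if odds_ratio(dna_with, dna_total, rna_with, rna_total) > min_or:
            enriched.append(mutation)
    return enriched


# Hypothetical counts: mutation -> (DNA samples, RNA samples) carrying it.
hypothetical_counts = {"184V": (12, 6), "103N": (10, 25), "rare": (2, 1)}
print(enriched_in_dna(hypothetical_counts, dna_total=100, rna_total=200))
```

With these invented counts only the first mutation passes: the second has an odds ratio below 1, and the third is dropped for having too few carriers.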
Covariation analysis Are there further motifs found in DNA?
Covariation analysis: are there further motifs found in DNA?
- M184V, 41L, 70R, 210W, 181C, 215Y, 65R, 190A, PR-90M, 219Q, 215F,...: no G -> A!
- Wx*, 51R, 230I, 196R, 16I, 41I, 45E, 184I, 190R, 276I, 186N,...: G -> A!
Covariation within patient sample 14-140800D
- 190R: 3.6%
- 230I: 3.6%
- 184I: 3.5%
- 252*: 3.35%
- 153*: 4.35%
- 230I w/o 252*: 1.23%
- 184I w/o 153*: 1.38%
- 190R w/o 153*: 1.62%
Covariation within patient sample 14-160255D
- 190R: 8.95%
- 230I: 9.35%
- 184I: 0.17%?
- 252*: 0.52%
- 153*: 0.52%
- 230I w/o 252*: 9.22%
Also no correlations with other stop codons, but high correlation between 190R and 230I.
Covariation within patient sample 14-173390
- 190R: 0.37%
- 230I: 4.31%
- 184I: 0.24%
No correlations with stop-cluster mutations.
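The "w/o" figures on the covariation slides compare how often a mutation occurs overall with how often it occurs on reads lacking a given stop codon. A minimal sketch of that comparison follows. It assumes each read is reduced to the set of mutations it carries and that the "w/o" value is a fraction of all reads; the read counts are invented and the function names are hypothetical.

```python
def frequency(reads, mutation):
    """Fraction of all reads carrying the mutation."""
    return sum(mutation in read for read in reads) / len(reads)


def frequency_without(reads, mutation, other):
    """Fraction of all reads carrying `mutation` but not `other`."""
    hits = sum(mutation in read and other not in read for read in reads)
    return hits / len(reads)


# Invented toy sample of 1000 reads, each read a set of mutations.
toy_reads = (
    [frozenset({"230I", "252*"})] * 24   # mutation linked to the stop codon
    + [frozenset({"230I"})] * 12         # mutation on reads without the stop
    + [frozenset({"252*"})] * 10         # stop codon alone
    + [frozenset()] * 954                # reads with neither
)

print(f"230I: {frequency(toy_reads, '230I'):.2%}")
print(f"230I w/o 252*: {frequency_without(toy_reads, '230I', '252*'):.2%}")
```

A large drop from the overall frequency to the "w/o" frequency indicates that the mutation sits mostly on reads that also carry the stop codon. Frequencies that stay close, as for 230I in sample 14-160255D, indicate no such linkage.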
Acknowledgments: Kirsten Becker, Nina Engel, Anna Memmer, Benjamin Racké, Bettina Spielberger, Steffi Wenzel, Bernhard Thiele