Workshop presentation: Development and application of Bioinformatics methods

Similar documents
Epitope discovery and Rational vaccine design Morten Nielsen

In silico methods for rational design of vaccine and immunotherapeutics

CS229 Final Project Report. Predicting Epitopes for MHC Molecules

Co-evolution of host and pathogen: HIV as a model. Can Keşmir Theoretical Biology/Bioinformatics Utrecht University, NL

The Immune Epitope Database Analysis Resource: MHC class I peptide binding predictions. Edita Karosiene, Ph.D.

IMMUNOINFORMATICS: Bioinformatics Challenges in Immunology

Eur. J. Immunol : Antigen processing 2295

NIH Public Access Author Manuscript Immunogenetics. Author manuscript; available in PMC 2014 September 01.

A HLA-DRB supertype chart with potential overlapping peptide binding function

The Major Histocompatibility Complex (MHC)

Profiling HLA motifs by large scale peptide sequencing Agilent Innovators Tour David K. Crockett ARUP Laboratories February 10, 2009

MetaMHC: a meta approach to predict peptides binding to MHC molecules

Convolutional and LSTM Neural Networks

Significance of the MHC

A community resource benchmarking predictions of peptide binding to MHC-I molecules

BIOINF 3360 Computational Immunomics

BIOINF 3360 Computational Immunomics

MHC class I MHC class II Structure of MHC antigens:

Vaccine Design: A Statisticans Overview

PERFORMANCE MEASURES

Antigen Recognition by T cells

6/7/17. Immune cells. Co-evolution of innate and adaptive immunity. Importance of NK cells. Cells of innate(?) immune response

BIOINFORMATICS. June Immunological Bioinformatics

Significance of the MHC

Basel - 6 September J.-M. Tiercy National Reference Laboratory for Histocompatibility (LNRH) University Hospital Geneva

Host Genomics of HIV-1

Convolutional and LSTM Neural Networks

Immuno-Oncology Therapies and Precision Medicine: Personal Tumor-Specific Neoantigen Prediction by Machine Learning

The Human Major Histocompatibility Complex

Definition of MHC Supertypes Through Clustering of MHC Peptide-Binding Repertoires

Basic Immunology. Lecture 5 th and 6 th Recognition by MHC. Antigen presentation and MHC restriction

Predicting Protein-Peptide Binding Affinity by Learning Peptide-Peptide Distance Functions

ProPred1: prediction of promiscuous MHC Class-I binding sites. Harpreet Singh and G.P.S. Raghava

Adaptive Immune System

DIRECT IDENTIFICATION OF NEO-EPITOPES IN TUMOR TISSUE

Gibbs sampling - Sequence alignment and sequence clustering

The Major Histocompatibility Complex

HLA and antigen presentation. Department of Immunology Charles University, 2nd Medical School University Hospital Motol

LESSON 2: THE ADAPTIVE IMMUNITY

Two categories of immune response. immune response. infection. (adaptive) Later immune response. immune response

Epitope matching in solid organ transplantation: T-cell epitopes

Helminth worm, Schistosomiasis Trypanosomes, sleeping sickness Pneumocystis carinii. Ringworm fungus HIV Influenza

Immunology - Lecture 2 Adaptive Immune System 1

PEPVAC: a web server for multi-epitope vaccine development based on the prediction of supertypic MHC ligands

Practical Solution: presentation to cytotoxic T cells. How dendritic cells present antigen. How dendritic cells present antigen

Alessandra Franco MD PhD UCSD School of Medicine Department of Pediatrics Division of Allergy Immunology and Rheumatology

NetMHCpan, a Method for Quantitative Predictions of Peptide Binding to Any HLA-A and -B Locus Protein of Known Sequence

10/18/2012. A primer in HLA: The who, what, how and why. What?

Major Histocompatibility Complex Class II Prediction

Significance of the MHC

Degenerate T-cell Recognition of Peptides on MHC Molecules Creates Large Holes in the T-cell Repertoire

the HLA complex Hanna Mustaniemi,

1. Overview of Adaptive Immunity

Machine Learning For Personalized Cancer Vaccines. Alex Rubinsteyn February 9th, 2018 Data Science Salon Miami

Major Histocompatibility Complex (MHC) and T Cell Receptors

The Major Histocompatibility Complex of Genes

The Adaptive Immune Response. T-cells

Structure and Function of Antigen Recognition Molecules

HLA and antigen presentation. Department of Immunology Charles University, 2nd Medical School University Hospital Motol

Antigen Presentation and T Lymphocyte Activation. Abul K. Abbas UCSF. FOCiS

Identification and characterization of merozoite surface protein 1 epitope

Immuno-Oncology Therapies and Precision Medicine: Personal Tumor-Specific Neoantigen Prediction by Machine Learning

Antigen Presentation to T lymphocytes

The MHC and Transplantation Brendan Clark. Transplant Immunology, St James s University Hospital, Leeds, UK

CELL BIOLOGY - CLUTCH CH THE IMMUNE SYSTEM.

The major histocompatibility complex (MHC) is a group of genes that governs tumor and tissue transplantation between individuals of a species.

Lecture 6. Burr BIO 4353/6345 HIV/AIDS. Tetramer staining of T cells (CTL s) Andrew McMichael seminar: Background

AG MHC HLA APC Ii EPR TAP ABC CLIP TCR

DEPARTMENT OF BIOTECHNOLOGY

Selection of epitope-based vaccine targets of HCV genotype 1 of Asian origin: a systematic in silico approach

CHAPTER 18: Immune System

Attribution: University of Michigan Medical School, Department of Microbiology and Immunology

Transplantation. Immunology Unit College of Medicine King Saud University

RAISON D ETRE OF THE IMMUNE SYSTEM:

SUPPLEMENTARY INFORMATION

Supporting Information

Protein Structure and Computational Biology, Good morning and welcome!

Antigen processing and presentation. Monika Raulf

Antigen capture and presentation to T lymphocytes

Completing the CIBMTR Confirmation of HLA Typing Form (Form 2005)

Nature Biotechnology: doi: /nbt Supplementary Figure 1. Experimental design and workflow utilized to generate the WMG Protein Atlas.

Historical definition of Antigen. An antigen is a foreign substance that elicits the production of antibodies that specifically binds to the antigen.

Prediction of MHC-Peptide Binding: A Systematic and Comprehensive Overview

LIPOPREDICT: Bacterial lipoprotein prediction server

Cover Page. The handle holds various files of this Leiden University dissertation.

RAISON D ETRE OF THE IMMUNE SYSTEM:

B F. Location of MHC class I pockets termed B and F that bind P2 and P9 amino acid side chains of the peptide

Key Concept B F. How do peptides get loaded onto the proper kind of MHC molecule?

New technologies for studying human immunity. Lisa Wagar Postdoctoral fellow, Mark Davis lab Stanford University School of Medicine

Definition of MHC supertypes through clustering of MHC peptide binding repertoires

Defensive mechanisms include :

Clinical Basis of the Immune Response and the Complement Cascade

MHC Tetramers and Monomers for Immuno-Oncology and Autoimmunity Drug Discovery

Protein Structure and Computational Biology, Programme. Programme. Good morning and welcome! Introduction to the course

Analysis of The DQ Family of The Human Leukocyte Antigens

Antigenic Epitope Prediction towards SARS Corona Virus using Bioinformatics Analysis *1

Mucosal Immune System

I. Critical Vocabulary

Contents. Just Classifier? Rules. Rules: example. Classification Rule Generation for Bioinformatics. Rule Extraction from a trained network

Andrea s SI Session PCB Practice Test Test 3

Transcription:

Workshop presentation: Development and application of Bioinformatics methods

Immune system Innate fast Addaptive remembers Cellular Cytotoxic T lymphocytes (CTL) Helper T lymphocytes (HTL) Humoral B lymphocytes Figure by Eric A.J. Reits

Data driven predictions List of peptides that have a given biological feature Mathematical model (neural network, hidden Markov model) YMNGTMSQV GILGFVFTL ALWGFFPVV ILKEPVHGV ILGFVFTLT LLFGYPVYV GLSPTVWLS WLSLLVPFV FLPSDFFPS CVGGLLTMV FIAGNSAYE Search databases for other biological sequences with the same feature/property >polymerase! MERIKELRDLMSQSRTREILTKTTVDHMAIIKKYTSGRQEKNPALRMKWMMAMKYPITAD! KRIMEMIPERNEQGQTLWSKTNDAGSDRVMVSPLAVTWWNRNGPTTSTVHYPKVYKTYFE! KVERLKHGTFGPVHFRNQVKIRRRVDINPGHADLSAKEAQDVIMEVVFPNEVGARILTSE! SQLTITKEKKEELQDCKIAPLMVAYMLERELVRKTRFLPVAGGTSSVYIEVLHLTQGTCW! EQMYTPGGEVRNDDVDQSLIIAARNIVRRATVSADPLASLLEMCHSTQIGGIRMVDILRQ! NPTEEQAVDICKAAMGLRISSSFSFGGFTFKRTNGSSVKKEEEVLTGNLQTLKIKVHEGY! EEFTMVGRRATAILRKATRRLIQLIVSGRDEQSIAEAIIVAMVFSQEDCMIKAVRGDLNF!...!

Prediction of HLA binding specificity Simple Motifs Allowed/non allowed amino acids Extended motifs Amino acid preferences (SYFPEITHI) Anchor/Preferred/other amino acids Hidden Markov models Neural Networks

Class II MHC binding MHC class II binds peptides in the class II antigen presentation pathway Binds peptides of length 9-18 (even whole proteins can bind!) Binding cleft is open Binding core is 9 aa Human MHC II: ~1000 variants Peptide: up to 20 9 variants

MHC class II prediction Complexity of problem Peptides of different length Weak motif signal Alignment crucial Gibbs Monte Carlo sampler RFFGGDRGAPKRG YLDPLIRGLLARPAKLQV KPGQPPRLLIYDASNRATGIPA GSLFVYNITTNKYKAFLDKQ SALLSSDITASVNCAK PKYVHQNTLKLAT GFKGEQGPKGEP DVFKELKVHHANENI SRYWAIRTRSGGI TYSTNEIDLQLSQEDGQTIE M Nielsen et al., Bioinformatics. 2004 20:1388-97

Class II binding motif Alignment by Gibbs sampler Random ClustalW RFFGGDRGAPKRG YLDPLIRGLLARPAKLQV KPGQPPRLLIYDASNRATGIPA GSLFVYNITTNKYKAFLDKQ SALLSSDITASVNCAK PKYVHQNTLKLAT GFKGEQGPKGEP DVFKELKVHHANENI SRYWAIRTRSGGI TYSTNEIDLQLSQEDGQTI Gibbs sampler M Nielsen et al., Bioinformatics. 2004 20:1388-97

Earlier MHC Class II binding prediction methods Virtual matrices TEPITOPE: Hammer, J., Current Opinion in Immunology 7, 263-269, 1995, PROPRED: Singh H, Raghava GP Bioinformatics 2001 Dec;17(12):1236-7

Gibbs sampler. Prediction accuracy

NetMHC-IIpan pseudo sequence Include polymorphic residues in potential contact with the bound peptide The contact residues are defined as being within 4.0 Å of the peptide in any of a representative set of HLA-DR, -DQ, and DP structures with peptides. Only polymorphic residues are included Pseudo-sequence consisting of 25 amino acid residues. M Nielsen et al., Comput Biol. 2008

NetMHC-IIpan Method XAAAAAVAAEAYXXX WLFECLECYPDYWLQRATYCHNVGF 0.119861 DRB1_0101 AAAAAVAAEAY ALNVKRREGMFIDEX WLFECLECYPDYWLQRATYCHNVGF 0.238877 DRB1_0101 AALNVKRREGMFIDE QPGLTSAVIEALPXX WLFECLECYPDYWLQRATYCHNVGF 0.169769 DRB1_0101 AAQPGLTSAVIEALP XACVKDLVSKYLADN WLFECLECYPDYWLQRATYCHNVGF 0.577653 DRB1_0101 ACVKDLVSKYLADNE KIGLHTEFQTVSFXX WLFECLECYPDYWLQRATYCHNVGF 0.982712 DRB1_0101 AFKIGLHTEFQTVSF AGDLGRDELMELASD WLFECLECYPDYWLQRATYCHNVGF 0.061007 DRB1_0101 AGDLGRDELMELASD XAGLIAIVMVTILLC WLFECLECYPDYWLQRATYCHNVGF 0.104993 DRB1_0101 AGLIAIVMVTILLCC XAGYAATNDDNILSH WLFECLECYPDYWLQRATYCHNVGF 0.364429 DRB1_0101 AGYAATNDDNILSHV XXAKCNLDHSSEFCD WLFECLECYPDYWLQRATYCHNVGF 0.156760 DRB1_0101 AKCNLDHSSEFCDML XAKMKCFGNTAVAKC WLFECLECYPDYWLQRATYCHNVGF 0.734955 DRB1_0101 AKMKCFGNTAVAKCN + peptide length + PFR length AAAAAVAAEAY 0.11986 0.33183 AALNVKRREGMFIDE 0.23888 0.44053 AAQPGLTSAVIEALP 0.16977 0.56296 ACVKDLVSKYLADNE 0.57765 0.59562 AFKIGLHTEFQTVSF 0.98271 0.46325 AGDLGRDELMELASD 0.06101 0.18933 AGLIAIVMVTILLCC 0.10499 0.36359 AGYAATNDDNILSHV 0.36443 0.29837 AKCNLDHSSEFCDML 0.15676 0.17179 AKMKCFGNTAVAKCN 0.73496 0.69730 M Nielsen et al., Comput Biol. 2008

Final NetMHC-IIpan method Pan SMM-align TEPITOPE Allele N Pearson AUC Pearson AUC AUC DRB1*0101 5166 0.682 0.841 0.610 0.802 0.720 DRB1*0301 1020 0.659 0.846 0.563 0.795 0.664 DRB1*0401 1024 0.625 0.816 0.496 0.751 0.716 DRB1*0404 663 0.701 0.860 0.579 0.801 0.770 DRB1*0405 630 0.637 0.833 0.560 0.789 0.759 DRB1*0701 853 0.725 0.870 0.618 0.812 0.761 DRB1*0802 420 0.660 0.843 0.555 0.787 0.766 DRB1*0901 530 0.517 0.728 0.360 0.655 DRB1*1101 950 0.724 0.871 0.581 0.796 0.721 DRB1*1302 498 0.663 0.819 0.558 0.785 0.652 DRB1*1501 934 0.644 0.800 0.528 0.727 0.686 DRB3*0101 549 0.619 0.841 0.585 0.836 DRB4*0101 446 0.695 0.871 0.541 0.793 DRB5*0101 924 0.703 0.856 0.529 0.761 0.680 Ave* 14 0.661 0.835 0.547 0.778 Ave** 11 0.675 0.841 0.562 0.782 0.718 M Nielsen et al., Comput Biol. 2008

Web servers CTL epitopes Prediction servers at CBS http://www.cbs.dtu.dk/services/netctl/ MHC binding http://www.cbs.dtu.dk/services/netmhc/ http://www.cbs.dtu.dk/services/netmhcii/ http://www.cbs.dtu.dk/services/netmhcpan/ http://www.cbs.dtu.dk/services/netmhciipan/ MHC Motif viewer http://www.cbs.dtu.dk/biotools/mhcmotifviewer/home.html Proteasome processing http://www.cbs.dtu.dk/services/netchop-3.0/ B-cell epitopes http://www.cbs.dtu.dk/services/bepipred/ http://www.cbs.dtu.dk/services/discotope/ Plotting of epitopes relative to reference sequence http://www.cbs.dtu.dk/services/epiplot-1.0/ Analysis of human immunoglobulin VDJ recombination http://www.cbs.dtu.dk/services/vdjsolver/ Geno-pheno type association based mapping of binding sites http://www.cbs.dtu.dk/services/signisite/ PhD/master course in Immunological Bioinformatics, June, 2009 http://www.cbs.dtu.dk/courses/27685.imm/

Proposed application in assessment of protein drugs 1 Compare amino acid sequence of drug with the human proteome 2 Predict epitopes in regions that differ from the human proteome 3 Select representative HLA alleles 4 Verify binding experimentally 5 Assess predicted immunogenecity using blood from treated patients/transgenic animals 6 Compare with clinical findings of immunogenecity/adverse effects/lack of effect

Work in progress: Pilot study based on DrugBank www.drugbank.ca Records corresponding to 123 FDA-approved biotech (protein/peptide) drugs were downloaded Sequences were compared to the human proteome (sequences from Homo Sapiens in NR (non redundant database from NCBI)) using blast. Sequences found in DrugBank and NR need to be manually validated/curated

Types of proteins Human/Human protein sequence Identical proteins Non human proteins Modified/allelic human proteins Antibodies Non human Human-murine chimaer Humanized Human

Immunonological Bioinformatics Ole Lund Claus Lundegaard Morten Nielsen Mette Voldby Larsen Sheila Tang Jorid Sørli Marlene Erup Larsen Bent Petersen Thomas Stranzl Massimo Andretta Edita Bartaseviciute Group