MBios 478: Systems Biology and Bayesian Networks, 27 [Dr. Wyrick] Slide #1. Lecture 27: Systems Biology and Bayesian Networks

Systems Biology and Regulatory Networks
o Definitions
o Network motifs
o Examples
Bayesian Networks
o Bayesian Statistics
o Bayesian Networks

Welcome again to Bioinformatics. This is the last lecture for the bioinformatics course, so I want to congratulate you on completing the course up to this point. In this last lecture we'll talk about systems biology and Bayesian networks. In the first half of the lecture we'll talk about systems biology, focusing on regulatory networks. We'll define the key elements of regulatory networks, go through examples of commonly occurring network motifs, and go through some examples of how those network motifs regulate the biology of the system. The second half of the lecture will focus on Bayesian networks, which are a statistical approach to looking at networks, so we'll start off by briefly discussing Bayesian statistics. I should note that Bayesian statistics is at a relatively high mathematical level. You don't need to worry about the mathematical details; I'll explain what you need to know as we go through it. After we've briefly discussed the statistics, we'll talk about Bayesian networks, how we can generate them, and what they mean.

Slide #2
Systems Biology: Definitions
Nodes (circles): Genes/Proteins
Edges (lines): interactions
Edges can represent
o protein-protein interactions
o protein-DNA interactions (e.g. TF binding to promoter)
o genetic interactions
o regulatory or functional interactions
Edges can have directionality (gene X regulates gene Y)
Edges can represent positive or negative regulation

So, starting off with definitions for systems biology: typically, when we describe a biological system in terms of a network, we use common elements.
So, as we discussed in the previous lecture, we often represent genes or proteins with nodes, or circles. The circles can represent proteins, or genes, or proteins and DNA, depending on what kind of network we're talking about. Those genes or proteins can have interactions of various types, and those are usually represented by edges, or lines. The edges can represent protein-protein interactions, as we discussed in the previous lecture; for example, shown here on the top right is a protein-protein interaction where two different proteins interact with each other, as represented by the edge connecting them. You can have protein-DNA interactions where, for example, a transcription factor binds to the promoter of a target gene. That type of interaction can be positive, where binding of the transcription factor activates transcription of the gene (an activator regulating its target gene), or it can be negative, where a repressor turns off the expression of its target gene. As you can see on the right, the activator is typically represented by an edge with an arrow pointing to the target, whereas the repressor has a perpendicular line at the end of the edge, indicating negative regulation of the target gene by the repressor. Those are regulatory interactions; you can also have genetic interactions, as we discussed in the previous slide, or other types of functional regulatory interactions. Again, edges can have directionality, as we are showing in the bottom two examples (gene X regulates gene Y), and those edges can represent positive or negative regulation: positive regulation, for example a transcriptional activator regulating its target gene, or negative regulation, a transcriptional repressor negatively regulating its target gene. So make sure you know these conventions for representing networks.
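As a small illustration of these conventions, a regulatory network can be stored as a list of directed, signed edges. This is only a sketch: the `Edge` type, the helper functions, and the three-gene example are hypothetical and not part of the lecture, and undirected interactions (such as protein-protein binding) are left out.

```python
from collections import namedtuple

# sign = +1 for activation (arrow), -1 for repression (perpendicular line)
Edge = namedtuple("Edge", ["source", "target", "sign"])

# Hypothetical three-gene network, for illustration only.
edges = [
    Edge("X", "Y", +1),   # X activates Y
    Edge("X", "Z", +1),   # X activates Z
    Edge("Y", "Z", -1),   # Y represses Z
    Edge("Y", "Y", -1),   # Y represses itself (negative autoregulation)
]

def regulators(edges, target):
    """Map each regulator of `target` to the sign of its edge."""
    return {e.source: e.sign for e in edges if e.target == target}

def autoregulated(edges):
    """Return the genes that have an edge pointing back to themselves."""
    return sorted(e.source for e in edges if e.source == e.target)

print(regulators(edges, "Z"))
print(autoregulated(edges))
```

Keeping the sign on the edge, rather than on the node, mirrors the drawing convention: the same gene can activate one target and repress another.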

Slide #3
Transcription Factor Autoregulation
Transcription factors (TF) frequently bind to their own promoters and regulate their own transcription
Which network would respond more rapidly (by increasing the expression of the TF protein) in response to a signal?
Diagram

So the first type of network we are going to talk about is very simple, and it's called transcription factor autoregulation. Transcription factors frequently bind to their own promoters and regulate their own transcription. The transcription factor protein is expressed from its gene, and then, after being transcribed and translated, it comes back into the nucleus and binds to the promoter of its own gene, either positively or negatively autoregulating its own expression. There are three possibilities: no autoregulation, where the transcription factor doesn't regulate itself; positive autoregulation, where the transcription factor activates its own expression; and negative autoregulation, where the transcription factor represses its own expression. Now, which network would you anticipate would respond more rapidly? By respond more rapidly I mean, say, increase the expression of the transcription factor protein in response to a signal. Positive autoregulation, negative autoregulation, or no autoregulation: which of those three networks would respond more rapidly to a signal? In the graph shown below we are showing three different types of responses in terms of the activation of this transcription factor's expression. The green line indicates a rapid response to a signal, the blue line indicates a moderate response, and the red line indicates a delayed response. If we were matching those different response curves to the different networks (positive autoregulation, negative autoregulation, and no autoregulation), how would you match them? Think about it for a minute. Where would you put positive autoregulation and negative autoregulation on that graph? The answer might surprise you: negative autoregulation actually has the most rapid response. In the middle, the moderate response is no autoregulation, and positive autoregulation paradoxically has the most delayed, or slowest, response to a signal. Now why is that?

Slide #4
Transcription Factor Autoregulation
Negative autoregulation facilitates a more rapid response to a signal
Positive autoregulation delays gene expression in response to a signal
Diagram

So just to summarize transcription factor autoregulation: negative autoregulation facilitates a more rapid response to a signal, positive autoregulation delays gene expression in response to a signal, and no autoregulation has, as I mentioned before, a moderate response.
Now, why is that the case? As we mentioned in the last slide, it might seem paradoxical; you'd think negative autoregulation would be slower, but actually it's faster. The reason is that with negative autoregulation, the most rapid induction occurs when there is little if any of the protein around, because the more transcription factor protein is present in the cell, in the nucleus, the slower its expression rate. So it has its most rapid expression rate, or transcription rate, at the initial time point. It has a very rapid initial response, which corresponds to that green line there. Its response then slows down as time goes on and the level of the transcription factor protein builds up. You can see that the rate of increase in the transcription factor protein decreases at later time points. So negative autoregulation has the most rapid response at the immediate, early time points because there's no repressor around, no transcription factor around to repress its own expression. In contrast, with positive autoregulation the most rapid response is not at the initial time points, because there is no transcription factor around to activate its own expression. It has the slowest induction of gene expression at the initial time points because there is very little transcription factor present when the signal first arrives. If you look at the red line, you get very little increase in gene expression in the positive autoregulation case at the initial time points, but at later time points, as the amount of transcription factor protein increases, it stimulates its own expression, so the protein builds up over time. And so negative autoregulation typically generates a more rapid response to a signal, while positive autoregulation typically delays expression in response to a signal.
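The ordering of the three response curves can be reproduced with a toy simulation. The sketch below is not the model behind the lecture's graph: the rate equations and every parameter value are assumptions chosen for illustration, with the negatively autoregulated gene given a strong promoter so that it starts fast and then throttles itself. The response time is measured as the time to reach half of each network's own steady-state level.

```python
def half_response_time(rate, dt=0.001, t_end=40.0):
    """Integrate dx/dt = rate(x) from x = 0 with Euler steps and return the
    time at which x first reaches half of its final (steady-state) value."""
    x, trace = 0.0, [0.0]
    for _ in range(int(round(t_end / dt))):
        x += dt * rate(x)
        trace.append(x)
    half = 0.5 * trace[-1]
    for i, value in enumerate(trace):
        if value >= half:
            return i * dt

# Illustrative rate laws (assumed); the degradation rate is 1 in all three.
def no_autoregulation(x):
    return 1.0 - x

def negative_autoregulation(x):
    # Strong promoter that the TF shuts down as it accumulates.
    return 10.0 / (1.0 + (x / 0.3) ** 2) - x

def positive_autoregulation(x):
    # Weak basal expression plus self-activation that grows with x.
    return 0.1 + 1.8 * x / (1.0 + x) - x

t_neg = half_response_time(negative_autoregulation)
t_none = half_response_time(no_autoregulation)
t_pos = half_response_time(positive_autoregulation)
print(t_neg, t_none, t_pos)
```

With these assumed parameters the half-response times come out in the order negative, none, positive, matching the green, blue, and red curves described above.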

Slide #5
Transcription Factor Autoregulation: Summary
Negative autoregulation facilitates a more rapid response to an environmental signal
Positive autoregulation delays gene expression in response to an environmental signal
Negative autoregulation helps maintain the expression of the TF at a constant level
Positive autoregulation can cause bistability
Network image at right

So just to summarize: as we said, negative autoregulation generates a more rapid response, and positive autoregulation delays the response. In addition, negative autoregulation helps maintain the expression of the transcription factor protein at a constant level in the cell. The reason is that if the expression of the protein fluctuates at all, it's held in check by this negative autoregulation loop. If the expression of the transcription factor increases, it will repress its own promoter more strongly, and that repression will decrease the level, compensating for the increase in transcription factor protein. In contrast, if the transcription factor protein level drops, then there will be less repression of the promoter, so you will get an increase in expression and the transcription factor level will go back to the steady state. So negative autoregulation helps maintain the expression of the transcription factor at a constant level. Positive autoregulation, on the other hand, often causes what is called bistability, where you can have two different steady states: one at a low level, and then, once you build up the level of the protein enough that it starts activating its own expression through positive autoregulation, one at a very high level. So you can get these kinds of multi-step equilibria with positive autoregulation loops.
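Bistability can be illustrated with a minimal sketch of a positively autoregulated transcription factor. The equation, the cooperative (Hill-type) self-activation, and all parameter values are assumptions made for this example, not something given in the lecture. The same system is started from two different initial levels and ends up in two different steady states.

```python
def steady_state(x0, dt=0.01, t_end=50.0):
    """Euler-integrate a positively autoregulated TF from initial level x0.

    dx/dt = basal + beta * x^4 / (K^4 + x^4) - x
    The cooperative self-activation term is an assumption of this sketch.
    """
    basal, beta, k = 0.05, 1.0, 0.5
    x = x0
    for _ in range(int(round(t_end / dt))):
        x += dt * (basal + beta * x**4 / (k**4 + x**4) - x)
    return x

low = steady_state(0.2)    # starts below the activation threshold
high = steady_state(0.7)   # starts above the activation threshold
print(round(low, 3), round(high, 3))
```

Starting below the threshold, the protein decays to a low level set by basal expression; starting above it, self-activation takes over and the protein settles at a high level.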

Slide #6
Complex Network Motifs: Feed-Forward Loops
Some network motifs are very common in biological networks (overrepresented)
For example, the coherent Feed-Forward loop, pictured to the right, is common in biological networks
In a coherent Feed-Forward loop, Gene X and Gene Y must be activated to activate Gene Z
o Note that Gene Y activation is dependent on Gene X
A coherent Feed-Forward loop requires a sustained input in order to activate the target gene Z
Hence, the coherent Feed-Forward loop is a persistence detector
Diagram at right

So now we will begin to discuss more complicated network motifs that are commonly found in biological networks. One common network motif is the feed-forward loop. Feed-forward loops tend to be over-represented in biological networks: they occur more commonly than one would expect by random chance, indicating that they likely have an important function in biological networks, and we'll get to that function in a little bit. One type of feed-forward loop commonly found in biological networks is the coherent feed-forward loop, and the network motif shown at the right of this slide is an example. In a coherent feed-forward loop you have three genes involved in the motif, instead of just one as we had with autoregulated genes. These three genes are labeled gene X, gene Y, and gene Z. Again, the circles represent the nodes, and the lines represent edges. In the coherent feed-forward loop, gene X is activated by an input, it then activates gene Y, and then the combination of gene X and gene Y activates gene Z. So there are two important features of this network motif. The first is that gene X regulates gene Y, so once gene X is activated, gene Y will be activated and begin to accumulate. The second key feature is that the expression of gene Z depends on both gene X and gene Y. In essence, gene X regulates gene Z through two different pathways: first, directly, by binding to the promoter of gene Z and activating its expression, and second, by turning on gene Y, which then also binds to the promoter of gene Z. Both X and Y are required to activate gene Z. Because of this interesting network architecture, a coherent feed-forward loop requires a sustained input in order to activate the expression of target gene Z. If you have only a brief input, there will be no change in the expression of gene Z; there won't be any activation of the target gene Z.
For this reason the coherent feed-forward loop is called a persistence detector, because you need a persistent, sustained input in order to activate the expression of target gene Z.

Slide #7
Process of Identifying Purified Proteins
Diagram
Note, however, that there is no delay in turning off the network signal

So this slide essentially explains why that is the case. Again we have our three-gene network, X, Y, and Z, shown on the left, the same as in the previous slide, and on the right the graphs map how the network responds to inputs of different duration. The first part of the graphs, highlighted here, is a brief input. It immediately activates the expression of gene X, shown with the green line in the top graph. As we mentioned, gene X positively regulates gene Y, so gene Y expression begins to increase, as you can see in the middle graph. However, gene Y expression doesn't have time to accumulate above its threshold value, which is shown with the dotted line in the middle graph, so the peak for gene Y doesn't cross the threshold. What that means is that gene Y doesn't accumulate enough to activate gene Z. Because the input is brief, gene Y does not accumulate sufficiently, and since the activation of gene Z requires both gene X and gene Y to accumulate, we don't get any activation of gene Z. In terms of the response of gene Z, a brief input is as if there was no input at all. In contrast, if we have a sustained input, an input of long duration, gene X expression again increases right away, and this causes a subsequent increase in gene Y expression. Once gene Y expression crosses the threshold value, it begins to activate gene Z, because now gene X and gene Y are both present at sufficient levels. So gene Z expression begins to accumulate after gene Y expression crosses its threshold value. But notice there's a delay: there's a delayed response in turning on gene Z, highlighted on the graph on the right, because it takes time for gene Y to accumulate to sufficient levels to begin to activate gene Z. So with a sustained input we get activation of gene Z, but there is a slight delay in that activation. However, notice that when we turn off the signal, gene X and gene Y immediately begin to decrease, and gene Z immediately begins to turn off. There's a delay in the activation, but there's no delay in turning off the network signal. These are key features of the coherent feed-forward loop. So in summary, the coherent feed-forward loop responds only to a sustained input, which might be a real signal in a biological system; it does not respond to brief signals, which could just be noise in the biological system.
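The persistence-detector behaviour can be sketched with a toy simulation. This is not the model behind the slide's graphs: it assumes X follows the input instantly, Y and Z are produced and degraded at unit rates, and Z is produced only while X is on AND Y is above a threshold of 0.5. All of these choices are illustrative assumptions.

```python
def simulate_coherent_ffl(pulse_duration, t_end=10.0, dt=0.001, y_threshold=0.5):
    """Euler simulation of a coherent feed-forward loop with AND logic at Z.
    Returns the list of time points and the list of Z levels."""
    y = z = 0.0
    times, z_levels = [], []
    for i in range(int(round(t_end / dt))):
        t = i * dt
        x = 1.0 if t < pulse_duration else 0.0          # X tracks the input
        gate = 1.0 if (x > 0.5 and y > y_threshold) else 0.0
        y += dt * (x - y)                                # X activates Y
        z += dt * (gate - z)                             # X AND Y activate Z
        times.append(t)
        z_levels.append(z)
    return times, z_levels

# Brief input: Y never crosses its threshold, so Z stays off.
_, z_brief = simulate_coherent_ffl(pulse_duration=0.3)
brief_peak = max(z_brief)

# Sustained input: Z turns on, but only after Y has accumulated.
times, z_long = simulate_coherent_ffl(pulse_duration=5.0)
sustained_peak = max(z_long)
onset_delay = next(t for t, z in zip(times, z_long) if z > 0.0)
peak_time = times[z_long.index(sustained_peak)]

print(brief_peak, sustained_peak, onset_delay, peak_time)
```

In this sketch Z starts to rise only after a delay (the time Y needs to reach its threshold), yet it peaks right at the end of the input and declines from there, because losing X closes the AND gate immediately.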

Slide #8
Incoherent Feed-Forward Loops
The incoherent Feed-Forward loop is also common in biological networks
In an incoherent Feed-Forward loop, Gene X must be activated, and Gene Y must be inactive, to activate Gene Z
o Note again that Gene Y activation is dependent on Gene X
An incoherent Feed-Forward loop generates a rapid response that is eventually suppressed as Gene Y accumulates
Hence, the incoherent Feed-Forward loop is a pulse generator
Diagram at right

The second network motif we'll talk about is the incoherent feed-forward loop. It is similar to the network motif we just talked about: again we have three genes, gene X, gene Y, and gene Z, but now gene Y represses gene Z as opposed to activating it. So in an incoherent feed-forward loop, gene X must be activated, but gene Y must be inactive, in order for gene Z to be activated. And again, gene X activates gene Y. So in response to an input, gene X turns on, and it turns on gene Z, but then as gene Y accumulates it begins to suppress the expression of gene Z. For this reason the incoherent feed-forward loop typically acts as what is known as a pulse generator: it generates a pulse of expression of gene Z, but that expression is not sustained over time.

Slide #9
Pulse Generation with Incoherent Feed-Forward Loops
Graphs of expression

Again, here's an example of the incoherent feed-forward loop, with the graph in panel A showing the expression of X in response to the signal, and then Y and gene Z. In response to the input, expression of X goes up, and that causes an immediate increase in the expression of gene Z, because the values of Y are low; Y hasn't had a chance to accumulate yet. But then, once Y accumulates and crosses the threshold value, shown with the dotted line, it begins to repress gene Z, so gene Z expression starts to go down. As you can see in the bottom panel of part A, gene Z expression goes up quickly and then drops off, in essence generating a pulse of expression. And that's why an incoherent feed-forward loop acts as a pulse generator in response to a sustained input.

Slide #10
Bayesian Statistics
Bayes' Theorem: $P(B \mid A) = \dfrac{P(A \mid B)\,P(B)}{P(A)}$
A = Experimental data
B = Model parameters
$P(A \mid B)$ = Likelihood
$P(A)$ is constant
Bayes' Postulate: $P(B)$ is constant
How do we find the model parameters that best fit our data?
o From Bayes' theorem we know: $P(B \mid A) \propto P(A \mid B)$
o To find the best-fitting parameters [maximize the value of $P(B \mid A)$], we need to maximize the likelihood of the data given the model parameters [$P(A \mid B)$]

So that's all we'll talk about in terms of network motifs. Now we're going to move into a description of quantitative network analysis using what are called Bayesian networks. Before we talk about Bayesian networks, we need to discuss a little bit of statistics, what is known as Bayesian statistics. Much of this is fairly complicated, and I don't want you to get bogged down in the mathematical details. I just want you to be familiar with the concepts, so don't worry too much about the formulas; just be aware of the concepts. Bayesian statistics depends on Bayes' formula, which is expressed in terms of probabilities or likelihoods and uses a couple of different parameters. If we want to know the probability of a model based on the experimental data we have, that is expressed as P of B given A, which equals the probability of the data given the model, times the probability of the model, divided by the probability of the data. P(A) is a constant, and P(B), according to Bayes' postulate, is a constant. What that means, by Bayes' theorem, is that the probability of a model being correct given a certain set of data is proportional to the probability of the data given that model. So if we want to find the model parameters that best fit our data, that is, the parameters that maximize the probability of the model given the data, we need to maximize the likelihood of the data given the model parameters, the probability of A given B.

Slide #11
Maximum Likelihood Ratio Test
Method for testing hypotheses using maximum likelihood estimation (MLE)
Compute the MLE for the null hypothesis ($H_0$) and the alternative hypothesis ($H_1$) from experimental data ($x$), as follows:
$l = L(x \mid H_0) / L(x \mid H_1)$
The ratio of the MLE for $H_0$ over the MLE for $H_1$ is related to the chi-square distribution via the following equation:
$\chi^2 = -2 \ln l = -2 \ln\left[ L(x \mid H_0) / L(x \mid H_1) \right]$

To do that we use what is known as a maximum likelihood ratio test, which is a method for testing hypotheses using maximum likelihood estimation. We compute a maximum likelihood estimate for the null hypothesis and for the alternative hypothesis. If you remember, going back to the statistics section, we have a null hypothesis and an alternative hypothesis, two different models, evaluated on the experimental data.
So this little l equals the likelihood of the data x given model H0, divided by the likelihood of the data x given the alternative hypothesis H1. We take the ratio of those two likelihood values. If we take the natural log of that ratio and multiply by minus two, that gives a value that approximately follows a chi-squared distribution, so we can actually calculate a p-value from the chi-squared distribution using these likelihood values.

Slide #12
Bayesian Networks
A Bayesian network is a graphical representation of a probability distribution
Can use Bayesian networks to model relationships between genes and genetic regulatory networks
Advantages of using a Bayesian network:
o Compact and intuitive representation of gene relationships
o Captures causal relationships between genes
o Integration of prior knowledge into network
o Works well with noisy data, a characteristic of gene expression data
o Efficiently learn new models of gene relationships

So that's just a very, very brief summary of Bayesian statistics. Don't worry too much about the details, but we'll be using a little bit of that nomenclature when we talk about Bayesian networks. All the networks we've been talking about up to this point are qualitative networks: we're showing that one gene regulates another gene, and there's really no statistics or probability involved. A Bayesian network is a graphical representation of a probability distribution, so basically it models relationships between genes and networks, but uses probability to model those relationships. The advantages of using a Bayesian network are that it's a compact and intuitive representation of gene relationships; it can capture causal relationships, where gene A activates gene B, or gene X activates gene Y; you can integrate prior knowledge into the network; it works well with noisy data, which is typically characteristic of gene expression data; and it can learn new models, so it can help you develop new networks describing how genes regulate each other.
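Before moving on to Bayesian networks, here is a small worked sketch of the likelihood ratio test from Slide #11. Everything specific in it is an assumption for illustration: the five data values are made up, the data are taken to be Gaussian with a known standard deviation of 1, H0 fixes the mean at 0, and H1 lets the mean take its maximum likelihood value (the sample mean), so the two models differ by one free parameter and the statistic is compared with a chi-square distribution with 1 degree of freedom.

```python
import math

def gaussian_log_likelihood(data, mu, sigma=1.0):
    """Log-likelihood of the data under a Gaussian with mean mu and known sigma."""
    n = len(data)
    squared_error = sum((x - mu) ** 2 for x in data)
    return -0.5 * n * math.log(2 * math.pi * sigma**2) - squared_error / (2 * sigma**2)

# Made-up measurements (illustrative only).
data = [0.8, 1.2, 0.5, 1.5, 1.0]

# H0 fixes the mean at 0; H1 uses the maximum likelihood estimate of the mean.
mu_hat = sum(data) / len(data)
log_l0 = gaussian_log_likelihood(data, 0.0)
log_l1 = gaussian_log_likelihood(data, mu_hat)

# chi^2 = -2 ln [L(x|H0) / L(x|H1)], computed from the log-likelihoods.
chi_square = -2.0 * (log_l0 - log_l1)

# Upper-tail probability of a chi-square distribution with 1 degree of freedom.
p_value = math.erfc(math.sqrt(chi_square / 2.0))
print(chi_square, p_value)
```

Working with log-likelihoods instead of raw likelihoods avoids numerical underflow when there are many data points; the ratio becomes a simple difference.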

Slide #13
Bayesian Networks (continued)
Bayesian networks consist of two parts: Qualitative and Quantitative
Qualitative part of network is a directed acyclic graph (DAG)
o Nodes represent random variables (gene mRNA levels)
o Edges represent direct (causal) influences between nodes (genes)
o A positive edge from X to Y indicates that higher values of X will result in higher values of Y (bias distribution of Y higher)
o A negative edge from X to Y indicates that higher values of X will result in lower values of Y (bias distribution of Y lower)
Quantitative part of network consists of a set of conditional probability distributions
Two network diagrams

So the Bayesian network consists of two parts: the qualitative part, which will look very similar to some of the networks we've been talking about, and the quantitative part. The qualitative part is what is known as a directed acyclic graph. Very similar to the networks we've just been talking about, the nodes represent the random variables, for example the mRNA levels of the genes. The edges represent influences between genes. For example, a positive edge from X to Y indicates positive regulation: higher levels of gene X result in higher levels of gene Y. A negative edge, just like we talked about, indicates negative regulation: if gene X goes up, then gene Y will go down in expression. That's the qualitative part. The quantitative part of a Bayesian network consists of a set of conditional probability distributions: how likely are we to see gene Y expression given gene X expression?

Slide #14
Example of a Bayesian Network
Qualitative part: Network diagram
Quantitative part: Table of data
This row indicates that when Gene A and Gene B are up-regulated, then Gene C has 60% probability to be up-regulated and 40% probability to be down-regulated.

So, this is an example of a Bayesian network. Here's our qualitative part, shown on the top: gene A positively regulates gene C, whereas gene B negatively regulates gene C. It's a three-gene network. The quantitative part is shown below; it gives the probability of gene C being either highly or lowly expressed given what we know about gene A and gene B. Let's look at some of these examples. Start with the second row of data in the table, where gene A is minus and gene B is plus. In that case gene A is not being expressed, it's absent, and gene B is being expressed. Because there is no gene A to activate gene C, but there is gene B to turn off gene C, we would expect gene C to be expressed at a low level. Now look at the two different probabilities. P(C+ | A, B) is the probability that C is highly expressed given A and B; plus means expressed. The fourth column, P(C- | A, B), is the probability that C is not expressed given A and B. In that second row, if A is not expressed and B is expressed, then the probability of C being expressed is very low, only 0.01; it's very unlikely that C would be expressed.
The probability that C would not be expressed is 0.99; 99% of the time, C will not be expressed in that case. Now look at the third row, where gene A is plus and gene B is minus. In that case, since gene A is expressed and has a positive relationship with gene C, whereas gene B is not expressed and so is not repressing gene C, 99% of the time gene C is going to be expressed. Only 0.01, or 1% of the time, is gene C going to be not expressed. Now the row that's highlighted in red, the first row: what happens if both gene A and gene B are expressed? In that case gene A seems to carry a little more weight, because 60% of the time gene C will be expressed and only 40% of the time will gene C be not expressed, or down-regulated. So that's how you interpret the quantitative part of a Bayesian network.

Slide #15
Testing Bayesian Network Models
Microarray data can be used to test the validity of genetic regulatory networks written in the form of Bayesian networks
Typically, two or more alternate network models will be constructed and tested using the microarray data
A Bayesian scoring metric is used to identify the network model that best fits the microarray data
$\mathrm{BayesianScore}(M) = \log P(M \mid D) = \log P(M) + \log P(D \mid M) + c$
o Where M = model, D = microarray data, c = constant
The term $P(D \mid M)$ is the likelihood of the data D, given the model M
We can calculate this term using the Maximum Likelihood method

You can also test Bayesian network models. We can use things like microarray data, or other gene expression data, to test the validity of a gene regulatory network that is written in this kind of Bayesian network form. Typically, you construct multiple alternate network models and test them using microarray data, or whatever type of data you want to use. Again, you use Bayesian statistics to score the probability of a model given the data, and using Bayes' theorem you convert that to the probability of the data given the model, plus some constants. So P(D | M) is the likelihood of the data given the model, and we can calculate it using the maximum likelihood method we just talked about.

Slide #16
Example: Galactose Regulatory Network
Hartemink et al. (2001) used 52 microarray data sets to score the competing models
Using the Bayesian scoring metric, Model 1 received a score of -44.0 while Model 2 received a score of -34.5
These scores indicate that the microarray data were over 13,000 times more likely to be observed under Model 2
Images at top
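The factor quoted on this slide can be checked directly from the two scores. The sketch below assumes the scores are natural logarithms and that the two models are given equal prior probability, so that the difference in scores is the log of the likelihood ratio; neither assumption is stated on the slide.

```python
import math

score_model_1 = -44.0   # Bayesian score of Model 1, as quoted on the slide
score_model_2 = -34.5   # Bayesian score of Model 2, as quoted on the slide

# Assumes natural logarithms; the constant c cancels in the difference.
ratio = math.exp(score_model_2 - score_model_1)
print(round(ratio))
```

The score difference of 9.5 corresponds to a factor of roughly 13,360, which is consistent with the "over 13,000 times" figure.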

So let's go through an example. This is the galactose regulatory network in yeast, and there are two models to describe it. It's known that Gal4 activates the expression of galactose genes like Gal2, and it's known that Gal80 acts as a repressor of galactose genes, and there are two models for how Gal80 could do this. In model 1, Gal80 represses Gal4, which would then not be able to activate Gal2. In model 2, Gal4 and Gal80 independently regulate Gal2: Gal4 activates it, and Gal80 represses Gal2 expression. A paper by Hartemink et al. used 52 microarray data sets and scored how well those data sets fit these two different models. Using this Bayesian scoring metric, model 1 received a score of -44.0, while model 2 received a score of -34.5. The more positive the score the better, so model 2 had the better score. In fact, if you calculate the statistics, the data were over 13,000 times more likely to be observed under model 2 than under model 1. So model 2 was clearly favored; that is, Gal80 directly regulates Gal2 rather than acting through Gal4.

Slide #17
Learning Bayesian Networks from Microarray Data
Can also use microarray data as a basis for learning new Bayesian networks
Issues in using microarray data in learning Bayesian networks
o Large number of variables (genes)
o Relatively small number of microarray data sets
o Sparse networks (few genes directly affect one another)
The Sparse Candidate algorithm (Friedman et al.) is commonly used in Bayesian Network learning.
Flow chart near top

Not only can you use Bayesian networks and statistics to test models, but you can also use them as a way of learning new models from data. If we put our microarray data and other types of information into some kind of learning algorithm, we can then come up with these types of network models.
The issues in this type of analysis, in learning Bayesian networks, are the large number of variables (genes); the relatively small number of microarray data sets, though that's not as true now that there are thousands of microarray data sets available; and the fact that these tend to be sparse networks, meaning that relatively few genes directly affect one another, since there are relatively few transcription factors in the whole genome. The Sparse Candidate algorithm developed by Friedman et al., for example, is commonly used for Bayesian network learning. You don't need to know the details of that; just know that it's a common way of learning new Bayesian networks.

Slide #18
Example of Bayesian Network Learning
Images of Microarray Data and Bayesian Network

So this is just an example of how one could take microarray or expression data and develop a Bayesian network from it. In this case we have a hierarchical clustering of microarray expression data for a set of genes involved in mating in yeast, and to the right, in part B, is the Bayesian network that was developed from this microarray data set. You can see connections between different genes, responses to different inputs, and so forth.

Slide #19
Appendix: Maximum Likelihood Estimation
Want to choose the most likely model parameter of the mean µ given the experimental data x.
Following Bayes' theorem and postulate: $P(B \mid A) \propto P(A \mid B)$
If the data x follows a Gaussian, then the Maximum Likelihood Estimate (MLE) of µ is given by: equations

This last slide is just an appendix showing how maximum likelihood estimation works. You don't need to worry about it; it's only there if you are curious about the mathematical details. So I want to thank you for sitting through the lecture. I hope you've learned a lot from this bioinformatics class, and I encourage you to continue learning new things about bioinformatics and computational biology. Again, thank you very much.