
*Bayes Networks and Graphical Models in Computational Molecular Biology and Bioinformatics *

Survey of Recent Research

DNA Analysis, Genetics, Systems Biology, Gene Expression, Functional Annotation, Protein-Protein Interaction, Haplotype Inference, Pedigree Analysis, Data Integration

******************************** WARNING!!! ********************************

THE OBJECTS IN THIS "MIRROR" APPEAR CLOSER THAN THEY ACTUALLY ARE!

******************************** WARNING!!! ********************************

DISCLAIMER: There is no attempt to be complete, and we do not provide a strong endorsement of any of the publications below (including our own work). It is up to readers to form their own opinions and assess the strengths and limitations of the work described in these papers. It is typical that in early publications on any given topic the authors are somewhat optimistic and perhaps even naive. If you want to add your paper to this resource, please send email to kasif at bu . edu with the bibliographical information.

Additionally, one should not confuse graphical models and Bayesian statistics. While the two areas are related and one can do Bayesian statistics with graphical models, the relationship is not strict. One can do Bayesian inference with other models, and one can use "non-Bayesian" maximum likelihood approaches for learning or inference in graphical models. There is no attempt to include a full set of references to Bayesian statistics, but see BUGS as a starting point.

In 1992 Prof. Simon Kasif and two former students, Arthur Delcher and William (Bill) Hsu, at Johns Hopkins University described one of the earliest applications of Bayes networks (graphical models) to modern problems in bioinformatics. The application of graphical models to genetics has an earlier history: in a completely different setting, S. Wright described the so-called path diagrams (an ancestor of Causal Markov Networks) in a particular application of genetics in 1934 or even earlier. Initially our 1992 AAAI submission described the application of Bayes networks to modeling protein secondary structure. Our follow-up paper in ISMB 1993 describes the use of Bayes networks for perturbational analysis to simulate in-silico mutagenesis and HMM modeling of secondary structure. Bayesian networks generalize HMMs (in fact the relationship of our simple model to an HMM was first suggested to us by Bill's uncle Kai-Fu Lee, one of the world experts on speech recognition).

The number of applications of Bayes nets in computational biology is growing. These include integration of diverse biological databases, functional genomics, microarray analysis, comparative genomics, genetics, linkage analysis, systems biology and other key computational problems in molecular biology.

This page provides a set of links to a few key papers. We make no claim about the relationship between our early work and these more recent applications of Bayes networks in bioinformatics: the methods are different, the models are different and the biology is different. There is also no attempt to provide a complete and comprehensive survey, just a few links that can provide a possible starting point. As mentioned above, HMMs are a special case of Dynamic Bayes Nets. The number of applications of HMMs in computational biology is rather extensive, and the book by Durbin et al. is an excellent starting point. Our initial work is often not recognized as part of the HMM literature because many people are not aware of the connection between HMMs and Bayes networks. This connection is documented in a number of recent books and papers on Bayes nets, e.g.:

* Probabilistic Independence Networks for Hidden Markov Probability Models (1996) *
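For readers who have not seen the HMM / Bayes network connection spelled out, here is a minimal sketch in Python. It is not taken from any of the papers listed on this page. The two hidden states, the two-letter alphabet and all probabilities are made-up assumptions, and the function names are hypothetical. The sketch writes an HMM as a Bayes network (each node has one conditional probability table given its parent) and checks that the standard HMM forward algorithm returns the same sequence likelihood as brute-force marginalization of that network. Sequences are assumed to be non-empty.

```python
from itertools import product

# Toy model; every number below is an illustrative assumption.
STATES = ("H", "C")                      # hidden states, e.g. helix / coil
PRIOR = {"H": 0.5, "C": 0.5}             # P(z_1)
TRANS = {"H": {"H": 0.9, "C": 0.1},      # P(z_t | z_{t-1})
         "C": {"H": 0.2, "C": 0.8}}
EMIT = {"H": {"a": 0.7, "b": 0.3},       # P(x_t | z_t)
        "C": {"a": 0.4, "b": 0.6}}


def joint(hidden, observed):
    """Bayes-network factorization: one CPT entry per node given its parent."""
    p = PRIOR[hidden[0]] * EMIT[hidden[0]][observed[0]]
    for prev, cur, x in zip(hidden, hidden[1:], observed[1:]):
        p *= TRANS[prev][cur] * EMIT[cur][x]
    return p


def likelihood_by_enumeration(observed):
    """Sum the joint over every hidden path (exponential in sequence length)."""
    return sum(joint(h, observed)
               for h in product(STATES, repeat=len(observed)))


def likelihood_by_forward(observed):
    """HMM forward algorithm: the same sum, eliminating z_1, z_2, ... in order."""
    alpha = {s: PRIOR[s] * EMIT[s][observed[0]] for s in STATES}
    for x in observed[1:]:
        alpha = {s: EMIT[s][x] * sum(alpha[r] * TRANS[r][s] for r in STATES)
                 for s in STATES}
    return sum(alpha.values())


if __name__ == "__main__":
    obs = "aabab"
    print(likelihood_by_enumeration(obs), likelihood_by_forward(obs))
```

The two functions agree up to floating-point rounding. The forward recursion is simply an efficient elimination order for the chain-structured network, which is one way to read the statement that HMMs are a special case of Dynamic Bayes Nets.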

**An introduction and many references to graphical models can be found at Kevin Murphy's site at MIT**

* England: Ghahramani's Group, "Protein Secondary Structure Prediction with Segmental Models" *

* Singapore, "Protein Structure and Fold Prediction with Tree-Augmented Bayesian Classifier" *

* Israel: Yanover and Weiss, "Approximate Inference and Protein Folding" *

* UC Irvine: Baldi Lab, "Secondary Structure Prediction" *

* Israel: Nir Friedman's Group, "Using Bayesian Networks to Analyze Expression Data" *

* Nir Friedman's Science Survey, "Inferring Cellular Networks Using Probabilistic Graphical Models" *

* Nir Friedman's Group, "Class Discovery in Gene Expression Data" *

* Nir Friedman's Group, "Tissue Classification with Gene Expression Profiles" *

* Other Publications from Nir Friedman's Group *

* MIT: Gifford / Jaakkola, "Pathway modeling" *

* Stanford: Eran Segal, Daphne Koller et al., "Expression Analysis, Module Discovery, Pathway analysis" *

* Harvard: Zak Kohane's Group, "Relevance Networks and Bayesian Gene Expression Analysis" *

* Duke: Alex Hartemink, "Pathways and Bayes nets" *

* More from Duke: Dobra and West, "Sparse graphical models for exploring gene expression data" *

* Japan: Miyano's Group, "Combining Bayes Network and Regression" *

* From U. Penn, "Estimating genomic coexpression networks using first-order conditional independence" *

* From France, "Gene networks inference using dynamic Bayesian networks" *

* From Norway, "MGraph: graphical models for microarray data analysis" *

* Dirk Husmeier, "Dynamic Bayes Networks and Microarray Analysis" *

* England, "Two-Stage Bayesian Networks for Metabolic Network Prediction" *

* MIT: Jaakkola's Lab, "Physical network models and multi-source data integration" *

* Yale: Gerstein's Lab et al., "Integration of genomic datasets to predict protein complexes in yeast" *

* Gerstein's Lab et al., "Bayes Networks Used to Integrate Data for Protein-Protein Interaction" *

* DIMACS Workshop on Integration, "Various Integration Approaches Including Bayes Networks" *

* Nir Friedman's Group, "Modeling Dependencies in Protein-DNA Binding Sites" *

* Denmark: "Graphical Models for Genetic Analyses" *

* Israel: Dan Geiger's Lab at the Technion, "Superlink Program" *

* Oxford: "Recombination Analysis Using Directed Graphical Models" *

* Oxford: "Phylogenetic evidence for recombination in dengue virus" *

* Max Planck: "Likelihood Analysis of Phylogenetic Networks Using Directed Graphical Models" *

* Berkeley: "Multiple-sequence functional annotation and the generalized hidden Markov phylogeny" *

* Berkeley: "Bayesian Haplotype Inference via the Dirichlet Process" *

* Utah: "Graphical Modeling of the Joint Distribution of Alleles at Associated Loci" *

More soon.
