Selected Bioinformatics Publications
PUBMED Computational Biology / Bioinformatics Publications
Salzberg, S., D. Searls and S. Kasif, Computational Methods in Molecular Biology, Elsevier Publ., 1998.
S. Letovsky, S. Kasif, "A Probabilistic Approach to Gene Function Assignment and Propagation in Protein Interaction Networks", Bioinformatics 2003.
International Human Genome Sequencing Consortium, Lander et al., Initial Sequencing and Analysis of the Human Genome, Nature, February 2001.
Delcher, A., S. Kasif, H. Goldberg and W. Hsu, Protein Secondary-Structure Modeling
with Probabilistic Networks, International Conference on Intelligent Systems
and Molecular Biology, pp. 109--117, 1993. One of the first applications of
hidden Markov models, and the first application of probabilistic networks (Bayes nets), to modeling proteins.
Salzberg, S., A. Delcher, S. Kasif and O. White, Microbial Gene Identification Using
Interpolated Markov Models, Nucleic Acids Research, 1997. A widely
used system for gene finding in microbial DNA.
Delcher, A., S. Kasif, R. Fleischmann, O. White and S. Salzberg, Whole Genome Alignment, Nucleic Acids Research, pp. 2369--2376, 1999. Used in the Minimal Organism study at TIGR (Science 10/1999), to detect chromosomal duplications in Arabidopsis (Nature 12/2000), and to detect inversions and large-scale duplications in microbial genomes.
Delcher, A., D. Harmon, S. Kasif, O. White and S. Salzberg, Improved Microbial Gene
Identification with Glimmer, Nucleic Acids Research, Vol. 27, No. 23, pp. 4636--4641,
1999. Glimmer II.
Tettelin, H., D. Radune, S. Kasif, H. Khouri, and S. Salzberg, Pipette Optimal Multiplexed PCR: Efficiently Closing Whole Genome Shotgun Sequencing Project, Genomics, Vol. 62, pp. 500--507, 1999.
Kasif, S., S. Salzberg, D. Waltz, J. Rachlin and D. Aha, Towards a Probabilistic Framework for Memory-Based Reasoning, Artificial Intelligence, November 1998. An early proposal for combining generative models and supervised learning (classification).
S. Kasif, Datascope: Mining Biological Sequences, IEEE Intelligent Systems, pp. 38--46, 1999.
Cai, D., B. Kao, S. Kasif and A. Delcher, "Modeling Splice Sites Using Bayes Networks", Bioinformatics, 2000.
Pavlovic, V., A. Garg and S. Kasif, "A Bayesian Framework for Combining Gene Predictions", Computational Genomics Nov. 2000.
A. Delcher, A. Grove, S. Kasif and J. Pearl, "Logarithmic Time Query-Update Inference Algorithms in Bayesian Networks", Journal of Artificial Intelligence Research, 1996. An early theoretical proposal for deploying Bayesian networks for perturbational analysis in biological systems.
Beigel, R., N. Alon, S.M. Apaydin, L. Fortnow and S. Kasif, "An Optimal Multiplex PCR Protocol for Closing Gaps in Whole Genomes", Computational Genomics, Nov 2000.
B. Logan, P. Moreno, B. Suzek, Z. Weng and Simon Kasif, "Remote-Homology Detection Using Feature Vectors", in review.
B. Suzek, S. Kasif, B. Logan and P. Moreno, "Remote-Homology Detection Using Feature Vectors", in review.
Beigel, R., N. Alon, S.M. Apaydin, L. Fortnow and S. Kasif, "An Optimal Multiplex PCR Protocol for Closing Gaps in Whole Genomes", RECOMB 2001. A theoretical treatment of multiplex PCR for gap closing in whole genome assembly.
S. Kasif, "Efficient Gene Discovery by Database Matching", by request.
Kasif, S., S. Banerjee, A. Delcher and G. Sullivan, "Some Results on the Complexity of Symmetric Connectionist Networks", Annals of Mathematics and Artificial Intelligence, pp. 327--344, Nov. 1993. Hopfield networks.
Murthy, S., S. Kasif and S. Salzberg, "A System for Induction of Oblique Decision Trees", Journal of Artificial Intelligence Research, 2:1 (1994), pp. 1--33. A popular decision tree system.
A very early paper on voting by committees of decision trees: D. Heath, S. Kasif, S. Salzberg, "Committees of decision trees", Conference on Multi-strategy Learning, 1993.
A full version of the above on bagged committees of decision trees. Voting (bagged) decision trees have proven rather useful, as shown by the seminal research and software produced by Leo Breiman (Random Forests): D. Heath, S. Kasif, S. Salzberg, "Committees of decision trees", in Gorayska and J. Mey (Eds.), Cognitive Technology: In Search of a Humane Interface, Elsevier Science, 1996.
NEW PAPER WATCH
ANNOTATED RECENT PAPERS
1*. Walker, M., V. Pavlovic, and S. Kasif, "A Comparative genomic method for the identification of prokaryotic translation sites", Nucleic Acids Research, 2002.
An application of comparative product HMMs (a new methodology for comparative genomics) to bacterial start-site identification. This is an increasingly important problem because comparative analysis of upstream sequences can be used to identify binding sites, leading to an understanding of regulation in bacteria.
2*. Zheng, Y., R. Roberts and S. Kasif, "Computational identification of operons in microbial genomes", Genome Research, 2002
Identification of complex operons in bacterial genomes. Applications include identification of polyketides for production of antibiotics.
3*. Zheng, Y., R. Roberts and S. Kasif, "Genomic functional annotation using co-evolution profiles of gene clusters", Genome Biology 2002.
Continued collaboration with Dr. Richard Roberts.
4*. Simon Kasif+, Zhiping Weng+, Richard Beigel, Charles DeLisi, "A Computational Framework for Optimal Masking in the Synthesis of Oligonucleotide Microarrays", Nucleic Acids Research 2002 (+ contributed equally).
A 10-20% improvement in the number of rounds needed to synthesize a microarray.
5*. Noga Alon, Richard Beigel, Simon Kasif, Steven Rudich and Benny Sudakov, "Learning a Hidden Matching", Proc. of Foundations of Computer Science (FOCS 2002). A theoretical treatment of multiplex PCR for closing gaps in shotgun sequencing of genomes; joint project with the Institute for Advanced Study, Princeton. A previous version of this algorithm has been used at the MIT Genome Center and The Institute for Genomic Research to close the gaps in several major pathogens and archaea.
6. Joseph D. Szustakowski, Ulas Karaoz, Serafim Batzoglou, James Galagan, Tarjei Mikkelsen, Zhiping Weng, Joel H. Graber, S. Kasif, "On the Organization of Ancient and Modern Genes in the Human Genome", in review (joint project with the MIT Genome Center).
7. B. Logan, P. Moreno, B. Suzek and S. Kasif, "Learning Remote-Homology Using Probabilistic Feature Vectors", in review.
A new approach for classification of proteins into structural families.
8*. Yang Su, T. M. Murali, S. Kasif, "RankGene", Bioinformatics, 2003.
A system to identify diagnostic genes in microarray data.
9*. T. M. Murali and S. Kasif, "Discovering Conserved Gene Expression Motifs in Microarray Data", PSB 2002.
Identifying groups of genes whose expression profile is relatively conserved across a collection of conditions.
10*. J. Zhang, V. Pavlovic, C. Cantor, S. Kasif, "Cross Species Gene Identification in Human and Mouse Sequences using Evidence Integration Frameworks", Genome Research 2003.
Joint project with Charles Cantor, Sequenom.
A new approach for gene prediction using multiple related genomes. The first analysis of error based on evolutionary distance between organisms.
11. Y. Zheng, R. Roberts, S. Kasif, "Identification of the Adaptivity Layer of Microbial Organisms using Whole-Genome Variability Profiles", to be submitted. Continued comparative genomics research with Rich Roberts on the structure of microbial organisms.
12*. N. O. Stitziel, J. Tseng, D. Pervouchine, D. Goddeau, S. Kasif, J. Liang, "Structural Location of Disease-Associated Single Nucleotide Polymorphisms", Journal of Molecular Biology, 2003.
An approach for classifying SNPs using a novel structural analysis tool and conservation analysis. A database with this information
13*. J. Wu, Simon Kasif, and Charles DeLisi, "Identification of functional links between genes using phylogenetic profiles", Bioinformatics 2003.
A rigorous methodology for assigning function based on phylogenetic profiles.
14*. D. Pervouchine, J. Graber and S. Kasif, "On the normalization of RNA equilibrium to sequence length", NAR, 2003.
An efficient method for whole genome RNA analysis.
15. T. M. Murali and S. Kasif, "Discovering Conserved Gene Expression Motifs in Microarray Data", in review.
16. S. Letovsky, S. Kasif, "A Probabilistic Approach to Gene Function Assignment and Propagation in Protein Interaction Networks", Bioinformatics 2003.
Prediction of protein function from protein-protein interaction networks: a systematic approach for functional gene annotation using protein-protein interaction graphs. 30-40% of genes in newly sequenced organisms have not been assigned a function. This approach is highly promising for functional annotation.
17. V. Pavlovic, J. Zhang, C. Cantor, S. Kasif, "The Effect of Evolution: on the Performance of Comparative Gene Finders", in
20. M. Schaffer, J. Tullai, S. Kasif and G. Cooper, Cold Spring Harbor Meeting on Systems Biology.
21. Noga Alon, Richard Beigel, Simon Kasif, Steven Rudich and Benny Sudakov, "Learning a Hidden Matching", in press, SIAM Journal on Computing.