Artificial intelligence techniques for bioinformatics
Professor A. Narayanan (A.Narayanan@ex.ac.uk) was appointed Lecturer in Computer Science at the University of Exeter in 1980 and has taught various modules in computer science (artificial intelligence and machine learning techniques), cognitive science (mind/brain issues) and philosophy (philosophy of mind and language). He designed and developed the MSc/MRes programme in Bioinformatics in 1999, and he teaches machine learning techniques for bioinformatics and bioethics on that programme. His CV can be found at http://www.dcs.ex.ac.uk/~anarayan/. He has published several papers on the application of artificial intelligence techniques to bioinformatics. His recent professional activity includes serving as an advisory board member for the 2004 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB04) and the 2004 IEEE International Conference on Intelligent Data Engineering and Automated Learning (IDEAL’04).
Dr Keedwell is a researcher in the School of Engineering and Computer Science whose PhD thesis concerned a neural-genetic model for gene expression analysis. He has several publications on the application of neural networks and genetic algorithms. He is co-author with Professor Narayanan of ‘Intelligent Bioinformatics’, a book to be published by Wiley in 2005. Further details can be found at http://www.ex.ac.uk/~eckeedwe.
The presenters gave a four-hour tutorial on Machine Learning Techniques for Bioinformatics at the Intelligent Systems for Molecular Biology conference (ISMB03) in Brisbane, Australia, to an audience of 150 delegates. The slides from that tutorial, available at http://www.dcs.ex.ac.uk/~anarayan/ismb03/ismb_tutorial.ppt, give an indication of the style and quality of presentation for this proposed tutorial.
Expected Goals, Objectives and Motivation:
There is growing interest in the application of artificial intelligence (AI) techniques in bioinformatics. In particular, there is an appreciation that many bioinformatics problems need to be addressed in new ways, given either the intractability of current approaches or the lack of an informed and intelligent way to exploit biological data. As an instance of the latter, there is an urgent need to identify new methods for extracting gene and protein networks from the rapidly proliferating gene expression and proteomic datasets. As an instance of the former, some algorithms may well be able to predict from first principles how a protein folds for sequences of 20 or so amino acids, but once the sequences become biologically plausible (200 or 300 amino acids and more), current protein folding algorithms that work from first principles rapidly become intractable.
Detailed outline of the presentation
The tutorial will consist of three parts. First, the basics of molecular biology will be introduced, informed by the latest discoveries in genomics, spliceosomics, transcriptomics and proteomics. Next, a variety of AI approaches to problems in these areas will be described, including classical symbolic machine learning techniques (nearest neighbour and identification tree approaches), supervised and unsupervised neural networks, and evolutionary computation techniques (genetic algorithms, genetic programming, cellular automata). Finally, novel hybrid methods will be introduced, including genetic neural networks and symbolically informed neural networks. The tutorial will be delivered at the pace the audience requires, with questions during presentation being encouraged. Problems and application areas will include: secondary structure protein folding prediction; viral protease cleavage prediction; cancer gene expression data mining; temporal gene expression data analysis; multiple sequence alignment; reverse engineering gene regulatory networks.
The material for the tutorial will be based on a paper, ‘Artificial intelligence techniques for Bioinformatics’, written by the presenters and recently published in Applied Bioinformatics (available from http://www.dcs.ex.ac.uk/~anarayan/publications/).
Slides and examples will be taken from this paper, supplemented with further material from our research papers and the forthcoming book, Intelligent Bioinformatics by Narayanan and Keedwell (Wiley, 2005). The tutorial presenters will use See5 and SNNS (Stuttgart Neural Network Simulator) for demonstration purposes, as well as other purpose-built genetic algorithm and evolutionary computation software. Tutorial attendees will receive free copies of slides.
There is currently a great deal of interest among computer scientists in the application of genetic algorithms, neural networks and machine learning techniques to bioinformatics problems. The tutorial will introduce the basics of molecular biology and then provide examples of how AI techniques can be used to help analyse gene expression data, reverse engineer gene regulatory networks, form multiple alignments and predict protein structure. Examples will be taken from the bioinformatics literature and the research programmes of the tutorial presenters. The tutorial audience will therefore immediately see the relevance of these techniques to bioinformatics problems. Additionally, new ‘discoveries’ made by these techniques will be presented to demonstrate the value of applying AI and machine learning techniques to a variety of bioinformatics problems, including alternative models to the standard theory of cancer and the identification of new drug targets.