Reading and Research in Computational Biology
CSE 590 CB is a weekly seminar on Readings and Research in
Computational Biology, open to all graduate students in the computer,
biological, and mathematical sciences.
Organizers: Larry Ruzzo, Rimli Sengupta, Martin Tompa
Credit: 1-3 Variable
Grading: Credit/No Credit. Talk to the organizers if you are unsure of our expectations.

| Date | Presenters/Participants | Topic | Papers |
|---|---|---|---|
| 04/01 | Organizational Meeting | | |
| 04/08 | Saurabh Sinha | Discriminatory Motifs | Abstract |
| 04/15 | Hamid Bolouri, U. Hertfordshire & Caltech | Why high-throughput biology needs computer science | Abstract |
| 04/22 | Amol Prakash | Identifying Muscle Regulatory Elements... | Paper |
| 04/29 | Zizhen Yao, Brian Tjaden | ...Markov Chain Optimization | Paper |
| 05/06 | Emily Rocke | A hybrid scoring metric for protein multiple alignment | |
| 05/13 | Gidon Shavit, Dan Grossman | Composite Patterns | Paper |
| 05/20 | Kevin Karplus, UC Santa Cruz | Protein Structure Prediction (Fold Recognition) using Hidden Markov Models | |
| 05/27 | Holiday | | |
| 06/03 | | | |
Abstract (04/08, Saurabh Sinha, Discriminatory Motifs): This paper takes a new view of motif discovery, addressing a common problem in existing motif finders. A motif is treated as a feature of the input promoter regions that leads to a good classifier between these promoters and a set of background promoters. This perspective allows us to adapt existing methods of feature selection, a well-studied topic in machine learning, to motif discovery. We develop a general algorithmic framework that can be specialized to work with a wide variety of motif models, including consensus models with degenerate symbols or mismatches, and composite motifs. A key feature of our algorithm is that it measures over-representation while maintaining information about the distribution of motif instances in individual promoters. The assessment of a motif's discriminative power is normalized against chance behaviour by a probabilistic analysis. We apply our framework to two popular motif models, and are able to detect several known binding sites in sets of co-regulated genes in yeast.
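The idea in the abstract above, scoring a motif by how well its presence separates the input promoters from background promoters, can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes exact k-mers as the motif model, uses a plain difference of per-set fractions as the score, and omits the probabilistic normalization the abstract mentions. The function names and the toy sequences are hypothetical.

```python
def kmers_in(seq, k):
    """Return the set of distinct k-mers that occur in one promoter sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def discriminative_kmers(positives, background, k):
    """Rank exact k-mers by how well 'promoter contains the k-mer' separates
    the positive promoters from the background promoters.

    Assumed score (illustrative only): fraction of positive promoters that
    contain the k-mer minus the fraction of background promoters that do.
    Presence is counted once per promoter, so a k-mer repeated many times in
    a single promoter does not dominate the score.
    Both input lists are assumed to be non-empty.
    """
    pos_sets = [kmers_in(s, k) for s in positives]
    bg_sets = [kmers_in(s, k) for s in background]

    # Only k-mers seen in at least one positive promoter are candidates.
    candidates = set().union(*pos_sets)

    scored = []
    for kmer in candidates:
        pos_frac = sum(kmer in s for s in pos_sets) / len(pos_sets)
        bg_frac = sum(kmer in s for s in bg_sets) / len(bg_sets)
        scored.append((pos_frac - bg_frac, kmer))

    # Highest score first; ties broken alphabetically for a stable order.
    scored.sort(key=lambda t: (-t[0], t[1]))
    return scored


if __name__ == "__main__":
    # Toy data: "ACGT" occurs in every positive promoter and in no background one.
    positives = ["ACGTTGCA", "TTACGTAA", "GGACGTCC"]
    background = ["TTTTTTTT", "GGGGCCCC", "AATTAATT"]
    ranked = discriminative_kmers(positives, background, k=4)
    print(ranked[0])  # (1.0, 'ACGT')
```

Counting presence per promoter rather than total occurrences is one simple way to keep track of how motif instances are distributed across individual promoters; richer motif models (degenerate symbols, mismatches, composite motifs) would replace `kmers_in` with a different matching rule.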
Abstract (04/15, Hamid Bolouri, Why high-throughput biology needs computer science): There has been much talk in recent years of the need for mathematicians, physicists, and engineers to underpin and extend the emerging biotechnologies. In contrast, the contributions of computer scientists to disciplines such as whole-genome sequencing and bioinformatics have sometimes been portrayed as "unglamorous" software development.
However, many of the challenges that arise as we move beyond collecting static parts lists and attempt to build a dynamic understanding of how the parts interact are fundamental computer science research issues. I will describe some examples arising from research in my group.
Abstract (04/22, Amol Prakash, Identifying Muscle Regulatory Elements...): We report the identification of several putative muscle-specific regulatory elements, and of genes that are expressed preferentially in the muscle of the nematode Caenorhabditis elegans. We used computational pattern-finding methods to identify cis-regulatory motifs from promoter regions of a set of genes known to express preferentially in muscle; each motif describes the potential binding sites for an unknown regulatory factor. The significance and specificity of the identified motifs were evaluated using several different control sequence sets. Using the motifs, we searched the entire C. elegans genome for genes whose promoter regions have a high probability of being bound by the putative regulatory factors. Genes that met this criterion and were not included in our initial set were predicted to be good candidates for muscle expression. Some of these candidates are additional, known muscle-expressed genes, and several others are shown here to be preferentially expressed in muscle cells by using GFP (green fluorescent protein) constructs. The methods described here can be used to predict the spatial expression pattern of many uncharacterized genes.
Department of Computer Science & Engineering, University of Washington, Box 352350, Seattle, WA 98195-2350
(206) 543-1695 voice, (206) 543-2969 FAX
Comments to cse590cb-webmaster@cs.washington.edu