Reading and Research in Computational Biology
CSE 590 CB is a weekly seminar on Readings and Research in
Computational Biology, open to all graduate students in the computer,
biological, and mathematical sciences.
| Instructors: | Larry Ruzzo, Rimli Sengupta |
| Credit: | 1-3 Variable |
| Grading: | Credit/No Credit. Talk to an instructor if you are unsure of our expectations. |
| Date | Topic | Presenters/Participants | Papers |
|---|---|---|---|
| 1/03 | Organizational Meeting | ||
| 1/10 | Guest Speaker "From gene expression data to cancer class discovery" | Amir Ben-Dor, Agilent Laboratories | Abstract |
| 1/17 | "Nonlinear PCA" | Tammy, Gidon; Larry | Papers |
| 1/24 | UTR Reconstruction | Chris, Jochen; Larry | Paper (ISMB 2000) |
| 1/31 | "Two-hybrid screens and group testing" | Zasha | |
| 2/07 | Guest Speaker "Plaid models for microarray data" | Art Owen, Stanford | Abstract |
| 2/14 | Guest Speaker "Comparison of prokaryotic genomes using colinear regions" | Joao Carlos Setubal, Instituto de Computacao - UNICAMP | Abstract |
| 2/21 | "Separating real motifs from their artifacts" | Mathieu, Saurabh | Abstract |
| 2/28 | Guest Speaker "Protein structure prediction: progress and prospects" | Ram Samudrala, UW Microbiology | Abstract |
| 3/07 | Guest Speaker "A Statistical Modeling Approach for Analyzing Microarray Data --- Questions, Issues and Challenges" | Lue Ping Zhao, FHCRC, Biostatistics | Abstract |
1/10: From gene expression data to cancer class discovery
Preprints of some of this work are available from his co-author's web site. Two papers of interest are:
There's also a paper on gene scoring methods (technical report, pdf file) at http://www.labs.agilent.com/resources/techreports.html.
1/17: "Nonlinear PCA"
Following up on last quarter's look at Principal Component
Analysis for gene expression microarray data, here
are two recent papers proposing related ideas for discovering
nonlinear structure in high-dimensional spaces.
2/07: Plaid models for microarray data
Abstract: A set of genes behaving similarly in a set of samples defines what we call a "layer". Layers are very much like clusters, except that genes can belong to more than one layer or to none of them, a layer may be defined with respect to only a subset of the samples, and the roles of genes and samples are symmetric in our formulation.
The plaid model is a superposition of two-way ANOVA models, each defined over subsets of genes and samples. This talk will present the plaid model, an interior-point-style algorithm for fitting it, and some examples from yeast DNA arrays and other problems.
This is joint work with Laura Lazzeroni.
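As a rough illustration of the "superposition of two-way ANOVA models" idea, the sketch below builds a small expression matrix from a background level plus layers, where each layer adds a mean, a gene effect, and a sample effect on its own subset of genes and samples. The function name `plaid_matrix`, the dictionary layout, and the symbols `mu`, `alpha`, and `beta` are assumptions made for this example. It only generates data of the assumed form and does not attempt the fitting algorithm from the talk.

```python
def plaid_matrix(n_genes, n_samples, background, layers):
    """Build an n_genes x n_samples matrix as background + sum of layers.

    Each layer is a dict (hypothetical layout, chosen for this sketch):
      "mu"    - layer mean
      "alpha" - {gene index: gene effect}; keys define the gene subset
      "beta"  - {sample index: sample effect}; keys define the sample subset
    """
    y = [[background] * n_samples for _ in range(n_genes)]
    for layer in layers:
        for i, gene_effect in layer["alpha"].items():
            for j, sample_effect in layer["beta"].items():
                # Additive two-way ANOVA term, applied only inside the layer.
                y[i][j] += layer["mu"] + gene_effect + sample_effect
    return y


# Two layers that overlap at gene 1 / sample 1; gene 3 belongs to no layer.
layers = [
    {"mu": 2.0, "alpha": {0: 0.5, 1: -0.5}, "beta": {0: 0.0, 1: 1.0}},
    {"mu": -1.0, "alpha": {1: 0.0, 2: 0.0}, "beta": {1: 0.0, 2: 0.0}},
]
y = plaid_matrix(4, 3, 1.0, layers)
for row in y:
    print(row)
```

Gene 1 sits in both layers and gene 3 in none, which is the kind of membership that ordinary clustering does not allow.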
2/14: Comparison of prokaryotic genomes using colinear regions
Abstract: Now that there are dozens of prokaryotic genomes publicly available (soon to become hundreds or even thousands), there is a major drive towards taking full advantage of the information provided by such complete genomes. One way to compare genomes is to look for runs of corresponding consecutive genes (colinear regions). Most of these runs, when properly clustered, contain genes that are functionally related, so the cluster analysis can help the understanding of metabolic pathways as well as provide clues to functions of hypothetical genes. In this talk I will review previous work that has been done along these lines and describe data on such runs that I have collected from comparisons of five bacterial genomes.
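One simple reading of "runs of corresponding consecutive genes" can be sketched as follows. This is an illustrative assumption and not the speaker's method: it takes each genome as an ordered list of gene names, assumes a one-to-one ortholog table, allows no gaps, and accepts a run in either the same or the reverse order. The names `colinear_runs`, `orthologs`, and `min_len` are hypothetical.

```python
def colinear_runs(genome_a, genome_b, orthologs, min_len=2):
    """Find maximal runs of consecutive genes in genome_a whose orthologs
    are also consecutive in genome_b (same or reverse order, no gaps).

    Assumes orthologs maps genome_a genes to genome_b genes one-to-one.
    """
    pos_b = {gene: idx for idx, gene in enumerate(genome_b)}
    # Position in genome_b of each genome_a gene's ortholog, or None.
    mapped = [pos_b.get(orthologs.get(gene)) for gene in genome_a]

    runs = []
    start = 0
    step = 0  # becomes +1 or -1 once the direction of the run is known
    for i in range(1, len(genome_a) + 1):
        extends = False
        if (i < len(genome_a)
                and mapped[i] is not None
                and mapped[i - 1] is not None):
            d = mapped[i] - mapped[i - 1]
            if d in (1, -1) and step in (0, d):
                step = d
                extends = True
        if not extends:
            # Close the current run and start a new one at gene i.
            if i - start >= min_len:
                runs.append(genome_a[start:i])
            start = i
            step = 0
    return runs


genome_a = ["a1", "a2", "a3", "a4", "a5", "a6"]
genome_b = ["b3", "b2", "b1", "b9", "b5", "b6"]
orthologs = {"a1": "b1", "a2": "b2", "a3": "b3", "a5": "b5", "a6": "b6"}
runs = colinear_runs(genome_a, genome_b, orthologs)
print(runs)  # [['a1', 'a2', 'a3'], ['a5', 'a6']]
```

Real comparisons would likely need to tolerate small gaps and insertions. The strict adjacency rule here is only meant to make the notion of a run concrete.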
2/21: Separating real motifs from their artifacts
Abstract: The typical output of many computational methods to identify binding sites is a long list of motifs containing some real motifs (those most likely to correspond to the actual binding sites) along with a large number of random variations of these. We present a statistical method to separate real motifs from their artifacts. This produces a short list of high quality motifs that is sufficient to explain the over-representation of all motifs in the given sequences. Using synthetic data sets, we show that the output of our method is very accurate. On various sets of upstream sequences in S. cerevisiae, our program identifies several known binding sites, as well as a number of significant novel motifs.
2/28: Protein structure prediction: progress and prospects
Abstract: The Critical Assessment of protein Structure Prediction (CASP) methods conference was instigated to ensure that protein structure prediction approaches are tested rigorously without advance knowledge of the experimental answer. We have made predictions at all four CASP meetings, each time improving upon previously developed methodologies. In the recent CASP4 experiment, we made predictions in all three prediction categories: comparative modelling, fold recognition, and ab initio prediction. The talk will focus on the performance of our prediction methodologies in the context of ongoing structural genomics efforts.
3/07: A Statistical Modeling Approach for Analyzing Microarray Data --- Questions, Issues and Challenges
Abstract: Functional genomic studies are now routinely conducted by biomedical researchers, generating a huge amount of expression data. Preliminary exploration of such data has yielded much useful information, and has also indicated many challenges facing the continuing success of functional genomics. One objective of this talk is to identify some of the challenging issues. Another objective is to identify additional research questions one can address in the analysis. Lastly, I will describe a statistical modeling approach that we have developed for analyzing microarray data. To illustrate the methodology, we apply it to the analysis of a leukemia data set, some results from which will be highlighted in this talk.
Department of Computer Science & Engineering, University of Washington, Box 352350, Seattle, WA 98195-2350. (206) 543-1695 voice, (206) 543-2969 FAX. Comments to cse590cb-webmaster@cs.washington.edu