BIOLOGICAL SEQUENCE ANALYSIS

Math/Stats 547 (Lecture) and 548 (Lab) MWF, 9-10 (1084 East Hall); Fri, 10-11 (B743 East Hall)

 

Syllabus

Assignments

Lab Worksheets

548 Resources

Web Resources

Term Project

Speaker Schedule

Outside Seminars

Contact Instructor

Instructor: Dan Burns

Office: 5834 East Hall

Phone: 763-0152

E-mail: dburns@umich.edu


Under Construction!

This page will provide convenient access to class events as they are scheduled, to the assignments and group rosters, and to a set of links to online papers and web sites that will be of use in the course. For your convenience, here is the first-day handout summarizing the course, as well as the course announcement, which gives a bit more syllabus detail. Note, however, that I am thinking of modifying a few things over the course of the term due to recent developments.

Problem set assignments will be available via the Assignments link in the left sidebar. The first problem sets will be due Wednesday, February 4.

There will be no examinations in the class. There will be a final project, which will consist of studying a particular subject and making a presentation to the class. This will be a twenty to twenty-five minute PowerPoint presentation, done in teams of two. There will be a page of suggested topics, though you are free to choose a topic of your own (it must be approved, however).

Link here to the page for the final project, including suggested topics, etc.


Schedule of Readings and Detailed Syllabus:

Click on a section subject for a relevant link, if available. DE means Durbin, Eddy, et al., "Biological Sequence Analysis".

March 3-7, 2003
Read: Durbin and Eddy, Chapters 7 & 8 (as much as possible)
Phylogeny, especially for protein families. Many approaches to phylogeny, which is, again, a computationally hard problem.

March 7-10, 2003
Kahn, Qian & Goldstein 2000; Qian, Goldstein 2002.
Phylogeny for protein families, in particular its use in making multiple sequence alignments more accurate. (The preprint links, on the right, are more directly relevant than the reprint links, on the left.) Tree-based HMM's for multiple sequence alignment (m.s.a.) and classification of GPCR's (= G-protein coupled receptors).

March 12-19, 2003 (March 14 canceled)
Read: DE, Chap. 7 (parsimony); Chap. 8 (ML); HMM: Felsenstein-Churchill 1996, Mitchison-Durbin 1995 (not linkable; will be distributed in class).
Phylogeny: ML; parsimony; HMM's in phylogeny.
Parsimony is the most used method for tree estimation; the HMM's allow substitution rates to vary across sequence positions.

March 21-24, 2003
Reference: Brian Ripley, "Pattern Recognition and Neural Networks", Cambridge UP (1995), Ch. 5: Feedforward Neural Nets.
Basics of NN's: feedforward nets, supervised training, backpropagation algorithm; gradient descent minimization.
The "vanilla" settings for NN's; the complete literature is vast; few rigorous arguments, very heuristic field.

March 26-28, 2003
Neural nets in promoter recognition: NNPP; M. Reese, Comps. & Chem., 26 (1998) 51-56.
Time delay NN's; application to eukaryotic promoter site recognition.
Typical use of a NN for pattern recognition, with a modification that allows the "same" signal to be recognized at flexible locations.

March 31, 2003
Probabilistic version of promoter recognition: McPromoter. Ohler et al., 1999.
Interpolated Markov chains; application to eukaryotic promoter site recognition.
Use of higher order Markov chains for pattern recognition, with a modification to allow for flexible use of available data: weighted use of shorter and longer context sequences, with (non-probabilistically enforced) weighting of more commonly occurring context sequences.

April 2-4, 2003
Improvements in McPromoter; Ohler et al., 2001.
Incorporating biophysical properties of sequences. Ohler's extension of McPromoter to include DNA physics; intro to duplex stress and gene promoters (after Benham et al.).

April 4-14, 2003
Term Project Presentations: Good luck!
Great variety of topics. Visitors welcome: schedule of speakers.

April 14-17, 2003
End of Term: Sonnhammer et al., 1998; Martelli et al. (2002).
Trans-membrane proteins: recognizing helices and beta barrels. Using HMM's for structural feature recognition.
(Papers taken from the suggested project topics.)
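
To make the interpolated Markov chain idea from the March 31 entry a bit more concrete, here is a minimal Python sketch. It is not the McPromoter implementation, nor the estimation scheme of Ohler et al.; the function names (train_counts, interpolated_prob), the constant k_const, and the weighting rule lam = n / (n + k_const) are all assumptions chosen for simplicity. The only point is to show how estimates from shorter and longer contexts can be blended, with contexts seen more often in the training data receiving more weight.

```python
from collections import defaultdict

ALPHABET = "ACGT"


def train_counts(sequences, max_order):
    """Count how often each symbol follows every context of length 0..max_order."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        for i, sym in enumerate(seq):
            for k in range(max_order + 1):
                if i - k < 0:
                    break  # not enough preceding symbols for a context of length k
                counts[seq[i - k:i]][sym] += 1
    return counts


def interpolated_prob(counts, context, sym, max_order, k_const=5.0):
    """Estimate P(sym | context) by blending orders 0..max_order.

    Starting from a uniform distribution, each longer context contributes
    its relative-frequency estimate with weight lam = n / (n + k_const),
    where n is the number of times that context was seen in training.
    (This weighting rule is an illustrative assumption.)
    """
    prob = 1.0 / len(ALPHABET)
    for k in range(min(max_order, len(context)) + 1):
        ctx = context[len(context) - k:]  # last k symbols of the context
        seen = counts.get(ctx)
        total = sum(seen.values()) if seen else 0
        if total == 0:
            break  # an unseen context implies all longer ones are unseen too
        lam = total / (total + k_const)
        ml_estimate = seen.get(sym, 0) / total
        prob = lam * ml_estimate + (1.0 - lam) * prob
    return prob
```

For example, after counts = train_counts(["ACGACGACG"], 2), the call interpolated_prob(counts, "AC", "G", 2) gives a larger value than the same call with "T", and the four values over ALPHABET sum to one, since each step is a convex combination of probability distributions.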