BIOLOGICAL SEQUENCE ANALYSIS
Math/Stats 547 (Lecture) and 548 (Lab): MWF, 9-10 (1060 East Hall); Tu, 8:30-9:30 (5631 MedSci II)
Recall the basic outlines of the project:
You may certainly use a paper for your project that is not online, but copies will then have to be xeroxed for the other participants beforehand.
Suggested Topics and Papers:

a) Scoring and statistical methods:
1. S. Karlin and S. F. Altschul, Methods for assessing the statistical significance of molecular sequence features by using general scoring schemes, Proc. Nat. Acad. Sci. USA 87(1990), 2264-2268.
2. R. Schwartz and Y.-L. Chow, The N-best algorithm: an efficient and exact procedure for finding the n most likely hypotheses. In Proceedings of ICASSP '90, 81-84.
3. L. R. Cardon, C. Burge, D. A. Clayton and S. Karlin, Pervasive CpG suppression in animal mitochondrial genomes, Proc. Nat. Acad. Sci. USA 91(1994), 3799-3803.
4. K. Sjölander, K. Karplus, M. Brown, R. Hughey, A. Krogh, I. S. Mian and D. Haussler, Dirichlet mixtures: A method for improved detection of weak but significant protein sequence homology, CABIOS 12(1996), 327-345.
5. S. Govindarajan, R. Recabarren and R. A. Goldstein, Estimating the total number of protein folds, Proteins 35(1999), 408-414.
6. Bin Qian and R. A. Goldstein, Distribution of indel lengths, Proteins 45(2001), 102-104.

b) Multiple sequence alignment and analysis:
1. G. Hertz and G. Stormo, Identifying DNA and protein patterns with statistically significant alignments of multiple sequences, Bioinformatics 15(1999), 563-577.
2. A. Bateman, E. Birney, L. Cerruti, R. Durbin, L. Etwiller, S. Eddy, S. Griffiths-Jones, K. L. Howe, M. Marshall and E. L. Sonnhammer, The Pfam protein families database, Nucleic Acids Research 30(2002), 276-80.

c) Gene finding, gene structure:
1. N. Miyajima, C. Burge and T. Saito, Computational and experimental analysis identifies many novel human genes, Biochemical & Biophysical Research Communications 272(2000), 801-807.
2. A. Krogh, Using database matches with HMMgene for automated gene detection in Drosophila, Genome Research 10(2000), 523-528.
3. M. Das, C. Burge, E. Park, J. Colinas and J. Pelletier, Assessment of the total number of human transcription units, Genomics 77(2001), 71-78.
4. L. P. Lim and C. Burge, A computational analysis of sequence features involved in recognition of short introns, Proc. Nat. Acad. Sci. USA 98(2001), 11193-11198.
5. M. Skovgaard, L. J. Jensen, S. Brunak, D. Ussery and A. Krogh.
6. W. Fairbrother, R. Yeh, P. Sharp and C. Burge, Predictive identification of exonic splicing enhancers in human genes, Science 297(2002), 1007-1013.
7. L. J. Jensen, R. Gupta, N. Blom, D. Devos, J. Tamames, C. Kesmir, H. Nielsen, H. H. Staerfeldt, K. Rapacki, C. Workman, C. A. Andersen, S. Knudsen, A. Krogh, A. Valencia and S. Brunak, Prediction of human protein function from post-translational modifications and localization features, Journal of Molecular Biology 319(2002), 1257-1265.

d) Protein structure:
1. P. L. Martelli, P. Fariselli, A. Krogh and R. Casadio.
2. E. L. L. Sonnhammer, G. von Heijne and A. Krogh, A hidden Markov model for predicting transmembrane helices in protein sequences.

e) Phylogeny:
1. J. H. Huelsenbeck and R. Nielsen, Variation in the pattern of nucleotide substitution rate, Journal of Molecular Evolution 48(2000), 86-93.
2. T. Mueller and M. Vingron, Modeling amino acid replacement, Journal of Computational Biology 7(2000), 761-776.
3. J. A. Eisen, Phylogenomics: improving functional predictions by evolutionary analysis, Genome Research 8(1998), 163-167.

f) Finding RNA genes:
1. T. Lowe and S. Eddy, A computational screen for methylation guide snoRNAs in yeast, Science 283(1999), 1168-1171.
2. E. Rivas and S. R. Eddy, Secondary structure alone is generally not statistically significant for the detection of noncoding RNAs, Bioinformatics 16(2000), 583-605.
3. E. Rivas, R. J. Klein, T. A. Jones and S. R. Eddy, Computational identification of noncoding RNAs in E. coli by comparative genomics, Current Biology 11(2001), 1369-73.
4. R. J. Klein, Z. Misulovin and S. Eddy, Noncoding RNA genes identified in AT-rich hyperthermophiles, Proc. Nat. Acad. Sci. USA 99(2002), 7542-7547.

g) Attempted medical applications:
1. S. Karlin and C. Burge, Trinucleotide repeats and long homopeptides in genes and proteins associated with nervous system disease and development, Proc. Nat. Acad. Sci. USA 93(1996), 1560-1565.