CS 415/515: Computational Biology: Sequence Analysis
Catalog Description: Design and analyze algorithms that address the computational problems posed by biological sequence data, such as DNA or protein sequences. Topics may include: comparing sequences (from genes to genomes), database searching, multiple sequence alignment, phylogenetic inference, gene discovery and annotation, and genome assembly. Additional class presentation and/or paper required for graduate credit.
Type: CS 415 is a technical elective for CS majors. CS 515 is available for graduate credit.
Total Credits: 3
Course Coordinator: Robert Heckendorn
Prereq: Knowledge of a high-level programming language, basic probability theory, and basic molecular biology; or Permission
Textbook: Biological Sequence Analysis by Durbin et al., Cambridge University Press. Optionally: The Quick Python Book, 2nd ed., Manning Publications.
Textbook URL: http://www.cambridge.org/us/academic/subjects/life-sciences/genomics-bioinformatics-and-systems-biology/biological-sequence-analysis-probabilistic-models-proteins-and-nucleic-acids, https://www.manning.com/books/the-quick-python-book-second-edition
Prerequisites by Topic:
- Basic programming skills in any high-level language
- Solid intuition for basic math and statistics
- A love of digging into algorithms and studying processes
Major Topics Covered:
- Introduction to the biology of biological sequences (6hrs)
- Analysis of algorithms, brute force techniques and search trees (3hrs)
- Branch and bound, and motif finding (3hrs)
- Pairwise sequence alignment (3hrs)
- Dynamic programming (3hrs)
- Sequence search with approximate matching and BLAST (3hrs)
- Shotgun sequencing (3hrs)
- Markov chains and hidden Markov models (6hrs)
- Identification of sequence families (3hrs)
- Multiple sequence alignment (1hr)
- Sequence generation from models (2hrs)
- Deterministic phylogenetic analysis (2hrs)
- Bootstrapping phylogenies (1hr)
- Probabilistic phylogenetic analysis including maximum likelihood (3hrs)
- Evolutionary computation applied to phylogenetic analysis (3hrs)
If time allows:
- Parallel processing techniques and bioinformatics
- Transformational grammars
Course Outcomes:
- Recognize the common algorithms used in gene sequencing.
- Compare different gene sequencing algorithm approaches: their advantages, disadvantages, and tradeoffs.
- Describe sequence searches and approximate matching.
- Understand Markov chains as applied to sequencing.
- Understand how to construct a phylogenetic analysis.
- Understand the relationship between evolutionary computation and phylogenetic analysis.