CS 415/515

CS 415/515: Computational Biology: Sequence Analysis

Catalog Description: Design and analyze algorithms that address the computational problems posed by biological sequence data, such as DNA or protein sequences. Topics may include: comparing sequences (from genes to genomes), database searching, multiple sequence alignment, phylogenetic inference, gene discovery and annotation, and genome assembly. Additional class presentation and/or paper required for graduate credit.

Type: CS 415 is a technical elective for CS majors. CS 515 is available for graduate credit.

Total Credits: 3

Course Coordinator: Robert Heckendorn

URL: http://marvin.cs.uidaho.edu/~heckendo/CS515/

Prereq: Knowledge of a high-level programming language, basic probability theory, and basic molecular biology; or permission

Textbook: Biological Sequence Analysis by Durbin et al., Cambridge University Press. Optional: The Quick Python Book, 2nd ed., Manning Publications.

Textbook URL: http://www.cambridge.org/us/academic/subjects/life-sciences/genomics-bioinformatics-and-systems-biology/biological-sequence-analysis-probabilistic-models-proteins-and-nucleic-acids, https://www.manning.com/books/the-quick-python-book-second-edition

Prerequisites by Topic:

  • Basic programming skills in any high-level language
  • Solid intuition for basic math and statistics
  • A love of digging into algorithms and studying processes

Major Topics Covered

  • Introduction to the biology of biological sequences (6hrs)
  • Analysis of algorithms, brute force techniques and search trees (3hrs)
  • Branch and bound and motif finding (3hrs)
  • Pairwise sequence alignment (3hrs)
  • Dynamic programming (3hrs)
  • Sequence search with approximate matching and BLAST (3hrs)
  • Shotgun sequencing (3hrs)
  • Markov chains and hidden Markov models (6hrs)
  • Identification of sequence families (3hrs)
  • Multiple sequence alignment (1hr)
  • Sequence generation from models (2hrs)
  • Deterministic phylogenetic analysis (2hrs)
  • Bootstrapping phylogenies (1hr)
  • Probabilistic phylogenetic analysis including maximum likelihood (3hrs)
  • Evolutionary computation applied to phylogenetic analysis (3hrs)

If time allows:

  • Parallel processing techniques and bioinformatics
  • Transformational grammars

Course Outcomes

  1. Recognize the common algorithms used in gene sequencing.
  2. Compare different gene sequencing algorithm approaches: their advantages, disadvantages, and tradeoffs.
  3. Describe sequence searches and approximate matching.
  4. Understand Markov chains as applied to sequencing.
  5. Understand how to construct a phylogenetic analysis.
  6. Understand the relationship between evolutionary computation and phylogenetic analysis.