GS559: Introduction to Statistical and Computational Genomics (Winter 2013)

Instructors:
   Jim Thomas, jht@uw.edu
   Elhanan Borenstein, elbo@uw.edu

Schedule: Tues. and Thurs., 3:30-4:50, Hitchcock 220.

Links:

News:

» The final exam will take place in class on Thursday, March 14 (the last class of the quarter). It will have two parts: the first will focus on the bioinformatics topics covered in class, and the second on programming. You may use any static resource (e.g., books, notes).
» Problem Set 6 is now posted below.
» Problem Set 5 is now posted below.
» Problem Set 4 is now posted below.
» Problem Set 2 Answers and Problem Set 3 are now posted below.


Assignments:

Test/Demo Files

The following files are used in some of the in-class exercises and demos.

speech.txt
hello.txt
four.txt
nine.txt
matrix.txt
small.fasta
large.fasta
cfam_repmask.txt
cfam_repmask2.txt
scores.txt
seq_names.txt
ko.txt
reaction.txt
genome.txt
ps6colswap.txt
enzyme.txt
warandpeace.txt
crispian.txt



Lectures and Reading:

Lecture # | Lecture Topic | Programming Topic | Reading
1 | Overview of course. Introduction to sequence comparison. BLAST, alignment scoring (PDF) | Introduction to Python. Interpreter, objects, types, variables, command line (PDF) | [1, 2]
2 | Sequence alignment - dynamic programming (PDF) | Strings (PDF)
3 | Sequence alignment - local alignment (PDF) | Numbers, lists, tuples (PDF)
4 | Sequence alignment - protein score matrices (PDF) | File input-output, if-then-else (PDF)
5 | Sequence alignment - significance of similarity scores (PDF) | For loops (PDF)
6 | Significance of similarity scores continued (PDF) | While loops (PDF)
7 | Whole genome alignments (PDF) | Loops and efficient code; More practice on loops (no solutions) (PDF)
8 | Sequence trees - distance trees (PDF) | Dictionaries (hash maps) (PDF) | [3]
9 | Parsimony (PDF) | Functions (PDF)
10 | Small parsimony (PDF) | Functions as arguments, sorting (PDF)
11 | Gene ontology and functional enrichment (PDF) | More on functions, modules (PDF)
12 | Gene set enrichment analysis (PDF) | Classes and objects (PDF)
13 | Gene expression: Clustering (PDF) | More on classes and objects (PDF)
14 | Gene expression: K-means clustering (PDF) | Regular expressions (PDF)
15 | Biological networks; Dijkstra's algorithm (PDF) | More regular expressions (PDF)
16 | Gene prediction (PDF) | Exceptions (PDF)
17 | Degree distribution and network motifs (PDF) | More on classes, Biopython (PDF)
18 | Recursion (PDF)
19 | Project (PDF)
20 | Final Exam

References:

Electronic access to journals is generally free from on-campus computers. For off-campus access, follow the "[offcampus]" links or look at the library "proxy server" instructions.

  1. Noble, WS, "A quick guide to organizing computational biology projects." PLoS Comput. Biol. 5 (2009) e1000424. Pmid: 19649301 [Offcampus]
  2. Dudley, JT and Butte, AJ, "A quick guide for developing effective bioinformatics programming skills." PLoS Comput. Biol. 5 (2009) e1000589. Pmid: 20041221 [Offcampus]
  3. How dictionaries work (aka hash tables or hash maps)
  4. Subramanian et al., "Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles." PNAS 102(43) (2005).

Python Resources:

   General
Regular Expressions
"RegExPal" (For Javascript rather than Python, but similar and quite handy. Try it!)
Biopython
Python Books
Python for Software Design: How to Think Like a Computer Scientist by Allen B. Downey. (Includes early drafts of our textbook; cheaper than the published version, but less polished.)
Learning Python by Mark Lutz. O'Reilly (Very comprehensive. Much is accessible to beginners.)
Dive Into Python 3 by Mark Pilgrim. (Another online book. Based on Python 3, so some differences, and more advanced, but also free.)

Bioinformatics Books

» Biological sequence analysis: probabilistic models of proteins and nucleic acids, R. Durbin, S. Eddy, A. Krogh, and G. Mitchison, Cambridge. (Excellent reference; a classic.)
» Inferring Phylogenies, Joseph Felsenstein, Sinauer, 2004. (Excellent reference on this topic.)
» Introduction to Computational Genomics: A Case Studies Approach, Cristianini, Nello & Hahn, Matthew, Cambridge, 2007.
» An Introduction to Bioinformatics Algorithms, Neil C. Jones & Pavel A. Pevzner, 2004.
» Bioinformatics: Sequence and Genome Analysis, David W. Mount, Cold Spring Harbor Laboratory Press.
» Python for Bioinformatics, Sebastian Bassi, CRC Press, 2010. (A little too advanced as a programming book for beginners, but fine now that you're experienced.)
» Python for Bioinformatics, Jason Kinser, Jones and Bartlett, 2009. (Ditto.)



James H. Thomas
Department of Genome Sciences
University of Washington
Elhanan Borenstein
Departments of Genome Sciences
University of Washington