GS559: Introduction to Statistical and Computational Genomics (Winter 2017)

Instructors:
   Elhanan Borenstein, elbo@uw.edu
   William S Noble, wnoble@uw.edu

Teaching Assistant: Lindsay Pino, lpino@uw.edu

Schedule: Tue. and Thur., 10:30am-11:50am, Foege S110.

Links:

News:


Assignments:

You are welcome to talk to classmates about principles for solving problems, but do NOT solve specific problems together.
In many ways, solving the problems yourself, especially the programming ones, is where you will learn the most in this class.

All problem sets are due by the start of class on the date listed.
Grades will come 80% from problem sets and 20% from one final exam. There will be no mid-term exams.

Test/Demo Files

The following files are used in some of the in-class exercises and demos.

sonnet.txt
speech.txt
hello.txt
four.txt
nine.txt
matrix.txt
small.fasta
large.fasta
cfam_repmask.txt
cfam_repmask2.txt
scores.txt
seq_names.txt
ko.txt
reaction.txt
genome.txt
ps6colswap.txt
enzyme.txt
warandpeace.txt
crispian.txt
words.txt



Lectures and Reading:

Lecture # | Lecture Topic | Programming Topic | Reading
1-8 | See https://noble.gs.washington.edu/~wnoble/genome559/ | |
9 | Phylogenetic Trees (PDF) | The next step .... (PDF) |
10 | Parsimony (PDF) | Functions (PDF) |
11 | Small Parsimony (PDF) | Functions, arguments (PDF) |
12 | Big Parsimony, Bootstrap support (PDF) | Modules (PDF); Recursion (if of interest) (PDF) |
13 | Clustering, Hierarchical Clustering (PDF) | More on functions, Sorting (PDF) |
14 | K-means Clustering (PDF) | Classes and objects 1 (PDF) |
15 | Biological networks and Dijkstra's algorithm (PDF) | Classes and objects 2 (PDF) |
16 | Biological networks and network motifs (PDF) | Classes and objects 3 (practice) (PDF) |
17 | Final Exam | |

References:

Electronic access to journals is generally free from on-campus computers. For off-campus access, follow the "[offcampus]" links or look at the library "proxy server" instructions.

  1. Noble, WS, "A quick guide to organizing computational biology projects." PLoS Comput. Biol. 5 (2009) e1000424. Pmid: 19649301 [Offcampus]
  2. Dudley, JT and Butte, AJ, "A quick guide for developing effective bioinformatics programming skills." PLoS Comput. Biol. 5 (2009) e1000589. Pmid: 20041221 [Offcampus]
  3. How dictionaries work (aka hash tables or hash maps)
  4. Subramanian et al., "Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles." PNAS 102(43) (2005).

Python Resources:

General
Regular Expressions
"RegExPal" (For JavaScript rather than Python, but similar and quite handy. Try it!)
Biopython

Python Books

» Python for Software Design: How to Think Like a Computer Scientist, Allen B. Downey. (Includes early drafts of our textbook; cheaper than the published version, but less polished.)
» Learning Python, Mark Lutz, O'Reilly. (Very comprehensive. Much is accessible to beginners.)
» Dive Into Python 3, Mark Pilgrim. (Another online book. Based on Python 3, so some differences, and more advanced, but also free.)

Bioinformatics Books

» Biological sequence analysis: probabilistic models of proteins and nucleic acids, R. Durbin, S. Eddy, A. Krogh, and G. Mitchison, Cambridge. (Excellent reference; a classic.)
» Inferring Phylogenies, Joseph Felsenstein, Sinauer, 2004. (Excellent reference on this topic.)
» Introduction to Computational Genomics: A Case Studies Approach, Cristianini, Nello & Hahn, Matthew, Cambridge, 2007.
» An Introduction to Bioinformatics Algorithms, Neil C. Jones & Pavel A. Pevzner, 2004.
» Bioinformatics: Sequence and Genome Analysis, David W. Mount, Cold Spring Harbor Laboratory Press.
» Python for Bioinformatics, Sebastian Bassi, CRC Press, 2010. (A little too advanced as a programming book for beginners, but fine now that you're experienced.)
» Python for Bioinformatics, Jason Kinser, Jones and Bartlett, 2009. (Ditto.)



William S Noble
Department of Genome Sciences
University of Washington
Elhanan Borenstein
Department of Genome Sciences
University of Washington