EEB 5300: Bioinformatics/Genomics Course

Fall 2014  Bioinformatics and Genomic Applications

Class time: Tuesday & Thursday 1:30-3:30 pm; Classroom: TLS 171B (Bamford Room).

Instructor: Yaowu Yuan (yaowu.yuan at uconn.edu); Office: BioPharm 300A.

TA: Cera Fisher (cera.fisher at uconn.edu); Office: BioPharm 318.

Office hours: by appointment.

Course goal: This new course is designed to (1) help students develop the basic skills for practical computing in biology (with a particular focus on genomic analysis) and start solving real-world problems immediately; (2) empower students to continue training themselves with more advanced computer skills when needed in their future research; and (3) familiarize students with the process of experimental design, data collection, analysis, interpretation, and presentation (i.e., the process from project design to publication), using empirical examples.

Course content: The course has three major parts: (1) the Linux environment, including ~40 of the most useful commands and shell scripts; (2) Perl programming for text manipulation (with a particular focus on empirical genomic data); and (3) R for simple statistics and graphics.

A Few Useful References:
Unix and Perl Primer for Biologists (Version 3.1.1). Keith Bradnam & Ian Korf (2012): http://korflab.ucdavis.edu/unix_and_Perl/.
Practical Computing for Biologists. Steven H. D. Haddock & Casey Dunn (2011).
Beginning Perl for Bioinformatics. James Tisdall (2001). Available from UConn library E-books.
R for Beginners. Emmanuel Paradis (2005).
R in Action: Data Analysis and Graphics with R. Robert I. Kabacoff (2011).

Week 1-1 (Aug. 26): Linux Basics I (pwd, cd, ls, mkdir, rmdir, mv, rm, touch, less, man, wc, cat, ssh, which, exit, logout)

Week 1-2 (Aug. 28): Linux Basics II (history, cp, scp, echo, chmod, grep, tr, the nano editor, shell scripts)

Week 2-1 (Sep. 02): Linux Basics III (head, tail, the pipe | , sed, cut, sort, uniq, awk, gzip, gunzip, zip, unzip, tar, curl, wget)

Week 2-2 (Sep. 04): Linux Basics IV (running batch jobs on a Linux cluster and installing new programs under a user’s account)

Week 3-1 (Sep. 09): Perl Introduction (the first Perl program; common features of all programming languages)

Week 3-2 (Sep. 11): Scalar Variables and Basic String Manipulation 

Week 4-1 (Sep. 16): File I/O, Arrays, Loops (I)

Week 4-2 (Sep. 18): File I/O, Arrays, Loops (II)

Week 5-1 (Sep. 23): Hashes

Week 5-2 (Sep. 25): Regular Expressions

Week 6-1 (Sep. 30): Introduction to Research Workflow

Week 6-2 (Oct. 02): Genome Statistics

Week 7-1 (Oct. 07): Subroutines, Modules, and Variable Scope

Week 7-2 (Oct. 09): Guest Lecture by Dr. Jill Wegrzyn: Introduction to Genomics and Common Bioinformatic Approaches

Week 8-1 (Oct. 14): Genome filtering; interaction with the command line and other programs

Week 8-2 (Oct. 16): Directory operations, parsing MUMmer output

Week 9-1 (Oct. 21): Short read mapping

Week 9-2 (Oct. 23): SNP calling

Week 10-1 (Oct. 28): Genome comparisons with MUMmer

Week 10-2 (Oct. 30): R Introduction

Week 11-1 (Nov. 04): SNP density and sliding windows

Week 11-2 (Nov. 06): R graphics I

Week 12-1 (Nov. 11): R graphics II

Week 12-2 (Nov. 13): Clustered SNP filtering and Running R Scripts in Perl

Week 13-1 (Nov. 18): R applications in phylogenetics 

Week 13-2 (Nov. 20): Combining Linux, Perl, and R for Research I