Course Curriculum 2020 (Semester Course)

________________________________________ CURRICULUM FOR 2020 ______________________________________

  • What will the students learn?

    • Server and Cluster usage in Linux
    • Basic Programming in R
    • Basic Statistics
    • Introduction to Next Gen Sequencing technologies
    • NGS Data Quality Control and Alignment
    • Downstream analysis of NGS data, including RNA-seq, ChIP-seq, Gene Set Enrichment Analysis, SNP calling, pathway analysis, and publication-quality heatmaps and graphics using R
    • Pipeline construction using shell scripts in Linux and Python
  • Course timeline and details (15 weeks)
  • Lectures – 15, Practical sessions – 14, Midterm Review (1 session), Final presentations (1 session)

    1. Introduction to Linux and Servers, no prior knowledge required (Dr. Thapar) {Lecture 1}
      • Server accounts and Logging into “The Shell”
      • Linux command-line operations, programming concepts and practice
      • FTP and other services for Remote usage
      • Parallel Programming with Queues at MGH on Erisone
    2. Pipeline Development for RNA-Seq Part 1 (Dr. Thapar)
      • Introduction to Quality Control analysis for NGS data: the FastQC tool, FASTQ and FASTA formats, sequence quality scores and Phred scaling (Lecture 2)
      • Read Alignment with TopHat, STAR and other aligners. Aligner comparison and specific use cases. (Lecture 3)
      • Lecture by Dr. Ting about Sequencing Technologies (Lecture 4)
    3. Introduction to R Programming (Dr. Thapar)
      • Introduction to the fundamental principles of object-oriented programming using the R language (Lecture 5)
        • Variables, Data Types, Data Structures, Expressions and Statements
      • Functions, Control Structures, Loops in R (Lecture 6)
      • Apply (Lecture 6)
        • What are Apply functions in R?
      • Graphics in R & Counting Reads using htseq-count (Lecture 7)
    4. Introduction to Statistics (Dr. Wittner) {Lecture 8}
      • Distribution of data
      • Hypothesis testing and p-value
      • Normalization
      • Statistical Power
    5. Midterm Test and Review (1 session)
    6. Sequence Analysis Part 2 (Dr. Thapar)
      • Differential Expression analysis in R (DESeq, edgeR, Cuffdiff) {Lecture 9}
      • Visualizing your results via Heatmaps in R with Dr. Wittner {Lecture 10}
      • Gene Set Enrichment Analysis (GSEA) after Differential Expression (Lecture 11)
      • ChIP-Seq (Lecture 12)
        • Peak calling using MACS and theory behind ChIP peak identification
      • Using Python and Snakemake to make a pipeline (Lecture 13)
      • Variant and Mutation Calling using MuTect (Lecture 14)
    7. Simultaneous Project Practical involving sequence data analysis (Dr. Thapar) {14 sessions}
      • Unix Commands in the Shell (1 session)
      • Quality control and alignment of data (1 session)
      • Alignment of FASTQ files to a reference genome (1 session)
      • Samtools and Bedtools: Working with Aligned data files and genomic intervals, Bonus: Variant calling using Samtools (1 session)
      • Introduction to R programming: Nuts and Bolts (1 session)
      • R: Control and Functions (1 session)
      • R: Apply functions (1 session)
      • Graphics in R and Counting Reads (1 session)
      • Differential Expression using DESeq2 (with Vishal) (1 session)
      • DESeq2 review with Mark Kalinich (1 session)
      • GSEA and Pathway analysis (1 session)
      • ChIP-Seq (1 session)
      • MuTect for Variant Calling (1 session)
      • Making pipelines in Shell and Python for NGS analysis (1 session)
    8. Machine Learning (Lecture 15)
      1. Fundamentals of learning
      2. Types of Learning
      3. Training networks
      4. Metrics to measure performance

    Form of education

    ●      15 Lectures (1 hour lecture in the afternoon, one day a week)

    ●      14 Project practical sessions (1 hour session following the lectures, one day per week), fully supervised

    ●      1 Midterm test and review session

    ●      Office hours every week and tutoring (upon request).

    ●      Make-up session (2 hours; optional for some students and mandatory for others, based on the Midterm test). In these sessions, we go over the practicals and related concepts in detail.

    ●      6 Homework assignments

    ●      1 Final Project and presentation

    ●      Optional Final Examination for grade

    Please sign up for the upcoming info session on August 8th, 2018, and we will get in touch with you with further details.

______________________________________ APPLICATION FORM ___________________________________

Please fill out the form at this link:  BioinformaticsCourseApplication and send it to me via email at

Feel free to ask any questions via email or in person.