CSCI-4800/5800: Bioinformatics

Cross-listed (UGrad+Grad) course, Remote, 2020

The biological sciences are undergoing a revolution in how they are practiced. In the last decade, a vast amount of data (Electronic Health Records (EHR), DNA sequences, protein sequences, etc.) has become available, and computational methods are playing a fundamental role in transforming this data into scientific understanding. Bioinformatics involves developing and applying computational methods for managing and analyzing clinical information and information about the sequence, structure, and function of biological molecules and systems. Topics will include the evolutionary organization of genes (genomics), the structure and function of gene products (proteomics), the dynamics of gene expression in biological processes (transcriptomics), and other omics and clinical data, including tumor tissue imaging, MRI, EEG, and ECG, across numerous human diseases and disorders. Students will also learn about the technology behind Next Generation Sequencing (NGS), also known as High Throughput Sequencing (HTS), and about Genome Wide Association Studies (GWAS), and will gain the skills to analyze the associated datasets and extract useful information.

Course objectives

By the end of the course, you are expected to:

  1. know the key ideas and algorithms of Bioinformatics.
  2. understand some basic biological phenomena and how Bioinformatics can aid in the analysis pipeline.
  3. apply the acquired knowledge to solve real-world problems in the health sector.


Prerequisites

For undergraduate students:

  1. CSCI-3412 (Algorithms) or equivalent.

For graduate students:

  1. Graduate standing.

Topics covered

  1. Introduction to Bioinformatics
  2. Sequence alignment problems
  3. Next-generation (high-throughput) sequencing
  4. Genome assembly
  5. scRNA-seq pipeline
  6. RNA-seq data analysis
  7. Epigenetics
  8. Metagenomics
  9. Hidden Markov models
  10. Multiple sequence alignment
  11. Phylogenetics
    1. UPGMA
    2. Neighbor joining method
  12. Clustering algorithms for Bioinformatics
  13. EHR data analysis