JetBrains Research: science that changes the world

Epigenetics data computational analysis

A course on the computational analysis of epigenetic data by Oleg Shpynov and Roman Chernyatchik for ITMO master's students.


The course consisted of two major parts.

The first part was dedicated to the role of epigenetics in transcriptional regulation. The second part was dedicated to methods and computational pipelines for experimental data analysis.

Part 1. Theory

The Transcription regulation block covered the following topics:

  • Regulation
  • Transcription Factors
  • ChIP-seq
  • Chromatin
  • DNase, ATAC-seq
  • Histone modifications
  • Enhancers
  • TAD, CTCF, 3C, 4C, 5C, Hi-C

The ChromHMM and DNA methylation block covered:

  • ChromHMM
  • ENCODE project
  • DNA methylation
  • Cytosine context: CpG, CHH, etc.
  • Methylation across the genome
  • Bisulfite conversion
  • Protocols: WGBS, RRBS
  • Methylation clocks
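
As a small illustration of the cytosine context and bisulfite conversion topics, the Python sketch below classifies a forward-strand cytosine as CpG, CHG, or CHH and simulates conversion in silico. It is not taken from the course materials: the function names `cytosine_context` and `bisulfite_convert` are hypothetical, and the sketch assumes complete conversion, a single strand, and a sequence without ambiguous bases.

```python
from typing import Set


def cytosine_context(seq: str, pos: int) -> str:
    """Classify the cytosine at 0-based `pos` as CpG, CHG, or CHH.

    H stands for A, C, or T. Returns "NA" when the base is not a C or
    when the context runs past the end of the sequence.
    """
    seq = seq.upper()
    if seq[pos] != "C" or pos + 1 >= len(seq):
        return "NA"
    if seq[pos + 1] == "G":
        return "CpG"
    if pos + 2 >= len(seq):
        return "NA"
    return "CHG" if seq[pos + 2] == "G" else "CHH"


def bisulfite_convert(seq: str, methylated: Set[int]) -> str:
    """Simulate bisulfite treatment followed by PCR on one strand.

    Unmethylated C is read as T; methylated C (positions listed in
    `methylated`) stays C. Assumes 100% conversion efficiency.
    """
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq.upper())
    )


# Toy sequence: C at index 1 (CpG), index 4 (CHG), index 7 (CHH)
toy = "ACGTCAGCTT"
contexts = {i: cytosine_context(toy, i) for i, b in enumerate(toy) if b == "C"}
converted = bisulfite_convert(toy, methylated={1})
print(contexts)
print(converted)
```

Comparing the converted read with the reference is what lets an aligner infer methylation: a C that survives conversion is counted as methylated, a C read as T as unmethylated.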

Part 2. Practice

The practical part was dedicated to computational approaches. This block was split into three almost independent parts: ChIP-seq analysis, downstream analysis of genomic positional data, and DNA methylation.

ChIP-seq and downstream analysis

[Figure: pipeline for ChIP-seq analysis]


ChIP-seq analysis was divided into three blocks:

Topics of ChIP-seq blocks 1 and 2:

  • Downloading datasets, ENCODE
  • Useful Linux commands
  • QC + MultiQC
  • Alignment + QC / filtering
  • Visualization of BAM files
  • Peak calling - MACS2, SICER, SPAN & JBR
  • Peaks - confidence, statistics
  • Differential peak calling

Downstream analysis topics:

  • Genomic regions manipulation - BEDTools
  • Associated/closest gene annotation, working with GTF files
  • Coverage profile per TSS/Genes/etc - DeepTools
  • Functional genome annotation - ChIPpeakAnno, ChIPseeker
  • TF motif analysis - HOMER + MEME
  • Pathway enrichment analysis - GREAT / EnrichR
  • Similar datasets - ChIP-Atlas
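
To make the closest-gene annotation step concrete, here is a minimal Python sketch that finds the nearest feature to a peak by a linear scan. It is only an illustration of the idea, not how BEDTools is implemented: the `Interval` type, the `gap` and `closest_feature` helpers, and the toy coordinates are all hypothetical, and the gap definition (0 for overlapping or book-ended intervals) is a simplification that may differ from the distances reported by real tools.

```python
from typing import List, NamedTuple, Optional, Tuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive (BED-style)
    end: int    # exclusive
    name: str


def gap(a: Interval, b: Interval) -> int:
    """Number of bases between two intervals on the same chromosome.

    Returns 0 when the intervals overlap or touch.
    """
    return max(0, max(a.start, b.start) - min(a.end, b.end))


def closest_feature(
    peak: Interval, features: List[Interval]
) -> Optional[Tuple[Interval, int]]:
    """Return (feature, gap) for the nearest feature on the peak's chromosome,
    or None if that chromosome has no features. Ties keep the first feature seen."""
    best: Optional[Tuple[Interval, int]] = None
    for feature in features:
        if feature.chrom != peak.chrom:
            continue
        d = gap(peak, feature)
        if best is None or d < best[1]:
            best = (feature, d)
    return best


# Toy data (made up for illustration)
genes = [
    Interval("chr1", 500, 1500, "geneA"),
    Interval("chr1", 250, 300, "geneB"),
    Interval("chr2", 100, 200, "geneC"),
]
peak = Interval("chr1", 100, 200, "peak1")
hit = closest_feature(peak, genes)
print(hit)
```

A linear scan is quadratic over many peaks and genes; dedicated tools typically work on sorted inputs so that each file is traversed once.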

DNA methylation

[Figure: pipeline for DNA methylation analysis]


DNA methylation analysis was split into two blocks:

Covered topics:

  • Alignment + QC + Visualization
  • Call Methylation + QC + Visualization
  • Hyper-, hypo-, and partially methylated regions
  • Partially Methylated Domains
  • DMRs
  • Bis-SNP approach
  • Comparing methylation microarrays with NGS data
  • Methylation Clock
  • Methylation at loci
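
The methylation calling step can be sketched in a few lines of Python: for each cytosine, the methylation level is the fraction of reads supporting a methylated C among all reads covering that site. The code below is an illustrative sketch only; the function names and the `min_coverage=5` threshold are assumptions chosen for the example, not values from the course.

```python
from typing import Iterable, Optional, Tuple


def methylation_level(
    meth: int, unmeth: int, min_coverage: int = 5
) -> Optional[float]:
    """Fraction of methylated reads at one site, or None if coverage is too low.

    `meth` counts reads with C (protected from conversion),
    `unmeth` counts reads with T (converted).
    """
    coverage = meth + unmeth
    if coverage < min_coverage:
        return None
    return meth / coverage


def region_methylation(
    sites: Iterable[Tuple[int, int]], min_coverage: int = 5
) -> Optional[float]:
    """Mean per-site methylation level over sites passing the coverage filter."""
    levels = []
    for meth, unmeth in sites:
        level = methylation_level(meth, unmeth, min_coverage)
        if level is not None:
            levels.append(level)
    if not levels:
        return None
    return sum(levels) / len(levels)


# Toy counts (methylated, unmethylated) for three CpG sites in one region
sites = [(8, 2), (1, 9), (1, 1)]  # the last site has coverage 2 and is skipped
region_level = region_methylation(sites)
print(region_level)
```

Averaging per-site levels gives every site equal weight regardless of depth; pooling the counts across the region instead would weight deeply covered sites more heavily, and which is preferable depends on the analysis.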

The course comprised 5 hours of theory (Feb 13, 2020) and 23 hours of practice (Nov 2-7, 2020).

We thank ITMO University and Yandex for providing the cloud-based computational infrastructure.