Description

This set of tracks represents differentially expressed clusters detected by our pipeline, RECLU, in all pairwise comparisons of the cap analysis of gene expression (CAGE) datasets for 156 human primary cells and the HeLa and THP-1 cells. We proposed the pipeline to identify reproducible transcription start sites (TSSs) at multiple scales. We extracted both the lowest peaks (termed "bottom") and the highest peaks (termed "top") to capture clusters at different levels of the hierarchy. For details about our pipeline, refer to our paper. This work is part of the FANTOM5 project.

You can get these datasets from here.

Display Conventions

The log fold change is shown to the left of each cluster.
A black cluster indicates that the cluster is not significantly differentially expressed or has an absolute log fold change below 2.0. For a comparison between samples A and B, a peak that is more highly expressed in sample B is shown in orange, and the log fold change to the left of the cluster is negative. In contrast, a green bar indicates higher expression in sample A than in sample B.
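
The colour rule above can be summarised as a small Python sketch. This is only an illustration of the display convention, not part of the RECLU pipeline: the function name cluster_color and its arguments are hypothetical, the significance call is assumed to come from the differential expression test as a plain True/False flag, and a cluster whose absolute log fold change is exactly 2.0 is assumed to be coloured.

```python
LFC_THRESHOLD = 2.0  # absolute log fold change cutoff from the display convention


def cluster_color(log_fc: float, significant: bool,
                  threshold: float = LFC_THRESHOLD) -> str:
    """Return the display colour of a cluster for a comparison of sample A vs B.

    log_fc      -- log fold change shown to the left of the cluster
                   (negative when expression is higher in sample B)
    significant -- hypothetical flag: True if the cluster was called
                   significantly differentially expressed
    """
    # Black: not significant, or the change is too small to be highlighted.
    if not significant or abs(log_fc) < threshold:
        return "black"
    # Green: higher in sample A. Orange: higher in sample B (negative value).
    return "green" if log_fc > 0 else "orange"


if __name__ == "__main__":
    for lfc, sig in [(3.1, True), (-2.5, True), (1.2, True), (-4.0, False)]:
        print(lfc, sig, cluster_color(lfc, sig))
```

In this sketch the sign of the log fold change alone decides between green and orange, which matches the convention that a negative value marks higher expression in sample B.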

Methods

Data sources

We used two CAGE datasets. The first was the human CAGE dataset, with replicates, for 156 primary cells, sequenced on a HeliScope sequencer and mapped to the hg19 genome assembly in the FANTOM5 project. All primary cell data and ethics application numbers are described in the FANTOM5 main paper 1). In brief, the majority of primary cell samples were purchased from commercial suppliers, while the remainder were obtained through collaborating institutes from patients who provided informed consent. The second was the triplicate human CAGE dataset for the HeLa and THP-1 samples, sequenced on a HeliScope sequencer and mapped to the hg18 genome assembly by Kanamori-Katayama et al. 2)

1) Forrest et al. A promoter-level mammalian expression atlas. Nature 507(7493), 462-470. 2014.
2) Kanamori-Katayama et al. Unamplified cap analysis of gene expression on a single molecule sequencer. Genome Res 21, 1150-1159. 2011.

Differential expression analysis

We used the edgeR package (version 2.5.3) in the R language to perform the exact test for differential expression analysis.

Robinson et al. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26, 139-140. 2010.

References

Ohmiya et al. RECLU: a pipeline to discover reproducible transcriptional start sites and their alternative regulation using capped analysis of gene expression (CAGE). BMC Genomics 15, 269. 2014.

Contacts

Author: Hiroko Ohmiya, RIKEN

Please send any questions to hiroko.ohmiya@riken.jp