Re-processing of the data generated by the FANTOM5 project (galGal6 v1)
===

All the chicken data produced by the FANTOM5 project were originally processed on galGal5. Following the recent update of the genome assembly and related information, we have reprocessed the FANTOM5 data here.

- target genome: galGal6
- inquiries: fantom-help@riken.jp
- original data: http://fantom.gsc.riken.jp/5/datafiles/phase2.6

Updates
---

* Mar 27, 2020 initial release
    * Add CAGE and sRNA mapping data
* Nov 20, 2020 version2
    * Fixed CAGE mapping/CTSS files that were filtered by aln_filter

Data types
---

- CAGE read alignment: the raw HeliScope reads are aligned by delve (http://fantom.gsc.riken.jp/software/). The resulting alignments are stored in BAM format ( *.bam ) and indexed ( *.bai ).
- CTSS (CAGE tag starting site): the 5'-ends of the CAGE read alignments with mapping quality above 20 (which is equivalent to singly mapped reads) and percent identity 85% are counted at 1 bp resolution. Genomic coordinates are formatted as BED, and the counts are given in the score column.
- sRNA alignment: the raw reads are aligned by bwa. The resulting alignments are stored in BAM format ( *.bam ) and indexed ( *.bai ).
- experimental metadata: *sdrf.txt is a tab-delimited flat file describing the experimental details for each sample.

Directory and file names
---

Data files are located under the directory names as ..

- Technology is either hCAGE (CAGE sequencing on the HeliScope single-molecule sequencer), LQhCAGE (Low Quantity hCAGE) or sRNA (sRNA seq). For details on the protocols used, please see [http://fantom.gsc.riken.jp/5/sstar/Protocols].
- The biological category is either primary_cell or tissue.
- Part of each file name represents the sample name. The sample name is percent-encoded and concatenated with , , , , and the data types described above.

Reference
---

- FANTOM5 main papers
    * Forrest ARR, et al. A promoter-level mammalian expression atlas. Nature 507: 462–470 (2014)
    * Andersson R, et al. An atlas of active enhancers across human cell types and tissues. Nature 507: 455–461 (2014)
    * Arner E, et al. Transcribed enhancers lead waves of coordinated transcription in transitioning mammalian cells. Science 347: 1010–1014 (2015). http://www.sciencemag.org/cgi/doi/10.1126/science.1259418
- Data descriptor
    * Abugessaisa I, et al. FANTOM5 CAGE profiles of human and mouse reprocessed for GRCh38 and GRCm38 genome assemblies. Sci Data 4: 170107 (2017)
    * Noguchi S, et al. FANTOM5 CAGE profiles of human and mouse samples. Sci Data 4: 170112 (2017)
- FANTOM5 databases / data resource
    * Lizio M, et al. Gateways to the FANTOM5 promoter level mammalian expression atlas. Genome Biol 16: 22 (2015)
- HeliScopeCAGE
    * Kanamori-Katayama M, et al. Unamplified cap analysis of gene expression on a single-molecule sequencer. Genome Res 21: 1150–1159 (2011)
    * Itoh M, et al. Automated workflow for preparation of cDNA for cap analysis of gene expression on a single molecule sequencer. PLoS One 7: e30809 (2012)
- BAM: https://samtools.github.io/hts-specs/SAMv1.pdf
- BED: https://genome.ucsc.edu/FAQ/FAQformat.html#format1
- SDRF: http://isatab.sourceforge.net/format.html
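
Example: reading the files
---

The following is a minimal Python sketch, not part of the FANTOM5 processing pipeline, of how the percent-encoded sample names and the CTSS BED files described under "Data types" could be read. It assumes that CTSS files are tab-delimited BED with at least six columns (chrom, start, end, name, score, strand) and that the score column holds the read count. The function names, the sample name and the example rows are hypothetical and used for illustration only.

```python
import io
from urllib.parse import unquote


def decode_sample_name(encoded):
    """Decode a percent-encoded sample name taken from a file name."""
    return unquote(encoded)


def read_ctss(handle):
    """Yield (chrom, start, end, strand, count) tuples from CTSS BED lines.

    Assumes a six-column, tab-delimited BED layout in which the score
    column (the fifth column) holds the read count.
    """
    for line in handle:
        line = line.rstrip("\n")
        # Skip blank lines and optional BED header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        yield fields[0], int(fields[1]), int(fields[2]), fields[5], int(fields[4])


# Hypothetical inputs, for illustration only.
example_name = "embryo%20whole%20body%2c%20pool1"
example_bed = io.StringIO(
    "chr1\t100\t101\tchr1:100..101,+\t3\t+\n"
    "chr1\t250\t251\tchr1:250..251,-\t1\t-\n"
)

print(decode_sample_name(example_name))

ctss = list(read_ctss(example_bed))
# Total number of CAGE reads represented by the example rows.
print(sum(count for *_, count in ctss))
```

`read_ctss` accepts any text handle, so if a file turns out to be gzip-compressed, a handle from `gzip.open(path, "rt")` could be passed in the same way.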