2013-02-06 Marina Lizio

Annotation of CAGE peaks by Nicolas Bertin (nbertin@gsc.riken.jp)
Inquiries to Nicolas Bertin or to fantom-help@riken.jp

This folder contains the hierarchical annotations of DPI clusters with respect to gene model sets.
It also contains annotations of clusters with repeats and CpG islands.

The hierarchy of annotations followed for the gene models is described in each .txt file.
In particular, for each source of annotation, coding and non-coding transcripts are considered equally with respect to the common annotation classes (TSS, TSSregion500, ...).
See each .txt file for more details.

Annotations are formatted as OSC files with headers describing the content of each column.
In particular:
 * columns 1-10 correspond to the exact input from the DPI cluster (BED6 plus additional columns)
 * column 11 is the input order (should there be a need to quickly re-sort the annotation in the same order as the input file, e.g. to merge expression levels)
 * column 12 is the annotation category (TSS, TSSregion500, ...). See each .txt file for more details
 * column 13 is the comma-delimited list of transcript names associated with this annotation
 * column 14 is the comma-delimited list of distances to the TSS of each transcript associated with this annotation
     - negative values indicate that the DPI cluster is upstream of the transcript TSS
     - positive values indicate that the DPI cluster is downstream of the transcript TSS
     - the distance is the shortest distance to the TSS (i.e. measured from the DPI cluster 5' end for DPI clusters downstream of the transcript TSS, and from the DPI cluster 3' end for DPI clusters upstream of the transcript TSS)
 * column 15 is the comma-delimited list of transcript coding/noncoding (/pseudogene when available) classes
 * column 16 (Gencode only) is the comma-delimited list of transcript biotypes
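
The column layout above can be read positionally. The following Python sketch is illustrative only and is not part of the annotation pipeline. It assumes the files are tab-delimited, that metadata lines start with '#', that the first non-comment line is the column-header row, and that every distance in column 14 is an integer. The names Annotation, parse_annotation_line, read_annotations and restore_input_order are hypothetical helpers introduced here.

```python
from typing import Iterable, List, NamedTuple, Optional


class Annotation(NamedTuple):
    dpi_fields: List[str]           # columns 1-10: DPI cluster input, kept verbatim
    input_order: int                # column 11
    category: str                   # column 12 (TSS, TSSregion500, ...)
    transcripts: List[str]          # column 13
    distances: List[int]            # column 14 (negative = cluster upstream of TSS)
    coding_classes: List[str]       # column 15
    biotypes: Optional[List[str]]   # column 16, Gencode files only


def split_list(field: str) -> List[str]:
    """Split a comma-delimited field, dropping empty items."""
    return [item for item in field.split(",") if item]


def parse_annotation_line(line: str) -> Annotation:
    """Parse one data row (assumption: tab-delimited, at least 15 columns)."""
    cols = line.rstrip("\r\n").split("\t")
    if len(cols) < 15:
        raise ValueError(f"expected at least 15 columns, got {len(cols)}")
    return Annotation(
        dpi_fields=cols[0:10],
        input_order=int(cols[10]),
        category=cols[11],
        transcripts=split_list(cols[12]),
        # Assumption: distances are plain integers; adapt if other values occur.
        distances=[int(d) for d in split_list(cols[13])],
        coding_classes=split_list(cols[14]),
        biotypes=split_list(cols[15]) if len(cols) > 15 else None,
    )


def read_annotations(lines: Iterable[str]) -> List[Annotation]:
    """Skip '#' metadata lines and the header row, then parse the data rows."""
    records = []
    header_seen = False
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        if not header_seen:
            header_seen = True  # assumption: first non-comment line is the header
            continue
        records.append(parse_annotation_line(line))
    return records


def restore_input_order(records: Iterable[Annotation]) -> List[Annotation]:
    """Re-sort records by column 11 to match the original DPI input file."""
    return sorted(records, key=lambda r: r.input_order)
```

Sorting on column 11 this way puts the annotation rows back in the order of the DPI input file, so that they can be matched row by row against an expression table kept in that same order.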