Protocols:HeliScopeCAGE read alignment: Difference between revisions

Revision as of 18:24, 5 September 2012

The data sequenced on Heliscope sequencers are processed, in order to:

Discard the CAGE tags derived from the ribosomal DNA repeating unit, which is not contained in the genome assembly, by rRNAdust (developed by Timo Lassmann). See Protocols:rRNAdust
Align the remained CAGE tags with the genome sequences with DELVE (developed by Timo Lassmann), which generates BAM files containing a single mapped position per read with mapping quality and alignments. File:Delve User Manual.pdf

Post-mapping processing involves the following steps.

Retain only those reads with mapping quality corresponding to a 99% accuracy.

this is obtained by specifying the following samtools options
samtools view -q 20 [mapping_file.bam]

Aggregate the 5'-end of those mapped CAGE tags as CAGE transcription starting site (CTSS).

command line instructions are combined with bedtools to read the bed file output
obtained by the conversion of mapping file .BAM into .bed; then the bedtools function
groupBy (or equivalent for the latest versions of bedtools) is used to aggregate those
tags with same starting position into CTSS.

Full instructions from BAM to CTSS. Commands are executed separately for plus and minus strand.

read bam file considering quality score and keeping binary format (-u)
convert into bed
select the strand
sort
aggregate by start position
print results

samtools view -uq 20 BAMfile \
| bamToBed -i stdin \
| awk 'BEGIN{OFS="\t"}{if($6=="+"){print $1,$2,$5}}' \
| sort -k1,1 -k2,2n \
| groupBy -i stdin -grp 1,2 -opCols 3 -ops count \
| awk 'BEGIN{OFS="\t"}{print $1,$2,$2+1,  $1":"$2".."$2+1",+"  ,$3,"+"}'
samtools view -uq 20 BAMfile \
| bamToBed -i stdin \
| awk 'BEGIN{OFS="\t"}{if($6=="-"){print $1,$3,$5}}' \
| sort -k1,1 -k2,2n \
| groupBy -i stdin -grp 1,2 -opCols 3 -ops count \
| awk 'BEGIN{OFS="\t"}{print $1,$2-1,$2,  $1":"$2-1".."$2",-"  ,$3,"-"}'

Personal tools

Protocols:HeliScopeCAGE read alignment: Difference between revisions - resource_browser

Search

Human

Mouse

Rat

Dog

Chicken

Macaque

Cross species

Other

Tools

Protocols:HeliScopeCAGE read alignment: Difference between revisions

From FANTOM5_SSTAR

Revision as of 18:24, 5 September 2012

@@ Line 1: / Line 1: @@
-The data sequenced on Heliscope sequencers are post processed, in order to:
+The data sequenced on Heliscope sequencers are processed, in order to:
 *Discard the CAGE tags derived from the ribosomal DNA repeating unit, which is not contained in the genome assembly, by rRNAdust (developed by Timo Lassmann). See [[Protocols:rRNAdust]]
 *Align the remained CAGE tags with the genome sequences with DELVE (developed by Timo Lassmann), which generates BAM files containing a single mapped position per read with mapping quality and alignments. [[File:Delve_User_Manual.pdf]]
+Post-mapping processing involves the following steps.
 *Retain only those reads with mapping quality corresponding to a 99% accuracy.
   this is obtained by specifying the following samtools options
@@ Line 8: / Line 11: @@
 *Aggregate the 5'-end of those mapped CAGE tags as CAGE transcription starting site (CTSS).
-  command line instructions are combined with bedtools to read the bed file output obtained by the conversion of mapping file .BAM into .bed; then the bedtools function groupBy (or equivalent for the latest versions of bedtools) is used to aggregate those tags with same starting position into CTSS.
+  command line instructions are combined with bedtools to read the bed file output
+ obtained by the conversion of mapping file .BAM into .bed; then the bedtools function
+ groupBy (or equivalent for the latest versions of bedtools) is used to aggregate those
+ tags with same starting position into CTSS.
 Full instructions from BAM to CTSS. Commands are executed separately for plus and minus strand.
 *read bam file considering quality score and keeping binary format (-u)
 *convert into bed

	FAQ
	Mozilla Cavendish skin modified by DaSch for the Web Community Wiki GitHub project page – Report a bug – Skin version: 2.6.0