| Database or repository |
Description |
Nucleotide sequence
|
| Mouse Gene Index (in-house) |
redundant phase I clone
sequencing data |
| nr-nt (in-house) |
non-redundant database built
from GenBank, EMBL, and DDBJ, including their cumulative daily updates of nucleotide
sequences |
| tigr-mgi |
Nucleotide sequences from
TIGR Mouse Gene Index |
| MGI |
integrated view of gene
characterization, nomenclature, genetic markers, mapping, gene homologies,
expression, phenotype and other biological data |
| est_mouse |
mouse EST sequences |
| UniGene |
clusters of ESTs and full-length
mRNA sequences; each cluster represents a unique known or putative human
gene |
| TIGR
Gene Indices |
human and non-human TIGR
and GenBank EST sequences assembled to tentative consensus sequences |
| UTRdB |
non-redundant 3' and 5' UTR
sequences of eukaryotic mRNAs, enriched with annotations about functional
elements and repeats |
Mapping
|
| Whitehead
Mouse RH dB |
T31 radiation hybrid (RH) data of 20
mouse chromosomes |
| Jackson
Laboratory T31 Mouse RH dB |
T31 RH data of 20 mouse
chromosomes from various sources, incl. the WICGR mouse RH dB, the UK Mouse
Genome Centre, and Genoscope - CNS, mapped together into a single comprehensive
map |
| RefSeq |
reference sequence standards
for chromosomes, mRNAs, and proteins for the functional annotation of genome
data |
| Ensembl |
human genome dataset containing
confirmed and predicted genes, exons, transcripts, and contigs |
Protein sequence
|
| NCBI-nr |
non-redundant GenBank CDS
translations + PDB + SwissProt + PIR |
| SwissProt |
annotated protein database
with minimum redundancy; annotation incl. GO terms and functional sites |
| TrEMBL |
translations of all CDS
present in EMBL that are not yet integrated into SwissProt |
| TIGR's nr-aa |
non-redundant amino acid
sequence database prepared at TIGR using data from EGAD, SwissProt, PDB
and GenPept |
Gene cluster
|
| HomoloGene |
curated orthologs among mouse,
rat, human, and zebrafish; calculated orthologs from sequence
comparisons between all UniGene clusters for each pair of organisms |
| Pfam |
semi-automatic protein family
database containing multiple protein alignments and profile-HMMs of these
families |
| TIGRFAM |
a curated protein family
database containing multiple protein alignments and profile HMMs of these
families |
| InterPro |
integrated view of other
domain and functional site databases (PROSITE, PRINTS, ProDom and Pfam) |
| UTRsite |
nucleotide sequence patterns
of UTRs where a functional role has been shown experimentally |
Pathway
|
| KEGG |
metabolic and regulatory
pathway maps |
Disease
|
| LocusLink |
annotated sequence and descriptive
information about genetic loci |
| RefSeq |
reference sequence standards
for chromosomes, mRNAs, and proteins for the functional annotation of genome
data |
| OMIM |
catalog of human genes and
genetic disorders |
Literature
|
| PubMed |
abstracts and bibliographic
information of journal articles and books |
Gene Ontology
|
| swp2go |
gene ontology index for
mapping of SwissProt keywords to GO terms |
| egad2go |
gene ontology index for
mapping of EGAD cellular roles to GO terms |
| Software name |
Description |
Functional Annotation
|
| FANTOM+ |
web-based system for human
curation of sequences |
Database searching
|
| NCBI-BLAST |
Basic Local Alignment Search
Tool; includes a set of similarity search programs (BLASTN, BLASTP,
BLASTX, TBLASTN, TBLASTX) |
| RepeatMasker |
screens DNA sequences against
a library of repetitive elements, as well as for low complexity regions;
it returns a masked query sequence ready for database searches |
| FASTA |
package that compares
a sequence to another sequence or to a sequence database using the FASTA
algorithm. The FASTY program in particular was frequently used in the FANTOM
meeting. (FASTY compares a DNA sequence to a protein
sequence database using the FASTA algorithm; it translates the DNA sequence
in three forward (or reverse) frames and allows frameshifts.) |
| FLAST (in house) |
DDS-based
program that compares a query sequence pairwise with a cDNA sequence
database |
| Wise2 |
Wise2 is a package for comparing
DNA and protein sequences. In the meeting, estwise in the Wise2 package
was frequently used because it can compare a protein sequence against an
EST/cDNA sequence with the option of using a protein profile HMM |
| HMMER |
profile hidden Markov models
for biological sequence analysis; searches a sequence database with a profile
HMM or builds a hidden Markov model from a sequence alignment |
| Patsearch |
finds functional elements
in nucleotide and protein sequences and assesses their statistical significance |
Gene structure; Open Reading
Frame
|
| GenScan |
determines the most likely
gene structure (exon/intron) under a probabilistic model of the gene-structural
and compositional properties of the genomic DNA for a given organism |
| ORF
Finder |
finds all open reading frames
of a selected minimum size in a sequence |
| DECODER (in house) |
extracts open reading frames
from sequences and corrects frame-shifts |
Multiple sequence alignment
|
| CLUSTALW |
progressive multiple sequence
alignment through sequence weighting, position-specific gap penalties and
weight matrix choice |
Cluster analysis
|
| Maximum density subgraph
(in house) |
generates a linkage graph
whose vertices are sequences and whose edges are pairwise similarities; it then
finds subgraphs whose vertices are each connected to a fraction 'p' of
the other vertices, until all sequences are covered and the maximum density
(sum of similarities / number of nodes) is found |
Assembly
|
| Phred |
reads DNA sequencer trace
data, calls bases, and assigns quality values to the bases |
| Phrap |
assembles shotgun DNA sequence
data to a contig sequence |
| Consed |
edits sequence assemblies
created by Phrap so that the same data set can be reassembled |
| CAP3 |
assembles sequences using
base quality values in the computation of overlaps between reads, the construction
of multiple sequence alignments of reads, and the generation of consensus sequences |
Others
|
| bioSCOUT |
commercial software package
for enhanced sequence analysis |
| experimental programs |
extraction and assignment
of GO terms |
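
As an illustration of the kind of task listed for ORF Finder above (finding all open reading frames of a selected minimum size in a sequence), the following is a minimal sketch in Python. It is not the code of ORF Finder or DECODER. The function names `find_orfs` and `reverse_complement`, the `min_len` parameter, the restriction to ATG start codons and the standard stop codons, and the reporting of only the first ATG-to-stop span per stretch are all assumptions made for this example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard stop codons (assumption: standard genetic code)


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string (A, C, G, T, N)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq))


def find_orfs(seq, min_len=30):
    """Scan all six reading frames for ATG...stop spans of at least min_len nucleotides.

    Returns a list of (strand, frame, start, end, orf_sequence) tuples.
    Coordinates are 0-based, end-exclusive, and relative to the scanned strand.
    """
    seq = seq.upper()
    results = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None  # position of the first ATG seen since the last stop codon
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    end = i + 3  # include the stop codon in the reported span
                    if end - start >= min_len:
                        results.append((strand, frame, start, end, s[start:end]))
                    start = None
    return results


if __name__ == "__main__":
    for orf in find_orfs("CCATGAAATTTGGGTAACC", min_len=15):
        print(orf)
```

Raising `min_len` filters out short spurious ORFs, which is the usual reason such tools expose a minimum-size setting. This sketch does not correct frameshifts, which the table lists as a feature of DECODER.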