
Codon Usage Database


Announcement


Codon Usage Database is an extended WWW version of CUTG (Codon Usage Tabulated from GenBank). The frequency of codon use in each organism is made searchable through this World Wide Web site.

CUTG was originally developed by Prof. Toshimichi Ikemura at the Laboratory of Evolutionary Genetics, National Institute of Genetics.

Codon Usage Database is developed and maintained by Yasukazu Nakamura at The First Laboratory for Plant Gene Research, Kazusa DNA Research Institute.

Data source

NCBI-GenBank Flat File Release 141.0 [May 11 2004].

Files pri (primate sequence entries), rod (rodent sequence entries), mam (other mammalian sequence entries), vrt (other vertebrate sequence entries), inv (invertebrate sequence entries), pln (plant sequence entries), bct (bacterial sequence entries), vrl (viral sequence entries) and phg (phage sequence entries) were used.

Files est (EST: expressed sequence tag sequence entries), pat (patent sequence entries), rna (structural RNA sequence entries), sts (STS: sequence tagged site sequence entries), syn (synthetic and chimeric sequence entries) and una (unannotated sequence entries) were not used.

All complete protein coding genes (CDS's) were used. Codons containing an ambiguous base were excluded from the count.
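The counting rule above can be illustrated with a minimal Python sketch. It is not the code used to build CUTG; the function names `count_codons` and `count_codons_for_organism` are hypothetical, and it assumes that "ambiguous base" means any character other than A, C, G or T (U is treated as T).

```python
from collections import Counter

# Assumption: only these four letters count as unambiguous bases.
UNAMBIGUOUS = set("ACGT")


def count_codons(cds: str) -> Counter:
    """Count codons in one CDS, skipping codons with an ambiguous base."""
    seq = cds.upper().replace("U", "T")
    counts = Counter()
    # Read in frame, three bases at a time; an incomplete trailing codon is ignored.
    for i in range(0, len(seq) - len(seq) % 3, 3):
        codon = seq[i:i + 3]
        if set(codon) <= UNAMBIGUOUS:
            counts[codon] += 1
    return counts


def count_codons_for_organism(cds_list) -> Counter:
    """Sum the codon counts over all CDS's of one organism."""
    total = Counter()
    for cds in cds_list:
        total.update(count_codons(cds))
    return total
```

For example, `count_codons("ATGGCNTAA")` counts ATG and TAA but skips GCN, because N is an ambiguous base.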

Data amount

21,733 organisms
1,125,820 complete protein coding genes (CDS's)

Usage

A query box is provided for finding the codon usage table of an organism. You can search by the Latin name of the organism or by a substring of it. The search is case sensitive by default; a case-insensitive option can be selected. An ambiguous query that hits more than 100 organisms returns no answer.
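The search behaviour described above can be sketched as follows. This is an illustration under stated assumptions, not the site's implementation: the function name `search_organisms` and its parameters are hypothetical, and the query is treated as a literal substring.

```python
def search_organisms(query, names, case_sensitive=True, max_hits=100):
    """Return organism names containing `query`, or [] if there are too many hits."""
    if case_sensitive:
        hits = [name for name in names if query in name]
    else:
        lowered = query.lower()
        hits = [name for name in names if lowered in name.lower()]
    # A query that hits more than `max_hits` organisms returns no answer.
    return hits if len(hits) <= max_hits else []
```

With the default case-sensitive setting, `search_organisms("arabidopsis", ["Arabidopsis thaliana"])` finds nothing, while passing `case_sensitive=False` returns the entry.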

Each organism name, shown in the answer list for a query or in the alphabetical list, is followed by the name of its GenBank division [gbbct, gbinv etc.], a colon and the number of compiled CDS's, like this:

Arabidopsis thaliana [gbpln]: 23188
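A line in this form can be split into its three parts with a short regular expression. The sketch below is only an illustration based on the single example shown here; `parse_entry` is a hypothetical helper, and the pattern assumes division names are lowercase and start with "gb".

```python
import re

# Assumed layout: "<organism name> [<division>]: <number of CDS's>"
ENTRY_RE = re.compile(r"^(?P<name>.+?)\s*\[(?P<division>gb[a-z]+)\]:\s*(?P<cds>\d+)$")


def parse_entry(line: str):
    """Return (organism name, GenBank division, number of CDS's) for one list line."""
    match = ENTRY_RE.match(line.strip())
    if match is None:
        raise ValueError("unrecognised entry line: %r" % line)
    return match["name"], match["division"], int(match["cds"])
```

Applied to the example above, `parse_entry("Arabidopsis thaliana [gbpln]: 23188")` yields the name, the division `gbpln` and the integer 23188.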

If you select the link for an organism, the codon usage table for that organism is shown. The table gives the frequency (per thousand) and the count for each codon, summed over all CDS's of the organism. When a genetic code is selected, a table that includes amino acids, or one formatted in GCG style (under construction), is also shown. A back translation program, useful for designing PCR primers in protein coding regions, is also available (under construction).
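The per-thousand frequency follows directly from the summed counts: each codon's count is divided by the total number of counted codons and multiplied by 1000. A minimal sketch, assuming the counts are held in a dictionary keyed by codon (the function name `per_thousand` is hypothetical):

```python
def per_thousand(counts):
    """Convert codon counts into frequencies per thousand codons."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {codon: 1000.0 * n / total for codon, n in counts.items()}
```

For instance, with counts of 1 for ATG and 3 for GCT, ATG has a frequency of 250 per thousand and GCT has 750 per thousand.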

By selecting the link "Codon usage of each CDS" under the table, you can browse or download the codon usage tables of all CDS's in the organism. The tables are in CUTG format (see the link below for the "CODON_LABEL" file in CUTG).

CUTG: Codon Usage Tabulated from GenBank

Codon usage tables for all CDS's in each GenBank division (pri, rod, mam, vrt, inv, pln, bct, vrl and phg) can be downloaded from the "FTP links for CUTG files" link on the top page.

The README document contains the latest information on the database in plain text format. The CODON_LABEL and SPSUM_LABEL files describe the file formats.

Acknowledgment

We wish to thank Dr. Ugawa* at the DNA Information and Stock Center, National Institute of Agrobiological Resources, for help in constructing and distributing the database from 1996 to 1999.

[* Present address of Dr. Ugawa: Environmental Education Center, Miyagi University of Education]

This work was supported by a Grant-in-Aid for Scientific Research (Grant-in-Aid for Publication of Scientific Research Results) from the Japan Society for the Promotion of Science.

Please cite

Codon usage tabulated from the international DNA sequence databases: status for the year 2000. Nakamura, Y., Gojobori, T. and Ikemura, T. (2000) Nucl. Acids Res. 28, 292.

(NAR Database Issue page)

This article gives references to earlier papers.

QUERY Box for search with Latin name of organism


Case: sensitive | insensitive

Enter the scientific name of an organism (or a regular expression matching it) and press "Submit" or the return key. Use a Latin name such as "Marchantia polymorpha" or "Saccharomyces cerevisiae", not a common name such as "liverwort" or "yeast".

Alphabetical lists of all organisms

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

Chloroplast

Mitochondrion

Others (initials are not capital)


CUTG: Codon Usage Tabulated from GenBank (ftp distribution)


Additional Service
Countcodon program: compiles a sequence into a codon usage table
