Genetic Code and Amino Acid Translation
Table 1 shows the genetic code of messenger ribonucleic acid (mRNA): all 64 possible codons, each composed of three nucleotide bases (tri-nucleotide units), that direct amino acid incorporation during protein assembly.
Each codon of the deoxyribonucleic acid (DNA) codes for or specifies a single amino acid and each nucleotide unit consists of a phosphate, deoxyribose sugar and one of the 4 nitrogenous nucleotide bases, adenine (A), guanine (G), cytosine (C) and thymine (T). The bases are paired and joined together by hydrogen bonds in the double helix of the DNA. mRNA corresponds to DNA (i.e. the sequence of nucleotides is the same in both chains) except that in RNA, thymine (T) is replaced by uracil (U), and the deoxyribose is substituted by ribose.
Translating genetic information into a protein requires first mRNA, which is read 5' to 3' (exactly as DNA), and then transfer ribonucleic acid (tRNA), which is read 3' to 5'. tRNA is the "taxi" that delivers amino acids to the ribosome, where the information on the mRNA is translated into an amino acid chain (polypeptide).
For mRNA there are 4^3 = 64 different nucleotide combinations possible with a triplet codon of three nucleotides. All 64 possible combinations are shown in Table 1. However, not all 64 codons of the genetic code specify a single amino acid during translation. The reason is that in humans only 20 amino acids (not counting selenocysteine) are involved in translation. Therefore, one amino acid can be encoded by more than one mRNA codon-triplet. Arginine, leucine and serine are encoded by 6 triplets, isoleucine by 3, methionine and tryptophan by 1, and all other amino acids by 4 or 2 codons. The redundant codons typically differ at the 3rd base. Table 2 shows the inverse codon assignment, i.e. which codons specify each of the 20 standard amino acids involved in translation.
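The counting argument above can be checked with a short sketch. The names `BASES`, `AMINO_ACIDS`, `CODON_TABLE` and `degeneracy` are illustrative and not part of the original text; the 64-letter string is simply Table 1 written out in U, C, A, G order with one-letter amino acid codes and `*` for a stop codon.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"  # same base order as Table 1
# One-letter amino acid codes for the 64 codons in UCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map every triplet (UUU, UUC, UUA, ...) to its amino acid.
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of codons per amino acid (and per stop signal).
degeneracy = Counter(CODON_TABLE.values())

print(len(CODON_TABLE))  # number of distinct codons
for aa in "LRSIMW":
    print(aa, degeneracy[aa])
```

Under these assumptions the counter reports six codons each for leucine (L), arginine (R) and serine (S), three for isoleucine (I), and one each for methionine (M) and tryptophan (W).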
Table 1. Genetic code: mRNA codon -> amino acid
| 1st Base | 2nd Base: U | 2nd Base: C | 2nd Base: A | 2nd Base: G | 3rd Base |
| U | Phenylalanine | Serine | Tyrosine | Cysteine | U |
| U | Phenylalanine | Serine | Tyrosine | Cysteine | C |
| U | Leucine | Serine | Stop | Stop | A |
| U | Leucine | Serine | Stop | Tryptophan | G |
| C | Leucine | Proline | Histidine | Arginine | U |
| C | Leucine | Proline | Histidine | Arginine | C |
| C | Leucine | Proline | Glutamine | Arginine | A |
| C | Leucine | Proline | Glutamine | Arginine | G |
| A | Isoleucine | Threonine | Asparagine | Serine | U |
| A | Isoleucine | Threonine | Asparagine | Serine | C |
| A | Isoleucine | Threonine | Lysine | Arginine | A |
| A | Methionine (Start)[1] | Threonine | Lysine | Arginine | G |
| G | Valine | Alanine | Aspartate | Glycine | U |
| G | Valine | Alanine | Aspartate | Glycine | C |
| G | Valine | Alanine | Glutamate | Glycine | A |
| G | Valine | Alanine | Glutamate | Glycine | G |
Table 2. Reverse codon table: amino acid -> mRNA codon
| Amino acid | mRNA codons | Amino acid | mRNA codons |
| Ala/A | GCU, GCC, GCA, GCG | Leu/L | UUA, UUG, CUU, CUC, CUA, CUG |
| Arg/R | CGU, CGC, CGA, CGG, AGA, AGG | Lys/K | AAA, AAG |
| Asn/N | AAU, AAC | Met/M | AUG |
| Asp/D | GAU, GAC | Phe/F | UUU, UUC |
| Cys/C | UGU, UGC | Pro/P | CCU, CCC, CCA, CCG |
| Gln/Q | CAA, CAG | Ser/S | UCU, UCC, UCA, UCG, AGU, AGC |
| Glu/E | GAA, GAG | Thr/T | ACU, ACC, ACA, ACG |
| Gly/G | GGU, GGC, GGA, GGG | Trp/W | UGG |
| His/H | CAU, CAC | Tyr/Y | UAU, UAC |
| Ile/I | AUU, AUC, AUA | Val/V | GUU, GUC, GUA, GUG |
| START | AUG | STOP | UAG, UGA, UAA |
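A reverse table like Table 2 can be derived mechanically from the forward code. The following is a minimal sketch; `reverse_codon_table` and the other names are hypothetical helpers, and the forward table is the same compact UCAG-ordered encoding of Table 1 (one-letter codes, `*` for stop).

```python
from collections import defaultdict
from itertools import product

BASES = "UCAG"
# Table 1 in UCAG order, one-letter amino acid codes, "*" = stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_codon_table(table):
    """Group codons by the amino acid (or stop signal) they specify."""
    reverse = defaultdict(list)
    for codon, aa in table.items():
        reverse[aa].append(codon)
    return dict(reverse)


REVERSE = reverse_codon_table(CODON_TABLE)
print(REVERSE["L"])  # the six leucine codons
print(REVERSE["*"])  # the three stop codons
```

The grouping makes the one-to-many direction explicit: a codon maps to exactly one amino acid, but an amino acid maps to a list of codons.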
mRNA is read in the 5' to 3' direction. tRNA (read 3' to 5') carries an anticodon complementary to the mRNA codon and can be covalently "charged" with an amino acid at its 3' terminus. According to Crick, strict base pairing between the mRNA codon and the tRNA anticodon takes place only at the 1st and 2nd base. The binding at the 3rd base (i.e. at the 5' end of the tRNA anticodon) is weaker and can result in different pairs. For codon and anticodon to bind, the bases must wobble out of their usual positions at the ribosome. Therefore, these base-pairs are sometimes called wobble-pairs.
Table 3 shows the possible wobble-pairs at the 1st, 2nd and 3rd base. The possible pair combinations at the 1st and 2nd base follow the same strict rules. At the 3rd base (i.e. at the 3' end of the mRNA codon and the 5' end of the tRNA anticodon) the possible pair combinations are ambiguous, which leads to the redundancy in mRNA. The deamination (removal of the amino group NH2) of adenosine (not to be confused with adenine) produces the nucleoside inosine (I) on tRNA, which forms non-standard wobble-pairs with U, C or A (but not with G) on mRNA. Inosine may occur at the 3rd base of the tRNA anticodon.
Table 3. Base-pairs: mRNA codon -> tRNA anticodon
| 1st (i.e. 5' end) and 2nd place mRNA codon | 1st (i.e. 3' end) and 2nd place tRNA anticodon |
| A | U |
| U | A |
| C | G |
| G | C |
| 3rd place (i.e. 3' end) mRNA codon | 3rd place (i.e. 5' end) tRNA anticodon |
| A or G | U |
| U | A |
| U or C | G |
| G | C |
| U, C or A | I |
Table 3 is read as follows: for the 1st and 2nd base-pairs the pairing is unique, i.e. U on tRNA always pairs with A on mRNA, A on tRNA always pairs with U on mRNA, etc. For the 3rd base-pair the genetic code is redundant: U on tRNA can pair with A or G on mRNA, G on tRNA with U or C on mRNA, and I on tRNA with U, C or A on mRNA. Only A and C at the 3rd place on tRNA are unambiguously assigned to U and G at the 3rd place on mRNA, respectively.
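The reading rules of Table 3 can be written as two small lookup tables. This is a sketch under the assumption that only the pairings listed in Table 3 are allowed; `STRICT`, `WOBBLE` and `codons_read_by` are illustrative names. Anticodons are written 3' to 5' as in the tables, so the last letter is the wobble base.

```python
# Strict pairing at codon positions 1 and 2: anticodon base -> codon base.
STRICT = {"A": "U", "U": "A", "C": "G", "G": "C"}

# Wobble pairing at codon position 3: anticodon base (5' end) -> possible codon bases.
WOBBLE = {"U": "AG", "A": "U", "G": "UC", "C": "G", "I": "UCA"}


def codons_read_by(anticodon):
    """Return the mRNA codons (5'->3') that an anticodon (written 3'->5') can pair with."""
    first, second, wobble = anticodon
    stem = STRICT[first] + STRICT[second]
    return [stem + third for third in WOBBLE[wobble]]


print(codons_read_by("AGI"))  # serine anticodon with inosine at the wobble position
print(codons_read_by("AAG"))  # phenylalanine anticodon
```

The function applies the pairing rules only. It does not check whether all codons it returns specify the same amino acid, so an anticodon such as UAU would also be paired with AUG here.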
Due to this pairing structure a tRNA can bind to different mRNA codons when synonymous (redundant) mRNA codons differ at the 3rd base (i.e. at the 5' end of the tRNA anticodon and the 3' end of the mRNA codon). By this logic the minimum number of tRNA anticodons necessary to translate all amino acids reduces to 31 (excluding the 2 anticodons AUU and ACU, which are complementary to the STOP codons, see Table 5). This means that any tRNA anticodon can pair with one or more different mRNA codons (Table 4). However, more than 31 tRNA anticodons are possible for the translation of all 64 mRNA codons. For example, serine has a fourfold degenerate site at the 3rd position (UCU, UCC, UCA, UCG), which can be translated by AGI (for UCU, UCC and UCA) and AGC (for UCG) on tRNA, but also by AGG and AGU. This means, in turn, that any mRNA codon can also be translated by one or more tRNA anticodons (see Table 5).
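The opposite lookup, from one mRNA codon to every anticodon that could pair with it, follows from the same rules. A minimal sketch, assuming only the pairings of Table 3; `COMPLEMENT`, `WOBBLE` and `anticodons_for` are hypothetical names.

```python
# Strict complement for codon positions 1 and 2.
COMPLEMENT = {"A": "U", "U": "A", "C": "G", "G": "C"}

# Wobble base on the anticodon -> codon bases it can pair with at position 3.
WOBBLE = {"U": "AG", "A": "U", "G": "UC", "C": "G", "I": "UCA"}


def anticodons_for(codon):
    """Return all anticodons (written 3'->5') that can pair with an mRNA codon (5'->3')."""
    first, second, third = codon
    stem = COMPLEMENT[first] + COMPLEMENT[second]
    return sorted(stem + w for w, readable in WOBBLE.items() if third in readable)


print(anticodons_for("UCU"))
print(anticodons_for("UCG"))
```

For the serine codons UCU and UCG this yields the same anticodons as the serine example above. For codons such as AUG or UGG the pure pairing rule returns an extra anticodon ending in U, which the tables omit because it would also read a codon with a different meaning.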
The occurrence of different wobble-pairs encoding the same amino acid may reflect a compromise between speed and safety in protein synthesis. The redundancy of mRNA codons helps to buffer against errors caused by mutations or variations at the 3rd position, but also at other positions. For example, the first position of the leucine codons (UUA, UUG, CUU, CUC, CUA, CUG) is a twofold degenerate site, while the second position is unambiguous (not redundant). Another example is serine with the mRNA codons UCA, UCG, UCC, UCU, AGU, AGC. Serine is likewise twofold degenerate at the first position and fourfold degenerate at the third position, but it is twofold degenerate at the second position in addition. Table 4 shows the assignment of mRNA codons to every possible tRNA anticodon in eukaryotes for the 20 standard amino acids involved in translation. It is the reverse codon assignment.
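The position-wise variation described for leucine and serine can be counted directly. In this sketch, "degeneracy" is taken in the sense used above, i.e. the number of distinct bases that occur at a position across all synonymous codons of one amino acid; `position_variety` is an illustrative name.

```python
def position_variety(codons):
    """Count the distinct bases at positions 1, 2 and 3 across a set of synonymous codons."""
    return tuple(len({codon[i] for codon in codons}) for i in range(3))


LEUCINE = ["UUA", "UUG", "CUU", "CUC", "CUA", "CUG"]
SERINE = ["UCA", "UCG", "UCC", "UCU", "AGU", "AGC"]

print(position_variety(LEUCINE))  # (first, second, third position)
print(position_variety(SERINE))
```

Leucine varies at the first and third positions only, whereas serine varies at all three.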
Table 4. Reverse amino acid encoding: amino acid -> tRNA anticodon -> mRNA codon
| Amino acid | tRNA anticodon | mRNA codon |
| Phenylalanine | 3'-AAG-5' | 5'-UUU-3', 5'-UUC-3' |
| | 3'-AAA-5' | 5'-UUU-3' |
| Leucine | 3'-AAU-5' | 5'-UUA-3', 5'-UUG-3' |
| | 3'-AAC-5' | 5'-UUG-3' |
| | 3'-GAI-5' | 5'-CUU-3', 5'-CUC-3', 5'-CUA-3' |
| | 3'-GAG-5' | 5'-CUU-3', 5'-CUC-3' |
| | 3'-GAU-5' | 5'-CUA-3', 5'-CUG-3' |
| | 3'-GAA-5' | 5'-CUU-3' |
| | 3'-GAC-5' | 5'-CUG-3' |
| Serine | 3'-AGI-5' | 5'-UCU-3', 5'-UCC-3', 5'-UCA-3' |
| | 3'-AGG-5' | 5'-UCU-3', 5'-UCC-3' |
| | 3'-AGU-5' | 5'-UCA-3', 5'-UCG-3' |
| | 3'-AGA-5' | 5'-UCU-3' |
| | 3'-AGC-5' | 5'-UCG-3' |
| | 3'-UCG-5' | 5'-AGU-3', 5'-AGC-3' |
| | 3'-UCA-5' | 5'-AGU-3' |
| Tyrosine | 3'-AUG-5' | 5'-UAU-3', 5'-UAC-3' |
| | 3'-AUA-5' | 5'-UAU-3' |
| Cysteine | 3'-ACG-5' | 5'-UGU-3', 5'-UGC-3' |
| | 3'-ACA-5' | 5'-UGU-3' |
| Tryptophan | 3'-ACC-5' | 5'-UGG-3' |
| Proline | 3'-GGI-5' | 5'-CCU-3', 5'-CCC-3', 5'-CCA-3' |
| | 3'-GGG-5' | 5'-CCU-3', 5'-CCC-3' |
| | 3'-GGU-5' | 5'-CCA-3', 5'-CCG-3' |
| | 3'-GGA-5' | 5'-CCU-3' |
| | 3'-GGC-5' | 5'-CCG-3' |
| Histidine | 3'-GUG-5' | 5'-CAU-3', 5'-CAC-3' |
| | 3'-GUA-5' | 5'-CAU-3' |
| Glutamine | 3'-GUU-5' | 5'-CAA-3', 5'-CAG-3' |
| | 3'-GUC-5' | 5'-CAG-3' |
| Arginine | 3'-GCI-5' | 5'-CGU-3', 5'-CGC-3', 5'-CGA-3' |
| | 3'-GCG-5' | 5'-CGU-3', 5'-CGC-3' |
| | 3'-GCU-5' | 5'-CGA-3', 5'-CGG-3' |
| | 3'-GCA-5' | 5'-CGU-3' |
| | 3'-GCC-5' | 5'-CGG-3' |
| | 3'-UCU-5' | 5'-AGA-3', 5'-AGG-3' |
| | 3'-UCC-5' | 5'-AGG-3' |
| Isoleucine | 3'-UAI-5' | 5'-AUU-3', 5'-AUC-3', 5'-AUA-3' |
| | 3'-UAG-5' | 5'-AUU-3', 5'-AUC-3' |
| | 3'-UAA-5' | 5'-AUU-3' |
| | 3'-UAU-5' | 5'-AUA-3' |
| Methionine | 3'-UAC-5' | 5'-AUG-3' |
| Threonine | 3'-UGI-5' | 5'-ACU-3', 5'-ACC-3', 5'-ACA-3' |
| | 3'-UGG-5' | 5'-ACU-3', 5'-ACC-3' |
| | 3'-UGU-5' | 5'-ACA-3', 5'-ACG-3' |
| | 3'-UGA-5' | 5'-ACU-3' |
| | 3'-UGC-5' | 5'-ACG-3' |
| Asparagine | 3'-UUG-5' | 5'-AAU-3', 5'-AAC-3' |
| | 3'-UUA-5' | 5'-AAU-3' |
| Lysine | 3'-UUU-5' | 5'-AAA-3', 5'-AAG-3' |
| | 3'-UUC-5' | 5'-AAG-3' |
| Valine | 3'-CAI-5' | 5'-GUU-3', 5'-GUC-3', 5'-GUA-3' |
| | 3'-CAG-5' | 5'-GUU-3', 5'-GUC-3' |
| | 3'-CAU-5' | 5'-GUA-3', 5'-GUG-3' |
| | 3'-CAA-5' | 5'-GUU-3' |
| | 3'-CAC-5' | 5'-GUG-3' |
| Alanine | 3'-CGI-5' | 5'-GCU-3', 5'-GCC-3', 5'-GCA-3' |
| | 3'-CGG-5' | 5'-GCU-3', 5'-GCC-3' |
| | 3'-CGU-5' | 5'-GCA-3', 5'-GCG-3' |
| | 3'-CGA-5' | 5'-GCU-3' |
| | 3'-CGC-5' | 5'-GCG-3' |
| Aspartate | 3'-CUG-5' | 5'-GAU-3', 5'-GAC-3' |
| | 3'-CUA-5' | 5'-GAU-3' |
| Glutamate | 3'-CUU-5' | 5'-GAA-3', 5'-GAG-3' |
| | 3'-CUC-5' | 5'-GAG-3' |
| Glycine | 3'-CCI-5' | 5'-GGU-3', 5'-GGC-3', 5'-GGA-3' |
| | 3'-CCG-5' | 5'-GGU-3', 5'-GGC-3' |
| | 3'-CCU-5' | 5'-GGA-3', 5'-GGG-3' |
| | 3'-CCA-5' | 5'-GGU-3' |
| | 3'-CCC-5' | 5'-GGG-3' |
While it is not possible to predict a specific DNA codon from an amino acid, DNA codons can be decoded unambiguously into amino acids. The reason is that there are 61 different DNA (and mRNA) codons specifying only 20 amino acids. Note that there are 3 additional codons for chain termination, i.e. there are 64 DNA (and thus 64 different mRNA) codons, but only 61 of them specify amino acids.
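The unambiguous forward direction (DNA codon to amino acid) can be illustrated with a short translation sketch. It assumes the input is the DNA coding strand written 5' to 3' and that the reading frame starts at the first base; `transcribe`, `translate` and the example sequence are made up for illustration.

```python
from itertools import product

BASES = "UCAG"
# Table 1 in UCAG order, one-letter amino acid codes, "*" = stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(dna):
    """Coding-strand DNA -> mRNA: same sequence with thymine (T) replaced by uracil (U)."""
    return dna.upper().replace("T", "U")


def translate(mrna):
    """Read an mRNA codon by codon from index 0 and stop at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = transcribe("ATGTTTGGCTAA")
print(mrna)             # AUGUUUGGCUAA
print(translate(mrna))  # MFG
```

Each codon yields exactly one amino acid, so the result is fully determined by the sequence. Going back from "MFG" to a nucleotide sequence is not, because phenylalanine and glycine each have several codons.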
Table 5 shows the genetic code for the translation of all 64 DNA codons, from DNA via mRNA and tRNA to amino acid. In the last column, the table lists the tRNA anticodons minimally necessary to translate all DNA codons into amino acids and sums up their number in the final row. It reveals that the minimum number of tRNA anticodons needed to translate all DNA codons is 31 (plus 2 for the STOP codons). The maximum number of tRNA anticodons that can occur in amino acid translation is 70 (plus 3 for the STOP codons).
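The minimum of 31 anticodons for the 61 sense codons can be reproduced by a brute-force search. The sketch rests on two assumptions that are stricter than the tables: only the wobble pairings of Table 3 exist, and an anticodon is usable for an amino acid only if every codon it can pair with specifies that amino acid. Stop codons are left out. All names are illustrative.

```python
from itertools import combinations, product

BASES = "UCAG"
# Table 1 in UCAG order, one-letter amino acid codes, "*" = stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Wobble base on the anticodon -> set of codon 3rd bases it can pair with (Table 3).
WOBBLE = {"A": {"U"}, "G": {"U", "C"}, "U": {"A", "G"}, "C": {"G"}, "I": {"U", "C", "A"}}


def min_anticodons(third_bases):
    """Smallest number of wobble bases whose pairings cover exactly `third_bases`."""
    usable = [w for w, reads in WOBBLE.items() if reads <= third_bases]
    for size in range(1, len(usable) + 1):
        for combo in combinations(usable, size):
            if set().union(*(WOBBLE[w] for w in combo)) == third_bases:
                return size
    raise ValueError("no set of anticodons covers these codons")


def count_minimum():
    """Sum the minimum over every amino acid within every first/second-base box."""
    total = 0
    for first, second in product(BASES, repeat=2):
        groups = {}
        for third in BASES:
            amino_acid = CODON_TABLE[first + second + third]
            if amino_acid != "*":  # stop codons are excluded from the count
                groups.setdefault(amino_acid, set()).add(third)
        total += sum(min_anticodons(bases) for bases in groups.values())
    return total


print(count_minimum())  # 31
```

A fourfold box such as UCx needs two anticodons (for example one ending in I and one ending in C or U), while a twofold group such as UUU/UUC needs only one.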
Table 5. Genetic code: DNA -> mRNA codon -> tRNA anticodon -> amino acid
| Obs. | DNA | mRNA | tRNA | Amino acid | Different AA | Diff. tRNA anticodons to encode all AA |
| 1 | TTT | UUU | AAA, AAG | Phe | Phenylalanine | AAG |
| 2 | TTC | UUC | AAG | Phe | | |
| 3 | TTA | UUA | AAU | Leu | Leucine | AAU |
| 4 | TTG | UUG | AAU, AAC | Leu | | |
| 5 | TCT | UCU | AGI, AGG, AGA | Ser | Serine | AGI |
| 6 | TCC | UCC | AGI, AGG | Ser | | |
| 7 | TCA | UCA | AGI, AGU | Ser | | |
| 8 | TCG | UCG | AGC, AGU | Ser | | AGC (or AGU) |
| 9 | TAT | UAU | AUA, AUG | Tyr | Tyrosine | AUG |
| 10 | TAC | UAC | AUG | Tyr | | |
| 11 | TAA | UAA | AUU | STOP | | AUU |
| 12 | TAG | UAG | AUC, AUU | STOP | | |
| 13 | TGT | UGU | ACA, ACG | Cys | Cysteine | ACG |
| 14 | TGC | UGC | ACG | Cys | | |
| 15 | TGA | UGA | ACU | STOP | | ACU |
| 16 | TGG | UGG | ACC | Trp | Tryptophan | ACC |
| 17 | CTT | CUU | GAI, GAG, GAA | Leu | | GAI |
| 18 | CTC | CUC | GAI, GAG | Leu | | |
| 19 | CTA | CUA | GAI, GAU | Leu | | |
| 20 | CTG | CUG | GAC, GAU | Leu | | GAC (or GAU) |
| 21 | CCT | CCU | GGI, GGG, GGA | Pro | Proline | GGI |
| 22 | CCC | CCC | GGI, GGG | Pro | | |
| 23 | CCA | CCA | GGI, GGU | Pro | | |
| 24 | CCG | CCG | GGC, GGU | Pro | | GGC (or GGU) |
| 25 | CAT | CAU | GUA, GUG | His | Histidine | GUG |
| 26 | CAC | CAC | GUG | His | | |
| 27 | CAA | CAA | GUU | Gln | Glutamine | GUU |
| 28 | CAG | CAG | GUC, GUU | Gln | | |
| 29 | CGT | CGU | GCI, GCG, GCA | Arg | Arginine | GCI |
| 30 | CGC | CGC | GCI, GCG | Arg | | |
| 31 | CGA | CGA | GCI, GCU | Arg | | |
| 32 | CGG | CGG | GCC, GCU | Arg | | GCC (or GCU) |
| 33 | ATT | AUU | UAI, UAG, UAA | Ile | Isoleucine | UAI |
| 34 | ATC | AUC | UAI, UAG | Ile | | |
| 35 | ATA | AUA | UAI, UAU | Ile | | |
| 36 | ATG | AUG | UAC | Met | Methionine | UAC |
| 37 | ACT | ACU | UGI, UGG, UGA | Thr | Threonine | UGI |
| 38 | ACC | ACC | UGI, UGG | Thr | | |
| 39 | ACA | ACA | UGI, UGU | Thr | | |
| 40 | ACG | ACG | UGC, UGU | Thr | | UGC (or UGU) |
| 41 | AAT | AAU | UUA, UUG | Asn | Asparagine | UUG |
| 42 | AAC | AAC | UUG | Asn | | |
| 43 | AAA | AAA | UUU | Lys | Lysine | UUU |
| 44 | AAG | AAG | UUC, UUU | Lys | | |
| 45 | AGT | AGU | UCA, UCG | Ser | | UCG |
| 46 | AGC | AGC | UCG | Ser | | |
| 47 | AGA | AGA | UCU | Arg | | UCU |
| 48 | AGG | AGG | UCC, UCU | Arg | | |
| 49 | GTT | GUU | CAI, CAG, CAA | Val | Valine | CAI |
| 50 | GTC | GUC | CAI, CAG | Val | | |
| 51 | GTA | GUA | CAI, CAU | Val | | |
| 52 | GTG | GUG | CAC, CAU | Val | | CAC (or CAU) |
| 53 | GCT | GCU | CGI, CGG, CGA | Ala | Alanine | CGI |
| 54 | GCC | GCC | CGI, CGG | Ala | | |
| 55 | GCA | GCA | CGI, CGU | Ala | | |
| 56 | GCG | GCG | CGC, CGU | Ala | | CGC (or CGU) |
| 57 | GAT | GAU | CUG, CUA | Asp | Aspartate | CUG |
| 58 | GAC | GAC | CUG | Asp | | |
| 59 | GAA | GAA | CUU | Glu | Glutamate | CUU |
| 60 | GAG | GAG | CUU, CUC | Glu | | |
| 61 | GGT | GGU | CCI, CCG, CCA | Gly | Glycine | CCI |
| 62 | GGC | GGC | CCI, CCG | Gly | | |
| 63 | GGA | GGA | CCI, CCU | Gly | | |
| 64 | GGG | GGG | CCC, CCU | Gly | | CCC (or CCU) |
| No. | 64 | 64 | | | 20 | 33 |
Note:
[1] The codon AUG both codes for methionine and serves as an initiation site: the first AUG in an mRNA's coding region is where translation into protein begins.