Genetic code as a way of recording hereditary information. Biosynthesis of protein and nucleic acids

Proteins and nucleic acids play the leading role in the body's metabolism.
Proteins form the basis of all vital cell structures, are unusually reactive, and perform catalytic functions.
Nucleic acids are found in the most important organelle of the cell, the nucleus, as well as in the cytoplasm, ribosomes, mitochondria, etc. Nucleic acids play a primary role in heredity, variability, and protein synthesis.

The plan for protein synthesis is stored in the cell nucleus, while synthesis itself takes place outside the nucleus, so the encoded plan must be delivered from the nucleus to the site of synthesis. This delivery service is performed by RNA molecules.

The process starts in the cell nucleus: part of the DNA “ladder” unwinds and opens. This allows RNA letters to form bonds with the exposed DNA letters of one of the DNA strands. An enzyme joins the RNA letters into a chain. This is how the letters of DNA are “rewritten” into the letters of RNA. The newly formed RNA chain separates, and the DNA “ladder” twists back together. The process of reading information from DNA and synthesizing RNA on it as a template is called transcription, and the synthesized RNA is called messenger RNA, or mRNA.

After further modifications, the encoded mRNA is ready. It leaves the nucleus and travels to the site of protein synthesis, where its letters are deciphered. Each set of three mRNA letters forms a “word” that stands for one specific amino acid.

Another type of RNA finds this amino acid, captures it with the help of an enzyme, and delivers it to the site of protein synthesis. This RNA is called transfer RNA, or tRNA. As the mRNA message is read and translated, the chain of amino acids grows. The chain twists and folds into a unique shape, creating one type of protein. Even the folding process is remarkable: a computer would need 10^27 (!) years to work through all the folding options of an average-sized protein of 100 amino acids. Yet in the body it takes no more than one second to form a chain of 20 amino acids, and this process goes on continuously in all cells of the body.

Genes, genetic code and its properties.

About 7 billion people live on Earth. Apart from 25-30 million pairs of identical twins, all people are genetically different: everyone is unique and has unique hereditary characteristics, character traits, abilities, and temperament.

These differences are explained by differences in genotypes, the sets of genes of an organism; each one is unique. The genetic characteristics of a particular organism are embodied in its proteins; therefore, the proteins of one person differ, although very slightly, from those of another person.

This does not mean that no two people share any identical proteins. Proteins that perform the same functions may be identical or may differ by only one or two amino acids. But there are no two people on Earth (with the exception of identical twins) in whom all proteins are the same.

Information about the primary structure of a protein is encoded as a sequence of nucleotides in a section of a DNA molecule, a gene, which is the unit of hereditary information of an organism. Each DNA molecule contains many genes. The totality of all the genes of an organism constitutes its genotype. Thus,

A gene is a unit of hereditary information of an organism that corresponds to a separate section of DNA.

Hereditary information is encoded by means of the genetic code, which is universal for all organisms; organisms differ only in the sequence of nucleotides that form their genes and encode their specific proteins.

The genetic code consists of triplets of DNA nucleotides combined in different sequences (AAT, GCA, ACG, TGC, etc.), each of which encodes a specific amino acid (which will be built into the polypeptide chain).

Strictly speaking, the code is taken to be the sequence of nucleotides in the mRNA molecule, because mRNA copies the information from DNA (the process of transcription) and this information is then converted into a sequence of amino acids in the synthesized proteins (the process of translation).
mRNA is built from the nucleotides A, C, G and U, and its triplets are called codons: the DNA triplet CGT becomes the mRNA triplet GCA, and the DNA triplet AAG becomes the triplet UUC. It is the mRNA codons that are used when the genetic code is written down.
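The rewriting of DNA triplets into mRNA codons can be sketched in a few lines of Python. This is only an illustration: the function name `transcribe` and the lookup table are hypothetical, and the sketch assumes simple position-by-position pairing that ignores strand direction, as the examples in the text do.

```python
# Complementary pairing used when mRNA is synthesized on a DNA template:
# A pairs with U (RNA has uracil instead of thymine), T with A, G with C, C with G.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_dna: str) -> str:
    """Return the mRNA sequence complementary to a DNA template.

    Assumption: bases are paired position by position and the 5'/3'
    orientation of the strands is ignored, as in the text's examples.
    """
    return "".join(DNA_TO_RNA[base] for base in template_dna.upper())


print(transcribe("CGT"))  # GCA
print(transcribe("AAG"))  # UUC
```

Both printed codons match the examples given in the text.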

Thus, the genetic code is a unified system for recording hereditary information in nucleic acid molecules in the form of a sequence of nucleotides. The genetic code is based on an alphabet of only four letters (nucleotides), distinguished by their nitrogenous bases: A, T, G, C.

Basic properties of the genetic code:

1. The genetic code is triplet. A triplet (codon) is a sequence of three nucleotides encoding one amino acid. Since proteins contain 20 amino acids, each of them obviously cannot be encoded by a single nucleotide (there are only four types of nucleotides in DNA, so 16 amino acids would remain uncoded). Two nucleotides are also not enough, since only 4^2 = 16 amino acids could then be encoded. This means that at least three nucleotides are needed to encode one amino acid. In this case, the number of possible nucleotide triplets is 4^3 = 64.

2. Redundancy (degeneracy) of the code is a consequence of its triplet nature and means that one amino acid can be encoded by several triplets (since there are 20 amino acids and 64 triplets), with the exception of methionine and tryptophan, each of which is encoded by only one triplet. In addition, some triplets perform specific functions: in an mRNA molecule, the triplets UAA, UAG and UGA are stop codons, i.e. stop signals that end the synthesis of the polypeptide chain. The triplet corresponding to methionine (AUG), when located at the beginning of the coding sequence, also performs the function of initiating reading.

3. Unambiguity. Alongside redundancy, the code is unambiguous: each codon corresponds to only one specific amino acid.

4. Collinearity, i.e. the nucleotide sequence in a gene corresponds exactly to the sequence of amino acids in the protein.

5. The genetic code is non-overlapping and compact, i.e. it contains no “punctuation marks” within a gene. This means that codons (triplets) cannot overlap during reading and that, starting at a certain codon, reading proceeds continuously, triplet after triplet, until a stop signal (stop codon).

6. The genetic code is universal, i.e. the nuclear genes of all organisms encode information about proteins in the same way, regardless of the level of organization and systematic position of these organisms.

Genetic code tables exist for decoding mRNA codons and constructing the chains of protein molecules.
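The counting argument behind the triplet property (item 1 above) can be checked directly. The sketch below enumerates every possible "word" of one, two and three nucleotides; the names `count_words` and `AMINO_ACIDS` are illustrative only.

```python
from itertools import product

NUCLEOTIDES = "ATGC"   # the four letters of the DNA alphabet
AMINO_ACIDS = 20       # number of amino acids that must be encoded


def count_words(length: int) -> int:
    """Count all ordered nucleotide sequences of the given length (4**length)."""
    return len(list(product(NUCLEOTIDES, repeat=length)))


for length in (1, 2, 3):
    words = count_words(length)
    verdict = "enough" if words >= AMINO_ACIDS else "not enough"
    print(f"{length} nucleotide(s): {words} combinations, {verdict} for {AMINO_ACIDS} amino acids")
```

Only words of three nucleotides give at least 20 combinations, which is why three is the minimum codon length.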

Matrix synthesis reactions.

Living systems carry out reactions that are unknown in inanimate nature: matrix (template) synthesis reactions.

In technology, the term “matrix” denotes a mold used for casting coins, medals, and typographic type: the hardened metal exactly reproduces all the details of the mold. Matrix synthesis resembles casting in such a mold: new molecules are synthesized in exact accordance with the plan laid down in the structure of existing molecules.

The matrix principle underlies the most important synthetic reactions of the cell, such as the synthesis of nucleic acids and proteins. These reactions ensure the exact, strictly specific sequence of monomer units in the synthesized polymers.

Here monomers are drawn in a directed way to a specific place in the cell: to the molecules that serve as the matrix on which the reaction takes place. If such reactions occurred as a result of random collisions of molecules, they would proceed infinitely slowly. Synthesis of complex molecules on the template principle is fast and accurate. In matrix reactions the role of the matrix is played by macromolecules of the nucleic acids DNA or RNA.

The monomers from which the polymer is synthesized (nucleotides or amino acids) are positioned and fixed on the matrix in a strictly defined order, in accordance with the principle of complementarity.

The monomer units are then “cross-linked” into a polymer chain, and the finished polymer is released from the matrix.

After that, the matrix is ready for the assembly of a new polymer molecule. Clearly, just as only one kind of coin or one letter can be cast in a given mold, only one kind of polymer can be “assembled” on a given matrix molecule.

Matrix-type reactions are a specific feature of the chemistry of living systems. They are the basis of the fundamental property of all living things: the ability to reproduce their own kind.

Template synthesis reactions

1. Replication (from Latin replicatio - renewal) is the synthesis of a daughter molecule of deoxyribonucleic acid on the matrix of the parent DNA molecule. During the subsequent division of the mother cell, each daughter cell receives one copy of the DNA molecule, identical to the DNA of the original mother cell. This process ensures that genetic information is passed on accurately from generation to generation. DNA replication is carried out by a large enzyme complex of 15-20 different proteins, called the replisome. The material for synthesis is free nucleotides present in the cytoplasm of cells. The biological meaning of replication lies in the accurate transfer of hereditary information from the mother molecule to the daughter molecules, which normally occurs during the division of somatic cells.

A DNA molecule consists of two complementary strands. These chains are held together by weak hydrogen bonds that can be broken by enzymes. The DNA molecule is capable of self-duplication (replication), and on each old half of the molecule a new half is synthesized.
In addition, an mRNA molecule can be synthesized on a DNA molecule, which then transfers the information received from DNA to the site of protein synthesis.

Information transfer and protein synthesis proceed according to a matrix principle, comparable to the operation of a printing press in a printing house. Information from DNA is copied many times. If errors occur during copying, they will be repeated in all subsequent copies.

True, some errors made when copying information from a DNA molecule can be corrected; this process of error elimination is called repair. The first reaction in the transfer of information is the replication of the DNA molecule and the synthesis of new DNA chains.

2. Transcription (from Latin transcriptio - rewriting) - the process of RNA synthesis using DNA as a template, occurring in all living cells. In other words, it is the transfer of genetic information from DNA to RNA.

Transcription is catalyzed by the enzyme DNA-dependent RNA polymerase. RNA polymerase moves along the DNA template in the 3' → 5' direction. Transcription consists of the stages of initiation, elongation and termination. The unit of transcription is an operon, a fragment of a DNA molecule consisting of a promoter, a transcribed part and a terminator. mRNA consists of a single chain and is synthesized on DNA according to the rule of complementarity, with the participation of an enzyme that activates the beginning and end of the synthesis of the mRNA molecule.

The finished mRNA molecule passes into the cytoplasm and onto the ribosomes, where polypeptide chains are synthesized.

3. Translation (from Latin translatio - transfer, movement) is the process of protein synthesis from amino acids on a matrix of messenger (informational) RNA (mRNA), carried out by the ribosome. In other words, it is the process of translating the information contained in the nucleotide sequence of mRNA into the sequence of amino acids in the polypeptide.

4. Reverse transcription is the process of forming double-stranded DNA based on information from single-stranded RNA. This process is called reverse transcription, since the transfer of genetic information occurs in the “reverse” direction relative to transcription. The idea of reverse transcription was initially very unpopular because it contradicted the central dogma of molecular biology, which assumed that DNA is transcribed into RNA and then translated into proteins.

However, in 1970, Temin and Baltimore independently discovered an enzyme called reverse transcriptase (revertase), and the possibility of reverse transcription was finally confirmed. In 1975, Temin and Baltimore were awarded the Nobel Prize in Physiology or Medicine. Some viruses (such as the human immunodeficiency virus, which causes HIV infection) are able to transcribe RNA into DNA. HIV has an RNA genome that is copied into DNA, and as a result the DNA of the virus can be combined with the genome of the host cell. The main enzyme responsible for synthesizing DNA from RNA is reverse transcriptase. One of its functions is to create complementary DNA (cDNA) from the viral genome. The associated ribonuclease cleaves the RNA, and reverse transcriptase builds a DNA double helix from the cDNA. The cDNA is integrated into the host cell genome by integrase. As a result, the host cell synthesizes viral proteins, which form new viruses. In the case of HIV, apoptosis (cell death) of T-lymphocytes is also triggered. In other cases, the cell may remain a distributor of viruses.

The sequence of matrix reactions during protein biosynthesis can be represented in the form of a diagram.

Thus, protein biosynthesis is one of the types of plastic metabolism, in which the hereditary information encoded in the genes of DNA is realized as a specific sequence of amino acids in protein molecules.

Protein molecules are essentially polypeptide chains made up of individual amino acids. But amino acids are not active enough to combine with each other on their own. Therefore, before they combine and form a protein molecule, amino acids must be activated. This activation occurs under the action of special enzymes.

As a result of activation, the amino acid becomes more labile and, under the action of the same enzyme, binds to tRNA. Each amino acid corresponds to a strictly specific tRNA, which finds “its” amino acid and carries it into the ribosome.

Consequently, various activated amino acids, each combined with its own tRNA, arrive at the ribosome. The ribosome is like a conveyor that assembles a protein chain from the various amino acids supplied to it.

Simultaneously with the tRNA on which its amino acid “sits”, a “signal” arrives at the ribosome from the DNA contained in the nucleus. In accordance with this signal, one or another protein is synthesized in the ribosome.

The directing influence of DNA on protein synthesis is exerted not directly but through a special intermediary, matrix or messenger RNA (mRNA), which is synthesized in the nucleus under the influence of DNA, so that its composition reflects the composition of DNA. The RNA molecule is like a cast of the DNA mold. The synthesized mRNA enters the ribosome and, as it were, hands this structure a plan: the order in which the activated amino acids entering the ribosome must be joined for a specific protein to be synthesized. In other words, the genetic information encoded in DNA is transferred to mRNA and then to protein.

The mRNA molecule enters the ribosome and threads through it. The segment currently located in the ribosome, defined by a codon (triplet), interacts in a highly specific manner with the complementary triplet (anticodon) of the transfer RNA that brought the amino acid into the ribosome.

The transfer RNA with its amino acid pairs with a specific codon of the mRNA; another tRNA with a different amino acid attaches to the next, neighboring section of the mRNA, and so on until the entire mRNA chain has been read and all the amino acids have been joined in the appropriate order, forming a protein molecule. The tRNA that delivered its amino acid to the growing polypeptide chain is released from the amino acid and leaves the ribosome.

Then, back in the cytoplasm, the appropriate amino acid can attach to it again, and the tRNA carries it to the ribosome once more. Protein synthesis involves not one ribosome but several working simultaneously - polyribosomes.
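The codon-by-codon reading described above can be sketched as a small Python function. This is a simplified model under stated assumptions: the table is the standard genetic code written in the usual compact one-letter form (`*` marks a stop codon), reading starts at the first AUG found in the sequence, and the names `CODON_TABLE` and `translate` are illustrative.

```python
from itertools import product

# Standard genetic code for mRNA codons, bases ordered U, C, A, G.
# One-letter amino acid symbols; "*" marks the stop codons UAA, UAG, UGA.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(mrna: str) -> str:
    """Read mRNA triplet by triplet from the first AUG until a stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, nothing is synthesized in this model
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break  # stop signal: the chain is released
        protein.append(amino_acid)
    return "".join(protein)


# Codons read: AUG UUU GCA UGG UAA -> Met, Phe, Ala, Trp, stop
print(translate("GGAUGUUUGCAUGGUAAGC"))  # MFAW
```

The sketch leaves out everything the ribosome actually does chemically (tRNA charging, anticodon pairing, peptide bond formation); it only shows how a fixed table and a fixed reading frame determine the amino acid sequence.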

The main stages of the transfer of genetic information:

1. Synthesis of mRNA on DNA as a template (transcription).
2. Synthesis of a polypeptide chain in ribosomes according to the program contained in mRNA (translation).

The stages are universal for all living beings, but the temporal and spatial relationships of these processes differ in pro- and eukaryotes.

In prokaryotes, transcription and translation can occur simultaneously because DNA is located in the cytoplasm. In eukaryotes, transcription and translation are strictly separated in space and time: the synthesis of the various RNAs occurs in the nucleus, after which the RNA molecules must leave the nucleus by passing through the nuclear membrane. The RNAs are then transported through the cytoplasm to the site of protein synthesis.

Lecture 5. Genetic code

Definition of the concept

The genetic code is a system for recording information about the sequence of amino acids in proteins using the sequence of nucleotides in DNA.

Since DNA is not directly involved in protein synthesis, the code is written in RNA language. RNA contains uracil instead of thymine.

Properties of the genetic code

1. Triplet nature

Each amino acid is encoded by a sequence of 3 nucleotides.

Definition: a triplet or codon is a sequence of three nucleotides encoding one amino acid.

The code cannot be a singlet code, since 4 (the number of different nucleotides in DNA) is less than 20. It cannot be a doublet code, because 16 (the number of ordered pairs of the 4 nucleotides, 4^2) is less than 20. It can be a triplet code, because 64 (the number of ordered triplets of the 4 nucleotides, 4^3) is more than 20.

2. Degeneracy.

All amino acids, with the exception of methionine and tryptophan, are encoded by more than one triplet:

2 amino acids with 1 triplet each = 2.
9 amino acids with 2 triplets each = 18.
1 amino acid with 3 triplets = 3.
5 amino acids with 4 triplets each = 20.
3 amino acids with 6 triplets each = 18.

A total of 61 triplets encode 20 amino acids.
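The tally above can be reproduced from the standard codon table. The sketch below is illustrative: it assumes the standard genetic code in its compact one-letter form (`*` for stop codons), and the variable names are hypothetical.

```python
from collections import Counter
from itertools import product

# Standard genetic code, bases ordered U, C, A, G; "*" marks a stop codon.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# How many codons does each amino acid have? (stop codons are excluded)
codons_per_amino_acid = Counter(aa for aa in CODON_TABLE.values() if aa != "*")

# How many amino acids have 1, 2, 3, 4 or 6 codons?
degeneracy = Counter(codons_per_amino_acid.values())

for n_codons in sorted(degeneracy):
    n_acids = degeneracy[n_codons]
    print(f"{n_acids} amino acid(s) x {n_codons} triplet(s) = {n_acids * n_codons}")

print("sense codons in total:", sum(codons_per_amino_acid.values()))
```

The output reproduces the five lines of the tally and the total of 61 sense codons; the remaining three of the 64 triplets are the stop codons.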

3. Presence of intergenic punctuation marks.


A gene is a section of DNA that encodes one polypeptide chain or one molecule of tRNA, rRNA or sRNA.

The genes for tRNA, rRNA and sRNA do not encode proteins.

At the end of each gene encoding a polypeptide there is at least one of the 3 triplets that correspond to stop codons, or stop signals, in RNA. In mRNA they have the following form: UAA, UAG, UGA. They terminate (end) translation.

By convention, the codon AUG, the first one after the leader sequence, is also counted among the punctuation marks (see Lecture 8). It functions as a capital letter. In this position it encodes formylmethionine (in prokaryotes).

4. Unambiguity.

Each triplet encodes only one amino acid or is a translation terminator.

The exception is the codon AUG. In prokaryotes, in the first position (capital letter) it encodes formylmethionine, and in any other position it encodes methionine.

5. Compactness, or absence of intragenic punctuation marks.
Within a gene, each nucleotide is part of a significant codon.

In 1961, Francis Crick, Sydney Brenner and their colleagues experimentally proved the triplet nature of the code and its compactness.

The essence of the experiment: a “+” mutation is the insertion of one nucleotide; a “-” mutation is the loss of one nucleotide. A single “+” or “-” mutation at the beginning of a gene spoils the entire gene. A double “+” or “-” mutation also spoils the entire gene.

A triple “+” or “-” mutation at the beginning of a gene spoils only part of it. A quadruple “+” or “-” mutation again spoils the entire gene.

The experiment proves that the code is triplet and that there are no punctuation marks inside the gene. The experiment was carried out on two adjacent phage genes and showed, in addition, the presence of punctuation marks between genes.
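The logic of the “+” mutation experiment can be imitated on a toy sequence. The sketch below is not a model of the phage genes used in the experiment: the sequence `GENE`, the insertion positions and the helper names are all made up for illustration. It inserts one to four nucleotides near the start and counts how many codons at the end of the gene are still read as before.

```python
def read_triplets(seq: str) -> list[str]:
    """Split a sequence into consecutive, non-overlapping triplets from position 0."""
    return [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]


def insert_bases(seq: str, positions: tuple[int, ...], base: str = "G") -> str:
    """Apply one '+' mutation (single-base insertion) before each original position."""
    for pos in sorted(positions, reverse=True):  # right to left keeps positions valid
        seq = seq[:pos] + base + seq[pos:]
    return seq


def shared_tail(a: list[str], b: list[str]) -> int:
    """Count identical triplets at the end of two readings."""
    count = 0
    for x, y in zip(reversed(a), reversed(b)):
        if x != y:
            break
        count += 1
    return count


GENE = "AUGGCUCAUAAAGAUUGGCCUGUU"  # hypothetical gene of 8 codons
original = read_triplets(GENE)

for positions in [(1,), (1, 4), (1, 4, 7), (1, 4, 7, 10)]:
    mutant = read_triplets(insert_bases(GENE, positions))
    print(f"{len(positions)} insertion(s): {shared_tail(original, mutant)} codons restored")
```

With one, two or four insertions no codon at the end of the gene survives, while three insertions restore the reading frame and leave the last five codons intact. This is the pattern expected only if the code is read in threes without internal punctuation.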

6. Universality.

The genetic code is the same for all creatures living on Earth.

In 1979, Barrell discovered the “ideal” code of human mitochondria.

An “ideal” genetic code is one that satisfies the degeneracy rule of a quasi-doublet code: if in two triplets the first two nucleotides coincide and the third nucleotides belong to the same class (both are purines or both are pyrimidines), then these triplets code for the same amino acid.

There are two exceptions to this rule in the universal code. Both deviations of the universal code from the ideal one concern fundamental points, the beginning and the end of protein synthesis.

[Table: universal code compared with mitochondrial codes]

230 substitutions do not change the class of the encoded amino acid.
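The quasi-doublet rule stated above can be tested mechanically against the standard codon table. The sketch below is an illustration under stated assumptions: the table is the standard (universal) code in compact one-letter form, and the vertebrate mitochondrial variant is built from the commonly cited reassignments (UGA = Trp, AUA = Met, AGA and AGG = stop), which are not listed in the text itself. The function names are hypothetical.

```python
from itertools import combinations, product

BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
UNIVERSAL = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

PURINES = {"A", "G"}  # U and C are pyrimidines


def same_class(x: str, y: str) -> bool:
    """True if both bases are purines or both are pyrimidines."""
    return (x in PURINES) == (y in PURINES)


def rule_exceptions(table: dict[str, str]) -> list[tuple[str, str]]:
    """Codon pairs that share two first bases and a third-base class but differ in meaning."""
    exceptions = []
    for prefix in ("".join(p) for p in product(BASES, repeat=2)):
        for x, y in combinations(BASES, 2):
            if same_class(x, y) and table[prefix + x] != table[prefix + y]:
                exceptions.append((prefix + x, prefix + y))
    return exceptions


print(rule_exceptions(UNIVERSAL))  # [('UGA', 'UGG'), ('AUA', 'AUG')]

# Assumed vertebrate mitochondrial reassignments (not taken from the text).
MITOCHONDRIAL = dict(UNIVERSAL, UGA="W", AUA="M", AGA="*", AGG="*")
print(rule_exceptions(MITOCHONDRIAL))  # []
```

The two pairs found in the universal code involve the start codon AUG and the stop codon UGA, which agrees with the remark that both deviations concern the beginning and the end of protein synthesis.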

In 1956, George Gamow proposed a variant of an overlapping code. According to Gamow's code, each nucleotide, starting from the third in the gene, is part of 3 codons. When the genetic code was deciphered, it turned out to be non-overlapping, i.e. each nucleotide is part of only one codon.
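The difference between the two reading schemes can be made concrete by counting, for every nucleotide, how many codons it belongs to. In this sketch the function name `codon_membership` and the sample sequence are illustrative; a step of 3 models the non-overlapping code and a step of 1 models a fully overlapping code of the kind Gamow proposed.

```python
def codon_membership(seq: str, step: int) -> list[int]:
    """For each position, count the triplets that contain it.

    step=3 reads non-overlapping triplets; step=1 reads fully overlapping ones.
    """
    counts = [0] * len(seq)
    for start in range(0, len(seq) - 2, step):
        for k in range(start, start + 3):
            counts[k] += 1
    return counts


SEQ = "AUGGCUCAU"
print(codon_membership(SEQ, step=3))  # [1, 1, 1, 1, 1, 1, 1, 1, 1]
print(codon_membership(SEQ, step=1))  # [1, 2, 3, 3, 3, 3, 3, 2, 1]
```

In the overlapping scheme every nucleotide from the third onward (apart from the last two) belongs to three codons, so a single substitution would alter three neighboring amino acids at once.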

Advantages of an overlapping genetic code: compactness, less dependence of the protein structure on the insertion or deletion of a nucleotide.

Disadvantage: the protein structure depends strongly on nucleotide substitutions, and there are restrictions on which amino acids can be neighbors.

In 1976, the DNA of phage φX174 was sequenced. It has single-stranded circular DNA consisting of 5375 nucleotides. The phage was known to encode 9 proteins. For 6 of them, genes located one after another were identified.

It turned out that there is overlap. Gene E lies entirely within gene D. Its start codon arises from a frame shift of one nucleotide. Gene J starts where gene D ends. The start codon of gene J overlaps the stop codon of gene D as a result of a shift of two nucleotides. This arrangement is called a “reading frameshift” by a number of nucleotides that is not a multiple of three. To date, overlap has been shown only for a few phages.

Information capacity of DNA

There are 6 billion people living on Earth. The hereditary information about them is contained in 6×10^9 spermatozoa. According to various estimates, a person has from 30 to 50 thousand genes. All humans together have ~30×10^13 genes, or 30×10^16 base pairs, which make up 10^17 codons. The average book page contains 25×10^2 characters. The DNA of 6×10^9 sperm contains information equal in volume to approximately 4×10^13 book pages. These pages would fill 6 NSU buildings. 6×10^9 sperm take up half a thimble. Their DNA takes up less than a quarter of a thimble.
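The arithmetic behind these figures can be retraced step by step. The sketch below makes two assumptions that the text implies but does not state: an average gene length of 1,000 base pairs (the ratio of the base-pair and gene totals) and one codon counted as one printed character. The upper estimate of 50 thousand genes per person is used.

```python
PEOPLE = 6 * 10**9             # people on Earth, as quoted in the text
GENES_PER_PERSON = 50_000      # upper estimate quoted in the text
BASE_PAIRS_PER_GENE = 1_000    # assumption implied by the text's totals
BASES_PER_CODON = 3
CHARS_PER_PAGE = 25 * 10**2    # characters on an average book page

genes = PEOPLE * GENES_PER_PERSON          # 3 * 10**14, i.e. 30 x 10^13
base_pairs = genes * BASE_PAIRS_PER_GENE   # 3 * 10**17, i.e. 30 x 10^16
codons = base_pairs // BASES_PER_CODON     # 10**17
pages = codons // CHARS_PER_PAGE           # assumption: one codon = one character

print(f"genes:      {genes:.0e}")
print(f"base pairs: {base_pairs:.0e}")
print(f"codons:     {codons:.0e}")
print(f"pages:      {pages:.0e}")
```

Under these assumptions the result is 4×10^13 pages, the figure given in the text.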

Previously, we emphasized that nucleotides have a feature important for the formation of life on Earth: when one polynucleotide chain is present in a solution, a second (complementary) chain forms spontaneously through complementary pairing of matching nucleotides. An equal number of nucleotides in both chains and their chemical affinity are indispensable conditions for this type of reaction. However, during protein synthesis, when information from mRNA is realized in the structure of a protein, the principle of complementarity cannot apply. In mRNA and in the synthesized protein not only is the number of monomers different but, more importantly, there is no structural similarity between them (nucleotides on the one hand, amino acids on the other). Clearly, a new principle is needed here for accurately translating information from a polynucleotide into the structure of a polypeptide. Evolution produced such a principle, and its basis is the genetic code.

The genetic code is a system for recording hereditary information in nucleic acid molecules, based on a certain alternation of nucleotide sequences in DNA or RNA, forming codons corresponding to amino acids in a protein.

The genetic code has several properties.


    Degeneracy or redundancy.






It should be noted that some authors also propose other properties of the code related to the chemical characteristics of the nucleotides included in the code or the frequency of occurrence of individual amino acids in the body’s proteins, etc. However, these properties follow from those listed above, so we will consider them there.

A. Triplet nature. The genetic code, like many complexly organized systems, has a smallest structural unit and a smallest functional unit. A triplet is the smallest structural unit of the genetic code. It consists of three nucleotides. A codon is the smallest functional unit of the genetic code. Typically, the triplets of mRNA are called codons. In the genetic code, a codon performs several functions. Firstly, its main function is to encode a single amino acid. Secondly, a codon may not code for an amino acid, in which case it performs another function (see below). As the definition shows, a triplet is a concept that characterizes the elementary structural unit of the genetic code (three nucleotides). A codon characterizes the elementary semantic unit of the genetic code: three nucleotides determine the attachment of one amino acid to the polypeptide chain.

The elementary structural unit was first worked out theoretically, and its existence was then confirmed experimentally. Indeed, 20 amino acids cannot be encoded with one or two nucleotides because there are only 4 of the latter. Triplets of the four nucleotides give 4^3 = 64 variants, which more than covers the number of amino acids found in living organisms (see Table 1).

Table 1. Messenger RNA codons and corresponding amino acids

The 64 nucleotide combinations presented in the table have two features. Firstly, of the 64 triplet variants, only 61 are codons that encode an amino acid; they are called sense codons. Three triplets do not encode amino acids and are stop signals indicating the end of translation. There are three such triplets - UAA, UAG, UGA; they are also called “meaningless” (nonsense) codons. As a result of a mutation in which one nucleotide in a triplet is replaced by another, a nonsense codon can arise from a sense codon. This type of mutation is called a nonsense mutation. If such a stop signal is formed inside a gene (in its informational part), then protein synthesis will be interrupted at this place every time, and only the first part of the protein (before the stop signal) will be synthesized. A person with this pathology will lack the protein and will show symptoms associated with this deficiency. For example, a mutation of this kind has been identified in the gene encoding the hemoglobin beta chain. A shortened, inactive hemoglobin chain is synthesized and is quickly destroyed. As a result, a hemoglobin molecule devoid of a beta chain is formed. Clearly, such a molecule is unlikely to perform its functions fully. A serious disease arises that develops as hemolytic anemia (beta-zero thalassemia, from the Greek “thalassa”, the sea, referring to the Mediterranean, where this disease was first discovered).

The mechanism of action of stop codons differs from the mechanism of action of sense codons. This follows from the fact that for all codons encoding amino acids, corresponding tRNAs have been found. No tRNAs were found for nonsense codons. Consequently, tRNA does not take part in the process of stopping protein synthesis.

The codon AUG (sometimes GUG in bacteria) not only encodes an amino acid (methionine for AUG, valine for GUG) but is also a translation initiator.

B. Degeneracy or redundancy.

61 of the 64 triplets encode 20 amino acids. This three-fold excess of triplets over amino acids suggests that two coding options could be used in the transfer of information. Firstly, only 20 of the 64 codons could be involved in encoding the 20 amino acids; secondly, amino acids could be encoded by several codons each. Research has shown that nature used the latter option.

The advantage of this option is obvious. If only 20 of the 64 triplets were involved in encoding amino acids, then 44 triplets (out of 64) would remain non-coding, i.e. meaningless (nonsense codons). Previously, we pointed out how dangerous it is for the life of a cell when a coding triplet is transformed by mutation into a nonsense codon: this significantly disrupts normal protein synthesis, ultimately leading to the development of diseases. Currently, three codons in our genome are nonsense; now imagine what would happen if the number of nonsense codons increased about 15-fold. Clearly, in such a situation the frequency with which normal codons turn into nonsense codons would be immeasurably higher.

A code in which one amino acid is encoded by several triplets is called degenerate or redundant. Almost every amino acid has several codons. Thus, the amino acid leucine can be encoded by six triplets - UUA, UUG, CUU, CUC, CUA, CUG. Valine is encoded by four triplets, phenylalanine by two, and only tryptophan and methionine are encoded by one codon each. The property of recording the same information with different symbols is called degeneracy.
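The synonymous codons named in this paragraph can be looked up from the standard codon table. The sketch below is illustrative: the table is the standard code in compact one-letter form (L = leucine, V = valine, F = phenylalanine, W = tryptophan, M = methionine), and the helper `codons_for` is a hypothetical name.

```python
from itertools import product

BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codons_for(amino_acid: str) -> list[str]:
    """Return every codon that encodes the given one-letter amino acid."""
    return [codon for codon, aa in CODON_TABLE.items() if aa == amino_acid]


for name, symbol in [("leucine", "L"), ("valine", "V"), ("phenylalanine", "F"),
                     ("tryptophan", "W"), ("methionine", "M")]:
    print(f"{name}: {', '.join(codons_for(symbol))}")
```

The lookup returns six codons for leucine, four for valine, two for phenylalanine, and a single codon each for tryptophan (UGG) and methionine (AUG).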

The number of codons designated for one amino acid correlates well with the frequency of occurrence of the amino acid in proteins.

And this is most likely not accidental. The higher the frequency of occurrence of an amino acid in a protein, the more often the codon of this amino acid is represented in the genome, the higher the likelihood of its damage by mutagenic factors. Therefore, it is clear that a mutated codon has a greater chance of encoding the same amino acid if it is highly degenerate. From this perspective, the degeneracy of the genetic code is a mechanism that protects the human genome from damage.

It should be noted that the term degeneracy is also used in molecular genetics in another sense. The bulk of the information in a codon is contained in the first two nucleotides; the base in the third position of the codon turns out to be of little importance. This phenomenon is called “degeneracy of the third base,” and it minimizes the effect of mutations.

For example, the main function of red blood cells is to transport oxygen from the lungs to the tissues and carbon dioxide from the tissues to the lungs. This function is performed by the respiratory pigment hemoglobin, which fills the entire cytoplasm of the erythrocyte. It consists of a protein part, globin, which is encoded by the corresponding gene, and heme, which contains iron. Mutations in globin genes lead to the appearance of different hemoglobin variants. Most often, a mutation replaces one nucleotide with another and produces a new codon in the gene, which may encode a new amino acid in the hemoglobin polypeptide chain. Any nucleotide in a triplet can be replaced as a result of mutation: the first, the second or the third.

Several hundred mutations are known that affect the integrity of the globin genes. About 400 of them are associated with the replacement of single nucleotides in a gene and the corresponding amino acid replacement in a polypeptide. Of these, only 100 replacements lead to instability of hemoglobin and to diseases ranging from mild to very severe. The other 300 (approximately 75%) substitution mutations do not affect hemoglobin function and do not lead to pathology. One of the reasons for this is the above-mentioned “degeneracy of the third base”: a replacement of the third nucleotide in a triplet encoding serine, leucine, proline, arginine and some other amino acids produces a synonymous codon encoding the same amino acid. Such a mutation will not manifest itself phenotypically.

In contrast, a replacement of the first or second nucleotide in a triplet almost always leads to the appearance of a new hemoglobin variant. But even in this case, there may be no severe phenotypic disorders. The reason is that an amino acid in hemoglobin may be replaced by another one with similar physicochemical properties, for example, when a hydrophilic amino acid is replaced by another hydrophilic amino acid.
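The weight of each codon position can be checked with a simple tally over the standard codon table. The sketch below is only an illustration: the names `CODON_TABLE` and `synonymous_counts` are introduced here and do not come from the text, and amino acids are written in one-letter notation with `*` for a stop codon. For every sense codon it tries each possible single-nucleotide substitution and counts how many leave the encoded amino acid unchanged.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code; codons are ordered UUU, UUC, UUA, UUG, UCU, ...
# One-letter amino acid symbols, '*' marks a stop (nonsense) codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_counts():
    """Count, per codon position, the single-base substitutions in sense
    codons that keep the amino acid, and the total number of substitutions."""
    same = [0, 0, 0]
    total = [0, 0, 0]
    for codon, aa in CODON_TABLE.items():
        if aa == "*":
            continue  # only sense codons are mutated
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                total[pos] += 1
                if CODON_TABLE[mutant] == aa:
                    same[pos] += 1
    return same, total


same, total = synonymous_counts()
for pos in range(3):
    print(f"position {pos + 1}: {same[pos]} of {total[pos]} substitutions are synonymous")
```

In this tally nearly all synonymous substitutions fall on the third position; a few occur at the first position (in leucine and arginine codons) and none at the second.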

Hemoglobin consists of the iron porphyrin group, heme (oxygen molecules attach to it), and the protein globin. Adult hemoglobin (HbA) contains two identical α-chains and two β-chains. The α-chain contains 141 amino acid residues and the β-chain 146; the α- and β-chains differ in many amino acid residues. The amino acid sequence of each globin chain is encoded by its own gene. The gene encoding the α-chain is located in the short arm of chromosome 16, and the β-gene in the short arm of chromosome 11.

A substitution of the first or second nucleotide in the gene encoding the β-chain of hemoglobin almost always leads to the appearance of a new amino acid in the protein, disruption of hemoglobin function and serious consequences for the patient. For example, replacing “C” in one of the CAU triplets (histidine) with “U” produces a new triplet, UAU, which encodes another amino acid, tyrosine. Phenotypically this manifests itself in a severe disease. Such a substitution of histidine by tyrosine at position 63 of the β-chain destabilizes hemoglobin, and the disease methemoglobinemia develops. Replacement, as a result of mutation, of glutamic acid with valine at position 6 of the β-chain is the cause of the most severe disease, sickle cell anemia. Let's not continue the sad list.

Let us only note that a replacement of the first or second nucleotide may also produce an amino acid with physicochemical properties similar to those of the previous one. Thus, replacement of the second nucleotide in one of the triplets encoding glutamic acid (GAA) in the β-chain with “U” produces a new triplet (GUA) encoding valine, while replacement of the first nucleotide with “A” forms the triplet AAA, encoding the amino acid lysine. Glutamic acid and lysine are similar in physicochemical properties: both are hydrophilic. Valine is a hydrophobic amino acid. Therefore, replacing hydrophilic glutamic acid with hydrophobic valine significantly changes the properties of hemoglobin, which ultimately leads to the development of sickle cell anemia, while replacing hydrophilic glutamic acid with hydrophilic lysine changes the function of hemoglobin to a lesser extent: patients develop a mild form of anemia.

When the third base is replaced, the new triplet can encode the same amino acid as the previous one. For example, if uracil in the CAU triplet is replaced by cytosine and a CAC triplet appears, practically no phenotypic changes will be detected in humans. This is understandable, because both triplets code for the same amino acid, histidine.
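The substitutions discussed above can be replayed against the standard codon table. The following is a minimal sketch, not a clinical tool: the helper `point_mutation` and the label strings are hypothetical names chosen for illustration, and amino acids are written in one-letter notation (H histidine, Y tyrosine, E glutamic acid, V valine, K lysine, `*` stop).

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def point_mutation(codon, position, new_base):
    """Replace the base at a 1-based codon position and classify the result."""
    mutant = codon[:position - 1] + new_base + codon[position:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutant]
    if after == before:
        kind = "synonymous"   # same amino acid, no change in the protein
    elif after == "*":
        kind = "nonsense"     # premature stop codon
    else:
        kind = "missense"     # a different amino acid is inserted
    return mutant, before, after, kind


# The four substitutions mentioned in the text.
examples = [("CAU", 1, "U"), ("GAA", 2, "U"), ("GAA", 1, "A"), ("CAU", 3, "C")]
for codon, position, base in examples:
    mutant, before, after, kind = point_mutation(codon, position, base)
    print(f"{codon} -> {mutant}: {before} -> {after} ({kind})")
```

The first three substitutions are reported as missense (H to Y, E to V, E to K) and the last one as synonymous (H to H). Whether a missense change is mild or severe depends on the properties of the amino acids involved, which this sketch does not model.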

In conclusion, it is appropriate to emphasize that, from a general biological point of view, the degeneracy of the genetic code and the degeneracy of the third base are protective mechanisms that evolution has built into the unique structure of DNA and RNA.

v. Unambiguity.

Each triplet (except the nonsense triplets) encodes only one amino acid. Thus, in the direction from codon to amino acid the genetic code is unambiguous, while in the direction from amino acid to codon it is ambiguous (degenerate).
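The two directions can be illustrated with a lookup table and its inverse. This is a minimal sketch under the assumption of the standard code; `CODON_TABLE` and `codons_for` are names introduced here for illustration. A dictionary keyed by codon can hold only one value per key, which mirrors unambiguity, while the inverse mapping collects several codons per amino acid, which mirrors degeneracy.

```python
from collections import defaultdict
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order; '*' marks a stop (nonsense) codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Inverse direction: amino acid -> list of all codons that encode it.
codons_for = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    codons_for[aa].append(codon)

print(CODON_TABLE["CAU"])      # one codon, exactly one amino acid
print(codons_for["L"])         # leucine: six codons
print(codons_for["M"])         # methionine: a single codon
```

Leucine (L), serine (S) and arginine (R) each have six codons, whereas methionine (M) and tryptophan (W) have one each.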




The need for unambiguity in the genetic code is also obvious. Otherwise, the same codon could be translated into different amino acids, and proteins with different primary structures and different functions would be formed. Cell metabolism would switch to the “one gene – several polypeptides” mode of operation. It is clear that in such a situation the regulatory function of genes would be completely lost.

g. Polarity

Information is read from DNA and mRNA in one direction only. Polarity is important for defining higher-order structures (secondary, tertiary, etc.). Earlier we discussed how lower-order structures determine higher-order structures. Tertiary and higher-order structures begin to form as soon as the synthesized RNA chain leaves the DNA molecule or the polypeptide chain leaves the ribosome. While the free end of an RNA or polypeptide acquires its tertiary structure, the other end of the chain continues to be synthesized on DNA (if RNA is being transcribed) or on a ribosome (if a polypeptide is being translated).

Therefore, the unidirectional reading of information (during the synthesis of RNA and protein) is essential not only for determining the sequence of nucleotides or amino acids in the synthesized substance, but also for the strict determination of its secondary, tertiary and higher structures.

d. Non-overlapping.

The code may be overlapping or non-overlapping. Most organisms have a non-overlapping code. Overlapping code is found in some phages.

The essence of a non-overlapping code is that a nucleotide of one codon cannot simultaneously be a nucleotide of another codon. If the code were overlapping, a sequence of seven nucleotides (GCUGCUG) could encode not two amino acids (alanine-alanine), as in the case of a non-overlapping code (Fig. 33, A), but three (if neighboring codons share one nucleotide) (Fig. 33, B) or five (if they share two nucleotides) (Fig. 33, C). In the last two cases, a mutation of a single nucleotide could disrupt two or three amino acids in the sequence at once.
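The three ways of reading the seven-nucleotide sequence can be reproduced with a short sketch. The function `read_codons` and its parameter `shared` (the number of nucleotides that neighboring triplets have in common) are hypothetical names introduced only for this illustration.

```python
def read_codons(sequence, shared=0):
    """Split a sequence into triplets.

    shared = 0: non-overlapping code (each step moves 3 nucleotides)
    shared = 1: neighboring triplets have one nucleotide in common
    shared = 2: neighboring triplets have two nucleotides in common
    """
    if shared not in (0, 1, 2):
        raise ValueError("shared must be 0, 1 or 2")
    step = 3 - shared
    # Incomplete triplets at the end of the sequence are ignored.
    return [sequence[i:i + 3] for i in range(0, len(sequence) - 2, step)]


sequence = "GCUGCUG"
for shared in (0, 1, 2):
    print(shared, read_codons(sequence, shared))
```

For GCUGCUG this gives two triplets without overlap (GCU, GCU), three with one shared nucleotide (GCU, UGC, CUG) and five with two shared nucleotides (GCU, CUG, UGC, GCU, CUG), which matches the counts given above.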

However, it has been established that a mutation of one nucleotide always affects the inclusion of only one amino acid in a polypeptide. This is a significant argument that the code is non-overlapping.

Let us explain this using Figure 34. Bold lines show the triplets encoding amino acids in the case of a non-overlapping and an overlapping code. Experiments have clearly shown that the genetic code is non-overlapping. Without going into the details of the experiments, we note what would happen if the third nucleotide in the sequence (see Fig. 34), U (marked with an asterisk), were replaced with some other nucleotide:

1. With a non-overlapping code, the protein controlled by this sequence would have a substitution of one (first) amino acid (marked with asterisks).

2. With an overlapping code in option A, a substitution would occur in two (first and second) amino acids (marked with asterisks). Under option B, the replacement would affect three amino acids (marked with asterisks).

However, numerous experiments have shown that when one nucleotide in DNA is disrupted, the disruption in the protein always affects only one amino acid, which is typical for a non-overlapping code.
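The reasoning in the two cases above amounts to counting how many triplets contain the mutated nucleotide. A minimal sketch follows, assuming a sequence of seven nucleotides as in the earlier example; `triplets_containing` is a hypothetical helper, and `shared` is the number of nucleotides common to neighboring triplets.

```python
def triplets_containing(length, position, shared=0):
    """Return the 0-based start indices of all triplets that include the
    nucleotide at the given 1-based position."""
    step = 3 - shared            # 3 = non-overlapping, 2 or 1 = overlapping
    starts = range(0, length - 2, step)
    index = position - 1
    return [start for start in starts if start <= index < start + 3]


# Mutation of the third nucleotide in a sequence of seven nucleotides.
for shared in (0, 1, 2):
    hit = triplets_containing(7, 3, shared)
    print(f"shared nucleotides: {shared}, triplets affected: {len(hit)}")
```

The count is one for a non-overlapping code, two when neighboring triplets share one nucleotide and three when they share two, so only the non-overlapping reading agrees with the observation that a single-nucleotide change alters a single amino acid.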



Fig. 34. A diagram explaining that the genetic code is non-overlapping (explanation in the text). Panel labels: non-overlapping code (Ala - Ala); overlapping code (Ala - Cys - Leu and Ala - Leu - Leu - Ala - Leu).

The non-overlapping nature of the genetic code is associated with another property: the reading of information begins from a definite point, the initiation signal. In mRNA this initiation signal is the codon AUG, which encodes methionine.
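Reading from a fixed starting point can be sketched as follows. This is a deliberate simplification: the function `translate_from_start` is a hypothetical name, it simply takes the first AUG in the string as the initiation signal, and the example mRNA is made up for illustration. Amino acids are written in one-letter notation (M methionine, A alanine).

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_from_start(mrna):
    """Find the first AUG and read non-overlapping triplets until a stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""                # no initiation signal, nothing is read
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break                # terminator codon ends the reading
        protein.append(amino_acid)
    return "".join(protein)


# Made-up example: two leading nucleotides, then AUG GCU GCU UAA.
print(translate_from_start("GGAUGGCUGCUUAAGG"))
```

Because the starting point fixes the grouping into triplets, the same nucleotides before and after the AUG are never regrouped into a different set of codons.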

It should be noted that humans nevertheless have a small number of genes that deviate from the general rule and overlap.

e. Compactness.

There is no punctuation between codons. In other words, triplets are not separated from each other, for example, by one meaningless nucleotide. The absence of “punctuation marks” in the genetic code has been proven in experiments.

f. Universality.

The code is the same for all organisms living on Earth. Direct evidence of the universality of the genetic code was obtained by comparing DNA sequences with the corresponding protein sequences. It turned out that all bacterial and eukaryotic genomes use the same set of codon assignments. There are exceptions, but not many.

The first exceptions to the universality of the genetic code were found in the mitochondria of some animal species. They concerned the terminator codon UGA, which in these mitochondria is read in the same way as the codon UGG, that is, as the amino acid tryptophan. Other, rarer deviations from universality were also found.
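The UGA exception can be expressed as a one-entry change to the standard table. In the sketch below, `STANDARD`, `MITO_VARIANT` and `translate` are hypothetical names; `MITO_VARIANT` contains only the single reassignment mentioned above and is not a complete mitochondrial code. The example sequence is made up.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Illustrative variant: only the UGA -> tryptophan (W) reassignment from the text.
MITO_VARIANT = {**STANDARD, "UGA": "W"}


def translate(mrna, table):
    """Read non-overlapping triplets from the first nucleotide until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = table[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


message = "AUGUGAUGGUAA"   # AUG UGA UGG UAA
print(translate(message, STANDARD))       # reading stops at UGA
print(translate(message, MITO_VARIANT))   # UGA is read as tryptophan
```

With the standard table the reading ends after methionine, while with the variant table the same message yields methionine followed by two tryptophans.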

