Although the RNA viruses present a modification of the central dogma, the fact remains that in normal prokaryotic and eukaryotic cells, RNA synthesis is directed by DNA. Transcription, the formation of a specific RNA from a specific DNA, requires several components:
- A DNA template for complementary base pairing
- The appropriate ribonucleoside triphosphates (ATP, GTP, CTP, and UTP) to act as substrates
- An enzyme, RNA polymerase
Within each gene, only one of the two strands of DNA, the template strand, is transcribed. The other, complementary DNA strand, referred to as the non-template strand, remains untranscribed. For different genes in the same DNA molecule, different strands may be transcribed. That is, the strand that is the non-template strand in one gene may be the template strand in another.
Transcription produces more than mRNA. The same process is responsible for the synthesis of tRNA and ribosomal RNA (rRNA), whose important roles in protein synthesis will be described below. Like polypeptides, these RNAs are encoded by specific genes.
In DNA replication, as we know, the two strands of the parent molecule unwind, and each strand serves as the template for a new strand. In transcription, DNA partly unwinds so that it can serve as a template for RNA synthesis. As the RNA transcript is formed, it peels away, allowing the DNA to be rewound into the double helix. Transcription can be divided into three distinct processes: initiation, elongation, and termination. Let’s consider each of these in turn.
Initiation of transcription requires a promoter and RNA polymerase
Initiation begins transcription, and requires a promoter, a special sequence of DNA to which RNA polymerase binds very tightly. There is at least one promoter for each gene (or, in prokaryotes, each set of genes). Promoters are important control sequences that “tell” the RNA polymerase three things:
- where to start transcription
- which strand of DNA to read
- the direction to take from the start
A promoter, which is a specific sequence in the DNA that reads in a particular direction, orients the RNA polymerase and thus “aims” it at the correct strand to use as a template. Promoters function somewhat like the punctuation marks that determine how a sequence of words is to be read as a sentence. Part of each promoter is the initiation site, where transcription begins. Farther toward the 3′ end of the promoter lie groups of nucleotides that help the RNA polymerase bind. RNA polymerase moves in a 3′-to-5′ direction along the template strand. Although every gene has a promoter, not all promoters are identical. Some promoters are more effective at transcription initiation than others. Furthermore, there are differences between transcription initiation in prokaryotes and in eukaryotes.
RNA polymerase elongates the transcript
Once RNA polymerase has bound to the promoter, it begins the process of elongation. It unwinds the DNA about 20 base pairs at a time and reads the template strand in the 3′-to-5′ direction. Like DNA polymerase, RNA polymerase adds new nucleotides to the 3′ end of the growing strand, but does not require a primer to get started. The new RNA elongates from the first base that forms its 5′ end to its 3′ end. The RNA transcript is thus antiparallel to the DNA template strand. Unlike DNA polymerases, RNA polymerases do not inspect and correct their work. Transcription errors occur at a rate of one mistake for every 10⁴ to 10⁵ bases. Because many copies of RNA are made, and because they often have only a relatively short existence, these errors are not as potentially harmful as mutations in DNA.
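The base-pairing rule behind elongation can be sketched in a few lines of Python. This is only an illustration of complementarity, not a model of the enzyme: the function name `transcribe` and the nine-base template are assumptions made for this example, and the template is written 3′-to-5′ so that the transcript reads 5′-to-3′.

```python
# Complementary base pairing during transcription: each DNA base on the
# template strand specifies one RNA base (uracil replaces thymine in RNA).
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Return the RNA transcript (5'->3') for a template strand given 3'->5'.

    RNA polymerase reads the template 3'->5' and adds each new nucleotide
    to the 3' end of the transcript, so reading the template left to right
    here yields the transcript from its 5' end to its 3' end.
    """
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5.upper())


# Hypothetical nine-base template, written 3'->5'.
print(transcribe("TACAAAACC"))  # AUGUUUUGG
```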
Transcription terminates at particular base sequences
What tells RNA polymerase to stop adding nucleotides to a growing RNA transcript? Just as initiation sites specify the start of transcription, particular base sequences in the DNA specify its termination. The mechanisms of termination are complex and of more than one kind. For some genes, the newly formed transcript simply falls away from the DNA template and the RNA polymerase. For others, a helper protein pulls the transcript away.
In prokaryotes, in which there is no nuclear envelope and ribosomes can be near the chromosome, the translation of mRNA often begins near the 5′ end of the mRNA before transcription of the mRNA molecule is complete. In eukaryotes, the situation is more complicated. First, there is a spatial separation of transcription (in the nucleus) and translation (in the cytoplasm). Second, the first product of transcription is a pre-mRNA that is longer than the final mRNA and must undergo considerable processing before it can be translated.
The Genetic Code
How do transcription and translation produce specific and functional protein products? These processes require a genetic code that relates genes (DNA) to mRNA and mRNA to the amino acids of proteins. The genetic code specifies which amino acids will be used to build a protein. You can think of the genetic information in an mRNA molecule as a series of sequential, nonoverlapping three-letter “words.” Each sequence of three nucleotide bases (the three “letters”) along the chain specifies a particular amino acid. Each three-letter “word” is called a codon. Each codon is complementary to the corresponding triplet in the DNA molecule from which it was transcribed. Thus, the genetic code is the means of relating codons to their specific amino acids. The complete genetic code is shown in Figure . Notice that there are many more codons than there are different amino acids in proteins. Combinations of the four available “letters” (the bases) give 64 (4³) different three-letter codons, yet these codons determine only 20 amino acids. AUG, which codes for methionine, is also the start codon, the initiation signal for translation. Three of the codons (UAA, UAG, UGA) are stop codons, or termination signals for translation; when the translation machinery reaches one of these codons, translation stops, and the polypeptide is released from the translation complex. After describing the properties of the genetic code, we will examine some of the scientific thinking and experimentation that went into deciphering it.
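The codon count above is easy to check by enumeration. The following sketch simply lists every ordered triplet of the four bases and sets aside the three stop codons named in the text; the variable names are chosen for this example.

```python
from itertools import product

BASES = "UCAG"                       # the four RNA "letters"
STOP_CODONS = {"UAA", "UAG", "UGA"}  # termination signals named in the text

# Every ordered triplet of bases is one codon: 4 * 4 * 4 = 64.
codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]
sense_codons = [c for c in codons if c not in STOP_CODONS]

print(len(codons))            # 64
print(len(sense_codons))      # 61
print("AUG" in sense_codons)  # True
```

The 61 codons that remain, including the start codon AUG, are the ones that specify amino acids.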
The genetic code is redundant but not ambiguous
After the start and stop codons, the remaining 60 codons are far more than enough to code for the other 19 amino acids, and indeed there are repeats. Thus we say that the genetic code is redundant; that is, an amino acid may be represented by more than one codon. The redundancy is not evenly divided among the amino acids. For example, methionine and tryptophan are represented by only one codon each, whereas leucine is represented by six different codons. The term “redundancy” should not be confused with “ambiguity.” To say that the code was ambiguous would mean that a single codon could specify either of two (or more) different amino acids; there would then be doubt whether to put in, say, leucine or something else. The genetic code is not ambiguous. Redundancy in the code means that there is more than one clear way to say, “Put leucine here.” In other words, a given amino acid may be encoded by more than one codon, but a codon can code for only one amino acid. But just as people in different places prefer different ways of saying the same thing (“Good-bye!” “See you!” “Ciao!” and “So long!” have the same meaning), different organisms prefer one or another of the redundant codons.
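Redundancy without ambiguity maps naturally onto a lookup table: many keys may share a value, but each key has exactly one value. The sketch below builds the standard codon table from a compact string of one-letter amino acid abbreviations (a common convention that is assumed here, not taken from the text) and counts how many codons specify each amino acid.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order
# (first base changes slowest, third base fastest); "*" marks a stop codon.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
# A dict gives each codon exactly one meaning, so the code cannot be ambiguous.
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

codons_per_amino_acid = Counter(CODON_TABLE.values())

print(codons_per_amino_acid["L"])  # 6  (leucine)
print(codons_per_amino_acid["M"])  # 1  (methionine)
print(codons_per_amino_acid["W"])  # 1  (tryptophan)
```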
The genetic code is (nearly) universal
Over 40 years of experiments on thousands of organisms from all the living domains and kingdoms reveal that the genetic code appears to be nearly universal, applying to all the species on our planet. Thus the code must be an ancient one that has been maintained intact throughout the evolution of living organisms. Exceptions are known: within mitochondria and chloroplasts, the code differs slightly from that in prokaryotes and in the nuclei of eukaryotic cells; in one group of protists, UAA and UAG code for glutamine rather than functioning as stop codons. The significance of these differences is not yet clear. What is clear is that the exceptions are few and slight. The common genetic code means that there is also a common language for evolution. As natural selection resulted in one species replacing another, the raw material of genetic variation has remained the same. The common code also has profound implications for genetic engineering since it means that a human gene is in the same language as a bacterial gene. The differences are more like dialects of a single language than entirely different languages. So the transcription and translation machinery of a bacterium could theoretically utilize genes from a human as well as its own genes.
The codons in Figure are mRNA codons. The base sequence on the DNA strand that was transcribed to produce the mRNA is complementary and antiparallel to these codons. Thus, for example, 3′-AAA-5′ in the template DNA strand corresponds to phenylalanine (which is encoded by the mRNA codon 5′-UUU-3′), and 3′-ACC-5′ in the template DNA corresponds to tryptophan (which is encoded by the mRNA codon 5′-UGG-3′). How were these codons assigned to specific amino acids?
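The relationship among the two DNA strands and the mRNA codon in the examples above can be sketched as follows. The helper names are hypothetical, and the small dictionary holds only the two codons mentioned in the text. Either DNA strand leads to the same codon: the template strand by complementary pairing, the non-template strand by swapping T for U.

```python
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}
# Only the two codons used as examples in the text.
CODON_TO_AMINO_ACID = {"UUU": "phenylalanine", "UGG": "tryptophan"}


def codon_from_template(triplet_3_to_5: str) -> str:
    """mRNA codon (5'->3') from a template-strand triplet written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in triplet_3_to_5)


def codon_from_nontemplate(triplet_5_to_3: str) -> str:
    """mRNA codon (5'->3') from a non-template triplet written 5'->3'."""
    return triplet_5_to_3.replace("T", "U")


print(CODON_TO_AMINO_ACID[codon_from_template("AAA")])     # phenylalanine
print(CODON_TO_AMINO_ACID[codon_from_template("ACC")])     # tryptophan
print(codon_from_nontemplate("TGG") == codon_from_template("ACC"))  # True
```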
Biologists deciphered the genetic code by using artificial messengers
Molecular biologists broke the genetic code in the early 1960s. The problem was perplexing: How could more than 20 “code words” be written with an “alphabet” consisting of only four “letters”? How, in other words, could four bases (A, U, G, and C) code for 20 different amino acids? That the code was a triplet code, based on three-letter codons, was considered likely. Since there are only four letters (A, G, C, U), a one-letter code clearly could not unambiguously encode 20 amino acids; it could encode only four of them. A two-letter code could contain only 4 × 4 = 16 codons, still not enough. But a triplet code could contain up to 4 × 4 × 4 = 64 codons. This was more than enough to encode the 20 amino acids.
Marshall W. Nirenberg and J. H. Matthaei, at the National Institutes of Health, made the first decoding breakthrough in 1961 when they realized that they could use a simple artificial polynucleotide instead of a complex natural mRNA as a messenger. They could then identify the polypeptide that the artificial messenger encoded. Scientists prepared an artificial mRNA in which all the bases were uracil (poly U). When poly U was added to a test tube containing all the ingredients necessary for protein synthesis (ribosomes, all the amino acids, activating enzymes, tRNAs, and other factors), a polypeptide formed. This polypeptide contained only one kind of amino acid: phenylalanine (Phe). Poly U coded for poly Phe. Accordingly, UUU appeared to be the mRNA code word (the codon) for phenylalanine. Following up on this success, Nirenberg and Matthaei soon showed that CCC codes for proline and AAA for lysine. (Poly G presented some chemical problems and was not tested initially.) UUU, CCC, and AAA were three of the easiest codons; different approaches were required to work out the rest.
Other scientists later found that simple artificial mRNAs only three nucleotides long — each amounting to a codon — could bind to a ribosome, and that the resulting complex could then cause the binding of the corresponding tRNA with its specific amino acid. Thus, for example, simple UUU caused the tRNA carrying phenylalanine to bind to the ribosome. After this discovery, complete deciphering of the genetic code was relatively simple. To find the “translation” of a codon, Nirenberg could use a sample of that codon as an artificial mRNA and see which amino acid became bound to it.
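The logic of the homopolymer experiments can be mimicked in a short sketch. It assumes only the three assignments reported above (UUU, CCC, AAA); the function name and the message length are hypothetical. Because every triplet in a one-base messenger is identical, the reading frame does not matter and the product contains a single kind of amino acid.

```python
# Codon assignments established by the homopolymer experiments in the text.
HOMOPOLYMER_CODONS = {"UUU": "Phe", "CCC": "Pro", "AAA": "Lys"}


def translate_homopolymer(base: str, length: int) -> list[str]:
    """Translate an artificial mRNA made of one repeated base."""
    mrna = base * length
    # Read consecutive, nonoverlapping triplets; drop a trailing partial codon.
    codons = [mrna[i:i + 3] for i in range(0, len(mrna) - 2, 3)]
    return [HOMOPOLYMER_CODONS[codon] for codon in codons]


print(translate_homopolymer("U", 12))  # ['Phe', 'Phe', 'Phe', 'Phe']
```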
Preparation for Translation: Linking RNAs, Amino Acids, and Ribosomes
As Crick’s adapter hypothesis proposed, the translation of mRNA into proteins requires a molecule that links the information contained in mRNA codons with specific amino acids in proteins. That function is performed by tRNA. Two key events must take place to ensure that the protein made is the one specified by mRNA:
- tRNA must read mRNA correctly.
- tRNA must carry the amino acid that is correct for its reading of the mRNA.
Transfer RNAs carry specific amino acids and bind to specific codons
The codon in mRNA and the amino acid in a protein are related by way of an adapter — a specific tRNA with attached amino acid. For each of the 20 amino acids, there is at least one specific type (“species”) of tRNA molecule.
The tRNA molecule has three functions: It carries (is “charged with”) an amino acid, it associates with mRNA molecules, and it interacts with ribosomes. Its molecular structure relates clearly to all of these functions. A tRNA molecule has about 75 to 80 nucleotides. It has a conformation (a three-dimensional shape) that is maintained by complementary base pairing (hydrogen bonding) within its own sequence. The conformation of a tRNA molecule allows it to combine specifically with binding sites on ribosomes. At the 3′ end of every tRNA molecule is a site to which its specific amino acid binds covalently. At about the midpoint of tRNA is a group of three bases, called the anticodon, that constitutes the site of complementary base pairing (hydrogen bonding) with mRNA. Each tRNA species has a unique anticodon, which is complementary to the mRNA codon for that tRNA’s amino acid. At contact, the codon and the anticodon are antiparallel to each other. As an example of this process, consider the amino acid arginine:
- The template DNA triplet for arginine is 3′-GCC-5′, which is transcribed, by complementary base pairing, to the mRNA codon 5′-CGG-3′.
- That mRNA codon binds by complementary base pairing to a tRNA with the anticodon 3′-GCC-5′, which is charged with arginine.
Recall that 61 different codons encode the 20 amino acids in proteins. Does this mean that the cell must produce 61 different tRNA species, each with a different anticodon? No. The cell gets by with about two-thirds that number of tRNA species, because the specificity for the base at the 3′ end of the codon (and the 5′ end of the anticodon) is not always strictly observed. This phenomenon, called wobble, allows the alanine codons GCA, GCC, and GCU, for example, all to be recognized by the same tRNA. Wobble is allowed in some matches but not in others; of most importance, it does not allow the genetic code to be ambiguous!
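Wobble can be made concrete with a small sketch. The pairing table below follows Crick's wobble rules, and the modified base inosine ("I") appears in it; neither is spelled out in the text, so both should be read as assumptions of this example, as should the function name. The first two codon positions pair strictly, while the anticodon's 5′ base may accept more than one partner.

```python
# Strict Watson-Crick pairs, used for the first two codon positions.
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}
# Wobble pairing (Crick's wobble rules, an assumption not spelled out in the
# text): base at the 5' end of the anticodon -> bases it can pair with at the
# 3' end of the codon. "I" is inosine, a modified base found in some tRNAs.
WOBBLE = {"C": "G", "A": "U", "U": "AG", "G": "CU", "I": "UCA"}


def codons_read_by(anticodon_5_to_3: str) -> list[str]:
    """List the codons (5'->3') that one anticodon (5'->3') can recognize."""
    wobble_base, middle, last = anticodon_5_to_3
    # Codon and anticodon are antiparallel: the anticodon's 3' base pairs
    # with the codon's 5' base.
    prefix = PAIR[last] + PAIR[middle]
    return sorted(prefix + third for third in WOBBLE[wobble_base])


print(codons_read_by("IGC"))  # ['GCA', 'GCC', 'GCU']
print(codons_read_by("CCG"))  # ['CGG']
```

Under these rules a single anticodon written 5′-IGC-3′ covers the three alanine codons named above, whereas the arginine anticodon 3′-GCC-5′ (5′-CCG-3′) reads only CGG. In both cases every codon read by one tRNA specifies the same amino acid, so the code stays unambiguous.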
Activating enzymes link the right tRNAs and amino acids
The charging of each tRNA with its correct amino acid is achieved by a family of activating enzymes, known more formally as aminoacyl-tRNA synthetases (Figure ). Each activating enzyme is specific for one amino acid and for its corresponding tRNA. The enzyme has a three-part active site that recognizes three smaller molecules: a specific amino acid, ATP, and a specific tRNA. The activating enzyme reacts with tRNA and an amino acid (AA) in two steps:
enzyme + ATP + AA → enzyme-AMP-AA + PPi
enzyme-AMP-AA + tRNA → enzyme + AMP + tRNA-AA
The amino acid is attached to the 3′ end of the tRNA (to a free OH group on the ribose) with an energy-rich bond, forming charged tRNA. This bond will provide the energy for the synthesis of the peptide bond that will join adjacent amino acids.
A clever experiment by Seymour Benzer and his colleagues at the California Institute of Technology demonstrated the importance of the specificity of the attachment of tRNA to its amino acid. In their laboratory, the amino acid cysteine, already properly attached to its tRNA, was chemically modified to become a different amino acid, alanine. Which component, the amino acid or the tRNA, would be recognized when this hybrid charged tRNA was put into a protein-synthesizing system? The answer was: the tRNA. Everywhere in the synthesized protein where cysteine was supposed to be, alanine appeared instead. The cysteine-specific tRNA had delivered its cargo (alanine) to every mRNA “address” where cysteine was called for. This experiment showed that the protein synthesis machinery recognizes the anticodon of the charged tRNA, not the amino acid attached to it. If activating enzymes in nature did what Benzer did in the laboratory and charged tRNAs with the wrong amino acids, those amino acids would be inserted into proteins at inappropriate places, leading to alterations in protein shape and function. The fact that the activating enzymes are highly specific has led to the process of tRNA charging being called the “second genetic code.”
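The outcome of this experiment can be reproduced in a toy model in which the ribosome looks up each tRNA by codon match alone and never inspects its cargo. The message, the function name, and the dictionary representation are hypothetical; the three codon assignments used (AUG, UGU, GCU) are from the standard genetic code.

```python
# Each charged tRNA is modeled as: codon it recognizes -> amino acid it carries.
# Codons here (AUG = Met, UGU = Cys, GCU = Ala) come from the standard code.
normal_trnas = {"AUG": "Met", "UGU": "Cys", "GCU": "Ala"}


def synthesize(mrna: str, charged_trnas: dict[str, str]) -> list[str]:
    """Build a polypeptide by matching codons to tRNAs, ignoring the cargo."""
    codons = [mrna[i:i + 3] for i in range(0, len(mrna) - 2, 3)]
    # Only the codon-anticodon match is checked; whatever amino acid the
    # matching tRNA carries is added to the chain.
    return [charged_trnas[codon] for codon in codons]


# The experiment: cysteine already attached to its tRNA is converted to alanine.
hybrid_trnas = dict(normal_trnas, UGU="Ala")

mrna = "AUGUGUGCUUGU"  # hypothetical message: Met-Cys-Ala-Cys
print(synthesize(mrna, normal_trnas))  # ['Met', 'Cys', 'Ala', 'Cys']
print(synthesize(mrna, hybrid_trnas))  # ['Met', 'Ala', 'Ala', 'Ala']
```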
The ribosome is the workbench for translation
Ribosomes are required for the translation of the genetic information in mRNA into a polypeptide chain. Although ribosomes are small in contrast to other cellular organelles, their mass of several million daltons makes them large in comparison with charged tRNAs. Each ribosome consists of two subunits, a large one and a small one. In eukaryotes, the large subunit consists of three different molecules of rRNA and about 45 different protein molecules arranged in a precise pattern. The small subunit consists of one rRNA molecule and 33 different protein molecules. When not active in the translation of mRNA, the ribosomes exist as separated subunits. The ribosomes of prokaryotes are somewhat smaller than those of eukaryotes, and their ribosomal proteins and RNAs are different. Mitochondria and chloroplasts also contain ribosomes, some of which are similar to those of prokaryotes. The different proteins and rRNAs in a ribosomal subunit are held together by ionic and hydrophobic forces, not covalent bonds. If these forces are disrupted by detergents, for example, the proteins and rRNAs separate from one another. When the detergent is removed, the entire complex structure self-assembles. This is like separating the pieces of a jigsaw puzzle and having them fit together again without human hands to guide them! A given ribosome does not specifically produce just one kind of protein. A ribosome can use any mRNA and all species of charged tRNAs, and thus can be used to make many different polypeptide products. The mRNA, as a linear sequence of codons, specifies the polypeptide sequence to be made; the ribosome is simply the molecular workbench where the task is accomplished. Its structure enables it to hold the mRNA and charged tRNAs in the right positions, thus allowing the growing polypeptide to be assembled efficiently.
On the large subunit of the ribosome are four sites to which tRNA binds (see Figure ). A charged tRNA traverses these four sites in order:
- The T (transfer) site is where a charged tRNA first lands on the ribosome, accompanied by a special protein “escort” called the T or transfer factor.
- The A (amino acid) site is where the tRNA anticodon binds to the mRNA codon, thus lining up the correct amino acid to be added to the growing polypeptide chain.
- The P (polypeptide) site is where the tRNA adds its amino acid to the growing polypeptide chain.
- The E (exit) site is where the tRNA, having given up its amino acid, resides before leaving the ribosome and going back to the cytosol to pick up another amino acid and begin the process again.
Because codon–anticodon interactions and peptide bond formation occur at the A and P sites, we will describe their function in detail in the next section.
An important role of the ribosome is to make sure that the mRNA–tRNA interactions are precise: that is, that a charged tRNA with the correct anticodon (e.g., 3′-UAC-5′) binds to the appropriate codon in mRNA (e.g., 5′-AUG-3′). When this occurs, hydrogen bonds form between the base pairs. But these hydrogen bonds are not enough to hold the tRNA in place. The rRNA of the small ribosomal subunit plays a role in validating the three-base-pair match. If hydrogen bonds have not formed between all three base pairs, the tRNA must be the wrong one for that mRNA codon, and that tRNA is ejected from the ribosome.
Translation: RNA-Directed Polypeptide Synthesis
We have been working our way through the steps by which the sequence of bases in the template strand of a DNA molecule specifies the sequence of amino acids in a protein. We are now at the last step: translation, the RNA-directed assembly of a protein. Like transcription, translation occurs in three steps: initiation, elongation, and termination.
Translation begins with an initiation complex
The translation of mRNA begins with the formation of an initiation complex, which consists of a charged tRNA bearing what will be the first amino acid of the polypeptide chain and a small ribosomal subunit, both bound to the mRNA. The rRNA of the small ribosomal subunit binds to a complementary ribosome recognition sequence on the mRNA. This sequence is “upstream” (toward the 5′ end) of the actual start codon that begins translation. Recall that the mRNA start codon in the genetic code is AUG. The anticodon of a methionine-charged tRNA binds to this start codon by complementary base pairing to form the initiation complex. Thus the first amino acid in the chain is always methionine. Not all mature proteins have methionine as their N-terminal amino acid. In many cases, the initiator methionine is removed by an enzyme after translation. After the methionine-charged tRNA has bound to the mRNA, the large subunit of the ribosome joins the complex. The methionine-charged tRNA now lies in the P site of the ribosome, and the A site is aligned with the second mRNA codon. These ingredients (mRNA, two ribosomal subunits, and methionine-charged tRNA) are put together properly by a group of proteins called initiation factors.
The polypeptide elongates from the N terminus
A charged tRNA whose anticodon is complementary to the second codon on the mRNA now enters the open A site of the large ribosomal subunit. The large subunit then catalyzes two reactions:
- It breaks the bond between the tRNA in the P site and its amino acid.
- It catalyzes the formation of a peptide bond between that amino acid and the one attached to the tRNA in the A site.
Because the large subunit performs these two actions, it is said to have peptidyl transferase activity. In this way, methionine (the amino acid in the P site) becomes the N terminus of the new protein. The second amino acid is now bound to methionine but remains attached to its tRNA by its carboxyl group (-COOH) in the A site. How does the large ribosomal subunit catalyze this binding? In 1992, Harry Noller and his colleagues at the University of California at Santa Cruz found that if they removed almost all the proteins in the large subunit, it still catalyzed peptide bond formation. But if the rRNA was destroyed, so was peptidyl transferase activity. Part of the rRNA in the large subunit interacts with the end of the charged tRNA where the amino acid is attached. Thus rRNA appears to be the catalyst. This situation is very unusual because proteins are the usual catalysts in biological systems. The recent purification and crystallization of ribosomes has allowed scientists to examine their structure in detail, and the catalytic role of rRNA in peptidyl transferase activity has been confirmed.
Elongation continues and the polypeptide grows
After the first tRNA releases its methionine, it dissociates from the ribosome, returning to the cytosol to become charged with another methionine. The second tRNA, now bearing a dipeptide, is shifted to the P site as the ribosome moves one codon along the mRNA in the 5′-to-3′ direction. The elongation process continues, and the polypeptide chain grows, as the steps are repeated:
- The next charged tRNA enters the open A site.
- Its amino acid forms a peptide bond with the amino acid chain in the P site, so that it picks up the growing polypeptide chain from the tRNA in the P site.
- The tRNA in the P site is released. The ribosome shifts one codon, so that the entire tRNA–polypeptide complex, along with its codon, moves to the newly vacated P site.
All these steps are assisted by proteins called elongation factors.
A release factor terminates translation
The elongation cycle ends, and translation is terminated, when a stop codon (UAA, UAG, or UGA) enters the A site. These codons encode no amino acids, nor do they bind tRNAs. Rather, they bind a protein release factor, which hydrolyzes the bond between the polypeptide and the tRNA in the P site. The newly completed protein thereupon separates from the ribosome. Its C terminus is the last amino acid to join the chain. Its N terminus, at least initially, is methionine, as a consequence of the AUG start codon. In its amino acid sequence, it contains information specifying its conformation, as well as its ultimate cellular destination.
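Initiation, elongation, and termination can be tied together in one simplified sketch. It makes several assumptions that the text does not: the first AUG in the message is taken as the start codon (the ribosome recognition sequence and the protein factors are ignored), amino acids are written with one-letter abbreviations, and the example message is hypothetical.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order; "*" marks a stop codon.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate from the first AUG to the first in-frame stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so translation never initiates
    polypeptide = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break  # stop codon reached: the chain is released
        polypeptide.append(amino_acid)  # each new residue joins the C-terminal end
    return "".join(polypeptide)


print(translate("GGAUGUUUUGGUAACC"))  # MFW
```

The result begins with M (methionine) at the N terminus and ends with the last amino acid added before the stop codon UAA.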
Regulation of Translation
Like any factory, the machinery of translation can work at varying rates. Variation in the rate of translation is useful for controlling the amount of an active protein in a cell. Some externally applied chemicals, such as some antibiotics, can stop translation. Conversely, the presence of more than one ribosome on an mRNA can speed up protein synthesis.
Some antibiotics and bacterial toxins work by inhibiting translation
Antibiotics are defensive molecules produced by microorganisms such as certain bacteria and fungi. These substances often destroy other microbes, which might compete with the defenders for nutrients. Since the 1940s, scientists have isolated increasing numbers of antibiotics, and physicians use them to treat a great variety of infectious diseases, ranging from bacterial meningitis to pneumonia to gonorrhea. The key to the medical use of antibiotics is specificity: An antibiotic must act to destroy the microbial invader, but not harm the human host. One way in which antibacterial antibiotics achieve this is to block the synthesis of the bacterial cell wall, something that is essential to the microbe but is not part of human biochemistry. Penicillin works in this way. Another way in which antibiotics work is to inhibit all bacterial protein synthesis. Recall that the prokaryotic ribosome is smaller, and has a different collection of proteins, than the eukaryotic ribosome. Some antibiotics bind only to bacterial ribosomal proteins that are important in protein synthesis. Without the ability to make proteins, the bacterial invaders die, and the infection is stemmed. Some bacteria affect their human hosts through mechanisms similar to those we use against them. Diphtheria is an infectious disease of childhood, and before the advent of effective vaccines, it was a major cause of childhood death. The infective agent, the bacterium Corynebacterium diphtheriae, produces a highly lethal toxin that modifies and inactivates a protein that is essential for the movement of mRNA and ribosomes during eukaryotic protein synthesis.
Polysome formation increases the rate of protein synthesis
Several ribosomes can work simultaneously at translating a single mRNA molecule, producing multiple molecules of the protein at the same time. As soon as the first ribosome has moved far enough from the initiation point, a second initiation complex can form, then a third, and so on. An assemblage consisting of a thread of mRNA with its beadlike ribosomes and their growing polypeptide chains is called a polyribosome, or polysome. Cells that are actively synthesizing proteins contain large numbers of polysomes and few free ribosomes or ribosomal subunits. A polysome is like a cafeteria line, in which patrons follow one another, adding items to their trays. At any moment, the person at the start has a little food (a newly initiated protein); the person at the end has a complete meal (a completed protein). However, in the polysome cafeteria, everyone gets the same meal: Many copies of the same protein are made from a single mRNA. While protein synthesis can be inhibited with antibiotics and speeded up via polysomes, these are not the only ways in which the amount of an active protein in a cell can be controlled. After the protein is synthesized, it may undergo changes that alter its function.
A functional protein is not necessarily the same as the polypeptide chain that is released from the ribosome. Especially in eukaryotic cells, the polypeptide may need to be moved far from the site of synthesis in the cytoplasm, moved into an organelle, or even secreted from the cell. In addition, the polypeptide is often modified by the addition of new chemical groups that have functional significance. In this section, we examine these two posttranslational aspects of protein synthesis.
Chemical signals in proteins direct them to their cellular destinations
As a polypeptide chain emerges from the ribosome, it folds into its three-dimensional shape. This conformation is determined by the sequence of the amino acids that make up the protein, as well as by factors such as the polarity and charge of their R groups. Ultimately, the conformation of the polypeptide allows it to interact with other molecules in the cell, such as a substrate or another polypeptide. In addition to this structural information, the amino acid sequence contains an “address label” indicating where in the cell the polypeptide belongs. All protein synthesis begins on free ribosomes in the cytoplasm. As a polypeptide chain is made, the information contained in its amino acid sequence gives it one of two sets of instructions:
- “Finish translation and be released to the cytoplasm.” Such proteins are sent to the nucleus, mitochondria, plastids, or peroxisomes, depending on the address in their instructions; or, lacking such specific instructions, they remain in the cytosol.
- “Stop translation, go to the endoplasmic reticulum (ER), and finish synthesis there.” After protein synthesis is completed, such proteins may be retained in the ER or sent to lysosomes via the Golgi apparatus. Alternatively, they may be sent to the plasma membrane, or, lacking such specific instructions, they are secreted from the cell via vesicles that fuse with the plasma membrane.
Destination: cytoplasm
After translation, some folded polypeptides have a short exposed sequence of amino acids that acts like a postal “zip code” directing them to an organelle. These signal sequences are either at the N terminus or in the interior of the amino acid chain. For example, the following sequence directs a protein to the nucleus:
This amino acid sequence occurs, for example, in the histone proteins associated with nuclear DNA but not in citric acid cycle enzymes, which are addressed to the mitochondria. The signal sequences have a conformation that allows them to bind to specific receptor proteins, appropriately called docking proteins, on the outer membrane of the appropriate organelle. Once the protein has bound to it, the receptor forms a channel in the membrane, allowing the protein to pass through to its organelle destination. (In this process, the protein is usually unfolded by a chaperonin so that it can pass through the channel, then refolds into its normal conformation.)
Destination: endoplasmic reticulum
If a specific hydrophobic sequence of about 25 amino acids occurs at the beginning of a polypeptide chain, the finished product is sent initially to the ER, and then to the lysosomes, the plasma membrane, or out of the cell. In the cytoplasm, before translation is finished, the signal sequence binds to a signal recognition particle composed of protein and RNA. This binding blocks further protein synthesis until the ribosome can become attached to a specific receptor protein in the membrane of the rough ER. Once again, the receptor protein is converted into a channel, through which the growing polypeptide passes. The elongating polypeptide may be retained in the ER membrane itself, or it may enter the interior space—the lumen—of the ER. In either case, an enzyme in the lumen of the ER removes the signal sequence from the polypeptide chain. At this point, protein synthesis resumes, and the chain grows longer until its sequence is completed. If the finished protein enters the ER lumen, it can be transported to its appropriate location — to other cellular compartments or to the outside of the cell — without mixing with other molecules in the cytoplasm. Additional signals are needed for sorting the protein further (remember that the signal sequence that sent it to the ER has been removed). These signals are of two kinds:
- Some are sequences of amino acids that allow the protein’s retention within the ER.
- Others are sugars added in the Golgi apparatus, to which the protein is transferred in vesicles from the ER.
The resulting glycoproteins end up either at the plasma membrane or in a lysosome (or plant vacuole), depending on which sugars are added. Proteins with no additional signals pass from the ER through the Golgi apparatus and are secreted from the cell. It is important to emphasize that the addressing of a protein to its destination is a property of its amino acid sequence, and so is genetically determined. An example of what can go wrong if a gene for protein targeting is mutated is mucolipidosis II, or I-cell disease. People with this disease lack an essential enzyme for the formation of the lysosomal targeting signal. Consequently, proteins destined for their lysosomes never get there, but instead either stay in the Golgi (where they form I, or inclusion, bodies) or are secreted from the cell. The lack of normal lysosome functions in a person’s cells leads to progressive illness and death in childhood.
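The sorting rules for proteins that enter the ER can be summarized as a small decision procedure. The sketch below is only a toy model of the text: the function name, its parameters, and the tag labels "lysosomal" and "plasma_membrane" are illustrative assumptions, not biological terminology.

```python
from typing import Optional


def er_protein_destination(has_er_signal: bool,
                           has_er_retention_sequence: bool = False,
                           golgi_sugar_tag: Optional[str] = None) -> str:
    """Toy model of protein sorting via the ER (all names are hypothetical).

    has_er_signal: the hydrophobic signal sequence at the start of the chain.
    has_er_retention_sequence: an amino acid sequence that keeps the protein in the ER.
    golgi_sugar_tag: which kind of sugar signal was added in the Golgi, if any.
    """
    if not has_er_signal:
        # Without the signal sequence the protein never docks at the rough ER.
        return "not routed through the ER"
    if has_er_retention_sequence:
        return "endoplasmic reticulum"
    if golgi_sugar_tag == "lysosomal":
        return "lysosome"
    if golgi_sugar_tag == "plasma_membrane":
        return "plasma membrane"
    if golgi_sugar_tag is None:
        # No additional signal: the protein passes through the Golgi and is secreted.
        return "secreted"
    raise ValueError(f"unknown sugar tag: {golgi_sugar_tag!r}")


# A lysosomal enzyme in a normal cell, and the same enzyme when the
# lysosomal tag cannot be formed (as in I-cell disease).
normal = er_protein_destination(True, golgi_sugar_tag="lysosomal")
untagged = er_protein_destination(True, golgi_sugar_tag=None)
```

In this simplified model the untagged enzyme is simply secreted; the text notes that in I-cell disease such proteins may also accumulate in the Golgi, which the sketch does not represent.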
Many proteins are modified after translation
Most finished proteins are not identical to the polypeptide chains translated from mRNA on the ribosomes. Instead, most polypeptides are modified after translation, and these modifications are essential to the final functioning of the protein.
Proteolysis is the cutting of a polypeptide chain. Cleavage of the signal sequence from the growing polypeptide chain in the ER is an example of proteolysis; the protein might move back out of the ER through the membrane channel if the signal sequence were not cut off. Also, some proteins are actually made from polyproteins (long polypeptides) that are cut into final products by enzymes called proteases. Proteases are essential to some viruses, including HIV, because the large viral polyprotein cannot fold properly unless it is cut. Certain drugs used to treat AIDS work by inhibiting the HIV protease, thereby preventing the formation of proteins needed for viral reproduction.
Glycosylation involves the addition of sugars to proteins, as described above. In both the ER and the Golgi apparatus, resident enzymes catalyze the addition of various sugar residues or short sugar chains to certain amino acid R groups on proteins as they pass through. One such type of “sugar coating” is essential for addressing proteins to lysosomes, as discussed in the preceding section. Other types are important in the conformation and the recognition functions of proteins at the cell surface. Still other attached sugar residues help to stabilize proteins stored in storage vacuoles in plant seeds.
Phosphorylation, the addition of phosphate groups to proteins, is catalyzed by protein kinases. The charged phosphate groups change the conformation of targeted proteins, often exposing an active site of an enzyme or a binding site for another protein. All of the processes we have just described result in a functional protein only if the amino acid sequence of that protein is correct. If the sequence is not correct, cellular dysfunction and disease may result. Changes in the DNA, called mutations, are a major source of errors in amino acid sequences.
Mutations: Heritable Changes in Genes
Accurate DNA replication, transcription, and translation all depend on the reliable pairing of complementary bases. Errors occur, though infrequently, in all three processes, and least often in DNA replication. But the consequences of DNA errors are the most severe because only they are heritable.
Mutations are heritable changes in genetic information. In unicellular organisms, any mutations that occur are passed on to the daughter cells when the cell divides. In multicellular organisms, there are two general types of mutations in terms of inheritance:
- Somatic mutations are those that occur in somatic (body) cells. These mutations are passed on to the daughter cells after mitosis, and to the offspring of those cells in turn, but are not passed on to sexually produced offspring. A mutation in a single skin cell, for example, could result in a patch of skin cells, all with the same mutation, but would not be passed on to a person’s children.
- Germ line mutations are those that occur in the cells of the germ line, the specialized cells that give rise to gametes. A gamete with the mutation passes it on to a new organism at fertilization.
Very small changes in the genetic material can lead to easily observable changes in the phenotype. Some effects of mutations in humans are readily detectable: dwarfism, for instance, or the presence of more than five fingers on each hand. A mutant genotype in a microorganism may be obvious if, for example, it results in a change in nutritional requirements, as we described for Neurospora earlier (see Figure ).
Other mutations may not be easily observable. In humans, for example, a particular mutation drastically lowers the level of an enzyme called glucose 6-phosphate dehydrogenase that is present in many tissues, including red blood cells. The red blood cells of a person carrying the mutant allele are abnormally sensitive to an antimalarial drug called primaquine; when such people are treated with this drug, their red blood cells rupture. People with the normal allele have no such problem. Before the drug came into use, no one was aware that such a mutation existed. In bacteria, because of their small sizes and simpler morphologies, distinguishing a mutant from a normal bacterium usually requires sophisticated chemical methods, not just visual inspection.
Some mutations cause their phenotypes only under certain restrictive conditions. They are not detectable under other, permissive conditions. Organisms with such phenotypes are known as conditional mutants. Many conditional mutants are temperature-sensitive, able to grow normally at a permissive temperature (say, 30°C) but unable to grow at a restrictive temperature (say, 37°C). The mutant allele in such an organism may code for an enzyme with an unstable tertiary structure that is altered at the restrictive temperature.
All mutations are alterations in the nucleotide sequence of DNA.
At the molecular level, we can divide mutations into two categories:
- Point mutations are mutations of single base pairs and so are limited to single genes: One allele (usually dominant) becomes another allele (usually recessive) because of an alteration (gain/loss or substitution) of a single nucleotide (which, after DNA replication, becomes a mutant base pair).
- Chromosomal mutations are more extensive alterations than point mutations. They may change the position or orientation of a DNA segment without actually removing any genetic information, or they may cause a segment of DNA to be irretrievably lost.
Point mutations are changes in single nucleotides
Point mutations result from the addition or subtraction of a nucleotide base, or the substitution of one base for another, in the DNA, and hence in the mRNA. Point mutations can be caused by errors in chromosome replication that are not corrected in proofreading or by environmental mutagens such as chemicals and radiation. Changes in the mRNA may or may not result in changes in the protein. Silent mutations have no effect on the protein; missense and nonsense mutations will result in changes in the protein, some of them drastic.
Because of the redundancy of the genetic code, some point mutations result in no change in amino acids when the altered mRNA is translated; for this reason, they are called silent mutations. For example, there are four mRNA codons that code for proline: CCA, CCC, CCU, and CCG. If the template strand of DNA has the sequence CGG, it will be transcribed to CCG in mRNA, and proline-charged tRNA will bind to it at the ribosome. But if there is a mutation such that the codon in the template DNA now reads AGG, the mRNA codon will be CCU, and the tRNA that binds it will still carry proline. Silent mutations are quite common, and they result in genetic diversity that is not expressed as phenotypic differences.
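The proline example can be checked with a few lines of code. This is a minimal sketch that assumes the template triplets in the text (CGG and AGG) are written 5' to 3', so that the mRNA is their reverse complement; the function and variable names are illustrative only.

```python
# Base-pairing rules for transcription: DNA template base -> RNA base.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

# The four proline codons listed in the text.
PROLINE_CODONS = {"CCA", "CCC", "CCU", "CCG"}


def transcribe(template_5_to_3: str) -> str:
    """Return the mRNA (written 5' to 3') for a template strand written 5' to 3'.

    The two strands are antiparallel, so the template is read in reverse
    while each base is replaced by its RNA complement.
    """
    return "".join(DNA_TO_RNA[base] for base in reversed(template_5_to_3))


original_codon = transcribe("CGG")  # -> "CCG"
mutant_codon = transcribe("AGG")    # -> "CCU"

# Both codons specify proline, so the substitution is silent.
is_silent = original_codon in PROLINE_CODONS and mutant_codon in PROLINE_CODONS
```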
In contrast to silent mutations, some base substitution mutations change the genetic message such that one amino acid substitutes for another in the protein. These changes are called missense mutations.
A missense mutation may cause a protein not to function, but often its effect is only to reduce the functional efficiency of the protein. Therefore, individuals carrying missense mutations may survive, even though the affected protein is essential to life. Through evolution, some missense mutations even improve functional efficiency.
Nonsense mutations, another type of mutation in which one base is substituted for another, are more often disruptive than missense mutations. In a nonsense mutation, the base substitution causes a stop codon, such as UAG, to form in the mRNA product. Translation stops at that codon, so the protein is shortened.
Not all point mutations are base substitutions: a single base may also be added to or deleted from the DNA. Think again of codons as three-letter words, each corresponding to a particular amino acid. Translation proceeds codon by codon; if a base is added to the message or subtracted from it, translation proceeds perfectly until it comes to the one-base insertion or deletion. From that point on, the three-letter words in the message are one letter out of register. In other words, such mutations shift the “reading frame” of the genetic message and are called frame-shift mutations. Frame-shift mutations almost always lead to the production of nonfunctional proteins.
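The different effects of base substitutions and of single-base insertions can be illustrated with a short program. The sketch below uses the standard genetic code with one-letter amino acid abbreviations (`*` marks a stop codon). The example message and the helper names are made up for illustration, and translation simply starts at the first base without searching for a start codon.

```python
from itertools import product

# Standard genetic code, listed in U, C, A, G order for each codon position.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate codon by codon from the first base until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def classify_substitution(original_codon: str, mutant_codon: str) -> str:
    """Classify a base substitution within one codon."""
    before = CODON_TABLE[original_codon]
    after = CODON_TABLE[mutant_codon]
    if before == after:
        return "silent"
    if after == "*":
        return "nonsense"
    return "missense"


normal = "AUGCCGUGGAAAUAA"            # hypothetical message: Met-Pro-Trp-Lys-stop
nonsense = "AUGCCGUAGAAAUAA"          # UGG -> UAG creates a premature stop codon
frameshift = normal[:3] + "A" + normal[3:]  # one base inserted after the first codon

print(translate(normal))      # MPWK
print(translate(nonsense))    # MP
print(translate(frameshift))  # MTVEI
print(classify_substitution("CCG", "CCU"))  # silent
print(classify_substitution("CCG", "CUG"))  # missense
print(classify_substitution("UGG", "UAG"))  # nonsense
```

After the inserted base, every codon is read out of register, so every amino acid downstream of the insertion differs from the normal product, and the original stop codon is no longer read as one.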
Chromosomal mutations are extensive changes in the genetic material
Changes in single nucleotides are not the most dramatic changes that can occur in the genetic material. Whole DNA molecules can break and rejoin, grossly disrupting the sequence of genetic information. There are four types of such chromosomal mutations: deletions, duplications, inversions, and translocations. These mutations can be caused by severe damage to chromosomes resulting from mutagens or by drastic errors in chromosome replication.
- Deletions remove part of the genetic material. Like frame-shift point mutations, their consequences can be severe unless they affect unnecessary genes or are masked by the presence, in the same cell, of normal alleles of the deleted genes. It is easy to imagine one mechanism that could produce deletions: A DNA molecule might break at two points, and the two end pieces might rejoin, leaving out the DNA between the breaks.
- Duplications can be produced at the same time as deletions. Duplication would arise if homologous chromosomes broke at different positions and then reconnected to the wrong partners. One of the two molecules produced by this mechanism would lack a segment of DNA (it would have a deletion), and the other would have two copies (a duplication) of the segment that was deleted from the first.
- Inversions also result from breaking and rejoining. A segment of DNA may be removed and reinserted into the same location in the chromosome, but “flipped” end over end so that it runs in the opposite direction (Figure). If the break site for an inversion includes part of a DNA segment that codes for a protein, the resulting protein will be drastically altered and almost certainly nonfunctional.
- Translocations result when a segment of DNA breaks off, moves from its chromosome, and is inserted into a different chromosome. Translocations may be reciprocal, or nonreciprocal, as the mutation involving duplication and deletion in Figure illustrates. Translocations often lead to duplications and deletions, and may result in sterility if normal chromosome pairing in meiosis cannot occur.
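The four kinds of chromosomal mutation can be pictured as operations on a string in which each letter stands for a DNA segment. This is a schematic sketch only: the function names, the segment labels, and the break positions are illustrative assumptions.

```python
from typing import Tuple


def deletion(chromosome: str, start: int, end: int) -> str:
    """Break at two points and rejoin the end pieces, losing the middle."""
    return chromosome[:start] + chromosome[end:]


def duplication(chromosome: str, start: int, end: int) -> str:
    """Carry two adjacent copies of the segment between start and end."""
    return chromosome[:end] + chromosome[start:end] + chromosome[end:]


def inversion(chromosome: str, start: int, end: int) -> str:
    """Remove a segment and reinsert it at the same place, flipped end over end."""
    return chromosome[:start] + chromosome[start:end][::-1] + chromosome[end:]


def translocation(donor: str, start: int, end: int,
                  recipient: str, insert_at: int) -> Tuple[str, str]:
    """Move a segment from one chromosome into a different chromosome."""
    segment = donor[start:end]
    return (donor[:start] + donor[end:],
            recipient[:insert_at] + segment + recipient[insert_at:])


def unequal_exchange(homolog: str, break_a: int, break_b: int) -> Tuple[str, str]:
    """Two copies of a homolog break at different positions and swap partners."""
    return (homolog[:break_a] + homolog[break_b:],
            homolog[:break_b] + homolog[break_a:])


print(deletion("ABCDEFG", 2, 4))                  # ABEFG
print(duplication("ABCDEFG", 2, 4))               # ABCDCDEFG
print(inversion("ABCDEFG", 2, 4))                 # ABDCEFG
print(translocation("ABCDEFG", 2, 4, "MNOPQ", 3)) # ('ABEFG', 'MNOCDPQ')
print(unequal_exchange("ABCDEFG", 2, 4))          # ('ABEFG', 'ABCDCDEFG')
```

The last call shows how a deletion and a duplication can arise together: one product lacks segment CD and the other carries it twice.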
Mutations can be spontaneous or induced
It is useful to distinguish two types of mutations in terms of their causes. Spontaneous mutations are permanent changes in the genome that occur without any outside influence. In other words, they occur simply because the machinery of the cell is imperfect. Induced mutations occur when some agent outside the cell—a mutagen—causes a permanent change in DNA. Spontaneous mutations may occur by several mechanisms:
- The four nucleotide bases of DNA are somewhat unstable. They can exist in two different forms (called tautomers), one of which is common and one rare. When a base temporarily forms its rare tautomer, it can pair with a different base. For example, C normally pairs with G. But if C is in its rare tautomer at the time of DNA replication, it pairs with (and DNA polymerase will insert) A. The result is a point mutation.
- Bases may change because of a chemical reaction. For example, loss of an amino group in cytosine (a reaction called deamination) forms uracil. When DNA replicates, instead of a G opposite what was C, DNA polymerase adds an A, which base-pairs with U.
- DNA polymerase makes errors in replication, for example by inserting a T opposite a G. Most of these errors are repaired by the proofreading function of the replication complex, but some errors escape and become permanent.
- Meiosis is not perfect. Nondisjunction can occur, leading to one too many or one too few chromosomes. Random chromosome breaks and rejoining can produce deletions, duplications and inversions, or, when involving nonhomologous chromosomes, translocations.
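The deamination mechanism in the list above can be followed through two rounds of replication. The sketch below is deliberately simplified: strands are written position by position in alignment with their partners (strand polarity is ignored), uracil is assumed to stay in the strand, and the example sequence and function names are hypothetical.

```python
# Pairing rules used by DNA polymerase; U (deaminated C) pairs with A.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G", "U": "A"}


def deaminate(strand: str, position: int) -> str:
    """Convert the cytosine at the given position into uracil."""
    if strand[position] != "C":
        raise ValueError("deamination in this model applies only to cytosine")
    return strand[:position] + "U" + strand[position + 1:]


def replicate(template: str) -> str:
    """Return the new strand built against a template, aligned base by base."""
    return "".join(PAIR[base] for base in template)


original = "ATCG"
normal_partner = replicate(original)   # "TAGC": G sits opposite the C

damaged = deaminate(original, 2)       # "ATUG"
first_round = replicate(damaged)       # "TAAC": A is inserted opposite U
second_round = replicate(first_round)  # "ATTG": the original C is now a T
```

After the second round, the position that started as a C-G pair has become a T-A pair, which is a point mutation of the substitution type.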
Mutagens can also alter DNA by several mechanisms:
- Some chemicals can covalently alter the nucleotide bases. For example, nitrous acid (HNO2) and its relatives can turn cytosine in DNA into uracil by deamination: they convert an amino group on cytosine (—NH2) into a keto group. This alteration has the same result as a spontaneous deamination: instead of a G, DNA polymerase inserts an A (base-pairs with U).
- Some chemicals add groups to the bases. For instance, benzpyrene, a component of cigarette smoke, adds a large chemical group to guanine, making it unavailable for base pairing. When DNA polymerase reaches such a modified guanine, it inserts any of the four bases; of course, three-fourths of the time the inserted base will not be cytosine, and a mutation results.
- Radiation damages the genetic material in two ways. Ionizing radiation (X rays) produces highly reactive chemical species called free radicals, which can change bases in DNA to forms that DNA polymerase cannot recognize or break the sugar–phosphate backbone, causing chromosomal abnormalities. Ultraviolet radiation from the sun (or a tanning lamp) is absorbed by thymine in DNA, causing it to form interbase covalent bonds with adjacent nucleotides. This, too, plays havoc with DNA replication.
Mutations have both benefits and costs. Germ line mutations provide genetic diversity for evolution to work on, as we will see below. But they usually produce an organism that does more poorly in its current environment. Somatic mutations do not affect the organism’s offspring, but they can lead to cancer.
Mutations are the raw material of evolution
Without mutation, there would be no evolution. Mutation does not drive evolution, but it provides the genetic diversity on which natural selection and other agents of evolution act. All mutations are rare events, but mutation frequencies vary from organism to organism and from gene to gene within a given organism. The frequency of mutation is usually much lower than one mutation per 10^4 base pairs per DNA duplication, and sometimes as low as one mutation per 10^9 base pairs per duplication. Most mutations are point mutations in which one nucleotide is substituted for another during the synthesis of a new DNA strand. Mutations can harm the organism that carries them, or they can be neutral (have no effect on the organism’s ability to survive or produce offspring). Once in a while, a mutation improves an organism’s adaptation to its environment or it becomes favorable when environmental conditions change.
Most of the complex creatures living on Earth have more DNA, and therefore more genes, than the simpler creatures do. Humans, for example, have 20 times more genes than prokaryotes have. How did these new genes arise? If whole genes were sometimes duplicated by the mechanisms described in the previous section, the bearer of the duplication would have a surplus of genetic information that might be turned to good use. Subsequent mutations in one of the two copies of the gene might not have an adverse effect on survival because the other copy of the gene would continue to produce functional protein. The extra gene might mutate over and over again without ill effect because its original function would be fulfilled by the original copy. If the random accumulation of mutations in the extra gene led to the production of a useful protein (for example, an enzyme with an altered specificity for the substrates it binds, allowing it to catalyze different but related reactions), natural selection would tend to perpetuate the existence of this new gene.
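To get a feel for the mutation frequencies quoted in this section, it helps to multiply them by the size of a genome. The short calculation below is a sketch under a stated assumption: the genome size of 3 billion base pairs is an illustrative round number that does not come from the text, and the function name is hypothetical.

```python
def expected_mutations(genome_size_bp: int, bp_per_mutation: int) -> float:
    """Expected number of new mutations in one duplication of the genome,
    given a frequency of one mutation per `bp_per_mutation` base pairs."""
    return genome_size_bp / bp_per_mutation


# Assumption for illustration only: a genome of 3 billion base pairs.
GENOME_SIZE_BP = 3_000_000_000

# The two frequencies mentioned in the text.
at_high_frequency = expected_mutations(GENOME_SIZE_BP, 10**4)  # 300000.0
at_low_frequency = expected_mutations(GENOME_SIZE_BP, 10**9)   # 3.0
```

Under this assumption, a genome copied at the lowest frequency would pick up only a few new mutations per duplication, whereas at one mutation per 10^4 base pairs it would pick up hundreds of thousands. This is consistent with the statement that the actual frequency is usually much lower than the higher figure.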
New copies of genes may also arise through the activity of transposable elements.