This subchapter deals with the mechanisms by which genes are expressed as proteins. We will begin with evidence for the relationship between genes and proteins, and then fill in some of the details of the processes of transcription—the copying of the gene sequence of DNA into a sequence of RNA—and translation—the use of the sequence of RNA to make a polypeptide with a defined order of amino acids. Finally, we will define mutations and their phenotypes in specific molecular terms.
One Gene, One Polypeptide
There are many steps between genotype and phenotype. Genes cannot, all by themselves, directly produce a phenotypic result, such as a particular eye color, a specific seed shape or a cleft chin, any more than a compact disk can play a symphony without the help of a CD player. The first historical step in relating genes to phenotypes was to define phenotypes in molecular terms.
The molecular basis of phenotypes was actually discovered before the discovery that DNA was the genetic material. Scientists had studied the chemical differences between individuals carrying wild-type and mutant alleles in organisms as diverse as humans and bread molds. They found that the major phenotypic differences were the result of differences in specific proteins. In the 1940s, a series of experiments by George W. Beadle and Edward L. Tatum at Stanford University showed that when an altered gene resulted in an altered phenotype, that altered phenotype was always associated with an altered enzyme. This finding was critically important in defining the phenotype in chemical terms.
The roles of enzymes in biochemistry were being described at this time, and it occurred to Beadle and Tatum that the expression of a gene as phenotype could occur through an enzyme. They experimented with the bread mold Neurospora crassa. The nuclei in the body of this mold are haploid (n), as are its reproductive spores. (This fact is important because it means that even recessive mutant alleles are easy to detect in experiments.) Beadle and Tatum grew Neurospora on a minimal nutritional medium containing sucrose, minerals and a vitamin. Using this medium, the enzymes of wild-type Neurospora could catalyze the metabolic reactions needed to make all the chemical constituents of their cells, including proteins. These wild-type strains are called prototrophs (“original eaters”). Beadle and Tatum treated wild-type Neurospora with X rays, which act as a mutagen (something known to cause mutations).
When they examined the treated molds, they found that some mutant strains could no longer grow on the minimal medium, but needed to be supplied with additional nutrients. The scientists hypothesized that these auxotrophs (“increased eaters”) must have suffered mutations in genes that code for the enzymes used to synthesize the nutrients they now needed to obtain from their environment. For each auxotrophic strain, Beadle and Tatum were able to find a single compound that, when added to the minimal medium, supported the growth of that strain. This result suggested that mutations have simple effects, and that each mutation causes a defect in only one enzyme in a metabolic pathway. This conclusion became known as the one-gene, one-enzyme hypothesis.
One group of auxotrophs, for example, could grow only if the minimal medium was supplemented with the amino acid arginine. (Wild-type Neurospora makes its own arginine.) These mutant strains were designated arg mutants. Beadle and Tatum found several different arg mutant strains. They proposed two alternative hypotheses to explain why these different genetic strains had the same phenotype:
The different arg mutants could have mutations in the same gene, as in the case of the different eye color alleles of fruit flies. In this case, the gene might code for an enzyme involved in arginine synthesis.
The different arg mutants could have mutations in different genes, each coding for a separate function that leads to arginine production. These independent functions might be different enzymes along the same biochemical pathway.
Some of the arg mutant strains fell into each of the two categories. Genetic crosses showed that some of the mutations were at the same chromosomal locus, and so were different alleles of the same gene. Other mutations were at different loci or on different chromosomes, and so were not alleles of the same gene. Beadle and Tatum concluded that these different genes participated in governing a single biosynthetic pathway, in this case the pathway leading to arginine synthesis.
By growing different arg mutants in the presence of various compounds suspected to be intermediates in the synthetic metabolic pathway for arginine, Beadle and Tatum were able to classify each mutation as affecting one enzyme or another, and to order the compounds along the pathway. Then they broke open the wild-type and mutant cells and examined them for enzyme activities. The results confirmed their hypothesis: each mutant strain was indeed missing a single active enzyme in the pathway.
The gene–enzyme connection had been proposed more than 30 years earlier, in 1908, by the English physician Archibald Garrod, who studied the inherited human disease alkaptonuria. He linked the biochemical phenotype of the disease to an abnormal gene and a missing enzyme. Today we know of hundreds of examples of such hereditary diseases.
The gene–enzyme relationship has undergone several modifications in light of our current knowledge of molecular biology. Many enzymes are composed of more than one polypeptide chain, or subunit (that is, they have a quaternary structure). In this case, each polypeptide chain is specified by its own separate gene. Thus, it is more correct to speak of a one-gene, one-polypeptide relationship: the function of a gene is to control the production of a single, specific polypeptide. Much later, it was discovered that some genes code for forms of RNA that do not become translated into polypeptides, and that still other genes are involved in controlling which other DNA sequences are expressed. While these discoveries have supplanted the idea that all genes code for proteins, they have not invalidated the relationship between genes and polypeptides. But how does this relationship work? That is, how is the information encoded in DNA used to specify a particular polypeptide?
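Before turning to that question, the reasoning Beadle and Tatum used to order the compounds of the arginine pathway can be made concrete with a small sketch. The idea is that a compound lying late in the pathway rescues every mutant blocked upstream of it, so it supports the growth of more strains than an early compound does. Everything specific here is an assumption for illustration: the strain names and the growth table are invented, and ornithine and citrulline are used as assumed intermediates that this section does not itself name.

```python
# Hypothetical growth table: for each mutant strain, the set of supplements
# that restore growth on minimal medium. The data are invented.
GROWTH = {
    "arg_mutant_1": {"ornithine", "citrulline", "arginine"},
    "arg_mutant_2": {"citrulline", "arginine"},
    "arg_mutant_3": {"arginine"},
}


def order_pathway(growth):
    """Order compounds from earliest to latest in the pathway.

    A later compound bypasses more possible blocks, so it rescues more
    strains. Sorting by the number of strains rescued gives the order.
    (Ties would be ambiguous; the sample data contain none.)
    """
    compounds = set().union(*growth.values())
    rescued = {c: sum(c in s for s in growth.values()) for c in compounds}
    return sorted(compounds, key=lambda c: rescued[c])


def first_rescuing_compound(growth, pathway):
    """For each strain, find the earliest compound that restores growth.

    The missing enzyme is inferred to be the one that produces this compound.
    """
    return {
        strain: next(c for c in pathway if c in supplements)
        for strain, supplements in growth.items()
    }


pathway = order_pathway(GROWTH)
blocks = first_rescuing_compound(GROWTH, pathway)
print(" -> ".join(pathway))
for strain, compound in blocks.items():
    print(f"{strain}: cannot make {compound}")
```

With this invented table, each strain is assigned to exactly one step, which is the pattern the one-gene, one-enzyme hypothesis predicts.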
DNA, RNA, and the Flow of Information
The expression of a gene to form a polypeptide occurs in two major steps:
Transcription copies the information of a DNA sequence (the gene) into corresponding information in an RNA sequence.
Translation converts this RNA sequence into the amino acid sequence of a polypeptide.
RNA differs from DNA
RNA is a key intermediary between DNA and polypeptide. RNA (ribonucleic acid) is a polynucleotide similar to DNA, but it differs from DNA in three ways:
RNA generally consists of only one polynucleotide strand.
The sugar molecule found in RNA is ribose, rather than the deoxyribose found in DNA.
Although three of the nitrogenous bases (adenine, guanine, and cytosine) in RNA are identical to those in DNA, the fourth base in RNA is uracil (U), which is similar to thymine but lacks the methyl (-CH3) group.
The central dogma of molecular biology, which holds that information flows from DNA to RNA to protein, raised two questions:
How does genetic information get from the nucleus to the cytoplasm? (Most of the DNA of a eukaryotic cell is confined to the nucleus, but proteins are synthesized in the cytoplasm.)
What is the relationship between a specific nucleotide sequence in DNA and a specific amino acid sequence in a protein?
To answer these questions, Crick proposed two hypotheses.
The messenger hypothesis and transcription
To answer the first question, Crick and his colleagues proposed the messenger hypothesis. According to this hypothesis, an RNA molecule forms as a complementary copy of one DNA strand of a particular gene. The process by which this RNA forms is called transcription. This messenger RNA, or mRNA, then travels from the nucleus to the cytoplasm, where it serves as a template for the synthesis of proteins.
Crick’s hypothesis has been tested repeatedly for genes that code for proteins, and the result is always the same: each gene sequence in DNA that codes for a protein is expressed as a sequence in mRNA.
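The idea of mRNA as a complementary copy of one DNA strand can be illustrated with a minimal sketch. It assumes the usual base-pairing rules (A with U or T, G with C) and that the template strand is written 3' to 5', so that the mRNA reads 5' to 3' in the same left-to-right order. Neither the strand orientation nor the example sequence comes from this section; both are assumptions made for illustration, and the function name is hypothetical.

```python
# DNA template base -> RNA base that pairs with it.
# RNA uses uracil (U) where DNA would use thymine (T).
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_strand):
    """Return the mRNA complementary to a DNA template strand.

    The template is assumed to be written 3'->5', so the returned
    mRNA is written 5'->3'.
    """
    return "".join(DNA_TO_RNA[base] for base in template_strand)


mrna = transcribe("TACAAAGGCATT")
print(mrna)  # AUGUUUCCGUAA
```

Only one of the two DNA strands is copied; the other strand plays no part in this sketch.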
The adapter hypothesis and translation
To answer the second question, Crick proposed the adapter hypothesis: there must be an adapter molecule that can bind a specific amino acid with one region and recognize a sequence of nucleotides with another region. In due course, these adapters, called transfer RNA or tRNA, were identified. Because they recognize the genetic message of mRNA and simultaneously carry specific amino acids, tRNAs can translate the language of DNA into the language of proteins. The tRNA adapters line up on the mRNA so that the amino acids are in the proper sequence for a growing polypeptide chain, a process called translation. Once again, actual observations of the expression of thousands of genes have confirmed the hypothesis that tRNA acts as the intermediary between the nucleotide sequence information in mRNA and the amino acid sequence in a protein.
Summarizing the main features of the central dogma, the messenger hypothesis and the adapter hypothesis, we may say that a given gene is transcribed to produce a messenger RNA (mRNA) complementary to one of the DNA strands, and that transfer RNA (tRNA) molecules translate the sequence of bases in the mRNA into the appropriate sequence of linked amino acids during protein synthesis.
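The adapter idea can be sketched in a few lines: each adapter pairs a nucleotide-recognizing region with the amino acid it carries. This sketch assumes that the mRNA is read three bases at a time and that certain triplets signal a stop, details of the genetic code that this section has not introduced. The adapter table is a small illustrative subset, and the names `ADAPTERS` and `translate` are hypothetical.

```python
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Each adapter (tRNA) is modeled as: recognition triplet -> amino acid carried.
# Triplets are written 3'->5' so they align base by base with the mRNA.
# Illustrative subset only.
ADAPTERS = {"UAC": "Met", "AAA": "Phe", "GGC": "Pro"}

# Triplets in the mRNA that end translation (assumed, not from this section).
STOP_TRIPLETS = {"UAA", "UAG", "UGA"}


def translate(mrna):
    """Line adapters up along the mRNA and collect their amino acids.

    Raises KeyError if no adapter in the table recognizes a triplet.
    """
    polypeptide = []
    for i in range(0, len(mrna) - 2, 3):
        triplet = mrna[i:i + 3]
        if triplet in STOP_TRIPLETS:
            break
        # The adapter that binds here is the one complementary to the triplet.
        recognition = "".join(RNA_PAIR[base] for base in triplet)
        polypeptide.append(ADAPTERS[recognition])
    return polypeptide


print(translate("AUGUUUCCGUAA"))  # ['Met', 'Phe', 'Pro']
```

The order of amino acids is fixed entirely by the order of triplets in the mRNA; the adapters only supply the match between the two.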
RNA viruses modify the central dogma
Certain viruses are rare exceptions to the central dogma. Viruses are infectious particles that reproduce inside cells.
Many viruses, such as the tobacco mosaic virus, influenza virus, and poliovirus, have RNA rather than DNA as their genetic material. With its nucleotide sequence, RNA could potentially act as an information carrier and be expressed as proteins. But since RNA is usually single-stranded, its replication is a problem. The viruses generally solve this problem by transcribing from RNA to RNA, making an RNA strand that is complementary to their genome. This “opposite” strand is then used to make multiple copies of the viral genome by transcription.
The human immunodeficiency virus (HIV) and certain rare tumor viruses also have RNA as their genome, but do not replicate it by RNA-to-RNA transcription. Instead, after infecting a host cell, they make a DNA copy of their genome and use it to make more RNA. This RNA then serves both as the genome for new copies of the virus and as mRNA to produce viral proteins. Synthesis of DNA from RNA is called reverse transcription, and not surprisingly, such viruses are called retroviruses.
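The two viral strategies described above differ only in which kind of complementary strand is made. The following sketch contrasts them under simplifying assumptions: strand direction is ignored, the genome is a short invented sequence, and all function names are hypothetical.

```python
# Pairing rules for copying an RNA strand into RNA or into DNA.
RNA_TO_RNA = {"A": "U", "U": "A", "G": "C", "C": "G"}
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def complement(strand, pairing):
    """Return the strand complementary to `strand` under `pairing`."""
    return "".join(pairing[base] for base in strand)


def replicate_rna_genome(genome):
    """RNA-to-RNA route: make the "opposite" strand, then copy it back."""
    opposite = complement(genome, RNA_TO_RNA)
    return complement(opposite, RNA_TO_RNA)


def reverse_transcribe(genome):
    """Retrovirus route: make a DNA copy complementary to the RNA genome."""
    return complement(genome, RNA_TO_DNA)


genome = "AUGGCU"  # invented example sequence
print(replicate_rna_genome(genome))  # AUGGCU
print(reverse_transcribe(genome))    # TACCGA
```

Copying the complement of a strand returns the original sequence, which is why the “opposite” strand can serve as a template for new genomes.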