Chemistry LibreTexts

4.9: Gene Expression

    Source: BiochemFFA_7_7.pdf. The entire textbook is available for free from the authors at

    The processes of transcription and translation described so far tell us what steps are involved in the copying of information from a gene (DNA) into RNA and the synthesis of a protein directed by the sequence of the transcript (Figure 7.102). These steps are required for gene expression, the process by which information in DNA directs the production of the proteins needed by the cell.

    But what determines whether a gene is expressed at a given time? Cells do not, as we know, express all of their genes all of the time. Some genes are expressed in particular cell types but not others, while others may be expressed at specific stages of development. Cells must also be able to alter their patterns of gene expression in response to internal and external cues, controlling the production of proteins to meet their changing needs. Regulating gene expression is, therefore, crucial. Given that there are multiple steps involved in gene expression, there are several different points at which the process could be regulated. Not surprisingly, many regulatory mechanisms are known, each acting at a different stage in the path from DNA to protein.

    Regulation of Transcription

    The first step in gene expression is transcription, so regulation of transcription is an obvious way to affect whether a gene is expressed and to what extent.

    What are the molecular switches that turn transcription on or off? Although there are additional factors that affect transcription, such as the accessibility of a gene to the transcriptional machinery, the basic mechanism by which transcription is regulated depends on highly specific interactions between transcription regulating proteins and regulatory sequences on DNA.

    What are these regulatory sequences and what proteins bind them? In addition to the promoter sequences required for transcription initiation, genes have additional cis regulatory sequences (sequences of DNA on the same DNA molecule as the gene) that control when a gene is transcribed. Regulatory sequences are bound tightly and specifically by transcriptional regulators, proteins that can recognize DNA sequences and bind to them. The binding of such proteins to the DNA can regulate transcription by preventing or increasing transcription from a particular promoter.

    Transcriptional regulation in prokaryotes

    Let us first consider some examples from prokaryotes. In bacteria, genes are often clustered in groups, such that genes that need to be expressed at the same time are next to each other and all of them are controlled as a single unit by the same promoter. Groups of genes that are coordinately regulated by a single promoter are referred to as operons. The entire set of genes in an operon can be controlled through the action of DNA binding proteins that act as either repressors (preventing transcription of the genes) or activators (increasing transcription of the genes). The binding of these proteins to their DNA targets is allosterically controlled by the binding of specific small molecules that signal the state of the cell.

    Induction of the lac operon

    The lac operon is one such group of coordinately regulated genes that encode proteins needed for the uptake and breakdown of the sugar lactose. E. coli cells preferentially use glucose for their energy needs, but if glucose is unavailable and lactose is present, the bacteria will take up lactose and break it down for energy. Since the proteins for taking up and breaking down lactose are only needed when glucose is absent and lactose is available, the bacterial cells need a way to express the genes of the lac operon only under those conditions. The default state of the lac operon is OFF.

    Removing a repressor

    Transcription of the lac cluster of genes is primarily controlled by a repressor protein that binds to a region of the DNA just downstream of the -10 sequence of the lac promoter (Figure 7.104). Recall that the promoter is where the RNA polymerase must bind to begin transcription. The location on the DNA where the lac repressor is bound is called the operator (Figure 7.105). When the repressor is bound at this position, it physically blocks the RNA polymerase from transcribing the genes, just as a vehicle blocking your driveway would prevent you from pulling out. Obviously, if you want to leave, the vehicle that is blocking your path must be removed. Likewise, in order for transcription to occur, the repressor must be removed from the operator to clear the path for RNA polymerase (Figure 7.106).

    How is the repressor removed? When the sugar lactose is present, a small amount of it is taken up by the cells and converted to an isomeric form, allolactose (Figure 7.107). Allolactose binds to the repressor, changing its conformation so that it no longer binds to the operator. When the repressor is no longer bound to the operator, the "road-block" in front of the RNA polymerase is removed, permitting the transcription of the genes of the lac operon.

    What makes this an especially effective control system is that the genes of the lac operon encode proteins that enable the breakdown of lactose. Turning on these genes requires lactose to be present. Once the lactose has been broken down, the lac repressor binds to the operator once more and the lac genes are no longer expressed. This allows the genes to be expressed only when they are needed.

    Recruiting RNA polymerase

    But how do glucose levels affect the expression of the lac genes? We noted earlier that if glucose was present, lactose would not be used. A second level of control is exerted by a protein called Catabolite Activator Protein (CAP, Figure 7.108). CAP (also sometimes called CBP or cAMP binding protein) binds to a site adjacent to the promoter and is necessary to recruit RNA polymerase to bind the lac promoter.

    cAMP binding

    CAP binds to its site only when glucose levels are low. Low glucose levels are linked to the activation of an enzyme, adenylate cyclase, that makes the molecule cyclic AMP (cAMP). The binding of cAMP to CAP causes a conformational change in CAP that allows it to bind to the CAP-binding site. When CAP is bound at this site, it is able to recruit RNA polymerase to bind at the promoter and begin transcription.

    The combination of CAP binding and the lac repressor dissociating from the operator when lactose levels are high ensures transcription of the lac operon just when it is most needed. The binding of CAP may be thought of as a green light for the RNA polymerase, while the removal of lac repressor is like the lifting of a barricade in front of it. When both conditions are met, the RNA polymerase transcribes the downstream genes.
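
    The two controls described above can be read as a simple logic gate. The following is a minimal sketch in Python, not a quantitative model: the function name and the boolean inputs are illustrative assumptions, and the all-or-nothing output ignores the low basal transcription that real cells may show when only one condition is met.

```python
from itertools import product


def lac_operon_transcribed(lactose_present: bool, glucose_present: bool) -> bool:
    """Qualitative ON/OFF state of the lac operon (illustrative simplification)."""
    # Allolactose, made from lactose, releases the repressor from the operator.
    repressor_on_operator = not lactose_present
    # Low glucose leads to cAMP production; cAMP-bound CAP binds next to the promoter.
    cap_on_dna = not glucose_present
    # Transcription needs the "green light" (CAP) and no "barricade" (repressor).
    return cap_on_dna and not repressor_on_operator


# Print the full truth table for the four combinations of sugars.
for lactose, glucose in product([False, True], repeat=2):
    state = "ON" if lac_operon_transcribed(lactose, glucose) else "OFF"
    print(f"lactose={lactose!s:5} glucose={glucose!s:5} -> {state}")
```

    Only the combination of lactose present and glucose absent returns ON, matching the description of the operon's default OFF state.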

    Control of the trp operon by repression

    The lac operon we have just described is a set of genes that are expressed only under the specific conditions of glucose depletion and lactose availability. Other genes may be expressed unless a particular condition is met. For these genes, the default state is ON.

    An example of this is the trp operon, which encodes enzymes necessary for the synthesis of the amino acid tryptophan. These genes are constitutively expressed (always on), except when tryptophan is available from the cell's surroundings, making its synthesis unnecessary. Under conditions where tryptophan is abundant in the environment, the trp genes can be turned off. This is achieved by a repressor protein that will bind to the operator only in the presence of tryptophan (Figure 7.110). Binding of tryptophan to the repressor causes binding of the repressor to the operator. Because it acts together with the repressor to turn off the trp genes, tryptophan is called a co-repressor.


    Another mechanism that regulates the expression of the trp operon is attenuation. Attenuation is a process by which the expression of an operon is controlled by termination of transcription before the first gene of the operon (Figure 7.111).

    In the trp operon, this functions as follows: Transcription begins some distance upstream of the first gene in the operon, producing what is termed a 5’ leader sequence. This leader sequence contains an intrinsic terminator that can form a hairpin structure that stops transcription when high levels of tryptophan are available to the cells. It can also form a different structure that permits continued transcription of the genes in the operon when tryptophan levels are low. How does the level of tryptophan influence which of these two structures is formed?

    Recall that the 5’ end of the RNA is the first part of the transcript to be made and that in bacteria translation is linked to transcription, so the 5’ end of the RNA begins to be translated before the entire transcript is made. It turns out that the 5’ leader sequence of the trp operon mRNA encodes a short peptide that contains two tryptophan codons. If there is plenty of tryptophan available, the leader sequence will be easily translated. Under these conditions, the leader sequence is able to form the termination hairpin, preventing the transcription of the downstream trp genes.

    If, however, levels of tryptophan are low, then the ribosome stalls as it attempts to translate the leader sequence. Under these conditions, the leader sequence adopts a different conformation that permits continued transcription of the genes of the trp operon.
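
    The two layers of trp operon control, repression and attenuation, can be summarized in a short qualitative sketch. The function and key names below are hypothetical, and reducing tryptophan to a single high/low flag is an assumption made for illustration; a boolean cannot show any difference in sensitivity between the two mechanisms.

```python
def trp_operon_state(tryptophan_high: bool) -> dict:
    """Qualitative state of the trp operon (illustrative simplification)."""
    # Repression: the repressor binds the operator only with its co-repressor, tryptophan.
    repressor_on_operator = tryptophan_high
    # Attenuation: with little tryptophan, the ribosome stalls while translating the leader.
    ribosome_stalls_in_leader = not tryptophan_high
    # The termination hairpin forms only when the leader is translated without stalling.
    terminator_hairpin_forms = not ribosome_stalls_in_leader
    genes_transcribed = not repressor_on_operator and not terminator_hairpin_forms
    return {
        "repressor_on_operator": repressor_on_operator,
        "ribosome_stalls_in_leader": ribosome_stalls_in_leader,
        "terminator_hairpin_forms": terminator_hairpin_forms,
        "genes_transcribed": genes_transcribed,
    }


for level in (True, False):
    print("tryptophan high" if level else "tryptophan low", trp_operon_state(level))
```

    In contrast to the lac operon, the genes here are transcribed by default and are switched off only when the end product is plentiful.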


    Similar in concept to the attenuation of the trp operon described above, but not dependent on translation, is a control mechanism called a riboswitch (Figure 7.113). Riboswitches are typically found in the 5'UTR of messenger RNAs (i.e., they are part of the sequence of the RNA). These sequences can control transcription of the downstream genes based on the conformation they adopt. One conformation allows continued transcription, while the other terminates it. So, what determines which conformation they adopt?


    Riboswitches have two characteristic features that are important for their function. One is a region of the sequence called an aptamer, which folds into a three-dimensional shape that can bind a small effector molecule. The other is an adjacent region of the RNA, called the expression platform, that can fold into different conformations depending on whether or not the aptamer is bound to the effector.

    An example of a riboswitch found in bacteria is the guanine riboswitch, which controls the expression of genes required for purine biosynthesis. The aptamer region of this riboswitch binds to the effector, guanine, when levels of the base are high. The binding of the guanine triggers a change in the folding of the downstream expression platform, causing it to adopt a conformation that terminates transcription of the genes needed for the synthesis of guanine. In the absence of guanine, the expression platform assumes a different conformation that allows transcription of the purine biosynthesis genes. Thus, levels of guanine can be sensed and the genes needed for its synthesis can be expressed as needed.

    Regulation of transcription in eukaryotes

    Transcription in eukaryotes is also regulated by the binding of proteins to specific DNA sequences, but with some differences from the simple schemes outlined above.

    For most eukaryotic genes, general transcription factors and RNA polymerase (i.e., the transcription initiation complex) are necessary but not sufficient for high levels of transcription. Promoter-proximal DNA sequences like the CAAT box and GC box bind proteins that interact with the transcription initiation complex, influencing its formation (Figure 7.114).

    Distant regulatory sequences

    Additional regulatory sequences called enhancers and the proteins that bind to them are needed to achieve high levels of transcription. Enhancers are short DNA sequences that regulate the transcription of genes, but may be located at a distance from the gene they control (although they are on the same DNA molecule as the gene). Often enhancers are many kilobases away on the DNA, either upstream or downstream of the gene. As the name suggests, enhancers can enhance (increase) transcription of a particular gene. How can a DNA sequence far from the gene being transcribed affect the level of transcription?

    Transcriptional activators

    Enhancers work by binding proteins (transcriptional activators) that can, in turn, interact with the proteins bound at the promoter. The enhancer region of the DNA, with its associated transcriptional activator(s), can come in contact with the transcription initiation complex that is bound at a distant site by looping of the DNA (Figure 7.115). This allows the protein bound at the enhancer to make contact with the proteins in the basal transcription complex. The interaction of the activator with the transcription initiation complex may be direct, or it may be through a “middle-man”, a protein complex called mediator.

    One effect of this interaction is to assist in recruiting proteins necessary for transcription, like the general transcription factors and RNA polymerase to the promoter, increasing the frequency and efficiency of formation of the transcription initiation complex. There is also evidence that at some promoters, following assembly of the transcription initiation complex, the RNA polymerase remains stalled at the promoter. In such cases, the interaction with the transcription initiation complex of an activator bound to an enhancer could play a role in facilitating the transition of the RNA polymerase to the elongation phase of transcription.

    Chromatin remodeling proteins

    Another mechanism by which activators bound at the enhancer can affect transcription is by recruiting to the promoter proteins that can modify the structure of that region of the chromosome. In eukaryotes, DNA is packaged with proteins to form chromatin. When the DNA is tightly associated with these proteins, it is difficult to access for transcription. So proteins that can make the DNA more accessible to the transcription machinery can also play a role in the extent to which transcription occurs.


    In addition to enhancers, there are also negative regulatory sequences called silencers. Such regulatory sequences bind to transcriptional repressor proteins. Like the transcriptional activators, these repressors work by interacting with the transcription initiation complex. In the case of repressors, the effect they have on the transcription initiation complex is to reduce transcription.

    DNA binding proteins

    Transcriptional activators and repressors are modular proteins: they have a part that binds DNA and a part that activates or represses transcription by interacting with the transcription initiation complex (Figure 7.118). The DNA binding domain is the part of the protein that confers specificity for determining which gene(s) will be activated or repressed. The activation domain is the part of the protein that stimulates or represses transcription. The DNA binding domains of transcriptional activators form characteristic structures that recognize their target DNA sequences by making contacts with bases, usually in the major groove of the DNA helix. It is possible to engineer hybrid transcription factors that combine the DNA binding domain of one activator with the activation domain of another. Such proteins retain the specificity dictated by the DNA binding domain. Truncated transcription factors can also be generated that have their DNA binding domain but lack the activation domain. Such transcription factors can be useful tools in studying transcriptional regulation because their DNA binding domains can compete with the endogenous transcription factors for regulatory binding sites without increasing transcription from the target promoters.

    Multiple factors

    The description above may suggest that each gene in eukaryotes is controlled by the binding of a single transcriptional activator or repressor to a particular enhancer or silencer site. However, it turns out that the transcription of any given gene may be simultaneously regulated by a combination of proteins, both activators and repressors, bound at multiple regulatory sites on the DNA, all of which interact with the transcription initiation complex. The combinatorial nature of such regulation provides great versatility, with different combinations of regulatory elements and proteins working together in response to a wide variety of conditions and signals.

    The mechanisms described so far have focused on the sequence elements in DNA that regulate transcription through the activator and repressor proteins bound to them. Following transcription, alternative splicing and editing of the transcripts can also modify the proteins that are produced by the cell. We will now examine some of the other ways in which gene expression is modulated in cells.

    First, we will consider some so-called epigenetic mechanisms that affect gene expression. The term epigenetics derives from epi (above, or on top of) and genetic (of genes) and refers to the fact that these mechanisms act in addition to, or overlaid on, the information in the gene sequences. Two such epigenetic mechanisms are the covalent modifications of histones in chromatin and the methylation of DNA sequences.

    Histone modification

    As noted earlier, transcription in eukaryotes is complicated by the fact that the DNA is packaged with histones to make chromatin. This means that for a gene to be transcribed, the relevant regions of the chromatin must be opened up to allow access to the RNA polymerase and transcription factors. This provides another potential point of control of gene expression. Chromatin remodeling factors, mentioned earlier, assist in reorganizing the nucleosome structure at regions that need to be made accessible.

    But what determines that a given region of the chromatin will be acted upon by the remodeling complexes? Transcriptional activator proteins bound at enhancers sometimes work by recruiting histone modifying enzymes to the promoter region. An example of such a modifying enzyme is histone acetyl transferase (HAT), which acetylates specific amino acid residues in the tails of the histones forming the nucleosome core (Figures 7.119 & 7.120). Acetylation of histones is thought to loosen the interaction between histones and the DNA in nucleosomes, making the DNA more readily accessible for transcription. The opposite effect may be achieved if the enzymes recruited are histone deacetylases (HDACs), which remove acetyl groups from the tails of the histones in the nucleosome and lead to tighter packing of the chromatin.

    Writers, readers and erasers

    In addition to the histone acetyl transferases and the deacetylases, other enzymes may add or remove methyl groups, phosphate groups, and other chemical moieties to specific amino acid side chains on the histone tails. The patterns of these covalent modifications, sometimes called the histone code, are established by the so-called "writers", or enzymes, such as histone methyltransferases, that add the chemical groups on to the histone tails. Yet other enzymes, like the histone demethylases, may act as "erasers," removing the chemical groups added by the "writers." The histone code is interpreted by "readers," proteins that bind to specific combinations of the modifications and assist in either silencing the expression of genes in the vicinity or making the region more transcriptionally active.

    DNA methylation

    Gene expression can also be regulated by methylation of the other component of chromatin - DNA. Enzymes called DNA methyltransferases (DNMTs) catalyze the covalent addition of a methyl group to C5 of cytosines in DNA. Patterns of cytosine methylation vary in different organisms, with methylation concentrated in some parts of the genome in some groups and scattered throughout the genome in others. In vertebrates, the cytosines that are methylated are generally next to a guanine (the CG dinucleotide is commonly abbreviated as CpG). Methylation of DNA seems to correlate with gene silencing while demethylation is associated with increased transcription (Figure 7.121).
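
    As a small illustration of what a CpG site is at the sequence level, the sketch below lists the positions of CG dinucleotides on one strand of a DNA sequence. The function name and the example sequence are hypothetical; finding a CpG site says nothing about whether it is actually methylated, which must be measured experimentally.

```python
def cpg_positions(dna: str) -> list[int]:
    """Return 0-based positions of each C that is immediately followed by G (5' to 3')."""
    seq = dna.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


example = "ATCGGCGTACGA"  # hypothetical sequence, written 5' to 3'
print(cpg_positions(example))  # [2, 5, 9]
```

    Each position returned marks a cytosine that, in vertebrates, is a potential substrate for a DNA methyltransferase.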

    How does methylation of the DNA at CpG sites regulate gene expression? Although the extent of DNA methylation near promoters has been observed to correlate with gene silencing, it is not clear how exactly methylation brings about this effect. It has been suggested that methylation could block the binding of proteins necessary for transcription. Methylation at enhancer sites might also prevent the binding of transcriptional activators to them.

    Another interesting observation is that certain proteins that bind to methylated CpG sites also seem to interact with histone deacetylases. As noted above, histone deacetylases remove acetyl groups from histones, and promote tighter packing of chromatin and transcriptional silencing. Thus, methylation of DNA likely works in combination with histone modification to affect gene expression.

    Regulatory RNAs

    One of the most unexpected discoveries in the past few decades has been the role that RNAs play in regulating gene expression. The classic view that RNA either encoded proteins (mRNA) or assisted in their synthesis (rRNA and tRNA) is now known to be a vast underestimate of the various ways in which RNAs function in gene expression. It is now clear that regulatory RNAs have widespread and significant effects on gene expression, a realization that has revolutionized our understanding of gene regulation.

    What are some of the ways in which regulatory RNAs function to modulate the expression of genes?

    Small regulatory RNAs

    MicroRNAs (miRNAs) and Short Interfering RNAs (siRNAs) are small, non-coding RNAs that act at the post-transcriptional level to regulate gene expression (Figures 7.123 & 7.124). These RNAs appear to silence genes by base-pairing with target mRNAs and marking them for degradation, or by blocking their translation. The functional forms of both miRNAs and siRNAs are 20 to 30 nucleotides long and are derived by processing from longer primary transcripts. Mature miRNAs and siRNAs work in association with a class of proteins called Argonaute proteins to form a gene silencing complex.

    MicroRNAs are transcribed from specific genes by RNA polymerase II. The primary transcript, known as a pri-miRNA, folds on itself to form double-stranded hairpin structures that are cleaved by an RNase in the nucleus called Drosha. The products of Drosha cleavage, double-stranded RNAs of roughly 60-70 nucleotides known as pre-miRNAs, are exported to the cytoplasm, where they are further processed into mature double-stranded miRNAs of 20 to 30 nucleotides by an enzyme known as Dicer. The RNA duplexes of miRNAs are not perfectly matched, and have loops and mismatches (Figure 7.124).

    siRNAs also derive from double-stranded RNAs, but these may arise from either endogenous or exogenous sources (such as viruses). These double-stranded RNAs are processed in the cytoplasm by the same enzyme, Dicer, that generates the mature miRNAs, to produce the small, 20-30 nucleotide double-stranded RNAs.

    In contrast to miRNAs, the mature siRNAs are perfectly base-paired along their lengths.

    RISC assembly

    Both miRNAs and siRNAs are then assembled with Argonaute proteins to form a silencing complex called RISC (RNA-induced silencing complex). Recall that both miRNAs and siRNAs are, at this point, double-stranded. One strand of the RNA is referred to as the guide RNA, while the other is called the passenger RNA.

    During the process of loading the RNA onto the Argonaute protein, the guide strand of the RNA remains associated with the protein, while the passenger strand is removed. The guide RNA associated with the Argonaute protein is the functional gene silencing complex (Figure 7.125).

    Sequence-specific base-pairing of the guide RNA with an mRNA leads either to degradation of the mRNA by the Argonaute protein (in the case of siRNAs) or to suppression of translation of the mRNA (for miRNAs). The extent to which these processes play a role in regulating gene expression is impressive. The expression of at least a third of all human genes has already been shown to be modulated by miRNAs, demonstrating clearly that these RNAs play a major role in gene regulation.
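
    Base-pairing between a guide RNA and its target can be illustrated with a short sketch that counts Watson-Crick pairs when the two strands are aligned antiparallel. The sequences and function names are hypothetical, G-U wobble pairs are deliberately not counted, and the sketch makes no claim about how many pairs a real silencing complex needs in order to act.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the strand that pairs perfectly with `rna`, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def paired_positions(guide: str, target_site: str) -> int:
    """Count Watson-Crick pairs between a guide and a target site (both 5' to 3')."""
    if len(guide) != len(target_site):
        raise ValueError("guide and target site must have the same length")
    # Antiparallel alignment: the guide's 5' end pairs with the target's 3' end.
    return sum(COMPLEMENT[g] == t for g, t in zip(guide, reversed(target_site)))


target_site = "AUGGCUAGCUAGGCUUAGCCA"          # hypothetical 21-nt site in an mRNA
perfect_guide = reverse_complement(target_site)
mismatched_guide = "A" + perfect_guide[1:]      # first base changed from U to A

print(paired_positions(perfect_guide, target_site))     # 21
print(paired_positions(mismatched_guide, target_site))  # 20
```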

    Long noncoding RNAs

    Long noncoding RNAs (lncRNAs) are RNAs of greater than 200 nucleotides that do not code for proteins. Some of these RNAs are derived from intron sequences, while others, transcribed from intergenic regions, form a subset of lncRNAs called lincRNAs (long intergenic noncoding RNAs). Yet other lncRNAs are produced as antisense transcripts of coding genes. An astounding 30,000 transcripts in humans are thought to be lncRNAs, but little is known of their function. From the few lncRNAs that have been intensively studied, it is evident that they do not all function in the same way. However, they appear to affect gene expression in a variety of ways including modification of chromatin structure, regulation of splicing, or serving as structural scaffolds for the assembly of nucleoprotein complexes. Additional mechanisms will doubtless be uncovered as these fascinating RNAs are investigated in years to come.

    Regulation of translation

    The synthesis of proteins is dependent on the availability of the mRNAs encoding them. If an mRNA is blocked at its 5' end, it cannot be translated. The rate of degradation of an mRNA will influence how long it is around to direct the synthesis of the protein it codes for. Gene expression can also, therefore, be regulated by mechanisms that alter the rate of mRNA degradation.

    Regulation of translation is used to control the production of many proteins. Two examples, ferritin and the transferrin receptor, are important for iron storage and transport in cells. Ferritin is an iron-binding protein that sequesters iron atoms in cells to keep them from reacting. When iron levels are high, there is a need for more ferritin than when iron levels are low.

    How are ferritin levels regulated? The 5'UTR of the ferritin mRNA contains a 28-nucleotide sequence called the Iron Response Element, or IRE (Figure 7.127). When iron levels are low, the IRE is bound by a protein. The presence of the IRE-binding protein at the 5'UTR blocks translation of the ferritin mRNA. However, if iron levels are high, the iron binds to the IRE-binding protein, which undergoes a conformational change and dissociates from the IRE. This frees up the 5' end of the ferritin mRNA for ribosome assembly and translation, producing more ferritin.

    The other protein involved in iron transport, the transferrin receptor, is required for uptake of iron into cells, when intracellular iron levels are low. In the case of the transferrin receptor, it is when iron levels are low that more of it is needed. When iron levels are high, there is no need to make more transferrin receptor. The mRNA encoding the transferrin receptor also has IRE sequences, but in this case, the IRE is situated in the 3'UTR of the transcript (Figure 7.128). The IRE is, as in the case of ferritin, bound by the IRE-binding protein. When iron levels in the cell are high, the iron binds the IRE-binding protein, which dissociates from the IRE. This leaves the 3'UTR susceptible to attack by RNases, leading to degradation of the transferrin receptor mRNA. At times when iron levels are low, the IRE-binding protein remains bound to the 3' UTR of the mRNA, stabilizing it and permitting more transferrin receptor to be made by translation.
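
    The opposite outcomes for ferritin and the transferrin receptor follow from where the IRE sits in each mRNA. The sketch below restates that logic qualitatively; the function and key names are hypothetical, and iron is reduced to a high/low flag purely for illustration.

```python
def iron_response(iron_high: bool) -> dict:
    """Qualitative effect of iron on ferritin and transferrin receptor production."""
    # Iron binds the IRE-binding protein and makes it dissociate from the IRE.
    ire_binding_protein_on_ire = not iron_high
    # Ferritin mRNA: IRE in the 5'UTR, so the bound protein blocks translation.
    ferritin_mrna_translated = not ire_binding_protein_on_ire
    # Transferrin receptor mRNA: IRE in the 3'UTR, so the bound protein shields it from RNases.
    transferrin_receptor_mrna_stable = ire_binding_protein_on_ire
    return {
        "ire_binding_protein_on_ire": ire_binding_protein_on_ire,
        "ferritin_mrna_translated": ferritin_mrna_translated,
        "transferrin_receptor_mrna_stable": transferrin_receptor_mrna_stable,
    }


for level in (True, False):
    print("iron high" if level else "iron low", iron_response(level))
```

    One sensor protein thus produces more iron storage when iron is plentiful and more iron uptake when it is scarce.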

    Gene expression is controlled at many steps

    As can be seen from the examples in this section, regulation of gene expression in eukaryotic cells is a function of multiple mechanisms that act at different stages in the flow of information from DNA to protein, responding to the internal state of the cell as well as external conditions and signals.

    Information Processing: Gene Expression





    Figure 7.102 - Multiple levels of control of gene expression



    Figure 7.103 - Prokaryotic genes organized in an operon


    Figure 7.104 - Protein binding sites in the lac regulatory region

    Image by Martha Baker





    Figure 7.105 - Lac operon structure and products

    Image by Martha Baker

    Figure 7.106 - Lac operon in the absence (middle) and presence (bottom) of inducer

    Image by Martha Baker


    Figure 7.107 - Allolactose (top) and lactose (bottom)

    Figure 7.108 - CAP (blue) bound to the DNA adjacent to the lac promoter (orange). cAMP shown in pink.


    Figure 7.109 - Lac operon in the presence (top) and absence (bottom) of glucose

    Image by Martha Baker


    Figure 7.110 - Structure and regulation of the trp operon





    Figure 7.111 - Attenuation in regulation of the trp operon


    Figure 7.112 - Sequence of the leader region of the trp operon




    Figure 7.113 - Riboswitch features


    Figure 7.114 - Regulatory sequences for a eukaryotic gene



    Figure 7.115 - DNA looping allows contact between activator bound at a distant enhancer and the basal transcription complex

    Image by Martha Baker


    Figure 7.116 - Transcription factors in regulation of eukaryotic transcription





    Figure 7.117 - Binding of c-myc protein to its target DNA sequence


    Figure 7.118 - Activators bound at multiple sites can regulate transcription from a given promoter



    Figure 7.119 - Transcriptional activation (right) and deactivation (left) by histone modification



    Figure 7.120 - Chromatin configuration affects transcription






    Figure 7.121 - Inactivation of transcription by CpG methylation

    Image by Indira Rajagopal




    Figure 7.122 - Epigenetic changes through histone and DNA modification


    Figure 7.123 - miRNAs function in the regulation of gene expression


    Figure 7.124 - Pre-miRNA hairpin structures with the mature guide miRNAs shown in red



    Figure 7.125 - Gene silencing by siRNA

    Image by Pehr Jacobson


    Figure 7.126 - Processed siRNA duplex with perfect base-pairing, 5' phosphates and two bases overhanging at each 3' end


    Figure 7.127 - Regulation of ferritin mRNA translation

    Image by Aleia Kim




    Figure 7.128 - Regulation of transferrin receptor mRNA translation

    Image by Aleia Kim

    Graphic images in this book were the work of several talented students: Martha Baker, Pehr Jacobson, Aleia Kim, and Penelope Irving.


    God Bless These Complexes

    To the tune of “God Bless America”


    All information in

    Cells’ DNA

    Just increases

    With pieces

    Mixed and matched in the mRNAs

    Linking exons

    All together

    Using snurps in


    God bless the spliceosomes

    And trans-crip-tomes

    (slow and loud) God bless the spliceosomes

    And my ge-nome

    Your blueprint info is

    In DNA

    Since you need it

    Proofread it

    Or you’ll mutate the mRNA

    You can translate

    All the codons

    With the cells’ gen-

    et-ic code

    God bless the ribosomes

    They translate code

    (slow and loud) God bless the ribosomes

    And proteomes

    Recording by David Simmons

    Lyrics by Kevin Ahern

    The Book of Life

    To the tune of “The Look of Love”


    The book of life - the stuff of dreams

    Is everywhere, it seems

    The book of life, is biochemistry and

    Its words fill every day

    Just what it says is written in the DNA

    I just want to get to know it

    How the info’s coded

    What are all the secrets?

    Ribosomes can read it

    Goodness knows it’s needed

    And so its alphabet’s

    In codon forms

    For ribosome bookworms

    They read it right

    A protein’s function to its sequence corresponds

    It’s not just randomly created peptide bonds

    What a marvel of creation, how they do translation

    Of m-R-N-A chains,

    Using bits of glycine

    Proline and some lysine

    Translate the code


    I just marvel at the knowledge

    That I got in college

    To learn all the secrets

    Double helix spaces

    Complementary bases


    Paired to purines

    The book of life

    Recording by Carol Adriane Smith

    Lyrics by Kevin Ahern

    This page titled 4.9: Gene Expression is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Kevin Ahern, Indira Rajagopal, & Taralyn Tan.
