The central dogma of molecular biology, DNA to RNA to protein, explains how information encoded in our DNA is used to build an organism. A gene made of DNA is transcribed into messenger RNA, which is then translated into a protein at the ribosome, where ribosomal RNA and transfer RNAs carrying amino acids assemble the polypeptide chain. Although in essence the central dogma remains true, studies of genes and proteins are revealing a complexity that we had never imagined. For example, distinct genes are expressed in different cell types, and the physiological state of the cell alters which proteins are produced and at what level. Furthermore, chemical changes to proteins (e.g., phosphorylation) occur after translation and are critical to a protein’s function. The importance and diversity of proteins gave rise to a whole new field termed proteomics.
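The DNA to RNA to protein flow described above can be made concrete with a small Python sketch. It is only an illustration: the function names `transcribe` and `translate` and the example sequence are hypothetical, the codon table is a small subset of the standard genetic code, and real translation involves far more machinery than a table lookup.

```python
# Illustrative subset of the standard genetic code (mRNA codon -> amino acid).
CODON_TABLE = {
    "AUG": "Met",  # also the start codon
    "UUU": "Phe",
    "GGC": "Gly",
    "GCU": "Ala",
    "UAA": "Stop",
    "UAG": "Stop",
    "UGA": "Stop",
}


def transcribe(coding_strand):
    """Return the mRNA for a DNA coding strand (same sequence, U replaces T)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna):
    """Read codons from the first AUG until a stop codon; return amino acids."""
    start = mrna.find("AUG")
    if start == -1:
        return []  # no start codon, so nothing is translated
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        # Codons missing from this small table are reported as "Xaa" (unknown).
        residue = CODON_TABLE.get(mrna[i:i + 3], "Xaa")
        if residue == "Stop":
            break
        peptide.append(residue)
    return peptide


# Hypothetical 15-base gene fragment, chosen only for illustration.
mrna = transcribe("ATGTTTGGCGCTTAA")
print(mrna)             # AUGUUUGGCGCUUAA
print(translate(mrna))  # ['Met', 'Phe', 'Gly', 'Ala']
```

The sketch covers only the sequence-level mapping. Which genes are expressed, at what level, and how the resulting protein is modified after translation are not encoded here, and those are the questions proteomics addresses.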
Proteomics is the study of proteins, particularly their structures and functions. The term was coined by analogy with genomics. The Human Genome Project, started in 1990 and completed in 2003, sequenced the roughly three billion base pairs of the human genome. The entire set of proteins produced by an organism throughout its life cycle or, on a smaller scale, the entire set of proteins found in a particular cell type under a specific set of conditions, is referred to as the proteome.
Proteomics is much more complicated than genomics for several reasons. The genome is a relatively constant entity, while the proteome differs from cell to cell and changes constantly through its biochemical interactions with the genome and the environment. Consequently, the proteome reflects the particular stage of development or the current environmental condition of the cell or organism. One organism will have radically different protein expression in different parts of the body, in different stages of its life cycle, and in different environmental conditions. For example, when E. coli cells are grown at elevated temperature, a class of proteins known as heat shock proteins is upregulated. Many members of this group perform a chaperone function, stabilizing new proteins to ensure correct folding or helping to refold proteins that were damaged by the cell stress. Ultimately, the comparison of proteomes of healthy and diseased tissues may identify the molecular nature of a disease and provide potential new targets for drug development. The field of proteomics also presents many analytical challenges when compared to genomics. In DNA there are only four nucleotide bases, with similar molecular weights and properties. In a proteome there are thousands of different proteins with a wide range of concentrations, molecular weights, and properties.
Proteomics was initially defined as the effort to catalog all the proteins expressed in all cells at all stages of development. That definition has now been expanded to include the study of protein functions, protein-protein interactions, cellular locations, expression levels, and post-translational modifications of all proteins within all cells and tissues at all stages of development. It is hypothesized that a large amount of the non-coding DNA in the human genome functions to regulate protein production, expression levels, and post-translational modifications. On this view, it is the regulation of our complex proteomes, rather than our genes, that makes us different from simpler organisms with a similar number of genes. An international collaboration of scientists in the Human Proteome Project (HPP) is working to characterize all 20,300 protein-coding genes of the known genome and generate a map of the protein-based molecular architecture of the human body. Completion of this project will enhance understanding of human biology at the cellular level and lay a foundation for development of diagnostic, prognostic, therapeutic, and preventive medical applications.
1. Define the term proteome.
A. The proteome is the entire set of proteins that an organism expresses.
2. Define the term proteomics.
A. Proteomics is the large-scale, systematic study of a proteome and the information associated with it, including cellular locations, protein abundances, modifications, and interacting partners and networks.
3. Why is the analysis of proteins in a cell more difficult than sequencing DNA?
A. The proteome is constantly changing based on the state of the cell.
Proteins have widely varying concentrations, molecular weights, and properties.
There are thousands of different proteins in a cell requiring separation methods.
4. What types of questions can be answered by studying the proteome?
Example 1: How does protein expression change between different types of cells and stages of development?
Example 2: What protein(s) are related to a disease? What drug would be a useful treatment? If a certain protein is implicated in a disease, its three-dimensional structure provides information needed to design drugs that interfere with the action of the protein. A molecule that fits the active site of an enzyme, but cannot be released by the enzyme, inactivates the enzyme. This is the basis of new drug-discovery tools, which aim to find new drugs to inactivate proteins involved in disease.
Example 3: What post-translational modifications occur and how are they involved in the regulatory mechanism of the cell?
Read the following research paper to learn how the field of proteomics can be useful in the treatment of cancer.
Comparative proteomics of oral cancer cell lines: identification of cancer associated proteins, Karsani et al, Proteome Science, 2014, 12:3. doi:10.1186/1477-5956-12-3 (Open access journal)
1. What was the goal of the scientific study reported in the paper?
A. To determine differences in protein expression in oral cancer cells compared to normal healthy cells, with the aim of identifying cancer-associated proteins.
2. Why would a study of the change in oral cancer proteins add to our understanding of the disease?
A. It is known that the mechanism of oral cancer involves an activation of oncogenes which results in a change in expression of various proteins. This research could help identify candidate proteins for biomarkers and early detection.
3. Refer to the results and discussion section and Figure 1 to answer the following questions.
a. How were the proteins in the healthy and cancerous cells separated and detected?
A. Two-dimensional gel electrophoresis
First dimension separates by isoelectric point (the pH at which the protein carries no net charge)
Second dimension separates by size (molecular weight, MW)
b. How many individual protein spots were resolved on the silver stained gels?
A. More than 1000
c. How many protein spots exhibited a significant difference in abundance from normal cells to cancerous cells?
4. Table 1 is a list of proteins with different abundances in the cancer cell line.
(The section in this module on peptide mass mapping describes how the identity of the protein in the gel spot was determined.)
Examine the data for two structural proteins: Stathmin (STMN1) and myosin regulatory light chain-2 (ML12A).
What is the change in abundance for each protein? Can this change be visualized from the image of the spot?
How is the change quantified?
A. Stathmin (STMN1) increased (+1.6). The change is visible in the gel image: the spot in the cancerous cell line is darker and larger than the spot in the normal cell line.
Myosin regulatory light chain-2 (ML12A) decreased (-1.9). The spot in the cancerous cell line is lighter and smaller than the spot in the normal cell line.
5. The simplified 2D gel shown represents the proteins from a healthy cell line.
Draw a new 2D gel which could represent the changes in protein expression that occur in a cancerous cell line.
A. A representative 2D gel is shown.
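Question 4 asks how the change in abundance is quantified. The values quoted there (+1.6 for STMN1, -1.9 for ML12A) read like signed fold changes, in which an increase is reported as the cancer-to-normal ratio and a decrease as the negative of the normal-to-cancer ratio. The Python sketch below assumes that convention. The function name `signed_fold_change` and the spot volumes are hypothetical, chosen only to reproduce the quoted values. They are not data from the paper, and the paper's own normalization procedure may differ.

```python
def signed_fold_change(normal, cancer):
    """Signed fold change between two spot volumes (assumed convention).

    Returns cancer/normal when abundance rises or is unchanged, and
    -(normal/cancer) when it falls, so the magnitude is always >= 1.
    """
    if normal <= 0 or cancer <= 0:
        raise ValueError("spot volumes must be positive")
    ratio = cancer / normal
    return ratio if ratio >= 1 else -1 / ratio


# Hypothetical normalized spot volumes (arbitrary units), not measured data.
spots = {
    "STMN1": (1000.0, 1600.0),  # (normal, cancer)
    "ML12A": (1900.0, 1000.0),
}

for name, (normal, cancer) in spots.items():
    print(f"{name}: {signed_fold_change(normal, cancer):+.1f}")
# STMN1: +1.6
# ML12A: -1.9
```

Under this convention no value falls between -1 and +1, which is why a protein with roughly half its normal abundance is reported as about -2 rather than 0.5.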