Describe how short chain proteins (polypeptides) are formed from amino acids.
Proteins may be defined as compounds of high molar mass consisting largely or entirely of chains of amino acids. Their masses range from several thousand to several million daltons (Da). In addition to carbon, hydrogen, and oxygen atoms, all proteins contain nitrogen atoms, nearly all contain sulfur atoms, and many also contain phosphorus atoms and traces of other elements. Proteins serve a variety of roles in living organisms and are often classified by these biological roles. Muscle tissue is largely protein, as are skin and hair. Proteins are present in the blood, in the brain, and even in tooth enamel. Each type of cell in our bodies makes its own specialized proteins, as well as proteins common to all or most cells. We begin our study of proteins by looking at the properties and reactions of amino acids, followed by a discussion of how amino acids link covalently to form peptides and proteins. We end the chapter with a discussion of enzymes, the proteins that act as catalysts in the body.
The proteins in all living species, from bacteria to humans, are constructed from the same set of 20 amino acids, so called because each contains an amino group and a carboxylic acid group. The amino acids in proteins are α-amino acids, which means the amino group is attached to the α-carbon of the carboxylic acid. Two additional amino acids have been found in limited quantities in proteins: selenocysteine was discovered in 1986, and pyrrolysine was discovered in 2002. Humans can synthesize only about half of the needed amino acids; the remainder must be obtained from the diet and are known as essential amino acids.
The amino acids are colorless, nonvolatile, crystalline solids, melting and decomposing at temperatures above 200°C. These melting temperatures are more like those of inorganic salts than those of amines or organic acids and indicate that the structures of the amino acids in the solid state and in neutral solution are best represented as having both a negatively charged group and a positively charged group. Such a species is known as a zwitterion.
Classification
In addition to the amino and carboxyl groups, amino acids have a side chain or R group attached to the α-carbon. Each amino acid has unique characteristics arising from the size, shape, solubility, and ionization properties of its R group. As a result, the side chains of amino acids exert a profound effect on the structure and biological activity of proteins. Although amino acids can be classified in various ways, one common approach is to classify them according to whether the functional group on the side chain at neutral pH is nonpolar, polar but uncharged, negatively charged, or positively charged. The structures and names of the 20 amino acids, their one- and three-letter abbreviations, and some of their distinctive features are given in Table \(\PageIndex{1}\).
Table \(\PageIndex{1}\): Common Amino Acids Found in Proteins (structural formulas at pH 6 not shown)

| Common Name | Abbreviation | Molar Mass | Distinctive Feature |
|---|---|---|---|
| Amino acids with a nonpolar R group | | | |
| glycine | gly (G) | 75 | the only amino acid lacking a chiral carbon |
| alanine | ala (A) | 89 | — |
| valine | val (V) | 117 | a branched-chain amino acid |
| leucine | leu (L) | 131 | a branched-chain amino acid |
| isoleucine | ile (I) | 131 | an essential amino acid because most animals cannot synthesize branched-chain amino acids |
| phenylalanine | phe (F) | 165 | also classified as an aromatic amino acid |
| tryptophan | trp (W) | 204 | also classified as an aromatic amino acid |
| methionine | met (M) | 149 | side chain functions as a methyl group donor |
| proline | pro (P) | 115 | contains a secondary amine group; referred to as an α-imino acid |
| Amino acids with a polar but neutral R group | | | |
| serine | ser (S) | 105 | found at the active site of many enzymes |
| threonine | thr (T) | 119 | named for its similarity to the sugar threose |
| cysteine | cys (C) | 121 | oxidation of two cysteine molecules yields cystine |
| tyrosine | tyr (Y) | 181 | also classified as an aromatic amino acid |
| asparagine | asn (N) | 132 | the amide of aspartic acid |
| glutamine | gln (Q) | 146 | the amide of glutamic acid |
| Amino acids with a negatively charged R group | | | |
| aspartic acid | asp (D) | 132 | carboxyl groups are ionized at physiological pH; also known as aspartate |
| glutamic acid | glu (E) | 146 | carboxyl groups are ionized at physiological pH; also known as glutamate |
| Amino acids with a positively charged R group | | | |
| histidine | his (H) | 155 | the only amino acid whose R group has a pKa (6.0) near physiological pH |
| lysine | lys (K) | 147 | — |
| arginine | arg (R) | 175 | almost as strong a base as sodium hydroxide |
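The grouping used in the table can be treated as a simple lookup. The following is a minimal Python sketch of that idea; the names `SIDE_CHAIN_CLASS` and `classify` are introduced here purely for illustration, and the memberships are transcribed from the table above.

```python
# Side-chain (R group) classes at neutral pH, transcribed from the table.
# SIDE_CHAIN_CLASS and classify() are illustrative names, not a standard API.
SIDE_CHAIN_CLASS = {
    "nonpolar": ["gly", "ala", "val", "leu", "ile", "phe", "trp", "met", "pro"],
    "polar but neutral": ["ser", "thr", "cys", "tyr", "asn", "gln"],
    "negatively charged": ["asp", "glu"],
    "positively charged": ["his", "lys", "arg"],
}


def classify(abbreviation):
    """Return the R-group class for a three-letter abbreviation such as 'Gly'."""
    key = abbreviation.lower()
    for side_chain_class, members in SIDE_CHAIN_CLASS.items():
        if key in members:
            return side_chain_class
    raise ValueError(f"unknown amino acid abbreviation: {abbreviation!r}")


if __name__ == "__main__":
    for name in ("Gly", "Ser", "Asp", "Lys"):
        print(name, "->", classify(name))
```

Looking up each residue of a peptide this way gives a quick picture of how many nonpolar, polar, and charged side chains it carries.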
The first amino acid to be isolated was asparagine in 1806. It was obtained from protein found in asparagus juice (hence the name). Glycine, the major amino acid found in gelatin, was named for its sweet taste (Greek glykys, meaning “sweet”). In some cases an amino acid found in a protein is actually a derivative of one of the common 20 amino acids (one such derivative is hydroxyproline). The modification occurs after the amino acid has been assembled into a protein.
Zwitterions
The structure of an amino acid allows it to act as both an acid and a base. An amino acid has this ability because at a certain pH value (different for each amino acid) nearly all the amino acid molecules exist as zwitterions. If acid is added to a solution containing the zwitterion, the carboxylate group captures a hydrogen (H+) ion, and the amino acid becomes positively charged. If base is added, removal of the H+ ion from the protonated amino group of the zwitterion produces a negatively charged amino acid. In both circumstances, the amino acid acts to maintain the pH of the system; that is, it removes the added acid (H+) or base (OH−) from solution.
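The two responses described above can be written as proton-transfer reactions. In this sketch, R stands for any side chain, and ionizable side chains are ignored for simplicity (an assumption made here, not a statement from the text).

```latex
\[
\begin{aligned}
\mathrm{H_3N^{+}{-}CHR{-}COO^{-}} + \mathrm{H^{+}}
  &\longrightarrow \mathrm{H_3N^{+}{-}CHR{-}COOH}
  && \text{(added acid: net charge } 0 \to +1\text{)} \\
\mathrm{H_3N^{+}{-}CHR{-}COO^{-}} + \mathrm{OH^{-}}
  &\longrightarrow \mathrm{H_2N{-}CHR{-}COO^{-}} + \mathrm{H_2O}
  && \text{(added base: net charge } 0 \to -1\text{)}
\end{aligned}
\]
```

In each case the zwitterion on the left consumes the added H+ or OH−, which is why the pH of the solution changes less than it otherwise would.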
The Peptide Bond: Peptides and Proteins
Two or more amino acids can join together into chains called peptides. Previously, we discussed the reaction between ammonia and a carboxylic acid to form an amide. In a similar reaction, the amino group on one amino acid molecule reacts with the carboxyl group on another, releasing a molecule of water and forming an amide linkage, as shown in Figure \(\PageIndex{1}\).
An amide bond joining two amino acid units is called a peptide bond. Note that the product molecule still has a reactive amino group on the left and a reactive carboxyl group on the right. These can react with additional amino acids to lengthen the peptide. The process can continue until thousands of units have joined, resulting in large proteins.
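Because each peptide bond forms with the loss of one water molecule, the molar mass of a peptide can be estimated from the masses of its amino acids. The sketch below assumes the rounded molar masses listed in the table of common amino acids and 18 g/mol for water; `MOLAR_MASS` and `peptide_molar_mass` are names introduced here for illustration only.

```python
# Rounded molar masses (g/mol) for a few amino acids, taken from the table
# of common amino acids. Only a subset is included to keep the sketch short.
MOLAR_MASS = {"Gly": 75, "Ala": 89, "Val": 117, "Ser": 105, "Trp": 204}
WATER = 18  # g/mol, released once for every peptide bond formed


def peptide_molar_mass(residues):
    """Estimate the molar mass of a peptide from its list of amino acids.

    A chain of n amino acids contains n - 1 peptide bonds, so n - 1 water
    molecules are subtracted from the sum of the amino acid masses.
    """
    peptide_bonds = len(residues) - 1
    return sum(MOLAR_MASS[name] for name in residues) - WATER * peptide_bonds


if __name__ == "__main__":
    print(peptide_molar_mass(["Gly", "Val"]))         # one peptide bond
    print(peptide_molar_mass(["Gly", "Trp", "Ala"]))  # two peptide bonds
```

The result is an estimate only, since the table values are rounded.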
A chain consisting of only two amino acid units is called a dipeptide; a chain consisting of three is a tripeptide. By convention, peptide and protein structures are depicted with the amino acid whose amino group is free (the N-terminal end) on the left and the amino acid with a free carboxyl group (the C-terminal end) to the right.
The general term peptide refers to an amino acid chain of unspecified length. However, chains of about 50 amino acids or more are usually called proteins or polypeptides. In its physiologically active form, a protein may be composed of one or more polypeptide chains.
Cells in our bodies have an intricate mechanism for the manufacture of proteins. Humans have to use other techniques to synthesize the same proteins in a lab. The chemistry of peptide synthesis is complicated: both active groups on an amino acid can react, and the amino acids must be joined in a specific sequence for the protein to function. Robert Merrifield developed the first synthetic approach for making proteins in the lab, a manual approach that was lengthy and tedious; he won the Nobel Prize in Chemistry in 1984 for his work. Today, however, automated systems can produce a peptide in a very short period of time.
The Sequence of Amino Acids
The particular sequence of amino acids in a longer chain is called an amino acid sequence. By convention, the amino acid sequence is listed in the order such that the free amine group is on the left end of the molecule and the free carboxyl group is on the right end of the molecule.
For example, suppose that a sequence of the amino acids glycine, tryptophan, and alanine is formed with the free amine group as part of the glycine and the free carboxyl group as part of the alanine. The amino acid sequence can be easily written using the abbreviations as Gly-Trp-Ala. This is a different sequence from Ala-Trp-Gly because the free amine and carboxyl groups would be on different amino acids in that case.
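The left-to-right convention can be made concrete with a few lines of Python. This is a minimal sketch that assumes sequences are written as hyphen-separated three-letter abbreviations; the function names `termini` and `same_peptide` are hypothetical and chosen only for this example.

```python
def termini(sequence):
    """Return (N-terminal, C-terminal) amino acids of a sequence like 'Gly-Trp-Ala'.

    By convention the first abbreviation carries the free amine group and
    the last one carries the free carboxyl group.
    """
    residues = sequence.split("-")
    return residues[0], residues[-1]


def same_peptide(first, second):
    """Two sequences describe the same peptide only if the order matches exactly."""
    return first.split("-") == second.split("-")


if __name__ == "__main__":
    print(termini("Gly-Trp-Ala"))                      # N-terminus Gly, C-terminus Ala
    print(termini("Ala-Trp-Gly"))                      # N-terminus Ala, C-terminus Gly
    print(same_peptide("Gly-Trp-Ala", "Ala-Trp-Gly"))  # reversed order, different peptide
```

Comparing the residue lists in order, rather than as unordered sets, is what captures the point that Gly-Trp-Ala and Ala-Trp-Gly are different molecules.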
Bradykinin
Just as millions of different words are spelled with our 26-letter English alphabet, millions of different proteins are made with the 20 common amino acids. However, just as the English alphabet can be used to write gibberish, amino acids can be put together in the wrong sequence to produce nonfunctional proteins. Although the correct sequence is ordinarily of utmost importance, it is not always absolutely required. Just as you can sometimes make sense of incorrectly spelled English words, a protein with a small percentage of “incorrect” amino acids may continue to function. However, it rarely functions as well as a protein having the correct sequence. There are also instances in which seemingly minor errors of sequence have disastrous effects. For example, in some people, every molecule of hemoglobin (a protein in the blood that transports oxygen) has a single incorrect amino acid unit out of about 300 (a single valine replaces a glutamic acid). That “minor” error is responsible for sickle cell anemia, a serious inherited condition that can be fatal.
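A short counting argument shows why so many different proteins are possible. The sketch assumes that any of the 20 common amino acids may occupy each position of a chain of \(n\) units, which is a simplification introduced here.

```latex
\[
N(n) = 20^{\,n}, \qquad
N(3) = 20^{3} = 8000, \qquad
N(5) = 20^{5} = 3.2 \times 10^{6}, \qquad
N(50) = 20^{50} \approx 1.1 \times 10^{65}
\]
```

Even chains of only five amino acids already allow millions of distinct sequences, and a chain of 50 allows far more than could ever be made.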
Example \(\PageIndex{1}\)
Draw the polypeptide Asp-Val-Ser.
Solution
1. Identify the structures of each of the three given amino acids and draw them in the same order as given in the name.
2. Leaving the order the same, connect the amino acids to one another by forming peptide bonds. Note that the order given in the name is the same way the amino acids are connected in the molecule. The first one listed is always the \(\ce{N}\)-terminus of the polypeptide.
Example \(\PageIndex{2}\)
List all of the possible polypeptides that can be formed from cysteine (Cys), leucine (Leu), and arginine (Arg).
Solution
Although there are only three amino acids, the order in which they are bonded changes the identity, properties, and function of the resulting polypeptide. There are six possible polypeptides formed from these three amino acids.
Cys-Leu-Arg
Cys-Arg-Leu
Leu-Cys-Arg
Leu-Arg-Cys
Arg-Cys-Leu
Arg-Leu-Cys
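The six sequences listed above are the orderings (permutations) of three different amino acids, and they can be generated programmatically. The following minimal sketch uses `itertools.permutations` from the Python standard library; the function name `orderings` is introduced here for illustration.

```python
from itertools import permutations


def orderings(residues):
    """Return every sequence that uses each given amino acid exactly once."""
    return ["-".join(order) for order in permutations(residues)]


if __name__ == "__main__":
    peptides = orderings(["Cys", "Leu", "Arg"])
    for peptide in peptides:
        print(peptide)
    print(len(peptides), "possible tripeptides")
```

For three different amino acids there are 3 × 2 × 1 = 6 orderings, in agreement with the list in the solution.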
Exercise \(\PageIndex{1}\)
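Draw the structure for each peptide.
a. gly-val
b. val-gly
Answers (condensed structural formulas, drawn as zwitterions)
a. H3N+–CH2–CO–NH–CH[CH(CH3)2]–COO−
b. H3N+–CH[CH(CH3)2]–CO–NH–CH2–COO−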
Summary
The amino group of one amino acid can react with the carboxyl group on another amino acid to form a peptide bond that links the two amino acids together.
Additional amino acids can be added on through the formation of additional peptide (amide) bonds.
A chain consisting of only two amino acid units is called a dipeptide; a chain consisting of three is a tripeptide.
Chains of about 50 amino acids or more are usually called proteins or polypeptides.
A sequence of amino acids in a peptide or protein is written with the N-terminal amino acid first and the C-terminal amino acid at the end (writing left to right).
The order, or sequence, in which the amino acids are connected is critical for a peptide or protein to be physiologically active.
Contributors and Attributions
Libretext: The Basics of GOB Chemistry (Ball et al.)
Allison Soult, Ph.D. (Department of Chemistry, University of Kentucky)