Skip to main content
Chemistry LibreTexts

Section 3C. Peptide Mass Mapping for Protein Identification

  • Page ID
    79445
  • Peptide mass mapping is a technique that uses powerful search engines (e.g. Mascot) to identify a protein from mass spectrometry data and primary sequence databases. The general approach is to take a small sample of the protein and digest it with a proteolytic enzyme, such as trypsin. Trypsin cleaves the protein after lysine and arginine residues. The resulting mixture of peptides is analyzed by MALDI-TOF mass spectrometry.

    The experimental mass values of the peptides are then compared with theoretical peptide mass values. Theoretical mass values of peptides are obtained by using the genome sequence for an organism and predicting all the proteins that can be expressed. Once all the proteins are predicted then the cleavage rules for the digest enzyme are applied and the masses of the resulting peptides calculated by the computer (in-silico digest). By using an appropriate scoring algorithm, the closest match or matches can be identified. If the "unknown" protein is present in the sequence database, then the aim is to pull out that precise entry. If the sequence database does not contain the “unknown” protein, then the aim is to pull out those entries which exhibit the closest homology, often equivalent proteins from related species. The steps in peptide mass mapping are outlined in the flow chart.

    The analysis of a complex mixture of proteins from an organism always involves some type of separation step to isolate a certain protein. The separation methods frequently used are two dimensional (2D) gel electrophoresis or liquid chromatography. The figure shows the experimental workflow used to identify a protein spot from a 2D gel.

    Reading Questions

    1. In your own words, describe the general principle of peptide mass mapping for protein identification.

    The enzyme trypsin is frequently used to digest proteins in the peptide mass mapping technique. Trypsin cleaves the amide bond after lysine (K) and arginine (R) residues. K and R make up about 10% of the amino acids in a protein and digesting with trypsin typically results in peptides in a useful range mass range for mass spectrometry (500-3,000 Da). (Remember: The mass unit amu is the same as Da.)

    The structures of lysine and arginine are shown.

    Discussion Questions

    1. What is the purpose of digesting the protein with trypsin? (Hint: Think about how mass spectrometry is used to identify and/or elucidate the structure of small organic molecules such as caffeine).

    2. Why is a well annotated genome for the organism of interest needed in the peptide mass mapping technique for identifying a protein?

    3. Classify the side chains of R and K as acidic or basic.

    4. Digesting the protein with trypsin ensures that there is an R or K residue in each peptide. Why is this helpful for MALDI-TOF analysis? (Hint: Think about the function of the matrix in MALDI).

    5. a. Can the following two peptides with the same amino acid composition be distinguished using MALDI-TOF mass spectrometry?

    Peptide 1: GASPVRTCILKMHFY

    Peptide 2: GMFHRATIKYPVCSL

    b. Calculate the expected (monoisotopic) masses if the enzyme trypsin was used to digest peptides 1 and 2. A table of amino acid masses is provided. Can the peptides be distinguished after digestion with trypsin?

    (Refer to the Introduction section of this module if you need assistance in calculating the mass of a peptide or defining the difference between a monoisotopic mass and average mass.)

    6. There are other enzymes that could be used to digest the proteins.

    Pepsin is most efficient in cleaving peptide bonds between hydrophobic amino acids (leucine) and aromatic amino acids such as phenylalanine, tryptophan, and tyrosine. Pepsin is less specific and results in many small peptides. Would pepsin be a good choice for digesting proteins for peptide mass mapping? Explain your reasoning.

    There are also enzymes that are highly specific and result in only a few cleavage sites in a protein. Is a highly specific enzyme that creates a few large peptides be a good choice for peptide mass mapping? Explain your reasoning.

    Table 2. Molecular weight information for all twenty naturally occurring amino acids.

    Amino Acid

    Single-Letter Code

    Residue MW (amu)

    Amino Acid MW (amu)

    glycine

    G

    57.02

    75.03

    alanine

    A

    71.04

    89.05

    Serine

    S

    87.03

    105.04

    proline

    P

    97.05

    115.06

    Valine

    V

    99.07

    117.08

    threonine

    T

    101.05

    119.06

    cysteine

    C

    103.01

    121.02

    isoleucine

    I

    113.08

    131.09

    leucine

    L

    113.08

    131.09

    asparagine

    N

    114.04

    132.05

    aspartic acid

    D

    115.03

    133.04

    glutamine

    Q

    128.06

    146.07

    Lysine

    K

    128.09

    146.11

    glutamic acid

    E

    129.04

    147.05

    methionine

    M

    131.04

    149.05

    histidine

    H

    137.06

    155.07

    phenylalanine

    F

    147.07

    165.08

    arginine

    R

    156.10

    174.11

    tyrosine

    Y

    163.06

    181.07

    tryptophan

    W

    186.08

    204.09

    • Was this article helpful?