Invited Review
¹Cardiovascular Research Institute and ²Mass Spectrometry Facility, Department of Pharmaceutical Chemistry, University of California, San Francisco, California 94143
| ABSTRACT |
|---|
mass spectrometry; proteome; lung
In the first part of this review, the status of current proteomic techniques is outlined. The rapid evolution in mass spectrometry, which was initiated by the development of the ionization techniques MALDI and ESI, has led to significant improvements in the central step of a proteomics experiment, protein identification.
The subsequent application of these techniques to a large number of previously inaccessible categories of samples has in turn triggered progress in other crucial steps, namely protein separation techniques and the analysis of the resulting data. The classic proteomics approach of describing and comparing the protein content of a given sample has consequently been refined by the description of "posttranslational modifications" of the protein and widened by tools that allow quantitative comparison of two or more samples ("quantitative proteomics"). This article provides an outline of current techniques in both of these fields that will be used to investigate the lung proteome in the coming years.
The third part of this review focuses on the present status of the investigation of the lung proteome with specific examples from pulmonary studies that have evaluated bronchoalveolar lavage as well as other biological samples in a variety of acute and chronic lung diseases. Some future possibilities for lung research that may arise from the rapid progress occurring in proteome method development are also considered (3, 69, 103, 240).
| IMPLICATIONS OF THE HUMAN GENOME PROJECT |
|---|
Therefore, the study of the genome or even of mRNA levels (the transcriptome) will reveal only a small part of the response to a particular stimulus. Even among the diseases known to be based on specific genetic defects, only a very small number are likely to be monogenic, since cellular systems include complex interactions with a high level of redundancy (234). Conversely, the function of a large number of the protein products that are encoded by these genes is still unclear (25).
Direct investigation of the proteome provides a more complete representation of changes in the status of an organism. However, there exist several impediments to such an approach, the sheer complexity of the proteome being the most important one (Fig. 1). Diversity is another issue, since there are at least 250 different types of human cells, each of which contains at least 2,000–6,000 different primary proteins (33, 59), and posttranslational modifications will multiply this number (152, 165, 257, 258). It has been estimated that the different types of human cells may differ from each other in 400 unique proteins (32). Another important factor is the dynamic range of concentrations of proteins, since one cell can contain between one and more than 100,000 copies of a single protein (32). Finally, the proteome of organisms is dynamic and changes with environment and with time (106).
| WHAT CAN BE MEASURED USING A PROTEOMICS APPROACH? |
|---|
There has been an expansion of proteomics into "functional proteomics," the correlation of changes in the proteome with different states of the organism. This field is currently expanding in several different dimensions. "Protein profiling techniques" take a global view of complex protein samples, such as plasma. Given the complexity of these samples, these techniques need to be streamlined to achieve high throughput. The resulting protein patterns have diagnostic value as biomarkers on their own and indicate directions for more specific investigations. The application of protein profiling to tissue samples provides a combination of spatial information and protein profiles. The current results indicate that these techniques are a valuable complement to histology (38, 265). The continuing improvement in protein identification will provide further insights into pathological processes and will most likely be especially valuable in cancer research.
The application of mass spectrometry technology to the evaluation of "protein modifications" further extends the depth of proteomic analysis. Changes in protein concentrations represent only a small part of the physiological responses of an organism; rapid responses to stimuli, in particular, are transmitted by the modification of existing proteins. Despite its complexity, this emerging field therefore has large potential for clinically relevant research.
The development of quantitative proteomics has widened the applicability of these techniques beyond a purely descriptive study design. Novel techniques in this field, namely differential gel electrophoresis (DIGE) and isotope-coded affinity tagging (ICAT), allow the direct comparison of samples, e.g., of different disease states.
| CURRENT CHALLENGES |
|---|
For example, plasma and pulmonary edema fluid contain large amounts of albumin (30–50 mg/ml in plasma and 20–25 mg/ml in pulmonary edema fluid) but comparatively small quantities of cytokines such as TNF-α or IL-1β (ng/ml to pg/ml range). Therefore, protein separation and purification techniques are key elements of proteome research that represent one of the major challenges (7, 15, 33).
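The size of this concentration gap can be made concrete with a short calculation. The sketch below is illustrative only: it assumes an albumin concentration of 40 mg/ml (a value inside the plasma range quoted above) and cytokine concentrations of exactly 1 ng/ml and 1 pg/ml as the two ends of the quoted range. The function name is hypothetical.

```python
import math


def orders_of_magnitude(high_g_per_ml, low_g_per_ml):
    """Return the base-10 logarithm of the concentration ratio."""
    return math.log10(high_g_per_ml / low_g_per_ml)


# Assumed example values, converted to g/ml.
albumin = 40e-3        # 40 mg/ml, inside the plasma range quoted in the text
cytokine_high = 1e-9   # 1 ng/ml
cytokine_low = 1e-12   # 1 pg/ml

print(round(orders_of_magnitude(albumin, cytokine_high), 1))  # 7.6
print(round(orders_of_magnitude(albumin, cytokine_low), 1))   # 10.6
```

Under these assumptions, albumin exceeds a cytokine by roughly 8 to 11 orders of magnitude, which is why depletion and fractionation steps are usually needed before low-abundance proteins can be detected.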
Although the size of the proteome is unknown, the number of expressed proteins can be estimated from the open reading frames in a sequenced genome. It has been reported that 20% (1,484 proteins) from Saccharomyces cerevisiae (249) and >61% of the predicted proteome of Deinococcus radiodurans (145) could be identified by a current multidimensional chromatography-tandem mass spectrometric approach. These results indicate that identification of a significant part of the proteome of a cell is feasible.
Other common obstacles to proteomics are more dependent on the individual sample and the specific techniques. The validity of the results of a proteomic experiment is dependent on the initial sample, the purity of cell and protein isolation, and the subsequent sample fractionation steps. Salts, mucus, and other contaminants may require purification procedures that lead to loss of proteins of interest. The presence of proteases in samples can cause additional cleavages of the investigated proteins, complicating protein identification and quantitation. Ongoing cellular protein synthesis and posttranslational processing, by phosphatases and kinases, for example, can influence the results as well.
| ANALYSIS METHODS |
|---|
The rapid progress in mass spectrometry in the last decade has made it a key technique for the investigation of the proteome (2, 3, 29, 69, 85, 103, 150). Mass spectrometry can be used to identify proteins by measuring the ratio of molecular mass to electric charge (m/z) of molecular species in a sample. Because of the high sensitivity and mass accuracy of this method, which under some circumstances can detect peptides in the femtomole to attomole range with a mass accuracy of <10 parts per million (ppm) (45, 70), it is now possible to identify proteins by using search algorithms that interrogate public "protein databases," such as the nonredundant National Center for Biotechnology Information (NCBI) database, which can be accessed over the Internet.
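To illustrate what a mass accuracy of <10 ppm means in practice, the following minimal sketch computes the relative error between a measured and a theoretical m/z value. The function names and the example values are hypothetical and are not taken from any of the software packages discussed in this review.

```python
def ppm_error(measured_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(measured_mz, theoretical_mz, tol_ppm=10.0):
    """True when the absolute error does not exceed tol_ppm."""
    return abs(ppm_error(measured_mz, theoretical_mz)) <= tol_ppm


# Hypothetical peptide ion at m/z 1000.0000 measured at 1000.0080.
print(round(ppm_error(1000.0080, 1000.0000), 2))  # 8.0
print(within_tolerance(1000.0080, 1000.0000))     # True
print(within_tolerance(1000.0200, 1000.0000))     # False
```

At m/z 1000, a 10-ppm window corresponds to only 0.01 mass units, which is what allows database search algorithms to exclude most incorrect candidate peptides.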
Because the human genome is virtually known (243), every protein sequence can be predicted and included in these databases. Mass spectrometry is most often used as the identification technique after 2D-PAGE (46) or other separation techniques such as liquid chromatography (LC) (92, 93, 249).
Sample Preparation
Because many components of biological samples interfere with analysis, it is necessary to remove them before study. Insoluble substances can be removed by centrifugation. For 2D-PAGE and mass spectrometry, it is necessary to remove salts before analysis. This can be achieved by dialysis, size-exclusion filtering, protein precipitation, or reverse-phase chromatography (12, 54, 108). Frequently, abundant proteins such as albumin or immunoglobulins need to be removed first (7, 214). Complex samples need to be fractionated before analysis to obtain simpler subfractions and to decrease the dynamic range of components, if possible. For example, the dynamic range of concentrations in a plasma sample exceeds 10 orders of magnitude (7), whereas a current one-dimensional chromatography-mass spectrometry approach can only detect proteins in a dynamic range of approximately 4 orders of magnitude (7). Affinity purification is a powerful approach to reduce the complexity of a sample by specifically isolating individual proteins or "protein complexes" (15). These preparation steps are often more time consuming than the subsequent analysis steps and influence the sensitivity and discriminative power of mass spectrometry-based protein identification (108, 191).
Electrophoresis
Gel electrophoresis, especially 2D-PAGE (121, 178), has long been the major method for the investigation of the proteome. An overview of the most frequently used electrophoresis techniques is provided in Table 1. For visualization, proteins in the gel are stained using a variety of different methods. A synopsis of the most widely employed staining methods is given in Table 2. With the use of 2D-PAGE, gel maps of body fluids, such as human plasma (6, 149, 194) or BAL fluid (176, 252), have been published (see Fig. 2). The large number of spots in a 2D gel is partly due to posttranslational and proteolytic modifications of proteins; one protein may, therefore, be present in several locations in the gel (25). Although this phenomenon is potentially useful for the further analysis of these modifications, the increased number of spots for analysis can lead to additional effort, since >25% of the spots on one gel may be due to modified proteins (34) found elsewhere on the gel. The number of protein spots in complex samples makes computer-assisted image analysis necessary. Digital image analysis is also needed for quantitative information. Several commercial software suites are available for this purpose.
Chromatography
Chromatography, especially LC, can be carried out as a purification step before or after 2D-PAGE (12, 163, 194). The progress in separation science has made this method a competitive alternative to electrophoresis. LC-LC-MS-MS (tandem mass spectrometry)-based techniques such as multidimensional protein identification technology (MudPIT) may have advantages over gel-based techniques in speed, sensitivity, reproducibility, and applicability to different samples and conditions (84, 144, 248, 249, 259, 260). The purification process of all LC techniques can be automated to a large extent (107, 137). The main shortcoming of this technique is the lack of quantitative information. The development of protein labeling techniques such as ICAT can overcome this disadvantage (see below).
Protein profiling techniques (37, 164) are another interesting set of approaches for visualizing changes in the proteome content of a sample (see below).
| MASS SPECTROMETRY |
|---|
Types of Mass Spectrometers
An overview of mass spectrometers currently being used for protein identification is provided in Table 3. The relatively soft ionization techniques of MALDI (117) and ESI (63) have made it possible to generate ions from large, nonvolatile analytes such as proteins without significant fragmentation. Both methods can be used to analyze proteins in the 100-kDa mass range (2, 29). Their introduction in the late 1980s revolutionized the applicability of mass spectrometry to biomolecules and initiated an era of rapid progress that persists today (3).
Mass measurements of the intact proteins provide a mass balance and rapid and valuable information on the protein profile of a sample. It is, however, not practical to attempt to identify a protein solely on the basis of its m/z ratio. This is mainly due to splice and sequence variation from database entries combined with a heterogeneous set of posttranslational modifications, which lead to variable differences in the molecular weight of a protein compared with the theoretical mass derived from the database. Therefore, additional strategies have been developed for protein identification, and these can be used separately or in combination.
"Peptide mass fingerprinting" is based on mass measurements of peptide fragments derived from a single protein. Before mass spectrometry, proteins are cleaved into peptides at specific, reproducible points in their amino acid sequence using chemical agents or proteases. A protein covalent modification will only be reflected in one or a few of the peptide mass values, whereas the rest will remain unchanged. Because of its highly reproducible cleavage on the COOH-terminal side of arginine and lysine residues, trypsin is the proteolytic enzyme used most often. With the use of this specificity, the anticipated mass values of all peptides in virtual digests of all proteins in the database are calculated. The protein identity is determined by comparing the measured peptide mass values with those calculated (45, 98, 110, 151, 208, 268). The reliability of peptide mass fingerprinting is dependent on: 1) the mass accuracy of the peptide measurements (45); 2) the number of matched vs. unmatched peaks in the spectrum; 3) the number of peptides that could be matched to a single protein; and 4) the number of proteins that are present in the digested sample, since random matches can occur at a level of confidence similar to real matches in complex mixtures. The decreased reliability of results using peptide fingerprinting with complex mixtures of proteins has been exacerbated by the massive increase in the size of the databases. Other potentially critical factors are the increased rate of false-positive matches and bias toward high-molecular-weight proteins, which yield a larger number of peptides and are, therefore, more likely to be matched by this technique than smaller proteins. Scoring systems included in the analysis software packages (see below) aim at compensating for these potential problems.
With the use of two sequential mass analyzers (tandem mass spectrometry or MS-MS), primary structural analysis of the amino acid sequence can be obtained (3, 22, 150, 161) by fragmenting one or more of the peptides (Fig. 5). Peptide fragmentation is achieved by preferential cleavage of the backbone bond of polypeptides upon collisional activation with a gas [collision-induced dissociation (CID)] (21, 161). Tandem mass spectrometry can be carried out using both ESI (e.g., ESI-triple-quadrupole or ESI-Qq-TOF) (42) and MALDI ionization (MALDI-TOF-TOF) (102, 162). Often, fragmentation spectra of only a few peptides are sufficient for unambiguous protein identification (45, 150).
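The two identification strategies described above, peptide mass fingerprinting and MS-MS sequencing, can be sketched in a few lines. This is a simplified illustration, not the algorithm of any search engine named in this review. It assumes standard monoisotopic residue masses, complete tryptic cleavage after every lysine and arginine (the common exception for a following proline is ignored), and singly charged fragment ions in the conventional b/y nomenclature, which the text does not introduce by name. All function names and the toy sequence are hypothetical.

```python
# Standard monoisotopic residue masses in daltons (not taken from the review).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728


def tryptic_digest(sequence):
    """Cleave on the COOH-terminal side of every K and R (no missed cleavages)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        if residue in "KR":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass: sum of residue masses plus one water."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def count_matches(measured_masses, sequence, tol_ppm=20.0):
    """Count measured masses matching a theoretical tryptic peptide."""
    theoretical = [peptide_mass(p) for p in tryptic_digest(sequence)]
    return sum(
        1 for m in measured_masses
        if any(abs(m - t) / t * 1e6 <= tol_ppm for t in theoretical)
    )


def fragment_ions(peptide):
    """Singly charged b- and y-ion m/z values from backbone cleavage."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    b_ions = [sum(masses[:i]) + PROTON for i in range(1, len(masses))]
    y_ions = [sum(masses[-i:]) + WATER + PROTON for i in range(1, len(masses))]
    return b_ions, y_ions


# Toy example with a hypothetical five-residue "protein".
print(tryptic_digest("AKGRS"))                                  # ['AK', 'GR', 'S']
print(count_matches([217.1426, 231.1331, 500.0000], "AKGRS"))   # 2
print(fragment_ions("GASK"))
```

In a real search, the match count would be converted into a probability-based score and corrected for protein size and database size, which is the role of the scoring systems mentioned above.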
Although sequence information can also be obtained with relatively inexpensive instruments using the metastable decay of some ions after desorption by MALDI (postsource decay), this time-consuming technique is rapidly being replaced by the faster and more sensitive tandem time-of-flight mass spectrometry (102, 150, 162, 266).
Protein Profiling Techniques
Protein profiling is the rapid screening of samples by mass spectrometry with limited or no sample preparation. The resulting profile of m/z ratio peaks of different samples (that can be body fluids, cell lysates, or even tissue samples) can then be compared, and differences in the relative abundance of proteins can be identified. The samples can then be further purified by chromatography and identified by techniques such as peptide fingerprinting or MS-MS. These techniques provide a complementary method to 2D-PAGE for protein visualization.
In SELDI (Table 3), proteins are retained on a protein chip array composed of various chromatographic, immunologic, or enzymatic surfaces and subsequently detected directly by time-of-flight mass spectrometry. In contrast to the metal sample target employed in MALDI mass spectrometry, in SELDI the probe surfaces play an active role in the extraction, structural modification, and presentation of the protein of interest from the sample. There are several different probe surfaces available, thus SELDI can be modified for use with proteins of different properties (164). Of the different SELDI applications in development today, surface-enhanced affinity capture is considered the most promising, with a reported 100-fold dynamic range (164). The special advantage of this technique is the possibility of high-throughput analysis. Protein chips may be useful in the discovery of new drug targets (271) and biomarkers (109, 164, 189, 193).
IMS utilizes MALDI-MS for the direct analysis of tissue samples (37) (Table 3). This is carried out by coating a slice of frozen tissue with crystallization matrix or by blotting the tissue on a target coated with C18 beads (30, 37–39). Mass spectrometry generates ion images of samples, providing the capability of mapping specific molecules to 2D coordinates on the original sample and thus giving spatial information on peptide/protein distributions (Figs. 5 and 6). This technique has been successfully applied to brain tumors (233) and non-small cell lung cancer (265); the latter study is described in more detail later in this article. This methodology is likely to be used increasingly.
Posttranslational modifications play a crucial role in cell signaling and protein function (77, 152, 190). More than 200 different protein modifications have been described (125, 257, 258). Important posttranslational modifications include phosphorylation, acetylation, glycosylation, ubiquitination, and nitration (125, 152, 242). The analysis of posttranslational modifications on a proteome scale is still considered an analytical challenge (66, 69, 152, 159, 177, 229, 274); reasons for this are the fragility of the chemical bonds of many protein modifications upon sequencing by CID, signal suppression of negatively charged (phosphate-, sulfate-containing) molecules in the commonly used positive detection mode, and difficulty of obtaining full-sequence coverage (123). Moreover, most modifications are substoichiometric; therefore, modified peptides are frequently present at much lower levels than unmodified peptides (124, 269).
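In a mass spectrum, a modification appears as a characteristic mass shift of the modified peptide. The sketch below lists standard monoisotopic shifts for four of the modifications named above; these values are general chemical constants, not data from this review. Glycosylation and ubiquitination are omitted because their shifts depend on the attached glycan and on the digestion conditions. The function names are hypothetical.

```python
# Standard monoisotopic mass shifts in daltons (not taken from the review).
PTM_SHIFT_DA = {
    "phosphorylation": 79.96633,  # addition of HPO3
    "acetylation": 42.01057,      # addition of C2H2O
    "nitration": 44.98508,        # NO2 replaces H
    "oxidation": 15.99491,        # addition of O
}


def mz_shift(modification, charge):
    """Shift observed on the m/z axis for an ion carrying the given charge."""
    return PTM_SHIFT_DA[modification] / charge


def candidate_modifications(observed_delta_da, tol_da=0.01):
    """Names of modifications whose shift matches the observed mass difference."""
    return [
        name for name, shift in PTM_SHIFT_DA.items()
        if abs(shift - observed_delta_da) <= tol_da
    ]


print(round(mz_shift("phosphorylation", 2), 4))  # 39.9832
print(candidate_modifications(79.966))           # ['phosphorylation']
```

Because most modifications are substoichiometric, the shifted peak is often much weaker than the peak of the unmodified peptide, so a mass match of this kind is a starting point for MS-MS confirmation rather than proof of modification.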
Phosphorylation is an important regulation mechanism of protein activity and signaling networks. It is crucial in protein kinase activation, cell-cycle progression, cellular differentiation, transformation, response, and adaptation of peptide hormones (47, 77, 154, 165). Approximately 30% of all mammalian proteins are phosphorylated at any given time (153). The more than 500 protein kinases and 100 phosphatases have relatively wide substrate specificities and work in different combinations to achieve a variety of biological responses, which can make analysis of these complex networks challenging (47, 153, 154). Phosphopeptides are generally difficult to analyze by mass spectrometry. One reason for this is their negative charge, which reduces ion intensity (electrospray is generally performed in the positive mode). Other impediments include their presence at substoichiometric levels, their hydrophilicity, which interferes with reverse-phase chromatography, and other factors (8, 124, 153, 221, 269). Currently, phosphorylation is evaluated most often by labeling a previously defined protein with 32P-inorganic phosphate followed by 2D-PAGE and/or reverse-phase chromatography, which is a relatively complex, time-consuming procedure (124, 152, 153, 269). For example, in a recent comprehensive study (182), the regulatory mechanisms controlling the activity of 3-phosphoinositide-dependent protein kinase-1 (PDK1), which plays a central role in signal transduction pathways that activate phosphoinositide 3-kinase, were evaluated. With the use of site-directed mutants, phosphorylation on Tyr373/Tyr376 was shown to be important for PDK1 activity, whereas phosphorylation on Tyr9 had no effect. Other novel approaches to investigate phosphorylation include the 14N:15N labeling of immunoprecipitated phosphorylated peptides (79, 177), the phosphoprotein-isotope-coded affinity tag method (79, 80), the use of immobilized metal ion affinity chromatography to affinity capture phosphopeptides (95, 196, 238, 261), and the chemical transformation of phosphoserine and phosphothreonine residues into lysine analogs that are then cleaved with a lysine-specific protease to map sites of phosphorylation (123).
In response to various inflammatory stimuli, lung endothelial cells, alveolar and airway epithelial cells, and activated alveolar macrophages produce nitric oxide and superoxide, products that may react to form peroxynitrite. Peroxynitrite can nitrate and oxidize amino acids in various lung proteins, such as surfactant protein A (SP-A), and inhibit their function. It has been shown that the nitration and oxidation of a variety of alveolar proteins is associated with diminished function in vitro; in addition, both modifications have been identified in proteins sampled from patients with acute lung injury using immunoassays (132, 275). The selective nitration of tyrosine residues in different cytoplasmic high-molecular-weight proteins and histone proteins in murine tumor cells by neutrophils has been demonstrated by Western blotting and mass spectrometry in vivo and in vitro (94). The authors found that histone nitration was relatively stable, making it a potentially useful marker for extended exposure of cells or tissues to nitric oxide-derived reactive species.
Novel methodologies for the evaluation of other protein modifications are available as well. N- and O-linked glycosylation occurs throughout the entire phylogenetic spectrum and plays key roles in reactions in the endoplasmic reticulum, Golgi apparatus, cytosol, and nucleus (53, 227). Glycosylation is present especially on proteins destined for extracellular environments (207); consequently, many therapeutic targets and clinical biomarkers are glycoproteins. For example, CFTR is an integral membrane glycoprotein that normally functions as a chloride channel in epithelial cells (210). The most common mutation in cystic fibrosis, ΔF508, results in mislocalization and altered glycosylation of CFTR. Moreover, altered fucosylation and sialylation of both membrane and secreted glycoproteins occur in cystic fibrosis, and the two major bacterial pathogens causing chronic infection in the cystic fibrosis lung, Pseudomonas aeruginosa and Haemophilus influenzae, have binding proteins that recognize these altered sites. For the investigation of protein glycosylation, mass spectrometry has been widely used in recent years (28, 53, 129), especially the Qq-TOF instrument (Fig. 4) (35, 53, 232). In a recent study (270), glycoproteins were conjugated to a solid support by hydrazide chemistry, and glycopeptides were labeled with stable isotopes. Subsequently, the formerly N-linked glycosylated peptides were specifically released using peptide-N-glycosidase F and identified and quantified by MS-MS. The methodology has been used to investigate plasma membrane and serum proteins.
A rapidly evolving part of functional proteomics is the investigation of specific protein complexes (67, 68, 264). Protein complexes can be isolated from complex mixtures by affinity extraction techniques such as direct antibody coprecipitation (5) or indirect tagging of the bait protein with an epitope that is then recognized by an antibody using tandem affinity purification tags (72, 205). Chemical cross-linking can be used to prevent the loss of components from the protein complex during precipitation (213). Affinity purification techniques for the analysis of protein complexes have been reviewed (15, 264). The resulting isolated complexes are subsequently analyzed by mass spectrometry. A more general approach is the comprehensive identification of proteins in macromolecular complexes after separation by liquid chromatography (144).
Quantitative Proteomics
With the use of tandem mass spectrometry, the sequence of one peptide can be sufficient to identify an entire protein. This simplification of protein identification has triggered the development of methods that aim at increasing throughput by performing protein separation and identification in one suite of experiments (87). Because cutting out individual gel spots from a 2D gel is a very time-consuming procedure, many recently introduced approaches use chromatography for sample separation. These techniques either couple LC directly to ESI-MS-MS or robotically spot the chromatographically separated fractions to a MALDI target. However, 2D-PAGE provides quantitative information that has only been obtained to a very limited extent from mass spectrometry-based methods. The lack of quantitative results is a serious shortcoming that would limit an LC-mass spectrometry approach to a purely descriptive study design. The use of isotope ratio mass spectrometry (IRMS) is one method being used to close this gap.
Currently, the most advanced IRMS technique is the ICAT technology (89). In an ICAT experiment, the reduced cysteine residues of proteins are labeled differentially. Each of the two tags consists of an iodoacetamide group that reacts with the free cysteine, a biotin tag that can be used for affinity purification of labeled peptides, and a linker region containing the isotopic label. The heavy version differs from the light version in that eight hydrogen atoms within the linker region have been replaced by deuterium atoms. The two samples can be discriminated by mass spectrometry according to this mass difference of 8.0 Da (89). After being labeled, the two samples are pooled and digested with trypsin. The tagged peptides are then extracted with an avidin-containing column. Because only cysteine-containing peptides are evaluated, the complexity of the sample is reduced by more than one order of magnitude (89). The frequency of cysteine residues in proteins varies slightly from species to species and averages 1% (27). In yeast, 9% of all theoretically possible peptides after tryptic digestion contain cysteine (89).
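The quantitation step of an ICAT experiment can be sketched as a search for peak pairs separated by the label mass difference. The example below is a simplified illustration under stated assumptions: peaks are given as (m/z, intensity) tuples for one charge state, the 8.0-Da difference quoted above is used as the spacing, and the function name, tolerance, and peak list are hypothetical. Real data would additionally require isotope-envelope handling and integration over the chromatographic elution profile.

```python
def find_icat_pairs(peaks, charge=1, delta_da=8.0, tol=0.05):
    """Return (light m/z, heavy m/z, heavy/light intensity ratio) tuples.

    peaks    : list of (mz, intensity) tuples for ions of the given charge
    delta_da : mass difference between heavy and light tag (8.0 Da in the text)
    tol      : allowed deviation of the observed spacing, in m/z units
    """
    expected_spacing = delta_da / charge
    pairs = []
    for light_mz, light_int in peaks:
        for heavy_mz, heavy_int in peaks:
            spacing = heavy_mz - light_mz
            if light_int > 0 and abs(spacing - expected_spacing) <= tol:
                pairs.append((light_mz, heavy_mz, heavy_int / light_int))
    return pairs


# Hypothetical singly charged peak list: one labeled pair and one unpaired peak.
peaks = [(1000.50, 2000.0), (1008.50, 1000.0), (1200.70, 500.0)]
print(find_icat_pairs(peaks))  # [(1000.5, 1008.5, 0.5)]
```

A ratio of 0.5 in this toy example would indicate that the peptide is half as abundant in the heavy-labeled sample as in the light-labeled sample; as noted in the text, such ratios are relative and do not provide absolute concentrations.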
A disadvantage of ICAT is that no absolute concentrations of proteins are measured and that comparisons of the expression of two different proteins are not possible. Another shortcoming is the low sequence coverage, since only cysteine-containing peptides are labeled. The applicability of ICAT to the analysis of posttranslational modifications or protein isoforms is therefore limited (186). This restriction of ICAT to cysteine-containing peptides can be partially overcome by separate analysis of the unlabeled peptides that are not captured in the affinity chromatography step. However, quantitative information will not be available in this case unless a corresponding ICAT-labeled peptide is identified for the same protein (144). Another potential problem is that the differentially labeled peptides can separate from each other during the chromatography process because deuterium affects the retention time in reverse-phase chromatography. Consequently, they may be ionized at separate time points and possibly in different fractions, which can lead to different quantitation intensities (272). In addition, the ICAT tag is relatively large, which may interfere with the detection of large peptides (186). Furthermore, the dynamic range for the quantification of different expression levels of one protein is relatively small (10-fold) (9, 89), which is inferior to that of fluorescent dyes (186). Some of these limitations can be overcome by using a newly introduced cleavable ICAT reagent. The new reagent utilizes 13C with a mass difference of 9 Da between the heavy and the light marker. The advantages are a smaller tag (227 Da compared with the 442 Da of the original ICAT), which interferes less with the analysis of larger peptides, a mass difference that can easily discriminate a peptide with two ICAT labels (2 cysteine residues) from the common oxidation of methionine, and a reduction of CID fragmentation byproducts, which improves the quality of the resulting mass spectra (93).
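The advantage of the 9-Da spacing for doubly labeled peptides follows from simple arithmetic, sketched below. The nominal label differences of 8.0 and 9.0 Da are taken from the text; the mass of methionine oxidation is the standard monoisotopic mass of one oxygen atom, and the 0.5-Da resolution threshold is an arbitrary assumption chosen only for illustration. The function names are hypothetical.

```python
MET_OXIDATION_DA = 15.99491  # monoisotopic mass of one added oxygen atom


def total_label_shift(n_labels, delta_per_label_da):
    """Heavy-minus-light mass difference for a peptide carrying n_labels tags."""
    return n_labels * delta_per_label_da


def is_resolved(shift_a_da, shift_b_da, resolution_da):
    """True when two mass shifts differ by more than the assumed resolution."""
    return abs(shift_a_da - shift_b_da) > resolution_da


original = total_label_shift(2, 8.0)    # two cysteines, original reagent
cleavable = total_label_shift(2, 9.0)   # two cysteines, cleavable reagent

print(is_resolved(original, MET_OXIDATION_DA, 0.5))   # False
print(is_resolved(cleavable, MET_OXIDATION_DA, 0.5))  # True
```

With the original reagent, a peptide carrying two labels is shifted by about 16 Da, which nearly coincides with the shift caused by a single methionine oxidation; with the cleavable reagent, the shift of 18 Da is about 2 Da away from it.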
The ICAT technique has successfully been employed for the labeling of membrane protein extracts in prostate and breast tumor cell lines (10). Another recent study (220) compared differences in the expression of protein patterns between rat cells that did or did not contain the myc oncogene. These authors reported expression differences among functionally related proteins in myc-positive cells, such as induction of protein synthesis pathways, upregulation of anabolic enzymes, and reduction of proteases, and changes in the levels of adhesion molecules, of actin network proteins, and Rho pathway proteins that correlated with the known qualities of myc-positive cells. Another interesting application of ICAT was a comparison of the microsomal fraction of cells from the human myeloid cell line HL-60 with and without the induction of differentiation by phorbol 12-myristate 13-acetate; the authors identified and quantified 491 proteins. One example of quantitative analysis of alveolar type II cells using the cleavable ICAT technique from our research is given in Fig. 7 (93). The method is an active area of research and development (86, 92, 223, 224).
| DATA ANALYSIS AND INTERPRETATION |
|---|
Given the complexity of the proteome, an adequate proteomics approach requires the identification of thousands of proteins at a time rather than a few (76). Therefore, bioinformatics plays a key role in proteomic studies and is often the rate-limiting step (183, 246). The data obtained from mass spectrometry must be interpreted by interrogation against protein databases, the quality of which is crucial for protein identification. Both peptide masses and peptide sequence information can be used for protein identification. There are several protein databases readily available over the Internet that differ in the frequency with which they are updated and the amount of redundancy. Currently, the most complete and most frequently updated database is provided by NCBI, which is a combination of several databases, including Swiss-Prot and Owl. Consequently, this database also contains the most redundancy of protein entries.
Several software packages are available for the analysis of mass spectrometry data. They interrogate the obtained peptide or sequence data against the protein databases and rank the results according to a scoring system [a frequently used scoring algorithm is Molecular Weight Search (MOWSE) (181)]. Software packages include Mascot from Matrix Science (London, UK; http://www.matrixscience.com) (192), ProFound from Rockefeller University (http://prowl.rockefeller.edu) (273), ProteinProspector, a software suite developed at the University of California, San Francisco (http://prospector.ucsf.edu) (45), the SEQUEST algorithm developed at the University of Washington (http://thompson.mbt.Washington.edu/sequest) (60), and others (2). Each of these programs provides additional utilities; for example, ProteinProspector includes tools for the interpretation of mass spectrometry, MS-MS, and ICAT data (at present not included in the public Internet version) as well as a batch mode for repetitive tasks and other analysis tools.
Another bioinformatics challenge is the integration of the large amount of information into a comprehensive model. This includes the development of methods for data comparison between different research groups (183) and the integration of gene ontologies (10).
| INVESTIGATING THE LUNG PROTEOME |
|---|
During the development of proteomics over the last two decades, there have been numerous attempts to apply proteomic methodologies to pulmonary medicine. These shall be briefly reviewed in this section.
| EXPERIMENTAL DESIGNS |
|---|
| THE PROTEOME OF BAL FLUID |
|---|
α2-thioglycoprotein, α1-acid glycoprotein, and Gc-globulin. In 1990, Lenz and colleagues (136) published a method for 2D-PAGE of BAL fluid from dogs and then compared protein patterns in BAL fluid from patients with idiopathic pulmonary fibrosis, sarcoidosis, and asbestosis with those of normal controls (135). In idiopathic pulmonary fibrosis, the spot intensity of one surfactant-associated protein, SP-A, was decreased, whereas in sarcoidosis, the immunoglobulins (IgG, IgA) were increased. Another group of protein spots with a molecular weight of 55 kDa and one spot with a molecular weight of 12 kDa were identified. Compared with normal samples, the number and intensity of low-molecular-weight proteins were significantly increased in patients with asbestosis and, in some cases, in patients with idiopathic pulmonary fibrosis and with sarcoidosis.
At the time of this early proteomics research, many of the characterized spots could not be identified. Although the results of these studies provided the first information for a basic understanding of the protein composition of BAL fluid, the value of these results for clinical medicine was limited. Since then, gradual progress in staining and imaging techniques and improvements in standardization have made it possible to identify the most abundant proteins and refine the information on proteomic changes in different disease states. In 1995, Lindahl and coworkers (142) evaluated the BAL fluid proteome in patients after occupational exposure to irritating chemicals. They defined >1,000 protein spots. Plasma proteins were identified by pattern matching. After occupational exposure, 14 protein spots were increased, and one spot decreased by a factor of more than 3 compared with the levels before exposure and in healthy individuals. Subsequently, the same group found higher levels of basic proteins in smokers than in nonsmokers, whereas subjects exposed to asbestos had increased amounts of several high-molecular-weight and basic proteins (138). The results of protein identification showed lower levels of albumin and higher levels of immunoglobulins in smokers than in nonsmokers, whereas the levels of transferrin were higher in asbestos-exposed subjects. Further progress in the proteomic analysis of BAL fluid was boosted by the development of the SWISS-2D-PAGE database containing compiled maps of human BAL fluid (139, 251, 252). The current master gel of BAL proteins encompasses >1,200 spots visualized by silver staining (Fig. 2) (176). 
Information is available on changes in 2D-PAGE protein patterns of BAL for smoking (17, 135, 138, 139, 141, 143, 176, 252), sarcoidosis (135, 138, 139, 176, 251, 252), idiopathic pulmonary fibrosis (135, 138, 139, 176, 251, 252), lupus erythematosus (251), Wegener's granulomatosis (251), hypersensitivity pneumonitis (135, 138, 139, 176, 252), lipoid pneumonia (251), chronic eosinophilic pneumonia (251), alveolar proteinosis (18), bacterial pneumonia (251), other infections, malignancies and immunosuppression (82, 173), cystic fibrosis before and after α1-antiprotease treatment (83), and asbestosis (251).
The application of narrow-range immobilized pH gradient (IPG) strips can further increase the resolution of 2D-PAGE (208). Interestingly, the improvement in protein spot detection has been shown to be more significant for the protein spots present exclusively in BAL (55%) than for the spots present in both BAL and serum. This finding suggests that many of the BAL fluid-specific proteins, which are likely to be of pulmonary origin, are low-abundance proteins.
Improvements in protein identification increased the clinical relevance of 2D-PAGE studies. Three years after their initial stu