Key words: biomarker, mass spectrometry, protein biochip, proteomics, SELDI.
Protein biochips are emerging in two distinct formats. The first involves high-density immobilized arrays of recognition molecules (e.g. antibodies), where target binding is monitored indirectly (e.g. via fluorescence). This technology is in its infancy, being limited by the availability of suitable binding molecules that can cope effectively with protein diversity. The second format involves the capture of proteins by biochemical or intermolecular interaction, coupled with direct detection by MS. This technology is available as the ProteinChip® Biomarker System. ProteinChip technology uses surface-enhanced laser desorption/ionization processes to analyse proteins directly from biological samples. Chromatographic surfaces are placed on to ProteinChip Arrays and used to capture subclasses of proteins, dependent on their physical properties. Time-of-flight MS then assigns native molecular masses to the captured proteins. Reproducible protein profiles can be generated from crude biological fluids (e.g. cell lysates, urine or serum). The technology is being applied to a wide range of disciplines, from plant sciences to cancer research, and will be reviewed here.
In recent years there has been a vast amount of activity in genome sequencing projects and the analysis of gene (mRNA) expression. It is hoped that a better understanding of the genome, its expression patterns and functional products will lead eventually to an improved understanding, diagnosis and treatment of disease. DNA- and RNA-based technologies are extremely powerful, yet knowledge of these areas cannot be used to predict patterns of protein expression, or possible downstream processing of these proteins. Several studies have illustrated a poor correlation between gene expression and protein abundance [1]. In addition, certain body fluids (such as serum, urine and saliva) can only be described in terms of their protein composition; mRNA analysis is not applicable. These types of samples are also highly desirable for diagnostic screening, as their retrieval is considered non-invasive. Therefore there is a clear need for the direct analysis of protein expression, a research area described as proteomics.
At present, the most widely used tool for proteomic studies is two-dimensional protein gel electrophoresis (2-DE). Proteins are separated first based on their isoelectric point (pI) and then by their molecular mass. This technique has a high resolving power and can display several thousand proteins. However, the technique has a number of well-recognized limitations [2]. It is generally accepted to be labour intensive, and the generation of reproducible gels requires considerable technical expertise. In addition, there are certain types of proteins that are not well represented by 2-DE, including low-molecular-mass proteins (< 20 kDa) and those with extreme pIs or hydrophobicity.
Several research groups and companies are striving towards generating protein biochips, in order to miniaturize proteomics, and the technology is emerging in two distinct formats. The first of these is an approach analogous to DNA biochips, where high-density arrays of immobilized recognition molecules (e.g. antibodies) are prepared, and target binding is monitored indirectly (e.g. by fluorescence). This technology is limited by the availability of suitable binding molecules. Recognition molecules must be generated that interact with all the known (or predicted) gene products. Clearly this is not as straightforward as the generation of DNA probes used on DNA microarrays. Also, recognition molecules will profile binding epitopes rather than the functional protein molecules. An individual protein may exist in several processed forms (truncated or post-translationally modified) that may not be discriminated by the biochip. The second protein biochip format involves the capture of proteins by biochemical or intermolecular interaction followed by direct detection by MS. This technology is commercially available from Ciphergen Biosystems as the ProteinChip® Biomarker System, and will be the focus of this paper.
The ProteinChip System is based on proprietary surface-enhanced laser desorption/ionization time-of-flight (SELDI-TOF) MS [3], which combines protein capture on chromatographic surfaces with MS. The ProteinChip Arrays are derivatized with affinity matrices that mirror the properties of conventional chromatography media such as reverse phase; anion and cation exchange; metal affinity; and normal phase (Figure 1). Depending upon the type of chromatographic surface and wash condition applied, different sets of proteins will be isolated from the crude sample (Figure 2).
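The time-of-flight step can be illustrated with the idealized relation for a linear TOF analyser, in which an ion of mass m and charge z, accelerated through a potential V, crosses a field-free drift length L in time t = L·sqrt(m / (2zeV)). The sketch below is a minimal illustration under that textbook assumption only. The function names, the 20 kV potential and the 1 m drift length are illustrative choices, not ProteinChip instrument parameters, and real instruments are calibrated empirically against standards of known mass.

```python
import math

ELEMENTARY_CHARGE = 1.602176634e-19  # coulombs
DALTON = 1.66053906660e-27           # kilograms per dalton


def flight_time(mass_da, charge, accel_voltage, drift_length):
    """Idealized flight time (s) of an ion in a linear TOF analyser."""
    mass_kg = mass_da * DALTON
    # Kinetic energy gained in the source: z * e * V = 0.5 * m * v^2
    velocity = math.sqrt(2 * charge * ELEMENTARY_CHARGE * accel_voltage / mass_kg)
    return drift_length / velocity


def mass_from_flight_time(t, charge, accel_voltage, drift_length):
    """Invert the relation above: mass (Da) from a measured flight time (s)."""
    mass_kg = 2 * charge * ELEMENTARY_CHARGE * accel_voltage * (t / drift_length) ** 2
    return mass_kg / DALTON


# Illustrative (assumed) settings: singly charged ions, 20 kV, 1 m drift tube.
t_light = flight_time(5700.0, 1, 20000.0, 1.0)
t_heavy = flight_time(52000.0, 1, 20000.0, 1.0)
```

Because flight time scales with the square root of mass, heavier proteins arrive later, which is what allows a single spectrum to display many captured proteins at once.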
Protein expression profiling in clinical studies
Protein expression profiling by any method requires the reproducible visualization of a set of proteins so that samples from different treatment or disease groups can be confidently compared. The ProteinChip System offers several advantages compared with other proteomic approaches. Protein expression profiling on the ProteinChip System utilizes the chromatographic ProteinChip Arrays to select a subset of proteins based on their biophysical properties (such as pI or hydrophobicity). In this way, hundreds of proteins can be detected from a single active spot without the need for specific recognition molecules. This overcomes the first challenge encountered with high-density protein arrays described above, namely the generation of hundreds or thousands of specific capture molecules. A chromatographic ProteinChip Array can be prepared and analysed in approx. 1 h, and as each array contains eight or 16 active spots, many samples can be analysed in parallel. This represents a more rapid profiling approach than is realistically achievable with 2-DE analysis.
There are also several specific problems posed by clinical samples. Reproducibility of data is affected by sample variation as well as the robustness of the analytical technique. Sample variation is relatively easy to control in model systems. Cell culture growth conditions can be well defined, and even animal models can have controlled diet regimes. However, biological variation within clinical samples is much harder to control and can be affected by sample contamination, or sample processing and storage (transfer time from clinic to laboratory). In order to generate statistically meaningful data from variable sample sets, a large sample population must be examined; a process that is feasible with ProteinChip proteomics. Further complications arise with regard to sample heterogeneity. With prostate cancer, for example, regions of tumour are 'contaminated' with a large number of non-transformed cells, making studies of bulk, homogenized samples difficult to interpret [4]. This problem has been tackled using laser capture microdissection to isolate pure populations of cancerous cells [4–6]. Previously, protein expression studies in samples obtained by laser capture microdissection have been exceedingly difficult, as 50000 cells are needed for 2-DE analysis. However, as few as 2000 cells can be profiled using ProteinChip technology [4–6], illustrating the utility of the ProteinChip System when sample quantities are limiting.
Looking for disease biomarkers: bladder cancer
There are many examples of the use of the ProteinChip System to look for disease biomarkers [7–9]. One such example is the search for markers of transitional cell carcinoma (TCC) in urine [10]. There is a need to develop a non-invasive method to diagnose TCC of the bladder. Currently, the most reliable diagnosis is achieved by cystoscopic examination and bladder biopsy, which are both invasive and labour intensive. Biomarkers in urine would be highly desirable, due to the non-invasive nature of this bio-fluid. Using the ProteinChip System, 94 urine samples from patients with TCC, other urogenital diseases and healthy subjects were analysed using strong anion-exchange ProteinChip Arrays [10]. Multiple protein changes were detected reproducibly in the TCC group. The correlation of individual markers with TCC was relatively low, but combining markers (using conventional statistical methods) increased the sensitivity of the assay. This combinatorial approach gave a sensitivity of 78% in detecting grade I and II carcinomas, compared with 20–30% by voided urine cytology [11]. The ability to rapidly monitor many proteins at once is a marked advantage over individual protein assays (such as immunoassays).
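Why combining individually weak markers can raise sensitivity may be sketched as follows. The study's actual statistical method is not specified here, so the "positive if any marker is positive" rule, the function names and the toy calls below are all assumptions made purely for illustration; none of the values are data from the cited work.

```python
def sensitivity(calls, truth):
    """Fraction of true disease samples that were called positive."""
    true_positives = sum(1 for c, t in zip(calls, truth) if c and t)
    return true_positives / sum(1 for t in truth if t)


def specificity(calls, truth):
    """Fraction of non-disease samples that were called negative."""
    true_negatives = sum(1 for c, t in zip(calls, truth) if not c and not t)
    return true_negatives / sum(1 for t in truth if not t)


def call_any(*marker_calls):
    """Assumed combination rule: positive if at least one marker is positive."""
    return [any(per_sample) for per_sample in zip(*marker_calls)]


# Hypothetical toy data: 4 disease samples followed by 4 controls.
truth    = [True, True, True, True, False, False, False, False]
marker_a = [True, True, False, False, False, False, False, False]
marker_b = [False, True, True, False, True, False, False, False]
marker_c = [False, False, False, True, False, False, False, False]

combined = call_any(marker_a, marker_b, marker_c)
```

In this toy set no single marker detects more than half of the disease samples, whereas the combined panel detects all four. The cost is that any false positive from any marker carries through, so specificity has to be tracked alongside sensitivity.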
Multicomponent biomarker analysis
There is an increasing need to generate software that can deconvolute large data sets in order to find common 'themes' that may have a diagnostic use. These analyses will tend to rely on patterns rather than individual biomarkers. One such approach is cluster analysis, which generates groupings of like samples based on similarities in the data produced from these samples. This is an unbiased analysis, as no prior information on sample phenotype is used. However, it can be difficult to determine the rationale employed by the software to group the data.
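As a minimal sketch of the cluster analysis described above, the following groups samples by the similarity of their peak-intensity profiles using average-linkage agglomerative clustering. The choice of Euclidean distance and average linkage, the function names and the toy profiles are assumptions for illustration; no particular software package is being reproduced.

```python
import math


def euclidean(a, b):
    """Distance between two peak-intensity profiles of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(c1, c2, profiles):
    """Mean pairwise distance between the members of two clusters."""
    dists = [euclidean(profiles[i], profiles[j]) for i in c1 for j in c2]
    return sum(dists) / len(dists)


def cluster(profiles, n_clusters):
    """Merge the two closest clusters repeatedly until n_clusters remain."""
    clusters = [[i] for i in range(len(profiles))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = average_linkage(clusters[a], clusters[b], profiles)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return sorted(sorted(c) for c in clusters)


# Hypothetical intensities at three peak masses for five samples.
profiles = [
    [1.0, 0.2, 5.0],
    [1.1, 0.3, 4.8],
    [0.9, 0.1, 5.2],
    [4.0, 3.5, 0.5],
    [4.2, 3.3, 0.4],
]
groups = cluster(profiles, 2)
```

No phenotype labels enter the calculation, which is what makes the grouping unbiased. The output is only a list of sample indices per group, so the reason two samples were grouped together has to be recovered by inspecting their profiles.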
Ciphergen Biosystems can provide Biomarker Patterns™ Software, which is a classification tool. An initial training sample set, with known phenotypes, is used to stratify the data. The resulting set of stratification rules is displayed as a 'tree' diagram that can be used to classify unknown samples. The variables (protein masses) used to generate the classification tree are clearly displayed, and can be used to prioritize studies to identify these potential biomarkers.
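The idea of learning stratification rules from a training set with known phenotypes can be sketched with a single-level tree. This is not the algorithm of the commercial software, whose internals are not described here. The split criterion (fewest training errors), the function names and the peak masses and intensities are all hypothetical.

```python
def best_split(samples, labels):
    """Find the one rule 'intensity at mass > threshold' with fewest errors.

    samples: list of dicts mapping peak mass (Da) to intensity.
    labels:  list of bools, True for the disease phenotype.
    Returns (errors, mass, threshold, high_means_disease).
    """
    best = None
    for mass in samples[0]:
        values = sorted({s[mass] for s in samples})
        # Candidate thresholds lie midway between neighbouring intensities.
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            for high_means_disease in (True, False):
                predicted = [(s[mass] > threshold) == high_means_disease
                             for s in samples]
                errors = sum(p != t for p, t in zip(predicted, labels))
                if best is None or errors < best[0]:
                    best = (errors, mass, threshold, high_means_disease)
    return best


def classify(sample, rule):
    """Apply a learned rule to a sample of unknown phenotype."""
    _, mass, threshold, high_means_disease = rule
    return (sample[mass] > threshold) == high_means_disease


# Hypothetical training set: two peak masses, three disease and three controls.
training = [
    {4300: 8.0, 9200: 2.0}, {4300: 7.5, 9200: 3.1}, {4300: 6.9, 9200: 2.4},
    {4300: 1.2, 9200: 2.8}, {4300: 0.8, 9200: 2.2}, {4300: 1.5, 9200: 3.0},
]
phenotypes = [True, True, True, False, False, False]
rule = best_split(training, phenotypes)
```

A full tree would apply the same search recursively to the samples on each side of the split. Because the rule names the protein mass it uses, it points directly at the peak worth purifying and identifying.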
Biomarker purification strategies
For diagnostic needs, a pattern of biomarkers may be deemed sufficient information to stratify disease groups. However, in order to understand disease progression better and to explore possible drug targets, it is necessary to identify the disease biomarkers as a first step towards understanding their function. As many proteins are retained simultaneously on the ProteinChip Arrays, it is necessary to enrich potential biomarkers to enable identification. Devising purification strategies is greatly assisted by the chromatographic affinity information gained during the discovery process. A biomarker binding to a strong-anion-exchange ProteinChip Array can be enriched by utilizing a strong-anion-exchange column in the purification scheme [12]. Microspin columns are available for this purpose. The ProteinChip System then becomes the screen for the column fractions, allowing rapid monitoring of biomarker elution.
Biomarker identification: prostate cancer
Once a potential marker has been purified, many techniques can be employed to ascertain its identity (e.g. Edman degradation). However, protein quantities can be low, and often a semi-purified biomarker is run on a one-dimensional gel as the final preparative step, to prevent further sample losses. The gel band can be excised and subjected to tryptic peptide mapping, a method commonly employed to identify 2-DE gel spots. The ProteinChip System can be used to analyse the digest peptides, and the masses used to search against protein databases for possible matches to known proteins. This method for protein identification is heavily dependent upon the data available in the databases. Some organisms are not well represented, and when relying on predicted protein products from gene sequences, the 'protein' match may be of unknown function. In these cases, sequence information is a vital aid to finding the protein's identity. Ciphergen have developed a ProteinChip Interface for quadrupole–quadrupole (Qq) TOF mass spectrometers, which are capable of performing collision-induced dissociation [3]. This system will allow researchers to sequence peptides directly from the surface of ProteinChip Arrays. The complementarity of the ProteinChip System and ProteinChip QqTOF was demonstrated by work to differentiate between benign prostatic hyperplasia and prostate cancer [3]. Seminal plasma samples from patients diagnosed with benign prostatic hyperplasia or prostate cancer were screened using the ProteinChip System. Several potential biomarkers for prostate cancer were discovered, and on-chip isolation was achieved for a 5.7 kDa marker. On-chip tryptic digest results suggested that the protein was semenogelin I, but this is a much larger protein of molecular mass 52 kDa. The 5.7 kDa peptide was then sequenced using ProteinChip QqTOF-MS, revealing that the marker was seminal basic protein, a proteolytically derived fragment of semenogelin I.
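The database search by tryptic peptide masses can be sketched as an in silico digest followed by mass matching. The sketch assumes the usual simplified trypsin rule (cleavage after K or R unless the next residue is P), unmodified peptides with no missed cleavages, and standard monoisotopic residue masses. The function names, the 0.5 Da tolerance and the candidate sequences are hypothetical and do not come from any real database.

```python
# Standard monoisotopic residue masses (Da) for the 20 common amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free N- and C-termini


def tryptic_peptides(sequence):
    """Cleave after K or R, except when the following residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide):
    """Monoisotopic mass (Da) of an unmodified, neutral peptide."""
    return sum(RESIDUE_MASS[r] for r in peptide) + WATER


def rank_candidates(observed, candidates, tolerance=0.5):
    """Rank candidate proteins by how many observed masses they explain."""
    scores = []
    for name, sequence in candidates.items():
        theoretical = [peptide_mass(p) for p in tryptic_peptides(sequence)]
        hits = sum(1 for m in observed
                   if any(abs(m - t) <= tolerance for t in theoretical))
        scores.append((name, hits))
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical candidate sequences; the observed masses are simulated from one.
candidates = {"protein_A": "MAGKLVRSE", "protein_B": "GGGKPPPRAA"}
observed = [peptide_mass(p) for p in tryptic_peptides("MAGKLVRSE")]
ranking = rank_candidates(observed, candidates)
```

A match of this kind identifies the parent sequence that the peptides come from. As the 5.7 kDa marker above illustrates, it cannot by itself show whether the marker is the intact protein or a processed fragment, which is why direct sequencing is a useful complement.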
Future directions
The ability of the ProteinChip System to screen large sample sets, test simultaneously for multiple protein changes and deconvolute the data with dedicated software shows that it has the potential to improve disease diagnosis. Continued development of the technology is focused on creating a greater diversity of capture surfaces and supporting products, such as fractionation and purification kits. In addition, as the technology is applied more frequently to large-scale projects, the automation of sample preparation, array loading and reading will follow.
References
1 Anderson, L. and Seilhamer, J. (1997) Electrophoresis 18, 533–537
2 Jenkins, R. E. and Pennington, S. R. (2001) Proteomics 1, 13–29
3 Merchant, M. and Weinberger, S. R. (2000) Electrophoresis 21, 1164–1167
4 Paweletz, C. P., Gillespie, J. W., Ornstein, D. K., Simone, N. L., Brown, M. R., Cole, K. A., Wang, Q. H., Huang, J., Hu, N., Yip, T. T. et al. (2000) Drug Dev. Res. 49, 34–42
5 Englert, C. R., Petricoin, E. F., Krizman, D. B. and Emmert-Buck, M. R. (1999) Curr. Opin. Mol. Ther. 1, 712–719
6 Wright, Jr, G. L., Cazares, L. H., Leung, S. M., Nasim, S., Adam, B., Yip, T., Schellhammer, P. F., Gong, L. and Vlahou, A. (1999) Prostate Cancer Prostatic Dis. 2, 264–276
7 Johnston-Wilson, N. L., Bouton, C. M., Pevsner, J., Breen, J. J., Torrey, E. F. and Yolken, R. H. (2001) Int. J. Neuropsychopharmacol. 4, 83–92
8 Von Eggeling, F., Junker, K., Fiedler, W., Wollscheid, V., Durst, M., Claussen, U. and Ernst, G. (2001) Electrophoresis 22, 2898–2902
9 Fung, E. T., Wright, Jr, G. L. and Dalmasso, E. A. (2000) Curr. Opin. Mol. Ther. 2, 643–650
10 Vlahou, A., Schellhammer, P. F., Mendrinos, S., Patel, K., Kondylis, F. I., Gong, L., Nasim, S. and Wright, G. L. (2001) Am. J. Pathol. 158, 1491–1502
11 Grossman, H. B. and Dinney, C. P. N. (2000) Urol. Oncol. 5, 3–10
12 Thulasiraman, T., McCutchen-Maloney, S. L., Motin, V. L. and Garcia, E. (2001) BioTechniques 30, 428–432
Received 19 November 2001
Copyright 2002 Biochemical Society