Metabonomic signature analysis of cervical carcinoma and precancerous lesions in women by 1H NMR spectroscopy

1H nuclear magnetic resonance (NMR)-based metabonomics was used to characterize the metabolic profiles of cervical intraepithelial neoplasia (CIN) and cervical squamous cell carcinoma (CSCC). Principal component analysis (PCA) and orthogonal partial least-squares discriminant analysis (OPLS-DA) were used to model the systematic variation that distinguishes patients with CIN or CSCC from healthy controls. Potential metabolic biomarkers were identified using database comparisons, and the one-way analysis of variance (ANOVA) test was used to examine the significance of the metabolites. Compared with plasma obtained from the healthy controls, plasma from patients with CIN had higher levels of very-low density lipoprotein (VLDL), acetone, unsaturated lipid and carnitine, together with lower levels of creatine, lactate, isoleucine, leucine, valine, alanine, glutamine, histidine, glycine, acetylcysteine, myo-inositol, choline and glycoprotein. Plasma from patients with CSCC had higher levels of acetate and formate, together with lower levels of creatine, lactate, isoleucine, leucine, valine, alanine, glutamine, histidine and tyrosine compared with the plasma of the healthy controls. In addition, compared with the plasma of patients with CIN, the plasma of CSCC patients had higher levels of acetate, formate, lactate, isoleucine, leucine, valine, alanine, glutamine, histidine, tyrosine, acetylcysteine, myo-inositol, glycoprotein, α-glucose and β-glucose, together with lower levels of acetone, unsaturated lipid and carnitine. Moreover, with the histopathological outcome set as the standard, the profiles analyzed with OPLS-DA showed high feasibility and specificity compared with the Thinprep cytology test (TCT). The metabolic profile obtained for cervical cancer is significant, even for the precancerous disease. This suggests a systemic metabolic response to cancer, which may be used to identify potential early diagnostic biomarkers of the cancer and to establish clinical diagnostic methods.


Introduction
Cervical cancer (CC) is a high-risk human papillomavirus (h-HPV)-induced tumor. Over 500,000 new cases of CC are diagnosed worldwide every year and 130,000 of these are in China, most of which are cervical squamous cell carcinomas (CSCC) (1,2). The region of Xinjiang has one of the highest rates of CC incidence in China and the ratio of CSCC incidence between Uighur and Han women could be as great as 3.4:1 (3,4). Despite the advances in surgery and chemotherapy that have been made over the last 20 years, the overall survival rate for patients with CSCC has not changed significantly. The high mortality rate of CSCC occurs primarily due to the majority of women being diagnosed at an advanced stage of the disease when the 5-year survival rate is 8-12%. By contrast, patients who are accurately diagnosed with earlier stage or precancerous diseases have a 5-year survival rate of 90%. CSCC is characterized by a long period of preclinical dysplasia or carcinoma in situ progressing into invasive cancer. Cervical intraepithelial neoplasia (CIN) is a common type of precancerous disease of CSCC, which is defined by WHO as a potential precancerous condition representing a generalized state associated with a significantly increased risk of cancer. Therefore, early detection and screening of populations at a high risk of CSCC and precursor lesions are attractive strategies to reduce the incidence of CSCC. Although the Papanicolaou (Pap) smear test has contributed significantly to the early detection of precursor lesions, the cytological screening has inherent problems that produce considerable false-negative/-positive results (5,6). Mucins present particularly difficult problems by forming sticky layers or sheets of disorganized cords which appear irregularly in the smear specimen. These approaches tend to provide insufficient diagnostic sensitivity and specificity.
The study of metabolic processes in biological systems has been termed metabonomics. The primary goals of metabonomics are to identify metabolic biomarkers or predictors associated with a specific biochemical event and to relate these to the mechanism of the effect (7). Nuclear magnetic resonance (NMR) spectroscopy is an efficient and nondestructive tool for generating data on a multitude of metabolites in bodily fluids (8,9). Certain studies have previously demonstrated that NMR-based plasma metabonomics may be used to determine the diagnosis and prognosis of disease (10-17). NMR spectroscopy has previously been used to identify the metabolic signatures of CSCC compared with normal tissues and this revealed that the malignant tissue of the cervix differed from the nonmalignant tissue, with higher levels of choline and amino acids and lower levels of glucose (18). 1H NMR spectroscopy for the assessment of apoptosis in cervical carcinomas has revealed that the choline:creatine ratio is significantly higher in CSCC than in normal tissue (18-20). The results of a previous study also revealed that high lactate levels may be used to predict the likelihood of metastases, tumor recurrence and restricted patient survival in human CCs (21). Research has mainly focused on CC tissues since they provide several lines of enquiry for the understanding of the metabolic processes and mechanisms in the development of cancer. Urinary biomarkers which could be used to distinguish between cancer and normal cases have been reported for gynecological cancers, including breast, ovarian and cervical cancer (22). However, the metabonomic analysis of the plasma of patients with CC and precancerous diseases has not been well documented thus far.
In this study, plasma samples from patients with CSCC or CIN as well as from healthy controls were subjected to metabonomic analyses by 1H NMR spectroscopy followed by PCA and OPLS-DA to profile the concentration and composition of the plasma metabolites in the three groups.

Materials and methods
Collection of plasma samples. The study protocol was approved by the Ethics Committee of Xinjiang Medical University. All the diagnoses of CIN and CSCC were confirmed by histopathology. Of a total of 38 patients with CIN, 2 had CIN I, 31 had CIN II and 5 had CIN III, and the average age (±SD) was 39.6±0.7 years. Plasma samples were taken from 38 Uighur patients with CSCC on the date of diagnosis and prior to initial treatment. Their tumor stages (according to the criteria of the International Federation of Gynecology and Obstetrics) were IIb (18 patients), IIIb (16 patients) and IVb (4 patients) and their average age (±SD) was 45.6±0.3 years. Samples of 38 healthy controls were obtained from Uighur individuals who underwent a routine health check. The selection criteria were for the control subjects to be free of neoplasm and any inflammatory disease. The average age (±SD) for the healthy controls was 41.6±0.3 years. The blood samples were collected in tubes prior to the morning meal and the plasma was obtained by centrifugation of the blood samples at 3,500 rpm for 10 min at 4˚C. The plasma samples were stored at -80˚C until NMR analysis.
Preparation of plasma samples and acquisition of 1H NMR spectral data. The frozen plasma samples were thawed prior to use and prepared for NMR analysis by mixing 200 µl of plasma with 400 µl of saline (0.9% w/v NaCl in 20% v/v D2O and 80% v/v H2O). The plasma-saline mixture was left to stand at room temperature for 10 min and was then centrifuged at 10,000 rpm for 10 min. The clear supernatant (550 µl) was then transferred to a 5-mm NMR tube. The samples were analyzed by 1H NMR spectroscopy at 599.95 MHz using a Varian Inova 600 spectrometer at 298 K. Water signals and broad protein resonances were suppressed by a combination of presaturation and the Carr-Purcell-Meiboom-Gill (CPMG) pulse sequence [relaxation delay-90˚-(τ-180˚-τ)n-acquire]. For each sample, 128 scans were collected into 32,768 data points over a spectral width of 10,000 Hz, giving an acquisition time of 1.64 sec; the relaxation delay was 2 sec. 1H NMR spectra were processed and corrected for phase and baseline with Topspin 2.0 software (Bruker Biospin, Rheinstetten, Germany). Chemical shifts were referenced to the anomeric proton of α-glucose at δ5.233 and the spectra were divided into 2,834 integrated regions of 0.003 ppm, covering δH 9.0 to 0.5 ppm. The region 5.20-4.66 ppm was removed in order to avoid the effects of water suppression and the 2.72-2.47 ppm region was also removed due to anticoagulant signals. The data sets were then imported into Microsoft® Excel. Multivariate data analysis was carried out on the normalized NMR data sets using the software package SIMCA-P+11 (Umetrics Inc., Umea, Sweden).
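The acquisition and bucketing figures quoted in the methods can be checked with simple arithmetic. The following is a minimal sketch, assuming that the 32,768 data points count real and imaginary points separately (an assumption, chosen because it reproduces the stated acquisition time); the variable names are illustrative and are not taken from the spectrometer software.

```python
import math

# Acquisition parameters as stated in the text.
total_points = 32768          # assumed to be real + imaginary points
spectral_width_hz = 10000.0   # spectral width in Hz

# Assumption: half of the stored points are complex pairs, so the
# sampled duration is (points / 2) / spectral width.
acquisition_time_s = (total_points / 2) / spectral_width_hz

# Bucketing: delta 9.0 to 0.5 ppm divided into regions of 0.003 ppm.
# The last region is partial, hence the ceiling.
n_bins = math.ceil((9.0 - 0.5) / 0.003)

print(f"acquisition time: {acquisition_time_s:.2f} s")
print(f"integrated regions: {n_bins}")
```

Under this assumption the sketch gives an acquisition time of about 1.64 sec and 2,834 regions, in agreement with the values in the text.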
One-way analysis of variance (ANOVA) and pattern recognition analysis. The spectral segments for each NMR spectrum were normalized to the total integrated area of each spectrum. The integral values were imported into the SIMCA-P+11 software as variables for the multivariate pattern recognition analysis. Principal component analysis (PCA) and the orthogonal projection to latent structure with discriminant analysis (OPLS-DA) methods with unit variance (UV) scaling were carried out for class discrimination and biomarker identification (23). PCA was conducted using mean-centered scaling and the results are presented in score and loading plots; each point in the former represented one sample, whereas the latter showed the magnitude and manner in which the NMR signals (and thus metabolites) contributed to the classification. Further analysis of the NMR data was carried out using OPLS-DA, which is one of the most accurate methods for identifying the metabolic profiles which are associated with a given clinical condition (24-26). The data were visualized with the score plots of the first two principal components (PCs 1 and 2) in order to provide the most efficient 2D representation of the information contained in the data set. Based on the number of samples in each group, a correlation coefficient (Pearson's product-moment correlation coefficient) of 0.33 was used as the cut-off value corresponding to discrimination at the level of P=0.05. The normalized NMR data set was subjected to classical statistical analysis (one-way ANOVA) using SPSS 16.0 software. The ANOVA was carried out using a two-sided Tukey post-test for the comparison of the absolute values of the spectral variables among the plasma samples. P<0.05 was considered to indicate a statistically significant result.

Results
Determination of metabolic changes according to 1H NMR spectra. Fig. 1 shows the typical 1H NMR spectra of a plasma sample obtained from a healthy individual, a patient with CIN and a patient with CSCC (Fig. 1A-C, respectively). The resonances in these spectra were assigned to individual metabolites (based on http://metabolomics.ca) and confirmed using the two-dimensional NMR methods 1H-1H homonuclear correlation spectroscopy (COSY), total correlation spectroscopy (TOCSY) and J-resolved spectroscopy (J-Res). The resulting NMR spectra were then used to visualize chemical shifts and scalar couplings along different spectral dimensions and increase the peak dispersion and therefore the metabolite specificity in the 1D projection (27).
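The two preprocessing steps named in the methods, normalization of each spectrum to its total integrated area and unit variance (UV) scaling of each spectral variable, can be sketched as follows. This is an illustrative sketch only: the function names and the bucket integrals are hypothetical, and UV scaling is assumed here to use the sample standard deviation; it is not the SIMCA-P+ implementation.

```python
from statistics import mean, stdev

def normalize_to_total_area(spectrum):
    """Divide every bucket integral by the sum of all buckets in the spectrum."""
    total = sum(spectrum)
    return [value / total for value in spectrum]

def uv_scale(matrix):
    """Mean-centre each column (variable) and divide by its sample SD.

    Assumes every column has a non-zero standard deviation.
    """
    columns = list(zip(*matrix))
    means = [mean(col) for col in columns]
    sds = [stdev(col) for col in columns]
    return [
        [(value - m) / s for value, m, s in zip(row, means, sds)]
        for row in matrix
    ]

# Hypothetical bucket integrals for three spectra (rows) and three
# spectral regions (columns); the numbers are for illustration only.
raw = [
    [2.0, 6.0, 2.0],
    [1.0, 1.0, 2.0],
    [3.0, 4.0, 3.0],
]

normalized = [normalize_to_total_area(spectrum) for spectrum in raw]
scaled = uv_scale(normalized)

for row in scaled:
    print([round(value, 3) for value in row])
```

After normalization every spectrum sums to one, so differences in overall sample concentration are removed; after UV scaling every variable has zero mean and unit variance, so intense and weak signals carry equal weight in the model.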
Discrimination between different cervical lesions using pattern recognition analysis. Initially, PCA was conducted on the spectral data and two PCs (PC1 and PC2) were calculated to explain the variation in metabolite content. Fig. 2 shows the PCA score plot of the three groups, namely the patients with CIN or CSCC and the healthy controls. The PCA applied to the 1D projection is depicted in Fig. 2 and shows a clear separation between the samples from the healthy subjects and those from patients with CIN and CSCC. The scatter plot shows each set of two groups scattering into different regions. It was possible to achieve a good separation of the patients with CSCC and CIN from the healthy controls (Fig. 2A-C). Although there were several cases showing an overlap between patients with CIN and those with CSCC (Fig. 2C), PCA not only differentiated between the disease and control samples but also showed the potential to distinguish between the precancerous and cancer cases with a high specificity.
To obtain a more objective statistical estimation and specific loadings, we used OPLS-DA to build a model discriminating between the samples from patients with CIN and CSCC and those of the healthy controls; this model was more focused on the distinguishing variation than the PCA approach (Fig. 3). In this case, the sensitivity and specificity for the detection of CIN and CSCC were >90 and >95%, respectively, following the application of the Venetian blind algorithm for cross-validation. Fig. 3A shows the OPLS-DA scatter plot of the CIN patients and the healthy controls (R²X=0.434, R²Y=0.668, Q²=0.569). Notably, several samples from patients with CIN were mixed in the cluster of healthy samples and two healthy samples were mixed in the cluster of CIN samples. Similarly, the OPLS-DA scatter plot for the CSCC patients and the healthy controls (Fig. 3B, R²X=0.34, R²Y=0.87, Q²=0.83) shows a clear discrimination between samples from the two groups. It was observed that the cluster of patients with CSCC was located at a distance from that of the healthy controls. Loading plots were calculated from the OPLS-DA models in order to identify discriminatory metabolites for different models. According to the correlation coefficients which resulted from the OPLS-DA, 22 metabolites may be used to quantitatively separate the CSCC, CIN and healthy control groups. Table I summarizes the variations in certain metabolite signals in the plasma of CIN and CSCC patients compared with that of the healthy controls. Positive values represent a relatively low abundance in the plasma of patients with CSCC and CIN compared with the healthy controls. Negative values represent a high abundance in the plasma of patients with CSCC and CIN compared with the healthy controls. In the samples from the CIN patients, the levels of very-low density lipoprotein (VLDL), acetone, unsaturated lipid and carnitine were increased compared with those of the healthy controls, whereas the levels of creatine, lactate, isoleucine, leucine, valine, alanine, glutamine, histidine, glycine, acetylcysteine, myo-inositol, choline and glycoprotein were reduced. Similarly, plasma from patients with CSCC had higher levels of acetate and formate, together with lower levels of creatine, lactate, isoleucine, leucine, valine, alanine, glutamine, histidine and tyrosine compared with the healthy controls. Concerning the differences between samples from patients with CSCC and CIN, an altered plasma concentration of acetone, acetate, formate, glycoprotein, α-glucose and β-glucose may form a unique profile which could be used to separate CSCC and CIN.
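The Venetian blind cross-validation mentioned above holds out every k-th sample in turn, so that each sample is predicted exactly once by a model that was built without it. A minimal sketch of the fold assignment follows; the function name is hypothetical, and the choice of 7 folds and of 76 samples (two groups of 38) is an assumption made for illustration, since the text does not state the number of cross-validation groups.

```python
def venetian_blind_folds(n_samples, n_folds):
    """Return one list of held-out sample indices per fold.

    Fold j contains samples j, j + n_folds, j + 2 * n_folds, ...
    """
    return [list(range(start, n_samples, n_folds)) for start in range(n_folds)]

# Assumed sizes: 38 patients plus 38 controls, 7 cross-validation groups.
folds = venetian_blind_folds(n_samples=76, n_folds=7)

for fold_number, held_out in enumerate(folds):
    print(fold_number, held_out)
```

Because the held-out indices are interleaved, the scheme only gives a fair estimate when the samples are not ordered in a way that coincides with the fold spacing.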

Diagnostic sensitivity of the metabonomic profiles of CIN and CSCC by 1H NMR spectroscopy. All samples from patients were examined with the Thinprep cytological test (TCT), a routine method in the clinical screening of cervical diseases. The results revealed that, of all 38 cases positive for CIN and CSCC confirmed by the histopathological diagnosis, which is a current standard diagnostic criterion, 7 and 4 were negative with TCT, respectively. We matched the sensitivity of the metabonomic profiles derived from the OPLS-DA, discriminating patients with CIN or CSCC from the healthy controls, with the TCT test. The analysis revealed a high sensitivity of the metabonomic profile for the clinical diagnosis of CIN and CSCC (91.6 and 100%, respectively), whereas TCT had a lower sensitivity for the same cases (80.6 and 88.9%; Table II). This indicates that metabonomic profiling with 1H NMR spectroscopy may increase the feasibility and accuracy of the diagnosis.
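Sensitivity is the fraction of histopathology-confirmed cases that a test reports as positive. The sketch below applies this definition to the TCT counts quoted in the text (38 confirmed cases in each group, of which 7 CIN and 4 CSCC cases were TCT-negative). The function name is illustrative, and the output is computed only from these quoted counts, so it need not coincide exactly with the Table II values, whose underlying counts are not reproduced in the text.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of confirmed cases that the test detects."""
    return true_positives / (true_positives + false_negatives)

confirmed_cases = 38  # per group, as quoted in the text

# TCT-negative results among confirmed cases, as quoted in the text.
tct_cin = sensitivity(confirmed_cases - 7, 7)
tct_cscc = sensitivity(confirmed_cases - 4, 4)

print(f"TCT sensitivity for CIN:  {100 * tct_cin:.1f}%")
print(f"TCT sensitivity for CSCC: {100 * tct_cscc:.1f}%")
```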

Discussion
In this study, compared with plasma samples from healthy individuals, plasma from patients with CIN had higher levels of VLDL, acetone, unsaturated lipid and carnitine, together with lower levels of creatine, lactate, isoleucine, leucine, valine, alanine, glutamine, histidine, glycine, acetylcysteine, myo-inositol and choline. Similarly, plasma from patients with CSCC had higher levels of acetate and formate, together with lower levels of creatine, lactate, isoleucine, leucine, valine, alanine, glutamine, histidine and tyrosine. The scatter plots of the pattern recognition analysis were capable of distinguishing CIN and CSCC patients from the healthy controls. Compared with samples from patients with CIN, the plasma of CSCC patients had higher levels of acetate, formate, lactate, isoleucine, leucine, valine, alanine, glutamine, histidine, tyrosine, acetylcysteine, myo-inositol, glycoprotein, α-glucose and β-glucose, together with lower levels of acetone, unsaturated lipid and carnitine.
Although tumor tissue is restricted to a certain organ, cancer is believed to be a disease of the host and an indication that the whole body has changed to a pathological state, in which the body is functionally unable to defeat or remove tumor cells. As a result, the plasma of patients with cancer in principle contains all the information concerning the pathogenic changes caused by the disease in the content of the metabolites, which reflect aberrant alterations at the level of gene expression and regulation as well as abnormalities in the function of multiple organs and tissues. Since blood plasma represents the effects of metabolism in different organs, it is difficult to assign a metabolic fingerprint to specific metabolic processes. Nevertheless, changes in the blood plasma metabolite concentrations of patients with CIN and CSCC clearly point to an altered energy metabolism. The relatively low level of lactate is markedly different from observations in other types of cancer where, commonly, lactate levels are high (10,21,28). Lactate is a metabolite that is found at high levels under the conditions of tumor hypoxia, in which the lack of intracellular oxygen alters the balance of cellular energy production from oxidative phosphorylation to glycolysis, which is specific to the tumor nest (29,30). A decreased level of lactate may result from an increased energy metabolism, which leads to lactate being directly decomposed into H2O and CO2 (31). Precursors of glucose in gluconeogenesis, including lactate and alanine, were found at lower concentrations in patients with CIN and CSCC, which clearly points to an altered energy metabolism. Furthermore, several precursors of tricarboxylic acid (TCA) cycle intermediates, including valine and isoleucine, were found at a lower concentration in patients with CIN and CSCC, suggesting a suppressed TCA cycle. This represents a typical signature in cancer patients and it has been previously confirmed that tumors rely on glycolysis as a main source of energy, even in the presence of oxygen (32).
Increased levels of acetone, carnitine, unsaturated lipid and VLDL in plasma from patients with CIN further support the hypothesis that the rate of lipid metabolism was increased in response to the tissue injury caused when CIN occurs, since they are products of lipid metabolism. The role of these metabolites during the progression of CC has also been described in previous studies. This reflects the activation of the lipolysis pathway as a backup mechanism for energy production (33). Choline and its derivatives are constituents of the phospholipid metabolism pathway and have been previously identified as markers of cellular proliferation (31). In contrast to previous studies, the low level of choline found in this study indicates an activation of the phosphatidyl choline pathway and consequently a high cell membrane turnover or the activation of cell proliferation (34-37). Consistent with our study, a decrease in the concentration of glutamine was also documented in the plasma of patients with early stage head and neck cancer (38).
Table II. Comparison of the sensitivity and false-positive rate of the diagnosis of CSCC and CIN with the TCT method and 1H NMR metabonomics coupled with OPLS-DA.
CSCC patients were characterized as having a relatively low abundance of creatine, lactate and amino acids as well as a high abundance of acetate and formate in their plasma compared with the healthy controls. The relatively low concentration of lactate and creatine indicated an increased energy metabolism and the concomitant decrease in the concentration of amino acids may not imply an increased rate of protein biosynthesis, but an increased energy consumption at the expense of the amino acids. As acetate and formate are intermediates of pyrimidine and amino acid degradation, respectively, the high abundance of these metabolites may represent a deregulated energy metabolism in CSCC patients. In cases of CIN, the pattern of a low abundance of metabolites appears to be representative and common to CIN and CSCC when compared with the healthy controls, indicating that a deregulated energy metabolism had occurred as early as in CIN, which is pathologically classified as a possible precursor of CSCC. To compensate for the increased energy consumption, lipid mobilization and transport may already be initiated in CIN, as VLDL, acetone, unsaturated lipids and carnitine were found in high abundance and contributed to the profiling of CIN patients compared with the healthy controls.
CIN is generally believed to be the precursor of CSCC. However, in contrast to the lower concentrations of amino acids, lactate and creatine in CSCC compared with the healthy controls, the ratio of these metabolites was inversely correlated between CSCC and CIN and may not follow the pathological order from normal through CIN to CSCC, suggesting a more complex mechanism in which CIN develops as a distinct disease rather than simply as a precursor of the cancer. Nevertheless, the low abundance of unsaturated lipids, acetone and carnitine and the high abundance of acetate and formate may still represent an active lipid catabolism and a more disordered energy metabolism in CSCC than in CIN.
According to the statistical analysis using the PCA and OPLS-DA as unsupervised and supervised methods, respectively, the populations of patients with CSCC or CIN and the healthy controls were scattered into two regions. This represents a good separation of the cancer from non-cancer cases with the pattern of metabolites and suggests that patients with CSCC have a specific profile which is different from those with CIN and the healthy controls. In this study, however, both approaches were unable to separate cases of CIN from the healthy controls, as several cases showed an overlap between the two groups. This result may be reasonable for CIN, since the CIN was not further pathologically classified into CIN I, II and III due to the limited number of cases enrolled in this study. In particular, the definition of CIN I as a precancerous stage has been controversial and this may have an impact on the profiling of CIN. In this study, several samples from patients with CIN were mixed in the cluster of the healthy samples. These CIN patients had low-grade cervical squamous intraepithelial lesions (CIN I). However, the sensitivity for CIN and CSCC detection by 1H NMR coupled with OPLS-DA (91.6 and 100%, respectively) was higher than that of the TCT (80.6 and 88.9%, respectively).
The results of this study and those carried out previously suggest that there are distinct metabonomic signatures that are able to distinguish between CSCC, CIN and healthy controls. Most of the alterations may reflect an altered energy metabolism or a deregulated metabolism of corresponding metabolites to compensate for the energy consumption of the cancer. In future studies, it is important to further define the precancerous lesions according to the pathological criteria, for example CIN I, II and III, to optimize the metabonomic analysis of CIN with a relatively large number of samples, and also to compare early- with late-stage CSCC. The results of the present study indicate that plasma NMR spectra, combined with pattern recognition analysis techniques, offer an efficient and convenient method to depict tumor biochemistry. This may contribute to the early diagnosis of human malignant diseases and to the discovery of potential plasma biomarkers from blood samples.
The separation between the CSCC and healthy control clusters in Fig. 3B indicates that the metabolic profile of the CSCC patients is different from that of the healthy controls. Patients located towards the edges of the plots were CSCC patients with metastases. Fig. 3C shows the scatter plot of CIN and CSCC patients (R²X=0.31, R²Y=0.89, Q²=0.84) which, although scattered, were located in different clusters, demonstrating a different metabolic profile in patients with CIN and CSCC.
a CIN compared with the healthy control group; b CSCC compared with the healthy control group. The correlation coefficient of |r|=0.29 was used as a cut-off value for statistical significance (one-way ANOVA, P<0.05). A correlation coefficient with a positive or negative value in parentheses represents a relatively low ↓ or high ↑ concentration of a given metabolite, respectively. (-) indicates a metabolite statistically not significant and excluded from the analysis of the corresponding groups. s, singlet; d, doublet; t, triplet; q, quartet; m, multiplet; dd, doublet of doublets; CIN, cervical intraepithelial neoplasia; CSCC, cervical squamous cell carcinoma; VLDL, very-low density lipoprotein; CPMG, Carr-Purcell-Meiboom-Gill pulse sequence; J-Res, J-resolved spectroscopy.