Professor Xiaogang Zhao, Department of Thoracic Surgery, The Second Hospital of Shandong University, 247 Beiyuan Street, Jinan, Shandong 250021, P.R. China
Lung adenocarcinoma (LUAD) is the predominant pathological subtype of lung cancer, which is the most prevalent and lethal malignancy worldwide. Cyclins have been reported to regulate the physiology of various types of tumors by controlling cell cycle progression. However, the key roles and regulatory networks associated with the majority of the cyclin family members in LUAD remain unclear. In total, 556 differentially expressed genes were screened from the GSE33532, GSE40791 and GSE19188 mRNA microarray datasets using R software. Subsequently, a protein-protein interaction network containing 499 nodes and 4,311 edges, in addition to a significant module containing 76 nodes and 2,631 edges, was extracted using the MCODE plug-in of Cytoscape. A total of four cyclin family genes [
Lung cancer is the most prevalent and lethal malignancy in the world, with lung adenocarcinoma (LUAD) being the predominant pathological subtype (
Cyclins are a class of proteins that control cell cycle progression by activating CDK enzymes (
Based on the RNA microarray data of GSE33532, GSE40791 and GSE19188, the present study used bioinformatics methods to search for differentially expressed genes (DEGs) between LUAD and adjacent normal lung tissue. A protein-protein interaction (PPI) network was then established to screen for key genes enriched in the cyclin gene family. Online databases were implemented to validate the expression, PPI and clinical relevance of the hub genes. The purpose of the present study was to search for genes in the cyclin family that are associated with LUAD in addition to their potential upstream regulators. It is anticipated that this information could reveal potential targets for subsequent experimental validation.
In the present study, the microarray datasets were searched and downloaded from Gene Expression Omnibus (GEO) using the following criteria: i) The dataset was generated on an Affymetrix array under the GPL570 platform; ii) the tissue source was human LUAD samples and adjacent normal samples; and iii) the study contained ≥20 LUAD and 20 normal samples. Finally, three datasets based on the GPL570 platform were selected, namely GSE19188, GSE33532 and GSE40791. Specifically, GSE19188 included 40 LUAD samples and 65 adjacent normal lung tissue samples (
The gene expression matrix and associated annotation files of the three aforementioned datasets were downloaded from the GEO database, and the probe-level expression matrix of each array was then converted into a gene-level matrix using the ‘affy’ package of R software (
The RRA method is a tool that can be used for integrating data from multiple microarray studies with minimal inconsistencies to robustly identify DEGs (
GO and KEGG enrichment analysis has been extensively utilized for deciphering microarray data to further the understanding of the biological functions of each gene (
Using the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING;
The enriched gene family was selected according to the gene module screened by Cytoscape. An expression correlation matrix was then constructed for the genes of this family in each of the three datasets, and genes with a high positive Pearson correlation (R>0.6; P<0.05) were selected as key genes.
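The correlation filter described above can be illustrated with a minimal Python sketch. This is not the code used in the present study (the analysis was performed in R); the gene names `GENE_A`, `GENE_B` and `GENE_C`, the expression values and the helper names are hypothetical placeholders. Only the R>0.6 cut-off is applied; the P<0.05 filter is deliberately omitted, since it requires a t-distribution (for example, as returned by `scipy.stats.pearsonr`).

```python
import math
from itertools import combinations


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlated_pairs(expression, r_cutoff=0.6):
    """Return {(gene1, gene2): r} for gene pairs whose Pearson r exceeds the cut-off.

    `expression` maps a gene name to its expression values across samples.
    The P-value filter from the text is NOT applied here (assumption of this sketch).
    """
    pairs = {}
    for g1, g2 in combinations(sorted(expression), 2):
        r = pearson_r(expression[g1], expression[g2])
        if r > r_cutoff:
            pairs[(g1, g2)] = r
    return pairs


# Toy, made-up expression values for three placeholder genes across five samples.
toy_expression = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "GENE_B": [2.0, 4.0, 6.0, 8.0, 10.5],
    "GENE_C": [5.0, 1.0, 4.0, 2.0, 3.0],
}
selected = correlated_pairs(toy_expression)
```

In this toy example only the `GENE_A`/`GENE_B` pair passes the cut-off; pairs involving `GENE_C` are negatively correlated and are dropped.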
Subsequently, three datasets (GSE33532, GSE40791 and GSE19188) and the Gene Set Cancer Analysis (GSCA;
In the present study, the UALCAN database (
The LUAD cell line Beas-2B was cultured in high-glucose DMEM (cat. no. 23-10-013-CV; Corning, Inc.), the LUAD cell line A549 was cultured in high-glucose F12K medium (cat. no. 21127022; Thermo Fisher Scientific, Inc.), and the human bronchial epithelial cell line 16-HBE and the LUAD cell line H1299 were cultured in high-glucose RPMI-1640 medium (cat. no. 10-040-CV; Corning, Inc.). All media contained 10% FBS (cat. no. 10091148; Gibco; Thermo Fisher Scientific, Inc.), 100 U/ml penicillin and 100 µg/ml streptomycin, and the cells were routinely maintained in an incubator containing 5% CO2 at 37˚C. The three cell lines were purchased from FuHeng Cell Center (
According to the manufacturer's protocol, TRIzol® reagent (cat. no. 15596026; Invitrogen; Thermo Fisher Scientific, Inc.) was used to isolate total RNA from the 16-HBE, A549, Beas-2B and H1299 cells. Reverse transcription was performed using the SuperScript First-Strand Synthesis System (cat. no. 18080051; Invitrogen; Thermo Fisher Scientific, Inc.) with oligo(dT)20 primer and 5.0 µg RNA to synthesize the first strand of cDNA. GAPDH was used as the endogenous control, and the primers were synthesized by Beijing Tsingke Biotechnology Co., Ltd. Primer sequences are provided in
Protein lysates for western blotting were collected from the 16-HBE, Beas-2B, A549 and H1299 cell lines using RIPA lysis buffer (cat. no. P0013B; Beyotime Institute of Biotechnology) containing protease inhibitor cocktail. Protein concentrations were determined using the BCA Protein Assay Kit (cat. no. A53225; Thermo Fisher Scientific, Inc.), and 20 µg protein lysate per lane was separated by 10% SDS-PAGE and transferred to a PVDF membrane (Bio-Rad Laboratories, Inc.). After blocking in 5% non-fat milk dissolved in TBST buffer for 60 min at room temperature, the membranes were washed three times with TBST containing 1% Tween 20 (cat. no. P1379; Sigma-Aldrich; Merck KGaA) and then incubated with the following primary antibodies diluted in 5% BSA (cat. no. ST2254; Beyotime Institute of Biotechnology): CCNA2 (1:1,000; cat. no. 18202-1-AP), CCNB1 (1:1,000; cat. no. 28603-1-AP), CCNB2 (1:1,000; cat. no. 21644-1-AP; all from ProteinTech Group, Inc.) and ACTB (1:10,000; cat. no. AC026; Abclonal Biotech Co., Ltd.) for 6-8 h at 4˚C. HRP-linked secondary antibodies (1:20,000; cat. no. SA00001-2; ProteinTech Group, Inc.) were then used to probe the primary antibodies for 1 h at room temperature. Finally, the immunoreactive protein bands were visualized using an ECL kit (cat. no. WBKLS0500; MilliporeSigma), and the images were obtained by scanning with a fluorescence imager (Typhoon FLA 7000; Cytiva). The blot bands were quantified using ImageJ (version 1.52; National Institutes of Health).
In total, 10 pairs of LUAD and adjacent normal tissues with complete pathological data were collected from the Second Hospital of Shandong University (Jinan, China) between January 1, 2021 and December 31, 2021. The age of the patients was 61.2±6.3 years, and the cohort included 4 women and 6 men. The present study was approved [approval no. KYLL-2020(KJ)P-0099] by the Medical Ethics Committee of the Second Hospital of Shandong University (Jinan, China). Written informed consent was obtained from all participants. The human LUAD specimens were formalin-fixed and paraffin-embedded for 24 h at 4˚C and cut into 4-µm sections. The IHC staining kit (cat. no. PV-6000; ZSGB-BIO) was used for the experiment according to the manufacturer's instructions. DAB (cat. no. ZLI-9017; ZSGB-BIO) was used for staining (37˚C for 90 sec). The final immunostaining images were obtained using a NanoZoomer Digital Pathology scanner (NanoZoomer S60; Hamamatsu Photonics K.K.). Protein expression was analyzed by calculating the integrated optical density (IOD/area) of each stained region using Image-Pro Plus version 6.0 (Media Cybernetics, Inc.).
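As a rough illustration of the IOD/area measure, the following Python sketch sums the optical density of stained pixels and divides by the stained area. It does not reproduce the calibration performed by Image-Pro Plus: the conversion OD = log10(I0/I) with I0 = 255, the staining threshold of 0.15 and the function names are all assumptions made for this example.

```python
import math


def optical_density(intensity, incident=255.0):
    """Convert a transmitted-light pixel intensity to optical density.

    Assumes OD = log10(I0 / I) with I0 = 255 (8-bit white); this is an
    assumption of the sketch, not the calibration used by Image-Pro Plus.
    """
    clipped = max(float(intensity), 1.0)  # avoid taking the log of zero
    return math.log10(incident / clipped)


def iod_per_area(pixels, od_threshold=0.15):
    """Return (IOD, area, IOD/area) for the stained pixels of one region.

    A pixel counts as stained when its optical density exceeds the
    hypothetical `od_threshold`; area is the number of stained pixels.
    """
    densities = [optical_density(p) for p in pixels]
    stained = [d for d in densities if d > od_threshold]
    area = len(stained)
    iod = sum(stained)
    return iod, area, (iod / area if area else 0.0)


# Made-up pixel intensities: two unstained (white) pixels and two stained pixels.
iod, area, mean_density = iod_per_area([255, 255, 25.5, 2.55])
```

Dividing by the stained area makes the value comparable between regions of different size, which is why a ratio rather than the raw IOD is compared between tumor and adjacent normal tissue.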
To evaluate the diagnostic efficacy of the hub genes for LUAD, the three datasets GSE33532, GSE40791 and GSE19188 were combined. The raw expression data were then normalized using the Affy package (
MASS package (
A ROC curve for this model was constructed using the ‘pROC’ package (
Statistical comparisons were performed using SPSS 25.0 (IBM Corp.). The Pearson correlation coefficient between cyclin family genes was calculated using R (
In the present study, GSE19188, GSE33532 and GSE40791 were included for analysis, with a total of 179 LUAD samples and 185 normal samples. These three microarray datasets were first standardized by quantiles to mitigate individual differences among samples. A total of 1,883, 3,079 and 2,258 DEGs were screened from the GSE19188, GSE33532 and GSE40791 datasets, respectively (
The RRA method assumes that the rank of each gene is known (
GO analysis revealed that biological processes of the significant DEGs were associated with the cell cycle, including ‘mitotic nuclear division’, ‘extracellular matrix organization’, ‘extracellular structure organization’, ‘mitotic sister chromatid segregation’ and ‘chromosome segregation’ (
A total of 499 nodes and 4,311 edges were found in the PPI network (
A total of four cyclin family genes (CCNA2, CCNB1, CCNB2 and CCNE2) were clustered in module 1. To analyze the cyclin family genes, a cyclin family gene expression correlation matrix was made for the three datasets. The results revealed that
To investigate the correlation in the expression of the six cyclin family genes, the GEPIA online tool was used to obtain the Pearson's rank coefficient results among these genes. According to the pairwise gene expression correlation analysis, GEPIA revealed significant positive correlation among
The UALCAN database is based on The Cancer Genome Atlas data (
RT-qPCR was used to verify the mRNA expression levels of these genes in the cell lines. The results showed that the mRNA expression of the hub genes was significantly higher in the two non-small cell lung cancer cell lines Beas-2B and H1299 compared with human bronchial epithelial cell line 16-HBE, and the expression of four hub genes was upregulated in A549 cell line, in which there was a significantly high expression in CCNB1 and CCNB2 (
The diagnostic efficacy of all hub genes was assessed by constructing multi-factorial logistic regression models from the training set, where
Since the expression levels of
Dysregulation in cell cycle control can lead to tumor progression. Cyclins are cell cycle regulators that are associated with numerous types of cancer (
In the present study, expression profiling and functional enrichment analysis revealed that four significant DEGs, namely
Using online databases, the high expression of hub genes was found in LUAD. Furthermore, the expression of
The protein encoded by
CCNB2 is a B-type cyclin (
CCNA2 belongs to a highly conserved cyclin family (
FOXM1 is a member of the FOX transcription factor family that serves an important role in cell proliferation, differentiation and survival (
Nomograms have been widely applied for predicting prognosis and outcome in a clinical setting by combining multiple risk factors (
In the training set and validation set, all the AUC values of the present diagnostic model were >0.9. Therefore, according to this nomogram, the diagnostic evaluation of LUAD based on the expression levels of
However, the present study has several limitations. Although the present study found that the higher expression of
In conclusion, in the present study bioinformatics analysis identified that
Not applicable.
The data generated and/or analyzed during the current study are available in the GEO database under accession numbers GSE19188, GSE33532, GSE40791, GSE10072 and GSE75037, or at the following URLs:
XZ and XY confirm the authenticity of all the raw data. XZ provided the funding support of the study and designed this project. XY wrote the manuscript, and analyzed and interpreted the data. HG and ZT organized all the figures and interpreted data. YZ and PL performed tissue culture, RT-qPCR and western blot experiments, and revised the manuscript, figures and table. All authors read and approved the final manuscript.
The present study was approved [approval no. KYLL-2020(KJ)P-0099] by the Medical Ethics Committee of the Second Hospital of Shandong University (Jinan, China). Written informed consent was obtained from all participants.
Not applicable.
The authors declare that they have no competing interests.
Volcano plots of the three microarray datasets. Differentially expressed genes of LUAD and normal samples in (A) GSE19188, (B) GSE33532 and (C) GSE40791. Red points represent upregulated genes, whilst green points represent downregulated genes. Black points represent genes with no significant difference in expression. (D) Heatmap of the top 10 up- and downregulated genes according to robust rank aggregation analysis. Red and blue represent genes with higher and lower expression levels in patients with LUAD, respectively. LUAD, lung adenocarcinoma.
GO and pathway enrichment analysis of differentially expressed genes in lung adenocarcinoma. Top 10 GO terms in (A) biological processes, (B) cellular components, (C) molecular function and (D) Kyoto Encyclopedia of Genes and Genomes pathways. GO, gene ontology.
PPI network and the most significant module formed by the DEGs in lung adenocarcinoma. (A) PPI network of DEGs. (B) The most significant module from the PPI network, containing 76 differentially expressed genes. (C) The most significantly enriched pathway for module 1. Red represents genes with higher expression levels, whilst blue represents genes with lower expression levels. Yellow represents the cell cycle pathway. PPI, protein-protein interaction; DEGs, differentially expressed genes.
Cyclin family gene expression profile. Correlation matrix of the expression levels of all genes in the cyclin family in the (A) GSE19188, (B) GSE33532 and (C) GSE40791 datasets. (D) Expression of six cyclin family genes (
Correlation in the expression of the hub genes. Correlations between (A) CCNB2-CCNA2, (B) CCNB1-CCNA2 and (C) CCNB1-CCNB2 are significant (R>0.75; P<0.001). Correlation between Forkhead box M1 and (D) CCNA2, (E) CCNB1 and (F) CCNB2 in lung adenocarcinoma was revealed by GEPIA (R=0.76, 0.74 and 0.73 respectively). CCN, cyclin.
Expression levels of hub genes and patient prognosis. (A-D) Identifying the expression levels of the hub genes in LUAD using UALCAN. (E-H) Association between hub gene expression and adenocarcinoma tumor stage. (I-L) Survival curves comparing the prognosis of patients with high and low expression levels of hub genes in LUAD according to the GEPIA database. *P<0.05, **P<0.01 and ***P<0.001. LUAD, lung adenocarcinoma.
Expression of hub genes in LUAD cells and tissues. The expression levels of (A) CCNA2, (B) CCNB1, (C) CCNB2 and (D) FOXM1 mRNA in the non-small cell lung cancer cell lines A549, Beas-2B and H1299 and the control cell line 16-HBE were validated by reverse transcription-quantitative PCR. (E) Western blot analysis was used to detect the expression of CCNA2, CCNB1 and CCNB2 in LUAD cell lines; ACTB was used as the loading control. Each column represents the mean ± SD from independent experiments. (F) Immunohistochemistry was used to analyze the expression of hub genes in LUAD and adjacent normal tissues (magnification, x200; scale bar, 100 µm). (G) Image-Pro Plus version 6.0 was used to calculate the integrated optical density/area and analyze the hub gene protein expression level. Statistical analysis of results in panels A-E was performed using one-way ANOVA, and comparisons between groups were performed using Dunnett's test. Statistical analysis of results in panel F was performed using a t-test. *P<0.05, **P<0.01, ***P<0.001 and ****P<0.0001 compared with 16-HBE. LUAD, lung adenocarcinoma; CCN, cyclin.
Nomogram construction and evaluation. (A) Nomogram constructed using CCNB1, CCNB2 and FOXM1 expression. Red dots represent the outcome values of one of the samples and the predicted results in the nomogram. (B) Receiver operating characteristic curves and area under the curve values of the diagnostic model in the training set (yellow line), validation set GSE10072 (green line) and validation set GSE75037 (red line). (C) Calibration curves for the nomogram, where the prediction curves were close to the ideal curve and the Hosmer-Lemeshow test result was ~1, suggesting consistency for the detection of lung adenocarcinoma for the hub genes used. (D) Decision curve analysis graph. The gray line represents the hypothesis that all lesions are malignant (full treatment option). The solid black line represents the hypothesis that all lesions are benign (no treatment option). The black dashed line represents the decision to treat lesions as benign or malignant based on the fitted model. The results show that using the nomogram to predict malignancy yields a greater net benefit for treatment planning than treating all lesions (gray solid line) or treating none (black solid line).
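For readers unfamiliar with the area under the curve (AUC) reported for the diagnostic model, the following Python sketch shows one standard way to compute it: the AUC equals the probability that a randomly chosen tumor sample receives a higher model score than a randomly chosen normal sample, with ties counted as one half. This is an illustrative stand-in, not the pROC-based R code used in the present study; the function name and the toy scores are hypothetical.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of scores.

    `scores` are predicted probabilities (or any ranking score) and
    `labels` are 1 for tumor and 0 for normal samples. Every tumor/normal
    pair contributes 1 if the tumor score is higher, 0.5 if tied, 0 otherwise.
    """
    positives = [s for s, lab in zip(scores, labels) if lab == 1]
    negatives = [s for s, lab in zip(scores, labels) if lab == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up model scores for three tumor (1) and three normal (0) samples.
toy_scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
toy_labels = [1, 1, 1, 0, 0, 0]
toy_auc = roc_auc(toy_scores, toy_labels)
```

In the toy data, eight of the nine tumor/normal pairs are ranked correctly, so the AUC is 8/9; a value of 1.0 would indicate perfect separation and 0.5 a model no better than chance.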