Gene expression profiling via bioinformatics analysis reveals biomarkers in laryngeal squamous cell carcinoma

The present study aimed to identify key genes and relevant microRNAs (miRNAs) involved in laryngeal squamous cell carcinoma (LSCC). The gene expression profiles of LSCC tissue samples were analyzed with various bioinformatics tools. A gene expression data set (GSE51985), including ten laryngeal squamous cell carcinoma (LSCC) tissue samples and ten adjacent non-neoplastic tissue samples, was downloaded from the Gene Expression Omnibus. Differential analysis was performed using software package limma of R. Functional enrichment analysis was applied to the differentially expressed genes (DEGs) using the Database for Annotation, Visualization and Integrated Discovery. Protein-protein interaction (PPI) networks were constructed for the protein products using information from the Search Tool for the Retrieval of Interacting Genes/Proteins. Module analysis was performed using ClusterONE (a software plugin from Cytoscape). MicroRNAs (miRNAs) regulating the DEGs were predicted using WebGestalt. A total of 461 DEGs were identified in LSCC, 297 of which were upregulated and 164 of which were downregulated. Cell cycle, proteasome and DNA replication were significantly over-represented in the upregulated genes, while the ribosome was significantly over-represented in the downregulated genes. Two PPI networks were constructed for the up- and downregulated genes. One module from the upregulated gene network was associated with protein kinase. Numerous miRNAs associated with LSCC were predicted, including miRNA (miR)-25, miR-32, miR-92 and miR-29. In conclusion, numerous key genes and pathways involved in LSCC were revealed, which may aid the advancement of current knowledge regarding the pathogenesis of LSCC. In addition, relevant miRNAs were also identified, which may represent potential biomarkers for use in the diagnosis or treatment of the disease.


Introduction
Laryngeal squamous cell carcinoma (LSCC) is the most common type of laryngeal cancer. It is able to spread to regional cervical lymph nodes, or to more distant tissues, for example the lung. Improvements in current therapies have resulted in an improved quality of life for patients with LSCC (1). However, survival rates have not been significantly improved, which identifies the need for a change in clinical approach, requiring novel biomarkers for diagnosis, prognostic assessment and drug design (2).
Certain achievements have been made in the identification of biomarkers of LSCC. Järvinen et al (3) performed high-resolution copy number and gene expression microarray analyses to identify 739 genes overexpressed in LSCC. Gajecka et al (4) reported that polymorphisms of CYP1A1, CYP2D6, CYP2E1, NAT2, GSTM1 and GSTT1 were associated with an increased risk of LSCC. HLA class I antigen downregulation was identified as a poor prognostic marker for LSCC, which may reflect the reduction in the extent of CD8(+) T cell infiltration in LSCC lesions (5). Overexpression of osteopontin enhances the proliferation and invasiveness of LSCC (6), suggesting that it may represent a potential therapeutic target.
MicroRNAs (miRNAs) are short regulatory RNAs that modulate gene expression at the post-transcriptional level, and are involved in the pathogenesis of numerous types of cancer (7). Multiple miRNAs have been associated with LSCC. Overexpression of miR-21 contributes to the malignant phenotype of LSCC via inhibition of BTG family member 2 (8). miR-203 inhibits the proliferation of laryngeal carcinoma cells by modulating their survival (9). In addition, let-7a (10), miR-16 (11) and hsa-miR-34c (12) also have roles in laryngeal carcinoma.
Microarray technology provides global patterns of gene expression and therefore facilitates biomarker discovery. Lian et al (13) investigated tumorigenesis and regional lymph node metastasis in LSCC, whereas the present study focused specifically on the discovery of biomarkers associated with tumorigenesis. In the present study, gene expression profiles of LSCC were analyzed with a variety of bioinformatics tools, including functional enrichment and network analyses, in order to identify novel potential biomarkers. Additionally, miRNAs targeting these genes were also predicted. The identification of such biomarkers may be useful in the diagnosis and/or treatment of LSCC.
Pretreatment and differential analysis. According to the annotation files, probes were initially mapped into the genes. If more than one probe was mapped into a single gene, levels of the probes were averaged as the final expression level for the specific gene. Following normalization, differential analysis was performed between the ten LSCC tissues and corresponding adjacent non-neoplastic tissues using the Linear Models for Microarray Analysis package (limma; http://www.bioconductor.org/packages/ release/bioc/html/limma.html) (14) of R. P<0.05 and |log2 (fold change)|>1 were set as the threshold levels for the identification of differentially expressed genes (DEGs).
Functional enrichment analysis. Functional enrichment analysis facilitated the identification of altered biological functions. In the present study, functional enrichment analysis was performed on the DEGs using the Database for Annotation, Visualization and Integration Discovery, version 6.7 (DAVID; http://david.abcc.ncifcrf.gov/) (15), which is able to reveal enriched Gene Ontology (GO) terms; the Kyoto Encyclopedia of Genes and Genomes (KEGG) for pathways (16) and InterPro, version 34.0 for protein domains (17) based on the hypergeometric distribution. P<0.05 was set as the threshold.

Construction of protein-protein interaction (PPI) networks.
Proteins 'work together' to exert certain biological functions, and the genome-wide identification of PPIs represents a significant step in the elucidation of the underlying molecular mechanisms. In present study, PPI networks were constructed for the protein products using information from the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING, version 9.1; http://string-db.org/) (18). Interactions with a score (i.e. required confidence) >0.4 were retained in the network.
The proteins in the network serve as the 'nodes', and each pairwise protein interaction is represented by an undirected link. The 'degree' of a node corresponds to the number of interactions of that particular protein. The most highly connected nodes (those of a high degree) were considered to be the network 'hubs'.
Module analysis of the network. The PPI networks and regulatory associations between miRNAs and target genes were combined and subsequently visualized with Cytoscape, version 2.6.3 (http://cytoscape.org/) (19). Functional modules of the network were explored using ClusterONE, version 1.0 (http://www.paccanarolab.org/clusterone), a Cytoscape software plugin (20). P<0.01 was set as the cut-off value.
Functional enrichment analysis. GO and KEGG pathway enrichment analyses were applied to the up-and downregulated genes using DAVID tools. The results are presented in Tables I  and II.
In addition, the cell cycle (hsa04110), proteasome (hsa03050) and DNA replication (hsa03030) pathways were significantly enriched in the upregulated genes, while ribosome (hsa03010) was significantly enriched in the downregulated genes.
PPI networks of the DEGs. A PPI network was constructed for the protein products of the DEGs using STRING. Interactions with a combined score of >0.4 were included in the network.
A network consisting of 213 proteins (nodes) and 2719 interactions (edges) was generated for the upregulated genes (Fig. 1). The top nodes, or hubs, characterized by a degree of >70, were aurora kinase B (AURKB), cyclin dependent kinase  member 2C (KIF2C), maternal embryonic leucine zipper kinase (MELK), topoisomerase (DNA) II α 170kDa (TOP2A) and cell division cycle associated 5 (CDCA5). Furthermore, a network comprised of 50 nodes and 80 edges was obtained for the downregulated genes. Few PPIs were observed amongst the downregulated genes, and therefore the subsequent analyses were focused on the upregulated genes.
Module analysis. Module analysis was performed using ClusterONE to predict protein complexes, and the results are presented in Fig. 2. Two modules were identified amongst the upregulated genes: Module 1 (P<0.000) and Module 2 (P=5.559x10 -4 ). Protein domain enrichment analysis was applied to the genes in the two modules using InterPro and the results are displayed in Table III. Serine/threonine protein kinase, active site (IPR008271); protein kinase, ATP binding site (IPR017441) and protein kinase, core (IPR000719) were significantly enriched amongst the genes from Module 1. No significantly enriched term was identified in the genes from Module 2. One module was generated by the downregulated genes ( Fig. 2), but no significantly enriched protein domain was identified.
Gene regulatory networks. The miRNA-target interactions were visualized by Cytoscape and thereby a gene regulatory network was established (Fig. 3).

Discussion
Through the comparative analysis of gene expression data of LSCC and adjacent non-neoplastic control tissues, a total of 461 DEGs were identified in LSCC, of which 297 were    upregulated and 164 were downregulated. Functional enrichment analysis indicated that the cell cycle, proteasome and relevant biological pathways were over-represented amongst the upregulated genes. Cell cycle progression is closely associated with multiple types of cancer (22)(23), which indicated that the analysis results were of high confidence. According to the results of previous studies, specific DEGs including CDK4, CDK1, MCM2, MCM3 and MCM4, have been implicated in LSCC (13,24). CDK4 is required for cell cycle G 1 phase progression (25). Dong et al (26) reported that CDK4 was overexpressed in LSCC and suggested that it may have a critical role in cell proliferation together with cyclin D1. MCM2 has been implicated in the initiation of eukaryotic genome replication, which has been proposed as a marker of dysplasia and malignancy (27). Chatrath et al (28) reported aberrant expression of MCM2 in LSCC, while Torres-Rendon et al (29) indicated that MCM2 may be an indicator of growth and may provide a useful prognostic tool for oral epithelial dysplasia. MCM3 and MCM4 were also identified as DEGs in the present study. These findings were consistent with the results presented in a study by Lian et al (13).  Additional potential biomarkers were discovered in the present study. There is emerging evidence that glycogen synthase kinase (GSK)3β may be a tumor suppressor in oral cancer (30). It was therefore hypothesized that GSK3β may function in a similar way in LSCC. Several subunits of the proteasome were also identified in LSCC, including proteasome subunit β type 4 (PSMB4), PSMB7, PSMB1 and PSMC3. The ubiquitin-proteasome system is a critical regulator of cell growth and apoptosis (31), and proteasome inhibitors have been developed for use in cancer therapy (32)(33). PSMB4 was identified as the first proteasomal subunit with oncogenic properties, promoting cancer cell survival and tumor growth in vivo (34). Elevated expression of PSMB4 is associated with poor prognosis in human cancer (34). PSMB7 was identified to be a prognostic biomarker in breast cancer (35). Therefore, further study of these subunits may reveal novel biomarkers for LSCC.
Network and module analyses were performed for the DEGs, and the identification of hub nodes and interactions may aid the elucidation of the underlying molecular mechanisms. AURKA and AURKB had a high degree in the PPI network for upregulated genes. Overexpression and hyperactivation of AURKA and AURKB have major roles in tumorigenesis, and therefore their inhibitors are already regarded as promising therapeutics for various types of cancer (36), including head and neck squamous-cell carcinoma (37). MCM2 and MCM4 also had a degree of >70, confirming their significant roles in the pathogenesis of LSCC.
Considering that miRNAs are closely involved in multiple types of cancer, a number of miRNAs targeting the DEGs were predicted by WebGestalt, including miR-15, miR-16, miR-25 and miR-195. Wu et al (11) reported that miR-16 was upregulated in LSCC and that it targets zyxin and promotes cell motility in human laryngeal carcinoma cell line HEp-2. Other miRNAs have been reported to have roles in various types of cancer. miR-15a forms a cluster with miR-16 at the chromosomal region 13q14, and functions as a putative tumor suppressor by targeting the oncogene BCL2 (38,39). miR-25 regulates apoptosis by targeting Bim in human ovarian cancer (40) and miR-195 is regarded as a predictor of poor prognosis in adrenocortical cancer (41). Future studies of these miRNAs may better describe the regulatory mechanisms underlying LSCC. In conclusion, a number of key genes in LSCC were identified, which may represent novel biomarkers for diagnosis, prognosis and therapy. In addition, relevant miRNAs were also explored, and these may offer therapeutic targets for the modulation of abnormal gene expression in LSCC.