Identification of prognostic microRNA candidates for head and neck squamous cell carcinoma

The aim of the present study was to uncover potential prognostic microRNA (miRNA) markers in head and neck squamous cell carcinoma (HNSCC). miRNA expression profile and clinical data were obtained from The Cancer Genome Atlas (TCGA) database. Survival analysis was conducted by Kaplan-Meier method. Then, candidate prognostic miRNAs were screened via Cox proportional hazards regression analysis. Furthermore, target genes of miRNAs were predicted, and their interactions and functions were then analyzed. A total of 15 miRNAs were discovered to be significantly related to overall survival. Among them, miR-1251, miR-618 and miR-328 with p<0.05 were potentially significant prognostic factors of HNSCC. In the protein-protein interaction (PPI) network, the target gene of miR-328, MAX, interacted with multiple genes. The target gene of miR-618, E2F1, interacted with genes such as MAX, SMAD3 and IGF1. Furthermore, the target genes of miR-618 (e.g. E2F1 and SMAD3), and the target genes of miR-328 (e.g. MAX) were significantly enriched in the functions of transcriptional regulation. The target gene of miR-1251, IGF1, was associated with functions such as signal transduction. The above miRNAs (miR-1251, miR-618 and miR-328) may be potential prognostic markers in HNSCC.


Introduction
Head and neck squamous cell carcinoma (HNSCC) accounts for the major proportion of head and neck cancers and is the sixth most common non-skin cancer worldwide (1). The high mortality rate of HNSCC results from poor prognosis and lack of reliable prognostic signatures. Thus, it is urgent to reveal more reliable prognostic markers to achieve better outcomes for HNSCC patients.
MicroRNAs (miRNAs) are small non-protein-coding RNAs of ~22 nucleotides, and they play roles in various biological functions, such as cell growth, differentiation and the development of disease (2). Dysregulation of miRNAs is also closely related to the development, diagnosis and prognosis of many types of cancers including HNSCC (3). For instance, miR-218 inhibits cell migration and invasion by suppressing the expression of laminin-332 in HNSCC (4). By suppressing invadopodia activity, miR-375 impairs the tumor invasion of HNSCC and is implicated in the poor outcome of HNSCC (5,6). Moreover, poorer overall survival of HNSCC has been found to be linked with the promoter methylation of miR-137 (7). In addition, overexpression of miR-205 is able to mediate HNSCC growth and E2F1 signaling (8). Although more and more studies have focused on uncovering miRNAs that are involved in HNSCC, few miRNAs have been identified as potential prognostic markers for HNSCC.
In the present study, to identify potential prognostic miRNA markers for HNSCC, miRNA expression profile and clinical data of 397 HNSCC cases were obtained from The Cancer Genome Atlas (TCGA) database. Survival and Cox regression analyses were performed to identify the miRNAs that were potentially associated with the survival of HNSCC patients. Furthermore, target genes of those miRNAs were predicted and the functions of the target genes were analyzed. The results may provide new information for the further study of HNSCC, and the identified miRNAs may be used as prognostic markers for HNSCC, which may be useful to improve future clinical outcome of HNSCC.

Materials and methods
Data acquisition. The miRNA expression profiling and clinical data of HNSCC were obtained from the TCGA database (http://cancergenome.nih.gov/, the deadline of data downloading: July 20, 2014). The miRNA expression profiling data included a total of 513 cases, and they were produced by two platforms: BCGSC_IlluminaGA_miRNASeq (37 cases) and BCGSC_IlluminaHiSeq_miRNASeq (476 cases). The clinical data included 418 HNSCC cases. After data filtering, only cases with both miRNA expression profile and clinical survival data (397 cases) were retained for further analysis.
Survival analysis. The average expression value of each miRNA in the 397 cases was calculated, and the mean value was set as the critical value. Then, the cases were divided into two groups according to the miRNA expression values higher or lower than the critical value. Furthermore, survival estimates were calculated by the Kaplan-Meier (KM) method and the log-rank test based on a survival package in R (9). Only the miRNAs with p<0.05 were considered significant. The miRNAs that had a significant association with survival time were selected for subsequent analysis.
Cox regression analysis. The miRNAs having a significant impact on survival time and clinical survival data received Cox proportional hazards regression analysis (10) based on KMsurv package in R (11). Patients with a high risk score were considered to have poor survival. The miRNAs with p<0.05 were chosen for further analysis.
Construction of protein-protein interaction (PPI) network for target genes. PPIs between target genes of the miRNAs were analyzed via the online database Search Tool for the Retrieval of Interacting Genes (STRING, http://string-db.org/) (19).
Then, target gene pairs with a combined score >0.4 were chosen to construct PPI networks, which were visualized using the Cytoscape software (http://www.cytoscape.org/) (20). In the network, a node represents a protein (gene), and lines represent the interactions of the proteins. The ̔degree̓ of each node is equal to the number of nodes that interact with this node. The larger the degree is, the closer the connections with other nodes are.
Functional analyses of target genes. The Database for Annotation, Visualization and Integrated Discovery (DAVID; https://david.ncifcrf.gov/) was applied to investigate the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways of the target genes (21). p<0.05 was chosen as the cut-off criterion.
To further investigate the function of the target genes, the GenCLip tool (22) was utilized to perform functional clustering for genes in the PPI network based on literature mining, and the cut-off criteria were p<1E-04 and hit ≥9. Additionally, pathway enrichment analysis was performed for the genes in the PPI network, based on the Gene Set Enrichment Analysis (GSEA) database (http://software.broadinstitute.org/ gsea/index.jsp). p<0.05 was set as the cut-off criterion.

Results
Survival analysis. In total, expression values of 1,046 miRNAs were obtained from the TCGA database. Among the miRNAs, a total of 15 miRNAs were discovered to be significantly  (Table I).
Analysis of PPI network of target genes. A total of 245 genes were discovered to be regulated by miR-1251, miR-618 and miR-328. miR-328 modulated most of the genes, such as MAX.
To further analyze the interactions of the target genes, an interaction network of the target gene was further constructed based on the STRING database. The PPI network was composed of 97 nodes and 112 lines (Fig. 2). Several genes had strong interactions with the other genes. For example, MAX strongly interacted with 8 genes (e.g. E2F1); E2F1 interacted with MAX, SMAD3 and E2F1.
Functional analysis of target genes. To investigate the functions of the three miRNAs that were found to be potentially associated with risk factors of HNSCC, GO functional and KEGG pathway enrichment analyses of the target genes were performed. The target genes of hsa-mir-328 were mainly enriched in the GO terms of positive regulation of transcription (e.g. IL5 and FOXA2) and transcription activator activity (e.g. FOXA2 and MAX), as well as the pathways in cancer (e.g. MAX and SLC2A1) (Table II). Meanwhile, the target genes of hsa-mir-618 were mainly enriched in the GO terms of the negative regulation of transcription from the RNA polymerase II promoter (e.g. E2F1 and ID2) and transcription repressor activity (e.g. E2F1 and ID2), as well as the TGF-β signaling pathway (e.g. ID2 and SMAD3) (Table III). There were no significant GO and pathway terms enriched by the target genes of hsa-mir-1251.
Furthermore, to further reveal the functions of 97 genes in the PPI interaction network, functional clustering and pathway enrichment analysis were performed. According to the functional clustering, 19 clusters were enriched. A set of genes (e.g.  E2F1, NR3C1, IGF1, SMAD3 and BDNF) were significantly enriched in multiple functional clusters, such as cell death and signal transduction (Fig. 3). Additionally, 26 genes in the network were markedly enriched in 9 pathway clusters, such as pathways in cancer (e.g. E2F1, MAX and SMAD3), cell cycle (e.g. E2F1 and SMAD3) and the Wnt signaling pathway (e.g. SMAD3) (Table IV).

Discussion
HNSCC is a common type of non-skin cancer worldwide (1).
In the present study, using the miRNA expression profile and clinical data of 397 cases in the TCGA database, miR-1251, miR-618 and miR-328 were screened out to be significantly associated with risk factors for HNSCC, based on the survival analysis and Cox regression analysis. A set of target genes of these three miRNAs were significantly enriched in a series of GO functions and pathways. For example, targets of miR-618, E2F1 and SMAD3, were significantly enriched in functions concerning regulation of transcription, and SMAD3 was also markedly enriched in the TGF-β signaling pathway. These two genes interacted with each other in the PPI network. E2F1 encodes an E2F transcription factor, which mediates cell cycle and functions as a tumor-suppressor protein (23). Overexpression of E2F1 in vitro and in vivo suppresses HNSCC cell growth through induction of apoptosis (24). Additionally, single nucleotide polymorphisms (SNPs) of E2F1 are highly correlated with the risk of HNSCC (25). Moreover, as a transcriptional modulator and signal transducer, SMAD3 mediates multiple signaling pathways and plays a role in the regulation of carcinogenesis (26). In the present study, SMAD3 was predicted to interact with E2F1, which has been reported in a previous study (27). In HNSCC, SMAD3 is inactivated, which is associated with higher overall survival of patients (28). In the present study, SMAD3 was markedly enriched in the TGF-β signaling pathway. The association of SMAD3 and TGF-β signaling has been reported in numerous studies (29)(30)(31). A previous study reported that overexpression of death-associated protein kinase-related apoptosis-inducing kinase 1 (DRAK1) binds to Smad3, and then inhibits TGF-β1 tumor suppressor activity in HNSCC (32). Additionally, SNPs in miR-618 are implicated in cancer risk (33). miR-618 has been discovered to be progressively expressed in esophageal adenocarcinoma (34). There is no evidence that miR-618 is associated with HNSCC risk or survival rate. Taken together, miR-618 may play a crucial role in the prognosis of HNSCC by regulating the genes E2F1 and SMAD3.
Furthermore, in the PPI network, the target gene of miR-328, MAX, interacted with multiple genes, such as E2F1. MAX was significantly enriched in the function of transcription activator activity and the pathways in cancer. MYC associated factor X (MAX) encodes a transcription factor belonging to the helix-loop-helix leucine zipper (bHLHZ) family (35). It is able to form a homodimer and heterodimer with MYC, which is an oncoprotein correlated with cell proliferation, differentiation and apoptosis (35). There is no evidence to prove the association of MAX with HNSCC. However, overexpression of c-myc has been reported to be associated with the poor prognosis of HNSCC (36). Furthermore, c-myc promotes elevated levels of glutaminolysis and glutamic acid, which marker aggressive     tumorigenesis in HNSCC (37). Additionally, miR-328 has been confirmed to be highly expressed in salinomycin-treated HNSCC cells (38). Additionally, the expression level of miR-328 was found to be significantly higher in early-stage non-small cell lung cancer (NSCLC), and shows good diagnostic accuracy in NSCLC patients (39). These results indicate an important role of miR-328 in cancer. Therefore, miR-328 may play a pivotal role in the progression of HNSCC via targeting MAX, and it may be associated with the survival time of HNSCC patients. In addition, the target gene of miR-1251, IGF1, also interacted with E2F1 in the PPI network, and it was enriched in multiple function clusters, such as cell death and signal transduction. Insulin-like growth factor I (IGF1) is involved in regulating cell growth and development (40), and the activation of multiple signaling pathways in head and neck cancer (41). Previous studies have demonstrated that IGF1 is able to stimulate the expression of vascular endothelial growth factor, which is correlated with increased progression and poor prognosis of HNSCC (42,43). Moreover, miR-1251 has been discovered to be downregulated in nasopharyngeal carcinoma (44). There is no study to report the association of miR-1251 with HNSCC. Therefore, miR-1251 may play a crucial role in the progression of HNSCC via mediating the expression of IGF1, and it may be closely associated with the survival of HNSCC patients.
Despite the aforementioned results, there were several limitations in the present study. The predicted results should be confirmed by laboratory data. In our further studies, the expression levels of miR-1251, miR-618 and miR-328, as well as the regulatory relationships of them and their target genes, will be validated in a large scale of patients with HNSCC.
In conclusion, miR-1251, miR-618 and miR-328 were predicted to be associated with the survival time of HNSCC patients. They may play pivotal roles in the progression of HNSCC via a set of genes associated with transcriptional regulation or signal transduction, such as MAX, E2F1, SMAD3 and IGF1. The three miRNAs were considered as novel prognostic markers in HNSCC. These findings contribute to our further experimental study, and may provide new information for the estimation of the prognosis of HNSCC.