Text mining-based drug discovery in cutaneous squamous cell carcinoma

YUYAN PAN, YONG ZHANG and JIAQI LIU
Department of Plastic and Reconstructive Surgery, Zhongshan Hospital, Fudan University, Shanghai 200032, P.R. China
Received May 13, 2018; Accepted September 21, 2018
DOI: 10.3892/or.2018.6746
Correspondence to: Dr Yong Zhang or Dr Jiaqi Liu, Department of Plastic and Reconstructive Surgery, Zhongshan Hospital, Fudan University, Shanghai 200032, P.R. China. E-mail: zhang.yong1@zs-hospital.sh.cn; liujiaqi1213@yahoo.com

Cutaneous squamous cell carcinoma (cSCC) is one of the most common skin cancers. However, the efficacy and utility of the available drug therapies are limited. The objective of the present study was to determine the genes and molecular pathways associated with cSCC by using computational tools and publicly available data, and to explore drugs targeting the relevant molecular pathways for cSCC treatment. In this study, we used text mining and GeneCodis to mine genes that were highly related to cSCC. Protein-protein interaction (PPI) analysis was performed by using STRING and Cytoscape. By using the data analytical tool cBioPortal, we analyzed the characteristics of candidate genes for the purpose of drug selection. Based on the drug-gene interaction analysis of the final genes, candidate drugs were then derived. Our analysis identified 121 genes related to cSCC from the text mining searches. Gene enrichment analysis yielded 11 genes representing 10 pathways, targetable by a total of 54 drugs as possible drug treatments for cSCC. The final list included 15 chemotherapy agents, 21 tyrosine kinase inhibitors (TKIs), 7 PI3K/AKT/mTOR inhibitors, 2 MAPK inhibitors, 2 cyclin-dependent kinase (CDK) inhibitors, 1 histone deacetylase (HDAC) inhibitor, 3 nonsteroidal anti-inflammatory drugs (NSAIDs) and 3 other drugs, which directly affect the most enriched pathways. In conclusion, drug discovery using in silico text mining and pathway analysis tools may be a method of exploring candidate drugs which target the genes/pathways relevant to cSCC, to identify potential treatments.


Introduction
Cutaneous squamous cell carcinoma (cSCC) is the second most common non-melanoma skin cancer (NMSC) and accounts for 20-50% of skin cancers (1,2). An overall 263% increase in the incidence of cSCC between the periods 1976-1984 and 2000-2010 was reported by the Rochester Epidemiology Project, conducted by the Mayo Clinic (3). Metastatic cSCC is lethal, with a mortality rate of 70% as demonstrated by several large studies (4). Among all of the risk factors, UV radiation is the most detrimental. Tumors most commonly arise on sun-exposed skin, especially the head, neck, and backs of the arms and hands, and fair-skinned individuals are at the highest risk (4). UV radiation acts as a carcinogen both directly, by inducing cell damage, such as DNA mutations, and indirectly, by inducing immunosuppression (5). Therefore, the development of cSCC is closely associated with genomic perturbations, genetic mutations and the altered expression of key molecules, through a highly complex pathological process involving many dynamic and interacting biological processes and molecular pathways.
In recent years, with the rising incidence of cSCC and increasing expectations of patients, the selection of an appropriate treatment to maximize the preservation of function and minimize the recurrence and metastasis rate remains a challenge. In the clinic, surgical excision is considered the gold standard in the treatment of cSCC. Non-surgical approaches, including photodynamic therapy (PDT), radiation therapy, cryotherapy and chemotherapy, are widely used in late-stage cSCC (6,7). However, research into drug therapy remains limited, and further studies are needed to provide a rationale for the development of novel therapeutic modalities.
The efficacy of the available drug therapies is limited, and discovering new drug therapies using traditional methods is likely to take a long time. Drug repositioning, which looks for other conditions that existing drugs could treat effectively, may speed up this process and may be less expensive (8). This study aimed to investigate new drug therapies for cSCC by using computational methods, including text mining, biological process and pathway analysis, and protein-protein interaction (PPI) analysis to mine public databases, and by using bioinformatics tools to systematically identify interaction networks between drugs and gene targets (9,10). By using data analytical tools, we were able to analyze the characteristics of candidate genes for the purpose of drug selection. Based on the drug-gene interaction analysis of the final genes, candidate drugs were then derived.

Materials and methods
Text mining. Text mining makes it possible to collect disease-gene associations automatically from large volumes of biological literature. We used pubmed2ensembl (http://pubmed2ensembl.ls.manchester.ac.uk/) to perform text mining. Pubmed2ensembl has been developed as an extension to the BioMart system that links over 2,000,000 articles in PubMed to nearly 150,000 genes in Ensembl from 50 species (11). Pubmed2ensembl provides links between the literature and genes for data exploration, which means that when queries are performed, it extracts all of the genes related to the search concepts from the biological literature available. In this study, we performed the query with the concept 'cutaneous squamous cell carcinoma' (cSCC). We chose 'Homo sapiens' as the species dataset, then selected 'Ensembl Gene ID' and 'Associated Gene Name' under GENE and deselected 'MEDLINE: PubMed ID' under PUBMED2ENSEMBL FEATURES. After entering 'cutaneous squamous cell carcinoma' in the search box, we selected the 'search for PubMed IDs' and 'filter on Entrez: PMID' drop-down menus. The query then returned all the gene hits, which were used in the next step.
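A query of this kind can return the same gene more than once, for example when several articles mention it. As a minimal sketch, assuming the hits have been exported as (Ensembl gene ID, gene name) pairs, duplicates can be collapsed as shown here. The function name and the identifiers are illustrative placeholders, not real query output or part of pubmed2ensembl.

```python
def unique_genes(hits):
    """Collapse repeated (gene_id, gene_name) hits, keeping first-seen order."""
    seen = {}
    for gene_id, gene_name in hits:
        # setdefault keeps the first name recorded for each gene ID
        seen.setdefault(gene_id, gene_name)
    return list(seen.items())


# Placeholder hits (hypothetical IDs); a real export would hold Ensembl gene IDs.
hits = [
    ("ID_A", "TP53"),
    ("ID_B", "EGFR"),
    ("ID_A", "TP53"),  # same gene reported by a second article
    ("ID_C", "HRAS"),
]
genes = unique_genes(hits)
print(len(hits), "hits ->", len(genes), "unique genes")
```

Keying on the gene ID rather than the gene name avoids merging distinct genes that happen to share an alias.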
Biological process and pathway analysis. GeneCodis (http://genecodis.cnb.csic.es/) is a web-based tool that integrates various sources of information to search for annotations that frequently co-occur in a group of genes and rank them by statistical significance, which is essential in the biological interpretation of high-throughput experiments (12). We used GeneCodis for an enrichment analysis of the genes related to cSCC. We put the genes from the text mining step into the input set and analyzed this set of genes using the GO biological process categories. The most significantly enriched biological processes were selected. Genes with the selected annotations were used for the next step, which was an additional GeneCodis analysis performed with annotations of the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. Among the highly enriched pathways that passed the P-value cutoff, those most relevant to cSCC pathology were selected. Genes belonging to the selected pathways were used for further analysis.
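To illustrate how an enrichment P-value and a cutoff interact, the sketch below computes a one-sided hypergeometric tail probability, a test commonly used for over-representation analysis, and then keeps annotations at or below a cutoff. This is an illustration under stated assumptions, not a description of the internal GeneCodis implementation; the function names and the example gene counts are hypothetical, and the 'hypothetical term' entry is invented to show a rejected annotation.

```python
from math import comb


def enrichment_p(population, annotated, sample, hits):
    """P(X >= hits) when drawing `sample` genes from `population`,
    of which `annotated` carry the annotation (hypergeometric tail)."""
    upper = min(annotated, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / comb(population, sample)


def passing(p_values, cutoff):
    """Keep annotations whose P-value is at or below the cutoff."""
    return {term: p for term, p in p_values.items() if p <= cutoff}


# Toy numbers (assumed): 20,000 background genes, 300 annotated, 84 queried, 15 hits.
p_toy = enrichment_p(20000, 300, 84, 15)

# Three P-values reported in the text plus one invented non-significant term.
reported = {
    "Pathways in cancer": 3.24886e-35,
    "MAPK signaling pathway": 1.71374e-15,
    "ErbB signaling pathway": 2.49437e-15,
    "hypothetical term": 1.0e-2,
}
selected = passing(reported, cutoff=1.00e-05)
print(sorted(selected, key=selected.get))
```

A smaller P-value means stronger enrichment, so an annotation "passes" the cutoff when its P-value is lower than the threshold.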
Protein interaction network. The STRING database (http://string-db.org) integrates the protein-protein interactions of selected genes (13). On the first page of the STRING database, we selected 'Multiple proteins' from the left menu bar, entered the genes selected in the previous step, and selected 'Homo sapiens' as the organism. The confidence score reflects the strength of the evidence that two proteins interact: the stronger the evidence, the higher the score. A lower confidence threshold reduces the reliability of the network but broadens the inclusion criteria; as a compromise, the confidence score was set to medium (0.400) in this study. The protein-protein interaction network of the target genes was then obtained.
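The effect of the confidence threshold can be sketched as a simple filter over a tab-separated edge list. The column names (`node1`, `node2`, `combined_score`) and the scores below are assumptions made for illustration; an actual STRING export may use different headers and will contain different values.

```python
import csv
import io


def filter_edges(tsv_text, min_score=0.400):
    """Return (node1, node2, score) tuples whose score meets the threshold."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    edges = []
    for row in reader:
        score = float(row["combined_score"])
        if score >= min_score:
            edges.append((row["node1"], row["node2"], score))
    return edges


# Toy edge list with invented scores.
toy_tsv = (
    "node1\tnode2\tcombined_score\n"
    "TP53\tMDM2\t0.999\n"
    "EGFR\tERBB2\t0.950\n"
    "MYC\tSRC\t0.410\n"
    "HRAS\tSTAT3\t0.250\n"
)
medium = filter_edges(toy_tsv, min_score=0.400)
strict = filter_edges(toy_tsv, min_score=0.900)
print(len(medium), "edges at medium confidence;", len(strict), "at 0.900")
```

Raising the threshold removes weakly supported edges, which is the trade-off between network reliability and inclusiveness described above.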
Next, we used the Cytoscape software platform to visualize and analyze the interaction network. Cytoscape is a software tool for the visual exploration of biomedical networks composed of proteins, genes and other types of interactions, supported by diverse annotations and experimental data (14). We imported data in '.tsv' format from the STRING EXPORT channel. CentiScaPe, an app which calculates a large number of network parameters, was then used to analyze the topological characteristics of each node and to select the key nodes. We chose 'Degree' and 'Betweenness' as the parameters for selecting key genes. The higher the Degree of a node, the greater the number of gene products that interact with it, which suggests a greater role for that node in cSCC. The Betweenness value of a node indicates how often it lies on the shortest paths connecting other nodes. In this study, key nodes were defined as those for which Degree and Betweenness were both greater than or equal to the respective network mean. The resulting candidate genes were carried forward to the final analysis in the next step.
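The selection rule can be sketched in plain Python. The code below computes node degree and shortest-path betweenness (using Brandes' algorithm for unweighted graphs, with each unordered node pair counted once) and keeps nodes that are at or above the mean on both measures. It is an illustrative sketch rather than the CentiScaPe implementation, whose exact normalization may differ; the function names and the toy edges are hypothetical.

```python
from collections import deque


def degree_and_betweenness(edges):
    """Degree and unnormalized betweenness for an undirected, unweighted graph."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    degree = {n: len(nb) for n, nb in adj.items()}
    betweenness = dict.fromkeys(adj, 0.0)
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        order, preds = [], {n: [] for n in adj}
        sigma = dict.fromkeys(adj, 0)
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate pair dependencies in reverse BFS order.
        delta = dict.fromkeys(adj, 0.0)
        while order:
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                betweenness[w] += delta[w]
    # Each unordered pair was visited from both endpoints, so halve the totals.
    return degree, {n: b / 2 for n, b in betweenness.items()}


def key_nodes(edges):
    """Nodes with Degree and Betweenness both >= the network mean."""
    degree, betweenness = degree_and_betweenness(edges)
    mean_deg = sum(degree.values()) / len(degree)
    mean_btw = sum(betweenness.values()) / len(betweenness)
    return sorted(
        n for n in degree
        if degree[n] >= mean_deg and betweenness[n] >= mean_btw
    )


# Toy star-shaped network (invented edges): TP53 is the only hub.
toy_edges = [("TP53", "MDM2"), ("TP53", "EGFR"), ("TP53", "MYC")]
print(key_nodes(toy_edges))
```

Requiring both measures to reach the mean favors nodes that are well connected and also act as bridges between other nodes.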
Gene data analysis. The cBioPortal for cancer genomics (http://cbioportal.org) is a web-based tool for exploring, visualizing and analyzing multidimensional cancer genomics data. Datasets in the portal are from 10 published cancer studies and, for each tumor sample, data may be available from multiple genomic analysis platforms (15). We selected 'Cutaneous squamous cell carcinoma' as the filter, and then performed the query with the final list of genes from the previous step as the input set. Results are presented in separate tabs, including OncoPrint, Mutations, Survival and Network, which allowed us to explore genomic alterations and provided a graphical summary of the mutations identified in each query gene, survival analysis, patient-centric queries, and network visualization and analysis.
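The headline figure of an OncoPrint-style summary, the share of cases with an alteration in at least one query gene, can be sketched as follows. The sample identifiers and alteration calls are invented for illustration and do not come from cBioPortal; the function name is hypothetical.

```python
def altered_fraction(alterations, query_genes):
    """Fraction of cases with an alteration in at least one query gene.

    `alterations` maps a case ID to the set of genes altered in that case.
    """
    query = set(query_genes)
    altered = [case for case, genes in alterations.items() if genes & query]
    return len(altered) / len(alterations)


# Invented alteration calls for four hypothetical cases.
toy_cases = {
    "case_1": {"TP53", "CDKN2A"},
    "case_2": {"TP53"},
    "case_3": {"NOTCH1"},  # altered, but not in a query gene
    "case_4": set(),
}
fraction = altered_fraction(toy_cases, ["TP53", "CDKN2A", "HRAS"])
print(f"{fraction:.0%} of cases altered in at least one query gene")
```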
Drug-gene interactions. We used DGIdb (http://www.dgidb.org) to explore drug-gene interactions in the final list of genes, which were used as the potential targets in a search for existing drugs (16). At least one drug was found to correspond to each gene. These candidate drugs targeting the genes/pathways relevant to cSCC may represent potential treatments.
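Conceptually, this step inverts a gene-to-drug mapping into a list of unique candidate drugs, each annotated with the genes it targets. The sketch below uses a small hand-written table containing only drug-target pairs mentioned in this article; it does not call DGIdb, and the table and function names are hypothetical.

```python
def candidate_drugs(genes, interactions):
    """Map each drug to the set of query genes it interacts with."""
    drugs = {}
    for gene in genes:
        for drug in interactions.get(gene, []):
            drugs.setdefault(drug, set()).add(gene)
    return drugs


# Hand-written subset of interactions (illustrative, not a DGIdb export).
toy_interactions = {
    "EGFR": ["gefitinib", "erlotinib", "cetuximab", "panitumumab", "lapatinib"],
    "ERBB2": ["trastuzumab", "pertuzumab", "lapatinib"],
}
drugs = candidate_drugs(["EGFR", "ERBB2"], toy_interactions)
for name in sorted(drugs):
    print(name, "->", ", ".join(sorted(drugs[name])))
```

Collecting targets per drug makes multi-target agents, such as lapatinib in this toy table, appear once in the final list rather than once per gene.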

Results
Results of text mining, biological process and pathway analysis.
In the process of exploring potential drugs for cSCC (Fig. 1), 178 genes were found to be related to cSCC from the text mining searches. After deleting the duplicates, 121 genes were left (Fig. 2). GeneCodis gene enrichment analysis was first performed using biological process analysis to find the most enriched terms related to cSCC pathology. During this process, to make sure that only the most enriched annotations were selected, a P-value cutoff (P=1.00E-06) was set. Among the most significantly enriched biological processes that passed the cutoff, those most relevant to cSCC pathology based on the available literature and research were selected. The analysis of enriched biological process annotations resulted in 21 sets of annotations containing 84 genes (Table I). The five most enriched biological process annotations were: i) 'negative regulation of apoptotic process' (P=3.94548E-14); ii) 'positive regulation of cell proliferation' (P=1.70535E-12); iii) 'anti-apoptosis' (P=5.19303E-10); iv) 'positive regulation of transcription from RNA polymerase II promoter' (P=1.04587E-09); and v) 'negative regulation of transcription from RNA polymerase II promoter' (P=1.06495E-09), containing 17, 17, 12, 17 and 15 genes from the query set, respectively. Other highly enriched biological process annotations included 'response to cytokine stimulus', 'positive regulation of ERK1 and ERK2 cascade', 'positive regulation of MAPK cascade' and 'apoptotic process'.
In the KEGG pathway enrichment analysis, the P-value cutoff was set at 1.00E-05. Among the most significantly enriched pathway annotations that passed the cutoff, those most relevant to cSCC pathology based on the available literature and research were selected. The analysis of enriched pathway annotations resulted in 10 pathways containing a total of 39 unique genes (Table II). The three most significantly enriched pathways were: i) 'pathways in cancer' (P=3.24886E-35); ii) 'MAPK signaling pathway' (P=1.71374E-15); and iii) 'ErbB signaling pathway' (P=2.49437E-15), containing 29, 15 and 11 genes from the query set, respectively. Other highly enriched pathways included 'focal adhesion', 'p53 signaling pathway', 'apoptosis', 'cell cycle', 'endocytosis', 'VEGF signaling pathway' and 'TGF-β signaling pathway'.
Results of protein-protein interaction. The protein-protein interaction network of the 39 target genes was illustrated using STRING (Fig. 3). We imported data in '.tsv' format from the STRING EXPORT channel to Cytoscape (Fig. 4). Then, we used CentiScaPe to analyze the topological characteristics of each node and select the key nodes. Node Degree represents the total number of edges incident to the node. In this study, the maximum and minimum Degree values were 34.00 and 3.00, respectively, with an average value of 18.62. Betweenness centrality for each node refers to the number of shortest paths that pass through the node. The maximum value was 99.60, with a minimum of 0.00 and an average of 20.10. Key nodes were defined as those for which Degree and Betweenness were both greater than or equal to the mean. Eleven genes were selected based on these criteria, forming a tight interaction network, as determined by Cytoscape.
Figure 1. Overall data mining procedure. Text mining was used to identify genes associated with cutaneous squamous cell carcinoma (cSCC) by using pubmed2ensembl. Extracted genes were then analyzed for their function by using GeneCodis. Further enrichment was obtained by protein interaction analysis using STRING and Cytoscape. Gene data analysis was performed using cBioPortal. The final drug list was obtained by gene-drug interaction analysis using the Drug Gene Interaction Database.
Figure 2. Summary of data mining results. (A) Text mining: Text mining was performed by using the search term 'cutaneous squamous cell carcinoma' (cSCC) and 178 genes were found using pubmed2ensembl. A total of 121 genes remained after deleting the duplicates. (B) Gene set enrichment: GeneCodis gene enrichment analysis was performed using biological processes and pathway analysis to enrich 84 and 39 genes, respectively. Next, using STRING and Cytoscape, 11 significant genes were selected for the final analysis. (C) Gene data analysis: Gene data analysis was performed using cBioPortal. (D) Drug-gene interactions: The final 11 genes were inputted to DGIdb and 54 drugs were selected as potential drug treatments for cSCC.
Table I note: To make sure that only the most enriched annotations were selected, a P-value cutoff (P=1.00E-06) was set. Among the most significantly enriched biological processes that passed the cutoff, those most relevant to cSCC pathology based on the available literature and research were selected. The analysis of enriched biological process annotations resulted in 21 sets of annotations containing 84 genes. Genes with selected annotations were used for the next step.
Table II note: To make sure that only the most enriched annotations were selected, a P-value cutoff (P=1.00E-05) was set. Among the most significantly enriched pathway annotations that passed the cutoff, those most relevant to cSCC pathology based on the available literature and research were selected. The analysis of enriched pathway annotations resulted in 10 pathways containing a total of 39 unique genes when combined. Genes with selected pathways were used for the next step.
Results of gene data analysis. The 11 genes were TP53, MDM2, CCND1, CDKN2A, HRAS, EGFR, MYC, ERBB2, AKT1, STAT3 and SRC. The results of the cBioPortal analysis are shown in each tab (Fig. 5). In the OncoPrint, 26 cases (90%) had an alteration in at least one of the 11 genes, with the frequency of alteration in each of the 11 selected genes presented in Fig. 5A. For TP53, most of the alterations were missense mutations and truncating mutations. The alterations in MDM2 were truncating mutations with unknown significance. Events associated with CDKN2A included an in-frame mutation, and several missense and truncating mutations. The alterations in CCND1 and HRAS were missense mutations. For EGFR, MYC, AKT1 and SRC, the alterations were missense mutations with unknown significance. The alterations in ERBB2 included missense mutations and truncating mutations with unknown significance. No alterations occurred in STAT3 in these cases. To further identify the details of all mutations in each query gene, the Mutations tab was used. For example, the graphical summary of the mutations associated with TP53 showed that there were 35 mutations, including 1 duplicate mutation, in patients with multiple samples. In the Survival tab, results are displayed as Kaplan-Meier plots with P-values (Fig. 5B). The overall survival of cSCC patients with or without alterations in the query genes is presented with a log-rank test P-value of 0.139. The Network tab provides interactive analysis and visualization of networks consisting of genes, pathways, drugs and interactions (Fig. 5C). The network contained 61 nodes, including 11 query genes and the 50 most frequently altered neighboring genes. By selecting 'show all drugs', a network containing gene-centric drug-target information was generated. For example, gefitinib and erlotinib, which are epidermal growth factor receptor (EGFR) tyrosine kinase inhibitors (EGFR-TKIs) that target the intracellular domain of EGFR, and cetuximab and panitumumab, which are antibodies that block the extracellular domain of EGFR, were shown with edges connecting them to their targets.
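For readers unfamiliar with the log-rank test behind the Survival tab P-value, the following is a minimal two-group sketch. It compares observed and expected event counts at each event time and converts the resulting chi-square statistic (1 degree of freedom) to a P-value. The follow-up times are invented, the function name is hypothetical, and this is not the cBioPortal implementation.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; events are 1 (event) or 0 (censored).

    Returns (chi-square statistic, P-value with 1 degree of freedom).
    """
    group_a = list(zip(times_a, events_a))
    group_b = list(zip(times_b, events_b))
    event_times = sorted({t for t, e in group_a + group_b if e})
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n_a = sum(1 for x, _ in group_a if x >= t)  # at risk in group A
        n_b = sum(1 for x, _ in group_b if x >= t)  # at risk in group B
        d_a = sum(1 for x, e in group_a if e and x == t)
        d_b = sum(1 for x, e in group_b if e and x == t)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    stat = (observed_a - expected_a) ** 2 / variance
    # Survival function of chi-square with 1 df: erfc(sqrt(x / 2)).
    return stat, math.erfc(math.sqrt(stat / 2))


# Invented follow-up times (months); every subject has an event.
early = [1, 2, 3, 4, 5]
late = [10, 11, 12, 13, 14]
stat, p_value = logrank_test(early, [1] * 5, late, [1] * 5)
print(f"chi-square = {stat:.2f}, P = {p_value:.4f}")
```

A P-value above the conventional 0.05 threshold, such as the 0.139 reported here, means the survival difference between the altered and unaltered groups is not statistically significant.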

Results of drug-gene interactions.
Using the final list of 11 genes as the potential targets in the drug-gene interaction analysis, a list of 54 drugs was selected as possible drug treatments for cSCC (Table III). Most of the drugs are used for cancer treatment, which is in accordance with the fact that the most enriched pathway was 'pathways in cancer'. These drugs can be divided into several major categories, including chemotherapy agents, tyrosine kinase inhibitors (TKIs), HER2 inhibitors, EGFR inhibitors, AKT/PI3K inhibitors, mTOR inhibitors, mitogen-activated protein kinase (MAPK) inhibitors, cyclin-dependent kinase (CDK) inhibitors, histone deacetylase (HDAC) inhibitors, nonsteroidal anti-inflammatory drugs (NSAIDs) and others. Fifteen of the 54 drugs were chemotherapy agents. Among these, platins and 5-fluorouracil (5-FU) are used as palliative treatments alone or in combination with radiotherapy (17). Other drugs which were previously studied, including capecitabine, carboplatin, doxorubicin, gemcitabine and methotrexate, are used for either advanced or metastatic disease, alone or in combination (18). Certain chemotherapy agents, such as docetaxel and vinorelbine, work by disrupting the normal function of microtubules and thereby stopping cell division (19). Irinotecan serves as a topoisomerase inhibitor by blocking topoisomerase I, which results in DNA damage and cell death (20). Azacitidine, a chemical analog of the nucleoside cytidine, is used in the treatment of myelodysplastic syndrome (21), whereas most of the other drugs had previously been approved for use in different types of cancer, including non-small cell lung cancer (docetaxel), colon and small cell lung cancer (irinotecan), colorectal cancer (oxaliplatin), and astrocytoma and glioblastoma multiforme (temozolomide) (19-21).
Figure 3. The protein-protein interaction network (medium confidence, score 0.400) of the 39 targeted genes, produced using STRING. Network nodes represent proteins and different colored edges represent protein-protein associations.
Figure 4. The protein-protein interaction network of the 39 targeted genes, produced using Cytoscape. Network nodes represent proteins and edges represent protein-protein associations.
Among these candidate drugs, 21 drugs targeted receptor tyrosine kinases (RTKs) and interacted in signaling pathways responsible for cell proliferation, survival, angiogenesis, invasion and metastasis. These drugs included EGFR-TKIs, HER2 receptor inhibitors, a Bruton's TKI (BTKI), Bcr-Abl TKIs, anaplastic lymphoma kinase (ALK) inhibitors, and multi-target TKIs. Gefitinib and erlotinib are EGFR-TKIs, which block EGFR signaling selectively through competitive reversible binding at the intracellular EGFR-TK domain. Cetuximab, a recombinant chimeric monoclonal antibody that competitively inhibits EGFR, was reported to be used as a first-line therapy for patients with unresectable cSCC expressing EGFR in a phase II trial of 36 patients. The disease control rate was 69% overall, with eight partial responses and two complete responses (22). Drugs targeting HER2 included the monoclonal antibodies trastuzumab and pertuzumab, used for HER2 receptor-positive breast cancer (23,24). Ibrutinib, a BTKI, is used to treat B-cell cancers such as chronic lymphocytic leukemia, mantle cell lymphoma, and Waldenström's macroglobulinemia (25). TKIs that work by blocking the Bcr-Abl tyrosine kinase included dasatinib, imatinib, bosutinib and ponatinib, which have been used to treat chronic myelogenous leukemia (CML) and Philadelphia chromosome-positive (Ph+) acute lymphocytic leukemia (ALL) (26). The ALK inhibitors crizotinib and ceritinib were approved for the treatment of some non-small cell lung cancer (NSCLC) cases in the US and other countries, but have not been formally evaluated in the setting of cSCC (27). Other drugs such as masitinib, sorafenib and sunitinib inhibit cellular signaling by targeting multiple RTKs, including platelet-derived growth factor receptor (PDGFR), vascular endothelial growth factor receptor (VEGFR), fibroblast growth factor receptor (FGFR) and RAF family kinases, which play an important role in both tumor angiogenesis and tumor cell proliferation (28-30).
PI3K/AKT/mTOR is one of the major signaling pathways that have been identified as important in cancer. mTOR is a key kinase downstream of the PI3K/AKT pathway, which regulates tumor cell proliferation, growth, angiogenesis and survival (31). Drugs targeting this signaling pathway include AKT/PI3K inhibitors (perifosine, omipalisib, apitolisib and ipatasertib) and mTOR inhibitors (everolimus, sirolimus and temsirolimus). Another important pathway, the MAPK signaling pathway, is closely related to the pathology of cSCC, which is consistent with the results of the KEGG pathway analysis. Drugs related to this signaling pathway include the p38-MAPK inhibitor doramapimod, and the MEK inhibitor trametinib, which was approved for use in combination with dabrafenib for the treatment of patients with BRAF V600E/K-mutant metastatic melanoma (32).
Two of the drugs on the list, palbociclib and ribociclib, are CDK4 and CDK6 inhibitors used for the treatment of ER-positive and HER2-negative breast cancer. One drug, vorinostat, an inhibitor of HDACs, was reported to block proliferation by inhibiting cell cycle regulatory machinery (33). Nonsteroidal anti-inflammatory drugs (NSAIDs), such as aspirin, indomethacin and sulindac, are commonly used in the treatment of acute or chronic inflammatory conditions. Other drugs included acitretin, a second-generation retinoid used for psoriasis; bevacizumab, a recombinant monoclonal antibody that blocks angiogenesis by inhibiting vascular endothelial growth factor A (VEGF-A); and dimethyl sulfoxide (DMSO), used as a topical analgesic, anti-inflammatory and antioxidant (34).

Discussion
Cutaneous squamous cell carcinoma (cSCC) is the second most common skin cancer, with a high metastasis rate and a >70% mortality rate when metastasized. It is generally accepted that the majority of cSCCs are treated with surgical excision. However, research into drugs as an adjuvant therapy is limited. In this study, we identified 11 target genes following gene set enrichment analysis, targetable by 54 drugs for possible cSCC treatment (Table III). Genetic changes in this carcinogenic process alter cell function, allowing for self-sufficiency in growth signals, insensitivity to antigrowth signals, escape from apoptosis, unlimited replicative potential, invasion, angiogenesis and metastasis, which is consistent with the enriched biological processes identified using GeneCodis analysis (Table I), such as 'apoptotic process', 'cell proliferation', 'cell adhesion' and 'signal transduction'. Potential drugs identified by the drug-gene interaction search were divided into chemotherapy agents, TKIs, HER2 inhibitors, EGFR inhibitors, AKT/PI3K inhibitors, mTOR inhibitors, MAPK inhibitors, CDK inhibitors, HDAC inhibitors, NSAIDs and others, acting upon signaling pathways including 'MAPK signaling pathway', 'ErbB signaling pathway', 'p53 signaling pathway', 'apoptosis' and 'VEGF signaling pathway' (Table II).
Table III note: Fifty-four drugs which met the criteria of targeting one of the candidate genes by an appropriate interaction were collected in the final list. Whether the drug has been approved by the FDA and whether it has been approved for cutaneous squamous cell carcinoma are documented in the table.
Chemotherapy plays a crucial role alone or in combination with other treatment modalities in cSCC. Current chemotherapeutic agents used in advanced or metastatic disease are as follows: platin derivatives (e.g. cisplatin or carboplatin), 5-fluorouracil (5-FU), bleomycin, methotrexate, adriamycin, taxanes, gemcitabine, or ifosfamide alone or in combination, 6 of which appeared in our drug list. Carboplatin and docetaxel have been used previously in cSCC as a routine clinical treatment with good overall and progression-free survival (35,36), while a study has shown that 5-FU, when administered orally in 14 patients with aggressive cSCC, achieved some success, with 64.3% exhibiting measurable improvement and 14.3% partially responding (37). Mutations in tyrosine kinase receptor genes have been observed in cSCC. The frequent amplification of EGFR and the closely related ERBB2 can contribute to elevated receptor activity, which is consistent with our gene search results. EGFR, a member of the ERBB growth factor receptor family, has an extracellular ligand-binding domain, a transmembrane region and a transduction module consisting of a TK motif and several autophosphorylation sites, and initiates signal transduction via activation of a receptor-associated TK. The family also includes ERBB2 (HER2), ERBB3, and ERBB4 (38). These receptors regulate apoptosis, cell cycle progression, differentiation, angiogenesis, migration and development. There are at least two types of EGFR inhibitors. The first comprises small-molecule EGFR inhibitors, including gefitinib and erlotinib, which appear in the gene-drug interaction results; these block signaling by binding at the intracellular tyrosine kinase domain. The second comprises antibodies against EGFR, including cetuximab and panitumumab, which block the extracellular domain of the receptor and inhibit ligand binding. The downstream pathways, including the p53 signaling pathway, MAPK signaling pathway, PI3K/AKT pathway and nuclear factor-κB (NF-κB) pathway, are activated by signaling through EGFR family members (Fig. 6) (39).
Gefitinib is a reversible EGFR-TKI, which regulates the EGFR/MAPK/Pak1 pathways (40). A phase II trial evaluating the efficacy of gefitinib in aggressive cSCC showed that gefitinib, as a neoadjuvant therapy prior to definitive treatment, was well tolerated and did not interfere with the definitive treatment, with 18.2% of patients having a complete response and 27.3% having a partial response (41). A single-arm phase II study in 40 patients with cSCC not amenable to curative therapy showed an overall response rate of 16% with a favorable adverse event profile, which demonstrated modest activity of gefitinib in incurable cSCC (42). Analogously, erlotinib, another orally available reversible EGFR inhibitor, has produced responses in advanced cSCC alone or in combination with other therapies (43). Although many results of trials using either gefitinib or erlotinib have been promising, especially considering the acceptance of oral therapy by patients and drug tolerance, none of the trials can be considered definitive due to their lack of controls. Cetuximab, a monoclonal antibody that competitively inhibits EGFR, administered in combination with simultaneous high-dose radiotherapy, significantly improved local tumor control and survival compared to irradiation alone in a randomized phase III clinical trial in patients with head and neck squamous cell carcinoma (HNSCC) (44). The potential of cetuximab to treat unresectable advanced cSCC alone or combined with radiotherapy was supported by several studies, but further randomized studies are still needed (45,46). Lapatinib, a dual tyrosine kinase inhibitor which blocks the HER2/neu and EGFR pathways, was reported to inhibit the PI3K/AKT/mTOR and ERK1/2 signaling pathways, leading to autophagy-inducing and EMT-suppressing effects in A431 cells (43). Thus, lapatinib may represent a promising anticancer drug for cSCC treatment.
The PI3K/AKT/mTOR pathway is frequently activated in cSCC, compared to actinic keratosis or normal skin (47). Class I PI3Ks phosphorylate phosphatidylinositol 4,5-bisphosphate (PIP2) at the 3-OH of the inositol ring to generate phosphatidylinositol 3,4,5-trisphosphate (PIP3), which in turn activates AKT and the downstream mTOR complex to play key roles in carcinogenesis (48). Activation of AKT generates downstream effects, including the promotion of cell proliferation, migration and invasion, an increase in cellular glycolytic flux and the inhibition of cellular apoptosis (49). Phosphorylated mTORC1 activates p70 S6 kinase, which enhances mRNA translation and drives cell growth by activating the ribosomal protein S6 and elongation factor 2, while mTORC2 directly phosphorylates AKT on serine 473, which promotes cell survival and proliferation via the activation of NF-κB and upregulation of cyclin D1 (50,51). Therefore, the PI3K/AKT/mTOR pathway has become a potential therapeutic target in cancer treatment (Fig. 6). As a result, inhibitors of PI3K/AKT/mTOR have been explored, among which idelalisib, a selective p110δ inhibitor, has been approved for the treatment of chronic lymphocytic leukemia (CLL) and follicular B-cell non-Hodgkin's lymphoma (48). However, no clinical studies have reported the role of PI3K/AKT pathway-targeting drugs in cSCC to date.
Apitolisib, a dual PI3K and mTOR inhibitor found in our drug list, demonstrated favorable pharmacokinetics and evidence of biological activity in a phase I study in patients with solid tumors. However, the therapeutic window was narrow owing to a putative increased risk of pulmonary toxicity (48). Everolimus, an mTORC1 inhibitor, is being investigated in a phase I/II study of induction chemotherapy with weekly everolimus plus carboplatin and paclitaxel in locally unresectable advanced HNSCC. The treatment is well tolerated, with an overall response rate of 79% (52). Another phase I and pharmacokinetic study of everolimus in combination with cetuximab and carboplatin for recurrent or metastatic HNSCC showed an encouraging response rate of 61.5% and a progression-free survival of 8.15 months, which suggested possible clinical efficacy in a select group of patients with cSCC (49).
The MAPK pathway is another major oncogenic pathway in human cancer (Fig. 6). PI3K and RAF can both be activated by RAS, a monomeric membrane-associated GTP-binding protein. RAF kinase transduces signals through a MAPK cascade. The PI3K/AKT/mTOR and MAPK signaling pathways are therefore compensatory pathways that mediate cell survival through co-regulated proteins and negatively regulate each other. AKT directly phosphorylates and inactivates RAF, while MEK suppresses PI3K signaling by promoting the membrane localization of phosphatase and tensin homolog (PTEN) (31). The MAPK signaling pathway was the second most highly enriched pathway annotation in our pathway analysis. Two drugs, trametinib, a MEK inhibitor, and doramapimod, a p38 MAPK inhibitor, were included in our final drug list. Trametinib was approved by the US Food and Drug Administration and the European Medicines Agency as a single agent for the treatment of patients with BRAF V600E-mutated metastatic melanoma (53). However, according to previous studies, patients treated with the BRAF V600E inhibitors vemurafenib and dabrafenib rapidly develop cSCC (17). Evidence indicates that these inhibitors may result in a paradoxical activation of the MAPK pathway, which in turn acts in tandem with mutations in other tumor suppressors or oncogenes such as TP53 and HRAS (54). Therefore, inhibition of the MAPK pathway may provide a potential treatment option for cSCC.
The NF-κB signaling pathway is a major pathway mediating several fatal processes in cancer cells. The PI3K/AKT and MAPK pathways have been shown to regulate the expression and activity of the NF-κB transcription factor in cancer (55). NF-κB cooperates with many signaling molecules and pathways. It engages in crosstalk with other transcription factors, including STAT3, p53 and the ETS family member ERG, which either directly interact with NF-κB subunits or affect NF-κB target genes (56,57). NF-κB can also be regulated by other signaling pathways. Kinases activating or regulating NF-κB include members of the MAPK family [Jun-N-terminal kinase (JNK) and p38], protein kinase C (PKC), AKT and PI3K, which regulate the expression and activity of NF-κB or affect upstream signaling pathways (58,59). Inhibitors of either the PI3K/AKT or MAPK signaling pathways may therefore contribute to the inhibition of the NF-κB signaling cascade, which supports our research findings to a certain degree.
One drug in our final drug list, the HDAC inhibitor vorinostat, has shown potential to inhibit the growth of human xenograft tumors. A study conducted by Kurundkar et al showed that vorinostat impaired proliferation and migration, and induced apoptosis, by downregulating the ERK and AKT/mTOR signaling pathways; more specifically, it inhibited mTOR signaling, which was accompanied by a reduction in the cell survival-associated AKT and ERK signaling pathways (33). A study on osteoarthritis demonstrated that vorinostat inhibits the IL-1β-induced expression of matrix metalloproteinase (MMP)-1, MMP-13 and inducible nitric oxide synthase (iNOS) by inhibiting the phosphorylation of p38 and ERK1/2, and regulates NF-κB by inhibiting the degradation of IκBα and attenuating NF-κB p65 translocation to the nucleus. This suggests that vorinostat targets multiple proliferation and growth regulatory pathways by inhibiting HDACs, which provides the rationale for a novel mechanism-based therapeutic intervention for cSCC (33).
Three of the drugs found in our drug list are NSAIDs. These are closely related to cSCC pathology, as exposure to UV radiation damages skin cells, leading to the release of inflammatory mediators such as cyclooxygenase (COX)-2 and its product prostaglandin E2, which promote skin carcinogenesis (23). Overexpression of COX-2 has been observed in cSCC (60). NSAIDs inhibit COX-2 and suppress the production of prostaglandin E2, which prevents E2 and E4 receptors from upregulating inflammatory cytokines that recruit immune cells and promote NF-κB activation. A meta-analysis reported that the use of non-aspirin NSAIDs or any NSAIDs significantly reduced the risk of developing cSCC by 15 and 18%, respectively, suggesting that NSAIDs collectively have the potential to prevent the development of cSCC (61).
The limitations of this study are associated with the databases we used and the criteria we set in each screening step. These databases may have limited data sources; thus, our analysis may have to be repeated as the databases become more comprehensive. The criteria that we set are likely to be subjective. Different rational criteria could be applied to further explore the best outcome.
In conclusion, we presented a method to explore candidate drugs which target the genes/pathways that are relevant to cSCC, with the aim of elucidating potential treatments. Such methods may be used routinely as databases and analytical tools evolve and improve. Using this method, we identified a total of 54 potential drugs, including 25 chemotherapy agents, 21 TKIs, 7 inhibitors of AKT/PI3K/mTOR, 2 inhibitors of MAPK, 2 CDK inhibitors, 1 HDAC inhibitor, 3 NSAIDs and 3 other drugs. Forty-nine of the 54 have not yet been tested in cSCC, which provides a basis for new trials and the development of novel targeted therapies as potential treatments for cSCC.

Figure 5. The results of the cBioPortal analysis. (A) The OncoPrint tab: The OncoPrint tab summarizes genomic alterations in all queried genes across a sample set. Each row represents a gene, and each column represents a tumor sample. Orange squares indicate in-frame mutations with unknown significance, green squares are missense mutations, and black and grey squares indicate truncating mutations. (B) The Survival tab: This figure shows the overall survival of cutaneous squamous cell carcinoma patients with or without alterations in the query genes. The red curves in the Kaplan-Meier plots include all cases with one or more alterations in the query genes; the blue curves include all samples without alterations in the query genes. (C) The Network tab: This figure shows network analysis of 11 genes and their target drug networks in cutaneous squamous cell carcinoma. The query genes are outlined with a thick border, and the nearest neighbor genes are color-coded by their alteration frequency in cutaneous squamous cell carcinoma. All drugs, including FDA-approved and non-FDA-approved, are shown in the network.

Figure 6. Key signaling pathways involved in cutaneous squamous cell carcinoma. Mutations induced by UVB exposure can interfere with multiple cellular pathways. Receptor tyrosine kinases (RTKs) are phosphorylated following growth factor binding. Downstream pathways are activated.

Table I. Summary of biological process gene set enrichment analysis.

Table II. Summary of KEGG process gene set enrichment analysis.

Table III. Candidate drugs targeting genes associated with cSCC.