Screening of differentially expressed genes related to esophageal squamous cell carcinoma and functional analysis with DNA microarrays
- Authors:
- Published online on: January 21, 2014 https://doi.org/10.3892/ijo.2014.2262
- Pages: 1163-1170
Abstract
Introduction
Esophageal cancer is the eighth most common cancer type and the sixth leading cause of cancer mortality worlwide (1). Esophageal squamous cell carcinoma (ESCC) is the predominant histologic type of esophageal cancer. The disease development is affected by multiple factors, so well-designed multidisciplinary epidemiologic study is needed to examine their roles in ESCC risk (2). A wide range of molecular changes is associated with ESCC, possibly because the esophagus is exposed to many kinds of carcinogens including alcohol and cigarette smoke (3). Despite treatment improvements, such as esophagectomy with lymph node dissection (4) and endoscopic submucosal dissection (ESD) (5,6), the first-year survival rate remains low at 10–13% (7). Thus, searching for new therapies for esophageal cancer is urgent.
Esophageal carcinogenesis (EC) is a multi-stage process, involving a variety of changes in gene expression and physiological structure change. Common genetic alterations associated with ESCC progression are epidermal growth factor receptor (EGFR) and cyclin D1 overexpression, dysregulation of p120 catenin, p16Ink4a and expression of p53 missense mutations (8). Moreover, a study has shown that the significant rise of CEA and AFP in serum has a significance on the early diagnosis and prognosis of esophageal cancer, and they can be used as the auxiliary diagnosis index of esophageal cancer in clinic (9). By using immunohistochemistry, high-mobility group AT-hook2 (HMGA2) which was overexpressed in the panel of ESCC cell lines has been found to be upregulated in ESCC tissues (10). With the development of molecular biology and its wide use in cancer research, many scholars have carried out in-depth research on the pathogeny of esophageal cancer from gene level, and a variety of significant genes were found, as shown above. However, the mechanism of the disease has not been clarified. Thus, further study is imperative.
With the emerging technology of DNA microarray, it is now possible to screen for alterations in the expression of many genes simultaneously. DNA microarray analysis as a global approach is applied to investigate physiological mechanisms in health and disease (11). Gene expression profiling using microarrays is a robust and straightforward way to study the molecular features of different types and subtypes of cancer at a system level. Because the development and progression of cancer are accompanied by complex changes in patterns of gene expression, the DNA microarray technology provides a very useful tool for studying these complex processes.
In this study, we aimed to find disease-associated genes and gene functions in esophageal squamous cell carcinoma (ESCC) and to gain more insights into the molecular mechanisms of esophageal cancer. Developing the non-invasive esophageal cancer molecular marker detection may provide evidence for early diagnosis and treatment for esophageal cancer.
Materials and methods
Affymetrix microarray data
We extracted the gene expression profiles from GEO (Gene Expression Omnibus) database (ID: GSE20347) (12), which included 17 ESCC and 17 matched normal adjacent tissue samples. Platform information was GPL571 [HG-U133A_2] Affymetrix Human Genome U133A 2.0 Array. Annotation information for all probe sets of ATH1 (25K) is provided by Affymetrix. We downloaded the raw data and the probe annotation files for further analysis.
Data preprocessing
The probe-level data in CEL files were converted into expression measures. Then we imputed missing data with k-nearest neighbor (KNN) algorithm (13) and performed quartile data normalization by the robust multiarray average (RMA) algorithm with defaulted parameters in R affy package (14).
Analysis of differentially expressed genes (DEGs)
The limma package (15) in R was used to identify differentially expressed genes between matched normal adjacent and ESCC tissues. To circumvent the multi-test problem which might induce too many false positive results, the Benjamin and Hochberg (BH) method was used to adjust the raw P-values into false discovery rate (FDR) (16). The FDR <0.05 and |logFC|>2 were used as the cut-off criterion.
Interaction network construction of DEGs
To further analyze these DEGs, we mapped them to the STRING (Search Tool for the Retrieval of Interacting Genes) database (17) to construct interaction network. STRING which is linked to other databases predicted the target gene interaction in use of the gene characteristic and spatial structure. All associations were provided with a probabilistic confidence score, and the pairs with scores were all selected as interactional ones.
Hub protein screening
The node in the interaction network was analyzed to identify the hub protein under the scale-free property of protein-protein interaction (PPI) networks. From previous obtained PPI networks of other species, we have gained knowledge that most of the biological networks are subject to scale-free network property (18), which means in the network, most nodes have only few connections, and those a few nodes which have a large number of connections are the key nodes (the hub). We selected the hub protein through network topology analysis (19) including node degree (20), clustering coefficient (21) and path length (22).
Interaction network construction of the hub gene
The development of many vital activities is achieved by the combination and dissociation of protein. Various important physiological activities and responses to the external and internal environment of cells are based on interactions between proteins to form a signal transduction network system. Therefore, further study of PPI is the necessary premise to understand the vital phenomenon. Osprey software platform is designed for better study of PPI networks and protein complexes. The software itself integrates with BIND (23) and GRID (24) database, involves protein and nucleic acid sequence, and has a collection of more than 50,000 interactions. In our study, Osprey software (25) was applied to identify the interactors of the hub gene to construct an interaction network. DEGs with the same function in the network were annotated through the annotation function of Osprey connected with GO.
Gene Ontology and pathway enrichment analysis
Gene enrichment analysis evaluates differential expression patterns of gene groups with similar or related functions instead of those of individual genes. This approach especially targets gene groups where constituents show subtle but coordinated biological function or property changes by calculating the whole significance of the gene expression changes, which might not be detected by the usual individual gene analysis (26). In our analysis, p-value represents the probability that the gene in a module was randomly endowed with a GO function, and it is usually a standard to assign the module to a function. Smaller p-value is, the more significant biological significance it shows, which proves that the module does not appear randomly, but in order to complete a specific biological function (27). DAVID software, with built-in rich graphical display, clusters the significant gene collections according to their functions, and it has abundant public database links (28). We clustered the DEGs in the interaction network of the selected hub gene. Then DAVID (28,29) was applied to analyze the GO and KEGG pathway enrichment (FDR <0.05).
Results
Analysis of differentially expressed genes
In order to get differentially expressed genes, we obtained publicly available microarray dataset GSE20347 from GEO. After data preprocessing, we identified genes specifically differentially expressed from the standardized data (Fig. 1). Total 538 genes were selected as DEGs at FDR <0.05 and |logFC|>2, including 320 downregulated genes and 218 upregulated genes.
Interaction network construction of DEGs
We mapped the DEGs to the STRING database to construct an interaction network. STRING which is linked to other databases calculated the combined scores of the DEGs in use of the gene characteristic and spatial structure. The interaction network of the DEGs is revealed in Fig. 2.
Hub gene analysis
Based on network topology analysis, we found that the protein interaction network also had scale-free attributes. Three network topology were, respectively, processed with statistical analysis, as shown in Fig. 3. Graph A shows the node degree distribution in the network, and we obtained y = 57.261 x ^ (0.896) (red line in Fig. 3) with the exponential method. The x axis represents the node degrees, which means the number of nodes directly connected with it, and the y axis represents the node number in different degrees. The property of the node degree distribution shows that both loose-connected nodes and close-connected ones exist in the network. Graph B shows the clustering coefficient distribution in the network, and the clustering coefficient is able to display the aggregation degree of nodes. It is seen in the graph that the distribution range of the clustering coefficient is [0, 1]. The nodes with high clustering coefficient are in the minority, and the clustering coefficient distributions mostly locate in the area of <0.5. Graph C shows the path length of the network, indicating that the length of most paths was about at 3–4. After analyzing the three topology property parameters of the network, we can conclude the small world effect of network which is subject to scale-free property. Table I shows the degree of distribution of the top ten nodes in the network. We could see that a high degree of the node is relative to the high clustering coefficient, and the highest degree node is the translocator protein (TSPO).
Hub gene network analysis
Osprey software was used to identify the hub genes and the interactors of gene TSPO from all the DEGs, and to construct PPI network. DEGs with the same function in the network were annotated through the annotation function of Osprey connected with GO. As shown in Fig. 4, different colors represent different GO annotation nodes.
Gene Ontology and pathway enrichment analysis
To gain further insights into the functions of genes in our interaction network, we used DAVID to analyze the significantly enriched GO functional nodes in the network of TSPO, and we finally obtained 12 significant nodes (Table II). Fig. 5 shows the toll-like receptor signaling pathway (FDR <0.05) in which the DEGs in the TSPO network participated. In our study, cathepsin K (CTSK) and interleukin 8 (IL8), which were interacted with TSPO, participated in this pathway.
Discussion
Esophageal carcinogenesis (EC) is a multi-stage process and features high rates of mortality and morbidity. Therefore, there is an urgent need to explore the mechanism of EC, and develop an effective prevention strategy. Our study aimed to find specific expression genes and gene functions in esophageal squamous cell carcinoma with a group of ESCC tissue microarray expression data. After network analysis of DEGs, gene TSPO was selected as the hub gene by using network topology analysis. Among its interactors, gene CTSK and IL8 participated in toll-like receptor signaling pathway which is closely related to tumor occurrence.
The TSPO, which was previously called peripheral-type benzodiazepine receptor (PBR), is an 18-kDa protein with 5 transmembrane domains (30). TSPO plays important roles in steroidogenesis, cell proliferation (in both normal and cancerous cells) and apoptosis (31), and its overexpression has been shown to correlate with tumorigenicity in various human malignancies such as brain, breast, colon, and prostate cancer (32). The TSPO has been suggested to interact with a mitochondrial protein complex, and summarized as mitochondrial membrane permeability transition pore (MPTP), which is considered to regulate the mitochondrial membrane potential (33). In addition, it has been found that the mechanism whereby TSPO initiates the mitochondrial apoptosis cascade includes oxidation of cardiolipins i.e. ROS generation at mitochondrial levels (34,35). A recent study shows that TSPO can modulate the activity of this Fo proton pump in the inner mitochondrial membrane (36). From these studies it is hypothesized that enhanced TSPO levels present a mechanism directed to cause programmed cell death, a mechanism that apparently functions to its potential in established cancers (32,34,36,37). Toll-like receptor (Toll-like receptors, TLRs) is a homologous toll protein that has survived the evolution from fruit flies to human. The toll protein plays a key role in the innate immunity process to defend against microbial invasion by identifying pathogen-associated molecular patterns (PAMPs) (38,39), and mediates tumor immunotherapy through TLRs signaling pathway (40). Human esophageal epithelial cells can sense endogenous danger signals, in part through TLR3 signaling (41). Cathepsin K (CTSK) is a novel and unique lysosomal enzyme, for unlike other lysosomal enzymes which show wide tissue distribution and expression, it is highly cell and tissue specific (42). Recently, it was shown that CTSK affected the innate immune response by compromising TLR9 signaling (43) and TLR7-dependent Th17 polarization (44). In addition, interleukin-8 (IL8) is a chemokine that has an autocrine or paracrine tumor-promoting role and significant potential as a prognostic or predictive cancer biomarker (45). Using the cytokine IL8 as a physiological read-out of the inflammatory response, researchers have found that TLR3 is the most functional of the expressed TLRs in both primary and immortalized esophageal epithelial cell lines (46), and suppression of TLR3 signaling leads to reduced IL-8 induction in esophageal epithelial cells (41).
CTSK can be induced after TLR activation (47), and squamous cell carcinoma (SCC) cells upregulated fibroblastic CTSK expression more potently than normal keratinocytes, which was mainly attributable to SCC-derived IL-1α and IL8 (48). The interaction of gene TSPO and its interactors (CTSK and IL8) may affect the cancer specific gene expression by participating in TLRs signaling pathway.
Reasonable use of chip technology not only makes the analysis of large amounts of biological information quick and easy, but also is helpful to find the key molecular markers associated with esophageal tumor biological behavior accurately. Our discovery may be useful in investigating the complex interacting mechanisms underlying the disease, and provides a new clue and the direction for the early diagnosis of esophageal cancer and treatment. However, further experiments are still needed to confirm our results.
References
Ferlay J, Shin HR, Bray F, Forman D, Mathers C and Parkin DM: Estimates of worldwide burden of cancer in 2008: GLOBOCAN 2008. Int J Cancer. 127:2893–2917. 2010. View Article : Google Scholar : PubMed/NCBI | |
Lin Y, Totsuka Y, He Y, et al: Epidemiology of esophageal cancer in Japan and China. J Epidemiol. 23:233–242. 2013. View Article : Google Scholar : PubMed/NCBI | |
Toh Y, Oki E, Ohgaki K, et al: Alcohol drinking, cigarette smoking, and the development of squamous cell carcinoma of the esophagus: molecular mechanisms of carcinogenesis. Int J Clin Oncol. 15:135–144. 2010. View Article : Google Scholar : PubMed/NCBI | |
Motoyama S, Jin M, Matsuhashi T, et al: Outcomes of patients receiving additional esophagectomy after endoscopic resection for clinically mucosal, but pathologically submucosal, squamous cell carcinoma of the esophagus. Surg Today. 43:638–642. 2013. View Article : Google Scholar | |
Urabe Y, Hiyama T, Tanaka S, Yoshihara M, Arihiro K and Chayama K: Advantages of endoscopic submucosal dissection versus endoscopic oblique aspiration mucosectomy for superficial esophageal tumors. J Gastroenterol Hepatol. 26:275–280. 2011. View Article : Google Scholar | |
Isomoto H, Yamaguchi N, Nakayama T, et al: Management of esophageal stricture after complete circular endoscopic submucosal dissection for superficial esophageal squamous cell carcinoma. BMC Gastroenterol. 11:462011. View Article : Google Scholar | |
Umar SB and Fleischer DE: Esophageal cancer: epidemiology, pathogenesis and prevention. Nat Clin Pract Gastroenterol Hepatol. 5:517–526. 2008. View Article : Google Scholar | |
Wong GS: Periostin and its role in promoting invasion in the tumor microenvironment of esophageal cancer. ProQuest. Paper AAI3551813. | |
Bhatnagar J, Heroman W, Murphy M and Austin GE: Immunohistochemical detection of carcinoembryonic antigen in esophageal carcinomas: a comparison with other gastrointestinal neoplasms. Anticancer Res. 22:1849–1857. 2002. | |
Cheung L, Lai K, Lam A, et al: Gene expression profiling identified high-mobility group AT-hook 2 (HMGA2) as being frequently upregulated in esophageal squamous cell carcinoma. Br J Med Med Res. 3:407–419. 2013. View Article : Google Scholar | |
Spies M, Dasu MR, Svrakic N, et al: Gene expression analysis in burn wounds of rats. Am J Physiol Regul Integr Comp Physiol. 283:R918–R930. 2002.PubMed/NCBI | |
Hu N, Clifford R, Yang H, et al: Genome wide analysis of DNA copy number neutral loss of heterozygosity (CNNLOH) and its relation to gene expression in esophageal squamous cell carcinoma. BMC Genomics. 11:5762010. View Article : Google Scholar : PubMed/NCBI | |
Marques FZ, Campain AE, Tomaszewski M, et al: Gene expression profiling reveals renin mRNA overexpression in human hypertensive kidneys and a role for microRNAs. Hypertension. 58:1093–1098. 2011. View Article : Google Scholar : PubMed/NCBI | |
Fujita A, Sato J, Rodrigues L, Ferreira C and Sogayar M: Evaluating different methods of microarray data normalization. BMC Bioinformatics. 7:4692006. View Article : Google Scholar : PubMed/NCBI | |
Smyth GK: Limma: linear models for microarray data. Bioinformatics and Computational Biology Solutions Using R and Bioconductor. Gentleman R, Carey V, Dudoit S, Irizarry R and Huber W: Springer; New York: pp. 397–420. 2005, View Article : Google Scholar | |
Benjamini Y and Hochberg Y: Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B Stat Methodol. 289–300. 1995. | |
Szklarczyk D, Franceschini A, Kuhn M, et al: The STRING database in 2011: functional interaction networks of proteins, globally integrated and scored. Nucleic Acids Res. 39:D561–D568. 2011. View Article : Google Scholar | |
Albert R: Scale-free networks in cell biology. J Cell Sci. 118:4947–4957. 2005. View Article : Google Scholar : PubMed/NCBI | |
Bruckner S, Hüffner F, Karp RM, Shamir R and Sharan R: Topology-free querying of protein interaction networks. J Comput Biol. 17:237–252. 2010. View Article : Google Scholar : PubMed/NCBI | |
Wu X-R, Zhu Y and Li Y: Analyzing protein interaction networks via random graph model. Int J Inform Technol. 11:125–132. 2005. | |
Miller GA, Shi YY, Qian H and Bomsztyk K: Clustering coefficients of protein-protein interaction networks. Phys Rev E. 75:0519102007. View Article : Google Scholar : PubMed/NCBI | |
Xu K, Bezakova I, Bunimovich L and Yi SV: Path lengths in protein-protein interaction networks and biological complexity. Proteomics. 11:1857–1867. 2011. View Article : Google Scholar : PubMed/NCBI | |
Willis RC and Hogue CW: Searching, viewing, and visualizing data in the Biomolecular Interaction Network Database (BIND). Curr Protoc Bioinformatics. 8.9. 1–8.9. 30. 2006.PubMed/NCBI | |
Breitkreutz B-J, Stark C and Tyers M: The GRID: the general repository for interaction datasets. Genome Biol. 4:R232003. View Article : Google Scholar : PubMed/NCBI | |
Breitkreutz B-J, Stark C and Tyers M: Osprey: a network visualization system. Genome Biol. 4:R222003. View Article : Google Scholar | |
Nam D and Kim SY: Gene-set approach for expression pattern analysis. Brief Bioinform. 9:189–197. 2008. View Article : Google Scholar : PubMed/NCBI | |
Allison DB, Cui X, Page GP and Sabripour M: Microarray data analysis: from disarray to consolidation and consensus. Nat Rev Genet. 7:55–65. 2006. View Article : Google Scholar : PubMed/NCBI | |
Huang da W, Sherman BT and Lempicki RA: Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat Protoc. 4:44–57. 2008.PubMed/NCBI | |
Huang da W, Sherman BT and Lempicki RA: Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists. Nucleic Acids Res. 37:1–13. 2009.PubMed/NCBI | |
Scarf AM, Ittner LM and Kassiou M: The translocator protein (18 kDa): central nervous system disease and drug design. J Med Chem. 52:581–592. 2009. View Article : Google Scholar : PubMed/NCBI | |
Papadopoulos V, Baraldi M, Guilarte TR, et al: Translocator protein (18 kDa): new nomenclature for the peripheral-type benzodiazepine receptor based on its structure and molecular function. Trends Pharmacol Sci. 27:402–409. 2006. View Article : Google Scholar : PubMed/NCBI | |
Veenman L, Alten J, Linnemannstöns K, et al: Potential involvement of F0F1-ATP (synth) ase and reactive oxygen species in apoptosis induction by the antineoplastic agent erucylphosphohomocholine in glioblastoma cell lines. Apoptosis. 15:753–768. 2010. View Article : Google Scholar | |
Veenman L and Gavish M: The role of 18 kDa mitochondrial translocator protein (TSPO) in programmed cell death, and effects of steroids on TSPO expression. Curr Mol Med. 12:398–412. 2012.PubMed/NCBI | |
Veenman L, Shandalov Y and Gavish M: VDAC activation by the 18 kDa translocator protein (TSPO), implications for apoptosis. J Bioenerg Biomembr. 40:199–205. 2008. View Article : Google Scholar : PubMed/NCBI | |
Zeno S, Zaaroor M, Leschiner S, Veenman L and Gavish M: CoCl2 induces apoptosis via the 18 kDa translocator protein in U118MG human glioblastoma cells. Biochemistry. 48:4652–4661. 2009. View Article : Google Scholar : PubMed/NCBI | |
Veenman L, Weizman A, Weisinger G and Gavish M: Expression and functions of the 18 kDa mitochondrial translocator protein (TSPO) in health and disease. Target Drug Deliv Cancer Ther. 2010. | |
Veenman L, Papadopoulos V and Gavish M: Channel-like functions of the 18-kDa translocator protein (TSPO): regulation of apoptosis and steroidogenesis as part of the host-defense response. Curr Pharm Des. 13:2385–2405. 2007. View Article : Google Scholar : PubMed/NCBI | |
Medzhitov R, Preston-Hurlburt P and Janeway CA: A human homologue of the Drosophila toll protein signals activation of adaptive immunity. Nature. 388:394–397. 1997. View Article : Google Scholar | |
Takeda K, Kaisho T and Akira S: Toll-like receptors. Annu Rev Immunol. 21:335–376. 2003. View Article : Google Scholar | |
Hennessy EJ, Parker AE and O’Neill LA: Targeting toll-like receptors: emerging therapeutics? Nat Rev Drug Discov. 9:293–307. 2010. View Article : Google Scholar : PubMed/NCBI | |
Lim DM and Wang ML: Toll-like receptor 3 signaling enables human esophageal epithelial cells to sense endogenous danger signals released by necrotic cells. Am J Physiol Gastrointest Liver Physiol. 301:G91–G99. 2011. View Article : Google Scholar | |
Ho N, Punturieri A, Wilkin D, et al: Mutations of CTSK result in pycnodysostosis via a reduction in cathepsin K protein. J Bone Miner Res. 14:1649–1653. 1999. View Article : Google Scholar : PubMed/NCBI | |
Asagiri M, Hirai T, Kunigami T, et al: Cathepsin K-dependent toll-like receptor 9 signaling revealed in experimental arthritis. Science. 319:624–627. 2008. View Article : Google Scholar : PubMed/NCBI | |
Hirai T, Kanda T, Sato K, et al: Cathepsin K is involved in development of psoriasis-like skin lesions through TLR-dependent Th17 activation. J Immunol. 190:4805–4811. 2013. View Article : Google Scholar : PubMed/NCBI | |
Todorović-Raković N and Milovanović J: Interleukin-8 in breast cancer progression. J Interferon Cytokine Res. 33:563–570. 2013.PubMed/NCBI | |
Lim DM, Narasimhan S, Michaylira CZ and Wang ML: TLR3-mediated NF-κB signaling in human esophageal epithelial cells. Am J Physiol Gastrointest Liver Physiol. 297:G1172–G1180. 2009. | |
Sun Y, Ishibashi M, Seimon T, et al: Free cholesterol accumulation in macrophage membranes activates Toll-like receptors and p38 mitogen-activated protein kinase and induces cathepsin K. Circ Res. 104:455–465. 2009. View Article : Google Scholar : PubMed/NCBI | |
Xie L, Moroi Y, Hayashida S, et al: Cathepsin K-upregulation in fibroblasts promotes matrigel invasive ability of squamous cell carcinoma cells via tumor-derived IL-1α. J Dermatol Sci. 61:45–50. 2011.PubMed/NCBI |