<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v3.0 20080202//EN" "journalpublishing3.dtd">
<article xml:lang="en" article-type="research-article" xmlns:xlink="http://www.w3.org/1999/xlink">
<?release-delay 0|0?>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">OR</journal-id>
<journal-title-group>
<journal-title>Oncology Reports</journal-title>
</journal-title-group>
<issn pub-type="ppub">1021-335X</issn>
<issn pub-type="epub">1791-2431</issn>
<publisher>
<publisher-name>D.A. Spandidos</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3892/or.2019.7368</article-id>
<article-id pub-id-type="publisher-id">OR-0-0-7368</article-id>
<article-categories>
<subj-group>
<subject>Articles</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Weighted gene coexpression analysis indicates that PLAGL2 and POFUT1 are related to the differential features of proximal and distal colorectal cancer</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author"><name><surname>Lv</surname><given-names>Yiming</given-names></name>
<xref rid="af1-or-0-0-7368" ref-type="aff">1</xref>
<xref rid="af2-or-0-0-7368" ref-type="aff">2</xref>
<xref rid="fn1-or-0-0-7368" ref-type="author-notes">&#x002A;</xref></contrib>
<contrib contrib-type="author"><name><surname>Xie</surname><given-names>Binbin</given-names></name>
<xref rid="af3-or-0-0-7368" ref-type="aff">3</xref>
<xref rid="fn1-or-0-0-7368" ref-type="author-notes">&#x002A;</xref></contrib>
<contrib contrib-type="author"><name><surname>Bai</surname><given-names>Bingjun</given-names></name>
<xref rid="af1-or-0-0-7368" ref-type="aff">1</xref>
<xref rid="af2-or-0-0-7368" ref-type="aff">2</xref></contrib>
<contrib contrib-type="author"><name><surname>Shan</surname><given-names>Lina</given-names></name>
<xref rid="af1-or-0-0-7368" ref-type="aff">1</xref>
<xref rid="af2-or-0-0-7368" ref-type="aff">2</xref></contrib>
<contrib contrib-type="author"><name><surname>Zheng</surname><given-names>Wenqian</given-names></name>
<xref rid="af1-or-0-0-7368" ref-type="aff">1</xref>
<xref rid="af2-or-0-0-7368" ref-type="aff">2</xref></contrib>
<contrib contrib-type="author"><name><surname>Huang</surname><given-names>Xuefeng</given-names></name>
<xref rid="af1-or-0-0-7368" ref-type="aff">1</xref>
<xref rid="af2-or-0-0-7368" ref-type="aff">2</xref></contrib>
<contrib contrib-type="author"><name><surname>Zhu</surname><given-names>Hongbo</given-names></name>
<xref rid="af1-or-0-0-7368" ref-type="aff">1</xref>
<xref rid="af2-or-0-0-7368" ref-type="aff">2</xref>
<xref rid="c1-or-0-0-7368" ref-type="corresp"/></contrib>
</contrib-group>
<aff id="af1-or-0-0-7368"><label>1</label>Department of Colorectal Surgery, Sir Run Run Shaw Hospital, School of Medicine, Zhejiang University, Hangzhou, Zhejiang 310016, P.R. China</aff>
<aff id="af2-or-0-0-7368"><label>2</label>Key Laboratory of Biotherapy of Zhejiang Province, Sir Run Run Shaw Hospital, School of Medicine, Zhejiang University, Hangzhou, Zhejiang 310016, P.R. China</aff>
<aff id="af3-or-0-0-7368"><label>3</label>Department of Medical Oncology, Sir Run Run Shaw Hospital, School of Medicine, Zhejiang University, Hangzhou, Zhejiang 310016, P.R. China</aff>
<author-notes>
<corresp id="c1-or-0-0-7368"><italic>Correspondence to</italic>: Dr Hongbo Zhu, Department of Colorectal Surgery, Sir Run Run Shaw Hospital, School of Medicine, Zhejiang University, 3 Qingchun East Road, Hangzhou, Zhejiang 310016, P.R. China, E-mail: <email>ykzhb@zju.edu.cn</email></corresp>
<fn id="fn1-or-0-0-7368"><label>&#x002A;</label><p>Contributed equally</p></fn>
</author-notes>
<pub-date pub-type="ppub">
<month>12</month>
<year>2019</year></pub-date>
<pub-date pub-type="epub">
<day>14</day>
<month>10</month>
<year>2019</year></pub-date>
<volume>42</volume>
<issue>6</issue>
<fpage>2473</fpage>
<lpage>2485</lpage>
<history>
<date date-type="received"><day>18</day><month>05</month><year>2019</year></date>
<date date-type="accepted"><day>09</day><month>08</month><year>2019</year></date>
</history>
<permissions>
<copyright-statement>Copyright: &#x00A9; Lv et al.</copyright-statement>
<copyright-year>2019</copyright-year>
<license license-type="open-access">
<license-p>This is an open access article distributed under the terms of the <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by-nc-nd/4.0/">Creative Commons Attribution-NonCommercial-NoDerivs License</ext-link>, which permits use and distribution in any medium, provided the original work is properly cited, the use is non-commercial and no modifications or adaptations are made.</license-p></license>
</permissions>
<abstract>
<p>In the current era of precision medicine, there is a general consensus that the anatomical site is an important factor in the management of colorectal cancer (CRC). To investigate the molecular mechanisms underlying the differences between proximal and distal CRC and to identify the responsible genes, we analyzed the gene expression patterns of colorectal tumors from two microarray datasets, GSE39582 and GSE14333, on the NCBI Gene Expression Omnibus and the RNA-seq data from TCGA. Weighted gene coexpression network analysis (WGCNA) was applied to construct a gene coexpression network. The red module in GSE39582 and the dark-gray module from the TCGA dataset were found to be highly correlated with the anatomical site of CRC. A total of 12 hub genes were found in the two datasets, 2 of which, PLAG1 like zinc finger 2 (<italic>PLAGL2</italic>) and protein O-fucosyltransferase 1 (<italic>POFUT1</italic>), were common to both and were upregulated in CRC tumor samples. The module with the highest correlation provided references that will help to characterize the difference between left-sided and right-sided CRC. The survival analysis of <italic>PLAGL2</italic> and <italic>POFUT1</italic> expression revealed differences between proximal and distal CRC. Gene set enrichment analysis based on those two genes provided similar results: GPI anchor biosynthesis, peroxisome, and selenoamino acid metabolism. <italic>PLAGL2</italic> and <italic>POFUT1</italic>, which have the highest correlation with tumor location, may serve as biomarkers and therapeutic targets for the precise diagnosis and treatment of CRC in the future.</p>
</abstract>
<kwd-group>
<kwd>bioinformatics analysis</kwd>
<kwd>colorectal cancer</kwd>
<kwd>tumor location</kwd>
<kwd>weighted gene coexpression analysis</kwd>
<kwd>WGCNA</kwd>
<kwd><italic>PLAGL2</italic></kwd>
<kwd><italic>POFUT1</italic></kwd>
</kwd-group>
</article-meta>
</front>
<body>
<sec sec-type="intro">
<title>Introduction</title>
<p>Colorectal cancer (CRC), which accounted for approximately 1.8 million new cases and more than 860,000 deaths in 2018, ranks as the fourth most commonly diagnosed malignancy and the second leading cause of cancer-related deaths worldwide (<xref rid="b1-or-0-0-7368" ref-type="bibr">1</xref>). The incidence and mortality rates of CRC are still increasing rapidly in many developing countries around the world, causing a considerable public health issue (<xref rid="b2-or-0-0-7368" ref-type="bibr">2</xref>).</p>
<p>Nearly three decades ago, J.A. Bufill proposed sub-classifying CRC depending on the anatomical site, either proximal (right) or distal (left) to the splenic flexure (<xref rid="b3-or-0-0-7368" ref-type="bibr">3</xref>). Subsequent research has observed distinct differences in epidemiology and pathological features according to primary tumor location in CRC. In 2000, H. Elsaleh found that the tumor site is associated with survival benefit from adjuvant chemotherapy in CRC (<xref rid="b4-or-0-0-7368" ref-type="bibr">4</xref>). Specifically, patients with right-sided tumors derived a greater survival benefit from adjuvant chemotherapy than patients with left-sided tumors. In addition, the frequency of microsatellite instability (MSI) was much higher in right-sided tumors than in left-sided tumors (<xref rid="b5-or-0-0-7368" ref-type="bibr">5</xref>,<xref rid="b6-or-0-0-7368" ref-type="bibr">6</xref>). It is now well established by a variety of studies in large populations that primary tumor location affects the outcome of chemotherapy and immunotherapy in CRC patients, and that tumor location is a high-risk parameter for prognosis in specific stages. There is a general consensus that primary tumor location plays an important role in CRC development. We could even define right-sided and left-sided tumors as two different diseases that need different treatments (<xref rid="b7-or-0-0-7368" ref-type="bibr">7</xref>). This influence of tumor location may be due to differences in embryological development. Specifically, the right side of the colon has historically been understood to be derived from the embryological midgut, and the left colon arises from the embryological hindgut. The transverse colon is composed of parts of both structures. These different origins could result in different clinical traits.</p>
<p>However, the underlying molecular mechanism governing those different behaviors and outcomes has not been fully elucidated to date. With the popularization of next-generation sequencing technology, there is now abundant published research that uses ChIP-seq or RNA-seq to investigate problems related to cancer. In the last decade, a considerable number of studies have been published on the distinct gene expression between left- and right-sided CRC (<xref rid="b8-or-0-0-7368" ref-type="bibr">8</xref>,<xref rid="b9-or-0-0-7368" ref-type="bibr">9</xref>). Much of the published research on this issue has been restricted to the analysis of differential gene expression, while few previous studies have investigated this problem from the perspective of expression patterns. Weighted gene coexpression analysis (WGCNA) is a powerful tool to describe the correlation patterns among genes across microarray or RNA-seq samples (<xref rid="b10-or-0-0-7368" ref-type="bibr">10</xref>). This method has been widely used to identify modules of tightly correlated genes and summarize such modules using the module eigengene or intramodular hub genes. After the modules are identified, the association between the modules and external clinical traits can be evaluated using eigengene network methodology. This approach has been generally acknowledged and successfully applied to various cancer studies.</p>
<p>In this study, we aimed to utilize the gene expression data from public genomic databases to explore the inner connections and genetic differences between proximal and distal CRC and to use WGCNA to search for the responsible genes.</p>
</sec>
<sec sec-type="materials|methods">
<title>Materials and methods</title>
<sec>
<title/>
<sec>
<title>Data collection</title>
<p>The raw expression data of GSE39582 (<xref rid="b11-or-0-0-7368" ref-type="bibr">11</xref>) and GSE14333 (<xref rid="b12-or-0-0-7368" ref-type="bibr">12</xref>) were retrieved from the Gene Expression Omnibus database (<uri xlink:href="http://www.ncbi.nlm.nih.gov/geo/">http://www.ncbi.nlm.nih.gov/geo/</uri>), both based on the platform of GPL570 Affymetrix Human Genome U133 Plus 2.0 Array. We used the Affy package in R to transform the CEL files of the tumor samples into an expression matrix (<xref rid="b13-or-0-0-7368" ref-type="bibr">13</xref>). To improve the data quality, we used the k-nearest neighbors algorithm (k-NN) from the impute package in R to impute the missing expression data (<xref rid="b14-or-0-0-7368" ref-type="bibr">14</xref>). Meanwhile, the robust multiarray average algorithm (RMA) was utilized to adjust the data for potential batch effects and for background correction (<xref rid="b15-or-0-0-7368" ref-type="bibr">15</xref>). Prior to WGCNA, we filtered out the probes that were absent in all samples. The probe information was then transformed into the official gene symbols using Bioconductor in R. If multiple probes were applied to detect the same mRNA, the average value of the probes was used. The genes that were not differentially expressed between samples had to be excluded from WGCNA, as two genes without notable variance in expression between patients will be highly correlated. We chose the 75&#x0025; most varying genes to construct the weighted gene coexpression networks. Specifically, the median absolute deviation (MAD) was used as a robust measure of variability.</p>
<p>In addition, the level three RNA-sequencing data of both colon adenocarcinoma (COAD) and rectum adenocarcinoma (READ) patients were downloaded from The Cancer Genome Atlas data portal (TCGA; <uri xlink:href="http://cancergenome.nih.gov/">http://cancergenome.nih.gov/</uri>). In contrast to the microarray data, we used the voom function in package limma to normalize the TCGA data and create an expression matrix for samples for which the detailed clinical data are available (<xref rid="b16-or-0-0-7368" ref-type="bibr">16</xref>,<xref rid="b17-or-0-0-7368" ref-type="bibr">17</xref>). The voom method estimates the mean-variance relationship of the log counts and generates a precision weight for each observation. Thus, the WGCNA workflows originally developed for microarray analysis can be used on the RNA-seq data. Further preprocessing steps included the removal of control samples and the genes with zero counts in more than 80&#x0025; of samples. As mentioned before, genes that are not differentially expressed between samples must be excluded; thus, we chose the top 12,000 genes with the highest MAD for the network building. <xref rid="f1-or-0-0-7368" ref-type="fig">Fig. 1</xref> depicts a flow chart for the bioinformatic analysis.</p>
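<p>As an illustration of the MAD-based gene filter described above, the following minimal Python sketch ranks genes by median absolute deviation and keeps a chosen fraction of them. It is not the code used in this study, which was run in R; the function name top_mad_genes, the genes-by-samples layout and the toy matrix are assumptions made for the example, and the MAD is left unscaled, which does not change the ranking.</p>
<preformat preformat-type="code"><![CDATA[
```python
import numpy as np

def top_mad_genes(expr, fraction=0.75):
    """Rank genes by median absolute deviation (MAD) and keep the top share.

    expr: 2-D array with genes in rows and samples in columns (assumed layout).
    fraction: share of genes to keep; 0.75 mirrors the microarray setting.
    Returns the sorted row indices of the kept genes and the MAD of every gene.
    """
    expr = np.asarray(expr, dtype=float)
    med = np.median(expr, axis=1, keepdims=True)
    mad = np.median(np.abs(expr - med), axis=1)   # unscaled MAD per gene
    n_keep = int(round(fraction * expr.shape[0]))
    order = np.argsort(-mad, kind="stable")       # highest MAD first
    return np.sort(order[:n_keep]), mad

# Hypothetical toy matrix: 8 genes measured in 20 samples.
rng = np.random.default_rng(0)
toy = rng.normal(size=(8, 20))
toy[3] = 5.0                                      # a constant gene has MAD 0
kept, mad = top_mad_genes(toy, fraction=0.75)
```
]]></preformat>
<p>In this toy example the constant gene has a MAD of 0 and is therefore excluded. For the RNA-seq data, a fixed count of genes (12,000) was kept rather than a fraction; the same ranking would apply.</p>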
</sec>
<sec>
<title>Construction of weighted gene coexpression networks</title>
<p>The R package &#x2018;WGCNA&#x2019; was used in our study to construct a gene coexpression network (<xref rid="b10-or-0-0-7368" ref-type="bibr">10</xref>). After data collection and normalization, it is crucial that outliers be excluded. However, it was difficult to distinguish outlying samples in a dendrogram when the number of samples was large. To solve this problem, we used the standardized connectivity (Z.K) method recommended by the WGCNA authors with the default threshold, Z.K score &#x2264;2. After filtering out the outlying samples, the expression data were checked with the function integrated in the WGCNA package to confirm that the remaining samples and genes were suitable for analysis.</p>
<p>After filtering out the outliers and bad samples in the dataset, the next step of WGCNA is to build a scale-free network. In a scale-free network, several nodes, which are called hub nodes, are highly connected to other nodes in the network (<xref rid="b18-or-0-0-7368" ref-type="bibr">18</xref>). In our study, we used the unsigned coexpression measure, which means that positive and negative correlations are treated equally. We constructed the gene coexpression network using the following steps.</p>
<p>First, we need a soft thresholding power &#x03B2; to which coexpression similarity is raised to calculate adjacency. By raising the absolute value of the correlation to a power &#x03B2;&#x2265;1 (soft thresholding), the weighted gene coexpression network construction emphasizes high correlations at the expense of low correlations. To determine the best soft threshold power, scale independence and average connectivity degree of modules with different power values were calculated by the gradient method. We selected the power &#x03B2; to ensure that the coexpression network was a &#x2018;scale-free&#x2019; network, which was biologically close to reality. Moreover, to minimize the effects of noise and spurious associations, we subsequently constructed the topological overlap matrix (TOM) from the adjacency matrix and calculated the corresponding dissimilarity (1-TOM) (<xref rid="b19-or-0-0-7368" ref-type="bibr">19</xref>).</p>
<p>In the same way, the second coexpression network was built from TCGA data.</p>
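<p>The adjacency and topological overlap steps can be sketched as follows. This is a minimal NumPy illustration of the standard unsigned definitions (adjacency equal to the absolute Pearson correlation raised to the power &#x03B2;, followed by the topological overlap measure), not the implementation in the WGCNA package; the function names and the toy data are hypothetical.</p>
<preformat preformat-type="code"><![CDATA[
```python
import numpy as np

def unsigned_adjacency(expr, beta):
    """Soft-thresholded adjacency: |Pearson correlation| ** beta, zero diagonal.

    expr: 2-D array with samples in rows and genes in columns (assumed layout).
    """
    cor = np.corrcoef(expr, rowvar=False)
    adj = np.abs(cor) ** beta
    np.fill_diagonal(adj, 0.0)
    return adj

def topological_overlap(adj):
    """Topological overlap for an adjacency matrix with a zero diagonal.

    tom[i, j] = (l_ij + a_ij) / (min(k_i, k_j) + 1 - a_ij), where l_ij is the
    sum of a_iu * a_uj over all other genes u and k_i is the connectivity of i.
    """
    shared = adj @ adj                  # zero diagonal removes u == i and u == j
    k = adj.sum(axis=1)
    k_min = np.minimum.outer(k, k)
    tom = (shared + adj) / (k_min + 1.0 - adj)
    np.fill_diagonal(tom, 1.0)
    return tom

# Hypothetical toy data: 30 samples, 6 genes, soft threshold power of 5.
rng = np.random.default_rng(0)
toy = rng.normal(size=(30, 6))
adj = unsigned_adjacency(toy, beta=5)
tom = topological_overlap(adj)
dissimilarity = 1.0 - tom               # input to hierarchical clustering
```
]]></preformat>
<p>Because the topological overlap of two genes also depends on the neighbors they share, a single spurious pairwise correlation has less influence on the dissimilarity than it would on the raw adjacency.</p>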
</sec>
<sec>
<title>Identification of coexpression modules</title>
<p>The traditional static tree cut method exhibits suboptimal performance on complicated dendrograms. In WGCNA, we tend to use the dynamic tree cut method by hierarchically clustering genes using the dissimilarity matrix (1-TOM) (<xref rid="b20-or-0-0-7368" ref-type="bibr">20</xref>). The minimal size of a module was set as 30, and modules with high similarity were identified by clustering and then merged together with a height cut-off of 0.25. To determine whether the modules are reproducible, we tested the preservation of all modules with an independent gene expression dataset, GSE14333. We used the module preservation function (number of permutations set to 100) integrated in the WGCNA package to calculate the <italic>Z</italic> summary score of each module (<xref rid="b21-or-0-0-7368" ref-type="bibr">21</xref>). In this method, a <italic>Z</italic> summary &#x003C;2 indicates that the modules have no preservation, a <italic>Z</italic> summary of 2&#x2013;10 indicates low to moderate preservation, and a <italic>Z</italic> summary &#x003E;10 means that the module is strongly preserved.</p>
</sec>
<sec>
<title>Finding the key module and its hub gene</title>
<p>The module eigengenes (MEs), which were measured by principal component analysis (PCA), were generated for each coexpressed module along with the module identification procedure.</p>
<p>We used two methods to identify the module of interest. First, we performed a module-trait relationship (MTR) analysis by calculating the correlation between module eigengenes and external clinical parameters, especially the anatomical site of the tumor. The resulting module-trait relationship heatmap allowed us to identify the module most strongly related to tumor location.</p>
<p>Second, we measured gene significance as the correlation of a gene expression profile with a sample trait, and then defined module significance as the average absolute gene significance of all genes in a given module. Then, we drew a barplot of the module significance for all modules detected. The module with the highest module significance had the strongest correlation with the clinical trait.</p>
<p>In the key module, the hub genes were those that showed the most connections in the network. We called this property module membership, also known as eigengene-based connectivity kME, and in this instance, we used the default threshold value of 0.8. In addition to the module membership, the hub genes should also have a relatively high gene significance; in this instance, we used a cut-off value of 0.4 (0.3 for the TCGA dataset). Combining both characteristics, we selected the hub genes in the module.</p>
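<p>The two-criterion hub gene filter can be illustrated with a short Python sketch. It assumes that module membership (kME) is the Pearson correlation of each gene with the module eigengene, taken here as the first principal component of the standardized module expression, and that gene significance is the Pearson correlation of each gene with a numerically coded trait. The function names, the 0/1 coding of tumor site and the simulated data are hypothetical, and the sketch is not the WGCNA implementation.</p>
<preformat preformat-type="code"><![CDATA[
```python
import numpy as np

def column_correlation(matrix, vector):
    """Pearson correlation of every column of matrix with vector."""
    m = matrix - matrix.mean(axis=0)
    v = vector - vector.mean()
    return (m.T @ v) / (np.linalg.norm(m, axis=0) * np.linalg.norm(v))

def module_eigengene(module_expr):
    """First principal component of standardized expression (samples x genes)."""
    z = (module_expr - module_expr.mean(axis=0)) / module_expr.std(axis=0)
    u, _, _ = np.linalg.svd(z, full_matrices=False)
    me = u[:, 0]
    # Orient the eigengene so that it follows the average expression.
    if np.corrcoef(me, z.mean(axis=1))[0, 1] < 0:
        me = -me
    return me

def select_hubs(module_expr, trait, kme_cut=0.8, gs_cut=0.4):
    """Indices of genes passing both the kME and gene significance cut-offs."""
    me = module_eigengene(module_expr)
    kme = column_correlation(module_expr, me)     # module membership
    gs = column_correlation(module_expr, trait)   # gene significance
    hubs = np.flatnonzero((np.abs(kme) > kme_cut) & (np.abs(gs) > gs_cut))
    return hubs, kme, gs

# Simulated module: 4 genes follow a trait-driven signal, 2 genes are noise.
rng = np.random.default_rng(1)
trait = np.tile([0.0, 1.0], 20)                   # hypothetical site coding
driver = 2.0 * trait + rng.normal(scale=0.3, size=40)
related = driver[:, None] + rng.normal(scale=0.1, size=(40, 4))
noise = rng.normal(size=(40, 2))
toy_module = np.hstack([related, noise])
hubs, kme, gs = select_hubs(toy_module, trait)
```
]]></preformat>
<p>Passing gs_cut=0.3 would correspond to the lower gene significance cut-off applied to the TCGA dataset.</p>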
</sec>
<sec>
<title>Validation of the hub genes</title>
<p>We applied Gene Expression Profiling Interactive Analysis (GEPIA) (<uri xlink:href="http://gepia.cancer-pku.cn/">http://gepia.cancer-pku.cn/</uri>) to detect the difference in expression levels of each hub gene between tumor and normal tissues in both the COAD and READ datasets from TCGA (<xref rid="b22-or-0-0-7368" ref-type="bibr">22</xref>). To further validate our method, correlation plots between hub genes were generated by GEPIA, as well.</p>
</sec>
<sec>
<title>Coexpression validation with qPCR</title>
<p>Twenty unselected CRC samples were used for qPCR to validate the coexpression of PLAG1 like zinc finger 2 (<italic>PLAGL2</italic>) and protein O-fucosyltransferase 1 (<italic>POFUT1</italic>). These experimental samples were collected at the Sir Run Run Shaw Hospital of Zhejiang University between January 2004 and December 2006. After total RNA was isolated from tumor specimens using Trizol reagent (Invitrogen; Thermo Fisher Scientific, Inc., Waltham, MA, USA), RNA was quantified with a NanoDrop 2000c spectrophotometer (Thermo Fisher Scientific, Inc.) and reverse transcribed using RNeasy Mini Kit (Takara, Kyoto, Japan) according to the manufacturer&#x0027;s protocols. Quantitative real-time PCR was performed with SYBR Green Master Mix (Takara). Relative expression levels were calculated with the 2<sup>&#x2212;&#x0394;&#x0394;Cq</sup> formula (<xref rid="b23-or-0-0-7368" ref-type="bibr">23</xref>). Expression of mRNA was normalized to &#x03B2;-actin. The primers used were as follows: &#x03B2;-actin_fwd, ACTCTTCCAGCCTTCCTTCC and &#x03B2;-actin_rev, CGTCATACTCCTGCTTGCTG; PLAGL2_fwd, GAGTCAAGTGAAGTGCCAATGT and PLAGL2_rev, TGAGGGCAGCTATATGGTCTC; POFUT1_fwd, AACCAGGCCGATCACTTCTTG and POFUT1_rev, GTTGGTGAAAGGAGGCTTGTG. The primers were designed with an online tool (<uri xlink:href="https://www.genscript.com/tools/real-time-pcr-tagman-primer-design-tool">https://www.genscript.com/tools/real-time-pcr-tagman-primer-design-tool</uri>) and were synthesized by Shanghai Generay Biotech Co. Ltd. (Shanghai, China).</p>
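<p>For readers unfamiliar with the 2<sup>&#x2212;&#x0394;&#x0394;Cq</sup> calculation, a minimal worked example in Python is given here. The Cq values and the function name relative_expression are hypothetical and serve only to show the arithmetic; the choice of calibrator sample is likewise an assumption of the example.</p>
<preformat preformat-type="code"><![CDATA[
```python
def relative_expression(cq_target, cq_reference,
                        cq_target_calibrator, cq_reference_calibrator):
    """Relative expression by the 2^(-ddCq) formula.

    dCq  = Cq(target gene) - Cq(reference gene), computed per sample.
    ddCq = dCq(sample of interest) - dCq(calibrator sample).
    """
    delta_sample = cq_target - cq_reference
    delta_calibrator = cq_target_calibrator - cq_reference_calibrator
    return 2.0 ** (-(delta_sample - delta_calibrator))

# Hypothetical Cq values: the target gene against a beta-actin reference.
fold_change = relative_expression(
    cq_target=24.0, cq_reference=18.0,                       # dCq = 6
    cq_target_calibrator=26.0, cq_reference_calibrator=18.0  # dCq = 8
)
# ddCq = 6 - 8 = -2, so the relative expression is 2 ** 2 = 4.0
```
]]></preformat>
<p>A lower Cq means that the transcript was detected earlier, so a negative &#x0394;&#x0394;Cq corresponds to higher expression relative to the calibrator.</p>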
</sec>
<sec>
<title>Survival analysis</title>
<p>We performed survival analysis for hub genes using the GSE39582 dataset because of its complete overall survival information. Kaplan-Meier analysis and log-rank test were performed to evaluate the association between hub gene expression and patient survival in left- and right-sided CRC, respectively. This procedure utilized the survival package in R (<xref rid="b24-or-0-0-7368" ref-type="bibr">24</xref>), and the Kaplan-Meier survival curves with the at-risk table were drawn using the survminer package (<xref rid="b25-or-0-0-7368" ref-type="bibr">25</xref>).</p>
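<p>The Kaplan-Meier (product-limit) estimate underlying such curves can be sketched in a few lines of Python. This is an illustration only and does not reproduce the survival or survminer packages; the function name kaplan_meier and the toy follow-up data are hypothetical, and the log-rank test is not included.</p>
<preformat preformat-type="code"><![CDATA[
```python
import numpy as np

def kaplan_meier(time, event):
    """Product-limit survival estimate.

    time: follow-up time of each patient.
    event: 1 if death was observed at that time, 0 if the patient was censored.
    Returns the distinct event times and the survival probability after each.
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    event_times = np.unique(time[event])
    survival = []
    s = 1.0
    for t in event_times:
        at_risk = np.sum(time >= t)               # still under observation at t
        deaths = np.sum((time == t) & event)
        s *= 1.0 - deaths / at_risk
        survival.append(s)
    return event_times, np.array(survival)

# Hypothetical follow-up data for five patients (two of them censored).
times, surv = kaplan_meier(time=[1, 2, 2, 3, 4], event=[1, 1, 0, 1, 0])
# times is [1, 2, 3]; surv is approximately [0.8, 0.6, 0.3]
```
]]></preformat>
<p>In the setting described above, the estimate would be computed separately for the high and low expression groups within left-sided and within right-sided tumors.</p>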
</sec>
<sec>
<title>Gene set enrichment analysis</title>
<p>To identify the possible pathway through which hub genes may play a part in the development of CRC, the expression data from GSE14333 was also used to perform Gene Set Enrichment Analysis (GSEA). The expression data of 290 cases were uniformly divided into two groups according to each hub gene&#x0027;s expression value.</p>
<p>We used the GSEA-p 2.0 software to conduct the enrichment analysis (<xref rid="b26-or-0-0-7368" ref-type="bibr">26</xref>). For configuration, &#x2018;c2.cp.kegg.v6.2.symbols.gmt&#x2019; from the Molecular signatures database (MSigDB) 3.0 (<xref rid="b27-or-0-0-7368" ref-type="bibr">27</xref>) was used as the gene set, and the permutation number was set to 1,000 as the default. Finally, P-values &#x003C;0.05 and FDR &#x003C;25&#x0025; were considered to be statistically significant (<xref rid="b28-or-0-0-7368" ref-type="bibr">28</xref>).</p>
</sec>
<sec>
<title>Statistical analysis</title>
<p>In this study, we used the Pearson correlation coefficient to measure the strength of the relationship between variables. The coexpression of <italic>PLAGL2</italic> and <italic>POFUT1</italic> at the mRNA level was modeled by linear regression, and the coefficient of determination was calculated and presented. The independent samples t-test was performed for data comparison in the GEPIA validation. All statistical analyses were performed using R. P-values &#x003C;0.05 were considered to indicate a statistically significant result.</p>
</sec>
</sec>
</sec>
<sec sec-type="results">
<title>Results</title>
<sec>
<title/>
<sec>
<title>Data preprocessing</title>
<p>A workflow of the study is shown in <xref rid="f1-or-0-0-7368" ref-type="fig">Fig. 1</xref>. The dataset GSE39582 contained 585 samples from CRC patients, including 19 normal tissue samples and 566 tumor samples, while GSE14333 had 290 primary CRC tissues. We used the GSE39582 data to build our network and GSE14333 for validation purposes. After data collection, a total of 436 tumor samples with complete clinical information from GSE39582 were obtained. The clinical information of GSE39582 is shown in the clustering dendrogram with the trait heatmap (<xref rid="f2-or-0-0-7368" ref-type="fig">Fig. 2</xref>).</p>
<p>For genes, we transformed the 50,362 probe ids into 22,880 official gene symbols and calculated the median absolute deviation (MAD) of each gene in all samples mentioned above. The 17,160 genes (three-quarters of the total) with the highest MAD were used to construct the final expression network. This step also ensured that no retained gene had a MAD of 0, thereby avoiding further errors when constructing the gene coexpression network.</p>
<p>The preprocessing of the TCGA RNA-seq data was different. We combined the COAD and READ data into one matrix, which has a total of 19,754 genes and 644 samples. Then, we deleted 22 duplicate samples and filtered out the genes with zero expression in more than 80&#x0025; of samples. After voom normalization, we chose the top 12,000 genes with the highest MAD for further analysis.</p>
</sec>
<sec>
<title>Network construction and module identification</title>
<p>To choose the best threshold, we calculated the network topology for soft-thresholding powers from 1 to 20. As shown in <xref rid="f3-or-0-0-7368" ref-type="fig">Fig. 3A</xref>, a power of 5, which was the lowest power at which the scale-free topology fit index reached 0.9, was selected. Afterward, we checked the mean connectivity (<xref rid="f3-or-0-0-7368" ref-type="fig">Fig. 3B</xref>) and double-checked the scale-free topology R<sup>2</sup> with a linear regression plot (<xref rid="f3-or-0-0-7368" ref-type="fig">Fig. 3C</xref>). <xref rid="f3-or-0-0-7368" ref-type="fig">Fig. 3D</xref> contains a histogram of the frequency of connections. A highly skewed histogram indicates that the network approximates a scale-free topology.</p>
<p>The coexpression similarity matrix was then transformed into the adjacency matrix by choosing 5 as a soft threshold, and a topological overlap matrix (TOM) was subsequently computed. Using the dynamic tree cut method, a total of 38 modules were identified. The modules with a correlation higher than 0.75 were subsequently merged, resulting in a final set of 31 modules (<xref rid="f4-or-0-0-7368" ref-type="fig">Fig. 4</xref>). The gray module includes genes that were not assigned to any gene modules.</p>
<p>In the network built from the TCGA dataset, the calculated soft threshold power was 7 (<xref rid="f5-or-0-0-7368" ref-type="fig">Fig. 5A</xref>). Ultimately, 26 gene modules were recognized (<xref rid="f5-or-0-0-7368" ref-type="fig">Fig. 5C</xref>).</p>
</sec>
<sec>
<title>Identification of key modules</title>
<p>To analyze the relationship between gene modules and sample clinical information, we employed module eigengene (ME) as the average gene expression level of the corresponding modules. It can be considered a representative of the gene expression profiles in a module. The correlations between module eigengene and clinical phenotypes in GSE39582 were calculated and plotted as a labeled heatmap (<xref rid="f6-or-0-0-7368" ref-type="fig">Fig. 6</xref>). The red module and orange module were significantly associated with tumor location.</p>
<p>We calculated gene significance based on the correlation of a gene expression profile with the samples&#x0027; location traits. Then, the module significance was defined as the average absolute value of the gene significance of all genes in one module. As shown in <xref rid="f7-or-0-0-7368" ref-type="fig">Fig. 7A</xref>, the red and orange modules had considerably stronger correlations with tumor location than did the rest of the modules.</p>
<p>To determine the module&#x0027;s reproducibility, module preservation analysis was performed using an independent dataset, GSE14333. As shown in <xref rid="f7-or-0-0-7368" ref-type="fig">Fig. 7B</xref>, modules below the green dashed line (Z summary &#x003C;10) are not strongly preserved, while the modules above the line are well preserved in the CRC tissues. The red module, according to the preservation test, is highly preserved in CRC; however, the orange module showed only moderate preservation. Thus, we chose the red module for further analysis.</p>
<p>Again, the same method was applied to TCGA data, locating a similar dark-gray module (<xref rid="f5-or-0-0-7368" ref-type="fig">Fig. 5D</xref>).</p>
</sec>
<sec>
<title>Identification of hub genes in the key module</title>
<p>There were 865 genes in the GSE39582 red module. After plotting the gene significance against module membership, we observed that genes with higher module memberships tended to have higher gene significance in this module (<xref rid="f7-or-0-0-7368" ref-type="fig">Fig. 7C</xref>). We used relatively stringent criteria to select hub genes: an absolute gene significance &#x003E;0.4 and a module membership &#x003E;0.8. Six hub genes were successfully identified. The genes with the highest gene significance were found to be <italic>POFUT1</italic> and <italic>PLAGL2</italic>, which are labeled in blue print in <xref rid="f7-or-0-0-7368" ref-type="fig">Fig. 7C</xref>.</p>
<p>Meanwhile, in the TCGA dark-gray module, we used an absolute gene significance &#x003E;0.3 to select 8 hub genes (<xref rid="f5-or-0-0-7368" ref-type="fig">Fig. 5B</xref>). After combining the two datasets, we determined that there were 12 possible hub genes, 2 of which were common to both (<xref rid="tI-or-0-0-7368" ref-type="table">Table I</xref>).</p>
</sec>
<sec>
<title>Validation of the hub genes</title>
<p>We concentrated on <italic>PLAGL2</italic> and <italic>POFUT1</italic> because of their high gene significance and their presence in both datasets. We then evaluated their expression with the online TCGA-based tool GEPIA. <italic>PLAGL2</italic> and <italic>POFUT1</italic> were found to be significantly differentially expressed between tumor and normal tissue in both the COAD and READ datasets (<xref rid="f8-or-0-0-7368" ref-type="fig">Fig. 8A and B</xref>). We also performed a correlation analysis between <italic>PLAGL2</italic> and <italic>POFUT1</italic>. The plot shows that the two genes are tightly correlated in CRC, with a Pearson correlation coefficient of approximately 0.9 (<xref rid="f8-or-0-0-7368" ref-type="fig">Fig. 8C</xref>).</p>
<p>We utilized quantitative polymerase chain reaction (qPCR) to measure the RNA expression of <italic>PLAGL2</italic> and <italic>POFUT1</italic> in CRC samples. <italic>PLAGL2</italic> had a high positive correlation with <italic>POFUT1</italic> according to the qPCR results (<xref rid="f8-or-0-0-7368" ref-type="fig">Fig. 8D</xref>).</p>
</sec>
<sec>
<title>Survival analysis and gene set enrichment analysis</title>
<p>For survival analysis, Kaplan-Meier curves were drawn for <italic>PLAGL2</italic> and <italic>POFUT1</italic> in both proximal and distal CRC (<xref rid="f9-or-0-0-7368" ref-type="fig">Fig. 9</xref>). Although the log-rank P-value of all the analyses was &#x003E;0.05 (not statistically significant), we still compared the results from different parts of the colon. In proximal CRC samples, there was a clear trend that high <italic>PLAGL2</italic> and <italic>POFUT1</italic> expression is related to adverse prognosis in CRC patients. However, in distal CRC samples, the expression of <italic>POFUT1</italic> was not related to survival, and the high expression of <italic>PLAGL2</italic> was even associated with poor survival.</p>
<p>We also performed a Gene Set Enrichment Analysis based on the expression level of <italic>PLAGL2</italic> and <italic>POFUT1</italic>. As shown in <xref rid="f10-or-0-0-7368" ref-type="fig">Fig. 10</xref>, these two genes share similar enriched KEGG pathways: glycosylphosphatidylinositol (GPI) anchor biosynthesis, peroxisome, and selenoamino acid metabolism.</p>
</sec>
</sec>
</sec>
<sec sec-type="discussion">
<title>Discussion</title>
<p>We have only recently (over the past 5 to 10 years) determined that the parts of the colon derived from the midgut and the hindgut are different. Numerous studies have investigated this subject. In 2015, Guinney and colleagues published a leading article in Nature Medicine. These researchers divided CRC into 4 well-defined subtypes by their gene expression patterns and discovered that certain types are mainly located on one side of the colon rather than being randomly distributed (<xref rid="b29-or-0-0-7368" ref-type="bibr">29</xref>). Moreover, behind this phenomenon, there must be gene expression patterns that we can be investigated.</p>
<p>The information captured by microarray or RNA-seq experiments is notably richer than a list of differentially expressed genes. Microarray and RNA-seq data are more completely represented by considering the relationships between measured transcripts, which can be assessed by pair-wise correlations between gene expression profiles. Prior bioinformatics studies have noted the importance of gene coexpression networks in various types of cancer. However, many studies used only differentially expressed genes to build the coexpression network. This approach is not recommended by the authors of WGCNA, because filtering genes by differential expression yields a set of correlated genes that essentially form a single module (or a few highly correlated modules). Since nonvarying genes usually represent noise, we instead used the genes with the top 75&#x0025; median absolute deviation (MAD) to improve the robustness and confidence of the present analysis.</p>
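<p>The variability filter described above can be illustrated with a short Python sketch. It assumes that the MAD is computed per gene across samples and that the 75&#x0025; of genes with the largest MAD are retained; the function names, the dictionary layout and the toy values are hypothetical and are not taken from the analysis code of this study. The MAD is left unscaled here, as a constant scaling factor would not change the ranking of genes.</p>
<preformat>
```python
import statistics


def median_absolute_deviation(values):
    """Median of absolute deviations from the median (unscaled MAD)."""
    center = statistics.median(values)
    return statistics.median(abs(v - center) for v in values)


def top_mad_genes(expression, fraction=0.75):
    """Return the names of the given fraction of genes with the largest MAD.

    expression : dict mapping gene name to a list of expression values
    fraction   : share of genes to keep (0.75 mirrors the text above)
    """
    if not (fraction > 0 and 1 >= fraction):
        raise ValueError("fraction must be in the interval (0, 1]")
    ranked = sorted(
        expression,
        key=lambda gene: median_absolute_deviation(expression[gene]),
        reverse=True,
    )
    keep = max(1, int(len(ranked) * fraction))
    return ranked[:keep]


# Hypothetical expression matrix with four genes and four samples.
toy_expression = {
    "GENE_A": [5, 5, 5, 5],    # nonvarying, MAD = 0
    "GENE_B": [1, 2, 3, 4],    # MAD = 1.0
    "GENE_C": [1, 5, 9, 13],   # MAD = 4
    "GENE_D": [2, 2, 3, 3],    # MAD = 0.5
}
kept = top_mad_genes(toy_expression)
```
</preformat>
<p>With this toy input, the nonvarying gene is the one discarded, which is the intended effect of filtering on variability rather than on differential expression.</p>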
<p>In this study, we used three different datasets to analyze the gene expression patterns of CRC. These datasets contain different patient information, which leads to different clinical features. However, when we clustered genes into modules by WGCNA, we did not use clinical features of any kind. Considering that the number of samples in these datasets is large, together with the results from the module preservation test, we could assume that the key module we identified is universal. An interesting aspect of the module-to-trait relationship heatmap is that the modules highly correlated with tumor location also correlate highly with mismatch repair (MMR) (<xref rid="b30-or-0-0-7368" ref-type="bibr">30</xref>) and CpG island methylator phenotype (CIMP) (<xref rid="b31-or-0-0-7368" ref-type="bibr">31</xref>) status. In the last decade, extensive studies have addressed this issue and found that tumors with deficient mismatch repair (dMMR; microsatellite instability-high, MSI-H) and the CpG island methylator phenotype are mostly located on the right side of the colon, which matches our sample traits from GSE39582. Although dMMR or CIMP<sup>&#x002B;</sup> samples are not the majority in the dataset, this tendency may introduce a bias, in that the correlation between tumor site and the key module may mainly derive from MMR and CIMP status or other clinical features. To diminish this bias, we also used module significance to define the correlation between modules and the tumor site phenotype, in both GSE39582 and TCGA. Although other clinical information differed slightly, the key modules we found in both datasets were similar and shared several common hub genes.</p>
<p>The fundamental assumption of WGCNA is that genes interact with each other in a scale-free network. In such a network, hub genes play more important roles in the whole module than other genes. Among the cluster of genes that have a strong relationship with the tumor location of CRC, 12 hub genes with high significance were identified in the GSE39582 and TCGA datasets, which may have contributed most to the distinct behaviors. Some of these genes have been reported in other publications to be critical in CRC development and to serve as prognostic biomarkers in specific stages (<xref rid="b32-or-0-0-7368" ref-type="bibr">32</xref>,<xref rid="b33-or-0-0-7368" ref-type="bibr">33</xref>).</p>
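<p>To make the notion of a hub gene concrete, the following Python sketch builds a weighted adjacency by raising the absolute pair-wise correlation to a soft-thresholding power and sums each gene's adjacencies to obtain its connectivity. Several points are assumptions made for illustration only: an unsigned network, Pearson correlation, the power <monospace>beta=6</monospace>, the function names and the toy expression values. Connectivity is used here as a simple proxy for hub status and does not reproduce the hub gene criteria applied in this study.</p>
<preformat>
```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("x and y must be non-empty and of equal length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def soft_threshold_adjacency(expression, beta=6):
    """Unsigned weighted adjacency: a_ij = |cor(x_i, x_j)| ** beta."""
    genes = list(expression)
    adjacency = {gene: {} for gene in genes}
    for i, gene_i in enumerate(genes):
        for gene_j in genes[i + 1:]:
            weight = abs(pearson_r(expression[gene_i], expression[gene_j])) ** beta
            adjacency[gene_i][gene_j] = weight
            adjacency[gene_j][gene_i] = weight
    return adjacency


def connectivity(adjacency):
    """Sum of adjacencies to all other genes, one value per gene."""
    return {gene: sum(neighbors.values()) for gene, neighbors in adjacency.items()}


# Hypothetical expression profiles (four genes, four samples); not study data.
toy_expression = {
    "G1": [1, 2, 3, 4],
    "G2": [2, 4, 6, 8],
    "G3": [1, 2, 3, 5],
    "G4": [3, 1, 4, 2],
}
toy_adjacency = soft_threshold_adjacency(toy_expression)
toy_connectivity = connectivity(toy_adjacency)
```
</preformat>
<p>Raising the correlation to a power suppresses weak correlations while retaining strong ones, so a gene that is poorly correlated with the others (G4 in the toy data) ends up with near-zero connectivity, whereas tightly coexpressed genes accumulate high connectivity.</p>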
<p>As we examined these hub genes, we found that they are all located on the long arm of chromosome 20 (20q11). Previous studies have confirmed that copy number gain in 20q (mostly in 20q11 and 20q13) occurs in more than 65&#x0025; of CRC patients (<xref rid="b34-or-0-0-7368" ref-type="bibr">34</xref>). As a consequence of this gain, multiple genes mapping to the chromosome 20q amplicon contribute to colorectal adenoma-to-carcinoma progression (<xref rid="b35-or-0-0-7368" ref-type="bibr">35</xref>). In the present study, we identified several coexpressed hub genes in 20q11 that may contribute to the differential features of proximal and distal CRC. Among the 12 hub genes displayed in <xref rid="tI-or-0-0-7368" ref-type="table">Table I</xref>, <italic>PLAGL2</italic> and <italic>POFUT1</italic> were not only present in both datasets, but also showed the highest gene significance. We believe that they are more representative than the other genes; thus, we focused on them for further exploration.</p>
<p><italic>PLAGL2</italic> encodes a zinc finger transcription factor that contains seven C2H2 zinc finger motifs and exhibits DNA binding and transcriptional activation activity. Recently, Li <italic>et al</italic> found that overexpression of <italic>PLAGL2</italic> transcriptionally activates Wnt6 and promotes cancer development in CRC (<xref rid="b36-or-0-0-7368" ref-type="bibr">36</xref>). As a transcription factor, <italic>PLAGL2</italic> activates the Wnt/&#x03B2;-catenin pathway by binding to the promoter region of Wnt6.</p>
<p><italic>POFUT1</italic>, on the other hand, is essential for Notch signal transduction in mammals. In 2018, Du <italic>et al</italic> discovered that <italic>POFUT1</italic> promotes CRC development through the activation of Notch1 signaling (<xref rid="b37-or-0-0-7368" ref-type="bibr">37</xref>). Another study by Chabanais <italic>et al</italic> also confirmed that <italic>POFUT1</italic> is overexpressed in CRC from stage I, and its high expression is associated with the metastatic process (<xref rid="b38-or-0-0-7368" ref-type="bibr">38</xref>). In addition, these researchers found that <italic>POFUT1</italic> overexpression is markedly associated with rectal location, which corroborates our finding.</p>
<p>In all the studies reviewed in this article, <italic>PLAGL2</italic> and <italic>POFUT1</italic> are recognized as oncogenes that promote, or at least are associated with, CRC development. Furthermore, the expression of these genes is highly correlated according to our qPCR results and the correlation analysis of the TCGA dataset. As shown by GEPIA (<xref rid="f7-or-0-0-7368" ref-type="fig">Fig. 7</xref>), both genes were significantly differentially expressed between tumor and normal tissue in both the COAD and READ datasets.</p>
<p>Moreover, our survival analysis, although not statistically significant, yielded different results for left- and right-sided CRC with respect to <italic>PLAGL2</italic> and <italic>POFUT1</italic> (<xref rid="f9-or-0-0-7368" ref-type="fig">Fig. 9</xref>). In proximal CRC patients, the red curves, which represent high expression of <italic>PLAGL2</italic> and <italic>POFUT1</italic>, were beneath the blue ones, and the log-rank P-value was on the verge of significance. However, in distal CRC samples, the relationship between <italic>PLAGL2</italic> and <italic>POFUT1</italic> expression and survival was vague and even reversed. This analysis showed a considerable difference between left- and right-sided survival with regard to <italic>PLAGL2</italic> and <italic>POFUT1</italic>, which indirectly indicates that the expression of these genes is related to the tumor location in CRC patients.</p>
<p>According to our GSEA results, these two genes may also act through the glycosylphosphatidylinositol (GPI) anchor biosynthesis, peroxisome, and selenoamino acid metabolism pathways. When we examined the hub genes in <xref rid="tI-or-0-0-7368" ref-type="table">Table I</xref>, we found that one of the hub genes from GSE39582, <italic>PIGU</italic>, is associated with one of the pathways mentioned above. <italic>PIGU</italic> is a component of the GPI transamidase complex that may be involved in the recognition of either the GPI attachment signal or the lipid portion of GPI. This finding supports the view that the functions of the hub genes are as tightly connected as their expression levels, which is the foundation of the WGCNA theory. However, few articles have discussed the association of this gene with the development of CRC. This subject warrants further investigation.</p>
<p>Another thorough study of gene expression in colon cancer, by Slattery <italic>et al</italic>, used Ingenuity Pathway Analysis (IPA) to determine networks associated with deregulated genes (<xref rid="b39-or-0-0-7368" ref-type="bibr">39</xref>). In that study, <italic>PLAGL2</italic> and <italic>POFUT1</italic> were found to be differentially expressed in both the MSI and CIMP status comparisons. In other words, these genes may be related to the anatomical site of CRC through MSI and CIMP status.</p>
<p>The findings of these studies indicate that the hub genes that we found are oncogenes that may relate to the sidedness of CRC. Notably, <italic>PLAGL2</italic> and <italic>POFUT1</italic> are the centers of the module and are differentially expressed between normal and tumor tissues, which makes them promising biomarkers.</p>
<p>As Dr Alan P. Venook noted in Clinical Advances in Hematology &#x0026; Oncology (<xref rid="b40-or-0-0-7368" ref-type="bibr">40</xref>), what matters is not the sidedness of the tumor because sidedness is simply a surrogate for the types of tumors that tend to occur on that side. Our work, while preliminary, suggests that a weak link may exist between the oncogenesis triggered by these genes and the primary site of CRC. However, the underlying mechanism requires further investigation.</p>
</sec>
</body>
<back>
<ack>
<title>Acknowledgements</title>
<p>This research represents partial fulfillment of the requirements for a Master&#x0027;s degree for YL and WZ.</p>
</ack>
<sec>
<title>Funding</title>
<p>This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.</p>
</sec>
<sec>
<title>Availability of data and materials</title>
<p>The datasets analyzed during the current study are available from the corresponding author on reasonable request.</p>
</sec>
<sec>
<title>Authors&#x0027; contributions</title>
<p>YL, BX, XH and HZ conceived and designed the study. BB collected the data. YL and BX performed the bioinformatics analysis. LS and WZ performed the experiments. YL and BX wrote the paper. BB, LS, WZ, XH and HZ reviewed and edited the manuscript. All authors read and approved the manuscript and agree to be accountable for all aspects of the research in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.</p>
</sec>
<sec>
<title>Ethics approval and consent to participate</title>
<p>Research was authorized by the Ethics Committee of Sir Run Run Shaw Hospital and informed consent was obtained from all participating patients. The reference number was 20180226-88.</p>
</sec>
<sec>
<title>Patient consent for publication</title>
<p>Not applicable.</p>
</sec>
<sec>
<title>Competing interests</title>
<p>The authors declare that they have no competing interests.</p>
</sec>
<ref-list>
<title>References</title>
<ref id="b1-or-0-0-7368"><label>1</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Bray</surname><given-names>F</given-names></name><name><surname>Ferlay</surname><given-names>J</given-names></name><name><surname>Soerjomataram</surname><given-names>I</given-names></name><name><surname>Siegel</surname><given-names>RL</given-names></name><name><surname>Torre</surname><given-names>LA</given-names></name><name><surname>Jemal</surname><given-names>A</given-names></name></person-group><article-title>Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries</article-title><source>CA Cancer J Clin</source><volume>68</volume><fpage>394</fpage><lpage>424</lpage><year>2018</year><pub-id pub-id-type="doi">10.3322/caac.21492</pub-id><pub-id pub-id-type="pmid">30207593</pub-id></element-citation></ref>
<ref id="b2-or-0-0-7368"><label>2</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Arnold</surname><given-names>M</given-names></name><name><surname>Sierra</surname><given-names>MS</given-names></name><name><surname>Laversanne</surname><given-names>M</given-names></name><name><surname>Soerjomataram</surname><given-names>I</given-names></name><name><surname>Jemal</surname><given-names>A</given-names></name><name><surname>Bray</surname><given-names>F</given-names></name></person-group><article-title>Global patterns and trends in colorectal cancer incidence and mortality</article-title><source>Gut</source><volume>66</volume><fpage>683</fpage><lpage>691</lpage><year>2017</year><pub-id pub-id-type="doi">10.1136/gutjnl-2015-310912</pub-id><pub-id pub-id-type="pmid">26818619</pub-id></element-citation></ref>
<ref id="b3-or-0-0-7368"><label>3</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Bufill</surname><given-names>JA</given-names></name></person-group><article-title>Colorectal cancer: Evidence for distinct genetic categories based on proximal or distal tumor location</article-title><source>Ann Intern Med</source><volume>113</volume><fpage>779</fpage><lpage>788</lpage><year>1990</year><pub-id pub-id-type="doi">10.7326/0003-4819-113-10-779</pub-id><pub-id pub-id-type="pmid">2240880</pub-id></element-citation></ref>
<ref id="b4-or-0-0-7368"><label>4</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Elsaleh</surname><given-names>H</given-names></name><name><surname>Joseph</surname><given-names>D</given-names></name><name><surname>Grieu</surname><given-names>F</given-names></name><name><surname>Zeps</surname><given-names>N</given-names></name><name><surname>Spry</surname><given-names>N</given-names></name><name><surname>Iacopetta</surname><given-names>B</given-names></name></person-group><article-title>Association of tumour site and sex with survival benefit from adjuvant chemotherapy in colorectal cancer</article-title><source>Lancet</source><volume>355</volume><fpage>1745</fpage><lpage>1750</lpage><year>2000</year><pub-id pub-id-type="doi">10.1016/S0140-6736(00)02261-3</pub-id><pub-id pub-id-type="pmid">10832824</pub-id></element-citation></ref>
<ref id="b5-or-0-0-7368"><label>5</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Deng</surname><given-names>G</given-names></name><name><surname>Kakar</surname><given-names>S</given-names></name><name><surname>Tanaka</surname><given-names>H</given-names></name><name><surname>Matsuzaki</surname><given-names>K</given-names></name><name><surname>Miura</surname><given-names>S</given-names></name><name><surname>Sleisenger</surname><given-names>MH</given-names></name><name><surname>Kim</surname><given-names>YS</given-names></name></person-group><article-title>Proximal and distal colorectal cancers show distinct gene-specific methylation profiles and clinical and molecular characteristics</article-title><source>Eur J Cancer</source><volume>44</volume><fpage>1290</fpage><lpage>1301</lpage><year>2008</year><pub-id pub-id-type="doi">10.1016/j.ejca.2008.03.014</pub-id><pub-id pub-id-type="pmid">18486467</pub-id></element-citation></ref>
<ref id="b6-or-0-0-7368"><label>6</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Minoo</surname><given-names>P</given-names></name><name><surname>Zlobec</surname><given-names>I</given-names></name><name><surname>Peterson</surname><given-names>M</given-names></name><name><surname>Terracciano</surname><given-names>L</given-names></name><name><surname>Lugli</surname><given-names>A</given-names></name></person-group><article-title>Characterization of rectal, proximal and distal colon cancers based on clinicopathological, molecular and protein profiles</article-title><source>Int J Oncol</source><volume>37</volume><fpage>707</fpage><lpage>718</lpage><year>2010</year><pub-id pub-id-type="doi">10.3892/ijo_00000720</pub-id><pub-id pub-id-type="pmid">20664940</pub-id></element-citation></ref>
<ref id="b7-or-0-0-7368"><label>7</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Lee</surname><given-names>GH</given-names></name><name><surname>Malietzis</surname><given-names>G</given-names></name><name><surname>Askari</surname><given-names>A</given-names></name><name><surname>Bernardo</surname><given-names>D</given-names></name><name><surname>Al-Hassi</surname><given-names>HO</given-names></name><name><surname>Clark</surname><given-names>SK</given-names></name></person-group><article-title>Is right-sided colon cancer different to left-sided colorectal cancer? -a systematic review</article-title><source>Eur J Surg Oncol</source><volume>41</volume><fpage>300</fpage><lpage>308</lpage><year>2015</year><pub-id pub-id-type="doi">10.1016/j.ejso.2014.11.001</pub-id><pub-id pub-id-type="pmid">25468456</pub-id></element-citation></ref>
<ref id="b8-or-0-0-7368"><label>8</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Glebov</surname><given-names>OK</given-names></name><name><surname>Rodriguez</surname><given-names>LM</given-names></name><name><surname>Nakahara</surname><given-names>K</given-names></name><name><surname>Jenkins</surname><given-names>J</given-names></name><name><surname>Cliatt</surname><given-names>J</given-names></name><name><surname>Humbyrd</surname><given-names>CJ</given-names></name><name><surname>DeNobile</surname><given-names>J</given-names></name><name><surname>Soballe</surname><given-names>P</given-names></name><name><surname>Simon</surname><given-names>R</given-names></name><name><surname>Wright</surname><given-names>G</given-names></name><etal/></person-group><article-title>Distinguishing right from left colon by the pattern of gene expression</article-title><source>Cancer Epidemiol Biomarkers Prev</source><volume>12</volume><fpage>755</fpage><lpage>762</lpage><year>2003</year><pub-id pub-id-type="pmid">12917207</pub-id></element-citation></ref>
<ref id="b9-or-0-0-7368"><label>9</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Birkenkamp-Demtroder</surname><given-names>K</given-names></name><name><surname>Olesen</surname><given-names>SH</given-names></name><name><surname>S&#x00F8;rensen</surname><given-names>FB</given-names></name><name><surname>Laurberg</surname><given-names>S</given-names></name><name><surname>Laiho</surname><given-names>P</given-names></name><name><surname>Aaltonen</surname><given-names>LA</given-names></name><name><surname>Orntoft</surname><given-names>TF</given-names></name></person-group><article-title>Differential gene expression in colon cancer of the caecum versus the sigmoid and rectosigmoid</article-title><source>Gut</source><volume>54</volume><fpage>374</fpage><lpage>384</lpage><year>2005</year><pub-id pub-id-type="doi">10.1136/gut.2003.036848</pub-id><pub-id pub-id-type="pmid">15710986</pub-id><pub-id pub-id-type="pmcid">1774427</pub-id></element-citation></ref>
<ref id="b10-or-0-0-7368"><label>10</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Langfelder</surname><given-names>P</given-names></name><name><surname>Horvath</surname><given-names>S</given-names></name></person-group><article-title>WGCNA: An R package for weighted correlation network analysis</article-title><source>BMC Bioinformatics</source><volume>9</volume><fpage>559</fpage><year>2008</year><pub-id pub-id-type="doi">10.1186/1471-2105-9-559</pub-id><pub-id pub-id-type="pmid">19114008</pub-id><pub-id pub-id-type="pmcid">2631488</pub-id></element-citation></ref>
<ref id="b11-or-0-0-7368"><label>11</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Marisa</surname><given-names>L</given-names></name><name><surname>de Reynies</surname><given-names>A</given-names></name><name><surname>Duval</surname><given-names>A</given-names></name><name><surname>Selves</surname><given-names>J</given-names></name><name><surname>Gaub</surname><given-names>MP</given-names></name><name><surname>Vescovo</surname><given-names>L</given-names></name><name><surname>Etienne-Grimaldi</surname><given-names>MC</given-names></name><name><surname>Schiappa</surname><given-names>R</given-names></name><name><surname>Guenot</surname><given-names>D</given-names></name><name><surname>Ayadi</surname><given-names>M</given-names></name><etal/></person-group><article-title>Gene expression classification of colon cancer into molecular subtypes: Characterization, validation, and prognostic value</article-title><source>PLoS Med</source><volume>10</volume><fpage>e1001453</fpage><year>2013</year><pub-id pub-id-type="doi">10.1371/journal.pmed.1001453</pub-id><pub-id pub-id-type="pmid">23700391</pub-id><pub-id pub-id-type="pmcid">3660251</pub-id></element-citation></ref>
<ref id="b12-or-0-0-7368"><label>12</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Jorissen</surname><given-names>RN</given-names></name><name><surname>Gibbs</surname><given-names>P</given-names></name><name><surname>Christie</surname><given-names>M</given-names></name><name><surname>Prakash</surname><given-names>S</given-names></name><name><surname>Lipton</surname><given-names>L</given-names></name><name><surname>Desai</surname><given-names>J</given-names></name><name><surname>Kerr</surname><given-names>D</given-names></name><name><surname>Aaltonen</surname><given-names>LA</given-names></name><name><surname>Arango</surname><given-names>D</given-names></name><etal/></person-group><article-title>Metastasis-associated gene expression changes predict poor outcomes in patients with dukes Stage B and C colorectal cancer</article-title><source>Clin Cancer Res</source><volume>15</volume><fpage>7642</fpage><lpage>7651</lpage><year>2009</year><pub-id pub-id-type="doi">10.1158/1078-0432.CCR-09-1431</pub-id><pub-id pub-id-type="pmid">19996206</pub-id><pub-id pub-id-type="pmcid">2920750</pub-id></element-citation></ref>
<ref id="b13-or-0-0-7368"><label>13</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Gautier</surname><given-names>L</given-names></name><name><surname>Cope</surname><given-names>L</given-names></name><name><surname>Bolstad</surname><given-names>BM</given-names></name><name><surname>Irizarry</surname><given-names>RA</given-names></name></person-group><article-title>Affy-analysis of Affymetrix GeneChip data at the probe level</article-title><source>Bioinformatics</source><volume>20</volume><fpage>307</fpage><lpage>315</lpage><year>2004</year><pub-id pub-id-type="doi">10.1093/bioinformatics/btg405</pub-id><pub-id pub-id-type="pmid">14960456</pub-id></element-citation></ref>
<ref id="b14-or-0-0-7368"><label>14</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Hastie</surname><given-names>T</given-names></name><name><surname>Tibshirani</surname><given-names>R</given-names></name><name><surname>Narasimhan</surname><given-names>B</given-names></name><name><surname>Chu</surname><given-names>G</given-names></name></person-group><article-title>Impute: Imputation for microarray data</article-title><source>Bioinformatics</source><volume>17</volume><fpage>520</fpage><lpage>525</lpage><year>2001</year><pub-id pub-id-type="pmid">11395428</pub-id></element-citation></ref>
<ref id="b15-or-0-0-7368"><label>15</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Irizarry</surname><given-names>RA</given-names></name><name><surname>Hobbs</surname><given-names>B</given-names></name><name><surname>Collin</surname><given-names>F</given-names></name><name><surname>Beazer-Barclay</surname><given-names>YD</given-names></name><name><surname>Antonellis</surname><given-names>KJ</given-names></name><name><surname>Scherf</surname><given-names>U</given-names></name><name><surname>Speed</surname><given-names>TP</given-names></name></person-group><article-title>Exploration, normalization, and summaries of high density oligonucleotide array probe level data</article-title><source>Biostatistics</source><volume>4</volume><fpage>249</fpage><lpage>264</lpage><year>2003</year><pub-id pub-id-type="doi">10.1093/biostatistics/4.2.249</pub-id><pub-id pub-id-type="pmid">12925520</pub-id></element-citation></ref>
<ref id="b16-or-0-0-7368"><label>16</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Ritchie</surname><given-names>ME</given-names></name><name><surname>Phipson</surname><given-names>B</given-names></name><name><surname>Wu</surname><given-names>D</given-names></name><name><surname>Hu</surname><given-names>Y</given-names></name><name><surname>Law</surname><given-names>CW</given-names></name><name><surname>Shi</surname><given-names>W</given-names></name><name><surname>Smyth</surname><given-names>GK</given-names></name></person-group><article-title>Limma powers differential expression analyses for RNA-sequencing and microarray studies</article-title><source>Nucleic Acids Res</source><volume>43</volume><fpage>e47</fpage><year>2015</year><pub-id pub-id-type="doi">10.1093/nar/gkv007</pub-id><pub-id pub-id-type="pmid">25605792</pub-id><pub-id pub-id-type="pmcid">4402510</pub-id></element-citation></ref>
<ref id="b17-or-0-0-7368"><label>17</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Law</surname><given-names>CW</given-names></name><name><surname>Chen</surname><given-names>Y</given-names></name><name><surname>Shi</surname><given-names>W</given-names></name><name><surname>Smyth</surname><given-names>GK</given-names></name></person-group><article-title>Voom: Precision weights unlock linear model analysis tools for RNA-seq read counts</article-title><source>Genome Biol</source><volume>15</volume><fpage>R29</fpage><year>2014</year><pub-id pub-id-type="doi">10.1186/gb-2014-15-2-r29</pub-id><pub-id pub-id-type="pmid">24485249</pub-id><pub-id pub-id-type="pmcid">4053721</pub-id></element-citation></ref>
<ref id="b18-or-0-0-7368"><label>18</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Zhang</surname><given-names>B</given-names></name><name><surname>Horvath</surname><given-names>S</given-names></name></person-group><article-title>A general framework for weighted gene co-expression network analysis</article-title><source>Stat Appl Genet Mol Biol</source><volume>4</volume><fpage>17</fpage><year>2005</year><pub-id pub-id-type="doi">10.2202/1544-6115.1128</pub-id></element-citation></ref>
<ref id="b19-or-0-0-7368"><label>19</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Yip</surname><given-names>AM</given-names></name><name><surname>Horvath</surname><given-names>S</given-names></name></person-group><article-title>Gene network interconnectedness and the generalized topological overlap measure</article-title><source>BMC Bioinformatics</source><volume>8</volume><fpage>22</fpage><year>2007</year><pub-id pub-id-type="doi">10.1186/1471-2105-8-22</pub-id><pub-id pub-id-type="pmid">17250769</pub-id><pub-id pub-id-type="pmcid">1797055</pub-id></element-citation></ref>
<ref id="b20-or-0-0-7368"><label>20</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Langfelder</surname><given-names>P</given-names></name><name><surname>Zhang</surname><given-names>B</given-names></name><name><surname>Horvath</surname><given-names>S</given-names></name></person-group><article-title>Defining clusters from a hierarchical cluster tree: The dynamic tree cut package for R</article-title><source>Bioinformatics</source><volume>24</volume><fpage>719</fpage><lpage>720</lpage><year>2008</year><pub-id pub-id-type="doi">10.1093/bioinformatics/btm563</pub-id><pub-id pub-id-type="pmid">18024473</pub-id></element-citation></ref>
<ref id="b21-or-0-0-7368"><label>21</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Langfelder</surname><given-names>P</given-names></name><name><surname>Luo</surname><given-names>R</given-names></name><name><surname>Oldham</surname><given-names>MC</given-names></name><name><surname>Horvath</surname><given-names>S</given-names></name></person-group><article-title>Is my network module preserved and reproducible?</article-title><source>PLoS Comput Biol</source><volume>7</volume><fpage>e1001057</fpage><year>2011</year><pub-id pub-id-type="doi">10.1371/journal.pcbi.1001057</pub-id><pub-id pub-id-type="pmid">21283776</pub-id><pub-id pub-id-type="pmcid">3024255</pub-id></element-citation></ref>
<ref id="b22-or-0-0-7368"><label>22</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Tang</surname><given-names>Z</given-names></name><name><surname>Li</surname><given-names>C</given-names></name><name><surname>Kang</surname><given-names>B</given-names></name><name><surname>Gao</surname><given-names>G</given-names></name><name><surname>Li</surname><given-names>C</given-names></name><name><surname>Zhang</surname><given-names>Z</given-names></name></person-group><article-title>GEPIA: A web server for cancer and normal gene expression profiling and interactive analyses</article-title><source>Nucleic Acids Res</source><volume>45</volume><fpage>W98</fpage><lpage>W102</lpage><year>2017</year><pub-id pub-id-type="doi">10.1093/nar/gkx247</pub-id><pub-id pub-id-type="pmid">28407145</pub-id><pub-id pub-id-type="pmcid">5570223</pub-id></element-citation></ref>
<ref id="b23-or-0-0-7368"><label>23</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Livak</surname><given-names>KJ</given-names></name><name><surname>Schmittgen</surname><given-names>TD</given-names></name></person-group><article-title>Analysis of relative gene expression data using real-time quantitative PCR and the 2(-Delta Delta C(T)) method</article-title><source>Methods</source><volume>25</volume><fpage>402</fpage><lpage>408</lpage><year>2001</year><pub-id pub-id-type="doi">10.1006/meth.2001.1262</pub-id><pub-id pub-id-type="pmid">11846609</pub-id></element-citation></ref>
<ref id="b24-or-0-0-7368"><label>24</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Therneau</surname><given-names>TM</given-names></name><name><surname>Lumley</surname><given-names>T</given-names></name></person-group><article-title>Package &#x2018;survival&#x2019;</article-title><source>Survival analysis Published on CRAN</source><year>2014</year></element-citation></ref>
<ref id="b25-or-0-0-7368"><label>25</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Kassambara</surname><given-names>A</given-names></name><name><surname>Kosinski</surname><given-names>M</given-names></name><name><surname>Biecek</surname><given-names>P</given-names></name></person-group><article-title>Survminer: Drawing survival curves using ggplot2</article-title><comment>R package version 0.3 1</comment><year>2017</year></element-citation></ref>
<ref id="b26-or-0-0-7368"><label>26</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Subramanian</surname><given-names>A</given-names></name><name><surname>Kuehn</surname><given-names>H</given-names></name><name><surname>Gould</surname><given-names>J</given-names></name><name><surname>Tamayo</surname><given-names>P</given-names></name><name><surname>Mesirov</surname><given-names>JP</given-names></name></person-group><article-title>GSEA-P: A desktop application for gene set enrichment analysis</article-title><source>Bioinformatics</source><volume>23</volume><fpage>3251</fpage><lpage>3253</lpage><year>2007</year><pub-id pub-id-type="doi">10.1093/bioinformatics/btm369</pub-id><pub-id pub-id-type="pmid">17644558</pub-id></element-citation></ref>
<ref id="b27-or-0-0-7368"><label>27</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Liberzon</surname><given-names>A</given-names></name><name><surname>Subramanian</surname><given-names>A</given-names></name><name><surname>Pinchback</surname><given-names>R</given-names></name><name><surname>Thorvaldsd&#x00F3;ttir</surname><given-names>H</given-names></name><name><surname>Tamayo</surname><given-names>P</given-names></name><name><surname>Mesirov</surname><given-names>JP</given-names></name></person-group><article-title>Molecular signatures database (MSigDB) 3.0</article-title><source>Bioinformatics</source><volume>27</volume><fpage>1739</fpage><lpage>1740</lpage><year>2011</year><pub-id pub-id-type="doi">10.1093/bioinformatics/btr260</pub-id><pub-id pub-id-type="pmid">21546393</pub-id><pub-id pub-id-type="pmcid">3106198</pub-id></element-citation></ref>
<ref id="b28-or-0-0-7368"><label>28</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Benjamini</surname><given-names>Y</given-names></name><name><surname>Hochberg</surname><given-names>Y</given-names></name></person-group><article-title>Controlling the false discovery rate: A practical and powerful approach to multiple testing</article-title><source>J R Stat Soc: Series B (Methodological)</source><volume>57</volume><fpage>289</fpage><lpage>300</lpage><year>1995</year></element-citation></ref>
<ref id="b29-or-0-0-7368"><label>29</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Guinney</surname><given-names>J</given-names></name><name><surname>Dienstmann</surname><given-names>R</given-names></name><name><surname>Wang</surname><given-names>X</given-names></name><name><surname>de Reyni&#x00E8;s</surname><given-names>A</given-names></name><name><surname>Schlicker</surname><given-names>A</given-names></name><name><surname>Soneson</surname><given-names>C</given-names></name><name><surname>Marisa</surname><given-names>L</given-names></name><name><surname>Roepman</surname><given-names>P</given-names></name><name><surname>Nyamundanda</surname><given-names>G</given-names></name><name><surname>Angelino</surname><given-names>P</given-names></name><etal/></person-group><article-title>The consensus molecular subtypes of colorectal cancer</article-title><source>Nat Med</source><volume>21</volume><fpage>1350</fpage><lpage>1356</lpage><year>2015</year><pub-id pub-id-type="doi">10.1038/nm.3967</pub-id><pub-id pub-id-type="pmid">26457759</pub-id><pub-id pub-id-type="pmcid">4636487</pub-id></element-citation></ref>
<ref id="b30-or-0-0-7368"><label>30</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Boland</surname><given-names>CR</given-names></name><name><surname>Goel</surname><given-names>A</given-names></name></person-group><article-title>Microsatellite instability in colorectal cancer</article-title><source>Gastroenterology</source><volume>138</volume><fpage>2073</fpage><lpage>2087.e3</lpage><year>2010</year><pub-id pub-id-type="doi">10.1053/j.gastro.2009.12.064</pub-id><pub-id pub-id-type="pmid">20420947</pub-id><pub-id pub-id-type="pmcid">3037515</pub-id></element-citation></ref>
<ref id="b31-or-0-0-7368"><label>31</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Toyota</surname><given-names>M</given-names></name><name><surname>Ahuja</surname><given-names>N</given-names></name><name><surname>Ohe-Toyota</surname><given-names>M</given-names></name><name><surname>Herman</surname><given-names>JG</given-names></name><name><surname>Baylin</surname><given-names>SB</given-names></name><name><surname>Issa</surname><given-names>JP</given-names></name></person-group><article-title>CpG island methylator phenotype in colorectal cancer</article-title><source>Proc Natl Acad Sci USA</source><volume>96</volume><fpage>8681</fpage><lpage>8686</lpage><year>1999</year><pub-id pub-id-type="doi">10.1073/pnas.96.15.8681</pub-id><pub-id pub-id-type="pmid">10411935</pub-id></element-citation></ref>
<ref id="b32-or-0-0-7368"><label>32</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Damas</surname><given-names>ND</given-names></name><name><surname>Marcatti</surname><given-names>M</given-names></name><name><surname>C&#x00F4;me</surname><given-names>C</given-names></name><name><surname>Christensen</surname><given-names>LL</given-names></name><name><surname>Nielsen</surname><given-names>MM</given-names></name><name><surname>Baumgartner</surname><given-names>R</given-names></name><name><surname>Gylling</surname><given-names>HM</given-names></name><name><surname>Maglieri</surname><given-names>G</given-names></name><name><surname>Rundsten</surname><given-names>CF</given-names></name><name><surname>Seemann</surname><given-names>SE</given-names></name><etal/></person-group><article-title>SNHG5 promotes colorectal cancer cell survival by counteracting STAU1-mediated mRNA destabilization</article-title><source>Nat Commun</source><volume>7</volume><fpage>13875</fpage><year>2016</year><pub-id pub-id-type="doi">10.1038/ncomms13875</pub-id></element-citation></ref>
<ref id="b33-or-0-0-7368"><label>33</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Song</surname><given-names>S</given-names></name><name><surname>Li</surname><given-names>D</given-names></name><name><surname>Yang</surname><given-names>C</given-names></name><name><surname>Yan</surname><given-names>P</given-names></name><name><surname>Bai</surname><given-names>Y</given-names></name><name><surname>Zhang</surname><given-names>Y</given-names></name><name><surname>Hu</surname><given-names>G</given-names></name><name><surname>Lin</surname><given-names>C</given-names></name><name><surname>Li</surname><given-names>X</given-names></name></person-group><article-title>Overexpression of NELFCD promotes colorectal cancer cells proliferation, migration, and invasion</article-title><source>Onco Targets Ther</source><volume>11</volume><fpage>8741</fpage><lpage>8750</lpage><year>2018</year><pub-id pub-id-type="doi">10.2147/OTT.S186266</pub-id><pub-id pub-id-type="pmid">30584332</pub-id><pub-id pub-id-type="pmcid">6287418</pub-id></element-citation></ref>
<ref id="b34-or-0-0-7368"><label>34</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Sillars-Hardebol</surname><given-names>AH</given-names></name><name><surname>Carvalho</surname><given-names>B</given-names></name><name><surname>Tijssen</surname><given-names>M</given-names></name><name><surname>Beli&#x00EB;n</surname><given-names>JA</given-names></name><name><surname>de Wit</surname><given-names>M</given-names></name><name><surname>Delis-van Diemen</surname><given-names>PM</given-names></name><name><surname>Pont&#x00E9;n</surname><given-names>F</given-names></name><name><surname>van de Wiel</surname><given-names>MA</given-names></name><name><surname>Fijneman</surname><given-names>RJ</given-names></name><name><surname>Meijer</surname><given-names>GA</given-names></name></person-group><article-title>TPX2 and AURKA promote 20q amplicon-driven colorectal adenoma to carcinoma progression</article-title><source>Gut</source><volume>61</volume><fpage>1568</fpage><lpage>1575</lpage><year>2012</year><pub-id pub-id-type="doi">10.1136/gutjnl-2011-301153</pub-id><pub-id pub-id-type="pmid">22207630</pub-id></element-citation></ref>
<ref id="b35-or-0-0-7368"><label>35</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Carvalho</surname><given-names>B</given-names></name><name><surname>Postma</surname><given-names>C</given-names></name><name><surname>Mongera</surname><given-names>S</given-names></name><name><surname>Hopmans</surname><given-names>E</given-names></name><name><surname>Diskin</surname><given-names>S</given-names></name><name><surname>van de Wiel</surname><given-names>MA</given-names></name><name><surname>van Criekinge</surname><given-names>W</given-names></name><name><surname>Thas</surname><given-names>O</given-names></name><name><surname>Matth&#x00E4;i</surname><given-names>A</given-names></name><name><surname>Cuesta</surname><given-names>MA</given-names></name><etal/></person-group><article-title>Multiple putative oncogenes at the chromosome 20q amplicon contribute to colorectal adenoma to carcinoma progression</article-title><source>Gut</source><volume>58</volume><fpage>79</fpage><lpage>89</lpage><year>2009</year><pub-id pub-id-type="doi">10.1136/gut.2007.143065</pub-id><pub-id pub-id-type="pmid">18829976</pub-id></element-citation></ref>
<ref id="b36-or-0-0-7368"><label>36</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Li</surname><given-names>N</given-names></name><name><surname>Li</surname><given-names>D</given-names></name><name><surname>Du</surname><given-names>Y</given-names></name><name><surname>Su</surname><given-names>C</given-names></name><name><surname>Yang</surname><given-names>C</given-names></name><name><surname>Lin</surname><given-names>C</given-names></name><name><surname>Li</surname><given-names>X</given-names></name><name><surname>Hu</surname><given-names>G</given-names></name></person-group><article-title>Overexpressed PLAGL2 transcriptionally activates Wnt6 and promotes cancer development in colorectal cancer</article-title><source>Oncol Rep</source><volume>41</volume><fpage>875</fpage><lpage>884</lpage><year>2019</year><pub-id pub-id-type="pmid">30535429</pub-id></element-citation></ref>
<ref id="b37-or-0-0-7368"><label>37</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Du</surname><given-names>Y</given-names></name><name><surname>Li</surname><given-names>D</given-names></name><name><surname>Li</surname><given-names>N</given-names></name><name><surname>Su</surname><given-names>C</given-names></name><name><surname>Yang</surname><given-names>C</given-names></name><name><surname>Lin</surname><given-names>C</given-names></name><name><surname>Chen</surname><given-names>M</given-names></name><name><surname>Wu</surname><given-names>R</given-names></name><name><surname>Li</surname><given-names>X</given-names></name><name><surname>Hu</surname><given-names>G</given-names></name></person-group><article-title>POFUT1 promotes colorectal cancer development through the activation of Notch1 signaling</article-title><source>Cell Death Dis</source><volume>9</volume><fpage>995</fpage><year>2018</year><pub-id pub-id-type="doi">10.1038/s41419-018-1055-2</pub-id><pub-id pub-id-type="pmid">30250219</pub-id><pub-id pub-id-type="pmcid">6155199</pub-id></element-citation></ref>
<ref id="b38-or-0-0-7368"><label>38</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Chabanais</surname><given-names>J</given-names></name><name><surname>Labrousse</surname><given-names>F</given-names></name><name><surname>Chaunavel</surname><given-names>A</given-names></name><name><surname>Germot</surname><given-names>A</given-names></name><name><surname>Maftah</surname><given-names>A</given-names></name></person-group><article-title>POFUT1 as a promising novel biomarker of colorectal cancer</article-title><source>Cancers</source><volume>10</volume><issue>11</issue><fpage>E411</fpage><year>2018</year><pub-id pub-id-type="doi">10.3390/cancers10110411</pub-id><pub-id pub-id-type="pmid">30380753</pub-id></element-citation></ref>
<ref id="b39-or-0-0-7368"><label>39</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Slattery</surname><given-names>ML</given-names></name><name><surname>Pellatt</surname><given-names>DF</given-names></name><name><surname>Mullany</surname><given-names>LE</given-names></name><name><surname>Wolff</surname><given-names>RK</given-names></name><name><surname>Herrick</surname><given-names>JS</given-names></name></person-group><article-title>Gene expression in colon cancer: A focus on tumor site and molecular phenotype</article-title><source>Genes Chromosomes Cancer</source><volume>54</volume><fpage>527</fpage><lpage>541</lpage><year>2015</year><pub-id pub-id-type="doi">10.1002/gcc.22265</pub-id><pub-id pub-id-type="pmid">26171582</pub-id><pub-id pub-id-type="pmcid">5998821</pub-id></element-citation></ref>
<ref id="b40-or-0-0-7368"><label>40</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Venook</surname><given-names>AP</given-names></name></person-group><article-title>Right-sided vs left-sided colorectal cancer</article-title><source>Clin Adv Hematol Oncol</source><volume>15</volume><fpage>22</fpage><lpage>24</lpage><year>2017</year><pub-id pub-id-type="pmid">28212365</pub-id></element-citation></ref>
</ref-list>
</back>
<floats-group>
<fig id="f1-or-0-0-7368" position="float">
<label>Figure 1.</label>
<caption><p>Flow chart of data preparation, processing, analysis and validation in this study. TCGA, The Cancer Genome Atlas; GSEA, Gene Set Enrichment Analysis.</p></caption>
<graphic xlink:href="or-42-06-2473-g00.tif"/>
</fig>
<fig id="f2-or-0-0-7368" position="float">
<label>Figure 2.</label>
<caption><p>Sample clustering dendrogram and clinical trait heatmap. The clustering was based on the filtered expression data from GSE39582. Red indicates female sex, CIMP<sup>&#x002B;</sup> status, pMMR status and right-sided CRC. Color intensity is proportional to age and to TNM stage, with darker shades indicating older age and higher stage. CRC, colorectal cancer; CIMP, CpG island methylator phenotype; pMMR, proficient mismatch repair.</p></caption>
<graphic xlink:href="or-42-06-2473-g01.tif"/>
</fig>
<fig id="f3-or-0-0-7368" position="float">
<label>Figure 3.</label>
<caption><p>Analysis of the network topology for different adjacency matrix weighting parameters (power). (A and B) The x-axis represents the weighting parameter (power). The y-axis represents the scale-free fit index and the connectivity for each power. (C) Regression line with R<sup>2</sup>=0.92 at a power of 5, indicating that the CRC network exhibits a scale-free topology. (D) Histogram of connectivity (k) at a power of 5. CRC, colorectal cancer.</p></caption>
<graphic xlink:href="or-42-06-2473-g02.tif"/>
</fig>
<fig id="f4-or-0-0-7368" position="float">
<label>Figure 4.</label>
<caption><p>Cluster dendrogram produced by average linkage hierarchical clustering of genes based on the topological overlap matrix (TOM). Each branch (line) in the dendrogram represents a single gene. Each color indicates a single module that contains closely conserved genes.</p></caption>
<graphic xlink:href="or-42-06-2473-g03.tif"/>
</fig>
<fig id="f5-or-0-0-7368" position="float">
<label>Figure 5.</label>
<caption><p>Analysis of TCGA data using the same method identified a dark-gray module that is highly correlated with tumor location. (A) Analysis of scale-free topology model fit vs. the candidate soft threshold powers. (B) Gene significance (y-axis) vs. module membership (x-axis) plotted for the dark-gray module in the TCGA dataset. (C) Cluster dendrogram based on the topological overlap matrix (TOM) in the TCGA dataset. (D) Module-trait relationship heatmap in the TCGA dataset, indicating that the dark-gray module is highly related to tumor location. TCGA, The Cancer Genome Atlas.</p></caption>
<graphic xlink:href="or-42-06-2473-g04.tif"/>
<graphic xlink:href="or-42-06-2473-g05.tif"/>
</fig>
<fig id="f6-or-0-0-7368" position="float">
<label>Figure 6.</label>
<caption><p>Module-trait relationships were evaluated by WGCNA using the GSE39582 microarray dataset comprising 431 human colorectal cancer samples. Gene modules are denoted by arbitrary color names. Each cell shows the Pearson correlation coefficient between the gene expression levels of a module and the noted clinical trait, together with the corresponding P-value. Values of 1 (red) and &#x2212;1 (blue) indicate the strongest positive and negative correlations, respectively, and 0 (white) indicates no correlation. WGCNA, weighted gene coexpression analysis; CIMP, CpG island methylator phenotype; MMR, mismatch repair.</p></caption>
<graphic xlink:href="or-42-06-2473-g06.tif"/>
</fig>
<fig id="f7-or-0-0-7368" position="float">
<label>Figure 7.</label>
<caption><p>(A) Bar plot of mean gene significance across genes associated with tumor location in each module. (B) Module preservation statistics between GSE39582 and the independent dataset GSE14333. The dashed lines mark thresholds at Z=2 and Z=10: a Z summary value &#x003E;10 suggests strong evidence for preservation, a value &#x003E;2 suggests moderate evidence for preservation, and a value &#x003C;2 indicates no preservation. (C) Gene significance (y-axis) vs. module membership (x-axis) plotted for the red module in the GSE39582 dataset. In this module, genes with high module membership tended to have high gene significance. The genes with the highest gene significance are labeled in blue.</p></caption>
<graphic xlink:href="or-42-06-2473-g07.tif"/>
</fig>
<fig id="f8-or-0-0-7368" position="float">
<label>Figure 8.</label>
<caption><p>(A and B) Expression of <italic>PLAGL2</italic> and <italic>POFUT1</italic> in CRC and normal tissues from GEPIA. (C) The gene expression correlation between <italic>PLAGL2</italic> and <italic>POFUT1</italic> from GEPIA. (D) qPCR results indicate a strong relationship between <italic>PLAGL2</italic> and <italic>POFUT1</italic> at the RNA level. CRC, colorectal cancer; GEPIA, Gene Expression Profiling Interactive Analysis; <italic>PLAGL2</italic>, PLAG1 like zinc finger 2; <italic>POFUT1</italic>, protein O-fucosyltransferase 1.</p></caption>
<graphic xlink:href="or-42-06-2473-g08.tif"/>
</fig>
<fig id="f9-or-0-0-7368" position="float">
<label>Figure 9.</label>
<caption><p>Kaplan-Meier (KM) survival curves for (A and B) <italic>PLAGL2</italic> and (C and D) <italic>POFUT1</italic> in proximal and distal CRC, respectively. Patients were divided into high-expression and low-expression groups based on the expression value of the considered gene. CRC, colorectal cancer; <italic>PLAGL2</italic>, PLAG1 like zinc finger 2; <italic>POFUT1</italic>, protein O-fucosyltransferase 1.</p></caption>
<graphic xlink:href="or-42-06-2473-g09.tif"/>
</fig>
<fig id="f10-or-0-0-7368" position="float">
<label>Figure 10.</label>
<caption><p>Gene set enrichment analysis for the groups with high and low expression of (A) <italic>PLAGL2</italic> and (B) <italic>POFUT1</italic>. <italic>PLAGL2</italic>, PLAG1 like zinc finger 2; <italic>POFUT1</italic>, protein O-fucosyltransferase 1.</p></caption>
<graphic xlink:href="or-42-06-2473-g10.tif"/>
</fig>
<table-wrap id="tI-or-0-0-7368" position="float">
<label>Table I.</label>
<caption><p>Twelve hub genes identified in the GSE39582 and TCGA datasets.</p></caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left" valign="bottom">Hub gene</th>
<th align="center" valign="bottom">Entrez Gene ID</th>
<th align="center" valign="bottom">Name</th>
<th align="center" valign="bottom">Cytogenetic location</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" valign="top"><bold><italic>PLAGL2</italic></bold></td>
<td align="right" valign="top"><bold>5326</bold></td>
<td align="left" valign="top"><bold>PLAG1-like zinc finger 2</bold></td>
<td align="center" valign="top"><bold>20q11</bold></td>
</tr>
<tr>
<td align="left" valign="top"><bold><italic>POFUT1</italic></bold></td>
<td align="right" valign="top"><bold>23509</bold></td>
<td align="left" valign="top"><bold>Protein O-fucosyltransferase 1</bold></td>
<td align="center" valign="top"><bold>20q11</bold></td>
</tr>
<tr>
<td align="left" valign="top"><italic>TTI1</italic></td>
<td align="right" valign="top">9675</td>
<td align="left" valign="top">TELO2 interacting protein 1</td>
<td align="center" valign="top">20q11</td>
</tr>
<tr>
<td align="left" valign="top"><italic>ASXL1</italic></td>
<td align="right" valign="top">171023</td>
<td align="left" valign="top">Additional sex combs-like 1</td>
<td align="center" valign="top">20q11</td>
</tr>
<tr>
<td align="left" valign="top"><italic>AAR2</italic></td>
<td align="right" valign="top">25980</td>
<td align="left" valign="top">AAR2 splicing factor homolog</td>
<td align="center" valign="top">20q11</td>
</tr>
<tr>
<td align="left" valign="top"><italic>PIGU</italic></td>
<td align="right" valign="top">128869</td>
<td align="left" valign="top">Phosphatidylinositol glycan anchor biosynthesis class U</td>
<td align="center" valign="top">20q11</td>
</tr>
<tr>
<td align="left" valign="top"><italic>STAU1</italic></td>
<td align="right" valign="top">6780</td>
<td align="left" valign="top">Staufen double-stranded RNA binding protein 1</td>
<td align="center" valign="top">20q11</td>
</tr>
<tr>
<td align="left" valign="top"><italic>DYNLRB1</italic></td>
<td align="right" valign="top">83658</td>
<td align="left" valign="top">Dynein light chain roadblock-type 1</td>
<td align="center" valign="top">20q11</td>
</tr>
<tr>
<td align="left" valign="top"><italic>NELFCD</italic></td>
<td align="right" valign="top">51497</td>
<td align="left" valign="top">Negative elongation factor complex member C/D</td>
<td align="center" valign="top">20q11</td>
</tr>
<tr>
<td align="left" valign="top"><italic>ZSWIM3</italic></td>
<td align="right" valign="top">140831</td>
<td align="left" valign="top">Zinc finger SWIM-type containing 3</td>
<td align="center" valign="top">20q11</td>
</tr>
<tr>
<td align="left" valign="top"><italic>MOCS3</italic></td>
<td align="right" valign="top">27304</td>
<td align="left" valign="top">Molybdenum cofactor synthesis 3</td>
<td align="center" valign="top">20q11</td>
</tr>
<tr>
<td align="left" valign="top"><italic>TM9SF4</italic></td>
<td align="right" valign="top">9777</td>
<td align="left" valign="top">Transmembrane 9 superfamily member 4</td>
<td align="center" valign="top">20q11</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn id="tfn1-or-0-0-7368"><p>The genes in common are indicated in bold print. <italic>PLAGL2</italic>, PLAG1 like zinc finger 2; <italic>POFUT1</italic>, protein O-fucosyltransferase 1.</p></fn>
</table-wrap-foot>
</table-wrap>
</floats-group>
</article>
