<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v3.0 20080202//EN" "journalpublishing3.dtd">
<article xml:lang="en" article-type="research-article" xmlns:xlink="http://www.w3.org/1999/xlink">
<?release-delay 0|0?>
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">IJO</journal-id>
<journal-title-group>
<journal-title>International Journal of Oncology</journal-title></journal-title-group>
<issn pub-type="ppub">1019-6439</issn>
<issn pub-type="epub">1791-2423</issn>
<publisher>
<publisher-name>D.A. Spandidos</publisher-name></publisher></journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3892/ijo.2015.3263</article-id>
<article-id pub-id-type="publisher-id">ijo-48-02-0690</article-id>
<article-categories>
<subj-group>
<subject>Articles</subject></subj-group></article-categories>
<title-group>
<article-title>Identifying molecular subtypes in human colon cancer using gene expression and DNA methylation microarray data</article-title></title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>REN</surname><given-names>ZHONGLU</given-names></name><xref rid="af1-ijo-48-02-0690" ref-type="aff">1</xref><xref rid="af2-ijo-48-02-0690" ref-type="aff">2</xref></contrib>
<contrib contrib-type="author">
<name><surname>WANG</surname><given-names>WENHUI</given-names></name><xref rid="af1-ijo-48-02-0690" ref-type="aff">1</xref><xref rid="af3-ijo-48-02-0690" ref-type="aff">3</xref></contrib>
<contrib contrib-type="author">
<name><surname>LI</surname><given-names>JINMING</given-names></name><xref rid="af1-ijo-48-02-0690" ref-type="aff">1</xref><xref rid="af2-ijo-48-02-0690" ref-type="aff">2</xref><xref ref-type="corresp" rid="c1-ijo-48-02-0690"/></contrib></contrib-group>
<aff id="af1-ijo-48-02-0690">
<label>1</label>Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou, Guangdong, P.R. China</aff>
<aff id="af2-ijo-48-02-0690">
<label>2</label>State Key Laboratory of Organ Failure Research, Division of Nephrology, Nanfang Hospital, Southern Medical University, Guangzhou, Guangdong, P.R. China</aff>
<aff id="af3-ijo-48-02-0690">
<label>3</label>Network Information Center, The Sixth Affiliated Hospital of Sun Yat-Sen University, Guangzhou, Guangdong, P.R. China</aff>
<author-notes>
<corresp id="c1-ijo-48-02-0690">Correspondence to: Professor Jinming Li, Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, No. 1838 Guangzhoudadaobei, Guangzhou, Guangdong 510515, P.R. China, E-mail: <email>jmli@smu.edu.cn</email></corresp></author-notes>
<pub-date pub-type="collection">
<month>2</month>
<year>2016</year></pub-date>
<pub-date pub-type="epub">
<day>24</day>
<month>11</month>
<year>2015</year></pub-date>
<volume>48</volume>
<issue>2</issue>
<fpage>690</fpage>
<lpage>702</lpage>
<history>
<date date-type="received">
<day>10</day>
<month>10</month>
<year>2015</year></date>
<date date-type="accepted">
<day>11</day>
<month>11</month>
<year>2015</year></date></history>
<permissions>
<copyright-statement>Copyright: &#x000A9; Ren et al.</copyright-statement>
<copyright-year>2016</copyright-year>
<license license-type="open-access">
<license-p>This is an open access article distributed under the terms of the <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by-nc-nd/4.0/">Creative Commons Attribution-NonCommercial-NoDerivs License</ext-link>, which permits use and distribution in any medium, provided the original work is properly cited, the use is non-commercial and no modifications or adaptations are made.</license-p></license></permissions>
<abstract>
<p>Identifying colon cancer subtypes based on molecular signatures may allow for a more rational, patient-specific approach to therapy in the future. Classifications using gene expression data have been attempted before, with little concordance between the different studies. In this study we aimed to uncover subtypes of colon cancer that have distinct biological characteristics and to identify a set of novel biomarkers that best reflect the clinical and/or biological characteristics of each subtype. Clustering analysis and discriminant analysis were used to discover subtypes at two molecular levels in 153 colon cancer samples from The Cancer Genome Atlas (TCGA) Data Portal. At the gene expression level, we identified two major subtypes, ECL1 (expression cluster 1) and ECL2 (expression cluster 2), and a list of signature genes. Owing to the heterogeneity of colon cancer, subtype ECL1 could be further subdivided into three nested subclasses, and HOTAIR was found to be upregulated in subclass 2. At the DNA methylation level, we uncovered three major subtypes, MCL1 (methylation cluster 1), MCL2 (methylation cluster 2) and MCL3 (methylation cluster 3). We found only three subtypes of CpG island methylator phenotype (CIMP) in colon cancer instead of the four subtypes described in previous reports, and we found insufficient evidence to subdivide MCL3 into two distinct subgroups.</p></abstract>
<kwd-group>
<kwd>clustering analysis</kwd>
<kwd>CpG island methylator phenotype</kwd>
<kwd>discriminant analysis</kwd>
<kwd>HOTAIR</kwd>
<kwd>subtypes</kwd></kwd-group></article-meta></front>
<body>
<sec sec-type="intro">
<title>Introduction</title>
<p>Colon cancer (~95&#x00025; of cases are adenocarcinomas) is a sub-site of colorectal cancer, but it differs from rectal cancer not only in location but also in postoperative treatment; hence, patients with colon cancer warrant unique considerations (<xref rid="b1-ijo-48-02-0690" ref-type="bibr">1</xref>). Overall, it is one of the most common cancers in developed countries.</p>
<p>Cancer arises as a consequence of the accumulation of epigenetic and genetic alterations (<xref rid="b2-ijo-48-02-0690" ref-type="bibr">2</xref>). At the genomic level, most investigators divide colon cancers biologically into those with microsatellite instability (MSI) and those that are microsatellite stable but chromosomally unstable (CIN) (<xref rid="b3-ijo-48-02-0690" ref-type="bibr">3</xref>). At the expression level, investigators with different purposes have identified many marker genes associated with prognosis and with different stages (<xref rid="b4-ijo-48-02-0690" ref-type="bibr">4</xref>). Wang <italic>et al</italic> used 74 Dukes' B stage colon cancer samples (31 that relapsed within 3 years and 43 that remained disease-free for more than 3 years) to derive a 23-gene signature that predicted recurrence in Dukes' B patients (<xref rid="b5-ijo-48-02-0690" ref-type="bibr">5</xref>). In 2006, Barrier <italic>et al</italic> investigated 50 patients with stage II colon cancer and identified 30 prognostic genes (<xref rid="b6-ijo-48-02-0690" ref-type="bibr">6</xref>). Oh <italic>et al</italic> applied unsupervised hierarchical clustering analysis to gene expression data from 177 patients with colorectal cancer to determine a prognostic gene expression signature (<xref rid="b7-ijo-48-02-0690" ref-type="bibr">7</xref>). They also found two independent groups associated with overall survival and disease-free survival. Notably, Slattery <italic>et al</italic> used microRNA microarray data from 100 patients and discovered a relationship between tumor location and MSI/CIMP subtypes (<xref rid="b8-ijo-48-02-0690" ref-type="bibr">8</xref>). A TCGA group study indicated that colorectal tumors have three subtypes at the gene expression level: MSI/CIMP, CIN and Invasive (<xref rid="b3-ijo-48-02-0690" ref-type="bibr">3</xref>).</p>
<p>The concept of the &#x02018;CpG island methylator phenotype&#x02019; (CIMP) was first proposed in 1999 by Toyota <italic>et al</italic> (<xref rid="b9-ijo-48-02-0690" ref-type="bibr">9</xref>). It is characterized by CpG island methylation in multiple regions (<xref rid="b2-ijo-48-02-0690" ref-type="bibr">2</xref>). Weisenberger <italic>et al</italic> reported four epigenetic subtypes and a list of related marker genes (<xref rid="b10-ijo-48-02-0690" ref-type="bibr">10</xref>,<xref rid="b11-ijo-48-02-0690" ref-type="bibr">11</xref>). The TCGA group also described four epigenetic subtypes, namely CIMP-H, CIMP-L, cluster 3 and cluster 4, where the union of cluster 3 and cluster 4 was named Non-CIMP (<xref rid="b3-ijo-48-02-0690" ref-type="bibr">3</xref>). In other studies, Shen <italic>et al</italic> (<xref rid="b12-ijo-48-02-0690" ref-type="bibr">12</xref>) and Yagi <italic>et al</italic> (<xref rid="b2-ijo-48-02-0690" ref-type="bibr">2</xref>) identified three epigenetic subtypes and several hypermethylated genes as markers.</p>
<p>Applying an unsupervised clustering approach to 153 colon cancer samples, we obtained results that differ from those of earlier reports. We identified two subgroups at the gene expression level and three subgroups at the DNA methylation level. Owing to the heterogeneity of the samples, we further examined nested subgroups within ECL1 and MCL3, and by examining the differences between these nested subgroups we arrived at our classification of colon cancer molecular subtypes. Our data suggested that CIN samples with upregulated HOTAIR have higher rates of metastasis and death.</p></sec>
<sec sec-type="materials|methods">
<title>Materials and methods</title>
<sec>
<title>Patients and microarray data</title>
<p>All clinical information and microarray data at the two molecular levels were downloaded from the TCGA Data Portal (<ext-link xlink:href="https://tcga-data.nci.nih.gov/tcga/tcgaHome2.jsp" ext-link-type="uri">https://tcga-data.nci.nih.gov/tcga/tcgaHome2.jsp</ext-link>). A total of 153 colon adenocarcinoma samples with both gene expression and DNA methylation microarray data had subtype labels from a previous study (<xref rid="b3-ijo-48-02-0690" ref-type="bibr">3</xref>). The gene expression and DNA methylation platforms were the Custom Agilent 244K Gene Expression Microarray (AMDID019760) and the Illumina Infinium DNA methylation array (HumanMethylation27 BeadChip), respectively. Level 3 data from the data portal were used in this study: the gene expression data were Lowess normalized and the ratio of the Cy5 channel to the Cy3 channel was log2 transformed to create gene expression values for 23199 probe-sets, resulting in 17814 genes available for further analysis; the DNA methylation data contained beta-value calculations, HUGO gene symbol, chromosome number and genomic coordinate for each targeted CpG site on the array. The array targets 27578 CpG sites located in proximity to the transcription start sites of 14475 consensus coding sequences.</p></sec>
<sec>
<title>Gene expression microarray analysis</title>
<p>We combined the gene expression data of the 153 samples into one file and imputed missing values using KNN imputation (<xref rid="b13-ijo-48-02-0690" ref-type="bibr">13</xref>). Informative genes for clustering analysis were selected using a standard deviation threshold of SD&gt;1 across all samples, which resulted in 1393 genes. To perform consensus clustering (<xref rid="b14-ijo-48-02-0690" ref-type="bibr">14</xref>) we used the K-means approach with average linkage to detect robust clusters, where the metric was 1 minus the Pearson's correlation coefficient. The procedure was run for 2000 iterations with a sub-sampling ratio of 0.8. To evaluate the heterogeneity of the subtypes we used silhouette width values to identify the most &#x02018;core&#x02019; members of each subtype (<xref rid="b15-ijo-48-02-0690" ref-type="bibr">15</xref>&#x02013;<xref rid="b17-ijo-48-02-0690" ref-type="bibr">17</xref>), and samples with a silhouette score &gt;0.5 were considered core samples. Significance analysis of microarrays (SAM) (<xref rid="b18-ijo-48-02-0690" ref-type="bibr">18</xref>) was applied to identify differentially expressed genes between subgroups, and prediction analysis of microarrays (PAM) (<xref rid="b19-ijo-48-02-0690" ref-type="bibr">19</xref>) was used to obtain marker genes and establish classifiers. The training set for PAM comprised a randomly selected 70&#x00025; of the 153 samples, and the testing set comprised the remaining 30&#x00025;. Gene Ontology analysis was performed for each subtype using the Database for Annotation, Visualization, and Integrated Discovery (DAVID) (<xref rid="b20-ijo-48-02-0690" ref-type="bibr">20</xref>,<xref rid="b21-ijo-48-02-0690" ref-type="bibr">21</xref>), and GeneMANIA (<xref rid="b22-ijo-48-02-0690" ref-type="bibr">22</xref>) was applied to find the co-expression network of the marker genes.</p></sec>
<sec>
<title>DNA methylation microarray analysis</title>
<p>After combining the data into one file, we removed probes containing any &#x02018;NA&#x02019; data points and probes designed for sequences on the X and Y chromosomes. We then applied a filtering step to obtain a final data matrix of 1491 probes that exhibited sufficiently variable methylation levels (standard deviation SD&gt;0.2 across all samples). The DNA methylation microarray data were &#x003B2;-values, which follow a &#x003B2;-distribution. To use the consensus clustering method, a data set must be transformed so that it follows a normal distribution. We used the transfer function (<xref rid="b23-ijo-48-02-0690" ref-type="bibr">23</xref>,<xref rid="b24-ijo-48-02-0690" ref-type="bibr">24</xref>) to transform the &#x003B2;-values into M-values, which are approximately normally distributed; this serves a purpose similar to that of the RPMM (<xref rid="b25-ijo-48-02-0690" ref-type="bibr">25</xref>) applied to &#x003B2;-values by Hinoue <italic>et al</italic> (<xref rid="b11-ijo-48-02-0690" ref-type="bibr">11</xref>). Because subtyping systems based on DNA methylation of colon cancer had already been reported in earlier studies, we performed PAM on all samples and did not set aside a testing set. DAVID and GeneMANIA were also used on the DNA methylation data.</p></sec>
<sec>
<title>Statistical analysis of clinical parameters</title>
<p>All data analyses were carried out in R (version 2.15.2 for Windows) (<xref rid="b26-ijo-48-02-0690" ref-type="bibr">26</xref>,<xref rid="b27-ijo-48-02-0690" ref-type="bibr">27</xref>). For the categorical variables in the clinical information table, such as gender, tumor subtype (from previous studies) and oncogene mutation (yes or no), Fisher's exact test was used to assess the significance of their association with the subtypes derived in this study. For age, we used ANOVA to assess differences among subtypes. The package <italic>ConsensusClusterPlus</italic> was used to perform unsupervised clustering analysis. The package <italic>SAMr</italic> was applied to identify the differentially expressed genes, and the package <italic>PAMr</italic> was applied to build the classifier and to determine the marker genes.</p></sec></sec>
<sec sec-type="results">
<title>Results</title>
<sec>
<title>Patient and tumor characteristics</title>
<p>Clinical and pathologic features of the patients and their tumors were summarized for further analysis. All 153 patients had information on age, gender, AJCC stage, vital status, tumor location and subtypes from earlier studies (<xref rid="tI-ijo-48-02-0690" ref-type="table">Table I</xref>).</p></sec>
<sec>
<title>Subgroups identified by gene expression data</title>
<p>Unsupervised <italic>K</italic>-means consensus clustering was used to uncover potential subgroups of colon cancer on the basis of the similarity of the expression values of the 1393 informative genes. We varied <italic>K</italic> from 2 to 6 in the core <italic>K</italic>-means clustering. Two subgroups could be identified when <italic>K</italic>=2, with cluster consensus values of 0.98 and 0.99 (<xref rid="f1-ijo-48-02-0690" ref-type="fig">Fig. 1A and D</xref>); the first subgroup was named ECL1, with 104 samples (68&#x00025;), and the second subgroup was named ECL2, with 49 samples (32&#x00025;). For <italic>K</italic>=2 to 4, the ECL2 subgroup remained stable and consistent (<xref rid="f1-ijo-48-02-0690" ref-type="fig">Fig. 1A&#x02013;C</xref>). The relationships between the two subgroups and their clinical characteristics are listed in <xref rid="tII-ijo-48-02-0690" ref-type="table">Table II</xref>.</p>
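<p>The consensus clustering idea described above can be illustrated with a short Python sketch: it repeatedly draws a subsample (ratio 0.8), clusters it with a small <italic>K</italic>-means, and records how often each pair of samples is grouped together. This is a minimal sketch under stated assumptions and not the <italic>ConsensusClusterPlus</italic> procedure used in this study: the function names, the toy data, the Euclidean distance (rather than 1 minus Pearson's correlation) and the farthest-first initialisation are all illustrative choices.</p>
<preformat><![CDATA[
```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, rng, n_iter=20):
    """Small Lloyd's k-means; returns one cluster label per point."""
    # Farthest-first initialisation (an assumption made for this sketch):
    # a random first centre, then the point farthest from all chosen centres.
    centres = [points[rng.randrange(len(points))]]
    while len(centres) < k:
        centres.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centres)))
    labels = [0] * len(points)
    for _ in range(n_iter):
        labels = [min(range(k), key=lambda j: sq_dist(p, centres[j])) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the previous centre if a cluster is empty
                centres[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def consensus_matrix(points, k, n_resamples=200, ratio=0.8, seed=0):
    """Fraction of resamples in which each pair of samples co-clusters."""
    rng = random.Random(seed)
    n = len(points)
    together = [[0] * n for _ in range(n)]
    sampled = [[0] * n for _ in range(n)]
    size = max(k, int(ratio * n))
    for _ in range(n_resamples):
        idx = rng.sample(range(n), size)
        labels = kmeans([points[i] for i in idx], k, rng)
        for a, i in enumerate(idx):
            for b, j in enumerate(idx):
                sampled[i][j] += 1
                if labels[a] == labels[b]:
                    together[i][j] += 1
    return [[together[i][j] / sampled[i][j] if sampled[i][j] else 0.0
             for j in range(n)] for i in range(n)]


# Hypothetical toy data: two well-separated groups of four samples each.
toy = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.3, 0.2],
       [5.0, 5.1], [5.2, 4.9], [4.9, 5.3], [5.1, 5.0]]
cm = consensus_matrix(toy, k=2)
print(cm[0][1], cm[0][4])
```
]]></preformat>
<p>In this sketch, consensus values close to 1 within a group and close to 0 between groups correspond to the clean block structure that indicates a stable choice of <italic>K</italic>.</p>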
<p>In ECL2, the age of onset (73.3&#x000B1;11.47) was significantly higher than in ECL1 (P&lt;0.049, ANOVA). The majority of ECL2 samples were right-sided tumors. All of the MSI-H samples were found in ECL2, and all of the CIN samples were found in ECL1. These two subgroups showed no significant difference in AJCC stage or history of polyps. Mutations of <italic>KRAS, BRAF</italic> and <italic>TP53</italic> have been investigated in many studies; we found that all samples with a <italic>BRAF</italic> mutation were in ECL2 and most samples with a <italic>TP53</italic> mutation were in ECL1 (<xref rid="tIII-ijo-48-02-0690" ref-type="table">Table III</xref>).</p>
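<p>Associations between a binary clinical variable (for example, mutation present or absent) and two subgroups can be assessed with Fisher's exact test, as described in Materials and methods. The following Python sketch computes the two-sided P-value for a 2&#x000D7;2 table from the hypergeometric distribution. It is an illustrative sketch only: the function name and the counts in the example are hypothetical and are not taken from the tables of this study, in which the tests were run in R.</p>
<preformat><![CDATA[
```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact P-value for the table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell,
        # given fixed row and column totals.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables at least as extreme (no more probable) than observed;
    # the small tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


# Hypothetical counts: rows = subgroup, columns = mutated / wild type.
p_value = fisher_exact_2x2(3, 0, 0, 3)
print(round(p_value, 3))
```
]]></preformat>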
<p>Nearly 62&#x00025; of the samples in ECL1 were left-sided tumors. Most ECL1 samples were of MSS status, and the majority of the samples of the Invasive subtype (<xref rid="b3-ijo-48-02-0690" ref-type="bibr">3</xref>) were in ECL1. In comparison with the subtypes reported in previous studies, ECL1 contained both the CIN and the Invasive subtypes (<xref rid="b3-ijo-48-02-0690" ref-type="bibr">3</xref>); we therefore examined the heterogeneity of this subgroup (<xref rid="f2-ijo-48-02-0690" ref-type="fig">Fig. 2A</xref>).</p>
<p>We carried out unsupervised clustering analysis on the ECL1 samples alone with <italic>K</italic>=2 to 6. When <italic>K</italic>=3, we discovered three distinct subclasses with very clear boundaries (<xref rid="f3-ijo-48-02-0690" ref-type="fig">Fig. 3B</xref>). A total of 87 samples with a silhouette score &gt;0.5 were considered core samples and retained: 29 samples in subclass 1, 30 samples in subclass 2 and 28 samples in subclass 3. There were 18 CIN samples in subclass 1, 27 CIN samples in subclass 2 and 23 Invasive samples in subclass 3; <xref rid="f2-ijo-48-02-0690" ref-type="fig">Fig. 2B</xref> shows the relationship between the subtypes reported earlier (<xref rid="b3-ijo-48-02-0690" ref-type="bibr">3</xref>) and the subclasses derived from ECL1 (P&lt;1.065e-10, Fisher's exact test).</p>
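<p>The selection of core samples by silhouette width can be sketched as follows. For each sample, the silhouette width is (b &#x02212; a)/max(a, b), where a is the mean distance to the other members of its own cluster and b is the smallest mean distance to the members of any other cluster; samples above the 0.5 cut-off are retained. The Python code below is a minimal sketch with hypothetical function names and toy data, and it uses Euclidean distance as a simplifying assumption.</p>
<preformat><![CDATA[
```python
from math import dist


def silhouette_widths(points, labels):
    """Silhouette width s(i) = (b - a) / max(a, b) for every sample."""
    clusters = sorted(set(labels))
    widths = []
    for i, (p, lab) in enumerate(zip(points, labels)):
        # Mean distance from sample i to each cluster, excluding itself.
        mean_d = {}
        for c in clusters:
            others = [q for j, q in enumerate(points) if labels[j] == c and j != i]
            mean_d[c] = sum(dist(p, q) for q in others) / len(others) if others else None
        a = mean_d[lab]
        rivals = [d for c, d in mean_d.items() if c != lab and d is not None]
        if a is None or not rivals:
            widths.append(0.0)  # singleton or single cluster: 0 by convention
            continue
        b = min(rivals)
        widths.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return widths


def core_samples(points, labels, cutoff=0.5):
    """Indices of samples whose silhouette width exceeds the cut-off."""
    return [i for i, s in enumerate(silhouette_widths(points, labels)) if s > cutoff]


# Hypothetical toy data: sample 3 lies between the two clusters.
toy_points = [[0.0], [0.2], [0.4], [2.4], [5.0], [5.2]]
toy_labels = [0, 0, 0, 1, 1, 1]
print(core_samples(toy_points, toy_labels))  # [0, 1, 2, 4, 5]
```
]]></preformat>
<p>In the toy example the ambiguous sample has a negative silhouette width and is dropped, which mirrors how non-core samples were excluded before the subclasses were compared.</p>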
<p>Two subclasses were associated with the CIN subtype, and because of the heterogeneity of the ECL1 subgroup we investigated the differences between these two CIN groups. The CIN samples from the two subclasses were compared using SAM with the Wilcoxon rank sum test. A total of 250 differentially expressed genes were found at a 2-fold change threshold, and only 6 genes were upregulated in subclass 2, namely, <italic>SLC25A21, POPDC3, GREG2, HOTAIR, GYPB</italic> and <italic>SLC35F4</italic>. The proportions of both metastatic cases and deaths in subclass 2 were roughly twice those in subclass 1 (<xref rid="tIV-ijo-48-02-0690" ref-type="table">Table IV</xref>).</p>
<p>At the top level we identified two subgroups in colon cancer: ECL1 had relatively high heterogeneity and was associated with the CIN and Invasive subtypes derived from earlier studies, whereas ECL2 showed high homogeneity. At the secondary level, three subclasses were derived from ECL1, of which subclasses 1 and 2 were associated with the CIN subtype and subclass 3 was associated with the Invasive subtype.</p></sec>
<sec>
<title>Marker genes and their biological characteristics</title>
<p>PAM analysis was carried out to identify marker genes that could discriminate the two top-level subgroups. At &#x00394;=4.16, where the overall error rate reached its minimum of 0.019, 256 genes were selected from the 107 training samples. The testing set was used for independent validation; only 2 samples were misclassified, giving an overall error rate of 0.043.</p>
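<p>The sample counts and the test error quoted above can be checked with a short Python sketch. It draws a random 70/30 split of 153 samples and computes the error rate for 2 misclassified test samples; it also shows the soft-thresholding step by which the nearest shrunken centroid method (PAM) shrinks standardized centroid differences by &#x00394;, so that genes whose differences do not exceed &#x00394; drop out of the classifier. The function names, the seed and the example values passed to the thresholding function are assumptions made for illustration; the actual analysis used the <italic>PAMr</italic> package.</p>
<preformat><![CDATA[
```python
import random


def split_samples(sample_ids, train_fraction=0.7, seed=0):
    """Randomly split sample identifiers into training and testing sets."""
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)
    n_train = round(train_fraction * len(ids))
    return ids[:n_train], ids[n_train:]


def soft_threshold(d, delta):
    """Shrink a standardized centroid difference d toward zero by delta."""
    magnitude = max(abs(d) - delta, 0.0)
    return magnitude if d >= 0 else -magnitude


# 153 samples, identified here simply by their index (hypothetical IDs).
train, test = split_samples(range(153))
test_error = 2 / len(test)  # 2 misclassified samples in the testing set
print(len(train), len(test), round(test_error, 3))

# A gene survives shrinkage only if its difference exceeds delta in magnitude.
print(soft_threshold(5.0, 4.16), soft_threshold(-3.0, 4.16))
```
]]></preformat>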
<p>Of the 256 marker genes, 137 were upregulated in ECL2; among them, <italic>SPP1</italic> and <italic>POSTN</italic> have been reported in earlier studies to be associated with metastasis and poor prognosis in colorectal cancer. DAVID analysis showed that these 137 genes were enriched in the immune response, defense response, response to wounding, inflammatory response and carbohydrate binding GO terms. Furthermore, GSEA (<xref rid="b28-ijo-48-02-0690" ref-type="bibr">28</xref>) of these genes showed that they were upregulated in advanced gastric cancer and the basal subtype of breast cancer. The remaining 119 genes were upregulated in ECL1, and they were enriched in the <italic>ERBB</italic> receptor signaling network and the &#x003B2;-oxidation of pristanoyl-CoA pathway. Finally, we plotted a heat map of the 256 marker genes for all 153 cancer samples (<xref rid="f4-ijo-48-02-0690" ref-type="fig">Fig. 4</xref>), with the samples reordered by hierarchical clustering; only 5 samples were assigned to the incorrect group. This suggested that these genes could serve as feature genes for the subtype classification.</p></sec>
<sec>
<title>Subgroups identified by DNA methylation data</title>
<p>To investigate the subtypes using DNA methylation data, we applied the same method to the transformed methylation array data. The clustering reached the highest consensus when <italic>K</italic>=3 or 4 (<xref rid="f5-ijo-48-02-0690" ref-type="fig">Fig. 5A and B</xref>). For <italic>K</italic>=3, we named the subgroups MCL1, with 57 samples (37&#x00025;), MCL2, with 40 samples (26&#x00025;), and MCL3, with 56 samples (37&#x00025;). The gender proportions differed significantly among the three subgroups (P&lt;0.029, Fisher's exact test), as did the age distribution (P&lt;2.24e-3, ANOVA, <xref rid="tV-ijo-48-02-0690" ref-type="table">Table V</xref>).</p>
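<p>The preparation of the methylation data before clustering can be sketched in Python as follows: probes are first filtered by the standard deviation of their &#x003B2;-values (SD&gt;0.2), and the retained &#x003B2;-values are then converted to M-values with the widely used logit relationship M = log2(&#x003B2;/(1 &#x02212; &#x003B2;)). This is a hedged sketch: the function names and toy values are hypothetical, the small offset that keeps &#x003B2; away from exactly 0 or 1 is an assumption added to avoid infinite values, and the exact form of the transfer function used in this study is the one given in the cited references.</p>
<preformat><![CDATA[
```python
from math import log2
from statistics import stdev


def beta_to_m(beta, eps=1e-6):
    """Convert a methylation beta-value in [0, 1] to an M-value."""
    # Clip away from 0 and 1 (an assumption) so the logarithm stays finite.
    beta = min(max(beta, eps), 1.0 - eps)
    return log2(beta / (1.0 - beta))


def variable_probes(beta_rows, sd_cutoff=0.2):
    """Keep probes whose beta-values vary enough across samples."""
    return {probe: values for probe, values in beta_rows.items()
            if stdev(values) > sd_cutoff}


# Hypothetical beta-values for two probes across four samples.
toy_betas = {
    "probe_A": [0.1, 0.9, 0.1, 0.9],      # highly variable: kept
    "probe_B": [0.50, 0.52, 0.48, 0.50],  # nearly constant: dropped
}
kept = variable_probes(toy_betas)
m_values = {probe: [beta_to_m(b) for b in values] for probe, values in kept.items()}
print(list(m_values))
```
]]></preformat>
<p>A &#x003B2;-value of 0.5 maps to an M-value of 0, and the transformation stretches values near 0 and 1, which is why M-values are better suited to methods that assume roughly normal data.</p>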
<p>The majority of the samples in MCL1 were left-sided tumors (~79&#x00025;); this subgroup had the lowest mean age, was predominantly of <italic>MSS</italic> status and had no <italic>BRAF</italic> mutations. More than 50&#x00025; of the samples in MCL1 had a <italic>TP53</italic> mutation and a few had a <italic>KRAS</italic> mutation. Most samples in MCL2 were from male patients and almost all were right-sided tumors (~93&#x00025;); this subgroup had the highest mean age, and more than 50&#x00025; of its samples were of MSI-H status. All samples with a <italic>BRAF</italic> mutation were in MCL2, and a few samples in this subgroup had a <italic>KRAS</italic> mutation and a <italic>TP53</italic> mutation. More than 50&#x00025; of the samples in MCL3 were from female patients, were right-sided tumors and were of MSS status; none had a <italic>BRAF</italic> mutation, and nearly 50&#x00025; of the samples had a <italic>KRAS</italic> mutation and a <italic>TP53</italic> mutation (<xref rid="tVI-ijo-48-02-0690" ref-type="table">Table VI</xref>).</p>
<p>In comparison with the results of TCGA and Hinoue <italic>et al</italic> (<xref rid="b3-ijo-48-02-0690" ref-type="bibr">3</xref>,<xref rid="b11-ijo-48-02-0690" ref-type="bibr">11</xref>) (<xref rid="f6-ijo-48-02-0690" ref-type="fig">Fig. 6A</xref>), most samples in cluster 4 fell into MCL1, all of the CIMP-H samples were in MCL2, and the majority of the samples in CIMP-L and cluster 3 fell into MCL3 (P&lt;2.2e-16, Fisher's exact test).</p>
<p>The characteristics of MCL3 were quite similar to those of CIMP2. CIMP2 showed more heterogeneity than the other two subtypes (<xref rid="b12-ijo-48-02-0690" ref-type="bibr">12</xref>); hence, we further examined the subdivision of MCL3. When <italic>K</italic>=4, the four subgroups generated largely overlapped with the previous classification (<xref rid="f6-ijo-48-02-0690" ref-type="fig">Fig. 6B</xref>), but the cluster consensus values were lower than those for <italic>K</italic>=3 (<xref rid="f5-ijo-48-02-0690" ref-type="fig">Fig. 5D</xref>). To judge whether CIMP-L and cluster 3 were distinct subtypes of colon cancer, we examined the data in <xref rid="tI-ijo-48-02-0690" ref-type="table">Table I</xref> of Hinoue <italic>et al</italic> (<xref rid="b11-ijo-48-02-0690" ref-type="bibr">11</xref>), and we found that tumor location and the frequency of TP53 mutation differed significantly between the two clusters.</p></sec>
<sec>
<title>DNA methylation gene marker panels and their biological characteristics</title>
<p>PAM analysis was applied to these three subgroups to identify DNA methylation marker panels that could discriminate the subgroups. First, MCL2 (CIMP-H) was compared with the combination of MCL1 and MCL3 (Non-CIMP-H); at &#x00394;=11.4, 52 probes corresponding to 47 genes were selected as the first panel, with an overall error rate of 0.052. <italic>DSC3, LOX, RUNX3, SLC30A2</italic> and <italic>TLR2</italic> each harbored two hypermethylated sites in the samples from the MCL2 subgroup. Second, with MCL2 excluded, MCL1 (cluster 4) was compared with MCL3; at &#x00394;=6.99, with an overall error rate of 0.079, 39 probes corresponding to 33 genes were selected as the second panel. <italic>ELMO1, JAKMIP1, NCAM1</italic> and <italic>NDRG4</italic> each harbored two hypermethylated sites in the samples from MCL3.</p>
<p>Combining two marker panels, there were 80 methylation genes. DAVID analysis on these genes showed that they were enriched in cell fate commitment, neuron differentiation, extracellular matrix, and sequence-specific DNA binding GO terms. We also used GeneMANIA to build the co-expression network of these 80 genes, and it turned out that the Wnt receptor signaling pathway and the digestive system development pathway were involved in the network.</p></sec>
<sec>
<title>Overlapping of subgroups derived from two molecular levels</title>
<p>We performed hierarchical clustering on all 153 samples using the genes in the two panels and recovered the three subtypes in the DNA methylation data. The ECL1 and ECL2 labels of each sample are also shown. Almost all samples in MCL2 overlapped with those in ECL2; moreover, ECL1 comprised MCL1 and MCL3 (<xref rid="f7-ijo-48-02-0690" ref-type="fig">Fig. 7</xref>).</p></sec></sec>
<sec sec-type="discussion">
<title>Discussion</title>
<p>Two main subtypes were identified at the gene expression level and three main subtypes were found at the DNA methylation level (<xref rid="f8-ijo-48-02-0690" ref-type="fig">Fig. 8</xref>). Among the subtypes found in the gene expression data, ECL2 was associated with MSI-H status, <italic>BRAF</italic> mutation, higher age and right-sided tumor location; the samples from this subtype showed higher homogeneity than the samples in ECL1. Notably, ECL1 could be further divided into three subclasses: subclasses 1 and 2 were both related to CIN, and subclass 3 was related to the Invasive type. We found that 6 genes, including <italic>HOTAIR</italic>, were upregulated in subclass 2. <italic>HOTAIR</italic> is an lncRNA that plays a key role in the initiation and progression of different types of cancer (<xref rid="b29-ijo-48-02-0690" ref-type="bibr">29</xref>). Patients with high <italic>HOTAIR</italic> expression had higher recurrence rates and reduced metastasis-free and overall survival compared with patients with low <italic>HOTAIR</italic> expression (<xref rid="b30-ijo-48-02-0690" ref-type="bibr">30</xref>). Hence, <italic>HOTAIR</italic> might be one of the most important marker genes contributing to the difference in metastasis rate and death rate between the two CIN-related subclasses, which supports the finding of Kogo <italic>et al</italic> (<xref rid="b31-ijo-48-02-0690" ref-type="bibr">31</xref>). In addition, these results suggested that samples with CIN status might be refined into two different subclasses.</p>
<p>A list of genes discriminating the two subtypes (ECL1 and ECL2) was also determined, and these genes were involved in several important pathways of colon cancer pathogenesis, such as the chemokine receptor binding chemokine pathway and the <italic>ERBB</italic> receptor signaling network. The chemokine receptor binding chemokine pathway is upstream of the <italic>MAPK</italic> signaling pathway and the <italic>JAK-STAT</italic> signaling pathway. Alterations in these genes may affect the activity of these pathways and thereby contribute to the different subtypes of colon cancer.</p>
<p>Among the subtypes found at the DNA methylation level, MCL1 was associated with cluster 4, which contained mostly sigmoid colon samples (68&#x00025;). The tumors in cluster 4 were significantly enriched in the rectum compared with the other groups (<xref rid="b11-ijo-48-02-0690" ref-type="bibr">11</xref>), whereas all of the samples we used were colon samples. This may be because the sigmoid colon and the rectum are anatomically adjacent. The characteristics of the samples belonging to MCL1 are similar to those of the LME subtype derived by Yagi <italic>et al</italic> (<xref rid="b2-ijo-48-02-0690" ref-type="bibr">2</xref>) and the CIMP-negative subtype of Shen <italic>et al</italic> (<xref rid="b12-ijo-48-02-0690" ref-type="bibr">12</xref>). Although the frequencies of MSI status and of <italic>TP53</italic> and <italic>KRAS</italic> mutation were lower than those reported in previous studies, this subgroup could still be taken as a specific subtype of colon cancer. MCL2 contained all samples with CIMP-H status and all samples with <italic>BRAF</italic> mutation; it was characterized by right-sided tumors and the highest mean age, and more than 50&#x00025; of its samples were of MSI-H status. This was quite similar to previously reported subtypes such as CIMP1 (<xref rid="b12-ijo-48-02-0690" ref-type="bibr">12</xref>), HME (<xref rid="b2-ijo-48-02-0690" ref-type="bibr">2</xref>) and CIMP-H (<xref rid="b11-ijo-48-02-0690" ref-type="bibr">11</xref>). Of note, male patients were more frequent than female patients in MCL2 (62.5&#x00025;), whereas female patients were more frequent than male patients in MCL3 (64.3&#x00025;). This suggested that the methylation subtype of colon cancer was to some extent related to gender (P&lt;0.029, Fisher's exact test). We also found that samples in MCL2 exhibited high homogeneity.</p>
<p>MCL3 comprised CIMP-L and cluster 3 (<xref rid="b11-ijo-48-02-0690" ref-type="bibr">11</xref>). MCL3, which was the most heterogeneous subgroup, was similar to CIMP2 (<xref rid="b12-ijo-48-02-0690" ref-type="bibr">12</xref>) and IME (<xref rid="b2-ijo-48-02-0690" ref-type="bibr">2</xref>); the frequency of <italic>KRAS</italic> mutation was lower than that in CIMP2 (92&#x00025;), but it coincided with that in CIMP-L. We attempted to subdivide MCL3 but could not find sufficient evidence to support cluster 3 as a specific epigenetic subtype of colon cancer, except that tumor location and the frequency of <italic>TP53</italic> mutation differed significantly between the two clusters. More experiments and analyses should be carried out to resolve this.</p>
<p>The genes in the first marker panel were hypermethylated in MCL2, and the genes in the second panel were hypermethylated in MCL3. Almost all of the classic markers (<xref rid="b32-ijo-48-02-0690" ref-type="bibr">32</xref>), such as <italic>RUNX3, LOX, CACNA1G</italic> and <italic>MYOCD</italic>, were included in the first panel, and <italic>SLC30A2</italic> and <italic>NEUROG2</italic> were also found in this panel. <italic>NEUROG1, PRICKLE1</italic> and <italic>SOX5</italic> were found in the second panel. Furthermore, our data suggested that MCL2 overlapped with ECL2, and that ECL1 comprised MCL1 and MCL3.</p>
<p>In this study, we focused only on the number of subtypes at different molecular levels of colon cancer and did not investigate the molecular mechanisms that give rise to these subtypes. Our findings might be helpful in understanding the subtypes of colon cancer at different molecular levels and may provide a useful resource with clinical implications for further studies.</p></sec></body>
<back>
<ack>
<title>Acknowledgements</title>
<p>This study was supported by the National Natural Science Foundation of China (grant no. 31371290), and a Start-up Grant from Guangdong Province (YCJ-2011-430) and Southern Medical University and Guangdong Province.</p></ack>
<ref-list>
<title>References</title>
<ref id="b1-ijo-48-02-0690"><label>1</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Minsky</surname><given-names>BD</given-names></name></person-group><article-title>Unique considerations in the patient with rectal cancer</article-title><source>Semin Oncol</source><volume>38</volume><fpage>542</fpage><lpage>551</lpage><year>2011</year><pub-id pub-id-type="doi">10.1053/j.seminoncol.2011.05.008</pub-id><pub-id pub-id-type="pmid">21810513</pub-id></element-citation></ref>
<ref id="b2-ijo-48-02-0690"><label>2</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Yagi</surname><given-names>K</given-names></name><name><surname>Akagi</surname><given-names>K</given-names></name><name><surname>Hayashi</surname><given-names>H</given-names></name><name><surname>Nagae</surname><given-names>G</given-names></name><name><surname>Tsuji</surname><given-names>S</given-names></name><name><surname>Isagawa</surname><given-names>T</given-names></name><name><surname>Midorikawa</surname><given-names>Y</given-names></name><name><surname>Nishimura</surname><given-names>Y</given-names></name><name><surname>Sakamoto</surname><given-names>H</given-names></name><name><surname>Seto</surname><given-names>Y</given-names></name><etal/></person-group><article-title>Three DNA methylation epigenotypes in human colorectal cancer</article-title><source>Clin Cancer Res</source><volume>16</volume><fpage>21</fpage><lpage>33</lpage><year>2010</year><pub-id pub-id-type="doi">10.1158/1078-0432.CCR-09-2006</pub-id></element-citation></ref>
<ref id="b3-ijo-48-02-0690"><label>3</label><element-citation publication-type="journal"><collab>Cancer Genome Atlas Network</collab><article-title>Comprehensive molecular characterization of human colon and rectal cancer</article-title><source>Nature</source><volume>487</volume><fpage>330</fpage><lpage>337</lpage><year>2012</year><pub-id pub-id-type="doi">10.1038/nature11252</pub-id><pub-id pub-id-type="pmid">22810696</pub-id><pub-id pub-id-type="pmcid">3401966</pub-id></element-citation></ref>
<ref id="b4-ijo-48-02-0690"><label>4</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Walther</surname><given-names>A</given-names></name><name><surname>Johnstone</surname><given-names>E</given-names></name><name><surname>Swanton</surname><given-names>C</given-names></name><name><surname>Midgley</surname><given-names>R</given-names></name><name><surname>Tomlinson</surname><given-names>I</given-names></name><name><surname>Kerr</surname><given-names>D</given-names></name></person-group><article-title>Genetic prognostic and predictive markers in colorectal cancer</article-title><source>Nat Rev Cancer</source><volume>9</volume><fpage>489</fpage><lpage>499</lpage><year>2009</year><pub-id pub-id-type="doi">10.1038/nrc2645</pub-id><pub-id pub-id-type="pmid">19536109</pub-id></element-citation></ref>
<ref id="b5-ijo-48-02-0690"><label>5</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Wang</surname><given-names>Y</given-names></name><name><surname>Jatkoe</surname><given-names>T</given-names></name><name><surname>Zhang</surname><given-names>Y</given-names></name><name><surname>Mutch</surname><given-names>MG</given-names></name><name><surname>Talantov</surname><given-names>D</given-names></name><name><surname>Jiang</surname><given-names>J</given-names></name><name><surname>McLeod</surname><given-names>HL</given-names></name><name><surname>Atkins</surname><given-names>D</given-names></name></person-group><article-title>Gene expression profiles and molecular markers to predict recurrence of Dukes' B colon cancer</article-title><source>J Clin Oncol</source><volume>22</volume><fpage>1564</fpage><lpage>1571</lpage><year>2004</year><pub-id pub-id-type="doi">10.1200/JCO.2004.08.186</pub-id><pub-id pub-id-type="pmid">15051756</pub-id></element-citation></ref>
<ref id="b6-ijo-48-02-0690"><label>6</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Barrier</surname><given-names>A</given-names></name><name><surname>Boelle</surname><given-names>PY</given-names></name><name><surname>Roser</surname><given-names>F</given-names></name><name><surname>Gregg</surname><given-names>J</given-names></name><name><surname>Tse</surname><given-names>C</given-names></name><name><surname>Brault</surname><given-names>D</given-names></name><name><surname>Lacaine</surname><given-names>F</given-names></name><name><surname>Houry</surname><given-names>S</given-names></name><name><surname>Huguier</surname><given-names>M</given-names></name><name><surname>Franc</surname><given-names>B</given-names></name><etal/></person-group><article-title>Stage II colon cancer prognosis prediction by tumor gene expression profiling</article-title><source>J Clin Oncol</source><volume>24</volume><fpage>4685</fpage><lpage>4691</lpage><year>2006</year><pub-id pub-id-type="doi">10.1200/JCO.2005.05.0229</pub-id><pub-id pub-id-type="pmid">16966692</pub-id></element-citation></ref>
<ref id="b7-ijo-48-02-0690"><label>7</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Oh</surname><given-names>SC</given-names></name><name><surname>Park</surname><given-names>YY</given-names></name><name><surname>Park</surname><given-names>ES</given-names></name><name><surname>Lim</surname><given-names>JY</given-names></name><name><surname>Kim</surname><given-names>SM</given-names></name><name><surname>Kim</surname><given-names>SB</given-names></name><name><surname>Kim</surname><given-names>J</given-names></name><name><surname>Kim</surname><given-names>SC</given-names></name><name><surname>Chu</surname><given-names>IS</given-names></name><name><surname>Smith</surname><given-names>JJ</given-names></name><etal/></person-group><article-title>Prognostic gene expression signature associated with two molecularly distinct subtypes of colorectal cancer</article-title><source>Gut</source><volume>61</volume><fpage>1291</fpage><lpage>1298</lpage><year>2012</year><pub-id pub-id-type="doi">10.1136/gutjnl-2011-300812</pub-id><pub-id pub-id-type="pmcid">3419333</pub-id></element-citation></ref>
<ref id="b8-ijo-48-02-0690"><label>8</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Slattery</surname><given-names>ML</given-names></name><name><surname>Wolff</surname><given-names>E</given-names></name><name><surname>Hoffman</surname><given-names>MD</given-names></name><name><surname>Pellatt</surname><given-names>DF</given-names></name><name><surname>Milash</surname><given-names>B</given-names></name><name><surname>Wolff</surname><given-names>RK</given-names></name></person-group><article-title>MicroRNAs and colon and rectal cancer: differential expression by tumor location and subtype</article-title><source>Genes Chromosomes Cancer</source><volume>50</volume><fpage>196</fpage><lpage>206</lpage><year>2011</year><pub-id pub-id-type="doi">10.1002/gcc.20844</pub-id><pub-id pub-id-type="pmid">21213373</pub-id><pub-id pub-id-type="pmcid">3370677</pub-id></element-citation></ref>
<ref id="b9-ijo-48-02-0690"><label>9</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Toyota</surname><given-names>M</given-names></name><name><surname>Ahuja</surname><given-names>N</given-names></name><name><surname>Ohe-Toyota</surname><given-names>M</given-names></name><name><surname>Herman</surname><given-names>JG</given-names></name><name><surname>Baylin</surname><given-names>SB</given-names></name><name><surname>Issa</surname><given-names>JP</given-names></name></person-group><article-title>CpG island methylator phenotype in colorectal cancer</article-title><source>Proc Natl Acad Sci USA</source><volume>96</volume><fpage>8681</fpage><lpage>8686</lpage><year>1999</year><pub-id pub-id-type="doi">10.1073/pnas.96.15.8681</pub-id><pub-id pub-id-type="pmid">10411935</pub-id><pub-id pub-id-type="pmcid">17576</pub-id></element-citation></ref>
<ref id="b10-ijo-48-02-0690"><label>10</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Weisenberger</surname><given-names>DJ</given-names></name><name><surname>Siegmund</surname><given-names>KD</given-names></name><name><surname>Campan</surname><given-names>M</given-names></name><name><surname>Young</surname><given-names>J</given-names></name><name><surname>Long</surname><given-names>TI</given-names></name><name><surname>Faasse</surname><given-names>MA</given-names></name><name><surname>Kang</surname><given-names>GH</given-names></name><name><surname>Widschwendter</surname><given-names>M</given-names></name><name><surname>Weener</surname><given-names>D</given-names></name><name><surname>Buchanan</surname><given-names>D</given-names></name><etal/></person-group><article-title>CpG island methylator phenotype underlies sporadic microsatellite instability and is tightly associated with BRAF mutation in colorectal cancer</article-title><source>Nat Genet</source><volume>38</volume><fpage>787</fpage><lpage>793</lpage><year>2006</year><pub-id pub-id-type="doi">10.1038/ng1834</pub-id><pub-id pub-id-type="pmid">16804544</pub-id></element-citation></ref>
<ref id="b11-ijo-48-02-0690"><label>11</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Hinoue</surname><given-names>T</given-names></name><name><surname>Weisenberger</surname><given-names>DJ</given-names></name><name><surname>Lange</surname><given-names>CP</given-names></name><name><surname>Shen</surname><given-names>H</given-names></name><name><surname>Byun</surname><given-names>HM</given-names></name><name><surname>Van De Berg</surname><given-names>D</given-names></name><name><surname>Malik</surname><given-names>S</given-names></name><name><surname>Pan</surname><given-names>F</given-names></name><name><surname>Noushmehr</surname><given-names>H</given-names></name><name><surname>van Dijk</surname><given-names>CM</given-names></name><etal/></person-group><article-title>Genome-scale analysis of aberrant DNA methylation in colorectal cancer</article-title><source>Genome Res</source><volume>22</volume><fpage>271</fpage><lpage>282</lpage><year>2012</year><pub-id pub-id-type="doi">10.1101/gr.117523.110</pub-id><pub-id pub-id-type="pmcid">3266034</pub-id></element-citation></ref>
<ref id="b12-ijo-48-02-0690"><label>12</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Shen</surname><given-names>L</given-names></name><name><surname>Toyota</surname><given-names>M</given-names></name><name><surname>Kondo</surname><given-names>Y</given-names></name><name><surname>Lin</surname><given-names>E</given-names></name><name><surname>Zhang</surname><given-names>L</given-names></name><name><surname>Guo</surname><given-names>Y</given-names></name><name><surname>Hernandez</surname><given-names>NS</given-names></name><name><surname>Chen</surname><given-names>X</given-names></name><name><surname>Ahmed</surname><given-names>S</given-names></name><name><surname>Konishi</surname><given-names>K</given-names></name><etal/></person-group><article-title>Integrated genetic and epigenetic analysis identifies three different subclasses of colon cancer</article-title><source>Proc Natl Acad Sci USA</source><volume>104</volume><fpage>18654</fpage><lpage>18659</lpage><year>2007</year><pub-id pub-id-type="doi">10.1073/pnas.0704652104</pub-id><pub-id pub-id-type="pmid">18003927</pub-id><pub-id pub-id-type="pmcid">2141832</pub-id></element-citation></ref>
<ref id="b13-ijo-48-02-0690"><label>13</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Troyanskaya</surname><given-names>O</given-names></name><name><surname>Cantor</surname><given-names>M</given-names></name><name><surname>Sherlock</surname><given-names>G</given-names></name><name><surname>Brown</surname><given-names>P</given-names></name><name><surname>Hastie</surname><given-names>T</given-names></name><name><surname>Tibshirani</surname><given-names>R</given-names></name><name><surname>Botstein</surname><given-names>D</given-names></name><name><surname>Altman</surname><given-names>RB</given-names></name></person-group><article-title>Missing value estimation methods for DNA microarrays</article-title><source>Bioinformatics</source><volume>17</volume><fpage>520</fpage><lpage>525</lpage><year>2001</year><pub-id pub-id-type="doi">10.1093/bioinformatics/17.6.520</pub-id><pub-id pub-id-type="pmid">11395428</pub-id></element-citation></ref>
<ref id="b14-ijo-48-02-0690"><label>14</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Monti</surname><given-names>S</given-names></name><name><surname>Tamayo</surname><given-names>P</given-names></name><name><surname>Mesirov</surname><given-names>J</given-names></name><name><surname>Golub</surname><given-names>T</given-names></name></person-group><article-title>Consensus Clustering: A resampling-based method for class discovery and visualization of gene expression microarray data</article-title><source>Mach Learn</source><volume>52</volume><fpage>91</fpage><lpage>118</lpage><year>2003</year><pub-id pub-id-type="doi">10.1023/A:1023949509487</pub-id></element-citation></ref>
<ref id="b15-ijo-48-02-0690"><label>15</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Rousseeuw</surname><given-names>P</given-names></name></person-group><article-title>Silhouettes: A graphical aid to the interpretation and validation of cluster analysis</article-title><source>J Comput Appl Math</source><volume>20</volume><fpage>53</fpage><lpage>65</lpage><year>1987</year><pub-id pub-id-type="doi">10.1016/0377-0427(87)90125-7</pub-id></element-citation></ref>
<ref id="b16-ijo-48-02-0690"><label>16</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Verhaak</surname><given-names>RG</given-names></name><name><surname>Hoadley</surname><given-names>KA</given-names></name><name><surname>Purdom</surname><given-names>E</given-names></name><name><surname>Wang</surname><given-names>V</given-names></name><name><surname>Qi</surname><given-names>Y</given-names></name><name><surname>Wilkerson</surname><given-names>MD</given-names></name><name><surname>Miller</surname><given-names>CR</given-names></name><name><surname>Ding</surname><given-names>L</given-names></name><name><surname>Golub</surname><given-names>T</given-names></name><name><surname>Mesirov</surname><given-names>JP</given-names></name><etal/></person-group><article-title>Integrated genomic analysis identifies clinically relevant subtypes of glioblastoma characterized by abnormalities in PDGFRA, IDH1, EGFR, and NF1</article-title><source>Cancer Cell</source><volume>17</volume><fpage>98</fpage><lpage>110</lpage><year>2010</year><pub-id pub-id-type="doi">10.1016/j.ccr.2009.12.020</pub-id><pub-id pub-id-type="pmid">20129251</pub-id><pub-id pub-id-type="pmcid">2818769</pub-id></element-citation></ref>
<ref id="b17-ijo-48-02-0690"><label>17</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Lovmar</surname><given-names>L</given-names></name><name><surname>Ahlford</surname><given-names>A</given-names></name><name><surname>Jonsson</surname><given-names>M</given-names></name><name><surname>Syv&#x000E4;nen</surname><given-names>AC</given-names></name></person-group><article-title>Silhouette scores for assessment of SNP genotype clusters</article-title><source>BMC Genomics</source><volume>6</volume><fpage>35</fpage><year>2005</year><pub-id pub-id-type="doi">10.1186/1471-2164-6-35</pub-id><pub-id pub-id-type="pmid">15760469</pub-id><pub-id pub-id-type="pmcid">555759</pub-id></element-citation></ref>
<ref id="b18-ijo-48-02-0690"><label>18</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Tusher</surname><given-names>VG</given-names></name><name><surname>Tibshirani</surname><given-names>R</given-names></name><name><surname>Chu</surname><given-names>G</given-names></name></person-group><article-title>Significance analysis of microarrays applied to the ionizing radiation response</article-title><source>Proc Natl Acad Sci USA</source><volume>98</volume><fpage>5116</fpage><lpage>5121</lpage><year>2001</year><pub-id pub-id-type="doi">10.1073/pnas.091062498</pub-id><pub-id pub-id-type="pmid">11309499</pub-id><pub-id pub-id-type="pmcid">33173</pub-id></element-citation></ref>
<ref id="b19-ijo-48-02-0690"><label>19</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Tibshirani</surname><given-names>R</given-names></name><name><surname>Hastie</surname><given-names>T</given-names></name><name><surname>Narasimhan</surname><given-names>B</given-names></name><name><surname>Chu</surname><given-names>G</given-names></name></person-group><article-title>Diagnosis of multiple cancer types by shrunken centroids of gene expression</article-title><source>Proc Natl Acad Sci USA</source><volume>99</volume><fpage>6567</fpage><lpage>6572</lpage><year>2002</year><pub-id pub-id-type="doi">10.1073/pnas.082099299</pub-id><pub-id pub-id-type="pmid">12011421</pub-id><pub-id pub-id-type="pmcid">124443</pub-id></element-citation></ref>
<ref id="b20-ijo-48-02-0690"><label>20</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Dennis</surname><given-names>G</given-names><suffix>Jr</suffix></name><name><surname>Sherman</surname><given-names>BT</given-names></name><name><surname>Hosack</surname><given-names>DA</given-names></name><name><surname>Yang</surname><given-names>J</given-names></name><name><surname>Gao</surname><given-names>W</given-names></name><name><surname>Lane</surname><given-names>HC</given-names></name><name><surname>Lempicki</surname><given-names>RA</given-names></name></person-group><article-title>DAVID: Database for Annotation, Visualization, and Integrated Discovery</article-title><source>Genome Biol</source><volume>4</volume><fpage>3</fpage><year>2003</year><pub-id pub-id-type="doi">10.1186/gb-2003-4-5-p3</pub-id></element-citation></ref>
<ref id="b21-ijo-48-02-0690"><label>21</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Huang da</surname><given-names>W</given-names></name><name><surname>Sherman</surname><given-names>BT</given-names></name><name><surname>Lempicki</surname><given-names>RA</given-names></name></person-group><article-title>Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists</article-title><source>Nucleic Acids Res</source><volume>37</volume><fpage>1</fpage><lpage>13</lpage><year>2009</year><pub-id pub-id-type="doi">10.1093/nar/gkn923</pub-id><pub-id pub-id-type="pmcid">2615629</pub-id></element-citation></ref>
<ref id="b22-ijo-48-02-0690"><label>22</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Warde-Farley</surname><given-names>D</given-names></name><name><surname>Donaldson</surname><given-names>SL</given-names></name><name><surname>Comes</surname><given-names>O</given-names></name><name><surname>Zuberi</surname><given-names>K</given-names></name><name><surname>Badrawi</surname><given-names>R</given-names></name><name><surname>Chao</surname><given-names>P</given-names></name><name><surname>Franz</surname><given-names>M</given-names></name><name><surname>Grouios</surname><given-names>C</given-names></name><name><surname>Kazi</surname><given-names>F</given-names></name><name><surname>Lopes</surname><given-names>CT</given-names></name><etal/></person-group><article-title>The GeneMANIA prediction server: biological network integration for gene prioritization and predicting gene function</article-title><source>Nucleic Acids Res</source><volume>38</volume><issue>Web Server issue</issue><fpage>W214</fpage><lpage>W220</lpage><year>2010</year><pub-id pub-id-type="doi">10.1093/nar/gkq537</pub-id><pub-id pub-id-type="pmid">20576703</pub-id><pub-id pub-id-type="pmcid">2896186</pub-id></element-citation></ref>
<ref id="b23-ijo-48-02-0690"><label>23</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Siegmund</surname><given-names>KD</given-names></name></person-group><article-title>Statistical approaches for the analysis of DNA methylation microarray data</article-title><source>Hum Genet</source><volume>129</volume><fpage>585</fpage><lpage>595</lpage><year>2011</year><pub-id pub-id-type="doi">10.1007/s00439-011-0993-x</pub-id><pub-id pub-id-type="pmid">21519831</pub-id><pub-id pub-id-type="pmcid">3166559</pub-id></element-citation></ref>
<ref id="b24-ijo-48-02-0690"><label>24</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Du</surname><given-names>P</given-names></name><name><surname>Zhang</surname><given-names>X</given-names></name><name><surname>Huang</surname><given-names>CC</given-names></name><name><surname>Jafari</surname><given-names>N</given-names></name><name><surname>Kibbe</surname><given-names>WA</given-names></name><name><surname>Hou</surname><given-names>L</given-names></name><name><surname>Lin</surname><given-names>SM</given-names></name></person-group><article-title>Comparison of Beta-value and M-value methods for quantifying methylation levels by microarray analysis</article-title><source>BMC Bioinformatics</source><volume>11</volume><fpage>587</fpage><year>2010</year><pub-id pub-id-type="doi">10.1186/1471-2105-11-587</pub-id><pub-id pub-id-type="pmid">21118553</pub-id><pub-id pub-id-type="pmcid">3012676</pub-id></element-citation></ref>
<ref id="b25-ijo-48-02-0690"><label>25</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Houseman</surname><given-names>EA</given-names></name><name><surname>Christensen</surname><given-names>BC</given-names></name><name><surname>Yeh</surname><given-names>RF</given-names></name><name><surname>Marsit</surname><given-names>CJ</given-names></name><name><surname>Karagas</surname><given-names>MR</given-names></name><name><surname>Wrensch</surname><given-names>M</given-names></name><name><surname>Nelson</surname><given-names>HH</given-names></name><name><surname>Wiemels</surname><given-names>J</given-names></name><name><surname>Zheng</surname><given-names>S</given-names></name><name><surname>Wiencke</surname><given-names>JK</given-names></name><etal/></person-group><article-title>Model-based clustering of DNA methylation array data: a recursive-partitioning algorithm for high-dimensional data arising as a mixture of beta distributions</article-title><source>BMC Bioinformatics</source><volume>9</volume><fpage>365</fpage><year>2008</year><pub-id pub-id-type="doi">10.1186/1471-2105-9-365</pub-id><pub-id pub-id-type="pmid">18782434</pub-id><pub-id pub-id-type="pmcid">2553421</pub-id></element-citation></ref>
<ref id="b26-ijo-48-02-0690"><label>26</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Gentleman</surname><given-names>RC</given-names></name><name><surname>Carey</surname><given-names>VJ</given-names></name><name><surname>Bates</surname><given-names>DM</given-names></name><name><surname>Bolstad</surname><given-names>B</given-names></name><name><surname>Dettling</surname><given-names>M</given-names></name><name><surname>Dudoit</surname><given-names>S</given-names></name><name><surname>Ellis</surname><given-names>B</given-names></name><name><surname>Gautier</surname><given-names>L</given-names></name><name><surname>Ge</surname><given-names>Y</given-names></name><name><surname>Gentry</surname><given-names>J</given-names></name><etal/></person-group><article-title>Bioconductor: open software development for computational biology and bioinformatics</article-title><source>Genome Biol</source><volume>5</volume><fpage>R80</fpage><year>2004</year><pub-id pub-id-type="doi">10.1186/gb-2004-5-10-r80</pub-id><pub-id pub-id-type="pmid">15461798</pub-id><pub-id pub-id-type="pmcid">545600</pub-id></element-citation></ref>
<ref id="b27-ijo-48-02-0690"><label>27</label><element-citation publication-type="web"><collab>R Development Core Team</collab><year>2011</year><source>R: A Language and Environment for Statistical Computing</source><publisher-loc>Vienna, Austria</publisher-loc><publisher-name>the R Foundation for Statistical Computing</publisher-name><isbn>3-900051-07-0</isbn><comment>Available online at <ext-link xlink:href="http://www.R-project.org/" ext-link-type="uri">http://www.R-project.org/</ext-link></comment></element-citation></ref>
<ref id="b28-ijo-48-02-0690"><label>28</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Subramanian</surname><given-names>A</given-names></name><name><surname>Tamayo</surname><given-names>P</given-names></name><name><surname>Mootha</surname><given-names>VK</given-names></name><name><surname>Mukherjee</surname><given-names>S</given-names></name><name><surname>Ebert</surname><given-names>BL</given-names></name><name><surname>Gillette</surname><given-names>MA</given-names></name><name><surname>Paulovich</surname><given-names>A</given-names></name><name><surname>Pomeroy</surname><given-names>SL</given-names></name><name><surname>Golub</surname><given-names>TR</given-names></name><name><surname>Lander</surname><given-names>ES</given-names></name><etal/></person-group><article-title>Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles</article-title><source>Proc Natl Acad Sci USA</source><volume>102</volume><fpage>15545</fpage><lpage>15550</lpage><year>2005</year><pub-id pub-id-type="doi">10.1073/pnas.0506580102</pub-id><pub-id pub-id-type="pmid">16199517</pub-id><pub-id pub-id-type="pmcid">1239896</pub-id></element-citation></ref>
<ref id="b29-ijo-48-02-0690"><label>29</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Hajjari</surname><given-names>M</given-names></name><name><surname>Salavaty</surname><given-names>A</given-names></name></person-group><article-title>HOTAIR: an oncogenic long non-coding RNA in different cancers</article-title><source>Cancer Biol Med</source><volume>12</volume><fpage>1</fpage><lpage>9</lpage><year>2015</year><pub-id pub-id-type="pmid">25859406</pub-id><pub-id pub-id-type="pmcid">4383848</pub-id></element-citation></ref>
<ref id="b30-ijo-48-02-0690"><label>30</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Wu</surname><given-names>ZH</given-names></name><name><surname>Wang</surname><given-names>XL</given-names></name><name><surname>Tang</surname><given-names>HM</given-names></name><name><surname>Jiang</surname><given-names>T</given-names></name><name><surname>Chen</surname><given-names>J</given-names></name><name><surname>Lu</surname><given-names>S</given-names></name><name><surname>Qiu</surname><given-names>GQ</given-names></name><name><surname>Peng</surname><given-names>ZH</given-names></name><name><surname>Yan</surname><given-names>DW</given-names></name></person-group><article-title>Long non-coding RNA HOTAIR is a powerful predictor of metastasis and poor prognosis and is associated with epithelial-mesenchymal transition in colon cancer</article-title><source>Oncol Rep</source><volume>32</volume><fpage>395</fpage><lpage>402</lpage><year>2014</year><pub-id pub-id-type="pmid">24840737</pub-id></element-citation></ref>
<ref id="b31-ijo-48-02-0690"><label>31</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Kogo</surname><given-names>R</given-names></name><name><surname>Shimamura</surname><given-names>T</given-names></name><name><surname>Mimori</surname><given-names>K</given-names></name><name><surname>Kawahara</surname><given-names>K</given-names></name><name><surname>Imoto</surname><given-names>S</given-names></name><name><surname>Sudo</surname><given-names>T</given-names></name><name><surname>Tanaka</surname><given-names>F</given-names></name><name><surname>Shibata</surname><given-names>K</given-names></name><name><surname>Suzuki</surname><given-names>A</given-names></name><name><surname>Komune</surname><given-names>S</given-names></name><etal/></person-group><article-title>Long noncoding RNA HOTAIR regulates polycomb-dependent chromatin modification and is associated with poor prognosis in colorectal cancers</article-title><source>Cancer Res</source><volume>71</volume><fpage>6320</fpage><lpage>6326</lpage><year>2011</year><pub-id pub-id-type="doi">10.1158/0008-5472.CAN-11-1021</pub-id><pub-id pub-id-type="pmid">21862635</pub-id></element-citation></ref>
<ref id="b32-ijo-48-02-0690"><label>32</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Kim</surname><given-names>MS</given-names></name><name><surname>Lee</surname><given-names>J</given-names></name><name><surname>Sidransky</surname><given-names>D</given-names></name></person-group><article-title>DNA methylation markers in colorectal cancer</article-title><source>Cancer Metastasis Rev</source><volume>29</volume><fpage>181</fpage><lpage>206</lpage><year>2010</year><pub-id pub-id-type="doi">10.1007/s10555-010-9207-6</pub-id><pub-id pub-id-type="pmid">20135198</pub-id></element-citation></ref></ref-list></back>
<floats-group>
<fig id="f1-ijo-48-02-0690" position="float">
<label>Figure 1</label>
<caption>
<p>Unsupervised clustering of gene expression data. (A) K=2. (B) K=3. (C) K=4. (D) Cluster consensus values and consensus CDF for K=2 to 6.</p></caption>
<graphic xlink:href="IJO-48-02-0690-g00.gif"/></fig>
<fig id="f2-ijo-48-02-0690" position="float">
<label>Figure 2</label>
<caption>
<p>Overlap between the subgroups derived from the clustering analysis and the subtypes identified by the TCGA group at the gene expression level. (A) Top level. (B) Three subclasses divided from ECL1.</p></caption>
<graphic xlink:href="IJO-48-02-0690-g01.gif"/></fig>
<fig id="f3-ijo-48-02-0690" position="float">
<label>Figure 3</label>
<caption>
<p>Unsupervised clustering of ECL1. (A) K=2. (B) K=3, at which three subclasses were identified in ECL1. (C) K=4. (D) Consensus CDF. (E) Silhouette scores of the three subclasses. (F) Cluster consensus values.</p></caption>
<graphic xlink:href="IJO-48-02-0690-g02.gif"/></fig>
<fig id="f4-ijo-48-02-0690" position="float">
<label>Figure 4</label>
<caption>
<p>Heat map of the 256 marker genes for all 153 cancer samples.</p></caption>
<graphic xlink:href="IJO-48-02-0690-g03.gif"/></fig>
<fig id="f5-ijo-48-02-0690" position="float">
<label>Figure 5</label>
<caption>
<p>Unsupervised clustering of DNA methylation data. (A) K=3. (B) K=4. (C) Consensus CDF for K=2 to 6. (D) Cluster consensus values for K=2 to 6.</p></caption>
<graphic xlink:href="IJO-48-02-0690-g04.gif"/></fig>
<fig id="f6-ijo-48-02-0690" position="float">
<label>Figure 6</label>
<caption>
<p>Overlap between the subgroups derived from the clustering analysis and the subtypes identified by the TCGA group at the DNA methylation level. (A) K=3. (B) K=4.</p></caption>
<graphic xlink:href="IJO-48-02-0690-g05.gif"/></fig>
<fig id="f7-ijo-48-02-0690" position="float">
<label>Figure 7</label>
<caption>
<p>Heat map of the 91 DNA methylation probes in the two marker gene panels.</p></caption>
<graphic xlink:href="IJO-48-02-0690-g06.gif"/></fig>
<fig id="f8-ijo-48-02-0690" position="float">
<label>Figure 8</label>
<caption>
<p>Workflow of the unsupervised clustering of the 153 colon cancer samples at the two molecular levels.</p></caption>
<graphic xlink:href="IJO-48-02-0690-g07.gif"/></fig>
<table-wrap id="tI-ijo-48-02-0690" position="float">
<label>Table I</label>
<caption>
<p>Clinical data and subtypes identified by previous studies for 153 colon cancer samples.</p></caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th valign="bottom" align="left">Characteristics</th>
<th valign="bottom" align="center">n (&#x00025;)</th></tr></thead>
<tbody>
<tr>
<td colspan="2" valign="top" align="left">Gender</td></tr>
<tr>
<td valign="top" align="left">&#x02003;Male</td>
<td valign="top" align="center">78 (51.0)</td></tr>
<tr>
<td valign="top" align="left">&#x02003;Female</td>
<td valign="top" align="center">75 (49.0)</td></tr>
<tr>
<td colspan="2" valign="top" align="left">Age (yrs.)</td></tr>
<tr>
<td valign="top" align="left">&#x02003;Mean &#x000B1; SD</td>
<td valign="top" align="center">75&#x000B1;11.7</td></tr>
<tr>
<td colspan="2" valign="top" align="left">Tumor sub-site</td></tr>
<tr>
<td valign="top" align="left">&#x02003;Left</td>
<td valign="top" align="center">72 (47.1)</td></tr>
<tr>
<td valign="top" align="left">&#x02003;Right</td>
<td valign="top" align="center">80 (52.3)</td></tr>
<tr>
<td valign="top" align="left">&#x02003;Unknown</td>
<td valign="top" align="center">1 (0.6)</td></tr>
<tr>
<td colspan="2" valign="top" align="left">MSI-status</td></tr>
<tr>
<td valign="top" align="left">&#x02003;MSI-H</td>
<td valign="top" align="center">28 (18.3)</td></tr>
<tr>
<td valign="top" align="left">&#x02003;MSI-L</td>
<td valign="top" align="center">33 (21.6)</td></tr>
<tr>
<td valign="top" align="left">&#x02003;MSS</td>
<td valign="top" align="center">92 (60.1)</td></tr>
<tr>
<td colspan="2" valign="top" align="left">Expression subtypes</td></tr>
<tr>
<td valign="top" align="left">&#x02003;CIN</td>
<td valign="top" align="center">57 (37.3)</td></tr>
<tr>
<td valign="top" align="left">&#x02003;Invasive</td>
<td valign="top" align="center">37 (24.2)</td></tr>
<tr>
<td valign="top" align="left">&#x02003;MSI/CIMP</td>
<td valign="top" align="center">58 (37.9)</td></tr>
<tr>
<td valign="top" align="left">&#x02003;Unknown</td>
<td valign="top" align="center">1 (0.6)</td></tr>
<tr>
<td colspan="2" valign="top" align="left">Methylation subtypes</td></tr>
<tr>
<td valign="top" align="left">&#x02003;CIMP-H</td>
<td valign="top" align="center">29 (18.9)</td></tr>
<tr>
<td valign="top" align="left">&#x02003;CIMP-L</td>
<td valign="top" align="center">35 (22.9)</td></tr>
<tr>
<td valign="top" align="left">&#x02003;Cluster 3</td>
<td valign="top" align="center">44 (28.8)</td></tr>
<tr>
<td valign="top" align="left">&#x02003;Cluster 4</td>
<td valign="top" align="center">45 (29.4)</td></tr>
<tr>
<td colspan="2" valign="top" align="left">Tumor stage</td></tr>
<tr>
<td valign="top" align="left">&#x02003;I</td>
<td valign="top" align="center">28 (18.3)</td></tr>
<tr>
<td valign="top" align="left">&#x02003;II</td>
<td valign="top" align="center">61 (39.9)</td></tr>
<tr>
<td valign="top" align="left">&#x02003;III</td>
<td valign="top" align="center">39 (25.5)</td></tr>
<tr>
<td valign="top" align="left">&#x02003;IV</td>
<td valign="top" align="center">23 (15.0)</td></tr>
<tr>
<td valign="top" align="left">&#x02003;Unknown</td>
<td valign="top" align="center">2 (1.3)</td></tr>
<tr>
<td colspan="2" valign="top" align="left">Vital status</td></tr>
<tr>
<td valign="top" align="left">&#x02003;Living</td>
<td valign="top" align="center">138 (90.2)</td></tr>
<tr>
<td valign="top" align="left">&#x02003;Deceased</td>
<td valign="top" align="center">15 (9.8)</td></tr></tbody></table></table-wrap>
<table-wrap id="tII-ijo-48-02-0690" position="float">
<label>Table II</label>
<caption>
<p>Correlation between the clinical data and the subgroups identified in the gene expression data of the colon cancer samples.</p></caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th valign="bottom" align="left"/>
<th colspan="2" valign="bottom" align="center">Subgroups (&#x00025;)</th>
<th valign="bottom" align="center"/></tr>
<tr>
<th valign="bottom" align="left"/>
<th colspan="2" valign="bottom" align="left">
<hr/></th>
<th valign="bottom" align="center"/></tr>
<tr>
<th valign="bottom" align="left">Characteristics</th>
<th valign="bottom" align="center">ECL1</th>
<th valign="bottom" align="center">ECL2</th>
<th valign="bottom" align="center">P-value</th></tr></thead>
<tbody>
<tr>
<td valign="top" align="left">Total sample no.</td>
<td valign="top" align="center">104 (68.0)</td>
<td valign="top" align="center">49 (32.0)</td>
<td valign="top" align="center"/></tr>
<tr>
<td colspan="4" valign="top" align="left">Gender</td></tr>
<tr>
<td valign="top" align="left">&#x02003;Male</td>
<td valign="top" align="center">59 (75.6)</td>
<td valign="top" align="center">19 (24.4)</td>
<td valign="top" align="center">0.06</td></tr>
<tr>
<td valign="top" align="left">&#x02003;Female</td>
<td valign="top" align="center">45 (60.0)</td>
<td valign="top" align="center">30 (40.0)</td>
<td valign="top" align="center"/></tr>
<tr>
<td colspan="4" valign="top" align="left">Age (yrs.)</td></tr>
<tr>
<td valign="top" align="left">&#x02003;Mean &#x000B1; SD</td>
<td valign="top" align="center">69.4&#x000B1;11.7</td>
<td valign="top" align="center">73.3&#x000B1;11.4</td>
<td valign="top" align="center">0.049</td></tr>
<tr>
<td colspan="4" valign="top" align="left">Tumor location</td></tr>
<tr>
<td valign="top" align="left">&#x02003;Ascending</td>
<td valign="top" align="center">11 (39.3)</td>
<td valign="top" align="center">17 (60.7)</td>
<td valign="top" align="center">3.6e-06</td></tr>
<tr>
<td valign="top" align="left">&#x02003;Cecum</td>
<td valign="top" align="center">17 (58.6)</td>
<td valign="top" align="center">12 (41.4)</td>
<td valign="top" align="center"/></tr>
<tr>
<td valign="top" align="left">&#x02003;Transverse</td>
<td valign="top" align="center">13 (52.0)</td>
<td valign="top" align="center">12 (48.0)</td>
<td valign="top" align="center"/></tr>
<tr>
<td valign="top" align="left">&#x02003;Descending</td>
<td valign="top" align="center">6 (100)</td>
<td valign="top" align="center">0 (0)</td>
<td valign="top" align="center"/></tr>
<tr>
<td valign="top" align="left">&#x02003;Sigmoid</td>
<td valign="top" align="center">56 (87.5)</td>
<td valign="top" align="center">8 (12.5)</td>
<td valign="top" align="center"/></tr>
<tr>
<td valign="top" align="left">&#x02003;Unknown</td>
<td valign="top" align="center">1 (100)</td>
<td valign="top" align="center">0 (0)</td>
<td valign="top" align="center"/></tr>
<tr>
<td colspan="4" valign="top" align="left">Sub-site</td></tr>
<tr>
<td valign="top" align="left">&#x02003;Left</td>
<td valign="top" align="center">64 (88.9)</td>
<td valign="top" align="center">8 (11.1)</td>
<td valign="top" align="center">9.4e-08</td></tr>
<tr>
<td valign="top" align="left">&#x02003;Right</td>
<td valign="top" align="center">39 (48.8)</td>
<td valign="top" align="center">41 (51.2)</td>
<td valign="top" align="center"/></tr>
<tr>
<td valign="top" align="left">&#x02003;Unknown</td>
<td valign="top" align="center">1 (100)</td>
<td valign="top" align="center">0 (0)</td>
<td valign="top" align="center"/></tr>
<tr>
<td colspan="4" valign="top" align="left">AJCC stage</td></tr>
<tr>
<td valign="top" align="left">&#x02003;I</td>
<td valign="top" align="center">19 (67.9)</td>
<td valign="top" align="center">9 (32.1)</td>
<td valign="top" align="center">0.68</td></tr>
<tr>
<td valign="top" align="left">&#x02003;II</td>
<td valign="top" align="center">39 (63.9)</td>
<td valign="top" align="center">22 (36.1)</td>
<td valign="top" align="center"/></tr>
<tr>
<td valign="top" align="left">&#x02003;III</td>
<td valign="top" align="center">28 (71.8)</td>
<td valign="top" align="center">11 (28.2)</td>
<td valign="top" align="center"/></tr>
<tr>
<td valign="top" align="left">&#x02003;IV</td>
<td valign="top" align="center">17 (77.3)</td>
<td valign="top" align="center">5 (22.7)</td>
<td valign="top" align="center"/></tr>
<tr>
<td valign="top" align="left">&#x02003;Unknown</td>
<td valign="top" align="center">1 (50.0)</td>
<td valign="top" align="center">1 (50.0)</td>
<td valign="top" align="center"/></tr>
<tr>
<td colspan="4" valign="top" align="left">MSI status</td></tr>
<tr>
<td valign="top" align="left">&#x02003;MSS</td>
<td valign="top" align="center">78 (84.8)</td>
<td valign="top" align="center">14 (15.2)</td>
<td valign="top" align="center">2.2e-16</td></tr>
<tr>
<td valign="top" align="left">&#x02003;MSI-H</td>
<td valign="top" align="center">0 (0)</td>
<td valign="top" align="center">28 (100)</td>
<td valign="top" align="center"/></tr>
<tr>
<td valign="top" align="left">&#x02003;MSI-L</td>
<td valign="top" align="center">26 (78.8)</td>
<td valign="top" align="center">7 (21.2)</td>
<td valign="top" align="center"/></tr>
<tr>
<td colspan="4" valign="top" align="left">Expression subtype</td></tr>
<tr>
<td valign="top" align="left">&#x02003;CIN</td>
<td valign="top" align="center">57 (100)</td>
<td valign="top" align="center">0 (0)</td>
<td valign="top" align="center">2.2e-16</td></tr>
<tr>
<td valign="top" align="left">&#x02003;Invasive</td>
<td valign="top" align="center">35 (94.6)</td>
<td valign="top" align="center">2 (5.4)</td>
<td valign="top" align="center"/></tr>
<tr>
<td valign="top" align="left">&#x02003;MSI/CIMP</td>
<td valign="top" align="center">11 (19.0)</td>
<td valign="top" align="center">47 (81.0)</td>
<td valign="top" align="center"/></tr>
<tr>
<td valign="top" align="left">&#x02003;Unknown</td>
<td valign="top" align="center">1 (100)</td>
<td valign="top" align="center">0 (0)</td>
<td valign="top" align="center"/></tr></tbody></table></table-wrap>
<table-wrap id="tIII-ijo-48-02-0690" position="float">
<label>Table III</label>
<caption>
<p>Correlation between gene mutations and the subgroups identified in the gene expression data of the colon cancer samples.</p></caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th valign="bottom" align="left"/>
<th colspan="2" valign="bottom" align="center">Subgroups (&#x00025;)</th>
<th valign="bottom" align="center"/></tr>
<tr>
<th valign="bottom" align="left"/>
<th colspan="2" valign="bottom" align="left">
<hr/></th>
<th valign="bottom" align="center"/></tr>
<tr>
<th valign="bottom" align="left">Mutated genes</th>
<th valign="bottom" align="center">ECL1</th>
<th valign="bottom" align="center">ECL2</th>
<th valign="bottom" align="center">P-value</th></tr></thead>
<tbody>
<tr>
<td valign="top" align="left">Total sample no.</td>
<td valign="top" align="center">104 (68.0)</td>
<td valign="top" align="center">49 (32.0)</td>
<td valign="top" align="center"/></tr>
<tr>
<td colspan="4" valign="top" align="left"><italic>BRAF</italic> mutation</td></tr>
<tr>
<td valign="top" align="left">&#x02003;Yes</td>
<td valign="top" align="center">0 (0)</td>
<td valign="top" align="center">17 (100)</td>
<td valign="top" align="center">1.8e-10</td></tr>
<tr>
<td valign="top" align="left">&#x02003;No</td>
<td valign="top" align="center">92 (78.6)</td>
<td valign="top" align="center">25 (21.4)</td>
<td valign="top" align="center"/></tr>
<tr>
<td valign="top" align="left">&#x02003;Unknown</td>
<td valign="top" align="center">12 (63.2)</td>
<td valign="top" align="center">7 (36.8)</td>
<td valign="top" align="center"/></tr>
<tr>
<td colspan="4" valign="top" align="left"><italic>KRAS</italic> mutation</td></tr>
<tr>
<td valign="top" align="left">&#x02003;Yes</td>
<td valign="top" align="center">31 (66.0)</td>
<td valign="top" align="center">16 (34.0)</td>
<td valign="top" align="center">0.69</td></tr>
<tr>
<td valign="top" align="left">&#x02003;No</td>
<td valign="top" align="center">61 (70.1)</td>
<td valign="top" align="center">26 (29.9)</td>
<td valign="top" align="center"/></tr>
<tr>
<td valign="top" align="left">&#x02003;Unknown</td>
<td valign="top" align="center">12 (63.2)</td>
<td valign="top" align="center">7 (36.8)</td>
<td valign="top" align="center"/></tr>
<tr>
<td colspan="4" valign="top" align="left"><italic>TP53</italic> mutation</td></tr>
<tr>
<td valign="top" align="left">&#x02003;Yes</td>
<td valign="top" align="center">52 (81.2)</td>
<td valign="top" align="center">12 (18.8)</td>
<td valign="top" align="center">2.9e-03</td></tr>
<tr>
<td valign="top" align="left">&#x02003;No</td>
<td valign="top" align="center">40 (57.1)</td>
<td valign="top" align="center">30 (42.9)</td>
<td valign="top" align="center"/></tr>
<tr>
<td valign="top" align="left">&#x02003;Unknown</td>
<td valign="top" align="center">12 (63.2)</td>
<td valign="top" align="center">7 (36.8)</td>
<td valign="top" align="center"/></tr>
<tr>
<td colspan="4" valign="top" align="left"><italic>SOX9</italic> mutation</td></tr>
<tr>
<td valign="top" align="left">&#x02003;Yes</td>
<td valign="top" align="center">7 (87.5)</td>
<td valign="top" align="center">1 (12.5)</td>
<td valign="top" align="center">0.43</td></tr>
<tr>
<td valign="top" align="left">&#x02003;No</td>
<td valign="top" align="center">85 (67.5)</td>
<td valign="top" align="center">41 (32.5)</td>
<td valign="top" align="center"/></tr>
<tr>
<td valign="top" align="left">&#x02003;Unknown</td>
<td valign="top" align="center">12 (63.2)</td>
<td valign="top" align="center">7 (36.8)</td>
<td valign="top" align="center"/></tr></tbody></table></table-wrap>
<table-wrap id="tIV-ijo-48-02-0690" position="float">
<label>Table IV</label>
<caption>
<p>Metastatic and death counts in two nested subclasses related to CIN.</p></caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th valign="bottom" align="left">Subclass</th>
<th valign="bottom" align="center">CIN</th>
<th valign="bottom" align="center">Metastatic count (&#x00025;)</th>
<th valign="bottom" align="left">Death count (&#x00025;)</th></tr></thead>
<tbody>
<tr>
<td valign="top" align="left">Subclass 1</td>
<td valign="top" align="center">18</td>
<td valign="top" align="center">3 (16.7)</td>
<td valign="top" align="left">1 (5.6)</td></tr>
<tr>
<td valign="top" align="left">Subclass 2</td>
<td valign="top" align="center">27</td>
<td valign="top" align="center">8 (29.6)</td>
<td valign="top" align="left">4 (14.8)</td></tr></tbody></table></table-wrap>
<table-wrap id="tV-ijo-48-02-0690" position="float">
<label>Table V</label>
<caption>
<p>Correlation between the clinical data and the subgroups identified in the DNA methylation data of the colon cancer samples.</p></caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th valign="bottom" align="left"/>
<th colspan="3" valign="bottom" align="center">Subgroups (&#x00025;)</th>
<th valign="bottom" align="center"/></tr>
<tr>
<th valign="bottom" align="left"/>
<th colspan="3" valign="bottom" align="left">
<hr/></th>
<th valign="bottom" align="center"/></tr>
<tr>
<th valign="bottom" align="left">Characteristics</th>
<th valign="bottom" align="center">MCL1</th>
<th valign="bottom" align="center">MCL2</th>
<th valign="bottom" align="center">MCL3</th>
<th valign="bottom" align="center">P-value</th></tr></thead>
<tbody>
<tr>
<td valign="top" align="left">Total sample no.</td>
<td valign="top" align="center">57 (37.3)</td>
<td valign="top" align="center">40 (26.1)</td>
<td valign="top" align="center">56 (36.6)</td>
<td valign="top" align="center"/></tr>
<tr>
<td colspan="5" valign="top" align="left">Gender</td></tr>
<tr>
<td valign="top" align="left">&#x02003;Male</td>
<td valign="top" align="center">27 (37.5)</td>
<td valign="top" align="center">25 (34.7)</td>
<td valign="top" align="center">20 (27.8)</td>
<td valign="top" align="center">0.029</td></tr>
<tr>
<td valign="top" align="left">&#x02003;Female</td>
<td valign="top" align="center">30 (37.0)</td>
<td valign="top" align="center">15 (18.5)</td>
<td valign="top" align="center">36 (44.5)</td>
<td valign="top" align="center"/></tr>
<tr>
<td colspan="5" valign="top" align="left">Age (yrs.)</td></tr>
<tr>
<td valign="top" align="left">&#x02003;Mean &#x000B1; SD</td>
<td valign="top" align="center">66.8&#x000B1;12.7</td>
<td valign="top" align="center">74.9&#x000B1;10.1</td>
<td valign="top" align="center">71.6&#x000B1;10.7</td>
<td valign="top" align="center">2.24e-3</td></tr>
<tr>
<td colspan="5" valign="top" align="left">Tumor location</td></tr>
<tr>
<td valign="top" align="left">&#x02003;Ascending</td>
<td valign="top" align="center">3 (10.7)</td>
<td valign="top" align="center">14 (50.0)</td>
<td valign="top" align="center">11 (39.3)</td>
<td valign="top" align="center">4.1e-8</td></tr>
<tr>
<td valign="top" align="left">&#x02003;Cecum</td>
<td valign="top" align="center">6 (20.7)</td>
<td valign="top" align="center">12 (41.4)</td>
<td valign="top" align="center">11 (37.9)</td>
<td valign="top" align="center"/></tr>
<tr>
<td valign="top" align="left">&#x02003;Transverse</td>
<td valign="top" align="center">5 (20.0)</td>
<td valign="top" align="center">11 (44.0)</td>
<td valign="top" align="center">9 (36.0)</td>
<td valign="top" align="center"/></tr>
<tr>
<td valign="top" align="left">&#x02003;Descending</td>
<td valign="top" align="center">4 (66.7)</td>
<td valign="top" align="center">0 (0)</td>
<td valign="top" align="center">2 (33.3)</td>
<td valign="top" align="center"/></tr>
<tr>
<td valign="top" align="left">&#x02003;Sigmoid</td>
<td valign="top" align="center">39 (60.9)</td>
<td valign="top" align="center">3 (4.7)</td>
<td valign="top" align="center">22 (34.4)</td>
<td valign="top" align="center"/></tr>
<tr>
<td valign="top" align="left">&#x02003;Unknown</td>
<td valign="top" align="center">0 (0)</td>
<td valign="top" align="center">0 (0)</td>
<td valign="top" align="center">1 (100)</td>
<td valign="top" align="center"/></tr>
<tr>
<td colspan="5" valign="top" align="left">Sub-site</td></tr>
<tr>
<td valign="top" align="left">&#x02003;Left</td>
<td valign="top" align="center">45 (62.5)</td>
<td valign="top" align="center">3 (4.2)</td>
<td valign="top" align="center">24 (33.3)</td>
<td valign="top" align="center">2.2e-12</td></tr>
<tr>
<td valign="top" align="left">&#x02003;Right</td>
<td valign="top" align="center">12 (15.0)</td>
<td valign="top" align="center">37 (46.3)</td>
<td valign="top" align="center">31 (38.7)</td>
<td valign="top" align="center"/></tr>
<tr>
<td valign="top" align="left">&#x02003;Unknown</td>
<td valign="top" align="center">0 (0)</td>
<td valign="top" align="center">0 (0)</td>
<td valign="top" align="center">1 (100)</td>
<td valign="top" align="center"/></tr>
<tr>
<td colspan="5" valign="top" align="left">AJCC stage</td></tr>
<tr>
<td valign="top" align="left">&#x02003;I</td>
<td valign="top" align="center">12 (42.9)</td>
<td valign="top" align="center">6 (21.4)</td>
<td valign="top" align="center">10 (35.7)</td>
<td valign="top" align="center">0.348</td></tr>
<tr>
<td valign="top" align="left">&#x02003;II</td>
<td valign="top" align="center">19 (31.2)</td>
<td valign="top" align="center">21 (34.4)</td>
<td valign="top" align="center">21 (34.4)</td>
<td valign="top" align="center"/></tr>
<tr>
<td valign="top" align="left">&#x02003;III</td>
<td valign="top" align="center">13 (33.3)</td>
<td valign="top" align="center">10 (25.6)</td>
<td valign="top" align="center">16 (41.1)</td>
<td valign="top" align="center"/></tr>
<tr>
<td valign="top" align="left">&#x02003;IV</td>
<td valign="top" align="center">13 (56.5)</td>
<td valign="top" align="center">3 (13.1)</td>
<td valign="top" align="center">7 (30.4)</td>
<td valign="top" align="center"/></tr>
<tr>
<td valign="top" align="left">&#x02003;Unknown</td>
<td valign="top" align="center">0 (0)</td>
<td valign="top" align="center">0 (0)</td>
<td valign="top" align="center">2 (100)</td>
<td valign="top" align="center"/></tr>
<tr>
<td colspan="5" valign="top" align="left">MSI status</td></tr>
<tr>
<td valign="top" align="left">&#x02003;MSS</td>
<td valign="top" align="center">39 (42.4)</td>
<td valign="top" align="center">11 (11.9)</td>
<td valign="top" align="center">42 (45.7)</td>
<td valign="top" align="center">1.5e-10</td></tr>
<tr>
<td valign="top" align="left">&#x02003;MSI-H</td>
<td valign="top" align="center">3 (10.7)</td>
<td valign="top" align="center">23 (82.1)</td>
<td valign="top" align="center">2 (7.2)</td>
<td valign="top" align="center"/></tr>
<tr>
<td valign="top" align="left">&#x02003;MSI-L</td>
<td valign="top" align="center">15 (45.5)</td>
<td valign="top" align="center">6 (18.2)</td>
<td valign="top" align="center">12 (36.3)</td>
<td valign="top" align="center"/></tr>
<tr>
<td colspan="5" valign="top" align="left">Expression subtype</td></tr>
<tr>
<td valign="top" align="left">&#x02003;CIN</td>
<td valign="top" align="center">32 (56.1)</td>
<td valign="top" align="center">2 (3.5)</td>
<td valign="top" align="center">23 (40.4)</td>
<td valign="top" align="center">6.9e-12</td></tr>
<tr>
<td valign="top" align="left">&#x02003;Invasive</td>
<td valign="top" align="center">9 (24.3)</td>
<td valign="top" align="center">5 (13.5)</td>
<td valign="top" align="center">23 (62.2)</td>
<td valign="top" align="center"/></tr>
<tr>
<td valign="top" align="left">&#x02003;MSI/CIMP</td>
<td valign="top" align="center">16 (27.6)</td>
<td valign="top" align="center">33 (56.9)</td>
<td valign="top" align="center">9 (15.5)</td>
<td valign="top" align="center"/></tr>
<tr>
<td valign="top" align="left">&#x02003;Unknown</td>
<td valign="top" align="center">0 (0)</td>
<td valign="top" align="center">0 (0)</td>
<td valign="top" align="center">1 (100)</td>
<td valign="top" align="center"/></tr></tbody></table></table-wrap>
<table-wrap id="tVI-ijo-48-02-0690" position="float">
<label>Table VI</label>
<caption>
<p>Correlation between gene mutations and the subgroups identified in the DNA methylation data of the colon cancer samples.</p></caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th valign="bottom" align="left"/>
<th colspan="3" valign="bottom" align="center">Subgroups (&#x00025;)</th>
<th valign="bottom" align="center"/></tr>
<tr>
<th valign="bottom" align="left"/>
<th colspan="3" valign="bottom" align="left">
<hr/></th>
<th valign="bottom" align="center"/></tr>
<tr>
<th valign="bottom" align="left">Mutated genes</th>
<th valign="bottom" align="center">MCL1</th>
<th valign="bottom" align="center">MCL2</th>
<th valign="bottom" align="center">MCL3</th>
<th valign="bottom" align="center">P-value</th></tr></thead>
<tbody>
<tr>
<td valign="top" align="left">Total sample no.</td>
<td valign="top" align="center">57 (37.3)</td>
<td valign="top" align="center">40 (26.1)</td>
<td valign="top" align="center">56 (36.6)</td>
<td valign="top" align="center"/></tr>
<tr>
<td colspan="5" valign="top" align="left"><italic>BRAF</italic> mutation</td></tr>
<tr>
<td valign="top" align="left">&#x02003;Yes</td>
<td valign="top" align="center">0 (0)</td>
<td valign="top" align="center">17 (100)</td>
<td valign="top" align="center">0 (0)</td>
<td valign="top" align="center">8.3e-13</td></tr>
<tr>
<td valign="top" align="left">&#x02003;No</td>
<td valign="top" align="center">50 (42.7)</td>
<td valign="top" align="center">16 (13.7)</td>
<td valign="top" align="center">51 (43.6)</td>
<td valign="top" align="center"/></tr>
<tr>
<td valign="top" align="left">&#x02003;Unknown</td>
<td valign="top" align="center">7 (36.8)</td>
<td valign="top" align="center">7 (36.8)</td>
<td valign="top" align="center">5 (26.4)</td>
<td valign="top" align="center"/></tr>
<tr>
<td colspan="5" valign="top" align="left"><italic>KRAS</italic> mutation</td></tr>
<tr>
<td valign="top" align="left">&#x02003;Yes</td>
<td valign="top" align="center">7 (14.9)</td>
<td valign="top" align="center">13 (27.7)</td>
<td valign="top" align="center">27 (57.4)</td>
<td valign="top" align="center">1.3e-4</td></tr>
<tr>
<td valign="top" align="left">&#x02003;No</td>
<td valign="top" align="center">43 (49.4)</td>
<td valign="top" align="center">20 (23.0)</td>
<td valign="top" align="center">24 (27.6)</td>
<td valign="top" align="center"/></tr>
<tr>
<td valign="top" align="left">&#x02003;Unknown</td>
<td valign="top" align="center">7 (36.8)</td>
<td valign="top" align="center">7 (36.8)</td>
<td valign="top" align="center">5 (26.4)</td>
<td valign="top" align="center"/></tr>
<tr>
<td colspan="5" valign="top" align="left"><italic>TP53</italic> mutation</td></tr>
<tr>
<td valign="top" align="left">&#x02003;Yes</td>
<td valign="top" align="center">29 (45.3)</td>
<td valign="top" align="center">8 (12.5)</td>
<td valign="top" align="center">27 (42.2)</td>
<td valign="top" align="center">6.5e-3</td></tr>
<tr>
<td valign="top" align="left">&#x02003;No</td>
<td valign="top" align="center">21 (30.0)</td>
<td valign="top" align="center">25 (35.7)</td>
<td valign="top" align="center">24 (34.3)</td>
<td valign="top" align="center"/></tr>
<tr>
<td valign="top" align="left">&#x02003;Unknown</td>
<td valign="top" align="center">7 (36.8)</td>
<td valign="top" align="center">7 (36.8)</td>
<td valign="top" align="center">5 (26.4)</td>
<td valign="top" align="center"/></tr>
<tr>
<td colspan="5" valign="top" align="left"><italic>SOX9</italic> mutation</td></tr>
<tr>
<td valign="top" align="left">&#x02003;Yes</td>
<td valign="top" align="center">2 (25.0)</td>
<td valign="top" align="center">1 (12.5)</td>
<td valign="top" align="center">5 (62.5)</td>
<td valign="top" align="center">0.46</td></tr>
<tr>
<td valign="top" align="left">&#x02003;No</td>
<td valign="top" align="center">48 (38.1)</td>
<td valign="top" align="center">32 (25.4)</td>
<td valign="top" align="center">46 (36.5)</td>
<td valign="top" align="center"/></tr>
<tr>
<td valign="top" align="left">&#x02003;Unknown</td>
<td valign="top" align="center">7 (36.8)</td>
<td valign="top" align="center">7 (36.8)</td>
<td valign="top" align="center">5 (26.4)</td>
<td valign="top" align="center"/></tr></tbody></table></table-wrap></floats-group></article>
