Current evidence suggests that the occurrence and development of CRC is a multi-step, multi-stage and multi-gene process. It is widely considered to result from the interaction of environmental and genetic factors, as well as from the inactivation of tumor suppressor genes and the activation of proto-oncogenes. Based on the genetic mutations and the cytogenetic background of the genome, molecular typing of the CRC genome rests on the following: Chromosomal instability (CIN), microsatellite instability (MSI), CpG island methylator phenotype (CIMP) and molecular markers.
CIN
CIN refers to the phenomenon of chromosomal variation in cells (10). It mainly consists of two parts: Chromosomal number variation, namely chromosomal aneuploidy, which is closely related to tumor deterioration, progression, metastasis and a poor prognosis (11–13), and abnormal chromosomal structure, such as recombination, translocation and inversion, among others (14). It has been shown that the accumulation of mutations in multiple proto-oncogenes, such as RAS, phosphatidylinositol-4,5-bisphosphate 3-kinase catalytic subunit α (PIK3CA), c-Myc and BRAF, and in tumor suppressor genes, such as adenomatous polyposis coli (APC), tumor protein 53 (TP53), PTEN and deleted in colorectal cancer (DCC), can lead to CIN. This genomic instability promotes the development of CRC. CIN tumors can develop through loss of heterozygosity (LOH) in chromosomes (14). Watanabe et al (15) classified CIN tumors according to the LOH ratio into CIN-high (severe type; LOH ratio ≥75%), CIN-high (mild type; LOH ratio ≥33 and <75%) and CIN-low (LOH ratio <33%). Survival analysis showed that the disease-free survival (DFS) and OS rates of patients with CIN-high tumors were significantly lower than those of patients with CIN-low tumors, supporting the CIN phenotype as an independent risk factor for CRC survival. The CIN phenotype is most common in the distal colon (16).
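The LOH-ratio cut-offs quoted above from Watanabe et al (15) can be written as a small classifier. The following is a minimal Python sketch. The function names, and the assumption that the LOH ratio is the percentage of informative markers showing LOH, are illustrative and are not taken from the cited study.

```python
def loh_ratio(n_loh: int, n_informative: int) -> float:
    """Percentage of informative markers showing LOH (assumed definition)."""
    if n_informative <= 0:
        raise ValueError("at least one informative marker is required")
    if not 0 <= n_loh <= n_informative:
        raise ValueError("n_loh must lie between 0 and n_informative")
    return 100.0 * n_loh / n_informative


def classify_cin(loh_percent: float) -> str:
    """Map an LOH ratio (in %) to the three categories quoted in the text."""
    if not 0.0 <= loh_percent <= 100.0:
        raise ValueError("LOH ratio must be between 0 and 100")
    if loh_percent >= 75.0:
        return "CIN-high (severe)"   # LOH ratio >= 75%
    if loh_percent >= 33.0:
        return "CIN-high (mild)"     # 33% <= LOH ratio < 75%
    return "CIN-low"                 # LOH ratio < 33%
```

For example, a hypothetical tumor with LOH at 3 of 8 informative markers has a ratio of 37.5% and would fall into the CIN-high (mild) category under these cut-offs.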
CIN has been documented in most sporadic CRCs (Sp-CRCs) and tumors with APC germline mutations, with an APC mutation rate of only 1%. Nevertheless, little is known about whether CIN is an independent predictor of familial CRC. Some researchers concluded that although the sensitivity of CIN prediction for familial CRC was acceptable, it was not sufficient to be an independent predictor (10,17,18). One study found no significant difference in CIN between familial CRC cases and non-familial, control CRC cases (P=0.50) (18).
A substantial number of CRCs, known as interval CRCs (I-CRCs), are diagnosed in the period shortly after a negative colonoscopy result (i.e., no detectable polyps or CRC) and prior to the recommended follow-up screening (19). According to the American Cancer Society, ~5,200 Americans were diagnosed with an I-CRC in 2014, and nearly 2,000 succumbed to the disease (20). This particular type of CRC may be associated with genetic defects inducing genome instability, or may be a specific type of Sp-CRC. In response to this uncertainty, researchers performed a matching comparison experiment of I-CRC/Sp-CRC cases and found that CIN occurred in 80–85% of Sp-CRCs and I-CRCs, and the latter frequently exhibited gains and losses in chromosomes 8, 11 and 17 (20). One possible explanation is the inaccurate detection of certain polyps/tumors or similar clinical features leading to negative colonoscopies. Another explanation is that I-CRCs represent a distinct tumor subset with both CIN and MSI phenotypes, and that these two molecular features may play a synergistic role. Furthermore, the interval between colonoscopy and screening could also be an additional explanation.
The CIN phenotype tends to be used as a prognostic tool in clinical practice, and patients with CIN-positive CRC have shown poor OS and progression-free survival (PFS) outcomes, regardless of ethnic background, anatomical location and fluorouracil (5-FU) chemotherapy efficacy (21). Watanabe et al (15) retrospectively reviewed the MSI and CIN status of 1,103 patients and concluded that the CIN phenotype could be used as an independent risk factor for DFS and OS in stage II/III patients, and that CIN-high could be used as a predictor of a poor prognosis. Only 8% of these patients were in either stage I or IV. CIN is a driver of the metastasis of human cancer cells, which has been preliminarily verified in breast and lung cancer models (22); however, the progression pattern of the CIN phenotype in breast cancer and lung adenocarcinoma is not applicable in CRC. Orsetti et al (23) used array comparative genomic hybridization to analyze 162 CIN CRC samples, consisting of 131 primary cancer cases evenly distributed in stages I to IV and 31 metastases (28/31 formed a primary-tumor/matched-metastasis pair), together with 14 adenomas. The results showed that the increased level of genomic instability represented by CIN was not entirely consistent with the progression from stage I to IV on histopathological examination. In addition, with further study of the molecular mechanism of CIN, the genetic variation or abnormal expression of some molecules that maintain chromosomal stability may become therapeutic targets and diagnostic markers (22,24–27). High levels of CIN are not conducive to the proliferation of tumor cells. It is widely considered that drugs could induce higher levels of the CIN phenotype, leading to the spontaneous death of tumor cells. Heat shock protein 90 inhibitors may achieve this effect by inducing higher aneuploidy and limiting tumor cell growth (28).
A phase II trial (29) showed that patients with CRC did not respond well to docetaxel (Taxotere®), which may be attributed to the fact that 85% of CRC tumors were of the CIN type and that aneuploid cells are less sensitive to taxanes than diploid cells (21), a difference associated with the increased taxane resistance caused by abnormal spindle checkpoints in CIN (30).
MSI
Microsatellites are short tandem nucleotide repeats (repeat units of 1–6 nucleotides) that are heritable, unstable and highly polymorphic in the human genome (31). MSI refers to a change in the number of microsatellite tandem repeats at a given locus in certain cells. Importantly, if the DNA mismatch repair (MMR) genes show germline mutations or LOH, errors from microsatellite replication will be retained [MMR-deficient (dMMR)/MSI] (32). MSI CRC includes the majority of hereditary non-polyposis CRC (HNPCC) cases and 15% of Sp-CRCs (32,33). MSI commonly occurs in two situations (34): The first is a germline mutation of the MMR genes MutL homolog 1 (MLH1), MutS homolog 2 (MSH2), MutS homolog 6 (MSH6) or postmeiotic segregation increased 2 (PMS2), and the other is hypermethylation of the MLH1 gene promoter region, which is found predominantly in Sp-CRC cases showing dMMR. According to the microsatellite status, tumors can be divided into three types: Microsatellite high instability (MSI-H), microsatellite low instability (MSI-L) and microsatellite stability (MSS). The two main techniques used to determine dMMR/MSI status are immunohistochemistry (IHC), designed to detect dMMR status, and molecular testing, which determines MSI status (35). Substantial evidence indicates that MSI CRC often presents as poorly differentiated carcinoma and mucinous adenocarcinoma, mostly in the proximal colon with peritumoral lymphocyte infiltration (31,36–38).
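The three-way division into MSI-H, MSI-L and MSS can be illustrated with a short Python sketch. The text does not state numerical cut-offs; the sketch assumes the widely used Bethesda-style convention, in which a tumor is MSI-H when at least 30% of the tested markers are unstable (for example, two or more of a five-marker panel), MSI-L when some but fewer than 30% are unstable, and MSS when none is unstable. The function name and parameters are hypothetical.

```python
def classify_msi(unstable_markers: int, tested_markers: int) -> str:
    """Classify MSI status from a marker panel.

    Assumption (not stated in the text): Bethesda-style thresholds,
    MSI-H at >= 30% unstable markers, MSI-L below that, MSS at zero.
    """
    if tested_markers <= 0:
        raise ValueError("at least one marker must be tested")
    if not 0 <= unstable_markers <= tested_markers:
        raise ValueError("unstable_markers must lie between 0 and tested_markers")
    if unstable_markers == 0:
        return "MSS"
    fraction = unstable_markers / tested_markers
    return "MSI-H" if fraction >= 0.30 else "MSI-L"
```

With a five-marker panel, `classify_msi(2, 5)` gives MSI-H and `classify_msi(1, 5)` gives MSI-L. IHC-based dMMR assessment is a separate test and is not modeled here.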
The incidence of early-onset CRC (EOCRC), defined as CRC in patients younger than 50 years, is rising alarmingly. EOCRC is an important reflection of the trend toward gastrointestinal tumors at a younger age. The long-term tumor burden is becoming increasingly severe for patients with EOCRC. In previous studies, metastatic tumors represented an increasing proportion of all tumor stages in younger patients (39). Compared with general patients with CRC, patients with EOCRC had a higher frequency of dMMR/MSI-H and a higher proportion of wild-type (WT) KRAS and BRAF, as well as a lower BRAF V600E mutation rate (40,41). Taken together, these results support changing the average-risk screening age from 50 to 45 years for all patients, with molecular characterization being an important breakthrough for clinical intervention. It is well established that patients with MSI-H CRC have a better prognosis and longer survival time than patients with either MSI-L or MSS CRC. Guastadisegni et al (42) analyzed the survival status of 12,782 patients with CRC and concluded that patients with MSI-H CRC had improved OS and DFS times. A meta-analysis involving >7,500 patients showed that MSI-positive tumors were associated with better survival than MSI-negative tumors, suggesting that the status of this genomic molecular marker can be analyzed independently to assess prognosis (43). Current guidelines recommend using MSI-H status to guide CRC adjuvant therapy and improve the quality of individualized treatment. In this respect, according to the National Comprehensive Cancer Network® (NCCN) guidelines (44,45), patients with stage II MSI-H CRC may have an improved prognosis but may not benefit from 5-FU adjuvant chemotherapy. Kim et al (46) performed MSI and MMR detection, and prognosis analysis, on 135 patients who received adjuvant FOLFOX chemotherapy (oxaliplatin, 5-FU and leucovorin) after radical resection of CRC.
The results showed that DFS and OS times were not significantly prolonged in patients with MSI-H/MMR-deficient (MMR-D) CRC compared with patients with MSI-L/MMR-intact (MMR-I) CRC. It was not investigated whether patients with MSI-L/MMR-I CRC would benefit more from 5-FU chemotherapy. Guastadisegni et al (42) hypothesized that patients with MSS CRC would benefit more from 5-FU chemotherapy than patients with MSI-H CRC. A retrospective study of 6,964 patients with stage II CRC (47) attempted to determine the relationship between 5-FU-based adjuvant chemotherapy, primary tumor laterality, MSI status and OS. The results showed that for MSS tumors, adjuvant chemotherapy was significantly associated with an improved 5-year OS rate [hazard ratio (HR), 0.47; P<0.001], even in the absence of other risk characteristics. By contrast, there was no significant association between adjuvant chemotherapy and OS in patients with MSI-positive CRC (HR, 0.85; P=0.671). It is difficult to judge the sensitivity of patients to 5-FU based solely on MSI status, and multiple stably expressed markers are needed for a comprehensive analysis. In recent years, immunotherapy has been increasingly used to treat MSI CRC (48,49). In 2017, the US Food and Drug Administration approved pembrolizumab to treat inoperable or metastatic dMMR/MSI-H solid tumors based on the high response rates observed in five clinical trials (50–55). Nivolumab was introduced in dMMR/MSI-H metastatic CRC (mCRC) in the same year (56). Frameshift peptides generated by the frameshift mutations caused by MSI-H are highly immunogenic, and MSI-H tumors therefore respond well to programmed cell death protein 1 (PD-1)/programmed death ligand 1 (PD-L1) inhibitors. In 2015, a study showed that the MSI status of tumors is closely related to the effect of immunotherapy (57).
From later-line monotherapy (Keynote-164, CheckMate-142) and later-line dual-drug therapy (CheckMate-142), to first-line monotherapy (Keynote-177) and first-line dual-drug therapy (CheckMate-142), the role of immunotherapy in the treatment of dMMR/MSI-H CRC is expanding (53,56,58,59). The efficacy of nivolumab plus ipilimumab in the treatment of patients with advanced dMMR/MSI-H CRC is reflected in the 2021 NCCN guidelines and the 2022 American Society of Clinical Oncology (ASCO) conference report (60). Dual immunotherapy can effectively reduce the occurrence of drug resistance, while B2M or JAK1/2 gene mutations associated with resistance to traditional immunotherapy do not affect the benefit of PD-1 antibodies in MSI-H CRC (61); however, immunotherapy is reported to be ineffective for the treatment of MSI-H mucinous adenocarcinoma CRC. Given the persistent or potential toxicity of immunotherapy, a balance between efficacy and toxicity is necessary. It is worth mentioning that since immune checkpoint inhibitors (ICIs) are not effective in MMR-proficient (pMMR)/MSS mCRC, MMR IHC and MSI testing should be performed prior to ICI initiation to minimize the chance of pMMR/MSS tumors being misclassified as dMMR/MSI (62,63). Lynch syndrome is an aggressive autosomal dominant genetic disorder with an ~80% lifetime risk of developing cancer, caused by germline mutations in the MMR genes (MLH1, MSH2, MSH6 and PMS2) (64). The NCCN guidelines recommend MMR or MSI testing for Lynch syndrome in all patients with a history of CRC (44,45). Some researchers consider that MSI and CIN are not mutually exclusive, and that both can be present in one patient with CRC, namely a patient with MSI-positive/CIN-positive CRC, although this is rare (65). The frequency of MSI-positive/CIN-positive tumors was recorded as 12% in Sp-CRCs and 14% in I-CRCs (19). Furthermore, ~25% of patients presented with MSI-negative/CIN-negative CRC (66–69).
A study stratified survival by CIN and MSI status, and concluded that the univariate survival benefit of stage II and III CRC associated with MSI-positive status was not independent of CIN status during multivariable analysis (21,67). Future experiments designed for the three forms of genomic instability [CIN, MSI and CpG island methylator phenotype (CIMP)] may clarify the relationship between the three.
Mutations and genomic instability contribute to inter-tumor heterogeneity. The sub-clonal phenomenon that exists throughout tumor progression is called intra-tumor heterogeneity (ITH). It has been reported that ITH can be detected in almost all cancer types and is associated with tumor prognosis and drug resistance (70–73). A typical example of ITH is the molecular differences between primary tumors and metastases, such as the MMR pattern or MSI status. When the MMR status of mCRC was assessed, a minority of patients showed heterogeneity of MMR status between the primary and corresponding metastatic sites (11.9 and 18.7% in the reports cited), and patients with peritoneal metastasis tended to exhibit this feature. Furthermore, the prevalence of heterogeneous MMR phenotypes in primary tumors with dMMR was significantly higher than that in primary tumors with pMMR (P<0.001) (74,75). However, it is noteworthy that various factors, such as the expertise of the pathologists and the quality of the tumor tissue sampling and staining, can contribute to these discrepancies.
CIMP
CpG islands are regions rich in cytosine (C) and guanine (G) dinucleotides in the gene, with a GC content >50%, a length >250–550 base pairs, and an observed-to-expected CpG ratio of 0.6 or greater (76–78). It has been shown that the pathogenesis of CRC is related to DNA methylation, with hypomethylation of genes in non-promoter regions and hypermethylation of genes in promoter regions (79). DNA hypomethylation can lead to oncogene activation, loss of imprinting and chromosomal instability. Hypermethylation of promoter sequences interferes with the normal expression of tumor suppressor genes and DNA repair genes, which is known as epigenetic silencing (80). This hypermethylated phenotype is called CIMP. Various classification criteria have been developed to describe the tumor characteristics of CIMP, each with unique molecular and oncological characteristics, and no gold standard has been established. Weisenberger et al (81) divided CRC into CIMP-positive and CIMP-negative CRC according to a five-gene panel (CACNA1G, IGF2, NEUROG1, RUNX3 and SOCS1). Similarly, Ogino et al (82) classified CRC as CIMP-High (CIMP-H), CIMP-Low (CIMP-L) and CIMP-negative based on an eight-gene panel (RUNX3, CACNA1G, IGF2, MLH1, NEUROG1, CRABP1, SOCS1 and CDKN2A). This classification has been confirmed to be associated with TP53, KRAS, BRAF, MSI and specific histological types (poorly differentiated or mucinous) (83,84); however, the relationship with CIN remains unclear. An increasing body of evidence suggests that CIMP-type molecular pathways mostly occur in the proximal colon, mainly in elderly women (84–86). Moreover, it has been observed that the frequency of gene hypermethylation in normal colon mucosa is higher in women and elderly patients with CRC than in men and young patients (86).
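The sequence criteria for a CpG island given at the start of this section can be checked computationally. The following is a minimal Python sketch under stated assumptions: the observed-to-expected CpG ratio is computed with the commonly used formula (number of CpG × sequence length) / (number of C × number of G), the length cut-off defaults to 250 bp (the lower bound of the range quoted above), and the function names are hypothetical. Real CpG island finders scan a genome with sliding windows, which this sketch does not attempt.

```python
def cpg_metrics(seq: str) -> dict:
    """Return length, GC content and observed/expected CpG ratio of a sequence."""
    s = seq.upper()
    n = len(s)
    if n == 0:
        raise ValueError("sequence must not be empty")
    c, g = s.count("C"), s.count("G")
    cpg = s.count("CG")  # CpG dinucleotides on the given strand
    # Observed/expected ratio; defined as 0 when C or G is absent.
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return {"length": n, "gc_content": (c + g) / n, "cpg_obs_exp": obs_exp}


def looks_like_cpg_island(seq: str, min_length: int = 250,
                          min_gc: float = 0.50, min_obs_exp: float = 0.60) -> bool:
    """Apply the three criteria quoted in the text (thresholds are adjustable)."""
    m = cpg_metrics(seq)
    return (m["length"] > min_length
            and m["gc_content"] > min_gc
            and m["cpg_obs_exp"] >= min_obs_exp)
```

As a toy example, a 300-bp run of alternating C and G satisfies all three criteria, whereas a 300-bp run of alternating A and T satisfies none of the compositional ones.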
Different methods of CIMP identification and different study populations may result in different pathological characteristics. Weisenberger et al (86) showed that the right colon is a high-risk site for CIMP, and that this classification is associated with advanced age in women, with the hallmark mutation of BRAF (V600E), hypermethylation of the MLH1 promoter and loss of TP53. There is ample evidence suggesting that CIMP is closely associated with prognosis, but it remains unclear whether this association is positive or negative. Ogino et al (87) showed that CIMP-H was an independent predictor of low colon cancer-specific mortality. In the stratified analysis, CIMP-H was associated with significantly reduced colon cancer-specific mortality regardless of MSI and BRAF status. However, a more recent meta-analysis (88) involving 15,315 patients with CRC found that CIMP-H CRC was associated with poorer OS, DFS, PFS and recurrence-free survival (RFS) times than CIMP-L/negative CRC. Furthermore, a survival disadvantage was observed in terms of OS, especially in stage III–IV and pMMR tumors. In addition to its prognostic value, the role of CIMP in the prediction of the chemotherapy response is another issue requiring resolution. Much controversy surrounds the efficacy of 5-FU-based chemotherapy against CIMP-positive CRC. Cha et al (89) showed that CIMP was associated with adverse outcomes for patients receiving chemotherapy for mCRC. After following up 302 patients with CRC, Jover et al (90) concluded that CIMP-positive patients did not significantly benefit from 5-FU-based adjuvant chemotherapy, while CIMP-negative patients receiving chemotherapy experienced significantly prolonged DFS times. Iacopetta et al (91) reported the contrasting finding that patients with CIMP-positive CRC can benefit from 5-FU treatment, which was mainly attributed to the association between CIMP positivity and intracellular folic acid metabolism, and to gene silencing caused by DNA methylation.
A recent study claimed that CIMP-positive tumors are potentially more responsive to the topoisomerase-inhibitor, irinotecan (92). Although CIMP has been reported as a potential prognostic biomarker for drug decision-making, overall research on treating CIMP-positive tumors with hypomethylating drugs appears to be limited. Indeed, the lack of a widely accepted CIMP phenotype and the correct stratification of patients with CRC according to the CIMP status are key issues for future CRC trials. Since CIMP was shown to be a tissue-specific phenomenon, DNA methylation information obtained only by array probes or other low-density techniques is far from sufficient, and new analysis methods, such as PacBio single-molecule real-time sequencing or nanopore sequencing are needed (93). Assuming that the CIMP pattern is stable across tumor sections and that epigenetic drug delivery and protection against side effects are improved, optimal antitumor activity is expected for CIMP-positive tumors.
Molecular markers
Some patients benefit from the wide application of molecular targeted therapy, and identifying molecular markers is an important prerequisite for screening patients with CRC who can benefit from targeted drugs.
RAS
The RAS gene is the most common proto-oncogene in human tumors. RAS encodes a group of small proteins homologous to G proteins, called RAS proteins. When a RAS protein is mutated, it cannot complete signal transduction normally, and abnormal cell growth, differentiation and material transport occur, leading to uncontrolled proliferation and carcinogenesis. Importantly, the KRAS mutation rate in CRC is 30–50% (94,95). Sugimoto et al (96) suggested that the progression of laterally spreading tumor-granular, a precancerous lesion of CRC, was closely associated with RAS gene mutations, with RAS mutation rates of up to 54.1%. The predictive significance of RAS mutant (mt) status for the anti-EGFR drug response rate and survival time in patients with mCRC has been confirmed in previous studies (97–100). According to the NCCN guidelines, all patients with mCRC should have tumor tissue genotyped for RAS (KRAS, NRAS) and BRAF mutations (6,7). A negative association between RAS mutations and patient survival after resection of metastases has previously been reported (101). Importantly, patients with any known KRAS (exon 2 or non-exon 2) or NRAS mutation should not be treated with cetuximab or panitumumab, while other targeted therapies, such as bevacizumab, can still be used. It has been established that patients with mCRC carrying WT RAS can benefit from anti-EGFR therapy with prolonged OS and PFS times (102–104). In an analysis of patients with stage III MSS CRC who received FOLFOX chemotherapy plus or minus cetuximab, KRAS mutation was a marker of a poor prognosis (105). The RAS status was also included in the CMS classification established in 2015 (9). CMS analysis of patients with KRAS (exon 2) WT mCRC in the FIRE-3 study showed that the CMS3 and CMS4 subgroups responded significantly better to cetuximab than to bevacizumab (106).
BRAF
BRAF is an important transduction factor in the RAS/RAF/MEK/ERK/MAPK pathway downstream of EGFR, and regulates various physiological processes of cell growth, differentiation and apoptosis. The mutation rate of BRAF in CRC is ~10% (107), and BRAF mutations are associated with the proximal colon and MSI (81,94,108). BRAF mt is often associated with a poor prognosis (109–111). A study demonstrated that patients with stage III BRAF mt CRC have a higher risk of recurrence (112). By contrast, a study by Birgisson et al (113) reported that patients with CRC with both MSI and BRAF (V600E) mutations had a low recurrence rate, while significantly higher recurrence rates were observed in patients with MSI and KRAS mutations. Consistently, Seppälä et al (34) showed that in patients with stage I–II CRC, BRAF (V600E) mutation in conjunction with MSS is negatively associated with quality of life, whereas the favorable prognostic effect of MSI offsets the harmful effects of BRAF (V600E) and yields a positive prognosis. When IHC shows that MLH1 expression is lost but BRAF (V600E) is present, Lynch syndrome is excluded. This may be attributed to the fact that patients with MLH1 promoter methylation generally have BRAF mutations, and that BRAF mutations almost always occur at a single site, V600E. BRAF mutations generally occur in patients with Sp-CRC, but not in patients with Lynch syndrome. The NCCN guidelines recommend that patients with mCRC should be tested for BRAF in tumor tissue (6,7). For patients with RAS WT/BRAF mt mCRC, the guidelines do not recommend anti-EGFR therapy, either as monotherapy or in combination with chemotherapy agents.
The above two markers (RAS and BRAF) have been investigated in depth in previous studies, and PIK3CA and HER2 have also attracted the attention of clinicians. Current evidence suggests that the mutation rate of PIK3CA is 15–20% (114). In 2004, researchers reported a high frequency of PIK3CA mutations in human cancer types such as breast cancer (7.1–35.5%), colorectal cancer (16.9–30.6%), ovarian cancer (33%) and lung cancer (0.6–20%), among others, and subsequent studies identified PIK3CA mutation as a risk factor for numerous types of cancer, including CRC (115). Currently, the American Society for Clinical Pathology, College of American Pathologists, Association for Molecular Pathology and ASCO guidelines (116) on the evaluation of molecular biomarkers of CRC do not recommend routine PIK3CA testing for treatment outside of clinical trials. In vitro cell and animal experiments have shown that some classical PIK3CA inhibitors, such as wortmannin, LY294002 and rapamycin, inhibit tumor growth; however, serious toxic side effects and drug resistance have limited further clinical trials (117,118). In addition, tumor cells treated with PIK3CA inhibitors tend to stop growing rather than undergo apoptosis, allowing tumor cells to develop drug resistance in various ways. It was shown that patients with PIK3CA-mutant CRC had a poor prognosis (119). An increasing body of evidence suggests that the benefit of aspirin in controlling overall CRC mortality may be more significant in PIK3CA-mutated CRC (114,120). HER2 is a proto-oncogene encoding a 185-kDa plasma membrane-bound tyrosine kinase receptor, a member of the EGFR gene family. HER2 amplification/overexpression can be detected in 3–5% of patients with RAS WT mCRC, is mutually exclusive with KRAS, NRAS and BRAF mutations, and is highly consistent between primary and metastatic tumors (121).
HER2 has long been considered a marker of a poor prognosis, but multiple previous studies have shown that HER2 positivity is not significantly associated with prognosis (122–124). Accordingly, no consensus has been reached on its prognostic value. In patients with RAS WT, HER2-positive mCRC, the NCCN guidelines recommend trastuzumab combined with lapatinib or pertuzumab (6,7). In addition, HER2 amplification has been reported as one of the causes of patient resistance to EGFR therapeutics, and HER2 may serve as a negative predictor of anti-EGFR therapy (125). Therefore, testing patients with mCRC for HER2 overexpression before treatment with cetuximab or panitumumab may identify those for whom anti-HER2 therapy is the more reasonable option. More recently, trastuzumab deruxtecan (DS-8201) has shown promising and long-lasting activity in patients with refractory, HER2-positive mCRC, including patients previously treated with HER2-targeted therapy, as preliminarily confirmed in the phase II trial DESTINY-CRC01 (126).
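The marker-based rules described in this section (no cetuximab or panitumumab with any known KRAS/NRAS mutation, no anti-EGFR therapy in RAS WT/BRAF mt disease, and anti-HER2 combinations in RAS WT, HER2-positive mCRC) can be summarized as a simple lookup. The following Python sketch only encodes those statements as written here. The class, field and key names are hypothetical, and the sketch is an illustration of the logic, not a clinical decision tool or a reproduction of any guideline algorithm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TumorMarkers:
    ras_mutated: bool    # any known KRAS or NRAS mutation
    braf_mutated: bool   # BRAF mt, e.g. V600E
    her2_positive: bool  # HER2 amplification/overexpression


def summarize_targeted_options(m: TumorMarkers) -> dict:
    """Encode the marker rules stated in the text as boolean flags."""
    return {
        # Anti-EGFR agents are described only for RAS WT disease without BRAF mt.
        "anti_egfr_supported": not m.ras_mutated and not m.braf_mutated,
        # Anti-HER2 combinations are described for RAS WT, HER2-positive mCRC.
        "anti_her2_supported": not m.ras_mutated and m.her2_positive,
        # HER2 amplification is reported as a possible negative predictor
        # of anti-EGFR benefit; this flag marks that caveat.
        "her2_may_limit_anti_egfr": m.her2_positive,
    }
```

For a hypothetical RAS WT, BRAF WT, HER2-positive tumor, the sketch flags both anti-EGFR and anti-HER2 therapy as supported by the stated rules, together with the caveat that HER2 positivity may limit the benefit of anti-EGFR agents.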
In most cases, a single mutation-based marker is insufficient to explain the heterogeneity among patients with CRC. Combinations of multiple markers may overcome this limitation and refine the molecular map. One study (127) combined BRAF, PIK3CA and RAS testing and increased the proportion of patients benefiting from anti-EGFR treatment from 36.1 to 41.2%. The selection of drugs targeting multiple altered genes in the MAPK pathway is expected to reduce drug resistance and improve the response rate. In addition, these biomarkers can be used to stratify patients. For example, Gil-Raga et al (128) used BRAF (V600E), RAS and MMR status to divide 105 cases of stage I–III CRC into five molecular subtypes to identify differences in prognosis. Furthermore, it is widely thought that CRC is an umbrella diagnosis encompassing numerous rare disease subtypes; as the complexity of molecular markers increases, the combination of different markers emphasizes the significance of comprehensive genetic testing (125,129).
The fundamental significance of molecular typing is to better guide CRC-targeted therapy, prolong DFS, and improve patient prognosis and quality of life. A discussion of molecular typing alone is incomplete, and individualized assessment of disease conditions inevitably takes into account demographic characteristics, clinicopathological characteristics, molecular markers, lifestyle and nutritional factors, and chemical agents. Subsequent classification reports, such as the Jass Classification of CRC (130), CCS Classification (131), Ogino Classification System (132) and Mangi Classification of CRC (133), mostly used CIMP, MSI, CIN, RAS and other markers as prototypes to explore the future direction of neoadjuvant therapy and targeted therapy in the clinical environment.