CIN pathway
The average rate of genomic mutation in normal human cells is estimated to be ~2.5×10⁻⁸ mutations/nucleotide/generation (19,20). However, this rate is higher in cancer cells due to the sequential accumulation of multiple mutations during cell divisions, forming a so-called ‘mutator phenotype’ (21). Accordingly, mutations in MMR genes, genes that regulate cell cycle checkpoints, and/or cellular responses may elevate mutation rates to the level commonly observed in human tumors (21). The ‘mutator phenotype’ may have various manifestations, including point mutations, CIN, MSI, CIMP and LOH (21).
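As a rough illustration of what the quoted rate implies, the following minimal sketch multiplies it by the size of the genome. The haploid genome size of ~3.1×10⁹ nucleotides and the function name `expected_mutations` are assumptions introduced here for illustration and are not taken from the cited studies.

```python
# Rate quoted in the text (mutations per nucleotide per generation).
MUTATION_RATE = 2.5e-8

# Assumption (not from the text): approximate haploid human genome size.
HAPLOID_GENOME_SIZE = 3.1e9


def expected_mutations(rate, genome_size, ploidy=2):
    """Expected number of new mutations per generation across the genome."""
    return rate * genome_size * ploidy


per_generation = expected_mutations(MUTATION_RATE, HAPLOID_GENOME_SIZE)
print(f"Expected new mutations per diploid genome per generation: {per_generation:.0f}")
```

Under these assumptions the estimate is about 155 new mutations per diploid genome per generation; any elevation of the rate in a mutator phenotype scales this figure proportionally.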
CIN appears to be the most common type of genetic instability in CRC, observed in 85% of adenoma-carcinoma transitions (5–7). CIN refers to a high rate of gains or losses of whole chromosomes or large portions of chromosomes. This leads to karyotypic variability from cell to cell, which consequently results in aneuploidy, sub-karyotypic amplification, chromosomal rearrangement, and a high frequency of LOH at tumor suppressor gene loci (5,6). In addition, CIN tumors are recognized by the accumulation of mutations in specific oncogenes, including KRAS proto-oncogene GTPase (KRAS) and B-Raf proto-oncogene serine/threonine kinase (BRAF), and tumor suppressor genes, such as APC and tumor protein p53 (TP53), thereby contributing to CRC tumorigenesis (6,10). The multistep genetic model of colorectal carcinogenesis proposed by Fearon and Vogelstein is now widely accepted, and used as a paradigm for solid tumor progression (12). According to this model, inactivation of APC occurs as the first event, followed by oncogenic KRAS mutations in the adenomatous stage, and eventually, deletion of chromosome 18q and inactivation of the tumor-suppressor gene TP53 on chromosome 17p occur during the transition to malignancy (Fig. 1) (12,22–25).
Array-based comparative genomic hybridization and single nucleotide polymorphism techniques have enabled scientists to effectively determine CNVs in the entire human genome with higher resolution. Although the allelic loss of all chromosomal arms has been detected in certain tumors, its frequency varies considerably, and only a few of them are highly recurrent in CRC, including losses at chromosomal arms 1p, 5q, 8p, 17p, 18p, 18q, 20p and 22q (26–31). A high-frequency allelic loss at a specific chromosomal region denotes the presence of a candidate tumor-suppressor gene, including APC on chromosome 5q, TP53 on chromosome 17p, and DCC netrin 1 receptor (DCC) and SMAD family members 2 and 4 (SMAD2 and SMAD4) on chromosome 18q (31). In contrast, a gain of chromosomal material suggests the presence of potential oncogenes or genes that favor cell growth or survival. In CRC, gains at chromosome 7, and chromosomal arms 1q, 8q, 12q, 13q and 20q have been repeatedly reported by different research groups (26–31). It was reasoned that these chromosomal changes are associated with a gain and loss of function of tumor-associated genes offering mutated cells growth and survival advantages, leading to progressive conversion of normal cells into cancer cells (32,33). However, the gains/losses of chromosomal materials generally span a large region and comprise a large number of genes, making identification of target genes challenging.
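To make the notion of a gain or loss concrete, copy number data from array-based platforms are commonly expressed as the log2 ratio of tumor signal to a diploid reference. The following is a minimal sketch of that arithmetic; the function names and the ±0.3 calling thresholds are hypothetical values chosen for illustration, not cut-offs used in the cited studies.

```python
import math


def log2_ratio(copy_number, reference_copies=2):
    """Log2 ratio of an observed copy number against a diploid reference.

    copy_number must be greater than zero.
    """
    return math.log2(copy_number / reference_copies)


def call_segment(ratio, gain_threshold=0.3, loss_threshold=-0.3):
    """Label a segment as gain, loss or neutral (thresholds are illustrative)."""
    if ratio >= gain_threshold:
        return "gain"
    if ratio <= loss_threshold:
        return "loss"
    return "neutral"


# One extra copy, one lost copy, and an unchanged diploid segment.
for copies in (3, 1, 2):
    ratio = log2_ratio(copies)
    print(copies, round(ratio, 3), call_segment(ratio))
```

In a pure tumor population, a single-copy gain corresponds to log2(3/2) ≈ 0.585 and a single-copy loss to log2(1/2) = −1; admixture with normal cells pulls observed ratios toward zero, which is one reason calling thresholds are set well inside these values.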
In the field of stem cell research, genetic analysis of human embryonic stem cell (hESC) lines, a pluripotent cell type that shares numerous characteristics with cancer cells, has also revealed multiple CNVs, and a few of them are also recurrent, including losses of chromosomal band 18q21qter, and whole or partial gains of chromosomes 1, 12, 17 and 20 (34,35). Notably, 20q11.21 amplification was identified in >20% of the screened hESC lines (36). Previously, BCL2 like 1 (BCL2L1), which is located in the smallest common chromosomal region of gain and regulates the mitochondrial apoptotic pathway, has been confirmed as the key driver gene of this amplification (37,38). Accordingly, the overexpression of Bcl-xL, an anti-apoptotic isoform of BCL2L1, offers cells a survival advantage by preventing apoptosis (37,38). Overexpression of this gene may also be responsible for the gain of 20q in various human cancer types (39).
Losses of 18q
Allelic loss at chromosome 18q is detected in ~70% of primary CRC in the late carcinogenic process (29,31,40,41), and is considered a marker of poor prognosis in patients with CRC (42,43). The high frequency of allelic deletions involving chromosome 18q suggests the presence of candidate tumor-suppressor genes whose inactivation may serve a significant role in CRC, including DCC, SMAD2 and SMAD4 (12,25,44). DCC, located in chromosome band 18q21.2 and encoding a component of the netrin-1 receptor, was proposed as a putative tumor-suppressor gene (45). However, much of the reported data on the loss and inactivation of DCC is circumstantial and fails to provide conclusive evidence that DCC functions as a tumor-suppressor gene (46). Furthermore, to the best of our knowledge, there is no evidence that germline mutations of DCC serve a role in heritable cancer, and few somatic mutations in DCC have been reported in CRC (46). The presence of two other well-established tumor suppressor genes, SMAD2 and SMAD4, in the region of loss also challenges the function of DCC as a tumor-suppressor gene (47,48). In fact, SMAD2 and SMAD4 genes are localized in 18q21.1, the common region of loss of 18q in CRC (25). These SMAD genes encode downstream signal transducers for transforming growth factor-β (TGF-β), and their alterations may confer resistance to TGF-β and contribute to tumorigenesis (49). SMAD4 was identified to be inactivated in ~60% of pancreatic cancer (50). However, the frequency of SMAD4 and SMAD2 somatic mutations is relatively low in CRC (51–53). Moreover, smaller regions of loss, which exclude SMAD2 and SMAD4, have been reported in head and neck squamous cancer (54). In addition, their gene expression is retained in CRC with LOH of 18q (46).
Taken together, these observations suggest that SMAD2 and SMAD4 are unlikely to constitute the major chromosome 18q target for inactivation in CRC, and that other tumor suppressor genes besides the DCC and SMAD genes may be the target for chromosome 18q loss.
APC/β-catenin
Activation of the Wnt signaling pathway via mutation of APC, a multi-functional tumor-suppressor gene on 5q22.2, is an essential and the earliest event in the development of CRC (55). APC protein is a key component of the β-catenin destruction complex, which is involved in the degradation of β-catenin and the suppression of the Wnt/β-catenin signaling pathway (56). Mutant APC disrupts the formation of the destruction complex, leading to stabilization and accumulation of β-catenin protein in the cytoplasm. Accumulated β-catenin protein is translocated to the cell nucleus, where it forms complexes with TCF/LEF and induces overactivation of Wnt downstream effectors that, in turn, promote the proliferation, migration, invasion and metastasis of cancerous cells (57). The same outcome is also observed with mutations in β-catenin (58) and AXIN2 (57), but to a lesser extent. Notably, mutations in AXIN2 have been reported in CRC with MSI only (59).
APC mutations or allelic losses have been identified in ~90% of patients with CRC (60). Germline mutations in APC are responsible for FAP (15), while somatic mutations and/or allelic deletions of APC are described in sporadic CRC (61). The APC gene may also be epigenetically inactivated through promoter hypermethylation that has been identified in 18% of primary colorectal carcinoma and adenoma cases (62).
TP53
TP53 is a tumor-suppressor gene located on the short arm of chromosome 17, which is commonly lost in colorectal carcinoma (40). TP53 has been defined as the ‘guardian of the genome’ because it encodes a transcription factor that regulates the transcription of hundreds of genes involved in different processes, including DNA repair, cell cycle arrest, senescence, apoptosis and metabolism, in response to a variety of stress signals (63). Upon DNA damage, for example, TP53 induces cell cycle arrest at the G1 or G2 phase, or triggers apoptosis when the damage is too severe and irreparable (64). Loss of TP53 function, therefore, contributes to the propagation of damaged DNA to daughter cells.
TP53 alteration is a hallmark of human tumors, and the status of TP53 mutation is associated with the progression and outcome of sporadic CRC (65). Particularly, TP53 loss of function has been reported in 50–75% of CRC cases, a much higher frequency than in adenomas, indicating its role in the transition from adenoma to carcinoma (66,67). To date, the majority of the TP53 mutations reported in CRC are missense mutations that substitute AT for GC (68). Liu and Bodmer (69) analyzed TP53 mutations and their expression in 56 CRC cell lines, and reported a relatively high frequency of TP53 mutations (76.8%), in which missense mutations accounted for 47.83% and point mutations that are transitions at CpG sites accounted for 37.5%. These mutations render an inactive protein with an abnormally long half-life that is detectable by immunohistochemistry (70).
KRAS
The KRAS gene belongs to the RAS gene family involved in signaling pathways that regulate cellular proliferation, differentiation or survival. KRAS is a membrane-bound GTP/GDP-binding protein with intrinsic GTPase activity and is expressed in the majority of human cells. The switch between its active GTP-bound state and the inactive GDP-bound state is regulated by GTPase-activating proteins and guanine nucleotide exchange factors (71). The KRAS mutations impair the intrinsic GTPase activity of KRAS, causing the accumulation of the KRAS proteins at the GTP-bound active state, eventually resulting in the constitutive activation of the downstream proliferative signaling pathways (72).
Oncogenic mutations in the RAS genes have been identified in ~30% of all human tumors (73), in which mutations in KRAS accounted for ~85%, NRAS proto-oncogene GTPase (NRAS) for ~15%, and HRas proto-oncogene GTPase (HRAS) for <1% (74–76). The high frequency of KRAS mutations and their appearance at a relatively early stage in tumor progression suggest a causative role of KRAS in human tumorigenesis. Several studies have reported an association between KRAS mutations and poor prognosis of CRC (77,78), as well as lung (79,80) and liver (81) metastasis. In contrast, several other studies reported that KRAS mutations were strong independent predictors of survival in patients with CRC (80–82). These contradictory findings may be explained by the differences in the distribution of specific KRAS mutations, stage at diagnosis or other characteristics. KRAS mutations have emerged as an important predictive marker of resistance to anti-epidermal growth factor receptor (EGFR) agents, including panitumumab and cetuximab (83–86).
Activating KRAS mutations have been identified in 35–45% of CRC cases (40,80,87–89), and primarily occur in codons 12 and 13 (75,89). The most frequent changes observed in these codons are substitutions of glycine with aspartate (p.G12D, p.G13D) (90). The mutation rates of NRAS, in contrast, are lower (1–3%), and activating mutations of HRAS have not been detected in CRC (40,91,92). Previously, pyrosequencing of KRAS, BRAF and phosphatidylinositol-4,5-bisphosphate 3-kinase catalytic subunit α revealed that 53.8% of patients exhibit a KRAS mutation in codons 12 or 13, of which 57.9% were c.38G>A (p.G13D), and 22.2% were c.35G>T (p.G12V) mutations (93).
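The relationship between the coding-DNA notation (for example, c.35G>T) and the protein notation (p.G12V) used above can be sketched as follows. This is a minimal illustration that handles only single-nucleotide substitutions in KRAS codons 12 and 13; the function names are hypothetical, and the wild-type codons (GGT for codon 12, GGC for codon 13) and the small codon table are supplied here as assumptions rather than taken from the cited studies.

```python
import re

# Subset of the standard genetic code needed for this example only.
CODON_TABLE = {"GGT": "G", "GGC": "G", "GAT": "D", "GAC": "D", "GTT": "V"}

# Assumed wild-type KRAS codons 12 and 13 (both encode glycine).
WILD_TYPE_CODONS = {12: "GGT", 13: "GGC"}


def parse_substitution(hgvs_c):
    """Split a string such as 'c.35G>T' into (position, ref base, alt base)."""
    match = re.fullmatch(r"c\.(\d+)([ACGT])>([ACGT])", hgvs_c)
    if match is None:
        raise ValueError(f"unsupported notation: {hgvs_c}")
    return int(match.group(1)), match.group(2), match.group(3)


def protein_change(hgvs_c):
    """Translate a coding substitution in codon 12 or 13 to protein notation."""
    position, ref, alt = parse_substitution(hgvs_c)
    codon_number = (position - 1) // 3 + 1  # c.34-36 -> codon 12
    offset = (position - 1) % 3             # 0-based index within the codon
    wild_type = WILD_TYPE_CODONS[codon_number]
    if wild_type[offset] != ref:
        raise ValueError("reference base does not match the assumed codon")
    mutant = wild_type[:offset] + alt + wild_type[offset + 1:]
    return f"p.{CODON_TABLE[wild_type]}{codon_number}{CODON_TABLE[mutant]}"


for variant in ("c.35G>A", "c.38G>A", "c.35G>T"):
    print(variant, "->", protein_change(variant))
```

With these assumptions, c.35G>A, c.38G>A and c.35G>T map to p.G12D, p.G13D and p.G12V, respectively, because each substitution alters the second base of a glycine codon.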
MSI
Another type of genomic instability is MSI, a typical characteristic of cancerous cells, occurring in 15–20% of sporadic CRC and in >95% of HNPCC. Microsatellites are repetitive DNA sequences consisting of tandem repeats of units that are usually one to five base pairs long. Patients with the MSI phenotype exhibit a high frequency of replication errors, particularly in repetitive DNA sequences, primarily due to the slippage of the DNA polymerase (94). The progressive insertions/deletions of nucleotides within the microsatellite sequences result in the appearance of longer or shorter alleles compared with those detected in the normal cells of the same individual (95,96).
To assess the MSI status of a cancer, a standard panel of five microsatellite markers, including two mononucleotide (BAT26 and BAT25) and three dinucleotide (D2S123, D5S346, and D17S250) repeats, has been recommended according to the Bethesda Guidelines (97). Tumors are then classified based on the number of microsatellites exhibiting instability. Particularly, tumors are classified as MSI high (MSI-H) when ≥30% of the markers exhibit instability; those with <30% markers exhibiting instability are defined as MSI low, and those with no apparent instability are microsatellite stable (MSS) (97,98).
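The classification rule described above can be sketched in a few lines. This is a minimal illustration under simplifying assumptions: a marker is treated as unstable whenever the tumor shows an allele length absent from the matched normal sample, and the names `is_unstable` and `classify_msi` are hypothetical helpers rather than part of any published tool.

```python
# Five-marker panel named in the text.
BETHESDA_PANEL = ("BAT25", "BAT26", "D2S123", "D5S346", "D17S250")


def is_unstable(normal_alleles, tumor_alleles):
    """Simplified call: the tumor shows an allele length not seen in normal."""
    return bool(set(tumor_alleles) - set(normal_alleles))


def classify_msi(unstable_markers, panel=BETHESDA_PANEL):
    """Classify a tumor as MSI-H, MSI-L or MSS from its unstable markers."""
    unstable = set(unstable_markers)
    unknown = unstable - set(panel)
    if unknown:
        raise ValueError(f"markers not in panel: {sorted(unknown)}")
    fraction = len(unstable) / len(panel)
    if fraction >= 0.30:
        return "MSI-H"
    if fraction > 0:
        return "MSI-L"
    return "MSS"


print(classify_msi([]))                   # no unstable markers
print(classify_msi(["BAT26"]))            # 1 of 5 markers (20%)
print(classify_msi(["BAT25", "BAT26"]))   # 2 of 5 markers (40%)
```

With a five-marker panel, the ≥30% threshold means that two or more unstable markers yield MSI-H, exactly one yields MSI-L, and none yields MSS.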
It is now accepted that MSI is associated with post-replicative DNA MMR deficiency, primarily involving mutL homolog 1 (MLH1) and mutS homolog 2 (MSH2) (94,99–101). Impairment of MMR genes can occur by either mutational inactivation or by epigenetic inactivation through CpG island methylation of the promoter of the genes. Loss or insufficiency of MMR activity leads to replication errors with an increased mutation rate and a higher potential for malignancy. In MSI-H gastric cancer, for example, hypermethylation of MLH1 promoter is responsible for the development of >50% of cases, whereas mutations in MLH1 and MSH2 account for ~15% of cases (102,103).
Small insertions/deletions may create frame-shift mutations within repetitive tracts present in the coding region of essential tumor-suppressor or tumor-associated genes, resulting in an inactive protein and contributing to tumorigenesis in cancers with MSI-H (104). Using a large-scale genomic screen of coding region microsatellites, Mori et al (105) identified nine loci that were mutated in >20% of tumors, namely: transforming growth factor-β receptor 2 (TGFBR2) (79.1%), BCL2 associated X apoptosis regulator (BAX) (37.5%), human mutS homolog 3 (26.2%), activin A receptor, type II (58.1%), SEC63 homolog protein translocation regulator (48.8%), absent in melanoma 2 (47.6%), NADH-ubiquinone oxidoreductase (27.9%), cordon-bleu WH2 repeat protein like 1 (23.8%) and proliferation-associated 2G4/ErbB3-binding protein 1 (20.9%). TGFBR2, encoding a kinase receptor involved in transduction of the TGFB1/2/3 signal from the cell surface to the cytoplasm to inhibit cellular proliferation, is the most commonly affected gene. Particularly, instability in the poly-adenine tract of this gene has been detected in ~85% of MSI-H colorectal tumors, rendering an inactive receptor and thus eliminating the growth-suppressive effects of TGFB1 (106). Another commonly mutated gene in CRC is BAX, a pro-apoptotic gene belonging to the BCL2 family. Frame-shift mutations within the poly-guanine sequence of BAX have been detected in 50% of MSI-H colorectal tumors, causing silencing of this gene and suppressing apoptosis (107). These alterations in gene function represent a possible mechanism for MSI carcinogenesis.
CIMP or aberrant DNA methylation
Transcriptional inactivation by DNA hypermethylation at promoter CpG islands of tumor-suppressor genes, causing gene silencing, is now recognized as an important mechanism in human carcinogenesis (108–111). The CpG island methylator phenotype has been identified in 30–35% of colorectal adenoma cases, and is considered an early event and a characteristic of the serrated pathway of colorectal tumorigenesis (108,112,113). However, the quantitative DNA methylation study performed by Ogino et al (114) reported that CIMP accounts for 17% of CRC, which is less frequent than previously reported, and that the clinical features of CIMP are similar to those of MSI-associated CRC. Notably, sporadic MSI colorectal tumors are almost exclusively associated with CIMP-associated methylation of MLH1, leading to inactivation of this gene (107,115). In contrast, the familial MSI cases (Lynch syndrome) are generally caused by germline mutations in the MMR genes, primarily including MLH1 and MSH2, and account for <5% of all CRC cases (Fig. 2) (107,116).
The CIMP status of CRC is currently assessed by a panel of methylation markers categorizing CRC as exhibiting or not exhibiting DNA methylation on the basis of certain thresholds (114,115,117,118). CIMP+ colorectal tumors appear to have a distinct profile, including associations with the proximal colon, poor differentiation, MSI status, BRAF mutation and wild-type KRAS (113–115,119–121). Particularly, the frequency of BRAF mutations in CIMP+ tumors is significantly higher compared with their CIMP− counterparts (114,115). Shen et al (122) analyzed the genetic and epigenetic alterations in 97 primary CRC samples, and demonstrated that CIMP-high tumors are associated with MSI status (80%) and BRAF mutation (53%); CIMP-low tumors are associated with KRAS mutations (92%); and CIMP− tumors typically have a high rate of p53 mutations (71%) (122). Furthermore, CIMP status has also been indicated to be negatively associated with 18q LOH status in colorectal tumors (117). Particularly, CIMP-0 was associated with 18q LOH-positive tumors and vice versa (117).