Screening for susceptibility genes in hereditary non-polyposis colorectal cancer

In the present study, susceptibility genes for hereditary non-polyposis colorectal cancer (HNPCC) were screened using whole exome sequencing in 3 HNPCC patients from 1 family and single nucleotide polymorphism (SNP) genotyping assays in 96 peripheral blood samples from colorectal cancer patients and control subjects. Peripheral blood was obtained from 3 HNPCC patients from 1 family: the proband and the proband's brother and cousin. High-throughput sequencing was performed using whole exome capture technology. Sequences were aligned against the HapMap, dbSNP130 and 1000 Genomes Project databases. Reported common variations and synonymous mutations were filtered out. Non-synonymous single nucleotide variants in the 3 HNPCC patients were integrated and the candidate genes were identified. Finally, SNP genotyping was performed for these genes in 96 peripheral blood samples. In total, 60.4 Gb of data were retrieved from the 3 HNPCC patients using whole exome capture technology. Subsequently, according to certain screening criteria, 15 candidate genes were identified. Among the 96 samples that were SNP genotyped, 92 were successfully genotyped for all 15 gene loci, while genotyping for HTRA1 failed in 4 sporadic colorectal cancer patient samples. In 12 control subjects and 81 sporadic colorectal cancer patients, genotypes at 13 loci were wild-type, namely DDX20, ZFYVE26, PIK3R3, SLC26A8, ZEB2, TP53INP1, SLC11A1, LRBA, CEBPZ, ETAA1, SEMA3G, IFRD2 and FAT1. The CEP290 genotype was mutant in 1 sporadic colorectal cancer patient and was wild-type in all other subjects. A total of 5 of the 12 control subjects and 30 of the 81 sporadic colorectal cancer patients had a mutant HTRA1 genotype. In all 3 HNPCC patients, the same mutant genotypes were identified at all 15 gene loci. Overall, 13 potential susceptibility genes for HNPCC were identified, namely DDX20, ZFYVE26, PIK3R3, SLC26A8, ZEB2, TP53INP1, SLC11A1, LRBA, CEBPZ, ETAA1, SEMA3G, IFRD2 and FAT1.


Introduction
Hereditary non-polyposis colorectal cancer (HNPCC), also known as Lynch syndrome, is inherited as an autosomal dominant disease and is the most common hereditary colorectal cancer, accounting for ~50% of familial colorectal cancer and 3% of all colorectal cancer cases (1). Unlike sporadic colorectal cancer, HNPCC is associated with specific genetic factors and significant clinicopathological features. These features are often associated with synchronous and metachronous colorectal cancer and cause a high incidence of extraintestinal malignant tumors, including endometrial, gastric, renal, pancreatic and ovarian cancer types (2). Inactivation of DNA mismatch repair (MMR) genes, including MLH1, MSH2, MSH6 and PMS2, is the molecular genetic basis of HNPCC pathogenesis. Mutation of MMR genes can result in loss of DNA MMR function, leading to aberrant DNA replication, increased spontaneous mutation frequency and microsatellite instability. This ultimately leads to the transformation of normal cells into malignant cells (3-5).
However, a previous study observed that certain HNPCC patients, diagnosed by the presence of MMR gene mutations, did not meet some of the clinical diagnostic criteria for HNPCC (6). Furthermore, in certain patients meeting the clinical diagnostic criteria for HNPCC, MMR gene mutations could not be detected (7,8). Bashyam et al (8) demonstrated that, among 48 patients with Lynch syndrome, only 58% had MMR gene expression defects, which indicated that other, as yet unidentified, causative genes may be involved in the pathogenesis of HNPCC.
Based upon this assumption, in the present study, whole exome sequencing was performed in 3 HNPCC patients from 1 family and unreported mutations were observed in 15 gene loci. Subsequently, peripheral blood was collected from control subjects, sporadic colorectal cancer patients and the aforementioned 3 HNPCC patients. Single nucleotide polymorphism (SNP) genotyping assays were also performed on the aforementioned 15 genes using the DNA MassARRAY Genetic Analysis system to further verify whether these genes were associated with HNPCC pathogenesis.

Materials and methods
Blood sample collection. All procedures in studies involving human participants were performed in accordance with the ethical standards of the institutional and/or national research committee and the 1964 Declaration of Helsinki and its later amendments or comparable ethical standards. All patients signed informed consent forms prior to participation in the study and the study was approved by the Third Xiangya Hospital Ethics Committee (Changsha, China). Blood samples were collected from 96 subjects, including 12 control subjects, 81 sporadic colorectal cancer patients who were diagnosed by histopathology from January 2014 to December 2016 at the Third Xiangya Hospital of Central South University and 3 HNPCC patients from the same hospital who met the Amsterdam Criteria (9), which are outlined as follows: i) ≥3 colorectal cancer cases in the same family diagnosed by histopathology, one case being a first-degree relative (parent or sibling) of the other two cases; ii) ≥2 successive generations affected; iii) ≥1 case with onset prior to the age of 50 years; and iv) familial adenomatous polyposis in HNPCC patients should be excluded. In the HNPCC family investigated in the present study, the proband's father had colorectal cancer that was diagnosed by histopathology and the other 2 cases who provided samples were a sibling and a cousin of the proband. The 3 patients experienced changes in their stools and abdominal bloating prior to being hospitalized. Colorectal cancer was diagnosed by histopathology (all pathology diagnoses were confirmed by two deputy or chief director pathologists) following radical surgery (Table I). The pedigree of the HNPCC family is presented in Fig. 1.
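The four Amsterdam Criteria listed above amount to a checklist that must be satisfied in full. The following Python sketch is illustrative only: the `Family` record, its field names and the example values are hypothetical and are not taken from the study data.

```python
from dataclasses import dataclass


@dataclass
class Family:
    """Hypothetical pedigree summary; all field names are illustrative."""
    confirmed_crc_cases: int      # histopathologically confirmed colorectal cancer cases
    has_first_degree_link: bool   # one case is a first-degree relative of two other cases
    successive_generations: int   # number of successive generations affected
    youngest_onset_age: int       # age (years) at onset of the earliest case
    fap_excluded: bool            # familial adenomatous polyposis has been ruled out


def meets_amsterdam_criteria(family: Family) -> bool:
    """Return True only if criteria i) to iv) are all satisfied."""
    return (
        family.confirmed_crc_cases >= 3          # criterion i), case count
        and family.has_first_degree_link         # criterion i), relationship
        and family.successive_generations >= 2   # criterion ii)
        and family.youngest_onset_age < 50       # criterion iii)
        and family.fap_excluded                  # criterion iv)
    )


# Made-up example values, not the family described in this study.
example_family = Family(
    confirmed_crc_cases=4,
    has_first_degree_link=True,
    successive_generations=2,
    youngest_onset_age=45,
    fap_excluded=True,
)
print(meets_amsterdam_criteria(example_family))
```

Because the criteria are conjunctive, failing any single item (for example, an earliest onset at 50 years or later) is enough for the family not to qualify under this definition.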
Whole exome sequencing. DNA was extracted from the peripheral blood of the 3 HNPCC cases and purified using a DNeasy Blood and Tissue kit (cat. no. 69506; Qiagen, Inc., Valencia, CA, USA) according to the manufacturer's protocols. Exome sequences were subjected to DNA sequencing on the Illumina platform using Illumina PE Flow Cell v3-HS (Illumina, Inc., San Diego, CA, USA). In accordance with the manufacturer's protocols, genomic DNA fragments were processed by end repair, addition of adenosine (A) to 3' ends, ligation, DNA enrichment and hybridization. DNA libraries from samples were constructed. The concentration, purity and size of the libraries were measured using an Agilent 2100 Bioanalyzer (Agilent Technologies Inc., Santa Clara, CA, USA) and a Qubit® 2.0 Fluorometer (Thermo Fisher Scientific, Inc., Waltham, MA, USA). The hybridization of sequencing primers and the generation of clusters were performed using cBot (HiSeq 2500; Illumina, Inc.) following the cBot User Guide (Part #15006165; Rev. F; Illumina, Inc.). Paired-end sequencing was then performed on a cluster-containing flow cell following the manufacturer's protocols (HiSeq 2500; Illumina, Inc.). Data acquisition software (Illumina, Inc.) was used for quality control and data analysis. The quality control standards for sequencing results were as follows: The average coverage for an exon region was ~100 times; if the average coverage was <90 times, the sample was resequenced; and at 100 times coverage, ≥85% of exon regions were covered by ≥1 sequence (Table II). The Burrows-Wheeler Alignment software package (version 0.5.9; Shanghai Biotechnology, China) was used to map sequences using human hg19 as the reference genome. Potential PCR duplicates were removed using rmdup of Samtools-0.1.18 (Shanghai Biotechnology, China), and mapping statistics were generated using Samtools flagstat (Shanghai Biotechnology, China) (Table III). Capture-enrichment methods were used to determine the amount of fragment from the captured target region and the coverage and depth of the target region.
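The coverage-based quality control rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the software used in the study: the function names, the per-position depth list used as input and the "fail_breadth" label are hypothetical, and the text does not state what action was taken when the ≥85% breadth standard was not met.

```python
from typing import Sequence, Tuple


def coverage_summary(depths: Sequence[int]) -> Tuple[float, float]:
    """Return (mean depth, fraction of positions covered by >= 1 read)."""
    if not depths:
        raise ValueError("depths must not be empty")
    mean_depth = sum(depths) / len(depths)
    fraction_covered = sum(1 for d in depths if d >= 1) / len(depths)
    return mean_depth, fraction_covered


def qc_decision(
    depths: Sequence[int],
    min_mean_depth: float = 90.0,        # below this average coverage, resequence
    min_fraction_covered: float = 0.85,  # required share of target covered by >= 1 read
) -> str:
    """Apply the two thresholds from the text to one sample."""
    mean_depth, fraction_covered = coverage_summary(depths)
    if mean_depth < min_mean_depth:
        return "resequence"
    if fraction_covered < min_fraction_covered:
        return "fail_breadth"  # hypothetical label; the source does not name this outcome
    return "pass"


# Toy depths for a 10-position target region (made-up numbers).
print(qc_decision([120, 110, 95, 100, 105, 98, 0, 130, 101, 99]))
```

The thresholds are exposed as parameters so that the same check could be reused if a different capture kit or target depth were chosen.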
Shrimp alkaline phosphatase (SAP) purification. The total volume for the SAP purification reaction was 2 µl. This included 1.53 µl ddH2O, 0.17 µl SAP Buffer and 0.3 µl SAP enzyme (Sequenom, San Diego, CA, USA). Reaction conditions were 37°C for 40 min and 85°C for 5 min.
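As a quick arithmetic check, the three SAP reaction components sum to the stated 2 µl, and the per-reaction volumes can be scaled to a plate of samples. The Python sketch below is illustrative: the `scale_mix` helper and its `overage` parameter (extra volume to allow for pipetting loss) are assumptions and are not described in the protocol.

```python
from typing import Dict

# Per-reaction volumes in microliters, as stated in the text.
SAP_MIX_UL: Dict[str, float] = {
    "ddH2O": 1.53,
    "SAP Buffer": 0.17,
    "SAP enzyme": 0.30,
}


def scale_mix(
    per_reaction_ul: Dict[str, float],
    n_reactions: int,
    overage: float = 0.0,  # hypothetical fractional excess, e.g. 0.1 for 10% extra
) -> Dict[str, float]:
    """Scale per-reaction volumes to a master mix for n_reactions."""
    if n_reactions < 1:
        raise ValueError("n_reactions must be at least 1")
    factor = n_reactions * (1.0 + overage)
    return {name: round(vol * factor, 3) for name, vol in per_reaction_ul.items()}


# The components should add up to the stated 2 µl reaction volume.
total_ul = sum(SAP_MIX_UL.values())
print(f"Total per reaction: {total_ul:.2f} µl")

# Master mix for the 96 samples genotyped in the study, with no overage.
print(scale_mix(SAP_MIX_UL, 96))
```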
Extension reaction. The extension reaction was performed using a 9700 PCR instrument (Sequenom, Inc., San Diego, CA, USA) according to the manufacturer's protocol. The Complete iPLEX® Gold Genotyping Reagent Set was purchased from Sequenom. The total volume of the extension reaction was 2 µl and included 0.619 µl ddH2O, 0.2 µl iPLEX GOLD Buffer, 0.2 µl iPLEX Termination mix, 0.94 µl iPLEX Extension Primer mix and 0.041 µl iPLEX Enzyme. Extension reaction conditions were 40 cycles of 94°C for 30 sec and 94°C for 5 sec, and 5 cycles of 52°C for 5 sec and 80°C for 5 sec, followed by 1 cycle of 72°C for 3 min. The PCR products were purified using resin, spotted onto a chip and analyzed on the MassARRAY Platform SEQUENOM Analyzer 4 (Sequenom, Inc.).
The Q20 value refers to the probability of error assigned to an identified base in the base calling process. If the quality value is Q20, the probability of an incorrect base call is 1%; that is, the error rate is 1% and the accuracy is 99%. The 1x value refers to the likelihood that a position in the genome sequence is covered by at least one read.
Filtered reads, number of reads that pass filtering with sequenator; mapped reads, number of reads that map to each reference sequence; map ratio, ratio of mapped reads to filtered reads; unique mapped reads, number of reads that can map to each reference sequence after removing potential polymerase chain reaction duplicates using the Samtools rmdup tool; unique mapped ratio, ratio of unique mapped reads to mapped reads.
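The definitions above translate directly into simple formulas. The sketch below uses the standard Phred relationship, in which a quality value Q corresponds to an error probability of 10^(-Q/10), together with the two ratios as defined in the text. The function names and the read counts in the example are hypothetical and are not values from Table II or Table III.

```python
def phred_to_error_probability(q: float) -> float:
    """Convert a Phred quality value Q to the base-calling error probability."""
    return 10 ** (-q / 10)


def map_ratio(mapped_reads: int, filtered_reads: int) -> float:
    """Ratio of mapped reads to reads that passed filtering."""
    if filtered_reads <= 0:
        raise ValueError("filtered_reads must be positive")
    return mapped_reads / filtered_reads


def unique_mapped_ratio(unique_mapped_reads: int, mapped_reads: int) -> float:
    """Ratio of reads remaining after duplicate removal to all mapped reads."""
    if mapped_reads <= 0:
        raise ValueError("mapped_reads must be positive")
    return unique_mapped_reads / mapped_reads


# Q20 corresponds to a 1% error rate, i.e. 99% base-call accuracy.
print(f"Q20 error probability: {phred_to_error_probability(20):.2%}")

# Made-up read counts for illustration only.
filtered, mapped, unique = 1_000_000, 950_000, 855_000
print(f"Map ratio: {map_ratio(mapped, filtered):.3f}")
print(f"Unique mapped ratio: {unique_mapped_ratio(unique, mapped):.3f}")
```

Each 10-point increase in Q lowers the error probability tenfold, so Q30 corresponds to an error rate of 0.1%.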

Discussion
HNPCC is the most common hereditary colorectal cancer and exhibits familial aggregation; it is often accompanied by synchronous and metachronous colorectal cancer. The incidence of extraintestinal malignant tumors in HNPCC patients was previously revealed to be significantly higher than that in normal subjects (2). MMR gene defects are the molecular genetic basis of HNPCC pathogenesis, and ~90% of MMR gene mutations occur in the hMSH2 and hMLH1 genes (10). However, in certain patients who meet the clinical diagnostic criteria for HNPCC, MMR gene defects cannot be detected (11,12).
In the present study, 3 HNPCC cases underwent whole exome sequencing. Mutations were newly identified at 15 gene loci. These 15 genes were investigated using an SNP genotyping assay in 96 subjects, including HNPCC patients, sporadic colorectal cancer patients and control subjects. The 15 loci carried the same mutations in all 3 HNPCC patients. However, in the 12 control subjects and 81 sporadic colorectal cancer patients, genotypes were wild-type at 13 of the 15 gene loci, indicating that mutations in these 13 genes may be associated with HNPCC pathogenesis. A number of these 13 genes have been revealed to be associated with the development and progression of malignant tumors (13-28), autoimmune diseases, tuberculosis and other infectious diseases (28), and sperm differentiation (29). However, the consequences of mutations in these 13 genes have not previously been reported in the pathology of colorectal cancer.
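The selection logic described above (a locus is retained when it is mutant in every HNPCC patient and wild-type in every control subject and sporadic patient) can be sketched as follows. The mutant-carrier counts are those reported in this study; the data structure and function name are hypothetical, and the sketch does not model the 4 samples in which HTRA1 genotyping failed.

```python
from typing import Dict, List

WILD_TYPE_ELSEWHERE = [
    "DDX20", "ZFYVE26", "PIK3R3", "SLC26A8", "ZEB2", "TP53INP1", "SLC11A1",
    "LRBA", "CEBPZ", "ETAA1", "SEMA3G", "IFRD2", "FAT1",
]

# Number of mutant-genotype carriers per group for each locus.
MUTANT_COUNTS: Dict[str, Dict[str, int]] = {
    gene: {"hnpcc": 3, "sporadic": 0, "control": 0} for gene in WILD_TYPE_ELSEWHERE
}
MUTANT_COUNTS["CEP290"] = {"hnpcc": 3, "sporadic": 1, "control": 0}
MUTANT_COUNTS["HTRA1"] = {"hnpcc": 3, "sporadic": 30, "control": 5}


def candidate_loci(counts: Dict[str, Dict[str, int]], n_hnpcc: int) -> List[str]:
    """Loci mutant in all HNPCC patients and wild-type in all other subjects."""
    return [
        locus
        for locus, c in counts.items()
        if c["hnpcc"] == n_hnpcc and c["sporadic"] == 0 and c["control"] == 0
    ]


candidates = candidate_loci(MUTANT_COUNTS, n_hnpcc=3)
print(len(candidates), candidates)
```

Under this rule HTRA1 and CEP290 are excluded because mutant genotypes were also observed outside the HNPCC family, which matches the 13 loci retained in the text.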
The results of the present study revealed that certain sporadic colorectal cancer patients and control subjects carry mutations in the HTRA1 gene. The expression level of the HTRA1 gene is associated with the prognosis of various types of malignant cancer, including liver cancer, breast cancer and mesothelioma (30,31). Additionally, 1 of the 81 sporadic colorectal cancer patients in the present study carried a mutation in the CEP290 gene that was also present in colorectal cancer patients from the HNPCC family. However, there have been no reports of a correlation between CEP290 mutations and the pathogenesis of malignant tumors. Future studies will further verify whether HTRA1 and CEP290 are susceptibility genes for HNPCC by expanding sample sizes.
In the present study, 13 genes that may be susceptibility genes for HNPCC were identified by whole exome sequencing and SNP genotyping experiments. In the future, studies will focus on large-scale genetic screening and in vivo and in vitro experiments in order to investigate the mechanisms of the confirmed mutations in the development and progression of colorectal cancer. It is anticipated that more pathogenic genes will be discovered and that our understanding of the molecular genetic basis of HNPCC will be improved, thereby providing theoretical guidance for the diagnosis and treatment of HNPCC.

Table I.
Clinical characteristics of 3 hereditary non-polyposis colorectal cancer patients.

Table II.
Sequence quality control results for 3 hereditary non-polyposis colorectal cancer patients.

Table III.
Sequence mapping information for 3 hereditary non-polyposis colorectal cancer patients.

Table IV.
Names and sequences of polymerase chain reaction primers.

Table VI.
Single nucleotide polymorphism genotyping results at 15 gene loci.