Fifty-four novel mutations in the NF1 gene and integrated analyses of the mutations that modulate splicing

Neurofibromatosis type 1 (NF1) is a common autosomal dominant genetic disorder caused by mutations in the NF1 gene. One of the hallmarks of NF1 is the high mutation rate in this gene. In this study, we present 127 different NF1 mutations, 54 of which are novel, detected at both the genomic DNA and mRNA level in a retrospective case series review. We found that 25.2% of these different mutations induced aberrant splicing. Of note, 40.6% of these splicing errors were caused by exonic variants. In addition, one mutation produced mosaicism in the post-transcriptional profile. However, studies investigating these splicing aberrations are limited. To better understand the pathogenicity of NF1 and to provide a more accurate interpretation in molecular diagnostic testing, combined computational analyses were employed to elucidate the mechanisms by which these variants modulate NF1 gene splicing.


Introduction
Neurofibromatosis type 1 (NF1) (OMIM 162200) is a progressive autosomal dominant inherited disease and is one of the most widespread genetic disorders worldwide, with a prevalence of 1 in 2,500 to 3,000 live births (1). The clinical characteristics in the NF1 diagnostic criteria include café-au-lait spots, neurofibromas, Lisch nodules, intertriginous freckling, typical osseous lesions and optic pathway gliomas (2). At least 78% of patients who fulfill the NIH diagnostic criteria for NF1 have NF1 gene mutations (3). Moreover, 5-10% of the cases are caused by a deletion in the NF1 gene (4). However, the positive rates of NF1 mutation findings in clinical diagnostic laboratories vary considerably according to the proportions of samples from clinically definite or suspected patients. NF1 is located on 17q11.2 and spans 282,751 bp in length. This gene contains 60 exons and encodes neurofibromin, a key component in the RAS-MAPK signaling pathway. The RAS-MAPK pathway regulates the proliferation and differentiation of neuronal cells and myocytes (5). Neurofibromin functions as an inhibitor of RAS activation and as a tumor suppressor with a central region that is homologous to RAS-GTPase activation proteins (GAPs) (6). Mutations in the NF1 gene cause a loss in neurofibromin function, resulting in downstream cell growth activation (7)(8)(9). Previous studies have reported over 1,400 different mutations due to the high mutation rate of the NF1 gene. A high number of these mutations arise as novel mutations; however, there is no hot spot for the pathogenic variations of NF1 (3,10,11).
In this study, we performed a retrospective review of 378 cases and compiled the mutations identified at both the genomic and mRNA level. We present 127 different mutations of the NF1 gene; 54 of which are novel mutations. In addition, the deletion of the NF1 gene was detected in 5 cases using fluorescence in situ hybridization (FISH) or the comparative genomic hybridization (CGH) array method. With the advent of the genomic DNA and cDNA sequencing approach, splicing abnormalities caused by exonic variants were captured and presented in our data. In addition, 7 of these 13 exonic mutations were novel. Of note, one of these mutations, c.3362A>G, produced mosaicism of a point mutation and mutant exon skipping at the mRNA level.
Accurate splicing of pre-mRNA is not only controlled by the 5'/3' splice sites (ss), but also by other cis-acting elements, as well as trans-acting factors, i.e., SR proteins and heterogeneous nuclear ribonucleoproteins (hnRNPs). These cis-acting elements generally include the splicing enhancers related to exon-inclusion enhancement, splicing silencers related to exon-inclusion inhibition, the intronic branch point and the polypyrimidine tract (12,13). Although the NF1 mutation spectrum continues to expand, studies investigating these splicing aberrations are limited (14)(15)(16)(17)(18). Thus, integrated analyses using the bioinformatics tools were further applied to provide insight into the mechanisms of these splicing defects caused by exonic variants, as well as other intronic variants at non-consensus splice sites.

Patients.
A total of 378 cases referred for NF1 gene testing in our laboratory from January 2006 to May 2013 were recruited in this study. The subjects consisted of 338 unrelated probands with a clinically definite or suspected NF1 diagnosis and 40 family members. Consent forms were signed by the patients or authorized representatives. All cases underwent an NF1 gene-sequencing test developed in our laboratory, which was approved by the Ethics Committees at the University of Oklahoma Health Sciences Center, Oklahoma City, OK, USA.
Mutation screening by Sanger sequencing. Genomic DNA was isolated from peripheral blood samples of the patients using the QIAamp DNA Mini kit (Qiagen, Valencia, CA, USA). mRNA was isolated from the peripheral blood samples using the QIAamp RNA Blood Mini kit (Qiagen). First-strand cDNA was reverse-transcribed using the SuperScript III Reverse Transcriptase kit and random primers (both from Invitrogen, Carlsbad, CA, USA). PCR was performed using specific primers targeting the mRNA coding region of the NF1 gene. For confirmation, exon-specific genomic DNA sequencing was also performed using specific primers. Primer information will be provided upon request. Sanger sequencing was performed using the BigDye Terminator v3.1 Cycle Sequencing kit (Life Technologies, Foster City, CA, USA) and an ABI 3130xl genetic analyzer (Life Technologies). Sequences were analyzed using Mutation Surveyor software (SoftGenetics, State College, PA, USA).
In silico analysis. Splice Site Prediction by Neural Network (SSPNN; www.fruitfly.org/seq_tools/splice.html) and the Human Splicing Finder (HSF; www.umd.be/HSF/) were used to investigate the mechanisms of the splicing abnormalities caused by the mutations at the non-consensus splice sites. HSF contains its own programs and other prediction platforms, including the exonic splicing enhancer (ESE) finder (http://rulai.cshl.edu/cgi-bin/tools/ESE3/esefinder.cgi?process=home), RESCUE-ESE (http://genes.mit.edu/burgelab/rescue-ese), FAS-ESS (http://genes.mit.edu/fas-ess), Putative Exonic Splicing Enhancers/Silencers (PESX) designed by Zhang and Chasin (30) and splicing silencer motifs designed in the study by Sironi et al (19). This assessment system comprised 2 sets (HSF and SSPNN) to examine the potential splice sites, 1 set (HSF) for the potential branch points, 4 sets (PESX, RESCUE-ESE, ESE finder and HSF) for the ESE and 3 sets (PESX, FAS-ESS and splicing silencer motifs) for the exonic splicing silencer (ESS). The query sequences were obtained from the normal and mutated sequences. SSPNN, HSF and ESE finder provided scores that quantify the strength of the splicing-related sequence motifs using the corresponding weight matrices. PolyPhen2 (http://genetics.bwh.harvard.edu/pph2) and SIFT (http://sift.bii.a-star.edu.sg) were applied to predict the potential effect of an amino acid substitution on the structure and function of the NF1 protein. BLASTP was used to align the NF1 protein sequences from multiple species, including human, chimpanzee, gorilla, cat, dog, mouse, rat, cattle, chicken and zebrafish.
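As an illustration of how paired query sequences can be prepared for tools of this kind, the following is a minimal sketch that applies a single-base substitution to a reference sequence and returns matching wild-type and mutant windows. The function name `make_query_pair`, the `flank` parameter and the toy sequence are hypothetical and are not taken from this study; the sketch does not reproduce the input format required by any specific prediction server.

```python
def make_query_pair(reference, index, ref_base, alt_base, flank=10):
    """Return (wild_type_window, mutant_window) around one substitution.

    `index` is a 0-based position in `reference` (an illustrative
    convention, not HGVS numbering).
    """
    reference = reference.upper()
    if reference[index] != ref_base:
        # Guard against an off-by-one between the variant and the sequence.
        raise ValueError(
            f"expected {ref_base} at index {index}, found {reference[index]}"
        )
    mutant = reference[:index] + alt_base + reference[index + 1:]
    start = max(0, index - flank)
    end = min(len(reference), index + flank + 1)
    return reference[start:end], mutant[start:end]


# Hypothetical toy sequence; it is NOT an NF1 sequence.
toy_reference = "CTTGACAGGTAAGTCCTATGCA"
wild_type, mutant = make_query_pair(toy_reference, 7, "G", "A", flank=5)
print(wild_type)
print(mutant)
```

Checking the reference base before substituting is a cheap safeguard, since a mismatch usually signals that the variant coordinate and the sequence are not in the same numbering scheme.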
Real-time PCR. The alternative post-transcriptional profiles in one specific case were examined using the SYBR Master Mix (Life Technologies) and ABI PRISM 7000 Sequence Detection System (Life Technologies). The ACTB gene encoding β-actin was used for normalization. For ACTB, the forward primer was 5'-AGCTCCTCCCTGGAGAAGAG-3' and the reverse primer was 5'-AGCACTGTGTTGGCGTACA-3'. For NF1, the forward primer was 5'-GATGTAAAATGTCTTACAAG-3' and the reverse primer was 5'-CTGCCACCTGTTTGCGCACT-3'. Amplicons targeting the NF1 and ACTB genes were confirmed using Sanger sequencing. Real-time PCR was performed in triplicate using 5, 2.5, 1.25 and 0.625 ng cDNA.
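The text does not state which quantification model was applied to the real-time PCR data. As one common possibility, the sketch below computes relative expression with the comparative Ct (2^-ΔΔCt) approach, normalizing the NF1 amplicon to ACTB and comparing a sample with a control. The function name and all Ct values are hypothetical, and the calculation assumes roughly 100% amplification efficiency for both amplicons, which a dilution series such as the one described above could be used to check.

```python
from statistics import mean


def relative_expression(target_ct_sample, reference_ct_sample,
                        target_ct_control, reference_ct_control):
    """Comparative Ct (2^-ddCt) estimate from replicate Ct values.

    Assumes both amplicons double every cycle (100% efficiency).
    """
    # Normalize the target amplicon to the reference gene in each specimen.
    delta_sample = mean(target_ct_sample) - mean(reference_ct_sample)
    delta_control = mean(target_ct_control) - mean(reference_ct_control)
    # Compare the normalized sample with the normalized control.
    return 2 ** -(delta_sample - delta_control)


# Hypothetical triplicate Ct values, for illustration only.
fold = relative_expression(
    target_ct_sample=[26.0, 26.0, 26.0],      # NF1 amplicon, sample
    reference_ct_sample=[18.0, 18.0, 18.0],   # ACTB, sample
    target_ct_control=[25.0, 25.0, 25.0],     # NF1 amplicon, control
    reference_ct_control=[18.0, 18.0, 18.0],  # ACTB, control
)
print(fold)
```

With these made-up values the NF1 amplicon crosses threshold one cycle later in the sample than in the control after normalization, which corresponds to half the relative amount under the stated efficiency assumption.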

Results
Mutation spectrum. Mutational screening of the NF1 gene was performed on samples obtained from 378 individuals clinically diagnosed with or suspected of having NF1. The mutation nomenclature was based on the NCBI reference NM_000267.3. The exon numbering followed the conventional rule used in the NF1 testing community and previous literature (10,11,17,18). The mutations were checked against Biobase (HGMD professional version database) and the Leiden Open Variation Database (LOVD) to determine recurrence. In addition, the missense mutations were searched in the 1000 Genomes project, dbSNP and the Exome Variant Server to rule out normal variants. NF1 mutations were identified in 169 out of 378 cases; 127 different mutations were observed (Tables I and II). Of these mutations, 54 were novel, of which 23 were frameshift mutations, 10 were splicing defects, 7 were nonsense mutations and 14 were missense mutations (Table III). The mutations affected almost all exons apart from exons 4c, 14, 23.1, 35, 38 and 49 in the mutation spectrum of the patients, which was consistent with the finding of no hot spot mutations in previous studies (3,10,11). The nonsense mutations were the most common molecular defects found in this study (33/127), followed by the splice-site mutations (32/127) and missense mutations (27/127). In the group of frameshift mutations, deletions (n=23) were more frequent than insertions/duplications (n=8) and indels (n=4), although most of the insertions/duplications and all of the indels were novel mutations.
Only 50% of the splicing defects disrupted the conserved GT/AG or AT/AC dinucleotides of the splice sites in this study (Table II). By contrast, 3 intronic mutations at non-consensus splice sites (Table IV, subgroup I) generated cryptic 5'ss or 3'ss, resulting in insertions into the mRNA. Thirteen exonic variants, accounting for 40.6% of the splicing defects, induced exon skipping or aberrant exons instead of a point mutation, as shown by the genomic DNA and cDNA sequencing methods. Among these exonic mutations, c.1466A>G and c.1885G>A (Table IV, subgroup II) induced aberrant exons with a deletion by generating cryptic 5'ss or 3'ss, respectively. In particular, the 1185G>A mutation was a silent mutation at the genomic DNA level. The other 4 exonic sequence alterations, which resulted in exon skipping, were identified as substitutions at the last nucleotide position of exons 8, 20 and 29, and the penultimate nucleotide of exon 11 (Table IV, subgroup III). The remaining 7 exonic mutations, which caused exon skipping, did not perturb the natural 3'/5'ss or create cryptic splice sites (Table IV, subgroup IV).
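The proportions reported for the mutation spectrum can be re-derived from the counts given in the text. The short sketch below does this arithmetic; the variable names and the `pct` helper are introduced here for illustration only, and every count is taken from the figures stated above.

```python
def pct(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Counts of different mutations by class, as stated in the text.
nonsense, splicing, missense = 33, 32, 27
frameshift = 23 + 8 + 4  # deletions + insertions/duplications + indels
total = nonsense + splicing + missense + frameshift

# Novel mutations: frameshift + splicing + nonsense + missense.
novel = 23 + 10 + 7 + 14

# Unusual splicing defects: exonic variants plus intronic variants
# at non-consensus splice sites.
exonic, intronic_nonconsensus = 13, 3
unusual = exonic + intronic_nonconsensus

print(total, novel)              # class counts add up to the totals
print(pct(splicing, total))      # share of mutations affecting splicing
print(pct(exonic, splicing))     # share of splicing defects that are exonic
print(pct(unusual, splicing))    # share not at conserved dinucleotides
```

The counts are internally consistent: the four classes sum to 127 different mutations and the novel mutations sum to 54, and the three percentages match the 25.2%, 40.6% and 50% quoted in the text.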
Of note, one exonic mutation, c.3362A>G, produced mosaicism of E1121G and exon 20 skipping at the mRNA level (Fig. 1). This germline mutation in the proband was inherited from his father, and the post-transcriptional mosaicism was present in the mRNA of both patients. However, the missense mutation of the son showed a lower signal intensity using Sanger sequencing (Fig. 2). The mosaicism of the post-transcriptional profile was further examined using real-time PCR with cDNA obtained from this family (Fig. 3). The sample from the mother was used as the wild-type sample in this assay. The skipping rate of the mutant exon in the sample from the son was higher compared to the rate in the sample from the father.
In silico analysis. To better understand the underlying mechanisms of these 16 unusual splicing errors, 7 computational tools were employed to examine how these mutations affected the splicing-related sequence motifs. The predictions for the normal and mutated sequences were compared (Table IV). SSPNN predicted a deletion of the authentic acceptor site (3'ss) caused by 2410-18C>G, and a decrease in the strength of the authentic splice sites caused by 1260+3A>G and 5944-5A>G in subgroup I. By contrast, the cryptic splice sites generated by these 3 mutations and the subgroup II mutations were given high scores by HSF or SSPNN. Substitutions in subgroup III all abolished the authentic donor sites (5'ss) in the SSPNN evaluation, whereas HSF predicted a reduction in their strength. No cryptic splice sites were predicted in these substitution sequences or in the subgroup IV sequences. Further investigation of other splicing-related sequence motifs revealed that some cis-acting elements were altered in these mutated sequences. In the subgroup I sequences, the branch-point motif was deleted and a putative ESE was generated by 2410-18C>G; a putative ESS or an abolished ESS was predicted to accompany the strength-decreased donor site or acceptor site in the mutated sequences of 1260+3A>G and 5944-5A>G, respectively. In the subgroup II sequences, the ESE was deleted or the strength of the ESE was decreased simultaneously with the strong new splice site generated by 1466A>G and 1885G>A, respectively. The original ESE was deleted or no ESE was embedded in the subgroup III sequences. The architectural alterations of the splicing-regulatory elements were more complex in the subgroup IV sequences. In general, a decreased strength of the ESE or a decreased ESE/ESS ratio was observed. An abolished ESS was also predicted in mutated exons 7 and 37, but no abolished ESE was predicted.

Discussion
NF1 is a multisystem genetic disorder with extreme diversity of clinical expression (3,10,11,20,21,22). A clear correlation of genotype and phenotype has been previously demonstrated in only two types of mutations (23). Patients with an NF1 microdeletion have more severe clinical characteristics (4,23). We found 5 cases with an NF1 gene deletion in patients with various severe conditions, such as developmental delay, seizures, and early onset skin/subcutaneous tissue disorders. One 4-year-old patient harboring a novel splice site mutation (60+1delG) also exhibited developmental delay. This alteration, which induced skipping of exon 1, may cause the same effect as an NF1 deletion at the protein level. Another mutation, c.2970-2972delAAT, in exon 17, which was not found in our study, has been reported to be associated with the absence of cutaneous neurofibromas (11,24). However, due to the lack of detailed clinical information, our investigation of the correlation between genotype and phenotype was limited. The present study focused largely on the mutation spectrum, particularly the splicing errors.
The mutation rate of the NF1 gene is one of the highest reported in the human genome. We presented a mutation spectrum of the NF1 gene with 127 different variants in this study. In the summarized data, exons 19b, 25 and 29 harbored more missense mutations, exons 4b, 10a and 23.2 appeared to have more nonsense mutations, and the highest rate of frameshift mutations was found in exon 28. In addition, exons 7, 20 and 37 were prone to splicing errors. A total of 54 of the 127 different mutations were novel mutations that were categorized as frameshift mutations, splicing defects, nonsense mutations or missense mutations. With the exception of missense mutations, the other 3 types of mutations are considered highly likely to be deleterious. Patients with NF1 with a missense mutation have a lower incidence of multiple neurofibromas and plexiform neurofibromas than patients with other types of mutation; however, no evidently milder NF1 phenotype has been distinctly associated with missense mutations (23). The in silico analysis of the functional consequence of these novel missense mutations was performed using the PolyPhen2 and SIFT programs. A total of 12 of these 14 novel missense mutations were predicted to be potentially damaging. Although P1830L and A1952V were predicted to be benign, these mutations and the other novel missense mutations altered amino acids that were conserved across the different species according to the results obtained from an orthologous alignment using BLASTP. It is widely believed that mutations in a highly conserved region may result in a functional change. Moreover, we found 2 secondary missense variants, M645V and I1658V, that accompanied nonsense mutations. These 2 variants can be treated as neutral polymorphisms following the parental study.
In the present study, 25.2% of the different mutations induced aberrant splicing and 50% of these splicing errors were caused by exonic mutations and intronic mutations at non-consensus splice sites. These results are consistent with previous findings showing that the NF1 gene is susceptible to post-transcriptional splicing errors (14,15). Accurate splice site recognition is critical in pre-mRNA splicing (25). This process is a coordinated program involving strong splice sites, correct splicing regulatory elements (SREs) embedded in the genome and the associated proteins. The bioinformatics assessments in this study focused on revealing how the authentic splice sites and the hidden sequence motifs of these SREs were disrupted by the exonic variants and the intronic variants at non-consensus splice sites. We found that the changes in these splice sites were coupled with alterations of the ESE, ESS and other cis-acting elements, which together resulted in aberrant splicing. The error-prone splicing occurred in subgroups I and II, in which the high-score cryptic 5'/3'ss or the next AG/GT with a higher score, as in the case of 1260+3A>G, replaced the strength-decreased or lower-scored authentic sites in the new microenvironment of splicing regulatory elements. For example, the 2410-18C>G mutation generated a higher-score cryptic splice site while forming a putative ESE and abolishing the original branch point, and 17 base pairs were then inserted into the mRNA as the consequence of a strong cryptic splice site in coordination with the gain and/or loss of SREs. Importantly, the 2410-16A>G, 2410-15A>G and 2410-12T>G mutations have been previously documented in the Biobase HGMD database. Together with 2410-18C>G, this cluster suggests that the region near the intron 15/exon 16 boundary is particularly prone to splicing aberrations.
In some cases, the disruption of ESE elements was the principal cause of the splicing error as a consequence of the reduced splicing enhancement activity (16,26,27). Not all substitutions at the last nucleotide position of an exon cause exon skipping, even though such mutants are also predicted to delete the authentic splice site or decrease its strength. The loss of ESE motifs may explain why exon skipping was caused by the exonic mutations in subgroup III. An exception was exon 29, which had no ESE motif. Exon 29 skipping resulted from the weakened donor site caused by the mutation 5546G>A and the lack of an ESE to support splice-site recognition.
There was no cryptic splice site generated and no strength reduction of the authentic splice sites in the subgroup IV sequences. However, the acceptor site strength has been characterized as an important and sensitive parameter in splice-site recognition (28). The low-score exons, such as exons 7 and 37, were observed to have more splicing defects in this study. In addition, our in silico analysis showed more complex changes in the ESE and ESS. Not only did the ESE show a decrease in strength, but a weaker ESS, a decreased ESE/ESS ratio and an abolished ESS were also predicted to be architectural weaknesses in exon definition, resulting in exon exclusion in the subgroup IV sequences. The ESE/ESS ratio is important for exon recognition and intron identification in the complexity of splicing. Several studies have observed a higher density of ESEs in exons compared to introns, and vice versa for ESSs (12,29,30). A decrease in the ESE/ESS ratio disturbs the delicate balance of SREs and predisposes to exon exclusion. Although enhancers and silencers have apparently opposite effects, as suggested by their names, the concept of Composite Exonic Regulatory Elements of Splicing (CERES) has been proposed as accumulating evidence suggests that ESEs and ESSs share additional properties (31,32). The finding of a weaker or abolished ESS caused by the mutations in our analysis reflected the overlapping function of these 2 elements. The SREs became more critical in determining splice-site recognition in these cases.
When the trans-acting factors navigate the new landscape in which the mutation plays a make-or-break role, the consequences of the competition and coordination in the splicing process are more evident in the case with the 3362A>G mutation, in which the germline mutation decreased the ESE/ESS ratio. The splicing in the mutated microenvironment resulted in the missense mutation and exon skipping coexisting, at different rates, in the mRNA of an 11-year-old boy and his father. Furthermore, real-time PCR confirmed this mosaicism. Both patients have multiple café-au-lait spots but no other markedly different NF1-related clinical features. The higher skipping rate found in the sample of the son may be caused by individual genetic variability.
Taken together, this study presents 54 novel NF1 mutations and reveals the high frequency of unusual splicing defects in the pathogenicity of NF1. Integrated analyses using bioinformatics tools provided insight into the underlying mechanisms of these splicing errors. In particular, as the NF1 gene was susceptible to aberrant splicing caused by exonic variants, including silent mutations such as 1185G>A, this result underscores the importance of NF1 gene testing at both the genomic and mRNA levels. In addition, the mutation data may contribute information to the ongoing development of antisense therapeutics for NF1 caused by intronic mutations (33), thus shedding light on a targeted treatment to restore NF1 gene function.