Estimation of copy number aberrations: Comparison of exome sequencing data with SNP microarrays identifies homozygous deletions of 19q13.2 and CIC in neuroblastoma
- Published online on: January 19, 2016 https://doi.org/10.3892/ijo.2016.3349
- Pages: 1103-1116
Introduction
The childhood cancer neuroblastoma (NB) is a tumor of the sympathetic nervous system. Patients show a high degree of clinical and biological heterogeneity, ranging from highly aggressive tumors with fatal outcome, even after intense multimodal treatment, to spontaneous regression despite metastatic disease. Tumors with whole chromosome gains or losses and near-triploid karyotypes generally have a good prognosis, while tumors with segmental rearrangements and near-diploid or near-tetraploid karyotypes are associated with a poor prognosis. Consequently, analysis of chromosomal aberrations is an important tool that, together with age at diagnosis, presence of metastases, tumor differentiation and histological grade, is used for patient stratification and to determine therapeutic strategy (1). Recurrent genomic alterations of clinical importance include loss of chromosome 1p, 3p, 4p and 11q, gain of 1q, 2p and 17q, and amplification of the MYCN oncogene; 11q-deletion and MYCN-amplification are both strongly associated with aggressive disease (2–4). Techniques commonly used in clinical routine for detection of segmental and numerical aberrations include karyotyping, fluorescence in situ hybridization (FISH), comparative genomic hybridization (CGH) microarrays and multiplex ligation-dependent probe amplification (MLPA) (5,6). However, cancer diagnostics is moving towards tests that comprehensively characterize the genomic alterations of individual tumors, such as next generation sequencing (NGS), in order to detect actionable targets. Depending on settings, NGS can provide information on different kinds of therapeutically relevant alterations such as single nucleotide variants, indels and translocations. In addition, by analyzing the distribution of read depth between a tumor sample and a normalization control, it is possible to identify gains, losses, amplifications and homozygous deletions (7–10).
Thus, NGS provides an attractive tool in clinical diagnostics of neuroblastoma, enabling simultaneous assessment of copy number alterations alongside detection of point mutations (e.g., activating mutations in the ALK oncogene) that could aid in the choice of therapy. In order to investigate the accuracy of genomic alterations characterized from exome sequencing, exome data for 30 neuroblastoma tumors were compared with corresponding genomic profiles generated from established Affymetrix high resolution SNP-microarrays, the gold standard for copy number detection in neuroblastoma in many laboratories.
Materials and methods
Samples and microarray analysis
Tumors from Swedish patients were collected after either written or verbal consent was obtained from parents/guardians, according to ethical permits approved by the local ethics committee (Karolinska Institutet and Karolinska University Hospital, registration numbers 03–736 and 2009/1369). Clinical information is described in Table I. The tumors were histologically assessed for tumor cell content using adjacent tissue, and genomic DNA was isolated from fresh frozen tumor or blood using the DNeasy blood and tissue kit (Qiagen, Hilden, Germany) according to the manufacturer's protocol. DNA concentration and purity were assessed through fluorometric analysis and absorbance measurements, respectively.
Microarray analysis of 30 neuroblastoma tumors was performed using either Affymetrix 50K or 250K gene mapping arrays or CytoScan HD (Affymetrix Inc., Santa Clara, CA, USA), containing 59,015, 262,338 and 2,822,125 probes, respectively (corresponding to ~50, 10 and 1 kb average probe spacing). Handling of the microarrays has been described previously (2,11). Primary data analysis was performed using GDAS software (Affymetrix) with in silico normalization against control samples from healthy individuals. Genomic position annotations were based on the hg19 build (http://genome.ucsc.edu/) of the human genome.
Exome sequencing and CNV-analysis
Exome sequencing was performed on DNA from 30 neuroblastoma tumors and corresponding constitutional blood from 14 of these patients using Agilent SureSelect All Exon 50Mb V3 or V4 (Agilent, Santa Clara, CA, USA) according to the supplier's protocol, followed by paired-end sequencing (2×100 bp or 2×75 bp) on Illumina HiSeq2000 or HiScan SQ (Illumina, San Diego, CA, USA). The sequencing was performed on three separate occasions at SciLife laboratory, Stockholm, Sweden and the Genomics Core facility at University of Gothenburg, Sweden, with a median raw coverage of 91X, 127X and 340X for the respective batches (Table I). After fastq trimming with prinseq 0.20.3, reads were aligned against the reference genome (hg19) using BWA-0.5.10, followed by realignment and recalibration using GATK-2.5-.-gf57256b. Variant calling was performed through SNPeff followed by Annovar annotation.
Copy number alterations were generated with the bioinformatic tool Control-FREEC (control-FREE Copy Number Caller) (10), which uses the normalized distribution of aligned reads on a window-by-window basis to determine differences in coverage between tumor and normal. Neuroblastoma tumors were compared against either constitutional DNA from blood of the corresponding patient or that of other controls. The ratios generated with Control-FREEC were visualized using the statistical software R. We constructed a web-based Shiny application (12) that runs with R in the background. The Shiny application for visualizing Control-FREEC profiles is available at https://malinost.shinyapps.io/CNPupload, and the source code at https://gist.github.com/malinost/324f77309eb103147747. The application imports the ratios from an uploaded Control-FREEC output file and visualizes the results either chromosome by chromosome or for the whole genome in one figure. In the single chromosome mode the values can also be colored according to, e.g., copy number or genotype. In order to detect segmental changes and breakpoints, segmental averages must be extracted and the jumps that correspond to gains or losses detected; this is done by applying the Fused Lasso Signal Approximator (FLSA) (13) to the ratios. Different settings adjust how strongly the variation in the ratios affects the FLSA lines. In the whole genome setting, ratios are plotted with different colors for different chromosomes, in proportion to chromosome size. Results of CNV detection from exome sequencing were compared to SNP-microarray profiles through visual annotation in order to determine the performance of the respective platforms.
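To illustrate how FLSA turns noisy window ratios into segment averages and breakpoints, the following is a minimal Python sketch. It is not the path algorithm of reference (13) nor the code of the Shiny application; it minimizes the same one-dimensional objective, 0.5·Σ(y_i − b_i)² + λ1·Σ|b_i| + λ2·Σ|b_i − b_(i+1)|, with a simple projected-gradient iteration on the dual problem and then applies soft-thresholding for λ1. The function names, the parameter values and the example ratios are assumptions chosen for illustration.

```python
def _primal(y, u):
    """Recover the fitted values x = y - D^T u from the dual variables u."""
    n = len(y)
    return [y[j] - (u[j - 1] if j > 0 else 0.0) + (u[j] if j < n - 1 else 0.0)
            for j in range(n)]


def flsa_1d(y, lam2, lam1=0.0, n_iter=5000):
    """Approximate 1D fused lasso fit (hypothetical helper, not the flsa package).

    lam2 penalizes differences between neighbouring values (fusion),
    lam1 shrinks the fitted values towards zero (use 0 for raw ratios).
    """
    n = len(y)
    u = [0.0] * max(n - 1, 0)   # one dual variable per neighbouring pair
    step = 0.25                 # safe step size for the difference operator
    for _ in range(n_iter):
        x = _primal(y, u)
        # Gradient step on the dual, then projection onto [-lam2, lam2].
        u = [max(-lam2, min(lam2, u[i] + step * (x[i + 1] - x[i])))
             for i in range(n - 1)]
    x = _primal(y, u)
    # The lam1 > 0 solution is the soft-thresholded lam1 = 0 solution.
    return [(1 if v > 0 else -1) * max(abs(v) - lam1, 0.0) for v in x]


def breakpoints(x, tol):
    """Indices where a new segment starts, i.e. the fit jumps by more than tol."""
    return [i + 1 for i in range(len(x) - 1) if abs(x[i + 1] - x[i]) > tol]


# Illustrative ratios: six windows near 1.0 followed by six near 0.5 (a loss).
ratios = [1.0, 1.02, 0.98, 1.0, 1.01, 0.99, 0.5, 0.51, 0.49, 0.5, 0.52, 0.48]
fit = flsa_1d(ratios, lam2=0.1)
print(breakpoints(fit, tol=0.05))
```

A larger lam2 fuses more neighbouring windows and suppresses small jumps, which corresponds to the adjustable settings mentioned above; the fitted segment levels are slightly pulled towards each other, so they should be read as smoothed averages rather than exact means.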
Results
Ratio profiles of tumor-control read coverage were generated using Control-FREEC and visualized with the Shiny application. Breakpoints for structural rearrangements were recorded through FREEC, and the plotted profiles were compared to profiles generated from SNP microarrays. Visual annotation shows that the two methods have a high degree of concordance for larger alterations (Fig. 1). Both platforms detected a large number of imbalances, including gains, amplifications, deletions and homozygous deletions, as well as whole chromosome gains and losses, in the thirty analyzed tumor samples. The genomic positions recorded for copy number changes show close consistency between exome-generated and SNP-microarray profiles (Fig. 1), with breakpoints commonly separated by <1 Mb. Of the four deviations recorded when comparing copy number variation from SNP-microarray and exome sequencing analysis, two were located in the ATRX gene, where samples NBL28R8 and NBL10R8 showed loss of exons 3–8 and exons 1–15, respectively, and one was located at chromosome 19: 42.75–42.93 Mb from the p-terminal end (pter) in sample NBL13E6 (Table II). However, probe density in these regions is low in the 50K and 250K arrays, and thus the smaller deletions detected through read coverage plots are below the resolution of the arrays used in this study. One additional deviation between the two methods was seen for sample NBL10R8, where copy number profiling through exome and SNP-microarray analysis both indicate 2p-gain. Intriguingly, the two methods show very different 2p gain breakpoints; exome analysis shows gain of the entire chromosome 2p arm (92.2 Mb from pter), while microarrays show a gain ending at 48.2 Mb from pter (Fig. 2).
Additional aberrations of specific interest included one tumor showing chromothriptic features of chromosomes 2, 5 and 7 causing amplification of MYCN, TERT and CDK6, and three tumors showing loss of CDKN2A (Fig. 1).
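The comparison in this study was made by visual annotation. As a hedged illustration of how the <1 Mb criterion could be applied programmatically, the sketch below pairs breakpoints from the two platforms on one chromosome when they lie within a tolerance. The function name and the greedy matching strategy are assumptions; only the 2p breakpoints of NBL10R8 (92.2 and 48.2 Mb) are taken from the text.

```python
def match_breakpoints(exome_bp, array_bp, tol=1_000_000):
    """Greedily pair breakpoints (in bp) from two platforms on one chromosome.

    Returns (matched pairs, exome-only breakpoints, array-only breakpoints).
    Hypothetical helper for illustration; not part of Control-FREEC.
    """
    remaining = sorted(array_bp)
    matched, exome_only = [], []
    for pos in sorted(exome_bp):
        # Closest still-unmatched array breakpoint, if any.
        best = min(remaining, key=lambda a: abs(a - pos), default=None)
        if best is not None and abs(best - pos) <= tol:
            matched.append((pos, best))
            remaining.remove(best)
        else:
            exome_only.append(pos)
    return matched, exome_only, remaining


# 2p gain in NBL10R8: end of gain at 92.2 Mb (exome) vs 48.2 Mb (microarray).
matched, exome_only, array_only = match_breakpoints([92_200_000], [48_200_000])
print(matched, exome_only, array_only)
```

With the default tolerance the two NBL10R8 breakpoints are 44 Mb apart and therefore remain unmatched, which is the kind of deviation flagged above.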
Through the copy number analysis of exome sequencing we observed that three tumors (NBL4E1, NBL6E9 and NBL13E6) carried smaller deletions causing homozygous loss in chromosomal region 19q13.2. The shortest region of overlap (SRO) of the deletions is delimited by sample NBL13E6, which shows loss of chr19:42,749,732–42,931,020 containing the genes ERF, CIC, PAFAH1B2, PRR19, TMEM145, MEGF8, CNFN, LIPE-AS1 and LIPE (Fig. 3 and Table I). A minor heterozygous deletion was observed in sample NBL12E3, also at 19q13.2 but distal (43.34–43.53 Mb from pter) to the deletions in the three samples with homozygous loss at 19q. Of the genes located within the region delimited by sample NBL13E6, rare variants (i.e., not present in the 1000 Genomes database, the Exome Variant Server or our in-house database) were detected in MEGF8 (p.P195L, NBL13E5) and in LIPE (p.G627S, NBL47R4; p.A351V, NBL19R6). All three variants were predicted by SIFT to be non-deleterious.
Tumor-specific alterations, such as rearrangements, could be used to monitor residual disease and treatment response. Although MYCN amplification is common in neuroblastoma, the exact start and end points of the amplicons are unique for each patient and thus require precise characterization in each tumor. Exome sequencing shows that neuroblastoma samples NBL14R2, NBL13E5 and NBL19R6 display complex amplicon structures consisting of multiple 2p regions in addition to the genomic region containing the MYCN gene; NBL14R2: 2.88–3.14 Mb, 14.88–15.09 Mb and 15.58–16.38 Mb from pter; NBL13E5: 15.92–16.40 Mb and 29.28–30.25 Mb from pter; NBL19R6: 15.68–15.91 Mb, 15.960–15.965 Mb, 16.04–17.10 Mb and 17.13–17.33 Mb from pter. The breakpoints provide several tumor- and patient-specific junction sites useful for follow-up of minimal residual disease. The high-level amplification seen in MYCN-amplified tumors produces a large number of off-target reads within the amplified regions that can be used to deduce specific breakpoints and fusions. From the exome sequencing data we were able to retrieve at least one unique junction site for 12 of 14 NB tumors (86%) with MYCN-amplification. We could also identify unique junction sites due to deletions in two patients: NBL28R8, with a deletion in ATRX on the X chromosome where 76.94 Mb joined 76.96 Mb, and NBL13E6, with a small interstitial chromosome 19 deletion fusing position 42.75 Mb with 42.93 Mb.
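The shortest region of overlap is the intersection of the deleted intervals. The sketch below shows that calculation in Python under stated assumptions: only the NBL13E6 coordinates come from the text, while the two other intervals are hypothetical placeholders (the coordinates for the other samples are not given here), chosen so that they contain the NBL13E6 deletion as the text implies.

```python
def shortest_region_of_overlap(deletions):
    """Intersect deletion intervals given as {sample: (start, end)} in bp.

    Returns (start, end) of the common region, or None if there is none.
    Hypothetical helper for illustration.
    """
    start = max(s for s, _ in deletions.values())
    end = min(e for _, e in deletions.values())
    return (start, end) if start <= end else None


deletions_chr19 = {
    "NBL13E6": (42_749_732, 42_931_020),          # from the text
    "hypothetical_1": (42_600_000, 43_100_000),   # placeholder, not real data
    "hypothetical_2": (42_700_000, 43_000_000),   # placeholder, not real data
}
print(shortest_region_of_overlap(deletions_chr19))
```

Because both placeholder intervals fully contain the NBL13E6 deletion, the intersection equals the NBL13E6 interval, mirroring the statement that this sample delimits the SRO.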
Discussion
Next generation sequencing has recently entered clinical practice in the evaluation of cancer development and progression. Its use ranges from targeted cancer gene panels to whole genome sequencing. In neuroblastoma there are few recurrent mutations in well-established cancer genes besides ALK (14–17), and thus approaches beyond cancer gene panels are required for exploring the mutational landscape. Although sequencing at the whole genome level is reaching affordable levels for clinical utility, bioinformatic handling and data storage are still a bottleneck for many laboratories. Thus, exome sequencing provides an attractive approach for identification of protein-changing mutations for theranostic purposes in neuroblastoma. In this context it is also important to note that the cost of exome sequencing has dropped dramatically in the last five years.
The ability to use exome data for detection of segmental and whole chromosome alterations is crucial, as the genomic profile of a neuroblastoma tumor is highly important for the choice of treatment regimen and indicates the prognosis for the patient. In order to analyze the utility of exome-based copy number profiling in neuroblastoma, we performed a visual annotation of genomic profiles and compared exome- and SNP-microarray-generated profiles of thirty neuroblastoma samples.
The Shiny application for visualization of normalized coverage ratios, developed by us and presented here, allows a rapid overview of the genomic profiles. Side-by-side comparison of profiles generated by the two methods shows a high degree of concordance, indicating that exome-generated copy number profiles could be used for clinical interpretation. Using exome sequence data we readily detected high-level amplifications as well as hetero- and homozygous deletions of gene-containing regions (Figs. 1 and 3). We were also able to detect smaller deletions of gene-containing regions that were not recorded in the SNP-microarray profiles due to inadequate probe density, which shows the power of exome sequencing. However, the opposite is likely to hold for imbalances occurring outside the targeted regions. When analyzing SNP-microarray profiles, a minimal number of probes must be required for a call in order to avoid false positives, and similar considerations apply to exome-generated profiles, e.g., by using a sliding window approach. Control-FREEC applies a sliding window approach in several steps: first, the raw copy number profile (CNP) is calculated by counting reads in non-overlapping windows. After the raw CNPs are normalized against a normal sample, a Lasso-based segmentation algorithm is applied (18).
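The first two steps described above (counting reads in non-overlapping windows, then normalizing tumor against normal) can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not Control-FREEC's implementation: it ignores GC-content, mappability and exome target boundaries, scales by the median ratio, and uses hypothetical function names and toy counts.

```python
from statistics import median


def window_counts(read_starts, chrom_length, window):
    """Count reads whose start position falls in each non-overlapping window."""
    n_windows = (chrom_length + window - 1) // window
    counts = [0] * n_windows
    for pos in read_starts:
        if 0 <= pos < chrom_length:
            counts[pos // window] += 1
    return counts


def normalized_ratios(tumor, normal, min_normal=1):
    """Tumor/normal ratio per window, scaled so that the median ratio is 1.

    Windows with fewer than min_normal reads in the normal sample are
    returned as None because no reliable ratio can be formed.
    """
    raw = [t / n if n >= min_normal else None for t, n in zip(tumor, normal)]
    valid = [r for r in raw if r is not None]
    if not valid or median(valid) == 0:
        raise ValueError("no usable windows to normalize against")
    scale = median(valid)
    return [None if r is None else r / scale for r in raw]


# Toy counts: the third window has half the tumor coverage (a one-copy loss
# in a diploid background), the last window is not covered in either sample.
tumor_counts = [100, 100, 50, 100, 0]
normal_counts = [100, 100, 100, 100, 0]
print(normalized_ratios(tumor_counts, normal_counts))
```

Ratios produced this way are the kind of input a segmentation step would then smooth into segments; requiring a minimum read count per window plays a role comparable to the minimal probe number used for microarrays.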
Besides the resolution of smaller deletions in gene-containing regions, only one major deviation was seen when comparing the two methods: in sample NBL10R8, gain of chromosomal region 2p is detected by both methods, although with different end points (Fig. 2). Tracing the source of the DNA indicated that DNA extraction was performed on separate occasions and that the difference in 2p breakpoints is likely due to tumor heterogeneity.
Interestingly, we were able to detect three cases of homozygous deletions at 19q causing loss of multiple genes, including the CIC gene (Fig. 3). This gene encodes the HMG-box transcriptional repressor Capicua. Capicua is a key sensor of multiple receptor tyrosine kinases (RTK), repressing RTK-responsive genes in the absence of activating signaling, and is involved in various biological processes including neuroblast differentiation (19). CIC point mutations, deletions or truncating mutations are seen in the majority of oligodendrogliomas (20), and Capicua has also been implicated in other malignancies such as breast, colorectal and prostate cancer (21,22). Allelic loss of the 19q region and recurrent mutations of CIC are a common feature of oligodendroglioma and are frequently seen in combination with allelic loss of 1p and/or deletions of the far upstream element (FUSE) binding protein 1 gene FUBP1 on 1p31.1 or mutations of the isocitrate dehydrogenase genes (IDH1/2) (20). However, allelic loss of FUBP1 is only seen in one of the samples with homozygous 19q deletion (NBL13E6), and no novel protein-changing variant could be detected in IDH1/2, FUBP1 or CIC in our set of 30 neuroblastoma tumors. If and how CIC deficiency contributes to cancer progression in neuroblastoma tumors requires further study. A heterozygous deletion was also observed distal to the CIC locus, at 43.34–43.53 Mb from pter, in sample NBL12E3. However, this particular region contains several pregnancy-specific glycoprotein (PSG) genes that have previously been shown to harbor various copy number polymorphisms.
Besides identifying genomic copy number alterations, we could detect tumor-specific junctions in 86% of the MYCN-amplified tumors by using the boundaries from off-target reads in amplified regions. As these junctions are likely to be highly specific for each tumor, and by definition not present in normal DNA, they could be used to monitor disease through analysis of circulating tumor DNA.
Collectively, we show that copy number profiles generated from normalized coverage of exome sequencing are easily interpreted through the web-based Shiny application, with a resolution similar to that of the Affymetrix 250K SNP-arrays. The extended use of exome sequencing beyond variant and indel calling is of particular interest in neuroblastoma tumor biology, as genomic profiling is of utmost importance in the clinical evaluation of these tumors. Compared to other methods for copy number analysis, exome sequencing has the advantage that it can also identify protein-changing events that can be used for gene-targeted therapy, such as the ALK-specific inhibitor crizotinib.
Acknowledgements
The authors wish to thank the Genomics and Bioinformatics Core Facility platforms at Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden for access and assistance with instrumentation and analysis. This study was supported with grants from the Swedish Childhood Cancer Foundation (S.F.; NBCNSPDHEL10/021, NCp2015-0061, TJ2014-0064. T.M.; PR2013-0102), the Swedish Cancer Foundation (T.M.; 14-0342), The Swedish state under the LUA/ALF agreement (ALFGBG-447171), Project grant from Laboratory division Sahlgrenska University Hospital, Lions Cancerfond Väst, The Selma Anderson Foundation, Fondkistan, Assar Gabrielssons Foundation, Sahlgrenska University Hospital Foundations, the Swedish Research Council (T.M./S.F.; 2014–3031), the Swedish Foundation for Strategic Research (T.M.; RB13-0204).
References
Cohn SL, Pearson AD, London WB, Monclair T, Ambros PF, Brodeur GM, Faldum A, Hero B, Iehara T, Machin D, et al: INRG Task Force: The International Neuroblastoma Risk Group (INRG) classification system: An INRG Task Force report. J Clin Oncol. 27:289–297. 2009.
Carén H, Kryh H, Nethander M, Sjöberg RM, Träger C, Nilsson S, Abrahamsson J, Kogner P and Martinsson T: High-risk neuroblastoma tumors with 11q-deletion display a poor prognostic, chromosome instability phenotype with later onset. Proc Natl Acad Sci USA. 107:4323–4328. 2010.
Janoueix-Lerosey I, Schleiermacher G, Michels E, Mosseri V, Ribeiro A, Lequin D, Vermeulen J, Couturier J, Peuchmaur M, Valent A, et al: Overall genomic pattern is a predictor of outcome in neuroblastoma. J Clin Oncol. 27:1026–1033. 2009.
Cetinkaya C, Martinsson T, Sandgren J, Träger C, Kogner P, Dumanski J, Díaz de Ståhl T and Hedborg F: Age dependence of tumor genetics in unfavorable neuroblastoma: arrayCGH profiles of 34 consecutive cases, using a Swedish 25-year neuroblastoma cohort for validation. BMC Cancer. 13:231. 2013.
Combaret V, Iacono I, Bréjon S, Schleiermacher G, Pierron G, Couturier J, Bergeron C and Blay JY: Analysis of genomic alterations in neuroblastoma by multiplex ligation-dependent probe amplification and array comparative genomic hybridization: A comparison of results. Cancer Genet. 205:657–664. 2012.
Ambros IM, Brunner B, Aigner G, Bedwell C, Beiske K, Bénard J, Bown N, Combaret V, Couturier J, Defferrari R, et al: A multilocus technique for risk evaluation of patients with neuroblastoma. Clin Cancer Res. 17:792–804. 2011.
Sathirapongsasuti JF, Lee H, Horst BA, Brunner G, Cochran AJ, Binder S, Quackenbush J and Nelson SF: Exome sequencing-based copy-number variation and loss of heterozygosity detection: Exome CNV. Bioinformatics. 27:2648–2654. 2011.
Krumm N, Sudmant PH, Ko A, O'Roak BJ, Malig M, Coe BP, Quinlan AR, Nickerson DA and Eichler EE: NHLBI Exome Sequencing Project: Copy number variation detection and genotyping from exome sequence data. Genome Res. 22:1525–1532. 2012.
Valdés-Mas R, Bea S, Puente DA, López-Otín C and Puente XS: Estimation of copy number alterations from exome sequencing data. PLoS One. 7:e51422. 2012.
Boeva V, Popova T, Bleakley K, Chiche P, Cappo J, Schleiermacher G, Janoueix-Lerosey I, Delattre O and Barillot E: Control-FREEC: A tool for assessing copy number and allelic content using next-generation sequencing data. Bioinformatics. 28:423–425. 2012.
Carén H, Erichsen J, Olsson L, Enerbäck C, Sjöberg RM, Abrahamsson J, Kogner P and Martinsson T: High-resolution array copy number analyses for detection of deletion, gain, amplification and copy-neutral LOH in primary neuroblastoma tumors: Four cases of homozygous deletions of the CDKN2A gene. BMC Genomics. 9:353. 2008.
Chang W, Cheng J, Allaire JJ, Xie Y and McPherson J: Shiny: Web Application Framework for R. R package version 0.12.0. 2015, https://cran.r-project.org/web/packages/shiny/index.html. Accessed August 5, 2015.
Hoefling H: A Path Algorithm for the Fused Lasso Signal Approximator. J Comput Graph Stat. 19:984–1006. 2010.
Molenaar JJ, Koster J, Zwijnenburg DA, van Sluis P, Valentijn LJ, van der Ploeg I, Hamdi M, van Nes J, Westerman BA, van Arkel J, et al: Sequencing of neuroblastoma identifies chromothripsis and defects in neuritogenesis genes. Nature. 483:589–593. 2012.
Pandey GK, Mitra S, Subhash S, Hertwig F, Kanduri M, Mishra K, Fransson S, Ganeshram A, Mondal T, Bandaru S, et al: The risk-associated long noncoding RNA NBAT-1 controls neuroblastoma progression by regulating cell proliferation and neuronal differentiation. Cancer Cell. 26:722–737. 2014.
Pugh TJ, Morozova O, Attiyeh EF, Asgharzadeh S, Wei JS, Auclair D, Carter SL, Cibulskis K, Hanna M, Kiezun A, et al: The genetic landscape of high-risk neuroblastoma. Nat Genet. 45:279–284. 2013.
Sausen M, Leary RJ, Jones S, Wu J, Reynolds CP, Liu X, Blackford A, Parmigiani G, Diaz LA Jr, Papadopoulos N, et al: Integrated genomic analyses identify ARID1A and ARID1B alterations in the childhood cancer neuroblastoma. Nat Genet. 45:12–17. 2013.
Boeva V, Zinovyev A, Bleakley K, Vert JP, Janoueix-Lerosey I, Delattre O and Barillot E: Control-free calling of copy number alterations in deep-sequencing data using GC-content normalization. Bioinformatics. 27:268–269. 2011.
Jiménez G, Shvartsman SY and Paroush Z: The Capicua repressor - a general sensor of RTK signaling in development and disease. J Cell Sci. 125:1383–1391. 2012.
Eisenreich S, Abou-El-Ardat K, Szafranski K, Campos Valenzuela JA, Rump A, Nigro JM, Bjerkvig R, Gerlach EM, Hackmann K, Schröck E, et al: Novel CIC point mutations and an exon-spanning, homozygous deletion identified in oligodendroglial tumors by a comprehensive genomic approach including transcriptome sequencing. PLoS One. 8:e76623. 2013.
Sjöblom T, Jones S, Wood LD, Parsons DW, Lin J, Barber TD, Mandelker D, Leary RJ, Ptak J, Silliman N, et al: The consensus coding sequences of human breast and colorectal cancers. Science. 314:268–274. 2006.
Choi N, Park J, Lee JS, Yoe J, Park GY, Kim E, Jeon H, Cho YM, Roh TY and Lee Y: miR-93/miR-106b/miR-375-CIC-CRABP1: A novel regulatory axis in prostate cancer progression. Oncotarget. 6:23533–23547. 2015.