Open Access

Smoking alters the evolutionary trajectory of non‑small cell lung cancer

  • Authors:
    • Xiao‑Jun Yu
    • Gang Chen
    • Jun Yang
    • Guo‑Can Yu
    • Peng‑Fei Zhu
    • Zheng‑Ke Jiang
    • Kan Feng
    • Yong Lu
    • Bin Bao
    • Fang‑Ming Zhong

  • Published online on: August 29, 2019     https://doi.org/10.3892/etm.2019.7958
  • Pages: 3315-3324
  • Copyright: © Yu et al. This is an open access article distributed under the terms of Creative Commons Attribution License.


Abstract

Smoking is the biggest risk factor for lung cancer. Smokers have a much higher chance of developing lung tumors and have a worse survival rate; however, non‑smokers also develop lung tumors. A number of questions remain, including the underlying differences between smoker and non‑smoker lung cancer patients and the involvement of genetic and epigenetic processes in tumor development. The present study analyzed the mutation data of 100 non‑small cell lung cancer (NSCLC) patients (12 non‑smokers, 48 ex‑smokers and 40 smokers) from the Tracking Non‑Small Cell Lung Cancer Evolution through Therapy Consortium. A total of 68 genes exhibited different mutation patterns across non‑smokers, ex‑smokers and smokers. A number of these 68 genes encode membrane proteins with biological regulation, metabolic process and response to stimulus functions. For each group of patients, the top 10 most frequently mutated genes were selected and their oncogenetic tree was inferred, which reflected how the genes evolve during tumorigenesis. By comparing the oncogenetic trees of non‑smokers and smokers, it was identified that in non‑smokers, the mutation of epidermal growth factor receptor (EGFR) was an early genetic alteration event and EGFR was the key driver, whereas in smokers, the mutation of titin (TTN) was more important. Based on network analysis, TTN can interact with spectrin α erythrocytic 1 through calmodulin 2 and troponin C1. These genetic differences during tumorigenesis between non‑smoker and smoker lung cancer patients provide novel insights into the effects of smoking on the evolutionary trajectory of non‑small cell lung cancer and may prove helpful for targeted therapy of different lung cancer subtypes.

Introduction

Lung cancer accounts for ~14% of newly diagnosed cancer cases and is the second most common cancer worldwide (1). Of these cases, ~85% are non-small cell lung cancer (NSCLC) (2). Lung cancer has not only a high incidence but also a high death rate. It is a huge healthcare and economic burden for both developing and developed countries.

There are many possible factors that may contribute to the genesis of lung cancer (2). Genetics can explain a large proportion of lung cancer occurrence, as many single nucleotide polymorphisms have been found to be associated with lung cancer susceptibility by genome-wide association studies (3). Environmental factors, such as air pollution (4), particulate matter 2.5 (5) and smoking, can facilitate epigenetic dysfunctions, which interact with genetic changes and trigger tumorigenesis (2,6-9). Cigarette smoke includes over 5,000 compounds (10), such as nicotine, free radicals, benzopyrene, catechols, polonium-210 and heavy metals (11). Many of these compounds are strong carcinogens (12), which can interfere with DNA mismatch repair and cause somatic mutations. Cigarette smoking accounts for 87% of lung cancer deaths (13) and is the leading risk factor.

Unfortunately, the genetic mechanisms by which smoking leads to lung carcinogenesis are largely unknown and many observations are contradictory (10). For example, benzo[a]pyrene, a carcinogenic chemical from smoke, can induce lung tumors in mice but not in rats (14). On the molecular level, several well-established signaling pathways, such as cyclooxygenase and its derived prostanoids, peroxisome proliferator-activated receptor γ and arachidonate 15-lipoxygenase, epidermal growth factor receptor (EGFR), and the PI3K/AKT/mTOR and vascular endothelial growth factor-dependent angiogenic pathways, have been reported to have important roles (10). As lung cancer is a complex systems disease (2), its dysfunctions are dynamic, and the evolution of smoking-induced lung cancer, i.e. the series of genetic events, can provide a more realistic picture of tumorigenesis. With the rapid development of next-generation sequencing, the somatic mutations in cancer patients can be more easily identified. Based on somatic mutation data, the evolutionary trajectories of cancer patients can be reconstructed. Caravagna et al (15) developed an algorithm called Pipeline for Cancer Inference (PiCnIc) to analyze the colon adenocarcinoma and rectum adenocarcinoma (COAD/READ) somatic mutation data from The Cancer Genome Atlas project. The underlying somatic evolution was reconstructed based on Suppes' probabilistic causation (16), and it was determined that mutations in APC regulator of WNT signaling pathway, KRAS proto-oncogene and tumor protein p53 were primary events for microsatellite stable COAD/READ tumors, which was consistent with previous literature. Brown et al (17) performed phylogenetic analysis on whole-exome sequencing and copy number profiling data of primary and metastatic breast cancer samples and inferred the phylogeny of genomic alterations during breast cancer progression. That study utilized the Dollo parsimony method and the branch and bound exhaustive search algorithm described in Felsenstein (18) to reconstruct the phylogenetic tree.

To investigate the genomic alterations triggered by smoking, the present study analyzed the somatic mutations in 100 NSCLC patients. The different genomic alterations amongst non-smokers, ex-smokers and smokers were identified and the most frequent genetic alterations of each smoking subgroup were analyzed to construct oncogenetic trees, which revealed the evolutionary trajectories of smoking NSCLC. The present results provided novel insights into NSCLC development due to smoking and also identified potential intervention targets for treating NSCLC patients.

Materials and methods

NSCLC somatic mutation dataset

The TRAcking Cancer Evolution through therapy (TRACERx) Consortium is a multi-million pound project funded by Cancer Research UK to better understand the genetic risks of lung cancer through exploring the human genome. The present study obtained the somatic mutation data and smoking status data of 100 NSCLC patients from Jamal-Hanjani et al (19). The clinical information of these 100 patients is provided in Table SI. The dataset consists of 12 patients who had never smoked, 48 who used to smoke but had quit for >20 years and 40 current smokers or recent ex-smokers. The somatic mutations were annotated to genes. A gene carrying at least one non-synonymous exonic alteration in a patient was considered mutated in that patient and coded as ‘1’; otherwise it was coded as ‘0’. In total, 11,345 genes were mutated in at least 1 of the 100 NSCLC patients. An 11,345×100 binary matrix was produced in which rows denoted genes, columns denoted patients and each entry indicated whether the gene was mutated in the patient.
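
The binary encoding described above can be sketched as follows. This is a minimal illustration in Python, not the code used in the study; the record layout (patient, gene, non-synonymous flag), the function name and the toy patient identifiers are assumptions made for the example.

```python
from collections import defaultdict

def build_mutation_matrix(records, patients):
    """Return (genes, matrix) where matrix[g][p] is 1 if gene g carries at
    least one non-synonymous exonic alteration in patient p, else 0."""
    mutated = defaultdict(set)  # gene -> patients with a qualifying alteration
    for patient, gene, nonsynonymous in records:
        if nonsynonymous:
            mutated[gene].add(patient)
    genes = sorted(mutated)  # only genes mutated in at least one patient
    matrix = [[1 if p in mutated[g] else 0 for p in patients] for g in genes]
    return genes, matrix

# Hypothetical toy input: (patient ID, gene symbol, is non-synonymous exonic)
records = [
    ("P1", "EGFR", True),
    ("P1", "TTN", False),  # synonymous only, so not counted as mutated
    ("P2", "TTN", True),
    ("P3", "TTN", True),
    ("P3", "EGFR", True),
]
patients = ["P1", "P2", "P3"]
genes, matrix = build_mutation_matrix(records, patients)
for gene, row in zip(genes, matrix):
    print(gene, row)
```

Rows correspond to genes and columns to patients, matching the orientation of the matrix described in the text.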

Unlike the TRACERx study by Jamal-Hanjani et al (19), which analyzed the intratumor heterogeneity by constructing phylogenetic trees for each patient, the present study was interested in characterizing the general mutation pattern within patient subtypes.

Identifying the mutated genes amongst different smoking status groups

To identify genes that were mutated at different frequencies amongst the smoking status groups, Fisher's exact test (20) was applied to the contingency table of mutation status by smoking status for each gene. P<0.05 was considered to indicate statistical significance.
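
As an illustration of this test, the sketch below computes an exact P-value for a 2×3 table (mutated or not mutated, by three smoking groups) by enumerating all tables with the same margins, i.e. the Freeman-Halton extension of Fisher's exact test. It is an assumption that the test was applied to the full 2×3 table; the function name and the example counts are hypothetical and are not taken from the study.

```python
from math import comb

def fisher_exact_2x3(mutated, group_sizes):
    """Exact test for a 2x3 table with fixed margins.

    mutated     -- observed number of mutated patients in each of the 3 groups
    group_sizes -- total number of patients in each group
    Returns the summed probability of all tables that are no more likely
    than the observed one (two-sided P-value).
    """
    n1, n2, n3 = group_sizes
    m = sum(mutated)                # total number of mutated patients
    total = comb(n1 + n2 + n3, m)   # ways to choose which patients are mutated

    def prob(a, b, c):
        return comb(n1, a) * comb(n2, b) * comb(n3, c) / total

    p_obs = prob(*mutated)
    p_value = 0.0
    for a in range(min(n1, m) + 1):
        for b in range(min(n2, m - a) + 1):
            c = m - a - b
            if c > n3:
                continue
            p = prob(a, b, c)
            if p <= p_obs * (1 + 1e-9):  # tolerance for floating-point ties
                p_value += p
    return min(p_value, 1.0)

# Hypothetical counts: 6/12 non-smokers, 3/48 ex-smokers, 1/40 smokers mutated
p = fisher_exact_2x3((6, 3, 1), (12, 48, 40))
print(f"P = {p:.3g}")
```

A gene would then be retained if its P-value fell below the 0.05 threshold stated above.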

Construction of the evolutionary trajectories for different smoking status groups

How the most frequently mutated genes evolved in different smoking status groups was analyzed using Oncotree (21,22), a widely used method for oncogenetic tree deduction (23).

In an oncogenetic tree model, the evolutionary trajectories of tumor genesis are simplified and the causality between genetic alteration events is assumed to occur sequentially. In addition, the causation of a genetic alteration event by another is independent of other causations.

The Oncotree method involves several steps. First, a set of the most relevant genetic events is selected. For the present study, the top 10 most frequent genetic alterations for each smoking status group were considered as relevant for the progression of the tumor group and therefore were selected to be modeled. Then, each pair of such genetic events was assigned a weight corresponding to the probabilities of joint or individual occurrence. Finally, based on the assigned weights, the optimal oncogenetic tree was inferred as maximum-weight branching (21,22).

The method was applied in the present study using the R package Oncotree (http://cran.r-project.org/web/packages/Oncotree/).
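
The first two steps can be sketched in Python as follows. This is not the Oncotree R package and does not reproduce its interface. The edge weight shown, w(i→j) = log p_ij − log(p_i + p_j) − log p_j, is one form used in the oncogenetic tree literature and is an assumption here, since the text does not state the exact formula; the function names and the toy data are hypothetical. The final maximum-weight branching step is not shown.

```python
from math import log

def top_events(matrix, genes, k=10):
    """Rank genes by mutation frequency and keep the k most frequent."""
    freq = {g: sum(row) / len(row) for g, row in zip(genes, matrix)}
    ranked = sorted(genes, key=lambda g: (-freq[g], g))
    return ranked[:k], freq

def pair_weight(row_i, row_j):
    """Weight of a candidate edge i -> j (event i precedes event j).

    Assumed form: log p_ij - log(p_i + p_j) - log p_j, where p_i and p_j are
    the individual frequencies and p_ij the joint frequency.
    Returns None when the two events never co-occur.
    """
    n = len(row_i)
    p_i = sum(row_i) / n
    p_j = sum(row_j) / n
    p_ij = sum(a * b for a, b in zip(row_i, row_j)) / n
    if p_ij == 0:
        return None
    return log(p_ij) - log(p_i + p_j) - log(p_j)

# Hypothetical binary rows for two genes across four patients
gene_a = [1, 1, 1, 0]
gene_b = [1, 1, 0, 0]
print("A -> B:", pair_weight(gene_a, gene_b))
print("B -> A:", pair_weight(gene_b, gene_a))
```

With this weight, the more frequent event receives the larger weight as the parent, which matches the intuition that early events are observed in more tumors.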

Annotation of the biological function of the mutated genes

WebGestalt was used to annotate the biological function of the mutated genes (24). WebGestalt is a widely used online enrichment tool that supports model organisms including human, mouse, rat, yeast, fruit fly and Caenorhabditis elegans. It integrates many annotation databases, including Kyoto Encyclopedia of Genes and Genomes, Gene Ontology, DrugBank and Online Mendelian Inheritance in Man. The P-values of the overrepresentation enrichment analysis were adjusted for multiple testing as the false discovery rate (FDR). In the present study, enriched categories with FDR<0.2 were considered significant.
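
The FDR adjustment can be illustrated with the Benjamini-Hochberg procedure. The text does not name the adjustment method, so the choice of Benjamini-Hochberg is an assumption; the function name and the example P-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw P-values for four categories
raw = [0.01, 0.04, 0.03, 0.005]
fdr = benjamini_hochberg(raw)
significant = [i for i, q in enumerate(fdr) if q < 0.2]
print(fdr, significant)
```

Categories whose adjusted value falls below 0.2 would be reported as enriched under the threshold used in the present study.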

Results and Discussion

A total of 68 genes demonstrate different mutation patterns amongst smoking status groups

Fisher's exact test was used to identify genes that were differentially mutated amongst the smoking status groups. A total of 68 genes were significantly associated with smoking status (P<0.05; Table I). The OncoPrinter plots of these 68 genes in the three smoking status groups (non-smoker, ex-smoker and smoker) are displayed in Fig. 1. The genes were ranked based on their mutation frequency in all lung cancer patients. Zinc finger homeobox 4 (ZFHX4), usherin (USH2A), CUB and Sushi multiple domains 1 (CSMD1), CUB and Sushi multiple domains 2 (CSMD2), spectrin α erythrocytic 1 (SPTA1), pappalysin 2 (PAPPA2), dynein axonemal heavy chain 9 (DNAH9), contactin-associated protein like 5 (CNTNAP5) and additional sex combs like 3 (ASXL3) were highly mutated in ex-smokers and smokers, but not in non-smokers. The mutation rate was associated with smoking status, with current smokers demonstrating the highest rate of mutated genes. There were several non-smoker specific mutations, such as those in lysine demethylase 8 (KDM8), zinc finger protein 677 (ZNF677), TEA domain transcription factor 1 (TEAD1) and phosphatidylinositol glycan anchor biosynthesis class M (PIGM). These non-smoker specific mutations suggested that the tumorigenesis of lung cancer in non-smokers differed from that in smokers.

Table I.

A total of 68 genes that demonstrated different mutation patterns amongst non-smokers, ex-smokers and smokers.

Gene symbol | Gene name | NCBI gene ID | Fisher's exact test P-value
EGFR | Epidermal growth factor receptor | 1956 | 0.00052
TTN | Titin | 7273 | 0.00071
ZFHX4 | Zinc finger homeobox 4 | 79776 | 0.00433
USH2A | Usherin | 7399 | 0.00549
SPTA1 | Spectrin α, erythrocytic 1 | 6708 | 0.00753
TRPV6 | Transient receptor potential cation channel subfamily V member 6 | 55503 | 0.00988
SEC16A | SEC16 homolog A, endoplasmic reticulum export factor | 9919 | 0.00988
SCN1A | Sodium voltage-gated channel α subunit 1 | 6323 | 0.01216
ZNF677 | Zinc finger protein 677 | 342926 | 0.01333
TEAD1 | TEA domain transcription factor 1 | 7003 | 0.01333
PIGM | Phosphatidylinositol glycan anchor biosynthesis class M | 93183 | 0.01333
EPG5 | Ectopic P-granules autophagy protein 5 homolog | 57724 | 0.01427
TENM3 | Teneurin transmembrane protein 3 | 55714 | 0.01482
OR6P1 | Olfactory receptor family 6 subfamily P member 1 | 128366 | 0.01494
PAPPA2 | Pappalysin 2 | 60676 | 0.01743
ZNF783 | Zinc finger family member 783 | 100289678 | 0.01769
CTNNB1 | Catenin β 1 | 1499 | 0.01769
SPATA13 | Spermatogenesis associated 13 | 221178 | 0.01769
HIP1 | Huntingtin interacting protein 1 | 3092 | 0.01769
SENP7 | SUMO1/sentrin specific peptidase 7 | 57337 | 0.01769
PCDHGA8 | Protocadherin γ subfamily A, 8 | 9708 | 0.01769
SNPH | Syntaphilin | 9751 | 0.01769
ENPEP | Glutamyl aminopeptidase | 2028 | 0.01819
KCNH2 | Potassium voltage-gated channel subfamily H member 2 | 3757 | 0.01819
NLGN3 | Neuroligin 3 | 54413 | 0.01819
MS4A14 | Membrane spanning 4-domains A14 | 84689 | 0.01819
DEPDC5 | DEP domain containing 5 | 9681 | 0.01819
SMARCA4 | SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily a, member 4 | 6597 | 0.02044
LYST | Lysosomal trafficking regulator | 1130 | 0.02157
CNTN4 | Contactin 4 | 152330 | 0.02157
ZNF536 | Zinc finger protein 536 | 9745 | 0.02420
CNTNAP5 | Contactin associated protein like 5 | 129684 | 0.02459
ASXL3 | Additional sex combs like 3, transcriptional regulator | 80816 | 0.02459
DNAH9 | Dynein axonemal heavy chain 9 | 1770 | 0.02568
CNGA2 | Cyclic nucleotide gated channel α 2 | 1260 | 0.02841
KCNH5 | Potassium voltage-gated channel subfamily H member 5 | 27133 | 0.02841
ZEB2 | Zinc finger E-box binding homeobox 2 | 9839 | 0.02841
PHLPP2 | PH domain and leucine rich repeat protein phosphatase 2 | 23035 | 0.02918
GLI2 | GLI family zinc finger 2 | 2736 | 0.02918
GPR35 | G protein-coupled receptor 35 | 2859 | 0.02918
ATP13A5 | ATPase 13A5 | 344905 | 0.02918
MYF5 | Myogenic factor 5 | 4617 | 0.02918
PCDHGB7 | Protocadherin γ subfamily B, 7 | 56099 | 0.02918
WBSCR17 | Williams-Beuren syndrome chromosome region 17 | 64409 | 0.02918
BAZ1B | Bromodomain adjacent to zinc finger domain 1B | 9031 | 0.02918
COL6A5 | Collagen type VI α5 chain | 256076 | 0.03141
CSMD1 | CUB and Sushi multiple domains 1 | 64478 | 0.03183
RYR2 | Ryanodine receptor 2 | 6262 | 0.03217
TSHZ3 | Teashirt zinc finger homeobox 3 | 57616 | 0.03459
KDM8 | Lysine demethylase 8 | 79831 | 0.03728
NALCN | Sodium leak channel, non-selective | 259232 | 0.03732
MALRD1 | MAM and LDL receptor class A domain containing 1 | 340895 | 0.03732
DOCK10 | Dedicator of cytokinesis 10 | 55619 | 0.03732
DNAH11 | Dynein axonemal heavy chain 11 | 8701 | 0.03857
TAF1L | TATA-box binding protein associated factor 1 like | 138474 | 0.04006
PRUNE2 | Prune homolog 2 | 158471 | 0.04006
PLCH1 | Phospholipase C eta 1 | 23007 | 0.04006
KIAA1549L | KIAA1549 like | 25758 | 0.04006
RPTOR | Regulatory associated protein of MTOR complex 1 | 57521 | 0.04165
CSMD2 | CUB and Sushi multiple domains 2 | 114784 | 0.04312
CDH23 | Cadherin related 23 | 64072 | 0.04357
KIAA1324L | KIAA1324 like | 222223 | 0.04374
NUP205 | Nucleoporin 205 | 23165 | 0.04374
TBC1D4 | TBC1 domain family member 4 | 9882 | 0.04374
FLNC | Filamin C | 2318 | 0.04717
CHD7 | Chromodomain helicase DNA binding protein 7 | 55636 | 0.04717
DNAH17 | Dynein axonemal heavy chain 17 | 8632 | 0.04717

[i] NCBI, National Center for Biotechnology Information; ID, identification.

Biological functions of the 68 gene mutations associated with smoking status

The 68 genes associated with smoking status were annotated using the Gene Ontology (GO) biological process (BP), cellular component (CC) and molecular function (MF) categories (Fig. 2). Numerous genes were annotated as membrane proteins with biological regulation, metabolic process and response to stimulus functions (Fig. 2). These results were expected since smoke is a xenobiotic stimulus to the human body and its chemicals can affect normal metabolic processes and alter biological regulation. For deeper investigation into gene function, the enrichment significance was tested statistically using WebGestalt (24), which yielded significantly enriched BP (Table II), CC (Table III) and MF (Table IV) categories. The sensory organ development, morphogenesis of an epithelial fold, muscle tissue morphogenesis and muscle organ morphogenesis categories were enriched (Table II). These genes may serve an important role in tumor initiation and help transform normal lung tissue into tumor tissue. Proteins associated with the plasma membrane were enriched (Table III), which was consistent with the preliminary biological function analysis (Fig. 2) and indicated that the mutated genes were involved in stimulus response. In addition, enrichment of proteins associated with muscle/fiber functions suggested that the mutated genes may change the lung muscle structure. Significant enrichment of multiple binding functions suggested that the mutated genes were key players in signal transduction and regulation (Table IV), which may amplify the dysfunctions and accelerate tumorigenesis.

Table II.

Significantly enriched GO biological process categories of the 68 mutated genes associated with smoking status.

GO ID | Description | P-value | FDR | Overlap genes
GO:0007423 | Sensory organ development | 4.48×10^-5 | 0.1801 | CTNNB1, EGFR, GLI2, MYF5, CHD7, TENM3, CDH23, SMARCA4, USH2A, ZEB2
GO:0098655 | Cation transmembrane transport | 0.0001085 | 0.1801 | CNGA2, NALCN, KCNH5, GPR35, ATP13A5, KCNH2, NLGN3, TRPV6, CHD7, RYR2, SCN1A
GO:0034765 | Regulation of ion transmembrane transport | 0.0001252 | 0.1801 | NALCN, KCNH5, GPR35, KCNH2, NLGN3, CHD7, RYR2, SCN1A
GO:0042391 | Regulation of membrane potential | 0.0001286 | 0.1801 | CNGA2, NALCN, KCNH5, GPR35, KCNH2, NLGN3, RYR2, SCN1A
GO:0034762 | Regulation of transmembrane transport | 0.0001394 | 0.1801 | NALCN, KCNH5, GPR35, KCNH2, NLGN3, CHD7, RYR2, SCN1A
GO:0006812 | Cation transport | 0.0001504 | 0.1801 | CNGA2, CTNNB1, NALCN, KCNH5, GPR35, ATP13A5, KCNH2, NLGN3, TRPV6, CHD7, RYR2, SCN1A, CDH23
GO:0060571 | Morphogenesis of an epithelial fold | 0.0002124 | 0.1974 | CTNNB1, EGFR, GLI2
GO:0043010 | Camera-type eye development | 0.0002382 | 0.1974 | CTNNB1, EGFR, MYF5, CHD7, TENM3, SMARCA4, ZEB2
GO:0060415 | Muscle tissue morphogenesis | 0.0002664 | 0.1974 | MYF5, CHD7, RYR2, TTN
GO:0048644 | Muscle organ morphogenesis | 0.0002865 | 0.1974 | MYF5, CHD7, RYR2, TTN
GO:0001508 | Action potential | 0.0003213 | 0.1974 | NALCN, GPR35, KCNH2, RYR2, SCN1A
GO:0030001 | Metal ion transport | 0.0003543 | 0.1974 | CNGA2, CTNNB1, NALCN, KCNH5, GPR35, KCNH2, TRPV6, CHD7, RYR2, SCN1A, CDH23
GO:0043269 | Regulation of ion transport | 0.0003573 | 0.1974 | CTNNB1, NALCN, KCNH5, GPR35, KCNH2, NLGN3, CHD7, RYR2, SCN1A

[i] GO, Gene Ontology; ID, identification; FDR, false discovery rate.

Table III.

Significantly enriched GO cellular component categories of the 68 mutated genes associated with smoking status.

GO ID | Description | P-value | FDR | Overlap genes
GO:0030018 | Z disc | 4.49×10^-5 | 0.0294 | CTNNB1, FLNC, RYR2, SCN1A, TTN
GO:0031674 | I band | 6.91×10^-5 | 0.0294 | CTNNB1, FLNC, RYR2, SCN1A, TTN
GO:0044459 | Plasma membrane part | 0.000105 | 0.0297 | CNGA2, CTNNB1, EGFR, ENPEP, SPATA13, PHLPP2, KCNH5, GPR35, HIP1, ATP13A5, KCNH2, NLGN3, TRPV6, TENM3, PCDHGB7, SCN1A, SPTA1, USH2A, PCDHGA8, SNPH
GO:0030017 | Sarcomere | 0.00029 | 0.0572 | CTNNB1, FLNC, RYR2, SCN1A, TTN
GO:0042995 | Cell projection | 0.000359 | 0.0572 | CNGA2, CTNNB1, CNTN4, DNAH9, SPATA13, PHLPP2, GLI2, TENM3, RPTOR, TSHZ3, CDH23, SPTA1, USH2A, DNAH11, SNPH
GO:0044449 | Contractile fiber part | 0.000456 | 0.0572 | CTNNB1, FLNC, RYR2, SCN1A, TTN
GO:0030016 | Myofibril | 0.000471 | 0.0572 | CTNNB1, FLNC, RYR2, SCN1A, TTN
GO:0043292 | Contractile fiber | 0.000588 | 0.0625 | CTNNB1, FLNC, RYR2, SCN1A, TTN
GO:0030122 | AP-2 adaptor complex | 0.000942 | 0.0801 | EGFR, HIP1
GO:0030128 | Clathrin coat of endocytic vesicle | 0.000942 | 0.0801 | EGFR, HIP1
GO:0098590 | Plasma membrane region | 0.001172 | 0.0906 | CNGA2, CTNNB1, EGFR, ENPEP, SPATA13, PHLPP2, HIP1, NLGN3, USH2A, SNPH
GO:0030132 | Clathrin coat of coated pit | 0.001618 | 0.1146 | EGFR, HIP1
GO:0097458 | Neuron part | 0.002216 | 0.1449 | CNTN4, PHLPP2, HIP1, TENM3, RPTOR, TSHZ3, CDH23, SMARCA4, SPTA1, USH2A, SNPH
GO:0090575 | RNA polymerase II transcription factor complex | 0.002476 | 0.1478 | TAF1L, CTNNB1, MYF5
GO:0005929 | Cilium | 0.003008 | 0.1478 | CNGA2, DNAH9, PHLPP2, GLI2, USH2A, DNAH11
GO:0043234 | Protein complex | 0.003037 | 0.1478 | TAF1L, CTNNB1, DNAH9, EGFR, NUP205, COL6A5, HIP1, MYF5, RPTOR, RYR2, SMARCA4, TEAD1, TTN, USH2A, DNAH11, DEPDC5
GO:0030125 | Clathrin vesicle coat | 0.003127 | 0.1478 | EGFR, HIP1
GO:0031226 | Intrinsic component of plasma membrane | 0.003156 | 0.1478 | CNGA2, ENPEP, KCNH5, GPR35, ATP13A5, KCNH2, NLGN3, TRPV6, TENM3, PCDHGB7, SCN1A, SPTA1, PCDHGA8
GO:0031253 | Cell projection membrane | 0.003305 | 0.1478 | CNGA2, CTNNB1, SPATA13, PHLPP2, USH2A
GO:0098858 | Actin-based cell projection | 0.003565 | 0.1515 | CTNNB1, SPATA13, CDH23, USH2A
GO:0030131 | Clathrin adaptor complex | 0.004254 | 0.1718 | EGFR, HIP1
GO:0031090 | Organelle membrane | 0.004448 | 0.1718 | CNGA2, EGFR, ENPEP, PHLPP2, NUP205, HIP1, MALRD1, RPTOR, RYR2, WBSCR17, DEPDC5, SNPH, TBC1D4, SEC16A
GO:0044441 | Ciliary part | 0.004724 | 0.1746 | CNGA2, DNAH9, PHLPP2, GLI2, USH2A
GO:0044798 | Nuclear transcription factor complex | 0.005127 | 0.1816 | TAF1L, CTNNB1, MYF5

[i] GO, Gene Ontology; ID, identification; FDR, false discovery rate.

Table IV.

Significantly enriched GO molecular function categories of the 68 mutated genes associated with smoking status.

GO ID | Description | P-value | FDR | Overlap genes
GO:0044877 | Macromolecular complex binding | 2.18×10^-5 | 0.0308 | CTNNB1, EGFR, FLNC, GLI2, HIP1, CHD7, RPTOR, TSHZ3, SMARCA4, SPTA1, TTN, USH2A, KDM8, BAZ1B, DEPDC5
GO:0070577 | Lysine-acetylated histone binding | 5.57×10^-5 | 0.0393 | TAF1L, SMARCA4, BAZ1B
GO:0005516 | Calmodulin binding | 9.33×10^-5 | 0.0418 | CNGA2, EGFR, KCNH5, TRPV6, RYR2, TTN
GO:0051015 | Actin filament binding | 0.000118 | 0.0418 | EGFR, FLNC, HIP1, SPTA1, TTN
GO:0005261 | Cation channel activity | 0.000273 | 0.0742 | CNGA2, NALCN, KCNH5, KCNH2, TRPV6, RYR2, SCN1A
GO:0003682 | Chromatin binding | 0.0004 | 0.0742 | CTNNB1, EGFR, GLI2, CHD7, TSHZ3, SMARCA4, KDM8, BAZ1B
GO:0000155 | Phosphorelay sensor kinase activity | 0.000443 | 0.0742 | KCNH5, KCNH2
GO:0004673 | Protein histidine kinase activity | 0.000443 | 0.0742 | KCNH5, KCNH2
GO:0046982 | Protein heterodimerization activity | 0.000472 | 0.0742 | CTNNB1, EGFR, KCNH5, HIP1, MYF5, TENM3, SPTA1
GO:0005244 | Voltage-gated ion channel activity | 0.000918 | 0.118 | CNGA2, NALCN, KCNH5, KCNH2, SCN1A
GO:0022832 | Voltage-gated channel activity | 0.000918 | 0.118 | CNGA2, NALCN, KCNH5, KCNH2, SCN1A
GO:0016775 | Phosphotransferase activity, nitrogenous group as acceptor | 0.001053 | 0.1241 | KCNH5, KCNH2
GO:0005216 | Ion channel activity | 0.001884 | 0.1874 | CNGA2, NALCN, KCNH5, KCNH2, TRPV6, RYR2, SCN1A
GO:0046873 | Metal ion transmembrane transporter activity | 0.001917 | 0.1874 | CNGA2, NALCN, KCNH5, KCNH2, TRPV6, RYR2, SCN1A
GO:0001159 | Core promoter proximal region DNA binding | 0.001988 | 0.1874 | GLI2, MYF5, CHD7, SMARCA4, TEAD1, ZNF536
GO:0022838 | Substrate-specific channel activity | 0.002202 | 0.1896 | CNGA2, NALCN, KCNH5, KCNH2, TRPV6, RYR2, SCN1A
GO:0008324 | Cation transmembrane transporter activity | 0.002279 | 0.1896 | CNGA2, NALCN, KCNH5, ATP13A5, KCNH2, TRPV6, RYR2, SCN1A
GO:0022836 | Gated channel activity | 0.002541 | 0.1903 | CNGA2, NALCN, KCNH5, KCNH2, RYR2, SCN1A
GO:0032403 | Protein complex binding | 0.002557 | 0.1903 | EGFR, FLNC, HIP1, RPTOR, SPTA1, TTN, USH2A, DEPDC5

[i] GO, Gene Ontology; ID, identification; FDR, false discovery rate.

Evolutionary trajectories of non-smoker, ex-smoker and smoker lung cancer patients

Cancer is a complex multigene and multiprocess disease. The tumorigenesis of colorectal cancer is well studied (25,26) and can be used as a perfect example to explain the roles of mutations in causing pathway dysfunctions. The process includes several steps (25): i) Mutation of mismatch-repair (MMR) gene; ii) microsatellite instability (MSI) pathway dysfunction caused by MMR mutation; iii) normal epithelium becomes small adenoma; iv) chromosomal instability and mutations in KRAS and BRAF; v) serrated adenoma pathway dysfunction triggered by BRAF mutation; vi) small adenoma becomes large adenoma; and vii) mutations of PIK3CA, PTEN, tumor protein p53 (TP53), BAX, SMAD4 and transforming growth factor β receptor 2 accelerate the progression from large adenoma to cancer.

Similarly, lung cancer is likely to involve several mutational events that occur sequentially to initiate and accelerate tumorigenesis. Smoking is a major risk factor that can cause genetic and epigenetic changes that alter the process of tumorigenesis. Research into this process will help explain the mechanistic differences between smoker and non-smoker lung cancer patients.

The Oncotree method was used to produce oncogenetic trees of the top 10 most frequent mutated genes in non-smoker, ex-smoker and smoker lung cancer patients (Fig. 3). For non-smokers, the early events were EGFR and titin (TTN) mutation. The late EGFR events were mutations of PIGM and zinc finger protein 677, while TTN was followed by mutations of TEAD1, olfactory receptor family 6 subfamily P member 1, catenin β 1, huntingtin interacting protein 1, protocadherin γ subfamily A 8 and SUMO1/sentrin specific peptidase 7. For ex-smokers, TTN was also an early event but more early events were detected compared with non-smokers, including mutations of ryanodine receptor 2, ZFHX4 and CSMD1. For smokers, the results revealed the highest number of early events, including mutations of TTN, ryanodine receptor 2, USH2A, SPTA1 and CSMD1. Results demonstrated that smoking increased spontaneous mutations and formed more complex oncogenetic trees. For non-smokers, EGFR was the primary mutation whilst in ex-smokers and smokers, the importance of TTN was increased. Almost all smokers had the TTN mutation.

Oncogenetic differences between non-smoker, ex-smoker and smoker lung cancer patients

Based on the oncogenetic trees of non-smoker, ex-smoker and smoker lung cancer patients (Fig. 3), the key driver gene of non-smoker lung cancer patients was EGFR, whilst the key driver gene of smoker lung cancer patients was TTN.

EGFR is a well-known oncogene that affects the PI3K and RAS pathway and accelerates cell growth and survival (27). EGFR is widely expressed in >60% of NSCLC patients and is a clinically relevant target of tyrosine kinase inhibitors (TKIs). EGFR mutations are more frequent in Asians, females, non-smokers and lung adenocarcinomas (28,29). The present findings determined that EGFR was the key driver gene of non-smoker lung cancer patients which was in agreement with the literature (28,29).

TTN encodes a protein of striated muscle and is the key component for striated muscle assembly and function. TTN mutation is very frequent in the majority of cancer types with the second highest mutation rate behind TP53 in The Cancer Genome Atlas dataset (30). In the present study, 65 patients had the TTN mutation and 35 patients did not. For the 65 patients with TTN mutation, there were 2 adenosquamous carcinoma, 2 carcinosarcoma, 31 invasive adenocarcinoma, 1 large cell carcinoma and 29 squamous cell carcinoma patients. For the 35 patients without TTN mutations, there were 1 adenosquamous carcinoma, 30 invasive adenocarcinoma, 1 large cell neuroendocrine and 3 squamous cell carcinoma patients. Although its mechanisms remain largely unknown, TTN has great potential for investigation due to its roles in tumorigenesis and progression (30). The present study determined that TTN may function through regulating DNAH9, USH2A, SPTA1 or CSMD2 based on the oncogenetic trees (Fig. 3). Although the oncogenetic tree only demonstrated the process of genetic alteration occurrence, it provided hints of functional regulations; however, this needs to be further confirmed. To explore the possible regulation mechanisms of TTN, the protein functional association network STRING (31,32) was used with medium confidence (>0.4). It was determined that TTN can interact with SPTA1 through calmodulin 2 (CALM2) and troponin C1 (TNNC1; Fig. 4). The STRING confidence scores of each interaction (Table SII) were 0.722 for TTN and CALM2, 0.962 for SPTA1 and CALM2, 0.965 for TTN and TNNC1 and 0.537 for SPTA1 and TNNC1. These results provided insight into how TTN may function in lung cancer of smoking patients, or even other types of cancer.
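
The two indirect links between TTN and SPTA1 can be recovered from the four reported scores with a short script. This is an illustrative sketch only: the scores are those quoted in the text, whereas the function name and data layout are hypothetical and no STRING interface is used.

```python
# STRING confidence scores reported in the text for the four interactions.
scores = {
    ("TTN", "CALM2"): 0.722,
    ("SPTA1", "CALM2"): 0.962,
    ("TTN", "TNNC1"): 0.965,
    ("SPTA1", "TNNC1"): 0.537,
}

def two_step_paths(scores, source, target, threshold=0.4):
    """List intermediates that connect source and target through two edges
    whose scores both exceed the threshold."""
    neighbours = {}
    for (a, b), s in scores.items():
        if s > threshold:
            neighbours.setdefault(a, {})[b] = s
            neighbours.setdefault(b, {})[a] = s
    shared = set(neighbours.get(source, {})) & set(neighbours.get(target, {}))
    return sorted(
        (mid, neighbours[source][mid], neighbours[target][mid]) for mid in shared
    )

paths = two_step_paths(scores, "TTN", "SPTA1")
for mid, s1, s2 in paths:
    print(f"TTN -({s1})- {mid} -({s2})- SPTA1")
```

Raising the threshold shows how sensitive each route is to the confidence cut-off; for example, at 0.6 only the CALM2 route remains because the SPTA1 and TNNC1 score is 0.537.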

There were limitations to the oncogenetic tree model. Firstly, the model was based on association rather than causality and the results could not be treated as actual biological regulations, therefore these should be further investigated with experimental methods. Secondly, the oncogenetic tree model cannot handle a large number of genes. The input genes should be carefully picked based on mutation frequency or biological literature with only the highly possible genes analyzed. It is not a general method that can be applied on a genome wide scale. Finally, the sample size should be large enough to capture the association so results generated on small datasets need to be interpreted with caution.

In conclusion, lung cancer is a complex multigene, multiprocess disease with complex genetic and environmental risk factors. Smoking is the biggest risk factor that can alter the genetics and epigenetics of lung tissue causing cancer. Smokers have a much greater chance of developing lung cancer. The present study compared the mutation patterns of non-smoker, ex-smoker and smoker lung cancer patients and identified 68 genes that were significantly differentially mutated amongst smoking status groups. Furthermore, oncogenetic trees were constructed of the top 10 most frequently mutated genes in each group and analyzed. It was identified that in non-smoker lung cancer patients, the key driver gene was EGFR, whilst in smoker lung cancer patients the key driver gene was TTN. The EGFR mutation finding in non-smokers is in line with previous literature. A potential mechanism for the high frequency mutated gene TTN in tumorigenesis was suggested. The present study provided novel insights into the effect of smoking on altering the evolutionary trajectory of lung cancer and its progression.

Supplementary Material

Supporting Data

Acknowledgements

Not applicable.

Funding

No funding was received.

Availability of data and materials

The datasets generated and/or analyzed during the present study are available from the corresponding author on reasonable request.

Authors' contributions

FMZ designed the experiment and XJY performed the experiment. GC, JY, GCY and PFZ analyzed the data. ZKJ, KF, YL and BB contributed to the study design. KF and YL wrote the article. ZKJ and BB revised the article. All authors read and approved the final manuscript.

Ethics approval and consent to participate

Not applicable.

Patient consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

November-2019
Volume 18 Issue 5

Print ISSN: 1792-0981
Online ISSN: 1792-1015

Yu XJ, Chen G, Yang J, Yu GC, Zhu PF, Jiang ZK, Feng K, Lu Y, Bao B and Zhong FM: Smoking alters the evolutionary trajectory of non‑small cell lung cancer. Exp Ther Med 18: 3315-3324, 2019