<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v3.0 20080202//EN" "journalpublishing3.dtd">
<article xml:lang="en" article-type="research-article" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">
<?release-delay 0|0?>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">&amp;dcljournalid;</journal-id>
<journal-title-group>
<journal-title>&amp;dcljournalttl;</journal-title></journal-title-group>
<issn pub-type="ppub">&amp;dclissnppub;</issn>
<issn pub-type="epub">&amp;dclissnepub;</issn>
<publisher>
<publisher-name>D.A. Spandidos</publisher-name></publisher></journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3892/ijo.2012.1556</article-id>
<article-id pub-id-type="publisher-id">ijo-41-04-1387</article-id>
<article-categories>
<subj-group>
<subject>Articles</subject></subj-group></article-categories>
<title-group>
<article-title>A smoking-associated 7-gene signature for lung cancer diagnosis and prognosis</article-title></title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>WAN</surname><given-names>YING-WOOI</given-names></name><xref rid="af1-ijo-41-04-1387" ref-type="aff"><sup>1</sup></xref></contrib>
<contrib contrib-type="author">
<name><surname>RAESE</surname><given-names>REBECCA A.</given-names></name><xref rid="af1-ijo-41-04-1387" ref-type="aff"><sup>1</sup></xref></contrib>
<contrib contrib-type="author">
<name><surname>FORTNEY</surname><given-names>JAMES E.</given-names></name><xref rid="af1-ijo-41-04-1387" ref-type="aff"><sup>1</sup></xref></contrib>
<contrib contrib-type="author">
<name><surname>XIAO</surname><given-names>CHANGCHANG</given-names></name><xref rid="af1-ijo-41-04-1387" ref-type="aff"><sup>1</sup></xref><xref rid="af2-ijo-41-04-1387" ref-type="aff"><sup>2</sup></xref></contrib>
<contrib contrib-type="author">
<name><surname>LUO</surname><given-names>DAJIE</given-names></name><xref rid="af1-ijo-41-04-1387" ref-type="aff"><sup>1</sup></xref><xref rid="af3-ijo-41-04-1387" ref-type="aff"><sup>3</sup></xref></contrib>
<contrib contrib-type="author">
<name><surname>CAVENDISH</surname><given-names>JOHN</given-names></name><xref rid="af1-ijo-41-04-1387" ref-type="aff"><sup>1</sup></xref></contrib>
<contrib contrib-type="author">
<name><surname>GIBSON</surname><given-names>LAURA F.</given-names></name><xref rid="af1-ijo-41-04-1387" ref-type="aff"><sup>1</sup></xref><xref rid="af4-ijo-41-04-1387" ref-type="aff"><sup>4</sup></xref></contrib>
<contrib contrib-type="author">
<name><surname>CASTRANOVA</surname><given-names>VINCENT</given-names></name><xref rid="af1-ijo-41-04-1387" ref-type="aff"><sup>1</sup></xref><xref rid="af5-ijo-41-04-1387" ref-type="aff"><sup>5</sup></xref></contrib>
<contrib contrib-type="author">
<name><surname>QIAN</surname><given-names>YONG</given-names></name><xref rid="af1-ijo-41-04-1387" ref-type="aff"><sup>1</sup></xref><xref rid="af5-ijo-41-04-1387" ref-type="aff"><sup>5</sup></xref></contrib>
<contrib contrib-type="author">
<name><surname>GUO</surname><given-names>NANCY LAN</given-names></name><xref ref-type="corresp" rid="c1-ijo-41-04-1387"/><xref rid="af1-ijo-41-04-1387" ref-type="aff"><sup>1</sup></xref><xref rid="af2-ijo-41-04-1387" ref-type="aff"><sup>2</sup></xref></contrib></contrib-group>
<aff id="af1-ijo-41-04-1387">
<label>1</label>Mary Babb Randolph Cancer Center;</aff>
<aff id="af2-ijo-41-04-1387">
<label>2</label>Departments of Community Medicine</aff>
<aff id="af3-ijo-41-04-1387">
<label>3</label>Statistics and</aff>
<aff id="af4-ijo-41-04-1387">
<label>4</label>Microbiology, Immunology and Cell Biology, West Virginia University, Morgantown, WV 26506;</aff>
<aff id="af5-ijo-41-04-1387">
<label>5</label>Pathology and Physiology Research Branch, Health Effects Laboratory Division, National Institute for Occupational Safety and Health, Morgantown, WV 26505, 
<country>USA</country></aff>
<author-notes>
<corresp id="c1-ijo-41-04-1387">Correspondence to: Professor Nancy Lan Guo, West Virginia University, 2816 HSS, Mary Babb Randolph Cancer Center, Morgantown, WV 26506-9300, USA, E-mail: <email>lguo@hsc.wvu.edu</email></corresp></author-notes>
<pub-date pub-type="collection">
<month>10</month>
<year>2012</year></pub-date>
<pub-date pub-type="epub">
<day>16</day>
<month>07</month>
<year>2012</year></pub-date>
<volume>41</volume>
<issue>4</issue>
<fpage>1387</fpage>
<lpage>1396</lpage>
<history>
<date date-type="received">
<day>08</day>
<month>03</month>
<year>2012</year></date>
<date date-type="accepted">
<day>09</day>
<month>05</month>
<year>2012</year></date></history>
<permissions>
<copyright-statement>Copyright &#x000A9; 2012, Spandidos Publications</copyright-statement>
<copyright-year>2012</copyright-year>
<license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/3.0">
<license-p>This is an open-access article licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported License. The article may be redistributed, reproduced, and reused for non-commercial purposes, provided the original source is properly cited.</license-p></license></permissions>
<abstract>
<p>Smoking is responsible for 90&#x00025; of lung cancer cases. There is currently no clinically available gene test for early detection of lung cancer in smokers, or an effective patient selection strategy for adjuvant chemotherapy in lung cancer treatment. In this study, concurrent coexpression with multiple signaling pathways was modeled among a set of genes associated with smoking and lung cancer survival. This approach identified and validated a 7-gene signature for lung cancer diagnosis and prognosis in smokers using patient transcriptional profiles (n&#x0003D;847). The smoking-associated gene coexpression networks in lung adenocarcinoma tumors (n&#x0003D;442) were highly significant in terms of biological relevance (network precision &#x0003D; 0.91, FDR&#x0003C;0.01) when evaluated with numerous databases containing multi-level molecular associations. The gene coexpression network in smoking lung adenocarcinoma patients was confirmed in qRT-PCR assays of the identified biomarkers and involved signaling pathway genes in human lung adenocarcinoma cells (H23) treated with 4-(methylnitrosamino)-1-(3-pyridyl)-1-butanone (NNK). Furthermore, the western blotting results of <italic>p53</italic>, phospho-<italic>p53</italic>, <italic>Rb</italic> and <italic>EGFR</italic> in NNK-treated H23 and transformed normal human lung epithelial cells (BEAS-2B) support their functional involvement in smoking-induced lung cancer carcinogenesis and progression.</p></abstract>
<kwd-group>
<kwd>smoking</kwd>
<kwd>lung cancer diagnosis and prognosis</kwd>
<kwd>gene signature</kwd>
<kwd>signaling pathway</kwd>
<kwd>coexpression networks</kwd></kwd-group></article-meta></front>
<body>
<sec sec-type="intro">
<title>Introduction</title>
<p>Lung cancer remains the leading cause of cancer-related mortality for both men and women, and its incidence is increasing worldwide (<xref rid="b1-ijo-41-04-1387" ref-type="bibr">1</xref>). Smoking is the strongest population-attributable risk factor in lung carcinogenesis and is responsible for approximately 90&#x00025; of lung cancer incidents (<xref rid="b2-ijo-41-04-1387" ref-type="bibr">2</xref>&#x02013;<xref rid="b4-ijo-41-04-1387" ref-type="bibr">4</xref>). Currently, there are no effective diagnostic screening tools for early detection of lung cancer in smokers. CT scans are offered for lung cancer screening in smokers. Nevertheless, neither the American Cancer Society nor the U.S. Preventive Services Task Force recommends CT scans due to concerns about accuracy in the interpretation of results. Furthermore, the mechanistic effect of smoking on lung cancer progression remains unclear. Despite our previous finding that smoking intensity at the time of diagnosis is a significant and independent prognostic factor for lung cancer (<xref rid="b5-ijo-41-04-1387" ref-type="bibr">5</xref>), smoking status in itself is not a prognostic determinant of lung cancer.</p>
<p>Non-small cell lung cancer (NSCLC) accounts for 85&#x02013;90&#x00025; of lung cancer cases. NSCLC includes two major subtypes, adenocarcinoma and squamous cell carcinoma. Owing to the limitations of the current screening techniques, most patients with NSCLC are diagnosed at advanced disease stage. A minority (&#x0223C;25&#x02013;30&#x00025;) of patients with NSCLC are diagnosed with stage I disease and receive surgical resection as the major treatment option (<xref rid="b6-ijo-41-04-1387" ref-type="bibr">6</xref>). However, 35&#x02013;50&#x00025; of stage I NSCLC patients will relapse within five years following surgery (<xref rid="b6-ijo-41-04-1387" ref-type="bibr">6</xref>), indicating that a subgroup of these patients might benefit from adjuvant chemotherapy. Meanwhile, adjuvant chemotherapy of stage II and stage III disease has resulted in only modest survival benefits (<xref rid="b7-ijo-41-04-1387" ref-type="bibr">7</xref>). While tumor recurrence remains the major treatment failure for lung cancer, it is not currently possible to identify specific high-risk patients for adjuvant chemotherapy. As a consequence, current multi-modality therapy is of limited efficacy, with an overall 5-year survival rate of about 15&#x00025; (<xref rid="b8-ijo-41-04-1387" ref-type="bibr">8</xref>).</p>
<p>In this study, we sought to identify a gene signature for lung cancer diagnosis and prognosis in smokers. Genes implicated in cancer initiation and progression show dysregulated interactions with their molecular partners (<xref rid="b9-ijo-41-04-1387" ref-type="bibr">9</xref>), and these cancer genes are more likely to actively interact with signaling proteins (<xref rid="b10-ijo-41-04-1387" ref-type="bibr">10</xref>). Because tumors utilize different signaling pathways, we modeled crosstalk with a diverse set of signaling pathways to identify gene signatures that perform more uniformly across heterogeneous tumor sets. Specifically, implication networks (<xref rid="b11-ijo-41-04-1387" ref-type="bibr">11</xref>,<xref rid="b12-ijo-41-04-1387" ref-type="bibr">12</xref>) were used to model concurrent coexpression with multiple signaling pathways among a set of genes associated with smoking and lung cancer survival. This approach identified and validated a smoking-associated 7-gene signature using patient microarray profiles (n&#x0003D;847). Furthermore, the BEAS-2B cell line, transformed from normal human lung epithelial cells, and human lung adenocarcinoma cells (H23) were treated with 4-(methylnitrosamino)-1-(3-pyridyl)-1-butanone (NNK), a major tobacco-specific carcinogen (<xref rid="b13-ijo-41-04-1387" ref-type="bibr">13</xref>,<xref rid="b14-ijo-41-04-1387" ref-type="bibr">14</xref>), for qRT-PCR and western blot validation of the identified biomarkers and involved signaling pathways.</p></sec>
<sec sec-type="methods">
<title>Materials and methods</title>
<sec>
<title>Microarray profiles and patient samples</title>
<p>Four patient cohorts with published microarray transcriptional profiles were used in this study. The first cohort contains 442 lung adenocarcinoma patient samples from the Director&#x02019;s Challenge Study (<xref rid="b15-ijo-41-04-1387" ref-type="bibr">15</xref>). This study cohort is composed of 4 data sets (University of Michigan, H. Lee Moffitt Cancer Center, Memorial Sloan-Kettering Cancer Center and Dana-Farber Cancer Institute) contributed by 6 institutions. The clinical characteristics and smoking status of the patients are summarized in <xref rid="t1-ijo-41-04-1387" ref-type="table">Table I</xref>.</p>
<p>The second patient cohort contains 130 squamous cell lung cancer samples from Raponi <italic>et al</italic>(<xref rid="b16-ijo-41-04-1387" ref-type="bibr">16</xref>). The third cohort contains 111 NSCLC samples from Bild <italic>et al</italic>(<xref rid="b17-ijo-41-04-1387" ref-type="bibr">17</xref>). The fourth cohort contains 164 airway epithelial cell lung tissue samples from current and former smokers published by Spira <italic>et al</italic>(<xref rid="b2-ijo-41-04-1387" ref-type="bibr">2</xref>). This cohort has 60 lung cancer samples (48 NSCLC, 11 small cell lung cancer and 1 unknown histology) and 69 normal lung tissue samples. Patient gene expression profiles from Shedden <italic>et al</italic>(<xref rid="b15-ijo-41-04-1387" ref-type="bibr">15</xref>), Raponi <italic>et al</italic>(<xref rid="b16-ijo-41-04-1387" ref-type="bibr">16</xref>) and Spira <italic>et al</italic>(<xref rid="b2-ijo-41-04-1387" ref-type="bibr">2</xref>) were quantified with Affymetrix HG-U133A. The dataset from Bild <italic>et al</italic>(<xref rid="b17-ijo-41-04-1387" ref-type="bibr">17</xref>) was quantified with Affymetrix HG-U133 Plus 2. The raw microarray data were quantile-normalized and log2 transformed with dChip (<xref rid="b18-ijo-41-04-1387" ref-type="bibr">18</xref>) for further analysis.</p></sec>
<sec>
<title>Implication networks</title>
<p>The implication induction algorithm (<xref rid="b11-ijo-41-04-1387" ref-type="bibr">11</xref>) based on prediction logic (<xref rid="b19-ijo-41-04-1387" ref-type="bibr">19</xref>) was used to derive coexpression between each pair of genes using software Genet (<xref rid="b11-ijo-41-04-1387" ref-type="bibr">11</xref>,<xref rid="b12-ijo-41-04-1387" ref-type="bibr">12</xref>,<xref rid="b20-ijo-41-04-1387" ref-type="bibr">20</xref>). In the biological context, the six foremost implication rules relating two dichotomous variables are interpreted as follows: <italic>A</italic> &#x021D2; <italic>B</italic>: upregulation of gene <italic>A</italic> causes upregulation of gene <italic>B; A</italic> &#x021D2; &#x000AC;<italic>B</italic>: upregulation of gene <italic>A</italic> causes downregulation of gene <italic>B</italic>; &#x000AC;<italic>A</italic> &#x021D2; <italic>B</italic>: downregulation of gene <italic>A</italic> causes upregulation of gene <italic>B</italic>; &#x000AC;<italic>A</italic> &#x021D2; &#x000AC;<italic>B</italic>: downregulation of gene <italic>A</italic> causes downregulation of gene <italic>B; A</italic> &#x021D4; <italic>B</italic>: upregulation of gene <italic>A</italic> causes upregulation of gene <italic>B</italic> and upregulation of gene <italic>B</italic> causes upregulation of gene <italic>A; A</italic> &#x021D4; &#x000AC;<italic>B</italic>: upregulation of gene <italic>A</italic> causes downregulation of gene <italic>B</italic> and downregulation of gene <italic>B</italic> causes upregulation of gene <italic>A</italic>. Mean expression of each gene in the training set was used to define up- or downregulation. The minimum scope and the minimum precision of a derived implication relation were significantly greater than zero (P&#x0003C;0.05, one-sided z-tests).</p></sec>
<sec>
<title>Evaluation of gene coexpression networks</title>
<p>The following databases were used to evaluate the biological relevance of the derived coexpression networks: NCBI Entrez Gene (<xref rid="b21-ijo-41-04-1387" ref-type="bibr">21</xref>), Kyoto Encyclopedia of Genes and Genomes (KEGG) (<xref rid="b22-ijo-41-04-1387" ref-type="bibr">22</xref>), NCI-Nature Pathway Interaction Database (<ext-link xlink:href="http://pid.nci.nih.gov/" ext-link-type="uri">http://pid.nci.nih.gov/</ext-link>), protein-protein interaction database STRING 8 (<xref rid="b23-ijo-41-04-1387" ref-type="bibr">23</xref>), and Pathway Studio 7.0 (Ariadne Genomics, Rockville, MD, USA). In addition, five gene set collections &#x0005B;positional, curated, motif, computational and Gene Ontology (GO)&#x0005D; and canonical pathway databases from the MSigDB (<xref rid="b24-ijo-41-04-1387" ref-type="bibr">24</xref>) were used in the network precision and FDR evaluation. Using these resources, a coexpression relation is considered a true positive (TP) if the pair of genes satisfies any of the following: i) on the same chromosome or cytogenetic band; ii) in the same curated or canonical pathway; iii) sharing a cis-regulatory motif, binding motif, or transcription factor binding site; iv) annotated by the same GO term; v) having protein-protein interaction; or vi) within the same computational gene sets mined from cancer-oriented microarray data. The coexpression relation is considered a false positive (FP) if the gene pair satisfies none of the six conditions listed above (<xref rid="b25-ijo-41-04-1387" ref-type="bibr">25</xref>). If at least one gene in the pair is not annotated, a coexpression relation is labeled as non-discriminatory (ND). Coexpression relations labeled as ND were excluded from this evaluation because they could not be assessed. Network precision is defined as:
<disp-formula id="fd1">
<mml:math id="m1" display='block'>
<mml:mrow>
<mml:mi mathvariant='italic'>network</mml:mi>
<mml:mo>&#x005F;</mml:mo>
<mml:mi mathvariant='italic'>precision</mml:mi>
<mml:mo>&#x003D;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mi mathvariant='italic'>TP</mml:mi></mml:mrow>
<mml:mrow>
<mml:mi mathvariant='italic'>TP</mml:mi>
<mml:mo>&#x002B;</mml:mo>
<mml:mi mathvariant='italic'>FP</mml:mi></mml:mrow></mml:mfrac></mml:mrow></mml:math></disp-formula>The proportion of FP among all positive cases is defined as the q-value:
<disp-formula id="fd2">
<mml:math id="m2" display='block'>
<mml:mrow>
<mml:mi>q</mml:mi>
<mml:mo>&#x02013;</mml:mo>
<mml:mi mathvariant='italic'>value</mml:mi>
<mml:mo>&#x003D;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mi mathvariant='italic'>FP</mml:mi></mml:mrow>
<mml:mrow>
<mml:mi mathvariant='italic'>TP</mml:mi>
<mml:mo>&#x002B;</mml:mo>
<mml:mi mathvariant='italic'>FP</mml:mi></mml:mrow></mml:mfrac></mml:mrow></mml:math></disp-formula>The FDR of the smoking-mediated coexpression networks was calculated by averaging the q-values obtained from the null distribution generated in 1,000 random permutations of the class labels in the test cohort.</p>
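<p>As an illustration only, the two ratios defined above and the averaging step used for the FDR can be sketched in a few lines of Python. The function names and all counts below are hypothetical and are not taken from the study; the sketch assumes that ND relations have already been removed and that each permuted network yields one (TP, FP) pair.</p>
<preformat preformat-type="code">
```python
def network_precision(tp, fp):
    """Return TP / (TP + FP) for the annotated coexpression relations."""
    total = tp + fp
    if total == 0:
        raise ValueError("no annotated relations (TP + FP is zero)")
    return tp / total


def q_value(tp, fp):
    """Return FP / (TP + FP), the complement of network precision."""
    total = tp + fp
    if total == 0:
        raise ValueError("no annotated relations (TP + FP is zero)")
    return fp / total


def mean_q_value(permuted_counts):
    """Average the q-values of networks derived from permuted class labels."""
    q_values = [q_value(tp, fp) for tp, fp in permuted_counts]
    if not q_values:
        raise ValueError("at least one permutation is required")
    return sum(q_values) / len(q_values)


# Hypothetical counts; ND relations are assumed to be removed beforehand.
precision = network_precision(91, 9)
q = q_value(91, 9)
# Three made-up permutations stand in for the 1,000 used in the study.
fdr = mean_q_value([(99, 1), (98, 2), (100, 0)])
```
</preformat>
<p>By construction, the network precision and the q-value of the same network sum to one, so either quantity determines the other.</p>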
<p>The stability of the computationally derived smoking-mediated coexpression networks was evaluated using different subsets of patient samples from the training set in 100 iterations. Stability is defined as the proportion of the smoking-mediated coexpression relations obtained from the original data that were retrieved by using only a random subset of the training data and the full test data.</p></sec>
<sec>
<title>Cell cultures</title>
<p>NCI-H23 (ATCC no. CRL-5800) cells were cultured in RPMI-1640 medium (Mediatech, Manassas, VA, USA) supplemented with 10&#x00025; FBS (Hyclone, Logan, UT, USA), 2 mM L-glutamine (Mediatech), 100 IU penicillin/ml (Sigma, St. Louis, MO, USA), and 100 &#x003BC;g streptomycin/ml (Sigma). BEAS-2B (ATCC no. CRL-9609) cells were cultured in Dulbecco&#x02019;s modified Eagle&#x02019;s medium (Mediatech) supplemented with 5&#x00025; FBS (Hyclone), 2 mM L-glutamine (Mediatech), 100 IU penicillin/ml (Sigma) and 100 &#x003BC;g streptomycin/ml (Sigma).</p></sec>
<sec>
<title>NNK treatment and protein isolation</title>
<p>H23 and BEAS-2B cells were treated with 100 nM NNK (Toronto Research Chemicals, North York, ON, Canada) for 15 min, 1 h and 16 h. Four repeats (total of five samples) were performed on each cell line and for each time point. Following treatment, cells were harvested by trypsinization and protein was isolated. Cells were lysed in CLB lysis buffer (50 mM Tris-HCl, pH 7.4, 150 mM NaCl, 1&#x00025; Triton X-100, 0.25&#x00025; Na-deoxycholate, 5 mM EDTA and 1 mM NaF) supplemented with 1&#x00025; (v/v) HALT Protease Inhibitor Cocktail, purchased from Thermo Scientific (Rockford, IL, USA), on ice for 15 min with occasional mixing by vortex. Lysates were centrifuged at 20,800 &#x000D7; g for 15 min to pellet insoluble debris and then supernatants were collected. Total protein concentration was determined by bicinchoninic acid (BCA) protein assay purchased from Pierce Protein Research Products (Rockford, IL, USA).</p></sec>
<sec>
<title>SDS-PAGE and western blotting</title>
<p>Proteins (50 &#x003BC;g) were resolved on precast Mini-Protean TGX Gels (4&#x02013;20&#x00025;; Bio-Rad Laboratories, Hercules, CA, USA). After boiling for 5 min with reducing Laemmli buffer, proteins were separated and subsequently transferred to a PVDF membrane at 100 mV for 1 h at 4&#x000B0;C. After transfer, membranes were blocked in NET-gelatin solution (150 mM NaCl, 5 mM EDTA, 50 mM Tris-HCl, pH 7.5, 0.05&#x00025; Triton X-100 and 0.25&#x00025; gelatin) for 1 h at room temperature. Primary antibody was added to membranes in 15 ml NET-gelatin solution &#x0005B;1:500 dilution for anti-EGFR, 1:25,000 dilution for anti-GAPDH, 1:2,000 dilution for anti-p53, 1:1,000 dilution for phospho-p53 (phospho S15) and anti-Rb&#x0005D; and membranes were incubated for 2 h at room temperature with rocking. Membranes were then washed in NET-gelatin solution (3 x 20 min with shaking) and incubated with HRP-conjugated secondary monoclonal anti-mouse IgG antibody purchased from GE Healthcare UK Ltd. (Little Chalfont, UK). After 1 h of incubation, unbound secondary antibody was removed by washing in NET-gelatin solution (3 x 20 min with shaking). Signal was visualized using Immobilon chemiluminescent HRP substrate from Millipore (Billerica, MA, USA). Primary antibodies utilized included mouse monoclonal anti-EGFR, from Thermo Fisher Scientific (Fremont, CA, USA), and mouse monoclonal anti-GAPDH purchased from Fitzgerald Industries International Inc. (Acton, MA, USA). In addition, the following antibodies were used in western blotting: anti-p53 &#x0005B;Abcam, Mouse Monoclonal (ab26)&#x0005D;, anti-phospho-p53/phospho S15 &#x0005B;Abcam, Rabbit Polyclonal (ab1431)&#x0005D; and anti-Rb &#x0005B;Abcam, Mouse Monoclonal (ab24)&#x0005D;.</p></sec>
<sec>
<title>Densitometry</title>
<p>Relative <italic>EGFR</italic>, <italic>p53</italic>, phospho-<italic>p53</italic>, and <italic>Rb</italic> expression was determined by densitometric analysis using ImageJ software provided by NIH (<ext-link xlink:href="http://rsb.info.nih.gov/ij/index.html" ext-link-type="uri">http://rsb.info.nih.gov/ij/index.html</ext-link>). X-ray films were scanned at 300 and 600 DPI using a CanoScan (Canon, Lake Success, NY, USA) and images were imported into ImageJ for analysis. The raw signal intensity was determined by selecting the peak corresponding to each band and integrating the intensity within that peak. Local background intensity (calculated by averaging the background intensities at the upper and lower bounds of the peak) was integrated and subtracted from each raw intensity to give the background-corrected signal intensity. To account for loading differences, the corrected signal intensity for the assayed proteins was divided by the corrected GAPDH intensity. Samples treated with varying NNK exposure times were compared to untreated controls for the H23 and BEAS-2B cell lines.</p></sec>
<sec>
<title>RNA isolation, complementary DNA synthesis, and qRT-PCR gene expression profiling</title>
<p>Total-RNA was isolated from both cell lines using the mirVana&#x02122; miRNA Isolation kit and following the manufacturer&#x02019;s protocol (Ambion, Austin, TX, USA). Total-RNA was eluted in 100 &#x003BC;l of nuclease-free water and stored at &#x02212;80&#x000B0;C. RNA concentration was determined using the NanoDrop 1000 Spectrophotometer (NanoDrop Technologies, Wilmington, DE, USA). RNA quality, 28S/18S ratio, and a visual image of the 28S and 18S bands were evaluated using the 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA). Total-RNA (1 &#x003BC;g) was converted into complementary DNA (cDNA) using the High Capacity cDNA Reverse Transcription Kit from Applied Biosystems (Life Technologies, Carlsbad, CA, USA). Thermal cycling conditions were as follows: 25&#x000B0;C for 10 min, 2 cycles of 37&#x000B0;C for 60 min and 85&#x000B0;C for 5 sec followed by a programmed hold at 4&#x000B0;C.</p>
<p>All qRT-PCR reactions were performed on a 7500 real-time PCR system from Applied Biosystems. The reports were generated using SDS2.3 software (Applied Biosystems). The Ct values obtained were normalized to the UBC housekeeping gene in each sample. Fold changes were computed using the 2<sup>&#x02212;&#x00394;&#x00394;<italic>Ct</italic></sup> method (<xref rid="b26-ijo-41-04-1387" ref-type="bibr">26</xref>) with 5 biological replicates and 3 technical replicates. Statistical significance was assessed using repeated ANOVA tests in <italic>R</italic>, and differences were considered statistically significant at P&#x02264;0.05.</p>
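<p>For readers unfamiliar with the fold-change calculation, a minimal Python sketch follows. It assumes that technical-replicate Ct values are averaged before normalization to the reference gene (UBC in this study); the function name and all Ct values are hypothetical and serve only to illustrate the arithmetic.</p>
<preformat preformat-type="code">
```python
from statistics import mean


def fold_change(target_treated, reference_treated,
                target_control, reference_control):
    """Relative expression by the 2^(-ddCt) method.

    Each argument is a list of technical-replicate Ct values for one
    biological sample (target gene or reference gene, treated or control).
    """
    d_ct_treated = mean(target_treated) - mean(reference_treated)
    d_ct_control = mean(target_control) - mean(reference_control)
    dd_ct = d_ct_treated - d_ct_control
    return 2.0 ** (-dd_ct)


# Hypothetical Ct values: the target gene crosses threshold one cycle
# earlier after treatment, while the reference gene is unchanged.
fc_up = fold_change([24.0, 24.0, 24.0], [20.0, 20.0, 20.0],
                    [25.0, 25.0, 25.0], [20.0, 20.0, 20.0])

# Identical treated and control samples give no change.
fc_none = fold_change([25.0, 25.0, 25.0], [20.0, 20.0, 20.0],
                      [25.0, 25.0, 25.0], [20.0, 20.0, 20.0])
```
</preformat>
<p>In this example a one-cycle decrease in normalized Ct corresponds to a two-fold increase in expression, and identical samples give a fold change of one.</p>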
<p>The coexpression relation of a gene pair derived with the implication induction algorithm was compared with the observed NNK-induced gene expression changes. The coexpression relation is confirmed when the observed NNK-induced gene expression changes are consistent with the predicted implication rule between the two genes. For example, if the rule between gene A and gene B is positive equivalence (A&#x021D4;B), it is confirmed if both gene A and gene B showed overexpression in the corresponding experiments.</p></sec></sec>
<sec sec-type="results">
<title>Results</title>
<sec>
<title>Identification of a smoking-associated 7-gene signature</title>
<p>Lung cancer survival genes were first selected from the whole genome on the training set (UM and HLM; n&#x0003D;256) from Shedden <italic>et al</italic>(<xref rid="b15-ijo-41-04-1387" ref-type="bibr">15</xref>). A total of 2,310 genes were significantly associated with overall survival (P&#x0003C;0.05, univariate Cox model). Second, from this set of 2,310 genes, 217 genes exhibited significant differential expression (P&#x0003C;0.05, t-tests) in smokers versus non-smokers. These 217 survival and smoking-associated genes, as well as 6 major signaling pathway genes (<italic>EGF</italic>, <italic>EGFR</italic>, <italic>MET</italic>, <italic>KRAS</italic>, <italic>E2F3</italic>, and <italic>E2F5</italic>) were included in the network analysis. These signaling pathways are included in human NSCLC disease mechanisms delineated by the KEGG Pathway Database (<ext-link xlink:href="http://www.genome.jp/kegg/pathway/hsa/hsa05223.html" ext-link-type="uri">http://www.genome.jp/kegg/pathway/hsa/hsa05223.html</ext-link>). They were selected based on their reported clinical relevance in NSCLC. These 6 signaling pathway genes were not significantly associated with survival nor were they differentially expressed in smokers.</p>
<p>Patient samples in the training set were separated into two groups: smokers (patients who smoked in the past or who are currently smoking) and non-smokers (patients who never smoked). For each smoking-defined patient group, a coexpression network among the 223 genes was constructed. Between each pair of the 223 genes, significant (P&#x0003C;0.05; z-tests) coexpression relations were retrieved in the smoker group and the non-smoker group, constituting smoking-mediated gene coexpression networks in NSCLC. By comparing the coexpression types between each pair of genes in the two networks, differential network components were identified and considered important for further evaluation. These differential components are interactions that were present in the smoker group but missing in the non-smoker group, or conversely, those present in the non-smoker group but absent in the smoker group. From the differential components associated with the smoker group and non-smoker group, genes having direct coexpression relations with all 6 lung cancer signaling pathway genes were identified as the signature genes (<xref rid="f1-ijo-41-04-1387" ref-type="fig">Fig. 1</xref>). As a result, 6 genes were identified from the smoker group and 1 gene was identified from the non-smoker group. This constituted the smoking-associated 7-gene signature for NSCLC (<xref rid="t2-ijo-41-04-1387" ref-type="table">Table II</xref>).</p></sec>
<sec>
<title>Prognostic validation in lung adenocarcinoma</title>
<p>We sought to investigate if the identified gene signature could provide accurate prognostic prediction of survival in lung adenocarcinoma patients. On the training cohort, the original microarray gene expression profiles of the identified 7 gene probes were fitted into a Cox model as covariates. A survival risk score was generated for each patient in the training set. A training model (<xref rid="f2-ijo-41-04-1387" ref-type="fig">Fig. 2A</xref>) was identified and applied to the test set (MSK and DFCI; n&#x0003D;186; <xref rid="f2-ijo-41-04-1387" ref-type="fig">Fig. 2B</xref>) without re-estimation of parameters. In both training and test sets, this scheme separated patients into two groups with different survival outcomes (P&#x0003C;0.007, Kaplan-Meier analyses). The hazard ratio of the 7-gene risk score &#x0005B;HR&#x0003D;1.89, 95&#x00025; CI: (1.06, 3.38)&#x0005D; was higher than other lung cancer prognostic factors except cancer stage in the test set (<xref rid="t3-ijo-41-04-1387" ref-type="table">Table III</xref>). There was no significant difference in prognostic value between the hazard ratio of the 7-gene risk score and cancer stage (II vs. I). The results demonstrate that the 7-gene risk score could provide a more accurate prognosis than some commonly used clinicopathological parameters.</p>
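<p>The locked-model validation scheme can be illustrated with a short Python sketch. It assumes that the risk score is the linear predictor of the Cox model (the sum of coefficient times expression over the signature probes) and that patients are dichotomized at a cutoff fixed on the training set; the median cutoff, the coefficients and the expression values below are hypothetical and are not the fitted values from this study.</p>
<preformat preformat-type="code">
```python
from statistics import median


def risk_score(expression, coefficients):
    """Linear predictor of a Cox model: sum of coefficient * expression."""
    if len(expression) != len(coefficients):
        raise ValueError("one coefficient is required per probe")
    return sum(b * x for b, x in zip(coefficients, expression))


def assign_groups(scores, cutoff):
    """Label each patient high-risk if the score exceeds the cutoff."""
    return ["high" if s > cutoff else "low" for s in scores]


# Hypothetical coefficients for three probes (the signature has seven).
betas = [0.5, -0.25, 1.0]

# Training set: coefficients and cutoff are fixed here.
train = [[2.0, 4.0, 1.0], [4.0, 0.0, 2.0], [0.0, 4.0, 0.0]]
train_scores = [risk_score(x, betas) for x in train]
cutoff = median(train_scores)  # assumed cutoff rule, not from the source

# Test set: the same coefficients and cutoff are applied unchanged.
test = [[6.0, 0.0, 0.0], [0.0, 8.0, 0.0]]
test_scores = [risk_score(x, betas) for x in test]
test_groups = assign_groups(test_scores, cutoff)
```
</preformat>
<p>Keeping both the coefficients and the cutoff fixed when scoring the test set is what is meant by applying the model without re-estimation of parameters.</p>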
<p>The 7-gene signature gave accurate prognostic prediction in smokers in both training and test sets in Shedden&#x02019;s cohorts (<xref rid="b15-ijo-41-04-1387" ref-type="bibr">15</xref>) (P&#x0003C;0.01; <xref rid="f2-ijo-41-04-1387" ref-type="fig">Fig. 2C and D</xref>), but not in non-smokers in Kaplan-Meier analyses (P&#x0003C;0.12, results not shown). In addition, gene expression-defined high- and low-risk groups had significant association with smoking (P&#x0003C;0.02, &#x003C7;<sup>2</sup> tests) and smoking cessation (P&#x0003C;0.00001, &#x003C7;<sup>2</sup> tests; <xref rid="t1-ijo-41-04-1387" ref-type="table">Table I</xref>). These results further confirmed the smoking association of the identified 7-gene signature.</p></sec>
<sec>
<title>Prognostic validation on other histological subtypes of NSCLC</title>
<p>The prognostic performance of the 7-gene signature was further evaluated on cohorts from Raponi <italic>et al</italic>(<xref rid="b16-ijo-41-04-1387" ref-type="bibr">16</xref>) and Bild <italic>et al</italic>(<xref rid="b17-ijo-41-04-1387" ref-type="bibr">17</xref>), which include another major subtype of NSCLC, squamous cell lung carcinoma. For robust validation, patient samples in these two studied cohorts were randomly partitioned into separate training and test sets. A prognostic classifier was constructed on the training set using the Cox model and validated on the test set without re-estimation of parameters.</p>
<p>In the Raponi cohort (<xref rid="b16-ijo-41-04-1387" ref-type="bibr">16</xref>) of squamous cell carcinoma patients, the 7-gene signature stratified patients into two groups with distinct survival outcomes (log-rank P&#x0003C;0.005, Kaplan-Meier analysis) in the training set (<xref rid="f2-ijo-41-04-1387" ref-type="fig">Fig. 2E</xref>). This model generated borderline significant stratification (P&#x0003D;0.06, Kaplan-Meier analysis) in the test set (<xref rid="f2-ijo-41-04-1387" ref-type="fig">Fig. 2F</xref>). This might be owing to the fact that 10 patients (8&#x00025;) of Raponi&#x02019;s cohort were either non-smokers or their smoking status was not known, whereas the 7-gene signature provides refined prognosis specifically in smoking lung cancer patients.</p>
<p>In the Bild cohort (<xref rid="b17-ijo-41-04-1387" ref-type="bibr">17</xref>) containing both lung adenocarcinoma and squamous cell carcinoma patients, the 7-gene signature stratified patients into two distinct survival groups in both training and test sets (P&#x0003C;0.04, Kaplan-Meier analyses) (<xref rid="f2-ijo-41-04-1387" ref-type="fig">Fig. 2G and H</xref>). Overall, these results demonstrate that the 7-gene signature could select high-risk NSCLC patients with a smoking history for chemotherapy.</p></sec>
<sec>
<title>Early diagnostic detection of lung cancer in smokers</title>
<p>We further investigated whether the 7-gene signature could be used for diagnostic screening of lung cancer in smokers. The smoking cohort from Spira <italic>et al</italic>(<xref rid="b2-ijo-41-04-1387" ref-type="bibr">2</xref>) was separated into a training set (n&#x0003D;77) and two independent test sets (n&#x0003D;52 and n&#x0003D;35). Using a nearest neighbour algorithm implemented in WEKA (<xref rid="b27-ijo-41-04-1387" ref-type="bibr">27</xref>), the 7-gene classifier distinguished lung cancer patients from individuals without lung cancer with an overall accuracy of 73 and 74&#x00025; in the two test sets, respectively. The odds ratio of predicted lung cancer risk was highly significant in all three sets &#x0005B;OR&#x0003D;3.85, 95&#x00025; CI: (1.45, 10.20), P&#x0003C;0.007 in training set; OR&#x0003D;7.35, 95&#x00025; CI: (2.16, 25.04), P&#x0003C;0.001 in Test set 1; OR&#x0003D;8.45, 95&#x00025; CI: (1.84, 38.75), P&#x0003C;0.006 in Test set 2; <xref rid="t2-ijo-41-04-1387" ref-type="table">Table II</xref>&#x0005D;. These results indicate that the identified 7-gene signature has important implications in diagnostic screening of lung cancer risk in smokers.</p></sec>
<sec>
<title>NNK-induced gene and protein expression in BEAS-2B and H23</title>
<p>BEAS-2B and H23 cells were treated with NNK to validate smoking-associated gene expression. Each cell line was exposed to NNK for 15 min, 1 h and 16 h. Ten signaling pathway genes and 7 signature genes were examined. Results showed that 9 genes (<italic>GPRC5C</italic>, <italic>LTF</italic>, <italic>SEMA3C</italic>, <italic>E2F1</italic>, <italic>E2F4</italic>, <italic>E2F5</italic>, <italic>EGF</italic>, <italic>EGFR</italic> and <italic>TP53</italic>) exhibited significant differential expression in the NNK-treated H23 cells (<xref rid="f3-ijo-41-04-1387" ref-type="fig">Fig. 3A</xref>). In BEAS-2B cells, all genes except <italic>CYP3A4</italic> were expressed following NNK exposure, with 13 genes (<italic>GPRC5C</italic>, <italic>LTF</italic>, <italic>PIGN</italic>, <italic>SEMA3C</italic>, <italic>E2F1</italic>, <italic>E2F3</italic>, <italic>E2F5</italic>, <italic>EGF</italic>, <italic>EGFR</italic>, <italic>KRAS</italic>, <italic>MET</italic>, <italic>TP53</italic> and <italic>RB1</italic>) exhibiting significant differential expression (<xref rid="f3-ijo-41-04-1387" ref-type="fig">Fig. 3B</xref>).</p>
<p>To further evaluate the NNK-induced protein expression, western blot assays were performed in NNK-treated BEAS-2B and H23 cells after 15 min, 1 h and 16 h of exposure. The results show that <italic>EGFR</italic> was consistently overexpressed at both the mRNA and protein levels over the time course in BEAS-2B cells after NNK treatment. In H23 cells, <italic>EGFR</italic> was overexpressed at the mRNA level; however, protein expression was downregulated following NNK exposure (<xref rid="f3-ijo-41-04-1387" ref-type="fig">Fig. 3C and D</xref>). These results indicate that in normal lung epithelial cells, the <italic>EGFR</italic> gene is overexpressed upon NNK treatment, consistent with previous findings (<xref rid="b28-ijo-41-04-1387" ref-type="bibr">28</xref>); whereas in lung adenocarcinoma cells, the NNK-induced transcriptional and translational regulation of <italic>EGFR</italic> is not concordant at the same time points.</p>
<p><italic>p53</italic> showed consistent NNK-induced expression patterns at the mRNA and protein levels, with short-term downregulation followed by upregulation in BEAS-2B cells. As H23 is <italic>p53</italic>-deficient, <italic>p53</italic> protein was not expressed in these cells (<xref rid="f3-ijo-41-04-1387" ref-type="fig">Fig. 3E and F</xref>). Interestingly, downregulation of phospho-<italic>p53</italic> was consistently observed in both NNK-treated BEAS-2B and H23 cells (<xref rid="f3-ijo-41-04-1387" ref-type="fig">Fig. 3G and H</xref>), concordant with its mRNA and total protein expression. These results are consistent with the report that NNK induces damage in the <italic>p53</italic> gene (<xref rid="b29-ijo-41-04-1387" ref-type="bibr">29</xref>).</p>
<p>In NNK-treated BEAS-2B cells, <italic>Rb</italic> gene expression was first significantly downregulated at 15 min, returned to its normal expression at 1 h, and then showed modest overexpression at 16 h; whereas the <italic>Rb</italic> protein showed steady overexpression following the NNK treatment at all time points. In NNK-treated H23 cells, both mRNA and protein expression of <italic>Rb</italic> were downregulated at all time points (<xref rid="f3-ijo-41-04-1387" ref-type="fig">Fig. 3I and J</xref>). These results are consistent with the reported increased phosphorylation of the <italic>Rb</italic> Ser<sup>795</sup> 6&#x02013;15 h after NNK treatment of normal human bronchial epithelial cells (NHBE) and small airway epithelial cells (SAEC). This, in turn, promoted the entry of cells into the S phase (at 15&#x02013;21 h) (<xref rid="b30-ijo-41-04-1387" ref-type="bibr">30</xref>).</p></sec>
<sec>
<title>Confirmation of smoking-associated gene coexpression network in lung adenocarcinoma</title>
<p>Seventeen gene coexpression relations specifically associated with smokers and one specifically associated with non-smokers were significant (P&#x0003C;0.05) in both training and test sets from Shedden <italic>et al</italic>(<xref rid="b15-ijo-41-04-1387" ref-type="bibr">15</xref>) (<xref rid="f4-ijo-41-04-1387" ref-type="fig">Fig. 4A</xref>). Among these 18 coexpression relations, 11 gene associations were confirmed against multiple biological databases (<xref rid="f4-ijo-41-04-1387" ref-type="fig">Fig. 4A</xref>; network precision&#x0003D;0.91; FDR&#x0003C;0.01), and the network structure was stable (<xref rid="f4-ijo-41-04-1387" ref-type="fig">Fig. 4B</xref>).</p>
<p>The smoking-associated gene coexpression network in lung adenocarcinoma patients (<xref rid="f4-ijo-41-04-1387" ref-type="fig">Fig. 4A</xref>) was further validated using the NNK-induced gene expression changes in the lung adenocarcinoma cell line H23 (<xref rid="f3-ijo-41-04-1387" ref-type="fig">Fig. 3A</xref>). The results show that most of the coexpressions in the smoking-associated network of lung adenocarcinoma tumors were confirmed by the coexpressions observed in NNK-treated H23 cells at varying time points (<xref rid="f4-ijo-41-04-1387" ref-type="fig">Fig. 4C</xref>). All but two of the 17 smoking-associated gene coexpressions in NSCLC patients were observed in the cell experiments; the exceptions were the coexpression between <italic>EGF</italic> and <italic>LTF</italic> and that between <italic>CRTAC1</italic> and <italic>GPRC5C</italic>. These two unobserved gene coexpressions could be related to other carcinogens in tobacco, because NNK is only one of the carcinogens in tobacco, alongside about 54 others (<xref rid="b14-ijo-41-04-1387" ref-type="bibr">14</xref>). Overall, the smoking-associated gene coexpression network in NSCLC patients was largely confirmed in NNK-treated cell experiments, elucidating a network of smoking-induced gene alterations in NSCLC.</p></sec>
<sec>
<title>Comparison with Bayesian belief networks and gene association networks based on Pearson&#x02019;s correlation</title>
<p>This study presents a novel implication network formalism for biomarker discovery. The ability to model cyclic relations in Genet overcomes a fundamental drawback of acyclic Bayesian networks in modelling molecular networks (<xref rid="b31-ijo-41-04-1387" ref-type="bibr">31</xref>). For comparison with Bayesian belief networks, expression profiles of the identified 7 signature genes and 6 signalling pathway genes were used to build causal networks with TETRAD IV (<ext-link xlink:href="http://www.phil.cmu.edu/projects/tetrad/current.html" ext-link-type="uri">http://www.phil.cmu.edu/projects/tetrad/current.html</ext-link>) for smoking and non-smoking lung adenocarcinoma patients in training and test sets, respectively (<xref rid="f2-ijo-41-04-1387" ref-type="fig">Fig. 2</xref>). Only one interaction associated with smokers (between <italic>MET</italic> and <italic>SEMA3C</italic>; <xref rid="f2-ijo-41-04-1387" ref-type="fig">Fig. 2E</xref>) was present in both training and test sets, and it was considered a true positive when evaluated with MSigDB. In contrast, Genet generated significantly more biologically relevant gene coexpression relations that were validated by the biological databases and experimental results, confirming its topological advantage over the Bayesian belief networks.</p>
<p>Large-scale gene coexpression networks have been used in disease classification (<xref rid="b32-ijo-41-04-1387" ref-type="bibr">32</xref>). These studies construct pair-wise gene coexpression networks by using correlation coefficients computed from gene expression profiles. Such networks indicate the distance or similarity between each pair of gene expression profiles but do not provide the direction or causal relations in the gene regulatory patterns. We compared Genet with gene association networks based on Pearson&#x02019;s correlation. In constructing smoking-mediated coexpression networks using 217 smoking and survival associated genes and 6 signalling hallmarks, both models had the same network precision and FDR. However, Genet generated significantly more biologically relevant gene association relations that were validated by the test set (<xref rid="b20-ijo-41-04-1387" ref-type="bibr">20</xref>). These results indicate that prediction logic is more robust than Pearson&#x02019;s correlation for inferring gene association networks.</p></sec></sec>
<sec sec-type="discussion">
<title>Discussion</title>
<p>In the United States, about 90&#x00025; of male and 75&#x02013;80&#x00025; of female lung cancer deaths can be attributed to smoking each year (<xref rid="b14-ijo-41-04-1387" ref-type="bibr">14</xref>). In recent years, lung adenocarcinoma, a rare tumor type in the early 20th century, has replaced squamous cell lung cancer as the most frequent cell type of NSCLC (<xref rid="b33-ijo-41-04-1387" ref-type="bibr">33</xref>). Observations in the United States and abroad suggest that increases in lung adenocarcinoma cases since 1950 are more consistent with changes in smoking behavior and cigarette design than with diagnostic advances or histologic interpretation (<xref rid="b34-ijo-41-04-1387" ref-type="bibr">34</xref>&#x02013;<xref rid="b36-ijo-41-04-1387" ref-type="bibr">36</xref>). Gene-smoking interactions and their relationship to lung cancer are not well established in epidemiological studies (<xref rid="b14-ijo-41-04-1387" ref-type="bibr">14</xref>).</p>
<p>This study identified a 7-gene signature for lung cancer diagnosis and prognosis in smokers. The identified biomarker genes are involved in multiple lung cancer signaling pathways through concurrent coexpression with major signaling proteins. The 7-gene signature provided an accurate estimate of risk for tumor development and recurrence (as indicated by lung cancer survival) in smokers. The 7-gene signature also appeared to be a more accurate prognostic factor than commonly used clinicopathological factors for NSCLC. These results indicate the potential utility of this gene signature in predicting lung cancer risk in smokers before symptoms can be detected with morphological assessments in the clinic. Such early detection could significantly improve the clinical outcome in lung cancer treatment. Furthermore, the 7-gene assay could potentially be used to identify specific patients at high risk of tumor recurrence/metastasis using customized Affymetrix arrays, thus improving patient selection for adjuvant chemotherapy.</p>
<p>The gene expression-defined prognostic groups had a strong association with smoking and smoking cessation. Smokers were more likely to have the poor prognosis gene expression pattern than non-smokers. Furthermore, current smokers showed a stronger association with the poor prognosis gene expression pattern than former smokers. These results suggest that the identified 7-gene signature is associated with smoking-induced lung cancer initiation and progression, and that the poor prognosis gene expression pattern might be reversed after smoking cessation. Tobacco smoke contains a substantial amount of NNK, and the lowest dose shown to induce lung cancer in animal studies is remarkably close to the total dose of exposure experienced by a smoker in their lifetime (<xref rid="b37-ijo-41-04-1387" ref-type="bibr">37</xref>). The smoking-associated gene coexpression network computationally derived from NSCLC patient transcriptional profiles was confirmed in the NNK-treated H23 cell line, further attesting to its biological relevance and smoking association in lung cancer.</p>
<p>Using the same methodology, a 6-gene (<xref rid="b20-ijo-41-04-1387" ref-type="bibr">20</xref>) and an 8-gene signature were also identified from 217 smoking and survival associated genes, by modeling concurrent coexpression with different sets of 6 signaling hallmarks randomly selected from 10 KEGG human NSCLC signaling pathways (<xref rid="t3-ijo-41-04-1387" ref-type="table">Table III</xref>). These 10 signaling proteins were selected based on their reported clinical relevance in NSCLC. The prognostic performance of the 6- and 8-gene signatures was comparable with that of the 7-gene signature (<xref rid="b20-ijo-41-04-1387" ref-type="bibr">20</xref>) (<xref rid="f1-ijo-41-04-1387" ref-type="fig">Fig. 1</xref>). The 6- and 7-gene signatures both outperformed the clinicopathological covariates, but the 8-gene signature did not (results not shown). There is one common gene, <italic>SEMA3C</italic>, between the 6- and 7-gene signatures. In the experimental validation, all 10 signaling pathway genes showed significant differential expression in NNK-treated normal lung epithelial cells and lung adenocarcinoma cells. The observed NNK-induced protein expression of <italic>p53</italic>, phospho-<italic>p53</italic>, <italic>Rb</italic> and <italic>EGFR</italic> was largely concordant with their mRNA expression levels in the BEAS-2B normal lung epithelial cells. In lung adenocarcinoma cell line H23, the NNK-induced gene expression was concordant with protein expression of <italic>p53</italic>, phospho-<italic>p53</italic> and <italic>Rb</italic>, but not of <italic>EGFR</italic>. These results indicate that <italic>p53</italic>, <italic>Rb</italic> and <italic>EGFR</italic> might be functionally involved in smoking-induced lung cancer initiation and progression.
<italic>EGFR</italic> mutations, which are associated with better chemoresponse, were significantly associated with non-smokers compared with smokers in a large epidemiological study (<xref rid="b38-ijo-41-04-1387" ref-type="bibr">38</xref>). The identified gene signatures were concurrently coexpressed with these signaling pathways in patient transcriptional profiles. The association of these gene signatures with smoking, smoking cessation, as well as lung cancer risk and survival, in turn, supports the involvement of these oncoproteins in smoking-induced lung cancer initiation and progression.</p></sec></body>
<back>
<ack>
<p>We thank Dr Scot C. Remick at West Virginia University for the thoughtful discussion. This study was supported by the National Institutes of Health &#x0005B;R01LM009500, P20RR16440 and its ARRA Supplement to N.L.G., R01CA134573, R01HL056888 and P30RR032138 to L.F.G., P2016477 for software license and training&#x0005D;. Supplementary data are available at <ext-link xlink:href="http://www.hsc.wvu.edu/mbrcc/fs/GuoLab/publications.asp" ext-link-type="uri">http://www.hsc.wvu.edu/mbrcc/fs/GuoLab/publications.asp</ext-link>.</p></ack>
<ref-list>
<title>References</title>
<ref id="b1-ijo-41-04-1387"><label>1</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Christiani</surname><given-names>DC</given-names></name></person-group><article-title>Genetic susceptibility to lung cancer</article-title><source>J Clin Oncol</source><volume>24</volume><fpage>1651</fpage><lpage>1652</lpage><year>2006</year></element-citation></ref>
<ref id="b2-ijo-41-04-1387"><label>2</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Spira</surname><given-names>A</given-names></name><name><surname>Beane</surname><given-names>JE</given-names></name><name><surname>Shah</surname><given-names>V</given-names></name><etal/></person-group><article-title>Airway epithelial gene expression in the diagnostic evaluation of smokers with suspect lung cancer</article-title><source>Nat Med</source><volume>13</volume><fpage>361</fpage><lpage>366</lpage><year>2007</year></element-citation></ref>
<ref id="b3-ijo-41-04-1387"><label>3</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Massion</surname><given-names>PP</given-names></name><name><surname>Zou</surname><given-names>Y</given-names></name><name><surname>Chen</surname><given-names>H</given-names></name><etal/></person-group><article-title>Smoking-related genomic signatures in non-small cell lung cancer</article-title><source>Am J Respir Crit Care Med</source><volume>178</volume><fpage>1164</fpage><lpage>1172</lpage><year>2008</year></element-citation></ref>
<ref id="b4-ijo-41-04-1387"><label>4</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Woenckhaus</surname><given-names>M</given-names></name><name><surname>Klein-Hitpass</surname><given-names>L</given-names></name><name><surname>Grepmeier</surname><given-names>U</given-names></name><etal/></person-group><article-title>Smoking and cancer-related gene expression in bronchial epithelium and non-small-cell lung cancers</article-title><source>J Pathol</source><volume>210</volume><fpage>192</fpage><lpage>204</lpage><year>2006</year></element-citation></ref>
<ref id="b5-ijo-41-04-1387"><label>5</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Guo</surname><given-names>NL</given-names></name><name><surname>Tosun</surname><given-names>K</given-names></name><name><surname>Horn</surname><given-names>K</given-names></name></person-group><article-title>Impact and interactions between smoking and traditional prognostic factors in lung cancer progression</article-title><source>Lung Cancer</source><volume>66</volume><fpage>386</fpage><lpage>392</lpage><year>2009</year></element-citation></ref>
<ref id="b6-ijo-41-04-1387"><label>6</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Beer</surname><given-names>DG</given-names></name><name><surname>Kardia</surname><given-names>SL</given-names></name><name><surname>Huang</surname><given-names>CC</given-names></name><etal/></person-group><article-title>Gene-expression profiles predict survival of patients with lung adenocarcinoma</article-title><source>Nat Med</source><volume>8</volume><fpage>816</fpage><lpage>824</lpage><year>2002</year></element-citation></ref>
<ref id="b7-ijo-41-04-1387"><label>7</label><element-citation publication-type="book"><source>General Thoracic Surgery</source><publisher-name>Lippincott Williams &#x00026; Wilkins</publisher-name><publisher-loc>Philadelphia, PA</publisher-loc><year>2009</year></element-citation></ref>
<ref id="b8-ijo-41-04-1387"><label>8</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Hung</surname><given-names>RJ</given-names></name><name><surname>McKay</surname><given-names>JD</given-names></name><name><surname>Gaborieau</surname><given-names>V</given-names></name><etal/></person-group><article-title>A susceptibility locus for lung cancer maps to nicotinic acetylcholine receptor subunit genes on 15q25</article-title><source>Nature</source><volume>452</volume><fpage>633</fpage><lpage>637</lpage><year>2008</year></element-citation></ref>
<ref id="b9-ijo-41-04-1387"><label>9</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Mani</surname><given-names>KM</given-names></name><name><surname>Lefebvre</surname><given-names>C</given-names></name><name><surname>Wang</surname><given-names>K</given-names></name><etal/></person-group><article-title>A systems biology approach to prediction of oncogenes and molecular perturbation targets in B-cell lymphomas</article-title><source>Mol Syst Biol</source><volume>4</volume><fpage>169</fpage><year>2008</year></element-citation></ref>
<ref id="b10-ijo-41-04-1387"><label>10</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Cui</surname><given-names>Q</given-names></name><name><surname>Ma</surname><given-names>Y</given-names></name><name><surname>Jaramillo</surname><given-names>M</given-names></name><etal/></person-group><article-title>A map of human cancer signaling</article-title><source>Mol Syst Biol</source><volume>3</volume><fpage>152</fpage><year>2007</year></element-citation></ref>
<ref id="b11-ijo-41-04-1387"><label>11</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Guo</surname><given-names>NL</given-names></name><name><surname>Wan</surname><given-names>YW</given-names></name><name><surname>Bose</surname><given-names>S</given-names></name><etal/></person-group><article-title>A novel network model identified a 13-gene lung cancer prognostic signature</article-title><source>Int J Comput Biol Drug Des</source><volume>4</volume><fpage>19</fpage><lpage>39</lpage><year>2011</year></element-citation></ref>
<ref id="b12-ijo-41-04-1387"><label>12</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Wan</surname><given-names>YW</given-names></name><name><surname>Beer</surname><given-names>DG</given-names></name><name><surname>Guo</surname><given-names>NL</given-names></name></person-group><article-title>Signaling pathway-based identification of extensive prognostic gene signatures for lung adenocarcinoma</article-title><source>Lung Cancer</source><volume>76</volume><fpage>98</fpage><lpage>105</lpage><year>2012</year></element-citation></ref>
<ref id="b13-ijo-41-04-1387"><label>13</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Schuller</surname><given-names>HM</given-names></name></person-group><article-title>Mechanisms of smoking-related lung and pancreatic adenocarcinoma development</article-title><source>Nat Rev Cancer</source><volume>2</volume><fpage>455</fpage><lpage>463</lpage><year>2002</year></element-citation></ref>
<ref id="b14-ijo-41-04-1387"><label>14</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Hecht</surname><given-names>SS</given-names></name></person-group><article-title>Tobacco smoke carcinogens and lung cancer</article-title><source>J Natl Cancer Inst</source><volume>91</volume><fpage>1194</fpage><lpage>1210</lpage><year>1999</year></element-citation></ref>
<ref id="b15-ijo-41-04-1387"><label>15</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Shedden</surname><given-names>K</given-names></name><name><surname>Taylor</surname><given-names>JM</given-names></name><name><surname>Enkemann</surname><given-names>SA</given-names></name><etal/></person-group><article-title>Gene expression-based survival prediction in lung adenocarcinoma: a multi-site, blinded validation study</article-title><source>Nat Med</source><volume>14</volume><fpage>822</fpage><lpage>827</lpage><year>2008</year></element-citation></ref>
<ref id="b16-ijo-41-04-1387"><label>16</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Raponi</surname><given-names>M</given-names></name><name><surname>Zhang</surname><given-names>Y</given-names></name><name><surname>Yu</surname><given-names>J</given-names></name><etal/></person-group><article-title>Gene expression signatures for predicting prognosis of squamous cell and adenocarcinomas of the lung</article-title><source>Cancer Res</source><volume>66</volume><fpage>7466</fpage><lpage>7472</lpage><year>2006</year></element-citation></ref>
<ref id="b17-ijo-41-04-1387"><label>17</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Bild</surname><given-names>AH</given-names></name><name><surname>Yao</surname><given-names>G</given-names></name><name><surname>Chang</surname><given-names>JT</given-names></name><etal/></person-group><article-title>Oncogenic pathway signatures in human cancers as a guide to targeted therapies</article-title><source>Nature</source><volume>439</volume><fpage>353</fpage><lpage>357</lpage><year>2006</year></element-citation></ref>
<ref id="b18-ijo-41-04-1387"><label>18</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Li</surname><given-names>C</given-names></name></person-group><article-title>Automating dChip: toward reproducible sharing of micro-array data analysis</article-title><source>BMC Bioinformatics</source><volume>9</volume><fpage>231</fpage><year>2008</year></element-citation></ref>
<ref id="b19-ijo-41-04-1387"><label>19</label><element-citation publication-type="book"><person-group person-group-type="author"><name><surname>Hildebrand</surname><given-names>DK</given-names></name><name><surname>Laing</surname><given-names>JD</given-names></name><name><surname>Rosenthal</surname><given-names>H</given-names></name></person-group><source>Prediction Analysis of Cross Classifications</source><publisher-name>John Wiley &#x00026; Sons</publisher-name><publisher-loc>New York, NY</publisher-loc><year>1977</year></element-citation></ref>
<ref id="b20-ijo-41-04-1387"><label>20</label><element-citation publication-type="other"><person-group person-group-type="author"><name><surname>Guo</surname><given-names>NL</given-names></name><name><surname>Wan</surname><given-names>YW</given-names></name></person-group><article-title>Pathway-based identification of a smoking associated 6-gene signature predictive of lung cancer risk and survival</article-title><source>Artif Intell Med</source><month>Feb</month><day>10</day><year>2012</year><comment>(Epub ahead of print).</comment></element-citation></ref>
<ref id="b21-ijo-41-04-1387"><label>21</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Maglott</surname><given-names>D</given-names></name><name><surname>Ostell</surname><given-names>J</given-names></name><name><surname>Pruitt</surname><given-names>KD</given-names></name><name><surname>Tatusova</surname><given-names>T</given-names></name></person-group><article-title>Entrez Gene: gene-centered information at NCBI</article-title><source>Nucleic Acids Res</source><volume>35</volume><fpage>D26</fpage><lpage>D31</lpage><year>2007</year></element-citation></ref>
<ref id="b22-ijo-41-04-1387"><label>22</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Ogata</surname><given-names>H</given-names></name><name><surname>Goto</surname><given-names>S</given-names></name><name><surname>Sato</surname><given-names>K</given-names></name><etal/></person-group><article-title>KEGG: Kyoto Encyclopedia of Genes and Genomes</article-title><source>Nucleic Acids Res</source><volume>27</volume><fpage>29</fpage><lpage>34</lpage><year>1999</year></element-citation></ref>
<ref id="b23-ijo-41-04-1387"><label>23</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Jensen</surname><given-names>LJ</given-names></name><name><surname>Kuhn</surname><given-names>M</given-names></name><name><surname>Stark</surname><given-names>M</given-names></name><etal/></person-group><article-title>STRING 8 - a global view on proteins and their functional interactions in 630 organisms</article-title><source>Nucleic Acids Res</source><volume>37</volume><fpage>D412</fpage><lpage>D416</lpage><year>2009</year></element-citation></ref>
<ref id="b24-ijo-41-04-1387"><label>24</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Subramanian</surname><given-names>A</given-names></name><name><surname>Tamayo</surname><given-names>P</given-names></name><name><surname>Mootha</surname><given-names>VK</given-names></name><etal/></person-group><article-title>Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles</article-title><source>Proc Natl Acad Sci USA</source><volume>102</volume><fpage>15545</fpage><lpage>15550</lpage><year>2005</year></element-citation></ref>
<ref id="b25-ijo-41-04-1387"><label>25</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Ucar</surname><given-names>D</given-names></name><name><surname>Neuhaus</surname><given-names>I</given-names></name><name><surname>Ross-MacDonald</surname><given-names>P</given-names></name><etal/></person-group><article-title>Construction of a reference gene association network from multiple profiling data: application to data analysis</article-title><source>Bioinformatics</source><volume>23</volume><fpage>2716</fpage><lpage>2724</lpage><year>2007</year></element-citation></ref>
<ref id="b26-ijo-41-04-1387"><label>26</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Livak</surname><given-names>KJ</given-names></name><name><surname>Schmittgen</surname><given-names>TD</given-names></name></person-group><article-title>Analysis of relative gene expression data using real-time quantitative PCR and the 2(-Delta Delta C(T)) method</article-title><source>Methods</source><volume>25</volume><fpage>402</fpage><lpage>408</lpage><year>2001</year></element-citation></ref>
<ref id="b27-ijo-41-04-1387"><label>27</label><element-citation publication-type="book"><person-group person-group-type="author"><name><surname>Witten</surname><given-names>IH</given-names></name><name><surname>Frank</surname><given-names>E</given-names></name></person-group><source>Data Mining: Practical Machine Learning Tools and Techniques</source><edition>2nd edition</edition><publisher-name>Morgan Kaufmann</publisher-name><publisher-loc>San Francisco, CA</publisher-loc><year>2005</year></element-citation></ref>
<ref id="b28-ijo-41-04-1387"><label>28</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Lonardo</surname><given-names>F</given-names></name><name><surname>Dragnev</surname><given-names>KH</given-names></name><name><surname>Freemantle</surname><given-names>SJ</given-names></name><etal/></person-group><article-title>Evidence for the epidermal growth factor receptor as a target for lung cancer prevention</article-title><source>Clin Cancer Res</source><volume>8</volume><fpage>54</fpage><lpage>60</lpage><year>2002</year></element-citation></ref>
<ref id="b29-ijo-41-04-1387"><label>29</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Cloutier</surname><given-names>JF</given-names></name><name><surname>Drouin</surname><given-names>R</given-names></name><name><surname>Weinfeld</surname><given-names>M</given-names></name><name><surname>O&#x02019;Connor</surname><given-names>TR</given-names></name><name><surname>Castonguay</surname><given-names>A</given-names></name></person-group><article-title>Characterization and mapping of DNA damage induced by reactive metabolites of 4-(methylnitrosamino)-1-(3-pyridyl)-1-butanone (NNK) at nucleotide resolution in human genomic DNA</article-title><source>J Mol Biol</source><volume>313</volume><fpage>539</fpage><lpage>557</lpage><year>2001</year></element-citation></ref>
<ref id="b30-ijo-41-04-1387"><label>30</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Ho</surname><given-names>YS</given-names></name><name><surname>Chen</surname><given-names>CH</given-names></name><name><surname>Wang</surname><given-names>YJ</given-names></name><etal/></person-group><article-title>Tobacco-specific carcinogen 4-(methylnitrosamino)-1-(3-pyridyl)-1-butanone (NNK) induces cell proliferation in normal human bronchial epithelial cells through NFkappaB activation and cyclin D1 up-regulation</article-title><source>Toxicol Appl Pharmacol</source><volume>205</volume><fpage>133</fpage><lpage>148</lpage><year>2005</year></element-citation></ref>
<ref id="b31-ijo-41-04-1387"><label>31</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Sachs</surname><given-names>K</given-names></name><name><surname>Perez</surname><given-names>O</given-names></name><name><surname>Pe&#x02019;er</surname><given-names>D</given-names></name><name><surname>Lauffenburger</surname><given-names>DA</given-names></name><name><surname>Nolan</surname><given-names>GP</given-names></name></person-group><article-title>Causal protein-signaling networks derived from multiparameter single-cell data</article-title><source>Science</source><volume>308</volume><fpage>523</fpage><lpage>529</lpage><year>2005</year></element-citation></ref>
<ref id="b32-ijo-41-04-1387"><label>32</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Choi</surname><given-names>JK</given-names></name><name><surname>Yu</surname><given-names>U</given-names></name><name><surname>Yoo</surname><given-names>OJ</given-names></name><name><surname>Kim</surname><given-names>S</given-names></name></person-group><article-title>Differential coexpression analysis using microarray data and its application to human cancer</article-title><source>Bioinformatics</source><volume>21</volume><fpage>4348</fpage><lpage>4355</lpage><year>2005</year></element-citation></ref>
<ref id="b33-ijo-41-04-1387"><label>33</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Travis</surname><given-names>WD</given-names></name><name><surname>Travis</surname><given-names>LB</given-names></name><name><surname>Devesa</surname><given-names>SS</given-names></name></person-group><article-title>Lung cancer</article-title><source>Cancer</source><volume>75</volume><fpage>191</fpage><lpage>202</lpage><year>1995</year></element-citation></ref>
<ref id="b34-ijo-41-04-1387"><label>34</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Ernster</surname><given-names>VL</given-names></name></person-group><article-title>The epidemiology of lung cancer in women</article-title><source>Ann Epidemiol</source><volume>4</volume><fpage>102</fpage><lpage>110</lpage><year>1994</year></element-citation></ref>
<ref id="b35-ijo-41-04-1387"><label>35</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Levi</surname><given-names>F</given-names></name><name><surname>Franceschi</surname><given-names>S</given-names></name><name><surname>La Vecchia</surname><given-names>C</given-names></name><name><surname>Randimbison</surname><given-names>L</given-names></name><name><surname>Te</surname><given-names>VC</given-names></name></person-group><article-title>Lung carcinoma trends by histologic type in Vaud and Neuchatel, Switzerland, 1974&#x02013;1994</article-title><source>Cancer</source><volume>79</volume><fpage>906</fpage><lpage>914</lpage><year>1997</year></element-citation></ref>
<ref id="b36-ijo-41-04-1387"><label>36</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Thun</surname><given-names>MJ</given-names></name><name><surname>Lally</surname><given-names>CA</given-names></name><name><surname>Flannery</surname><given-names>JT</given-names></name><etal/></person-group><article-title>Cigarette smoking and changes in the histopathology of lung cancer</article-title><source>J Natl Cancer Inst</source><volume>89</volume><fpage>1580</fpage><lpage>1586</lpage><year>1997</year></element-citation></ref>
<ref id="b37-ijo-41-04-1387"><label>37</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Hoffmann</surname><given-names>D</given-names></name><name><surname>Rivenson</surname><given-names>A</given-names></name><name><surname>Hecht</surname><given-names>SS</given-names></name></person-group><article-title>The biological significance of tobacco-specific N-nitrosamines: smoking and adenocarcinoma of the lung</article-title><source>Crit Rev Toxicol</source><volume>26</volume><fpage>199</fpage><lpage>211</lpage><year>1996</year></element-citation></ref>
<ref id="b38-ijo-41-04-1387"><label>38</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Ren</surname><given-names>JH</given-names></name><name><surname>He</surname><given-names>WS</given-names></name><name><surname>Yan</surname><given-names>GL</given-names></name><etal/></person-group><article-title>EGFR mutations in non-small-cell lung cancer among smokers and non-smokers: a meta-analysis</article-title><source>Environ Mol Mutagen</source><volume>53</volume><fpage>78</fpage><lpage>82</lpage><year>2012</year></element-citation></ref></ref-list>
<sec sec-type="display-objects">
<title>Figures and Tables</title>
<fig id="f1-ijo-41-04-1387" position="float">
<label>Figure 1</label>
<caption>
<p>Methodology for network-based identification of the smoking-associated 7-gene signature.</p></caption>
<graphic xlink:href="IJO-41-04-1387-g02.gif"/></fig>
<fig id="f2-ijo-41-04-1387" position="float">
<label>Figure 2</label>
<caption>
<p>Prognosis in NSCLC patients using the smoking-associated 7-gene signature. In the cohorts from Shedden <italic>et al</italic> (<xref rid="b15-ijo-41-04-1387" ref-type="bibr">15</xref>), the risk score giving the best prediction on the 3-year ROC curve generated significant patient stratification (log-rank P&#x0003C;0.007) on the (A) training set and (B) independent test set. This classifier also stratified smokers with lung adenocarcinoma into two distinct (log-rank P&#x0003C;0.01) prognostic groups in both the (C) training and (D) test sets. Significant stratifications were also obtained in the randomly partitioned training and test sets of (E and F) patients with squamous cell carcinoma from Raponi <italic>et al</italic> (<xref rid="b16-ijo-41-04-1387" ref-type="bibr">16</xref>) and (G and H) the Bild cohort (<xref rid="b17-ijo-41-04-1387" ref-type="bibr">17</xref>) of lung adenocarcinoma and squamous cell carcinoma. Log-rank tests were used to assess the statistical significance of the difference in survival probability between the two prognostic groups. Red curves, low-risk patient group; green curves, high-risk patient group.</p></caption>
<graphic xlink:href="IJO-41-04-1387-g03.gif"/></fig>
<fig id="f3-ijo-41-04-1387" position="float">
<label>Figure 3</label>
<caption>
<p>NNK-induced gene and protein expression in H23 and BEAS-2B cells. Gene expression fold change in cell lines treated with NNK (100 nM) vs. control in (A) human lung adenocarcinoma cells H23 and (B) normal lung epithelial cells BEAS-2B. Gene expression was normalized to the endogenous control gene UBC. An asterisk above a bar indicates significant (P&#x0003C;0.05) differential expression in repeated ANOVA tests of five biological samples and three technical repeats in qRT-PCR assays. Protein expression measured by western blots in NNK-treated BEAS-2B and H23 cell lines for (C and D) EGFR, (E and F) p53, (G and H) phospho-p53 and (I and J) Rb. Protein expression was quantified by densitometry and normalized to the endogenous control protein GAPDH in three biological repeats.</p></caption>
<graphic xlink:href="IJO-41-04-1387-g04.gif"/></fig>
<fig id="f4-ijo-41-04-1387" position="float">
<label>Figure 4</label>
<caption>
<p>Smoking-associated coexpression network in lung adenocarcinoma. (A) Gene coexpression relations specific to smokers and non-smokers that were significant (P&#x0003C;0.05) in both the training and test sets from Shedden <italic>et al</italic> (<xref rid="b15-ijo-41-04-1387" ref-type="bibr">15</xref>) (network precision &#x0003D; 0.91, FDR &#x0003D; 0.01). (B) The stability of smoking-mediated networks, evaluated with random subsets of patients from the training cohort in 100 iterations. (C) Coexpression relations observed in the H23 cell line treated with NNK for 15 min, 1 h and 16 h.</p></caption>
<graphic xlink:href="IJO-41-04-1387-g05.gif"/></fig>
<table-wrap id="t1-ijo-41-04-1387" position="float">
<label>Table I</label>
<caption>
<p>Summary of clinical characteristics of patients from the Director&#x02019;s Challenge Study (<xref rid="b15-ijo-41-04-1387" ref-type="bibr">15</xref>).</p></caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="center" valign="middle"/>
<th colspan="2" align="center" valign="middle">UM and HLM (training set)
<hr/></th>
<th colspan="2" align="center" valign="middle">MSK and DFCI (test set)
<hr/></th></tr>
<tr>
<th align="center" valign="middle"/>
<th align="center" valign="middle">Smokers</th>
<th align="center" valign="middle">Non-smokers</th>
<th align="center" valign="middle">Smokers</th>
<th align="center" valign="middle">Non-smokers</th></tr></thead>
<tbody>
<tr>
<td align="left" valign="top">Patient sample size</td>
<td align="center" valign="top">149</td>
<td align="center" valign="top">20</td>
<td align="center" valign="top">151</td>
<td align="center" valign="top">29</td></tr>
<tr>
<td align="left" valign="top">Age (mean, s.d.)</td>
<td align="center" valign="top">65 (10)</td>
<td align="center" valign="top">68 (11)</td>
<td align="center" valign="top">63 (10)</td>
<td align="center" valign="top">66 (11)</td></tr>
<tr>
<td align="left" valign="top">Gender (male &#x00025;)</td>
<td align="center" valign="top">54</td>
<td align="center" valign="top">0</td>
<td align="center" valign="top">48</td>
<td align="center" valign="top">31</td></tr>
<tr>
<td align="left" valign="top">Median survival (mo)</td>
<td align="center" valign="top">42</td>
<td align="center" valign="top">54</td>
<td align="center" valign="top">48</td>
<td align="center" valign="top">43</td></tr>
<tr>
<td align="left" valign="top">Tumor stage (&#x00025;)</td>
<td align="center" valign="top"/>
<td align="center" valign="top"/>
<td align="center" valign="top"/>
<td align="center" valign="top"/></tr>
<tr>
<td align="left" valign="top">&#x02003;&#x02003;I</td>
<td align="center" valign="top">58</td>
<td align="center" valign="top">80</td>
<td align="center" valign="top">65</td>
<td align="center" valign="top">55</td></tr>
<tr>
<td align="left" valign="top">&#x02003;&#x02003;II</td>
<td align="center" valign="top">22</td>
<td align="center" valign="top">5</td>
<td align="center" valign="top">25</td>
<td align="center" valign="top">28</td></tr>
<tr>
<td align="left" valign="top">&#x02003;&#x02003;III</td>
<td align="center" valign="top">18</td>
<td align="center" valign="top">10</td>
<td align="center" valign="top">10</td>
<td align="center" valign="top">17</td></tr>
<tr>
<td align="left" valign="top">&#x02003;&#x02003;Unknown</td>
<td align="center" valign="top">1</td>
<td align="center" valign="top">5</td>
<td align="center" valign="top">0</td>
<td align="center" valign="top">0</td></tr></tbody></table></table-wrap>
<table-wrap id="t2-ijo-41-04-1387" position="float">
<label>Table II</label>
<caption>
<p>The identified smoking-associated 7-gene signature.</p></caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left" valign="middle">Gene symbol</th>
<th align="center" valign="middle">Gene title</th>
<th align="center" valign="middle">Molecular function (Gene Ontology)</th></tr></thead>
<tbody>
<tr>
<td align="left" valign="top">ABCA3</td>
<td align="left" valign="top">ATP-binding cassette, sub-family A (ABC1), member 3</td>
<td align="left" valign="top">ATP, nucleotide binding; ATPase, transporter activity</td></tr>
<tr>
<td align="left" valign="top">CRTAC1</td>
<td align="left" valign="top">Cartilage acidic protein 1</td>
<td align="left" valign="top">Calcium ion binding</td></tr>
<tr>
<td align="left" valign="top">CYP3A4</td>
<td align="left" valign="top">Cytochrome P450, family 3, subfamily A, polypeptide 4</td>
<td align="left" valign="top">Monooxygenase, electron carrier, oxidoreductase activity; heme, metal ion and steroid binding</td></tr>
<tr>
<td align="left" valign="top">GPRC5C</td>
<td align="left" valign="top">G protein-coupled receptor, family C, group 5, member C</td>
<td align="left" valign="top">Receptor activity; protein binding</td></tr>
<tr>
<td align="left" valign="top">LTF</td>
<td align="left" valign="top">Lactotransferrin</td>
<td align="left" valign="top">Ferric iron, heparin, metal ion, protein binding; peptidase, serine-type endopeptidase activity</td></tr>
<tr>
<td align="left" valign="top">PIGN</td>
<td align="left" valign="top">Phosphatidylinositol glycan anchor biosynthesis, class N</td>
<td align="left" valign="top">Phosphotransferase, transferase activity</td></tr>
<tr>
<td align="left" valign="top">SEMA3C</td>
<td align="left" valign="top">Sema domain, immunoglobulin domain (Ig), short basic domain, secreted, (semaphorin) 3C</td>
<td align="left" valign="top">Receptor activity; semaphorin receptor binding</td></tr></tbody></table></table-wrap>
<table-wrap id="t3-ijo-41-04-1387" position="float">
<label>Table III</label>
<caption>
<p>Multivariate Cox proportional hazards analysis of the 7-gene risk score and major clinical covariates in lung cancer patients who smoked from the test cohort (MSK and DFCI) in the Director&#x02019;s Challenge Study (<xref rid="b15-ijo-41-04-1387" ref-type="bibr">15</xref>).</p></caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left" valign="middle">Variable<xref rid="tfn1-ijo-41-04-1387" ref-type="table-fn"><sup>a</sup></xref></th>
<th align="center" valign="middle">P-value</th>
<th align="center" valign="middle">Hazard ratio (95&#x00025; CI)<xref rid="tfn2-ijo-41-04-1387" ref-type="table-fn"><sup>b</sup></xref></th></tr></thead>
<tbody>
<tr>
<td align="left" valign="top">Analysis without 7-gene risk score</td>
<td align="left" valign="top"/>
<td align="left" valign="top"/></tr>
<tr>
<td align="left" valign="top">&#x02003;&#x02003;Gender (male)</td>
<td align="left" valign="top">0.55</td>
<td align="left" valign="top">1.17 (0.70, 1.95)</td></tr>
<tr>
<td align="left" valign="top">&#x02003;&#x02003;Age at diagnosis (&#x0003E;60)</td>
<td align="left" valign="top">0.35</td>
<td align="left" valign="top">1.31 (0.74, 2.29)</td></tr>
<tr>
<td align="left" valign="top">&#x02003;&#x02003;Tumor differentiation</td>
<td align="left" valign="top"/>
<td align="left" valign="top"/></tr>
<tr>
<td align="left" valign="top">&#x02003;&#x02003;&#x02003;&#x02003;Moderately differentiated</td>
<td align="left" valign="top">0.30</td>
<td align="left" valign="top">0.63 (0.26, 1.51)</td></tr>
<tr>
<td align="left" valign="top">&#x02003;&#x02003;&#x02003;&#x02003;Poorly differentiated</td>
<td align="left" valign="top">0.89</td>
<td align="left" valign="top">1.06 (0.47, 2.38)</td></tr>
<tr>
<td align="left" valign="top">&#x02003;&#x02003;Cancer stage</td>
<td align="left" valign="top"/>
<td align="left" valign="top"/></tr>
<tr>
<td align="left" valign="top">&#x02003;&#x02003;&#x02003;&#x02003;II</td>
<td align="left" valign="top">1.54E-03</td>
<td align="left" valign="top">2.60 (1.44, 4.71)</td></tr>
<tr>
<td align="left" valign="top">&#x02003;&#x02003;&#x02003;&#x02003;III</td>
<td align="left" valign="top">5.53E-05</td>
<td align="left" valign="top">4.48 (2.16, 9.29)</td></tr>
<tr>
<td align="left" valign="top">Analysis with 7-gene risk score</td>
<td align="left" valign="top"/>
<td align="left" valign="top"/></tr>
<tr>
<td align="left" valign="top">&#x02003;&#x02003;Gender (male)</td>
<td align="left" valign="top">0.51</td>
<td align="left" valign="top">1.19 (0.71, 1.99)</td></tr>
<tr>
<td align="left" valign="top">&#x02003;&#x02003;Age at diagnosis (&#x0003E;60)</td>
<td align="left" valign="top">0.49</td>
<td align="left" valign="top">1.22 (0.69, 2.16)</td></tr>
<tr>
<td align="left" valign="top">&#x02003;&#x02003;Tumor differentiation</td>
<td align="left" valign="top"/>
<td align="left" valign="top"/></tr>
<tr>
<td align="left" valign="top">&#x02003;&#x02003;&#x02003;&#x02003;Moderately differentiated</td>
<td align="left" valign="top">0.33</td>
<td align="left" valign="top">0.65 (0.27, 1.55)</td></tr>
<tr>
<td align="left" valign="top">&#x02003;&#x02003;&#x02003;&#x02003;Poorly differentiated</td>
<td align="left" valign="top">0.93</td>
<td align="left" valign="top">0.96 (0.43, 2.16)</td></tr>
<tr>
<td align="left" valign="top">&#x02003;&#x02003;Cancer stage</td>
<td align="left" valign="top"/>
<td align="left" valign="top"/></tr>
<tr>
<td align="left" valign="top">&#x02003;&#x02003;&#x02003;&#x02003;II</td>
<td align="left" valign="top">1.64E-03</td>
<td align="left" valign="top">2.61 (1.44, 4.74)</td></tr>
<tr>
<td align="left" valign="top">&#x02003;&#x02003;&#x02003;&#x02003;III</td>
<td align="left" valign="top">3.29E-05</td>
<td align="left" valign="top">4.79 (2.29, 10.04)</td></tr>
<tr>
<td align="left" valign="top">7-gene risk score</td>
<td align="left" valign="top">0.03</td>
<td align="left" valign="top">1.89 (1.06, 3.38)</td></tr></tbody></table>
<table-wrap-foot><fn id="tfn1-ijo-41-04-1387">
<label>a</label>
<p>Gender was a binary variable (0 for female and 1 for male); age at diagnosis was a binary variable (0 for &#x0003C;60 years old and 1 otherwise); tumor grade was a categorical variable with 3 categories &#x0005B;well (the reference group), moderately and poorly differentiated&#x0005D;; tumor stage was a categorical variable with 3 categories &#x0005B;stage I (the reference group), stage II and stage III&#x0005D;.</p></fn><fn id="tfn2-ijo-41-04-1387">
<label>b</label>
<p>CI denotes confidence interval.</p></fn></table-wrap-foot></table-wrap></sec></back></article>
