<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v3.0 20080202//EN" "journalpublishing3.dtd">
<article xml:lang="en" article-type="research-article" xmlns:xlink="http://www.w3.org/1999/xlink">
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">Molecular Medicine Reports</journal-id>
<journal-title-group>
<journal-title>Molecular Medicine Reports</journal-title>
</journal-title-group>
<issn pub-type="ppub">1791-2997</issn>
<issn pub-type="epub">1791-3004</issn>
<publisher>
<publisher-name>D.A. Spandidos</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3892/mmr.2017.8234</article-id>
<article-id pub-id-type="publisher-id">mmr-17-02-3152</article-id>
<article-categories>
<subj-group>
<subject>Articles</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Transcriptomic signature predicts the distant relapse in patients with ER&#x002B; breast cancer treated with tamoxifen for five years</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author"><name><surname>Zhou</surname><given-names>Hao</given-names></name>
<xref rid="af1-mmr-17-02-3152" ref-type="aff"/>
<xref rid="fn1-mmr-17-02-3152" ref-type="author-notes">&#x002A;</xref></contrib>
<contrib contrib-type="author"><name><surname>Lv</surname><given-names>Qingfu</given-names></name>
<xref rid="af1-mmr-17-02-3152" ref-type="aff"/>
<xref rid="fn1-mmr-17-02-3152" ref-type="author-notes">&#x002A;</xref></contrib>
<contrib contrib-type="author"><name><surname>Guo</surname><given-names>Zhaoji</given-names></name>
<xref rid="af1-mmr-17-02-3152" ref-type="aff"/>
<xref rid="c1-mmr-17-02-3152" ref-type="corresp"/></contrib>
</contrib-group>
<aff id="af1-mmr-17-02-3152">Department of General Surgery, The First Affiliated Hospital of SooChow University, Suzhou, Jiangsu 215006, P.R. China</aff>
<author-notes>
<corresp id="c1-mmr-17-02-3152"><italic>Correspondence to</italic>: Dr Zhaoji Guo, Department of General Surgery, The First Affiliated Hospital of SooChow University, 188 Shizi Street, Suzhou, Jiangsu 215006, P.R. China, E-mail: <email>guozhaoji2017@163.com</email></corresp>
<fn id="fn1-mmr-17-02-3152"><label>&#x002A;</label><p>Contributed equally</p></fn>
</author-notes>
<pub-date pub-type="ppub"><month>02</month><year>2018</year></pub-date>
<pub-date pub-type="epub"><day>08</day><month>12</month><year>2017</year></pub-date>
<volume>17</volume>
<issue>2</issue>
<fpage>3152</fpage>
<lpage>3157</lpage>
<history>
<date date-type="received"><day>18</day><month>03</month><year>2017</year></date>
<date date-type="accepted"><day>06</day><month>09</month><year>2017</year></date>
</history>
<permissions>
<copyright-statement>Copyright &#x00A9; 2018, Spandidos Publications</copyright-statement>
<copyright-year>2018</copyright-year>
</permissions>
<abstract>
<p>Tamoxifen is the most commonly used drug to treat estrogen receptor positive (ER&#x002B;) breast cancer. However, many patients with ER&#x002B; breast cancer have experienced resistance and other adverse side effects following treatment with tamoxifen. Furthermore, clinical and pathological parameters have thus far failed to predict the efficacy of tamoxifen administration. Therefore, gene signature based models for the prediction of survival time of such patients are urgently needed. In the current study, gene expression levels and follow-up information of samples from the GSE17705 and GSE22219 datasets were used to construct a risk score model based on Cox multivariate regression. The expression levels of 10 genes were included in the model: CCNB2, CCNA2, FOXD1, WSB2, RBPMS, CTDSP1, BIN3, SLBP, EPRS and FTO. The samples in the high-risk group had a relatively early distant relapse (median survival time of 3.75 years) compared with the patients in the low-risk group (median survival time of 6.5 years, P&#x003C;0.01). For validation, two further independent datasets (GSE26971, GSE58644) were assessed. The overall survival time of patients with high-risk scores in these datasets was significantly shorter than that of patients with low-risk scores (P&#x003C;0.01). Furthermore, the associations between clinical parameters and risk score were investigated, and it was revealed that the risk score was significantly correlated with age, tumor stage and grade. In addition, a 5-year survival nomogram was plotted in order to facilitate the utilization of risk score along with other clinical data. In summary, using the transcriptomic profile, a multi-gene expression based risk score was developed and was revealed as being able to successfully predict the outcome of patients with ER&#x002B; breast cancer treated with tamoxifen for 5 years.</p>
</abstract>
<kwd-group>
<kwd>prognosis</kwd>
<kwd>breast cancer</kwd>
<kwd>tamoxifen</kwd>
<kwd>gene expression</kwd>
<kwd>risk model</kwd>
</kwd-group>
</article-meta>
</front>
<body>
<sec sec-type="intro">
<title>Introduction</title>
<p>Breast cancer is the most prevalent malignancy in women worldwide (<xref rid="b1-mmr-17-02-3152" ref-type="bibr">1</xref>). A recent report conducted in China revealed that 272,400 new diagnoses, as well as 70,700 mortalities, occur annually as a result of breast cancer (<xref rid="b2-mmr-17-02-3152" ref-type="bibr">2</xref>). Molecular subtyping of breast cancer is relatively well established, and tamoxifen represents the most common drug prescribed to patients with breast cancer. However, relapse occurs in a large proportion of patients with estrogen-receptor positive (ER&#x002B;) breast cancer treated with tamoxifen (<xref rid="b3-mmr-17-02-3152" ref-type="bibr">3</xref>), and current clinical practice is insufficient for accurate prognosis. Previous research has identified survival-associated genomic signatures of breast cancer. For example, high expression of the GATA binding protein 3 gene has been reported to be associated with prolonged progression-free survival in patients with ER&#x002B; breast cancer (<xref rid="b4-mmr-17-02-3152" ref-type="bibr">4</xref>). Furthermore, patients with a reduced level of Beclin 1 expression demonstrated a higher sensitivity to tamoxifen and a prolonged survival time (<xref rid="b5-mmr-17-02-3152" ref-type="bibr">5</xref>). In addition, a high protein expression level of enhancer of zeste 2 polycomb repressive complex 2 subunit (EZH2) has been reported to be associated with the development of distant metastases in breast cancer (<xref rid="b6-mmr-17-02-3152" ref-type="bibr">6</xref>).</p>
<p>However, the clinical prognostic effect of single molecular biomarkers varies across datasets, whereas a multiple gene expression-based staging method is robust across datasets (<xref rid="b7-mmr-17-02-3152" ref-type="bibr">7</xref>&#x2013;<xref rid="b10-mmr-17-02-3152" ref-type="bibr">10</xref>). In the present study, a transcriptome-based risk score for the prediction of survival in patients with ER&#x002B; breast cancer treated with tamoxifen was developed using the Cox multivariate regression model. Risk scores were developed using cyclin B2 (CCNB2), glutamyl-prolyl-tRNA synthetase (EPRS), &#x03B1;-ketoglutarate dependent dioxygenase (FTO), stem-loop binding protein (SLBP), CTD small phosphatase 1 (CTDSP1), cyclin A2 (CCNA2), bridging integrator 3 (BIN3), RNA binding protein with multiple splicing (RBPMS), forkhead box D1 (FOXD1) and WD repeat and SOCS box containing 2 (WSB2); the resultant model, based upon the expression levels of these genes, was revealed to successfully predict survival time in the training (GSE17705) and validation (GSE22219, GSE26971 and GSE58644) datasets. Median survival time of the high-risk and the low-risk group was 3.75 and 6.5 years, respectively. Furthermore, the associations between risk score and clinical parameters were investigated and it was demonstrated that age, grade and stage were significantly associated with risk score. A 5-year survival nomogram was plotted in order to facilitate the utilization of the risk score, which was demonstrated to be an important clinical indicator for prognosis. In conclusion, this study has developed a robust risk score staging system for the prediction of survival in patients with ER&#x002B; breast cancer treated with tamoxifen.</p>
</sec>
<sec sec-type="materials|methods">
<title>Materials and methods</title>
<sec>
<title/>
<sec>
<title>Sample enrollment and data pre-analysis</title>
<p>The following key words were searched for in the Gene Expression Omnibus (GEO) database: &#x2018;Breast cancer&#x2019;, &#x2018;tamoxifen&#x2019;, &#x2018;expression&#x2019; and &#x2018;microarray&#x2019;. Datasets with &#x003C;100 ER&#x002B; tamoxifen-treated samples, or datasets without survival information, were manually filtered out. Four datasets, GSE17705, GSE22219, GSE26971 and GSE58644, were retained for further analysis. Samples that were not primary tumor tissue were also excluded during this process. Raw data were then downloaded in the CEL format from GEO. Background correction and normalization with Robust Multiarray Averaging were carried out using the &#x2018;rma&#x2019; function of the R package &#x2018;affy&#x2019; (v1.54.0). Probe and gene names were matched according to the manufacturer-provided annotation file. Where a gene was represented by more than one probe, the probe values were averaged and the mean was retained as the expression level of the corresponding gene.</p>
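<p>The probe-to-gene averaging step can be illustrated with a minimal Python sketch. The original analysis was carried out in R; the function name, probe identifiers and expression values below are hypothetical and serve only to show the averaging rule, and probes without an annotation are skipped here as an assumption.</p>
<preformat>
```python
from collections import defaultdict
from statistics import mean


def collapse_probes_to_genes(probe_values, probe_to_gene):
    """Average the expression of all probes annotated to the same gene."""
    grouped = defaultdict(list)
    for probe, value in probe_values.items():
        gene = probe_to_gene.get(probe)
        if gene is not None:  # assumption: unannotated probes are dropped
            grouped[gene].append(value)
    return {gene: mean(values) for gene, values in grouped.items()}


# Hypothetical probe IDs and normalized values, for illustration only.
example_probes = {"probe_a": 8.0, "probe_b": 9.0, "probe_c": 5.5}
example_annotation = {"probe_a": "CCNB2", "probe_b": "CCNB2", "probe_c": "FTO"}
gene_expression = collapse_probes_to_genes(example_probes, example_annotation)
```
</preformat>
<p>In this toy input the two probes mapped to CCNB2 are merged into a single mean value, whereas FTO keeps the value of its only probe.</p>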
</sec>
<sec>
<title>Risk score model development</title>
<p>Cox univariate regression was implemented in both the GSE26971 and GSE17705 datasets by correlating the expression of each individual gene with the survival information, using the R package &#x2018;survival&#x2019;. Genes significantly correlated with distant metastasis-free survival time in both the GSE26971 and GSE17705 datasets were retained for further analyses as candidate genes. Random forest variable hunting was applied for the selection of a reasonable combination of candidate genes using the R package &#x2018;RandomForestSRC&#x2019; (v1.9.0). The parameters used were 100 repeats and 100 iterations. Following this, multivariate Cox regression analysis was carried out in order to develop the linear risk score model using the selected candidate genes, and the coefficients were estimated in the training dataset, GSE17705. In the validation datasets (GSE22219, GSE26971 and GSE58644), these coefficients were locked and used to calculate the risk score of each sample.</p>
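<p>The candidate-gene filter described above, which retains only genes significant in both screening datasets, can be sketched as a set intersection. This Python sketch is illustrative only: the P-values are invented, the gene GENE_X and the function name are hypothetical, and the significance threshold of 0.05 is an assumption, since the cutoff is not stated in the text.</p>
<preformat>
```python
def shared_candidates(p_values_by_dataset, alpha=0.05):
    """Return genes whose univariate P-value is below alpha in every dataset."""
    significant_sets = [
        {gene for gene, p in p_values.items() if alpha > p}
        for p_values in p_values_by_dataset.values()
    ]
    return sorted(set.intersection(*significant_sets))


# Hypothetical univariate Cox P-values, invented for illustration only.
example_p_values = {
    "GSE17705": {"CCNB2": 0.001, "FTO": 0.020, "GENE_X": 0.300},
    "GSE26971": {"CCNB2": 0.004, "FTO": 0.010, "GENE_X": 0.010},
}
candidates = shared_candidates(example_p_values)
```
</preformat>
<p>Here GENE_X is dropped because it is significant in only one of the two datasets.</p>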
</sec>
<sec>
<title>Statistical analysis</title>
<p>All statistical analyses were performed using R software (v3.0.1; <uri xlink:href="https://www.r-project.org">https://www.r-project.org</uri>) and R packages. Normalization of Affymetrix raw data was performed with the R package &#x2018;affy&#x2019; using the function &#x2018;rma&#x2019;. The survival analysis and Cox proportional hazards model construction were performed with the R package &#x2018;survival&#x2019;. Random forest variable hunting was implemented with the R package &#x2018;RandomForestSRC&#x2019;, and receiver operating characteristic (ROC) curves were generated with the R package &#x2018;pROC&#x2019; (<xref rid="b11-mmr-17-02-3152" ref-type="bibr">11</xref>). The nomogram was plotted with the clinical data in the training dataset using the R package &#x2018;rms&#x2019;.</p>
</sec>
</sec>
</sec>
<sec sec-type="results">
<title>Results</title>
<sec>
<title/>
<sec>
<title>Gene selection and model development</title>
<p>The detailed workflow of gene selection and model development is presented in <xref rid="f1-mmr-17-02-3152" ref-type="fig">Fig. 1A</xref>. The associations between gene expression levels and treatment outcomes (survival data) were assessed using Cox univariate regression. Genes associated with overall survival in both the GSE17705 and GSE26971 datasets were identified, and a total of 48 genes were selected as candidates. Random forest variable hunting was then performed in order to select the optimal combination of candidate genes. Following identification of 10 genes (<xref rid="f1-mmr-17-02-3152" ref-type="fig">Fig. 1B</xref>), a risk score was constructed from the expression levels of these 10 genes using Cox multivariate regression. The coefficients are presented in <xref rid="f1-mmr-17-02-3152" ref-type="fig">Fig. 1C</xref>, and parameters of Cox regression are shown in <xref rid="tI-mmr-17-02-3152" ref-type="table">Table I</xref>. The risk scores were calculated as follows (where gene names represent their respective expression levels): Risk score = (0.299988203) &#x002A; cyclin B2 (CCNB2) &#x002B; (0.640775607) &#x002A; glutamyl-prolyl-tRNA synthetase (EPRS) &#x002B; (&#x2212;0.756716676) &#x002A; &#x03B1;-ketoglutarate dependent dioxygenase (FTO) &#x002B; (0.117814961) &#x002A; stem-loop binding protein (SLBP) &#x002B; (0.245606283) &#x002A; CTD small phosphatase 1 (CTDSP1) &#x002B; (&#x2212;0.161767842) &#x002A; cyclin A2 (CCNA2) &#x002B; (0.196307548) &#x002A; bridging integrator 3 (BIN3) &#x002B; (&#x2212;0.618268545) &#x002A; RNA binding protein with multiple splicing (RBPMS) &#x002B; (0.580014194) &#x002A; forkhead box D1 (FOXD1) &#x002B; (&#x2212;0.288974361) &#x002A; WD repeat and SOCS box containing 2 (WSB2).</p>
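<p>The linear risk score can be written as a short Python sketch that applies the locked coefficients listed above to a per-sample expression profile. The coefficients are those reported in the text; the function name and the choice to raise an error for missing genes are assumptions made for illustration, and the input is assumed to hold normalized gene-level expression values.</p>
<preformat>
```python
# Locked Cox coefficients as reported in the text (training dataset GSE17705).
COEFFICIENTS = {
    "CCNB2": 0.299988203,
    "EPRS": 0.640775607,
    "FTO": -0.756716676,
    "SLBP": 0.117814961,
    "CTDSP1": 0.245606283,
    "CCNA2": -0.161767842,
    "BIN3": 0.196307548,
    "RBPMS": -0.618268545,
    "FOXD1": 0.580014194,
    "WSB2": -0.288974361,
}


def risk_score(expression):
    """Weighted sum of the 10 gene expression levels for one sample."""
    missing = [gene for gene in COEFFICIENTS if gene not in expression]
    if missing:  # assumption: a sample must provide all 10 genes
        raise KeyError("missing expression values for: " + ", ".join(missing))
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())
```
</preformat>
<p>Because the coefficients are fixed, the same function can be applied unchanged to samples from a validation dataset, provided that the expression values have been processed in the same way as the training data.</p>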
</sec>
<sec>
<title>Prognostic values of the risk score in the training dataset</title>
<p>Patients were divided into a high-risk group and a low-risk group, using the median risk score as the cutoff. The difference in survival between the two groups was then assessed, and the results revealed that the high-risk group had a reduced relapse-free time compared with the low-risk group, with median survival times of 3.75 vs. 6.5 years, respectively (P&#x003C;0.001; <xref rid="f2-mmr-17-02-3152" ref-type="fig">Fig. 2A</xref>). The high-risk group tended to exhibit early metastasis, and genes with high expression levels tended to have positive coefficients, whereas genes with low expression tended to have negative coefficients (<xref rid="f2-mmr-17-02-3152" ref-type="fig">Fig. 2B</xref>). The 5-year distant relapse-free survival rate was 75&#x0025; in the high-risk group and 96&#x0025; in the low-risk group. These results indicated that the developed risk score was an effective predictor of distant relapse-free survival time in patients with ER&#x002B; breast cancer treated with tamoxifen.</p>
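<p>The median-based grouping can be sketched in Python as follows. The sample identifiers and scores are hypothetical, and assigning samples whose score equals the median to the low-risk group is an assumption, as the handling of ties is not described in the text.</p>
<preformat>
```python
from statistics import median


def split_by_median(scores):
    """Label each sample 'high' or 'low' relative to the median risk score."""
    cutoff = median(scores.values())
    # Assumption: a score equal to the cutoff is assigned to the low-risk group.
    return {
        sample: ("high" if score > cutoff else "low")
        for sample, score in scores.items()
    }


# Hypothetical risk scores, for illustration only.
example_scores = {"s1": 0.2, "s2": 1.4, "s3": -0.3, "s4": 0.9}
risk_groups = split_by_median(example_scores)
```
</preformat>
<p>With these four scores the cutoff is 0.55, so samples s2 and s4 are labeled high-risk and samples s1 and s3 low-risk.</p>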
</sec>
<sec>
<title>Risk score performance validation</title>
<p>Considering that the risk score staging system was developed based upon gene expression data in the GSE17705 dataset, there was a potential risk that the model would over-fit to the dataset. In order to assess the robustness of the developed risk score model, three independent datasets (GSE22219, GSE26971 and GSE58644) were used for further validation. Following the locking of the coefficients for the 10 genes, a risk score for each patient was calculated. As with the patients in the training dataset, the patients in each of the three independent datasets were divided into high-risk and low-risk groups using the median risk score as the cutoff value. The patients with high-risk scores tended to have early relapse, as was similarly demonstrated in patients belonging to the training dataset (<xref rid="f3-mmr-17-02-3152" ref-type="fig">Fig. 3A</xref>). Furthermore, the expression profiles of the 10 genes in both the low-risk and the high-risk groups resembled those demonstrated by the training dataset (<xref rid="f3-mmr-17-02-3152" ref-type="fig">Fig. 3B</xref>). These results demonstrate that the risk score model is robust across datasets for the prediction of distant relapse in patients with ER&#x002B; breast cancer treated with tamoxifen.</p>
</sec>
<sec>
<title>Risk score and clinical information</title>
<p>Subsequently, the associations between clinical parameters (stage, age, grade, lymph node invasion and primary tumor size) and the risk score were evaluated. As revealed in <xref rid="f4-mmr-17-02-3152" ref-type="fig">Fig. 4A</xref>, age, tumor stage and grade were significantly associated with the risk score, whereas the other clinical parameters were not (P&#x003E;0.05). To facilitate the utilization of the risk score, a 5-year distant relapse nomogram was plotted (<xref rid="f4-mmr-17-02-3152" ref-type="fig">Fig. 4B</xref>). According to this nomogram, risk score was one of the most important metastatic indicators.</p>
</sec>
</sec>
</sec>
<sec sec-type="discussion">
<title>Discussion</title>
<p>Tamoxifen is the most frequently used drug for the treatment of patients with ER&#x002B; breast cancer. However, tamoxifen drug resistance has previously been observed (<xref rid="b2-mmr-17-02-3152" ref-type="bibr">2</xref>). The underlying mechanism of how tamoxifen drug resistance develops remains unclear. In order to predict the survival time of patients treated with tamoxifen, this study has developed a predictive risk score staging system based upon gene expression levels. According to the developed model, the risk score successfully predicted the survival time of patients across both training and test datasets. In addition, associations between risk score and pathological parameters were assessed. The proposed nomogram demonstrated that the risk score was one of the most important indicators for prognosis.</p>
<p>Among the included genes, FOXD1 has previously been reported to promote migration and to be associated with drug resistance in glioma (<xref rid="b12-mmr-17-02-3152" ref-type="bibr">12</xref>). CCNA2 was revealed to correlate closely with distant metastasis-free, recurrence-free and overall survival in breast cancer; in addition, it also contributes to tamoxifen resistance in patients with ER&#x002B; breast cancer (<xref rid="b13-mmr-17-02-3152" ref-type="bibr">13</xref>). CCNB2 has previously been demonstrated to serve as an independent biomarker for invasive breast cancer, and elevated CCNB2 has previously been revealed to be associated with poor patient survival (<xref rid="b14-mmr-17-02-3152" ref-type="bibr">14</xref>). Although little is known about FTO expression and breast cancer, gene polymorphism of FTO has been revealed to be associated with carcinogenesis and survival of patients with breast cancer (<xref rid="b15-mmr-17-02-3152" ref-type="bibr">15</xref>,<xref rid="b16-mmr-17-02-3152" ref-type="bibr">16</xref>). Another gene, CTDSP1, inhibits cancer cell migration and invasion (<xref rid="b17-mmr-17-02-3152" ref-type="bibr">17</xref>). According to recent findings, EPRS is a regulator of cell proliferation in ER&#x002B; breast cancer, and elevated EPRS expression has been demonstrated to be associated with reduced distant relapse-free survival in patients treated with tamoxifen for 5 years (<xref rid="b18-mmr-17-02-3152" ref-type="bibr">18</xref>). Enhanced RBPMS expression has been revealed to significantly repress activator protein 1 signaling activity, and thus regulate the proliferation and migration of breast cancer cells (<xref rid="b19-mmr-17-02-3152" ref-type="bibr">19</xref>).
The aforementioned candidate genes were associated with either the survival of patients with breast cancer or tamoxifen resistance/sensitivity, which may explain why a risk score based upon their expression levels proved effective for predicting the survival time of patients with ER&#x002B; breast cancer. However, none of the candidate genes was significantly associated with survival across all of the included datasets (data not shown), indicating that the expression level of a single gene is a less robust predictor of survival time in patients with ER&#x002B; breast cancer than a cumulative risk score.</p>
<p>In conclusion, the current model developed in this study is robust across datasets in the prediction of the survival time of patients with ER&#x002B; breast cancer treated with tamoxifen.</p>
</sec>
</body>
<back>
<ref-list>
<title>References</title>
<ref id="b1-mmr-17-02-3152"><label>1</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Torre</surname><given-names>LA</given-names></name><name><surname>Bray</surname><given-names>F</given-names></name><name><surname>Siegel</surname><given-names>RL</given-names></name><name><surname>Ferlay</surname><given-names>J</given-names></name><name><surname>Lortet-Tieulent</surname><given-names>J</given-names></name><name><surname>Jemal</surname><given-names>A</given-names></name></person-group><article-title>Global cancer statistics, 2012</article-title><source>CA Cancer J Clin</source><volume>65</volume><fpage>87</fpage><lpage>108</lpage><year>2015</year><pub-id pub-id-type="doi">10.3322/caac.21262</pub-id></element-citation></ref>
<ref id="b2-mmr-17-02-3152"><label>2</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Chen</surname><given-names>W</given-names></name><name><surname>Zheng</surname><given-names>R</given-names></name><name><surname>Baade</surname><given-names>PD</given-names></name><name><surname>Zhang</surname><given-names>S</given-names></name><name><surname>Zeng</surname><given-names>H</given-names></name><name><surname>Bray</surname><given-names>F</given-names></name><name><surname>Jemal</surname><given-names>A</given-names></name><name><surname>Yu</surname><given-names>XQ</given-names></name><name><surname>He</surname><given-names>J</given-names></name></person-group><article-title>Cancer statistics in China, 2015</article-title><source>CA Cancer J Clin</source><volume>66</volume><fpage>115</fpage><lpage>132</lpage><year>2016</year><pub-id pub-id-type="doi">10.3322/caac.21338</pub-id><pub-id pub-id-type="pmid">26808342</pub-id></element-citation></ref>
<ref id="b3-mmr-17-02-3152"><label>3</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Zembutsu</surname><given-names>H</given-names></name></person-group><article-title>Pharmacogenomics toward personalized tamoxifen therapy for breast cancer</article-title><source>Pharmacogenomics</source><volume>16</volume><fpage>287</fpage><lpage>296</lpage><year>2015</year><pub-id pub-id-type="doi">10.2217/pgs.14.171</pub-id><pub-id pub-id-type="pmid">25712191</pub-id></element-citation></ref>
<ref id="b4-mmr-17-02-3152"><label>4</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Liu</surname><given-names>J</given-names></name><name><surname>Prager-van der Smissen</surname><given-names>WJ</given-names></name><name><surname>Look</surname><given-names>MP</given-names></name><name><surname>Sieuwerts</surname><given-names>AM</given-names></name><name><surname>Smid</surname><given-names>M</given-names></name><name><surname>Meijer-van Gelder</surname><given-names>ME</given-names></name><name><surname>Foekens</surname><given-names>JA</given-names></name><name><surname>Hollestelle</surname><given-names>A</given-names></name><name><surname>Martens</surname><given-names>JW</given-names></name></person-group><article-title>GATA3 mRNA expression, but not mutation, associates with longer progression-free survival in ER-positive breast cancer patients treated with first-line tamoxifen for recurrent disease</article-title><source>Cancer Lett</source><volume>376</volume><fpage>104</fpage><lpage>109</lpage><year>2016</year><pub-id pub-id-type="doi">10.1016/j.canlet.2016.03.038</pub-id><pub-id pub-id-type="pmid">27018307</pub-id></element-citation></ref>
<ref id="b5-mmr-17-02-3152"><label>5</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Gu</surname><given-names>Y</given-names></name><name><surname>Chen</surname><given-names>T</given-names></name><name><surname>Li</surname><given-names>G</given-names></name><name><surname>Xu</surname><given-names>C</given-names></name><name><surname>Xu</surname><given-names>Z</given-names></name><name><surname>Zhang</surname><given-names>J</given-names></name><name><surname>He</surname><given-names>K</given-names></name><name><surname>Zheng</surname><given-names>L</given-names></name><name><surname>Guan</surname><given-names>Z</given-names></name><name><surname>Su</surname><given-names>X</given-names></name><etal/></person-group><article-title>Lower Beclin 1 downregulates HER2 expression to enhance tamoxifen sensitivity and predicts a favorable outcome for ER positive breast cancer</article-title><source>Oncotarget</source><volume>8</volume><fpage>52156</fpage><lpage>52177</lpage><year>2017</year><pub-id pub-id-type="pmid">28881721</pub-id><pub-id pub-id-type="pmcid">5581020</pub-id></element-citation></ref>
<ref id="b6-mmr-17-02-3152"><label>6</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Reijm</surname><given-names>EA</given-names></name><name><surname>Timmermans</surname><given-names>AM</given-names></name><name><surname>Look</surname><given-names>MP</given-names></name><name><surname>Meijer-van Gelder</surname><given-names>ME</given-names></name><name><surname>Stobbe</surname><given-names>CK</given-names></name><name><surname>van Deurzen</surname><given-names>CH</given-names></name><name><surname>Martens</surname><given-names>JW</given-names></name><name><surname>Sleijfer</surname><given-names>S</given-names></name><name><surname>Foekens</surname><given-names>JA</given-names></name><name><surname>Berns</surname><given-names>PM</given-names></name><name><surname>Jansen</surname><given-names>MP</given-names></name></person-group><article-title>High protein expression of EZH2 is related to unfavorable outcome to tamoxifen in metastatic breast cancer</article-title><source>Ann Oncol</source><volume>25</volume><fpage>2185</fpage><lpage>2190</lpage><year>2014</year><pub-id pub-id-type="doi">10.1093/annonc/mdu391</pub-id><pub-id pub-id-type="pmid">25193989</pub-id></element-citation></ref>
<ref id="b7-mmr-17-02-3152"><label>7</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Bou Samra</surname><given-names>E</given-names></name><name><surname>Klein</surname><given-names>B</given-names></name><name><surname>Commes</surname><given-names>T</given-names></name><name><surname>Moreaux</surname><given-names>J</given-names></name></person-group><article-title>Development of gene expression-based risk score in cytogenetically normal acute myeloid leukemia patients</article-title><source>Oncotarget</source><volume>3</volume><fpage>824</fpage><lpage>832</lpage><year>2012</year><pub-id pub-id-type="doi">10.18632/oncotarget.571</pub-id><pub-id pub-id-type="pmid">22287500</pub-id><pub-id pub-id-type="pmcid">3292884</pub-id></element-citation></ref>
<ref id="b8-mmr-17-02-3152"><label>8</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Salazar</surname><given-names>R</given-names></name><name><surname>Roepman</surname><given-names>P</given-names></name><name><surname>Capella</surname><given-names>G</given-names></name><name><surname>Moreno</surname><given-names>V</given-names></name><name><surname>Simon</surname><given-names>I</given-names></name><name><surname>Dreezen</surname><given-names>C</given-names></name><name><surname>Lopez-Doriga</surname><given-names>A</given-names></name><name><surname>Santos</surname><given-names>C</given-names></name><name><surname>Marijnen</surname><given-names>C</given-names></name><name><surname>Westerga</surname><given-names>J</given-names></name><etal/></person-group><article-title>Gene expression signature to improve prognosis prediction of stage II and III colorectal cancer</article-title><source>J Clin Oncol</source><volume>29</volume><fpage>17</fpage><lpage>24</lpage><year>2011</year><pub-id pub-id-type="doi">10.1200/JCO.2010.30.1077</pub-id><pub-id pub-id-type="pmid">21098318</pub-id></element-citation></ref>
<ref id="b9-mmr-17-02-3152"><label>9</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Bou Samra</surname><given-names>E</given-names></name><name><surname>Klein</surname><given-names>B</given-names></name><name><surname>Commes</surname><given-names>T</given-names></name><name><surname>Moreaux</surname><given-names>J</given-names></name></person-group><article-title>Identification of a 20-gene expression-based risk score as a predictor of clinical outcome in chronic lymphocytic leukemia patients</article-title><source>Biomed Res Int</source><volume>2014</volume><fpage>423174</fpage><year>2014</year><pub-id pub-id-type="doi">10.1155/2014/423174</pub-id><pub-id pub-id-type="pmid">24883311</pub-id><pub-id pub-id-type="pmcid">4026849</pub-id></element-citation></ref>
<ref id="b10-mmr-17-02-3152"><label>10</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Kim</surname><given-names>SK</given-names></name><name><surname>Kim</surname><given-names>SY</given-names></name><name><surname>Kim</surname><given-names>JH</given-names></name><name><surname>Roh</surname><given-names>SA</given-names></name><name><surname>Cho</surname><given-names>DH</given-names></name><name><surname>Kim</surname><given-names>YS</given-names></name><name><surname>Kim</surname><given-names>JC</given-names></name></person-group><article-title>A nineteen gene-based risk score classifier predicts prognosis of colorectal cancer patients</article-title><source>Mol Oncol</source><volume>8</volume><fpage>1653</fpage><lpage>1666</lpage><year>2014</year><pub-id pub-id-type="doi">10.1016/j.molonc.2014.06.016</pub-id><pub-id pub-id-type="pmid">25049118</pub-id><pub-id pub-id-type="pmcid">5528589</pub-id></element-citation></ref>
<ref id="b11-mmr-17-02-3152"><label>11</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Robin</surname><given-names>X</given-names></name><name><surname>Turck</surname><given-names>N</given-names></name><name><surname>Hainard</surname><given-names>A</given-names></name><name><surname>Tiberti</surname><given-names>N</given-names></name><name><surname>Lisacek</surname><given-names>F</given-names></name><name><surname>Sanchez</surname><given-names>JC</given-names></name><name><surname>M&#x00FC;ller</surname><given-names>M</given-names></name></person-group><article-title>pROC: An open-source package for R and S&#x002B; to analyze and compare ROC curves</article-title><source>BMC Bioinformatics</source><volume>12</volume><fpage>77</fpage><year>2011</year><pub-id pub-id-type="doi">10.1186/1471-2105-12-77</pub-id><pub-id pub-id-type="pmid">21414208</pub-id><pub-id pub-id-type="pmcid">3068975</pub-id></element-citation></ref>
<ref id="b12-mmr-17-02-3152"><label>12</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Gao</surname><given-names>YF</given-names></name><name><surname>Zhu</surname><given-names>T</given-names></name><name><surname>Mao</surname><given-names>XY</given-names></name><name><surname>Mao</surname><given-names>CX</given-names></name><name><surname>Li</surname><given-names>L</given-names></name><name><surname>Yin</surname><given-names>JY</given-names></name><name><surname>Zhou</surname><given-names>HH</given-names></name><name><surname>Liu</surname><given-names>ZQ</given-names></name></person-group><article-title>Silencing of Forkhead box D1 inhibits proliferation and migration in glioma cells</article-title><source>Oncol Rep</source><volume>37</volume><fpage>1196</fpage><lpage>1202</lpage><year>2017</year><pub-id pub-id-type="doi">10.3892/or.2017.5344</pub-id><pub-id pub-id-type="pmid">28075458</pub-id></element-citation></ref>
<ref id="b13-mmr-17-02-3152"><label>13</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Gao</surname><given-names>T</given-names></name><name><surname>Han</surname><given-names>Y</given-names></name><name><surname>Yu</surname><given-names>L</given-names></name><name><surname>Ao</surname><given-names>S</given-names></name><name><surname>Li</surname><given-names>Z</given-names></name><name><surname>Ji</surname><given-names>J</given-names></name></person-group><article-title>CCNA2 is a prognostic biomarker for ER&#x002B; breast cancer and tamoxifen resistance</article-title><source>PLoS One</source><volume>9</volume><fpage>e91771</fpage><year>2014</year><pub-id pub-id-type="doi">10.1371/journal.pone.0091771</pub-id><pub-id pub-id-type="pmid">24622579</pub-id><pub-id pub-id-type="pmcid">3951414</pub-id></element-citation></ref>
<ref id="b14-mmr-17-02-3152"><label>14</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Shubbar</surname><given-names>E</given-names></name><name><surname>Kov&#x00E1;cs</surname><given-names>A</given-names></name><name><surname>Hajizadeh</surname><given-names>S</given-names></name><name><surname>Parris</surname><given-names>TZ</given-names></name><name><surname>Nemes</surname><given-names>S</given-names></name><name><surname>Gunnarsd&#x00F3;ttir</surname><given-names>K</given-names></name><name><surname>Einbeigi</surname><given-names>Z</given-names></name><name><surname>Karlsson</surname><given-names>P</given-names></name><name><surname>Helou</surname><given-names>K</given-names></name></person-group><article-title>Elevated cyclin B2 expression in invasive breast carcinoma is associated with unfavorable clinical outcome</article-title><source>BMC Cancer</source><volume>13</volume><fpage>1</fpage><year>2013</year><pub-id pub-id-type="doi">10.1186/1471-2407-13-1</pub-id><pub-id pub-id-type="pmid">23282137</pub-id><pub-id pub-id-type="pmcid">3545739</pub-id></element-citation></ref>
<ref id="b15-mmr-17-02-3152"><label>15</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Tan</surname><given-names>A</given-names></name><name><surname>Dang</surname><given-names>Y</given-names></name><name><surname>Chen</surname><given-names>G</given-names></name><name><surname>Mo</surname><given-names>Z</given-names></name></person-group><article-title>Overexpression of the fat mass and obesity associated gene (FTO) in breast cancer and its clinical implications</article-title><source>Int J Clin Exp Pathol</source><volume>8</volume><fpage>13405</fpage><lpage>13410</lpage><year>2015</year><pub-id pub-id-type="pmid">26722548</pub-id><pub-id pub-id-type="pmcid">4680493</pub-id></element-citation></ref>
<ref id="b16-mmr-17-02-3152"><label>16</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Zeng</surname><given-names>X</given-names></name><name><surname>Ban</surname><given-names>Z</given-names></name><name><surname>Cao</surname><given-names>J</given-names></name><name><surname>Zhang</surname><given-names>W</given-names></name><name><surname>Chu</surname><given-names>T</given-names></name><name><surname>Lei</surname><given-names>D</given-names></name><name><surname>Du</surname><given-names>Y</given-names></name></person-group><article-title>Association of FTO mutations with risk and survival of breast cancer in a Chinese population</article-title><source>Dis Markers</source><volume>2015</volume><fpage>101032</fpage><year>2015</year><pub-id pub-id-type="doi">10.1155/2015/101032</pub-id><pub-id pub-id-type="pmid">26146447</pub-id><pub-id pub-id-type="pmcid">4471376</pub-id></element-citation></ref>
<ref id="b17-mmr-17-02-3152"><label>17</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Sun</surname><given-names>T</given-names></name><name><surname>Fu</surname><given-names>J</given-names></name><name><surname>Shen</surname><given-names>T</given-names></name><name><surname>Lin</surname><given-names>X</given-names></name><name><surname>Liao</surname><given-names>L</given-names></name><name><surname>Feng</surname><given-names>XH</given-names></name><name><surname>Xu</surname><given-names>J</given-names></name></person-group><article-title>The small c-terminal domain phosphatase 1 inhibits cancer cell migration and invasion by dephosphorylating ser(p)68-twist1 to accelerate twist1 protein degradation</article-title><source>J Biol Chem</source><volume>291</volume><fpage>11518</fpage><lpage>11528</lpage><year>2016</year><pub-id pub-id-type="doi">10.1074/jbc.M116.721795</pub-id><pub-id pub-id-type="pmid">26975371</pub-id><pub-id pub-id-type="pmcid">4882423</pub-id></element-citation></ref>
<ref id="b18-mmr-17-02-3152"><label>18</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Katsyv</surname><given-names>I</given-names></name><name><surname>Wang</surname><given-names>M</given-names></name><name><surname>Song</surname><given-names>WM</given-names></name><name><surname>Zhou</surname><given-names>X</given-names></name><name><surname>Zhao</surname><given-names>Y</given-names></name><name><surname>Park</surname><given-names>S</given-names></name><name><surname>Zhu</surname><given-names>J</given-names></name><name><surname>Zhang</surname><given-names>B</given-names></name><name><surname>Irie</surname><given-names>HY</given-names></name></person-group><article-title>EPRS is a critical regulator of cell proliferation and estrogen signaling in ER&#x002B; breast cancer</article-title><source>Oncotarget</source><volume>7</volume><fpage>69592</fpage><lpage>69605</lpage><year>2016</year><pub-id pub-id-type="pmid">27612429</pub-id><pub-id pub-id-type="pmcid">5342500</pub-id></element-citation></ref>
<ref id="b19-mmr-17-02-3152"><label>19</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Fu</surname><given-names>J</given-names></name><name><surname>Cheng</surname><given-names>L</given-names></name><name><surname>Wang</surname><given-names>Y</given-names></name><name><surname>Yuan</surname><given-names>P</given-names></name><name><surname>Xu</surname><given-names>X</given-names></name><name><surname>Ding</surname><given-names>L</given-names></name><name><surname>Zhang</surname><given-names>H</given-names></name><name><surname>Jiang</surname><given-names>K</given-names></name><name><surname>Song</surname><given-names>H</given-names></name><name><surname>Chen</surname><given-names>Z</given-names></name><name><surname>Ye</surname><given-names>Q</given-names></name></person-group><article-title>The RNA-binding protein RBPMS1 represses AP-1 signaling and regulates breast cancer cell proliferation and migration</article-title><source>Biochim Biophys Acta</source><volume>1853</volume><fpage>1</fpage><lpage>13</lpage><year>2015</year><pub-id pub-id-type="doi">10.1016/j.bbamcr.2014.09.022</pub-id><pub-id pub-id-type="pmid">25281386</pub-id></element-citation></ref>
</ref-list>
</back>
<floats-group>
<fig id="f1-mmr-17-02-3152" position="float">
<label>Figure 1.</label>
<caption><p>Candidate gene identification. (A) Workflow of the study. (B) Genes identified by random forest variable hunting. (C) Coefficient of each gene. GEO, Gene Expression Omnibus; WSB2, WD repeat and SOCS box containing 2; FOXD1, forkhead box D1; RBPMS, RNA binding protein with multiple splicing; BIN3, bridging integrator 3; CCNA2, cyclin A2; CTDSP1, CTD small phosphatase 1; SLBP, stem-loop binding protein; FTO, &#x03B1;-ketoglutarate dependent dioxygenase; EPRS, glutamyl-prolyl-tRNA synthetase; CCNB2, cyclin B2.</p></caption>
<graphic xlink:href="MMR-17-02-3152-g00.tif"/>
</fig>
<fig id="f2-mmr-17-02-3152" position="float">
<label>Figure 2.</label>
<caption><p>Performance of the risk score in the training dataset. (A) Survival difference between the high-risk and low-risk groups and (B) detailed survival information and expression of candidate genes. CCNB2, cyclin B2; EPRS, glutamyl-prolyl-tRNA synthetase; FTO, &#x03B1;-ketoglutarate dependent dioxygenase; SLBP, stem-loop binding protein; CTDSP1, CTD small phosphatase 1; CCNA2, cyclin A2; BIN3, bridging integrator 3; RBPMS, RNA binding protein with multiple splicing; FOXD1, forkhead box D1; WSB2, WD repeat and SOCS box containing 2.</p></caption>
<graphic xlink:href="MMR-17-02-3152-g01.tif"/>
</fig>
<fig id="f3-mmr-17-02-3152" position="float">
<label>Figure 3.</label>
<caption><p>Risk score in the test datasets. Performance of the risk score in three independent datasets: (A) GSE22219, (B) GSE26971 and (C) GSE58644. CCNB2, cyclin B2; EPRS, glutamyl-prolyl-tRNA synthetase; FTO, &#x03B1;-ketoglutarate dependent dioxygenase; SLBP, stem-loop binding protein; CTDSP1, CTD small phosphatase 1; CCNA2, cyclin A2; BIN3, bridging integrator 3; RBPMS, RNA binding protein with multiple splicing; FOXD1, forkhead box D1; WSB2, WD repeat and SOCS box containing 2.</p></caption>
<graphic xlink:href="MMR-17-02-3152-g02.tif"/>
</fig>
<fig id="f4-mmr-17-02-3152" position="float">
<label>Figure 4.</label>
<caption><p>Risk score and further clinical information. (A) Correlation analysis between clinical information and risk score, and (B) a plotted nomogram.</p></caption>
<graphic xlink:href="MMR-17-02-3152-g03.tif"/>
</fig>
<table-wrap id="tI-mmr-17-02-3152" position="float">
<label>Table I.</label>
<caption><p>Parameters of candidate genes.</p></caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th/>
<th align="center" valign="bottom" colspan="3">Univariate</th>
<th align="center" valign="bottom" colspan="3">Multivariate</th>
</tr>
<tr>
<th/>
<th align="center" valign="bottom" colspan="3"><hr/></th>
<th align="center" valign="bottom" colspan="3"><hr/></th>
</tr>
<tr>
<th align="left" valign="bottom">Genes</th>
<th align="center" valign="bottom">HR</th>
<th align="center" valign="bottom">95&#x0025; C.I.</th>
<th align="center" valign="bottom">P-value</th>
<th align="center" valign="bottom">HR</th>
<th align="center" valign="bottom">95&#x0025; C.I.</th>
<th align="center" valign="bottom">P-value</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" valign="top">CCNB2</td>
<td align="center" valign="top">2.2</td>
<td align="center" valign="top">1.3&#x2013;3.7</td>
<td align="center" valign="top">0.00252</td>
<td align="center" valign="top">0.81</td>
<td align="center" valign="top">0.38&#x2013;1.71</td>
<td align="center" valign="top">0.57828</td>
</tr>
<tr>
<td align="left" valign="top">CCNA2</td>
<td align="center" valign="top">&#x00A0;&#x00A0;0.82</td>
<td align="center" valign="top">0.7&#x2013;0.95</td>
<td align="center" valign="top">0.00959</td>
<td align="center" valign="top">0.91</td>
<td align="center" valign="top">0.78&#x2013;1.07</td>
<td align="center" valign="top">0.2525</td>
</tr>
<tr>
<td align="left" valign="top">FOXD1</td>
<td align="center" valign="top">2.6</td>
<td align="center" valign="top">1.6&#x2013;4.3</td>
<td align="center" valign="top">0.00016</td>
<td align="center" valign="top">0.78</td>
<td align="center" valign="top">0.28&#x2013;2.21</td>
<td align="center" valign="top">0.64203</td>
</tr>
<tr>
<td align="left" valign="top">WSB2</td>
<td align="center" valign="top">&#x00A0;&#x00A0;0.44</td>
<td align="center" valign="top">0.27&#x2013;0.72</td>
<td align="center" valign="top">0.00119</td>
<td align="center" valign="top">0.55</td>
<td align="center" valign="top">0.3&#x2013;1</td>
<td align="center" valign="top">0.05139</td>
</tr>
<tr>
<td align="left" valign="top">RBPMS</td>
<td align="center" valign="top">&#x00A0;&#x00A0;0.26</td>
<td align="center" valign="top">0.12&#x2013;0.57</td>
<td align="center" valign="top">0.00077</td>
<td align="center" valign="top">0.58</td>
<td align="center" valign="top">0.21&#x2013;1.62</td>
<td align="center" valign="top">0.29785</td>
</tr>
<tr>
<td align="left" valign="top">CTDSP1</td>
<td align="center" valign="top">2.2</td>
<td align="center" valign="top">1.5&#x2013;3.2</td>
<td align="center" valign="top">2.00&#x00D7;10<sup>&#x2212;5</sup></td>
<td align="center" valign="top">1.74</td>
<td align="center" valign="top">1.17&#x2013;2.59</td>
<td align="center" valign="top">0.00631</td>
</tr>
<tr>
<td align="left" valign="top">BIN3</td>
<td align="center" valign="top">1.4</td>
<td align="center" valign="top">1.1&#x2013;1.7</td>
<td align="center" valign="top">0.00815</td>
<td align="center" valign="top">1.22</td>
<td align="center" valign="top">0.96&#x2013;1.54</td>
<td align="center" valign="top">0.10664</td>
</tr>
<tr>
<td align="left" valign="top">SLBP</td>
<td align="center" valign="top">1.3</td>
<td align="center" valign="top">1.1&#x2013;1.5</td>
<td align="center" valign="top">0.00536</td>
<td align="center" valign="top">1.24</td>
<td align="center" valign="top">1.04&#x2013;1.48</td>
<td align="center" valign="top">0.01777</td>
</tr>
<tr>
<td align="left" valign="top">EPRS</td>
<td align="center" valign="top">2.7</td>
<td align="center" valign="top">1.5&#x2013;4.7</td>
<td align="center" valign="top">0.00045</td>
<td align="center" valign="top">2.88</td>
<td align="center" valign="top">1.23&#x2013;6.74</td>
<td align="center" valign="top">0.0148</td>
</tr>
<tr>
<td align="left" valign="top">FTO</td>
<td align="center" valign="top">&#x00A0;&#x00A0;0.27</td>
<td align="center" valign="top">0.11&#x2013;0.63</td>
<td align="center" valign="top">0.0028</td>
<td align="center" valign="top">0.68</td>
<td align="center" valign="top">0.25&#x2013;1.85</td>
<td align="center" valign="top">0.45285</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn id="tfn1-mmr-17-02-3152"><p>HR, hazard ratio; C.I., confidence interval.</p></fn>
</table-wrap-foot>
</table-wrap>
</floats-group>
</article>