Identification of a long non-coding RNA gene, growth hormone secretagogue receptor opposite strand, which stimulates cell migration in non-small cell lung cancer cell lines
- Eliza J. Whiteside
- Inge Seim
- Jana P. Pauli
- Angela J. O'Keeffe
- Patrick B. Thomas
- Shea L. Carter
- Carina M. Walpole
- Jenny N.T. Fung
- Peter Josh
- Adrian C. Herington
- Lisa K. Chopin
- Corresponding author: Lisa K. Chopin
Published online on: Thursday, May 30, 2013
- Pages: 566-574
- DOI: 10.3892/ijo.2013.1969
Lung cancer is the leading cause of cancer deaths world-wide (1) and the majority of cases (70–85%) are non-small cell lung cancers (NSCLC) (2). Most patients with NSCLC are diagnosed after their cancer has metastasised and this is associated with a poor prognosis (3,4). As metastatic disease is currently incurable, new therapeutic approaches to lung cancer are urgently required (4). The molecular mechanisms involved in NSCLC tumourigenesis and metastatic progression are largely unknown; however, recent studies have demonstrated that non-coding RNAs are key regulators of these processes (5–9). Less than 2% of the human genome is transcribed into protein-coding mRNAs, while approximately 90% is transcribed into non-protein coding RNAs (ncRNAs), many of which are of unknown function (9,10). Non-coding RNAs, therefore, provide promising targets for the development of novel therapeutics for cancer (11).
Here, we report the identification, genomic organisation and initial characterisation of the candidate antisense long ncRNA gene, GHSROS, which is located on the opposite strand of the gene for the ghrelin receptor, the growth hormone secretagogue receptor (GHSR) gene. We demonstrate that GHSROS is expressed at a higher level in human lung tumours compared to other tissue types and, when over-expressed in lung cancer cell lines, promotes cell migration. The ability of cancer cells to migrate and ultimately to metastasise to distant sites in the body is a key hallmark of cancer (12,13). We hypothesise that, like other recently described lncRNAs, GHSROS contributes to lung cancer progression by stimulating cancer cell migration.
Materials and methods
Sequence analysis and database searches
Multiple sequence alignments were generated using the Evolutionary Conserved Regions (ECR) Browser (14) and Clustal W2.0 (15). To examine putative coding sequences we used the ExPASy Translate Tool (http://www.expasy.ch/tools/dna.html) and the Coding Potential Calculator (16). The presence of transposable elements was examined using CENSOR v4.2.8 (17).
Human tissues and RNA
Normal lung and lung tumour specimens were obtained from the Ontario Tumour Bank (Toronto, Canada; Table I). Commercial total RNA was obtained from human stomach, ovary (FirstChoice, Invitrogen, Carlsbad, CA), cerebellum, thymus, whole brain, lung, testis, foetal brain, lung adenocarcinoma and pancreas (Clontech, Mountain View, CA).
Table I. Specimen characteristics of a range of paired normal and non-small cell lung carcinoma tumour biopsies from patients diagnosed with NSCLC.
The A549 cell line (American Type Culture Collection, ATCC 10801, Rockville, MD) was cultured in DMEM/F12 (Invitrogen), while the NCI-H1299 (ATCC CRL-5803) and Beas-2B (ATCC CRL-9609) cell lines were cultured in RPMI-1640 medium (Invitrogen). The complete medium contained 10% cosmic calf serum (HyClone, ThermoFisher Scientific, Waltham, MA), 50 U/ml penicillin and 50 μg/ml streptomycin (Invitrogen). Cells were incubated at 37°C in air and 5% CO2 and were free of Mycoplasma contamination.
Total RNA was harvested from tissues and cultured cells using an RNeasy Plus mini kit (QIAGEN, Germantown, MD) according to the manufacturer’s instructions.
Quantitative real-time RT-PCR
cDNA was synthesised using a GHSROS strand-specific primer (GHSROS-RT), and quantitative RT-PCR of GHSR antisense mRNA was performed as previously described (18,19) using the Prism 7000 Sequence Detection System (Applied Biosystems, Foster City, CA). Data were analysed using the comparative 2^(−ΔΔCt) method with normalisation to the 18S housekeeping gene (20). The primers used are listed in Table II. All RT-PCR products were purified using PureLink (Invitrogen) or MinElute (QIAGEN) PCR Purification Kits, cloned into pCR-XL-TOPO (Invitrogen) or pGEM-T Easy (Promega, Fitchburg, WI), transformed into One Shot MAX Efficiency DH5α-T1R chemically competent cells (Invitrogen) and sequenced at the Australian Genome Research Facility (AGRF, Brisbane, Australia) using BigDye III (Applied Biosystems).
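The comparative method named above can be illustrated with a short calculation. The sketch below is a minimal example, assuming single Ct values per reaction; the function name and all Ct values are hypothetical and are not taken from the study. The target Ct is first normalised to the reference gene (18S here), then expressed relative to a calibrator sample whose expression is set at 1.

```python
def fold_change_ddct(ct_target_sample, ct_ref_sample,
                     ct_target_calibrator, ct_ref_calibrator):
    """Relative expression by the comparative 2^(-ddCt) method.

    All arguments are threshold cycle (Ct) values. The calibrator is the
    sample that all others are expressed relative to.
    """
    # Normalise the target to the reference gene within each sample
    delta_ct_sample = ct_target_sample - ct_ref_sample
    delta_ct_calibrator = ct_target_calibrator - ct_ref_calibrator
    # Compare the normalised sample with the normalised calibrator
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values, for illustration only
calibrator_fold = fold_change_ddct(30.0, 10.0, 30.0, 10.0)
sample_fold = fold_change_ddct(27.0, 10.0, 30.0, 10.0)
print(calibrator_fold, sample_fold)
```

A target that crosses threshold three cycles earlier than in the calibrator, with unchanged reference Ct, corresponds to an eight-fold higher relative expression. The calculation assumes close to 100% amplification efficiency for both the target and the reference.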
GHSR antisense transcript mapping
5′-RLM-RACE was performed using the FirstChoice RLM-RACE kit (Invitrogen) according to the manufacturer’s instructions, except that heat-labile shrimp alkaline phosphatase (SAP; Fermentas, Burlington, ON, Canada) was used to dephosphorylate degraded RNA. For 3′-RACE, 2 μg total RNA was reverse transcribed using Transcriptor reverse transcriptase (Roche, Penzberg, Germany) and a 3′-RACE adapter primer (Table II).
Northern blot analysis
Northern blot analysis was carried out on mRNA purified from A549 cells, using the Oligotex mRNA mini-kit (QIAGEN) and the NorthernMax-Gly kit (Invitrogen) according to the manufacturer’s instructions. Briefly, 250 ng mRNA and 50 ng RNA Molecular Weight Marker II (Roche) were electrophoresed through a 1% glyoxal agarose gel, transferred onto a positively charged nylon membrane (Roche) and probed with 10 ng/ml 331 nt GHSROS-specific riboprobe. The riboprobe was generated from A549 genomic DNA using PCR (primers Northern-F and Northern-T7-R, Table II). The PCR product was purified using a MinElute PCR Purification kit (QIAGEN), and the cRNA probe was synthesised using a digoxigenin RNA labelling kit (Roche). The membrane was hybridised overnight at 70°C with the riboprobe using ULTRAhyb buffer (Invitrogen), followed by high stringency washing conditions (70°C).
Stable transfection of GHSROS cDNA in cell lines
Full-length GHSROS was generated by RT-PCR from A549 cell line mRNA (using primers GHSROS-pTargeT-F and GHSROS-pTargeT-R; Table II), and cloned into the pTargeT mammalian expression vector (Promega). Cells were transfected with DNA (linearised GHSROS-pTargeT, or vector alone/mock) using Lipofectamine LTX reagent (Invitrogen). After 48 h, stable polyclonal cell populations were generated by culturing in complete medium containing G418 (Invitrogen), at 1,000 μg/ml for the A549 and the Beas-2B cell lines and 600 μg/ml for the NCI-H1299 cell line. Cells were grown in the presence of G418 for at least two weeks before functional analyses were performed. GHSROS expression was verified twice weekly using quantitative real-time RT-PCR, as described above.
Cell migration assays
Migration assays were performed using Transwell inserts with polycarbonate membranes (8 μm pore size; BD Biosciences, Franklin Lakes, NJ) in 12-well plates. Cells were added to the upper chamber of the Transwell inserts in serum-free medium and medium with 10% cosmic calf serum was used as a chemo-attractant in the lower chamber. Control inserts containing medium only were used to determine background staining. Cells were cultured for 6–24 h and cells remaining on the upper surface of the inserts were removed. The number of cells that migrated to the underside of the inserts was quantified by fixing the cells (100% methanol) and staining with 1% crystal violet. The stain was extracted using 10% (v/v) acetic acid and absorbance measured at 595 nm. Cell migration in GHSROS overexpressing cells was compared to cells expressing the vector alone. Each experiment consisted of three replicates and was repeated independently at least three times.
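The comparison described above, in which migration is expressed relative to vector-only cells, can be sketched as follows. This is an illustrative calculation only: the function name, the handling of background as a single subtracted value and the absorbance readings are assumptions, not the authors' analysis script.

```python
from statistics import mean


def percent_change_vs_control(treated_abs, control_abs, background_abs):
    """Percentage migration above (+) or below (-) the vector-only control.

    treated_abs, control_abs: replicate absorbance readings (595 nm) of
    extracted crystal violet. background_abs: reading from a medium-only
    insert, subtracted from both means (an assumed form of correction).
    """
    treated = mean(treated_abs) - background_abs
    control = mean(control_abs) - background_abs
    if control <= 0:
        raise ValueError("control signal must exceed background")
    return (treated / control - 1.0) * 100.0


# Hypothetical triplicate readings, for illustration only
change = percent_change_vs_control(
    treated_abs=[1.0, 1.25, 1.5],
    control_abs=[0.75, 0.75, 0.75],
    background_abs=0.25,
)
print(f"{change:.0f}% relative to control")
```

Subtracting background before taking the ratio matters because residual stain on cell-free inserts would otherwise compress the apparent difference between the two populations.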
Cell proliferation assays
Cell proliferation was assessed by quantifying both metabolic activity (WST-1 assay, Roche) and DNA synthesis (CyQUANT NF Cell Proliferation Assay Kit; Invitrogen). Cells were cultured in replicate 96-well plates (BD Biosciences) for 4, 24 and 48 h. For the WST-1 assay, absorbance was measured at 440 nm with a reference wavelength of 600 nm; for the CyQUANT assay, fluorescence was measured with excitation at 485 nm and emission at 530 nm. All proliferation experiments were performed independently at least three times, with 10 replicates each.
Statistical significance was determined using Student’s t-test, with a p-value <0.05 considered to be statistically significant.
Identification of a GHSR antisense transcript
Inspection of the UCSC Genome Browser for Functional RNA (21) revealed two overlapping expressed sequence tags (ESTs) (GenBank entries AW451317 and AI681234) which were antisense to the growth hormone secretagogue receptor gene (GHSR) (Fig. 1). We named the putative antisense transcript growth hormone secretagogue receptor opposite strand (GHSROS). These ESTs were sequenced as a part of the Cancer Genome Anatomy Project (http://cgap.nci.nih.gov) and were derived from lung carcinoid tumour tissue, which is a rare, neuroendocrine lung tumour type (22). The two overlapping EST entries together span approximately 900 bp within the 2.1 kb intron 1 of GHSR. Moreover, a 904 nucleotide transcript (TIN_36629), derived from oligoarray analysis for intronic non-coding RNAs of the liver, kidney and prostate (23), also maps to the region covered by the two ESTs (Fig. 1). This 904 nucleotide GHSROS transcript was one of 55,000 transcripts denoted intronic non-coding RNA in a large-scale study by Nakaya et al (23).
Figure 1. Identification and verification of antisense transcription in the GHSR locus. Exons are shown as boxes and introns as lines. Growth hormone secretagogue receptor (GHSR) exons 1 and 2 are shown in black. An exon of a putative antisense transcript, GHSROS, is shown in grey. Lung cancer-derived ESTs (GenBank entries AW451317 and AI681234) and an antisense transcript deduced from a strand-specific cDNA microarray (TIN_36629) are shown as green boxes below the antisense strand exon. The region amplified by real-time RT-PCR and the region targeted using a riboprobe in Northern blot analysis experiments are shown as yellow boxes. Full-length sequences obtained by RACE and by sequencing of a full-length cDNA clone (IMAGE cDNA clone 2272492) are shown as blue boxes.
Structure of GHSROS
To map the full-length GHSROS transcript, a multi-pronged approach was employed. As noted, the public domain oligoarray-deduced sequence (TIN_36629, Fig. 1) spans 904 bp of genomic DNA. A sequence conforming to the consensus TATA box motif (TATAAA) (24) is present just upstream of this sequence (Fig. 2A), suggesting that an antisense promoter is present in the intron of GHSR. To confirm the oligoarray data, 5′- and 3′-RACE PCR products and a full-length cDNA clone from a lung carcinoid tumour (Image cDNA clone 2272492) were sequenced. The sequenced full-length transcript is 1078 nucleotides in size, consists of a single exon and maps within the GHSR intron (Fig. 2A) (GenBank entries FJ355932, FJ355933 and GU289929). Northern blot analysis of mRNA from the A549 NSCLC cell line showed that the polyadenylated, full-length GHSROS is approximately 1,500 bp in size (Fig. 2B), and this corresponds closely to the predicted size of GHSROS mRNA. As shown in Fig. 2A, GHSROS has three transcription start sites: one just downstream of the consensus TATA-box and two immediately upstream of a poly-T-repeat within an ancient MER5B (medium reiteration frequency 5B) DNA transposable element (25). This thymidine-rich repeat is absent in non-primates (data not shown), suggesting that a primate-specific antisense promoter in the GHSR intron may have emerged through accumulated mutations (ab initio generation) (26). Interestingly, it is well known that poly-T-repeats in promoters can result in highly efficient transcription by depleting repressive nucleosomes from promoters (27). Together, these observations suggest that an antisense promoter in the GHSR intron gives rise to single-exon GHSROS transcripts that are processed into mRNA (5′ capped and 3′ polyadenylated).
Figure 2. (A) Genomic sequence showing GHSR exon 2 (yellow), a MER5B transposable DNA element (blue) and GHSROS cDNA (red). Black, closed arrows indicate transcriptional orientation. The location and direction of nested 5′-RACE (5′-RACE-in-R) and 3′-RACE (3′-RACE-in-F) primers are shown as blue arrows. Additional 5′ and 3′ sequence information obtained by RACE (compared to IMAGE cDNA clone 2272492) is highlighted in green. Oligoarray-derived GHSROS sequence is highlighted in grey. A consensus TATA-box is highlighted in purple. (B) Northern blot analysis of mRNA from the A549 NSCLC cell line showing a GHSR antisense transcript at approximately 1.5 kb, close to the predicted size of the transcript, suggesting that the sequence shown in (A) is a full-length transcript (excluding the polyadenylated tail).
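Because GHSROS is transcribed from the strand opposite to GHSR, promoter motifs such as the TATA box and the poly-T repeat have to be sought on the reverse complement of the sense-strand sequence. The following minimal sketch illustrates that idea; the helper names and the toy sequences are hypothetical, and this is not the procedure used to annotate the locus.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_motif_on_opposite_strand(sense_seq, motif="TATAAA"):
    """0-based motif start positions, in opposite-strand coordinates."""
    antisense = reverse_complement(sense_seq)
    # A lookahead is used so that overlapping matches are also reported
    return [m.start() for m in re.finditer(f"(?={motif})", antisense)]


def longest_t_run(seq):
    """Length of the longest uninterrupted run of T in the sequence."""
    runs = re.findall(r"T+", seq.upper())
    return max((len(r) for r in runs), default=0)


# Toy sequences, for illustration only
print(reverse_complement("GGTTTATACC"))
print(find_motif_on_opposite_strand("GGTTTATACC"))
print(longest_t_run("ACTTTTTGTT"))
```

In the toy example, the sense-strand sequence contains no TATAAA, whereas its reverse complement does. This mirrors how a promoter for an antisense transcript can lie within an intron of the gene on the other strand.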
GHSROS is a candidate long non-coding RNA
While it is difficult to predict and experimentally prove that a transcript either codes for very small peptides or is a non-coding RNA, a number of parameters can be assessed (28). Analysis using the Coding Potential Calculator (CPC) tool predicted that GHSROS is a non-coding transcript. The CPC tool is a highly accurate algorithm that takes into account multiple features, including putative peptide length, amino acid composition, secondary structure, the conservation of protein homologues and alignment information (28). GHSROS demonstrates a number of features typical of a non-coding RNA. Its 13 open reading frames are very short (6–46 amino acids), so GHSROS could encode only very small peptides. There is also a high frequency of stop codons throughout the 1.1 kb GHSROS sequence in all three reading frames (Table III). Moreover, GHSROS open reading frames have poor consensus to the translation initiation sequence proposed by Kozak (29) (Table III). Multiple sequence alignments show that the GHSROS nucleotide sequence is highly conserved in the chimpanzee, while there is low nucleotide and open reading frame conservation compared to the mouse (data not shown). The current data, therefore, suggest that GHSROS is a non-protein-coding RNA gene.
Table III. Open reading frames (ORFs) in the GHSROS transcript determined using the ExPASy translate tool (http://www.expasy.ch/tools/dna.html).
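The kind of reading-frame survey summarised above can be sketched in a few lines. The example below is a simplified illustration and not a re-implementation of the ExPASy Translate tool or of CPC: it assumes that an ORF starts at ATG and ends at the first in-frame stop codon, it scans only the three forward frames of the transcript, and the function name and test sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def scan_frames(seq):
    """Per forward reading frame, count stop codons and list ORF lengths.

    ORF length is given in amino acids (codons from ATG up to, but not
    including, the first in-frame stop codon).
    """
    seq = seq.upper().replace("U", "T")
    results = {}
    for frame in range(3):
        # Split the sequence into complete codons for this frame
        codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
        stops = sum(1 for codon in codons if codon in STOP_CODONS)
        orf_lengths = []
        start = None
        for idx, codon in enumerate(codons):
            if codon == "ATG" and start is None:
                start = idx
            elif codon in STOP_CODONS and start is not None:
                orf_lengths.append(idx - start)
                start = None
        results[frame] = {"stop_codons": stops, "orf_lengths_aa": orf_lengths}
    return results


# Toy sequence, for illustration only
for frame, summary in scan_frames("ATGAAATAGCATGCCCTGA").items():
    print(frame, summary)
```

Applied to a transcript, frequent stop codons in every frame and only short ORFs are consistent with, but do not prove, a non-coding function.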
GHSROS is overexpressed in lung cancer
To examine the expression of GHSROS, quantitative real-time RT-PCR was performed using commercially available RNA from a range of normal human tissues. Stomach, cerebellum, ovary, thymus, whole brain, lung, pancreas and foetal brain displayed relatively low levels of GHSROS expression, with moderate expression in the testis (Fig. 3A). In contrast, GHSROS was highly expressed in a lung tumour sample (Fig. 3A). In the lung-derived cell lines examined, the lowest level of GHSROS expression was seen in the normal tissue-derived Beas-2B bronchoepithelial cell line, while higher expression levels were seen in the NCI-H1299 and A549 NSCLC cell lines (Fig. 3B). Finally, quantitative real-time RT-PCR was performed using tumour and matched adjacent normal tissue from six patients with NSCLC, as well as two non-matched samples (for clinical details, see Table I). A higher level of GHSROS expression was observed in each of the tumour samples compared to their matched adjacent normal tissue, and the difference was statistically significant for samples 1, 2 and 3 (Fig. 4).
Figure 3. Relative expression of GHSROS in (A) human tissues and (B) cell lines using quantitative real-time RT-PCR. Data are represented as means and standard error of two technical replicates of two independent replicate experiments. The housekeeping gene 18S ribosomal RNA was used as a reference for normalisation. Data are represented as fold change relative to expression of transcripts in (A) stomach or (B) the Beas-2B cell line, both set at 1.
Figure 4. Relative expression of GHSROS in a range of paired normal (white) and NSCLC tumour (black) samples using quantitative real-time RT-PCR. Samples N7 and T8 are unpaired. N, normal lung; T, lung tumour. The 18S ribosomal RNA housekeeping gene was used as a reference for normalisation. Data are represented as fold change relative to expression of transcripts in normal tissue N1. Data are represented as means and standard error of two technical replicates of two independent replicate experiments. *P<0.05 (Student’s t-test) relative to matched normal tissue.
GHSROS overexpression increases the migration of A549 and NCI-H1299 NSCLC cell lines, but reduces migration in the Beas-2B cell line
The functional significance of GHSROS in the lung was studied by creating stable transfectants in the A549, NCI-H1299 and Beas-2B cell lines. Migration was significantly decreased in GHSROS overexpressing Beas-2B cells over 24 h (49% below vector-only control, P<0.05) (Fig. 5, lanes 1 and 2). In contrast, migration was significantly increased in the GHSROS overexpressing NSCLC cell lines examined, with an increase of 67% above control (P<0.05) in A549 cells (Fig. 5, lanes 3 and 4) and 129% above control (P<0.05) in NCI-H1299 cells after 6 h (Fig. 5, lanes 5 and 6). The observed differences in cell migration were not due to changes in cell number, as overexpression of GHSROS did not significantly alter cell proliferation in the A549, NCI-H1299, or Beas-2B cell lines at these time points compared to cells expressing the vector alone (data not shown).
Figure 5. GHSROS overexpression affects cell migration. Effect of GHSROS overexpression on cell migration in the normal-derived Beas-2B cell line after 24 h, and in the A549 and NCI-H1299 NSCLC cell lines after 6 h (compared to cells expressing vector alone). Independent stable transfections and migration assays revealed a similar enhancement of cell migration in the lung cancer cells after 24 h (data not shown). Migration was examined in a Transwell migration assay. Results are from triplicate samples from three independent experiments and expressed as percentage above or below control (vector transfected cells). *P<0.05 (Student’s t-test) between control cells (vector transfected cells) and cells engineered to overexpress GHSROS.
We demonstrate that the intronic region of the ghrelin receptor gene, GHSR, encodes a long non-coding RNA, termed GHSROS, which is expressed in lung cancer and promotes cell migration in lung cancer cell lines. Research into the role of ncRNAs in normal development and disease processes has predominantly focused on microRNAs (miRNAs); however, a number of long ncRNAs (lncRNAs), including MALAT-1, H19, B2 and lincRNA-p21, are known to play a role in lung cancer progression (11). Long ncRNAs are greater than 200 nucleotides in length, lack significant open reading frames and often harbour protein-coding mRNA-like features such as transcription by RNA polymerase II, polyadenylation and alternative splice variants. They regulate gene expression at the transcriptional and post-transcriptional levels through a broad range of processes (splicing, transcript degradation, epigenetic modification, chromatin remodelling and sub-cellular transport), and some are precursors for small RNAs (11,30).
The GHSROS gene encodes a transcript ∼1.1 kb in size and is a putative mRNA-like non-coding RNA, as it is likely to be derived from a classical TATA-box promoter and is 5′ capped and 3′ polyadenylated. Although we predict that GHSROS is a non-coding gene, we cannot currently dismiss the possibility that GHSROS encodes short, bioactive peptides. For example, the assumed ncRNA gene pri in Drosophila has been shown to encode functional peptides of 11 and 32 amino acids (31,32). It has recently been recognised that short open reading frame-encoded polypeptides (SEPs) may be very abundant in the proteome and are also derived from ncRNAs (33). Proteomics studies, for example using assays such as multiple reaction monitoring mass spectrometry (34), would be most useful in assessing whether any of the 13 short open reading frames of GHSROS are indeed translated and have independent functions.
GHSROS overlaps a MER5B DNA transposable element. Repeat elements in non-coding RNAs have been reported by a number of investigators (35–40). Accumulating evidence suggests that transposable elements often harbour promoters for natural antisense transcripts (41) and these elements lead to the transcription of novel, species-specific, non-coding RNA transcripts involved in gene regulation (42). It has been hypothesised that non-coding RNAs in introns may regulate the abundance, or splicing of their overlapping protein-coding transcripts (22,23,43–46). It is equally likely, however, that many intronic RNAs regulate a large number of genes in trans, at sites that are distant to their host loci (44,47).
We report that GHSROS is expressed in clinical samples from lung tissues and lung cell lines, with a similar, but not statistically significant, trend towards higher GHSROS expression in lung tumours. We hypothesise that this observation could be due to adjacent normal tissue samples with elevated GHSROS levels exhibiting pre-neoplastic alterations in gene expression, as described in other studies (48,49), or that tumour cells have spread into the normal tissue. Moreover, lung tissue is highly heterogeneous, consisting of a range of cell types, including cartilage, smooth muscle, epithelial and endothelial cells (50). Laser capture microdissection would be useful to isolate specific cell types for future experiments (50,51). As was the case for the lncRNA, HOTAIR, that is overexpressed in metastatic breast tumours (52), it is also feasible that the heterogeneous GHSROS expression pattern observed in our limited panel of primary tumours can be resolved by measuring the levels of the transcript in a larger and more diverse patient tissue panel. This will determine whether GHSROS could be a useful biomarker for predicting metastatic progression and patient survival.
The most well-studied long non-coding RNA in lung cancer, MALAT-1 (Metastasis Associated in Lung Adenocarcinoma Transcript-1), plays a role in cancer progression through a number of mechanisms including the regulation of gene transcription (53). MALAT-1 is overexpressed in NSCLC, in NSCLC cell lines and in a number of other tumour types (5,54–56). It has an important role in cancer cell motility and migration, and in regulating genes related to these processes, and may be involved in other cancer-related processes (11,55). Furthermore, knockdown of MALAT-1 expression in NSCLC xenograft mouse models reduces tumour growth and prevents lung cancer cell metastasis (6,53). MALAT-1 is, therefore, a potential therapeutic target for lung cancer (53) and also has potential as a prognostic biomarker for NSCLC, breast, prostate, pancreatic, colon, liver and endometrial cancers (6,57–60).
H19 is an imprinted lncRNA which is highly expressed in the embryo and is oncogenic in some cell types including lung cancer cell lines, while it has tumour-suppressing activity in other cell types (61–65). It is upregulated by a number of carcinogens, and expression is greatly increased in the airway epithelium of smokers (66). Long non-coding RNAs work through a number of different mechanisms and H19 is a precursor for at least one miRNA (67,68). This may allow it to play contrasting roles in different tissues and at different stages of development. The lncRNA lincRNA-p21 is a global repressor of the p53 tumour suppressor pathway (69) and acts as a post-transcriptional inhibitor of the translation of target genes through an interaction with the RNA binding protein HuR (70).
A hallmark of tumour cell behaviour is the ability to migrate and ultimately to metastasise to secondary sites in the body (12,13). Engineered GHSROS overexpression resulted in a decreased rate of migration in the normal lung-derived Beas-2B cell line, while it stimulated cell migration in the two NSCLC cell lines. Such cell-type and context-specific effects are also observed for miRNAs, which can regulate a large number of genes and function as either oncogenes or tumour suppressors (71,72). Indeed, evidence is emerging that many short and long RNA transcripts may have dual functions, and their ultimate biological effects may be dependent on complex ncRNA-DNA-protein interactions (73,74). Interestingly, it has recently been reported that lncRNAs are able to deplete miRNA, acting as miRNA sponges (75–77). This could explain the global changes in gene expression observed with lncRNA overexpression and/or knockdown. Conversely, lncRNAs may also exert global effects as precursors for miRNAs, as observed for H19 (67). Further studies are underway in our laboratory to explore the mechanisms by which GHSROS promotes migration in lung cancer cells.
In conclusion, we have identified a novel, long non-coding RNA gene in the intron of the ghrelin receptor gene, GHSR, that exhibits high levels of expression in lung tumour tissue and regulates cell migration in cultured cells of lung origin. These observations suggest that GHSROS may be significant in cancer progression and could be a useful therapeutic target for inhibiting tumour migration. Further studies on the role of GHSROS in normal physiology and cancer progression are required to dissect the function and mechanism of action of this long non-coding RNA gene. The identification of novel stimulators of NSCLC progression such as GHSROS will lead to earlier detection, better prognostic biomarkers and therapeutic approaches for patients diagnosed with NSCLC in the future.
This study was supported by grants from the National Health and Medical Research Council (NHMRC), the National Breast Cancer Foundation and The Cancer Council Queensland (to L.K.C. and A.C.H.), and by Queensland University of Technology (QUT) Early Career Researcher grants (to I.S. and E.J.W.). We thank Professor Kwun Fong, Professor Ian Yang, Professor Rayleen Bowman and Santiyagu Mary Savarimuthu Francis (Department of Thoracic Medicine, the Prince Charles Hospital, Brisbane, Australia) for the Beas-2B normal bronchoepithelial lung cell line. We also thank the staff at the Ontario Tumour Bank (OTB), Canada, for the recruitment, retention and dispersion of the lung-derived tissue samples.