Bioinformatics analysis of the regulatory lncRNA-miRNA-mRNA network and drug prediction in patients with hypertrophic cardiomyopathy

Hypertrophic cardiomyopathy (HCM) is a complex inherited cardiovascular disease. The present study investigated the long noncoding (lnc)RNA/microRNA (mi)RNA/mRNA expression pattern of patients with HCM and aimed to identify key molecules involved in the development of this condition. An integrated strategy was conducted to identify differentially expressed miRNAs (DEmiRs), differentially expressed lncRNAs (DElncs) and differentially expressed genes (DEGs) based on the GSE36961 (mRNA), GSE36946 (miRNA), GSE68316 (lncRNA/mRNA) and GSE32453 (mRNA) expression profiles downloaded from the Gene Expression Omnibus datasets. Bioinformatics tools were employed to perform function and pathway enrichment analysis, protein-protein interaction, lncRNA-miRNA-mRNA and hub gene networks. Subsequently, DEGs were used as targets to predict drugs. The results indicated that a total of 2,234 DElncs (1,120 upregulated and 1,114 downregulated), 5 DEmiRs (2 upregulated and 3 downregulated) and 42 DEGs (35 upregulated and 7 downregulated) were identified in 4 microarray profiles. Gene ontology analysis revealed that DEGs were mainly involved in actin filament and stress fiber formation and in calcium ion binding, whereas Kyoto Encyclopedia of Genes and Genomes pathway analysis identified the hypoxia inducible factor-1, transforming growth factor-β and tumor necrosis factor signaling pathways as the main pathways involved in these processes. The hub genes were screened using cytoHubba. A total of 1,086 lncRNA-miRNA-mRNA interactions including 67 lncRNAs, 5 miRNAs and 25 mRNAs were mined in the present study based on prediction websites. Drug prediction indicated that the targeted drugs mainly included angiotensin converting enzyme inhibitors or β-blockers. A comprehensive bioinformatics analysis of the molecular regulatory lncRNA-miRNA-mRNA network was performed and potential therapeutic applications of drugs were predicted in HCM patients. The data may unravel the future molecular mechanism of HCM.


Introduction
Hypertrophic cardiomyopathy (HcM) is a heterogeneous monogenic myocardial disorder that is characterized by myocardial hypertrophy, asymmetric hypertrophy of the ventricular septum, decreased ventricular cavity and abnormal hypertrophy of cardiac muscle cells. The prevalence of HcM is >1 in 500 (0.2%) (1). Previous studies carried out in different regions and ethnic groups indicated that this disease exhibited a high degree of familial aggregation and heritability (2)(3)(4). it has been proposed that gene therapy is an effective way to eradicate HcM.
With the development of high-throughput gene expression profiling technology, microarray analysis has been widely used in the early diagnosis, treatment and prognosis of several diseases. Micrornas (mirnas) are a novel class of small non-coding rnas, which can negatively regulate gene expression at the post-transcriptional level by directly binding to mrnas (5). long non-coding rnas (lncrnas) and circular rnas contain mirna response elements, acting as competitors of endogenous rnas (6). These molecules have emerged as essential regulatory molecules in various biological processes. in 2014, Song et al (7) performed a mirna microarray analysis on heart tissues from 7 HcM patients and 5 healthy subjects and found that mir-451 was the main mirna that was downregulated in the HcM subjects. The target gene of mir-451 was the tuberous sclerosis complex 1 gene (TSC1) that was significantly upregulated in the myocardial tissue of HcM patients (7). TSc1 is a known positive regulator of autophagy (8), which contributes to the development of HcM. However, the molecular mechanism of HcM with regard to non-coding rnas is still unknown. Therefore, the present Bioinformatics analysis of the regulatory lncRNA-miRNA-mRNA network and drug prediction in patients with hypertrophic cardiomyopathy study aimed to perform in-depth data mining based on former microarray studies. at present, big data mining and precision medicine have gained considerable attention. digitization and informatics technologies facilitate the flow of information between various fields, laying the foundation for the conduct of interdisciplinary research. With the achievements of transcriptomics and genomics, the correlation between disease phenotypes and potential genotypes has attracted great scientific interest. The present study combined bioinformatics, transcriptomics and epidemiology in order to examine differentially expressed lncrnas (delncs), differentially expressed mirnas (demirs) and differentially expressed genes (deGs). in addition to their expression, their associated regulatory network was investigated. The data can elucidate the underlying etiology of HcM and further provide reliable molecular targets for drug therapy.

Materials and methods
Raw data collection. lncrna, mirna and gene microarray expression profiles between HCM and healthy controls were investigated by the Gene expression omnibus database (Geo; https://www.ncbi.nlm.nih.gov/geo/) using the keywords 'hypertrophic cardiomyopathy' and 'lncrna' or 'mirna' or 'mrna'. eligible datasets had to meet the following inclusion criteria: i) Microarray profiling studies on human patients with HcM; ii) HcM and non-HcM control/ healthy samples for comparison; iii) reports of sample sizes; iv) a group label for each sample size; v) a corresponding annotation (gene symbol) or GeneBank ID in the platform file for each probe of the microarray and vi) availability of raw data. The exclusion criteria included: i) animal samples or cell lines; ii) non-HcM samples; iii) non-microarray profiles; iv) samples without controls and v) probes without gene symbol or GeneBank id annotation. Finally, 4 profiles of GSE36961, GSE36946 and GSE68316 were downloaded. The flow chart of the screening datasets is presented in Fig. 1. The characteristics of these microarray expression profiles are shown in Table I.
Raw data pretreatment and screening for DElncs, DEmiRs and DEGs. The Geo dataset is one of the mainstream public functional genomics data repositories (9). nonetheless, the uploaded data are usually rough, incomplete or containing noise. Therefore, prior to data mining, data pretreatment must be carried out including data washing, data filtering and data normalization. With regard to the illumina expression array, the Microarray Quality control method is conventional for background correction and quantile normalization (10). This method can eliminate non-experimental differences caused by technical discrepancies and ensure appropriate data comparisons among different samples.
The raw signal intensity data of GSe36961, GSe36946 and GSe32453 were converted by the illumina package using the r software (version 3.5.1; https://www.r-project.org/). The limma package was employed to identify delncs, demirs and deGs, respectively using a linear model and the empirical Bayes model (11). GSe68316 was analyzed by the r-based Geo online tool Geo2r (http://www.ncbi.nlm.nih.gov/geo/geo2r/), which is useful for comparing 2 or more groups of samples, and can characterize differentially expressed rna molecules based on t-test or analysis of variance (9). The gene symbol was annotated to the corresponding probes and for the genes mapped with more than one probe, the average expression values were calculated. Volcano plots of delncs, demirs and deGs were provided using Gplot package in r. a total of 3 mrna datasets (GSe36961, GSe68316 and GSe32453) were selected to identify novel deGs; however, the deGs in each dataset were not combined due to the variation of distinct platforms, manufacturers and temperature (12). The data obtained from multiple sources could be compared or directly integrated. if not properly addressed, this integration would adversely affect all subsequent analysis. The 'SVa' package in the r software is an effective tool to reduce the data heterogeneity, remove batch effects and retain the variation due to sample types retained for further deG analysis (12). The hierarchical clustering plot for the deGs in the GSe36961, GSe68316 and GSe32453 datasets were provided using the 'pheatmap' package of the r software (https://cran.r-project. org/web/packages/pheatmap/index.html) following batch normalization. ultimately, the mixed model of variance analysis with a false discovery rate (Benjamini-Hochberg test, Fdr) (13) was used. an adjusted P<0.05 and a |log2Fc| value >1 were applied in screening significantly different expression levels of rna molecules.

Functional enrichment and pathway analysis.
To explore the main functions and pathways of deGs, the database for annotation, Visualization and integrated discovery (daVid; https://david.ncifcrf.gov/) was used for the Gene ontology (Go) (22,23) and Kyoto encyclopedia of Genes and Genomes (KeGG) pathway (24,25) enrichment analyses. daVid is an integrated biological database which contains gene functional classification, pathway-mining and clustering tools. These functions aim to systematically extract biological information from the uploaded gene lists (26).
The human disease database Malacards (http://www. malacards.org/) was selected for the HcM-related Go and KeGG pathway items in order to ensure the accuracy of the results. Malacards is an integrated compendium of human maladies and their annotations are mined from several well-known data sources. Malacards contain 14 annotation topics, including Summaries, Symptoms, related diseases, drugs and therapeutics, Gene and expression, Go terms and Pathways (27). a P-value was adjusted using the Benjamini method or the Fdr in multiple testing calibrations. a P<0.05 was selected as the threshold. The enrichment score was defined as the transformed-log 10 (P-value).
Protein-protein interaction (PPI) network of DEGs, screening for hub genes and lncRNA-miRNA-mRNA network. Search Tool for the retrieval of interacting Genes (STrinG; version 10.5; https://string-db.org/) is an online functional protein association network. The associations in STrinG include direct (physical) and indirect (functional) interactions, as long as they are specific and biologically meaningful (28). The identified DEGs were input into STrinG to unravel a potential PPi network. Hub genes are key genes playing a crucial role in biological processes. The regulation of other genes in the relevant pathway is often affected by hub genes. Therefore, hub genes are often considered an important target or a research hot spot (29). Hub genes can be screened according to the network topology. cytoHubba is an effective app of the cytoscape version 3.6.1 (http://www.cytoscape.org/) plug-in used to identify hub genes more accurately by 12 topological analysis methods (30). cytoscape is an open source software platform for visualizing molecular interaction networks and biological pathways. These networks are integrated with annotations of gene expression profiles and other data. Maximal clique centrality (Mcc) was used to identify the top 20 hub genes. The paired lncrna-mirna-mrna data were also input into cytoscape software in order to generate a regulatory network image.
Drug prediction of hub genes. drugBank (http://www. drugbank.ca) contains more than 7,000 drug entries and 4,000 non-redundant proteins (31). it is a comprehensive cheminformatic database including abundant biochemical and pharmacological details regarding drugs. drugBank aids the identification of novel drug targets and the comparison of drug structures with potential mechanisms of action (32). The DEGs identified in the 3 mRNA datasets were input into drugBank to examine their association with potential targeted drugs. The purpose of the present study was to explain the rationale of present drug therapy used for HcM and to explore additional potential target genes for future drug development.

Results
Detection of DElncs, DEmiRs and DEGs. The raw data of each dataset were normalized following data pretreatment. Fig. 2 demonstrates the boxplot of each sample prior to and following data normalization. a total of 2,234 delncs (1,120 upregulated and 1,114 downregulated) were identified in GSE68316, whereas 5 DEmiRs were identified in GSE36946, of which 2 were upregulated (mir-373 and mir-514) and 3 were downregulated (mir-10a, mir-144 and mir-30c-5p). a total of 154 DEGs were identified in GSE36961, of which 47 were upregulated and 109 downregulated. a total of 1,402 deGs (365 upregulated and 1,037 downregulated) were identified in GSe68316. Finally, 16 deGs (8 upregulated and 8 downregulated) were identified in GSE32453. The volcano plots (Fig. 3) displayed the aberrantly expressed rna molecules. a total of 42 DEGs (35 upregulated and 7 downregulated) were finally screened in the 3 mrna microarray datasets (GSe36961, GSe68316 and GSe32453) following data merging and  removing the batch effect (Fig. 4). Following prediction and intersection process, 112 non-redundant lncrna-mirna pairs and 49 mirna-mrna pairs containing 67 lncrnas, 5 mirnas and 25 mrnas were mined in the present study. ultimately, 1,086 pairs of lncrna-mirna-mrna interactions were identified. The details of the predicted lncRNAs and mrnas that were based on several prediction websites are listed in Tables Si and Sii. GO and KEGG analysis. Go and KeGG analyses were performed on the detected 42 deGs to examine their biological functions in detail. Go analysis described the results from 3 categories: 'Biological processes' (BP), 'cellular components' (cc) and 'molecular functions' (MF) (33). Go/KeGG analysis is considered to be a powerful tool in revealing biological mechanisms or functional pathways underlying observed patterns in genomics or transcriptomics. a total of 36 Go terms (19 BP, 13 cc and 4 MF) and 30 KeGG terms were enriched. The significantly enriched HCM-related GO terms mainly included the following: Actin filament binding, stress fiber formation, calcium ion binding and transforming growth factor-β receptor binding (Fig. 5a). The top 10 KeGG pathways correlated highly with HcM. For example, the deficiency of tumor necrosis factor (TNF) receptor-associated factor 5 could substantially aggravate cardiac hypertrophy, cardiac dysfunction and fibrosis (34). TNF-α is an extremely important molecule used in cell proliferation, differentiation, growth and metabolism, which is closely related to the occurrence of cardiac hypertrophy (35). Hypoxia-inducible factor (HiF) can be induced during hemodynamical-mediated hypertrophic growth and its induction is accompanied by pathological stress (36,37). These molecules play important roles in the identified pathways. The bubble plot offered a visual representation of the aforementioned pathways (Fig. 5B).
Construction of PPI, hub gene and lncRNA-miRNA-mRNA networks. The PPi network analysis aimed to study the molecular mechanism of diseases and identify novel drug targets from a systematic perspective. The STrinG database is designed to depict functional interactions between the expressed proteins by integrating known and predicted protein-protein association data among various species (28). The PPi network was conducted by STrinG to explore the interactions between proteins encoded by 42 deGs. a median confidence 0.4 was selected as a cut-off criterion. A total of  39 nodes (proteins encoded by genes) and 80 edges (connections between nodes) were screened following removal of the disconnected nodes in the network (Fig. 6a). Furthermore, 1,086 pairs of lncrna-mirna-mrna biomolecular interactions were integrated in a single network constructed by cytoscape. The lncrnas exhibited high tendency of aggregation with the mirnas. mir-30c-5p and mir-541 exhibited the highest number of lncrna binding sites, suggesting that these 2 mirnas could bind to lncrnas when competing with other mirnas. it is worth mentioning that lncrna enSG00000269821 and enSG00000224078 have binding sites for all the five miRNAs, which are thought to be of substantial significance and may provide clues to future researchers (Fig. 6B).
in the PPi network, some hub genes/proteins are highly connected with other proteins, suggesting a central regulatory role. cytoHubba provides a user-friendly interface to examine the interactions among hub nodes in the biological network by topological analysis (30), which can aid the identification of the essential networks involved. The top 20 hub genes were mined by the Mcc method (Fig. 6c).
Drug Prediction of DEGs. HcM can be divided into hypertrophic obstructive cardiomyopathy (HocM) and hypertrophic nonobstructive cardiomyopathy according to the presence of obstruction in the left ventricular outflow tract. Medical treatment should be the first choice for the majority of patients. Therefore, it is vital to select the most suitable drug candidate. at present, the main therapeutic drugs for HocM are β-blockers, calcium channel blockers and amiodarone (38).
The drugBank database is a unique, comprehensive drug repository covering detailed drug function, formulation, basic structure and drug target information (31). The deGs were inserted into drugBank and the targeted drugs of these genes were predicted. The approved drugs that were associated with HcM were selected from 536 results (Table ii). The majority of these drugs were calcium channel blockers, angiotensin   dB00178 approved ramipril is a prodrug belonging to the acei class of medications, may be used in the treatment of hypertension, myocardial infarction, stroke. labetalol dB00598 approved Blocker of both α-and β-adrenergic receptors that is used as an antihypertensive. Verapamil dB00661 approved a calcium channel blocker that is a class iV anti-arrhythmia agent. Mexiletine dB00379 approved, investigational antiarrhythmic agent pharmacologically similar to lidocaine. it may have some anticonvulsant properties. nicardipine dB00622 approved, investigational a potent calcium channel blockader with marked vasodilator action. Propranolol dB00571 approved, investigational a widely used non-cardioselective β-adrenergic antagonist. used in acute myocardial infarction, arrhythmias, hypertension. diltiazem dB00343 approved, investigational a benzothiazepine derivative with vasodilating action due to its antagonism of the actions of the calcium ion in membrane functions. amyl nitrite dB01612 approved an antihypertensive medicine. amyl nitrite is bioactive in mammals, being a vasodilator which is the basis of its use as a prescription medicine. Propafenone dB01182 approved an antiarrhythmia agent that is particularly effective in ventricular arrhythmias. it also has weak β-blocking activity. nifedipine dB01115 approved Both a long and short-acting 1,4-dihydropyridine calcium channel blocker, preventing calcium-dependent myocyte contraction and vasoconstriction. labetalol dB00598 approved Blocker of both α-and β-adrenergic receptors that is used as an antihypertensive. ramipril dB00178 approved ramipril is a prodrug belonging to the acei class of medications, . may be used in the treatment of hypertension, myocardial infarction, stroke nimodipine dB00393 approved, investigational a calcium channel blocker. it acts primarily on vascular smooth muscle cells by stabilizing voltage-gated l-type calcium channels in their inactive conformation. Benazepri dB00542 approved, investigational can be converted into benazeprilat, a non-sulfhydryl acei. a medi cation used to treat hypertension, congestive heart failure and chronic renal failure. Valsartan dB00177 approved, investigational Valsartan is an arB that may be used to treat hypertension, isolated systolic hypertension, left ventricular hypertrophy. ephedrine dB01364 approved, investigational affect the rate or intensity of cardiac contraction, blood vessel diameter, or blood volume. Verapamil dB00661 approved a calcium channel blocker that is a class iV anti-arrhythmia agent. Digoxin DB00390 Approved Control ventricular rate in atrial fibrillation and in the management of congestive heart failure with atrial fibrillation. diltiazem dB00343 approved, investigational a benzothiazepine derivative with vasodilating action due to its antagonism of the actions of the calcium ion in membrane functions. Timolol dB00373 approved a β-adrenergic antagonist similar in action to propranolol. Timolol has been proposed as an antihypertensive, antiarrhythmic and anti angina agent. lisinopril dB00722 approved, investigational lisinopril is a potent, competitive inhibitor of ace. Benazepril dB00542 approved, investigational can be converted into benazeprilat, a non-sulfhydryl acei. a medi cation used to treat hypertension, congestive heart failure and chronic renal failure. Penbutolol dB01359 approved, investigational Penbutolol is a drug in the β-blocker class used to treat hypertension. a The description of drug was extracted from drugBank. ace, angiotensin-converting enzyme; acei, angiotensin-converting enzyme inhibitor; arB, angiotensin-receptor blocker; deG, differentially expressed genes.
converting enzyme inhibitors (aceis) and/or β-blockers. The drug effects included the following: controlling heart rate or arrhythmia, increasing ventricular filling and terminal diastolic volume, reducing the contractility of ventricle and improving the compliance of myocardium (39). These effects were consistent with the guidelines of HcM diagnosis and treatment. However, aceis, angiotensin-receptor blockers (arBs) or diuretics can enhance myocardial contractility or reduce cardiac afterload, thereby increasing left ventricular outflow tract obstruction. Therefore, HOCM patients should be treated with caution.

Discussion
HcM is a primary cardiovascular disease, which has been regarded as the most common risk factor of sudden death among young people; it is now well accepted that multiple mutations in gene encoding regions are responsible for the development of this disease (40). Therefore, investigating the etiology of HcM and identifying effective therapeutic indicators is of utmost urgency. in recent years, several studies (41,42) have investigated the functions and clinical implications of non-coding rnas in HcM. However, the impact of rna crosstalk on HcM has not been previously addressed. The present study integrated the lncrna, mirna and mrna expression profiles and produced an integrated lncrna-mirna-mrna regulatory network, which provided insight at the post-transcriptional level of gene regulation. Moreover, key molecules involved in multiple physiopathological processes of HcM were explored by the application of bioinformatics technology and big data mining. Targeted drugs were predicted by drugBank using identified hub genes, indicating that they may be optimal candidate drugs for future therapy.
The identified DEmiRs and DEGs exhibited consistency with other microarray analysis on cardiovascular disease-associated targets. For example, the underexpressed mir-373 and overexpressed mir-10 were consistently expressed in the plasma using microarrays and further validated by quantitative Pcr analysis in 55 HcM patients with a moderate diagnostic value (43). The high throughput sequencing identified miR-30c-5p and specific mRNA targets that could affect heart cells by causing nuclear factor of activated T-cells hypertrophy and activating cardiac hypertrophy signaling (44). The plasma levels of mir-144-3p were elevated in patients with ventricular arrhythmia (45). ALOX5AP is a crucial regulatory factor used in the biosynthesis of inflammatory leukotrienes. Previous studies have reported that genetic variations in ALOX5AP exhibit significant associations with ischemic stroke and myocardial infarction (46,47). NFKBIA polymorphisms modulated the risk of coronary artery disease in the chinese uygur population by regulating various cytokines including interleukin-6, which is a key mediator of the inflammatory process and atherosclerotic plaque formation (48). These studies were consistent with the present findings.
With regard to the analysis of lncrnas, the results of the present study were not fully investigated in previous studies. The Myosin Heavy chain associated rna Transcripts (Mhrt) lncrna was reported as cardiac-specific and abundant in adult hearts (49). Mhrt protected the heart from hypertrophy and failure by interacting with chromatin (49). Yang et al (50) performed lncrna and mrna microarray analysis on myocardial tissues from 7 HcM patients and 5 controls, and identified 965 upregulated and 461 downregulated lncRNAs. Bioinformatics analysis indicated that lncrna-co-expressed mrnas were mainly enriched in ribosome and oxidative phosphorylation modules. overexpression of genes in complex i of the oxidative phosphorylation pathway may contribute to physiological hypertrophy of the heart. at present, β-blockers (propranolol and penbutolol) and calcium antagonists (nifedipine, verapamil and diltiazem) are mainly prescribed for pharmacological treatment of HcM in routine clinical practice (51,52). Propranolol was reported to reduce the a-wave ratio in the apex cardiogram and could be indispensable in relieving emergency symptoms in patients with HcM. calcium antagonists can control myocardial contractility, reduce the pressure gradient of left ventricular outflow tract and improve myocardial compliance and ventricular diastolic function. The majority of the drugs that were predicted in the present study are already in clinical use, suggesting that the screened genes were associated with the pathogenic mechanism of HcM. The remaining predicted drugs that are not currently used for clinical treatment can be the starting point for subsequent future investigations. data mining can extract the relevant biological information from high-dimension data. The integration of multiple histological techniques will become a new trend of disease diagnosis in the era of precision medicine. These molecules have the potential to be used as disease biomarkers in personalized medicine for patients and can improve the diagnosis, treatment and prevention of several diseases.
The present study contains certain limitations: i) The lncrna-mirna-mrna large-scale crosstalk network did not indicate whether these rnas were co-expressed in the same tissues. ii) The present study is a preliminary analysis, which requires validation in a large population by Pcr or western blot analyses. In addition, the biological mechanism of specific non-coding RNAs requires verification by animal or cellular studies. iii) The specificity of the identified DElncs, DEmiRs and deGs for HcM requires investigation in future studies.
in conclusion, the present study generated a holistic view of a candidate lncrna-mirna-mrna network for HcM and proposed potential predicted drugs that could be used for this disease by integrating several microarray data. The authors hope that the present study will be beneficial for discovering new biomarkers or therapeutic drugs for the reduction of the risk of this life-threatening disease.
Acknowledgements not applicable.

Funding
no funding was received.

Availability of data and materials
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.