Modern drug discovery and pharmaceutics draw heavily on nature. Natural products (NPs) serve as a source of therapeutic agents, and there is currently considerable interest in exploring NPs for drug discovery, alongside continuing investigations into the therapeutic claims and mechanisms of herbal medicines. To date, approximately one million NPs have been isolated and subjected to experimental assays to evaluate quantitative biological activities. An integrated database is therefore needed to assemble and correlate this valuable information from the literature, experimental studies and existing databases. Although databases contain a large volume of information, extracting the required information is frequently difficult and complex, even when the database is well organized. Novel databases must therefore be accompanied by efficient algorithms and techniques that extract useful knowledge through a simple query. The Hippo(crates) database aims to fill this gap in the field of chemoinformatics and natural products by providing retrieval linked not only to the Hippo(crates) database, but also to other worldwide chemical and biological databases. As part of the OPENSCREEN-GR project, the Hippo(crates) Database Graphical User Interface (HDGUI) web server was developed to provide a user-friendly access interface that integrates annotated information on NP origin (sources and species), biological activities, physicochemical properties, linear and 3D chemical structures, as well as related terms that correlate chemical compounds with their use. In its current version (V1.0), the Hippo(crates) database provides 45,300 NPs, NP derivatives and synthetic compounds, which are separated into 32 major categories, including biological or medicinal properties.
In the database, 22,830 NP source organisms are correlated with >100,000 terms, including biological pathways, target organisms, target diseases, target types, target proteins and pathogens, and with 6,070 three-dimensional structures of NP target proteins. For each entry, a cluster of similar compounds and a ligand-based or structure-based pharmacophore model are provided. The portal is designed as an easy-to-use web tool in which the user can search, extract and correlate information and data on natural product chemical compounds through various fields, such as categories, keywords, targets, species, or two-dimensional or three-dimensional structural similarity, in the Hippo(crates) atlas of the NP database.
NPs are primary and secondary metabolites that living organisms produce and use for defense mechanisms or adaptation. These molecules have been naturally selected and modified over millions of years to acquire specificity and to cover a wide range of biological mechanisms, depending on the originating species, the environment, and the specific biological action involved in the corresponding organism (
NPs, derived mostly from herbal plants, have been the major source of therapeutics in traditional medicine throughout history and continue to be the basis for a number of pharmaceuticals currently in use (
Over the past decades, huge libraries of fractionated NPs have been screened with impressive hit rates in several diseases and pathogenic conditions. Of note, a number of cases are known in which the crude biological extract is more pharmacologically effective than the purified, most active chemical compound from that extract (
Traditional medicines and NPs provide valuable insight towards the discovery of novel medicinal agents. Crude biological extracts may help to enlarge the drug discovery paradigm from ‘identifying novel entity drugs’ to ‘combining existing agents’ and may even direct the combinations between such NP-derived agents (
Chemoinformatics provides computational methods for the organization, analysis and visualization of chemical information, and is used extensively in drug discovery and development. It is a rapidly evolving field, particularly due to the advent of high-throughput experimental techniques, the widespread availability of public databases, and the development of machine learning algorithms (
The pharmacophore is another concept integral to computer-aided drug design. It is the ensemble of steric and electronic features necessary to ensure the optimal supramolecular interactions with a specific biological target structure and to trigger (or to block) its biological response (
Recently, polypharmacology, the ability of a single agent to interact with multiple receptors and modulate several processes, has drawn attention (
Several databases have been made available in recent years, providing the information required to advance the exploration and exploitation of NPs. As expected, each database specializes in a different field and presents NPs from a different point of view; examples include DrugBank, Natural Product Activity and Species Source (NPASS), NPCARE and Open National Cancer Institute (NCI) (
The Hippo(crates) database aims to assist pharmaceutical research into novel candidate pharmacological agents and pharmacological targets. The user can perform searches using the HDGUI with a combination of several preset parameters, features, properties and keywords related to the NPs and chemical compounds. The HDGUI applies various filtering, processing and annotation techniques to identify and visualize the most probable dominant NPs and chemicals based on the parameters preset by the user. The HDGUI identifies all the candidate NPs using the up-to-date curated Hippo(crates) database and, for each chemical compound, provides explanatory information from the annotation and data mining analyses, as well as direct links to several online databases, such as PubChem (
The NP derivatives and synthetic compounds with available experimentally determined quantitative activity, chemical and physicochemical properties, and related information were extracted from the available Selleckchem catalog (
Hippo(crates) database entries have been annotated with information from several fields contained in the PubChem Database by using one or a combination of the four primary identifiers (Name, CAS Number, InChIKey and CID) describing each chemical compound (
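The identifier-based annotation described above can be illustrated with a minimal sketch. It assumes a small in-memory reference set standing in for PubChem records and tries the four primary identifiers in a fixed order (CID, InChIKey, CAS Number, Name). That order, the function names and the record layout are assumptions made for illustration, not the authors' pipeline.

```python
from typing import Dict, List, Optional, Tuple

# Assumed lookup order: most specific identifier first (not stated in the source).
IDENTIFIER_PRIORITY = ("cid", "inchikey", "cas_number", "name")


def _norm(value: object) -> str:
    """Normalize an identifier so that case and padding do not block a match."""
    return str(value).strip().lower()


def build_index(reference: List[dict]) -> Dict[Tuple[str, str], dict]:
    """Index reference records by every primary identifier they carry."""
    index: Dict[Tuple[str, str], dict] = {}
    for rec in reference:
        for field in IDENTIFIER_PRIORITY:
            value = rec.get(field)
            if value:
                index.setdefault((field, _norm(value)), rec)
    return index


def annotate(entry: dict, index: Dict[Tuple[str, str], dict]) -> Optional[dict]:
    """Return the entry enriched with reference fields, or None if nothing matches."""
    for field in IDENTIFIER_PRIORITY:
        value = entry.get(field)
        if not value:
            continue
        match = index.get((field, _norm(value)))
        if match is not None:
            merged = dict(match)
            # Values already present in the entry take precedence over the reference.
            merged.update({k: v for k, v in entry.items() if v})
            merged["matched_on"] = field
            return merged
    return None


# Toy reference set with a single well-known compound (aspirin).
reference = [
    {
        "cid": 2244,
        "inchikey": "BSYNRYMUTXBXSQ-UHFFFAOYSA-N",
        "cas_number": "50-78-2",
        "name": "aspirin",
        "formula": "C9H8O4",
    }
]
index = build_index(reference)
result = annotate({"name": "Aspirin", "cas_number": "50-78-2"}, index)
print(result["matched_on"], result["cid"])
```

Here the entry carries no CID or InChIKey, so the match falls through to the CAS Number. An entry with no matching identifier returns `None` and would need manual curation.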
Compound fingerprints were calculated using the CACTVS Chemoinformatics Toolkit (
The Hippo(crates) database of NPs and chemical compounds is publicly available online at
The Hippo(crates) database is an integrated resource for NPs, chemical compounds derived from NPs, chemical compounds considered to be NP analogs, and other chemical compounds. The Hippo(crates) database currently holds 45,300 entries, which are divided into 32 major categories, as presented in
The Hippo(crates) database provides a well-organized atlas of interconnected NPs and other chemical compounds using both advanced bioinformatics and chemoinformatics techniques. The contents of the database were analyzed using specialized techniques such as the Tanimoto coefficient for the analysis of chemical compounds (
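For readers unfamiliar with the Tanimoto coefficient, the sketch below shows how it is computed for two binary fingerprints and how a similarity threshold of 0.8 can group compounds. It represents each fingerprint as a set of on-bit indices and uses a simple greedy (leader-style) grouping. Both choices are assumptions made for illustration: the source does not describe the clustering procedure, and this is not the CACTVS implementation.

```python
from typing import List, Set


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto coefficient: shared on-bits divided by the on-bits set in either fingerprint."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def leader_cluster(fps: List[Set[int]], threshold: float = 0.8) -> List[List[int]]:
    """Greedy grouping: a compound joins the first cluster whose leader
    (its first member) is at least `threshold` similar; otherwise it
    starts a new cluster. Returns clusters as lists of input indices."""
    clusters: List[List[int]] = []
    for i, fp in enumerate(fps):
        for cluster in clusters:
            if tanimoto(fp, fps[cluster[0]]) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters


# Toy fingerprints (sets of on-bit positions); not real compounds.
fingerprints = [
    {1, 2, 3, 4, 5},
    {1, 2, 3, 4, 5, 6},  # shares 5 of 6 bits with the first fingerprint
    {10, 11, 12},
    {10, 11, 12, 13},    # shares 3 of 4 bits with the third fingerprint
]
print(round(tanimoto(fingerprints[0], fingerprints[1]), 3))
print(leader_cluster(fingerprints))
```

With the 0.8 threshold, the first two toy fingerprints (coefficient 5/6) fall into one cluster, whereas the last two (coefficient 3/4) remain separate.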
The HDGUI web server helps chemical and medical experts, pharmacists and other users to search for and identify NPs, NP-derived compounds and other synthetic chemicals with known chemical properties through a characteristic set of keywords and ontologies. This is achieved through filtering web tools, and the knowledge summarized under ‘key’ terms is presented in smart lists. Users are able to perform complex filtering operations using chemical properties, fingerprints, diseases, target proteins, biological pathways, source organisms and several other specific keywords under specific domain ontologies. In addition, the HDGUI web server enables users who may not be familiar with chemical molecular structures (
The HDGUI filtering options are separated into seven major webtools (
The HDGUI output is an HTML file that describes the profile of each chemical compound through a smart array containing the following fields: ‘name’, ‘category’, ‘species’, ‘target’, ‘disease’, ‘viewer’, ‘pubchem’, ‘PDB’, ‘cas_number’, ‘SMILES’, ‘molecular_weight’, ‘formula’, ‘alogp’, ‘hba’, ‘hbd’, ‘polar_surface’, ‘rotatable_bound’, ‘heavy_atoms’, ‘rings’, ‘info’ and ‘synonyms’ (
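A combined keyword and property filter over records carrying the fields listed above could look like the following minimal sketch. The field names ‘name’, ‘category’, ‘target’, ‘molecular_weight’ and ‘hbd’ come from the output description. The `filter_records` function, the record layout and the toy values are hypothetical and do not reflect the actual HDGUI implementation or real database entries.

```python
from typing import Dict, List, Optional, Tuple

Record = Dict[str, object]


def filter_records(
    records: List[Record],
    keywords: Optional[Dict[str, str]] = None,
    ranges: Optional[Dict[str, Tuple[float, float]]] = None,
) -> List[Record]:
    """Keep records that contain every keyword (case-insensitive substring)
    in the given text fields and whose numeric fields lie within the
    inclusive ranges. Records missing a numeric field are excluded."""
    keywords = keywords or {}
    ranges = ranges or {}
    hits: List[Record] = []
    for rec in records:
        text_ok = all(
            term.lower() in str(rec.get(field, "")).lower()
            for field, term in keywords.items()
        )
        range_ok = all(
            isinstance(rec.get(field), (int, float)) and low <= rec[field] <= high
            for field, (low, high) in ranges.items()
        )
        if text_ok and range_ok:
            hits.append(rec)
    return hits


# Placeholder records with illustrative values only.
records = [
    {"name": "compound A", "category": "Anticancer", "target": "kinase",
     "molecular_weight": 320.4, "hbd": 2},
    {"name": "compound B", "category": "Antiviral", "target": "protease",
     "molecular_weight": 610.7, "hbd": 4},
    {"name": "compound C", "category": "Anticancer; Kinase_inhibitor",
     "target": "tyrosine kinase", "molecular_weight": 455.5, "hbd": 1},
]

hits = filter_records(
    records,
    keywords={"category": "anticancer"},
    ranges={"molecular_weight": (0, 500)},
)
print([rec["name"] for rec in hits])
```

Combining conditions with a logical AND keeps the behavior predictable; a real interface might also offer OR logic or ontology-aware term matching.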
The Hippo(crates) interface has been used towards extracting beneficial knowledge and corresponding natural products through various example searches (Species, Target, Disease, Smart, Blast Chem or Pharmacophore) located at
Recent advances in genetics, clinical genomics and personalized medicine have created a need to discover effective therapeutic agents for several pathological conditions (
Not applicable.
The Hippo(crates) database is publicly available online at:
LP and EE participated in the construction of the database. LP, AA, EC, KB, DV, TT and EE were involved in the validation and visualization of the database. TT and EE participated in methodology and TT in the Tanimoto analysis. AA, EC, KB and DV searched the literature and performed data collection and curation. LP and EE wrote the original draft of the manuscript and were involved in further writing, reviewing and editing along with AA and TT. EE was involved in the conceptualization and design of the study, as well as in funding acquisition. LP and EE have confirmed the authenticity of all the raw data and all authors have read and approved the final manuscript.
Not applicable.
Not applicable.
The authors declare that they have no competing interests.
Data collection and filtering pipeline of the Hippo(crates) database.
Data annotation and processing pipeline of the Hippo(crates) database.
Hippo(crates) database contents.
Tanimoto coefficient ‘80’ chemical clusters. (A) Chemical clusters with maximum 10 chemical compounds per cluster. (B) Chemical clusters with minimum 11 and maximum 100 members. (C) Chemical clusters with at least 111 members.
Hippo(crates) database graphical user interface (HDGUI) webtools.
An example of the ‘Pharmacophore Search’ web tool. (A) The input query of a specific SMILES in the web tool. (B) The output results based on the input query SMILES.
An example of the Hippo(crates) database graphical user interface (HDGUI) based on a simple filtering search.
The six source databases and studies that were used for the synthesis of the Hippo(crates) database.
A/A | Source/(Refs.) | Sample | Common identifiers | Dataset |
---|---|---|---|---|
1 | Selleckchem | 16550 | CAS Number; Canonical SMILES | Natural products; Synthetic drugs |
2 | Open NCI | 15000 | CAS Number; InChIKey | Natural products; Synthetic drugs |
3 | DrugBank | 700 | Canonical SMILES; InChIKey | Natural products |
4 | NPCARE | 9100 | Canonical SMILES | Natural products |
5 | NPASS | 30000 | Canonical SMILES; CID | Natural products |
6 | Newman and Cragg ( | 1376 | Name | Natural products; Synthetic drugs |
Open NCI, Open National Cancer Institute; NPASS, Natural Product Activity and Species Source.
List of the 32 major categories present in the Hippo(crates) database.
A/A | Category | A/A | Category |
---|---|---|---|
1 | Natural Product | 17 | Epigenetics |
2 | Anticancer | 18 | FDA Approved |
3 | Antidiabetic | 19 | GPCR related |
4 | Antiinfection | 20 | Immunology inflammation |
5 | Antibacterial | 21 | Inhibitors |
6 | Antihypertensive | 22 | Ion channels related |
7 | Antiviral | 23 | Kinase_inhibitor |
8 | Antiparasitic | 24 | MAPK inhibitor |
9 | Antifungal | 25 | Metabolism compound |
10 | Antiulcer | 26 | Neuronal signaling |
11 | Apoptosis | 27 | PI3K |
12 | Autophagy | 28 | Protease inhibitor |
13 | Bioactive compound | 29 | Pdb_related |
14 | Clinical | 30 | Stem cell signaling |
15 | Calcium metabolism | 31 | Target selective |
16 | Drug repurposing | 32 | Tyrosine kinase inhibitor |