Nuclear receptors (NRs) are transcription factors that play an essential role in all aspects of human development, metabolism and physiology. A prime example of an NR is the glucocorticoid receptor (GR). Structure-wise, the GR is typical of the NR superfamily, while its signaling is part of multiple physiological mechanisms. In this study, using the GR and the steroid hormone receptors as a basis, the structure, function and evolution of the NR ligand binding domain (LBD) were analyzed, and a list of NR mutations was compiled in order to examine the effects of the mutations on NR structure and function. The results suggested 7 conserved signaling motifs and identified the repeating amino acid pattern ‘LxxLL’ or ‘LLxxL’ in the LBDs of the NRs. Phylogenetic analysis revealed 4 distinct monophyletic branches and proposed new evolutionary relations between the LBDs of NRs. Furthermore, structural and functional comparisons of NR LBD structures and their corresponding ligands revealed two major canonical forms, one for the steroid hormone-like cluster and another for the thyroid hormone-like cluster. Finally, a new sub-cluster of estrogen receptor α (ERα) with a specific canonical form was identified. Although this sub-cluster has 98% sequence similarity with all known ERα, it shows greater structural similarity with the ERβ members (RMSD <2 Å) than with ERα. In particular, the Y537S mutation, which is very common in breast cancer, creates this new trans-form of ERα, termed ERα'. ERα' is functionally and structurally more similar to ERβ, while still retaining some of its ERα characteristics. This new information may be of high importance for understanding the signaling mechanisms underlying NRs and cancer.
Nuclear receptors (NRs) are one of the essential classes of transcription factors. NRs play a critical role in all aspects of human development, metabolism and physiology. Since they generally act as ligand-activated transcription factors, they are an essential component of cell signaling (
NRs form an ancient and conserved family that arose early in the metazoan lineage (
Data on the origins of NRs suggest that the ancestral NR was not a hormone receptor with high affinity for a particular ligand, and that this feature was acquired later during evolution. It is considered that the first NR was able to bind different molecules with low affinity (
A search was conducted on the RCSB Protein Data Bank (PDB) database (
Multiple sequence alignment was performed using the MATLAB Bioinformatics Toolbox, utilizing the progressive multiple alignment method and a guide tree (
A collection of currently known naturally occurring NR LBD mutations was assembled following a literature review. The selected mutations impair ligand binding. The resulting dataset includes naturally occurring LBD mutations in the following receptors: GR, MR, AR, ERα, peroxisome proliferator-activated receptor (PPAR)γ, retinoic acid receptor (RAR)α, RARβ, thyroid hormone receptor (THR)α, THRβ, liver X receptor (LXR)α, vitamin D receptor (VDR), hepatocyte nuclear factor 4 (HNF4A), steroidogenic factor 1 (SF1), RAR-related orphan receptor (ROR)α, RORβ and RORγ (
A thorough analysis of NR LBD structural comparisons was achieved by superimposing the structures and by calculating a matrix of root mean square deviation (RMSD). The structural comparisons between the 420 structures were performed utilizing the structural superposition method, as described in the MATLAB Bioinformatics Toolbox (
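As a rough illustration of the RMSD matrix described above, the following minimal Python sketch computes pairwise RMSD values between coordinate sets. It is not the MATLAB Bioinformatics Toolbox procedure used in the study: it assumes that the structures have already been superposed and that each one is represented by the same number of equivalent atoms (for example, aligned Cα atoms). The function names `rmsd` and `rmsd_matrix` are hypothetical.

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD between two equally sized lists of (x, y, z) coordinates.

    Assumption: the two structures are already superposed and the atoms
    are listed in equivalent order.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared_sum = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared_sum / len(coords_a))

def rmsd_matrix(structures):
    """Symmetric matrix of pairwise RMSD values with a zero diagonal."""
    n = len(structures)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i][j] = matrix[j][i] = rmsd(structures[i], structures[j])
    return matrix
```

Low values in such a matrix indicate structurally similar pairs; reordering rows and columns by the phylogenetic tree order then makes clusters of similar structures visible as blocks.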
A more in-depth look at steroid hormone receptor structures was gained using MOE (
A specialized phylogenetic analysis was performed using the MATLAB Bioinformatics Toolbox (
The notion of chemical similarity plays an important role in predicting the properties of chemical compounds, clustering chemicals, and in particular, in conducting functional analysis studies. The calculation of the similarity of any two molecules is achieved by comparing their molecular fingerprints (
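The fingerprint comparison can be sketched as follows, using the Tanimoto coefficient (the number of bits set in both fingerprints divided by the number of bits set in either). This is a minimal Python illustration, not the software used in the study. Fingerprints are represented as sets of 'on' bit positions, the function names and toy fingerprints are hypothetical, and returning 1.0 for two empty fingerprints is an assumed convention.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bit indices."""
    union_size = len(fp_a | fp_b)
    if union_size == 0:
        # Assumed convention: two empty fingerprints are treated as identical.
        return 1.0
    return len(fp_a & fp_b) / union_size

def similarity_matrix(fingerprints):
    """Pairwise Tanimoto matrix for a list of fingerprints."""
    return [[tanimoto(a, b) for b in fingerprints] for a in fingerprints]

# Hypothetical toy fingerprints, for illustration only.
toy_fingerprints = [
    {1, 4, 7, 9},   # ligand A
    {1, 4, 7, 12},  # ligand B: shares 3 of the 5 bits set in A or B
    {20, 21},       # ligand C: no bits in common with A or B
]
matrix = similarity_matrix(toy_fingerprints)
```

The coefficient is 0 when the fingerprints have no bits in common and 1 when they are identical, so the diagonal of the matrix is always 1.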
First, a list of all extracted ligands that were co-crystallized in the corresponding SHR LBD structures was created (
Phylogenetic analysis revealed a distinct separation of NR LBDs into 4 monophyletic branches, the steroid hormone receptor-like cluster, the thyroid hormone-like receptors cluster, the retinoid X-like and steroidogenic factor-like receptor cluster and the nerve growth factor-like/HNF4 receptor cluster (
On the third monophyletic branch, the LBD of the THR-like cluster, all known members are grouped, including THR, PPAR, LXR, VDR, RAR, FXR, ROR and orphan nuclear hormone receptor (NR1D1; RevErb). The EcR subunit of the ecdysone receptor also belongs to this cluster and is closely related to FXR, based on their similarities in the DNA binding domain (
An in-depth review of the consensus sequence and NR LBD mutations provides a basis for 7 highly conserved signaling motifs (
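The ‘LxxLL’ or ‘LLxxL’ repeating pattern reported for the NR LBDs can be located in a protein sequence with a simple pattern scan. The sketch below is a minimal Python illustration and not the procedure used in the study. It assumes an ungapped one-letter amino acid sequence, treats ‘x’ as any residue, and uses a hypothetical function name `find_motifs`.

```python
import re

# A lookahead is used so that overlapping occurrences are also reported.
MOTIF_PATTERN = re.compile(r"(?=(L..LL|LL..L))")

def find_motifs(sequence):
    """Return (1-based start position, matched segment) for each LxxLL/LLxxL hit.

    Assumption: `sequence` is an ungapped one-letter amino acid string.
    If a 5-residue window fits both patterns, it is reported once.
    """
    return [
        (match.start() + 1, match.group(1))
        for match in MOTIF_PATTERN.finditer(sequence.upper())
    ]

# Hypothetical toy sequence, for illustration only.
hits = find_motifs("AALKELLAA")
```

Positions are reported as 1-based residue numbers within the supplied string, so they need to be offset to match the numbering of a full-length receptor.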
A list of all known NR LBD natural mutations was composed in this study from previous publications (
NRs are structurally quite conserved. However, the structural analysis results of the NR LBD displayed 2 distinct canonical forms (
Ligand specificity is one of the major and most important characteristics of NRs (
A comparative analysis of the ligands that have been co-crystallized with their corresponding receptors was conducted, in order to obtain a clearer picture of the interactions that are characteristic of the steroid hormone receptors' ligand binding pocket (
As expected, the majority of NRs in the same subfamily display high structural similarity, with a great example being the ARs, PRs and MRs (
The first canonical form of ERα contains various mutant ERαs and a small number of wild-type receptors. The most common mutants in this sub-cluster are located at positions C381, C417, C530 and L536. The second canonical form of ERα, which will be referred to as ERα', is mainly characterized by the Y537S mutation. This mutation is commonly found in breast cancer patients and is associated with resistance to several endocrine therapies (
Hybrid phylogenetic analysis is more informative than a conventional phylogenetic analysis, as it combines both sequence and structural information from the NR LBDs to propose clusters with higher confidence. It has been observed that protein structure is more conserved than sequence (
Researching motifs in the NR ligand binding domain is of utmost importance. As mentioned above in the ‘Introduction’, the ligand binding domain is a somewhat conserved domain. Finding motifs that have been conserved through the evolutionary process should highlight regions of critical importance in ligand binding. Those regions can provide amino acid patterns that are a unique signature to the NR LBD. Indeed, a previous study published in 1996 found signature sequences that are essential in maintaining the ligand binding pocket structure (
Steroid hormone receptors share 4 main interaction sites through which they interact directly with several ligands and co-activators (
Based on the comparative analysis of the co-crystallized ligands and their receptors, there is a clear separation into three main clusters: the USP/SF1/LRH1 ligand-specific cluster, the ER ligand-specific cluster, and the AR/PR/MR/GR ligand-specific cluster. The USP/SF1/LRH1 ligand-specific cluster contains ligands similar to those of the ER ligand-specific cluster, in which there is an evident separation of ERα into two sub-clusters. Moreover, ERα' can interact without any specificity with all the ligands identified on ERs, including those of ERβ, and there is some evidence that the Y537S mutation induces an LBD with higher plasticity in ERα'.
In conclusion, NRs are vital transcription factors, and the information presented here can be used in a variety of ways. The interaction sites and conserved motifs can serve as target regions for novel drugs. New drugs can be developed through specific in silico techniques that design ligands capable of interacting with those specific regions and inducing new alterations in the protein's dynamics (
Not applicable.
DV would like to acknowledge funding from: i) Microsoft Azure for Genomics Research Grant (CRM:0740983); ii) FrailSafe Project (H2020-PHC-21-2015-690140) ‘Sensing and predictive treatment of frailty and associated co-morbidities using advanced personalized models and advanced interventions’, co-funded by the European Commission under the Horizon 2020 research and innovation program; iii) Amazon Web Services Cloud for Genomics Research Grant (309211522729); iv) AdjustEBOVGP-Dx (RIA2018EF-2081): Biochemical Adjustments of native EBOV Glycoprotein in Patient Sample to Unmask target Epitopes for Rapid Diagnostic Testing. A European and Developing Countries Clinical Trials Partnership (EDCTP2) under the Horizon 2020 ‘Research and Innovation Actions’ DESCA. EE would like to acknowledge funding by the project ‘INSPIRED-The National Research Infrastructures on Integrated Structural Biology, Drug Screening Efforts and Drug Target Functional Characterization’ (Grant MIS 5002550) and by the project: ‘OPENSCREEN-GR An Open-Access Research Infrastructure of Chemical Biology and Target-Based Screening Technologies for Human and Animal Health, Agriculture and the Environment’ (Grant MIS 5002691), which are implemented under the Action ‘Reinforcement of the Research and Innovation Infrastructure’, funded by the Operational Programme ‘Competitiveness, Entrepreneurship and Innovation’ (NSRF 2014-2020) and co-financed by Greece and the European Union (European Regional Development Fund).
All data generated or analyzed during this study are included in this published article or are available from the corresponding author on reasonable request.
TM, LP, AE, FB, DV, GPC, EE have all equally contributed to the writing, drafting, revising, editing, reviewing, and the conception and design of the study. All authors have read and approved the final manuscript.
Not applicable.
Not applicable.
The authors declare that they have no competing interests.
Phylogenetic tree of the nuclear receptors' ligand binding domain. Four distinct monophyletic branches are visible. Those monophyletic branches are divided into subcategories. The phylogenetic trees confidently separate the steroid hormone-like (branch colored green), the retinoid X-like and steroidogenic factor-like receptors cluster (branch colored orange), the thyroid hormone-like receptors cluster (branch colored blue) and the nerve growth factor-like/hepatocyte nuclear factor-4 receptors cluster (branch colored yellow).
Conserved signaling motifs and interaction sites of the NR LBD. (A) Consensus sequence based on the multiple sequence alignment results for the 420 NR LBDs and on parameters such as amino acid quality and conservation. The 7 major conserved signaling motifs of the NR LBD have been marked in the consensus sequence (colored yellow). (B) Sequence alignment of SHRs with the 7 conserved signaling motifs (colored yellow) and the 4 interaction sites (colored red). The figure features representative sequences from each SHR group plus an ancestral corticoid receptor. The reference sequences used are PDB: 2AA2 for mineralocorticoid receptor, PDB: 5NFT for glucocorticoid receptor, PDB: 1SQN for progesterone receptor, PDB: 2OZ7 for androgen receptor, PDB: 1ERR for estrogen receptor α, PDB: 1U9E for estrogen receptor β and PDB: 2Q1H for the ancestral corticoid receptor. Specific amino acid residues have been colored to showcase specific attributes. Yellow-colored residues are known interaction points, blue-colored residues are prone to mutation, while green-colored residues are both interaction sites and prone to mutation. NR, nuclear receptor; LBD, ligand-binding domain; PDB, Protein Data Bank; SHR, steroid hormone receptor.
Structural analysis of the NR LBD. Structural similarity matrix of root mean square deviation (RMSD). The matrix shows statistically significant clusters were found in NR LBD. The x and y axes display the structure order based on the phylogenetic tree. Blue areas are clusters that reveal strong structural similarity, and red areas are clusters with very low structural similarity. The phylogenetic analysis revealed areas (green and yellow squares) that represent the 2 major canonical forms. NR, nuclear receptor; LBD, ligand-binding domain.
Ligand-specific analysis of the NR LBD. Chemical structure similarity matrix of the Tanimoto coefficient values. The Tanimoto coefficient ranges from 0 when the fingerprints have no bits in common, to 1 when the fingerprints are identical. The matrix shows statistically significant clusters were found in NR LBD corresponding ligands. The x and y axes display the ligands order based on the order of the co-crystallized structures in the phylogenetic tree. Blue areas are clusters that reveal strong structural similarity, and black areas are clusters with no structural similarity. Three distinct clusters are visible in the similarity matrix. NR, nuclear receptor; LBD, ligand-binding domain.
Structural and functional analysis of the ERα' sub-cluster compared to ERα and ERβ sub-clusters. [Representative structures for the sections (C-H) are ERα: EFQP, 3DT3, 3OS8, ERα': 5DI7, 2P15, 1ZKY and ERβ: 3OLL, 4ZI1 and 1YY4]. (A) Structural similarity matrix of root mean square deviation (RMSD). The matrix shows two statistically significant clusters found in ER LBD proteins. (B) Chemical structure similarity matrix of the Tanimoto coefficient values. The ERα' sub-cluster was shown to interact with the ligands of both ERα and ERβ. (C) The sequence identity matrix based on the nine representative structures of the 3 ER sub-clusters. (D) The sequence similarity matrix based on the nine representative structures of the 3 ER sub-clusters. (E) The RMSD based on the nine representative structures of the 3 ER sub-clusters. (F) Multiple sequence alignment of the sensitive region (conserved signaling motif G) of the representative structures. In the alignment, all the known mutations are colored yellow. (G) Ribbon representation of the ERα representative structures (colored orange) superposed with the ERα' representative structures (colored blue), and with the ERβ representative structures (colored red). (H) Ribbon representation of the ERα representative structures AF-2 helix (colored orange) superposed with the ERα' representative structures AF-2 helix (colored blue), and with the ERβ representative structures AF-2 helix (colored red). ER, estrogen receptor.